I was having trouble understanding the iptables hashlimit module and couldn’t dig up anything that really helped. The man pages are definitely lacking a clear explanation and /proc/net/ipt_hashlimit/ leaves out some information that would clarify things immensely. After some testing I managed to work it all out, so let’s go through it and see if I can help make sense of it for you too.
I’ll try not to assume too much prior knowledge about the module. We’ll be coming at this with the goal of blocking traffic that exceeds a certain number of packets per second. From the man page:
hashlimit uses hash buckets to express a rate limiting match (like the limit match) for a group of connections using a single iptables rule. Grouping can be done per-hostgroup (source and/or destination address) and/or per-port. It gives you the ability to express “N packets per time quantum per group” or “N bytes per seconds”
There are three settings that are most important, in my opinion:
--hashlimit-above amount[/second|/minute|/hour|/day] Match if the rate is above amount/quantum.
--hashlimit-burst amount Maximum initial number of packets to match: this number gets recharged by one every time the limit specified above is not reached, up to this number; the default is 5. When byte-based rate matching is requested, this option specifies the amount of bytes that can exceed the given rate. This option should be used with caution - if the entry expires, the burst value is reset too.
--hashlimit-htable-expire msec After how many milliseconds do hash entries expire.
Each of these settings controls how the packets we’re hoping to control match on our iptables rule — and for the purpose of explaining things properly we’ll use this as our example:
-A INPUT -p icmp -m hashlimit --hashlimit-name ICMPTEST --hashlimit-mode srcip --hashlimit-srcmask 32 --hashlimit-above 5/minute --hashlimit-burst 2 --hashlimit-htable-expire 30000 -j DROP
This rule will drop ICMP traffic that exceeds the configured rate. If you’re following along and testing, this rule should go above any state checks in your firewall. Please note that a reload of iptables does not refresh these buckets properly. Once this rule is in place, any tweak to the hashlimit module’s values (e.g., --hashlimit-above) requires restarting iptables!
With this in place, if you ping your server from another host, after 2 packets the rest will drop until 12 seconds elapse, then one will be let through, after another 12 seconds one will be let through, and so on. What’s going on behind the scenes?
To get an idea of that, you can watch the buckets through their proc interface:
# watch --interval 1 "cat /proc/net/ipt_hashlimit/ICMPTEST"
And after a couple pings you might see something like the following:
28 10.0.0.2:0->0.0.0.0:0 198912 768000 384000
What are we seeing here?
28: This is the remaining number of seconds before the bucket expires. The maximum value of this is what was set in --hashlimit-htable-expire, even though that was done in milliseconds. Every time a new packet comes in that matches this hash table entry, this value will reset to the maximum. When it reaches zero, the hash table entry expires.
10.0.0.2:0->0.0.0.0:0: This is the hash table entry as defined by the --hashlimit-mode, in our case srcip.
198912: Remaining number of tokens in this entry’s bucket. Each matched packet will cause a number of tokens to be removed from the bucket (that value is shown shortly). If the bucket does not hold enough tokens to cover that cost, the -j operation for the rule takes place; otherwise the packet continues through the rule chain. Tokens do not get removed from the bucket unless the full amount can be removed.
768000: Maximum number of tokens in this entry’s bucket. This value is the result of a calculation based on the --hashlimit-burst value: burst × packet token value, here 2 × 384000.
384000: Token value of a packet. Each packet matching the rule subtracts this number from the remaining number of tokens in this entry’s bucket if there are enough tokens available to do so
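As a quick sanity check on those five columns, the sample line can be split up in the shell. This is only a sketch: the function and field names are made up for illustration (the kernel does not label the columns), and it assumes the five-column layout shown above.

```shell
#!/usr/bin/env bash
# Split one line of /proc/net/ipt_hashlimit/ICMPTEST into its five columns.
# Field names are illustrative only; the proc file has no header.
describe_entry() {
  local expire entry tokens max cost
  read -r expire entry tokens max cost <<< "$1"
  # Integer division: how many whole packets the bucket can pay for.
  echo "expires_in=${expire}s entry=${entry} burst_packets=$(( max / cost )) packets_available=$(( tokens / cost ))"
}

describe_entry "28 10.0.0.2:0->0.0.0.0:0 198912 768000 384000"
```

For the sample line this reports a burst of 2 packets and 0 packets currently available, because 198912 tokens is less than the 384000 needed for one packet.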
Now that’s all well and good, but it’s hiding something important, and for me that was what made part of it difficult to wrap my head around: it doesn’t tell you how many tokens are being restored to the bucket per second! It wasn’t until I watch’d the proc interface that I understood what was going on. How do you get that number? Let’s use the above example:
- We’re allowing 5 packets per minute (--hashlimit-above 5/minute)
- The packet value is 384000
- That means we need to restore 1920000 tokens to the bucket per minute, or 32000 tokens per second. After 12 seconds, we’ve restored 384000 tokens to the bucket, allowing one packet through
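The same arithmetic can be written out in the shell. This is a small sketch using the numbers from the example rule; the variable names are my own and nothing here reads from the kernel.

```shell
#!/usr/bin/env bash
# Refill arithmetic for the example rule, in whole tokens.
cost=384000          # token value of one packet (last column in /proc)
rate_per_minute=5    # from --hashlimit-above 5/minute

tokens_per_minute=$(( cost * rate_per_minute ))
tokens_per_second=$(( tokens_per_minute / 60 ))
seconds_per_packet=$(( cost / tokens_per_second ))

echo "$tokens_per_minute tokens/min, $tokens_per_second tokens/s, one packet every ${seconds_per_packet}s"
```

Swapping in a different rate or packet value shows how long a throttled host waits between packets under that rule.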
We can see this in action, working as we expect:
PING 10.0.0.3 (10.0.0.3): 56 data bytes
64 bytes from 10.0.0.3: icmp_seq=0 ttl=63 time=0.527 ms
64 bytes from 10.0.0.3: icmp_seq=1 ttl=63 time=0.526 ms
Request timeout for icmp_seq 2
Request timeout for icmp_seq 3
Request timeout for icmp_seq 4
Request timeout for icmp_seq 5
Request timeout for icmp_seq 6
Request timeout for icmp_seq 7
Request timeout for icmp_seq 8
Request timeout for icmp_seq 9
Request timeout for icmp_seq 10
Request timeout for icmp_seq 11
64 bytes from 10.0.0.3: icmp_seq=12 ttl=63 time=0.470 ms
Request timeout for icmp_seq 13
Request timeout for icmp_seq 14
Request timeout for icmp_seq 15
Request timeout for icmp_seq 16
Request timeout for icmp_seq 17
Request timeout for icmp_seq 18
Request timeout for icmp_seq 19
Request timeout for icmp_seq 20
Request timeout for icmp_seq 21
Request timeout for icmp_seq 22
Request timeout for icmp_seq 23
64 bytes from 10.0.0.3: icmp_seq=24 ttl=63 time=0.443 ms
So let’s summarize:
- hashlimit uses buckets of tokens
- A calculated, static value of tokens is assigned for packets
- The maximum number of tokens in the bucket is --hashlimit-burst * packet value
- Hash table entries are created based on the --hashlimit-mode
- A new entry into the hash table creates a bucket
- When no packets have matched that entry in --hashlimit-htable-expire ms, the entry is expired
- Packets matching the iptables rule subtract tokens from the bucket for the hash table entry and reset the expire timer
- When a bucket does not hold enough tokens to cover a packet, the rule action is performed
- Every second the bucket adds enough tokens to achieve the desired --hashlimit-above rate
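To tie the summary together, here is a rough simulation of a single bucket for the example rule. It is an idealized sketch, not the kernel’s implementation: it assumes exactly one packet per second, whole-second refills, a bucket that starts full, and that a packet gets through whenever the bucket holds at least one packet’s worth of tokens (which is what the ping output above suggests). The function name simulate is made up.

```shell
#!/usr/bin/env bash
# Idealized single-bucket model: one packet arrives every second.
# Prints the sequence numbers that are NOT dropped by the example rule.
simulate() {
  local last_seq=$1
  local cost=384000 burst=2 rate_per_minute=5
  local max=$(( cost * burst ))                    # bucket capacity
  local refill=$(( cost * rate_per_minute / 60 ))  # tokens restored per second
  local tokens=$max                                # assumption: new entry starts full
  local s
  local passed=()
  for (( s = 0; s <= last_seq; s++ )); do
    if (( s > 0 )); then                           # one second has elapsed
      tokens=$(( tokens + refill ))
      if (( tokens > max )); then tokens=$max; fi
    fi
    if (( tokens >= cost )); then                  # enough tokens: packet gets through
      tokens=$(( tokens - cost ))
      passed+=("$s")
    fi
  done
  echo "${passed[*]}"
}

simulate 24
```

Under those assumptions the packets that survive are 0, 1, 12 and 24, matching the ping output earlier in the post.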
What does it mean: “requires restarting iptables”?
On CentOS 6, it means restarting iptables via the service command (e.g., service iptables restart). More generally speaking, a restart causes the iptables and netfilter modules to be unloaded from the kernel and then reloaded.
What exactly do you mean by “reload” vs “restart” iptables? Are you talking about the difference between flushing the chain (iptables -F INPUT) or unloading & reloading the kernel module? Or???
“reload” and “restart” as in arguments to the service command on CentOS 6. reload on CentOS uses the iptables-restore(8) command to flush the tables and reload the saved configuration. So yes, it is the difference between flushing the rules and unloading and reloading the iptables and netfilter kernel modules.
On my old consumer router with kernel 2.6.22, iptables v1.3.8 and -m hashlimit v1.3.8 the functionality of the jump is reversed:
If the limit is NOT matched, then the jump is taken.
So to permit things under the limit you have to have the -m hashlimit rule as -j ACCEPT, and to deny what goes over the limit the next rule needs to be -j DROP.
Spent a long time wondering at weird behavior before confirming this, hope it’ll help someone else.
Thanks for the comment – I haven’t confirmed this behaviour myself but thought I’d let this through in case someone else comes across it
That is correct. Newer versions of iptables have the --hashlimit-above and --hashlimit-upto options. Older versions have only the --hashlimit option, which, as you have noticed, matches up until the limit is reached, similar to the newer --hashlimit-upto option and the --limit option of the limit module.
The difference can be seen between these two iptables manual pages.
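As a sketch of what that looks like on an older iptables, the article’s ICMP example could be turned into an accept-then-drop pair along these lines. The values are copied from the article’s rule; this has not been tested against iptables 1.3.8, so treat the exact option set as an assumption.

```shell
# Older iptables: --hashlimit matches while traffic is within the rate,
# so accept on match and drop whatever falls through to the next rule.
iptables -A INPUT -p icmp -m hashlimit --hashlimit-name ICMPTEST \
  --hashlimit-mode srcip --hashlimit 5/minute --hashlimit-burst 2 \
  --hashlimit-htable-expire 30000 -j ACCEPT
iptables -A INPUT -p icmp -j DROP
```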
Thanks for the great article. Does the hashlimit module support IPv6?
Yes it does, take a look at the ip6tables man page for specific details.
Thank you for writing this post. It led me to the solution of a very distressing problem I had with multiple SSH sessions not connecting. Basically, once I had opened an SSH session, further sessions would sometimes fail to connect (connection would time out). Sometimes I had to try 2-3 times to open my second SSH connection. I was on the verge of giving up computers (only half joking) when I came across your writeup.
What I eventually discovered, thanks to you and your “watch /cat/proc/net…” trick, is that my hashlimit buckets were getting continuously emptied, because iptables modules process packets in the order they are listed in the rule definition.
I had used an iptables rule I found somewhere on the Internet. It looked similar to this:
This is wrong. The order should be “-m tcp …” “-m state …” “-m hashlimit”. Like this:
IPTABLES modules are executed in the order they are listed, so if -m hashlimit comes first, it will catch all packets (not just those for new TCP connections) and will saturate its internal limits. You will often be unable to open another connection, even after 1 minute passes, because regular (not NEW) packets will keep emptying the hashlimit buckets.
You want to first check if a packet is TCP, then check if it’s NEW, and only then update the rate limits.
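As an illustration of that ordering (not the exact rule discussed above; the port, the bucket name SSHTEST and the limits are assumptions picked for the example), a rule could look roughly like this:

```shell
# Hypothetical SSH rule: match order is tcp, then state, then hashlimit,
# so only NEW connections to port 22 are counted against the bucket.
iptables -A INPUT -p tcp -m tcp --dport 22 \
  -m state --state NEW \
  -m hashlimit --hashlimit-name SSHTEST --hashlimit-mode srcip \
  --hashlimit-above 3/minute --hashlimit-burst 3 \
  -j DROP
```

Because each match is evaluated left to right and evaluation stops at the first one that fails, packets belonging to already established sessions never reach the hashlimit match and so never spend tokens.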
I’ve also got a small correction to your post:
Actually, this value is simply = token value * hashlimit-burst.
I’d also like to report that on Debian 9 at least, if you flush iptables with “iptables -F”, it clears the hashlimit buckets properly. There is no way to restart iptables, because it’s not a regular service.
Thank you again for your very useful writeup and I hope it stays online to keep people from banging their heads against walls and experiencing profound feelings of failure and inadequacy.
Tomasz, thanks for your detailed reply. This is probably the second most popular article on this blog so it’s nice to get some feedback.
Your point about module order in rules is definitely worth noting and I think it shows that my explanation was a bit deficient using a stateless protocol (ICMP). It probably would have been more useful to provide an example with TCP (or both!) I would say though that there is a caveat here in terms of module order. You need to make the decision for state before or after hashlimit based on what you’re trying to do. If your goal was to stop excessive HTTP requests, you would want hashlimit to come first because a single HTTP connection can usually handle multiple HTTP requests (known as HTTP keep-alive). If state came first, hashlimit would not be able to limit those requests. However, sometimes a utility like fail2ban is the better option for layer 7 (application) limiting.
For the correction about maximum tokens, I suppose I meant that token value is derived from the mentioned values, meaning the calculation isn’t so straightforward. That is correct though!
Finally, thanks for some detail on Debian – I’m usually working in a RHEL/CentOS environment, so perspective from other distributions is helpful. By the way, I checked out your blog – I wish I’d seen your series on office chairs before my most recent purchase, that kind of first hand experience is hard to come by
The hashlimit module works basically like ‘limit’ but can be used with source IP, destination port, destination IP and source port, thanks to the --hashlimit-mode option. Is there anything similar that allows using the MAC address as well (something like --hashlimit-mode srcmac,destmac)?
Thank you very much!
Not with hashlimit, but it may be possible with marking and tc (traffic control). See this superuser.com question/answer: https://superuser.com/questions/489083/rate-limiting-traffic-between-specific-hosts-based-on-their-mac-using-iptables