Understanding iptables’ hashlimit module

I was having trouble understanding the iptables hashlimit module and couldn’t dig up anything that really helped. The man pages are definitely lacking a clear explanation and /proc/net/ipt_hashlimit/ leaves out some information that would clarify things immensely. After some testing I managed to work it all out, so let’s go through it and see if I can help make sense of it for you too.

I’ll try not to assume too much prior knowledge about the module. We’ll be coming at this with the goal of blocking traffic that exceeds a certain number of packets per second. From the man page:

hashlimit uses hash buckets to express a rate limiting match (like the limit match) for a group of connections using a single iptables rule. Grouping can be done per-hostgroup (source and/or destination address) and/or per-port. It gives you the ability to express “N packets per time quantum per group” or “N bytes per seconds”

There are three settings that are most important, in my opinion:

--hashlimit-above amount[/second|/minute|/hour|/day] Match if the rate is above amount/quantum.

--hashlimit-burst amount Maximum initial number of packets to match: this number gets recharged by one every time the limit specified above is not reached, up to this number; the default is 5. When byte-based rate matching is requested, this option specifies the amount of bytes that can exceed the given rate. This option should be used with caution -- if the entry expires, the burst value is reset too.

--hashlimit-htable-expire msec After how many milliseconds do hash entries expire.

Each of these settings controls how packets match our iptables rule. For the purpose of explaining things properly, we’ll use this as our example:

-A INPUT -p icmp -m hashlimit --hashlimit-name ICMPTEST --hashlimit-mode srcip --hashlimit-srcmask 32 --hashlimit-above 5/minute --hashlimit-burst 2 --hashlimit-htable-expire 30000 -j DROP

This rule will drop ICMP traffic that exceeds the configured rate. If you’re following along and testing, this rule should go above any state checks in your firewall. Please note that a reload of iptables does not refresh these buckets properly. Once this rule is in place, any tweak to the hashlimit module’s values (e.g., --hashlimit-above) requires restarting iptables!

With this in place, if you ping your server from another host, after 2 packets the rest will drop until 12 seconds elapse, then one will be let through, after another 12 seconds one will be let through, and so on. What’s going on behind the scenes?

To get an idea of that, you can watch the buckets through their /proc interface:

# watch --interval 1 "cat /proc/net/ipt_hashlimit/ICMPTEST"

And after a couple pings you might see something like the following:

28 10.0.0.2:0->0.0.0.0:0 198912 768000 384000

What are we seeing here?

  • 28: This is the remaining number of seconds before the bucket expires. The maximum value of this is what was set in --hashlimit-htable-expire, even though that was done in milliseconds. Every time a new packet comes in that matches this hash table entry, this value will reset to the maximum. When it reaches zero, the hash table entry expires.
  • 10.0.0.2:0->0.0.0.0:0: This is the hash table entry as defined by the --hashlimit-mode, in our case srcip
  • 198912: Remaining number of tokens in this entry’s bucket. Each matched packet will cause a number of tokens to be removed from the bucket (that value is shown shortly). If the bucket does not hold enough tokens to cover a packet (subtracting them would take it below 0), the rule matches and its -j operation takes place; otherwise the packet continues through the rule chain. Tokens do not get removed from the bucket unless the full amount can be removed.
  • 768000: Maximum number of tokens in this entry’s bucket – this value is the result of a calculation based on the --hashlimit-above and --hashlimit-burst values
  • 384000: Token value of a packet. Each packet matching the rule subtracts this number from the remaining number of tokens in this entry’s bucket if there are enough tokens available to do so
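To make the five fields concrete, here is a minimal Python sketch that splits a line in the format shown above into named fields. The names (parse_entry, expires_in and so on) are my own labels for illustration, not anything used by the kernel or by iptables, and the sketch assumes the line has exactly the five whitespace-separated columns of the sample.

```python
from typing import NamedTuple


class HashlimitEntry(NamedTuple):
    expires_in: int   # seconds until the hash table entry expires
    key: str          # hash table key, e.g. "10.0.0.2:0->0.0.0.0:0"
    tokens: int       # tokens currently left in the bucket
    max_tokens: int   # bucket capacity
    cost: int         # tokens removed per packet


def parse_entry(line: str) -> HashlimitEntry:
    """Split one line of /proc/net/ipt_hashlimit/<name> into named fields.

    Assumes the five-column layout of the sample line (hypothetical helper).
    """
    expires_in, key, tokens, max_tokens, cost = line.split()
    return HashlimitEntry(int(expires_in), key, int(tokens), int(max_tokens), int(cost))


entry = parse_entry("28 10.0.0.2:0->0.0.0.0:0 198912 768000 384000")
# Bucket capacity expressed in whole packets; for the sample line this is 2,
# the same as the --hashlimit-burst value in the example rule.
print(entry.max_tokens // entry.cost)
```

Dividing the capacity by the per-packet cost is a quick way to check which burst value a given entry was created with.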

Now that’s all well and good, but it’s hiding something important, and for me that was what made part of it difficult to wrap my head around: it doesn’t tell you how many tokens are being restored to the bucket per second! It wasn’t until I watched the /proc interface that I understood what was going on. How do you get that number? Let’s use the above example:

  • We’re allowing 5 packets per minute (--hashlimit-above 5/minute)
  • The packet value is 384000
  • That means we need to restore 1920000 tokens to the bucket per minute, or 32000 tokens per second. After 12 seconds, we’ve restored 384000 tokens to the bucket, allowing one packet through
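The same arithmetic as a short Python sketch, in case you want to plug in your own numbers. The function names are made up for this example; the inputs are the packet value from the /proc output and the 5/minute rate from the rule.

```python
def refill_rate(cost: int, packets: int, period_seconds: int) -> float:
    """Tokens that must be restored per second so `packets` fit in each period."""
    return cost * packets / period_seconds


def seconds_per_packet(cost: int, rate: float) -> float:
    """How long the bucket needs to earn back the tokens for one packet."""
    return cost / rate


rate = refill_rate(cost=384000, packets=5, period_seconds=60)
print(rate)                              # tokens restored per second
print(seconds_per_packet(384000, rate))  # seconds until one more packet fits
```

With the example values this gives 32000 tokens per second and 12 seconds per packet, matching the figures above.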

So let’s summarize:

  • hashlimit uses buckets of tokens
  • A calculated, static value of tokens is assigned for packets
  • The maximum number of tokens in the bucket is --hashlimit-burst * packet value
  • Hash table entries are created based on the --hashlimit-mode setting
  • A new entry into the hash table creates a bucket
  • When no packets have matched that entry in --hashlimit-htable-expire ms, the entry is expired
  • Packets matching the iptables rule subtract tokens from the bucket for the hash table entry and reset the expire timer
  • When a bucket does not hold enough tokens for a packet (a subtraction would take it below 0), the rule action is performed
  • Every second the bucket adds enough tokens to achieve the desired --hashlimit-above setting
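To tie the summary together, here is a small Python simulation of the model described in this post. It is a sketch of my reading of the behaviour, not the kernel’s implementation: the class and function names are invented, a new bucket is assumed to start full, tokens are topped up when a packet arrives rather than on a timer, the bucket is allowed to reach exactly 0 (which is what lets both packets of the burst through), and entry expiry is ignored because one ping per second keeps the entry alive.

```python
from typing import List


class Bucket:
    """Token bucket following the model summarized above (not the kernel code)."""

    def __init__(self, cost: int, burst: int, refill_per_second: int) -> None:
        self.cost = cost
        self.max_tokens = cost * burst          # --hashlimit-burst * packet value
        self.refill_per_second = refill_per_second
        self.tokens = self.max_tokens           # assumption: a new entry starts full
        self.last_seen = 0

    def over_limit(self, now: int) -> bool:
        """Return True if a packet arriving at second `now` exceeds the rate."""
        elapsed = now - self.last_seen
        self.last_seen = now
        # Restore tokens for the time that has passed, capped at the maximum.
        self.tokens = min(self.max_tokens, self.tokens + elapsed * self.refill_per_second)
        if self.tokens >= self.cost:
            self.tokens -= self.cost
            return False    # under the limit: the rule does not match
        return True         # over the limit: the rule matches and -j DROP applies


def simulate(seconds: int) -> List[int]:
    """Send one ping per second; return the seconds at which a ping got through."""
    bucket = Bucket(cost=384000, burst=2, refill_per_second=32000)
    return [t for t in range(seconds) if not bucket.over_limit(t)]


print(simulate(40))
```

Run for 40 seconds, this lets pings through at seconds 0, 1, 12, 24 and 36: the burst of two, then one packet every 12 seconds, as described for the example rule.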

12 thoughts on “Understanding iptables’ hashlimit module”

    1. brent Post author

      On CentOS 6, I mean restarting iptables via the service command (e.g., service iptables restart). More generally speaking, a restart causes the iptables and netfilter modules to be unloaded from the kernel before being reloaded.

  1. Aaron

    What exactly do you mean by “reload” vs “restart” iptables?  Are you talking about the difference between flushing the chain (iptables -F INPUT) and unloading & reloading the kernel module?  Or???

    1. brent Post author

      “reload” and “restart” as in arguments to the service command on CentOS 6. reload on CentOS uses the iptables-restore(8) command to flush the tables and reload the saved configuration. So yes, it is the difference between flushing the rules and unloading and reloading the iptables and netfilter kernel modules.

  2. ml70

    On my old consumer router with kernel  2.6.22, iptables v1.3.8 and -m hashlimit v1.3.8 the functionality of the jump is reversed:

    If the limit is NOT matched, then the jump is taken.

    So to permit things under the limit you have to have the -m hashlimit rule as -j ACCEPT, and to deny what goes over the limit the next rule needs to be -j DROP.

    Spent a long time wondering at weird behavior before confirming this, hope it’ll help someone else.

     

    1. brent Post author

      Thanks for the comment – I haven’t confirmed this behaviour myself but thought I’d let this through in case someone else comes across it

  3. Tomasz P. Szynalski

    Brent,

    Thank you for writing this post. It led me to the solution of a very distressing problem I had with multiple SSH sessions not connecting. Basically, once I had opened an SSH session, further sessions would sometimes fail to connect (connection would time out). Sometimes I had to try 2-3 times to open my second SSH connection. I was on the verge of giving up computers (only half joking) when I came across your writeup.

    What I eventually discovered, thanks to you and your “watch cat /proc/net…” trick, is that my hashlimit buckets were getting continuously emptied, because iptables modules process packets in the order they are listed in the rule definition.

    I had used an iptables rule I found somewhere on the Internet. It looked similar to this:

    This is wrong. The order should be “-m tcp …” “-m state …” “-m hashlimit”. Like this:

    IPTABLES modules are executed in the order they are listed, so if -m hashlimit comes first, it will catch all packets (not just those for new TCP connections) and will saturate its internal limits. You will often be unable to open another connection, even after 1 minute passes, because regular (not NEW) packets will keep emptying the hashlimit buckets.
    You want to first check if a packet is TCP, then check if it’s NEW, and only then update the rate limits.

    I’ve also got a small correction to your post:

    Maximum number of tokens in this entry’s bucket – this value is the result of a calculation based on the --hashlimit-above and --hashlimit-burst values

    Actually, this value is simply = token value * hashlimit-burst.

    I’d also like to report that on Debian 9 at least, if you flush iptables with “iptables -F”, it clears the hashlimit buckets properly. There is no way to restart iptables, because it’s not a regular service.

    Thank you again for your very useful writeup and I hope it stays online to keep people from banging their heads against walls and experiencing profound feelings of failure and inadequacy.

    1. brent Post author

      Tomasz, thanks for your detailed reply. This is probably the second most popular article on this blog so it’s nice to get some feedback.

      Your point about module order in rules is definitely worth noting, and I think it shows that my explanation was a bit deficient in using a stateless protocol (ICMP). It probably would have been more useful to provide an example with TCP (or both!). I would say, though, that there is a caveat here in terms of module order. You need to decide whether state goes before or after hashlimit based on what you’re trying to do. If your goal were to stop excessive HTTP requests, you would want hashlimit to come first because a single HTTP connection can usually handle multiple HTTP requests (known as HTTP keep-alive). If state came first, hashlimit would not be able to limit those requests. However, sometimes a utility like fail2ban is the better option for layer 7 (application) limiting.

      For the correction about maximum tokens, I suppose I meant that token value is derived from the mentioned values, meaning the calculation isn’t so straightforward. That is correct though!

      Finally, thanks for some detail on Debian – I’m usually working in a RHEL/CentOS environment, so perspective from other distributions is helpful. By the way, I checked out your blog – I wish I’d seen your series on office chairs before my most recent purchase, that kind of first hand experience is hard to come by

