
gluon-ebtables-limit-arp: a package for ARP rate-limiting #1113

Merged
merged 2 commits on Feb 15, 2018

Conversation

T-X
Contributor

@T-X T-X commented Apr 30, 2017

This package adds filters to limit the amount of ARP Requests
devices are allowed to send into the mesh. The limits are 6 packets
per minute per client device, by MAC address, and 1 per second per
node in total.

A burst of up to 50 ARP Requests is allowed until the rate-limiting
takes effect (see --limit-burst in the ebtables manpage).

Furthermore, ARP Requests with a target IP already present in the
batman-adv DAT Cache are excluded from the rate-limiting,
both regarding counting and filtering, as batman-adv will respond
locally with no burden for the mesh. Therefore, this limiter
should not affect popular target IPs, like gateways.

However, it should mitigate the problem of curious people or
smart devices scanning the whole IP range, which so far could
create a significant amount of overhead for all participants.

@T-X T-X added the 3. topic: package Topic: Gluon Packages label Apr 30, 2017
@christf
Member

christf commented May 2, 2017

Thank you for the package. I noticed that the Lua scripts are in files/ instead of luasrc/, so they will not be processed by the minifier. Is that intentional? I suggest moving them to luasrc/.
If it was intentional: What are your reasons?

@rotanid rotanid added the 0. type: enhancement The changeset is an enhancement label May 2, 2017
Adorfer added a commit to eulenfunk/packages that referenced this pull request May 26, 2017
adding gluon ebtables support from T-X freifunk-gluon/gluon#1113
@rotanid rotanid requested a review from neocturne June 19, 2017 22:45
@T-X
Contributor Author

T-X commented Jul 1, 2017

@christf: The main reason was basically that I forgot to add the minifier in the end :-). A minor reason for keeping it un-minified would be to have an example in this ebtables updater and to allow other communities to easily try different parameters for the rate limiting or to extend it. Those were values which seemed to make sense to me from experience with nodes at Freifunk Lübeck.

@T-X
Contributor Author

T-X commented Jul 1, 2017

Btw., one way to extend it I was thinking about was to add another ebtables worker which periodically looks at an iw station dump for the AP interface or "batctl tl" and adds one ebtables rate-limiting rule per local MAC address, but with stricter limits.

That way persistent scanners, like "smart" printers etc., which usually keep their MAC address, would be moved out of the general per node limits completely.

I wanted to start simple with this first PR though :-). But maybe someone would already like to play with extensions like that and doesn't want to hunt down the original, non-minified version; for them a non-minified version could be a helpful starting point. But I can also just add the minifier call :D (give me a second).

@T-X T-X force-pushed the ebtables-limit-arp branch from 1ae4052 to b4f72c3 Compare July 1, 2017 22:57
@RubenKelevra

This looks very interesting, especially the batman-adv exclusion.

A burst of 50 ARPs looks a bit high. Do you expect such a high value after a boot or a network split, or why did you choose such a high number?

@RubenKelevra

RubenKelevra commented Jul 8, 2017

I've tested this package. If I arp-scan a subnet, I first got ~50 packets through, then the rate dropped to 1/second - as expected. But the rate-limiter is resetting after 1 minute. So you'll get a 50-packet burst every minute, with the rate dropping back to 1/s between those bursts.

Is this as expected? I thought that as long as the input rate stays above the limit, the rate-limiter remains active until the rate drops below it.

@rotanid
Member

rotanid commented Jul 8, 2017

at least the ebtables docs don't say anything about a reset of the burst limit if the 1/sec limit is constantly exceeded:
http://ebtables.netfilter.org/misc/ebtables-man.html#lbAR

@RubenKelevra

Yeah that’s what I read, but my test showed this is happening every 60 seconds. :)

@rotanid
Member

rotanid commented Jul 19, 2017

btw., has anyone tested this in a live network? while we wait for the core review, this would be another TODO i guess

@Adorfer
Contributor

Adorfer commented Jul 20, 2017

@rotanid Please specify "this". Does this refer to the 60-second behaviour Ruben is talking about, or do you mean the package in the PR?

@RubenKelevra

@rotanid I've changed this package slightly, reducing outgoing ARPs to a static 2/second without the burst option.

This is working fine on a dev-version, but needs to be tested on larger scale.

@rotanid
Member

rotanid commented Jul 20, 2017

@Adorfer the package in general.
@RubenKelevra ok, maybe @T-X can comment on your change, maybe he missed something or you did :D

@RubenKelevra

RubenKelevra commented Jul 20, 2017

@rotanid my best guess is that on those low-memory devices the burst option just resets once a minute or so to reduce the load on the machine, since otherwise all the packets dropped by the filter would have to be tracked too.

Or the documentation is just not really clear on the actual behavior.

In most cases a "2 ARPs per node" filter should be sufficient, since it would only be exceeded if many devices roamed from one node to another while exchanging data with many other clients on the network.

So unless there are many users doing conference calls with many unicast connections while roaming, the threshold shouldn't be reached.

I watched our networks for a while: 8-10 ARPs/second is an average number in networks with >400 clients, and most of them are issued by the gateway searching for old devices.

Many clients tend to send ARPs as a burst of 10 packets in a row (seen on old Android devices from HTC), as well as many ARPs in a row with a small gap, even though the round trip cannot have finished anyway.

So I doubt a hard-limit of 2/sec would break any useful usecase. :)

@rotanid
Member

rotanid commented Jul 20, 2017

hm, sounds really low though :D what about a lower burst so that a reset per minute isn't bad?
like 1 arp/s default cap but 10 burst?
or more conservative 2/s and 20 burst.
after all, everything will be better than without limits.
but i would be interested in what @T-X's reasoning behind his limits was.

@T-X
Contributor Author

T-X commented Jul 20, 2017

With the 50x burst I was thinking about peer-to-peer applications. The Transmission BitTorrent client, for instance, allows 50 simultaneous connections by default, I think. So the idea was to stay conservative once you start your P2P application and avoid potential spurious hiccups.

While it is currently probably very unlikely that you'd have 50 peers from within the local mesh rather than from somewhere else on the Internet, I opted for a conservative approach, as spurious behaviour/issues are always a pain to debug later.

@RubenKelevra

@T-X but in this scenario you don't need a lot of ARPs either: the peer finding is done via multicast, which should result in no ARPs, and as long as some traffic flows between the peers a lookup while roaming is not necessary.

The main issue with this filter approach is roaming that results in a >60 second connection loss: in that case the ARP cache might be invalid while the neighbour IPs still look valid to the BitTorrent client. The restart would then be limited by the ARP filter, but not by much, since 50 connections could still be established within 25 seconds.

There might be an additional long-term problem with all ARP filtering: if one client does a lot of ARP, all other clients might be unable to get an ARP through the filter.

We might want to fix this with a filter like 2/sec per hashtable(mac) plus a general limit of 4/sec per node, or similar.

@T-X
Contributor Author

T-X commented Jul 21, 2017

@RubenKelevra

@T-X but in this scenario you don’t need a lot of arps either: the peer-finding is done by multicasts [...]

That's a totally different layer. After peer discovery IPv4 still needs to resolve the discovered IP addresses to MAC addresses via ARP.

The main issue with this filter-approach are roamings which result in a >60 seconds connection loss [...]

I fail to see how. Upon roaming, a device keeps its IP address(es) and connections. A device won't reissue ARP Requests on roaming, for instance.

@T-X
Contributor Author

T-X commented Jul 21, 2017

There might be an additional problem in the long term with all arp-filtering: if one client does a lot of arp, all other clients might be unable to issue an arp thru the filter.

Right. See my comment in #1113 (comment)

@Adorfer
Contributor

Adorfer commented Jul 21, 2017

@Adorfer the package in general.

live for several weeks
https://map.ffdus.de/#!v:g

Testing since May 26(?)
eulenfunk/firmware@c87ec1b
integrated in the stable firmware release for the full domain (>200 nodes) since 2017-07-01

currently triggering at the "20/s" limit (which, as discussed above, is probably still far too high)
https://github.com/eulenfunk/packages/tree/v2016.2.x/gluon-ebtables-limit-arp

(this totally mitigated the issues discussed in https://forum.freifunk.net/t/ffdus-bitte-keine-portscans-auf-16/14469 )

@Adorfer
Contributor

Adorfer commented Jul 21, 2017

@RubenKelevra

We might want to fix this by a filter like 2/sec per hashtable(mac) and a general limit of 4/sec per node or similar.

What is the scenario: a UnifiAP-Pro-LR reboots, 50 clients connect instantly.
All clients are asking nearly instantly for:

  • the IPv4 default router
  • IPv6 routers
  • NTP (set via DHCP to a local service)

I do not say "it's more like you suggested", but perhaps at least for the "per node limit" something like "wait for at least 5 minutes of runtime" could be necessary in extremely large setups... or at least a UCI setting for x86 offloaders which feed large radio infrastructures.

@RubenKelevra

RubenKelevra commented Jul 21, 2017 via email

@T-X
Contributor Author

T-X commented Aug 2, 2017

@RubenKelevra

I've tested this package. If I arp-scan a subnet, I first got ~50 packets through, then the rate dropped to 1/second - as expected. But the rate-limiter is resetting after 1 minute.

Hm, you're right, I can reproduce the issue with mausezahn. This is what the Wireshark IO Graph shows me:

(Wireshark IO graph screenshots: arp-limit-1s-50burst-mz-*, one per ARP Request interval)

  • 10 msec ARP Request interval
  • 100 msec ARP Request interval
  • 250 msec ARP Request interval
  • 500 msec ARP Request interval
  • 1 sec ARP Request interval
  • 2 sec ARP Request interval

I'll try to play with some other burst settings later today/tomorrow. Also, I'll try to understand the short but weird ebtables netfilter arp limit code. Maybe there is a bug in there.

@RubenKelevra, how did you deactivate the burst option? When I'm setting just a 1/s limit and do not use the burst option, it still shows up with a burst=5 in "ebtables -L" to me.

@RubenKelevra

@T-X I've just used 4/s without the burst option and tested it briefly, but didn't look at the ebtables output; the similarity of 4 and 5 should explain why I didn't notice it.

But if the burst value is the same as the limit and the timeframe is seconds, it should act like there's no burst value at all, shouldn't it?

What do you think about a qdisc-based approach with a tag? This should give us the ability to push it through a fair-queuing scheduler with packet pacing like CAKE, and we wouldn't block different clients if one client is flooding.

@T-X
Contributor Author

T-X commented Aug 13, 2017

I found out what the issue is: "ebtables --atomic-save" does not save the state of the limiter. So it is actually the ebtables-worker, run via micrond every minute, that resets the burst limit back to 50...

I'll look into / ask around whether somehow preserving these is possible. If someone would like to have a look at alternatives in the meantime, like qdisc, please do!

(Although I'm wondering whether there is something simpler than cake/qdisc. We do not need fair bandwidth distribution, simply limiting by counting instead of queueing should technically be sufficient and less memory hungry.)

@neocturne
Member

The whole atomic-save code is something I don't like about this PR. I would rather like to see a small daemon (written in C?) keeping track of all the MAC addresses; that way, we never need to clear the whole table, but can just remove entries explicitly.

@T-X
Contributor Author

T-X commented Aug 13, 2017

So, basically, removing all this "--atomic-save/commit" and setenv in local function worker() in /usr/sbin/ebtables-worker seems to fix the limit reset. However, I'm a little worried that dynamically adding/removing several rules per minute instead of this one atomic commit might have a noticeable impact on performance. (And it would currently cause a hiccup due to the flush-then-rebuild approach if it were done on the live table, for sure.)

Edit: @NeoRaider, saw your reply just now, after writing this. Hm, yes, that might work, too. And it's probably easier than patching ebtables' atomic commands to save the limit state. I'll look into that, thanks!

@rotanid
Member

rotanid commented Oct 1, 2017

is there progress? the idea of this package is really promising :)

@T-X
Contributor Author

T-X commented Oct 2, 2017

Hi @rotanid. Thanks for the interest and sorry for my silence regarding this for some time.

I have implemented a daemon in C. But even with these fine-grained rule updates to ebtables, without the full-table atomic commits, it seemed that the limiter was still reset somehow.

Seems like I will need to have a deeper look at the ebtables code again. I hope to be able to do this about next week.

Cheers

@T-X
Contributor Author

T-X commented Nov 24, 2017

"--atomic-save/commit" seems to be unrelated to the limit reset issue. Any "ebtables -F/-I/-A/-D" seems to trigger the reset :-(.

I'll try to find out whether it is possible to notice a reinitialization case and keep the old state from within ebt_limit_mt_check() in the Linux kernel code...

Regarding an implementation within batman-adv: I would like to avoid opening that can of worms, at least for now. The concern is not the overhead of maintaining the state (and yes, the batman-adv translation table uses a hash table and has an entry for each client MAC address in there), but that introducing various filters here and there in batman-adv, without an overall architectural consideration/concept, might introduce non-negligible performance penalties for the real data.

@T-X T-X force-pushed the ebtables-limit-arp branch from b4f72c3 to 3723665 Compare November 26, 2017 22:25
@T-X
Contributor Author

T-X commented Nov 26, 2017

Phew, I think I found the issue in the kernel. I hope this two-line fix does the trick. At least it seems to work according to my Wireshark captures.

Further changes:

  • ebtables-worker facility removed: using a daemon written in C now. It updates its ebtables chains every 30 seconds.
  • Complex locking removed: Using this daemon and "ebtables --concurrent" now instead.
  • Implemented and added a 6/min. per client limit, next to the 1/s per node limit.

@ecsv
Contributor

ecsv commented Dec 20, 2017

Just tested the patch "gluon-ebtables: Enable concurrent ebtables updates" on x86_64 and noticed that it doesn't create any rules during boot (but chains are created):

root@ffv-525400123456:/# ebtables -L
Bridge table: filter

Bridge chain: INPUT, entries: 0, policy: ACCEPT

Bridge chain: FORWARD, entries: 0, policy: ACCEPT

Bridge chain: OUTPUT, entries: 0, policy: ACCEPT

Bridge chain: IN_ONLY, entries: 0, policy: RETURN

Bridge chain: OUT_ONLY, entries: 0, policy: RETURN

Bridge chain: MULTICAST_OUT, entries: 0, policy: RETURN

Bridge chain: MULTICAST_OUT_ICMPV6, entries: 0, policy: RETURN

Bridge chain: RADV_FILTER, entries: 1, policy: DROP
-s 02:BA:7A:DF:01:00 -j ACCEPT

But they will be created when I restart /etc/init.d/gluon-ebtables. Might be important to know that I've added the patch to v2017.1.4. The device was booted using

kvm output/images/factory/gluon-ffv-b20171220-v-x86-64.vdi -nographic

@ecsv
Contributor

ecsv commented Dec 20, 2017

One of the problems seems to be that the locking implementation is... weird. Please add the patch https://git.netfilter.org/ebtables/commit/?id=6a826591878db3fa9e2a94b87a3d5edd8e0fc442 to fix it. The patch for 2017.1.4 can be found in FreifunkVogtland@918d3c7

But this also doesn't fix the actual problem I'm seeing. My chains are still empty.

EDIT: The other problem is another bug in gluon-radv-filterd

@T-X T-X force-pushed the ebtables-limit-arp branch from 3723665 to 77e45b5 Compare December 28, 2017 19:45
@T-X
Contributor Author

T-X commented Dec 28, 2017

Changelog:

  • Rebased to master: updated patches/lede/0044-*.patch to 0046-*.patch
  • Added @ecsv's backport of the ebtables patch, with the slight modification of changing it from 0044-*.patch to 0047-*.patch

Regarding the discussion from yesterday of changing "batctl dc -H -n" to a query via /sys/kernel/debugfs/, to avoid spawning a new process: I'd need to change "batctl tl -H -n" to use netlink, too, then.

I think it'd make more sense to leave it as is for now and replace both batctl calls with netlink once batman-adv has netlink support for the DAT cache.

return;
}

// Throw away a header line
Contributor

Why are you throwing a header line away when you called batctl without header lines (-H)?

Contributor Author

I see this output with batman-adv-legacy / batctl v2013.4:

root@freifunk647002d16222:~# batctl -v
batctl 2013.4.0 [batman-adv: 2013.4.0]
root@freifunk647002d16222:~# batctl tl
Locally retrieved addresses (from bat0) announced via TT (TTVN: 2 CRC: 0x17b1):
       Client        Flags   Last seen 
 * 64:70:02:d1:62:22 [.P...]   0.000
root@freifunk647002d16222:~# batctl tl -H
       Client        Flags   Last seen 
 * 64:70:02:d1:62:22 [.P...]   0.000
root@freifunk647002d16222:~#

So it looks like a bug in that batctl version, hm. Could someone tell me the output they see with a more recent compat15 batman-adv/batctl?

Contributor Author

Oh, funny, that was actually a patch from me fixing it some years ago:

https://git.open-mesh.org/batctl.git/commit/4aa06a7c2935158a2ff5ffea2995b011a62e229e

I'll backport it for batctl-legacy.

PS: And yes, current batctl does not need that extra skipped line, just confirmed it on some device. Thanks for reporting!

Contributor

@ecsv ecsv Jan 17, 2018

root@PL-See68-DG:~# batctl -v
batctl 2017.2 [batman-adv: 2017.2]
root@PL-See68-DG:~# batctl tl
[B.A.T.M.A.N. adv 2017.2, MainIF/MAC: primary0/0e:79:de:08:8b:d3 (bat0/ac:86:74:00:ec:00     BATMAN_IV), TTVN: 215]
Client             VID Flags    Last seen (CRC       )
ac:86:74:00:ec:00   -1 [.P....]   0.000   (0x8ace0d4d)
f4:09:d8:2b:51:78   -1 [....W.] 167.190   (0x8ace0d4d)
ac:86:74:00:ec:00    0 [.P....]   0.000   (0x55a68955)
root@PL-See68-DG:~# batctl tl -H
ac:86:74:00:ec:00   -1 [.P....]   0.000   (0x8ace0d4d)
f4:09:d8:2b:51:78   -1 [....W.] 170.620   (0x8ace0d4d)
ac:86:74:00:ec:00    0 [.P....]   0.000   (0x55a68955)

SECTION:=gluon
CATEGORY:=Gluon
TITLE:=Ebtables limiter for ARP packets
DEPENDS:=+gluon-core +gluon-ebtables
Contributor

you call batctl but don't depend on it?

Contributor Author

Good point!

Adding batctl as a dependency directly was a little tricky though. For compat15 we do have a batctl package. However, for batman-adv-legacy/compat14, batctl is included within the gluon-mesh-batman-adv-14 package.

Therefore I now used gluon-mesh-batman-adv as a dependency, which will be satisfied by gluon-mesh-batman-adv-14 or -15; these in turn provide batctl or pull in the batctl dependency, respectively.

while (!feof(fp)) {
pline = fgets(line, sizeof(line), fp);
if (!pline) {
fprintf(stderr, "%i: Error: fgets() failed\n", clock);
Contributor

@ecsv ecsv Jan 15, 2018

hexa also mentioned that this line (and the DAT one) prints (to syslog) a lot of

Mon Jan 15 20:26:10 2018 daemon.err gluon-arp-limiter[1126]: 3184: Error: fgets() failed
Mon Jan 15 20:26:10 2018 daemon.err gluon-arp-limiter[1126]: 3184: Error: fgets() failed

Contributor Author

Oh, that's verbose indeed. I think I assumed the "!feof(fp)" check would catch that. I'll look into why that's not the case.


@christf
Member

christf commented Jan 16, 2018

@T-X thank you for your work, this is awesome.

Also, even the first version of this package addresses a real issue in current Freifunk networks. Having this not-yet-ideal solution available in Gluon while it got discussed and refined could have been a real benefit for multiple communities during the last couple of months.

Why do we need to distinguish between nodes and clients here instead of just keeping it simple?

This is needed for the Gluon ARP limiter to work without hiccups in
traffic.

Link: https://patchwork.ozlabs.org/patch/841210/
Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
@T-X
Contributor Author

T-X commented Jan 17, 2018

Hi @christf,

Thanks for your comment! The reason why I implemented both a per-client and a per-node limit is the following: generally, I expect the majority of ARP spam to come from a specific MAC address - that's what we have observed so far. For that I have implemented the stricter per-client limit.

After cleaning up the code I noticed that I had generalized things so much that it was trivial to implement both a per-client and a per-node limit. The per-node limit is there in case an ARP spammer changes its source MAC with every packet, which is what I could imagine an intentionally malicious host trying to disrupt the network would do. With this, such a malicious host should find it more difficult to disrupt anything other than the local node via ARP spamming.

Also, the per-node limit is there to throttle ARP for new clients which have not been given a per-client limit yet (this daemon only adds/removes/updates per-client limits every 30 seconds).

(hope this answers your question?)

This package adds filters to limit the amount of ARP Requests
devices are allowed to send into the mesh. The limits are 6 packets
per minute per client device, by MAC address, and 1 per second per
node in total.

A burst of up to 50 ARP Requests is allowed until the rate-limiting
takes effect (see --limit-burst in the ebtables manpage).

Furthermore, ARP Requests with a target IP already present in the
batman-adv DAT Cache are excluded from the rate-limiting,
both regarding counting and filtering, as batman-adv will respond
locally with no burden for the mesh. Therefore, this limiter
should not affect popular target IPs, like gateways.

However, it should mitigate the problem of curious people or
smart devices scanning the whole IP range, which so far could
create a significant amount of overhead for all participants.

Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue>
@T-X T-X force-pushed the ebtables-limit-arp branch from 77e45b5 to e6457ac Compare January 17, 2018 11:35
@T-X
Contributor Author

T-X commented Jan 17, 2018

Changelog:

  • Rebased to master:
    • Removed the ebtables flock() and gluon-ebtables --concurrent patches from this PR, as they have already been merged via a separate PR.
    • Updated kernel ebtables patch number to 0061-*.patch.
  • Added dependency to gluon-mesh-batman-adv to satisfy need for batctl
  • Fixed faulty error messages regarding fgets() in ebt_dat_update() and ebt_tl_update()
  • Fixed issue of wrongly skipping the first line from the "batctl tl" output with batman-adv compat-15

For batman-adv-legacy/compat14 networks: Make sure you have this new fix for batctl-legacy applied.

@rotanid rotanid added this to the 2018.1 milestone Jan 28, 2018
@mweinelt
Contributor

mweinelt commented Feb 5, 2018

We've used this package to successfully combat ARP spam caused by a so-called "P2P Filesharing" Android app. We've also used it at a crowded venue where we had lots of uplink capacity.

Neither caused any issue, the limits seem just fine to me and helped protect our network.

10/10 would install again.

@christf
Member

christf commented Feb 15, 2018

merging, as discussed in today's developer meeting. Thank you for your work!

@christf christf merged commit 84a6f65 into freifunk-gluon:master Feb 15, 2018
@neocturne
Member

@T-X What's the state of upstreaming the kernel patch?

@T-X
Contributor Author

T-X commented Dec 7, 2018

I wasn't able to reproduce the issue in VMs yet. It seemed that on recent kernels, with a Debian kernel config, the ebtables limiting rules use the nft_limit kernel module instead of the ebt_limit.c code: ebt_limit_init() is called, but neither ebt_limit_mt() nor ebt_limit_mt_check() were, while nft_limit_eval() was. You can even see in lsmod that the usage count of nft_limit goes up by one for each ebtables limit rule inserted.

And the nft_limit code seems to memorize the limit state just fine on table changes.

I wasn't able to figure out what makes the kernel choose between these two different limiting code variants.

@T-X
Contributor Author

T-X commented Dec 25, 2018

Eventually I found out that I was not using the ebtables binary I thought I was using on Debian. On a recent Debian, /usr/sbin/ebtables can link to /usr/sbin/xtables-nft-multi, which seems to be the new, preferred way of bridge filtering - and it uses nft_limit instead of ebt_limit.

I had resubmitted my patch for the legacy ebtables, rebased and retested with fixes for newer kernels (>= 4.11):

https://patchwork.ozlabs.org/patch/1009914/

And got a negative response. The suggestion I got in the link above was to switch from ebtables-legacy to this ebtables-nft. I guess that's something that would need to be done upstream in OpenWrt (or the ebtables-tiny would need to be replaced with an ebtables-nft-tiny variant).

Labels
0. type: enhancement The changeset is an enhancement 3. topic: package Topic: Gluon Packages
9 participants