Strange issue with nftables #135

Closed
cybermcm opened this issue Sep 30, 2019 · 14 comments

Comments

@cybermcm

cybermcm commented Sep 30, 2019

Me again ;-)
I noticed some very strange behavior with the new nftables implementation. I'm running a GitLab Docker instance on my lab server, and every time I start a build process the dfw log starts filling up and eventually dfw stops working.
log:

dfw                     | Sep 30 06:01:01.804 INFO Received Docker events, starting processing, module: dfw:327
dfw                     | Sep 30 06:01:02.148 INFO Starting processing, started_processing_at: 2019-09-30T06:01:02Z, module: dfw::process:101
dfw                     | Sep 30 06:01:02.294 INFO Finished processing, finished_processing_at: 2019-09-30T06:01:02Z, module: dfw::process:187
dfw                     | Sep 30 06:01:02.295 INFO Applying rules (using nft), module: dfw::process:1145
dfw                     | Sep 30 06:01:03.406 INFO Received Docker events, starting processing, module: dfw:327
dfw                     | Sep 30 06:01:03.582 INFO Starting processing, started_processing_at: 2019-09-30T06:01:03Z, module: dfw::process:101
dfw                     | Sep 30 06:01:03.632 INFO Finished processing, finished_processing_at: 2019-09-30T06:01:03Z, module: dfw::process:187
dfw                     | Sep 30 06:01:03.643 INFO Applying rules (using nft), module: dfw::process:1145
dfw                     | Sep 30 06:01:04.215 INFO Received Docker events, starting processing, module: dfw:327
dfw                     | Sep 30 06:01:04.375 INFO Starting processing, started_processing_at: 2019-09-30T06:01:04Z, module: dfw::process:101
dfw                     | Sep 30 06:01:04.413 INFO Finished processing, finished_processing_at: 2019-09-30T06:01:04Z, module: dfw::process:187
dfw                     | Sep 30 06:01:04.413 INFO Applying rules (using nft), module: dfw::process:1145
dfw                     | Sep 30 06:01:05.008 INFO Received Docker events, starting processing, module: dfw:327
dfw                     | Sep 30 06:01:05.142 INFO Starting processing, started_processing_at: 2019-09-30T06:01:05Z, module: dfw::process:101
dfw                     | Sep 30 06:01:05.390 INFO Finished processing, finished_processing_at: 2019-09-30T06:01:05Z, module: dfw::process:187
dfw                     | Sep 30 06:01:05.392 INFO Applying rules (using nft), module: dfw::process:1145
dfw                     | Sep 30 06:01:05.701 INFO Received Docker events, starting processing, module: dfw:327
dfw                     | Sep 30 06:01:05.805 INFO Starting processing, started_processing_at: 2019-09-30T06:01:05Z, module: dfw::process:101
dfw                     | Sep 30 06:01:05.868 INFO Finished processing, finished_processing_at: 2019-09-30T06:01:05Z, module: dfw::process:187
dfw                     | Sep 30 06:01:05.870 INFO Applying rules (using nft), module: dfw::process:1145
dfw                     | Sep 30 06:01:06.294 INFO Received Docker events, starting processing, module: dfw:327
dfw                     | Sep 30 06:01:06.405 INFO Starting processing, started_processing_at: 2019-09-30T06:01:06Z, module: dfw::process:101
dfw                     | Sep 30 06:01:06.452 INFO Finished processing, finished_processing_at: 2019-09-30T06:01:06Z, module: dfw::process:187
dfw                     | Sep 30 06:01:06.453 INFO Applying rules (using nft), module: dfw::process:1145
dfw                     | Sep 30 06:01:07.969 INFO Received Docker events, starting processing, module: dfw:327
dfw                     | Sep 30 06:01:08.037 INFO Starting processing, started_processing_at: 2019-09-30T06:01:08Z, module: dfw::process:101
dfw                     | Sep 30 06:01:08.305 INFO Finished processing, finished_processing_at: 2019-09-30T06:01:08Z, module: dfw::process:187
dfw                     | Sep 30 06:01:08.305 INFO Applying rules (using nft), module: dfw::process:1145
dfw                     | Sep 30 06:01:08.778 INFO Received Docker events, starting processing, module: dfw:327
dfw                     | Sep 30 06:01:08.964 INFO Starting processing, started_processing_at: 2019-09-30T06:01:08Z, module: dfw::process:101
dfw                     | Sep 30 06:01:09.034 INFO Finished processing, finished_processing_at: 2019-09-30T06:01:09Z, module: dfw::process:187
dfw                     | Sep 30 06:01:09.034 INFO Applying rules (using nft), module: dfw::process:1145
dfw                     | Sep 30 06:01:10.114 INFO Received Docker events, starting processing, module: dfw:327
dfw                     | Sep 30 06:01:10.231 INFO Starting processing, started_processing_at: 2019-09-30T06:01:10Z, module: dfw::process:101
dfw                     | Sep 30 06:01:10.262 INFO Finished processing, finished_processing_at: 2019-09-30T06:01:10Z, module: dfw::process:187
dfw                     | Sep 30 06:01:10.264 INFO Applying rules (using nft), module: dfw::process:1145
dfw                     | Sep 30 06:01:11.168 INFO Received Docker events, starting processing, module: dfw:327

I noticed that if dfw is "broken" the comments are missing from the ruleset, e.g.

ct state invalid drop comment "DFW-MARKER:defaults;filter;input;ct-state-invalid-drop"
ct state { established, related } accept comment "DFW-MARKER:defaults;filter;input;ct-state-relatedestablished-accept"
mark & 0x000000df == 0x000000df accept comment "DFW-MARKER:defaults;filter;input;meta-mark"

Any idea what the cause could be? It never happened with iptables before.

@cybermcm
Author

cybermcm commented Oct 2, 2019

For now I'll revert to iptables and monitor whether dfw works "normally" again. First observation: building a project with git no longer kills dfw...

@pitkley
Owner

pitkley commented Oct 3, 2019

Hm... do the DFW-logs you have provided stop at that exact line when it is no longer working, or did you simply stop copying at some point?

I'm a bit confused as to why it would fail at that step, and the iptables-version wouldn't. With the iptables-version, are you using the regular iptables-backend or the iptables-restore-backend?

If you have the chance, can you run DFW with --log-level debug or even --log-level trace and provide the last couple hundred loglines, specifically up to the very last one?

For me DFW seems to be running just fine, although I am no longer running GitLab-builds directly on the same host, which means I don't have that many Docker events anymore -- this might be the cause of the issues somehow?

One more thing: you can set the --burst-timeout to something greater than 500 milliseconds. What the burst timeout does is batch Docker events for a configurable timeout before applying rules. So if you set it to e.g. 5000, DFW will wait 5 seconds starting from when it receives an event before applying the rules, such that the host can "stabilize" and many subsequent events don't cause rebuilding of the rules so often.
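
To illustrate the batching idea, here is a minimal Python sketch of a burst window. It is not DFW's actual code (DFW is not written in Python, and its internals may differ); the function name collect_burst and the use of a queue as the event source are assumptions made purely for illustration.

```python
import queue
import time


def collect_burst(events, burst_timeout_s):
    """Collect one burst of events from a queue.

    Blocks until the first event arrives, then keeps collecting further
    events until burst_timeout_s seconds have passed since that first event.
    The caller would rebuild and apply the rules once per returned batch.
    """
    batch = [events.get()]  # the first event opens the burst window
    deadline = time.monotonic() + burst_timeout_s
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            batch.append(events.get(timeout=remaining))
        except queue.Empty:
            break  # window closed without further events
    return batch
```

With a window of 5 seconds, a build that triggers dozens of container start/stop events within that time would lead to a single rule rebuild instead of one rebuild per event.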

@cybermcm
Author

cybermcm commented Oct 4, 2019

Thank you for taking time to look into it.
The log from my first post is not the full log as you noticed correctly.

I'm using the regular iptables backend (iptables version).

log file: I managed to "crash" dfw and capture a log file. Since it contains sensitive data I won't put it online but I'm happy to send it to you if you wish. Please tell me how you want it.

Many docker events: That's possible, I'm running about 50 containers on one host (a lab host).

Burst-Timeout didn't help.

One thing I noticed: everything provided by dfw is down, e.g. http and https but also other ports. It seems to affect everything from wider_world_to_container.rules.
My rules from initialization are still working (ICMP and ssh).

@pitkley
Owner

pitkley commented Oct 4, 2019

Interesting, thanks for testing. You can send it to me via mail, pitkley@googlemail.com. You can also share it with me via Keybase, should you use it: https://keybase.io/pitkley.

@cybermcm
Author

cybermcm commented Oct 4, 2019

Sent you an email...

@pitkley
Owner

pitkley commented Oct 4, 2019

Thank you, this will probably help. Quick question: when you reach this state of it not working, does the DFW-container itself die, or does it simply stop doing anything, although the DFW-container itself is still running?

I'm wondering if there is some form of race-condition in the event-burst-handling, although I'm not sure how the nftables-solution would be susceptible to that while the iptables-solution apparently isn't.

@cybermcm
Author

cybermcm commented Oct 4, 2019

The container itself is still running, and the log file keeps growing...

@pitkley
Owner

pitkley commented Oct 6, 2019

Now I'm confused... maybe I didn't understand the initial error report correctly.

You are saying that DFW is still running and still outputting logs, but your nftables-ruleset is mostly empty? Specifically, the mark/comment rules mentioned in your initial report are not there?

If that is the case, I'll change the trace-logging to output the ruleset that is applied using nft to allow us to see if DFW is really just applying incorrect rules.

@cybermcm
Author

cybermcm commented Oct 6, 2019

Sorry, my initial report was not very accurate...
From my point of view dfw is still running, but the wider_world_to_container.rules aren't working anymore.
As I stated in my initial post, the ruleset seems to be in place, but the rules with the comments are missing.

@pitkley
Owner

pitkley commented May 30, 2020

Hi @cybermcm, sorry that I didn't get back to you for so long, this issue totally slipped my mind...

Anyway: after re-reading the issue, I think I understood the point you were trying to make. It might have been the case that the comment/marker-rules got overridden without being regenerated.

DFW <1.2 only inspected the rules once on startup, which means that if the marker-rules were removed after DFW started, they would not be re-added, even though they were missing. This would explain why the container was still running, but the wider-world-to-container rules seemed to stop working.

I am currently working on DFW release v1.2 which reintegrates iptables as a supported firewall-backend. One of the changes I made during that is that DFW will now check the nftables ruleset every time it processes the rules, which means the marker-rules will be regenerated if they became absent since DFW started.
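
The per-run check can be pictured roughly like the following Python sketch. It is only an illustration and not DFW's implementation: the function name missing_markers is hypothetical, and it assumes the ruleset is available as text (for example from `nft list ruleset`) in which marker rules carry a comment of the form "DFW-MARKER:..." as quoted earlier in this thread.

```python
import re

# Matches comments such as:
#   comment "DFW-MARKER:defaults;filter;input;meta-mark"
MARKER_RE = re.compile(r'comment "DFW-MARKER:([^"]+)"')


def missing_markers(ruleset_text, expected):
    """Return the expected marker IDs that are absent from the ruleset dump.

    Running this on every processing pass (instead of once at startup)
    lets the caller re-add marker rules that were removed in the meantime.
    """
    present = set(MARKER_RE.findall(ruleset_text))
    return [marker for marker in expected if marker not in present]
```

If the returned list is non-empty, the corresponding rules would be regenerated before the remaining rules are applied.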

You can find the latest release-candidate here: https://github.com/pitkley/dfw/releases/tag/1.2.0-rc.3. You can also get the Docker image from Docker Hub: docker pull pitkley/dfw:1.2.0-rc.3.

If you stick with iptables, I'd still suggest upgrading to v1.2. You can find a guide on how to migrate here. 👍

@cybermcm
Author

Hi, thx for getting back to my problem.
Currently I have reverted to iptables because fail2ban (docker) is missing nftables support. I'm using plain docker networking for the time being (no firewall management at all).
I'll check the F2B project and then try DFW again, but it will take some time; currently I'm busy with other stuff.

@cybermcm
Author

cybermcm commented Jun 5, 2020

I was too curious to wait any longer. I converted my lab server to nftables, and there is also a solution for fail2ban (crazy-max/docker-fail2ban#29).
Currently almost everything works as expected. There is still a lot to learn with nftables, but I think I understand most parts now. For anyone who stumbles across this thread, here is how I did it (it was not perfectly clear to me from the existing examples and may be wrong, but it is at least working):
I didn't touch Debian's default nft rules, so inet filter input, forward and output still have "policy accept".
I created a basic rule set in [backend_defaults.initialization]:

"add table inet custom",
"flush table inet custom",
"add chain inet custom input { type filter hook input priority 0 ; policy drop ; }",
"add rule inet custom input ct state invalid drop",
"add rule inet custom input ct state established, related accept",
"add rule inet custom input iifname lo accept",
"add rule inet custom input icmp type echo-request accept",
"add rule inet custom input tcp dport 22 ct state new,established accept"
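
One way to sanity-check a list like this before handing it to DFW is to write it out as an nft script and dry-run it. Below is a minimal Python sketch; the helper name to_nft_script is hypothetical, it only builds the script text, and it assumes the installed nft supports check mode (`nft -c -f <file>`), which validates commands without applying them. How DFW itself applies the rules may differ.

```python
def to_nft_script(rules):
    """Join initialization rules into an nft script, one command per line."""
    return "\n".join(rule.strip() for rule in rules) + "\n"


# Same commands as in the list above (shortened here for brevity).
rules = [
    "add table inet custom",
    "flush table inet custom",
    "add chain inet custom input { type filter hook input priority 0 ; policy drop ; }",
    "add rule inet custom input iifname lo accept",
]

script = to_nft_script(rules)
# Save `script` to a file and run `nft -c -f <file>` to check it
# without changing the live ruleset.
```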

There is still one little issue, @pitkley, but I'm sure there is a solution:
The docker mail server container and the docker webmail container are on the same docker network. dfw allows container-to-container traffic.
Testing in the webmail container: nmap -p 993 mail -> works (mail is the docker container name)
webmail container: nmap -p 993 mail.domain.com -> not working (port 993 is also allowed in dfw wider world to container)

But if I expose port 993 in my compose file, then nmap -p 993 mail.domain.com works!
A bug or a feature, I can't answer this ;-)

@pitkley
Owner

pitkley commented Jun 13, 2020

@cybermcm thanks for your detailed response and for testing.

Regarding the issue you described: I was able to reproduce it and created #277 to track it. I'll see if I can get this into 1.2.0, although I might fix it in a later patch version!

@cybermcm
Author

Thanks for solving this and working on #277! From my point of view this issue is solved, so I'm closing it and will follow #277.
