
unable to handle containers with huge metadata #51

Closed
leogr opened this issue Jun 17, 2021 · 8 comments
Labels: kind/bug (Something isn't working), lifecycle/stale

Comments

leogr (Member) commented Jun 17, 2021

Describe the bug

This is an early description of the problem. I'm investigating. More details will come.

libsinsp throws an exception when it encounters a container with a large amount of metadata.

It seems that the JSON (containing the container metadata) gets truncated, and the JSON parser returns an empty error.

The bug seems to cause the problem reported in falcosecurity/falco#683

How to reproduce it

TODO

Expected behaviour

Screenshots

Invalid JSON encountered while parsing container info: {"container":{"Mounts":[],"cpu_period":100000,"cpu_quo...

...a lot of JSON in the middle...

...ent tools\"},{\"name\":\"e2fsprogs\",\"versioerror=

Note:

  • the JSON seems to be truncated at some point
  • error= (the error returned by the parser) is empty

Environment

N/A

Additional context

poiana (Contributor) commented Sep 15, 2021

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

@mikhailadvani

/remove-lifecycle stale

FedeDP (Contributor) commented Sep 17, 2021

Reposting the same comment I wrote on the Falco issue: falcosecurity/falco#683 (comment).

I am working on a possible fix.

@leogr leogr changed the title wip: unable to handle containers with huge metadata unable to handle containers with huge metadata Sep 17, 2021
FedeDP (Contributor) commented Sep 20, 2021

We currently have the following limitation: it seems our protocol cannot carry a single event argument larger than 64 KiB.
If the JSON is larger than that, it gets truncated and we are unable to parse it.

Given that we cannot change the protocol (otherwise we'd lose backward compatibility), here are a couple of ideas on how to fix it:

  1. Always parse the entire JSON, but adhere to the 64 KiB limit for outputs.
    This is the simplest fix, as it allows us to properly parse the JSON, but it comes with its own bugs:
  • when storing to a scap file, we will only store 64 KiB, thus truncating the JSON; this means we won't be able to replay the stored file, because it will crash with the same error outlined in this issue
  • in stdout output, the JSON will be truncated to 64 KiB
  2. Split the JSON argument into multiple (up to 32) 64 KiB arguments.
    This is somewhat trickier, but it allows us to support up to 2 MiB. Yet, it comes with its own set of issues:
  • we still have a hard limit on the JSON size, even if greater than before
  • in event_table.c we need to repeat the same {"json", PT_CHARBUF, PF_NA} argument 32 times (it's ok, it is just quite ugly)
  3. Add a new ppm_event_flags flag telling that the event size must be parsed as a uint32_t.
    While this is probably the cleanest solution, it would break reading a scap file from, e.g., Wireshark.
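Idea 2 could be sketched roughly as follows. This is a hypothetical illustration, not the actual libsinsp API: the function names and the CHUNK_MAX/MAX_CHUNKS constants are made up for this sketch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Sketch of idea 2: split an oversized JSON payload into up to 32
// chunks of at most 64 KiB each, so that every chunk fits into a
// single length-prefixed (16-bit) event argument (~2 MiB in total).
static constexpr std::size_t CHUNK_MAX = 64 * 1024;
static constexpr std::size_t MAX_CHUNKS = 32;

// Producer side: cut the JSON into fixed-size slices.
std::vector<std::string> split_json(const std::string& json)
{
    std::vector<std::string> chunks;
    for (std::size_t off = 0; off < json.size(); off += CHUNK_MAX)
    {
        if (chunks.size() == MAX_CHUNKS)
        {
            break; // hard limit reached: the tail is dropped (or the event rejected)
        }
        chunks.push_back(json.substr(off, CHUNK_MAX));
    }
    return chunks;
}

// Consumer side: reassemble the original JSON before parsing it.
std::string join_chunks(const std::vector<std::string>& chunks)
{
    std::string json;
    for (const auto& c : chunks)
    {
        json += c;
    }
    return json;
}
```

Even with such a split, the 32-argument event definition would still have to be declared statically in event_table.c, which is where the ugliness mentioned above comes from.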

Note, however, that I think we should enforce a proper JSON size limit anyway, because right now we are trusting the external JSON source to give us a normal-sized JSON, as we are calling:

size_t totlen = sizeof(scap_evt) + sizeof(uint16_t) + json.length() + 1;

ASSERT(evt->m_pevt_storage == nullptr);
evt->m_pevt_storage = new char[totlen];

potentially heap-allocating huge chunks of memory.
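A minimal sketch of such a guard, with an assumed 2 MiB cap and a made-up function name (the real code also adds sizeof(scap_evt), omitted here so the sketch stays self-contained):

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical guard: refuse to allocate event storage for a JSON
// beyond a fixed cap, instead of trusting the external source.
static constexpr std::size_t MAX_JSON_LEN = 2 * 1024 * 1024; // assumed cap

// Returns false when the JSON is too large; on success, totlen holds
// the storage size to allocate (event header size omitted in this sketch).
bool compute_event_storage_len(const std::string& json, std::size_t& totlen)
{
    if (json.length() > MAX_JSON_LEN)
    {
        return false; // caller can trim the JSON or drop the event
    }
    totlen = sizeof(std::uint16_t) + json.length() + 1; // +1 for the NUL terminator
    return true;
}
```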

I guess that solution 2, together with a 2 MiB limit on the JSON size, is our best candidate; still, as a simpler interim fix (perhaps while thoroughly thinking through a backward-compatible protocol change), solution 1 can be enough.

/cc @gnosek

FedeDP (Contributor) commented Sep 21, 2021

In the end, after a discussion with @leogr, it was decided to use a best-effort approach: if the resulting JSON is too large to be handled, we first try to trim unneeded metadata (read: metadata that is not used by any filter), i.e.:

  • "env"
  • "port_mappings"
  • "labels"

Then, if we still need to free space, "Mounts" is dropped too.

This is the best possible approach, as it does not require many changes, yet it always preserves the container's main info (id, image, and so on), trimming fields only when needed.
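The priority order can be illustrated with a toy model, assuming the metadata is a flat field-to-serialized-value map and a made-up size limit; the real fix operates on the actual container JSON, not on this structure.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy sketch of the best-effort trimming: drop fields in a fixed
// priority order until the metadata fits the size limit. Filter-relevant
// fields (id, image, ...) are never touched; "Mounts" goes last.
using metadata = std::map<std::string, std::string>;

std::size_t metadata_size(const metadata& meta)
{
    std::size_t n = 0;
    for (const auto& kv : meta)
    {
        n += kv.first.size() + kv.second.size();
    }
    return n;
}

void trim_metadata(metadata& meta, std::size_t limit)
{
    // Fields not used by any filter are sacrificed first.
    static const std::vector<std::string> droppable =
        {"env", "port_mappings", "labels", "Mounts"};
    for (const auto& field : droppable)
    {
        if (metadata_size(meta) <= limit)
        {
            break; // it fits: keep everything that is left
        }
        meta.erase(field);
    }
}
```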

poiana (Contributor) commented Dec 20, 2021

Issues go stale after 90d of inactivity.

Mark the issue as fresh with /remove-lifecycle stale.

Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Provide feedback via https://github.com/falcosecurity/community.

/lifecycle stale

leogr (Member, Author) commented Dec 21, 2021

This issue should be fixed by #102
/close

poiana (Contributor) commented Dec 21, 2021

@leogr: Closing this issue.

In response to this:

This issue should be fixed by #102
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

poiana closed this as completed Dec 21, 2021