-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrate from go-yara to yara-x; improve performance and readability #734
Conversation
d4bf8dc
to
c7aaf0d
Compare
Signed-off-by: egibs <20933572+egibs@users.noreply.github.com>
Signed-off-by: egibs <20933572+egibs@users.noreply.github.com>
Signed-off-by: egibs <20933572+egibs@users.noreply.github.com>
Signed-off-by: Evan Gibler <20933572+egibs@users.noreply.github.com>
Signed-off-by: egibs <20933572+egibs@users.noreply.github.com>
Signed-off-by: egibs <20933572+egibs@users.noreply.github.com>
Signed-off-by: egibs <20933572+egibs@users.noreply.github.com>
Signed-off-by: egibs <20933572+egibs@users.noreply.github.com>
Signed-off-by: Evan Gibler <20933572+egibs@users.noreply.github.com>
Signed-off-by: egibs <20933572+egibs@users.noreply.github.com>
Signed-off-by: egibs <20933572+egibs@users.noreply.github.com>
Signed-off-by: egibs <20933572+egibs@users.noreply.github.com>
@@ -17,7 +17,21 @@ permissions: | |||
jobs: | |||
test: | |||
runs-on: mal-ubuntu-latest-8-core | |||
container: cgr.dev/chainguard/wolfi-base:latest | |||
container: | |||
image: cgr.dev/chainguard/wolfi-base@sha256:eeb70e74e2ac07d3c80a30150bf473970c8b51a57f06daef3e4d065ac52489bc |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to wait until CI was running consistently before locking everything down.
Signed-off-by: egibs <20933572+egibs@users.noreply.github.com>
Signed-off-by: egibs <20933572+egibs@users.noreply.github.com>
Signed-off-by: egibs <20933572+egibs@users.noreply.github.com>
README.md
Outdated
* [pkgconf](http://pkgconf.org/) - required by Go to find C dependencies, included in many UNIX distributions | ||
|
||
Linux or macOS users can run the following command to install the necessary dependencies, other than Go: | ||
`yara-x` requires an underlying C API to function. For Wolfi users, this can be installed by running `apk add yara-x-compat`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This aside feels unhelpful as few if any users will install this onto Wolfi from source code. Can we provide a macOS example instead?
README.md
Outdated
cargo cinstall -p yara-x-capi --release --prefix=/tmp/yara-x | ||
``` | ||
|
||
And then copy the resulting files from `/tmp/yara-x` to `/usr/local/lib` and `/usr/local/include` (or the preferred `PKG_CONFIG_PATH`/`LD_LIBRARY_PATH` location(s)). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels kind of gross. Is there a separate build/install step in cargo? It'd be nice if we could tell folks to use sudo cinstall ... --prefix=/usr/local
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can test that out. IIRC sudo
failed because it was looking for a root installation of cargo
/cinstall
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated in 0270283
(#734).
I tested the sudo
command and it worked without any non-intuitive copying.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved for code review, but the install instructions could use some more love.
Signed-off-by: egibs <20933572+egibs@users.noreply.github.com>
Signed-off-by: egibs <20933572+egibs@users.noreply.github.com>
Signed-off-by: egibs <20933572+egibs@users.noreply.github.com>
Closes: #227
Closes: #497
After hacking around with yara-x locally last night, the performance gains over Yara are definitely noticeable (usually 2x faster at minimum). Previous versions seemed to be slower (at least
0.10.0
) but my earlier testing may have been bugged/flawed. That said, I wanted to get this rather large refactor up to test the CI experience again since we have to build the API from scratch, but when refreshing the sample data locally I was down to about ~24-25 seconds with the integration tests taking about 27 seconds (on my M1 Pro MBP).The main limitation with
yara-x
is that it does not expose any functionality to turn off problematic rules. To handle this, I added functionality to remove rules prior to compilation so that they are not evaluated by the scanner.To help with CPU overhead, the PR also adds a pool of scanners that can be re-used across files. Previously, we always GC'd the active scanner after each scan which had a sizable impact, especially for scan paths containing many files.
This PR also cleans up
recursiveScan
and fixes the behavior oflongestUnique
and splits out path-related functions into a newpath.go
file. Additionally,processArchive
will now concurrently scan extracted files which was previous serial. With this change, I can scan the OpenJDK package which extracts roughly 136,000 files in ~70-90 seconds.Outside of the
longestUnique
changes, the final, rendered output is essentially 1:1 with the current implementation.Edit: this PR is about 3-4x faster in GHA with 8-core runners (even when running in a container):
The tests and
golangci-lint
jobs now run in a Wolfi container to avoid compiling the yara-x C API each time the Workflows run; this more than halves the runtime of each job (5+ minutes down to ~2 minutes).