Create consul binary classifier #1738

Merged 3 commits on Apr 17, 2023


shanedell
Contributor

Create consul binary classifier

Closes #1590

Closes anchore#1590

Signed-off-by: Shane Dell <shanedell100@gmail.com>
@kzantow
Contributor

kzantow commented Apr 14, 2023

Thanks for the contribution @shanedell -- could you add a test for this? There are a couple ways to do this: either a small binary in the test-fixtures (e.g. copy something like 200 bytes around the version bytes), or downloading the actual binary from an image like these examples. Tests are all here.
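For the small-fixture route, a rough sketch of cutting a window of bytes around the version string might look like the following. The helper name `snippetAround` and the in-memory sample are hypothetical and not part of syft's tooling; in practice the input would be the real binary's contents.

```go
package main

import (
	"bytes"
	"fmt"
)

// snippetAround returns up to radius bytes on either side of the first
// occurrence of needle in data. The boolean is false if needle is absent.
// (Hypothetical helper for illustration only.)
func snippetAround(data, needle []byte, radius int) ([]byte, bool) {
	i := bytes.Index(data, needle)
	if i < 0 {
		return nil, false
	}
	start := i - radius
	if start < 0 {
		start = 0
	}
	end := i + len(needle) + radius
	if end > len(data) {
		end = len(data)
	}
	return data[start:end], true
}

func main() {
	// Stand-in for the contents of a real binary (e.g. loaded with os.ReadFile).
	data := []byte("aaaaaaaaaaCONSUL_VERSION: 1.15.2bbbbbbbbbb")

	// A radius of 100 would give roughly the 200 bytes suggested above.
	snippet, ok := snippetAround(data, []byte("1.15.2"), 4)
	fmt.Printf("%q %v\n", snippet, ok)
}
```

The resulting slice could then be written to a file under test-fixtures, assuming the classifier's pattern only needs the bytes inside that window.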

@shanedell
Contributor Author

@kzantow Absolutely. I am working on that now and went the dynamic route. I set the test up the same way as the other dynamic binaries, but it fails with this error:

--- FAIL: Test_Cataloger_DefaultClassifiers_PositiveCases (0.72s)
    --- FAIL: Test_Cataloger_DefaultClassifiers_PositiveCases/positive-consul-1.15.2 (0.19s)
        cataloger_test.go:744: 
                Error Trace:    /Users/sdell/workspaces/anchore/syft/syft/pkg/cataloger/binary/cataloger_test.go:744
                Error:          "[]" should have 1 item(s), but has 0
                Test:           Test_Cataloger_DefaultClassifiers_PositiveCases/positive-consul-1.15.2

I am not too sure why, because the binary is inside syft/pkg/cataloger/binary/test-fixtures/classifiers/dynamic/consul-1.15.2/consul. Do you have any hints as to what I am doing wrong, or should I push the test now so you can get a better idea of what is causing the failure?

@wagoodman
Contributor

The exact problem isn't immediately clear. Can you push the failing code so we can troubleshoot further?

Signed-off-by: Shane Dell <shanedell100@gmail.com>
@shanedell
Contributor Author

No problem, pushed it up once I saw the comment

@wagoodman
Contributor

wagoodman commented Apr 17, 2023

It seems that crafting a robust regex for this binary (including earlier versions of consul) might be a little difficult. Based on what I'm seeing in the binary:

❯ cat classifiers/dynamic/consul-1.15.2/consul | strings | grep -C 5 '1\.15\.2'

...
&Event{&Lease{&Patch{&PodIP{&Probe{&Scale{&Taint{*.%s/%s, != %s, Kind=, goid=, j0 = , type=,errno=,packed,proto3--dport.bashrc.config.consul.member.tar.gz/bin/sh/broker/ca/pem/check//emails/logout/metric/v1/kv//v1/txn0.0.0.01.15.2
1.42.34123-abc19531252.5.4.32.5.4.52.5.4.62.5.4.72.5.4.82.5.4.99765625: type ::1/128::ffff::domain:method:scheme:status
...
consul-ui%22%2C%22version%22%3A%222.2.0%2B5e08e229%22%7D%2C%22resizeServiceDefaults%22%3A%7B%22injectionFactories%22%3A%5B%22view%22%2C%22controller%22%2C%22component%22%5D%7D%2C%22CONSUL_COPYRIGHT_YEAR%22%3A%222023%22%2C%22CONSUL_GIT_SHA%22%3A%225e08e22%22%2C%22CONSUL_VERSION%22%3A%221.15.2%5Cn%22%2C%22CONSUL_BINARY_TYPE%22%3A%22oss%22%2C%22CONSUL_UI_DISABLE_REALTIME%22%3Afalse%2C%22CONSUL_UI_DISABL
  <!-- CONSUL_VERSION: 1.15.2
 -->
  <link rel="icon" href="{{.ContentPath}}assets/favicon.ico">
  <link rel="icon" href="{{.ContentPath}}assets/favicon.svg" type="image/svg+xml">
  <link rel="apple-touch-icon" href="{{.ContentPath}}assets/apple-touch-icon-01cd4680782fbb5bc02301347df9903d.png">
  <link integrity="" rel="stylesheet" href="{{.ContentPath}}assets/vendor-cf03d69ba4d9fa5934f04dca689d187f.css">
...

You could get a matching regex with something like:

CONSUL_VERSION: (?P<version>\d+\.\d+\.\d+)

But this would be rather brittle: it keys off a comment in the bundled static assets, which could easily change between releases.
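As a minimal sketch of how the pattern above behaves, here it is run through Go's standard regexp package against stand-in samples copied from the strings output. The function name `extractVersion` is made up for illustration; this is not how syft wires up its classifiers.

```go
package main

import (
	"fmt"
	"regexp"
)

// consulVersionPattern is the pattern quoted above, unchanged.
var consulVersionPattern = regexp.MustCompile(`CONSUL_VERSION: (?P<version>\d+\.\d+\.\d+)`)

// extractVersion returns the "version" named capture group if the pattern matches.
// (Hypothetical helper for illustration only.)
func extractVersion(contents []byte) (string, bool) {
	m := consulVersionPattern.FindSubmatch(contents)
	if m == nil {
		return "", false
	}
	return string(m[consulVersionPattern.SubexpIndex("version")]), true
}

func main() {
	// Stand-in for the HTML comment seen in the strings output.
	comment := []byte("  <!-- CONSUL_VERSION: 1.15.2\n -->\n")
	v, ok := extractVersion(comment)
	fmt.Println(v, ok) // prints "1.15.2 true"

	// The URL-encoded copy of the version has no ": " separator, so it does not match.
	encoded := []byte("%22CONSUL_VERSION%22%3A%221.15.2%5Cn%22")
	_, ok = extractVersion(encoded)
	fmt.Println(ok) // prints "false"
}
```

The match depends entirely on the exact text of that comment, which is why a small change to the UI assets would break it.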

Also, taking a look at the code to bake in the version:

It's changed a little bit over time... not significantly, but the above regex doesn't work well against even v1.10.12 of consul.

Looking at the binary directly it seems that the version is being embedded into a large data section:

$ xxd ./consul > /tmp/consul.xxd
$ cat /tmp/consul.xxd | grep -C 2 "1\.10\.12"
026e2030: 2f65 6d61 696c 732f 6d65 7472 6963 2f76  /emails/metric/v
026e2040: 312f 6b76 2f2f 7631 2f74 786e 302e 302e  1/kv//v1/txn0.0.
026e2050: 302e 3031 2e31 302e 3132 312e 3235 2e34  0.01.10.121.25.4         <--- hiding in the middle here!
026e2060: 3131 3233 2d61 6263 3139 3533 3132 3532  1123-abc19531252
026e2070: 2e35 2e34 2e33 322e 352e 342e 3532 2e35  .5.4.32.5.4.52.5

This is fairly common for static data encoded into the rodata section of Go binaries: strings are not null-terminated, so adjacent strings run together and it can be rather difficult to get a good fix on a single string.
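To illustrate the problem, here is a small sketch that runs a naive version-looking pattern over the packed bytes from the hex dump above (transcribed by hand into a string literal). The pattern and the helper name `candidates` are assumptions chosen for the demonstration.

```go
package main

import (
	"fmt"
	"regexp"
)

// semverLike is a naive "three dotted numbers" pattern, used only for this demo.
var semverLike = regexp.MustCompile(`\d+\.\d+\.\d+`)

// candidates returns every non-overlapping match of semverLike in data.
func candidates(data []byte) []string {
	var out []string
	for _, m := range semverLike.FindAll(data, -1) {
		out = append(out, string(m))
	}
	return out
}

func main() {
	// "0.0.0.0", "1.10.12", "1.25.4" and "1123-abc" are packed back to back.
	packed := []byte("/v1/kv//v1/txn0.0.0.01.10.121.25.41123-abc")

	// Neither candidate is the real version, 1.10.12.
	fmt.Println(candidates(packed)) // prints "[0.0.0 01.10.121]"
}
```

Without a delimiter on either side there is nothing for the pattern to anchor on, so the digits of neighbouring strings bleed into the match.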

If we were to write a regex against syft it would be a little easier (even multiple regexes!):

❯ xxd syft > /tmp/syft-0.77.0.xxd
❯ cat /tmp/syft-0.77.0.xxd | grep -C 2 '0\.77\.0'
014bb4f0: 5d00 0000 7b00 0000 7d00 0000 3200 0000  ]...{...}...2...
014bb500: 0008 0000 0020 0000 4000 0000 ffff ffff  ..... ..@.......
014bb510: fdff 0000 302e 3737 2e30 0000 0000 0000  ....0.77.0......
014bb520: 0000 0000 0000 1000 0000 0000 0000 003c  ...............<
014bb530: 075c 1433 26a6 813c 0000 0000 0000 903c  .\.3&..<.......<
--
014bbc50: 2300 0000 0000 0000 0000 0100 0000 0000  #...............
014bbc60: 0000 0000 0000 0000 0f00 0000 0000 0000  ................
014bbc70: 0000 1000 0000 0000 7630 2e37 372e 3000  ........v0.77.0.
014bbc80: 1f00 0000 0000 0000 0300 0000 0000 0000  ................
014bbc90: 0700 0000 0000 0000 3300 0000 0000 0000  ........3.......
--
014f2b10: 652f 7379 6674 2f69 6e74 6572 6e61 6c2f  e/syft/internal/
014f2b20: 7665 7273 696f 6e2e 6769 7444 6573 6372  version.gitDescr
014f2b30: 6970 7469 6f6e 3d76 302e 3737 2e30 2022  iption=v0.77.0 "
014f2b40: 0a62 7569 6c64 0943 474f 5f45 4e41 424c  .build.CGO_ENABL
014f2b50: 4544 3d30 0a62 7569 6c64 0947 4f41 5243  ED=0.build.GOARC
--
01c8d910: 616e 6368 6f72 652f 7379 6674 2f69 6e74  anchore/syft/int
01c8d920: 6572 6e61 6c2f 7665 7273 696f 6e2e 7665  ernal/version.ve
01c8d930: 7273 696f 6e3d 302e 3737 2e30 202d 5820  rsion=0.77.0 -X
01c8d940: 6769 7468 7562 2e63 6f6d 2f61 6e63 686f  github.com/ancho
01c8d950: 7265 2f73 7966 742f 696e 7465 726e 616c  re/syft/internal
--
01c8da00: 742f 696e 7465 726e 616c 2f76 6572 7369  t/internal/versi
01c8da10: 6f6e 2e67 6974 4465 7363 7269 7074 696f  on.gitDescriptio
01c8da20: 6e3d 7630 2e37 372e 3020 220a 6275 696c  n=v0.77.0 ".buil
01c8da30: 6409 4347 4f5f 454e 4142 4c45 443d 300a  d.CGO_ENABLED=0.
01c8da40: 6275 696c 6409 474f 4152 4348 3d61 6d64  build.GOARCH=amd

Why?

  • We can write a regex that keys off the build flags included in the buildinfo section of the binary (if you run go version -m <path-to-binary>, you can see the raw data in this section).
  • We can write a regex that looks for version-looking patterns surrounded by null characters.

Wait, why does the syft binary (and others) appear to have null-terminated strings when consul doesn't? Strictly speaking, syft (and others) don't have null-terminated strings either; the surrounding null bytes are a side effect of baking in values via ldflags instead of hard-coded variables.

Building with values passed in via ldflags:

❯ CGO_ENABLED=0 go build -ldflags="-w -s -extldflags '-static' -X github.com/anchore/syft/internal/version.version=0.77.0 -X github.com/anchore/syft/internal/version.gitCommit=dd30c99bc2439cb91e3d084eb21e1040dd5a54dc -X github.com/anchore/syft/internal/version.buildDate=2023-04-11T14:32:58Z -X github.com/anchore/syft/internal/version.gitDescription=v0.77.0 " -o .tmp/syft ./cmd/syft/main.go

❯ xxd ./.tmp/syft > /tmp/syft-ldflags.xxd

❯ cat /tmp/syft-ldflags.xxd | grep -C 1 '0\.77\.0'

0148f2e0: 7b00 0000 7d00 0000 0020 0000 fdff 0000  {...}.... ......
0148f2f0: 302e 3737 2e30 0000 676f 312e 3230 0000  0.77.0..go1.20..    # the actual value passed into the linker
0148f300: 0000 0000 0000 1000 0000 0000 0000 003c  ...............<
--
0148fa90: 0000 0100 0000 0000 0000 0000 0000 0000  ................
0148faa0: 0000 1000 0000 0000 7630 2e37 372e 3000  ........v0.77.0.   
0148fab0: 3300 0000 0000 0000 00ca 9a3b 0000 0000  3..........;....
--
014c7ea0: 6572 6e61 6c2f 7665 7273 696f 6e2e 7665  ernal/version.ve
014c7eb0: 7273 696f 6e3d 302e 3737 2e30 202d 5820  rsion=0.77.0 -X       # record of the ldflags passed in
014c7ec0: 6769 7468 7562 2e63 6f6d 2f61 6e63 686f  github.com/ancho
--
014c7f90: 6f6e 2e67 6974 4465 7363 7269 7074 696f  on.gitDescriptio
014c7fa0: 6e3d 7630 2e37 372e 3020 220a 6275 696c  n=v0.77.0 ".buil       # record of the ldflags passed in
014c7fb0: 6409 4347 4f5f 454e 4142 4c45 443d 300a  d.CGO_ENABLED=0.

Note that we see 0.77.0 surrounded with null characters...

... but we don't see that when the variables are hard coded:

// in syft/internal/version/build.go

var version = "0.77.0"
var gitCommit = "dd30c99bc2439cb91e3d084eb21e1040dd5a54dc"
var gitDescription = valueNotProvided
var buildDate = valueNotProvided
var platform = fmt.Sprintf("%s/%s", runtime.GOOS, runtime.GOARCH)

❯ CGO_ENABLED=0 go build -o .tmp/syft-static ./cmd/syft/main.go

❯ xxd .tmp/syft-static  > /tmp/syft-static.xxd

❯ cat /tmp/syft-static.xxd | grep -C 1 '0\.77\.0'
00f5a0d0: 7461 7274 2f73 7461 7473 2f73 7761 726d  tart/stats/swarm
00f5a0e0: 2f74 6173 6b73 302e 3737 2e30 3030 3030  /tasks0.77.00000   #   <--- notice that 0.77.0 is in the middle of a large data block
00f5a0f0: 3030 3030 3030 3031 3030 3030 3566 3030  0000000100005f00

@wagoodman
Contributor

I think the best path forward for now is to write a regex keyed off of the static assets in the binary, even though this is brittle compared to other methods.

@wagoodman wagoodman self-assigned this Apr 17, 2023
@shanedell
Contributor Author

shanedell commented Apr 17, 2023

@wagoodman So you think something like what you suggested, CONSUL_VERSION: (?P<version>\d+\.\d+\.\d+), for now? From quickly adding that, it looks to have fixed the test.

@wagoodman
Contributor

I think that's the best way forward for now, yes 👍 Just be aware that this is brittle: it won't catch earlier versions of consul and may not find future versions of consul.

… is brittle

Signed-off-by: Shane Dell <shanedell100@gmail.com>
@wagoodman wagoodman merged commit 244b797 into anchore:main Apr 17, 2023
@shanedell shanedell deleted the consul-binary-classifier branch April 17, 2023 16:27
spiffcs added a commit that referenced this pull request Apr 18, 2023
* main:
  chore(deps): update bootstrap tools to latest versions (#1744)
  chore(deps): bump github.com/docker/docker (#1746)
  Create consul binary classifier (#1738)
  chore(deps): update bootstrap tools to latest versions (#1740)
spiffcs added a commit that referenced this pull request Apr 24, 2023
* main:
  Add sections of interest for Gemfile.lock cataloger (#1749)
  fix: update cache.fingerprint file to java-builds dir (#1748)
  Add ALPM Metadata to CYCLONEDX and SPDX output formats (#1747)
  chore: bump stereoscope to latest version (#1741)
  chore(deps): update bootstrap tools to latest versions (#1744)
  chore(deps): bump github.com/docker/docker (#1746)
  Create consul binary classifier (#1738)
  chore(deps): update bootstrap tools to latest versions (#1740)

Signed-off-by: Christopher Phillips <christopher.phillips@anchore.com>
GijsCalis pushed a commit to GijsCalis/syft that referenced this pull request Feb 19, 2024
* Create consul binary classifier

Closes anchore#1590

Signed-off-by: Shane Dell <shanedell100@gmail.com>

* Create test for consul binary classifier

Signed-off-by: Shane Dell <shanedell100@gmail.com>

* Update version for consul. Add note that about consul version matcher is brittle

Signed-off-by: Shane Dell <shanedell100@gmail.com>

---------

Signed-off-by: Shane Dell <shanedell100@gmail.com>