Replace core SBOM-creation API with builder pattern #1383

Merged 35 commits on Jan 12, 2024. Showing changes from 3 commits.
be31e1e
remove existing cataloging API
wagoodman Dec 15, 2023
4e224b5
add file cataloging config
wagoodman Dec 15, 2023
4172a41
add package cataloging config
wagoodman Dec 15, 2023
afc5783
add configs for cross-cutting concerns
wagoodman Dec 15, 2023
f511367
rename CLI option configs to not require import aliases later
wagoodman Dec 15, 2023
f36de9d
update all nested structs for the Catalog struct
wagoodman Dec 15, 2023
768e232
update Catalog cli options
wagoodman Dec 15, 2023
143b0f6
migrate relationship capabilities to separate internal package
wagoodman Dec 15, 2023
e5c582f
refactor golang cataloger to use configuration options when creating …
wagoodman Dec 15, 2023
62b19c2
create internal object to facilitate reading from and writing to an SBOM
wagoodman Dec 15, 2023
839b017
create a command-like object (task) to facilitate partial SBOM creation
wagoodman Dec 15, 2023
44d8543
add cataloger selection capability
wagoodman Dec 15, 2023
87f3eac
add package, file, and environment related tasks
wagoodman Dec 15, 2023
4848648
update existing file catalogers to use nested UI elements
wagoodman Dec 15, 2023
0ed13db
add CreateSBOMConfig that drives the SBOM creation process
wagoodman Dec 15, 2023
b811336
capture SBOM creation info as a struct
wagoodman Dec 15, 2023
a5fe920
add CreateSBOM() function
wagoodman Dec 15, 2023
a3a3961
fix tests
wagoodman Dec 15, 2023
473605c
update docs with SBOM selection help + breaking changes
wagoodman Dec 15, 2023
63c23e2
fix multiple override default inputs
wagoodman Dec 16, 2023
2550e62
fix deprecation flag printing to stdout
wagoodman Dec 16, 2023
098fbd7
refactor cataloger selection description to separate object
wagoodman Jan 3, 2024
208333c
address review comments
wagoodman Jan 3, 2024
e561879
keep expression errors and show specific suggestions only
wagoodman Jan 3, 2024
3f38495
address additional review feedback
wagoodman Jan 10, 2024
dbfcf26
Merge remote-tracking branch 'origin/main' into refactor-cataloging-api
wagoodman Jan 10, 2024
81fa9b2
address more review comments
wagoodman Jan 11, 2024
9fcbbef
addressed additional PR review feedback
wagoodman Jan 11, 2024
81d621b
fix file selection references
wagoodman Jan 11, 2024
af42ef5
remove guess language data generation option
wagoodman Jan 11, 2024
498870d
add tests for coordinatesForSelection
wagoodman Jan 12, 2024
5628045
rename relationship attributes
wagoodman Jan 12, 2024
f8626b1
add descriptions to relationships config fields
wagoodman Jan 12, 2024
f4fb2e1
improve documentation around configuration options
wagoodman Jan 12, 2024
55b4c1d
add explicit errors around legacy config entries
wagoodman Jan 12, 2024
6 changes: 3 additions & 3 deletions README.md
@@ -160,9 +160,9 @@ This default behavior can be overridden with the `default-image-pull-source` con

 By default, Syft will catalog file details and digests for files that are owned by discovered packages. You can change this behavior by using the `SYFT_FILE_METADATA_SELECTION` environment variable or the `file.metadata.selection` configuration option. The options are:

-- `all-files`: capture all files from the search space
-- `owned-files`: capture only files owned by packages (default)
-- `no-files`: disable capturing any file information
+- `all`: capture all files from the search space
+- `owned-by-package`: capture only files owned by packages (default)
+- `none`: disable capturing any file information


 ### Package cataloger selection
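As a rough illustration of the rename above, a loader could reject the legacy values while pointing at their replacements. This is a sketch only: `checkSelection` and `legacyToCurrent` are hypothetical names that do not appear in this PR, and Syft's actual handling of legacy entries may differ.

```go
package main

import "fmt"

// legacyToCurrent maps the selection values removed from the README to their
// replacements. The table is an illustration, not Syft code.
var legacyToCurrent = map[string]string{
	"all-files":   "all",
	"owned-files": "owned-by-package",
	"no-files":    "none",
}

// checkSelection (hypothetical helper) accepts only the current values and
// points users of a legacy value at its replacement.
func checkSelection(value string) error {
	switch value {
	case "all", "owned-by-package", "none":
		return nil
	}
	if replacement, ok := legacyToCurrent[value]; ok {
		return fmt.Errorf("file metadata selection %q has been renamed to %q", value, replacement)
	}
	return fmt.Errorf("invalid file metadata selection: %q", value)
}

func main() {
	for _, v := range []string{"owned-by-package", "owned-files", "some-files"} {
		fmt.Printf("%s: %v\n", v, checkSelection(v))
	}
}
```

Returning an error instead of silently translating the old value keeps a stale configuration from changing behavior unnoticed.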
5 changes: 5 additions & 0 deletions cmd/syft/cli/options/catalog.go
@@ -15,6 +15,7 @@ import (
     "github.com/anchore/syft/syft/cataloging"
     "github.com/anchore/syft/syft/cataloging/filecataloging"
     "github.com/anchore/syft/syft/cataloging/pkgcataloging"
+    "github.com/anchore/syft/syft/file/cataloger/filecontent"
     "github.com/anchore/syft/syft/pkg/cataloger/binary"
     "github.com/anchore/syft/syft/pkg/cataloger/golang"
     "github.com/anchore/syft/syft/pkg/cataloger/java"
@@ -114,6 +115,10 @@ func (cfg Catalog) ToFilesConfig() filecataloging.Config {
     return filecataloging.Config{
         Selection: cfg.File.Metadata.Selection,
         Hashers: hashers,
+        Content: filecontent.Config{
+            Globs: cfg.File.Content.Globs,
+            SkipFilesAboveSize: cfg.File.Content.SkipFilesAboveSize,
+        },
     }
 }
6 changes: 3 additions & 3 deletions cmd/syft/cli/options/file.go
@@ -25,18 +25,18 @@ type fileContent struct {
 func defaultFileConfig() fileConfig {
     return fileConfig{
         Metadata: fileMetadata{
-            Selection: file.OwnedFilesSelection,
+            Selection: file.FilesOwnedByPackageSelection,
             Digests: []string{"sha1", "sha256"},
         },
         Content: fileContent{
-            SkipFilesAboveSize: 1 * intFile.MB,
+            SkipFilesAboveSize: 250 * intFile.KB,
         },
     }
 }

 func (c *fileConfig) PostLoad() error {
     switch c.Metadata.Selection {
-    case file.NoFilesSelection, file.OwnedFilesSelection, file.AllFilesSelection:
+    case file.NoFilesSelection, file.FilesOwnedByPackageSelection, file.AllFilesSelection:
         return nil
     }
     return fmt.Errorf("invalid file metadata selection: %q", c.Metadata.Selection)
33 changes: 0 additions & 33 deletions internal/sbomsync/builder.go
@@ -4,7 +4,6 @@ import (
     "sync"

     "github.com/anchore/syft/syft/artifact"
-    "github.com/anchore/syft/syft/file"
     "github.com/anchore/syft/syft/linux"
     "github.com/anchore/syft/syft/pkg"
     "github.com/anchore/syft/syft/sbom"
@@ -20,10 +19,6 @@ type Builder interface {
     // nodes

     AddPackages(...pkg.Package)
-    AddFileMetadata(file.Coordinates, file.Metadata)
-    AddFileDigests(file.Coordinates, ...file.Digest)
-    AddFileContents(file.Coordinates, string)
-    AddFileLicenses(file.Coordinates, ...file.License)

     // edges

@@ -73,34 +68,6 @@ func (b sbomBuilder) AddPackages(p ...pkg.Package) {
     b.sbom.Artifacts.Packages.Add(p...)
 }

-func (b sbomBuilder) AddFileMetadata(coordinates file.Coordinates, metadata file.Metadata) {
-    b.lock.Lock()
-    defer b.lock.Unlock()
-
-    b.sbom.Artifacts.FileMetadata[coordinates] = metadata
-}
-
-func (b sbomBuilder) AddFileDigests(coordinates file.Coordinates, digest ...file.Digest) {
-    b.lock.Lock()
-    defer b.lock.Unlock()
-
-    b.sbom.Artifacts.FileDigests[coordinates] = append(b.sbom.Artifacts.FileDigests[coordinates], digest...)
-}
-
-func (b sbomBuilder) AddFileContents(coordinates file.Coordinates, s string) {
-    b.lock.Lock()
-    defer b.lock.Unlock()
-
-    b.sbom.Artifacts.FileContents[coordinates] = s
-}
-
-func (b sbomBuilder) AddFileLicenses(coordinates file.Coordinates, license ...file.License) {
-    b.lock.Lock()
-    defer b.lock.Unlock()
-
-    b.sbom.Artifacts.FileLicenses[coordinates] = append(b.sbom.Artifacts.FileLicenses[coordinates], license...)
-}
-
 func (b sbomBuilder) AddRelationships(relationship ...artifact.Relationship) {
     b.lock.Lock()
     defer b.lock.Unlock()
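With the per-field `AddFile*` methods removed, the file tasks in this PR go through `WriteToSBOM` and `ReadFromSBOM` callbacks instead. A minimal sketch of how such a lock-guarded accessor might look is below. The `SBOM` and `accessor` types are stand-ins, and the use of `sync.RWMutex` is an assumption; the real `sbomsync` implementation is not shown in this diff.

```go
package main

import (
	"fmt"
	"sync"
)

// SBOM is a stand-in for sbom.SBOM with only the field this sketch needs.
type SBOM struct {
	FileContents map[string]string
}

// accessor guards an SBOM with a lock and exposes it only through callbacks.
type accessor struct {
	lock *sync.RWMutex
	sbom *SBOM
}

func newAccessor() accessor {
	return accessor{
		lock: &sync.RWMutex{},
		sbom: &SBOM{FileContents: map[string]string{}},
	}
}

// WriteToSBOM runs fn while holding the write lock.
func (a accessor) WriteToSBOM(fn func(*SBOM)) {
	a.lock.Lock()
	defer a.lock.Unlock()
	fn(a.sbom)
}

// ReadFromSBOM runs fn while holding the read lock.
func (a accessor) ReadFromSBOM(fn func(*SBOM)) {
	a.lock.RLock()
	defer a.lock.RUnlock()
	fn(a.sbom)
}

func main() {
	a := newAccessor()

	// several writers update the shared SBOM concurrently
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			a.WriteToSBOM(func(s *SBOM) {
				s.FileContents[fmt.Sprintf("/file-%d", i)] = "contents"
			})
		}(i)
	}
	wg.Wait()

	a.ReadFromSBOM(func(s *SBOM) {
		fmt.Println("entries:", len(s.FileContents))
	})
}
```

One callback-based accessor means new artifact types do not each need a dedicated `Add...` method on the builder interface.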
105 changes: 70 additions & 35 deletions internal/task/file_tasks.go
@@ -8,6 +8,7 @@ import (
     "github.com/anchore/syft/internal/sbomsync"
     "github.com/anchore/syft/syft/artifact"
     "github.com/anchore/syft/syft/file"
+    "github.com/anchore/syft/syft/file/cataloger/filecontent"
     "github.com/anchore/syft/syft/file/cataloger/filedigest"
     "github.com/anchore/syft/syft/file/cataloger/filemetadata"
     "github.com/anchore/syft/syft/pkg"
@@ -24,24 +25,10 @@ func NewFileDigestCatalogerTask(selection file.Selection, hashers ...crypto.Hash
     fn := func(ctx context.Context, resolver file.Resolver, builder sbomsync.Builder) error {
         accessor := builder.(sbomsync.Accessor)

-        var coordinates []file.Coordinates
-
-        accessor.ReadFromSBOM(func(sbom *sbom.SBOM) {
-            if selection == file.OwnedFilesSelection {
-                for _, r := range sbom.Relationships {
-                    // TODO: double check this logic
-                    if r.Type != artifact.ContainsRelationship {
-                        continue
-                    }
-                    if _, ok := r.From.(pkg.Package); !ok {
-                        continue
-                    }
-                    if c, ok := r.To.(file.Coordinates); ok {
-                        coordinates = append(coordinates, c)
-                    }
-                }
-            }
-        })
+        coordinates, ok := coordinatesForSelection(selection, builder)
+        if !ok {
+            return nil
+        }

         result, err := digestsCataloger.Catalog(resolver, coordinates...)
Review comments on this line (from the PR author):

from @willmurphyscode, blocking: we need to explicitly pass all coordinates, since an owned-files selection is not guaranteed to produce any results

I'm going to fix the functional problem in this PR, but the signature and generator issue should really be broken out into a separate follow-up PR

I think this will play into the solution for #2487

         if err != nil {
@@ -68,23 +55,10 @@ func NewFileMetadataCatalogerTask(selection file.Selection) Task {
     fn := func(ctx context.Context, resolver file.Resolver, builder sbomsync.Builder) error {
         accessor := builder.(sbomsync.Accessor)

-        var coordinates []file.Coordinates
-
-        accessor.ReadFromSBOM(func(sbom *sbom.SBOM) {
-            if selection == file.OwnedFilesSelection {
-                for _, r := range sbom.Relationships {
-                    if r.Type != artifact.ContainsRelationship {
-                        continue
-                    }
-                    if _, ok := r.From.(pkg.Package); !ok {
-                        continue
-                    }
-                    if c, ok := r.To.(file.Coordinates); ok {
-                        coordinates = append(coordinates, c)
-                    }
-                }
-            }
-        })
+        coordinates, ok := coordinatesForSelection(selection, builder)
+        if !ok {
+            return nil
+        }

         result, err := metadataCataloger.Catalog(resolver, coordinates...)
         if err != nil {
@@ -100,3 +74,64 @@ func NewFileMetadataCatalogerTask(selection file.Selection) Task {

     return NewTask("file-metadata-cataloger", fn)
 }
+
+func NewFileContentCatalogerTask(cfg filecontent.Config) Task {
+    if len(cfg.Globs) == 0 {
+        return nil
+    }
+
+    cat := filecontent.NewCataloger(cfg)
+
+    fn := func(ctx context.Context, resolver file.Resolver, builder sbomsync.Builder) error {
+        accessor := builder.(sbomsync.Accessor)
+
+        result, err := cat.Catalog(resolver)
+        if err != nil {
+            return err
+        }
+
+        accessor.WriteToSBOM(func(sbom *sbom.SBOM) {
+            sbom.Artifacts.FileContents = result
+        })
+
+        return nil
+    }
+
+    return NewTask("file-content-cataloger", fn)
+}
+
+// TODO: this should be replaced with a fix that allows passing a coordinate or location iterator to the cataloger
+// Today internal to both cataloger this functions differently: a slice of coordinates vs a channel of locations
+func coordinatesForSelection(selection file.Selection, builder sbomsync.Builder) ([]file.Coordinates, bool) {
+    if selection == file.AllFilesSelection {
+        return nil, true
+    }
+
+    if selection == file.FilesOwnedByPackageSelection {
+        var coordinates []file.Coordinates
+
+        accessor := builder.(sbomsync.Accessor)
+
+        accessor.ReadFromSBOM(func(sbom *sbom.SBOM) {
+            for _, r := range sbom.Relationships {
+                if r.Type != artifact.ContainsRelationship {
+                    continue
+                }
+                if _, ok := r.From.(pkg.Package); !ok {
+                    continue
+                }
+                if c, ok := r.To.(file.Coordinates); ok {
+                    coordinates = append(coordinates, c)
+                }
+            }
+        })
+
+        if len(coordinates) == 0 {
+            return nil, false
+        }
+
+        return coordinates, true
+    }
+
+    return nil, false
+}
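The return contract of `coordinatesForSelection` can be summarized with a small stand-alone sketch. Plain strings replace `file.Coordinates`, and the `owned` parameter stands in for the coordinates gathered from package "contains" relationships. `coordinatesFor` and the local `Selection` constants are illustrative names, not Syft's API.

```go
package main

import "fmt"

// Selection mirrors the three configuration values named in the README.
type Selection string

const (
	NoFiles        Selection = "none"
	OwnedByPackage Selection = "owned-by-package"
	AllFiles       Selection = "all"
)

// coordinatesFor returns the coordinates to catalog and whether the task
// should run at all.
func coordinatesFor(selection Selection, owned []string) ([]string, bool) {
	switch selection {
	case AllFiles:
		// no explicit coordinates, but the task should still run
		return nil, true
	case OwnedByPackage:
		if len(owned) == 0 {
			// nothing is owned by a package: skip the task rather than
			// hand the cataloger an empty coordinate list
			return nil, false
		}
		return owned, true
	}
	// "none" and any unknown value: skip the task
	return nil, false
}

func main() {
	cases := []struct {
		selection Selection
		owned     []string
	}{
		{AllFiles, nil},
		{OwnedByPackage, nil},
		{OwnedByPackage, []string{"/bin/busybox"}},
		{NoFiles, []string{"/bin/busybox"}},
	}
	for _, c := range cases {
		coords, ok := coordinatesFor(c.selection, c.owned)
		fmt.Printf("%-17s owned=%d -> coords=%v run=%v\n", c.selection, len(c.owned), coords, ok)
	}
}
```

The second case is the one raised in review: an `owned-by-package` selection with no owned files must skip the task instead of passing an empty list on to the cataloger.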
14 changes: 9 additions & 5 deletions syft/cataloging/filecataloging/config.go
@@ -9,16 +9,19 @@ import (
     intFile "github.com/anchore/syft/internal/file"
     "github.com/anchore/syft/internal/log"
     "github.com/anchore/syft/syft/file"
+    "github.com/anchore/syft/syft/file/cataloger/filecontent"
 )

 type Config struct {
-    Selection file.Selection `yaml:"selection" json:"selection" mapstructure:"selection"`
-    Hashers []crypto.Hash `yaml:"hashers" json:"hashers" mapstructure:"hashers"`
+    Selection file.Selection `yaml:"selection" json:"selection" mapstructure:"selection"`
+    Hashers []crypto.Hash `yaml:"hashers" json:"hashers" mapstructure:"hashers"`
+    Content filecontent.Config `yaml:"content" json:"content" mapstructure:"content"`
 }

 type configMarshaledForm struct {
-    Selection file.Selection `yaml:"selection" json:"selection" mapstructure:"selection"`
-    Hashers []string `yaml:"hashers" json:"hashers" mapstructure:"hashers"`
+    Selection file.Selection `yaml:"selection" json:"selection" mapstructure:"selection"`
+    Hashers []string `yaml:"hashers" json:"hashers" mapstructure:"hashers"`
+    Content filecontent.Config `yaml:"content" json:"content" mapstructure:"content"`
 }

 func DefaultConfig() Config {
@@ -27,8 +30,9 @@ func DefaultConfig() Config {
         log.WithFields("error", err).Warn("unable to create file hashers")
     }
     return Config{
-        Selection: file.OwnedFilesSelection,
+        Selection: file.FilesOwnedByPackageSelection,
         Hashers: hashers,
+        Content: filecontent.DefaultConfig(),
     }
 }
8 changes: 4 additions & 4 deletions syft/cataloging/filecataloging/config_test.go
@@ -21,10 +21,10 @@ func TestConfig_MarshalJSON(t *testing.T) {
         {
             name: "converts hashers to strings",
             cfg: Config{
-                Selection: file.OwnedFilesSelection,
+                Selection: file.FilesOwnedByPackageSelection,
                 Hashers: []crypto.Hash{crypto.SHA256},
             },
-            want: []byte(`{"selection":"owned-files","hashers":["sha-256"]}`),
+            want: []byte(`{"selection":"owned-by-package","hashers":["sha-256"],"content":{"globs":null,"skip-files-above-size":0}}`),
         },
     }
     for _, tt := range tests {
@@ -54,9 +54,9 @@ func TestConfig_UnmarshalJSON(t *testing.T) {
     }{
         {
             name: "converts strings to hashers",
-            data: []byte(`{"selection":"owned-files","hashers":["sha-256"]}`),
+            data: []byte(`{"selection":"owned-by-package","hashers":["sha-256"]}`),
             want: Config{
-                Selection: file.OwnedFilesSelection,
+                Selection: file.FilesOwnedByPackageSelection,
                 Hashers: []crypto.Hash{crypto.SHA256},
             },
         },
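The tests above rely on `Config` being serialized through a separate marshaled form in which hashers become strings. A stand-alone sketch of that pattern follows. The types are simplified stand-ins (no `content` field), and lower-casing `crypto.Hash.String()` is an assumption chosen to reproduce the `"sha-256"` value in the test data; Syft's own name conversion is not shown in this diff.

```go
package main

import (
	"crypto"
	"encoding/json"
	"fmt"
	"strings"
)

// config is a simplified stand-in for filecataloging.Config.
type config struct {
	Selection string
	Hashers   []crypto.Hash
}

// marshaledForm is the shape written to JSON: hashers appear as names.
type marshaledForm struct {
	Selection string   `json:"selection"`
	Hashers   []string `json:"hashers"`
}

// MarshalJSON converts each crypto.Hash to a lower-case name before encoding.
func (c config) MarshalJSON() ([]byte, error) {
	names := make([]string, 0, len(c.Hashers))
	for _, h := range c.Hashers {
		names = append(names, strings.ToLower(h.String()))
	}
	return json.Marshal(marshaledForm{Selection: c.Selection, Hashers: names})
}

// sample matches the configuration used in the test case above.
var sample = config{
	Selection: "owned-by-package",
	Hashers:   []crypto.Hash{crypto.SHA256},
}

// render encodes a config and panics on failure, which keeps the demo short.
func render(c config) string {
	out, err := json.Marshal(c)
	if err != nil {
		panic(err)
	}
	return string(out)
}

func main() {
	fmt.Println(render(sample))
}
```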
5 changes: 3 additions & 2 deletions syft/configuration_audit_trail_test.go
@@ -101,8 +101,9 @@ func collectJSONTags(t *testing.T, v reflect.Value, tags *[]string, parentTag st
 }

 func assertLowercaseKebab(t *testing.T, tag string) {
+    t.Helper()
     require.NotEmpty(t, tag)
-    assert.Equal(t, tag, strcase.ToKebab(tag))
+    assert.Equal(t, strcase.ToKebab(tag), tag)
 }

 func Test_collectJSONTags(t *testing.T) {
@@ -225,7 +226,7 @@ func Test_configurationAuditTrail_MarshalJSON(t *testing.T) {
             cfg: configurationAuditTrail{

                 Files: filecataloging.Config{
-                    Selection: file.OwnedFilesSelection,
+                    Selection: file.FilesOwnedByPackageSelection,
                     Hashers: []crypto.Hash{
                         crypto.SHA256,
                     },
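For readers without the `strcase` dependency at hand, the property that `assertLowercaseKebab` checks can be approximated with a regular expression. This swaps in a different technique from the round-trip through `strcase.ToKebab` used in the test above, so edge cases may differ; `isLowercaseKebab` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"regexp"
)

// kebab matches one or more lower-case alphanumeric words joined by single hyphens.
var kebab = regexp.MustCompile(`^[a-z0-9]+(-[a-z0-9]+)*$`)

// isLowercaseKebab reports whether a JSON tag is non-empty lower-case kebab.
func isLowercaseKebab(tag string) bool {
	return kebab.MatchString(tag)
}

func main() {
	for _, tag := range []string{"skip-files-above-size", "skipFilesAboveSize", "skip_files", ""} {
		fmt.Printf("%q: %v\n", tag, isLowercaseKebab(tag))
	}
}
```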
2 changes: 2 additions & 0 deletions syft/create_sbom.go
@@ -21,6 +21,8 @@ import (
     "github.com/anchore/syft/syft/source"
 )

+// CreateSBOM creates a software bill-of-materials from the given source. If the CreateSBOMConfig is nil, then
+// default options will be used.
 func CreateSBOM(ctx context.Context, src source.Source, cfg *CreateSBOMConfig) (*sbom.SBOM, error) {
     if cfg == nil {
         cfg = DefaultCreateSBOMConfig()
6 changes: 5 additions & 1 deletion syft/create_sbom_config.go
@@ -146,7 +146,7 @@ func (c *CreateSBOMConfig) makeTaskGroups(src source.Description) ([][]task.Task
     }

     // combine the user-provided and configured tasks
-    if c.Files.Selection == file.OwnedFilesSelection {
+    if c.Files.Selection == file.FilesOwnedByPackageSelection {
         // special case: we need the package info when we are cataloging files owned by packages
         taskGroups = append(taskGroups, pkgTasks, fileTasks)
     } else {
@@ -182,6 +182,10 @@ func (c *CreateSBOMConfig) fileTasks() []task.Task {
     if t := task.NewFileMetadataCatalogerTask(c.Files.Selection); t != nil {
         tsks = append(tsks, t)
     }
+    if t := task.NewFileContentCatalogerTask(c.Files.Content); t != nil {
+        tsks = append(tsks, t)
+    }
+
     return tsks
 }
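The `if t := ...; t != nil` guards in `fileTasks` work because a task constructor may return nil when its feature is not configured, as `NewFileContentCatalogerTask` does when no globs are set. A minimal sketch of that convention, using stand-in types rather than Syft's `task` package (`newContentTask`, `namedTask`, and this `fileTasks` are illustrative):

```go
package main

import "fmt"

// Task is a stand-in for task.Task.
type Task interface {
	Name() string
}

type namedTask string

func (t namedTask) Name() string { return string(t) }

// newContentTask returns nil when there are no globs, meaning "nothing to do".
func newContentTask(globs []string) Task {
	if len(globs) == 0 {
		return nil
	}
	return namedTask("file-content-cataloger")
}

// fileTasks collects only the tasks that were actually constructed.
func fileTasks(globs []string) []Task {
	tasks := []Task{namedTask("file-metadata-cataloger")}
	if t := newContentTask(globs); t != nil {
		tasks = append(tasks, t)
	}
	return tasks
}

func main() {
	for _, globs := range [][]string{nil, {"**/*.conf"}} {
		fmt.Printf("globs=%v ->", globs)
		for _, t := range fileTasks(globs) {
			fmt.Printf(" %s", t.Name())
		}
		fmt.Println()
	}
}
```

Because `newContentTask` returns the interface type directly, the nil it returns is a true nil interface value and the `t != nil` check behaves as expected.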