Optimization of Dependency Retrieval for Red Hat-Based Systems #3589

Open
PatrickStarBaby opened this issue Jan 15, 2025 · 2 comments
Labels
enhancement New feature or request

Comments

@PatrickStarBaby

Currently, Syft retrieves dependencies for Red Hat-based systems by scanning the system's rpmdb metadata files and recording the Requires and Provides fields of each installed package. Dependencies are determined by matching each Requires entry of a package against the Provides entries of the others. If multiple providers exist for the same Requires entry, the current logic in Syft records all of them. In practice, however, the system uses only one of these providers for a given Requires entry.
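
To make the behavior described above concrete, here is a minimal Python sketch of Requires/Provides matching that records every provider. It is not Syft's actual code (Syft is written in Go), and the function name `resolve_dependencies` and the package data are hypothetical and only illustrate the shape of the problem; they are not real rpmdb contents.

```python
from collections import defaultdict


def resolve_dependencies(packages):
    """Map each package to every package that provides one of its Requires.

    `packages` is a hypothetical structure: name -> {"provides": [...], "requires": [...]}.
    """
    # Index: capability -> set of package names that provide it.
    providers = defaultdict(set)
    for name, meta in packages.items():
        for capability in meta["provides"]:
            providers[capability].add(name)

    dependencies = {}
    for name, meta in packages.items():
        found = set()
        for requirement in meta["requires"]:
            # Every provider is recorded, even when several packages provide
            # the same capability. This is the behavior the issue questions.
            found |= providers.get(requirement, set())
        found.discard(name)  # a package is not its own dependency
        dependencies[name] = sorted(found)
    return dependencies


# Illustrative data only, loosely modeled on the /bin/sh case.
packages = {
    "info": {"provides": ["info"], "requires": ["/bin/sh", "libc.so.6()(64bit)"]},
    "bash": {"provides": ["bash", "/bin/sh"], "requires": []},
    "coreutils": {"provides": ["coreutils", "/bin/sh"], "requires": []},
    "glibc": {"provides": ["glibc", "libc.so.6()(64bit)"], "requires": []},
}

print(resolve_dependencies(packages)["info"])  # ['bash', 'coreutils', 'glibc']
```

With this matching rule, a single Requires entry such as /bin/sh pulls in both packages that provide it.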

For example, many software packages include a dependency on /bin/sh, but there are two possible providers for /bin/sh: coreutils and bash. When generating an SBOM for an openEuler 24.03 (LTS) container with Syft, it was observed that the info@7.0.3-3.oe2403 package lists bash, coreutils, glibc, and ncurses-libs as its dependencies. Upon analyzing why Syft detected these four dependencies, it became clear that:

Both coreutils and bash are providers for /bin/sh.

Thus, both coreutils and bash were recorded as dependencies. However, in actual usage, the provider for /bin/sh is the bash package. This raises the question of whether coreutils needs to be listed as a dependency of the info package when it is not the package that supplies /bin/sh on the running system.
To improve accuracy, I suggest using the rpm -q --whatprovides XXX command to identify which provider the system actually uses for a specific dependency. This would make the dependency relationships more reliable and keep unnecessary packages out of the SBOM.
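
As a rough sketch of the suggestion above, the following Python wrapper calls rpm -q --whatprovides and returns the package names it prints. The function name `what_provides` and the injectable `run` parameter are hypothetical conveniences for illustration. The sketch assumes it runs on the system that owns the rpmdb, and it makes no claim that rpm's answer is the provider actually in use; rpm may print more than one package.

```python
import subprocess


def what_provides(capability, run=subprocess.run):
    """Return the packages rpm reports as providing `capability`.

    `run` is injectable (hypothetical design choice) so the parsing can be
    exercised without an rpm installation.
    """
    result = run(
        ["rpm", "-q", "--whatprovides", capability],
        capture_output=True,
        text=True,
        check=False,
    )
    # rpm exits non-zero when no installed package provides the capability.
    if result.returncode != 0:
        return []
    # One package per line; rpm may list several providers.
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]
```

On an RPM-based host, `what_provides("/bin/sh")` would return whatever rpm prints for that capability; the result is only as accurate as rpm's own lookup.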

Below are screenshots of the metadata I took from rpmdb:
[three images]

PatrickStarBaby added the enhancement (New feature or request) label Jan 15, 2025
@spiffcs
Contributor

spiffcs commented Jan 15, 2025

Thanks for the report @PatrickStarBaby. This is a good enhancement request. Syft, on principle, does not execute commands in the image we're scanning, but we might be able to imitate the behavior of rpm -q --whatprovides /bin/sh.

Do you have a sense of how rpm -q --whatprovides gets the correct answer?

@PatrickStarBaby
Author

Sorry, I made a mistake. After some investigation, I found that the "rpm -q --whatprovides" command may not always provide the correct answer. I will continue searching for a new method to find the real provider.

image

Projects
Status: Backlog