Include chunk.Data []byte in the ResultsWithMetadata struct #1357

Closed
strazzere opened this issue May 24, 2023 · 0 comments

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions; they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Description

When a detector returns results, there is currently no way to associate a result with the exact chunk.Data (or just the chunk) that produced it unless the calling code collects every chunk object alongside every result object. It would be useful, especially for complex structures, to emit chunk.Data in the ResultsWithMetadata struct so that consumers have the blob that caused the detector to fire.

This seems safe, and it should also reduce memory use when the scanned data is large and compressed, because the original file would not need to be passed around and decompressed again.

Problem to be Addressed

As described above: emit chunk.Data in the ResultsWithMetadata struct, so that a downstream service only needs to handle the result structs to have all the information about a detection, without also having to retain the chunks.

Description of the Preferred Solution

Emitting the chunk.Data into the ResultsWithMetadata struct.
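
To make the request concrete, here is a minimal sketch of what this could look like. The types below are simplified stand-ins written for this issue, not the project's actual definitions; the `ChunkData` field name and the `wrapResult` helper are hypothetical. The copy is made on the assumption that a chunk's buffer might be reused after scanning.

```go
package main

import "fmt"

// Chunk is a simplified stand-in for the scanner's chunk type.
type Chunk struct {
	Data []byte
}

// Result is a simplified stand-in for a detector result.
type Result struct {
	DetectorName string
	Raw          []byte
}

// ResultsWithMetadata is a simplified stand-in for the struct named in this
// issue, extended with the proposed (hypothetical) ChunkData field.
type ResultsWithMetadata struct {
	Result
	ChunkData []byte // proposed: the chunk bytes that caused the detector to fire
}

// wrapResult is a hypothetical helper that attaches a copy of the chunk's
// data to the result, so the result stays valid even if the chunk's buffer
// is later reused or modified.
func wrapResult(r Result, c *Chunk) ResultsWithMetadata {
	data := make([]byte, len(c.Data))
	copy(data, c.Data)
	return ResultsWithMetadata{Result: r, ChunkData: data}
}

func main() {
	chunk := &Chunk{Data: []byte("config: token=abc123")}
	res := wrapResult(Result{DetectorName: "example", Raw: []byte("abc123")}, chunk)
	fmt.Printf("%s fired on %q\n", res.DetectorName, res.ChunkData)
}
```

With something like this, a downstream consumer could read `res.ChunkData` directly instead of keeping its own map from results back to chunks.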

Additional Context

In my specific use case the original file is large and compressed, so the chunks are decompressed, stepped into, and scanned. If ResultsWithMetadata contained the chunk being scanned, the original binary would not need to be passed around and held in memory; it could be streamed during decompression instead of being read in all at once. This keeps processing memory to a minimum.

References

N/A
