-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
caching semgrep scan #652
base: main
Are you sure you want to change the base?
caching semgrep scan #652
Conversation
Reviewer's Guide by SourceryThis pull request implements caching for semgrep scans, storing results in a local SQLite database. It also adds autoescaping to the Jinja2 environment for enhanced security. Sequence diagram for semgrep scan with cachingsequenceDiagram
participant Agent
participant Cache
participant Semgrep
participant FileSystem
Agent->>Semgrep: Run scan with custom args
Semgrep->>FileSystem: Write findings to JSON file
alt scan successful
Agent->>FileSystem: Read findings file
Agent->>Cache: Cache results
Note over Agent,Cache: Store in ~/.verinfast_cache/semgrep.db
end
Agent->>FileSystem: Process and update findings
Agent->>FileSystem: Write processed findings
Class diagram showing Agent class modificationsclassDiagram
class Agent {
+Cache cache
+Config config
+String directory
+__init__()
+create_template()
+parseRepo(path: str, repo_name: str, branch: str)
}
class Cache {
+set(key, value)
}
Agent --> Cache: uses
note for Agent "Added caching capability"
note for Cache "New cache implementation for semgrep results"
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @ojokure - I've reviewed your changes - here's some feedback:
Overall Comments:
- The caching implementation needs cache invalidation logic to avoid using stale results. Consider adding checks based on file modification times or content hashes before using cached scan results.
- Consider adding cache management features like maximum cache size limits and a way to clear old entries to prevent unbounded cache growth.
Here's what I looked at during the review
- 🟢 General issues: all looks good
- 🟢 Security: all looks good
- 🟢 Testing: all looks good
- 🟢 Complexity: all looks good
- 🟢 Documentation: all looks good
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
Please make sure to add unit tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One change on some deleted lines.
findings_success = False | ||
if not self.config.dry: | ||
self.log(msg=repo_name, tag="Scanning repository", display=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These lines are important. They make it not run at all if it's "dry".
335e8d4
to
9cbaaa9
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #652 +/- ##
==========================================
+ Coverage 77.59% 77.80% +0.20%
==========================================
Files 29 29
Lines 2178 2194 +16
==========================================
+ Hits 1690 1707 +17
+ Misses 488 487 -1 ☔ View full report in Codecov by Sentry. |
1b96ad3
to
2b149ab
Compare
2b149ab
to
df75812
Compare
Why are these changes needed?
Cache semgrep scan
Related issue number
Checks
Summary by Sourcery
New Features: