-
Notifications
You must be signed in to change notification settings - Fork 597
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ISSUE-1830: Adding Stop scan related changes for Vulnerability, Compliance and Cloud compliance scans #1585
Conversation
9b12357
to
17e634c
Compare
deepfence_server/controls/agent.go
Outdated
WHERE s.stop_requested=true | ||
WITH s LIMIT $max_work | ||
SET s.status = '`+utils.SCAN_STATUS_CANCELLING+`', s.updated_at = TIMESTAMP() | ||
SET s.status = '`+utils.SCAN_STATUS_CANCELLING+`', | ||
s.updated_at = TIMESTAMP(), s.stop_requested=false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we introducing this flag?
Previous behavior is more aligned with the whole status workflow:
- Starting a scan: Starting, In progress
- Stopping a scan: Cancelling, Cancelled
No need to fork and have a stop_requested
flag now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Stop functionality by relying on the scan status will not work completely.
Reason being, an already running scan will continuously emit "IN PROGRESS" asynchronously
Hence there's a chance we might overwrite the "CANCEL_PENDING" status with IN PROGRESS" even before we check this scan.
This will result in the stop never getting triggered.
Hence we would need this stop_requested
flag for deterministic behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point, but the flag is not the solution: we need to set some priority when setting the status
The same way that if IN PROGRESS come after a COMPLETE we should not "uncomplete" the scan.
Let's make it consistent across the whole scan system
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have some guarantees for the statuses:
- IN PROGRESS cannot come in before the "STARTING".
- IN PROGRESS cannot come after the "COMPLETED" , "ERROR" or "CANCELLED" state (terminal state).
If these 2 are violated, we have a bug in the code and need to be fixed :)
Also, the IN_PROGRESS is used at the kafka ingester for updating the db. Making any change there is not ideal as it's going to complicate the logic. Also, that's the hot path and might have performance impact.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reverted the flag change as we will be adding the logic to handle this in the status ingestion.
query := `MATCH (n:%s) -[:SCANNED]-> () | ||
WHERE n.node_id = $scan_id | ||
AND n.status = $in_progress | ||
SET n.status=$cancel_pending, n.stop_requested=true` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can see here how the stop_requested
is redundant with the cancel_pending
status
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually this is not going to work :(
Reason being, an already running scan will continuously emit "IN PROGRESS" asynchronously
Hence there's a chance we might overwrite the "CANCEL_PENDING" status with IN PROGRESS" even before we check this scan.
This will result in the stop never getting triggered.
Hence we would need this flag for deterministic behavior.
err = RegisterControl(ctl.StopVulnerabilityScan, | ||
GetRegisterControlFunc[ctl.StopVulnerabilityScanRequest](pub, | ||
sdkUtils.StopVulnerabilityScanTask)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: for consistency, let's put stopVuln scan before secret & malware, to keep them in the same order as the starts
if err != nil { | ||
return err | ||
} | ||
|
||
return nil | ||
} | ||
|
||
func GetRegisterControlFunc[T ctl.StartVulnerabilityScanRequest | ctl.StartSecretScanRequest | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice refactor
scanID := params.ScanId | ||
cancelFnObj, found := scanMap.Load(scanID) | ||
logMsg := "" | ||
if found { | ||
cancelFn := cancelFnObj.(context.CancelFunc) | ||
cancelFn() | ||
logMsg = "Stop GenerateSBOM request submitted" | ||
} else { | ||
logMsg = "Failed to Stop scan, SBOM may have already generated or errored out" | ||
} | ||
|
||
log.Info().Msgf("%s, scan_id: %s", logMsg, scanID) | ||
return nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there is an error, let's propagate it up instead of logging
We can also wrap it:
err = fmt.Errorf("Failed to stop: %w", err)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not a hard error as such.
The stop can fail if the scan was either not started or has already completed.
In either case, the scan is not running.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cancelling logic should rely on status only, the same way STARTING
is purely based on status
7900621
to
b0a459c
Compare
Compliance and Cloud compliance scans
88eeb97
to
b4f27bc
Compare
scanDetail := model.CloudComplianceScanDetails{ | ||
ScanId: scan.ScanId, | ||
ScanTypes: scan.BenchmarkTypes, | ||
AccountId: req.CloudAccount, | ||
Benchmarks: benchmarks, | ||
StopRequested: scan.StopRequested, | ||
StopRequested: stopRequested, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
StopRequested: stopRequested, | |
StopRequested: scan.Status == utils.SCAN_STATUS_CANCELLING, |
scanDetail := model.CloudComplianceScanDetails{ | ||
ScanId: scan.ScanId, | ||
ScanTypes: scan.BenchmarkTypes, | ||
AccountId: monitoredAccountId, | ||
Benchmarks: benchmarks, | ||
StopRequested: scan.StopRequested, | ||
StopRequested: stopRequested, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
StopRequested: stopRequested, | |
StopRequested: scan.Status == utils.SCAN_STATUS_CANCELLING, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Fixes #
https://github.com/deepfence/enterprise-roadmap/issues/1830
Changes proposed in this pull request: