Allow gitlab to resume from encoded resume info #611
Merged
Changes from 1 commit
Commits (5)
bf8e54b  Separate repo resume functions into reusable functions in the sources… (trufflesteeeve)
da38994  Allow gitlab to resume from encoded resume info (trufflesteeeve)
dd37530  fixup - fix bug where an inaccurate number of repos would be reported (trufflesteeeve)
cd1d6bd  fixup - add fixes from review, including progress test (trufflesteeeve)
1e4688d  fixup - fix logic bug in getRepos (trufflesteeeve)
@@ -0,0 +1,83 @@
package sources

import (
	"strings"

	"github.com/sirupsen/logrus"
)

// RemoveRepoFromResumeInfo removes the repoURL from the resume info.
func RemoveRepoFromResumeInfo(resumeRepos []string, repoURL string) []string {
	index := -1
	for i, repo := range resumeRepos {
		if repoURL == repo {
			index = i
		}
	}

	if index == -1 {
		// We should never be able to be here. But if we are, it means the resume info never had the repo added.
		// So log the error and do nothing.
		logrus.Errorf("repoURL (%q) not found in list of encode resume info: %v", repoURL, resumeRepos)
		return resumeRepos
	}

	// This removes the element at the given index.
	return append(resumeRepos[:index], resumeRepos[index+1:]...)
}

// FilterReposToResume filters the existing repos down to those that are included in the encoded resume info.
// It returns the new slice of repos to be scanned.
// It also returns the difference between the original length of the repos and the new length to use for progress reporting.
// It is required that both the resumeInfo repos and the existing repos are sorted.
func FilterReposToResume(repos []string, resumeInfo string) (reposToScan []string, progressOffsetCount int) {
	if resumeInfo == "" {
		return repos, 0
	}

	resumeInfoSlice := DecodeResumeInfo(resumeInfo)

	// Because this scanner is multithreaded, it is possible that we have scanned a range of repositories
	// with some gaps of unlisted but completed repositories in between the ones in resumeInfo.
	// So we know repositories that have not finished scanning are the ones included in the resumeInfo,
	// and those that come after the last repository in the resumeInfo.
	// However, it is possible that a resumed scan does not include all or even any of the repos within the resumeInfo.
	// In this case, we must ensure we still scan all repos that come after the last found repo in the list.
	lastFoundRepoIndex := -1
	resumeRepoIndex := 0
	for i, repoURL := range repos {
		// If the repoURL is bigger than what we're looking for, move to the next one.
		if repoURL > resumeInfoSlice[resumeRepoIndex] {
			resumeRepoIndex++
		}

		// If we've found all of our repositories end the filter.
		if resumeRepoIndex == len(resumeInfoSlice) {
			break
		}

		// If the repoURL is the one we're looking for, add it and update the lastFoundRepoIndex.
		if repoURL == resumeInfoSlice[resumeRepoIndex] {
			lastFoundRepoIndex = i
			reposToScan = append(reposToScan, repoURL)
		}
	}

	// Append all repos after the last one we've found.
	reposToScan = append(reposToScan, repos[lastFoundRepoIndex+1:]...)
	progressOffsetCount = len(repos) - len(reposToScan)
	return
}

func EncodeResumeInfo(resumeInfoSlice []string) string {
	return strings.Join(resumeInfoSlice, "\t")
}

func DecodeResumeInfo(resumeInfo string) []string {
	// strings.Split will, for an empty string, return []string{""},
	// which is an element, whereas when there is no resume info we want an empty slice.
	if resumeInfo == "" {
		return nil
	}
	return strings.Split(resumeInfo, "\t")
}
Nit (optional): I know this is simply moving code around, but we can `break` out of this loop once we found the repo. Alternatively, since the `resumeRepos` are sorted, we could do a binary search using `sort.SearchStrings`.
Ah that's neat. I do like the `break`. But looking at `sort.SearchStrings`, it could return an index that we wouldn't want to use, because it doesn't actually contain the `repoURL`, and we'd have to check that that index did exactly equal the repo. But still very cool to know about.