Skip to content
This repository has been archived by the owner on Sep 9, 2020. It is now read-only.

internal/gps: Parse abbreviated git revisions #1027

Merged
merged 8 commits into from
Aug 28, 2017

Conversation

spenczar
Copy link
Contributor

What does this do / why do we need it?

This changes the SourceManager's InferConstraint method to look up any revisions which are present in the source for a project. It does this as a last-gasp effort so that it doesn't prematurely expand a tag or a branch name into a precise commit. In practice, this means it will expand any abbreviated git commit identifiers.

While dep doesn't want to encourage the use of abbreviated git commit identifiers, they come up frequently when we parse vendoring specifiers (like glide.yaml) of existing projects. We should give a shot at parsing these and then expanding them to their unabbreviated form.

What should your reviewer look out for in this PR?

It is totally possible that there are unintended consequences here. I don't know all the ways in which InferConstraint's output is used.

I also don't know anything about how caching plays in here. I haven't tried to cache anything.

Which issue(s) does this PR fix?

Fixes #987

While dep doesn't want to encourage the use of abbreviated git commit
identifiers, they come up frequently when we parse vendoring specifiers (like
glide.yaml) of existing projects. We should give a shot at parsing these and
then expanding them to their unabbreviated form.
@@ -40,6 +41,10 @@ func (bs *baseVCSSource) upstreamURL() string {
return bs.repo.Remote()
}

func (bs *baseVCSSource) commitInfo(ctx context.Context, r Revision) (*vcs.CommitInfo, error) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it appropriate to return this type?

We could return (commit, author, message string, timestamp time.Time, err error) to avoid tangling up too tightly with an external project.

Another option is to make this a more targeted method, like commitFor(ctx context.Context, specifier string) (Revision, error).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I was thinking that we'd return just a Revision.

@spenczar
Copy link
Contributor Author

Codeclimate is upset at context.TODO(), which seems a little draconian to me. Should I make those context.Background()?

@spenczar spenczar changed the title Parse abbreviated git revisions internal/gps: Parse abbreviated git revisions Aug 17, 2017
Copy link
Collaborator

@carolynvs carolynvs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I defer on all points to whatever @sdboyer thinks about this, but wanted to comment anyway. 😀

@@ -550,6 +563,7 @@ type source interface {
getManifestAndLock(context.Context, ProjectRoot, Revision, ProjectAnalyzer) (Manifest, Lock, error)
listPackages(context.Context, ProjectRoot, Revision) (pkgtree.PackageTree, error)
revisionPresentIn(Revision) (bool, error)
commitInfo(context.Context, Revision) (*vcs.CommitInfo, error)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this should return a Revision, instead of types from vcs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works for me 👌

@@ -578,6 +578,18 @@ func (sm *SourceMgr) InferConstraint(s string, pi ProjectIdentifier) (Constraint
return version.Unpair(), nil
}

// Abbreviated git revision? If the string is present, there's a good shot of
// this.
if present, _ := sm.RevisionPresentIn(pi, Revision(s)); present {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the top of this function are a bunch of string checks to validate the revision. I was hoping that we could replace those heuristics with this new logic to lookup the revision from the repo.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I think we can definitely yoink those out, will do.

// Abbreviated git revision? If the string is present, there's a good shot of
// this.
if present, _ := sm.RevisionPresentIn(pi, Revision(s)); present {
srcg, err := sm.srcCoord.getSourceGatewayFor(context.TODO(), pi)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of dipping into the source gateway, this should probably be another function on the source manager, just like RevisionPresentIn which handles getting the value from the source gateway.

func (sm *SourceMgr) RevisionPresentIn(id ProjectIdentifier, r Revision) (bool, error) {
if atomic.CompareAndSwapInt32(&sm.releasing, 1, 1) {
return false, smIsReleased{}
}
srcg, err := sm.srcCoord.getSourceGatewayFor(context.TODO(), id)
if err != nil {
// TODO(sdboyer) More-er proper-er errors
return false, err
}
return srcg.revisionPresentIn(context.TODO(), r)
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern here was just in adding another public method to the SourceManager interface, which is already very large. What would you think of a private method on the *sourceManager here to encapsulate this behavior?

(*sourceManager).RevisionPresentIn already digs into the source gateway in its implementation, so this would be similar, in my head.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good catch. A private method for InferConstraint to call would be better.

@@ -40,6 +41,10 @@ func (bs *baseVCSSource) upstreamURL() string {
return bs.repo.Remote()
}

func (bs *baseVCSSource) commitInfo(ctx context.Context, r Revision) (*vcs.CommitInfo, error) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I was thinking that we'd return just a Revision.

commitInfo returned a struct from an external package. This coupling isn't
ideal, and we wouldn't use the extra info for anything. It's better to just have
a method for exactly what we want, which is disambiguation of short revision
specifiers.
Now that we are disambiguating revisions by going to the source, we
don't need the heuristic check of string length and hex parseability
anymore.
This just cleans up the SourceMgr implementation a bit.
// Next, try for bzr, which has a three-component GUID separated by
// dashes. There should be two, but the email part could contain
// internal dashes
// Bazaar has a three-component GUID separated by dashes. There should be two,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when you remove the specific handling for bzr? I was hoping (but don't know for sure!) that repo.CommitInfo would work for all vcs types, including bzr.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was hoping we could too, but doing so makes the current test fail (since the current test operates on a git repo, not a bzr one).

I think this should work for bzr too... but I've never used bzr in my life, really. I'm a little wary of trying to write a realistic test given my lack of experience with bzr. If you feel like you could review it well and guide me, though, I'm game.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I don't think our CI build runs against any real bzr repos?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the gps tests include some real tests against bzr.

@@ -578,9 +566,29 @@ func (sm *SourceMgr) InferConstraint(s string, pi ProjectIdentifier) (Constraint
return version.Unpair(), nil
}

// Revision, possibly abbreviated
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Originally the function checked if the string given was a revision first (note the comment at the top of the file). Unless we need to change that behavior(?), let's move this back to the top where we are replacing the old code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we do need to change that behavior. If we put it at the top, then we'll dereference branch names and version tags, turning them into pure Revisions.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay then let's just update the comment to reflect the new order.

The previous commit changed the order of prefernce when inferring
constraints. Also, moved the bazaar GUID parsing stuff to make sure
that it receives the same preference level as revisions for other
repository types.
String parsing was a little error prone, so we'd prefer to ask the
bazaar repository whether a particular string is a valid revision ID.
To do this, we need a very slightly custom version of
disambiguateRevision for *bzrSource.

This revision also refactors TestSourceManager_InferConstraint pretty
dramatically to let it work on repositories coming from different
VCSes.
@spenczar
Copy link
Contributor Author

spenczar commented Aug 28, 2017

@carolynvs, please take another look - I'm happy with how this turned out, I was able to get rid of the bizarre bazaar string parsing, and I learned a bit about bzr on the way :)

How would you like me to address codeclimate's complaint about using context.TODO()? I think threading a real context in all the way from package main seems out of scope.

@carolynvs
Copy link
Collaborator

Don't worry about codeclimate flagging the context.TODO() usage. I'll figure something out and get it merged.

Copy link
Collaborator

@carolynvs carolynvs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking great! Just a couple nits (ignore codeclimate) and we can :shipit:.

if err != nil {
t.Fatal(err)
}
testcase := func(str string, pi ProjectIdentifier, want Constraint) func(*testing.T) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Can these subtests be restructured to match the way we have been doing other subtests? The pattern that we have been following is to create an array or map of testcase data, and then in a for loop call t.Run. It makes it easier to review when all the testcases are together.

dep/context_test.go

Lines 116 to 128 in 6c5ebb9

var testcases = []struct {
name string
lock bool
wd string
}{
{"direct", true, filepath.Join("src", "test1")},
{"ascending", true, filepath.Join("src", "test1", "sub")},
{"without lock", false, filepath.Join("src", "test2")},
{"ascending without lock", false, filepath.Join("src", "test2", "sub")},
}
for _, tc := range testcases {
t.Run(tc.name, func(t *testing.T) {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, done.

var (
gitProj = ProjectIdentifier{ProjectRoot: "github.com/carolynvs/deptest"}
bzrProj = ProjectIdentifier{ProjectRoot: "launchpad.net/govcstestbzrrepo"}
hgProj = ProjectIdentifier{ProjectRoot: "bitbucket.org/golang-dep/dep-test"}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a testcase for svn. Here's the test repo info: https://github.com/golang/dep/blob/master/internal/gps/vcs_repo_test.go#L122

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do. That code is using svn to manually checkout a repo that would normally be cloned with git using the go tools, so I might need to do some plumbing to make it appear to the SourceMgr as a svn repo.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, if it's not a straight case like the others, please don't bother. I didn't look at that too closely, and didn't realize it's not a svn repo... 😊

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, works for me 😄

Digging in further, I got the sense that dep might not fully support svn, in fact. It doesn't look like anything references the svnRepo type except one skipped test.

Copy link
Collaborator

@carolynvs carolynvs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for fixing this! 👍

@carolynvs carolynvs merged commit 43252e0 into golang:master Aug 28, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Gracefully handle abbreviated revisions during import
5 participants