-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Persistent Task Invalidation #6
Conversation
a0e379f
to
9ed212f
Compare
7827992
to
1966616
Compare
@@ -83,7 +83,7 @@ func Run(tasks []*Task, options ...RunOption) { | |||
} | |||
|
|||
if len(tasks) == 0 { | |||
panic(fmt.Errorf("no tasks specified")) | |||
log.Fatalln("No tasks specified") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: consistent lowercasing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm for consistently upcasing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i wish we didn't change it. either way, these aren't consistent now
cache/cache.go
Outdated
} | ||
|
||
func getCacheFilePath(dir string) string { | ||
hashedName := strings.Replace(dir, "/", "%", -1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about replacing any non-alphanum with -
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
regexp.MustCompile(`[^A-Za-z\-_]`)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could do that - I figured slashes would be the only dangerous thing though. NSO
We will need the output of these functions in order to get the beginnings of our cache going, since our cache is based on doing a git diff against a hash + manual hashing of uncommitted files.
1fc4a09
to
d672483
Compare
cache/md5sum.go
Outdated
"path/filepath" | ||
) | ||
|
||
// hashSum takes in a file or directory and hashes based on filename + modified time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This hashes every directory on the path as well right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorta - it does a hashsum of all the files within a directory OR a hash on a single filename
d672483
to
629e136
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took a look but some stuff may have changed under the hood as I was looking. Overall good!
cache/snapshot.go
Outdated
return &snapshot, nil | ||
} | ||
|
||
func (s snapshot) hashFor(path string) string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pointer reference to snapshot?
|
||
// Only mark committed files as dirty (uncommitted files are handled below). | ||
for _, f := range diffFiles { | ||
if previous.hashFor(f) == "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hashForUncommitedFile
?
cache/snapshot.go
Outdated
// If the file isn't currently uncommitted, rehash. | ||
md5 := current.hashFor(file.Path) | ||
if md5 == "" { | ||
md5, _ = c.HashFunc(file.Path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we print out this error? Or log it somewhere?
cache/snapshot.go
Outdated
} | ||
|
||
// write records the current state to a file. | ||
func (c *snapshotter) Write(ctx context.Context, path string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we rename this to cacheFilePath
} | ||
} | ||
|
||
func (c *Cache) Start(ctx context.Context, opt shell.RunOption) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make this a list of options?
|
||
const CachePath = ".cache/taskrunner" | ||
|
||
var CacheDir = path.Join(os.Getenv("HOME"), CachePath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we want this cached globally? Or do we want one per repo?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved the logic for that out - it will be one per repo
return false | ||
} | ||
for _, f := range c.dirtyFiles { | ||
if taskrunner.IsTaskSource(task, f) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IsTaskSource doesn't have a comment. Can we add one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
@@ -7,28 +7,30 @@ import ( | |||
"github.com/samsarahq/taskrunner/watcher" | |||
) | |||
|
|||
func IsTaskSource(task *Task, path string) (matches bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually it was just removed commits, looks good!
This function works for both files and directories. We are choosing to start with a hash off of filename and modified time for now, but can make this include contents if we need to make the cache more resilient.
These snapshots are comparable across time and will determine invalidated files, allowing Taskrunner to determine whether a task needs to re-run on boot. A diff against a previous commit is preferred. For uncommitted (and new) files, the hash is compared to a hash in the previous snapshot. If there was no hash before, the file is marked as changed.
Since we will be using this functionality to determine whether a task should be invalidated between task runs, it makes sense to expose this as public API. Leaving this off of the Task struct for now, but could easily move this to be a method on Task.
With this feature, tasks can remain valid between taskrunner executions if the task's dependent files have not changed. An example use-case of this is `yarn install`, which we can safely assume shouldn't run again unless `package.json`/`yarn.lock` have changed.
Many of our panics aren't that exceptional and should be normal errors to a user. This should help reduce noise when looking at Taskrunner errors.
629e136
to
58dd6f1
Compare
With this feature, tasks can remain valid between taskrunner executions if the task's dependent files have not changed.
An example use-case of this is
yarn install
, which we can safely assume shouldn't run again unlesspackage.json
/yarn.lock
have changed.This requires a git repository to be in place to work.