Improved infrastructure for notebook testing #429

jlstevens · 2016-01-29T17:04:40Z

I think it is safe to say that none of us are too fond of the reference_data submodule.

In fact, I've found it a real hinderance personally:

It bloats up any repository fast and can be a pain to simply download.
We have far too many 'Updated reference submodule reference' commits.
You have to download the reference data repository to do this update (non-trivial as history keeps changing to keep it squashed!).
Anyone submitting a PR is subjected to this painful process.

However, there are a few advantages using GitHub:

Hosting is free, unlike S3 where we would need to pay per GB downloaded.
Integrates with Travis and PRs because you point to a particular commit SHA.

Here is the overall outline of how it will work if this PR goes to plan:

Submit a PR normally. Once it is done, check Travis to find the latest build number (tests will have failed due to the PR changes).
Go to travis.holoviews.org and find the build number, check that the display data is all correct. Travis will have pushed the test data to S3 so it is available to buildbot (given a recent build number).
Go to buildbot.holoviews.org (in development), log in and supply a small amount of information (e.g the build number) and trigger the build.

This would make things a lot easier but the submodule reference in the current system would remain a major problem. Although we could automatically update the reference on our repository we can't help people submitting PRs to us. We don't have the credentials to push to other people's repositories. Other people would still be left with some of the nastiest steps to carry out themselves.

The proposed system is based on these ideas:

The holoviews-data repository stays. It is no longer used as a submodule though, it is simply cloned into doc/reference_data and switched to the correct branch.
To work with PRs, each PR will have its own branch based on the PR number. There is also a master branch.
Commits to update reference data use --amend to keep the repository small. We can snapshot master at any time by creating a copy on a branch (e.g on release).

How will this work concretely?

1.When running notebook tests, Travis uses the $TRAVIS_PULL_REQUEST environment variable to clone the holoviews-data repository and checkout the branch called PR$TRAVIS_PULL_REQUEST. If this branch doesn't exist, it backs off to the master branch to use the test data there.
2. Once the test is finished, Travis uses the $TRAVIS_BUILD_NUMBER environment variables to create a directory in an S3 bucket to push test_data. It will also write the $TRAVIS_PULL_REQUEST variable to a file called SOURCE. Note, if the build is not for a PR, it dumps $TRAVIS_BRANCH into SOURCE instead. Travis is already pushing the display data to a bucket to allow review on travis.holoviews.org.
2. Buildbot then needs two bits of information to operate: the PR number and the Travis build number. It can then checkout the PR by number (using a hub utility to make this easier). Buildbot can pull the test data from S3 and do a sanity check: make sure the SOURCE file matches the specified PR. Note, I could pull from S3 first, check SOURCE and automatically get the right PR, but I think I feel validation is necessary. You don't want a screw up in the build number to affect a different PRs test data.
3. Buildbot has now downloaded the test data from S3 so it can now commit it back to the holoviews-data repository on a branch based on the specified PR number (commit with --amend). This means the next time Travis runs a build for the PR it will fetch the correct test data.

Advantages:

No more submodule or submodule reference updating commits.
Users never need to touch the holoviews-data repository themselves.
Much easier - everything is automated as much as possible. Note that you cannot skip the review process as you have to explicitly make sure the new test data is good. Then you decide to run a buildbot build (or not) to make that test data as new reference data.
Still works with PRs but uses GitHub to hold the reference data.

I bet it sounds complicated but if it works, it will make developing new notebooks much easier.

jbednar · 2016-01-29T17:09:59Z

Sounds like a big improvement.

jlstevens · 2016-01-31T01:35:37Z

Nearly there...

It's been a pretty huge effort (all happening behind the scenes) to get this working. I'll be happy to see the end of it!

Naturally, we quickly realized the proposal above wasn't going to work. The main issue was the suggestion is it didn't tackle merges. How would you make the PR branch on holoviews-data become the main one when you click 'merge'?

So we now have a new variation on this approach.

Travis runs notebook tests and pushes test data to S3.
When you want to update reference data, just enter the PR number into the buildbot build. If you want to merge the PR with master, click the merge checkbox.
There is no step 3 unless you include the process of logging into buildbot. :-)

The idea is that build step handles merging. The way it does this, is it has to do quite a lot behind the scenes:

Use the Travis API to find all builds associated with the PR. Select the one with the most recent commit.
Check this build is available on S3. If so, fetch it.
Pull the PR and the holoviews-data from github.
Update the reference data on the PR specific branch of holoviews-data
(Now the clever bit) Restart the Travis build that was found for the selected PR using the Travis API and poll it until success or fail.
As this same build can now find the PR specific branch, it is now testing it with the updated data. If it succeeds, buildbot can continue on to the next (optional) merge step.
If you clicked the 'merge' checkbox, buildbot will immediately update the master branch of holoviews-data with the data and perform the merge on the holoviews repository.

Note that this means we should no longer use the 'Merge' button on github!

This is probably not an issue as none of us are so irresponsible enough to merge a PR without the tests going green, right? :-)

To get the tests green, we need to use buildbot anyway so enabling the merge step there helps avoid error. If you are uncertain and don't trust buildbot and want to see the tests go green for yourself, you can leave the 'Merge' checkbox unticked. Then if all is well and you decide to merge, this needs to be done via buildbot.

Anyway, I hope that makes sense and I think I am 95% the way to the final system. I need to handle #PR0 (which represents master) and some of the last merge steps need to be implemented. Also, do you think we should make the default PR number -1 so you have to enter 0 for master? I worry a little about people accidentally entering the wrong PR number...

Note: Buildbot checks all the jobs for the build that is merging into master goes green. i.e if the PR were merged to master, would the tests pass? As you can see with this PR here, that doesn't mean the other set of jobs (branch test) will pass but that will almost be due to transient errors if the other set of builds is passing, right?

jbednar · 2016-01-31T14:43:43Z

Sounds a little rickety, but still a big improvement. Maybe entering 0 for master would make sense. Is there any way to get github to turn off the Merge button and replace it with a message? Seems a bit much to ask, but...

jlstevens · 2016-01-31T16:01:19Z

Is there any way to get github to turn off the Merge button and replace it with a message?

I'm fairly certain that is not possible in any sane or legal way. :-p

What might be possible is this:

I could set up a GitHub webhook to trigger the necessary merge of holoview-data on our EC2 instance. This would be a script that would run alongside buildbot in the docker container and would trigger it when it is notified of a merge event by Github.

https://developer.github.com/v3/activity/events/types/#pullrequestevent

The script could then listen for events from github and when it receives one, trigger a build on buildbot as described in this link (essentially using buildbot sendchange once):

http://stackoverflow.com/questions/22310452/how-do-i-trigger-a-build-from-a-script-with-no-sc

Buildbot would then run a separate builder to:

1 Find the travis build that is triggered on master by the merge event.
2. Wait for it to complete, which would update the test_data appropriately.
3. Merge and update holoviews-data
4. Restart the build which should now pass.

I see some clear advantages:

Just use the GitHub merge button as usual. Note, you would still want the tests to be green so buildbot would still be used, but there wouldn't need be any need to tell buildbot if you want to merge.
Lets Github handle the merging of the main repository as usual. Pushing anything to our main repository using Buildbot makes me nervous. This is a big advantage is my opinion as it reduces the damage that bad buildbot behavior could cause.
No more worrying about whether we have the necessary test data cached on S3. Travis will handle that for us. I.e you can merge whenever you want to, before you would have issues with this anyway if the test data is no longer available.
Splits up the builder which is getting too long already. Shorter builders are independent and easier to debug. This would mean my current update-merge builder would be complete (after renaming it to update-refdata of course).
If the webhook failed for some reason, you could always force a build to update on master.

I will consider this possibility as I have the tools for it ready and I was hadn't really finished or tested the merging steps anyway.

jlstevens · 2016-01-31T16:08:00Z

The proposed builder would also mean I could keep the current one to handle PRs only (no need for 0 to mean 'master'). The new builder would be dedicated to updating the reference data for master. E.g if you see a travis build that fails on master, check travis.holoviews.org and decide the changes are ok, you could then just force this builder to run to make the update.

jbednar · 2016-01-31T16:32:05Z

That all sounds pretty clean to me.

jlstevens · 2016-01-31T18:08:05Z

I've just tested all the basic ideas. I've made myself a repo and every time an event happens there on GithHib, I can use that to trigger a build.

This opens up some really interesting/crazy possibilities. I could easily set things up so that the only thing needed to update the data for a PR is to type a special string such as:

BUILDBOT: UPDATE

This could then update the test data and restart the last Travis build. This is either very convenient or a bit nuts depending on your perspective! It would also sidestep the issue of given out user credentials entirely...

I think this could be really great if we take a few precautions:

Make sure the string is unique and make sure it has to be on its own line. Don't want it triggered if you quote something else someone said.
Limit how often a build can be triggered. E.g one allowed per PR per 10 minute period?

Anything else? With this suggestion, it really couldn't get more automated...

jlstevens · 2016-01-31T18:37:38Z

This idea could extend to website building given the following conditions;

The script would manage starting/stopping the slave (you would want to watch the buildbot site to see what is happening).
A two hour delay from when the website build is triggered (we only have one preview.holoviews.org so multiple documentation PRs would clobber it - this delay makes sure you can see the triggered version for a guaranteed two hours before it might change to something else).
A maximum of 3 or 4 builds a day. This bounds to total cost of running the slave although this maximum bound is very unlikely to be reached.
Cannot be triggered during the nightly build.

With these conditions in place, I can imagine that all you need to do to update the website for a PR is type:

BUILDBOT: WEBSITE

Any thoughts on this idea?

Edit: Having a BUILDBOT: SLAVE-STOP would be a good idea (wouldn't offer a SLAVE-START though!)

jbednar · 2016-01-31T18:42:13Z

I'll leave Philipp to decide whether it's crazy or not, but it does sound very convenient. Not sure where you would type the magic string -- in the merge command's comment field in github?

jlstevens · 2016-01-31T19:05:08Z

The merge wouldn't need a special string: that I can detect and handle without any intervention at all. The special strings could be anywhere in the comment text of a PR and processed when the comment is posted. The options would be to update the test data for the current PR or rebuild the website for the current PR. As you say, I think it would be very convenient but I would like to hear what Philipp thinks of the idea...

philippjfr · 2016-01-31T21:31:34Z

All sounds great to me. What we've got now is already miles better than what we had before. The only other thing I could wish for is a status hook for PR builds, which indicates whether a particular build is in the cache and can therefore be merged, just like the Travis and Coveralls checks below. Then you'd only get a green go-ahead to merge when the tests for that PR are either passing or you've recently built them and the new data is cached.

jlstevens · 2016-01-31T21:35:03Z

The only other thing I could wish for is a status hook for PR builds, which indicates whether a particular build is in the cache and can therefore be merged...

Hmmm. I don't think it would be hard to periodically check if the latest PR build is in the cache (already have code that does this) and serve one particular icon if it is available and another if it isn't. Then at the start of a PR, we could add a bit of markdown to show that icon. Would that do the trick?

philippjfr · 2016-01-31T22:04:41Z

According to this you can POST up to 1000 statuses for each commit so we could add a status when something is added to the cache and send a fail event when a build is deleted. Not a priority though.

Edit: We can add another webhook Lambda to signal whenever a build is added to preview.holoviews.org, which a) sends a 'pass' event to the tested commit and b) sends a fail event to the commit corresponding to the build that was just deleted. Github should then update the PRs appropriately.

jlstevens · 2016-02-02T17:12:42Z

I think I'm happy with the travis changes in this PR. It seems to work and does what we need.

I can't get the tests passing here due to changes elsewhere (some data used in one of the tutorials has moved). I think we can merge and start testing this system on a real PR!

philippjfr · 2016-02-02T17:44:07Z

Great work on this! PR build is passing so I'll go ahead and merge.

Improved infrastructure for notebook testing

jlstevens · 2016-02-04T14:43:52Z

It is all working well so I've pushed the files to the buildbot branch of ioam-builder. If could always do with more polishing but I am very pleased that all the important functionality now works.

Note that we have to be very careful not to accidentally push our secret tokens! I've censored them out so you should spot them in any diffs before you push.

jlstevens · 2016-08-22T17:16:12Z

I've decided to use this issue to log changes to the infrastructure. In the past few days, a number of things needed to be fixed:

The request for the DigitalOcean image for running the large slave broke (the page that the image used to be on changed).
Ruby 2 needed to be compiled as a dependency of the travis command (installed via gem).
The large slave needed xarray and iris to be installed.
Both seaborn and matplotlib needed to be upgraded.
The Travis API/command for restarting builds seems to have changed, now using the actual build number.

All these changes have been pushed to the buildbot branch of ioam-builder.

github-actions · 2024-10-26T05:21:19Z

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

Removed reference_data submodule

faaed18

Cloning holoviews-data and switching to reference_data

ef3ead7

jlstevens force-pushed the ref-data branch from 5f95646 to abdacaf Compare January 29, 2016 21:13

jlstevens mentioned this pull request Jan 29, 2016

Fix copy & paste errors in docstrings. ioam/holoviews-data#2

Closed

jlstevens force-pushed the ref-data branch from b980c80 to 0f9ed4b Compare January 30, 2016 13:00

Added conditional reference data checkout based on PR number

54979a5

jlstevens force-pushed the ref-data branch 2 times, most recently from 79ebe03 to acb617f Compare February 2, 2016 03:19

Renamed REF_BRANCH to REF_DATA_BRANCH

4d35a4a

jlstevens force-pushed the ref-data branch from acb617f to 4d35a4a Compare February 2, 2016 03:24

philippjfr added this to the v1.4.2 milestone Feb 2, 2016

jlstevens added 2 commits February 2, 2016 15:46

Attempting to grab the PR number for push builds

c042fd1

The jq command may already be available in the Travis environment

1304e98

philippjfr added a commit that referenced this pull request Feb 2, 2016

Merge pull request #429 from ioam/ref-data

112b856

Improved infrastructure for notebook testing

philippjfr merged commit 112b856 into master Feb 2, 2016

jlstevens mentioned this pull request Feb 3, 2016

Website building process needs improvement #180

Closed

jlstevens deleted the ref-data branch February 4, 2016 15:59

jbednar mentioned this pull request Feb 12, 2021

Notebook testing bokeh/bokeh#9742

Open

github-actions bot locked as resolved and limited conversation to collaborators Oct 26, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improved infrastructure for notebook testing #429

Improved infrastructure for notebook testing #429

jlstevens commented Jan 29, 2016

jbednar commented Jan 29, 2016

jlstevens commented Jan 31, 2016

jbednar commented Jan 31, 2016

jlstevens commented Jan 31, 2016

jlstevens commented Jan 31, 2016

jbednar commented Jan 31, 2016

jlstevens commented Jan 31, 2016

jlstevens commented Jan 31, 2016

jbednar commented Jan 31, 2016

jlstevens commented Jan 31, 2016

philippjfr commented Jan 31, 2016

jlstevens commented Jan 31, 2016

philippjfr commented Jan 31, 2016

jlstevens commented Feb 2, 2016

philippjfr commented Feb 2, 2016

jlstevens commented Feb 4, 2016

jlstevens commented Aug 22, 2016

github-actions bot commented Oct 26, 2024

Improved infrastructure for notebook testing #429

Improved infrastructure for notebook testing #429

Conversation

jlstevens commented Jan 29, 2016

jbednar commented Jan 29, 2016

jlstevens commented Jan 31, 2016

jbednar commented Jan 31, 2016

jlstevens commented Jan 31, 2016

jlstevens commented Jan 31, 2016

jbednar commented Jan 31, 2016

jlstevens commented Jan 31, 2016

jlstevens commented Jan 31, 2016

jbednar commented Jan 31, 2016

jlstevens commented Jan 31, 2016

philippjfr commented Jan 31, 2016

jlstevens commented Jan 31, 2016

philippjfr commented Jan 31, 2016

jlstevens commented Feb 2, 2016

philippjfr commented Feb 2, 2016

jlstevens commented Feb 4, 2016

jlstevens commented Aug 22, 2016

github-actions bot commented Oct 26, 2024