Actions job to concatenate tutorials data to one CSV and run analysis notebook #1703

esantorella · 2023-02-25T01:16:34Z

After tutorials performance data is uploaded to the artifacts branch, a job runs that concatenates the tutorials data into one CSV and runs the analysis notebook. The notebook (minus cell outputs) and the script that runs it live in the 'main' branch, while the data and notebook outputs stay in the 'artifacts' branch. That branch still does not need to stay up to date with 'main', since it has no Python code of its own.
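For readers who want a feel for the concatenation step, here is a minimal sketch using only the standard library. It is not the script from this PR: the function name `concatenate_csvs`, the `source_file` column, and the directory layout are assumptions made for illustration.

```python
import csv
from pathlib import Path


def concatenate_csvs(input_dir, output_path):
    """Merge every CSV in input_dir into one file.

    Hypothetical helper: adds a 'source_file' column so each row can be
    traced back to the run it came from. Returns the number of data rows.
    """
    output_path = Path(output_path)
    rows = []
    fieldnames = ["source_file"]
    for path in sorted(Path(input_dir).glob("*.csv")):
        # Skip the combined file itself if it lives in the same directory.
        if path.resolve() == output_path.resolve():
            continue
        with path.open(newline="") as f:
            reader = csv.DictReader(f)
            # Build the union of all column names, in first-seen order.
            for name in reader.fieldnames or []:
                if name not in fieldnames:
                    fieldnames.append(name)
            for row in reader:
                row["source_file"] = path.name
                rows.append(row)
    with output_path.open("w", newline="") as f:
        # restval fills columns that a given input file did not have.
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Taking the union of column names keeps older per-run files usable if newer runs log extra fields.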

Have you read the Contributing Guidelines on pull requests?

Yes

Test Plan

[x] Ran with manual actions run "on push" (smoke test): here, more recent actions run here.
[x] Auto-generated notebook should have nifty visualizations here
[x] Up-to-date dataset of all tutorials runs exists and looks OK here

codecov · 2023-02-25T01:23:58Z

Codecov Report

Merging #1703 (d63ede7) into main (2d87f90) will not change coverage.
The diff coverage is n/a.

❗ Current head d63ede7 differs from pull request most recent head e42fd75. Consider uploading reports for the commit e42fd75 to get more accurate results

@@            Coverage Diff            @@
##              main     #1703   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files          170       170           
  Lines        14636     14636           
=========================================
  Hits         14636     14636

facebook-github-bot · 2023-02-27T20:36:44Z

@esantorella has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2023-02-27T21:07:56Z

@esantorella has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Summary:
## Motivation
We are using nbconvert to run tutorials. nbconvert is not really made for this use case, but papermill is, so we have some handwritten code that can be handled by papermill instead. With papermill, we can go a bit further and use SMOKE_TEST as a [parameter](https://papermill.readthedocs.io/en/latest/usage-parameterize.html) rather than an environment variable. That would make it easy for people to work with the tutorials as notebooks.

Pull Request resolved: pytorch#1706

Test Plan: Ran tutorials locally and made sure the smoke-test flag was getting used appropriately.

## Related pull requests
Enabling papermill will make pytorch#1703, which automates running a notebook, a bit easier.

Reviewed By: saitcakmak
Differential Revision: D43631568
Pulled By: esantorella
fbshipit-source-id: 66fbcca511beb9f46cc281c0ba74a27e4c86e46d
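As a rough illustration of passing SMOKE_TEST as a papermill parameter, the sketch below builds the arguments for `papermill.execute_notebook`. The helper names and paths are hypothetical and this is not the code from pytorch#1706; it assumes papermill is installed and that the notebook has a cell tagged `parameters`, after which papermill injects the supplied values.

```python
from pathlib import Path


def build_papermill_kwargs(notebook_path, output_path, smoke_test):
    """Assemble keyword arguments for papermill.execute_notebook.

    Hypothetical helper: SMOKE_TEST is passed as a notebook parameter
    instead of being read from an environment variable.
    """
    notebook_path = Path(notebook_path)
    return {
        "input_path": str(notebook_path),
        "output_path": str(output_path),
        "parameters": {"SMOKE_TEST": bool(smoke_test)},
        # Run from the notebook's directory so relative paths resolve.
        "cwd": str(notebook_path.parent),
    }


def run_tutorial(notebook_path, output_path, smoke_test=True):
    """Execute one tutorial notebook with papermill."""
    import papermill as pm  # third-party; imported lazily

    return pm.execute_notebook(
        **build_papermill_kwargs(notebook_path, output_path, smoke_test)
    )
```

Keeping the argument construction separate from the execution call makes it possible to check the parameters without running a kernel.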


saitcakmak

One concern I have is that by repeatedly committing updated plots to the repo, the repo size will increase substantially over time (due to commit history). I know that this is an issue with plotly plots in Ax (which is why we force-push the gh-pages branch with no history) but I don't know how much of an issue this is with seaborn or matplotlib plots. If the NB size with the output plots is small, then I think this is ok. If it is something we'd measure in MBs, then this might become an issue over time.

facebook-github-bot · 2023-03-11T01:04:12Z

@esantorella has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

esantorella · 2023-03-11T03:17:13Z

> One concern I have is that by repeatedly committing updated plots to the repo, the repo size will increase substantially over time (due to commit history). I know that this is an issue with plotly plots in Ax (which is why we force-push the gh-pages branch with no history) but I don't know how much of an issue this is with seaborn or matplotlib plots. If the NB size with the output plots is small, then I think this is ok. If it is something we'd measure in MBs, then this might become an issue over time.

Great point; these were pretty big (~6 MB). I cut it down to ~1 MB by switching to SVG graphics and rearranging the plots, which hopefully makes them easier to read too.
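To make the size question concrete, here is a small sketch of how one might measure how much of a notebook is cell output, and strip it. It works on the parsed notebook JSON with the standard library only; both function names are made up for illustration, and this is not the tooling used in this PR.

```python
import json


def output_size_bytes(notebook):
    """Approximate bytes taken by cell outputs when serialized as JSON."""
    total = 0
    for cell in notebook.get("cells", []):
        outputs = cell.get("outputs", [])
        if outputs:
            total += len(json.dumps(outputs).encode("utf-8"))
    return total


def strip_outputs(notebook):
    """Return a copy with code-cell outputs and execution counts cleared."""
    stripped = json.loads(json.dumps(notebook))  # deep copy via JSON round trip
    for cell in stripped.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return stripped
```

A notebook loaded with `json.load` can be passed directly to either function, so the output size could be checked before each commit.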

Balandat · 2023-03-11T16:48:17Z

So does this mean that this adds about ~1MB to the repo on every tutorials run?

esantorella · 2023-03-13T13:18:38Z

> So does this mean that this adds about ~1MB to the repo on every tutorials run?

Yeah. I'm not thrilled with this setup, but I'm reluctant to abandon it since it seems to have already shown some interesting patterns, like a sudden ~13% increase in how much memory the average tutorial uses in smoke test mode:

I want to look into alternative ways of setting this up...
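For reference, a change like the one mentioned above can be expressed as the percent change in mean memory between two sets of runs. The sketch below is only an illustration: the function name is hypothetical, and the numbers in the usage note are invented, not taken from the tutorials data.

```python
from statistics import mean


def percent_change_in_mean(before, after):
    """Percent change of the mean of `after` relative to the mean of `before`."""
    base = mean(before)
    return 100.0 * (mean(after) - base) / base
```

With made-up per-tutorial memory readings of `[100, 200]` before and `[113, 226]` after, the means are 150 and 169.5, so the function reports an increase of 13%.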

facebook-github-bot · 2023-03-13T13:39:50Z

@esantorella has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2023-03-14T20:19:05Z

@esantorella has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

facebook-github-bot · 2023-03-15T15:25:46Z

@esantorella merged this pull request in 7a04fec.

Summary:
## Motivation
#1695 and #1703 introduced logging of tutorials runtime and memory usage, and visualizing the results in [a notebook stored in the artifacts branch](https://github.com/pytorch/botorch/blob/artifacts/notebooks/tutorials_performance_tracking.ipynb). This information has been occasionally helpful for checking whether a tutorials timeout stemmed from a pervasive slowdown, a method-specific issue, or random chance. However, it has not been used often, increases the size of the repository, and has now stopped updating and generated [a failure in the nightly cron](https://github.com/pytorch/botorch/actions/runs/8704134925/job/23882395249#step:12:132).

### Have you read the [Contributing Guidelines on pull requests](https://github.com/pytorch/botorch/blob/main/CONTRIBUTING.md#pull-requests)?
Yes

Pull Request resolved: #2298

Test Plan:
[x] Run tutorials locally
[ ] Make sure tutorials action passes on PR
[ ] Nightly cron

## Related PRs
#1695, #1703

Reviewed By: saitcakmak
Differential Revision: D56192232
Pulled By: esantorella
fbshipit-source-id: 02b0c1c3702929ebbea2e2cb90e5669ac7040c44

esantorella added 8 commits February 21, 2023 13:09

Notebook to visualize runtime and memory data from tutorials

fdd9d54

re-ran notebook

5f208d5

Merge remote-tracking branch 'origin' into tutorials_analytics

3db6a2c

Workflow to visualize tutorials output

adce7a6

avoid explicit reference to branch name in workflow file

05f0135

temporarily taking out csv logic to fail fast otherwise

9408a85

put everything back in except for running all the tutorials

4379114

Add seaborn to tutorials_requires

ce00604

esantorella self-assigned this Feb 25, 2023

facebook-github-bot added the CLA Signed label Feb 25, 2023

esantorella mentioned this pull request Feb 27, 2023

[RFC] Use papermill for running tutorials #1705

Closed

use papermill to run tutorial

8bfc1ae

esantorella mentioned this pull request Feb 27, 2023

[RFC] Use papermill for running tutorials #1706

Closed

esantorella added 4 commits February 27, 2023 14:23

fix path

3e3990b

add papermill to tutorials dependencies

31830e3

set cwd for papermill

f0155c5

remove break statement

f37a0a1

esantorella marked this pull request as ready for review February 27, 2023 20:02

esantorella requested review from Balandat and saitcakmak February 27, 2023 20:02

esantorella changed the title ~~[WIP] Actions job to concatenate tutorials data to one CSV and run analysis notebook~~ Actions job to concatenate tutorials data to one CSV and run analysis notebook Feb 27, 2023

lint

23ef9eb

saitcakmak approved these changes Mar 9, 2023

Two review comments on .github/workflows/reusable_tutorials.yml (outdated, resolved)

Merge remote-tracking branch 'origin' into tutorials_analytics

be4627a

esantorella commented on setup.py Mar 10, 2023 (outdated, resolved)

esantorella commented on .github/workflows/reusable_tutorials.yml Mar 10, 2023 (resolved)

esantorella and others added 2 commits March 10, 2023 17:24

Apply suggestions from code review

6a2d36e

Co-authored-by: Sait Cakmak <saitcakmak@outlook.com>

Save figures as SVG + run black on notebook

1da367c

esantorella added 2 commits March 10, 2023 21:58

smaller figures

350a967

strip outputs

addf060

Merge branch 'main' into tutorials_analytics

56cfd37

Merge branch 'main' into tutorials_analytics

e2b9e52

esantorella commented on scripts/analyze_tutorials_performance.py Mar 14, 2023 (outdated, resolved)

Update scripts/analyze_tutorials_performance.py

e42fd75

suggestions from code review

facebook-github-bot closed this in 7a04fec Mar 15, 2023

facebook-github-bot added the Merged label Mar 15, 2023

Balandat deleted the tutorials_analytics branch March 24, 2023 14:55

Balandat restored the tutorials_analytics branch March 24, 2023 14:55

Balandat deleted the tutorials_analytics branch April 15, 2023 19:45

esantorella added a commit to esantorella/botorch that referenced this pull request Apr 16, 2024

Remove changes from pytorch#1695 and pytorch#1703

c66d47c

esantorella mentioned this pull request Apr 16, 2024

Remove GH Actions tutorials performance tracking #2298

Closed
