Continuous Integration Performance and Robustness Brainstorming #311

Open
bollwyvl opened this issue Aug 11, 2020 · 13 comments
Labels
question Further information is requested

Comments

@bollwyvl
Collaborator

Elevator Pitch

Let's gather some ideas for how we might improve the value we're getting from CI.

Motivation

CI is not really that broken, but it's pretty slow. There might be some other tools and services we can look into which wouldn't require altering the code that much, so can be done in parallel, and incrementally, while feature work is on-going.

Design Ideas

Post/emote ideas below.

@bollwyvl
Collaborator Author

Migrate to GitHub Actions

It seems to be where things are going, and it offers 2x as many simultaneous jobs as Azure Pipelines.

@bollwyvl
Collaborator Author

bollwyvl commented Aug 11, 2020

Investigate mamba

implemented in #312

Using mamba instead of upgrading conda will potentially save us some time when solving environments and downloading files. It can sometimes return different solutions (but so can different versions of conda).

@bollwyvl
Collaborator Author

Investigate conda-lock

Instead of even solving the environments in CI, we could generate them quickly, offline, and check them into the ci subfolder. This stacks with mamba (which conda-lock will use preferentially).

@bollwyvl
Collaborator Author

Revisit caching

Being able to cache:

  • conda lock solutions
  • the built lab
  • tectonic cache
  • conda package cache
  • yarn package cache

...in roughly that order, would knock some minutes off all of the runs, particularly on Windows, which seems to have really bad IO. GitHub Actions and Azure both have more advanced (but of course, incompatible) caching actions. They all work best if we have things checked in that get hashed. The entropy on the built lab is pretty challenging to describe.
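
To make the "checked in that get hashed" point concrete, here is a minimal sketch of deriving a cache key from file contents. The function name cache_key and the example file names are hypothetical, not anything in this repo; the CI providers' own caching actions compute their keys differently.

```python
import hashlib
from pathlib import Path


def cache_key(prefix, paths):
    """Build a cache key from the names and contents of checked-in files.

    Hypothetical helper: any change to a listed file yields a new key,
    so a stale cache is never restored.
    """
    digest = hashlib.sha256()
    for path in sorted(Path(p) for p in paths):
        # Include the file name so that renames also change the key.
        digest.update(path.as_posix().encode("utf-8"))
        digest.update(path.read_bytes())
    return f"{prefix}-{digest.hexdigest()[:16]}"
```

For example, cache_key("conda-win", ["ci/env.lock", "yarn.lock"]) (illustrative paths) would only change when one of those lock files changes.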

@bollwyvl bollwyvl mentioned this issue Aug 11, 2020
4 tasks
@bollwyvl
Collaborator Author

Investigate dodo

dodo has been very useful for tying together some large builds, and for normalizing CI vs local development. #183 has a good amount of work in it, but can be improved substantially. A challenge: pathlib is pretty important, so supporting Python 3.5 might be difficult (pathlib2 isn't quite the same).
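
A rough sketch of what a pathlib-based task could look like, assuming "dodo" here means a doit dodo.py file. The task name, the marker file, and the paths are made up for illustration and are not taken from #183.

```python
from pathlib import Path

ROOT = Path.cwd()
LOCKFILE = ROOT / "yarn.lock"        # hypothetical input file
STAMP = ROOT / "build" / "yarn.ok"   # hypothetical marker file


def touch(path):
    """Create an empty marker file, including any missing parent folders."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.touch()


def task_install():
    """Re-run the install only when the lockfile changes."""
    return {
        "file_dep": [str(LOCKFILE)],
        "targets": [str(STAMP)],
        "actions": [
            "yarn install --frozen-lockfile",  # shell action
            (touch, [STAMP]),                  # python action
        ],
    }
```

Because the same task definition runs locally and in CI, the two stay aligned, and doit skips the task when the file_dep is unchanged and the target exists.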

@krassowski
Member

krassowski commented Aug 31, 2020

Some test cases fail on timeouts (especially on Windows):

Windows38.04 Interface.Statusbar                                              
==============================================================================
Statusbar Popup Opens                                                 | FAIL |
Setup failed:
Element 'css:.jp-mod-accept.jp-mod-warn' did not appear in 5 seconds.

or:

Windows38.04 Interface.Statusbar                                              
==============================================================================
Statusbar Popup Opens                                                 | FAIL |
Element 'css:div.lsp-statusbar-item' did not get text 'Fully initialized' in 1 minute.

or:

Windows38.01 Editor                                                           
==============================================================================
JS                                                                    | FAIL |
Setup failed:
Element 'css:.jp-mod-accept.jp-mod-warn' did not appear in 5 seconds.
------------------------------------------------------------------------------
JSON                                                                  | PASS |
------------------------------------------------------------------------------
SQL                                                                   | PASS |
------------------------------------------------------------------------------
YAML                                                                  | PASS |
------------------------------------------------------------------------------
Markdown                                                              | PASS |
------------------------------------------------------------------------------
Python                                                                | PASS |
------------------------------------------------------------------------------
SCSS                                                                  | PASS |
------------------------------------------------------------------------------
CSS                                                                   | PASS |
------------------------------------------------------------------------------
JSX                                                                   | FAIL |
Setup failed:
Element 'css:.jp-mod-accept.jp-mod-warn' did not appear in 5 seconds.
------------------------------------------------------------------------------

Yet they usually pass on the final attempt. The 5-second timeout could probably be extended (10s?).

@bollwyvl
Collaborator Author

bollwyvl commented Sep 8, 2020

Don't run jest tests on Windows

The jest tests take 3 minutes on Windows, and don't really provide additional information vs the Linux/OSX runs (other than node not working great on Windows, which is known) since it's all make-believe DOM.

I move we just drop them from all the Windows runs on Azure.

@krassowski
Member

Playing around with GitHub Actions... is there a reason why we use conda rather than miniconda?

@krassowski
Member

krassowski commented Sep 8, 2020

Or, do we? Honestly cannot see when it is getting installed...

@bollwyvl
Collaborator Author

bollwyvl commented Sep 8, 2020 via email

@krassowski
Member

Obviously I meant anaconda ;) Trying to use https://github.com/conda-incubator/setup-miniconda (seems to support mamba in a way?) at the moment after previously trying setup-conda and having a brief look at setup-mamba (which does not seem to support Win)

@bollwyvl
Collaborator Author

Cache built lab static

Rebuilding JupyterLab with all the extensions on Windows is important, at least until 3.0. However, it does take a long time, around 3-4 minutes.

I have a hunch after we run all of our labextension install and labextension link, the contents of lab/{staging,extensions} and our built tarballs will be sufficiently reproducible that it should be safe to cache the built lab/static folder, and reuse that instead of building it on a cache hit. This would especially be useful for iteration on robot tests. Would need to do some significant investigation, however, which may not be worth it.

@bollwyvl
Collaborator Author

bollwyvl commented Sep 11, 2020

Run robot tests in parallel

pabot can run tests in parallel. I think our tests are sufficiently self-contained (minus things like the jedi/tectonic caches) that we could run them in parallel, so it might be a one-line change (e.g. import pabot as robot)
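
As a sketch of what the switch might look like, the following builds and runs a pabot command line from Python rather than swapping an import. The helper names, the default process count, and the output folder are assumptions for illustration; --processes is pabot's own flag and --outputdir is passed through to robot.

```python
import subprocess


def pabot_command(suite_dir, processes=2, output_dir="_artifacts/robot"):
    """Build a pabot command line (hypothetical helper and default paths)."""
    return [
        "pabot",
        "--processes", str(processes),  # number of parallel workers
        "--outputdir", output_dir,      # regular robot option
        suite_dir,
    ]


def run_parallel(suite_dir, processes=2):
    """Run the suites in parallel and return pabot's exit code."""
    return subprocess.run(pabot_command(suite_dir, processes)).returncode
```

By default pabot splits work at the suite level, so shared state such as the jedi/tectonic caches would still need a look before relying on this.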

The browser tests are the reason we're doing half this stuff, but they also contribute over half of the duration by themselves. A lot of that is waiting around for the browser to load/do stuff, which might not actually be that taxing on the poor little processor (especially on Windows). However, given an anecdotal win/py37 runtime of around 27 minutes, even getting a 25% reduction in duration would be beneficial. Additionally, if this allowed local tests to run even faster, it would reduce the pain of developing/maintaining/refactoring robot tests as well.

Update: this is now available on conda-forge

@bollwyvl bollwyvl added the question Further information is requested label Feb 20, 2021