
Too many failing checks / implement smarter CI #3662

Open
adityaraute opened this issue Dec 25, 2023 · 11 comments
Labels
difficulty: easy (a good issue for someone new; can be done in a few hours), difficulty: medium (will take a few days), feature, priority: medium (to be resolved if time allows)

Comments

@adityaraute

Description

I was scrolling through the PRs and noticed that most of them fail some check or another, often the same ones.

I've attached a screenshot where a missing package led to a check failure, but there are several similar instances.

Motivation

If checks fail for reasons unrelated to the changes in a PR, I wonder whether the check or action itself becomes irrelevant, at least for the majority of PRs.

Possible Implementation

Reduce the number of actions/checks attached to certain workflows, such that only the ones that are essential for a PR are run.
Additionally, investigate why certain actions fail repeatedly and rectify any issues in the core repo build.

Additional context

An example of an action running Ubuntu tests on PRs:

[Screenshot: a failing Ubuntu-tests check on a PR]

@agriyakhetarpal
Member

agriyakhetarpal commented Dec 25, 2023

Thanks for the suggestion, I have had a variety of ideas like this in mind for a while, based on recommendations from other repositories. Adding markers such as [skip tests] and [skip docs] to group checks would be worthwhile, and it is easy to do via already-supported GitHub Actions syntax. The issue is how maintainable this can be in the long term, since the file globs needed for conditional checks become too complex to accommodate all such cases. For example, if I modify a docstring incorrectly in the pybamm/ folder and skip the doctests in a PR because there are no changes in docs/, failures caused by that incorrect change may then start to show up in future PRs, which would be troublesome for those PR authors or the maintainers to fix every time. We will need to decide on how to go about this.
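
For illustration, a minimal sketch of a label-based skip using plain GitHub Actions syntax (the workflow name, job name, label name, and steps are assumptions, not PyBaMM's actual configuration):

```yaml
# Sketch only: skip a docs job when the PR carries a "skip docs" label.
name: Docs (sketch)
on: pull_request

jobs:
  docs:
    # contains() filters the PR's label names; the "skip docs" label is assumed
    if: ${{ !contains(github.event.pull_request.labels.*.name, 'skip docs') }}
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: echo "build the docs here"  # placeholder for the real docs build
```

(GitHub already honours [skip ci] in commit messages out of the box, but custom markers like these need an explicit condition such as the one above.)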

I suppose skipping the benchmark runs on documentation-only fixes would be fine. It is the longest job and we don't need it to run everywhere, especially when we're not pushing changes to an external repository like we do with the cron job. I'm aware that SciPy's CI configuration has a lot of such things we can draw inspiration from; for example, they check for a [Build wheels] message in a commit, which proves useful when working on compiled code in a PR – and so on. Feel free to pitch in @arjxn-py @Saransh-cpp. If you have any suggestions on this based on what you have observed with PyBaMM's infrastructure, please feel free to list them in this thread, @adityaraute.
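
As a rough illustration of that kind of marker (not SciPy's actual logic; this simplified version looks at the pushed commit message or the PR title, and the build command is a placeholder):

```yaml
# Sketch only: run wheel builds only when a "[Build wheels]" marker is present.
name: Wheels (sketch)
on: [push, pull_request]

jobs:
  build-wheels:
    # head_commit is only set on push events and pull_request.title only on PRs;
    # contains() treats the missing one as an empty string
    if: >-
      contains(github.event.head_commit.message, '[Build wheels]') ||
      contains(github.event.pull_request.title, '[Build wheels]')
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: echo "run cibuildwheel (or similar) here"  # placeholder build step
```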

There is indeed a lot of redundancy with things like repeated workflows and workflow files (the scheduled tests and the regular CI tests could be integrated into the same file, #3518 is still open, the benchmarks-related jobs could use fewer files, etc.)


P.S. The error you were seeing locally with the doctests was fixed recently and came from a missing setuptools dependency; could you retry with a cleaned .nox folder and a fresh environment? Never mind: I misread and thought this was a test failure on your machine.

@agriyakhetarpal changed the title from "Too many failing checks" to "Too many failing checks / implement smarter CI" on Dec 25, 2023
@adityaraute
Author

I understand the perils of skipping certain checks for certain PRs. But could we then ensure that checks don't fail without a valid reason? This would mean investigating each workflow and the situations in which it may fail erroneously.

@agriyakhetarpal
Copy link
Member

I suppose the usual practice is to run tests in such a way that they are as strict as they can be (failing on any deprecation warnings detected, and so on) while not being flaky (i.e., not failing without reason). We regularly experience benchmark failures due to the absence of fixed CI runners. One way forward would be to conduct a statistical analysis of the past benchmark data available on https://pybamm.org/benchmarks and adjust the failure thresholds using the resulting distribution, so that a spurious failure shows up only once or twice in a hundred pull request runs. The other way is to improve the benchmark times themselves by improving the solver. Both of these are hard to do, but one thing that can be done is to not fail the workflow and instead let it emit a "status required" message, i.e., mark it with a ⚠️ rather than a ❌.
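
A minimal sketch of that last idea, assuming an illustrative job name and a placeholder benchmark command (continue-on-error keeps a genuine failure from blocking the run, and the ::warning:: workflow command adds the ⚠️ annotation):

```yaml
# Sketch only: surface a benchmark regression as a warning, not a failure.
name: Benchmarks (sketch)
on: pull_request

jobs:
  benchmarks:
    runs-on: ubuntu-latest
    continue-on-error: true  # a failure of this job does not fail the whole run
    steps:
      - uses: actions/checkout@v4
      - name: Run and compare benchmarks
        shell: bash
        run: |
          # "./run_benchmarks.sh" stands in for the real comparison command
          if ! ./run_benchmarks.sh; then
            # ::warning:: adds a yellow annotation without failing the step
            echo "::warning::Benchmarks regressed beyond the usual noise; please review."
          fi
```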

As for other workflows, most of them don't fail at random, except for the link checker which sometimes does so on certain links (we have a lot of them throughout the source code)

@arjxn-py
Member

arjxn-py commented Jan 6, 2024

Thanks @adityaraute for opening this; it would really save resources and potentially make development faster. I had also been thinking that we should [skip docs] unless there's a documentation-related change, and that benchmarks can be skipped in most PRs. +1 for @agriyakhetarpal's suggestion; can we maybe also add complexity and priority labels for this?
If this doesn't get picked up for a while, I'd be happy to work on it after I finish up my pending PRs.

@agriyakhetarpal
Member

I agree, though let's keep this open for some while first for more discussion. For now, we have these scenarios:

  1. Skip benchmarks + coverage + unit + integration on docs-related changes
  2. Skip unit + coverage + integration on changes to examples or example scripts?
  3. Don't skip anything if workflow files are modified (we will have to look into this in a bit more depth; skipping in that case would defeat the purpose)
  4. Add markers like [Build wheels] to trigger wheel builds on a commit that has this string, say if someone is working on compiled code for the IDAKLU solver

We can create file patterns based on these points, though for some of them it might make sense to do so after #3641 is complete. Please let me know if there are more scenarios that should be covered.
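
To make scenario 1 concrete, here is a hedged sketch of workflow-level path filtering; the globs are guesses at a typical layout rather than PyBaMM's exact patterns, and note that paths-ignore applies to the whole workflow run rather than to individual jobs:

```yaml
# Sketch only: do not trigger the test workflow for docs-only changes.
name: Tests (sketch)
on:
  pull_request:
    paths-ignore:
      - "docs/**"
      - "**.md"
      - "**.rst"
      # .github/workflows/** is deliberately NOT ignored, so edits to the
      # workflows themselves still trigger a full run (scenario 3)

jobs:
  unit-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: echo "run the unit tests here"  # placeholder
```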

@agriyakhetarpal added the difficulty: easy and priority: medium labels on Jan 6, 2024
@kratman
Contributor

kratman commented Jun 5, 2024

@agriyakhetarpal Can we consider this resolved now? There have been a bunch of improvements.

@agriyakhetarpal
Member

Not for now, I would say – the recent improvements have been great (#4125, #4099), but we want to do much more on this in the coming months. @prady0t will be starting out with restructuring some parts of the workflows aside from the pytest migration goal in the GSoC timeline, and @cringeyburger will help out with the build-system-related changes

@arjxn-py added the difficulty: medium and difficulty: easy labels and removed the difficulty: easy label on Aug 7, 2024
@arjxn-py
Member

arjxn-py commented Aug 7, 2024

@agriyakhetarpal, we might want to have a checklist of the tasks one would need to do, and a starting point?

@agriyakhetarpal
Member

I think so – the easiest thing to start with would be to implement some patterns in the paths: and paths-ignore: fields of the workflows so that we don't run all the tests for every PR. I can add a longer checklist later; that's usually the part that takes longer to plan out with such issues.
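
If per-job granularity turns out to be needed (workflow-level paths:/paths-ignore: either triggers the whole workflow or not at all), one possible direction is a third-party action such as dorny/paths-filter; the filter names and globs below are illustrative assumptions:

```yaml
# Sketch only: compute which areas changed, then gate jobs on the result.
jobs:
  changes:
    runs-on: ubuntu-latest
    outputs:
      src: ${{ steps.filter.outputs.src }}
      docs: ${{ steps.filter.outputs.docs }}
    steps:
      - uses: dorny/paths-filter@v3
        id: filter
        with:
          filters: |
            src:
              - "pybamm/**"
            docs:
              - "docs/**"

  unit-tests:
    needs: changes
    if: needs.changes.outputs.src == 'true'  # only run when source code changed
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: echo "run the unit tests here"  # placeholder
```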

@agriyakhetarpal
Member

A good place for anyone who wants to start with this is this resource: https://learn.scientific-python.org/development/guides/gha-basic/#advanced-usage
