
Consider splitting integration tests and benchmarks into different repo #298

Open
gjoseph92 opened this issue Sep 1, 2022 · 3 comments
Labels: dx Developer experience

@gjoseph92 (Contributor)

We're adding more and more infrastructure around testing and benchmarking dask to this repo. It's proved extremely valuable.

However, this repo's original job was to hold the conda recipe for the coiled-runtime metapackage. That side of things has its own complexity. For developers not interested in coiled-runtime, figuring out which environment.yml file(s) to install locally and what scripts to run to update them adds a lot of overhead to development.

Especially since we're moving more and more towards running upstream integration tests, or even benchmark comparisons between arbitrary changes that haven't been merged yet (#292), the integration testing is pretty divorced from coiled-runtime at this point.

It also seems like we get the most value from the upstream integration tests. I'm not sure if the tests pinned at a specific coiled-runtime version are used much (they don't seem to run that often?).

So a separate repo focused solely on upstream integration tests and benchmarking might simplify things a bit for both the benchmarks and coiled-runtime.

cc @ian-r-rose @crusaderky

gjoseph92 added the dx Developer experience label on Sep 1, 2022
@crusaderky (Contributor)

-1; we definitely want to see whether a prospective new version of coiled-runtime introduces a significant change in performance compared to 0.1.0, or whatever the previous release is.

@ncclementi (Contributor)

I'm a -1 too.

> However, this repo's original job was to hold the conda recipe for the coiled-runtime metapackage. That side of things has its own complexity.

This was not the only reason; there was also a strong intention to use this repo for testing at scale, and for testing the packages that will go into the next runtime before bumping the versions.
Regarding the complexity, it will be greatly reduced once we merge #235 and agree on the best matrix for running all the tests. See #279.

> It also seems like we get the most value from the upstream integration tests. I'm not sure if the tests pinned at a specific coiled-runtime version are used much (they don't seem to run that often?).

They run every day on main (py3.9) and help identify coiled regressions, since coiled is not pinned.

@gjoseph92 (Contributor, Author)

> definitely want to see whether a prospective new version of coiled-runtime introduces a significant change in performance

Then you'd just add a test case in the integration test repo with that set of packages, or with a coiled-runtime prerelease installed, or installed from git, etc., just like testing any other change.
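
For instance (purely illustrative; the file name, channel, and version pins below are made up, not an actual release), a one-off environment in the integration test repo could look something like:

```yaml
# Hypothetical environment for benchmarking a prospective coiled-runtime release.
# The name and version pins here are illustrative only.
name: runtime-rc-benchmarks
channels:
  - conda-forge
dependencies:
  - python=3.9
  - coiled-runtime=0.2.0rc1  # prerelease under test
  # Alternatively, drop the metapackage pin and install dask/distributed
  # from git to benchmark unmerged changes:
  # - pip
  # - pip:
  #     - git+https://github.com/dask/distributed@main
```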

This seems like a relatively infrequent need compared to how often we'll want to run comparison benchmarks, so it doesn't seem like the thing to optimize for to me.
