Docs: include list of sequences to test on #225

Closed · katestange opened this issue Dec 12, 2022 · 9 comments
Labels: documentation, enhancement

katestange (Member) commented Dec 12, 2022:

I think we should have a list of OEIS sequences with particular bug-inducing properties to test on. New visualizer PRs should be required to be tested against these before submission.

These should include (but are not limited to):

- all 1s, 0s, and -1s: A114592
- fast-growing terms: A007235 (there are plenty; this is perhaps not the fastest!)
- offset of -1: A000521 (maybe we can find one that doesn't grow so fast, to separate concerns); maybe A002206, A061646, or A284016
- includes negative terms: A134028
- has very few terms: A001220
- offset of 2: https://oeis.org/A086677
- more examples with offsets: https://oeis.org/wiki/Offsets
- no positive terms (includes zeroes): A002819
- non-positive integers: A001489
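
One way to keep this list machine-readable would be a shared test fixture. Here is a minimal sketch in TypeScript; the `TestSequence` shape and the export name are my assumptions for illustration, not anything that exists in the repo:

```typescript
// Hypothetical shared fixture of bug-inducing OEIS sequences; the
// entries below mirror the list in this issue.
interface TestSequence {
    oeisId: string // OEIS identifier, e.g. 'A114592'
    reason: string // the edge case this sequence exercises
}

export const edgeCaseSequences: TestSequence[] = [
    {oeisId: 'A114592', reason: 'terms are only 1, 0, and -1'},
    {oeisId: 'A007235', reason: 'fast-growing terms'},
    {oeisId: 'A000521', reason: 'offset of -1 (also fast-growing)'},
    {oeisId: 'A134028', reason: 'includes negative terms'},
    {oeisId: 'A001220', reason: 'has very few terms'},
    {oeisId: 'A086677', reason: 'offset of 2'},
    {oeisId: 'A002819', reason: 'no positive terms (includes zeroes)'},
    {oeisId: 'A001489', reason: 'non-positive integers'},
]
```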

katestange added the enhancement label Dec 12, 2022
gwhitney added the documentation label Apr 23, 2024
katestange (Member, Author) commented:

The sequence A202319 consists of moderately sized semiprimes, which should be hard to factor for their size, but they are small overall. It would be nice to find examples of sequences that are inherently hard to factor, for testing.

gwhitney (Collaborator) commented:

These are good candidates for tests in the new end-to-end testing in #420.

gwhitney (Collaborator) commented:

How should we approach incorporating this excellent list of sequences into our automated tests? Do we want a bona fide grid of tests running every visualizer on every one of these sequences? Such a test would take a very long time to run, so we (a) likely could not run it in GitHub's CI, and (b) would not want to require it for every single commit. Rather, it should be run at least once per PR, if we go the grid route. It would then become the reviewer's responsibility to pull the PR locally and run the grid test before approving the merge.

Another option would be a sort of "double transversal" approach -- make sure that each one of the sequences is tested with at least a couple of visualizers, and each visualizer is tested with a couple of sequences. We could automate this, seeding the random number generator so that the same associations are always chosen, and if we ever want to do a more extensive version, someone could run it with a couple of different seeds locally.
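
A rough sketch of what that deterministic pairing could look like; the `mulberry32` generator and the pick-two coverage rule are illustrative choices under the assumptions above, not a spec:

```typescript
// Deterministic "double transversal": every sequence meets at least two
// visualizers and every visualizer meets at least two sequences, and the
// fixed seed means the same pairs are chosen on every run.
function mulberry32(seed: number): () => number {
    let a = seed
    return () => {
        a = (a + 0x6d2b79f5) | 0
        let t = a
        t = Math.imul(t ^ (t >>> 15), t | 1)
        t ^= t + Math.imul(t ^ (t >>> 7), t | 61)
        return ((t ^ (t >>> 14)) >>> 0) / 4294967296
    }
}

function transversalPairs(
    sequences: string[],
    visualizers: string[],
    seed = 225 // fixed for reproducible CI; change it only in its own tiny PR
): Array<[string, string]> {
    const rand = mulberry32(seed)
    // Fisher-Yates shuffle driven by the seeded generator, then take n
    const pick = <T>(items: T[], n: number): T[] => {
        const a = [...items]
        for (let i = a.length - 1; i > 0; i--) {
            const j = Math.floor(rand() * (i + 1))
            ;[a[i], a[j]] = [a[j], a[i]]
        }
        return a.slice(0, n)
    }
    const pairs = new Set<string>()
    // cover every sequence with two visualizers...
    for (const s of sequences)
        for (const v of pick(visualizers, 2)) pairs.add(`${s}|${v}`)
    // ...and every visualizer with two sequences
    for (const v of visualizers)
        for (const s of pick(sequences, 2)) pairs.add(`${s}|${v}`)
    return [...pairs].map(p => p.split('|') as [string, string])
}
```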

Does either of these sound good? Other thoughts? (Basically, I am looking to make testing and review of PRs as routine as possible, to reduce the burden of reviewing and free up more effort for development itself.)

katestange (Member, Author) commented:

This sounds like a nice idea (the double transversal with a seed). Why not have the automated test use a different seed each time, though? Particularly if we have the PR submitter do more extensive testing once per PR. Maybe worth discussing at a meeting.

gwhitney (Collaborator) commented:

> Why not have the automated test use a different seed each time, though?

Hmm, I am worried about a state where the code all "works," and then someone is blocked from checking in a perfectly good PR because it happens to run with a different seed, uncovering a previously unknown but unrelated problem. I really think that, to avoid blowing our minds, we need the tests to be strictly reproducible. We could change only the seed in a tiny PR from time to time, just to mix things up and vary what's being tested.

The other big point is: what's the success criterion? I don't know of a truly automatable one, so I was planning to make this an image test, which means that someone needs to vet new images any time the test changes...
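
For the image-test route, here is a minimal sketch assuming a Playwright-style screenshot comparison; the query-parameter scheme and the visualizer name are hypothetical:

```typescript
import {test, expect} from '@playwright/test'

// One (sequence, visualizer) pair from the transversal; the ?seq=&viz=
// URL scheme here is made up for illustration.
test('A114592 under the Turtle visualizer matches baseline', async ({page}) => {
    await page.goto('/?seq=A114592&viz=Turtle')
    await page.waitForSelector('canvas') // wait for the sketch to render
    // Compares against a vetted baseline image committed to the repo;
    // whenever the test (or the seed) changes, a human re-vets the images.
    await expect(page).toHaveScreenshot('A114592-Turtle.png', {
        maxDiffPixelRatio: 0.01, // tolerate minor antialiasing differences
    })
})
```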

So unless I hear otherwise, I will go ahead and implement the double-transversal test with a fixed seed and add it to the #420 PR, considering that sufficient to close this issue; we can take it from there in the future.

katestange (Member, Author) commented:

OK, that makes sense.

gwhitney (Collaborator) commented:

Good candidate for a negative-offset test: https://oeis.org/A228060 -- it has an offset of -85, but also a thousand terms that don't get particularly large.

gwhitney (Collaborator) commented:

Good candidate for a large positive offset: A241298, with about as large an offset as backscope can currently handle.

gwhitney (Collaborator) commented:

The transversal test added in #420 incorporates this. Closing.
