
gh-107862: Add property-based round-trip tests for base64 #119406

Open · wants to merge 4 commits into main
Conversation

@encukou (Member) commented May 22, 2024

Add round-trip tests for the various encodings in the base64 module. The encodings are rather stable, so I expect the tests will serve more as examples of how to use the Hypothesis API.

Also, fix an issue in the @example decorator stub: it didn't work with multiple
positional arguments.

cc @Zac-HD
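
For readers unfamiliar with the pattern, here is a minimal sketch of the kind of round-trip property these tests express (illustrative only; the test name and strategy choice are mine, not necessarily the PR's):

```python
import base64

import hypothesis
from hypothesis import strategies as st

@hypothesis.given(payload=st.binary())
def test_b64_round_trip(payload):
    # Decoding an encoded payload must reproduce the input exactly,
    # for arbitrary byte strings generated by Hypothesis.
    assert base64.b64decode(base64.b64encode(payload)) == payload
```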

@Zac-HD (Contributor) left a comment

Looks good! We don't need the @composite decorator in this case, but otherwise lovely idiomatic use of Hypothesis.

Comment on lines 25 to 26:

```python
return bytes(draw(hypothesis.strategies.lists(allowed_chars, min_size=2,
                                              max_size=2, unique=True)))
```

Instead of using @composite here, we can simply return a strategy from our function:

Suggested change:

```diff
-return bytes(draw(hypothesis.strategies.lists(allowed_chars, min_size=2,
-                                              max_size=2, unique=True)))
+return hypothesis.strategies.lists(allowed_chars, min_size=2,
+                                   max_size=2, unique=True).map(bytes)
```

In more complicated scenarios, this can be a decent performance improvement (by amortizing the construction and validation over multiple draws, instead of executing the body each time we draw a value) - particularly if we @functools.cache the function, or happen to draw from it multiple times (e.g. st.lists(altchars())). Mostly though, it's just an idiom to remind readers that strategies are just Python values, not magic.
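
For comparison, a hedged sketch of the two styles side by side (the allowed_chars strategy below is a stand-in I invented; the PR's actual definition isn't shown in this thread):

```python
import functools

from hypothesis import strategies as st

# Stand-in strategy: integer byte values that are legal as altchars
# (a hypothetical subset, for illustration only).
allowed_chars = st.sampled_from(b"-_.:~")

@st.composite
def altchars_composite(draw):
    # @composite style: this body re-runs on every single draw.
    return bytes(draw(st.lists(allowed_chars, min_size=2,
                               max_size=2, unique=True)))

@functools.cache
def altchars():
    # Plain-function style: build the list strategy once and map
    # bytes() over each drawn value; @functools.cache means repeated
    # calls (e.g. st.lists(altchars())) reuse the same strategy object.
    return st.lists(allowed_chars, min_size=2,
                    max_size=2, unique=True).map(bytes)
```

Both `altchars_composite()` and `altchars()` return a strategy, so either can be passed to `@given` or composed into larger strategies.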

@rhettinger (Contributor) commented May 22, 2024

I thought there had been a decision not to add hypothesis tests to any module not under active development. During development, hypothesis is good at teasing out special cases that need to be fixed and have an explicit targeted test. Once in production, hypothesis doesn't do much for us and leaves it unclear whether the search space for a property has actually been covered.

AFAICT this PR doesn't add any value, it just eats clock cycles and adds another tool that a person has to learn before they can read the tests. If our current tests are inadequate in any way, we should add specifically targeted tests to fill the gaps.

We should differentiate a goal we care about, "thoroughly understand and test our code", from the more dubious goal, "let's use this tool everywhere we can." ISTM that this PR only does the latter and doesn't help base64 in any way.

Also note that we've already had a number of people do the exercise of trying hypothesis on many standard library modules. We were able to do that without actually checking the tests into the main distribution. FWIW, the main result of the exercise was identifying bugs in the testers' understanding of the module (e.g. an expectation that colorsys conversions would round-trip, without accounting for gamut limits and published specifications that do not round-trip).

@Zac-HD (Contributor) commented May 22, 2024

I think these concerns are addressed in python/steering-council#65 and #107862; the stub implementation we added in #22863 has very low overhead.

We're starting with simpler cases rather than e.g. #83134, because it's easier to shake out config issues and get used to the workflow when we don't also need to work out how we'll handle e.g. the strategy to generate arbitrary Python source code.

@encukou (Member, Author) commented May 22, 2024

> I thought there had been a decision not to add hypothesis tests to any module not under active development.

I couldn't find that decision; do you have a link to it?
I'm happy to ask the SC for a decision if that's what we need.

I agree that the issues found by Hypothesis should be added to the tests that always run, though it might be enough to add them as @example.
Gaps in understanding can usually be used to improve the documentation -- see #119409 for two non-obvious things I found here.

Checking this into the repo means it stays in sync with stdlib.
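
A hedged sketch of what "add them as @example" looks like in practice (the pinned input below is invented for illustration, not a real failure):

```python
import base64

import hypothesis
from hypothesis import strategies as st

# A failure Hypothesis once found gets pinned with @example, so it
# runs deterministically on every test run thereafter. (My
# understanding is that CPython's fallback stub runs only these
# @example cases when Hypothesis itself is not installed.)
@hypothesis.example(payload=b"\xff\xfe")  # hypothetical regression input
@hypothesis.given(payload=st.binary())
def test_b85_round_trip(payload):
    assert base64.b85decode(base64.b85encode(payload)) == payload
```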

@rhettinger (Contributor)

Focusing on this particular PR as an example of what is to come: is anyone claiming that it adds any value at all? Will users be any better off? I don't think so.

AFAICT only the time and datetime module maintainer has said they wanted this (because it is part of their development process). No other module maintainer has asked for this. I expect that every such PR in the future will come from devs who have not previously shown any interest in the module being tested. Their goal isn't to improve the module. Their goal is just to use hypothesis. IMO that is a false goal. I don't think it should be done.

@encukou (Member, Author) commented May 22, 2024

The goal is to use property-based testing in general -- i.e. using a tool to find edge cases that humans are bad at finding.

You're right that this is meant for tests in general: rather than base64 itself, I'm interested in the hypothesis/regrtest integration, the stubs (see the second commit here), and finding usability issues in running these tests. I'd love to get to the point where people who are actively developing a module have a bunch of examples and institutional knowledge ready to help them find new edge cases.

@rhettinger (Contributor)

FWIW, we already test properties (invariants) almost everywhere they are applicable (even in base64). You don't need hypothesis for that.

Hypothesis doesn't have a monopoly on properties or invariants (pretty much every existing assertion tests one). The superpower of hypothesis is generating semi-random input patterns to find problems and then simplifying them to a minimal reproducer. That is helpful during development but not really something we want in the CI. Instead, a core dev is expected to produce a comprehensive and deterministic set of tests that reflects a full understanding of the code (both black-box and white-box).
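
For contrast, a sketch of the deterministic style described above (the specific cases are my own picks, not quoted from the stdlib suite):

```python
import base64
import unittest

class RoundTripTest(unittest.TestCase):
    # Hand-picked, reviewable edge cases: empty input, each padding
    # length, and high-bit bytes.
    CASES = [b"", b"\x00", b"a", b"ab", b"abc", b"\xff" * 57]

    def test_b64_round_trip(self):
        for payload in self.CASES:
            with self.subTest(payload=payload):
                self.assertEqual(
                    base64.b64decode(base64.b64encode(payload)), payload)
```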

@encukou (Member, Author) commented May 23, 2024

I do not see how “not really something we want in the CI” follows from what was said before.

If this stuff is helpful during development, it should also be helpful for testing that the hypothesis/regrtest integration works as intended, and for having a bunch of worked examples around, so that getting it to run when you need it isn't an uphill battle.

Yes, a core dev is expected to produce good tests. Property-based testing is a tool to help with that. It also encodes the expected invariants for future generations, in a more useful format than only a set of examples.
