Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add tests for modules wrapping flint #39

Merged
merged 21 commits into from
Apr 21, 2023

Conversation

oscarbenjamin
Copy link
Collaborator

This is a work in progress.

Maybe the test script should be reorganised a bit as the current one-test-function-per-module structure will get a bit unwieldy if more tests are added.

Comment on lines +132 to +192
(lambda n: flint.fmpz(n).rising(2),
[0, 0, 2, 6, 12, 20, 30, 42, 56, 72, 90, 110]),
(lambda n: flint.fmpz.bin_uiui(n, 0),
[OE, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]),
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These function signatures are generally a bit inconsistent e.g.:

fmpz(n).func()
fmpz.func(n)
fmpz(n).func(m)
fmpz.func(n, m)

Is that something that should be kept for backwards compatibility?

Perhaps if a consistent scheme was decided then aliases could be added with more consistent signatures e.g. fmpz.rising_factorial(n,m).

Does it really make sense to have _ui in the name for a Python method?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has to be decided on a case by case basis I think. Sometimes a function f(x,y) has a clear primary variable and then x.f(y) makes sense; binomials and rising factorials arguably fall into this category. In other cases a static method makes more sense.

We could also add a namespace of standalone functions (not requiring method calls), but that is a separate issue.

Agree there is no reason to have ui versions in the Python interface.

(lambda n: flint.fmpz(n).is_probable_prime(),
[0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 0, 0]),
(lambda n: flint.fmpz(n).is_perfect_power(),
[T, T, T, F, F, T, F, F, F, T, T, F]),
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess is_prime and is_probable_prime should be changed to return True/False rather than 0/1.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

test/test.py Outdated
assert flint.fmpz(f) == f
assert flint.fmpz(str(i)) == f
raises(lambda: flint.fmpz("qwe"), ValueError)
raises(lambda: flint.fmpz([]), TypeError)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realised I'm using this raises function incorrectly because it works different from the pytest/sympy one. Is there any reason not to use pytest in general here in python_flint?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I wouldn't mind.

@oscarbenjamin
Copy link
Collaborator Author

What exactly is a good way to test an arb?

I wanted to do something like this:

In [39]: arb("1.414213562373095048801688724") in arb(2).sqrt()
Out[39]: False

That doesn't work though because the arb created is not exactly the same and has a larger radius:

In [43]: arb("1.414213562373095048801688724").rad()
Out[43]: [2.22424236158440e-16 +/- 7.00e-32]

In [44]: arb(2).sqrt().rad()
Out[44]: [2.22044604925031e-16 +/- 3.09e-31]

It doesn't seem to be possible to make an arb with a smaller radius:

In [48]: arb("1.414213562373095048801688724", 1e-20)
Out[48]: [1.41421356237309 +/- 5.15e-15]

In [49]: arb("1.414213562373095048801688724", 1e-30)
Out[49]: [1.41421356237309 +/- 5.15e-15]

It seems like I have to change the context just to construct the arb:

In [51]: ctx.prec = 100

In [52]: a = arb(2).sqrt()

In [53]: a
Out[53]: [1.4142135623730950488016887242 +/- 1.08e-29]

In [54]: ctx.prec = 53

In [55]: a
Out[55]: [1.41421356237310 +/- 4.96e-15]

In [56]: a in arb(2).sqrt()
Out[56]: True

In [57]: a.rad()
Out[57]: [1.57772181044202e-30 +/- 3.62e-45]

The context precision also affects the way that the arb is printed even when it has a higher precision.

@oscarbenjamin
Copy link
Collaborator Author

I hadn't realised that the Linux builds in CI were still using MPIR rather than GMP. They now fail because they're trying to download from mpir.org which has now expired.

@fredrik-johansson
Copy link
Collaborator

What exactly is a good way to test an arb?

I would use showgood for explicit doctests.

Otherwise, it is usually best to check if intervals overlap:

>>> (arb(3).sqrt() ** 2).overlaps(3)
True

@oscarbenjamin
Copy link
Collaborator Author

Is there a way to construct an Arb that is exactly the same as another?

I see these

In [59]: a = arb(2).sqrt()

In [60]: a._mpf_
Out[60]: (0, 1592262918131443, -50, 51)

In [61]: a.rad()
Out[61]: [2.22044604925031e-16 +/- 3.09e-31]

I don't quite see how I can explicitly construct a second arb that precisely matches this value.

@fredrik-johansson
Copy link
Collaborator

The easiest solution might be to wrap arb_load_str and arb_dump_str.

@oscarbenjamin
Copy link
Collaborator Author

The easiest solution might be to wrap arb_load_str and arb_dump_str.

It would probably be useful to add those if that's a format already used by arb.

To me the most natural representation in Python would be to convert an arb to/from a tuple of 4 int/fmpz e.g. (m1,e1,m2,e2) for m1*2**e1 +- m2*2**e2. If I understand correctly that's basically the representation apart from a separate flag to indicate special values. I guess if the special exponent values are defined using COEFF_MIN then they aren't really portable because that varies based on FLINT_BITS but there could be another way to encode those like saying that when m1 = 0, e1=0,1,2,... denotes the special values (rather than COEFF_MIN+0,1,2,...).

@fredrik-johansson
Copy link
Collaborator

The dump/load interface is designed to be portable and future-compatible.

The representation used by arb_t in memory does depend on FLINT_BITS, but arb_dump_str converts the special exponent values to something like 0 = 0 0 0 0, +inf = 0 -1 0 0, ...

@oscarbenjamin
Copy link
Collaborator Author

arb_dump_str converts the special exponent values to something like 0 = 0 0 0 0, +inf = 0 -1 0 0, ...

Yes, that's what I was thinking. It would be good to have a way to convert to and from this format as a tuple of integers rather than a string representation of those integers. I see what you mean though that it might be simplest to just wrap the text dump functions for now.

return t
r = fmpq_poly_equal((<fmpq_series>s).val, (<fmpq_series>t).val)
if r:
r = (<fmpq_series>s).prec == (<fmpq_series>t).prec
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the kind of equality comparison we want for series?

  • A series cannot compare equal to anything that is not a series e.g. fmpq_series([1]) != 1
  • An fmpz_series can compare equal to an fmpq_series: fmpq_series([1]) == fmpz_series([1])
  • Two series with different prec compare unequal: fmpz_series([1],5) != fmpz_series([1],6)
  • Invalid series compare equal: fmpz_series([],-1) == fmpz_series([],-1)

@oscarbenjamin oscarbenjamin changed the title [WIP] add tests Add tests for modules wrapping flint Dec 20, 2022
@oscarbenjamin
Copy link
Collaborator Author

Okay I think this PR is at a reasonable milestone for review/merge.

Tests are added for almost full line coverage of all modules relating to flint (rather than arb). I've also fixed a few bugs and added __eq__ and __ne__ to fmpz_series and fmpq_series.

Later I think it would be good to reorganise these tests to make more of the testing code generic so that e.g. all poly types are tested by the same test code for consistency. There are currently some inconsistencies about exactly which type has which methods and what exceptions they raise etc so that could be fixed at the same time.

I haven't looked at the arb modules at all yet but again that can be something to follow up. The arb-related modules are only really tested by the doctests and in some cases they give fairly low coverage.

The coverage with both tests and doctests here is:

$ coverage report --sort=cover
Name                        Stmts   Miss  Cover
-----------------------------------------------
src/flint/arb_series.pyx      563    492    13%
src/flint/arb_poly.pyx        195    170    13%
src/flint/arf.pyx             109     85    22%
src/flint/acb_poly.pyx        233    179    23%
src/flint/dirichlet.pyx        56     27    52%
src/flint/arb_mat.pyx         323    127    61%
src/flint/acb_mat.pyx         370    114    69%
src/flint/arb.pyx             835    180    78%
src/flint/functions.pyx        61     12    80%
src/flint/acb.pyx             891    171    81%
test/dtest.py                   8      1    88%
src/flint/fmpz.pyx            340     34    90%
src/flint/fmpq.pyx            181     15    92%
src/flint/fmpz_poly.pyx       295     23    92%
src/flint/fmpz_mat.pyx        268     19    93%
src/flint/fmpq_poly.pyx       234     16    93%
src/flint/nmod.pyx             96      6    94%
src/flint/fmpq_mat.pyx        248     15    94%
src/flint/fmpz_series.pyx     186     11    94%
src/flint/fmpq_series.pyx     369     18    95%
src/flint/pyflint.pyx         132      6    95%
src/flint/nmod_mat.pyx        222     10    95%
src/flint/nmod_poly.pyx       199      5    97%
src/flint/__init__.py           1      0   100%
-----------------------------------------------
TOTAL                        6415   1736    73%

@oscarbenjamin
Copy link
Collaborator Author

Tests are added for almost full line coverage

The remaining parts that aren't covered are mostly:

  1. Python 2 conditional blocks (I'll remove in a subsequent PR)
  2. Lines with staticmethod, property or cpdef signatures. I'm not sure if these are being incorrectly reported as missed by coverage. I think that the C-code has some invisible branches that I don't see how to hit but it's difficult to decipher from looking at the generated code.
  3. Impossible lines e.g. I think this one:
    https://github.com/fredrik-johansson/python-flint/blob/e08d3dfc236f6b38fcc11bcde5c621993761a806/src/flint/fmpq_series.pyx#L203-L210
    The code above already tested that s and t are not zero so valuation will be nonnegative and is the index of the first nonzero coefficient. Am I misunderstanding something or am I right that fmpz_is_zero will never return true here?

@fredrik-johansson
Copy link
Collaborator

Equality of series is a bit tricky. It depends whether you want to compare series as symbolic expressions, as elements of R[x] / x^n, or as elements of R[[x]]. In the latter case, the O() term should be understood the same way as an error bound in interval arithmetic, and

1 + x + O(x^2) == 1 + x + O(x^2)

is actually undecidable (following the semantics of arb comparison, it should return False).

BTW, I think fmpz_series([1,2,3],-1) should raise an exception. I don't see any good reason to allow invalid power series.

@fredrik-johansson
Copy link
Collaborator

The code above already tested that s and t are not zero so valuation will be nonnegative and is the index of the first nonzero coefficient. Am I misunderstanding something or am I right that fmpz_is_zero will never return true here?

Yeah, that looks bogus.

@oscarbenjamin
Copy link
Collaborator Author

BTW, I think fmpz_series([1,2,3],-1) should raise an exception. I don't see any good reason to allow invalid power series.

I don't necessarily disagree but I'm wondering why they were there before. Is it to allow something like a nan e.g.:

In [50]: fmpq_series([1,1],1,3).derivative().derivative().derivative()
Out[50]: O(x^0)

In [51]: fmpq_series([1,1],1,3).derivative().derivative().derivative().derivative()
Out[51]: (invalid power series)

@fredrik-johansson
Copy link
Collaborator

I don't remember why I ever put them in; probably because I was too lazy to implement all edge cases. Case in point: the derivative of O(x^0) should be O(x^0).

(I will add a C type for power series as part of the generics system eventually...)

@oscarbenjamin
Copy link
Collaborator Author

Equality of series is a bit tricky. It depends whether you want to compare series as symbolic expressions, as elements of R[x] / x^n, or as elements of R[[x]]. In the latter case, the O() term should be understood the same way as an error bound in interval arithmetic

I think that equality also needs to be understood differently from mathematical equality in the context of a programming language. For example in testing we really want assert x == y to mean that x and y are structurally the same although we can have some allowance for things like fmpq(1) == fmpz(1) == 1. My general interpretation of == is that it should be a relatively cheap operation that determines if two things are the same for most purposes.

The general question of mathematical equality or equivalence is something much more involved and that can be interpreted differently depending on the context. Of course it is possible that our notion of series equality can correspond to a particular notion of mathematical equality but in practical terms we want basic properties like x == x which (IIUC) the error bound notion would not allow.

@fredrik-johansson
Copy link
Collaborator

Indeed:

>>> arb("0.1") == arb("0.1")
False

I think the interface should provide separate, explicit functions for testing equality of representations and testing mathematical equality, e.g. x.equal_repr(y), x.equal_value(y).

There are then pros and cons to having Python's == wrap either of these.

For mathematical equality, you could map the truth status to Python booleans in different ways.

Pessimistic logic (current arb behavior):

  • True -> True
  • False -> False
  • Unknown -> False

Optimistic logic:

  • True -> True
  • False -> False
  • Unknown -> True

Strict logic:

  • True -> True
  • False -> False
  • Unknown -> some Exception

Triple-valued logic:

  • True -> True
  • False -> False
  • Unknown -> None or a new Unknown Python object

In the generic-rings wrapper (https://github.com/fredrik-johansson/generic-rings/blob/main/python/flint_ctypes.py), I'm experimenting with allowing users to select semantics at runtime:

    >>> a = (RR_arb(1) / 3) * 3
    >>> a
    [1.00000000000000 +/- 3.89e-16]
    >>> with strict_logic:
    ...     a == 1
    ...
    Traceback (most recent call last):
      ...
    Undecidable: unable to decide x == y for x = [1.00000000000000 +/- 3.89e-16], y = 1.000000000000000 over Real numbers (arb, prec = 53)
    >>> with pessimistic_logic:
    ...     a == 1
    ...
    False
    >>> with optimistic_logic:
    ...     a == 1
    ...
    True

@oscarbenjamin
Copy link
Collaborator Author

The consideration needed for == specifically in the context of Python is that it is used by all the core data structures. We can have different methods like x.equals(y) and x.equals_modulo(y) etc but the core parts of the language will only use __eq__. This is the case for sets, dicts, tuples, lists etc. It would be useful if the method used for these was different from the method used for the == operator even if they would usually coincide so that you wouldn't have things like this:

>>> {1,1.0}
{1}
>>> {1.0,1}
{1.0}

With that being decided though the context in which we need to think about == specifically is (IMO) not direct usage by end users but rather usage by the core data structures of Python and other downstream library/application code. I envisage that the end user here is some way away e.g. python-flint is a component of a higher-level application and at that higher level == can be overloaded to do something different (e.g. the user is not necessarily even typing Python code or they do so through a wrapper interface like sage).

Having == behave strangely or even having its behaviour depend on global state is something that could be quite problematic for downstream code. The global state part couldn't work with anything hashable because it would invalidate any hash tables for example.

@oscarbenjamin
Copy link
Collaborator Author

I wanted to run these tests on OSX arm64 after merging #42 but I don't seem to be able to get Cirrus CI to run on this PR.

@oscarbenjamin oscarbenjamin reopened this Apr 20, 2023
@oscarbenjamin
Copy link
Collaborator Author

I think the interface should provide separate, explicit functions for testing equality of representations and testing mathematical equality, e.g. x.equal_repr(y)

Okay, I'll add this and use it for the series tests.

@oscarbenjamin
Copy link
Collaborator Author

Okay, this is tested with OSX arm64 after rebasing (and the tests pass).

@oscarbenjamin
Copy link
Collaborator Author

Current coverage report:

Name                        Stmts   Miss  Cover
-----------------------------------------------
src/flint/arb_series.pyx      563    492    13%
src/flint/arb_poly.pyx        195    170    13%
src/flint/arf.pyx             109     85    22%
src/flint/acb_poly.pyx        233    179    23%
src/flint/dirichlet.pyx        56     27    52%
src/flint/arb_mat.pyx         323    127    61%
src/flint/acb_mat.pyx         370    114    69%
src/flint/arb.pyx             835    180    78%
src/flint/functions.pyx        61     12    80%
src/flint/acb.pyx             891    171    81%
test/dtest.py                   8      1    88%
src/flint/fmpz.pyx            340     34    90%
src/flint/fmpq.pyx            181     15    92%
src/flint/fmpz_poly.pyx       295     23    92%
src/flint/fmpz_mat.pyx        268     19    93%
src/flint/fmpq_poly.pyx       234     16    93%
src/flint/nmod.pyx             96      6    94%
src/flint/fmpq_mat.pyx        248     15    94%
src/flint/fmpz_series.pyx     182     11    94%
src/flint/pyflint.pyx         132      6    95%
src/flint/fmpq_series.pyx     355     16    95%
src/flint/nmod_mat.pyx        222     10    95%
src/flint/nmod_poly.pyx       199      5    97%
src/flint/__init__.py           1      0   100%
-----------------------------------------------
TOTAL                        6397   1734    73%

@oscarbenjamin
Copy link
Collaborator Author

Okay, let's merge this now to have better coverage testing in others PRs.

@oscarbenjamin oscarbenjamin merged commit 69b4737 into flintlib:master Apr 21, 2023
@oscarbenjamin oscarbenjamin deleted the pr_coverage branch April 21, 2023 17:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants