
Make fmpq_poly.factor() return primitive factors #189

Merged (15 commits) on Aug 19, 2024

Conversation

@oscarbenjamin (Collaborator, Author)

@GiacomoPope @Jake-Moss does either of you want to review this?

The main change here is that fmpq_factor returns primitive factors rather than monic factors e.g.:

In [7]: from flint import *

In [8]: x = fmpq_poly([0, 1])

In [9]: p = (x + fmpq(1,2))*(x - fmpq(1,4))/7

In [10]: p
Out[10]: 1/7*x^2 + 1/28*x + (-1/56)

In [11]: p.factor()
Out[11]: (1/56, [(4*x + (-1), 1), (2*x + 1, 1)])

This makes fmpq_poly consistent with fmpq_mpoly. Flint itself does not seem to have an fmpq_poly_factor function, so fmpq_poly.factor() uses fmpz_poly_factor; on master it then makes the factors monic. Here I've changed that to keep the factors primitive, as returned by fmpz_poly_factor.
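For illustration, the monic-to-primitive rescaling can be sketched with the stdlib Fraction type; `primitive_from_monic` is a hypothetical helper for this comment, not python-flint API:

```python
from fractions import Fraction
from functools import reduce
from math import gcd, lcm


def primitive_from_monic(coeffs):
    """Scale a monic rational polynomial (coefficients low-to-high)
    to the corresponding primitive integer polynomial.

    Hypothetical helper for illustration only."""
    # Clear denominators, then divide out the integer content.
    d = reduce(lcm, (c.denominator for c in coeffs), 1)
    ints = [int(c * d) for c in coeffs]
    g = reduce(gcd, (abs(c) for c in ints))
    ints = [c // g for c in ints]
    # Normalise the sign so the leading coefficient is positive.
    if ints[-1] < 0:
        ints = [-c for c in ints]
    return ints


# x - 1/4  ->  4*x - 1   and   x + 1/2  ->  2*x + 1
print(primitive_from_monic([Fraction(-1, 4), Fraction(1)]))  # [-1, 4]
print(primitive_from_monic([Fraction(1, 2), Fraction(1)]))   # [1, 2]
```

This matches the factorisation above: the monic factors x - 1/4 and x + 1/2 become the primitive factors 4*x - 1 and 2*x + 1, with the rescaling absorbed into the content 1/56.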

I also found some other issues:

fmpz_mod_poly.factor() crashes on the zero poly. This is documented for fmpz_mod_poly_factor:

Factorises a non-constant polynomial ...

There were inconsistent types in some places e.g. nmod_poly.leading_coefficient() returns int rather than nmod. I fixed that here.

The mpoly.factor() methods return fmpz as the multiplicity of a factor unlike other poly types that use int. I have fixed this for fmpz_mpoly and fmpq_mpoly but it might also need to be fixed for the new mpoly types in gh-164.

The mpoly types need to have a leading_coefficient method. I've added this but again it might be needed for new types in gh-164. It is also missing for some univariate types.

For fq_default and fq_default_poly there is no way to get the associated context from an instance e.g.:

In [6]: p = fq_default_ctx(163)

In [7]: x = p(1)

In [8]: x.ctx
---------------------------------------------------------------------------
AttributeError

I don't see any way to get the context, characteristic, etc from x. Same for fq_default_poly.

The modulus() has a different meaning for fq_default than for nmod and fmpz_mod. We probably need to add a .characteristic() method that could have the same meaning in all cases (at least the fmpz_mod context object could have this method). It also makes sense for fq_default_ctx to define .is_prime() just like fmpz_mod_ctx.

Also I have emphasised this a lot but let me just point out again that a carefully designed test like the test_factor_poly_mpoly test here can pick up on a lot of issues including ones that have been in the codebase for a long time. The test simultaneously tests all poly types and all mpoly types together and just covers a few simple and corner cases but it checks all types comprehensively and checks all outputs thoroughly. It also demonstrates clearly what the expected behaviour is supposed to be for each of the different types and where they are similar and where they are different. If we only write separate tests for different types or for univariate vs multivariate etc then we don't get the benefits of this.

@Jake-Moss (Contributor)

There were inconsistent types in some places e.g. nmod_poly.leading_coefficient() returns int rather than nmod. I fixed that here.

Is there a particular reason for this? Does something depend on this behaviour? It seems like an unnecessary conversion to me: any use with another python-flint object will result in a conversion back to an fmpz or similar almost immediately.

The mpoly.factor() methods return fmpz as the multiplicity of a factor unlike other poly types that use int. I have fixed this for fmpz_mpoly and fmpq_mpoly but it might also need to be fixed for the new mpoly types in gh-164.

Yes they will, I'll add those pending the above

@Jake-Moss (Contributor) left a comment

Other than my two comments it looks good to me

@@ -1798,6 +1798,20 @@ cdef class fmpz_mod_poly(flint_poly):
if not self.ctx.is_prime():
raise NotImplementedError("factor algorithm assumes that the modulus is prime")
@Jake-Moss (Contributor)

The new fmpz_mod_mpoly and nmod_mpoly are inconsistent with this exception type; they raise a DomainError instead. It seems this isn't consistent across the code base: we have DomainError, NotImplementedError, and ValueError for similar exception reasons. What should the preference be here?

Exported grep results:

src/flint/types/nmod_mpoly.pyx:312:            raise DomainError("division with non-prime modulus is not supported")
src/flint/types/nmod_mpoly.pyx:759:            raise DomainError("gcd with non-prime modulus is not supported")
src/flint/types/nmod_mpoly.pyx:780:            raise DomainError("square root with non-prime modulus is not supported")
src/flint/types/nmod_mpoly.pyx:811:            raise DomainError("factorisation with non-prime modulus is not supported")
src/flint/types/nmod_mpoly.pyx:853:            raise DomainError("factorisation with non-prime modulus is not supported")
src/flint/types/fmpz_mod.pyx:344:            raise NotImplementedError("algorithm assumes modulus is prime")
src/flint/types/fq_default.pyx:98:                    raise ValueError("characteristic is not prime")
src/flint/types/fq_default.pyx:108:            raise ValueError("either a prime or modulus must be passed for construction")
src/flint/types/fq_default.pyx:133:            raise ValueError("characteristic is not prime")
src/flint/types/fq_default.pyx:156:            raise ValueError("characteristic is not prime")
src/flint/types/fmpz_mod_poly.pyx:285:            raise NotImplementedError("minpoly algorithm assumes that the modulus is prime")
src/flint/types/fmpz_mod_poly.pyx:1270:            raise NotImplementedError("gcd algorithm assumes that the modulus is prime")
src/flint/types/fmpz_mod_poly.pyx:1765:            raise NotImplementedError("factor_squarefree algorithm assumes that the modulus is prime")
src/flint/types/fmpz_mod_poly.pyx:1799:            raise NotImplementedError("factor algorithm assumes that the modulus is prime")
src/flint/types/fmpz_mod_poly.pyx:1844:            raise NotImplementedError("factor algorithm assumes that the modulus is prime")
src/flint/types/fmpz_mod_mpoly.pyx:333:            raise DomainError("division with non-prime modulus is not supported")
src/flint/types/fmpz_mod_mpoly.pyx:783:            raise DomainError("gcd with non-prime modulus is not supported")
src/flint/types/fmpz_mod_mpoly.pyx:804:            raise DomainError("square root with non-prime modulus is not supported")
src/flint/types/fmpz_mod_mpoly.pyx:835:            raise DomainError("factorisation with non-prime modulus is not supported")
src/flint/types/fmpz_mod_mpoly.pyx:878:            raise DomainError("factorisation with non-prime modulus is not supported")
src/flint/test/test_all.py:3757:    assert raises(lambda: flint.fq_default_ctx(modulus=mod_not_prime), ValueError)

@oscarbenjamin (Collaborator, Author)

I think that there should be a special exception type like NonPrimeModulusError because it is such a common case in the codebase.

It is also something that much real downstream code will never have to deal with, because using a non-prime modulus is not that common, or at least if you are using one then you would already know not to use things like division.

@oscarbenjamin (Collaborator, Author)

There were inconsistent types in some places e.g. nmod_poly.leading_coefficient() returns int rather than nmod. I fixed that here.

Is there a particular reason for this? Does something depend on this behaviour? It seems like an unnecessary conversion to me: any use with another python-flint object will result in a conversion back to an fmpz or similar almost immediately.

It definitely matters whether we return an int or an nmod because they have different behaviour. For example a common thing that you might want to do with a leading coefficient is to invert it:

In [1]: from flint import *

In [2]: lc_int = 3

In [3]: lc_nmod = nmod(3, 7)

In [4]: 1/lc_int
Out[4]: 0.3333333333333333

In [5]: 1/lc_nmod
Out[5]: 5
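The nmod behaviour above can be reproduced with stdlib Python, which since 3.8 supports modular inverses via three-argument pow with a negative exponent:

```python
# Inverting 3 as a plain int gives a float, but inverting it mod 7
# gives the modular inverse, matching nmod(3, 7).
print(1 / 3)          # 0.3333...
print(pow(3, -1, 7))  # 5, since 3*5 = 15 ≡ 1 (mod 7)
```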

As for the unnecessary conversion, there needs to be a conversion either way. The Flint routine returns a ulong and we need to turn that into a Python object, whether that is a PyLong or an nmod. Cython handles it implicitly if you just return a ulong, but it still needs to create an actual PyLong, including allocating space on the heap for it.

One small difference actually is that CPython interns small integers (from -5 up to 256) to avoid a heap allocation. We could do something similar with nmod if we had a context object to store the interned values for a given modulus. At least for small enough moduli it would make sense to do that when we can intern every possible value. It could be handled transparently by a context method like ctx.nmod_from_ulong().
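A cache along those lines could look like the following sketch; NmodCtx and nmod_from_ulong are hypothetical names (not python-flint API) and plain tuples stand in for nmod objects:

```python
class NmodCtx:
    """Toy context for Z/nZ that interns element objects for small moduli.

    Hypothetical sketch only; python-flint has no such class."""

    INTERN_LIMIT = 256  # intern every residue when the modulus is small

    def __init__(self, modulus):
        self.modulus = modulus
        if modulus <= self.INTERN_LIMIT:
            # Precompute one object per residue class.
            self._cache = [("nmod", v, modulus) for v in range(modulus)]
        else:
            self._cache = None

    def nmod_from_ulong(self, v):
        v %= self.modulus
        if self._cache is not None:
            return self._cache[v]  # no allocation for interned values
        return ("nmod", v, self.modulus)


ctx = NmodCtx(7)
# The same object is returned for the same residue class.
assert ctx.nmod_from_ulong(10) is ctx.nmod_from_ulong(3)
```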

@oscarbenjamin oscarbenjamin force-pushed the pr_fmpq_poly_factor_prim branch from 3178aa5 to 01e6418 Compare August 18, 2024 12:13
@oscarbenjamin (Collaborator, Author)

It definitely matters whether we return an int or an nmod because they have different behaviour.

I just found a similar issue with nmod_mpoly and fmpz_mod_mpoly. The coefficients should be nmod and fmpz_mod but from gh-164 they are int and fmpz:

In [7]: ctx = fmpz_mod_mpoly_ctx(2, Ordering.lex, ["x", "y"], 11)

In [8]: x = ctx.gen(0)

In [9]: x.coeffs()
Out[9]: [1]

In [10]: type(x.coeffs()[0])
Out[10]: flint.types.fmpz.fmpz

In [11]: ctx = nmod_mpoly_ctx(2, Ordering.lex, ["x", "y"], 11)

In [12]: x = ctx.gen(0)

In [13]: type(x.coeffs()[0])
Out[13]: int

Likewise leading_coefficient needs to return nmod/fmpz_mod and also the constant term in the factor method.

Probably there are also inconsistencies in the factor_squarefree methods as well...

@oscarbenjamin (Collaborator, Author)

Sorry @Jake-Moss, you reviewed this and then I made it 3x bigger. If you are still up for reviewing then please do...

The changes here are:

  • The main change is that fmpq_poly returns primitive rather than monic factors, which is consistent with fmpq_mpoly, SymPy, and other libraries.
  • I have added tests to ensure compatibility of different types and that operations work with all types.
  • I have added sqrt to all scalar types so all scalar and polynomial types have sqrt.
  • I have added factor_squarefree to all poly types that did not have it so that all poly/mpoly types can be assumed to have it.
  • I have added leading_coefficient to all poly types and now all poly/mpoly types can be assumed to have it.
  • Various types are made more consistent and various raised exceptions are changed or made consistent. A bigger discussion needs to be had about exception types.
  • I've added docstrings for some methods that did not have them.

What I have not fixed here is the type of the coefficients for nmod_poly and fmpz_mod_poly, which are int and fmpz when they should be nmod and fmpz_mod. That is something that needs to be fixed before the next release.

@Jake-Moss (Contributor)

No worries, will do tonight

@Jake-Moss (Contributor)

There were inconsistent types in some places e.g. nmod_poly.leading_coefficient() returns int rather than nmod. I fixed that here.

Is there a particular reason for this? Does something depend on this behaviour? It seems like an unnecessary conversion to me: any use with another python-flint object will result in a conversion back to an fmpz or similar almost immediately.

It definitely matters whether we return an int or an nmod because they have different behaviour. For example a common thing that you might want to do with a leading coefficient is to invert it:

In [1]: from flint import *

In [2]: lc_int = 3

In [3]: lc_nmod = nmod(3, 7)

In [4]: 1/lc_int
Out[4]: 0.3333333333333333

In [5]: 1/lc_nmod
Out[5]: 5

As for the unnecessary conversion, there needs to be a conversion either way. The Flint routine returns a ulong and we need to turn that into a Python object, whether that is a PyLong or an nmod. Cython handles it implicitly if you just return a ulong, but it still needs to create an actual PyLong, including allocating space on the heap for it.

One small difference actually is that CPython interns small integers (from -5 up to 256) to avoid a heap allocation. We could do something similar with nmod if we had a context object to store the interned values for a given modulus. At least for small enough moduli it would make sense to do that when we can intern every possible value. It could be handled transparently by a context method like ctx.nmod_from_ulong().

I'm sorry, I quoted the wrong section of your message. The above makes complete sense, but I was looking at this change where the exponent is created as an fmpz and then immediately converted to an int.

The mpoly.factor() methods return fmpz as the multiplicity of a factor unlike other poly types that use int. I have fixed this for fmpz_mpoly and fmpq_mpoly but it might also need to be fixed for the new mpoly types in #164.

https://github.com/flintlib/python-flint/pull/189/files#diff-b5a069a771eae54c8982ca6dae36c6a291b19e7bb79effc990091b0f2304fdc4R959-R963

@oscarbenjamin (Collaborator, Author)

I was looking at this change where the exponent is created as an fmpz, then immediately converted to an int.

The mpoly.factor() methods return fmpz as the multiplicity of a factor unlike other poly types that use int.

It's important that the types are consistent and we don't get into a situation where we don't know what they are supposed to be any more. If it is supposed to be an int then it needs to be an int. I recently ran into a situation where having an fmpz in a place that would usually be an int caused a bug:
python/cpython#120950

I'm a bit surprised that an fmpz is used at all for the multiplicity in the factor struct. It suggests great ambition about the sizes of polynomials that can be factorised.

The nmod_poly code just casts the exponent to an integer which in principle would be incorrect if the multiplicity is greater than 2^(FLINT_BITS-2):

exp = fac.exp[i]
res[i] = (u, exp)

On a 64-bit system it is inconceivable that an nmod_poly could have a factor of multiplicity greater than 2^62. On a 32-bit system could it have greater than 2^30? Maybe at a push for GF(p) with very small p...

I suppose you can have something like:

In [14]: from flint import *

In [15]: ctx = fmpz_mpoly_ctx(2, Ordering.lex, ["x", "y"])

In [16]: x, y = ctx.gens()

In [17]: ((x*y)**(2**64)).factor()
Out[17]: (1, [(x, 18446744073709551616), (y, 18446744073709551616)])

It's hard to imagine that a nontrivial polynomial could be factorised and have a multiplicity that does not fit in an slong/ulong.

For dense polynomials it is perhaps impossible:

In [18]: p = nmod_poly(0, 11)

In [19]: p[2**32] = 1
Flint exception (General error):
    Unable to allocate memory (34359738376).
Aborted (core dumped)

(Ideally we would have a way of preventing that core dump...)

In any case though we need the types to be consistent. The multiplicity is never going to be large enough to need to be anything other than an int and all other types already return int. Factorisation is an extremely expensive operation so I doubt very much that the cost of converting the exponents is significant.

@Jake-Moss (Contributor) left a comment

Again all looks good to me

),
]


def test_mpolys():
for P, get_context, S, is_field in _all_mpolys():
for P, get_context, S, is_field, characteristic in _all_mpolys():
@Jake-Moss (Contributor)

This new argument is unused in this test function; should the characteristic_zero check be replaced with a test on this instead? Functionally it's no different, but it is inconsistent with the new test_factor_poly_mpoly.

@oscarbenjamin (Collaborator, Author)

Oh, good point. I changed the division_not_supported to:

composite_characteristic = characteristic != 0 and not characteristic.is_prime()

That makes more sense because things like factor etc also depend on it.
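As a sketch of how that flag behaves, with naive trial division standing in for fmpz.is_prime (not the actual test code):

```python
def is_prime(n):
    # Naive trial division, a stand-in for fmpz.is_prime.
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def composite_characteristic(characteristic):
    # Characteristic 0 and prime characteristic are fine; only
    # composite moduli hit the unsupported division/factor cases.
    return characteristic != 0 and not is_prime(characteristic)


print([composite_characteristic(c) for c in (0, 7, 9)])  # [False, False, True]
```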

other = self.ctx.any_as_fmpz_mod_poly(other)
if other is NotImplemented:
raise TypeError(f"Cannot interpret {other} as a polynomial")

if not self.ctx.is_prime():
raise NotImplementedError("gcd algorithm assumes that the modulus is prime")
@Jake-Moss (Contributor)

Looks like this one was missed

Suggested change
raise NotImplementedError("gcd algorithm assumes that the modulus is prime")
raise DomainError("gcd algorithm assumes that the modulus is prime")

@oscarbenjamin (Collaborator, Author)

Oh, yeah. Good point.

I've added this case to the tests and fixed it.

assert P([1, 1]) // 2 == P([0, 0])
assert P([1, 1]) % 2 == P([1, 1])
elif nmod_poly_will_crash:
pass
@Jake-Moss (Contributor)

Presumably these passes will be replaced when #179 is merged

@oscarbenjamin (Collaborator, Author)

Yeah, I'd rather not have a variable called nmod_poly_will_crash in the test code :)

Fact is it does crash a lot though...

@Jake-Moss (Contributor)

Jake-Moss commented Aug 19, 2024

In any case though we need the types to be consistent. The multiplicity is never going to be large enough to need to be anything other than an int and all other types already return int. Factorisation is an extremely expensive operation so I doubt very much that the cost of converting the exponents is significant.

Thanks, makes sense; more than reasonable to have that.

Comment on lines 909 to +912
c = fmpz.__new__(fmpz)
fmpz_set((<fmpz>c).val, &fac.exp[i])

res[i] = (u, c)
res[i] = (u, int(c))
@oscarbenjamin (Collaborator, Author)

We could likely get away with using fmpz_get_ui here although it would be fiddly to do correctly.

Probably not worth it.
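In Python terms the fiddly part is the range check that an fmpz_get_ui-based conversion would need; a sketch assuming 64-bit FLINT_BITS, where checked_ulong is a hypothetical helper:

```python
ULONG_MAX = 2**64 - 1  # assuming FLINT_BITS == 64


def checked_ulong(n):
    """Convert an arbitrary-precision integer to a ulong-sized int,
    raising instead of silently truncating (hypothetical helper)."""
    if not 0 <= n <= ULONG_MAX:
        raise OverflowError(f"{n} does not fit in a ulong")
    return n


print(checked_ulong(5))   # 5
# checked_ulong(2**64) would raise OverflowError
```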

Comment on lines +670 to +674
elif factor_type == 'irreducible':
lead = nmod_poly_factor(fac, self.val)
elif factor_type == 'squarefree':
nmod_poly_factor_squarefree(fac, self.val)
lead = (<nmod>self.leading_coefficient()).val
@oscarbenjamin (Collaborator, Author)

I'm not 100% sure about this.

The doc says:

void nmod_poly_factor_squarefree(nmod_poly_factor_t res, const nmod_poly_t f)

Sets res to a square-free factorization of f.

Whereas nmod_poly_factor returns the leading coefficient, the docs for nmod_poly_factor_squarefree do not mention anything about it.

The factorisation seems to be just all monic factors though, so I presume we just need to take the leading coefficient like this:

In [2]: x = nmod_poly([0, 1], 7)

In [3]: (3*(x + 1)*(x + 4)**2).factor_squarefree()
Out[3]: (3, [(x + 1, 1), (x + 4, 2)])
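The leading coefficient can be double-checked with a tiny dense polynomial multiplication over GF(7); a sketch with low-to-high coefficient lists, not python-flint code:

```python
def polymul_mod(a, b, p):
    # Dense polynomial multiplication mod p, coefficients low-to-high.
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out


p = 7
x_plus_1 = [1, 1]
x_plus_4 = [4, 1]
# 3*(x + 1)*(x + 4)**2 over GF(7)
prod = polymul_mod(x_plus_1, polymul_mod(x_plus_4, x_plus_4, p), p)
prod = [(3 * c) % p for c in prod]
# The monic squarefree factors times the leading coefficient 3
# reproduce the original polynomial.
assert prod[-1] == 3
```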

Comment on lines 7 to +11
ctypedef struct nmod_poly_factor_struct:
nmod_poly_struct *p
long *exp
long num
long alloc
slong *exp
slong num
slong alloc
@oscarbenjamin (Collaborator, Author)

This was out of date compared to what I see in the Flint code.

On 64-bit Windows at least, long and slong are not equivalent, and this has been a common cause of Windows-specific crashes.

I think over time Flint itself has cleaned up its use of slong in place of long for many interfaces. Some of python-flint's code predates Flint cleaning these things up though, so we have some things like this that are out of date.
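The size mismatch is easy to observe from Python with ctypes: c_long follows the platform's long, which is 4 bytes on 64-bit Windows (LLP64), while pointers, and FLINT's slong, are 8 bytes there:

```python
import ctypes

# On 64-bit Windows (LLP64): c_long is 4 bytes, c_void_p is 8 bytes.
# On 64-bit Linux/macOS (LP64): both are 8 bytes.
print(ctypes.sizeof(ctypes.c_long), ctypes.sizeof(ctypes.c_void_p))
```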

It is likely that almost all of these 635 lines should change from long to slong:

$ git grep ' long ' | wc -l
635

Certainly these are out of date because prec should be slong:

$ git grep long src/flint/flintlib/arb.pxd | head
src/flint/flintlib/arb.pxd:from flint.flintlib.flint cimport ulong, slong, flint_rand_t
src/flint/flintlib/arb.pxd:    void arb_set_round(arb_t z, const arb_t x, long prec)
src/flint/flintlib/arb.pxd:    void arb_neg_round(arb_t x, const arb_t y, long prec)
src/flint/flintlib/arb.pxd:    void arb_set_si(arb_t x, long y)
src/flint/flintlib/arb.pxd:    void arb_set_ui(arb_t x, ulong y)
src/flint/flintlib/arb.pxd:    void arb_set_round_fmpz_2exp(arb_t y, const fmpz_t x, const fmpz_t exp, long prec)
src/flint/flintlib/arb.pxd:    void arb_set_round_fmpz(arb_t y, const fmpz_t x, long prec)
src/flint/flintlib/arb.pxd:    void arb_printd(const arb_t x, long digits)
src/flint/flintlib/arb.pxd:    void arb_mul_2exp_si(arb_t y, const arb_t x, long e)
src/flint/flintlib/arb.pxd:    void arb_get_abs_ubound_arf(arf_t u, const arb_t x, long prec)

https://flintlib.org/doc/arb.html

@oscarbenjamin (Collaborator, Author)

I see that gh-190 is waiting on this so I'll merge once tests pass.

@oscarbenjamin oscarbenjamin merged commit f2bda3c into flintlib:master Aug 19, 2024
32 checks passed
@oscarbenjamin oscarbenjamin deleted the pr_fmpq_poly_factor_prim branch August 19, 2024 15:07
@oscarbenjamin (Collaborator, Author)

Thanks for the reviews!
