-
-
Notifications
You must be signed in to change notification settings - Fork 503
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fast PowerSeries_poly multiplication #10480
Comments
comment:1
Could you please check if the changes of #10255 in karatsuba multiplication provide further improvement? PowerSeries are constructed over commutative rings, but polynomial_elements do not. Hence mul_trunc needs to work also over noncommutative rings. It does not right now (al least lines 5342 and 5345 do not differenciate between left and right multiplication). I see several improvements: I think that cdef int i, j, a1len, a2len, n1, n2, ih in line 5334 should be Py_ssize_t, but maybe a cython expert should look at this. c = [0]*h in line 5346 of the patch is inefficient in several cases. One of the most contributions in 10480 is to avoid python 0 'int' in the code, that can be (relatively) expensive to coerce. I guess that this code will be slow if the base ring is a number field (that has slow coercion). The last loop is classical multiplication up to the precision required. I wonder if part of that multiplication could use do_karatsuba or a truncated_do_karatsuba. |
comment:2
Some more numbers with your patch showing what I said on my previous post. with patch:
without patch
I think that we can define an algorithm that works like karatsuba but that has a prec parameter. If a portion will have order greater than the prec. That part is discarded. |
comment:4
I will try to implement the algorithm at "A long note on Mulders’ short product" of Hanrot and Zimmermann that looks not harder than Karatsuba. The authors claim that heuristically the gain time is on the average 0.7 x Karatsuba time. Also, the following cases have to be correctly covered:
|
Attachment: trac_10480_fast_PowerSeries_poly_multiplication2.patch.gz Attachment: un.sage.gz benchmark file |
comment:5
Hello, lftabera wrote:
I followed your suggestion to perform truncated multiplication recursively,
The new patch trac_10480_fast_PowerSeries_poly_multiplication2.patch passes all tests, included these.
on which it is typically up to 50% slower, while with the previous The first patch was also slow in other cases, see below. The performance of truncated multiplication vs Karatsuba multiplication using this patch:
unpached Sage (main):
has uniform distribution; for In un.sage one takes N subsequent
In the last example I stopped the computation with main because too much
If I understand correctly, that paper deals with univariate series |
comment:6
Oops, we have been duplicating work, I have written a proof of concept of a truncated Karatsuba to check that it derives a correct algorithm. It is much much slower than plain sage. It needs to eliminate unnecesary slicing and calls of type K(f).list() I will take a look at your patch. My patch depends on #10255 |
comment:7
Mario, This part on your patch
Is inefficient, 2/3 of the operations in this part is adding a ring element with python zero. This was included in original do_karatsuba and was the reason to rewrite it in #10255. I get the following numbers with the patches:
No patch: 8.92 multiplication2: 0.66 alternative: 0.45
No patch: 1.59 #10255: 0.24 multiplication2: 1.57 alternative2: 0.19
No patch: 5.75 #10255: 5.25 multiplication2: 4.62 alternative2: 2.41 |
comment:8
You are right, your patch is faster on generic univariate series; I will look at it. In the related ticket #1956 on multivariate series do_mul_trunc_generic is faster than Karatsuba even using your improved version, in the benchmarks I posted there. |
comment:9
In that case, one can play with set_karatsuba_threshold in the underlying univariate polynomial ring to try to find a better balance between karatsuba and classical multiplication. |
comment:10
Replying to @lftabera:
Optimizing multivariate power series for the algorithms here is now #10532. Since I've never heard of the Karatsuba algorithm before, could someone here point me to a reference which would explain it well enough for me to know what the better balance is? (or just tell me :) thanks, p.s. thanks to those of you involved with this -- I think it's awesome to have these tailored algorithms in Sage! |
comment:11
Well, Karatsuba is just the first algorithm you can find on faster polynomial multiplication. There is no good balance that will work for every ring. For a fixed ring it even depends on the specific input as has been shown in Mario's examples. virtually any book on computer algebra will mention Karatsuba method of multiplication. You could take a look, for example, to Joachim von zur Gathen, Jürgen Gerhard, "Modern Computer Algebra" were Karatsuba and Schönhage–Strassen methods are discussed. |
comment:12
lftabera wrote:
Right, one can always use your Karatsuba code; in the case of multivariate I do not think that optimizing the code to avoid summing to Python 0
corresponding to
while in the latter b and d are the polynomials reduced to
The middle products in the two cases are comparable. Maybe you could add a comment about the splitting in do_trunc_karatsuba? |
comment:13
I will take a look and include your suggestions. |
comment:14
In ticket #10720 I used truncated multiplication from this ticket to speedup |
Dependencies: #10255 |
comment:15
Previous implementation was not better than full multiplication. I have added a patch with original Mulder's short multiplication. If the cost of operations is similar for all coefficients of the power series, then the short multiplication compatible with karatsuba will be faster than the naive one. Said this, for multivariate power series, typically higher terms are big polynomials with significant multiplication cost. That is why the generic multiplication is better in those cases. Apply: trac_10480_fast_Powerseries_Mulder.patch |
Mulder algorithm |
comment:16
Attachment: trac_10480_fast_Powerseries_Mulder.patch.gz Apply: trac_10480_fast_Powerseries_Mulder.patch |
comment:17
Use only Mulders method patch Apply: trac_10480_fast_Powerseries_Mulder.patch |
Changed author from mario pernici to Mario Pernici, Luis Felipe Tabera Alonso |
comment:20
Attachment: trac_10480_fast_PowerSeries_poly_multiplication-alternative.patch.gz I made some experiments and Mulder's algorithm is generally worse and when it is better, the gain is residual so I go back to my first proposal. Note that the underlying series in #10532 are rather special, so this method as is is not the best for that case. Apply: trac_10480_fast_PowerSeries_poly_multiplication-alternative.patch |
This comment has been minimized.
This comment has been minimized.
comment:23
Have you looked at the implementation of Karatsuba and Schönhage-Strassen in GMP-ECM? Quote: GMP-ECM features Schönhage-Strassen multiplication for polynomials in |
comment:24
I'm new to this ticket, thus please forgive me if some comments are out of scope. About Mulders' algorithm, using a fixed cutoff point k = 25/36*n is not optimal in Since truncated (polynomial) multiplication is potentially useful in other parts of Sage, it would make sense to have the code implementing truncated multiplication and the one using it for power series in two different tickets, no? Paul |
Branch: public/ticket/10480 |
Commit: |
Branch pushed to git repo; I updated commit sha1. New commits:
|
comment:29
Branch does not apply |
comment:30
We do have now truncated multiplication on polynomials
|
comment:31
however this does not seem faster than plain multiplication (here with Sage 8.0):
|
comment:32
Replying to @zimmermann6:
It does depend on the base ring
But you are right that the generic version of |
In this patch truncated multiplication of dense polynomials is
used in
PowerSeries_poly
multiplication.in Sage-4.6 on my computer with Intel Core i7 2.8GHz
with this patch it takes 0.12s
The speed-up increases with the number of variables and with the precision.
Apply: trac_10480_fast_PowerSeries_poly_multiplication-alternative.patch
Depends on #10255
CC: @nilesjohnson @zimmermann6
Component: commutative algebra
Keywords: power series
Author: Mario Pernici, Luis Felipe Tabera Alonso
Branch/Commit: public/ticket/10480 @
e2cdc54
Issue created by migration from https://trac.sagemath.org/ticket/10480
The text was updated successfully, but these errors were encountered: