-
Notifications
You must be signed in to change notification settings - Fork 160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FIX: Bad Memory inefficiency in new MTC. #2812
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a great improvement and should be backported to stable-4.10. However, a few comments really should be improved first.
lib/sgpres.gi
Outdated
|
||
s:=[]; # new array to trace back compression path | ||
s:=DATA.s; # new array to trace back compression path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Required: The comment doesn't match the code anymore. Is the codecomment wrong, or is this missing a ShallowCopy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the memory fix -- reuse the same existing list (that does not need to be cleaned out) over and over, don't create a new one every time. A ShallowCopy
would re-introduce the same issue again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if the two possibilities I listed, "the comment is wrong" applies.
Please adjust the comment to not claim that a new array is used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That I meant to list, that is -- I see now that I wrote "code" when I meant to write "comment"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm now confused. The line is now
s:=DATA.s; # re-used array to trace back compression path
and the PR description above has been expanded. Is it clear now what is done?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the line still says 'new' here. Pergament you updated it locally, but did not push it yet?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is not yet pushed (and that's why I'm quoting the line).
[Actually, once I push I think this whole subdiscussion will fold away.]
gap> rels:=ParseRelators(F,"a2,b4,(ab)11, (ab2)5,[a,bab]3,(ababaB)5");; | ||
gap> G:=F/rels;; | ||
gap> Size(G); | ||
443520 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some timings on an Ubuntu server:
GAP 4.5.7: ~17.5 seconds
GAP 4.6.5: ~17.5 seconds
GAP 4.7.9: ~17.5 seconds
GAP 4.8.9: ~14 seconds
GAP master: 519 minutes
GAP with first version of this PR: ~189 seconds
GAP with revised PR: ~51 seconds
So, while this is clearly a great improvement (and thus should be merged, and backported to stable-4.10), we are still slower by a factor 13 3.5 compared to GAP 4.8 :-(. It'd be interesting to run the profiler on this to see where we spend the time now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The regression is the whole code.
There was a reported bug in the MTC that none of us was able to fix. Thus I recoded the MTC from scratch, following Holt's book. (Also, with the quiet goal of making the code better to understand and easier to expand/modify). Most of the code now is in the library, while most of the old code was in the kernel, in particular the tracing/coincidence routines (which is where the TC spends time).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for usability, I also added a small change to how cyclic subgroups are found that allows for a smaller index MTC and thus reduces the user wait time by another factor 3.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it now only takes 51 seconds; also, the computation w/o this PR terminated after 519 minutes.
And yeah, I realize that this is due to the completely new MTC implementation -- and I still think moving to your new MTC implementation was the right choice. And since it is new, I am very hopeful that we can further improve it. Besides algorithmic tweaks and fixes like in this PR (great work), a profile might reveal a few more hotspots; it might be that optimizing those (perhaps also rewriting a few more bits in C) could get us even closer to old times. Or maybe not, my point is, we haven't really tried yet to optimize this further, which makes me quite hopeful that there are opportunities for it left. So I am quite positive about this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hulpke - would you mind giving a link to the report on the original MTC bug? Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the report was a forum email (how to link, as we have no official archive?). Also it only specified "Mathieu groups are slow" and did not bother to supply the presentation.
If someone wants to submit it as a formal issue, I'll link to that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hulpke we do have an official archive, which can be found from the GAP website via The GAP Forum -> The GAP Forum Archive -> Archive of postings since December 2003 etc.
The message in question is https://mail.gap-system.org/pipermail/forum/2018/005793.html.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tst/testbugfix/2018-09-13-MTC.tst
Outdated
@@ -0,0 +1,6 @@ | |||
#MTC as arises when testing the order of M22 as FP group. Timmons' report |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Required: I don't think I'd understand this comment a year down the road. It's also not clear from this what the "bad" behaviour was. Please improve that, for the sake of future developers who have to touch this. How about this:
# There was a regression from GAP 4.8 to 4.9 in the new MTC code, which lead to the following example
# (which computes the order of M22 as an FP group) taking very, very long.
# Originally reported 2018-09-12 on GAP support by Paul Timmons
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed (by referring to this github PR which now has expanded comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see no such reference, where is it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In a push that happened later
if p[alpha]=alpha then | ||
alpha:=DATA.ct[x+offset][alpha]; # beta | ||
if p[alpha]=alpha then | ||
for w in DATA.ccr[x+offset] do | ||
# AH, 9/13/18: It's R^c_{x^-1}, so -x | ||
for w in DATA.ccr[offset-x] do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Question: I assume this is what the commit message calls "Also fixed wrong index in tracing backwards." You also wrote that this PR is purely about performance, not fixing a bug. So, I gather this means that the above bug could not have lead to incorrect results?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is code to trace relators through the coset table. The wrong sign used relators starting with x, not with x^-1 (Holt Handbook, p.166, code line 11), which will not give a new result here. As consequence, the coset enumeration will take much longer to terminate (potentially might not terminate), but as it is tracing of valid relators this cannot cause wrong results.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation
Also fixed wrong index in tracing backwards. Special treatment for generators that are known to be of order 2. Also internal display option. Changed manual example output due to changes in MTC Also removed example for `DecodeTree` in MTC that is now meaningless.
Larger cyclic subgroup order will speed up MTC and thus in turn computation of order of fp group.
Codecov Report
@@ Coverage Diff @@
## master #2812 +/- ##
=========================================
Coverage ? 76.06%
=========================================
Files ? 480
Lines ? 241052
Branches ? 0
=========================================
Hits ? 183348
Misses ? 57704
Partials ? 0
|
@fingolfin I believe the changes you requested are all in. Did I overlook something? |
The new MTC was allocating and releasing memory excessively, in particular once the coset table got larger. This was reported by P.Timmons for the case of M22 in gap-forum, in test examples that do not create such a large coset table (here: 1.6 million definitions) the change is not so dramatic and might be overlooked.
This was fixed by re-using the same memory over and over again.
Also fixed wrong index in tracing backwards. in the deductions routine -- this will have slowed down tracing. (Cannot lead to wrong results, only unbearably slow calculations)
Special treatment for generators that are known to be of order 2 b/c of x^2 relator -- instead of using consequences every time, set column equal to column of inverse and do not run through inverse when looping over generators.