-
-
Notifications
You must be signed in to change notification settings - Fork 14k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gap: 4.11.1 -> 4.12.1 #192548
gap: 4.11.1 -> 4.12.1 #192548
Conversation
@ofborg build sage |
The deadlocks seem to have been replaced by crashes! This is very good news. I'll investigate in two weeks, hopefully gap-system/gap#5063 (which also affects x86_64) will be addressed by then too. |
I'd be interested to know what these crashes are, is there a way to see them? -- people are successfully running GAP 4.12 on aarch64 as far as I know. |
@ChrisJefferson The crashes happen during Sage doctesting. The build/test log is at https://logs.nix.ci/logfile/nixos/nixpkgs.192548/d56e9c1c-e4c5-4397-b15d-4135431c6c21, and you can grep for |
@ChrisJefferson I changed the Sage function slightly to avoid a possible GC problem on the Sage side. The cdef str capture_stdout(Obj func, Obj obj):
cdef Obj s, stream, output_text_string
cdef UInt res
cdef TypOutputFile output
try:
GAP_Enter()
s = NEW_STRING(0)
output_text_string = GAP_ValueGlobalVariable("OutputTextString")
stream = CALL_2ARGS(output_text_string, s, GAP_True)
if not OpenOutputStream(&output, stream):
raise GAPError("failed to open output capture stream for "
"representing GAP object")
CALL_1ARGS(func, obj)
CloseOutput(&output)
return char_to_str(CSTR_STRING(s))
finally:
GAP_Leave() Unfortunately, there are still some segfaults with backtraces which look like the following:
For convenience, the One funny thing I see over and over with our aarch64 builders is that they seem to hit a lot more GC-related problems then the x86_64 ones, even when the bug is not aarch64-specific. Is |
I should also point out that there are only two callsites for the function cdef str gap_element_repr(Obj obj):
cdef Obj func = GAP_ValueGlobalVariable("ViewObj")
return capture_stdout(func, obj)
cdef str gap_element_str(Obj obj):
cdef Obj func = GAP_ValueGlobalVariable("Print")
return capture_stdout(func, obj) I wonder if calling Edit: Although it's hard to be 100% sure due to the non-deterministic nature of GC bugs, looks like that's really the problem! ofBorg ran the tests twice without crashing, while previously it crashed every time. |
6acfa0d
to
489fd86
Compare
aa50bbc
to
c9687e8
Compare
66f857c
to
4046a60
Compare
cbc2846
to
ec21df7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this! Some feedback, in case you are interested :-)
I just saw the discussion on AArch64 crashes above. Some remarks:
However, what could in theory happen is that GAP starts a garbage collection, but does not "see" your references to objects. You have these:
But I am not sure if the GC "sees" all of them this way in your code... nominally |
Thanks for the aarch64 comments! I should have written another comment to clarify the current status instead of just an edit. Currently sage tests pass on aarch64, but I needed to patch Sage like so: https://github.com/NixOS/nixpkgs/blob/3be405a031dcc259de0d53ff2adbb50c6247f3a6/pkgs/applications/science/math/sage/patches/copy-string-before-gap-leave.patch. I was a bit busy the last few days but I will upstream this patch to Sage soon. In particular, this changes two things:
I think I expressed myself badly before, but to clarify: You're absolutely right that narrowing the |
|
||
echo "Copying files to target directory" | ||
cp -ar . "$out/share/gap/build-dir" | ||
# make install creates an empty pkg dir. make check requires a tst dir. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So you install the tst
files just so that you can run the test suite? Isn't that overkill? Or do you consider it desirable to install test suites? What others packages do that (e.g. for git
itself, I don't think I've ever seen its test suite being installed alongside git
itself).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point! Nix packages often include wrappers to compensate for the non-FHS directory structure used by Nix, and we create a wrapper analogous to gap.sh
(in bin/gap
) to make sure -l
gets passed in and GAP can find its files (to illustrate the non-FHS thing, the current wrapper for GAP 4.11 is located at /nix/store/n9gvcx1v1lwwcax8m8p6qzwmnk1fb67m-gap-4.11.1/bin/gap
on my machine). Nix's GAP package insists on running tests with the installed version of GAP instead of the build dir version, I can only assume this caught a wrapper issue in the past.
I definitely think this is worth revisiting now that make install
exists, but I'd like feedback from @timokau, who I believe was the maintainer who designed the current test structure in #53037, before changing this to make sure I'm not missing some key fact.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that instead of a wrapper script you could also add -DSYS_DEFAULT_PATHS="prefix"
to CFLAGS
(Fedora does it this way).
(See also gap-system/gap#5096 for a plan to do this "automatically " in the future
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not remember why I decided to do things this way. If there is a way to run the tests with the installed version of gap without installing the tests, I agree that that would be better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is possible to let the installed GAP run the non-installed test suite (and I agree this is sensible and desirable).
one way would be to invoke it like this (untested; replace $srcdir
suitably):
gap --quitonbreak -c "TestDirectory( \"$srcdir/tst/testinstall\", rec(exitGAP := true) );"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I will try this! It will be a nice improvement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I finally had a chance to test this myself and unfortunately it doesn't work that way, because some tests want to access additional files in the tst
directory and for this assume it is available in a GAP root... but you could still do this then:
$prefix/bin/gap --quitonbreak -l ";$srcdir" $srcdir/tst/testinstall.g
The -l ";$srcdir"
option appends $srcdir
to the list of GAP root paths, so that code trying things like ReadGapRoot("tst/testinstall/MatrixObj/testmatobj.g")
still works
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for testing and sorry for not doing it earlier. I appreciate the suggestion but I worry that -l ";$srcdir"
might be equivalent to testing the build-dir version of GAP, in the sense that a GAP test run might succeed by finding things in $srcdir
(other than tst/
) that it wouldn't on a normal test run.
I also want to keep the package as lean as possible, but the default GAP 4.12.1 install takes 759MB, and just 6MB of those come from the tst
directory. The current version of the package is heavier because it ships the full build dir (802MB), including the tst
subdirectory. Therefore, I am inclined to postpone this improvement to have more time to experiment, leaving a TODO pointing to this discussion, mostly because I will still be pretty busy until next week. If further experimentation leads to the conclusion that testing the build-dir version of GAP is fine (which is likely!), I can just move the testrun from installCheckPhase
to checkPhase
, not ship the tst
directory and avoid the problem altogether.
The -DSYS_DEFAULT_PATHS
tip is interesting and I appreciate the information! Wrappers are very common in Nix land, though, and I want to investigate if this would break a content-addressed build of GAP. Therefore, I probably will investigate both improvements at once in the near future.
I hope postponing these improvements doesn't give the wrong impression. I really appreciate the feedback and will act on the feedback in the near future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, it's not urgent.
BTW regarding space, you could gzip
some data files esp. in large packages, too reduce this a bit. E.g. for the termux package I suggested something like this to them:
# To get them to be small enough, could gzip them in place
# (GAP transparently allows read access to those)
gzip -n pkg/primgrp/data/*.g
gzip -n pkg/transgrp/data/*.grp
gzip -n -f pkg/smallgrp/id*/*.*
gzip -n -f pkg/smallgrp/small*/*.*
This was only for the required packages primgrp, transgrp, smallgrp; it could be applied to more (e.g. irredsol/data/*.grp
, ctbllib/data/*.tbl
, sglppow/lib/3hoch8/rank*
). I will also look into modifying at least those three packages to ship with pre-compressed files. (I've also used -9
/ --best
in the past but mostly the savings don't seem to justify the slow performance. YMMV)
FYI, I just released GAP 4.12.1 |
Thank you! Sorry, I was busy last week. Will push the changes now. |
I've dropped the extra aarch64 patch needed for Sage tests to pass. I don't think they need to block the update, since Sage tests were already broken on aarch64 before. The Sage fixes have been submitted upstream (https://trac.sagemath.org/ticket/34701) and are now part of #198355, which depends on the present PR. The GAP update itself should now be in a good state thanks to invaluable help from @fingolfin, so I'll put this up for review from the other maintainers. |
Description of changes
aarch64 deadlocks on Sage tests no longer happen after this update and a few minor Sage tweaks (which will be done in a separate PR). Hooray!
This applies an unreviewed patch from https://trac.sagemath.org/ticket/34391 (https://git.sagemath.org/sage.git/patch?id2=9.8.beta2&id=eb8cd42feb58963adba67599bf6e311e03424328), but the test changes all look reasonable.
Things done
sandbox = true
set innix.conf
? (See Nix manual)nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD"
. Note: all changes have to be committed, also see nixpkgs-review usage./result/bin/
)nixos/doc/manual/md-to-db.sh
to update generated release notes