Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add function to re initialize foreign type #47407

Merged
merged 1 commit into from
Nov 27, 2022

Conversation

vchuravy
Copy link
Sponsor Member

For foreign types support in GC to work in PackageCompiler and the .ji format rewrite,
we need to declare the types at the module top-level, but after module loading we will
need to re-initialize the hidden desc field with the runtime pointers of the mark
and sweep function.

@vchuravy vchuravy added this to the 1.9 milestone Oct 31, 2022
@vchuravy vchuravy added GC Garbage collector domain:packages Package management and loading compiler:precompilation Precompilation of modules labels Oct 31, 2022
@vchuravy
Copy link
Sponsor Member Author

@fingolfin Does this seem like a reasonable solution? GAPObj would become a binding in GAP_jll and then you would need to reinit it in __init__ of that oackage.

@fingolfin
Copy link
Contributor

But how would I declare GAPObj as a binding in GAP_jll in the first place?

@vchuravy
Copy link
Sponsor Member Author

Hm, probably jl_new_foreign_type should also just add it as a binding. (It needs to be called from a top-level)

@fingolfin
Copy link
Contributor

If we can ccall jl_new_foreign_type and if it is OK to pass it NULL pointers for the callbacks, I guess that might work (or perhaps one can use @cfunction to generate dummy callbacks; whatever it takes to make sure the GC does not crash...).

That said, I am still hoping to eliminate our use of foreign pointers on the top level... but that may or may not pan out, so I am happy if a second solution gets implemented

@fingolfin
Copy link
Contributor

fingolfin commented Nov 1, 2022

So on the top level it'd be something like this:

const GapObj =  # <- is this even needed? or does `jl_new_datatype` take care of it?
ccall(:jl_new_foreign_type, Any,
      (Any, Any, Any, Ptr, Ptr, Cint, Cint),
      (:GapObj, @__MODULE__, Any, C_NULL, C_NULL, 1, 0))`

And then in __init__ it'd be a ccall(:jl_reinit_foreign_type, ... (though more likely I'd ccall into libgap and have it call jl_reinit_foreign_type, but that's a minor implementation detail)

src/datatype.c Outdated Show resolved Hide resolved
@giordano
Copy link
Contributor

giordano commented Nov 2, 2022

Would this solve #36770, or it's only part of it?

@fingolfin
Copy link
Contributor

Only part of it. Still no way to serialize instances

@timholy
Copy link
Sponsor Member

timholy commented Nov 2, 2022

I added the beginnings of a test that also illustrate how to use this. Not quite sure why nmark and nsweep are still unincremented after calling GC.gc(true), is that a sign that something still isn't working?

@timholy
Copy link
Sponsor Member

timholy commented Nov 2, 2022

For (de)serializing instances, what do you need @fingolfin? Can one just write the binary blob? (It is effectively bitstype?) Or does something need to be fixed up upon deserialization?

@fingolfin
Copy link
Contributor

fingolfin commented Nov 2, 2022

I discussed this a bit at #46214. Short answer: no, it isn't a bitstype (if it was, then I'd be using a bitstype!).

In fact, we currently use three foreign types:

  1. GapObj is essentially equivalent to mutable struct GapObj ptr::Ptr{Cvoid} end but with the twist that this pointer points inside another foreign object
    • for historical reasons, ptr doesn't point at the start of another object but rather points to 8 bytes beyond the start, to skip over a header
    • this other object also has a foreign type, but a different one (see below)
    • GapObj is the only foreign type from us you'll directly see in plain Julia (unless you go to efforts via unsafe_load or so to extract a pointer to one)
    • this is also the type I am kinda hoping to be able to replace by a mutable struct; to deal with the "pointer with an offset" issue, my plan is to use something like mutable struct GapObj ptr::Ptr{Cvoid} obj::Any end when ptr will basically point 8 bytes past the start of obj; but there are subtle interactions with the GC that need to be dealt with
  2. Bag is a bunch of data. It consists of a header which in our case looks like this:
    typedef struct {
        uint8_t type : 8;
        uint8_t flags : 8;
        uint16_t : 16
        uint64_t size : 48
    } BagHeader;
    This 8 byte header is followed by the actual data, the layout of which mostly depends on the type in the header. I.e. what is a reference to other objects and what is raw data depends on type. But unfortunately, this type is allowed to change! (My dream would be to replace type with different Julia types, but this requires lots of change to the GAP kernel and some existing "kernel extensions", so nothing we can do for now.) Anyway, for some of these, writing out the bit data would be sufficient (e.g. we have a bigint type, similar to the GMP bigints in Julia, which also require custom serialization in the Julia kernel right now), but we also have a type akin to Vector{Any} that needs to be deal with recursively. Plus GAP uses "tagged pointers", to store "small" integers and a few other values directly in the pointer... This all means that scanning objects for references during GC, but also during (de)serialization, in general requires dedicated callbacks (I can't just specify a fixed "layout"). I'd be happy to implement these, if Julia was able to make use of them...
  3. Finally, LargeBag is like Bag but for "large" allocations (so the large argument for jl_new_foreign_type is set to true when defining this type)

src/datatype.c Outdated Show resolved Hide resolved
@fingolfin
Copy link
Contributor

I added the beginnings of a test that also illustrate how to use this.

Thank you, that's very helpful on multiple levels!!!

test/precompile.jl Outdated Show resolved Hide resolved
@PallHaraldsson
Copy link
Contributor

PallHaraldsson commented Nov 10, 2022

Plus GAP uses "tagged pointers", to store "small" integers and a few other values directly in the pointer... This all means that bother scanning objects for references during GC, but also (de)serialization, in general require full blown callbacks. I'd be happy to implement these, if Julia was able to make use of them...

[Not sure I understood "bother scanning" (typo for pointer scanning"?)]

It's interesting to know such "tagged pointers" actually work in Julia (or well, until they didn't, with recent PR). I've been thinking of using such a trick (for a new string type, to store a prefix in the "pointer to String"). I know this would work in e.g. C, but hadn't thought through the issue with Julia or other GC languages (know tagged done in Lisp). It would be good to have documented which bits of a pointer you can or can't alter without screwing up GC, guaranteed for the future.

#44527 (comment)

Support foreign-datatypes (GAP.jl), can be done after merge.

Is the tagged pointer this "foreign-datatype" (or at least what it points to)? It would be great if you can implement these (full) callbacks, if not already done (do I understand correctly that this PR, that I do not understand nor read carefully, does it?), and/or document somewhere what is already done or needs to be done.

https://github.com/JuliaLang/Juleps/blob/master/GcExtensions.md

  1. Conservative scanning of stack frames and objects. Currently, scanning does have to be precise. If we desire to use the GC for foreign code that requires conservative scanning [..]
    These callbacks are not per se thread-safe. It is up to to the callback implementation to ensure that no violations of thread-safety occur.

FYI: I do see:

https://github.com/vchuravy/ForeignCallbacks.jl

In my case my datatype wouldn't be foreign, in the sense of implemented in another language (I at least I hope it will be doable in Julia-only code), though I guess it could be done in C code, and it would be "foreign" in the sense of low-level C-like (or could as well be) Julia code. Maybe reconsider terminology to "tagged pointer type" from "foreign type"?

@fingolfin
Copy link
Contributor

Oops, the "bother" just can be removed, not sure what happened there (I probably edited back and forth and it's left over from some previous text). I've taken the liberty of editing a bit more afterwards to hopefully improve clarity.

It's interesting to know such "tagged pointers" actually work in Julia

I don't think they work! That's one of the reasons why we need to scan our data ourselves, instead of just specifying a "data layout": even if we know which bits are pointers and which are not, then the pointers still can be tagged...

OFF-TOPIC: So the "tagged pointers" are a feature of GAP (and FLINT, and other systems), and they are really important for us in computational algebra, because we need arbitrary precision integers all the time, and tagged pointer integers compensate for the deficiencies of larger integer arithmetic as implemented by GMP -- i.e., while your integers are small, you don't need allocation and get performance close to "native"; and only when they grow you switch to a proper GMP integer (or similar). Indeed, for us, maybe the primary grievance about Julia is that it defaults to machine integers (which to us is "performance over correctness", and so a boo-boo). Anyway, I'd love if some limited form of tagged pointers like this was possible in Julia... i.e. something which is either, say, 60 bits of payload plus a few tag/guard bits; or else an Any pointer (which I then could easily restrict further with a type assertion).

Oh, and many GC languages have such tagged pointers, in various variations. In GAP, two LSB bits are used to mark non-pointers (this conveniently uses that our pointers are always multiplies of sizeof(void *), so on 32bit archs the lower two bits are always 0 for pointers; the three non-zero values are used to differentiate what kind of "immediate" value is stored in the remaining bits; of course these could then be handled differently depending on context; but in GAP, one of the patterns is dedicated for what we call "immediate integers": specificall bit 0 is set to 1; and bit 1 is set to 0; then 61 of the remaining 62 bits store a value; and finally the MSB, bit 63, is used as a "guard bit", i.e. bit 63 and bit 62 are always equal; this makes it cheap to implement efficient addition with overflow handling: just add the two immediate values unsigned; perform some bit manipulations; if an overflow occurred, discard and restart with a big int. For details see https://github.com/gap-system/gap/blob/master/src/intobj.h.

Note that other implementations instead use top bits, esp. those which are 64bit only; they use that on many architectures, only 48 bits are really used for addresses, so you can store a ton of tags in the top bits.

Thinking about Julia, one could have a TaggedAny type which can be tested and will either yield an Any, or else a 60/61/62/63 bit payload (whatever; this could even be configurable), returned as a primitive type perhaps? (Or perhaps even as an UInt, leaving it up to the user whether they want to clean out tag bits and shift things around -- probably best for optimized work?) Then in my application I could interpret those as in GAP; for strings, you could do your thing (e.g. store bytes directly in the pointer?). And so on.

The main benefit of that type would be that the Julia GC would know about it and treat it like Any except when the lower bits are set, in which case it could ignore it. That would incur minimal overhead during GC scanning at least. But I am sure it'd have other ramifications which may be more painful; I don't claim to know the Julia kernel well enough to understand those. But I'd hope that in many ways it could be treat similar to Union{Any,UInt} ?

END OFF TOPIC

@fingolfin
Copy link
Contributor

Maybe reconsider terminology to "tagged pointer type" from "foreign type"?

No, that would be wrong. The "foreign type" we use are not the same as our "tagged pointers" which we take great pains to hide from Julia. There is no "tagged pointer type" either.

@vchuravy vchuravy added the backport 1.9 Change should be backported to release-1.9 label Nov 13, 2022
@fingolfin
Copy link
Contributor

So, what needs to be done to move this from draft to get it merged? It sounds as if #44527 may be merged soon, at which point GAP (and hence OSCAR) would be broken in Julia nightly builds. From my perspective, the ideal situation would be if this PR was merged first, then I could immediately adapt to it and hopefully not even have any outage.

One obvious thing that comes to mind is that I could try to adapt GAP.jl to the changes in this PR now, in an experimental branch of it, to validate the principle. Would that help? Is there anything else that ought to be done?

@vchuravy vchuravy marked this pull request as ready for review November 22, 2022 21:48
@vchuravy
Copy link
Sponsor Member Author

If you can confirm that this suffices and work for you that would be ideal. @timholy and I have been a bit spread thin so I would be more than happy to hand this over to you for now :)

@vchuravy
Copy link
Sponsor Member Author

@fingolfin I am not sure I understand why the test here doesn't work (nmark and nsweep) not being incremented.
I went through the reinitialization and that seemed to work fine.

The values for layout->npointers and layout->size are not preserved...

Hm do you have an example I can test? They should be preserved... Make sure you run with FORCE_ASSERTIONS=1 in your Make.user

@fingolfin
Copy link
Contributor

fingolfin commented Nov 25, 2022

@fingolfin I am not sure I understand why the test here doesn't work (nmark and nsweep) not being incremented. I went through the reinitialization and that seemed to work fine.

No idea, I can try to look at that test, but right now I need to prepare for a lecture :-)

The values for layout->npointers and layout->size are not preserved...

Hm do you have an example I can test? They should be preserved... Make sure you run with FORCE_ASSERTIONS=1 in your Make.user

My testing set up is rather non-trivial, I afraid... But in a nutshell, I inserted printf statements to show the values of dt->layout->npointer and size and they are always zero after the calls to jl_reinit_foreign_type. You can see the printf statements in https://github.com/gap-system/gap/pull/5224/files

I am testing with a patched version of GAP plus GAP.jl, using the code in these PRs:

UPDATE: here are the new simplified instructions:

JULIA=/some/path/julia # path to julia executable built from this PR
BASEDIR=/tmp/foreign  # or any temp directory of your choice

mkdir -p $BASEDIR
cd $BASEDIR
git clone https://github.com/fingolfin/GAP.jl -b mh/jl_reinit_foreign_type
git clone https://github.com/fingolfin/GAP_jll.jl -b mh/jl_reinit_foreign_type
git clone https://github.com/fingolfin/gap -b mh/jl_reinit_foreign_type

# unfortunately we have to compile GAP once in "regular mode" to let
# it generate a few files
# (to speed this up, make sure GMP is installed in your system, else it
# will try to compile its own copy of GMP)
cd gap
./autogen.sh
./configure
make -j8
cd ..

# now setup the override
cd GAP.jl
rm -rf gap_jll_override
$JULIA --proj=override etc/setup_override_dir.jl ../gap gap_jll_override --debug

# ... and run it
$JULIA --proj=override etc/run_with_override.jl gap_jll_override

# now you should be on a julia prompt and can do `using GAP`

@fingolfin
Copy link
Contributor

Just for completeness, after all that setup, I then see this (note the debug output):

julia> using GAP
[ Info: Precompiling GAP [c863536a-3901-11e9-33e7-d5cd0df7b904]
[ Info: Compiling JuliaInterface ...
About to call :GAP_InitJuliaMemoryInterface
BAR: entering GAP_InitJuliaMemoryInterface
FOO: entering GAP_InitJuliaMemoryInterface
datatype_mptr: npointers 0, size 0
datatype_bag: npointers 0, size 0
datatype_largebag: npointers 0, size 0
After :GAP_InitJuliaMemoryInterface

This is with the stat of this PR as of now, i.e., commit 2ff47e6

@vchuravy
Copy link
Sponsor Member Author

@fingolfin figured out why the test was borked. I also now test that npointers survives serialization.

@fingolfin
Copy link
Contributor

@vchuravy yet in my "real world" code, npointer does not seem to be restored... verified on macOS on an M1 and under Linux / x86. Not that I see how that matters, but I run the test on three systems in total, to make sure I didn't have something weird going on somewhere.

I've also updated to 4d1fd2c but no change.

test/gcext/Foreign/deps/foreignlib.c Outdated Show resolved Hide resolved
src/staticdata.c Show resolved Hide resolved
test/gcext/Makefile Outdated Show resolved Hide resolved
src/staticdata.c Outdated Show resolved Hide resolved
src/datatype.c Outdated Show resolved Hide resolved
@vchuravy
Copy link
Sponsor Member Author

vchuravy commented Nov 25, 2022

does not seem to be restored...

It's not during the restoration... It's seemingly during the serialization.

@fingolfin seems like the problem is sitting in front of your monitor ;)

https://github.com/oscar-system/GAP.jl/pull/845/files#diff-525588e68b2421901be164965272940effa093de733a3fd36c4b9a4344b8c20cR87-R98

Should probably have been:

const GapObj = ccall(:jl_new_foreign_type, Any, (Symbol, Module, Any, Any, Any, Cint, Cint),
                     :GapObj, GAP, Any, C_NULL, C_NULL, 1, 0)

# TODO: can we "hide" the following two types? no Julia code should ever "touch" them,
# let alone see instances of them
const Bag = ccall(:jl_new_foreign_type, Any, (Symbol, Module, Any, Any, Any, Cint, Cint),
                  :Bag, GAP, Any, C_NULL, C_NULL, 1, 0)

const LargeBag = ccall(:jl_new_foreign_type, Any, (Symbol, Module, Any, Any, Any, Cint, Cint),
                       :LargeBag, GAP, Any, C_NULL, C_NULL, 1, 1)

@vchuravy
Copy link
Sponsor Member Author

Also note that I moved the call to jl_new_foreign_type in the test to the c-library since we probably want to set mark and sweep correctly even during pre-compilation.

@fingolfin
Copy link
Contributor

@fingolfin seems like the problem is sitting in front of your monitor ;)

Aaargh, indeed (sadly not the first time, either sigh).

Great then all seems to work. Thank you again for working on this and your patience. I'll clean up my various PRs now.

Yeah, I only put the jl_new_foreign_type into GAP.jl because it was a hack and convenient at the time.

The most annoying part now will be to structure the code so that it'll keep working as before on "older" Julia versions but use the new conventions otherwise. Hurm...

  • for GAP, possibly add an autoconf check for the presence of jl_reinit_foreign_type, and then compiled different code
  • for GAP_jll, I could add a check for cglobal(:jl_reinit_foreign_type) (if it throws an exception, call the current code, otherwise do nothing)
  • and then similar for GAP.jl...

Co-authored-by: Tim Holy <tim.holy@gmail.com>
Co-authored-by: Max Horn <max@quendi.de>
@vchuravy
Copy link
Sponsor Member Author

I split out the get the tests working commits into #47699

From my perspective this is good to go.

@vchuravy vchuravy merged commit 96de609 into teh-vc/serialize_partial Nov 27, 2022
@vchuravy vchuravy deleted the vc/reinit_foreign_types branch November 27, 2022 23:26
@vchuravy vchuravy removed the backport 1.9 Change should be backported to release-1.9 label Nov 27, 2022
vchuravy added a commit that referenced this pull request Nov 28, 2022
Co-authored-by: Tim Holy <tim.holy@gmail.com>
Co-authored-by: Max Horn <max@quendi.de>
vchuravy added a commit that referenced this pull request Nov 29, 2022
Co-authored-by: Tim Holy <tim.holy@gmail.com>
Co-authored-by: Max Horn <max@quendi.de>
vchuravy added a commit that referenced this pull request Nov 29, 2022
Co-authored-by: Tim Holy <tim.holy@gmail.com>
Co-authored-by: Max Horn <max@quendi.de>
KristofferC pushed a commit that referenced this pull request Dec 8, 2022
Co-authored-by: Tim Holy <tim.holy@gmail.com>
Co-authored-by: Max Horn <max@quendi.de>
(cherry picked from commit e06a591)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:precompilation Precompilation of modules domain:packages Package management and loading GC Garbage collector
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants