Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: std/marshal unmarshaling of ref objects #22983

Merged
merged 7 commits into from
Nov 26, 2023
Merged

fix: std/marshal unmarshaling of ref objects #22983

merged 7 commits into from
Nov 26, 2023

Conversation

viega
Copy link
Contributor

@viega viega commented Nov 24, 2023

Fixes #16496

Marshal doesn't properly unmarshal most ref objects; the exceptions being nil ones
Test case added.

Note that this test (t9754) does pass locally, but there are tons of failures by default on OS X arm64, mostly around the bohem GC, so it's pretty spammy, and could easily have missed something. If there are better instructions please do let me know.

John Viega and others added 3 commits November 23, 2023 23:44
returning a dereference of the pointer, which is clearly not correct.

This results in std/marshal being unable to unmarshal ref types--
instead of the new Any object's value field getting the correct
pointer to the object, it was getting the value obtained by
dereferencing the pointer. Perhaps there were other downstream effects
too; I'm not sure if anything else in stdlib uses Any.
lib/core/typeinfo.nim Outdated Show resolved Hide resolved
@ringabout
Copy link
Member

ringabout commented Nov 25, 2023

It seems that it might not be the root cause of the problem. It didn't seem to fix #16496, which is supposed to pass the assertions.

By checking genericAsgn, we know that refs are ptr pointers indeed

unsureAsgnRef(cast[PPointer](dest), cast[PPointer](s)[])

I think cast[pointer] might not be correct.

Comment on lines 342 to 345
result = cast[ppointer](x.value)[]

if result != nil and x.rawType.kind != tyPointer:
result = cast[pointer](x.value)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still makes no sense to me. Maybe it's a marshal.nim bug?

Copy link
Contributor Author

@viega viega Nov 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None of this code was written for clarity.

I've just spent some time on it, and I now understand why there was an extra dereference happening.

The DEREFERENCED value (the actual memory location of the object) is returned to marshal via getPointer(), and this is what is cached not the pointer that the allocating Any set.

So the real issue is that, since getPointer() returns the actual reference to the memory the user is given, but setPointer() expects something different depending on how the any object has been marked.
If it's been marked as a pointer, it expects the actual reference to memory we need. Otherwise, it expects a POINTER to the reference. This is non-intuitive, and not clearly documented in setPointer(), but I am sure it's related to the underlying storage model for refs, which I did not dig into (any pointers to docs would be appreciated; I'd prefer not to dig through code).

When the second Any comes along and Marshal recognizes it's using the same object, it's passing to 'setPointer' the actual memory location, NOT a pointer to that pointer.

So the fix was doing the right thing for this one case, but yes, the real problem is that Marshal should have been storing the cached memory address in a local variable, and pass that, instead of passing the raw memory location.

So I've gone ahead and fixed it properly, and pushed that.

Again the asymmetry between setPointer() and getPointer() clearly caused confusion and was unexpected. Some better documentation in typeinfo would be welcome-- it'd be nice to understand why it's expecting to be passed a pointer to the object's address, instead of the object's address.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A ref is implemented as a single ptr in Nim. The asymmetry in setPointer likely arises from the fact that pointer in Nim points to an object of unknown size.

@viega
Copy link
Contributor Author

viega commented Nov 26, 2023

It seems that it might not be the root cause of the problem. It didn't seem to fix #16496, which is supposed to pass the assertions.

By checking genericAsgn, we know that refs are ptr pointers indeed

unsureAsgnRef(cast[PPointer](dest), cast[PPointer](s)[])

I think cast[pointer] might not be correct.

@ringabout I just added the test from #16496, and it also passes now.

@Araq Araq merged commit 5b2fcab into nim-lang:devel Nov 26, 2023
19 checks passed
Copy link
Contributor

Thanks for your hard work on this PR!
The lines below are statistics of the Nim compiler built from 5b2fcab

Hint: mm: orc; opt: speed; options: -d:release
176852 lines; 7.525s; 766.094MiB peakmem

@ringabout
Copy link
Member

I just added the test from #16496, and it also passes now.

You are incredible!

narimiran pushed a commit that referenced this pull request Apr 18, 2024
Fixes #16496

![Marshal doesn't properly unmarshal *most* ref objects; the exceptions
being nil
ones](https://github-production-user-asset-6210df.s3.amazonaws.com/4764481/285471431-a39ee2c5-5670-4b12-aa10-7a10ba6b5b96.gif)
Test case added.

Note that this test (t9754) does pass locally, but there are tons of
failures by default on OS X arm64, mostly around the bohem GC, so it's
pretty spammy, and could easily have missed something. If there are
better instructions please do let me know.

---------

Co-authored-by: John Viega <viega@Johns-MacBook-Pro.local>
Co-authored-by: John Viega <viega@Johns-MBP.localdomain>
Co-authored-by: ringabout <43030857+ringabout@users.noreply.github.com>
(cherry picked from commit 5b2fcab)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

marshal regression with refs that alias the same object
3 participants