Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make it so grapheme can be used as an aa key #8855

Merged
merged 1 commit into from
Nov 29, 2023

Conversation

schveiguy
Copy link
Member

Need an issue report. Just want to make sure it passes CI

@dlang-bot
Copy link
Contributor

dlang-bot commented Nov 25, 2023

Thanks for your pull request, @schveiguy!

Bugzilla references

Auto-close Bugzilla Severity Description
24267 regression [REG 2.106 beta] Grapheme cannot be used as an AA key

Testing this PR locally

If you don't have a local development environment setup, you can use Digger to test this PR:

dub run digger -- build "stable + phobos#8855"

@PetarKirov
Copy link
Member

@schveiguy the CI is green!

@schveiguy
Copy link
Member Author

Yeah, I need an issue report. @kinke had a use case (see #8787 (review)), I'll see if I can test this and make a report tomorrow.

@kinke
Copy link
Contributor

kinke commented Nov 27, 2023

[The use case is https://github.com/adamdruppe/arsd/blob/410216f649f192f7bedda140f121deb00fd7fbc8/terminal.d#L2276]

I haven't filed an issue, since the regression was introduced for beta/RC releases only so far.

// Define a default toHash to allow AA usage
size_t toHash() const @trusted
{
return hashOf(slen_, hashOf(small_));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT, this only works for the small-case, not for the heap-buffer case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The heap buffer data doesn't matter. The opEquals is just comparing the pointers.

Not saying this is what we want it to be, but that is what it is now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I know that the opEquals doesn't work in that case either, and that both implicit opEquals and toHash before v2.106 were most likely working in that case only too (IIRC, unions are by-default compared and hashed as a binary blob of the union's size).

Ideally, we'd be comparing/hashing the underlying bytes slice. But as the [s]len_ fields seem to be the number of dchars, a proper comparison/hash might have to rely on the dchar range returned by opSlice().

Copy link
Member Author

@schveiguy schveiguy Nov 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't actually know what the tupleof comparison does for the union...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, I think the proper opEquals and toHash should be based on the opSlice.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After a quick check, it checks all members of the union, i.e., the same payload multiple times. For the pointers and integers here, it amounts to something like a double-memcmp, instead of a single memcmp for the implicit union comparison.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"it" being the default opEquals or being tupleof == tupleof?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new tupleof comparison. :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wow. I just tested it out. And this is pretty dangerous. For example, if one of the union members (that isn't valid) has an opEquals, then using the tupleof calls this, the default does a memcmp.

In this case of course, it's not a problem. But for sure this is not a good implementation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that reminds me of https://issues.dlang.org/show_bug.cgi?id=22171 - the same problem for autogenerated __xtoHash, which uses all union members.

@schveiguy schveiguy marked this pull request as ready for review November 29, 2023 21:44
@schveiguy
Copy link
Member Author

OK, bug report added and ready to close.

@dlang-bot dlang-bot merged commit 01957e9 into dlang:stable Nov 29, 2023
14 checks passed
@schveiguy schveiguy deleted the graphemehash branch November 29, 2023 22:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants