Efficient trie lookup for boolean Unicode properties #33098
Conversation
Replace binary search of ranges with trie lookup using leaves of 64-bit bitmap chunks. Benchmarks suggest this is approximately 10x faster than the bsearch approach.
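For context, the leaf-level test the description refers to is roughly the following. This is a minimal sketch rather than the PR's actual generated code: each leaf is a 64-bit bitmap covering 64 consecutive code points, so membership becomes a shift and a mask instead of a binary search over ranges.

```rust
// Minimal sketch of a leaf test: one u64 chunk covers code points
// 64*k .. 64*k+63, and membership is a shift plus a mask.
fn leaf_contains(chunk: u64, c: u32) -> bool {
    ((chunk >> (c & 63)) & 1) != 0
}

fn main() {
    // A chunk with only the bit for 'A' (0x41, bit 0x41 & 63 == 1) set.
    let chunk: u64 = 1 << ('A' as u32 & 63);
    assert!(leaf_contains(chunk, 'A' as u32));
    assert!(!leaf_contains(chunk, 'B' as u32));
}
```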
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @nrc (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information.
@SimonSapin, who I discussed this with on IRC.
Somehow got in my head that >> 8 was the right shift for a chunk of 64. Oops, sorry.
Neat! Out of curiosity, do you know what impact this has on the compiled table sizes, if any?
@alexcrichton A very slight uptick, but almost a wash. By my calculations, the old table size is 29288 bytes and the new is 30451. If it were a concern, it'd be possible to get that back by special-casing the sets with a small max value, but honestly I'd say code simplicity is more important.
What is the perf impact of switching to (c >> 6) - 0x20 and (c >> 12) - 0x10? It would reduce the space used a bit, but I don't know if it would lose too much of the perf gain.
@samlh I just measured it; the performance difference is imperceptible. This makes sense: the subtraction probably happens in parallel with getting the address of r2 into a register. It would probably save 500 bytes; I think there are 11 tables. Happy to make the change if people feel it's worthwhile.
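To make the trade-off concrete, here is a small sketch of the two indexing schemes under discussion. The table name r2 and the bounds follow the comments above, but the code is illustrative rather than the generated source.

```rust
// Second-level index for code points in 0x800..0x10000 (illustrative).
// Without the subtraction, the table must span 0x10000 / 64 = 0x400 entries,
// even though the first 0x20 (covering c < 0x800) are handled elsewhere.
fn r2_index_unadjusted(c: u32) -> usize {
    (c >> 6) as usize // 0x20..0x400
}

// With the subtraction, the table shrinks to (0x10000 - 0x800) / 64 = 992
// entries, at the cost of one extra subtract on the lookup path.
fn r2_index_adjusted(c: u32) -> usize {
    ((c >> 6) - 0x20) as usize // 0..0x3e0
}

fn main() {
    assert_eq!(r2_index_unadjusted(0x800), 0x20);
    assert_eq!(r2_index_adjusted(0x800), 0);
}
```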
@@ -307,12 +307,114 @@ def emit_table(f, name, t_data, t_type = "&'static [(char, char)]", is_pub=True,
    format_table_content(f, data, 8)
    f.write("\n ];\n\n")


def emit_trie_lookup_range_table(f):
    f.write("""
pub struct BoolTrie { |
It'd be great to have a high-level overview of what all these fields represent and how the structure actually represents the data.
Agreed; with some comments I think this is good to go, though.
Adds a comment which explains the trie structure, and also does a little arithmetic on lookup (no measurable impact, looks like modern CPUs do this arithmetic in parallel with the memory lookup to find the node) to save a bit of space. As a result, the memory impact of the compiled tables is within a couple hundred bytes of the old bsearch-range structure.
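For readers without the diff open, the structure and lookup being described look roughly like this. It is a sketch of the shape under the assumptions discussed in this thread (three char ranges, u8 child indices, 64-bit leaves), not the exact code the generator emits.

```rust
// Rough sketch of the trie shape; array sizes follow the ranges discussed
// above, but the real arrays are emitted by the Python generator.
pub struct BoolTrie {
    // 0..0x800: leaves stored directly, indexed by c >> 6.
    pub r1: [u64; 0x800 / 64],

    // 0x800..0x10000: one level of u8 child indices into shared leaves.
    pub r2: [u8; (0x10000 - 0x800) / 64],
    pub r3: &'static [u64],

    // 0x10000..0x110000: two levels of u8 indices, then shared leaves.
    pub r4: [u8; (0x110000 - 0x10000) / 0x1000],
    pub r5: &'static [u8],
    pub r6: &'static [u64],
}

fn trie_range_leaf(c: usize, chunk: u64) -> bool {
    ((chunk >> (c & 63)) & 1) != 0
}

pub fn trie_lookup(r: &BoolTrie, c: char) -> bool {
    let c = c as usize;
    if c < 0x800 {
        trie_range_leaf(c, r.r1[c >> 6])
    } else if c < 0x10000 {
        let child = r.r2[(c >> 6) - 0x20];
        trie_range_leaf(c, r.r3[child as usize])
    } else {
        let child = r.r4[(c >> 12) - 0x10];
        let leaf_idx = r.r5[(child as usize) * 64 + ((c >> 6) & 63)];
        trie_range_leaf(c, r.r6[leaf_idx as usize])
    }
}
```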
Some questions / comments:

The BoolTrie and tables are pub, following the pattern of the existing code, but this feels like not-great encapsulation. I'd prefer the impl details to be private and just the query functions to be pub, but I'm not sure whether that would break anything. Thoughts?

Should I aggressively dead-code-eliminate the emit of the bsearch? It was useful to have both around during development; now I'm thinking only one is needed.

What's the feeling on tests? There are some in doc-tests, but coverage is low. For my own work I've been doing some randomly-generated tests, but I'm not sure how well that fits in here.

Should I do similar magic for case mapping? It's trickier because it's not just a bool, but possible. I'm thinking a different PR, though.
Ah no need to worry much about this. If you can get it private that'd be great, but don't sweat it too much. This crate is unstable so we're fine to change the API at any time basically. I thought the tables module was private regardless, but maybe it's not?
I wonder how hard it might be to just exhaustively test everything? In theory we could generate tests as part of the
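A hedged sketch of what such an exhaustive check could look like, assuming both the old range tables and the new tries stay available to the test. bsearch_range_table mirrors the existing range lookup, and the trie lookup is passed in as a closure so the names here don't have to match the generated identifiers.

```rust
use std::cmp::Ordering;

// Reference lookup over sorted, inclusive (start, end) ranges, mirroring
// the existing bsearch approach.
fn bsearch_range_table(c: char, ranges: &[(char, char)]) -> bool {
    ranges
        .binary_search_by(|&(lo, hi)| {
            if hi < c {
                Ordering::Less
            } else if lo > c {
                Ordering::Greater
            } else {
                Ordering::Equal
            }
        })
        .is_ok()
}

// Exhaustive cross-check over every valid scalar value; `trie` is whatever
// trie-based lookup is being validated.
fn check_exhaustive(trie: impl Fn(char) -> bool, ranges: &[(char, char)]) {
    for c in (0u32..=0x10FFFF).filter_map(std::char::from_u32) {
        assert_eq!(
            trie(c),
            bsearch_range_table(c, ranges),
            "mismatch at U+{:04X}",
            c as u32
        );
    }
}
```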
Separate PR sounds good to me, but if it's also 10x faster we'd certainly love to have the change :)
Another silly idea (not sure if worth doing): we could combine all the u64 tables into one, where the first 32 values are as before but can be reused as values for the other sets of leaves. This risks running out of indices for leaf nodes, but I think the current maximum total number of values is 249. The bigger question is whether this would save enough bytes to be at all worth the effort :P. Overall, I like the code as-is, though I agree that marking everything possible as private and removing old code would be nice.
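Loosely sketched, the combined-table idea might look like this. The names and layout are purely hypothetical (nothing like this exists in the PR): the first 32 chunks of the pool double as the old r1 for the low range, while the same entries, and the rest of the pool, are addressable by the u8 indices used at the deeper levels, with 256 as the ceiling those indices impose.

```rust
// Hypothetical sketch of the combined-leaf idea: one deduplicated u64 pool
// whose first 32 chunks play the old r1 role for c < 0x800, with every other
// level storing u8 indices into the same pool so identical chunks are kept once.
struct CombinedBoolTrie {
    leaves: &'static [u64], // first 32 entries cover 0..0x800 directly
    r2: [u8; 992],          // indices into `leaves` for 0x800..0x10000
    r4: [u8; 256],          // first-level indices for 0x10000..0x110000
    r5: &'static [u8],      // second-level indices into `leaves`
}

fn combined_lookup(t: &CombinedBoolTrie, c: char) -> bool {
    let c = c as usize;
    let leaf = if c < 0x800 {
        t.leaves[c >> 6]
    } else if c < 0x10000 {
        t.leaves[t.r2[(c >> 6) - 0x20] as usize]
    } else {
        let child = t.r4[(c >> 12) - 0x10] as usize;
        t.leaves[t.r5[child * 64 + ((c >> 6) & 63)] as usize]
    };
    ((leaf >> (c & 63)) & 1) != 0
}
```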
@raphlinus did you wanna update this with some more tests? Or do you think this is good to go? |
@alexcrichton I'd lean toward "good to go." There isn't an obvious direction the tests should go that doesn't lead to bloat of the code-gen scripts.
Ok, sounds good to me, thanks again @raphlinus!
⌛ Testing commit cfaf66c with merge c80d23d...
💔 Test failed - auto-mac-64-opt-rustbuild
@bors: retry
Umm... what? If that happens again I'll file a spurious issue
💥 Test timed out
@bors: retry |