Improve FST performance created by Builder [LUCENE-9481] #10520
Comments
Anh Dung Bui (migrated from JIRA): I realize SynonymMap also uses FST and might have the same problem, although Builder has been renamed to FSTCompiler now.
Michael McCandless (@mikemccand) (migrated from JIRA): Wow, that is a fascinating result! Maybe, instead of a paged …
Coming back to this fun FST issue ... Given that FSTs are now readable off-heap, if we simply fixed FST writing to also be off-heap (this should not be so hard, since the writing is already append-only), then we wouldn't have these silly performance problems, because we are re-inventing an on-heap filesystem (a paged byte[]), badly. But a less radical fix here would be good too (progress not perfection): growing the …
Thanks @mikemccand. I originally also thought of allocating a single … As our use case is building and using the FST at runtime, there might be some penalty (I'm not sure) if we read/write entirely through the filesystem. In my previous experience with Java I/O, FileChannel is much faster than InputStream/OutputStream when writing to the filesystem, but even so it is still slower than main memory. I'm curious whether Lucene has a benchmark comparing off-heap and heap mode. (Also, by off-heap I assume you mean the filesystem, rather than direct memory, e.g. with …)
Maybe we could first allow FSTCompiler to specify its own DataOutput even when building the tree on the fly, instead of always relying on …
I'm planning to refactor BytesStore into an interface that can be chosen from the FST builder, so one can decide whether on-heap, off-heap, or on-heap without blocks is best suited.
Awesome, thanks @dungba88! But, at write time, can we avoid …? Let's have further discussions on #12543.
Ahh, yes, exactly! (Sorry, catching up and reading comments out-of-order!).
Well, Lucene's …
I think in the original issue that introduced off-heap FST reading, …
Hi,
We are using Lucene 8, and recently we observed a performance issue with FST. When an FST is used directly from org.apache.lucene.util.fst.Builder.finish(), it's about 40-50% slower than when it is first saved to a DataOutput (e.g. filesystem or heap) and loaded again. The FST we used is about 1.6 MB.
Using VisualVM, we tracked the issue down to the DataInput.readVInt method, which in turn calls readByte on the reversed BytesReader. When the FST is loaded from the filesystem/heap, it uses OnHeapFSTStore, which uses ReverseBytesReader; but when it is created by the Builder, it is backed by BytesStore. The problem is that when the number of blocks in BytesStore is not 1, it uses an anonymous class for reading bytes, and possibly because of JVM inlining, ReverseBytesReader.readByte is inlined while the anonymous class's readByte is not. We confirmed this by adding the -XX:+PrintInlining flag.
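The shape of the two read paths can be sketched in plain Java (hypothetical class names; this is not Lucene's actual implementation). The single-block reader has a trivial readByte that the JIT inlines easily, while the paged reader must first map the logical position to a block on every call, which is the kind of body the JIT may decline to inline:

```java
// Hypothetical sketch of the two reverse-read paths; names do not match Lucene's internals.
interface ReverseReader {
    byte readByte();
}

// Single-block path: trivial body, easily inlined by the JIT.
final class SingleBlockReader implements ReverseReader {
    private final byte[] block;
    private int pos;
    SingleBlockReader(byte[] block, int startPos) { this.block = block; this.pos = startPos; }
    public byte readByte() { return block[pos--]; }
}

// Multi-block path: every read must first map the logical position to a block.
final class PagedReader implements ReverseReader {
    private final byte[][] blocks;
    private final int blockBits;  // block size = 1 << blockBits
    private final int blockMask;
    private long pos;
    PagedReader(byte[][] blocks, int blockBits, long startPos) {
        this.blocks = blocks;
        this.blockBits = blockBits;
        this.blockMask = (1 << blockBits) - 1;
        this.pos = startPos;
    }
    public byte readByte() {
        byte b = blocks[(int) (pos >> blockBits)][(int) (pos & blockMask)];
        pos--;
        return b;
    }
}

public class ReverseReadDemo {
    public static void main(String[] args) {
        byte[] data = {10, 20, 30, 40};
        ReverseReader single = new SingleBlockReader(data, 3);
        // Same bytes split across two 2-byte blocks (blockBits = 1).
        byte[][] paged = {{10, 20}, {30, 40}};
        ReverseReader multi = new PagedReader(paged, 1, 3);
        for (int i = 0; i < 4; i++) {
            System.out.println(single.readByte() + " " + multi.readByte());
        }
    }
}
```

Both readers return the same bytes in reverse order; only the per-call work (and hence the JIT's inlining decision) differs.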
Currently we have two workarounds, which also confirm the above theory:
But I think it would be better if this optimization were done inside Builder.finish() instead, e.g. by auto-adjusting the block size of BytesStore, or by copying it to an FSTStore when possible, since neither of the two workarounds above looks great. We have many FSTs with widely different sizes, so with the first option we would need to maintain the bytesPageBits for each dictionary; otherwise we risk wasting unnecessary heap or using an insufficient block size. I also think this issue would affect other Lucene users.
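The "copy it to an FSTStore" idea amounts to coalescing the pages into one contiguous array after building, so subsequent reads take the fast single-array path. A minimal sketch in plain Java (hypothetical names, not Lucene's FSTStore API):

```java
import java.util.ArrayList;
import java.util.List;

public class CoalesceDemo {
    // Copy a list of fixed-size pages (the last page possibly only partially
    // filled) into one contiguous array of exactly totalBytes bytes.
    static byte[] coalesce(List<byte[]> blocks, long totalBytes) {
        byte[] out = new byte[(int) totalBytes];
        int written = 0;
        for (byte[] block : blocks) {
            int toCopy = (int) Math.min(block.length, totalBytes - written);
            System.arraycopy(block, 0, out, written, toCopy);
            written += toCopy;
        }
        return out;
    }

    public static void main(String[] args) {
        List<byte[]> blocks = new ArrayList<>();
        blocks.add(new byte[]{1, 2, 3, 4});
        blocks.add(new byte[]{5, 6, 0, 0}); // only 2 bytes used in the last page
        byte[] flat = coalesce(blocks, 6);
        System.out.println(java.util.Arrays.toString(flat)); // [1, 2, 3, 4, 5, 6]
    }
}
```

The one-time copy costs O(size of the FST), which is cheap next to paying a megamorphic readByte on every subsequent lookup; the trade-off is a transient doubling of memory during the copy.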
For reference, the benchmark we set up is as below:
It reports that in 98 out of 100 runs, the version with the save/load logic is 40-50% faster on average.
Thanks
Migrated from LUCENE-9481 by Anh Dung Bui