-
Notifications
You must be signed in to change notification settings - Fork 73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add VFS 2.0 #90
Add VFS 2.0 #90
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome stuff, thanks for your contributions 👍
I'm wondering if we have to introduce a separate class for the v2 API, do you think there's a way to add these methods to the existing class?
I made some changes to JS interop to make it simpler. But I can't skip mapping, because |
I tried to add a legacy option to VFS2 and leave |
Can we enable WAL mode in SQLite? We could achieve a better performance on this VFS I think |
How SQLite handles data corruption? When bytes write with VFS, IndexedDB may can't persist because works async. SQLite think it is written, database will be corrupted next load. Am I right? Can we handle this, or simply warn users do not rely on data saved to SQLite? I am using SQLite for cache only, so corruption not a problem. App re-downloads database in this case. |
My idea was to keep the sqlite3 build small which is why I disabled it. I can enable it if it turns out to improve performance.
I don't think we can really handle this: Even though IndexedDB provides ACID guarantees, the VFS API from sqlite requires us to be synchronous and IndexedDB is async, so we can't rely on that. The assumption is that writes will probably go through because they're fast, but as you said this isn't a safe assumption to make. If you need synchronization I'd probably recommend using shared workers. We could call asynchronous APIs by running them in a web worker and using atomics to have the main script wait for them synchronously. That sounds complicated though. |
That's awesome idea. With web compiler, FVS can be compiled to a single js file and loaded as shared worker. This way, it isn't necessary to compile to js and load the whole sqlite (or Drift) as shared worker. |
No pressure, but how's the block storage coming along? I feel Iike that approach would scale much better for large databases, so please let me know if there's anything I can do to help! |
Sorry, I didn't had much time to finish in this week. Tomorrow I'll try to. |
No worries, thanks for the update! |
Finally done. Works with current tests and in real application. Need some edge-cased tests, update comments. Journal is written very often. Should give it a try to use WAL mode. I'm thinking on a sync version with shared worker as I mentioned before. |
Nice work! I'm taking a look at it now. Can you help me understand why we're storing each file in a separate object store? I'm trying to refactor it so that we'd have one object store with file names + attributes and one with blocks, with each block referencing a file id and an offset. Do you see any problem with that right away? Maybe I'm missing something.
This would be awesome! Let me know if I can help with that, I've also thought about adding a worker-backed file system to this package. But either way let's do that in a separate PR please :D |
Storing files in separate objectStores seems much simpler. Writing or inserting blocks can be faster in dedicated objects (much less key lookup and index rebuild) and modifying metadata on each write can doubles the write time. Because of async operation, this doesn’t seem like a big deal, but slower write speed is more likely to case database corruption. Write speed will also be important in the shared worker version because of synchronous writing. I think using single object store + metadata makes things more complicated without adding benefits. Creating new file could be faster, but it's not a point here I think. But this is your library, its up to you how things going :) |
Can you compile a new sqlite.wasm? tried on Mac without success. I copied and committed the latest wasm to sqlite3/example/web/sqlite3.wasm for tests in this PR. I think this should leave there and updated when sqlite3 source changes. I can't run this PR now because develop merged and my wasm is old. |
Of course, you can grab the latest sqlite3.wasm from https://storage.googleapis.com/simon-public-euw3/assets/sqlite3.wasm (the same file is used for tests with GitHub actions). |
I've pushed some changes to only use two object stores (one for file names and one for blobs). I've also moved the block storage into the IndexedDB file system, IMO there's no big advantage of also having that in the in-memory file system. I've created the I've removed some path normalization logic you added because I didn't fully understand it. I think that sqlite3 will call our |
I'll merge this and get this released tomorrow. Thanks for your continued great work on this package 🚀 |
Thank you for improve my code. Yes, it's unnecessary having blocks in-memory. It was easier for me to test if I modified the original code as little as possible. I see you wrote your own cursor handler. I did not use the original because it discarded the transaction. The original implementation is very buggy. I don't know if it's worth dealing with and contributing to DartSDK. |
Yeah I feel like exposing IndexedDB cursors as streams was an unfortunate design choice as pauses or not immediately reacting to new rows will close the transaction. So it's very easy to cause deadlocks with that API.
I think I want the VFS API to mainly be used by sqlite3 internally at the moment, and for that it's preferable to avoid path normalization logic for every write. It's not exactly user friendly, I'll see if it makes sense to polish that API with path normalization and expose it more prominently. |
Yeah I just spent some time debugging this too :D I think it's because we don't reset cached file ids after a file is deleted, and the journal file gets deleted and re-created a lot. I've fixed this locally, but I need to write some tests to ensure this is working as intended. |
I can send you a sample project. It is working when Drift loaded as worker. But run in main application this exception comes up. It is working with the previous non-block VFS. |
I think you're just not seeing the error messages in the main log then, for me a similar error showed up when inspecting the worker. Also, workers are completely broken because instantiating a I believe I have fixed this in 94d9652, can you try that state in your project and see if that fixes the errors for you as well? |
Yeah! It's working like a charm now :) |
I created a new, improved files system. First I tried to fix the original, but after I while I realised creating a separate one will be better and more backward compatible. However, I added a path validator with exception to temp files.
When I first try WASM with IndexDBFileSystem, the data wasn't saved. There was no error, and I have no clue what's going on. After hours of investigation I found that the
persistanceRoot
is not prepended to path! I usedpersistanceRoot
other than '/'. I thought provide 'test_database' inWasmDatabase(path:)
will open database as/db/test_database
. The second problem is IndexDBFileSystem not checks path validity. Stores file in memory, but writing to IndexDB silently fails. This applies for temp files too, without workaround. Temp file path always starts with/tmp
. With custompersistanceRoot
temp files will never be saved to IndexedDB.