Add VFS 2.0 #90

westito · 2022-04-23T14:03:28Z

I created a new, improved files system. First I tried to fix the original, but after I while I realised creating a separate one will be better and more backward compatible. However, I added a path validator with exception to temp files.

When I first try WASM with IndexDBFileSystem, the data wasn't saved. There was no error, and I have no clue what's going on. After hours of investigation I found that the persistanceRoot is not prepended to path! I used persistanceRoot other than '/'. I thought provide 'test_database' in WasmDatabase(path:) will open database as /db/test_database. The second problem is IndexDBFileSystem not checks path validity. Stores file in memory, but writing to IndexDB silently fails. This applies for temp files too, without workaround. Temp file path always starts with /tmp. With custom persistanceRoot temp files will never be saved to IndexedDB.

simolus3

Awesome stuff, thanks for your contributions 👍

I'm wondering if we have to introduce a separate class for the v2 API, do you think there's a way to add these methods to the existing class?

sqlite3/lib/src/wasm/file_system.dart

sqlite3/lib/src/wasm/file_system_v2.dart

sqlite3/lib/src/wasm/js_interop.dart

westito · 2022-04-23T17:06:23Z

I made some changes to JS interop to make it simpler. But I can't skip mapping, because promiseToFuture can't handle List<> type argument, so promiseToFuture<List<DatabaseName>>() not working.
What I found:
dart-lang/sdk#39627 (comment)

westito · 2022-04-24T08:37:41Z

I tried to add a legacy option to VFS2 and leave load() as deprecated function. But beyond persistanceRoot prepending, VFS2 use path normalisation. A lot of if (legacy) statements will make the code trashy. I think better replace the original implementation without backward compatible. Or create a separate V2 class.

westito · 2022-04-24T09:03:26Z

Can we enable WAL mode in SQLite? We could achieve a better performance on this VFS I think

westito · 2022-04-24T10:12:46Z

How SQLite handles data corruption? When bytes write with VFS, IndexedDB may can't persist because works async. SQLite think it is written, database will be corrupted next load. Am I right? Can we handle this, or simply warn users do not rely on data saved to SQLite? I am using SQLite for cache only, so corruption not a problem. App re-downloads database in this case.

simolus3 · 2022-04-24T16:21:07Z

Can we enable WAL mode in SQLite?

My idea was to keep the sqlite3 build small which is why I disabled it. I can enable it if it turns out to improve performance.

How SQLite handles data corruption?

I don't think we can really handle this: Even though IndexedDB provides ACID guarantees, the VFS API from sqlite requires us to be synchronous and IndexedDB is async, so we can't rely on that. The assumption is that writes will probably go through because they're fast, but as you said this isn't a safe assumption to make.

If you need synchronization I'd probably recommend using shared workers.

We could call asynchronous APIs by running them in a web worker and using atomics to have the main script wait for them synchronously. That sounds complicated though.

westito · 2022-04-24T19:31:22Z

That's awesome idea. With web compiler, FVS can be compiled to a single js file and loaded as shared worker. This way, it isn't necessary to compile to js and load the whole sqlite (or Drift) as shared worker.

westito · 2022-04-25T11:34:20Z

I'm working on a block based storage. Works like a charm. I still have to clean the code and write tests

simolus3 · 2022-04-30T05:37:02Z

No pressure, but how's the block storage coming along? I feel Iike that approach would scale much better for large databases, so please let me know if there's anything I can do to help!

westito · 2022-04-30T08:06:14Z

Sorry, I didn't had much time to finish in this week. Tomorrow I'll try to.

simolus3 · 2022-04-30T11:52:31Z

No worries, thanks for the update!

westito · 2022-05-02T10:38:52Z

Finally done. Works with current tests and in real application. Need some edge-cased tests, update comments. Journal is written very often. Should give it a try to use WAL mode. I'm thinking on a sync version with shared worker as I mentioned before.

simolus3 · 2022-05-02T15:13:13Z

Nice work! I'm taking a look at it now. Can you help me understand why we're storing each file in a separate object store? I'm trying to refactor it so that we'd have one object store with file names + attributes and one with blocks, with each block referencing a file id and an offset. Do you see any problem with that right away? Maybe I'm missing something.

I'm thinking on a sync version with shared worker as I mentioned before.

This would be awesome! Let me know if I can help with that, I've also thought about adding a worker-backed file system to this package. But either way let's do that in a separate PR please :D

westito · 2022-05-02T16:10:45Z

Storing files in separate objectStores seems much simpler. Writing or inserting blocks can be faster in dedicated objects (much less key lookup and index rebuild) and modifying metadata on each write can doubles the write time. Because of async operation, this doesn’t seem like a big deal, but slower write speed is more likely to case database corruption. Write speed will also be important in the shared worker version because of synchronous writing. I think using single object store + metadata makes things more complicated without adding benefits. Creating new file could be faster, but it's not a point here I think. But this is your library, its up to you how things going :)

westito · 2022-05-02T19:41:11Z

Can you compile a new sqlite.wasm? tried on Mac without success. I copied and committed the latest wasm to sqlite3/example/web/sqlite3.wasm for tests in this PR. I think this should leave there and updated when sqlite3 source changes. I can't run this PR now because develop merged and my wasm is old.

simolus3 · 2022-05-02T20:54:15Z

Of course, you can grab the latest sqlite3.wasm from https://storage.googleapis.com/simon-public-euw3/assets/sqlite3.wasm (the same file is used for tests with GitHub actions).

simolus3 · 2022-05-03T18:29:39Z

I've pushed some changes to only use two object stores (one for file names and one for blobs). I've also moved the block storage into the IndexedDB file system, IMO there's no big advantage of also having that in the in-memory file system. I've created the AsynchronousIndexedDbFileSystem class which is essentially an asynchronous version of the file system interface. The IndexedDbFileSystem now wraps an async FS. I hope that this distinction will make it easier to write a say WorkerBasedFileSystem in the future where a worker is using an async FS and uses atomics to synchronize calls.

I've removed some path normalization logic you added because I didn't fully understand it. I think that sqlite3 will call our path_normalize function before using the file system, so it shouldn't be necessary? Is this to make API misuse harder? Might be reasonable to have it, but then we should use it consistently across all file system implementations.

simolus3 · 2022-05-03T20:16:39Z

I'll merge this and get this released tomorrow. Thanks for your continued great work on this package 🚀

westito · 2022-05-04T06:48:00Z

Thank you for improve my code. Yes, it's unnecessary having blocks in-memory. It was easier for me to test if I modified the original code as little as possible. I see you wrote your own cursor handler. I did not use the original because it discarded the transaction. The original implementation is very buggy. I don't know if it's worth dealing with and contributing to DartSDK.
I had a more general VFS in mind, so this path normalization idea came up.

simolus3 · 2022-05-04T08:31:47Z

The original implementation is very buggy.

Yeah I feel like exposing IndexedDB cursors as streams was an unfortunate design choice as pauses or not immediately reacting to new rows will close the transaction. So it's very easy to cause deadlocks with that API.

I had a more general VFS in mind, so this path normalization idea came up.

I think I want the VFS API to mainly be used by sqlite3 internally at the moment, and for that it's preferable to avoid path normalization logic for every write. It's not exactly user friendly, I'll see if it makes sense to polish that API with path normalization and expose it more prominently.

westito · 2022-05-04T08:42:26Z

There is an error when I try to run in my app. I can't understand exactly the problem. The getKey returns with this exception. Not immediately, after some inserts. Maybe when transaction commits.

simolus3 · 2022-05-04T10:51:50Z

Yeah I just spent some time debugging this too :D I think it's because we don't reset cached file ids after a file is deleted, and the journal file gets deleted and re-created a lot. I've fixed this locally, but I need to write some tests to ensure this is working as intended.

westito · 2022-05-04T11:16:51Z

I can send you a sample project. It is working when Drift loaded as worker. But run in main application this exception comes up. It is working with the previous non-block VFS.

simolus3 · 2022-05-04T13:22:12Z

It is working when Drift loaded as worker.

I think you're just not seeing the error messages in the main log then, for me a similar error showed up when inspecting the worker. Also, workers are completely broken because instantiating a KeyRange will internally check the window to see which property to use. This doesn't work in a worker of course, so I had to use package:js to create KeyRanges...

I believe I have fixed this in 94d9652, can you try that state in your project and see if that fixes the errors for you as well?

westito · 2022-05-05T17:16:39Z

Yeah! It's working like a charm now :)

Add VFS 2.0

b422285

westito force-pushed the vfs2 branch from 2f0ea13 to b422285 Compare April 23, 2022 14:07

Add IndexedDB.databases() support check

145fb01

simolus3 reviewed Apr 23, 2022

View reviewed changes

westito added 3 commits April 23, 2022 16:44

Fix IndexedDB.databases() existing check

133c5f1

Replace legacy VFS with V2

d23ea1f

Fix IndexedDB.database() JS interop

a48b902

Fix variable names in IndexedDB.databases() interop

13a9b81

Add block based VFS

0501937

simolus3 added 3 commits May 2, 2022 16:47

Remove some unused code

470bfa1

Merge branch 'main' into vfs2

4bcbcdf

Avoid serialization since we only store JS blobs

df2b093

simolus3 mentioned this pull request May 3, 2022

Web support #80

Merged

Move block FS implementation to indexeddb

5f2f26b

simolus3 added 2 commits May 3, 2022 21:10

Add truncate call to test, clean up a bit

1aa7961

Add changelog entry, clean up some more

ca181fe

simolus3 merged commit f624b18 into simolus3:main May 3, 2022

westito deleted the vfs2 branch May 19, 2022 21:55

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add VFS 2.0 #90

Add VFS 2.0 #90

westito commented Apr 23, 2022

simolus3 left a comment

westito commented Apr 23, 2022

westito commented Apr 24, 2022

westito commented Apr 24, 2022

westito commented Apr 24, 2022

simolus3 commented Apr 24, 2022

westito commented Apr 24, 2022

westito commented Apr 25, 2022

simolus3 commented Apr 30, 2022

westito commented Apr 30, 2022

simolus3 commented Apr 30, 2022

westito commented May 2, 2022

simolus3 commented May 2, 2022

westito commented May 2, 2022 •

edited

Loading

westito commented May 2, 2022

simolus3 commented May 2, 2022

simolus3 commented May 3, 2022

simolus3 commented May 3, 2022

westito commented May 4, 2022

simolus3 commented May 4, 2022

westito commented May 4, 2022 •

edited

Loading

simolus3 commented May 4, 2022

westito commented May 4, 2022

simolus3 commented May 4, 2022

westito commented May 5, 2022

Add VFS 2.0 #90

Add VFS 2.0 #90

Conversation

westito commented Apr 23, 2022

simolus3 left a comment

Choose a reason for hiding this comment

westito commented Apr 23, 2022

westito commented Apr 24, 2022

westito commented Apr 24, 2022

westito commented Apr 24, 2022

simolus3 commented Apr 24, 2022

westito commented Apr 24, 2022

westito commented Apr 25, 2022

simolus3 commented Apr 30, 2022

westito commented Apr 30, 2022

simolus3 commented Apr 30, 2022

westito commented May 2, 2022

simolus3 commented May 2, 2022

westito commented May 2, 2022 • edited Loading

westito commented May 2, 2022

simolus3 commented May 2, 2022

simolus3 commented May 3, 2022

simolus3 commented May 3, 2022

westito commented May 4, 2022

simolus3 commented May 4, 2022

westito commented May 4, 2022 • edited Loading

simolus3 commented May 4, 2022

westito commented May 4, 2022

simolus3 commented May 4, 2022

westito commented May 5, 2022

westito commented May 2, 2022 •

edited

Loading

westito commented May 4, 2022 •

edited

Loading