-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: Introducing a Standardized Interface for SQL Database Drivers in JavaScript #6
base: feat/sql
Are you sure you want to change the base?
Conversation
The Neon serverless driver has an interesting API and is built on top of the standard Fetch API. I don't think a generic API should do it the same way, but it would be good to be sure that that can be made compatible. |
The main thing I think is missing is (cell/value) streaming. With Would be great to see how For insert/update statements, parameter streaming would be needed as well. Maybe if the driver detects a stream parameter, it should automatically use it. |
@skybrian Neon could also use the interfaces if they deem it fit for their usecase, but primarily, these interfaces are targeted towards low level database drivers. Projects using these drivers would of course draw benefit from it as there would be a uniform function signature. @alumni You can take a look at the implementation branch to see how it is currently implemented. The interfaces are either way fully extendable, so you can add any properties to the options. This would either way most likely be for the implementers of the drivers to support, but if you have a specific example where the current specs would not suffice, could you share it? P.S. Thanks for the review 💪🏻 |
Could we have a separate high level and low level APIs? So that drivers can implement just the lower level most efficient interface, and all the syntactic sugar of queryMany, queryArray, and all can be implemented just once in terms of the low level api? |
@lovasoa I am not sure I understand what you mean exactly, but in terms of low-level API (If i understand you right) it would be the connection interface, while high-level would be all the other interfaces. Is this what you meant? |
The rich API that is proposed here can be implemented in terms of just a few primitives. It's a lot of duplicated efforts if we reimplement the rich api n times. If we have low level drivers that expose just something like prepare(sql: string, arguments: any[]): Statement;
next(s: Statement): {columns: string[], values: any[]} Then we can reimplement the larger, more user friendly API on top of it. |
Still not sure if i understand you correctly, so bear with me. So, the query methods you should only have to implement once for the query base class, and once for the prepared statement class. After implementing these two classes, all other classes will extend these classes and you will not need to repeat this process. At the lowest level, you have the connection interface that is used by the other classes to provide the base functionalities ( Do you still have concerns? |
Yes, for instance execute, queryMany & queryManyArray can all be implemented in terms of the low level interface I suggested above. Same for queryOne, the sql string decorator, and almost all other functions of your high level api. |
Another issue I see is this: export type Row<T = unknown> = Record<string, T>; It's nice to have a type like this because it's easy to work with, but it's not general enough to represent the results of any query. SQL query results can have duplicated column names. This Row type cannot store the results of select 1 as x, 2 as x; Which is supported in SQL and on which some applications depend. |
I see what you mean, and it was actually like that in an earlier stage of the proposal, however, I experienced that it actually caused more code to be repeated since the prepared statement class does not extend the query class, so the current interface actually solved that and reduced the lines of code in the implementations significantly. The
That is a good point, do you have a suggestion for how to improve this? Also, would you see an issue in leaving this up to the respective implementations? |
The problem is that the proposal asks for database drivers to implement all these nice-to-have functions themselves. I think they should implement a much much smaller API that looks like the one I propose above, and the implementation of the functions like The |
I have been working on basically the same problem as part of PowerSync, and just came across this. It's great to see the initiative, and I'd love to participate! I'll give more thorough thoughts as well as ideas I've been working on later, but overall I agree with @lovasoa on this. The APIs that individual drivers implement can be much more low-level and reduced scope by just focusing on prepared statements, while still having the same high-level apis provided in a wrapper library. What that allows:
To give a specific example: Take the "await using" / explicit resource management syntax used to for managing connections. This actually makes quite a nice syntax for transactions as well. However, if the drivers implement the high-level transaction, adding this to the spec requires every single driver to add that. But if it's in a higher-level library, it's a 10 lines of code change. Of course it could be added to the spec right now, but what about the next nice language feature or other new ideas that could improve the APIs? The main things that remain for drivers to implement are connection management, and prepared statements. Even if actual prepared statements are not used by the driver, it could still expose the same APIs with minimal overhead. Even transacting management can happen in the higher-level library - drivers only need to implement connection management (including when there is only a single underlying connection, but I'll get into more detail around my thoughts on that later). There may be some differences between dialects here - for example transactions may be implemented differently for SQLite vs Postgres, but the same for all SQLite drivers. I think that's okay, but haven't properly thought through all the tradeoffs around this yet. I'll see if I can get the code I've been working on public in the next day or two, that could better demonstrate the split between the driver API and the high-level library. |
Some alternate ways to make implementation easier without baking it into the standard would be to provide a skeleton implementation, a base library, or common utility functions. A common test suite might be quite useful, too. (Ideally these things would be implemented before the standard becomes final, so that it benefits from feedback.) |
If it's a base implementation instead of a separate library, then it can't be updated independently of the drivers... |
I've now published the APIs I've been working on here: https://github.com/powersync-ja/sqlite-js. Currently the linked project includes:
I think most important for this discussion is the notes on the driver API design decisions: https://github.com/powersync-ja/sqlite-js/blob/main/DRIVER-API.md The functionality in the high-level API is similar to what is proposed by the RFC here. There are many minor differences, but I believe if a driver API with small surface area is fixed, those differences are not important. It would for example be possible to implement the RFC APIs here on top of that driver API. In my case, the project was focused on only SQLite support. However, there is very little directly tied to SQLite. Some things that may be sqlite-specific include the specific semantics around connection pooling and transactions, the "run with results" API, and bigint handling, and perhaps some specific performance considerations. Some of the design decisions were also influenced by what is possible to build on top of existing SQLite bindings, rather than having to fork those bindings (returning object rows instead of array rows by default as one example). Most of the work in the linked project is from before I was aware of this RFC. I'm not attached to any of the specific work there, and think it would be great to collaborate on a single project instead. I did however learn a lot in the process of iterating through various approaches, and think the notes I linked to would be valuable for the discussions here. |
Re: upgrades, I'm wondering whether a driver needs to import anything from a common base library. If so, where do they import it from, and what happens if the base library gets upgraded, or there are drivers depending on different versions? Most of the specification seems to be defined using Typescript interfaces. Since TypeScript uses structural typing, you don't need to actually declare what a class implements, and it all works at runtime. There are a few classes though, like SqlError, and you'd need to throw errors created with the same class or an instanceof check in a handler would fail. Similarly for the event class hierarchy? |
In my opinion, there are two primary use cases for a common base library/package for drivers:
There can be a single package providing both the APIs and utility functions, or use separate packages - I don't think this makes a big difference. In my experience, using instanceof anywhere in consuming code relies on a single implementation of the class being used across all drivers, which will often not be the case in practice. There should rather be common properties on error and event interfaces that can be checked. For utility functions, you can have multiple incompatible versions (used by different drivers) in the same application bundle without issues. For the type definitions - this should rarely change if the spec is stable. If it does change, TypeScript will typically error if the version implemented by the driver is incompatible with the one imported by the app. Changes here have to be managed carefully though - often even minor changes/fixes in the type definitions can be a breaking change for all existing drivers. |
I created a proof-of-concept implementation at powersync-ja/sqlite-js#4, that implements the Some initial comments on the APIs so far: Connection pool vs single connectionI touched on this in earlier comments. Users currently have to choose between a The It is possible to implement the SqlPreparedStatement
Inheritance & genericsThere are a lot of inheritance and generics used in the APIs. I understand why - the inheritance gives end-users nice APIs to work with (call any of the query methods in any of the contexts), and the generics makes everything extensible with options for the specific drivers. I don't have a strong argument on this, just a feeling that it may be better to focus more on simplicity here. Individual implementations may add options by overriding the implemented method definitions, without needing to use generics for the most part. You can see in this commit where I removed a whole bunch of generic arguments, with practically no effect on the resulting types - anything using |
@rkistner: Your low-level driver API seems nice and simple ! But maybe we should avoid requiring Symbol.dispose in it ? The nice user-facing API can still use it, but it would be nice if drivers could be defined and used in today's javascript. Also, I'm not a big fan of the union type here: |
I haven't tested without it, but at least in principle Symbol.dispose should be optional everywhere - there is always another method doing the same thing. If there is interest in taking that approach further, I can do another iteration addressing points like that union type. I'll also have to check to what extent things may need to change to support databases other than SQLite. |
Hey @halvardssm I'm one of the team behind PGlite, thanks for inviting us to take part in this. I'm excited by the ambition of this project, you have put a lot of thought into it and it solves many problems already. However, there are a few things that would block us from being able to adopt it at the moment. I think it may be a mistake for the specification to prescribe the constructor. For PGlite, and any WASM SQLite implimentation, there is many more considerations then just a connection URL. These could be covered by your My suggestion would be to have a recommendation for a standard constructor, and The other concern I have is the lack of a wrapper object around the result rows. With PGlite we have a export type Row<T = { [key: string]: any }> = T
export type Results<T = { [key: string]: any }> = {
rows: Row<T>[]
affectedRows?: number // Count of the rows in the database affected by the query
fields: { name: string; dataTypeID: number }[]
blob?: Blob // Only set when a file is returned, such as from a COPY command
} Without a wrapper object it's not possible to return this data that we receive along with a query result rows to the user. On the tagged template literal interface, I think there is a lot of nuance to it that should be carefully considered. If I understand correctly, at the moment it only provides the ability to pass in parameters, and not construct other parts of the query, such as including raw SQL, or quoted identifiers. It's worth looking at the postgres.js implementation of this feature (which is very advanced), we have a much more basic version in PGlite. I would carefully consider if this api should be part of this spec, it's a very new feature of these database drivers and I think there is a lot more exploration and experimentation thats needed first before tightly specifying what it should look like. It's essentially a query builder, and I don't think this spec wants to cover that. As I said, I'm excited about this project. The thing I particularly like is the idea of a standard set of On a final note, at ElectricSQL we wrapped a large number of SQL drivers (both SQLite and Postgres) with a standard interface for the last version of our sync engine. There is potentially valuable work there that could be used to build a set of wrappers that conform to this standard. You can see them here: https://github.com/electric-sql/electric-old/tree/main/clients/typescript/src/drivers |
Sorry everyone for the delayed answer, I've been sick for the past weeks. There are some great thoughts and insight shared from everyone, so let's see how we can improve the interfaces. Let me answer everything in one big comment, and structure it a bit (and sorry if I missed anything, just remind me if I did). Thanks @lovasoa @rkistner @skybrian @samwillis for joining the conversation and for bringing all these great points! Separation of Low and High level interfaces
@lovasoa @samwillis After thinking about it, I see what you mean, the connection class could and maybe should be a lot smaller, and only expose one iterator based query method returning a results object. As long as the result and the input are standardized, using utility functions to get the right output for the high level interfaces should not be an issue. @rkistner @lovasoa As you mentioned, a separation between a low and high level interface allows for some interesting usecases and implementations where there can exist multiple low level drives that can be interchanged for the high level wrappers implementing the high level interfaces. I don't believe there can be one high level wrapper to rule them all since each database does things slightly different, but at least there can be a more shared structure and some interoperability where possible. Explicit resource management
@lovasoa I am a bit conflicted regarding relying. On one hand, this is something that database drivers could really use, but I also agree that on the other hand it would only work for newer runtime versions supporting this. We would need to get more feedback from the community regarding this. Versioning and changes to specs
@rkistner @skybrian This is one of my concerns, how can we make sure that additions (non breaking changes) to the specs it will allow backwards compatibility. I am considering to write in the specs that new additions but be added as optional properties and will only be added on major changes. What are thoughts about this? Generally, my idea was to have the specs will follow semantic versioning, meaning:
High level interface
@rkistner Although I am not aware of any SQL based database which does not support transactions or prepared statements, I am not willing to rule out the possibility (take this as a challenge to convince me otherwise as I love to be wrong when it can simplify things). In the case that prepared statements or transactions are not supported, it can be documented in the respective drivers and a simple error can be thrown at runtime. The challenge when creating a standardisation is to account for flexibility while keep things simple, so I am happy to discuss this and maybe compare deeper with Go's std/sql. More input here would be good. Provide non-type code
@skybrian It was actually discussed internally before to add a test suite, and the feeling was that bringing too much to the first version would be too much, but if there is a consensus from the community (mainly the library maintainers) that this is wanted during the first itteration, we should add it. In regards to the skeleton implementation, I am not sure exactly what you mean. Do you mean to have some kind of wrapper class that can take the low level connection class? If so, I would be very hesitant in introducing this as it would make everything heavily dependent on this library. I would rather want to make it possible for these interfaces to be implemented even without importing the types and just by following the specs. Again, Some more nuances and middleground can be found here, but I would be very hesitant to introduce something that would make it required to import code.
@rkistner This is really great! Probably a lot of your learnings we can use. The main thing that I saw while looking through the code and documentation is that it is very specific to SQLite. It would unfortunately not translate in all cases to MySQL, postgres and potentially other databases. Would you see any changes that would be needed in the current specs from your learnings (besides simplifying as mentioned above)?
I agree, I am also therefore a bit concerned about the base Error class that is provided in this proposal. Some more insight here would be great from the community. Utility functions should be provided IMO, if we simplify the connection class, then having utility functions for getting all and one would be very helpful. Connection poolsThe question is really wether we want pools to be explicit or implicit. After reading through and looking at the existing drivers again, I have to say I'm conflicted and even leaning a bit towards having this implicit and handled by a config property and the respective drivers. It would simplify general usage, but when a connection is required for more than a single call (as you mentioned with prepared statements), I can see this being non-trivial to account for (but hey, I could be wrong). Maybe my worries are simply solved by some kind of "acquire mutex"? It might also be the case that drivers could keep track of prepared statements per connection in a hashmap or similar. Thoughts? Generics
I have indeed been worried that generics introduce too much complexity. I will look more at your code and see if simplifying it can be done. Constructor type
@samwillis I think one of the main benefits in standardising the constructor is that developers can bring the knowledge they have from one driver to another. For the cases where a connection URL is not needed, it can simply be advised to give an empty string, and the driver can ignore the field value entirely. I do however think that a connection URL is either way needed in most cases (either for a local sqlite file path, or a pg URL), and it would be up to the drivers to decide if options should only be allowed in the connectionOptions or also as url parameters. When considering simplicity first, we see that providing a single url string for connection to a database as the first argument versus having to dig through documentation to see which property is called what, makes the development experience a lot nicer.
Can you expand on this? Maybe with examples? Explicit connect
Most databases requires us to connect async, so we wont get away with doing this in the constructor as its not async. I am afraid we are kindof locked to this if we want to build this interface for most databases. This is also why all the interfaces are async even though sqlite can be sync in most cases (and performs better that way too). Template literals for querying
I think this would be up to the drivers themselves to implement, but from a typing perspective, it is not that advanced. I see your concern, but ultimately the signature of the function can't really be different in any aspect as far as I know, just the implementation (again, let me know if I'm wrong). |
Based on the discussions, the specs have been updated. The main change is that a low-level (
There are two things that we still need to decide on.
Use the following reactions to "vote", or comment if you have additional points.
|
I think the main role of the client is not to have to reimplement the same functions in all drivers. Its API should be rich, provide all the quality of life features developers expect (including pooling, array functions, iterators,...), and could be iterated upon often, since there will be one main "official" client. On the other hand the proposed limited driver API should be sufficient, and should be updated only rarely, since there will be a large number of drivers. |
Intro
Hey there, JavaScript enthusiasts and database users!
Over the past months, there has been work towards a standardized interface for SQL database drivers in JavaScript, and now we need your input! 🎉
If you’ve spent any time working with SQL databases in JavaScript, you’ve probably noticed how each database driver has its own way of doing things. Whether you’re connecting to MySQL, PostgreSQL, SQLite, or SQL Server, you’ve had to adapt to different interfaces and quirks. While this diversity showcases the richness of our ecosystem, it can also lead to a lot of headaches. Learning new APIs, dealing with inconsistencies, and maintaining codebases that juggle multiple database drivers can be a real challenge.
That’s where this RFC comes in. We’re proposing a standardized interface that will bring consistency to the way we interact with SQL databases in JavaScript. Imagine being able to switch databases without rewriting all your database code, or having a unified way to build tools and libraries that work across different SQL databases. Sounds great, right?
Whether you’re a seasoned database pro or just starting out, your feedback is invaluable. Let’s work together to build a more cohesive and developer-friendly SQL database driver ecosystem.
How to contribute
If you want to contribute with a comment, please add it directly to this PR, either commenting on specific lines or wording in the RFC document, or as a comment overall. Feedback provided in this PR will be transferred over to the implementation branch.