-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Span-based way to read binary data in ADO.NET #24899
Comments
It probably wouldn't return a Span but instead would copy into one. |
@davidfowl part of the idea here is to allow getting the data without any copying. Unless To argue against myself, it could be claimed that binary data is usually large and would there be used frequently with Don't get me wrong, the version you propose, which copies data into a Span is also needed - it's the same thing as the |
System.Data Triage: This looks like a good idea, although we are not going to implement it right away. Moving to Future. |
I bet you could enable this scenario using the existing var buffer = new Span<byte>();
reader.GetStream(ordinal).Read(span);
var bytes = buffer.Slice(dataOffset, length); |
Unless I'm confused, this will still copy your data out from the driver's buffer and into the user-provided Span, in that sense it's very similar to @davidfowl's suggestion above. I was interested in thinking about getting a If we're looking for a copying method, than I agree that doing it via the stream is probably best. On a somewhat unrelated note, this seems like a good example of why some sort of compliance testing suite would be helpful (#7810), so that missing support for this in a given provider could be discovered/flagged right away. |
For example, |
I found myself wishing for a high-performance way to retrieve binary data from SQLite; a method like the one proposed that could return the results of Note that SQLite introduces some complex lifetime issues, though:
That means that the data returned by "GetReadOnlySpan(ordinal)" might only be valid until the very next column is read (if the method called to read that column requires a type conversion). (And of course it's invalidated if |
The lifetime issues are indeed the tricky problem here... |
The span needs to be passed in. Take a look at all of the exiting APIs that deal with spans, they all have this pattern. |
@davidfowl we had this conversation above :) #24899 (comment) But yeah, to develop this further... It's true that if the row is being streamed rather than buffered (CommandBehavior.SequentialAccess), it may make sense to allow users to pass in a (writable) Span, and to have the driver read into that directly. API-wise this is actually already possible via GetBytes), and Npgsql actually implements it in a zero-copy way (when SequentialAccess is specified). What we could do is add an overload accepting a Span rather than byte[], for other memory types. |
Right, that would allow you to pass a stackalloced span. What happens if the buffer is too small today? |
I may be misunderstanding what you're saying, but how does |
|
Yep.
GetBytes accepts parameters saying how many bytes to read, from which position in the column and where to put them in the buffer, so I don't think there's a problem (unless the user passes a length that's bigger than the buffer they themselves provided, in which case we throw). If we added a Span overload, it would proabbly look something like: public virtual long GetBytes(int ordinal, long dataOffset, Span<bytes> span); This would allow reading different segments from the same binary column in different calls (the offset would need to always increment if streaming with SequentialAccess). Does this make sense and align with other APIs @davidfowl? |
Just wanted to add to this by highlighting the fact that the GetStream() call won't work for the RowVersion/Timestamp data type in MSSQL. So having a Span based read operation on DbDataReader would reduce the need for a byte buffer when reading the RowVersion into something such as a single long value. |
DbDataReader includes a method which allow reading sliced binary data from the database: GetBytes().
With Span coming, it makes sense to add a similar method which simply return a ReadOnlySpan, that can slice the provider's internal buffer - this would provide a zero-copy way to get binary data out. A default implementation could be provided in DbDataReader which internally allocates the byte array and returns a ReadOnlySpan to it, but providers could override it to provide an efficient implementation.
Note that the same isn't possible for char-based data (GetChars()) because text decoding needs to be done by someone.
The text was updated successfully, but these errors were encountered: