-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(extensions/nanoarrow_testing): Add nanoarrow_testing extension with testing JSON writer #317
Conversation
out << R"(, "VALIDITY": )"; | ||
WriteBitmap(out, value->buffer_views[0].data.as_uint8, value->length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the spec requires that a [1, 1, 1...]
VALIDITY
be provided when the buffer[0] is null? Isn't it OK with omitting the entry altogether? I imagine that is required if we are to have a 1-to-1 mapping between values and JSON representation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a great point...the example files in apache/arrow-testing seem to all have [1, 1, 1, 1]
although I agree that it might be nice to know if the bitmap was allocated or not. When nanoarrow gets plugged into the integration tests we'll find out in a hurry if that assumption is correct!
out << values[0]; | ||
for (int64_t i = 1; i < n_values; i++) { | ||
out << ", " << static_cast<int64_t>(values[i]); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would work without for both branches of the if
, right? static_cast<int64_t>
of an int64_t
becomes a no-op.
src/nanoarrow/nanoarrow_testing.hpp
Outdated
// Strings | ||
out << R"(")" << ArrowArrayViewGetIntUnsafe(value, 0) << R"(")"; | ||
for (int64_t i = 1; i < value->length; i++) { | ||
out << R"(, ")" << ArrowArrayViewGetIntUnsafe(value, i) << R"(")"; | ||
} | ||
break; | ||
case NANOARROW_TYPE_UINT64: | ||
// Strings |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Misplaced // Strings
comments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll clarify this...I was trying to highlight the "quotedness"
of the values here.
out << ", " << ArrowArrayViewGetIntUnsafe(value, i); | ||
} | ||
break; | ||
case NANOARROW_TYPE_INT64: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This case is the same as above, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's subtle, but it generates "1234567"
rather than 1234567
. I'll make sure the comments make that more clear!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, right! Because of JavaScript's 53-bit limitation on integers.
return ArrowSchemaInitFromType(schema, NANOARROW_TYPE_NA); | ||
}, | ||
[](ArrowArray* array) { return NANOARROW_OK; }, &TestingJSON::WriteColumn, | ||
R"({"name": null, "count": 0})"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You probably want to take a look at the Go implementation examples and paste them here as test data in separate files. This way you don't have to think about interesting examples yourself and makes the test more data-driven.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a great idea! I'll probably keep these tests (bare minimum to get 100% coverage over the code) and add in those tests as well (for full type coverage!).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the review!
out << R"(, "VALIDITY": )"; | ||
WriteBitmap(out, value->buffer_views[0].data.as_uint8, value->length); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a great point...the example files in apache/arrow-testing seem to all have [1, 1, 1, 1]
although I agree that it might be nice to know if the bitmap was allocated or not. When nanoarrow gets plugged into the integration tests we'll find out in a hurry if that assumption is correct!
out << ", " << ArrowArrayViewGetIntUnsafe(value, i); | ||
} | ||
break; | ||
case NANOARROW_TYPE_INT64: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's subtle, but it generates "1234567"
rather than 1234567
. I'll make sure the comments make that more clear!
src/nanoarrow/nanoarrow_testing.hpp
Outdated
// Strings | ||
out << R"(")" << ArrowArrayViewGetIntUnsafe(value, 0) << R"(")"; | ||
for (int64_t i = 1; i < value->length; i++) { | ||
out << R"(, ")" << ArrowArrayViewGetIntUnsafe(value, i) << R"(")"; | ||
} | ||
break; | ||
case NANOARROW_TYPE_UINT64: | ||
// Strings |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll clarify this...I was trying to highlight the "quotedness"
of the values here.
return ArrowSchemaInitFromType(schema, NANOARROW_TYPE_NA); | ||
}, | ||
[](ArrowArray* array) { return NANOARROW_OK; }, &TestingJSON::WriteColumn, | ||
R"({"name": null, "count": 0})"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a great idea! I'll probably keep these tests (bare minimum to get 100% coverage over the code) and add in those tests as well (for full type coverage!).
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #317 +/- ##
==========================================
+ Coverage 87.01% 87.17% +0.16%
==========================================
Files 70 72 +2
Lines 10574 10496 -78
==========================================
- Hits 9201 9150 -51
+ Misses 1373 1346 -27 ☔ View full report in Codecov by Sentry. |
This PR adds the first few bits of infrastructure needed to implement integration testing:
The design of the testing helper library is intentionally header-only to facilitate dropping in to projects where needed (although I'm happy to change that if there are opinions otherwise).
Some obvious follow-ups not included yet: