Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: Use inttypes.h macros instead of casts to print fixed-width integers #520

Merged
merged 9 commits into from
Jun 12, 2024

Conversation

WillAyd
Copy link
Contributor

@WillAyd WillAyd commented Jun 11, 2024

Very pedantic change to the great work done in #507 just to prevent any issues on LLP64 systems

Copy link
Member

@paleolimbot paleolimbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great idea!

I think that if we do this, we should do it for all the places where we are trying to printf an int64_t, which is quite a lot of places.

I also recall that I ran into some problems on mingw when packaging ADBC for R, which is one of the reasons that the "blind cast to long" + %ld was used when I started out.

I wonder if a utility function or macro + NANOARROW_PRID64 define in nanoarrow_types.h could be used to isolate this change in case we (or somebody else) needs to work around this? (Dragging inttypes.h into nanoarrow.h is maybe not great, though, and there are definitely some places in buffer_inline/array_inline that need to print int64s).

@WillAyd
Copy link
Contributor Author

WillAyd commented Jun 11, 2024

Sure happy to take a larger look. Do you know what the R issues were with ADBC? Would have to double check but I am fairly certain this is a C99 macro, so surprised that something wouldn't support it at this point

@paleolimbot
Copy link
Member

Would have to double check but I am fairly certain this is a C99 macro, so surprised that something wouldn't support it at this point

I'll take a look tomorrow, but it was something about old versions of mingw having an inconsistency between the definition of PRId64 and the implementation of printf().

In any case, isolating this behaviour into utility functions is maybe still a good idea?

@WillAyd
Copy link
Contributor Author

WillAyd commented Jun 11, 2024

Are you thinking of just redefining PRId64 via macros or wrapping a new printf function? I'm a little wary of trying to make something not standards compliant become standards compliant. Doesn't feel like a great value add to nanoarrow, but of course open to whatever direction you want to take this in

@paleolimbot
Copy link
Member

I think that the issue was that the mingw inttypes.h header ( https://github.com/msys2-contrib/mingw-w64/blob/master/mingw-w64-headers/crt/inttypes.h ) had some interactions with the format check attribute and __USE_MINGW_ANSI_STDIO:

https://github.com/apache/arrow-adbc/blob/acef3c826599c5d4c18849bfe2164d8d0ce53fbc/c/driver/common/utils.h#L34-L41

I don't think that the format check attribute is particularly important (it is opt-in via the NANOARROW_DEBUG define), so I think PRId64 is safe 🙂 . The old version of mingw that was causing problems was the basis for the R package build toolchain for R <= 3.6, but that version just went out of the tidyverse support matrix.

I was had been thinking something like #define NANOARROW_PR64(EXPR) ((int64_t)value) + ArrowErrorSet("Some big value is %" NANOARROW_PRID64, NANOARROW_PR64(some_value)), but since the issue was actually just the format check attribute and not the format itself, I don't think it's needed 🙂 .

Copy link
Member

@paleolimbot paleolimbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Epic!

Just two typos that my local build complained about, possibly because it uses -DNANOARROW_DEBUG.

ArrowErrorSet(error, "Can't resolve device with type %d and identifier %ld",
(int)device_array->device_type, (long)device_array->device_id);
ArrowErrorSet(error,
"Can't resolve device with type %" PRId32 " and identifier %l" PRId64,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"Can't resolve device with type %" PRId32 " and identifier %l" PRId64,
"Can't resolve device with type %" PRId32 " and identifier %" PRId64,

@@ -1145,7 +1152,7 @@ static int ArrowAssertIncreasingInt64(struct ArrowBufferView view,

for (int64_t i = 1; i < view.size_bytes / (int64_t)sizeof(int64_t); i++) {
if (view.data.as_int64[i] < view.data.as_int64[i - 1]) {
ArrowErrorSet(error, "[%ld] Expected element size >= 0", (long)i);
ArrowErrorSet(error, "[%" PRId64 "] Expected element size >= 0", (long)i);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ArrowErrorSet(error, "[%" PRId64 "] Expected element size >= 0", (long)i);
ArrowErrorSet(error, "[%" PRId64 "] Expected element size >= 0", i);

@paleolimbot paleolimbot changed the title refactor(c): Use printf macros instead of casts refactor: Use inttypes.h macros instead of casts to print fixed-width integers Jun 12, 2024
@paleolimbot paleolimbot merged commit f6a6b40 into apache:main Jun 12, 2024
32 checks passed
@WillAyd WillAyd deleted the less-casting branch June 12, 2024 15:12
@paleolimbot paleolimbot added this to the nanoarrow 0.6.0 milestone Sep 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants