Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: Move bundling from CMake to Python #508

Merged
merged 33 commits into from
Jun 18, 2024

Conversation

paleolimbot
Copy link
Member

@paleolimbot paleolimbot commented Jun 6, 2024

Some of the work with the Meson build system has highlighted that the bundling is currently very inflexible because it's implemented in CMake. This was initially done because it made it easy to use configure_file() in CMake to keep the version information consistent, and because it allowed a straightforward path to testing the bundled files in the same way as the normal build.

This had the unfortunate side effect of requiring CMake to "build" the R or Python packages as is generally confusing, and used the CMake "install" in a confusing way (it installed sources). Instead, we can do this with a Python script (and call that from CMake, R, and Python). The testing of the namespaced builds is still done in CMake, which is essential to ensure that both build types are tested in the same way.

This also closes #485 by prefixing all nanoarrow includes with "nanoarrow/" in the source files. Many of them had to not use this prefix because when the files were bundled they weren't always bundled into a "nanoarrow/" directory. This feature has been moved to the bundling script which can now namespace or un-namespace the nanoarrow includes.

import re


def read_content(path_or_content):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course up to you but I find the behavior of this function surprising. More often than not, when you accept a path argument then accepting a string usually still represents the filepath/name, not the contents itself.

If you wanted you could refactor this to just be:

def read_content(path_or_content: io.StringIO | pathlib.Path | str):
    with open(path_or_content) as f:
        return f.read()

For the caller this would require them to pass io.StringIO("some_contents") instead of just "some_contents", but I think makes it less ambiguous in case someone expects a string argument here to still represent a path

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree it's a little confusing...if this utility gets more widespread use than just this bundler it should definitely get revisited!

Copy link
Member

@assignUser assignUser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM +1 on python > bash for the bundling ^^

@paleolimbot paleolimbot merged commit 2b935cf into apache:main Jun 18, 2024
39 checks passed
@paleolimbot paleolimbot deleted the bundle-using-python branch June 18, 2024 20:07
paleolimbot pushed a commit that referenced this pull request Jun 21, 2024
Quick follow up to #508 -
the array.c includes stopped working without this

```
FAILED: libnanoarrow.so.p/src_nanoarrow_array.c.o 
ccache cc -Ilibnanoarrow.so.p -I. -I.. -fdiagnostics-color=always -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -std=c99 -O0 -g -fPIC -MD -MQ libnanoarrow.so.p/src_nanoarrow_array.c.o -MF libnanoarrow.so.p/src_nanoarrow_array.c.o.d -o libnanoarrow.so.p/src_nanoarrow_array.c.o -c ../src/nanoarrow/array.c
../src/nanoarrow/array.c:23:10: fatal error: nanoarrow/nanoarrow.h: No such file or directory
   23 | #include "nanoarrow/nanoarrow.h"
      |          ^~~~~~~~~~~~~~~~~~~~~~~
```
@paleolimbot paleolimbot added this to the nanoarrow 0.6.0 milestone Sep 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

#include "nanoarrow/nanoarrow.h" or "nanoarrow.h"?
3 participants