feat(python): function to inspect a single-chunk Array #436

Merged: jorisvandenbossche merged 3 commits into apache:main on Apr 22, 2024

jorisvandenbossche

Addresses #435

Reusing the existing repr utilities, the current code gives this:

In [1]: import nanoarrow as na
   ...: import pyarrow as pa

In [2]: pa_arr = pa.array([[0, 1], [None, 3], None, [4]])

In [3]: arr = na.Array(pa_arr)

In [4]: arr.inspect()
<ArrowArray list<item: int64>>
- length: 4
- offset: 0
- null_count: 1
- buffers[2]:
  - validity <bool[1 b] 11010000>
  - data_offset <int32[20 b] 0 2 4 4 5>
- dictionary: NULL
- children[1]:
  'item': <ArrowArray int64>
    - length: 5
    - offset: 0
    - null_count: 1
    - buffers[2]:
      - validity <bool[1 b] 11011000>
      - data <int64[40 b] 0 1 0 3 4>
    - dictionary: NULL
    - children[0]:
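
The buffers in this output can be checked by hand against the Arrow list layout. The following is a minimal pure-Python sketch, not nanoarrow code; `list_layout` and `bitmap_string` are hypothetical helper names introduced only for illustration.

```python
def list_layout(values):
    """Compute Arrow-style list buffers for a Python list of lists.

    Hypothetical helper: returns (validity flags, offsets, flattened child).
    """
    offsets = [0]
    validity = []
    child = []
    for item in values:
        validity.append(item is not None)
        if item is not None:
            child.extend(item)
        # A null list adds no child values, so its offset repeats.
        offsets.append(len(child))
    return validity, offsets, child


def bitmap_string(validity):
    """Render validity flags one bit per element, padded to whole bytes."""
    n_bits = -(-len(validity) // 8) * 8  # round up to a multiple of 8
    bits = "".join("1" if v else "0" for v in validity)
    return bits.ljust(n_bits, "0")


values = [[0, 1], [None, 3], None, [4]]
validity, offsets, child = list_layout(values)
child_validity = [v is not None for v in child]

print(bitmap_string(validity))        # 11010000
print(offsets)                        # [0, 2, 4, 4, 5]
print(bitmap_string(child_validity))  # 11011000
```

The null slot in the child `data` buffer prints as 0 in the output above; the value stored under a null is not meaningful, so the sketch simply keeps it as `None`.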

paleolimbot left a comment

Thanks!

@@ -490,3 +490,7 @@ def to_string(self, width_hint=80, items_hint=10) -> str:

    def __repr__(self) -> str:
        return self.to_string()

    def inspect(self):  # or dump?

I quite like inspect()!

This could probably fairly easily be extended to multiple chunks (by printing out in a loop?)

> This could probably fairly easily be extended to multiple chunks (by printing out in a loop?)

That's indeed true, and it shouldn't be hard. We don't need it right now, so I would prefer to do it later.
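
For reference, the "printing out in a loop" idea could look roughly like the sketch below. It is deliberately independent of nanoarrow: `inspect_chunks` and its `inspect_one` callback are hypothetical names, and the chunks are plain Python lists standing in for single-chunk arrays.

```python
def inspect_chunks(chunks, inspect_one):
    """Return one inspection string covering every chunk.

    chunks: any iterable of single-chunk arrays (stand-in objects here).
    inspect_one: callable returning the inspection text for one chunk.
    Both parameters are assumptions made for this sketch.
    """
    parts = []
    for i, chunk in enumerate(chunks):
        parts.append(f"chunk[{i}]:")
        parts.append(inspect_one(chunk))
    return "\n".join(parts)


# Stand-in data so the sketch runs without nanoarrow installed.
fake_chunks = [[1, 2, 3], [4, 5]]
report = inspect_chunks(fake_chunks, lambda c: f"- length: {len(c)}")
print(report)
```

In a real follow-up, the callback would be whatever produces the single-chunk text, and the iterable would come from the chunked array itself.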

jorisvandenbossche marked this pull request as ready for review April 22, 2024 17:19
@@ -248,3 +248,46 @@ def device_repr(device):
    device_type = f"- device_type: {device.device_type.name} <{device.device_type_id}>"
    device_id = f"- device_id: {device.device_id}"
    return "\n".join([title_line, device_type, device_id])


def array_dump(array, indent=0, max_char_width=80):

Suggested change
- def array_dump(array, indent=0, max_char_width=80):
+ def array_inspect(array, indent=0, max_char_width=80):
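
To illustrate what the `indent` parameter in this signature is for, here is a rough sketch of a recursive formatter that yields the nested layout shown in the PR description. It is not the merged implementation: `array_inspect_sketch` is a hypothetical name, it takes plain dicts rather than nanoarrow arrays, and it omits buffers and dictionaries.

```python
def array_inspect_sketch(array, indent=0, max_char_width=80):
    """Format a nested array description.

    `array` is a plain dict stand-in (an assumption of this sketch) with
    keys "type", "length", "offset", "null_count" and optional "children".
    """
    pad = " " * indent
    lines = [f"<ArrowArray {array['type']}>"]
    for key in ("length", "offset", "null_count"):
        lines.append(f"{pad}- {key}: {array[key]}")

    children = array.get("children", {})
    lines.append(f"{pad}- children[{len(children)}]:")
    for name, child in children.items():
        # Recurse with a deeper indent so child fields nest under the parent.
        child_lines = array_inspect_sketch(
            child, indent + 4, max_char_width
        ).splitlines()
        lines.append(f"{pad}  {name!r}: {child_lines[0]}")
        lines.extend(child_lines[1:])

    # Clip each line so wide content cannot exceed the requested width.
    return "\n".join(line[:max_char_width] for line in lines)


child = {"type": "int64", "length": 5, "offset": 0, "null_count": 1}
parent = {
    "type": "list<item: int64>",
    "length": 4,
    "offset": 0,
    "null_count": 1,
    "children": {"item": child},
}
text = array_inspect_sketch(parent)
print(text)
```

Passing the indent down through the recursion keeps each level responsible only for its own prefix, which is one plausible reason for exposing it as a parameter.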

jorisvandenbossche merged commit 821b580 into apache:main Apr 22, 2024
6 checks passed
jorisvandenbossche deleted the inspect-dump branch April 22, 2024 18:02
paleolimbot added this to the nanoarrow 0.5.0 milestone May 22, 2024