Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-34610: [Java] Fix valueCount when loading NullVector #34611

Closed
wants to merge 2 commits into from

Conversation

sunchao
Copy link
Member

@sunchao sunchao commented Mar 17, 2023

We should set valueCount and nullCount when loading NullVector.

Rationale for this change

When importing a NullVector, for example, via Arrow C interface, the valueCount and nullCount is currently not set properly, which will result both values to be 0.

What changes are included in this PR?

Set valueCount when loading the vector.

Are these changes tested?

Added a test. Also had to add extra constructor to facilitate the test.

Are there any user-facing changes?

No.

@sunchao sunchao requested a review from lidavidm as a code owner March 17, 2023 17:02
@github-actions
Copy link

@github-actions
Copy link

⚠️ GitHub issue #34610 has been automatically assigned in GitHub to PR creator.

We should set `valueCount` when loading `NullVector`. Unfortunately there's no `nullCount` field.
@github-actions github-actions bot added awaiting merge Awaiting merge and removed awaiting committer review Awaiting committer review labels Mar 17, 2023
@viirya
Copy link
Member

viirya commented Mar 17, 2023

Error:  Errors: 
Error:    TestBasicOperation.getStreamLargeBatch:273->test:315->test:333 � IllegalState Memory was leaked by query. Memory leaked: (134217728)
Allocator(ROOT) 0/134217728/270532608/9223372036854775807 (res/actual/peak/limit)

Hmm, is this related?

@lidavidm
Copy link
Member

No, there is a long standing issue for it that nobody has had time to look at.

@lidavidm
Copy link
Member

The JNI pipeline failure does seem related, though

@viirya
Copy link
Member

viirya commented Mar 17, 2023

yea, that's the added test testNullVector.

Error:  org.apache.arrow.c.RoundtripTest.testNullVector  Time elapsed: 0.008 s  <<< FAILURE!
org.opentest4j.AssertionFailedError: expected: <true> but was: <false>

@sunchao
Copy link
Member Author

sunchao commented Mar 17, 2023

Hmm thanks, let me take a look.

@amol-
Copy link
Member

amol- commented Mar 30, 2023

Closing because it has been untouched for a while, in case it's still relevant feel free to reopen and move it forward 👍

@amol-
Copy link
Member

amol- commented Apr 13, 2023

@sunchao are you still planning to take a look at the failing test? The PR has been inactive for the past month, it would be nice to lead it to merge if we have the chance to.

@Test
public void testNullVector() {
try (final NullVector vector = new NullVector("v", 1024)) {
assertTrue(roundtrip(vector, NullVector.class));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, its not obvious to me why this test is failing. I would've expected your change to loadFieldBuffers() to work. Maybe the field comparison is failing? I think you could try passing a custom TypeEqualsVisitor with checkName or checkMetadata disabled to narrow things down on line 155.

return VectorEqualsVisitor.vectorEquals(vector, imported, <custom_visitor_here>);

@sunchao
Copy link
Member Author

sunchao commented May 12, 2023

Sorry @amol- and @danepitkin , we found a workaround to this issue internally so I haven't got a chance to re-visit the test failure. It's indeed weird and I plan to reproduce it locally and see what happened.

@danepitkin
Copy link
Member

Sorry @amol- and @danepitkin , we found a workaround to this issue internally so I haven't got a chance to re-visit the test failure. It's indeed weird and I plan to reproduce it locally and see what happened.

No worries! I noticed the Java implementation has been a bit quiet in general, so its great to see new fixes coming in.

@danepitkin
Copy link
Member

I've followed this PR up here: #38973

The issue was in NullVector.getTransferPair(). It was not passing the name when transferring, which resulted in this check failing when comparing field names:

return (!checkName || Objects.equals(leftField.getName(), rightField.getName())) &&

I created a new PR since I didn't have permissions to push to your branch. I hope you don't mind!

@danepitkin danepitkin closed this Nov 29, 2023
pitrou pushed a commit that referenced this pull request Nov 29, 2023
…ing NullVector (#38973)

### Rationale for this change

This is a follow up of #34611 (thanks @ sunchao)
We should set `valueCount` when loading `NullVector`. Unfortunately there's no `nullCount` field.

### What changes are included in this PR?

* `valueCount` added to NullVector
* `name` added to NullVector's TransferPair

### Are these changes tested?

Yes. unit tested.

### Are there any user-facing changes?

Yes, a NullVector with a non-null name will maintain the name when transferred via `TransferPair` instead of defaulting to the default name `$data$`.
* Closes: #34610

Authored-by: Chao Sun <sunchao@apple.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
dgreiss pushed a commit to dgreiss/arrow that referenced this pull request Feb 19, 2024
…nsferring NullVector (apache#38973)

### Rationale for this change

This is a follow up of apache#34611 (thanks @ sunchao)
We should set `valueCount` when loading `NullVector`. Unfortunately there's no `nullCount` field.

### What changes are included in this PR?

* `valueCount` added to NullVector
* `name` added to NullVector's TransferPair

### Are these changes tested?

Yes. unit tested.

### Are there any user-facing changes?

Yes, a NullVector with a non-null name will maintain the name when transferred via `TransferPair` instead of defaulting to the default name `$data$`.
* Closes: apache#34610

Authored-by: Chao Sun <sunchao@apple.com>
Signed-off-by: Antoine Pitrou <antoine@python.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Java] When loading NullVector, valueCount should be properly set
6 participants