Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix weird NRT bug #13353 #13369

Merged
merged 3 commits into from
May 15, 2024
Merged

Fix weird NRT bug #13353 #13369

merged 3 commits into from
May 15, 2024

Conversation

benwtrent
Copy link
Member

The issue outlines the problem. When we have point value dimensions, segment core readers assume that there will be point files.

However, when allowing soft deletes and a document fails indexing failed before a point field could be written, this assumption fails. Consequently, the NRT fails to open.

I tried many different ways of fixing this issue.

  • Delaying on setting the field info until document writes are successful. But this causes weirdness with global field numbers and other assumptions and checks later in the document life-cycle
  • Reverting the field info if a all the docs for a field fail to index, but this causes other weirdness and edge cases that seemed too sharp to mess with.

So, I settled on always flushing a point file if the field info says there are point fields, even if there aren't any docs in the buffer.

I am happy to consider other options.

closes #13353

@benwtrent
Copy link
Member Author

Pinging some folks on review. This is a weird place in the codebase I haven't messed with before. This deserves careful review. Checks are green, but I may have missed some corners.

Copy link
Contributor

@jpountz jpountz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your approach makes sense to me, we should use field infos to decide whether to flush, not whether we buffered docs. This is consistent with what happens when all values get merged away, and with what happens when we flush postings, norms or doc values.

@benwtrent benwtrent merged commit b1d3c08 into apache:main May 15, 2024
3 checks passed
@benwtrent benwtrent deleted the fix/gh-13353 branch May 15, 2024 13:37
benwtrent added a commit that referenced this pull request May 15, 2024
The issue outlines the problem. When we have point value dimensions, segment core readers assume that there will be point files.

However, when allowing soft deletes and a document fails indexing failed before a point field could be written, this assumption fails. Consequently, the NRT fails to open. I settled on always flushing a point file if the field info says there are point fields, even if there aren't any docs in the buffer.

closes #13353
ChrisHegarty added a commit that referenced this pull request May 16, 2024
This commit updates the writer to handle the case where there are no values.

Previously (before #13369), there was a check that there were some points values before trying to write, this is no longer the case. The code in writeFieldNDims has an assumption that the values is not empty - an empty values will result in calculating a negative number of splits, and a negate array size to hold the splits.

The fix is trivial, return null when values is empty - null is an allowable return value from this method. Note: writeField1Dim is able to handle an empty values.
ChrisHegarty added a commit to ChrisHegarty/lucene that referenced this pull request May 16, 2024
This commit updates the writer to handle the case where there are no values.

Previously (before apache#13369), there was a check that there were some points values before trying to write, this is no longer the case. The code in writeFieldNDims has an assumption that the values is not empty - an empty values will result in calculating a negative number of splits, and a negate array size to hold the splits.

The fix is trivial, return null when values is empty - null is an allowable return value from this method. Note: writeField1Dim is able to handle an empty values.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

NRT failure due to FieldInfo & File mismatch
2 participants