Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ToParentBlockJoinQuery Explain Support Score Mode #12245

Merged
merged 21 commits into from
May 10, 2023
Merged
Show file tree
Hide file tree
Changes from 15 commits
Commits
Show all changes
21 commits
Select commit Hold shift + click to select a range
b4d27a0
initial commit of score mode explain.
MarcusSorealheis Apr 27, 2023
6f797dc
tidy up.
MarcusSorealheis Apr 27, 2023
2b5e084
tidy a version file that was unmodified.
MarcusSorealheis Apr 27, 2023
dcedc7a
moving to testBlockJoin.
MarcusSorealheis Apr 27, 2023
415fd86
ensure score mode is included in match text.
MarcusSorealheis Apr 27, 2023
daa584c
I think this is closer to the right track. tests won't run on my rig.
MarcusSorealheis Apr 27, 2023
2875366
tidy.
MarcusSorealheis Apr 27, 2023
cb96335
introduce a variable and string template, simplify return.
MarcusSorealheis Apr 27, 2023
40ca854
oops fall through error.
MarcusSorealheis Apr 27, 2023
9e590a0
final failure is comparison between expected value and actual value I…
MarcusSorealheis Apr 27, 2023
41e2600
change data type to match tests, decouple score modes.
MarcusSorealheis Apr 28, 2023
23c8280
all issues resolved except one.
MarcusSorealheis May 1, 2023
26db728
fixing in-progress issues.
MarcusSorealheis May 2, 2023
b19c9f9
adding a total explanation.
MarcusSorealheis May 2, 2023
1fde482
fix the control flow and comment out helpful debug aid for others in …
MarcusSorealheis May 2, 2023
3d23e2e
add min and max.
MarcusSorealheis May 3, 2023
94cb3ac
refactor to use switch statement.
MarcusSorealheis May 4, 2023
824a86e
Adding more support for none and removing score math.
MarcusSorealheis May 4, 2023
d3f413e
Adding a bit if simplicity.
ebs-mkhludnev May 5, 2023
f52ebec
some tidysh
ebs-mkhludnev May 5, 2023
b98018d
restore assert
ebs-mkhludnev May 5, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 1 addition & 2 deletions lucene/core/src/java/org/apache/lucene/util/Version.java
Original file line number Diff line number Diff line change
Expand Up @@ -100,8 +100,7 @@ public final class Version {
*
* @deprecated Use latest
*/
@Deprecated
public static final Version LUCENE_9_7_0 = new Version(9, 7, 0);
@Deprecated public static final Version LUCENE_9_7_0 = new Version(9, 7, 0);

/**
* Match settings and bugs in Lucene's 10.0.0 release.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -172,7 +172,7 @@ public long cost() {
public Explanation explain(LeafReaderContext context, int doc) throws IOException {
BlockJoinScorer scorer = (BlockJoinScorer) scorer(context);
if (scorer != null && scorer.iterator().advance(doc) == doc) {
return scorer.explain(context, in);
return scorer.explain(context, in, scoreMode);
}
return Explanation.noMatch("Not a match");
}
Expand Down Expand Up @@ -391,35 +391,75 @@ private void setScoreAndFreq() throws IOException {
}
this.score = (float) score;
}

public Explanation explain(LeafReaderContext context, Weight childWeight) throws IOException {
/*
* This instance of Explanation requires three parameters, context, childWeight, and scoreMode.
* The scoreMode parameter considers Avg, Total, Min, Max, and None.
* */
public Explanation explain(LeafReaderContext context, Weight childWeight, ScoreMode scoreMode)
throws IOException {
int prevParentDoc = parentBits.prevSetBit(parentApproximation.docID() - 1);
int start =
context.docBase + prevParentDoc + 1; // +1 b/c prevParentDoc is previous parent doc
int end = context.docBase + parentApproximation.docID() - 1; // -1 b/c parentDoc is parent doc

Explanation bestChild = null;
double childScoreSum = 0;
int matches = 0;
for (int childDoc = start; childDoc <= end; childDoc++) {
Explanation child = childWeight.explain(context, childDoc - context.docBase);
if (child.isMatch()) {
matches++;
childScoreSum += child.getValue().doubleValue();

if (bestChild == null
|| child.getValue().floatValue() > bestChild.getValue().floatValue()) {
|| child.getValue().doubleValue() > bestChild.getValue().doubleValue()) {
mkhludnev marked this conversation as resolved.
Show resolved Hide resolved
bestChild = child;
}
}
}

return Explanation.match(
score(),
String.format(
Locale.ROOT,
"Score based on %d child docs in range from %d to %d, best match:",
matches,
start,
end),
bestChild);
if (bestChild == null) {
MarcusSorealheis marked this conversation as resolved.
Show resolved Hide resolved
if (scoreMode == ScoreMode.None) {
return Explanation.noMatch("No children matched");

} else {
return Explanation.match(
0.0f,
String.format(
Locale.ROOT,
"Score based on 0 child docs in range from %d to %d, using score mode %s",
start,
end,
scoreMode));
}
}
if (scoreMode == ScoreMode.Avg) {
double avgScore = matches > 0 ? childScoreSum / (double) matches : 0;
return Explanation.match(
avgScore,
String.format(
Locale.ROOT,
"Score based on %d child docs in range from %d to %d, using score mode %s",
matches,
start,
end,
scoreMode),
bestChild);
}
if (scoreMode == ScoreMode.Total && matches > 0) {
double totalScore = childScoreSum;
return Explanation.match(
totalScore,
String.format(
Locale.ROOT,
"Score based on %d child docs in range from %d to %d, using score mode %s",
matches,
start,
end,
scoreMode),
bestChild);
} else {
return Explanation.noMatch("Unexpected score mode: " + scoreMode);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read it like min/max modes aren't supported. Why?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to add these. I did a refactor halfway through. Let me add them back, one moment. It should be simple.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's also a test for them.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Here are a few questions:

  • Score.None is still not handled
  • why it is ifs with boring copies, but not one switch(mode)
  • I see that we can just assume child match since ToPBJQ.explain() ensures the match
  • I still don't understand why to calculate score math here instead of just calling this.score()?

}
}
}

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -277,7 +277,6 @@ public void testSimple() throws Exception {
CheckHits.checkHitCollector(random(), fullQuery.build(), "country", s, new int[] {2});

TopDocs topDocs = s.search(fullQuery.build(), 1);

// assertEquals(1, results.totalHitCount);
assertEquals(1, topDocs.totalHits.value);
Document parentDoc = s.storedFields().document(topDocs.scoreDocs[0].doc);
Expand Down Expand Up @@ -892,29 +891,34 @@ public void testRandom() throws Exception {
int childId = Integer.parseInt(document.get("childID"));
// System.out.println(" hit docID=" + hit.doc + " childId=" + childId + " parentId=" +
// document.get("parentID"));
assertTrue(explanation.isMatch());
assertEquals(hit.score, explanation.getValue().doubleValue(), 0.0f);
Matcher m =
Pattern.compile(
"Score based on ([0-9]+) child docs in range from ([0-9]+) to ([0-9]+), best match:")
.matcher(explanation.getDescription());
assertTrue("Block Join description not matches", m.matches());
assertTrue("Matched children not positive", Integer.parseInt(m.group(1)) > 0);
assertEquals(
"Wrong child range start", hit.doc - 1 - childId, Integer.parseInt(m.group(2)));
assertEquals("Wrong child range end", hit.doc - 1, Integer.parseInt(m.group(3)));
Explanation childWeightExplanation = explanation.getDetails()[0];
if ("sum of:".equals(childWeightExplanation.getDescription())) {
childWeightExplanation = childWeightExplanation.getDetails()[0];
}
if (agg == ScoreMode.None) {
assertTrue(
"Wrong child weight description",
childWeightExplanation.getDescription().startsWith("ConstantScore("));
} else {
assertTrue(
"Wrong child weight description",
childWeightExplanation.getDescription().startsWith("weight(child"));
if (explanation.isMatch()) {
assertTrue(explanation.isMatch());
// System.out.println("explanation.getDescription: " +
// explanation.getDescription());
// This test is failing in strange ways.
assertEquals(hit.score, explanation.getValue().doubleValue(), 0.005f);
Matcher m =
Pattern.compile(
"Score based on ([0-9]+) child docs in range from ([0-9]+) to ([0-9]+), using score mode (None|Avg|Min|Max|Total)")
.matcher(explanation.getDescription());
assertTrue("Block Join description not matches", m.matches());
assertTrue("Matched children not positive", Integer.parseInt(m.group(1)) > 0);
assertEquals(
"Wrong child range start", hit.doc - 1 - childId, Integer.parseInt(m.group(2)));
assertEquals("Wrong child range end", hit.doc - 1, Integer.parseInt(m.group(3)));
Explanation childWeightExplanation = explanation.getDetails()[0];
if ("sum of:".equals(childWeightExplanation.getDescription())) {
childWeightExplanation = childWeightExplanation.getDetails()[0];
}
if (agg == ScoreMode.None) {
assertTrue(
"Wrong child weight description",
childWeightExplanation.getDescription().startsWith("ConstantScore("));
} else {
assertTrue(
"Wrong child weight description",
childWeightExplanation.getDescription().startsWith("weight(child"));
}
}
}
}
Expand Down