Skip to content

Commit

Permalink
ESQL: Explain IN
Browse files Browse the repository at this point in the history
Adds javadoc to IN to explain it's three-valued null logic and why it
isn't using the standard code generators.
  • Loading branch information
nik9000 committed Jan 3, 2025
1 parent 0ab59bb commit b4be583
Showing 1 changed file with 32 additions and 0 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,8 @@
import org.elasticsearch.common.io.stream.NamedWriteableRegistry;
import org.elasticsearch.common.io.stream.StreamInput;
import org.elasticsearch.common.io.stream.StreamOutput;
import org.elasticsearch.compute.data.Block;
import org.elasticsearch.compute.data.Vector;
import org.elasticsearch.compute.operator.EvalOperator;
import org.elasticsearch.xpack.esql.EsqlIllegalArgumentException;
import org.elasticsearch.xpack.esql.core.expression.Expression;
Expand Down Expand Up @@ -50,6 +52,36 @@
import static org.elasticsearch.xpack.esql.core.type.DataType.VERSION;
import static org.elasticsearch.xpack.esql.core.util.StringUtils.ordinal;

/**
* The {@code IN} operator.
* <p>
* This function has quite "unique" null handling rules around {@code null} and multivalued
* fields. Let's use an example: {@code WHERE x IN (a, b, c)}. Here's the per-row
* decision sequence:
* </p>
* <ol>
* <li>{@code x IS NULL} => {@code null}</li>
* <li>{@code a IS NULL AND b IS NULL AND c IS NULL} => {@code null}</li>
* <li>{@code MV_COUNT(a) > 1 AND MV_COUNT(a) > 1 AND MV_COUNT(a) > 1} => {@code null}</li>
* <li>{@code x == a OR x == b OR x == c} => {@code true}</li>
* <li>{@code a IS NULL OR b IS NULL OR c IS NULL} => {@code null}</li>
* <li>{@code else} => {@code false}</li>
* </ol>
* <p>
* I believe the first, second, and third entries are *mostly* optimizations and making the
* <a href="https://en.wikipedia.org/wiki/Three-valued_logic">Three-valued logic</a> of SQL
* explicit and/or work with the evaluators. You could probably shorten this to the last
* three points, but lots of folks aren't familiar with SQL's three-valued logic anyway, so
* let's be explicit.
* </p>
* <p>
* Because of this chain of logic we don't use the standard evaluator generators. They'd just
* require too many special cases and nothing else quite works like this. I mean, everything
* works just like this in that "three-valued logic" sort of way, but not in the "java code"
* sort of way. So! Instead of using the standard evaluator generators we use the
* String Template generators that we use for things like {@link Block} and {@link Vector}.
* </p>
*/
public class In extends EsqlScalarFunction {
public static final NamedWriteableRegistry.Entry ENTRY = new NamedWriteableRegistry.Entry(Expression.class, "In", In::new);

Expand Down

0 comments on commit b4be583

Please sign in to comment.