-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
opt: add normalization rule to convert ST_Distance(...) < x to ST_DWithin(...) #52028
Comments
heads up that |
Would we want to perform this transformation in all cases, or only in cases where an inverted index scan might be possible? I'm asking because it looks to me like in order for an inverted index scan to be generated, one of the st_distance arguments must be a constant, and the value on the right side of the <= comparison must also be constant. |
i think this is applicable in all cases |
Doing something like |
We want to avoid calculating ST_Distance as it is more expensive than it's counterpart ST_DWithin which has early exit functionality. Why does |
It wouldn't happen if |
Easiest thing for that case might be to add another function, |
it wouldn't fold, say, 100 to 99.999999?
works for me if you're willing |
Unfortunately not. There's also the problem of what to do if |
`ST_DWithin` is equivalent to the expression `ST_Distance <= x`. Similar reformulations can be made for different comparison operators (<, >, >=). This commit adds two rules that replace expressions of the latter form with either `ST_DWithin` or `ST_DWithinExclusive`. This replacement is desirable because the `ST_DWithin` function can exit early. `ST_DWithin` can also be used to generate an inverted index scan. Fixes cockroachdb#52028 Release note: None
`ST_DWithin` is equivalent to the expression `ST_Distance <= x`. Similar reformulations can be made for different comparison operators (<, >, >=). This commit adds two rules that replace expressions of the latter form with either `ST_DWithin` or `ST_DWithinExclusive`. This replacement is desirable because the `ST_DWithin` function can exit early. `ST_DWithin` can also be used to generate an inverted index scan. Fixes cockroachdb#52028 Release note: None
`ST_DWithin` is equivalent to the expression `ST_Distance <= x`. Similar reformulations can be made for different comparison operators (<, >, >=). This commit adds two rules that replace expressions of the latter form with either `ST_DWithin` or `ST_DWithinExclusive`. This replacement is desirable because the `ST_DWithin` function can exit early. `ST_DWithin` can also be used to generate an inverted index scan. Fixes cockroachdb#52028 Release note: None
52938: sql: implement idle_in_transaction_session_timeout r=RichardJCai a=RichardJCai Release note (sql change): Implement IdleInTransactionSessionTimeout variable to allow terminating sessions that are idle in a transaction past the provides threshold. Set the variable by using SET idle_in_transaction_session_timeout = 'time'. Sessions that are idle in OPEN, ABORTED and DONE(COMMITWAIT) transaction states will be terminated if the user idles longer than the threshold time. Fixes #5924 53041: cli/debug: add SQL-decoding to printed keys r=dt a=dt This adds a `--decode-as-table` flag to `debug keys --values`. When passed a base64 encoded Descriptor, which can be obtained via `select encode(descriptor, 'base64') from system.descriptor where id = x`, it will attempt to decode KVs using a SQL row-fetcher backed by that descriptor and print them as col1=val1, col2=val2,... Release note: none. 53060: backupccl: refactor backup destination resolution r=dt a=pbardea During backup planning we need to resolve the backup destination. This means that we need to: 1. Determine if we're going to resolve to a backup collection or not 2. Determine if we should auto-append 3. Read any encryption options we may need if we determine that we're donig an incremental backup 4. Find the previous backups in the chain if we're doing an incremental backup. Release note: None 53066: opt: add rule to replace ST_Distance with ST_DWithin r=DrewKimball a=DrewKimball `ST_DWithin` is equivalent to the expression `ST_Distance <= x`. Similar reformulations can be made for different comparison operators (<, >, >=). This PR consists of two commits. The first commit adds an internal builtin function `ST_DWithinExclusive`, which is equivalent to `ST_Distance < x` but is able to exit early. The second commit adds two rules that replace expressions of the form `ST_Distance <= x` with either `ST_DWithin` or `ST_DWithinExclusive` depending on the comparison operator used. This replacement is desirable because the `ST_DWithin` function can exit early, while `ST_Distance` cannot do the same. `ST_DWithin` can also be used to generate an inverted index scan. Fixes #52028 Release note: None 53085: colexec: fix dynamic batch behavior in hash joiner with low mem limit r=yuzefovich a=yuzefovich The disk-spilling logic in the hash joiner relies on the assumption that a memory limit error can only occur during the building of the hash table. However, this assumption was recently broken in the case of the low `workmem` limit setting by the dynamic batch size work - it now became possible for an error to occur during output population. As a result, we might have already consumed some batches from the left input and have emitted partial results before we fall back to the external hash joiner, and this would result in some tuples not being emitted at all (if we have a pending batch from the left). This problem is now fixed by restoring the invariant with the introduction of an unlimited allocator into the in-memory hash joiner. The unlimited allocator is used only to populate the output (meaning that we have already successfully built the hash table from the right side) at the phase of the hash join algorithm when we consume the left side one batch at a time and populate the output also one batch at a time, so such choice is acceptable. This additionally allows to properly account for the memory used by the output batch. Note that the only other disk-spilling operator (different sorters that fallback to the external sort) weren't impacted by this because they are able to correctly export the buffered tuples regardless of when the memory limit error occurs. Fixes: #52859. Release note: None 53146: rangefeed: don't delete txn records from the rngfeed pusher r=andreimatei a=andreimatei Before this patch, the rangefeed txn pushed was GC'ing the records of the transactions that it found to be aborted or committed when it PUSH_TOUCHED's them. Deleting the txn record is not a nice thing to do if the txn's coordinator might still be running, and so this patch terminates the practice. If the coordinator is still running and it had STAGED the txn record, a PUSH_TOUCH might cause a transition to ABORTED or COMMITTED. If such a transition is followed by the GC of the txn record, this causes an ambigous commit error for the coordinator, which can't tell whether the transaction committed (implicitly) or was rolled back. As the code stands, this patch doesn't actually help eliminate the ambiguity in the ABORTED case (it does in the COMMIT case though), because the coordinator doesn't distinguish between an ABORTED record and a missing record; it treats both the same, as ambiguous. But we're going to improve that, and the coordinator is gonna consider an existing record as non-ambiguous. The GC'ed case is going to remain the ambiguous one, and the idea is to avoid that as much as possible by only doing GC from the gc-queue (after an hour) and from the coordinator. Touches #52566 Release note (bug fix): Some rare AmbiguousCommitErrors happening when CDC was used were eliminated. Co-authored-by: richardjcai <caioftherichard@gmail.com> Co-authored-by: David Taylor <tinystatemachine@gmail.com> Co-authored-by: Paul Bardea <pbardea@gmail.com> Co-authored-by: Drew Kimball <drewk@cockroachlabs.com> Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com> Co-authored-by: Andrei Matei <andrei@cockroachlabs.com>
Queries that use
ST_DWithin
in theWHERE
orON
clause can take advantage of an inverted index for acceleration, while queries that useST_Distance
cannot. However, in cases whereST_Distance
is used as part of an inequality condition, it should be possible to convert it to an equivalent condition withST_DWithin
. We should create a normalization rule to perform this conversion so that such queries can take advantage of index acceleration if a suitable inverted index is available.The text was updated successfully, but these errors were encountered: