Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Removal of sequence number #7103

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

Conversation

antekresic
Copy link
Contributor

@antekresic antekresic commented Jul 5, 2024

Sequence numbers were an optimization for ordering batches based on the orderby configuration setting. It was used for ordered append and avoiding sorting compressed data when it matched the query ordering. However, with enabling changes to compressed data, bookkeeping of sequence numbers is becoming more of a hassle. Removing them and using the metadata columns for ordering reduces that burden while keeping all the existing optimizations that relied on the sequences in place.

This change does not include downgrade script which will be added in a followup PR.

There are slight planning regressions but otherwise benchmark looks OK:
https://grafana.ops.savannah-dev.timescale.com/d/fasYic_4z/compare-akuzm?orgId=1&var-branch=All&var-run1=3725&var-run2=3735&var-threshold=0&var-use_historical_thresholds=true&var-threshold_expression=2.5%20%2A%20percentile_cont%280.90%29&var-exact_suite_version=false

Disable-check: force-changelog-file

@antekresic antekresic self-assigned this Jul 5, 2024
@antekresic antekresic force-pushed the antekresic/remove-seq-num branch 3 times, most recently from 7e9c8f5 to d0625c5 Compare September 13, 2024 08:11
@antekresic antekresic marked this pull request as ready for review September 13, 2024 09:04
@akuzm
Copy link
Member

akuzm commented Sep 19, 2024

Some questions:

  • Can you do backwards scan using a compressed index scan? Technically the required order on compressed data is segmentby DESC, max_orderby1 DESC, min_orderby1 DESC, and it cannot be satisfied with an index on segmentby, min_orderby1, max_orderby1.

  • We're going to have to support reading sequence numbers for a long time, not to make people recompress all the tables, right? I wonder how we should best test it, maybe we should keep the option to create chunks with sequence numbers for now, and use it in a dedicated test. I think there's a good chance we break something on a timeframe of e.g. a year.

@svenklemm svenklemm added this to the TimescaleDB 2.17.0 milestone Sep 19, 2024
Copy link
Contributor

@erimatnor erimatnor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few nits so far.

What is the plan for update/downgrade scripts? Do we need to block downgrades somehow, so that people can't downgrade if they have the new format?

tsl/src/compression/api.c Show resolved Hide resolved
tsl/src/compression/compression.c Outdated Show resolved Hide resolved
tsl/src/compression/compression.c Outdated Show resolved Hide resolved
tsl/src/compression/compression.c Outdated Show resolved Hide resolved
Custom Scan (ChunkAppend) on test_partials
Order: test_partials."time"
-> Merge Append
Sort Key: _hyper_35_68_chunk."time"
-> Custom Scan (DecompressChunk) on _hyper_35_68_chunk
-> Sort
Sort Key: compress_hyper_36_71_chunk._ts_meta_sequence_num DESC
Sort Key: compress_hyper_36_71_chunk._ts_meta_min_1, compress_hyper_36_71_chunk._ts_meta_max_1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should there be an explicit DESC also in the new sort key? Or should it be ASC?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previous sort was a DESC becuase compression is set to "time DESC" so to get ASC ordering, you needed reverse order on the sequence number. New ordering should always mirror the column ordering because we are using actual values instead of a sequence which can have its own ordering.

tsl/test/sql/compression_sequence_num_removal.sql Outdated Show resolved Hide resolved
@antekresic antekresic force-pushed the antekresic/remove-seq-num branch 3 times, most recently from 0be3ba2 to 7d42912 Compare September 26, 2024 12:58
@antekresic
Copy link
Contributor Author

@erimatnor added the missing downgrade script check. Please take a look when you can.

@antekresic antekresic force-pushed the antekresic/remove-seq-num branch 2 times, most recently from df5bcff to 85531a1 Compare September 27, 2024 13:50
Sequence numbers were an optimization for ordering batches based on the
orderby configuration setting. It was used for ordered append and
avoiding sorting compressed data when it matched the query ordering.
However, with enabling changes to compressed data, bookkeeping of
sequence numbers is becoming more of a hassle. Removing them and
using the metadata columns for ordering reduces that burden while
keeping all the existing optimizations that relied on the sequences
in place.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants