modify default backfill data parts order in ascending #510

Conversation

yl-lisen
Collaborator

This closes #508

@yl-lisen yl-lisen requested a review from sunset3000 January 19, 2024 11:35
@yl-lisen yl-lisen self-assigned this Jan 19, 2024
@yl-lisen yl-lisen requested review from qijun-niu-timeplus and yokofly and removed request for sunset3000 January 19, 2024 11:35
@yl-lisen yl-lisen changed the title modify backfill data parts order in ascending modify default backfill data parts order in ascending Jan 19, 2024
@jovezhong
Contributor

Question: does this change have any impact on the seek_to behavior? Are the auto-hist-backfill events sorted or not?

The current docs:
https://docs.timeplus.com/query-syntax#query-settings

  1. enable_backfill_from_historical_store=0|1. By default, if it's omitted, it's 1.
    • When it's 0, the query engine either loads data from streaming storage, or from historical storage.
    • When it's 1, the query engine evaluates whether it's necessary to load data from historical storage (such as when the time range is outside of the streaming storage), or whether it'll be more efficient to get data from historical storage (for example, count/min/max is pre-computed in historical storage, which is faster than scanning data in streaming storage).
  2. force_backfill_in_order=0|1. By default, if it's omitted, it's 0.
    1. When it's 0, the data from the historical storage are returned without extra sorting. This would improve the performance.
    2. When it's 1, the data from the historical storage are returned with extra sorting. This would decrease the performance, so turn on this flag carefully.

@yl-lisen
Collaborator Author

yl-lisen commented Jan 20, 2024

@jovezhong The historical data is sorted in ascending order within each internal data part.
Currently, with the defaults (enable_backfill_from_historical_store=1, force_backfill_in_order=0), we read these data parts in reverse order.

For example:
ingest data: parts1 (0-3), parts2 (3-4)

read in reverse: parts2(3-4), parts1(0-3)

after this change: parts1(0-3), parts2(3-4)
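
To illustrate the example above, here is a minimal Python sketch, not the actual engine code. The `backfill` helper and the concrete event values are hypothetical, and it assumes the ranges (0-3) and (3-4) are inclusive. It only shows that events stay ascending inside each part and that the change affects the order in which whole parts are read, with no extra sorting across parts.

```python
from typing import List, Tuple

# Hypothetical model: a part is (name, events), and the events inside
# a part are already stored in ascending order.
Part = Tuple[str, List[int]]


def backfill(parts: List[Part], ascending: bool = True) -> List[int]:
    """Concatenate events part by part, without sorting across parts."""
    ordered = parts if ascending else list(reversed(parts))
    return [event for _, events in ordered for event in events]


# Assumed inclusive ranges: parts1 covers 0-3, parts2 covers 3-4.
parts = [("parts1", [0, 1, 2, 3]), ("parts2", [3, 4])]

old_order = backfill(parts, ascending=False)  # parts read in reverse
new_order = backfill(parts, ascending=True)   # parts read in ascending order

print(old_order)  # [3, 4, 0, 1, 2, 3]
print(new_order)  # [0, 1, 2, 3, 3, 4]
```

In this sketch the ascending read happens to be globally sorted because the two parts do not interleave. If parts had interleaving ranges, the output would still not be fully sorted, which is presumably what force_backfill_in_order=1 is for.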

@yl-lisen yl-lisen merged commit ac11533 into develop Jan 20, 2024
21 of 25 checks passed
Development

Successfully merging this pull request may close these issues.

Modify the default backfill data parts order in ascending
3 participants