v0.5.0 #96

Merged
merged 15 commits into from
Apr 14, 2022
@bitner commented Mar 22, 2022

[v0.5.0]

Version 0.5.0 is a major refactor of how data is stored. It is recommended to start a new database from scratch and move data over rather than use the built-in migration, which will be very slow for larger amounts of data.

Fixed

Changed

  • The partition layout has been changed from a hardcoded partition per week to nested partitions. The first level is by collection; each collection has an attribute, partition_trunc, which can be set to NULL (no temporal partitions), month, or year.

  • CQL1 and Query Code have been refactored to translate to CQL2 to reduce duplicated code in query parsing.

  • Unused functions have been stripped from the project.

  • Pypgstac has been changed to use Fire rather than Typer.

  • Pypgstac has been changed to use Psycopg3 rather than Asyncpg to enable easier use as both sync and async.

  • Indexing has been reworked to eliminate indexes that, according to logs, were not being used. The global json index on properties has been removed. Indexes on individual properties can be added either globally or per collection using the new queryables table.

  • Triggers for maintaining partitions have been updated to reduce lock contention and to reflect the new data layout.

  • The data pager which optimizes "order by datetime" searches has been updated to get time periods from the new partition layout and partition metadata.

  • Tests have been updated to reflect the many changes.
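
To illustrate the nested partition layout described above, here is a minimal Python sketch of how an item's datetime could map to a temporal bucket for each partition_trunc setting. It is not the pgstac implementation (the real logic lives in the database). The function names and the label format are hypothetical and chosen only for illustration.

```python
from datetime import datetime, timezone
from typing import Optional, Tuple


def partition_bounds(
    dt: datetime, partition_trunc: Optional[str]
) -> Optional[Tuple[datetime, datetime]]:
    """Return the [start, end) range of the temporal partition holding dt.

    Returns None when partition_trunc is None, i.e. the collection has a
    single partition with no temporal sub-partitions.
    """
    if partition_trunc is None:
        return None
    if partition_trunc == "year":
        start = datetime(dt.year, 1, 1, tzinfo=dt.tzinfo)
        end = datetime(dt.year + 1, 1, 1, tzinfo=dt.tzinfo)
    elif partition_trunc == "month":
        start = datetime(dt.year, dt.month, 1, tzinfo=dt.tzinfo)
        # Roll over to January of the next year after December.
        if dt.month == 12:
            end = datetime(dt.year + 1, 1, 1, tzinfo=dt.tzinfo)
        else:
            end = datetime(dt.year, dt.month + 1, 1, tzinfo=dt.tzinfo)
    else:
        raise ValueError(f"unsupported partition_trunc: {partition_trunc!r}")
    return start, end


def partition_label(
    collection: str, dt: datetime, partition_trunc: Optional[str]
) -> str:
    """Hypothetical label: collection first, then an optional time bucket."""
    bounds = partition_bounds(dt, partition_trunc)
    if bounds is None:
        return collection
    fmt = "%Y" if partition_trunc == "year" else "%Y%m"
    return f"{collection}_{bounds[0].strftime(fmt)}"
```

With this sketch, a collection whose partition_trunc is NULL keeps all of its items in one bucket, while month and year split that collection's items into half-open time ranges.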

Added

  • On ingest, the content of an item is compared to the metadata available at the collection level, and duplicate information is stripped out (this is primarily data in the item_assets property). Logic has been added to merge this data back in when the data is used.
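
The strip-on-ingest and merge-on-read idea can be sketched in Python as a pair of recursive dictionary functions. This is an illustration only, not the code in this PR. The function names are hypothetical, and the sketch assumes the collection-level base is built by placing the collection's item_assets under the item's assets key.

```python
import copy
from typing import Any, Dict


def strip_duplicates(item: Dict[str, Any], base: Dict[str, Any]) -> Dict[str, Any]:
    """Drop values from item that are identical to those in base."""
    out: Dict[str, Any] = {}
    for key, value in item.items():
        if key in base:
            base_value = base[key]
            if value == base_value:
                continue  # identical to the collection-level value
            if isinstance(value, dict) and isinstance(base_value, dict):
                # Partially shared mapping: keep only what differs.
                out[key] = strip_duplicates(value, base_value)
                continue
        out[key] = value
    return out


def merge_back(stripped: Dict[str, Any], base: Dict[str, Any]) -> Dict[str, Any]:
    """Rebuild the full item by layering stripped values over base."""
    out = copy.deepcopy(base)
    for key, value in stripped.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            out[key] = merge_back(value, base[key])
        else:
            out[key] = value
    return out


# Example data (made up for illustration).
collection = {
    "id": "c1",
    "item_assets": {"data": {"type": "image/tiff", "roles": ["data"]}},
}
item = {
    "id": "i1",
    "collection": "c1",
    "assets": {
        "data": {
            "href": "s3://bucket/i1.tif",
            "type": "image/tiff",
            "roles": ["data"],
        }
    },
}

# Assumption: the base compared against is derived from item_assets.
base = {"assets": collection["item_assets"]}
stripped = strip_duplicates(item, base)
restored = merge_back(stripped, base)
```

Note that this simple version cannot round-trip an item that deliberately omits a key present at the collection level, because merging would add the key back. A real implementation needs some way to record such omissions.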

@bitner force-pushed the rework_for_v0.5.0 branch 2 times, most recently from 138bc1d to 228a033, March 22, 2022 11:34
@bitner force-pushed the rework_for_v0.5.0 branch from 228a033 to d95d2e2, March 22, 2022 11:35
@bitner marked this pull request as ready for review April 12, 2022 23:07
@bitner requested a review from lossyrob April 12, 2022 23:07