-
Notifications
You must be signed in to change notification settings - Fork 384
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds sql creation scripts for postgres #1299
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1299 +/- ##
==========================================
+ Coverage 60.74% 61.06% +0.31%
==========================================
Files 125 125
Lines 10072 10072
==========================================
+ Hits 6118 6150 +32
+ Misses 3259 3227 -32
Partials 695 695
Continue to review full report at Codecov.
|
storage/postgres/storage.sql
Outdated
-- are allowed they will all reference this row. | ||
CREATE TABLE IF NOT EXISTS leaf_data( | ||
tree_id BIGINT NOT NULL, | ||
-- This is a personality specific has of some subset of the leaf data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/has/hash/
storage/postgres/storage.sql
Outdated
-- This is a personality specific has of some subset of the leaf data. | ||
-- It's only purpose is to allow Trillian to identify duplicate entries in | ||
-- the context of the personality. | ||
leaf_identity_hash BIT VARYING(255) NOT NULL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be BYTEA
or BYTEA(32)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In other storage schemas, we've deliberately kept the field size overly large in order to accommodate > 256bit hashes for future expansion.
storage/postgres/storage.sql
Outdated
-- This is a personality specific has of some subset of the leaf data. | ||
-- It's only purpose is to allow Trillian to identify duplicate entries in | ||
-- the context of the personality. | ||
leaf_identity_hash BIT VARYING(255) NOT NULL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same BYTEA
vs BIT
question
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking a crack at adding Postgres support.
Some schema questions with respect to BYTEA vs. BIT.
Could you include some local integration test results?
I presume integration tests will follow in another PR, but some confirmation that the code base is happy with these types would be nice.
storage/postgres/storage.sql
Outdated
|
||
CREATE TABLE IF NOT EXISTS subtree( | ||
tree_id BIGINT NOT NULL, | ||
subtree_id BIT VARYING(255) NOT NULL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps BYTEA here too (and consider size)
storage/postgres/storage.sql
Outdated
tree_id BIGINT NOT NULL, | ||
tree_head_timestamp BIGINT, | ||
tree_size BIGINT, | ||
root_hash BIT VARYING(255) NOT NULL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps BYTEA here too (and consider size)
storage/postgres/storage.sql
Outdated
tree_head_timestamp BIGINT, | ||
tree_size BIGINT, | ||
root_hash BIT VARYING(255) NOT NULL, | ||
root_signature BIT VARYING(1024) NOT NULL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps BYTEA here too (and consider size)
storage/postgres/storage.sql
Outdated
leaf_value BYTEA NOT NULL, | ||
-- This is extra data that the application can associate with the leaf should it wish to. | ||
-- This data is not included in signing and hashing. | ||
extra_dta BYTEA, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider extra_data
to mirror naming elsewhere?
storage/postgres/storage.sql
Outdated
leaf_identity_hash BIT VARYING(255) NOT NULL, | ||
-- This is a MerkleLeafHash as defined by the treehasher that the log uses. For example for | ||
-- CT this hash will include the leaf prefix byte as well as the leaf data. | ||
merkle_leaf_hash BIT VARYING(255) NOT NULL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps BYTEA here too (and consider size)
storage/postgres/storage.sql
Outdated
leaf_identity_hash BIT VARYING(255) NOT NULL, | ||
-- This is a MerkleLeafHash as defined by the treehasher that the log uses. For example for | ||
-- CT this hash will include the leaf prefix byte as well as the leaf data. | ||
merkle_leaf_hash BIT VARYING(255) NOT NULL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps BYTEA here too (and consider size)
storage/postgres/storage.sql
Outdated
-- This is a SHA256 hash of the TreeID, LeafIdentityHash and QueueTimestampNanos. It is used | ||
-- for batched deletes from the table when trillian_log_server and trillian_log_signer are | ||
-- built with the batched_queue tag. | ||
queue_id BIT VARYING(32) DEFAULT NULL UNIQUE, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps BYTEA here too (and consider size)
@gdbelvin I've switched from Regarding integration tests; are you referring to tests similar to those under the |
Smaller PRs are easier to review. Let's proceed with this and we can modify as needed when we get some unit tests working. |
Thanks for pointing out the lack of a size limit on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@AlCutter Would you mind merging this after taking a look? |
Thanks for the updates, I think it's probably ok to have |
storage/postgres/storage.sql
Outdated
root_hash BYTEA NOT NULL, | ||
root_signature BYTEA NOT NULL, | ||
tree_revision BIGINT, | ||
PRIMARY KEY(tree_id, tree_head_timestamp), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be better to have the PK be tree_id, tree_revision
, you could then drop the index
below (I don't think anything needs this key to contain the timestamp - the cloudspanner implementation uses /treeID/tree_revision
as it's PK on the equivalent table there BTW).
Separately, and I must admit I don't know much about PGSQL, but other storage impls define the PK here as tree_id, tree_rev DESC
, because by far the most common operation on this table is a scan which wants to fetch the newest entry. It looks like PG doesn't support the DESC
modifier on the PK fields, so I don't know what implications this will have for performance when you come to write the code itself.
Might be worth popping a comment or a TODO
here as a note to yourself to look at this area again in detail once the code lands too.
@AlCutter you're right, it seems like postgres doesn't support scan ordering for primary keys. That being said, the workaround seems to be to create a unique index as well, seen here. I left the index below the create table statement and added a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Referencing #1298, this PR adds the SQL creation script for a postgres data store and a stripped down reset script. I've done my best to translate the MySQL <> Postgres types; however they may not be a perfect mapping and it's possible that there'll need to be further changes made but I figured it's a good starting point for now.
Note that I switched to camel casing since Postgres is case insensitive and would normalize everything to being lowercased which is hard to read without underscores.