Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

catalog: add support for moving reader catalogs timestamp forward for PCR #130088

Merged
merged 2 commits into from
Sep 16, 2024

Conversation

fqazi
Copy link
Collaborator

@fqazi fqazi commented Sep 4, 2024

This patch adds logic for extracting and creating a PCR reader catalog. This patch will do the following:

  1. Add a ReplicatedVersion field, which will allow us to determine if the timestamp can be advanced or the descriptor needs to be copied again
  2. Set the Replicated field on the system database once a reader catalog is setup, so that schema changes are blocked

Fixes: #129664

Copy link

blathers-crl bot commented Sep 4, 2024

Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@fqazi fqazi changed the title catalog: Add interface for creating PCR reader catalogs catalog: add support for moving reader catalogs timestamp forward for PCR Sep 5, 2024
@fqazi fqazi marked this pull request as ready for review September 5, 2024 18:51
@fqazi fqazi requested review from a team as code owners September 5, 2024 18:51
@fqazi fqazi requested review from rytaft, a team and stevendanna and removed request for a team September 5, 2024 18:51
@fqazi fqazi force-pushed the pcrReaderInterface branch 4 times, most recently from 14a6f5f to 7ee3333 Compare September 10, 2024 15:10
@msbutler msbutler self-requested a review September 11, 2024 14:47
@fqazi
Copy link
Collaborator Author

fqazi commented Sep 11, 2024

@msbutler RFAL!

Copy link
Collaborator

@msbutler msbutler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! mostly left nits

return err
} else {
err = nil
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: just realized the comment below is out of date

if tbl.GetExternal() == nil {
return nil, errors.AssertionFailedf("cannot advanced timestamp on a real table.")
}
tbl.GetExternal().AsOf = asOf
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if you instead created a BumpExternalAsOf() api off of mutableTable that contains the logic of advanceTimestampForTable? The line tbl.GetExternal().AsOf = asOf is a bit odd.

} else {
// TODO(fqazi): For non-single version jumps on replicated descriptors
// we need to either bring the descriptor offline or have a mechanism
// to force the leasing subsystem to avoid "mixing" timestamps.g
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oof, what are the consequences for the reader?

@@ -178,6 +120,10 @@ func SetupOrAdvanceStandbyReaderCatalog(
return nil
}
namespaceUpdated.Add(e.GetID())
// Do not upsert entries if one already exists.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think the namespaceUpdated is no longer used?

destRunner := sqlutils.MakeSQLRunner(destConn)

compareConn := func(query string) {
compareConnAt := func(query string, now hlc.Timestamp, isEqual bool) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: i think it would be clearer if the underlying func never accepted a timestamp and always used the captured now var. compareConnAt could instead be check and compareConn could be checkEqual or something.

Now before any cluster of "checks" you always set now to the desired timestamp.

@@ -282,6 +294,15 @@ func (tc *Collection) WriteDescToBatch(
return errors.AssertionFailedf("cannot write descriptor with an empty ID: %v", desc)
}
desc.MaybeIncrementVersion()
// If this is not an initial version, then replication
// versions of descriptors cannot be mutated.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is desc.GetVersion() != 1 part of this check?

@@ -176,6 +176,14 @@ func TestReaderCatalog(t *testing.T) {
compareConn("SELECT * FROM t1 ORDER BY n")
compareConn("SELECT * FROM v1 ORDER BY 1")
compareConn("SELECT * FROM t2 ORDER BY j")

// Validate that schema changes are blocked.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice! could you add one test case related to modifying an existing descriptor in the reader tenant?

@@ -57,6 +60,15 @@ func (p *planner) SchemaChange(ctx context.Context, stmt tree.Statement) (planNo
return nil, err
}
mode := p.extendedEvalCtx.SchemaChangerState.mode
// Lease the system database to see if schema changes are blocked on reader
// catalogs.
systemDB, err := p.Descriptors().ByIDWithLeased(p.txn).Get().Database(ctx, keys.SystemDatabaseID)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hrm, could this add overhead to every schema change?

@@ -246,6 +247,16 @@ func hasPrimaryKeySerialType(params runParams, colDef *tree.ColumnTableDef) (boo
}

func (n *createTableNode) startExec(params runParams) error {
// Lease the system database to see if schema changes are blocked on reader
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could you instead just lease the parent database? but maybe it doesn't matter. Same q for create_view

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure why createTable and createView are special cased, but i trust you 😁

@rytaft rytaft removed their request for review September 11, 2024 19:20
// Next ID: 62
// Replicated ID tracks the original version from the source tenant
// that this descriptor was created from.
optional uint32 replicated_version = 63 [(gogoproto.nullable) = false,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I wonder if the name of this field should be ReplicatedPCRVersion, which makes it more clear that this field is very specific to PCR.

Also the name in the comment is out of date.

Copy link
Collaborator Author

@fqazi fqazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @msbutler and @stevendanna)


pkg/sql/create_table.go line 250 at r2 (raw file):

could you instead just lease the parent database? but maybe it doesn't matter. Same q for create_view

Fair, we can just use the parent database in these code paths.

createTable / createView don't get some help from the optimizer since we support CREATE AS / CREATE VIEW, which help resolve things. Normally schema changes will take the AST and make opaque plan nodes, outside of those exceptions.


pkg/sql/schema_change_plan_node.go line 65 at r2 (raw file):

Previously, msbutler (Michael Butler) wrote…

hrm, could this add overhead to every schema change?

So, the system database will eventually just leased to wait for one version (there probably are a pile of other things that will also read system tables and cause the same thing). So, this just shifts doing it a bit earlier.


pkg/sql/catalog/descs/collection.go line 298 at r1 (raw file):

Previously, msbutler (Michael Butler) wrote…

why is desc.GetVersion() != 1 part of this check?

Good point, this should no longer be needed (previously, I didn't have the bypass for the collection).

Done.


pkg/sql/catalog/replication/reader_catalog.go line 107 at r1 (raw file):

Previously, msbutler (Michael Butler) wrote…

oof, what are the consequences for the reader?

You might be mixing descriptors that were in a set of schema changes, but have vastly different version.

For example:
A user decides to drop a column on a table, which will also clean up a view. Its possible I think that you end up in a scenario where you think the view exists, but get a the table without a column referenced by the view.

I don't think there is anything to prevent that from happening. Also I think in general are we worried about people mixing different timestamps on their descriptors?


pkg/sql/catalog/replication/reader_catalog.go line 123 at r1 (raw file):

Previously, msbutler (Michael Butler) wrote…

i think the namespaceUpdated is no longer used?

Done.


pkg/sql/catalog/replication/reader_catalog.go line 171 at r1 (raw file):

Previously, msbutler (Michael Butler) wrote…

what if you instead created a BumpExternalAsOf() api off of mutableTable that contains the logic of advanceTimestampForTable? The line tbl.GetExternal().AsOf = asOf is a bit odd.

Done.


pkg/sql/catalog/replication/reader_catalog_test.go line 180 at r2 (raw file):

Previously, msbutler (Michael Butler) wrote…

nice! could you add one test case related to modifying an existing descriptor in the reader tenant?

Done.

@fqazi fqazi requested a review from msbutler September 13, 2024 16:09
Copy link
Collaborator Author

@fqazi fqazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @msbutler and @stevendanna)


pkg/sql/catalog/replication/reader_catalog.go line 107 at r1 (raw file):

Previously, fqazi (Faizan Qazi) wrote…

You might be mixing descriptors that were in a set of schema changes, but have vastly different version.

For example:
A user decides to drop a column on a table, which will also clean up a view. Its possible I think that you end up in a scenario where you think the view exists, but get a the table without a column referenced by the view.

I don't think there is anything to prevent that from happening. Also I think in general are we worried about people mixing different timestamps on their descriptors?

Actually, never mind the modification time check inside leasing should protect us from this scenario, so this should all work. Let me drop this comment

Copy link
Collaborator

@msbutler msbutler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @fqazi and @stevendanna)


pkg/sql/schema_change_plan_node.go line 65 at r2 (raw file):

Previously, fqazi (Faizan Qazi) wrote…

So, the system database will eventually just leased to wait for one version (there probably are a pile of other things that will also read system tables and cause the same thing). So, this just shifts doing it a bit earlier.

sg.


pkg/sql/catalog/replication/reader_catalog.go line 107 at r1 (raw file):

Previously, fqazi (Faizan Qazi) wrote…

Actually, never mind the modification time check should protect us, so this should all work. Let me drop this comment

Not sure what you mean by "the modification time check". will hopefully understand with the comment.

@msbutler
Copy link
Collaborator

@fqazi LGTM modulo the rebase is clean. Also one other thing to consider, that may not have seen: #130088 (comment)

@msbutler
Copy link
Collaborator

msbutler commented Sep 14, 2024

@fqazi one thing i'm curious about: it's interesting that the approach you take to prevent schema changes in the reader tenant touches very little code that @rafiss is working through in #130697. Why is that? Perhaps because Rafi's change needs to filter by schema change, which i guess is only accessible in the schema change code, whereas your change applies a blanket lock.

@fqazi
Copy link
Collaborator Author

fqazi commented Sep 16, 2024

@fqazi one thing i'm curious about: it's interesting that the approach you take to prevent schema changes in the reader tenant touches very little code that @rafiss is working through in #130697. Why is that? Perhaps because Rafi's change needs to filter by schema change, which i guess is only accessible in the schema change code, whereas your change applies a blanket lock.

Yeah, so this is a lower level code path thats lets us block all DDL operations. The schema_locked and other variants try to allow a subset of schema changes that are not backfilled, so they take a blacklist approach. Where specific code paths have checks and others are ignored. @msbutler

@fqazi fqazi force-pushed the pcrReaderInterface branch 3 times, most recently from 72caa15 to b562e75 Compare September 16, 2024 13:38
To help support PCR, we should prevent schema changes on replicated
descriptors. This will prevent the reader catalog from being incorrectly
modified. To address this, this patch will add a ReplicatedPCRVersion field
to all descriptors and use that to block schema changes. Additionally,
this field will be used in the future to allow descriptors to be updated
to later timestamps.

Release note: None
Fixes: cockroachdb#129664
Previously, schema changes were allowed on the reader catalog created by
physical cluster replication. This could be problematic because users
could introuduce conflicts that could prevent this. To adress this, this
patch forces the reader catalogs system database to have the versions
filed set, which can be used to disable schema change operations.

Release note: None
Copy link
Collaborator Author

@fqazi fqazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @msbutler and @stevendanna)


pkg/sql/schema_change_plan_node.go line 65 at r2 (raw file):

Previously, msbutler (Michael Butler) wrote…

sg.

Done.


pkg/sql/catalog/replication/reader_catalog.go line 107 at r1 (raw file):

Previously, msbutler (Michael Butler) wrote…

Not sure what you mean by "the modification time check". will hopefully understand with the comment.

I added this comment, there is still a hazard from this:

// Its possible that multiple descriptor versions could have gone by on
// the from cluster, which could mean fairly large schema changes have
// occured. This should be fine, because the leasing subsystem will look
// at the modification time of a descriptor to determine which version should
// be picked up by a txn. Since, we are publishing all descriptors in a single txn,
// this would ensure that we don't end up using a mix of "new" descriptors in oldatxns,
// since read timestamp will be below the modification time. The only potential hazard
// that exists here is that newer txns could pick up older descriptors if the range feed
// is behind.
// TODO(fqazi): We should either change how big version jumps are handled here
// or adjust leasing to publish in batches (latter would be more usable).

I'll get an issue open and adjust the leasing code. I think we can do a fairly conservative change to eliminate the hazard and also make it more usable in other scenarios

@fqazi
Copy link
Collaborator Author

fqazi commented Sep 16, 2024

@msbutler TFTR!

bors r+

@craig craig bot merged commit 7cd5895 into cockroachdb:master Sep 16, 2024
23 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql/catalog: prevent schema changes on replicated descriptors from reader catalog
3 participants