[2DC] Support colocated tables #4516

Closed
ndeodhar opened this issue May 19, 2020 · 2 comments
Assignees
Labels
area/cdc Change Data Capture area/ysql Yugabyte SQL (YSQL)

Comments

ndeodhar commented May 19, 2020

Add the ability to set up replication for colocated tables.
This involves changes to:
- SetupUniverseReplication flow (schema validation)
- Creation of CDC streams (today, we have 1 stream per table)
- Producer-consumer tablet mapping

@ndeodhar ndeodhar added area/ysql Yugabyte SQL (YSQL) area/cdc Change Data Capture labels May 19, 2020
@ndeodhar ndeodhar self-assigned this May 19, 2020
@rahuldesirazu rahuldesirazu assigned hulien22 and unassigned ndeodhar Aug 31, 2020
@rahuldesirazu

A few high-level questions to think about for this feature:

  1. Do we want to support xDC at the database level or leave it as a per table replication stream?
  2. What happens if a table is colocated on only one side? Is this a mode we want to support? How will the tablet mapping look in this case?
  3. How will we store/expose metrics? Currently we have n tablets per stream so we can roll up tablet level metrics to get a per stream metric, but this will change when we have n streams per tablet.

hulien22 added a commit that referenced this issue Oct 28, 2020
…5982)

Summary:
First part of backup for colocated databases (#4874), and also a necessary component of adding 2dc
support for colocated tables (#4516).
Allows for use of `CREATE TABLE ... WITH (table_oid = x);` where the created table will be assigned
the given oid if it is free, and return an error if the oid is already in use.
Similarly, also allows for index creation with (table_oid = x).

Note that the minimum table_oid we allow is FirstNormalObjectId (which is 16384 by default as defined [[ https://github.com/yugabyte/yugabyte-db/blob/master/src/postgres/src/include/access/transam.h#L71-L94 | here ]]).

Adding postgres session variable `yb_enable_create_with_table_oid` to enable and disable this feature (defaults to false).
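
As a rough illustration of the syntax described above, the following Python sketch builds the two statements a client would send: one to enable the session variable and one to create the table with a chosen oid. The helper name `create_with_table_oid_sql`, the sample table, and the client-side range check are assumptions made for illustration; the server remains the authority on whether an oid is free.

```python
from typing import List

# Minimum table_oid accepted, per the note above (FirstNormalObjectId).
FIRST_NORMAL_OBJECT_ID = 16384


def create_with_table_oid_sql(table_name: str, columns_sql: str, table_oid: int) -> List[str]:
    """Return the SQL statements to create a table with a caller-chosen oid.

    Hypothetical helper: it only formats strings and rejects oids below
    FirstNormalObjectId. It cannot tell whether the oid is already in use;
    the server returns an error in that case.
    """
    if table_oid < FIRST_NORMAL_OBJECT_ID:
        raise ValueError(
            f"table_oid {table_oid} is below FirstNormalObjectId ({FIRST_NORMAL_OBJECT_ID})"
        )
    return [
        # The feature is off by default, so enable it for this session first.
        "SET yb_enable_create_with_table_oid = true;",
        f"CREATE TABLE {table_name} ({columns_sql}) WITH (table_oid = {table_oid});",
    ]


if __name__ == "__main__":
    for stmt in create_with_table_oid_sql("orders", "id int PRIMARY KEY, total numeric", 16500):
        print(stmt)
```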

Test Plan:
```
ybd --java-test org.yb.pgsql.TestPgRegressTable
ybd --java-test org.yb.pgsql.TestPgRegressIndex
ybd --java-test org.yb.pgsql.TestPgWithTableOid
```

Reviewers: dmitry, mihnea, zyu

Reviewed By: zyu

Subscribers: zyu, yql

Differential Revision: https://phabricator.dev.yugabyte.com/D9588
hulien22 added a commit that referenced this issue Dec 17, 2020
Summary:
Adding initial support for 2dc with colocated databases.
This is an initial implementation for this feature and thus has a lot of room for improvements.

This implementation allows a user to set up replication for a colocated database's colocated
tablet. It also validates the schemas of all the tables on the producer side and ensures
that they match the tables on the consumer side (including a check that the tables have
the same postgres table oid - see #5982). Once validation is complete, a single stream is set up
between the colocated tablets, which copies all the table data from the producer.

Currently, the main implementation is primarily based around using each colocated database's parent
colocated table id since that is the name of the directory where all its data is stored. I believe
this is still the cleanest way of implementing this on the backend, but work will need to be done in
order to make the user facing side more friendly. Note that this will require some additional
mapping on the _consumer_ side that maps _producer_ parent colocated table id to dbname.

Usage: The regular 2dc yb-admin commands are used without any changes. To
reference a colocated database, use its parent colocated table id instead of a regular table id (this
can be found via `yb-admin ... list_tables include_table_id`, or by taking the database id and
appending `.colocated.parent.uuid`).

Example:
```
yb-admin -master_addresses 127.0.0.2:7100 setup_universe_replication \
  2498e14e-b964-481d-9894-794f4cf06be3 127.0.0.1:7100 \
  00004000000030008000000000000000.colocated.parent.uuid
```
This also works similarly for `alter_universe_replication` and `bootstrap_cdc_producer`.
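
The id derivation described in the usage note can be sketched in Python as follows. The function names are hypothetical, and the positional roles (target master addresses, producer universe uuid, producer master addresses, table id) are read off the example command above rather than taken from yb-admin's documentation.

```python
import shlex
from typing import List

# Suffix appended to a database id to form its parent colocated table id.
COLOCATED_PARENT_SUFFIX = ".colocated.parent.uuid"


def colocated_parent_table_id(database_id: str) -> str:
    """Derive the parent colocated table id from a database id."""
    return database_id + COLOCATED_PARENT_SUFFIX


def setup_replication_argv(
    master_addresses: str,
    producer_universe_uuid: str,
    producer_master_addresses: str,
    table_id: str,
) -> List[str]:
    """Build the argument list for the setup_universe_replication example.

    Hypothetical helper: it mirrors the argument order of the example
    command and does not run yb-admin.
    """
    return [
        "yb-admin",
        "-master_addresses", master_addresses,
        "setup_universe_replication",
        producer_universe_uuid,
        producer_master_addresses,
        table_id,
    ]


if __name__ == "__main__":
    table_id = colocated_parent_table_id("00004000000030008000000000000000")
    argv = setup_replication_argv(
        "127.0.0.2:7100",
        "2498e14e-b964-481d-9894-794f4cf06be3",
        "127.0.0.1:7100",
        table_id,
    )
    # Print a shell-quoted command line; pass argv to subprocess.run to execute it.
    print(shlex.join(argv))
```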

`list_cdc_streams` output (notice the table_id field):
```
~/code/yugabyte-db >>> build/latest/bin/yb-admin -master_addresses 127.0.0.1:7100 list_cdc_streams
CDC Streams:
streams {
  stream_id: "6d2715509021495ba269b3395508ce19"
  table_id: "00004000000030008000000000000000.colocated.parent.uuid"
  options {
      key: "record_format"
      value: "WAL"
  }
  options {
      key: "record_type"
      value: "CHANGE"
  }
}
```
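
To pick out streams that replicate a colocated database, one option is to scan this output for `table_id` values ending in `.colocated.parent.uuid`. The Python sketch below does this with regular expressions; it assumes each `streams` block carries exactly one `stream_id` and one `table_id`, as in the sample above, and the function name `colocated_streams` is hypothetical.

```python
import re
from typing import List, Tuple

STREAM_ID_RE = re.compile(r'stream_id:\s*"([^"]+)"')
TABLE_ID_RE = re.compile(r'table_id:\s*"([^"]+)"')
COLOCATED_PARENT_SUFFIX = ".colocated.parent.uuid"


def colocated_streams(output: str) -> List[Tuple[str, str]]:
    """Return (stream_id, table_id) pairs whose table_id is a parent colocated table id.

    Assumes stream_id and table_id appear once per stream and in the same
    order, so the two match lists can be paired positionally.
    """
    stream_ids = STREAM_ID_RE.findall(output)
    table_ids = TABLE_ID_RE.findall(output)
    return [
        (stream_id, table_id)
        for stream_id, table_id in zip(stream_ids, table_ids)
        if table_id.endswith(COLOCATED_PARENT_SUFFIX)
    ]


SAMPLE_OUTPUT = '''
CDC Streams:
streams {
  stream_id: "6d2715509021495ba269b3395508ce19"
  table_id: "00004000000030008000000000000000.colocated.parent.uuid"
  options {
      key: "record_format"
      value: "WAL"
  }
}
'''

if __name__ == "__main__":
    for stream_id, table_id in colocated_streams(SAMPLE_OUTPUT):
        print(stream_id, table_id)
```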

Test Plan:
```
ybd --cxx-test twodc_ysql-test
```

Reviewers: bogdan, nicolas, rahuldesirazu

Reviewed By: rahuldesirazu

Subscribers: ybase

Differential Revision: https://phabricator.dev.yugabyte.com/D9841
@hulien22

Closed by 07f2c71.
