[Feature] Lightweight schema change of add/drop column #10135

Lchangliang · 2022-06-14T12:24:06Z

Search before asking

I had searched in the issues and found no similar issues.

Description

Background

Adding or dropping a column is a heavy operation. It triggers a linkedSchemaChange, or a data copy when the data is in S3. When columns are added or dropped frequently, a lot of time is wasted waiting. We need a new way to optimize the process.

Improvement

This improvement involves three aspects: read, write, and compaction. In the original implementation, BE holds the tablet schema and sets a unique ID for each column. During read/write/compaction, BE gets the schema from the tablet meta. The core of the modification is that BE gets the schema from FE during read/write, and every rowset holds its own schema. Compaction uses the newest schema.
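A minimal sketch of why a per-rowset schema plus stable column unique IDs can make add/drop cheap on the read path. This is illustrative Python, not the BE implementation: `Column`, `Rowset`, `read_rowset`, and the rule of filling a missing column with its default are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class Column:
    unique_id: int       # hypothetical: stable ID, never reused after a drop
    name: str
    default: Any = None  # hypothetical: value used when a rowset lacks the column


@dataclass
class Rowset:
    schema: List[Column]        # the schema this rowset was written with
    rows: List[Dict[int, Any]]  # each row maps column unique_id -> value


def read_rowset(rowset: Rowset, read_schema: List[Column]) -> List[Dict[str, Any]]:
    """Project stored rows onto the schema sent with the read request.

    Columns are matched by unique_id, so a dropped column is simply skipped
    and a newly added column is filled with its default. No data is rewritten.
    """
    written_ids = {c.unique_id for c in rowset.schema}
    result = []
    for row in rowset.rows:
        result.append({
            c.name: row[c.unique_id] if c.unique_id in written_ids else c.default
            for c in read_schema
        })
    return result


# An old rowset written with columns (id, v1).
old_rowset = Rowset(
    schema=[Column(0, "id"), Column(1, "v1")],
    rows=[{0: 1, 1: "a"}, {0: 2, 1: "b"}],
)
# Newest schema after dropping v1 and adding v2 with default 0.
newest_schema = [Column(0, "id"), Column(2, "v2", default=0)]
projected = read_rowset(old_rowset, newest_schema)
```

Matching by unique ID rather than by name or position is what lets the old rowset stay untouched in this sketch; a column re-added under a previously used name would get a fresh ID and would not pick up stale data.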

Modification

Generate unique IDs in FE.
When reading or inserting data, FE sends the newest schema to BE.
When inserting, BE persists the schema in the rowset meta.
When doing compaction, BE chooses the newest schema among the rowsets being compacted and persists it in the new rowset meta after compaction.
The improvement applies only to adding/dropping value columns. Adding/dropping a key column is still done the old way.
It is compatible with old tables, meaning tables that already existed before the upgrade. However, old tables always use the old path, even when the column is a value column.
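The rules above can be sketched roughly as follows. This is an illustrative Python model, not FE/BE code: the names `ColumnChange`, `RowsetMeta`, `can_use_light_schema_change`, and `pick_compaction_schema` are hypothetical, and so is the `schema_version` counter used to decide which schema is "newest" (the issue does not say how that is determined).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ColumnChange:
    column_name: str
    is_key: bool  # True for a key column, False for a value column


def can_use_light_schema_change(table_is_old: bool,
                                changes: List[ColumnChange]) -> bool:
    """Return True when an add/drop request may take the lightweight path.

    Old tables (created before the upgrade) always use the old path, and
    any change touching a key column falls back to the old path as well.
    """
    if table_is_old:
        return False
    return all(not change.is_key for change in changes)


@dataclass(frozen=True)
class RowsetMeta:
    rowset_id: str
    schema_version: int                 # assumption: larger means newer
    column_unique_ids: Tuple[int, ...]  # schema persisted with the rowset


def pick_compaction_schema(inputs: List[RowsetMeta]) -> RowsetMeta:
    """Pick the input rowset carrying the newest schema.

    The output rowset of the compaction would persist that schema
    in its own rowset meta.
    """
    if not inputs:
        raise ValueError("compaction needs at least one input rowset")
    return max(inputs, key=lambda meta: meta.schema_version)


# Example: dropping a value column on a new table is lightweight;
# the same request on an old table, or on a key column, is not.
drop_value = [ColumnChange("v1", is_key=False)]
drop_key = [ColumnChange("k1", is_key=True)]
light = can_use_light_schema_change(False, drop_value)
heavy_old_table = can_use_light_schema_change(True, drop_value)
heavy_key = can_use_light_schema_change(False, drop_key)
```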

Result

Adding or dropping a value column becomes a lightweight operation: it does not need to rewrite the data and completes quickly.

Use case

No response

Related issues

No response

Are you willing to submit PR?

Yes I am willing to submit a PR!

Code of Conduct

I agree to follow this project's Code of Conduct

Lchangliang · 2022-07-12T07:02:01Z

TODO:

For old tables, FE synchronizes unique_id from BE.
Reduce memory usage when too many tablet_schemas are held in memory.
Flink connector: support Light Schema Change, and optimize the case where removing data via stream load requires a header.

Lchangliang added the kind/feature label Jun 14, 2022

Lchangliang mentioned this issue Jun 14, 2022

[Feature] Lightweight schema change of add/drop column #10136

Merged

dataroaring closed this as completed in #10136 Jul 12, 2022
