-
Notifications
You must be signed in to change notification settings - Fork 606
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use the whole TTableDescription to describe an indexImplTable in TIndexDescription #6280
Use the whole TTableDescription to describe an indexImplTable in TIndexDescription #6280
Conversation
@@ -92,7 +92,14 @@ class TDescribeTableRPC : public TRpcSchemeRequestActor<TDescribeTableRPC, TEvDe | |||
return Reply(Ydb::StatusIds::INTERNAL_ERROR, ctx); | |||
} | |||
|
|||
FillIndexDescription(describeTableResult, tableDescription, splitKeyType); | |||
try { | |||
FillIndexDescription(describeTableResult, tableDescription); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FillIndexDescription now calls FillColumnDescription which might throw
) { | ||
*entry.MutableExplicitPartitions()->MutableSplitBoundary() = explicitPartitions; | ||
} | ||
*entry.AddIndexImplTableDescription() = indexImplTable->TableDescription; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder why there is no column info in the IndexImplTableDescription... I will solve this problem in scope of #4955
const NKikimrMiniKQL::TType& splitKeyType | ||
) { | ||
for (const auto& boundary : boundaries) { | ||
const NKikimrSchemeOp::TTableDescription& in, const NKikimrMiniKQL::TType& splitKeyType) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes restore these lines to the original state in which they were before the #5119
if (indexImplTableDescription.HasUniformPartitionsCount()) { | ||
settings.set_uniform_partitions(indexImplTableDescription.GetUniformPartitionsCount()); | ||
} | ||
if (indexImplTableDescription.SplitBoundarySize()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could ask yourself: "What if BOTH UniformPartitionsCount and SplitBoundary are specified?" My answer is: SplitBoundary has higher priority. This method (FillIndexDescription) prepares a description of an index. It does not create the index! So SplitBoundary is more important in this scenario.
The other option I see here is to throw an error. (FillIndexDescription already throws if FillColumnDescription cannot decide how to fill the columns.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other option I see here is to throw an error. (FillIndexDescription already throws if FillColumnDescription cannot decide how to fill the columns.)
Exception sounds better for me, because
- looks like error case
- can be changed to something else in future
- I don't think having two different behavior in such case is an good idea
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I checked how DescribeTable deals with this problem and found out that it just ignores UniformPartitionsCount, i.e. it does not even check if it exists in the TTableDescription. See the code around here. The motivation is: there is no such thing as UniformPartitionsCount for an existing table. If a table already exists, then it has explicit partition boundaries which are represented by SplitBoundary field of the TTableDescription protobuf.
ydb/core/protos/flat_scheme_op.proto
Outdated
TExplicitPartitions ExplicitPartitions = 11; | ||
} | ||
optional TPartitioningPolicy PartitioningPolicy = 12; | ||
repeated TTableDescription IndexImplTableDescription = 10; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the Language Guide (proto 3) , repeated fields are used in the plural.
repeated TTableDescription IndexImplTableDescription = 10; | |
repeated TTableDescription IndexImplTableDescriptions = 10; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. I agree. More on the issue:
Use pluralized names for repeated fields.
Repeated fields must use a plural field name.
- statistically speaking, only 12 out of 94 repeated protobuf field names are singular in flat_scheme_op.proto
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
@@ -1000,15 +996,10 @@ message TIndexDescription { | |||
optional uint64 PathOwnerId = 7; | |||
|
|||
repeated string DataColumnNames = 8; | |||
|
|||
// DataSize + IndexSize of indexImplTable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// DataSize + IndexSize of indexImplTable | |
// DataSize + IndexSize |
What is IndexSize? Is it size of index or size of all (now single) index impl tables?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IndexSize is the size of the index (the internal index that implements the primary key logic) of the indexImplTable. It follows the general DataSize + IndexSize = TotalSize convention for tables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would much rather leave this comment for you to deal with in scope of allowing multiple indexImplTables 😜 My idea for this PR is to create a hotfix before it is too late, not to correct every assumption in the code that the indexImplTable is single
if (indexImplTableDescription.HasUniformPartitionsCount()) { | ||
settings.set_uniform_partitions(indexImplTableDescription.GetUniformPartitionsCount()); | ||
} | ||
if (indexImplTableDescription.SplitBoundarySize()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other option I see here is to throw an error. (FillIndexDescription already throws if FillColumnDescription cannot decide how to fill the columns.)
Exception sounds better for me, because
- looks like error case
- can be changed to something else in future
- I don't think having two different behavior in such case is an good idea
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
void DescribeTableIndex(const TPathId& pathId, const TString& name, NKikimrSchemeOp::TIndexDescription& entry); | ||
void DescribeTableIndex(const TPathId& pathId, const TString& name, TTableIndexInfo::TPtr indexInfo, NKikimrSchemeOp::TIndexDescription& entry); | ||
void DescribeTableIndex(const TPathId& pathId, const TString& name, | ||
bool fillConfig, bool fillBoundaries, NKikimrSchemeOp::TIndexDescription& entry |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added explicit fillConfig
and fillBoundaries
options to be able to control the behavior of DescribeTableIndex
. Moreover, these options are supported by DescribeTable
and it would be inconsistent to ignore them in DescribeTableIndex
void FillPartitionConfig(const NKikimrSchemeOp::TPartitionConfig& in, NKikimrSchemeOp::TPartitionConfig& out) { | ||
out.CopyFrom(in); | ||
NKikimr::NSchemeShard::TPartitionConfigMerger::DeduplicateColumnFamiliesById(out); | ||
out.MutableStorageRooms()->Clear(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It turns out that PartitionConfig is not just copied, but some unnecessary info is erased before giving it out to users. I have reused this logic for both DescribeTable
and DescribeTableIndex
.
⚪
🟢
*please be aware that the difference is based on comparing your commit and the last completed build from the post-commit, check comparation |
…exDescription (ydb-platform#6280) The purpose of the change is to prepare for multiple indexImplTables for vector indices. Other changes are more or less cosmetic.
It would be better (more future proof) to keep all the necessary settings of the indexImplTable in one message. I would not like to reinvent the bicycle, so the best option I see here is to reuse TTableDescription.
Additionally, I have added options
fillConfig
andfillBoundaries
to control what parts of the whole TTableDescription would be return on DescribeTableIndex request. It should prevent the unnecessary use of CPU for parsing big protobufs (TableBoundaries can be big). We would need to enablefillBoundaries
explicitly for backups, but this would be done in scope of #4955.And the last change is to make IndexImplTableDescription field a repeated field. The motivation is to prepare
flat_scheme_op.proto
for multiple indexImplTables which are said to be coming soon to implement vector indices. It is a breaking change, but there are no instances of YDB that have been using the new TIndexDescriptions with PartitionConfig and PartitionBoundaries, so it should be ok.