Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor Regions Storage #11

Merged
merged 25 commits into from
Mar 20, 2019
Merged

Refactor Regions Storage #11

merged 25 commits into from
Mar 20, 2019

Conversation

solotzg
Copy link
Contributor

@solotzg solotzg commented Mar 13, 2019

  • store all regions in one partition

  • refactor read process in MergeTreeDataSelectExecutor:

    • select and choose region config from setting.
    • check region version.
    • select parts and range of marks from CH by range of key in region.
    • concurrently generate block input stream, which is based on the granularity of region.
  • flush data into CH first and then remove it from cache.

By using test data of tpch10, operation load/search/change/delete are correct
TODO

  • something may be wrong with operation like remove region.

@solotzg solotzg changed the title [Don't Merge Now] Refactor Regions Storage Refactor Regions Storage Mar 15, 2019
@zanmato1984
Copy link
Contributor

/run-integration-tests

@zanmato1984
Copy link
Contributor

Please merge latest raft branch to enable ci test.

@zanmato1984
Copy link
Contributor

/run-integration-tests


// For test, please do NOT remove.
RegionMap & _regions() { return regions; }
RegionMap getRegions();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function returns value rather than const reference. Is this by design?

Copy link
Contributor Author

@solotzg solotzg Mar 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've fix it now by making it return const reference.


size_t rows = block.rows();

auto handle_bg = column->getElement(0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bg => begin, ed => end

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -222,16 +219,15 @@ void dbgFuncRegionPartition(Context & context, const ASTs & args, DBGInvoker::Pr
}

std::stringstream ss;
ss << "table #" << table_id << " partition #" << partition_id <<
" region #" << region_id << region_range_info.str();
ss << "table #" << table_id << " region #" << region_id << region_range_info.str();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dump more region status: region meta, raft log index, etc

@@ -663,6 +666,24 @@ QueryProcessingStage::Enum InterpreterSelectQuery::executeFetchColumns(Pipeline
query_info.resolve_locks = settings.resolve_locks;
query_info.read_tso = settings.read_tso;

String region_str = settings.region;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One region each query?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

region_persister.persist(curr_region);
for (const auto & region : split_regions)
region_persister.persist(region);
if (tmt_ctx)
Copy link
Contributor

@zanmato1984 zanmato1984 Mar 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason swapping the order of persisting regions and calling region_table.splitRegion?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

data in kvstore must be stored first.

{
std::lock_guard<std::mutex> lock(mutex);

curr_region->swap(*new_region);
regions[curr_region_id] = new_region;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason use two assignment to replace calling swap?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've propose a better solution. The original function swap also has bugs. 6528907#diff-b97987fcf9f8bc8e8506a0b07f2512edR285

partition_regions.push_back(region);
}
if (region_version != InvalidRegionVersion && (region->version() != region_version || region->conf_ver() != conf_version)) {
std::cout <<"for region "<< region_id <<", the version in ch is "<< region_version << " and "<< conf_version <<" , but the required version is " << region->version() <<" and "<< region->conf_ver() <<".\n";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Why use std::cout?
  2. The message content is kind of confusing: what do you mean by "version in ch" and "required version"? Is NOT region->version() the "version in ch"? Is not passed in region_version the "required version"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

namespace DB
{

class RegionException : public Exception
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we record the exception type in this class? Outer may want to know if this exception is caused by NOT_FOUND or VERSION_ERROR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -64,8 +68,8 @@ void KVStore::onSnapshot(const RegionPtr & region, Context * context)
old_region = it->second;
}

if (tmt_ctx && old_region)
tmt_ctx->region_partition.removeRegion(old_region);
if (old_region != nullptr && old_region->getIndex() >= region->getIndex())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What for old_region->getIndex() >= region->getIndex()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need outdated region.

{
std::lock_guard<std::mutex> lock(mutex);

curr_region->swap(*new_region);
regions[curr_region_id] = new_region;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zanmato1984 Plz take a look to confirm this modification

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've propose a better solution. The original function swap also has bugs.

@zanmato1984
Copy link
Contributor

/run-integration-tests

@zanmato1984
Copy link
Contributor

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants