[YCQL] Table stuck in the keyspace after deletion #3032
If I query the
Interestingly enough, in another keyspace where I had deleted tables I also have duplicate records with the same name (but different |
It's correct to have tables with the same names in different keyspaces. @smalyshev, you are talking about duplicate records:
If there are several table records with the same name (and different ids) in the same keyspace, that looks strange. Could you please provide a log showing how you see that? |
For example:
As you can see, two tables named |
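The condition discussed above (several table records sharing a name within one keyspace while having different ids) can be checked mechanically. The following is only a minimal sketch: the function name, the `(keyspace, name, id)` tuple layout, and the sample rows are hypothetical and are not taken from any YugabyteDB API or from the output in this thread.

```python
from collections import defaultdict


def find_duplicate_tables(records):
    """Group table ids by (keyspace, name) and keep groups with more than one id.

    `records` is an iterable of (keyspace, table_name, table_id) tuples.
    This layout is an assumption made for the sketch.
    """
    ids_by_key = defaultdict(set)
    for keyspace, name, table_id in records:
        ids_by_key[(keyspace, name)].add(table_id)
    return {key: sorted(ids) for key, ids in ids_by_key.items() if len(ids) > 1}


# Hypothetical rows: the same name in different keyspaces is fine,
# the same name twice in one keyspace is the suspicious case.
rows = [
    ("ks1", "foo", "100"),
    ("ks1", "foo", "200"),
    ("ks2", "foo", "300"),
    ("ks1", "bar", "400"),
]
print(find_duplicate_tables(rows))
```

Only `("ks1", "foo")` is reported here, because `foo` in `ks2` lives in a different keyspace.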
The problem is in
In this example, when foo with ID 100 is dropped, we set the table's state as There are 2 fixes needed:
|
The scenario above sounds very close to what I did: I had a table for which I had forgotten to create some fields and indexes, so I deleted it and immediately started to recreate it (I wasn't aware of the complications above at the time). I think it matches what is described above. |
Summary: Since YSQL DDLs are not transactional yet, it's possible to result in a scenario where a namespace is present in YB metadata but not in postgres. This can happen if the creation partly succeeds - i.e. namespace is created in YB, but before it was created in postgres system tables, the operation was terminated (due to intermittent network issue or node failure for example). In this scenario, the namespace is unusable since postgres system tables are unaware of its existence and it just lies around in YB metadata. We should add a yb-admin command to delete namespace that can help recover in such situations.

Another example is #3032. We need a way to clean up and remove tables, namespaces, and indexes in case of DDL consistency issues.

Usage:

Delete namespace by name:
```
yb-admin delete_namespace ysql.namespace_name
yb-admin delete_namespace ycql.namespace_name
```
Delete namespace by ID
```
yb-admin delete_namespace_by_id <id>
```
Delete table by name
```
yb-admin delete_table ysql.namespace_name table_name
yb-admin delete_table ycql.namespace_name table_name
```
Delete table by ID
```
yb-admin delete_table_by_id <id>
```
Delete index by name
```
yb-admin delete_index ysql.namespace_name index_name
yb-admin delete_index ycql.namespace_name index_name
```
Delete index by ID
```
yb-admin delete_index_by_id <id>
```
Note that for YSQL, these commands will only delete data from master and not from postgres. These are only meant to be used to clean up master state in case of cluster errors or inconsistencies.

Test Plan: Tested manually.
Created inconsistent state:
```
yugabyte=# create database nehatest;
ERROR: Already present: Keyspace 'nehatest' already exists
yugabyte=# \l
                                   List of databases
      Name       |  Owner   | Encoding | Collate |    Ctype    |   Access privileges
-----------------+----------+----------+---------+-------------+-----------------------
 postgres        | postgres | UTF8     | C       | en_US.UTF-8 |
 system_platform | postgres | UTF8     | C       | en_US.UTF-8 |
 template0       | postgres | UTF8     | C       | en_US.UTF-8 | =c/postgres          +
                 |          |          |         |             | postgres=CTc/postgres
 template1       | postgres | UTF8     | C       | en_US.UTF-8 | =c/postgres          +
                 |          |          |         |             | postgres=CTc/postgres
 yugabyte        | postgres | UTF8     | C       | en_US.UTF-8 |
(5 rows)

yugabyte=# create database nehatest;
ERROR: Already present: Keyspace 'nehatest' already exists
yugabyte=# drop database nehatest;
ERROR: database "nehatest" does not exist
yugabyte=# create database nehatest;
ERROR: Already present: Keyspace 'nehatest' already exists
```
Ran `yb-admin delete_namespace ysql.nehatest` and namespace was successfully deleted. Also, tested delete table and delete index.

Reviewers: bogdan, mihnea
Reviewed By: mihnea
Subscribers: yql
Differential Revision: https://phabricator.dev.yugabyte.com/D7657
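For scripting the cleanup commands listed in the commit message above, the argument lists can be assembled programmatically. This is a sketch under stated assumptions: the helper names are hypothetical, only the subcommands shown in the commit message are used, and nothing is executed. A real invocation may also need connection options that the commit message does not show.

```python
VALID_DB_TYPES = ("ysql", "ycql")
VALID_KINDS = ("namespace", "table", "index")


def delete_by_name_args(kind, db_type, namespace, name=None):
    """Build the argument list for a name-based cleanup command.

    Mirrors the usage in the commit message, e.g.
    `yb-admin delete_table ycql.namespace_name table_name`.
    Hypothetical helper: it only builds a list and does not run yb-admin.
    """
    if db_type not in VALID_DB_TYPES:
        raise ValueError(f"unknown database type: {db_type}")
    if kind not in VALID_KINDS:
        raise ValueError(f"unknown object kind: {kind}")
    qualified_namespace = f"{db_type}.{namespace}"
    if kind == "namespace":
        return ["yb-admin", "delete_namespace", qualified_namespace]
    if not name:
        raise ValueError(f"a {kind} name is required")
    return ["yb-admin", f"delete_{kind}", qualified_namespace, name]


def delete_by_id_args(kind, object_id):
    """Build the argument list for an id-based cleanup command."""
    if kind not in VALID_KINDS:
        raise ValueError(f"unknown object kind: {kind}")
    return ["yb-admin", f"delete_{kind}_by_id", object_id]


print(" ".join(delete_by_name_args("namespace", "ysql", "nehatest")))
print(" ".join(delete_by_name_args("table", "ycql", "my_keyspace", "sessions")))
```

The keyspace name `my_keyspace` is a placeholder. As the commit message notes, for YSQL these commands only remove state from the master, so they are a recovery tool and not a replacement for `DROP`.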
Reproduced the main issue - stuck keyspace:
Error:
|
Because of race conditions between TServers, the test error can reference either the wrong index (above) or the table:
In this case we have a race between a (slow) DeleteTable from TS-1 and a DeleteNamespace from TS-2. |
The unexpected 'Object Not Found' error from DROP TABLE mentioned above is tracked in #3133. |
Summary:
- Fixed 'Object not found' issue in YBClient::Data::DeleteTable().
- Preventing index attaching to a deleted table. (in CatalogManager::CreateTable)
- Preventing table restoring back to RUNNING state after deleting. (in CatalogManager::AddIndexInfoToTable)
- Added new point into the CQL Server Executor for local table cache clean-up.
- Improved 'yb-admin dump_masters_state' - to be able getting data from SysCatalog into file/to console (current implementation does not allow to get it because the dump length is limited by maximum LOG line length - not too much.. it's not enough for even 1 table.)
- Updated 'yb-admin delete_index' output.

Test Plan:
ybd --java-test org.yb.cql.TestBigNumShards#testDropTableTimeout
ybd --java-test org.yb.cql.TestWithMasterLatency#testDropTableTimeout
ybd --java-test org.yb.cql.TestIndex#testRecreateTable
ybd --cxx-test yb-admin-test --gtest_filter AdminCliTest.TestDeleteIndex

Reviewers: bogdan, mihnea, hector, neha, mikhail
Reviewed By: mikhail
Subscribers: yql
Differential Revision: https://phabricator.dev.yugabyte.com/D7670
Fixed by the commit above. |
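Two of the fixes in that commit (no index attached to a deleted table, and no table restored to RUNNING after deletion) can be illustrated with a toy model. This is not the CatalogManager code, which is C++ and is not shown in this thread. The class, the function names, and the `DELETED` state name are hypothetical; only `RUNNING` appears in the commit message.

```python
from enum import Enum


class TableState(Enum):
    RUNNING = "RUNNING"
    DELETED = "DELETED"  # hypothetical name for the post-drop state


class Table:
    def __init__(self, table_id, name):
        self.table_id = table_id
        self.name = name
        self.state = TableState.RUNNING
        self.indexes = []


def add_index_unguarded(table, index_name):
    """Toy version of the bug: attaching index info rewrites the state."""
    table.indexes.append(index_name)
    table.state = TableState.RUNNING  # brings a dropped table back to life


def add_index_guarded(table, index_name):
    """Toy version of the fix: refuse to touch a table that is not RUNNING."""
    if table.state is not TableState.RUNNING:
        raise RuntimeError(f"table {table.name} ({table.table_id}) is not running")
    table.indexes.append(index_name)


# A late index creation arrives after the table was dropped.
foo = Table("100", "foo")
foo.state = TableState.DELETED
add_index_unguarded(foo, "foo_idx")
print(foo.state.name)  # the dropped table is RUNNING again

bar = Table("101", "bar")
bar.state = TableState.DELETED
try:
    add_index_guarded(bar, "bar_idx")
except RuntimeError as err:
    print("rejected:", err)
print(bar.state.name)  # still DELETED
```

In the unguarded variant the dropped table ends up in a running state with an index attached, which resembles the "phantom" table described in this issue. The guarded variant leaves the dropped table untouched.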
…ion.

Summary: The new java test org.yb.cql.TestBigNumShards#testDropTableTimeout was introduced in the fix for #3032 (D7670): a70ab64

The test creates big number of shards, so it was disabled for TSAN configuration. This diff disables the test for ASAN too, because it fails in ASAN due to timeouts with the error:
com.datastax.driver.core.exceptions.TransportException: [/127.230.226.34:9042] Connection has been closed

Test Plan:
ybd asan --java-test org.yb.cql.TestBigNumShards#testDropTableTimeout
ybd tsan --java-test org.yb.cql.TestBigNumShards#testDropTableTimeout
ybd --java-test org.yb.cql.TestBigNumShards#testDropTableTimeout

Reviewers: mikhail
Reviewed By: mikhail
Subscribers: yql
Differential Revision: https://phabricator.dev.yugabyte.com/D7747
…r stuck table/keyspace.

Summary: This is the backport of this commit by @OlegLoginov in master to the 2.0.5 branch: yugabyte@a70ab64
- Fixed 'Object not found' issue in YBClient::Data::DeleteTable().
- Preventing index attaching to a deleted table. (in CatalogManager::CreateTable)
- Preventing table restoring back to RUNNING state after deleting. (in CatalogManager::AddIndexInfoToTable)
- Added new point into the CQL Server Executor for local table cache clean-up.
- Improved 'yb-admin dump_masters_state' - to be able getting data from SysCatalog into file/to console (current implementation does not allow to get it because the dump length is limited by maximum LOG line length - not too much.. it's not enough for even 1 table.)
- Updated 'yb-admin delete_index' output.

Test Plan: Jenkins: skip

Reviewers: mihnea
Subscribers: yql, bogdan
Differential Revision: https://phabricator.dev.yugabyte.com/D7869
…table/keyspace.

Summary: This is the backport of the following commit by @OlegLoginov from master to the 2.0.5 branch: a70ab64
Original revision in master: https://phabricator.dev.yugabyte.com/D7670
- Fixed 'Object not found' issue in YBClient::Data::DeleteTable().
- Preventing index attaching to a deleted table. (in CatalogManager::CreateTable)
- Preventing table restoring back to RUNNING state after deleting. (in CatalogManager::AddIndexInfoToTable)
- Added new point into the CQL Server Executor for local table cache clean-up.
- Improved 'yb-admin dump_masters_state' - to be able getting data from SysCatalog into file/to console (current implementation does not allow to get it because the dump length is limited by maximum LOG line length - not too much.. it's not enough for even 1 table.)
- Updated 'yb-admin delete_index' output.

Test Plan: Jenkins: skip

Reviewers: oleg, mihnea
Reviewed By: mihnea
Subscribers: jenkins-bot, yql, bogdan
Differential Revision: https://phabricator.dev.yugabyte.com/D7869
…r stuck table/keyspace.

Summary: This is the backport of the following commit by @OlegLoginov from master to the 2.0.5 branch: yugabyte@a70ab64
Original revision in master: https://phabricator.dev.yugabyte.com/D7670
- Fixed 'Object not found' issue in YBClient::Data::DeleteTable().
- Preventing index attaching to a deleted table. (in CatalogManager::CreateTable)
- Preventing table restoring back to RUNNING state after deleting. (in CatalogManager::AddIndexInfoToTable)
- Added new point into the CQL Server Executor for local table cache clean-up.
- Improved 'yb-admin dump_masters_state' - to be able getting data from SysCatalog into file/to console (current implementation does not allow to get it because the dump length is limited by maximum LOG line length - not too much.. it's not enough for even 1 table.)
- Updated 'yb-admin delete_index' output.

Test Plan: Jenkins: skip

Reviewers: oleg, mihnea
Reviewed By: mihnea
Subscribers: jenkins-bot, yql, bogdan
Differential Revision: https://phabricator.dev.yugabyte.com/D7869
…stDropTableTimeout.

Summary: The test `org.yb.cql.TestBigNumShards#testDropTableTimeout` sets `NumShardsPerTServer` == 32. For 1 table + 5 indexes it created 192 tablets. With that many tablets, Jenkins can occasionally fail with a timeout. The large number of tablets was a way to reproduce cases where the DROP TABLE happens before the CREATE TABLE/INDEX has finished. For details, see:
- GH: #3032
- Diff: https://phabricator.dev.yugabyte.com/D7670
- Commit: a70ab64

The current fix keeps the test for a large number of shards, but the number of tables (and indexes) is reduced to 2. The slow CREATE TABLE test case is re-implemented via `TEST_simulate_slow_table_create_secs`: `org.yb.cql.TestMasterLatency#testSlowCreateDropIndex`.

Test Plan:
ybd --java-test org.yb.cql.TestBigNumShards#testCreateDropTable --tp 1 -n 10
ybd --java-test org.yb.cql.TestMasterLatency#testSlowCreateDropTable --tp 1 -n 10
ybd --java-test org.yb.cql.TestMasterLatency#testSlowCreateDropIndex --tp 1 -n 10
ybd --java-test org.yb.cql.TestSlowCreateTable#testCreateTableTimeout --tp 1 -n 10

Reviewers: timur, amitanand
Reviewed By: amitanand
Subscribers: yql
Differential Revision: https://phabricator.dev.yugabyte.com/D12427
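The tablet count quoted in that summary follows from simple multiplication. The sketch below assumes, as the quoted figures imply, that each table and each index gets 32 tablets; the function name is hypothetical and the actual tablet count in a cluster may depend on other settings not discussed here.

```python
def total_tablets(num_tables, num_indexes, tablets_per_object):
    """Tablets created when every table and index is split the same way.

    Assumption for this sketch: tables and indexes each get
    `tablets_per_object` tablets.
    """
    return (num_tables + num_indexes) * tablets_per_object


# 1 table + 5 indexes at 32 tablets each, as in the commit message.
print(total_tablets(1, 5, 32))  # 192
```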
I created a new keyspace and then created table `sessions` in it, with a couple of indexes. After that, I tried to delete it, but something strange happened: the table is stuck in a half-deleted state and I can neither use nor delete it. This is what I get:
Desc tables sees it:
I can describe it:
If I try to drop it, it says it doesn't exist:
If I try to query it, it doesn't exist:
If I try to create the same table again, it creates it!
No error! But the keys in table description are duplicated now:
And I can query and drop this duplicate table now. But once I drop it, the phantom table is back. Also, I cannot drop the keyspace: