-
Notifications
You must be signed in to change notification settings - Fork 621
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KIKIMR-20543: ddl+dml in query service #1444
Conversation
⚪
|
⚪
|
bool Insert(const TKqpCompileResult::TConstPtr& compileResult, bool isEnableAstCache) { | ||
InsertQuery(compileResult); | ||
bool Insert(const TKqpCompileResult::TConstPtr& compileResult, bool isEnableAstCache, bool isPerStatement) { | ||
if (!isPerStatement) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What will be with cache if we enable this new feature? They will not cache or they will be cached another way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We won't save compilation result for statements by text, because all statements have the same query text, we will save their compilation results by ast
⚪
|
⚪
|
@@ -14,21 +14,23 @@ namespace NKikimr::NKqp::NPrivateEvents { | |||
|
|||
struct TEvCompileRequest: public TEventLocal<TEvCompileRequest, TKqpEvents::EvCompileRequest> { | |||
TEvCompileRequest(const TIntrusiveConstPtr<NACLib::TUserToken>& userToken, const TMaybe<TString>& uid, | |||
TMaybe<TKqpQueryId>&& query, bool keepInCache, TInstant deadline, | |||
TMaybe<TKqpQueryId>&& query, bool keepInCache, bool canDivideIntoStatements, TInstant deadline, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Время завести builder? :)
YQL_ENSURE(queryAst.Ast); | ||
YQL_ENSURE(queryAst.Ast->IsOk()); | ||
YQL_ENSURE(queryAst.Ast->Root); | ||
auto compileResult = QueryCache.FindByAst(compileRequest.Query, *queryAst.Ast, compileRequest.KeepInCache); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Я не очень знаком с FindByAst, а зачем нам тут и дескриптор запроса (compileRequest.Query) и сам AST?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Там же еще информация есть про базу данных, параметры, кластер. Нам же не достаточно только ast (как и текст) проверить?
@@ -928,6 +945,7 @@ class TKqpSessionActor : public TActorBootstrapped<TKqpSessionActor> { | |||
|
|||
bool ExecutePhyTx(const TKqpPhyTxHolder::TConstPtr& tx, bool commit) { | |||
if (tx) { | |||
QueryState->PrepareStatementTransaction(tx->GetType()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Вот тут не очень очевидно как это работает в случае, если не per-statement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Там внутри функции есть проверка на per-statement, иначе ничего не делается, но можно вынести на уровень повыше - сюда
TxCtx->EffectiveIsolationLevel = NKikimrKqp::ISOLATION_LEVEL_UNDEFINED; | ||
break; | ||
default: | ||
Commit = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
А вот тут не очень очевидно. Смотри, если мы говорим что у нас per-statement транзакция, то она может состоять из более чем одной PhyTx. Например:
UPSERT INTO ... SELECT * FROM ...
Вот тут будет две PhyTx, при этом это один стейтмент. Т.е. коммит должен идти не после каждой PhyTx, а в конце.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Проверила в логах, что коммитится действительно в конце стейтмента, но пока не понимаю, как бы это тестом покрыть
@@ -216,6 +216,35 @@ void PGParse(const TString& input, IPGParseEvents& events) { | |||
} | |||
} | |||
|
|||
List* PGGetStatements(const TString& input) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TArenaMemoryContext holds ownership of returned List*, so it's use after free
Please use PGParse method above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fixed, could you please check it?
⚪
|
⚪
|
@@ -107,6 +107,7 @@ namespace NSQLTranslation { | |||
bool AssumeYdbOnClusterWithSlash; | |||
TString DynamicClusterProvider; | |||
TString FileAliasPrefix; | |||
bool PerStatementExecution = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why this is needed?
API is different - PGToYql/PGToYqlStatements, so an internal flag could be passed instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This param is needed in PG parser, because parsing with per statement mode or usual mode is inside in converter in method OnResult, converter is created with this settings, so because of this i put this param in settings, but it can be better to put this param in converter, not in setting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please move this setting to converter
⚪
|
⚪
|
@@ -143,16 +150,19 @@ std::unique_ptr<TEvKqp::TEvCompileRequest> TKqpQueryState::BuildCompileRequest(s | |||
case NKikimrKqp::QUERY_ACTION_PREPARE: | |||
query = TKqpQueryId(Cluster, Database, GetQuery(), settings, GetQueryParameterTypes()); | |||
keepInCache = query->IsSql(); | |||
perStatementResult = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So there will be a difference between two ways of sql execution:
- ExecuteQuery with QUERY_ACTION_EXECUTE
- ExecuteQuery with QUERY_ACTION_PREPARE and then QUERY_ACTION_EXECUTE_PREPARED
I understand that we can't execute DDL and DML in one SQL without compiling queries after execution of previous statements (fresh tables are in DMLs). But do we realize what exactly will be different in usual DML queries?
TVector<NYql::TAstParseResult> result; | ||
NYql::TIssues issues; | ||
TTranslationSettings parsedSettings(settings); | ||
google::protobuf::Arena arena; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same comment about arena
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case all protos that were constructed with this arena aren't returned outward, they are constructed and used in function SqlToAstStatements that called below
@@ -2660,6 +2660,75 @@ TNodePtr TSqlQuery::Build(const TSQLv1ParserAST& ast) { | |||
WarnUnusedNodes(); | |||
return result; | |||
} | |||
|
|||
TNodePtr TSqlQuery::Build(const std::vector<::NSQLv1Generated::TRule_sql_stmt_core>& statements) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we sure that it's better to split query into statements here in parser? Did you check what we have in several statements query after pasing now? Maybe it is more convenient to split it into statements when we have TNodePtr pointing to root of the query?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already have functionality with taking ast before compilation and checking it in the ast cache, and also we need PgAutoParams, that contains in AstResult class, so it was easier for me to do it in this way.
For query
$a = (SELECT * FROM TestDdl1);
UPSERT INTO TestDdl1 (Key, Value) VALUES (3, "Three");
SELECT * FROM $a;
ast without per-statements mode looks:
(
(import aggregate_module "/lib/yql/aggregate.yql")
(import window_module "/lib/yql/window.yql")
(import core_module "/lib/yql/core.yql")
(let subquerynode0 (block (
(let x (Read! world (DataSource "kikimr" "db") (Key (table (String "/Root/TestDdl1"))) (Void) ()))
(let world (Left! x))
(let table0 (Right! x))
(return (world (block (
(let select (block ((let core table0) (let core (PersistableRepr (block ((let projectCoreType (TypeOf core)) (let core (SqlProject core ((SqlProjectStarItem projectCoreType "" (lambda (row) (block ((let res row) (return res)))) ())))) (return core))))) (return core))))
(let select (RemoveSystemMembers select)) (return select))))))))
(let world (Nth subquerynode0 0))
(let subquery0 (Nth subquerynode0 1))
(let world (block (
(let values (PersistableRepr ((AsStruct ("Key" (Int32 "3")) ("Value" (String "Three"))))))
(let world (block (
(let sink (DataSink "kikimr" "db"))
(let world (Write! world sink (Key (table (String "/Root/TestDdl1"))) values ((mode upsert))))
(return world))))
(return world))))
(let world (block (
(let output (block ((let select (block ((let core subquery0) (let core (PersistableRepr (block ((let projectCoreType (TypeOf core)) (let core (SqlProject core ((SqlProjectStarItem projectCoreType "" (lambda (row) (block ((let res row) (return res)))) ())))) (return core))))) (return core)))) (let select (RemoveSystemMembers select)) (return select))))
(let world (block (
(let result_sink (DataSink result))
(let world (Write! world result_sink (Key) output ((type) (autoref))))
(return (Commit! world result_sink)))))
(return world))))
(return world)
)
and with per-statements mode:
1)
(
(import aggregate_module "/lib/yql/aggregate.yql")
(import window_module "/lib/yql/window.yql")
(import core_module "/lib/yql/core.yql")
(let world (block (
(let values (PersistableRepr ((AsStruct ("Key" (Int32 "3")) ("Value" (String "Three"))))))
(let world (block (
(let sink (DataSink "kikimr" "db"))
(let world (Write! world sink (Key (table (String "/Root/TestDdl1"))) values ((mode upsert))))
(return world))))
(return world))))
(return world)
)
2)
(
(import aggregate_module "/lib/yql/aggregate.yql")
(import window_module "/lib/yql/window.yql")
(import core_module "/lib/yql/core.yql")
(let subquerynode0 (block (
(let x (Read! world (DataSource "kikimr" "db") (Key (table (String "/Root/TestDdl1"))) (Void) ()))
(let world (Left! x))
(let table0 (Right! x))
(return (world (block (
(let select (block ((let core table0) (let core (PersistableRepr (block ((let projectCoreType (TypeOf core)) (let core (SqlProject core ((SqlProjectStarItem projectCoreType "" (lambda (row) (block ((let res row) (return res)))) ())))) (return core))))) (return core))))
(let select (RemoveSystemMembers select))
(return select))))))))
(let world (Nth subquerynode0 0))
(let subquery0 (Nth subquerynode0 1))
(let world (block (
(let output (block ((let select (block ((let core subquery0) (let core (PersistableRepr (block ((let projectCoreType (TypeOf core)) (let core (SqlProject core ((SqlProjectStarItem projectCoreType "" (lambda (row) (block ((let res row) (return res)))) ())))) (return core))))) (return core)))) (let select (RemoveSystemMembers select)) (return select))))
(let world (block (
(let result_sink (DataSink result))
(let world (Write! world result_sink (Key) output ((type) (autoref))))
(return (Commit! world result_sink)))))
(return world))))
(return world)
)
I think that it is easier to build right ast in the beginning than rebuild this ast to several right asts with needed params, named nodes and so on, but of course we will have some problem that we need to support statement splitting in several parsers
⚪
|
⚪
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks to me that we have discussed and resolved all important points
⚪
|
⚪
|
Changelog entry
Add ability to use mixed ddl and dml queries in the ExecuteQuery handle in the query service
Changelog category
Additional information