diff --git a/docs/dev/Architecture.md b/docs/dev/Architecture.md new file mode 100644 index 0000000000..07c475e49f --- /dev/null +++ b/docs/dev/Architecture.md @@ -0,0 +1,39 @@ +# OpenDistro SQL Engine Architecture + +--- +## 1. Overview + +The OpenDistro SQL (OD-SQL) project is based on the NLPChina project (https://github.com/NLPchina/elasticsearch-sql), which is now deprecated ([attributions](https://github.com/opendistro-for-elasticsearch/sql/blob/master/docs/attributions.md)). Over a year of development, many features have been added to the OD-SQL project on top of the older NLPChina project. The purpose of this document is to explain the current OD-SQL architecture going forward. + +--- +## 2. High Level View + +At a high level, the OD-SQL engine can be divided into four major sub-modules. + +* *Parser*: Currently, two Lex&Parsers coexist. The Druid Lex&Parser is the original one from NLPChina, and it produces the input AST of the Core Engine. The [ANTLR](https://github.com/opendistro-for-elasticsearch/sql/blob/master/src/main/antlr/OpenDistroSqlParser.g4) Lex&Parser was added by us to customize verification and exception handling. +* *Analyzer*: The analyzer module takes the output of the ANTLR Lex&Parser and performs syntax and semantic analysis. +* *Core Engine*: The QueryAction takes the output of the Druid Lex&Parser and translates it to Elasticsearch DSL if possible. This is an original NLPChina module. The QueryPlanner Builder was added by us to support JOIN and post-processing logic. The QueryPlanner takes the output of the Druid Lex&Parser and builds the PhysicalPlan. +* *Execution*: The execution module executes the QueryAction or QueryPlanner and returns the response to the client. The Frontend, Analyzer and Core Engine run on the transport thread and cannot perform blocking operations. 
The Execution module, by contrast, runs on the client thread pool and can perform blocking operations. + +There are also other modules included in the OD-SQL engine. + +* _Documentation_: used to auto-generate documentation. +* _Metrics_: used to collect OD-SQL related metrics. +* _Resource Manager_: used to monitor memory consumption during join operations to avoid affecting Elasticsearch availability. + +![Architecture Overview](img/architecture-overview.png) + +--- +## 3. Journey of the query in OD-SQL engine + +The following diagram takes a sample query and explains how the query flows through the different modules. + +![Architecture Journey](img/architecture-journey.png) + +1. The ANTLR parser, which is based on the grammar file (https://github.com/opendistro-for-elasticsearch/sql/blob/master/src/main/antlr/OpenDistroSqlParser.g4), generates the AST. +2. The Syntax and Semantic Analyzer walks through the AST and verifies whether the query follows the grammar and is supported by OD-SQL. For example, `SELECT * FROM semantics WHERE LOG(age, city) = 1` throws an exception with the message `Function [LOG] cannot work with [INTEGER, KEYWORD].` and the sample usage message `Usage: LOG(NUMBER T) → DOUBLE`. +3. The Druid Lex&Parser takes the input query and generates the Druid AST, which is different from the AST generated by ANTLR. This module is the open source library (https://github.com/alibaba/druid) originally used by NLPChina. +4. The QueryPlanner Builder takes the AST as input and generates the LogicalPlan from it, then optimizes the LogicalPlan into a PhysicalPlan (in the current implementation, only a rule-based model is implemented). Most of the PhysicalPlan generation uses NLPChina’s original logic to translate the SQL expressions in the AST to Elasticsearch DSL. +5. The QueryPlanner executor executes the PhysicalPlan in a worker thread. +6. The formatter reformats the response data into the required format. The default format is JDBC. 
+ diff --git a/docs/dev/img/architecture-journey.png b/docs/dev/img/architecture-journey.png new file mode 100644 index 0000000000..71f355f080 Binary files /dev/null and b/docs/dev/img/architecture-journey.png differ diff --git a/docs/dev/img/architecture-overview.png b/docs/dev/img/architecture-overview.png new file mode 100644 index 0000000000..cbed41bf42 Binary files /dev/null and b/docs/dev/img/architecture-overview.png differ diff --git a/docs/developing.rst b/docs/developing.rst index 4f2c435ff9..aadf13330d 100644 --- a/docs/developing.rst +++ b/docs/developing.rst @@ -58,7 +58,7 @@ If there is update in master or you want to keep the forked repository long livi After getting the source code as well as Elasticsearch and Kibana, your workspace layout may look like this:: - $ make opendistro + $ mkdir opendistro $ cd opendistro $ ls -la total 32 diff --git a/docs/user/admin/monitoring.rst b/docs/user/admin/monitoring.rst index b8c3626181..32d588d70b 100644 --- a/docs/user/admin/monitoring.rst +++ b/docs/user/admin/monitoring.rst @@ -52,7 +52,7 @@ Result set:: "failed_request_count_cb" : 0, "failed_request_count_cuserr" : 0, "circuit_breaker" : 0, - "request_total" : 0, + "request_total" : 49, "request_count" : 0, "failed_request_count_syserr" : 0 } diff --git a/docs/user/beyond/fulltext.rst b/docs/user/beyond/fulltext.rst new file mode 100644 index 0000000000..a69e4e92c6 --- /dev/null +++ b/docs/user/beyond/fulltext.rst @@ -0,0 +1,517 @@ + +================ +Full-text Search +================ + +.. rubric:: Table of contents + +.. contents:: + :local: + :depth: 2 + +Introduction +============ + +Full-text search looks for matches within the text of each stored document, as opposed to a regular search that compares against the original values stored in the database. It tries to match the search criteria by examining all of the words in each document. In Elasticsearch, full-text queries enable you to search text fields that are analyzed during indexing. 
+ +Match Query +=========== + +Description +----------- + +Match query is the standard query for full-text search in Elasticsearch. Both ``MATCHQUERY`` and ``MATCH_QUERY`` are functions for performing match query. + +Example 1 +--------- + +Both functions can accept field name as first argument and a text as second argument. + +SQL query:: + + POST /_opendistro/_sql + { + "query" : """ + SELECT account_number, address + FROM accounts + WHERE MATCH_QUERY(address, 'Holmes') + """ + } + +Explain:: + + { + "from" : 0, + "size" : 200, + "query" : { + "bool" : { + "filter" : [ + { + "bool" : { + "must" : [ + { + "match" : { + "address" : { + "query" : "Holmes", + "operator" : "OR", + "prefix_length" : 0, + "max_expansions" : 50, + "fuzzy_transpositions" : true, + "lenient" : false, + "zero_terms_query" : "NONE", + "auto_generate_synonyms_phrase_query" : true, + "boost" : 1.0 + } + } + } + ], + "adjust_pure_negative" : true, + "boost" : 1.0 + } + } + ], + "adjust_pure_negative" : true, + "boost" : 1.0 + } + }, + "_source" : { + "includes" : [ + "account_number", + "address" + ], + "excludes" : [ ] + } + } + +Result set: + ++--------------+---------------+ +|account_number| address| ++==============+===============+ +| 1|880 Holmes Lane| ++--------------+---------------+ + + +Example 2 +--------- + +Both functions can also accept single argument and be used in the following manner. 
+ +SQL query:: + + POST /_opendistro/_sql + { + "query" : """ + SELECT account_number, address + FROM accounts + WHERE address = MATCH_QUERY('Holmes') + """ + } + +Explain:: + + { + "from" : 0, + "size" : 200, + "query" : { + "bool" : { + "filter" : [ + { + "bool" : { + "must" : [ + { + "match" : { + "address" : { + "query" : "Holmes", + "operator" : "OR", + "prefix_length" : 0, + "max_expansions" : 50, + "fuzzy_transpositions" : true, + "lenient" : false, + "zero_terms_query" : "NONE", + "auto_generate_synonyms_phrase_query" : true, + "boost" : 1.0 + } + } + } + ], + "adjust_pure_negative" : true, + "boost" : 1.0 + } + } + ], + "adjust_pure_negative" : true, + "boost" : 1.0 + } + }, + "_source" : { + "includes" : [ + "account_number", + "address" + ], + "excludes" : [ ] + } + } + +Result set: + ++--------------+---------------+ +|account_number| address| ++==============+===============+ +| 1|880 Holmes Lane| ++--------------+---------------+ + + +Multi-match Query +================= + +Description +----------- + +Besides match query against a single field, you can search for a text with multiple fields. Function ``MULTI_MATCH``, ``MULTIMATCH`` and ``MULTIMATCHQUERY`` are provided for this. + +Example +------- + +Each preceding function accepts ``query`` for a text and ``fields`` for field names or pattern that the text given is searched against. For example, the following query is searching for documents in index accounts with 'Dale' as either firstname or lastname. 
+ +SQL query:: + + POST /_opendistro/_sql + { + "query" : """ + SELECT firstname, lastname + FROM accounts + WHERE MULTI_MATCH('query'='Dale', 'fields'='*name') + """ + } + +Explain:: + + { + "from" : 0, + "size" : 200, + "query" : { + "bool" : { + "filter" : [ + { + "bool" : { + "must" : [ + { + "multi_match" : { + "query" : "Dale", + "fields" : [ + "*name^1.0" + ], + "type" : "best_fields", + "operator" : "OR", + "slop" : 0, + "prefix_length" : 0, + "max_expansions" : 50, + "zero_terms_query" : "NONE", + "auto_generate_synonyms_phrase_query" : true, + "fuzzy_transpositions" : true, + "boost" : 1.0 + } + } + ], + "adjust_pure_negative" : true, + "boost" : 1.0 + } + } + ], + "adjust_pure_negative" : true, + "boost" : 1.0 + } + }, + "_source" : { + "includes" : [ + "firstname", + "lastname" + ], + "excludes" : [ ] + } + } + +Result set: + ++---------+--------+ +|firstname|lastname| ++=========+========+ +| Dale| Adams| ++---------+--------+ + + +Query String Query +================== + +Description +----------- + +Query string query parses and splits a query string provided based on Lucene query string syntax. The mini language supports logical connectives, wildcard, regex and proximity search. Please refer to official documentation for more details. Note that an error is thrown in the case of any invalid syntax in query string. + +Example +------- + +``QUERY`` function accepts query string and returns true or false respectively for document that matches the query string or not. 
+ +SQL query:: + + POST /_opendistro/_sql + { + "query" : """ + SELECT account_number, address + FROM accounts + WHERE QUERY('address:Lane OR address:Street') + """ + } + +Explain:: + + { + "from" : 0, + "size" : 200, + "query" : { + "bool" : { + "filter" : [ + { + "bool" : { + "must" : [ + { + "query_string" : { + "query" : "address:Lane OR address:Street", + "fields" : [ ], + "type" : "best_fields", + "default_operator" : "or", + "max_determinized_states" : 10000, + "enable_position_increments" : true, + "fuzziness" : "AUTO", + "fuzzy_prefix_length" : 0, + "fuzzy_max_expansions" : 50, + "phrase_slop" : 0, + "escape" : false, + "auto_generate_synonyms_phrase_query" : true, + "fuzzy_transpositions" : true, + "boost" : 1.0 + } + } + ], + "adjust_pure_negative" : true, + "boost" : 1.0 + } + } + ], + "adjust_pure_negative" : true, + "boost" : 1.0 + } + }, + "_source" : { + "includes" : [ + "account_number", + "address" + ], + "excludes" : [ ] + } + } + +Result set: + ++--------------+------------------+ +|account_number| address| ++==============+==================+ +| 1| 880 Holmes Lane| ++--------------+------------------+ +| 6|671 Bristol Street| ++--------------+------------------+ +| 13|789 Madison Street| ++--------------+------------------+ + + +Match Phrase Query +================== + +Description +----------- + +Match phrase query is similar to match query but it is used for matching exact phrases. ``MATCHPHRASE``, ``MATCH_PHRASE`` and ``MATCHPHRASEQUERY`` are provided for this purpose. 
+ +Example +------- + +SQL query:: + + POST /_opendistro/_sql + { + "query" : """ + SELECT account_number, address + FROM accounts + WHERE MATCH_PHRASE(address, '880 Holmes Lane') + """ + } + +Explain:: + + { + "from" : 0, + "size" : 200, + "query" : { + "bool" : { + "filter" : [ + { + "bool" : { + "must" : [ + { + "match_phrase" : { + "address" : { + "query" : "880 Holmes Lane", + "slop" : 0, + "zero_terms_query" : "NONE", + "boost" : 1.0 + } + } + } + ], + "adjust_pure_negative" : true, + "boost" : 1.0 + } + } + ], + "adjust_pure_negative" : true, + "boost" : 1.0 + } + }, + "_source" : { + "includes" : [ + "account_number", + "address" + ], + "excludes" : [ ] + } + } + +Result set: + ++--------------+---------------+ +|account_number| address| ++==============+===============+ +| 1|880 Holmes Lane| ++--------------+---------------+ + + +Score Query +=========== + +Description +----------- + +Elasticsearch supports wrapping a filter query so that a relevance score is returned along with every matching document. ``SCORE``, ``SCOREQUERY`` and ``SCORE_QUERY`` can be used for this. + +Example +------- + +The first argument is a match query expression and the second, optional argument is a floating point number that boosts the score. The default value is 1.0. In addition, an implicit variable ``_score`` is available so that you can return the score of each document or use it for sorting. 
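As a rough illustration of how the boosts combine, the sketch below models each ``SCORE(...)`` clause as a constant score and adds up the clauses joined by ``OR``. It is a simplified model written for this document, not the plugin's or Elasticsearch's implementation: the helper names ``constant_score`` and ``score`` are hypothetical, the whitespace split stands in for real text analysis, and the summation assumes the usual ``bool``/``should`` behaviour of adding the scores of matching clauses.

```python
def constant_score(matches, boost=1.0):
    """Contribute the boost when the wrapped filter matched, else nothing."""
    return boost if matches else 0.0


def score(address):
    # Crude stand-in for text analysis: lowercase and split on whitespace.
    terms = address.lower().split()
    lane = constant_score("lane" in terms, boost=0.5)
    street = constant_score("street" in terms, boost=100.0)
    # OR is modelled as a sum over the matching clauses (assumption).
    return lane + street


# Sample addresses keyed by account number (illustrative data).
addresses = {
    1: "880 Holmes Lane",
    6: "671 Bristol Street",
    13: "789 Madison Street",
}
scores = {number: score(text) for number, text in addresses.items()}

# ORDER BY _score sorts ascending by default.
ranked = sorted(scores.items(), key=lambda item: item[1])
print(ranked)
```

Under these assumptions an address containing only "Lane" scores 0.5 and one containing only "Street" scores 100, so the "Lane" document sorts first.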
+ +SQL query:: + + POST /_opendistro/_sql + { + "query" : """ + SELECT account_number, address, _score + FROM accounts + WHERE SCORE(MATCH_QUERY(address, 'Lane'), 0.5) OR + SCORE(MATCH_QUERY(address, 'Street'), 100) + ORDER BY _score + """ + } + +Explain:: + + { + "from" : 0, + "size" : 200, + "query" : { + "bool" : { + "must" : [ + { + "bool" : { + "should" : [ + { + "constant_score" : { + "filter" : { + "match" : { + "address" : { + "query" : "Lane", + "operator" : "OR", + "prefix_length" : 0, + "max_expansions" : 50, + "fuzzy_transpositions" : true, + "lenient" : false, + "zero_terms_query" : "NONE", + "auto_generate_synonyms_phrase_query" : true, + "boost" : 1.0 + } + } + }, + "boost" : 0.5 + } + }, + { + "constant_score" : { + "filter" : { + "match" : { + "address" : { + "query" : "Street", + "operator" : "OR", + "prefix_length" : 0, + "max_expansions" : 50, + "fuzzy_transpositions" : true, + "lenient" : false, + "zero_terms_query" : "NONE", + "auto_generate_synonyms_phrase_query" : true, + "boost" : 1.0 + } + } + }, + "boost" : 100.0 + } + } + ], + "adjust_pure_negative" : true, + "boost" : 1.0 + } + } + ], + "adjust_pure_negative" : true, + "boost" : 1.0 + } + }, + "_source" : { + "includes" : [ + "account_number", + "address", + "_score" + ], + "excludes" : [ ] + }, + "sort" : [ + { + "_score" : { + "order" : "asc" + } + } + ] + } + +Result set: + ++--------------+------------------+------+ +|account_number| address|_score| ++==============+==================+======+ +| 1| 880 Holmes Lane| 0.5| ++--------------+------------------+------+ +| 6|671 Bristol Street| 100| ++--------------+------------------+------+ +| 13|789 Madison Street| 100| ++--------------+------------------+------+ + + diff --git a/docs/user/beyond/partiql.rst b/docs/user/beyond/partiql.rst new file mode 100644 index 0000000000..611ae72c54 --- /dev/null +++ b/docs/user/beyond/partiql.rst @@ -0,0 +1,295 @@ + +====================== +PartiQL (JSON) Support +====================== + +.. 
rubric:: Table of contents + +.. contents:: + :local: + :depth: 2 + +Introduction +============ + +PartiQL is a SQL-compatible query language that makes it easy and efficient to query semi-structured and nested data regardless of data format. For now our implementation is only partially compatible with PartiQL specification and more support will be provided in future. + +Test Data +========= + +Description +----------- + +The test index ``employees_nested`` used by all examples in this document is very similar to the one used in official PartiQL documentation. + +Example: Employees +------------------ + +Result set:: + + { + "employees" : [ + { + "id" : 3, + "name" : "Bob Smith", + "title" : null, + "projects" : [ + { + "name" : "AWS Redshift Spectrum querying", + "started_year" : 1990 + }, + { + "name" : "AWS Redshift security", + "started_year" : 1999 + }, + { + "name" : "AWS Aurora security", + "started_year" : 2015 + } + ] + }, + { + "id" : 4, + "name" : "Susan Smith", + "title" : "Dev Mgr", + "projects" : [ ] + }, + { + "id" : 6, + "name" : "Jane Smith", + "title" : "Software Eng 2", + "projects" : [ + { + "name" : "AWS Redshift security", + "started_year" : 1998 + }, + { + "name" : "AWS Hello security", + "started_year" : 2015, + "address" : [ + { + "city" : "Dallas", + "state" : "TX" + } + ] + } + ] + } + ] + } + +Querying Nested Collection +========================== + +Description +----------- + +In SQL-92, a database table can only have tuples that consists of scalar values. PartiQL extends SQL-92 to allow you query and unnest nested collection conveniently. In Elasticsearch world, this is very useful for index with object or nested field. + +Example 1: Unnesting a Nested Collection +---------------------------------------- + +In the following example, it finds nested document (project) with field value (name) that satisfies the predicate (contains 'security'). 
Note that because each parent document can have more than one nested document, the matched nested documents are flattened. In other words, the final result is the Cartesian product between parent and nested documents. + +SQL query:: + + POST /_opendistro/_sql + { + "query" : """ + SELECT e.name AS employeeName, + p.name AS projectName + FROM employees_nested AS e, + e.projects AS p + WHERE p.name LIKE '%security%' + """ + } + +Explain:: + + { + "from" : 0, + "size" : 200, + "query" : { + "bool" : { + "filter" : [ + { + "bool" : { + "must" : [ + { + "nested" : { + "query" : { + "wildcard" : { + "projects.name" : { + "wildcard" : "*security*", + "boost" : 1.0 + } + } + }, + "path" : "projects", + "ignore_unmapped" : false, + "score_mode" : "none", + "boost" : 1.0, + "inner_hits" : { + "ignore_unmapped" : false, + "from" : 0, + "size" : 3, + "version" : false, + "seq_no_primary_term" : false, + "explain" : false, + "track_scores" : false, + "_source" : { + "includes" : [ + "projects.name" + ], + "excludes" : [ ] + } + } + } + } + ], + "adjust_pure_negative" : true, + "boost" : 1.0 + } + } + ], + "adjust_pure_negative" : true, + "boost" : 1.0 + } + }, + "_source" : { + "includes" : [ + "name" + ], + "excludes" : [ ] + } + } + +Result set: + ++------------+---------------------+ +|employeeName| projectName| ++============+=====================+ +| Bob Smith| AWS Aurora security| ++------------+---------------------+ +| Bob Smith|AWS Redshift security| ++------------+---------------------+ +| Jane Smith| AWS Hello security| ++------------+---------------------+ +| Jane Smith|AWS Redshift security| ++------------+---------------------+ + + +Example 2: Unnesting in Existential Subquery +-------------------------------------------- + +Alternatively, a nested collection can be unnested in a subquery to check whether it satisfies a condition. 
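The difference between flattening a nested collection and merely testing it can be sketched in plain Python. This only models the semantics on a trimmed copy of the test data; it says nothing about how the plugin executes the query, and ``is_security`` is a hypothetical helper standing in for ``LIKE '%security%'``.

```python
# Trimmed copy of the employees_nested test data (names and projects only).
employees = [
    {"name": "Bob Smith", "projects": [
        {"name": "AWS Redshift Spectrum querying"},
        {"name": "AWS Redshift security"},
        {"name": "AWS Aurora security"},
    ]},
    {"name": "Susan Smith", "projects": []},
    {"name": "Jane Smith", "projects": [
        {"name": "AWS Redshift security"},
        {"name": "AWS Hello security"},
    ]},
]


def is_security(project):
    # Stand-in for the predicate p.name LIKE '%security%'.
    return "security" in project["name"]


# FROM employees_nested AS e, e.projects AS p: one row per matching pair.
unnested = [
    (e["name"], p["name"])
    for e in employees
    for p in e["projects"]
    if is_security(p)
]

# WHERE EXISTS (...): one row per parent with at least one matching project.
existential = [
    e["name"]
    for e in employees
    if any(is_security(p) for p in e["projects"])
]

print(len(unnested), existential)
```

Unnesting yields one row per parent and matching project pair, whereas the existential form yields each qualifying parent once.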
+ +SQL query:: + + POST /_opendistro/_sql + { + "query" : """ + SELECT e.name AS employeeName + FROM employees_nested AS e + WHERE EXISTS ( + SELECT * + FROM e.projects AS p + WHERE p.name LIKE '%security%' + ) + """ + } + +Explain:: + + { + "from" : 0, + "size" : 200, + "query" : { + "bool" : { + "filter" : [ + { + "bool" : { + "must" : [ + { + "nested" : { + "query" : { + "bool" : { + "must" : [ + { + "bool" : { + "must" : [ + { + "bool" : { + "must_not" : [ + { + "bool" : { + "must_not" : [ + { + "exists" : { + "field" : "projects", + "boost" : 1.0 + } + } + ], + "adjust_pure_negative" : true, + "boost" : 1.0 + } + } + ], + "adjust_pure_negative" : true, + "boost" : 1.0 + } + }, + { + "wildcard" : { + "projects.name" : { + "wildcard" : "*security*", + "boost" : 1.0 + } + } + } + ], + "adjust_pure_negative" : true, + "boost" : 1.0 + } + } + ], + "adjust_pure_negative" : true, + "boost" : 1.0 + } + }, + "path" : "projects", + "ignore_unmapped" : false, + "score_mode" : "none", + "boost" : 1.0 + } + } + ], + "adjust_pure_negative" : true, + "boost" : 1.0 + } + } + ], + "adjust_pure_negative" : true, + "boost" : 1.0 + } + }, + "_source" : { + "includes" : [ + "name" + ], + "excludes" : [ ] + } + } + +Result set: + ++------------+ +|employeeName| ++============+ +| Bob Smith| ++------------+ +| Jane Smith| ++------------+ + + diff --git a/docs/user/dml/delete.rst b/docs/user/dml/delete.rst new file mode 100644 index 0000000000..3d7eb1ee87 --- /dev/null +++ b/docs/user/dml/delete.rst @@ -0,0 +1,87 @@ + +================ +DELETE Statement +================ + +.. rubric:: Table of contents + +.. contents:: + :local: + :depth: 2 + + +DELETE +====== + +Description +----------- + +``DELETE`` statement deletes documents that satisfy the predicates in ``WHERE`` clause. Note that all documents are deleted in the case of ``WHERE`` clause absent. + +Syntax +------ + +Rule ``singleDeleteStatement``: + +.. 
image:: /docs/user/img/rdd/singleDeleteStatement.png + +Example +------- + +The ``datarows`` field in this case shows rows impacted, in other words how many documents were just deleted. + +SQL query:: + + POST /_opendistro/_sql + { + "query" : """ + DELETE FROM accounts + WHERE age > 30 + """ + } + +Explain:: + + { + "size" : 1000, + "query" : { + "bool" : { + "must" : [ + { + "range" : { + "age" : { + "from" : 30, + "to" : null, + "include_lower" : false, + "include_upper" : true, + "boost" : 1.0 + } + } + } + ], + "adjust_pure_negative" : true, + "boost" : 1.0 + } + }, + "_source" : false + } + +Result set:: + + { + "schema" : [ + { + "name" : "deleted_rows", + "type" : "long" + } + ], + "total" : 1, + "datarows" : [ + [ + 3 + ] + ], + "size" : 1, + "status" : 200 + } + diff --git a/docs/user/dql/basics.rst b/docs/user/dql/basics.rst index 5ae9c15da2..3b7d0c4746 100644 --- a/docs/user/dql/basics.rst +++ b/docs/user/dql/basics.rst @@ -1,7 +1,7 @@ -=========== -Basic Query -=========== +============= +Basic Queries +============= .. rubric:: Table of contents @@ -313,7 +313,7 @@ WHERE Description ----------- -`WHERE` clause specifies only Elasticsearch documents that meet the criteria should be affected. It consists of predicates that uses ``=``, ``<>``, ``>``, ``>=``, ``<``, ``<=``, ``IN``, ``BETWEEN``, ``LIKE``, ``IS NULL`` or ``IS NOT NULL``. These predicates can be combined by logical operator ``NOT``, ``AND`` or ``OR`` to build more complex expression. +``WHERE`` clause specifies only Elasticsearch documents that meet the criteria should be affected. It consists of predicates that uses ``=``, ``<>``, ``>``, ``>=``, ``<``, ``<=``, ``IN``, ``BETWEEN``, ``LIKE``, ``IS NULL`` or ``IS NOT NULL``. These predicates can be combined by logical operator ``NOT``, ``AND`` or ``OR`` to build more complex expression. For ``LIKE`` and other full text search topics, please refer to Full Text Search documentation. 
@@ -328,7 +328,11 @@ SQL query:: POST /_opendistro/_sql { - "query" : "SELECT account_number FROM accounts WHERE account_number = 1" + "query" : """ + SELECT account_number + FROM accounts + WHERE account_number = 1 + """ } Explain:: @@ -388,7 +392,11 @@ SQL query:: POST /_opendistro/_sql { - "query" : "SELECT account_number, employer FROM accounts WHERE employer IS NULL" + "query" : """ + SELECT account_number, employer + FROM accounts + WHERE employer IS NULL + """ } Explain:: @@ -461,7 +469,11 @@ SQL query:: POST /_opendistro/_sql { - "query" : "SELECT age FROM accounts GROUP BY age" + "query" : """ + SELECT age + FROM accounts + GROUP BY age + """ } Explain:: @@ -521,7 +533,11 @@ SQL query:: POST /_opendistro/_sql { - "query" : "SELECT account_number AS num FROM accounts GROUP BY num" + "query" : """ + SELECT account_number AS num + FROM accounts + GROUP BY num + """ } Explain:: @@ -581,7 +597,11 @@ SQL query:: POST /_opendistro/_sql { - "query" : "SELECT age FROM accounts GROUP BY 1" + "query" : """ + SELECT age + FROM accounts + GROUP BY 1 + """ } Explain:: @@ -641,7 +661,11 @@ SQL query:: POST /_opendistro/_sql { - "query" : "SELECT ABS(age) AS a FROM accounts GROUP BY ABS(age)" + "query" : """ + SELECT ABS(age) AS a + FROM accounts + GROUP BY ABS(age) + """ } Explain:: @@ -655,9 +679,9 @@ Explain:: ], "excludes" : [ ] }, - "stored_fields" : "a", + "stored_fields" : "abs(age)", "script_fields" : { - "a" : { + "abs(age)" : { "script" : { "source" : "def abs_1 = Math.abs(doc['age'].value);return abs_1;", "lang" : "painless" @@ -666,7 +690,7 @@ Explain:: } }, "aggregations" : { - "a" : { + "abs(age)" : { "terms" : { "script" : { "source" : "def abs_1 = Math.abs(doc['age'].value);return abs_1;", @@ -719,7 +743,12 @@ SQL query:: POST /_opendistro/_sql { - "query" : "SELECT age, MAX(balance) FROM accounts GROUP BY age HAVING MIN(balance) > 10000" + "query" : """ + SELECT age, MAX(balance) + FROM accounts + GROUP BY age + HAVING MIN(balance) > 10000 + """ } 
Explain:: @@ -856,7 +885,11 @@ SQL query:: POST /_opendistro/_sql { - "query" : "SELECT employer FROM accounts ORDER BY employer IS NOT NULL" + "query" : """ + SELECT employer + FROM accounts + ORDER BY employer IS NOT NULL + """ } Explain:: @@ -912,7 +945,11 @@ SQL query:: POST /_opendistro/_sql { - "query" : "SELECT account_number FROM accounts ORDER BY account_number LIMIT 1" + "query" : """ + SELECT account_number + FROM accounts + ORDER BY account_number LIMIT 1 + """ } Explain:: @@ -953,7 +990,11 @@ SQL query:: POST /_opendistro/_sql { - "query" : "SELECT account_number FROM accounts ORDER BY account_number LIMIT 1, 1" + "query" : """ + SELECT account_number + FROM accounts + ORDER BY account_number LIMIT 1, 1 + """ } Explain:: diff --git a/docs/user/dql/complex.rst b/docs/user/dql/complex.rst new file mode 100644 index 0000000000..359779bbe8 --- /dev/null +++ b/docs/user/dql/complex.rst @@ -0,0 +1,445 @@ + +=============== +Complex Queries +=============== + +.. rubric:: Table of contents + +.. contents:: + :local: + :depth: 2 + +Besides simple SFW queries (SELECT-FROM-WHERE), there is also support for complex queries such as Subquery, ``JOIN``, ``UNION`` and ``MINUS``. For these queries, more than one Elasticsearch index and DSL query is involved. You can check out how they are performed behind the scene by our explain API. + +Subquery +======== + +Description +----------- + +A subquery is a complete ``SELECT`` statement which is used within another statement and enclosed in parenthesis. From the explain output, you can notice that some subquery are actually transformed to an equivalent join query to execute. 
+ +Example 1: Table Subquery +------------------------- + +SQL query:: + + POST /_opendistro/_sql + { + "query" : """ + SELECT a1.firstname, a1.lastname, a1.balance + FROM accounts a1 + WHERE a1.account_number IN ( + SELECT a2.account_number + FROM accounts a2 + WHERE a2.balance > 10000 + ) + """ + } + +Explain:: + + { + "Physical Plan" : { + "Project [ columns=[a1.balance, a1.firstname, a1.lastname] ]" : { + "Top [ count=200 ]" : { + "BlockHashJoin[ conditions=( a1.account_number = a2.account_number ), type=JOIN, blockSize=[FixedBlockSize with size=10000] ]" : { + "Scroll [ accounts as a2, pageSize=10000 ]" : { + "request" : { + "size" : 200, + "query" : { + "bool" : { + "filter" : [ + { + "bool" : { + "adjust_pure_negative" : true, + "must" : [ + { + "bool" : { + "adjust_pure_negative" : true, + "must" : [ + { + "bool" : { + "adjust_pure_negative" : true, + "must_not" : [ + { + "bool" : { + "adjust_pure_negative" : true, + "must_not" : [ + { + "exists" : { + "field" : "account_number", + "boost" : 1 + } + } + ], + "boost" : 1 + } + } + ], + "boost" : 1 + } + }, + { + "range" : { + "balance" : { + "include_lower" : false, + "include_upper" : true, + "from" : 10000, + "boost" : 1, + "to" : null + } + } + } + ], + "boost" : 1 + } + } + ], + "boost" : 1 + } + } + ], + "adjust_pure_negative" : true, + "boost" : 1 + } + }, + "from" : 0 + } + }, + "Scroll [ accounts as a1, pageSize=10000 ]" : { + "request" : { + "size" : 200, + "from" : 0, + "_source" : { + "excludes" : [ ], + "includes" : [ + "firstname", + "lastname", + "balance", + "account_number" + ] + } + } + }, + "useTermsFilterOptimization" : false + } + } + } + }, + "description" : "Hash Join algorithm builds hash table based on result of first query, and then probes hash table to find matched rows for each row returned by second query", + "Logical Plan" : { + "Project [ columns=[a1.balance, a1.firstname, a1.lastname] ]" : { + "Top [ count=200 ]" : { + "Join [ conditions=( a1.account_number = a2.account_number 
) type=JOIN ]" : { + "Group" : [ + { + "Project [ columns=[a1.balance, a1.firstname, a1.lastname, a1.account_number] ]" : { + "TableScan" : { + "tableAlias" : "a1", + "tableName" : "accounts" + } + } + }, + { + "Project [ columns=[a2.account_number] ]" : { + "Filter [ conditions=[AND ( AND account_number ISN null, AND balance GT 10000 ) ] ]" : { + "TableScan" : { + "tableAlias" : "a2", + "tableName" : "accounts" + } + } + } + } + ] + } + } + } + } + } + +Result set: + ++------------+-----------+----------+ +|a1.firstname|a1.lastname|a1.balance| ++============+===========+==========+ +| Amber| Duke| 39225| ++------------+-----------+----------+ +| Nanette| Bates| 32838| ++------------+-----------+----------+ + + +Example 2: Subquery in FROM Clause +---------------------------------- + +SQL query:: + + POST /_opendistro/_sql + { + "query" : """ + SELECT a.f, a.l, a.a + FROM ( + SELECT firstname AS f, lastname AS l, age AS a + FROM accounts + WHERE age > 30 + ) AS a + """ + } + +Explain:: + + { + "from" : 0, + "size" : 200, + "query" : { + "bool" : { + "filter" : [ + { + "bool" : { + "must" : [ + { + "range" : { + "age" : { + "from" : 30, + "to" : null, + "include_lower" : false, + "include_upper" : true, + "boost" : 1.0 + } + } + } + ], + "adjust_pure_negative" : true, + "boost" : 1.0 + } + } + ], + "adjust_pure_negative" : true, + "boost" : 1.0 + } + }, + "_source" : { + "includes" : [ + "firstname", + "lastname", + "age" + ], + "excludes" : [ ] + } + } + +Result set: + ++------+-----+--+ +| f| l| a| ++======+=====+==+ +| Amber| Duke|32| ++------+-----+--+ +| Dale|Adams|33| ++------+-----+--+ +|Hattie| Bond|36| ++------+-----+--+ + + +JOINs +===== + +Description +----------- + +A ``JOIN`` clause combines columns from one or more indices by using values common to each. + +Syntax +------ + +Rule ``tableSource``: + +.. image:: /docs/user/img/rdd/tableSource.png + +Rule ``joinPart``: + +.. 
image:: /docs/user/img/rdd/joinPart.png + +Example 1: Inner Join +--------------------- + +Inner join is very commonly used that creates a new result set by combining columns of two indices based on the join predicates specified. It iterates both indices and compare each document to find all that satisfy the join predicates. Keyword ``JOIN`` is used and preceded by ``INNER`` keyword optionally. The join predicate(s) is specified by ``ON`` clause. + + Remark that the explain API output for join queries looks complicated. This is because a join query is associated with two Elasticsearch DSL queries underlying and execute in the separate query planner framework. You can interpret it by looking into the logical plan and physical plan. + +SQL query:: + + POST /_opendistro/_sql + { + "query" : """ + SELECT + a.account_number, a.firstname, a.lastname, + e.id, e.name + FROM accounts a + JOIN employees_nested e + ON a.account_number = e.id + """ + } + +Explain:: + + { + "Physical Plan" : { + "Project [ columns=[a.account_number, a.firstname, a.lastname, e.name, e.id] ]" : { + "Top [ count=200 ]" : { + "BlockHashJoin[ conditions=( a.account_number = e.id ), type=JOIN, blockSize=[FixedBlockSize with size=10000] ]" : { + "Scroll [ employees_nested as e, pageSize=10000 ]" : { + "request" : { + "size" : 200, + "from" : 0, + "_source" : { + "excludes" : [ ], + "includes" : [ + "id", + "name" + ] + } + } + }, + "Scroll [ accounts as a, pageSize=10000 ]" : { + "request" : { + "size" : 200, + "from" : 0, + "_source" : { + "excludes" : [ ], + "includes" : [ + "account_number", + "firstname", + "lastname" + ] + } + } + }, + "useTermsFilterOptimization" : false + } + } + } + }, + "description" : "Hash Join algorithm builds hash table based on result of first query, and then probes hash table to find matched rows for each row returned by second query", + "Logical Plan" : { + "Project [ columns=[a.account_number, a.firstname, a.lastname, e.name, e.id] ]" : { + "Top [ count=200 ]" : { + 
"Join [ conditions=( a.account_number = e.id ) type=JOIN ]" : { + "Group" : [ + { + "Project [ columns=[a.account_number, a.firstname, a.lastname] ]" : { + "TableScan" : { + "tableAlias" : "a", + "tableName" : "accounts" + } + } + }, + { + "Project [ columns=[e.name, e.id] ]" : { + "TableScan" : { + "tableAlias" : "e", + "tableName" : "employees_nested" + } + } + } + ] + } + } + } + } + } + +Result set: + ++----------------+-----------+----------+----+----------+ +|a.account_number|a.firstname|a.lastname|e.id| e.name| ++================+===========+==========+====+==========+ +| 6| Hattie| Bond| 6|Jane Smith| ++----------------+-----------+----------+----+----------+ + + +Example 2: Cross Join +--------------------- + +Cross join, or Cartesian join, combines each document from the first index with each document from the second. The result set is the Cartesian product of the documents from both indices. It is written like an inner join without the ``ON`` clause that specifies the join condition. + + Caveat: A cross join is risky even on two indices of medium size. It may trigger the circuit breaker, which terminates the query to avoid running out of memory. 
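To make the caveat concrete, the following is a minimal Java sketch of a Cartesian product over two in-memory lists. It is an illustration only and not the plugin's join implementation; the class name ``CrossJoinSketch`` and method ``crossJoin`` are hypothetical. The point it shows is that the result always has ``left.size() * right.size()`` rows, so the row count grows multiplicatively with index size.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: this is not how the plugin executes joins.
final class CrossJoinSketch {

    /** Pairs every element of left with every element of right (Cartesian product). */
    static <L, R> List<Object[]> crossJoin(List<L> left, List<R> right) {
        List<Object[]> rows = new ArrayList<>(left.size() * right.size());
        for (L l : left) {
            for (R r : right) {
                rows.add(new Object[] {l, r});
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Key values taken from this example: 4 accounts and 3 employees.
        List<Integer> accountNumbers = Arrays.asList(1, 6, 13, 18);
        List<Integer> employeeIds = Arrays.asList(3, 4, 6);
        System.out.println(crossJoin(accountNumbers, employeeIds).size()); // prints 12
    }
}
```

With four accounts and three employees the product has only 12 rows, but two indices of 100,000 documents each would already produce 10 billion candidate rows, which is why a cross join can exhaust memory quickly.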
+ +SQL query:: + + POST /_opendistro/_sql + { + "query" : """ + SELECT + a.account_number, a.firstname, a.lastname, + e.id, e.name + FROM accounts a + JOIN employees_nested e + """ + } + +Result set: + ++----------------+-----------+----------+----+-----------+ +|a.account_number|a.firstname|a.lastname|e.id| e.name| ++================+===========+==========+====+===========+ +| 1| Amber| Duke| 3| Bob Smith| ++----------------+-----------+----------+----+-----------+ +| 1| Amber| Duke| 4|Susan Smith| ++----------------+-----------+----------+----+-----------+ +| 1| Amber| Duke| 6| Jane Smith| ++----------------+-----------+----------+----+-----------+ +| 6| Hattie| Bond| 3| Bob Smith| ++----------------+-----------+----------+----+-----------+ +| 6| Hattie| Bond| 4|Susan Smith| ++----------------+-----------+----------+----+-----------+ +| 6| Hattie| Bond| 6| Jane Smith| ++----------------+-----------+----------+----+-----------+ +| 13| Nanette| Bates| 3| Bob Smith| ++----------------+-----------+----------+----+-----------+ +| 13| Nanette| Bates| 4|Susan Smith| ++----------------+-----------+----------+----+-----------+ +| 13| Nanette| Bates| 6| Jane Smith| ++----------------+-----------+----------+----+-----------+ +| 18| Dale| Adams| 3| Bob Smith| ++----------------+-----------+----------+----+-----------+ +| 18| Dale| Adams| 4|Susan Smith| ++----------------+-----------+----------+----+-----------+ +| 18| Dale| Adams| 6| Jane Smith| ++----------------+-----------+----------+----+-----------+ + + +Example 3: Outer Join +--------------------- + +An outer join retains documents from one or both indices even when they do not satisfy the join predicate. For now, only ``LEFT OUTER JOIN`` is supported, which retains all rows from the first index. The keyword ``OUTER`` is optional. 
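The left outer join semantics described above can be sketched in a few lines of Java. This is a simplified illustration under stated assumptions, not the plugin's implementation: the class ``LeftJoinSketch`` and method ``leftJoin`` are hypothetical, and the right side is assumed to hold at most one row per key. Every left key produces an output row, and a missing match on the right side shows up as ``null``.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; assumes at most one right-side row per join key.
final class LeftJoinSketch {

    /** Keeps every left key; the right-side value is null when no row matches. */
    static List<String> leftJoin(List<Integer> leftKeys, Map<Integer, String> rightByKey) {
        List<String> rows = new ArrayList<>();
        for (Integer key : leftKeys) {
            // Map.get returns null for a key without a match, mirroring the null columns.
            rows.add(key + " -> " + rightByKey.get(key));
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<Integer, String> employeesById = new HashMap<>();
        employeesById.put(3, "Bob Smith");
        employeesById.put(4, "Susan Smith");
        employeesById.put(6, "Jane Smith");

        List<Integer> accountNumbers = Arrays.asList(1, 6, 13, 18);
        // prints [1 -> null, 6 -> Jane Smith, 13 -> null, 18 -> null]
        System.out.println(leftJoin(accountNumbers, employeesById));
    }
}
```

An inner join over the same data would drop the three rows whose right side is ``null`` and keep only account 6.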
+ +SQL query:: + + POST /_opendistro/_sql + { + "query" : """ + SELECT + a.account_number, a.firstname, a.lastname, + e.id, e.name + FROM accounts a + LEFT JOIN employees_nested e + ON a.account_number = e.id + """ + } + +Result set: + ++----------------+-----------+----------+----+----------+ +|a.account_number|a.firstname|a.lastname|e.id| e.name| ++================+===========+==========+====+==========+ +| 1| Amber| Duke|null| null| ++----------------+-----------+----------+----+----------+ +| 6| Hattie| Bond| 6|Jane Smith| ++----------------+-----------+----------+----+----------+ +| 13| Nanette| Bates|null| null| ++----------------+-----------+----------+----+----------+ +| 18| Dale| Adams|null| null| ++----------------+-----------+----------+----+----------+ + + diff --git a/docs/user/dql/functions.rst b/docs/user/dql/functions.rst new file mode 100644 index 0000000000..d048cd10fb --- /dev/null +++ b/docs/user/dql/functions.rst @@ -0,0 +1,732 @@ + +============= +SQL Functions +============= + +.. rubric:: Table of contents + +.. contents:: + :local: + :depth: 1 + +Introduction +============ + +A wide variety of SQL functions is supported. We intend to generate this part of the documentation automatically from our type system. However, the type system is missing descriptive information for now, so only the formal specifications of the supported SQL functions are listed at the moment. More details will be added in the future. + +Most specifications are self-explanatory: each reads like a regular function signature with data types as arguments. The only notation that needs elaboration is the generic type ``T``, which binds to an actual type and can be used as the return type. For example, ``ABS(NUMBER T) -> T`` means that function ``ABS`` accepts a numerical argument of type ``T``, which can be any sub-type of ``NUMBER``, and returns a value of the actual type bound to ``T``. The actual type is bound to the generic type dynamically at runtime. 
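To illustrate how the generic type ``T`` binds, here is a small Java sketch. It is a hypothetical model and not the plugin's actual type system: the class ``GenericTypeSketch``, the reduced set of types, and the choice of which types count as sub-types of ``NUMBER`` are all assumptions made for the example. It contrasts ``ABS(NUMBER T) -> T``, whose return type follows the argument, with ``ACOS(NUMBER T) -> DOUBLE``, whose return type is fixed.

```java
// Hypothetical sketch of generic type binding; not the plugin's actual type system.
final class GenericTypeSketch {

    enum Type { INTEGER, LONG, DOUBLE, KEYWORD }

    /** Assumed sub-types of NUMBER for this sketch. */
    static boolean isNumber(Type type) {
        return type == Type.INTEGER || type == Type.LONG || type == Type.DOUBLE;
    }

    /** ABS(NUMBER T) -> T: T binds to the argument's actual type, which is also returned. */
    static Type absReturnType(Type argument) {
        requireNumber("ABS", argument);
        return argument;
    }

    /** ACOS(NUMBER T) -> DOUBLE: the return type is fixed regardless of what T binds to. */
    static Type acosReturnType(Type argument) {
        requireNumber("ACOS", argument);
        return Type.DOUBLE;
    }

    private static void requireNumber(String function, Type argument) {
        if (!isNumber(argument)) {
            throw new IllegalArgumentException(
                "Function [" + function + "] cannot work with [" + argument + "]");
        }
    }

    public static void main(String[] args) {
        System.out.println(absReturnType(Type.INTEGER));  // prints INTEGER
        System.out.println(acosReturnType(Type.INTEGER)); // prints DOUBLE
    }
}
```

Passing a non-numeric type such as ``KEYWORD`` to either method raises an exception in this sketch, which loosely mirrors a type check rejecting the call before the query runs.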
+ +ABS +=== + +Description +----------- + +Specifications: + +1. ABS(NUMBER T) -> T + + +ACOS +==== + +Description +----------- + +Specifications: + +1. ACOS(NUMBER T) -> DOUBLE + + +ADD +=== + +Description +----------- + +Specifications: + +1. ADD(NUMBER T, NUMBER) -> T + + +ASCII +===== + +Description +----------- + +Specifications: + +1. ASCII(STRING T) -> INTEGER + + +ASIN +==== + +Description +----------- + +Specifications: + +1. ASIN(NUMBER T) -> DOUBLE + + +ATAN +==== + +Description +----------- + +Specifications: + +1. ATAN(NUMBER T) -> DOUBLE + + +ATAN2 +===== + +Description +----------- + +Specifications: + +1. ATAN2(NUMBER T, NUMBER) -> DOUBLE + + +CAST +==== + +Description +----------- + +Specification is undefined and type check is skipped for now + +CBRT +==== + +Description +----------- + +Specifications: + +1. CBRT(NUMBER T) -> T + + +CEIL +==== + +Description +----------- + +Specifications: + +1. CEIL(NUMBER T) -> T + + +CONCAT +====== + +Description +----------- + +Specification is undefined and type check is skipped for now + +CONCAT_WS +========= + +Description +----------- + +Specification is undefined and type check is skipped for now + +COS +=== + +Description +----------- + +Specifications: + +1. COS(NUMBER T) -> DOUBLE + + +COSH +==== + +Description +----------- + +Specifications: + +1. COSH(NUMBER T) -> DOUBLE + + +COT +=== + +Description +----------- + +Specifications: + +1. COT(NUMBER T) -> DOUBLE + + +CURDATE +======= + +Description +----------- + +Specifications: + +1. CURDATE() -> DATE + + +DATE +==== + +Description +----------- + +Specifications: + +1. DATE(DATE) -> DATE + + +DATE_FORMAT +=========== + +Description +----------- + +Specifications: + +1. DATE_FORMAT(DATE, STRING) -> STRING +2. DATE_FORMAT(DATE, STRING, STRING) -> STRING + + +DAYOFMONTH +========== + +Description +----------- + +Specifications: + +1. DAYOFMONTH(DATE) -> INTEGER + + +DEGREES +======= + +Description +----------- + +Specifications: + +1. 
DEGREES(NUMBER T) -> DOUBLE + + +DIVIDE +====== + +Description +----------- + +Specifications: + +1. DIVIDE(NUMBER T, NUMBER) -> T + + +E += + +Description +----------- + +Specifications: + +1. E() -> DOUBLE + + +EXP +=== + +Description +----------- + +Specifications: + +1. EXP(NUMBER T) -> T + + +EXPM1 +===== + +Description +----------- + +Specifications: + +1. EXPM1(NUMBER T) -> T + + +FLOOR +===== + +Description +----------- + +Specifications: + +1. FLOOR(NUMBER T) -> T + + +IF +== + +Description +----------- + +Specifications: + +1. IF(BOOLEAN, ES_TYPE, ES_TYPE) -> ES_TYPE + + +IFNULL +====== + +Description +----------- + +Specifications: + +1. IFNULL(ES_TYPE, ES_TYPE) -> ES_TYPE + + +ISNULL +====== + +Description +----------- + +Specifications: + +1. ISNULL(ES_TYPE) -> INTEGER + + +LEFT +==== + +Description +----------- + +Specifications: + +1. LEFT(STRING T, INTEGER) -> T + + +LENGTH +====== + +Description +----------- + +Specifications: + +1. LENGTH(STRING) -> INTEGER + + +LN +== + +Description +----------- + +Specifications: + +1. LN(NUMBER T) -> DOUBLE + + +LOCATE +====== + +Description +----------- + +Specifications: + +1. LOCATE(STRING, STRING, INTEGER) -> INTEGER +2. LOCATE(STRING, STRING) -> INTEGER + + +LOG +=== + +Description +----------- + +Specifications: + +1. LOG(NUMBER T) -> DOUBLE +2. LOG(NUMBER T, NUMBER) -> DOUBLE + + +LOG2 +==== + +Description +----------- + +Specifications: + +1. LOG2(NUMBER T) -> DOUBLE + + +LOG10 +===== + +Description +----------- + +Specifications: + +1. LOG10(NUMBER T) -> DOUBLE + + +LOWER +===== + +Description +----------- + +Specifications: + +1. LOWER(STRING T) -> T +2. LOWER(STRING T, STRING) -> T + + +LTRIM +===== + +Description +----------- + +Specifications: + +1. LTRIM(STRING T) -> T + + +MAKETIME +======== + +Description +----------- + +Specifications: + +1. MAKETIME(INTEGER, INTEGER, INTEGER) -> DATE + + +MODULUS +======= + +Description +----------- + +Specifications: + +1. 
MODULUS(NUMBER T, NUMBER) -> T + + +MONTH +===== + +Description +----------- + +Specifications: + +1. MONTH(DATE) -> INTEGER + + +MONTHNAME +========= + +Description +----------- + +Specifications: + +1. MONTHNAME(DATE) -> STRING + + +MULTIPLY +======== + +Description +----------- + +Specifications: + +1. MULTIPLY(NUMBER T, NUMBER) -> NUMBER + + +NOW +=== + +Description +----------- + +Specifications: + +1. NOW() -> DATE + + +PI +== + +Description +----------- + +Specifications: + +1. PI() -> DOUBLE + + +POW +=== + +Description +----------- + +Specifications: + +1. POW(NUMBER T) -> T +2. POW(NUMBER T, NUMBER) -> T + + +POWER +===== + +Description +----------- + +Specifications: + +1. POWER(NUMBER T) -> T +2. POWER(NUMBER T, NUMBER) -> T + + +RADIANS +======= + +Description +----------- + +Specifications: + +1. RADIANS(NUMBER T) -> DOUBLE + + +RAND +==== + +Description +----------- + +Specifications: + +1. RAND() -> NUMBER +2. RAND(NUMBER T) -> T + + +REPLACE +======= + +Description +----------- + +Specifications: + +1. REPLACE(STRING T, STRING, STRING) -> T + + +RIGHT +===== + +Description +----------- + +Specifications: + +1. RIGHT(STRING T, INTEGER) -> T + + +RINT +==== + +Description +----------- + +Specifications: + +1. RINT(NUMBER T) -> T + + +ROUND +===== + +Description +----------- + +Specifications: + +1. ROUND(NUMBER T) -> T + + +RTRIM +===== + +Description +----------- + +Specifications: + +1. RTRIM(STRING T) -> T + + +SIGN +==== + +Description +----------- + +Specifications: + +1. SIGN(NUMBER T) -> T + + +SIGNUM +====== + +Description +----------- + +Specifications: + +1. SIGNUM(NUMBER T) -> T + + +SIN +=== + +Description +----------- + +Specifications: + +1. SIN(NUMBER T) -> DOUBLE + + +SINH +==== + +Description +----------- + +Specifications: + +1. SINH(NUMBER T) -> DOUBLE + + +SQRT +==== + +Description +----------- + +Specifications: + +1. SQRT(NUMBER T) -> T + + +SUBSTRING +========= + +Description +----------- + +Specifications: + +1. 
SUBSTRING(STRING T, INTEGER, INTEGER) -> T + + +SUBTRACT +======== + +Description +----------- + +Specifications: + +1. SUBTRACT(NUMBER T, NUMBER) -> T + + +TAN +=== + +Description +----------- + +Specifications: + +1. TAN(NUMBER T) -> DOUBLE + + +TIMESTAMP +========= + +Description +----------- + +Specifications: + +1. TIMESTAMP(DATE) -> DATE + + +TRIM +==== + +Description +----------- + +Specifications: + +1. TRIM(STRING T) -> T + + +UPPER +===== + +Description +----------- + +Specifications: + +1. UPPER(STRING T) -> T +2. UPPER(STRING T, STRING) -> T + + +YEAR +==== + +Description +----------- + +Specifications: + +1. YEAR(DATE) -> INTEGER + + diff --git a/docs/user/dql/metadata.rst b/docs/user/dql/metadata.rst new file mode 100644 index 0000000000..bb01bd92af --- /dev/null +++ b/docs/user/dql/metadata.rst @@ -0,0 +1,116 @@ + +================ +Metadata Queries +================ + +.. rubric:: Table of contents + +.. contents:: + :local: + :depth: 1 + + +Querying Metadata +================= + +Description +----------- + +You can query the metadata of your indices with the ``SHOW`` and ``DESCRIBE`` statements. These commands are very useful for database management tools that need to enumerate all existing indices and get basic information from the cluster. + +Syntax +------ + +Rule ``showStatement``: + +.. image:: /docs/user/img/rdd/showStatement.png + +Rule ``showFilter``: + +.. image:: /docs/user/img/rdd/showFilter.png + +Example 1: Show All Indices Information +--------------------------------------- + +The ``SHOW`` statement lists all indices that match the search pattern. With the wildcard '%', information for all indices in the cluster is returned. 
+ +SQL query:: + + POST /_opendistro/_sql + { + "query" : "SHOW TABLES LIKE %" + } + +Result set: + ++---------+-----------+----------------+----------+-------+--------+----------+---------+-------------------------+--------------+ +|TABLE_CAT|TABLE_SCHEM| TABLE_NAME|TABLE_TYPE|REMARKS|TYPE_CAT|TYPE_SCHEM|TYPE_NAME|SELF_REFERENCING_COL_NAME|REF_GENERATION| ++=========+===========+================+==========+=======+========+==========+=========+=========================+==============+ +|integTest| null| accounts|BASE TABLE| null| null| null| null| null| null| ++---------+-----------+----------------+----------+-------+--------+----------+---------+-------------------------+--------------+ +|integTest| null|employees_nested|BASE TABLE| null| null| null| null| null| null| ++---------+-----------+----------------+----------+-------+--------+----------+---------+-------------------------+--------------+ + + +Example 2: Show Specific Index Information +------------------------------------------ + +Here is an example that searches metadata for indices whose names are prefixed by 'acc'. + +SQL query:: + + POST /_opendistro/_sql + { + "query" : "SHOW TABLES LIKE acc%" + } + +Result set: + ++---------+-----------+----------+----------+-------+--------+----------+---------+-------------------------+--------------+ +|TABLE_CAT|TABLE_SCHEM|TABLE_NAME|TABLE_TYPE|REMARKS|TYPE_CAT|TYPE_SCHEM|TYPE_NAME|SELF_REFERENCING_COL_NAME|REF_GENERATION| ++=========+===========+==========+==========+=======+========+==========+=========+=========================+==============+ +|integTest| null| accounts|BASE TABLE| null| null| null| null| null| null| ++---------+-----------+----------+----------+-------+--------+----------+---------+-------------------------+--------------+ + + +Example 3: Describe Index Fields Information +-------------------------------------------- + +The ``DESCRIBE`` statement lists all fields of the indices that match the search pattern. 
+ +SQL query:: + + POST /_opendistro/_sql + { + "query" : "DESCRIBE TABLES LIKE accounts" + } + +Result set: + ++---------+-----------+----------+--------------+---------+---------+-----------+-------------+--------------+--------------+--------+-------+----------+-------------+----------------+-----------------+----------------+-----------+-------------+------------+-----------+----------------+----------------+------------------+ +|TABLE_CAT|TABLE_SCHEM|TABLE_NAME| COLUMN_NAME|DATA_TYPE|TYPE_NAME|COLUMN_SIZE|BUFFER_LENGTH|DECIMAL_DIGITS|NUM_PREC_RADIX|NULLABLE|REMARKS|COLUMN_DEF|SQL_DATA_TYPE|SQL_DATETIME_SUB|CHAR_OCTET_LENGTH|ORDINAL_POSITION|IS_NULLABLE|SCOPE_CATALOG|SCOPE_SCHEMA|SCOPE_TABLE|SOURCE_DATA_TYPE|IS_AUTOINCREMENT|IS_GENERATEDCOLUMN| ++=========+===========+==========+==============+=========+=========+===========+=============+==============+==============+========+=======+==========+=============+================+=================+================+===========+=============+============+===========+================+================+==================+ +|integTest| null| accounts|account_number| null| long| null| null| null| 10| 2| null| null| null| null| null| 1| | null| null| null| null| NO| | ++---------+-----------+----------+--------------+---------+---------+-----------+-------------+--------------+--------------+--------+-------+----------+-------------+----------------+-----------------+----------------+-----------+-------------+------------+-----------+----------------+----------------+------------------+ +|integTest| null| accounts| firstname| null| text| null| null| null| 10| 2| null| null| null| null| null| 2| | null| null| null| null| NO| | 
++---------+-----------+----------+--------------+---------+---------+-----------+-------------+--------------+--------------+--------+-------+----------+-------------+----------------+-----------------+----------------+-----------+-------------+------------+-----------+----------------+----------------+------------------+ +|integTest| null| accounts| address| null| text| null| null| null| 10| 2| null| null| null| null| null| 3| | null| null| null| null| NO| | ++---------+-----------+----------+--------------+---------+---------+-----------+-------------+--------------+--------------+--------+-------+----------+-------------+----------------+-----------------+----------------+-----------+-------------+------------+-----------+----------------+----------------+------------------+ +|integTest| null| accounts| balance| null| long| null| null| null| 10| 2| null| null| null| null| null| 4| | null| null| null| null| NO| | ++---------+-----------+----------+--------------+---------+---------+-----------+-------------+--------------+--------------+--------+-------+----------+-------------+----------------+-----------------+----------------+-----------+-------------+------------+-----------+----------------+----------------+------------------+ +|integTest| null| accounts| gender| null| text| null| null| null| 10| 2| null| null| null| null| null| 5| | null| null| null| null| NO| | ++---------+-----------+----------+--------------+---------+---------+-----------+-------------+--------------+--------------+--------+-------+----------+-------------+----------------+-----------------+----------------+-----------+-------------+------------+-----------+----------------+----------------+------------------+ +|integTest| null| accounts| city| null| text| null| null| null| 10| 2| null| null| null| null| null| 6| | null| null| null| null| NO| | 
++---------+-----------+----------+--------------+---------+---------+-----------+-------------+--------------+--------------+--------+-------+----------+-------------+----------------+-----------------+----------------+-----------+-------------+------------+-----------+----------------+----------------+------------------+ +|integTest| null| accounts| employer| null| text| null| null| null| 10| 2| null| null| null| null| null| 7| | null| null| null| null| NO| | ++---------+-----------+----------+--------------+---------+---------+-----------+-------------+--------------+--------------+--------+-------+----------+-------------+----------------+-----------------+----------------+-----------+-------------+------------+-----------+----------------+----------------+------------------+ +|integTest| null| accounts| state| null| text| null| null| null| 10| 2| null| null| null| null| null| 8| | null| null| null| null| NO| | ++---------+-----------+----------+--------------+---------+---------+-----------+-------------+--------------+--------------+--------+-------+----------+-------------+----------------+-----------------+----------------+-----------+-------------+------------+-----------+----------------+----------------+------------------+ +|integTest| null| accounts| age| null| long| null| null| null| 10| 2| null| null| null| null| null| 9| | null| null| null| null| NO| | ++---------+-----------+----------+--------------+---------+---------+-----------+-------------+--------------+--------------+--------+-------+----------+-------------+----------------+-----------------+----------------+-----------+-------------+------------+-----------+----------------+----------------+------------------+ +|integTest| null| accounts| email| null| text| null| null| null| 10| 2| null| null| null| null| null| 10| | null| null| null| null| NO| | 
++---------+-----------+----------+--------------+---------+---------+-----------+-------------+--------------+--------------+--------+-------+----------+-------------+----------------+-----------------+----------------+-----------+-------------+------------+-----------+----------------+----------------+------------------+ +|integTest| null| accounts| lastname| null| text| null| null| null| 10| 2| null| null| null| null| null| 11| | null| null| null| null| NO| | ++---------+-----------+----------+--------------+---------+---------+-----------+-------------+--------------+--------------+--------+-------+----------+-------------+----------------+-----------------+----------------+-----------+-------------+------------+-----------+----------------+----------------+------------------+ + + diff --git a/docs/user/img/rdd/joinPart.png b/docs/user/img/rdd/joinPart.png new file mode 100644 index 0000000000..635f4838ab Binary files /dev/null and b/docs/user/img/rdd/joinPart.png differ diff --git a/docs/user/img/rdd/showFilter.png b/docs/user/img/rdd/showFilter.png new file mode 100644 index 0000000000..47dbc0746e Binary files /dev/null and b/docs/user/img/rdd/showFilter.png differ diff --git a/docs/user/img/rdd/showStatement.png b/docs/user/img/rdd/showStatement.png new file mode 100644 index 0000000000..a1939e7d8c Binary files /dev/null and b/docs/user/img/rdd/showStatement.png differ diff --git a/docs/user/img/rdd/singleDeleteStatement.png b/docs/user/img/rdd/singleDeleteStatement.png new file mode 100644 index 0000000000..9b1a88c4fb Binary files /dev/null and b/docs/user/img/rdd/singleDeleteStatement.png differ diff --git a/docs/user/img/rdd/tableSource.png b/docs/user/img/rdd/tableSource.png new file mode 100644 index 0000000000..f109f44daa Binary files /dev/null and b/docs/user/img/rdd/tableSource.png differ diff --git a/docs/user/index.rst b/docs/user/index.rst index 04d3c05a8c..028e7b7e76 100644 --- a/docs/user/index.rst +++ b/docs/user/index.rst @@ -19,17 +19,23 @@ 
Open Distro for Elasticsearch SQL enables you to extract insights out of Elastic * **Data Query Language** - - `Basic Query `_ + - `Basic Queries `_ - - TODO: Subquery, JOIN, MINUS/UNION, SHOW/DESCRIBE, Full Text Search, SQL Functions + - `Complex Queries `_ + + - `Metadata Queries `_ + + - `SQL Functions `_ * **Data Manipulation Language** - - TODO: DELETE + - `DELETE Statement `_ * **Beyond SQL** - - TODO: Elasticsearch features + - `PartiQL (JSON) Support `_ + + - `Full-text Search `_ * **Troubleshooting** diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/beyond/FullTextIT.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/beyond/FullTextIT.java new file mode 100644 index 0000000000..190dc2223d --- /dev/null +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/beyond/FullTextIT.java @@ -0,0 +1,144 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. 
+ */ + +package com.amazon.opendistroforelasticsearch.sql.doctest.beyond; + +import com.amazon.opendistroforelasticsearch.sql.doctest.core.DocTest; +import com.amazon.opendistroforelasticsearch.sql.doctest.core.annotation.DocTestConfig; +import com.amazon.opendistroforelasticsearch.sql.doctest.core.annotation.Section; + +@DocTestConfig(template = "beyond/fulltext.rst", testData = {"accounts.json"}) +public class FullTextIT extends DocTest { + + @Section(1) + public void matchQuery() { + section( + title("Match Query"), + description( + "Match query is the standard query for full-text search in Elasticsearch. Both ``MATCHQUERY`` and", + "``MATCH_QUERY`` are functions for performing match query." + ), + example( + description("Both functions can accept field name as first argument and a text as second argument."), + post(multiLine( + "SELECT account_number, address", + "FROM accounts", + "WHERE MATCH_QUERY(address, 'Holmes')" + )) + ), + example( + description("Both functions can also accept single argument and be used in the following manner."), + post(multiLine( + "SELECT account_number, address", + "FROM accounts", + "WHERE address = MATCH_QUERY('Holmes')" + )) + ) + ); + } + + @Section(2) + public void multiMatchQuery() { + section( + title("Multi-match Query"), + description( + "Besides match query against a single field, you can search for a text with multiple fields.", + "Function ``MULTI_MATCH``, ``MULTIMATCH`` and ``MULTIMATCHQUERY`` are provided for this." + ), + example( + description( + "Each preceding function accepts ``query`` for a text and ``fields`` for field names or pattern", + "that the text given is searched against. For example, the following query is searching for", + "documents in index accounts with 'Dale' as either firstname or lastname." 
+ ), + post(multiLine( + "SELECT firstname, lastname", + "FROM accounts", + "WHERE MULTI_MATCH('query'='Dale', 'fields'='*name')" + )) + ) + ); + } + + @Section(3) + public void queryStringQuery() { + section( + title("Query String Query"), + description( + "Query string query parses and splits a given query string based on Lucene query string syntax.", + "The mini language supports logical connectives, wildcard, regex and proximity search. Please refer", + "to official documentation for more details. Note that an error is thrown in the case of any invalid", + "syntax in query string." + ), + example( + description( + "The ``QUERY`` function accepts a query string and returns true or false depending on whether", + "a document matches the query string." + ), + post(multiLine( + "SELECT account_number, address", + "FROM accounts", + "WHERE QUERY('address:Lane OR address:Street')" + )) + ) + ); + } + + @Section(4) + public void matchPhraseQuery() { + section( + title("Match Phrase Query"), + description( + "Match phrase query is similar to match query but it is used for matching exact phrases.", + "``MATCHPHRASE``, ``MATCH_PHRASE`` and ``MATCHPHRASEQUERY`` are provided for this purpose." + ), + example( + description(), + post(multiLine( + "SELECT account_number, address", + "FROM accounts", + "WHERE MATCH_PHRASE(address, '880 Holmes Lane')" + )) + ) + ); + } + + @Section(5) + public void scoreQuery() { + section( + title("Score Query"), + description( + "Elasticsearch allows a filter query to be wrapped so that a relevance score is returned along with", + "every matching document. ``SCORE``, ``SCOREQUERY`` and ``SCORE_QUERY`` can be used for this." + ), + example( + description( + "The first argument is a match query expression and the second argument is for an optional", + "floating point number to boost the score. The default value is 1.0. 
Apart from this, an", + "implicit variable ``_score`` is available so you can return score for each document or", + "use it for sorting." + ), + post(multiLine( + "SELECT account_number, address, _score", + "FROM accounts", + "WHERE SCORE(MATCH_QUERY(address, 'Lane'), 0.5) OR", + " SCORE(MATCH_QUERY(address, 'Street'), 100)", + "ORDER BY _score" + )) + ) + ); + } + +} diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/beyond/PartiQLIT.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/beyond/PartiQLIT.java new file mode 100644 index 0000000000..969cbbcfe5 --- /dev/null +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/beyond/PartiQLIT.java @@ -0,0 +1,153 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. 
+ */ + +package com.amazon.opendistroforelasticsearch.sql.doctest.beyond; + +import com.amazon.opendistroforelasticsearch.sql.doctest.core.DocTest; +import com.amazon.opendistroforelasticsearch.sql.doctest.core.annotation.DocTestConfig; +import com.amazon.opendistroforelasticsearch.sql.doctest.core.annotation.Section; +import com.amazon.opendistroforelasticsearch.sql.doctest.core.builder.Example; +import com.amazon.opendistroforelasticsearch.sql.utils.JsonPrettyFormatter; + +import java.io.IOException; +import java.nio.file.Files; +import java.nio.file.Path; +import java.nio.file.Paths; +import java.util.List; +import java.util.stream.Collectors; +import java.util.stream.IntStream; + +import static com.amazon.opendistroforelasticsearch.sql.doctest.core.TestData.TEST_DATA_FOLDER_ROOT; +import static com.amazon.opendistroforelasticsearch.sql.esintgtest.TestUtils.getResourceFilePath; + +@DocTestConfig(template = "beyond/partiql.rst", testData = {"employees_nested.json"}) +public class PartiQLIT extends DocTest { + + @Section(1) + public void showTestData() { + section( + title("Test Data"), + description( + "The test index ``employees_nested`` used by all examples in this document is very similar to", + "the one used in official PartiQL documentation." + ), + createDummyExampleForTestData("employees_nested.json") + ); + } + + @Section(2) + public void queryNestedCollection() { + section( + title("Querying Nested Collection"), + description( + "In SQL-92, a database table can only have tuples that consist of scalar values.", + "PartiQL extends SQL-92 to allow you to query and unnest nested collections conveniently.", + "In the Elasticsearch world, this is very useful for indices with object or nested fields." + ), + example( + title("Unnesting a Nested Collection"), + description( + "The following example finds nested documents (project) whose field value (name)", + "satisfies the predicate (contains 'security'). 
Note that because each parent document", + "can have more than one nested document, the matched nested documents are flattened. In other", + "words, the final result is the Cartesian product of parent and nested documents." + ), + post(multiLine( + "SELECT e.name AS employeeName,", + " p.name AS projectName", + "FROM employees_nested AS e,", + " e.projects AS p", + "WHERE p.name LIKE '%security%'" + )) + ), + /* + Issue: https://github.com/opendistro-for-elasticsearch/sql/issues/397 + example( + title("Preserving Parent Information with LEFT JOIN"), + description( + "The query in the preceding example is very similar to traditional join queries, except that the ``ON`` clause is missing.", + "This is because the join condition is implicit in the nesting of nested documents (projects) into the parent (employee). Therefore,", + "you can use ``LEFT JOIN`` to preserve the information in the associated parent document." + ), + post( + "SELECT e.id AS id, " + + " e.name AS employeeName, " + + " e.title AS title, " + + " p.name AS projectName " + + "FROM employees_nested AS e " + + "LEFT JOIN e.projects AS p" + ) + )*/ + example( + title("Unnesting in Existential Subquery"), + description( + "Alternatively, a nested collection can be unnested in a subquery to check if it", + "satisfies a condition." + ), + post(multiLine( + "SELECT e.name AS employeeName", + "FROM employees_nested AS e", + "WHERE EXISTS (", + " SELECT *", + " FROM e.projects AS p", + " WHERE p.name LIKE '%security%'", + ")" + )) + )/*, + Issue: https://github.com/opendistro-for-elasticsearch/sql/issues/398 + example( + title("Aggregating over a Nested Collection"), + description( + "After being unnested, a nested collection can be aggregated just like a regular field." 
+ ), + post(multiLine( + "SELECT", + " e.name AS employeeName,", + " COUNT(p) AS cnt", + "FROM employees_nested AS e,", + " e.projects AS p", + "WHERE p.name LIKE '%security%'", + "GROUP BY e.id, e.name", + "HAVING COUNT(p) >= 1" + ) + )) + */ + ); + } + + private Example createDummyExampleForTestData(String fileName) { + Example example = new Example(); + example.setTitle("Employees"); + example.setDescription(""); + example.setResult(parseJsonFromTestData(fileName)); + return example; + } + + /** Concat and pretty format document at odd number line in bulk request file */ + private String parseJsonFromTestData(String fileName) { + Path path = Paths.get(getResourceFilePath(TEST_DATA_FOLDER_ROOT + fileName)); + try { + List lines = Files.readAllLines(path); + String json = IntStream.range(0, lines.size()). + filter(i -> i % 2 == 1). + mapToObj(lines::get). + collect(Collectors.joining(",","{\"employees\":[", "]}")); + return JsonPrettyFormatter.format(json); + } catch (IOException e) { + throw new IllegalStateException("Failed to load test data: " + path, e); + } + } + +} diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/core/TestData.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/core/TestData.java index 9d8fb33fa4..4b2e1788c4 100644 --- a/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/core/TestData.java +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/core/TestData.java @@ -15,16 +15,24 @@ package com.amazon.opendistroforelasticsearch.sql.doctest.core; -import com.amazon.opendistroforelasticsearch.sql.esintgtest.TestUtils; import com.amazon.opendistroforelasticsearch.sql.utils.StringUtils; import java.io.File; +import java.io.IOException; +import java.nio.file.Files; +import java.nio.file.Path; +import java.nio.file.Paths; + +import static com.amazon.opendistroforelasticsearch.sql.esintgtest.TestUtils.createIndexByRestClient; +import static 
com.amazon.opendistroforelasticsearch.sql.esintgtest.TestUtils.getResourceFilePath; +import static com.amazon.opendistroforelasticsearch.sql.esintgtest.TestUtils.loadDataByRestClient; /** * Test data for document generation */ public class TestData { + public static final String MAPPINGS_FOLDER_ROOT = "src/test/resources/doctest/mappings/"; public static final String TEST_DATA_FOLDER_ROOT = "src/test/resources/doctest/testdata/"; private final String[] testFilePaths; @@ -39,13 +47,15 @@ public TestData(String[] testFilePaths) { */ public void loadToES(DocTest test) { for (String filePath : testFilePaths) { + String indexName = indexName(filePath); try { - TestUtils.loadDataByRestClient(test.restClient(), indexName(filePath), TEST_DATA_FOLDER_ROOT + filePath); + createIndexByRestClient(test.restClient(), indexName, getIndexMapping(filePath)); + loadDataByRestClient(test.restClient(), indexName, TEST_DATA_FOLDER_ROOT + filePath); } catch (Exception e) { throw new IllegalStateException(StringUtils.format( - "Failed to load test filePath from %s", filePath), e); + "Failed to load mapping and test filePath from %s", filePath), e); } - test.ensureGreen(indexName(filePath)); + test.ensureGreen(indexName); } } @@ -60,4 +70,12 @@ private String indexName(String filePath) { ); } + private String getIndexMapping(String filePath) throws IOException { + Path path = Paths.get(getResourceFilePath(MAPPINGS_FOLDER_ROOT + filePath)); + if (Files.notExists(path)) { + return ""; + } + return new String(Files.readAllBytes(path)); + } + } diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/core/builder/DocBuilder.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/core/builder/DocBuilder.java index ea26a3bcd0..6f7c091e76 100644 --- a/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/core/builder/DocBuilder.java +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/core/builder/DocBuilder.java @@ -198,6 +198,10 @@ 
default Requests put(String name, Object value) { ); } + default String multiLine(String... lines) { + return String.join("\\n", lines); + } + /** Query by a simple SQL is too common and deserve a dedicated overload method */ default Requests post(String sql, UrlParam... params) { return post(new Body("\"query\":\"" + sql + "\""), params); diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/core/request/SqlRequestFormat.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/core/request/SqlRequestFormat.java index ffc3c7ac43..c6e32d9f8c 100644 --- a/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/core/request/SqlRequestFormat.java +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/core/request/SqlRequestFormat.java @@ -21,13 +21,16 @@ import com.google.common.io.CharStreams; import org.apache.http.Header; import org.elasticsearch.client.Request; +import org.json.JSONObject; import java.io.IOException; import java.io.InputStream; import java.io.InputStreamReader; +import java.util.Arrays; import java.util.List; import java.util.Map; -import java.util.stream.Collectors; + +import static java.util.stream.Collectors.joining; /** * Different SQL request formats. @@ -51,7 +54,7 @@ public String format(SqlRequest sqlRequest) { if (!headers.isEmpty()) { str.append(headers.stream(). map(header -> StringUtils.format("-H '%s: %s'", header.getName(), header.getValue())). - collect(Collectors.joining(" ", "", " "))); + collect(joining(" ", "", " "))); } str.append(StringUtils.format("-X %s ", request.getMethod())). 
@@ -103,7 +106,18 @@ protected String body(Request request) { InputStream content = request.getEntity().getContent(); String rawBody = CharStreams.toString(new InputStreamReader(content, Charsets.UTF_8)); if (!rawBody.isEmpty()) { + JSONObject json = new JSONObject(rawBody); + String sql = json.optString("query"); // '\\n' in literal is replaced by '\n' after unquote body = JsonPrettyFormatter.format(rawBody); + + // Format and replace multi-line sql literal + if (!sql.isEmpty() && sql.contains("\n")) { + String multiLineSql = Arrays.stream(sql.split("\\n")). // '\\n' is to escape backslash in regex + collect(joining("\n\t", + "\"\"\"\n\t", + "\n\t\"\"\"")); + body = body.replace("\"" + sql.replace("\n", "\\n") + "\"", multiLineSql); + } } } catch (IOException e) { throw new IllegalStateException("Failed to parse and format body from request", e); @@ -114,6 +128,6 @@ protected String body(Request request) { protected String formatParams(Map params) { return params.entrySet().stream(). map(e -> e.getKey() + "=" + e.getValue()). 
- collect(Collectors.joining("&", "?", "")); + collect(joining("&", "?", "")); } } diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/core/response/SqlResponseFormat.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/core/response/SqlResponseFormat.java index 9fb41c0f39..b38f413a48 100644 --- a/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/core/response/SqlResponseFormat.java +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/core/response/SqlResponseFormat.java @@ -119,10 +119,19 @@ private List parseDataRows(JSONArray rows, boolean isSorted) { @SuppressWarnings({"rawtypes", "unchecked"}) private static void sort(List lists) { lists.sort((list1, list2) -> { + if (list1 == null || list2 == null) { + return compareNullable(list1, list2); + } + // Assume 2 lists are of same length and all elements are comparable for (int i = 0; i < list1.length; i++) { Comparable obj1 = (Comparable) list1[i]; Comparable obj2 = (Comparable) list2[i]; + + if (obj1 == null || obj2 == null) { + return compareNullable(obj1, obj2); + } + int result = obj1.compareTo(obj2); if (result != 0) { return result; @@ -132,4 +141,15 @@ private static void sort(List lists) { }); } + /** Put NULL first (as smaller element) */ + private static int compareNullable(Object obj1, Object obj2) { + if (obj1 == null && obj2 == null) { + return 0; + } else if (obj1 == null) { + return -1; + } else { // obj2 == null + return 1; + } + } + } diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/core/test/DocBuilderTest.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/core/test/DocBuilderTest.java index 5ef7fb379f..42b5466854 100644 --- a/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/core/test/DocBuilderTest.java +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/core/test/DocBuilderTest.java @@ -96,6 +96,39 @@ public void 
sectionShouldIncludeTitleAndDescription() { paragraph("This is a test"); } + @Test + public void sectionShouldIncludeMultiLineSql() { + section( + title("Test"), + description("This is a test"), + example( + description("This is an example for the test"), + post(multiLine( + "SELECT firstname", + "FROM accounts", + "WHERE age > 30") + ) + ) + ); + + verifier.section("Test"). + subSection("Description"). + paragraph("This is a test"). + subSection("Example"). + paragraph("This is an example for the test"). + codeBlock( + "SQL query", + "POST /_opendistro/_sql\n" + + "{\n" + + " \"query\" : \"\"\"\n" + + "\tSELECT firstname\n" + + "\tFROM accounts\n" + + "\tWHERE age > 30\n" + + "\t\"\"\"\n" + + "}" + ); + } + @Test public void sectionShouldIncludeExample() { section( diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/core/test/SqlRequestFormatTest.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/core/test/SqlRequestFormatTest.java index 681451b706..ddcde16608 100644 --- a/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/core/test/SqlRequestFormatTest.java +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/core/test/SqlRequestFormatTest.java @@ -21,8 +21,8 @@ import org.junit.Test; import static com.amazon.opendistroforelasticsearch.sql.doctest.core.request.SqlRequestFormat.CURL_REQUEST; -import static com.amazon.opendistroforelasticsearch.sql.doctest.core.request.SqlRequestFormat.KIBANA_REQUEST; import static com.amazon.opendistroforelasticsearch.sql.doctest.core.request.SqlRequestFormat.IGNORE_REQUEST; +import static com.amazon.opendistroforelasticsearch.sql.doctest.core.request.SqlRequestFormat.KIBANA_REQUEST; import static org.hamcrest.Matchers.emptyString; import static org.hamcrest.Matchers.is; import static org.junit.Assert.assertThat; @@ -63,4 +63,24 @@ public void testKibanaFormat() { assertThat(KIBANA_REQUEST.format(sqlRequest), is(expected)); } + @Test + public void 
multiLineSqlInKibanaRequestShouldBeWellFormatted() { + SqlRequest multiLineSqlRequest = new SqlRequest( + "POST", + "/_opendistro/_sql", + "{\"query\":\"SELECT *\\nFROM accounts\\nWHERE age > 30\"}" + ); + + String expected = + "POST /_opendistro/_sql\n" + + "{\n" + + " \"query\" : \"\"\"\n" + + "\tSELECT *\n" + + "\tFROM accounts\n" + + "\tWHERE age > 30\n" + + "\t\"\"\"\n" + + "}"; + assertThat(KIBANA_REQUEST.format(multiLineSqlRequest), is(expected)); + } + } diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/dml/DeleteIT.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/dml/DeleteIT.java new file mode 100644 index 0000000000..8f7ef547b3 --- /dev/null +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/dml/DeleteIT.java @@ -0,0 +1,53 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. 
+ */ + +package com.amazon.opendistroforelasticsearch.sql.doctest.dml; + +import com.amazon.opendistroforelasticsearch.sql.doctest.core.DocTest; +import com.amazon.opendistroforelasticsearch.sql.doctest.core.annotation.DocTestConfig; +import com.amazon.opendistroforelasticsearch.sql.doctest.core.annotation.Section; + +import static com.amazon.opendistroforelasticsearch.sql.doctest.core.request.SqlRequestFormat.IGNORE_REQUEST; +import static com.amazon.opendistroforelasticsearch.sql.doctest.core.request.SqlRequestFormat.KIBANA_REQUEST; +import static com.amazon.opendistroforelasticsearch.sql.doctest.core.response.SqlResponseFormat.PRETTY_JSON_RESPONSE; + +@DocTestConfig(template = "dml/delete.rst", testData = {"accounts.json"}) +public class DeleteIT extends DocTest { + + @Section(1) + public void delete() { + section( + title("DELETE"), + description( + "``DELETE`` statement deletes documents that satisfy the predicates in ``WHERE`` clause.", + "Note that all documents are deleted in the case of ``WHERE`` clause absent." + ), + images("rdd/singleDeleteStatement.png"), + example( + description( + "The ``datarows`` field in this case shows rows impacted, in other words how many", + "documents were just deleted." 
+ ), + post(multiLine( + "DELETE FROM accounts", + "WHERE age > 30" + )), + queryFormat(KIBANA_REQUEST, PRETTY_JSON_RESPONSE), + explainFormat(IGNORE_REQUEST, PRETTY_JSON_RESPONSE) + ) + ); + } + +} diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/dql/BasicQueryIT.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/dql/BasicQueryIT.java index 6d5183ed14..94cd41918e 100644 --- a/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/dql/BasicQueryIT.java +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/dql/BasicQueryIT.java @@ -115,7 +115,7 @@ public void where() { section( title("WHERE"), description( - "`WHERE` clause specifies only Elasticsearch documents that meet the criteria should be affected.", + "``WHERE`` clause specifies only Elasticsearch documents that meet the criteria should be affected.", "It consists of predicates that uses ``=``, ``<>``, ``>``, ``>=``, ``<``, ``<=``, ``IN``,", "``BETWEEN``, ``LIKE``, ``IS NULL`` or ``IS NOT NULL``. These predicates can be combined by", "logical operator ``NOT``, ``AND`` or ``OR`` to build more complex expression.\n\n" + @@ -130,7 +130,11 @@ public void where() { "number, string or date.", "``IN`` and ``BETWEEN`` is convenient for comparison with multiple values or a range." ), - post("SELECT account_number FROM accounts WHERE account_number = 1") + post(multiLine( + "SELECT account_number", + "FROM accounts", + "WHERE account_number = 1" + )) ), example( title("Missing Fields"), @@ -140,7 +144,11 @@ public void where() { "fields or existing fields only.\n\n" + "Note that for now we don't differentiate missing field and field set to ``NULL`` explicitly." 
), - post("SELECT account_number, employer FROM accounts WHERE employer IS NULL") + post(multiLine( + "SELECT account_number, employer", + "FROM accounts", + "WHERE employer IS NULL" + )) ) ); } @@ -158,12 +166,20 @@ public void groupBy() { example( title("Grouping by Fields"), description(), - post("SELECT age FROM accounts GROUP BY age") + post(multiLine( + "SELECT age", + "FROM accounts", + "GROUP BY age" + )) ), example( title("Grouping by Field Alias"), description("Field alias is accessible in ``GROUP BY`` clause."), - post("SELECT account_number AS num FROM accounts GROUP BY num") + post(multiLine( + "SELECT account_number AS num", + "FROM accounts", + "GROUP BY num" + )) ), example( title("Grouping by Ordinal"), @@ -172,7 +188,11 @@ public void groupBy() { "recommended because your ``GROUP BY`` clause depends on fields in ``SELECT`` clause", "and require to change accordingly." ), - post("SELECT age FROM accounts GROUP BY 1") + post(multiLine( + "SELECT age", + "FROM accounts", + "GROUP BY 1" + )) ), example( title("Grouping by Scalar Function"), @@ -180,7 +200,11 @@ public void groupBy() { "Scalar function can be used in ``GROUP BY`` clause and it's required to be present in", "``SELECT`` clause too." ), - post("SELECT ABS(age) AS a FROM accounts GROUP BY ABS(age)") + post(multiLine( + "SELECT ABS(age) AS a", + "FROM accounts", + "GROUP BY ABS(age)" + )) ) ); } @@ -195,7 +219,12 @@ public void having() { ), example( description(), - post("SELECT age, MAX(balance) FROM accounts GROUP BY age HAVING MIN(balance) > 10000") + post(multiLine( + "SELECT age, MAX(balance)", + "FROM accounts", + "GROUP BY age", + "HAVING MIN(balance) > 10000" + )) ) ); } @@ -221,7 +250,11 @@ public void orderBy() { "The default behavior of Elasticsearch is to return nulls or missing last.", "You can make them present before non-nulls by using ``IS NOT NULL``." 
), - post("SELECT employer FROM accounts ORDER BY employer IS NOT NULL") + post(multiLine( + "SELECT employer", + "FROM accounts", + "ORDER BY employer IS NOT NULL" + )) ) ); } @@ -239,7 +272,11 @@ public void limit() { description( "Given a positive number, ``LIMIT`` uses it as page size to fetch result of that size at most." ), - post("SELECT account_number FROM accounts ORDER BY account_number LIMIT 1") + post(multiLine( + "SELECT account_number", + "FROM accounts", + "ORDER BY account_number LIMIT 1" + )) ), example( title("Fetching at Offset"), @@ -248,7 +285,11 @@ public void limit() { "This can be used as simple pagination solution though it's inefficient on large index.", "Generally ``ORDER BY`` is required in this case to ensure the same order between pages." ), - post("SELECT account_number FROM accounts ORDER BY account_number LIMIT 1, 1") + post(multiLine( + "SELECT account_number", + "FROM accounts", + "ORDER BY account_number LIMIT 1, 1" + )) ) ); } diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/dql/ComplexQueryIT.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/dql/ComplexQueryIT.java new file mode 100644 index 0000000000..aa28b2ca21 --- /dev/null +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/dql/ComplexQueryIT.java @@ -0,0 +1,201 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. 
+ */ + +package com.amazon.opendistroforelasticsearch.sql.doctest.dql; + +import com.amazon.opendistroforelasticsearch.sql.doctest.core.DocTest; +import com.amazon.opendistroforelasticsearch.sql.doctest.core.annotation.DocTestConfig; +import com.amazon.opendistroforelasticsearch.sql.doctest.core.annotation.Section; +import com.amazon.opendistroforelasticsearch.sql.doctest.core.builder.Example; +import com.amazon.opendistroforelasticsearch.sql.doctest.core.builder.Requests; +import org.junit.Ignore; + +import static com.amazon.opendistroforelasticsearch.sql.doctest.core.request.SqlRequestFormat.IGNORE_REQUEST; +import static com.amazon.opendistroforelasticsearch.sql.doctest.core.request.SqlRequestFormat.KIBANA_REQUEST; +import static com.amazon.opendistroforelasticsearch.sql.doctest.core.response.SqlResponseFormat.IGNORE_RESPONSE; +import static com.amazon.opendistroforelasticsearch.sql.doctest.core.response.SqlResponseFormat.TABLE_RESPONSE; + +@DocTestConfig(template = "dql/complex.rst", testData = {"accounts.json", "employees_nested.json"}) +public class ComplexQueryIT extends DocTest { + + @Section(1) + public void subquery() { + section( + title("Subquery"), + description( + "A subquery is a complete ``SELECT`` statement which is used within another statement", + "and enclosed in parenthesis. From the explain output, you can notice that some subquery", + "are actually transformed to an equivalent join query to execute." 
+ ), + /* + example( + title("Scalar Value Subquery"), + description( + "" + ), + post( + "SELECT firstname, lastname, balance " + + "FROM accounts " + + "WHERE balance >= ( " + + " SELECT AVG(balance) FROM accounts " + + ") " + ) + ),*/ + example( + title("Table Subquery"), + description(""), + post(multiLine( + "SELECT a1.firstname, a1.lastname, a1.balance", + "FROM accounts a1", + "WHERE a1.account_number IN (", + " SELECT a2.account_number", + " FROM accounts a2", + " WHERE a2.balance > 10000", + ")" + )) + ), + example( + title("Subquery in FROM Clause"), + description(""), + post(multiLine( + "SELECT a.f, a.l, a.a", + "FROM (", + " SELECT firstname AS f, lastname AS l, age AS a", + " FROM accounts", + " WHERE age > 30", + ") AS a" + )) + ) + ); + } + + @Section(2) + public void joins() { + section( + title("JOINs"), + description( + "A ``JOIN`` clause combines columns from one or more indices by using values common to each." + ), + images("rdd/tableSource.png", "rdd/joinPart.png"), + example( + title("Inner Join"), + description( + "Inner join is very commonly used that creates a new result set by combining columns", + "of two indices based on the join predicates specified. It iterates both indices and", + "compare each document to find all that satisfy the join predicates. Keyword ``JOIN``", + "is used and preceded by ``INNER`` keyword optionally. The join predicate(s) is specified", + "by ``ON`` clause.\n\n", + "Remark that the explain API output for join queries looks complicated. This is because", + "a join query is associated with two Elasticsearch DSL queries underlying and execute in", + "the separate query planner framework. You can interpret it by looking into the logical", + "plan and physical plan." 
+ ), + post(multiLine( + "SELECT", + " a.account_number, a.firstname, a.lastname,", + " e.id, e.name", + "FROM accounts a", + "JOIN employees_nested e", + " ON a.account_number = e.id" + )) + ), + joinExampleWithoutExplain( + title("Cross Join"), + description( + "Cross join or Cartesian join combines each document from the first index with each from", + "the second. The result set is the Cartesian Product of documents from both indices.", + "It appears to be similar to inner join without ``ON`` clause to specify join condition.\n\n", + "Caveat: It is risky to do cross join even on two indices of medium size. This may trigger", + "our circuit breaker to terminate the query to avoid out of memory issue." + ), + post(multiLine( + "SELECT", + " a.account_number, a.firstname, a.lastname,", + " e.id, e.name", + "FROM accounts a", + "JOIN employees_nested e" + )) + ), + joinExampleWithoutExplain( + title("Outer Join"), + description( + "Outer join is used to retain documents from one or both indices although it does not satisfy", + "join predicate. For now, only ``LEFT OUTER JOIN`` is supported to retain rows from first index.", + "Note that keyword ``OUTER`` is optional." + ), + post(multiLine( + "SELECT", + " a.account_number, a.firstname, a.lastname,", + " e.id, e.name", + "FROM accounts a", + "LEFT JOIN employees_nested e", + " ON a.account_number = e.id" + )) + ) + ); + } + + @Ignore("Multi-query doesn't work for default format: https://github.com/opendistro-for-elasticsearch/sql/issues/388") + @Section(3) + public void setOperations() { + section( + title("Set Operations"), + description( + "Set operations allow results of multiple queries to be combined into a single result set.", + "The results to be combined are required to be of same type. In other word, they require to", + "have same column. Otherwise, a semantic analysis exception is raised." 
+ ), + example( + title("UNION Operator"), + description( + "A ``UNION`` clause combines the results of two queries into a single result set. Duplicate rows", + "are removed unless ``UNION ALL`` clause is being used. A common use case of ``UNION`` is to combine", + "result set from data partitioned in indices daily or monthly." + ), + post(multiLine( + "SELECT balance, firstname, lastname", + "FROM accounts WHERE balance < 10000", + "UNION", + "SELECT balance, firstname, lastname", + "FROM accounts WHERE balance > 30000" + )) + ), + example( + title("MINUS Operator"), + description( + "A ``MINUS`` clause takes two queries too but returns resulting rows of first query that", + "do not appear in the other query. Duplicate rows are removed automatically as well." + ), + post(multiLine( + "SELECT balance, age", + "FROM accounts", + "WHERE balance < 10000", + "MINUS", + "SELECT balance, age", + "FROM accounts", + "WHERE age < 35" + )) + ) + ); + } + + private Example joinExampleWithoutExplain(String title, String description, Requests requests) { + return example(title, description, requests, + queryFormat(KIBANA_REQUEST, TABLE_RESPONSE), + explainFormat(IGNORE_REQUEST, IGNORE_RESPONSE) + ); + } + +} diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/dql/MetaDataQueryIT.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/dql/MetaDataQueryIT.java new file mode 100644 index 0000000000..9850e2397f --- /dev/null +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/dql/MetaDataQueryIT.java @@ -0,0 +1,71 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. 
This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. + */ + +package com.amazon.opendistroforelasticsearch.sql.doctest.dql; + +import com.amazon.opendistroforelasticsearch.sql.doctest.core.DocTest; +import com.amazon.opendistroforelasticsearch.sql.doctest.core.annotation.DocTestConfig; +import com.amazon.opendistroforelasticsearch.sql.doctest.core.annotation.Section; +import com.amazon.opendistroforelasticsearch.sql.doctest.core.builder.Example; +import com.amazon.opendistroforelasticsearch.sql.doctest.core.builder.Requests; + +import static com.amazon.opendistroforelasticsearch.sql.doctest.core.request.SqlRequestFormat.IGNORE_REQUEST; +import static com.amazon.opendistroforelasticsearch.sql.doctest.core.request.SqlRequestFormat.KIBANA_REQUEST; +import static com.amazon.opendistroforelasticsearch.sql.doctest.core.response.SqlResponseFormat.IGNORE_RESPONSE; +import static com.amazon.opendistroforelasticsearch.sql.doctest.core.response.SqlResponseFormat.TABLE_RESPONSE; + +@DocTestConfig(template = "dql/metadata.rst", testData = {"accounts.json", "employees_nested.json"}) +public class MetaDataQueryIT extends DocTest { + + @Section(1) + public void queryMetaData() { + section( + title("Querying Metadata"), + description( + "You can query your indices metadata by ``SHOW`` and ``DESCRIBE`` statement. These commands are", + "very useful for database management tool to enumerate all existing indices and get basic information", + "from the cluster." + ), + images("rdd/showStatement.png", "rdd/showFilter.png"), + metadataQueryExample( + title("Show All Indices Information"), + description( + "``SHOW`` statement lists all indices that match the search pattern. By using wildcard '%',", + "information for all indices in the cluster is returned." 
+ ), + post("SHOW TABLES LIKE %") + ), + metadataQueryExample( + title("Show Specific Index Information"), + description("Here is an example that searches metadata for index name prefixed by 'acc'"), + post("SHOW TABLES LIKE acc%") + ), + metadataQueryExample( + title("Describe Index Fields Information"), + description("``DESCRIBE`` statement lists all fields for indices that can match the search pattern."), + post("DESCRIBE TABLES LIKE accounts") + ) + ); + } + + /** Explain doesn't work for SHOW/DESCRIBE so skip it */ + private Example metadataQueryExample(String title, String description, Requests requests) { + return example(title, description, requests, + queryFormat(KIBANA_REQUEST, TABLE_RESPONSE), + explainFormat(IGNORE_REQUEST, IGNORE_RESPONSE) + ); + } + +} diff --git a/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/dql/SQLFunctionsIT.java b/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/dql/SQLFunctionsIT.java new file mode 100644 index 0000000000..29a1d6e7de --- /dev/null +++ b/src/test/java/com/amazon/opendistroforelasticsearch/sql/doctest/dql/SQLFunctionsIT.java @@ -0,0 +1,54 @@ +/* + * Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. + * + * Licensed under the Apache License, Version 2.0 (the "License"). + * You may not use this file except in compliance with the License. + * A copy of the License is located at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * or in the "license" file accompanying this file. This file is distributed + * on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either + * express or implied. See the License for the specific language governing + * permissions and limitations under the License. 
+ */ + +package com.amazon.opendistroforelasticsearch.sql.doctest.dql; + +import com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.function.ScalarFunction; +import com.amazon.opendistroforelasticsearch.sql.doctest.core.DocTest; +import com.amazon.opendistroforelasticsearch.sql.doctest.core.annotation.DocTestConfig; +import com.amazon.opendistroforelasticsearch.sql.doctest.core.annotation.Section; +import com.amazon.opendistroforelasticsearch.sql.utils.StringUtils; + +import static com.amazon.opendistroforelasticsearch.sql.antlr.semantic.types.TypeExpression.TypeExpressionSpec; + +@DocTestConfig(template = "dql/functions.rst") +public class SQLFunctionsIT extends DocTest { + + /** List only specifications of all SQL functions supported for now */ + @Section + public void listFunctions() { + for (ScalarFunction func : ScalarFunction.values()) { // Java Enum.values() return enums in order they are defined + section( + title(func.getName()), + description(listFunctionSpecs(func)) + ); + } + } + + private String listFunctionSpecs(ScalarFunction func) { + TypeExpressionSpec[] specs = func.specifications(); + if (specs.length == 0) { + return "Specification is undefined and type check is skipped for now"; + } + + StringBuilder specStr = new StringBuilder("Specifications: \n\n"); + for (int i = 0; i < specs.length; i++) { + specStr.append( + StringUtils.format("%d. 
%s%s\n", (i + 1), func.getName(), specs[i]) + ); + } + return specStr.toString(); + } +} diff --git a/src/test/resources/doctest/mappings/accounts.json b/src/test/resources/doctest/mappings/accounts.json new file mode 100644 index 0000000000..de9930778d --- /dev/null +++ b/src/test/resources/doctest/mappings/accounts.json @@ -0,0 +1,44 @@ +{ + "mappings": { + "properties": { + "gender": { + "type": "text", + "fielddata": true + }, + "address": { + "type": "text", + "fielddata": true + }, + "firstname": { + "type": "text", + "fielddata": true, + "fields": { + "keyword": { + "type": "keyword", + "ignore_above": 256 + } + } + }, + "lastname": { + "type": "text", + "fielddata": true, + "fields": { + "keyword": { + "type": "keyword", + "ignore_above": 256 + } + } + }, + "state": { + "type": "text", + "fielddata": true, + "fields": { + "keyword": { + "type": "keyword", + "ignore_above": 256 + } + } + } + } + } +} \ No newline at end of file diff --git a/src/test/resources/doctest/mappings/employees_nested.json b/src/test/resources/doctest/mappings/employees_nested.json new file mode 100644 index 0000000000..1805628b39 --- /dev/null +++ b/src/test/resources/doctest/mappings/employees_nested.json @@ -0,0 +1,44 @@ +{ + "mappings": { + "properties": { + "id": { + "type": "long" + }, + "name": { + "type": "text", + "fields": { + "keyword": { + "type": "keyword", + "ignore_above": 256 + } + } + }, + "projects": { + "type": "nested", + "properties": { + "name": { + "type": "text", + "fields": { + "keyword": { + "type": "keyword" + } + }, + "fielddata": true + }, + "started_year": { + "type": "long" + } + } + }, + "title": { + "type": "text", + "fields": { + "keyword": { + "type": "keyword", + "ignore_above": 256 + } + } + } + } + } +} \ No newline at end of file diff --git a/src/test/resources/doctest/templates/beyond/fulltext.rst b/src/test/resources/doctest/templates/beyond/fulltext.rst new file mode 100644 index 0000000000..c7fc3e10f9 --- /dev/null +++ 
b/src/test/resources/doctest/templates/beyond/fulltext.rst @@ -0,0 +1,16 @@ + +================ +Full-text Search +================ + +.. rubric:: Table of contents + +.. contents:: + :local: + :depth: 2 + +Introduction +============ + +Full-text search is for searching within a single stored document, which distinguishes it from a regular search based on the original texts in a database. It tries to match search criteria by examining all of the words in each document. In Elasticsearch, the full-text queries provided enable you to search text fields analyzed during indexing. + diff --git a/src/test/resources/doctest/templates/beyond/partiql.rst b/src/test/resources/doctest/templates/beyond/partiql.rst new file mode 100644 index 0000000000..ec71e9e296 --- /dev/null +++ b/src/test/resources/doctest/templates/beyond/partiql.rst @@ -0,0 +1,16 @@ + +====================== +PartiQL (JSON) Support +====================== + +.. rubric:: Table of contents + +.. contents:: + :local: + :depth: 2 + +Introduction +============ + +PartiQL is a SQL-compatible query language that makes it easy and efficient to query semi-structured and nested data regardless of data format. For now, our implementation is only partially compatible with the PartiQL specification, and more support will be provided in the future. + diff --git a/src/test/resources/doctest/templates/dml/delete.rst b/src/test/resources/doctest/templates/dml/delete.rst new file mode 100644 index 0000000000..11188fa943 --- /dev/null +++ b/src/test/resources/doctest/templates/dml/delete.rst @@ -0,0 +1,12 @@ + +================ +DELETE Statement +================ + +.. rubric:: Table of contents + +.. 
contents:: + :local: + :depth: 2 + + diff --git a/src/test/resources/doctest/templates/dql/basics.rst b/src/test/resources/doctest/templates/dql/basics.rst index 4837573cf1..34cd5d3d09 100644 --- a/src/test/resources/doctest/templates/dql/basics.rst +++ b/src/test/resources/doctest/templates/dql/basics.rst @@ -1,7 +1,7 @@ -=========== -Basic Query -=========== +============= +Basic Queries +============= .. rubric:: Table of contents diff --git a/src/test/resources/doctest/templates/dql/complex.rst b/src/test/resources/doctest/templates/dql/complex.rst new file mode 100644 index 0000000000..f184e0ae58 --- /dev/null +++ b/src/test/resources/doctest/templates/dql/complex.rst @@ -0,0 +1,13 @@ + +=============== +Complex Queries +=============== + +.. rubric:: Table of contents + +.. contents:: + :local: + :depth: 2 + +Besides simple SFW queries (SELECT-FROM-WHERE), there is also support for complex queries such as subqueries, ``JOIN``, ``UNION`` and ``MINUS``. For these queries, more than one Elasticsearch index and DSL query is involved. You can check how they are performed behind the scenes by using our explain API. + diff --git a/src/test/resources/doctest/templates/dql/functions.rst b/src/test/resources/doctest/templates/dql/functions.rst new file mode 100644 index 0000000000..5f4a922b98 --- /dev/null +++ b/src/test/resources/doctest/templates/dql/functions.rst @@ -0,0 +1,18 @@ + +============= +SQL Functions +============= + +.. rubric:: Table of contents + +.. contents:: + :local: + :depth: 1 + +Introduction +============ + +There is support for a wide variety of SQL functions. We intend to generate this part of the documentation automatically from our type system. However, the type system is missing descriptive information for now, so only the formal specifications of all supported SQL functions are listed at the moment. More details will be added in the future. + +Most of the specifications are self-explanatory and read like a regular function signature with data types as arguments. 
The only notation that needs elaboration is the generic type ``T``, which binds to an actual type and can be used as the return type. For example, ``ABS(NUMBER T) -> T`` means that function ``ABS`` accepts a numerical argument of type ``T``, which can be any sub-type of ``NUMBER``, and returns a value of that same actual type. The actual type is bound to the generic type dynamically at runtime. + diff --git a/src/test/resources/doctest/templates/dql/metadata.rst b/src/test/resources/doctest/templates/dql/metadata.rst new file mode 100644 index 0000000000..ab6cc3c4da --- /dev/null +++ b/src/test/resources/doctest/templates/dql/metadata.rst @@ -0,0 +1,12 @@ + +================ +Metadata Queries +================ + +.. rubric:: Table of contents + +.. contents:: + :local: + :depth: 1 + + diff --git a/src/test/resources/doctest/testdata/employees_nested.json b/src/test/resources/doctest/testdata/employees_nested.json new file mode 100644 index 0000000000..a0142c49f1 --- /dev/null +++ b/src/test/resources/doctest/testdata/employees_nested.json @@ -0,0 +1,6 @@ +{"index":{"_id":"1"}} +{"id":3,"name":"Bob Smith","title":null,"projects":[{"name":"AWS Redshift Spectrum querying","started_year":1990},{"name":"AWS Redshift security","started_year":1999},{"name":"AWS Aurora security","started_year":2015}]} +{"index":{"_id":"2"}} +{"id":4,"name":"Susan Smith","title":"Dev Mgr","projects":[]} +{"index":{"_id":"3"}} +{"id":6,"name":"Jane Smith","title":"Software Eng 2","projects":[{"name":"AWS Redshift security","started_year":1998},{"name":"AWS Hello security","started_year":2015,"address":[{"city":"Dallas","state":"TX"}]}]}