This repository has been archived by the owner on Aug 16, 2022. It is now read-only.

Intro, basic queries, delete, cli, odbc #202

Merged 54 commits on May 14, 2020
Changes from 6 commits
Commits
bd920db
sql_ui
Mar 29, 2020
e7cae79
SQL UI plugin
Apr 2, 2020
4801584
incorporated a few comments
Apr 3, 2020
35b5d36
added sql screenshot
Apr 3, 2020
02ee739
changed GET to POST per feedback
Apr 6, 2020
bd296b1
changed POST request
Apr 17, 2020
3602a2a
Added SQL full-text search
Apr 20, 2020
798e4ba
incorporated feedback
Apr 23, 2020
eddeb60
request parameter for sql commands is not supported
Apr 23, 2020
e19517a
initial commit
Apr 25, 2020
5bc2e3f
minor fix
Apr 25, 2020
1159898
added partiql section
Apr 25, 2020
8d78545
incorporated feedback
Apr 30, 2020
8165a75
added delete in where clause
Apr 30, 2020
718cac2
added metadata search section
Apr 30, 2020
39f092d
joins update
May 3, 2020
e52d0c8
added subquery
May 3, 2020
f349620
changed examples
May 6, 2020
e04b4f7
minor change
May 6, 2020
eb12e46
incorporated Andrew's comments
May 8, 2020
990eab5
images and reorg of section
May 8, 2020
cd898cf
added images
May 8, 2020
3c3bab2
added images
May 8, 2020
41fc17f
removed toc
May 8, 2020
fea6da9
added delete statement
May 8, 2020
56bb28a
changed title
May 8, 2020
f606c08
added sql cli
May 9, 2020
ad48a15
more changes
May 9, 2020
cd54070
test sql cli
May 10, 2020
50b3530
added odbc driver
May 11, 2020
6ad155a
added contributing call
May 11, 2020
5baaeed
complex queries
May 12, 2020
35806d6
Merge branch 'joins_update' into sql_ui
May 12, 2020
4830c7a
cursor endpoints
May 12, 2020
d5b9efd
settings
May 12, 2020
fb0ceff
nav_order
May 12, 2020
7c928ea
Merge branch 'partiql_support' into sql_ui
May 12, 2020
87a31af
nav order
May 12, 2020
922b020
remove change
May 12, 2020
9033ccb
remove change
May 12, 2020
c08e7d6
Merge branch 'sql_full_text' into sql_ui
May 12, 2020
7464af2
sql_functions
May 12, 2020
66d6ac1
Change troubleshoot and partiql title
dai-chen May 13, 2020
4a79806
add sql limitation
penghuo May 13, 2020
8172e54
Update ODBC documentation
abbashus May 14, 2020
5a5f4d1
Merge pull request #4 from abbashus/odbc_docs
ashwinkumar12345 May 14, 2020
7ab8304
Merge pull request #2 from penghuo/sql_ui_limitation
ashwinkumar12345 May 14, 2020
86b8e54
added feedback
May 14, 2020
4f776a4
Merge branch 'sql_ui' of https://github.com/ashwinkumar12345/for-elas…
May 14, 2020
3032061
Merge pull request #1 from dai-chen/sql-doc-hierarchy
ashwinkumar12345 May 14, 2020
1da20ad
reordered and added workbench section
May 14, 2020
9167948
removed duplicate troubleshoot
May 14, 2020
e27902b
minor fixes
May 14, 2020
83cea1d
minor fix
May 14, 2020
Binary file added docs/images/sql.png
379 changes: 375 additions & 4 deletions docs/sql/index.md
@@ -9,14 +9,18 @@ has_children: true

Open Distro for Elasticsearch SQL lets you write queries in SQL rather than the [Elasticsearch query domain-specific language (DSL)](../elasticsearch/full-text). If you're already familiar with SQL and don't want to learn the query DSL, this feature is a great option.

To use the feature, send requests to the `_opendistro/_sql` URI. You can use a request parameter or the request body (recommended).
SQL UI is now supported. Use the SQL UI to easily run on-demand SQL queries, translate SQL into its REST equivalent, and view and save results as text, JSON, JDBC, or CSV.

![Kibana SQL UI plugin](../images/sql.png)

To use the REST API, send requests to the `_opendistro/_sql` URI. You can use a request parameter or the request body (recommended).

```sql
GET https://<host>:<port>/_opendistro/_sql?sql=select * from my-index limit 50
POST _opendistro/_sql?sql=select * from my-index limit 50
```
Contributor (@dai-chen, Apr 21, 2020):

GET endpoint is removed already due to security concern in opendistro-for-elasticsearch/sql#414. And POST request requires SQL query present in request body rather than URL parameter. Could you help remove it from the documentation?

Contributor Author:

Changed. Please review.
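
As a hedged illustration of the request-body form discussed in this thread, the following Python sketch builds the POST request without sending it. The base URL `https://localhost:9200`, the helper name `build_sql_request`, and the absence of authentication are assumptions made for illustration; they do not come from the plugin documentation.

```python
import json
import urllib.request


def build_sql_request(query, base_url="https://localhost:9200"):
    """Build (but do not send) a POST request for the SQL endpoint.

    base_url is a placeholder; substitute your own host and port.
    """
    # The SQL statement travels in the JSON body under the "query" key,
    # not as a URL parameter.
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/_opendistro/_sql",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_sql_request("SELECT * FROM my-index LIMIT 50")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing `request` to `urllib.request.urlopen` would send it. A real cluster will typically also need credentials and TLS settings, which this sketch leaves out.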


```json
POST https://<host>:<port>/_opendistro/_sql
POST _opendistro/_sql
{
"query": "SELECT * FROM my-index LIMIT 50"
}
```
@@ -56,8 +60,375 @@ When you return data in CSV or raw format, each row corresponds to a *document*,

## User interfaces
Contributor:

Let's mention right at the top that, y'know, we have a UI now, it's good for use cases X and Y, and include a screenshot.

Contributor Author:

Added.

Contributor:

@ashwinkumar12345 Could you help change the GET request above to the correspondent POST request? Because we're removing GET endpoint very soon due to security concern.

Contributor Author:

Changed.

Contributor:

Sorry for the confusion but a POST request should look like the sample call here: https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/sql-support.html.

Contributor Author:

Changed.

You can test queries using **Dev Tools** in Kibana (`https://<host>:5601`).
Kibana
{: .label .label-yellow :}

### Index data

The SQL plugin is for read-only purposes, so you cannot index or update data using SQL.

Use the `bulk` operation to index some sample data:

```json
PUT accounts/_bulk?refresh
{"index":{"_id":"1"}}
{"account_number":1,"balance":39225,"firstname":"Amber","lastname":"Duke","age":32,"gender":"M","address":"880 Holmes Lane","employer":"Pyrami","email":"amberduke@pyrami.com","city":"Brogan","state":"IL"}
{"index":{"_id":"6"}}
{"account_number":6,"balance":5686,"firstname":"Hattie","lastname":"Bond","age":36,"gender":"M","address":"671 Bristol Street","employer":"Netagy","email":"hattiebond@netagy.com","city":"Dante","state":"TN"}
{"index":{"_id":"13"}}
{"account_number":13,"balance":32838,"firstname":"Nanette","lastname":"Bates","age":28,"gender":"F","address":"789 Madison Street","employer":"Quility","email":"nanettebates@quility.com","city":"Nogal","state":"VA"}
{"index":{"_id":"18"}}
{"account_number":18,"balance":4180,"firstname":"Dale","lastname":"Adams","age":33,"gender":"M","address":"467 Hutchinson Court","email":"daleadams@boink.com","city":"Orick","state":"MD"}
```

Contributor:

Where's this data from? Just curious. If there's a reference here, I'm not catching it.

Contributor Author (@ashwinkumar12345, Apr 3, 2020)
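
The bulk body alternates an action line with a source line for each document. A minimal Python sketch of how such a body could be assembled is shown here; the helper name `build_bulk_body` and the choice of `account_number` as the document ID are assumptions for illustration, and the sample documents are trimmed copies of the ones in this section.

```python
import json


def build_bulk_body(documents, id_field="account_number"):
    """Return an NDJSON bulk body: one action line plus one source line per document."""
    lines = []
    for doc in documents:
        # Action line: index this document under the given _id.
        lines.append(json.dumps({"index": {"_id": str(doc[id_field])}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # The bulk body must end with a newline character.
    return "\n".join(lines) + "\n"


sample = [
    {"account_number": 1, "firstname": "Amber", "lastname": "Duke"},
    {"account_number": 6, "firstname": "Hattie", "lastname": "Bond"},
]
print(build_bulk_body(sample), end="")
```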

Here’s how core SQL concepts map to Elasticsearch:

| SQL | Elasticsearch | Example
:--- | :--- | :---
Table | Index | `accounts`
Row | Document | `1`
Column | Field | `account_number`

To list all your indices:

```sql
SHOW TABLES LIKE %
```

| id | TABLE_NAME
:--- | :---
0 | accounts

### Read data

After you index a document, retrieve it using the following SQL expression:

```sql
SELECT *
FROM accounts
WHERE _id = 1
```

| id | account_number | firstname | gender | city | balance | employer | state | email | address | lastname | age
:--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :---
0 | 1 | Amber | M | Brogan | 39225 | Pyrami | IL | amberduke@pyrami.com | 880 Holmes Lane | Duke | 32

### Delete data

To delete a document from an index, use the `DELETE` statement:

```sql
DELETE
FROM accounts
WHERE _id = 0
```

| id | deleted_rows
:--- | :---
0 | 1

### Search and aggregate data

Use the `SELECT` clause, along with `FROM`, `WHERE`, `GROUP BY`, `HAVING`, `ORDER BY`, and `LIMIT` to search and aggregate data.

Among these clauses, `SELECT` and `FROM` are required, as they specify which fields to retrieve and which indices to retrieve them from. All other clauses are optional. Use them according to your needs.

The complete syntax for searching and aggregating data is as follows:

```sql
SELECT [DISTINCT] (* | expression) [[AS] alias] [, ...]
FROM index_name
[WHERE predicates]
[GROUP BY expression [, ...]
[HAVING predicates]]
[ORDER BY expression [IS [NOT] NULL] [ASC | DESC] [, ...]]
[LIMIT [offset, ] size]
```

These SQL clauses execute in the following order:

```sql
FROM index
WHERE predicates
GROUP BY expressions
HAVING predicates
SELECT expressions
ORDER BY expressions
LIMIT size
```
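
To make the execution order concrete, here is a rough plain-Python sketch that mimics each stage on a trimmed copy of the sample documents. It is not how the plugin runs queries; the function `run_query` and the query in its docstring are made up for illustration.

```python
from collections import defaultdict

# Trimmed copies of the sample documents in the accounts index.
ACCOUNTS = [
    {"account_number": 1, "balance": 39225, "age": 32, "gender": "M"},
    {"account_number": 6, "balance": 5686, "age": 36, "gender": "M"},
    {"account_number": 13, "balance": 32838, "age": 28, "gender": "F"},
    {"account_number": 18, "balance": 4180, "age": 33, "gender": "M"},
]


def run_query(rows):
    """Mimic: SELECT gender, MAX(balance) FROM accounts WHERE age > 30
    GROUP BY gender HAVING MAX(balance) > 10000 ORDER BY gender LIMIT 10"""
    # FROM: start with every document in the index (rows).
    # WHERE: drop documents before any grouping happens.
    filtered = [r for r in rows if r["age"] > 30]
    # GROUP BY: bucket the remaining documents by gender.
    buckets = defaultdict(list)
    for r in filtered:
        buckets[r["gender"]].append(r)
    # HAVING: keep only buckets whose aggregate passes the predicate.
    kept = {g: rs for g, rs in buckets.items()
            if max(r["balance"] for r in rs) > 10000}
    # SELECT: compute the output expressions per bucket.
    selected = [(g, max(r["balance"] for r in rs)) for g, rs in kept.items()]
    # ORDER BY: sort the selected rows.
    selected.sort(key=lambda row: row[0])
    # LIMIT: truncate last.
    return selected[:10]


print(run_query(ACCOUNTS))  # [('M', 39225)]
```

Because `WHERE` runs before `GROUP BY`, the only `F` document (age 28) is filtered out and no `F` bucket is ever created.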

#### Select

Specify the fields to be retrieved.

*Example 1*: Use `*` to retrieve all fields in an index:

```sql
SELECT *
FROM accounts
```

| id | account_number | firstname | gender | city | balance | employer | state | email | address | lastname | age
:--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :---
0 | 1 | Amber | M | Brogan | 39225 | Pyrami | IL | amberduke@pyrami.com | 880 Holmes Lane | Duke | 32
1 | 6 | Hattie | M | Dante | 5686 | Netagy | TN | hattiebond@netagy.com | 671 Bristol Street | Bond | 36
2 | 13 | Nanette | F | Nogal | 32838 | Quility | VA | nanettebates@quility.com | 789 Madison Street | Bates | 28
3 | 18 | Dale | M | Orick | 4180 | | MD | daleadams@boink.com | 467 Hutchinson Court | Adams | 33

*Example 2*: Use field name(s) to retrieve only specific fields:

```sql
SELECT firstname, lastname
FROM accounts
```

| id | firstname | lastname
:--- | :--- | :---
0 | Amber | Duke
1 | Hattie | Bond
2 | Nanette | Bates
3 | Dale | Adams

*Example 3*: Use field aliases instead of field names. Field aliases are used to make field names more readable:

```sql
SELECT account_number AS num
FROM accounts
```

| id | num
:--- | :---
0 | 1
1 | 6
2 | 13
3 | 18

*Example 4*: Use the `DISTINCT` clause to get back only unique field values. You can specify one or more field names:

```sql
SELECT DISTINCT age
FROM accounts
```

| id | age
:--- | :---
0 | 28
1 | 32
2 | 33
3 | 36

#### From

Specify the index that you want to search.

*Example 1*: Use index aliases to query across indices. To learn about index aliases, see [Index Alias](../elasticsearch/index-alias/).
In this sample query, `acc` is an alias for the `accounts` index:

```sql
SELECT account_number, accounts.age
FROM accounts
```

or

```sql
SELECT account_number, acc.age
FROM accounts acc
```

| id | account_number | age
:--- | :--- | :---
0 | 1 | 32
1 | 6 | 36
2 | 13 | 28
3 | 18 | 33

*Example 2*: Use index patterns to query indices that match a specific pattern:

```sql
SELECT account_number
FROM account*
```

| id | account_number
:--- | :---
0 | 1
1 | 6
2 | 13
3 | 18

#### Where

Specify a condition to filter the results.

| Operators | Behavior
:--- | :---
`=` | Equal to.
`<>` | Not equal to.
`>` | Greater than.
`<` | Less than.
`>=` | Greater than or equal to.
`<=` | Less than or equal to.
`IN` | Match any value in a list. Equivalent to multiple `OR` conditions.
`BETWEEN` | Similar to a range query. For more information about range queries, see [Range query](../elasticsearch/term/#range).
`LIKE` | Use for full text search. For more information about full-text queries, see [Full-text queries](../elasticsearch/full-text/).
`IS NULL` | Check if the field value is `NULL`.
`IS NOT NULL` | Check if the field value is not `NULL`.

Combine comparison operators (`=`, `<>`, `>`, `>=`, `<`, `<=`) with boolean operators `NOT`, `AND`, or `OR` to build more complex expressions.

*Example 1*: Use comparison operators for numbers, strings, or dates:

```sql
SELECT account_number
FROM accounts
WHERE account_number = 1
```

| id | account_number
:--- | :---
0 | 1

*Example 2*: Elasticsearch allows a flexible schema, so documents in an index may have different fields. Use `IS NULL` or `IS NOT NULL` to retrieve only the documents where a field is missing or present. We do not differentiate between missing fields and fields explicitly set to `NULL`:

```sql
SELECT account_number, employer
FROM accounts
WHERE employer IS NULL
```

| id | account_number | employer
:--- | :--- | :---
0 | 18 |
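
The "missing versus explicit `NULL`" behavior can be pictured with a small Python analogy, where a missing dictionary key and a key set to `None` are treated the same way. This is only an analogy, not the plugin's implementation; the document with `account_number` 99 is hypothetical and does not exist in the sample data.

```python
def employer_is_null(doc):
    # dict.get returns None both when the key is absent and when
    # the key is present but explicitly set to None.
    return doc.get("employer") is None


docs = [
    {"account_number": 13, "employer": "Quility"},
    {"account_number": 18},                    # field missing entirely
    {"account_number": 99, "employer": None},  # hypothetical: explicit null
]

matches = [d["account_number"] for d in docs if employer_is_null(d)]
print(matches)  # [18, 99]
```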

#### Group By

Group documents with the same field value into buckets.

*Example 1*: Group by fields:

```sql
SELECT age
FROM accounts
GROUP BY age
```

| id | age
:--- | :---
0 | 28
1 | 32
2 | 33
3 | 36

*Example 2*: Group by field alias:

```sql
SELECT account_number AS num
FROM accounts
GROUP BY num
```

| id | num
:--- | :---
0 | 1
1 | 6
2 | 13
3 | 18

*Example 3*: Use scalar functions in the `GROUP BY` clause:

```sql
SELECT ABS(age) AS a
FROM accounts
GROUP BY ABS(age)
```

| id | a
:--- | :---
0 | 28.0
1 | 32.0
2 | 33.0
3 | 36.0

#### Having

Use the `HAVING` clause to filter the buckets that the `GROUP BY` clause produces. The filter condition can use aggregation functions (`COUNT`, `AVG`, `SUM`, `MIN`, and `MAX`).

*Example 1*:

```sql
SELECT age, MAX(balance)
FROM accounts
GROUP BY age HAVING MIN(balance) > 10000
```

| id | age | MAX(balance)
:--- | :--- | :---
0 | 28 | 32838
1 | 32 | 39225
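
To experiment with `GROUP BY` and `HAVING` without a cluster, the same statement can be run against an in-memory SQLite table. SQLite is a stand-in here, not the Open Distro SQL plugin, and other statements may behave differently between the two. The table holds only the three fields this query needs, and an `ORDER BY age` is added so the output order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (account_number INTEGER, age INTEGER, balance INTEGER)"
)
# account_number, age, and balance copied from the sample documents.
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [(1, 32, 39225), (6, 36, 5686), (13, 28, 32838), (18, 33, 4180)],
)

# Each age forms one bucket; HAVING drops buckets whose minimum
# balance is 10000 or less.
rows = conn.execute(
    "SELECT age, MAX(balance) FROM accounts "
    "GROUP BY age HAVING MIN(balance) > 10000 ORDER BY age"
).fetchall()
print(rows)  # [(28, 32838), (32, 39225)]
conn.close()
```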

#### Order By

Use the `ORDER BY` clause to sort results into your desired order.

*Example 1*: Use `ORDER BY` to sort by ascending or descending order. Besides regular field names, ordinals, aliases, and scalar functions are also supported:

```sql
SELECT account_number
FROM accounts
ORDER BY account_number DESC
```

| id | account_number
:--- | :---
0 | 18
1 | 13
2 | 6
3 | 1

*Example 2*: Specify if documents with missing fields are to be put at the beginning or at the end of the results. The default behavior of Elasticsearch is to return nulls or missing fields at the end. To push them before non-nulls, use the `IS NOT NULL` operator:

```sql
SELECT employer
FROM accounts
ORDER BY employer IS NOT NULL
```

| id | employer
:--- | :---
0 |
1 | Netagy
2 | Pyrami
3 | Quility
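
As a rough analogy in plain Python, null placement amounts to sorting on a two-part key: first whether the value is present, then the value itself. The helper names `nulls_first` and `nulls_last` are invented for this sketch and are not part of the plugin.

```python
employers = ["Pyrami", "Netagy", "Quility", None]


def nulls_first(values):
    # False sorts before True, so missing values (None) come first.
    return sorted(values, key=lambda v: (v is not None, v or ""))


def nulls_last(values):
    # Reverse the flag to push missing values to the end.
    return sorted(values, key=lambda v: (v is None, v or ""))


print(nulls_first(employers))  # [None, 'Netagy', 'Pyrami', 'Quility']
print(nulls_last(employers))   # ['Netagy', 'Pyrami', 'Quility', None]
```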

#### Limit

Specify the maximum number of documents that you want to retrieve. This is similar to the `size` parameter in Elasticsearch. Use it to avoid fetching large amounts of data into memory.

*Example 1*: Specify the number of results to be returned:

```sql
SELECT account_number
FROM accounts
ORDER BY account_number LIMIT 1
```

| id | account_number
:--- | :---
0 | 1

*Example 2*: Specify the document number that you want to start returning results from. In `LIMIT offset, size`, the first argument is the offset, which is equivalent to the `from` parameter in Elasticsearch, and the second argument is the size. Use `ORDER BY` to ensure the same order between pages:

```sql
SELECT account_number
FROM accounts
ORDER BY account_number LIMIT 1, 1
```

| id | account_number
:--- | :---
0 | 6
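
Following the `LIMIT [offset, ] size` syntax shown earlier, paging can be sketched in Python as a slice over the sorted results. The helpers `limit_to_search_params` and `apply_limit` are hypothetical names used only to illustrate the correspondence with the `from` and `size` parameters.

```python
def limit_to_search_params(size, offset=0):
    """Map LIMIT offset, size onto the from/size search parameters."""
    return {"from": offset, "size": size}


def apply_limit(rows, size, offset=0):
    """Skip `offset` rows, then return at most `size` rows."""
    return rows[offset:offset + size]


# Account numbers from the sample data, in ORDER BY account_number order.
account_numbers = sorted([1, 6, 13, 18])

print(limit_to_search_params(size=1, offset=1))        # {'from': 1, 'size': 1}
print(apply_limit(account_numbers, size=1))            # [1]
print(apply_limit(account_numbers, size=1, offset=1))  # [6]
```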

## Troubleshoot queries
