deploy text-based index #651

Merged
merged 2 commits into from
Aug 31, 2021
@@ -1,18 +1,37 @@
# Full-text index restrictions

This document holds the restrictions for full-text indexes. Please read the restrictions very carefully before using the full-text indexes.
!!! caution

This topic introduces the restrictions for full-text indexes. Please read the restrictions very carefully before using the full-text indexes.

For now, full-text search has the following limitations:

1. The maximum indexing string length is 256 bytes. The part of data that exceeds 256 bytes will not be indexed.
2. Full-text index can not be applied to more than one property at a time (similar to a composite index).
3. The `WHERE` clause in full-text search statement `LOOKUP` does not support logical expressions `AND` and `OR`.
4. Full-text index can not be applied to multiple tags search.
5. Sorting for the returned results of the full-text search is not supported. Data is returned in the order of data insertion.
6. Full-text index can not search the null properties.
7. Rebuilding or altering Elasticsearch indexes is not supported at this time.
8. Pipe is not supported in the `LOOKUP` statement, excluding the examples in our document.
9. Full-text search only works on single terms.
10. Full-text indexes are not deleted together with the graph space.
11. Make sure that you start the Elasticsearch cluster and Nebula Graph at the same time. If not, the data writing on the Elasticsearch cluster can be incomplete.
12. Do not contain `'` or `\` in the vertex or edge values. If not, a error is caused in the Elasticsearch cluster storage.
13. It may take a while for Elasticsearch to create indexes. If Nebula Graph warns no index is found, wait for the index to take effect.

2. If there is a full-text index on the tag/edge type, the tag/edge type cannot be deleted or modified.

3. One tag/edge type can only have one full-text index.

4. The type of properties must be `string`.

5. The `WHERE` clause in the full-text search statement `LOOKUP`/`MATCH` does not support logical expressions `AND` and `OR`.

6. A full-text index cannot be used to search multiple tags/edge types.

7. Sorting for the returned results of the full-text search is not supported. Data is returned in the order of data insertion.

8. A full-text index cannot search properties with the value `NULL`.

9. Altering Elasticsearch indexes is not supported at this time.

10. The pipe operator is not supported in the `LOOKUP` and `MATCH` statements, excluding the examples in our manual.

11. Full-text search only works on single terms.

12. Full-text indexes are not deleted together with the graph space.

13. Make sure that you start the Elasticsearch cluster and Nebula Graph at the same time. Otherwise, the data written to the Elasticsearch cluster can be incomplete.

14. Do not include `'` or `\` in the vertex or edge values. Otherwise, an error occurs in the Elasticsearch cluster storage.

15. It may take a while for Elasticsearch to create indexes. If Nebula Graph warns no index is found, wait for the index to take effect (however, the waiting time is unknown and there is no code to check).
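
Two of the restrictions above (the 256-byte indexing limit and the forbidden `'` and `\` characters) can be checked before data is written. The following is a minimal sketch, not part of Nebula Graph: the function name `check_fulltext_value` and its status strings are hypothetical and only illustrate the rules.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not a Nebula Graph tool): checks one property value
# against two of the restrictions listed above.
# Prints "ok", "has_forbidden_char", or "exceeds_256_bytes".
check_fulltext_value() {
  local value="$1"

  # Restriction: vertex or edge values must not contain ' or \.
  case "$value" in
    *\'*|*\\*) echo "has_forbidden_char"; return 1 ;;
  esac

  # Restriction: only the first 256 bytes are indexed.
  # wc -c counts bytes, not characters, which matters for multi-byte text.
  local bytes
  bytes=$(printf '%s' "$value" | wc -c | tr -d ' ')
  if [ "$bytes" -gt 256 ]; then
    echo "exceeds_256_bytes"; return 1
  fi

  echo "ok"
}
```

A value reported as `exceeds_256_bytes` can still be written; per restriction 1, the part beyond 256 bytes is simply not indexed.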
@@ -2,15 +2,17 @@

Nebula Graph full-text indexes are powered by [Elasticsearch](https://en.wikipedia.org/wiki/Elasticsearch). This means that you can use Elasticsearch full-text query language to retrieve what you want. Full-text indexes are managed through built-in procedures. They can be created only for variable `STRING` and `FIXED_STRING` properties when the listener cluster and the Elasticsearch cluster are deployed.

## Before you start
## Precaution

Before you start using the full-text index, please make sure that you know the [restrictions](../../4.deployment-and-installation/6.deploy-text-based-index/1.text-based-index-restrictions.md).

## Deploy Elasticsearch cluster

To deploy an Elasticsearch cluster, see the [Elasticsearch documentation](https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-deploy-elasticsearch.html).
To deploy an Elasticsearch cluster, see [Kubernetes Elasticsearch deployment](https://www.elastic.co/guide/en/cloud-on-k8s/current/k8s-deploy-elasticsearch.html) or [Elasticsearch installation](https://www.elastic.co/guide/en/elasticsearch/reference/6.0/_installation.html).

When the Elasticsearch cluster is started, add the template file for the Nebula Graph full-text index. Take the following sample template for example:
When the Elasticsearch cluster is started, add the template file for the Nebula Graph full-text index. For more information on index templates, see [Elasticsearch Document](https://www.elastic.co/guide/en/elasticsearch/reference/current/index-templates.html).

Take the following sample template for example:

```json
{
@@ -40,29 +42,65 @@ Make sure that you specify the following fields in strict accordance with the pr
"value" :{ "type" : "keyword"}
```

!!! caution

When creating a full-text index, start the index name with `nebula`.

For example:

```bash
curl -H "Content-Type: application/json; charset=utf-8" -XPUT http://127.0.0.1:9200/_template/nebula_index_template -d '
{
"template": "nebula*",
"settings": {
"index": {
"number_of_shards": 3,
"number_of_replicas": 1
}
},
"mappings": {
"properties" : {
"tag_id" : { "type" : "long" },
"column_id" : { "type" : "text" },
"value" :{ "type" : "keyword"}
}
}
}'
```

You can configure Elasticsearch to meet your business needs. To customize Elasticsearch, see [Elasticsearch Document](https://www.elastic.co/guide/en/elasticsearch/reference/current/settings.html).
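
After uploading the template, it can be useful to confirm that Elasticsearch stored it. The sketch below assumes the same node address and template name as the `curl` example in this section; the function name `template_exists` is hypothetical. It relies on a `GET` request to the same `_template/<name>` endpoint, which returns the stored template keyed by its name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds if the named template is stored in Elasticsearch.
# Arguments: base URL of an Elasticsearch node, template name.
template_exists() {
  local es_url="$1" name="$2"
  # The response body contains the template name as a JSON key when it exists.
  curl -s "${es_url}/_template/${name}" | grep -q "\"${name}\""
}

# Usage (address and template name are taken from the example in this section):
#   if template_exists "http://127.0.0.1:9200" "nebula_index_template"; then
#     echo "template found"
#   else
#     echo "template not found"
#   fi
```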

## Sign in to the text search clients

When the Elasticsearch cluster is deployed, use the `SIGN IN` statement to sign in to the Elasticsearch clients. Multiple `elastic_ip:port` pairs are separated with commas. You must use the IPs and the port number specified in the Elasticsearch configuration file.

### Syntax

```ngql
SIGN IN TEXT SERVICE [(<elastic_ip:port> [,<username>, <password>]), (<elastic_ip:port>), ...]
SIGN IN TEXT SERVICE [(<elastic_ip:port> [,<username>, <password>]), (<elastic_ip:port>), ...];
```

When the Elasticsearch cluster is deployed, use the `SIGN IN` statement to sign in to the Elasticsearch clients. Multiple `elastic_ip:port` pairs are separated with commas. You must use the IPs and the port number in the configuration file for the Elasticsearch. For example:
### Example

```ngql
nebula> SIGN IN TEXT SERVICE (127.0.0.1:9200);
```

Elasticsearch does not have username or password by default. If you configured a username and password, you need to specify in the `SIGN IN` statement.
!!! Note

Elasticsearch does not have a username or password by default. If you configured a username and password, you need to specify them in the `SIGN IN` statement.

## Show text search clients

The `SHOW TEXT SEARCH CLIENTS` statement can list the text search clients.

### Syntax

```ngql
SHOW TEXT SEARCH CLIENTS
SHOW TEXT SEARCH CLIENTS;
```

Use the `SHOW TEXT SEARCH CLIENTS` statement to list the text search clients. For example:
### Example

```ngql
nebula> SHOW TEXT SEARCH CLIENTS;
@@ -79,11 +117,15 @@ nebula> SHOW TEXT SEARCH CLIENTS;

## Sign out of the text search clients

The `SIGN OUT TEXT SERVICE` statement can sign out all the text search clients.

### Syntax

```ngql
SIGN OUT TEXT SERVICE
SIGN OUT TEXT SERVICE;
```

Use the `SIGN OUT TEXT SERVICE` to sign out all the text search clients. For example:
### Example

```ngql
nebula> SIGN OUT TEXT SERVICE;
@@ -1,62 +1,88 @@
# Deploy Raft Listener for Nebula Storage service

Full-Text index data is written to the Elasticsearch cluster asynchronously. The Raft Listener (hereinafter shortened as Listener) is a separate process that fetches data from the Storage Service and writes them into the Elasticsearch cluster.
Full-text index data is written to the Elasticsearch cluster asynchronously. The Raft Listener (Listener for short) is a separate process that fetches data from the Storage Service and writes them into the Elasticsearch cluster.

## Prerequisites

* You have read and fully understand the [restrictions](../../4.deployment-and-installation/6.deploy-text-based-index/1.text-based-index-restrictions.md) for using Full-Text indexes.
* You have read and fully understood the [restrictions](../../4.deployment-and-installation/6.deploy-text-based-index/1.text-based-index-restrictions.md) for using full-text indexes.

* You have [deployed a Nebula Graph cluster](../deploy-nebula-graph-cluster.md).
* You have [deployed a Nebula Graph cluster](../2.compile-and-install-nebula-graph/deploy-nebula-graph-cluster.md).

* You have prepared at least one extra Storage Server. To use the Full-Text search, you must run one or more Storage Server as the Raft Listener.
* You have [deployed an Elasticsearch cluster](./2.deploy-es.md).

* You have prepared at least one extra Storage Server. To use the full-text search, you must run one or more Storage Servers as the Raft Listener.

## Precautions

* The Storage Service that you want to run as a Listener must have the same or later version with all the other Nebula Graph services in the cluster.
* The Storage Service that you want to run as the Listener must be the same release as, or a later release than, all the other Nebula Graph services in the cluster.

* For now, you can only add all Listeners to a graph space once and for all. Trying to add a new Listener to a graph space that already has a Listener will fail. To add all Listeners, set them [in one statement](#step_3_add_listeners_to_nebula_graph).

## Deployment process

### Step 1: Install the Storage service

* For now, you can only add Listeners to a graph space once and for all. Trying to add listeners to a graph space that already has a listener will fail. To add multiple listeners, set them [in one statement](#step_3_add_listeners_to_nebula_graph).
The Listener process and the storaged process use the same binary file. However, their configuration files and the ports they use are different. You can install Nebula Graph on all servers that need to deploy a Listener, but only the Storage service can be used. For details, see [Install Nebula Graph by RPM or DEB Package](../2.compile-and-install-nebula-graph/2.install-nebula-graph-by-rpm-or-deb.md).

## Step 1: Prepare the configuration file for the Listeners
### Step 2: Prepare the configuration file for the Listener

You have to prepare a Listener configuration file on the machine that you want to deploy the Listeners. The file name must be `nebula-storaged-listener.conf`. A [template](https://github.com/vesoft-inc/nebula-storage/blob/master/conf/nebula-storaged-listener.conf.production) is provided for your reference.
You have to prepare a corresponding configuration file on the machine where you want to deploy a Listener. The file must be named `nebula-storaged-listener.conf` and stored in the `etc` directory. A [template](https://github.com/vesoft-inc/nebula-storage/blob/master/conf/nebula-storaged-listener.conf.production) is provided for your reference. Note that the file suffix `.production` must be removed.

Most configurations are the same as the configurations of [Storage Service](../../5.configurations-and-logs/1.configurations/4.storage-config.md). This topic only introduces the differences.

| Name | Default value | Description |
| :----------- | :----------------------- | :------------------|
| `daemonize` | `true` | Indicates whether to start the daemon. |
| `pid_file` | `pids_listener/nebula-storaged.pid` | The file that records the process ID. |
| `meta_server_addrs` | - | IP addresses and ports of all Meta services. Multiple Meta services are separated by commas. |
| `local_ip` | - | The local IP address of the Listener service. |
| `port` | - | The listening port of the RPC daemon of the Listener service. |
| `heartbeat_interval_secs` | `10` | The heartbeat interval of the Meta service. The unit is second (s). |
| `listener_path` | `data/listener` | The WAL directory of the Listener. Only one directory is allowed. |
| `data_path` | `data` | For compatibility reasons, this parameter can be ignored. Fill in the default value `data`. |
| `part_man_type` | `memory` | The type of the part manager. Optional values are `memory` and `meta`. |
| `rocksdb_batch_size` | `4096` | The default reserved bytes for batch operations. |
| `rocksdb_block_cache` | `4` | The default block cache size of BlockBasedTable. The unit is Megabyte (MB). |
| `engine_type` | `rocksdb` | The type of the Storage engine, such as `rocksdb`, `memory`, etc. |
| `part_type` | `simple`| The type of the part, such as `simple`, `consensus`, etc. |

!!! note

Use real IP addresses in the configuration file instead of domain names or loopback IP addresses such as `127.0.0.1`.

## Step 2: Start the Listeners
### Step 3: Start Listeners

Run the following command to start the Listeners.
Run the following command to start the Listener.

```bash
./bin/nebula-storaged --flagfile ${listener_config_path}/nebula-storaged-listener.conf
./bin/nebula-storaged --flagfile <listener_config_path>/nebula-storaged-listener.conf
```

`${listener_config_path}` is the path where you store the Listener configuration file.
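
The preparation and start steps can be sketched together as follows. This is only an illustration under stated assumptions: the function name `prepare_listener_conf` and the example paths are hypothetical, and the copied template still has to be edited (for example `meta_server_addrs`, `local_ip`, and `port`) before the Listener is started.

```bash
#!/usr/bin/env bash
# Hypothetical helper: copies the Listener configuration template into the
# given etc directory, dropping the ".production" suffix from the file name.
# Prints the path of the resulting configuration file.
prepare_listener_conf() {
  local template="$1" etc_dir="$2"
  local target
  target="${etc_dir}/$(basename "$template" .production)"
  mkdir -p "$etc_dir"
  cp "$template" "$target"
  printf '%s\n' "$target"
}

# Usage (paths are assumptions; run from the Nebula Graph installation directory):
#   conf=$(prepare_listener_conf ./nebula-storaged-listener.conf.production ./etc)
#   # ...edit "$conf" to set meta_server_addrs, local_ip, and port...
#   ./bin/nebula-storaged --flagfile "$conf"
```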

## Step 3: Add Listeners to Nebula Graph
### Step 4: Add Listeners to Nebula Graph

[Connect to Nebula Graph](../../2.quick-start/3.connect-to-nebula-graph.md) and run [`USE <space>`](../../3.ngql-guide/9.space-statements/2.use-space.md) to enter the graph space that you want to create Full-Text indexes for. Then run the following statement to add the Listener into Nebula Graph.

!!! note

You must use real IPs for the listeners.
[Connect to Nebula Graph](../../2.quick-start/3.connect-to-nebula-graph.md) and run [`USE <space>`](../../3.ngql-guide/9.space-statements/2.use-space.md) to enter the graph space that you want to create full-text indexes for. Then run the following statement to add a Listener into Nebula Graph.

```ngql
ADD LISTENER ELASTICSEARCH <listener_ip:port> [,<listener_ip:port>, ...]
```

Multiple `listener_ip:port` pairs are separated with commas. For example:
!!! warning

You must use real IPs for a Listener.

Add all Listeners in one statement completely.

```ngql
nebula> ADD LISTENER ELASTICSEARCH 192.168.8.5:46780,192.168.8.6:46780;
nebula> ADD LISTENER ELASTICSEARCH 192.168.8.5:9789,192.168.8.6:9789;
```
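
Because all Listeners must be added in a single statement and loopback addresses are not allowed, it may help to generate the statement from a list of addresses. The following is a minimal sketch; the function name `build_add_listener` is hypothetical, and the loopback check covers only the obvious `127.*` and `localhost` cases.

```bash
#!/usr/bin/env bash
# Hypothetical helper: builds one ADD LISTENER statement from ip:port pairs.
# Rejects loopback addresses, since real IPs are required for Listeners.
build_add_listener() {
  [ "$#" -gt 0 ] || { echo "error: no listener address given" >&2; return 1; }
  local joined="" addr
  for addr in "$@"; do
    case "$addr" in
      127.*|localhost:*)
        echo "error: ${addr} is not a real IP address" >&2
        return 1 ;;
    esac
    # Separate pairs with commas, as in the statement syntax.
    joined+="${joined:+,}${addr}"
  done
  printf 'ADD LISTENER ELASTICSEARCH %s;\n' "$joined"
}

# Usage:
#   build_add_listener 192.168.8.5:9789 192.168.8.6:9789
```

The printed statement can then be run in the graph space selected with `USE <space>`.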

## Show Listeners

Run the `SHOW LISTENER` statement to list the Listeners.
Run the `SHOW LISTENER` statement to list all Listeners.

For example:
### Example

```ngql
nebula> SHOW LISTENER;
@@ -73,14 +99,18 @@ nebula> SHOW LISTENER;

## Remove Listeners

Run the `REMOVE LISTENER ELASTICSEARCH` statement to remove all the Elasticsearch Listeners for a graph space.
Run the `REMOVE LISTENER ELASTICSEARCH` statement to remove all Listeners in a graph space.

For example:
### Example

```ngql
nebula> REMOVE LISTENER ELASTICSEARCH;
```

## What to do next
!!! danger

After the Listener is deleted, it cannot be added again. Therefore, the synchronization to the ES cluster cannot be continued and the text index data will be incomplete. If you need the Listener again, the only option is to recreate the graph space.

## Next

After deploying the [Elasticsearch cluster](2.deploy-es.md) and the Listeners, Full-Text indexes are created automatically on the Elasticsearch cluster. You can do Full-Text search now. For more information, see [Full-Text search](../../3.ngql-guide/15.full-text-index-statements/1.search-with-text-based-index.md).
After deploying the [Elasticsearch cluster](2.deploy-es.md) and the Listener, full-text indexes are created automatically on the Elasticsearch cluster. Users can do full-text search now. For more information, see [Full-Text search](../../3.ngql-guide/15.full-text-index-statements/1.search-with-text-based-index.md).