cloud-sync

  • Runs as an Elasticsearch plugin.
  • CloudSync uses the Elasticsearch Snapshot and Restore APIs to provide a "streaming service" that moves data from one cluster to another.
Use Case(s)
  1. Provides a simple, cost-effective way of moving existing Elasticsearch indices from an on-prem cluster to the cloud.
  2. Currently developed and tested with GCP. AWS support mainly needs testing effort.
  3. Supports moving hundreds of terabytes of Elasticsearch indices.
  4. Supports a blue/green deployment process with (almost) zero downtime.
  5. Provides visibility on the sync progress.
CloudSync Install
  1. Install cloudsync
    1. On both source and sink clusters.
      /usr/share/elasticsearch/bin/elasticsearch-plugin install file:///home/logrhythm/cloud-sync-1.0.0-SNAPSHOT.zip
  2. Install GCS plugin
    1. Download GCS plugin from: https://artifacts.elastic.co/downloads/elasticsearch-plugins/repository-gcs/repository-gcs-5.6.3.zip
    2. On both source and sink clusters.
      bin/elasticsearch-plugin install file:///home/logrhythm/hack/repository-gcs-5.6.3.zip
    3. Follow steps from: https://www.elastic.co/guide/en/elasticsearch/plugins/5.6/repository-gcs-usage.html
    4. Add the GCS credentials to the Elasticsearch keystore (on the source cluster):
      /usr/share/elasticsearch/bin/elasticsearch-keystore add-file gcs.client.default.credentials_file /gcs/my_file.json
    5. If the sink cluster runs on GCP, it does not need GCS credentials in the Elasticsearch keystore.
CloudSync API
  1. Start the Source:
    PUT /cloudsync/start { "mode": "source", "store": "fs", "indices": "logs-*", "location": "/mount/cloudsync_backup" }

    1. Other stores supported: aws-s3, gcp.
    2. This request must be reissued after every restart.
  2. Start the Sink:
    PUT /cloudsync/start { "mode": "sink", "store": "fs", "location": "/mount/cloudsync_backup" }

  3. Curl examples for the 'fs' (filesystem/NFS) store

    1. Start Source
      curl -X POST "localhost:9200/cloudsync/start" -H 'Content-Type: application/json' -d' { "mode": "source", "store": "fs", "indices": "logs-*", "location": "/opt/lr/cloudsync" }'
    2. Start Sink
      curl -X POST "localhost:9201/cloudsync/start" -H 'Content-Type: application/json' -d' { "mode": "sink", "store": "fs", "indices": "logs-*", "location": "/opt/lr/cloudsync" }'

  4. Curl examples for the 'gcs' store

    1. Start Source
      curl -X POST "localhost:9200/cloudsync/start" -H 'Content-Type: application/json' -d' { "mode": "source", "store": "gcs", "indices": "logs-*", "location": "dx_cloud" }'
    2. Start Sink
      curl -X POST "localhost:9200/cloudsync/start" -H 'Content-Type: application/json' -d' { "mode": "sink", "store": "gcs", "indices": "logs-*", "location": "dx_cloud" }'

  5. Sync Status on source cluster

    GET /cloudsync/status

    curl localhost:9200/cloudsync/status
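The start and status calls above can be wrapped in a few small shell functions. This is only a sketch: the function names (`cloudsync_payload`, `cloudsync_start`, `cloudsync_watch`) are hypothetical and not part of the plugin, the request fields are the ones shown in the examples above, and the format of the status response is not assumed (it is printed as-is).

```shell
#!/usr/bin/env bash
# Hypothetical helpers around the CloudSync endpoints; not shipped with the plugin.

# Build the JSON body for /cloudsync/start.
# Args: mode (source|sink), store (e.g. fs, gcs), location, [indices pattern]
# Values are inserted verbatim, so they must not contain double quotes.
cloudsync_payload() {
  local mode="$1" store="$2" location="$3" indices="${4:-}"
  if [ -n "$indices" ]; then
    printf '{ "mode": "%s", "store": "%s", "indices": "%s", "location": "%s" }\n' \
      "$mode" "$store" "$indices" "$location"
  else
    printf '{ "mode": "%s", "store": "%s", "location": "%s" }\n' \
      "$mode" "$store" "$location"
  fi
}

# POST the start request to a node.
# Args: host:port, then the same arguments as cloudsync_payload.
cloudsync_start() {
  local host="$1"
  shift
  curl -sS -X POST "$host/cloudsync/start" \
    -H 'Content-Type: application/json' \
    -d "$(cloudsync_payload "$@")"
}

# Print the sync status every N seconds (default 30) until interrupted.
cloudsync_watch() {
  local host="$1" interval="${2:-30}"
  while true; do
    curl -sS "$host/cloudsync/status"
    echo
    sleep "$interval"
  done
}
```

For example, `cloudsync_start localhost:9200 source fs /opt/lr/cloudsync 'logs-*'` followed by `cloudsync_watch localhost:9200 60` would start the source and print its status once a minute. Since the start request has to be reissued after every restart, a wrapper like this could be called from whatever restarts the node.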

Development
  1. Built using maven.
  2. The plugin must be built for the exact Elasticsearch version it is installed on. To build for a different version, change the property in pom.xml: <properties> <elasticsearch.version>5.6.3</elasticsearch.version> </properties>
  3. cloud-sync-.zip is found in target/releases/ after mvn clean install
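Because the plugin version has to match the cluster exactly, it may help to compare the two before installing. The sketch below assumes that pom.xml contains a single `<elasticsearch.version>` element as shown above, and that `GET /` on the cluster returns JSON with a `"number"` field under `version`. All function names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical version check; not part of the plugin build.

# Read the Elasticsearch version from a pom.xml supplied on stdin.
pom_es_version() {
  sed -n 's:.*<elasticsearch\.version>\([^<]*\)</elasticsearch\.version>.*:\1:p' | head -n 1
}

# Extract version.number from the JSON returned by GET / (supplied on stdin).
cluster_es_version() {
  sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Compare the pom.xml version with the version a running node reports.
# Args: path to pom.xml, host:port
check_versions() {
  local pom_v cluster_v
  pom_v="$(pom_es_version < "$1")"
  cluster_v="$(curl -sS "$2" | cluster_es_version)"
  if [ "$pom_v" = "$cluster_v" ]; then
    echo "match: $pom_v"
  else
    echo "mismatch: pom=$pom_v cluster=$cluster_v" >&2
    return 1
  fi
}
```

A call such as `check_versions pom.xml localhost:9200 && mvn clean install` would then build only when the versions agree.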
Backlog
  1. Fix the missing response header type.
  2. Tested only with the plugin installed on a single node in the cluster. Support multiple nodes for high availability.

FAQS

  1. "failed":"no such index" => your index pattern resolved to zero indices on the source cluster.
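One way to check a pattern before starting the source is to ask Elasticsearch which indices it matches, using the standard `_cat/indices` API. The helper names below are hypothetical, and the sketch treats any HTTP error as "no matches" (via `curl -f`), which is a simplification.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check for an index pattern.

# Count non-empty lines on stdin; prints 0 for empty input.
count_lines() {
  grep -c . || true
}

# Print how many indices on the given node match the pattern.
# Args: host:port, index pattern (e.g. 'logs-*')
count_matching_indices() {
  local host="$1" pattern="$2"
  # h=index limits the _cat output to one index name per line.
  curl -sS -f "$host/_cat/indices/$pattern?h=index" | count_lines
}
```

If `count_matching_indices localhost:9200 'logs-*'` prints 0, the start request for that pattern can be expected to fail with "no such index".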

Quick Reference: Elasticsearch Snapshot & Restore API

  1. Verify that the repository is present:
    curl -X POST "localhost:9200/_snapshot/cloudsync_backup/_verify"

  2. Create a repository:
    curl -XPUT 'http://localhost:9200/_snapshot/cloudsync_backup' -H 'Content-Type: application/json' -d '{ "type": "fs", "settings": { "location": "/opt/lr/cloudsync", "compress": true } }'

  3. Delete a repository (all snapshots must be deleted before this operation):
    curl -X DELETE "localhost:9200/_snapshot/cloudsync_backup"

  4. Snapshot an index:
    curl -X PUT "localhost:9200/_snapshot/cloudsync_backup/snapshot_1" -H 'Content-Type: application/json' -d' { "indices": "logs-2017-10-21", "ignore_unavailable": true, "include_global_state": false, "chunk_size": "10m" }'

  5. Check snapshot status:
    curl -X GET "localhost:9200/_snapshot/cloudsync_backup/snapshot_1/_status?pretty"

  6. Delete a snapshot:
    curl -X DELETE "localhost:9200/_snapshot/cloudsync_backup/snapshot_1"

  7. Restore a snapshot:
    curl -X POST "localhost:9201/_snapshot/cloudsync_backup/snapshot_1/_restore"

  8. List all snapshots:
    curl -X GET "localhost:9200/_snapshot/cloudsync_backup/_all?pretty"
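Strung together, the calls above give a manual one-shot copy of an index from one cluster to another. The sketch below is not how CloudSync is implemented internally; it only illustrates the underlying Snapshot and Restore calls. It assumes the same repository is already registered on both clusters, and it uses the standard `wait_for_completion=true` parameter so the restore starts only after the snapshot finishes. The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical manual sync using only the Snapshot and Restore API.

# Build the URL of a snapshot. Args: host:port, repository, snapshot name
snapshot_url() {
  printf '%s/_snapshot/%s/%s' "$1" "$2" "$3"
}

# Snapshot indices on the source cluster, then restore them on the sink.
# Args: source host:port, sink host:port, repository, snapshot name, indices
manual_sync() {
  local src="$1" sink="$2" repo="$3" snap="$4" indices="$5"
  # Take the snapshot and block until it completes; stop on HTTP errors.
  curl -sS -f -X PUT "$(snapshot_url "$src" "$repo" "$snap")?wait_for_completion=true" \
    -H 'Content-Type: application/json' \
    -d "{ \"indices\": \"$indices\", \"ignore_unavailable\": true, \"include_global_state\": false }" \
    || return 1
  echo
  # Restore the finished snapshot on the sink cluster.
  curl -sS -f -X POST "$(snapshot_url "$sink" "$repo" "$snap")/_restore"
}
```

For example: `manual_sync localhost:9200 localhost:9201 cloudsync_backup snapshot_1 logs-2017-10-21`. Note that Elasticsearch refuses to restore over an index that already exists and is open on the target cluster.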