
GKE local ssd provisioner for COS #612

Merged: 6 commits into master from gke-ssd-provision-cos-one-disk on Jul 5, 2019

Conversation

@gregwebs (Contributor) commented Jun 27, 2019

What problem does this PR solve?

The Ubuntu local SSD provisioner has a startup delay while it installs packages. Make the local SSD provisioner work on COS.

What is changed and how does it work?

On COS, remount with a UUID and set nobarrier.
We do not combine multiple disks as we do on Ubuntu.
Therefore this is recommended only if your data size is < 375 GiB (the size of a single local SSD).
Note that by default COS does not set the nobarrier option. This script sets nobarrier when it remounts; a minimal sketch of the remount step follows below.
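For illustration, a hedged sketch of what the remount step does, assuming GCE's usual `/dev/disk/by-id/google-local-ssd-*` device links and an ext4 filesystem already on each disk (paths and names are illustrative, not the exact script in this PR):

```bash
# Sketch only: device link path and mount root are assumptions.
for dev in /dev/disk/by-id/google-local-ssd-*; do
  uuid=$(blkid -s UUID -o value "$dev")   # filesystem UUID of this local SSD
  mnt="/mnt/disks/$uuid"
  mkdir -p "$mnt"
  umount "$dev" || true                   # drop the default COS mount point
  # Remount by UUID so the path is stable across reboots, with nobarrier set.
  mount -o nobarrier "UUID=$uuid" "$mnt"
done
```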

Check List

Tests

  • Manual test: created clusters needing < 375 GiB of TiKV disk using the COS image, and also a cluster needing > 375 GiB of TiKV disk.

Code changes

  • Manifest changes
  • Documentation changes

Side effects

  • If you already have this deployed, don't forget to run `kubectl delete -n kube-system local-volume-provisioner` first (a hedged example follows below).
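A hedged example of the redeploy, assuming the provisioner runs as a DaemonSet named `local-volume-provisioner` in kube-system and the manifest lives at the path referenced in this PR:

```bash
# Assumption: resource type (daemonset) and manifest path are inferred, not quoted from the PR.
kubectl delete daemonset -n kube-system local-volume-provisioner
kubectl apply -f manifests/gke/local-ssd-provision.yaml
```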

Does this PR introduce a user-facing change?:

On GKE, one can use COS for TiKV nodes with small data volumes for faster startup.

@gregwebs requested a review from cofyc on Jun 27, 2019 19:43
We also have a [daemonset](../manifests/gke/local-ssd-provision.yaml) that:
* fixes any performance issues
* remounts local SSD disks with a UUID for safety
* on Ubuntu, combines all local SSD disks into one large disk with LVM tools
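For contrast, a hedged sketch of the Ubuntu path described above, combining the local SSDs with LVM (device names and the volume group/logical volume names are illustrative assumptions):

```bash
# Assumption: two local SSDs at /dev/sdb and /dev/sdc; names are illustrative.
pvcreate /dev/sdb /dev/sdc                         # register each SSD as an LVM physical volume
vgcreate lvm-disks /dev/sdb /dev/sdc               # pool them into one volume group
lvcreate --extents 100%FREE -n combined lvm-disks  # one logical volume spanning all SSDs
mkfs.ext4 /dev/lvm-disks/combined
uuid=$(blkid -s UUID -o value /dev/lvm-disks/combined)
mkdir -p "/mnt/disks/$uuid"
mount -o nobarrier "UUID=$uuid" "/mnt/disks/$uuid" # mount by UUID, nobarrier set
```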
Contributor

Should we let the user decide whether to combine the disks, in case users need to deploy multiple TiKV instances on one node?

@gregwebs (Contributor, Author)

In the cloud, I don't think it makes sense to deploy multiple TiKV instances on one node. I am more concerned about a cluster that runs both tidb-operator and something else. But it is up to users to take these daemonsets and use them appropriately.

@cofyc (Contributor) left a comment

LGTM

@tennix (Member) left a comment

LGTM

@tennix merged commit 6692afe into master on Jul 5, 2019
@tennix deleted the gke-ssd-provision-cos-one-disk branch on Jul 5, 2019 05:27