From 0b5b11107a705228373ddb3944f2388a8607eb53 Mon Sep 17 00:00:00 2001
From: Derek Carr
Date: Tue, 12 Sep 2017 16:22:31 -0400
Subject: [PATCH] HugePages documentation

---
 _data/tasks.yml                               |  4 +
 docs/tasks/index.md                           |  4 +
 .../manage-hugepages/scheduling-hugepages.md  | 77 +++++++++++++++++++
 3 files changed, 85 insertions(+)
 create mode 100644 docs/tasks/manage-hugepages/scheduling-hugepages.md

diff --git a/_data/tasks.yml b/_data/tasks.yml
index 53beb6e0e9bd2..5b2035255fcde 100644
--- a/_data/tasks.yml
+++ b/_data/tasks.yml
@@ -187,6 +187,10 @@ toc:
   section:
   - docs/tasks/manage-gpus/scheduling-gpus.md

+- title: Manage HugePages
+  section:
+  - docs/tasks/manage-hugepages/scheduling-hugepages.md
+
 - title: Extend kubectl with plugins
   section:
   - docs/tasks/extend-kubectl/kubectl-plugins.md
diff --git a/docs/tasks/index.md b/docs/tasks/index.md
index 86a435eeae719..01e171b39a2a2 100644
--- a/docs/tasks/index.md
+++ b/docs/tasks/index.md
@@ -62,6 +62,10 @@ Perform common tasks for managing a DaemonSet, such as performing a rolling upda

 Configure and schedule NVIDIA GPUs for use as a resource by nodes in a cluster.

+#### Managing HugePages
+
+Configure and schedule huge pages as a schedulable resource in a cluster.
+
 ### What's next

 If you would like to write a task page, see
diff --git a/docs/tasks/manage-hugepages/scheduling-hugepages.md b/docs/tasks/manage-hugepages/scheduling-hugepages.md
new file mode 100644
index 0000000000000..324053862285d
--- /dev/null
+++ b/docs/tasks/manage-hugepages/scheduling-hugepages.md
@@ -0,0 +1,77 @@
+---
+approvers:
+- derekwaynecarr
+title: Scheduling HugePages
+---
+
+{% capture overview %}
+
+Kubernetes includes **alpha** support for managing huge pages spread across
+nodes. This page describes how users can consume huge pages and the current
+limitations.
+
+{% endcapture %}
+
+{% capture prerequisites %}
+
+1. Kubernetes nodes must pre-allocate huge pages in order for the node to
+   discover them.
+   A node may only pre-allocate huge pages for a single size.
+1. A special **alpha** feature gate `HugePages` has to be set to true across the
+   system: `--feature-gates="HugePages=true"`.
+
+The nodes will automatically discover and expose all huge page resources as
+schedulable resources.
+
+{% endcapture %}
+
+{% capture steps %}
+
+## API
+
+Huge pages can be consumed via container-level resource requirements using the
+resource name `hugepages-{size}`, where size is the most compact binary notation
+using integer values supported on a particular node. For example, if a node
+supports 2048kB page sizes, it will expose a schedulable resource
+`hugepages-2Mi`. Unlike CPU or memory, huge pages do not support overcommit.
+
+```yaml
+apiVersion: v1
+kind: Pod
+metadata:
+  generateName: hugepages-volume-
+spec:
+  containers:
+  - image: fedora:latest
+    command:
+    - sleep
+    - inf
+    name: example
+    volumeMounts:
+    - mountPath: /hugepages
+      name: hugepage
+    resources:
+      limits:
+        hugepages-2Mi: 100Mi
+  volumes:
+  - name: hugepage
+    emptyDir:
+      medium: HugePages
+```
+
+- Huge page requests must equal the limits.
+- Huge pages are isolated at pod scope; container isolation is planned for a
+  future iteration.
+- EmptyDir volumes backed by HugePages may not consume more huge page memory
+  than the pod request.
+- Applications that consume huge pages via `shmget()` with `SHM_HUGETLB` must
+  run with a supplemental group that matches `/proc/sys/vm/hugetlb_shm_group`.
+
+## Future
+
+- Support container isolation of huge pages in addition to pod isolation.
+- NUMA locality guarantees as a feature of quality of service.
+- ResourceQuota support.
+- LimitRange support.
+
+{% endcapture %}
+
+{% include templates/task.md %}