Add a condition field for capturing status of resize on PVC
gnufied committed Jul 18, 2017
1 parent 9bb2192 commit a8c6fb8
Showing 1 changed file with 79 additions and 8 deletions.
87 changes: 79 additions & 8 deletions contributors/design-proposals/grow-volume-size.md
@@ -25,12 +25,12 @@ Enable users to increase size of PVs that their pods are using. The user will up
| ----------------| :---------------: | :--------------------------:| :----------------------: |
| EBS | Yes | Yes | Yes |
| GCE PD | Yes | Yes | Yes |
| Cinder | Yes | Yes | Yes |
| Vsphere | Yes | Yes | No |
| Ceph RBD | Yes | Yes | No |
| Host Path | No | No | No |
| GlusterFS | Yes | No | Yes |
| Azure Disk | Yes | Yes | No |
| Azure File | No | No | No |
| Cephfs | No | No | No |
| NFS | No | No | No |
@@ -63,13 +63,15 @@ For volume types that only require volume plugin based api call, this will be on
A new controller called `volume_expand_controller` will listen for pvc size expansion requests and take action as needed. The steps performed in this
new controller will be:

* Watch for PVC update requests and add the PVC to the controller's desired state of world if an increase in volume size was requested. Once the PVC is added to
the controller's desired state of world, `pvc.Status.Conditions` will be updated with `ResizeStarted: True`.
* For unbound or pending PVCs - resize will trigger no action in `volume_expand_controller`.
* A reconciler will read desired state of world and perform corresponding volume resize operation. If there is a resize operation in progress
for same volume then resize request will be pending and retried once previous resize request has completed.
* Controller resize will in effect be level based rather than edge based. If there is more than one pending resize request for the same PVC, the newest
request will replace the older pending request.
* Resize will be performed via volume plugin interface, executed inside a goroutine spawned by `operation_executor`.
* A new plugin interface called `volume.Expander` will be added to volume plugin interface. The controller call to expand the PVC will look like:

```go
func (og *operationGenerator) GenerateExpandVolumeFunc(
@@ -103,11 +105,11 @@ func (og *operationGenerator) GenerateExpandVolumeFunc(
}
```
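
The level-based behaviour described in the steps above can be illustrated with a small sketch. It is not the controller's actual code: `resizeRequest`, `desiredStateOfWorld`, `AddRequest` and `Pop` are hypothetical names, and sizes are plain byte counts rather than Kubernetes quantities. It only shows the idea that a newer pending request for a PVC replaces the older one.

```go
package main

import (
	"fmt"
	"sync"
)

// resizeRequest is a hypothetical record of one pending expansion.
type resizeRequest struct {
	PVCKey        string // "namespace/name" of the claim
	RequestedSize int64  // requested size in bytes
}

// desiredStateOfWorld keeps at most one pending request per PVC,
// which is what makes the controller level based.
type desiredStateOfWorld struct {
	mu      sync.Mutex
	pending map[string]resizeRequest
}

func newDesiredStateOfWorld() *desiredStateOfWorld {
	return &desiredStateOfWorld{pending: make(map[string]resizeRequest)}
}

// AddRequest records a request only if it grows the volume.
// A newer request for the same PVC overwrites the older pending one.
func (d *desiredStateOfWorld) AddRequest(pvcKey string, currentSize, requestedSize int64) bool {
	if requestedSize <= currentSize {
		return false
	}
	d.mu.Lock()
	defer d.mu.Unlock()
	d.pending[pvcKey] = resizeRequest{PVCKey: pvcKey, RequestedSize: requestedSize}
	return true
}

// Pop removes and returns the pending request for a PVC, if there is one.
func (d *desiredStateOfWorld) Pop(pvcKey string) (resizeRequest, bool) {
	d.mu.Lock()
	defer d.mu.Unlock()
	req, ok := d.pending[pvcKey]
	if ok {
		delete(d.pending, pvcKey)
	}
	return req, ok
}

func main() {
	dsw := newDesiredStateOfWorld()
	dsw.AddRequest("default/data", 10, 20)
	dsw.AddRequest("default/data", 10, 30) // replaces the request for 20
	req, _ := dsw.Pop("default/data")
	fmt.Println(req.RequestedSize)
}
```

A reconciler would call something like `Pop` only when no resize is in progress for the same volume, so intermediate sizes are never acted on.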

* Once volume expand is successful, the volume will be marked as expanded and new size will be updated in `pv.spec.capacity`. Any errors will be reported as *events* on PVC object.
* If resize failed in above step, in addition to events - `pvc.Status.Conditions` will be updated with `ResizeFailed: True`. Corresponding error will be added to condition field as well.
* Depending on volume type next steps would be:

* If volume is of type that does not require file system resize, then `pvc.status.capacity` will be immediately updated to reflect new size. This would conclude the volume expand operation. Also `pvc.Status.Conditions` will be updated with `Ready: True`.
* If volume is of type that requires file system resize then a file system resize will be performed on kubelet. Read below for steps that will be performed for file system resize.

* If volume plugin is of type that can not do resizing of attached volumes (such as `Cinder`) then `ExpandVolumeDevice` can return error by checking for
@@ -122,13 +124,45 @@ reported as *events* on PVC object.

### File system resize on kubelet

A File system resize will be pending on PVC until a new pod that uses this volume is scheduled somewhere. While theoretically we *can* perform
online file system resize if volume type and file system supports it - we are leaving it for next iteration of this feature.

* When calling the `MountDevice` or `Setup` call of the volume plugin, the volume manager will also compare `pv.spec.capacity` with `pvc.status.capacity`; if `pv.spec.capacity` is greater
than `pvc.status.capacity`, the volume manager will additionally resize the file system of the volume.
* The call to resize file system will be performed inside `operation_generator.GenerateMountVolumeFunc`. `VolumeToMount` struct will be enhanced to store PVC as well.
* Any errors during file system resize will be added as *events* to Pod object and mount operation will be failed.
* If there are any errors during file system resize `pvc.Status.Conditions` will be updated with `ResizeFailed: True`. Any errors will be added to
`Conditions` field.
* File System resize will not be performed on kubelet where volume being attached is ReadOnly. This is similar to pattern being used for performing formatting.
* After file system resize is successful, `pvc.status.capacity` will be updated to match `pv.spec.capacity` and volume expand operation will be considered complete. Also `pvc.Status.Conditions` will be updated with `Ready: True`.
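
The check the volume manager performs before mounting can be sketched as below. This is only an illustration under stated assumptions: `needsFileSystemResize` is a hypothetical helper, and capacities are passed as plain byte counts instead of the quantity types used by the real API objects.

```go
package main

import "fmt"

// needsFileSystemResize reports whether kubelet should grow the file system
// while mounting. pvCapacity stands for pv.spec.capacity and
// pvcStatusCapacity for pvc.status.capacity, both in bytes (an assumption
// made to keep the sketch self-contained).
func needsFileSystemResize(pvCapacity, pvcStatusCapacity int64, readOnly bool) bool {
	// Read-only attachments are never resized, mirroring the rule
	// already applied to formatting.
	if readOnly {
		return false
	}
	// The volume has been expanded but the claim's status has not caught
	// up yet, so the file system still needs to grow.
	return pvCapacity > pvcStatusCapacity
}

func main() {
	const gib = int64(1) << 30
	fmt.Println(needsFileSystemResize(20*gib, 10*gib, false))
	fmt.Println(needsFileSystemResize(20*gib, 10*gib, true))
	fmt.Println(needsFileSystemResize(10*gib, 10*gib, false))
}
```

When the helper returns true and the resize succeeds, `pvc.status.capacity` would then be set to match `pv.spec.capacity`, which makes the check return false on later mounts.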

#### Reduce coupling between resize operation and file system type

A file system resize in general requires presence of tools such as `resize2fs` or `xfs_growfs` on the host where kubelet is running. There is a concern
that open-coding calls to different resize tools directly in Kubernetes will result in coupling between file system and resize operation. To solve this problem
we have considered the following options:

1. Write a library that abstracts away various file system operations, such as resizing, formatting, etc.

    Pros:
    * Relatively well known pattern

    Cons:
    * Depending on the version with which Kubernetes is compiled, we are still tied to which file systems are supported in which version
    of Kubernetes.
2. Ship a wrapper shell script that encapsulates various file system operations; as long as the shell script supports a particular file system,
the resize operation is supported.

    Pros:
    * A Kubernetes admin can easily replace the default shell script with her own version, thereby adding support for more file system types.

    Cons:
    * I don't know if there is a pattern that exists in kube today for shipping shell scripts that are called out from code in Kubernetes. Flex is
    different because none of the flex scripts are shipped with Kubernetes.
3. Ship resizing tools in a container.


Of all options - #3 is our best bet but we are not quite there yet. Hence, I would like to propose that we ship with support for
the most common file systems in the current release and that we revisit this coupling and solve it in the next release.
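
To make the coupling concern concrete, here is a rough sketch of what supporting only the most common file systems might look like. It is an assumption for illustration, not the proposed implementation: `resizeCommand` is a hypothetical helper that only builds the command line and does not run it. It relies on `resize2fs` taking the block device and `xfs_growfs` taking the mount point of a mounted file system.

```go
package main

import "fmt"

// resizeCommand returns the command line that would grow the given file
// system to fill its device. Every supported file system type needs its
// own case, which is the coupling discussed above.
func resizeCommand(fsType, devicePath, mountPath string) ([]string, error) {
	switch fsType {
	case "ext3", "ext4":
		// resize2fs operates on the block device.
		return []string{"resize2fs", devicePath}, nil
	case "xfs":
		// xfs_growfs operates on the mount point of a mounted file system.
		return []string{"xfs_growfs", mountPath}, nil
	default:
		return nil, fmt.Errorf("file system %q is not supported for resize", fsType)
	}
}

func main() {
	cmd, err := resizeCommand("ext4", "/dev/xvdf", "/mnt/data")
	fmt.Println(cmd, err)

	_, err = resizeCommand("btrfs", "/dev/xvdf", "/mnt/data")
	fmt.Println(err)
}
```

Adding a file system means adding a case and recompiling, which is the drawback that options 2 and 3 try to avoid.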

## API and UI Design

@@ -173,6 +207,43 @@ spec:
`pvc.spec.resources.requests.storage` field of pvc object will become mutable after this change.

In addition to that, the PVC's status will have a `Conditions []PVCCondition` field, which will be used
to communicate the status of the PVC to the user.

So the `PersistentVolumeClaimStatus` will become:

```go
type PersistentVolumeClaimStatus struct {
	Phase       PersistentVolumeClaimPhase
	AccessModes []PersistentVolumeAccessMode
	Capacity    ResourceList
	// New Field added as part of this Change
	Conditions []PVCCondition
}

// new API type added
type PVCCondition struct {
	Type               PVCConditionType
	Status             ConditionStatus
	LastProbeTime      metav1.Time
	LastTransitionTime metav1.Time
	Reason             string
	Message            string
}

// new API type
type PVCConditionType string

// new Constants
const (
	PVCReady         PVCConditionType = "Ready"
	PVCResizeStarted PVCConditionType = "ResizeStarted"
	PVCResizeFailed  PVCConditionType = "ResizeFailed"
)
```



### Other API changes

This proposal relies on ability to update PVC status from kubelet. While updating PVC's status