Add initial security best practices documentation #8952
@@ -7,36 +7,170 @@ owner: istio/wg-security-maintainers | |
test: no | ||
--- | ||
|
||
This section provides some deployment guidelines to help keep a service mesh secure. | ||
|
||
## Use namespaces for isolation | ||
|
||
If there are multiple service operators (a.k.a. [SREs](https://en.wikipedia.org/wiki/Site_reliability_engineering)) | ||
deploying different services in a medium- or large-size cluster, we recommend creating a separate | ||
[Kubernetes namespace](https://kubernetes.io/docs/tasks/administer-cluster/namespaces-walkthrough/) for each SRE team to isolate their access. | ||
For example, you can create a `team1-ns` namespace for `team1` and a `team2-ns` namespace for `team2`, so
that neither team can access the other's services.
|
||
Let us consider a three-tier application with three services: `photo-frontend`, | ||
`photo-backend`, and `datastore`. The photo SRE team manages the | ||
`photo-frontend` and `photo-backend` services while the datastore SRE team | ||
manages the `datastore` service. The `photo-frontend` service can access | ||
`photo-backend`, and the `photo-backend` service can access `datastore`. | ||
However, the `photo-frontend` service cannot access `datastore`. | ||
|
||
In this scenario, a cluster administrator creates two namespaces: | ||
`photo-ns` and `datastore-ns`. The administrator has | ||
access to all namespaces and each team only has access to its own namespace. | ||
The photo SRE team creates two service accounts to run `photo-frontend` and | ||
`photo-backend` respectively in the `photo-ns` namespace. The datastore SRE | ||
team creates one service account to run the `datastore` service in the | ||
`datastore-ns` namespace. Moreover, we need to enforce the service access | ||
control in [Istio Mixer](https://istio.io/v1.6/docs/reference/config/policy-and-telemetry/) such that | ||
`photo-frontend` cannot access `datastore`.
|
||
In this setup, Kubernetes can isolate the operator privileges on managing the services. | ||
Istio manages certificates and keys in all namespaces | ||
and enforces different access control rules to the services. | ||
Istio security features provide strong identity, powerful policy, transparent TLS encryption, and authentication, authorization and audit (AAA) tools to protect your services and data. | ||
However, to fully make use of these features securely, care must be taken to follow best practices. It is recommended to review the [Security overview](/docs/concepts/security/) before proceeding. | ||
|
||
## Mutual TLS | ||
|
||
Istio will [automatically](/docs/ops/configuration/traffic-management/tls-configuration/#auto-mtls) encrypt traffic using [Mutual TLS](/docs/concepts/security/#mutual-tls-authentication) whenever possible. | ||
However, proxies are configured in [permissive mode](/docs/concepts/security/#permissive-mode) by default, meaning they will accept both mutual TLS and plaintext traffic. | ||
|
||
While this is required for incremental adoption or allowing traffic from clients without an Istio sidecar, it also weakens the security stance. | ||
It is recommended to [migrate to strict mode](/docs/tasks/security/authentication/mtls-migration/) when possible, to enforce that mutual TLS is used. | ||
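
As a hedged illustration of the strict mode recommended above, a mesh-wide `PeerAuthentication` might look like the following sketch. It assumes that `istio-system` is the Istio root namespace, which is the default but may differ in your installation. Apply a policy like this only after all clients have sidecars, because plaintext traffic is rejected once it takes effect.

{{< text yaml >}}
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  # Assumption: istio-system is the root namespace, so the policy applies mesh-wide.
  namespace: istio-system
spec:
  mtls:
    # Reject plaintext traffic; accept only mutual TLS.
    mode: STRICT
{{< /text >}}

The same resource can be created in a single namespace instead, which allows strict mode to be rolled out one namespace at a time.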
|
||
Mutual TLS alone is not always enough to fully secure traffic, however, as it provides only authentication, not authorization. | ||
This means that anyone with a valid certificate can still access a service. | ||
|
||
To fully lock down traffic, it is recommended to configure [authorization policies](/docs/tasks/security/authorization/). | ||
These allow creating fine-grained policies to allow or deny traffic. For example, you can allow only requests from the `app` namespace to access the `hello-world` service. | ||
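
A minimal sketch of the `hello-world` example might look like this. The policy name, the `default` namespace, and the `app: hello-world` label are assumptions made for illustration; adjust them to match your workload.

{{< text yaml >}}
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: hello-world-allow-app   # hypothetical name
  namespace: default            # assumption: hello-world runs in the default namespace
spec:
  selector:
    matchLabels:
      app: hello-world          # assumption: label carried by the hello-world pods
  action: ALLOW
  rules:
  - from:
    - source:
        # Only workloads running in the app namespace are allowed.
        namespaces: ["app"]
{{< /text >}}

The `namespaces` source field is derived from the peer certificate, so a rule like this only works when mutual TLS is in use between the client and the service.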
|
||
Review comment: After mutual TLS, I think it'd be good to have a section for using the beta security policies. We should cover the limitations of ALLOW policies / describe when it's safer to use DENY. We should also describe how authentication policies are decoupled from authorization policies and make it clear that authentication policies without corresponding authorization policies are just security theater.

Review comment: I would need help with "We should cover the limitations of ALLOW policies / describe when it's safer to use DENY.", if one of the people mentioned above could take this on? I could take on the latter part if needed.
||
## Understand traffic capture limitations | ||
|
||
The Istio sidecar works by capturing both inbound traffic and outbound traffic and directing them through the sidecar proxy. | ||
|
||
However, not *all* traffic is captured: | ||
|
||
* Redirection only handles TCP based traffic. Any UDP or ICMP packets will not be captured or modified. | ||
* Inbound capture is disabled on many [ports used by the sidecar](/docs/ops/deployment/requirements/#ports-used-by-istio) as well as port 22. This list can be expanded by options like `traffic.sidecar.istio.io/excludeInboundPorts`. | ||
* Outbound capture may similarly be reduced through settings like `traffic.sidecar.istio.io/excludeOutboundPorts` or other means. | ||
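
To make the exclusion settings above easier to recognize during review, here is a sketch of how they appear on a workload. The `Deployment` name, image, and port numbers are hypothetical; only the two annotation keys come from the text above.

{{< text yaml >}}
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-app             # hypothetical workload
spec:
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
      annotations:
        # Inbound traffic to port 8081 is NOT redirected to the sidecar.
        traffic.sidecar.istio.io/excludeInboundPorts: "8081"
        # Outbound traffic to port 3306 is NOT redirected to the sidecar.
        traffic.sidecar.istio.io/excludeOutboundPorts: "3306"
    spec:
      containers:
      - name: app
        image: example.com/example-app:latest   # placeholder image
{{< /text >}}

Istio policies do not apply to traffic on ports excluded this way, so each exclusion deserves the same scrutiny as a firewall exception.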
Review comment: I wonder if there should be a stronger link between this section and the immediately following NetworkPolicy section? You could explain which of the issues raised here can be addressed with k8s NetworkPolicy.
||
|
||
In general, there is only a minimal security boundary between an application and its sidecar proxy. Configuration of the sidecar is allowed on a per-pod basis, and both run in the same network/process namespace.
As such, the application may have the ability to remove redirection rules and remove, alter, terminate, or replace the sidecar proxy. | ||
This allows a pod to intentionally bypass its sidecar for outbound traffic or intentionally allow inbound traffic to bypass its sidecar. | ||
|
||
As a result, it is not secure to rely on all traffic being captured unconditionally by Istio. | ||
Instead, the security boundary is that a client may not bypass *another* pod's sidecar. | ||
|
||
For example, if I run the `reviews` application on port `9080`, I can assume that all traffic from the `productpage` application will be captured by the sidecar proxy, | ||
where Istio authentication and authorization policies may apply. | ||
|
||
### Defense in depth with `NetworkPolicy` | ||
|
||
To further secure traffic, Istio policies can be layered with Kubernetes [Network Policies](https://kubernetes.io/docs/concepts/services-networking/network-policies/). | ||
This enables a strong [defense in depth](https://en.wikipedia.org/wiki/Defense_in_depth_(computing)) strategy that can be used to further strengthen the security of your mesh. | ||
|
||
For example, you may choose to only allow traffic to port `9080` of our `reviews` application. | ||
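
A sketch of such a policy is shown below. It assumes that the `reviews` pods carry the label `app: reviews` and run in the `default` namespace, and that the cluster's network plugin enforces `NetworkPolicy`; none of these are guaranteed by the text above.

{{< text yaml >}}
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: reviews-allow-9080      # hypothetical name
  namespace: default            # assumption: reviews runs in the default namespace
spec:
  podSelector:
    matchLabels:
      app: reviews              # assumption: label carried by the reviews pods
  policyTypes:
  - Ingress
  ingress:
  # Allow inbound TCP traffic to port 9080 only; all other inbound ports are denied.
  - ports:
    - protocol: TCP
      port: 9080
{{< /text >}}

Depending on your setup, additional ports may need to be allowed, for example if metrics are scraped from the sidecar.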
|
||
In the event of a compromised pod or security vulnerability in the cluster, this may limit or stop an attacker's progress.
|
||
### Securing egress traffic | ||
|
||
A common misconception is that options like [`outboundTrafficPolicy: REGISTRY_ONLY`](/docs/tasks/traffic-management/egress/egress-control/#envoy-passthrough-to-external-services) act as a security policy preventing all access to undeclared services.
However, because a pod can bypass its own sidecar as described above, this is not a strong security boundary and should be considered best-effort.
|
||
While this is useful to prevent accidental dependencies, if you want to secure egress traffic and enforce that all outbound traffic goes through a proxy, you should instead rely on an [Egress Gateway](/docs/tasks/traffic-management/egress/egress-gateway/).
When combined with a [Network Policy](/docs/tasks/traffic-management/egress/egress-gateway/#apply-kubernetes-network-policies), you can enforce that all traffic, or some subset, goes through the egress gateway.
This ensures that even if a client accidentally or maliciously bypasses its sidecar, the request will be blocked.
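
As a rough sketch of the network policy side of this setup, the following restricts egress from a hypothetical `app` namespace to DNS in `kube-system` and to pods in `istio-system`, where the egress gateway is assumed to run. It relies on the `kubernetes.io/metadata.name` namespace label, which recent Kubernetes versions set automatically; on older clusters you would need to label the namespaces yourself.

{{< text yaml >}}
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: egress-via-istio-system-only   # hypothetical name
  namespace: app                       # hypothetical application namespace
spec:
  podSelector: {}                      # applies to every pod in the namespace
  policyTypes:
  - Egress
  egress:
  # Allow DNS lookups against kube-system.
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: UDP
      port: 53
    - protocol: TCP
      port: 53
  # Allow traffic to istio-system (control plane and egress gateway); everything else is denied.
  - to:
    - namespaceSelector:
        matchLabels:
          kubernetes.io/metadata.name: istio-system
{{< /text >}}

With a policy like this in place, a pod that bypasses its sidecar still cannot reach external hosts directly, because only the gateway's namespace is reachable.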
|
||
## Configure TLS verification in Destination Rule when using TLS origination | ||
|
||
Istio offers the ability to [originate TLS](/docs/tasks/traffic-management/egress/egress-tls-origination/) from the sidecar proxy. | ||
This enables applications that send plaintext HTTP traffic to be transparently "upgraded" to HTTPS. | ||
|
||
Care must be taken when configuring the `DestinationRule`'s `tls` setting to specify the `caCertificates` field.
When this is not set, the server's certificate will not be verified.

Review comment: Out of scope, but it feels like the API should require either

Review comment: That's the plan, but it's challenging to change due to backwards compat.
||
|
||
For example: | ||
|
||
{{< text yaml >}} | ||
apiVersion: networking.istio.io/v1beta1 | ||
kind: DestinationRule | ||
metadata: | ||
name: google-tls | ||
spec: | ||
host: google.com | ||
trafficPolicy: | ||
tls: | ||
mode: SIMPLE | ||
caCertificates: /etc/ssl/certs/ca-certificates.crt | ||
{{< /text >}} | ||
|
||
## Gateways | ||
|
||
When running an Istio [gateway](/docs/tasks/traffic-management/ingress/), there are a few resources involved: | ||
|
||
* `Gateway`s, which control the ports and TLS settings for the gateway.
* `VirtualService`s, which control the routing logic. These are associated with `Gateway`s by direct reference in the `gateways` field and a mutual agreement on the `hosts` field in the `Gateway` and `VirtualService`. | ||
|
||
### Restrict `Gateway` creation privileges | ||
|
||
It is recommended to restrict creation of Gateway resources to trusted cluster administrators. This can be achieved by [Kubernetes RBAC policies](https://kubernetes.io/docs/reference/access-authn-authz/rbac/) or tools like [Open Policy Agent](https://www.openpolicyagent.org/). | ||
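
One way to sketch this with Kubernetes RBAC is a role for application teams that grants write access to routing resources but only read access to `Gateway`s. The role name is hypothetical, and the exact resource list should be adapted to your needs.

{{< text yaml >}}
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: istio-route-editor     # hypothetical name
rules:
# Application teams may manage their own routing configuration.
- apiGroups: ["networking.istio.io"]
  resources: ["virtualservices", "destinationrules"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
# Gateways are read-only for this role; only administrators create or change them.
- apiGroups: ["networking.istio.io"]
  resources: ["gateways"]
  verbs: ["get", "list", "watch"]
{{< /text >}}

Because RBAC permissions are additive, this only helps if no other role bound to the same users grants write access to `gateways`.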
Review comment: May be worth mentioning that OPA/Gatekeeper can also be used to enforce some of these policies, like restricting hosts.

Review comment: I may be misunderstanding, but doesn't this line already recommend OPA?

Review comment: I mean that it mentions OPA as a way to restrict Gateway creation to administrators, but it might be worth saying that OPA could be used for more of these recommendations too.

Review comment: Ah, got it. Agreed 👍
||
|
||
### Avoid overly broad `hosts` configurations | ||
|
||
When possible, avoid overly broad `hosts` settings in `Gateway`. | ||
|
||
For example, this configuration will allow any `VirtualService` to bind to the `Gateway`, potentially exposing unexpected domains: | ||
|
||
{{< text yaml >}} | ||
servers: | ||
- port: | ||
number: 80 | ||
name: http | ||
protocol: HTTP | ||
hosts: | ||
- "*" | ||
{{< /text >}} | ||
|
||
This should be locked down to allow only specific domains or specific namespaces: | ||
|
||
{{< text yaml >}} | ||
servers: | ||
- port: | ||
number: 80 | ||
name: http | ||
protocol: HTTP | ||
hosts: | ||
- "foo.example.com" # Allow only VirtualServices that are for foo.example.com | ||
- "default/bar.example.com" # Allow only VirtualServices in the default namespace that are for bar.example.com | ||
- "route-namespace/*" # Allow only VirtualServices in the route-namespace namespace for any host | ||
{{< /text >}} | ||
|
||
### Isolate sensitive services | ||
|
||
It may be desired to enforce stricter physical isolation for sensitive services. For example, you may want to run a | ||
Review comment: May be helpful to offer example reasons for why someone might want to do this.
||
[dedicated gateway instance](/docs/setup/install/istioctl/#configure-gateways) for a sensitive `payments.example.com`, while utilizing a single | ||
shared gateway instance for less sensitive domains like `blog.example.com` and `store.example.com`. | ||
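
As an illustration, a `Gateway` resource can be bound to a dedicated gateway deployment through its `selector`. The sketch below assumes that a separate gateway deployment labeled `istio: payments-ingressgateway` already exists and that a TLS secret named `payments-credential` is available to it; both names, as well as the `payments` namespace, are hypothetical.

{{< text yaml >}}
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: payments-gateway        # hypothetical name
  namespace: payments           # hypothetical namespace
spec:
  # Bind only to the dedicated gateway pods, not the shared ingress gateway.
  selector:
    istio: payments-ingressgateway
  servers:
  - port:
      number: 443
      name: https
      protocol: HTTPS
    tls:
      mode: SIMPLE
      credentialName: payments-credential   # hypothetical TLS secret
    hosts:
    - "payments.example.com"
{{< /text >}}

Less sensitive domains would continue to use a `Gateway` whose selector matches the shared gateway deployment.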
|
||
## Protocol detection | ||
|
||
Istio will [automatically determine the protocol](/docs/ops/configuration/traffic-management/protocol-selection/#automatic-protocol-selection) of traffic it sees. | ||
To avoid accidental or intentional misdetection, which may result in unexpected traffic behavior, it is recommended to [explicitly declare the protocol](/docs/ops/configuration/traffic-management/protocol-selection/#explicit-protocol-selection) where possible.
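
For instance, the protocol can be declared on a Kubernetes `Service` either through the port name prefix or, on Kubernetes versions that support it, through the `appProtocol` field. The service and port names below are hypothetical.

{{< text yaml >}}
apiVersion: v1
kind: Service
metadata:
  name: reviews
spec:
  selector:
    app: reviews                # assumption: label carried by the reviews pods
  ports:
  # Protocol declared through the port name prefix: <protocol>[-<suffix>].
  - name: http-web
    port: 9080
    targetPort: 9080
  # Protocol declared through the appProtocol field.
  - name: metrics               # hypothetical second port
    port: 9090
    targetPort: 9090
    appProtocol: http
{{< /text >}}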
|
||
## CNI | ||
|
||
In order to transparently capture all traffic, Istio relies on `iptables` rules configured by the `istio-init` `initContainer`. | ||
This adds a [requirement](/docs/ops/deployment/requirements/) for the `NET_ADMIN` and `NET_RAW` [capabilities](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-capabilities-for-a-container) to be available to the pod. | ||
|
||
To reduce privileges granted to pods, Istio offers a [CNI plugin](/docs/setup/additional-setup/cni/) which removes this requirement. | ||
|
||
{{< warning >}} | ||
The Istio CNI plugin is currently an alpha feature. | ||
Review comment: @stewartbutler @mandarjog @justinpettit heads up, this will put pressure on carrying istio/cni into beta.
||
{{< /warning >}} | ||
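
As a sketch, the CNI plugin described above can be enabled through an `IstioOperator` configuration such as the one below. Platform-specific settings (for example, the CNI binary and configuration directories) are omitted here and may be required on your cluster; consult the CNI installation page for details.

{{< text yaml >}}
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  components:
    cni:
      # Install the Istio CNI plugin so pods no longer need NET_ADMIN and NET_RAW.
      enabled: true
{{< /text >}}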
|
||
## Use hardened docker images | ||
|
||
Istio's default docker images, including those run by the control plane, gateway, and sidecar proxies, are based on `ubuntu`. | ||
This provides various tools such as `bash` and `curl`, trading an increased attack surface for convenience.
|
||
Istio also offers a smaller image based on [distroless images](/docs/ops/configuration/security/harden-docker-images/) that reduces the dependencies in the image. | ||
|
||
{{< warning >}} | ||
Distroless images are currently an alpha feature. | ||
Review comment: @howardjohn @sdake can we take this to beta?

Review comment: opened a tracker in istio/enhancements#22
||
{{< /warning >}} | ||
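
A minimal sketch of selecting the distroless images at install time is to set the image tag in an `IstioOperator` configuration. The version in the tag below is a placeholder, not a recommendation; substitute the Istio version you are installing.

{{< text yaml >}}
apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  # Placeholder version: use <your Istio version>-distroless.
  tag: 1.7.0-distroless
{{< /text >}}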
|
||
Review comment: Please add:
|
||
## Release and security policy | ||
|
||
In order to ensure your cluster has the latest security patches for known vulnerabilities, it is important to stay on the latest patch release of Istio and ensure that you are on a [supported release](/about/supported-releases) that is still receiving security patches. | ||
|
||
## Avoid alpha and experimental features | ||
|
||
All Istio features and APIs are assigned a [feature status](/about/feature-stages/), which defines their stability, deprecation policy, and security policy.
|
||
Because alpha and experimental features do not have security guarantees as strong as those of more mature features, it is recommended to avoid them whenever possible.

Review comment: Perhaps make this stronger. We do not create security releases for vulnerabilities in experimental features. We decide on a case by case basis whether to fix vulnerabilities in alpha features (basically, if we think there is broad adoption of an alpha feature, we'll patch vulnerabilities following principles of responsible disclosure).
||
|
||
To determine the feature status of features in use in your cluster, consult the [Istio features](/about/feature-stages/#istio-features) list. | ||
|
||
<!-- In the future, we should document the `istioctl` command to check this when available. --> | ||
|
||
## Configure third party service account tokens | ||
|
||
|
@@ -66,4 +200,4 @@ To determine if your cluster supports third party tokens, look for the `TokenReq | |
} | ||
{{< /text >}} | ||
|
||
While most cloud providers support this feature now, many local development tools and custom installations may not. To enable this feature, please refer to the [Kubernetes documentation](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/#service-account-token-volume-projection). | ||
While most cloud providers support this feature now, many local development tools and custom installations may not prior to Kubernetes 1.20. To enable this feature, please refer to the [Kubernetes documentation](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/#service-account-token-volume-projection). |
Review comment: I wonder if there should be an anti-patterns section (things not to do) and one of the items should be not relying too much on namespaces for isolation?