Security implications of Kubernetes operators
February 15, 2022
Managing resources in early versions of Kubernetes was a straightforward affair: we could define resources with YAML markup and submit these definitions to the cluster. But this turned out to require too much manual work, and at too low a level.
The next step in the evolution of Kubernetes was to use Helm charts. Sometimes called "the package manager for Kubernetes," Helm allowed developers to share entire application setups using a templating language. This made sharing configurations easy, and allowed us to conveniently deploy charts with one command.
But Helm is an external add-on, and its capabilities are limited to what we can implement through the existing Kubernetes API. So the logical next step for Kubernetes was the extension of the Kubernetes API using operators. Operators gave us the ability to add custom functionality from within a cluster.
Kubernetes operators provide their own control loops to carry out tasks that were previously carried out by human operators. To create a Kubernetes operator, the developer defines custom code to interact with the Kubernetes API and guide the application lifecycle automatically. This custom code usually runs in one or more pods on the cluster, but it could just as well interact with the cluster from the outside with appropriate authentication.
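To make "extending the Kubernetes API" more concrete: operators typically pair their control loop with a custom resource definition (CRD) that adds a new resource type to the cluster. The following is a minimal sketch and is not taken from any particular operator. The group example.com, the WebServer kind, and the replicas field are all hypothetical names chosen for illustration.

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  # The name must have the form <plural>.<group>
  name: webservers.example.com
spec:
  group: example.com          # hypothetical API group
  scope: Namespaced           # instances live inside a namespace
  names:
    plural: webservers
    singular: webserver
    kind: WebServer           # hypothetical resource kind
  versions:
  - name: v1alpha1
    served: true
    storage: true
    schema:
      openAPIV3Schema:
        type: object
        properties:
          spec:
            type: object
            properties:
              replicas:       # hypothetical field the operator would act on
                type: integer
                minimum: 1
```

Once a definition like this is installed, the operator's control loop can watch WebServer objects and reconcile the cluster toward whatever they describe. The operator therefore needs API permissions for every resource type it touches, which is where the security questions begin.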
Are you thinking this setup sounds like a potential place for vulnerabilities? It is. As always, where there are new abstractions, there are new Kubernetes security issues.
In this post, we’ll look at some examples of proper operator permission scoping, the tandem roles that operator creators and end-users have in ensuring security, and a few ways to use operators to make Kubernetes services more secure.
Kubernetes security with RBAC
If we’re deploying a Kubernetes operator to a cluster, we should treat general Kubernetes security conventions as foundational to our operator-specific considerations. Let’s examine these general considerations first.
The main permissions system in Kubernetes is role-based access control (RBAC) authorization, which must be enabled when starting the cluster. It offers the following permission resources:
Roles and RoleBindings, which only apply to the namespace in which they are defined; and
ClusterRoles and ClusterRoleBindings, which apply to the entire cluster (and should only be used if necessary).
Roles define what access actors have to resources, and RoleBindings connect human or software component actors to Roles. As with general operating system security, it’s important to only provide strictly necessary permissions, and to regularly review granted permissions to verify that they're still required.
Here’s an example of a role that allows read access to the pod resources in the namespace specified:
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: my-webserver
  name: pod-reader
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list"]
```
The empty string for apiGroups indicates the core API. To match this, we can define a role binding that assigns this role to a specific user account:
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: my-webserver
  name: read-pods
subjects:
- kind: User
  name: emilio
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```
Operator deployments often come with their own sets of roles and bindings. The user should make a point to check these, and the documentation, to see exactly what is being granted on the user’s cluster by a third party.
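As a rough illustration of what a narrowly scoped set of operator permissions might look like, the sketch below binds a dedicated service account to a namespace-local role. The webserver-operator name and the choice of resources and verbs are assumptions made for this example. A real operator's manifests will differ, and those are what should be reviewed.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  namespace: my-webserver
  name: webserver-operator        # hypothetical operator identity
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: my-webserver
  name: webserver-operator
rules:
# List only the resource types the operator manages; avoid "*" wildcards
- apiGroups: ["apps"]
  resources: ["deployments"]
  verbs: ["get", "list", "watch", "create", "update", "patch"]
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: my-webserver
  name: webserver-operator
subjects:
- kind: ServiceAccount
  name: webserver-operator
  namespace: my-webserver
roleRef:
  kind: Role
  name: webserver-operator
  apiGroup: rbac.authorization.k8s.io
```

When reviewing a third-party operator, wildcard entries in resources or verbs and any ClusterRoleBinding deserve a closer look. One way to inspect the effective permissions of such an account is kubectl auth can-i --list, combined with --as=system:serviceaccount:my-webserver:webserver-operator and the namespace flag.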
Scopes and permissions
Kubernetes operator security comes with its own chain of trust. This chain starts with the operator’s authors and their repository and continues to the way the operator is delivered to the user’s cluster, among many other things. Similar to the RBAC system, it makes sense to restrict the operator’s permissions as much as possible. Unfortunately, most Kubernetes operators require fairly broad privileges to carry out their functionality, so developers need to strike a careful balance between security and utility.
Like roles and role bindings, operators can either be limited to the scope of a namespace (namespace-scoped) or they can operate over all namespaces on the entire cluster (cluster-scoped). Kubernetes applications can be contained in their own separate namespace to isolate them from each other and the Kubernetes system.
Whenever possible, operator developers should choose namespace-scoped variants. This prevents problems from spilling over into other deployments on the same cluster. If an operator wants to provide services to all deployments on a cluster, cluster-scoped operators are necessary. A well-known example of this is a deployment that automatically provisions certificates to applications.
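The difference between the two scopes shows up directly in the binding that is used. In the sketch below, the same ClusterRole is granted once through a RoleBinding, which limits it to a single namespace, and once through a ClusterRoleBinding, which applies it everywhere. All names, including the webserver-operator service account, are hypothetical.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: secret-reader             # ClusterRoles are not namespaced
rules:
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get", "list", "watch"]
---
# Namespace-scoped: the rules above apply only inside my-webserver
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: my-webserver
  name: operator-read-secrets
subjects:
- kind: ServiceAccount
  name: webserver-operator
  namespace: my-webserver
roleRef:
  kind: ClusterRole
  name: secret-reader
  apiGroup: rbac.authorization.k8s.io
---
# Cluster-scoped: the same rules now apply in every namespace
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: operator-read-secrets-everywhere
subjects:
- kind: ServiceAccount
  name: webserver-operator
  namespace: my-webserver
roleRef:
  kind: ClusterRole
  name: secret-reader
  apiGroup: rbac.authorization.k8s.io
```

With only the RoleBinding applied, a compromised operator could read secrets in my-webserver but nowhere else. The ClusterRoleBinding removes that boundary, which is why it should be reserved for operators that genuinely serve the whole cluster.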
One way to prevent pods and containers from manipulating the rest of the Kubernetes system is to use securityContexts, which can also be applied to operator containers. Here’s an example of a YAML snippet limiting the rights of a container:
```yaml
securityContext:
  privileged: false
  allowPrivilegeEscalation: false
  runAsNonRoot: true
  runAsUser: 1000
  runAsGroup: 1000
  readOnlyRootFilesystem: true
```
We’re assuming that the container in this example is running on the Kubernetes cluster on which it operates. Using this snippet, we prevent processes from gaining root privileges and modifying their root file system. If the operator becomes compromised for any reason, the attacker will now be limited in what they can do to the host system.
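For context, here is a sketch of where such a securityContext could sit in an operator's Deployment. The deployment name, labels, service account, and image reference are hypothetical. Note that privileged, allowPrivilegeEscalation, and readOnlyRootFilesystem are container-level fields, so the block is placed under the container rather than under the pod spec.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  namespace: my-webserver
  name: webserver-operator
spec:
  replicas: 1
  selector:
    matchLabels:
      app: webserver-operator
  template:
    metadata:
      labels:
        app: webserver-operator
    spec:
      serviceAccountName: webserver-operator   # hypothetical account
      containers:
      - name: operator
        # Hypothetical image reference, for illustration only
        image: registry.example.com/webserver-operator:0.1.0
        securityContext:
          privileged: false
          allowPrivilegeEscalation: false
          runAsNonRoot: true
          runAsUser: 1000
          runAsGroup: 1000
          readOnlyRootFilesystem: true
```

An operator that needs to write temporary files under these settings would have to mount a writable volume, such as an emptyDir, at the relevant path.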
The next step is to define cluster-wide pod security restrictions. Before Kubernetes version 1.21, we would do this with the PodSecurityPolicy (PSP) object, which has been deprecated since that release. Its successor, PodSecurityAdmission (PSA), is a beta feature in Kubernetes version 1.23.
Unfortunately, this new PSA doesn’t allow the same fine-grained custom control as the YAML in the example above. Instead, we select from three newly defined policy levels — privileged, baseline, and restricted — to apply to each namespace.
```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: my-webserver  # example name, reusing the namespace from the RBAC examples
  labels:
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/warn: restricted
```
Our namespace in this example is configured to enforce at the intermediate baseline security level, and to issue warnings at the aptly-named restricted level. The exact definitions of these policies are listed in the Kubernetes documentation about Pod Security Standards.
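Besides enforce and warn, Pod Security Admission also recognizes an audit mode, and each mode can be pinned to the policy definitions of a specific Kubernetes minor version. The sketch below shows a stricter variant that might suit a namespace dedicated to an operator. The namespace name operator-system and the pinned version are assumptions chosen for illustration.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: operator-system           # hypothetical namespace for the operator
  labels:
    # Reject pods that violate the restricted policy
    pod-security.kubernetes.io/enforce: restricted
    # Pin the policy definition so cluster upgrades don't change behavior silently
    pod-security.kubernetes.io/enforce-version: v1.23
    # Record violations in the audit log and warn the submitting user
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
```

A reasonable rollout is to start with only warn and audit set to restricted, fix the workloads that trigger warnings, and switch enforce over afterwards.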
Security benefits of operators
Although Kubernetes operators introduce some security considerations, they can also make a cluster more secure. As discussed earlier, Kubernetes operators perform the actions and configurations normally assigned to human operators who can — and regularly do — make mistakes. Configuration and control executed by a program are more reliable and reproducible than the actions executed by a human operator.
When operators are correctly authored, they eliminate errors of negligence — or even malice. Response speed is also an essential factor in security. Software operators have a significant advantage over human operators in responding to problems, both in alerting about and resolving an issue.
Most deployments to a cluster are concerned with day-to-day operations of the application, but it’s possible to deploy dedicated security management services right into the cluster. These are best guided by their own control loop in an operator.
Solutions for secure Kubernetes deployments
The development of Kubernetes operators has made it possible to extend the Kubernetes API with custom logic within a cluster. If mishandled, this added capability can be exploited by malicious actors. But as developers, we can ensure the security of our Kubernetes deployments with permission rules and other restrictions. The main Kubernetes feature for handling permissions is RBAC, which lets us specify exactly which actions each actor may perform on which resources.
Operators can also provide some security benefits by replacing manual processes that are often subject to human error. They let us run dedicated security services, like Snyk, inside a cluster.
Snyk offers solutions to the potential vulnerabilities we outlined in this post, and to many others, by alerting developers to security issues as they arise during coding. In particular, Snyk provides a Kubernetes operator we can install into our cluster. From there, Snyk finds and reports vulnerabilities as they arise in existing or new workloads on the cluster.