Key security improvements and best practices for Kubernetes on Azure

Runcy Oommen
6 min read · Jun 9, 2020


Cloud native design thinking has taken the world by storm and has become the standard for modern application development. This usually translates to using the popular Docker and Kubernetes platforms, often with a service mesh architecture built on Istio.

The security of your workloads and data is a key consideration as you manage clusters in managed services such as Azure Kubernetes Service (AKS). For example, a Day Zero task would be to secure the resources that run in a multi-tenant cluster where tenants are separated only by logical isolation.

Below, we take a look at some of the security concerns, key considerations and best practices for your Kubernetes deployment in Azure:

1. Disable anonymous authentication

To set the context, there are two types of users: normal users (managed by an outside service like Azure Active Directory) and service accounts (managed by the Kubernetes API, with credentials stored as Secrets). Any request that is not tied to one of these is treated as an anonymous request and is given a username of system:anonymous and a group of system:unauthenticated. Make sure you disable this mode for enhanced security by passing the --anonymous-auth=false option to the API server.
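As a rough illustration, the check below looks for that flag in an API server command line. It is a minimal sketch: the function name and the sample argument list are hypothetical, and it assumes you can inspect the API server's arguments at all (on a managed control plane they may not be exposed to you).

```python
def anonymous_auth_disabled(apiserver_args):
    """Return True only if --anonymous-auth=false is set explicitly."""
    value = None
    for arg in apiserver_args:
        if arg.startswith("--anonymous-auth="):
            # Assumption of this sketch: if the flag is repeated,
            # the last occurrence is the one that counts.
            value = arg.split("=", 1)[1].strip().lower()
    # If the flag is absent, anonymous requests are allowed by default,
    # so only an explicit "false" counts as disabled.
    return value == "false"


# Hypothetical command line, for illustration only.
example_args = [
    "kube-apiserver",
    "--authorization-mode=Node,RBAC",
    "--anonymous-auth=false",
]
print(anonymous_auth_disabled(example_args))
```

Treating a missing flag as "not disabled" is deliberate: the safe reading of an unset option is that the permissive default is still in effect.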

2. Configure admission controllers

Kubernetes has improved tremendously in the speed and manageability of backend clusters since its initial launch. A more recent addition among its security features is a set of plugins called admission controllers. They can be thought of as gatekeepers that intercept API requests and may change the request object or deny the request altogether. Here are some recommendations to operationalize them for better security:

a. As a first step, enable admission controllers with the following command-line flag, replacing what appears after “=” with the names of the admission controllers to be turned on:

--enable-admission-plugins=NameOfController,NameOfController2

b. Ensure that the admission controllers given below are enabled:

LimitRanger,DefaultStorageClass,DefaultTolerationSeconds,NamespaceLifecycle,ServiceAccount,MutatingAdmissionWebhook,ValidatingAdmissionWebhook,Priority,ResourceQuota,PodSecurityPolicy

c. Enable ValidatingAdmissionWebhook if you need to validate Kubernetes resources during create, update, and delete operations.

d. Consider disabling MutatingAdmissionWebhook, or apply stricter RBAC restrictions on who can create MutatingWebhookConfiguration objects.

e. Use the AlwaysPullImages admission controller, which forces the image pull policy of every new pod to Always, when you want to make sure a user’s private images can only be used by those who have the credentials to pull them.
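To make the recommendations above easier to check, here is a small sketch that compares an API server command line against the list from item b. The helper names and the sample arguments are hypothetical. It only inspects the explicit --enable-admission-plugins flag, so plugins that are on by default without being listed would be reported as missing.

```python
# The list from item b above.
RECOMMENDED = [
    "LimitRanger", "DefaultStorageClass", "DefaultTolerationSeconds",
    "NamespaceLifecycle", "ServiceAccount", "MutatingAdmissionWebhook",
    "ValidatingAdmissionWebhook", "Priority", "ResourceQuota",
    "PodSecurityPolicy",
]


def enabled_plugins(apiserver_args):
    """Collect plugin names listed in --enable-admission-plugins."""
    for arg in apiserver_args:
        if arg.startswith("--enable-admission-plugins="):
            value = arg.split("=", 1)[1]
            return {name.strip() for name in value.split(",") if name.strip()}
    return set()


def missing_plugins(apiserver_args, recommended=RECOMMENDED):
    """Return recommended plugins that are not explicitly enabled."""
    enabled = enabled_plugins(apiserver_args)
    return [name for name in recommended if name not in enabled]


# Hypothetical command line, for illustration only.
example_args = [
    "kube-apiserver",
    "--enable-admission-plugins=NamespaceLifecycle,LimitRanger,ServiceAccount",
]
print(missing_plugins(example_args))
```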

3. Pod Security Policies (PSP) to prevent risky container images from being used

Since this is an important admission controller, I decided to cover it in a separate section. According to the K8s documentation, a PSP is a cluster-level resource that defines a set of conditions that a pod must run with in order to be accepted into the system. An important thing to note here is that PSP is an optional (but recommended) admission controller. For a policy to be used, the target pod’s service account must be authorized to use it. Check this example to get a better understanding by seeing it in action. At the time of this writing, usage of pod security policies for AKS has been launched as a preview.
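For a sense of what such a policy looks like, the sketch below builds a restrictive PodSecurityPolicy manifest as a Python dictionary and prints it as JSON, which kubectl accepts as well as YAML. The policy name and the specific choices (no privileged pods, non-root users, a short list of volume types) are illustrative assumptions, written against the policy/v1beta1 API available at the time of writing.

```python
import json


def restricted_psp(name="restricted-example"):
    """Build an illustrative, restrictive PodSecurityPolicy manifest."""
    return {
        "apiVersion": "policy/v1beta1",
        "kind": "PodSecurityPolicy",
        "metadata": {"name": name},  # hypothetical name
        "spec": {
            # Reject privileged containers and privilege escalation.
            "privileged": False,
            "allowPrivilegeEscalation": False,
            # Keep pods out of the host's namespaces.
            "hostNetwork": False,
            "hostPID": False,
            "hostIPC": False,
            # Containers must not run as root.
            "runAsUser": {"rule": "MustRunAsNonRoot"},
            # These three strategies are required fields; left permissive here.
            "seLinux": {"rule": "RunAsAny"},
            "supplementalGroups": {"rule": "RunAsAny"},
            "fsGroup": {"rule": "RunAsAny"},
            # Allow only a small set of volume types.
            "volumes": ["configMap", "secret", "emptyDir", "persistentVolumeClaim"],
        },
    }


print(json.dumps(restricted_psp(), indent=2))
```

Creating the policy is only half of the setup: as noted above, the pod's service account still has to be authorized to use it through RBAC.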

4. Enable authorization with RBAC

RBAC is a key security feature that protects your cluster by allowing you to control who can access specific API resources. If you have an AKS cluster running with Azure AD integration, this is pretty straightforward, as described in the official documentation. However, since RBAC itself is a relatively new feature, what I will cover here are some of the potential configuration mistakes that might leave you unintentionally exposed.

Potential mistake #1: Make sure that the cluster-admin role is not granted unnecessarily. During the transition from the legacy ABAC controller to RBAC, some administrators may have replicated ABAC’s permissive configuration by granting cluster-admin widely, neglecting the warnings in the documentation.

Potential mistake #2: Role aggregation, available from K8s 1.9, can lead to improper usage if not carefully reviewed. For example, the system:view role could improperly aggregate rules with verbs other than view, violating the original intention of the role.

Potential mistake #3: Duplicated role grants may occur, whereby subjects get the same access in more than one way. This situation can make access revocation more difficult if an administrator does not realize that multiple role bindings grant the same privileges.

Potential mistake #4: Unused roles granted to subjects that do not exist (such as service accounts in deleted namespaces or users no longer with the org) can make it difficult to see the configurations that do matter.
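Two of these mistakes, over-granted cluster-admin and duplicated grants, lend themselves to a simple audit. The sketch below assumes the bindings have already been exported as a list of dictionaries shaped like the items returned by kubectl get clusterrolebindings -o json. The sample data and function names are hypothetical, and the duplicate check only catches the same role bound more than once, not different roles with overlapping rules.

```python
from collections import defaultdict


def subject_key(subject):
    """Identify a subject by kind, namespace (if any) and name."""
    return (subject.get("kind"), subject.get("namespace", ""), subject.get("name"))


def cluster_admin_subjects(bindings):
    """List (binding name, subject) pairs that grant cluster-admin."""
    found = []
    for binding in bindings:
        if binding["roleRef"]["name"] == "cluster-admin":
            for subject in binding.get("subjects") or []:
                found.append((binding["metadata"]["name"], subject_key(subject)))
    return found


def duplicate_grants(bindings):
    """Map (subject, role) to the bindings that grant it more than once."""
    grants = defaultdict(list)
    for binding in bindings:
        for subject in binding.get("subjects") or []:
            key = (subject_key(subject), binding["roleRef"]["name"])
            grants[key].append(binding["metadata"]["name"])
    return {key: names for key, names in grants.items() if len(names) > 1}


# Hypothetical bindings, for illustration only.
SAMPLE_BINDINGS = [
    {"metadata": {"name": "ops-admins"},
     "roleRef": {"kind": "ClusterRole", "name": "cluster-admin"},
     "subjects": [{"kind": "Group", "name": "ops"}]},
    {"metadata": {"name": "ci-view-a"},
     "roleRef": {"kind": "ClusterRole", "name": "view"},
     "subjects": [{"kind": "ServiceAccount", "namespace": "ci", "name": "builder"}]},
    {"metadata": {"name": "ci-view-b"},
     "roleRef": {"kind": "ClusterRole", "name": "view"},
     "subjects": [{"kind": "ServiceAccount", "namespace": "ci", "name": "builder"}]},
]

print(cluster_admin_subjects(SAMPLE_BINDINGS))
print(duplicate_grants(SAMPLE_BINDINGS))
```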

5. Disable public access to your cluster

As a recommended best practice, never expose remote connectivity to the AKS nodes directly; instead, create a bastion host or jump box in a management virtual network. The PaaS offering from Azure, called Azure Bastion, helps you connect to the nodes to perform maintenance or troubleshoot issues. The bastion host should be in a separate network that is securely peered to the AKS cluster virtual network.

6. Use network policies to segment and limit container and pod communications

An important aspect that demands attention from a security perspective is the network policies feature, which specifies how groups of pods are allowed to communicate with each other and with other network endpoints. Network policies tend to be a little confusing, since one would think that if no network policies applied to a pod, then no connections to or from it would be permitted. The opposite, in fact, is true: if no network policies apply to a pod, then all network connections to and from it are permitted. This behavior relates to the notion of “pod isolation”, which means that pods are isolated if at least one network policy applies to them; if no policies apply, they are “non-isolated”.

The network policy spec is intricate, and it can be difficult to understand and use correctly. Network policies are enforced by the network plugin, so you need to use one that supports them. Example plugins include Calico, Cilium, Kube-router, Romana and Weave Net.
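The isolation rule described above can be modeled in a few lines. The sketch below is a simplification: it assumes all pods and policies live in the same namespace, it only understands matchLabels selectors (not matchExpressions), and the function and namespace names are hypothetical. It also builds a "default deny" policy, whose empty pod selector applies to every pod in the namespace and so makes them all isolated.

```python
import json


def default_deny_all(namespace):
    """Build a NetworkPolicy that selects every pod and allows no traffic."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            "podSelector": {},  # an empty selector matches all pods
            "policyTypes": ["Ingress", "Egress"],  # no rules listed: deny both
        },
    }


def selects(policy, pod_labels):
    """True if the policy's matchLabels are all present on the pod."""
    match = policy["spec"].get("podSelector", {}).get("matchLabels", {})
    return all(pod_labels.get(key) == value for key, value in match.items())


def is_isolated(pod_labels, policies):
    """A pod is isolated as soon as at least one policy selects it."""
    return any(selects(policy, pod_labels) for policy in policies)


pod = {"app": "web"}
print(is_isolated(pod, []))  # no policies: the pod is non-isolated
print(is_isolated(pod, [default_deny_all("demo")]))
print(json.dumps(default_deny_all("demo"), indent=2))
```

Starting from a default-deny policy and then adding narrow allow rules tends to be easier to reason about than trying to enumerate everything that should be blocked.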

7. Use only trusted, authorized container images

Always scan your container images for vulnerabilities and deploy only after the images have passed the validation. Regularly update the base images and application runtime, then redeploy workloads in the AKS cluster. It is advised to integrate the deployment workflow to include a process to scan container images using tools such as Twistlock or Aqua, and then only allow verified images to be deployed. Azure Container Registry has capabilities for performing these scans to set up a CI/CD pipeline with the required automation.
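One simple building block for "only allow verified images" is a registry allowlist check in the deployment workflow. The sketch below is an assumption-laden illustration rather than a feature of any of the tools mentioned: the registry name and image references are hypothetical, and the registry is derived with the common convention that the first path component is a registry host only if it contains a dot or a colon, or is localhost.

```python
def image_registry(image):
    """Return the registry host of an image reference (docker.io if implied)."""
    first, sep, _ = image.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return first
    # No explicit registry host: the reference resolves to Docker Hub.
    return "docker.io"


def untrusted_images(images, trusted_registries):
    """Return the images that do not come from a trusted registry."""
    return [img for img in images if image_registry(img) not in trusted_registries]


# Hypothetical registry and images, for illustration only.
TRUSTED = {"myregistry.azurecr.io"}
images = [
    "myregistry.azurecr.io/web:1.4.2",
    "nginx:latest",
    "example.com:5000/tools/debug:dev",
]
print(untrusted_images(images, TRUSTED))
```

A check like this complements vulnerability scanning: it does not say whether an image is safe, only that it came from a registry where scanning is expected to have happened.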

8. Secure your CI/CD pipelines with vulnerability scanning

The primary steps towards the delivery of secure software are to conduct risk assessment and threat modeling. Thankfully, securing the CI/CD pipeline is relatively straightforward, assuming you and your team followed best practices when creating your project’s Git repository. My recommendation would be to explore the Anchore Engine, an open-source project that provides a centralized service for inspection, analysis, and certification of container images. It can be run either as a standalone Docker container or within a managed orchestration service like AKS.

With a deployment of Anchore Engine running in your environment, container images pulled from Azure Container Registry are evaluated against user-customizable policies to perform security, compliance, and best-practice enforcement checks. Check out this great step-by-step guide on enabling Anchore for your AKS cluster.
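The idea of gating a pipeline on a policy evaluation can be sketched generically. The code below does not use Anchore's actual policy format or API: the findings structure, severity names, and identifiers are hypothetical stand-ins for whatever your scanner reports. It shows one possible rule, failing the build when any finding is at or above a chosen severity.

```python
# Severity scale assumed by this sketch, from least to most severe.
SEVERITY_ORDER = ["negligible", "low", "medium", "high", "critical"]


def gate(findings, fail_on="high"):
    """Return ("pass" or "fail", blocking findings) for a list of findings."""
    threshold = SEVERITY_ORDER.index(fail_on)
    blocking = [
        finding for finding in findings
        if SEVERITY_ORDER.index(finding["severity"].lower()) >= threshold
    ]
    return ("fail" if blocking else "pass", blocking)


# Hypothetical scan output, for illustration only.
findings = [
    {"id": "EXAMPLE-0001", "severity": "Critical"},
    {"id": "EXAMPLE-0002", "severity": "Low"},
]
status, blocking = gate(findings)
print(status, [finding["id"] for finding in blocking])
```

In a pipeline, a "fail" result would typically stop the stage before the image is pushed or deployed.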

I hope these recommendations serve as a good aid as you start your journey in securing the cluster. However, please note that the above guidelines are not exhaustive, and there are many other security considerations to be aware of when using Kubernetes. Check your clusters today.
