Tiexin Guo

Senior DevOps Consultant, Amazon Web Services
Author | 4th Coffee

In my previous two articles, we discussed Kubernetes security and put together guidelines for hardening K8s clusters. If you haven't read them yet, here are the links:

Hardening Your Kubernetes Cluster - Threat Model (Pt. 1)

Hardening Your Kubernetes Cluster - Guidelines (Pt. 2)

Today, we will follow the advice of the second article and do some hands-on work to gain a deeper understanding. Let's start with Pod security.

After reading this article, you will learn:

  • how not to run pods as root;
  • how to use an immutable root filesystem (lock the root filesystem);
  • how to scan Docker images locally and in your CI pipelines;
  • how to use Pod Security Policies (PSP).

1. Run Containers as Non-Root User

If you do the bare minimum and nothing else, a Docker container runs as the root user by default, and K8s will allow it to do so.

Consider the following Dockerfile:

# omitted above
FROM alpine
WORKDIR /app
COPY app /app/app
CMD ["./app"]
EXPOSE 8080/tcp

Since the user that runs the container isn't explicitly set, the container process runs as root when you start a Docker container from this image:

/app $ whoami
root

Now consider the following container spec, which uses this image to run a pod in a K8s cluster:

# omitted above
spec:
  containers:
    - name: hello
      image: ironcore864/k8s-security-demo:pod-as-non-root
      ports:
        - containerPort: 8080

Since there aren't any restrictions, K8s will allow this pod to run as the root user.

If you want to try it yourself, the code used for this section is shown in this pull request here.

You can deploy the test in K8s by running the following commands:

git clone git@github.com:IronCore864/k8s-security-demo.git
cd k8s-security-demo
git checkout pod-run-as-root
kubectl apply -f deploy.yaml

1.1 The USER Instruction

With Docker, you can use the USER instruction. The USER instruction sets the user name (or UID), and optionally the user group (or GID) to use when running the image, and for any RUN, CMD and ENTRYPOINT instructions that follow it in the Dockerfile. So, by simply adding this line in the Dockerfile:

USER 1000:1000

we have changed the user that the container runs as.

That takes care of the container level. But what about the pod? If an image doesn't have a USER instruction, K8s will still allow it to run as root.

Enter security context.

1.2 Security Context

A security context defines privilege and access control settings for a Pod or Container. Security context settings include, but are not limited to:

  • Discretionary Access Control: Permission to access an object, like a file, is based on user ID (UID) and group ID (GID).
  • Security Enhanced Linux (SELinux): Objects are assigned security labels.
  • Running as privileged or unprivileged.
  • Linux Capabilities: Give a process some privileges, but not all the privileges of the root user.
  • AppArmor: Use program profiles to restrict the capabilities of individual programs.
  • Seccomp: Filter a process's system calls.
  • AllowPrivilegeEscalation: Controls whether a process can gain more privileges than its parent process. This bool directly controls whether the no_new_privs flag gets set on the container process. AllowPrivilegeEscalation is always true when the container: 1) runs as privileged, or 2) has CAP_SYS_ADMIN.
  • readOnlyRootFilesystem: Mounts the container's root filesystem as read-only.
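
As a rough illustration of how several of these settings look in a manifest, here is a sketch of a container-level security context. The pod name and the particular combination of fields are assumptions made for this example; they are not the configuration used in the demo repository. The seccompProfile field also assumes Kubernetes 1.19 or later.

```yaml
# Hypothetical pod manifest: the name and the chosen settings are illustrative only.
apiVersion: v1
kind: Pod
metadata:
  name: security-context-sketch
spec:
  containers:
    - name: hello
      image: ironcore864/k8s-security-demo:pod-as-non-root
      securityContext:
        # Running as privileged or unprivileged
        privileged: false
        # Keeps the process from gaining more privileges than its parent
        allowPrivilegeEscalation: false
        # Linux capabilities: drop everything the app does not need
        capabilities:
          drop:
            - ALL
        # Seccomp: use the container runtime's default syscall filter
        seccompProfile:
          type: RuntimeDefault
```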

For demo purposes, we only need to add one small section to the container spec:

securityContext:
  runAsNonRoot: true

And if the image runs as root, the kubelet will refuse to start the container, so the deployment fails.

With security context, you can even limit the user ID and group ID with which the container must be run. For more details, check out the official doc here.
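
A minimal sketch of pinning the user and group ID could look like the manifest below. The pod name is hypothetical, and the value 1000 is an assumption chosen to mirror the USER 1000:1000 line shown earlier.

```yaml
# Hypothetical example: the pod name and the UID/GID of 1000 are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: run-as-user-sketch
spec:
  containers:
    - name: hello
      image: ironcore864/k8s-security-demo:pod-as-non-root
      ports:
        - containerPort: 8080
      securityContext:
        runAsNonRoot: true   # refuse to start if the effective user is root
        runAsUser: 1000      # overrides the USER set in the image, if any
        runAsGroup: 1000     # primary group ID of the container process
```

Setting runAsUser in the spec means the pod no longer depends on the image author having added a USER instruction.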

If you want to try it yourself, the code used for this section is shown in this pull request here.

You can deploy the test in K8s by running the following commands:

git clone git@github.com:IronCore864/k8s-security-demo.git
cd k8s-security-demo
git checkout pod-run-as-non-root
kubectl apply -f deploy.yaml

2. Immutable File System

Let's open a shell in the pod we have created so far:

~/work/k8s-security-demo $ kubectl exec -it k8s-security-demo-748f4cfc8c-66lm7 -- sh
/app $ cd /tmp/
/tmp $ echo "abc" > file.txt
/tmp $ cat file.txt
abc

As you can see, we can still create files on the container's root filesystem.

To lock the root filesystem, we simply add one line to the security context section:

readOnlyRootFilesystem: true

After re-applying the manifest, let's run the same test again:

~/work/k8s-security-demo $ kubectl exec -it k8s-security-demo-66dd894b84-fccfl -- sh
/app $ cd /tmp/
/tmp $ echo "abc" > file.txt
sh: can't create file.txt: Read-only file system

We can no longer write anything to the root filesystem.

The change is shown in this pull request here.

You can deploy the test in K8s by running the following commands:

git clone git@github.com:IronCore864/k8s-security-demo.git
cd k8s-security-demo
git checkout immutable-root-fs
kubectl apply -f deploy.yaml
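
A read-only root filesystem also blocks legitimate scratch writes, so an application that needs a temporary directory is commonly given a writable volume for just that path. The sketch below assumes the app wants to write under /tmp. The pod name and the volume name are hypothetical, and this is not part of the demo repository.

```yaml
# Hypothetical example: read-only root filesystem plus a writable /tmp.
apiVersion: v1
kind: Pod
metadata:
  name: read-only-root-sketch
spec:
  containers:
    - name: hello
      image: ironcore864/k8s-security-demo:pod-as-non-root
      securityContext:
        readOnlyRootFilesystem: true   # everything outside mounted volumes is read-only
      volumeMounts:
        - name: tmp
          mountPath: /tmp              # the only writable path in this sketch
  volumes:
    - name: tmp
      emptyDir: {}                     # scratch space that lives and dies with the pod
```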

3. Image Scanning

There are quite a few image scanning tools out there which can identify known vulnerabilities, outdated libraries, or misconfigurations, such as insecure ports or unnecessary permissions. There are also numerous ways to integrate the image scanning process with your CI workflow.

Today, let's have a quick look at a small but powerful tool: trivy.

3.1 Local/Manual

If you are on macOS with Homebrew, you can install it with a single command:

brew install aquasecurity/trivy/trivy

Now let's scan an image with it:

~ $ trivy image ironcore864/go-hello-http:2.0.13
2021-11-28T11:40:58.863+0800 INFO Using your github token
2021-11-28T11:41:10.217+0800 INFO Detected OS: alpine
2021-11-28T11:41:10.217+0800 INFO Detecting Alpine vulnerabilities...
2021-11-28T11:41:10.219+0800 INFO Number of language-specific files: 0
ironcore864/go-hello-http:2.0.13 (alpine 3.14.3)
================================================
Total: 0 (UNKNOWN: 0, LOW: 0, MEDIUM: 0, HIGH: 0, CRITICAL: 0)

If you think an image must be free of vulnerabilities just because it is released by a big company or an open-source community, you are wrong. Try scanning images from some well-known projects, and you may be shocked. Knowing what base image you are using and what vulnerabilities you have in your image is important.

You might also want to check out ggshield, which can scan Docker images for leaked secrets (see the video tutorial).

3.2 Integrate with CI

You can also integrate trivy with your CI. For example, this is a workflow for GitHub Actions.

And the result is here.

If you want to use trivy with GitHub Actions, simply set up this workflow here.
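
For readers who cannot follow the link, a workflow along these lines is one way to wire trivy into GitHub Actions. This is a sketch rather than the workflow referenced above: the file name, the image tag local/app, the branch name, and the severity threshold are all assumptions. It uses the aquasecurity/trivy-action action.

```yaml
# .github/workflows/trivy.yaml (hypothetical file name)
name: image-scan
on:
  push:
    branches: [main]
  pull_request:

jobs:
  trivy:
    runs-on: ubuntu-latest
    steps:
      - name: Check out the code
        uses: actions/checkout@v4

      - name: Build the image
        # "local/app" is a placeholder tag used only inside this job
        run: docker build -t local/app:${{ github.sha }} .

      - name: Scan the image with trivy
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: local/app:${{ github.sha }}
          format: table
          exit-code: '1'          # fail the job when findings match the filter below
          ignore-unfixed: true    # skip vulnerabilities that have no fix yet
          severity: CRITICAL,HIGH
```

Failing the job on HIGH and CRITICAL findings is a design choice. A team might start with exit-code '0' to only report findings, and pin the action to a release tag instead of master.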


4. Pod Security Policies (PSP)

Note: PSP was deprecated in Kubernetes 1.21 and removed completely in 1.25. Its replacement (initially referred to as the "PSP Replacement Policy") is the built-in Pod Security Admission controller, which covers critical use cases more easily and sustainably.

The examples in this section therefore only work on clusters older than 1.25 that have the PodSecurityPolicy admission plugin enabled. PSP can do many things, some of which will be reviewed in other posts following this one. Today, let's have a look at a feature related to the first section of this article: running Pods as a non-root user.

4.1 PSP Explained

Let's create a user, a PSP, and role bindings so that this user is allowed to create pods and use the PSP. The RBAC and PSP are shown in this pull request.

# psp.yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: restricted
  annotations:
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: 'docker/default,runtime/default'
    apparmor.security.beta.kubernetes.io/allowedProfileNames: 'runtime/default'
    apparmor.security.beta.kubernetes.io/defaultProfileName:  'runtime/default'
spec:
  privileged: false
  allowPrivilegeEscalation: false
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  volumes:
  - '*'
  runAsUser:
    # Require the container to run without root privileges.
    rule: 'MustRunAsNonRoot'
  fsGroup:
    rule: RunAsAny
  readOnlyRootFilesystem: false

From the docs:

Privileged - determines if any container in a pod can enable privileged mode. By default, a container is not allowed to access any devices on the host, but a "privileged" container is given access to all devices on the host.

AllowPrivilegeEscalation - gates whether or not a user is allowed to set the security context of a container to allowPrivilegeEscalation=true.

RunAsUser - Controls which user ID the containers are run with.

  • MustRunAs - Requires at least one range to be specified. Uses the minimum value of the first range as the default. Validates against all ranges.

  • MustRunAsNonRoot - Requires that the pod be submitted with a non-zero runAsUser or have the USER directive defined (using a numeric UID) in the image. Pods that specify neither runAsNonRoot nor runAsUser are mutated to set runAsNonRoot=true.

  • RunAsAny - No default provided. Allows any runAsUser to be specified.

ReadOnlyRootFilesystem - Requires that containers must run with a read-only root filesystem (i.e., no writable layer).
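
To make the MustRunAs rule more concrete, here is a sketch of a policy that pins containers to a UID range and also requires a read-only root filesystem. The policy name and the range 1000 to 65535 are assumptions for illustration. Like the restricted policy above, it uses the policy/v1beta1 API, which only exists on clusters that still support PSP.

```yaml
# Hypothetical policy: the name and the UID range are illustrative only.
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  name: must-run-as-range
spec:
  privileged: false
  allowPrivilegeEscalation: false
  runAsUser:
    rule: MustRunAs
    ranges:
      # The minimum of the first range (1000) is used as the default UID
      - min: 1000
        max: 65535
  readOnlyRootFilesystem: true
  seLinux:
    rule: RunAsAny
  supplementalGroups:
    rule: RunAsAny
  fsGroup:
    rule: RunAsAny
  volumes:
    - '*'
```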

We can ignore the other parts of the PSP for now; they will be covered in future articles.

4.2 PSP RBAC Explained

When a PodSecurityPolicy resource is created, it does nothing. In order to use it, the requesting user or target pod's service account must be authorized to use the policy by allowing the use verb on the policy.
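
A minimal sketch of such an authorization is shown below, assuming the restricted policy above and a service account named fake-user in the default namespace. The Role and RoleBinding names are hypothetical, and the rbac.yaml in the demo repository may be organized differently (for example, it also needs to let the user create pods, which this sketch does not cover).

```yaml
# Hypothetical RBAC objects: they only grant the "use" verb on one PSP.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: psp-restricted-user
  namespace: default
rules:
  - apiGroups: ['policy']
    resources: ['podsecuritypolicies']
    verbs: ['use']
    resourceNames: ['restricted']   # the PSP defined in psp.yaml
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: psp-restricted-user
  namespace: default
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: psp-restricted-user
subjects:
  - kind: ServiceAccount
    name: fake-user
    namespace: default
```

Because a RoleBinding is used rather than a ClusterRoleBinding, the policy is only usable for pods created in the default namespace.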

4.3 PSP Demo

Now that we are ready, let's run:

alias kubectl-user='kubectl --as=system:serviceaccount:default:fake-user -n default'
kubectl-user apply -f deploy.yaml

This will not succeed, because the restricted PSP enforces "MustRunAsNonRoot" while the container runs as root. The PSP applies even though the container spec itself defines no security context.

We can also try to create a privileged pod:

kubectl-user create -f- <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: privileged
spec:
  containers:
    - name: pause
      image: k8s.gcr.io/pause
      securityContext:
        privileged: true
EOF

We would get a similar error:

Error from server (Forbidden): error when creating "STDIN": pods "privileged" is forbidden: PodSecurityPolicy: unable to admit pod: [spec.containers[0].securityContext.privileged: Invalid value: true: Privileged containers are not allowed]

You can try the PSP by running the following commands:

git clone git@github.com:IronCore864/k8s-security-demo.git
cd k8s-security-demo
git checkout psp
kubectl apply -f psp.yaml
kubectl apply -f rbac.yaml
alias kubectl-user='kubectl --as=system:serviceaccount:default:fake-user -n default'
kubectl-user apply -f deploy.yaml

In the next part of this series, we will go through a hands-on tutorial on network security configurations in Kubernetes. If you like this article, please share and subscribe!

Kubernetes Hardening Tutorial: Network
How to achieve Control Plane security, true resource separation with network policies, and use Kubernetes Secrets more securely.