Using pod security policies with kubeadm
Purpose
Illustrate using pod security policies with a kubeadm installation of kubernetes
Pod security policies are a mechanism to restrict what a container can do when running on kubernetes, such as preventing containers running as privileged, using host networking etc.
Read the docs to see how this can be used to improve security
I struggled to find any information on bootstrapping a kubeadm cluster with pod security policies enabled, hence this content
TLDR
This post is very long so I can do a full illustration; in short you need to
- On master run kubeadm init with the PodSecurityPolicy admission controller enabled
- Add some pod security policies with RBAC config - enough to allow CNI and DNS etc. to start
- CNI daemonsets will not start without this
- Apply your CNI provider which can use one of the previously created pod security policies
- Complete configuring the cluster adding nodes via kubeadm join
- As you add more workloads to the cluster, check if they need additional pod security policies and RBAC configuration
What we will do
This is the list of steps I took to get pod security policies running on a kubeadm installation
- Configure the pod security policy admission controller for master init
- Configure some pod security policies for the control plane components
- Configure a CNI provider - Will use flannel here
Will then use the following to demo some other pod security policy scenarios
- Install an nginx-ingress controller which has some specific requirements - This is just to illustrate adding additional policies
- Install a regular service that has no specific pod security policy requirements - Based on httpbin.org
Environment
- Will just illustrate on a single node rather than a multi-node cluster with HA
- Ubuntu
- Swap off as required by kubeadm
- Timezone configured for UTC
- Docker
- Assumes current user is in the docker group
- kubernetes 1.11.3 - with RBAC
Prepare master
Follow the instructions to install kubeadm
Let's install jq which we will use for some JSON output processing
sudo apt-get update
sudo apt-get install -y jq
Verify kubeadm version
sudo kubeadm version
Create a directory somewhere for the content we will create below; all instructions below assume you are in this directory
mkdir ~/psp-inv
cd ~/psp-inv
kubeadm config file
Will create this file and use it for kubeadm init on the master
Create a kubeadm-config.yaml file with this content - note we have to specify the podSubnet of 10.244.0.0/16 for flannel
Note this file is minimal for this demo and if you use a later version of kubeadm you may need to alter the apiVersion
apiVersion: kubeadm.k8s.io/v1alpha2
kind: MasterConfiguration
apiServerExtraArgs:
  enable-admission-plugins: PodSecurityPolicy
controllerManagerExtraArgs:
  address: 0.0.0.0
kubernetesVersion: v1.11.3
networking:
  podSubnet: 10.244.0.0/16
schedulerExtraArgs:
  address: 0.0.0.0
Master init
sudo kubeadm init --config kubeadm-config.yaml
Follow the instructions from the above command output to get your own copy of the kubeconfig file
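For reference, the commands kubeadm prints for this are along these lines - treat your own init output as the source of truth
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config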
If you want to add worker nodes to the cluster, note the join message
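We can also confirm the admission controller flag made it into the generated API server manifest; this just greps the static pod manifest kubeadm writes
sudo grep enable-admission-plugins /etc/kubernetes/manifests/kube-apiserver.yaml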
Let's check the master node status with
kubectl get nodes
NAME STATUS ROLES AGE VERSION
pmcgrath-k8s-master NotReady master 1m v1.11.3
So the node is not ready as it is waiting for CNI
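We can see why from the node's Ready condition; a jsonpath query like this (assuming a single node) should mention the CNI plugin not being initialised
kubectl get nodes -o jsonpath='{.items[0].status.conditions[?(@.type=="Ready")].message}'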
Let's check the pods
kubectl get pods --all-namespaces
No resources found.
So none appear to be running; we would normally see pods, with some pending, if we had not enabled the pod security policy admission controller
Let's check docker
docker container ls --format '{{ .Names }}'
k8s_kube-scheduler_kube-scheduler-pmcgrath-k8s-master_kube-system_a00c35e56ebd0bdfcd77d53674a5d2a1_0
k8s_kube-controller-manager_kube-controller-manager-pmcgrath-k8s-master_kube-system_fd832ada507cef85e01885d1e1980c37_0
k8s_etcd_etcd-pmcgrath-k8s-master_kube-system_16a8af6b4a79e9b0f81092f85eab37cf_0
k8s_kube-apiserver_kube-apiserver-pmcgrath-k8s-master_kube-system_db201a8ecaf8e99623b425502a6ba627_0
k8s_POD_kube-controller-manager-pmcgrath-k8s-master_kube-system_fd832ada507cef85e01885d1e1980c37_0
k8s_POD_kube-scheduler-pmcgrath-k8s-master_kube-system_a00c35e56ebd0bdfcd77d53674a5d2a1_0
k8s_POD_kube-apiserver-pmcgrath-k8s-master_kube-system_db201a8ecaf8e99623b425502a6ba627_0
k8s_POD_etcd-pmcgrath-k8s-master_kube-system_16a8af6b4a79e9b0f81092f85eab37cf_0
So containers are running, but not showing up with kubectl
Let's check events
kubectl get events --namespace kube-system
Will see something like: Error creating: pods "kube-proxy-" is forbidden: no providers available to validate pod request
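If you want to narrow the events down, filtering by reason works too - this assumes the failures are reported with the FailedCreate reason
kubectl get events --namespace kube-system --field-selector reason=FailedCreate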
Configure pod security policies
I went with configuring
- A default pod security policy that any workload can use, has no privileges and should be good for most workloads
- Will create an RBAC ClusterRole
- Will create an RBAC ClusterRoleBinding for any authenticated users
- A privileged pod security policy that I grant nodes and all service accounts in the kube-system namespace access to
- The thinking is that access to this namespace is restricted
- Should only run k8s components in this namespace
- Will create an RBAC ClusterRole
- Will create an RBAC RoleBinding in the kube-system namespace
Create a default-psp-with-rbac.yaml file with this content
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  annotations:
    apparmor.security.beta.kubernetes.io/allowedProfileNames: 'runtime/default'
    apparmor.security.beta.kubernetes.io/defaultProfileName: 'runtime/default'
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: 'docker/default'
    seccomp.security.alpha.kubernetes.io/defaultProfileName: 'docker/default'
  name: default
spec:
  allowedCapabilities: [] # default set of capabilities are implicitly allowed
  allowPrivilegeEscalation: false
  fsGroup:
    rule: 'RunAsAny'
  hostIPC: false
  hostNetwork: false
  hostPID: false
  privileged: false
  readOnlyRootFilesystem: false
  runAsUser:
    rule: 'RunAsAny'
  seLinux:
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'RunAsAny'
  volumes:
    - 'configMap'
    - 'downwardAPI'
    - 'emptyDir'
    - 'persistentVolumeClaim'
    - 'projected'
    - 'secret'
---
# Cluster role which grants access to the default pod security policy
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: default-psp
rules:
  - apiGroups:
      - policy
    resourceNames:
      - default
    resources:
      - podsecuritypolicies
    verbs:
      - use
---
# Cluster role binding for default pod security policy granting all authenticated users access
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: default-psp
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: default-psp
subjects:
  - apiGroup: rbac.authorization.k8s.io
    kind: Group
    name: system:authenticated
Create a privileged-psp-with-rbac.yaml file with this content
# Should grant access to very few pods, i.e. kube-system system pods and possibly CNI pods
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  annotations:
    # See https://kubernetes.io/docs/concepts/policy/pod-security-policy/#seccomp
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: '*'
  name: privileged
spec:
  allowedCapabilities:
    - '*'
  allowPrivilegeEscalation: true
  fsGroup:
    rule: 'RunAsAny'
  hostIPC: true
  hostNetwork: true
  hostPID: true
  hostPorts:
    - min: 0
      max: 65535
  privileged: true
  readOnlyRootFilesystem: false
  runAsUser:
    rule: 'RunAsAny'
  seLinux:
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'RunAsAny'
  volumes:
    - '*'
---
# Cluster role which grants access to the privileged pod security policy
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: privileged-psp
rules:
  - apiGroups:
      - policy
    resourceNames:
      - privileged
    resources:
      - podsecuritypolicies
    verbs:
      - use
---
# Role binding for kube-system - allow nodes and kube-system service accounts - should take care of CNI i.e. flannel running in the kube-system namespace
# Assumes access to the kube-system namespace is restricted
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kube-system-psp
  namespace: kube-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: privileged-psp
subjects:
  # For the kubeadm kube-system nodes
  - apiGroup: rbac.authorization.k8s.io
    kind: Group
    name: system:nodes
  # For all service accounts in the kube-system namespace
  - apiGroup: rbac.authorization.k8s.io
    kind: Group
    name: system:serviceaccounts:kube-system
Apply the above pod security policies with RBAC configuration
kubectl apply -f default-psp-with-rbac.yaml
kubectl apply -f privileged-psp-with-rbac.yaml
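As a quick sanity check we can list the policies and use impersonation to confirm an arbitrary authenticated user can use the default policy but not the privileged one - the user name here is made up, and --as-group supplies the authenticated group explicitly; the first check should come back yes, the second no
kubectl get psp
kubectl auth can-i use podsecuritypolicy/default --as=some-user --as-group=system:authenticated
kubectl auth can-i use podsecuritypolicy/privileged --as=some-user --as-group=system:authenticated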
Check
Control plane pods will turn up in a running state after some time; coredns pods will be pending - waiting on CNI
kubectl get pods --all-namespaces --output wide --watch
Control plane pods will start failing again until CNI is configured, as the node is still not ready
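At this point kube-proxy should also have been created, and we can see which policy it was admitted under via the kubernetes.io/psp annotation - this assumes kubeadm's kube-proxy daemonset labels its pods with k8s-app=kube-proxy; it should print privileged
kubectl get pods --namespace kube-system --selector k8s-app=kube-proxy -o json | jq -r '.items[0].metadata.annotations."kubernetes.io/psp"'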
Install flannel
See the coreos/flannel repo for installation details
Will only be able to complete this as the privileged pod security policy now exists and the flannel service account in the kube-system namespace is able to use it
If using a different CNI provider you should follow their installation instructions; you will probably also need to alter the podSubnet in the kubeadm-config.yaml file used for kubeadm init
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
Check
kubectl get pods --all-namespaces --output wide --watch
All pods will eventually get to a running status including coredns pod(s)
kubectl get nodes
Node is now ready
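We can also confirm the flannel pod was admitted under the privileged policy - this assumes the kube-flannel manifest labels its pods with app=flannel, which it did at the time of writing
kubectl get pods --namespace kube-system --selector app=flannel -o json | jq -r '.items[0].metadata.annotations."kubernetes.io/psp"'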
Allow workloads on the master
If you want to spin up worker nodes, you can do so as normal using the kubeadm join command from the kubeadm init output; I am skipping this here
Nothing special is needed on worker nodes joining the cluster pod security policy wise
To allow workloads on the master node, as we are just verifying on a single node cluster
kubectl taint nodes --all node-role.kubernetes.io/master-
nginx ingress
Will use the mandatory.yaml manifest from the kubernetes/ingress-nginx repo - applied below
This will create a new namespace and a single instance ingress controller, which is enough to illustrate additional pod security policies
Namespace
Since the namespace will not yet exist, let's create it so we can reference service accounts and create a role binding
kubectl create namespace ingress-nginx
Let's create a pod security policy
This pod security policy is based on the deployment manifest
Create a file nginx-ingress-psp-with-rbac.yaml with this content
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
  annotations:
    # Assumes apparmor available
    apparmor.security.beta.kubernetes.io/allowedProfileNames: 'runtime/default'
    apparmor.security.beta.kubernetes.io/defaultProfileName: 'runtime/default'
    seccomp.security.alpha.kubernetes.io/allowedProfileNames: 'docker/default'
    seccomp.security.alpha.kubernetes.io/defaultProfileName: 'docker/default'
  name: ingress-nginx
spec:
  # See nginx-ingress-controller deployment at https://github.com/kubernetes/ingress-nginx/blob/master/deploy/mandatory.yaml
  # See also https://github.com/kubernetes-incubator/kubespray/blob/master/roles/kubernetes-apps/ingress_controller/ingress_nginx/templates/psp-ingress-nginx.yml.j2
  allowedCapabilities:
    - NET_BIND_SERVICE
  allowPrivilegeEscalation: true
  fsGroup:
    rule: 'MustRunAs'
    ranges:
      - min: 1
        max: 65535
  hostIPC: false
  hostNetwork: false
  hostPID: false
  hostPorts:
    - min: 80
      max: 65535
  privileged: false
  readOnlyRootFilesystem: false
  runAsUser:
    rule: 'MustRunAsNonRoot'
    ranges:
      - min: 33
        max: 65535
  seLinux:
    rule: 'RunAsAny'
  supplementalGroups:
    rule: 'MustRunAs'
    ranges:
      # Forbid adding the root group.
      - min: 1
        max: 65535
  volumes:
    - 'configMap'
    - 'downwardAPI'
    - 'emptyDir'
    - 'projected'
    - 'secret'
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: ingress-nginx-psp
  namespace: ingress-nginx
rules:
  - apiGroups:
      - policy
    resourceNames:
      - ingress-nginx
    resources:
      - podsecuritypolicies
    verbs:
      - use
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: ingress-nginx-psp
  namespace: ingress-nginx
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: ingress-nginx-psp
subjects:
  # Let's cover default and nginx-ingress-serviceaccount service accounts
  # Could have altered default-http-backend deployment to use the same service account to avoid granting the default service account access
  - kind: ServiceAccount
    name: default
  - kind: ServiceAccount
    name: nginx-ingress-serviceaccount
Let's apply
kubectl apply -f nginx-ingress-psp-with-rbac.yaml
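We can verify the role binding with impersonation before deploying anything - the service account does not need to exist yet for this check, and it should come back yes
kubectl auth can-i use podsecuritypolicy/ingress-nginx --namespace ingress-nginx --as=system:serviceaccount:ingress-nginx:nginx-ingress-serviceaccount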
Create nginx-ingress workload
- Will remove the controller --publish-service arg as we do not need it here
curl -s https://raw.githubusercontent.com/kubernetes/ingress-nginx/master/deploy/mandatory.yaml | sed '/--publish-service/d' | kubectl apply -f -
Check for pods
kubectl get pods --namespace ingress-nginx --watch
Can now see which pod security policy was applied via the kubernetes.io/psp annotation with
kubectl get pods --namespace ingress-nginx --selector app.kubernetes.io/name=ingress-nginx -o json | jq -r '.items[0].metadata.annotations."kubernetes.io/psp"'
Httpbin.org workload
Let's deploy a workload where the default pod security policy will suffice
Create a httpbin.yaml file with this content
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/name: httpbin
  name: httpbin
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: httpbin
  template:
    metadata:
      labels:
        app.kubernetes.io/name: httpbin
    spec:
      containers:
        - args: ["-b", "0.0.0.0:8080", "httpbin:app"]
          command: ["gunicorn"]
          image: docker.io/kennethreitz/httpbin:latest
          imagePullPolicy: Always
          name: httpbin
          ports:
            - containerPort: 8080
              name: http
      restartPolicy: Always
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: "nginx"
  labels:
    app.kubernetes.io/name: httpbin
  name: httpbin
spec:
  rules:
    - host: my.httpbin.com
      http:
        paths:
          - path:
            backend:
              serviceName: httpbin
              servicePort: 8080
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/name: httpbin
  name: httpbin
spec:
  ports:
    - name: http
      port: 8080
  selector:
    app.kubernetes.io/name: httpbin
Create the namespace and run the workload in it
kubectl create namespace demo
kubectl apply --namespace demo -f httpbin.yaml
Let's check that the pod exists and that the default policy was used
kubectl get pods --namespace demo
kubectl get pods --namespace demo --selector app.kubernetes.io/name=httpbin -o json | jq -r '.items[0].metadata.annotations."kubernetes.io/psp"'
Test workload
Will do so by calling via the ingress controller pod instance - I have no ingress service for this demo
# Get nginx ingress controller pod IP
nginx_ip=$(kubectl get pods --namespace ingress-nginx --selector app.kubernetes.io/name=ingress-nginx --output json | jq -r .items[0].status.podIP)
# Test ingress and our httpbin workload
curl -H 'Host: my.httpbin.com' http://$nginx_ip/get
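If you just want the HTTP status rather than the JSON body, something like this should print 200
curl -s -o /dev/null -w '%{http_code}\n' -H 'Host: my.httpbin.com' http://$nginx_ip/get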
Resets
If, like me, you mess this up regularly, you can reset and restart with
# Note: Will lose the PKI also, which is fine here as kubeadm master init will re-create it
sudo kubeadm reset
# Should flush iptable rules after a kubeadm reset, see https://blog.heptio.com/properly-resetting-your-kubeadm-bootstrapped-cluster-nodes-heptioprotip-473bd0b824aa
sudo iptables -F && sudo iptables -t nat -F && sudo iptables -t mangle -F && sudo iptables -X
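If flannel was installed, you may also want to clean up the CNI config and interfaces it leaves behind before re-initialising - the paths and interface names below are flannel defaults and may differ for other CNI providers
# Remove CNI config and state left behind by flannel (flannel defaults - adjust for other CNI providers)
sudo rm -rf /etc/cni/net.d /var/lib/cni
# Remove the bridge and vxlan interfaces if they exist
sudo ip link delete cni0 2> /dev/null
sudo ip link delete flannel.1 2> /dev/null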