
Kubernetes

Quick Reference

Minikube

  • starts minikube
    minikube start
    
  • stops the minikube virtual machine
    minikube stop
    
  • completely wipe away the minikube image. You can also delete all files in ~/.minikube and ~/.kube.
    minikube delete
    
  • find out the required environment variables to connect to the docker daemon running in minikube
    minikube docker-env
    
  • find out the ip address of minikube. Needed for browser access.
    minikube ip
    

Kubectl

  • show the built-in documentation for a resource type
    kubectl explain pods
    
  • list all namespaces
    kubectl get ns
    
  • list all objects that you’ve created: Pods at first; later also ReplicaSets, Deployments and Services
    kubectl get all
    
  • either creates or updates resources depending on the contents of the yaml file
    kubectl apply -f <yaml file>
    
  • apply all yaml files found in the current directory
    kubectl apply -f .
    
  • gives full information about the specified pod
    kubectl describe pod <name of pod>
    
  • gives full information about the specified service
    kubectl describe svc <name of service>
    
  • view pod logs
    kubectl logs <pod name>
    
  • Events
    kubectl get events --sort-by=.metadata.creationTimestamp
    
  • execute the specified command in the pod’s container. Doesn’t work well in Cygwin.
    kubectl exec -it <pod name> -- <command>
    
  • get all pods or services. Later in the course, replicasets and deployments.
    kubectl get (pod | po | service | svc | rs | replicaset | deployment | deploy)
    
  • get all pods and their labels
    kubectl get po --show-labels
    
  • get all pods matching the specified name:value pair
    kubectl get po --show-labels -l {name}={value}
    
  • delete the named pod. Can also delete svc, rs, deploy
    kubectl delete po <pod name>
    
  • delete all pods (also svc, rs, deploy)
    kubectl delete po --all
    
  • Monitor changes: --watch
  • Diff: kubectl diff -f example.yaml
  • CPU/Memory usage: kubectl top pods

Deployment Management

  • create: kubectl create deployment example --image=<image-name>
  • expose: kubectl expose deployment example --type=LoadBalancer
  • scale: kubectl scale deployment example --replicas=3
  • get the status of the named deployment
    kubectl rollout status deploy <name of deployment>
    
  • get the previous versions of the deployment
    kubectl rollout history deploy <name of deployment>
    
  • go back one version in the deployment. Also optionally --to-revision=<revision_number>. We recommend this is used only in stressful emergency situations! Your YAML will now be out of date with the live deployment!
    kubectl rollout undo deploy <name of deployment>
    
  • Manual rollouts:
    kubectl rollout history deployment hello-world-rest-api
    kubectl set image deployment hello-world-rest-api hello-world-rest-api=in28min/hello-world-rest-api:0.0.3.RELEASE --record=true
    kubectl rollout undo deployment hello-world-rest-api --to-revision=1
    

Volumes

  • list PersistentVolumeClaims
    kubectl get pvc
    

kops

  • Create cluster
    kops create cluster --zones eu-west-2a,eu-west-2b ${NAME}
    
  • Edit instance-group configuration
    kops edit ig
    
  • Update cluster
    kops update cluster ${NAME} --yes
    
  • Validate cluster
    kops validate cluster
    
  • Delete cluster
    kops delete cluster --name ${NAME} --yes
    
    kubectl create deployment hello-world-rest-api --image=in28min/hello-world-rest-api:0.0.1.RELEASE
    kubectl expose deployment hello-world-rest-api --type=LoadBalancer --port=8080
    kubectl scale deployment hello-world-rest-api --replicas=3
    kubectl delete pod hello-world-rest-api-58ff5dd898-62l9d
    kubectl autoscale deployment hello-world-rest-api --max=10 --cpu-percent=70
    kubectl edit deployment hello-world-rest-api #minReadySeconds: 15
    kubectl set image deployment hello-world-rest-api hello-world-rest-api=in28min/hello-world-rest-api:0.0.2.RELEASE

    gcloud container clusters get-credentials in28minutes-cluster --zone us-central1-a --project solid-course-258105
    kubectl create deployment hello-world-rest-api --image=in28min/hello-world-rest-api:0.0.1.RELEASE
    kubectl expose deployment hello-world-rest-api --type=LoadBalancer --port=8080
    kubectl set image deployment hello-world-rest-api hello-world-rest-api=DUMMY_IMAGE:TEST
    kubectl get events --sort-by=.metadata.creationTimestamp
    kubectl set image deployment hello-world-rest-api hello-world-rest-api=in28min/hello-world-rest-api:0.0.2.RELEASE
    kubectl get events --sort-by=.metadata.creationTimestamp
    kubectl get componentstatuses
    kubectl get pods --all-namespaces

    kubectl get events
    kubectl get pods
    kubectl get replicaset
    kubectl get deployment
    kubectl get service

    kubectl get pods -o wide

    kubectl explain pods
    kubectl get pods -o wide

    kubectl describe pod hello-world-rest-api-58ff5dd898-9trh2

    kubectl get replicasets
    kubectl get replicaset

    kubectl scale deployment hello-world-rest-api --replicas=3
    kubectl get pods
    kubectl get replicaset
    kubectl get events
    kubectl get events --sort-by=.metadata.creationTimestamp

    kubectl get rs
    kubectl get rs -o wide
    kubectl set image deployment hello-world-rest-api hello-world-rest-api=DUMMY_IMAGE:TEST
    kubectl get rs -o wide
    kubectl get pods
    kubectl describe pod hello-world-rest-api-85995ddd5c-msjsm
    kubectl get events --sort-by=.metadata.creationTimestamp

    kubectl set image deployment hello-world-rest-api hello-world-rest-api=in28min/hello-world-rest-api:0.0.2.RELEASE
    kubectl get events --sort-by=.metadata.creationTimestamp
    kubectl get pods -o wide
    kubectl delete pod hello-world-rest-api-67c79fd44f-n6c7l
    kubectl get pods -o wide
    kubectl delete pod hello-world-rest-api-67c79fd44f-8bhdt

    kubectl get componentstatuses
    kubectl get pods --all-namespaces

    gcloud auth login
    kubectl version
    gcloud container clusters get-credentials in28minutes-cluster --zone us-central1-a --project solid-course-258105

    kubectl logs hello-world-rest-api-58ff5dd898-6ctr2
    kubectl logs -f hello-world-rest-api-58ff5dd898-6ctr2

    kubectl get deployment hello-world-rest-api -o yaml
    kubectl get deployment hello-world-rest-api -o yaml > deployment.yaml
    kubectl get service hello-world-rest-api -o yaml > service.yaml
    kubectl apply -f deployment.yaml
    kubectl get all -o wide
    kubectl delete all -l app=hello-world-rest-api

    kubectl get svc --watch
    kubectl diff -f deployment.yaml
    kubectl delete deployment hello-world-rest-api
    kubectl get all -o wide
    kubectl delete replicaset.apps/hello-world-rest-api-797dd4b5dc

    kubectl get pods --all-namespaces
    kubectl get pods --all-namespaces -l app=hello-world-rest-api
    kubectl get services --all-namespaces
    kubectl get services --all-namespaces --sort-by=.spec.type
    kubectl get services --all-namespaces --sort-by=.metadata.name
    kubectl cluster-info
    kubectl cluster-info dump
    kubectl top node
    kubectl top pod
    kubectl get services
    kubectl get svc
    kubectl get ev
    kubectl get rs
    kubectl get ns
    kubectl get nodes
    kubectl get no
    kubectl get pods
    kubectl get po

    kubectl delete all -l app=hello-world-rest-api
    kubectl get all

    kubectl apply -f deployment.yaml 
    kubectl apply -f ../currency-conversion/deployment.yaml 

Yamls

apiVersion:
kind:
metadata:
spec:

Pod

apiVersion: v1
kind: Pod
metadata:
  name: pod-example
spec:
  containers:
  - name: ubuntu
    image: ubuntu:trusty
    command: ["echo"]
    args: ["Hello World"]

ReplicaSet

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  # Unique key of the ReplicaSet instance
  name: replicaset-example
spec:
  # 3 Pods should exist at all times.
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      # Run the nginx image
      - name: nginx
        image: nginx:1.14

Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  # Unique key of the Deployment instance
  name: deployment-example
spec:
  # 3 Pods should exist at all times.
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        # Apply this label to pods and default
        # the Deployment label selector to this value
        app: nginx
    spec:
      containers:
      - name: nginx
        # Run this image
        image: nginx:1.14
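
Rolling updates (driven by the rollout commands in the quick reference) can be tuned on the Deployment itself. A minimal sketch extending the example above; the strategy values and minReadySeconds are illustrative:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: deployment-example
spec:
  replicas: 3
  # Wait before a newly created Pod is counted as available
  minReadySeconds: 15
  strategy:
    type: RollingUpdate
    rollingUpdate:
      # At most one extra Pod during a rollout
      maxSurge: 1
      # Never drop below the desired replica count
      maxUnavailable: 0
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14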

DaemonSet

apiVersion: apps/v1
kind: DaemonSet
metadata:
  # Unique key of the DaemonSet instance
  name: daemonset-example
spec:
  selector:
    matchLabels:
      app: daemonset-example
  template:
    metadata:
      labels:
        app: daemonset-example
    spec:
      containers:
      # This container is run once on each Node in the cluster
      - name: daemonset-example
        image: ubuntu:trusty
        command:
        - /bin/sh
        args:
        - -c
        # This script is run through `sh -c <script>`
        - >-
          while [ true ]; do
          echo "DaemonSet running on $(hostname)" ;
          sleep 10 ;
          done

Service

kind: Service
apiVersion: v1
metadata:
  # Unique key of the Service instance
  name: service-example
spec:
  ports:
    # Accept traffic sent to port 80
    - name: http
      port: 80
      targetPort: 80
  selector:
    # Loadbalance traffic across Pods matching
    # this label selector
    app: nginx
  # Create an HA proxy in the cloud provider
  # with an External IP address - *Only supported
  # by some cloud providers*
  type: LoadBalancer
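
The notes further down also mention the NodePort service type. A minimal sketch for the same nginx Pods; the nodePort value is illustrative and must fall in the default 30000-32767 range:

kind: Service
apiVersion: v1
metadata:
  name: service-nodeport-example
spec:
  type: NodePort
  selector:
    app: nginx
  ports:
    - name: http
      # Port of the Service inside the cluster
      port: 80
      # Port on the Pod
      targetPort: 80
      # Port opened on every node
      nodePort: 30080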

Volume and VolumeMounts

apiVersion: v1
kind: Pod
metadata:
  name: pv-recycler
  namespace: default
spec:
  restartPolicy: Never
  volumes:
  - name: vol
    hostPath:
      path: /any/path/it/will/be/replaced
  containers:
  - name: pv-recycler
    image: "k8s.gcr.io/busybox"
    command: ["/bin/sh", "-c", "test -e /scrub && rm -rf /scrub/..?* /scrub/.[!.]* /scrub/*  && test -z \"$(ls -A /scrub)\" || exit 1"]
    volumeMounts:
    - name: vol
      mountPath: /scrub

PersistentVolumeClaims

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-pvc
spec:
  storageClassName: my-ssd-local-storage
  # Find matching storage class
  accessModes:
    - ReadWriteOnce
    # Find matching access mode
  resources:
    requests:
      storage: 8Gi
      # Find 8Gi of storage 
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-local-storage
spec:
  storageClassName: my-ssd-local-storage
  capacity:
    storage: 8Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/somedir"
    type: DirectoryOrCreate
---
apiVersion: v1
kind: Pod
metadata:
  name: mypod
spec:
  containers:
    - name: myfrontend
      image: nginx
      volumeMounts:
      - mountPath: "/var/www/html"
        name: mypd
  volumes:
    - name: mypd
      persistentVolumeClaim:
        claimName: my-pvc

StorageClass and Binding

  • Change reclaim policy as required
# What do we want?
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mongo-pvc
spec:
  storageClassName: cloud-ssd
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 7Gi
---
# How do we want it implemented
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: cloud-ssd
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2

StatefulSet

  • StatefulSet is the workload API object used to manage stateful applications.
    • Stable, unique network identifiers.
    • Stable, persistent storage.
    • Ordered, graceful deployment and scaling.
    • Ordered, automated rolling updates.
  • Like a Deployment, a StatefulSet manages Pods that are based on an identical container spec.
  • Unlike a Deployment, a StatefulSet maintains a sticky identity for each of their Pods.
  • These pods are created from the same spec, but are not interchangeable:
    • Each has a persistent identifier that it maintains across any rescheduling.
  • https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/
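
A minimal StatefulSet sketch based on the points above; the headless Service name, labels and storage size are illustrative:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: statefulset-example
spec:
  # Headless Service that gives each Pod a stable DNS name
  serviceName: statefulset-example
  replicas: 3
  selector:
    matchLabels:
      app: statefulset-example
  template:
    metadata:
      labels:
        app: statefulset-example
    spec:
      containers:
      - name: nginx
        image: nginx:1.14
  # One PersistentVolumeClaim is created per Pod and survives rescheduling
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi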

RBAC

  • https://kubernetes.io/docs/reference/access-authn-authz/rbac/
  • Kind: Role. Use ClusterRole instead to grant cluster-wide permissions (see the sketch after these examples).
    apiVersion: rbac.authorization.k8s.io/v1
    kind: Role
    metadata:
      namespace: default
      name: pod-reader
    rules:
    - apiGroups: [""] # "" indicates the core API group
      resources: ["pods"]
      verbs: ["get", "watch", "list"]
    
    apiVersion: rbac.authorization.k8s.io/v1
    # This role binding allows "jane" to read pods in the "default" namespace.
    # You need to already have a Role named "pod-reader" in that namespace.
    kind: RoleBinding
    metadata:
      name: read-pods
      namespace: default
    subjects:
    # You can specify more than one "subject"
    - kind: User
      name: jane # "name" is case sensitive
      apiGroup: rbac.authorization.k8s.io
    roleRef:
      # "roleRef" specifies the binding to a Role / ClusterRole
      kind: Role #this must be Role or ClusterRole
      name: pod-reader # this must match the name of the Role or ClusterRole you wish to bind to
      apiGroup: rbac.authorization.k8s.io
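  • A cluster-wide equivalent, sketched with the same read-only pod permissions; the names here are illustrative.
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      # ClusterRoles are not namespaced
      name: pod-reader-cluster
    rules:
    - apiGroups: [""]
      resources: ["pods"]
      verbs: ["get", "watch", "list"]
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: read-pods-cluster
    subjects:
    - kind: User
      name: jane
      apiGroup: rbac.authorization.k8s.io
    roleRef:
      kind: ClusterRole
      name: pod-reader-cluster
      apiGroup: rbac.authorization.k8s.io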
    

ConfigMap

apiVersion: v1
kind: ConfigMap
metadata:
  creationTimestamp: 2016-02-18T18:52:05Z
  name: game-config
  namespace: default
  resourceVersion: "516"
  uid: b4952dc3-d670-11e5-8cd0-68f728db1985
data:
  game.properties: |
    enemies=aliens
    lives=3
    enemies.cheat=true
    enemies.cheat.level=noGoodRotten
    secret.code.passphrase=UUDDLRLRBABAS
    secret.code.allowed=true
    secret.code.lives=30
  ui.properties: |
    color.good=purple
    color.bad=yellow
    allow.textmode=true
    how.nice.to.look=fairlyNice
kubectl create configmap game-config-2 --from-file=configure-pod-container/configmap/game.properties
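
A sketch of a Pod consuming the game-config ConfigMap above, both as a single environment variable (configMapKeyRef) and as files mounted from a volume; the Pod name, image and mount path are illustrative:

apiVersion: v1
kind: Pod
metadata:
  name: game-config-consumer
spec:
  containers:
  - name: app
    image: busybox
    command: ["/bin/sh", "-c", "ls /etc/config && sleep 3600"]
    env:
    # Inject one key of the ConfigMap as an environment variable
    - name: GAME_PROPERTIES
      valueFrom:
        configMapKeyRef:
          name: game-config
          key: game.properties
    volumeMounts:
    # Each key of the ConfigMap becomes a file under this path
    - name: config-volume
      mountPath: /etc/config
  volumes:
  - name: config-volume
    configMap:
      name: game-config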

Secret

apiVersion: v1
kind: Secret
metadata:
  name: mysecret
type: Opaque
data:
  username: YWRtaW4=
  password: MWYyZDFlMmU2N2Rm
---
apiVersion: v1
kind: Secret
metadata:
  name: mysecret
type: Opaque
stringData:
  config.yaml: |-
    apiUrl: "https://my.api.com/api/v1"
    username: {{username}}
    password: {{password}}
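
A sketch of a Pod consuming mysecret, as an environment variable and as a read-only volume; the Pod name, image and mount path are illustrative:

apiVersion: v1
kind: Pod
metadata:
  name: secret-consumer
spec:
  containers:
  - name: app
    image: busybox
    command: ["/bin/sh", "-c", "ls /etc/secrets && sleep 3600"]
    env:
    # Inject one key of the Secret as an environment variable
    - name: DB_USERNAME
      valueFrom:
        secretKeyRef:
          name: mysecret
          key: username
    volumeMounts:
    # Each key of the Secret becomes a file under this path
    - name: secret-volume
      mountPath: /etc/secrets
      readOnly: true
  volumes:
  - name: secret-volume
    secret:
      secretName: mysecret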

Ingress

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: test-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - host: domain1.com
    http:
      paths:
      - path: /testpath
        pathType: Prefix
        backend:
          serviceName: test
          servicePort: 80
  - host: domain2.com   # add its own http/paths rules here

Job

apiVersion: batch/v1
kind: Job
metadata:
  name: pi
spec:
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl",  "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
  backoffLimit: 4

CronJob

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            args:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure

Cloud Deployments

Kops

Notes

  • Stay current in k8s
  • Harden nodes
  • Restrict network via RBAC
  • Use namespaces and network policies
  • Slim down images
  • Logs - k8s audit / k8s rbac audit logs /
  • Follows single responsibility concept
  • Cluster
    • Master Node - Manage Cluster
      • API Server (kube-apiserver)
      • Distributed Database (etcd)
      • Scheduler (kube-scheduler)
      • Controller Manager (kube-controller-manager)
    • Worker Nodes - Run Applications
      • Node Agent (kubelet)
        • Interacts with kube-apiserver
      • Networking Component (kube-proxy)
      • Container Runtime (CRI-OCI/docker/rkt)
      • PODS
  • Service
    • Pods are not visible outside k8s cluster
    • A K8s Service is a long-running object with an IP address and a stable, fixed port, usable to expose Pods outside the k8s cluster
    • Pod can have "Label" (key value pairs)
    • Service can have "Selector" (key value pairs)
    • Service will look for any matching key-value pairs when binding a service to a pod (selector / label)
    • Service Types:
      • NodePort
        • Expose a port through the node
        • The nodePort must be within the 30000-32767 range (by default)
        • Configure port and nodeport
      • ClusterIP
        • Internal service / not exposed to external traffic but exposed to internal nodes
      • LoadBalancer
  • Pod may die for many reasons:
    • If the pod takes too many resources
    • If the node fails (all pods on that node die)
  • ReplicaSets
    • Add replica count into pod definition
    • Change kind to ReplicaSet
    • Nest pod definition in template spec attribute
    • Remove pod names
    • Add selector block (similar to services)
  • Deployment
    • Automatic rolling update with rollbacks
  • Networking
    • Service Discovery
      • Same Pod - can be accessed via localhost
      • Kube-dns - Resolve service name to IP address
    • Namespaces
      • Partitioning resources into separate areas
      • kubectl get ns - list namespaces
      • kubectl get all -n kube-system - list resources from a namespace
    • FQDN -> database -> database.mynamespace -> database.mynamespace.svc.cluster.local (last part is appended by DNS)
      • Check /etc/resolv.conf
  • Expand memory:
    • minikube stop
    • minikube delete
    • rm -rf ~/.kube and rm -rf ~/.minikube
    • minikube start --memory 4096
  • Microservices
    • changing one part is hard without affecting others
    • complications of release coordination (big bang releases)
    • shared global databases (integration databases)
    • microservices as extreme form of modularity
  • Common Cluster Configurations: https://github.com/kubernetes/kubernetes/tree/master/cluster/addons
  • ElasticStack + (LogStash / FluentD)
  • Helm - Package management for k8s
    • helm init
    • helm version
    • https://github.com/helm/charts
    • helm repo update
    • helm install stable/mysql --set mysqlPassword=example --name my-mysql
    • Fix permissions:
      kubectl create serviceaccount --namespace kube-system tiller
      kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller
      kubectl patch deploy --namespace kube-system tiller-deploy -p '{"spec":{"template":{"spec":{"serviceAccount":"tiller"}}}}' 
      
  • Prometheus / Grafana (UI)
    • helm install stable/prometheus-operator --name my-monitoring --namespace monitoring
    • To see how this works, edit the prometheus-oper-prometheus service (e.g. kubectl edit svc), change its type to LoadBalancer and access the UI (port 9090, /graph)
    • To get a full set of data from Prometheus:
      kops edit cluster --name ${NAME}
      kubelet:
        anonymousAuth: false    
        authenticationTokenWebhook: true    
        authorizationMode: Webhook
      kops update cluster --yes
      kops rolling-update cluster --yes 
      
    • If the install fails with 'prometheus-operator" failed: rpc error: code = Canceled', see https://github.com/helm/helm/issues/6130:
      • helm del --purge monitoring
      • helm install --name monitoring --namespace monitoring stable/prometheus-operator --set prometheusOperator.createCustomResource=false
    • AlertManager over /api
      • Dead man's switch - an alert that fires constantly; if its notifications stop arriving, alerting (or Prometheus itself) is down
    • Slack integration
      • Incoming web-hook
      • Listen for Dead man's switch
      • https://prometheus.io/docs/alerting/latest/configuration/
      • alertmanager.yaml
        global:
          slack_api_url: '<<add your slack endpoint here>>'
        route:
          group_by: ['alertname']
          group_wait: 5s
          group_interval: 1m
          repeat_interval: 10m
          receiver: 'slack'

        receivers:
        - name: 'slack'
          slack_configs:
          - channel: '#alerts'
            icon_emoji: ':bell:'
            send_resolved: true
            text: "<!channel> \nsummary: {{ .CommonAnnotations.message }}\n"
        
      • kubectl logs -n monitoring-namespace kube-prometheus
      • kubectl logs -n monitoring-namespace kube-prometheus -c alertmanager
      • remove existing secret for the config file and set a new secret (alertmanager-kube-prometheus)
  • Requests and Limits
    • Allow cluster manager to make intelligent decisions
    • If the memory limit is reached -> the Pod remains running, but the container is killed and restarted
    • If the CPU limit is reached -> CPU is clamped/throttled (the container is not allowed to go over)
      kind: Pod
      spec:
        containers:
        - name: db
          resources:
            requests:
              memory: "64Mi"
              cpu: "250m"
            limits:
              memory: "128Mi"
              cpu: "500m"
      
  • Metrics
    • Enable metrics server: minikube addons list, then minikube addons enable metrics-server
    • View stats: kubectl top pod and kubectl top node
  • Dashboard
    • minikube dashboard
  • Horizontal Pod Autoscaling (HPA) - see the manifest sketch at the end of this list
  • Readiness Probe / Liveness Probe
  • QoS and Eviction
  • Pod Priorities
    • https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/
    • Priority indicates the importance of a Pod relative to other Pods.
    • If a Pod cannot be scheduled, the scheduler tries to preempt (evict) lower priority Pods to make scheduling of the pending Pod possible.
    • Only Pod-Priority is used in scheduling new pods
    • During eviction, QoS is considered first and then Pod priority
  • RBAC
    • kubectl get role
    • kubectl get rolebinding
    • Super-user
      • Create OS user
      • In k8s, create a new namespace: kubectl create ns playground
      • kubectl config view. Copy API LB endpoint URL.
      • X.509
        • API only accept requests signed by CA within the k8s cluster
        • Create private-key: openssl genrsa -out private-key-username.key 2048
        • CSR: openssl req -new -key private-key-username.key -out req.csr -subj "/CN=username/O=groupname"
        • aws s3 cp s3://storage-name/example.local/pki/private/ca/<number>.key kubernetes.key
        • aws s3 cp s3://storage-name/example.local/pki/issued/ca/<number>.crt kubernetes.crt
        • openssl x509 -req -in req.csr -CA kubernetes.crt -CAkey kubernetes.key -CAcreateserial -out username.crt -days 365
        • Install new certificate with:
          • cp username.crt /home/.certs/
          • cp private-key-username.key /home/.certs/
          • cp kubernetes.crt /home/.certs/
          • chown -R username:username /home/username/certs
      • Define Role and RoleBinding (to bind role to user).
    • The new user then does:
      • kubectl config set-credentials username --client-certificate=username.crt --client-key=private-key-username.key
      • kubectl config set-cluster example.local --certificate-authority=kubernetes.crt
      • kubectl config set-cluster example.local --server=<api-lb_url>
      • Now the new user can: kubectl config set-context mycontext --user my-user-name --cluster example.local (see https://kubernetes.io/docs/tasks/access-application-cluster/configure-access-multiple-clusters/)
      • Check config with kubectl config view
      • kubectl config use-context mycontext
      • Now new user can kubectl get all
    • The user is only allowed to access the namespace defined in the Role.
    • If all namespaces should be visible, use ClusterRole and ClusterRoleBinding.
      • Grant limited permissions across the cluster, then use a Role to grant wider permissions in the user's own namespace.
    • ServiceAccount in RoleBinding is used to grant access to one pod from another.
  • ConfigMap
    • Env variables are unmanageable and may get duplicated, leaving multiple places to change
    • kubectl get cm
    • kubectl describe cm
    • Using config map
      • valueFrom -> configMapKeyRef
        • Still duplicates
        • When updating:
          • Create a new version of the config-map
          • Change the name referenced
      • envFrom -> configMapRef
        • all key-value pairs are copied
      • volumeMount and volume -> configMap
        • Each key of the map becomes a file name (with its value as the file content)
  • Secrets
    • Only prevents you from accidentally printing the secret when doing kubectl get secrets
    • No encryption (values are merely base64-encoded)
    • RBAC can be used to deny access to secrets while giving access to configMaps
    • kubectl get secrets
    • kubectl get secret -n monitoring example -o yaml
    • kubectl get secret -n monitoring example -o json
    • kubectl delete secret -n monitoring example
    • kubectl create secret -n monitoring example --from-file=example.yaml
    • Hot Reloading of ConfigMaps Spring Cloud Kubernetes Demo: https://www.youtube.com/watch?v=DiJ0Na8rWvc&t=563s
  • Ingress Controllers
    • Accessing Service
      • NodePort - allows defining a port and exposing it on every node. Ports 30000-32767 only (by default).
      • ClusterIP - Internal access only
      • LoadBalancer - Provided by the cloud provider
        • will require separate LB for each service
      • The LB talks to the Ingress Controller (e.g. Nginx), and the Ingress Controller routes to Services
    • kubectl get ingress / kubectl describe ingress
    • minikube addons list, then minikube addons enable ingress
    • https://kubernetes.github.io/ingress-nginx/
  • Batch Jobs
    • Pod by default always restarts (restart policy)
    • Designed to run to completion
  • DaemonSet
    • All nodes run a copy of these pods
  • StatefulSet
    • Pods have predictable names
    • Startup in sequence
    • Client can address by name
    • Example: Primary secondary situation
  • OpenFaaS, Kubeless, OpenWhisk
  • GitLab, TravisCI, CircleCI
  • Prometheus, Fluentd, OpenTracing, Jaeger
  • Istio, Linkerd, Consul
  • Terraform, Helm
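
A HorizontalPodAutoscaler manifest sketch, referenced from the HPA bullet above and roughly equivalent to the kubectl autoscale command in the quick reference; the target Deployment name and thresholds are illustrative, and the metrics server must be enabled:

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: deployment-example
spec:
  # The Deployment to scale
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: deployment-example
  minReplicas: 1
  maxReplicas: 10
  # Add replicas when average CPU usage exceeds 70% of the requested CPU
  targetCPUUtilizationPercentage: 70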

Tools

Security Validations

  • For internal services, use the ClusterIP service type (instead of NodePort).
  • In production, should all internal services use ClusterIP?
  • Are pods stateless?
  • Are resource requests and limits defined properly?
    • kubectl describe node minikube (cluster node). Allocatable and Capacity.
  • With Java versions before 10, the JVM defaults -Xmx to ¼ of the host's RAM (it is not container-aware); set it explicitly (e.g. -Xmx50m)
  • Are Readiness Probe / Liveness Probe defined properly? (see the sketch after this list)
  • kubectl config view --minify shows the cluster admin user/password, usable to log in to the cluster API and all services deployed in the cluster, including AlertManager:
    • /api/v1/namespaces/
    • /api/v1/namespaces/monitoring/services
    • /api/v1/namespaces/monitoring/services/alert-manager-operated:9093/proxy
  • Prometheus: https://prometheus.io/docs/operating/security/
  • Audit RBAC configuration
    • API groups are defined properly without using *
    • Check ClusterRole for cluster wide permissions
  • K8s API only accepts requests signed by CA within the k8s cluster
  • Make sure cluster key is stored securely.
    • kops: KOPS_STATE_STORE (s3 bucket) contains this. cluster_folder/pki/private/ca/<number>.key and pki/issued/ca/<number>.crt.
      • aws s3 ls s3://storage-name/example.local/pki/private/ca/<number>.key
      • aws s3 cp s3://storage-name/example.local/pki/private/ca/<number>.key kubernetes.key
      • aws s3 cp s3://storage-name/example.local/pki/issued/ca/<number>.crt kubernetes.crt
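
A readiness/liveness probe sketch, referenced from the checklist above; the path, port and timings are illustrative:

apiVersion: v1
kind: Pod
metadata:
  name: probe-example
spec:
  containers:
  - name: app
    image: nginx:1.14
    # Keep the Pod out of Service endpoints until this passes
    readinessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 10
    # Restart the container if this keeps failing
    livenessProbe:
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 15
      periodSeconds: 20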

Cluster API

  • Brings declarative, Kubernetes-style APIs to cluster creation, configuration and management.

New References

Terraform

  • Server provisioning
  • Typical workflow:
    • Initialize: terraform init
    • Validate: terraform validate
    • Plan: terraform plan
    • Execute: terraform apply
  • terraform console
    • Resources are referenced as <resource_type>.<name>. Example: aws_s3_bucket.my_s3_bucket
  • Types: resource, output, variable, lists, maps, provisioner, connection
  • terraform.tfstate, terraform.tfstate.backup and .terraform must not be committed; they contain secrets in plain text.
  • When state needs to be shared, store the state inside a Remote Backend.
  • Create multiple resources: count = 2 with name = "my_user_${count.index}"
  • terraform show / terraform fmt
  • Assign a variable from an environment variable: TF_VAR_<variable_name>="<value>"
  • Best Practice: Create separate projects for resources that might have similar lifecycle. One application would have multiple projects.
  • Best Practice: Use immutable servers - once a server is provisioned and a change is required, just remove the old server and provision a new one. Do not tweak servers in place.
  • Make things dynamic:
    • aws_default_vpc is a special resource that is not created/destroyed by Terraform.
    • Data source aws_subnet_ids to get the list of subnets within the VPC
    • aws_security_group
    • Data sources aws_ami and aws_ami_ids (most_recent = true)
  • terraform graph -> Graphviz
  • Storing state into S3
    • Create S3 bucket: lifecycle, versioning, server_side_encryption_configuration
    • Create aws_dynamodb_table for locks: Attribute(LockID/S)
    • terraform -> backend "s3" -> point to the S3 bucket and DynamoDB table; the key should be unique: application/project/type/environment
      • application_name, project_name, environment can be variables
  • Workspaces
    • Track multiple environments
    • terraform workspace show
    • terraform workspace list
    • terraform workspace new prod-env
    • terraform workspace select prod-env
    • Use ${terraform.workspace} to add the workspace name to resources
  • Modules
    • Create a terraform-modules dir and in it create modules for different resources
    • Create dev and prod folders. In these folders, in main.tf, add a variable for the environment.
      • module "user_module" { source = "../../terraform-modules/users" environment = "dev"}
    brew install terraform
    terraform --version
    terraform version
    terraform init
    export AWS_ACCESS_KEY_ID=*******
    export AWS_SECRET_ACCESS_KEY=*********
    terraform plan
    terraform console
    terraform apply -refresh=false
    terraform plan -out iam.tfplan
    terraform apply "iam.tfplan"
    terraform apply -target=aws_iam_user.my_iam_user
    terraform destroy
    terraform validate
    terraform fmt
    terraform show
    export TF_VAR_iam_user_name_prefix=FROM_ENV_VARIABLE_IAM_PREFIX
    terraform plan -refresh=false -var="iam_user_name_prefix=VALUE_FROM_COMMAND_LINE"
    terraform apply -target=aws_default_vpc.default
    terraform apply -target=data.aws_subnet_ids.default_subnets
    terraform apply -target=data.aws_ami_ids.aws_linux_2_latest_ids
    terraform apply -target=data.aws_ami.aws_linux_2_latest
    terraform workspace show
    terraform workspace new prod-env
    terraform workspace select default
    terraform workspace list
    terraform workspace select prod-env

Networking

  • Flannel
  • Calico
  • Canal = Flannel + Network Policy (Calico) - see the NetworkPolicy sketch after this list
  • Weave
  • CNI has less performance than VXLAN
  • Azure Network CNI plugin
  • IPSec
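
A minimal NetworkPolicy sketch, referenced from the Canal bullet above; it only takes effect with a CNI plugin that enforces policies (e.g. Calico or Canal), and the labels and port are illustrative:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-db
spec:
  # The Pods this policy applies to
  podSelector:
    matchLabels:
      app: db
  policyTypes:
  - Ingress
  ingress:
  - from:
    # Only Pods with this label may connect
    - podSelector:
        matchLabels:
          app: frontend
    ports:
    - protocol: TCP
      port: 5432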

Security

  • Layer 0 - Kernel
    • Check allowed syscalls and restrict
    • Improve sandbox with
      • gVisor
      • Kata Containers
      • AWS Firecracker
    • Keep kernel updated
  • Layer 1 - The container
    • Build
      • Remove unnecessary tools
      • Lightweight image
      • Scan images
      • Known sources
      • Check integrity
    • Run
      • Ephemeral containers for live debugging (never exec into a running container)
      • Hook into the running container and watch for anomalies
        • eBPF
        • Sysdig
        • Falco

  • Layer 2 - The workloads
    • Pod security context
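
A sketch of Pod- and container-level security contexts for the point above; the user/group IDs and image are illustrative:

apiVersion: v1
kind: Pod
metadata:
  name: security-context-example
spec:
  # Pod-level settings apply to all containers
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    fsGroup: 2000
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
    # Container-level settings override/extend the Pod-level ones
    securityContext:
      allowPrivilegeEscalation: false
      readOnlyRootFilesystem: true
      capabilities:
        drop: ["ALL"]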