Tuesday, 7 June 2022

Kubernetes basics and cheatsheet

Before we get to Kubernetes, let us first cover some container basics and the benefits of containerization.

A container is an executable package of software that includes everything needed to run it. Containerization packages software code together with just the operating system (OS) libraries and dependencies required to run it, producing a single lightweight executable (a container) that runs consistently on any infrastructure.

Executable unit of software

  • Encapsulates everything necessary to run
  • Can be run anywhere

OS virtualization:

  • Isolates processes
  • Controls the resources allocated to those processes

Small, fast, and portable

  • Doesn’t include guest OS in every instance
  • Leverages host OS

Benefits of containers:

  • Portability
  • Agility: rapid application development
  • Speed:
    • Lightweight
    • Don’t include a guest OS
    • Spin up quickly and scale horizontally
  • Fault isolation
    • The failure of one container does not affect the continued operation of any other containers
  • Efficiency / cost effective
  • Ease of management
  • Security

The Open Container Initiative (OCI), established in June 2015 by Docker and other industry leaders, promotes common, minimal, open standards and specifications around container technology.
The ecosystem is standardizing on containerd, with alternatives such as CoreOS rkt, Mesos Containerizer, LXC (Linux Containers), OpenVZ, and CRI-O.


Docker is a platform for building and running containers. A Dockerfile serves as the blueprint for an image.
  • Image: an immutable file that contains everything necessary to run an application.
  • Container: a running instance of an image.
  • Layers: each Dockerfile instruction that changes the filesystem (such as RUN, COPY, or ADD) creates a new read-only layer. A writable layer is added on top when an image is run as a container.
Note: The main difference between ADD and COPY in a Dockerfile is that COPY can only copy local files or directories, whereas ADD can also fetch files from remote URLs.
CMD sets the default command a container runs; it usually appears as the last instruction in a Dockerfile.
Naming: hostname/repository:tag
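
As a rough illustration of the hostname/repository:tag naming scheme, the sketch below splits a fully qualified image name into its three parts. It assumes all three parts are present. The helper name `split_image_name` and the registry `registry.example.com` are made up for this example, and real image references have more cases (a default registry, digests) that this does not handle.

```python
from typing import NamedTuple


class ImageName(NamedTuple):
    hostname: str
    repository: str
    tag: str


def split_image_name(name: str) -> ImageName:
    """Split 'hostname/repository:tag' into its parts (hypothetical helper)."""
    # Everything before the first slash is treated as the registry hostname.
    hostname, sep, rest = name.partition("/")
    if not sep:
        raise ValueError(f"expected hostname/repository:tag, got {name!r}")
    # The tag follows the last colon, so a port in the hostname is left alone.
    repository, sep, tag = rest.rpartition(":")
    if not sep or not repository or not tag:
        raise ValueError(f"expected hostname/repository:tag, got {name!r}")
    return ImageName(hostname, repository, tag)


print(split_image_name("registry.example.com/team/hello-kubernetes:1.0"))
```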

Kubernetes is an open-source container orchestration system for automating software deployment, scaling, and management. Google originally designed Kubernetes, but the Cloud Native Computing Foundation now maintains the project. (Source: Wikipedia)

Container orchestration means managing the lifecycle of containers, especially in large, dynamic environments:
  • Provisioning and deployment
  • Availability
  • Scaling
  • Scheduling to infrastructure
  • Rolling updates
  • Health checks
The Kubernetes documentation describes Kubernetes as “a portable, extensible, open-source platform for managing containerized workloads and services that facilitates both declarative configuration and automation. It has a large, rapidly growing ecosystem. Kubernetes services, support, and tools are widely available.”

Kubernetes is not a traditional PaaS:
  • Does not limit the types of applications
  • Does not deploy source code or build applications
  • Does not provide built-in middleware, databases, or other services
A Kubernetes cluster is a set of nodes that run containerized applications. When you deploy Kubernetes, you get a cluster. A Kubernetes cluster consists of two types of nodes:
  • Control plane (master node)
  • Nodes (worker nodes)

Control plane (master node):
It makes decisions about the cluster and detects and responds to events in the cluster.
  • Kubernetes API server:
    • All communication in the cluster goes through this API.
  • Kubernetes scheduler:
    • The Kubernetes scheduler assigns newly created Pods to nodes. This means that the scheduler determines where your workloads should run within the cluster.
  • etcd:
    • A highly available key-value store that contains all the cluster data. When you tell Kubernetes to deploy your application, that deployment configuration is stored in etcd. etcd is thus the source of truth for the state of a Kubernetes cluster, and the system works to bring the actual cluster state in line with what is stored in etcd.
  • Kubernetes controller manager:
    • The Kubernetes controller manager runs all the controller processes that monitor the cluster state and ensure that the actual state of a cluster matches the desired state.
  • Cloud controller manager:
    • Runs controllers that interact with the underlying cloud providers. These controllers effectively link clusters into a cloud provider’s API. Since Kubernetes is open-source software and would ideally be adopted by a variety of cloud providers and organizations, it strives to be as cloud-agnostic as possible.
Kubernetes worker nodes:
  • Nodes:
    • Nodes are the worker machines in a Kubernetes cluster; user applications run on them. A node can be a physical machine or a virtual machine. Nodes are managed by the control plane and contain the services needed to run applications.
  • Kube-proxy:
    • Network proxy
    • Maintains network rules that allow communication to Pods
  • Kubelet:
    • Communicates with the API server
    • Ensures that Pods and their associated containers are running
    • Reports to the control plane on health and status
A control loop is defined as a non-terminating loop that regulates the state of a system.
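
To make the control-loop idea concrete, here is a toy sketch in Python. It is not Kubernetes code: `reconcile`, the pod names, and the crash events are all invented for illustration. Each pass compares the desired number of Pods with what is actually running and corrects the difference. A real control loop never terminates, so the example runs only three passes.

```python
import itertools
from typing import Iterator, Optional


def reconcile(desired_replicas: int, running: set[str], new_names: Iterator[str]) -> set[str]:
    """One pass of a toy control loop: move the actual state toward the desired state."""
    running = set(running)
    # Too few pods running: start new ones.
    while len(running) < desired_replicas:
        running.add(next(new_names))
    # Too many pods running: stop the surplus.
    while len(running) > desired_replicas:
        running.remove(sorted(running)[-1])
    return running


names = (f"pod-{i}" for i in itertools.count())
state: set[str] = set()
# Hypothetical events: the pod that crashes before each pass (None = no crash).
crashes: list[Optional[str]] = [None, "pod-1", None]

for crashed in crashes:  # a real control loop would run forever
    state.discard(crashed)
    state = reconcile(3, state, names)
    print(sorted(state))
```

After the simulated crash of `pod-1`, the next pass notices that only two Pods are running and starts a replacement, which is the behaviour the controllers described above provide.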

Kubernetes objects are persistent entities in Kubernetes. "Persistent" means that when you create an object, Kubernetes continually works to ensure that the object exists in the system until you modify or remove it.
  • Persistent entities in Kubernetes
  • Define the desired state of your workload
  • Managed through the Kubernetes API, for example with kubectl
Kubernetes objects consist of two main fields.
  • The first is the object "spec," which is provided by the user. The spec dictates the desired state for this object.
  • The second field is the "status," which is provided by Kubernetes. The status describes the current (actual) state of the object, as opposed to its desired state. Kubernetes updates the status whenever the actual state of the object changes.
Commonly used objects:
  1. Namespaces: Namespaces provide logical separation of a cluster into virtual clusters.
  2. Labels: Labels are key/value pairs that can be attached to objects in order to identify those objects.
  3. Pods: The simplest unit in Kubernetes. A Pod represents a process running in the cluster and encapsulates one or more containers. Replicating Pods is how an app is scaled horizontally.
  4. ReplicaSet: A ReplicaSet is a group of identical running Pods. It encapsulates a Pod definition and adds the information needed to replicate it.
  5. Deployment: A Deployment is a higher-level object that manages ReplicaSets and provides updates for both Pods and ReplicaSets.
    • Provides updates for Pods and ReplicaSets
    • Runs multiple replicas of your application
    • Suitable for stateless applications
    • An update triggers a rollout
  • Autoscaling:
    • A ReplicaSet works with a fixed number of Pods
    • The Horizontal Pod Autoscaler (HPA) scales the number of Pods up and down as needed
      • kind: HorizontalPodAutoscaler
      • The scaling attributes are defined in the spec
      • Behind the scenes, it uses the ReplicaSet to create or remove Pods
    • Can be configured based on a desired state of CPU, memory, etc.
  • Rolling updates:
    • ReplicaSets and autoscaling are important for minimizing service interruption
    • Rolling updates are a way to roll out app changes in an automated and controlled fashion throughout your Pods
    • Rolling updates let us publish changes to our applications without noticeable interruptions for the user
    • Additionally, rolling updates give us a way to roll back any changes to the application
    • kubectl rollout status deployments/hello-kubernetes
    • kubectl rollout undo deployments/hello-kubernetes
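
The rolling-update idea can be sketched as a small simulation. This is a simplified model, not how Kubernetes implements rollouts: it assumes new Pods become ready instantly, and the function `rolling_update` and its `max_surge` parameter are illustrative names (loosely modelled on the idea of allowing a few extra Pods during a rollout). New Pods are started before old ones are retired, so serving capacity never drops below the original replica count.

```python
def rolling_update(pods: list[str], new_version: str, max_surge: int = 1) -> tuple[list[str], int]:
    """Replace every pod with new_version, a few at a time.

    Returns the final pod list and the lowest pod count seen after any step.
    """
    pods = list(pods)
    lowest_serving = len(pods)
    while any(version != new_version for version in pods):
        old = [version for version in pods if version != new_version]
        batch = min(max_surge, len(old))
        # Start the new pods first, so capacity does not drop.
        pods.extend([new_version] * batch)
        # Then retire the same number of old pods.
        for _ in range(batch):
            pods.remove(next(version for version in pods if version != new_version))
        # Record the capacity once this batch of old pods is gone.
        lowest_serving = min(lowest_serving, len(pods))
    return pods, lowest_serving


final_pods, lowest = rolling_update(["v1", "v1", "v1"], "v2")
print(final_pods, lowest)
```

Rolling back, as `kubectl rollout undo` does, amounts to running the same process toward the previous version.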
ConfigMaps give us a way to provide configuration data to pods and deployments so we don't have to hard-code that data in the application code. You can also reuse these ConfigMaps and Secrets for multiple deployments.
  • Used to provide configuration for deployments
  • Reusable across deployments
  • Created in a few different ways:
    • Using string literals
    • Using an existing properties or key=value file
    • Providing a ConfigMap YAML descriptor file. Both the first and second ways can help us create such a file.
  • A ConfigMap is not meant for sensitive data, and its data is limited to 1 MiB
$kubectl create configmap my-config --from-literal=MESSAGE="hello world config map"
$kubectl create cm my-config --from-file=my.properties
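
On the application side, avoiding hard-coded configuration might look like the sketch below. It assumes the ConfigMap key MESSAGE has been exposed to the container as an environment variable of the same name (that wiring is not shown here), and the function name `get_message` and the fallback text are made up for the example.

```python
import os
from typing import Mapping


def get_message(env: Mapping[str, str] = os.environ) -> str:
    """Read MESSAGE from the environment instead of hard-coding it."""
    # The fallback is used only when no configuration was injected.
    return env.get("MESSAGE", "hello from the built-in default")


print(get_message())
```

Changing the message then only requires updating the ConfigMap and restarting the Pods, not rebuilding the application image.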

A Secret is an object that contains a small amount of sensitive data such as a password, a token, or a key. Such information might otherwise be put in a Pod specification or in a container image. Using a Secret means that you don't need to include confidential data in your application code.

$kubectl create secret generic api-creds --from-literal=key=mycred
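
One detail worth knowing: values in the `data` field of a Secret are stored base64-encoded, which is an encoding and not encryption. The short sketch below, using only Python's standard library, shows that anyone who can read the encoded value can recover the original. The helper names are made up for the example.

```python
import base64


def encode_secret_value(value: str) -> str:
    """Encode a value the way it appears in a Secret's data field."""
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


def decode_secret_value(encoded: str) -> str:
    """Reverse the encoding; no key or password is needed."""
    return base64.b64decode(encoded).decode("utf-8")


encoded = encode_secret_value("mycred")
print(encoded)                       # bXljcmVk
print(decode_secret_value(encoded))  # mycred
```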

Why do we need Services?
A Service is responsible for enabling network access to a set of Pods. Each Pod has its own IP address, but Pods are ephemeral and are destroyed frequently. Each time a Pod is recreated, it is assigned a new IP address.

A Service, by contrast, provides a stable IP address and load balancing. It is loosely coupled to the Pods behind it and helps route traffic within and outside the cluster. The Service's selector, a set of key/value pairs matched against Pod labels, identifies which Pods requests are forwarded to.
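
Selector matching can be pictured with a few lines of Python. This is only a model of the idea, not Kubernetes code: the Pod names, labels, and IP addresses are invented, and `select_pod_ips` is a hypothetical helper. A Pod is selected when every key/value pair in the selector appears among its labels.

```python
def select_pod_ips(selector: dict[str, str], pods: list[dict]) -> list[str]:
    """Return the IPs of pods whose labels contain every selector pair."""
    return [
        pod["ip"]
        for pod in pods
        if all(pod["labels"].get(key) == value for key, value in selector.items())
    ]


# Hypothetical pods; the IPs change whenever pods are recreated.
pods = [
    {"name": "web-1", "ip": "10.0.0.4", "labels": {"app": "web", "tier": "frontend"}},
    {"name": "web-2", "ip": "10.0.0.9", "labels": {"app": "web", "tier": "frontend"}},
    {"name": "db-1", "ip": "10.0.0.7", "labels": {"app": "db"}},
]

print(select_pod_ips({"app": "web"}, pods))
```

Because the Service looks Pods up by label rather than by address, clients keep using the same Service IP even as the Pod IPs behind it change.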

ClusterIP: Exposes the Service on a cluster-internal IP. Choosing this value makes the Service reachable only from within the cluster; you cannot make requests to the Service (or its Pods) from outside the cluster. This is the default ServiceType.
  • Used for inter-service communication within the cluster, for example between the front-end and back-end components of your app.
  • Kubernetes creates an Endpoints object with the same name as the Service to keep track of which Pods are the members (endpoints) of the Service: $kubectl get endpoints -n myapp



Headless Services: Used when a client wants to communicate with one specific Pod directly instead of going through the Service.
For example, Pods may need to talk directly to a specific Pod such as a database primary or replica. Headless Services are mostly used with StatefulSet objects.
A DNS lookup for a regular Service returns a single IP address, the cluster IP. Setting clusterIP to None makes the lookup return the Pod IP addresses instead. No cluster IP is assigned to the Service.



NodePort Services: Exposes the Service on each Node's IP at a static port (the NodePort). A ClusterIP Service, to which the NodePort Service routes, is automatically created. You'll be able to contact the NodePort Service from outside the cluster by requesting <NodeIP>:<NodePort>.

By default, the node port must be in the range 30000–32767. Manually allocating a port to the Service is optional. If it is undefined, Kubernetes will automatically assign one.
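
The allocation rule can be sketched as follows. This is an illustrative model only: `allocate_node_port` is a made-up helper, and it simply picks the lowest free port when none is requested, which is an assumption for simplicity and not a description of how Kubernetes chooses.

```python
from typing import Optional

# Default NodePort range described above: 30000 to 32767 inclusive.
NODE_PORT_RANGE = range(30000, 32768)


def allocate_node_port(in_use: set[int], requested: Optional[int] = None) -> int:
    """Validate a requested node port, or pick a free one from the range."""
    if requested is not None:
        if requested not in NODE_PORT_RANGE:
            raise ValueError(f"port {requested} is outside 30000-32767")
        if requested in in_use:
            raise ValueError(f"port {requested} is already allocated")
        return requested
    # No port requested: pick the lowest free one (a simplification).
    for port in NODE_PORT_RANGE:
        if port not in in_use:
            return port
    raise RuntimeError("no free node ports left")


print(allocate_node_port(in_use={30000, 30001}))
print(allocate_node_port(in_use={30000}, requested=31000))
```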

A NodePort Service is less secure, because it opens a port on every node to traffic from outside the cluster.

Use Cases
  • When you want to enable external connectivity to your service.
  • Using a NodePort gives you the freedom to set up your own load balancing solution, to configure environments that are not fully supported by Kubernetes, or even to expose one or more nodes’ IPs directly.
  • Prefer to place a load balancer in front of your nodes so that the failure of a single node does not make the Service unreachable.

LoadBalancer: Exposes the Service externally using a cloud provider's load balancer. NodePort and ClusterIP Services, to which the external load balancer routes, are automatically created.

LoadBalancer is an extension of the NodePort Service. Avoid using a NodePort Service alone to expose applications externally; configure an Ingress or a LoadBalancer for production environments.


ExternalName
  • Services of type ExternalName map a Service to a DNS name, not to a typical selector such as my-service.
  • You specify these Services with the `spec.externalName` parameter.
  • It maps the Service to the contents of the externalName field (e.g. foo.bar.example.com), by returning a CNAME record with its value.
  • No proxying of any kind is established.
Use Cases
  • This is commonly used to create a service within Kubernetes to represent an external datastore like a database that runs externally to Kubernetes.
  • You can also use an ExternalName Service (as a local Service) when Pods in one namespace need to talk to a Service in another namespace.

Ingress: Kubernetes Ingress is an API object that provides routing rules to manage external users' access to the Services in a Kubernetes cluster, typically via HTTPS/HTTP. With Ingress, you can easily set up rules for routing traffic without creating a bunch of load balancers or exposing each Service on the node.

Ingress exposes HTTP and HTTPS routes from outside the cluster to services within the cluster. Traffic routing is controlled by rules defined on the Ingress resource.




Host:
  • A valid domain address
  • Map domain name to Node’s IP address which is the entry point
  • Or you can map the domain to an external entry point IP address
Ingress Controller:
  • Evaluates all the rules
  • Manages redirections
  • Entrypoint to cluster
  • Exposes HTTP/HTTPS routes for a cluster  
  • Provides route-based load balancing 
  • Can terminate TLS 
  • Provides name-based virtual hosting
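
The rule evaluation an Ingress controller performs can be approximated with a short sketch. Everything here is hypothetical: the hosts, paths, and Service names are invented, and `route` is a toy function, not a controller API. It shows name-based virtual hosting (matching on the host) combined with route-based selection (the longest matching path prefix wins).

```python
from typing import Optional

# (host, path prefix, backend service) -- all names are hypothetical
RULES = [
    ("shop.example.com", "/api", "backend-service"),
    ("shop.example.com", "/", "frontend-service"),
    ("blog.example.com", "/", "blog-service"),
]


def path_matches(path: str, prefix: str) -> bool:
    """Match on whole path segments, so /api matches /api/users but not /apiv2."""
    return path == prefix or path.startswith(prefix.rstrip("/") + "/")


def route(host: str, path: str, rules=RULES) -> Optional[str]:
    """Pick the backend service for a request, or None if no rule applies."""
    candidates = [
        (prefix, service)
        for rule_host, prefix, service in rules
        if rule_host == host and path_matches(path, prefix)
    ]
    if not candidates:
        return None
    # The longest matching prefix is the most specific rule.
    return max(candidates, key=lambda candidate: len(candidate[0]))[1]


print(route("shop.example.com", "/api/users"))
print(route("blog.example.com", "/posts/1"))
```

A single entry point can therefore serve several hosts and paths, which is what removes the need for one load balancer per Service.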


Kubernetes cheatsheet commands: