Problem Slicer: Kubernetes basics and cheatsheet

Before we get to Kubernetes, let us first cover some basics of containers and the benefits of containerization.

A container is an executable package of software that includes everything needed to run it. Containerization is the packaging of software code with just the operating system (OS) libraries and dependencies required to run the code, creating a single lightweight executable (called a container) that runs consistently on any infrastructure.

Executable unit of software

Encapsulates everything necessary to run
Can be run anywhere

OS Virtualization:

Isolates processes
Controls the resources allocated to those processes

Small, fast, and portable

Doesn't include a guest OS in every instance
Leverages host OS

Benefits of containers:

Portability
Agility: rapid application development
Speed:

Lightweight
Don't include a guest OS
Spin up quickly and horizontally scalable

Fault isolation

The failure of one container does not affect the continued operation of any other containers

Efficiency / cost effective
Ease of management
Security

The Open Container Initiative (OCI), established in June 2015 by Docker and other industry leaders, is promoting common, minimal, open standards and specifications around container technology.

The ecosystem is standardizing on containerd. Other container runtimes and technologies include CRI-O, LXC (Linux Containers), OpenVZ, Mesos Containerizer, and CoreOS rkt (now discontinued).

Docker is a platform for building and running containers. A Dockerfile serves as the blueprint for an image.

Image: An image is an immutable file that contains everything necessary to run an application.
Container: A container is a running instance of an image.
Layers: Each Dockerfile instruction that changes the filesystem (such as RUN, COPY, or ADD) creates a new read-only layer. A writable layer is added on top when an image is run as a container.

Note: The main difference between ADD and COPY in a Dockerfile is that COPY can only copy local files or directories from the build context, whereas ADD can also fetch files from remote URLs and automatically extracts local tar archives.

CMD sets the default command that runs when a container starts. It is usually the last instruction in a Dockerfile.

Naming: hostname/repository:tag
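To make the naming scheme concrete, here is a small Python sketch that splits a fully qualified image reference into its three parts. The function name `parse_image_ref` and the `ImageRef` type are made up for this illustration, and the sketch assumes all three parts are present (real tools also fill in defaults for a missing hostname or tag, which is not handled here).

```python
from typing import NamedTuple


class ImageRef(NamedTuple):
    hostname: str
    repository: str
    tag: str


def parse_image_ref(ref: str) -> ImageRef:
    """Split a fully qualified 'hostname/repository:tag' reference."""
    # The hostname is everything before the first slash (it may contain a port).
    hostname, _, rest = ref.partition("/")
    # The tag is everything after the last colon of the remaining part.
    repository, _, tag = rest.rpartition(":")
    if not hostname or not repository or not tag:
        raise ValueError(f"expected hostname/repository:tag, got {ref!r}")
    return ImageRef(hostname, repository, tag)


print(parse_image_ref("docker.io/library/nginx:1.25"))
```

Splitting on the first slash and the last colon keeps a registry port such as `localhost:5000` inside the hostname instead of mistaking it for a tag.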

Kubernetes is an open-source container orchestration system for automating software deployment, scaling, and management. Google originally designed Kubernetes, but the Cloud Native Computing Foundation now maintains the project. (Source: Wikipedia)

Container orchestration manages the lifecycle of containers, especially in large, dynamic environments:

Provisioning and deployment
Availability
Scaling
Scheduling to infrastructure
Rolling updates
Health checks

The official documentation describes Kubernetes as “a portable, extensible, open-source platform for managing containerized workloads and services that facilitates both declarative configuration and automation. It has a large, rapidly growing ecosystem. Kubernetes services, support, and tools are widely available.”

Kubernetes is not a traditional, all-inclusive PaaS:

It does not limit the types of applications supported
It does not deploy source code or build your application
It does not provide built-in middleware, databases, or other services

A Kubernetes cluster is a set of nodes that run containerized applications. When you deploy Kubernetes, you get a cluster. A Kubernetes cluster consists of two types of nodes:

Control plane (master node)
Nodes (worker nodes)

Control plane (master node):

It makes decisions about the cluster and detects and responds to events in the cluster.

Kubernetes API server: It exposes the Kubernetes API. All communication in the cluster goes through this API.
Kubernetes scheduler:

The Kubernetes scheduler assigns newly created Pods to nodes. This means that the scheduler determines where your workloads should run within the cluster.

etcd:

A highly available key-value store that contains all the cluster data. When you tell Kubernetes to deploy your application, that deployment configuration is stored in etcd. Etcd is thus the source of truth for the state of a Kubernetes cluster, and the system works to bring the actual cluster state in line with what is stored in etcd.

Kubernetes controller manager:

The Kubernetes controller manager runs all the controller processes that monitor the cluster state and ensure that the actual state of a cluster matches the desired state.

Cloud controller manager:

Runs controllers that interact with the underlying cloud providers. These controllers effectively link clusters into a cloud provider's API. Since Kubernetes is open source software and would ideally be adopted by a variety of cloud providers and organizations, it strives to be as cloud-agnostic as possible.

Kubernetes worker nodes

Nodes:

Nodes are the worker machines in a Kubernetes cluster. In other words, user applications run on nodes. A node can be a physical machine or a virtual machine. Nodes are managed by the control plane and contain the services needed to run applications.

Kube proxy:

Network proxy
Maintains network rules that allow communication to pods

Kubelet:

Communicates with the API server
Ensures that Pods and their associated containers are running
Reports to the control plane on health and status

A control loop is defined as a non-terminating loop that regulates the state of a system.

Kubernetes objects are persistent entities in Kubernetes. "Persistent" means that when you create an object, Kubernetes continually works to ensure that the object exists in the system until you modify or remove it.

Persistent entities in Kubernetes
Define the desired state of your workload
Use the Kubernetes API to work with them, for example through kubectl

Kubernetes objects consist of two main fields.

The first is the object "spec," which is provided by the user. The spec dictates the desired state for this object.
The second field is the "status," which is provided by Kubernetes. The status describes the current, actual state of the object, as opposed to its desired state. Kubernetes updates the status whenever the actual state of the object changes.
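The control loop and the spec/status split described above fit together naturally, and a toy Python sketch may help. This is only an illustration under simplifying assumptions: the object is a plain dictionary, the only tracked property is a hypothetical `replicas` count, and the loop stops once it converges, whereas a real control loop never terminates.

```python
def reconcile(obj: dict) -> bool:
    """Run one pass of the loop; return True if a change was made."""
    desired = obj["spec"]["replicas"]    # desired state, set by the user
    actual = obj["status"]["replicas"]   # actual state, reported by the system
    if actual < desired:
        obj["status"]["replicas"] = actual + 1   # start one missing replica
        return True
    if actual > desired:
        obj["status"]["replicas"] = actual - 1   # stop one surplus replica
        return True
    return False                          # actual state matches desired state


def run_until_converged(obj: dict, max_passes: int = 100) -> int:
    """Repeat reconcile until nothing changes; return the number of changes."""
    passes = 0
    while passes < max_passes and reconcile(obj):
        passes += 1
    return passes


deployment = {"spec": {"replicas": 3}, "status": {"replicas": 0}}
print(run_until_converged(deployment), deployment["status"])
```

The point of the sketch is that the user only ever edits the spec; the loop is what moves the status toward it, one small step at a time.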

Namespaces: Namespaces can be used to provide logical separation of a cluster into virtual clusters.
Labels: Labels are key/value pairs that can be attached to objects in order to identify those objects.
Pods: The simplest unit in Kubernetes. A Pod represents a process running in the cluster and encapsulates one or more containers. Replicating Pods scales an app horizontally.
ReplicaSet: A ReplicaSet is a group of identical running Pods. A ReplicaSet encapsulates a Pod definition and adds the information needed to replicate it.
Deployment: A Deployment is a higher-level object that manages ReplicaSets and provides updates for both Pods and ReplicaSets.

Provides updates for Pods and ReplicaSets
Runs multiple replicas of your application
Suitable for stateless applications
An update triggers a rollout

Autoscaling:

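A ReplicaSet works with a fixed number of Pods
The Horizontal Pod Autoscaler (HPA) enables scaling up and down as needed

kind: HorizontalPodAutoscaler
In the spec, you define the scaling target, the minimum and maximum number of replicas, and the target metrics
Behind the scenes, it changes the replica count of the target (for example, a Deployment and its ReplicaSet)

Can be configured based on a desired level of CPU utilization, memory usage, and so on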
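To give a feel for how the autoscaler arrives at a replica count, here is a minimal Python sketch of the scaling rule given in the Kubernetes documentation: desired replicas = ceil(current replicas × current metric / target metric). The function name and the default bounds are assumptions for this example, and details of the real controller, such as its tolerance band and stabilization windows, are left out.

```python
import math


def desired_replicas(current_replicas: int,
                     current_metric: int,
                     target_metric: int,
                     min_replicas: int = 1,
                     max_replicas: int = 10) -> int:
    """Scale the replica count in proportion to metric load, within bounds."""
    # Proportional rule: twice the target load asks for twice the replicas.
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp to the configured minimum and maximum (hypothetical defaults here).
    return max(min_replicas, min(max_replicas, raw))


# 4 replicas running at 90% CPU with a 60% target should scale out.
print(desired_replicas(4, 90, 60))
```

For example, with 4 replicas at 90% average CPU and a 60% target, the rule asks for 6 replicas; at 30% it asks for 2.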

Rolling Update:

ReplicaSets and autoscaling are important for minimizing service interruption
Rolling updates are a way to roll out app changes across your Pods in an automated and controlled fashion
Rolling updates let us publish changes to our applications without noticeable interruption for the user
Additionally, rolling updates give us a way to roll back any changes to the application
$ kubectl rollout status deployments/hello-kubernetes
$ kubectl rollout undo deployments/hello-kubernetes
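Why a rolling update avoids noticeable interruption is easier to see in a small simulation. The Python sketch below assumes the simplest possible strategy, one extra Pod at a time and never fewer Pods than requested (comparable to setting maxSurge to 1 and maxUnavailable to 0 on a Deployment). The function name and the version strings are hypothetical, and readiness checks are ignored.

```python
from typing import List


def rolling_update(replicas: int, old: str, new: str) -> List[List[str]]:
    """Return the list of running Pod versions after each rollout step."""
    pods = [old] * replicas
    history = [list(pods)]
    for _ in range(replicas):
        pods.append(new)        # surge: start one Pod with the new version
        history.append(list(pods))
        pods.remove(old)        # then retire one Pod with the old version
        history.append(list(pods))
    return history


for step in rolling_update(2, "v1", "v2"):
    print(step)
```

At every step the number of running Pods stays at or above the requested replica count, which is why users should not see an outage. A rollback is the same process run in the opposite direction.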

ConfigMaps give us a way to provide configuration data to Pods and Deployments so we don't have to hard-code that data in the application code. You can also reuse ConfigMaps and Secrets across multiple Deployments.

Used to provide configuration for Deployments
Reusable across Deployments
Can be created in a few different ways:

Using string literals
Using an existing properties or key=value file
Providing a ConfigMap YAML descriptor file. Both the first and second ways can help us create such a file.

A ConfigMap is not meant for sensitive data, and the data it stores cannot exceed 1 MiB.

$ kubectl create configmap my-config --from-literal=MESSAGE="hello world config map"

$ kubectl create configmap my-config --from-file=my.properties
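One detail that often surprises people: with --from-file, the whole file becomes a single entry keyed by the file name, while the related kubectl flag --from-env-file (not used above) turns each KEY=VALUE line into its own entry. The Python sketch below only mimics the resulting data shape so the difference is visible. The helper names are made up, and the parsing is simplified.

```python
def from_file(filename: str, content: str) -> dict:
    """Mimic --from-file: one entry, keyed by the file name."""
    return {filename: content}


def from_env_file(content: str) -> dict:
    """Mimic --from-env-file: one entry per KEY=VALUE line."""
    data = {}
    for line in content.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # skip blank lines and comments
        key, _, value = line.partition("=")
        data[key] = value
    return data


properties = "MESSAGE=hello\n# a comment\nCOLOR=blue\n"
print(from_file("my.properties", properties))
print(from_env_file(properties))
```

Which form you want depends on how the application reads its configuration: as one mounted file, or as individual keys exposed as environment variables.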

A Secret is an object that contains a small amount of sensitive data such as a password, a token, or a key. Such information might otherwise be put in a Pod specification or in a container image. Using a Secret means that you don't need to include confidential data in your application code.

$ kubectl create secret generic api-creds --from-literal=key=mycred
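It is worth knowing that values in a Secret's data field are stored base64-encoded, which is an encoding and not encryption. The short Python sketch below shows what that encoding does to the literal from the command above. The helper names are hypothetical and exist only for this illustration.

```python
import base64


def encode_secret_data(literals: dict) -> dict:
    """Base64-encode each value, as it appears in a Secret's data field."""
    return {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in literals.items()
    }


def decode_secret_data(data: dict) -> dict:
    """Reverse the encoding; anyone who can read the Secret can do this."""
    return {
        key: base64.b64decode(value).decode("utf-8")
        for key, value in data.items()
    }


encoded = encode_secret_data({"key": "mycred"})
print(encoded)
print(decode_secret_data(encoded))
```

Because decoding is trivial, access to Secrets should still be restricted; base64 alone protects nothing.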

Why do we need services?

Service: A Service is responsible for enabling network access to a set of Pods. Each Pod has its own IP address, but Pods are ephemeral and are destroyed frequently. Each time a Pod is recreated, it is assigned a new IP address.

A Service, in contrast, has a stable IP address and provides load balancing. It is loosely coupled to the Pods behind it and helps route traffic within and outside the cluster. A selector, which is a set of key/value pairs matched against Pod labels, identifies the Pods to which requests are forwarded.
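Selector matching is simple enough to sketch in a few lines of Python. In this illustration, Pods are plain dictionaries with made-up names and labels, and a Pod is selected when every key/value pair in the selector is present in its labels (extra labels on the Pod do not matter).

```python
def select_pods(selector: dict, pods: list) -> list:
    """Return the names of Pods whose labels contain every selector pair."""
    return [
        pod["name"]
        for pod in pods
        if all(pod["labels"].get(key) == value for key, value in selector.items())
    ]


pods = [
    {"name": "web-1", "labels": {"app": "web", "tier": "frontend"}},
    {"name": "web-2", "labels": {"app": "web", "tier": "frontend"}},
    {"name": "db-1", "labels": {"app": "db", "tier": "backend"}},
]
print(select_pods({"app": "web"}, pods))
```

Because matching is by label and not by IP address, a recreated Pod with the same labels is picked up again automatically, which is what keeps the Service stable.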

ClusterIP: Exposes the Service on a cluster-internal IP. Choosing this value makes the Service reachable only from within the cluster. This is the default ServiceType. You cannot make requests to the Service (and its Pods) from outside the cluster.

Used for inter-service communication within the cluster, for example between the front-end and back-end components of your app.
Kubernetes creates an Endpoints object with the same name as the Service to keep track of which Pods are the members (endpoints) of the Service: $ kubectl get endpoints -n myapp

Headless Services: Used when a client wants to communicate with one specific Pod directly instead of going through the Service's load balancing, or when Pods need to talk to a specific Pod, such as a database primary and its replicas. Headless Services are mostly used with StatefulSet objects.

A DNS lookup for a regular Service returns a single IP address, the cluster IP. Setting clusterIP to None makes the DNS lookup return the Pod IP addresses instead. No cluster IP is assigned to the Service.

NodePort Services: Exposes the Service on each Node's IP at a static port (the NodePort). A ClusterIP Service, to which the NodePort Service routes, is automatically created. You'll be able to contact the NodePort Service from outside the cluster by requesting <NodeIP>:<NodePort>.

By default, the node port must be in the range 30000–32767. Manually allocating a port to the Service is optional. If it is undefined, Kubernetes will automatically assign one.
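The port rule above can be expressed as a small Python sketch. The function name is hypothetical, and the automatic choice here simply takes the lowest free port, which is an assumption made for readability and not a description of how Kubernetes actually picks one.

```python
from typing import Optional, Set

NODE_PORT_MIN = 30000
NODE_PORT_MAX = 32767


def pick_node_port(requested: Optional[int], in_use: Set[int]) -> int:
    """Validate a requested node port, or choose a free one if none is given."""
    if requested is not None:
        if not NODE_PORT_MIN <= requested <= NODE_PORT_MAX:
            raise ValueError(f"port {requested} is outside the default range")
        if requested in in_use:
            raise ValueError(f"port {requested} is already allocated")
        return requested
    # No port requested: take the lowest free one (simplification).
    for port in range(NODE_PORT_MIN, NODE_PORT_MAX + 1):
        if port not in in_use:
            return port
    raise RuntimeError("no free node port left in the range")


print(pick_node_port(30080, set()))
print(pick_node_port(None, {30000, 30001}))
```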

It is less secure, because you are opening a port on every node of the cluster.

Use Cases

When you want to enable external connectivity to your service.
Using a NodePort gives you the freedom to set up your own load balancing solution, to configure environments that are not fully supported by Kubernetes, or even to expose one or more nodes’ IPs directly.
Prefer to place a load balancer in front of your nodes so that the failure of a single node does not make the Service unreachable.

LoadBalancer: Exposes the Service externally using a cloud provider's load balancer. NodePort and ClusterIP Services, to which the external load balancer routes, are automatically created.

LoadBalancer is an extension of the NodePort Service. Avoid using a NodePort Service on its own to expose an application externally. Configure Ingress or LoadBalancer for production environments.

ExternalName

Services of type ExternalName map a Service to a DNS name, not to a typical selector such as my-service.
You specify these Services with the `spec.externalName` parameter.
It maps the Service to the contents of the externalName field (e.g. foo.bar.example.com), by returning a CNAME record with its value.
No proxying of any kind is established.

Use Cases

This is commonly used to create a service within Kubernetes to represent an external datastore like a database that runs externally to Kubernetes.
You can also use an ExternalName Service (as a local Service) when Pods in one namespace need to talk to a Service in another namespace.

Ingress: Kubernetes Ingress is an API object that provides routing rules to manage external users' access to the Services in a Kubernetes cluster, typically via HTTP or HTTPS. With Ingress, you can set up rules for routing traffic without creating a bunch of load balancers or exposing each Service on the node.

Ingress exposes HTTP and HTTPS routes from outside the cluster to services within the cluster. Traffic routing is controlled by rules defined on the Ingress resource.
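As a rough picture of what rule-based routing means, the Python sketch below matches a request's host and path against a list of rules and returns the name of the backing Service. The rule format, host name, and Service names are invented for this example, and the matching (exact host, prefix match on whole path segments, longest prefix wins) is a simplification of what a real Ingress controller does.

```python
from typing import Optional


def path_matches(prefix: str, path: str) -> bool:
    """Prefix match on whole path segments: /api matches /api/x, not /apix."""
    if prefix == "/":
        return True
    prefix = prefix.rstrip("/")
    return path == prefix or path.startswith(prefix + "/")


def route(rules: list, host: str, path: str) -> Optional[str]:
    """Return the Service for the most specific matching rule, if any."""
    candidates = [
        rule for rule in rules
        if rule["host"] == host and path_matches(rule["path"], path)
    ]
    if not candidates:
        return None
    # The longest matching prefix is treated as the most specific rule.
    return max(candidates, key=lambda rule: len(rule["path"]))["service"]


rules = [
    {"host": "shop.example.com", "path": "/", "service": "frontend"},
    {"host": "shop.example.com", "path": "/api", "service": "backend"},
]
print(route(rules, "shop.example.com", "/api/orders"))
print(route(rules, "shop.example.com", "/cart"))
```

One Ingress with two rules like these replaces what would otherwise be two separately exposed Services.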