A scalable application on the cloud is typically composed of many microservices running within containers. When multiple containers run the workloads of multiple applications across machines, those containers need to be managed. This is where container orchestration comes in. Because orchestration automates container deployment, scaling, monitoring and recovery, developers can focus on code rather than operations.
Kubernetes is an open source container orchestrator. The project is overseen by the Cloud Native Computing Foundation, itself a project of the Linux Foundation. Kubernetes is commonly abbreviated as k8s and is written in Go.
Kubernetes can work with different container technologies such as Docker or rkt. It can manage containers on clusters of physical or virtual machines. Deployments can be on-site or across various cloud providers, and therefore there's no vendor lock-in.
Why was Kubernetes invented in the first place?
Internally, Google had been using containers for more than a decade before Kubernetes itself was born. They used a cluster management system called Borg for managing hundreds of thousands of containers. This infrastructure was used to power Google Compute Engine but there was a problem. Customers were simply spinning up virtual machines (VMs), paying for them but using them well below capacity. In other words, resources were not optimally utilized. Engineers at Google realized that an open source variant of Borg was needed that everyone could use.
Containers themselves had been popularized by Docker but managing even dozens of containers was becoming an operational challenge. Kubernetes was therefore created in 2013, a year before Docker Swarm appeared. By making it easier to manage containers, tools like Kubernetes enable more widespread adoption of containers instead of VMs.
What are the key features of Kubernetes?
When Kubernetes first came out, its basic feature set included replication, load balancing, service discovery, basic health checking, self-healing, and scheduling. Current features of Kubernetes include the following:
- Service discovery and load balancing: Uses a single DNS name for a set of containers. Jobs are load-balanced across them.
- Automatic binpacking: Also called scheduling. Containers are assigned to nodes based on resource requirements. Critical and batch workloads can be mixed to drive up utilization.
- Storage orchestration: Support for various storage options.
- Self-healing: Restart failed containers. Reschedule containers when nodes die.
- Automated rollouts and rollbacks: App changes are rolled out progressively without bringing down all instances at the same time. If something goes wrong, the changes are rolled back.
- Secret and configuration management: Update secrets without rebuilding the image or exposing secrets in your stack configuration.
- Batch execution: Manage batch and CI workloads.
- Horizontal scaling: Also called replication. Apps can be scaled up or down automatically based on CPU usage.
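To make the horizontal scaling item concrete, here is a minimal sketch of how a replica count could be derived from CPU usage. The proportional rule follows the one described in the Horizontal Pod Autoscaler documentation; the function name, the default bounds and the sample numbers are illustrative assumptions, and the real autoscaler applies further refinements (such as a tolerance band) that are left out here.

```python
import math


def desired_replicas(current_replicas: int, current_cpu: float, target_cpu: float,
                     min_replicas: int = 1, max_replicas: int = 10) -> int:
    """Return a replica count that moves average CPU utilization towards the target.

    current_cpu and target_cpu are average utilization percentages per pod.
    min_replicas and max_replicas are hypothetical bounds chosen for this sketch.
    """
    if target_cpu <= 0:
        raise ValueError("target_cpu must be positive")
    # Proportional rule: scale the replica count by the ratio of observed to target usage.
    raw = math.ceil(current_replicas * current_cpu / target_cpu)
    # Clamp the result to the configured bounds.
    return max(min_replicas, min(max_replicas, raw))


print(desired_replicas(4, 90.0, 60.0))  # overloaded: scale up
print(desired_replicas(4, 20.0, 60.0))  # underused: scale down
```

With four replicas averaging 90% CPU against a 60% target, the sketch asks for more replicas; at 20% it asks for fewer, never going outside the bounds.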
Could you define some common Kubernetes terminology?
A Kubernetes glossary is available online but here are some common terms:
- Cluster: A set of worker nodes and at least one master node managed by Kubernetes.
- Controller: A control loop that watches and attempts to move the cluster to its desired state.
- Deployment: An API object that manages a replicated application. Each replica is a pod.
- Label: Key-value pairs used to identify and organize Kubernetes objects such as pods.
- Namespace: An abstraction used by Kubernetes to support multiple virtual clusters on the same physical cluster.
- Node: Previously called minion, this is a worker machine that can be virtual or physical. It has services (such as Docker) to run pods.
- Pod: The smallest and simplest Kubernetes object. It's a set of running containers on your cluster.
- ReplicaSet: Ensures that a specific number of instances of a pod are always running.
- Service: An API object that describes how to access applications, such as a set of Pods, and can describe ports and load-balancers.
- Selector: Allows users to filter a list of resources based on labels.
- Volume: A directory containing data, accessible to the containers in a pod with data preserved across container restarts.
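Labels and selectors from the list above can be illustrated with a short sketch of equality-based matching: a selector matches an object when every key-value pair it names is present in the object's labels. The pod names, label values and helper functions below are hypothetical, and this is plain Python rather than a call to the Kubernetes API.

```python
def matches(labels: dict, selector: dict) -> bool:
    """True if every key/value pair in the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def select(objects: list, selector: dict) -> list:
    """Return the names of the objects whose labels satisfy the selector."""
    return [obj["name"] for obj in objects if matches(obj["labels"], selector)]


# Hypothetical pods, each carrying a set of labels.
pods = [
    {"name": "web-1", "labels": {"app": "web", "env": "prod"}},
    {"name": "web-2", "labels": {"app": "web", "env": "staging"}},
    {"name": "db-1", "labels": {"app": "db", "env": "prod"}},
]

print(select(pods, {"app": "web"}))                 # ['web-1', 'web-2']
print(select(pods, {"app": "web", "env": "prod"}))  # ['web-1']
```

A Service or ReplicaSet uses the same idea to decide which pods belong to it: it stores a selector rather than a fixed list of pods.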
Could you describe the architecture of Kubernetes?
A Kubernetes cluster consists of at least one master plus a number of nodes. The master consists of an API server, a scheduler and a controller. Configuration and management are done via API calls processed by the master. The master also contains an etcd key-value database that stores the state of the cluster.
Overall management of the cluster lies with the master but the real workhorses are the nodes. Jobs execute within containers. One or more containers are combined into a single pod. One or more pods are scheduled to run on a node. The master schedules pods on nodes, not the containers themselves. Scheduling is based on resource requirements. Once a pod is scheduled, the node pulls the required container image, invokes the container runtime and launches the container. An agent called kubelet runs on each node to manage containers.
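The idea of scheduling pods onto nodes by resource requirements can be sketched as a two-step process: filter out nodes that lack capacity, then pick one of the remaining nodes. The best-fit rule used below is an assumption chosen for illustration and is not the actual Kubernetes scheduler algorithm; the class names, node sizes and pod requests are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    cpu_free: int  # free CPU in millicores
    mem_free: int  # free memory in MiB
    pods: list = field(default_factory=list)


@dataclass
class Pod:
    name: str
    cpu: int  # requested CPU in millicores
    mem: int  # requested memory in MiB


def schedule(pod: Pod, nodes: list):
    """Place the pod on the feasible node left with the least spare capacity.

    Returns the chosen node name, or None if no node can fit the pod.
    """
    feasible = [n for n in nodes if n.cpu_free >= pod.cpu and n.mem_free >= pod.mem]
    if not feasible:
        return None  # the pod would stay pending
    best = min(feasible, key=lambda n: (n.cpu_free - pod.cpu, n.mem_free - pod.mem))
    best.cpu_free -= pod.cpu
    best.mem_free -= pod.mem
    best.pods.append(pod.name)
    return best.name


nodes = [Node("node-a", 2000, 4096), Node("node-b", 1000, 2048)]
placements = {}
for pod in [Pod("api", 800, 1024), Pod("worker", 1500, 2048), Pod("batch", 600, 512)]:
    placements[pod.name] = schedule(pod, nodes)
    print(pod.name, "->", placements[pod.name])
```

In this run the last pod finds no node with enough free CPU, which corresponds to a pod remaining unscheduled until capacity becomes available.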
Kubernetes essentially does three things: resource management, scheduling and load balancing. Periodically, the controller obtains pod utilization and uses this to scale. This scaling is transparent to clients since everything is exposed as services.
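Once a desired replica count is known, a controller works as a control loop that compares the desired state with what is actually running and corrects any drift. The sketch below shows one pass of such a loop over an in-memory list; the function name, the pod naming scheme and the list-based "cluster" are assumptions made for illustration, not Kubernetes code.

```python
import itertools

# Hypothetical source of unique pod names for this sketch.
_ids = itertools.count(1)


def reconcile(desired: int, running: list) -> list:
    """One pass of a control loop: return the pod list after correcting drift."""
    running = list(running)  # work on a copy; the caller's list is left untouched
    while len(running) < desired:
        running.append(f"pod-{next(_ids)}")  # start a replacement pod
    while len(running) > desired:
        running.pop()  # stop a surplus pod
    return running


pods = reconcile(3, [])
print(pods)                # three pods are started
pods.remove("pod-2")       # simulate a pod failure
pods = reconcile(3, pods)
print(pods)                # a replacement is started
pods = reconcile(2, pods)
print(pods)                # scaled down to two pods
```

Clients that reach the application through a service do not need to know which pods were replaced or removed between passes.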
Why do we need pods when containers might be good enough?
While Kubernetes could have been designed to map containers directly to nodes, pods offer an extra level of abstraction to simplify container management. Because there are different container implementations (Docker, rkt, LXD, Windows Containers), pods provide a unified interface.
Pods also encapsulate containers that closely depend on one another. Each pod has an IP address and all its containers share the same port space. Inter-process communication (IPC) and sharing storage across containers in the pod is also easy.
When a pod is scheduled, all its containers are scheduled. One may ask why these containers aren't simply bundled into a single container. Doing so would violate the "one process per container" principle. Having multiple processes in a container makes it difficult to debug problems. By using pods, if a container fails, it will be restarted while other containers in the pod remain unaffected.
It's perfectly valid for a pod to have a single container. An example use case of a multi-container pod is a main container plus a sidecar container. The latter can be a log watcher or a data loader.
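A main container plus a log-watcher sidecar might be described by a pod manifest like the one sketched below. It is built as a Python dictionary and printed as JSON, a format Kubernetes accepts alongside YAML. The pod name, container names, images, command and mount paths are all illustrative assumptions; the shared `emptyDir` volume is what lets both containers see the same log files.

```python
import json

# Hypothetical pod: a web server and a sidecar that tails its access log.
pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "web-with-log-watcher", "labels": {"app": "web"}},
    "spec": {
        # Scratch volume that lives as long as the pod and is shared by its containers.
        "volumes": [{"name": "logs", "emptyDir": {}}],
        "containers": [
            {
                "name": "web",
                "image": "nginx",
                "ports": [{"containerPort": 80}],
                "volumeMounts": [{"name": "logs", "mountPath": "/var/log/nginx"}],
            },
            {
                "name": "log-watcher",
                "image": "busybox",
                "command": ["sh", "-c",
                            "touch /logs/access.log && tail -f /logs/access.log"],
                "volumeMounts": [{"name": "logs", "mountPath": "/logs"}],
            },
        ],
    },
}

manifest = json.dumps(pod, indent=2)
print(manifest)
```

Saved to a file, such a manifest could be submitted with `kubectl apply -f`; both containers would then be scheduled together on the same node and share the pod's IP address.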
What are some criticisms of Kubernetes?
Since Kubernetes manages the low-level details, if something is wrongly configured or things don't work as expected, it can be difficult to find the root cause. The Kubernetes dashboard provides limited information, though more detailed information is available via the command line interface.
Internal to Google, Borg is conceived and built for large scale cluster management. The idea is to use resources at high utilization while running hundreds of thousands of jobs. Borg is the system that goes on to power Google Cloud Platform and Google Compute Engine. Details of Borg are not disclosed outside of Google until much later, in 2015.
In 2016, Kubernetes goes mainstream with more conferences, developers and features. The game Pokemon GO, which becomes an international hit, is powered by Kubernetes. Windows Server support arrives. In May, v0.1.0 of minikube is released to deploy a single-node Kubernetes cluster in a VM locally.