Containerization

Containerization is a technique that allows software to run reliably regardless of the computing environment. By encapsulating software within isolated environments called containers, we can more reliably port software across operating systems and hardware infrastructures.

Suppose most development and testing is done on a developer's laptop. The software may work as expected on the laptop but fail when deployed on a server. This could be because the server uses different library versions, has a different configuration, or interfaces differently with other components of the system. Containerization solves this problem by providing a consistent and isolated runtime environment regardless of the underlying OS or hardware infrastructure.

For developers, what this means is that we're no longer deploying just the application software but deploying container images that contain the app along with its dependencies.

Discussion

  • What's a container and what's in it?
    Understand containers by contrasting them against VMs. Source: Janetakis 2017.

    A container consists of an entire runtime environment: an application, plus the dependencies, libraries, binaries, and configuration files needed to run it, bundled into one package. By containerizing the application platform and its dependencies, differences in OS distributions and underlying infrastructure are abstracted away.

    Essentially, a container includes the application and all of its dependencies. It shares the OS kernel with other containers. It's not tied to a specific infrastructure: it only needs Docker Engine (or an equivalent) installed on the host. Thus, containers isolate the application process in user space on the host OS from other application processes.

  • What are the benefits of containers?
    Results of a survey on the benefits of containerization. Source: Coggin 2015.

    A survey commissioned by Red Hat in 2015 showed that containers are seen to bring better security, efficiency, portability, flexibility and speed. More specifically, we can identify the following benefits:

    Benefits to Applications

    • Portable
    • Packaged in a standard way
    • Automated testing, packaging and integrations
    • Support newer microservices architectures
    • Alleviate platform compatibility issues

    Benefits to Deployment

    • Easy
    • Repeatable
    • Reliable deployments: improved speed and frequency of deployments
    • Consistent application lifecycle: configure once and run multiple times
    • Consistent environments: no more process differences between local, dev and staging environments
    • Simple scaling: fast deployments ease the addition of workers and permit workload to grow and shrink for on-demand use cases

  • How is Virtualization different from Containerization?
    Comparing Virtual Machines and Containers. Source: https://medium.com/@faizanbashir/docker-containers-101-e47f594a0ed

    With virtualization technology, the package that can be passed around is a Virtual Machine (VM). It includes an OS as well as the application. A server running three VMs would have a hypervisor and three separate operating systems running on top of it. By contrast, a server running three containerized applications runs on top of a single OS, all containers sharing the same OS kernel. Shared parts of the operating system are read only, while each container has its own mount (i.e., a way to access the container) and volumes for reading from and writing to the file system. This means that the containers are much more lightweight and use far fewer resources than virtual machines.

    Containerization has been called an operating system level virtualization, an operating system feature in which the OS kernel allows the existence of multiple isolated user-space instances. Instances are created from images that we can build and share. Instances created from images are called containers, partitions, virtualization engines or jails (FreeBSD). Applications running inside a container can only see the container's contents and devices assigned to the container.

  • What's the typical size of a container?

    Containers can be as small as a few megabytes and are typically tens of megabytes in size. For example, the Docker Alpine image is about 4 MB. For comparison, the Fedora 25 image is about 231 MB. A virtual machine with its entire OS may be several gigabytes in size. For this reason, a single server can host far more containers than virtual machines.

    Virtual machines may take several minutes to boot up their operating systems and start running the applications they host. However, containerized applications can be started almost instantly. This also means that containers can be created as needed and destroyed quickly, thereby using resources efficiently.
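
    The size figures in this answer can be turned into a rough storage comparison. The sketch below is only illustrative: the Alpine and Fedora sizes come from the text, while the 4 GB virtual machine size and the 100 GB disk budget are assumptions. It also ignores memory, CPU and the sharing of image layers, all of which affect how many containers a server can actually host.

```python
# Back-of-the-envelope comparison of image storage footprints.
# Alpine and Fedora 25 sizes come from the text; the VM size and the
# disk budget are illustrative assumptions.

MB = 1
GB = 1024 * MB

IMAGE_SIZES_MB = {
    "Alpine container image": 4 * MB,
    "Fedora 25 container image": 231 * MB,
    "virtual machine (assumed 4 GB)": 4 * GB,
}


def images_per_disk(disk_mb: int, image_mb: int) -> int:
    """Return how many whole copies of an image fit in the given disk space."""
    return disk_mb // image_mb


if __name__ == "__main__":
    disk = 100 * GB  # assumed disk budget
    for name, size in IMAGE_SIZES_MB.items():
        print(f"{name}: {images_per_disk(disk, size)} copies fit in 100 GB")
```

    Under these assumptions, the gap between a few megabytes and a few gigabytes translates into roughly three orders of magnitude in the number of copies that fit on the same disk.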

  • What are some types of containers?

    The common use case is to run each application separately within an Application Container. Code that developers create runs within these containers, which greatly simplifies the deployment of apps. Docker and OCI-based containers are examples.

    Another use case of containers is to provide a virtual operating system. What if we wish to deploy Ubuntu, CentOS and RHEL on top of a common host OS? This is where Operating System Containers can be used. LXC and LXD are examples. It's interesting that Docker and OCI-based containers can partly achieve this functionality by running systemd.

    Suppose you're building a specialized app. You want the flexibility to customize your app but at the same time benefit from the automation and tooling of the container ecosystem. You can use Pet Containers.

    Administrators can benefit from Super Privileged Containers (SPC) for kernel module loading, monitoring, backups, etc. SPCs usually have a tighter coupling with the host kernel.

  • Could you describe specific examples of container usage?
    OS containers vs. app containers. Source: Karle 2015.

    Let's take the example of a web application using Nginx as the load balancer, Node.js for the app and Postgres for database. Traditionally, all of these would be deployed on a single machine. These can't be deployed, scaled or managed independently. Instead, if we adopt a three-tiered architecture, each one is delivered as a separate application container image. Each one can be deployed independently, on different machines if needed.
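
    The three-tier split described here can be sketched as one container per tier. The Python snippet below only builds the `docker run` argument lists and does not execute them. The container names, the network name, the port mapping and the app image name `example/node-app` are hypothetical; `nginx` and `postgres` refer to the public images of those names. A real deployment would need more configuration, such as database credentials and the load balancer settings.

```python
# Sketch: build (but do not execute) one `docker run` command per tier.
# Container names, network name, port mapping and the app image name
# are hypothetical placeholders chosen for illustration.
from typing import Dict, List, Optional


def docker_run_command(name: str, image: str, network: str,
                       publish: Optional[str] = None) -> List[str]:
    """Return the argument list that would start one detached container."""
    cmd = ["docker", "run", "--detach", "--name", name, "--network", network]
    if publish:
        cmd += ["--publish", publish]  # format: host_port:container_port
    cmd.append(image)
    return cmd


def three_tier_commands(network: str = "webapp-net") -> Dict[str, List[str]]:
    """Return one command per tier; each tier can be started or scaled alone."""
    return {
        "load_balancer": docker_run_command("lb", "nginx", network, publish="80:80"),
        "app": docker_run_command("app", "example/node-app", network),
        "database": docker_run_command("db", "postgres", network),
    }


if __name__ == "__main__":
    for tier, cmd in three_tier_commands().items():
        print(tier, "->", " ".join(cmd))
```

    Because each tier has its own image and its own command, any one of them can be replaced, moved to another machine or scaled without touching the other two.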

    Perhaps we want to run our app within a specific OS. In this case, we can use an OS container image of that OS, install the necessary dependencies and bring up the app within the container in a consistent manner.

    Application containers are used in smartphones. Nexus One uses LXC on the Android kernel. McAfee provides a Secure Container for Android. Apple iPhones use containers to compartmentalize applications and their data.

  • Which are the technical enablers for implementing containers?
    The Linux container stack. Source: https://www.engineyard.com/blog/isolation-linux-containers

    Linux Containers (LXC) is a modern method of virtualizing an application. LXC leverages cgroups to isolate CPU, memory, file/block I/O and network resources. LXC also uses kernel namespaces to isolate the application from the operating system, separating process trees, network access, user IDs, and file access. LXC is considered a technique that falls between chroot and a full VM. Since version 1.0 of LXC, unprivileged containers offer better security because they run as regular unprivileged users.
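
    On a Linux host, the namespaces and cgroups a process belongs to are visible under `/proc`. The following is a minimal sketch, assuming a Linux system with `/proc` mounted; on other systems the functions simply return empty results. The function names are made up for this illustration.

```python
# Sketch: list the kernel namespaces and cgroup membership of the current
# process by reading /proc. Linux-only; returns empty results elsewhere.
import os
from typing import Dict, List


def current_namespaces(proc_dir: str = "/proc/self/ns") -> Dict[str, str]:
    """Map each namespace type (pid, net, mnt, ...) to its identifier link."""
    namespaces: Dict[str, str] = {}
    try:
        entries = sorted(os.listdir(proc_dir))
    except OSError:
        return namespaces  # not Linux, or /proc is unavailable
    for entry in entries:
        try:
            namespaces[entry] = os.readlink(os.path.join(proc_dir, entry))
        except OSError:
            continue  # entry vanished or is not readable
    return namespaces


def current_cgroups(path: str = "/proc/self/cgroup") -> List[str]:
    """Return the non-empty lines describing this process's cgroups."""
    try:
        with open(path) as handle:
            return [line.strip() for line in handle if line.strip()]
    except OSError:
        return []


if __name__ == "__main__":
    for kind, ident in current_namespaces().items():
        print(f"namespace {kind}: {ident}")
    for line in current_cgroups():
        print(f"cgroup: {line}")
```

    Two processes that share a namespace report the same identifier for it, so running this inside and outside a container is one way to see which namespaces the container has been given.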

    To enable containers to start quickly, the container image is not copied but shared. A copy is made only when data is modified. This is called the Copy-on-Write (CoW) mechanism. File-level CoW is easier to back up, more space efficient and simpler to cache than block-level CoW on whole-system virtualizers.
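
    The Copy-on-Write idea can be illustrated with a toy in-memory model. In the sketch below, a dictionary stands in for the shared image and each container gets its own empty writable layer on top of it. This is an assumption-laden illustration using Python's `collections.ChainMap`, not a description of how any particular storage driver is implemented. The file paths and contents are made up.

```python
# Toy model of file-level Copy-on-Write: reads fall through to the shared
# image layer, writes land only in the container's private layer.
from collections import ChainMap


def new_container_fs(image_layer: dict) -> ChainMap:
    """Return a file-system view with a private writable layer over a shared image."""
    return ChainMap({}, image_layer)


# Hypothetical image contents: path -> file content.
image = {"/etc/motd": "hello", "/bin/app": "v1"}

a = new_container_fs(image)  # starts instantly: nothing is copied
b = new_container_fs(image)

a["/etc/motd"] = "changed in a"  # the write goes to a's private layer only

if __name__ == "__main__":
    print("container a sees:", a["/etc/motd"])
    print("container b sees:", b["/etc/motd"])
    print("a's private layer:", a.maps[0])
    print("shared image:", image)
```

    Container `a` sees its own modified file, while container `b` and the shared image are unchanged. Only the modified file occupies extra space.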

  • What are some essential terms to know in relation to containers?

    A container is specified by one or more files called a Container Image. Often, a hierarchy of image layers, tagged and stored with metadata, together form what's called a Repository. Some use the two terms interchangeably.

    A container image is stored on a Registry Server, whose location is known to anyone wishing to pull and use the image. The system on which a container runs is called the Container Host. Running containers are also called Containerized Processes because ultimately they're nothing but processes.

    Thus, a container starts as an image at rest and ends up as a process during execution. The job of processing user requests, pulling images from the registry and initiating execution of the container is done by the Container Engine. The engine itself doesn't execute containers. This is done by the Container Runtime, which gets the image mount point and metadata from the engine, communicates with the kernel, and sets up relevant permissions.

    Since a registry may have lots of repositories, Namespaces help in logically separating them. These can be names of persons, organizations, products, etc. These should not be confused with kernel namespaces.
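
    The way registry, namespace, repository and tag fit together can be seen in an image reference. The sketch below splits a reference of the form `registry/namespace/repository:tag` into its parts. It is a simplified heuristic written for this illustration, not the full reference grammar used by any engine: it ignores digests, and it treats the first component as a registry only if it looks like a host name. The example names `registry.example.com`, `acme` and `webapp` are hypothetical.

```python
# Sketch: split an image reference into registry, namespace, repository, tag.
# Simplified heuristic for illustration; digests (@sha256:...) are not handled.
from typing import NamedTuple, Optional


class ImageRef(NamedTuple):
    registry: Optional[str]
    namespace: Optional[str]
    repository: str
    tag: str


def parse_image_ref(ref: str) -> ImageRef:
    """Parse '[registry/][namespace/]repository[:tag]' into its parts."""
    parts = ref.split("/")
    registry = None
    # A first component that looks like a host name is taken as the registry.
    looks_like_host = "." in parts[0] or ":" in parts[0] or parts[0] == "localhost"
    if len(parts) > 1 and looks_like_host:
        registry = parts.pop(0)
    repository, _, tag = parts.pop().partition(":")
    namespace = "/".join(parts) or None
    # When no tag is given, "latest" is the conventional default.
    return ImageRef(registry, namespace, repository, tag or "latest")


if __name__ == "__main__":
    for ref in ("alpine", "acme/webapp:2", "registry.example.com/acme/webapp:1.2"):
        print(ref, "->", parse_image_ref(ref))
```

    Here the namespace `acme` only groups repositories within a registry. It has nothing to do with the kernel namespaces that isolate running containers.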

  • Could you name some container implementations?

    Docker is the most popular one. Others include Linux OpenVZ, Linux-VServer, FreeBSD Jails, AIX Workload Partitions (WPARs), HP-UX Containers (SRP), and Solaris Containers. Docker offers a complete container ecosystem including image management, deployment assistance, automation, API for building Platform as a Service (PaaS), and more.

    The Open Container Initiative (OCI) has standardized the container runtime. Its reference implementation is called runc. This is used by Docker, CRI-O and others. Docker previously used LXC as the runtime. Then it created its own runtime called libcontainer, which eventually became runc. Other OCI-compliant runtimes include crun, railcar, and Kata Containers.

    Among the different container engines are Docker, rkt, CRI-O, and LXD. In addition, cloud providers may provide their own built-in container engines. For interoperability, many engines accept Docker or OCI-compliant images.

  • Which Linux distributions are suitable for use as a container host?

    Most Linux distributions are unnecessarily feature-heavy if their intended use is simply to act as a container host to run containers. For that reason, a number of Linux distributions have been designed specifically for running containers. Here are some examples:

    • Container Linux — formerly called CoreOS Linux, it's one of the first lightweight operating systems built for containers.
    • RancherOS — a simplified Linux distribution built from containers, specifically for running containers.
    • Photon OS — a minimal Linux container host, optimized to run on VMware platforms.
    • Project Atomic Host — Red Hat's lightweight container OS has versions that are based on CentOS and Fedora, and there's also a downstream enterprise version in Red Hat Enterprise Linux.
    • Ubuntu Core — the smallest Ubuntu version, Ubuntu Core is designed as a host operating system for IoT devices and large-scale cloud container deployments.
    • Alpine Linux — a very small Linux distribution focused on security.

Milestones

1979

During the development of Unix V7 in 1979, the chroot system call is introduced, changing the root directory of a process and its children to a new location in the filesystem. In BSD Unix, this feature is introduced in 1982.

2000

FreeBSD Jails allow administrators to partition a FreeBSD computer system into several independent, smaller systems called jails, with the ability to assign an IP address for each system and configuration. In 2001, something similar is done by Linux VServer to partition resources (file systems, network addresses, memory) on a computer system.

2004

Sun Microsystems (later acquired by Oracle) releases Solaris Containers, also known as Solaris Zones. They combine system resource controls with the boundary separation provided by zones, and can leverage ZFS features such as snapshots and cloning.

2005

OpenVZ is an operating system-level virtualization technology for Linux that uses a patched Linux kernel for virtualization, isolation, resource management and checkpointing. Back then, the code is not released as part of the official Linux kernel.

2006

Process Containers is launched by Google. It's designed for limiting, accounting and isolating resource usage (CPU, memory, disk I/O, network) of a collection of processes. It's renamed Control Groups (cgroups) a year later and eventually merged into Linux kernel 2.6.24.

2008

LXC (LinuX Containers) is the first complete implementation of a Linux container manager. It's implemented using cgroups and Linux namespaces. It works on a single Linux kernel without requiring any patches.

2013

Let Me Contain That For You (LMCTFY) is started as an open-source version of Google's container stack, providing Linux application containers. Applications can be made "container aware," creating and managing their own subcontainers. Active development of LMCTFY stopped in 2015 after Google started contributing core LMCTFY concepts to libcontainer, which is now part of the Open Container Initiative.

2013

Docker is released and containers explode in popularity. Docker uses LXC in its initial stages and later replaces that container manager with its own library, libcontainer. Later, Docker separates itself from the pack by offering an entire ecosystem for container management.

References

  1. Coggin, Mark. 2015. "Forrester's Dave Bartoletti Reports on Container Usage at Red Hat Partner Conference." Blog, Red Hat, April 9. Accessed 2020-07-21.
  2. Docker. 2020. "What is a Container?" Docker. Accessed 2020-07-21.
  3. Hogg, Scott. 2014. "Software Containers: Used More Frequently than Most Realize." Network World, May 26. Accessed 2018-04-26.
  4. Janetakis, Nick. 2017. "Virtual Machines vs Docker Containers - Dive Into Docker." YouTube, July 2. Accessed 2018-05-13.
  5. Janetakis, Nick. 2017b. "The 3 Biggest Wins When Using Alpine as a Base Docker Image." Blog, June 20. Accessed 2019-08-05.
  6. Karle, Akshay. 2015. "Operating System Containers vs. Application Containers." Blog, RisingStack, May 19. Accessed 2019-08-04.
  7. Mavungu, Eddy. 2017. "Docker Storage: An Introduction." Blog, Codeship, May 05. Accessed 2019-08-04.
  8. McCarty, Scott. 2018. "A Practical Introduction to Container Terminology." Blog, Red Hat Developer, February 22. Accessed 2019-08-04.
  9. Novoseltseva, Ekaterina. 2017. "Top 10 benefits you will get by using Docker." Apiumhub, March 4. Accessed 2018-05-11.
  10. Osnat, Rani. 2018. "A Brief History of Containers: From the 1970s to 2017." Aqua Blog, March 21. Accessed 2018-04-26.
  11. Patrizio, Andy. 2018. "Containers vs. Virtual Machines." Datamation, December 19. Accessed 2020-07-21.
  12. Rubens, Paul. 2017. "What are containers and why do we need them?" CIO, Jun 27. Accessed 2018-04-26.
  13. Tintri. 2016. "Data dive: VM sizes in the real world." Blog, Tintri, May. Accessed 2019-08-05.
  14. Wikipedia. 2018a. "Docker Software." Wikipedia, May 3. Accessed 2018-04-26.
  15. Wikipedia. 2018b. "Operating System Level Virtualization." Wikipedia, May 8. Accessed 2018-04-26.

Further Reading

  1. A Practical Introduction to Container Terminology
  2. What are containers and why do we need them?
  3. Software Containers: Used More Frequently than Most Realize
  4. Operating System Level Virtualization
  5. A Brief Introduction to Linux Containers with LXC

Cite As

Devopedia. 2020. "Containerization." Version 9, July 21. Accessed 2023-11-13. https://devopedia.org/containerization
Contributed by
2 authors


Last updated on
2020-07-21 08:33:47
