Namespace Isolation

cgroups (abbreviated from control groups) is a Linux kernel feature that limits, accounts for, and isolates the resource usage (CPU, memory, disk I/O, etc.) of a collection of processes. Engineers at Google started the work on this feature in 2006 under the name "process containers". In late 2007, the nomenclature changed to "control groups" to avoid confusion caused by multiple meanings of the term "container" in the Linux kernel context, and the control groups functionality was merged into the Linux kernel mainline in kernel version 2.6.24, which was released in January 2008. Since then, developers have added many new features and controllers, such as support for kernfs in 2014, firewalling, and unified hierarchy. cgroup v2 was merged in Linux kernel 4.5 with significant changes to the interface and internal functionality.


Versions

There are two versions of cgroups. Cgroups was originally written by Paul Menage and Rohit Seth, and merged into the mainline Linux kernel in 2007. This original implementation is now called cgroups version 1. Development and maintenance of cgroups was then taken over by Tejun Heo, who redesigned and rewrote cgroups. This rewrite is now called version 2; the documentation of cgroup v2 first appeared in Linux kernel 4.5, released on 14 March 2016. Unlike v1, cgroup v2 has only a single process hierarchy and discriminates between processes, not threads.


Features

One of the design goals of cgroups is to provide a unified interface to many different use cases, from controlling single processes (by using nice, for example) to operating system-level virtualization (as provided by OpenVZ, Linux-VServer or LXC, for example). Cgroups provides:

* Resource limiting: groups can be set not to exceed a configured memory limit (which also includes the file system cache), I/O bandwidth limit, CPU quota limit, CPU set limit, or maximum number of open files.
* Prioritization: some groups may get a larger share of CPU utilization or disk I/O throughput.
* Accounting: measures a group's resource usage, which may be used, for example, for billing purposes.
* Control: freezing groups of processes, checkpointing them, and restarting them.
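As a rough illustration of resource limiting and prioritization, the following Python sketch builds the text that would be written into a group's control files. It assumes the cgroup v2 interface file names memory.max, cpu.max, cpu.weight and pids.max, none of which are named in the text above. The helper names and the scratch directory are hypothetical. Writing to a real group under the cgroup file system normally needs suitable privileges, so the demonstration only writes into a temporary directory.

```python
import os
import tempfile


def build_limit_files(memory_bytes=None, cpu_quota_us=None,
                      cpu_period_us=100_000, cpu_weight=None, pids=None):
    """Map cgroup v2 interface file names to the text to write into them.

    Hypothetical helper: it only formats values and touches no files.
    """
    files = {}
    if memory_bytes is not None:
        # Resource limiting: hard memory limit in bytes.
        files["memory.max"] = str(memory_bytes)
    if cpu_quota_us is not None:
        # Resource limiting: CPU quota, written as "<quota> <period>" in microseconds.
        files["cpu.max"] = f"{cpu_quota_us} {cpu_period_us}"
    if cpu_weight is not None:
        # Prioritization: relative CPU share, valid range 1..10000.
        if not 1 <= cpu_weight <= 10_000:
            raise ValueError("cpu.weight must be between 1 and 10000")
        files["cpu.weight"] = str(cpu_weight)
    if pids is not None:
        # Resource limiting: maximum number of tasks in the group.
        files["pids.max"] = str(pids)
    return files


def apply_limits(cgroup_dir, files):
    """Write each value into the matching file below cgroup_dir."""
    for name, value in files.items():
        with open(os.path.join(cgroup_dir, name), "w") as handle:
            handle.write(value + "\n")


if __name__ == "__main__":
    # Dry run: a scratch directory stands in for a real cgroup directory.
    limits = build_limit_files(memory_bytes=256 * 1024 * 1024,
                               cpu_quota_us=50_000, cpu_weight=200)
    with tempfile.TemporaryDirectory() as scratch:
        apply_limits(scratch, limits)
        for name in sorted(os.listdir(scratch)):
            with open(os.path.join(scratch, name)) as handle:
                print(name, "=", handle.read().strip())
```

Keeping the formatting step separate from the writing step means the values can be inspected or logged before anything touches the file system.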


Use

A control group (abbreviated as cgroup) is a collection of processes that are bound by the same criteria and associated with a set of parameters or limits. These groups can be hierarchical, meaning that each group inherits limits from its parent group. The kernel provides access to multiple controllers (also called subsystems) through the cgroup interface; for example, the "memory" controller limits memory use, "cpuacct" accounts CPU usage, etc.

Control groups can be used in multiple ways:

* By accessing the cgroup virtual file system manually.
* By creating and managing groups on the fly using tools like cgcreate, cgexec, and cgclassify (from libcgroup).
* Through the "rules engine daemon" that can automatically move processes of certain users, groups, or commands to cgroups as specified in its configuration.
* Indirectly through other software that uses cgroups, such as Docker, Firejail, LXC, libvirt, systemd, Open Grid Scheduler/Grid Engine, and Google's lmctfy, which is no longer developed.

The Linux kernel documentation contains some technical details of the setup and use of control groups version 1 and version 2. The systemd-cgtop command can be used to show the top control groups by their resource usage.
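One manual way to see which groups a process belongs to is to read /proc/self/cgroup, where each line has the form "hierarchy-ID:controller-list:path". The Python sketch below parses that format. It assumes the convention that a version 2 entry has hierarchy ID 0 and an empty controller list. The type and function names are hypothetical, and the sample text is made up for illustration.

```python
from typing import NamedTuple


class CgroupEntry(NamedTuple):
    """One line of /proc/<pid>/cgroup."""
    hierarchy_id: int
    controllers: tuple
    path: str

    @property
    def is_v2(self):
        # Assumed convention: the unified (v2) hierarchy is listed as "0::<path>".
        return self.hierarchy_id == 0 and not self.controllers


def parse_proc_cgroup(text):
    """Parse the contents of /proc/<pid>/cgroup into CgroupEntry tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first two colons only; the path may itself contain colons.
        hierarchy_id, controllers, path = line.split(":", 2)
        names = tuple(name for name in controllers.split(",") if name)
        entries.append(CgroupEntry(int(hierarchy_id), names, path))
    return entries


if __name__ == "__main__":
    # Made-up sample mixing version 1 hierarchies with a version 2 entry.
    sample = (
        "12:memory:/user.slice\n"
        "4:cpu,cpuacct:/user.slice\n"
        "0::/user.slice/session-1.scope\n"
    )
    for entry in parse_proc_cgroup(sample):
        print(entry, "v2" if entry.is_v2 else "v1")
```

On a Linux system the same function could be given the text read from /proc/self/cgroup instead of the sample string.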


Redesign

Redesign of cgroups started in 2013, with additional changes brought by versions 3.15 and 3.16 of the Linux kernel.


Namespace isolation

While not technically part of the cgroups work, a related feature of the Linux kernel is namespace isolation, where groups of processes are separated such that they cannot "see" resources in other groups. For example, a PID namespace provides a separate enumeration of process identifiers within each namespace. Also available are mount, user, UTS (Unix Time Sharing), network and SysV IPC namespaces.

* The PID namespace provides isolation for the allocation of process identifiers (PIDs), lists of processes and their details. While the new namespace is isolated from other siblings, processes in its "parent" namespace still see all processes in child namespaces, albeit with different PID numbers.
* The network namespace isolates the network interface controllers (physical or virtual), iptables firewall rules, routing tables etc. Network namespaces can be connected with each other using the "veth" virtual Ethernet device.
* The UTS namespace allows changing the hostname.
* The mount namespace allows creating a different file system layout, or making certain mount points read-only.
* The IPC namespace isolates the System V inter-process communication between namespaces.
* The user namespace isolates the user IDs between namespaces.
* Cgroup namespace

Namespaces are created with the "unshare" command or syscall, or as "new" flags in a "clone" syscall.

The "ns" subsystem was added early in cgroups development to integrate namespaces and control groups. If the "ns" cgroup was mounted, each namespace would also create a new group in the cgroup hierarchy. This was an experiment that was later judged to be a poor fit for the cgroups API, and removed from the kernel.

Linux namespaces were inspired by the more general namespace functionality used heavily throughout Plan 9 from Bell Labs.
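To check whether two processes share a namespace, one option is to compare the symbolic links under /proc/<pid>/ns, whose targets look like "pid:[4026531836]". The Python sketch below reads those links and compares the numbers. It assumes a Linux /proc layout, and the function names are hypothetical. It only inspects namespaces and does not create any, since calling unshare or clone with new-namespace flags usually needs extra privileges.

```python
import os
import re
import tempfile

# Link targets under /proc/<pid>/ns have the form "<type>:[<inode number>]".
_NS_LINK = re.compile(r"^(?P<kind>[a-z_]+):\[(?P<inode>\d+)\]$")


def read_namespaces(ns_dir="/proc/self/ns"):
    """Return {entry name: namespace inode number} for one process."""
    namespaces = {}
    for name in sorted(os.listdir(ns_dir)):
        try:
            target = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue  # not a symbolic link, or not readable
        match = _NS_LINK.match(target)
        if match:
            namespaces[name] = int(match.group("inode"))
    return namespaces


def shared_namespaces(first, second):
    """List the namespace kinds in which both processes see the same instance."""
    common = first.keys() & second.keys()
    return sorted(name for name in common if first[name] == second[name])


if __name__ == "__main__":
    # Two scratch directories imitate the ns directories of two processes.
    with tempfile.TemporaryDirectory() as a_dir, tempfile.TemporaryDirectory() as b_dir:
        os.symlink("pid:[4026531836]", os.path.join(a_dir, "pid"))
        os.symlink("net:[4026531992]", os.path.join(a_dir, "net"))
        os.symlink("pid:[4026532500]", os.path.join(b_dir, "pid"))
        os.symlink("net:[4026531992]", os.path.join(b_dir, "net"))
        print(shared_namespaces(read_namespaces(a_dir), read_namespaces(b_dir)))
```

Two processes with equal numbers for an entry are in the same namespace of that kind; a process placed in a new PID namespace would show a different number for the pid entry.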


Unified hierarchy

Kernfs was introduced into the Linux kernel with version 3.14 in March 2014, the main author being Tejun Heo. One of the main motivators for a separate kernfs is the cgroups file system. Kernfs is basically created by splitting off some of the sysfs logic into an independent entity, which makes it easier for other kernel subsystems to implement their own virtual file system with handling for device connect and disconnect, dynamic creation and removal, and other attributes. Redesign continued into version 3.15 of the Linux kernel.


Kernel memory control groups (kmemcg)

Kernel memory control groups (kmemcg) were merged into version 3.8 of the Linux kernel mainline. The kmemcg controller can limit the amount of memory that the kernel can utilize to manage its own internal processes.


cgroup awareness of OOM killer

Linux kernel 4.19 (October 2018) made the OOM killer cgroup-aware, adding the ability to kill a cgroup as a single unit and so guarantee the integrity of the workload.


Adoption

Various projects use cgroups as their basis, including CoreOS, Docker (in 2013), Hadoop, Jelastic, Kubernetes, lmctfy (Let Me Contain That For You), LXC (Linux Containers), systemd, Mesos and Mesosphere, and HTCondor. Major Linux distributions have also adopted it, such as Red Hat Enterprise Linux (RHEL) 6.0 in November 2010, nearly three years after cgroups were merged into the mainline Linux kernel. On 29 October 2019, the Fedora Project modified Fedora 31 to use cgroups v2 by default.


See also

* Operating system–level virtualization implementations
* Process group
* tc (Linux), a traffic control utility slightly overlapping in functionality with network-oriented cgroup settings
* Job object, the equivalent Windows concept, as managed by that platform's Object Manager

