Linux containers
If you are in a hurry and do not want to read this man page, here is a command, offered without warranty, that launches a shell inside a container using a predefined configuration template. It may work:
/usr/bin/lxc-execute -n foo -f /usr/share/doc/lxc/examples/lxc-macvlan.conf /bin/bash
Container technology is actively being pushed into the mainline Linux kernel. It provides resource management through control groups (also known as process containers) and resource isolation through namespaces.
The Linux containers project, lxc, aims to use these new functionalities to provide a userspace container object that offers full resource isolation and resource control for an application or a system.
The first objective of this project is to make life easier for the kernel developers involved in the containers project, and especially to continue working on the new Checkpoint/Restart features. lxc is small enough to manage a container easily with simple command lines, and complete enough to be used for other purposes.
lxc relies on a set of functionalities provided by the kernel, which need to be active. Depending on which functionalities are missing, lxc will either work with a restricted set of functionalities or simply fail.
The following list gives the kernel features to enable in order to have a fully featured container:
\*(T<
* General setup
  * Control Group support
    -> Namespace cgroup subsystem
    -> Freezer cgroup subsystem
    -> Cpuset support
    -> Simple CPU accounting cgroup subsystem
    -> Resource counters
      -> Memory resource controllers for Control Groups
  * Group CPU scheduler
    -> Basis for grouping tasks (Control Groups)
  * Namespaces support
    -> UTS namespace
    -> IPC namespace
    -> User namespace
    -> Pid namespace
    -> Network namespace
* Device Drivers
  * Character devices
    -> Support multiple instances of devpts
  * Network device support
    -> MAC-VLAN support
    -> Virtual ethernet pair device
* Networking
  * Networking options
    -> 802.1d Ethernet Bridging
* Security options
  -> File POSIX Capabilities
\*(T>
Kernel versions 2.6.27 and later, as shipped with the distributions, will work with lxc. Kernels older than 2.6.29 offer fewer features, but enough to be interesting; with kernel 2.6.29, lxc is fully functional. The helper script lxc-checkconfig reports on your kernel configuration.
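The version thresholds above can be turned into a quick check. The following is a minimal sketch, assuming only the two thresholds stated here (2.6.27 and 2.6.29); the function names version_at_least and lxc_kernel_support are illustrative and not part of lxc. It looks only at the version number, not at which options were compiled in, so lxc-checkconfig remains the check to trust.

```shell
# Succeed if version $1 (e.g. "2.6.29-2-amd64") is at least version $2.
# Only the first three numeric components are compared.
version_at_least() {
    # Keep the leading numeric part, e.g. "2.6.29-2-amd64" -> "2.6.29".
    have=${1%%[!0-9.]*}
    want=${2%%[!0-9.]*}
    old_ifs=$IFS
    IFS=.
    # Pad with zeros so that "3.2" is read as 3.2.0.
    set -- $have 0 0 0
    h1=$1 h2=$2 h3=$3
    set -- $want 0 0 0
    w1=$1 w2=$2 w3=$3
    IFS=$old_ifs
    if [ "$h1" -gt "$w1" ]; then return 0; fi
    if [ "$h1" -lt "$w1" ]; then return 1; fi
    if [ "$h2" -gt "$w2" ]; then return 0; fi
    if [ "$h2" -lt "$w2" ]; then return 1; fi
    [ "$h3" -ge "$w3" ]
}

# Print "full", "reduced" or "unsupported" for kernel version $1,
# using the thresholds given in the text (hypothetical helper).
lxc_kernel_support() {
    if version_at_least "$1" 2.6.29; then
        echo full
    elif version_at_least "$1" 2.6.27; then
        echo reduced
    else
        echo unsupported
    fi
}
```

A typical call would be lxc_kernel_support "$(uname -r)".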
Before using lxc, configure your system with file capabilities; otherwise you will need to run the lxc commands as root.
The control group can be mounted anywhere, e.g. mount -t cgroup cgroup /cgroup. If you want to dedicate a specific cgroup mount point to lxc, that is, to have different cgroups mounted at different places with different options but let lxc use only one location, you can bind the mount point with the \*(T<lxc\*(T> name, e.g. mount -t cgroup lxc /cgroup4lxc or mount -t cgroup -o ns,cpuset,freezer,devices lxc /cgroup4lxc
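To see which cgroup mount point would be picked up, the mount table can be inspected. This is a sketch under a stated assumption: that a cgroup mount whose device name is lxc takes precedence and that any other cgroup mount is used as a fallback. The function name find_lxc_cgroup is illustrative; the only thing relied on is the /proc/mounts column layout (device, mount point, filesystem type, options).

```shell
# Print the cgroup mount point lxc would be expected to use, reading a
# mounts table in /proc/mounts format (device, mount point, type, ...).
# Assumption: a cgroup mount whose device name is "lxc" is preferred,
# otherwise the first cgroup mount found is used.
find_lxc_cgroup() {
    mounts_file=${1:-/proc/mounts}
    awk '
        $3 == "cgroup" && $1 == "lxc" { preferred = $2 }
        $3 == "cgroup" && first == "" { first = $2 }
        END {
            if (preferred != "") print preferred
            else if (first != "") print first
            else exit 1
        }
    ' "$mounts_file"
}
```

Called without an argument it reads /proc/mounts; it fails when no cgroup filesystem is mounted at all.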
A container is an object isolating some resources of the host, for the application or system running in it.
The application / system will be launched inside a container specified by a configuration that is either initially created or passed as parameter of the starting commands.
How to run an application in a container?
Before running an application, you should know which resources you want to isolate. The default configuration isolates the pids, the sysv ipc and the mount points. If you want to run a simple shell inside a container, a basic configuration is needed, especially if you want to share the rootfs. If you want to run an application like sshd, you should provide a new network stack and a new hostname. If you want to avoid conflicts with some files, e.g. \*(T</var/run/httpd.pid\*(T>, you should remount \*(T</var/run\*(T> with an empty directory. If you want to avoid conflicts in all cases, you can specify a rootfs for the container. The rootfs can be a directory tree, previously bind mounted with the initial rootfs, so you can still use your distro but with your own \*(T</etc\*(T> and \*(T</home\*(T>.
Here is an example of a directory tree for sshd:
\*(T<
[root@lxc sshd]$ tree -d rootfs

rootfs
|-- bin
|-- dev
|   |-- pts
|   `-- shm
|       `-- network
|-- etc
|   `-- ssh
|-- lib
|-- proc
|-- root
|-- sbin
|-- sys
|-- usr
`-- var
    |-- empty
    |   `-- sshd
    |-- lib
    |   `-- empty
    |       `-- sshd
    `-- run
        `-- sshd
\*(T>
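The empty skeleton shown in this tree could be created in one step. This is only a sketch: make_sshd_rootfs is a hypothetical helper, and it creates the directories without populating them.

```shell
# Create the empty directories of the example sshd tree under $1.
make_sshd_rootfs() {
    rootfs=$1
    for dir in bin dev/pts dev/shm/network etc/ssh lib proc root sbin sys usr \
               var/empty/sshd var/lib/empty/sshd var/run/sshd; do
        mkdir -p "$rootfs/$dir" || return 1
    done
}
```

For example, make_sshd_rootfs /home/root/sshd/rootfs would prepare the location used in the rest of this example.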
and the mount points file associated with it:
\*(T<
[root@lxc sshd]$ cat fstab

/lib  /home/root/sshd/rootfs/lib  none ro,bind 0 0
/bin  /home/root/sshd/rootfs/bin  none ro,bind 0 0
/usr  /home/root/sshd/rootfs/usr  none ro,bind 0 0
/sbin /home/root/sshd/rootfs/sbin none ro,bind 0 0
\*(T>
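Since every line of this mount points file follows the same pattern, it can be generated rather than typed. A minimal sketch, in which make_bind_fstab is an illustrative name and the ro,bind options are simply copied from the example:

```shell
# Print one read-only bind-mount fstab line per host directory.
# $1 is the container rootfs; remaining arguments are absolute host paths.
make_bind_fstab() {
    rootfs=$1
    shift
    for dir in "$@"; do
        printf '%s %s%s none ro,bind 0 0\n' "$dir" "$rootfs" "$dir"
    done
}
```

For example, make_bind_fstab /home/root/sshd/rootfs /lib /bin /usr /sbin > fstab reproduces the entries of the file shown here.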
How to run a system in a container?
Running a system inside a container is paradoxically easier than running an application. Why? Because you do not have to care about which resources to isolate: everything needs to be isolated. The other resources are specified as being isolated but without configuration, because the container will set them up; e.g. the ipv4 address will be set up by the system container's init scripts. Here is an example of the mount points file:
\*(T<
[root@lxc debian]$ cat fstab

/dev     /home/root/debian/rootfs/dev     none bind 0 0
/dev/pts /home/root/debian/rootfs/dev/pts none bind 0 0
\*(T>
More information can be added to the container to facilitate the configuration. For example, the resolv.conf file belonging to the host can be made accessible from the container:
\*(T< /etc/resolv.conf /home/root/debian/rootfs/etc/resolv.conf none bind 0 0 \*(T>
When the container is created, it contains the configuration information. When a process is launched, the container goes through the STARTING state and then into RUNNING. When the last process running inside the container exits, the container is stopped.
In case of failure when the container is initialized, it will pass through the aborting state.
\*(T<
   ---------
  | STOPPED |<---------------
   ---------                 |
       |                     |
     start                   |
       |                     |
       V                     |
   ----------                |
  | STARTING |--error-       |
   ----------         |      |
       |              |      |
       V              V      |
   ---------    ----------   |
  | RUNNING |  | ABORTING |  |
   ---------    ----------   |
       |              |      |
  no process          |      |
       |              |      |
       V              |      |
   ----------         |      |
  | STOPPING |<-------       |
   ----------                |
       |                     |
        ---------------------
\*(T>
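The diagram can also be read as a transition table. The sketch below encodes it as a shell function; only the state names and the start, error and no process labels come from the diagram, while the event names ok, no-process and done and the function name next_state are made up for illustration.

```shell
# Print the state that follows state $1 on event $2, or fail if the
# diagram has no such transition. Event names are illustrative labels.
next_state() {
    case "$1:$2" in
        STOPPED:start)      echo STARTING ;;
        STARTING:ok)        echo RUNNING ;;
        STARTING:error)     echo ABORTING ;;
        RUNNING:no-process) echo STOPPING ;;
        ABORTING:done)      echo STOPPING ;;
        STOPPING:done)      echo STOPPED ;;
        *)                  return 1 ;;
    esac
}
```

Note that every path, including the error path through ABORTING, ends in STOPPING and then STOPPED.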
The container is configured through a configuration file, whose format is described in \*(T<lxc.conf\*(T>(5).
A persistent container object can be created via the lxc-create command. It takes a container name as a parameter, and optionally a configuration file and a template. The name is used by the different commands to refer to this container. The lxc-destroy command destroys the container object.
\*(T<
lxc-create -n foo
lxc-destroy -n foo
\*(T>
It is not mandatory to create a container object before starting it. The container can be started directly with a configuration file as a parameter.
Once the container has been created, it is ready to run an application or a system. This is the purpose of the lxc-execute and lxc-start commands. If the container was not created before starting the application, the container will use the configuration file passed as a parameter to the command, and if there is no such parameter either, it will use a default isolation. When the application ends, the container is stopped as well, but if needed the lxc-stop command can be used to kill an application that is still running.
Running an application inside a container is not exactly the same thing as running a system. For this reason, there are two different commands to run an application in a container:
\*(T<
lxc-execute -n foo [-f config] /bin/bash
lxc-start -n foo [-f config] [/bin/bash]
\*(T>
The lxc-execute command runs the specified command in the container via an intermediate process, lxc-init. After launching the specified command, lxc-init waits for it and for all other reparented processes to end (this is what supports daemons in the container). In other words, in the container, lxc-init has pid 1 and the first process of the application has pid 2.
The lxc-start command runs the specified command directly in the container. The pid of the first process is 1. If no command is specified, lxc-start runs \*(T</sbin/init\*(T>.
To summarize, lxc-execute is for running an application and lxc-start is better suited for running a system.
If the application is no longer responding, is inaccessible or is not able to finish by itself, a wild lxc-stop command will kill all the processes in the container without pity.
\*(T< lxc-stop -n foo \*(T>
If the container is configured with ttys, it is possible to access it through them. It is up to the container to provide a set of available ttys to be used by the following command. When the tty connection is lost, it is possible to reconnect to it without logging in again.
\*(T< lxc-console -n foo -t 3 \*(T>
Sometimes it is useful to stop all the processes belonging to a container, e.g. for job scheduling. The command:
\*(T< lxc-freeze -n foo \*(T>
will put all the processes in an uninterruptible state and
\*(T< lxc-unfreeze -n foo \*(T>
will resume them.
This feature is enabled if the cgroup freezer is enabled in the kernel.
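A common use of the freeze and unfreeze pair is to run some host-side action while the container is paused. The wrapper below is a sketch, not an lxc command: with_frozen is a hypothetical name, and it uses only the -n option shown above. It unfreezes the container even when the wrapped command fails.

```shell
# Freeze container $1, run the remaining arguments as a command on the
# host, then unfreeze the container again. Returns the command's status.
with_frozen() {
    name=$1
    shift
    lxc-freeze -n "$name" || return 1
    if "$@"; then
        status=0
    else
        status=$?
    fi
    # Always resume the container, whatever the command returned.
    lxc-unfreeze -n "$name" || return 1
    return "$status"
}
```

For example, with_frozen foo tar -cf /tmp/foo-rootfs.tar /home/root/debian/rootfs would archive the rootfs while no process in the container is running; the archive path here is only an example.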
When there are a lot of containers, it is hard to follow what has been created or destroyed, what is running, or which pids are running in a specific container. For this reason, the following commands may be useful:
\*(T<
lxc-ls
lxc-info -n foo
\*(T>
lxc-ls lists the containers of the system.
lxc-info gives information for a specific container.
Here is an example of how these commands can be combined to list all the containers and retrieve their state.
\*(T<
for i in $(lxc-ls -1); do
  lxc-info -n "$i"
done
\*(T>
It is sometimes useful to track the states of a container, for example to monitor it or just to wait for a specific state in a script.
The lxc-monitor command monitors one or several containers. Its parameter accepts a regular expression, for example:
\*(T< lxc-monitor -n "foo|bar" \*(T>
will monitor the states of containers named 'foo' and 'bar', and:
\*(T< lxc-monitor -n ".*" \*(T>
will monitor all the containers.
For a container 'foo' starting, doing some work and exiting, the output will be in the form:
\*(T<
'foo' changed state to [STARTING]
'foo' changed state to [RUNNING]
'foo' changed state to [STOPPING]
'foo' changed state to [STOPPED]
\*(T>
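In a script, this output is easier to handle once reduced to a name and a state per line. The filter below is a sketch that assumes the output has exactly the form shown here; monitor_events is an illustrative name.

```shell
# Read lxc-monitor style lines on stdin and print "<name> <STATE>" for
# each state change; lines that do not match the format are ignored.
monitor_events() {
    sed -n "s/^'\(.*\)' changed state to \[\([A-Z]*\)]\$/\1 \2/p"
}
```

It would be used as lxc-monitor -n ".*" | monitor_events, for instance feeding a while read name state loop.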
The lxc-wait command waits for a specific state change and then exits. This is useful in scripts to synchronize with the launch or the end of a container. The parameter is an ORed combination of different states. The following example shows how to wait for a container that has gone into the background.
\*(T<
# launch lxc-wait in background
lxc-wait -n foo -s STOPPED &
LXC_WAIT_PID=$!

# this command goes in background
lxc-execute -n foo mydaemon &

# block until the lxc-wait exits
# and lxc-wait exits when the container
# is STOPPED
wait $LXC_WAIT_PID
echo "'foo' is finished"
\*(T>
The container is tied to the control groups: when a container is started, a control group is created and associated with it. The control group properties can be read and modified while the container is running by using the lxc-cgroup command.
The lxc-cgroup command is used to set or get the value of a control group subsystem associated with a container. The subsystem name is handled by the user: the command does no syntax checking on it, and if the subsystem name does not exist, the command fails.
\*(T< lxc-cgroup -n foo cpuset.cpus \*(T>
will display the content of this subsystem.
\*(T< lxc-cgroup -n foo cpu.shares 512 \*(T>
will set the subsystem to the specified value.
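Because lxc-cgroup fails on a subsystem name that does not exist, a script reading several values should check each call. A minimal sketch, where dump_cgroup is a hypothetical helper built only on the get form shown above:

```shell
# Print "<subsystem>=<value>" for each subsystem name given after the
# container name $1; report names that lxc-cgroup rejects on stderr.
dump_cgroup() {
    name=$1
    shift
    for key in "$@"; do
        if value=$(lxc-cgroup -n "$name" "$key"); then
            printf '%s=%s\n' "$key" "$value"
        else
            printf '%s: not available\n' "$key" >&2
        fi
    done
}
```

For example, dump_cgroup foo cpuset.cpus cpu.shares prints one line per subsystem for the running container foo.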
lxc is still in development, so the command syntax and the API can change. Version 1.0.0 will be the frozen version.
Daniel Lezcano <\*(T<[email protected]\*(T>>