Linux VServer Project

Linux-VServer provides virtualization for GNU/Linux systems. This is accomplished by kernel-level isolation, which allows running multiple virtual units at once. Those units are sufficiently isolated to guarantee the required security, but use available resources efficiently, as they run on the same kernel.

Resources

Installing Vserver host on PLD Linux

Ensure you have an appropriate kernel installed.

You can check this from kernel config:

# modprobe configs
# zgrep CONFIG_VSERVER /proc/config.gz 
CONFIG_VSERVER=y
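
The same check can be wrapped in a small function for use in scripts. This is only a sketch: the name has_vserver_kernel is hypothetical, and it assumes the config is available either as plain text or gzip-compressed (as /proc/config.gz is).

```shell
# Hypothetical helper: succeed if a kernel config enables CONFIG_VSERVER.
# Accepts a plain or gzip-compressed config file; defaults to /proc/config.gz.
# Returns 2 if the file cannot be read (e.g. the configs module is not loaded).
has_vserver_kernel() {
    cfg=${1:-/proc/config.gz}
    [ -r "$cfg" ] || return 2
    case $cfg in
        *.gz) gzip -dc "$cfg" ;;
        *)    cat "$cfg" ;;
    esac | grep -q '^CONFIG_VSERVER=y$'
}
```

For example, has_vserver_kernel && echo "vserver kernel" prints the message only on a vserver-enabled kernel.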

Installing guest PLD Linux Vserver

Preparing userspace tools

First, install the tools:

# poldek -u util-vserver

If you need to review poldek repo sources, the configs are in /etc/vservers/.distributions/pld-*/poldek/ where * can be ac or th, depending on which guest you wish to install.

At this point you should have booted into a vserver-enabled kernel. You must start vprocunhide, or none of your Vservers can start.

To start vprocunhide:

# /sbin/service vprocunhide start

Guest creation

Build the guest system.

# a guest name (not hostname)
NAME=test
# <num> must be a number within 2-32767 range. 
CTX=2

vserver $NAME build --context $CTX -m poldek -n $NAME

By default, this installs a guest with the same ARCH and VERSION as your host.

If you need to use another combination, there are two versions of PLD available for guest systems: ac and th (the pld-ac and pld-th distributions).

You may choose one using the -d option:

DIST=pld-th

vserver $NAME build --context $CTX -m poldek -n $NAME  -- -d $DIST

With util-vserver >= 0.30.214-2 from ac-updates, or util-vserver >= 0.30.215-2 from th, you can build another arch or distro, or use your own mirror:

MIRROR=http://ftp.pld-linux.org/dists/ac

vserver $NAME build --context $CTX -m poldek -n $NAME -- -m $MIRROR

To build a 32-bit guest on a 64-bit host:

vserver $NAME build --context $CTX -m poldek -n $NAME --personality linux_32bit --machine i686 -- -d $DIST

To build a vserver from a template (an archive containing the whole filesystem):

# vserver $NAME build --context $CTX -m template -n $NAME -- -t image.tar.bz2

To see other build command options:

# vserver test build --help

Install rc-scripts to the new system using vpoldek:

# vpoldek test -- -u rc-scripts

You should consider installing the vserver-packages rpm package to satisfy dependencies on packages that have no use inside a vserver.

And then start the guest system:

# vserver test start

To enter that vserver, type:

# vserver test enter

Note, however, that if you don't run plain init style you must have at least one daemon running inside your guest vserver or it will be shut down shortly.

Configuring the network

/etc/vservers/<vserver-name>/interfaces/<iface>

'iface' is an arbitrary name for the interface; the value itself is not important but may be interesting regarding interface creation and usage with chbind. Both happen in alphabetical order, so numbers like '00' are good names for these directories.

  • bcast The broadcast address.
  • dev The network device.
  • disabled When this file exists, this interface will be ignored.
  • ip The ip which will be assigned to this interface.
  • mask The network mask.
  • name When this file exists, the interface will be named with the text in this file. Without such an entry, the IP will not be shown by ifconfig but by ip addr ls only. Such a labeled interface is also known as an “alias” (e.g. 'eth0:foo').
  • nodev When this file exists, the interface will be assumed to exist already. This can be used to assign primary interfaces which are created by the host or another vserver. Using this means that the IP address won't be removed at vserver stop.
  • prefix The network prefix-length.
  • scope The scope of the network interface.

To add an interface with address 192.168.0.1/24, type:

# mkdir /etc/vservers/<vserver-name>/interfaces/0
# echo eth0 > /etc/vservers/<vserver-name>/interfaces/0/dev
# echo 192.168.0.1/24 > /etc/vservers/<vserver-name>/interfaces/0/ip
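
The same steps can be scripted. The sketch below is an illustration, not part of util-vserver: add_interface and the VSERVER_CONF override are hypothetical names, and it writes the address and prefix length to the separate ip and prefix files listed above instead of the combined form used in the example.

```shell
# VSERVER_CONF is an assumed override (handy for dry runs); the real tree is /etc/vservers.
VSERVER_CONF=${VSERVER_CONF:-/etc/vservers}

# Hypothetical helper: add_interface GUEST INDEX DEVICE IP PREFIX
# Creates interfaces/INDEX for the guest and fills in the dev, ip and prefix files.
add_interface() {
    dir=$VSERVER_CONF/$1/interfaces/$2
    mkdir -p "$dir" || return 1
    echo "$3" > "$dir/dev"
    echo "$4" > "$dir/ip"
    echo "$5" > "$dir/prefix"
}
```

With this, add_interface test 0 eth0 192.168.0.1 24 sets up the same interface as above.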

Configuring resources

/etc/vservers/vserver-name/rlimits

A directory with resource limits. Possible resources are cpu, fsize, data, stack, core, rss, nproc, nofile, memlock, as and locks. This configuration will be honored for kernel 2.6 only.

  • resource A file which contains the hard- and soft-limit of the given resource in the first line. The special keyword 'inf' is recognized.
  • resource.hard A file which contains the hard limit of the given resource in the first line. The special keyword 'inf' is recognized.
  • resource.min A file which contains the guaranteed minimum of the given resource in the first line. The special keyword 'inf' is recognized.
  • resource.soft A file which contains the soft limit of the given resource in the first line. The special keyword 'inf' is recognized.
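
As an illustration of the file layout described above, the following sketch writes the soft and hard limit files for one resource. The function name set_rlimit and the VSERVER_CONF override are hypothetical; only the file names and the 'inf' keyword come from the description above.

```shell
# VSERVER_CONF is an assumed override (handy for dry runs); the real tree is /etc/vservers.
VSERVER_CONF=${VSERVER_CONF:-/etc/vservers}

# Hypothetical helper: set_rlimit GUEST RESOURCE SOFT HARD
# Each limit must be a non-negative integer or the keyword 'inf'.
set_rlimit() {
    dir=$VSERVER_CONF/$1/rlimits
    for v in "$3" "$4"; do
        case $v in
            inf) ;;
            ''|*[!0-9]*) echo "invalid limit: $v" >&2; return 1 ;;
        esac
    done
    mkdir -p "$dir" || return 1
    echo "$3" > "$dir/$2.soft"
    echo "$4" > "$dir/$2.hard"
}
```

For example, set_rlimit test nproc 200 inf writes 200 to rlimits/nproc.soft and inf to rlimits/nproc.hard.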

Managing packages

You should decide on one of two package management policies:

Benefits of managing packages externally:

  • provides extra security
  • avoids duplicating RPM database and installed libraries/packages

Benefits of managing packages internally:

  • the vserver is more standalone, as it does not depend on the host (rpm version or libraries), so moving such a vserver to another host is easier.

Things you should be aware of with internal package management:

  • you cannot upgrade rpm packages when vserver is down (obviously).
  • you must have network configured for guest os to use poldek network functions (/etc/resolv.conf, interfaces/N/IP, etc)

External package management

Using vpoldek

Syntax: vpoldek <VSERVER> -- [REGULAR POLDEK OPTIONS]

For example:

# vpoldek test -- -u squid

Using vrpm

Syntax: vrpm <VSERVER> -- [REGULAR RPM OPTIONS]

For example:

# vrpm test -- -qa 'apache-*'

Internal package management

To be able to use poldek and rpm from inside of your vserver, you will have to switch from managed to stand alone package management:

# vpoldek test -- -u poldek
# vserver test stop
# vserver test pkgmgmt internalize

From now on, the packages are managed by the vserver itself and the host system's tools should no longer be used to install or remove any packages.

See this doc for further info:

$ less /usr/share/doc/util-vserver-build-0.30.210/package-management.txt.gz

DB version mismatch with host/guest

If you installed an ac vserver under th and have internalized package management, you'll likely suffer db version mismatch errors:

rpmdb: Program version 4.5 doesn't match environment version 4.7
rpmdb: /var/lib/rpm/Packages: unsupported hash version: 9

To solve this, we need to dump rpmdb and restore it.

On the th host, dump the db and install tools for the guest:

# poldek -u db4.7-utils
# cd /vservers/test/var/lib/rpm
# rm -f __db.00*
# vpoldek test -- -u db4.5-utils
# db_dump Packages > Packages.dump

On the ac guest, load the db:

# vserver test enter
# cd /var/lib/rpm
# rm -f __db.00* Packages Pubkeys
# db_load Packages < Packages.dump
# rpm --rebuilddb
# rm -f Packages.dump

Using plain init style

You might want to run your vserver with the init style set to plain, which means it runs like a regular Linux host, where everything is controlled by /sbin/init. Another reason for doing so is that, with no processes running, the vserver may get shut down before you can enter it.

To enable plain init style:

# echo 'plain' > /etc/vservers/test/apps/init/style

Copying guest PLD Linux Vserver to another host

Stop the vserver first

# vserver test stop

Then just archive and copy the structure:

# tar --exclude '/vservers/test/var/lib/mysql/*' -cSf /www/vs-test.tar \
/{etc/vservers,vservers,vservers/.pkg}/test

Removing guest PLD Linux Vserver

Stop the vserver first

# vserver test stop

Remove vserver config, filesystem and in case of external package management the rpmdb dir:

# rm -rf /{etc/vservers,vservers,vservers/.pkg}/test

Recent util-vserver includes a new command called delete:

# vserver test delete
Are you sure you want to delete the vserver test (y/N) y
Resource Manager: Entering runlevel number............................[ 6 ]
Stopping OpenSSH service...........................................[ DONE ]
Saving random seed.................................................[ DONE ]
Please stand by while rebooting the vserver........................[ DONE ]

Common problems / Useful tricks

Starting vserver fails with Dynamic Context error

# vserver test start
Dynamic Context IDs are not supported, you must set Context ID
in /etc/vservers/test/context file

Fix: set Context ID number in /etc/vservers/test/context file

# echo <num> >/etc/vservers/test/context

<num> must be a number within 2-32767 range.

Rationale: Dynamic allocation of context IDs has been disabled in latest utils, due to it being deprecated and discouraged by the Linux Vserver authors.
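
The range requirement on the context ID can be checked before writing the file. A minimal sketch; the function name valid_ctx is hypothetical and only the 2-32767 range comes from the text above.

```shell
# Hypothetical helper: succeed only if $1 is an integer within the 2-32767 range.
valid_ctx() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty or contains a non-digit
    esac
    [ "$1" -ge 2 ] && [ "$1" -le 32767 ]
}
```

For example, valid_ctx "$CTX" && echo "$CTX" > /etc/vservers/test/context writes the file only for an acceptable value.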

Starting vservers issues warnings about vc_net_create()

# vserver test start
chbind: vc_net_create(): Invalid argument

This warning is issued when there are no network interfaces configured within given vserver. You may want to configure one (see section: Configuring the network). If you need no network interfaces - e.g. when you plan not to run any daemons inside vserver - you may ignore this warning.

Starting service emits ulimit error

/etc/init.d/lighttpd: ulimit: exceeds allowable limit

Fix: remove -u unlimited from DEFAULT_SERVICE_LIMITS in /etc/sysconfig/system or per service config.

Provides: user(name) and group(name) do not work

If some group is provided by multiple packages and one is deinstalled, the users will be removed. This is because the rpm binary is not available with external package management for rpm scripts.

Preparing...                ########################################### [100%]
   1:test                   ########################################### [100%]
+ rpm -qa
/var/tmp/rpm-tmp.17082[3]: rpm: not found
error: %post(test-0.1-1.11.i686) scriptlet failed, exit status 127
vpoldek failed on vserver 'test' with errorcode 1

Workaround: disable RPM_USERDEL=yes in /etc/sysconfig/rpm

Service ssh doesn't start inside guest server

test sshd[17644]: error: Bind to port 22 on 192.168.0.1 failed: Cannot assign requested address.

Fix: set separate addresses after ListenAddress in /etc/ssh/sshd_config both on host and guest system. Guest configuration is optional as it's limited to chbind addresses and if these are not taken by the SSH daemon running on host system everything will work just fine.

bind won't install because of a mknod problem

bind requires some special device nodes inside its chroot jail located in /var/lib/named. Vserver security does not allow device node creation, so you will have to install the package specifying --excludepath=/var/lib/named/dev and then create devices /dev/null and /dev/random from outside of the vserver context.

UPDATE: vpoldek doesn't allow the --excludepath option:

poldek: unrecognized option `--excludepath=/var/lib/named/dev'

An alternative method is to write in poldek.conf:

rpmdef = _netsharedpath /dev:/var/lib/named/dev

or in /vservers/test/etc/rpm/macros:

%_netsharedpath     /dev:/var/lib/named/dev

To run bind you will have to change one more thing. PLD version of bind uses chroot for extra security and vserver security removes all special kernel capabilities. To allow chrooting inside your DNS vserver, use the following:

# echo CAP_SYS_RESOURCE >> /etc/vservers/test/bcapabilities

http://www.solucorp.qc.ca/howto.hc?projet=vserver&id=72

You can use lcap program to see available capabilities:

# lcap
Current capabilities: 0xFFFFFEFF
   0) *CAP_CHOWN                   1) *CAP_DAC_OVERRIDE
   2) *CAP_DAC_READ_SEARCH         3) *CAP_FOWNER
   4) *CAP_FSETID                  5) *CAP_KILL
   6) *CAP_SETGID                  7) *CAP_SETUID
   8)  CAP_SETPCAP                 9) *CAP_LINUX_IMMUTABLE
  10) *CAP_NET_BIND_SERVICE       11) *CAP_NET_BROADCAST
  12) *CAP_NET_ADMIN              13) *CAP_NET_RAW
  14) *CAP_IPC_LOCK               15) *CAP_IPC_OWNER
  16) *CAP_SYS_MODULE             17) *CAP_SYS_RAWIO
  18) *CAP_SYS_CHROOT             19) *CAP_SYS_PTRACE
  20) *CAP_SYS_PACCT              21) *CAP_SYS_ADMIN
  22) *CAP_SYS_BOOT               23) *CAP_SYS_NICE
  24) *CAP_SYS_RESOURCE           25) *CAP_SYS_TIME
  26) *CAP_SYS_TTY_CONFIG
    * = Capabilities currently allowed

syslog-ng won't run

There is no access to klogd inside vservers so all you have to do is change the following line in the config file:

source src { pipe ("/proc/kmsg" log_prefix("kernel: ")); unix-stream("/dev/log"); internal(); };

Into:

source src { unix-stream("/dev/log"); internal(); };
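
The edit can also be done with sed. This is a sketch that assumes the source line looks like the one shown above; drop_kmsg_source is a hypothetical name, and it filters stdin so you can review the result before replacing the real config file.

```shell
# Hypothetical helper: remove the pipe("/proc/kmsg" ...) driver from
# syslog-ng source definitions read on stdin; other drivers are left untouched.
drop_kmsg_source() {
    sed -e 's|pipe *("/proc/kmsg"[^;]*); *||'
}
```

For example, drop_kmsg_source < syslog-ng.conf > syslog-ng.conf.new writes a filtered copy next to the original.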

Running openvpn inside vserver

You need to:

  • create /dev/net/tun:
    # mkdir -p /vservers/test/dev/net
    # mknod -m 660 /vservers/test/dev/net/tun c 10 200
  • ~hide_netif
    # echo '~hide_netif' >> /etc/vservers/test/flags
  • grant CAP_NET_ADMIN
    # echo CAP_NET_ADMIN >> /etc/vservers/test/bcapabilities
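
Appending with >> adds a duplicate line every time the command is repeated. A small sketch of an idempotent variant follows; grant_bcap and the VSERVER_CONF override are hypothetical names, not util-vserver commands.

```shell
# VSERVER_CONF is an assumed override (handy for dry runs); the real tree is /etc/vservers.
VSERVER_CONF=${VSERVER_CONF:-/etc/vservers}

# Hypothetical helper: grant_bcap GUEST CAPABILITY
# Appends the capability to the guest's bcapabilities file unless it is already listed.
grant_bcap() {
    [ -d "$VSERVER_CONF/$1" ] || { echo "no such guest: $1" >&2; return 1; }
    file=$VSERVER_CONF/$1/bcapabilities
    grep -qxF "$2" "$file" 2>/dev/null || echo "$2" >> "$file"
}
```

For example, grant_bcap test CAP_NET_ADMIN can be run repeatedly and leaves a single CAP_NET_ADMIN line.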

Can't use ssh xauth forwarding

Workaround: disable X11UseLocalhost in sshd_config

Mount failed for selinuxfs on /selinux: Operation not permitted

When starting a guest with init style set to plain and a newer libselinux, you may see an error message like this. It happens because init executes a function from libselinux which tries to mount /selinux. Disable selinux for the guest by doing:

echo "SELINUX_INIT=no" > /etc/vservers/<guest>/apps/init/environment

or in .defaults (to disable for all guests).

Not enough space on /tmp

Just after installation, a 16MB RAM-based filesystem is mounted on /tmp in each vserver. If you want your /tmp filesystem to be bigger, reside on a different device or not be mounted at all, see /etc/vservers/test/fstab.

Disabling interface

It can be convenient to disable an interface so it won't be activated on vserver boot:

# touch /etc/vservers/test/interfaces/0/disabled

Display mounts of each xid (vserver)

for a in /proc/virtual/[0-9]*; do \
 xid=$(basename $a /); \
 echo "xid: $xid"; \
 vnamespace -e $xid -- cat /proc/mounts | sed -e "s,^,   $xid: ,"; \
done

And similarly to unmount /opt/storage on all running vservers:

for a in /proc/virtual/[0-9]*; do \
 xid=$(basename $a /); \
 echo "xid: $xid"; \
 vnamespace -e $xid -- umount /opt/storage; \
done

The last sample is needed if you want to umount /opt/storage completely on the host: vservers inherit mounts at startup (even if they don't use them), so otherwise you can't umount /opt/storage.

squid won't start: FATAL: setrlimit: RLIMIT_NOFILE: (1) Operation not permitted

# echo CAP_SYS_RESOURCE >> /etc/vservers/test/bcapabilities

Making vserver automatically startup on host boot

Install the util-vserver-init package, then read and edit /etc/sysconfig/vservers.

Vservers startup order

Sometimes you need to be sure that one vserver is started before the others, e.g. because it provides a service that the others depend on. Vserver provides an easy way to do this. Let's assume that the test2 vserver depends on the test and foo vservers:

# echo test >> /etc/vservers/test2/apps/init/depends
# echo foo >> /etc/vservers/test2/apps/init/depends

At shutdown, the test2 vserver will be stopped before its dependencies.
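
A typo in a depends file is easy to miss. The sketch below lists dependencies that have no configuration directory of their own; missing_depends and the VSERVER_CONF override are hypothetical names, and the one-name-per-line format is taken from the echo commands above.

```shell
# VSERVER_CONF is an assumed override (handy for dry runs); the real tree is /etc/vservers.
VSERVER_CONF=${VSERVER_CONF:-/etc/vservers}

# Hypothetical helper: missing_depends GUEST
# Prints every name from GUEST's apps/init/depends that has no config directory.
missing_depends() {
    dep_file=$VSERVER_CONF/$1/apps/init/depends
    [ -f "$dep_file" ] || return 0
    while read -r dep; do
        [ -n "$dep" ] || continue
        [ -d "$VSERVER_CONF/$dep" ] || echo "$dep"
    done < "$dep_file"
}
```

For example, missing_depends test2 prints nothing when both test and foo are configured.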

Logging vserver start/stop messages using syslog-ng

It is possible to log system startup/shutdown messages for guest systems on host system. For each guest that you wish to log please do:

mkfifo /vservers/<name>/dev/console

If you wish to log each guest to a separate log file, add the following entries to your /etc/syslog-ng/syslog-ng.conf:

# define new log source for each guest
source vserver_name { pipe ("/vservers/name/dev/console"); };

# define destination for each guest
destination vserver_name { file("/var/log/vserver_name.log"); };

# log each vserver guest
log { source(vserver_name); destination(vserver_name); };

It is also possible to log all guests to single log file and just prefix log entries with guest name.

# define log source for guests, prefix each one with guest name
source vservers { pipe ("/vservers/test1/dev/console" log_prefix("test1: "));
                  pipe ("/vservers/test2/dev/console" log_prefix("test2: "));
                  pipe ("/vservers/test3/dev/console" log_prefix("test3: ")); };

# define destination for vservers log
destination vservers { file("/var/log/vservers"); };

# log vserver guest start/stop messages
log { source(vservers); destination(vservers); };
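
With many guests, the combined source block can be generated instead of typed. A sketch, assuming guests live under /vservers as in the examples above; vserver_log_source is a hypothetical name.

```shell
# Hypothetical helper: vserver_log_source GUEST...
# Prints a syslog-ng source block with one prefixed console pipe per guest.
vserver_log_source() {
    printf 'source vservers {\n'
    for guest in "$@"; do
        printf '    pipe ("/vservers/%s/dev/console" log_prefix("%s: "));\n' "$guest" "$guest"
    done
    printf '};\n'
}
```

For example, vserver_log_source test1 test2 test3 prints a block equivalent to the source definition above.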

Vserver guest on physical console

If you wish to have your guest vserver available on a physical console, let's say /dev/tty2, do the following:

  • comment out tty2 in /etc/inittab on host machine
    #2:2345:respawn:/sbin/mingetty tty2
  • copy /dev/tty2 from host machine as /vservers/name/dev/tty2
  • comment out all ttys in /vservers/name/etc/inittab except tty2; it is a good idea to comment out the unused ttys anyway to suppress errors like
    INIT: Id "1" respawning too fast: disabled for 5 minutes
  • press ALT+F2 and login to your guest vserver

Running a 32-bit vserver on a 64-bit host

With a recent util-vserver package you can create 32-bit guest systems inside a 64-bit host.

To specify the arch during guest creation, use the -d option, and to change what uname returns, use the arguments --personality linux_32bit --machine i686:

# vserver test build --context <num> -n test -m poldek -- -d pld-th-i686 --personality linux_32bit --machine i686

If you need to set uts parameters afterwards, you can just echo them:

# echo linux_32bit >> /etc/vservers/test/personality
# echo i686 > /etc/vservers/test/uts/machine

Package built for different operating system (linux)

When upgrading packages on vservers with a recent rpm, you might run into an error with this message:

error: package.arch: package is for a different operating system (linux)

It can be resolved by copying rpm platform information from the host system to the vserver's settings directory:

# cp /usr/lib/util-vserver/distributions/defaults/rpm/platform \
  /etc/vservers/<NAME>/apps/pkgmgmt/base/rpm/etc/platform

or you can run this script to update all vservers:

#!/bin/sh

p=/usr/lib*/util-vserver/distributions/defaults/rpm/platform
for a in /etc/vservers/*/apps/pkgmgmt/base/rpm/etc/macros; do
    [ -f "$a" ] || continue
    f=${a%/macros}/platform
    [ ! -f "$f" ] || continue
    cp $p $f
done

This script doesn't affect newly created vservers.

Also beware that if you have i686 guests on an x86_64 host, the platform file would contain illegal x86_64 entries.

Can't upgrade FHS package

You will most likely get an error like:

error: unpacking of archive failed on file /proc: cpio: chown failed - Operation not permitted

The fix is to add /proc to /etc/vservers/test/apps/pkgmgmt/base/rpm/etc/macros %_netsharedpath list:

%_netsharedpath     /dev:/proc

If you have internalized the rpmdb, the macro file is /vservers/test/etc/rpm/macros instead.

loopback in kernel with vserver 2.3 series

How to enable and disable loopback addresses in vserver 2.3 series so 127.0.0.1 will work in guest.

If your kernel has CONFIG_VSERVER_AUTO_LBACK=y, then loopback addresses will be assigned and made visible in your guests automatically. You can disable that on a per-guest basis by doing:

echo "~lback_remap" >> /etc/vservers/xyz/nflags
echo "~hide_lback" >> /etc/vservers/xyz/nflags

If your kernel has the CONFIG_VSERVER_AUTO_LBACK option disabled, you can still get automatic loopback addresses on a per-guest basis by doing:

echo "lback_remap" >> /etc/vservers/xyz/nflags
echo "hide_lback" >> /etc/vservers/xyz/nflags

(util-vserver 0.30.214 or newer needed)

binding to address 0.0.0.0 binds only to single IP

Newer Vserver from the 2.3 series allows the administrator to enable special handling of network contexts for guests with a single IP only. The default value for this option is compiled into the kernel as CONFIG_VSERVER_AUTO_SINGLE. When it is enabled, any service configured to bind to all available IP addresses will bind only to the single IP configured in /etc/vservers/guest/interfaces. It will not even bind to the loopback interface.

To enable special handling of network contexts in guests with a single IP do:

echo "single_ip" >> /etc/vservers/xyz/nflags

Similarly, to disable this option if it's enabled in the kernel do:

echo "~single_ip" >> /etc/vservers/xyz/nflags
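
Repeated echo >> commands can leave both a flag and its negation in nflags. The sketch below keeps exactly one form of a flag in the file; set_nflag and the VSERVER_CONF override are hypothetical names, and the "~" prefix for negation is taken from the examples above.

```shell
# VSERVER_CONF is an assumed override (handy for dry runs); the real tree is /etc/vservers.
VSERVER_CONF=${VSERVER_CONF:-/etc/vservers}

# Hypothetical helper: set_nflag GUEST FLAG on|off
# Removes any existing FLAG / ~FLAG line, then appends the requested form.
set_nflag() {
    case $3 in
        on)  want=$2 ;;
        off) want="~$2" ;;
        *)   return 2 ;;
    esac
    [ -d "$VSERVER_CONF/$1" ] || return 1
    file=$VSERVER_CONF/$1/nflags
    tmp=$file.tmp.$$
    {
        if [ -f "$file" ]; then
            # grep exits non-zero when nothing is left; that is not an error here
            grep -vxF -e "$2" -e "~$2" "$file" || true
        fi
        echo "$want"
    } > "$tmp"
    mv "$tmp" "$file"
}
```

For example, set_nflag xyz single_ip off leaves a single ~single_ip line and keeps all other flags.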

SMACK enabled kernels

Smack enabled kernels (in PLD default kernel >= 2.6.25) use security.SMACK64 to store some data. Unfortunately, vserver by default doesn't allow changing xattrs. This can lead to problems like this:

# pwconv
Cannot set attribute security.SMACK64 for `/etc/passwd.tmpbPZiEN': Operation not permitted
Error while converting `root' to shadow account.

There are two solutions for this. First is to enable setfcap capability (NOTE: it enables in guest much more than is needed by smack, so seriously consider security implications for that!):

echo SETFCAP >> /etc/vservers/xyz/bcapabilities

Second one is disabling SMACK entirely if not needed. This can be done by choosing other security module to be used by default (capability, selinux) using kernel boot command line option:

security=capability (< 2.6.27)
security=default (>= 2.6.27)

Note: this option is available in vanilla kernels >= 2.6.26 and backported to PLD >= 2.6.25.9.

kernel oopses at pick_next_task_fair

Almost all kernels (including 2.6.27.x and 2.6.30/31) with vserver patch have a bug that causes oopses in pick_next_task_fair when using `sched_hard' in vserver/xyz/flags.

Temporary solution is to avoid using sched_hard. Latest 2.6.31 patches contain different way to get behaviour similar to sched_hard - it's called CFS hard limit and is explained in kernel documentation (in vserver patch).

When using nice and su (for example, in the updatedb cron job), I get: su: Permission denied.

A guest cannot lower its nice value - and that's what 'su' does through pam_limits which sets a nice value of 0.

Solution: set the SYS_NICE bcapability for the guest to allow it to lower its nice value.

Why is there no used memory reported on 2.6.33 inside a vserver guest?

2.6.33 started to use cgroups for accounting, and since no cgroup is configured by default, the accounted used memory is 0. Drop the virt_mem flag or set a cgroup memory limit. Look at http://linux-vserver.org/util-vserver:Cgroups for more information.

Running auditd inside guest

You need CAP_AUDIT_CONTROL in bcapabilities and lower priority_boost to 0 in /etc/audit/auditd.conf

After upgrading from a 2.6-3.4 kernel (possibly other versions) to 3.18 (tested; possibly other versions), the kernel oopses almost immediately after accessing some files on an xfs filesystem, with xfs_filestream_lookup_ag (or another filestream-related function) visible in the stack trace.

That's because the vserver patch for kernels earlier than 2.6.23 patched the xfs filesystem to introduce a new flag:

#define XFS_XFLAG_BARRIER     0x00004000      /* chroot() barrier */

and files/dirs with this flag got saved on your filesystem.

Starting with 2.6.23, the kernel introduced filestreams, which use the 0x00004000 bit, thus conflicting with vserver.

#define XFS_XFLAG_FILESTREAM   0x00004000      /* use filestream allocator */

Vserver stopped adding this xfs xflag in 3.13, BUT your existing filesystem can still have XFS_XFLAG_BARRIER (0x00004000) set, causing oopses in newer kernels.

How to find out if I'm affected?

If you don't use the filestream feature, modify http://oss.sgi.com/cgi-bin/gitweb.cgi?p=xfs/cmds/xfstests.git;a=blob_plain;f=src/bstat.c;hb=HEAD to show only files with XFS_XFLAG_FILESTREAM set:

diff --git a/src/bstat.c b/src/bstat.c
index 4e22ecd..887512f 100644
--- a/src/bstat.c
+++ b/src/bstat.c
@@ -34,19 +34,21 @@ dotime(void *ti, char *s)
 void
 printbstat(xfs_bstat_t *sp)
 {
-       printf("ino %lld mode %#o nlink %d uid %d gid %d rdev %#x\n",
-               (long long)sp->bs_ino, sp->bs_mode, sp->bs_nlink,
-               sp->bs_uid, sp->bs_gid, sp->bs_rdev);
-       printf("\tblksize %d size %lld blocks %lld xflags %#x extsize %d\n",
-               sp->bs_blksize, (long long)sp->bs_size, (long long)sp->bs_blocks,
-               sp->bs_xflags, sp->bs_extsize);
-       dotime(&sp->bs_atime, "atime");
-       dotime(&sp->bs_mtime, "mtime");
-       dotime(&sp->bs_ctime, "ctime");
-       printf( "\textents %d %d gen %d\n",
-               sp->bs_extents, sp->bs_aextents, sp->bs_gen);
-       printf( "\tDMI: event mask 0x%08x state 0x%04x\n",
-               sp->bs_dmevmask, sp->bs_dmstate);
+       if (sp->bs_xflags & XFS_XFLAG_FILESTREAM) {
+               printf("ino %lld mode %#o nlink %d uid %d gid %d rdev %#x\n",
+                               (long long)sp->bs_ino, sp->bs_mode, sp->bs_nlink,
+                               sp->bs_uid, sp->bs_gid, sp->bs_rdev);
+               printf("\tblksize %d size %lld blocks %lld xflags %#x extsize %d\n",
+                               sp->bs_blksize, (long long)sp->bs_size, (long long)sp->bs_blocks,
+                               sp->bs_xflags, sp->bs_extsize);
+               dotime(&sp->bs_atime, "atime");
+               dotime(&sp->bs_mtime, "mtime");
+               dotime(&sp->bs_ctime, "ctime");
+               printf( "\textents %d %d gen %d\n",
+                               sp->bs_extents, sp->bs_aextents, sp->bs_gen);
+               printf( "\tDMI: event mask 0x%08x state 0x%04x\n",
+                               sp->bs_dmevmask, sp->bs_dmstate);
+       }
 }

and then run it on the mount point of each filesystem (bstat /; bstat /home etc). It will print “ino …” information for files that have the filestream bit set.
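
If you have xflags values from an unpatched bstat or another tool, the conflicting bit can be tested with shell arithmetic. A sketch; has_filestream_bit is a hypothetical name, and 0x4000 is the 0x00004000 value from the defines above.

```shell
# Hypothetical helper: succeed if the given xflags value (decimal or 0x-prefixed hex)
# has the 0x00004000 bit set (XFS_XFLAG_FILESTREAM, formerly XFS_XFLAG_BARRIER).
has_filestream_bit() {
    [ $(( $1 & 0x4000 )) -ne 0 ]
}
```

For example, has_filestream_bit 0x4002 succeeds, while has_filestream_bit 0x2 does not.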

How to clean up?

Rsync the files to another partition, recreate the problematic partition and then copy the files back.

Debian or Ubuntu guest installation

Install the binutils package and optionally debootstrap (vserver will install it on its own if you don't install it yourself):

# vserver test build -m debootstrap --context 1234 -- -d etch -m http://ftp.pl.debian.org/debian -- --arch i386
Could not find local version of 'debootstrap'; downloading it from
http://ftp.pl.debian.org/debian/pool/main/d/debootstrap/debootstrap_1.0.3_all.deb...
11:01:58 URL:http://ftp.pl.debian.org/debian/pool/main/d/debootstrap/debootstrap_1.0.3_all.deb [49086/49086] -> "/var/tmp/debootstrap.Rseedf/debootstrap.deb" [1]
I: Retrieving Release
I: Retrieving Packages
I: Validating Packages
I: Resolving dependencies of required packages...
I: Resolving dependencies of base packages...
I: Checking component main on http://ftp.pl.debian.org/debian...
I: Retrieving adduser
I: Validating adduser
I: Retrieving apt
I: Validating apt
[...]
I: Extracting zlib1g...
I: Installing core packages...
I: Unpacking required packages...
I: Unpacking base-files...
I: Unpacking base-passwd...
I: Unpacking bash...
I: Unpacking bsdutils...
[...]
I: Unpacking zlib1g...
I: Configuring required packages...
I: Configuring sysv-rc...
I: Configuring tzdata...
I: Configuring gcc-4.1-base...
[...]
I: Configuring debconf-i18n...
I: Configuring debconf...
I: Unpacking the base system...
I: Unpacking adduser...
I: Unpacking apt...
[...]
I: Configuring sysklogd...
I: Configuring tasksel...
I: Base system installed successfully.
# ls /vservers/test/
bin  boot  dev  etc  home  initrd  lib  media  mnt  opt  proc  root  sbin  srv  sys  tmp  usr  var

Set up guest hostname:

# echo test > /etc/vservers/test/uts/nodename

Done.

Note that the file /usr/lib{,64}/util-vserver/defaults/debootstrap.uri may need its URL updated to point to a newer debootstrap version if the old one is no longer there.

Possible Debian -d (distributions): squeeze, etch, lenny, sarge, sid. Popular --arch: i386, amd64, powerpc. Possible Ubuntu distributions: breezy, dapper, edgy, feisty, gutsy, hoary.

Note that upstart in some Ubuntu distributions is broken and needs this workaround to get running:

echo TERM=linux >> /etc/vservers/VSERVER_NAME/apps/init/environment

CentOS guest installation

Install yum and yum-metadata-parser packages.

# vserver test build -n test --context 105 -m yum -- -d centos5

=============================================================================
 Package                 Arch       Version          Repository        Size
=============================================================================
Installing:
 glibc                   i686       2.5-12           base              5.1 M
Installing for dependencies:
 basesystem              noarch     8.0-5.1.1.el5.centos  base              2.8 k
 filesystem              i386       2.4.0-1.el5.centos  base              116 k
 glibc-common            i386       2.5-12           base               16 M
 libgcc                  i386       4.1.1-52.el5.2   updates            82 k
 setup                   noarch     2.5.58-1.el5     base              126 k
 tzdata                  noarch     2007h-1.el5      updates           746 k

Transaction Summary
=============================================================================
Install      7 Package(s)
Update       0 Package(s)
Remove       0 Package(s)

Total download size: 22 M
warning: rpmts_HdrFromFdno: Header V3 DSA signature: NOKEY, key ID e8562897
Importing GPG key 0xE8562897 "CentOS-5 Key (CentOS 5 Official Signing Key) <centos-5-key@centos.org>" from http://mirror.centos.org/centos/RPM-GPG-KEY-CentOS-5

Installed: glibc.i686 0:2.5-12
Dependency Installed: basesystem.noarch 0:8.0-5.1.1.el5.centos filesystem.i386 0:2.4.0-1.el5.centos glibc-common.i386 0:2.5-12 libgcc.i386 0:4.1.1-52.el5.2 setup.noarch 0:2.5.58-1.el5 tzdata.noarch 0:2007h-1.el5

=============================================================================
 Package                 Arch       Version          Repository        Size
=============================================================================
Installing for dependencies:
 MAKEDEV                 i386       3.23-1.2         base              135 k
 SysVinit                i386       2.86-14          base              113 k
 audit-libs              i386       1.3.1-1.el5      base               39 k
 bash                    i386       3.1-16.1         base              1.8 M
 bzip2-libs              i386       1.0.3-3          base               37 k
 centos-release          i386       10:5-0.0.el5.centos.2  base               19 k
 centos-release-notes    i386       5.0.0-2          base              112 k
 chkconfig               i386       1.3.30.1-1       base              158 k
 coreutils               i386       5.97-12.1.el5    base              3.6 M
 cracklib                i386       2.8.9-3.1        base               58 k
 cracklib-dicts          i386       2.8.9-3.1        base              3.3 M
 db4                     i386       4.3.29-9.fc6     base              917 k
 device-mapper           i386       1.02.13-1.el5    base              582 k
 e2fsprogs               i386       1.39-8.el5       base              957 k
 e2fsprogs-libs          i386       1.39-8.el5       base              112 k
 ethtool                 i386       5-1.el5          base               60 k
 findutils               i386       1:4.2.27-4.1     base              294 k
 gawk                    i386       3.1.5-14.el5     base              1.7 M
 gdbm                    i386       1.8.0-26.2.1     base               27 k
 glib2                   i386       2.12.3-2.fc6     base              677 k
 grep                    i386       2.5.1-54.2.el5   base              174 k
 info                    i386       4.8-14.el5       base              172 k
 initscripts             i386       8.45.14.EL-1.el5.centos.1  base              1.4 M
 iproute                 i386       2.6.18-4.el5     base              801 k
 iputils                 i386       20020927-43.el5  base              124 k
 krb5-libs               i386       1.5-29           updates           592 k
 libacl                  i386       2.2.39-1.1       base               19 k
 libattr                 i386       2.4.32-1.1       base               12 k
 libcap                  i386       1.10-26          base               22 k
 libselinux              i386       1.33.4-2.el5     base               93 k
 libsepol                i386       1.15.2-1.el5     base              129 k
 libstdc++               i386       4.1.1-52.el5.2   updates           350 k
 libtermcap              i386       2.0.8-46.1       base               14 k
 mcstrans                i386       0.1.10-1.el5     base               15 k
 mingetty                i386       1.07-5.2.2       base               19 k
 mktemp                  i386       3:1.5-23.2.2     base               14 k
 module-init-tools       i386       3.3-0.pre3.1.16.0.1.el5  updates           411 k
 ncurses                 i386       5.5-24.20060715  base              1.1 M
 net-tools               i386       1.60-73          base              359 k
 openssl                 i686       0.9.8b-8.3.el5_0.2  updates           1.4 M
 pam                     i386       0.99.6.2-3.14.el5  base              923 k
 pcre                    i386       6.6-2.el5_0.1    updates           112 k
 popt                    i386       1.10.2-37.el5    base               67 k
 procps                  i386       3.2.7-8.1.el5    base              207 k
 psmisc                  i386       22.2-5           base               61 k
 python                  i386       2.4.3-19.el5     base              5.9 M
 readline                i386       5.1-1.1          base              223 k
 sed                     i386       4.1.5-5.fc6      base              174 k
 shadow-utils            i386       2:4.0.17-12.el5  base              1.0 M
 sysklogd                i386       1.4.1-39.2       base               73 k
 termcap                 noarch     1:5.5-1.20060701.1  base              265 k
 udev                    i386       095-14.5.el5     base              877 k
 util-linux              i386       2.13-0.44.el5    base              1.8 M
 zlib                    i386       1.2.3-3          base               50 k

Transaction Summary
=============================================================================
Install     54 Package(s)
Update       0 Package(s)
Remove       0 Package(s)

Total download size: 34 M

Dependency Installed: MAKEDEV.i386 0:3.23-1.2 SysVinit.i386 0:2.86-14 audit-libs.i386 0:1.3.1-1.el5 bash.i386 0:3.1-16.1 bzip2-libs.i386 0:1.0.3-3 centos-release.i386 10:5-0.0.el5.centos.2 centos-release-notes.i386 0:5.0.0-2 chkconfig.i386 0:1.3.30.1-1 coreutils.i386 0:5.97-12.1.el5 cracklib.i386 0:2.8.9-3.1 cracklib-dicts.i386 0:2.8.9-3.1 db4.i386 0:4.3.29-9.fc6 device-mapper.i386 0:1.02.13-1.el5 e2fsprogs.i386 0:1.39-8.el5 e2fsprogs-libs.i386 0:1.39-8.el5 ethtool.i386 0:5-1.el5 findutils.i386 1:4.2.27-4.1 gawk.i386 0:3.1.5-14.el5 gdbm.i386 0:1.8.0-26.2.1 glib2.i386 0:2.12.3-2.fc6 grep.i386 0:2.5.1-54.2.el5 info.i386 0:4.8-14.el5 initscripts.i386 0:8.45.14.EL-1.el5.centos.1 iproute.i386 0:2.6.18-4.el5 iputils.i386 0:20020927-43.el5 krb5-libs.i386 0:1.5-29 libacl.i386 0:2.2.39-1.1 libattr.i386 0:2.4.32-1.1 libcap.i386 0:1.10-26 libselinux.i386 0:1.33.4-2.el5 libsepol.i386 0:1.15.2-1.el5 libstdc++.i386 0:4.1.1-52.el5.2 libtermcap.i386 0:2.0.8-46.1 mcstrans.i386 0:0.1.10-1.el5 mingetty.i386 0:1.07-5.2.2 mktemp.i386 3:1.5-23.2.2 module-init-tools.i386 0:3.3-0.pre3.1.16.0.1.el5 ncurses.i386 0:5.5-24.20060715 net-tools.i386 0:1.60-73 openssl.i686 0:0.9.8b-8.3.el5_0.2 pam.i386 0:0.99.6.2-3.14.el5 pcre.i386 0:6.6-2.el5_0.1 popt.i386 0:1.10.2-37.el5 procps.i386 0:3.2.7-8.1.el5 psmisc.i386 0:22.2-5 python.i386 0:2.4.3-19.el5 readline.i386 0:5.1-1.1 sed.i386 0:4.1.5-5.fc6 shadow-utils.i386 2:4.0.17-12.el5 sysklogd.i386 0:1.4.1-39.2 termcap.noarch 1:5.5-1.20060701.1 udev.i386 0:095-14.5.el5 util-linux.i386 0:2.13-0.44.el5 zlib.i386 0:1.2.3-3
# ls /vservers/test/
bin  boot  dev  etc  home  lib  media  mnt  opt  proc  root  sbin  selinux  srv  sys  tmp  usr  var  vservers

As you can see, there is a /vservers directory inside our new guest. This is probably due to a bug in either yum itself or yum-chroot.patch from the util-vserver package. This bug also causes many errors like:

could not open ts_done file: [Errno 2] No such file or directory: '/vservers/test//var/lib/yum/transaction-done.2007-11-14.13:40.11'

These errors may be safely ignored (they were deleted from the example above), and the directory may be removed:

# rm -rf /vservers/test/vservers/

Please keep in mind that there will be no messages on screen while yum is working in the background. It only displays results when finished, so be patient :-) You may also install the older CentOS 4 by using -d centos4.

Set up guest hostname:

# echo test > /etc/vservers/test/uts/nodename

You may also wish to run pwconv inside guest system.

internalized package management

If you wish to use yum or rpm inside the newly created guest, you must do a few more things.

Install yum:

# vyum test -- install yum

=============================================================================
 Package                 Arch       Version          Repository        Size
=============================================================================
Installing:
 yum                     noarch     3.0.5-1.el5.centos.2  base              481 k
Installing for dependencies:
 beecrypt                i386       4.1.2-10.1.1     base              116 k
 elfutils-libelf         i386       0.125-3.el5      base               52 k
 expat                   i386       1.95.8-8.2.1     base               77 k
 m2crypto                i386       0.16-6.el5.1     base              487 k
 python-elementtree      i386       1.2.6-5          base               83 k
 python-sqlite           i386       1.1.7-1.2.1      base               39 k
 python-urlgrabber       noarch     3.1.0-2          base              127 k
 rpm                     i386       4.4.2-37.el5     base              638 k
 rpm-libs                i386       4.4.2-37.el5     base              966 k
 rpm-python              i386       4.4.2-37.el5     base               53 k
 sqlite                  i386       3.3.6-2          base              213 k

Transaction Summary
=============================================================================
Install     12 Package(s)
Update       0 Package(s)
Remove       0 Package(s)

Total download size: 3.3 M
Is this ok [y/N]: y

Installed: yum.noarch 0:3.0.5-1.el5.centos.2
Dependency Installed: beecrypt.i386 0:4.1.2-10.1.1 elfutils-libelf.i386 0:0.125-3.el5 expat.i386 0:1.95.8-8.2.1 m2crypto.i386 0:0.16-6.el5.1 python-elementtree.i386 0:1.2.6-5 python-sqlite.i386 0:1.1.7-1.2.1 python-urlgrabber.noarch 0:3.1.0-2 rpm.i386 0:4.4.2-37.el5 rpm-libs.i386 0:4.4.2-37.el5 rpm-python.i386 0:4.4.2-37.el5 sqlite.i386 0:3.3.6-2

Run pkgmgmt internalize:

# vserver test pkgmgmt internalize

Since CentOS uses a different version of db, you will get the following errors when trying to use vyum/vrpm outside the guest or yum/rpm inside the guest:

# vrpm test -- -qa
rpmdb: Program version 4.3 doesn't match environment version
error: db4 error(-30974) from dbenv->open: DB_VERSION_MISMATCH: Database environment version mismatch
error: cannot open Packages index using db3 -  (-30974)
error: cannot open Packages database in /var/lib/rpm

To fix this, execute the following commands:

# vserver test start
# vserver test enter
bash-3.1# rm -f /var/lib/rpm/__db.*
bash-3.1# rpm --rebuilddb

If that doesn't work, try the following:

# cd /vservers/test/var/lib/rpm
# rm -f __db.*
# db_dump Packages > Packages.dump
# vserver test start
# vserver test enter
bash-3.1# cd /var/lib/rpm
bash-3.1# rm Packages
bash-3.1# db_load Packages < Packages.dump
bash-3.1# rpm --rebuilddb
bash-3.1# rm Packages.dump

Using quota in vservers

To enable quota in a vserver you need to:

  • enable quota on the “real” device mounted in vserver (in /etc/fstab):
    /dev/space/vserver1_home        /vservers/test/home xfs     defaults,usrquota       0       0
  • load the vroot module and add it to your /etc/modules. You can optionally increase the maximum number of vroot devices by putting the limit in your /etc/modprobe.conf:
    options vroot max_vroot=64
  • assign a free vroot node to the device in question:
    # vrsetup /dev/vroot3 /dev/space/vserver1_home
  • copy the vroot device to the guest:
    # cp -af /dev/vroot3 /vservers/test/dev/
  • add to /etc/vservers/test/apps/init/mtab:
    /dev/vroot3     /home/    xfs     defaults,usrquota        0       0
  • add quota_ctl to /etc/vservers/test/ccapabilities
  • restart your vserver and run edquota inside it
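The two configuration-file steps from the list above could be scripted roughly as follows. This is only a sketch: the function name quota_guest_cfg and its arguments are made up for illustration, and it assumes the mtab entry uses the same defaults,usrquota options as the example above.

```shell
#!/bin/sh
# quota_guest_cfg CFGDIR VROOT MOUNTPOINT FSTYPE
# (hypothetical helper) Adds the guest mtab entry and the quota_ctl
# capability under the guest configuration directory CFGDIR.
quota_guest_cfg() {
        cfg="$1"
        vroot="$2"
        mnt="$3"
        fs="$4"
        mkdir -p "$cfg/apps/init" || return 1
        # same field layout as an fstab line
        printf '%s\t%s\t%s\tdefaults,usrquota\t0\t0\n' \
                "$vroot" "$mnt" "$fs" >> "$cfg/apps/init/mtab"
        echo quota_ctl >> "$cfg/ccapabilities"
}

# Example (as root on the host):
#   quota_guest_cfg /etc/vservers/test /dev/vroot3 /home/ xfs
```

The vrsetup and cp steps still have to be run by hand, since they operate on real device nodes.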

Network namespace in vservers

Starting from util-vserver 0.30.216-1.pre3054 there is basic support for creating network namespaces with interfaces inside.

Enable netns and two capabilities: NET_ADMIN (allows interfaces in the guest to be managed) and NET_RAW (makes iptables work).

mkdir /etc/vservers/test/spaces
touch /etc/vservers/test/spaces/net
echo NET_ADMIN >> /etc/vservers/test/bcapabilities
echo NET_RAW >> /etc/vservers/test/bcapabilities
echo 'plain' > /etc/vservers/test/apps/init/style
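Because the commands above append with >>, running them a second time leaves duplicate capability lines. A minimal sketch of an idempotent variant follows; add_cap is a hypothetical helper name, not part of util-vserver.

```shell
#!/bin/sh
# add_cap FILE CAP (hypothetical helper):
# append CAP to FILE unless an identical line is already there.
add_cap() {
        file="$1"
        cap="$2"
        # -x matches whole lines, -F takes the name literally, -q is silent
        if [ -f "$file" ] && grep -qxF "$cap" "$file"; then
                return 0
        fi
        echo "$cap" >> "$file"
}

# Example:
#   add_cap /etc/vservers/test/bcapabilities NET_ADMIN
#   add_cap /etc/vservers/test/bcapabilities NET_RAW
```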

Avoid context isolation since it makes little sense when using network namespaces:

touch /etc/vservers/test/noncontext

Configure interfaces:

  • 0 - arbitrary directory name, used only for ordering
  • myiface0 - interface name inside the guest (optional; defaults to geth0, geth1 and so on)
  • veth-host - interface name on the host side

mkdir -p /etc/vservers/test/netns/interfaces/0
echo myiface0 > /etc/vservers/test/netns/interfaces/0/guest
echo veth-host > /etc/vservers/test/netns/interfaces/0/host

!!! FINISH ME. FINISH ME. FINISH ME. !!!

Network namespace in vservers (OLD WAY)

Enable netns and two capabilities: NET_ADMIN (allows interfaces in the guest to be managed) and NET_RAW (makes iptables work).

The plain init style is needed so that post-start runs as soon as possible (with the plain init style, it runs just after the init process starts).

mkdir /etc/vservers/test/spaces
touch /etc/vservers/test/spaces/net
echo NET_ADMIN >> /etc/vservers/test/bcapabilities
echo NET_RAW >> /etc/vservers/test/bcapabilities
echo 'plain' > /etc/vservers/test/apps/init/style

  • veth-cXYZ - host interface
  • eth-cXYZ - guest interface
  • ifcfg-veth-cXYZ on the host should have ONBOOT=no (it will be started when the vserver starts)

Create /etc/vservers/test/scripts/post-start script:

#!/bin/sh
VSERVER_SCRIPT="$1"
VSERVER_NAME="$2"

CONTEXT=$(cat /etc/vservers/${VSERVER_NAME}/context)
VSERVER_IFACE_SUFFIX="c${CONTEXT}"

VSERVER_HOST_IFACE="veth-${VSERVER_IFACE_SUFFIX}"
VSERVER_GUEST_IFACE="eth-${VSERVER_IFACE_SUFFIX}"

ip link add name "${VSERVER_HOST_IFACE}" type veth peer name "${VSERVER_GUEST_IFACE}"
vserver ${VSERVER_NAME} exec sleep 60 &
for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15; do
        pid=$(vserver ${VSERVER_NAME} exec pidof -s sleep)
        [ -n "$pid" ] && break
        usleep 100000
done
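if [ -z "$pid" ]; then
        echo "vserver guest $VSERVER_NAME: failed to find guest net namespace" >&2
        # without a guest pid the interface cannot be moved; drop the unused veth pair
        ip link del "${VSERVER_HOST_IFACE}" 2> /dev/null
        exit 0
fi
ip link set "${VSERVER_GUEST_IFACE}" netns "$pid"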
sysctl -q -w net.ipv4.conf.${VSERVER_HOST_IFACE}.forwarding=1
/sbin/ifup "${VSERVER_HOST_IFACE}"
exit 0
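The polling loop in the script above can be written as a small reusable helper. This is a sketch only: wait_for is a hypothetical name, and it uses sleep instead of usleep, so fractional delays rely on a sleep implementation that accepts them (GNU sleep does).

```shell
#!/bin/sh
# wait_for TRIES DELAY COMMAND... (hypothetical helper):
# run COMMAND until it succeeds, at most TRIES times,
# sleeping DELAY seconds between attempts.
wait_for() {
        tries="$1"
        delay="$2"
        shift 2
        while [ "$tries" -gt 0 ]; do
                "$@" && return 0
                tries=$((tries - 1))
                if [ "$tries" -gt 0 ]; then
                        sleep "$delay"
                fi
        done
        return 1
}

# Example, reusing the lookup from the post-start script:
#   find_guest_pid() {
#           pid=$(vserver "$VSERVER_NAME" exec pidof -s sleep) && [ -n "$pid" ]
#   }
#   wait_for 15 0.1 find_guest_pid || echo "guest pid not found" >&2
```

The helper runs the command in the current shell, so a variable set by the command (pid in the example) remains available afterwards.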

Create /etc/vservers/test/scripts/post-stop script:

#!/bin/sh
VSERVER_SCRIPT="$1"
VSERVER_NAME="$2"


CONTEXT=$(cat /etc/vservers/${VSERVER_NAME}/context)
VSERVER_IFACE_SUFFIX="c${CONTEXT}"

VSERVER_HOST_IFACE="veth-${VSERVER_IFACE_SUFFIX}"
VSERVER_GUEST_IFACE="eth-${VSERVER_IFACE_SUFFIX}"

ip link del "${VSERVER_HOST_IFACE}" 2> /dev/null
exit 0

You end up with one interface on the host and one inside the guest (virtually connected). Configure interfaces and routing as on a normal system.

Notes:

  • vserver name can't be longer than 10 characters. A longer one will produce interface names longer than the limit (15 characters; veth- + vserver name)
  • this method is racy. post-start runs in parallel with the init process inside the guest system. If the guest is faster and tries to configure networking before post-start puts the new interface into the guest, you are doomed. Fortunately this is unlikely to happen, as post-start is short and should always finish before networking is configured by guest scripts. The race could be avoided by implementing proper netns interface moving support in the util-vserver scripts.
  • enabling the pid namespace is likely to break the post-start script (the part that fetches the guest pid for iproute2 netns use). Using vps (aka context 1 spectator mode) to find the guest process pid in the host namespace is likely to solve this problem.
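The 15-character limit on interface names can be checked before any interface is created. The sketch below uses a hypothetical helper, iface_name_ok, which could be called in post-start before ip link add.

```shell
#!/bin/sh
# Linux limits interface names to 15 characters
# (IFNAMSIZ is 16, including the trailing NUL).
IFACE_MAX=15

# iface_name_ok NAME (hypothetical helper):
# succeed if NAME is non-empty and fits within the limit.
iface_name_ok() {
        name="$1"
        [ -n "$name" ] && [ "${#name}" -le "$IFACE_MAX" ]
}

# Example:
#   iface_name_ok "$VSERVER_HOST_IFACE" || {
#           echo "interface name too long: $VSERVER_HOST_IFACE" >&2
#           exit 0
#   }
```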

cgroups

Example cgroups usage:

Create a “cgroup” directory in /etc/vservers/test/ and put files there, for example:

  • cpuset.cpus - numbers of cores used by Vserver: 0-n
  • cpuset.mems - NUMA node numbers
  • cpuset.memory_migrate - memory migration (1 - migrate memory when shutting down cores, 0 - do not)
  • cpu.shares - Vserver's CPU share: for example 256
  • memory.limit_in_bytes - Vserver's RAM: 256M
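As a minimal sketch, the files listed above could be created like this. The helper name write_cgroup_cfg is hypothetical, and the values for cores (0-1) and NUMA node (0) are assumptions that must be adapted to the host; the shares and memory values are the examples from the list.

```shell
#!/bin/sh
# write_cgroup_cfg CFGDIR (hypothetical helper):
# create CFGDIR/cgroup and fill in example limits.
write_cgroup_cfg() {
        dir="$1/cgroup"
        mkdir -p "$dir" || return 1
        echo 0-1  > "$dir/cpuset.cpus"            # assumed: first two cores
        echo 0    > "$dir/cpuset.mems"            # assumed: NUMA node 0
        echo 256  > "$dir/cpu.shares"
        echo 256M > "$dir/memory.limit_in_bytes"
}

# Example:
#   write_cgroup_cfg /etc/vservers/test
```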

Important:

  • the share you get is equal to the guest's share divided by the sum of the cpu shares of all the guests. The default share is 1024 (for guest, host… generally the kernel cgroup default) and is inherited from the parent cgroup.
  • there is no hierarchy when dealing with cpu.shares (besides inheriting the default value). All shares are summed and each cgroup gets its “cpu.shares/sum”. For example, the host has the default 1024 and the guest has 2048 set. This means that the host will get 1/3 of cpu power and the guest will get 2/3.
  • the virt_mem flag is needed for the guest to see only the cgroup-limited memory
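The share arithmetic can be checked with a few lines of shell. This sketch uses a hypothetical helper, cpu_share_percent, and integer division, so results are rounded down; the numbers are the host and guest values from the example above.

```shell
#!/bin/sh
# cpu_share_percent OWN ALL... (hypothetical helper):
# integer percentage of CPU time for a group with cpu.shares OWN,
# where ALL lists the cpu.shares of every competing group (OWN included).
cpu_share_percent() {
        own="$1"
        shift
        sum=0
        for s in "$@"; do
                sum=$((sum + s))
        done
        [ "$sum" -gt 0 ] || return 1
        echo $((own * 100 / sum))
}

cpu_share_percent 2048 1024 2048   # guest: prints 66 (2/3)
cpu_share_percent 1024 1024 2048   # host:  prints 33 (1/3)
```

Keep in mind that shares are relative weights: they only matter when the groups actually compete for CPU time.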

cgroups with libcgroup

libcgroup can mount cgroups differently: it can use a separate subdirectory for each cgroup subsystem, like:

# cat /proc/mounts |grep cgroup
cgroup /sys/fs/cgroup/blkio cgroup rw,relatime,blkio 0 0
cgroup /sys/fs/cgroup/cpu cgroup rw,relatime,cpu 0 0
cgroup /sys/fs/cgroup/cpuacct cgroup rw,relatime,cpuacct 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,relatime,devices 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,relatime,freezer 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,relatime,memory 0 0
cgroup /sys/fs/cgroup/net_cls cgroup rw,relatime,net_cls 0 0

For this to work you need at least util-vserver-0.30.216-1.pre2955.3 (that .3 is important) and must turn on per-subsystem support by doing:

# mkdir /etc/vservers/.defaults/cgroup
# touch /etc/vservers/.defaults/cgroup/per-ss

cgroups mountpoint

If you have cgroups mounted somewhere else, you can inform vserver of that (it searches in /sys/fs/cgroup by default):

none        /dev/cgroup     cgroup  cpuset,cpu,cpuacct,devices,freezer,net_cls  0 0

You need to tell vserver where it is mounted:

# cat /etc/vservers/.defaults/cgroup/mnt
/dev/cgroup
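To find the value to put into the mnt file, the cgroup mount points can be listed from /proc/mounts. This is a sketch with a hypothetical helper, cgroup_mounts; it assumes a single combined cgroup mount, as in the fstab line above.

```shell
#!/bin/sh
# cgroup_mounts [FILE] (hypothetical helper):
# print the mount point of every cgroup filesystem listed in FILE
# (fstab/mounts format: device, mount point, type, ...).
# FILE defaults to /proc/mounts.
cgroup_mounts() {
        awk '$3 == "cgroup" { print $2 }' "${1:-/proc/mounts}"
}

# Example (single combined mount):
#   cgroup_mounts | head -n 1 > /etc/vservers/.defaults/cgroup/mnt
```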
docs/vserver.txt · Last modified: 2015-10-05 15:07 by glen