Exploring Namespaces in Linux

Published: 2021-04-27 | Last Updated: 2021-04-27 | ~9 Minute Read

Context

As part of my journey into learning more about containerization technology I wanted to take a look at the core of what makes it all possible on a Linux system.

I have done some work with Docker and LXC; however, both of these technologies rely heavily on a Linux kernel feature that I’d like to understand better.

So we’ll be taking a look at this feature in this post. As a note, I will be exploring it on a Slackware system running kernel version 5.10.29.

What are Namespaces?

In a nutshell, namespaces are to processes what workspaces are to desktop environments. Namespaces are a tool to control how processes (and other system resources) and their children see the environment they’re running in. This is used to separate and compartmentalize resources on the system.

Another way to think of namespaces is as a way to determine which system resources can and cannot see each other. Each namespace is identified by its type and an ID (an inode number), and that pair is what keeps namespaces separate.
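
The type and inode number can be read from the symbolic links under /proc/[pid]/ns/, whose targets look like uts:[4026531838]. The sketch below splits such a target into its two parts. The helper names ns_type and ns_inode are made up for this illustration and are not standard tools.

```shell
# Split a namespace link target such as "uts:[4026531838]" into its parts.
# ns_type and ns_inode are hypothetical helper names, not standard tools.
ns_type() {
    printf '%s\n' "${1%%:*}"        # keep everything before the first ':'
}

ns_inode() {
    inode=${1#*\[}                  # drop everything up to and including '['
    printf '%s\n' "${inode%\]}"     # drop the trailing ']'
}

# Example: inspect the UTS namespace of the current shell (Linux only).
if target=$(readlink "/proc/$$/ns/uts" 2>/dev/null); then
    echo "type=$(ns_type "$target") inode=$(ns_inode "$target")"
fi
```

Two processes are in the same namespace of a given type exactly when both values match.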

From the namespaces man page there are eight namespace types:

Namespace Flag            Page                  Isolates
Cgroup    CLONE_NEWCGROUP cgroup_namespaces(7)  Cgroup root directory
IPC       CLONE_NEWIPC    ipc_namespaces(7)     System V IPC,
                                                POSIX message queues
Network   CLONE_NEWNET    network_namespaces(7) Network devices,
                                                stacks, ports, etc.
Mount     CLONE_NEWNS     mount_namespaces(7)   Mount points
PID       CLONE_NEWPID    pid_namespaces(7)     Process IDs
Time      CLONE_NEWTIME   time_namespaces(7)    Boot and monotonic
                                                clocks
User      CLONE_NEWUSER   user_namespaces(7)    User and group IDs
UTS       CLONE_NEWUTS    uts_namespaces(7)     Hostname and NIS
                                                domain name

We can use the following interfaces to interact with namespaces on the system:

clone (the clone(2) system call)
unshare (the unshare(2) system call and the unshare(1) command)
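
As a rough sketch of the unshare(1) command in action, the function below runs a shell in new user and UTS namespaces and changes the hostname there, which should leave the host’s own hostname untouched. It assumes the util-linux version of unshare and a kernel that allows unprivileged user namespaces. The name uts_demo and the hostname demo-ns are invented for this example. When the namespaces cannot be created, it prints "unavailable" instead.

```shell
# uts_demo is a hypothetical helper; "demo-ns" is an arbitrary hostname.
uts_demo() {
    # -U -r : new user namespace with the caller mapped to root inside it,
    #         so no real root privileges are needed
    # --uts : new UTS namespace, giving the child its own hostname
    unshare -U -r --uts sh -c 'hostname demo-ns && uname -n' 2>/dev/null \
        || echo "unavailable"
}

uts_demo      # hostname as seen inside the new namespace
uname -n      # hostname as seen by the current shell
```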

Namespaces in Use

So how do namespaces work? Let’s take a look:

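Namespaces are always present, even if you don’t run containers on your system: every process belongs to exactly one namespace of each type, and the kernel exposes those memberships in the /proc/[pid]/ns/ directory. (Namespaces first appeared in kernel version 2.4.19, starting with the mount namespace.)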

You can use the lsns command in order to see the namespaces currently in operation:

nix~ $ lsns
        NS TYPE   NPROCS   PID USER COMMAND
4026531834 time       37  1361 nix  -bash
4026531835 cgroup     37  1361 nix  -bash
4026531836 pid        37  1361 nix  -bash
4026531837 user       28  1361 nix  -bash
4026531838 uts        37  1361 nix  -bash
4026531839 ipc        28  1361 nix  -bash
4026531840 mnt        37  1361 nix  -bash
4026531992 net        28  1361 nix  -bash

We see that we have some namespace instances from the different categories we looked at before.

To find out what each of these columns means we can execute the lsns -h command:

nix~ $ lsns -h

Usage:
 lsns [options] [<namespace>]

List system namespaces.

Options:
 -J, --json             use JSON output format
 -l, --list             use list format output
 -n, --noheadings       don't print headings
 -o, --output <list>    define which output columns to use
     --output-all       output all columns
 -p, --task <pid>       print process namespaces
 -r, --raw              use the raw output format
 -u, --notruncate       don't truncate text in columns
 -W, --nowrap           don't use multi-line representation
 -t, --type <name>      namespace type (mnt, net, ipc, user, pid, uts, cgroup, time)

 -h, --help             display this help
 -V, --version          display version

Available output columns:
          NS  namespace identifier (inode number)
        TYPE  kind of namespace
        PATH  path to the namespace
      NPROCS  number of processes in the namespace
         PID  lowest PID in the namespace
        PPID  PPID of the PID
     COMMAND  command line of the PID
         UID  UID of the PID
        USER  username of the PID
     NETNSID  namespace ID as used by network subsystem
        NSFS  nsfs mountpoint (usually used network subsystem)

For more details see lsns(8).
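
The -n and -o options make the output easy to post-process. As a small sketch, the function below counts how many namespaces of each type exist, given the TYPE column on standard input. The name count_ns_types is hypothetical.

```shell
# count_ns_types is a hypothetical helper: it reads one namespace type per
# line on standard input and prints "type count" pairs, sorted by type.
count_ns_types() {
    awk 'NF { count[$1]++ } END { for (t in count) print t, count[t] }' | sort
}

# Feed it the TYPE column of lsns, without headings, when lsns is installed.
if command -v lsns >/dev/null 2>&1; then
    lsns -n -o TYPE | count_ns_types
fi
```

For the lsns listing shown earlier, this would report a count of 1 for each of the eight types.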

Another way to see a bit of detail about the namespaces currently in use by your system, in case your version of util-linux does not include lsns, is by executing the following:

nix~ $ ls -l /proc/$$/ns | awk '{print $1, $9, $10, $11}'
total   
lrwxrwxrwx cgroup -> cgroup:[4026531835]
lrwxrwxrwx ipc -> ipc:[4026531839]
lrwxrwxrwx mnt -> mnt:[4026531840]
lrwxrwxrwx net -> net:[4026531992]
lrwxrwxrwx pid -> pid:[4026531836]
lrwxrwxrwx pid_for_children -> pid:[4026531836]
lrwxrwxrwx time -> time:[4026531834]
lrwxrwxrwx time_for_children -> time:[4026531834]
lrwxrwxrwx user -> user:[4026531837]
lrwxrwxrwx uts -> uts:[4026531838]

This demonstrates that even without the use of containerization on the system, namespaces are currently being used.

A Closer Look at the User Namespace

Let’s focus on a single namespace for now, the user namespace. The namespaces currently in use were not created by us but by the system. Let’s interact with them a bit to explore how they work.

First we will check the processes currently running in the user namespace to have a baseline:

nix@nixing:~$ pstree -pN user
[4026531837]
bash(22721)---pstree(22751)
bash(22740)

From this we can see that the nix user currently has two bash processes running and both of these processes are in the default user namespace generated by the Kernel. Now let’s create a new shell, in this case zsh for distinction, from one of these bash instances and see the updated state of our environment:

nix@nixing:~$ pstree -pN user
[4026531837]
bash(22721)---pstree(22754)
bash(22740)---zsh(22752)

This shows us the relationship between parent and child processes. In our example the zsh shell is a child of the bash shell. This new child process still resides within the default namespace. Now let’s create a new shell (csh) in a namespace of our own with the unshare command from our zsh session:

zsh-\u@\h:\w\$ unshare -U csh
csh-%

And when we check our current user namespace status from our first bash shell:

nix@nixing:~$ pstree -pN user
[4026531837]
bash(22721)---pstree(22768)
bash(22740)---zsh(22752)
[4026532212]
csh(22755)

We now see that our csh shell has been successfully created in its own namespace, 4026532212, which is different from the system default. Since we are inside the newly created csh shell, we can use it to confirm how its “perspective” of the system’s user namespaces is affected by the fact that it’s in a different user namespace from the system default:

csh-% pstree -pN user
[4026532256]
csh(1086)───pstree(1094)

This output shows us that this shell is not able to see the “surrounding” user namespaces on the system, while our bash shell can see the new one, because the namespace created for our csh shell is a child of the root user namespace. Now let’s create another shell as a child of the csh shell without modifying its namespace configuration:

csh-% tcsh 
tcsh->

And check the user namespaces from this new shell:

tcsh-> pstree -pN user
[4026532212]
csh(22755)---tcsh(22771)---pstree(22773)
tcsh-> 

We still see that there is a single user namespace available. This occurs because the tcsh shell is a child of the csh shell and inherits its namespace configuration.
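
Inheritance can also be checked directly: a child process started without unshare should report the same namespace link as its parent. The function below is a minimal sketch of that check, and the name inherits_ns is made up for this example.

```shell
# inherits_ns is a hypothetical helper: it succeeds when a freshly started
# child shell reports the same namespace (default type: user) as this shell.
inherits_ns() {
    ns=${1:-user}
    parent=$(readlink "/proc/$$/ns/$ns" 2>/dev/null)
    # The child side reads its own entry through /proc/self.
    child=$(sh -c 'readlink "/proc/self/ns/$1"' sh "$ns" 2>/dev/null)
    [ "$parent" = "$child" ]
}

if inherits_ns user; then
    echo "child shares the parent's user namespace"
fi
```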

So from the point of view of the initial bash shell our current setup looks like the following:

nix@nixing:~$ pstree -pN user
[4026531837]
bash(22721)---pstree(22768)
bash(22740)---zsh(22752)
[4026532212]
csh(22755)---tcsh(22771)

Since the initial bash shell is in the default namespace, it has access to the rest of the namespaces because they are its children, but the processes that get separated, and their children, do not have that access in the other direction.

Now that we have seen how the user namespace works in a bit more depth, we can look at its effects on what the processes think the user is. From our first bash shell we can see the following user information:

nix@nixing:~$ whoami
nix
nix@nixing:~$ id -u
1000

And if we do this same check on our tcsh shell, we get something a bit different:

tcsh-> whoami
whoami: cannot find name for user ID 65534
tcsh-> id -u
65534
tcsh-> 

Since our tcsh shell is running in a different user namespace, it sees the user ID as 65534, whereas our initial bash shell reports 1000. This happens because we never defined a UID mapping for the new namespace, so the kernel reports the overflow UID (65534 by default) in place of our real one. We also don’t have a name for the user in our tcsh shell, because nothing on this system maps ID 65534 to a user name.
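
Both views can be reproduced with unshare alone. Without a UID mapping the kernel reports the overflow UID, while the -r option of unshare maps the calling user to UID 0 inside the new namespace. This is a sketch that assumes util-linux unshare and a kernel that allows unprivileged user namespaces. The name uid_views is hypothetical, and each value falls back to "unavailable" when the namespace cannot be created.

```shell
# uid_views is a hypothetical helper comparing two user namespace setups.
uid_views() {
    # No UID mapping: the kernel substitutes the overflow UID
    # (see /proc/sys/kernel/overflowuid, 65534 by default).
    unmapped=$(unshare -U id -u 2>/dev/null) || unmapped="unavailable"
    # -r maps the calling user to UID 0 inside the new namespace.
    mapped=$(unshare -U -r id -u 2>/dev/null) || mapped="unavailable"
    echo "unmapped=$unmapped mapped=$mapped"
}

uid_views
```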

Namespaces Working Together

We can of course use namespaces manually; however, a good example of namespaces in a more production-like capacity is Linux containers. Below is an example of what a system may look like when it has LXC containers deployed.

First we will initiate the container on our host:

root@host:~ # lxc-start container
root@host:~ #

Now that we have our container running we can see the following in the host:

root@host # lsns -o NS,TYPE,PATH,PID,COMMAND,USER
        NS TYPE   PATH                   PID COMMAND         USER
4026531834 time   /proc/1/ns/time          1 init [3]        root
4026531835 cgroup /proc/1/ns/cgroup        1 init [3]        root
4026531836 pid    /proc/1/ns/pid           1 init [3]        root
4026531837 user   /proc/1/ns/user          1 init [3]        root
4026531838 uts    /proc/1/ns/uts           1 init [3]        root
4026531839 ipc    /proc/1/ns/ipc           1 init [3]        root
4026531840 mnt    /proc/1/ns/mnt           1 init [3]        root
4026531992 net    /proc/1/ns/net           1 init [3]        root
4026532337 mnt    /proc/5438/ns/mnt     5438 init [3]        root
4026532340 uts    /proc/5438/ns/uts     5438 init [3]        root
4026532341 ipc    /proc/5438/ns/ipc     5438 init [3]        root
4026532342 pid    /proc/5438/ns/pid     5438 init [3]        root
4026532345 net    /proc/5438/ns/net     5438 init [3]        root
4026532434 cgroup /proc/5438/ns/cgroup  5438 init [3]        root

PID Namespace

Let’s break the output above down a bit. We see two init processes, one with PID 1 and one with PID 5438 (this is not expected on most Linux systems). That is because one init belongs to the host and the other to the container. Let’s look at what they each report.

This is the host:

root@host # ps -e | grep -E "PID|init"
  PID TTY          TIME CMD
    1 ?        00:00:06 init
 5438 ?        00:00:00 init

This is what the container sees:

root@container # ps -e | grep -E "PID|init"
  PID TTY          TIME CMD
    1 ?        00:00:00 init

This example has the PID namespace in use. Below is what is visible from each system:

On the host: (Shortened for brevity)

root@host # pstree -pN pid
[4026531836]
init(1)─┬─acpid(938)
.
.
.
[4026532342]
init(5438)─┬─agetty(5741)

On the container:

root@container # pstree -pN pid
[4026532342]
init(1)-+-agetty(291)

This shows that the LXC container is using a different PID namespace and is thus able to have its own init process. Additionally, the container reports that its init process has PID 1, which we can confirm from the host is actually PID 5438. This means that any processes spawned in the container will not be aware that they are in a child PID namespace of the host’s root PID namespace, and will therefore be unaware of the existence of processes outside their own PID namespace.
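
The same renumbering can be observed without LXC. The sketch below starts a shell in a new PID namespace and asks it for its own PID. It assumes util-linux unshare and a kernel that allows unprivileged user namespaces. The name pid_in_new_ns is hypothetical, and the function prints "unavailable" when the namespaces cannot be created.

```shell
# pid_in_new_ns is a hypothetical helper.
pid_in_new_ns() {
    # --pid  : new PID namespace for the children of unshare
    # --fork : run the command as a child, which becomes PID 1 there
    # -U -r  : new user namespace with the caller mapped to root, so the
    #          PID namespace can be created without real root privileges
    unshare -U -r --pid --fork sh -c 'echo $$' 2>/dev/null \
        || echo "unavailable"
}

pid_in_new_ns   # prints 1 when the namespaces could be created
```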

Final Breakdown

Because LXC is a way to create containers, and containers exist to provide separation, several namespaces are created on the host when running LXC containers.

For most namespace types we now have two different namespaces, one for the host and another for the container:

For the host:

root@host # lsns -p 1 -o NS,TYPE,PATH,PID,COMMAND,USER
        NS TYPE   PATH              PID COMMAND       USER
4026531834 time   /proc/1/ns/time     1 init [3]      root
4026531835 cgroup /proc/1/ns/cgroup   1 init [3]      root
4026531836 pid    /proc/1/ns/pid      1 init [3]      root
4026531837 user   /proc/1/ns/user     1 init [3]      root
4026531838 uts    /proc/1/ns/uts      1 init [3]      root
4026531839 ipc    /proc/1/ns/ipc      1 init [3]      root
4026531840 mnt    /proc/1/ns/mnt      1 init [3]      root
4026531992 net    /proc/1/ns/net      1 init [3]      root

For the container:

root@host # lsns -p 5438 -o NS,TYPE,PATH,PID,COMMAND,USER
        NS TYPE   PATH                   PID COMMAND  USER
4026531834 time   /proc/1/ns/time          1 init [3] root
4026531837 user   /proc/1/ns/user          1 init [3] root
4026532337 mnt    /proc/5438/ns/mnt     5438 init [3] root
4026532340 uts    /proc/5438/ns/uts     5438 init [3] root
4026532341 ipc    /proc/5438/ns/ipc     5438 init [3] root
4026532342 pid    /proc/5438/ns/pid     5438 init [3] root
4026532345 net    /proc/5438/ns/net     5438 init [3] root
4026532434 cgroup /proc/5438/ns/cgroup  5438 init [3] root

In this specific case the container shares the user and time namespaces with the host, but this depends on how you configure your LXC containers.
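
To check which namespaces two processes share, one option is to compare their links under /proc/[pid]/ns/ type by type. The function below is a minimal sketch of that comparison. The name shared_ns is hypothetical, and PROC_ROOT is a variable introduced here only so the function can be pointed at a test directory instead of /proc. Reading the entries of processes owned by another user generally requires root.

```shell
# shared_ns is a hypothetical helper: print, for two PIDs, whether each
# namespace type is shared or separate. PROC_ROOT is an assumption made
# here so the function can also be pointed at a test directory.
shared_ns() {
    root=${PROC_ROOT:-/proc}
    for link in "$root/$1/ns"/*; do
        [ -L "$link" ] || continue           # skip if nothing matched
        ns=${link##*/}                       # e.g. "pid", "net", "user"
        a=$(readlink "$link" 2>/dev/null)
        b=$(readlink "$root/$2/ns/$ns" 2>/dev/null)
        if [ -n "$a" ] && [ "$a" = "$b" ]; then
            echo "$ns shared"
        else
            echo "$ns separate"
        fi
    done
}

# On the host from this post one would run, as root: shared_ns 1 5438
shared_ns "$$" "$$"
```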

Conclusion

Namespaces are a great way to separate different types of resources on a system, and although containers are the most popular way to use them, they’re available to the system at any moment.