Exploring Namespaces in Linux
Published: 2021-04-27 | Last Updated: 2022-01-09 | ~9 Minute Read
Table of contents
- Context
- What are Namespaces?
- Namespaces in Use
- A Closer Look at the User Namespace
- Namespaces Working Together
- PID Namespace
- Final Breakdown
- Conclusion
Context
As part of my journey into learning more about containerization technology I wanted to take a look at the core of what makes it all possible on a Linux system.
I have done some things with Docker as well as LXC, however both of these technologies heavily rely on a Linux kernel feature that I’d like to better understand.
So we’ll be taking a look at this feature in this post. As a note, I will be exploring these features on a Slackware system with kernel version 5.10.29.
What are Namespaces?
In a nutshell, namespaces are to processes what workspaces are to desktop environments. Namespaces are a tool to control how processes (and other system resources) and their children see the environment they’re running in. This is used to separate and compartmentalize resources on the system.
Another way to think of namespaces is as a way to determine which system resources can and cannot see each other. Each namespace is identified by its type and an ID (an inode number), and that pair is what keeps namespaces separate.
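Since a namespace is identified by its type plus an inode number, one way to see that pair for the current shell is to read one of the symbolic links the kernel exposes under /proc. The following is only a minimal sketch: the variable names are illustrative, and it assumes a Linux system with /proc mounted.

```shell
# Each entry in /proc/<pid>/ns is a symlink whose target looks like "type:[inode]".
# readlink is started by this shell, so it reports the same namespace the shell is in.
link=$(readlink /proc/self/ns/user)

ns_type=${link%%:*}       # text before the first colon, e.g. "user"
ns_inode=${link#*\[}      # drop everything up to the opening bracket
ns_inode=${ns_inode%\]}   # drop the closing bracket

echo "type=$ns_type inode=$ns_inode"
```

Two processes are in the same namespace of a given type exactly when this pair matches for both of them.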
From the namespaces man page there are eight namespace types:
Namespace  Flag             Page                    Isolates
Cgroup     CLONE_NEWCGROUP  cgroup_namespaces(7)    Cgroup root directory
IPC        CLONE_NEWIPC     ipc_namespaces(7)       System V IPC, POSIX message queues
Network    CLONE_NEWNET     network_namespaces(7)   Network devices, stacks, ports, etc.
Mount      CLONE_NEWNS      mount_namespaces(7)     Mount points
PID        CLONE_NEWPID     pid_namespaces(7)       Process IDs
Time       CLONE_NEWTIME    time_namespaces(7)      Boot and monotonic clocks
User       CLONE_NEWUSER    user_namespaces(7)      User and group IDs
UTS        CLONE_NEWUTS     uts_namespaces(7)       Hostname and NIS domain name
We can use the following to interact with namespaces on the system:
clone (system call)
unshare (system call and command-line tool)
Namespaces in Use
So how do namespaces work? Let’s take a look:
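Namespaces are always present, even if you don’t run containers on your system: every process belongs to one namespace of each type. (Mount namespaces, the first type to be added, have existed since kernel version 2.4.19.) You can see the namespaces a process belongs to in its /proc/[pid]/ns/ directory.
You can use the lsns command to see the namespaces currently in operation: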
nix~ $ lsns
NS TYPE NPROCS PID USER COMMAND
4026531834 time 37 1361 nix -bash
4026531835 cgroup 37 1361 nix -bash
4026531836 pid 37 1361 nix -bash
4026531837 user 28 1361 nix -bash
4026531838 uts 37 1361 nix -bash
4026531839 ipc 28 1361 nix -bash
4026531840 mnt 37 1361 nix -bash
4026531992 net 28 1361 nix -bash
We see that we have some namespace instances from the different categories we looked at before.
To find out what each of these columns means we can execute the lsns -h command:
nix~ $ lsns -h
Usage:
lsns [options] [<namespace>]
List system namespaces.
Options:
-J, --json use JSON output format
-l, --list use list format output
-n, --noheadings don't print headings
-o, --output <list> define which output columns to use
--output-all output all columns
-p, --task <pid> print process namespaces
-r, --raw use the raw output format
-u, --notruncate don't truncate text in columns
-W, --nowrap don't use multi-line representation
-t, --type <name> namespace type (mnt, net, ipc, user, pid, uts, cgroup, time)
-h, --help display this help
-V, --version display version
Available output columns:
NS namespace identifier (inode number)
TYPE kind of namespace
PATH path to the namespace
NPROCS number of processes in the namespace
PID lowest PID in the namespace
PPID PPID of the PID
COMMAND command line of the PID
UID UID of the PID
USER username of the PID
NETNSID namespace ID as used by network subsystem
NSFS nsfs mountpoint (usually used network subsystem)
For more details see lsns(8).
Another way to see a bit of detail about the namespaces currently in use by your system, in case your version of util-linux does not include lsns, is to execute the following:
nix~ $ ls -l /proc/$$/ns | awk '{print $1, $9, $10, $11}'
total
lrwxrwxrwx cgroup -> cgroup:[4026531835]
lrwxrwxrwx ipc -> ipc:[4026531839]
lrwxrwxrwx mnt -> mnt:[4026531840]
lrwxrwxrwx net -> net:[4026531992]
lrwxrwxrwx pid -> pid:[4026531836]
lrwxrwxrwx pid_for_children -> pid:[4026531836]
lrwxrwxrwx time -> time:[4026531834]
lrwxrwxrwx time_for_children -> time:[4026531834]
lrwxrwxrwx user -> user:[4026531837]
lrwxrwxrwx uts -> uts:[4026531838]
This demonstrates that even without the use of containerization on the system, namespaces are currently being used.
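The same /proc links can be used to approximate the NPROCS column of lsns. The sketch below is not how lsns itself is implemented; count_by_ns is a hypothetical helper name, and it only counts processes whose links the current user is allowed to read.

```shell
# Count processes per namespace of one type by reading every /proc/<pid>/ns link.
# Usage: count_by_ns [type]   (defaults to "user")
count_by_ns() {
    ns_type=${1:-user}
    for dir in /proc/[0-9]*; do
        # Unreadable links (other users' processes) are skipped silently.
        readlink "$dir/ns/$ns_type" 2>/dev/null
    done | sort | uniq -c
}

count_by_ns user
```

Each output line is a process count followed by the namespace it applies to, so a system without containers would typically show a single line per type.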
A Closer Look at the User Namespace
Let’s focus on a single namespace for now: the user namespace. The namespaces currently in use were created by the system rather than by us, so let’s interact with them a bit to explore how they work.
First we will check the processes currently running in the user namespace to have a baseline:
nix@nixing:~$ pstree -pN user
[4026531837]
bash(22721)---pstree(22751)
bash(22740)
From this we can see that the nix user currently has two bash processes running, and both of them are in the default user namespace generated by the kernel. Now let’s create a new shell (zsh in this case, for distinction) from one of these bash instances and see the updated state of our environment:
nix@nixing:~$ pstree -pN user
[4026531837]
bash(22721)---pstree(22754)
bash(22740)---zsh(22752)
This shows us the relationship between parent and child processes. In our example the zsh shell is a child of the bash shell, and this new child process still resides within the default namespace. Now, from our zsh session, let’s create a new shell (csh) in a namespace of our own with the unshare command:
zsh-\u@\h:\w\$ unshare -U csh
csh-%
And when we check our current user namespace status from our first bash shell:
nix@nixing:~$ pstree -pN user
[4026531837]
bash(22721)---pstree(22768)
bash(22740)---zsh(22752)
[4026532212]
csh(22755)
We now see that our csh shell has been created in its own namespace, 4026532212, which is different from the system default. From inside the newly created csh shell we can check how its “perspective” of the system’s user namespaces is affected by being in a different user namespace from the system default:
csh-% pstree -pN user
[4026532256]
csh(1086)───pstree(1094)
This output shows us that this shell is not able to see the “surrounding” user namespaces on the system, while our bash shell can see both, because the namespace created for our csh shell is a child of the root user namespace. Now let’s create another shell as a child of the csh shell without modifying its namespace configuration:
csh-% tcsh
tcsh->
And check the user namespaces from this new shell:
tcsh-> pstree -pN user
[4026532212]
csh(22755)---tcsh(22771)---pstree(22773)
tcsh->
We still see that there is a single user namespace available. This is because the tcsh shell is a child of the csh shell and inherits its namespace configuration.
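This inheritance can also be checked without pstree by comparing the namespace link of a shell with that of a child it starts. This is a small sketch with illustrative variable names, assuming /proc is mounted:

```shell
# User namespace of the current shell.
parent_ns=$(readlink /proc/$$/ns/user)

# Start a child shell with no namespace flags and ask for its own link.
# The single quotes make the child, not the parent, expand $$.
child_ns=$(sh -c 'readlink /proc/$$/ns/user')

if [ "$parent_ns" = "$child_ns" ]; then
    echo "child inherited $parent_ns"
else
    echo "child is in a different namespace: $child_ns"
fi
```

A child only ends up in a different namespace when something asks for one explicitly, as unshare did for csh earlier.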
So from the point of view of the initial bash shell our current setup looks like the following:
nix@nixing:~$ pstree -pN user
[4026531837]
bash(22721)---pstree(22768)
bash(22740)---zsh(22752)
[4026532212]
csh(22755)---tcsh(22771)
Since the initial bash shell is in the default namespace, it can see the other namespaces because they are its children. The processes that were separated, and their children, cannot see back out.
Now that we have seen how the user namespace works in a bit more depth, we can look at its effect on what processes think the user is. From our first bash shell we can see the following user information:
nix@nixing:~$ whoami
nix
nix@nixing:~$ id -u
1000
And if we do this same check in our tcsh shell, we get something a bit different:
tcsh-> whoami
whoami: cannot find name for user ID 65534
tcsh-> id -u
65534
tcsh->
Since our tcsh shell is running in a different user namespace, it sees the user ID as 65534, whereas our initial bash shell reports 1000. We also don’t have a name for the user in our tcsh shell. This is because we created the namespace without defining any user ID mapping: the kernel reports unmapped IDs as the overflow ID (65534 by default), and on this system no account has that ID.
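The 65534 seen above is the kernel’s overflow UID, which is what a process sees for any ID that has no mapping in its user namespace. The sketch below reads that value and the current mapping from /proc, then, only if unprivileged user namespaces are permitted, compares an unmapped namespace with one created using the --map-root-user option of unshare. The variable name is illustrative, and the unshare calls are skipped on systems that forbid them.

```shell
# UID reported for IDs that have no mapping in the current user namespace.
overflow_uid=$(cat /proc/sys/kernel/overflowuid)
echo "overflow UID: $overflow_uid"

# Mapping in effect right now: <ID inside> <ID outside> <length of range>.
cat /proc/self/uid_map

# Only attempt the comparison if creating a user namespace is allowed here.
if unshare -U true 2>/dev/null; then
    unshare -U id -u                   # no mapping: expected to show the overflow UID
    unshare -U --map-root-user id -u   # current user mapped to root: expected to show 0
fi
```

With a mapping in place, names resolve again, because the ID seen inside the namespace corresponds to a real account.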
Namespaces Working Together
We can use namespaces manually, of course, but a good example of namespaces used in a more production-like capacity is Linux containers. Below is an example of what a system may look like when it has an LXC container deployed.
First we will initiate the container on our host:
root@host:~ # lxc-start container
root@host:~ #
Now that we have our container running we can see the following in the host:
root@host # lsns -o NS,TYPE,PATH,PID,COMMAND,USER
NS TYPE PATH PID COMMAND USER
4026531834 time /proc/1/ns/time 1 init [3] root
4026531835 cgroup /proc/1/ns/cgroup 1 init [3] root
4026531836 pid /proc/1/ns/pid 1 init [3] root
4026531837 user /proc/1/ns/user 1 init [3] root
4026531838 uts /proc/1/ns/uts 1 init [3] root
4026531839 ipc /proc/1/ns/ipc 1 init [3] root
4026531840 mnt /proc/1/ns/mnt 1 init [3] root
4026531992 net /proc/1/ns/net 1 init [3] root
4026532337 mnt /proc/5438/ns/mnt 5438 init [3] root
4026532340 uts /proc/5438/ns/uts 5438 init [3] root
4026532341 ipc /proc/5438/ns/ipc 5438 init [3] root
4026532342 pid /proc/5438/ns/pid 5438 init [3] root
4026532345 net /proc/5438/ns/net 5438 init [3] root
4026532434 cgroup /proc/5438/ns/cgroup 5438 init [3] root
PID Namespace
Let’s break the output above down a bit. We see two init processes, with PIDs 1 and 5438 (not something you would expect on most Linux systems). That is because one belongs to the host and the other to the container. Let’s look at what each side reports.
This is the host:
root@host # ps -e | grep -E "PID|init"
PID TTY TIME CMD
1 ? 00:00:06 init
5438 ? 00:00:00 init
This is what the container sees:
root@container # ps -e | grep -E "PID|init"
PID TTY TIME CMD
1 ? 00:00:00 init
This example has the PID namespace in use. Below is what is visible from each system:
On the host: (Shortened for brevity)
root@host # pstree -pN pid
[4026531836]
init(1)─┬─acpid(938)
.
.
.
[4026532342]
init(5438)─┬─agetty(5741)
On the container:
root@container # pstree -pN pid
[4026532342]
init(1)-+-agetty(291)
This shows that the LXC container is using a different PID namespace and is thus able to have its own init process. Additionally, the container reports that its init process has PID 1, which we can confirm from the host is actually PID 5438. This means that processes spawned in the container are not aware that they are in a child of the host’s root PID namespace, and are therefore unaware of the existence of processes outside their own PID namespace.
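On kernels that expose the NSpid field in /proc/[pid]/status, the two PIDs of a single process can be read side by side from the outer namespace. The following is a minimal sketch; ns_pids is a hypothetical helper name, and it prints nothing if the field is not available.

```shell
# Print the PID of a process as seen from each PID namespace it belongs to.
# Usage: ns_pids <pid>
ns_pids() {
    # NSpid lists the namespace this /proc was mounted in first,
    # then each nested namespace in turn.
    awk '$1 == "NSpid:" { $1 = ""; sub(/^ +/, ""); print }' "/proc/$1/status"
}

ns_pids $$
```

For a process that is not in a nested PID namespace this prints a single number. Run from a host like the one above against the container’s init, it would be expected to list both the host PID and the PID inside the container.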
Final Breakdown
Because LXC is a way to create containers, and containers exist to provide separation, several namespaces are created on the host when an LXC container runs.
For most namespace types we now have two namespaces, one for the host and another for the container:
For the host:
root@host # lsns -p 1 -o NS,TYPE,PATH,PID,COMMAND,USER
NS TYPE PATH PID COMMAND USER
4026531834 time /proc/1/ns/time 1 init [3] root
4026531835 cgroup /proc/1/ns/cgroup 1 init [3] root
4026531836 pid /proc/1/ns/pid 1 init [3] root
4026531837 user /proc/1/ns/user 1 init [3] root
4026531838 uts /proc/1/ns/uts 1 init [3] root
4026531839 ipc /proc/1/ns/ipc 1 init [3] root
4026531840 mnt /proc/1/ns/mnt 1 init [3] root
4026531992 net /proc/1/ns/net 1 init [3] root
For the container:
root@host # lsns -p 5438 -o NS,TYPE,PATH,PID,COMMAND,USER
NS TYPE PATH PID COMMAND USER
4026531834 time /proc/1/ns/time 1 init [3] root
4026531837 user /proc/1/ns/user 1 init [3] root
4026532337 mnt /proc/5438/ns/mnt 5438 init [3] root
4026532340 uts /proc/5438/ns/uts 5438 init [3] root
4026532341 ipc /proc/5438/ns/ipc 5438 init [3] root
4026532342 pid /proc/5438/ns/pid 5438 init [3] root
4026532345 net /proc/5438/ns/net 5438 init [3] root
4026532434 cgroup /proc/5438/ns/cgroup 5438 init [3] root
In this specific case the container shares the user and time namespaces with the host, but this depends on how you configure your LXC containers.
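The comparison of the two lsns listings can also be done per namespace type directly from /proc. This is a sketch under the assumption that you have permission to read both processes’ links (root on the host, in this scenario); compare_ns is a hypothetical helper name.

```shell
# Report, per namespace type, whether two processes share the namespace.
# Usage: compare_ns <pid_a> <pid_b>
compare_ns() {
    for path in /proc/"$1"/ns/*; do
        name=${path##*/}
        a=$(readlink "$path" 2>/dev/null)
        b=$(readlink "/proc/$2/ns/$name" 2>/dev/null)
        # Skip entries that cannot be read for either process.
        [ -n "$a" ] && [ -n "$b" ] || continue
        if [ "$a" = "$b" ]; then
            echo "$name shared $a"
        else
            echo "$name separate $a $b"
        fi
    done
}

# Comparing a process with itself reports every type as shared.
compare_ns $$ $$
```

On a host like the one shown here, comparing PID 1 with the container’s init would be expected to report the user and time namespaces as shared and the rest as separate.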
Conclusion
Namespaces are a great way to separate different types of resources on a system, and although containers are the most popular way to use them, they’re available to the system at any moment.