One problem I ran into yesterday: I had added my user on a GPU server to the docker group and ran a deep learning Docker container. I had firmly believed that root inside a container was only virtual with respect to the real server, but I was wrong. As a result, I occupied some of the GPUs exclusively and, with root privileges, terminated other users' running training processes (and they had not developed the habit of saving their models every few epochs).
Therefore I was asked to run Docker containers without root.
Before I made any changes, if you ran a command in a Docker container and watched it on the real server, you would find that the process was owned by root.
ping www.google.com # command I used for testing
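One way to check the owner of such a process from the real server is to read its entry under /proc. The following is a minimal sketch, assuming a Linux host; `host_uid_of_pid` is a hypothetical helper name, not something Docker provides.

```shell
# Hypothetical helper: print the real UID that owns process $1, as seen on the host.
# Assumes Linux, where /proc/<pid>/status has a line "Uid: real effective saved fs".
host_uid_of_pid() {
    awk '/^Uid:/ { print $2 }' "/proc/$1/status"
}

# Example usage (pgrep comes from the procps package):
# for pid in $(pgrep -x ping); do host_uid_of_pid "$pid"; done
```

A result of 0 means the process is running as root on the real server.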
After the changes, when I run the same command in a Docker container, the real server shows the UID of the user I configured. Here is what I did.
Reference: Isolate containers with a user namespace
Step 1
Change /etc/docker/daemon.json
from
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
to
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "userns-remap": "default"
}
I set userns-remap to default merely for convenience. You can specify a different user, following the format described in the reference.
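Before restarting the daemon, it may be worth confirming that the key ended up in the file as intended. The following is a rough sketch: `userns_remap_value` is a hypothetical helper, and it uses pattern matching rather than a real JSON parser, so it assumes the key and its string value sit on one line as in the file above.

```shell
# Hypothetical helper: print the string value of "userns-remap" from a
# daemon.json-style file (defaults to /etc/docker/daemon.json).
# Not a JSON parser: assumes the key and value are on a single line.
userns_remap_value() {
    sed -n 's/.*"userns-remap"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' \
        "${1:-/etc/docker/daemon.json}"
}

# Example usage:
# userns_remap_value /etc/docker/daemon.json
```

If nothing is printed, the key is missing or is not written in the single-line form this sketch expects.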
Step 2
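Restart the Docker service. (Existing images and containers will no longer be visible afterwards, because Docker keeps the remapped daemon's data in a separate directory under /var/lib/docker. They are hidden rather than deleted.)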
systemctl restart docker
The reference also describes how to verify these changes, in steps 2 to 5 of the section "Enable userns-remap on the daemon".
Once all of this is done, programs you run in a container still appear to run as root inside Docker, but on the real server the process is owned by an unprivileged UID mapped to dockremap (in the default case).
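With the default setting, Docker takes the UID range for the remapping from the dockremap entry in /etc/subuid, and root inside the container corresponds to the first UID of that range on the real server. The following sketch looks that range up; `subuid_range` is a hypothetical helper name, and the optional second argument exists only so it can be pointed at a different file.

```shell
# Hypothetical helper: print "<first host UID> <count>" for user $1
# from a subuid-format file, whose lines look like name:start:count.
# Defaults to /etc/subuid when no file is given.
subuid_range() {
    awk -F: -v user="$1" '$1 == user { print $2, $3 }' "${2:-/etc/subuid}"
}

# Example usage:
# subuid_range dockremap
```

The first number printed is the UID you should expect to see as the owner of container processes on the real server.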