Performance Comparison Between Singularity and Docker

Abstract—To meet the growing demand for computing power, HPC clusters have grown tremendously in size and complexity. With the prevalence of high-speed interconnects, multi-core processors, and accelerators, sharing these resources efficiently has become even more important for achieving faster turnaround times and lower costs per user. The HPC cluster environment invokes a large number of analysis programs as well as internally developed scripts, and it is extremely complex to configure and manage, with low repeatability, which creates big challenges when upgrading, managing, and migrating processes. Existing IT technology offers a solution: containers. However, different containers perform differently in various respects. This article comprehensively compares the performance differences between Docker and Singularity, with the aim of establishing criteria for the applicability of these two containers in different scenarios.

I. INTRODUCTION

With the advent of hardware virtualization, abundant hardware resources have become more accessible and easier to use than ever before, leading to what we know today as cloud platforms. Most cloud services are based on virtual machines, and the performance of the virtual machine has a significant impact on the overall performance of the cloud application. For compute-intensive tasks this is the technology's biggest drawback: different operating systems such as Windows and Linux use different kernels, and every running kernel consumes a large amount of CPU and memory, precisely the resources high-performance computing needs most. The goal of containerization is to give applications more computational power by sacrificing this flexibility: all containers on a host share the host's running kernel. Due to this limitation, containers cannot be live-migrated and must be suspended or stopped before they can be moved to another host. That said, most container scenarios use static containers to host business applications, and these containers are rarely moved at runtime.

A container is a lightweight operating environment within the host OS that runs it. It uses the native instructions of the CPU and does not have any VMM (Virtual Machine Manager) requirements. The only limitation is that containers must use the host operating system's kernel to access the existing hardware components. Unlike virtual machines, containers can present different operating-system userlands without restriction, and we would like to know whether there are unacceptable performance overheads or security issues when containers run on different operating systems.

Traditional hypervisor-based virtualization solutions are not commonly used in high-performance computing (HPC) due to their performance overhead. Container-based virtualization technologies, represented by LXC (Linux Containers) and Docker, can provide better resource sharing, customized environments, and low overhead. Different containers have different advantages and disadvantages. For example, Docker is by far the hottest container technology today and does a great job in terms of version control, isolation, portability, and so on. However, it has permission issues when used by non-root users and carries unnecessary resource overhead, which may make it ill-suited to the functional and security needs of HPC environments. Singularity is an attractive container-based approach that meets the requirements of scientific applications. It features computational mobility, repeatability, user freedom, and support for existing traditional HPC, but it does not provide isolation and virtualization as complete as Docker's.

Some studies have shown that Docker outperforms virtualization solutions in most respects and performs comparably to bare metal, while other studies have demonstrated Singularity's advantages in HPC environments, such as light weight, low overhead, and adaptability. We know that Docker and Singularity both perform excellently, but at the same time we must admit that these containers have shortcomings and that each is the better choice in different application scenarios. According to our investigation, however, no existing work tests and compares these two excellent containers together comprehensively. So the purpose of this article is not only to evaluate their performance with benchmarks, but also to investigate their practicality in many respects and arrive at targeted criteria for their application; this is our motivation.

II. CONTRIBUTION

We summarized our work process and conclusions in this blog post. First, we surveyed and summarized the related work, then briefly introduced several containers and some indicators for judging their practicality, and explained the installation and deployment process of Docker and Singularity. Next come our experimental tests. We ran a unified test suite on our laboratory machines: through benchmarks we measured throughput, latency, CPU utilization, and other aspects of container performance on single and multiple threads, and recently extended the comparison to multiple nodes. By consulting the documentation, we also compared version control, isolation, portability, repeatability, and other properties, and finally derived container recommendations adapted to different situations.

III. RELATED WORK

Xavier et al. [1] conducted some experiments to evaluate container performance for high performance computing (HPC). They compared OpenVZ, LXC, VServer, and Xen, considering isolation issues. They conclude that all container-based systems have almost native performance in terms of CPU, memory, disk and network. The main difference between them is the implementation of resource management, leading to problems in isolation and security.

Beserra et al. [2] analyzed the performance of LXC containers in comparison with hypervisor-based virtualization (KVM) for HPC activities. The results showed that the type of hypervisor directly affects performance. They conclude that LXC is better suited to HPC than KVM; however, in more complex cases where physical resources are partitioned into multiple logical environments, the performance of both degrades, especially for KVM. LXC's superior performance is also observed in clustered environments with more cooperation between processes. Some issues have been reported with LXC compared to KVM, such as weaker isolation.

Younge et al. [3] tested Shifter and Singularity on HPC systems, comparing bare-metal and container execution on supercomputers (NERSC's Edison and Sandia's Volta). Their experiments included synthetic benchmarks and computational tools such as HPCG, HPGMG-FE, IMB, and the FEniCS toolbox. Similarly, Ruiz et al. [4] investigated the impact of using containers in the context of HPC research. Their evaluation showed the limitations of using containers, which types of applications are most affected, and what level of oversubscription containers can handle without affecting application performance. While using containers incurs some overhead, the work shows that the technology is becoming more mature and that performance issues are being addressed with each new release of the Linux kernel.

A comparative study of production biological simulations is provided by Rudyy et al. [5]. The paper analyzes the productivity benefits of employing containers for large HPC codes and quantifies the performance overhead incurred by three different container technologies (Docker, Singularity, and Shifter), comparing them to native execution. Given the test results, they chose Singularity as the best technology on the grounds of performance and portability. They also show scalability results for Alya using Singularity on up to 256 compute nodes (up to 12k cores) of MareNostrum4 and present performance and portability studies on three different HPC architectures (Intel Skylake, IBM Power9, and Arm-v8).

IV. CONTAINERIZATION SOLUTIONS

A. VM vs. Container

A VM is equivalent to installing a completely new system (with its own kernel). For example, if we use VirtualBox to install a Windows virtual machine on a Mac, or a Linux virtual machine on a Windows 10 machine, each start-up boots a whole new operating system, which is slow and consumes a lot of system resources. There are many tutorials for this on the web.

A container shares the kernel with the host system and configures the runtime environment on top of that existing kernel. This makes it faster and smaller than a VM; a container may be only a few MB. However, a container built for Linux must run on a Linux host, and most containers today are developed on Linux. There is no need to configure a whole virtual machine for each app; with containers, apps can share bins and libraries.
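As a quick illustration of kernel sharing, comparing the kernel release reported on the host and inside a container shows the same string, because the container reuses the host's running kernel (a minimal sketch, assuming Docker and the stock ubuntu image are available):

    # Kernel release seen on the host
    uname -r

    # Kernel release seen inside an Ubuntu container: identical to the host's
    docker run --rm ubuntu uname -r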

B. Containers for HPC

As containers gain popularity within industry, their applicability to supercomputing resources is also being considered. To assess the suitability of such system software, it is necessary to examine carefully which container components could have the greatest impact on HPC systems in the near future. Current commodity containers provide many positive attributes, several of which are listed below:

  • Bring-Your-Own-Environment. Developers define the operating environment and system libraries in which their applications run.

  • Composability. Developers explicitly define how their software environments are composed of modular components that act as container images, thus making it possible for repeatable environments to span different architectures.

  • Portability. Containers can be rebuilt, layered or shared across multiple different computing systems, potentially ranging from laptops to clouds to advanced supercomputing resources.

  • Version Control Integration. Containers can be integrated with revision control systems such as Git, covering not only build manifests but also full container images via container registries like Docker Hub.

While it may be feasible to leverage industry-standard open-source container technologies in HPC, a number of common container attributes are incompatible with, or unnecessary on, HPC resources. These include the following.

  • Overhead. While some minor overhead is acceptable with the new level of abstraction, HPC applications generally cannot accept significant overhead from the deployment or runtime aspects of the container.

  • Partitioning on nodes. For most parallel HPC applications, resource partitioning on nodes with cgroups is not yet necessary. While in-place workload coupling is possible with containers, it is beyond the scope of this work.

  • Root operations. Containers typically give users root-level access control; on supercomputers this is not allowed, as it poses a significant security risk to the facility.

These two lists effectively serve as selection criteria for investigating containers in the HPC ecosystem. Due to the popularity of containers and the growing demand for more advanced use cases for HPC resources, we can use these attributes to investigate several potential HPC container solutions that have been created recently.

C. Docker vs. Singularity

Docker. We can't discuss containers without mentioning Docker, the most widely used container software, which has matured and has a large community of users. Docker Hub is an online repository for containers, with over 100,000 preconfigured containers available for download and use. Like the Docker logo, a cargo ship stacked with different containers, Docker is designed to let multiple containers run on the same system while isolating containers from one another and from the host system. However, between security concerns, root-level operations, and the lack of distributed storage integration, Docker in its current form does not lend itself to use on HPC resources. Docker is still useful for personal container development on laptops and workstations, where images can be ported to other systems, allowing system software to be built with root privileges but deployed exclusively elsewhere.

Singularity. Singularity [8] provides a compelling use of containers designed specifically for the HPC ecosystem. Originally developed at Lawrence Berkeley National Laboratory, it is now maintained by SingularityWare, LLC. Singularity lets custom images, including Docker images, run on demand on HPC compute nodes. It leverages chroot and bind mounts (and optionally OverlayFS) to mount container images and directories, as well as Linux namespaces and user mappings for any given container, without requiring root privileges. Singularity has several advantages. First, it supports generating custom images from manifests that define Singularity containers, as well as on-demand import of Docker containers. Moreover, Singularity wraps each image in a single file, providing simple management and sharing of containers, not only by a specific user but also potentially between different resources. This approach removes the need for additional image-management tools or gateway services and greatly simplifies container deployment. In addition, Singularity ships as a single installation package and provides a straightforward configuration system.

Shifter. Shifter is one of the first major projects to bring containerization to advanced supercomputing architectures. Shifter has recently added support for GPUs and has been used to deploy Apache Spark big-data workloads. Although Shifter is an open-source project, it has so far focused on deployments on the Cray supercomputer platform and has incorporated implementation-specific features that make porting to other architectures more difficult.

Charliecloud. Charliecloud [9] is an open-source container implementation developed for a production HPC cluster at Los Alamos National Laboratory. It allows Docker containers to be converted to run on HPC clusters. Charliecloud uses user namespaces to avoid Docker's model of running root-level services, and its code base is notably small and concise, at fewer than 1,000 lines of code. However, the long-term stability and portability of the Linux kernel namespaces Charliecloud relies on are uncertain.

Given the wide range of options for providing containerization in an HPC environment, we chose to test Docker and Singularity containers and compare them with each other and with the native environment. Singularity has an advantage over the other containers in that its interoperability allows containers to be ported across a variety of architectures. Users can build dedicated Singularity containers on Linux workstations and take advantage of the new Singularity Hub service.

D. Container build-up

The following steps can be used to build a container in Docker and in Singularity.

Docker

  • First, write the Docker specification file (Dockerfile).

  • Alternatively, build a container interactively and snapshot it with docker commit.

  • Or pull existing images from a repository, which can be the Docker Store, Docker Hub, or a private registry (see the sketch after this list).
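For illustration, a minimal sketch of the three routes above (the image name myapp and the container name work are hypothetical):

    # Route 1: build from a specification file (Dockerfile) in the current directory
    docker build -t myapp:1.0 .

    # Route 2: work inside an interactive container, then snapshot it with docker commit
    docker run -it --name work ubuntu /bin/bash    # install packages, edit files, exit
    docker commit work myapp:1.1

    # Route 3: pull an existing image from Docker Hub or a private registry
    docker pull ubuntu:20.04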

Singularity

  • The first step is to create a def (definition) file for Singularity. All packaged dependencies are analyzed and automatically incorporated into the image.

  • Using sudo singularity shell, the user can expand existing images.

  • Alternatively, import or convert a Docker image directly into a Singularity image.

  • Then, share the prebuilt Singularity images (see the sketch after this list).
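A minimal sketch of this workflow, written with the newer singularity build/pull syntax that has replaced the older singularity import (the def-file contents and image names are hypothetical):

    # Create an example definition (def) file
    cat > myapp.def <<'EOF'
    Bootstrap: docker
    From: ubuntu:20.04

    %post
        apt-get update && apt-get install -y build-essential
    EOF

    # Build a single-file image from the def file (needs root/sudo)
    sudo singularity build myapp.sif myapp.def

    # To expand an existing image, build a writable sandbox and open a shell in it
    sudo singularity build --sandbox myapp_sandbox/ myapp.def
    sudo singularity shell --writable myapp_sandbox/

    # Import/convert a Docker image directly
    singularity pull docker://ubuntu:20.04

    # Share the prebuilt image: it is an ordinary file, so just copy it
    scp myapp.sif user@cluster:~/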

E. Deployment

The following steps are followed for the deployment of the two:

Docker

  • Docker can be deployed from a single package installation.

  • Then, run the daemon service and the containers; in Docker these run with root privileges, so possible vulnerabilities must be guarded against.

  • The user can then run containers in batches (see the sketch after this list).
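A minimal deployment sketch on a Debian/Ubuntu-style host (the package name docker.io and the image names are assumptions; other distributions package Docker differently):

    # Single-package installation
    sudo apt-get install -y docker.io

    # Start the daemon, which runs with root privileges
    sudo systemctl start docker

    # Run containers in batches (image names hypothetical)
    for img in myapp:1.0 mydb:2.3; do
        sudo docker run -d "$img"
    done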

Singularity

The first step is to build a Singularity container on a local system where you have root or sudo access, i.e., a personal computer with Singularity installed. You can build the container from scratch using a recipe file, or convert containers from one format to another. Then spawn your container with an interactive shell session, and execute commands inside it. Transfer your container to the high-performance computing system where you want to run it. Finally, run the Singularity container on that system, as sketched below.
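The same steps as a command sketch (the host name, file names, and the commands run inside the container are placeholders):

    # 1. Build on a local machine where you have sudo
    sudo singularity build myapp.sif myapp.def

    # 2. Spawn an interactive shell session inside the container
    singularity shell myapp.sif

    # 3. Execute a command inside the container
    singularity exec myapp.sif cat /etc/os-release

    # 4. Transfer the image to the HPC system and run it there (no root needed)
    scp myapp.sif user@hpc.example.org:~/
    ssh user@hpc.example.org 'singularity run ~/myapp.sif'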

V. EXPERIMENT

1. Environment
  • CPU : 2 * Intel(R) Xeon(R) Gold 6240 CPU @ 2.60GHz

  • GPU : NVIDIA A100 Tensor Core GPU

  • Alya : 2021/03/09 17:46:18

  • node number : 6

  • iperf3 : 3.1.3

  • IOzone : 3.494

2. Test Cases

We use Alya to test performance. Alya is a simulation code for high-performance computational mechanics. Alya solves coupled multiphysics problems using high-performance computing techniques for distributed- and shared-memory supercomputers, together with vectorization and optimization at the node level. Alya is based on Open MPI.
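Because Alya is MPI-based, containerized runs typically follow the common hybrid launch model, in which the host's mpirun starts one container process per rank and the MPI library inside the image must be compatible with the host's Open MPI. A sketch (the image name, rank count, and Alya invocation are placeholders):

    # Host Open MPI spawns the ranks; each rank runs Alya inside the container
    mpirun -np 96 singularity exec alya.sif Alya.x sphere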

Small benchmarks

(1)

  • Reynolds number

  • Mesh refinement

  • Domain boundaries

  • Drag and lift forces

(2)

  • Steady state flow

  • Domain boundaries

  • Convergence of the case

Large benchmarks

Flow over a sphere

  • Mesh: 16.7M elements, 2.9M nodes

  • Modules: Nastin

  • Physics: turbulent flow over a sphere

  • Numerical model: Vreman turbulence model, convective term using the EMAC scheme

  • Solution strategy: fractional step with Runge-Kutta of order 3

  • Algebraic solvers: CG

Network benchmarks
  • iperf3 -u -c 192.168.1.1 -b 5M -P 30 -t 60 (single thread)

  • iperf3 -u -c 192.168.1.1 -b 100M -d -t 60 (multiple threads; server setup sketched below)
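These are client-side commands, so a matching iperf3 server must already be listening at 192.168.1.1; to measure container overhead, the same client can be launched inside each container (the image names net.sif and mynet are assumptions):

    # On the server node
    iperf3 -s

    # Client inside a Singularity container
    singularity exec net.sif iperf3 -u -c 192.168.1.1 -b 5M -P 30 -t 60

    # Client inside a Docker container (host networking avoids extra NAT overhead)
    docker run --rm --net=host mynet:latest iperf3 -u -c 192.168.1.1 -b 5M -P 30 -t 60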

IOzone benchmarks
  • sequential write and read (example invocations follow this list)

  • random write and read

  • STREAM
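For reference, IOzone selects tests with the -i flag; below is a sketch of invocations matching the cases above, plus a STREAM memory-bandwidth run (record size, file size, and paths are assumptions):

    # Sequential write/rewrite (-i 0) and read/reread (-i 1)
    iozone -i 0 -i 1 -r 4k -s 1g -f /tmp/iozone.tmp

    # Random read/write (-i 2); -i 0 is included so the test file exists first
    iozone -i 0 -i 2 -r 4k -s 1g -f /tmp/iozone.tmp

    # STREAM: compile and run (stream.c from the STREAM distribution)
    gcc -O3 -fopenmp stream.c -o stream && ./stream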

3. Test Results

[Figures: Alya results; network results; IOzone results]

4. Conclusions

Traditional software integration and deployment on HPC systems is a time-consuming activity that relies on manual installation by system administrators and presents issues that make it very inflexible. The main issue with the selected containerization technology, Singularity, lies in its integration with Open MPI, and several general approaches for enabling MPI containers on HPC clusters have been proposed. We analyzed CPU, RAM, I/O, latency, and network usage. The values obtained were very close to those of native execution, so the loss is negligible. Moreover, considering the time saved by this novel form of software deployment, this minimal loss is more than justified.

VI. COMPARISON OF CONTAINERS

The following table shows a comparison of features other than performance:

Feature                      | Docker                                  | Shifter                                   | Singularity
Privilege model              | Root daemon                             | SUID                                      | SUID, UserNS
Application scenarios        | Microservices, enterprise applications  | HPC with lots of existing Docker programs | Wide variety of operating systems and applications
Network access               | Network namespace                       | Transparent                               | Transparent
No additional network config | ×                                       | ✓                                         | ✓
Native support for GPU       | ×                                       | ×                                         | ✓
Native support for MPI       | ✓                                       | ✓                                         | ✓
Works with all schedulers    | ×                                       | ×                                         | ✓
Trivial HPC install          | ×                                       | ✓                                         | ✓
Network virtualization       | ✓                                       | ×                                         | ×
Cross-platform               | ×                                       | ×                                         | ✓

Now let's look at what each row means.

Privilege model: the permission model used by the container. Docker is very strict because it requires root privileges. Shifter supports the SUID (Set-UID) permission model, i.e., access, modification, deletion, and other permissions are set per file. Singularity supports not only SUID but also user namespaces, so permissions can be managed by region.

Application scenarios: Docker is commonly used for microservices and enterprise applications. Shifter suits scenarios where a large number of Docker programs are already installed but there is no root access, as is often the case on HPC servers. Singularity is compatible with a wide variety of operating systems and applications.

Network access: Docker has a dedicated namespace for network access, while Shifter and Singularity are transparent to network access, i.e., any user can access all the data in the container over the network, provided they know the username and password.

Network config: Docker requires additional network configuration, while Shifter and Singularity do not.

Native GPU: only Singularity supports GPUs natively.

Network virtualization: only Docker supports network virtualization.

Cross-platform: only Singularity is cross-platform, which means an image file made on a Linux system can run normally on Windows or Mac platforms.


Here are some suggestions for container selection:

  1. Docker has high permission requirements, demanding root privileges. Shifter uses the classic SUID approach to manage permissions. Singularity supports not only SUID but also user-namespace permissions.

  2. Singularity and Shifter are transparent to network users, which increases convenience at some security risk.

  3. Singularity requires no additional network configuration, supports GPUs natively, is compatible with all common schedulers, is easy to install on HPC systems, and is cross-platform.

  4. Only Docker supports network virtualization; keep this in mind if your workload requires it.

In addition, Singularity currently has a small user base compared to Docker, so it may not be easy to find a record of someone having already encountered a similar problem. This also matters for container selection.

To sum up: Docker is recommended when performance demands are modest or when gaining container-related experience is the goal; Shifter is recommended if you want to deploy a large number of Docker applications without root access; and Singularity is recommended for quick, easy use or for better performance.

Of course, all of these containers interoperate well, and we don't have to choose just one of them. For example, in the common client-server model, we can use Singularity on the root-restricted, performance-sensitive server side and install Docker on the user-friendly client side.

REFERENCES

[1] M. G. Xavier, M. V. Neves, F. D. Rossi, T. C. Ferreto, T. Lange, and C. A. F. D. Rose, “Performance evaluation of container-based virtualization for high performance computing environments,” in 2013 21st Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, Feb 2013, pp. 233–240.

[2] D. Beserra, E. D. Moreno, P. T. Endo, J. Barreto, D. Sadok, and S. Fernandes, “Performance analysis of lxc for hpc environments,” in 2015 Ninth International Conference on Complex, Intelligent, and Software Intensive Systems, July 2015, pp. 358–363.

[3] A. J. Younge, K. Pedretti, R. E. Grant, and R. Brightwell, “A tale of two systems: Using containers to deploy HPC applications on supercomputers and clouds,” in Cloud Computing Technology and Science (CloudCom), 2017 IEEE Int. Conference on. IEEE, 2017, pp. 74–81.

[4] C. Ruiz, E. Jeanvoine, and L. Nussbaum, “Performance evaluation of containers for HPC,” in European Conference on Parallel Processing. Springer, 2015, pp. 813–824.

[5] O. Rudyy, M. Garcia-Gasulla, F. Mantovani, A. Santiago, R. Sirvent and M. Vázquez, "Containers in HPC: A Scalability and Portability Study in Production Biological Simulations," 2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 2019, pp. 567-577, doi: 10.1109/IPDPS.2019.00066.

[6] I. Jimenez, M. Sevilla, N. Watkins, C. Maltzahn, J. Lofstead, K. Mohror, A. Arpaci-Dusseau, and R. Arpaci-Dusseau, “The popper convention: Making reproducible systems evaluation practical,” in Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2017 IEEE International. IEEE, 2017, pp. 1561–1570.

[7] D. Merkel, “Docker: lightweight linux containers for consistent development and deployment,” Linux Journal, vol. 2014, no. 239, p. 2, 2014.

[8] G. M. Kurtzer, V. Sochat, and M. W. Bauer, “Singularity: Scientific containers for mobility of compute,” PLOS ONE, vol. 12, no. 5, pp. 1–20, 05 2017.

[9] R. Priedhorsky and T. C. Randles, “Charliecloud: Unprivileged containers for user-defined software stacks,” Los Alamos National Laboratory, Tech. Rep., 2016.
