Center-wide support for, and R&D on, containers helps researchers compute with ease at TACC and beyond.
Researchers who use supercomputers for science typically don't limit themselves to a single system. They move their jobs to whatever systems are available, often using many different systems simultaneously: in their lab, on their campus cluster, at advanced computing centers like TACC, and in the cloud.
It's not a lack of loyalty; it's simply a fact of research life. Opportunities shift, and hungry scientists find what they need to get their research done.
The systems researchers use aren't necessarily the same, though. They may have different hardware with different architectures, and different compilers or libraries.
This opportunistic computing paradigm creates a great deal of extra work for computational scientists and system administrators: adapting old codes to run on new systems, or installing software packages many times over. And as operating systems evolve, supporting code developed on deprecated environments becomes a reproducibility problem.
In recent years, a new approach has emerged. Generically known as "containers," it involves a form of isolation in which a researcher's code is packaged together with all of its software dependencies so that it can run at many sites without recompilation. By bundling an application's many dependencies into self-contained images, containers sidestep a host of problems.
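As a concrete illustration, the packaging step can be as simple as a short recipe that bakes an application and its dependencies into one image. The sketch below is a minimal, hypothetical example; the script name and package list are illustrative, not drawn from any project mentioned in this article:

```shell
# Write a minimal Dockerfile: base OS, dependencies, and the researcher's code.
cat > Dockerfile <<'EOF'
FROM python:3.8-slim
RUN pip install --no-cache-dir numpy scipy   # dependencies baked into the image
COPY analysis.py /app/analysis.py            # the researcher's own code
CMD ["python", "/app/analysis.py"]
EOF

docker build -t my-analysis .   # build the self-contained image once...
docker run --rm my-analysis     # ...then run it unchanged on any Docker host
```

Because the image carries its own operating-system userland and libraries, the host only needs a container runtime, not the application's dependency stack.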
Popularized by Docker in 2013, containers were quickly adopted in both industrial and academic scientific computing. TACC was an early adopter and began enabling containerized science in 2016, first through Docker and more recently through Singularity, which was released in 2015 by a team of researchers at Lawrence Berkeley National Laboratory and is especially well suited to high-performance computers, which have tens of thousands of tightly coupled processors.
Among the users of containers on TACC systems are Thomas Hughes, professor of Aerospace Engineering and Engineering Mechanics at The University of Texas at Austin and a member of the National Academies of Sciences and Engineering, and David Kamensky, a former member of Hughes' team who is now an assistant professor at the University of California, San Diego. The pair use containers to build predictive models of coronary artery flow and to study turbulence.
"The reason we started using containers was to run the numerical PDE [partial differential equation] software FEniCS on Stampede2," said Kamensky. "FEniCS is a complex piece of software with many dependencies, and it can be difficult to install."
When they needed to perform isogeometric analysis on top of FEniCS, they converted a Docker image maintained by the FEniCS project team into a Singularity image and ended up using more than 1,000 node hours on Stampede2.
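Conversions like this are typically a single command: Singularity can pull a Docker image from a registry and repackage it in its own format. A sketch, with an image tag and script name that are illustrative rather than the exact ones the team used:

```shell
# Convert a Docker image of FEniCS from a public registry into a
# Singularity image file (.sif in Singularity 3.x).
singularity build fenics.sif docker://quay.io/fenicsproject/stable:latest

# Run a FEniCS script inside the container, e.g. on a compute node.
singularity exec fenics.sif python3 poisson_demo.py
```

No root access or local Docker daemon is required for the run step, which is part of what makes Singularity a good fit for shared HPC systems.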
Collaborating with John Evans at the University of Colorado, Boulder (CU Boulder) on a turbulence modeling study, they were able to switch easily from Stampede2 to CU Boulder's cyberinfrastructure because of containerization.
"I don't see it as practical for supercomputer centers to maintain and debug all the different software needed to meet every scientist's requirements," Kamensky said. "With Singularity, they only need to maintain one piece of software."
Sharon Glotzer, a professor of chemical engineering at the University of Michigan and a member of both the National Academy of Sciences and the National Academy of Engineering, also uses Singularity on TACC and several other facilities to study how the building blocks of matter transition from fluid to solid, in order to better understand how to design new materials.
In particular, her team uses molecular simulations to study the assembly behavior of large numbers of hard particles of various shapes, using HOOMD-blue, a general-purpose particle simulation toolkit.
"We make use of compute resources on XSEDE, including Stampede2, Comet, and Bridges; Summit at the Oak Ridge Leadership Computing Facility; and local clusters," said Joshua Anderson, a research area specialist who builds and maintains the container images for Glotzer's group. "Singularity containers let us use the same software environment across all of these systems so that we can easily move our workflows between them."
Transitioning between systems isn't always trivial. Moving workflows that use the Message Passing Interface (MPI) to harness the parallel computing power of supercomputers remains the biggest challenge researchers face when using containers on different clusters.
"Each requires its own compatible MPI and user-space driver stack inside the container," Anderson said. To address this, Anderson builds specialized images for each system based on the same container recipe. In 2019, the team used more than 5,300 node hours on Stampede2 and many more on other systems.
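The usual pattern for containerized MPI jobs is the "hybrid" model: the host's MPI launcher starts the ranks, and each rank executes inside the container, which is why the container's MPI must be compatible with the host's. A hedged sketch, with an image name and script that are illustrative rather than the group's actual files:

```shell
# Hybrid MPI model: the host mpirun launches the ranks; each rank runs
# inside the container. The MPI library inside the image must be
# ABI-compatible with the host MPI stack, the constraint Anderson describes.
mpirun -np 128 singularity exec hoomd.sif python3 run_simulation.py
```

Keeping one container recipe and swapping only the MPI/driver layer per system, as Anderson does, preserves a reproducible software environment while satisfying each cluster's interconnect requirements.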
Michael Gurnis, professor of Geophysics and director of the Seismological Laboratory at Caltech, uses containers on Stampede2 to build computational models of subduction and mantle flow, carry out large parallel grid searches, and explore the parameter space with first-order influence on the evolution of the Earth, using the geodynamic code Underworld2.
Configuring Underworld2 as a Docker image allowed Gurnis's team to bypass the installation of dependent packages, configure the environment to their needs, and easily run the code on Stampede2.
Ian Wang, a Research Associate in the HPC Performance & Architectures group at TACC, worked with the developer of the Underworld software to containerize the tool. "I think this is the first application that runs at very large scale inside Singularity containers using MPI," Wang said. "The users and the developers actually helped me identify a bug in Singularity that only appears in large-scale MPI runs."
Gurnis's team has used 11,000 node hours in containers so far and expects to continue. "With the pulled image and Singularity, we can bypass the troublesome installation of the relevant packages and configuration of the environment. Containerization makes it easy to install and run a large code on Stampede2," said Yida Li, a graduate student in Gurnis' group.
Containers Support Community Computing Efforts
Biology and bioinformatics are two of the leading communities to adopt containerization. Both disciplines are relative newcomers to the HPC world, and the development of new codes and tools in the field has been fast and furious. This has led to some problems.
"A recent study found that, of the bioinformatics software published in the last ten years, 50 percent could not be installed," said Greg Zynda, a bioinformatician and member of the Life Sciences Computing group at TACC. "They were designed for legacy operating systems and can't be run on today's supercomputers. We're trying to fix that problem using containers."
Zynda has led an effort at TACC to make 16,000 bio-containers available on TACC's supercomputers, eliminating the need for life science researchers to package and maintain every piece of software in the field.
Two of TACC's largest collaborative software services projects, the DARPA Synergistic Discovery and Design (SD2E) project and CyVerse, also leverage containers extensively.
SD2E uses automation and machine learning to accelerate discovery in areas where the underlying model is not well understood. The project brings together research labs and companies from around the U.S. with complementary skills and streamlines their interactions using cyberinfrastructure at TACC, including: a centralized data repository and data catalog for sharing, provenance, and discovery; the Tapis APIs to enable automated, event-driven data analysis on both cloud and HPC back-end hardware; and a developer ecosystem of command-line tooling, version control, and continuous integration.
"The platform is very powerful and flexible, and almost every component uses Docker containers as a core building block," said John Fonner, who manages Emerging Technologies in TACC's Life Sciences Computing group. "The analysis tools, persistent services, and even the APIs themselves are almost entirely composed of containers."
CyVerse is an NSF-funded project that helps life scientists use cyberinfrastructure to manage large datasets and complex analyses, thereby enabling data-driven discovery. CyVerse includes a data storage facility; an interactive, web-based analytics platform; and cloud infrastructure for computation, analysis, and storage, much of it built and maintained at TACC.
"Docker containers are the main way CyVerse researchers integrate custom apps into the platform," said Fonner. "This includes non-interactive cloud and HPC apps, as well as interactive 'VICE' apps such as Jupyter Notebooks."
Though containers are often treated as a panacea, there are trade-offs to using them, Zynda says. Containers are not always as optimized for performance as they could be, and once they are created, they are static and not easy to modify.
"But once a container is published and can solve a problem, maybe it doesn't need to change," he said.
TACC doesn't just enable the use of containers; it is also active in building tools that make containers easier to use. One example is Rolling Gantry Crane (RGC), named after the machines that offload containers from ships in harbors. RGC integrates containers into TACC's environment module system (Lmod) to enable familiar interactions, essentially making the container system transparent.
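The end-user experience this aims for might look like an ordinary Lmod workflow, with the container machinery hidden behind a wrapper module. The session below is hypothetical; the module names are illustrative, not RGC's actual catalog:

```shell
# Hypothetical Lmod session: a containerized tool exposed as a module.
module load tacc-singularity    # site-specific module names are illustrative
module load samtools            # wrapper module backed by a container image
samtools --version              # the wrapper runs the command in the container
```

The researcher types the same `module load` and tool commands they already know, while the module's wrapper scripts invoke the container runtime behind the scenes.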
TACC also trains researchers to use containers through frequent workshops, webinars, and integration into TACC's Institutes.
"We believe that software containers are an important part of reproducible computing," Fonner said. "Containers are supported on all our HPC clusters and in our cloud infrastructure. Internally, we use Docker heavily in our standard development practices, and we deploy images to compute resources using both Docker and Singularity interchangeably. In a short time, containers have become a central part of how we support science at TACC."
Prepared by Aaron Dubrow