
Container Engine Components

In this section, we'll explore the essential objects and Docker commands you'll use to manage containers. Understanding these objects and commands will help you effectively create, manage, and deploy your scientific applications in a containerized environment.

Container Objects

Here we explain the primary objects you'll encounter when working with containers, helping you build a foundational understanding of their roles and interactions.

Images

An image is a lightweight, standalone, and executable software package that includes everything needed to run a piece of software, including the code, a runtime, libraries, environment variables, and configuration files. Images are the building blocks of containers and can be layered to include additional functionality.

Containers

A container is a running instance of an image. It can be thought of as an isolated environment in which applications run. You can have multiple containers based on the same image, each with its own environment. Containers are built on a layered architecture, allowing for efficient updates and consistency.

Volumes and Mounts

Volumes provide persistent storage for data that a container generates or needs to access, and they exist independently of the container's lifecycle, so the data is preserved even after the container is removed. The -v or --volume flag attaches storage to a container, either as a named volume managed by the engine or as a bind mount that maps a host path to a container path.
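
As a minimal sketch, assuming the almalinux:9.5 image used elsewhere in this section (the host path and volume name are placeholders):

docker volume create mydata                                       # named volume managed by the engine
docker run --rm -v mydata:/data almalinux:9.5 touch /data/results.txt
docker run --rm -v /path/on/host:/data almalinux:9.5 ls /data     # bind mount of a host directory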

Layers

Layers are the foundation of how container images are constructed, shared, and reused. Each instruction in a Dockerfile (e.g., RUN, COPY, ENV) generates a new layer that builds on top of the previous one. Think of these layers as building blocks—each representing a file system delta that adds or changes files from the previous layer.

Why Layers Matter

  • Efficiency: Layers allow container engines to cache parts of the build process. If nothing changes in a particular layer, it doesn’t need to be rebuilt.
  • Reusability: Common base layers (like alma9.5) can be shared across many images, reducing storage and network overhead.
  • Transparency: You can inspect image layers to see exactly what’s included and in what order.
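
For instance, you can list the layers of any local image, the instruction that created each one, and its size with docker history (the image name here is simply the base image used elsewhere in this section):

docker history almalinux:9.5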

How Layers Work

When you build a container image:

  1. The first instruction (FROM) pulls in the base image's layers (e.g., alma9.5).
  2. Each subsequent instruction (e.g., RUN dnf install, COPY mycode) creates a new intermediate layer.
  3. These layers are stacked to form a final image, which is the sum of all changes.

Layers are read-only once built. When a container runs, a final writable layer (called the container layer) is added on top, which captures all changes made during runtime (e.g., temporary files or logs).
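
To see the writable container layer in action, you can compare a running container against its image with docker diff (the container name demo and the file path are arbitrary):

docker run -d --name demo almalinux:9.5 sleep 300   # start a short-lived container
docker exec demo touch /tmp/scratch.txt             # this change lands in the writable layer
docker diff demo                                    # lists files added (A) or changed (C) at runtime
docker rm -f demo                                   # removing the container discards those changes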

💡 By minimizing the number of layers and grouping commands (e.g., combining RUN steps), you can keep your images smaller and builds faster.

Example of Layer Creation

FROM almalinux:9.5      # Base layer
RUN dnf install -y git  # New layer with git installed
COPY . /app             # New layer adding local files to /app
ENV APP_ENV=production  # Metadata layer
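
Following the tip above, consecutive RUN instructions can be merged so they produce a single layer; the extra packages shown here are purely illustrative:

# One layer instead of several; cleaning the dnf cache in the same layer keeps the image small
RUN dnf install -y git cmake gcc-c++ && \
    dnf clean all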

Docker Command Overview

When working with containers, it's essential to understand how to interact with container images and container registries. Images are typically pre-built and stored in registries, ready for you to pull and use. We have a pre-built jlab-base:alma9.5 container image based on AlmaLinux/9-base:9.5, which we'll use in the next section.

⚠️ It's crucial to be cautious and ensure you're pulling images from trusted sources, whether public repositories like Docker Hub or an in-house container registry integrated with GitLab.

Pull: Retrieving Images

The docker pull command downloads a container image from a registry to your local system. Image names follow the convention of an optional registry/repository prefix, an image name, and a tag, with the tag separated from the name by a colon; if the tag is omitted, latest is assumed.

docker pull <repository_name>/<image_name>:<tag>
  • Example: docker pull ubuntu:latest
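
As a rough illustration, pulling the pre-built JLab base image mentioned above would look something like the following; the exact namespace under codecr.jlab.org depends on where the image is published:

docker pull codecr.jlab.org/<namespace>/jlab-base:alma9.5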

Why Pull?

  • Ensure Consistency: Retrieve specific versions to maintain consistent environments across different systems.
  • Prepare Environments: Quickly set up necessary environments for applications or analysis.

Further details on the storage and setup of these images are covered in the Container Engine Setup section.

Push: Uploading Images

Use the docker push command to upload your locally modified container image to a registry. This is typically done after adding new layers or customizing an existing image, like creating a personalized version of jlab-base:alma9.5.

docker push <repository_name>/<image_name>:<tag>
  • Example: docker push codecr.jlab.org/myimage:latest
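
In practice, the local image usually needs to be tagged with the registry prefix before it can be pushed, and you may need to authenticate first (the image name and namespace below are placeholders):

docker login codecr.jlab.org                                      # authenticate to the registry
docker tag myimage:latest codecr.jlab.org/<namespace>/myimage:latest
docker push codecr.jlab.org/<namespace>/myimage:latest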

Why Push?

  • Share and Collaborate: Easily share your customized images with collaborators.
  • Archive Versions: Preserve a history of image versions for analysis reproducibility. This is essential for analysis preservation, allowing others to run your container if they have access to the registry.

Remember, it's essential to separate software and data configurations, as discussed in the why-containers section.

ℹ️ Jefferson Lab maintains a container registry accompanying our GitLab instance. You can use your private namespace to store your images and even set up continuous integration to build your containers.
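
As a rough sketch of such a continuous-integration job, a .gitlab-ci.yml along the following lines could build and push an image on every commit; this assumes a Docker-in-Docker capable runner, and the details of the JLab GitLab setup may differ:

build-image:
  image: docker:24
  services:
    - docker:24-dind
  script:
    - docker login -u "$CI_REGISTRY_USER" -p "$CI_REGISTRY_PASSWORD" "$CI_REGISTRY"
    - docker build -t "$CI_REGISTRY_IMAGE:latest" .
    - docker push "$CI_REGISTRY_IMAGE:latest"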

Build: Creating Images

The docker build command is used to create a new container image from a Dockerfile, giving you control over the environment and dependencies. You will typically build on top of a pre-existing container provided by a trusted vendor. In the Building Containers section, we will build on top of the jlab-base:alma9.5 container available in the codecr.jlab.org container registry.

docker build -t <image_name>:<tag> <path_to_build_context>
  • Example: docker build -t myapp:v1 .

The last argument is the build context directory (often just .); by default, Docker looks for a file named Dockerfile in that directory, or you can point to a different one with -f.
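
For example, a minimal Dockerfile that extends the jlab-base image could look like this; the registry namespace and the installed packages are only placeholders:

FROM codecr.jlab.org/<namespace>/jlab-base:alma9.5   # start from the pre-built JLab base image
RUN dnf install -y python3 && dnf clean all          # add your analysis dependencies
COPY . /app                                          # copy your local code into the image
WORKDIR /app

Running docker build -t myanalysis:v1 . from the directory containing this Dockerfile produces a local image named myanalysis:v1.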

Why Build?

  • Customize Environments: Tailor images to specific dependencies and settings for your application.
  • Automate Setups: Use Dockerfiles for automated and consistent environment configurations.

Run: Starting a Container

The docker run command creates and starts a container from an image. A container represents a running instance of the image, providing an isolated environment for your application.

docker run [OPTIONS] <image_name>
  • Example: docker run -it ubuntu:latest

Key Options:

  • -it: Run the container interactively (-i) with a pseudo-TTY (terminal) attached (-t).
  • -d: Run container in detached mode.
  • -v or --volume: Bind mount a volume (e.g., -v /host/path:/container/path).
  • -p or --publish: Publish a container port to the host (e.g., -p 8080:80).
  • --rm: Automatically remove the container when it exits, which helps keep containers ephemeral.
  • --name: Assign a name for easier identification.
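
Putting several of these options together (the image, paths, and names are illustrative):

docker run -it --rm --name mytest \
  -v /path/on/host:/work \
  -p 8080:80 \
  ubuntu:latest /bin/bash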

Why Run?

  • Test Locally: Validate changes in a controlled environment.
  • Isolate Applications: Ensure applications run without interfering with the host, preserving the environment's integrity.

ℹ️ Establishing an entry point with a script like entrypoint.sh can streamline starting and entering a container built from an image. Keeping container modifications minimal during runtime enhances reliability and reproducibility.
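
A minimal sketch of this pattern; the setup script path is hypothetical and not part of any image discussed here:

#!/bin/bash
# entrypoint.sh — prepare the environment, then hand off to the requested command
set -e
source /opt/myapp/setup.sh   # hypothetical environment setup script
exec "$@"

The corresponding Dockerfile lines would copy the script in and register it:

COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
CMD ["/bin/bash"]

With this in place, docker run -it <image_name> drops you into a shell with the environment already set up, and any other command passed to docker run is executed inside that same environment.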

Further Reading and Resources

  • Man Command: Access manual pages with man docker to explore command usage.
  • Help Options: Use docker <subcommand> --help for comprehensive command options.
  • Podman Documentation: Podman Official Docs
  • Current Version: Check your Podman version with podman --version.

For more detailed instructions and examples, consult primary sources linked above. These resources offer deeper insights and extended command options.