Docker Image registry, naming and tagging

Sep 28, 2024

1. Introduction

Images are stored either locally on the Docker host or in centralized services called registries. Docker Hub is the most popular registry for storing images. Most modern registries implement the OCI distribution-spec, so they are sometimes called OCI registries. A registry’s role is to securely store container images and make them easy to access from different environments. Some registries also provide advanced features such as image scanning and integration with build pipelines. The Docker client uses Docker Hub as its default registry.

Use the docker images command to list all images stored locally.

C:\>docker images
REPOSITORY        TAG       IMAGE ID       CREATED        SIZE
simple-node-app   latest    328c28b10f66   35 hours ago   917MB
ubuntu            latest    b1e9cef3f297   4 weeks ago    78.1MB
redis             latest    7e49ed81b42b   2 months ago   117MB

An image registry contains repositories, and each repository contains one or more images. The following figure shows this hierarchy:

2. Official repositories

Docker Official Repositories are a curated collection of container images that Docker, Inc. reviews and publishes on Docker Hub, often in collaboration with the upstream software maintainers. These repositories provide trusted images for various software applications, tools, and operating systems, so that users can access reliable and up-to-date versions.

2.1 Key Features of Docker Official Repositories

Trusted Sources: Official repositories are reviewed and vetted by Docker, which makes their images more trustworthy than arbitrary community images.
Regular Updates: They are regularly updated to include the latest security patches and features.
Standardized Naming: Official images typically follow a standardized naming convention, making them easy to find and use.
Documentation: Each official image comes with documentation that provides details on usage, configuration, and best practices.
Variety of Applications: The repositories include a wide range of images, from popular programming languages (like Python, Node.js, and Java) to databases (like MySQL and PostgreSQL) and even operating systems (like Ubuntu and Alpine).

Docker official images have a green “Docker official image” badge as shown below:

If a repository isn’t official, its images have not been vetted by Docker and may not be safe. Check who publishes an unofficial image and what it contains before you trust and run it.

3. Image naming and tagging

To pull an image from an official repository, use the following command:

docker pull <repository>:<tag>

If no tag is provided, Docker assumes the tag latest. The latest tag is only a default name: it does not guarantee that the image is the most recent one in the repository.

The following are a few examples:

$ docker pull mongo:4.2.24
# pulls from the official 'mongo' repository with the tag 4.2.24

$ docker pull alpine
# pulls from the official 'alpine' repository with the tag latest

The command to pull an unofficial image is:

docker pull <Docker username or org name>/<image name>:<tag>

To pull an image from a third-party registry (not Docker Hub), prefix the repository with the registry's host name, without a scheme such as https://:

docker pull <registry_url>/<repository>/<image_name>:<tag>
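The three pull forms above differ only in which parts of the image reference are written out and which are filled in by default. The following Python sketch illustrates those defaults. It is a simplified model, not Docker's actual parser: the function name parse_image_ref and the ImageRef type are hypothetical, the host-detection rule (a first component containing "." or ":" or equal to localhost) is an assumption that covers common cases, and digest references (@sha256:...) are not handled.

```python
from typing import NamedTuple


class ImageRef(NamedTuple):
    registry: str
    repository: str
    tag: str


def parse_image_ref(ref: str) -> ImageRef:
    """Split an image reference into registry, repository and tag (simplified)."""
    first, slash, rest = ref.partition("/")
    # Assumption: the first component is a registry host only if it looks like one.
    if slash and ("." in first or ":" in first or first == "localhost"):
        registry, remainder = first, rest
    else:
        registry, remainder = "docker.io", ref  # Docker Hub is the default registry

    # The tag, if present, follows the ':' in the final path component.
    head, sep, last = remainder.rpartition("/")
    name, colon, tag = last.partition(":")
    repository = f"{head}/{name}" if sep else name

    # Official images on Docker Hub live under the 'library' namespace.
    if registry == "docker.io" and "/" not in repository:
        repository = f"library/{repository}"

    return ImageRef(registry, repository, tag if colon else "latest")


print(parse_image_ref("alpine"))
print(parse_image_ref("mongo:4.2.24"))
print(parse_image_ref("ghcr.io/owner/app:1.0"))
```

Here ghcr.io/owner/app is only a placeholder for a third-party registry path. The point of the sketch is that alpine and docker.io/library/alpine:latest name the same thing once the defaults are applied.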

4. Image tags

An image tag in Docker is a label that identifies a specific version or variant of an image. A single image can have many tags. Tags are arbitrary strings made of letters, digits, underscores, periods, and hyphens (up to 128 characters, not starting with a period or hyphen), stored as metadata alongside the image. Tags help manage and differentiate between builds of an image, allowing users to pull, run, or manage specific versions of applications and their dependencies.
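As a rough illustration of what counts as a valid tag, the sketch below checks a string against the tag grammar used by the Docker image reference format: a letter, digit, or underscore, followed by up to 127 letters, digits, underscores, periods, or hyphens. The helper name is_valid_tag is hypothetical, and the check only covers syntax, not whether the tag exists in any repository.

```python
import re

# One leading word character, then up to 127 word characters, periods or hyphens.
TAG_PATTERN = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_.-]{0,127}")


def is_valid_tag(tag: str) -> bool:
    """Return True if the string is syntactically acceptable as an image tag."""
    return TAG_PATTERN.fullmatch(tag) is not None


for candidate in ["latest", "4.2.24", "v2", "-beta", "my tag"]:
    print(candidate, is_valid_tag(candidate))
```

In this sketch latest, 4.2.24 and v2 pass, while -beta (leading hyphen) and "my tag" (contains a space) are rejected.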

Key Points about Image Tags:

Format: An image tag is typically formatted as <image name>:<tag>. For example, in myapp:1.0, myapp is the image name, and 1.0 is the tag.
Default Tag: If no tag is specified when pulling an image, Docker defaults to using the latest tag.
Version Control: Tags are commonly used for version control, enabling developers to specify which version of an image they want to use (e.g., myapp:v1.0, myapp:v2.0).
Semantic Versioning: Many images use semantic versioning (e.g., 1.0.0, 1.1.0), which makes it easier to understand the changes between versions.
Environment-Specific Tags: Tags can also represent different environments, such as myapp:development, myapp:staging, or myapp:production.
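Architecture-Specific Tags: Some images have tags that indicate the CPU architecture they were built for (e.g., myapp:arm64, myapp:amd64). True multi-architecture images instead publish a single tag that points to a list of per-platform images, and Docker pulls the one that matches the host.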

5. Provide image tag

To add a tag to an existing Docker image, use the docker tag command. The syntax is:

docker tag <source_image>:<source_tag> <target_image>:<target_tag>

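Here, we add the tag v2 to the simple-node-app image. Note that both tags end up with the same IMAGE ID: a tag is just another name for the same image, so no data is copied.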

C:\>docker images
REPOSITORY        TAG       IMAGE ID       CREATED        SIZE
simple-node-app   latest    328c28b10f66   37 hours ago   917MB
ubuntu            latest    b1e9cef3f297   4 weeks ago    78.1MB
redis             latest    7e49ed81b42b   2 months ago   117MB

C:\>docker tag simple-node-app:latest simple-node-app:v2

C:\>docker images
REPOSITORY        TAG       IMAGE ID       CREATED        SIZE
simple-node-app   latest    328c28b10f66   37 hours ago   917MB
simple-node-app   v2        328c28b10f66   37 hours ago   917MB
ubuntu            latest    b1e9cef3f297   4 weeks ago    78.1MB
redis             latest    7e49ed81b42b   2 months ago   117MB
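To make the "one image, several tags" relationship explicit, the following Python sketch groups the rows of the listing above by image ID. It is only an illustration: the function name tags_by_image_id is hypothetical, and splitting columns on runs of two or more spaces is an assumption that happens to fit this table. For real scripting, the --format option of docker images is a more robust way to get machine-readable output.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Copied from the `docker images` output shown above.
LISTING = """\
REPOSITORY        TAG       IMAGE ID       CREATED        SIZE
simple-node-app   latest    328c28b10f66   37 hours ago   917MB
simple-node-app   v2        328c28b10f66   37 hours ago   917MB
ubuntu            latest    b1e9cef3f297   4 weeks ago    78.1MB
redis             latest    7e49ed81b42b   2 months ago   117MB
"""


def tags_by_image_id(listing: str) -> Dict[str, List[str]]:
    """Map each image ID to the list of repository:tag names that point at it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for line in listing.splitlines()[1:]:  # skip the header row
        if not line.strip():
            continue
        # Assumption: columns are separated by at least two spaces.
        repository, tag, image_id = re.split(r"\s{2,}", line.strip())[:3]
        groups[image_id].append(f"{repository}:{tag}")
    return dict(groups)


for image_id, names in tags_by_image_id(LISTING).items():
    print(image_id, names)
```

The ID 328c28b10f66 maps to both simple-node-app:latest and simple-node-app:v2, while the other two IDs each have a single name.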

6. Dangling image

A dangling image in Docker is an image that has no tag and is not referenced by any container. Dangling images typically appear when you rebuild an image under an existing name and tag: the tag moves to the new image, and the previous image is left behind without a tag.

Characteristics of Dangling Images:

No Tags: Dangling images have no tags associated with them, making them difficult to reference. They are usually identified by <none> in the repository and tag fields when you list images.
Unused Layers: They often represent unused layers from previous builds or failed builds, taking up disk space without serving a purpose.
Cleanup: It is generally a good practice to remove dangling images to free up space. You can do this easily with the command:
docker image prune
If you add the -a flag, Docker will also remove all unused images (those not in use by any containers).
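The way a rebuild leaves an image dangling can be sketched with a toy model in Python. This is not how Docker stores images internally: the ImageStore class and its methods are hypothetical, the second image ID is made up, and the model ignores containers entirely. It only shows that a tag is a pointer which moves to the newest build, leaving the old image with no name.

```python
from typing import Dict, Set


class ImageStore:
    """Toy model of a local image store: image IDs plus tags pointing at them."""

    def __init__(self) -> None:
        self.image_ids: Set[str] = set()
        self.tags: Dict[str, str] = {}  # "repository:tag" -> image ID

    def build(self, ref: str, image_id: str) -> None:
        """Record a new build; the tag now points at the new image ID."""
        self.image_ids.add(image_id)
        self.tags[ref] = image_id

    def dangling(self) -> Set[str]:
        """Return the image IDs that no tag points at."""
        return self.image_ids - set(self.tags.values())

    def prune(self) -> Set[str]:
        """Remove dangling images and return the IDs that were removed."""
        removed = self.dangling()
        self.image_ids -= removed
        return removed


store = ImageStore()
store.build("simple-node-app:latest", "328c28b10f66")
store.build("simple-node-app:latest", "aaaa11112222")  # hypothetical rebuild
print(store.dangling())  # the first build no longer has a tag
print(store.prune())
```

After the second build, the first image ID is the only dangling entry, and pruning removes it while leaving the tagged image in place.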

7. Image hashes (digests)

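In an image, each layer is independent of the other layers. Each layer is identified by a cryptographic ID that is a hash of the layer's content. Changing the content of the image or of any of its layers changes the corresponding hash, so comparing hashes reveals whether anything has changed. The image as a whole also has such a hash, called its digest, and an image can be pulled by digest instead of by tag with docker pull <repository>@sha256:<digest>.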
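The idea of identifying content by its hash can be shown in a few lines of Python. This is a minimal sketch using SHA-256 from the standard library: the helper name content_digest and the sample byte strings are made up for illustration, and real layers are tar archives rather than short strings.

```python
import hashlib


def content_digest(data: bytes) -> str:
    """Return a digest string in the 'sha256:<hex>' form."""
    return "sha256:" + hashlib.sha256(data).hexdigest()


original = b"layer contents v1"
modified = b"layer contents v2"  # a one-character change

print(content_digest(original))
print(content_digest(modified))
print(content_digest(original) == content_digest(modified))
```

Identical bytes always produce the same digest, and even a one-character change produces a completely different one, which is what makes the hash usable as a tamper check.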

When Docker pushes or pulls an image to/from a registry, it compresses the image layers to save network bandwidth and reduce storage requirements in the registry. This compression reduces the size of the image being transferred, making the process faster and more efficient.

However, there is an important distinction between the compressed and uncompressed versions of the image layers:

Compressed Layers: These are the layers transferred during the push/pull operation. They are optimized for efficient storage and transfer.
Uncompressed Layers: These are the layers in their original form that Docker uses when the image is being built or running locally.

7.1 Impact of Compression on Content Hashes

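Docker identifies each image layer by its content hash, which is computed over the contents of the layer. Because compression changes the bytes, the hash of a compressed layer differs from the hash of the same layer uncompressed. For this reason each layer carries two hashes: a content hash of the uncompressed layer, used locally, and a hash of the compressed layer (sometimes called the distribution hash), used to verify the layer when it is pushed to or pulled from a registry.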

7.1.1 Why Does This Happen?

Compression Changes Content: Compressing data changes its byte-level representation. The logical content is the same, and decompressing restores the original bytes exactly, but the compressed bytes differ from the uncompressed bytes.
Hashing Differences: A hash is computed over raw bytes, so the compressed form of a layer hashes to a different value than the uncompressed form, even though both represent the same layer.
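The effect is easy to reproduce outside Docker. The Python sketch below uses gzip and SHA-256 from the standard library on a made-up byte string standing in for an uncompressed layer. The use of gzip here is an assumption for illustration. The same reasoning applies to any compression format.

```python
import gzip
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hash of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


# Stand-in for an uncompressed layer; real layers are tar archives.
layer = b"example layer contents\n" * 1000
compressed = gzip.compress(layer, mtime=0)  # fixed mtime keeps the output reproducible

uncompressed_hash = sha256_hex(layer)
compressed_hash = sha256_hex(compressed)
roundtrip_hash = sha256_hex(gzip.decompress(compressed))

print("uncompressed:", uncompressed_hash)
print("compressed:  ", compressed_hash)
print("round trip matches uncompressed:", roundtrip_hash == uncompressed_hash)
```

The compressed and uncompressed hashes differ, while decompressing and hashing again reproduces the uncompressed hash, which is why both values are needed to verify a layer in transit and on disk.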

8. Conclusion

In Docker, image registries, naming conventions, and tagging are essential components that streamline the management and distribution of container images. Registries like Docker Hub allow you to store and share images across different environments, while proper naming and tagging practices help in version control and organization. Understanding how to pull images from third-party registries, tag them appropriately, and push them to repositories ensures seamless workflows for development, testing, and production environments. With these tools, you can efficiently manage your container images and maintain a reliable deployment process.