Is the CI/CD pipeline taking forever? Is it taking too long to build a container image locally? One possible reason could be the size of container images – they are often unnecessarily bloated. This article presents several strategies to optimise images and make them faster and more efficient. 🚀

No unnecessary dependencies

A grown project with numerous dependencies in pyproject.toml can quickly become confusing. Before taking the next step – containerization with Docker, for example – it is worth first checking which dependencies are still needed and which are now obsolete. This allows you to streamline the code base, reduce potential security risks, and improve maintainability.

One option would be to delete all dependencies and the virtual environment and then go through the source code file by file, adding back only the dependencies that are actually needed. The command line tool deptry offers a more efficient strategy: it takes over this tedious task and helps to quickly identify superfluous dependencies. Install it as a development dependency with:

uv add --dev deptry

The analysis can then be started directly in the project folder:

deptry .

deptry then lists the dependencies that are declared but no longer used:

Scanning 126 files...
pyproject.toml: DEP002 'pandas' defined as a dependency but not used in the codebase
Found 1 dependency issue.

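In this case, pandas no longer appears to be used. It is recommended to verify this and then remove all dependencies that are no longer needed. deptry itself can stay in the project as a dev dependency, since dev dependencies are left out of the image later on.

uv remove pandas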
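
To make the idea behind such a check more concrete, here is a minimal sketch in Python. It is not how deptry is implemented; it only illustrates the principle of comparing declared dependencies with the modules that are actually imported. The function names and the sample project are hypothetical, and the sketch assumes that a package's import name equals its distribution name, which is not always the case.

```python
import ast


def imported_top_level_modules(source: str) -> set[str]:
    """Return the top-level module names imported by one Python source string."""
    modules: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # "import a.b" counts as a use of "a"
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # "from a.b import c" counts as a use of "a";
            # relative imports (level > 0) point into the project itself and are skipped
            modules.add(node.module.split(".")[0])
    return modules


def unused_dependencies(declared: set[str], sources: list[str]) -> set[str]:
    """Return the declared dependencies that none of the sources import.

    Assumption: import name == distribution name. This does not hold for
    every package (scikit-learn, for example, is imported as sklearn).
    """
    used: set[str] = set()
    for source in sources:
        used |= imported_top_level_modules(source)
    return declared - used


# Hypothetical project: two source files and three declared dependencies.
sources = [
    "import numpy as np\nfrom requests.adapters import HTTPAdapter\n",
    "from . import helpers\nimport numpy.linalg\n",
]
print(unused_dependencies({"numpy", "requests", "pandas"}, sources))  # {'pandas'}
```

A static check like this cannot see dynamic imports or packages that are only used as plugins or command line tools, which is one reason why a reported dependency should be verified before it is removed.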

Alternative index

If your project depends on torch or torchvision, either directly or through a package such as docling or sparrow, and only needs to run on the CPU, you can skip the installation of the CUDA libraries. This is achieved by specifying an alternative index for torch and torchvision: uv then takes these two packages from that index, whose builds do not depend on the CUDA libraries. To do this, add the following entries to pyproject.toml, alongside the dependencies.

[tool.uv.sources]
torch = [
    { index = "pytorch-cpu" },
]

torchvision = [
    { index = "pytorch-cpu" },
]


[[tool.uv.index]]
name = "pytorch-cpu"
url = "https://download.pytorch.org/whl/cpu"

This is what the images look like with and without the alternative index:

REPOSITORY           TAG       IMAGE ID       CREATED              SIZE
sample_torchvision   gpu       f0f89156f089   5 minutes ago        6.46GB
sample_torchvision   cpu       0e4b696bdcb2   About a minute ago   657MB

With the alternative index, the image is only about a tenth of the size!
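
The ratio can be checked quickly from the two SIZE values in the listing. The small helper below is only an illustration and assumes that the sizes are given in decimal units (1 GB = 1000 MB).

```python
# Decimal units, assumed to match the SIZE column of the listing above.
UNITS = {"KB": 1e3, "MB": 1e6, "GB": 1e9}


def size_in_bytes(size: str) -> float:
    """Convert a size string such as '657MB' into bytes."""
    number, unit = size[:-2], size[-2:]
    return float(number) * UNITS[unit]


gpu = size_in_bytes("6.46GB")
cpu = size_in_bytes("657MB")
print(f"CPU image is {cpu / gpu:.1%} of the GPU image")  # 10.2%
print(f"Saved: {(gpu - cpu) / 1e9:.2f} GB")  # 5.80 GB
```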

The correct Dockerfile

Whether the Python project is just starting or has been around for a while, it is worth having a look at the sample Dockerfiles provided by uv: uv-docker-example.

These provide a reasonable base configuration and are optimised to create the smallest possible images. They are extensively commented and use a minimal base image with Python and uv preinstalled. Dependencies and the project are installed in separate commands, so that layer caching works optimally. Only the regular dependencies are installed, while dev dependencies such as the previously installed deptry are excluded.

In the multistage example, only the virtual environment and project files are copied to the runtime image, so that no superfluous build artefacts end up in the final image.

Bonus tip for Azure WebApp users

This tip will not reduce the size of the image, but it may save you some headaches in an emergency.

When deploying the Docker image in an Azure WebApp, neither /home nor any path beneath it should be used as WORKDIR. The /home path can be used to share data across multiple WebApp instances. This is controlled by the environment variable WEBSITES_ENABLE_APP_SERVICE_STORAGE. If this is set to true, the shared storage is mounted to /home and hides everything the image contains at that location, which means that application files placed there are no longer visible in the container.

(If the Dockerfile is based on the uv examples, then the WORKDIR is already configured correctly under “/app”.)
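
The rule can be expressed as a small check, for example as part of a deployment script. This is only a sketch: the function workdir_is_shadowed is hypothetical, and the two inputs (the WORKDIR from the Dockerfile and the value of WEBSITES_ENABLE_APP_SERVICE_STORAGE) are passed in by hand rather than read from a real configuration.

```python
from pathlib import PurePosixPath

# Mount point of the shared storage, as described above.
SHARED_MOUNT = PurePosixPath("/home")


def workdir_is_shadowed(workdir: str, storage_enabled: bool) -> bool:
    """Return True if files under workdir would be hidden by the shared /home mount."""
    path = PurePosixPath(workdir)
    return storage_enabled and (path == SHARED_MOUNT or SHARED_MOUNT in path.parents)


print(workdir_is_shadowed("/home/site/app", storage_enabled=True))  # True
print(workdir_is_shadowed("/app", storage_enabled=True))  # False
print(workdir_is_shadowed("/home", storage_enabled=False))  # False
```

Comparing path components instead of string prefixes avoids false positives for directories such as /homework, which merely start with the same characters.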