by Adam Brett

Docker Patterns - Cache Volume

The Cache Volume is a special type of Dev Volume designed for tools (specifically package managers) that don't play nice, and by play nice, I mean cache or install their libraries to the local directory. Examples of good package managers that don't require a Cache Volume are Composer and NPM, both of which install to the current directory. Bad examples would be Pip[^1] and Maven.

This matters if you have a build with multiple steps that consume your dependencies and you don't want to slow it down by installing them every time, or if you want to run your application container multiple times without re-downloading your dependencies.

Let's take a look at what that looks like in reality. Here's a simple package.json:

{
  "dependencies": {
    "express": "^4.0.0"
  }
}

Now let's run a Transient Container with a Dev Volume to install our dependencies:

docker run --rm -v ${PWD}:${PWD} -w ${PWD} node:6-alpine npm install

I won't flood you with the output of that command, but if you run an ls you should now see your node_modules directory and all of your dependencies inside. If we were to repeat that command, it would do nothing, as the dependencies already exist. Let's try that now:

[Image: docker cache npm install 2]
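If you want to compare the two runs yourself, one option is to wrap the command in a small shell function so it's easy to repeat. This is only a sketch: the function name npm_install is made up for this example, and the body is just the command from above with the paths quoted.

```shell
# Hypothetical helper (the name is an assumption, not part of Docker or npm):
# the same Transient Container + Dev Volume command as above, wrapped in a
# function so it can be run repeatedly.
npm_install() {
  docker run --rm -v "${PWD}:${PWD}" -w "${PWD}" node:6-alpine npm install
}
```

In bash, running `time npm_install` twice in a row should show the second run finishing noticeably faster, because node_modules is already sitting in the project directory on the host.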

Now let's try a slightly more complicated example using Maven. Here's a basic pom.xml from a Spring Boot application:

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.example</groupId>
    <artifactId>myproject</artifactId>
    <version>0.0.1-SNAPSHOT</version>

    <parent>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-parent</artifactId>
        <version>2.0.0.RELEASE</version>
    </parent>
</project>

Now let's follow the same process with Maven as we did with npm:

docker run --rm -v ${PWD}:${PWD} -w ${PWD} maven:3-alpine mvn install

If you run the tree command you should see that there's no sign of your packages anywhere:

.
├── pom.xml
└── target
    ├── maven-archiver
    │   └── pom.properties
    └── myproject-0.0.1-SNAPSHOT.jar

2 directories, 3 files

This is because Maven installs your packages globally, not locally for the project. This doesn't work well with the Transient Container or Tool Entrypoint patterns, because anything written outside a mounted volume is thrown away when the container exits.

Let's demonstrate this by running the same command a second time:

[Image: docker cache mvn install 2]

It repeated the whole thing! On a small project like this it didn't take too long, but it can quickly add up and really slow down a full build. This is where the Cache Volume comes into play.
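If you'd rather ask Maven where it keeps its local repository than take my word for it, the help plugin can print the setting. A rough sketch, assuming the image resolves a version of maven-help-plugin recent enough to support -DforceStdout; the function name mvn_local_repo is made up for this example:

```shell
# Hypothetical helper: ask Maven, inside the container, which directory it
# uses as its local repository. Note that this run has to download the help
# plugin first, so it is not instant.
mvn_local_repo() {
  docker run --rm -v "${PWD}:${PWD}" -w "${PWD}" maven:3-alpine \
    mvn help:evaluate -Dexpression=settings.localRepository -q -DforceStdout
}
```

Calling `mvn_local_repo` should print a path inside the container rather than anywhere under your project, which is exactly why the downloads vanish along with the container.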

The default Maven image caches its dependencies globally in /root/.m2/repository, so let's mount a directory from our project at /root/.m2 and see what happens:

docker run --rm -v ${PWD}:${PWD} -v ${PWD}/vendor/maven:/root/.m2 -w ${PWD} maven:3-alpine mvn install

A quick ls of vendor/maven and vendor/maven/repository shows that our dependencies are now cached locally in our project, just like they were with npm.

[Image: docker cache mvn install 2]

If we run the command again, we should see that the dependencies are no longer re-downloaded and the locally cached dependencies are used instead.

[Image: docker cache mvn install 3]
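Typing both -v flags every time gets old quickly, so here's one way the Cache Volume command could be wrapped up. Treat it as a sketch rather than the one true way: mvn_cached and CACHE_DIR are names I've made up for this example, and the default cache location is the vendor/maven directory used above.

```shell
# Hypothetical wrapper: run any Maven goal in a transient container with both
# a Dev Volume (the project) and a Cache Volume (Maven's ~/.m2 directory).
mvn_cached() {
  # Assumed variable: CACHE_DIR overrides where the cache lives on the host.
  cache_dir="${CACHE_DIR:-${PWD}/vendor/maven}"
  # Create the directory ourselves so the top-level directory is made by us
  # rather than by the Docker daemon. Files written inside it by the
  # container will still belong to the container's user.
  mkdir -p "${cache_dir}"
  docker run --rm \
    -v "${PWD}:${PWD}" \
    -v "${cache_dir}:/root/.m2" \
    -w "${PWD}" \
    maven:3-alpine mvn "$@"
}
```

With that defined, `mvn_cached install` does the same thing as the long command, and `CACHE_DIR=/some/other/path mvn_cached install` points the cache somewhere else if you'd rather keep it out of the project.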

There is a small problem left to solve, which we can see by running ls -l on our cache directory:

$ ls -l vendor/maven/
total 12
-rw-r--r--  1 root root  350 Mar  7 11:10 copy_reference_file.log
drwxr-xr-x 10 root root 4096 Mar  7 11:08 repository
-rw-r--r--  1 root root  327 Mar  7 11:07 settings-docker.xml

The dependencies are owned by root, because the container runs as root by default, which means we would need to use sudo to remove them. We can fix this by using the Run User pattern.
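Until you've got the Run User pattern in place, a couple of small helpers can take the edge off. Both names below (foreign_files and clean_cache) are made up for this example, and the second one assumes a default Docker setup where the container's root user maps to root on the host.

```shell
# Hypothetical helper: list everything under a directory that is not owned
# by the current user, which is a quick way to spot root-owned leftovers.
foreign_files() {
  find "$1" ! -user "$(id -u)"
}

# Hypothetical stopgap: delete the cache from inside a container. The
# container runs as root, so it can remove root-owned files in the mounted
# project directory without sudo on the host.
clean_cache() {
  docker run --rm -v "${PWD}:${PWD}" -w "${PWD}" maven:3-alpine rm -rf vendor/maven
}
```

Running `foreign_files vendor/maven` after a cached build should list the same root-owned entries that ls -l showed, and `clean_cache` gets rid of them when you want a fresh start.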

[^1]: Yes, virtualenv. I know.
