Multistage Docker builds share intermediate images

Sometimes you want a testing stage in your Dockerfile that runs unit tests and integration tests. To make that work you run docker build twice:
- for testing
docker build --target test -t image:latest .
- for the final image
docker build --target finale -t image:latest .
For the test build, you really want an ENTRYPOINT in the test stage, so that it can be run from a docker-compose file that also spins up all the other infrastructure an integration test needs: a database, caching, etc. For the final build, you don't want to run the test stage, but you do want it to depend on the build stage that the test stage also uses, so both need the build stage. The crucial point is that both docker build commands use the same build artifact, so the code that ends up in the final image is the same code that was tested.
An example Dockerfile
I have this simple Dockerfile as the root of my example:
FROM debian:bullseye AS build
RUN mkdir /src
ADD date.txt /src
FROM build AS test
WORKDIR /test
COPY --from=build /src/date.txt ./
RUN cat date.txt
FROM build AS finale
COPY --from=build /src/date.txt ./
RUN echo production
ENTRYPOINT ["cat", "date.txt"]
This assumes that I have a code file named date.txt, created with a simple date -Ins > date.txt. Here both test and finale make use of the same base target, build. If I run docker build --target=test -t time:latest . in the root of the project folder, I get:
❯ docker build --target=test -t time:latest .
Sending build context to Docker daemon 3.072kB
Step 1/7 : FROM debian:bullseye AS build
bullseye: Pulling from library/debian
001c52e26ad5: Pull complete
Digest: sha256:82bab30ed448b8e2509aabe21f40f0607d905b7fd0dec72802627a20274eba55
Status: Downloaded newer image for debian:bullseye
---> 07d9246c53a6
Step 2/7 : RUN mkdir /src
---> Running in 609c1c49df71
Removing intermediate container 609c1c49df71
---> 4cf6aabe126f
Step 3/7 : ADD date.txt /src
---> fdea526d2a8a
Step 4/7 : FROM build AS test
---> fdea526d2a8a
Step 5/7 : WORKDIR /test
---> Running in 684195c68f6e
Removing intermediate container 684195c68f6e
---> cc75641dfbea
Step 6/7 : COPY --from=build /src/date.txt ./
---> 2db94861c1e2
Step 7/7 : RUN cat date.txt
---> Running in 2824d1be97c9
2022-08-21T20:59:16,453937447+02:00
Removing intermediate container 2824d1be97c9
---> 530358e16335
Successfully built 530358e16335
Successfully tagged time:latest
It builds the two stages, build and test. No caches are involved, and when it is done the intermediate images are left behind as untagged (<none>) images, as we see here:
❯ docker image ls -a
REPOSITORY TAG IMAGE ID CREATED SIZE
time latest 530358e16335 17 seconds ago 124MB
<none> <none> 2db94861c1e2 18 seconds ago 124MB
<none> <none> cc75641dfbea 20 seconds ago 124MB
<none> <none> fdea526d2a8a 20 seconds ago 124MB
<none> <none> 4cf6aabe126f 21 seconds ago 124MB
debian bullseye 07d9246c53a6 2 weeks ago 124MB
The image IDs match: each intermediate image is built on top of the previous one, so the output of one step is the base for the next. The chain ends in the final image, which is named and tagged time:latest. If we then build the finale target with docker build --target=finale -t time:latest . we should expect it to reuse the same build stage, with ID fdea526d2a8a. Let's try:
❯ docker build --target=finale -t time:latest .
Sending build context to Docker daemon 3.072kB
Step 1/11 : FROM debian:bullseye AS build
---> 07d9246c53a6
Step 2/11 : RUN mkdir /src
---> Using cache
---> 4cf6aabe126f
Step 3/11 : ADD date.txt /src
---> Using cache
---> fdea526d2a8a
Step 4/11 : FROM build AS test
---> fdea526d2a8a
Step 5/11 : WORKDIR /test
---> Using cache
---> cc75641dfbea
Step 6/11 : COPY --from=build /src/date.txt ./
---> Using cache
---> 2db94861c1e2
Step 7/11 : RUN cat date.txt
---> Using cache
---> 530358e16335
Step 8/11 : FROM build AS finale
---> fdea526d2a8a
Step 9/11 : COPY --from=build /src/date.txt ./
---> 9d73cf14f389
Step 10/11 : RUN echo production
---> Running in 1d3004c044db
production
Removing intermediate container 1d3004c044db
---> 73ee55a9e8fd
Step 11/11 : ENTRYPOINT ["cat", "date.txt"]
---> Running in f45cf6403bf0
Removing intermediate container f45cf6403bf0
---> cfb965b72a6e
Successfully built cfb965b72a6e
Successfully tagged time:latest
Here we can see that it uses the cache: steps 2 and 3 reuse the build stage image fdea526d2a8a from the first run. So the test stage and the finale image use the same source files, which is crucial, because we NEED the finale image to contain the same code as the tested image. It would make no sense to build from source again when building the finale image. If that happened, it would render the test completely useless.
Testing
I haven't tried it yet, but it should now be possible to make a docker-compose file that uses this Dockerfile as its basis, targets the test stage, and also spins up all the infrastructure used for integration testing: databases, caches, storage, etc. If this is possible (and I can't see why it shouldn't be), then we have a nice setup for testing the code before building the finale image used in production, with all the benefits that container technology brings us.