Docker image layers lessons learned

Common misconception that deleting files during docker build at later image layers reducing final image size. For example if you have Dockerfile like below, you would think that total image size will be similar to base image since we deleted downloaded file at later stage of image build.

FROM mcr.microsoft.com/windows/nanoserver:1809
ADD ["http://ipv4.download.thinkbroadband.com/100MB.zip", "."]
RUN del 100mb.zip

To see effect on deletion had on final image size it’s good to start with just ADD statement in dockerfile

FROM mcr.microsoft.com/windows/nanoserver:1809
ADD ["http://ipv4.download.thinkbroadband.com/100MB.zip", "."]

Resulting build is in fact showing that image size increased by 100 MB compared to base image like below 

PS C:\docker\LayerTest> docker build -t layers:delete .
Sending build context to Docker daemon  2.048kB
Step 1/3 : FROM mcr.microsoft.com/windows/nanoserver:1809
 ---> a5034827da99
Step 2/3 : ADD ["http://ipv4.download.thinkbroadband.com/100MB.zip", "."]
Downloading [==================================================>]  104.9MB/104.9MB
 ---> Using cache
 ---> feba369ecb4e
Step 3/3 : RUN del 100mb.zip
 ---> Running in 3307758a70ce
Removing intermediate container 3307758a70ce
 ---> 74d6679e81cf
Successfully built 74d6679e81cf
Successfully tagged layers:delete
PS C:\docker\LayerTest> docker images
REPOSITORY                             TAG                 IMAGE ID            CREATED             SIZE
layers                                 add                 feba369ecb4e        48 seconds ago      410MB
mcr.microsoft.com/windows/nanoserver   1809                a5034827da99        2 weeks ago         305MB

As expected image increased by about 100 MB with addition of the file. Now let’s try to delete that file with `RUN DEL 100MB.zip`. Common misconception is that this will remove file inside image and hence total size will decrease back to original base image size. The results are below.

FROM mcr.microsoft.com/windows/nanoserver:1809
ADD ["http://ipv4.download.thinkbroadband.com/100MB.zip", "."]
RUN del 100mb.zip

Results of the build below which are showing that not only final image size did not decrease but it’s in fact increased by 1 MB!

PS C:\docker\LayerTest> docker build -t layers:delete .
Sending build context to Docker daemon  2.048kB
Step 1/3 : FROM mcr.microsoft.com/windows/nanoserver:1809
 ---> a5034827da99
Step 2/3 : ADD ["http://ipv4.download.thinkbroadband.com/100MB.zip", "."]
Downloading [==================================================>]  104.9MB/104.9MB
 ---> Using cache
 ---> feba369ecb4e
Step 3/3 : RUN del 100mb.zip
 ---> Running in 3307758a70ce
Removing intermediate container 3307758a70ce
 ---> 74d6679e81cf
Successfully built 74d6679e81cf
Successfully tagged layers:delete
PS C:\docker\LayerTest> docker images
REPOSITORY                             TAG                 IMAGE ID            CREATED             SIZE
layers                                 delete              74d6679e81cf        4 seconds ago       411MB
layers                                 add                 feba369ecb4e        48 seconds ago      410MB
mcr.microsoft.com/windows/nanoserver   1809                a5034827da99        2 weeks ago         305MB

You can see what was done on image layers with docker history <imgid> command like below, showing that last layer did not change anything but added 1 MB to total size.

PS C:\docker\LayerTest> docker history 74
IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
74d6679e81cf        17 minutes ago      cmd /S /C del 100mb.zip                         1.05MB
feba369ecb4e        17 minutes ago      cmd /S /C #(nop) ADD aa41da93b6f56103ecbc3fd…   105MB
a5034827da99        3 weeks ago         Install update 1809_amd64                       61.6MB
<missing>           2 months ago        Apply image 1809_RTM_amd64                      244MB

The take out of this is that you can not decrease total size of the image in any following layers, you can only INCREASE it. That means you need to keep each created layer as small possible by cleaning up file inside the same RUN statement. Example above can be rewritten as below where build process downloads files and then deletes it. Since it’s done inside single layer total size of image does not change.

FROM mcr.microsoft.com/windows/nanoserver:1809
RUN curl http://ipv4.download.thinkbroadband.com/100MB.zip --output 100MB.zip &\
    del 100MB.zip
PS C:\docker\LayerTest> docker build -t layers:curl .
Sending build context to Docker daemon  2.048kB
Step 1/2 : FROM mcr.microsoft.com/windows/nanoserver:1809
 ---> a5034827da99
Step 2/2 : RUN curl http://ipv4.download.thinkbroadband.com/100MB.zip --output 100MB.zip &    del 100MB.zip
 ---> Running in 58026752da9e
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  100M  100  100M    0     0  6023k      0  0:00:17  0:00:17 --:--:-- 7782k
Removing intermediate container 58026752da9e
 ---> b9de8a8a5077
Successfully built b9de8a8a5077
Successfully tagged layers:curl
PS C:\docker\LayerTest> docker images
REPOSITORY                             TAG                 IMAGE ID            CREATED             SIZE
layers                                 curl                b9de8a8a5077        7 seconds ago       307MB
layers                                 delete              74d6679e81cf        15 minutes ago      411MB
layers                                 add                 feba369ecb4e        16 minutes ago      410MB
mcr.microsoft.com/windows/nanoserver   1809                a5034827da99        2 weeks ago         305MB

This is especially important if you use MSI/EXE based installation in full servercore image since MSI based installer leave uninstallation/reinstallation files behind which inflate image size unneccessary. Compare common scenario below. You add MSI package to your image, you run installation, you delete MSI installer. Example below adds MariaDB installation package to base servercore image, installs it and then deletes installation file. Which is all wrong based on discussion above.

FROM mcr.microsoft.com/windows/servercore:ltsc2019
WORKDIR prep
ADD ["https://downloads.mariadb.com/Bundles/TX/mariadb-tx-3.0-10.3.11-windows.zip", "mariadb.zip"]
RUN powershell -Command Expand-Archive mariadb.zip
RUN powershell -command "Start-Process -filepath 'msiexec' -ArgumentList @('/i', 'c:\prep\mariadb\mariadb-tx-3.0-10.3.11-windows\mariadb-10.3.11-winx64.msi', '/qn') -PassThru | wait-process"
RUN del /f /s /q .

Resulting image history showing up that each layer significantly increase total image size.

IMAGE               CREATED             CREATED BY                                      SIZE                COMMENT
d26180fd56b8        25 seconds ago      cmd /S /C del /f /s /q .                        5.22MB
17b2bad012a1        3 minutes ago       cmd /S /C powershell -command "Start-Process…   482MB
bd9161ad3507        8 minutes ago       cmd /S /C powershell -Command Expand-Archive…   99.3MB
dd3ccba3a5ff        9 minutes ago       cmd /S /C #(nop) ADD 7d838d807796b908c08ade6…   71.8MB
94bfd0c4d09f        12 minutes ago      cmd /S /C #(nop) WORKDIR C:\prep                41kB
670f5c41d658        3 weeks ago         Install update ltsc2019_amd64                   509MB
<missing>           2 months ago        Apply image 1809_RTM_amd64                      3.47GB

Better version of the same process to confine entire process to single layer like below

FROM mcr.microsoft.com/windows/servercore:ltsc2019
WORKDIR prep
RUN curl "https://downloads.mariadb.com/Bundles/TX/mariadb-tx-3.0-10.3.11-windows.zip" --output mariadb.zip &  \
    powershell -command "Expand-Archive mariadb.zip" & \
    powershell -command "Start-Process -filepath 'msiexec' @('/i', 'c:\prep\mariadb\mariadb-tx-3.0-10.3.11-windows\mariadb-10.3.11-winx64.msi', '/qn') -PassThru | Wait-Process" & \
    del /f /s /q . & \

Resulting image showing that final image size of installation process decreased from 661 MB to 447 MB

IMAGE               CREATED              CREATED BY                                      SIZE                COMMENT
e727a4dd3642        About a minute ago   cmd /S /C curl "https://downloads.mariadb.co…   447MB
94bfd0c4d09f        18 minutes ago       cmd /S /C #(nop) WORKDIR C:\prep                41kB
670f5c41d658        3 weeks ago          Install update ltsc2019_amd64                   509MB
<missing>           2 months ago         Apply image 1809_RTM_amd64                      3.47GB

Now back to notice that MSI installer always leaves cleanup binaries behind. This binaries are located in c:\windows\installer folder. You can verify that they are there by checking contents of that folder inside better image like below.

PS C:\docker\LayerTest> docker run --rm e7 cmd /c dir c:\windows\installer
 Volume in drive C has no label.
 Volume Serial Number is 069E-146F

 Directory of c:\windows\installer

11/16/2018  09:05 PM        54,956,032 6ee9.msi
12/02/2018  05:00 PM            20,480 SourceHash{D02C77A8-80E6-4CA2-8028-BC2AF9BE21B1}
               2 File(s)     54,976,512 bytes

This files can deleted since no uninstallation will ever be performed inside docker. So final Dockerfile will look like below which results in extra 55MB savings.

FROM mcr.microsoft.com/windows/servercore:ltsc2019
WORKDIR prep
RUN curl "https://downloads.mariadb.com/Bundles/TX/mariadb-tx-3.0-10.3.11-windows.zip" --output mariadb.zip &  \
    powershell -command "Expand-Archive mariadb.zip" & \
    powershell -command "Start-Process -filepath 'msiexec' @('/i', 'c:\prep\mariadb\mariadb-tx-3.0-10.3.11-windows\mariadb-10.3.11-winx64.msi', '/qn') -PassThru | Wait-Process" & \
    del /f /s /q . & \
    del /f /s /q c:\windows\installer\*

By optimizing image build we went from 661MB file to 392MB with no change in functionality

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s