Understanding Drone and Docker layer caching

Hi there,

First off, thanks for the awesome product! I'm currently trying to set up my self-hosted pipeline, but am running into minor issues configuring layer caching from a private repository. Based on the article by laszlocph posted on his site and mentioned somewhere on the forum, I came up with the pipeline attached below.

My intention is to reuse layers from existing images to minimize turnaround time when working on features, especially since building wheels on a Pi is very time-consuming. However, my build keeps recompiling and rebuilding everything from the first step, even though I have not changed anything in the requirements, code, or build files for the image.

My expectation was that, since both the steps and the resulting layers are the same, it would reuse them. Can anybody help me understand what is happening, and why the build does not use the layers from my private registry?

I suspect I might have misunderstood the concept of build cache layers and need to adjust either my dockerfile or drone.yml.

Edit:
It is possible to push and pull from the repo using the CLI, and the images built during the first step can be found there, so that part of the pipeline works as expected.
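For anyone wanting to reproduce this outside Drone, the cache behaviour can be checked with the plain Docker CLI. This is just a sketch using the image names from the pipeline below; watch for "Using cache" lines in the build output:

```shell
# Pull the candidate cache image first; --cache-from only considers
# images that are already present on the local daemon.
docker pull private.repo/bitbeckers/scraper-tweets:develop

# Rebuild with the pulled image offered as a cache source.
docker build \
  --cache-from private.repo/bitbeckers/scraper-tweets:develop \
  -t scraper-tweets:cache-test .
```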

DOCKERFILE

FROM python:3.6-alpine3.9 as base

FROM base as builder
RUN mkdir /install

WORKDIR /install

COPY ./requirements.txt /requirements.txt

RUN apk update
RUN apk add --update --no-cache make automake gcc libc-dev subversion python3-dev libxslt-dev
RUN pip wheel --wheel-dir=/install/wheelhouse -r /requirements.txt
# Install into /install so the runtime stage below can copy the result
RUN pip install --prefix=/install --no-index --find-links=/install/wheelhouse -r /requirements.txt

FROM base

COPY --from=builder /install /usr/local
COPY ./run.py .
COPY ./app /app

WORKDIR /

ENTRYPOINT ["python", "./run.py"]

DRONE.YML

global-variables:
  settings:
    - settings: &settings
        dockerfile: ./.Dockerfile
        registry: private.repo
        repo: private.repo/bitbeckers/scraper-tweets
        insecure: true
        tags:
          - "${DRONE_BRANCH}"
          - "${DRONE_BRANCH}-${DRONE_COMMIT}"
        cache_from:
          - "private.repo/bitbeckers/scraper-tweets:develop"
          - "private.repo/bitbeckers/scraper-tweets:${DRONE_BRANCH}"
        debug: true

kind: pipeline
name: scraper-tweets

platform:
  os: linux
  arch: arm

steps:
  - name: prepare
    image: plugins/docker
    settings:
      <<: *settings
    when:
      event:
        - push
        - pull_request

  - name: test
    image: private.repo/bitbeckers/scraper-tweets:${DRONE_BRANCH}
    settings:
      <<: *settings
    commands:
      - echo "The current branch is ${DRONE_BRANCH}"
      - echo "The current commit hash is ${DRONE_COMMIT_SHA}"
      - echo "Running tests"
      - nose2 -v
    when:
      event:
        - push
        - pull_request

  - name: publish_develop
    image: plugins/docker
    settings:
      <<: *settings
      tag:
        - "${DRONE_BRANCH}"
      debug: false
    when:
      event:
        - push
      branch:
        - develop

I have the same issue and I can’t seem to figure out what’s wrong.

Hey @laszlocph, can you spot any obvious issues? Thanks!


I am not sure there is enough information in this thread to help triage. Layer caching is about order of operations and uniqueness of the files (content, file metadata, timestamps, etc) in each layer. For example, if the first layer in your Dockerfile is a binary that has a different timestamp every time it is compiled, caching is not going to be effective.

I would also point out that the docker build command provides logs indicating whether each layer is cached or not. This information is vital and has not been provided, which makes it difficult to help troubleshoot.

Docker also provides some best practices for how to structure your Dockerfile to best leverage layer caching. I recommend reading the best practices and comparing them against your Dockerfile. See https://docs.docker.com/develop/develop-images/dockerfile_best-practices/#leverage-build-cache
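As a concrete illustration of those best practices (a generic sketch, not the original poster's exact file): copy only the dependency manifest before installing, so that source edits do not invalidate the dependency layer.

```dockerfile
FROM python:3.6-alpine3.9

# Dependency layer: invalidated only when requirements.txt changes.
COPY requirements.txt .
RUN pip install -r requirements.txt

# Source layers: change frequently, but everything above stays cached.
COPY ./app /app
COPY ./run.py .

ENTRYPOINT ["python", "./run.py"]
```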

If you need them, I can provide the logs. When I look into them, I see that some layers are being pulled in and cached, but this does not work for the layer running apt-get update && apt-get install. Maybe this has something to do with the package list being updated, which might happen every few minutes? That seems to align with your comment on 'content, file metadata, timestamps, etc.'
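For reference, Docker's best-practices document recommends chaining update and install into a single RUN instruction; the sketch below uses placeholder packages. (Note the cache match for a RUN layer keys on the instruction text and the parent layer, not on what apt downloads, so a changed package index alone should not bust the cache.)

```dockerfile
# One RUN keeps update + install in a single cacheable layer and
# avoids a stale package index being reused by a later install step.
RUN apt-get update && apt-get install -y \
        build-essential \
        gcc \
    && rm -rf /var/lib/apt/lists/*
```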

In the meantime I've narrowed the problem down to two main topics: the time it takes to compile Python packages on a Pi, and the fact that previously performed compilations are discarded between pipeline runs. To solve this, I chose an approach mixing a 'wheelhouse' in a baseline image with a multi-stage build, so that commonly used packages are precompiled and stored in an image on my registry.

The baseline-image strategy reduced my build time from >45 minutes to <12 minutes on a Pi 3B+. The repository is public here: https://github.com/bitbeckers/python3.6debian_pi. Regarding caching, this seems like the easiest route for me. Please feel free to comment on the repo, as it is still a work in progress which I just wanted to share :slight_smile:
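The baseline-image approach can be sketched like this; the image name and the /wheelhouse path are illustrative assumptions, not taken from the linked repo:

```dockerfile
# Baseline image is assumed to ship precompiled wheels in /wheelhouse.
FROM bitbeckers/python3.6debian_pi AS base

# Installing from the local wheelhouse skips compilation entirely.
COPY requirements.txt .
RUN pip install --no-index --find-links=/wheelhouse -r requirements.txt
```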

Hi folks,

as Brad suggested, the Docker build output can reveal which steps were cached and which weren't. It will make this thread more focused too; we can talk about line numbers, etc.

The sample output below has a few lines that are crucial to see, like lines "docker-builder:70" and "docker-builder:89".

➜ ./drone exec
[docker-builder:69] + /usr/local/bin/docker pull laszlocloud/cache-from-test:latest
[docker-builder:70] latest: Pulling from laszlocloud/cache-from-test
[docker-builder:77] 0e3d8c77ad65: Pull complete
[docker-builder:80] + /usr/local/bin/docker build --rm=true -f Dockerfile -t 00000000 . --pull=true --cache-from laszlocloud/cache-from-test:latest --label org.label-schema.schema-version=1.0 --label org.label-schema.build-date=2019-02-17T14:23:07Z --label org.label-schema.vcs-ref=00000000 --label org.label-schema.vcs-url=
[...]
[docker-builder:88] Step 2/6 : RUN apt-get update && apt-get install -y     curl  && rm -rf /var/lib/apt/lists/*
[docker-builder:89]  ---> Using cache
[docker-builder:90]  ---> 880ea2ef13d2

As a suggestion, can you try factoring your Dockerfile to a single stage first?

My hunch is that for a multi-stage Dockerfile to work with cache_from, you have to push each intermediate stage to the registry as well and pull it before each build, making the image pull step longer and the config more complicated. But first things first: logs, and a single stage. If those work, you can weigh the multi-stage alternative.
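That per-stage approach looks roughly like this with the plain CLI; the stage and tag names are illustrative (the `builder` stage matches the Dockerfile earlier in the thread):

```shell
# Build and push each stage explicitly so it can serve as a cache
# source for the next run. --target stops the build at a named stage.
docker build --target builder \
  --cache-from private.repo/bitbeckers/scraper-tweets:builder \
  -t private.repo/bitbeckers/scraper-tweets:builder .
docker push private.repo/bitbeckers/scraper-tweets:builder

# The final image can reuse both the builder stage and its own
# previous build as cache sources.
docker build \
  --cache-from private.repo/bitbeckers/scraper-tweets:builder \
  --cache-from private.repo/bitbeckers/scraper-tweets:develop \
  -t private.repo/bitbeckers/scraper-tweets:develop .
docker push private.repo/bitbeckers/scraper-tweets:develop
```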

Hi Laszlo, thank you for your input! And sorry for the late response, life got in the way.

Currently, I am indeed building in the first stage, pushing to the registry, and pulling in the subsequent stage. As you discussed in your posts, this does come with some overhead because you keep pulling and pushing images, but in the case of compiling Python on a Pi this might be the preferred solution.

A single-stage file deployment is running as we speak, so I can produce the logs later today!

Hi there, I finally got around to producing the logs. You might be onto something, as the single-stage build does pull from the registry. The files and log are added at the bottom.

So maybe I'm getting ahead of things, but your suggestion to push each intermediate stage sounds like a good solution! This will indeed make the pull step longer… If that's the case, is there any advantage compared to using a custom base Python image? Or is this the weighing part? :slight_smile:

BUILD FILE

FROM python:3.6.8-slim-stretch

RUN apt-get update \
    && apt-get install -y \
        build-essential \
        make \
        automake \
        zlib1g-dev \
        gcc \
        g++ \
        python3-dev \
        python-dev \
        libxml2-dev \
        libxslt-dev \
        python3-lxml

RUN python3 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"

COPY requirements.txt .

RUN mkdir -p /install/wheelhouse
RUN pip install --upgrade pip setuptools wheel \
    && pip wheel --wheel-dir /install/wheelhouse -r requirements.txt \
    && pip install --no-cache-dir --no-index --find-links=/install/wheelhouse -r requirements.txt

COPY ./run.py .
COPY ./app /app
COPY ./tests /tests

EXPOSE 9000

CMD ["python3", "run.py"]

DRONE.YML

global-variables:
  deployment:
    - &registry private.repo
    - &repository private.repo/${DRONE_REPO}
  caching:
    cache_from: &private-cache
      - "private.repo/${DRONE_REPO}:develop"
      - "private.repo/${DRONE_REPO}:${DRONE_BRANCH}"
  default-tags: &default-tags
    - "${DRONE_BRANCH}"
    - "${DRONE_BRANCH}-${DRONE_COMMIT}"
  drone-settings:
    - settings: &build-settings
        dockerfile: ./Dockerfile
        registry: *registry
        repo: *repository
        insecure: true
        tags: *default-tags
        cache_from: *private-cache
        debug: true

    - settings: &test-settings
        registry: *registry
        repo: *repository
        insecure: true
        cache_from: *private-cache
        debug: true

kind: pipeline
name: analyser-content

platform:
  os: linux
  arch: arm

steps:
  - name: prepare
    image: plugins/docker
    settings:
      <<: *build-settings
    when:
      event:
        - push
        - pull_request

LOG

bitbeckers pushed b86a3b9a to feature-singledeployfile
analyser-content: Error response from daemon: manifest for private.repo/bitbeckers/python3.6debian_pi:feature-singledeployfile not found
analyser-content — prepare
09:15
11735ab98e82: Waiting
c67b362f649e: Waiting
8c97e30f24ec: Waiting
4b82a4fe2e20: Verifying Checksum
4b82a4fe2e20: Download complete
5f40cf050674: Verifying Checksum
5f40cf050674: Download complete
b0727b1ec48e: Verifying Checksum
b0727b1ec48e: Download complete
080dc1c681e8: Verifying Checksum
080dc1c681e8: Download complete
11735ab98e82: Verifying Checksum
11735ab98e82: Download complete
f49e693b652e: Verifying Checksum
f49e693b652e: Download complete
50431a244f6a: Verifying Checksum
50431a244f6a: Download complete
8c97e30f24ec: Verifying Checksum
8c97e30f24ec: Download complete
c67b362f649e: Download complete
b0727b1ec48e: Pull complete
4b82a4fe2e20: Pull complete
5f40cf050674: Pull complete
080dc1c681e8: Pull complete
f49e693b652e: Pull complete
50431a244f6a: Pull complete
11735ab98e82: Pull complete
c67b362f649e: Pull complete
8c97e30f24ec: Pull complete
Digest: sha256:7f1129e0d3479590c3ad09650e53afb040a60984c07c21a96b5906fb5fbb74db
Status: Downloaded newer image for private.repo/bitbeckers/analyser-content:develop
+ /usr/local/bin/docker pull private.repo/bitbeckers/analyser-content:feature-singledeployfile
time="2019-06-17T19:18:35.462256905Z" level=warning msg="Error getting v2 registry: Get https://private.repo/v2/: http: server gave HTTP response to HTTPS client"
time="2019-06-17T19:18:35.462381487Z" level=info msg="Attempting next endpoint for pull after error: Get https://private.repo/v2/: http: server gave HTTP response to HTTPS client"
feature-singledeployfile: Pulling from bitbeckers/analyser-content
5155b41fe73a: Pulling fs layer
5d431f802675: Pulling fs layer
f65f7c6b7d6d: Pulling fs layer
c384daa176c5: Pulling fs layer
f7633e7e4d6c: Pulling fs layer
06e0849a5a6a: Pulling fs layer
7da6fc781548: Pulling fs layer
548b3f349af1: Pulling fs layer
70785b5059f5: Pulling fs layer
a32f16151c01: Pulling fs layer
780a4f6fae5c: Pulling fs layer
91b33d072fc2: Pulling fs layer
f7633e7e4d6c: Waiting
4b2fc79d26f0: Pulling fs layer
06e0849a5a6a: Waiting
7da6fc781548: Waiting
548b3f349af1: Waiting
780a4f6fae5c: Waiting
70785b5059f5: Waiting
91b33d072fc2: Waiting
a32f16151c01: Waiting
4b2fc79d26f0: Waiting
c384daa176c5: Waiting
5d431f802675: Verifying Checksum
5d431f802675: Download complete
f65f7c6b7d6d: Verifying Checksum
f65f7c6b7d6d: Download complete
5155b41fe73a: Verifying Checksum
5155b41fe73a: Download complete
c384daa176c5: Download complete
f7633e7e4d6c: Download complete
548b3f349af1: Verifying Checksum
548b3f349af1: Download complete
70785b5059f5: Verifying Checksum
70785b5059f5: Download complete
7da6fc781548: Verifying Checksum
7da6fc781548: Download complete
780a4f6fae5c: Verifying Checksum
780a4f6fae5c: Download complete
91b33d072fc2: Verifying Checksum
91b33d072fc2: Download complete
4b2fc79d26f0: Verifying Checksum
4b2fc79d26f0: Download complete
a32f16151c01: Verifying Checksum
a32f16151c01: Download complete
06e0849a5a6a: Verifying Checksum
06e0849a5a6a: Download complete
5155b41fe73a: Pull complete
5d431f802675: Pull complete
f65f7c6b7d6d: Pull complete
c384daa176c5: Pull complete
f7633e7e4d6c: Pull complete
06e0849a5a6a: Pull complete
7da6fc781548: Pull complete
548b3f349af1: Pull complete
70785b5059f5: Pull complete
a32f16151c01: Pull complete
780a4f6fae5c: Pull complete
91b33d072fc2: Pull complete
4b2fc79d26f0: Pull complete
Digest: sha256:5f675ae9796fcabe84e474ffe493e0dfd3e5dbd7c6c50b776313e45dce6921bd
Status: Downloaded newer image for private.repo/bitbeckers/analyser-content:feature-singledeployfile
+ /usr/local/bin/docker build --rm=true -f ./Dockerfile -t b86a3b9a87bd3932226b027d3d977eae42d249cf . --pull=true --cache-from private.repo/bitbeckers/analyser-content:develop --cache-from private.repo/bitbeckers/analyser-content:feature-singledeployfile --label org.label-schema.schema-version=1.0 --label org.label-schema.build-date=2019-06-17T19:16:30Z --label org.label-schema.vcs-ref=b86a3b9a87bd3932226b027d3d977eae42d249cf --label org.label-schema.vcs-url=[GITURL]
Sending build context to Docker daemon 97.79kB
Step 1/16 : FROM python:3.6.8-slim-stretch
3.6.8-slim-stretch: Pulling from library/python
5155b41fe73a: Already exists
5d431f802675: Already exists
f65f7c6b7d6d: Already exists
c384daa176c5: Already exists
f7633e7e4d6c: Already exists
Digest: sha256:0d095570901e7cf0dac93ba4d0ee75ee2364715338b0396874589c9cc23343b6
Status: Downloaded newer image for python:3.6.8-slim-stretch
---> 63aeec0788cd
Step 2/16 : RUN apt-get update && apt-get install -y build-essential make automake zlib1g-dev gcc g++ python3-dev python-dev libxml2-dev libxslt-dev python3-lxml
---> Using cache
---> 1471edde60d3
Step 3/16 : RUN python3 -m venv /opt/venv
---> Using cache
---> f0977ddba7b6
Step 4/16 : ENV PATH="/opt/venv/bin:$PATH"
---> Using cache
---> dcf074ca916d
Step 5/16 : COPY requirements.txt .
---> Using cache
---> 4906193ad623
Step 6/16 : RUN mkdir -p /install/wheelhouse
---> Using cache
---> ce4db3e502ea
Step 7/16 : RUN pip install --upgrade pip setuptools wheel && pip wheel --wheel-dir /install/wheelhouse -r requirements.txt && pip install --no-cache-dir --no-index --find-links=/install/wheelhouse -r requirements.txt
---> Using cache
---> f4d620ee5bee
Step 8/16 : COPY ./run.py .
---> Using cache
---> 78f4b521753a
Step 9/16 : COPY ./app /app
---> Using cache
---> 41e14dd892be
Step 10/16 : COPY ./tests /tests
---> Using cache
---> 78206bdd9c9b
Step 11/16 : EXPOSE 9000
---> Using cache
---> 86683f0ca257
Step 12/16 : CMD ["python3", "run.py"]
---> Using cache
---> 9618c18beaba
Step 13/16 : LABEL org.label-schema.build-date=2019-06-17T19:16:30Z
---> Running in 7eccb2efcc23
time="2019-06-17T19:25:10.562396703Z" level=info msg="Layer sha256:2789c481a6e548b4a56cb615d222a57e23edffe2d9713e1a348e3e10515de01b cleaned up"
Removing intermediate container 7eccb2efcc23
---> cb91bf6f2671
Step 14/16 : LABEL org.label-schema.schema-version=1.0
---> Running in cf45de95a580
time="2019-06-17T19:25:13.037546685Z" level=info msg="Layer sha256:2789c481a6e548b4a56cb615d222a57e23edffe2d9713e1a348e3e10515de01b cleaned up"
Removing intermediate container cf45de95a580
---> 283e8e122a60
Step 15/16 : LABEL org.label-schema.vcs-ref=b86a3b9a87bd3932226b027d3d977eae42d249cf
---> Running in a4b22f164013
time="2019-06-17T19:25:13.884689620Z" level=info msg="Layer sha256:2789c481a6e548b4a56cb615d222a57e23edffe2d9713e1a348e3e10515de01b cleaned up"
Removing intermediate container a4b22f164013
---> 10cde1cddbab
Step 16/16 : LABEL org.label-schema.vcs-url=https://github.com/bitbeckers/analyser-content.git
---> Running in 72885e92b92b
time="2019-06-17T19:25:17.912110421Z" level=info msg="Layer sha256:2789c481a6e548b4a56cb615d222a57e23edffe2d9713e1a348e3e10515de01b cleaned up"
Removing intermediate container 72885e92b92b
---> 369058bb127f
Successfully built 369058bb127f
Successfully tagged b86a3b9a87bd3932226b027d3d977eae42d249cf:latest
+ /usr/local/bin/docker tag b86a3b9a87bd3932226b027d3d977eae42d249cf private.repo/bitbeckers/analyser-content:feature-singledeployfile
+ /usr/local/bin/docker push private.repo/bitbeckers/analyser-content:feature-singledeployfile
The push refers to repository [private.repo/bitbeckers/analyser-content]
time="2019-06-17T19:25:18.461607115Z" level=info msg="Attempting next endpoint for push after error: Get https://private.repo/v2/: http: server gave HTTP response to HTTPS client"
ee0a6c15eff6: Preparing
449a2b8f7a2b: Preparing
a44606f54835: Preparing
90b1dcecba9c: Preparing
3abc89ba8dbb: Preparing
ab5f8a98d1aa: Preparing
21b8f7ed231f: Preparing
95bc667cd93b: Preparing
cc90aa4adbf5: Preparing
971787105f82: Preparing
3368a541313b: Preparing
b6bf2928848d: Preparing
8f4224afbf05: Preparing
ab5f8a98d1aa: Waiting
21b8f7ed231f: Waiting
95bc667cd93b: Waiting
3368a541313b: Waiting
b6bf2928848d: Waiting
8f4224afbf05: Waiting
cc90aa4adbf5: Waiting
971787105f82: Waiting
449a2b8f7a2b: Layer already exists
a44606f54835: Layer already exists
3abc89ba8dbb: Layer already exists
90b1dcecba9c: Layer already exists
ee0a6c15eff6: Layer already exists
ab5f8a98d1aa: Layer already exists
21b8f7ed231f: Layer already exists
95bc667cd93b: Layer already exists
971787105f82: Layer already exists
cc90aa4adbf5: Layer already exists
3368a541313b: Layer already exists
b6bf2928848d: Layer already exists
8f4224afbf05: Layer already exists
feature-singledeployfile: digest: sha256:8e7092cc1c00d9c85bdcf3ff169b0a3e325726f9a42383f1efb221a3292dca75 size: 3043
+ /usr/local/bin/docker tag b86a3b9a87bd3932226b027d3d977eae42d249cf private.repo/bitbeckers/analyser-content:feature-singledeployfile-b86a3b9a87bd3932226b027d3d977eae42d249cf
+ /usr/local/bin/docker push private.repo/bitbeckers/analyser-content:feature-singledeployfile-b86a3b9a87bd3932226b027d3d977eae42d249cf
The push refers to repository [private.repo/bitbeckers/analyser-content]
time="2019-06-17T19:25:20.835501918Z" level=info msg="Attempting next endpoint for push after error: Get https://private.repo/v2/: http: server gave HTTP response to HTTPS client"
ee0a6c15eff6: Preparing
449a2b8f7a2b: Preparing
a44606f54835: Preparing
90b1dcecba9c: Preparing
3abc89ba8dbb: Preparing
ab5f8a98d1aa: Preparing
21b8f7ed231f: Preparing
95bc667cd93b: Preparing
cc90aa4adbf5: Preparing
971787105f82: Preparing
3368a541313b: Preparing
b6bf2928848d: Preparing
8f4224afbf05: Preparing
ab5f8a98d1aa: Waiting
21b8f7ed231f: Waiting
95bc667cd93b: Waiting
cc90aa4adbf5: Waiting
971787105f82: Waiting
3368a541313b: Waiting
b6bf2928848d: Waiting
8f4224afbf05: Waiting
449a2b8f7a2b: Layer already exists
a44606f54835: Layer already exists
3abc89ba8dbb: Layer already exists
90b1dcecba9c: Layer already exists
ee0a6c15eff6: Layer already exists
cc90aa4adbf5: Layer already exists
ab5f8a98d1aa: Layer already exists
971787105f82: Layer already exists
95bc667cd93b: Layer already exists
21b8f7ed231f: Layer already exists
3368a541313b: Layer already exists
b6bf2928848d: Layer already exists
8f4224afbf05: Layer already exists
feature-singledeployfile-b86a3b9a87bd3932226b027d3d977eae42d249cf: digest: sha256:8e7092cc1c00d9c85bdcf3ff169b0a3e325726f9a42383f1efb221a3292dca75 size: 3043
+ /usr/local/bin/docker rmi b86a3b9a87bd3932226b027d3d977eae42d249cf
Untagged: b86a3b9a87bd3932226b027d3d977eae42d249cf:latest
+ /usr/local/bin/docker system prune -f
Deleted Images:
untagged: private.repo/bitbeckers/analyser-content@sha256:5f675ae9796fcabe84e474ffe493e0dfd3e5dbd7c6c50b776313e45dce6921bd
deleted: sha256:ae85f589bf7934b697543930528123402c8f7c133d324d067c01bd7ae9e21a29
Total reclaimed space: 0B

I ended up doing what was suggested: pushing and pulling each stage. Each subsequent step pulls all the previous stages. Something like this:

- name: base
  image: plugins/docker
  settings:
    <<: *registry_settings
    target: base
    cache_from: 
      - "my-registry.com/${DRONE_REPO_NAME}:base"
    tags:
      - base

- name: dependencies
  image: plugins/docker
  settings:
    <<: *registry_settings
    target: dependencies
    cache_from:
      - "my-registry.com/${DRONE_REPO_NAME}:base" 
      - "my-registry.com/${DRONE_REPO_NAME}:dependencies"
    tags:
      - dependencies

- name: app
  image: plugins/docker
  settings:
    <<: *registry_settings
    target: app
    cache_from:
      - "my-registry.com/${DRONE_REPO_NAME}:base" 
      - "my-registry.com/${DRONE_REPO_NAME}:dependencies"
      - "my-registry.com/${DRONE_REPO_NAME}:app" 
    tags:
      - app

This works, i.e. everything is pulled from the cache. Like @laszlocph mentioned, the image pull/push steps add more time and the config is slightly more complex. I may combine some stages that don't need to be separate.
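For context, the `target` values above assume a Dockerfile with correspondingly named stages, roughly like the sketch below (base image and file names are illustrative):

```dockerfile
# Stage names must match the `target` values in the pipeline steps.
FROM python:3.6-alpine3.9 AS base

FROM base AS dependencies
COPY requirements.txt .
RUN pip install -r requirements.txt

FROM dependencies AS app
COPY ./app /app
CMD ["python", "/app/main.py"]
```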