ademaria
(Alessandro De Maria)
November 18, 2020, 11:23am
1
From this morning we are seeing broken builds when using the gcr plugin:
latest: Pulling from plugins/gcr
Digest: sha256:0c2a1fc28da459cf6cb5a61e724750a5c56eac9d7c92cb1f68cc419cac6615e6
Status: Image is up to date for plugins/gcr:latest
+ /usr/local/bin/dockerd --data-root /var/lib/docker --host=unix:///var/run/docker.sock
Detected registry credentials
+ /usr/local/bin/docker version
Client: Docker Engine - Community
Version: 19.03.8
API version: 1.40
Go version: go1.12.17
Git commit: afacb8b7f0
Built: Wed Mar 11 01:22:56 2020
OS/Arch: linux/amd64
Experimental: false
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
time="2020-11-18T11:16:30Z" level=fatal msg="exit status 1"
we are puzzled because other builds that do not rely on it are working fine.
I have eliminated from the causes the nodes (I have re-created them) and we have not upgraded drone recently.
Any suggestion on how to go about debugging this?
1 Like
ademaria
(Alessandro De Maria)
November 18, 2020, 11:29am
2
Looking better at it, the issue seems to be related to the docker version
step, which fails to get the server version
running it locally also gives the same error
$ docker run --rm -it plugins/gcr
+ /usr/local/bin/dockerd --data-root /var/lib/docker --host=unix:///var/run/docker.sock
Registry credentials or Docker config not provided. Guest mode enabled.
+ /usr/local/bin/docker version
Client: Docker Engine - Community
Version: 19.03.8
API version: 1.40
Go version: go1.12.17
Git commit: afacb8b7f0
Built: Wed Mar 11 01:22:56 2020
OS/Arch: linux/amd64
Experimental: false
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
FATA[0015] exit status 1
ademaria
(Alessandro De Maria)
November 18, 2020, 11:39am
3
so the reason is due to a new release of docker that happened today.
Server: Docker Engine - Community
Engine:
Version: 19.03.13
API version: 1.40 (minimum version 1.12)
Go version: go1.13.15
Git commit: 4484c46d9d
Built: Wed Sep 16 17:01:30 2020
OS/Arch: linux/amd64
Experimental: false
the last successful run we had was on the previous version
Server: Docker Engine - Community
Engine:
Version: 19.03.8
API version: 1.40 (minimum version 1.12)
Go version: go1.12.17
Git commit: afacb8b7f0
Built: Wed Mar 11 01:30:32 2020
OS/Arch: linux/amd64
Experimental: false
Iām also having this issue, it strangely doesnāt affect all projects so Iāve been trying to troubleshoot them in particular. Whatās changed in this latest docker release that could cause this issue?
ademaria
(Alessandro De Maria)
November 18, 2020, 12:01pm
5
We reverted to use the previous docker and containerd, but it is still not working.
So perhaps the upgrade brought in more changes that now simply reverting wonāt work.
Strangely I have noticed that the āplugins/gcrā plugin doesnāt seem to have a Dockerfile or a github project in https://github.com/drone-plugins?q=gcr&type=&language=
@josh-stableprice which plugin is it failing for you?
ademaria
(Alessandro De Maria)
November 18, 2020, 12:04pm
6
For us it only affects projects that make use of the plugins/gcr
plugin, so you might have the same setup and perhaps itās working on those that do not make use of it
Well the weird issue Iām having is all four my projects use plugin/ecr
which Iām lead to believe uses plugin/docker
under the hood. Two of them are continuing to work correctly, and the other two are not.
ademaria
(Alessandro De Maria)
November 18, 2020, 12:09pm
8
I believe this needs updated! It references the old version of docker.
ademaria
(Alessandro De Maria)
November 18, 2020, 12:16pm
10
@bradrydzewski would someone be able to review this PR and help us unblock our builds?
ademaria
(Alessandro De Maria)
November 18, 2020, 1:39pm
11
I have tested my patch by building my own image but I donāt seem to get it working anyway.
I think we will be moving away from the DIND approach back to explicit docker commands.
Hey there
Did you try running the plugin with debug: true
?
In our case, we had a lot of issues related to concurrent iptables access at docker daemon startup, resulting in the docker daemon unable to launch.
Running in debug might give you (and people in this thread) a bit more insights
I didnāt know we could do that, so Iāve just run with debug and launch_debug on
+ /usr/local/bin/dockerd --data-root /var/lib/docker --host=unix:///var/run/docker.sock
time="2020-11-18T14:25:16.835823089Z" level=info msg="Starting up"
time="2020-11-18T14:25:16.837361836Z" level=warning msg="could not change group /var/run/docker.sock to docker: group docker not found"
time="2020-11-18T14:25:16.838277236Z" level=info msg="libcontainerd: started new containerd process" pid=41
time="2020-11-18T14:25:16.838394167Z" level=info msg="parsed scheme: \"unix\"" module=grpc
time="2020-11-18T14:25:16.838542708Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
time="2020-11-18T14:25:16.838587499Z" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///var/run/docker/containerd/containerd.sock 0 <nil>}] <nil>}" module=grpc
time="2020-11-18T14:25:16.838651109Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
time="2020-11-18T14:25:16.855560640Z" level=info msg="starting containerd" revision=7ad184331fa3e55e52b890ea95e65ba581ae3429 version=v1.2.13
time="2020-11-18T14:25:16.855798793Z" level=error msg="failed to change OOM score to -500" error="write /proc/41/oom_score_adj: permission denied"
time="2020-11-18T14:25:16.856244167Z" level=info msg="loading plugin "io.containerd.content.v1.content"..." type=io.containerd.content.v1
time="2020-11-18T14:25:16.856445949Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.btrfs"..." type=io.containerd.snapshotter.v1
time="2020-11-18T14:25:16.856765173Z" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.btrfs" error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.btrfs must be a btrfs filesystem to be used with the btrfs snapshotter"
time="2020-11-18T14:25:16.856796483Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.aufs"..." type=io.containerd.snapshotter.v1
time="2020-11-18T14:25:16.859609004Z" level=warning msg="failed to load plugin io.containerd.snapshotter.v1.aufs" error="modprobe aufs failed: "ip: can't find device 'aufs'\nmodprobe: can't change directory to '/lib/modules': No such file or directory\n": exit status 1"
time="2020-11-18T14:25:16.859641884Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.native"..." type=io.containerd.snapshotter.v1
time="2020-11-18T14:25:16.859802666Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.overlayfs"..." type=io.containerd.snapshotter.v1
time="2020-11-18T14:25:16.859993317Z" level=info msg="loading plugin "io.containerd.snapshotter.v1.zfs"..." type=io.containerd.snapshotter.v1
time="2020-11-18T14:25:16.860306441Z" level=info msg="skip loading plugin "io.containerd.snapshotter.v1.zfs"..." type=io.containerd.snapshotter.v1
time="2020-11-18T14:25:16.860336711Z" level=info msg="loading plugin "io.containerd.metadata.v1.bolt"..." type=io.containerd.metadata.v1
time="2020-11-18T14:25:16.860413642Z" level=warning msg="could not use snapshotter zfs in metadata plugin" error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.zfs must be a zfs filesystem to be used with the zfs snapshotter: skip plugin"
time="2020-11-18T14:25:16.860460063Z" level=warning msg="could not use snapshotter btrfs in metadata plugin" error="path /var/lib/docker/containerd/daemon/io.containerd.snapshotter.v1.btrfs must be a btrfs filesystem to be used with the btrfs snapshotter"
time="2020-11-18T14:25:16.860471413Z" level=warning msg="could not use snapshotter aufs in metadata plugin" error="modprobe aufs failed: "ip: can't find device 'aufs'\nmodprobe: can't change directory to '/lib/modules': No such file or directory\n": exit status 1"
time="2020-11-18T14:25:16.864093361Z" level=info msg="loading plugin "io.containerd.differ.v1.walking"..." type=io.containerd.differ.v1
time="2020-11-18T14:25:16.864130681Z" level=info msg="loading plugin "io.containerd.gc.v1.scheduler"..." type=io.containerd.gc.v1
time="2020-11-18T14:25:16.864246923Z" level=info msg="loading plugin "io.containerd.service.v1.containers-service"..." type=io.containerd.service.v1
time="2020-11-18T14:25:16.864270933Z" level=info msg="loading plugin "io.containerd.service.v1.content-service"..." type=io.containerd.service.v1
time="2020-11-18T14:25:16.864338464Z" level=info msg="loading plugin "io.containerd.service.v1.diff-service"..." type=io.containerd.service.v1
time="2020-11-18T14:25:16.864377634Z" level=info msg="loading plugin "io.containerd.service.v1.images-service"..." type=io.containerd.service.v1
time="2020-11-18T14:25:16.864397144Z" level=info msg="loading plugin "io.containerd.service.v1.leases-service"..." type=io.containerd.service.v1
time="2020-11-18T14:25:16.864491935Z" level=info msg="loading plugin "io.containerd.service.v1.namespaces-service"..." type=io.containerd.service.v1
time="2020-11-18T14:25:16.864511906Z" level=info msg="loading plugin "io.containerd.service.v1.snapshots-service"..." type=io.containerd.service.v1
time="2020-11-18T14:25:16.864597057Z" level=info msg="loading plugin "io.containerd.runtime.v1.linux"..." type=io.containerd.runtime.v1
time="2020-11-18T14:25:16.864793759Z" level=info msg="loading plugin "io.containerd.runtime.v2.task"..." type=io.containerd.runtime.v2
time="2020-11-18T14:25:16.864970750Z" level=info msg="loading plugin "io.containerd.monitor.v1.cgroups"..." type=io.containerd.monitor.v1
time="2020-11-18T14:25:16.865439735Z" level=info msg="loading plugin "io.containerd.service.v1.tasks-service"..." type=io.containerd.service.v1
time="2020-11-18T14:25:16.865483066Z" level=info msg="loading plugin "io.containerd.internal.v1.restart"..." type=io.containerd.internal.v1
time="2020-11-18T14:25:16.865588917Z" level=info msg="loading plugin "io.containerd.grpc.v1.containers"..." type=io.containerd.grpc.v1
time="2020-11-18T14:25:16.865610567Z" level=info msg="loading plugin "io.containerd.grpc.v1.content"..." type=io.containerd.grpc.v1
time="2020-11-18T14:25:16.865627207Z" level=info msg="loading plugin "io.containerd.grpc.v1.diff"..." type=io.containerd.grpc.v1
time="2020-11-18T14:25:16.865707848Z" level=info msg="loading plugin "io.containerd.grpc.v1.events"..." type=io.containerd.grpc.v1
time="2020-11-18T14:25:16.865781689Z" level=info msg="loading plugin "io.containerd.grpc.v1.healthcheck"..." type=io.containerd.grpc.v1
time="2020-11-18T14:25:16.865815430Z" level=info msg="loading plugin "io.containerd.grpc.v1.images"..." type=io.containerd.grpc.v1
time="2020-11-18T14:25:16.865833049Z" level=info msg="loading plugin "io.containerd.grpc.v1.leases"..." type=io.containerd.grpc.v1
time="2020-11-18T14:25:16.865942381Z" level=info msg="loading plugin "io.containerd.grpc.v1.namespaces"..." type=io.containerd.grpc.v1
time="2020-11-18T14:25:16.866019402Z" level=info msg="loading plugin "io.containerd.internal.v1.opt"..." type=io.containerd.internal.v1
time="2020-11-18T14:25:16.866321715Z" level=info msg="loading plugin "io.containerd.grpc.v1.snapshots"..." type=io.containerd.grpc.v1
time="2020-11-18T14:25:16.866363645Z" level=info msg="loading plugin "io.containerd.grpc.v1.tasks"..." type=io.containerd.grpc.v1
time="2020-11-18T14:25:16.866442166Z" level=info msg="loading plugin "io.containerd.grpc.v1.version"..." type=io.containerd.grpc.v1
time="2020-11-18T14:25:16.866471257Z" level=info msg="loading plugin "io.containerd.grpc.v1.introspection"..." type=io.containerd.grpc.v1
time="2020-11-18T14:25:16.866729760Z" level=info msg=serving... address="/var/run/docker/containerd/containerd-debug.sock"
time="2020-11-18T14:25:16.866892470Z" level=info msg=serving... address="/var/run/docker/containerd/containerd.sock"
time="2020-11-18T14:25:16.866917761Z" level=info msg="containerd successfully booted in 0.012009s"
time="2020-11-18T14:25:16.874363311Z" level=info msg="parsed scheme: \"unix\"" module=grpc
time="2020-11-18T14:25:16.874400711Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
time="2020-11-18T14:25:16.874426581Z" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///var/run/docker/containerd/containerd.sock 0 <nil>}] <nil>}" module=grpc
time="2020-11-18T14:25:16.874644454Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
time="2020-11-18T14:25:16.875551583Z" level=info msg="parsed scheme: \"unix\"" module=grpc
time="2020-11-18T14:25:16.875581073Z" level=info msg="scheme \"unix\" not registered, fallback to default scheme" module=grpc
time="2020-11-18T14:25:16.875602574Z" level=info msg="ccResolverWrapper: sending update to cc: {[{unix:///var/run/docker/containerd/containerd.sock 0 <nil>}] <nil>}" module=grpc
time="2020-11-18T14:25:16.875713505Z" level=info msg="ClientConn switching balancer to \"pick_first\"" module=grpc
time="2020-11-18T14:25:16.879748238Z" level=error msg="failed to mount overlay: operation not permitted" storage-driver=overlay2
time="2020-11-18T14:25:16.888010497Z" level=error msg="AUFS was not found in /proc/filesystems" storage-driver=aufs
time="2020-11-18T14:25:16.888856555Z" level=error msg="failed to mount overlay: operation not permitted" storage-driver=overlay
time="2020-11-18T14:25:16.888886695Z" level=error msg="Failed to built-in GetDriver graph devicemapper /var/lib/docker"
time="2020-11-18T14:25:16.907542584Z" level=info msg="Loading containers: start."
time="2020-11-18T14:25:16.912073582Z" level=warning msg="Running modprobe bridge br_netfilter failed with message: ip: can't find device 'bridge'\nbridge 172032 1 br_netfilter\nstp 16384 1 bridge\nllc 16384 2 bridge,stp\nipv6 524288 1230 nf_conntrack_ipv6,nf_nat_ipv6,ip_vs,bridge,[permanent]\nip: can't find device 'br_netfilter'\nbr_netfilter 24576 0 \nbridge 172032 1 br_netfilter\nmodprobe: can't change directory to '/lib/modules': No such file or directory\n, error: exit status 1"
time="2020-11-18T14:25:16.912968823Z" level=warning msg="Running iptables --wait -t nat -L -n failed with message: `iptables v1.8.3 (legacy): can't initialize iptables table `nat': Permission denied (you must be root)\nPerhaps iptables or your kernel needs to be upgraded.`, error: exit status 3"
time="2020-11-18T14:25:16.926797910Z" level=info msg="stopping event stream following graceful shutdown" error="<nil>" module=libcontainerd namespace=moby
time="2020-11-18T14:25:16.927293215Z" level=info msg="stopping event stream following graceful shutdown" error="context canceled" module=libcontainerd namespace=plugins.moby
time="2020-11-18T14:25:16.927298935Z" level=info msg="stopping healthcheck following graceful shutdown" module=libcontainerd
failed to start daemon: Error initializing network controller: error obtaining controller instance: failed to create NAT chain DOCKER: iptables failed: iptables -t nat -N DOCKER: iptables v1.8.3 (legacy): can't initialize iptables table `nat': Permission denied (you must be root)
Perhaps iptables or your kernel needs to be upgraded.
(exit status 3)
Detected registry credentials
+ /usr/local/bin/docker version
Client: Docker Engine - Community
Version: 19.03.8
API version: 1.40
Go version: go1.12.17
Git commit: afacb8b7f0
Built: Wed Mar 11 01:22:56 2020
OS/Arch: linux/amd64
Experimental: false
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
time="2020-11-18T14:25:32Z" level=fatal msg="exit status 1"
Thanks for sharing the debug logs.
failed to start daemon: Error initializing network controller: error obtaining controller instance: failed to create NAT chain DOCKER: iptables failed: iptables -t nat -N DOCKER: iptables v1.8.3 (legacy): can't initialize iptables table `nat': Permission denied (you must be root)
Are you sure you are running your plugin in privileged mode ?
I think plugin/grc
is not whitelisted as a privileged image by default, but you might be able to configure it on your runner, or just use privileged: true
on your drone step
This isnāt plugin/grc
, itās plugin/ecr
which is whitelisted if I remember correctly
Edit: So I canāt talk anymore in this thread because Iāve hit my āreply limitā⦠wtfā¦
Anyway, so Iām running drone on K8ās, with the kubernetes runner and my base image for the nodes with docker is from October 7th. So Iām starting to wonder if that needs to be updated instead
1 Like
ademaria
(Alessandro De Maria)
November 18, 2020, 2:51pm
16
WinterNis:
privileged: true
Iāve been running gcr for months and never had to whitelist.
Do you think the new upgrade of docker changed things?
ademaria
(Alessandro De Maria)
November 18, 2020, 3:27pm
17
@josh-stableprice we changed our autoscaler to use
DRONE_AGENT_IMAGE: drone/drone-runner-docker:1.5.3
and it all started to work
ademaria
(Alessandro De Maria)
November 18, 2020, 3:28pm
18
we believe it has something to do with:
ademaria
(Alessandro De Maria)
November 18, 2020, 3:29pm
19
FYI I run everything with privileged
and it unfortunately didnāt fix for me
Well⦠interestingly seems like Iāll have to wait for this to be resolved upstream as drone-runner-kube doesnāt seem to be have been getting versioned releases for the last month that I can point to