Hi,
I’m setting up Drone for the first time in K8S.
This is the deployment I’m using:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: drone-server
  namespace: sre
  labels:
    app: drone
    component: server
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: drone
        component: server
    spec:
      serviceAccountName: drone-service-account
      containers:
        - name: server
          image: "docker.io/drone/drone:1.0.0-rc.5"
          imagePullPolicy: IfNotPresent
          env:
            - name: DRONE_KUBERNETES_ENABLED
              value: "true"
            - name: DRONE_KUBERNETES_NAMESPACE
              value: "sre"
            - name: DRONE_KUBERNETES_SERVICE_ACCOUNT
              value: "drone-pipeline-service-account"
            - name: DRONE_ALWAYS_AUTH
              value: "false"
            - name: DRONE_SERVER_HOST
              value: "<>"
            - name: DRONE_SERVER_PROTO
              value: "https"
            - name: DRONE_RPC_HOST
              value: "drone.sre.svc.cluster.local"
            - name: DRONE_RPC_PROTO
              value: "http"
            - name: DRONE_USER_CREATE
              value: "username:roccodonnarummaef,machine:false,admin:true"
            - name: DRONE_RPC_SECRET
              value: "<>"
            - name: DRONE_GITHUB_CLIENT_ID
              value: "<>"
            - name: DRONE_GITHUB_SERVER
              value: "https://github.com"
            - name: DRONE_GITHUB_CLIENT_SECRET
              value: "<>"
            - name: DRONE_LOGS_TRACE
              value: "true"
            # - name: DRONE_DEBUG_DUMP_HOOK
            #   value: "true"
          ports:
            - name: http
              containerPort: 80
              protocol: TCP
            - name: https
              containerPort: 443
              protocol: TCP
            - name: grpc
              containerPort: 9000
              protocol: TCP
          livenessProbe:
            httpGet:
              path: /
              port: http
          volumeMounts:
            - name: data
              mountPath: /var/lib/drone
      volumes:
        - name: data
          emptyDir: {}
The pod starts fine and a pipeline job is triggered:
kind: pipeline
name: build-docker

steps:
  - name: build
    image: docker:dind
    volumes:
      - name: dockersock
        path: /var/run/docker.sock
    commands:
      - docker ps -a

volumes:
  - name: dockersock
    host:
      path: /var/run/docker.sock
The pod created by the pipeline fails to be scheduled. I’ve tried setting a node selector on the deployment but got the same result.
Any ideas on what I’m missing?
The error says “0/7 nodes are available: 7 node(s) didn’t match node selector”. Drone does not set node affinity by default, so my first thought would be that your YAML defines node affinity that cannot be fulfilled?
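For illustration, here is a minimal, hypothetical pod spec that produces exactly that event when no node carries the requested label (the name and label are made up; you can compare any selector against the output of kubectl get nodes --show-labels):

apiVersion: v1
kind: Pod
metadata:
  name: selector-demo          # hypothetical name, illustration only
spec:
  nodeSelector:
    disktype: ssd              # stays Pending unless some node is labeled disktype=ssd
  containers:
    - name: main
      image: busybox
      command: ["sleep", "3600"]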
This is the complete deployment.yaml
apiVersion: v1
kind: Namespace
metadata:
  name: sre
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: drone-server
  namespace: sre
  labels:
    app: drone
    component: server
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: drone
        component: server
    spec:
      serviceAccountName: drone-service-account
      nodeSelector:
        role: worker
      containers:
        - name: server
          image: "docker.io/drone/drone:1.0.0-rc.5"
          imagePullPolicy: IfNotPresent
          env:
            - name: DRONE_KUBERNETES_ENABLED
              value: "true"
            - name: DRONE_KUBERNETES_NAMESPACE
              value: "sre"
            - name: DRONE_KUBERNETES_SERVICE_ACCOUNT
              value: "drone-pipeline-service-account"
            - name: DRONE_ALWAYS_AUTH
              value: "false"
            - name: DRONE_SERVER_HOST
              value: "<example.com>"
            - name: DRONE_SERVER_PROTO
              value: "https"
            - name: DRONE_RPC_HOST
              value: "drone.sre.svc.cluster.local"
            - name: DRONE_RPC_PROTO
              value: "http"
            - name: DRONE_USER_CREATE
              value: "username:roccodonnarummaef,machine:false,admin:true"
            - name: DRONE_RPC_SECRET
              value: "<>"
            - name: DRONE_GITHUB_CLIENT_ID
              value: "<>"
            - name: DRONE_GITHUB_SERVER
              value: "https://github.com"
            - name: DRONE_GITHUB_CLIENT_SECRET
              value: "<>"
            - name: DRONE_LOGS_TRACE
              value: "true"
            # - name: DRONE_DEBUG_DUMP_HOOK
            #   value: "true"
          ports:
            - name: http
              containerPort: 80
              protocol: TCP
            - name: https
              containerPort: 443
              protocol: TCP
            - name: grpc
              containerPort: 9000
              protocol: TCP
          livenessProbe:
            httpGet:
              path: /
              port: http
          volumeMounts:
            - name: data
              mountPath: /var/lib/drone
      volumes:
        - name: data
          emptyDir: {}
---
apiVersion: v1
kind: Service
metadata:
  name: drone
  namespace: sre
  labels:
    app: drone
spec:
  type: ClusterIP
  ports:
    - name: http
      port: 80
      targetPort: 80
    - name: grpc
      port: 9000
      targetPort: 9000
  selector:
    app: drone
    component: server
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: drone-public-rules
  namespace: sre
  annotations:
    kubernetes.io/ingress.class: traefik-public
spec:
  rules:
    - host: <example.com>
      http:
        paths:
          - path: /
            backend:
              serviceName: drone
              servicePort: 80
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: drone-service-account
  namespace: sre
  labels:
    app: drone
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: drone
  namespace: sre
  labels:
    app: drone
rules:
  - apiGroups:
      - batch
    resources:
      - jobs
    verbs:
      - "*"
  - apiGroups:
      - extensions
    resources:
      - deployments
    verbs:
      - get
      - list
      - patch
      - update
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: drone
  namespace: sre
  labels:
    app: drone
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: drone
subjects:
  - kind: ServiceAccount
    name: drone-service-account
    namespace: sre
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: drone-pipeline-service-account
  namespace: sre
  labels:
    app: drone
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: drone-pipeline
  namespace: sre
  labels:
    app: drone
rules:
  - apiGroups:
      - extensions
    resources:
      - deployments
    verbs:
      - get
      - list
      - watch
      - patch
      - update
  - apiGroups:
      - ""
    resources:
      - namespaces
      - configmaps
      - secrets
      - pods
    verbs:
      - create
      - delete
      - get
      - list
      - watch
  - apiGroups:
      - ""
    resources:
      - pods/log
    verbs:
      - get
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: drone-pipeline
  namespace: sre
  labels:
    app: drone
subjects:
  - kind: ServiceAccount
    name: drone-pipeline-service-account
    namespace: sre
roleRef:
  kind: ClusterRole
  name: drone-pipeline
  apiGroup: rbac.authorization.k8s.io
---
Oh, perhaps I am misunderstanding … I assumed you were using Drone for Kubernetes, had a running instance, and could not get Drone to execute a build on Kubernetes. Is the issue that you are having trouble getting the Drone server deployed?
To clarify, I was previously asking about your .drone.yml configuration.
Sorry, my bad; let me explain better.
I’ve deployed the Drone server in K8S using the YAML above (deployment.yaml). I can access the UI, sync with GitHub, and activate a repo, and when I push a commit a job is created in K8S. That job creates a pod, which then sits pending and never executes.
I can provide logs or more info if needed. Below is the description of the pending pod:
root@af4ec1e7e484:/work/kube-apps# kubectl -n hnh5f4t4fl16znfsxz016fu0op9zv2vd describe pod wun0i9trl1wyd9x2q9kfbyq8iusuihl7
Name:               wun0i9trl1wyd9x2q9kfbyq8iusuihl7
Namespace:          hnh5f4t4fl16znfsxz016fu0op9zv2vd
Priority:           0
PriorityClassName:  <none>
Node:               <none>
Labels:             io.drone=true
                    io.drone.build.number=1
                    io.drone.created=1550855521
                    io.drone.expires=1550862721
                    io.drone.protected=false
                    io.drone.repo.name=docker-image-helloworld
                    io.drone.repo.namespace=example
                    io.drone.stage.name=build-docker
                    io.drone.stage.number=1
                    io.drone.step.name=clone
                    io.drone.ttl=1h0m0s
Annotations:        <none>
Status:             Pending
IP:
Containers:
  wun0i9trl1wyd9x2q9kfbyq8iusuihl7:
    Image:      docker.io/drone/git:latest
    Port:       <none>
    Host Port:  <none>
    Environment:
      DRONE_NETRC_PASSWORD:        x-oauth-basic
      DRONE_COMMIT_REF:            refs/heads/drone
      DRONE:                       true
      DRONE_STEP_NUMBER:           1
      DRONE_REPO_PRIVATE:          true
      CI_JOB_STATUS:               success
      DRONE_TARGET_BRANCH:         drone
      DRONE_DEPLOY_TO:
      CI_NETRC_PASSWORD:           x-oauth-basic
      CI_BUILD_STARTED:            1550855521
      CI_BUILD_FINISHED:           1550855521
      DRONE_WORKSPACE_PATH:
      DRONE_SYSTEM_VERSION:        7a510d79c88f93ee6a0d24bd48cf8797941a3509
      DRONE_BUILD_NUMBER:          1
      DRONE_REPO_VISIBILITY:       private
      DRONE_COMMIT_BRANCH:         drone
      CI_JOB_STARTED:              1550855521
      CI_WORKSPACE_PATH:
      DRONE_GIT_HTTP_URL:          https://github.com/example/docker-image-helloworld.git
      DRONE_RUNNER_HOSTNAME:       ip-10-0-3-52.eu-west-1.compute.internal
      DRONE_BUILD_CREATED:         1550855518
      DRONE_REPO_NAMESPACE:        example
      DRONE_STEP_NAME:             clone
      DRONE_SYSTEM_HOSTNAME:       drone.example.com
      DRONE_COMMIT_AFTER:          bd9618f512c528bc6491368529c309589fe45a85
      DRONE_NETRC_USERNAME:        b94102da57f552387c2613d8d686329232c53752
      CI_BUILD_STATUS:             success
      DRONE_BUILD_STARTED:         1550855521
      DRONE_COMMIT_AUTHOR:         roccodonnarummaef
      DRONE_RUNNER_PLATFORM:       linux/amd64
      DRONE_WORKSPACE_BASE:        /drone/src
      DRONE_COMMIT_AUTHOR_EMAIL:   rocco.donnarumma@example
      DRONE_REPO:                  example/docker-image-helloworld
      DRONE_REPO_NAME:             docker-image-helloworld
      DRONE_COMMIT_AUTHOR_AVATAR:  https://avatars1.githubusercontent.com/u/44259300?v=4
      DRONE_BRANCH:                drone
      DRONE_RUNNER_HOST:           ip-10-0-3-52.eu-west-1.compute.internal
      DRONE_SYSTEM_PROTO:          https
      DRONE_JOB_STATUS:            success
      DRONE_JOB_FINISHED:          1550855521
      DRONE_WORKSPACE:             /drone/src
      DRONE_REPO_OWNER:            example
      DRONE_COMMIT_AUTHOR_NAME:    Rocco Donnarumma
      CI_NETRC_MACHINE:            github.com
      DRONE_NETRC_MACHINE:         github.com
      DRONE_SOURCE_BRANCH:         drone
      DRONE_BUILD_EVENT:           push
      DRONE_REPO_SCM:
      DRONE_MACHINE:               ip-10-0-3-52.eu-west-1.compute.internal
      DRONE_COMMIT_MESSAGE:        drone
      DRONE_COMMIT:                bd9618f512c528bc6491368529c309589fe45a85
      DRONE_BUILD_FINISHED:        1550855521
      DRONE_REMOTE_URL:            https://github.com/example/docker-image-helloworld.git
      DRONE_BUILD_STATUS:          success
      DRONE_JOB_STARTED:           1550855521
      CI_WORKSPACE_BASE:           /drone/src
      CI_WORKSPACE:                /drone/src
      DRONE_COMMIT_SHA:            bd9618f512c528bc6491368529c309589fe45a85
      DRONE_BUILD_LINK:            https://drone.example.com/example/docker-image-helloworld/1
      DRONE_GIT_SSH_URL:           git@github.com:example/docker-image-helloworld.git
      CI_JOB_FINISHED:             1550855521
      DRONE_COMMIT_LINK:           https://github.com/example/docker-image-helloworld/compare/8a6999ab1916...bd9618f512c5
      DRONE_BUILD_ACTION:
      DRONE_COMMIT_BEFORE:         8a6999ab19161144577671e879bfa1c062b51609
      CI:                          true
      DRONE_SYSTEM_HOST:           drone.example.com
      CI_NETRC_USERNAME:           b94102da57f552387c2613d8d686329232c53752
      DRONE_REPO_LINK:             https://github.com/example/docker-image-helloworld
      DRONE_REPO_BRANCH:           master
      KUBERNETES_NODE:             (v1:spec.nodeName)
    Mounts:
      /drone/src from ogbpsy38qodi1pktv954j2edhypds5ll (rw)
Conditions:
  Type           Status
  PodScheduled   False
Volumes:
  ogbpsy38qodi1pktv954j2edhypds5ll:
    Type:          HostPath (bare host directory volume)
    Path:          /tmp/drone/hnh5f4t4fl16znfsxz016fu0op9zv2vd/ogbpsy38qodi1pktv954j2edhypds5ll
    HostPathType:  DirectoryOrCreate
QoS Class:       BestEffort
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason            Age                     From               Message
  ----     ------            ----                    ----               -------
  Warning  FailedScheduling  2m57s (x25 over 3m31s)  default-scheduler  0/7 nodes are available: 7 node(s) didn't match node selector.
root@af4ec1e7e484:/work/kube-apps#
Providing a sample of your .drone.yml is also helpful.
.drone.yml:
kind: pipeline
name: build-docker

steps:
  - name: build
    image: docker:dind
    volumes:
      - name: dockersock
        path: /var/run/docker.sock
    commands:
      - docker ps -a

volumes:
  - name: dockersock
    host:
      path: /var/run/docker.sock
I’ve also tried a simpler step, something like:
steps:
  - name: build
    image: golang
    commands:
      - go build
      - go test
I think this is the crux of the issue. Can you see why it is not being scheduled? There must be some information provided by Kubernetes to better understand why something was not scheduled.
Drone uses node affinity to schedule your pipeline steps on the same machine as the pipeline controller (e.g. drone-job-…). The pipeline controller is configured to receive the host node information via spec.nodeName, as described in the Kubernetes docs [1]. This is used to schedule all pipeline steps on the same node as the controller: pipeline steps are created with node affinity that matches the spec.nodeName to the kubernetes.io/hostname label [2]. That is one thing I did not see in the pod spec you posted, where the node selectors looked empty …
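Roughly, the mechanism looks like this (a sketch based on [2], not the exact manifests Drone generates):

# 1) The pipeline controller pod receives its node name via the downward API
#    (this matches the KUBERNETES_NODE entry in your pod description above):
env:
  - name: KUBERNETES_NODE
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName

# 2) Each pipeline step pod is then created with node affinity pinning it to
#    that node, roughly equivalent to:
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - <value of KUBERNETES_NODE>

So if no node carries a kubernetes.io/hostname label equal to the controller’s spec.nodeName, every step pod is unschedulable.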
Unfortunately I do not use Kubernetes and have limited working knowledge, which means the best I can do is point you to the source code to get a better understanding of how things work. I have personally tested Drone on Kubernetes with DigitalOcean and Minikube and have not run into any issues; however, I recognize that no two clusters are the same.
[1] https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/#use-pod-fields-as-values-for-environment-variables
[2] https://github.com/drone/drone-runtime/blob/master/engine/kube/kube.go#L136
I added some more detailed notes to my previous reply.
You might also find this post helpful for debugging at a lower level: Contributing to Drone for Kubernetes.
I ran into this same problem and did a little debugging. With the way my cluster is set up (kops on AWS), the nodeName does not correspond to the hostname. You can verify that you’re encountering the problem by looking at the spec of the pod generated by the runner/job and inspecting the node affinity requirements that were generated.
I think using the kubernetes.io/hostname label in the pod affinity might not be entirely correct, since that assumes the nodeName and hostname are always identical.
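To make the mismatch concrete, here is an abridged, illustrative Node object for this kind of kops/AWS setup (the label value is an assumption; check your own cluster with kubectl get nodes --show-labels):

apiVersion: v1
kind: Node
metadata:
  name: ip-10-0-3-52.eu-west-1.compute.internal   # what spec.nodeName reports
  labels:
    kubernetes.io/hostname: ip-10-0-3-52          # hypothetical: the label the affinity matches against

Whenever those two values differ, affinity on kubernetes.io/hostname can never match the value taken from spec.nodeName.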
Instead, why don’t we consider using https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#nodename since that does select nodes by name and will accomplish the same thing that you’re trying to do with the pod affinity/hostname.
Alternatively, we could change the fieldRef from passing the nodeName to instead passing the kubernetes.io/hostname label.
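As hedged sketches (values illustrative, not taken from Drone’s code), the generated step pod could look like this under each proposal:

# Option 1: pin the pod directly by node name; spec.nodeName bypasses label
# matching entirely, so a nodeName/hostname mismatch no longer matters:
spec:
  nodeName: ip-10-0-3-52.eu-west-1.compute.internal   # taken from the controller's spec.nodeName

# Option 2: keep the hostname-label affinity, but feed the controller the
# kubernetes.io/hostname label value instead of spec.nodeName. Note the
# downward API exposes pod fields, not node labels, so the label would have
# to reach the controller some other way (e.g. a lookup against the API).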
Thoughts?
Sure, please consider sending a pull request. I am not actively working on the Kubernetes implementation right now (I have other priorities I need to address) but am happy to accept pull requests.
hanjunlee (Noah Hanjun Lee), July 14, 2019:
Hi, I had the same issue: the pod could not be scheduled because the kubernetes.io/hostname label was not the same as the node name. So I made a PR that uses the node name instead. Thanks!