This Week, In Brief
Draft the first version of Ascent Checklist CI/CD within Kubernetes clusters!
Highlights
Goals
- Create source code repository validator
- Create Ascent Checklist schema and validation
- Create Ascent Checklist provisioner with a simple test deployment scenario
Goal Grades
Create source code repository validator
- Result: Didn’t even get to start this. :(
- Grade: F
Create Ascent Checklist schema and validation
- Result: Didn’t even get to start this. :(
- Grade: F
Create Ascent Checklist provisioner with a simple test deployment scenario
- Result: Didn’t even get to start this. :(
- Grade: F
More Stuff
ClusterAPI works great…until it doesn’t
I spent my entire week fighting ClusterAPI, which was really frustrating.
Long story short…is still pretty long. Posted below.
I really hate running into deal-breaking walls like this.
ClusterAPI works great…until it doesn’t, the novella
I hope this entry makes sense, because this journey was extremely confusing.
The end-to-end tests for the Kubernetes Developer Platform Provisioner used Kubernetes v1.22.1 for clusters created within Management Planes and Platforms. This means three things (a rough sketch of the equivalent manual steps follows this list):
- The kind Cluster provider will download and use the kind utility to create a Management Planes cluster pair with the --kubernetes-version 1.22.1 flag set,
- The ClusterAPI manifest rendered from the platform/templates/platform/xsmall-values template will set the kubernetesVersion property to 1.22.1 as well, and
- When capi-controller-manager receives the KubeadmControlPlane defined in that manifest, it will internally use kind to bootstrap the Platform clusters.
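For concreteness, here's a rough sketch of what those steps look like if you do them by hand with stock kind and clusterctl. This is not the provisioner's actual code; the cluster names, the kindest/node image tag, and the plain Docker-flavored clusterctl template are all just illustrative:

    # Pin the management cluster's node image to Kubernetes v1.22.1.
    kind create cluster --name management-plane --image kindest/node:v1.22.1

    # Render a workload ("Platform") cluster manifest pinned to the same version;
    # this is roughly what the rendered template does with kubernetesVersion.
    clusterctl generate cluster platform \
      --infrastructure docker \
      --kubernetes-version v1.22.1 \
      --control-plane-machine-count 1 \
      --worker-machine-count 1 > platform.yaml

    # The KubeadmControlPlane and MachineDeployment in the rendered manifest
    # should both carry the pinned version.
    grep -n "version: v1.22.1" platform.yaml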
The issue here was twofold. Some of this might be incorrect:
cgroup confusion
The kubelet service would not start on the Platform clusters due to this error: "failed to run Kubelet: invalid configuration: cgroup-root [\"kubelet\"] doesn't exist". Based on this thread, I think this was occurring due to a conflict between the OS my Docker Engine was running on (Alpine) and kubeadm bootstrapping.

Because these nodes run within Docker, they need to volume-mount the host's cgroup mounts into the container so that containers created within Pods receive the correct control groups. However, the location of these cgroup mounts can vary depending on whether or not the host uses systemd to manage cgroups.

Alpine uses OpenRC instead of systemd and has non-conventional mount points for cgroups. This trips up the kubelet and prevents it from starting.
This is further complicated by the fact that (a) cluster-api uses Ubuntu in its end-to-end tests, and (b) the tests aren't matrixed across other operating systems (or at least I didn't see any evidence of this from looking at their GitHub workflows).
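A quick way to see what your own host is doing before kind/capd trips over it (a sketch; the output and the node container name below are placeholders and will vary by distro):

    # Which cgroup driver and cgroup version the Docker Engine reports; on an
    # Alpine/OpenRC host this won't be the systemd setup most guides assume.
    docker info --format '{{.CgroupDriver}} (cgroup v{{.CgroupVersion}})'

    # What a provisioned node container actually sees mounted at /sys/fs/cgroup.
    # Replace <node-container> with the name of a kind/capd node container.
    docker exec <node-container> mount | grep cgroup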
Lots of version incompatibilities
Backwards compatibility does not seem to exist between cluster manifests generated by a version of clusterctl that is older than the version of capd-controller-manager. (That is, a cluster manifest generated by clusterctl 1.2.5 will be accepted by capd-controller-manager 1.2.5; however, a cluster manifest generated by clusterctl 1.2.4 will cause unpredictable behavior when submitted to capd-controller-manager 1.2.5.)
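My takeaway, for now, is to check that the clusterctl binary and the installed providers come from the same release before generating anything. Both of these are stock clusterctl commands; the exact output format varies by release:

    # The version of the clusterctl binary that will render cluster manifests.
    clusterctl version

    # The provider versions actually installed in the management cluster, and
    # whether clusterctl thinks any of them are due for an upgrade.
    clusterctl upgrade plan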
I noticed this while trying to get a working control plane provisioned on an Ubuntu machine running the Docker Engine. In this scenario (a kind cluster running v1.23, clusterctl version 1.2.5, requesting a Kubernetes cluster running 1.22.1), the kubelet within capd-provisioned nodes would not start because the kube-apiserver container was not starting. Logs from the failing kube-apiserver container showed that it was failing to start due to an unknown resource, PodSecurityConfiguration, being created during Pod admission.
Looking at the Pod configuration for the kube-apiserver showed that it mounted base manifests from /etc/kubernetes/manifests on the node. Sure enough, there was a manifest in this folder that defined a PodSecurityConfiguration within an AdmissionConfiguration resource, using the pod-security.admission.config.k8s.io/v1beta1 API, which did not ship with 1.22.1 (this was actually in v1 of that API).
(I knew to look at the kube-apiserver Pod configuration because the systemd unit for the kubelet showed that it loads static Pods from the /etc/kubernetes/manifests folder.)
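For anyone retracing this, the whole inspection boils down to exec-ing into the Docker container backing the failing control-plane node (<control-plane-node> below is a placeholder for that container's name):

    # The kube-apiserver static Pod manifest that the kubelet loads at startup.
    docker exec <control-plane-node> cat /etc/kubernetes/manifests/kube-apiserver.yaml

    # Find whichever file under /etc/kubernetes references the
    # pod-security.admission.config.k8s.io API mentioned above.
    docker exec <control-plane-node> grep -rl "pod-security.admission.config.k8s.io" /etc/kubernetes/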
As it happens, this admission configuration was created as a result of this
PR…which assumed
a later version of Kubernetes.
TL;DR: I should have read
this before
arbitrarily deciding to use k8s v1.22.1 for everything…and running
containerd/Docker Engine on Alpine is officially a Bad Idea™.