This Week, In Brief
Draft the first version of Ascent Checklist CI/CD within Kubernetes clusters!
Highlights
Goals
- Create source code repository validator
- Create Ascent Checklist schema and validation
- Create Ascent Checklist provisioner with a simple test deployment scenario
Goal Grades
Create source code repository validator
- Result: Didn’t even get to start this. :(
- Grade: F
Create Ascent Checklist schema and validation
- Result: Didn’t even get to start this. :(
- Grade: F
Create Ascent Checklist provisioner with a simple test deployment scenario
- Result: Didn’t even get to start this. :(
- Grade: F
More Stuff
ClusterAPI works great…until it doesn’t
I spent my entire week fighting ClusterAPI, which was really frustrating.
Long story short…is still pretty long. Posted below.
I really hate running into dealbreaking walls like this.
ClusterAPI works great…until it doesn’t, the novella
I hope this entry makes sense, because this journey was extremely confusing.
The end-to-end tests for the Kubernetes Developer Platform Provisioner used Kubernetes v1.22.1 for
clusters created within Management Planes and Platforms. This means three things:
- The kind Cluster provider will download and use the kind utility to create
  a Management Planes cluster pair with the --kubernetes-version 1.22.1 flag set
  (see the sketch after this list),
- The ClusterAPI manifest rendered from the platform/templates/platform/xsmall-values
  template will set the kubernetesVersion property to 1.22.1 as well, and
- When capi-controller-manager receives the KubeadmControlPlane manifest from the
  manifest above, it will internally use kind to bootstrap the Platform clusters.
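For reference, this is roughly what that version pinning looks like with stock kind and clusterctl commands. The Provisioner wraps its own templates around this, so the names and flavors below are illustrative rather than the exact invocation it uses:

```sh
# Illustrative only: how the Kubernetes version ends up pinned in both places.

# 1. Management Plane: kind pins the version through its node image.
kind create cluster --name management-plane --image kindest/node:v1.22.1

# 2. Platform: clusterctl bakes the same version into the rendered manifest
#    (this stands in for the platform/templates/platform/xsmall-values template).
clusterctl generate cluster platform-test \
  --infrastructure docker \
  --kubernetes-version v1.22.1 \
  --control-plane-machine-count 1 \
  --worker-machine-count 1 | kubectl apply -f -
```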
The issue here was twofold. Some of this might be incorrect:
cgroup confusion
The kubelet service would not start on the Platform clusters due to this error:
"failed to run Kubelet: invalid configuration: cgroup-root [\"kubelet\"] doesn't exist".
Based on this thread, I think this was occurring due to a conflict between the OS my
Docker Engine was running on (Alpine) and kubeadm bootstrapping.
Because these nodes run within Docker, they need to volume-mount the host's cgroup
mounts into the container so that containers created within Pods receive the correct
control groups. However, the location of these cgroup mounts can vary depending on
whether the host uses systemd to manage cgroups or not.
Alpine uses OpenRC instead of systemd and has non-conventional mount points for
cgroups. This trips up the kubelet and prevents it from starting.
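If you suspect the same thing on your own machine, a few quick diagnostics make the mismatch visible. These are generic checks rather than a fix, and the docker info format fields assume a reasonably recent Docker Engine:

```sh
# How does Docker think cgroups are managed on this host?
docker info --format 'driver={{.CgroupDriver}} version={{.CgroupVersion}}'

# Which cgroup hierarchies are actually mounted?
mount | grep cgroup

# On a systemd host you'd expect /sys/fs/cgroup/systemd (v1) or a single
# unified cgroup2 mount (v2); OpenRC on Alpine lays this out differently.
ls /sys/fs/cgroup
```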
This is further complicated by the fact that (a) cluster-api uses Ubuntu in their
end-to-end tests, and (b) they don't matrix their tests to account for other
operating systems (or at least I didn't see any evidence of this from looking at
their GitHub workflows).
lots of version incompatibilities
Backwards compatibility does not seem to exist: capd-controller-manager cannot
reliably handle cluster manifests generated by a version of clusterctl older than
its own. (I.e., a cluster manifest generated by clusterctl 1.2.5 will be accepted by
capd-controller-manager 1.2.5; however, a cluster manifest generated by clusterctl
1.2.4 will cause unpredictable behavior when submitted to capd-controller-manager
1.2.5.)
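The lesson I'm taking from this is to confirm that clusterctl and the installed providers agree on a version before trusting a rendered manifest. Something like the following should do it; the namespace and Deployment name are the Docker provider's defaults, so adjust if your install differs:

```sh
# Version of the clusterctl binary that generates manifests
clusterctl version

# Installed provider versions, plus any pending upgrades
clusterctl upgrade plan

# Image tag actually running for the Docker infrastructure provider
kubectl -n capd-system get deployment capd-controller-manager \
  -o jsonpath='{.spec.template.spec.containers[0].image}{"\n"}'
```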
I noticed this while trying to get a working control plane provisioned on an Ubuntu
machine running the Docker Engine. In this scenario (kind cluster running v1.23,
clusterctl version 1.2.5, requesting a Kubernetes cluster running 1.22.1), the
kubelet within capd-provisioned nodes would not start because the kube-apiserver
container was not starting. Logs from the failing kube-apiserver container showed
that it was failing to start due to an unknown resource, PodSecurityConfiguration,
being created during Pod admission.
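For anyone retracing this: capd nodes are Docker containers running containerd, so the kube-apiserver logs have to be fished out with crictl from inside the node rather than with docker logs. Roughly (the node container name here is made up):

```sh
# Find the node container, then use crictl inside it to reach the crashed container
docker ps --filter name=control-plane
docker exec -it platform-test-control-plane crictl ps -a
docker exec -it platform-test-control-plane crictl logs <kube-apiserver-container-id>
```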
Looking at the Pod configuration for the kube-apiserver showed that it mounted base
manifests from /etc/kubernetes/manifests on the node. Sure enough, there was a
manifest in this folder that defined a PodSecurityConfiguration within an
AdmissionConfiguration resource, inside of an API,
pod-security.admission.config.k8s.io/v1beta1, that did not ship with 1.22.1
(this was actually in v1 of that API).
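For reference, that admission configuration looked roughly like the sketch below. This is reconstructed from the upstream PodSecurity docs rather than copied off the node, and the defaults values are placeholders; the important part is the plugin configuration's apiVersion, which a 1.22.1 kube-apiserver doesn't recognize:

```yaml
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
  - name: PodSecurity
    configuration:
      # v1beta1 of this group did not ship with Kubernetes 1.22
      apiVersion: pod-security.admission.config.k8s.io/v1beta1
      kind: PodSecurityConfiguration
      defaults:
        enforce: baseline
        enforce-version: latest
```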
(I knew to look at the kube-apiserver Pod configuration because the systemd unit for
kubelet showed that it loaded static Pods within the /etc/kubernetes/manifests
folder.)
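If you want to verify the same thing on your own nodes, something like this shows where the kubelet picks up static Pods (the paths are the usual kubeadm defaults):

```sh
systemctl cat kubelet                              # unit plus drop-ins, points at the kubelet config file
grep staticPodPath /var/lib/kubelet/config.yaml    # usually /etc/kubernetes/manifests on kubeadm nodes
ls /etc/kubernetes/manifests                       # the static Pod manifests themselves
```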
As it happens, this admission configuration was created as a result of this
PR…which assumed
a later version of Kubernetes.
TL;DR: I should have read
this before
arbitrarily deciding to use k8s v1.22.1 for everything…and running
containerd/Docker Engine on Alpine is officially a Bad Idea™.