Page 4 of 25

Changing the default apps wildcard certificate in OCP4

December 30, 2023 / David / 0 Comments

In a standard OCP4 installation, several route objects are created by default and secured with a internally signed wildcard certificate.

These routes are configured as <app-name>.apps.<domain>. In my example, I have a cluster with the assigned domain ocp-acm.virtualthoughts.co.uk, which results in the routes below:

oauth-openshift.apps.ocp-acm.virtualthoughts.co.uk
console-openshift-console.apps.ocp-acm.virtualthoughts.co.uk
grafana-openshift-monitoring.apps.ocp-acm.virtualthoughts.co.uk
thanos-querier-openshift-monitoring.apps.ocp-acm.virtualthoughts.co.uk
prometheus-k8s-openshift-monitoring.apps.ocp-acm.virtualthoughts.co.uk
alertmanager-main-openshift-monitoring.apps.ocp-acm.virtualthoughts.co.uk

Inspecting console-openshift-console.apps.ocp-acm.virtualthoughts.co.uk shows us the default wildcard TLS certificate used by the Ingress Operator:

Because it’s internally signed, it’s not trusted by default by external clients. However, this can be changed.

Installing Cert-Manager

OperatorHub includes the upstream cert-manager chart, as well as one maintained by Red Hat. This can be installed to manage the lifecycle of our new certificate. Navigate to Operators -> Operator Hub -> cert-manager and install.

Create `Secret`, `Issuer` and `Certificate` resources

With Cert-Manager installed, we need to provide configuration so it knows how to issue challenges and generate certificates. In this example:

Secret – A client secret created from my cloud provider for authentication used to satisfy the challenge type. In this example AzureDNS, as I’m using the DNS challenge request type to prove ownership of this domain.
ClusterIssuer – A cluster wide configuration that when referenced, determines how to get (issue) certs. You can have multiple Issuers in a cluster, namespace or cluster scoped pointing to different providers and configurations.
Certificate – TLS certs can be generated automatically from ingress annotations, however in this example, it is used to request and store the certificate in its own lifecycle, not tied to a specific ingress object.

Let’s Encrypt provides wildcard certificates, but only through the DNS-01 challenge. The HTTP-01 challenge cannot be used to issue wildcard certificates. This is reflected in the config:

apiVersion: v1
kind: Secret
metadata:
  name: azuredns-config
  namespace: cert-manager
type: Opaque
data:
  client-secret: <Base64 Encoded Secret from Azure>
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-production
  namespace: cert-manager
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: <email>
    privateKeySecretRef:
      name: letsencrypt
    solvers:
    - dns01:
        azureDNS:
          clientID: <clientID>
          clientSecretSecretRef:
            name: azuredns-config
            key: client-secret
          subscriptionID: <subscriptionID>
          tenantID: <tenantID>
          resourceGroupName: <resourceGroupName>
          hostedZoneName: virtualthoughts.co.uk
          # Azure Cloud Environment, default to AzurePublicCloud
          environment: AzurePublicCloud
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: wildcard-apps-certificate
  namespace: openshift-ingress
spec:
  secretName: apps-wildcard-tls
  issuerRef:
    name: letsencrypt-production
    kind: ClusterIssuer
  commonName: "*.apps.ocp-acm.virtualthoughts.co.uk"
  dnsNames:
  - "*.apps.ocp-acm.virtualthoughts.co.uk"

Applying the above will create the respective objects required for us to request, receive and store a wildcard certificate from LetsEncrypt, using the DNS challenge request with AzureDNS.

The certificate may take ~2mins or so to become Ready due to the nature of the DNS style challenge.

oc get cert -A

NAMESPACE           NAME                        READY   SECRET              AGE
openshift-ingress   wildcard-apps-certificate   True    apps-wildcard-tls   33m

Patch the Ingress Operator

With the certificate object created, the Ingress Operator needs re configuring, referencing the secret name of the certificate object for our new certificate:

oc patch ingresscontroller.operator default \
--type=merge -p \
'{"spec":{"defaultCertificate":{"name":"apps-wildcard-tls"}}}' \
--namespace=openshift-ingress-operator

Validate

After applying, navigating back to the clusters console will present the new wildcard cert:

Improving the CI/build process for the community Rancher Exporter

June 5, 2023 / David / 0 Comments

One of my side projects is developing and maintaining an unofficial Prometheus Exporter for Rancher. It exposes metrics pertaining to Rancher-specific resources including, but not limited to managed clusters, Kubernetes versions, and more. Below shows an example dashboard based on these metrics.

Incidentally, if you are using Rancher, I’d love to hear your thoughts/feedback.

Previous CI workflow

The flowchart below outlines the existing process. Whilst automated, pushing directly to latest is bad practice.

To improve this. Several additional steps were added. First of which acquires the latest, versioned image of the exporter and saves it to the $GITHUB_OUTPUT environment

    - name: Retrieve latest Docker image version
        id: get_version
        run: |
          echo "image_version=$(curl -s "https://registry.hub.docker.com/v2/repositories/virtualthoughts/prometheus-rancher-exporter/tags/" | jq -r '.results[].name' | grep -v latest | sort -V | tail -n 1)" >> $GITHUB_OUTPUT

Referencing this, the next version can be generated based on MAJOR.MINOR.PATCH. Incrementing the PATCH version. In the future, this will be modified to add more flexibility to change MAJOR and MINOR versions.

      - name: Increment version
        id: increment_version
        run: |
          # Increment the retrieved version
          echo "updated_version=$(echo "${{ steps.get_version.outputs.image_version }}" | awk -F. -v OFS=. '{$NF++;print}')" >> $GITHUB_OUTPUT

With the version generated, the subsequent step can tag and push both the incremented version, and latest.

      - name: Build and push
        uses: docker/build-push-action@v3
        with:
          context: .
          push: true
          tags: |
            virtualthoughts/prometheus-rancher-exporter:${{ steps.increment_version.outputs.updated_version }}
            virtualthoughts/prometheus-rancher-exporter:latest

Lastly, the Github action will also modify the YAML manifest file to reference the most recent, versioned image:

      - name: Update Kubernetes YAML manifest
        run: |
          # Install yq
          curl -sL https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64 -o yq
          chmod +x yq
          sudo mv yq /usr/local/bin/
          
          # Find and update the image tag in the YAML file
          IMAGE_NAME="virtualthoughts/prometheus-rancher-exporter"
          NEW_TAG="${{ steps.increment_version.outputs.updated_version }}"
          OLD_TAG=$(yq eval '.spec.template.spec.containers[] | select(.name == "rancher-exporter").image' manifests/exporter.yaml | cut -d":" -f2)
          NEW_IMAGE="${IMAGE_NAME}:${NEW_TAG}"
          sed -i "s|${IMAGE_NAME}:${OLD_TAG}|${NEW_IMAGE}|" manifests/exporter.yaml

Which results in:

Simplify Multus deployments with Rancher and RKE2

May 8, 2023 / David / 0 Comments

From my experience, some environments necessitate leveraging multiple NICs on Kubernetes worker nodes as well as the underlying Pods. Because of this, I wanted to create a test environment to experiment with this kind of setup. Although more common in bare metal environments, I’ll create a virtualised equivalent.

Planning

This is what I have in mind:

In RKE2 vernacular, we refer to nodes that assume etcd and/or control plane roles as servers, and worker nodes as agents.

Server Nodes

Server nodes will not run any workloads. Therefore, they only require 1 NIC. This will reside on VLAN40 in my environment and will act as the overlay/management network for my cluster and will be used for node <-> node communication.

Agent Nodes

Agent nodes will be connected to multiple networks:

VLAN40 – Used for node <-> node communication.
VLAN50 – Used exclusively by Longhorn for replication traffic. Longhorn is a cloud-native distributed block storage solution for Kubernetes.
VLAN60 – Provide access to ancillary services.

Creating Nodes

For the purposes of experimenting, I will create my VMs first.

Server VM config:

Agent VM Config:

Rancher Cluster Configuration

Using Multus is as simple as selecting it from the dropdown list of CNI’s. We have to have an existing CNI for cluster networking, which is Canal in this example

The section “Add-On Config” enables us to make changes to the various addons for our cluster:

This cluster has the following tweaks:

calico:
  ipAutoDetectionMethod: interface=ens192

flannel:
  backend: host-gw
  iface: ens192

The Canal CNI is a combination of both Calico and Flannel. Which is why the specific interface used is defined in both sections.

With this set, we can extract the join command and run it on our servers:

Tip – Store the desired node-ip in a config file before launching the command on the nodes. Ie:

packerbuilt@mullti-homed-wrk-1:/$ cat /etc/rancher/rke2/config.yaml
node-ip: 172.16.40.47

NAME                 STATUS   ROLES                       AGE   VERSION          INTERNAL-IP    EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
multi-homed-cpl-1   Ready    control-plane,etcd,master   42h   v1.25.9+rke2r1   172.16.40.46   <none>        Ubuntu 22.04.1 LTS   5.15.0-71-generic   containerd://1.6.19-k3s1
multi-homed-cpl-2   Ready    control-plane,etcd,master   41h   v1.25.9+rke2r1   172.16.40.49   <none>        Ubuntu 22.04.1 LTS   5.15.0-71-generic   containerd://1.6.19-k3s1
multi-homed-cpl-3   Ready    control-plane,etcd,master   41h   v1.25.9+rke2r1   172.16.40.50   <none>        Ubuntu 22.04.1 LTS   5.15.0-71-generic   containerd://1.6.19-k3s1
multi-homed-wrk-1   Ready    worker                      42h   v1.25.9+rke2r1   172.16.40.47   <none>        Ubuntu 22.04.1 LTS   5.15.0-71-generic   containerd://1.6.19-k3s1
multi-homed-wrk-2   Ready    worker                      42h   v1.25.9+rke2r1   172.16.40.48   <none>        Ubuntu 22.04.1 LTS   5.15.0-71-generic   containerd://1.6.19-k3s1
multi-homed-wrk-3   Ready    worker                      25h   v1.25.9+rke2r1   172.16.40.51   <none>        Ubuntu 22.04.1 LTS   5.15.0-71-generic   containerd://1.6.19-k3s1

Pod Networking

Multus is not a CNI in itself, but a meta CNI plugin, enabling the use of multiple CNI’s in a Kubernetes cluster. At this point we have a functioning cluster with an overlay network in place for cluster communication, and every Pod will have a interface on that network. So which other CNI’s can we use?

Out of the box, we can query the /opt/cni/bin directory for available plugins. You can also add additional CNI’s if you wish.

packerbuilt@mullti-homed-wrk-1:/$ ls /opt/cni/bin/
bandwidth  calico       dhcp      flannel      host-local  ipvlan    macvlan  portmap  sbr     tuning  vrf
bridge     calico-ipam  firewall  host-device  install     loopback  multus   ptp      static  vlan

For this environment, macvlan will be used. It provides MAC addresses directly to Pod interfaces which makes it simple to integrate with network services like DHCP.

Defining the Networks

Through NetworkAttachmentDefinition objects, we can define the respective networks and bridge them to named, physical interfaces on the host:

apiVersion: v1
kind: Namespace
metadata:
  name: multus-network-attachments
---
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: macvlan-longhorn-dhcp
  namespace: multus-network-attachments
spec:
  config: '{
      "cniVersion": "0.3.0",
      "type": "macvlan",
      "master": "ens224",
      "mode": "bridge",
      "ipam": {
        "type": "dhcp"
      }
    }'
---
apiVersion: "k8s.cni.cncf.io/v1"
kind: NetworkAttachmentDefinition
metadata:
  name: macvlan-private-dhcp
  namespace: multus-network-attachments
spec:
  config: '{
      "cniVersion": "0.3.0",
      "type": "macvlan",
      "master": "ens256",
      "mode": "bridge",
      "ipam": {
        "type": "dhcp"
      }
    }'

We use an annotation to attach a pod to additional networks

apiVersion: v1
kind: Pod
metadata:
  name: net-tools
  namespace: multus-network-attachments
  annotations:
    k8s.v1.cni.cncf.io/networks: multus-network-attachments/macvlan-longhorn-dhcp,multus-network-attachments/macvlan-private-dhcp
spec:
  containers:
  - name: samplepod
    command: ["/bin/bash", "-c", "sleep 2000000000000"]
    image: ubuntu

Which we can validate within the pod:

root@net-tools:/# ip addr show
3: eth0@if2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default 
    link/ether 1a:57:1a:c1:bf:f3 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.42.5.27/32 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::1857:1aff:fec1:bff3/64 scope link 
       valid_lft forever preferred_lft forever
4: net1@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether aa:70:ab:b6:7a:86 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.16.50.40/24 brd 172.16.50.255 scope global net1
       valid_lft forever preferred_lft forever
    inet6 fe80::a870:abff:feb6:7a86/64 scope link 
       valid_lft forever preferred_lft forever
5: net2@if5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 62:a6:51:84:a9:30 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.16.60.30/24 brd 172.16.60.255 scope global net2
       valid_lft forever preferred_lft forever
    inet6 fe80::60a6:51ff:fe84:a930/64 scope link 
       valid_lft forever preferred_lft forever

root@net-tools:/# ip route
default via 169.254.1.1 dev eth0 
169.254.1.1 dev eth0 scope link 
172.16.50.0/24 dev net1 proto kernel scope link src 172.16.50.40 
172.16.60.0/24 dev net2 proto kernel scope link src 172.16.60.30

Testing access to a service on net2:

root@net-tools:/# curl 172.16.60.31
<!DOCTYPE html>
<html>
<head>
<title>Welcome to nginx!</title>

Configuring Longhorn

Longhorn has a config setting to define the network used for storage operations:

If setting this post-install, the instance-manager pods will restart and attach a new interface:

instance-manager-e-437ba600ca8a15720f049790071aac70:/ # ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
3: eth0@if51: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default 
    link/ether fe:da:f1:04:81:67 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.42.1.58/32 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::fcda:f1ff:fe04:8167/64 scope link 
       valid_lft forever preferred_lft forever
4: lhnet1@if4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 12:90:50:15:04:c7 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 172.16.50.34/24 brd 172.16.50.255 scope global lhnet1
       valid_lft forever preferred_lft forever
    inet6 fe80::1090:50ff:fe15:4c7/64 scope link 
       valid_lft forever preferred_lft forever

Virtual Thoughts