Resilient & Multi-Tenant Kubernetes Cluster
Welcome back to my tech corner. Today, we are going to take a deep dive into some fundamental Kubernetes topics, including Init Containers, Deployments, DaemonSets, and Taints & Tolerations.
Prerequisites: Since I will be running Kubernetes locally using Docker and Kind, you will need:
- Docker Desktop installed on your system.
- Kind (Kubernetes in Docker) installed, along with the kubectl client.
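You can confirm the tooling is in place with quick version checks (exact output will vary by version):

```shell
docker --version
kind version
kubectl version --client
```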
The Problem
Imagine you are working for a company with two distinct departments: Production and Monitoring. You have a cluster with three worker nodes. However, Worker Node 2 is reserved exclusively for critical workloads.
The Rule: Production apps must not run on Worker Node 2.
The Exception: Monitoring apps (like log collectors) must run on every node, regardless of restrictions.
The Architecture

This is the high-level view of our Nodes:
- Worker Nodes 1 & 3: Will host the Frontend and Backend of the production application.
- Worker Node 3: Is the only node capable of running the SSD Cache app (simulating a hardware dependency).
- All Nodes: Must run the Monitoring Log Collector for compliance.
Step 1: Cluster Setup
First, we will create a cluster for this project with 3 worker nodes using a kind.yaml configuration file. Since we are using Kind, we need to configure port mapping to ensure NodePort services work correctly from our local machine.
The kind.yaml configuration:
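As a reference, a minimal sketch of what this file could look like for our setup: one control plane, three workers, and an extraPortMappings entry so the NodePort 30009 we use later for the frontend is reachable from localhost (the exact mapping placement is an assumption):

```yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
  # Map the frontend NodePort to the host so
  # http://localhost:30009 works from the browser.
  extraPortMappings:
  - containerPort: 30009
    hostPort: 30009
- role: worker
- role: worker
```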

Command to create the cluster:
$ kind create cluster --image [image-name] --name [cluster-name] --config kind.yaml
Verify the setup: Run the following commands to check your cluster status and nodes:
kind get clusters
kubectl get nodes

Step 2: Namespaces & Node Labeling
Now, let's create the Production and Monitoring namespaces. We will also taint Worker 2 for critical workloads and label Worker 3 for node affinity.
1. Create Namespaces (Imperative Commands):
kubectl create namespace prod
kubectl create namespace monitoring
2. Apply the Taint: We taint Worker 2 so standard pods cannot schedule there.
# Syntax: kubectl taint nodes [node-name] key=value:effect
kubectl taint nodes cka-dual-tenant-cluster-worker2 restricted=true:NoSchedule
3. Apply the Label: We label Worker 3 to simulate a node with a specific hardware feature (SSD).
kubectl label nodes cka-dual-tenant-cluster-worker3 ssd=true
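To double-check that both changes took effect, you can inspect the node objects (commands assume the node names used above):

```shell
# Should show: restricted=true:NoSchedule
kubectl describe node cka-dual-tenant-cluster-worker2 | grep -i taints

# Should include ssd=true in the LABELS column
kubectl get node cka-dual-tenant-cluster-worker3 --show-labels
```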
Step 3: Deploying Production Apps
The Backend App
We will create a Deployment that runs a simple Go application. The Twist: We will add an Init Container. The main application won't start until this container finishes its job (simulating a "Wait for Database" check).
The backend.yaml file
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend-app
  namespace: prod
  labels:
    app: backend
    tier: api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: backend
  template:
    metadata:
      labels:
        app: backend
    spec:
      # 1. This container runs FIRST.
      initContainers:
      - name: check-db-ready
        image: busybox:1.28
        # Simulates waiting for a database for 10 seconds
        command: ['sh', '-c', 'echo "Checking database connection..."; sleep 10; echo "DB is up!";']

      # 2. This container starts only after the Init Container finishes.
      containers:
      - name: main-app
        image: gcr.io/google-samples/hello-app:1.0
        ports:
        - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: backend-service
  namespace: prod
spec:
  # ClusterIP means it is ONLY accessible inside the cluster (secure)
  type: ClusterIP
  selector:
    app: backend
  ports:
  - port: 80         # The port other pods use to talk to this service
    targetPort: 8080 # The port the container is actually listening on

Observation: Immediately after applying, if you run kubectl get pods -n prod -w, you will see the status transition from Init:0/1 → PodInitializing → Running. This proves our resilience logic is working.
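For reference, the manifest can be applied and the init sequence watched with:

```shell
kubectl apply -f backend.yaml
kubectl get pods -n prod -w
```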

The Frontend App
The Frontend app runs an Nginx image with 2 replicas, exposed externally via NodePort 30009.
The frontend.yaml file
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend-app
  namespace: prod
  labels:
    app: frontend
    tier: web
spec:
  replicas: 2
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
      - name: nginx
        image: nginx:alpine
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: frontend-service
  namespace: prod
spec:
  # NodePort opens a port on your computer so you can access it in browser
  type: NodePort
  selector:
    app: frontend
  ports:
  - port: 80
    targetPort: 80
    nodePort: 30009 # We force a specific port for easy testing

Observation: After inspecting the pods, you will notice that the Frontend and Backend apps are only running on Worker 1 and Worker 3. Worker Node 2 is skipped entirely because of the taint we applied earlier.
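As before, apply the manifest and check which nodes the replicas landed on:

```shell
kubectl apply -f frontend.yaml
kubectl get pods -n prod -o wide
```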

Step 4: The Monitoring Agent (DaemonSet)
Now we will deploy the Log Collector. This DaemonSet utilizes a Toleration to ignore the "restricted" taint we created in Step 2.
The monitor-agent.yaml file
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: log-collector
  namespace: monitoring
  labels:
    app: logging
spec:
  selector:
    matchLabels:
      app: logging
  template:
    metadata:
      labels:
        app: logging
    spec:
      # 1. TOLERATIONS: This magic key allows this pod to land on the tainted node
      tolerations:
      - key: "restricted"
        operator: "Equal"
        value: "true"
        effect: "NoSchedule"

      containers:
      - name: fluentd-simulator
        image: busybox
        args:
        - /bin/sh
        - -c
        - >
          i=0;
          while true;
          do
            echo "$i: Collecting logs from node $(printenv MY_NODE_NAME)...";
            i=$((i+1));
            sleep 10;
          done
        env:
        # This helps us see which node the pod is actually running on in the logs
        - name: MY_NODE_NAME
          valueFrom:
            fieldRef:
              fieldPath: spec.nodeName

This DaemonSet uses a simple BusyBox image to simulate log collection. Because of the Toleration, this pod is allowed to run on all nodes, including the restricted Worker Node 2.
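Apply it and confirm that one pod appears per node:

```shell
kubectl apply -f monitor-agent.yaml
kubectl get pods -n monitoring -o wide
```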
Step 5: Node Affinity (SSD Cache)
Finally, let's use Node Affinity. We want a specific "Database Cache" pod that only runs on nodes backed by fast SSDs (Worker Node 3, which we labeled ssd=true).
The ssd-cache.yaml file
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ssd-cache
  namespace: monitoring
  labels:
    app: ssd-cache
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ssd-cache
  template:
    metadata:
      labels:
        app: ssd-cache
    spec:
      # 1. NODE AFFINITY: Only schedule on nodes labeled ssd=true (Worker 3)
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: ssd
                operator: In
                values:
                - "true"
      containers:
      - name: redis-cache
        image: redis:alpine
        ports:
        - containerPort: 6379

This deploys a simple Redis image that will strictly adhere to our hardware requirements.
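Apply it and verify the pod is pinned to Worker 3:

```shell
kubectl apply -f ssd-cache.yaml
kubectl get pods -n monitoring -o wide
```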
Verification
1. Visualizing the Nodes
Let's look at a detailed view of how the pods are distributed across each worker node.

If you check the monitoring namespace, you will see:
- The SSD Cache Pod is running only on Worker 3 (due to Affinity).
- The Log Collector is running on all 3 nodes (due to DaemonSet + Tolerations).

2. Verifying the Log Collector (The Taint Test)
The most important part of this project is proving that our Log Collector is running on Worker Node 2 (the restricted node) and actually doing its job.
- Find the Pod on the Restricted Node: First, list the pods with the node name to find the one running on worker2.
kubectl get pods -n monitoring -o wide
- Check the Logs: Copy that pod's name and check its output. It should be printing the node name it is running on.
kubectl logs [log-collector-pod-name] -n monitoring
- Expected Output:

This confirms that despite the "NoSchedule" taint, our infrastructure agent is successfully monitoring the critical node.
3. External Access (Frontend)
Open your browser and go to http://localhost:30009. You should see the "Welcome to nginx!" page.

4. Internal DNS (Frontend → Backend)
To test internal service discovery, we will log into the Frontend pod and try to reach the Backend using its Service name (backend-service).
Get the Frontend pod name:
kubectl get pods -n prod
Exec into the Frontend pod:
kubectl exec -it [frontend-pod-name] -n prod -- sh
Test connectivity via curl:
curl http://backend-service
(Note: If curl is missing, you can often verify DNS with nslookup backend-service).
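If service discovery works, the hello-app image used by the backend responds with a short greeting; you should see something like:

```
Hello, world!
Version: 1.0.0
Hostname: backend-app-<pod-name-suffix>
```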
