Core Components & Roles
Introduction
Welcome to our deep dive into the architecture of Kubernetes. In this lesson, we will look at each core component, understand its specific role, and most importantly, see how they all communicate to bring your applications to life.
To begin, let’s use a simple analogy:
The Control Plane is the “brain” or the command center of the operation, making all the decisions. The Worker Nodes are the “muscle” or the workforce, carrying out the orders from the brain.
The Control Plane: The Brain of the Operation
The Control Plane consists of the master processes that manage the entire cluster. Its components make global decisions about the cluster (for example, scheduling) and detect and respond to cluster events. For a production environment, you would run the control plane components across multiple master nodes to ensure high availability. Even if you lose a control plane node, your applications on the worker nodes will continue to run without downtime.
Let’s break down the components of the Control Plane.
1. kube-apiserver
If the Control Plane is the brain, the kube-apiserver is the brain’s “frontal lobe” — its single point of contact. It is the heart of all communication in Kubernetes. Every other component, whether it’s part of the Control Plane or on a Worker Node, speaks only to the kube-apiserver. When you use the kubectl command-line tool, you are talking directly to the kube-apiserver.
It is responsible for three key functions:
- Exposing the Kubernetes API: It provides the primary RESTful interface to the cluster, allowing users, administrators, and other components to query and manipulate the state of objects in Kubernetes.
- Validating & Processing Requests: It intercepts all incoming requests, validates them against the API rules, and ensures they are legitimate before any changes are made to the cluster’s state.
- Acting as a Gateway: It is the only component that is allowed to communicate directly with etcd, the cluster’s database.
This architecture ensures that all traffic is funneled through one central, secure, and auditable point.
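The gateway pattern described above can be sketched in a few lines. This is a toy model, not the real Kubernetes API machinery: a dictionary stands in for etcd, and the validation rule is deliberately trivial. The `/registry/...` key convention mirrors how Kubernetes actually keys objects in etcd.

```python
# Minimal sketch of the kube-apiserver's request flow: validate an
# incoming object, persist it to the store (standing in for etcd),
# and let watching components observe the change.

class TinyAPIServer:
    def __init__(self):
        self.store = {}        # stands in for etcd
        self.watchers = []     # components watching for changes

    def validate(self, obj):
        # Real admission/validation is far richer; here we only
        # require a kind and a name.
        return "kind" in obj and "name" in obj

    def create(self, obj):
        if not self.validate(obj):
            raise ValueError("rejected: invalid object")
        key = f"/registry/{obj['kind'].lower()}s/{obj['name']}"
        self.store[key] = obj          # only the API server touches the store
        for notify in self.watchers:   # watchers react to the new state
            notify(key, obj)
        return key

api = TinyAPIServer()
seen = []
api.watchers.append(lambda key, obj: seen.append(key))
api.create({"kind": "Pod", "name": "web-1"})
print(seen)  # → ['/registry/pods/web-1']
```

Notice that the scheduler, controllers, and kubelets would all sit on the `watchers` side of this design: they never write to the store themselves.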
2. etcd
etcd is the cluster’s database. It is a consistent and highly-available key-value store that serves as the single source of truth for your entire cluster. Every piece of configuration, every object definition (like Deployments, Pods, and Services), and the current state of every node and workload is stored in etcd.
If the kube-apiserver is the gateway, etcd is the official record book.
Because it holds the entire state of your cluster, etcd is the most critical component to protect and back up. Losing etcd data means losing your cluster’s state.
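To make "key-value store" concrete, here is an illustrative view of how cluster state lives in etcd's flat keyspace. The `/registry/...` key layout follows the real Kubernetes convention; the store itself and the object contents are simplified stand-ins.

```python
# Every object lives under a well-known key prefix in etcd's flat
# key-value space. A plain dict models the store here.
etcd = {
    "/registry/pods/default/web-1":      {"node": "node-2", "phase": "Running"},
    "/registry/pods/default/web-2":      {"node": "node-1", "phase": "Running"},
    "/registry/deployments/default/web": {"replicas": 2},
    "/registry/services/default/web":    {"clusterIP": "10.96.0.10"},
}

def list_prefix(store, prefix):
    # etcd range reads work the same way: fetch every key under a prefix.
    return sorted(k for k in store if k.startswith(prefix))

print(list_prefix(etcd, "/registry/pods/"))
# → ['/registry/pods/default/web-1', '/registry/pods/default/web-2']
```

Backing up etcd therefore means backing up exactly this keyspace: restore it, and the API server can serve the cluster's full state again.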
3. kube-scheduler
The kube-scheduler has one very specific job: it decides which Worker Node a newly created Pod should run on. It doesn’t actually run the Pod itself; it just makes the placement decision.
Here’s how it works:
- The scheduler constantly watches the kube-apiserver for new Pods that do not yet have a node assigned to them.
- For each Pod, it runs a complex algorithm, first filtering out nodes that don’t meet the Pod’s requirements (e.g., insufficient CPU or memory).
- It then scores the remaining valid nodes to find the “best” possible fit based on various criteria.
- Once it decides, it simply notifies the kube-apiserver of its decision (e.g., “Assign this Pod to Worker Node 3”).
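The filter-then-score flow above can be sketched as follows. The scoring rule (favor the node with the most free CPU) is a made-up example for illustration, not the real kube-scheduler's scoring plugins.

```python
# Illustrative sketch of the scheduler's two phases: filter out nodes
# that can't fit the Pod, then score the survivors and pick the best.

def schedule(pod, nodes):
    # Filtering: drop nodes without enough free CPU or memory.
    feasible = [
        n for n in nodes
        if n["free_cpu"] >= pod["cpu"] and n["free_mem"] >= pod["mem"]
    ]
    if not feasible:
        return None  # no fit: the Pod stays Pending
    # Scoring: here, simply prefer the node with the most free CPU.
    best = max(feasible, key=lambda n: n["free_cpu"])
    return best["name"]  # the decision reported back to the API server

nodes = [
    {"name": "node-1", "free_cpu": 2, "free_mem": 4},
    {"name": "node-2", "free_cpu": 8, "free_mem": 16},
    {"name": "node-3", "free_cpu": 1, "free_mem": 2},
]
print(schedule({"cpu": 2, "mem": 4}, nodes))  # → node-2
```

The `None` branch matters: when no node fits, the real scheduler leaves the Pod in the Pending state and retries as cluster conditions change.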
4. kube-controller-manager
The kube-controller-manager is the engine that drives the state of the cluster towards your desired state. It runs multiple background processes known as “reconciliation loops.”
Think of it like a thermostat. You set a desired temperature (the desired state), and the thermostat constantly checks the current temperature (the actual state) and turns the heating or cooling on or off to match it.
The controller manager is a single binary that contains multiple controllers. For example:
- Node Controller: Watches for nodes becoming unhealthy and takes action.
- Deployment Controller: Ensures that the number of Pods running matches the number of replicas you specified in your Deployment manifest. If a Pod fails, this controller detects it and requests a replacement.
These controllers constantly watch the cluster’s state via the API server and make changes (also via the API server) to correct any deviations from your desired configuration.
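The thermostat analogy translates directly into code. Here is a toy reconciliation function in the spirit of the replica-count controller described above; in real Kubernetes the "actions" would be requests sent to the API server, not strings.

```python
# Thermostat-style reconciliation: compare the desired replica count
# with what is actually running, and emit the actions that close the gap.

def reconcile(desired_replicas, running_pods):
    """Return the actions needed to move actual state toward desired state."""
    actions = []
    if len(running_pods) < desired_replicas:
        for _ in range(desired_replicas - len(running_pods)):
            actions.append("create pod")
    elif len(running_pods) > desired_replicas:
        for pod in running_pods[desired_replicas:]:
            actions.append(f"delete {pod}")
    return actions  # in Kubernetes, these become API-server requests

# A Pod died: desired 3, only 2 running, so the controller asks for one more.
print(reconcile(3, ["web-1", "web-2"]))  # → ['create pod']
# Scaled down: desired 1, two running, so one gets deleted.
print(reconcile(1, ["web-1", "web-2"]))  # → ['delete web-2']
```

The real controllers run this comparison in a loop, forever: any deviation between desired and actual state is corrected on the next pass.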
The Worker Nodes: The Workforce
The Control Plane has made the decisions. Now, the Worker Nodes do the actual work. Each worker node (which can be a VM or a physical machine) runs a few key components that allow it to be managed by the control plane and, most importantly, to run your containerized applications.
1. kubelet
The kubelet is the primary agent that runs on every single worker node in the cluster. It acts as the local representative of the Control Plane. It registers its node with the kube-apiserver and then constantly watches for Pods that have been scheduled to it.
Once the kubelet sees a Pod assigned to its node, its job is to:
- Read the Pod’s specifications (the “PodSpec”).
- Communicate with the container runtime to pull the required container images and start the containers.
- Continuously monitor the health of those containers and report their status back to the kube-apiserver.
If a container fails or the kubelet can’t run a Pod for any reason, it’s the kubelet that reports this failure status back to the Control Plane.
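The steps above can be sketched as a single sync pass. The container runtime here is a stub (its "image pull fails when the name contains `bad`" rule is invented purely to exercise the failure path), but the shape of the loop mirrors the kubelet's job: start what the PodSpec asks for, record what happened, report it upward.

```python
# Sketch of the kubelet's sync step for one Pod: read the PodSpec, ask a
# (fake) container runtime to start each container, and build the status
# the kubelet would report back to the API server.

class FakeRuntime:
    def start(self, image):
        # Stub rule: pretend images with "bad" in the name fail to pull.
        if "bad" in image:
            raise RuntimeError(f"image pull failed: {image}")
        return "Running"

def sync_pod(pod_spec, runtime):
    statuses = {}
    for c in pod_spec["containers"]:
        try:
            statuses[c["name"]] = runtime.start(c["image"])
        except RuntimeError as err:
            statuses[c["name"]] = f"Failed ({err})"
    return statuses  # reported back to the kube-apiserver

spec = {"containers": [
    {"name": "app", "image": "nginx:1.27"},
    {"name": "sidecar", "image": "bad/image"},
]}
print(sync_pod(spec, FakeRuntime()))
```

Note that the kubelet never decides *where* Pods run; it only executes and reports on the Pods already assigned to its node.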
2. kube-proxy
The kube-proxy is a network proxy that runs on each node and is responsible for the networking magic behind Kubernetes Services. Its primary job is to maintain the network rules on the node itself. It does this by watching the kube-apiserver for the creation and removal of Service and Endpoint objects. Based on that information, it configures the node’s underlying networking system (typically using iptables or IPVS) to ensure traffic sent to a Service’s IP address is correctly forwarded to the right backend Pods, regardless of which node those Pods are running on.
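What those rules accomplish can be modeled as a lookup table from a Service's virtual IP to its backend Pod IPs. Real kube-proxy programs iptables or IPVS in the kernel rather than running Python, and the round-robin choice below is just one possible balancing policy; the IPs are made up.

```python
# Toy model of kube-proxy's job: traffic sent to a Service IP is
# rewritten to one of the backend Pod IPs, cycling round-robin.
import itertools

class ServiceRules:
    def __init__(self):
        self.backends = {}   # service IP -> iterator over backend Pod IPs

    def update(self, service_ip, pod_ips):
        # Called when Service/Endpoint objects change in the API server.
        self.backends[service_ip] = itertools.cycle(pod_ips)

    def route(self, service_ip):
        return next(self.backends[service_ip])

rules = ServiceRules()
rules.update("10.96.0.10", ["172.17.0.4", "172.17.0.7"])
print(rules.route("10.96.0.10"))  # → 172.17.0.4
print(rules.route("10.96.0.10"))  # → 172.17.0.7
print(rules.route("10.96.0.10"))  # → 172.17.0.4 (wraps around)
```

Because every node runs kube-proxy with the same rules, a client can hit the Service IP from anywhere in the cluster and reach a healthy backend Pod.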
3. Container Runtime
The Container Runtime is the software that is actually responsible for running containers. Kubernetes itself doesn’t run containers; it orchestrates them. It relies on a container runtime to do the heavy lifting of starting, stopping, and managing the container lifecycle.
The container runtime handles low-level operational duties like managing network isolation and allocating CPU and memory for containers. Kubernetes communicates with the container runtime through a standardized API called the Container Runtime Interface (CRI). This abstraction allows Kubernetes to remain flexible and work with different CRI-compliant runtimes.
Common examples include containerd and CRI-O.
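The value of the CRI abstraction is that the kubelet codes against one interface while any compliant runtime plugs in underneath. The sketch below uses a Python `Protocol` as a stand-in for the real gRPC-based CRI; the stub classes and their methods are illustrative, not the actual containerd or CRI-O APIs.

```python
# Sketch of runtime pluggability: the caller (standing in for the
# kubelet) sees only the interface, never the concrete runtime.
from typing import Protocol

class Runtime(Protocol):
    def pull_image(self, image: str) -> None: ...
    def start_container(self, image: str) -> str: ...

class StubContainerd:
    def pull_image(self, image: str) -> None:
        pass  # a real runtime would fetch the image here
    def start_container(self, image: str) -> str:
        return f"containerd:{image}"

class StubCRIO:
    def pull_image(self, image: str) -> None:
        pass
    def start_container(self, image: str) -> str:
        return f"cri-o:{image}"

def launch(runtime: Runtime, image: str) -> str:
    runtime.pull_image(image)
    return runtime.start_container(image)

# The same launch logic works against either runtime.
print(launch(StubContainerd(), "nginx:1.27"))  # → containerd:nginx:1.27
print(launch(StubCRIO(), "nginx:1.27"))        # → cri-o:nginx:1.27
```

Swapping runtimes changes nothing in `launch`, which is precisely the flexibility the CRI gives Kubernetes.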
Summary: Putting It All Together
The key architectural principle to remember is that all communication flows through the kube-apiserver.
The kube-scheduler doesn’t directly tell the kubelet what to do. The kube-controller-manager doesn’t directly interact with nodes. Instead, every component acts as a client of the API server. They watch for changes and then declare their intent back to the API server, which persists that intent in etcd. The kubelet on the worker node then sees the results of those decisions (like a Pod being assigned to it) from the API server and works to make it a reality.
This declarative, API-centric design is what makes Kubernetes so powerful, resilient, and extensible. Each component has a specific, focused job, and they are all loosely coupled through a central API.
Next Steps
Understanding these components and their roles is the absolute foundation for mastering Kubernetes. In our next topic, we’ll explore the concept of High Availability and why it’s so critical for the Control Plane.