
Kubernetes


Kubernetes is an open-source container orchestration system that automates the deployment, scaling, monitoring, and management of containerized applications. It groups the containers that make up an application into logical units, which makes large numbers of containers easier to discover and manage.

The architectural concepts behind Kubernetes.


A Kubernetes cluster consists of a control plane plus a set of worker machines, called nodes, that run containerized applications. Every cluster needs at least one worker node in order to run Pods.

The worker node(s) host the Pods that are the components of the application workload. The control plane manages the worker nodes and the Pods in the cluster. In production environments, the control plane usually runs across multiple computers and a cluster usually runs multiple nodes, providing fault-tolerance and high availability.

This document outlines the various components you need to have for a complete and working Kubernetes cluster.





Control plane components

The control plane's components make global decisions about the cluster (for example, scheduling), as well as detecting and responding to cluster events (for example, starting up a new pod when a Deployment's replicas field is unsatisfied).


Control plane components can be run on any machine in the cluster. However, for simplicity, setup scripts typically start all control plane components on the same machine, and do not run user containers on this machine. See Creating Highly Available clusters with kubeadm for an example control plane setup that runs across multiple machines.



kube-apiserver

The API server is a component of the Kubernetes control plane that exposes the Kubernetes API. The API server is the front end for the Kubernetes control plane.

The main implementation of a Kubernetes API server is kube-apiserver. kube-apiserver is designed to scale horizontally—that is, it scales by deploying more instances. You can run several instances of kube-apiserver and balance traffic between those instances.

"The Kubernetes API server, also known as kube-apiserver, is the main interface for managing, creating, and configuring Kubernetes clusters": 

Gateway: The API server is the primary entry point for users and services to interact with the cluster. 

Validator: The API server checks if requests are valid and processes them. 

Communicator: The API server allows communication between users, external components, and parts of the cluster. 

Data manager: The API server validates and configures data for API objects, such as pods, services, and replication controllers. 

Data retriever and updater: The API server retrieves and updates data in the etcd key-value store.

Security controls: The API server has built-in security controls, such as audit logging and admission controllers.

The API server has three representations of the API:

External representation: The representation that comes in via an API request

Internal representation: The in-memory representation of the object used within the API server for processing

Storage representation: Recorded into the storage layer to persist the API objects 
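
The three representations can be pictured with a small sketch. This is a toy model in plain Python, not the API server's actual code: the `InternalPod` class and the `decode_external` and `encode_storage` helpers are hypothetical names, and JSON stands in for whatever wire and storage encodings a real cluster uses.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class InternalPod:
    """Hypothetical in-memory (internal) form used while processing a request."""
    name: str
    namespace: str
    image: str


def decode_external(body: str) -> InternalPod:
    """Turn an external request body (JSON here) into the internal form."""
    doc = json.loads(body)
    meta = doc["metadata"]
    return InternalPod(
        name=meta["name"],
        namespace=meta.get("namespace", "default"),  # fill in a default value
        image=doc["spec"]["containers"][0]["image"],
    )


def encode_storage(pod: InternalPod) -> bytes:
    """Serialize the internal form into bytes for the storage layer."""
    return json.dumps(asdict(pod), sort_keys=True).encode("utf-8")


external = '{"metadata": {"name": "web"}, "spec": {"containers": [{"image": "nginx"}]}}'
internal = decode_external(external)
stored = encode_storage(internal)
print(internal)
print(stored)
```

The point of the separation is that the request format and the persisted format can each change without forcing changes to the processing logic in between.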

etcd

The name "etcd" comes from a naming convention within the Linux directory structure: in UNIX, all system configuration files for a single system are contained in a folder called "/etc"; the "d" stands for "distributed."

etcd is an open-source, distributed key-value storage system that facilitates the configuration of resources, the discovery of services, and the coordination of distributed systems such as clusters and containers. Its functionalities include distributing and scheduling work across multiple hosts, enabling automatic updates that are safer, and setting up overlay networking for containers. etcd is designed to maintain redundancy and resilience in cloud systems and is the standard storage system used in Kubernetes.

You can find in-depth information about etcd in the official documentation.
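
To make the key-value idea concrete, here is a minimal in-memory sketch. It is only an illustration: `ToyKV` is a hypothetical class, it has none of etcd's replication or durability, and the `/registry/...` key layout is an assumption used for readability rather than a specification.

```python
class ToyKV:
    """In-memory stand-in for a key-value store with a global revision counter."""

    def __init__(self):
        self.revision = 0
        self._data = {}  # key -> (value, revision at which it was last written)

    def put(self, key, value):
        """Write a value and return the new store-wide revision."""
        self.revision += 1
        self._data[key] = (value, self.revision)
        return self.revision

    def get(self, key):
        """Return (value, revision) for a key, or None if it is absent."""
        return self._data.get(key)

    def range_prefix(self, prefix):
        """Return all keys under a prefix, sorted, much like listing a directory."""
        return sorted(k for k in self._data if k.startswith(prefix))


kv = ToyKV()
kv.put("/registry/pods/default/web", "pod spec v1")
kv.put("/registry/pods/default/db", "pod spec v1")
kv.put("/registry/services/default/web", "service spec v1")
kv.put("/registry/pods/default/web", "pod spec v2")  # overwrite bumps the revision

print(kv.get("/registry/pods/default/web"))
print(kv.range_prefix("/registry/pods/"))
```

Listing by prefix is what lets a client ask for "all Pods" without knowing each key in advance, and the revision number gives readers a way to tell whether an object has changed since they last saw it.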

kube-scheduler

Control plane component that watches for newly created Pods with no assigned node, and selects a node for them to run on.

Factors taken into account for scheduling decisions include: individual and collective resource requirements, hardware/software/policy constraints, affinity and anti-affinity specifications, data locality, inter-workload interference, and deadlines.
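
A scheduling decision of this kind can be sketched as a filter step followed by a score step. The code below is a simplified illustration, not kube-scheduler's implementation: the `Node` and `PodRequest` classes, the label-matching rule, and the "most CPU left over" scoring rule are all assumptions chosen to keep the example short.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    free_cpu_millis: int
    free_mem_mib: int
    labels: dict


@dataclass
class PodRequest:
    name: str
    cpu_millis: int
    mem_mib: int
    node_selector: dict


def fits(node: Node, pod: PodRequest) -> bool:
    """Filter step: enough free resources and every selector label matches."""
    has_room = (node.free_cpu_millis >= pod.cpu_millis
                and node.free_mem_mib >= pod.mem_mib)
    labels_ok = all(node.labels.get(k) == v for k, v in pod.node_selector.items())
    return has_room and labels_ok


def score(node: Node, pod: PodRequest) -> int:
    """Score step (hypothetical rule): prefer the node with the most CPU left over."""
    return node.free_cpu_millis - pod.cpu_millis


def pick_node(nodes, pod) -> Optional[str]:
    """Return the name of the best feasible node, or None if no node fits."""
    candidates = [n for n in nodes if fits(n, pod)]
    if not candidates:
        return None  # the Pod stays unscheduled
    return max(candidates, key=lambda n: score(n, pod)).name


nodes = [
    Node("node-a", 500, 1024, {"disk": "ssd"}),
    Node("node-b", 2000, 4096, {"disk": "hdd"}),
    Node("node-c", 1500, 2048, {"disk": "ssd"}),
]
pod = PodRequest("web", 1000, 512, {"disk": "ssd"})
print(pick_node(nodes, pod))
```

Here `node-a` is filtered out for lack of CPU and `node-b` for its label, so only `node-c` remains to be scored.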

kube-controller-manager


Control plane component that runs controller processes.

Logically, each controller is a separate process, but to reduce complexity, they are all compiled into a single binary and run in a single process.

There are many different types of controllers. Some examples of them are:
  • Node controller: Responsible for noticing and responding when nodes go down.
  • Job controller: Watches for Job objects that represent one-off tasks, then creates Pods to run those tasks to completion.
  • EndpointSlice controller: Populates EndpointSlice objects (to provide a link between Services and Pods).
  • ServiceAccount controller: Creates default ServiceAccounts for new namespaces.

The above is not an exhaustive list.
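
What these controllers share is a reconcile loop: observe the current state, compare it with the desired state, and act on the difference. The sketch below illustrates that pattern for a replica count. The function names and the Pod naming scheme are hypothetical, and a real controller would talk to the API server rather than mutate a local list.

```python
def reconcile(desired_replicas: int, running_pods: list) -> list:
    """Return the actions needed to move the observed state toward the desired state."""
    diff = desired_replicas - len(running_pods)
    if diff > 0:
        return [("create", f"pod-{len(running_pods) + i}") for i in range(diff)]
    if diff < 0:
        # Remove the last-listed Pods first (an arbitrary choice for this sketch).
        return [("delete", name) for name in running_pods[diff:]]
    return []


def apply_actions(actions: list, running_pods: list) -> None:
    """Apply actions to the local list that stands in for the cluster."""
    for verb, name in actions:
        if verb == "create":
            running_pods.append(name)
        else:
            running_pods.remove(name)


pods = ["pod-0"]
actions = reconcile(3, pods)
apply_actions(actions, pods)
print(actions, pods)
print(reconcile(3, pods))  # nothing left to do once the states match
```

Because the loop only ever acts on the difference, running it again after the states match produces no actions, which is what makes it safe to repeat.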

cloud-controller-manager

A Kubernetes control plane component that embeds cloud-specific control logic. The cloud controller manager lets you link your cluster into your cloud provider's API, and separates out the components that interact with that cloud platform from components that only interact with your cluster.

The following controllers can have cloud provider dependencies:
  • Node controller: For checking the cloud provider to determine if a node has been deleted in the cloud after it stops responding
  • Route controller: For setting up routes in the underlying cloud infrastructure
  • Service controller: For creating, updating and deleting cloud provider load balancers

Node components

Node components run on every node, maintaining running pods and providing the Kubernetes runtime environment.

kubelet

An agent that runs on each node in the cluster. It makes sure that containers are running in a Pod.

The kubelet takes a set of PodSpecs that are provided through various mechanisms and ensures that the containers described in those PodSpecs are running and healthy. The kubelet doesn't manage containers which were not created by Kubernetes.
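
The kubelet's job can be pictured as comparing each PodSpec with what the container runtime reports. The following is a rough sketch under stated assumptions: `sync_pod` is a hypothetical function, the PodSpec is reduced to a list of container names, and the runtime state is a plain dictionary instead of a real runtime interface.

```python
def sync_pod(pod_spec: dict, runtime_state: dict) -> list:
    """Compare one PodSpec with the runtime's view and list corrective steps.

    runtime_state maps a container name to True (healthy) or False (unhealthy).
    Containers that appear in runtime_state but not in the PodSpec are left alone.
    """
    steps = []
    for container in pod_spec["containers"]:
        name = container["name"]
        if name not in runtime_state:
            steps.append(f"start {name}")
        elif not runtime_state[name]:
            steps.append(f"restart {name}")
    return steps


pod_spec = {"containers": [{"name": "app"}, {"name": "sidecar"}]}
runtime_state = {
    "app": False,            # running but failing its health check
    "legacy-daemon": False,  # not described by any PodSpec, so it is ignored
}
print(sync_pod(pod_spec, runtime_state))
```

Note that `legacy-daemon` never appears in the result: the sketch only acts on containers named in a PodSpec, mirroring the rule that the kubelet does not manage containers Kubernetes did not create.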

kube-proxy (optional)

kube-proxy is a network proxy that runs on each node in your cluster, implementing part of the Kubernetes Service concept.

kube-proxy maintains network rules on nodes. These network rules allow network communication to your Pods from network sessions inside or outside of your cluster.

kube-proxy uses the operating system packet filtering layer if there is one and it's available. Otherwise, kube-proxy forwards the traffic itself.

If you use a network plugin that implements packet forwarding for Services by itself and provides behavior equivalent to kube-proxy, then you do not need to run kube-proxy on the nodes in your cluster.
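
Conceptually, the network rules map a Service's virtual address to the addresses of the Pods behind it. The sketch below models that mapping in user space. It is an illustration only: `ToyServiceProxy` is a hypothetical class, the IP addresses are made up, and round-robin selection is used simply to keep the output deterministic, not because kube-proxy is guaranteed to pick backends that way.

```python
import itertools


class ToyServiceProxy:
    """Maps a Service's virtual address to backend Pod addresses, round-robin."""

    def __init__(self):
        self._rules = {}  # (cluster_ip, port) -> cycling iterator over backends

    def set_endpoints(self, cluster_ip, port, backends):
        """Install or replace the rule for one Service address."""
        self._rules[(cluster_ip, port)] = itertools.cycle(list(backends))

    def route(self, cluster_ip, port):
        """Pick the backend that should receive the next connection."""
        rule = self._rules.get((cluster_ip, port))
        if rule is None:
            raise LookupError(f"no Service at {cluster_ip}:{port}")
        return next(rule)


proxy = ToyServiceProxy()
proxy.set_endpoints("10.96.0.10", 80, ["10.244.1.5:8080", "10.244.2.7:8080"])
picks = [proxy.route("10.96.0.10", 80) for _ in range(3)]
print(picks)
```

When Pods come and go, only the rule for the affected Service needs to be replaced; clients keep using the same virtual address.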






FAQs

How scalable is Kubernetes?

As of release 1.23, Kubernetes is designed to support clusters of up to 5,000 nodes and 150,000 Pods. The platform offers multiple autoscaling options, based on resource and custom metrics (for application scaling) and on node pools (for cluster scaling).
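
As an illustration of metric-based application scaling, the sketch below applies the ratio rule documented for the Horizontal Pod Autoscaler: the desired replica count is the current count multiplied by the ratio of the observed metric to its target, rounded up. The function name and the min/max parameters are hypothetical, and the real autoscaler adds refinements (such as a tolerance band and stabilization) that are left out here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas, max_replicas):
    """Scale the replica count by the metric ratio, then clamp it to the bounds."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * (current_metric / target_metric))
    return max(min_replicas, min(max_replicas, raw))


# 4 replicas averaging 90% CPU against a 60% target: scale out.
print(desired_replicas(4, 90, 60, min_replicas=1, max_replicas=10))
# The same replicas averaging 30% CPU: scale in.
print(desired_replicas(4, 30, 60, min_replicas=1, max_replicas=10))
```

The clamp matters in practice: without an upper bound, a metric spike could request more Pods than the cluster can hold.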

Can Kubernetes scale nodes?

Kubernetes supports autoscaling of both control plane and worker nodes. Its cluster scaling capabilities increase or reduce the number of nodes in the cluster based on node utilization metrics and the existence of pending Pods. To request or release nodes dynamically during load spikes, the cluster autoscaler typically interfaces with the cloud provider.

Can Kubernetes do vertical scaling?

In addition to horizontal scaling, which adds more Pods, Kubernetes supports vertical scaling, which adjusts the resources (such as memory or CPU) allocated to a workload to match changing application requirements. Vertical scaling is essentially achieved by adjusting the Pod's resource requests based on workload consumption metrics.
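
One way to picture "adjusting requests based on consumption metrics" is to derive a request from recent usage samples. The sketch below takes a high percentile of observed CPU usage and adds headroom. It is an assumption-laden toy: `recommend_request`, the 90th-percentile choice, and the 15% headroom are all hypothetical, and an actual vertical autoscaler uses a more elaborate model.

```python
import math


def recommend_request(usage_samples_millicores, percentile=90, headroom_percent=15):
    """Suggest a CPU request (in millicores) from observed usage samples."""
    if not usage_samples_millicores:
        raise ValueError("need at least one usage sample")
    ordered = sorted(usage_samples_millicores)
    # Nearest-rank percentile: the smallest sample with at least
    # `percentile` percent of all samples at or below it.
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    base = ordered[rank - 1]
    return math.ceil(base * (100 + headroom_percent) / 100)


samples = [120, 150, 130, 400, 160, 140, 155, 145, 135, 150]
print(recommend_request(samples))
```

Using a percentile rather than the maximum keeps a single spike (the 400 here) from inflating the request, while the headroom leaves some room above typical usage.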
