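Cluster Infrastructure Preparation
System environment preparation before installing a Kubernetes cluster, covering container runtime installation, kernel parameter configuration, network settings, and related infrastructure setup.
Overview
Before installing a Kubernetes cluster, every node (control plane and worker alike) needs system-level preparation: installing a container runtime, loading kernel modules, tuning network parameters, and so on. The CKA exam may include troubleshooting cluster infrastructure problems.
1. System Prerequisites
1.1 Host Requirements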
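| Item | Requirement |
|---|---|
| Operating system | Linux (Ubuntu 20.04+, CentOS 7+, Rocky Linux 8+) |
| CPU | At least 2 cores |
| Memory | At least 2 GB on control plane nodes and 1 GB on worker nodes (kubeadm's documentation recommends 2 GB or more on every machine) |
| Disk | At least 20 GB of free space |
| Hostname | Unique and resolvable |
| Network | Full connectivity between nodes |
# Check system information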
uname -a
cat /etc/os-release
hostnamectl
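The CPU and memory rows of the table can be checked mechanically. The following is a minimal sketch, not part of any official tooling: the names `MIN_CPUS`, `MIN_MEM_MB`, `meets_minimum`, and `report` are made up for illustration, and the memory default is deliberately set below 2048 because `MemTotal` is usually a little smaller than the installed RAM.

```shell
#!/bin/sh
# Sketch: compare this host against the minimums from the table above.
# MIN_CPUS and MIN_MEM_MB are illustrative defaults (assumptions), override as needed.
MIN_CPUS="${MIN_CPUS:-2}"
MIN_MEM_MB="${MIN_MEM_MB:-1700}"

# meets_minimum ACTUAL REQUIRED -> exit status 0 when ACTUAL >= REQUIRED
meets_minimum() {
  [ "$1" -ge "$2" ]
}

# report LABEL ACTUAL REQUIRED -> prints one OK/FAIL line
report() {
  if meets_minimum "$2" "$3"; then
    echo "OK   $1: $2 (minimum $3)"
  else
    echo "FAIL $1: $2 (minimum $3)"
  fi
}

cpus="$(nproc)"
# MemTotal is reported in kB; convert to MB.
mem_mb="$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo)"

report "CPU cores" "$cpus" "$MIN_CPUS"
report "Memory (MB)" "$mem_mb" "$MIN_MEM_MB"
```

Run it on each node before continuing; a worker with less memory can be checked with, for example, `MIN_MEM_MB=900 sh check-host.sh` (the script name is hypothetical).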
1.2 Hostname and hosts Resolution
# Set the hostname
sudo hostnamectl set-hostname control-plane-1
# Add the cluster nodes to /etc/hosts
cat <<EOF | sudo tee -a /etc/hosts
192.168.1.10 control-plane-1
192.168.1.11 worker-1
192.168.1.12 worker-2
EOF
# Verify hostname resolution
hostname
hostname -f
getent hosts $(hostname)
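Because `tee -a` appends unconditionally, re-running the snippet above duplicates the entries. One way to make the step repeatable is sketched below; `add_host_entry` is a hypothetical helper, and the demo writes to a scratch file rather than to `/etc/hosts`.

```shell
#!/bin/sh
# add_host_entry FILE IP NAME -> appends "IP NAME" only when NAME is not already mapped
add_host_entry() {
  local file="$1" ip="$2" name="$3"
  # Look for NAME as a whole field (column 2 or later) on any non-comment line.
  if awk -v n="$name" '
       !/^[[:space:]]*#/ { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
       END { exit !found }' "$file" 2>/dev/null; then
    echo "skip: $name already present in $file"
  else
    printf '%s %s\n' "$ip" "$name" >> "$file"
    echo "added: $ip $name"
  fi
}

# Demo against a scratch file; the second call is a no-op.
demo="$(mktemp)"
add_host_entry "$demo" 192.168.1.10 control-plane-1
add_host_entry "$demo" 192.168.1.10 control-plane-1
cat "$demo"
rm -f "$demo"
```

To apply it for real, call the function as root with `/etc/hosts` as the first argument.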
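1.3 Disabling swap
By default the kubelet refuses to start while swap is enabled, and disabling swap keeps Pod memory behavior predictable.
# Disable temporarily
sudo swapoff -a
# Disable permanently (comment out the swap lines in /etc/fstab, skipping lines that are already commented)
sudo sed -E -i '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' /etc/fstab
# Verify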
free -m
swapon --show
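The two halves of this step (is swap active now, and will it stay off after a reboot) can be scripted. The sketch below uses two hypothetical helpers, `count_active_swaps` and `comment_swap_entries`; both take a file argument so they can be tried on copies before touching `/proc/swaps` or `/etc/fstab`.

```shell
#!/bin/sh
# count_active_swaps [FILE] -> number of swap areas listed in a /proc/swaps-style file
count_active_swaps() {
  # Line 1 is a column header; every following non-empty line is one active swap area.
  awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }' "${1:-/proc/swaps}"
}

# comment_swap_entries FILE -> comments out uncommented swap lines in an fstab-style file
comment_swap_entries() {
  # Only lines whose first non-blank character is not '#' are touched,
  # so running this twice does not produce '##'. Requires GNU sed for -i.
  sed -E -i '/^[[:space:]]*[^#[:space:]].*[[:space:]]swap[[:space:]]/ s/^/#/' "$1"
}

if [ -r /proc/swaps ]; then
  echo "active swap areas: $(count_active_swaps)"
fi
```

A cautious workflow is `cp /etc/fstab /tmp/fstab.test`, run `comment_swap_entries /tmp/fstab.test`, inspect the diff, and only then run it on the real file with sudo.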
1.4 Disabling the Firewall (or Opening Ports)
# Ubuntu (ufw)
sudo ufw disable
sudo ufw status
# CentOS/RHEL (firewalld)
sudo systemctl stop firewalld
sudo systemctl disable firewalld
sudo firewall-cmd --state
# Ports to open (if the firewall stays enabled)
# Control plane:
# - 6443 (API Server)
# - 2379-2380 (etcd)
# - 10250 (kubelet)
# - 10259 (scheduler)
# - 10257 (controller-manager)
# Worker nodes:
# - 10250 (kubelet)
# - 30000-32767 (NodePort Services)
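If the firewall is kept on, the port list above translates into one rule per port. The sketch below only prints the commands (a dry run) so they can be reviewed first; `ports_for_role` and `firewall_commands` are hypothetical helper names, and all ports are assumed to be TCP as in the list.

```shell
#!/bin/sh
# ports_for_role ROLE -> prints one TCP port or range per line (taken from the list above)
ports_for_role() {
  case "$1" in
    control-plane) printf '%s\n' 6443 2379-2380 10250 10259 10257 ;;
    worker)        printf '%s\n' 10250 30000-32767 ;;
    *) echo "unknown role: $1" >&2; return 1 ;;
  esac
}

# firewall_commands ROLE TOOL -> prints the commands instead of running them
firewall_commands() {
  local role="$1" tool="$2" port
  ports_for_role "$role" | while read -r port; do
    case "$tool" in
      firewalld) echo "firewall-cmd --permanent --add-port=${port}/tcp" ;;
      # ufw writes port ranges with a colon instead of a hyphen
      ufw)       echo "ufw allow $(echo "$port" | sed 's/-/:/')/tcp" ;;
    esac
  done
  # Permanent firewalld rules only take effect after a reload.
  if [ "$tool" = firewalld ]; then
    echo "firewall-cmd --reload"
  fi
}

firewall_commands control-plane firewalld
firewall_commands worker ufw
```

After reviewing the output, it can be piped to a root shell, for example `firewall_commands worker ufw | sudo sh`.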
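1.5 Time Synchronization
# Install chrony
sudo apt install -y chrony # Ubuntu
sudo yum install -y chrony # CentOS/RHEL
# Start and enable the service (the unit is chronyd on CentOS/RHEL; on Ubuntu the primary unit name is chrony)
sudo systemctl enable --now chronyd
# Verify synchronization status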
timedatectl status
chronyc tracking
1.6 Loading Kernel Modules
cat <<EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
# Load immediately
sudo modprobe overlay
sudo modprobe br_netfilter
# Verify
lsmod | grep overlay
lsmod | grep br_netfilter
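A plain `grep br_netfilter` also matches the `bridge` line, which lists `br_netfilter` as a dependent module, so it can report success when the module itself is absent. The sketch below compares the module-name column exactly; `missing_modules` is a hypothetical helper, and it will report a module as missing if that module is compiled into the kernel rather than loaded, since built-in modules are not listed in `/proc/modules`.

```shell
#!/bin/sh
# missing_modules LISTING MODULE... -> prints each MODULE absent from a /proc/modules-style file
missing_modules() {
  local listing="$1" mod
  shift
  for mod in "$@"; do
    # The module name is the first whitespace-separated field of each line.
    awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }' "$listing" || echo "$mod"
  done
}

if [ -r /proc/modules ]; then
  missing="$(missing_modules /proc/modules overlay br_netfilter)"
  if [ -n "$missing" ]; then
    echo "not loaded: $missing"
  else
    echo "overlay and br_netfilter are loaded"
  fi
fi
```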
1.7 Configuring Kernel Parameters
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
net.ipv4.conf.all.rp_filter = 1
EOF
# Apply the configuration
sudo sysctl --system
# Verify
sysctl net.bridge.bridge-nf-call-iptables
sysctl net.ipv4.ip_forward
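The `sysctl` commands above show the live values; it is also worth confirming that the file itself sets them, since that is what survives a reboot. A minimal sketch follows, with `unset_sysctl_keys` as a hypothetical helper and `K8S_SYSCTL_CONF` as an assumed override variable.

```shell
#!/bin/sh
# unset_sysctl_keys FILE KEY... -> prints each KEY that FILE does not set to 1
unset_sysctl_keys() {
  local file="$1" key
  shift
  for key in "$@"; do
    # Accept "key = 1" and "key=1"; skip comment lines starting with # or ;
    awk -F= -v k="$key" '
      /^[[:space:]]*[#;]/ { next }
      { gsub(/[[:space:]]/, "", $1); gsub(/[[:space:]]/, "", $2) }
      $1 == k && $2 == "1" { ok = 1 }
      END { exit !ok }
    ' "$file" || echo "$key"
  done
}

conf="${K8S_SYSCTL_CONF:-/etc/sysctl.d/k8s.conf}"
if [ -r "$conf" ]; then
  unset_sysctl_keys "$conf" \
    net.bridge.bridge-nf-call-iptables \
    net.bridge.bridge-nf-call-ip6tables \
    net.ipv4.ip_forward
fi
```

No output means all three keys are set to 1 in the file. Another file under `/etc/sysctl.d/` that sorts later can still override them, which is why the live check with `sysctl` remains necessary.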
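2. Container Runtime Installation
2.1 containerd
containerd is the most widely used container runtime for Kubernetes and the one recommended for CKA exam preparation.
# --- Method 1: install containerd from a package repository ---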
# Ubuntu
sudo apt update
sudo apt install -y containerd
# CentOS/RHEL
sudo yum install -y yum-utils
sudo yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
sudo yum install -y containerd.io
# --- Method 2: install containerd from release binaries (recommended for exam environments) ---
# Download and install containerd
wget https://github.com/containerd/containerd/releases/download/v1.7.x/containerd-1.7.x-linux-amd64.tar.gz
sudo tar Cxzvf /usr/local containerd-1.7.x-linux-amd64.tar.gz
# Note: the release tarball does not include a systemd unit; to manage containerd with
# systemctl, install containerd.service separately (see containerd's getting-started guide)
# Install runc
wget https://github.com/opencontainers/runc/releases/download/v1.1.x/runc.amd64
sudo install -m 755 runc.amd64 /usr/local/sbin/runc
# Install the CNI plugins
wget https://github.com/containernetworking/plugins/releases/download/v1.3.x/cni-plugins-linux-amd64-v1.3.x.tgz
sudo mkdir -p /opt/cni/bin
sudo tar Cxzvf /opt/cni/bin cni-plugins-linux-amd64-v1.3.x.tgz
2.2 containerd Configuration
# Generate the default configuration file (important!)
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml
# Inspect the generated configuration
cat /etc/containerd/config.toml
2.3 SystemdCgroup Configuration
This is a key configuration point in the CKA exam. The kubelet and the container runtime must use the same cgroup driver; kubeadm has defaulted the kubelet to the systemd driver since v1.22, so containerd must be switched to systemd as well.
# Edit the containerd configuration file
sudo vi /etc/containerd/config.toml
Find and change the following setting:
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
...
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
SystemdCgroup = true # change to true
Or make the change quickly with sed:
# Set SystemdCgroup to true
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
# Set sandbox_image (adjust the registry if an internal network requires it)
sudo sed -i 's|sandbox_image = "registry.k8s.io/pause:.*"|sandbox_image = "registry.k8s.io/pause:3.9"|' /etc/containerd/config.toml
# Restart containerd
sudo systemctl restart containerd
# Verify containerd status
sudo systemctl status containerd
sudo ctr version
sudo crictl version
sudo crictl info
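The `sed` one-liner above silently does nothing when the file has no `SystemdCgroup = false` line, for example when the default configuration was never generated. A small check before and after the edit makes that case visible. This is a sketch under the assumption that the key appears at most once uncommented; `systemd_cgroup_state`, `enable_systemd_cgroup`, and the `CONTAINERD_CONFIG` variable are hypothetical names.

```shell
#!/bin/sh
# systemd_cgroup_state FILE -> prints "true", "false", or "missing"
systemd_cgroup_state() {
  local value
  # Take the first uncommented SystemdCgroup assignment and keep only its value.
  value="$(sed -n -E 's/^[[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*(true|false).*/\1/p' "$1" | head -n 1)"
  echo "${value:-missing}"
}

# enable_systemd_cgroup FILE -> flips false to true, leaving every other line untouched
enable_systemd_cgroup() {
  sed -E -i 's/^([[:space:]]*SystemdCgroup[[:space:]]*=[[:space:]]*)false/\1true/' "$1"
}

cfg="${CONTAINERD_CONFIG:-/etc/containerd/config.toml}"
if [ -r "$cfg" ]; then
  echo "SystemdCgroup in $cfg: $(systemd_cgroup_state "$cfg")"
fi
```

A result of `missing` suggests regenerating the file with `containerd config default` before editing; containerd still has to be restarted for any change to take effect.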
2.4 CRI-O (Alternative)
# Install CRI-O on Ubuntu
sudo apt update
sudo apt install -y cri-o cri-o-runc
# Install CRI-O on CentOS/RHEL
sudo yum install -y cri-o
# Start CRI-O
sudo systemctl enable --now crio
# Configure the CRI-O cgroup driver
sudo vi /etc/crio/crio.conf
# Set cgroup_manager = "systemd"
sudo systemctl restart crio
3. Installing kubelet, kubeadm, and kubectl
# --- Ubuntu / Debian ---
sudo apt-get update
sudo apt-get install -y apt-transport-https ca-certificates curl gpg
# Add the Kubernetes repository
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.31/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.31/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl # prevent automatic version upgrades
# --- CentOS / RHEL / Rocky ---
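cat <<EOF | sudo tee /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=Kubernetes
baseurl=https://pkgs.k8s.io/core:/stable:/v1.31/rpm/
enabled=1
gpgcheck=1
gpgkey=https://pkgs.k8s.io/core:/stable:/v1.31/rpm/repodata/repomd.xml.key
exclude=kubelet kubeadm kubectl cri-tools kubernetes-cni
EOF
# The exclude= line keeps routine yum updates from upgrading these packages (yum has no "yum-mark hold");
# --disableexcludes=kubernetes bypasses it for this one install
sudo yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes
# Enable kubelet (it will keep restarting until the cluster has been initialized with kubeadm init; this is expected)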
sudo systemctl enable --now kubelet
sudo systemctl status kubelet
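The repository URLs above embed the Kubernetes minor version (v1.31), and each minor release has its own repository, so moving to another minor means editing several lines consistently. One way to keep them in sync is to build the URL from a single variable. This is a sketch; `k8s_repo_url` is a hypothetical helper, and it only formats the string without checking that the repository exists.

```shell
#!/bin/sh
# k8s_repo_url MINOR FORMAT -> package repository URL for one Kubernetes minor release
k8s_repo_url() {
  local minor="$1" format="$2"
  # The repository path takes a minor version such as v1.31, not a patch version.
  if ! echo "$minor" | grep -Eq '^v[0-9]+\.[0-9]+$'; then
    echo "expected a minor version such as v1.31, got: $minor" >&2
    return 1
  fi
  case "$format" in
    deb|rpm) ;;
    *) echo "format must be deb or rpm, got: $format" >&2; return 1 ;;
  esac
  echo "https://pkgs.k8s.io/core:/stable:/${minor}/${format}/"
}

K8S_MINOR="${K8S_MINOR:-v1.31}"
echo "deb repository: $(k8s_repo_url "$K8S_MINOR" deb)"
echo "deb signing key: $(k8s_repo_url "$K8S_MINOR" deb)Release.key"
echo "rpm repository: $(k8s_repo_url "$K8S_MINOR" rpm)"
```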
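4. Verifying the Infrastructure
# Verify all prerequisites
echo "=== Hostname ==="
hostnamectl
echo "=== Swap status ==="
# swapon --show exits 0 even when it prints nothing, so test its output rather than its exit status
swapon --show
[ -z "$(swapon --show)" ] && echo "swap is off ✓"
echo "=== Kernel modules ==="
lsmod | grep -E "overlay|br_netfilter"
echo "=== Network parameters ==="
sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward
echo "=== containerd ==="
sudo systemctl is-active containerd
sudo crictl version
echo "=== kubelet ==="
sudo systemctl is-enabled kubelet
echo "=== Listening ports (expect none until kubeadm init has run) ==="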
sudo ss -tulpn | grep -E "6443|2379|2380|10250|10259|10257"
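The checks above print raw output that has to be read by eye. A variant that relies on exit statuses and ends with a pass/fail count is sketched below; `check` and `swap_off` are hypothetical helpers, and the list of checks is illustrative rather than complete.

```shell
#!/bin/sh
passed=0
failed=0

# check LABEL COMMAND... -> runs COMMAND quietly and records PASS or FAIL from its exit status
check() {
  local label="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    passed=$((passed + 1)); echo "PASS  $label"
  else
    failed=$((failed + 1)); echo "FAIL  $label"
  fi
}

# swap_off succeeds only when swapon exists and `swapon --show` prints nothing.
swap_off() {
  command -v swapon >/dev/null 2>&1 || return 1
  [ -z "$(swapon --show 2>/dev/null)" ]
}

check "swap disabled"      swap_off
check "ip_forward enabled" test "$(cat /proc/sys/net/ipv4/ip_forward 2>/dev/null)" = 1
check "containerd active"  systemctl is-active --quiet containerd
check "kubelet enabled"    systemctl is-enabled --quiet kubelet

echo "passed=$passed failed=$failed"
```

Because each check is just a command, further ones (for example a test on the containerd configuration file) can be appended as extra `check` lines.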
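CKA Exam Key Points
- SystemdCgroup is a must-know topic: when kubeadm init fails, check the containerd SystemdCgroup setting first
- swap must be off: when the cluster fails to start, remember to check whether swap has been disabled
- Use apt-mark hold: it prevents kubelet/kubeadm/kubectl from being upgraded unintentionally
- Configure containerd in advance: make sure containerd is running normally before kubeadm init
- Common debugging commands:
  - sudo journalctl -u containerd -n 50 --no-pager
  - sudo journalctl -u kubelet -n 50 --no-pager
  - crictl images: list pulled images
  - crictl ps -a: list all containers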
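🧪 Complete Walkthrough: Configuring K8s Infrastructure on an Ubuntu Node
Scenario
On a freshly installed Ubuntu 22.04 node, complete the Kubernetes system prerequisites: disable swap, load kernel modules, configure network parameters, install containerd, and enable SystemdCgroup.
Prerequisites
- A server (physical or virtual) running Ubuntu 22.04+
- A user with sudo privileges
- Network access to the apt repositories
Steps
Step 1: Disable swap
# Disable temporarily
sudo swapoff -a
# Disable permanently (comment out the swap entries in /etc/fstab, skipping lines that are already commented)
sudo sed -E -i '/^[^#].*[[:space:]]swap[[:space:]]/ s/^/#/' /etc/fstab
# Verify
free -m
# total used free shared buff/cache available
# Mem: 7982 1234 4567 123 2181 6543
# Swap: 0 0 0 <-- should be 0
swapon --show
# (no output means swap is off)
Step 2: Load kernel modules
# Load automatically at boot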
overlay
br_netfilter
EOF
# Load immediately
sudo modprobe overlay
sudo modprobe br_netfilter
# Verify
lsmod | grep -E "overlay|br_netfilter"
# overlay 147456 0
# br_netfilter 32768 0
# bridge 307200 1 br_netfilter
Step 3: Configure kernel network parameters
cat <<EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
# Apply the configuration
sudo sysctl --system
# Verify the key parameters
sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward
# net.bridge.bridge-nf-call-iptables = 1
# net.ipv4.ip_forward = 1
Step 4: Install containerd
# Install dependencies
sudo apt-get update
sudo apt-get install -y ca-certificates curl
# Add Docker's official GPG key and repository
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo $VERSION_CODENAME) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
sudo apt-get install -y containerd.io
Step 5: Configure containerd SystemdCgroup
# Generate the default configuration
sudo mkdir -p /etc/containerd
containerd config default | sudo tee /etc/containerd/config.toml > /dev/null
# Enable SystemdCgroup
sudo sed -i 's/SystemdCgroup = false/SystemdCgroup = true/' /etc/containerd/config.toml
# Restart containerd
sudo systemctl restart containerd
# Verify the configuration
sudo systemctl status containerd --no-pager
# ● containerd.service - containerd container runtime
# Loaded: loaded (/lib/systemd/system/containerd.service; enabled; vendor preset: enabled)
# Active: active (running)
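# crictl comes from the cri-tools package (normally pulled in together with kubeadm)
sudo crictl info | grep SystemdCgroup
# "SystemdCgroup": true
Verification
# Verify all prerequisites in one pass
echo "=== swap ===" && if [ -z "$(swapon --show)" ]; then echo "swap is off"; else swapon --show; fi
echo "=== Kernel modules ===" && lsmod | grep -E "overlay|br_netfilter"
echo "=== Network parameters ===" && sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward
echo "=== containerd ===" && sudo systemctl is-active containerd && sudo ctr version
Exam Tips
- A wrong SystemdCgroup setting is a common cause of kubeadm init failures in the CKA: always configure it before initializing
- swapoff -a lasts only until the next reboot; also edit /etc/fstab so the change persists
- Use crictl info to verify the containerd configuration, not ctr version
- When kubeadm init fails, first check journalctl -u kubelet and the containerd SystemdCgroup setting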