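Scaling and HPA
Manual scaling, automatic scaling with HorizontalPodAutoscaler, and metrics-server configuration
Overview
The CKA exam expects you to scale a Deployment manually and to configure an HPA for automatic scaling. The HPA relies on metrics-server to supply Pod resource metrics.
1. Manual scaling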
1.1 kubectl scale
kubectl scale deployment/nginx --replicas=5
kubectl scale deployment nginx --replicas=3
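# Scale a ReplicaSet
kubectl scale rs/web-rs --replicas=4
# Scale a StatefulSet
kubectl scale sts/web --replicas=5
# Check the current replica count
kubectl get deployment nginx
kubectl get rs
kubectl get pods
# Conditional scale-down (--current-replicas is a precondition on the current replica count)
kubectl scale deployment/nginx --current-replicas=5 --replicas=3
# If the current count is not 5, the command fails and nothing is changed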
1.2 Scaling after rolling back to a specific revision
# View the revision history
kubectl rollout history deployment nginx
# Roll back to revision 2
kubectl rollout undo deployment nginx --to-revision=2
# Scale out
kubectl scale deployment nginx --replicas=5
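2. HorizontalPodAutoscaler (HPA)
An HPA automatically adjusts the replica count of a Deployment or StatefulSet based on observed CPU or memory utilization, measured relative to the Pods' resource requests.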
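The scaling decision can be made concrete with the formula given in the Kubernetes documentation: desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). The following is a minimal sketch of that arithmetic only. The function name desired_replicas is hypothetical, and the sketch deliberately ignores the controller's tolerance band (10% by default), the minReplicas/maxReplicas clamp, and the handling of not-ready Pods.

```shell
#!/bin/sh
# Sketch of the core HPA formula using integer arithmetic:
#   desired = ceil(current_replicas * current_metric / target_metric)
# desired_replicas is a hypothetical helper, not part of kubectl.
desired_replicas() {
  current_replicas=$1   # replicas running now
  current_metric=$2     # observed average utilization, in percent
  target_metric=$3      # target average utilization, in percent
  # (a + b - 1) / b is integer ceiling division for positive values
  echo $(( (current_replicas * current_metric + target_metric - 1) / target_metric ))
}

desired_replicas 2 70 50   # 2 * 70 / 50 = 2.8, rounded up to 3
desired_replicas 4 30 50   # 4 * 30 / 50 = 2.4, rounded up to 3
```

Because the result is rounded up, the HPA tends to err on the side of more replicas rather than fewer.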
2.1 Prerequisite: metrics-server
# Install metrics-server (usually preinstalled in the exam environment)
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
# Verify the installation
kubectl get pods -n kube-system | grep metrics-server
# View node and Pod metrics
kubectl top nodes
kubectl top pods
# If top returns "metrics not available yet", wait for metrics-server to collect data (about 30 s)
2.2 Creating an HPA
Method 1: imperative (kubectl autoscale)
# Create an HPA that targets 50% average CPU utilization, with at least 2 and at most 10 replicas
kubectl autoscale deployment nginx --cpu-percent=50 --min=2 --max=10
# Generate the HPA YAML without creating the object
kubectl autoscale deployment nginx --cpu-percent=50 --min=2 --max=10 --dry-run=client -o yaml
Method 2: declarative YAML
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
2.3 Checking HPA status
# List HPAs
kubectl get hpa
kubectl get horizontalpodautoscaler
# Show HPA details
kubectl describe hpa nginx-hpa
# Show the HPA YAML
kubectl get hpa nginx-hpa -o yaml
# Watch HPA changes
kubectl get hpa -w
2.4 Sample output during scaling
NAME        REFERENCE          TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
nginx-hpa   Deployment/nginx   30%/50%   2         10        2          5m
nginx-hpa   Deployment/nginx   70%/50%   2         10        4          6m
nginx-hpa   Deployment/nginx   45%/50%   2         10        4          7m
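The TARGETS column reads as current/target, so a value such as 70%/50% means the observed average is above the target and a scale-up should follow. As a rough illustration, the sketch below parses a string in that form. The helper name over_target is hypothetical, and it assumes the plain N%/M% format shown in the sample output; some kubectl versions prefix the metric name, which this sketch does not handle.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the current value in a
# "current%/target%" string is above the target.
over_target() {
  current=${1%%/*}           # text before the slash, e.g. "70%"
  current=${current%\%}      # drop the trailing percent sign
  target=${1#*/}             # text after the slash, e.g. "50%"
  target=${target%\%}
  # Reject non-numeric values such as "<unknown>"
  case $current in ''|*[!0-9]*) return 2 ;; esac
  case $target in ''|*[!0-9]*) return 2 ;; esac
  [ "$current" -gt "$target" ]
}

over_target "70%/50%" && echo "above target: expect a scale-up"
over_target "45%/50%" || echo "at or below target: no scale-up"
```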
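2.5 Generating load to test scale-up
# Start a Pod that sends a continuous stream of HTTP requests, which drives up CPU usage on the target Pods
# (assumes a Service named nginx-service exposes the Deployment)
kubectl run load-generator --image=busybox --restart=Never -- /bin/sh -c "while true; do wget -q -O- http://nginx-service; done"
# Alternatively, run it interactively (here the Service is assumed to be named nginx)
kubectl run -i --tty load-generator --image=busybox --restart=Never -- sh -c "while true; do wget -q -O- http://nginx:80; done"
# Check whether the HPA scales out
kubectl get hpa -w
# Delete the load generator when the test is done
kubectl delete pod load-generator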
3. Advanced HPA configuration
3.1 Custom metrics (autoscaling/v2)
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 60
  # Pods metrics come from the custom metrics API (custom.metrics.k8s.io), not from metrics-server
  - type: Pods
    pods:
      metric:
        name: requests-per-second
      target:
        type: AverageValue
        averageValue: 1000
  behavior:                            # controls scaling behavior
    scaleDown:
      stabilizationWindowSeconds: 300  # scale-down stabilization window (default: 5 minutes)
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
    scaleUp:
      stabilizationWindowSeconds: 0    # scale up without waiting
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
      - type: Pods
        value: 4
        periodSeconds: 15
      selectPolicy: Max
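3.2 Behavior policies
| Field | Description |
|---|---|
| stabilizationWindowSeconds | Stabilization window that prevents frequent scaling back and forth (flapping) |
| scaleDown.policies | Scale-down policies: maximum percentage or number of Pods removed per periodSeconds window |
| scaleUp.policies | Scale-up policies: maximum percentage or number of Pods added per periodSeconds window |
| selectPolicy: Max/Min/Disabled | Which policy wins: the one allowing the largest change, the one allowing the smallest change, or scaling in that direction disabled |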
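To see how two scale-up policies combine under selectPolicy: Max, it may help to work through the numbers. The sketch below computes the highest replica count reachable within one period for a Percent policy and a Pods policy, then takes the larger of the two. The function name scale_up_limit is hypothetical, and the sketch assumes no scaling has already happened earlier in the same period.

```shell
#!/bin/sh
# Hypothetical helper: upper bound on replicas after one period when a
# Percent policy and a Pods policy are combined with selectPolicy: Max.
scale_up_limit() {
  current=$1   # replicas at the start of the period
  percent=$2   # value of the Percent policy
  pods=$3      # value of the Pods policy
  # Percent policy: current * (1 + percent/100), rounded up
  by_percent=$(( (current * (100 + percent) + 99) / 100 ))
  # Pods policy: current plus a fixed number of Pods
  by_pods=$(( current + pods ))
  # selectPolicy: Max picks the policy that allows the larger change
  if [ "$by_percent" -gt "$by_pods" ]; then
    echo "$by_percent"
  else
    echo "$by_pods"
  fi
}

scale_up_limit 2 100 4    # Percent allows 4, Pods allows 6: prints 6
scale_up_limit 10 100 4   # Percent allows 20, Pods allows 14: prints 20
```

With the values from the manifest above (100% or 4 Pods per 15 seconds), the Pods policy dominates for small Deployments and the Percent policy takes over once there are more than 4 replicas.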
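4. metrics-server
4.1 Installation
# Quick install
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
# If it does not work after installation, the container arguments may need adjusting
kubectl edit deployment metrics-server -n kube-system
# Under spec.template.spec.containers[0].args, add:
# - --kubelet-insecure-tls
# - --kubelet-preferred-address-types=InternalIP
# (--kubelet-insecure-tls skips kubelet certificate verification; use it only on test clusters)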
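When an interactive editor is inconvenient, the same change can probably be made with a JSON patch. The sketch below only builds the patch document; build_args_patch is a hypothetical helper, and it assumes the container already has an args list (as the upstream manifest does), since the "/-" path appends to an existing array.

```shell
#!/bin/sh
# Hypothetical helper: build a JSON Patch that appends each given flag
# to the first container's args list in a Deployment.
build_args_patch() {
  patch=""
  for arg in "$@"; do
    op="{\"op\":\"add\",\"path\":\"/spec/template/spec/containers/0/args/-\",\"value\":\"$arg\"}"
    if [ -z "$patch" ]; then
      patch="$op"
    else
      patch="$patch,$op"
    fi
  done
  printf '[%s]\n' "$patch"
}

# Print the patch for inspection
build_args_patch --kubelet-insecure-tls --kubelet-preferred-address-types=InternalIP

# To apply it against a cluster (not executed here):
# kubectl patch deployment metrics-server -n kube-system --type=json \
#   -p "$(build_args_patch --kubelet-insecure-tls --kubelet-preferred-address-types=InternalIP)"
```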
4.2 Verification
# Wait for the Pod to become ready
kubectl wait --namespace kube-system --for=condition=ready pod -l k8s-app=metrics-server --timeout=120s
# Test metrics collection
kubectl top nodes
kubectl top pods
# If no data appears for a long time, check the metrics-server logs
kubectl logs -n kube-system -l k8s-app=metrics-server
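Since metrics take a little while to appear after metrics-server starts, a small retry loop can save repeated manual checks. The following is a generic sketch; the retry function is a hypothetical helper, not a kubectl feature, and the attempt count and delay are arbitrary choices.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to N times, sleeping between
# attempts, and succeed as soon as the command succeeds.
retry() {
  attempts=$1   # maximum number of attempts
  delay=$2      # seconds to sleep between attempts
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep only if another attempt will follow
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Self-contained demonstration with a command that always succeeds
retry 3 0 true && echo "command succeeded"

# Against a cluster (not executed here): up to 6 attempts, 10 s apart
# retry 6 10 kubectl top nodes
```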
5. Handy exam commands
# 1. Generate a Deployment manifest (the HPA needs the Pods to have CPU requests)
kubectl create deployment nginx --image=nginx --dry-run=client -o yaml > nginx.yaml
# 2. Edit the manifest to add resources.requests.cpu, then apply it
vim nginx.yaml
kubectl apply -f nginx.yaml
# 3. Create the HPA
kubectl autoscale deployment nginx --cpu-percent=50 --min=1 --max=5
# 4. Verify
kubectl get hpa
kubectl get pods -w
# 5. Create CronJobs that scale on a schedule, here weekdays at 08:00 and 18:00 (not HPA, but useful in the exam)
# Note: the Job's ServiceAccount needs RBAC permission to scale the Deployment
kubectl create cronjob scale-up --image=bitnami/kubectl --schedule="0 8 * * 1-5" -- kubectl scale deployment nginx --replicas=10
kubectl create cronjob scale-down --image=bitnami/kubectl --schedule="0 18 * * 1-5" -- kubectl scale deployment nginx --replicas=2
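🧪 Complete walkthrough: manual scaling + HPA autoscaling
Scenario
Scale a Deployment manually first, then configure a CPU-based HPA for automatic scaling.
Prerequisites
- A working Kubernetes cluster (minikube or kind recommended)
- kubectl configured to connect to the cluster
- The Pods in the Deployment must set CPU requests (the HPA depends on them)
Steps
Step 1: Create a Deployment with resource requests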
cat <<'EOF' > nginx-deploy.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        resources:
          requests:
            cpu: "200m"
            memory: "128Mi"
        ports:
        - containerPort: 80
EOF
kubectl apply -f nginx-deploy.yaml
# Expected output: deployment.apps/nginx created
kubectl get deployment nginx
# Expected output: NAME    READY   UP-TO-DATE   AVAILABLE   AGE
#                  nginx   2/2     2            2           <seconds>
Step 2: Scale manually
# Scale out to 5 replicas
kubectl scale deployment nginx --replicas=5
# Expected output: deployment.apps/nginx scaled
kubectl get deployment nginx
# Expected output: NAME    READY   UP-TO-DATE   AVAILABLE   AGE
#                  nginx   5/5     5            5           <seconds>
# Scale back in to 2 replicas
kubectl scale deployment nginx --replicas=2
# Expected output: deployment.apps/nginx scaled
Step 3: Confirm that metrics-server is installed
kubectl get pods -n kube-system | grep metrics-server
# Expected output: metrics-server-<hash> 1/1 Running 0 <time>
# Install it if it is missing
kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
# Verify that metrics are available
kubectl top pods
# Expected output: NAME                 CPU(cores)   MEMORY(bytes)
#                  nginx-<hash>-<pod>   1m           10Mi
#                  nginx-<hash>-<pod>   2m           12Mi
Step 4: Create an HPA (CPU-based)
kubectl autoscale deployment nginx --cpu-percent=50 --min=2 --max=10
# Expected output: horizontalpodautoscaler.autoscaling/nginx autoscaled
kubectl get hpa nginx
# Expected output (TARGETS may read <unknown>/50% until the first metrics arrive):
# NAME    REFERENCE          TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
# nginx   Deployment/nginx   0%/50%    2         10        2          <seconds>
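The 0%/50% reading follows from how CPU utilization is derived: total CPU usage across the Pods divided by their total CPU requests, expressed as a whole-number percentage. A minimal sketch of that arithmetic is shown below, using the 200m request from Step 1 and the idle usage figures from Step 3. The function name cpu_utilization_percent is hypothetical, and the truncating integer division is an assumption about how the controller rounds.

```shell
#!/bin/sh
# Hypothetical helper: utilization as a whole-number percentage of the
# total CPU requested, with both arguments given in millicores.
cpu_utilization_percent() {
  usage_total_m=$1     # summed CPU usage of all Pods, in millicores
  request_total_m=$2   # summed CPU requests of all Pods, in millicores
  echo $(( usage_total_m * 100 / request_total_m ))
}

# Two idle Pods using 1m and 2m, each requesting 200m
cpu_utilization_percent 3 400     # prints 0

# The same two Pods under load, using 130m each
cpu_utilization_percent 260 400   # prints 65
```

This is also why the HPA cannot work without CPU requests: with no request there is no denominator for the percentage.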
Step 5: Generate CPU load to trigger automatic scale-up
# Expose a Service so the load generator can reach the Pods
kubectl expose deployment nginx --port=80 --target-port=80
# Expected output: service/nginx exposed
# Start the load generator
kubectl run load-generator --image=busybox --restart=Never -- /bin/sh -c "while true; do wget -q -O- http://nginx; done"
# Expected output: pod/load-generator created
# Watch the HPA (in another terminal, or append & to run it in the background)
kubectl get hpa nginx -w
# Expected output (changes begin after about 1-2 minutes; the numbers are illustrative):
# nginx   Deployment/nginx   0%/50%    2   10   2   2m
# nginx   Deployment/nginx   65%/50%   2   10   4   3m
# nginx   Deployment/nginx   80%/50%   2   10   5   4m
# nginx   Deployment/nginx   45%/50%   2   10   5   5m
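Verification
# Check the final HPA state
kubectl get hpa nginx
# Expected output: once the added replicas absorb the load, CPU utilization settles at or below the 50% target
# Check how the replica count changed
kubectl get deployment nginx
# Expected output: READY may show more than 2 replicas (automatic scale-up)
# Stop the load and watch the scale-down (replicas return to 2 after the default 5-minute stabilization window)
kubectl delete pod load-generator
# Expected output: pod "load-generator" deleted
# Clean up
kubectl delete deployment nginx
kubectl delete service nginx
kubectl delete hpa nginx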
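Exam tips
- Pods must set CPU requests to be usable by an HPA; without them the HPA cannot compute CPU utilization
- kubectl autoscale is the fastest way to create an HPA, which suits the time pressure of the CKA
- The HPA controller evaluates metrics every 15 seconds by default; scale-up has no stabilization delay by default, while scale-down has a default 5-minute stabilization window
- metrics-server is usually preinstalled in the exam environment, but it is worth verifying with kubectl top nodes first
- If kubectl top returns metrics not available yet, wait about 30-60 seconds and retry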