References:
https://mp.weixin.qq.com/s/D8efjj9ZhLyEu7zEqWvJiQ
https://stackoverflow.com/questions/71860152/actuator-health-endpoint-returns-out-of-service-when-all-groups-are-up
https://docs.spring.io/spring-boot/docs/2.6.x/reference/htmlsingle/#actuator.endpoints.kubernetes-probes

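This article uses K8s and Spring Boot to achieve zero-downtime releases through health checks, rolling updates, graceful shutdown, autoscaling, Prometheus monitoring, and externalized configuration (so that one image can be reused across environments).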

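Health checks

Health check types: readiness probe and liveness probe
Probe mechanisms: exec (run a command inside the container), tcpSocket (check that a port accepts connections), httpGet (call an HTTP endpoint)

Application side

Project dependency in pom.xml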

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>

Configure the port, path, and exposure in application.yaml

management:
  server:
    port: 50000                         # separate management port (optional; if unset, Actuator is served on the application port)
  endpoint:
    health:
      probes:
        enabled: true                   # enable the liveness and readiness health groups
        # With add-additional-paths: true, liveness is also served at /livez and readiness at /readyz on the main server port.
        # add-additional-paths: true
      # show-details: always adds per-indicator details to the response (ping, livenessState, readinessState, diskSpace, and so on).
      # show-details: always
      # group.*.include selects which health indicators contribute to /health/(liveness|readiness).
      # group:
      #   readiness:
      #     include:
      #       - readinessState
      #       - ping
      #   liveness:
      #     include:
      #       - livenessState
      #       - ping
  endpoints:
    web:
      base-path: /actuator              # context path for the endpoints; /actuator is already the default
      exposure:
        include: health
This exposes /actuator/health/readiness and /actuator/health/liveness, reachable as follows:

http://127.0.0.1:50000/actuator/health -> aggregated health status
http://127.0.0.1:50000/actuator/health/readiness -> status of the readiness group
http://127.0.0.1:50000/actuator/health/liveness -> status of the liveness group

Operations side

K8s deployment template deployment.yaml

apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
      - name: {APP_NAME}
        image: {IMAGE_URL}
        imagePullPolicy: Always
        ports:
        - containerPort: {APP_PORT}
        - name: management-port
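          containerPort: 50000         # application management port
        readinessProbe:                # readiness probe
          httpGet:
            path: /actuator/health/readiness
            port: management-port
          initialDelaySeconds: 90      # delay before the first probe
          periodSeconds: 30            # interval between probes
          timeoutSeconds: 30           # probe timeout
          successThreshold: 1          # consecutive successes needed to count as healthy
          failureThreshold: 3          # consecutive failures needed to count as unhealthy
        livenessProbe:                 # liveness probe
          httpGet:
            path: /actuator/health/liveness
            port: management-port
          initialDelaySeconds: 90      # delay before the first probe
          periodSeconds: 30            # interval between probes
          timeoutSeconds: 30           # probe timeout
          successThreshold: 1          # consecutive successes needed to count as healthy
          failureThreshold: 3          # consecutive failures needed to count as unhealthy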
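
The probe fields combine into simple timing bounds. The following is a minimal sketch of that arithmetic, assuming the failure begins right after a successful probe; ProbeTiming and its method names are illustrative helpers and not part of any Kubernetes API.

```java
/** Rough timing bounds implied by a probe's settings (illustrative helper, not a Kubernetes API). */
class ProbeTiming {
    final int initialDelaySeconds;
    final int periodSeconds;
    final int timeoutSeconds;
    final int failureThreshold;

    ProbeTiming(int initialDelaySeconds, int periodSeconds, int timeoutSeconds, int failureThreshold) {
        this.initialDelaySeconds = initialDelaySeconds;
        this.periodSeconds = periodSeconds;
        this.timeoutSeconds = timeoutSeconds;
        this.failureThreshold = failureThreshold;
    }

    /** Earliest time after container start at which the first probe can run. */
    int earliestFirstProbeSeconds() {
        return initialDelaySeconds;
    }

    /**
     * Approximate worst case between a failure starting and the threshold being reached:
     * failureThreshold probes, one per period, the last of which may take the full timeout.
     */
    int worstCaseDetectionSeconds() {
        return failureThreshold * periodSeconds + timeoutSeconds;
    }

    public static void main(String[] args) {
        ProbeTiming probe = new ProbeTiming(90, 30, 30, 3); // values from the template above
        System.out.println("first probe no earlier than " + probe.earliestFirstProbeSeconds() + "s");
        System.out.println("failure detected within about " + probe.worstCaseDetectionSeconds() + "s");
    }
}
```

For the readiness probe this bounds how long a broken Pod may keep receiving traffic; for the liveness probe it bounds how long a hung container runs before it is restarted. Shorter periods tighten both bounds at the cost of more probe traffic.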

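Rolling updates

K8s replaces Pods according to the rolling update strategy. Zero-downtime releases also depend on the health checks, because a new Pod only counts as available once its readiness probe passes.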

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {APP_NAME}
  labels:
    app: {APP_NAME}
spec:
  selector:
    matchLabels:
      app: {APP_NAME}
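  replicas: {REPLICAS}    # number of Pod replicas
  strategy:
    type: RollingUpdate    # rolling update strategy
    rollingUpdate:
      maxSurge: 1                   # max number of Pods above the desired replica count during the update
      maxUnavailable: 1             # max number of Pods that may be unavailable during the update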
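
The two settings translate into bounds on the Pod count during a rollout. A small sketch of that arithmetic, assuming absolute numbers rather than percentages; RollingUpdateBounds is an illustrative name, not a Kubernetes type.

```java
/** Pod-count bounds during a rolling update (illustrative helper; absolute values only). */
class RollingUpdateBounds {
    /** Upper bound on the total number of Pods (old plus new) at any moment of the rollout. */
    static int maxTotalPods(int replicas, int maxSurge) {
        return replicas + maxSurge;
    }

    /** Lower bound on the number of available Pods at any moment of the rollout. */
    static int minAvailablePods(int replicas, int maxUnavailable) {
        return Math.max(0, replicas - maxUnavailable);
    }

    public static void main(String[] args) {
        int replicas = 3; // hypothetical value for {REPLICAS}
        System.out.println("max total Pods:     " + maxTotalPods(replicas, 1));
        System.out.println("min available Pods: " + minAvailablePods(replicas, 1));
    }
}
```

With a single replica, maxUnavailable: 1 allows the available count to drop to zero, so a zero-downtime rollout generally wants maxUnavailable: 0, as in the consolidated template later in this article.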

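Graceful shutdown

In K8s, graceful shutdown aims to stop accepting new traffic, let in-flight work finish, and release resources in order. The approach below is drawn from the test_graceful_shutdown_web project and can be reused directly.

Core principles

Make the Pod NotReady first (readiness fails) so that no new traffic arrives
Inside the application, stop the entry points first, then wait for in-flight work, then release resources
Give every wait a timeout so that shutdown can never hang indefinitely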
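
The second and third principles can be sketched in plain Java: run the shutdown steps in a fixed order and bound each one with a timeout. ShutdownSequence and the step names are hypothetical and only illustrate the ordering; this is not how Spring implements its lifecycle handling.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Runs shutdown steps in registration order, each bounded by a timeout (illustrative sketch). */
class ShutdownSequence {
    private final Map<String, Runnable> steps = new LinkedHashMap<>();
    private final long stepTimeoutMillis;

    ShutdownSequence(long stepTimeoutMillis) {
        this.stepTimeoutMillis = stepTimeoutMillis;
    }

    ShutdownSequence step(String name, Runnable action) {
        steps.put(name, action);
        return this;
    }

    /** Returns the names of steps that failed or timed out; later steps still run. */
    List<String> run() {
        // Daemon threads, so a step that ignores interruption cannot keep the JVM alive.
        ExecutorService worker = Executors.newCachedThreadPool(r -> {
            Thread t = new Thread(r, "shutdown-step");
            t.setDaemon(true);
            return t;
        });
        List<String> failed = new ArrayList<>();
        try {
            for (Map.Entry<String, Runnable> step : steps.entrySet()) {
                Future<?> result = worker.submit(step.getValue());
                try {
                    result.get(stepTimeoutMillis, TimeUnit.MILLISECONDS);
                } catch (TimeoutException e) {
                    result.cancel(true);            // give up on this step and move on
                    failed.add(step.getKey());
                } catch (ExecutionException e) {
                    failed.add(step.getKey());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    failed.add(step.getKey());
                    break;
                }
            }
        } finally {
            worker.shutdownNow();
        }
        return failed;
    }

    public static void main(String[] args) {
        List<String> failed = new ShutdownSequence(1000)
                .step("stop-intake", () -> System.out.println("1. stop accepting new work"))
                .step("drain", () -> System.out.println("2. wait for in-flight work"))
                .step("release", () -> System.out.println("3. release resources"))
                .run();
        System.out.println("failed steps: " + failed);
    }
}
```

A step that exceeds its timeout is reported and skipped rather than blocking the steps after it, which is the point of the third principle.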

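Spring Boot configuration (application layer)

SmartLifecycle as the single shutdown entry point

In a Spring Boot application we usually use the @PostConstruct and @PreDestroy annotations to run logic when a bean is initialized or destroyed. Both operate at the level of the bean lifecycle.

Some cases fall outside that scope, for example running logic on the lifecycle events of the container itself (container start and stop). SmartLifecycle covers these cases.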

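import org.springframework.context.SmartLifecycle;
import org.springframework.stereotype.Component;

@Component
public class GracefulShutdownGate implements SmartLifecycle {
    private volatile boolean running;   // minimal body; real shutdown logic belongs in stop()
    // isAutoStartup(): whether the Spring container starts this component automatically on startup.
    @Override public boolean isAutoStartup() { return true; }
    // getPhase(): ordering value. The lower the phase, the earlier the component starts and the later it stops.
    @Override public int getPhase() { return DEFAULT_PHASE; }
    // start() and stop(): start and stop the component.
    @Override public void start() { running = true; }
    @Override public void stop() { running = false; }
    // stop(Runnable callback): stops the component, then runs the supplied callback.
    @Override public void stop(Runnable callback) { stop(); callback.run(); }
    // isRunning(): checked when the application exits; stop is only invoked if this returns true.
    @Override public boolean isRunning() { return running; }
}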

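When to use it:

You need to control the shutdown order of several resources in one place (for example Kafka, thread pools, scheduled tasks)
You need to wait for in-flight business work to finish (for example message handling, asynchronous tasks, batch jobs)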
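
The ordering rule behind getPhase() can be illustrated without Spring: components start in ascending phase order and stop in descending order. The sketch below uses a hypothetical Component class and made-up phase values as stand-ins for SmartLifecycle beans.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Start and stop ordering by phase (plain-Java illustration, not Spring code). */
class PhaseOrder {
    /** Hypothetical stand-in for a SmartLifecycle bean: a name plus its phase. */
    static final class Component {
        final String name;
        final int phase;

        Component(String name, int phase) {
            this.name = name;
            this.phase = phase;
        }
    }

    private static final Comparator<Component> BY_PHASE = Comparator.comparingInt(c -> c.phase);

    /** Lower phases start first. */
    static List<String> startOrder(List<Component> components) {
        return names(components, BY_PHASE);
    }

    /** Higher phases stop first, so dependencies outlive their users. */
    static List<String> stopOrder(List<Component> components) {
        return names(components, BY_PHASE.reversed());
    }

    private static List<String> names(List<Component> components, Comparator<Component> order) {
        List<Component> sorted = new ArrayList<>(components);
        sorted.sort(order);
        List<String> result = new ArrayList<>();
        for (Component c : sorted) {
            result.add(c.name);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Component> components = Arrays.asList(
                new Component("kafka-consumer", 200),   // phase values are made up
                new Component("data-source", 0),
                new Component("task-executor", 100));
        System.out.println("start: " + startOrder(components));
        System.out.println("stop:  " + stopOrder(components));
    }
}
```

Giving the entry points (such as consumers) the highest phase means they stop first, while the resources they depend on stay up until the in-flight work has drained.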

Graceful shutdown for @Async / @Scheduled

Spring Boot can wait for asynchronous and scheduled tasks to finish on shutdown:

spring.task.execution.shutdown.await-termination=true
# spring.task.execution.shutdown.await-termination-period=30s
spring.task.scheduling.shutdown.await-termination=true
# spring.task.scheduling.shutdown.await-termination-period=30s

When to use it:

Background tasks run with @Async
Periodic tasks run with @Scheduled
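
At the JDK level, waiting for tasks on shutdown comes down to calling shutdown() and then awaitTermination() with a bounded period. The sketch below shows that pattern on a plain ExecutorService; it is roughly analogous to the properties above and is not Spring's own implementation. AwaitTermination is an illustrative name.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Bounded wait for running tasks on shutdown (plain JDK sketch). */
class AwaitTermination {
    /**
     * Stops accepting new tasks, then waits up to the given period for queued and running tasks.
     * Returns true if everything finished in time; otherwise interrupts the rest and returns false.
     */
    static boolean shutdownGracefully(ExecutorService pool, long period, TimeUnit unit) {
        pool.shutdown();                       // no new tasks; existing ones keep running
        try {
            if (pool.awaitTermination(period, unit)) {
                return true;
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdownNow();                    // timed out or interrupted: interrupt what is left
        return false;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.execute(() -> System.out.println("background task finished"));
        boolean clean = shutdownGracefully(pool, 30, TimeUnit.SECONDS);
        System.out.println("clean shutdown: " + clean);
    }
}
```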

Graceful shutdown for HTTP requests

spring:
  lifecycle:
    timeout-per-shutdown-phase: 30s

server:
  shutdown: graceful

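The HTTP layer is handled by Spring Boot's built-in graceful shutdown, which behaves as follows:

No new requests are accepted (requests already being handled continue to completion during shutdown)
The application exits once existing requests finish or timeout-per-shutdown-phase elapses (asynchronous work follows the @Async await settings)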

Handling Kafka during graceful shutdown

The project uses Spring Cloud Stream. Example consumer logic:

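import java.util.function.Consumer;
import org.springframework.context.annotation.Bean;
import org.springframework.kafka.support.Acknowledgment;
import org.springframework.kafka.support.KafkaHeaders;
import org.springframework.messaging.Message;

@Bean
public Consumer<Message<KafkaMessageModel>> testConsumer() {
    return message -> {
        // The acknowledgment header is only present when manual ack mode is configured.
        Acknowledgment ack = message.getHeaders()
            .get(KafkaHeaders.ACKNOWLEDGMENT, Acknowledgment.class);
        // Business processing goes here
        // Acknowledge manually only after processing succeeds
        if (ack != null) {
            ack.acknowledge();
        }
    };
}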

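Key points:

Manual ack: commit the offset only after processing succeeds, so a message is never acknowledged before it has been handled
Shutdown order: stop the producer first, then the consumer
Waiting for in-flight work: MainJobState records whether a message is being processed, and a monitor class waits for it to finish

Configuration example (manual ack):

spring.cloud.stream.kafka.bindings.testConsumer-in-0.consumer.ack-mode=manual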
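
MainJobState itself is not shown in this article. As a hypothetical stand-in, the sketch below tracks in-flight work with a counter and lets a shutdown hook wait for it with a timeout; InFlightTracker and its methods are assumptions made for illustration, not the project's actual class.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Counts in-flight jobs and lets shutdown wait for them (hypothetical stand-in for MainJobState). */
class InFlightTracker {
    private final AtomicInteger inFlight = new AtomicInteger();

    /** Wraps one unit of work so the counter is decremented even if the job throws. */
    void track(Runnable job) {
        inFlight.incrementAndGet();
        try {
            job.run();
        } finally {
            inFlight.decrementAndGet();
        }
    }

    int inFlight() {
        return inFlight.get();
    }

    /** Polls until nothing is in flight or the timeout elapses; returns true if fully drained. */
    boolean awaitIdle(long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (inFlight.get() > 0) {
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        InFlightTracker tracker = new InFlightTracker();
        tracker.track(() -> System.out.println("processing one message"));
        System.out.println("drained: " + tracker.awaitIdle(1000));
    }
}
```

In the consumer, the business processing and the ack would run inside track(...), and the shutdown path would call awaitIdle(...) after the consumer has stopped polling.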

K8s configuration (platform layer)

spec:
  terminationGracePeriodSeconds: 30
  containers:
  - name: {APP_NAME}
    lifecycle:
      preStop:
        exec:
          command: ["curl", "-XPOST", "127.0.0.1:50000/actuator/shutdown"]
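
The preStop hook and the application's own shutdown share the same terminationGracePeriodSeconds; once that budget is used up, the container is killed. A rough budget check can be sketched as follows. GracePeriodBudget is an illustrative helper, and the preStop duration and the number of shutdown phases that actually have work to wait for are assumptions to be measured for a real application.

```java
/** Rough check that the termination grace period covers preStop plus application shutdown. */
class GracePeriodBudget {
    /**
     * Seconds left over after the preStop hook and the worst-case shutdown phases.
     * A negative result means the container may be killed before shutdown completes.
     */
    static int marginSeconds(int terminationGracePeriodSeconds,
                             int preStopSeconds,
                             int phaseTimeoutSeconds,
                             int phasesWithWork) {
        return terminationGracePeriodSeconds - preStopSeconds - phaseTimeoutSeconds * phasesWithWork;
    }

    public static void main(String[] args) {
        // 30s grace period and 30s timeout-per-shutdown-phase, assuming a 1s preStop call
        System.out.println("margin: " + marginSeconds(30, 1, 30, 1) + "s");
        // a longer grace period leaves headroom
        System.out.println("margin: " + marginSeconds(45, 1, 30, 1) + "s");
    }
}
```

With the values used in this article (a 30s grace period and a 30s phase timeout) there is no headroom in the worst case, so either a longer grace period or a shorter phase timeout is worth considering.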

Autoscaling

After setting resource requests and limits on the Pod, create an HPA

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {APP_NAME}
  labels:
    app: {APP_NAME}
spec:
  template:
    spec:
      containers:
      - name: {APP_NAME}
        image: {IMAGE_URL}
        imagePullPolicy: Always
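        resources:                     # container resource management
          limits:                      # upper bound on resource usage
            cpu: 0.5
            memory: 1Gi
          requests:                    # resources reserved for scheduling; HPA utilization is measured against these
            cpu: 0.15
            memory: 300Mi
---
kind: HorizontalPodAutoscaler            # autoscaling controller
apiVersion: autoscaling/v2beta2          # use autoscaling/v2 on Kubernetes 1.23+; v2beta2 was removed in 1.26
metadata:
  name: {APP_NAME}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {APP_NAME}
  minReplicas: {REPLICAS}                # scaling range
  maxReplicas: 6
  metrics:
    - type: Resource
      resource:
        name: cpu                        # resource metric to scale on
        target:
          type: Utilization
          averageUtilization: 50         # percent of the CPU request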
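
The HPA derives the replica count from the ratio of current to target utilization: desired = ceil(current replicas x current utilization / target utilization), clamped to the configured range. A minimal sketch of that formula, leaving out the tolerance band and stabilization windows the real controller also applies; HpaMath is an illustrative name.

```java
/** Core HPA scaling formula (simplified; ignores tolerance and stabilization). */
class HpaMath {
    static int desiredReplicas(int currentReplicas,
                               double currentUtilization,
                               double targetUtilization,
                               int minReplicas,
                               int maxReplicas) {
        int desired = (int) Math.ceil(currentReplicas * (currentUtilization / targetUtilization));
        return Math.max(minReplicas, Math.min(maxReplicas, desired));
    }

    public static void main(String[] args) {
        // target 50% as in the manifest above; min 2 is a hypothetical value for {REPLICAS}, max 6
        System.out.println(desiredReplicas(2, 100, 50, 2, 6)); // load doubled
        System.out.println(desiredReplicas(4, 90, 50, 2, 6));  // capped by maxReplicas
        System.out.println(desiredReplicas(4, 10, 50, 2, 6));  // floored by minReplicas
    }
}
```

Because utilization is measured against the CPU request, a request of 0.15 CPU with a 50% target means scaling out starts once Pods average more than about 75m CPU.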

Prometheus integration

Application side

Project dependencies in pom.xml

<!-- Spring Boot monitoring (Actuator) -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>

Configure the port, path, and exposure in application.yaml

management:
  server:
    port: 50000                         # separate management port
  metrics:
    tags:
      application: ${spring.application.name}
  endpoints:
    web:
      base-path: /actuator              # context path for the endpoints
      exposure:
        include: metrics,prometheus

This exposes /actuator/metrics and /actuator/prometheus, reachable as follows:

http://127.0.0.1:50000/actuator/metrics
http://127.0.0.1:50000/actuator/prometheus

Operations side

deployment.yaml

apiVersion: apps/v1
kind: Deployment
spec:
  template:
    metadata:
      annotations:
        prometheus.io/port: "50000"
        prometheus.io/path: /actuator/prometheus  # value set in the pipeline
        prometheus.io/scrape: "true"              # Pod-based service discovery

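Externalized configuration

Approach: mount the external configuration file from a ConfigMap and select the active profile at runtime

Benefits: configuration lives outside the image, which keeps environment-specific and sensitive values out of it, and the same image can be reused across environments, which speeds up delivery

Generate the ConfigMap from a file

# Generate the YAML file with a dry run
kubectl create cm -n <namespace> <APP_NAME> --from-file=application-test.yaml --dry-run=client -oyaml > configmap.yaml

# Apply the update
kubectl apply -f configmap.yaml

Mount the ConfigMap and select the active profile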

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {APP_NAME}
  labels:
    app: {APP_NAME}
spec:
  template:
    spec:
      containers:
      - name: {APP_NAME}
        image: {IMAGE_URL}
        imagePullPolicy: Always
        env:
          - name: SPRING_PROFILES_ACTIVE   # active profile
            value: test
        volumeMounts:                      # mount the ConfigMap
        - name: conf
          mountPath: "/app/config"         # config/ under the working directory set in the Dockerfile
          readOnly: true
      volumes:
      - name: conf
        configMap:
          name: {APP_NAME}

Consolidated configuration

Application side

Project dependencies in pom.xml

<!-- Spring Boot monitoring (Actuator) -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-registry-prometheus</artifactId>
</dependency>

Configure the port, path, and exposure in application.yaml

spring:
  application:
    name: project-sample
  profiles:
    active: @profileActive@
  lifecycle:
    timeout-per-shutdown-phase: 30s     # each shutdown phase waits at most 30s, then shutdown proceeds regardless

server:
  port: 8080
  shutdown: graceful                    # default is immediate (stop at once); graceful waits for active requests

management:
  server:
    port: 50000                         # separate management port
  metrics:
    tags:
      application: ${spring.application.name}
  endpoint:                             # enable the shutdown endpoint and the health probes
    shutdown:
      enabled: true
    health:
      probes:
        enabled: true
        # With add-additional-paths: true, liveness is also served at /livez and readiness at /readyz on the main server port.
        # add-additional-paths: true
      # show-details: always adds per-indicator details to the response (ping, livenessState, readinessState, diskSpace, and so on).
      # show-details: always
      # group.*.include selects which health indicators contribute to /health/(liveness|readiness).
      # group:
      #   readiness:
      #     include:
      #       - readinessState
      #       - ping
      #   liveness:
      #     include:
      #       - livenessState
      #       - ping
  endpoints:
    web:
      base-path: /actuator              # context path for the endpoints
      exposure:
        include: health,shutdown,metrics,prometheus

Operations side

Make sure the Dockerfile template installs curl; otherwise the preStop hook cannot run the curl command

FROM openjdk:8-jdk-alpine
# Build arguments
ARG JAR_FILE
ARG WORK_PATH="/app"
ARG EXPOSE_PORT=8080

# Environment variables
ENV JAVA_OPTS=""\
    JAR_FILE=${JAR_FILE}

# Set the time zone
RUN ln -sf /usr/share/zoneinfo/Asia/Shanghai /etc/localtime && echo 'Asia/Shanghai' >/etc/timezone
RUN sed -i 's/dl-cdn.alpinelinux.org/mirrors.ustc.edu.cn/g' /etc/apk/repositories  \
    && apk add --no-cache curl
# Copy the jar from the Maven target directory into the image
COPY target/$JAR_FILE $WORK_PATH/


# Set the working directory
WORKDIR $WORK_PATH


# Port exposed to the outside
EXPOSE $EXPOSE_PORT
# Start the application; exec replaces the shell so that the JVM receives SIGTERM directly
ENTRYPOINT exec java $JAVA_OPTS -jar $JAR_FILE

K8s deployment template deployment.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {APP_NAME}
  labels:
    app: {APP_NAME}
spec:
  selector:
    matchLabels:
      app: {APP_NAME}
  replicas: {REPLICAS}                            # number of Pod replicas
  strategy:
    type: RollingUpdate                           # rolling update strategy
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      name: {APP_NAME}
      labels:
        app: {APP_NAME}
      annotations:
        timestamp: {TIMESTAMP}
        prometheus.io/port: "50000"               # cannot be assigned dynamically
        prometheus.io/path: /actuator/prometheus
        prometheus.io/scrape: "true"              # Pod-based service discovery
    spec:
      affinity:                                   # scheduling policy: spread Pods across hosts or availability zones
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - weight: 100
            podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: app
                  operator: In
                  values:
                  - {APP_NAME}
              topologyKey: "kubernetes.io/hostname" # use "topology.kubernetes.io/zone" to spread across availability zones
      terminationGracePeriodSeconds: 30             # grace period for graceful termination
      containers:
      - name: {APP_NAME}
        image: {IMAGE_URL}
        imagePullPolicy: Always
        ports:
        - containerPort: {APP_PORT}
        - name: management-port
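          containerPort: 50000         # application management port
        readinessProbe:                # readiness probe
          httpGet:
            path: /actuator/health/readiness
            port: management-port
          initialDelaySeconds: 90      # delay before the first probe
          periodSeconds: 30            # interval between probes
          timeoutSeconds: 30           # probe timeout
          successThreshold: 1          # consecutive successes needed to count as healthy
          failureThreshold: 3          # consecutive failures needed to count as unhealthy
        livenessProbe:                 # liveness probe
          httpGet:
            path: /actuator/health/liveness
            port: management-port
          initialDelaySeconds: 90      # delay before the first probe
          periodSeconds: 30            # interval between probes
          timeoutSeconds: 30           # probe timeout
          successThreshold: 1          # consecutive successes needed to count as healthy
          failureThreshold: 3          # consecutive failures needed to count as unhealthy
        resources:                     # container resource management
          limits:                      # upper bound on resource usage
            cpu: 0.5
            memory: 1Gi
          requests:                    # resources reserved for scheduling; HPA utilization is measured against these
            cpu: 0.1
            memory: 200Mi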
        env:
          - name: TZ
            value: Asia/Shanghai
---
kind: HorizontalPodAutoscaler            # autoscaling controller
apiVersion: autoscaling/v2beta2          # use autoscaling/v2 on Kubernetes 1.23+; v2beta2 was removed in 1.26
metadata:
  name: {APP_NAME}
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: {APP_NAME}
  minReplicas: {REPLICAS}                # scaling range
  maxReplicas: 6
  metrics:
    - type: Resource
      resource:
        name: cpu                        # resource metric to scale on
        target:
          type: Utilization
          averageUtilization: 50         # percent of the CPU request

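Known issue

One application ran a while (true) {...} loop inside CommandLineRunner.run, so run never returned.

This causes a problem: the application can never shut down normally. Spring Boot also marks the application as ready to accept traffic only after all runners have returned, so the /readiness health check always responds with status 503.

Fix: run the while (true) loop in a separate thread.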
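
One way to apply this fix is to hand the loop to a dedicated worker thread and make it interruptible, so that run returns immediately and shutdown can stop the loop. The sketch below is plain Java under those assumptions; BackgroundLoop is a hypothetical class, with start() meant to be called from CommandLineRunner.run and stop() from a shutdown callback such as @PreDestroy.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Runs an endless loop on its own thread so the caller returns immediately (illustrative sketch). */
class BackgroundLoop {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final AtomicLong iterations = new AtomicLong();

    /** Returns at once; the loop keeps running on the worker thread until stop() interrupts it. */
    void start() {
        worker.execute(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                iterations.incrementAndGet();          // placeholder for the real work
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // restore the flag so the loop exits
                }
            }
        });
    }

    /** Interrupts the loop and waits briefly for the worker to exit; true if it stopped in time. */
    boolean stop() {
        worker.shutdownNow();
        try {
            return worker.awaitTermination(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    long iterations() {
        return iterations.get();
    }

    public static void main(String[] args) throws InterruptedException {
        BackgroundLoop loop = new BackgroundLoop();
        loop.start();                                  // returns immediately
        Thread.sleep(100);
        System.out.println("stopped cleanly: " + loop.stop());
    }
}
```

Checking the interrupt flag instead of looping on a bare true is what lets the loop end during shutdown.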
