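Container Orchestration: Comprehensive Hands-On Lab

This chapter continues from the Advanced Docker (Docker Compose) chapter.

Contents

Final Lab (production-ready stack with full observability simulations)

1. Directory structure

2. Complete files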

.env

docker-compose.yml

api/Dockerfile

api/package.json

api/src/server.js

api/healthcheck.js

scripts/simulate_logs.sh

scripts/toggle_health_fail.sh

scripts/simulate_db_outage.sh

scripts/check_logpath.sh

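Makefile (optional)

3. Startup and basic verification

4. Experiment A: dev profile vs. prod behavior

5. Experiment B: health check failure → recovery

Quick self-check command summary

6. Experiment C: crash self-healing (restart policy)

7. Experiment D: read-only root + tmpfs verification

8. Experiment E: log rotation (2 MB × 3 files)

9. Experiment F: database outage and recovery

10. Experiment G: switching the version via environment

11. Cleanup

12. Summary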


Final Lab (production-ready stack with full observability simulations)

1. Directory structure

compose-labs/08-final/
├─ docker-compose.yml
├─ .env
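├─ Makefile                     # optional: one-command shortcuts
├─ scripts/
│  ├─ simulate_logs.sh          # generate lots of logs to trigger rotation
│  ├─ toggle_health_fail.sh     # make the health check fail/recover
│  ├─ simulate_db_outage.sh     # simulate a database outage
│  └─ check_logpath.sh          # find the container's log file path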
└─ api/
   ├─ Dockerfile
   ├─ package.json
   ├─ healthcheck.js
   └─ src/
      └─ server.js
mkdir -p ~/compose-labs/08-final/{scripts/,api/src/} && cd ~/compose-labs/08-final/

2. Complete files

.env

tee > .env << "EOF"
POSTGRES_PASSWORD=example
POSTGRES_DB=appdb
DATABASE_URL=postgres://postgres:example@db:5432/appdb
TZ=UTC
APP_VERSION=8.0.0
EOF

docker-compose.yml

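  • Multiple cooperating services (db + api + adminer)

  • Stricter health checks (api + db)

  • Production hardening: read_only + tmpfs + user + no-new-privileges

  • Log rotation: 2 MB × 3 files (we will trigger it in an experiment)

  • dev profile: adminer only starts in development mode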

tee > docker-compose.yml << "EOF"
x-logging: &default-logging
  driver: "json-file"
  options: { max-size: "2m", max-file: "3" }  # kept small so rotation triggers quickly

networks:
  frontend:
  backend:

volumes:
  dbdata:

services:
  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD}
      POSTGRES_DB: ${POSTGRES_DB}
    volumes:
      - dbdata:/var/lib/postgresql/data
    networks: [backend]
    logging: *default-logging
    healthcheck:
      test: ["CMD-SHELL","pg_isready -U postgres -d ${POSTGRES_DB}"]
      interval: 5s
      timeout: 3s
      retries: 5
    restart: unless-stopped

  api:
    build:
      context: ./api
      target: runner
    env_file: .env
    ports:
      - "8080:3000"
    depends_on:
      db:
        condition: service_healthy
    networks: [frontend, backend]
    logging: *default-logging
    healthcheck:
      test: ["CMD","node","healthcheck.js"]
      interval: 5s
      timeout: 2s
      retries: 3
      start_period: 5s
    restart: always
    init: true
    read_only: true             # production hardening: read-only root filesystem
    tmpfs: ["/tmp"]             # writable directory at runtime
    security_opt: ["no-new-privileges:true"]
    user: "1000:1000"

  adminer:
    image: adminer:4
    profiles: ["dev"]           # only started with the dev profile
    environment:
      ADMINER_DEFAULT_SERVER: db
    ports:
      - "8081:8080"
    depends_on:
      - db
    networks: [backend]
    logging: *default-logging
EOF

api/Dockerfile

tee > api/Dockerfile << "EOF"
# syntax=docker/dockerfile:1.7

FROM node:20-alpine AS deps
WORKDIR /app
COPY package*.json ./
RUN --mount=type=cache,target=/root/.npm npm ci

FROM deps AS build
COPY . .
RUN npm run build && node -e "import('fs').then(fs=>{fs.access('dist/server.js',fs.constants.F_OK,(e)=>{if(e)process.exit(1)})})"

FROM node:20-alpine AS runner
WORKDIR /app
# No ARG TZ is declared, so ${TZ} would expand to an empty string at build time; set it explicitly
ENV NODE_ENV=production TZ=UTC
RUN addgroup -S app && adduser -S app -G app
USER app
COPY --from=deps /app/node_modules ./node_modules
COPY --from=build --chown=app:app /app/dist ./dist
COPY --chown=app:app healthcheck.js ./healthcheck.js
EXPOSE 3000
HEALTHCHECK --interval=10s --timeout=2s --retries=3 CMD node healthcheck.js || exit 1
CMD ["node","dist/server.js"]
EOF

api/package.json

tee > api/package.json << "EOF"
{
  "name": "final-api",
  "private": true,
  "type": "module",
  "version": "1.0.0",
  "scripts": {
    "build": "mkdir -p dist && cp -r src/* dist/",
    "start": "node dist/server.js"
  },
  "dependencies": {
    "pg": "^8.12.0"
  }
}
EOF

api/src/server.js

tee > api/src/server.js << "EOF"
import http from "http";
import { Client } from "pg";

const port = process.env.PORT || 3000;
const version = process.env.APP_VERSION || "dev";
const dbUrl = process.env.DATABASE_URL;

const server = http.createServer(async (req, res) => {
  if (req.url === "/healthz") {
    // Simple health: 200 as long as the process is alive (a real service could add a DB probe)
    res.writeHead(200); res.end("ok"); return;
  }

  if (req.url === "/version") {
    res.setHeader("Content-Type","application/json");
    res.end(JSON.stringify({ version, node: process.version }));
    return;
  }

  if (req.url === "/db") {
    try {
      const cli = new Client({ connectionString: dbUrl });
      await cli.connect();
      const { rows } = await cli.query("SELECT NOW() now");
      await cli.end();
      res.setHeader("Content-Type","application/json");
      res.end(JSON.stringify({ ok: true, now: rows[0].now }));
    } catch (e) {
      res.statusCode = 500;
      res.end(String(e));
    }
    return;
  }

  if (req.url.startsWith("/work")) {
    // Simulate writing a temp file (read-only root + tmpfs on /tmp → only /tmp is writable)
    const path = `/tmp/demo-${Date.now()}.txt`;
    await import('fs').then(fs => fs.promises.writeFile(path, "hello tmp"));
    res.end(`wrote ${path}`);
    return;
  }

  if (req.url.startsWith("/crash")) {
    // Simulate a crash (used to trigger restart-policy self-healing)
    process.nextTick(() => process.exit(1));
    res.end("bye");
    return;
  }

  if (req.url.startsWith("/spamlog")) {
    // Generate lots of log lines to trigger log rotation
    const n = Number(new URL(req.url, "http://x").searchParams.get("n") || 20000);
    for (let i = 0; i < n; i++) console.log(`spam-${i}-${Date.now()}`);
    res.end(`logged ${n} lines`);
    return;
  }

  res.setHeader("Content-Type","application/json");
  res.end(JSON.stringify({ hello: "final-08", now: new Date().toISOString() }));
});

server.listen(port, () => console.log(`API on ${port}, version=${version}`));
EOF

api/healthcheck.js

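  • With read_only: true + tmpfs: ["/tmp"], /tmp is writable, which makes it a good place for a test flag.

  • From then on, running touch /tmp/HEALTH_FAIL inside the container makes the health probe fail; rm restores it.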

tee >  api/healthcheck.js << "EOF"
import fs from "fs";
import http from "http";

const FAIL_FLAG = "/tmp/HEALTH_FAIL";

// If the flag file exists, report failure immediately (/tmp is writable under the read-only root)
if (fs.existsSync(FAIL_FLAG)) {
  console.error("[healthcheck] FAIL by flag:", FAIL_FLAG);
  process.exit(1);
}

// Otherwise probe over HTTP
const req = http.request(
  { host: "127.0.0.1", port: 3000, path: "/healthz", timeout: 1500 },
  (res) => {
    process.exitCode = res.statusCode === 200 ? 0 : 1;
    res.resume();
  }
);
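// The timeout option only emits a "timeout" event; abort explicitly so the probe fails
req.on("timeout", () => req.destroy(new Error("timeout")));
req.on("error", () => process.exit(1));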
req.end();
EOF

scripts/simulate_logs.sh

tee > scripts/simulate_logs.sh << "EOF"
#!/usr/bin/env bash
set -euo pipefail
CID=$(docker compose ps -q api)
echo "[i] API container: $CID"
echo "[i] Trigger spam logs to hit rotation..."
docker exec -it "$CID" sh -lc "wget -qO- http://127.0.0.1:3000/spamlog?n=200000 >/dev/null || true"
sleep 2
echo "[i] LogPath:"
docker inspect --format '{{.LogPath}}' "$CID"
EOF

scripts/toggle_health_fail.sh

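  • touch /tmp/HEALTH_FAIL → the health check fails

  • rm /tmp/HEALTH_FAIL → the health check recovers

  • The script sleeps before printing the health status so the checks have time to run. With interval: 5s and retries: 3 in the compose file, the switch to unhealthy needs three consecutive failures (about 15 s), while a single passing check restores healthy; adjust the wait accordingly.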

tee > scripts/toggle_health_fail.sh << "EOF"
#!/usr/bin/env bash
set -euo pipefail

CID=$(docker compose ps -q api)
ACTION="${1:-fail}"
FLAG="/tmp/HEALTH_FAIL"

if [ -z "$CID" ]; then
  echo "[!] API container not found. Start it first."
  exit 1
fi

case "$ACTION" in
  fail)
    echo "[i] simulated fail"
    docker compose exec api sh -lc "touch $FLAG && ls -l $FLAG"
    ;;
  recover)
    echo "[i] simulated recover"
    docker compose exec api sh -lc "rm -f $FLAG && echo removed"
    ;;
  *)
    echo "Usage: $0 [fail|recover]"
    exit 1
    ;;
esac

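# Wait for the health checks to take effect (aligned with interval/retries in the compose file):
# flipping to unhealthy needs 3 consecutive failures at interval 5s, i.e. about 15 s
sleep 16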

echo "[i] Current health status:"
docker inspect "$CID" --format '{{json .State.Health}}' | jq
EOF
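How long the script has to wait follows from the health check settings. The sketch below is only an estimate: it assumes checks run back to back and each takes at most `timeout` seconds; the helper names `secondsToUnhealthy` and `secondsToHealthy` are illustrative and not part of the lab files.

```javascript
// Rough upper bounds, in seconds, for Docker health status transitions.
// Assumption: each check starts `interval` seconds after the previous one
// finished and takes at most `timeout` seconds.
function secondsToUnhealthy({ interval, timeout, retries }) {
  // `retries` consecutive failures are needed before the status flips.
  return retries * (interval + timeout);
}

function secondsToHealthy({ interval, timeout }) {
  // A single passing check is enough to flip back to healthy.
  return interval + timeout;
}

// Values taken from the api healthcheck in docker-compose.yml
const apiCheck = { interval: 5, timeout: 2, retries: 3 };
console.log(secondsToUnhealthy(apiCheck)); // 21
console.log(secondsToHealthy(apiCheck));   // 7
```

Because the flag check exits almost immediately, the real delay is usually closer to retries × interval than to this upper bound.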

scripts/simulate_db_outage.sh

tee > scripts/simulate_db_outage.sh << "EOF"
#!/usr/bin/env bash
set -euo pipefail
DB=$(docker compose ps -a -q db)   # -a also lists stopped containers
API=$(docker compose ps -q api)

case "${1:-stop}" in
  stop)
    echo "[i] Stop DB..."
    docker stop "$DB"
    ;;
  start)
    echo "[i] Start DB..."
    docker start "$DB"
    ;;
  *)
    echo "Usage: $0 [stop|start]"
    exit 1
    ;;
esac

sleep 2

echo "[i] try /db"
curl -sS http://localhost:8080/db || true
echo
echo "[i] API health:"
docker inspect "$API" --format '{{.State.Health.Status}}'
EOF

scripts/check_logpath.sh

tee > scripts/check_logpath.sh << "EOF"
#!/usr/bin/env bash
set -euo pipefail
CID=$(docker compose ps -q api)
docker inspect --format '{{.LogPath}}' "$CID"
EOF

Makefile (optional)

If you do not have make, that is fine: equivalent commands are given in the sections below.

tee > Makefile << "EOF"
up:
	DOCKER_BUILDKIT=1 docker compose up -d --build
down:
	docker compose down -v
ps:
	docker compose ps
logs:
	docker compose logs -f api
health:
	@CID=$$(docker compose ps -q api); docker inspect $$CID --format '{{json .State.Health}}' | jq
logpath:
	bash scripts/check_logpath.sh
spam:
	bash scripts/simulate_logs.sh
fail:
	bash scripts/toggle_health_fail.sh fail
recover:
	bash scripts/toggle_health_fail.sh recover
dbstop:
	bash scripts/simulate_db_outage.sh stop
dbstart:
	bash scripts/simulate_db_outage.sh start
EOF

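3. Startup and basic verification

# 1. Enter the api directory, which contains package.json
cd ~/compose-labs/08-final/api

# 2. Run npm install (reads package.json, installs dependencies, and generates the package-lock.json that npm ci in the Dockerfile requires)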
npm install
cd ~/compose-labs/08-final
DOCKER_BUILDKIT=1 docker compose up -d --build
docker compose ps
curl -sS http://localhost:8080/

curl -sS http://localhost:8080/version

curl -sS http://localhost:8080/db

curl -sS http://localhost:8080/work

Expected:

  • / returns {"hello":"final-08","now":"..."}

  • /version returns {"version":"8.0.0","node":"v20..."}

  • /db returns {"ok":true,"now":"..."}

  • /work returns wrote /tmp/demo-xxx.txt (shows that /tmp on tmpfs is writable under the read-only root)

Health status:

CID=$(docker compose ps -q api)
docker inspect "$CID" --format '{{json .State.Health}}' | jq
# look for "Status": "healthy"

4. Experiment A: dev profile vs. prod behavior

Pure production (adminer is not enabled by default):

docker compose down -v
DOCKER_BUILDKIT=1 docker compose up -d --build
docker compose ps   # no adminer

Development mode (adminer enabled):

docker compose --profile dev up -d
docker compose ps   # adminer is now listed
curl -sS http://localhost:8081  # Adminer page

Observable difference: whether adminer appears in docker compose ps; whether port 8081 is open.

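5. Experiment B: health check failure → recovery

Initial health:

curl -sS http://localhost:8080/healthz
# ok

docker inspect $(docker compose ps -q api) --format '{{.State.Health.Status}}'
# healthy

Make the health check fail:

bash scripts/toggle_health_fail.sh fail
# [i] simulated fail
# ... after the wait, the script prints the Health object; Status should be "unhealthy"
# (this takes three consecutive failed checks, about 15 s with interval: 5s and retries: 3;
#  re-run the docker inspect command below if it still shows "healthy")

Expected output:

"Status": "unhealthy"

Equivalent manual commands (you can see that the flag file exists):

docker compose exec api ls -l /tmp/HEALTH_FAIL
docker inspect $(docker compose ps -q api) --format '{{.State.Health.Status}}'
# unhealthy

Why a flag file under /tmp: docker-compose.yml deliberately enables production hardening, so a health check failure cannot be simulated by editing the server code (api/src/server.js) in the running container:

read_only: true
tmpfs: ["/tmp"]

This means:

  • the container's entire root filesystem is read-only (read_only);

  • only /tmp is mounted as a writable in-memory filesystem.

Recover:

bash scripts/toggle_health_fail.sh recover
# [i] simulated recover
# ... after a few seconds, Status -> "healthy"

Expected:

"Status": "healthy"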
Quick self-check command summary

# Current health status (single line)
docker inspect $(docker compose ps -q api) --format '{{.State.Health.Status}}'

# Most recent health check log entries
docker inspect $(docker compose ps -q api) --format '{{json .State.Health.Log}}' | jq '.[-3:]'

# Container restart count (used in the crash self-healing experiment)
docker inspect $(docker compose ps -q api) --format '{{.RestartCount}}'

# Whether the flag file exists
docker compose exec api sh -lc 'ls -l /tmp/HEALTH_FAIL || echo "no flag"'

# Watch health events live
docker events --filter container=$(docker compose ps -q api)

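6. Experiment C: crash self-healing (restart policy)

Observable difference: the restart count increases; the logs show the restart.

curl -sS http://localhost:8080/crash
sleep 2
docker compose ps
# api is back to Up shortly afterwards
docker inspect $(docker compose ps -q api) --format '{{.RestartCount}}'
# the restart count has gone up by 1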
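Why api comes back after /crash while a manually stopped db stays down can be summarized in a small decision function. This is a simplified sketch of Docker's restart policies for a running daemon; `willRestart` is a hypothetical helper, not a Docker API.

```javascript
// Simplified model of Docker restart policies while the daemon keeps running.
// `manuallyStopped` means the container was stopped with docker stop.
function willRestart(policy, exitCode, manuallyStopped = false) {
  // A manual stop is never undone immediately, whatever the policy.
  if (manuallyStopped) return false;
  switch (policy) {
    case "always":
    case "unless-stopped":
      return true; // restart on any exit
    case "on-failure":
      return exitCode !== 0; // restart only after a non-zero exit
    default:
      return false; // "no": never restart
  }
}

console.log(willRestart("always", 1));               // true  (api after /crash)
console.log(willRestart("unless-stopped", 0, true)); // false (db after docker stop)
console.log(willRestart("on-failure", 0));           // false
```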

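7. Experiment D: read-only root + tmpfs verification

# Observable difference: the root filesystem is not writable; /tmp is writable.

CID=$(docker compose ps -q api)
docker exec -it "$CID" sh -lc 'echo hi > /root/test.txt || echo "RO root"'
docker exec -it "$CID" sh -lc 'echo hi > /tmp/test.txt && cat /tmp/test.txt'
# Permission denied on /root also follows from running as a non-root user,
# so confirm the read-only flag directly:
docker inspect "$CID" --format '{{.HostConfig.ReadonlyRootfs}}'

sh: can't create /root/test.txt: Permission denied
RO root

hi

true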

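8. Experiment E: log rotation (2 MB × 3 files)

Generate lots of logs:

bash scripts/simulate_logs.sh

Inspect the log files:

LOG=$(bash scripts/check_logpath.sh)
ls -lh "$(dirname "$LOG")" | grep -E "$(basename "$LOG")"
# (on Linux, reading the Docker data directory may require sudo)
# You will see:
#  xxx-json.log
#  xxx-json.log.1
#  xxx-json.log.2

Observable difference: .log.1 and .log.2 appear; the main file shrinks to under 2 MB (it has been rotated).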
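The file names and the disk budget follow directly from max-size and max-file. The sketch below only restates that arithmetic; `rotatedLogFiles` and `nominalCapMb` are illustrative helpers, and the cap is expressed in the same nominal "m" unit as the compose setting rather than in exact bytes.

```javascript
// Files kept for one container: the active log plus (maxFile - 1) rotated copies.
function rotatedLogFiles(baseName, maxFile) {
  const files = [baseName]; // the active log file
  for (let i = 1; i < maxFile; i++) files.push(`${baseName}.${i}`);
  return files;
}

// Nominal upper bound on log disk usage per container, in the unit of max-size.
function nominalCapMb(maxSizeMb, maxFile) {
  return maxSizeMb * maxFile;
}

console.log(rotatedLogFiles("xxx-json.log", 3));
// [ 'xxx-json.log', 'xxx-json.log.1', 'xxx-json.log.2' ]
console.log(nominalCapMb(2, 3)); // 6
```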

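9. Experiment F: database outage and recovery

Simulate a DB outage:

bash scripts/simulate_db_outage.sh stop
# Expected:
# [i] Stop DB...
# Error: getaddrinfo EAI_AGAIN db
# API health: healthy

Expected: /db returns an error; the api health status stays healthy (because our health check only looks at the process).
(In real production, /healthz can follow one of two strategies: "weak DB dependency, degrade gracefully" or "strong DB dependency, report failure".)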
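The two strategies can be sketched as a pure function that maps a DB probe result to the /healthz response. This is an illustration only: `healthzStatus`, the policy names, and the 503 code are assumptions, not part of the lab's server.js.

```javascript
// Hypothetical policy helper for /healthz.
// "weak":   the DB is optional; stay healthy (200) but say the service is degraded.
// "strong": the DB is required; report failure (503) when it is down.
function healthzStatus(dbOk, policy = "weak") {
  if (dbOk) return { code: 200, body: "ok" };
  if (policy === "strong") return { code: 503, body: "db unavailable" };
  return { code: 200, body: "degraded: db unavailable" };
}

console.log(healthzStatus(true));            // { code: 200, body: 'ok' }
console.log(healthzStatus(false, "weak"));   // { code: 200, body: 'degraded: db unavailable' }
console.log(healthzStatus(false, "strong")); // { code: 503, body: 'db unavailable' }
```

Since healthcheck.js treats only a 200 response as success, the strong policy would turn the api container unhealthy during a DB outage, while the weak policy would keep it healthy.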

Restore the database:

bash scripts/simulate_db_outage.sh start
# Expected:
# [i] Start DB...
# [i] try /db
# {"ok":true,"now":"..."}
# API health: healthy

Expected: /db returns {"ok":true,"now":"..."} again.

Observable difference: the service degrades and then recovers.

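10. Experiment G: switching the version via environment

Change APP_VERSION in .env:

sed -i 's/APP_VERSION=8.0.0/APP_VERSION=8.0.1/' .env
docker compose up -d
curl -sS http://localhost:8080/version
# Expected: version becomes 8.0.1 (no image rebuild needed)

Observable difference: the /version response changes, showing that configuration is driven by the environment.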
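The reason no rebuild is needed is that the version is read from the environment at process start, not baked into the image. A minimal sketch of that lookup, mirroring the default used in server.js (`versionPayload` is an illustrative name):

```javascript
// Builds the /version response body from an environment object.
// Falls back to "dev" when APP_VERSION is unset, like server.js does.
function versionPayload(env, nodeVersion) {
  const version = env.APP_VERSION || "dev";
  return JSON.stringify({ version, node: nodeVersion });
}

console.log(versionPayload({ APP_VERSION: "8.0.1" }, "v20")); // {"version":"8.0.1","node":"v20"}
console.log(versionPayload({}, "v20"));                       // {"version":"dev","node":"v20"}
```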

11. Cleanup

docker rm -f $(docker ps -aq --filter network=08-final_backend)
docker compose down -v
docker image prune -f

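12. Summary

Scenario                  What changes           What you can observe
dev / prod switch         Process topology       Whether adminer appears in docker compose ps; whether port 8081 is open
Health fail / recover     Health status          docker inspect Status: unhealthy → healthy
Crash self-healing        Process restart        Restart count increases; logs show the restart
Read-only root + tmpfs    Filesystem             Writes to root paths fail; writes to /tmp succeed
Log rotation              Disk usage             Container log gains .log.1/.log.2; main log size drops
DB outage / recovery      Business function      /db 500 → normal response; health can be made dependent on the DB or not
Environment variable      Config management      /version response changes; no rebuild needed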