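Preface
The previous posts dealt with Pods that are meant to run continuously; there are also cases where you want a task to terminate once its work is done. ReplicationController, ReplicaSet, and DaemonSet run continuous tasks that never reach a completed state: when the process in one of their pods exits, it is restarted. In a completable task, however, the process should not be restarted after it finishes.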
Job
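A Job lets you run a pod whose container is not restarted when the process inside finishes successfully. Once the task completes, the pod is considered to be in the completed state. If a node fails, the pods on that node that are managed by a Job are rescheduled to other nodes.
Jobs are useful for ad hoc tasks where it is crucial that the task finishes properly. You could run the task in an unmanaged pod and wait for it to finish, but if the node failed or the pod was evicted while the task was running, you would have to recreate the pod manually.
Creating a Job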
job_test.yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: test-job
spec:
  template:
    metadata:
      labels:
        app: test-job
    spec:
      containers:
      - name: first-job
        image: luksa/batch-job   # this image runs a process for 120 seconds, then exits
      restartPolicy: OnFailure   # a Job's restartPolicy must be Never or OnFailure
$ kubectl apply -f job_test.yaml
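Inspect the Job and its pod. The output shows the Job's completions (the number of pods that must finish successfully for the Job to be complete), its parallelism (the number of pods that run in parallel, 1 by default), the counts of running, succeeded, and failed pods, and the events.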
$ kubectl get job
NAME COMPLETIONS DURATION AGE
test-job 0/1 111s 111s
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
test-job--1-qmh85 1/1 Running 0 100s
$ kubectl describe job test-job
Name: test-job
Namespace: default
Selector: controller-uid=33227695-2bdd-4a55-bc62-5e7844fc4341
Labels: app=test-job
controller-uid=33227695-2bdd-4a55-bc62-5e7844fc4341
job-name=test-job
Annotations: <none>
Parallelism: 1
Completions: 1
Completion Mode: NonIndexed
Start Time: Tue, 30 Nov 2021 22:18:37 +0800
Pods Statuses: 1 Running / 0 Succeeded / 0 Failed
Pod Template:
Labels: app=test-job
controller-uid=33227695-2bdd-4a55-bc62-5e7844fc4341
job-name=test-job
Containers:
first-job:
Image: luksa/batch-job
Port: <none>
Host Port: <none>
Environment: <none>
Mounts: <none>
Volumes: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulCreate 118s job-controller Created pod: test-job--1-qmh85
After the Job completes
The pod's status is now Completed, and the events include a "Job completed" message.
$ kubectl get job
NAME COMPLETIONS DURATION AGE
test-job 1/1 2m18s 7m21s
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
test-job--1-qmh85 0/1 Completed 0 7m50s
$ kubectl describe job test-job
Name: test-job
Namespace: default
Selector: controller-uid=33227695-2bdd-4a55-bc62-5e7844fc4341
Labels: app=test-job
controller-uid=33227695-2bdd-4a55-bc62-5e7844fc4341
job-name=test-job
Annotations: <none>
Parallelism: 1
Completions: 1
Completion Mode: NonIndexed
Start Time: Tue, 30 Nov 2021 22:18:37 +0800
Completed At: Tue, 30 Nov 2021 22:20:55 +0800
Duration: 2m18s
Pods Statuses: 0 Running / 1 Succeeded / 0 Failed
Pod Template:
Labels: app=test-job
controller-uid=33227695-2bdd-4a55-bc62-5e7844fc4341
job-name=test-job
Containers:
first-job:
Image: luksa/batch-job
Port: <none>
Host Port: <none>
Environment: <none>
Mounts: <none>
Volumes: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulCreate 8m40s job-controller Created pod: test-job--1-qmh85
Normal Completed 6m22s job-controller Job completed
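Configuration options
Besides the number of pods that must complete (completions) and the number that run in parallel (parallelism), a Job can also set activeDeadlineSeconds (the maximum time the Job may run; once it is exceeded, the running pods are terminated and the Job is marked as failed) and backoffLimit (the number of retries before the Job is marked as failed, 6 by default).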
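As a rough illustration of these options, the manifest below sets all four fields on a Job that reuses the same image. It is only a sketch: the name test-job-multi and every numeric value are hypothetical and chosen for illustration, not taken from the original setup.

```yaml
# Hypothetical manifest: the name and all numeric values are illustrative.
apiVersion: batch/v1
kind: Job
metadata:
  name: test-job-multi
spec:
  completions: 5              # the Job is complete once 5 pods have succeeded
  parallelism: 2              # run at most 2 pods at the same time
  backoffLimit: 4             # mark the Job as failed after 4 retries (default is 6)
  activeDeadlineSeconds: 900  # fail the Job if it is still active after 15 minutes
  template:
    metadata:
      labels:
        app: test-job
    spec:
      containers:
      - name: first-job
        image: luksa/batch-job
      restartPolicy: OnFailure
```

With these values, `kubectl get job` would be expected to start at 0/5 in the COMPLETIONS column and count up as pods finish, with up to two pods running at any moment.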
CronJob
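A Job resource runs its pods immediately when it is created. Many batch tasks, however, need to run at a specific time or repeat at a fixed interval. As with cron on Linux, at the configured time Kubernetes creates a Job resource from the Job template configured in the CronJob object.
Creating a CronJob
cronjob_test.yaml; schedule uses the cron schedule format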
apiVersion: batch/v1
kind: CronJob
metadata:
  name: test-job-every-fifteen-min
spec:
  schedule: "0,15,30,45 * * * *"
  jobTemplate:
    spec:
      template:
        metadata:
          labels:
            app: test-job
        spec:
          containers:
          - name: first-job
            image: luksa/batch-job
          restartPolicy: OnFailure
$ kubectl apply -f cronjob_test.yaml
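The schedule string in this manifest follows the standard five-field cron layout. The fragment below is a small reading aid rather than a complete manifest; the two commented-out alternatives are illustrative examples, not schedules used in this post.

```yaml
# Field order of a cron schedule:
#   minute (0-59)  hour (0-23)  day of month (1-31)  month (1-12)  day of week (0-6, 0 = Sunday)
schedule: "0,15,30,45 * * * *"   # the schedule used here: minutes 0, 15, 30 and 45 of every hour
# schedule: "*/15 * * * *"       # equivalent form using a step value
# schedule: "0 3 * * 0"          # illustrative only: 03:00 every Sunday
```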
Inspecting the CronJob
$ kubectl get cronjob
NAME SCHEDULE SUSPEND ACTIVE LAST SCHEDULE AGE
test-job-every-fifteen-min 0,15,30,45 * * * * False 0 <none> 69s
$ kubectl describe cronjob test-job-every-fifteen-min
Name: test-job-every-fifteen-min
Namespace: default
Labels: <none>
Annotations: <none>
Schedule: 0,15,30,45 * * * *
Concurrency Policy: Allow
Suspend: False
Successful Job History Limit: 3
Failed Job History Limit: 1
Starting Deadline Seconds: <unset>
Selector: <unset>
Parallelism: <unset>
Completions: <unset>
Pod Template:
Labels: app=test-job
Containers:
first-job:
Image: luksa/batch-job
Port: <none>
Host Port: <none>
Environment: <none>
Mounts: <none>
Volumes: <none>
Last Schedule Time: <unset>
Active Jobs: <none>
Events: <none>
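Several entries in the describe output above (Concurrency Policy, Suspend, the two history limits, Starting Deadline Seconds) correspond to fields in the CronJob spec. The manifest below is a sketch of how they could be set explicitly; the Forbid policy and the 60-second deadline are assumptions picked for illustration, while the history limits simply restate the defaults shown in the output.

```yaml
# Illustrative variant of cronjob_test.yaml; the added values are assumptions.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: test-job-every-fifteen-min
spec:
  schedule: "0,15,30,45 * * * *"
  concurrencyPolicy: Forbid        # skip a run while the previous Job is still active (default: Allow)
  startingDeadlineSeconds: 60      # a run that cannot start within 60s of its scheduled time counts as missed
  successfulJobsHistoryLimit: 3    # keep the 3 most recent successful Jobs (the default)
  failedJobsHistoryLimit: 1        # keep the most recent failed Job (the default)
  suspend: false                   # set to true to pause scheduling without deleting the CronJob
  jobTemplate:
    spec:
      template:
        metadata:
          labels:
            app: test-job
        spec:
          containers:
          - name: first-job
            image: luksa/batch-job
          restartPolicy: OnFailure
```

Since each run of this image takes about two minutes and runs are 15 minutes apart, the concurrency policy is unlikely to matter here; it becomes relevant when a Job can outlast its schedule interval.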
List the Jobs created by the CronJob and check the CronJob's events.
$ kubectl get job
NAME COMPLETIONS DURATION AGE
test-job-every-fifteen-min-27304755 1/1 2m18s 15m
test-job-every-fifteen-min-27304770 0/1 28s 28s
$ kubectl describe cronjob test-job-every-fifteen-min
Name: test-job-every-fifteen-min
Namespace: default
Labels: <none>
Annotations: <none>
Schedule: 0,15,30,45 * * * *
Concurrency Policy: Allow
Suspend: False
Successful Job History Limit: 3
Failed Job History Limit: 1
Starting Deadline Seconds: <unset>
Selector: <unset>
Parallelism: <unset>
Completions: <unset>
Pod Template:
Labels: app=test-job
Containers:
first-job:
Image: luksa/batch-job
Port: <none>
Host Port: <none>
Environment: <none>
Mounts: <none>
Volumes: <none>
Last Schedule Time: Tue, 30 Nov 2021 23:30:00 +0800
Active Jobs: test-job-every-fifteen-min-27304770
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulCreate 16m cronjob-controller Created job test-job-every-fifteen-min-27304755
Normal SawCompletedJob 14m cronjob-controller Saw completed job: test-job-every-fifteen-min-27304755, status: Complete
Normal SuccessfulCreate 79s cronjob-controller Created job test-job-every-fifteen-min-27304770