First, configure the GPU nodes of the k8s cluster as described at https://blog.csdn.net/vah101/article/details/108098827 .
Then write the following tensflow.yaml file:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: tensorflow-gpu-jupyter
  labels:
    app: tensorflow-gpu-jupyter
spec:
  replicas: 1
  selector: # define how the deployment finds the pods it manages
    matchLabels:
      app: tensorflow-gpu-jupyter
  template: # define the pod specification
    metadata:
      labels:
        app: tensorflow-gpu-jupyter
    spec:
      containers:
      - name: tensorflow-gpu-jupyter
        image: tensorflow/tensorflow:latest-gpu-jupyter
        resources:
          limits:
            nvidia.com/gpu: 1 # request one GPU exposed by the NVIDIA device plugin
---
apiVersion: v1
kind: Service
metadata:
  name: tensorflow-gpu-jupyter
  labels:
    app: tensorflow-gpu-jupyter
spec:
  type: NodePort
  ports:
  - port: 8888
    targetPort: 8888
    nodePort: 30888
  selector:
    app: tensorflow-gpu-jupyter
---
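# Note: the extensions/v1beta1 Ingress API was removed in Kubernetes 1.22.
# Newer clusters need networking.k8s.io/v1, which uses a different backend schema.
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: tensorflow-gpu-jupyter
spec:
  rules:
  - host: gpu.k8s
    http:
      paths:
      - backend:
          serviceName: tensorflow-gpu-jupyter
          servicePort: 8888
        path: /
status:
  loadBalancer: {}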
Run:
kubectl create -f tensflow.yaml
Then use kubectl get pod to check whether TensorFlow has started. If it has, the output looks like this:
NAME                                      READY   STATUS    RESTARTS   AGE
tensorflow-gpu-jupyter-66874fd7db-rvjt2   1/1     Running   0          4m29s
Next, run kubectl logs -f tensorflow-gpu-jupyter-66874fd7db-rvjt2
The log will show:
[I 07:56:42.097 NotebookApp] Writing notebook server cookie secret to /root/.local/share/jupyter/runtime/notebook_cookie_secret
[I 07:56:42.318 NotebookApp] Serving notebooks from local directory: /tf
[I 07:56:42.318 NotebookApp] The Jupyter Notebook is running at:
[I 07:56:42.318 NotebookApp] http://tensorflow-gpu-jupyter-66874fd7db-rvjt2:8888/?token=1ba80e3978b1c7da673aae668c9750105b37cb3efd0b223a
[I 07:56:42.318 NotebookApp] or http://127.0.0.1:8888/?token=1ba80e3978b1c7da673aae668c9750105b37cb3efd0b223a
[I 07:56:42.318 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 07:56:42.321 NotebookApp]
To access the notebook, open this file in a browser:
file:///root/.local/share/jupyter/runtime/nbserver-1-open.html
Or copy and paste one of these URLs:
http://tensorflow-gpu-jupyter-66874fd7db-rvjt2:8888/?token=1ba80e3978b1c7da673aae668c9750105b37cb3efd0b223a
or http://127.0.0.1:8888/?token=1ba80e3978b1c7da673aae668c9750105b37cb3efd0b223a
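This means the notebook server started successfully. In a browser, open
http://<master node IP>:30888/?token=1ba80e3978b1c7da673aae668c9750105b37cb3efd0b223a
(using the token from your own log) to reach the Jupyter interface.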
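Copying the token out of the log by hand is error-prone, so here is a minimal Python sketch of how the browser URL could be assembled from the log text. The helper names extract_token and notebook_url are hypothetical, the address 192.0.2.10 is only a placeholder for the master node IP, and the default port 30888 is taken from the nodePort in tensflow.yaml.

```python
import re

# Matches the "token=<hex>" query parameter that Jupyter prints at startup.
TOKEN_PATTERN = re.compile(r"[?&]token=([0-9a-f]+)")


def extract_token(log_text):
    """Return the first Jupyter token found in the log text, or None."""
    match = TOKEN_PATTERN.search(log_text)
    return match.group(1) if match else None


def notebook_url(node_ip, token, node_port=30888):
    """Build the browser URL for the NodePort service (assumes nodePort 30888)."""
    return f"http://{node_ip}:{node_port}/?token={token}"


if __name__ == "__main__":
    # Sample line copied from the log output above; 192.0.2.10 is a placeholder IP.
    sample = (
        "[I 07:56:42.318 NotebookApp]  or http://127.0.0.1:8888/"
        "?token=1ba80e3978b1c7da673aae668c9750105b37cb3efd0b223a"
    )
    print(notebook_url("192.0.2.10", extract_token(sample)))
```

In practice the log text would come from the output of kubectl logs saved to a file or piped in. The token changes every time the pod restarts, so it has to be read again after each restart.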