Kubernetes NVIDIA GPU Monitor & Grafana Dashboard

▶ Export Metrics

1. Prerequisites

NVIDIA Tesla drivers R384 or later (download from the NVIDIA Driver Downloads page)

nvidia-docker version > 2.0 (see how to install it and its prerequisites)

Optionally configure docker to set your default runtime to nvidia

NVIDIA device plugin for Kubernetes (see how to install)
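
For the optional default-runtime step above, the usual approach is to edit Docker's daemon configuration on each GPU node. The fragment below is a minimal sketch, assuming the file lives at /etc/docker/daemon.json and that nvidia-container-runtime is installed at /usr/bin/nvidia-container-runtime; adjust both if your installation differs.

```json
{
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
```

Docker has to be restarted on the node for the change to take effect.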

2. Create the PVC

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-gpu-pvc
  namespace: kube-system
spec:
  accessModes:
    - ReadWriteMany
  volumeMode: Filesystem
  resources:
    requests:
      storage: 10Gi

3. Run the DaemonSet to Start a Pod on the GPU Node

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: prometheus-gpu
  namespace: kube-system
spec:
  revisionHistoryLimit: 3
  selector:
    matchLabels:
      k8s-app: prometheus-gpu
  template:
    metadata:
      labels:
        k8s-app: prometheus-gpu
    spec:
      nodeSelector:
        kubernetes.io/hostname: gpu
      volumes:
        - name: prometheus
          persistentVolumeClaim:
            claimName: prometheus-gpu-pvc
        - name: proc
          hostPath:
            path: /proc
        - name: sys
          hostPath:
            path: /sys
      serviceAccountName: admin-user
      containers:
        - name: dcgm-exporter
          image: "nvidia/dcgm-exporter"
          volumeMounts:
            - name: prometheus
              mountPath: /run/prometheus/
          imagePullPolicy: Always
          securityContext:
            runAsNonRoot: false
            runAsUser: 0
          env:
            - name: DEPLOY_TIME
              value: "{{ ansible_date_time.iso8601 }}"
        - name: node-exporter
          image: "q
