- Keywords
docker, Spring Boot, dmidecode (motherboard serial number, ...), nvidia-smi (GPU ...), System.getProperty(key)
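1. Scenario overview
The project needs to read server hardware information (motherboard serial number, GPU, CPU, and so on) to verify whether the device is authorized.
The application is built with Spring Boot and deployed in a Docker container on a Linux server.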
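2. Approach
Hardware information can be read by executing system commands and parsing their output. On Windows this can be done with a .bat script;
on Linux, with a script run through sh. On Linux, dmidecode and nvidia-smi provide the hardware information we need.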
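A minimal sketch of the platform switch described above, assuming the command is passed to the platform shell as a single string. The class name ShellCommand and the method wrap are made up for this illustration and are not part of the project code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ShellCommand {

    /**
     * Builds the argument vector that runs the given script through the
     * platform shell: cmd.exe /c on Windows, sh -c everywhere else.
     */
    public static List<String> wrap(String osName, String script) {
        List<String> argv = new ArrayList<>();
        if (osName != null && osName.startsWith("Windows")) {
            argv.addAll(Arrays.asList("cmd.exe", "/c", script));
        } else {
            argv.addAll(Arrays.asList("sh", "-c", script));
        }
        return argv;
    }

    public static void main(String[] args) {
        // os.name is a standard JVM system property
        String osName = System.getProperty("os.name");
        System.out.println(wrap(osName, "dmidecode -s baseboard-serial-number"));
    }
}
```

The resulting list can be handed to ProcessBuilder as is; keeping the OS check in one place avoids repeating it in every helper method.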
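3. Problems encountered
When the system is deployed with Docker, scripts run inside the container. How do we get the same result as running them directly on the host?
Calling Runtime.getRuntime().exec(new String[] {"echo", "1234"}) from the application is equivalent to entering the following command on the host:
docker exec -it dockerContainerName echo "1234"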
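To see what a command produces inside the container, it helps to capture its output from Java. The following is a sketch only: the class name CommandRunner and the 10-second limit are assumptions made for this example. It reads the output before waiting for the process to exit, so a full pipe cannot stall the child process.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.TimeUnit;

public class CommandRunner {

    /** Runs a command and returns its combined stdout and stderr, trimmed. */
    public static String run(String... command) {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // merge stderr into stdout
        try {
            Process process = builder.start();
            StringBuilder output = new StringBuilder();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                // Drain the output first, then wait for the exit status
                while ((line = reader.readLine()) != null) {
                    output.append(line).append('\n');
                }
            }
            // Arbitrary limit chosen for this sketch
            if (!process.waitFor(10, TimeUnit.SECONDS)) {
                process.destroyForcibly();
                throw new IllegalStateException("command did not exit in time");
            }
            return output.toString().trim();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for command", e);
        }
    }

    public static void main(String[] args) {
        // Runs wherever the JVM runs; inside a container this is the container's echo
        System.out.println(run("echo", "1234"));
    }
}
```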
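3.1 NVIDIA-SMI couldn't find libnvidia-ml.so library in your system
When nvidia-smi was bound into the container, only /usr/bin/nvidia-smi was bound; the files it needs at run time (such as libnvidia-ml.so) were not.
Use whereis to locate the missing files on the host and bind them as well.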
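If the lookup is scripted, the whereis output has to be split into individual paths before they can be turned into bind entries. A small sketch, assuming the usual "name: path1 path2 ..." output format; the class name WhereisParser and the sample line are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;

public class WhereisParser {

    /** Extracts the paths from one line of whereis output ("name: path1 path2 ..."). */
    public static List<String> paths(String whereisLine) {
        List<String> result = new ArrayList<>();
        int colon = whereisLine.indexOf(':');
        if (colon < 0) {
            return result; // not a whereis result line
        }
        for (String token : whereisLine.substring(colon + 1).trim().split("\\s+")) {
            if (!token.isEmpty()) {
                result.add(token);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Illustrative sample line, not taken from a real server
        String sample = "nvidia-smi: /usr/bin/nvidia-smi /usr/share/man/man1/nvidia-smi.1.gz";
        for (String path : paths(sample)) {
            System.out.println(path);
        }
    }
}
```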
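3.2 Running nvidia-smi -L to get the GPU IDs causes the application's dependencies to fail to initialize (NoClassDefFoundError)
This problem appears once /usr/lib/x86_64-linux-gnu is bound into the container, although the failure does not occur every time.
Solution: read the information directly from system files instead of executing commands.
/sys/class/dmi/id/board_serial # file holding the motherboard serial number
/proc/driver/nvidia/gpus # directory holding the per-GPU information (GPU IDs)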
public static String getBoardSN() {
    // Read the board serial number by running cat on the sysfs file
    String cmd = "cat /sys/class/dmi/id/board_serial";
    Process p = RuntimeUtil.exec(cmd);
    return RuntimeUtil.getResult(p);
}
public static String getSerialNumberBySysFile() {
    // Read the serial number straight from sysfs; try-with-resources closes the stream
    try (FileInputStream in = new FileInputStream("/sys/class/dmi/id/board_serial")) {
        List<String> result = IOUtils.readLines(in, "UTF-8");
        return result.size() > 0 ? result.get(0) : "error";
    } catch (IOException e) {
        log.error("get sn error", e);
        return "";
    }
}
public static String getGpuIdsBySysFile() {
    try {
        Path rootPath = Paths.get("/proc/driver/nvidia/gpus");
        // Each GPU has its own subdirectory containing an "information" file
        List<File> informationList = FileUtil.loopFiles(rootPath, 2, pathname -> !pathname.isHidden() && pathname.getName().contains("information"));
        String gpuMark = "GPU UUID:";
        StringBuilder gpuResultStr = new StringBuilder();
        if (CollUtil.isNotEmpty(informationList)) {
            for (File infoFile : informationList) {
                List<String> result;
                try (FileInputStream in = new FileInputStream(infoFile)) {
                    result = IOUtils.readLines(in, "UTF-8");
                }
                // The GPU UUID is expected on the third line of the file
                if (result.size() > 2) {
                    String gpuLine = result.get(2);
                    int index = gpuLine.indexOf(gpuMark);
                    if (index >= 0) { // skip files whose third line is not the UUID line
                        gpuResultStr.append(gpuLine.substring(index + gpuMark.length()).trim()).append(",");
                    }
                }
            }
        }
        return gpuResultStr.toString();
    } catch (IOException e) {
        log.error("get gpu error", e);
        return "";
    }
}
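The method above relies on the UUID being on the third line of the information file. As an alternative sketch, the line can be located by its "GPU UUID:" label instead, which does not depend on line order. The class name GpuInfoParser and the sample values are placeholders invented for this example, not real device data.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class GpuInfoParser {

    private static final String UUID_LABEL = "GPU UUID:";

    /** Returns the value of the "GPU UUID:" line, if the file content has one. */
    public static Optional<String> findUuid(List<String> lines) {
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.startsWith(UUID_LABEL)) {
                return Optional.of(trimmed.substring(UUID_LABEL.length()).trim());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Placeholder content shaped like an "information" file
        List<String> sample = Arrays.asList(
                "Model: \t\t Example GPU",
                "IRQ:   \t\t 0",
                "GPU UUID: \t GPU-00000000-0000-0000-0000-000000000000");
        System.out.println(findUuid(sample).orElse("not found"));
    }
}
```

Returning Optional makes the "no UUID line" case explicit, so the caller decides whether to skip the GPU or report an error.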
4. Code and configuration files
4.1 docker-compose.yml
The whereis command shows where a program or file is located, for example: whereis dmidecode
services:
  my_test_services_api:
    image: my_remote_storage:6565/my_test_services_api:e99e0457
    healthcheck:
      test: if mountpoint -q /data ;then echo "mounted" ;else kill 1 ;fi
      interval: 120s
      timeout: 3s
      start_period: 40s
    labels:
      group: "my_test_services_api"
    environment:
      - TZ=Asia/Shanghai
    privileged: true # run the container with root-level (privileged) access
    ports:
      - 2080:8080 # port mapping
    networks:
      - local
    command: java -XX:MetaspaceSize=128m -XX:MaxMetaspaceSize=128m -Xms1024m -Xmx1024m -Xmn256m -Xss256k -XX:SurvivorRatio=8 -XX:+UseConcMarkSweepGC -jar my_test_services_api.jar
    restart: always
    volumes:
      - type: bind
        source: ./my_test_services_api.conf
        target: /opt/my_test_services_api/config/application.yml
      - type: bind
        source: /data1/log/my_test_services_api
        target: /opt/my_test_services_api/logs
      - type: bind
        source: /data1/data
        target: /data
      - type: bind
        source: /var/run/docker.sock
        target: /var/run/docker.sock
      - type: bind
        source: ./docker-java.properties
        target: /docker-java.properties
      - type: bind # dmidecode binding
        source: /usr/sbin/dmidecode
        target: /usr/sbin/dmidecode
      - type: bind
        source: /dev/mem
        target: /dev/mem
      - type: bind # nvidia-smi binding
        source: /usr/bin/nvidia-smi
        target: /usr/bin/nvidia-smi
      - type: bind
        source: /usr/lib/x86_64-linux-gnu
        target: /usr/lib/x86_64-linux-gnu
4.2 Java backend code
System.getProperty is used to determine the operating system; this article only implements retrieval of motherboard and GPU information on Linux.
import lombok.extern.slf4j.Slf4j;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
@Slf4j
public class MachineUtil {
    public static String getBoard_Series_No_linux() {
        String osName = System.getProperty("os.name");
        log.info("system os name :{}", osName);
        String result = "";
        if (!osName.startsWith("Mac OS") && !osName.startsWith("Windows")) {
            String CPU_ID_CMD = "dmidecode";
            BufferedReader bufferedReader = null;
            Process p = null;
            try {
                p = Runtime.getRuntime()
                        .exec(new String[] {CPU_ID_CMD, "-s", "baseboard-serial-number"}); // output arrives through a pipe
                bufferedReader = new BufferedReader(new InputStreamReader(p.getInputStream()));
                String line = null;
                // Read the output before waiting, so a full pipe cannot block the child process
                while ((line = bufferedReader.readLine()) != null) {
                    result = result + line.trim();
                    log.info("result :{}", result);
                }
                p.waitFor();
            } catch (IOException | InterruptedException e) {
                log.error("Failed to get the board serial number", e);
            } finally {
                closeIoStream(bufferedReader, p);
            }
        }
        return result.trim();
    }
    public static String getGpuIds_linux() {
        String osName = System.getProperty("os.name");
        String result = "";
        if (!osName.startsWith("Mac OS") && !osName.startsWith("Windows")) {
            String CPU_ID_CMD = "nvidia-smi";
            BufferedReader bufferedReader = null;
            Process p = null;
            try {
                p = Runtime.getRuntime().exec(new String[] {CPU_ID_CMD, "-L"}); // output arrives through a pipe
                bufferedReader = new BufferedReader(new InputStreamReader(p.getInputStream()));
                String line = null;
                int index = -1;
                // Read the output before waiting, so a full pipe cannot block the child process
                while ((line = bufferedReader.readLine()) != null) {
                    index = line.toLowerCase().indexOf("uuid");
                    if (index >= 0) {
                        // Extract the GPU UUID and strip surrounding whitespace
                        result = result +
                                line.substring(index + "uuid".length() + 1).trim() + ",";
                    }
                }
                p.waitFor();
            } catch (IOException | InterruptedException e) {
                log.error("Failed to get GPU information", e);
            } finally {
                closeIoStream(bufferedReader, p);
            }
        }
        // Drop the closing parenthesis that follows each UUID in the nvidia-smi -L output
        return result.replace(")", "").trim();
    }
    private static void closeIoStream(BufferedReader br, Process pro) {
        if (br != null) {
            try {
                br.close();
            } catch (IOException e) {
                // Ignored: nothing useful can be done if closing the reader fails
            }
        }
        if (pro != null) {
            pro.destroy();
        }
    }
}
5. Additional notes
5.1 Keys that can be queried with System.getProperty(String key) in Java
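As a short illustration, the sketch below prints a handful of the standard keys defined by the JDK (os.name, os.arch, os.version, java.version, java.vendor, user.name, user.dir, file.separator). The class name SystemProps is made up for this example, and the selection of keys is not exhaustive.

```java
public class SystemProps {

    // A few of the standard system property keys defined by the JDK
    public static final String[] KEYS = {
            "os.name", "os.arch", "os.version",
            "java.version", "java.vendor",
            "user.name", "user.dir", "file.separator"
    };

    /** Returns one "key=value" line per property in KEYS. */
    public static String describe() {
        StringBuilder sb = new StringBuilder();
        for (String key : KEYS) {
            sb.append(key).append('=').append(System.getProperty(key)).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```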
5.2 Reading information with dmidecode
dmidecode can report the following kinds of hardware information: bios, system, baseboard, chassis, processor,
memory, cache, connector, slot
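5.3 GPU information
Usage status
watch -n 2 nvidia-smi // refresh every 2 seconds
GPU: GPU index; with multiple cards, numbering starts at 0
Fan: fan speed (0%-100%); N/A means the card has no fan
Name: GPU model
Temp: GPU temperature
Perf: performance state, from P0 (maximum performance) to P12 (minimum performance)
Persistence-M: persistence mode status; persistence mode uses more power, but new GPU applications start faster
Pwr:Usage/Cap: power draw; Usage is the current draw, Cap is the limit
Bus-Id: PCI bus address of the GPU, in the form domain:bus:device.function
Disp.A: Display Active, whether a display has been initialized on the GPU
Memory-Usage: GPU memory usage
Volatile GPU-Util: GPU utilization
Uncorr. ECC: ECC-related; whether error checking and correction is enabled, 0/disabled, 1/enabled
Compute M: compute mode, 0/DEFAULT, 1/EXCLUSIVE_PROCESS, 2/PROHIBITED
Processes: for each process, shows its GPU memory usage, its process ID, and the GPU it is using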
5.4 GPU list: nvidia-smi -L
root/master:~ # nvidia-smi -L
GPU 0: GeForce RTX 3090 (UUID: GPU-****32a0-****-****-****-5************)
GPU 1: GeForce RTX 3090 (UUID: GPU-****e20b-****-****-****-c36********)
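Output in this format can also be parsed with a regular expression rather than with index arithmetic. The following is a sketch under the assumption that each line ends with "(UUID: <id>)" as in the listing above; the class name GpuListParser and the sample UUIDs are placeholders.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GpuListParser {

    // Captures the value inside "(UUID: ...)" on each line of nvidia-smi -L output
    private static final Pattern UUID = Pattern.compile("\\(UUID:\\s*([^)\\s]+)\\)");

    /** Returns all GPU UUIDs found in the given command output, in order. */
    public static List<String> uuids(String output) {
        List<String> result = new ArrayList<>();
        Matcher matcher = UUID.matcher(output);
        while (matcher.find()) {
            result.add(matcher.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        // Placeholder UUIDs, shaped like the listing above
        String sample = "GPU 0: GeForce RTX 3090 (UUID: GPU-aaaa)\n"
                + "GPU 1: GeForce RTX 3090 (UUID: GPU-bbbb)\n";
        System.out.println(String.join(",", uuids(sample)));
    }
}
```

Because the pattern excludes the closing parenthesis from the capture group, no separate cleanup step is needed on the result.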