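The GPU list set through visible_device_list (setVisibleDeviceList) applies to the whole process, so initializing a second session with a different GPU list raises an error.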
GPUOptions gpus = GPUOptions.newBuilder().setVisibleDeviceList("2,3").setAllowGrowth(true).build();
ConfigProto conf = ConfigProto.newBuilder().setAllowSoftPlacement(false)
.setGpuOptions(gpus).build();
// Pass the configuration when creating the session
Session session1 = new Session(graph,conf.toByteArray());
GPUOptions gpus2 = GPUOptions.newBuilder().setVisibleDeviceList("4,5").setAllowGrowth(true).build();
ConfigProto conf2 = ConfigProto.newBuilder().setAllowSoftPlacement(false)
.setGpuOptions(gpus2).build();
// Pass the configuration when creating the session
Session session2 = new Session(graph,conf2.toByteArray());
Initializing session2 fails: a single process cannot run multiple sessions that are configured with different GPU lists.
org.tensorflow.TensorFlowException: TensorFlow device (GPU:0) is being mapped to multiple CUDA devices (0 now, and 1 previously), which is not supported. This may be the result of providing different GPU configurations (ConfigProto.gpu_options, for example different visible_device_list) when creating multiple Sessions in the same process. This is not currently supported, see https://github.com/tensorflow/tensorflow/issues/19083
The following workaround from StackOverflow solved the problem:
try {
    // Read the whole model file; available() followed by a single read() may return only part of it
    byte[] modelBytes = java.nio.file.Files.readAllBytes(java.nio.file.Paths.get(modelPath));
    Graph graph = new Graph();
    // Set the target device on the graph nodes themselves
    graph.importGraphDef(modifyGraphDef(modelBytes, gpus));
    // GPU configuration (getConfig is a helper that is not shown here)
    ConfigProto conf = getConfig(gpus);
    // Create the session on this graph
    Session s = new Session(graph, conf == null ? null : conf.toByteArray());
} catch (Exception e) {
    e.printStackTrace();
    throw e;
}
// Rewrites the serialized GraphDef so that every node is pinned to the given GPU
public static byte[] modifyGraphDef(byte[] graphDef, int gpu) throws Exception {
    GraphDef.Builder builder = GraphDef.parseFrom(graphDef).toBuilder();
    String deviceString = String.format("/gpu:%d", gpu);
    for (int i = 0; i < builder.getNodeCount(); ++i) {
        builder.getNodeBuilder(i).setDevice(deviceString);
    }
    return builder.build().toByteArray();
}
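After the model is loaded, every node in the graph is assigned a device, so each session's GPU is chosen by the graph itself rather than by a per-session visible_device_list.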
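When several models are loaded in one process this way, each one needs its own GPU index to pass to modifyGraphDef. The following is a minimal sketch of one way to plan that, assuming a simple round-robin policy. The class GpuDevicePlan, the method plan, and the model names are hypothetical and not part of the original workaround; the sketch uses no TensorFlow classes and only builds the same "/gpu:N" device strings that modifyGraphDef writes into the nodes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not from the original post): decides which device
// string each model would be pinned to before its GraphDef is rewritten.
public class GpuDevicePlan {

    // Builds a device string in the same "/gpu:N" form used by modifyGraphDef.
    static String deviceString(int gpu) {
        if (gpu < 0) {
            throw new IllegalArgumentException("gpu index must be >= 0: " + gpu);
        }
        return "/gpu:" + gpu;
    }

    // Assigns each model name to one of the given GPU indices in round-robin order.
    static Map<String, String> plan(String[] modelNames, int[] gpus) {
        if (gpus.length == 0) {
            throw new IllegalArgumentException("at least one GPU index is required");
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < modelNames.length; i++) {
            result.put(modelNames[i], deviceString(gpus[i % gpus.length]));
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed setup: three models shared across GPUs 2 and 3.
        Map<String, String> p = plan(new String[]{"modelA", "modelB", "modelC"}, new int[]{2, 3});
        p.forEach((name, device) -> System.out.println(name + " -> " + device));
    }
}
```

With a plan like this, each model's GPU index would be passed to modifyGraphDef before the graph is imported, and every session can share the same process-wide GPU configuration.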