TensorFlow supports Java: a memory leak when using TensorFlow for Java

This post discusses a memory leak in a TensorFlow for Java code sample. The author creates a graph and a session, runs a cumulative-sum operation in a loop, and closes the tensor resources to avoid leaking memory, yet still observes memory growth. Investigation shows it is not a Java-object or JVM memory problem, but a likely missing deallocation in the JNI code. After the fix, the problem was resolved in the 1.2.0-rc1 release.

The following test code leaks memory:

// imports used (org.tensorflow, TF Java 1.x)
import org.tensorflow.DataType;
import org.tensorflow.Graph;
import org.tensorflow.Output;
import org.tensorflow.Session;
import org.tensorflow.Tensor;

private static final float[] X = new float[]{1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0,1,2,3,4,5,6,7,8,9,0};

public void testTensorFlowMemory() {
    // create a graph and session
    try (Graph g = new Graph(); Session s = new Session(g)) {
        // create a placeholder x and a const for the dimension to do a cumulative sum along
        Output x = g.opBuilder("Placeholder", "x").setAttr("dtype", DataType.FLOAT).build().output(0);
        Output dims = g.opBuilder("Const", "dims").setAttr("dtype", DataType.INT32).setAttr("value", Tensor.create(0)).build().output(0);
        Output y = g.opBuilder("Cumsum", "y").addInput(x).addInput(dims).build().output(0);

        // loop a bunch to test memory usage
        for (int i = 0; i < 10000000; i++) {
            // create a tensor from X
            Tensor tx = Tensor.create(X);

            // run the graph and fetch the resulting y tensor
            Tensor ty = s.runner().feed("x", tx).fetch("y").run().get(0);

            // close the tensors to release their resources
            tx.close();
            ty.close();
        }

        System.out.println("non-threaded test finished");
    }
}

Is there something obvious I'm doing wrong? The basic flow is to create a graph and a session on that graph, create a placeholder and a constant in order to do a cumulative sum on a tensor fed in as x. After running the resulting y operation, I close both the x and y tensors to free their memory resources.
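As an aside, Tensor implements AutoCloseable, so the same close semantics can be written with try-with-resources, which releases both tensors even if run() throws. A minimal sketch of the loop body, equivalent to the explicit close() calls above:

try (Tensor tx = Tensor.create(X);
     Tensor ty = s.runner().feed("x", tx).fetch("y").run().get(0)) {
    // both tensors are closed automatically when the block exits
}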

Things I have established so far:

This is not a Java object memory problem. The heap does not grow, and other memory in the JVM is not growing either, according to jvisualvm. Nor does it appear to be a JVM-level memory leak according to Java's Native Memory Tracking (see the commands after this list).

The close operations are helping; without them the memory grows by leaps and bounds. With them in place it still grows fairly fast, but not nearly as much as without them.

The Cumsum operator is not the cause; the leak also happens with Sum and other operators.

It happens on macOS with TF 1.1, and on CentOS 7 with TF 1.1 and 1.2.0-rc0.

Commenting out the Tensor ty lines removes the leak, so it appears to originate there.
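For reference, Native Memory Tracking is enabled with a JVM flag and then sampled with jcmd while the loop runs (summary level shown; MemoryTest and <classpath> are placeholders for the actual test launcher, and <pid> is the process id of the test JVM):

java -XX:NativeMemoryTracking=summary -cp <classpath> MemoryTest
jcmd <pid> VM.native_memory summary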

Any ideas? Thanks! Also, here's a GitHub project that demonstrates this issue with both a threaded test (to grow the memory faster) and an unthreaded test (to show that it's not due to threading). It uses Maven and can be run with a simple:

mvn test

Solution

I believe there is indeed a leak: in particular, a missing TF_DeleteStatus corresponding to an allocation in the JNI code. (Thanks for the detailed instructions to reproduce it.)

I'd encourage you to file an issue at http://github.com/tensorflow/tensorflow/issues, and hopefully it will be fixed before the final 1.2 release.

(Relatedly, you also have a leak outside the loop, since the Tensor object created by Tensor.create(0) is never closed.)
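A minimal sketch of plugging that secondary leak: because the graph copies the value when the Const op is built, the Java-side tensor can be closed immediately afterwards, e.g. with try-with-resources:

Output dims;
try (Tensor t0 = Tensor.create(0)) {
    // the graph takes its own copy of the value at build time,
    // so t0 can be released as soon as the Const op exists
    dims = g.opBuilder("Const", "dims")
            .setAttr("dtype", DataType.INT32)
            .setAttr("value", t0)
            .build()
            .output(0);
}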

UPDATE: This was fixed and 1.2.0-rc1 should no longer have this problem.
