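I am looking for Java code (or a library) that computes the earth mover's distance (EMD) between two histograms. The computation can be direct or indirect (e.g. using the Hungarian algorithm). I found several implementations in C/C++ (e.g. "Fast and Robust Earth Mover's Distances"), but I would like to know whether a Java version is available.
I will use the EMD computation to evaluate the method given in this paper, in the context of a scientific project I am working on.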
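Update
Using various resources, I reckon the code below should do the job. determineMinCostAssignment computes the optimal assignment as determined by the Hungarian algorithm. For this I will use the code from http://konstantinosnedas.com/dev/soft/munkres.htm
My main concern is the computed flow: I am not sure whether it is correct. Can someone confirm that it is?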
import java.security.InvalidParameterException;

/**
 * Determines the Earth Mover's Distance between two histograms, assuming an equal distance between adjacent buckets of a histogram.
 * The distance between two buckets is the absolute difference of their indexes, capped at the threshold.
 *
 * @param threshold
 *            The maximum distance to use between two buckets.
 */
public static double determineEarthMoversDistance(double[] histogram1, double[] histogram2, int threshold) {
    if (histogram1.length != histogram2.length)
        throw new InvalidParameterException("Each histogram must have the same number of elements");

    // Ground distance between bucket i and bucket j: |i - j|, saturated at the threshold.
    double[][] groundDistances = new double[histogram1.length][histogram2.length];
    for (int i = 0; i < histogram1.length; ++i) {
        for (int j = 0; j < histogram2.length; ++j) {
            int absDiff = Math.abs(i - j);
            groundDistances[i][j] = Math.min(absDiff, threshold);
        }
    }

    // Each row of the result is a pair {index in histogram1, index in histogram2}.
    int[][] assignment = determineMinCostAssignment(groundDistances);

    double costSum = 0, flowSum = 0;
    for (int i = 0; i < assignment.length; i++) {
        double cost = groundDistances[assignment[i][0]][assignment[i][1]];
        // Flow is taken as the mass of the destination bucket; this is the part I am unsure about.
        double flow = histogram2[assignment[i][1]];
        costSum += cost * flow;
        flowSum += flow;
    }
    // Normalise the total work by the total flow.
    return costSum / flowSum;
}
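To cross-check the result, here is a minimal sketch of a closed form that, as far as I understand, only applies to a special case: no threshold, ground distance |i - j|, and the same positive total mass in both histograms. Under those assumptions the EMD reduces to the sum of absolute differences of the running sums, divided by the total mass. The class name EmdSanityCheck and the method emd1D are hypothetical names chosen for illustration; they are not part of the code above or of any library.

```java
final class EmdSanityCheck {
    private EmdSanityCheck() {}

    /**
     * 1-D EMD with ground distance |i - j| and no threshold.
     * Assumption: both histograms have the same positive total mass.
     */
    static double emd1D(double[] h1, double[] h2) {
        if (h1.length != h2.length)
            throw new IllegalArgumentException("Each histogram must have the same number of elements");

        double total = 0;   // total mass of h1 (assumed equal to that of h2)
        double carried = 0; // surplus mass that has to cross from bucket i to bucket i + 1
        double work = 0;    // accumulated mass * distance
        for (int i = 0; i < h1.length; i++) {
            total += h1[i];
            carried += h1[i] - h2[i];
            work += Math.abs(carried);
        }
        // Normalise by the total flow, as in the code above.
        return work / total;
    }
}
```

With these assumptions, EmdSanityCheck.emd1D(new double[] {1, 0, 0}, new double[] {0, 0, 1}) gives 2.0, since one unit of mass moves two buckets. Once a threshold smaller than the histogram length is used, the two values may no longer match, so this only checks the unthresholded case.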