Creating a new answer to address the new details in the question.

Do the transformation in a call to `.map`, putting your logic in a lambda that turns each input row into a new row.
// Do your data manipulation in a call to `.map` on the
// DataFrame's underlying JavaRDD. In the Java API this
// gives you back a JavaRDD<Row>, not another DataFrame.
JavaRDD<Row> rows = df.javaRDD().map(
    // This work will be spread out across all your nodes,
    // which is the real power of Spark.
    r -> {
        // I'm assuming the code you put in the question works,
        // and am just copying it here.
        // Note the explicit type parameter on .getAs.
        String colVal1 = r.<String>getAs(colName1);
        String colVal2 = r.<String>getAs(colName2);
        String[] nestedValues = new String[allCols.length];
        nestedValues[0] = colVal1;
        nestedValues[1] = colVal2;
        // ...
        // Return a single Row
        return RowFactory.create(nestedValues);
    }
);
// When you are done, pull the results back to the driver as Rows.
List<Row> localResultRows = rows.collect();
https://spark.apache.org/docs/1.6.1/api/java/org/apache/spark/sql/DataFrame.html
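One subtlety in the `RowFactory.create(nestedValues)` call above: `RowFactory.create` takes `Object... values`, and Java passes a `String[]` to an `Object...` parameter as the varargs array itself, so each element of `nestedValues` becomes its own column in the new `Row`. A minimal plain-Java sketch of that varargs behavior (no Spark needed; `makeRow` here is a hypothetical stand-in for `RowFactory.create` that just reports how many values it received):

```java
public class VarargsDemo {
    // Stand-in for RowFactory.create(Object... values):
    // returns the number of column values it was handed.
    static int makeRow(Object... values) {
        return values.length;
    }

    public static void main(String[] args) {
        String[] nestedValues = {"a", "b", "c"};
        // A String[] is assignable to Object[], so varargs receives
        // the array directly: three separate values, not one.
        System.out.println(makeRow(nestedValues));          // prints 3
        // Casting to Object forces it to be wrapped as a single value.
        System.out.println(makeRow((Object) nestedValues)); // prints 1
    }
}
```

So if you ever want the whole array stored as a single column value instead, cast it to `Object` before the call.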