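I. What is AHP
RFM segments customers into value groups, but it does not differentiate the customers inside a group. AHP fills that gap: it scores the customers within each group so that they can be told apart by value.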
For background on AHP, see https://baike.baidu.com/item/%E5%B1%82%E6%AC%A1%E5%88%86%E6%9E%90%E6%B3%95/1672?fr=aladdin and https://tellyouwhat.cn/p/ahp-users-value-score/
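AHP stands for the analytic hierarchy process.
Here it is used to compute an AHP score for every user and then to rank customers within each group produced by RFM clustering. The method has three steps: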
1. Build the hierarchy model.
2. Construct the pairwise comparison matrix.
3. Compute the weight vector and run a consistency check.
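The steps above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the code from the linked repository: the class `AhpWeights`, its method names, and the pairwise judgments in `main` are hypothetical. It approximates the weight vector by normalizing each column and averaging each row (a common approximation to the principal eigenvector), and it uses the commonly cited Saaty random index values for the consistency ratio, where a CR below 0.1 is usually treated as acceptable.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of the original project.
class AhpWeights {
    // Commonly cited Saaty random index (RI), indexed by matrix order n (supports n up to 9).
    static final double[] RANDOM_INDEX = {0.0, 0.0, 0.0, 0.58, 0.90, 1.12, 1.24, 1.32, 1.41, 1.45};

    /** Approximates the weight vector: normalize each column to sum to 1, then average each row. */
    static double[] weights(double[][] a) {
        int n = a.length;
        double[] colSum = new double[n];
        for (double[] row : a) {
            for (int j = 0; j < n; j++) {
                colSum[j] += row[j];
            }
        }
        double[] w = new double[n];
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                w[i] += a[i][j] / colSum[j];
            }
            w[i] /= n;
        }
        return w;
    }

    /** Estimates lambda_max as the mean of (A w)_i / w_i, then returns CR = CI / RI with CI = (lambda_max - n) / (n - 1). */
    static double consistencyRatio(double[][] a, double[] w) {
        int n = a.length;
        double lambdaMax = 0.0;
        for (int i = 0; i < n; i++) {
            double aw = 0.0;
            for (int j = 0; j < n; j++) {
                aw += a[i][j] * w[j];
            }
            lambdaMax += aw / w[i];
        }
        lambdaMax /= n;
        double ci = (lambdaMax - n) / (n - 1);
        double ri = RANDOM_INDEX[n];
        return ri == 0.0 ? 0.0 : ci / ri; // matrices of order 1 or 2 are always consistent
    }

    public static void main(String[] args) {
        // Illustrative judgments only: M is twice as important as F, and F twice as important as R.
        double[][] pairwise = {
            {1.0, 0.5, 0.25}, // R compared with R, F, M
            {2.0, 1.0, 0.5},  // F compared with R, F, M
            {4.0, 2.0, 1.0}   // M compared with R, F, M
        };
        double[] w = weights(pairwise);
        double cr = consistencyRatio(pairwise, w);
        System.out.println("weights (R, F, M) = " + Arrays.toString(w));
        System.out.println("CR = " + cr + (cr < 0.1 ? " (acceptable)" : " (revise the judgments)"));
    }
}
```

The example matrix is perfectly consistent, so the weights come out in the ratio 1:2:4 and the CR is approximately zero. With real, slightly inconsistent judgments the same code reports a positive CR.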
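Goals:
Rank the customers that RFM places in the same value group.
Use the R, F, and M indicators from the RFM model.
Compute an AHP score for every user, and rank customers of the same value group by that score.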
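The scoring and ranking goal can be illustrated without Spark. The sketch below is an assumption-laden toy, not the author's implementation: the class `AhpRankingSketch`, the `Customer` type, the weights `{0.2, 0.3, 0.5}`, and the four sample customers are all made up for the example. It computes each score as the dot product of the weight vector and the min-max scaled (R, F, M) vector, then sorts each cluster by descending score.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical names and data, for illustration only.
class AhpRankingSketch {
    /** One customer: id, cluster label from the RFM step, and min-max scaled (R, F, M). */
    static final class Customer {
        final int id;
        final int cluster;
        final double[] scaledRfm;
        double score;

        Customer(int id, int cluster, double[] scaledRfm) {
            this.id = id;
            this.cluster = cluster;
            this.scaledRfm = scaledRfm;
        }
    }

    /** AHP score = dot product of the weight vector and the scaled RFM vector. */
    static double ahpScore(double[] scaledRfm, double[] weights) {
        double score = 0.0;
        for (int i = 0; i < weights.length; i++) {
            score += weights[i] * scaledRfm[i];
        }
        return score;
    }

    /** Scores every customer, groups by cluster, and sorts each group by descending score. */
    static Map<Integer, List<Customer>> rankWithinClusters(List<Customer> customers, double[] weights) {
        Map<Integer, List<Customer>> byCluster = new TreeMap<>();
        for (Customer c : customers) {
            c.score = ahpScore(c.scaledRfm, weights);
            byCluster.computeIfAbsent(c.cluster, k -> new ArrayList<>()).add(c);
        }
        for (List<Customer> group : byCluster.values()) {
            group.sort(Comparator.comparingDouble((Customer c) -> c.score).reversed());
        }
        return byCluster;
    }

    public static void main(String[] args) {
        double[] weights = {0.2, 0.3, 0.5}; // assumed weights for R, F, M
        List<Customer> customers = Arrays.asList(
            new Customer(1, 1, new double[]{0.1, 0.2, 0.9}),
            new Customer(2, 1, new double[]{0.0, 0.5, 0.4}),
            new Customer(3, 0, new double[]{0.5, 0.1, 0.1}),
            new Customer(4, 1, new double[]{0.2, 0.9, 0.8}));
        for (Map.Entry<Integer, List<Customer>> e : rankWithinClusters(customers, weights).entrySet()) {
            int rank = 1;
            for (Customer c : e.getValue()) {
                System.out.println("cluster=" + e.getKey() + " rank=" + rank++ + " id=" + c.id + " score=" + c.score);
            }
        }
    }
}
```

In this toy data, customer 4 ranks first in cluster 1, followed by customers 1 and 2. Unlike a SQL `rank()` window function, the plain sort here does not assign equal ranks to tied scores.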
II. Data
The data comes from: Spark RFM customer value segmentation (https://www.cnblogs.com/little-horse/p/14014812.html)
III. Code (Spark 3.0, Java 1.8)
For the full code, see AHP customer value scoring (https://github.com/jiangnanboy/spark_tutorial)
// Imports needed by the method below (it is an excerpt from a larger class)
import java.util.List;
import org.apache.spark.api.java.function.MapFunction;
import org.apache.spark.ml.linalg.SQLDataTypes;
import org.apache.spark.ml.linalg.Vector;
import org.apache.spark.ml.linalg.Vectors;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.catalyst.encoders.RowEncoder;
import org.apache.spark.sql.expressions.Window;
import org.apache.spark.sql.functions;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.Metadata;
import org.apache.spark.sql.types.StructField;
import org.apache.spark.sql.types.StructType;
import static org.apache.spark.sql.functions.col;
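/**
 * RFM clustering splits users into groups such as high-value, ordinary, and low-value users.
 * To order the users inside one group, each user's final score is computed from the AHP weight vector:
 * the AHP score is the dot product of the user's min-max scaled RFM vector and the weight vector.
 * @param dataset the data after RFM clustering
 * @param weightVector the AHP weight vector, applied index by index to the scaled RFM features
 */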
public static void ahpScore(Dataset<Row> dataset, List<Double> weightVector) {
    /*
     * Compute each user's AHP score:
     * +----------+------------------+--------------------+----------+--------------------+
     * |customerid|          features|      scaledfeatures|prediction|            ahpscore|
     * +----------+------------------+--------------------+----------+--------------------+
     * |     12940| [46.0,4.0,876.29]|[0.12332439678284...|         1|0.024241021827781713|
     * |     13285|[23.0,4.0,2709.12]|[0.06166219839142...|         1|0.023847531248595018|
     * |     13623| [30.0,7.0,672.44]|[0.08042895442359...|         1|0.024049650279212683|
     * |     13832|  [17.0,2.0,40.95]|[0.04557640750670...|         1|0.014321280782467466|
     * |     14450|[180.0,3.0,483.25]|[0.48257372654155...|         0| 0.04870738944845504|
     * +----------+------------------+--------------------+----------+--------------------+
     */
    dataset = dataset.map((MapFunction<Row, Row>) row -> {
        int customerID = row.getInt(0);
        Vector featureVec = (Vector) row.get(1);
        Vector scaledFeatureVec = (Vector) row.get(2);
        int prediction = row.getInt(3);
        // AHP score = dot product of the weight vector and the scaled RFM vector
        double score = 0.0;
        for (int i = 0; i < weightVector.size(); i++) {
            score += weightVector.get(i) * scaledFeatureVec.apply(i);
        }
        return RowFactory.create(
                customerID,
                Vectors.dense(new double[]{featureVec.apply(0), featureVec.apply(1), featureVec.apply(2)}),
                Vectors.dense(new double[]{scaledFeatureVec.apply(0), scaledFeatureVec.apply(1), scaledFeatureVec.apply(2)}),
                prediction,
                score);
    }, RowEncoder.apply(new StructType(new StructField[]{
            new StructField("customerid", DataTypes.IntegerType, false, Metadata.empty()), // user id
            new StructField("features", SQLDataTypes.VectorType(), false, Metadata.empty()), // RFM feature vector
            new StructField("scaledfeatures", SQLDataTypes.VectorType(), false, Metadata.empty()), // RFM feature vector after min-max scaling
            new StructField("prediction", DataTypes.IntegerType, false, Metadata.empty()), // predicted value group of the user
            new StructField("ahpscore", DataTypes.DoubleType, false, Metadata.empty()) // value score of the user
    })));
    /*
     * Rank users within the same value group by ahpscore:
     * +----------+--------------------+--------------------+----------+------------------+----+
     * |customerid|            features|      scaledfeatures|prediction|          ahpscore|rank|
     * +----------+--------------------+--------------------+----------+------------------+----+
     * |     14646|[1.0,77.0,279489.02]|[0.00268096514745...|         1|0.7306140418787522|   1|
     * |     18102|[0.0,62.0,256438.49]|[0.0,0.2469635627...|         1|0.6609787921304062|   2|
     * |     14911|[1.0,248.0,132572...|[0.00268096514745...|         1|0.5933314030496094|   3|
     * |     17450|[8.0,55.0,187482.17]|[0.02144772117962...|         1|0.4982050472344627|   4|
     * |     14156|[9.0,66.0,113384.14]|[0.02412868632707...|         1|0.3430011157923704|   5|
     * +----------+--------------------+--------------------+----------+------------------+----+
     */
    dataset = dataset.withColumn("rank",
            functions.rank().over(Window.partitionBy("prediction").orderBy(col("ahpscore").desc())));
    dataset.show(5);
}