The error is as follows:
diagnostics: User class threw exception: org.apache.spark.SparkException: Job aborted due to stage failure: Task 17 in stage 15.0 failed 4 times, most recent failure: Lost task 17.3 in stage 15.0 (TID 27368, ip-10-19-204-241.ec2.internal, executor 10): com.mongodb.MongoWriteException: WiredTigerIndex::insert: key too large to index, failing 3346 { : "{"trackHostHttps":"http://gateway_budget.iymedia.me","macros":"","alisaName":"samsung","level":2,"chargeType":"nurlimpressions","trafficType":1,"tim…" }
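Solution:
In the MongoDB client Studio 3T,
run the following command:
db.yd_app_adx_traffic_data_offline.ensureIndex({"Module":"hashed"})
P.S. yd_app_adx_traffic_data_offline is the collection name; replace it with your own. A hashed index stores a hash of the field value rather than the value itself, so a long value no longer exceeds the index key size limit. ensureIndex is a deprecated alias of createIndex, which takes the same arguments and should be used on newer shells.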
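If it is not obvious which field triggers the error, a quick check on a sample of the records can help. The following is a minimal Node.js sketch, not part of any MongoDB API: findOversizedFields is a hypothetical helper, and the sample documents are made up for illustration. It assumes the 1024-byte index entry limit that MongoDB enforced before version 4.2, and it only measures the raw UTF-8 size of string values, so it is an approximation (the real index entry also carries some BSON overhead).

```javascript
// Approximate index key limit in MongoDB versions before 4.2 (assumption of this sketch).
const INDEX_KEY_LIMIT_BYTES = 1024;

// Hypothetical helper: report every string field whose UTF-8 size exceeds the limit.
function findOversizedFields(docs, limit = INDEX_KEY_LIMIT_BYTES) {
  const hits = [];
  docs.forEach((doc, docIndex) => {
    for (const [field, value] of Object.entries(doc)) {
      if (typeof value !== "string") continue; // only string values are measured here
      const bytes = Buffer.byteLength(value, "utf8");
      if (bytes > limit) hits.push({ docIndex, field, bytes });
    }
  });
  return hits;
}

// Made-up sample: the first document has an oversized "Module" value.
const sample = [
  { Module: "x".repeat(3346), level: 2 },
  { Module: "short", level: 2 },
];
console.log(findOversizedFields(sample));
```

Any field reported here that also has a regular index on it is a candidate for a hashed index instead.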
Reference:
https://www.cnblogs.com/timelesszhuang/p/6501847.html