1. SparkContext provides an API for cancelling a job
class SparkContext(config: SparkConf) extends Logging with ExecutorAllocationClient {
  /** Cancel a given job if it's scheduled or running */
  private[spark] def cancelJob(jobId: Int) {
    dagScheduler.cancelJob(jobId)
  }
}
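Note that `cancelJob` is marked `private[spark]`, so application code cannot simply call it with an arbitrary id; the jobId first has to be captured (the next section shows how) and kept somewhere a cancel request can find it. A minimal sketch of such a registry in plain Java is below; `JobRegistry` and its method names are assumptions for illustration, not Spark API:

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry (not part of Spark): maps a caller-chosen tag,
// e.g. a thread name or request id, to the jobId reported by a listener.
public class JobRegistry {
    private static final ConcurrentHashMap<String, Integer> tagToJobId =
            new ConcurrentHashMap<>();

    // Called from a listener's onJobStart once the jobId is known.
    public static void register(String tag, int jobId) {
        tagToJobId.put(tag, jobId);
    }

    // Returns the jobId to pass to cancelJob, or null if unknown.
    public static Integer lookup(String tag) {
        return tagToJobId.get(tag);
    }

    // Called from onJobEnd so finished jobs are no longer cancellable.
    public static void unregister(String tag) {
        tagToJobId.remove(tag);
    }
}
```

With a registry like this, cancelling the job submitted under `tag` reduces to looking up its id and passing it to `cancelJob` (or, if the public API is preferred, to using `SparkContext.setJobGroup` / `cancelJobGroup` instead).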
2. So how do we obtain the jobId?
Spark provides a trait called SparkListener, which exposes callbacks for Spark's scheduler events:
trait SparkListener {
  /**
   * Called when a job starts
   */
  def onJobStart(jobStart: SparkListenerJobStart) { }

  /**
   * Called when a job ends
   */
  def onJobEnd(jobEnd: SparkListenerJobEnd) { }
}
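These callbacks are the hook for capturing jobIds: `SparkListenerJobStart` carries the new job's id, along with the local properties of the thread that submitted it (anything set via `SparkContext.setLocalProperty`). The sketch below simulates that flow in plain Java with stand-in types (`SimpleJobStart` and `SimpleListener` are assumptions for illustration, not Spark classes), showing how `onJobStart` can record a tag-to-jobId mapping:

```java
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;

// Stand-in for SparkListenerJobStart: carries the jobId plus the local
// properties of the submitting thread (an assumption, not Spark code).
class SimpleJobStart {
    final int jobId;
    final Properties properties;
    SimpleJobStart(int jobId, Properties properties) {
        this.jobId = jobId;
        this.properties = properties;
    }
}

// Stand-in for a SparkListener implementation's onJobStart hook.
class SimpleListener {
    // Maps the submitting thread's tag to the job's id.
    static final ConcurrentHashMap<String, Integer> tagToJobId =
            new ConcurrentHashMap<>();

    void onJobStart(SimpleJobStart jobStart) {
        // A real listener would read a property the driver thread set
        // via sc.setLocalProperty before submitting the job.
        String tag = jobStart.properties.getProperty("job.tag");
        if (tag != null) {
            tagToJobId.put(tag, jobStart.jobId);
        }
    }
}
```

The key design point is that the map must be thread-safe (hence `ConcurrentHashMap`), because listener callbacks run on the scheduler's event thread while lookups come from the threads that want to cancel jobs.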
We therefore define a custom class that subclasses SparkListener:
public class DHSparkListener implements SparkListener {
    private static Logger logger = Logger.getLogger(DHSparkListener.class);
    // Maps the submitting thread's local property (its tag) to the job
    private static ConcurrentHashMap<String, In