When reading a file on HDFS from a Java Spark application, you may hit an UnknownHostException. This usually means the client cannot resolve the HDFS nameservice (or namenode host) used in the path, because the cluster's core-site.xml and hdfs-site.xml are not on its configuration path. Ways to fix it:
Method 1:
Copy core-site.xml and hdfs-site.xml into the project's resources directory (e.g. src/main/resources) so that they land on the classpath and are picked up automatically.
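With those files on the classpath, an HDFS path that uses the nameservice they define resolves without further setup. A minimal sketch, assuming a nameservice named mycluster and a made-up input path:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

SparkConf sparkConf = new SparkConf().setAppName("spark-demo").setMaster("local");
JavaSparkContext sc = new JavaSparkContext(sparkConf);
// "mycluster" stands in for the dfs.nameservices value from the copied hdfs-site.xml
JavaRDD<String> lines = sc.textFile("hdfs://mycluster/tmp/input.txt");
System.out.println(lines.count());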
Method 2:
// prefixConf must include the scheme file://, e.g. file:///home/work/hadoop
// addDefaultResource is static: the files become defaults for every Configuration instance,
// including the one Spark creates internally
Configuration configuration = new Configuration();
configuration.addDefaultResource(prefixConf + "/core-site.xml");
configuration.addDefaultResource(prefixConf + "/hdfs-site.xml");
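Since the files are now part of the Configuration defaults, a quick way to confirm the nameservice resolves is to open a FileSystem with that configuration. A minimal sketch, again assuming the nameservice mycluster:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Fails with an UnknownHostException for "mycluster" if hdfs-site.xml was not picked up
FileSystem fs = FileSystem.get(URI.create("hdfs://mycluster/"), configuration);
System.out.println(fs.exists(new Path("/tmp")));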
Method 3:
String prefixConf = "file://" + hdfsConf; // hdfsConf: local directory holding the Hadoop config files
String master = "local";
SparkConf sparkConf = new SparkConf();
sparkConf.setAppName("spark-demo");
sparkConf.setMaster(master);
JavaSparkContext sc = new JavaSparkContext(sparkConf);
// The scheme (e.g. file://) is mandatory here; without it the resource is not picked up
sc.hadoopConfiguration().addResource(prefixConf + "/core-site.xml");
sc.hadoopConfiguration().addResource(prefixConf + "/hdfs-site.xml");
Method 4:
String master = "local";
SparkConf sparkConf = new SparkConf();
sparkConf.setAppName("spark-demo");
sparkConf.setMaster(master);
JavaSparkContext sc = new JavaSparkContext(sparkConf);
// Passing a java.net.URL avoids building the file:// prefix by hand;
// note that toURL() declares MalformedURLException, so handle or declare it
sc.hadoopConfiguration().addResource(new File(hdfsConf + "/core-site.xml").toURI().toURL());
sc.hadoopConfiguration().addResource(new File(hdfsConf + "/hdfs-site.xml").toURI().toURL());
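With either of the last two variants, the context's Hadoop configuration knows the nameservice, so the read that originally failed should now go through. A short usage sketch continuing from the sc above, where the mycluster nameservice and the input path are assumptions:

JavaRDD<String> lines = sc.textFile("hdfs://mycluster/tmp/input.txt");
System.out.println(lines.count()); // no UnknownHostException once the config files are loaded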