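First, install a Java environment (Spark 2.3.0 requires Java 8).
Download and extract Spark:
wget https://archive.apache.org/dist/spark/spark-2.3.0/spark-2.3.0-bin-hadoop2.7.tgz
tar -xzf spark-2.3.0-bin-hadoop2.7.tgz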
Create the input file to parse:
mkdir /opt/input
cd /opt/input/
touch wenjian.txt
Enter the following test content in the file:
hello az
hello spark
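The file can also be filled in without an editor. The following is a minimal sketch, not part of the original steps: INPUT_DIR is a hypothetical variable that defaults to a temporary directory so the snippet can be tried without root; set it to /opt/input to match the paths used here.

```shell
# INPUT_DIR is an illustrative variable (an assumption, not from the original steps).
# Set INPUT_DIR=/opt/input to match the directory created above.
INPUT_DIR="${INPUT_DIR:-$(mktemp -d)}"
mkdir -p "$INPUT_DIR"

# Write the two test lines, one per line, replacing any previous content.
printf '%s\n' 'hello az' 'hello spark' > "$INPUT_DIR/wenjian.txt"

# Print the file to confirm what was written.
cat "$INPUT_DIR/wenjian.txt"
```

Using printf with one argument per line avoids depending on how a particular shell's echo treats escape sequences.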
Disable the firewall:
CentOS 7:
systemctl stop firewalld
CentOS 6:
service iptables stop
Start Spark (run from the extracted spark-2.3.0-bin-hadoop2.7 directory):
./bin/spark-shell
Run the WordCount program at the spark-shell prompt:
sc.textFile("/opt/input").flatMap(_.split(" ")).map((_,1)).reduceByKey(_+_).collect
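For the two-line test file, the job should return the pairs (hello,2), (az,1) and (spark,1); the order of the array returned by collect is not guaranteed. To cross-check those counts without Spark, the same steps can be sketched with standard shell tools. This is only a rough equivalent under the assumption that words are separated by single spaces, and count_words is a hypothetical helper name:

```shell
# count_words is an illustrative helper, not part of Spark.
# tr splits each line into one word per line (like flatMap(_.split(" "))),
# sort groups equal words together, and uniq -c counts each group
# (like map((_,1)) followed by reduceByKey(_+_)).
count_words() {
  tr ' ' '\n' | sort | uniq -c | awk '{print $2 "," $1}'
}

# Feed the same two test lines through the pipeline.
printf '%s\n' 'hello az' 'hello spark' | count_words
# prints:
# az,1
# hello,2
# spark,1
```

Unlike collect, this pipeline always prints the words in sorted order, so only the counts should be compared with the Spark output.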