11.13 Running the WordCount Program on Hadoop YARN in Client Mode
This section uses spark-submit to run the WordCount program on Hadoop YARN.
11.13.1 Run the WordCount program on Hadoop YARN
cd ~/pythonwork/PythonProject
HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop spark-submit --driver-memory 512m --executor-cores 2 --master yarn --deploy-mode client WordCount.py
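The flags in the spark-submit call can be easier to review when the invocation is assembled in a script. The following is a minimal sketch and not part of the original project: the helper name `build_yarn_client_submit` is hypothetical, and it only builds the argument list and environment without launching anything.

```python
import os
import shlex


def build_yarn_client_submit(script="WordCount.py",
                             hadoop_conf_dir="/usr/local/hadoop/etc/hadoop"):
    """Return (command, environment) mirroring the spark-submit call above.

    Hypothetical helper for illustration only.
    """
    # spark-submit finds the YARN ResourceManager and HDFS through the
    # configuration files under HADOOP_CONF_DIR.
    env = dict(os.environ, HADOOP_CONF_DIR=hadoop_conf_dir)
    cmd = [
        "spark-submit",
        "--driver-memory", "512m",   # memory for the driver process
        "--executor-cores", "2",     # cores per executor
        "--master", "yarn",          # let YARN schedule the executors
        "--deploy-mode", "client",   # driver runs on the submitting machine
        script,
    ]
    return cmd, env


cmd, env = build_yarn_client_submit()
# Print the command for inspection; nothing is executed here.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

To actually submit the job, the pair could be passed to `subprocess.run(cmd, env=env, check=True)` from the project directory, assuming spark-submit is on the PATH.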
11.13.2 List the directory created on HDFS after the run completes
hadoop fs -ls /user/hduser/data/output
11.13.3 View the file created on HDFS after the run completes
hadoop fs -cat /user/hduser/data/output/part-00000 |more
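Paging through the part file shows the raw counts in no particular order. The sketch below ranks them instead. It assumes that WordCount.py saved an RDD of (word, count) tuples with saveAsTextFile, so each line looks like `('word', 3)`; the source does not show the script, so that line format, the function names, and the sample data are all assumptions.

```python
import ast


def parse_count_line(line):
    """Parse one line such as "('spark', 3)" into a (word, count) tuple."""
    word, count = ast.literal_eval(line.strip())
    return word, int(count)


def top_words(lines, n=10):
    """Return the n most frequent (word, count) pairs from output lines."""
    # Skip blank lines, then sort by count in descending order.
    pairs = [parse_count_line(line) for line in lines if line.strip()]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)[:n]


# Hypothetical sample lines, not real program output.
sample = ["('spark', 3)", "('hadoop', 5)", "('yarn', 1)"]
print(top_words(sample, n=2))  # [('hadoop', 5), ('spark', 3)]
```

In practice the lines could come from the `hadoop fs -cat` output, for example by reading `sys.stdin` when the command is piped into a script.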
11.13.4 View the WordCount application in the Hadoop web interface
URL: http://master:8088/
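The same port also serves the YARN ResourceManager REST API, so the application list can be checked from a script as well as in the browser. The sketch below is an illustration under assumptions: the function names are hypothetical, the sample payload is made up to match the shape of the `/ws/v1/cluster/apps` response, and the application name `WordCount` depends on what the script sets as its app name.

```python
import json
import urllib.request


def fetch_apps(base_url="http://master:8088"):
    """Download the application list from the ResourceManager REST API."""
    request = urllib.request.Request(
        base_url + "/ws/v1/cluster/apps",
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def summarize_apps(payload):
    """Return (id, name, state, finalStatus) for each listed application."""
    # "apps" is null when the cluster has no applications to report.
    apps = (payload.get("apps") or {}).get("app") or []
    return [
        (app.get("id"), app.get("name"), app.get("state"), app.get("finalStatus"))
        for app in apps
    ]


# Hypothetical payload for illustration; not real cluster output.
sample = {"apps": {"app": [{
    "id": "application_0000000000000_0001",
    "name": "WordCount",
    "state": "FINISHED",
    "finalStatus": "SUCCEEDED",
}]}}
for row in summarize_apps(sample):
    print(row)
```

On a running cluster, `summarize_apps(fetch_apps())` would list the real applications, assuming the host name `master` resolves from the machine running the script.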
11.14 Running the WordCount Program on a Spark Standalone Cluster
11.14.1 Delete the existing output directory
hadoop fs -rm -R /user/hduser/data/output
11.14.2 Start the Standalone Cluster
/usr/local/spark/sbin/start-all.sh
11.14.3 Run the WordCount program on the Spark Standalone Cluster
cd ~/pythonwork/PythonProject
spark-submit --master spark://master:7077 --deploy-mode client --executor-memory 500m --total-executor-cores 2 WordCount.py
11.14.4 View the output directory after the program runs
hadoop fs -ls /user/hduser/data/output