Creating a file on Linux
Create the WordCount.java file (the filename must match the public class name):
- cd /home/hadoop/Documents
- touch WordCount.java
Starting and stopping Hadoop:
start-all.sh
stop-all.sh
1.WordCount
- cd ~/Documents/wordcount/
- javac -classpath /usr/local/hadoop/share/hadoop/common/hadoop-common-2.9.2.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.9.2.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-cli-1.2.jar -d ./ ./WordCount.java
- jar -cvf WordCount.jar ./*.class
- hdfs dfs -rm -r output
- hadoop jar ./WordCount.jar WordCount input output
- hdfs dfs -ls ./output
- hdfs dfs -cat output/*
- hdfs dfs -rm -r output
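The job above runs the compiled WordCount class. As a rough sketch of the map/reduce logic it implements (plain Java with no Hadoop dependencies, for illustration only — the real job splits this work across mappers and reducers):

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative sketch of the WordCount logic: the mapper emits (word, 1)
// for each token, and the reducer sums the counts per word.
public class WordCountSketch {
    public static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String token : text.split("\\s+")) {
            if (token.isEmpty()) continue;
            counts.merge(token, 1, Integer::sum); // reduce step: sum per key
        }
        return counts;
    }

    public static void main(String[] args) {
        // Output format mirrors "hdfs dfs -cat output/*": word<TAB>count
        countWords("Apache Hadoop Apache Spark")
            .forEach((w, c) -> System.out.println(w + "\t" + c));
    }
}
```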
2.MatrixMultiply
- cd ~/Documents/Matrix1/
- javac -classpath /usr/local/hadoop/share/hadoop/common/hadoop-common-2.9.2.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.9.2.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-cli-1.2.jar -d ./ ./MatrixMultiply.java
- jar -cvf MatrixMultiply.jar ./*.class
- View the generator script: sudo gedit ./genMatrix.sh
- Generate two matrices (30x50 and 50x100): ./genMatrix.sh 30 50 100
- Upload the two matrix files from the local filesystem to HDFS (only needs to be done once):
hdfs dfs -put ./M_30_50 input
hdfs dfs -put ./N_50_100 input
- hdfs dfs -rm -r output
- hadoop jar ./MatrixMultiply.jar MatrixMultiply input/M_30_50 input/N_50_100 output
- hdfs dfs -ls ./output
- hdfs dfs -cat output/*
- hdfs dfs -rm -r output
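In the MapReduce version of matrix multiplication, each output cell (i, j) becomes a reducer key that collects M[i][k] and N[k][j] pairs and sums their products over k. The computation itself is the familiar triple loop, sketched here in plain Java without Hadoop:

```java
// Sketch of the computation behind the MatrixMultiply job: the reducer
// for key (i, j) computes the sum over k of M[i][k] * N[k][j].
public class MatrixMultiplySketch {
    public static int[][] multiply(int[][] m, int[][] n) {
        int rows = m.length, inner = n.length, cols = n[0].length;
        int[][] p = new int[rows][cols];
        for (int i = 0; i < rows; i++)
            for (int j = 0; j < cols; j++)
                for (int k = 0; k < inner; k++)
                    p[i][j] += m[i][k] * n[k][j]; // reducer's sum over k
        return p;
    }

    public static void main(String[] args) {
        int[][] m = {{1, 2}, {3, 4}};
        int[][] n = {{5, 6}, {7, 8}};
        int[][] p = multiply(m, n);
        System.out.println(p[0][0] + " " + p[0][1]); // 19 22
    }
}
```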
3.InvertedIndex
- cd ~/Documents/index/
- javac -classpath /usr/local/hadoop/share/hadoop/common/hadoop-common-2.9.2.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-core-2.9.2.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-cli-1.2.jar -d ./ ./InvertedIndex.java
- jar -cvf InvertedIndex.jar ./*.class
- Create the indexinput folder (only needs to be done once):
cd /usr/local/hadoop/bin
hdfs dfs -mkdir indexinput
- Import the input files (only needs to be done once):
cd ~/Documents/index/
hdfs dfs -put ./input/file1.txt indexinput
hdfs dfs -put ./input/file2.txt indexinput
hdfs dfs -put ./input/file3.txt indexinput
hdfs dfs -put ./input/file4.txt indexinput
hdfs dfs -put ./input/file5.txt indexinput
- Import the stopword list into the input folder (only needs to be done once):
hdfs dfs -put ./stopwords.txt input
- hdfs dfs -rm -r output
- hadoop jar ./InvertedIndex.jar InvertedIndex indexinput output
- hdfs dfs -ls ./output
- hdfs dfs -cat output/*
- hdfs dfs -rm -r output
- To remove the index input when done: hdfs dfs -rm -r indexinput
- File contents:
file1:Apache Spark Scala Hadoop Java C Python Do And Will KNN
file2:SVM Scala News Play Akka Yes GBDT
file3:LDA SVM RF GBDT Adaboost Kmeans KNN
file4:QQ BAT I Great All LDA
file5:Apache Hadoop MapReduce Git SVN SVM
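An inverted index maps each word to the set of documents containing it, skipping words on the stopword list. A minimal sketch of that logic in plain Java (the document contents for file1/file2 are taken from the list above; the stopword set here is an assumption, since the actual contents of stopwords.txt are not shown):

```java
import java.util.*;

// Sketch of the InvertedIndex logic: map each word to the set of
// documents containing it, skipping words in a stopword list.
public class InvertedIndexSketch {
    public static Map<String, Set<String>> build(Map<String, String> docs,
                                                 Set<String> stopwords) {
        Map<String, Set<String>> index = new TreeMap<>();
        for (Map.Entry<String, String> doc : docs.entrySet()) {
            for (String word : doc.getValue().split("\\s+")) {
                if (word.isEmpty() || stopwords.contains(word)) continue;
                index.computeIfAbsent(word, k -> new TreeSet<>())
                     .add(doc.getKey());
            }
        }
        return index;
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("file1", "Apache Spark Scala Hadoop Java C Python Do And Will KNN");
        docs.put("file2", "SVM Scala News Play Akka Yes GBDT");
        // Hypothetical stopword set, for illustration only.
        Set<String> stopwords = new HashSet<>(Arrays.asList("Do", "And", "Will", "Yes"));
        build(docs, stopwords).forEach((w, d) -> System.out.println(w + "\t" + d));
    }
}
```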