Reference: https://jingyan.baidu.com/article/f96699bb163475894e3c1be4.html
- Download the Hadoop installation package
Link: https://archive.apache.org/dist/hadoop/common/
Note: I chose hadoop-2.6.5.tar.gz
- Hadoop environment variables
Set HADOOP_HOME = D:\soft\developsoft\Hadoop\hadoop-2.6.5
Append %HADOOP_HOME%\bin;%HADOOP_HOME%\sbin; to Path
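A quick way to confirm the JVM actually sees these variables before going further; a minimal sketch (the EnvCheck class name is illustrative):
// EnvCheck.java - sanity check that the Hadoop environment variables are set.
public class EnvCheck {
    public static void main(String[] args) {
        // Should print the directory configured above.
        System.out.println("HADOOP_HOME = " + System.getenv("HADOOP_HOME"));
        // On Windows the lookup is case-insensitive, so "Path" works.
        String path = System.getenv("Path");
        System.out.println("Path contains hadoop bin: "
                + (path != null && path.contains("hadoop-2.6.5\\bin")));
    }
}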
- Go to the directory D:\HadoopCilent\hadoop-2.6.5\etc\hadoop
- Edit core-site.xml; pointing it at one node (the NameNode) is enough:
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://10.111.32.165:8020</value>
    </property>
</configuration>
- Edit hdfs-site.xml:
<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>
- Edit mapred-site.xml:
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>
- Edit yarn-site.xml; look up the IP of the test cluster's ResourceManager:
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>10.111.32.166</value>
</property>
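With all four files in place, connectivity can be sanity-checked from Java; a minimal sketch, assuming the hadoop-client 2.6.5 jars from the pom below are on the classpath (the HdfsCheck class name is illustrative):
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// HdfsCheck.java - lists the HDFS root to verify the client configuration.
public class HdfsCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Same NameNode address as in core-site.xml; redundant if the
        // *-site.xml files are already on the classpath.
        conf.set("fs.defaultFS", "hdfs://10.111.32.165:8020");
        FileSystem fs = FileSystem.get(conf);
        for (FileStatus status : fs.listStatus(new Path("/"))) {
            System.out.println(status.getPath());
        }
    }
}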
- Installing winutils
- After the steps above, the following error comes up:
java.io.IOException: Could not locate executable D:\HadoopCilent\hadoop-2.6.5\bin\winutils.exe in the Hadoop binaries.
    at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:378)
    at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:393)
    at org.apache.hadoop.util.Shell.<clinit>(Shell.java:386)
    at org.apache.hadoop.util.GenericOptionsParser.preProcessForWindows(GenericOptionsParser.java:438)
    at org.apache.hadoop.util.GenericOptionsParser.parseGeneralOptions(GenericOptionsParser.java:484)
    at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:170)
    at org.apache.hadoop.util.GenericOptionsParser.<init>(GenericOptionsParser.java:153)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:64)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
    at org.apache.hadoop.fs.FsShell.main(FsShell.java:340)
19/01/17 17:07:18 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
- Download winutils from https://github.com/steveloughran/winutils. Put the winutils.exe matching your Hadoop version into D:\HadoopCilent\hadoop-2.6.5\bin, then run the following command to confirm it works:
>hadoop fs -ls /
19/01/17 17:10:17 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Found 12 items
drwxrwxrwx   - yarn   hadoop          0 2018-05-16 11:56 /app-logs
drwxr-xr-x   - hdfs   hdfs            0 2017-11-22 02:33 /apps
drwxr-xr-x   - yarn   hadoop          0 2017-11-22 02:30 /ats
drwxrwxrwx   - hdfs   hdfs            0 2018-01-03 21:42 /flume
drwxr-xr-x   - hdfs   hdfs            0 2017-11-22 02:30 /hdp
drwx------   - livy   hdfs            0 2018-03-23 11:36 /livy2-recovery
drwxr-xr-x   - hdfs   hdfs            0 2018-04-27 12:25 /log
drwxr-xr-x   - mapred hdfs            0 2017-11-22 02:30 /mapred
drwxrwxrwx   - mapred hadoop          0 2017-11-22 02:30 /mr-history
drwxrwxrwx   - spark  hadoop          0 2019-01-17 17:10 /spark2-history
drwxrwxrwx   - hdfs   hdfs            0 2019-01-17 14:45 /tmp
drwxr-xr-x   - hdfs   hdfs            0 2018-05-16 11:55 /user
- First map the IPs to hostnames: open C:\Windows\System32\drivers\etc\hosts and add:
10.111.32.165 hdp165.tmtgeo.com
10.111.32.166 hdp166.tmtgeo.com
10.111.32.168 hdp168.tmtgeo.com
- Copy a demo from here: https://hadoop.apache.org/docs/r2.7.7/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html
- pom.xml (a minimal dependency set; assuming the hadoop-client artifact matching the cluster version is all the demo needs):
<dependencies>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>2.6.5</version>
    </dependency>
</dependencies>
- Append the following to the main function (a full driver sketch follows below):
System.setProperty("hadoop.home.dir", "D:\\soft\\developsoft\\Hadoop\\hadoop-2.6.5");
System.setProperty("HADOOP_USER_NAME", "hdfs");
conf.set("mapreduce.app-submission.cross-platform", "true");
conf.set("mapreduce.job.ubertask.enable", "true");
conf.set("fs.defaultFS", "hdfs://10.111.32.165:8020");
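For context, a sketch of the complete driver these lines fit into, following the WordCount example from the MapReduce tutorial linked above (the mapper/reducer are the tutorial's; the setProperty/conf.set lines are the Windows-specific additions from this post):
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, one);
            }
        }
    }

    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private IntWritable result = new IntWritable();

        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            result.set(sum);
            context.write(key, result);
        }
    }

    public static void main(String[] args) throws Exception {
        // Windows-specific: point the client at the local Hadoop directory
        // (which holds bin\winutils.exe) and pick the user to submit as.
        System.setProperty("hadoop.home.dir", "D:\\soft\\developsoft\\Hadoop\\hadoop-2.6.5");
        System.setProperty("HADOOP_USER_NAME", "hdfs");

        // new Configuration() also picks up the *-site.xml files copied into
        // the project's resources directory (next step).
        Configuration conf = new Configuration();
        conf.set("mapreduce.app-submission.cross-platform", "true");
        conf.set("mapreduce.job.ubertask.enable", "true");
        conf.set("fs.defaultFS", "hdfs://10.111.32.165:8020");

        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));   // HDFS input dir
        FileOutputFormat.setOutputPath(job, new Path(args[1])); // HDFS output dir
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}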
- Copy the test cluster's configuration files into the project's resources directory, so new Configuration() picks them up from the classpath
- Run it and you can see:
2019-01-17 16:22:00,924 WARN [main] shortcircuit.DomainSocketFactory (DomainSocketFactory.java:<init>(117)) - The short-circuit local reads feature cannot be used because UNIX Domain sockets are not available on Windows.
2019-01-17 16:22:02,191 INFO [main] impl.TimelineClientImpl (TimelineClientImpl.java:serviceInit(292)) - Timeline service address: http://hdp166.tmtgeo.com:8188/ws/v1/timeline/
2019-01-17 16:22:02,529 INFO [main] client.RMProxy (RMProxy.java:createRMProxy(98)) - Connecting to ResourceManager at hdp166.tmtgeo.com/10.111.32.166:8050
2019-01-17 16:22:03,008 WARN [main] mapreduce.JobResourceUploader (JobResourceUploader.java:uploadFiles(64)) - Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
2019-01-17 16:22:12,404 INFO [main] input.FileInputFormat (FileInputFormat.java:listStatus(281)) - Total input paths to process : 1
2019-01-17 16:22:12,845 INFO [main] mapreduce.JobSubmitter (JobSubmitter.java:submitJobInternal(199)) - number of splits:1
2019-01-17 16:22:13,212 INFO [main] mapreduce.JobSubmitter (JobSubmitter.java:printTokens(288)) - Submitting tokens for job: job_1529566523883_6787
2019-01-17 16:22:13,728 INFO [main] impl.YarnClientImpl (YarnClientImpl.java:submitApplication(251)) - Submitted application application_1529566523883_6787
2019-01-17 16:22:13,771 INFO [main] mapreduce.Job (Job.java:submit(1301)) - The url to track the job: http://hdp166.tmtgeo.com:8088/proxy/application_1529566523883_6787/
2019-01-17 16:22:13,772 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1346)) - Running job: job_1529566523883_6787
2019-01-17 16:22:22,007 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1367)) - Job job_1529566523883_6787 running in uber mode : false
2019-01-17 16:22:22,011 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1374)) - map 0% reduce 0%
2019-01-17 16:22:29,271 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1374)) - map 100% reduce 0%
2019-01-17 16:22:36,436 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1374)) - map 100% reduce 100%
2019-01-17 16:22:38,516 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1385)) - Job job_1529566523883_6787 completed successfully
2019-01-17 16:22:38,628 INFO [main] mapreduce.Job (Job.java:monitorAndPrintJob(1392)) - Counters: 49
	File System Counters
		FILE: Number of bytes read=87
		FILE: Number of bytes written=285801
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
		HDFS: Number of bytes read=136
		HDFS: Number of bytes written=45
		HDFS: Number of read operations=6
		HDFS: Number of large read operations=0
		HDFS: Number of write operations=2
	Job Counters
		Launched map tasks=1
		Launched reduce tasks=1
		Data-local map tasks=1
		Total time spent by all maps in occupied slots (ms)=42024
		Total time spent by all reduces in occupied slots (ms)=75872
		Total time spent by all map tasks (ms)=5253
		Total time spent by all reduce tasks (ms)=4742
		Total vcore-milliseconds taken by all map tasks=5253
		Total vcore-milliseconds taken by all reduce tasks=4742
		Total megabyte-milliseconds taken by all map tasks=43032576
		Total megabyte-milliseconds taken by all reduce tasks=77692928
	Map-Reduce Framework
		Map input records=9
		Map output records=9
		Map output bytes=63
		Map output materialized bytes=87
		Input split bytes=109
		Combine input records=9
		Combine output records=9
		Reduce input groups=9
		Reduce shuffle bytes=87
		Reduce input records=9
		Reduce output records=9
		Spilled Records=18
		Shuffled Maps =1
		Failed Shuffles=0
		Merged Map outputs=1
		GC time elapsed (ms)=63
		CPU time spent (ms)=2300
		Physical memory (bytes) snapshot=2806378496
		Virtual memory (bytes) snapshot=23102570496
		Total committed heap usage (bytes)=6315048960
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters
		Bytes Read=27
	File Output Format Counters
		Bytes Written=45

Process finished with exit code 0