CDH & Cloudera Manager
After installing Cloudera Distribution Hadoop (CDH) and Cloudera Manager, initialize the SCM database with the bundled scm_prepare_database.sh script:
[root@hadoop01 module]# /opt/module/cm/cm-5.12.1/share/cmf/schema/scm_prepare_database.sh mysql scm scm scm
JAVA_HOME=/usr/local/jdk
Verifying that we can write to /opt/module/cm/cm-5.12.1/etc/cloudera-scm-server
Creating SCM configuration file in /opt/module/cm/cm-5.12.1/etc/cloudera-scm-server
Executing: /usr/local/jdk/bin/java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar:/opt/module/cm/cm-5.12.1/share/cmf/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /opt/module/cm/cm-5.12.1/etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.
[ main] DbCommandExecutor INFO Successfully connected to database.
All done, your SCM database is configured correctly!
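The script records its settings in a db.properties file under the etc/cloudera-scm-server directory it reports above; if the connection test ever fails, that file is the first thing to inspect (the com.cloudera.cmf.db.* key names match the executor invocation in the log):

```shell
# Show the SCM database settings that scm_prepare_database.sh just wrote.
# The file contains the plaintext database password, so keep its permissions tight.
cat /opt/module/cm/cm-5.12.1/etc/cloudera-scm-server/db.properties
ls -l /opt/module/cm/cm-5.12.1/etc/cloudera-scm-server/db.properties
```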
[root@hadoop01 ~]# /usr/local/mysql/support-files/mysql.server start
Starting MySQL.. [ OK ]
[root@hadoop01 ~]# /opt/module/cm/cm-5.12.1/etc/init.d/cloudera-scm-server start
Starting cloudera-scm-server: [ OK ]
[root@hadoop01 ~]# /opt/module/cm/cm-5.12.1/etc/init.d/cloudera-scm-agent start
Starting cloudera-scm-agent: [ OK ]
[root@hadoop01 ~]# netstat -anp | grep 7180
tcp 0 0 0.0.0.0:7180 0.0.0.0:* LISTEN 3953/java
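Port 7180 listening does not always mean the server has finished booting; it can take a minute or two before the admin console responds. A small wait-and-check sketch (the log path is an assumption based on the cm-5.12.1 install prefix used above):

```shell
# Poll the Cloudera Manager admin console until it answers successfully.
until curl -sf -o /dev/null http://hadoop01:7180; do
  echo "waiting for cloudera-scm-server ..."
  sleep 5
done
echo "Cloudera Manager UI is up at http://hadoop01:7180"

# If it never comes up, look for database or JDK errors in the server log:
tail -n 50 /opt/module/cm/cm-5.12.1/log/cloudera-scm-server/cloudera-scm-server.log
```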
Install the base components, such as HDFS, YARN, ZooKeeper, and Hive:
Click Hosts and open hadoop01 to view its details:
Running jps on the host shows the live processes:
[root@hadoop01 ~]# jps
17570 Main
66051 DataNode
15139 QuorumPeerMain
66053 SecondaryNameNode
66756 Jps
9221 AlertPublisher
9223 EventCatcherService
66056 NameNode
9228 Main
3953 Main
17843 Main
65267 ResourceManager
65241 JobHistoryServer
65339 NodeManager
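Several of the entries above appear only as `Main` — these are Cloudera Manager's own JVMs (PID 3953 is the server, as the netstat output earlier shows, the rest are agent-supervised service processes). To tell them apart, print each PID's full command line; the PIDs below are simply the ones from this jps run:

```shell
# jps shows a generic "Main" for Cloudera Manager JVMs; the full command line
# (ps args) reveals which class/service each one actually is.
for pid in 3953 9228 17570 17843; do
  echo "== PID $pid =="
  ps -o args= -p "$pid"
done
```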
YARN
For YARN, I changed the scheduler class in the YARN Configuration to the Capacity Scheduler, set the container vcores to 2 and the container memory to 3 GB, then checked the result in the ResourceManager Web UI:
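The scheduler switch can also be confirmed from the shell through the ResourceManager REST API (hostname and port 8088 as in the job log below; the endpoint paths are the standard ones for Hadoop 2.x):

```shell
# After the change, the schedulerInfo type in the response should be
# "capacityScheduler" rather than fairScheduler/fifoScheduler.
curl -s http://hadoop01:8088/ws/v1/cluster/scheduler

# Cluster-wide totals (vcores and memory reflect the per-container settings):
curl -s http://hadoop01:8088/ws/v1/cluster/metrics
```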
The wordcount program
[root@hadoop01 ~]# sudo -u hdfs hadoop jar /opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/jars/hadoop-mapreduce-examples-2.6.0-cdh5.12.1.jar wordcount /tmp/manifest.json /tmp/test2
20/12/15 11:07:39 INFO client.RMProxy: Connecting to ResourceManager at hadoop01/172.16.235.134:8032
20/12/15 11:07:39 INFO input.FileInputFormat: Total input paths to process : 1
20/12/15 11:07:40 INFO mapreduce.JobSubmitter: number of splits:1
20/12/15 11:07:40 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1608001618553_0001
20/12/15 11:07:41 INFO impl.YarnClientImpl: Submitted application application_1608001618553_0001
20/12/15 11:07:41 INFO mapreduce.Job: The url to track the job: http://hadoop01:8088/proxy/application_1608001618553_0001/
20/12/15 11:07:41 INFO mapreduce.Job: Running job: job_1608001618553_0001
20/12/15 11:07:46 INFO mapreduce.Job: Job job_1608001618553_0001 running in uber mode : false
20/12/15 11:07:46 INFO mapreduce.Job: map 0% reduce 0%
20/12/15 11:07:52 INFO mapreduce.Job: map 100% reduce 0%
20/12/15 11:07:57 INFO mapreduce.Job: map 100% reduce 100%
20/12/15 11:07:58 INFO mapreduce.Job: Job job_1608001618553_0001 completed successfully
20/12/15 11:07:58 INFO mapreduce.Job: Counters: 49
  File System Counters
    FILE: Number of bytes read=1664
    FILE: Number of bytes written=293809
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=72415
    HDFS: Number of bytes written=2645
    HDFS: Number of read operations=6
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=2
  Job Counters
    Launched map tasks=1
    Launched reduce tasks=1
    Data-local map tasks=1
    Total time spent by all maps in occupied slots (ms)=4539
    Total time spent by all reduces in occupied slots (ms)=2489
    Total time spent by all map tasks (ms)=4539
    Total time spent by all reduce tasks (ms)=2489
    Total vcore-milliseconds taken by all map tasks=4539
    Total vcore-milliseconds taken by all reduce tasks=2489
    Total megabyte-milliseconds taken by all map tasks=4647936
    Total megabyte-milliseconds taken by all reduce tasks=2548736
  Map-Reduce Framework
    Map input records=1825
    Map output records=3077
    Map output bytes=51249
    Map output materialized bytes=1660
    Input split bytes=103
    Combine input records=3077
    Combine output records=116
    Reduce input groups=116
    Reduce shuffle bytes=1660
    Reduce input records=116
    Reduce output records=116
    Spilled Records=232
    Shuffled Maps =1
    Failed Shuffles=0
    Merged Map outputs=1
    GC time elapsed (ms)=170
    CPU time spent (ms)=2330
    Physical memory (bytes) snapshot=681234432
    Virtual memory (bytes) snapshot=5624213504
    Total committed heap usage (bytes)=617086976
  Shuffle Errors
    BAD_ID=0
    CONNECTION=0
    IO_ERROR=0
    WRONG_LENGTH=0
    WRONG_MAP=0
    WRONG_REDUCE=0
  File Input Format Counters
    Bytes Read=72312
  File Output Format Counters
    Bytes Written=2645
The wordcount job's final State is FINISHED:
The results can be viewed either from the command line or through the Web UI:
[root@hadoop01 tmp]# sudo -u hdfs hdfs dfs -ls /tmp/test2
Found 2 items
-rw-r--r-- 3 hdfs supergroup 0 2020-12-15 11:07 /tmp/test2/_SUCCESS
-rw-r--r-- 3 hdfs supergroup 2645 2020-12-15 11:07 /tmp/test2/part-r-00000
[root@hadoop01 tmp]# sudo -u hdfs hdfs dfs -cat /tmp/test2/part-r-00000
"0.11.0+cdh5.12.1+101", 10
"0.11.0-cdh5.12.1" 10
"0.12.0+cdh5.12.1+110", 10
"0.12.0-cdh5.12.1" 10
"0.7.0+cdh5.12.1+0", 10
"0.9+cdh5.12.1+34", 10
"0.9-cdh5.12.1" 10
"0.9.0+cdh5.12.1+23", 10
"0.9.0-cdh5.12.1" 10
"1.0.0+cdh5.12.1+0", 10
"1.0.0+cdh5.12.1+144", 10
"1.0.0-cdh5.12.1" 20
"1.1.0+cdh5.12.1+1197", 20
"1.1.0-cdh5.12.1" 20
"1.2.0+cdh5.12.1+365", 10
"1.2.0-cdh5.12.1" 10
"1.4.6+cdh5.12.1+113", 10
"1.4.6-cdh5.12.1" 10
"1.5+cdh5.12.1+71", 10
"1.5-cdh5.12.1" 10
"1.5.0+cdh5.12.1+187", 10
"1.5.0-cdh5.12.1" 10
"1.5.1+cdh5.12.1+329", 10
"1.5.1-cdh5.12.1" 10
"1.6.0+cdh5.12.1+166", 10
"1.6.0+cdh5.12.1+530", 10
"1.6.0-cdh5.12.1" 20
"1.99.5+cdh5.12.1+46", 10
"1.99.5-cdh5.12.1" 10
"1.cdh5.12.1.p0.3", 290
"11b2e7ae1c5c6e3f74400341995691173bcc6186" 1
"1fa91397e6d56c79401b27e46a37a1272ea525f5" 1
"1fafd87ac05de00bc5e58f4e32a74fd3a4f36032" 1
"2.6.0+cdh5.12.1+2540", 70
"2.6.0-cdh5.12.1" 70
"2.9.0+cdh5.12.1+0", 10
"2.9.0-cdh5.12.1" 10
"3.4.5+cdh5.12.1+117", 10
"3.4.5-cdh5.12.1" 10
"3.9.0+cdh5.12.1+6507", 10
"3.9.0-cdh5.12.1" 10
"3e69cd1fb314a4b8e378c65dfa97a1c2996762ed" 1
"4.1.0+cdh5.12.1+446", 10
"4.1.0-cdh5.12.1" 10
"4.10.3+cdh5.12.1+519", 10
"4.10.3-cdh5.12.1" 10
"6.0.53-cdh5.12.1" 10
"62ec2b8013e3d5d404ebc21e6ada46da8e146807" 1
"65264b0cffc79b8f241210e8ae445af4e469e81b" 1
"6db17c7caa0fcc251a100dab3f6da28b918eeb1f" 1
"8db9d71943d2f6f23badf412874911875d1e3cb7" 1
"9f521a893d1e03f5bb863c5e7365c88afaf0057d" 1
"CDH-5.12.1-1.cdh5.12.1.p0.3-el5.parcel", 1
"CDH-5.12.1-1.cdh5.12.1.p0.3-el6.parcel", 1
"CDH-5.12.1-1.cdh5.12.1.p0.3-el7.parcel", 1
"CDH-5.12.1-1.cdh5.12.1.p0.3-jessie.parcel", 1
"CDH-5.12.1-1.cdh5.12.1.p0.3-precise.parcel", 1
"CDH-5.12.1-1.cdh5.12.1.p0.3-sles11.parcel", 1
"CDH-5.12.1-1.cdh5.12.1.p0.3-sles12.parcel", 1
"CDH-5.12.1-1.cdh5.12.1.p0.3-trusty.parcel", 1
"CDH-5.12.1-1.cdh5.12.1.p0.3-wheezy.parcel", 1
"CDH-5.12.1-1.cdh5.12.1.p0.3-xenial.parcel", 1
"IMPALA, 10
"SPARK2 10
"bigtop-tomcat", 10
"components": 10
"conflicts": 10
"crunch", 10
"d4d088e5158f0e2f64ec3f2ed16b06de8f0497b8" 1
"flume-ng", 10
"hadoop", 10
"hadoop-0.20-mapreduce", 10
"hadoop-hdfs", 10
"hadoop-httpfs", 10
"hadoop-kms", 10
"hadoop-mapreduce", 10
"hadoop-yarn", 10
"hash": 10
"hbase", 10
"hbase-solr", 10
"hive", 10
"hive-hcatalog", 10
"hue", 10
"impala", 10
"kite", 10
"lastUpdated": 1
"llama", 10
"mahout", 10
"name": 290
"oozie", 10
"parcelName": 10
"parcels": 1
"parquet", 10
"pig", 10
"pkg_release": 290
"pkg_version": 290
"replaces": 10
"sentry", 10
"solr", 10
"spark", 10
"sqoop", 10
"sqoop2", 10
"version": 290
"whirr", 10
"zookeeper", 10
(<< 10
15041200430000, 1
2.0.0.cloudera2)", 10
SOLR, 10
SPARK", 10
[ 11
] 1
], 10
{ 301
} 12
}, 289
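These results can be cross-checked against the job counters above: `Reduce output records=116` means part-r-00000 should contain exactly 116 lines, and `HDFS: Number of bytes written=2645` matches the file size in the `ls` listing. A quick sanity check:

```shell
# One reducer output record per line, so this should print 116.
sudo -u hdfs hdfs dfs -cat /tmp/test2/part-r-00000 | wc -l

# And the byte count should print 2645, matching the counter and the ls size.
sudo -u hdfs hdfs dfs -cat /tmp/test2/part-r-00000 | wc -c
```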
Viewing the result in the HDFS Web UI is more intuitive:
Oozie
Create the ooziedb database in MySQL and bring up the Oozie Web UI.
[root@hadoop01 tmp]# mysql -uroot -hhadoop01 -p
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 101
Server version: 5.7.24 MySQL Community Server (GPL)
Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> create database ooziedb DEFAULT CHARSET utf8 COLLATE utf8_general_ci;
Query OK, 1 row affected (0.01 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.02 sec)
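Creating the database alone is not enough; the Oozie service also needs a MySQL account with privileges on it. A sketch, assuming a dedicated `oozie` user (the username, password, and host wildcard here are illustrative, not values from this setup):

```shell
# Grant a dedicated oozie account full access to the new database.
# MySQL 5.7 still accepts IDENTIFIED BY inside GRANT.
mysql -uroot -hhadoop01 -p <<'SQL'
GRANT ALL PRIVILEGES ON ooziedb.* TO 'oozie'@'%' IDENTIFIED BY 'oozie_password';
FLUSH PRIVILEGES;
SQL
```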
The Oozie Web UI is not enabled by default:
To enable it, follow the steps in the Cloudera documentation:
https://docs.cloudera.com/documentation/enterprise/5-12-x/topics/admin_oozie_console.html
After completing the steps in the linked doc, click Enable and restart Oozie:
The Oozie Web UI is then available:
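Once the console is enabled and Oozie restarted, the server can also be checked from the command line (11000 is Oozie's default HTTP port; adjust if your deployment differs):

```shell
# A healthy server reports "System mode: NORMAL".
oozie admin -oozie http://hadoop01:11000/oozie -status

# Or probe the REST API directly:
curl -s http://hadoop01:11000/oozie/v1/admin/status
```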