CDH Hands-On Notes

CDH & Cloudera Manager

After installing Cloudera Distribution Hadoop (CDH) and Cloudera Manager, the directory structure looks like this:

[screenshot: Cloudera Manager installation path]

Next, initialize the SCM database:

[root@hadoop01 module]# /opt/module/cm/cm-5.12.1/share/cmf/schema/scm_prepare_database.sh mysql scm scm scm

JAVA_HOME=/usr/local/jdk

Verifying that we can write to /opt/module/cm/cm-5.12.1/etc/cloudera-scm-server

Creating SCM configuration file in /opt/module/cm/cm-5.12.1/etc/cloudera-scm-server

Executing:  /usr/local/jdk/bin/java -cp /usr/share/java/mysql-connector-java.jar:/usr/share/java/oracle-connector-java.jar:/opt/module/cm/cm-5.12.1/share/cmf/schema/../lib/* com.cloudera.enterprise.dbutil.DbCommandExecutor /opt/module/cm/cm-5.12.1/etc/cloudera-scm-server/db.properties com.cloudera.cmf.db.

[                          main] DbCommandExecutor              INFO  Successfully connected to database.

All done, your SCM database is configured correctly!
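For reference, the script's positional arguments are database type, database name, database user, and password, so the run above targets a MySQL database named scm as user scm with password scm. A hedged sketch of the same call against a remote database host (the -h flag and hostname here are illustrative, not from the original run):

# scm_prepare_database.sh [options] <db-type> <db-name> <db-user> [<db-password>]
/opt/module/cm/cm-5.12.1/share/cmf/schema/scm_prepare_database.sh -h hadoop01 mysql scm scm scm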

[root@hadoop01 ~]# /usr/local/mysql/support-files/mysql.server start

Starting MySQL..                                           [  OK  ]

[root@hadoop01 ~]# /opt/module/cm/cm-5.12.1/etc/init.d/cloudera-scm-server start

Starting cloudera-scm-server:                              [  OK  ]

[root@hadoop01 ~]# /opt/module/cm/cm-5.12.1/etc/init.d/cloudera-scm-agent start

Starting cloudera-scm-agent:                               [  OK  ]
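If either daemon fails to come up, check the server log first. A sketch, assuming the default log location of a tarball-based install (adjust the path if your layout differs):

# Assumed log path for a tarball-based CM install.
tail -f /opt/module/cm/cm-5.12.1/log/cloudera-scm-server/cloudera-scm-server.log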

[root@hadoop01 ~]# netstat -anp | grep 7180

tcp        0      0 0.0.0.0:7180            0.0.0.0:*               LISTEN      3953/java 
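Port 7180 is the Cloudera Manager web console. Once it is listening, a quick reachability check (assuming hadoop01 resolves from where you run it):

curl -I http://hadoop01:7180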

Install the base components, such as HDFS, YARN, ZooKeeper, and Hive:

Click Hosts and open hadoop01 to view the host's details.

Running jps on the host shows the processes that are up; note that PID 3953 (Main) is the cloudera-scm-server process we just saw listening on port 7180:

[root@hadoop01 ~]# jps

17570 Main

66051 DataNode

15139 QuorumPeerMain

66053 SecondaryNameNode

66756 Jps

9221 AlertPublisher

9223 EventCatcherService

66056 NameNode

9228 Main

3953 Main

17843 Main

65267 ResourceManager

65241 JobHistoryServer

65339 NodeManager

YARN

For YARN, I changed the scheduler class in the YARN configuration to the Capacity Scheduler, set the container vcores to 2 and the container memory to 3 GB, then checked the result in the ResourceManager Web UI.
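These Cloudera Manager settings correspond to standard YARN properties. A hedged sketch of the equivalent yarn-site.xml entries (the property names are the stock Hadoop ones; the values mirror the changes above):

<!-- Use the Capacity Scheduler -->
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>

<!-- Container vcores per NodeManager -->
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>2</value>
</property>

<!-- Container memory per NodeManager: 3 GB -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>3072</value>
</property>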


The wordcount example
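The input is a manifest.json already present in HDFS. If you need to stage your own input first, something along these lines would do (the local source path is hypothetical; the original run does not show this step):

# Hypothetical staging step.
sudo -u hdfs hdfs dfs -put ./manifest.json /tmp/manifest.json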

[root@hadoop01 ~]# sudo -u hdfs hadoop jar /opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/jars/hadoop-mapreduce-examples-2.6.0-cdh5.12.1.jar wordcount /tmp/manifest.json /tmp/test2

20/12/15 11:07:39 INFO client.RMProxy: Connecting to ResourceManager at hadoop01/172.16.235.134:8032

20/12/15 11:07:39 INFO input.FileInputFormat: Total input paths to process : 1

20/12/15 11:07:40 INFO mapreduce.JobSubmitter: number of splits:1

20/12/15 11:07:40 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1608001618553_0001

20/12/15 11:07:41 INFO impl.YarnClientImpl: Submitted application application_1608001618553_0001

20/12/15 11:07:41 INFO mapreduce.Job: The url to track the job: http://hadoop01:8088/proxy/application_1608001618553_0001/

20/12/15 11:07:41 INFO mapreduce.Job: Running job: job_1608001618553_0001

20/12/15 11:07:46 INFO mapreduce.Job: Job job_1608001618553_0001 running in uber mode : false

20/12/15 11:07:46 INFO mapreduce.Job:  map 0% reduce 0%

20/12/15 11:07:52 INFO mapreduce.Job:  map 100% reduce 0%

20/12/15 11:07:57 INFO mapreduce.Job:  map 100% reduce 100%

20/12/15 11:07:58 INFO mapreduce.Job: Job job_1608001618553_0001 completed successfully

20/12/15 11:07:58 INFO mapreduce.Job: Counters: 49

 File System Counters

  FILE: Number of bytes read=1664

  FILE: Number of bytes written=293809

  FILE: Number of read operations=0

  FILE: Number of large read operations=0

  FILE: Number of write operations=0

  HDFS: Number of bytes read=72415

  HDFS: Number of bytes written=2645

  HDFS: Number of read operations=6

  HDFS: Number of large read operations=0

  HDFS: Number of write operations=2

 Job Counters

  Launched map tasks=1

  Launched reduce tasks=1

  Data-local map tasks=1

  Total time spent by all maps in occupied slots (ms)=4539

  Total time spent by all reduces in occupied slots (ms)=2489

  Total time spent by all map tasks (ms)=4539

  Total time spent by all reduce tasks (ms)=2489

  Total vcore-milliseconds taken by all map tasks=4539

  Total vcore-milliseconds taken by all reduce tasks=2489

  Total megabyte-milliseconds taken by all map tasks=4647936

  Total megabyte-milliseconds taken by all reduce tasks=2548736

 Map-Reduce Framework

  Map input records=1825

  Map output records=3077

  Map output bytes=51249

  Map output materialized bytes=1660

  Input split bytes=103

  Combine input records=3077

  Combine output records=116

  Reduce input groups=116

  Reduce shuffle bytes=1660

  Reduce input records=116

  Reduce output records=116

  Spilled Records=232

  Shuffled Maps =1

  Failed Shuffles=0

  Merged Map outputs=1

  GC time elapsed (ms)=170

  CPU time spent (ms)=2330

  Physical memory (bytes) snapshot=681234432

  Virtual memory (bytes) snapshot=5624213504

  Total committed heap usage (bytes)=617086976

 Shuffle Errors

  BAD_ID=0

  CONNECTION=0

  IO_ERROR=0

  WRONG_LENGTH=0

  WRONG_MAP=0

  WRONG_REDUCE=0

 File Input Format Counters

  Bytes Read=72312

 File Output Format Counters

  Bytes Written=2645


In the ResourceManager Web UI, the wordcount application's final State shows FINISHED.

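The final state can also be confirmed from the CLI, using the application id printed in the job log above:

yarn application -status application_1608001618553_0001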

The wordcount output can be inspected from the command line or from the Web UI:

[root@hadoop01 tmp]# sudo -u hdfs hdfs dfs -ls /tmp/test2

Found 2 items

-rw-r--r--   3 hdfs supergroup          0 2020-12-15 11:07 /tmp/test2/_SUCCESS

-rw-r--r--   3 hdfs supergroup       2645 2020-12-15 11:07 /tmp/test2/part-r-00000

[root@hadoop01 tmp]# sudo -u hdfs hdfs dfs -cat /tmp/test2/part-r-00000

"0.11.0+cdh5.12.1+101", 10

"0.11.0-cdh5.12.1" 10

"0.12.0+cdh5.12.1+110", 10

"0.12.0-cdh5.12.1" 10

"0.7.0+cdh5.12.1+0", 10

"0.9+cdh5.12.1+34", 10

"0.9-cdh5.12.1" 10

"0.9.0+cdh5.12.1+23", 10

"0.9.0-cdh5.12.1" 10

"1.0.0+cdh5.12.1+0", 10

"1.0.0+cdh5.12.1+144", 10

"1.0.0-cdh5.12.1" 20

"1.1.0+cdh5.12.1+1197", 20

"1.1.0-cdh5.12.1" 20

"1.2.0+cdh5.12.1+365", 10

"1.2.0-cdh5.12.1" 10

"1.4.6+cdh5.12.1+113", 10

"1.4.6-cdh5.12.1" 10

"1.5+cdh5.12.1+71", 10

"1.5-cdh5.12.1" 10

"1.5.0+cdh5.12.1+187", 10

"1.5.0-cdh5.12.1" 10

"1.5.1+cdh5.12.1+329", 10

"1.5.1-cdh5.12.1" 10

"1.6.0+cdh5.12.1+166", 10

"1.6.0+cdh5.12.1+530", 10

"1.6.0-cdh5.12.1" 20

"1.99.5+cdh5.12.1+46", 10

"1.99.5-cdh5.12.1" 10

"1.cdh5.12.1.p0.3", 290

"11b2e7ae1c5c6e3f74400341995691173bcc6186" 1

"1fa91397e6d56c79401b27e46a37a1272ea525f5" 1

"1fafd87ac05de00bc5e58f4e32a74fd3a4f36032" 1

"2.6.0+cdh5.12.1+2540", 70

"2.6.0-cdh5.12.1" 70

"2.9.0+cdh5.12.1+0", 10

"2.9.0-cdh5.12.1" 10

"3.4.5+cdh5.12.1+117", 10

"3.4.5-cdh5.12.1" 10

"3.9.0+cdh5.12.1+6507", 10

"3.9.0-cdh5.12.1" 10

"3e69cd1fb314a4b8e378c65dfa97a1c2996762ed" 1

"4.1.0+cdh5.12.1+446", 10

"4.1.0-cdh5.12.1" 10

"4.10.3+cdh5.12.1+519", 10

"4.10.3-cdh5.12.1" 10

"6.0.53-cdh5.12.1" 10

"62ec2b8013e3d5d404ebc21e6ada46da8e146807" 1

"65264b0cffc79b8f241210e8ae445af4e469e81b" 1

"6db17c7caa0fcc251a100dab3f6da28b918eeb1f" 1

"8db9d71943d2f6f23badf412874911875d1e3cb7" 1

"9f521a893d1e03f5bb863c5e7365c88afaf0057d" 1

"CDH-5.12.1-1.cdh5.12.1.p0.3-el5.parcel", 1

"CDH-5.12.1-1.cdh5.12.1.p0.3-el6.parcel", 1

"CDH-5.12.1-1.cdh5.12.1.p0.3-el7.parcel", 1

"CDH-5.12.1-1.cdh5.12.1.p0.3-jessie.parcel", 1

"CDH-5.12.1-1.cdh5.12.1.p0.3-precise.parcel", 1

"CDH-5.12.1-1.cdh5.12.1.p0.3-sles11.parcel", 1

"CDH-5.12.1-1.cdh5.12.1.p0.3-sles12.parcel", 1

"CDH-5.12.1-1.cdh5.12.1.p0.3-trusty.parcel", 1

"CDH-5.12.1-1.cdh5.12.1.p0.3-wheezy.parcel", 1

"CDH-5.12.1-1.cdh5.12.1.p0.3-xenial.parcel", 1

"IMPALA, 10

"SPARK2 10

"bigtop-tomcat", 10

"components": 10

"conflicts": 10

"crunch", 10

"d4d088e5158f0e2f64ec3f2ed16b06de8f0497b8" 1

"flume-ng", 10

"hadoop", 10

"hadoop-0.20-mapreduce", 10

"hadoop-hdfs", 10

"hadoop-httpfs", 10

"hadoop-kms", 10

"hadoop-mapreduce", 10

"hadoop-yarn", 10

"hash": 10

"hbase", 10

"hbase-solr", 10

"hive", 10

"hive-hcatalog", 10

"hue", 10

"impala", 10

"kite", 10

"lastUpdated": 1

"llama", 10

"mahout", 10

"name": 290

"oozie", 10

"parcelName": 10

"parcels": 1

"parquet", 10

"pig", 10

"pkg_release": 290

"pkg_version": 290

"replaces": 10

"sentry", 10

"solr", 10

"spark", 10

"sqoop", 10

"sqoop2", 10

"version": 290

"whirr", 10

"zookeeper", 10

(<< 10

15041200430000, 1

2.0.0.cloudera2)", 10

SOLR, 10

SPARK", 10

[ 11

] 1

], 10

{ 301

} 12

}, 289


Viewing the same output in the HDFS Web UI is more intuitive.


Oozie

Create the ooziedb database in MySQL, then enable the Oozie Web UI.

[root@hadoop01 tmp]# mysql -uroot -hhadoop01 -p

Welcome to the MySQL monitor.  Commands end with ; or \g.

Your MySQL connection id is 101

Server version: 5.7.24 MySQL Community Server (GPL)

Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its

affiliates. Other names may be trademarks of their respective

owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> create database ooziedb DEFAULT CHARSET utf8 COLLATE utf8_general_ci;

Query OK, 1 row affected (0.01 sec)

mysql> flush privileges;

Query OK, 0 rows affected (0.02 sec)
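The flush privileges above only takes effect if a grant was issued. A hedged sketch of the grant Oozie would typically need (the user name and password here are illustrative, not from the original session):

mysql> grant all privileges on ooziedb.* to 'oozie'@'%' identified by 'oozie';

mysql> flush privileges;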


The Oozie Web UI is not enabled by default.

To enable it, follow the Cloudera documentation:

https://docs.cloudera.com/documentation/enterprise/5-12-x/topics/admin_oozie_console.html

After completing the steps in that document, enable the setting in Cloudera Manager and restart Oozie.
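The gist of that document is installing the ExtJS 2.2 library on the Oozie server host before flipping the switch. A sketch, assuming the download URL and default /var/lib/oozie path given in the Cloudera docs (verify against the linked page):

# Run on the Oozie server host.
cd /var/lib/oozie
wget https://archive.cloudera.com/gplextras/misc/ext-2.2.zip
unzip ext-2.2.zip
chown -R oozie:oozie ext-2.2   # fix ownership, if needed
# Then, in Cloudera Manager: Oozie service -> Configuration ->
# enable "Enable Oozie Server Web Console", and restart the Oozie service.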


The Oozie Web UI then comes up.
