Installing CDH 5.10 and Kudu 1.2 on CentOS 7.2 (Part 2)

5 Kudu Installation
Starting with CDH 5.10, Kudu 1.2 is bundled with the distribution and is officially supported by Cloudera. From this release onward, installing Kudu is much simpler than before: the separate Impala_Kudu package is no longer needed, and once Kudu is installed, Impala can operate on Kudu directly.
The steps below assume that Cloudera Manager is used to install and deploy Kudu 1.2.
5.1 Install the CSD File
1. Download the CSD file
[root@ip-172-31-2-159 ~]# wget http://archive.cloudera.com/kudu/csd/KUDU-5.10.0.jar
2. Move the downloaded jar file to the /opt/cloudera/csd directory
[root@ip-172-31-2-159 ~]# mv KUDU-5.10.0.jar /opt/cloudera/csd
3. Fix the file ownership and permissions
[root@ip-172-31-2-159 ~]# chown cloudera-scm:cloudera-scm /opt/cloudera/csd/KUDU-5.10.0.jar
[root@ip-172-31-2-159 ~]# chmod 644 /opt/cloudera/csd/KUDU-5.10.0.jar
4. Restart the Cloudera Manager service
[root@ip-172-31-2-159 ~]# systemctl restart cloudera-scm-server
5.2 Install the Kudu Service
1. Download the parcel files required by the Kudu service
[root@ip-172-31-2-159 ~]# wget http://archive.cloudera.com/kudu/parcels/5.10/KUDU-1.2.0-1.cdh5.10.1.p0.66-el7.parcel
[root@ip-172-31-2-159 ~]# wget http://archive.cloudera.com/kudu/parcels/5.10/KUDU-1.2.0-1.cdh5.10.1.p0.66-el7.parcel.sha1
[root@ip-172-31-2-159 ~]# wget http://archive.cloudera.com/kudu/parcels/5.10/manifest.json
2. Deploy the Kudu parcel to the HTTP server
[root@ip-172-31-2-159 ~]# mkdir kudu1.2
[root@ip-172-31-2-159 ~]# mv KUDU-1.2.0-1.cdh5.10.1.p0.66-el7.parcel* kudu1.2/
[root@ip-172-31-2-159 ~]# mv manifest.json kudu1.2
[root@ip-172-31-2-159 ~]# mv kudu1.2/ /var/www/html/
[root@ip-172-31-2-159 ~]# systemctl start httpd
3. Check over HTTP that the Kudu parcel is served correctly (a quick curl check is sketched after the screenshot below):
[screenshot]
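A quick way to do this check from the command line is to request the directory from the HTTP server. This is only a sketch: it assumes Apache's default document root of /var/www/html (where kudu1.2/ was placed above), that directory indexing is enabled, and that the host name resolves to this node.
[root@ip-172-31-2-159 ~]# curl http://ip-172-31-2-159/kudu1.2/
The returned index should list KUDU-1.2.0-1.cdh5.10.1.p0.66-el7.parcel, its .sha1 file, and manifest.json.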
4. In the Cloudera Manager UI, configure the Kudu parcel repository URL (a sample URL follows the screenshot below), then download, distribute, and activate the Kudu parcel.

[screenshot]
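Based on the layout created above, the remote parcel repository URL to add under Parcel Settings would look like the following (the host name is an assumption carried over from the earlier steps; use the address of whichever node runs the httpd service):
http://ip-172-31-2-159/kudu1.2/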
5. Install Kudu 1.2 through Cloudera Manager
Add the Kudu service
[screenshot]

Select the hosts for the Master and Tablet Server roles
[screenshot]
Configure the corresponding directories. Note: for both the Master and the Tablet Servers, there may well be multiple data directories (fs_data_dirs), depending on the actual hardware; spreading data across several disks improves concurrent reads and writes and therefore Kudu performance (a sketch of such a layout follows the screenshot below).

[screenshot]
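As a minimal sketch of a multi-directory layout, a Tablet Server with three data disks might end up with settings like the following (the paths are hypothetical; in Cloudera Manager they correspond to the Kudu WAL directory and data directories fields, which map to the fs_wal_dir and fs_data_dirs flags):
--fs_wal_dir=/data1/kudu/tserver/wal
--fs_data_dirs=/data1/kudu/tserver/data,/data2/kudu/tserver/data,/data3/kudu/tserver/data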
Start the service
[screenshot]
Installation complete
[screenshot]
5.3 Configure Impala
In CDH 5.10, once Kudu 1.2 is installed, Impala can run SQL against Kudu by default. However, to avoid having to add the kudu.master_addresses property to TBLPROPERTIES every time a table is created, it is recommended to configure the Kudu Master address in Impala's advanced configuration: --kudu_master_hosts=ip-172-31-2-159:7051 (an example of the table property this avoids follows the screenshot below).
[screenshot]
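For reference, without this flag each Kudu table definition would have to carry the master address itself in TBLPROPERTIES. A minimal sketch in the impala-shell, where the table name and columns are illustrative only:
[ip-172-31-7-96:21000] > CREATE TABLE example_kudu_table
> (
>   id BIGINT,
>   name STRING,
>   PRIMARY KEY(id)
> )
> PARTITION BY HASH PARTITIONS 16
> STORED AS KUDU
> TBLPROPERTIES ('kudu.master_addresses' = 'ip-172-31-2-159:7051');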
6 Quick Service Verification
6.1 HDFS Verification (mkdir + put + cat + get)
[root@ip-172-31-2-159 ~]# hadoop fs -mkdir -p /lilei/test_table
[root@ip-172-31-2-159 ~]# cat > a.txt
1#2
c#d
我#你^C
[root@ip-172-31-2-159 ~]#
[root@ip-172-31-2-159 ~]#
[root@ip-172-31-2-159 ~]#
[root@ip-172-31-2-159 ~]# hadoop fs -put a.txt /lilei/test_table
[root@ip-172-31-2-159 ~]# hadoop fs -cat /lilei/test_table/a.txt
1#2
c#d
[root@ip-172-31-2-159 ~]# rm -rf a.txt
[root@ip-172-31-2-159 ~]#
[root@ip-172-31-2-159 ~]# hadoop fs -get /lilei/test_table/a.txt
[root@ip-172-31-2-159 ~]#
[root@ip-172-31-2-159 ~]# cat a.txt
1#2
c#d
6.2 Hive Verification

[root@ip-172-31-2-159 ~]# hive

Logging initialized using configuration in jar:file:/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/jars/hive-common-1.1.0-cdh5.10.0.jar!/hive-log4j.properties
WARNING: Hive CLI is deprecated and migration to Beeline is recommended.
hive> create external table test_table
> (
> s1 string,
> s2 string
> )
> row format delimited fields terminated by '#'
> stored as textfile location '/lilei/test_table';
OK
Time taken: 0.631 seconds
hive> select * from test_table;
OK
1 2
c d
Time taken: 0.36 seconds, Fetched: 2 row(s)
hive> select count(*) from test_table;
Query ID = root_20170404013939_69844998-4456-4bc1-9da5-53ea91342e43
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1491283979906_0005, Tracking URL = http://ip-172-31-2-159:8088/proxy/application_1491283979906_0005/
Kill Command = /opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib/hadoop/bin/hadoop job -kill job_1491283979906_0005
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2017-04-04 01:39:25,425 Stage-1 map = 0%, reduce = 0%
2017-04-04 01:39:31,689 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 1.02 sec
2017-04-04 01:39:36,851 Stage-1 map = 100%, reduce = 100%, Cumulative CPU 2.34 sec
MapReduce Total cumulative CPU time: 2 seconds 340 msec
Ended Job = job_1491283979906_0005
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 2.34 sec HDFS Read: 6501 HDFS Write: 2 SUCCESS
Total MapReduce CPU Time Spent: 2 seconds 340 msec
OK
2
Time taken: 21.56 seconds, Fetched: 1 row(s)
6.3 MapReduce Verification
[root@ip-172-31-2-159 ~]# hadoop jar /opt/cloudera/parcels/CDH/lib/hadoop-0.20-mapreduce/hadoop-examples.jar pi 5 5
Number of Maps = 5
Samples per Map = 5
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Wrote input for Map #3
Wrote input for Map #4
Starting Job
17/04/04 01:38:15 INFO client.RMProxy: Connecting to ResourceManager at ip-172-31-2-159/172.31.2.159:8032
17/04/04 01:38:15 INFO mapreduce.JobSubmissionFiles: Permissions on staging directory /user/root/.staging are incorrect: rwxrwxrwx. Fixing permissions to correct value rwx------
17/04/04 01:38:15 INFO input.FileInputFormat: Total input paths to process : 5
17/04/04 01:38:15 INFO mapreduce.JobSubmitter: number of splits:5
17/04/04 01:38:15 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1491283979906_0004
17/04/04 01:38:16 INFO impl.YarnClientImpl: Submitted application application_1491283979906_0004
17/04/04 01:38:16 INFO mapreduce.Job: The url to track the job: http://ip-172-31-2-159:8088/proxy/application_1491283979906_0004/
17/04/04 01:38:16 INFO mapreduce.Job: Running job: job_1491283979906_0004
17/04/04 01:38:21 INFO mapreduce.Job: Job job_1491283979906_0004 running in uber mode : false
17/04/04 01:38:21 INFO mapreduce.Job: map 0% reduce 0%
17/04/04 01:38:26 INFO mapreduce.Job: map 100% reduce 0%
17/04/04 01:38:32 INFO mapreduce.Job: map 100% reduce 100%
17/04/04 01:38:32 INFO mapreduce.Job: Job job_1491283979906_0004 completed successfully
17/04/04 01:38:32 INFO mapreduce.Job: Counters: 49
File System Counters
FILE: Number of bytes read=64
FILE: Number of bytes written=749758
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=1350
HDFS: Number of bytes written=215
HDFS: Number of read operations=23
HDFS: Number of large read operations=0
HDFS: Number of write operations=3
Job Counters
Launched map tasks=5
Launched reduce tasks=1
Data-local map tasks=5
Total time spent by all maps in occupied slots (ms)=16111
Total time spent by all reduces in occupied slots (ms)=2872
Total time spent by all map tasks (ms)=16111
Total time spent by all reduce tasks (ms)=2872
Total vcore-seconds taken by all map tasks=16111
Total vcore-seconds taken by all reduce tasks=2872
Total megabyte-seconds taken by all map tasks=16497664
Total megabyte-seconds taken by all reduce tasks=2940928
Map-Reduce Framework
Map input records=5
Map output records=10
Map output bytes=90
Map output materialized bytes=167
Input split bytes=760
Combine input records=0
Combine output records=0
Reduce input groups=2
Reduce shuffle bytes=167
Reduce input records=10
Reduce output records=0
Spilled Records=20
Shuffled Maps =5
Failed Shuffles=0
Merged Map outputs=5
GC time elapsed (ms)=213
CPU time spent (ms)=3320
Physical memory (bytes) snapshot=2817884160
Virtual memory (bytes) snapshot=9621606400
Total committed heap usage (bytes)=2991587328
Shuffle Errors
BAD_ID=0
CONNECTION=0
IO_ERROR=0
WRONG_LENGTH=0
WRONG_MAP=0
WRONG_REDUCE=0
File Input Format Counters
Bytes Read=590
File Output Format Counters
Bytes Written=97
Job Finished in 17.145 seconds
Estimated value of Pi is 3.68000000000000000000
6.4 Impala Verification
[root@ip-172-31-2-159 ~]# impala-shell -i ip-172-31-7-96
Starting Impala Shell without Kerberos authentication
Connected to ip-172-31-7-96:21000
Server version: impalad version 2.7.0-cdh5.10.0 RELEASE (build 785a073cd07e2540d521ecebb8b38161ccbd2aa2)


Welcome to the Impala shell.
(Impala Shell v2.7.0-cdh5.10.0 (785a073) built on Fri Jan 20 12:03:56 PST 2017)

Run the PROFILE command after a query has finished to see a comprehensive summary
of all the performance and diagnostic information that Impala gathered for that
query. Be warned, it can be very long!


[ip-172-31-7-96:21000] > show tables;
Query: show tables
+------------+
| name       |
+------------+
| test_table |
+------------+
Fetched 1 row(s) in 0.20s
[ip-172-31-7-96:21000] > select * from test_table;
Query: select * from test_table
Query submitted at: 2017-04-04 01:41:56 (Coordinator: http://ip-172-31-7-96:25000)
Query progress can be monitored at: http://ip-172-31-7-96:25000/query_plan?query_id=c4a06bd46f9106b:4a69f04800000000
+----+----+
| s1 | s2 |
+----+----+
| 1  | 2  |
| c  | d  |
+----+----+
Fetched 2 row(s) in 3.73s
[ip-172-31-7-96:21000] > select count(*) from test_table;
Query: select count(*) from test_table
Query submitted at: 2017-04-04 01:42:06 (Coordinator: http://ip-172-31-7-96:25000)
Query progress can be monitored at: http://ip-172-31-7-96:25000/query_plan?query_id=2a415724696f7414:1f9113ea00000000
+----------+
| count(*) |
+----------+
| 2        |
+----------+
Fetched 1 row(s) in 0.15s
6.5 Spark Verification

[root@ip-172-31-2-159 ~]# spark-shell
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 1.6.0
      /_/

Using Scala version 2.10.5 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_67)
Type in expressions to have them evaluated.
Type :help for more information.
Spark context available as sc (master = yarn-client, app id = application_1491283979906_0006).
17/04/04 01:43:26 WARN metastore.ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.1.0
17/04/04 01:43:27 WARN metastore.ObjectStore: Failed to get database default, returning NoSuchObjectException
SQL context available as sqlContext.

scala> var textFile=sc.textFile("hdfs://ip-172-31-2-159:8020/lilei/test_table/a.txt")
textFile: org.apache.spark.rdd.RDD[String] = hdfs://ip-172-31-2-159:8020/lilei/test_table/a.txt MapPartitionsRDD[1] at textFile at <console>:27

scala>

scala> textFile.count()
res0: Long = 2
6.6 Kudu Verification
[root@ip-172-31-2-159 ~]# impala-shell -i ip-172-31-7-96
Starting Impala Shell without Kerberos authentication
Connected to ip-172-31-7-96:21000
Server version: impalad version 2.7.0-cdh5.10.0 RELEASE (build 785a073cd07e2540d521ecebb8b38161ccbd2aa2)


Welcome to the Impala shell.
(Impala Shell v2.7.0-cdh5.10.0 (785a073) built on Fri Jan 20 12:03:56 PST 2017)

Every command must be terminated by a ';'.


[ip-172-31-7-96:21000] > CREATE TABLE my_first_table
> (
> id BIGINT,
> name STRING,
> PRIMARY KEY(id)
> )
> PARTITION BY HASH PARTITIONS 16
> STORED AS KUDU;
Query: create TABLE my_first_table
(
id BIGINT,
name STRING,
PRIMARY KEY(id)
)
PARTITION BY HASH PARTITIONS 16
STORED AS KUDU

Fetched 0 row(s) in 1.35s
[ip-172-31-7-96:21000] > INSERT INTO my_first_table VALUES (99, "sarah");
Query: insert INTO my_first_table VALUES (99, "sarah")
Query submitted at: 2017-04-04 01:46:08 (Coordinator: http://ip-172-31-7-96:25000)
Query progress can be monitored at: http://ip-172-31-7-96:25000/query_plan?query_id=824ce0b3765c6b91:5ea8dd7c00000000
Modified 1 row(s), 0 row error(s) in 3.37s
[ip-172-31-7-96:21000] >
[ip-172-31-7-96:21000] > INSERT INTO my_first_table VALUES (1, "john"), (2, "jane"), (3, "jim");
Query: insert INTO my_first_table VALUES (1, "john"), (2, "jane"), (3, "jim")
Query submitted at: 2017-04-04 01:46:13 (Coordinator: http://ip-172-31-7-96:25000)
Query progress can be monitored at: http://ip-172-31-7-96:25000/query_plan?query_id=a645259c3b8ae7cd:e446e15500000000
Modified 3 row(s), 0 row error(s) in 0.11s
[ip-172-31-7-96:21000] > select * from my_first_table;
Query: select * from my_first_table
Query submitted at: 2017-04-04 01:46:19 (Coordinator: http://ip-172-31-7-96:25000)
Query progress can be monitored at: http://ip-172-31-7-96:25000/query_plan?query_id=f44021589ff0d94d:8d30568200000000
+----+-------+
| id | name  |
+----+-------+
| 2  | jane  |
| 3  | jim   |
| 1  | john  |
| 99 | sarah |
+----+-------+
Fetched 4 row(s) in 0.55s
[ip-172-31-7-96:21000] > delete from my_first_table where id =99;
Query: delete from my_first_table where id =99
Query submitted at: 2017-04-04 01:46:56 (Coordinator: http://ip-172-31-7-96:25000)
Query progress can be monitored at: http://ip-172-31-7-96:25000/query_plan?query_id=814090b100fdf0b4:1b516fe400000000
Modified 1 row(s), 0 row error(s) in 0.15s
[ip-172-31-7-96:21000] >
[ip-172-31-7-96:21000] > select * from my_first_table;
Query: select * from my_first_table
Query submitted at: 2017-04-04 01:46:57 (Coordinator: http://ip-172-31-7-96:25000)
Query progress can be monitored at: http://ip-172-31-7-96:25000/query_plan?query_id=724aa3f84cedb109:a679bf0200000000
+----+------+
| id | name |
+----+------+
| 2  | jane |
| 3  | jim  |
| 1  | john |
+----+------+
Fetched 3 row(s) in 0.15s
[ip-172-31-7-96:21000] > INSERT INTO my_first_table VALUES (99, "sarah");
Query: insert INTO my_first_table VALUES (99, "sarah")
Query submitted at: 2017-04-04 01:47:32 (Coordinator: http://ip-172-31-7-96:25000)
Query progress can be monitored at: http://ip-172-31-7-96:25000/query_plan?query_id=6244b3c6d33b443e:f43c857300000000
Modified 1 row(s), 0 row error(s) in 0.11s
[ip-172-31-7-96:21000] >
[ip-172-31-7-96:21000] > update my_first_table set name='lilei' where id=99;
Query: update my_first_table set name='lilei' where id=99
Query submitted at: 2017-04-04 01:47:32 (Coordinator: http://ip-172-31-7-96:25000)
Query progress can be monitored at: http://ip-172-31-7-96:25000/query_plan?query_id=8f4ab0dd3c19f9df:b2c7bdfa00000000
Modified 1 row(s), 0 row error(s) in 0.13s
[ip-172-31-7-96:21000] > select * from my_first_table;
Query: select * from my_first_table
Query submitted at: 2017-04-04 01:47:34 (Coordinator: http://ip-172-31-7-96:25000)
Query progress can be monitored at: http://ip-172-31-7-96:25000/query_plan?query_id=6542579c8bd5b6ad:af68f50800000000
+----+-------+
| id | name  |
+----+-------+
| 2  | jane  |
| 3  | jim   |
| 1  | john  |
| 99 | lilei |
+----+-------+
Fetched 4 row(s) in 0.15s
[ip-172-31-7-96:21000] > upsert into my_first_table values(1, "john"), (4, "tom"), (99, "lilei1");
Query: upsert into my_first_table values(1, "john"), (4, "tom"), (99, "lilei1")
Query submitted at: 2017-04-04 01:48:52 (Coordinator: http://ip-172-31-7-96:25000)
Query progress can be monitored at: http://ip-172-31-7-96:25000/query_plan?query_id=694fc7ac2bc71d21:947f1fa200000000
Modified 3 row(s), 0 row error(s) in 0.11s
[ip-172-31-7-96:21000] >
[ip-172-31-7-96:21000] > select * from my_first_table;
Query: select * from my_first_table
Query submitted at: 2017-04-04 01:48:52 (Coordinator: http://ip-172-31-7-96:25000)
Query progress can be monitored at: http://ip-172-31-7-96:25000/query_plan?query_id=a64e0ee707762b6b:69248a6c00000000
+----+--------+
| id | name   |
+----+--------+
| 2  | jane   |
| 3  | jim    |
| 1  | john   |
| 99 | lilei1 |
| 4  | tom    |
+----+--------+
Fetched 5 row(s) in 0.16s
