Spark SQL Learning: Connecting Spark to Hive

Reading data from Hive with Spark

scala> import org.apache.spark.sql.hive.HiveContext

import org.apache.spark.sql.hive.HiveContext

scala> val hiveContext = new HiveContext(sc)

// table stud_info in the feigu database in Hive

scala> val stud_infoRDD = hiveContext.sql("select * from feigu.stud_info").rdd

scala> stud_infoRDD.take(5).foreach(line => println("code:"+line(0)+";name:"+line(1)))

code:stud_code;name:stud_name

code:2015101000;name:王进

code:2015101001;name:刘海

code:2015101002;name:张飞

code:2015101003;name:刘婷
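`line(0)` and `line(1)` above access each Row's columns by position. A minimal plain-Scala sketch of that positional formatting (rows modeled as simple sequences rather than Spark `Row` objects, sample values taken from the output above):

```scala
// Sketch: positional column access, mirroring line(0)/line(1) above.
object RowFormatSketch {
  // Each "row" is a sequence of column values: (stud_code, stud_name).
  def format(row: Seq[String]): String =
    s"code:${row(0)};name:${row(1)}"

  def main(args: Array[String]): Unit =
    Seq(Seq("2015101000", "Wang Jin"), Seq("2015101001", "Liu Hai"))
      .foreach(r => println(format(r)))
}
```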

Loading data into Hive with Spark

Two source files:

hadoop@master:~/wujiadong$ cat spark_stud_info.txt

wujiadong,26

ji,24

sun,27

xu,25

hadoop@master:~/wujiadong$ cat spark_stud_score.txt

wujiadong,90

ji,100

sun,99

xu,99
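The `row format delimited fields terminated by ','` clause used in the `create table` statements below tells Hive to split each line of these files on commas. A minimal plain-Scala sketch of that parsing, independent of Spark and Hive:

```scala
// Sketch: how "fields terminated by ','" maps a line to (name string, age int).
object DelimiterSketch {
  def parseLine(line: String): (String, Int) = {
    val fields = line.split(",")       // split on the declared delimiter
    (fields(0), fields(1).trim.toInt)  // column types declared in the DDL
  }

  def main(args: Array[String]): Unit = {
    // Sample lines mirror spark_stud_info.txt above.
    val lines = Seq("wujiadong,26", "ji,24", "sun,27", "xu,25")
    lines.map(parseLine).foreach(println)
  }
}
```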

Scala code:

scala> import org.apache.spark.sql.hive.HiveContext

scala> val hiveContext = new HiveContext(sc)

scala> hiveContext.sql("drop table if exists wujiadong.spark_stud_info")

scala> hiveContext.sql("create table if not exists wujiadong.spark_stud_info(name string,age int) row format delimited fields terminated by ','")

scala> hiveContext.sql("load data local inpath '/home/hadoop/wujiadong/spark_stud_info.txt' into table wujiadong.spark_stud_info");

scala> hiveContext.sql("drop table if exists wujiadong.spark_stud_score")

scala> hiveContext.sql("create table if not exists wujiadong.spark_stud_score(name string,score int) row format delimited fields terminated by ','")

scala> hiveContext.sql("load data local inpath '/home/hadoop/wujiadong/spark_stud_score.txt' into table wujiadong.spark_stud_score");

Then query in Hive to check that the load succeeded:

hive> select * from spark_stud_info;

OK

wujiadong	26

ji	24

sun	27

xu	25

Time taken: 0.178 seconds, Fetched: 4 row(s)

hive> select * from spark_stud_score;

OK

wujiadong	90

ji	100

sun	99

xu	99

Time taken: 0.212 seconds, Fetched: 4 row(s)

// join the two tables and select scores greater than 99

scala> val df = hiveContext.sql("select sss.name,sss.score from wujiadong.spark_stud_info ssi join wujiadong.spark_stud_score sss on ssi.name=sss.name where sss.score > 99")

scala> df.show()

17/03/06 22:30:37 INFO FileInputFormat: Total input paths to process : 1

17/03/06 22:30:38 INFO FileInputFormat: Total input paths to process : 1

+----+-----+

|name|score|

+----+-----+

| ji| 100|

+----+-----+
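The HiveQL above performs an inner join on `name` followed by the `score > 99` filter. The same logic can be pictured at the collection level; a minimal plain-Scala sketch (no Spark required) over the sample rows loaded above:

```scala
// Sketch: collection-level equivalent of the inner join + filter above.
object JoinSketch {
  val studInfo  = Seq(("wujiadong", 26), ("ji", 24), ("sun", 27), ("xu", 25))
  val studScore = Seq(("wujiadong", 90), ("ji", 100), ("sun", 99), ("xu", 99))

  // Inner join on name, then keep scores strictly greater than 99.
  def highScores: Seq[(String, Int)] = {
    val infoNames = studInfo.map(_._1).toSet
    studScore.filter { case (name, score) => infoNames.contains(name) && score > 99 }
  }

  def main(args: Array[String]): Unit =
    highScores.foreach { case (name, score) => println(s"$name $score") }
}
```

Only `ji` (score 100) survives the strict `> 99` filter, matching the `df.show()` output above.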

// save the data in df to table result_stu

scala> hiveContext.sql("drop table if exists wujiadong.result_stu")

scala> df.saveAsTable("wujiadong.result_stu")

// a DataFrame can then be created directly from table result_stu

// check in Hive

hive> select * from result_stu;

OK

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".

SLF4J: Defaulting to no-operation (NOP) logger implementation

SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.

ji	100

Time taken: 0.252 seconds, Fetched: 1 row(s)
