A Summary of SQL, HiveQL, and the Spark Shell


1. SQL

Create a table
drop table if exists demo01;
create table demo01(
eno int(10),
ename varchar(20)
);
Insert a row
insert into demo01 values(1,"hello");
Add a column
alter table demo01 add loc varchar(20);
Update a column's value
update demo01 set loc="world";
Delete a row
delete from demo01 where loc="world";
Drop a column
alter table demo01 drop column loc;
Show the table structure
desc demo01;
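The statements above target MySQL, but for a quick local sanity check they map almost one-to-one onto SQLite through Python's built-in sqlite3 module. A sketch (SQLite ignores the `int(10)` display width, and `desc` becomes `pragma table_info`):

```python
import sqlite3

# In-memory SQLite database; syntax is close to, but not identical to, MySQL
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

cur.execute("drop table if exists demo01")
cur.execute("create table demo01(eno int, ename varchar(20))")
cur.execute("insert into demo01 values(1, 'hello')")
cur.execute("alter table demo01 add loc varchar(20)")            # add a column
cur.execute("update demo01 set loc = 'world'")
print(cur.execute("select * from demo01").fetchall())            # [(1, 'hello', 'world')]
cur.execute("delete from demo01 where loc = 'world'")            # delete the row
print(cur.execute("select count(*) from demo01").fetchone()[0])  # 0
# 'desc demo01' is MySQL; the SQLite equivalent is:
print([row[1] for row in cur.execute("pragma table_info(demo01)")])
conn.close()
```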
Queries
# find each employee's manager (self-join of emp with itself)
select e.ename as emp,m.ename as mgr from emp e join emp m on e.mgr=m.empno;
# find employees earning more than their department's average salary (correlated subquery)
select ename,sal from emp a where sal > (
select avg(sal) from emp b where a.deptno=b.deptno);
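Both queries can be exercised with sqlite3 against a tiny made-up emp table (the empno/ename/sal/deptno/mgr columns follow the classic Oracle demo schema; the rows below are illustrative only):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
# Minimal stand-in for the classic emp table
cur.execute("create table emp(empno int, ename text, sal real, deptno int, mgr int)")
cur.executemany("insert into emp values(?,?,?,?,?)", [
    (7839, 'KING',  5000, 10, None),
    (7566, 'JONES', 2975, 20, 7839),
    (7369, 'SMITH',  800, 20, 7566),
    (7788, 'SCOTT', 3000, 20, 7566),
])
# Employee -> manager via a self-join (KING has no manager, so no row for him)
pairs = cur.execute("""
    select e.ename as emp, m.ename as mgr
    from emp e join emp m on e.mgr = m.empno
""").fetchall()
# Employees above their department's average salary (correlated subquery)
above = cur.execute("""
    select ename, sal from emp a where sal > (
        select avg(sal) from emp b where a.deptno = b.deptno)
""").fetchall()
print(sorted(pairs))
print(sorted(above))
conn.close()
```

Here dept 20's average is about 2258.3, so JONES and SCOTT qualify, while KING exactly equals dept 10's average and is excluded.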

2. HiveQL

Database basics
  1. create database mydb;
  2. use mydb;
  3. show tables;
Create a table (note: partitioned by must come before row format in Hive DDL)
create external table demo01(
eno int,ename string)
partitioned by (city string)
row format delimited
fields terminated by ','
collection items terminated by '.'
map keys terminated by ':'
lines terminated by '\n'
;
Load data
  1. load data inpath '/demo01.txt' overwrite into table demo01; -- from HDFS
  2. load data local inpath '/home/hadoop/demo01.txt' overwrite into table demo01; -- from the local file system
For a partitioned table, a static partition spec such as partition (city='beijing') must be appended.
Word-count query
create external table docs(line string);
load data inpath '/word.txt' overwrite into table docs;
select word,count(1) as num from (
select explode(split(line,'\\s+')) as word from docs) as x
group by word
order by num desc
;
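The same word-count logic can be checked locally in plain Python, with re.split standing in for Hive's split(line, '\\s+') and Counter standing in for the group by (the sample lines are made up):

```python
import re
from collections import Counter

# Stand-in for the contents of /word.txt
lines = ["hello world", "hello hive"]
# explode(split(line, '\\s+')) -> flatten the split words, dropping empties
words = [w for line in lines for w in re.split(r"\s+", line) if w]
# group by word / count(1) ... order by num desc
counts = Counter(words).most_common()
print(counts)  # [('hello', 2), ('world', 1), ('hive', 1)]
```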

3. Spark Shell

The SparkContext is available as sc
1. Read a file: sc.textFile("hdfs://ghym:9000/demo.txt")
2. Flat-map each line into words: .flatMap(line=>line.split("\\s+"))
3. Map and reduce: .map((_,1)).reduceByKey(_+_)
4. Sort by the second field (the count), descending: .sortBy(_._2,false)
5. Run an action to materialize and print the data: .collect.foreach(println)
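For illustration, the whole chain can be mimicked in plain Python over a list of strings (no Spark required; the sample lines are made up), which makes each step's effect easy to see:

```python
import re
from collections import Counter
from operator import itemgetter

# A list of strings stands in for the RDD of file lines
lines = ["spark hive spark", "hadoop spark"]
# flatMap(line => line.split("\\s+"))
words = [w for line in lines for w in re.split(r"\s+", line) if w]
# map((_,1)).reduceByKey(_+_)
pairs = Counter(words)
# sortBy(_._2, false): sort by count, descending
result = sorted(pairs.items(), key=itemgetter(1), reverse=True)
# collect.foreach(println)
for kv in result:
    print(kv)
```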