hive sql 50道练习题

这是一组关于Hive SQL的练习题,涵盖了建库建表、数据操作和复杂查询,如课程成绩比较、学生信息筛选、课程统计等,旨在提升Hive SQL技能。
摘要由CSDN通过智能技术生成

建库建表

--建库
create database test;

--建表
create table student(s_id string,s_name string,s_birth string,s_sex string) row format delimited fields terminated by '\t';
create table course(c_id string,c_name string,t_id string) row format delimited fields terminated by '\t';
create table teacher(t_id string,t_name string) row format delimited fields terminated by '\t';
create table score(s_id string,c_id string,s_score int) row format delimited fields terminated by '\t';

生成数据
vim /opt/module/test/course.txt

01	语文	02
02 	数学	01
03	英语	03

vim /opt/module/test/score.txt

01	01	80
01	02	90
01	03	99
02	01	70
02	02	60
02	03	80
03	01	80
03	02	80
03	03	80
04	01	50
04	02	30
04	03	20
05	01	76
05	02	87
06	01	31
06	03	34
07	02	89
07	03	98

vim /opt/module/test/student.txt

01	赵雷	1990-01-0102	钱电	1990-12-2103	孙风	1990-05-2004	李云	1990-08-0605	周梅	1991-12-0106	吴兰	1992-03-0107	郑竹	1989-07-0108	王菊	1990-01-20

vim /opt/module/test/teacher.txt

01	张三
02	李四
03	王五

导入数据

--导入数据
load data local inpath '/opt/module/test/student.txt' into table student;
load data local inpath '/opt/module/test/course.txt' into table course;
load data local inpath '/opt/module/test/teacher.txt' into table teacher;
load data local inpath '/opt/module/test/score.txt' into table score;
  1. 查询"01"课程比"02"课程成绩高的学生的信息及课程分数:
--方式1:直接连接sc1和sc2
select
    sc1.s_id,
    sc1.s_score score01,
    sc2.s_score score02
from score sc1
join score sc2 on sc1.s_id=sc2.s_id
where sc1.c_id='01' and sc2.c_id='02' and sc1.s_score>sc2.s_score;

--方式2
select
    student.*,
    s1.s_score score01,
    s2.s_score score02
from student
join score s1 on student.s_id = s1.s_id and s1.c_id='01'
left join score s2 on student.s_id = s2.s_id and s2.c_id='02'
where s1.s_score>s2.s_score;

--方式3
select
    student.*,
    s1.s_score score01,
    s2.s_score score02
from student
join score s1 on student.s_id = s1.s_id
left join score s2 on student.s_id = s2.s_id
where s1.c_id='01' and s2.c_id='02' and s1.s_score>s2.s_score;

--方式4
select
    student.*,
    s1.s_score score01,
    s2.s_score score02
from student
join score s1 on s1.c_id='01'
left join score s2 on s2.c_id='02'
where student.s_id = s1.s_id and student.s_id = s2.s_id and s1.s_score>s2.s_score;
  1. 查询"01"课程比"02"课程成绩低的学生的信息及课程分数:(和上面的第1题一样)
select
    student.*,
    s1.s_score score01,
    s2.s_score score02
from student
join score s1 on student.s_id = s1.s_id and s1.c_id='01'
left join score s2 on s1.s_id = s2.s_id and s2.c_id='02'
where s1.s_score<s2.s_score;
  1. 查询平均成绩大于等于60分的同学的学生编号和学生姓名和平均成绩:
--方式1:round函数可以去掉null值
select
    t1.sid,
    t1.sname,
    t1.avg_score
from
(
    select
        student.s_id sid,
        student.s_name sname,
        round(avg(s.s_score),1)  avg_score
    from student
    join score s on student.s_id = s.s_id
    group by student.s_id,student.s_name
)t1
where avg_score>=60
order by t1.sid;

--方式2:这个比较好
--第一步:查询出每个sid对应的平均成绩(通过round函数去掉null值,并且取小数点2位),作为临时表tmp
--第二步:将student表和tmp表进行关联,筛选出平均成绩avg_score大大于60的信息。
select
    student.s_id,
    student.s_name,
    tmp.avg_score
from student
join
(
    select
        s_id,
        round(avg(s_score),2) avg_score
    from score
    group by s_id
)tmp
on student.s_id=tmp.s_id
where tmp.avg_score>=60;

--方式3:将student表和score表进行关联,通过s_id和s_name进行分组,在分组的基础上查询每个分组的平均成绩大于等于60的学生信息和平均成绩
--这个比较简单,而且容易想到
select
    student.s_id,
    student.s_name,
    round(avg(s.s_score),2)
from student
join score s on student.s_id = s.s_id
group by student.s_id, student.s_name
having avg(s.s_score)>=60;
  1. 查询平均成绩小于60分的同学的学生编号和学生姓名和平均成绩:(包括有成绩的和无成绩的)
--方式1:将成绩小于60以及没有成绩的查询结果union all
--提示:union all去重;union不去重
--有成绩大于60的
select
    student.s_id,
    student.s_name,
    avg_score
from student
join
(
select
    tmp.s_id,
    tmp.avg_score
 from (
          select s_id,
                 round(avg(s_score), 2) avg_score
          from score
          group by s_id
      ) tmp
 where tmp.avg_score < 60
)t1
on student.s_id=t1.s_id
union all
(
--没有成绩的
    select
        student.s_id,
        student.s_name,
        0 avg_score
    from student
    left join score s on student.s_id = s.s_id
    where student.s_id not in
    (
        select s_id
        from score
        group by s_id
    )
)
1.上传tar包 2.解压 tar -zxvf hive-1.2.1.tar.gz 3.安装mysql数据库 推荐yum 在线安装 4.配置hive (a)配置HIVE_HOME环境变量 vi conf/hive-env.sh 配置其中的$hadoop_home (b)配置元数据库信息 vi hive-site.xml 添加如下内容: javax.jdo.option.ConnectionURL jdbc:mysql://localhost:3306/hive?createDatabaseIfNotExist=true JDBC connect string for a JDBC metastore javax.jdo.option.ConnectionDriverName com.mysql.jdbc.Driver Driver class name for a JDBC metastore javax.jdo.option.ConnectionUserName root username to use against metastore database javax.jdo.option.ConnectionPassword hadoop password to use against metastore database 5.安装hive和mysq完成后,将mysql的连接jar包拷贝到$HIVE_HOME/lib目录下 如果出现没有权限的问题,在mysql授权(在安装mysql的机器上执行) mysql -uroot -p #(执行下面的语句 *.*:所有库下的所有表 %:任何IP地址或主机都可以连接) GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' IDENTIFIED BY 'root' WITH GRANT OPTION; FLUSH PRIVILEGES; 6. Jline包版本不一致的问题,需要拷贝hive的lib目录中jline.2.12.jar的jar包替换掉hadoop中的 /home/hadoop/app/hadoop-2.6.4/share/hadoop/yarn/lib/jline-0.9.94.jar 启动hive bin/hive ---------------------------------------------------------------------------------------------------- Hive几种使用方式: 1.Hive交互shell bin/hive 2.Hive JDBC服务(参考java jdbc连接mysql) 3.hive启动为一个服务器,来对外提供服务 bin/hiveserver2 nohup bin/hiveserver2 1>/var/log/hiveserver.log 2>/var/log/hiveserver.err & 启动成功后,可以在别的节点上用beeline去连接 bin/beeline -u jdbc:hive2://mini1:10000 -n root 或者 bin/beeline ! connect jdbc:hive2://mini1:10000 4.Hive命令 hive -e ‘sql’ bin/hive -e 'select * from t_test'
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值