数据库学习笔记（二）

最新推荐文章于 2021-04-28 15:53:54 发布

游到海水变蓝

最新推荐文章于 2021-04-28 15:53:54 发布

阅读量293

点赞数

文章标签： mysql

本文链接：https://blog.csdn.net/ChangHongYin/article/details/115217692

版权

select distinct() from table 1 left join table 2 on 1.id = 2.id
where id > 5 / last_name like’%a%’
group by id
having sum(price)>400 #having条件筛选与where一样，但having可以加聚合函数筛选
order by salary desc
limit a b;
DQL语言的学习（数据查询语言）
1.1基础查询（非常用语句与函数）
非常用语句——
Delete语句：DELETE FROM table_name WHERE condition;
Update语句：update table1 set 更新条件+where从句
if条件语句：if（exprx = expr1，expr2，expr3），如果expr1等于exprx是true则
返回expr2，否则返回expr3，这里面是与exprx比较，也可以是其他运算符
流程控制case when end条件语句：
update salary set sex = case sex when ‘m’ then ‘f’ else ‘m’ end，与if条件语句一
样都是可以当做一个实数跟在=后面
case when end函数的总结：
select e,emp_no, case when btype=1 then salary0.1 else salary0.2 end as
bonus from employees;
触发器trigger：
非常用函数——
distinct函数：select distinct id from table 1;去重某行
concat函数：select concat(‘last_name’,‘first_name’) as 姓名;连接字符
select concat(last_name,"’",first_name) as name
其中的连接符要用“”双引号括起来
ifnull函数：select ifnull(department_id,0) from table2; 判断第一个参数是否为
null，如果是的话返回第二个参数0，可设置第二个参数为null判别是否为空
length函数：select length("if_not_and_your“)查询字节长短为15
substr函数：order by substr(first_name,-2,2);从倒数第二开始取两位
replace函数：replace(object,search,replace) 把object中出现search的全部替换
为replace
聚合函数：min、max、count、sum、avg函数
limit函数：limit y读取y条；limit x y跳过x条读取y条；limit y offset x同上；
Datediff函数：Datediff(time1, time2) = 1判别时间相减是否为1，常接在where子
句后面
运算符函数：5 div 2 = 2整除；5 mod 2 = 1取余；round(1.5)=2四舍五入
窗口函数over（OLAP函数）：
select , rank() over (partition by 班级 order by成绩 desc) as rank1 from salary;
over (partition by #partition子句可省略，不指定分组 order by )
专用窗口函数，rank、dense_rank、row_number等
聚合函数，如sum、avg、count、max、min等
窗口函数是对where和group by处理后的结果进行操作，partition by可以省略，是分组使用
rank：排序是123336，也就是说重复的会重复，但次序累加
dense_rank：排序是123334，重复的会重复，次序递进
row_number：排序是123456，无所谓重复的
以上是rank这种排序类型窗口函数的应用，但其实rank还可以替换为聚合函数：sum/max/min/count/avg——
select , sum(成绩) over(order by 学号) as current_sum
这条语句的意思是在原有的列表右边生成新的一列：current_sum，这列依据学号排序（注意这里不是按照成绩排序，而是要针对一个主键id进行排序），并对成绩进行汇总，也就是如此第次叠加，其他的聚合函数也是如此
id score current_sum
1 33 33
2 12 12+33
3 15 12+33+15
strftime函数：strftime(’%Y-%m’,日期)，可将年月日转化为年月
count()/count(字段)/count(1)：这是count函数的一些用处，
count()的查询结果只会是一列所有的行数，不会忽略null值，
count(1)与之查询一样，但查询效率在样本小情况下较高
count(字段）则会单查询这一列的行数，如果有的是null值则忽略
like函数的各种符号：
%是包含任意数量的任意字符串
_是任何单个任意字符
[]是指定这个范围区间
是指任意数量的任意字符
1.2条件查询
主要为模糊查询like、between and、isnull、in四个语句，都跟在where后面
select * from employees where last_name like ‘%a%’;like函数紧跟在where后面
select id from table1 where id between 100 and 120;查询区间
1.3分组查询
group by子句在order by之前
1.4排序查询☆
order by子句在where之后
1.5连接查询☆
内连接：
select e.last_name from jobs j, employees e where e.job_id = j.job_id
select e.last_name from jobs j inner join employees e on j.job_id = e.job_id
以上两个效果都是一样的，都是查询两个表的公共部分
外连接：left join，right join取两个表的公共部分，公共部分覆盖在左表上，其余
没有覆盖到的右表的部分一律为null
外连接展示：
id kemu id score
1 语文 2 70
2 数学 3 69
3 英语 4 53
select * from kemu k left join score s k.id = s.id;
id kemu id1 score
1 语文 null null
2 数学 2 70
3 英语 3 69
发现：首先主表与副表的id重叠部分只有2、3，则在主表基础上增添2、3对应的
副表的所有特征，而主表id非2、3的对应副表的特征全为null
两次连接：
SELECT employee_id, city, department_name FROM employees e
JOIN departments d ON d.department_id = e.department_id
JOIN locations l ON d.location_id = l.location_id
自连接：
select a.Name as Employee
from Employee a inner join Employee b on a.ManagerId = b.Id
and a.Salary > b.Salary;
含义是第一个条件：a.ManagerId = b.Id，利用自连接直接判别a表的mid和b表的id相比较
（不是一一对应的比较），然后第二个条件：a.Salary > b.Salary，在id相等的情况下比较
两个id分别对应的salary谁大，这些都必须通过自连接来比较
交叉连接（Cross Join）：如果查询语句不带where条件子句，则返回两个表的笛卡尔
积，返回结果的行数是两个表行数的乘积，例如T_class和T_id，则为44=16行数据
全连接（Full Join）：返回结果是左表和右表内的所有行
1.6子查询☆
子查询跟在where语句后面：select * from employees where salary>(select
salary from employees where last_name = ‘Abel’)
1.7分页查询
主要使用limit实现分页查询，例如select * from employees limit 0, 10;从第一条
开始查询10条数据
1.8union联合查询
对两个select查询语句使用，union是结合两个结果（去重），union all不去重
DML语言的学习（数据操作语言）
2.1插入语句
alter table author add column annual double;
2.2修改语句
alter table author modify column annual timestamp;
alter table author change column annual pubDate datetime;
alter table author rename to author2;
2.3删除语句
alter table author drop column annual;
DDL语言的学习（数据定义语言）
3.1库和表的管理☆
库的创建：create database if not exists books;
库的删除：drop database if exists books;
表的创建：create table book (,);
表的修改：见DML语言的学习
表的删除：drop table if exists book;
3.2常见数据类型介绍
3.3常见约束☆
添加约束：alter table stuinfo modify column stuname varchar(20) not null;
添加外键：alter table stuinfo add constraint fk_stuinfo_major foreign
key(majorid) references major(id)
在表stuinfo的majorid列——关联表major的id列
TCL语言的学习（工具命令语言）
4.1事务和事务处理☆
事务：事务由单独单元的一个或多个SQL语句组成，在这个单元中，每个MySQL
语句都是相互依赖的。一个或一组SQL语句组成一个执行单元，这个执行单元要
么全部执行要么不执行
事务的ACID属性
其他操作处理与过程
5.1视图☆
含义：虚拟表，和普通的表一样，通过表动态生成的数据，具备临时性,仅保存
SQL逻辑，并未保存查询结果
create view v1
as
select stuname from stuinfo s inner join major m on s.‘majorid’ = m.‘id’;
5.2存储过程和函数
5.3流程控制结构
5.4变量定义
Hive语句的操作与详解
1.DDL数据定义
创建数据库在指定位置：create database db_hive location ‘/db_hive2.db’;
查询指定数据库：show database like 'db_hive’;
修改数据库：alter database db_hive set dbproperties(‘createtime’=‘20170830’);
彻底删除数据库：drop database db_hive2;
内部表和外部表：内部表的删除会直接删除掉，外部表的删除只会删除掉元数据信息
将数据导入外部表：load data local inpath '数据存储地址‘ into table default.dept;
查询表的类型：desc formatted student2;
修改内部表为外部表：alter table student2 set tb1properties(‘external’=‘true’);
修改外部表为内部表：alter table student2 set tb1properties(‘external’=‘true’);
分区表：分区表是HDFS文件系统的一个独立文件夹，分区即分目录，将大数据集分割为小的数据集
创建分区表：
create table dept_partition(deptno int, dname string, loc string) partitioned by (month string)
加载数据进入分区表：
load data local inpath ‘数据地址’ into table default.dept_partition partition(month=‘201709’);
查询分区表的数据：
select * from dept_partition where month=‘201709’;
增加分区：
alter table dept_partition add partition(month=‘201706’);
减少分区：
alter table dept_partition drop partition(month=‘201707’);
显示分区数量：
show partitions dept_partition;
分区的结构内容：一级分区即(month=‘201706’)，二级分区即(month=‘201706’, day=‘13’)
二级分区相当于在(month=‘201706’)这个大前提下产生二级数据表
2.DML

《错题集锦》
知识点1：limit与offset函数在求最值顺序查询时候的使用
limit y 分句表示: 读取 y 条数据
limit x, y 分句表示: 跳过 x 条数据，读取 y 条数据
limit y offset x 分句表示: 跳过 x 条数据，读取 y 条数据
limit pages,pageSize表示跳过page，读取pagesize（不用加where语句引导，而且表明pages表明返回的数据，pageSize每页的行数）

例如select * from table limit 2,1;——含义是在列表排列的数据中跳过排序1、2，直接取序号3的数据
select * from table limit 2 offset 1;——含义是从第1条（不包括第一条）开始取出两条数据
再例如select * from table order by salary desc limit 1,1;——这个含义是从已经进行了降序排列的列表中跳过最大的，取第二大的数据
select emp_no,salary
from salaries
where to_date=‘9999-01-01’
order by salary desc limit 1,1;
知识点2：包含没有编号、部门等新人问题的处理方法
一般题目包含“包括暂时没有分配部门的员工”这类新人问题的时候，应该想到这一般是左右连接查询的操作（left join）
left join的核心就是寻找两个表的公共特征，通过on公共特征将两个表连接在一起，如果需要实现两个left join，则可能其中一个表拥有其他两个表的公共特征，这个表作为中间表。
select em.last_name,em.first_name,dp.dept_name
from employees em
left join dept_emp de on em.emp_no = de.emp_no
left join departments dp on de.dept_no = dp.dept_no;
知识点3：inner join连接和left join连接的区异
　　inner join(等值连接) 只返回两个表中联结字段相等的行
对于inner join来说，两张表会存在一个共同特征，一般为序号no，那么表A和表B就会存在相同的序号，比如共同都有1,2,3三个序号，则新生成的表就有一列序号1,2,3，那么对应的其他特征就可以映射过来。
left join(左联接) 返回包括左表中的所有记录和右表中联结字段相等的记录
对于left join来说，则全面保留左边的表A，例如表A有序号1，而表B没有，则表B此行所有特征显示NULL，之后的序号2,3,4也均是一一对应排列。
知识点4：找寻某张表内对应另一张表的数据“不存在”的问题
一般这种不存在性的问题，尤其是涉及到两张表的连接，有两种思路解决：
1、left join+is null方法
select f.film_id, f.title
from film f left join film_category fc on f.film_id=fc.film_id
where fc.category_id is null
2、inner join+where+not in()方法
select film.film_id,film.title
from film
where film_id not in
(select f.film_id from film f inner join film_category fc on f.film_id=fc.film_id);
知识点5：在 SQLite 中，用 INSERT OR IGNORE 来插入记录，或忽略插入与表内UNIQUE字段都相同的记录，即插入数据过程中忽略已有数据
各种插入、替换、忽视的插入型语句：
insert into names(name, age) values(“小明”, 23);插入不存在数据，对于重复数据报错
insert ignore into names(name, age) values(“小明”, 24);插入不存在数据，对于重复数据忽略
replace into names(name, age) values(“小明”, 25);插入不存在数据，对于重复数据替换
在sqlites中语法有insert or ignore into意义与insert ignore into一致，都是忽略
补充：插入一个表的数据到另一个表内（前表数据完全填入后表内，不存在冲突性问题）：
insert into actor_name
select first_name,last_name
from actor;
知识点6：如何利用length函数和replace函数计算字符串某字符频次
select (length(“10,A,B”)-length(replace(“10,A,B”, “,”, “”))) as cnt;
发现没有计算频次，只需要使用replace函数，将‘，’全部替换到，则可以计算出‘，’的频次
replace(object,search,replace) 把object中出现search的全部替换为replace
知识点6补充：主要常用函数与语句补遗
1.substr函数
order by substr(first_name,-2,2);从倒数第二开始取两位
2.repalce函数
replace(object,search,replace) 把object中出现search的全部替换为replace
3.concat函数
concat(last_name,’_’,first_name) from employees;将两个字符用字符拼接在一起
sqlites语法中，group_cancat聚合函数的用法（需要同时与group by使用才能生效）
group_cancat(x,y)，x是查找的列，y是分隔符，如果不写则默认为逗号
4.update更新语句
要注意update更新语句是对整张表进行更新
update table1 set 更新的条件（通常为x = x*1.1之类）+where从句
5.强调某个数不在查找范围内的<>、!=、not in
<>大于小于+子查询语句 where salary>(select min(salary) from employees) and
where salary<(select max(salary) from employees);
!=+子查询语句 where salary != (select min(salary) from employees);
not in +子查询语句 where salary not in (select min(salary) from employees);
6.exists函数和in函数，not exists函数和not in函数
where not exists(子查询从句)与not in相较not exists更强调一种不存在的状态，而not in强调不在某种条件之内
select * from employees e
left join dept_emp d on d.emp_no = e.emp_no
where d.emp_no is null;

SELECT * FROM employees e
WHERE NOT EXISTS(SELECT * FROM dept_emp d WHERE e.emp_no=d.emp_no);

一个小案例：非manager员工信息
where de,emp_no not in (select emp_no from dept_manager);

知识点7：条件筛选
满足条件一是如此排列，满足条件二又是其他的排列，条件二的出现在不满足条件一的基础上面
案例：对所有员工的当前(to_date=‘9999-01-01’)薪水按照salary进行按照1-N的排名，相同salary并列且按照emp_no升序排列
select emp_no,salary,(select count(distinct salary) from salaries s2
where to_date = ‘9999-01-01’ and s1.salary<=s2.salary) rank
from salaries s1
where s1.to_date = ‘9999-01-01’
order by s1.salary desc,s1.emp_no asc;
按照条件一排列，然后相同并列的又按照条件二排列：order by 条件1 desc,条件2 desc;
额外增加一列1-N对应的序号Rank：(select count(distinct salary) from salaries s2 where to_date = ‘9999-01-01’ and s1.salary<=s2.salary) rank
知识点8：将查询结果作为新表再去查询
select t1.emp_no,t2.emp_no from (select salary from salaries left join…) t1,(select salary from salaries left join…) t2