1,SELECT查询
SELECT
语句, 称为查询语句
SELECT column(列名), another_column, … FROM mytable(表名);
如下表 Movies :
Id | Title | Director | Year | Length_minutes |
1 | Toy Story | John Lasseter | 1995 | 81 |
2 | A Bug's Life | John Lasseter | 1998 | 95 |
3 | Toy Story 2 | John Lasseter | 1999 | 93 |
4 | Monsters, Inc. | Pete Docter | 2001 | 92 |
5 | Finding Nemo | Finding Nemo | 2003 | 107 |
6 | The Incredibles | Brad Bird | 2004 | 116 |
7 | Cars | John Lasseter | 2006 | 117 |
8 | Ratatouille | Brad Bird | 2007 | 115 |
9 | WALL-E | Andrew Stanton | 2008 | 104 |
10 | Up | Pete Docter | 2009 | 101 |
11 | Toy Story 3 | Lee Unkrich | 2010 | 103 |
12 | Cars 2 | John Lasseter | 2011 | 120 |
13 | Brave | Brenda Chapman | 2012 | 102 |
14 | Monsters University | Dan Scanlon | 2013 | 110 |
我们需要查询所有:
SELECT * FROM movies;
- 【简单查询】找到所有电影的名称
title
- 【简单查询】找到所有电影的导演
- 【简单查询】找到所有电影的名称和导演
- 【简单查询】找到所有电影的名称和上映年份
- 【简单查询】找到所有电影的所有信息
- 【简单查询】找到所有电影的名称,Id和播放时长
SELECT title FROM movies;
SELECT Director FROM movies;
SELECT title,Director FROM movies;
SELECT title,year FROM movies;
SELECT * FROM movies;
SELECT title,id,Length_minutes FROM movies;
2,查询条件
平常我们的查询需要就是需要条件来进行查询,需要使用的就是where
SELECT (列) FROM WHERE (条件);
- 【简单条件】找到
id
为6的电影 - 【简单条件】找到在2000-2010年间
year
上映的电影 - 【简单条件】找到不是在2000-2010年间
year
上映的电影 - 【简单条件】找到头5部电影
- 【简单条件】找到2010(含)年之后的电影里片长小于两个小时的片子
SELECT * FROM movies;
SELECT * FROM movies where id=6;
SELECT * FROM movies where year<=2010 and year>=2000;
SELECT * FROM movies where year>2010 or year<2000;
SELECT * FROM movies where id<6;
SELECT * FROM movies where year>=2010 and Length_minutes<120;
增加查询条件,比如模糊查询,就要使用到like,需要使用到%通配符:
- 【复杂条件】找到所有
Toy Story
系列电影 - 【复杂条件】找到所有
John Lasseter
导演的电影 - 【复杂条件】找到所有不是
John Lasseter
导演的电影 - 【复杂条件】找到所有电影名为
"WALL-"
开头的电影 - 【复杂条件】有一部98年电影中文名《虫虫危机》请给我找出来
SELECT * FROM movies;
SELECT * FROM movies where title like "%Toy Story%";
SELECT * FROM movies where Director="John Lasseter";
SELECT * FROM movies where Director!="John Lasseter";
SELECT * FROM movies where Title like "WALL%";
SELECT * FROM movies where year like "%98";
3,查询结果过滤和排序
DISTINCT语法,使用这个gaugau关键字来指定某个(多个)属性列唯一返回.
SELECT DISTINCT (属性列名) FROM (表) WHERE (条件);
因为 DISTINCT
语法会直接删除重复的行, 我们还会学习 GROUP BY
语句.
然后就是对查询的结果按照要求进行排序:
可以使用ORDER BY (属性列名)排序的语法来让结果按照一个或者多个属性做排序.正序:ASD,倒序:DESC
然后通过LIMIIT选取部分结果:
SELECT DISTINCT (属性列名)
FROM (表)
WHERE (条件)
ORDER BY column ASC/DESC
LIMIT (个数) OFFSET (开始位置);
- 【结果排序】按导演名
排重
列出所有电影(只显示导演),并按导演名正序排列
SELECT distinct Director
FROM movies
order by director;
- 【结果排序】列出按上映年份
最新
上线的4部电影
SELECT distinct *
FROM movies
order by year desc
limit 4;
- 【结果排序】按电影名字母序
升序
排列,列出前5部电影
SELECT distinct *
FROM movies
order by Title
limit 5;
- 【结果排序】按电影名字母序升序排列,列出上一题
之后
的5部电影
SELECT distinct *
FROM movies
order by Title
limit 5 OFFSET 5;
- 【结果排序】如果按片长排列,John Lasseter导演导过片长第3长的电影是哪部,列出名字即可
SELECT distinct Title
FROM movies
order by Length_minutes
limit 1 OFFSET 3;
4,多表联合查询
使用JSINS进行多表联合查询
主键:一般关系数据库中,都会有一个属性设置为主键,借助主键,我们可以把两个表中具有相同主键ID的数据连接起来,使用到JOIN关键字.
SELECT (属性列名)
FROM (主表)
INNER JOIN(要连接的表)
ON mytable.id = another_table.id (主键连接,两个相同的连成1条)
WHERE (条件)
ORDER BY column, … ASC/DESC
LIMIT (个数)OFFSET (开始位置);
例题:
Table:Movies
Id | Title | Director | Year | Length_minutes |
1 | Toy Story | John Lasseter | 1995 | 81 |
2 | A Bug's Life | John Lasseter | 1998 | 95 |
3 | Toy Story 2 | John Lasseter | 1999 | 93 |
4 | Monsters, Inc. | Pete Docter | 2001 | 92 |
5 | Finding Nemo | Finding Nemo | 2003 | 107 |
6 | The Incredibles | Brad Bird | 2004 | 116 |
7 | Cars | John Lasseter | 2006 | 117 |
8 | Ratatouille | Brad Bird | 2007 | 115 |
9 | WALL-E | Andrew Stanton | 2008 | 104 |
10 | Up | Pete Docter | 2009 | 101 |
11 | Toy Story 3 | Lee Unkrich | 2010 | 103 |
12 | Cars 2 | John Lasseter | 2011 | 120 |
13 | Brave | Brenda Chapman | 2012 | 102 |
14 | Monsters University | Dan Scanlon | 2013 | 110 |
Boxoffice :
Movie_id | Rating | Domestic_sales | International_sales |
5 | 8.2 | 380843261 | 555900000 |
14 | 7.4 | 268492764 | 475066843 |
8 | 8 | 206445654 | 417277164 |
12 | 6.4 | 191452396 | 368400000 |
3 | 7.9 | 245852179 | 239163000 |
6 | 8 | 261441092 | 370001000 |
9 | 8.5 | 223808164 | 297503696 |
11 | 8.4 | 415004880 | 648167031 |
1 | 8.3 | 191796233 | 170162503 |
7 | 7.2 | 244082982 | 217900167 |
10 | 8.3 | 293004164 | 438338580 |
4 | 8.1 | 289916256 | 272900000 |
2 | 7.2 | 162798565 | 200600000 |
13 | 7.2 | 237283207 | 301700000 |
- 【联表】找到所有电影的国内
Domestic_sales
和国际销售额
SELECT *
FROM movies
inner join Boxoffice
on Movies.id=Boxoffice.Movie_id;
- 【联表】找到所有国际销售额比国内销售大的电影
SELECT *
FROM movies
inner join Boxoffice
on Movies.id=Boxoffice.Movie_id
where International_sales>Domestic_sales;
- 【联表】找出所有电影按市场占有率
rating
倒序排列
SELECT *
FROM movies
inner join Boxoffice
on Movies.id=Boxoffice.Movie_id
order by rating desc;
- 【联表】每部电影按国际销售额比较,排名最靠前的导演是谁,国际销量多少
SELECT Director,International_sales
FROM movies
inner join Boxoffice
on Movies.id=Boxoffice.Movie_id
order by International_sales desc
limit 1;
5,外连接
INNER JOIN
只会保留两个表都存在的数据,意味着一些数据的丢失.
于是我们就要介绍左连接LEFT JOIN
,右连接RIGHT JOIN
和 全连接FULL JOIN.
SELECT (属性列名)
FROM (主表)
INNER/LEFT/RIGHT/FULL JOIN (需要连接的表)
ON mytable.id = another_table.matching_id
WHERE (条件)
ORDER BY (排序的条件), … ASC/DESC
LIMIT (个数) OFFSET (位置);
例子:
Employees :
Role | Name | Building | Years_employed |
Engineer | Becky A. | 1e | 4 |
Engineer | Dan B. | 1e | 2 |
Engineer | Sharon F. | 1e | 6 |
Engineer | Dan M. | 1e | 4 |
Engineer | Malcom S. | 1e | 1 |
Artist | Tylar S. | 2w | 2 |
Artist | Sherman D. | 2w | 8 |
Artist | Jakob J. | 2w | 6 |
Artist | Lillia A. | 2w | 7 |
Artist | Brandon J. | 2w | 7 |
Manager | Scott K. | 1e | 9 |
Manager | Shirlee M. | 1e | 3 |
Manager | Daria O. | 2w | 6 |
Engineer | Yancy I. | null | 0 |
Artist | Oliver P. | null | 0 |
Buildings :
Building_name | Capacity |
1e | 24 |
1w | 32 |
2e | 16 |
2w | 20 |
- 【复习】找到所有有雇员的办公室(
buildings
)名字
SELECT distinct Building_name
FROM Employees
left join Buildings
on Employees.Building=Buildings .Building_name
where Building_name is not null;
- 【复习】找到所有办公室里的所有角色(包含没有雇员的),并做唯一输出(
DISTINCT
)
SELECT distinct role,Building_name
FROM Buildings
left join Employees
on Employees.Building=Buildings .Building_name
- 【难题】找到所有有雇员的办公室(
buildings
)和对应的容量
SELECT distinct Building_name,Capacity
FROM Buildings
left join Employees
on Employees.Building=Buildings .Building_name
where building
6,NULL关键字
null还是无的意思,因为我们可能要考虑到某个属性为null 的情况,这种特殊性质会导致编写SQL的复杂性,尽量减少null的使用.
如果多表连接,A和B有数据差异, 必须要null来填充,可以用is 努力了或者是 is not null 来选这个字段是否等于null.
还是根据5中的员工两个表,举例子:
- 【复习】找到雇员里还没有分配办公室的(列出名字和角色就可以)
SELECT name,role
from Employees
left join Buildings
on Employees.Building=Buildings .Building_name
where building is null;
- 【难题】找到还没有雇员的办公室
SELECT building_name
from Buildings
left join Employees
on Employees.Building=Buildings .Building_name
where building is null;
7,在查询中使用表达式
顾名思义,就是对数字或者字符串的运算
SELECT particle_speed / 2.0 AS half_particle_speed (对结果做了一个除2)
FROM physics_data
WHERE ABS(particle_position) * 10.0 >500
(条件要求这个属性绝对值乘以10大于500);
AS就是取别名
例子:
Movies :
Id | Title | Director | Year | Length_minutes |
1 | Toy Story | John Lasseter | 1995 | 81 |
2 | A Bug's Life | John Lasseter | 1998 | 95 |
3 | Toy Story 2 | John Lasseter | 1999 | 93 |
4 | Monsters, Inc. | Pete Docter | 2001 | 92 |
5 | Finding Nemo | Finding Nemo | 2003 | 107 |
6 | The Incredibles | Brad Bird | 2004 | 116 |
7 | Cars | John Lasseter | 2006 | 117 |
8 | Ratatouille | Brad Bird | 2007 | 115 |
9 | WALL-E | Andrew Stanton | 2008 | 104 |
10 | Up | Pete Docter | 2009 | 101 |
11 | Toy Story 3 | Lee Unkrich | 2010 | 103 |
12 | Cars 2 | John Lasseter | 2011 | 120 |
13 | Brave | Brenda Chapman | 2012 | 102 |
14 | Monsters University | Dan Scanlon | 2013 | 110 |
Boxoffice :
Movie_id | Rating | Domestic_sales | International_sales |
5 | 8.2 | 380843261 | 555900000 |
14 | 7.4 | 268492764 | 475066843 |
8 | 8 | 206445654 | 417277164 |
12 | 6.4 | 191452396 | 368400000 |
3 | 7.9 | 245852179 | 239163000 |
6 | 8 | 261441092 | 370001000 |
9 | 8.5 | 223808164 | 297503696 |
11 | 8.4 | 415004880 | 648167031 |
1 | 8.3 | 191796233 | 170162503 |
7 | 7.2 | 244082982 | 217900167 |
10 | 8.3 | 293004164 | 438338580 |
4 | 8.1 | 289916256 | 272900000 |
2 | 7.2 | 162798565 | 200600000 |
13 | 7.2 | 237283207 | 301700000 |
- 【计算】列出所有的电影ID,名字和销售总额(以百万美元为单位计算)
SELECT id,title, (b.Domestic_sales+b.International_sales)/1000000 AS sales
FROM movies m
left join Boxoffice b
on m.id=b.Movie_id
- 【计算】列出所有的电影ID,名字和市场指数(
Rating
的10倍为市场指数)
SELECT id,title, b.Rating*10 AS sales
FROM movies m
left join Boxoffice b
on m.id=b.Movie_id
- 【计算】列出所有偶数年份的电影,需要电影ID,名字和年份
SELECT Id,Title,Year
FROM Movies
WHERE Year%2=0
- 【难题】John Lasseter导演的每部电影每分钟值多少钱,告诉我最高的3个电影名和价值就可以
SELECT title, (b.Domestic_sales+b.International_sales)/m.length_minutes AS sales
FROM movies m
left join Boxoffice b
on m.id=b.Movie_id
where Director="John Lasseter"
order by sales desc
limit 3
8,在查询中统计
对数据进行计数,求平均值等操作.
SELECT AGG_FUNC(column_or_expression) AS aggregate_description, …
FROM mytable
WHERE constraint_expression;
常用函数:
- count:计数----count(*)统计数据的行数,count(colnmn)统计column非null的行数
- min(column):找column最小的一行,与之相反就是max
- AUG(column):找column所有行取平均值
- SUM(column):对column所有行求和
其次,group by 可以对某个数据进行分组,可以组合,对分组进行统计.
SELECT 属性列名,count(*)
FROM 主表
where 条件 group by 属性名;
例子:
Employees :
Role | Name | Building | Years_employed |
Engineer | Becky A. | 1e | 4 |
Engineer | Dan B. | 1e | 2 |
Engineer | Sharon F. | 1e | 6 |
Engineer | Dan M. | 1e | 4 |
Engineer | Malcom S. | 1e | 1 |
Artist | Tylar S. | 2w | 2 |
Artist | Sherman D. | 2w | 8 |
Artist | Jakob J. | 2w | 6 |
Artist | Lillia A. | 2w | 7 |
Artist | Brandon J. | 2w | 7 |
Manager | Scott K. | 1e | 9 |
Manager | Shirlee M. | 1e | 3 |
Manager | Daria O. | 2w | 6 |
Engineer | Yancy I. | null | 0 |
Artist | Oliver P. | null | 0 |
- 【统计】找出就职年份最高的雇员(列出雇员名字+年份)
SELECT name,MAX(Years_employed)
FROM employees;
- 【分组】按角色(
Role
)统计一下每个角色的平均就职年份
SELECT Role,AVG(Years_employed)
FROM Employees GROUP by Role;
- 【分组】按办公室名字总计一下就职年份总和
SELECT Building,sum(Years_employed)
FROM Employees group by Building;
- 【难题】每栋办公室按人数排名,不要统计无办公室的雇员
SELECT Building,count(*)
FROM Employees
where building is not null group by Building;
在group语法之中,我们知道数据库是对数据先做where操作,然后分组统计,但是我们如果要分组完再筛选出数据,就要使用having语法(HAVING
和 WHERE
语法一样,只不过作用的结果集不一样. )
SELECT group_by_column, AGG_FUNC(column_expression) AS aggregate_result_alias, …
FROM mytable
WHERE condition
GROUP BY column
HAVING group_condition;
例子:
- 【统计】统计一下Artist角色的雇员数量
SELECT COUNT() FROM Employees WHERE Role='Artist';
- 【分组】按角色统计一下每个角色的雇员数量
SELECT COUNT(*),role
FROM Employees
group by role;
- 【分组】算出Engineer角色的就职年份总计
SELECT SUM(Years_employed)
FROM Employees
where Role="Engineer";
- 【难题】按角色分组算出每个角色按有办公室和没办公室的统计人数(列出角色,数量,有无办公室,注意一个角色如果部分有办公室,部分没有需分开统计)
SELECT Role,CASE when Building is NOT NULL
THEN '1' ELSE '0' END
AS Have,COUNT(Name)
FROM Employees
GROUP BY Role,Have;
由此可见having用的真的少(这些例题都没用上)
9,查询执行顺序
介绍了所有查询的相关语法,大概梳理一下所有语法的执行顺序:
- from和join
- where
- group by
- having
- select
- distinct
- order by
- limit/offset
不是每一个SQL语句都要用到所有的句法,但灵活运用以上的句法组合和深刻理解SQL执行原理将能在SQL层面更好的解决数据问题,而不用把问题都抛给程序逻辑.