Data Analysis (3): SQL from Beginner to Practice

Latest recommended article published on 2024-04-16 09:22:03

R.renee

Article tags: data analysis, sql, database

Article link: https://blog.csdn.net/SusicRuan/article/details/131763716

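This article introduces basic SQL syntax, including the select, from, where, group by, having, order by and limit clauses, as well as more advanced usage such as aggregate functions, window functions, table joins and subqueries. Worked examples show how SQL is applied in data analysis, for instance computing the average next-day retention rate, summarizing users' practice-question activity, and computing attendance rates per subject.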

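I have recently been reviewing the data analysis material I learned earlier. If anything here is wrong or poorly put, corrections are welcome. I mainly followed Dai Shixiong's (戴师兄) tutorial, which is very well taught; it is on Bilibili: https://www.bilibili.com/video/BV1ZM4y1u7uF?p=1&vd_source=69f057a466718e60e53438e3b563d7f8
This part also adds some material from Mosh's advanced SQL course, which is very detailed. So far I have only covered the SQL querying used in data analysis; the data engineering part is still to be studied. If you are interested, have a look: SQL进阶教程 | 史上最易懂SQL教程！10小时零基础成长SQL大师！！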

Basic syntax

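Syntax structure and execution order of a SQL query:
Syntax structure: select–from–where–group by–having–order by–limit
Execution order: from–where–group by–having–select–order by–limit
select is evaluated before order by, which is why order by can sort on an alias defined in select.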
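To make the ordering concrete, here is a minimal sketch that runs one query containing every clause. It assumes Python's built-in sqlite3 module with an in-memory database, and the table contents are made-up toy values, not the real world table. It shows that where filters rows before grouping, having filters groups afterwards, and limit trims the result only after it has been sorted.

```python
import sqlite3

# Hypothetical toy data; names and numbers are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE world (name TEXT, continent TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO world VALUES (?, ?, ?)",
    [("A", "Asia", 50), ("B", "Asia", 5), ("C", "Europe", 30),
     ("D", "Europe", 20), ("E", "Africa", 8)],
)

rows = conn.execute(
    """
    SELECT continent, COUNT(name) AS num
    FROM world
    WHERE population >= 10      -- row filter: B and E are dropped before grouping
    GROUP BY continent
    HAVING COUNT(name) >= 1     -- group filter: applied to the grouped counts
    ORDER BY num DESC
    LIMIT 1                     -- keeps the first row of the sorted result
    """
).fetchall()
print(rows)
```

Africa never appears in the output because its only row is removed by where before any group is formed.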

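select & from: select field_name from table_name

select field_name determines which fields the query finally displays
from table_name specifies the data source the query reads from

Use distinct in select to remove duplicates, and * to select every field of the table.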

SELECT name, continent, population FROM world

SELECT * from world

SELECT distinct continent FROM world

Using computed fields in select

SELECT name, (gdp/population) as per_capita_GDP FROM world
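The distinct and computed-field queries above can be tried without a database server. The sketch below assumes Python's sqlite3 module and a tiny made-up world table; gdp is declared REAL here only because SQLite would otherwise truncate integer division, which is a SQLite detail and not something the original states.

```python
import sqlite3

# Hypothetical toy rows standing in for the world table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE world (name TEXT, continent TEXT, population INTEGER, gdp REAL)")
conn.executemany(
    "INSERT INTO world VALUES (?, ?, ?, ?)",
    [("A", "Asia", 100, 5000), ("B", "Asia", 200, 4000), ("C", "Europe", 50, 1000)],
)

# distinct collapses the two Asia rows into one value.
continents = conn.execute(
    "SELECT DISTINCT continent FROM world ORDER BY continent"
).fetchall()

# The computed field is evaluated row by row and named with an alias.
per_capita = conn.execute(
    "SELECT name, (gdp / population) AS per_capita_GDP FROM world ORDER BY name"
).fetchall()

print(continents)
print(per_capita)
```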

where

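select field_name 
from table_name 
[where expression]

where expression restricts the rows returned to those that satisfy the condition
The where clause is optional; it is used to keep only the rows that match the query condition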

SQL operators at a glance

Tip: a null value (NULL) is different from 0, and also different from the string 'null'
Example

SELECT name, (gdp/population) as per_capita_GDP FROM world
where population >= 200000000
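The tip about NULL is easy to trip over in a where clause, so here is a small sketch of it. It assumes Python's sqlite3 module and three made-up rows, one of which has a NULL gdp. Comparing with = NULL matches nothing, because a comparison with NULL is never true; is null is the test that works.

```python
import sqlite3

# Hypothetical toy rows: A has gdp 0, B has no gdp at all (NULL).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE world (name TEXT, gdp INTEGER)")
conn.executemany("INSERT INTO world VALUES (?, ?)", [("A", 0), ("B", None), ("C", 700)])

eq_null = conn.execute("SELECT name FROM world WHERE gdp = NULL").fetchall()
is_null = conn.execute("SELECT name FROM world WHERE gdp IS NULL").fetchall()
is_zero = conn.execute("SELECT name FROM world WHERE gdp = 0").fetchall()

print(eq_null)  # no rows: "= NULL" is never true
print(is_null)  # only B
print(is_zero)  # only A: 0 is a real value, not NULL
```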

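Fuzzy matching with like
Besides using operators for conditions, a where expression can combine the like operator with wildcards for fuzzy matching.
- Standard syntax for fuzzy matching

    select field_name
    from table_name
    where field_name like 'wildcard + characters'

A wildcard matches part of a value and follows like to filter the data. The common wildcards are % and _: % matches any number of characters (zero, one or many), while _ matches exactly one character.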
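As a quick illustration of the two wildcards, the sketch below runs three like patterns against a handful of country names. It assumes Python's sqlite3 module and an in-memory table; the short list of names is only sample input.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE world (name TEXT)")
conn.executemany(
    "INSERT INTO world VALUES (?)",
    [("United States",), ("United Kingdom",), ("Cuba",), ("Chad",), ("Peru",)],
)

def names_like(pattern):
    """Return the names matching a like pattern, sorted alphabetically."""
    sql = "SELECT name FROM world WHERE name LIKE ? ORDER BY name"
    return [row[0] for row in conn.execute(sql, (pattern,))]

starts_united = names_like("United%")  # % matches the rest of the name
four_letters = names_like("____")      # four underscores: exactly four characters
second_is_u = names_like("_u%")        # one character, then u, then anything

print(starts_united, four_letters, second_is_u)
```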

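Regular expression operator regexp

-- regexp: regular expressions
-- ^ start; $ end; | logical or; [abcd] any one listed character; [a-h] any one character in the range
select *
from customers
-- where first_name regexp 'elka|ambur' -- contains elka or ambur
-- where last_name regexp 'ey$|on$' -- ends with ey or on
-- where last_name regexp '^my|se'  -- starts with my, or contains se anywhere
where last_name regexp 'b[ru]' -- contains br or bu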
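The patterns above can be checked locally with the sketch below. One caveat: it uses Python's sqlite3 module, and SQLite has no built-in regexp, so the sketch registers a small helper based on Python's re module. That helper is an emulation and an assumption of this example, not MySQL's regex engine, and the last names are sample values only.

```python
import re
import sqlite3

def regexp(pattern, value):
    # SQLite turns "value REGEXP pattern" into a call regexp(pattern, value).
    # IGNORECASE is used so matching ignores letter case in this sketch.
    if value is None:
        return 0
    return int(re.search(pattern, value, re.IGNORECASE) is not None)

conn = sqlite3.connect(":memory:")
conn.create_function("regexp", 2, regexp)
conn.execute("CREATE TABLE customers (last_name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?)",
    [("Brushfield",), ("Boagey",), ("Roseburgh",), ("Twiddell",), ("MacCaffrey",), ("Naseby",)],
)

def match(pattern):
    """Return the last names matching the pattern, sorted alphabetically."""
    sql = "SELECT last_name FROM customers WHERE last_name REGEXP ? ORDER BY last_name"
    return [row[0] for row in conn.execute(sql, (pattern,))]

br_or_bu = match("b[ru]")         # contains br or bu
ends_ey_or_on = match("ey$|on$")  # ends with ey or on
my_start_or_se = match("^my|se")  # starts with my, or contains se anywhere

print(br_or_bu, ends_ey_or_on, my_start_or_se)
```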

Example 1
Show the countries which have a name that includes the word ‘United’

select name from world
where name like '%United%'

Example 2
Equatorial Guinea and Dominican Republic have all of the vowels (a e i o u) in the name. They don’t count because they have more than one word in the name. Find the countries whose names contain all five vowels and no spaces.

select name
from world
where name like '%a%'
and name like '%e%'
and name like '%i%'
and name like '%o%'
and name like '%u%'
and name not like '% %';

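order by

order by field_name asc|desc sets the order in which the result set is displayed
The order by clause is optional; it is used to sort the result set by the specified fields
asc sorts the field in ascending order and desc in descending order; if neither is written, ascending is the default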

select field_name
from table_name
[where expression]
[order by field_name asc|desc]

Example
List the winners, year and subject where the winner starts with Sir. Show the most recent first, then by name order.

select winner,yr,subject from nobel
where winner like 'Sir%'
order by yr desc,winner asc
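A small sketch of how the two sort keys interact, assuming Python's sqlite3 module and an in-memory nobel table. The winners below are invented placeholder names, not real laureates. Rows are sorted by yr descending first, and winner ascending only breaks ties within the same year.

```python
import sqlite3

# Hypothetical rows; the names are placeholders, not real prize winners.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nobel (winner TEXT, yr INTEGER, subject TEXT)")
conn.executemany(
    "INSERT INTO nobel VALUES (?, ?, ?)",
    [("Sir Beta", 2001, "Physics"), ("Sir Alpha", 2001, "Chemistry"),
     ("Dr Gamma", 2005, "Peace"), ("Sir Delta", 1990, "Medicine")],
)

rows = conn.execute(
    """
    SELECT winner, yr, subject FROM nobel
    WHERE winner LIKE 'Sir%'
    ORDER BY yr DESC, winner ASC
    """
).fetchall()

for row in rows:
    print(row)
```

The two 2001 rows come first, and within that year Sir Alpha is listed before Sir Beta.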

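limit

limit [offset,] row_count restricts the number of rows shown in the result set
The limit clause is optional; row_count is a required argument of the clause, and offset is optional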

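【Return the first n rows of the result】
    select field_name
    from table_name
    [where expression]
    [order by field_name asc|desc]
    [limit n]

【Return n rows of the result starting at row x+1, i.e. rows x+1 through x+n】
    select field_name
    from table_name
    [where expression]
    [order by field_name asc|desc]
    [limit x,n]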
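The two limit forms can be compared side by side with the sketch below. It assumes Python's sqlite3 module, which accepts the same limit x,n form, and a made-up table nums holding the integers 1 to 10, so the row position and the value are the same.

```python
import sqlite3

# Hypothetical table: ids 1..10, so row k (after sorting) holds the value k.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nums (id INTEGER)")
conn.executemany("INSERT INTO nums VALUES (?)", [(i,) for i in range(1, 11)])

# limit n: the first 3 rows.
first_three = [r[0] for r in conn.execute("SELECT id FROM nums ORDER BY id LIMIT 3")]

# limit x,n with x=2 and n=3: skip 2 rows, then return rows 3 through 5.
skip_two_take_three = [r[0] for r in conn.execute("SELECT id FROM nums ORDER BY id LIMIT 2, 3")]

print(first_three)
print(skip_two_take_three)
```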

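Aggregate functions & group by

Aggregate functions
group by

group by field_name specifies which field the rows are grouped and aggregated by
The group by clause is optional; it is used to group rows with the same field value and then run an aggregate calculation on each group, so it is usually combined with aggregate functions
Simply put, rows that share a value in the grouping field form one group, and the aggregate function returns one result per group

count()
1. count(1) and count(*) return the same result; both count rows that contain null values.
2. count(field) does not count null values
3. count(null) always returns 0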
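The three count() rules can be verified in one query. The sketch assumes Python's sqlite3 module and a made-up three-row table in which one name is NULL.

```python
import sqlite3

# Hypothetical rows; the second row has a NULL name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE world (name TEXT, continent TEXT)")
conn.executemany(
    "INSERT INTO world VALUES (?, ?)",
    [("A", "Asia"), (None, "Asia"), ("C", "Europe")],
)

counts = conn.execute(
    "SELECT COUNT(*), COUNT(1), COUNT(name), COUNT(NULL) FROM world"
).fetchone()

# count(*) and count(1) count all 3 rows, count(name) skips the NULL name,
# and count(NULL) has nothing non-null to count.
print(counts)
```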

select field_name1
from table_name
[where expression]
[group by field_name1]
[order by field_name asc|desc]
[limit [offset,] row_count]

Example 1
List each continent and the number of countries (name) in it

select continent,count(name)
from world
group by continent

Example 2
For 2013 through 2015, count the winners per subject per year; sort by year descending, then by number of winners descending

SELECT yr, subject, count(winner) as num FROM nobel
WHERE yr between 2013 and 2015
group by yr,subject
order by yr desc,num desc
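Here is a runnable sketch of the same query, assuming Python's sqlite3 module and an invented nobel table. The winners are placeholder labels and the counts are not real prize data. Grouping by two fields produces one row per (yr, subject) pair, and the alias num can be reused in order by.

```python
import sqlite3

# Hypothetical rows; P1, Q1 and so on are placeholder winner labels.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nobel (yr INTEGER, subject TEXT, winner TEXT)")
conn.executemany(
    "INSERT INTO nobel VALUES (?, ?, ?)",
    [(2015, "Physics", "P1"), (2015, "Physics", "P2"), (2015, "Peace", "Q1"),
     (2014, "Physics", "R1"), (2014, "Chemistry", "S1"), (2014, "Chemistry", "S2"),
     (2014, "Chemistry", "S3"), (2012, "Peace", "T1")],
)

rows = conn.execute(
    """
    SELECT yr, subject, COUNT(winner) AS num FROM nobel
    WHERE yr BETWEEN 2013 AND 2015   -- the 2012 row is filtered out first
    GROUP BY yr, subject
    ORDER BY yr DESC, num DESC
    """
).fetchall()

for row in rows:
    print(row)
```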

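with rollup
Adds summary rows with totals computed over the groups. The with rollup spelling is MySQL syntax; several other databases provide the same feature as group by rollup(...).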
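To show what the extra total row looks like, the sketch below imitates a rollup rather than calling it. It assumes Python's sqlite3 module, and SQLite has no with rollup, so the grand total is appended with union all. This is a stand-in for illustration only; in MySQL the equivalent query would be select continent, count(name) from world group by continent with rollup. The rows are made-up toy data.

```python
import sqlite3

# Hypothetical toy rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE world (name TEXT, continent TEXT)")
conn.executemany(
    "INSERT INTO world VALUES (?, ?)",
    [("A", "Asia"), ("B", "Asia"), ("C", "Europe")],
)

rows = conn.execute(
    """
    SELECT continent, COUNT(name) AS num FROM world GROUP BY continent
    UNION ALL
    SELECT NULL, COUNT(name) FROM world   -- emulated total row, continent is NULL
    """
).fetchall()

for row in rows:
    print(row)
```

As with a real rollup, the total row is recognizable by the NULL in the grouping column.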
