SQL1| CS186

SQL1

SQL Pros and Cons

  1. Declarative!
  • Say what you want, not how to get it
  1. Implemented widely

    • With varying levels of efficiency, completeness
  2. Constrained

    • Not targeted at Turing-complete tasks
  3. General-purpose and feature-rich

  • many years of added features
  1. extensible: callouts to other languages, data sources

Relational Terminology

  1. Database: Set of named Relations
  2. Relation (Table):
    • Schema: description (“metadata”)
    • Instance: set of data satisfying the schema
  3. Attribute (Column, Field)
  4. Tuple (Record, Row)

Schema is fixed: unique attribute names, atomic types

Instance can change often:a multiset of “rows” (“tuples”)
{(123456789, ’wei’, ’jones’),
(987654321, ’apurva’, ’lee’),
(543219876, ‘sara’, ‘manning’),
(987654321, ’apurva’, ’lee’)}

Why is this not a relation?

在这里插入图片描述

instance is not suitble for the schema

在这里插入图片描述

schema must have unique name and attribute name
在这里插入图片描述
schema must have atomic types(不可再分,但是现在第三列是可分类型,第三列的每一行都有三个子类型)

SQL Language

  • Two sublanguages:

    • DDL – Data Definition Language

      • Define and modify schema
    • DML – Data Manipulation Language

      • Queries can be written intuitively.
  • RDBMS responsible for efficient evaluation.

    • Choose and run algorithms for declarative queries

      • Choice of algorithm must not affect query answer.

在这里插入图片描述

  • Primary Key column(s):

    • Provides a unique “lookup key” for the relation
      Cannot have any duplicate values
      Can be made up of >1 column
      E.g. (firstname, lastname)

在这里插入图片描述

  • Foreign key references a table
    • Via the primary key of that table
    • Need not share the name of the referenced primary key

在这里插入图片描述

Basic Single-Table Queries(DML)

SELECT

SELECT [DISTINCT] <column expression list>
FROM <single table>
[WHERE <predicate>]

Simplest version is straightforward

  • Produce all tuples in the table that satisfy the predicate
  • Output the expressions in the SELECT list
  • Expression can be a column reference, or an arithmetic
  • expression over column refs

SELECT DISTINCT

SELECT DISTINCT S.name, S.gpa
FROM students S
WHERE S.dept = 'CS'

DISTINCT specifies removal of duplicate rows before output
Can refer to the students table as “S”, this is called an alia

SELECT 代表输出的列(NAME GPA),duplicate rows的意思是,两个行有相同的S.NAME 和S.GPA

WHERE 代表输出的每一行所要满足的条件,满足才可以输出该行

ORDER BY

SELECT S.name, S.gpa, S.age*2 AS a2
FROM Students S
WHERE S.dept = 'CS'
ORDER BY S.gpa, S.name, a2
  • ORDER BY clause specifies output to be sorted
    • Lexicographic ordering
  • Obviously must refer to columns in the output
    • Note the AS clause for naming output columns!

AS 关键字代表输出中有一列的名称为a2,该列的内容是age2
先按gpa的大小排序,若两个对象gpa一样,则按name排序,若gpa和name都一样,则按照age
2排序

SELECT  S.name, S.gpa, S.age*2 AS a2
FROM Students S
WHERE S.dept = 'CS'
ORDER BY S.gpa DESC, S.name ASC, a2;
  • Ascending(升序) order by default, but can be overridden
    • DESC flag for descending(降序), ASC for ascending
    • Can mix and match, lexicographically

降序:该列中的gpa数值从上到下以此递减
升序:名字按首字母顺序以此递增

在这里插入图片描述

LIMIT

SELECT  S.name, S.gpa, S.age*2 AS a2
FROM Students S
WHERE S.dept = 'CS'
ORDER BY S.gpa DESC, S.name ASC, a2;
LIMIT 3 ;
  • Only produces the first output rows
  • Typically used with ORDER BY
    • Otherwise the output is non-deterministic
    • Not a “pure” declarative construct in that case – output set depends on algorithm for query processing

Aggregates

SELECT [DISTINCT] AVG(S.gpa)
FROM Students S
WHERE S.dept = 'CS'
  • Before producing output, compute a summary (a.k.a. an aggregate) of some arithmetic expression
  • Produces 1 row of output
    with one column in this case
  • Other aggregates: SUM, COUNT, MAX, MIN

GROUP BY

SELECT [DISTINCT] AVG(S.gpa), S.dept
FROM Students S
GROUP BY S.dept
  • Partition table into groups with same GROUP BY column values
    • Can group by a list of columns
  • Produce an aggregate result per group
    • Cardinality of output = # of distinct group values(输出的aggregate result 的数量等于分组数)
  • Note: can put grouping columns in SELECT list

HAVING

SELECT [DISTINCT] AVG(S.gpa), S.dept
FROM Students S
GROUP BY S.dept
HAVING COUNT(*) > 2

先分组,计算成员数大于2的小组中的平均gpa,最后显示gpa和部分名字

  • The HAVING predicate filters groups
  • HAVING is applied after grouping and aggregation
    • Hence can contain anything that could go in the SELECT list
    • I.e. aggs or GROUP BY columns
  • HAVING can only be used in aggregate queries
  • It’s an optional clause

Putting it all together

SELECT S.dept, AVG(S.gpa), COUNT(*)
FROM Students S
WHERE S.gender = 'F'
GROUP BY S.dept 
HAVING COUNT(*) >= 2
ORDER BY S.dept;

DISTINCT Aggregates

Are these the same or different?


SELECT COUNT(DISTINCT S.name)
FROM Students S
WHERE S.dept = 'CS';

SELECT DISTINCT COUNT(S.name)
FROM Students S
WHERE S.dept = 'CS';

不一样,第一个首先剔除s.mame 中的重复项,在计算cs学生中名字的数量,第二个不管cs中学生的名字是否重复,直接计算总数,假设有10个bob和5个小红,第一个结果为2,第二个结果为15

试验地址

SQL DML:

General Single-Table Queries

SELECT [DISTINCT] <column expression list>
FROM <single table>
[WHERE <predicate>]
[GROUP BY <column list>
[HAVING <predicate>] ]
[ORDER BY <column list>]
[LIMIT <integer>];
  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值