Data Types
String Fields
Understand character sets and are indexable for searching
- CHAR allocates the entire space (faster for small strings where length
is known) - VARCHAR allocates a variable amount of space depending on the
data length (less space)
Text Fields
Have a character set - paragraphs or HTML pages
- TINYTEXT up to 255 characters
- TEXT up to 65K
- MEDIUMTEXT up to 16M
- LONGTEXT up to 4G
Integer Numbers
- TINYINT (-128, 128)
- SMALLINT (-32768, +32768)
- INT or INTEGER (2 Billion)
- BIGINT - (10**18 ish)
Dates
- TIMESTAMP - ‘YYYY-MM-DD HH:MM:SS’ (1970, 2037)
- DATETIME - ‘YYYY-MM-DD HH:MM:SS’
- DATE - ‘YYYY-MM-DD’
- TIME - ‘HH:MM:SS’
- Built-in MySQL function NOW()
Floating Point Numbers
- FLOAT (32-bit) 10**38 with seven digits of accuracy
- DOUBLE (64-bit) 10**308 with 14 digits of accuracy
Binary Types (rarely used)
Character = 8 - 32 bits of information depending on character set
Byte = 8 bits of information
- BYTE(n) up to 255 bytes
- VARBINARY(n) up to 65K bytes
Binary Large Object (BLOB)
Large raw data, files, images, word documents, PDFs, movies, etc.
No translation, indexing, or character set
- TINYBLOB(n) - up to 255
- BLOB(n) - up to 65K
- MEDIUMBLOB(n) - up to 16M
- LONGBLOB(n) - up to 4G
Database Keys and Indexes
- Primary key - generally an integer autoincrement field.
- Logical key - what the outside world uses for lookup.
- Foreign key - generally an integer key pointing to a row in another table.
- AUTO_INCREMENT
- INDEX (HASH or BTREE)
CRUD
Retrieve–Select
SELECT field list FROM table list WHERE criteria
SELECT * FROM Users WHERE email='csev@umich.edu'
SELECT DISTINCT Supplier_ID FROM Wine;
SELECT Wine_ID, colour, price*7 AS price_CNY From Wine;
SELECT * FROM Wine WHERE Type IS NULL
# or NOT NULL
With Functions:
COUNT,SUM,AVG,MIN,MAX
##Count the number of wines that have a quality greater than 6.
SELECT COUNT (*) FROM Wine WHERE quality>6;
SELECT AVG(price) FROM Wine
#eg.Count all wines with quality 6 or higher that cost less than average. For this one you will need a subquery to find the average price.
SELECT COUNT(*) FROM Wine WHERE (quality>=6 AND price<(SELECT AVG(price) FROM Wine));
##Alternatively, you can type in this command
SELECT COUNT(*) FROM (SELECT * FROM Wine WHERE quality>=6 AND price<(SELECT AVG(price) FROM Wine));
Sets and Subqueries (IN & NOT IN)
SELECT * FROM Wine WHERE quality IN (1,2,3,4);
SELECT * FROM Customer WHERE Address2 IN (SELECT Address2 FROM Supplier WHERE Name='Celler del Cava');
#The following command returns the value of ‘Barcelona’
SELECT Address2 FROM Supplier WHERE Name='Celler del Cava'
#So the first command equals to the following one
SELECT * FROM Customer WHERE Address2 IN (‘Barcelona’);
Queries involving more than one table (using WHERE & JOIN ON)
#General format of joining three tables
WHERE table1.primarykey = table2.foreignkey AND table3.primarykey = table1.foreignkey
#What we want to see + The tables that hold the data + How the tables are linked
SELECT Track.title, Artist.name, Album.title, Genre.name FROM Track JOIN Artist JOIN Album JOIN Genre ON Track.genre_id = Genre.genre_id AND Track.album_id = Album.album_id AND Album.artist_id = Artist.artist_id
LIKE
SELECT * FROM Users WHERE name LIKE '%e%'
#注意 %(*), _(?), [],^(!)的用法。
Order by
SELECT * FROM Wine ORDER BY year DESC, price ASC;
#The LIMIT clause can request the first "n" rows, or the first "n" rows after some starting row. Note: the first row is zero, not one.
SELECT * FROM Users ORDER BY email DESC LIMIT 2;
SELECT * FROM Users ORDER BY email LIMIT 1,2;
Group by
#Find the number of staff working in each distribution center and the sum of their salaries.
SELECT dCenterNo, COUNT(staffNo) AS totalstaff, SUM(salary) AS totalsalary FROM Staff GROUP BY dCenterNo;
窗口函数
COUNT(xxx)、SUM(xxx)、MIN(xxx)、MAX(xxx)、AVG(xxx)
RANK()、DENSE_RANK()、ROW_NUMBER()
first_value(xxx)、last_value(xxx)
LAG(col,n,DEFAULT):用于统计窗口内往上第n行值
LEAD(col,n,DEFAULT):用于统计窗口内往下第n行值
NTILE(n):将数据分成n片(1/n%)
CUME_DIST():小于等于当前值的行数/分组内总行数
PERCENT_RANK():分组内当前行的RANK值-1/分组内总行数-1
SELECT start_terminal,
duration_seconds,
SUM(duration_seconds) OVER
(PARTITION BY start_terminal ORDER BY start_time)
AS running_total
FROM tutorial.dc_bikeshare_q1_2012
WHERE start_time < '2012-01-08'
Insert
INSERT INTO Album (title, artist_id) VALUES ('Who Made Who', 2);
Create & Drop
CREATE DATABASE People;
USE People;
CREATE TABLE Track (
track_id INTEGER NOT NULL AUTO_INCREMENT PRIMARY KEY,
title VARCHAR(255),
len INTEGER,
album_id INTEGER REFERENCES Album (album_id)
ON DELETE CASCADE ON UPDATE CASCADE,
genre_id INTEGER REFERENCES Genre (genre_id)
ON DELETE CASCADE ON UPDATE CASCADE,
INDEX USING BTREE (title),
) ENGINE = InnoDB;
DESCRIBE Genre;
DROP TABLE Track;
Delete
DELETE FROM Users WHERE email='ted@umich.edu'
ON DELETE Choices
- Default / RESTRICT – Don’t allow changes that break the constraint
- CASCADE – Adjust child rows by removing or updating to maintain consistency
- SET NULL – Set the foreign key columns in the child rows to null
Update
UPDATE Users SET name='Charles' WHERE email='csev@umich.edu'