I. Queries
1. select queries
# Select query for specific columns
SELECT column, another_column, …
FROM mytable;
# Select query for all columns
SELECT *
FROM mytable;
2. queries with constraints
queries with number comparison
# Select query with constraints
SELECT column, another_column, …
FROM mytable
WHERE condition
AND/OR another_condition
AND/OR …;
Operator | Condition | SQL Example |
---|---|---|
=, !=, <, <=, >, >= | Standard numerical operators | col_name != 4 |
BETWEEN … AND … | Number is within range of two values (inclusive) | col_name BETWEEN 1.5 AND 10.5 |
NOT BETWEEN … AND … | Number is not within range of two values (inclusive) | col_name NOT BETWEEN 1 AND 10 |
IN (…) | Number exists in a list | col_name IN (2, 4, 6) |
NOT IN (…) | Number does not exist in a list | col_name NOT IN (1, 3, 5) |
queries with string comparison
Operator | Condition | Example |
---|---|---|
= | Case sensitive exact string comparison (notice the single equals) | col_name = 'abc' |
!= or <> | Case sensitive exact string inequality comparison | col_name != 'abcd' |
LIKE | Case insensitive exact string comparison | col_name LIKE 'ABC' |
NOT LIKE | Case insensitive exact string inequality comparison | col_name NOT LIKE 'ABCD' |
% | Used anywhere in a string to match a sequence of zero or more characters (only with LIKE or NOT LIKE) | col_name LIKE '%AT%' (matches "AT", "ATTIC", "CAT" or even "BATS") |
_ | Used anywhere in a string to match a single character (only with LIKE or NOT LIKE) | col_name LIKE 'AN_' (matches "AND", but not "AN") |
IN (…) | String exists in a list | col_name IN ('A', 'B', 'C') |
NOT IN (…) | String does not exist in a list | col_name NOT IN ('D', 'E', 'F') |
Note: whether = and LIKE are case sensitive depends on the database and its collation settings; the behavior listed here matches SQLite's defaults. Standard SQL writes string literals in single quotes.
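The operators in the two tables above can be tried out with a small sketch. It uses Python's built-in sqlite3 module and a made-up `words` table; the table, its columns, and the helper `words_where` are assumptions for illustration only. The case-sensitivity results reflect SQLite's defaults and may differ in other databases.

```python
import sqlite3

# In-memory database with a hypothetical table, just for experimenting.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (id INTEGER PRIMARY KEY, word TEXT, length INTEGER)")
conn.executemany(
    "INSERT INTO words (word, length) VALUES (?, ?)",
    [("AT", 2), ("ATTIC", 5), ("CAT", 3), ("BATS", 4), ("AND", 3), ("AN", 2)],
)

def words_where(condition):
    """Return the words matching a WHERE condition, in insertion order."""
    rows = conn.execute(f"SELECT word FROM words WHERE {condition} ORDER BY id")
    return [row[0] for row in rows]

percent_matches = words_where("word LIKE '%AT%'")      # % matches zero or more characters
underscore_matches = words_where("word LIKE 'AN_'")    # _ matches exactly one character
in_list = words_where("word IN ('AT', 'AN')")
in_range = words_where("length BETWEEN 3 AND 4")       # both bounds are inclusive
like_lowercase = words_where("word LIKE 'cat'")        # case insensitive in SQLite
equals_lowercase = words_where("word = 'cat'")         # case sensitive in SQLite
```

With this data, `LIKE 'AN_'` finds only "AND", because "AN" has no third character for the underscore to match.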
3. filtering and sorting query results
# Select query with unique results
SELECT DISTINCT column, another_column, …
FROM mytable
WHERE condition(s);
# Select query with ordered results
SELECT column, another_column, …
FROM mytable
WHERE condition(s)
ORDER BY column ASC/DESC;
LIMIT reduces the number of rows to return, and the optional OFFSET specifies how many rows to skip before counting begins.
# Select query with limited rows
SELECT column, another_column, …
FROM mytable
WHERE condition(s)
ORDER BY column ASC/DESC
LIMIT num_limit OFFSET num_offset;
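As a rough illustration of DISTINCT, ORDER BY, and LIMIT with OFFSET, here is a sketch using Python's sqlite3 module. The `cities` table and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and data, used only to demonstrate the clauses.
conn.execute("CREATE TABLE cities (name TEXT, country TEXT, population INTEGER)")
conn.executemany(
    "INSERT INTO cities VALUES (?, ?, ?)",
    [("A-town", "X", 50), ("B-town", "X", 40), ("C-town", "Y", 30),
     ("D-town", "Y", 20), ("E-town", "Z", 10)],
)

# DISTINCT collapses duplicate values of the selected column.
distinct_countries = [row[0] for row in conn.execute(
    "SELECT DISTINCT country FROM cities ORDER BY country ASC")]

# Sort by population (largest first), skip 1 row, then return 2 rows.
second_page = [row[0] for row in conn.execute(
    "SELECT name FROM cities ORDER BY population DESC LIMIT 2 OFFSET 1")]
```

Pairing LIMIT with ORDER BY matters here: without an explicit order, which rows fall inside the limit is not guaranteed.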
4. multi-table queries with joins
Database normalization is useful because it minimizes duplicate data in any single table and allows data in the database to grow independently of each other (e.g., types of car engines can grow independently of types of cars). As a trade-off, queries get slightly more complex since they have to be able to find data from different parts of the database, and performance issues can arise when working with many large tables.
Tables that share information about a single entity need to have a primary key that identifies that entity uniquely across the database. One common primary key type is an auto-incrementing integer (because it is space efficient), but it can also be a string or a hashed value, so long as it is unique.
inner join
# Select query with INNER JOIN on multiple tables
SELECT column, another_table_column, …
FROM mytable
INNER JOIN another_table
ON mytable.id = another_table.id
WHERE condition(s)
ORDER BY column, … ASC/DESC
LIMIT num_limit OFFSET num_offset;
The INNER JOIN is a process that matches rows from the first table and the second table which have the same key (as defined by the ON constraint) to create a result row with the combined columns from both tables. After the tables are joined, the other clauses we learned previously are then applied.
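A minimal sketch of an INNER JOIN, run through Python's sqlite3 module. The `widgets` and `widget_sales` tables and their contents are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: widget_sales.widget_id refers to widgets.id.
conn.execute("CREATE TABLE widgets (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE widget_sales (widget_id INTEGER, quantity INTEGER)")
conn.executemany("INSERT INTO widgets VALUES (?, ?)",
                 [(1, "bolt"), (2, "nut"), (3, "gear")])
# widget_id 4 has no matching widget, and widgets 2 and 3 have no sales.
conn.executemany("INSERT INTO widget_sales VALUES (?, ?)",
                 [(1, 10), (1, 5), (4, 7)])

# Only rows whose keys match in BOTH tables survive an INNER JOIN.
inner_rows = conn.execute(
    """
    SELECT widgets.name, widget_sales.quantity
    FROM widgets
    INNER JOIN widget_sales
        ON widgets.id = widget_sales.widget_id
    ORDER BY widget_sales.quantity DESC
    """
).fetchall()
```

The unmatched sale (widget_id 4) and the unsold widgets ("nut", "gear") are all absent from the result.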
outer joins
# Select query with LEFT/RIGHT/FULL JOINs on multiple tables
SELECT column, another_column, …
FROM mytable
LEFT/RIGHT/FULL JOIN another_table
ON mytable.id = another_table.matching_id
WHERE condition(s)
ORDER BY column, … ASC/DESC
LIMIT num_limit OFFSET num_offset;
When joining table A to table B, a LEFT JOIN simply includes rows from A regardless of whether a matching row is found in B.
RIGHT JOIN is the same, but reversed, keeping rows in B regardless of whether a match is found in A.
FULL JOIN simply means that rows from both tables are kept, regardless of whether a matching row exists in the other table.
You may also see these written as LEFT OUTER JOIN, RIGHT OUTER JOIN, or FULL OUTER JOIN; the OUTER keyword is optional, so these are simply equivalent to LEFT JOIN, RIGHT JOIN, and FULL JOIN respectively.
Likewise, a plain JOIN is equivalent to INNER JOIN.
When using any of these new joins, you will likely have to write additional logic to deal with NULLs in the result and constraints.
# Select query with constraints on NULL values
SELECT column, another_column, …
FROM mytable
WHERE column IS/IS NOT NULL
AND/OR another_condition
AND/OR …;
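The sketch below shows how a LEFT JOIN produces NULLs and how IS NULL can filter on them, using Python's sqlite3 module. The `widgets` and `widget_sales` tables are hypothetical. Only LEFT JOIN is used, since older SQLite versions do not support RIGHT or FULL JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables for illustration.
conn.execute("CREATE TABLE widgets (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE widget_sales (widget_id INTEGER, quantity INTEGER)")
conn.executemany("INSERT INTO widgets VALUES (?, ?)",
                 [(1, "bolt"), (2, "nut"), (3, "gear")])
conn.executemany("INSERT INTO widget_sales VALUES (?, ?)", [(1, 10), (1, 5)])

# Every widget is kept; widgets without sales get NULL (None in Python).
left_rows = conn.execute(
    """
    SELECT widgets.name, widget_sales.quantity
    FROM widgets
    LEFT JOIN widget_sales
        ON widgets.id = widget_sales.widget_id
    ORDER BY widgets.id, widget_sales.quantity
    """
).fetchall()

# IS NULL picks out the rows that had no match in the second table.
unsold = [row[0] for row in conn.execute(
    """
    SELECT widgets.name
    FROM widgets
    LEFT JOIN widget_sales
        ON widgets.id = widget_sales.widget_id
    WHERE widget_sales.quantity IS NULL
    ORDER BY widgets.id
    """
)]
```

Note that `quantity = NULL` would not work as a filter: comparisons with NULL are never true, which is why IS NULL exists.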
5. queries with expressions
# Example query with expressions
SELECT particle_speed / 2.0 AS half_particle_speed
FROM physics_data
WHERE ABS(particle_position) * 10.0 > 500;
Each database has its own supported set of mathematical, string, and date functions that can be used in a query, which you can find in their own respective docs.
make aliases
# Select query with expression aliases
SELECT col_expression AS expr_description, …
FROM mytable;
# Example query with both column and table name aliases
SELECT column AS better_column_name, …
FROM a_long_widgets_table_name AS mywidgets
INNER JOIN widget_sales
ON mywidgets.id = widget_sales.widget_id;
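To see what an alias changes in practice, this sketch runs the `physics_data` expression query through Python's sqlite3 module and reads back the column name of the result. The table contents are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data matching the column names used in the example query.
conn.execute("CREATE TABLE physics_data (particle_speed REAL, particle_position REAL)")
conn.executemany("INSERT INTO physics_data VALUES (?, ?)",
                 [(10.0, -60.0), (4.0, 20.0)])

cursor = conn.execute(
    """
    SELECT particle_speed / 2.0 AS half_particle_speed
    FROM physics_data
    WHERE ABS(particle_position) * 10.0 > 500
    """
)
# cursor.description holds one entry per result column; item 0 is its name.
column_names = [column[0] for column in cursor.description]
rows = cursor.fetchall()
```

Only the first row passes the filter (ABS(-60.0) * 10.0 is 600), and the result column carries the alias rather than the raw expression text.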
6. queries with aggregates
Aggregate expressions (or functions) allow you to summarize information about a group of rows of data.
# Select query with aggregate functions over all rows
SELECT AGG_FUNC(column_or_expression) AS aggregate_description, …
FROM mytable
WHERE constraint_expression;
Function | Description |
---|---|
COUNT(*), COUNT(column) | A common function used to count the number of rows in the group if no column name is specified. Otherwise, counts the number of rows in the group with non-NULL values in the specified column. |
MIN(column) | Finds the smallest numerical value in the specified column for all rows in the group. |
MAX(column) | Finds the largest numerical value in the specified column for all rows in the group. |
AVG(column) | Finds the average numerical value in the specified column for all rows in the group. |
SUM(column) | Finds the sum of all numerical values in the specified column for the rows in the group. |
sum, group by, having
SELECT role, AVG(years_employed) AS average_years_employed
FROM employees
GROUP BY role;
SELECT role, SUM(years_employed)
FROM employees
GROUP BY role
HAVING role = 'Engineer';
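The aggregate functions, GROUP BY, and HAVING can be explored with the sketch below, which uses Python's sqlite3 module and a made-up `employees` table. One row deliberately has a NULL value to show the difference between COUNT(*) and COUNT(column). HAVING is applied here to an aggregate (a common use), whereas a filter on a plain column such as role could also be written with WHERE before grouping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data; None is stored as NULL.
conn.execute("CREATE TABLE employees (role TEXT, years_employed INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Engineer", 2), ("Engineer", 4), ("Artist", 6), ("Artist", None), ("Manager", 1)],
)

# COUNT(*) counts rows; COUNT(column) skips rows where the column is NULL.
count_all, count_non_null = conn.execute(
    "SELECT COUNT(*), COUNT(years_employed) FROM employees").fetchone()

# One result row per role; AVG ignores NULL values.
avg_by_role = conn.execute(
    """
    SELECT role, AVG(years_employed) AS average_years_employed
    FROM employees
    GROUP BY role
    ORDER BY role
    """
).fetchall()

# HAVING filters the groups after aggregation.
long_serving_roles = conn.execute(
    """
    SELECT role, SUM(years_employed) AS total_years
    FROM employees
    GROUP BY role
    HAVING SUM(years_employed) > 5
    ORDER BY role
    """
).fetchall()
```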
II. Inserting rows
# Insert statement with values for all columns
INSERT INTO mytable
VALUES (value_or_expr, another_value_or_expr, …),
(value_or_expr_2, another_value_or_expr_2, …),
…;
# Insert statement with specific columns
INSERT INTO mytable
(column, another_column, …)
VALUES (value_or_expr, another_value_or_expr, …),
(value_or_expr_2, another_value_or_expr_2, …),
…;
# Example Insert statement with expressions
INSERT INTO boxoffice
(movie_id, rating, sales_in_millions)
VALUES (1, 9.9, 283742034 / 1000000);
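One detail worth checking when inserting expressions is how the database divides integers. The sketch below, using Python's sqlite3 module, inserts two rows into a hypothetical `boxoffice` table (the column types are an assumption): one with integer division and one with a floating-point divisor. In SQLite, dividing two integers truncates the result; other databases may behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed for this sketch.
conn.execute(
    "CREATE TABLE boxoffice (movie_id INTEGER PRIMARY KEY, rating REAL, sales_in_millions REAL)"
)

# Row 1 divides integer by integer; row 2 divides by a floating-point literal.
conn.execute(
    """
    INSERT INTO boxoffice
    (movie_id, rating, sales_in_millions)
    VALUES (1, 9.9, 283742034 / 1000000),
           (2, 8.1, 283742034 / 1000000.0)
    """
)

# Map movie_id -> stored sales value.
sales = dict(conn.execute("SELECT movie_id, sales_in_millions FROM boxoffice"))
```

In SQLite, row 1 ends up with 283 while row 2 keeps the fractional part, so writing the divisor as `1000000.0` is the safer choice when fractions matter.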
III. Updating rows
# Update statement with values
UPDATE mytable
SET column = value_or_expr,
other_column = another_value_or_expr,
…
WHERE condition;
IV. Deleting rows
# Delete statement with condition
DELETE FROM mytable
WHERE condition;
If you decide to leave out the WHERE constraint, then all rows are removed, which is a quick and easy way to clear out a table completely (if intentional).
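The effect of the WHERE clause on UPDATE and DELETE can be seen in the following sketch, which uses Python's sqlite3 module and a hypothetical `boxoffice` table with invented rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and data for illustration.
conn.execute("CREATE TABLE boxoffice (movie_id INTEGER PRIMARY KEY, rating REAL)")
conn.executemany("INSERT INTO boxoffice VALUES (?, ?)",
                 [(1, 9.9), (2, 7.0), (3, 5.5)])

# UPDATE with a WHERE clause touches only the matching row.
conn.execute("UPDATE boxoffice SET rating = rating + 1.0 WHERE movie_id = 2")
updated_rating = conn.execute(
    "SELECT rating FROM boxoffice WHERE movie_id = 2").fetchone()[0]

# DELETE with a WHERE clause removes only the matching rows.
deleted_count = conn.execute("DELETE FROM boxoffice WHERE rating < 6.0").rowcount
remaining_ids = [row[0] for row in conn.execute(
    "SELECT movie_id FROM boxoffice ORDER BY movie_id")]

# DELETE without WHERE empties the table but keeps the table itself.
conn.execute("DELETE FROM boxoffice")
rows_left = conn.execute("SELECT COUNT(*) FROM boxoffice").fetchone()[0]
```

A cautious habit is to run the same WHERE condition in a SELECT first, to confirm which rows an UPDATE or DELETE would affect.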
V. Creating tables
# Create table statement w/ optional table constraint and default value
CREATE TABLE IF NOT EXISTS mytable (
column DataType TableConstraint DEFAULT default_value,
another_column DataType TableConstraint DEFAULT default_value,
…
);
Data type | Description |
---|---|
INTEGER, BOOLEAN | The integer datatypes can store whole integer values like the count of a number or an age. In some implementations, the boolean value is just represented as an integer value of just 0 or 1. |
FLOAT, DOUBLE, REAL | The floating point datatypes can store more precise numerical data like measurements or fractional values. Different types can be used depending on the floating point precision required for that value. |
CHARACTER(num_chars), VARCHAR(num_chars), TEXT | The text based datatypes can store strings and text in all sorts of locales. The distinction between the various types generally amounts to the underlying efficiency of the database when working with these columns. Both the CHARACTER and VARCHAR (variable character) types are specified with the max number of characters that they can store (longer values may be truncated), so they can be more efficient to store and query with big tables. |
DATE, DATETIME | SQL can also store date and time stamps to keep track of time series and event data. They can be tricky to work with especially when manipulating data across timezones. |
BLOB | Finally, SQL can store binary data in blobs right in the database. These values are often opaque to the database, so you usually have to store them with the right metadata to requery them. |
# Movies table schema
CREATE TABLE movies (
id INTEGER PRIMARY KEY,
title TEXT,
director TEXT,
year INTEGER,
length_minutes INTEGER
);
Constraint | Description |
---|---|
PRIMARY KEY | This means that the values in this column are unique, and each value can be used to identify a single row in this table. |
AUTOINCREMENT | For integer values, this means that the value is automatically filled in and incremented with each row insertion. Not supported in all databases. |
UNIQUE | This means that the values in this column have to be unique, so you can’t insert another row with the same value in this column as another row in the table. Differs from the PRIMARY KEY in that it doesn’t have to be a key for a row in the table. |
NOT NULL | This means that the inserted value cannot be NULL. |
CHECK (expression) | This allows you to run a more complex expression to test whether the values inserted are valid. For example, you can check that values are positive, or greater than a specific size, or start with a certain prefix, etc. |
FOREIGN KEY | This is a consistency check which ensures that each value in this column corresponds to another value in a column in another table. For example, if there are two tables, one listing all Employees by ID, and another listing their payroll information, the FOREIGN KEY can ensure that every row in the payroll table corresponds to a valid employee in the master Employee list. |
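The constraints in the table above can be exercised with the sketch below, written against SQLite through Python's sqlite3 module. The `employees` and `payroll` tables, their columns, and the helper `is_rejected` are assumptions for illustration. SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON`; other databases typically enforce them without such a switch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable FK enforcement

# Hypothetical schema using each constraint once.
conn.execute(
    """
    CREATE TABLE employees (
        id INTEGER PRIMARY KEY,
        email TEXT UNIQUE NOT NULL,
        age INTEGER CHECK (age > 0)
    )
    """
)
conn.execute(
    """
    CREATE TABLE payroll (
        employee_id INTEGER,
        salary REAL DEFAULT 0.0,
        FOREIGN KEY (employee_id) REFERENCES employees (id)
    )
    """
)
conn.execute("INSERT INTO employees VALUES (1, 'a@example.com', 30)")

def is_rejected(sql):
    """Return True if the statement violates a constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

duplicate_email = is_rejected("INSERT INTO employees VALUES (2, 'a@example.com', 25)")
null_email = is_rejected("INSERT INTO employees VALUES (3, NULL, 25)")
negative_age = is_rejected("INSERT INTO employees VALUES (4, 'b@example.com', -5)")
unknown_employee = is_rejected("INSERT INTO payroll (employee_id) VALUES (99)")

# A valid insert that omits salary picks up the DEFAULT value.
conn.execute("INSERT INTO payroll (employee_id) VALUES (1)")
default_salary = conn.execute("SELECT salary FROM payroll").fetchone()[0]
```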
VI. Altering tables
# Altering table to add new column(s)
ALTER TABLE mytable
ADD column DataType OptionalTableConstraint
DEFAULT default_value;
# Altering table to remove column(s)
ALTER TABLE mytable
DROP column_to_be_deleted;
# Altering table name
ALTER TABLE mytable
RENAME TO new_table_name;
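A small sketch of adding a column and renaming a table, run against SQLite with Python's sqlite3 module. The `movies` table and the new names are made up for the example. Dropping a column is left out because older SQLite versions do not support it. `PRAGMA table_info` is SQLite-specific; other databases expose column metadata differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical starting table with one existing row.
conn.execute("CREATE TABLE movies (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("INSERT INTO movies (title) VALUES ('Example Movie')")

# Add a column with a default, then rename the table.
conn.execute("ALTER TABLE movies ADD COLUMN rating REAL DEFAULT 0.0")
conn.execute("ALTER TABLE movies RENAME TO films")

# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(films)")]

# Rows that existed before the ALTER read back the default value.
existing_rating = conn.execute("SELECT rating FROM films").fetchone()[0]
```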
VII. Dropping tables
In some rare cases, you may want to remove an entire table including all of its data and metadata, and to do so, you can use the DROP TABLE statement, which differs from the DELETE statement in that it also removes the table schema from the database entirely.
# Drop table statement
DROP TABLE IF EXISTS mytable;
Just as CREATE TABLE may throw an error if the table already exists, DROP TABLE may throw an error if the specified table does not exist. To suppress that error, you can use the IF EXISTS clause.
In addition, if you have another table that depends on columns in the table you are removing (for example, with a FOREIGN KEY dependency), then you will have to either update all dependent tables first to remove the dependent rows or remove those tables entirely.
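The difference between DROP TABLE with and without IF EXISTS can be checked with this sketch, using Python's sqlite3 module. The table name and the helper `table_names` are hypothetical, and the exception type shown is the one SQLite raises through this module; other drivers report the error differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY)")

def table_names():
    """List user tables via SQLite's catalog table (SQLite-specific)."""
    return [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]

before_drop = table_names()
conn.execute("DROP TABLE IF EXISTS mytable")
after_drop = table_names()

# Dropping a table that no longer exists raises an error...
try:
    conn.execute("DROP TABLE mytable")
    plain_drop_failed = False
except sqlite3.OperationalError:
    plain_drop_failed = True

# ...while IF EXISTS turns the same situation into a no-op.
conn.execute("DROP TABLE IF EXISTS mytable")
```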
More
subqueries
Example: General subquery
Let's say your company has a list of all Sales Associates, with data on the revenue that each Associate brings in, and their individual salary. Times are tight, and you now want to find out which of your Associates are costing the company more than the average revenue brought in per Associate.
First, you would need to calculate the average revenue all the Associates are generating:
SELECT AVG(revenue_generated)
FROM sales_associates;
Using that result, we can then compare the cost of each Associate against that value. To use it as a subquery, we can just write it straight into the WHERE clause of the query:
SELECT *
FROM sales_associates
WHERE salary >
(SELECT AVG(revenue_generated)
FROM sales_associates);
As the constraint is executed, each Associate’s salary will be tested against the value queried from the inner subquery.
Because subqueries can be nested, each subquery must be fully enclosed in parentheses in order to establish proper hierarchy.
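To make the subquery example concrete, the sketch below runs it with Python's sqlite3 module against a small, invented `sales_associates` table. The `name` column and all row values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: (name, revenue_generated, salary).
conn.execute(
    "CREATE TABLE sales_associates (name TEXT, revenue_generated REAL, salary REAL)"
)
conn.executemany(
    "INSERT INTO sales_associates VALUES (?, ?, ?)",
    [("Ana", 100.0, 50.0), ("Ben", 20.0, 70.0), ("Cy", 60.0, 55.0)],
)

# The inner query on its own: average revenue across all associates.
average_revenue = conn.execute(
    "SELECT AVG(revenue_generated) FROM sales_associates").fetchone()[0]

# The same query embedded as a subquery in the WHERE clause.
costly_associates = [row[0] for row in conn.execute(
    """
    SELECT name
    FROM sales_associates
    WHERE salary >
        (SELECT AVG(revenue_generated)
         FROM sales_associates)
    """
)]
```

With these rows the average revenue is 60, so only the associate with a salary of 70 is returned.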
Unions, Intersections & Exceptions
When working with multiple tables, the UNION and UNION ALL operators allow you to append the results of one query to another, assuming that they have the same column count, order and data type. If you use UNION without the ALL, duplicate rows between the tables will be removed from the result.
# Select query with set operators
SELECT column, another_column
FROM mytable
UNION / UNION ALL / INTERSECT / EXCEPT
SELECT other_column, yet_another_column
FROM another_table
ORDER BY column DESC
LIMIT n;
The UNION happens before the ORDER BY and LIMIT. It's not common to use UNIONs, but if you have data in different tables that can't be joined and processed, it can be an alternative to making multiple queries on the database.
Similar to the UNION, the INTERSECT operator will ensure that only rows that are identical in both result sets are returned, and the EXCEPT operator will ensure that only rows in the first result set that aren't in the second are returned. This means that the EXCEPT operator is query order-sensitive, like the LEFT JOIN and RIGHT JOIN.
Both INTERSECT and EXCEPT also discard duplicate rows after their respective operations, though some databases also support INTERSECT ALL and EXCEPT ALL to allow duplicates to be retained and returned.
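The four set operators can be compared side by side with the sketch below, which uses Python's sqlite3 module. The single-column tables `a` and `b`, their contents, and the helper `run` are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables: a holds 1, 2, 2, 3 and b holds 2, 3, 4.
conn.execute("CREATE TABLE a (x INTEGER)")
conn.execute("CREATE TABLE b (x INTEGER)")
conn.executemany("INSERT INTO a VALUES (?)", [(1,), (2,), (2,), (3,)])
conn.executemany("INSERT INTO b VALUES (?)", [(2,), (3,), (4,)])

def run(sql):
    """Return the first column of every result row as a list."""
    return [row[0] for row in conn.execute(sql)]

# The ORDER BY applies to the combined result, after the set operation.
union_rows = run("SELECT x FROM a UNION SELECT x FROM b ORDER BY x")
union_all_rows = run("SELECT x FROM a UNION ALL SELECT x FROM b ORDER BY x")
intersect_rows = run("SELECT x FROM a INTERSECT SELECT x FROM b ORDER BY x")
a_except_b = run("SELECT x FROM a EXCEPT SELECT x FROM b ORDER BY x")
b_except_a = run("SELECT x FROM b EXCEPT SELECT x FROM a ORDER BY x")
```

Swapping the two queries around EXCEPT changes the answer, which is the order sensitivity described above.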