DatabaseSystemConcepts(SixthEdition) - Chapter 3 - Introduction to SQL_database system concepts, sixth edition-CSDN博客

本文链接：https://blog.csdn.net/qq_22938603/article/details/123612514

本文介绍了SQL作为最广泛使用的数据库查询语言，不仅用于查询，还可定义数据结构、修改数据库及设置安全约束。从1986年ANSI和ISO发布的SQL标准开始，SQL已发展成为关系型数据库的标准语言。SQL包括DDL（数据定义语言）和DML（数据操纵语言），涵盖了查询、事务处理、视图和安全性等。此外，文章还讨论了基本数据类型、模式定义、子查询和数据库修改等内容。

摘要生成于 C知道，由 DeepSeek-R1 满血版支持，前往体验 >

Introduction to SQL
3.1 Overview of the SQL Query Language
3.2 SQL Data Definition
- 3.2.1 Basic Types
- 3.2.2 Basic Schema Definition
3.3 Basic Structure of SQL Queries
- 3.3.1 Queries on a Single Relation
- 3.3.2 Queries on Multiple Relations
- 3.3.3 The Natural Join
3.4 Additional Basic Operations
- 3.4.1 The Rename Operation
- 3.4.2 String Operations
- 3.4.3 Attribute Specification in Select Clause
- 3.4.4 Ordering the Display of Tuples
- 3.4.5 Where Clause Predicates
3.5 Set Operations
- 3.5.1 The Union Operation
- 3.5.2 The Intersect Operation
- 3.5.3 The Except Operation
3.6 Null Values
3.7 Aggregate Functions
- 3.7.1 Basic Aggregation
- 3.7.2 Aggregation with Grouping
- 3.7.3 The Having Clause
- 3.7.4 Aggregation with Null and Boolean Values
3.8 Nested Subqueries
- 3.8.1 Set Membership
- 3.8.2 Set Comparison
- 3.8.3 Test for Empty Relations
- 3.8.4 Test for the Absence of Duplicate Tuples
- 3.8.5 Subqueries in the From Clause
- 3.8.6 The with Clause
- 3.8.7 Scalar Subqueries
3.9 Modification of the Database
- 3.9.1 Deletion
- 3.9.2 Insertion
- 3.9.3 Updates
3.10 Summary
Review Terms
Practice Exercises
Exercises
Tools
Bibliographical Notes

Introduction to SQL

There are a number of database query languages in use, either commercially or experimentally. In this chapter, as well as in Chapters 4 and 5, we study the most widely used query language, SQL.

Although we refer to the SQL language as a “query language,” it can do much more than just query a database. It can define the structure of the data, modify data in the database, and specify security constraints.

It is not our intention to provide a complete users’ guide for SQL. Rather,we present SQL’s fundamental constructs and concepts. Individual implementations of SQL may differ in details, or may support only a subset of the full language.

3.1 Overview of the SQL Query Language

IBM developed the original version of SQL, originally called Sequel, as part of the System R project in the early 1970s. The Sequel language has evolved since then, and its name has changed to SQL (Structured Query Language). Many products now support the SQL language. SQL has clearly established itself as the standard relational database language.

In 1986, the American National Standards Institute (ANSI) and the International Organization for Standardization (ISO) published an SQL standard, called SQL-86. ANSIpublishedanextendedstandard for SQL, SQL-89, in 1989. The next version of the standardwas SQL-92 standard, followed by SQL:1999, SQL:2003, SQL:2006, and most recently SQL:2008. The bibliographic notes provide references to these standards.

The SQL language has several parts:

• Data-definition language (DDL).The SQL DDL provides commands for defining relation schemas, deleting relations, and modifying relation schemas.

• Data-manipulation language (DML).The SQL DML provides the ability to query information from the database and to insert tuples into, delete tuples from, and modify tuples in the database.

• Integrity.The SQL DDL includes commands for specifying integrity constraints that the data stored in the database must satisfy. Updates that violate integrity constraints are disallowed.

• View definition.The SQL DDL includes commands for defining views.

• Transaction control.SQL includes commands for specifying the beginning and ending of transactions.

• Embedded SQL and dynamic SQL.Embedded and dynamic SQL define how SQL statements can be embedded within general-purpose programming languages, such as C, C++, and Java.

• Authorization.The SQL DDL includes commands for specifying access rights to relations and views.

In this chapter, we present a survey of basic DML and the DDL features of SQL. Features described here have been part of the SQL standard since SQL-92.

In Chapter 4,we provide amore detailed coverage of the SQL query language, including (a) various join expressions; (b) views; © transactions; (d) integrity constraints; (e) type
system; and (f) authorization.

In Chapter 5,we cover more advanced features of the SQL language, including (a) mechanisms to allow accessing SQL from a programming language; (b) SQL functions and procedures; © triggers; (d) recursive queries; (e) advanced aggregation features; and (f) several features designed for data analysis, which were introduced in SQL:1999, and subsequent versions of SQL. Later, in Chapter 22, we outline object-oriented extensions to SQL, which were introduced in SQL:1999.

Although most SQL implementations support the standard features we describe here, you should be aware that there are differences between implementations. Most implementations support some nonstandard features, while omitting support for some of the more advanced features. In case you find that some language features described here do not work on the database system that you use, consult the user manuals for your database system to find exactly what features it supports.

3.2 SQL Data Definition

The set of relations in a database must be specified to the system by means of a data-definition language (DDL). The SQL DDL allows specification of not only a set of relations, but also information about each relation, including:

• The schema for each relation.
• The types of values associated with each attribute.
• The integrity constraints.
• The set of indices to be maintained for each relation.
• The security and authorization information for each relation.
• The physical storage structure of each relation on disk.

We discuss here basic schema definition and basic types; we defer discussion of the other SQL DDL features to Chapters 4 and 5.

3.2.1 Basic Types

The SQL standard supports a variety of built-in types, including:

• char(n):A fixed-length character string with user-specified length n. The full form, character, can be used instead.

• varchar(n):A variable-length character string with user-specified maximum length n. The full form, character varying, is equivalent.

• int:An integer (a finite subset of the integers that ismachine dependent). The full form, integer, is equivalent.

• smallint:A small integer (a machine-dependent subset of the integer type).

• numeric(p, d):Afixed-point numberwith user-specified precision. The number consists of p digits (plus a sign), and d of the p digits are to the right of the decimal point. Thus, numeric(3,1) allows 44.5 to be stored exactly, but neither 444.5 or 0.32 can be stored exactly in a field of this type.

• real, double precision:Floating-point and double-precision floating-point numbers with machine-dependent precision.

• float(n):A floating-point number, with precision of at least n digits. Additional types are covered in Section 4.5.

Each type may include a special value called the null value. A null value indicates an absent value that may exist but be unknown or that may not exist at all. In certain cases, we may wish to prohibit null values from being entered, as we shall see shortly.

The char data type stores fixed length strings. Consider, for example, an attribute A of type char(10). If we store a string “Avi” in this attribute, 7 spaces are appended to the string to make it 10 characters long. In contrast, if attribute B were of type varchar(10), and we store “Avi” in attribute B, no spaces would be added. When comparing two values of type char, if they are of different lengths extra spaces are automatically added to the shorter one to make them the same size, before comparison.

When comparing a char type with a varchar type, one may expect extra spaces to be added to the varchar type to make the lengths equal, before comparison; however, this may or may not be done, depending on the database system. As a result, even if the same value “Avi” is stored in the attributes A and B above, a comparison A=B may return false. We recommend you always use the varchar type instead of the char type to avoid these problems.

SQL also provides the nvarchar type to store multilingual data using the Unicode representation. However, many databases allow Unicode (in the UTF-8 representation) to be stored even in varchar types.

3.2.2 Basic Schema Definition

We define an SQL relation by using the create table command. The following command creates a relation department in the database.

create table department
(dept name varchar (20),
building varchar (15),
budget numeric (12,2),
primary key (dept name));

The relation created above has three attributes, dept name, which is a character string of maximum length 20, building, which is a character string of maximum length 15, and budget, which is a number with 12 digits in total, 2 of which are after the decimal point. The create table command also specifies that the dept name attribute is the primary key of the department relation.

The general form of the create table command is:

create table r
   (A1 D1,
    A2 D2,
    . . . ,
    An Dn,
   <integrity-constraint1>,
   . . . ,
   <integrity-constraintk>
   );

where r is the name of the relation, each Ai is the name of an attribute in the schema of relation r, and Di is the domain of attribute Ai; that is, Di specifies the type of attribute Ai along with optional constraints that restrict the set of allowed values for Ai .

The semicolon shown at the end of the create table statements, as well as at the end of other SQL statements later in this chapter, is optional in many SQL implementations.

SQL supports a number of different integrity constraints. In this section, we discuss only a few of them:
• primary key (Aj1 , Aj2, . . . , Ajm ):The primary-key specification says that attributes Aj1 , Aj2, . . . , Ajm form the primary key for the relation. The primarykey attributes are required to be nonnull and unique; that is, no tuple can have a null value for a primary-key attribute, and no two tuples in the relation can be equal on all the primary-key attributes. Although the primary-key specification is optional, it is generally a good idea to specify a primary key
for each relation.

• foreign key (Ak1 , Ak2, . . . , Akn ) references s:Theforeign key specification says that the values of attributes (Ak1 , Ak2, . . . , Akn ) for any tuple in the relation must correspond to values of the primary key attributes of some tuple in relation s.

Figure 3.1presents a partial SQLDDL definition of the university databasewe use in the text. The definition of the course table has a declaration “foreign key (dept name) references department”. This foreign-key declaration specifies that for each course tuple, the department name specified in the tuple must exist in the primary key attribute (dept name) of the department relation. Without this constraint, it is possible for a course to specify a nonexistent department name. Figure 3.1 also shows foreign key constraints on tables section, instructor and teaches.

• not null:The not null constraint on an attribute specifies that the null value is not allowed for that attribute; in other words, the constraint excludes the null value from the domain of that attribute. For example, in Figure 3.1, the not null constraint on the name attribute of the instructor relation ensures that the name of an instructor cannot be null.

More details on the foreign-key constraint, aswell as on other integrity constraints that the create table command may include, are provided later, in Section 4.4.

SQL prevents any update to the database that violates an integrity constraint. For example, if a newly inserted or modified tuple in a relation has null values for any primary-key attribute, or if the tuple has the same value on the primary-key attributes as does another tuple in the relation, SQL flags an error and prevents the update. Similarly, an insertion of a course tuple with a dept name value that does not appear in the department relation would violate the foreign-key constraint on course, and SQL prevents such an insertion from taking place.

A newly created relation is empty initially. We can use the insert command to load data into the relation. For example, if we wish to insert the fact that there is an instructor named Smith in the Biology department with instructor id 10211 and a salary of $66,000, we write:

insert into instructor
   values (10211, ’Smith’, ’Biology’, 66000);

The values are specified in the order in which the corresponding attributes are listed in the relation schema. The insert command has a number of useful features, and is covered in more detail later, in Section 3.9.2.

Wecan use the deletecommandto delete tuples froma relation. The command would delete all tuples from the student relation. Other forms of the delete command allow specific tuples to be deleted; the delete command is covered in more detail later, in Section 3.9.1.

delete from student;

create table department
 (dept name varchar (20),
  building varchar (15),
  budget numeric (12,2),
  primary key (dept name));

create table course
 (course id varchar (7),
  title varchar (50),
  dept name varchar (20),
  credits numeric (2,0),
  primary key (course id),
  foreign key (dept name) references department);

create table instructor
 (ID varchar (5),
  name varchar (20) not null,
  dept name varchar (20),
  salary numeric (8,2),
  primary key (ID),
  foreign key (dept name) references department);

create table section
 (course id varchar (8),
  sec id varchar (8),
  semester varchar (6),
  year numeric (4,0),
  building varchar (15),
  room number varchar (7),
  time slot id varchar (4),
  primary key (course id, sec id, semester, year),
  foreign key (course id) references course);
  
create table teaches
 (ID varchar (5),
  course id varchar (8),
  sec id varchar (8),
  semester varchar (6),
  year numeric (4,0),
  primary key (ID, course id, sec id, semester, year),
  foreign key (course id, sec id, semester, year) references section,
  foreign key (ID) references instructor);

Figure 3.1 SQL data definition for part of the university database.

To remove a relation from an SQL database, we use the drop table command. The drop table command deletes all information about the dropped relation from the database. The command

drop table r;

is a more drastic action than

delete from r;

The latter retains relation r, but deletes all tuples in r. The former deletes not only all tuples of r, but also the schema for r. After r is dropped, no tuples can be inserted into r unless it is re-created with the create table command.

We use the alter table command to add attributes to an existing relation. All tuples in the relation are assigned null as the value for the new attribute. The form of the alter table command is

alter table r add AD;

where r is the name of an existing relation, A is the name of the attribute to be added, and D is the type of the added attribute. We can drop attributes from a relation by the command

alter table r drop A;

where r is the name of an existing relation, and A is the name of an attribute of the relation. Many database systems do not support dropping of attributes, although they will allow an entire table to be dropped.

3.3 Basic Structure of SQL Queries

The basic structure of an SQL query consists of three clauses: select, from, and where. The query takes as its input the relations listed in the from clause, operates on them as specified in the where and select clauses, and then produces a relation as the result. We introduce the SQL syntax through examples, and describe the general structure of SQL queries later.

3.3.1 Queries on a Single Relation

Let us consider a simple query using our university example, “Find the names of all instructors.” Instructor names are found in the instructor relation, so we
在这里插入图片描述
Figure 3.2 Result of “select name from instructor”.
put that relation in the from clause. The instructor’s name appears in the name attribute, so we put that in the select clause.

select name
from instructor;

The result is a relation consisting of a single attribute with the heading name. If the instructor relation is as shown in Figure 2.1, then the relation that results from the preceding query is shown in Figure 3.2.

Now consider another query, “Find the department names of all instructors,” which can be written as:

select dept name
from instructor;

Since more than one instructor can belong to a department, a department name could appear more than once in the instructor relation. The result of the above query is a relation containing the department names, shown in Figure 3.3.

In the formal, mathematical definition of the relational model, a relation is a set. Thus, duplicate tuples would never appear in relations. In practice, duplicate elimination is time-consuming. Therefore, SQL allows duplicates in relations as well as in the results of SQL expressions. Thus, the preceding SQL query lists each department name once for every tuple in which it appears in the instructor relation.

In those cases where we want to force the elimination of duplicates, we insert the keyword distinct after select. We can rewrite the preceding query as:

select distinct dept name
from instructor;

在这里插入图片描述
Figure 3.3 Result of “select dept name from instructor”.
ifwe want duplicates removed. The result of the above querywould contain each department name at most once.

SQL allows us to use the keyword all to specify explicitly that duplicates are not removed:

select all dept name
from instructor;

Since duplicate retention is the default, we shall not use all in our examples. To ensure the elimination of duplicates in the results of our example queries, we shall use distinct whenever it is necessary.

The select clause may also contain arithmetic expressions involving the operators +, −, ∗, and / operating on constants or attributes of tuples. For example, the query:

select ID, name, dept name, salary * 1.1
from instructor;

returns a relation that is the same as the instructor relation, except that the attribute salary is multiplied by 1.1. This shows what would result if we gave a 10% raise to each instructor; note, however, that it does not result in any change to the instructor relation.

SQL also provides special data types, such as various forms of the date type, and allows several arithmetic functions to operate on these types.We discuss this further in Section 4.5.1.

The where clause allows us to select only those rows in the result relation of the from clause that satisfy a specified predicate. Consider the query “Find the names of all instructors in the Computer Science department who have salary greater than $70,000.” This query can be written in SQL as:
在这里插入图片描述
Figure 3.4 Result of “Find the names of all instructors in the Computer Science department who have salary greater than $70,000.”

select name
from instructor
where dept name = ’Comp. Sci.’ and salary > 70000;

If the instructor relation is as shown in Figure 2.1, then the relation that results from the preceding query is shown in Figure 3.4.

SQL allows the use of the logical connectives and, or, and not in the where clause. The operands of the logical connectives can be expressions involving the comparison operators <, <=, >, >=, =, and <>. SQL allows us to use the comparison operators to compare strings and arithmetic expressions, as well as special types, such as date types.

We shall explore other features of where clause predicates later in this chapter.

3.3.2 Queries on Multiple Relations

So far our example queries were on a single relation. Queries often need to access information from multiple relations.We now study how to write such queries.

An an example, suppose we want to answer the query “Retrieve the names of all instructors, along with their department names and department building name.”

Looking at the schema of the relation instructor, we realize that we can get the department name from the attribute dept name, but the department building name is present in the attribute building of the relation department. To answer the query, each tuple in the instructor relation must be matched with the tuple in the department relation whose dept name value matches the dept name value of the instructor tuple.

In SQL, to answer the above query,we list the relations that need to be accessed in the from clause, and specify the matching condition in the where clause. The above query can be written in SQL as

select name, instructor.dept name, building
from instructor, department
where inst