SQL Basics

The pandas workflow is a common favorite among data analysts and data scientists.

The pandas workflow works well when:

  • the data fits in memory (a few gigabytes but not terabytes)
  • the data is relatively static (doesn’t need to be loaded into memory every minute because the data has changed)
  • only a single person is accessing the data (shared access to memory is difficult)
  • security isn’t important (security is critical for company scale production situations)

When the data changes frequently, requires shared access, doesn’t fit in memory, or needs to be secured, a database is a much better solution. A database is a data representation that lives on disk and can be queried, accessed, and updated without using much memory. We primarily interact with a database through a database management system, or DBMS for short.

In the pandas workflow, we spend most of our time thinking about which functions and methods to use and where to store intermediate results in variables, and juggling all of these. To work with data stored in a database, we instead use a language called SQL (Structured Query Language). In SQL, we express each request (whether it’s fetching a subset of the data or editing values in it) as a single query and then ask the DBMS to run the query and display any results.

For example, to fetch a specific subset of the data from a database, we would:

  • write the SQL query: SELECT * FROM salaries
  • ask the DBMS to run the query and display the results to us

Because the data lives on disk, we can work with datasets that consume multiple terabytes of disk space. Many data science teams in industry have servers and setups in cloud environments like Microsoft Azure or Amazon Web Services that let team members work with this scale of data. Robust and popular DBMS tools like Postgres and MySQL include powerful features for managing user credentials, security, and high data throughput (quickly changing data). In this course and the next, we’ll learn the fundamentals of SQL using a small, portable DBMS called SQLite. SQLite is one of the most widely deployed databases in the world and is lightweight enough that Python ships with a module for it, sqlite3, in its standard library. In later courses, we’ll dive into production systems like Postgres.
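As a rough sketch of the two-step database workflow (write the query, then ask the DBMS to run it), here is how it might look from Python using the built-in sqlite3 module. The in-memory database, the columns of the salaries table, and its two rows are all made up for illustration; only the query text comes from the example above.

```python
import sqlite3

# Assumption: a throwaway in-memory database stands in for a real database file.
conn = sqlite3.connect(":memory:")

# Hypothetical salaries table with invented rows, only so the query has data to read.
conn.execute("CREATE TABLE salaries (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO salaries VALUES (?, ?)",
    [("Ana", 52000), ("Ben", 61000)],
)

# Step 1: write the SQL query as plain text.
query = "SELECT * FROM salaries"

# Step 2: ask the DBMS to run the query and hand back the results.
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)

conn.close()
```

Each row comes back as a Python tuple with one value per column. With a real database file, the path to that file would replace ":memory:".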

In this course, we’ll explore data from the American Community Survey on job outcome statistics based on college majors. While the original CSV version can be found on FiveThirtyEight’s GitHub, we’ll be using a slightly modified version of the data that’s stored as a database. We’ll be working with a subset of the data that contains the 2010-2012 data for recent college grads only. In this post, we’ll learn how to write SQL queries to explore and start to understand the dataset.

Previewing A Table Using SELECT

Whenever we encountered a new dataset in the past, we displayed the first few rows to get familiar with the different columns, types of values, and some sample data.

We’ve loaded the dataset on job outcome statistics into a database. A database usually consists of multiple, related tables of data. Each table contains rows and columns, just like a CSV file. We’ll be working with the database file jobs.db, which contains a single table named recent_grads. In later courses, we’ll learn how to work with a database containing multiple tables.

To display the first 5 rows from the recent_grads table, we need to:

  • write SQL code that expresses this request
  • ask the SQLite RDBMS software to run the code and display the results.

Like other programming languages, code in SQL has to adhere to a defined structure and vocabulary. To specify that we want to return the first 5 rows from recent_grads, we need to run the following SQL query:

SELECT * FROM recent_grads LIMIT 5

index | Rank | Major_code | Major | Major_category | Total | Sample_size | Men | Women | ShareWomen | Employed | Full_time | Part_time | Full_time_year_round | Unemployed | Unemployment_rate | Median | P25th | P75th | College_jobs | Non_college_jobs | Low_wage_jobs
0 | 1 | 2419 | PETROLEUM ENGINEERING | Engineering | 2339 | 36 | 2057 | 282 | 0.120564 | 1976 | 1849 | 270 | 1207 | 37 | 0.018381 | 110000 | 95000 | 125000 | 1534 | 364 | 193
1 | 2 | 2416 | MINING AND MINERAL ENGINEERING | Engineering | 756 | 7 | 679 | 77 | 0.101852 | 640 | 556 | 170 | 388 | 85 | 0.117241 | 75000 | 55000 | 90000 | 350 | 257 | 50
2 | 3 | 2415 | METALLURGICAL ENGINEERING | Engineering | 856 | 3 | 725 | 131 | 0.153037 | 648 | 558 | 133 | 340 | 16 | 0.024096 | 73000 | 50000 | 105000 | 456 | 176 | 0
3 | 4 | 2417 | NAVAL ARCHITECTURE AND MARINE ENGINEERING | Engineering | 1258 | 16 | 1123 | 135 | 0.107313 | 758 | 1069 | 150 | 692 | 40 | 0.050125 | 70000 | 43000 | 80000 | 529 | 102 | 0
4 | 5 | 2405 | CHEMICAL ENGINEERING | Engineering | 32260 | 289 | 21239 | 11021 | 0.341631 | 25694 | 23170 | 5180 | 16697 | 1672 | 0.061098 | 65000 | 50000 | 75000 | 18314 | 4440 | 972

In this query, we specified:

  • the columns we wanted using SELECT *
  • the table we wanted to query using FROM recent_grads
  • the number of rows we wanted using LIMIT 5

Writing and running SQL queries in our interface is similar to writing and running Python code. Type the query in the code cell and click Run to execute the query against the database. If you write multiple queries in a code cell, SQLite will only display the last query’s results.

Let’s write a SQL query that returns the first 10 rows from recent_grads. It is the same query as above with LIMIT 10 in place of LIMIT 5, and it returns:

index | Rank | Major_code | Major | Major_category | Total | Sample_size | Men | Women | ShareWomen | Employed | Full_time | Part_time | Full_time_year_round | Unemployed | Unemployment_rate | Median | P25th | P75th | College_jobs | Non_college_jobs | Low_wage_jobs
0 | 1 | 2419 | PETROLEUM ENGINEERING | Engineering | 2339 | 36 | 2057 | 282 | 0.120564 | 1976 | 1849 | 270 | 1207 | 37 | 0.018381 | 110000 | 95000 | 125000 | 1534 | 364 | 193
1 | 2 | 2416 | MINING AND MINERAL ENGINEERING | Engineering | 756 | 7 | 679 | 77 | 0.101852 | 640 | 556 | 170 | 388 | 85 | 0.117241 | 75000 | 55000 | 90000 | 350 | 257 | 50
2 | 3 | 2415 | METALLURGICAL ENGINEERING | Engineering | 856 | 3 | 725 | 131 | 0.153037 | 648 | 558 | 133 | 340 | 16 | 0.024096 | 73000 | 50000 | 105000 | 456 | 176 | 0
3 | 4 | 2417 | NAVAL ARCHITECTURE AND MARINE ENGINEERING | Engineering | 1258 | 16 | 1123 | 135 | 0.107313 | 758 | 1069 | 150 | 692 | 40 | 0.050125 | 70000 | 43000 | 80000 | 529 | 102 | 0
4 | 5 | 2405 | CHEMICAL ENGINEERING | Engineering | 32260 | 289 | 21239 | 11021 | 0.341631 | 25694 | 23170 | 5180 | 16697 | 1672 | 0.061098 | 65000 | 50000 | 75000 | 18314 | 4440 | 972
5 | 6 | 2418 | NUCLEAR ENGINEERING | Engineering | 2573 | 17 | 2200 | 373 | 0.144967 | 1857 | 2038 | 264 | 1449 | 400 | 0.177226 | 65000 | 50000 | 102000 | 1142 | 657 | 244
6 | 7 | 6202 | ACTUARIAL SCIENCE | Business | 3777 | 51 | 832 | 960 | 0.535714 | 2912 | 2924 | 296 | 2482 | 308 | 0.095652 | 62000 | 53000 | 72000 | 1768 | 314 | 259
7 | 8 | 5001 | ASTRONOMY AND ASTROPHYSICS | Physical Sciences | 1792 | 10 | 2110 | 1667 | 0.441356 | 1526 | 1085 | 553 | 827 | 33 | 0.021167 | 62000 | 31500 | 109000 | 972 | 500 | 220
8 | 9 | 2414 | MECHANICAL ENGINEERING | Engineering | 91227 | 1029 | 12953 | 2105 | 0.139793 | 76442 | 71298 | 13101 | 54639 | 4650 | 0.057342 | 60000 | 48000 | 70000 | 52844 | 16384 | 3253
9 | 10 | 2408 | ELECTRICAL ENGINEERING | Engineering | 81527 | 631 | 8407 | 6548 | 0.437847 | 61928 | 55450 | 12695 | 41413 | 3895 | 0.059174 | 60000 | 45000 | 72000 | 45829 | 10874 | 3170

Filtering Rows Using WHERE

SQLite ran our query and returned the first 10 rows and all columns from the recent_grads table. Head to the dataset page and spend some time getting familiar with what each column represents.

Based on this dataset preview and an understanding of what each column represents, here are some questions we may have:

  • Which majors had mostly female students? Which ones had mostly male students?
  • Which majors had the largest spread (difference) between the 25th and 75th percentile starting salaries?
  • Which engineering majors had the highest full time employment rates?

Let’s start by focusing on the first question. The SQL workflow revolves around translating the question we want to answer to the subset of data we want from the database. To determine which majors had mostly female students, we want the following subset:

  • only the Major column
  • only the rows where ShareWomen is greater than 0.5 (corresponding to 50%)

To return only the Major column, we need to name that column in the SELECT part of the query (instead of using the * wildcard to return all columns):

SELECT Major FROM recent_grads

This will return all of the values in the Major column. We can specify multiple columns this way as well, separated by commas, and the results table will preserve the order in which we list them.
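To make the multiple-column case concrete, here is a minimal sketch using Python’s sqlite3 module. It assumes a cut-down, in-memory copy of recent_grads with three rows taken from the preview above, not the real jobs.db file. The columns are requested in a different order from the one in which the table defines them, to show that the results follow the SELECT list.

```python
import sqlite3

# Assumption: an in-memory stand-in for jobs.db holding three rows and
# three columns copied from the recent_grads preview earlier in this post.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recent_grads (Major TEXT, Major_category TEXT, ShareWomen REAL)"
)
conn.executemany(
    "INSERT INTO recent_grads VALUES (?, ?, ?)",
    [
        ("PETROLEUM ENGINEERING", "Engineering", 0.120564),
        ("MINING AND MINERAL ENGINEERING", "Engineering", 0.101852),
        ("METALLURGICAL ENGINEERING", "Engineering", 0.153037),
    ],
)

# List the columns we want, separated by commas, in the order we want them.
cursor = conn.execute("SELECT ShareWomen, Major FROM recent_grads")

# cursor.description has one entry per result column; the name comes first.
columns = [entry[0] for entry in cursor.description]
rows = cursor.fetchall()

print(columns)
for row in rows:
    print(row)

conn.close()
```

Swapping the two names in the SELECT list swaps the columns in every returned row, while the table itself is left untouched.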

To return only the values where ShareWomen is greater than or equal to 0.5, we need to add a WHERE clause:

SELECT Major FROM recent_grads
WHERE ShareWomen >= 0.5

Finally, we can limit the number of rows returned using LIMIT. Appending LIMIT 5 to the query above returns:

Major
ACTUARIAL SCIENCE
COMPUTER SCIENCE
ENVIRONMENTAL ENGINEERING
NURSING
INDUSTRIAL PRODUCTION TECHNOLOGIES

In the SELECT part of the query we express the specific columns we want, while in the WHERE part we express the specific rows we want. The beauty of SQL is that the two are independent: we can filter on a column without returning it.
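The sketch below combines SELECT, WHERE, and LIMIT in one query, run through Python’s sqlite3 module. It assumes a small in-memory copy of recent_grads with five rows taken from the tables in this post, and it uses LIMIT 2 rather than LIMIT 5 because the sample is so small. Note that the filter column, ShareWomen, does not appear in the SELECT list.

```python
import sqlite3

# Assumption: an in-memory stand-in for jobs.db with five rows copied
# from the tables shown in this post.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE recent_grads (Major TEXT, ShareWomen REAL)")
conn.executemany(
    "INSERT INTO recent_grads VALUES (?, ?)",
    [
        ("PETROLEUM ENGINEERING", 0.120564),
        ("CHEMICAL ENGINEERING", 0.341631),
        ("ACTUARIAL SCIENCE", 0.535714),
        ("ENVIRONMENTAL ENGINEERING", 0.558548),
        ("INDUSTRIAL PRODUCTION TECHNOLOGIES", 0.750473),
    ],
)

# SELECT picks the column, WHERE picks the rows, LIMIT caps how many come back.
query = """
SELECT Major FROM recent_grads
WHERE ShareWomen >= 0.5
LIMIT 2
"""
majors = [row[0] for row in conn.execute(query)]
print(majors)

conn.close()
```

Three of the sample rows pass the filter, so LIMIT 2 returns two of them. Without an ORDER BY clause, which two is up to the database.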

Let’s write a SQL query that returns the majors where females were a minority. We’ll only return the Major and ShareWomen columns (in that order) and won’t limit the number of rows returned.

SELECT Major, ShareWomen FROM recent_grads
WHERE ShareWomen < 0.5

Major | ShareWomen
PETROLEUM ENGINEERING | 0.120564
MINING AND MINERAL ENGINEERING | 0.101852
METALLURGICAL ENGINEERING | 0.153037
NAVAL ARCHITECTURE AND MARINE ENGINEERING | 0.107313
CHEMICAL ENGINEERING | 0.341631
NUCLEAR ENGINEERING | 0.144967
ASTRONOMY AND ASTROPHYSICS | 0.441356
MECHANICAL ENGINEERING | 0.139793
ELECTRICAL ENGINEERING | 0.437847
COMPUTER ENGINEERING | 0.199413
AEROSPACE ENGINEERING | 0.196450
BIOMEDICAL ENGINEERING | 0.119559
MATERIALS SCIENCE | 0.310820
ENGINEERING MECHANICS PHYSICS AND SCIENCE | 0.183985
BIOLOGICAL ENGINEERING | 0.320784
INDUSTRIAL AND MANUFACTURING ENGINEERING | 0.343473
GENERAL ENGINEERING | 0.252960
ARCHITECTURAL ENGINEERING | 0.350442
COURT REPORTING | 0.236063
FOOD SCIENCE | 0.222695
ELECTRICAL ENGINEERING TECHNOLOGY | 0.325092
MATERIALS ENGINEERING AND MATERIALS SCIENCE | 0.292607
MANAGEMENT INFORMATION SYSTEMS AND STATISTICS | 0.278790
CIVIL ENGINEERING | 0.227118
CONSTRUCTION SERVICES | 0.342229
OPERATIONS LOGISTICS AND E-COMMERCE | 0.322222
MISCELLANEOUS ENGINEERING | 0.189970
PUBLIC POLICY | 0.251389
ENGINEERING TECHNOLOGIES | 0.090713
MISCELLANEOUS FINE ARTS | 0.410180
GEOLOGICAL AND GEOPHYSICAL ENGINEERING | 0.324838
FINANCE | 0.355469
ECONOMICS | 0.340825
BUSINESS ECONOMICS | 0.249190
NUCLEAR, INDUSTRIAL RADIOLOGY, AND BIOLOGICAL … | 0.430537
ACCOUNTING | 0.253583
MATHEMATICS | 0.244103
PHYSICS | 0.448099
MEDICAL TECHNOLOGIES TECHNICIANS | 0.434298
STATISTICS AND DECISION SCIENCE | 0.281936
ENGINEERING AND INDUSTRIAL MANAGEMENT | 0.174123
MEDICAL ASSISTING SERVICES | 0.178982
COMPUTER PROGRAMMING AND DATA PROCESSING | 0.269194
GENERAL BUSINESS | 0.417925
ARCHITECTURE | 0.321770
INTERNATIONAL BUSINESS | 0.282903
PHARMACY PHARMACEUTICAL SCIENCES AND ADMINISTR… | 0.451465
MOLECULAR BIOLOGY | 0.077453
MISCELLANEOUS BUSINESS & MEDICAL ADMINISTRATION | 0.200023
MISCELLANEOUS ENGINEERING TECHNOLOGIES | 0.000000
MECHANICAL ENGINEERING RELATED TECHNOLOGIES | 0.377437
INDUSTRIAL AND ORGANIZATIONAL PSYCHOLOGY | 0.436302
PHYSICAL SCIENCES | 0.426924
MILITARY TECHNOLOGIES | 0.429685
ELECTRICAL, MECHANICAL, AND PRECISION TECHNOLO… | 0.232444
MARKETING AND MARKETING RESEARCH | 0.382900
POLITICAL SCIENCE AND GOVERNMENT | 0.485930
GEOGRAPHY | 0.473190
COMPUTER ADMINISTRATION MANAGEMENT AND SECURITY | 0.180883
COMPUTER NETWORKING AND TELECOMMUNICATIONS | 0.305005
GEOLOGY AND EARTH SCIENCE | 0.470197
PUBLIC ADMINISTRATION | 0.476461
COMMUNICATIONS | 0.305109
CRIMINAL JUSTICE AND FIRE PROTECTION | 0.125035
COMMERCIAL ART AND GRAPHIC DESIGN | 0.374356
SPECIAL NEEDS EDUCATION | 0.366177
TRANSPORTATION SCIENCES AND TECHNOLOGIES | 0.321296
NEUROSCIENCE | 0.475010
MULTI/INTERDISCIPLINARY STUDIES | 0.495397
ATMOSPHERIC SCIENCES AND METEOROLOGY | 0.124950
EDUCATIONAL ADMINISTRATION AND SUPERVISION | 0.448732
PHILOSOPHY AND RELIGIOUS STUDIES | 0.416810
ENGLISH LANGUAGE AND LITERATURE | 0.339671
SCIENCE AND COMPUTER TEACHER EDUCATION | 0.423209
MUSIC | 0.444582
COSMETOLOGY SERVICES AND CULINARY ARTS | 0.383719

Expressing Multiple Filter Criteria Using AND

To filter rows by specific criteria, we need to use the WHERE clause. A simple WHERE clause requires three things:

  • The column we want the database to filter on: ShareWomen
  • A comparison operator that specifies how we want to compare a value in a column: >
  • The value we want the database to compare each value to: 0.5

Here are the comparison operators we can use:

  • Less than: <
  • Less than or equal to: <=
  • Greater than: >
  • Greater than or equal to: >=
  • Equal to: =
  • Not equal to: !=

The comparison value after the operator must be either text or a number, depending on the field. Because ShareWomen is a numeric column, we don’t need to enclose the number 0.5 in quotes. A text value, by contrast, is enclosed in single quotes, as in Major_category = 'Engineering'. Finally, most database systems require that the SELECT and FROM clauses come first, before WHERE or any other clauses.
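Here is a short sketch of a few of these comparison operators in use, with a quoted text value in one case and a bare number in another. It assumes a four-row, in-memory copy of recent_grads built from rows shown in this post. The majors_where helper is hypothetical and exists only to keep the example short.

```python
import sqlite3

# Assumption: an in-memory stand-in for jobs.db with four rows copied
# from the tables shown in this post.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recent_grads (Major TEXT, Major_category TEXT, ShareWomen REAL)"
)
conn.executemany(
    "INSERT INTO recent_grads VALUES (?, ?, ?)",
    [
        ("PETROLEUM ENGINEERING", "Engineering", 0.120564),
        ("ACTUARIAL SCIENCE", "Business", 0.535714),
        ("ASTRONOMY AND ASTROPHYSICS", "Physical Sciences", 0.441356),
        ("ENVIRONMENTAL ENGINEERING", "Engineering", 0.558548),
    ],
)


def majors_where(condition):
    """Hypothetical helper: return the sorted majors matching a WHERE condition.

    The condition is pasted straight into the query text, which is fine for
    the fixed strings below but should not be done with untrusted input.
    """
    query = "SELECT Major FROM recent_grads WHERE " + condition
    return sorted(row[0] for row in conn.execute(query))


# Text values go in single quotes; numbers are written bare.
engineering = majors_where("Major_category = 'Engineering'")
not_engineering = majors_where("Major_category != 'Engineering'")
at_most_half = majors_where("ShareWomen <= 0.5")

print(engineering)
print(not_engineering)
print(at_most_half)

conn.close()
```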

We can use the AND operator to combine multiple filter criteria. For example, to determine which engineering majors had a majority of female graduates, we’d need to specify 2 filtering criteria: Major_category = 'Engineering' AND ShareWomen > 0.5. Selecting only the Major column with this filter returns:

Major
ENVIRONMENTAL ENGINEERING
INDUSTRIAL PRODUCTION TECHNOLOGIES

It looks like only 2 majors met these criteria. If we wanted to “zoom” back out to look at all of the columns for both of these majors to see if they share some other common attributes, we could modify the SELECT part of the query and use the symbol * to represent all columns:

SELECT * FROM recent_grads
WHERE Major_category = 'Engineering' AND ShareWomen > 0.5

index | Rank | Major_code | Major | Major_category | Total | Sample_size | Men | Women | ShareWomen | Employed | Full_time | Part_time | Full_time_year_round | Unemployed | Unemployment_rate | Median | P25th | P75th | College_jobs | Non_college_jobs | Low_wage_jobs
30 | 31 | 2410 | ENVIRONMENTAL ENGINEERING | Engineering | 4047 | 26 | 2639 | 3339 | 0.558548 | 2983 | 2384 | 930 | 1951 | 308 | 0.093589 | 50000 | 42000 | 56000 | 2028 | 830 | 260
38 | 39 | 2503 | INDUSTRIAL PRODUCTION TECHNOLOGIES | Engineering | 4631 | 73 | 528 | 1588 | 0.750473 | 4428 | 3988 | 597 | 3242 | 129 | 0.028308 | 46000 | 35000 | 65000 | 1394 | 2454 | 480

The ability to quickly iterate on queries as you think of new questions is the appeal of SQL. The SQL workflow lets data professionals focus on asking and answering questions, instead of lower level programming concepts. There’s a clear separation of concerns between the engine that stores, organizes, and retrieves the data and the language that lets people interface with the data easily.

As the scale of data has increased, engineers have maintained the interface of SQL while swapping out the database engine underneath. This allows people who need to ask and answer questions to easily transfer their SQL experience, even as database technologies change. For example, the Presto project lets you query using SQL but use data from database systems like MySQL, from a distributed file system like HDFS, and more.

Let’s write a SQL query that returns all majors that had a majority of female graduates and a median salary greater than 50000. Let’s only include the following columns in the results, in this order: Major, Major_category, Median, ShareWomen.

Major | Major_category | Median | ShareWomen
ACTUARIAL SCIENCE | Business | 62000 | 0.535714
COMPUTER SCIENCE | Computers & Mathematics | 53000 | 0.578766

Returning One of Several Conditions With OR

We used the AND operator to specify that a row needs to pass two Boolean conditions. Both of the conditions had to evaluate to True for the record to appear in the result set. If we instead wanted to keep records that meet either of the conditions, we would use the OR operator.

SELECT [column1, column2,...] FROM [table1]
WHERE [condition1] OR [condition2]

We’ll dive straight into a practice problem because we use the OR and AND operators in similar ways.

Write a SQL query that returns the first 20 majors that either have a Median salary greater than or equal to 10,000, or have less than or equal to 1,000 Unemployed people. Let’s only include the following columns in the results and in this order:

  • Major
  • Median
  • Unemployed

Major | Median | Unemployed
PETROLEUM ENGINEERING | 110000 | 37
MINING AND MINERAL ENGINEERING | 75000 | 85
METALLURGICAL ENGINEERING | 73000 | 16
NAVAL ARCHITECTURE AND MARINE ENGINEERING | 70000 | 40
CHEMICAL ENGINEERING | 65000 | 1672
NUCLEAR ENGINEERING | 65000 | 400
ACTUARIAL SCIENCE | 62000 | 308
ASTRONOMY AND ASTROPHYSICS | 62000 | 33
MECHANICAL ENGINEERING | 60000 | 4650
ELECTRICAL ENGINEERING | 60000 | 3895
COMPUTER ENGINEERING | 60000 | 2275
AEROSPACE ENGINEERING | 60000 | 794
BIOMEDICAL ENGINEERING | 60000 | 1019
MATERIALS SCIENCE | 60000 | 78
ENGINEERING MECHANICS PHYSICS AND SCIENCE | 58000 | 23
BIOLOGICAL ENGINEERING | 57100 | 589
INDUSTRIAL AND MANUFACTURING ENGINEERING | 57000 | 699
GENERAL ENGINEERING | 56000 | 2859
ARCHITECTURAL ENGINEERING | 54000 | 170
COURT REPORTING | 54000 | 11
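The results above include majors with more than 1,000 unemployed people, because with OR a row only has to pass one of the two conditions. The sketch below contrasts OR with AND on a tiny in-memory table. Three rows are copied from the results above, and the fourth, MADE-UP MAJOR, is invented so that one row fails both conditions.

```python
import sqlite3

# Assumption: an in-memory stand-in for jobs.db. The first three rows are
# copied from the results above; the last row is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recent_grads (Major TEXT, Median INTEGER, Unemployed INTEGER)"
)
conn.executemany(
    "INSERT INTO recent_grads VALUES (?, ?, ?)",
    [
        ("PETROLEUM ENGINEERING", 110000, 37),
        ("CHEMICAL ENGINEERING", 65000, 1672),
        ("MECHANICAL ENGINEERING", 60000, 4650),
        ("MADE-UP MAJOR", 9000, 5000),  # hypothetical row, fails both conditions
    ],
)

# OR keeps a row when at least one condition is true.
either = conn.execute(
    """
    SELECT Major, Median, Unemployed FROM recent_grads
    WHERE Median >= 10000 OR Unemployed <= 1000
    LIMIT 20
    """
).fetchall()

# AND keeps a row only when both conditions are true.
both = conn.execute(
    """
    SELECT Major, Median, Unemployed FROM recent_grads
    WHERE Median >= 10000 AND Unemployed <= 1000
    """
).fetchall()

print(either)
print(both)

conn.close()
```

On this sample, the OR query keeps the three real majors, while the AND query keeps only PETROLEUM ENGINEERING.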

Grouping Operators With Parentheses

There’s a certain class of questions that we can’t answer using only the techniques we’ve learned so far. For example, if we wanted to write a query that returned all Engineering majors that either had mostly female graduates or an unemployment rate below 5.1%, we would need to use parentheses to express this more complex logic.

The three raw conditions we’ll need are:

Major_category = 'Engineering'
ShareWomen > 0.5
Unemployment_rate < 0.051

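Here’s the WHERE clause of the query, written in lowercase and with parentheses grouping the conditions: where (Major_category = 'Engineering') and (ShareWomen > 0.5 or Unemployment_rate < 0.051)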
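As a sketch, the full parenthesized query can be run through Python’s sqlite3 module, once with lowercase keywords and once with capitalized ones. It assumes a five-row, in-memory copy of recent_grads built from rows shown in this post, with only the four columns the query needs.

```python
import sqlite3

# Assumption: an in-memory stand-in for jobs.db with five rows copied
# from the tables shown in this post.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recent_grads "
    "(Major TEXT, Major_category TEXT, ShareWomen REAL, Unemployment_rate REAL)"
)
conn.executemany(
    "INSERT INTO recent_grads VALUES (?, ?, ?, ?)",
    [
        ("PETROLEUM ENGINEERING", "Engineering", 0.120564, 0.018381),
        ("MINING AND MINERAL ENGINEERING", "Engineering", 0.101852, 0.117241),
        ("ENVIRONMENTAL ENGINEERING", "Engineering", 0.558548, 0.093589),
        ("ACTUARIAL SCIENCE", "Business", 0.535714, 0.095652),
        ("ASTRONOMY AND ASTROPHYSICS", "Physical Sciences", 0.441356, 0.021167),
    ],
)

# The same query twice: lowercase keywords, then capitalized keywords.
lower_query = """
select Major, Major_category, ShareWomen, Unemployment_rate
from recent_grads
where (Major_category = 'Engineering') and (ShareWomen > 0.5 or Unemployment_rate < 0.051)
"""
upper_query = """
SELECT Major, Major_category, ShareWomen, Unemployment_rate
FROM recent_grads
WHERE (Major_category = 'Engineering') AND (ShareWomen > 0.5 OR Unemployment_rate < 0.051)
"""

lower_rows = sorted(conn.execute(lower_query).fetchall())
upper_rows = sorted(conn.execute(upper_query).fetchall())

for row in lower_rows:
    print(row)

conn.close()
```

Both spellings return the same two Engineering majors on this sample: one qualifies through ShareWomen and the other through Unemployment_rate.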

The first thing you may notice is that we didn’t capitalize any of the operators or statements in the query. SQL’s built-in keywords are case-insensitive, which means we don’t have to capitalize operators like AND or statements like SELECT. This also goes for the column names (you can use either major_category or Major_category). We’ll stick to using capitalized SQL and the original column names to stay consistent.

The second thing you may notice is how we enclosed the logic we wanted to be evaluated together in parentheses. This is very similar to how we group mathematical calculations together in a particular order. The parentheses make it explicitly clear to the database that we want all of the rows where both of the following expressions evaluate to True:

(Major_category = 'Engineering') -> True or False
(ShareWomen > 0.5 OR Unemployment_rate < 0.051) -> True or False

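If we had written the WHERE clause without any parentheses, the database wouldn’t guess at our intentions. It would apply SQL’s operator precedence, in which AND binds more tightly than OR, and actually evaluate the following condition instead: (Major_category = 'Engineering' AND ShareWomen > 0.5) OR (Unemployment_rate < 0.051)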

Leaving the parentheses out means the AND is evaluated before the OR, so any major with an unemployment rate below 5.1% would be returned whether or not it’s an Engineering major, which isn’t the data we want. Now let’s run our intended query and see the results!
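To see the effect of leaving the parentheses out, the sketch below runs three versions of the WHERE condition against a small in-memory table: one without parentheses, one with the AND grouped first, and the intended one with the OR grouped first. The table is an assumption, a five-row copy of recent_grads built from rows shown in this post, and the majors_where helper is hypothetical.

```python
import sqlite3

# Assumption: an in-memory stand-in for jobs.db with five rows copied
# from the tables shown in this post.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recent_grads "
    "(Major TEXT, Major_category TEXT, ShareWomen REAL, Unemployment_rate REAL)"
)
conn.executemany(
    "INSERT INTO recent_grads VALUES (?, ?, ?, ?)",
    [
        ("PETROLEUM ENGINEERING", "Engineering", 0.120564, 0.018381),
        ("MINING AND MINERAL ENGINEERING", "Engineering", 0.101852, 0.117241),
        ("ENVIRONMENTAL ENGINEERING", "Engineering", 0.558548, 0.093589),
        ("ACTUARIAL SCIENCE", "Business", 0.535714, 0.095652),
        ("ASTRONOMY AND ASTROPHYSICS", "Physical Sciences", 0.441356, 0.021167),
    ],
)


def majors_where(condition):
    """Hypothetical helper: return the sorted majors matching a WHERE condition."""
    query = "SELECT Major FROM recent_grads WHERE " + condition
    return sorted(row[0] for row in conn.execute(query))


# No parentheses: AND is applied before OR.
no_parens = majors_where(
    "Major_category = 'Engineering' AND ShareWomen > 0.5 OR Unemployment_rate < 0.051"
)
# The grouping the database actually uses for the line above.
and_first = majors_where(
    "(Major_category = 'Engineering' AND ShareWomen > 0.5) OR (Unemployment_rate < 0.051)"
)
# The grouping we intended.
intended = majors_where(
    "(Major_category = 'Engineering') AND (ShareWomen > 0.5 OR Unemployment_rate < 0.051)"
)

print(no_parens)
print(intended)

conn.close()
```

On this sample, the version without parentheses also returns ASTRONOMY AND ASTROPHYSICS, a Physical Sciences major with a low unemployment rate, while the intended grouping returns only the two Engineering majors.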

Let’s run the query we explored above, which returns all Engineering majors that either had mostly women graduates or had an unemployment rate below 5.1%, which was the rate in August 2015. Let’s only include the following columns in the results and in this order:

  • Major
  • Major_category
  • ShareWomen
  • Unemployment_rate

SELECT Major, Major_category, ShareWomen, Unemployment_rate
FROM recent_grads
WHERE (Major_category = 'Engineering') AND (ShareWomen > 0.5 OR Unemployment_rate < 0.051)

Major | Major_category | ShareWomen | Unemployment_rate
PETROLEUM ENGINEERING | Engineering | 0.120564 | 0.018381
METALLURGICAL ENGINEERING | Engineering | 0.153037 | 0.024096
NAVAL ARCHITECTURE AND MARINE ENGINEERING | Engineering | 0.107313 | 0.050125
MATERIALS SCIENCE | Engineering | 0.310820 | 0.023043
ENGINEERING MECHANICS PHYSICS AND SCIENCE | Engineering | 0.183985 | 0.006334
INDUSTRIAL AND MANUFACTURING ENGINEERING | Engineering | 0.343473 | 0.042876
MATERIALS ENGINEERING AND MATERIALS SCIENCE | Engineering | 0.292607 | 0.027789
ENVIRONMENTAL ENGINEERING | Engineering | 0.558548 | 0.093589
INDUSTRIAL PRODUCTION TECHNOLOGIES | Engineering | 0.750473 | 0.028308
ENGINEERING AND INDUSTRIAL MANAGEMENT | Engineering | 0.174123 | 0.033652

Ordering Results Using ORDER BY

The results of every query we’ve written so far have been ordered by the Rank column. Recall the query from early in the post that returned all of the columns and didn’t filter rows on any specific criteria:

index | Rank | Major_code | Major | Major_category | Total | Sample_size | Men | Women | ShareWomen | Employed | Full_time | Part_time | Full_time_year_round | Unemployed | Unemployment_rate | Median | P25th | P75th | College_jobs | Non_college_jobs | Low_wage_jobs
0 | 1 | 2419 | PETROLEUM ENGINEERING | Engineering | 2339 | 36 | 2057 | 282 | 0.120564 | 1976 | 1849 | 270 | 1207 | 37 | 0.018381 | 110000 | 95000 | 125000 | 1534 | 364 | 193
1 | 2 | 2416 | MINING AND MINERAL ENGINEERING | Engineering | 756 | 7 | 679 | 77 | 0.101852 | 640 | 556 | 170 | 388 | 85 | 0.117241 | 75000 | 55000 | 90000 | 350 | 257 | 50
2 | 3 | 2415 | METALLURGICAL ENGINEERING | Engineering | 856 | 3 | 725 | 131 | 0.153037 | 648 | 558 | 133 | 340 | 16 | 0.024096 | 73000 | 50000 | 105000 | 456 | 176 | 0
3 | 4 | 2417 | NAVAL ARCHITECTURE AND MARINE ENGINEERING | Engineering | 1258 | 16 | 1123 | 135 | 0.107313 | 758 | 1069 | 150 | 692 | 40 | 0.050125 | 70000 | 43000 | 80000 | 529 | 102 | 0
4 | 5 | 2405 | CHEMICAL ENGINEERING | Engineering | 32260 | 289 | 21239 | 11021 | 0.341631 | 25694 | 23170 | 5180 | 16697 | 1672 | 0.061098 | 65000 | 50000 | 75000 | 18314 | 4440 | 972

As the questions we want to answer get more complex, we want more control over how the results are ordered. We can specify the order using the ORDER BY clause. For example, we may want to understand which majors that met the criteria in the WHERE statement had the lowest unemployment rate. The following query will return the results in ascending order by the Unemployment_rate column.

SELECT SELECT RankRank , , MajorMajor , , Major_categoryMajor_category , , ShareWomenShareWomen , , Unemployment_rate
Unemployment_rate
FROM FROM recent_grads
recent_grads
WHERE WHERE (( Major_category Major_category = = 'Engineering''Engineering' ) ) AND AND (( ShareWomen ShareWomen > > 0.5 0.5 OR OR Unemployment_rate Unemployment_rate < < 0.0510.051 )
)
ORDER ORDER BY BY Unemployment_rate
Unemployment_rate

| Rank | Major | Major_category | ShareWomen | Unemployment_rate |
| --- | --- | --- | --- | --- |
| 15 | ENGINEERING MECHANICS PHYSICS AND SCIENCE | Engineering | 0.183985 | 0.006334 |
| 1 | PETROLEUM ENGINEERING | Engineering | 0.120564 | 0.018381 |
| 14 | MATERIALS SCIENCE | Engineering | 0.310820 | 0.023043 |
| 3 | METALLURGICAL ENGINEERING | Engineering | 0.153037 | 0.024096 |
| 24 | MATERIALS ENGINEERING AND MATERIALS SCIENCE | Engineering | 0.292607 | 0.027789 |
| 39 | INDUSTRIAL PRODUCTION TECHNOLOGIES | Engineering | 0.750473 | 0.028308 |
| 51 | ENGINEERING AND INDUSTRIAL MANAGEMENT | Engineering | 0.174123 | 0.033652 |
| 17 | INDUSTRIAL AND MANUFACTURING ENGINEERING | Engineering | 0.343473 | 0.042876 |
| 4 | NAVAL ARCHITECTURE AND MARINE ENGINEERING | Engineering | 0.107313 | 0.050125 |
| 31 | ENVIRONMENTAL ENGINEERING | Engineering | 0.558548 | 0.093589 |
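
To try the query above without the full dataset, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. It assumes a cut-down recent_grads table with only five columns. Most rows are copied from the tables in this post, and one row is hypothetical, added only to show the Major_category filter at work.

```python
import sqlite3

# Cut-down stand-in for recent_grads (assumption: only these five columns).
rows = [
    (15, "ENGINEERING MECHANICS PHYSICS AND SCIENCE", "Engineering", 0.183985, 0.006334),
    (1, "PETROLEUM ENGINEERING", "Engineering", 0.120564, 0.018381),
    (4, "NAVAL ARCHITECTURE AND MARINE ENGINEERING", "Engineering", 0.107313, 0.050125),
    (5, "CHEMICAL ENGINEERING", "Engineering", 0.341631, 0.061098),
    (31, "ENVIRONMENTAL ENGINEERING", "Engineering", 0.558548, 0.093589),
    # Hypothetical row, not from the dataset: fails the Major_category test.
    (999, "EXAMPLE NON-ENGINEERING MAJOR", "Other", 0.900000, 0.010000),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE recent_grads ("
    "Rank INTEGER, Major TEXT, Major_category TEXT, "
    "ShareWomen REAL, Unemployment_rate REAL)"
)
conn.executemany("INSERT INTO recent_grads VALUES (?, ?, ?, ?, ?)", rows)

query = """
SELECT Rank, Major, Major_category, ShareWomen, Unemployment_rate
FROM recent_grads
WHERE (Major_category = 'Engineering')
  AND (ShareWomen > 0.5 OR Unemployment_rate < 0.051)
ORDER BY Unemployment_rate
"""

# Rows come back sorted by Unemployment_rate, lowest first.
result = conn.execute(query).fetchall()
for row in result:
    print(row)

conn.close()
```

In this sketch, CHEMICAL ENGINEERING is filtered out because it satisfies neither condition inside the second pair of parentheses, and the hypothetical row is filtered out by the category test.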

If we instead want the results ordered by the same column but in descending order, we can add the DESC keyword:

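SELECT Rank, Major, Major_category, ShareWomen, Unemployment_rate
FROM recent_grads
WHERE (Major_category = 'Engineering') AND (ShareWomen > 0.5 OR Unemployment_rate < 0.051)
ORDER BY Unemployment_rate DESC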

| Rank | Major | Major_category | ShareWomen | Unemployment_rate |
| --- | --- | --- | --- | --- |
| 31 | ENVIRONMENTAL ENGINEERING | Engineering | 0.558548 | 0.093589 |
| 4 | NAVAL ARCHITECTURE AND MARINE ENGINEERING | Engineering | 0.107313 | 0.050125 |
| 17 | INDUSTRIAL AND MANUFACTURING ENGINEERING | Engineering | 0.343473 | 0.042876 |
| 51 | ENGINEERING AND INDUSTRIAL MANAGEMENT | Engineering | 0.174123 | 0.033652 |
| 39 | INDUSTRIAL PRODUCTION TECHNOLOGIES | Engineering | 0.750473 | 0.028308 |
| 24 | MATERIALS ENGINEERING AND MATERIALS SCIENCE | Engineering | 0.292607 | 0.027789 |
| 3 | METALLURGICAL ENGINEERING | Engineering | 0.153037 | 0.024096 |
| 14 | MATERIALS SCIENCE | Engineering | 0.310820 | 0.023043 |
| 1 | PETROLEUM ENGINEERING | Engineering | 0.120564 | 0.018381 |
| 15 | ENGINEERING MECHANICS PHYSICS AND SCIENCE | Engineering | 0.183985 | 0.006334 |
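
The effect of DESC can be checked on a small scale. The sketch below assumes a two-column stand-in for recent_grads holding three rows taken from the results above. The helper name majors_ordered is hypothetical. ASC is the explicit spelling of the default ascending order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: a two-column stand-in for the real recent_grads table.
conn.execute("CREATE TABLE recent_grads (Major TEXT, Unemployment_rate REAL)")
conn.executemany(
    "INSERT INTO recent_grads VALUES (?, ?)",
    [
        ("PETROLEUM ENGINEERING", 0.018381),
        ("ENVIRONMENTAL ENGINEERING", 0.093589),
        ("MATERIALS SCIENCE", 0.023043),
    ],
)


def majors_ordered(direction):
    """Return majors sorted by Unemployment_rate in the given direction."""
    # A sort direction cannot be passed as a query parameter, so it is
    # checked against the two allowed keywords before being placed in the SQL.
    if direction not in ("ASC", "DESC"):
        raise ValueError("direction must be 'ASC' or 'DESC'")
    sql = f"SELECT Major FROM recent_grads ORDER BY Unemployment_rate {direction}"
    return [major for (major,) in conn.execute(sql)]


ascending = majors_ordered("ASC")    # lowest unemployment rate first
descending = majors_ordered("DESC")  # highest unemployment rate first
print(ascending)
print(descending)
```

With these three rows, the descending list is the ascending list reversed, which matches the relationship between the two result tables above.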

Let’s write a query that returns all majors where ShareWomen is greater than 0.3 and Unemployment_rate is less than 0.1. We’ll include only the following columns in the results, in this order:


  • Major
  • ShareWomen
  • Unemployment_rate

We’ll order the results in descending order by the ShareWomen column.


SELECT Major, ShareWomen, Unemployment_rate FROM recent_grads
WHERE ShareWomen > 0.3 AND Unemployment_rate < 0.1
ORDER BY ShareWomen DESC

| Major | ShareWomen | Unemployment_rate |
| --- | --- | --- |
| EARLY CHILDHOOD EDUCATION | 0.967998 | 0.040105 |
| MATHEMATICS AND COMPUTER SCIENCE | 0.927807 | 0.000000 |
| ELEMENTARY EDUCATION | 0.923745 | 0.046586 |
| ANIMAL SCIENCES | 0.910933 | 0.050862 |
| PHYSIOLOGY | 0.906677 | 0.069163 |
| MISCELLANEOUS PSYCHOLOGY | 0.905590 | 0.051908 |
| HUMAN SERVICES AND COMMUNITY ORGANIZATION | 0.904075 | 0.037819 |
| NURSING | 0.896019 | 0.044863 |
| GEOSCIENCES | 0.881294 | 0.024374 |
| MASS MEDIA | 0.877228 | 0.089837 |
| COGNITIVE SCIENCE AND BIOPSYCHOLOGY | 0.854523 | 0.075236 |
| ART HISTORY AND CRITICISM | 0.845934 | 0.060298 |
| EDUCATIONAL PSYCHOLOGY | 0.817099 | 0.065112 |
| GENERAL EDUCATION | 0.812877 | 0.057360 |
| SOCIAL WORK | 0.810704 | 0.068828 |
| TEACHER EDUCATION: MULTIPLE LEVELS | 0.798920 | 0.036546 |
| COUNSELING PSYCHOLOGY | 0.798746 | 0.053621 |
| MATHEMATICS TEACHER EDUCATION | 0.792095 | 0.016203 |
| PSYCHOLOGY | 0.779933 | 0.083811 |
| GENERAL MEDICAL AND HEALTH SERVICES | 0.774577 | 0.082102 |
| HEALTH AND MEDICAL ADMINISTRATIVE SERVICES | 0.770901 | 0.089626 |
| SOIL SCIENCE | 0.764427 | 0.000000 |
| AREA ETHNIC AND CIVILIZATION STUDIES | 0.758060 | 0.063429 |
| APPLIED MATHEMATICS | 0.753927 | 0.090823 |
| FAMILY AND CONSUMER SCIENCES | 0.752144 | 0.067128 |
| INDUSTRIAL PRODUCTION TECHNOLOGIES | 0.750473 | 0.028308 |
| SOCIAL PSYCHOLOGY | 0.747561 | 0.029650 |
| HUMANITIES | 0.745662 | 0.068584 |
| HOSPITALITY MANAGEMENT | 0.733992 | 0.061169 |
| SOCIAL SCIENCE OR HISTORY TEACHER EDUCATION | 0.733968 | 0.054083 |
| THEOLOGY AND RELIGIOUS VOCATIONS | 0.728495 | 0.062628 |
| FRENCH GERMAN LATIN AND OTHER COMMON FOREIGN L… | 0.728033 | 0.075566 |
| INTERDISCIPLINARY SOCIAL SCIENCES | 0.721866 | 0.092306 |
| MISCELLANEOUS AGRICULTURE | 0.719974 | 0.059767 |
| JOURNALISM | 0.719859 | 0.069176 |
| MISCELLANEOUS EDUCATION | 0.718365 | 0.059212 |
| COMPUTER AND INFORMATION SYSTEMS | 0.707719 | 0.093460 |
| COMMUNICATION DISORDERS SCIENCES AND SERVICES | 0.707136 | 0.047584 |
| MISCELLANEOUS HEALTH MEDICAL PROFESSIONS | 0.702020 | 0.081411 |
| LIBERAL ARTS | 0.700898 | 0.078268 |
| FORESTRY | 0.690365 | 0.096726 |
| OCEANOGRAPHY | 0.688999 | 0.056995 |
| ART AND MUSIC EDUCATION | 0.686024 | 0.038638 |
| PHYSICAL FITNESS PARKS RECREATION AND LEISURE | 0.683943 | 0.051467 |
| ADVERTISING AND PUBLIC RELATIONS | 0.673143 | 0.067961 |
| HUMAN RESOURCES AND PERSONNEL MANAGEMENT | 0.672161 | 0.059570 |
| MULTI-DISCIPLINARY OR GENERAL SCIENCE | 0.669999 | 0.055807 |
| FINE ARTS | 0.667034 | 0.084186 |
| COMPOSITION AND RHETORIC | 0.666119 | 0.081742 |
| HISTORY | 0.651741 | 0.095667 |
| ECOLOGY | 0.651660 | 0.054475 |
| GENETICS | 0.643331 | 0.034118 |
| TREATMENT THERAPY PROFESSIONS | 0.640000 | 0.059821 |
| NUTRITION SCIENCES | 0.638147 | 0.068701 |
| ZOOLOGY | 0.637293 | 0.046320 |
| INTERNATIONAL RELATIONS | 0.632987 | 0.096799 |
| UNITED STATES HISTORY | 0.630716 | 0.047179 |
| DRAMA AND THEATER ARTS | 0.629505 | 0.077541 |
| CRIMINOLOGY | 0.618223 | 0.097244 |
| MICROBIOLOGY | 0.615727 | 0.066776 |
| PLANT SCIENCE AND AGRONOMY | 0.606889 | 0.045455 |
| BIOLOGY | 0.601858 | 0.070725 |
| SECONDARY TEACHER EDUCATION | 0.601752 | 0.052229 |
| AGRICULTURE PRODUCTION AND MANAGEMENT | 0.594208 | 0.050031 |
| PRE-LAW AND LEGAL STUDIES | 0.591001 | 0.071965 |
| AGRICULTURAL ECONOMICS | 0.589712 | 0.077250 |
| STUDIO ARTS | 0.584776 | 0.089552 |
| ENVIRONMENTAL SCIENCE | 0.584556 | 0.078585 |
| BUSINESS MANAGEMENT AND ADMINISTRATION | 0.580948 | 0.072218 |
| COMPUTER SCIENCE | 0.578766 | 0.063173 |
| LANGUAGE AND DRAMA EDUCATION | 0.576360 | 0.050306 |
| MISCELLANEOUS BIOLOGY | 0.566641 | 0.058545 |
| NATURAL RESOURCES MANAGEMENT | 0.564639 | 0.066619 |
| ENVIRONMENTAL ENGINEERING | 0.558548 | 0.093589 |
| HEALTH AND MEDICAL PREPARATORY PROGRAMS | 0.556604 | 0.069780 |
| MISCELLANEOUS SOCIAL SCIENCES | 0.543405 | 0.073080 |
| ACTUARIAL SCIENCE | 0.535714 | 0.095652 |
| SOCIOLOGY | 0.532334 | 0.084951 |
| BOTANY | 0.528969 | 0.000000 |
| INFORMATION SCIENCES | 0.526476 | 0.060741 |
| PHARMACOLOGY | 0.524153 | 0.085532 |
| GENERAL AGRICULTURE | 0.515543 | 0.019642 |
| BIOCHEMICAL SCIENCES | 0.515406 | 0.080531 |
| INTERCULTURAL AND INTERNATIONAL STUDIES | 0.507377 | 0.083634 |
| PHYSICAL AND HEALTH EDUCATION TEACHING | 0.506721 | 0.074667 |
| CHEMISTRY | 0.505141 | 0.053972 |
| MULTI/INTERDISCIPLINARY STUDIES | 0.495397 | 0.070861 |
| NEUROSCIENCE | 0.475010 | 0.048482 |
| GEOLOGY AND EARTH SCIENCE | 0.470197 | 0.075449 |
| PHARMACY PHARMACEUTICAL SCIENCES AND ADMINISTR… | 0.451465 | 0.055521 |
| EDUCATIONAL ADMINISTRATION AND SUPERVISION | 0.448732 | 0.000000 |
| PHYSICS | 0.448099 | 0.048224 |
| MUSIC | 0.444582 | 0.075960 |
| ASTRONOMY AND ASTROPHYSICS | 0.441356 | 0.021167 |
| ELECTRICAL ENGINEERING | 0.437847 | 0.059174 |
| MEDICAL TECHNOLOGIES TECHNICIANS | 0.434298 | 0.036983 |
| NUCLEAR, INDUSTRIAL RADIOLOGY, AND BIOLOGICAL … | 0.430537 | 0.071540 |
| PHYSICAL SCIENCES | 0.426924 | 0.035354 |
| SCIENCE AND COMPUTER TEACHER EDUCATION | 0.423209 | 0.047264 |
| GENERAL BUSINESS | 0.417925 | 0.072861 |
| PHILOSOPHY AND RELIGIOUS STUDIES | 0.416810 | 0.096052 |
| MISCELLANEOUS FINE ARTS | 0.410180 | 0.089375 |
| COSMETOLOGY SERVICES AND CULINARY ARTS | 0.383719 | 0.055677 |
| MARKETING AND MARKETING RESEARCH | 0.382900 | 0.061215 |
| MECHANICAL ENGINEERING RELATED TECHNOLOGIES | 0.377437 | 0.056357 |
| COMMERCIAL ART AND GRAPHIC DESIGN | 0.374356 | 0.096798 |
| SPECIAL NEEDS EDUCATION | 0.366177 | 0.041508 |
| FINANCE | 0.355469 | 0.060686 |
| ARCHITECTURAL ENGINEERING | 0.350442 | 0.061931 |
| INDUSTRIAL AND MANUFACTURING ENGINEERING | 0.343473 | 0.042876 |
| CONSTRUCTION SERVICES | 0.342229 | 0.060023 |
| CHEMICAL ENGINEERING | 0.341631 | 0.061098 |
| ECONOMICS | 0.340825 | 0.099092 |
| ENGLISH LANGUAGE AND LITERATURE | 0.339671 | 0.087724 |
| ELECTRICAL ENGINEERING TECHNOLOGY | 0.325092 | 0.087557 |
| GEOLOGICAL AND GEOPHYSICAL ENGINEERING | 0.324838 | 0.075038 |
| OPERATIONS LOGISTICS AND E-COMMERCE | 0.322222 | 0.047859 |
| TRANSPORTATION SCIENCES AND TECHNOLOGIES | 0.321296 | 0.072725 |
| BIOLOGICAL ENGINEERING | 0.320784 | 0.087143 |
| MATERIALS SCIENCE | 0.310820 | 0.023043 |
| COMMUNICATIONS | 0.305109 | 0.075177 |
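
As a rough check that the SELECT list, not the table layout, decides which columns come back and in what order, here is a small sketch using Python's sqlite3 module. It assumes a three-column stand-in for recent_grads whose columns are deliberately stored in a different order. Four rows are taken from the tables in this post, and one row is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: columns stored in a different order than the query asks for.
conn.execute(
    "CREATE TABLE recent_grads (Unemployment_rate REAL, Major TEXT, ShareWomen REAL)"
)
conn.executemany(
    "INSERT INTO recent_grads (Major, ShareWomen, Unemployment_rate) VALUES (?, ?, ?)",
    [
        ("MATERIALS SCIENCE", 0.310820, 0.023043),
        ("EARLY CHILDHOOD EDUCATION", 0.967998, 0.040105),
        ("COMPUTER SCIENCE", 0.578766, 0.063173),
        ("PETROLEUM ENGINEERING", 0.120564, 0.018381),  # ShareWomen too low
        # Hypothetical row, not from the dataset: unemployment rate too high.
        ("EXAMPLE HIGH-UNEMPLOYMENT MAJOR", 0.800000, 0.150000),
    ],
)

query = """
SELECT Major, ShareWomen, Unemployment_rate FROM recent_grads
WHERE ShareWomen > 0.3 AND Unemployment_rate < 0.1
ORDER BY ShareWomen DESC
"""

cursor = conn.execute(query)
# cursor.description holds one tuple per result column; the name comes first.
columns = [col[0] for col in cursor.description]
majors = [row[0] for row in cursor.fetchall()]
print(columns)
print(majors)

conn.close()
```

The column names follow the order written after SELECT, and the two rows that fail either condition in the WHERE clause do not appear in the output.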

SQL is a powerful language for accessing data, and we hope you got a taste for it in this post. If you’d like to learn more, we encourage you to check out the SQL Fundamentals course, on which this blog post is based. In the course, we dive into how to:


  • calculate summary statistics
  • segment data using grouping
  • write more complex queries using subqueries
  • create your own local SQLite database and query it using Python

Translated from: https://www.pybloggers.com/2017/10/sql-fundamentals/
