Spark SQL----INSERT TABLE

一、描述

INSERT语句将新行插入表中或覆盖表中的现有数据。插入的行可以由值表达式指定,也可以由查询结果指定。

二、语法

INSERT [ INTO | OVERWRITE ] [ TABLE ] table_identifier [ partition_spec ] [ ( column_list ) ]
    { VALUES ( { value | NULL } [ , ... ] ) [ , ( ... ) ] | query }

INSERT INTO [ TABLE ] table_identifier REPLACE WHERE boolean_expression query

三、参数

  • table_identifier
    指定一个表名称,可以选择使用数据库名称对其进行限定。
    语法:[ database_name. ] table_name
  • partition_spec
    一个可选参数,用于指定分区的键值对的逗号分隔列表。请注意,可以在分区规范中使用类型化的文字(例如,date’2019-01-02’)。
    语法:PARTITION ( partition_col_name = partition_col_val [ , … ] )
  • column_list
    一个可选参数,用于指定属于table_identifier表的列的逗号分隔列表。Spark会根据指定的column列表对输入查询的列进行重新排序,以匹配表schema。
    注:当前行为有一些局限性:
    • 所有指定的列都应存在于表中,并且不能相互重复。它包括除静态分区列之外的所有列。
    • column列表的大小应该与VALUES子句或查询中的数据大小完全相同。
  • VALUES ( { value | NULL } [ , … ] ) [ , ( … ) ]
    指定要插入的值。可以插入显式指定的值或NULL。必须使用逗号分隔子句中的每个值。可以指定多个值集来插入多行。
  • boolean_expression
    指定计算结果为布尔型的任何表达式。可以使用逻辑运算符(AND, OR)将两个或多个表达式组合在一起。
  • query
    生成要插入的行的查询。它可以是以下格式之一:
    • SELECT语句
    • Inline Table语句
    • FROM语句

四、例子

4.1 Insert Into

使用VALUES子句进行单行插入

CREATE TABLE students (name VARCHAR(64), address VARCHAR(64))
    USING PARQUET PARTITIONED BY (student_id INT);

INSERT INTO students VALUES
    ('Amy Smith', '123 Park Ave, San Jose', 111111);

SELECT * FROM students;
+---------+----------------------+----------+
|     name|               address|student_id|
+---------+----------------------+----------+
|Amy Smith|123 Park Ave, San Jose|    111111|
+---------+----------------------+----------+

使用VALUES子句进行多行插入

INSERT INTO students VALUES
    ('Bob Brown', '456 Taylor St, Cupertino', 222222),
    ('Cathy Johnson', '789 Race Ave, Palo Alto', 333333);

SELECT * FROM students;
+-------------+------------------------+----------+
|         name|                 address|student_id|
+-------------+------------------------+----------+
|    Amy Smith|  123 Park Ave, San Jose|    111111|
+-------------+------------------------+----------+
|    Bob Brown|456 Taylor St, Cupertino|    222222|
+-------------+------------------------+----------+
|Cathy Johnson| 789 Race Ave, Palo Alto|    333333|
+--------------+-----------------------+----------+

使用SELECT语句插入

-- Assuming the persons table has already been created and populated.
SELECT * FROM persons;
+-------------+--------------------------+---------+
|         name|                   address|      ssn|
+-------------+--------------------------+---------+
|Dora Williams|134 Forest Ave, Menlo Park|123456789|
+-------------+--------------------------+---------+
|  Eddie Davis|   245 Market St, Milpitas|345678901|
+-------------+--------------------------+---------+

INSERT INTO students PARTITION (student_id = 444444)
    SELECT name, address FROM persons WHERE name = "Dora Williams";

SELECT * FROM students;
+-------------+--------------------------+----------+
|         name|                   address|student_id|
+-------------+--------------------------+----------+
|    Amy Smith|    123 Park Ave, San Jose|    111111|
+-------------+--------------------------+----------+
|    Bob Brown|  456 Taylor St, Cupertino|    222222|
+-------------+--------------------------+----------+
|Cathy Johnson|   789 Race Ave, Palo Alto|    333333|
+-------------+--------------------------+----------+
|Dora Williams|134 Forest Ave, Menlo Park|    444444|
+-------------+--------------------------+----------+

使用TABLE语句插入

-- Assuming the visiting_students table has already been created and populated.
SELECT * FROM visiting_students;
+-------------+---------------------+----------+
|         name|              address|student_id|
+-------------+---------------------+----------+
|Fleur Laurent|345 Copper St, London|    777777|
+-------------+---------------------+----------+
|Gordon Martin| 779 Lake Ave, Oxford|    888888|
+-------------+---------------------+----------+

INSERT INTO students TABLE visiting_students;

SELECT * FROM students;
+-------------+--------------------------+----------+
|         name|                   address|student_id|
+-------------+--------------------------+----------+
|    Amy Smith|    123 Park Ave, San Jose|    111111|
+-------------+--------------------------+----------+
|    Bob Brown|  456 Taylor St, Cupertino|    222222|
+-------------+--------------------------+----------+
|Cathy Johnson|   789 Race Ave, Palo Alto|    333333|
+-------------+--------------------------+----------+
|Dora Williams|134 Forest Ave, Menlo Park|    444444|
+-------------+--------------------------+----------+
|Fleur Laurent|     345 Copper St, London|    777777|
+-------------+--------------------------+----------+
|Gordon Martin|      779 Lake Ave, Oxford|    888888|
+-------------+--------------------------+----------+

使用FROM语句插入

-- Assuming the applicants table has already been created and populated.
SELECT * FROM applicants;
+-----------+--------------------------+----------+---------+
|       name|                   address|student_id|qualified|
+-----------+--------------------------+----------+---------+
|Helen Davis| 469 Mission St, San Diego|    999999|     true|
+-----------+--------------------------+----------+---------+
|   Ivy King|367 Leigh Ave, Santa Clara|    101010|    false|
+-----------+--------------------------+----------+---------+
| Jason Wang|     908 Bird St, Saratoga|    121212|     true|
+-----------+--------------------------+----------+---------+

INSERT INTO students
     FROM applicants SELECT name, address, student_id WHERE qualified = true;

SELECT * FROM students;
+-------------+--------------------------+----------+
|         name|                   address|student_id|
+-------------+--------------------------+----------+
|    Amy Smith|    123 Park Ave, San Jose|    111111|
+-------------+--------------------------+----------+
|    Bob Brown|  456 Taylor St, Cupertino|    222222|
+-------------+--------------------------+----------+
|Cathy Johnson|   789 Race Ave, Palo Alto|    333333|
+-------------+--------------------------+----------+
|Dora Williams|134 Forest Ave, Menlo Park|    444444|
+-------------+--------------------------+----------+
|Fleur Laurent|     345 Copper St, London|    777777|
+-------------+--------------------------+----------+
|Gordon Martin|      779 Lake Ave, Oxford|    888888|
+-------------+--------------------------+----------+
|  Helen Davis| 469 Mission St, San Diego|    999999|
+-------------+--------------------------+----------+
|   Jason Wang|     908 Bird St, Saratoga|    121212|
+-------------+--------------------------+----------+

为分区列值使用类型化日期文字插入

CREATE TABLE students (name STRING, address  STRING) PARTITIONED BY (birthday DATE);

INSERT INTO students PARTITION (birthday = date'2019-01-02')
    VALUES ('Amy Smith', '123 Park Ave, San Jose');

SELECT * FROM students;
+-------------+-------------------------+-----------+
|         name|                  address|   birthday|
+-------------+-------------------------+-----------+
|    Amy Smith|   123 Park Ave, San Jose| 2019-01-02|

使用column列表插入

INSERT INTO students (address, name, student_id) VALUES
    ('Hangzhou, China', 'Kent Yao', 11215016);

SELECT * FROM students WHERE name = 'Kent Yao';
+---------+----------------------+----------+
|     name|               address|student_id|
+---------+----------------------+----------+
|Kent Yao |       Hangzhou, China|  11215016|
+---------+----------------------+----------+

同时使用分区spec和column列表进行插入

INSERT INTO students PARTITION (student_id = 11215017) (address, name) VALUES
    ('Hangzhou, China', 'Kent Yao Jr.');

SELECT * FROM students WHERE student_id = 11215017;
+------------+----------------------+----------+
|        name|               address|student_id|
+------------+----------------------+----------+
|Kent Yao Jr.|       Hangzhou, China|  11215017|
+------------+----------------------+----------+

4.2 Insert Overwrite

使用VALUES子句插入

-- Assuming the students table has already been created and populated.
SELECT * FROM students;
+-------------+--------------------------+----------+
|         name|                   address|student_id|
+-------------+--------------------------+----------+
|    Amy Smith|    123 Park Ave, San Jose|    111111|
|    Bob Brown|  456 Taylor St, Cupertino|    222222|
|Cathy Johnson|   789 Race Ave, Palo Alto|    333333|
|Dora Williams|134 Forest Ave, Menlo Park|    444444|
|Fleur Laurent|     345 Copper St, London|    777777|
|Gordon Martin|      779 Lake Ave, Oxford|    888888|
|  Helen Davis| 469 Mission St, San Diego|    999999|
|   Jason Wang|     908 Bird St, Saratoga|    121212|
+-------------+--------------------------+----------+

INSERT OVERWRITE students VALUES
    ('Ashua Hill', '456 Erica Ct, Cupertino', 111111),
    ('Brian Reed', '723 Kern Ave, Palo Alto', 222222);

SELECT * FROM students;
+----------+-----------------------+----------+
|      name|                address|student_id|
+----------+-----------------------+----------+
|Ashua Hill|456 Erica Ct, Cupertino|    111111|
|Brian Reed|723 Kern Ave, Palo Alto|    222222|
+----------+-----------------------+----------+

使用SELECT语句插入

-- Assuming the persons table has already been created and populated.
SELECT * FROM persons;
+-------------+--------------------------+---------+
|         name|                   address|      ssn|
+-------------+--------------------------+---------+
|Dora Williams|134 Forest Ave, Menlo Park|123456789|
+-------------+--------------------------+---------+
|  Eddie Davis|   245 Market St, Milpitas|345678901|
+-------------+--------------------------+---------+

INSERT OVERWRITE students PARTITION (student_id = 222222)
    SELECT name, address FROM persons WHERE name = "Dora Williams";

SELECT * FROM students;
+-------------+--------------------------+----------+
|         name|                   address|student_id|
+-------------+--------------------------+----------+
|   Ashua Hill|   456 Erica Ct, Cupertino|    111111|
+-------------+--------------------------+----------+
|Dora Williams|134 Forest Ave, Menlo Park|    222222|
+-------------+--------------------------+----------+

使用REPLACE WHERE语句插入

-- Assuming the persons and persons2 table has already been created and populated.
SELECT * FROM persons;
+-------------+--------------------------+---------+
|         name|                   address|      ssn|
+-------------+--------------------------+---------+
|Dora Williams|134 Forest Ave, Menlo Park|123456789|
+-------------+--------------------------+---------+
|  Eddie Davis|   245 Market St, Milpitas|345678901|
+-------------+--------------------------+---------+
    
SELECT * FROM persons2;
+-------------+--------------------------+---------+
|         name|                   address|      ssn|
+-------------+--------------------------+---------+
|   Ashua Hill|   456 Erica Ct, Cupertino|432795921|
+-------------+--------------------------+---------+

-- in an atomic operation, 1) delete rows with ssn = 123456789 and 2) insert rows from persons2 
INSERT INTO persons REPLACE WHERE ssn = 123456789 SELECT * FROM persons2

SELECT * FROM persons;
+-------------+--------------------------+---------+
|         name|                   address|      ssn|
+-------------+--------------------------+---------+
|  Eddie Davis|   245 Market St, Milpitas|345678901|
+-------------+--------------------------+---------+
|   Ashua Hill|   456 Erica Ct, Cupertino|432795921|
+-------------+--------------------------+---------+

使用TABLE语句插入

-- Assuming the visiting_students table has already been created and populated.
SELECT * FROM visiting_students;
+-------------+---------------------+----------+
|         name|              address|student_id|
+-------------+---------------------+----------+
|Fleur Laurent|345 Copper St, London|    777777|
+-------------+---------------------+----------+
|Gordon Martin| 779 Lake Ave, Oxford|    888888|
+-------------+---------------------+----------+

INSERT OVERWRITE students TABLE visiting_students;

SELECT * FROM students;
+-------------+---------------------+----------+
|         name|              address|student_id|
+-------------+---------------------+----------+
|Fleur Laurent|345 Copper St, London|    777777|
+-------------+---------------------+----------+
|Gordon Martin| 779 Lake Ave, Oxford|    888888|
+-------------+---------------------+----------+

使用FROM语句插入

-- Assuming the applicants table has already been created and populated.
SELECT * FROM applicants;
+-----------+--------------------------+----------+---------+
|       name|                   address|student_id|qualified|
+-----------+--------------------------+----------+---------+
|Helen Davis| 469 Mission St, San Diego|    999999|     true|
+-----------+--------------------------+----------+---------+
|   Ivy King|367 Leigh Ave, Santa Clara|    101010|    false|
+-----------+--------------------------+----------+---------+
| Jason Wang|     908 Bird St, Saratoga|    121212|     true|
+-----------+--------------------------+----------+---------+

INSERT OVERWRITE students
    FROM applicants SELECT name, address, student_id WHERE qualified = true;

SELECT * FROM students;
+-----------+-------------------------+----------+
|       name|                  address|student_id|
+-----------+-------------------------+----------+
|Helen Davis|469 Mission St, San Diego|    999999|
+-----------+-------------------------+----------+
| Jason Wang|    908 Bird St, Saratoga|    121212|
+-----------+-------------------------+----------+

为分区列值使用类型化日期文字插入

CREATE TABLE students (name STRING, address  STRING) PARTITIONED BY (birthday DATE);

INSERT INTO students PARTITION (birthday = date'2019-01-02')
    VALUES ('Amy Smith', '123 Park Ave, San Jose');

SELECT * FROM students;
+-------------+-------------------------+-----------+
|         name|                  address|   birthday|
+-------------+-------------------------+-----------+
|    Amy Smith|   123 Park Ave, San Jose| 2019-01-02|
+-------------+-------------------------+-----------+

INSERT OVERWRITE students PARTITION (birthday = date'2019-01-02')
    VALUES('Jason Wang', '908 Bird St, Saratoga');

SELECT * FROM students;
+-----------+-------------------------+-----------+
|       name|                  address|   birthday|
+-----------+-------------------------+-----------+
| Jason Wang|    908 Bird St, Saratoga| 2019-01-02|
+-----------+-------------------------+-----------+

使用column列表插入

INSERT OVERWRITE students (address, name, student_id) VALUES
    ('Hangzhou, China', 'Kent Yao', 11215016);

SELECT * FROM students WHERE name = 'Kent Yao';
+---------+----------------------+----------+
|     name|               address|student_id|
+---------+----------------------+----------+
|Kent Yao |       Hangzhou, China|  11215016|
+---------+----------------------+----------+

同时使用分区spec和column列表进行插入

INSERT OVERWRITE students PARTITION (student_id = 11215016) (address, name) VALUES
    ('Hangzhou, China', 'Kent Yao Jr.');

SELECT * FROM students WHERE student_id = 11215016;
+------------+----------------------+----------+
|        name|               address|student_id|
+------------+----------------------+----------+
|Kent Yao Jr.|       Hangzhou, China|  11215016|
+------------+----------------------+----------+
  • 14
    点赞
  • 3
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值