1. secure_file_priv
mysql 数据导出和导入csv格式时,需要特别注意 null 和空字符的处理,在导出和导入的结果要保持一致。
secure_file_priv 在 select into file 时指定文件存储位置。
如果为null表示不能使用 select into outfile ;
如果为 '' 表示可以使用 select into file 保存到任何目录;
该变量的修改,需要在my.cnf的[mysqld]中配置:
并且需要重启mysqld。
2. select from xxx into outfile
SELECT * FROM xxxx WHERE
INTO OUTFILE "/tmp/xxx.csv"
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY "\r\n";
使用 select into outfile 生成CSV格式,需要注意默认的转义字符为 eccaped by '\', 比如 NULL 导出CSV为: \N,如下所示:
导入时,使用的语句如下:
LOAD DATA INFILE '/tpm/xxx.csv'
INTO TABLE xxxxxx
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY "\r\n"
上面的导入语句和导出语句配合,确保 在导出和导入的过程不能发生把 NULL 丢失掉,比如 NULL变成字符串的'', 变成整数的 0 这样的问题。
有时不能用双引号包括,这个时候,需要去掉 OPTIONALLY ENCLOSED BY '"'
LOAD DATA INFILE '/tpm/xxx.csv'
INTO TABLE xxxxxx
FIELDS TERMINATED BY ',' LINES TERMINATED BY "\r\n"
有时需要导出CSV的表头,此时需要使用union进行处理:
select *
from (
SELECT '姓名','身份证号码','盟市','旗县'
UNION
SELECT * from table
) t INTO OUTFILE "/tmp/xxx.csv" FIELDS TERMINATED BY ',' LINES TERMINATED BY "\r\n"
上面 的 SELECT '姓名','身份证号码','盟市','旗县' 会生成表头;
有时导出的CSV用excel打开时乱码,此时需要指定编码:
select * from t into outfile 'tmp/xxx.csv' character set gbk
FIELDS TERMINATED BY ',' LINES TERMINATED BY "\r\n"
如果不想指定编码重新导出CSV的话,还有其他的处理方法:
在简体中文系统的环境下,EXCEL打开的CSV文件默认是ANSI编码,如果CSV文件的编码方式为utf-8、Unicode等编码可能就会出现文件乱码的情况。知道什么原因,那接下来就去解决:
1)把记事本等文本文件打开,然后另存文件,编码选择ANSI。
2)方法二:
创建一个新的Excel文件;切换至“数据”菜单,选择数据来源为“自文本”选择 CSV 文件,
查看帮助文档:
mysql> ? load data
Name: 'LOAD DATA'
Description:
Syntax:
LOAD DATA [LOW_PRIORITY | CONCURRENT] [LOCAL] INFILE 'file_name'
[REPLACE | IGNORE]
INTO TABLE tbl_name
[PARTITION (partition_name [, partition_name] ...)]
[CHARACTER SET charset_name]
[{FIELDS | COLUMNS}
[TERMINATED BY 'string']
[[OPTIONALLY] ENCLOSED BY 'char']
[ESCAPED BY 'char']
]
[LINES
[STARTING BY 'string']
[TERMINATED BY 'string']
]
[IGNORE number {LINES | ROWS}]
[(col_name_or_user_var
[, col_name_or_user_var] ...)]
[SET col_name={expr | DEFAULT},
[, col_name={expr | DEFAULT}] ...]
The LOAD DATA INFILE statement reads rows from a text file into a table
at a very high speed. LOAD DATA INFILE is the complement of SELECT ...
INTO OUTFILE. (See
http://dev.mysql.com/doc/refman/5.6/en/select-into.html.) To write data
from a table to a file, use SELECT ... INTO OUTFILE. To read the file
back into a table, use LOAD DATA INFILE. The syntax of the FIELDS and
LINES clauses is the same for both statements. Both clauses are
optional, but FIELDS must precede LINES if both are specified.
You can also load data files by using the mysqlimport utility; it
operates by sending a LOAD DATA INFILE statement to the server. The
--local option causes mysqlimport to read data files from the client
host. You can specify the --compress option to get better performance
over slow networks if the client and server support the compressed
protocol. See http://dev.mysql.com/doc/refman/5.6/en/mysqlimport.html.
For more information about the efficiency of INSERT versus LOAD DATA
INFILE and speeding up LOAD DATA INFILE, see
http://dev.mysql.com/doc/refman/5.6/en/insert-optimization.html.
The file name must be given as a literal string. On Windows, specify
backslashes in path names as forward slashes or doubled backslashes.
The character_set_filesystem system variable controls the
interpretation of the file name.
LOAD DATA supports explicit partition selection using the PARTITION
option with a list of one or more comma-separated names of partitions,
subpartitions, or both. When this option is used, if any rows from the
file cannot be inserted into any of the partitions or subpartitions
named in the list, the statement fails with the error Found a row not
matching the given partition set. For more information and examples,
see http://dev.mysql.com/doc/refman/5.6/en/partitioning-selection.html.
For partitioned tables using storage engines that employ table locks,
such as MyISAM, LOAD DATA cannot prune any partition locks. This does
not apply to tables using storage engines which employ row-level
locking, such as InnoDB. For more information, see
http://dev.mysql.com/doc/refman/5.6/en/partitioning-limitations-locking
.html.
The server uses the character set indicated by the
character_set_database system variable to interpret the information in
the file. SET NAMES and the setting of character_set_client do not
affect interpretation of input. If the contents of the input file use a
character set that differs from the default, it is usually preferable
to specify the character set of the file by using the CHARACTER SET
clause. A character set of binary specifies "no conversion."
LOAD DATA INFILE interprets all fields in the file as having the same
character set, regardless of the data types of the columns into which
field values are loaded. For proper interpretation of file contents,
you must ensure that it was written with the correct character set. For
example, if you write a data file with mysqldump -T or by issuing a
SELECT ... INTO OUTFILE statement in mysql, be sure to use a
--default-character-set option so that output is written in the
character set to be used when the file is loaded with LOAD DATA INFILE.
*Note*:
It is not possible to load data files that use the ucs2, utf16,
utf16le, or utf32 character set.
If you use LOW_PRIORITY, execution of the LOAD DATA statement is
delayed until no other clients are reading from the table. This affects
only storage engines that use only table-level locking (such as MyISAM,
MEMORY, and MERGE).
If you specify CONCURRENT with a MyISAM table that satisfies the
condition for concurrent inserts (that is, it contains no free blocks
in the middle), other threads can retrieve data from the table while
LOAD DATA is executing. This option affects the performance of LOAD
DATA a bit, even if no other thread is using the table at the same
time.
With row-based replication, CONCURRENT is replicated regardless of
MySQL version. With statement-based replication CONCURRENT is not
replicated prior to MySQL 5.5.1 (see Bug #34628). For more information,
see
http://dev.mysql.com/doc/refman/5.6/en/replication-features-load-data.h
tml.
The LOCAL keyword affects expected location of the file and error
handling, as described later. LOCAL works only if your server and your
client both have been configured to permit it. For example, if mysqld
was started with the local_infile system variable disabled, LOCAL does
not work. See
http://dev.mysql.com/doc/refman/5.6/en/load-data-local.html.
The LOCAL keyword affects where the file is expected to be found:
o If LOCAL is specified, the file is read by the client program on the
client host and sent to the server. The file can be given as a full
path name to specify its exact location. If given as a relative path
name, the name is interpreted relative to the directory in which the
client program was started.
When using LOCAL with LOAD DATA, a copy of the file is created in the
server's temporary directory. This is not the directory determined by
the value of tmpdir or slave_load_tmpdir, but rather the operating
system's temporary directory, and is not configurable in the MySQL
Server. (Typically the system temporary directory is /tmp on Linux
systems and C:\WINDOWS\TEMP on Windows.) Lack of sufficient space for
the copy in this directory can cause the LOAD DATA LOCAL statement to
fail.
o If LOCAL is not specified, the file must be located on the server
host and is read directly by the server. The server uses the
following rules to locate the file:
o If the file name is an absolute path name, the server uses it as
given.
o If the file name is a relative path name with one or more leading
components, the server searches for the file relative to the
server's data directory.
o If a file name with no leading components is given, the server
looks for the file in the database directory of the default
database.
In the non-LOCAL case, these rules mean that a file named as
./myfile.txt is read from the server's data directory, whereas the file
named as myfile.txt is read from the database directory of the default
database. For example, if db1 is the default database, the following
LOAD DATA statement reads the file data.txt from the database directory
for db1, even though the statement explicitly loads the file into a
table in the db2 database:
LOAD DATA INFILE 'data.txt' INTO TABLE db2.my_table;
Non-LOCAL load operations read text files located on the server. For
security reasons, such operations require that you have the FILE
privilege. See
http://dev.mysql.com/doc/refman/5.6/en/privileges-provided.html. Also,
non-LOCAL load operations are subject to the secure_file_priv system
variable setting. If the variable value is a nonempty directory name,
the file to be loaded must be located in that directory. If the
variable value is empty (which is insecure), the file need only be
readable by the server.
Using LOCAL is a bit slower than letting the server access the files
directly, because the contents of the file must be sent over the
connection by the client to the server. On the other hand, you do not
need the FILE privilege to load local files.
LOCAL also affects error handling:
o With LOAD DATA INFILE, data-interpretation and duplicate-key errors
terminate the operation.
o With LOAD DATA LOCAL INFILE, data-interpretation and duplicate-key
errors become warnings and the operation continues because the server
has no way to stop transmission of the file in the middle of the
operation. For duplicate-key errors, this is the same as if IGNORE is
specified. IGNORE is explained further later in this section.
The REPLACE and IGNORE keywords control handling of input rows that
duplicate existing rows on unique key values:
o If you specify REPLACE, input rows replace existing rows. In other
words, rows that have the same value for a primary key or unique
index as an existing row. See [HELP REPLACE].
o If you specify IGNORE, rows that duplicate an existing row on a
unique key value are discarded.
o If you do not specify either option, the behavior depends on whether
the LOCAL keyword is specified. Without LOCAL, an error occurs when a
duplicate key value is found, and the rest of the text file is
ignored. With LOCAL, the default behavior is the same as if IGNORE is
specified; this is because the server has no way to stop transmission
of the file in the middle of the operation.
URL: http://dev.mysql.com/doc/refman/5.6/en/load-data.html
mysql>
---------------------------------------------------------------------------------------------------------------------------------
基本语法:
load data [low_priority] [local] infile 'file_name txt' [replace | ignore]
into table tbl_name
[fields
[terminated by't']
[OPTIONALLY] enclosed by '']
[escaped by'\' ]]
[lines terminated by'n']
[ignore number lines]
[(col_name, )]
load data infile语句从一个文本文件中以很高的速度读入一个表中。使用这个命令之前,mysqld进程(服务)必须已经在运行。为了安全原因,当读取位于服务器上的文本文件时,文件必须处于数据库目录或可被所有人读取。另外,为了对服务器上文件使用load data infile,在服务器主机上你必须有file的权限。
1 如果你指定关键词low_priority,那么MySQL将会等到没有其他人读这个表的时候,才把插入数据。可以使用如下的命令:
load data low_priority infile "/home/mark/data sql" into table Orders;
2 如果指定local关键词,则表明从客户主机读文件。如果local没指定,文件必须位于服务器上。
3 replace和ignore关键词控制对现有的唯一键记录的重复的处理。如果你指定replace,新行将代替有相同的唯一键值的现有行。如果你指定ignore,跳过有唯一键的现有行的重复行的输入。如果你不指定任何一个选项,当找到重复键时,出现一个错误,并且文本文件的余下部分被忽略。例如:
load data low_priority infile "/home/mark/data sql" replace into table Orders;
4 分隔符
(1) fields关键字指定了文件记段的分割格式,如果用到这个关键字,MySQL剖析器希望看到至少有下面的一个选项:
terminated by分隔符:意思是以什么字符作为分隔符
enclosed by字段括起字符
escaped by转义字符
terminated by描述字段的分隔符,默认情况下是tab字符(\t)
enclosed by描述的是字段的括起字符。
escaped by描述的转义字符。默认的是反斜杠(backslash:\ )
例如:load data infile "/home/mark/Orders txt" replace into table Orders fields terminated by',' enclosed by '"';
(2)lines 关键字指定了每条记录的分隔符默认为'\n'即为换行符
如果两个字段都指定了那fields必须在lines之前。如果不指定fields关键字缺省值与如果你这样写的相同: fields terminated by'\t' enclosed by ’ '' ‘ escaped by'\\'
如果你不指定一个lines子句,缺省值与如果你这样写的相同: lines terminated by'\n'
例如:load data infile "/jiaoben/load.txt" replace into table test fields terminated by ',' lines terminated by '/n';
5 load data infile 可以按指定的列把文件导入到数据库中。 当我们要把数据的一部分内容导入的时候,,需要加入一些栏目(列/字段/field)到MySQL数据库中,以适应一些额外的需要。比方说,我们要从Access数据库升级到MySQL数据库的时候
下面的例子显示了如何向指定的栏目(field)中导入数据:
load data infile "/home/Order txt" into table Orders(Order_Number, Order_Date, Customer_ID);
6 当在服务器主机上寻找文件时,服务器使用下列规则:
(1)如果给出一个绝对路径名,服务器使用该路径名。
(2)如果给出一个有一个或多个前置部件的相对路径名,服务器相对服务器的数据目录搜索文件。
(3)如果给出一个没有前置部件的一个文件名,服务器在当前数据库的数据库目录寻找文件。
例如: /myfile txt”给出的文件是从服务器的数据目录读取,而作为“myfile txt”给出的一个文件是从当前数据库的数据库目录下读取。
使用mysql命令导出数据
利用mysql的-e参数,可以导出数据,最重要的是我们可以对导出的数据进行正则处理。
如下利用mysql命令导出数据到csv文件,并且把表中的null值在excel中显示为空。
[root@test2 ~]# mysql -e "set names gbk;select * from newsdb.t_hk_stock_news where news_time > '2019-03-31 23:59:59' limit 5" |sed -e "s/\t/,/g" -e "s/NULL/ /g" -e "s/\n/\r\n/g" > /db/test.csv
#在-e参数中实际使用了两条命令,一条是设置字符集,另一条是select语句,通过管道把每一行数据都通过正则来处理。
#正则中把字段之间的TAB键换为“,”,然后把字段值中的null替换为空字符
Load data 完整的demo:
LOAD DATA LOCAL INFILE '/tmp/2982/20200424/user.csv'
INTO TABLE t_user CHARACTER SET utf8mb4 FIELDS TERMINATED BY ','
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES
(userName, userNo, age, homeAddr)
SET province = '浙江省', city='杭州市', creatorId=2982, createTime='2020-04-24 13:24:24'
local 关键字的作用:CSV文件和mysql不在同一个服务器时,使用local关键字可以A服务器的CSV文件导入到 mysql所在的B服务器。
如果没有local关键字则,csv文件和mysql必须在同一个服务器上面;