目录
1.file
文件必须在指定路径下创建才有意义 /var/lib/clickhouse/user_files ,但该路径可修改,修改方式如下
- 修改配置文件,文件的创建目录位置
- -- clickhosue的配置文件的位置 默认在/etc/clickouse-server下
- vi config.xml 修改为
- <user_files_path>/</user_files_path>
- 重启服务
(1)读取单个文件
SELECT * FROM file('test.csv', 'CSV', 'column1 UInt32, column2 UInt32, column3 UInt32');
(2)读取多个文件
文件列如下:
- 'some_dir/some_file_1'
- 'some_dir/some_file_2'
- 'some_dir/some_file_3'
- 'another_dir/some_file_1'
- 'another_dir/some_file_2'
- 'another_dir/some_file_3'
SELECT count(*) FROM file('{some,another}_dir/some_file_{1..3}', 'TSV', 'name String, value UInt32');
或
SELECT count(*) FROM file('{some,another}_dir/*', 'TSV', 'name String, value UInt32');
in addtional , query the data from files named file000, file001, … , file999:
SELECT count(*) FROM file('big_dir/file{0..9}{0..9}{0..9}', 'CSV', 'name String, value UInt32');
注意事项:
- 暂时未发现有跳行参数设置方法
- 空值会被填充默认值:int填0,string填空字符串,date填1970-01-01
2.mysql
语法: mysql('host:port', 'database', 'table', 'user', 'password'[, replace_query, 'on_duplicate_clause'])
样例:select * from mysql('host:3306', 'test', 'table_name', 'user', 'pwd') ;
说明:
- 参数一:masql的地址 ,参数二:mysql数据库名,参数三:mysql的表名
- 参数四:用户名 ,参数五: 用户密码
3.S3
语法:s3(path, [aws_access_key_id, aws_secret_access_key,] format, structure, [compression])
样例:select * from s3('http://s3.amazonaws.com/bucket_name/test.csv','access_key_id', 'secret_access_key', 'CSV', 'c1 String,c2 String,c3 String ,c4 String,c5 String,c6 String');
amazon s3 url:
http://s3.amazonaws.com/bucket/key (for a bucket created in the US East (N. Virginia) region)
https://s3.amazonaws.com/bucket/key
http://s3-region.amazonaws.com/bucket/key
https://s3-region.amazonaws.com/bucket/key
http://s3.region.amazonaws.com/bucket/key
https://s3.region.amazonaws.com/bucket/key
http://s3.dualstack.region.amazonaws.com/bucket/key (for requests using IPv4 or IPv6)
https://s3.dualstack.region.amazonaws.com/bucket/key
http://bucket.s3.amazonaws.com/key
http://bucket.s3-region.amazonaws.com/key
http://bucket.s3.region.amazonaws.com/key
http://bucket.s3.dualstack.region.amazonaws.com/key (for requests using IPv4 or IPv6)
http://bucket.s3-website.region.amazonaws.com/key (if static website hosting is enabled on the bucket)
http://bucket.s3-accelerate.amazonaws.com/key (where the filetransfer exits Amazons network at the last possible moment so as to give the fastest possible transfer speed and lowest latency)
http://bucket.s3-accelerate.dualstack.amazonaws.com/key
http://bucket/key (where bucket is a DNS CNAME record pointing to bucket.s3.amazonaws.com)
https://access_point_name-account ID.s3-accesspoint.region.amazonaws.com (for requests via an access point granting restricted access to a bucket)
4.input
$ cat data.csv | clickhouse-client --query="INSERT INTO test SELECT lower(col1), col3 * col3 FROM input('col1 String, col2 Date, col3 Int32') FORMAT CSV";
$ cat data.csv | clickhouse-client --query="INSERT INTO test FORMAT CSV";
$ cat data.csv | clickhouse-client --query="INSERT INTO test SELECT * FROM input('test_structure') FORMAT CSV";
5.其他
参考官方文档:https://clickhouse.tech/docs/en/sql-reference/table-functions/
- merge
- numbers
- remote
- url
- jdbc
- odbc
- hdfs
- generateRandom
- cluster
- view