连接Minio
两种方式:使用secret或者set
CREATE SECRET minio (
TYPE S3,
KEY_ID ‘A6JLo3XVrlgpJ0kaIQbz’,
SECRET ‘zqDtE4qnwHZIhChSPA5GiXZDPVXSCo7ZVQFqd3xe’,
ENDPOINT ‘XXX.:9000’,
url_style ‘path’,
use_ssl ‘false’,
REGION ‘us-east-1’
);
select * from duckdb_secrets();
SET s3_url_style=‘path’;
SET s3_endpoint=‘XXXX:9000’;
SET s3_access_key_id=‘A6JLo3XVrlgpJ0kaIQbz’ ;
SET s3_secret_access_key=‘zqDtE4qnwHZIhChSPA5GiXZDPVXSCo7ZVQFqd3xe’;
SET s3_use_ssl=‘false’;
要特别注意URL_STYLE的设置还有USE_SSL。
Minio 读写
可以直接讲csv表(customer) 写入minio并分区。
COPY customer TO ‘s3://duckdb/customer’ (
FORMAT PARQUET,
PARTITION_BY (c_mktsegment)
);
重写需要加overwrite_or_ignore, 建议先根据业务逻辑控制是否要删除再重写。
–多次copy 測試
COPY customer TO ‘s3://duckdb/customer’ (
FORMAT PARQUET,
PARTITION_BY (c_mktsegment)
);
IO Error: Directory “s3://duckdb/customer” is not empty! Enable OVERWRITE option to overwrite files
COPY customer TO ‘s3://duckdb/customer’ (
FORMAT PARQUET,
overwrite_or_ignore true,
PARTITION_BY (c_mktsegment)
);
如果使用table 模式
CREATE TABLE cus AS SELECT * FROM read_parquet(‘s3://duckdb/customer.parquet’);
CREATE TABLE cus_s3 AS SELECT * FROM read_parquet(‘s3://duckdb/customer//.parquet’);
有新文件加入到minio的时候,duckdb不会自动刷新文件,新文件不会被访问
select count(*) from cus_s3 where c_mktsegment=‘AUTOMOBILE’;
– 不用table 会直接访问所有文件
select count() from read_parquet('s3://duckdb/customer//*.parquet’) where c_mktsegment=‘AUTOMOBILE’;
其它
DuckDB is column-based and ODBC is row-based, there can be some inefficiencies when using ODBC with DuckDB.
ODBC can be slower than other methods of connecting to a database, such as using a native driver, as it adds an extra layer of abstraction between the application and the database
如果是本地库,duckdb会写wal
root@xiehq-virtual-machine:~/duckdb/db-data# ps -ef|grep duckdb
root 5773 5528 2 02:45 pts/1 00:01:13 ./duckdb ./db-data/testdb
root 6084 6083 1 03:00 pts/0 00:00:33 minio server --console-address :9001 /root/duckdb/data
root@xiehq-virtual-machine:~/duckdb/db-data# ls -ltr
total 287304
-rw-r–r-- 1 root root 269496320 Jul 26 03:38 testdb
-rw-r–r-- 1 root root 24696179 Jul 26 03:38 testdb.wal
duckdb定位是分析型db,因此事务类sql支撑比较少。