Long-Form Study Notes: TimescaleDB, the Time-Series Extension for PostgreSQL

Link to this article: https://blog.csdn.net/edimy/article/details/133012168

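I need to handle time-series data at work, and our existing system already runs on PostgreSQL. After some searching I found TimescaleDB, a PostgreSQL extension for time-series data. I spent a few days learning it and am keeping these notes along the way.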

I. Installation and Configuration

Local installation

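The website describes a Docker-based installation, but we could not pull the image, possibly because of network problems. So we first installed a local copy for testing. These are the steps on Ubuntu 22.04: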

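$ apt install gnupg postgresql-common apt-transport-https lsb-release wget

$ /usr/share/postgresql-common/pgdg/apt.postgresql.org.sh

# add the TimescaleDB package repository and its signing key
$ echo "deb https://packagecloud.io/timescale/timescaledb/ubuntu/ $(lsb_release -c -s) main" | tee /etc/apt/sources.list.d/timescaledb.list
$ wget --quiet -O - https://packagecloud.io/timescale/timescaledb/gpgkey | gpg --dearmor -o /etc/apt/trusted.gpg.d/timescaledb.gpg

$ apt update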

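# this installs the extension build for PostgreSQL 14
$ apt install timescaledb-2-postgresql-14

# after installation, run the tuning tool and accept the defaults (yes)
$ timescaledb-tune

# install the psql client tool
$ apt-get install postgresql-client

# restart the service
$ systemctl restart postgresql

# change the password of the postgres account
$ sudo -u postgres psql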
postgres=# \password postgres
postgres=# \q

TimescaleDB's settings are written into PostgreSQL's own configuration file, located at /etc/postgresql/14/main/postgresql.conf.
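To check that the tuning tool's changes are active, a few read-only statements in psql are enough. This is only a sketch; it assumes the service was restarted after timescaledb-tune ran and that you are connected as the postgres user.

```sql
-- path of the configuration file the running server loaded
SHOW config_file;

-- should include timescaledb once the extension library is preloaded
SHOW shared_preload_libraries;

-- memory settings that timescaledb-tune typically adjusts
SHOW shared_buffers;
SHOW work_mem;
```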

If pgAdmin needs to connect remotely, two more changes are required.
In postgresql.conf, set:

listen_addresses = '*'

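In pg_hba.conf, add the line below. Note that the trust method lets any host connect without a password, so it only suits an isolated test machine; with a password set as above, scram-sha-256 is the safer method.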

host    all             all             0.0.0.0/0               trust
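Whether both changes took effect can be checked from psql. This is a sketch, assuming a superuser connection; the pg_hba_file_rules view is available from PostgreSQL 10 onward.

```sql
-- listen_addresses only changes after a full server restart
SHOW listen_addresses;

-- pg_hba.conf edits are picked up by a reload, no restart needed
SELECT pg_reload_conf();

-- inspect the rules as the server parsed them
SELECT line_number, type, database, user_name, address, auth_method
  FROM pg_hba_file_rules;
```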

Docker installation

You can pull the prebuilt Docker image and run it directly:

docker run -d --name timescaledb -p 5432:5432 \
-v /ipom/data:/home/postgres/pgdata/data \
-e POSTGRES_PASSWORD=postgres timescale/timescaledb-ha:pg14-latest

You can also build the image yourself from a Dockerfile. Our device uses the arm64v8 architecture, so we tried building the image ourselves:

docker build \
--build-arg PG_VERSION="13.3" \
--build-arg PREV_IMAGE="arm64v8/postgres:13.3-alpine" \
--build-arg TS_VERSION="2.4.2" \
-t armv8/timescaledb:13.3 .

II. Basic Concepts

At this point the TimescaleDB extension for PostgreSQL is installed. Before using it, it is worth understanding a few of its core concepts.

1. Hypertable

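A hypertable is a PostgreSQL table that is partitioned by time, used mainly to store and process time-series data.
Hypertables live alongside regular PostgreSQL tables in the same database and add time-series features on top of them; you can work with a hypertable just as you would with a regular PostgreSQL table. A hypertable is partitioned by time, and each partition is called a chunk.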

-- create a tablespace and a database that uses it
CREATE TABLESPACE test_space OWNER user_name LOCATION 'directory_path';
CREATE DATABASE test_db TABLESPACE test_space;

-- enable the timescaledb extension in the database
\c test_db
CREATE EXTENSION IF NOT EXISTS timescaledb;

-- create a regular table
CREATE TABLE conditions (
   time        TIMESTAMPTZ       NOT NULL,
   location    TEXT              NOT NULL,
   device      TEXT              NOT NULL,
   temperature DOUBLE PRECISION  NULL,
   humidity    DOUBLE PRECISION  NULL
);
-- convert it into a hypertable
SELECT create_hypertable('conditions', 'time');
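To see the chunks in action, a short sketch follows. The three rows are hypothetical sample readings, not data from the original setup. Two of them are more than seven days apart, so with the default interval they should land in different chunks.

```sql
-- hypothetical sample readings; the first two and the last are 19 days apart
INSERT INTO conditions (time, location, device, temperature, humidity) VALUES
  ('2023-09-01 08:00:00+00', 'office', 'dev-1', 22.5, 40.1),
  ('2023-09-01 08:05:00+00', 'office', 'dev-2', 23.0, 41.7),
  ('2023-09-20 08:00:00+00', 'lab',    'dev-1', 19.8, 55.3);

-- a hypertable is queried like any regular table
SELECT * FROM conditions WHERE device = 'dev-1' ORDER BY time;

-- list the chunks that were created behind the scenes
SELECT show_chunks('conditions');
```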

-- query the hypertable's chunk time interval; the default is 604800000000 microseconds (7 days * 24 hours * 3600 seconds * 1000 ms * 1000 µs)
SELECT h.table_name, c.interval_length
  FROM _timescaledb_catalog.dimension c
  JOIN _timescaledb_catalog.hypertable h
    ON h.id = c.hypertable_id;
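The internal catalog is not the only way to read this value. The sketch below assumes TimescaleDB 2.x, where the timescaledb_information.dimensions view exposes the same interval in readable form and set_chunk_time_interval changes it. The one-day value is just an illustrative choice.

```sql
-- the same setting through the public information view
SELECT hypertable_name, column_name, time_interval
  FROM timescaledb_information.dimensions
 WHERE hypertable_name = 'conditions';

-- change the interval; only chunks created afterwards use the new value
SELECT set_chunk_time_interval('conditions', INTERVAL '1 day');
```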

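Before indexing a hypertable, first identify its partitioning columns. They always include a time column (time in the example above) and may optionally include a space-partitioning column (device would be a candidate in the example above). A unique index may span several columns, but it must contain every partitioning column. When creating the hypertable, the partitioning_column parameter names the space-partitioning column, and that column must therefore be part of the unique index.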

-- create a regular table
CREATE TABLE hypertable_example(
  time TIMESTAMPTZ,
  user_id BIGINT,
  device_id BIGINT,
  value FLOAT
);

-- create a unique index
CREATE UNIQUE INDEX idx_deviceid_time
  ON hypertable_example(device_id, time);
  
-- create the hypertable and specify the partitioning columns
SELECT * FROM create_hypertable(
  'hypertable_example',
  'time',
  partitioning_column => 'device_id',
  number_partitions => 4
);
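The unique-index rule can be tried out on this hypertable. The index names below are hypothetical, and the commented-out statement is one TimescaleDB is expected to reject because it leaves out the partitioning columns.

```sql
-- accepted: contains both partitioning columns (time and device_id)
CREATE UNIQUE INDEX idx_userid_deviceid_time
  ON hypertable_example(user_id, device_id, time);

-- expected to be rejected: the partitioning columns are missing
-- CREATE UNIQUE INDEX idx_userid ON hypertable_example(user_id);

-- a non-unique index has no such restriction
CREATE INDEX idx_userid_time
  ON hypertable_example(user_id, time DESC);
```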

2. Time buckets

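Bucketing means dividing time-series data into small periods of a fixed width. The time_bucket function then lets you run calculations over the data in each bucket, such as sums or averages. Note that buckets are laid out from an origin timestamp in steps of the bucket width. For buckets measured in months, years, or centuries, the default origin is midnight on January 1, 2000; for all other widths, it is midnight on January 3, 2000.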
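A minimal sketch of both points, reusing the conditions table from the hypertable section. The literal timestamps are arbitrary illustrations, and the third argument of the last query is the origin passed positionally.

```sql
-- hourly average temperature per device
SELECT time_bucket('1 hour', time) AS bucket,
       device,
       avg(temperature) AS avg_temp
  FROM conditions
 GROUP BY bucket, device
 ORDER BY bucket, device;

-- default origin 2000-01-03 is a Monday, so weekly buckets start on Mondays;
-- 2023-09-20 is a Wednesday and falls in the bucket starting 2023-09-18
SELECT time_bucket('1 week', TIMESTAMP '2023-09-20 10:30:00');

-- same width aligned to a Saturday origin; the bucket now starts 2023-09-16
SELECT time_bucket('1 week',
                   TIMESTAMP '2023-09-20 10:30:00',
                   TIMESTAMP '2000-01-01 00:00:00');
```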

---- To be continued ----