influxdb学习

最新推荐文章于 2024-08-09 09:33:24 发布

katy的小乖

最新推荐文章于 2024-08-09 09:33:24 发布

阅读量1.2k

点赞数 1

分类专栏：数据库 Influxdb

原文链接：https://blog.csdn.net/qian_feifei/article/details/75139652, https://jasper-zhang1.gitbooks.io/influxdb/content/Introduction/

版权

数据库同时被 2 个专栏收录

31 篇文章 2 订阅

订阅专栏

Influxdb

13 篇文章 1 订阅

订阅专栏

docker 安装

// 下载：如果不指定版本号，会默认拉取最新的
docker pull influxdb:1.3.3

// 查看本地镜像,已经存在刚刚拉取的influxdb:1.3.3
docker images
REPOSITORY           TAG                 IMAGE ID            CREATED             SIZE
docker.io/influxdb   1.3.3               fa78fa0c3591        2 years ago         227 MB

// 运行,把主机的8086端口映射到容器的8086端口，把主机的/root/metric映射到容器的/var/lib/influxdb目录，给服务起名为influxdb
docker run -p 8086:8086 -v /root/metric:/var/lib/influxdb -d --name influxdb fa78fa0c3591

卸载

centos

// 先检查安装的influxdb
rpm -qa|grep influx
// 打印结果
influxdb-1.3.3-1.x86_64 

// 卸载influxdb
rpm -e influxdb-1.3.3.x86_64
// 打印结果
warning: /etc/influxdb/influxdb.conf saved as /etc/influxdb/influxdb.conf.rpmsave
Removed symlink /etc/systemd/system/multi-user.target.wants/influxdb.service.
Removed symlink /etc/systemd/system/influxd.service.

下载安装

我的是centos

// 下载
wget https://dl.influxdata.com/influxdb/releases/influxdb-1.5.3.x86_64.rpm
// 解压
sudo yum localinstall influxdb-1.5.3.x86_64.rpm

启动

sudo systemctl start influxdb

// 启动成功的提示
Redirecting to /bin/systemctl start influxdb.service

关闭
sudo systemctl stop influxdb

查看influxdb进程

ps aux | grep influx
[root@localhost ~]# ps aux | grep influx
influxdb 17673  0.0  0.4 320804 18068 ?        Ssl  09:10   0:00 /usr/bin/influxd -config /etc/influxdb/influxdb.conf
root     17691  0.0  0.0 112704   972 pts/0    S+   09:50   0:00 grep --color=auto influx

启动influxdb客户端

[root@localhost ~]# influx
Connected to http://localhost:8086 version 1.5.3
InfluxDB shell version: 1.5.3
>

退出influx客户端

exit

查询数据库

show databases

influx指令用法

influx --help
Usage of influx:
  -version
       Display the version and exit.
  -host 'host name'
       Host to connect to.
  -port 'port #'
       Port to connect to.
  -socket 'unix domain socket'
       Unix socket to connect to.
  -database 'database name'
       Database to connect to the server.
  -password 'password'
      Password to connect to the server.  Leaving blank will prompt for password (--password '').
  -username 'username'
       Username to connect to the server.
  -ssl
        Use https for requests.
  -unsafeSsl
        Set this when connecting to the cluster using https and not use SSL verification.
  -execute 'command'
       Execute command and quit.
  -format 'json|csv|column'
       Format specifies the format of the server responses:  json, csv, or column.
  -precision 'rfc3339|h|m|s|ms|u|ns'
       Precision specifies the format of the timestamp:  rfc3339, h, m, s, ms, u or ns.
  -consistency 'any|one|quorum|all'
       Set write consistency level: any, one, quorum, or all
  -pretty
       Turns on pretty print for the json format.
  -import
       Import a previous database export from file
  -pps
       How many points per second the import will allow.  By default it is zeroand will not throttle importing.
  -path
       Path to file to import
  -compressed
       Set to true if the import file is compressed

Examples:

    # Use influx in a non-interactive mode to query the database "metrics" and pretty print json:
    $ influx -database 'metrics' -execute 'select * from cpu' -format 'json' -pretty

    # Connect to a specific database on startup and set database context:
    $ influx -database 'metrics' -host 'localhost' -port '8086'

创建数据库

CREATE DATABASE <db-name>

// 首先登录influxdb客户端
influx
// 创建数据库katy
create database 数据库名称
// 显示数据库信息(databases是复数)
show databases


[root@localhost ~]# influx
Connected to http://localhost:8086 version 1.5.3
InfluxDB shell version: 1.5.3
> create database katy
> show databases
name: databases
name
----
_internal
katy

查看已存在的数据库

show databases

使用指定的数据库

// use 数据库名称
> use _internal
Using database _internal
>

新增数据


// cpu为measurement, host和region为tag, value为field key, 0.89为field value
// measurement与tag之间没有空格，measurement后直接加,跟上tag key=tag value格式，多个tag k-v之间使用,分隔，tag k-v是非必须的。field k-v是必须的，tag k-v后跟个空格，再跟对应的field k-v，多个field k-v之间使用,分隔
> insert cpu,host=serverA,region=china value=0.89
// 查询刚刚新增的记录
> select * from cpu;
name: cpu
time                host    region value
----                ----    ------ -----
1565245452628724393 serverA china  0.89

查询数据的语法

https://jasper-zhang1.gitbooks.io/influxdb/content/Query_language/data_exploration.html

select子句

SELECT <field_key>[,<field_key>,<tag_key>] FROM <measurement_name>[,<measurement_name>]

// select支持以下几种格式
1. select *   // 返回所有的field和tag
2. select "field_key"    // 返回指定的field
3. select "field_key", "field_key"    // 返回多个指定的field
4. select "field_key", "tag_key"  // 返回指定的field_key和tag_key。当包含一个tag_key的时候，必须至少指定一个field_key。
5. select "field_key"::field,"tag_key"::tag  // 返回指定的field_key和tag_key，区分具有同名的fiel_key和tag_key


// 例子，measurement为temperature, tag为machine和type, field为external和internal
INSERT temperature,machine=unit42,type=assembly external=25,internal=37
// 1. select * 
> select * from "temperature"
name: temperature
time                external internal machine type
----                -------- -------- ------- ----
1565247133525966065 25       37       unit42  assembly
>

// 2. select "field_key"
 select "external" from "temperature"
name: temperature
time                external
----                --------
1565247133525966065 25
>

// 3.select "field_key","field_key"
 select "external","internal" from "temperature"
name: temperature
time                external internal
----                -------- --------
1565247133525966065 25 
>

// 4. select "field_key", "tag_key"
 select "internal", "machine" from "temperature"
name: temperature
time                internal machine
----                -------- -------
1565247133525966065 37       unit42
>
// 如果只查询某个tag，而不查询field，就查询不到任何结果
 select "machine" from "temperature"
>

// 5. todo 
....

from 子句

// from子句支持的格式
1. from measurement_name   // 从单个measurement返回数据
2. from measurement_name1, measurement_name2  // 从多个measurement返回数据
3. from database_name.retention_policy_name.measurement_name  // 从一个指定了数据库名称，保留策略名称的measurement返回数据
4. FROM database_name..measurement_name    // 从一个用户指定的数据库中返回存储策略为DEFAULT的数据。

where子句

在field、tag、timestamp上支持条件查询

// 语法
SELECT_clause FROM_clause WHERE <conditional_expression> [(AND|OR) <conditional_expression> [...]]

fields

// where子句支持field value是string, boolean, float, integer类型
field_key <operator> ['string' | boolean | float | integer]

// operator支持的操作符
=  //等于
<>  // 不等于
！=  // 不等于
>  //大于
>=  //大于等于
<  // 小于
<=  // 小于等于

timestamps

默认时间范围是UTC的1677-09-21 00：12：43.145224194到2262-04-11T23：47：16.854775806Z.
对于只有GROUP BY time()子句的SELECT语句，默认时间范围在UTC的1677-09-21 00：12：43.145224194和now()之间

group by子句

group by后面可以跟上用户指定的tags或者时间间隔

group by tags

// 语法
SELECT_clause FROM_clause [WHERE_clause] GROUP BY [* | <tag_key>[,<tag_key]]

1. group by *  // 对结果中的所有tag作group by
2. group by tak_key  //对结果按照指定的tag_key作group by
3. group by tag_key1, tag_key2  // 对结果按多个tag作group by，tag key顺序无所谓

group by 时间间隔

group by time() 对返回结果按照指定的时间间隔group by

// 语法
SELECT <function>(<field_key>) FROM_clause WHERE <time_range> GROUP BY time(<time_interval>),[tag_key] [fill(<fill_option>)]

1. group by 必须在where之后
2. 如果time_interval=5m,则标识在where子句中指定的时间范围内将查询结果分到5分钟的时间组里

into子句

将查询结果写入到自定义的measurement中

// 语法
SELECT_clause INTO <measurement_name> FROM_clause [WHERE_clause] [GROUP_BY_clause]

order by time desc

influxdb的数据默认按照时间升序返回。

SELECT_clause [INTO_clause] FROM_clause [WHERE_clause] [GROUP_BY_clause] ORDER BY time DESC

limit

limit n // 从指定的measurement中返回前n个数据点


// 语法
SELECT_clause [INTO_clause] FROM_clause [WHERE_clause] [GROUP_BY_clause] [ORDER_BY_clause] LIMIT <N>

slimit

slimit n 返回指定mearurement的前n个series中的每个点

// 语法
SELECT_clause [INTO_clause] FROM_clause [WHERE_clause] GROUP BY *[,time(<time_interval>)] [ORDER_BY_clause] SLIMIT <N>

limit和slimit一起使用

// 语法
SELECT_clause [INTO_clause] FROM_clause [WHERE_clause] GROUP BY *[,time(<time_interval>)] [ORDER_BY_clause] LIMIT <N1> SLIMIT <N2>

offset

offset n 从查询结果中返回分页的n个数据点

// 语法
SELECT_clause [INTO_clause] FROM_clause [WHERE_clause] [GROUP_BY_clause] [ORDER_BY_clause] LIMIT_clause OFFSET <N> [SLIMIT_clause]

offset需要limit子句，没有limit子句的offset可能会导致查询结果不一致

soffset

soffset n 从查询结果中返回分页的n个series

// 语法
SELECT_clause [INTO_clause] FROM_clause [WHERE_clause] GROUP BY *[,time(time_interval)] [ORDER_BY_clause] [LIMIT_clause] [OFFSET_clause] SLIMIT_clause SOFFSET <N>

1. 如果soffset指定的值大于series数量，则返回空
2. n指的是series的分页数量。soffset需要一个slimit子句

time zone子句

tz()返回指定时区的UTC偏移量

// 语法
SELECT_clause [INTO_clause] FROM_clause [WHERE_clause] [GROUP_BY_clause] [ORDER_BY_clause] [LIMIT_clause] [OFFSET_clause] [SLIMIT_clause] [SOFFSET_clause] tz('<time_zone>')

正则表达式

// 语法
SELECT /<regular_expression_field_key>/ FROM /<regular_expression_measurement>/ WHERE [<tag_key> <operator> /<regular_expression_tag_value>/ | <field_key> <operator> /<regular_expression_field_value>/] GROUP BY /<regular_expression_tag_key>/

// 在以下场景支持正则表达式
1. 在SELECT中的field key和tag key；
2. 在FROM中的measurement
3. 在WHERE中的tag value和字符串类型的field value
4. 在GROUP BY中的tag key

数据类型

在select中支持指定field的类型

// 语法
SELECT_clause <field_key>::<type> FROM_clause

type可以是float，integer，string和boolean。在大多数情况下，如果field_key没有存储指定type的数据，那么InfluxDB将不会返回数据

类型转换

::语法允许用户在查询中做基本的数据类型转换。目前，InfluxDB支持从整数转到浮点，或者从浮点转到整数。

// 语法
SELECT_clause <field_key>::<type> FROM_clause


type可以是float或者integer。
如果查询试图把整数或者浮点数转换成字符串或者布尔型，InfluxDB将不会返回数据。

多语句

用;分割多个select。

子查询

子查询是嵌套在另一个查询的FROM子句中的查询。使用子查询将查询作为条件应用于其他查询。子查询提供与嵌套函数和SQLHAVING子句类似的功能。

// 语法
SELECT_clause FROM ( SELECT_statement ) [...]

InfluxDB首先执行子查询，再次执行主查询。

主查询围绕子查询，至少需要SELECT和FROM子句。主查询支持本文档中列出的所有子句。

子查询显示在主查询的FROM子句中，它需要附加的括号。子查询支持本文档中列出的所有子句。

多个子查询

InfluxQL每个主要查询支持多个嵌套子查询。多个子查询的示例语法：

SELECT_clause FROM ( SELECT_clause FROM ( SELECT_statement ) [...] ) [...]

schema查询

查询当前数据库使用的rp

> show retention policies
name    duration shardGroupDuration replicaN default
----    -------- ------------------ -------- -------
autogen 0s       168h0m0s           1        true

修改指定数据库的rp

https://docs.influxdata.com/influxdb/v1.7/query_language/database_management/#retention-policy-management

// 把telegraf数据库的rp名字修改为autogen，保留时间是3天，shard group时间为1天。
// rp时间必须大于shard group时间
ALTER RETENTION POLICY autogen ON telegraf DURATION 72h REPLICATION 1 SHARD DURATION 24h DEFAULT;

// 查询修改后的rp
> show retention policies;
name    duration shardGroupDuration replicaN default
----    -------- ------------------ -------- -------
autogen 72h0m0s  24h0m0s            1        true

查询CARDINALITY

use xxx;
// What's your database cardinality?
SHOW SERIES CARDINALITY;

查询series

> show series
key
---
cpu,host=serverA,region=china
temperature,machine=unit42,type=assembly

查询measurement

> show measurements
name: measurements
name
----
cpu
temperature

查询当前数据库里的所有tag keys

> show tag keys
name: cpu
tagKey
------
host
region

name: temperature
tagKey
------
machine
type

查询当前数据库的所有field keys

> show field keys
name: cpu
fieldKey fieldType
-------- ---------
value    float

name: temperature
fieldKey fieldType
-------- ---------
external float
internal float
>

数据库管理

https://jasper-zhang1.gitbooks.io/influxdb/content/Query_language/database_management.html

连续查询

对实时数据自动周期运行的查询，然后把查询结果写入到指定的measurement中。

https://jasper-zhang1.gitbooks.io/influxdb/content/Query_language/continuous_queries.html

// 语法
CREATE CONTINUOUS QUERY <cq_name> ON <database_name>
BEGIN
  <cq_query>
END

函数

https://jasper-zhang1.gitbooks.io/influxdb/content/Query_language/functions.html

使用tag的情况

把你经常查询的字段作为tag
如果你要对其使用GROUP BY()，也要放在tag中
如果你要对其使用InfluxQL函数，则将其放到field中
如果你需要存储的值不是字符串，则需要放到field中，因为tag value只能是字符串

influxd

influxd help

Configure and start an InfluxDB server.

Usage: influxd [[command] [arguments]]

The commands are:

    backup               downloads a snapshot of a data node and saves it to disk
    config               display the default configuration
    help                 display this help message
    restore              uses a snapshot of a data node to rebuild a cluster
    run                  run node with existing configuration
    version              displays the InfluxDB version

"run" is the default command.

查看配置信息

influxd config

Merging with configuration at: /etc/influxdb/influxdb.conf
reporting-disabled = false
bind-address = "127.0.0.1:8088"

[meta]
  dir = "/var/lib/influxdb/meta"
  retention-autocreate = true
  logging-enabled = true

[data]
  dir = "/var/lib/influxdb/data"
  index-version = "inmem"
  wal-dir = "/var/lib/influxdb/wal"
  wal-fsync-delay = "0s"
  query-log-enabled = true
  cache-max-memory-size = 1073741824
  cache-snapshot-memory-size = 26214400
  cache-snapshot-write-cold-duration = "10m0s"
  compact-full-write-cold-duration = "4h0m0s"
  max-series-per-database = 1000000
  max-values-per-tag = 100000
  max-concurrent-compactions = 0
  max-index-log-file-size = 1048576
  trace-logging-enabled = false

[coordinator]
  write-timeout = "10s"
  max-concurrent-queries = 0
  query-timeout = "0s"
  log-queries-after = "0s"
  max-select-point = 0
  max-select-series = 0
  max-select-buckets = 0

[retention]
  enabled = true
  check-interval = "30m0s"

[shard-precreation]
  enabled = true
  check-interval = "10m0s"
  advance-period = "30m0s"

[monitor]
  store-enabled = true
  store-database = "_internal"
  store-interval = "10s"

[subscriber]
  enabled = true
  http-timeout = "30s"
  insecure-skip-verify = false
  ca-certs = ""
  write-concurrency = 40
  write-buffer-size = 1000

[http]
  enabled = true
  bind-address = ":8086"
  auth-enabled = false
  log-enabled = true
  write-tracing = false
  pprof-enabled = true
  debug-pprof-enabled = false
  https-enabled = false
  https-certificate = "/etc/ssl/influxdb.pem"
  https-private-key = ""
  max-row-limit = 0
  max-connection-limit = 0
  shared-secret = ""
  realm = "InfluxDB"
  unix-socket-enabled = false
  bind-socket = "/var/run/influxdb.sock"
  max-body-size = 25000000
  access-log-path = ""

[logging]
  format = "auto"
  level = "info"
  suppress-logo = false

[ifql]
  enabled = false
  log-enabled = true
  bind-address = ":8082"

[[graphite]]
  enabled = false
  bind-address = ":2003"
  database = "graphite"
  retention-policy = ""
  protocol = "tcp"
  batch-size = 5000
  batch-pending = 10
  batch-timeout = "1s"
  consistency-level = "one"
  separator = "."
  udp-read-buffer = 0

[[collectd]]
  enabled = false
  bind-address = ":25826"
  database = "collectd"
  retention-policy = ""
  batch-size = 5000
  batch-pending = 10
  batch-timeout = "10s"
  read-buffer = 0
  typesdb = "/usr/share/collectd/types.db"
  security-level = "none"
  auth-file = "/etc/collectd/auth_file"
  parse-multivalue-plugin = "split"

[[opentsdb]]
  enabled = false //是否启用该模块
  bind-address = ":4242"
  database = "opentsdb"  //默认数据库，opentsdb
  retention-policy = "" // 存储策略，无默认值
  consistency-level = "one" //一致性级别，默认one
  tls-enabled = false
  certificate = "/etc/ssl/influxdb.pem" //证书路径
  batch-size = 1000
  batch-pending = 5
  batch-timeout = "1s"
  log-point-errors = true

[[udp]]
  enabled = false
  bind-address = ":8089" // 绑定地址，默认值是8099
  database = "udp"  //数据库名称，默认udp
  retention-policy = "" //存储策略
  batch-size = 5000
  batch-pending = 10
  read-buffer = 0 //udp读取buffer的大小，0表示操作系统提供的值，如果超过操作系统的默认配置则会出错
  batch-timeout = "1s"
  precision = ""

[continuous_queries]
  log-enabled = true
  enabled = true  //是否开启CQ，默认为true
  query-stats-enabled = false // 是否开启日志，默认为true
  run-interval = "1s"  //时间间隔，默认1s

http api write

http api query

聚合函数

count()

返回一个field字段中的非空值的数量。

SELECT COUNT(<field_key>) FROM <measurement_name> [WHERE <stuff>] [GROUP BY <stuff>]

distanct()

返回一个字段的唯一值。

SELECT DISTINCT(<field_key>) FROM <measurement_name> [WHERE <stuff>] [GROUP BY <stuff>]

mean()

返回一个字段中的值的算术平均值(平均值)，字段类型必须是长整形或者float64

SELECT MEAN(<field_key>) FROM <measurement_name> [WHERE <stuff>] [GROUP BY <stuff>]

median()

从单个字段中的排序值返回中间值(中位数)，在一组数值中居于中间的数值，字段类型必须是长整形或float64。

SELECT MEDIAN(<field_key>) FROM <measurement_name> [WHERE <stuff>] [GROUP BY <stuff>]

spread()

返回字段的最小值和最大值之间的差值，数据类型必须是长整形或者float64。

SELECT SPREAD(<field_key>) FROM <measurement_name> [WHERE <stuff>] [GROUP BY <stuff>]

sum()

返回一个字段中所有值得和，字段类型必须是长整形或者float64。

SELECT SUM(<field_key>) FROM <measurement_name> [WHERE <stuff>] [GROUP BY <stuff>]

integral()

返回曲线

SELECT INTEGRAL( [ * | <field_key> | /<regular_expression>/ ] [ , <unit> ]  ) [INTO_clause] FROM_clause [WHERE_clause] [GROUP_BY_clause] [ORDER_BY_clause] [LIMIT_clause] [OFFSET_clause] [SLIMIT_clause] [SOFFSET_clause]

正则查询

https://docs.influxdata.com/influxdb/v1.7/query_language/data_exploration/#regular-expressions

语法

SELECT /<regular_expression_field_key>/ FROM /<regular_expression_measurement>/ WHERE [<tag_key> <operator> /<regular_expression_tag_value>/ | <field_key> <operator> /<regular_expression_field_value>/] GROUP BY /<regular_expression_tag_key>/

支持的运算符

=~: 匹配

!~ : 不匹配

查询docker_container_cpu中cpu为total-cpu和container_name不是以k8s_开头的记录

SELECT container_name, usage_percent FROM "docker_container_cpu" WHERE "cpu"='cpu-total' and "container_name" !~  /^k8s.*$/ order by time desc limit 100;

参考：

influxdb原理详解：https://www.cnblogs.com/gaoguangjun/p/8513054.html

influxdb相关论文： https://www.influxdata.com/_resources/techpapers-new/

聚合函数：https://www.cnblogs.com/michellexiaoqi/p/7256005.html