ottertune mysql_OtterTune源码解析

为了方便后面对ottertune进行魔(hu)改(gao),需要先搞清楚它的源码结构和pipeline

OtterTune分为两大部分:

server side: 包括一个MySQL数据库(用于存储调优数据,供ml model用),Django(FrontEnd User Interface),Celery(用于调度ML task);

client side: Target_DBMS(存储用户的业务数据用,支持多种DBMS),Controller(用于控制target DBMS),Driver(用于调用controller,入口是fabfile.py)

Client Side

1. Driver

Driver是用户运行client的入口。用户并不直接执行controller的命令,而是通过driver来控制它。

Driver使用了Python的fabric库编写。作者在里面预置了很多常用的命令(例如开关target DBMS, run oltpbench等等)。

fabfile.py

这是Driver的核心。在Client侧最后运行操作(C6)就需要调用这个文件。

在C6操作中,需要用fab loop和fab run_loops来让client周期性在target DBMS上采集knob/metric sample(In each loop, it collects target DBMS info, uploads to the server, gets new recommended configuration, installs the config and restarts DBMS. Users can continue to run loops until they are satisfied with the recommended configuration)。

fab loop              runs one single loop.

8f900a89c6347c561fdf2122f13be562.png

961ddebeb323a10fe0623af514929fc1.png

1 @task2 defloop():3 #free cache, clean Linux PageCache

4 free_cache()5

6 #remove oltpbench log and controller log

7 clean_logs()8

9 #restart database, shell "sudo service postgresql restart"

10 restart_database()11

12 #check whether there are enough free space on disk

13 if check_disk_usage() >MAX_DISK_USAGE:14 LOG.WARN('Exceeds max disk usage %s', MAX_DISK_USAGE)15

16 #run controller as another process. Run the following command line in "../controller" folder:

17 #sudo gradle run -PappArgs="-c CONF_controller_config -d output/" --no-daemon > CONF_controller_log

18 p = Process(target=run_controller, args=())19 p.start()20 LOG.info('Run the controller')21

22 #check whether the controller is ready(has created the log files)

23 while not_ready_to_start_oltpbench():24 pass

25 #run oltpbench as a background job. Run the following command line in CONF_oltpbench_home folder:

26 #./oltpbenchmark -b CONF_oltpbench_workload -c CONF_oltpbench_config --execute=true -s 5 -o outputfile > CONF_oltpbench_log 2>&1 &

27 run_oltpbench_bg()28 LOG.info('Run OLTP-Bench')29

30 #the controller starts the first collection

31

32 #check whether 'Warmup complete, starting measurements' is in CONF_oltpbench_log file

33 while not_ready_to_start_controller():34 pass

35 #shell 'sudo kill -2 CTL_PID'

36 #send a signal to the process CTL_PID, where CTL_PID is the content of '../controller/pid.txt' (pid of java controller)

37 signal_controller()38 LOG.info('Start the first collection')39

40 #stop the experiment

41

42 #check whether 'Output Raw data into file' is in CONF_oltpbench_log file

43 while not_ready_to_shut_down_controller():44 pass

45 #shell 'sudo kill -2 CTL_PID'

46 #send a signal to the process CTL_PID, where CTL_PID is the content of '../controller/pid.txt' (pid of java controller)

47 signal_controller()48 LOG.info('Start the second collection, shut down the controller')49

50 #wait until controller exited

51 p.join()52

53 #add user defined target objective

54 #add_udf()

55

56 #save result file: 'knobs.json', 'metrics_after.json', 'metrics_before.json', 'summary.json'

57 save_dbms_result()58

59 #upload result to Django web interface

60 upload_result()61

62 #get result

63 #shell 'python3 ../../script/query_and_get.py CONF_upload_url CONF_upload_code 5'

64 get_result()65

66 #change target DBMS config

67 #shell 'sudo python3 PostgresConf.py next_config CONF_database_conf'

68 change_conf()

View Code

fab run_loops:max_iter=10   runs 10 loops. You can set max_iter to change the maximum iterations.

8f900a89c6347c561fdf2122f13be562.png

961ddebeb323a10fe0623af514929fc1.png

1 #intervals of restoring the databse

2 RELOAD_INTERVAL = 10

3

4 @task5 def run_loops(max_iter=1):6 #dump database if it's not done before.

7 #shell 'PGPASSWORD=CONF_password pg_dump -U CONF_username -F c -d CONF_database_name > CONF_database_save_path/CONF_database_name.dump'

8 dump =dump_database()9

10 for i inrange(int(max_iter)):11 #restore database every RELOAD_INTERVAL

12 #shell these operations:

13 #PGPASSWORD=CONF_password dropdb -e --if-exists CONF_database_name -U CONF_username

14 #PGPASSWORD=CONF_password createdb -e CONF_database_name -U CONF_username

15 #PGPASSWORD=CONF_password pg_restore -U CONF_username -j 8 -F c -d CONF_database_name CONF_database_save_path/CONF_database_name.dump

16 if RELOAD_INTERVAL >0:17 if i % RELOAD_INTERVAL ==0:18 if i == 0 and dump isFalse:19 restore_database()20 elif i >0:21 restore_database()22

23 LOG.info('The %s-th Loop Starts / Total Loops %s', i + 1, max_iter)24 loop()25 LOG.info('The %s-th Loop Ends / Total Loops %s', i + 1, max_iter)

View Code

上面有些以CONF_开头的变量,其实都是从driver_config.json中读到的。在C1操作时我们曾经配置过这个文件。

8f900a89c6347c561fdf2122f13be562.png

961ddebeb323a10fe0623af514929fc1.png

1 {2 "database_type" : "postgres",3 "database_name" : "tpcc",4 "database_disk": "/dev/sda1",5 "database_conf": "/etc/postgresql/9.6/main/postgresql.conf",6 "database_save_path": "/home/tidb/ottertune",7 "username" : "postgres",8 "password" : "asdfgh",9 "oltpbench_home": "/home/tidb/oltpbench",10 "oltpbench_config": "/home/tidb/oltpbench/config/tpcc_config_postgres.xml",11 "oltpbench_workload": "tpcc",12 "oltpbench_log" : "/home/tidb/ottertune/client/driver/oltp.log",13 "controller_config": "/home/tidb/ottertune/client/controller/config/sample_postgres_config.json",14 "controller_log" : "/home/tidb/ottertune/client/driver/controller.log",15 "save_path": "/home/tidb/results",16 "upload_url" : "http://127.0.0.1:8000",17 "upload_code" : "I5I10PXK3PK27FM86YYS",18 "lhs_knob_path" : "/home/tidb/ottertune/client/driver/knobs/postgres-96.json",19 "lhs_save_path" : "/home/tidb/ottertune/client/driver/configs"

20 }

View Code

2. Controller

在Driver中会通过gradle来启动Controller。命令行参数为"-c CONF_controller_config -d output/"。

其中CONF_controller_config表示Controller的配置文件,这里是sample_postgres_config.json。里面会记录一些关于target DBMS的信息

8f900a89c6347c561fdf2122f13be562.png

961ddebeb323a10fe0623af514929fc1.png

1 {2 "database_type" : "postgres",3 "database_url" : "jdbc:postgresql://localhost:5432/postgres",4 "username" : "postgres",5 "password" : "asdfgh",6 "upload_code" : "DEPRECATED",7 "upload_url" : "DEPRECATED",8 "workload_name" : "tpcc"

9 }

View Code

Controller的总体结构如下:

8f900a89c6347c561fdf2122f13be562.png

961ddebeb323a10fe0623af514929fc1.png

1 Controller2 │ README.md3 │ log4j.properties4 │ build.gradle5 ├─config6 │ sample_saphana_config.json7 │ sample_postgres_config.json8 │ sample_mysql_config.json9 ├─gradle10 │ └─wrapper11 │ gradle-wrapper.properties12 ├─sample_output13 │ ├─mysql14 │ │ summary.json15 │ │ metrics_before.json16 │ │ metrics_after.json17 │ │ knobs.json18 │ ├─postgres19 │ │ metrics_after.json20 │ │ summary.json21 │ │ metrics_before.json22 │ │ knobs.json23 │ └─saphana24 │ knobs.json25 │ metrics_after.json26 │ metrics_before.json27 │ summary.json28 └─src29 ├─main30 │ └─java31 │ └─com32 │ └─controller33 │ │ ResultUploader.java upload result34 │ │ Main.java main函数入口35 │ │ ControllerConfiguration.java ControllerConfiguration类,用于存储target DBMS的相关信息(from sample_postgres_config.json)36 │ ├─types37 │ │ JSONSchemaType.java json读写库,用于判断输入的json是否合法38 │ │ DatabaseType.java 预定义可支持的target DBMS的种类, 配合ControllerConfiguration用39 │ ├─util40 │ │ │ ClassUtil.java 常用函数库41 │ │ │ CollectionUtil.java 常用函数库42 │ │ │ FileUtil.java 文件操作库43 │ │ │ JSONSerializable.java json读写库44 │ │ │ JSONUtil.java json读写库45 │ │ │ ValidationUtils.java json读写库46 │ │ └─json java的json读写库47 │ │ JSONStringer.java48 │ │ Test.java49 │ │ JSONWriter.java50 │ │ JSONTokener.java51 │ │ JSONObject.java52 │ │ JSONString.java53 │ │ JSONArray.java54 │ │ JSONException.java55 │ ├─collectors56 │ │ MySQLCollector.java extends DBCollector, 用于收集MySQL的knob/metric。57 │ │ PostgresCollector.java extends DBCollector, 用于收集postgres的knob/metric。包括使用SHOW ALL命令收集到的parameters,还有使用SELECT查到的internal metrics。58 │ │ SAPHanaCollector.java extends DBCollector,59 │ │ DBParameterCollector.java DBCollector的interface60 │ │ DBCollector.java DBCollector类,用于存储一个knob/metric sample。包括dbParameters, dbMetrics两个HashMap。另外也包含一些连接db需要的信息(如url, 用户名密码等)61 │ └─json_validation_schema JSONSchemaType用,,用于判断输入的json是否合法62 │ summary_schema.json63 │ schema.json64 │ config_schema.json65 └─test66 └─java67 └─com68 └─controller69 └─collectors70 TestMySQLJSON.java71 TestPostgresJSON.java72 AbstractJSONValidationTestCase.java73 TestInvalidJSON.java

View Code

controller\src\main\java\com\controller\Main.java

在fab loop中,调用Controller的命令行格式为:sudo gradle run -PappArgs="-c CONF_controller_config -d output/" --no-daemon > CONF_controller_log

这里用到了两个命令行参数:-c用来读配置文件,-d用来指定输出目录

这个java文件配合前面的fabfile.py工作(fabfile.py会通过signal来控制main.java的执行进度)。整个client的pipeline大致如下:

0bf79282518add3cbcfc8d9ea20d9fff.png

这是一份result的示例,包括一次fab loop出来metric_before, metric_after, knobs, summary。

从中我们可以看到metrics分为global和local两类,分别对应整个DBMS的和每个table的metrics(注意原paper的4.1节提到了This works because OtterTune currently only considers global knobs. We defer the problem of tuning table- or component-specific knobs as future work.)。

另外还有一些xls文件,这些是从server的ottertune数据库内dump出来的表。我们重点关注以下两个表:

website_metricdata(对应每次采集到的metrics,表格各列分别是id, name, creation_time, data[表示Numeric DBMS metrics], metrics[表示All DBMS metrics], dbms_id, session_id。)

website_knobdata(对应每次采集到的knobs,表格各列分别是id, name, creation_time, data[表示Tunable DBMS parameters], knobs[表示All DBMS parameters], dbms_id, session_id。)

假设我们以Throughput作为总体要优化的metric(即DBMS independent external metric),可以看到在metricdata表的data列中有一项是""throughput_txn_per_sec"": 653.8888888888889,这个和网页上显示出的Throughput是一致的。也就是说,这两列就是server用来训练ml model的Data Repository(但注意metrics这列并没有这一项)。

另外注意,knobdata表的knobs项和knobs.json文件是一致的;而metricdata表的两项数据都和metrics_before.json/metrics_after.json的数据不一致,个人猜测metricdata表的数据是对原始数据进行了降维和cluster的结果。

由此可见,client提交给server的数据包括:DBMS internal metrics(这里就是postgres的metrics),DBMS external metrics(这里是throughput_txn_per_sec),knobs

controller\src\main\java\com\controller\collectors\PostgresCollector.java

里面定义了获取postgres的parameters和metrics需要的查询语句

parameter查询示例:在psql命令行下执行SHOW ALL;

880f09a7e2d36a40bdff652e78629a57.png

metric查询示例:在psql命令行下运行select * from pg_stat_database;

449daba3356619576e8ea1236657ab7e.png

Server Side

server端对于接收到的(client上传的)result file运行ML model,并推荐一个新的DBMS configuration。其中还涉及到很多web后端的部分(例如区分不同的session来支持多个client等),为简单起见我们忽略这些部分,只关心ML model的实现。

1. website

根据paper的描述,对于client送来的数据,server会按如下pipeline来推荐新的DBMS configuration:1)Workload characterization, 2)Identify important knobs, 3)Automated Tuning

pipeline前两步的实现在periodic_tasks.py的run_background_tasks()函数中。这个函数会每隔一段时间自动运行一次(即使client没有上传sample),相当于在后台自动进行数据预处理。

首先它会从client接收新上传的knob/metrics sample(若client没动作则忽略),并将其和之前上传的相同workload的sample合并起来(knob_data和metric_data)。下面是一个client运行过24次fab loop之后,server侧的knob_data和metric_data的实例(即包含24个sample):

8f900a89c6347c561fdf2122f13be562.png

961ddebeb323a10fe0623af514929fc1.png

[2019-06-05 02:31:28,692: INFO/MainProcess] Received task: run_background_tasks[bfa9548c-b286-4675-aaa0-7ae6312d757e]

[2019-06-05 02:31:28,693: INFO/MainProcess] run_background_tasks[bfa9548c-b286-4675-aaa0-7ae6312d757e]: LEN of workloads

[2019-06-05 02:31:28,695: INFO/MainProcess] run_background_tasks[bfa9548c-b286-4675-aaa0-7ae6312d757e]: 1[2019-06-05 02:31:28,696: INFO/MainProcess] run_background_tasks[bfa9548c-b286-4675-aaa0-7ae6312d757e]: ====in unique_workloads====[2019-06-05 02:31:28,697: INFO/MainProcess] run_background_tasks[bfa9548c-b286-4675-aaa0-7ae6312d757e]: tpcc

[2019-06-05 02:31:28,697: INFO/MainProcess] run_background_tasks[bfa9548c-b286-4675-aaa0-7ae6312d757e]: , , , , , , , , , , , , , , , , , , , , '...(remaining elements truncated)...']>[2019-06-05 02:31:28,698: INFO/MainProcess] run_background_tasks[bfa9548c-b286-4675-aaa0-7ae6312d757e]: ====in unique_workloads====[2019-06-05 02:31:28,748: INFO/MainProcess] run_background_tasks[bfa9548c-b286-4675-aaa0-7ae6312d757e]: #knob_data

{'data': array([

[0.00000, 200.00000, 524288.00000, 100.00000, 2.00000, 0.50000, 262144.00000, 300000.00000, 0.00000, 5.00000, 1000.00000, 100.00000, 4294967296.00000, 1.00000, 8.00000, 8.00000, 67108864.00000, 0.00000, 1073741824.00000, 8.00000, 8388608.00000, 83886080.00000, 4.00000, 1.00000, 134217728.00000, 8388608.00000, 4194304.00000, 1.00000, 200.00000, 1048576.00000, 4194304.00000],

[0.00000, 200.00000, 524288.00000, 100.00000, 2.00000, 0.50000, 262144.00000, 300000.00000, 0.00000, 5.00000, 1000.00000, 100.00000, 4294967296.00000, 1.00000, 8.00000, 8.00000, 67108864.00000, 0.00000, 1073741824.00000, 8.00000, 8388608.00000, 83886080.00000, 4.00000, 1.00000, 134217728.00000, 8388608.00000, 4194304.00000, 1.00000, 200.00000, 1048576.00000, 4194304.00000],

[0.00000, 200.00000, 524288.00000, 100.00000, 2.00000, 0.50000, 262144.00000, 300000.00000, 0.00000, 5.00000, 1000.00000, 100.00000, 4294967296.00000, 1.00000, 8.00000, 8.00000, 67108864.00000, 0.00000, 1073741824.00000, 8.00000, 8388608.00000, 83886080.00000, 4.00000, 1.00000, 134217728.00000, 8388608.00000, 4194304.00000, 1.00000, 200.00000, 1048576.00000, 4194304.00000],

[0.00000, 200.00000, 524288.00000, 100.00000, 2.00000, 0.50000, 262144.00000, 300000.00000, 0.00000, 5.00000, 1000.00000, 100.00000, 4294967296.00000, 1.00000, 8.00000, 8.00000, 67108864.00000, 0.00000, 1073741824.00000, 8.00000, 8388608.00000, 83886080.00000, 4.00000, 1.00000, 134217728.00000, 8388608.00000, 4194304.00000, 1.00000, 200.00000, 1048576.00000, 4194304.00000],

[0.00000, 200.00000, 524288.00000, 100.00000, 2.00000, 0.50000, 262144.00000, 300000.00000, 0.00000, 5.00000, 1000.00000, 100.00000, 4294967296.00000, 1.00000, 8.00000, 8.00000, 67108864.00000, 0.00000, 1073741824.00000, 8.00000, 8388608.00000, 83886080.00000, 4.00000, 1.00000, 134217728.00000, 8388608.00000, 4194304.00000, 1.00000, 200.00000, 1048576.00000, 4194304.00000],

[0.00000, 200.00000, 524288.00000, 100.00000, 2.00000, 0.50000, 262144.00000, 300000.00000, 0.00000, 5.00000, 1000.00000, 100.00000, 4294967296.00000, 1.00000, 8.00000, 8.00000, 67108864.00000, 0.00000, 1073741824.00000, 8.00000, 8388608.00000, 83886080.00000, 4.00000, 1.00000, 134217728.00000, 8388608.00000, 4194304.00000, 1.00000, 200.00000, 1048576.00000, 4194304.00000],

[0.00000, 200.00000, 524288.00000, 100.00000, 2.00000, 0.50000, 262144.00000, 300000.00000, 0.00000, 5.00000, 1000.00000, 100.00000, 4294967296.00000, 1.00000, 8.00000, 8.00000, 67108864.00000, 0.00000, 1073741824.00000, 8.00000, 8388608.00000, 83886080.00000, 4.00000, 1.00000, 134217728.00000, 8388608.00000, 4194304.00000, 1.00000, 200.00000, 1048576.00000, 4194304.00000],

[0.00000, 200.00000, 524288.00000, 100.00000, 2.00000, 0.50000, 262144.00000, 300000.00000, 0.00000, 5.00000, 1000.00000, 100.00000, 4294967296.00000, 1.00000, 8.00000, 8.00000, 67108864.00000, 0.00000, 1073741824.00000, 8.00000, 8388608.00000, 83886080.00000, 4.00000, 1.00000, 134217728.00000, 8388608.00000, 4194304.00000, 1.00000, 200.00000, 1048576.00000, 4194304.00000],

[0.00000, 200.00000, 524288.00000, 100.00000, 2.00000, 0.50000, 262144.00000, 300000.00000, 0.00000, 5.00000, 1000.00000, 100.00000, 4294967296.00000, 1.00000, 8.00000, 8.00000, 67108864.00000, 0.00000, 1073741824.00000, 8.00000, 8388608.00000, 83886080.00000, 4.00000, 1.00000, 134217728.00000, 8388608.00000, 4194304.00000, 1.00000, 200.00000, 1048576.00000, 4194304.00000],

[0.00000, 200.00000, 524288.00000, 100.00000, 2.00000, 0.50000, 262144.00000, 300000.00000, 0.00000, 5.00000, 1000.00000, 100.00000, 4294967296.00000, 1.00000, 8.00000, 8.00000, 67108864.00000, 0.00000, 1073741824.00000, 8.00000, 8388608.00000, 83886080.00000, 4.00000, 1.00000, 134217728.00000, 8388608.00000, 4194304.00000, 1.00000, 200.00000, 1048576.00000, 4194304.00000],

[0.00000, 200.00000, 524288.00000, 100.00000, 2.00000, 0.50000, 262144.00000, 300000.00000, 0.00000, 5.00000, 1000.00000, 100.00000, 4294967296.00000, 1.00000, 8.00000, 8.00000, 67108864.00000, 0.00000, 1073741824.00000, 8.00000, 8388608.00000, 83886080.00000, 4.00000, 1.00000, 134217728.00000, 8388608.00000, 4194304.00000, 1.00000, 200.00000, 1048576.00000, 4194304.00000],

[0.00000, 200.00000, 524288.00000, 100.00000, 2.00000, 0.50000, 262144.00000, 300000.00000, 0.00000, 5.00000, 1000.00000, 100.00000, 4294967296.00000, 1.00000, 8.00000, 8.00000, 67108864.00000, 0.00000, 1073741824.00000, 8.00000, 8388608.00000, 83886080.00000, 4.00000, 1.00000, 134217728.00000, 8388608.00000, 4194304.00000, 1.00000, 200.00000, 1048576.00000, 4194304.00000],

[0.00000, 200.00000, 524288.00000, 100.00000, 2.00000, 0.50000, 262144.00000, 300000.00000, 0.00000, 5.00000, 1000.00000, 100.00000, 4294967296.00000, 1.00000, 8.00000, 8.00000, 67108864.00000, 0.00000, 1073741824.00000, 8.00000, 8388608.00000, 83886080.00000, 4.00000, 1.00000, 134217728.00000, 8388608.00000, 4194304.00000, 1.00000, 200.00000, 1048576.00000, 4194304.00000],

[0.00000, 200.00000, 524288.00000, 100.00000, 2.00000, 0.50000, 262144.00000, 300000.00000, 0.00000, 5.00000, 1000.00000, 100.00000, 4294967296.00000, 1.00000, 8.00000, 8.00000, 67108864.00000, 0.00000, 1073741824.00000, 8.00000, 8388608.00000, 83886080.00000, 4.00000, 1.00000, 134217728.00000, 8388608.00000, 4194304.00000, 1.00000, 200.00000, 1048576.00000, 4194304.00000],

[0.00000, 200.00000, 524288.00000, 100.00000, 2.00000, 0.50000, 262144.00000, 300000.00000, 0.00000, 5.00000, 1000.00000, 100.00000, 4294967296.00000, 1.00000, 8.00000, 8.00000, 67108864.00000, 0.00000, 1073741824.00000, 8.00000, 8388608.00000, 83886080.00000, 4.00000, 1.00000, 134217728.00000, 8388608.00000, 4194304.00000, 1.00000, 200.00000, 1048576.00000, 4194304.00000],

[0.00000, 200.00000, 524288.00000, 100.00000, 2.00000, 0.50000, 262144.00000, 300000.00000, 0.00000, 5.00000, 1000.00000, 100.00000, 4294967296.00000, 1.00000, 8.00000, 8.00000, 67108864.00000, 0.00000, 1073741824.00000, 8.00000, 8388608.00000, 83886080.00000, 4.00000, 1.00000, 134217728.00000, 8388608.00000, 4194304.00000, 1.00000, 200.00000, 1048576.00000, 4194304.00000],

[0.00000, 200.00000, 524288.00000, 100.00000, 2.00000, 0.50000, 262144.00000, 300000.00000, 0.00000, 5.00000, 1000.00000, 100.00000, 4294967296.00000, 1.00000, 8.00000, 8.00000, 67108864.00000, 0.00000, 1073741824.00000, 8.00000, 8388608.00000, 83886080.00000, 4.00000, 1.00000, 134217728.00000, 8388608.00000, 4194304.00000, 1.00000, 200.00000, 1048576.00000, 4194304.00000],

[0.00000, 200.00000, 524288.00000, 100.00000, 2.00000, 0.50000, 262144.00000, 300000.00000, 0.00000, 5.00000, 1000.00000, 100.00000, 4294967296.00000, 1.00000, 8.00000, 8.00000, 67108864.00000, 0.00000, 1073741824.00000, 8.00000, 8388608.00000, 83886080.00000, 4.00000, 1.00000, 134217728.00000, 8388608.00000, 4194304.00000, 1.00000, 200.00000, 1048576.00000, 4194304.00000],

[0.00000, 200.00000, 524288.00000, 100.00000, 2.00000, 0.50000, 262144.00000, 300000.00000, 0.00000, 5.00000, 1000.00000, 100.00000, 4294967296.00000, 1.00000, 8.00000, 8.00000, 67108864.00000, 0.00000, 1073741824.00000, 8.00000, 8388608.00000, 83886080.00000, 4.00000, 1.00000, 134217728.00000, 8388608.00000, 4194304.00000, 1.00000, 200.00000, 1048576.00000, 4194304.00000],

[0.00000, 200.00000, 524288.00000, 100.00000, 2.00000, 0.50000, 262144.00000, 300000.00000, 0.00000, 5.00000, 1000.00000, 100.00000, 4294967296.00000, 1.00000, 8.00000, 8.00000, 67108864.00000, 0.00000, 1073741824.00000, 8.00000, 8388608.00000, 83886080.00000, 4.00000, 1.00000, 134217728.00000, 8388608.00000, 4194304.00000, 1.00000, 200.00000, 1048576.00000, 4194304.00000],

[0.00000, 200.00000, 524288.00000, 100.00000, 2.00000, 0.50000, 262144.00000, 300000.00000, 0.00000, 5.00000, 1000.00000, 100.00000, 4294967296.00000, 1.00000, 8.00000, 8.00000, 67108864.00000, 0.00000, 1073741824.00000, 8.00000, 8388608.00000, 83886080.00000, 4.00000, 1.00000, 134217728.00000, 8388608.00000, 4194304.00000, 1.00000, 200.00000, 1048576.00000, 4194304.00000],

[0.00000, 200.00000, 524288.00000, 100.00000, 2.00000, 0.50000, 262144.00000, 300000.00000, 0.00000, 5.00000, 1000.00000, 100.00000, 4294967296.00000, 1.00000, 8.00000, 8.00000, 67108864.00000, 0.00000, 1073741824.00000, 8.00000, 8388608.00000, 83886080.00000, 4.00000, 1.00000, 134217728.00000, 8388608.00000, 4194304.00000, 1.00000, 200.00000, 1048576.00000, 4194304.00000],

[0.00000, 200.00000, 524288.00000, 100.00000, 2.00000, 0.50000, 262144.00000, 300000.00000, 0.00000, 5.00000, 1000.00000, 100.00000, 4294967296.00000, 1.00000, 8.00000, 8.00000, 67108864.00000, 0.00000, 1073741824.00000, 8.00000, 8388608.00000, 83886080.00000, 4.00000, 1.00000, 134217728.00000, 8388608.00000, 4194304.00000, 1.00000, 200.00000, 1048576.00000, 4194304.00000],

[0.00000, 200.00000, 524288.00000, 100.00000, 2.00000, 0.50000, 262144.00000, 300000.00000, 0.00000, 5.00000, 1000.00000, 100.00000, 4294967296.00000, 1.00000, 8.00000, 8.00000, 67108864.00000, 0.00000, 1073741824.00000, 8.00000, 8388608.00000, 83886080.00000, 4.00000, 1.00000, 134217728.00000, 8388608.00000, 4194304.00000, 1.00000, 200.00000, 1048576.00000, 4194304.00000]

]),'rowlabels': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24],'columnlabels': ['global.backend_flush_after', 'global.bgwriter_delay', 'global.bgwriter_flush_after', 'global.bgwriter_lru_maxpages', 'global.bgwriter_lru_multiplier', 'global.checkpoint_completion_target', 'global.checkpoint_flush_after', 'global.checkpoint_timeout', 'global.commit_delay', 'global.commit_siblings', 'global.deadlock_timeout', 'global.default_statistics_target', 'global.effective_cache_size', 'global.effective_io_concurrency', 'global.from_collapse_limit', 'global.join_collapse_limit', 'global.maintenance_work_mem', 'global.max_parallel_workers_per_gather', 'global.max_wal_size', 'global.max_worker_processes', 'global.min_parallel_relation_size', 'global.min_wal_size', 'global.random_page_cost', 'global.seq_page_cost', 'global.shared_buffers', 'global.temp_buffers', 'global.wal_buffers', 'global.wal_sync_method', 'global.wal_writer_delay', 'global.wal_writer_flush_after', 'global.work_mem']

}

[2019-06-05 02:31:28,750: INFO/MainProcess] run_background_tasks[bfa9548c-b286-4675-aaa0-7ae6312d757e]: #metric_data

{'data': array([

[0.00000, 0.00000, 1722.66667, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 136682.66667, 1726.44444, 0.00000, 0.00000, 0.00000, 0.00000, 240.00000, 30344.66667, 3250.00000, 55494.11111, 6327.88889, 547.44444, 2.22222, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 547.44444],

[0.00000, 0.00000, 1573.88889, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 104311.11111, 1573.88889, 0.00000, 0.00000, 0.00000, 0.00000, 188.88889, 22995.55556, 2520.11111, 42250.11111, 4916.77778, 420.00000, 1.88889, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 420.00000],

[0.00000, 0.00000, 1849.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 127318.37500, 1849.00000, 0.00000, 0.00000, 0.00000, 0.00000, 203.75000, 28032.75000, 3082.12500, 53200.37500, 5787.87500, 542.50000, 3.37500, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 542.50000],

[0.00000, 0.00000, 1601.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 107102.11111, 1603.00000, 0.00000, 0.00000, 0.00000, 0.00000, 162.22222, 23739.44444, 2603.77778, 44657.00000, 4714.44444, 455.44444, 1.66667, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 455.44444],

[0.00000, 0.00000, 1825.25000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 125129.37500, 1826.50000, 0.00000, 0.00000, 0.00000, 0.00000, 210.00000, 27711.12500, 3052.00000, 51322.00000, 5771.50000, 514.87500, 2.75000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 514.87500],

[0.00000, 0.00000, 1859.87500, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 128574.37500, 1859.87500, 0.00000, 0.00000, 0.00000, 0.00000, 222.50000, 28343.62500, 3116.87500, 53665.87500, 6024.37500, 548.62500, 1.87500, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 548.62500],

[0.00000, 0.00000, 1713.33333, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 128270.77778, 1713.55556, 0.00000, 0.00000, 0.00000, 0.00000, 208.88889, 28480.11111, 3070.22222, 53108.88889, 5797.77778, 534.11111, 2.33333, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 534.11111],

[0.00000, 0.00000, 1708.55556, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 126679.66667, 1708.55556, 0.00000, 0.00000, 0.00000, 0.00000, 196.66667, 27874.77778, 3090.66667, 51776.66667, 5643.66667, 518.88889, 3.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 518.88889],

[0.00000, 0.00000, 1745.11111, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 131574.44444, 1749.66667, 0.00000, 0.00000, 0.00000, 0.00000, 217.77778, 28679.00000, 3190.00000, 54254.77778, 6039.11111, 554.22222, 2.33333, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 554.22222],

[0.00000, 0.00000, 2122.28571, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 146507.57143, 2122.42857, 0.00000, 0.00000, 0.00000, 0.00000, 230.00000, 32624.71429, 3510.71429, 60823.28571, 6565.14286, 612.14286, 1.85714, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 612.14286],

[0.00000, 0.00000, 1632.22222, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 106987.55556, 1632.33333, 0.00000, 0.00000, 0.00000, 0.00000, 203.33333, 23917.22222, 2536.00000, 44102.55556, 5144.11111, 437.77778, 2.33333, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 437.77778],

[0.00000, 0.00000, 1726.33333, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 126287.00000, 1726.33333, 0.00000, 0.00000, 0.00000, 0.00000, 186.66667, 27283.55556, 3141.33333, 51592.33333, 5575.77778, 525.66667, 2.66667, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 525.66667],

[0.00000, 0.00000, 1704.88889, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 117649.44444, 1708.22222, 0.00000, 0.00000, 0.00000, 0.00000, 212.22222, 25811.11111, 2843.44444, 48149.00000, 5573.44444, 485.55556, 1.66667, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 485.55556],

[0.00000, 0.00000, 1749.55556, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 127589.88889, 1752.33333, 0.00000, 0.00000, 0.00000, 0.00000, 216.66667, 27837.77778, 3115.88889, 52392.66667, 5933.22222, 532.88889, 3.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 532.88889],

[0.00000, 0.00000, 2029.25000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 160316.37500, 2029.25000, 0.00000, 0.00000, 0.00000, 0.00000, 273.75000, 35874.25000, 3793.75000, 66179.12500, 7341.75000, 653.87500, 2.62500, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 653.87500],

[0.00000, 0.00000, 2005.62500, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 148021.12500, 2010.12500, 0.00000, 0.00000, 0.00000, 0.00000, 240.00000, 32267.00000, 3617.00000, 61340.25000, 6775.37500, 629.37500, 3.37500, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 629.37500],

[0.00000, 0.00000, 1737.88889, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 126782.00000, 1742.22222, 0.00000, 0.00000, 0.00000, 0.00000, 226.66667, 27851.22222, 3051.00000, 52269.55556, 6005.88889, 527.55556, 1.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 527.55556],

[0.00000, 0.00000, 1769.88889, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 127593.44444, 1769.88889, 0.00000, 0.00000, 0.00000, 0.00000, 213.33333, 28690.66667, 3023.88889, 52830.22222, 5797.66667, 526.33333, 2.22222, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 526.33333],

[0.00000, 0.00000, 1816.00000, 0.00000, 0.00000, 0.00000, 11.11111, 0.00000, 0.00000, 0.00000, 0.00000, 0.11111, 0.00000, 0.00000, 135974.55556, 1820.11111, 0.00000, 0.00000, 0.00000, 0.00000, 274.44444, 30169.66667, 3218.88889, 55979.66667, 6727.55556, 562.11111, 2.44444, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 562.11111],

[0.00000, 0.00000, 1772.88889, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 127041.77778, 1779.33333, 0.00000, 0.00000, 0.00000, 0.00000, 226.66667, 27772.44444, 3067.88889, 52149.66667, 5978.66667, 527.33333, 2.55556, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 527.33333],

[0.00000, 0.00000, 1733.22222, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 122698.11111, 1737.11111, 0.00000, 0.00000, 0.00000, 0.00000, 194.44444, 27286.33333, 2970.00000, 50195.22222, 5494.22222, 498.88889, 2.11111, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 498.88889],

[0.00000, 0.00000, 1816.33333, 0.00000, 0.00000, 0.00000, 11.11111, 0.00000, 0.00000, 0.00000, 0.00000, 0.11111, 0.00000, 0.00000, 130279.33333, 1816.55556, 0.00000, 0.00000, 0.00000, 0.00000, 223.33333, 28512.55556, 3171.66667, 53534.00000, 6055.66667, 544.11111, 1.77778, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 544.11111],

[0.00000, 0.00000, 1816.00000, 0.00000, 0.00000, 0.00000, 15.55556, 0.00000, 0.00000, 0.00000, 0.00000, 0.11111, 0.00000, 0.00000, 135903.00000, 1816.00000, 0.00000, 0.00000, 0.00000, 0.00000, 255.55556, 30097.44444, 3215.44444, 56213.44444, 6563.22222, 560.88889, 2.77778, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 560.88889],

[0.00000, 0.00000, 1755.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 121303.22222, 1759.55556, 0.00000, 0.00000, 0.00000, 0.00000, 192.22222, 26446.66667, 2979.33333, 50221.44444, 5474.77778, 513.22222, 2.33333, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 0.00000, 513.22222]

]),'rowlabels': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24],'columnlabels': ['pg_stat_archiver.archived_count', 'pg_stat_archiver.failed_count', 'pg_stat_bgwriter.buffers_alloc', 'pg_stat_bgwriter.buffers_backend', 'pg_stat_bgwriter.buffers_backend_fsync', 'pg_stat_bgwriter.buffers_checkpoint', 'pg_stat_bgwriter.buffers_clean', 'pg_stat_bgwriter.checkpoint_sync_time', 'pg_stat_bgwriter.checkpoint_write_time', 'pg_stat_bgwriter.checkpoints_req', 'pg_stat_bgwriter.checkpoints_timed', 'pg_stat_bgwriter.maxwritten_clean', 'pg_stat_database.blk_read_time', 'pg_stat_database.blk_write_time', 'pg_stat_database.blks_hit', 'pg_stat_database.blks_read', 'pg_stat_database.conflicts', 'pg_stat_database.deadlocks', 'pg_stat_database.temp_bytes', 'pg_stat_database.temp_files', 'pg_stat_database.tup_deleted', 'pg_stat_database.tup_fetched', 'pg_stat_database.tup_inserted', 'pg_stat_database.tup_returned', 'pg_stat_database.tup_updated', 'pg_stat_database.xact_commit', 'pg_stat_database.xact_rollback', 'pg_stat_database_conflicts.confl_bufferpin', 'pg_stat_database_conflicts.confl_deadlock', 'pg_stat_database_conflicts.confl_lock', 'pg_stat_database_conflicts.confl_snapshot', 'pg_stat_database_conflicts.confl_tablespace', 'pg_stat_user_indexes.idx_scan', 'pg_stat_user_indexes.idx_tup_fetch', 'pg_stat_user_indexes.idx_tup_read', 'pg_stat_user_tables.analyze_count', 'pg_stat_user_tables.autoanalyze_count', 'pg_stat_user_tables.autovacuum_count', 'pg_stat_user_tables.idx_scan', 'pg_stat_user_tables.idx_tup_fetch', 'pg_stat_user_tables.n_dead_tup', 'pg_stat_user_tables.n_live_tup', 'pg_stat_user_tables.n_mod_since_analyze', 'pg_stat_user_tables.n_tup_del', 'pg_stat_user_tables.n_tup_hot_upd', 'pg_stat_user_tables.n_tup_ins', 'pg_stat_user_tables.n_tup_upd', 'pg_stat_user_tables.seq_scan', 'pg_stat_user_tables.seq_tup_read', 'pg_stat_user_tables.vacuum_count', 'pg_statio_user_indexes.idx_blks_hit', 'pg_statio_user_indexes.idx_blks_read', 'pg_statio_user_tables.heap_blks_hit', 'pg_statio_user_tables.heap_blks_read', 'pg_statio_user_tables.idx_blks_hit', 'pg_statio_user_tables.idx_blks_read', 'pg_statio_user_tables.tidx_blks_hit', 'pg_statio_user_tables.tidx_blks_read', 'pg_statio_user_tables.toast_blks_hit', 'pg_statio_user_tables.toast_blks_read', 'throughput_txn_per_sec']

}

[2019-06-05 02:31:36,672: ERROR/MainProcess] Task run_background_tasks[bfa9548c-b286-4675-aaa0-7ae6312d757e] raised unexpected: AssertionError('Need more data to train the model',)

Traceback (most recent calllast):

File"/usr/local/lib/python3.6/dist-packages/celery/app/trace.py", line 240, intrace_task

R= retval = fun(*args, **kwargs)

File"/usr/local/lib/python3.6/dist-packages/celery/app/trace.py", line 438, in__protected_call__

return self.run(*args, **kwargs)

File"/home/tidb/ottertune/server/website/website/tasks/periodic_tasks.py", line 111, inrun_background_tasks

dbms=workload.dbms)

File"/home/tidb/ottertune/server/website/website/tasks/periodic_tasks.py", line 257, inrun_knob_identification

assert len(nonconst_knob_matrix)> 0, "Need more data to train the model"AssertionError: Needmore data to train the model

View Code

假设N是sample数量,P是每个sample内的knob数量(只包括Tunable knobs,与前面website_knobdata表的data项对应。这里是31个),Q是每个sample内的metric数量(这里是61个)。

knob_data可以看作一个hash table,包括data(N*P的list,存储所有的Tunable knobs数据),rowlabels(长度N的list,存储1-N的数字),columnlabels(长度P的list,存储每个knob的名字)。

metric_data也可以看作一个hash table,包括data(N*Q的list,存储所有的metric数据),rowlabels(长度N的list,存储1-N的数字),columnlabels(长度Q的list,存储每个metric的名字)

之后会调用run_workload_characterization()函数,对metric_data进行降维和聚类操作,并存储到pruned_metrics里(pruned_metrics包括这些保留下的metrics的名字)。pruned_metrics里包含的metrics很少,这里只有2个。如下图

[2019-06-07 01:52:31,364: INFO/MainProcess] run_background_tasks[085bccf9-8188-4842-baee-c4223a78679f]: ----------pruned_metrics----------[2019-06-07 01:52:31,364: INFO/MainProcess] run_background_tasks[085bccf9-8188-4842-baee-c4223a78679f]: ['pg_stat_bgwriter.buffers_alloc', 'pg_stat_database.tup_returned']

[2019-06-07 01:52:31,364: INFO/MainProcess] run_background_tasks[085bccf9-8188-4842-baee-c4223a78679f]: ----------pruned_metrics----------

对于精简后的metrics data,会被保存到pruned_metric_data中。设当前有36个sample,那么pruned_metrics_data如下所示:

[2019-06-07 01:52:31,367: INFO/MainProcess] run_background_tasks[085bccf9-8188-4842-baee-c4223a78679f]: ----------pruned_metric_data----------[2019-06-07 01:52:31,367: INFO/MainProcess] run_background_tasks[085bccf9-8188-4842-baee-c4223a78679f]: {'data': array([[ 1845.111, 73727.111], [ 1861.667, 72836.111], [ 1865.111, 71055.222], [ 1712.889, 54248.889], [ 1836.444, 70974.556], [ 1835.222, 68532.556], [ 1941.889, 70615.778], [ 1826.667, 66300.889], [ 1913.333, 68670.444], [ 1823.889, 60201.778], [ 1901.222, 68815.222], [ 1662.111, 45577.444], [ 1835.111, 67402.778], [ 1896.778, 65052.333], [ 1740.333, 48970.889], [ 2088.500, 68338.500], [ 1762.100, 65846.700], [ 1835.667, 63783.111], [ 1524.600, 40146.100], [ 1734.333, 49186.000], [ 1985.667, 71331.111], [ 1758.889, 49650.222], [ 1988.333, 71295.222], [ 1838.000, 55996.778], [ 2008.222, 71495.222], [ 1439.444, 28370.444], [ 1729.667, 45568.444], [ 1938.111, 65747.333], [ 1910.222, 51863.222], [ 2069.333, 58888.556], [ 2328.333, 66940.778], [ 2408.222, 66183.000], [ 1832.889, 64115.667], [ 1989.000, 63310.778], [ 1861.889, 57102.667], [ 1844.333, 68077.111]]), 'rowlabels': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36], 'columnlabels': ['pg_stat_bgwriter.buffers_alloc', 'pg_stat_database.tup_returned']}

[2019-06-07 01:52:31,367: INFO/MainProcess] run_background_tasks[085bccf9-8188-4842-baee-c4223a78679f]: ----------pruned_metric_data----------

再之后调用run_knob_identification()函数,输入knob_data和pruned_metrics,检测出重要的knobs,并返回ranked_knobs。如下图

[2019-06-05 18:33:16,661: INFO/MainProcess] run_background_tasks[82b6937f-ca01-4534-87d1-a1c0a98351c2]: ----------ranked_knobs----------[2019-06-05 18:33:16,661: INFO/MainProcess] run_background_tasks[82b6937f-ca01-4534-87d1-a1c0a98351c2]: ['global.wal_sync_method', 'global.wal_writer_delay', 'global.wal_buffers', 'global.temp_buffers', 'global.shared_buffers', 'global.seq_page_cost', 'global.random_page_cost', 'global.min_wal_size', 'global.min_parallel_relation_size', 'global.max_worker_processes', 'global.max_wal_size', 'global.max_parallel_workers_per_gather', 'global.maintenance_work_mem', 'global.join_collapse_limit', 'global.from_collapse_limit', 'global.effective_io_concurrency', 'global.effective_cache_size', 'global.default_statistics_target', 'global.backend_flush_after', 'global.bgwriter_delay', 'global.bgwriter_flush_after', 'global.wal_writer_flush_after', 'global.bgwriter_lru_maxpages', 'global.checkpoint_completion_target', 'global.checkpoint_flush_after', 'global.checkpoint_timeout', 'global.commit_delay', 'global.commit_siblings', 'global.deadlock_timeout', 'global.bgwriter_lru_multiplier', 'global.work_mem']

[2019-06-05 18:33:16,661: INFO/MainProcess] run_background_tasks[82b6937f-ca01-4534-87d1-a1c0a98351c2]: ----------ranked_knobs----------

下面分别介绍这三个步骤:

1. Workload characterization:use DBMS internal metrics to characterize workload

这步的目的是对原始的metrics进行降维和聚类,将作用相似(相关度很高,values are strongly correlated)的metrics合并。

它一共分为两步:首先是降维技术,称为因子分析,将高维DBMS指标转换为低维数据。 然后我们使用第二种技术,称为k-means ,用于聚类这种低维数据,成为有意义的群体。 使用降维技术是许多聚类算法的预处理步骤,因为它们减少了数据中“噪音”的数量。 这个提高了聚类分析的稳健性和质量。(原文:Using a dimensionality reduction technique is a preprocessing step for many clustering algorithms because they reduce the amount of “noise” in the data [31, 30]. This improves the robustness and the quality of the cluster analysis)

这部分的代码实现位于periodic_tasks.py的run_workload_characterization()函数中。该函数的输入包括(metric_data[原始的metrics sample])。一开始它会对原始数据进行检测,筛选出metrics互不相同的metric sample(因为如果knob全都一样,就无法通过ML model来学习了),存到nonconst_matrix中。

首先对于原始的knob-metric samples,使用Factor Analysis对其进行降维。然后对于降维后的low-dimension data,使用k-means对作用相似的metrics进行聚类。

2.Identify important knobs: identify which knobs have strongest impact on DBMS target metric(eg, throughput)

OtterTune uses the Lasso path algorithm [29] to determine the order of importance of the DBMS’s knobs.

OtterTune constructs a set of independent variables (X) and one or more dependent variables (y) from the data in its repository. The independent variables(X) are the DBMS’s knobs (or functions of these knobs) and the dependent variables(y) are the metrics that OtterTune collects during an observation period from the DBMS. OtterTune uses the Lasso path algorithm [29] to determine the order of importance of the DBMS’s knobs.

在运行Lasso之前,还需要对原始数据进行Normalization(原文sec 5.1最后一段)。另外需要考虑不同knob之间可能存在的dependency(原文sec 5.2)

最后,OtterTune 必须决定在做出配置建议时使用多少个旋钮,旋钮用的太多会明显增加 OtterTune 的调优时间,而旋钮用的太少则难以找到最好的配置。OtterTune 用了一个增量方法来自动化这个过程,在一次调优过程中,逐步增加使用的旋钮。这个方法让 OtterTune 可以先用少量最重要的旋钮来探索并调优配置,然后再扩大范围考虑其他旋钮。

这部分的代码实现位于periodic_tasks.py的run_knob_identification()函数中。

该函数的输入包括(knob_data=knob_data[knobs sample], metric_data=pruned_metric_data[精简后的metrics sample])。首先它会对原始数据进行检测,筛选出knob互不相同的knob/metric sample(因为如果knob全都一样,就无法通过ML model来学习了),存到nonconst_metric_matrix中。之后对其再进行一些预处理(比如识别出可以调节的knob、standardize to N(0,1)等),然后送入lasso path algorithm。最后会输出一个list ranked_knobs,即按重要性排序的knobs名称。

3. Automated Tuning:

经过前面两步之后,我们已经有了这些数据:(1) the set of non-redundant metrics  (2) the set of most impactful configuration knobs  (3) the data from previous tuning sessions stored in its data repository

Automated Tuning这部分又可以分为两步操作:1) Identify which workload from a previous tuning session is most emblematic of the target workload. It does this by comparing the session’s metrics with those from the previously seen workloads to see which ones react similarly to different knob settings.   2) Once OtterTune has matched the target workload to the most similar one in 1014its repository, it then starts the second step of the analysis where it chooses a configuration that is explicitly selected to maximize the target objective.

注意在Tuning时,每次只调整最impactful的knobs。

1) Workload Mapping

这一步的目的是对于当前待优化的workload,在data repository中找到一个(之前跑过的)和它最相似的workload。

首先用一个matrix把data repository里的数据都存起来。设S[m][i][j]表示当使用configuration #j来运行workload #i时,metric #m的值。

记m表示任意一个metric。对于当前待优化的workload #w,它和另一个workload #i的差异被定义为向量S[m][w]和向量S[m][i]之间的Euclidean distance。记为ED[m][i]

之后对于所有的metrics #m,都计算出它的ED[m][i]。最终workload #i的score定义为所有ED[m][i]的平均值,即Score[i]=Average(ED[m][i] for each m)

最终取出Score最小的workload即可。这个就是和当前待优化的workload最相似的workload。

其实这步就是通过计算不同workloads之间的metrics数据的欧氏距离,来确定最接近的一个workload。但要注意这篇paper假设所有workload所运行在的DBMS的硬件配置都是一致的(原文Sec3.1:We assume that this hardware profile is a single identifier from a list of pre-defined types (e.g., an instance type on Amazon EC2). We defer the problem of automatically determining the hardware capabilities of a DBMS deployment to future work.  /  In the first step, OtterTune tries to “understand” the target workload and map it to a workload for the same DBMS and hardware profile that it has seen (and tuned) in the past.),因此对于更一般的情况,即使是同一个workload,在不同硬件配置下跑出的metrics也会有很大差距,这种方法可能就需要进行修改了。

另外在计算之前还要进行一些normalization之类的预处理(Before computing the score, it is critical that all metrics are of the same order of magnitude.)。

这部分的代码位于async_tasks.py文件的map_workload()函数中。该函数的输入target_data包括X_matrix(knobs数据集)、y_matrix(metrics数据集)、rowlabels(每个sample的序号,这里有25个sample)、X_columnlabels(knobs的编号,这里有31个)、y_columnlabels(metrics的编号,这里有61个)、newest_result_id(其中最新的,也就是client刚提交的sample序号)。注意target_data中只包括当前待优化的workload种类的sample。如下图:

8f900a89c6347c561fdf2122f13be562.png

961ddebeb323a10fe0623af514929fc1.png

[2019-06-06 22:08:14,963: INFO/MainProcess] map_workload[1767d38c-4734-494a-b193-4bf8f9669473]: ##############map_workload called##############

[2019-06-06 22:08:14,965: INFO/MainProcess] Task aggregate_target_results[b7847e24-173a-4c72-9db3-d3e23367ae36] succeeded in 0.09672254799988877s: {'X_matrix': array([[ 0.000, 200.000, 524288.000, 100.000, 2.000, 0.500,262144.000, 300000.000, 0.000, 5.000,...

[2019-06-06 22:08:14,965: INFO/MainProcess] map_workload[1767d38c-4734-494a-b193-4bf8f9669473]: X_matrix

[2019-06-06 22:08:14,966: INFO/MainProcess] map_workload[1767d38c-4734-494a-b193-4bf8f9669473]: (25, 31)

[2019-06-06 22:08:14,966: INFO/MainProcess] map_workload[1767d38c-4734-494a-b193-4bf8f9669473]: [ 0.000 200.000 524288.000 100.000 2.000 0.500 262144.000 300000.000

0.000 5.000 1000.000 100.000 4294967296.000 1.000 8.000 8.000

67108864.000 0.000 1073741824.000 8.000 8388608.000 83886080.000

4.000 1.000 134217728.000 8388608.000 4194304.000 1.000 200.000

1048576.000 4194304.000]

[2019-06-06 22:08:14,966: INFO/MainProcess] map_workload[1767d38c-4734-494a-b193-4bf8f9669473]: -----[2019-06-06 22:08:14,966: INFO/MainProcess] map_workload[1767d38c-4734-494a-b193-4bf8f9669473]: y_matrix

[2019-06-06 22:08:14,966: INFO/MainProcess] map_workload[1767d38c-4734-494a-b193-4bf8f9669473]: (25, 61)

[2019-06-06 22:08:14,966: INFO/MainProcess] map_workload[1767d38c-4734-494a-b193-4bf8f9669473]: [ 0.000 0.000 1845.111 0.000 0.000 0.000 33.333 0.000 0.000 0.000

0.000 0.333 0.000 0.000 168455.333 1871.000 0.000 0.000 0.000

0.000 291.111 39427.778 4287.222 73727.111 8055.444 740.778 3.444

0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000

0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000

0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000

0.000 0.000 0.000 740.778]

[2019-06-06 22:08:14,966: INFO/MainProcess] map_workload[1767d38c-4734-494a-b193-4bf8f9669473]: -----[2019-06-06 22:08:14,966: INFO/MainProcess] map_workload[1767d38c-4734-494a-b193-4bf8f9669473]: rowlabels

[2019-06-06 22:08:14,967: INFO/MainProcess] map_workload[1767d38c-4734-494a-b193-4bf8f9669473]: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25]

[2019-06-06 22:08:14,967: INFO/MainProcess] map_workload[1767d38c-4734-494a-b193-4bf8f9669473]: -----[2019-06-06 22:08:14,967: INFO/MainProcess] map_workload[1767d38c-4734-494a-b193-4bf8f9669473]: X_columnlabels

[2019-06-06 22:08:14,967: INFO/MainProcess] map_workload[1767d38c-4734-494a-b193-4bf8f9669473]: ['global.backend_flush_after', 'global.bgwriter_delay', 'global.bgwriter_flush_after', 'global.bgwriter_lru_maxpages', 'global.bgwriter_lru_multiplier', 'global.checkpoint_completion_target', 'global.checkpoint_flush_after', 'global.checkpoint_timeout', 'global.commit_delay', 'global.commit_siblings', 'global.deadlock_timeout', 'global.default_statistics_target', 'global.effective_cache_size', 'global.effective_io_concurrency', 'global.from_collapse_limit', 'global.join_collapse_limit', 'global.maintenance_work_mem', 'global.max_parallel_workers_per_gather', 'global.max_wal_size', 'global.max_worker_processes', 'global.min_parallel_relation_size', 'global.min_wal_size', 'global.random_page_cost', 'global.seq_page_cost', 'global.shared_buffers', 'global.temp_buffers', 'global.wal_buffers', 'global.wal_sync_method', 'global.wal_writer_delay', 'global.wal_writer_flush_after', 'global.work_mem']

[2019-06-06 22:08:14,967: INFO/MainProcess] map_workload[1767d38c-4734-494a-b193-4bf8f9669473]: -----[2019-06-06 22:08:14,967: INFO/MainProcess] map_workload[1767d38c-4734-494a-b193-4bf8f9669473]: y_columnlabels

[2019-06-06 22:08:14,967: INFO/MainProcess] map_workload[1767d38c-4734-494a-b193-4bf8f9669473]: ['pg_stat_archiver.archived_count', 'pg_stat_archiver.failed_count', 'pg_stat_bgwriter.buffers_alloc', 'pg_stat_bgwriter.buffers_backend', 'pg_stat_bgwriter.buffers_backend_fsync', 'pg_stat_bgwriter.buffers_checkpoint', 'pg_stat_bgwriter.buffers_clean', 'pg_stat_bgwriter.checkpoint_sync_time', 'pg_stat_bgwriter.checkpoint_write_time', 'pg_stat_bgwriter.checkpoints_req', 'pg_stat_bgwriter.checkpoints_timed', 'pg_stat_bgwriter.maxwritten_clean', 'pg_stat_database.blk_read_time', 'pg_stat_database.blk_write_time', 'pg_stat_database.blks_hit', 'pg_stat_database.blks_read', 'pg_stat_database.conflicts', 'pg_stat_database.deadlocks', 'pg_stat_database.temp_bytes', 'pg_stat_database.temp_files', 'pg_stat_database.tup_deleted', 'pg_stat_database.tup_fetched', 'pg_stat_database.tup_inserted', 'pg_stat_database.tup_returned', 'pg_stat_database.tup_updated', 'pg_stat_database.xact_commit', 'pg_stat_database.xact_rollback', 'pg_stat_database_conflicts.confl_bufferpin', 'pg_stat_database_conflicts.confl_deadlock', 'pg_stat_database_conflicts.confl_lock', 'pg_stat_database_conflicts.confl_snapshot', 'pg_stat_database_conflicts.confl_tablespace', 'pg_stat_user_indexes.idx_scan', 'pg_stat_user_indexes.idx_tup_fetch', 'pg_stat_user_indexes.idx_tup_read', 'pg_stat_user_tables.analyze_count', 'pg_stat_user_tables.autoanalyze_count', 'pg_stat_user_tables.autovacuum_count', 'pg_stat_user_tables.idx_scan', 'pg_stat_user_tables.idx_tup_fetch', 'pg_stat_user_tables.n_dead_tup', 'pg_stat_user_tables.n_live_tup', 'pg_stat_user_tables.n_mod_since_analyze', 'pg_stat_user_tables.n_tup_del', 'pg_stat_user_tables.n_tup_hot_upd', 'pg_stat_user_tables.n_tup_ins', 'pg_stat_user_tables.n_tup_upd', 'pg_stat_user_tables.seq_scan', 'pg_stat_user_tables.seq_tup_read', 'pg_stat_user_tables.vacuum_count', 'pg_statio_user_indexes.idx_blks_hit', 'pg_statio_user_indexes.idx_blks_read', 'pg_statio_user_tables.heap_blks_hit', 'pg_statio_user_tables.heap_blks_read', 'pg_statio_user_tables.idx_blks_hit', 'pg_statio_user_tables.idx_blks_read', 'pg_statio_user_tables.tidx_blks_hit', 'pg_statio_user_tables.tidx_blks_read', 'pg_statio_user_tables.toast_blks_hit', 'pg_statio_user_tables.toast_blks_read', 'throughput_txn_per_sec']

[2019-06-06 22:08:14,967: INFO/MainProcess] map_workload[1767d38c-4734-494a-b193-4bf8f9669473]: -----[2019-06-06 22:08:14,967: INFO/MainProcess] map_workload[1767d38c-4734-494a-b193-4bf8f9669473]: newest_result_id

[2019-06-06 22:08:14,968: INFO/MainProcess] map_workload[1767d38c-4734-494a-b193-4bf8f9669473]: 25[2019-06-06 22:08:14,968: INFO/MainProcess] map_workload[1767d38c-4734-494a-b193-4bf8f9669473]: -----[2019-06-06 22:08:14,968: INFO/MainProcess] map_workload[1767d38c-4734-494a-b193-4bf8f9669473]: bad

[2019-06-06 22:08:14,968: INFO/MainProcess] map_workload[1767d38c-4734-494a-b193-4bf8f9669473]: False

[2019-06-06 22:08:14,969: INFO/MainProcess] map_workload[1767d38c-4734-494a-b193-4bf8f9669473]: -----[2019-06-06 22:08:14,969: INFO/MainProcess] map_workload[1767d38c-4734-494a-b193-4bf8f9669473]: --------------map_workload called--------------

View Code

首先它会对不同种类workload的X_matrix、y_matrix进行预处理。对于X_matrix中的每个sample,按照ranked_knobs中的顺序挑出前10项重要的knobs,并删掉剩下的knobs。对于y_matrix中的每个sample,只保留pruned_metrics中选出的几个重要的metrics的值。然后还会进行一些预处理操作(如StandardScaler)。最终放入X_target、y_target数组中。

之后开始准备计算欧氏距离。注意代码中的操作和paper有不一致:对于每种workload,用这种workload自己的knob/metric samples(即代码中的X_scaled;里面不包括client刚交上来的那一个newest_result)训练一个Gaussian Process的model,然后再放在包含所有sample(代码中的X_target;包括了所有workload各自遇到过的knobs的并集)的数据集上进行predict。最后用这个predict出来的结果与原始的y_target计算欧氏距离:对于每个sample #i,计算dists[i]=sqrt(sum{(predictions[i][m]-y_target[i][m])^2 for all metrics #m}),然后将所有sample的dists[i]值取平均数,即为这个workload的score。

(注意对于y_target中,相同knob在不同workload上的情况,If we have multiple observations from running workload i with configuration j, then entry Xm;i;j is the median of all observed values of metric m.)

最后,取score最小的workload种类,即为返回的结果。

2) Configuration Recommendation

这部分的代码位于async_tasks.py文件的configuration_recommendation()函数中

这步使用了Gaussian Process regression,这也是一种ML model。而和神经网络之类的方法相比,高斯过程模型属于无参数模型,相对解决的问题复杂度及与其它算法比较减少了算法计算量;而且在训练样本很少的情况下表现比NN更好。

在这一步中,a Gaussian Process Regression model is trained to estimate the value of the tuning target objective metric under different knob configurations.  最终选出value最好的一个knob configuration方案作为推荐结果。以下就是configuration_recommendation()函数的输入target_data:

8f900a89c6347c561fdf2122f13be562.png

961ddebeb323a10fe0623af514929fc1.png

1 ##############configuration_recommendation called##############2 X_matrix3 (25, 31)4 [ 0.000 200.000 524288.000 100.000 2.000 0.500 262144.000 300000.000

5 0.000 5.000 1000.000 100.000 4294967296.000 1.000 8.000 8.000

6 67108864.000 0.000 1073741824.000 8.000 8388608.000 83886080.000

7 4.000 1.000 134217728.000 8388608.000 4194304.000 1.000 200.000

8 1048576.000 4194304.000]9 -----

10 y_matrix11 (25, 61)12 [ 0.000 0.000 1845.111 0.000 0.000 0.000 33.333 0.000 0.000 0.000

13 0.000 0.333 0.000 0.000 168455.333 1871.000 0.000 0.000 0.000

14 0.000 291.111 39427.778 4287.222 73727.111 8055.444 740.778 3.444

15 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000

16 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000

17 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000

18 0.000 0.000 0.000 740.778]19 -----

20 rowlabels21 [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25]22 -----

23 X_columnlabels24 ['global.backend_flush_after', 'global.bgwriter_delay', 'global.bgwriter_flush_after', 'global.bgwriter_lru_maxpages', 'global.bgwriter_lru_multiplier', 'global.checkpoint_completion_target', 'global.checkpoint_flush_after', 'global.checkpoint_timeout', 'global.commit_delay', 'global.commit_siblings', 'global.deadlock_timeout', 'global.default_statistics_target', 'global.effective_cache_size', 'global.effective_io_concurrency', 'global.from_collapse_limit', 'global.join_collapse_limit', 'global.maintenance_work_mem', 'global.max_parallel_workers_per_gather', 'global.max_wal_size', 'global.max_worker_processes', 'global.min_parallel_relation_size', 'global.min_wal_size', 'global.random_page_cost', 'global.seq_page_cost', 'global.shared_buffers', 'global.temp_buffers', 'global.wal_buffers', 'global.wal_sync_method', 'global.wal_writer_delay', 'global.wal_writer_flush_after', 'global.work_mem']25 -----

26 y_columnlabels27 ['pg_stat_archiver.archived_count', 'pg_stat_archiver.failed_count', 'pg_stat_bgwriter.buffers_alloc', 'pg_stat_bgwriter.buffers_backend', 'pg_stat_bgwriter.buffers_backend_fsync', 'pg_stat_bgwriter.buffers_checkpoint', 'pg_stat_bgwriter.buffers_clean', 'pg_stat_bgwriter.checkpoint_sync_time', 'pg_stat_bgwriter.checkpoint_write_time', 'pg_stat_bgwriter.checkpoints_req', 'pg_stat_bgwriter.checkpoints_timed', 'pg_stat_bgwriter.maxwritten_clean', 'pg_stat_database.blk_read_time', 'pg_stat_database.blk_write_time', 'pg_stat_database.blks_hit', 'pg_stat_database.blks_read', 'pg_stat_database.conflicts', 'pg_stat_database.deadlocks', 'pg_stat_database.temp_bytes', 'pg_stat_database.temp_files', 'pg_stat_database.tup_deleted', 'pg_stat_database.tup_fetched', 'pg_stat_database.tup_inserted', 'pg_stat_database.tup_returned', 'pg_stat_database.tup_updated', 'pg_stat_database.xact_commit', 'pg_stat_database.xact_rollback', 'pg_stat_database_conflicts.confl_bufferpin', 'pg_stat_database_conflicts.confl_deadlock', 'pg_stat_database_conflicts.confl_lock', 'pg_stat_database_conflicts.confl_snapshot', 'pg_stat_database_conflicts.confl_tablespace', 'pg_stat_user_indexes.idx_scan', 'pg_stat_user_indexes.idx_tup_fetch', 'pg_stat_user_indexes.idx_tup_read', 'pg_stat_user_tables.analyze_count', 'pg_stat_user_tables.autoanalyze_count', 'pg_stat_user_tables.autovacuum_count', 'pg_stat_user_tables.idx_scan', 'pg_stat_user_tables.idx_tup_fetch', 'pg_stat_user_tables.n_dead_tup', 'pg_stat_user_tables.n_live_tup', 'pg_stat_user_tables.n_mod_since_analyze', 'pg_stat_user_tables.n_tup_del', 'pg_stat_user_tables.n_tup_hot_upd', 'pg_stat_user_tables.n_tup_ins', 'pg_stat_user_tables.n_tup_upd', 'pg_stat_user_tables.seq_scan', 'pg_stat_user_tables.seq_tup_read', 'pg_stat_user_tables.vacuum_count', 'pg_statio_user_indexes.idx_blks_hit', 'pg_statio_user_indexes.idx_blks_read', 'pg_statio_user_tables.heap_blks_hit', 'pg_statio_user_tables.heap_blks_read', 'pg_statio_user_tables.idx_blks_hit', 'pg_statio_user_tables.idx_blks_read', 'pg_statio_user_tables.tidx_blks_hit', 'pg_statio_user_tables.tidx_blks_read', 'pg_statio_user_tables.toast_blks_hit', 'pg_statio_user_tables.toast_blks_read', 'throughput_txn_per_sec']28 -----

29 newest_result_id30 25

31 -----

32 bad33 False34 -----

35 mapped_workload36 (1, 'tpcc', 11.449912651941599)37 -----

38 scores39 {1: ('tpcc', 11.449912651941599)}40 -----

41 --------------configuration_recommendation called--------------

View Code

这是另一个例子:

8f900a89c6347c561fdf2122f13be562.png

961ddebeb323a10fe0623af514929fc1.png

1 ---------------------------------

2 X_matrix (1, 31)3 [[0.00000000e+00 2.00000000e+02 5.24288000e+05 1.00000000e+02

4 2.00000000e+00 5.00000000e-01 2.62144000e+05 3.00000000e+05

5 0.00000000e+00 5.00000000e+00 1.00000000e+03 1.00000000e+02

6 4.29496730e+09 1.00000000e+00 8.00000000e+00 8.00000000e+00

7 6.71088640e+07 0.00000000e+00 1.07374182e+09 8.00000000e+00

8 8.38860800e+06 8.38860800e+07 4.00000000e+00 1.00000000e+00

9 1.34217728e+08 8.38860800e+06 4.19430400e+06 1.00000000e+00

10 2.00000000e+02 1.04857600e+06 4.19430400e+06]]11 ---------------------------------

12 y_matrix (1, 61)13 [[0.00000000e+00 0.00000000e+00 2.00057143e+03 0.00000000e+00

14 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00

15 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00

16 0.00000000e+00 0.00000000e+00 1.11175000e+05 2.00085714e+03

17 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00

18 1.97142857e+02 2.62194286e+04 2.82971429e+03 4.85960000e+04

19 5.40785714e+03 4.85285714e+02 7.14285714e-01 0.00000000e+00

20 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00

21 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00

22 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00

23 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00

24 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00

25 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00

26 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00

27 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00

28 4.85285714e+02]]29 ---------------------------------

30 rowlabels31 [19]32 ---------------------------------

33 X_columnlabels34 ['global.backend_flush_after', 'global.bgwriter_delay', 'global.bgwriter_flush_after', 'global.bgwriter_lru_maxpages', 'global.bgwriter_lru_multiplier', 'global.checkpoint_completion_target', 'global.checkpoint_flush_after', 'global.checkpoint_timeout', 'global.commit_delay', 'global.commit_siblings', 'global.deadlock_timeout', 'global.default_statistics_target', 'global.effective_cache_size', 'global.effective_io_concurrency', 'global.from_collapse_limit', 'global.join_collapse_limit', 'global.maintenance_work_mem', 'global.max_parallel_workers_per_gather', 'global.max_wal_size', 'global.max_worker_processes', 'global.min_parallel_relation_size', 'global.min_wal_size', 'global.random_page_cost', 'global.seq_page_cost', 'global.shared_buffers', 'global.temp_buffers', 'global.wal_buffers','global.wal_sync_method', 'global.wal_writer_delay', 'global.wal_writer_flush_after', 'global.work_mem']35 ---------------------------------

36 y_columnlabels37 ['pg_stat_archiver.archived_count', 'pg_stat_archiver.failed_count', 'pg_stat_bgwriter.buffers_alloc', 'pg_stat_bgwriter.buffers_backend', 'pg_stat_bgwriter.buffers_backend_fsync', 'pg_stat_bgwriter.buffers_checkpoint', 'pg_stat_bgwriter.buffers_clean', 'pg_stat_bgwriter.checkpoint_sync_time', 'pg_stat_bgwriter.checkpoint_write_time', 'pg_stat_bgwriter.checkpoints_req', 'pg_stat_bgwriter.checkpoints_timed', 'pg_stat_bgwriter.maxwritten_clean', 'pg_stat_database.blk_read_time', 'pg_stat_database.blk_write_time', 'pg_stat_database.blks_hit', 'pg_stat_database.blks_read', 'pg_stat_database.conflicts', 'pg_stat_database.deadlocks', 'pg_stat_database.temp_bytes', 'pg_stat_database.temp_files', 'pg_stat_database.tup_deleted', 'pg_stat_database.tup_fetched', 'pg_stat_database.tup_inserted', 'pg_stat_database.tup_returned', 'pg_stat_database.tup_updated', 'pg_stat_database.xact_commit', 'pg_stat_database.xact_rollback', 'pg_stat_database_conflicts.confl_bufferpin', 'pg_stat_database_conflicts.confl_deadlock', 'pg_stat_database_conflicts.confl_lock', 'pg_stat_database_conflicts.confl_snapshot', 'pg_stat_database_conflicts.confl_tablespace', 'pg_stat_user_indexes.idx_scan', 'pg_stat_user_indexes.idx_tup_fetch', 'pg_stat_user_indexes.idx_tup_read', 'pg_stat_user_tables.analyze_count', 'pg_stat_user_tables.autoanalyze_count', 'pg_stat_user_tables.autovacuum_count', 'pg_stat_user_tables.idx_scan', 'pg_stat_user_tables.idx_tup_fetch', 'pg_stat_user_tables.n_dead_tup', 'pg_stat_user_tables.n_live_tup', 'pg_stat_user_tables.n_mod_since_analyze', 'pg_stat_user_tables.n_tup_del', 'pg_stat_user_tables.n_tup_hot_upd', 'pg_stat_user_tables.n_tup_ins', 'pg_stat_user_tables.n_tup_upd', 'pg_stat_user_tables.seq_scan', 'pg_stat_user_tables.seq_tup_read','pg_stat_user_tables.vacuum_count', 'pg_statio_user_indexes.idx_blks_hit', 'pg_statio_user_indexes.idx_blks_read', 'pg_statio_user_tables.heap_blks_hit', 'pg_statio_user_tables.heap_blks_read', 'pg_statio_user_tables.idx_blks_hit', 'pg_statio_user_tables.idx_blks_read', 'pg_statio_user_tables.tidx_blks_hit', 'pg_statio_user_tables.tidx_blks_read', 'pg_statio_user_tables.toast_blks_hit', 'pg_statio_user_tables.toast_blks_read', 'throughput_txn_per_sec']38 ---------------------------------

39 newest_result_id40 19

41 ---------------------------------

42 bad43 True44 ---------------------------------

45 config_recommend46 {'global.backend_flush_after': 1522010, 'global.bgwriter_delay': 9149, 'global.bgwriter_flush_after': 471075, 'global.bgwriter_lru_maxpages': 430, 'global.bgwriter_lru_multiplier': 3.0098488567060078, 'global.checkpoint_completion_target':0.08955746130258635, 'global.checkpoint_flush_after': 455731, 'global.checkpoint_timeout': 19236398, 'global.commit_delay': 77545, 'global.commit_siblings': 464, 'global.deadlock_timeout': 1955746883, 'global.default_statistics_target': 3571, 'global.effective_cache_size': 6287603076, 'global.effective_io_concurrency': 617, 'global.from_collapse_limit': 835373562, 'global.join_collapse_limit': 2113163502, 'global.maintenance_work_mem': 956493954, 'global.max_parallel_workers_per_gather': 774, 'global.max_wal_size': 11811263118911574, 'global.max_worker_processes': 179262, 'global.min_parallel_relation_size': 3563800869080, 'global.min_wal_size': 18601145769953536, 'global.random_page_cost': 9.19520558840551e+307, 'global.seq_page_cost': 1.38643328389727e+308, 'global.shared_buffers': 181678, 'global.temp_buffers': 864188196, 'global.wal_buffers': 431497396, 'global.wal_sync_method': 1, 'global.wal_writer_delay': 1250, 'global.wal_writer_flush_after': 7482882703966, 'global.work_mem': 5870930202}

View Code

target_data是一个dict,包含以下几个字段:X_matrix(knobs数据集,数据来源包括所有之前的workload【?】+当前的target workload,整体作为训练集)、y_matrix(metrics数据集,数据来源包括所有之前的workload【?】+当前的target workload,整体作为训练集)、rowlabels(每个sample的序号,这里有25个sample,那就是1排到25。其实没啥意义。。。)、X_columnlabels(每个knobs的名称,这里有31个)、y_columnlabels(每个metrics的名称,这里有61个)、newest_result_id(其中最新的,也就是client刚提交的target workload的序号。一般是最大的那个)、bad(True表示当前训练数据太少,是randomly generated的推荐参数。False表示是真的通过ML model推荐出来的)、mapped_workload(mapped workload的序号)、score(我也不知道是个啥qwq...但是用不到啦)、config_recommend(只有bad==True时才出现。当map_workload阶段发现训练数据太少,并randomly generate一个推荐参数时,该字段用来存储生成的参数,configuration_recommendation函数就直接返回这个参数,不通过ML学习了。)。

以下是 X_matrix中knob数值的存储形式 和 postgresql.conf中保存的原始形式的对应表。可以看出基本是去掉单位后转换成一致的量纲即可:

7820e9b5f5e1c639d0207de144ee7646.png

因为OtterTune bug的原因(详见https://www.cnblogs.com/pdev/p/10903628.html),无法通过运行来调试这部分的代码。我通过阅读configuration_recommendation()函数的源代码整理出了大致的逻辑:

defconfiguration_recommendation(target_data[input=map_workload的返回结果]):

X_workload, X_columnlabels=knob data of matched workload

y_workload, y_columnlabels=metric data of matched workload

newest_result, X_target, y_target= knob/metric data of target workload

X_matrix=np.vstack([X_target, X_workload])

X_scaled= transform(X_matrix) #Scale to N(0,1)

y_matrix =np.vstack([y_target, y_workload])

y_scaled=transform(y_matrix)

X_samples= randomly generated knobs #随机采样

X_min, X_max = min/max value constrains on knobs

model=GPRGD()

model.fit(X_scaled, y_scaled)

res=model.predict(X_samples)

ans=res.find_the_one_with_best_metricsreturn(ans.knobs)

这里省略了对knob进行normalization的一些细节。可以看出:

model的训练数据来自match到的workload和待优化的目标workload。

训练好的model将会在随机采样生成的knobs上预测该配置下metrics的数值,并从中选出最优的,返回它的knob。

2. analysis

该目录下的文件主要用于实现Machine Learning model

ottertune\server\analysis\

│ base.py

│ cluster.py

│ constraints.py

│ factor_analysis.py

│ gp.py      class GPRNP, numpy version of Gaussian Process Regression. map_workload中预测未知workload/knob上的metric value用

│ gp_tf.py class GPRGD, tensorflow version of Gaussian Process Regression. configuration_recommendation中推荐configuration用

│ lasso.py

│ preprocessing.py

│ util.py

└─tests

│ test_cluster.py

│ test_constraints.py

│ test_gpr.py

└─test_preprocessing.py

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值