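TFA log collection tool:
I. Introduction:
TFA stands for Trace File Analyzer, a log analysis tool.
TFA monitors logs for significant problems that may affect service, and automatically collects the relevant diagnostics when it detects a potential problem.
TFA can identify the relevant information in log files, trim the logs down to the portion needed to resolve the problem, collect data across cluster nodes, and consolidate everything in one place.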
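How it works:
1. The DBA issues the diagcollect command, which starts the TFA log collection process.
2. The local TFA sends collection requests to TFA on the other nodes, which begin collecting logs there.
3. The local TFA starts its own log collection at the same time.
4. The TFA logs from all nodes involved are archived on the "master" node, the one where diagcollect was issued.
5. The DBA extracts the archived TFA logs and analyzes them, or submits them to an SR for handling.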
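II. Installation, startup, and shutdown:
The root user's environment variables must be configured; otherwise the installer reports ERROR: ORACLE_HOME is not set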
[root@rac19cn1 ~]# more .bash_profile
# .bash_profile
# Get the aliases and functions
if [ -f ~/.bashrc ]; then
. ~/.bashrc
fi
# User specific environment and startup programs
PATH=$PATH:$HOME/bin
export PATH
export ORACLE_HOME=/oracle/app/product/193000/db_1
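The ORACLE_HOME requirement above can be checked before running the installer. The following is a minimal sketch, not part of TFA: the function name check_oracle_home is hypothetical, and it only tests that the variable is set and points at an existing directory.

```shell
# Hypothetical pre-flight check (not a TFA command): confirm that
# ORACLE_HOME is set and is a directory before running roottfa.sh.
check_oracle_home() {
    if [ -z "${ORACLE_HOME:-}" ]; then
        echo "ERROR: ORACLE_HOME is not set" >&2
        return 1
    fi
    if [ ! -d "$ORACLE_HOME" ]; then
        echo "ERROR: ORACLE_HOME is not a directory: $ORACLE_HOME" >&2
        return 1
    fi
    echo "ORACLE_HOME=$ORACLE_HOME"
}

# Usage (as root): check_oracle_home && ./roottfa.sh
```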
[root@rac19cn1 ~]# cd /oracle/app/product/193000/db_1/suptools/tfa/release/tfa_home/install/
[root@rac19cn1 install]# ls
inittab_master init.tfa.tmpl oracle-tfa.conf oracle-tfa.service roottfa.sh
[root@rac19cn1 install]# ./roottfa.sh
Do you want to setup Oracle Trace File Analyzer (TFA) now ? yes|[no] :
yes
Installing Oracle Trace File Analyzer (TFA).
LogFile: /oracle/app/product/193000/db_1/install/root_rac19cn1_2020-08-26_11-14-01.log
Finished installing Oracle Trace File Analyzer (TFA)
TFA installation is complete.
Startup and shutdown:
Start:
[root@rac19cn1 bin]# ./tfactl
WARNING - TFA Software is older than 180 days. Please consider upgrading TFA to the latest version.
tfactl> start
Starting TFA..
Waiting up to 100 seconds for TFA to be started..
. . . . .
Message from syslogd@rac19cn1 at Nov 2 16:30:32 ...
kernel:do_IRQ: 0.153 No irq handler for vector (irq -1)
Successfully started TFA Process..
. . . . .
TFA Started and listening for commands
Stop:
tfactl> stop
III. Log collection:
Collect database-related logs only:
tfactl> diagcollect -database ora19c
By default TFA will collect diagnostics for the last 12 hours. This can result in large collections
For more targeted collections enter the time of the incident, otherwise hit <RETURN> to collect for the last 12 hours
[YYYY-MM-DD HH24:MI:SS,<RETURN>=Collect for last 12 hours] :
Collecting data for the last 12 hours for this component ...
Collecting data for all nodes
Collection Id : 20201103092107rac19cn1
Detailed Logging at : /oracle/gridbase/tfa/repository/collection_Tue_Nov_03_09_21_07_CST_2020_node_all/diagcollect_20201103092107_rac19cn1.log
2020/11/03 09:21:23 CST : NOTE : Any file or directory name containing the string .com will be renamed to replace .com with dotcom
2020/11/03 09:21:23 CST : Collection Name : tfa_Tue_Nov_03_09_21_07_CST_2020.zip
2020/11/03 09:21:24 CST : Collecting diagnostics from hosts : [rac19cn1, rac19cn2]
2020/11/03 09:21:25 CST : Scanning of files for Collection in progress...
2020/11/03 09:21:25 CST : Collecting additional diagnostic information...
2020/11/03 09:21:55 CST : Getting list of files satisfying time range [11/02/2020 21:21:23 CST, 11/03/2020 09:21:55 CST]
2020/11/03 09:22:52 CST : Completed collection of additional diagnostic information...
2020/11/03 09:29:02 CST : Collecting ADR incident files...
2020/11/03 09:29:03 CST : Completed Local Collection
2020/11/03 09:29:03 CST : Remote Collection in Progress...
.-------------------------------------.
| Collection Summary |
+----------+-----------+-------+------+
| Host | Status | Size | Time |
+----------+-----------+-------+------+
| rac19cn2 | Completed | 10kB | 61s |
| rac19cn1 | Completed | 137kB | 459s |
'----------+-----------+-------+------'
Logs are being collected to: /oracle/gridbase/tfa/repository/collection_Tue_Nov_03_09_21_07_CST_2020_node_all
/oracle/gridbase/tfa/repository/collection_Tue_Nov_03_09_21_07_CST_2020_node_all/rac19cn1.tfa_Tue_Nov_03_09_21_07_CST_2020.zip
/oracle/gridbase/tfa/repository/collection_Tue_Nov_03_09_21_07_CST_2020_node_all/rac19cn2.tfa_Tue_Nov_03_09_21_07_CST_2020.zip
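When collections are scripted, the Collection Summary table can be checked automatically. The sketch below is an illustration only: collection_failures is a hypothetical helper that reads saved diagcollect output on standard input and assumes the table layout shown above (Host, Status, Size, Time). It treats any status other than Completed as a failure.

```shell
# Hypothetical helper: read diagcollect output on stdin and print every
# host whose Status column is not "Completed". Exit status is non-zero
# if any such host is found.
collection_failures() {
    awk -F'|' '
        NF == 6 {                          # rows of the 4-column summary table
            host = $2; status = $3
            gsub(/^ +| +$/, "", host)      # trim padding around the cell
            gsub(/^ +| +$/, "", status)
            if (host != "Host" && status != "Completed") {
                print host ": " status
                bad = 1
            }
        }
        END { exit bad }
    '
}

# Usage: collection_failures < diagcollect_output.txt
```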
Collect all trace logs for a specific date:
tfactl> diagcollect -for Nov/2/2020
Collecting data for all nodes
Scanning files for Nov/2/2020
Collection Id : 20201103101305rac19cn1
Detailed Logging at : /oracle/gridbase/tfa/repository/collection_Tue_Nov_03_10_13_05_CST_2020_node_all/diagcollect_20201103101305_rac19cn1.log
2020/11/03 10:13:13 CST : NOTE : Any file or directory name containing the string .com will be renamed to replace .com with dotcom
2020/11/03 10:13:13 CST : Collection Name : tfa_Tue_Nov_03_10_13_05_CST_2020.zip
2020/11/03 10:13:14 CST : Collecting diagnostics from hosts : [rac19cn1, rac19cn2]
2020/11/03 10:13:14 CST : Scanning of files for Collection in progress...
2020/11/03 10:13:14 CST : Collecting additional diagnostic information...
2020/11/03 10:14:39 CST : Getting list of files satisfying time range [11/02/2020 00:00:00 CST, 11/02/2020 23:59:59 CST]
2020/11/03 10:16:18 CST : Completed collection of additional diagnostic information...
2020/11/03 10:27:16 CST : Collecting ADR incident files...
2020/11/03 10:27:17 CST : Completed Local Collection
2020/11/03 10:27:18 CST : Remote Collection in Progress...
.-------------------------------------.
| Collection Summary |
+----------+-----------+-------+------+
| Host | Status | Size | Time |
+----------+-----------+-------+------+
| rac19cn2 | Completed | 294MB | 332s |
| rac19cn1 | Completed | 315MB | 843s |
'----------+-----------+-------+------'
Logs are being collected to: /oracle/gridbase/tfa/repository/collection_Tue_Nov_03_10_13_05_CST_2020_node_all
/oracle/gridbase/tfa/repository/collection_Tue_Nov_03_10_13_05_CST_2020_node_all/rac19cn1.tfa_Tue_Nov_03_10_13_05_CST_2020.zip
/oracle/gridbase/tfa/repository/collection_Tue_Nov_03_10_13_05_CST_2020_node_all/rac19cn2.tfa_Tue_Nov_03_10_13_05_CST_2020.zip
[root@rac19cn1 diag]# ls
asm clients crs rdbms tnslsnr
// Listener logs and cluster logs are also collected by default
*********** Collect only the database trace logs for 2020-11-02 ***********:
tfactl> diagcollect -database ora19c -for Nov/2/2020
*********** Collect only the cluster logs for 2020-11-02 ***********:
tfactl> diagcollect -crs -for Nov/2/2020
Collect database logs for a specified time range:
tfactl> diagcollect -database ora19c -from "2020-11-02 18:00:00" -to "2020-11-03 08:00:00"
Collecting data for all nodes
Scanning files from nov/02/2020 18:00:00 to nov/03/2020 08:00:00
Collection Id : 20201106090421rac19cn1
Detailed Logging at : /oracle/gridbase/tfa/repository/collection_Fri_Nov_06_09_04_21_CST_2020_node_all/diagcollect_20201106090421_rac19cn1.log
2020/11/06 09:04:31 CST : NOTE : Any file or directory name containing the string .com will be renamed to replace .com with dotcom
2020/11/06 09:04:31 CST : Collection Name : tfa_Fri_Nov_06_09_04_21_CST_2020.zip
2020/11/06 09:04:32 CST : Collecting diagnostics from hosts : [rac19cn1, rac19cn2]
2020/11/06 09:04:32 CST : Scanning of files for Collection in progress...
2020/11/06 09:04:32 CST : Collecting additional diagnostic information...
2020/11/06 09:04:57 CST : Getting list of files satisfying time range [11/02/2020 18:00:00 CST, 11/03/2020 08:00:00 CST]
2020/11/06 09:05:47 CST : Completed collection of additional diagnostic information...
2020/11/06 09:12:57 CST : Collecting ADR incident files...
2020/11/06 09:12:57 CST : Completed Local Collection
2020/11/06 09:12:58 CST : Remote Collection in Progress...
.-------------------------------------.
| Collection Summary |
+----------+-----------+-------+------+
| Host | Status | Size | Time |
+----------+-----------+-------+------+
| rac19cn2 | Completed | 10kB | 83s |
| rac19cn1 | Completed | 185kB | 505s |
'----------+-----------+-------+------'
Logs are being collected to: /oracle/gridbase/tfa/repository/collection_Fri_Nov_06_09_04_21_CST_2020_node_all
/oracle/gridbase/tfa/repository/collection_Fri_Nov_06_09_04_21_CST_2020_node_all/rac19cn1.tfa_Fri_Nov_06_09_04_21_CST_2020.zip
/oracle/gridbase/tfa/repository/collection_Fri_Nov_06_09_04_21_CST_2020_node_all/rac19cn2.tfa_Fri_Nov_06_09_04_21_CST_2020.zip
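The -from and -to values can be derived from an incident time instead of being typed by hand. The following is a rough sketch assuming GNU date (the -d option); tfa_time_window is a hypothetical helper, not a tfactl feature. It prints the start and end of a window of N hours on each side of the incident, in the same "YYYY-MM-DD HH:MM:SS" format used above.

```shell
# Hypothetical helper (assumes GNU date): print the start and end of a
# window of $2 hours before and after the incident time $1.
tfa_time_window() {
    incident=$1
    hours=$2
    epoch=$(date -d "$incident" +%s) || return 1
    date -d "@$((epoch - hours * 3600))" '+%Y-%m-%d %H:%M:%S'
    date -d "@$((epoch + hours * 3600))" '+%Y-%m-%d %H:%M:%S'
}

# Usage: tfa_time_window "2020-11-03 01:00:00" 7
# The first output line is the -from value, the second the -to value.
```

The arithmetic is done on epoch seconds, so the result follows the local time zone of the shell that runs it.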
*********** Collect database logs from the last hour ***********:
tfactl> diagcollect -database ora19c -since 1h
Collect database logs from a specific node:
tfactl> diagcollect -database ora19c -node rac19cn1 -for Nov/2/2020
Collecting data for rac19cn1 node(s)
Scanning files for Nov/2/2020
Collection Id : 20201106101221rac19cn1
Detailed Logging at : /oracle/gridbase/tfa/repository/collection_Fri_Nov_06_10_12_21_CST_2020_node_rac19cn1/diagcollect_20201106101221_rac19cn1.log
2020/11/06 10:12:31 CST : NOTE : Any file or directory name containing the string .com will be renamed to replace .com with dotcom
2020/11/06 10:12:31 CST : Collection Name : tfa_Fri_Nov_06_10_12_21_CST_2020.zip
2020/11/06 10:12:32 CST : Collecting diagnostics from hosts : [rac19cn1]
2020/11/06 10:12:32 CST : Scanning of files for Collection in progress...
2020/11/06 10:12:32 CST : Collecting additional diagnostic information...
2020/11/06 10:12:37 CST : Getting list of files satisfying time range [11/02/2020 00:00:00 CST, 11/02/2020 23:59:59 CST]
2020/11/06 10:13:37 CST : Completed collection of additional diagnostic information...
2020/11/06 10:21:19 CST : Collecting ADR incident files...
2020/11/06 10:21:20 CST : Completed Local Collection
.-------------------------------------.
| Collection Summary |
+----------+-----------+-------+------+
| Host | Status | Size | Time |
+----------+-----------+-------+------+
| rac19cn1 | Completed | 1.5MB | 528s |
'----------+-----------+-------+------'
Logs are being collected to: /oracle/gridbase/tfa/repository/collection_Fri_Nov_06_10_12_21_CST_2020_node_rac19cn1
/oracle/gridbase/tfa/repository/collection_Fri_Nov_06_10_12_21_CST_2020_node_rac19cn1/rac19cn1.tfa_Fri_Nov_06_10_12_21_CST_2020.zip
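Each run above ends by listing the per-node zip archives. If the diagcollect output is saved to a file, those paths can be pulled out for copying or for attaching to an SR. This is a small sketch under that assumption; collection_archives is a hypothetical name, and the filter relies only on the output format shown above (absolute paths ending in .zip on their own line).

```shell
# Hypothetical helper: read saved diagcollect output on stdin and print
# only the lines that are absolute paths to .zip archives.
collection_archives() {
    grep -E '^[[:space:]]*/.*\.zip$' | sed 's/^[[:space:]]*//'
}

# Usage: collection_archives < diagcollect_output.txt
```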
IV. Log analysis:
Oracle provides the analyze command to help analyze the database's current trace files.
Common usage:
tfactl> analyze -search "ORA-" -since 1d
INFO: analyzing all (Alert and Unix System Logs) logs for the last 1440 minutes... Please wait...
INFO: analyzing host: rac19cn1
Report title: Analysis of Alert,System Logs
Report date range: last ~1 day(s)
Report (default) time zone: CST - China Standard Time
Analysis started at: 06-Nov-2020 09:28:08 AM CST
Elapsed analysis time: 15 second(s).
Configuration file: /oracle/grid/crs_1/tfa/rac19cn1/tfa_home/ext/tnt/conf/tnt.prop
Configuration group: all
Parameter: ORA-
Total message count: 24,938, from 25-Aug-2020 06:22:19 PM CST to 06-Nov-2020 09:20:01 AM CST
Messages matching last ~1 day(s): 2,180, from 05-Nov-2020 09:30:01 AM CST to 06-Nov-2020 09:20:01 AM CST
Matching regex: ORA-
Case sensitive: false
Match count: 0
INFO: analyzing all (Alert and Unix System Logs) logs for the last 1440 minutes... Please wait...
INFO: analyzing host: rac19cn2
Report title: Analysis of Alert,System Logs
Report date range: last ~1 day(s)
Report (default) time zone: CST - China Standard Time
Analysis started at: 06-Nov-2020 09:28:26 AM CST
Elapsed analysis time: 8 second(s).
Configuration file: /oracle/grid/crs_1/tfa/rac19cn2/tfa_home/ext/tnt/conf/tnt.prop
Configuration group: all
Parameter: ORA-
Total message count: 35,060, from 25-Aug-2020 06:30:24 PM CST to 06-Nov-2020 09:20:02 AM CST
Messages matching last ~1 day(s): 4,940, from 05-Nov-2020 09:30:01 AM CST to 06-Nov-2020 09:20:02 AM CST
Matching regex: ORA-
Case sensitive: false
Match count: 0
The command above analyzes all logs: os, db, asm, crs, and so on.
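The report repeats one block per host, so the match counts are easy to miss. As an illustration, the sketch below condenses saved analyze output into one line per host. match_counts is a hypothetical helper that relies only on the "analyzing host:" and "Match count:" lines shown above.

```shell
# Hypothetical helper: read saved "tfactl analyze" output on stdin and
# print "<host> <match count>" for each host section.
match_counts() {
    awk '
        /analyzing host:/ { host = $NF }        # remember the current host
        /Match count:/    { print host, $NF }   # last field is the count
    '
}

# Usage: match_counts < analyze_output.txt
```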
Analyze only the database instance logs from the last two days:
tfactl> analyze -comp db -since 2d
The -comp parameter accepts os, db, asm, acfs, crs, or all; the default is all, which analyzes every component.
V. Other operations:
List the users currently allowed to use tfactl:
tfactl> access lsusers
.---------------------------------.
| TFA Users in rac19cn1 |
+-----------+-----------+---------+
| User Name | User Type | Status |
+-----------+-----------+---------+
| grid | USER | Allowed |
'-----------+-----------+---------'
.---------------------------------.
| TFA Users in rac19cn2 |
+-----------+-----------+---------+
| User Name | User Type | Status |
+-----------+-----------+---------+
| grid | USER | Allowed |
'-----------+-----------+---------'
By default, TFA grants access only to the root user and the grid user.
[oracle@rac19cn1 bin]$ ./tfactl
TFA-00519 Oracle Trace File Analyzer (TFA) is not installed.
// When run as the oracle user, tfactl reports that TFA is not installed
Grant the oracle user access to TFA:
[root@rac19cn1 bin]# tfactl access add -user oracle
Successfully added 'oracle' to TFA Access list.
.---------------------------------.
| TFA Users in rac19cn1 |
+-----------+-----------+---------+
| User Name | User Type | Status |
+-----------+-----------+---------+
| grid | USER | Allowed |
| oracle | USER | Allowed |
'-----------+-----------+---------'
.---------------------------------.
| TFA Users in rac19cn2 |
+-----------+-----------+---------+
| User Name | User Type | Status |
+-----------+-----------+---------+
| grid | USER | Allowed |
| oracle | USER | Allowed |
'-----------+-----------+---------'
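Whether a user was granted access on every node can be confirmed from the lsusers tables. The following is a minimal sketch, assuming the three-column layout shown above (User Name, User Type, Status). tfa_user_allowed is a hypothetical helper that reads saved "access lsusers" output on standard input.

```shell
# Hypothetical helper: succeed only if user $1 appears in the lsusers
# tables read from stdin and every one of its rows says "Allowed".
tfa_user_allowed() {
    awk -F'|' -v user="$1" '
        NF == 5 {                          # rows of the 3-column user table
            name = $2; status = $4
            gsub(/^ +| +$/, "", name)      # trim padding around the cell
            gsub(/^ +| +$/, "", status)
            if (name == user) {
                seen++
                if (status != "Allowed") denied++
            }
        }
        END { exit !(seen > 0 && denied == 0) }
    '
}

# Usage: tfa_user_allowed oracle < lsusers_output.txt && echo "oracle has access"
```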
Check the current host status:
tfactl> print status
.-----------------------------------------------------------------------------------------------.
| Host | Status of TFA | PID | Port | Version | Build ID | Inventory Status |
+----------+---------------+------+------+------------+----------------------+------------------+
| rac19cn1 | RUNNING | 9127 | 5000 | 18.3.3.0.0 | 18330020190315044534 | COMPLETE |
| rac19cn2 | RUNNING | 1848 | 5000 | 18.3.3.0.0 | 18330020190315044534 | COMPLETE |
'----------+---------------+------+------+------------+----------------------+------------------'
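For routine health checks, the status table can be reduced to the hosts that need attention. This sketch assumes the seven-column layout shown above and reads saved "print status" output on standard input. tfa_not_running is a hypothetical helper name.

```shell
# Hypothetical helper: print each host whose "Status of TFA" column is
# not RUNNING. Exit status is non-zero if any such host is found.
tfa_not_running() {
    awk -F'|' '
        NF == 9 {                          # rows of the 7-column status table
            host = $2; status = $3
            gsub(/^ +| +$/, "", host)      # trim padding around the cell
            gsub(/^ +| +$/, "", status)
            if (host != "Host" && status != "RUNNING") {
                print host
                bad = 1
            }
        }
        END { exit bad }
    '
}

# Usage: tfa_not_running < print_status_output.txt
```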