Oracle—运维监控

1、运维监控

1.1、状态查看

---query_sql
select instance_number,
       instance_name,
       host_name,
       version,
       startup_time,
       status,
       parallel,
       thread#,
       archiver,
       log_switch_wait,
       logins,
       shutdown_pending,
       database_status,
       instance_role,
       active_state,
       blocked
  from gv$instance t;

说明:使用 gv$instance 对于集群可以查看多个节点的实例状态;

  • 版本查看
select * from gv$version;

说明:使用 gv$version 对于集群可以查看多个节点的实例状态;

  • Jdk版本
    SELECT dbms_java.get_ojvm_property(PROPSTRING=>'java.version') FROM dual;
Oracle版本jdk版本
11.2.0.41.6.0_43
11.2.0.11.5.0_10

1.2、告警排查

近期做一个项目遇到一个问题,部署在Linux服务器上的Oracle数据库晚上17:00 ETL 定时任务调用后,第二天早上发现数据库实例异常挂掉了;

1) 确认环境

[oracle@server-xyjf ~]$ 
[oracle@server-xyjf ~]$ echo $ORACLE_SID
yyjydb
[oracle@server-xyjf ~]$ 
[oracle@server-xyjf ~]$ echo $ORACLE_BASE
/u01/app/oracle
[oracle@server-xyjf ~]$ 
[oracle@server-xyjf ~]$ echo $ORACLE_HOME
/u01/app/oracle/product/11.2.0/db_1
[oracle@server-xyjf ~]$

2) 查看告警日志

[oracle@server-xyjf ~]$ cd $ORACLE_BASE/diag/rdbms/yyjydb/yyjydb/trace
[oracle@server-xyjf ~]$ ll |grep alert
-rw-r----- 1 oracle oinstall  1171872 May 30 15:27 alert_yyjydb.log

more 命令查看alert_yyjydb.log 日志

[oracle@server-xyjf ~]$ more alert_yyjydb.log
RECOMMENDATION:
  Total System Global Area size is 303 GB. For optimal performance,
  prior to the next instance restart:
  1. Increase the number of unused large pages by 
 at least 155137 (page size 2048 KB, total size 303 GB) system wide to
  get 100% of the System Global Area allocated with large pages
  2. Large pages are automatically locked into physical memory.
 Increase the per process memlock (soft) limit to at least 303 GB to lock
 100% System Global Area's large pages into physical memory
********************************************************************
LICENSE_MAX_SESSION = 0
LICENSE_SESSIONS_WARNING = 0
--More--(0%)

从5月17号下午5点的日志开始查看

/May 17 17:
...skipping
Fri May 17 15:05:46 2019
Completed: alter  tablespace TS_DAT_DTFX  add datafile '/oradata02/yyjydb/ts_dat_dtfx03.dbf' size 30G
Fri May 17 17:00:36 2019
Advanced SCN by 16643854 minutes worth to 0x0ee1.7a1ebc85, by distributed transaction remote logon, remote DB: NMSWHXFF.
 Client info: DB logon user SJJC_BZ, machine WORKGROUP\HUWEITBJ151, program plsqldev.exe, and OS user sygzh
Fri May 17 17:34:25 2019
Thread 1 cannot allocate new log, sequence 7
Private strand flush not complete
  Current log# 3 seq# 6 mem# 0: /oradata01/yyjydb/redo03.log
Thread 1 advanced to log sequence 7 (LGWR switch)
  Current log# 1 seq# 7 mem# 0: /oradata01/yyjydb/redo01.log
Fri May 17 17:35:07 2019
Thread 1 cannot allocate new log, sequence 8
Private strand flush not complete
  Current log# 1 seq# 7 mem# 0: /oradata01/yyjydb/redo01.log
Thread 1 advanced to log sequence 8 (LGWR switch)

查询 Errors 关键字,定位报错信息

/Errors
...skipping
Fri May 17 19:31:16 2019
KCF: read, write or open error, block=0x180 online=1
        file=3 '/oradata01/yyjydb/temp_sjck_bz_02.dbf'
        error=27072 txt: 'Linux-x86_64 Error: 28: No space left on device
Additional information: 4
Additional information: 384
Additional information: -1'
Errors in file /u01/app/oracle/diag/rdbms/yyjydb/yyjydb/trace/yyjydb_dbw0_3684.trc:
Errors in file /u01/app/oracle/diag/rdbms/yyjydb/yyjydb/trace/yyjydb_dbw0_3684.trc:
ORA-63999: data file suffered media failure
ORA-01114: IO error writing block to file 2003 (block # 384)
ORA-01110: data file 2003: '/oradata01/yyjydb/temp_sjck_bz_02.dbf'
ORA-27072: File I/O error
Linux-x86_64 Error: 28: No space left on device
Additional information: 4
Additional information: 384
Additional information: -1
LGWR waiting for instance termination
Fri May 17 19:31:16 2019

3)问题分析
分析:从报错的信息可以看到
1、 报错的日志跟踪文件为:/u01/app/oracle/diag/rdbms/yyjydb/yyjydb/trace/yyjydb_dbw0_3684.trc
2、报错具体信息为:ORA-01110: data file 2003: ‘/oradata01/yyjydb/temp_sjck_bz_02.dbf’
ORA-27072: File I/O error
3、Linux-x86_64 Error: 28: No space left on device
4、因为以上报错所以实例将停止:LGWR waiting for instance termination

4)临时表空间
现象:Oracle数据库在创建 temporary 表空间或给 temporary 表空间添加 tempfile 数据文件时,创建或添加的过程很快。

原因:Oracle数据库的 tempfile 数据文件是一类特殊的数据文件:稀疏文件(Sparse File),当 tempfile 文件创建时,它只会写入文件头部和最后块信息(only writes to the header and last block of the file),空间是延后分配的,只有在该临时表空间使用时对应的数据文件才会分配空间,即临时表空间的数据文件在服务器的空间大小是动态变化的;

5)磁盘空间查看

[oracle@server-xyjf ~]$ df -h
Filesystem                            Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup-LogVol00        1000G  284G  666G  30% /
tmpfs                                 505G   43M  505G   1% /dev/shm
/dev/sda1                             194M   64M  121M  35% /boot
/root/rhel-server-6.5-x86_64-dvd.iso  3.6G  3.6G     0 100% /media/rhel
/dev/mapper/oradata01vg-oradata01lv    12T   12T   17M 100% /oradata01
/dev/mapper/oradata02vg-oradata02lv    12T  450G   11T   4% /oradata02
[oracle@server-xyjf ~]$

说明:可以看到,/oradata01 路径空闲的空间只剩 17M,而 报错信息里提示的数据文件为 临时表空间 TEMP_SJCK_BZ 的tempfile 数据文件,单个tempfile 文件的大小为 30G,该路劲下的空闲空间不满足临时数据文件的扩展,导致了数据库实例停止;

6)问题解决
说明1:oracle数据库安装,出于数据库实例的安全考虑,将数据库软件、实例文件安装路劲和数据文件的路径隔离,数据文件一般选择磁盘挂载路径,如这里的 /oradata01/oradata02
数据文件迁移操作请查看:
https://blog.csdn.net/weixin_38938108/article/details/90716152

1.3、多监听配置

场景介绍:近期一个项目试点上线,使用的是Linux服务器上的oracle数据库,涉及源库和目标库的数据抽取;因为源库和目标库都是在内网环境,但是不在同一个网段,甲方出于安全考虑不允许开放1521端口的访问权限,后来技术人员给甲方提的方案是分别在源库和目标库配置双监听,源库和目标库的数据库访问使用1530端口,不使用默认的1521端口;

常用指令

  • 监听指令
# 查看监听状态
lsnrctl status

# 启动监听
lsnrctl start

# 停止监听
lsnrctl stop

# 监听重载
lsnrctl reload
  • 环境配置
[oracle@sjfxserv-etlnode ~]$ 
[oracle@sjfxserv-etlnode ~]$ echo $ORACLE_SID
gdshb
[oracle@sjfxserv-etlnode ~]$ 
[oracle@sjfxserv-etlnode ~]$ echo $ORACLE_BASE
/u01/app/oracle
[oracle@sjfxserv-etlnode ~]$ 
[oracle@sjfxserv-etlnode ~]$ echo $ORACLE_HOME
/u01/app/oracle/product/11.2.0/dbhome_2
[oracle@sjfxserv-etlnode ~]$ 

操作步骤

1、查看当前监听进程

[oracle@sjfxserv-etlnode ~]$ 
[oracle@sjfxserv-etlnode ~]$ ps -ef|grep LISTENER
oracle    44450      1  0 May11 ?        00:11:35 /u01/app/oracle/product/11.2.0/dbhome_2/bin/tnslsnr LISTENER -inherit
oracle   115978 104692  0 15:26 pts/1    00:00:00 grep LISTENER
[oracle@sjfxserv-etlnode ~]$

说明:可以看到这里当前只有一个别名为LISTENER的监听

2、 查看当前的监听端口

[oracle@sjfxserv-etlnode ~]$ lsnrctl status

LSNRCTL for Linux: Version 11.2.0.4.0 - Production on 30-MAY-2019 15:22:45

Copyright (c) 1991, 2013, Oracle.  All rights reserved.

Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))
STATUS of the LISTENER
------------------------
Alias                     LISTENER
Version                   TNSLSNR for Linux: Version 11.2.0.4.0 - Production
Start Date                11-MAY-2019 18:09:36
Uptime                    18 days 21 hr. 13 min. 9 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Log File         /u01/app/oracle/diag/tnslsnr/sjfxserv-etlnode/listener/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=sjfxserv-etlnode)(PORT=1521)))
Services Summary...
Service "gdshb" has 1 instance(s).
  Instance "gdshb", status READY, has 1 handler(s) for this service...
Service "gdshbXDB" has 1 instance(s).
  Instance "gdshb", status READY, has 1 handler(s) for this service...
The command completed successfully
[oracle@sjfxserv-etlnode ~]$

说明:该监听默认使用 1521 作为默认的连接端口;

3、配置端口号为1530的监听

[oracle@sjfxserv-etlnode ~]$ 
[oracle@sjfxserv-etlnode ~]$ netca

监听配置-图1
在这里插入图片描述
在这里插入图片描述
在这里插入图片描述
在这里插入图片描述
注意:这里需要使用1530作为新的监听端口;
在这里插入图片描述
说明:退出netca 监听配置界面;

4、再次查看监听进程

[oracle@sjfxserv-etlnode ~]$ 
[oracle@sjfxserv-etlnode ~]$ ps -ef|grep LISTENER
oracle    44450      1  0 May11 ?        00:11:35 /u01/app/oracle/product/11.2.0/dbhome_2/bin/tnslsnr LISTENER -inherit
oracle   107968      1  0 May29 ?        00:00:00 /u01/app/oracle/product/11.2.0/dbhome_2/bin/tnslsnr LISTENER_1530 -inherit
oracle   115978 104692  0 15:26 pts/1    00:00:00 grep LISTENER
[oracle@sjfxserv-etlnode ~]$

说明:可以看到这里新增了一个别名问 LISTENER_1530的监听进程;

5、配置 LISTENER_1530 监听的服务

[oracle@sjfxserv-etlnode ~]$
[oracle@sjfxserv-etlnode ~]$ netmgr

在这里插入图片描述
说明:点击 Local----->Listeners----->LISTENER_1530,在右侧选择下拉的 Database Services,配置 Global Database Name 和 SID 为实际的实例名,保存退出;

6、查看 LISTENER_1530 监听的状态

[oracle@sjfxserv-etlnode ~]$ lsnrctl status LISTENER_1530

LSNRCTL for Linux: Version 11.2.0.4.0 - Production on 30-MAY-2019 15:27:07

Copyright (c) 1991, 2013, Oracle.  All rights reserved.

Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=sjfxserv-etlnode)(PORT=1530)))
STATUS of the LISTENER
------------------------
Alias                     LISTENER_1530
Version                   TNSLSNR for Linux: Version 11.2.0.4.0 - Production
Start Date                29-MAY-2019 19:00:04
Uptime                    0 days 20 hr. 27 min. 2 sec
Trace Level               off
Security                  ON: Local OS Authentication
SNMP                      OFF
Listener Parameter File   /u01/app/oracle/product/11.2.0/dbhome_2/network/admin/listener.ora
Listener Log File         /u01/app/oracle/diag/tnslsnr/sjfxserv-etlnode/listener_1530/alert/log.xml
Listening Endpoints Summary...
  (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=sjfxserv-etlnode)(PORT=1530)))
  (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1530)))
Services Summary...
Service "gdshb" has 1 instance(s).
  Instance "gdshb", status UNKNOWN, has 1 handler(s) for this service...
The command completed successfully
[oracle@sjfxserv-etlnode ~]$

说明:可以看到别名为 1530 的监听状态正常;

7、重载监听配置文件

[oracle@sjfxserv-etlnode ~]$
[oracle@sjfxserv-etlnode ~]$ lsnrctl reload LISTENER_1530
[oracle@sjfxserv-etlnode ~]$

说明:如果监听状态 提示 The listener supports no services,则执行 lsnrctl reload LISTENER_1530 重新加载监听配置文件即可;

8、Plsql 连接测试

说明:连接成功,至此数据库双监听配置完成;

1.5、密码策略

Oracle的用户默认一般创建时使用默认策略配置文件,default profile

  • 关闭/调整密码重试次数限制
alter profile default limit FAILED_LOGIN_ATTEMPTS 30;

alter profile default limit FAILED_LOGIN_ATTEMPTS unlimited;

说明:当值是 unlimited时候,用户密码错误不会锁定用户;

======================================== over ========================================

  • 1
    点赞
  • 3
    收藏
    觉得还不错? 一键收藏
  • 1
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值