aix io pacing oracle: AIX tuning notes

Latest recommended article published 2022-03-30 09:05:53

weixin_39743622


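Tuning to avoid DMS timeouts

The DMS (deadman switch) is a kernel extension that brings the system down before it crashes and produces a dump file for later analysis. To handle node failure correctly, a cluster has to decide whether a node is dead. During that decision the deadman switch relies on the failure detection parameters, so problems with I/O, memory, and similar resources can keep the cluster manager from handling node communication in time, and a cluster node is then wrongly brought down.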

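Causes of DMS timeouts:

The DMS usually fires for one of the following reasons:

a. An application runs at a higher priority than the clstrmgr daemon, so clstrmgr cannot reset the DMS counter in time.

b. Heavy I/O on the system leaves the CPU no time to service the clstrmgr daemon.

c. A memory leak or memory exhaustion.

d. Heavy system error log activity, for example a token-ring beaconing problem.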

Tuning:

1) Adjust the system I/O pacing high and low water marks

Officially recommended values:

HIGH water mark for pending write I/Os per file [33]

LOW water mark for pending write I/Os per file [24]

Current system values:

HIGH water mark for pending write I/Os per file [8193]

LOW water mark for pending write I/Os per file [4096]
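The comparison above could be scripted roughly as follows. This is a sketch, not a tested procedure: it assumes the two water marks are the maxpout and minpout attributes of sys0 (the attribute names behind the smit labels quoted above), and pacing_needs_change is a hypothetical helper name. It only prints the chdev command instead of running it.

```shell
#!/bin/sh
# Target I/O pacing values taken from the recommendation above.
TARGET_MAXPOUT=33
TARGET_MINPOUT=24

# pacing_needs_change CURRENT_MAX CURRENT_MIN
# Succeeds (exit 0) when either current value differs from its target.
pacing_needs_change() {
    [ "$1" -ne "$TARGET_MAXPOUT" ] || [ "$2" -ne "$TARGET_MINPOUT" ]
}

# Assumption: the water marks are the maxpout/minpout attributes of sys0.
# Only query the system when actually running on AIX.
if [ "$(uname -s)" = "AIX" ]; then
    cur_max=$(lsattr -E -l sys0 -a maxpout -F value)
    cur_min=$(lsattr -E -l sys0 -a minpout -F value)
    if pacing_needs_change "$cur_max" "$cur_min"; then
        # Print the command for review rather than applying it directly.
        echo "chdev -l sys0 -a maxpout=$TARGET_MAXPOUT -a minpout=$TARGET_MINPOUT"
    else
        echo "I/O pacing already at $TARGET_MAXPOUT/$TARGET_MINPOUT"
    fi
fi
```

With the current values listed above (8193 and 4096), the helper reports that a change is needed.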

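2) Increase the syncd flush frequency (system default: 60 seconds)

The current HA setup has not tuned this interval. Syncing more often reduces the amount of I/O written in each sync.

Current system value: 60s

Officially recommended value: 10s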
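A small sketch for checking which interval is currently configured. It assumes syncd is started from /sbin/rc.boot with the interval in seconds as its argument; that is the usual AIX layout, but it should be confirmed on the host. syncd_interval is a hypothetical helper name.

```shell
#!/bin/sh
# syncd_interval: read rc.boot-style text on stdin and print the number
# that follows the first "syncd" command word (the interval in seconds).
syncd_interval() {
    awk '/syncd +[0-9]+/ {
        for (i = 1; i < NF; i++)
            if ($i ~ /syncd$/) { print $(i + 1); exit }
    }'
}

# Assumption: on AIX the daemon is launched from /sbin/rc.boot.
if [ "$(uname -s)" = "AIX" ] && [ -r /sbin/rc.boot ]; then
    echo "current syncd interval: $(syncd_interval < /sbin/rc.boot)s (target: 10s)"
fi
```

For example, feeding the helper a line such as `nohup /usr/sbin/syncd 60 > /dev/null 2>&1 &` yields the interval 60.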

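3) Slow down the HA heartbeat failure detection rate (FDR; system default: normal)

When the system is under heavy I/O or short of memory it may be unable to answer HA heartbeats, and the faster the heartbeat check runs, the sooner the node is declared dead.

Current system value: normal

Recommended value: slow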

Network performance tuning

Current network parameter values:

udp_sendspace = 65536

udp_recvspace = 262144

tcp_recvspace = 262144

tcp_sendspace = 262144

sb_max = 1048576

ipqmaxlen = 100

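The official guidance is that udp_sendspace = 65536 is sufficient, but udp_recvspace should be 10 times udp_sendspace, that is 655360.

sb_max = 1048576 is already larger than that value, so it can stay as it is.

The host network parameters therefore need to be changed.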
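The arithmetic behind the target value can be sketched as follows. The numbers are the ones listed above; udp_recv_target is a hypothetical helper, and the comparison with sb_max reflects the fact that a socket buffer size cannot exceed sb_max.

```shell
#!/bin/sh
# Current values as listed above.
udp_sendspace=65536
sb_max=1048576

# Recommended udp_recvspace: 10 times udp_sendspace.
udp_recv_target() {
    echo $(( $1 * 10 ))
}

target=$(udp_recv_target "$udp_sendspace")

# A socket buffer cannot be larger than sb_max, so check before changing.
if [ "$target" -le "$sb_max" ]; then
    echo "udp_recvspace target $target fits under sb_max $sb_max"
else
    echo "raise sb_max above $target first"
fi
```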

no -p -o udp_recvspace=655360

This changes the value dynamically; after that, restarting the inetd process is enough.

LUN disk locking mechanism: reserve_lock

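On both hosts the powerdisk devices currently have reserve_lock set to yes, and a colleague has seen a continuous stream of disk errors on the hosts.

I suspect that whoever installed RAC followed a document downloaded from the internet rather than the official documentation, which did real damage. The Oracle documentation states that in an HACMP + RAC environment the PV attribute reserve_lock (reserve_policy) must be disabled so that multiple nodes can access the disk concurrently. There are countless cases of this in the industry, and leaving it unset has unpredictable consequences. Quoting the official text:

To enable simultaneous access to a disk device from multiple nodes, you must set the appropriate Object Data Manager (ODM) attribute listed in the following table to the value shown, depending on the disk type: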

Disk Type                                        Attribute        Value
SSA, FAStT, or non-MPIO-capable disks            reserve_lock     no
ESS, EMC, HDS, CLARiiON, or MPIO-capable disks   reserve_policy   no_reserve

To determine whether the attribute has the correct value, enter a command similar to the following on all cluster nodes for each disk device that you want to use:

# /usr/sbin/lsattr -E -l hdiskn

If the required attribute is not set to the correct value on any node, then enter a command similar to one of the following on that node:

■ SSA and FAStT devices

# /usr/sbin/chdev -l hdiskn -a reserve_lock=no

■ ESS, EMC, HDS, CLARiiON, and MPIO-capable devices

# /usr/sbin/chdev -l hdiskn -a reserve_policy=no_reserve
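Checking every disk by hand is tedious, so the lsattr step quoted above could be wrapped in a loop. The following is a minimal sketch under stated assumptions: reserve_status is a hypothetical helper, it assumes lsattr -E prints the attribute name in the first column and its value in the second, and it only reports the state without changing anything.

```shell
#!/bin/sh
# reserve_status: classify one disk from `lsattr -E -l hdiskN` output on stdin.
# Prints OK (attribute allows concurrent access), FIX (it does not),
# or UNKNOWN (neither reserve_lock nor reserve_policy is present).
reserve_status() {
    awk '
        $1 == "reserve_lock"   { seen = 1; if ($2 != "no") bad = 1 }
        $1 == "reserve_policy" { seen = 1; if ($2 != "no_reserve") bad = 1 }
        END {
            if (!seen)    print "UNKNOWN"
            else if (bad) print "FIX"
            else          print "OK"
        }'
}

# Only walk the disk list when actually running on AIX.
if [ "$(uname -s)" = "AIX" ]; then
    for disk in $(lsdev -C -c disk -F name); do
        echo "$disk: $(lsattr -E -l "$disk" | reserve_status)"
    done
fi
```

Disks reported as FIX are the candidates for the matching chdev command above; run the check on every cluster node, since the attribute is set per node.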
