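Nowadays a new system is usually installed by cloning an existing, already patched environment, because then there is no need to apply each required one-off patch individually after the installation, which makes the install fast and convenient. As for the problem described above, it may have been introduced when root.sh was run during the cloning process.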
MOS has a matching note and bug: Doc ID 1191067.1, BUG 7313884. The note says the fix is included in 11.2.0.2, yet my system is 11.2.0.4. The note reads as follows:
Due to bug 9446443, automatic OCR backups are incorrectly owned which is preventing CRSD from overwriting them.
Expected ownership and permission on Linux - all 7 of them:
-rw------- 1 root root 11640832 Aug 30 08:46 backup00.ocr
-rw------- 1 root root 11640832 Aug 30 04:46 backup01.ocr
-rw------- 1 root root 11640832 Aug 30 00:46 backup02.ocr
-rw------- 1 root root 11640832 Aug 30 00:46 day_.ocr
-rw------- 1 root root 11640832 Aug 29 00:46 day.ocr
-rw------- 1 root root 11640832 Aug 26 00:45 week_.ocr
-rw------- 1 root root 11640832 Aug 19 00:44 week.ocr
bug 9446443 is fixed in 11.2.0.2, 12.1.
It's recommended to apply patch to fix the issue, but if patch is unavailable, workaround is to change ownership and permission of all 7 automatic backup files manually. OCR should be owned by root, but depend on platform, group may or may not be root - you can check any randomly named backup file to identify what ownership and permission it should have; in example below:
-rw------- 1 root root 7143424 Aug 30 09:40 38455890.ocr
With this, please change all 7 automatic backup files to be owned by root:root with permission "-rw-------"
Following the note, and comparing it with the situation in my own environment, I checked the corresponding CRS log:
2016-03-16 06:24:59.079: [UiServer][12081]{1:19564:21073} Done for ctx=11191c2f0
2016-03-16 06:25:54.968: [ OCRRAW][3599]th_delete_backupfile: Failed to delete the backup file [/grid/product/11.2.0/gridhome_1/cdata/c4bidb-cluster/backup02.ocr] Retval:[-2]
2016-03-16 06:25:54.968: [ OCRSRV][3599]th_delete_backupfile: Failed to delete the backup file:[backup02.ocr] Location:[/grid/product/11.2.0/gridhome_1/cdata/c4bidb-cluster]
2016-03-16 06:25:55.026: [ OCRRAW][3599]proprbkp_rename: Failed to rename the backup file [/grid/product/11.2.0/gridhome_1/cdata/c4bidb-cluster/backup01.ocr] Retval:[1]
2016-03-16 06:25:55.026: [ OCRSRV][3599]th_rename_backupfile: Failed to rename the backup file:[backup01.ocr] Location:[/grid/product/11.2.0/gridhome_1/cdata/c4bidb-cluster]. Retval:[49]
2016-03-16 06:25:55.030: [ OCRRAW][3599]proprbkp_rename: Failed to rename the backup file [/grid/product/11.2.0/gridhome_1/cdata/c4bidb-cluster/backup00.ocr] Retval:[1]
2016-03-16 06:25:55.030: [ OCRSRV][3599]th_rename_backupfile: Failed to rename the backup file:[backup00.ocr] Location:[/grid/product/11.2.0/gridhome_1/cdata/c4bidb-cluster]. Retval:[49]
2016-03-16 06:25:55.033: [ OCRRAW][3599]proprbkp_rename: Failed to rename the backup file [/grid/product/11.2.0/gridhome_1/cdata/c4bidb-cluster/16654495.ocr] Retval:[1]
2016-03-16 06:25:55.033: [ OCRSRV][3599]th_rename_backupfile: Failed to rename the backup file:[16654495.ocr] Location:[/grid/product/11.2.0/gridhome_1/cdata/c4bidb-cluster]. Retval:[49]
2016-03-16 06:25:55.036: [ OCRSRV][3599]th_manipulate_backups: Failed to rename the temporary backup file [16654495.ocr].
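To see at a glance which backup files CRSD could not delete or rename, the log excerpt can be summarised with a small script. This is only a sketch: the function name `failed_backup_operations` is made up for illustration, and the regular expression is derived solely from the message wording in the excerpt above, so other Grid Infrastructure versions may phrase these messages differently.

```python
import posixpath
import re

# Pattern based only on the messages seen in the excerpt above, e.g.
#   "Failed to delete the backup file [/path/backup02.ocr]"
#   "Failed to rename the backup file:[backup01.ocr]"
#   "Failed to rename the temporary backup file [16654495.ocr]"
FAILURE = re.compile(
    r"Failed to (delete|rename) the (?:temporary )?backup file:? ?\[([^\]]+)\]"
)


def failed_backup_operations(log_text):
    """Return unique (action, file name) pairs, in order of first appearance."""
    # Collapse all whitespace so that entries wrapped across lines still match.
    flat = " ".join(log_text.split())
    seen = []
    for match in FAILURE.finditer(flat):
        entry = (match.group(1), posixpath.basename(match.group(2)))
        if entry not in seen:
            seen.append(entry)
    return seen


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python ocr_failures.py < crsd.log
    for action, name in failed_backup_operations(sys.stdin.read()):
        print(f"{action}: {name}")
```

Because whitespace is collapsed first, the script works the same whether the log entries are on one line each or wrapped by a terminal or a copy and paste.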
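The log shows that during the automatic OCR backup, CRS needs to delete the old files and create new ones, but those operations fail, so the backup is left under a newly generated default file name (such as 16654495.ocr) instead.
The CRS log from 10.2.0.4 arguably makes this clearer, because it points directly at a permission problem: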
[ OCRAPI][4651]a_check_permission_int: Other doesn't have permission
[ OCRAPI][5679]a_check_permission_int: Other doesn't have permission
[ OCRAPI][4137]a_check_permission_int: Other doesn't have permission
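From the output listed above, the problem can be attributed to file permissions. It is not the bug mentioned in this article, but purely a permission problem.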
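One way to confirm this kind of diagnosis is to compare each automatic backup file against the ownership and mode the MOS note expects. The sketch below is an illustration, not an Oracle tool: the function name `find_bad_backups` and its parameters are assumptions, the file names are the seven listed in the note, and the group check is optional because the note says the group depends on the platform.

```python
import os
import stat

# The seven automatic backup names listed in the MOS note.
AUTO_BACKUPS = (
    "backup00.ocr", "backup01.ocr", "backup02.ocr",
    "day.ocr", "day_.ocr", "week.ocr", "week_.ocr",
)


def find_bad_backups(backup_dir, expected_uid=0, expected_gid=None,
                     expected_mode=0o600):
    """Return (file name, problem) pairs for backups that deviate.

    expected_uid=0 means root. expected_gid=None skips the group check,
    since the correct group is platform dependent.
    """
    problems = []
    for name in AUTO_BACKUPS:
        path = os.path.join(backup_dir, name)
        if not os.path.exists(path):
            continue  # a missing backup is not an ownership problem
        st = os.stat(path)
        mode = stat.S_IMODE(st.st_mode)
        if st.st_uid != expected_uid:
            problems.append((name, f"uid {st.st_uid}, expected {expected_uid}"))
        if expected_gid is not None and st.st_gid != expected_gid:
            problems.append((name, f"gid {st.st_gid}, expected {expected_gid}"))
        if mode != expected_mode:
            problems.append((name, f"mode {mode:o}, expected {expected_mode:o}"))
    return problems


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python check_ocr_backups.py <backup directory>
    for name, problem in find_bad_backups(sys.argv[1]):
        print(f"{name}: {problem}")
```

The script only reads file metadata, so it is safe to run before changing anything. It may need to be run as root if the backup directory itself is not readable by other users.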
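The fix was to change the ownership of the default-named backup files to root:system and to manually delete the number{n}.ocr files.
After that, the backup taken every 4 hours ran normally and the cluster state was normal.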
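The manual fix can be sketched as a dry run that only prints the commands it would suggest, so they can be reviewed before anything is run as root. Everything here is an assumption for illustration: the function name `plan_cleanup`, the default owner `root:system` (taken from my environment; the MOS note uses root:root on Linux), and mode 600, which comes from the note rather than from anything I verified separately.

```python
import re
import shlex

# The seven automatic backup names listed in the MOS note.
AUTO_BACKUPS = {
    "backup00.ocr", "backup01.ocr", "backup02.ocr",
    "day.ocr", "day_.ocr", "week.ocr", "week_.ocr",
}

# Leftover backups with a purely numeric name, e.g. 16654495.ocr.
NUMBERED_BACKUP = re.compile(r"^\d+\.ocr$")


def plan_cleanup(filenames, backup_dir, owner="root:system"):
    """Return shell commands as strings; nothing is executed here."""
    commands = []
    for name in sorted(filenames):
        path = shlex.quote(f"{backup_dir.rstrip('/')}/{name}")
        if name in AUTO_BACKUPS:
            # Ownership as used in this environment, mode per the MOS note.
            commands.append(f"chown {owner} {path}")
            commands.append(f"chmod 600 {path}")
        elif NUMBERED_BACKUP.match(name):
            commands.append(f"rm {path}")
        # Any other file is left alone.
    return commands


if __name__ == "__main__":
    import os
    import sys

    # Usage (hypothetical): python plan_ocr_cleanup.py <backup directory>
    backup_dir = sys.argv[1]
    for command in plan_cleanup(os.listdir(backup_dir), backup_dir):
        print(command)
```

Before removing the numbered files, it is worth noting their ownership and mode, since the MOS note suggests using such a file as the reference for what the automatic backups should look like on a given platform.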