When ./runcluvfy.sh stage -pre crsinst -n demo1,demo2 -verbose reports a DNS error




I got that same error in 11.2.0.3. It's specific to OEL6 or RHEL6. I tracked down the Oracle bug by googling. This URL got me close:

http://logicalchaos.org/blog/2012/12/cluvfy-prvf-5637-dns-response-time-could-not-be-checked/

Here's the problem. Oracle verifies that the FQDNs of all nodes are in DNS via nslookup:

$ nslookup <node1's fqdn>
etc.

It also verifies a bogus node isn't in DNS via nslookup:
/usr/bin/nslookup unknown-not-reachable-node
...
** server can't find unknown-not-reachable-node: NXDOMAIN

The catch is the exit code. OEL 5.8 (incorrectly) returns an exit code of 0 in the above failing condition. OEL 6.3 (correctly) returns 1 in that condition.
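The difference in exit codes can be observed directly on each host. Here is a minimal sketch, assuming bash; the function name report_lookup_status and its two optional arguments are made up for illustration, and the default name is the bogus node shown above.

```shell
#!/bin/bash
# Run nslookup against a name that should not resolve and print the
# exit status it returns. report_lookup_status is a hypothetical helper.
#   $1: path to the nslookup binary (default: /usr/bin/nslookup)
#   $2: name to look up (default: unknown-not-reachable-node)
report_lookup_status() {
    local nslookup_bin="${1:-/usr/bin/nslookup}"
    local name="${2:-unknown-not-reachable-node}"
    # Discard the lookup output; only the exit status matters here.
    "$nslookup_bin" "$name" >/dev/null 2>&1
    echo "exit status: $?"
}
```

Calling report_lookup_status with no arguments should print "exit status: 0" on a host with the old behavior and "exit status: 1" on a host with the new one, going by the description above.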

According to the bind-utils RPM changelog, this nslookup change went in with version 9.8.2-0.9.rc1.

* Mon May 07 2012 Adam Tkac <atkac redhat com> 32:9.8.2-0.9.rc1
- fix race condition in the resolver module
- nslookup: return non-zero exit code when fail to get answer (#816164)

In OEL 5.8, the bind-utils RPM is version 9.3.6-20.P1.el5, which doesn't have the fix.
In OEL 6.3, the bind-utils RPM is version 9.8.2-0.10.rc1.el6, which has the fix.
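To check which side of the change a given host is on, one option is to search the installed package's changelog for the bug number quoted in the entry above. This is a rough sketch, assuming bash and an RPM-based system; changelog_has_nslookup_fix is a hypothetical helper name, and matching on "816164" is only a heuristic.

```shell
#!/bin/bash
# Read an RPM changelog on stdin and report whether it mentions the
# nslookup exit-code change (bug #816164 in the changelog entry above).
# changelog_has_nslookup_fix is a hypothetical helper, not a packaged tool.
changelog_has_nslookup_fix() {
    if grep -q '816164'; then
        echo "has fix"
    else
        echo "no fix"
    fi
}

# Typical use on the host being checked:
#   rpm -q bind-utils
#   rpm -q --changelog bind-utils | changelog_has_nslookup_fix
```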

The Oracle GRID installer mistakenly interprets OEL 6.3's exit code of 1 as a problem, when in fact it's the expected behaviour for a failed DNS lookup.

If I trick nslookup into returning the (incorrect) OEL 5.8 behavior, then the Oracle GRID installer will install.

# mv /usr/bin/nslookup /usr/bin/nslookup.orig
# echo '#!/bin/bash
/usr/bin/nslookup.orig "$@"
exit 0' > /usr/bin/nslookup
# chmod a+x /usr/bin/nslookup

Now when I run the Oracle GRID installer, it installs. cluvfy.sh succeeds too.
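Since the wrapper always exits 0, it hides real lookup failures from anything else on the host that checks nslookup's exit status, so it is probably worth putting the original binary back once the install is done. Here is a minimal sketch, assuming bash and the file names used in the wrapper commands; restore_nslookup and its directory argument are hypothetical.

```shell
#!/bin/bash
# Move the saved nslookup.orig back over the wrapper script.
# restore_nslookup is a hypothetical helper.
#   $1: directory holding nslookup and nslookup.orig (default: /usr/bin)
restore_nslookup() {
    local bindir="${1:-/usr/bin}"
    if [ -f "$bindir/nslookup.orig" ]; then
        # Overwrite the wrapper with the real binary.
        mv -f "$bindir/nslookup.orig" "$bindir/nslookup"
        echo "restored"
    else
        echo "nothing to restore"
    fi
}
```

Run as root, restore_nslookup with no arguments acts on /usr/bin.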