
APPLIES TO:

Oracle Server - Enterprise Edition - Version 10.2.0.1 and later
Information in this document applies to any platform.

PURPOSE

This note explains the purpose of the ONS daemon, how it is configured, and what needs to be checked when troubleshooting an ONS-related problem in an Oracle Clusterware installation.

SCOPE

DBAs and Oracle Clusterware installers

DETAILS

1. purpose of the ons daemon

The Oracle Notification Service (ONS) daemon is started by the Oracle Clusterware as part of the nodeapps. One ons daemon is started per clustered node. The daemon receives a subset of the published clusterware events, which serve two features:

a. the FAN or Fast Application Notification feature, which allows applications to respond
b. the Load Balancing Advisory (the RLB feature), the feature that permits load balancing

2. launching the ons daemon

The ons daemon is started as part of the nodeapps in the crs home environment, as user oracle. It is encapsulated as a resource whose settings can be seen via:

    crs_stat -p ora.<hostname>.ons    or    crsctl stat res ora.ons
    crs_getperm ora.<hostname>.ons    or    crs_getperm ora.ons

e.g.

    Name: ora.hostname.ons

The commands used by the clusterware to start, stop and ping the ons are 'onsctl start', 'onsctl stop' and 'onsctl ping'. The ons daemon can also be started and stopped on a single node via the clusterware commands.

3. configuration of the ons daemon prior to 11gR2

The configuration is stored in the <crs_home>/opmn/conf/ons.config file on all nodes, or in the OCR. The ons.config parameters are:

a. the tcp listening port parameters, e.g.

    localport=6101

The localport is used to communicate with local clients, e.g. the listeners on the server itself. The remoteport is used to communicate with remote ONS daemons, e.g. the ONS daemons running on the other node(s) of the cluster, or with ons clients (e.g. applications or listeners).

b. the "useocr" parameter, i.e.
    useocr=on

When useocr=on is set, the ons configuration in the OCR is read to define the servers that the ons daemon will contact so that they receive ons events from it. It is set via the ONS configuration assistant launched during the initial

    racgons add_config hostname1:6200 hostname2:6200

The hostname used needs to match the name returned by the OS command "hostname".

"racgons remove_config hostname1" deletes the ocr configuration (or to replace it

The command "onsctl debug" displays the ocr configuration, e.g.

    Number of onsconfiguration retrieved, numcfg = 2

All remote server connections are viewable in the server connections part of the 'onsctl debug' output, e.g. on a two-node RAC cluster, the remote node will appear:

    Server connections:

c. "loglevel" and "logfile", e.g.

    loglevel=3

loglevel specifies the level of messages that should be logged by ons. loglevel=3 is

d. optional parameter "usesharedinstall", which permits ons to start when a shared $CRS_HOME is used on all nodes of the Oracle Clusterware. The ons then appends the OS hostname to different files, like ons.log.<hostname> and .formfactor.<hostname>

    usesharedinstall=true

e. optional parameter "allowgroup", which permits installations done with oracle users other than the crs installing user to communicate with the Oracle Clusterware ons. For example, when the rdbms installation is done with user orardbms and the crs installation with user oracrs, the "allowgroup" parameter needs to be set to true to permit the orardbms listener to communicate with the oracrs ons daemon

    allowgroup=true

f. optional parameter "walletfile", used to set up SSL to secure the ONS communication via a wallet file.

4. configuration of the ons daemon in 11gR2

The ons configuration became dynamic in 11gR2, i.e.
the clusterware ons agent forces the usage of some parameters by dynamically changing the ons.config file, and provides a new command line interface, <grid_home>/opmn/bin/onsctli, to debug the ons daemon. srvctl is further enhanced to set the port values in the OCR instead of racgons (the useocr parameter is deprecated, and ons works as if useocr were enabled).

The clusterware ONS agent therefore creates the ons config file based on the current clusterware membership, and adds host and port information to it following the configuration stored in the OCR. This way, when a node joins the cluster, the ONS daemon on the new node can find the ONS servers already running and join the ONS network. The daemons already running learn about the new one when it contacts them.

See the command line help available via:

    srvctl modify nodeapps -h
    <grid home>/opmn/bin/onsctli help
    <grid home>/opmn/bin/onsctli usage
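The ons.config parameters described in this note are plain key=value lines, so a small script can check a file for parameters that the note lists as deprecated or obsolete in 11gR2 (useocr, loglevel). The following is a minimal sketch, not an Oracle tool: the function names and the sample values are hypothetical, and it assumes that '#' starts a comment, as in the "# line added by Agent" lines written by the agent.

```python
# Parameters this note describes as deprecated/obsolete in 11gR2.
OBSOLETE_IN_11GR2 = {"useocr", "loglevel"}


def parse_ons_config(text):
    """Parse ons.config-style text into a dict of parameter -> value.

    Assumption: one key=value pair per line, '#' starts a comment.
    """
    params = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split("=", 1)
        params[key.strip()] = value.strip()
    return params


def obsolete_params(params):
    """Return the sorted names of parameters flagged as obsolete."""
    return sorted(key for key in params if key in OBSOLETE_IN_11GR2)


if __name__ == "__main__":
    # Illustrative content only; real files differ per installation.
    sample = "localport=6101\nuseocr=on\nloglevel=3\n"
    params = parse_ons_config(sample)
    print(params["localport"])
    print(obsolete_params(params))
```

In a real check, the text would be read from <crs_home>/opmn/conf/ons.config or the 11gR2 equivalent instead of the inline sample.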
The following ons config parameters are fixed by the clusterware ons agent to the values below:

    localport=6201 # line added by Agent

The parameters logfile and walletfile can still be set as in previous releases, e.g. logfile can be used to change the default logfile location in $GRID_HOME/opmn/logs. The parameter loglevel is obsolete, like useocr. When obsolete parameters are detected, they are removed automatically.

4.1 Setting the localport and remoteport via srvctl in 11gR2

In 11gR2, the OCR contains both the remoteport and the localport, together with the EM port, and is maintained via srvctl.
4.2 to set the debug level (see onsctli usage for a detailed description) in 11gR2

onsctli can start and shut down the ons daemon. When in the <grid_home>,

5. ons clients/subscribers

Clients or subscribers connected to the ons are viewable via the SUBS column of the client connections

    Client connections:

5.1 the listeners are subscribers for the ons daemon

When the ons is started, the listeners register with the ons as client subscribers to all FAN and RLB events. Parameter SUBSCRIBE_FOR_NODE_DOWN_EVENT_<listener_name>=ON needs to be set in the

    WARNING: Subscription for node down event still pending

It is normally due to note:284602.1 and bug:4417761. When you start the listener using lsnrctl, environment variable ORACLE_CONFIG_HOME = {Oracle Clusterware HOME}

5.2 application clients/subscribers

With Oracle Database 10g Release 1, JDBC clients (both thick and thin driver) are integrated

6. the FAN and RLB events

ONS handles two types of events. The FAN events (or HA events) are meant for FAN processing. The RLB events are meant for workload management. When loglevel is set to 9, the events can be checked in the <crs home>/opmn/logs/ons.log files.

6.1 the FAN events

The FAN events (event type=database/event/service) are forwarded to the ons daemon by the racgimon (for pre-11gR2 databases) or by the 11gR2 agent and evmd clusterware processes. The main bugs that need to be fixed in this area are bug:13879428 (see note:1489751.1) and bug:6760284 RACGEVTF SOMTIMES DOES NOT SEND ONS EVENT.

6.1.1 FAN events forwarded to the ons daemon by 11gR2 agent or pre-11gR2 racgimon

The clusterware forwards instance and service up/down events to the ons daemon, e.g.

    ../opmn/logs> grep -E "body|VERSION" ons.log (loglevel=9)

6.1.2 FAN events forwarded from the evmd daemon to the ons daemon

This concerns the node down and public network down events.
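A node down event is logged in ons.log as a line of space-separated key=value fields, as in the sample quoted in this section. When checking many log lines, it can help to split such a payload into named fields. The following is a minimal sketch under the assumption that no value contains an embedded space (true for the sample, but not guaranteed for every FAN payload); the function name is hypothetical.

```python
def parse_fan_payload(payload):
    """Split a FAN event payload into a dict of field -> value.

    Assumption: fields are separated by whitespace and no value
    contains a space; tokens without '=' are ignored.
    """
    fields = {}
    for token in payload.split():
        if "=" in token:
            key, value = token.split("=", 1)
            fields[key] = value
    return fields


if __name__ == "__main__":
    # Payload taken from the ons.log sample in this note.
    line = ("VERSION=1.0 host=hostname incarn=100 "
            "status=nodedown reason=member_leave")
    event = parse_fan_payload(line)
    if event.get("status") == "nodedown":
        print("node down reported for", event["host"])
```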
The main bugs that need to be fixed in this area are bug:6083726 (see note:6083726.8), bug:9538932 REBOOTED SERVER NODE ONS DOES NOT SEND EVENTS TO CLIENT ONS and bug:6760284 RACGEVTF SOMTIMES DOES NOT SEND ONS EVENT.

When there is a node down event or a public vip network down event, the evmd will post an event to

e.g. ons.log with level=9 showing

    VERSION=1.0 host=hostname incarn=100 status=nodedown reason=member_leave

The FAN events are a subset of the EVM events (logged in the $CRS_HOME/evm/log/<hostname>_evmlog.<date> files). All evm events can be viewed via:

6.2 the RLB events

The RLB events (event type=database/event/servicemetrics/<service_name>) are sent by the racgimon on request of the MMON background process, e.g. Notification Type "database/event/servicemetrics/ALL" set via

Querying sys$service_metrics_tab shows the MMON events, logged every 30 seconds, e.g.

    SELECT user_data from SYS.SYS$SERVICE_METRICS_TAB order by 1 ;

    grep -E 'body|percent' ons.log (with loglevel=9)

shows the same events.

REFERENCES

NOTE:744849.1 - After CRS installation, ONS can not start on 1 node
NOTE:752595.1 - Questions about how ONS and FCF work with JDBC
NOTE:754619.1 - CRS-0215 / ONS Failed to Start. Pingwait Exited With Exit Status 2
BUG:5749953 - ONS SIGBUS ERROR AFTER INSTALL PATCHSET 10.2.0.3 FOR CRS
NOTE:284602.1 - 10g Listener: High CPU Utilization - Listener May Hang
NOTE:372959.1 - 'Warning: Subscription For Node Down Event Still Pending' In Listener Log
NOTE:433827.1 - How to Verify and Test Fast Connection Failover (FCF) Setup from a JDBC Thin Client Against a 10.2.x RAC Cluster
NOTE:5749953.8 - Bug 5749953 - Solaris: CRS crashes after applying 10.2.0.3 Patch Set
NOTE:6083726.8 - Bug 6083726 - ONS does not receive NODEDOWN event
NOTE:731370.1 - ONS consumes high CPU and/or Memory
NOTE:566573.1 - Fast Connection Failover (FCF) Test Client Using 11g JDBC Driver and 11g RAC Cluster
NOTE:1489751.1 - 10.2/11.1 service does not failover at instance crash in 11.2 GI env
BUG:9538932 - REBOOTED SERVER NODE ONS DOES NOT SEND EVENTS TO CLIENT ONS


