Bug No.: 2649244
Filed: 30-OCT-2002
Updated: 26-FEB-2004
Product: Oracle Server - Enterprise Edition
Product Version: 9.2.0.1.0
Platform: AIX 4.3 Based Systems (64-bit)
Platform Version: No Data
Database Version: 9.2.0.1.0
Affects Platforms: Generic
Severity: Severe Loss of Service
Status: Development to Q/A
Base Bug: N/A
Fixed in Product Version: 10.0
Problem statement:
JOBS DON'T RUN IN SCHEDULED INTERVALS
--------------------------------------------------------------------------------
*** 10/30/02 02:29 am ***

=========================
PROBLEM:

1. Clear description of the problem encountered:
   After a manual upgrade via export/import from 8.1.6.3.0 to 9.2.0.1.0, scheduled jobs are not running automatically. A job can still be run manually with dbms_job.run without any errors.

2. Pertinent configuration information (MTS/OPS/distributed/etc):
   N/A

3. Indication of the frequency and predictability of the problem:
   If we bounce the database, the jobs run, but after some time they stop running.

4. Sequence of events leading to the problem:
   a. Upgraded the database via export/import from 8.1.6.3.0 to 9.2.0.1.0.
   b. The jobs are not running as per the schedule.

5. Technical impact on the customer. Include persistent after effects:
   High. Automatic processes don't work.

=========================
DIAGNOSTIC ANALYSIS:

The job coordinator (CJQ) process is producing trace files. The stack trace in these files is:

   kkjqawi kkjawiq kkjex1s kkjcjexe kkjssrh ksbcti

=========================
WORKAROUND:

N/A

=========================
RELATED BUGS:

1782381

=========================
REPRODUCIBILITY:

1. Reproducible at customer site.
2. List the versions in which the problem has reproduced:
   9.2.0.1
3. List any versions in which the problem has not reproduced:
   N/A

=========================
TESTCASE:

N/A

=========================
STACK TRACE:

kkjqawi kkjawiq kkjex1s kkjcjexe kkjssrh ksbcti

=========================
SUPPORTING INFORMATION:

=========================
24 HOUR CONTACT INFORMATION FOR P1 BUGS:

=========================
DIAL-IN INFORMATION:

=========================
IMPACT DATE:

*** 10/30/02 02:39 am *** (CHG: Sta->16 Asg->NEW OWNER)
*** 10/30/02 02:43 am *** (CHG: G/P->G Asg->NEW OWNER)
*** 10/30/02 02:43 am *** (CHG: Asg->NEW OWNER)
*** 10/30/02 11:00 pm *** (CHG: Sta->10)
*** 10/30/02 11:00 pm ***
*** 10/31/02 04:46 am *** (CHG: Sta->16)
*** 11/01/02 02:01 am ***
*** 11/01/02 02:07 am *** (CHG: Sta->11 Asg->NEW OWNER)
*** 11/01/02 02:07 am ***
*** 11/03/02 09:27 pm *** (CHG: Asg->NEW OWNER)
*** 11/03/02 09:27 pm ***
*** 11/19/02 10:14 am *** ESCALATED
*** 11/19/02 10:14 am ***
Tracking tar is 2561941.1.
*** 11/21/02 06:19 pm *** (CHG: Asg->NEW OWNER)
*** 11/21/02 09:20 pm *** (CHG: DevPri->2)
*** 11/22/02 06:16 am *** (CHG: Sta->30)
*** 11/22/02 06:16 am ***
*** 11/22/02 07:49 am ***
*** 11/25/02 12:44 am *** (CHG: Sta->11)
*** 11/25/02 12:44 am ***
*** 11/25/02 03:46 am ***
Hi,

The customer sent the alert and trace files (ISPV.zip). I have uploaded this file to ess30.

The cjq process is running when the problem happens. The following output confirms this.

ISPV:/home/app/oracle> ps -aef |grep cjq
oracle 25454 1 0 07:03:42 - 0:00 ora_cjq0_SPTS
oracle 29412 1 0 Nov 04 - 0:10 ora_cjq0_TSPV
oracle 45816 54108 1 10:25:20 pts/0 0:00 grep cjq
oracle 51794 1 0 Nov 23 - 0:09 ora_cjq0_T
oracle 58808 1 0 04:43:47 - 0:01 ora_cjq0_ISPV

Thanks, Raj
*** 11/25/02 05:32 am *** (CHG: Sta->30)
*** 11/25/02 05:32 am ***
*** 11/25/02 05:58 am ***
*** 11/26/02 12:11 am *** (CHG: Sta->11)
*** 11/26/02 12:11 am ***
Uploaded the CJQ0 process state dump trace file ispv_cjq0_58808.zip to ess30.

Thanks, Raja
*** 11/27/02 06:19 am ***
*** 11/28/02 07:19 am *** (CHG: Sta->30)
*** 11/28/02 07:19 am ***
From jobs6.txt:

61 ISPV 27.10.02 14:05:53 28.10.02 14:05:53 N 0 pac_pred_ter.pr_predklad_ter; SYSDATE + 1
10 ISPV 27.10.02 04:00:03 28.10.02 04:00:00 N 0 dbms_refresh.refresh('"ISPV"."PVS_PREHLAD_VYROBY"'); TRUNC(SYSDATE+1,'DD')+4/24
22 SYSTEM 26.10.02 22:04:08 26.10.02 22:09:08 N 0 system.b; sysdate+5/1440

From the above output of job$, given by the customer (taken at 27.10.2002 15:49:30):
1. Not all jobs have a problem. Jobs 61 and 10 have run as expected.
2. Only job 22 has a problem. From the job$ output, I can see that other jobs have run after 26.10.02 22:09:08, but job 22 has not been selected to run since then.

Is this what the customer is complaining about? That some jobs don't run, while others do? Did they check whether dba_jobs_running had any rows at that time?

*** 11/28/02 10:56 pm ***
*** 11/28/02 11:38 pm ***
The customer has agreed to apply the diagnostic patch.

Thanks, Raja
*** 11/28/02 11:38 pm *** (CHG: Sta->11)
*** 11/29/02 01:31 am *** (CHG: Sta->30)
*** 11/29/02 01:31 am ***
*** 11/29/02 01:45 am *** (CHG: Sta->11)
*** 11/29/02 01:45 am ***
Uploaded file j2.lst.txt. This contains the output of the following queries:

1. select * from job$
2. select * from dba_jobs_running

*** 11/29/02 01:59 am *** (CHG: Sta->30)
*** 11/29/02 01:59 am ***
*** 12/03/02 09:00 pm ***
*** 12/04/02 01:24 am *** (CHG: Sta->11)
*** 12/05/02 03:49 am ***
*** 12/05/02 11:08 pm ***
*** 12/05/02 11:12 pm *** (CHG: Sta->52 G/P->P Asg->NEW OWNER)
*** 12/05/02 11:12 pm ***
*** 12/05/02 11:17 pm ***
*** 12/06/02 12:02 am ***
*** 12/06/02 04:22 am *** (CHG: Sta->30 Asg->NEW OWNER)
*** 12/06/02 04:22 am ***
*** 12/06/02 04:31 am *** (CHG: Sta->52 Asg->NEW OWNER)
*** 12/06/02 04:31 am ***
Sorry, the OS is the 64-bit version.

Thanks, Raja
*** 12/06/02 06:29 am *** (CHG: Sta->11 Asg->NEW OWNER)
*** 12/06/02 09:21 am ***
*** 12/06/02 09:22 am *** (CHG: Sta->30 Asg->NEW OWNER)
*** 12/06/02 09:22 am ***
*** 12/06/02 09:01 pm ***
*** 12/06/02 09:02 pm *** (CHG: G/P->G Asg->NEW OWNER)
*** 12/06/02 09:02 pm *** (CHG: Asg->NEW OWNER)
*** 12/12/02 12:46 am ***
The customer sent the traces and the following information on the TAR:

"Test was done on test database which is version 9.2.0.1.0. But our production databases are 9.2.0.2.0 now. Please, keep it in mind.
Not running job is job 22. It is my test job and his interval is 5 min. Last run was at 07:51:17 and never more. Job 564 ran at 07:53:15. job_queue_processes was set to 3 during the test and nothing was changed by alter system."

I have uploaded the file tspv.zip.

Thanks, Raja
*** 12/12/02 12:47 am *** (CHG: Sta->11)
*** 12/12/02 03:10 am *** (CHG: Sta->30)
*** 12/12/02 03:10 am ***
*** 12/12/02 03:32 am ***
Requested the customer to send the above information.
*** 12/12/02 10:35 pm *** (CHG: Sta->11)
*** 12/12/02 10:35 pm ***
Uploaded the file job_tspv.txt to ess30.

Thanks, Raja
*** 12/13/02 02:18 am ***
*** 12/13/02 04:07 am ***
*** 12/13/02 04:10 am ***
*** 12/15/02 09:36 pm ***
*** 12/16/02 01:12 am ***
*** 12/20/02 03:55 am ***
*** 12/20/02 08:59 pm ***
*** 12/23/02 01:11 am ***
*** 12/23/02 08:33 pm *** (CHG: Sta->80)
*** 12/23/02 08:33 pm *** (CHG: Confirmed Flag->Y)
*** 12/23/02 08:33 pm *** (CHG: Fixed->10.0)
*** 12/23/02 08:33 pm ***

Rediscovery Information:
If you find that some jobs (which are not broken) are not executing, while others continue to execute, then you are hitting this bug.

Workaround:
Set job_queue_processes=0, then set it back to the old value after some time.

Release Notes:
]] Some jobs did not execute in their scheduled interval.

*** 12/24/02 02:40 am ***
*** 12/24/02 02:45 am ***
*** 12/25/02 12:29 am ***
*** 01/02/03 09:26 am ***
*** 01/08/03 01:14 pm ***
*** 01/15/03 10:07 pm ***
*** 01/16/03 01:33 pm ***
*** 01/17/03 12:45 am ***
*** 01/17/03 12:46 am ***
*** 01/20/03 07:26 pm ***
*** 01/20/03 07:26 pm ***
*** 01/20/03 10:23 pm ***
*** 01/20/03 10:23 pm ***
*** 01/31/03 02:27 pm *** (CHG: Sta->11)
*** 01/31/03 02:27 pm ***
Can you please make this bug fix available for Tru64 as well?
*** 01/31/03 02:39 pm *** (CHG: Sta->80)
*** 01/31/03 02:39 pm *** (CHG: Fixed->10.0)
*** 02/20/03 02:13 am ***
*** 02/20/03 02:13 am ***
*** 03/25/03 06:44 pm ***
*** 04/01/03 04:40 am ***
*** 04/03/03 09:09 am ***
*** 04/08/03 05:14 am ***
*** 04/10/03 10:38 am ***
*** 04/25/03 05:25 pm ***
Applied ARU 4173678 Patch for Linux Intel using OPatch ver. 1.0.0.0.24 to a 9.2.0.2 database. I'm still getting the problem of scheduled database jobs not running automatically. No errors are written to the alert log and the job is not broken. Setting job_queue_processes from 10 to 0 fixes the problem temporarily, but after a non-deterministic period of time the job again stops running at its scheduled time.

I queried the dba_jobs_running view and my job ID showed up there. My guess is that the job has actually finished running but its entry in dba_jobs_running is not being removed. Hence the DB assumes that the job is still running and does nothing about it.

This is what the job runs every minute:

   declare
     errMsg varchar2(4000);
   begin
     queueScriptRunner('dbodProvQueue.sh','dbodProvQueue.log');
     commit;
   exception
     when others then
       errMsg := sqlerrm;
       rollback;
       insert into errors values(sysdate,errMsg);
       commit;
   end;

It's basically a PL/SQL call to a function which invokes a Java Stored Procedure to run a shell script.

I ran ar -tv libserver9.a | grep kkj.o and got back the following:

   size: 49068 date: Apr 8 05:06 2003

I filed a separate bug (bug # 2910375) and was asked to re-open this bug should the patch for 9.2.0.2 not work.
*** 04/25/03 05:27 pm *** (CHG: Sta->11)
*** 04/27/03 10:36 pm *** (CHG: Sta->80)
*** 04/27/03 10:36 pm *** (CHG: Fixed->10.0)
*** 04/27/03 10:36 pm ***
*** 04/28/03 10:00 am ***
I was not aware that status 80 bugs should not be re-opened, as I was asked to re-open the base bug. Thanks for clearing things up.
Will re-open bug 2910375.
*** 04/29/03 05:22 am ***
*** 05/05/03 03:57 pm ***
*** 05/22/03 09:35 pm ***
*** 05/27/03 02:40 am ***
*** 09/05/03 10:48 am ***
*** 10/09/03 02:18 pm ***
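
Note on the rediscovery check: the reasoning applied to the jobs6.txt rows in the 11/28/02 analysis can be sketched as below. This is only an illustration in Python, not Oracle code and not part of any patch. The column meanings (job number, owner, last run, next run, broken flag) are an assumption read off the rows quoted in this bug, and the names JobRow, parse_row and overdue_jobs are hypothetical. A job is flagged when it is not marked broken and its next run time was already in the past when the snapshot was taken.

```python
from dataclasses import dataclass
from datetime import datetime

# Date format as it appears in the jobs6.txt rows (DD.MM.YY HH:MM:SS).
FMT = "%d.%m.%y %H:%M:%S"


@dataclass
class JobRow:
    """One row of the job$ listing; field meanings are assumed, not confirmed."""
    job: int
    owner: str
    last_date: datetime
    next_date: datetime
    broken: bool


def parse_row(job, owner, last_date, next_date, broken_flag):
    """Build a JobRow from the text values shown in the listing."""
    return JobRow(
        job=job,
        owner=owner,
        last_date=datetime.strptime(last_date, FMT),
        next_date=datetime.strptime(next_date, FMT),
        broken=(broken_flag == "Y"),
    )


def overdue_jobs(rows, snapshot):
    """Return job numbers that are not broken but whose next run time has passed."""
    return [r.job for r in rows if not r.broken and r.next_date < snapshot]


# The three rows quoted from jobs6.txt in this bug.
ROWS = [
    parse_row(61, "ISPV", "27.10.02 14:05:53", "28.10.02 14:05:53", "N"),
    parse_row(10, "ISPV", "27.10.02 04:00:03", "28.10.02 04:00:00", "N"),
    parse_row(22, "SYSTEM", "26.10.02 22:04:08", "26.10.02 22:09:08", "N"),
]

# Time at which the customer took the job$ output.
SNAPSHOT = datetime.strptime("27.10.02 15:49:30", FMT)

if __name__ == "__main__":
    print(overdue_jobs(ROWS, SNAPSHOT))  # prints [22]
```

With the quoted rows, only job 22 is flagged, which matches the analysis above. An overdue next run time alone is not conclusive: a job that is legitimately still executing looks the same, which is why the analysis also asks whether dba_jobs_running had rows at that time.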