Troubleshooting Steps
Why does the problem occur?
This can be due to many reasons: WebLogic Server itself, user created threads, bad coding practices, or 3rd party software. Unfortunately, proving where this happens can sometimes be difficult. This pattern will attempt to help troubleshoot this problem by using operating specific commands and gathering data.
Gathering Data for High CPU Usage
For operating specific information on gathering data for High CPU usage, follow the steps below depending upon your operating system.
This mapping will be available from 70SP4RP2 and 81SP2RP1 of JRockit onward. For example, on Linux the thread dump (SeeNote 1098691.1 for instructions to create a thread dump) will look like the following in those releases forward (the PID is printed in the thread dump):
thread: 0x469b0af0 lastj: 0xac0f19c
pt_thr: 237596 pid: 23166
at COM.jrockit.vm.Classes.defineClass0(Native Method)@0x8b4b798
at COM.jrockit.vm.Classes.defineClass(Unknown Source)@0x8b4b8b1
at java.lang.ClassLoader.defineClass(Unknown Source)@0x8b4b46f
In the above, the PID is 23166 which you directly relate to the PID from top (or whatever specific command you need to use on your operating system) output on Linux or whatever operating system you are on.
If you are using JRockit 70SP4 or higher on Windows the thread dump will look like this:
thread: 031FB368 lastj: 7FE30EBC
pt_thr: 0000064C
at java.lang.Thread.doYield(Native Method)@0DC40AC8
at java.lang.Object.wait0(Unknown Source)@0DC42163
at weblogic.kernel.ExecuteThread.waitForRequest(ExecuteThread.java:176)@102A11EC
at weblogic.kernel.ExecuteThread.run(ExecuteThread.java:202)@102A0E71
at java.lang.Thread.startThreadFromVM(Unknown Source)@0DC43775
--- End of stack trace
In the above on Windows, you can take the "pt_thr" and convert that to decimal. You can then relate that decimal number to the output obtained from pslist to match the thread that is taking up the most CPU on Windows.
Conversion to Hexadecimal Number
dec2hex.sh:
Usage:
dec -> hex: 755 = 2f3
Solaris
- Run the prstat command on the Java process. Do this for several iterations so you can see a pattern. Example:
prstat -L -p <PID> 1 1NOTE: If you do not have prstat available on your Solaris installation, you can do the following instead:
ps -Le -o pid,user,s,lwp,pcpu,args | awk '$3 != "S" { print }'where the output will show the PID, user, state, and the lightweight thread number. An example output is something like this:
0 root T 1 schedYou can then use the information from the fourth column of the output for the LWP number in the next steps.
2769 root O 1 ps -Le -o pid,user,s,lwp,args
3383 usera R 27 /wwsl/sharedInstalls/solaris/j2sdk1.4.2_04/bin/java -client -Xms32m -Xmx200m -X
3383 usera R 31 /wwsl/sharedInstalls/solaris/j2sdk1.4.2_04/bin/java -client -Xms32m -Xmx200m -X
2770 usera R 1 awk $3 != "S" { print } - Run the pstack command on the Java process to get a mapping of the Light Weight Process (LWP) to the PID (Process ID). Example:
pstack 9499and redirect the output to a file.
If you use regular thread libraries (i.e., you do not have /usr/lib/lwp in the LD_LIBRARY_PATH) in Solaris, a LWP is not mapped to OS thread directly, so you have to take pstack from the process (so check to see if you are using the alternate thread library). - Get several thread dumps of the server over time to make sure you get the right threads executing. You can do this by doing a
kill -3 <PID>
on the Java Process. - Map the LWP id to the Java Thread id.
For example, if the LWP in question is "8" it could map to the Java thread "76". Then calculate the hex value of 76 to be 0x4c. - Examine in the thread dump to find the thread that matched "nid= <to that identifier/value.
For this example, you would find the thread that matchednid=0x4c
and this would be the thread that is consuming the CPU. - You will then need to:
- Determine why this is happening in your code
- OR if the top of the stacktrace is from WebLogic contact Oracle Customer Support.
The following is an example of the above process for Solaris
- Run the prstat command on the Java process.
$ prstat -L -p 9499 1 1
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/LWPID
9499 usera 153M 100M sleep 58 0 0:00.22 0.6% java/8
9499 usera 153M 100M sleep 58 0 0:00.10 0.2% java/10
9499 usera 153M 100M sleep 58 0 0:00.11 0.1% java/9
9499 usera 153M 100M sleep 58 0 0:00.03 0.0% java/5
9499 usera 153M 100M sleep 58 0 0:01.01 0.0% java/1
9499 usera 153M 100M sleep 58 0 0:00.00 0.0% java/12
9499 usera 153M 100M sleep 58 0 0:00.00 0.0% java/11
9499 usera 153M 100M sleep 58 0 0:00.00 0.0% java/14
9499 usera 153M 100M sleep 58 0 0:00.00 0.0% java/13
9499 usera 153M 100M sleep 59 0 0:00.07 0.0% java/7
9499 usera 153M 100M sleep 59 0 0:00.00 0.0% java/6
9499 usera 153M 100M sleep 59 0 0:00.00 0.0% java/4
9499 usera 153M 100M sleep 58 0 0:00.11 0.0% java/3
9499 usera 153M 100M sleep 58 0 0:00.00 0.0% java/2 - Run the pstack command. Example:
pstack 9499and redirect the output to a file.
If you use regular thread libraries in Solaris (i.e., you do not have /usr/lib/lwp in the LD_LIBRARY_PATH), a LWP is not mapped to OS thread directly, so you have to take pstack from process (so check to see if you are using the alternate thread library).
The example above shows thejava/8
process at the top of prstat. - Examine the pstack output for
lwp# 8
.
You would find that this maps to "thread# 76" from the pstack output, as below.
----------------- lwp# 8 / thread# 76 --------------------
ff29d190 poll (e2e81548, 0, bb8)
ff24d154 select (0, 0, 0, e2e81548, ff2bf1b4, e2e81548) + 348
ff36b134 select (0, bb8, 7fffffff, fe4c8000, 0, bb8) + 34
fe0f62e4 __1cCosFsleep6FpnGThread_xl_i_ (0, bb8, fe4c8000, 1, 0, 1e2fd8) + 234
fe23f050 JVM_Sleep (2, 0, bb8, fe4de978, fe4c8000, 1e2fd8) + 22c
0008f7ac ???????? (e2e818d4, bb8, 1e2fd8, 984a4, 0, 109a0)
0008c914 ???????? (e2e8194c, 1, fe4d6a80, 98564, 8, e2e81868)
fe5324e8 __1cMStubRoutinesG_code1_ (e2e819d8, e2e81c10, a, f6cb5000, 4, e2e818f0) + 3ec
fe0cbe94 __1cJJavaCallsLcall_helper6FpnJJavaValue_pnMmethodHandle_pnRJavaCallArguments_pnGThread__v_ (e2e81c08,fe4c8000, e2e81b54, 1e2fd8, 8e764, e2e81c10) +308
fe1f6dbc __1cJJavaCallsMcall_virtual6FpnJJavaValue_nLKlassHandle_nMsymbolHandlee81c08, e2e81b54) + 150pnGThread__v_(f6cb64b8, e2e81b40, e2e81b44, fe4c8000, e2d8) + 60e_5pnGThread__v_ (e2e81c08, e2e81c04, e2e81c00,e2e81bf4, e2e81bec, 1e2f8000, e2e81d10, 1e, e) + 120FpnKJavaThread_pnGThread__v_ (f6817ff8, 1e2fd8, fe4c 7fd70) + 3d8cKJavaThreadDrun6M_v_ (e2e02000, fe4d3e34, fe4c8000, 7fd70, 1e2fd8, fe213ec8 _start (fe4c8000, fe625d10, 0, 5, 1, fe401000) + 20
ff36b728 _thread_start (1e2fd8, 0, 0, 0, 0, 0) + 40 - Get a thread dump of the server by doing:
kill -3 <PID>
on the Java Process. - Since lwp# 8 mapped to thread #76, you can calculate the hex value of 76 to be 4c. This maps to nid=0x4c in the JVM thread dump:
"Thread-6" prio=5 tid=0x1e2fd8 nid=0x4c waiting on monitor [0xe2e81000..0xe2e819d8]In this example, the thread using the most CPU is actually sleeping. The application poller runs when the server is started in Development Mode. It runs every 30 seconds, so it is apparent that the thread dump was not captured in time to see activity in this thread.
at java.lang.Thread.sleep(Native Method)
at weblogic.management.deploy.GenericAppPoller.run(GenericAppPoller.java:139)
Ideally, all three steps should be done quickly and in a row to capture the data as close together in time as possible. This can be done via a simple shell script like below:
#
# Takes an argument (PID of the WLS process) and loops three times. This will append the prstat information to a file called dump_high_cpu.txt. The thread dump information will either be in file where stdout was redirected or printed on the screen.
#
for loopnum in 1 2 3
do
prstat -L -p $1 1 1 >> dump_high_cpu.txt
pstack $1 >> dump_high_cpu.txt
kill -3 $1
echo "prstat, pstack, and thread dump done. #" $loopnum
sleep 1
echo "Done sleeping."
done - Then you can then examine that thread to determine what it is doing and if there is a problem.
Linux
- Use ps and grep to get the PID associated with your userID that started WLS that is using up the CPU.
- Find the thread id in the WLS process that is using the most CPU via
ps -eLo pid,ppid,tid,pcpu,comm | grep <PID>
. - Take several thread dumps of the WebLogic Server process via
kill -3 <PID>
. - Identify the stacktrace of the java thread with the high usage in the thread dumps, converting the thread id in the second step to a hex value if using an older Linux version.
- Search the thread dumps for the corresponding value
nid
for older Linux versions ortid
for newer versions. This will give you the stacktrace of the thread that is causing the High CPU issue.
The following is an example of the above process for Linux.
- Use ps and grep to get the PID associated with your userID that started WLS that is using up the CPU.
ps -ef | grep javaIn this case the actual java process is 806.
owner 806 766 86 19:06 pts/0 00:00:13 /wwsl/sharedInstalls/linux/wls1032/jrockit_160_14_R27.6.5-32/bin/java -jrockit -Xms512m -Xmx512m -Dweblogic.Name=AdminServer -Djava.security.policy=/wwsl/sharedInstalls/linux/wls1032/wlserver_10.3/server/lib/weblogic.policy -Xverify:none -da -Dplatform.home=/wwsl/sharedInstalls/linux/wls1032/wlserver_10.3 -Dwls.home=/wwsl/sharedInstalls/linux/wls1032/wlserver_10.3/server -Dweblogic.home=/wwsl/sharedInstalls/linux/wls1032/wlserver_10.3/server -Dweblogic.management.discover=true -Dwlw.iterativeDev= -Dwlw.testConsole= -Dwlw.logErrorsToConsole= -Dweblogic.ext.dirs=/wwsl/sharedInstalls/linux/wls1032/patch_wls1032/profiles/default/sysext_manifest_classpath weblogic.Server
owner 842 717 0 19:06 pts/1 00:00:00 grep java - Find the thread id in the WLS process that is using the most CPU via
ps -eLo pid,ppid,tid,pcpu,comm | grep <PID>
. Run the command:
watch "ps -eLo pid,ppid,tid,pcpu,comm | grep 806"
Watch will run the command continuously. CPU utilization will be displayed in the forth column with the corresponding thread id shown in the third (tid) column of the output. Note the thread id that consistently shows the highest value for pcpu.
806 766 806 0.0 java 806 766 807 2.3 java 806 766 808 0.0 (Signal Handler 806 766 809 0.0 (GC Main Thread 806 766 810 0.0 (GC Worker Thre 806 766 811 0.0 (GC Worker Thre 806 766 812 0.0 (Code Generatio 806 766 813 0.0 (Code Optimizat 806 766 814 0.0 (VM Periodic Ta 806 766 815 0.0 Finalizer 806 766 816 0.0 Reference Handl 806 766 817 0.0 (Sensor Event T 806 766 818 0.0 Timer-0 806 766 819 0.0 Timer-1 806 766 834 3.7 [ACTIVE] Execut
In this case thread id 834 is the highest usage thread.
If the process is using lots of threads, then just using watch may not be useful to identify the offending thread. In such cases please use the following method:
First: Make sure the high CPU symptom is present and execute the following command to collect the process threads:
$ ps -eLo pid,ppid,tid,pcpu,comm | grep <pid> > <pid.out>For example:
$ ps -eLo pid,ppid,tid,pcpu,comm | grep 18354 > 18354.outSecond: Easily identify the highest CPU usage thread using this sorted output.
$ cat| awk '{ print "pccpu: "$4" pid: "$1" ppid: "$2" ttid: "$3" comm: "$5}' |sort -nFor example:
$ cat 18354.out | awk '{ print "pccpu: "$4" pid: "$1" ppid: "$2" ttid: "$3" comm: "$5}' |sort -nExample output:
...This output is sorted so that the highest usage thread will appear at the bottom. In this case Thread (ttid) 18838 is the highest usage thread.
pccpu: 0.1 pid: 18354 ppid: 1 ttid: 18364 comm: java
pccpu: 0.1 pid: 18354 ppid: 1 ttid: 18840 comm: java
pccpu: 0.2 pid: 18354 ppid: 1 ttid: 18363 comm: java
pccpu: 0.2 pid: 18354 ppid: 1 ttid: 18835 comm: java
pccpu: 0.5 pid: 18354 ppid: 1 ttid: 19266 comm: java
pccpu: 10.8 pid: 18354 ppid: 1 ttid: 18838 comm: java
# - Take several thread dumps of the WebLogic Server process via
kill -3 <PID>
. This can be done via a simple shell script like below:
# # Takes an argument (PID of the WLS process) and loops three times. This will append the top information to a file called dump_high_cpu.txt. # The thread dump information will either be in file where stdout was redirected or printed on the screen. # for loopnum in 1 2 3 do top -b -n1 >> dump_high_cpu.txt kill -3 $1 echo "cpu snapshot and thread dump done. #" $loopnum sleep 1 echo "Done sleeping." done
- Identify the stacktrace of the java thread with the high usage in the thread dumps, converting the thread id in the second step to a hex value if using an older Linux version.
Older Linux
If the nid appears in the thread dump and is in hexadecimal format for example:
"ExecuteThread: '0' for queue: 'default'" daemon prio=1 tid=0x83da550 nid=0x59b2 waiting on monitor [0x56138000..0x56138870]
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:415)
at weblogic.kernel.ExecuteThread.waitForRequest(ExecuteThread.java:146)
at weblogic.kernel.ExecuteThread.run(ExecuteThread.java:172)
Convert the Thread ID (tid) you got from step 2 to hexadecimal and then search for the nid with this value in the thread dump.
For example, if you get the thread id '5828' in the third column of the above ps command, convert '5828' to hexadecimal: 5828 (dec) corresponds to 0x16c4 (hex). Now search for nid=0x16c4 in the thread dump, and you may find the high usage thread:
"RMIServer [/0.0.0.0:12401] count:1" prio=1 tid=0xa7c1ee80 nid=0x16c4 runnable [a777e000..a777e22c]
at java.net.PlainSocketImpl.socketAccept(Native Method)
at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:353)
- locked <0xac0d5b50> (a java.net.PlainSocketImpl)
at java.net.ServerSocket.implAccept(ServerSocket.java:448)
at java.net.ServerSocket.accept(ServerSocket.java:419)
at com.evermind.server.rmi.RMIServer.run(RMIServer.java:464)
at com.evermind.util.ReleasableResourcePooledExecutor$MyWorker.run(ReleasableResourcePooledExecutor.java:186)
at java.lang.Thread.run(Thread.java:534)
Newer Linux
If the nid does not appear in the thread dump and the tid value in the thread dump is in decimal format e.g.:
"ExecuteThread: '1' for queue: 'weblogic.socket.Muxer'" id=23 idx=0x54 tid=827 prio=5 alive, in native, daemon
at jrockit/ext/epoll/EPoll.epollWait0(IIII)I(Native Method)
at jrockit/ext/epoll/EPoll.epollWait(EPoll.java:103)
at weblogic/socket/EPollSocketMuxer.processSockets(EPollSocketMuxer.java:156)
^-- Holding lock: java/lang/String@0xc446f00[thin lock]
at weblogic/socket/SocketReaderRequest.run(SocketReaderRequest.java:29)
Search for the tid with the value gotten in step 2 in the thread dump.
For example, if you get the tid '834' in the third column of step 2 search for tid=834 in the thread dump, and you may find something like:
"DynamicListenThread[Default]" id=32 idx=0x70 tid=834 prio=9 alive, in native, daemon
at java/net/PlainSocketImpl.socketAccept(Ljava/net/SocketImpl;)V(Native Method)
at java/net/PlainSocketImpl.accept(PlainSocketImpl.java:390)
^-- Holding lock: java/net/SocksSocketImpl@0x12553730[biased lock]
at java/net/ServerSocket.implAccept(ServerSocket.java:453)
at java/net/ServerSocket.accept(ServerSocket.java:421)
at weblogic/socket/WeblogicServerSocket.accept(WeblogicServerSocket.java:38)
at weblogic/server/channels/DynamicListenThread$SocketAccepter.accept(DynamicListenThread.java:528)
at weblogic/server/channels/DynamicListenThread$SocketAccepter.access$200(DynamicListenThread.java:420)
at weblogic/server/channels/DynamicListenThread.run(DynamicListenThread.java:171)
at java/lang/Thread.run(Thread.java:619)
at jrockit/vm/RNI.c2java(IIIII)V(Native Method)
-- end of trace - Determine why this is happening in your code, or if the top of the stacktrace is from WebLogic Server contact Oracle Customer Support.
AIX
- Do:
ps -mp <WLS_PID> -o THREAD
to find the tid that is using CPU. You should see the "CP" column (for CPU usage) for one of the threads having a high value. - Take a thread dump of the server by doing:
kill -3 <WLS_PID> - Get the tid from Step 1 which has a high CPU value. Change the tid decimal number to a hex value.
- Grep for the hex value obtained in Step 3 in the java core thread dump text file which is generated by the
kill -3
command in Step 2. - Determine why this is happening in your code, or if the top of the stacktrace is from WebLogic, contact Oracle Support.
The following is an example of the above process for AIX.
ps -mp 250076 -o THREAD
will show the following:
USER PID PPID TID ST CP PRI SC WCHAN F TT BND COMMAND
usera 250076 217266 - A 38 60 72 * 242011 pts/0 - /wwsl/sharedInstalls/aix/jdk130/...
- - - 315593 Z 0 97 1 - c00007 - - -
- - - 344305 S 0 60 1 f1000089c020e200 400400 - - -
- - - 499769 S 0 60 1 f1000089c0213a00 400400 - - -
- - - 540699 S 0 60 1 f100008790008440 8410400 - - -
- - - 544789 S 0 60 1 f100008790008540 8410400 - - -
- - - 548883 S 0 60 1 f100008790008640 8410400 - - -
- - - 552979 S 0 60 1 f100008790008740 8410400 - - -
- - - 565283 Z 0 60 1 - c00007 - - -
- - - 585783 S 0 60 1 f100008790008f40 8410400 - - -
- - - 589865 Z 0 80 1 - c00007 - - -
- - - 593959 S 1 60 1 f100008790009140 8410400 - - -
- - - 610365 S 0 60 1 f100008790009540 8410400 - - -
- - - 614453 S 0 60 1 f100008790009640 8410400 - - -
- - - 618547 S 0 60 1 f100008790009740 8410400 - - -
- - - 622645 S 0 60 1 f100008790009840 8410400 - - -
- - - 626743 S 0 60 1 f100008790009940 8410400 - - -
- - - 630841 S 0 60 1 f100008790009a40 8410400 - - -
- - - 634941 S 0 60 1 f100008790009b40 8410400 - - -
- - - 639037 S 0 60 1 f100008790009c40 8410400 - - -
- - - 643135 S 0 60 1 f100008790009d40 8410400 - - -
- - - 647233 S 0 60 1 f100008790009e40 8410400 - - -
- - - 651331 S 0 60 1 f100008790009f40 8410400 - - -
- - - 655429 S 0 60 1 f10000879000a040 8410400 - - -
- - - 659527 S 0 60 1 f10000879000a140 8410400 - - -
- - - 663625 S 0 60 1 f10000879000a240 8410400 - - -
- - - 909509 S 37 78 1 f1000089c020f150 400400 - - -
- - - 671821 S 0 60 1 f10000879000a440 8410400 - - -
- - - 675919 S 0 60 1 - 418400 - - -
- - - 680017 S 0 60 1 f10000879000a640 8410400 - - -
- - - 684115 S 0 60 1 f10000879000a740 8410400 - - -
- - - 688213 S 0 60 1 f10000879000a840 8410400 - - -
- - - 692311 S 0 60 1 f10000879000a940 8410400 - - -
- - - 696409 S 0 60 1 f10000879000aa40 8410400 - - -
- - - 712801 S 0 60 1 f10000879000ae40 8410400 - - -
- - - 716899 S 0 60 1 f10000879000af40 8410400 - - -
- - - 721011 S 0 60 1 f10000879000b040 8410400 - - -
- - - 725095 S 0 60 1 f10000879000b140 8410400 - - -
- - - 729193 S 0 60 1 f10000879000b240 8410400 - - -
- - - 733291 S 0 60 1 f10000879000b340 8410400 - - -
- - - 737389 S 0 60 1 f10000879000b440 8410400 - - -
- - - 741487 S 0 60 1 f10000879000b540 8410400 - - -
- - - 745585 S 0 60 1 f10000879000b640 8410400 - - -
- - - 749683 S 0 60 1 f10000879000b740 8410400 - - -
- - - 753781 S 0 60 1 f10000879000b840 8410400 - - -
- - - 757879 S 0 60 1 f10000879000b940 8410400 - - -
- - - 761977 S 0 60 1 f10000879000ba40 8410400 - - -
- - - 766075 S 0 60 1 f10000879000bb40 8410400 - - -
- - - 770173 S 0 60 1 f10000879000bc40 8410400 - - -
- - - 774271 Z 0 60 1 - c00007 - - -
- - - 778373 S 0 60 1 f10000879000be40 8410400 - - -
- - - 782467 S 0 60 1 f10000879000bf40 8410400 - - -
- - - 786565 S 0 60 1 f10000879000c040 8410400 - - -
- - - 790663 S 0 60 1 f10000879000c140 8410400 - - -
- - - 794761 S 0 60 1 f10000879000c240 8410400 - - -
- - - 798859 S 0 60 1 f10000879000c340 8410400 - - -
- - - 802957 S 0 60 1 f10000879000c440 8410400 - - -
- - - 807055 S 0 60 1 f10000879000c540 8410400 - - -
- - - 811153 S 0 60 1 f10000879000c640 8410400 - - -
- - - 815253 S 0 60 1 f10000879000c740 8410400 - - -
- - - 819357 S 0 60 1 f10000879000c840 8410400 - - -
- - - 823447 S 0 60 1 f10000879000c940 8410400 - - -
- - - 827545 S 0 60 1 f10000879000ca40 8410400 - - -
- - - 831643 S 0 60 1 f10000879000cb40 8410400 - - -
- - - 835741 S 0 60 1 f10000879000cc40 8410400 - - -
- - - 839839 S 0 60 1 f10000879000cd40 8410400 - - -
- - - 843937 S 0 60 1 f10000879000ce40 8410400 - - -
- - - 848037 S 0 60 1 f10000879000cf40 8410400 - - -
- - - 852135 S 0 60 1 f10000879000d040 8410400 - - -
- - - 856257 S 0 60 1 f10000879000d140 8410400 - - -
- - - 868527 S 0 60 1 f10000879000d440 8410400 - - -
- - - 872623 S 0 60 1 f10000879000d540 8410400 - - -
- - - 876725 S 0 60 1 f10000879000d640 8410400 - - -- Take a thread dump of that WLS_PID via
kill -3 <WLS_PID>
. - Examine the information from the
ps -mp <WLS_PID> -o THREAD
command. Notice that TID "909509" has a high number in the CP column (this happens to be "37" while all of the others are near 0). - Change the TID decimal number 909509 which has high CPU to a hex value: the result is DE0C5 hex value.
- Grep for DE0C5 hex value in the generated java core thread dump text file in Step 2. Below will be the thread corresponding to TID 909509. For example:
3XMTHREADINFO "[ACTIVE] ExecuteThread: '1' for queue: 'weblogic.kernel.Default (self-tuning)'" J9VMThread:0x346D4600, j9thread_t:0x33E29F0C, java/lang/Thread:0xB47A7728, state:R, prio=5
3XMTHREADINFO1 (native thread ID:0xDE0C5, native priority:0x5, native policy:UNKNOWN)
3XMTHREADINFO3 Java callstack:
4XESTACKTRACE at java/lang/Object.wait(Native Method)
4XESTACKTRACE at java/lang/Object.wait(Object.java:167(Compiled Code))
4XESTACKTRACE at weblogic/work/ExecuteThread.waitForRequest(ExecuteThread.java:155(Compiled Code))
4XESTACKTRACE at weblogic/work/ExecuteThread.run(ExecuteThread.java:178)
3XMTHREADINFO3 Native callstack:
4XENATIVESTACK _event_wait+0x2b8 (0xD011F340 [libpthreads.a+0x15340])
4XENATIVESTACK _cond_wait_local+0x4b0 (0xD012B208 [libpthreads.a+0x21208])
4XENATIVESTACK _cond_wait+0x48 (0xD012B840 [libpthreads.a+0x21840])
4XENATIVESTACK pthread_cond_wait+0x18c (0xD012C414 [libpthreads.a+0x22414])
4XENATIVESTACK (0xD210DCFC [libj9thr24.so+0x4cfc])
4XENATIVESTACK (0xD210DAA0 [libj9thr24.so+0x4aa0])
4XENATIVESTACK (0xD210DB0C [libj9thr24.so+0x4b0c])
4XENATIVESTACK (0xD2A0578C [libjclscar_24.so+0x2678c])
4XENATIVESTACK (0xD2A063A0 [libjclscar_24.so+0x273a0])
4XENATIVESTACK (0xD2095EB0 [libj9vm24.so+0xceb0])
4XENATIVESTACK (0xD211FA98 [libj9prt24.so+0x3a98])
4XENATIVESTACK (0xD2095DD0 [libj9vm24.so+0xcdd0])
4XENATIVESTACK (0xD210B7CC [libj9thr24.so+0x27cc])
4XENATIVESTACK _pthread_body+0x118 (0xD010D784 [libpthreads.a+0x3784])
NULL - Determine why this is happening in your code, or if the top of the stacktrace is from WebLogic, contact Oracle Support.
HP-UX
There is no command similar to prstat currently available from HP to gather the separate thread ids to translate them back to the thread dump. WebLogic Support has developed a simple utility that will show the Process ID (PID), Light Weight Process ID (LWPIDs) associated with the PID, the User time and the System time used. You can use this as a rough guide because as a LWPID uses more and more of the CPU in a High CPU Utilization case, you will see the User time increase over a small window of time. You can use this hp_prstat utility and take periodic measurements of the user time of each LWPID to see which is increasing over time. The user time can only be shown as a whole number, as the APIs provided by HP do not have a finer-grained measurement in this case.
To gather data for HP-UX:
- Click hp_prstat
to download the hp_prstat utility developed by WebLogic Support.
- Run the hp_prstat command on the Java process.
- Get several thread dumps of the server by doing:
kill -3 <PID>
on the Java Process. - Sometime later, take another
hp_prstat <PID>
snapshot. - Examine the output from both iterations of hp_prstat to find the offending LWPID that has its User time increasing rapidly.
- Once you have that number (LWPID), examine the thread dumps to match this to the "lwp_id=<Your LWPID>" on the thread dumps you have taken.
This will match to the offending thread that is using up the CPU. - Determine why this is happening in your code, OR if the top of the stacktrace is from WebLogic, contact OracleA Customer Support.
The following is an example of the above process for HP-UX:
- Run the hp_prstat command on the Java process. Example:
hp_prstat <PID>
- Do this several times, a couple of minutes apart while you are seeing the High CPU Utilization taking place.
Following is sample output:
lwpid pid pri status UsrTime SysTime
285365 4426 154 1 29 3
285381 4426 154 1 0 7
285382 4426 154 1 2 7
285383 4426 154 1 0 7
285384 4426 154 1 0 7
285385 4426 168 1 0 7
285386 4426 154 1 0 7
285387 4426 154 1 0 7
285388 4426 154 1 0 7
285389 4426 154 1 30 7
285404 4426 168 1 0 7
285405 4426 154 1 0 7
285406 4426 154 1 0 7
285407 4426 154 1 0 7
285408 4426 154 1 0 7
285409 4426 154 1 0 7
285410 4426 154 1 0 7
285411 4426 154 1 0 7
285412 4426 154 1 0 7
285413 4426 154 1 0 7
285414 4426 154 1 0 7
285415 4426 154 1 0 7
285416 4426 154 1 0 7
285417 4426 154 1 0 7
285418 4426 154 1 0 7
285419 4426 154 1 0 7
285420 4426 154 1 0 7
285421 4426 154 1 0 7
285422 4426 154 1 0 7
285423 4426 154 1 0 7
285424 4426 154 1 0 7
285425 4426 154 1 0 7
285426 4426 154 1 0 7
285427 4426 154 1 0 7
285428 4426 154 1 0 7
285429 4426 154 1 0 7
285430 4426 154 1 0 7
285431 4426 154 1 0 7
285432 4426 154 1 0 7
285433 4426 154 1 0 7
285434 4426 154 1 0 7
285435 4426 154 1 0 7
285436 4426 154 1 0 7
285439 4426 154 1 0 7
285441 4426 154 1 0 7
285442 4426 154 1 0 7
285443 4426 154 1 0 7
285444 4426 154 1 0 7
285445 4426 154 1 0 7
285446 4426 154 1 0 7
285449 4426 154 1 0 7
285450 4426 154 1 0 7
285451 4426 154 1 0 7
285452 4426 154 1 0 7
285453 4426 154 1 0 7
285454 4426 154 1 0 7
285455 4426 154 1 0 7
285456 4426 154 1 0 7
285457 4426 154 1 0 7
285458 4426 154 1 0 7
285459 4426 154 1 0 7
285460 4426 154 1 0 7
285461 4426 154 1 0 7
285462 4426 154 1 0 7
285463 4426 154 1 0 7
285464 4426 168 1 0 7
285468 4426 178 4 0 7
285469 4426 154 1 0 7
285470 4426 154 1 0 7
285471 4426 154 1 0 7
285472 4426 154 1 0 7
285473 4426 154 1 0 7
285475 4426 168 1 1 7
285477 4426 154 1 0 7
285478 4426 154 1 0 7 - Get a thread dump of the server by doing:
kill -3 <PID>
on the Java Process. - Sometime later, take another
hp_prstat <PID>
snapshot.
Notice that two of the LWPIDs (285475 and 285416) are a lot larger compared to the first snapshot. You will need to examine both.
lwpid pid pri status UsrTime SysTime
285365 4426 154 1 29 3
285381 4426 154 1 0 7
285382 4426 154 1 2 7
285383 4426 154 1 0 7
285384 4426 154 1 0 7
285385 4426 168 1 0 7
285386 4426 154 1 0 7
285387 4426 154 1 0 7
285388 4426 154 1 0 7
285389 4426 154 1 32 7
285404 4426 168 1 0 7
285405 4426 154 1 0 7
285406 4426 154 1 0 7
285407 4426 154 1 0 7
285408 4426 154 1 0 7
285409 4426 154 1 0 7
285410 4426 154 1 0 7
285411 4426 154 1 0 7
285412 4426 154 1 0 7
285413 4426 154 1 0 7
285414 4426 154 1 0 7
285415 4426 154 1 0 7
285416 4426 154 1 13 7
285417 4426 154 1 0 7
285418 4426 154 1 0 7
285419 4426 154 1 0 7
285420 4426 154 1 0 7
285421 4426 154 1 0 7
285422 4426 154 1 0 7
285423 4426 154 1 0 7
285424 4426 154 1 0 7
285425 4426 154 1 0 7
285426 4426 154 1 0 7
285427 4426 154 1 0 7
285428 4426 154 1 0 7
285429 4426 154 1 0 7
285430 4426 154 1 0 7
285431 4426 154 1 0 7
285432 4426 154 1 0 7
285433 4426 154 1 0 7
285434 4426 154 1 0 7
285435 4426 154 1 0 7
285436 4426 154 1 0 7
285439 4426 154 1 0 7
285441 4426 154 1 0 7
285442 4426 154 1 0 7
285443 4426 154 1 0 7
285444 4426 154 1 0 7
285445 4426 154 1 0 7
285446 4426 154 1 0 7
285449 4426 154 1 0 7
285450 4426 154 1 0 7
285451 4426 154 1 0 7
285452 4426 154 1 0 7
285453 4426 154 1 0 7
285454 4426 154 1 0 7
285455 4426 154 1 0 7
285456 4426 154 1 0 7
285457 4426 154 1 0 7
285458 4426 154 1 0 7
285459 4426 154 1 0 7
285460 4426 154 1 0 7
285461 4426 154 1 0 7
285462 4426 154 1 0 7
285463 4426 154 1 0 7
285464 4426 168 1 0 7
285468 4426 178 4 0 7
285469 4426 154 1 0 7
285470 4426 154 1 0 7
285471 4426 154 1 0 7
285472 4426 154 1 0 7
285473 4426 154 1 0 7
285475 4426 168 1 5 7
285477 4426 154 1 0 7
285478 4426 154 1 0 7 - Get another thread dump of the server by doing:
kill -3 <PID>
on the Java Process to make sure you are capturing the correct thread that is using up the CPU. - Get the LWPID from the hp_prstat output that looks like its UserTime is increasing. With this number (LWPID), examine the thread dumps to match this to the "lwp_id=<Your LWPID>" on the thread dumps you have taken.
The following two LWPIDs in question can be examined:
"Thread-6" prio=8 tid=0x0004f620 nid=75 lwp_id=285475 waiting on monitor [0x66d5e000..0x66d5e500]It appears the one that is really causing an issue is LWPID 285416.
at java.lang.Thread.sleep(Native Method)
at weblogic.management.deploy.GenericAppPoller.run(GenericAppPoller.java:139)
"ExecuteThread: '11' for queue: 'default'" daemon prio=10 tid=0x0004ad00 nid=23 lwp_id=285416 runnable [0x67874000..0x67874500]
at java.net.SocketOutputStream.socketWrite(Native Method)
at java.net.SocketOutputStream.write(Unknown Source)
at weblogic.servlet.internal.ChunkUtils.writeChunkTransfer(ChunkUtils.java:222)
at weblogic.servlet.internal.ChunkUtils.writeChunks(ChunkUtils.java:198)
at weblogic.servlet.internal.ChunkOutput.flush(ChunkOutput.java:285)
at weblogic.servlet.internal.ChunkOutput.checkForFlush(ChunkOutput.java:345)
at weblogic.servlet.internal.ChunkOutput.write(ChunkOutput.java:222)
at weblogic.servlet.internal.ChunkOutput.write(ChunkOutput.java:237)
at weblogic.servlet.internal.ChunkOutputWrapper.write(ChunkOutputWrapper.java:86)
at weblogic.servlet.internal.ChunkWriter.write(ChunkWriter.java:37)
at java.io.Writer.write(Unknown Source)
- locked <0x753408e8> (a weblogic.servlet.internal.ChunkWriter)
at java.io.PrintWriter.write(Unknown Source)
- locked <0x753408e8> (a weblogic.servlet.internal.ChunkWriter)
at java.io.PrintWriter.write(Unknown Source)
at java.io.PrintWriter.print(Unknown Source)
at java.io.PrintWriter.println(Unknown Source)
- locked <0x753408e8> (a weblogic.servlet.internal.ChunkWriter)
at examples.servlets.HelloWorldServlet.service(HelloWorldServlet.java:28)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:853)
at weblogic.servlet.internal.ServletStubImpl$ServletInvocationAction.run(ServletStubImpl.java:1058)
at weblogic.servlet.internal.ServletStubImpl.invokeServlet(ServletStubImpl.java:401)
at weblogic.servlet.internal.ServletStubImpl.invokeServlet(ServletStubImpl.java:306)
at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:5445)
at weblogic.security.service.SecurityServiceManager.runAs(SecurityServiceManager.java:780)
at weblogic.servlet.internal.WebAppServletContext.invokeServlet(WebAppServletContext.java:3105)
at weblogic.servlet.internal.ServletRequestImpl.execute(ServletRequestImpl.java:2588)
at weblogic.kernel.ExecuteThread.execute(ExecuteThread.java:213)
at weblogic.kernel.ExecuteThread.run(ExecuteThread.java:189) - You can look into the service method of that Servlet to find out what is happening around that line number (line 28 of HelloWorldServlet.java) and determine the problem.
Windows
Using pslist
You can use pslist and get thread details of the java process on Windows. pslist is available from:http://technet.microsoft.com/en-us/sysinternals/bb896682. pslist is part of the PsTools suite, which is available as a bundle fromhttp://technet.microsoft.com/en-us/sysinternals/bb896649.
- Run pslist -d <Java PID> and redirect the output to a file.
- Do this for several iterations so you can see a pattern.
You should see the "User Time" and "Kernel Time" increasing. - Take thread dumps of the WLS server over several iterations.
- Take the Thread ID # from Step 1 that seems to be increasing, change the decimal value to a hex value (You can use the Calc function in Windows).
- Examine the thread dump for the "nid=0x<Your Hex Value from Step 3>" until you find the offending thread.
- Determine why this is happening in your code, or if the top of the stacktrace is from WebLogic, contact Oracle Customer Support.
Using Process Explorer
You can also use Process Explorer from Sysinternals (available from http://technet.microsoft.com/en-us/sysinternals/bb896653). This tool visually shows the CPU usage dynamically. There is no logging or history in Process Explorer, so you will have to monitor this and write down the Thread ID from the Java process that is taking up the most CPU. In order to do this via Process Explorer:
- Find the java process, then right click and select properties.
- Click the Threads tab to bring up all the threads associated with this java process.
- You can click on one of the threads that are listed as "MSVCRT.dll+<Some hex offset>"
You can see the "Thread ID" listed in the pane below. - Observe which thread is using up the most CPU.
- Take thread dumps of the WLS server.
- Take the Thread ID # from Step 4 that seems to be using up CPU, change the decimal value to a hex value (You can use the Calc function in Windows).
- Examine the thread dump for the "nid=0x<Your Hex Value>" until you find the offending thread.
- Determine why this is happening in your code, or if the top of the stacktrace is from WebLogic, contact Oracle Customer Support.
The following is an example of these steps using only pslist and thread dumps.
- Run pslist -d 172
java 1720:
Tid Pri Cswtch State User Time Kernel Time Elapsed Time
1520 8 9705 Wait:UserReq 0:00:23.734 0:00:01.772 0:04:55.264
1968 9 2233 Wait:UserReq 0:00:04.606 0:00:00.040 0:04:54.874
1748 15 146 Wait:UserReq 0:00:00.010 0:00:00.010 0:04:54.863
1744 11 62 Wait:UserReq 0:00:00.010 0:00:00.000 0:04:54.853
1420 15 3 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:54.563
1856 15 7 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:54.563
1860 9 3157 Wait:UserReq 0:00:03.314 0:00:00.160 0:04:54.563
412 15 5888 Wait:DelayExec 0:00:00.000 0:00:00.000 0:04:54.553
1864 8 3 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:50.567
1660 15 61 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:42.125
2020 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:42.025
1532 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:42.015
1332 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:42.005
1696 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:41.995
2060 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:41.995
1552 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:41.985
2072 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:41.985
2068 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:41.975
2044 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:41.975
2000 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:41.965
588 9 4744 Wait:UserReq 0:00:02.814 0:00:00.110 0:04:41.965
1784 9 132 Wait:UserReq 0:00:00.080 0:00:00.000 0:04:41.955
1756 9 196 Wait:UserReq 0:00:00.931 0:00:00.000 0:04:41.955
1716 8 6 Wait:Queue 0:00:00.000 0:00:00.000 0:04:41.945
1800 9 1457 Wait:Queue 0:00:00.761 0:00:00.210 0:04:41.945
1996 8 47 Wait:UserReq 0:00:00.170 0:00:00.000 0:04:41.835
1992 11 18 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:41.434
1988 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:41.424
1984 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:41.414
1976 8 231 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:41.274
1956 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:41.234
1740 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:41.224
1944 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:41.214
1964 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:41.204
1972 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:41.204
1280 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:41.194
1960 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:41.194
1872 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:41.184
1884 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:41.184
1952 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:41.174
1940 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:41.174
1936 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:41.164
1932 6 4 Wait:UserReq 0:00:00.010 0:00:00.000 0:04:39.291
1928 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:31.880
1916 8 3 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:31.870
1912 8 4 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:31.860
1908 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:31.860
1904 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:31.850
1900 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:31.840
1896 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:30.889
2064 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:30.879
1828 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:30.869
1892 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:30.859
1888 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:30.859
1852 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:30.849
1848 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:30.849
1844 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:30.839
1412 bsp; 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:30.839
332 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:30.829
1840 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:30.829
1440 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:30.819
1796 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:30.819
1240 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:30.809
568 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:30.809
1732 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:29.797
2056 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:15.006
1688 8 4 Wait:UserReq 0:00:00.000 0:00:00.010 0:04:14.996
1776 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:14.986
1648 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:14.976
1768 8 3 Wait:UserReq 0:00:00.000 0:00:00.000 0:04:14.976
284 8 188 Wait:UserReq 0:00:00.190 0:00:00.040 0:04:08.887
576 9 123 Wait:UserReq 0:00:00.070 0:00:00.020 0:04:07.515 - Take the same output again some time later to get another snapshot of the threads to see which have increased significantly.
- Determine the Thread ID (TID) to examine further.
- Run pslist again:
pslist -d 1720
java 1720:Notice that Thread ID 588 is using up most of the User/Kernel time and hence the CPU. Obviously this thread has a problem.
Tid Pri Cswtch State User Time Kernel Time Elapsed Time
1520 8 9705 Wait:UserReq 0:00:23.734 0:00:01.772 0:08:14.511
1968 8 6527 Wait:UserReq 0:00:06.309 0:00:00.070 0:08:14.120
1748 15 157 Wait:UserReq 0:00:00.010 0:00:00.010 0:08:14.110
1744 10 68 Wait:UserReq 0:00:00.010 0:00:00.000 0:08:14.100
1420 15 3 Wait:UserReq 0:00:00.000 0:00:00.000 0:08:13.810
1856 15 18 Wait:UserReq 0:00:00.000 0:00:00.000 0:08:13.810
1860 8 3169 Wait:UserReq 0:00:03.314 0:00:00.160 0:08:13.810
412 15 9890 Wait:DelayExec 0:00:00.000 0:00:00.000 0:08:13.800
1864 8 3 Wait:UserReq 0:00:00.000 0:00:00.000 0:08:09.814
1660 15 188 Wait:UserReq 0:00:00.010 0:00:00.010 0:08:01.372
2020 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:08:01.272
1532 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:08:01.262
1332 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:08:01.252
1696 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:08:01.241
2060 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:08:01.241
1552 9 40 Wait:UserReq 0:00:00.000 0:00:00.000 0:08:01.231
2072 8 13 Wait:UserReq 0:00:00.000 0:00:00.000 0:08:01.231
2068 8 20 Wait:UserReq 0:00:00.000 0:00:00.000 0:08:01.221
2044 8 15 Wait:UserReq 0:00:00.000 0:00:00.000 0:08:01.221
2000 8 657 Wait:UserReq 0:00:00.090 0:00:00.010 0:08:01.211
588 10 59123 Wait:UserReq 0:00:48.830 0:00:02.633 0:08:01.211
1784 8 150 Wait:UserReq 0:00:00.090 0:00:00.000 0:08:01.201
1756 8 251 Wait:UserReq 0:00:00.941 0:00:00.000 0:08:01.201
1716 8 6 Wait:Queue 0:00:00.000 0:00:00.000 0:08:01.191
1800 8 1457 Wait:Queue 0:00:00.761 0:00:00.210 0:08:01.191
1996 8 47 Wait:UserReq 0:00:00.170 0:00:00.000 0:08:01.081
1992 10 29 Wait:UserReq 0:00:00.000 0:00:00.000 0:08:00.681
1988 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:08:00.671
1984 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:08:00.661
1976 9 400 Wait:UserReq 0:00:00.000 0:00:00.000 0:08:00.520
1956 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:08:00.480
1740 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:08:00.470
1944 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:08:00.460
1964 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:08:00.450
1972 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:08:00.450
1280 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:08:00.440
1960 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:08:00.440
1872 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:08:00.430
1884 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:08:00.430
1952 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:08:00.420
1940 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:08:00.420
1936 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:08:00.410
1932 6 4 Wait:UserReq 0:00:00.010 0:00:00.000 0:07:58.538
1928 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:07:51.127
1916 8 3 Wait:UserReq 0:00:00.000 0:00:00.000 0:07:51.117
1912 8 5 Wait:UserReq 0:00:00.000 0:00:00.000 0:07:51.107
1908 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:07:51.107
1904 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:07:51.097
1900 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:07:51.087
1896 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:07:50.136
2064 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:07:50.126
1828 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:07:50.115
1892 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:07:50.105
1888 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:07:50.105
1852 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:07:50.095
1848 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:07:50.095
1844 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:07:50.085
1412 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:07:50.085
332 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:07:50.075
1840 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:07:50.075
1440 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:07:50.065
1796 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:07:50.065
1240 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:07:50.055
568 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:07:50.055
1732 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:07:49.044
2056 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:07:34.253
1688 8 4 Wait:UserReq 0:00:00.000 0:00:00.010 0:07:34.243
1776 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:07:34.233
1648 8 2 Wait:UserReq 0:00:00.000 0:00:00.000 0:07:34.223
1768 8 3 Wait:UserReq 0:00:00.000 0:00:00.000 0:07:34.223
284 9 416 Ready 0:00:00.420 0:00:00.100 0:07:28.134
576 8 123 Wait:UserReq 0:00:00.070 0:00:00.020 0:07:26.762 - Take the Thread ID number of 588 and convert that to hexadecimal (0x24).
- Examine the thread dumps you took while the problem was occurring and look for "nid=0x24".
Observe below that this corresponds to ExecuteThread 10 from the thread dumps:
"ExecuteThread: '10' for queue: 'default'" daemon prio=5 tid=0x20d75808 nid=0x24c runnable [219ff000..219ffd90]It appears the problem is on a socketWrite native method, but it looks like the HelloWorld2.service() is doing something wrong.
at java.net.SocketOutputStream.socketWrite0(Native Method)
at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
at weblogic.servlet.internal.ChunkUtils.writeChunkTransfer(ChunkUtils.java:222)
at weblogic.servlet.internal.ChunkUtils.writeChunks(ChunkUtils.java:198)
at weblogic.servlet.internal.ChunkOutput.flush(ChunkOutput.java:284)
at weblogic.servlet.internal.ChunkOutput.checkForFlush(ChunkOutput.java:344)
at weblogic.servlet.internal.ChunkOutput.write(ChunkOutput.java:221)
at weblogic.servlet.internal.ChunkOutput.write(ChunkOutput.java:236)
at weblogic.servlet.internal.ChunkOutputWrapper.write(ChunkOutputWrapper.java:95)
at weblogic.servlet.internal.ChunkWriter.write(ChunkWriter.java:37)
at java.io.Writer.write(Writer.java:150)
- locked <0x11d0d1c0> (a weblogic.servlet.internal.ChunkWriter)
at java.io.PrintWriter.write(PrintWriter.java:230)
- locked <0x11d0d1c0> (a weblogic.servlet.internal.ChunkWriter)
at java.io.PrintWriter.write(PrintWriter.java:247)
at java.io.PrintWriter.print(PrintWriter.java:378)
at java.io.PrintWriter.println(PrintWriter.java:515)
- locked <0x11d0d1c0> (a weblogic.servlet.internal.ChunkWriter)
at examples.servlets.HelloWorld2.service(HelloWorld2.java:94)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:853)
at weblogic.servlet.internal.ServletStubImpl$ServletInvocationAction.run(ServletStubImpl.java:1058)
at weblogic.servlet.internal.ServletStubImpl.invokeServlet(ServletStubImpl.java:401)
at weblogic.servlet.internal.ServletStubImpl.invokeServlet(ServletStubImpl.java:306)
at weblogic.servlet.internal.WebAppServletContext$ServletInvocationAction.run(WebAppServletContext.java:5412)
at weblogic.security.service.SecurityServiceManager.runAs(SecurityServiceManager.java:744)
at weblogic.servlet.internal.WebAppServletContext.invokeServlet(WebAppServletContext.java:3086)
at weblogic.servlet.internal.ServletRequestImpl.execute(ServletRequestImpl.java:2544)
at weblogic.kernel.ExecuteThread.execute(ExecuteThread.java:153)
at weblogic.kernel.ExecuteThread.run(ExecuteThread.java:134) - Examine line number (94 of HelloWorld2.java) to determine what is happening.
From the code snippet of HelloWorld2.java from the service() method:
89 out.println(ExampleUtils.returnHtmlHeader("Hello World 2"));It appears that for some reason the output stream is being written in a very long "for loop". This is the error and cause of the High CPU in this example.
90 out.println("<h4>");
91 for (int i=0;i<100000000;i++) {
92 int j = 0;
93 j = j +i;
94 out.println(defaultGreeting + " " + defaultName + "!");
95 }
96
97 out.println("</h4>");
98 out.println(ExampleUtils.returnHtmlFooter());
If this code is corrected, then the CPU will not be maxed out.
External Resources
You may obtain the utilities/tools used to help examine CPU usage as follows:
pslist standalone: http://technet.microsoft.com/en-us/sysinternals/bb896682
PsTools Suite: http://technet.microsoft.com/en-us/sysinternals/bb896649
Process Explorer: http://technet.microsoft.com/en-us/sysinternals/bb896653
转自:https://support.oracle.com/
文档id:779349.1