在学习“统计信息”的过程中遇到了一个奇怪的问题,初步怀疑是Oracle 10g sqlplus 的Bug。
记录如下:
-- 1。找个测试用户建一个
create table table01 as
with seq as (
select level num from dual
connect by level<=250
) ,
testdata as (
select s2.num,rpad('killkill',100,'*') dummy from
seq s1 , seq s2
where s1.num <= s2.num
)
select * from testdata ;
-- 2。发出几条 select 的sql,提示 oracle 收集这两个列的统计信息:
select count(*) from table01 where num=1;
select count(*) from table01 where num=10;
select count(*) from table01 where num=100;
select count(*) from table01 where num=200;
select count(*) from table01 where dummy='1';
select count(*) from table01 where dummy='10';
select count(*) from table01 where dummy='100';
select count(*) from table01 where dummy='200';
-- 在 num 列上建立索引,这个貌似不影响结果,就不做了。
-- 3。做一个样本为 100% 的统计信息收集
exec dbms_stats.gather_table_stats( user , 'TABLE01' , estimate_percent => 100 , cascade=>true );
-- 4。看看列的统计信息,这句出问题了:
select utl_raw.cast_to_number( low_value) as low_value ,
utl_raw.cast_to_number(high_value) as high_value ,
num_distinct ,
density,
histogram ,
num_buckets,
SAMPLE_SIZE
from user_tab_col_statistics
where table_name='TABLE01'
and column_name in ('NUM','DUMMY')
以下是 10g 的 sqlplus的结果:
LOW_VALUE HIGH_VALUE NUM_DISTINCT DENSITY HISTOGRAM NUM_BUCKETS SAMPLE_SIZE
---------- ---------- ------------ ---------- --------------- ----------- -----------
1 250 250 .000015936 FREQUENCY 250 31375
>>>>> sqlplus 卡在这,完全无视 Ctrl+C
以下是 11g 的 sqlplus 的结果
LOW_VALUE HIGH_VALUE NUM_DISTINCT DENSITY HISTOGRAM NUM_BUCKETS SAMPLE_SIZE
---------- ---------- ------------ ---------- --------------- ----------- -----------
1 250 250 .000015936 FREQUENCY 250 31375
1 .000015936 FREQUENCY 1 31375
Linux TOP的输出:
top - 17:01:37 up 248 days, 1:16, 2 users, load average: 0.74, 1.50, 1.81
Tasks: 170 total, 2 running, 168 sleeping, 0 stopped, 0 zombie
Cpu0 : 0.0% us, 0.0% sy, 0.0% ni, 100.0% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu1 : 0.0% us, 0.0% sy, 0.0% ni, 100.0% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu2 : 0.0% us, 0.0% sy, 0.0% ni, 100.0% id, 0.0% wa, 0.0% hi, 0.0% si
Cpu3 : 100.0% us, 0.0% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 0.0% si
Mem: 8165004k total, 8116256k used, 48748k free, 23328k buffers
Swap: 2031608k total, 110732k used, 1920876k free, 6977144k cached
PID USER PR NI %CPU TIME+ %MEM VIRT RES SHR S COMMAND
29504 oracle 25 0 100 0:23.39 0.1 37160 11m 7560 R sqlplus
24161 root 16 0 2 179:17.39 0.7 167m 57m 19m S vmware-hostd
29872 oracle 15 0 2 0:00.08 0.0 6292 1208 848 R top
1 root 16 0 0 0:56.28 0.0 4756 548 456 S init
Solaris 10 prstat 的命令输出:
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
19124 ora10g 27M 11M cpu18 20 0 0:01:06 24% sqlplus/1
19214 root 5616K 3784K cpu2 59 0 0:00:00 0.1% prstat/1
26928 ora11g 401M 236M sleep 59 0 0:36:17 0.0% oracle/1
19194 ora11g 401M 279M sleep 59 0 0:00:00 0.0% oracle/1
19169 ora10g 2567M 1619M sleep 59 0 0:00:02 0.0% oracle/11
26916 ora11g 400M 235M sleep 101 - 0:40:35 0.0% oracle/1
19196 ora11g 400M 275M sleep 59 0 0:00:00 0.0% oracle/1
25333 ora11g 451M 347M sleep 59 0 0:07:34 0.0% java/50
167 root 9480K 3672K sleep 59 0 0:03:30 0.0% nscd/32
sqlplus cpu使用率 100% ,唯有 kill pid 才能结束。
找了几台机器测试:
受影响的sqlplus:
Oracle 10.2.0.1 on CentOS 4.6/5.2 的 sqlplus (相当 RHEL 4.6/5.2)
Oracle 10.2.0.2 on CentOS 4.7 的 sqlplus
Oracle 10.2.0.4 on Solaris 10(SPARC) 的 sqlplus
sqlplus on windows 2003 32bit 的 sqlplus
不受影响的sqlplus:
Oracle 11.2.0.1 on CentOS 4.6 的 sqlplus
从测试来看 10g 的sqlplus 存在问题,而 11g 的sqlplus 不存在这个问题。
sqlplus hang住的时候,从v$session_wait 可以查到如下信息:
SID SEQ# EVENT WAIT_TIME SECONDS_IN_WAIT STATE
---------- ---------- ------------------------------ ---------- --------------- -------------------
141 82 SQL*Net message from client 0 252 WAITING