關於shared pool的設置一直是一個爭議較多的內容。很多文章上說,shared pool設置過大會帶來額外的管理上的負擔,從而在某些條件下會導致性能的下降。那麼這個管理上的負擔指的是什麼內容呢? 本文對這個內容作一定的深入探討。本文只涉及一個方面,後續的文章將從其他方面繼續討論。
基礎知識:
我們可以通過如下命令轉儲shared pool共享內存的內容:
SQL> alter session set events 'immediate trace name heapdump level 2';
Session altered.
本測試中引用的兩個trace文件:
9i:
SQL> @gettrcname
TRACE_FILE_NAME
--------------------------------------------------------------------------------
/opt/Oracle/admin/hsjf/udump/hsjf_ora_24983.trc
8i:
SQL> @gettrcname
TRACE_FILE_NAME
--------------------------------------------------------------------------------
/usr/Oracle8/admin/guess/udump/guess_ora_22038.trc
Shared Pool通過free list管理free塊,Free List按不同size劃分Bucket.
在Oracle8i中,不同bucket的size范圍如下所示(size顯示的是下邊界):
oracle:/usr/Oracle8/admin/guess/udump>cat guess_ora_22038.trc|grep Bucket
Bucket 0 size=44
Bucket 1 size=76
Bucket 2 size=140
Bucket 3 size=268
Bucket 4 size=524
Bucket 5 size=1036
Bucket 6 size=2060
Bucket 7 size=4108
Bucket 8 size=8204
Bucket 9 size=16396
Bucket 10 size=32780
我們注意,在這裡,小於76的塊都位於Bucket 0上;大於32780的塊,都在Bucket 10上初始的,數據庫啟動以後,shared pool多數是連續內存塊當空間分配使用以後,內存塊開始被分割,碎片開始出現,Bucket列表開始變長。
Oracle請求shared pool空間時,首先進入相應的Bucket進行查找,如果找不到,則轉向下一個非空的bucket,獲取第一個chunk。分割這個chunk,剩余部分會進入相應的Bucket,進一步增加碎片。
最終的結果是,Bucket 0上的內存塊會越來越多,越來越碎小(在我這個測試的小型的數據庫上,Bucket 0上的碎片已經達到9030個而shared_pool_size設置僅為
而在大多數情況下,我們請求的都是相對小的chunk,這樣搜索Bucket 0往往消耗了大量的時間以及資源,這可能導致share pool Latch被長時間的持有,導致更多的share pool競爭。
所以在Oracle9i之前,如果盲目的增大shared_pool_size或設置過大的shared_pool_size,往往會適得其反。
我們看一下Oracle9i中的處理方式:
[oracle@jumper Oracle]$ sqlplus "/ as sysdba"
SQL*Plus: Release
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Connected to:
Oracle9i Enterprise Edition Release
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.3.0 - ProductionSQL> alter session set events 'immediate trace name heapdump level 2';
Session altered.
SQL> @gettrcname
TRACE_FILE_NAME
--------------------------------------------------------------------------------
/opt/Oracle/admin/hsjf/udump/hsjf_ora_24983.trc
SQL>
SQL> !
[oracle@jumper Oracle]$ cd $admin
[Oracle@jumper udump]$ cat hsjf_ora_24983.trc|grep Bucket
(編者注:由於篇幅限制,編者將此處結果分欄顯示。)
Bucket 0 size=16
Bucket 1 size=20
Bucket 2 size=24
Bucket 3 size=28
Bucket 4 size=32
Bucket 5 size=36
Bucket 6 size=40
Bucket 7 size=44
Bucket 8 size=48
Bucket 9 size=52
Bucket 10 size=56
Bucket 11 size=60
Bucket 12 size=64
Bucket 13 size=68
Bucket 14 size=72
Bucket 15 size=76
Bucket 16 size=80
Bucket 17 size=84
Bucket 18 size=88
Bucket 19 size=92
Bucket 20 size=96
Bucket 21 size=100
Bucket 22 size=104
Bucket 23 size=108
Bucket 24 size=112
Bucket 25 size=116
Bucket 26 size=120
Bucket 27 size=124
Bucket 28 size=128
Bucket 29 size=132
Bucket 30 size=136
Bucket 31 size=140
Bucket 32 size=144
Bucket 33 size=148
Bucket 34 size=152
Bucket 35 size=156
Bucket 36 size=160
Bucket 37 size=164
Bucket 38 size=168
Bucket 39 size=172
Bucket 40 size=176
Bucket 41 size=180
Bucket 42 size=184
Bucket 43 size=188
Bucket 44 size=192
Bucket 45 size=196
Bucket 46 size=200
Bucket 47 size=204
Bucket 48 size=208
Bucket 49 size=212
Bucket 50 size=216
Bucket 51 size=220
Bucket 52 size=224
Bucket 53 size=228
Bucket 54 size=232
Bucket 55 size=236
Bucket 56 size=240
Bucket 57 size=244
Bucket 58 size=248
Bucket 59 size=252
Bucket 60 size=256
Bucket 61 size=260
Bucket 62 size=264
Bucket 63 size=268
Bucket 64 size=272
Bucket 65 size=276
Bucket 66 size=280
Bucket 67 size=284
Bucket 68 size=288
Bucket 69 size=292
Bucket 70 size=296
Bucket 71 size=300
Bucket 72 size=304
Bucket 73 size=308
Bucket 74 size=312
Bucket 75 size=316
Bucket 76 size=320
Bucket 77 size=324
Bucket 78 size=328
Bucket 79 size=332
Bucket 80 size=336
Bucket 81 size=340
Bucket 82 size=344
Bucket 83 size=348
Bucket 84 size=352
Bucket 85 size=356
Bucket 86 size=360
Bucket 87 size=364
Bucket 88 size=368
Bucket 89 size=372
Bucket 90 size=376
Bucket 91 size=380
Bucket 92 size=384
Bucket 93 size=388
Bucket 94 size=392
Bucket 95 size=396
Bucket 96 size=400
Bucket 97 size=404
Bucket 98 size=408
Bucket 99 size=412
Bucket 100 size=416
Bucket 101 size=420
Bucket 102 size=424
Bucket 103 size=428
Bucket 104 size=432
Bucket 105 size=436
Bucket 106 size=440
Bucket 107 size=444
Bucket 108 size=448
Bucket 109 size=452
Bucket 110 size=456
Bucket 111 size=460
Bucket 112 size=464
Bucket 113 size=468
Bucket 114 size=472
Bucket 115 size=476
Bucket 116 size=480
Bucket 117 size=484
Bucket 118 size=488
Bucket 119 size=492
Bucket 120 size=496
Bucket 121 size=500
Bucket 122 size=504
Bucket 123 size=508
Bucket 124 size=512
Bucket 125 size=516
Bucket 126 size=520
Bucket 127 size=524
Bucket 128 size=528
Bucket 129 size=532
Bucket 130 size=536
Bucket 131 size=540
Bucket 132 size=544
Bucket 133 size=548
Bucket 134 size=552
Bucket 135 size=556
Bucket 136 size=560
Bucket 137 size=564
Bucket 138 size=568
Bucket 139 size=572
Bucket 140 size=576
Bucket 141 size=580
Bucket 142 size=584
Bucket 143 size=588
Bucket 144 size=592
Bucket 145 size=596
Bucket 146 size=600
Bucket 147 size=604
Bucket 148 size=608
Bucket 149 size=612
Bucket 150 size=616
Bucket 151 size=620
Bucket 152 size=624
Bucket 153 size=628
Bucket 154 size=632
Bucket 155 size=636
Bucket 156 size=640
Bucket 157 size=644
Bucket 158 size=648
Bucket 159 size=652
Bucket 160 size=656
Bucket 161 size=660
Bucket 162 size=664
Bucket 163 size=668
Bucket 164 size=672
Bucket 165 size=676
Bucket 166 size=680
Bucket 167 size=684
Bucket 168 size=688
Bucket 169 size=692
Bucket 170 size=696
Bucket 171 size=700
Bucket 172 size=704
Bucket 173 size=708
Bucket 174 size=712
Bucket 175 size=716
Bucket 176 size=720
Bucket 177 size=724
Bucket 178 size=728
我們看到,在Oracle9i中,Free Lists被劃分為0~254,共255個Bucket。
每個Bucket容納的size范圍
Bucket 0~199 容納size以 4 遞增
Bucket 200~249 容納size以 64 遞增
從Bucket 249開始,Oracle各Bucket步長進一步增加:
Bucket 249: 4012 ~4107 = 96
Bucket 250: 4108 ~8203 = 4096
Bucket 251: 8204 ~16395 = 8192
Bucket 252: 16396~32779 = 16384
Bucket 253: 32780~65547 = 32768
Bucket 254: >=65548
在Oracle9i中,對於小的chunk,Oracle增加了更多的Bucket來管理。0~199共200個Bucket,size以4為步長遞增;200~249共50個Bucket,size以64遞增。這樣每個Bucket中容納的chunk數量大大減少,查找的效率得以提高。
這就是Oracle9i中shared pool管理的增強,通過這個算法的改進。Oracle8i中,過大shared pool帶來的栓鎖爭用等性能問題在某種程度上得以解決。
我們繼續把前面的問題展開一下。
其實我們可以從數據庫內部監控shared pool的空間碎片情況。
這涉及到一個內部視圖x$ksmsp
X$KSMSP的名稱含義為: [K]ernal [S]torage [M]emory Management [S]GA Hea[P]
其中每一行都代表著shared pool中的一個chunk.
首先記錄一下測試環境:
SQL> select * from v$version;
BANNER
----------------------------------------------------------------
Oracle9i
PL
CORE 9.2.0.3.0 Production
TNS for Linux: Version 9.2.0.3.0 - Production
NLSRTL Version 9.2.0.3.0 – Production
我們看一下x$ksmsp的結構:
SQL> desc x$ksmsp
Name Null? Type
----------------------------------------- -------- ----------------------------
ADDR RAW(4)
INDX NUMBER
INST_ID NUMBER
KSMCHIDX NUMBER
KSMCHDUR NUMBER
KSMCHCOM VARCHAR2(16)
KSMCHPTR RAW(4)
KSMCHSIZ NUMBER
KSMCHCLS VARCHAR2(8)
KSMCHTYP NUMBER
KSMCHPAR RAW(4)
我們關注以下幾個字段:
KSMCHCOM是注釋字段,每個內存塊被分配以後,注釋會添加在該字段中.
x$ksmsp.ksmchsiz代表塊大小
x$ksmsp.ksmchcls列代表類型,主要有四類,說明如下:
free
Free chunks--不包含任何對象的chunk,可以不受限制的被分配.
recr
Recreatable chunks--包含可以被臨時移出內存的對象,在需要的時候,這個對象可以被重新創建.例如,許多存儲共享sql代碼的內存都是可以重建的.
freeabl
Freeable chunks--包含session周期或調用的對象,隨後可以被釋放.這部分內存有時候可以全部或部分提前釋放.但是注意,由於某些對象是中間過程產生的,這些對象不能臨時被移出內存(因為不可重建).
perm
Permanent memory chunks--包含永久對象.通常不能獨立釋放.
我們可以通過查詢x$ksmsp視圖來考察shared pool中存在的內存片的數量。
不過注意:Oracle的某些版本(如:10.1.0.2)在某些平台上(如:HP-UX PA-RISC 64-bit)查
詢該視圖可能導致過度的CPU耗用,這是由於bug引起的。
我們看一下測試:
初始啟動數據庫,x$ksmsp中存在2259個chunk
SQL> select count(*) from x$ksmsp;
COUNT(*)
----------
2259
執行查詢:
SQL> select count(*) from dba_objects;
COUNT(*)
----------
10491
此時shared pool中的chunk數量增加
SQL> select count(*) from x$ksmsp;
COUNT(*)
----------
2358
這就是由於shared pool中進行sql解析,請求空間,進而導致請求free空間,分配、分割。從而產生了更多,更細碎的內存chunk。由此我們可以看出,如果數據庫系統中存在大量的硬解析,不停請求分配free的shred pool內存。除了必須的shared pool latch等競爭外,還不可避免的會導致shared pool中產生更多的內存碎片(當然,在內存回收時,你可能看到chunk數量減少的情況)。
我們看以下測試:
首先重新啟動數據庫:
SQL> startup force;
Oracle instance started.
Total System Global Area 47256168 bytes
Fixed Size 451176 bytes
Variable Size 29360128 bytes
Database Buffers 16777216 bytes
Redo Buffers 667648 bytes
Database mounted.
Database opened.
創建一張臨時表用以保存之前x$ksmsp的狀態:
SQL> CREATE GLOBAL TEMPORARY TABLE e$ksmsp ON COMMIT PRESERVE ROWS AS
2 SELECT a.ksmchcom,
3 SUM (a.CHUNK) CHUNK,
4 SUM (a.recr) recr,
5 SUM (a.freeabl) freeabl,
6 SUM (a.SUM) SUM
7 FROM (SELECT ksmchcom, COUNT (ksmchcom) CHUNK,
8 DECODE (ksmchcls, 'recr', SUM (ksmchsiz), NULL) recr,
9 DECODE (ksmchcls, 'freeabl', SUM (ksmchsiz), NULL) freeabl,
10 SUM (ksmchsiz) SUM
11 FROM x$ksmsp GROUP BY ksmchcom, ksmchcls) a
12 where 1 = 0
13 GROUP BY a.ksmchcom;
Table created.
保存當前shared pool狀態:
SQL> INSERT INTO E$KSMSP
2 SELECT a.ksmchcom,
3 SUM (a.CHUNK) CHUNK,
4 SUM (a.recr) recr,
5 SUM (a.freeabl) freeabl,
6 SUM (a.SUM) SUM
7 FROM (SELECT ksmchcom, COUNT (ksmchcom) CHUNK,
8 DECODE (ksmchcls, 'recr', SUM (ksmchsiz), NULL) recr,
9 DECODE (ksmchcls, 'freeabl', SUM (ksmchsiz), NULL) freeabl,
10 SUM (ksmchsiz) SUM
11 FROM x$ksmsp
12 GROUP BY ksmchcom, ksmchcls) a
13 GROUP BY a.ksmchcom
14 /
41 rows created.
執行查詢:
SQL> select count(*) from dba_objects;
COUNT(*)
----------
10492
比較前後shared pool內存分配的變化:
SQL> select a.ksmchcom,a.chunk,a.sum,b.chunk,b.sum,(a.chunk - b.chunk) c_diff,(a.sum -b.sum) s_diff
2 from
3 (SELECT a.ksmchcom,
4 SUM (a.CHUNK) CHUNK,
5 SUM (a.recr) recr,
6 SUM (a.freeabl) freeabl,
7 SUM (a.SUM) SUM
8 FROM (SELECT ksmchcom, COUNT (ksmchcom) CHUNK,
9 DECODE (ksmchcls, 'recr', SUM (ksmchsiz), NULL) recr,
10 DECODE (ksmchcls, 'freeabl', SUM (ksmchsiz), NULL) freeabl,
11 SUM (ksmchsiz) SUM
12 FROM x$ksmsp
13 GROUP BY ksmchcom, ksmchcls) a
14 GROUP BY a.ksmchcom) a,e$ksmsp b
15 where a.ksmchcom = b.ksmchcom and (a.chunk - b.chunk) <>0
16 /
KSMCHCOM CHUNK SUM CHUNK SUM C_DIFF S_DIFF
---------------- ---------- ---------- ---------- ---------- ---------- ----------
KGL handles 313 102080 302 98416 11 3664
KGLS heap 274 365752 270 360424 4 5328
KQR
free memory 93 2292076 90 2381304 3 -89228
library cache 1005 398284 965 381416 40 16868
sql area 287 547452 269 490052 18 57400
6 rows selected.
我們簡單分析一下以上結果:
首先free memory的大小減少了89228(增加到另外五個組件中),這說明sql解析存儲占用了一定的內存空間。而chunk從90增加為93,這說明內存碎片增加了。
在下面的部分中,我會著手介紹一下KGL handles, KGLS heap這兩個非常重要的shared pool中的內存結構。
基本命令:
ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level LL';
其中LL代表Level級別,對於9.2.0及以後版本,不同Level含義如下:
Level =1 ,轉儲Library cache統計信息
Level =2 ,轉儲hash table概要
Level =4 ,轉儲Library cache對象,只包含基本信息
Level =8 ,轉儲Library cache對象,包含詳細信息(包括child references,pin waiters等)
Level =16,增加heap sizes信息
Level =32,增加heap信息
Library cache由一個hash表組成,而hash表是一個由hash buckets組成的數組。
每個hash bucket都是包含library cache handle的一個雙向鏈表。
Library Cache Handle指向Library Cache Object和一個引用列表。
library cache對象進一步分為:依賴表、子表和授權表等
我們看一下library cache的結構:
通過
ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level 4'
獲得以下輸出(這部分信息來自Oracle8i,Trace文件可以從www.eygle.com上找到)
第一部分(等價於Level 1):
LIBRARY CACHE STATISTICS:
gets hit ratio pins hit ratio reloads invalids namespace
---------- --------- ---------- --------- ---------- ---------- ---------
619658171 0.9999160 2193292112 0.9999511 9404 380 CRSR
79698558 0.9998832 424614847 0.9999108 13589 0 TABL/PRCD/TYPE
163399 0.9979926 163402 0.9978948 16 0 BODY/TYBD
0 0.0000000 0 0.0000000 0 0 TRGR
34 0.0294118 35 0.0571429 0 0 INDX
18948 0.9968862 24488 0.9953855 0 0 CLST
0 0.0000000 0 0.0000000 0 0 OBJE
0 0.0000000 0 0.0000000 0 0 PIPE
0 0.0000000 0 0.0000000 0 0 LOB
0 0.0000000 0 0.0000000 0 0 DIR
0 0.0000000 0 0.0000000 0 0 QUEU
0 0.0000000 0 0.0000000 0 0 OBJG
0 0.0000000 0 0.0000000 0 0 PROP
0 0.0000000 0 0.0000000 0 0 JVSC
0 0.0000000 0 0.0000000 0 0 JVRE
0 0.0000000 0 0.0000000 0 0 ROBJ
0 0.0000000 0 0.0000000 0 0 REIP
0 0.0000000 0 0.0000000 0 0 CPOB
115071 0.9992179 115071 0.9930999 704 0 EVNT
0 0.0000000 0 0.0000000 0 0 SUMM
0 0.0000000 0 0.0000000 0 0 DIMN
0 0.0000000 0 0.0000000 0 0 CTX
0 0.0000000 0 0.0000000 0 0 OUTL
0 0.0000000 0 0.0000000 0 0 RULS
0 0.0000000 0 0.0000000 0 0 RMGR
0 0.0000000 0 0.0000000 0 0 UNUSED
0 0.0000000 0 0.0000000 0 0 PPLN
0 0.0000000 0 0.0000000 0 0 PCLS
0 0.0000000 0 0.0000000 0 0 SUBS
0 0.0000000 0 0.0000000 0 0 LOCS
0 0.0000000 0 0.0000000 0 0 RMOB
0 0.0000000 0 0.0000000 0 0 RSMD
699654181 0.9999117 2618209955 0.9999440 23713 380 CUMULATIVE
這部分信息也就是v$librarycache中顯示的.
第二部分(等價於Level 2中的輸出):
LIBRARY CACHE HASH TABLE: size=509 count=354
BUCKET 0:
BUCKET 1:
BUCKET 2: *
BUCKET 3:
BUCKET 4:
BUCKET 5: *
BUCKET 6: *
BUCKET 7:
BUCKET 8: **
BUCKET 9: ***
BUCKET 10: *
BUCKET 11: *
BUCKET 12: ***
……..
BUCKET 103:
BUCKET 104: *
BUCKET 105:
BUCKET 106:
BUCKET 107: ****
BUCKET 108:
BUCKET 109:
BUCKET 110:
BUCKET 111: *
BUCKET 112: **
BUCKET 113:
BUCKET 114:
BUCKET 115:
BUCKET 116: *
BUCKET 117:
BUCKET 118: *****
BUCKET 119:
BUCKET 120: *
BUCKET 121:
BUCKET 122:
BUCKET 123:
BUCKET 124:
BUCKET 125: *
BUCKET 126:
BUCKET 127:
BUCKET 128: *
BUCKET 129:
BUCKET 130: *
BUCKET 131: *
BUCKET 132:
BUCKET 133:
BUCKET 134:
BUCKET 135: *
BUCKET 136:
BUCKET 137:
BUCKET 138:
BUCKET 139: *
BUCKET 140: *
BUCKET 141: *
BUCKET 142:
BUCKET 143: *
BUCKET 144:
BUCKET 145: ***
BUCKET 146:
BUCKET 147: *
BUCKET 148:
BUCKET 149:
BUCKET 150: **
BUCKET 151:
BUCKET 152:
BUCKET 153: *
BUCKET 154:
BUCKET 155:
BUCKET 156:
BUCKET 157:
BUCKET 158:
BUCKET 159:
BUCKET 160:
BUCKET 161:
BUCKET 162:
BUCKET 163:
BUCKET 164: *
BUCKET 165: *
BUCKET 166:
BUCKET 167:
BUCKET 168:
BUCKET 169:
BUCKET 170: **
BUCKET 171:
BUCKET 172: *
BUCKET 173:
BUCKET 174:
BUCKET 175: *
BUCKET 176: *
BUCKET 177:
BUCKET 178:
BUCKET 179:
BUCKET 180:
BUCKET 181: *
BUCKET 182:
BUCKET 183:
BUCKET 184:
BUCKET 185: *
BUCKET 186:
BUCKET 187:
BUCKET 188: **
BUCKET 189:
BUCKET 190: *
BUCKET 191: *
BUCKET 192:
BUCKET 193:
BUCKET 194: *
BUCKET 195: **
BUCKET 196: *
BUCKET 197: **
BUCKET 198: ****
BUCKET 199: *
BUCKET 200: *
……….
BUCKET 509:
BUCKET 510:
BUCKET 511:
在Oracle8i中,Oracle以一個很長的LIBRARY CACHE HASH TABLE來記錄Library Cache的使用情況。"*"代表該Bucket中包含的對象的個數。
在以上輸出中我們看到Bucket 198中包含四個對象.
我們在第三部分中可以找到bucket 198:
BUCKET 198:
LIBRARY OBJECT HANDLE: handle=2c2b4ac4
name=
SELECT a.statement_id, a.timestamp, a.remarks, a.Operation, a.options,
a.object_node, a.object_owner, a.object_name, a.object_instance,
a.object_type, a.optimizer, a.search_columns, a.id, a.parent_id,
a.position, a.cost, a.cardinality, a.bytes, a.other_tag,
a.partition_start, a.partition_stop, a.partition_id, a.other,
a.distribution
, ROWID
FROM plan_table a
hash=60dd47a1 timestamp=08-27-2004 10:19:28
namespace=CRSR flags=RON/TIM/PN0/LRG/[10010001]
kkkk-dddd-llll=0000-0001-0001 lock=0 pin=0 latch=0
lwt=2c2b4adc[2c2b4adc,2c2b4adc] ltm=2c2b4ae4[2c2b4ae4,2c2b4ae4]
pwt=2c2b4af4[2c2b4af4,2c2b4af4] ptm=2c2b4b4c[2c2b4b4c,2c2b4b4c]
ref=2c2b4acc[2c2b4acc,2c2b4acc]
LIBRARY OBJECT: object=2c0b1430
type=CRSR flags=EXS[0001] pflags= [00] status=VALD load=0
CHILDREN: size=16
child# table reference handle
------ -------- --------- --------
0 2c0b15ec 2c0b15b4 2c2c0d50
DATA BLOCKS:
data# heap pointer status pins change
----- -------- -------- ------ ---- ------
0 2c362290 2c0b14b4 I/-/A 0 NONE
LIBRARY OBJECT HANDLE: handle=2c3675d4
name=SYS.DBMS_STANDARD
hash=50748ddb timestamp=NULL
namespace=BODY/TYBD flags=TIM/SML/[02000000]
kkkk-dddd-llll=0000-0011-0011 lock=0 pin=0 latch=0
lwt=2c3675ec[2c3675ec,2c3675ec] ltm=2c3675f4[2c3675f4,2c3675f4]
pwt=2c367604[2c367604,2c367604] ptm=2c36765c[2c36765c,2c36765c]
ref=2c3675dc[2c3675dc,2c3675dc]
LIBRARY OBJECT: object=2c1528e8
flags=NEX[0002] pflags= [00] status=VALD load=0
DATA BLOCKS:
data# heap pointer status pins change
----- -------- -------- ------ ---- ------
0 2c367564 2c1529cc I/-/A 0 NONE
4 2c15297c 0 -/P/- 0 NONE
LIBRARY OBJECT HANDLE: handle=2c347dd8
name=select pos#,intcol#,col#,spare1 from icol$ where obj#=:1
hash=fa15ebe3 timestamp=07-28-2004 18:04:43
namespace=CRSR flags=RON/TIM/PN0/SML/[12010000]
kkkk-dddd-llll=0000-0001-0001 lock=0 pin=0 latch=0
lwt=2c347df0[2c347df0,2c347df0] ltm=2c347df8[2c347df8,2c347df8]
pwt=2c347e08[2c347e08,2c347e08] ptm=2c347e60[2c347e60,2c347e60]
ref=2c347de0[2c347de0,2c347de0]
LIBRARY OBJECT: object=2c1cd1a0
type=CRSR flags=EXS[0001] pflags= [00] status=VALD load=0
CHILDREN: size=16
child# table reference handle
------ -------- --------- --------
0 2c1cd35c 2c1cd324 2c281678
1 2c1cd35c 2c352c50 2c0eeb8c
2 2c1cd35c 2c352c6c 2c2bb05c
DATA BLOCKS:
data# heap pointer status pins change
----- -------- -------- ------ ---- ------
0 2c2e8c58 2c1cd224 I/-/A 0 NONE
LIBRARY OBJECT HANDLE: handle=2c3a6484
name=SYS.TS$
hash=bb42852e timestamp=04-24-2002 00:04:15
namespace=TABL/PRCD/TYPE flags=PKP/TIM/KEP/SML/[02900000]
kkkk-dddd-llll=0111-0111-0119 lock=0 pin=0 latch=0
lwt=2c3a649c[2c3a649c,2c3a649c] ltm=2c3a64a4[2c3a64a4,2c3a64a4]
pwt=2c3a64b4[2c3a64b4,2c3a64b4] ptm=2c3a650c[2c3a650c,2c3a650c]
ref=2c3a648c[2c0d4b14,2c09353c]
LIBRARY OBJECT: object=2c3a626c
type=TABL flags=EXS/LOC[0005] pflags= [00] status=VALD load=0
DATA BLOCKS:
data# heap pointer status pins change
----- -------- -------- ------ ---- ------
0 2c3a8ea4 2c3a63b0 I/P/A 0 NONE
3 2c3a5828 0 -/P/- 0 NONE
4 2c3a6300 2c3a5960 I/P/A 0 NONE
8 2c3a6360 2c3a4f00 I/P/A 0 NONE
我們看到這裡包含了四個對象.
我們再來看看Oracle9i中的情況:
參考文件: hsjf_ora_15800.trc
LIBRARY CACHE HASH TABLE: size=131072 count=217
Buckets with more than 20 objects:
NONE
Hash Chain Size Number of Buckets
--------------- -----------------
0 130855
1 217
2 0
3 0
4 0
5 0
6 0
7 0
8 0
9 0
10 0
11 0
12 0
13 0
14 0
15 0
16 0
17 0
18 0
19 0
20 0
>20 0
Oracle9i中通過新的方式記錄Library Cache的使用狀況.
按不同的Hash Chain Size代表Library Cache中包含不同對象的個數.0表示Free的Bucket,>20表示包含超過20個對象的Bucket的個數.
從以上列表中我們看到,包含一個對象的Buckets有217個,包含0個對象的Buckets有130855個.
我們來驗證一下:
[Oracle@jumper udump]$ cat hsjf_ora_15800.trc |grep BUCKET|more
BUCKET 12:
BUCKET 12 total object count=1
BUCKET 385:
BUCKET 385 total object count=1
BUCKET 865:
BUCKET 865 total object count=1
...
<!--[if !supportLineBreakNewLine]-->
<!--[endif]-->
[Oracle@jumper udump]$ cat hsjf_ora_15800.trc |grep BUCKET|wc -l
<!--[if !supportLineBreakNewLine]-->
<!--[endif]-->
434
<!--[if !supportLineBreakNewLine]-->
<!--[endif]-->
[Oracle@jumper udump]$
434/2 = 217,證實了我們的猜想.
通過HASH TABLE算法的改進,Oracle Library Cache管理的效率大大提高.
我們進一步來討論一下shared pool的處理:
先進行相應查詢,獲得測試數據:
SQL*Plus: Release 9.2.0.3.0 - Production on Thu Aug 26 10:21:54 2004
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Connected to:
Oracle9i
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.3.0 - Production
SQL> connect eygle/eygle
Connected.
SQL> create table emp as select * from scott.emp;
Table created.
SQL>
SQL> connect / as sysdba
Connected.
SQL> startup force;
Oracle instance started.
Total System Global Area 47256168 bytes
Fixed Size 451176 bytes
Variable Size 29360128 bytes
Database Buffers 16777216 bytes
Redo Buffers 667648 bytes
Database mounted.
Database opened.
SQL> set linesize 120
SQL> connect scott/tiger
Connected.
SQL> select * from emp;
EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
7369 SMITH CLERK 7902 17-DEC-80 800 20
8888 EYGLE MANAGER 11-AUG-04 9999 10 10
7566 JONES MANAGER 7839 02-APR-81 2975 20
7698 BLAKE MANAGER 7839 01-MAY-81 2850 30
7782
7788 SCOTT ANALYST 7566 19-APR-87 3000 20
7839 KING PRESIDENT 17-NOV-81 5000 10
7876
7900 JAMES CLERK 7698 03-DEC-81 950 30
7902 FORD ANALYST 7566 03-DEC-81 3000 20
7934 MILLER CLERK 7782 23-JAN-82 1300 10
11 rows selected.
SQL> connect eygle/eygle
Connected.
SQL> select * from emp;
EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO
---------- ---------- --------- ---------- --------- ---------- ---------- ----------
7369 SMITH CLERK 7902 17-DEC-80 800 20
8888 EYGLE MANAGER 11-AUG-04 9999 10 10
7566 JONES MANAGER 7839 02-APR-81 2975 20
7698 BLAKE MANAGER 7839 01-MAY-81 2850 30
7782
7788 SCOTT ANALYST 7566 19-APR-87 3000 20
7839 KING PRESIDENT 17-NOV-81 5000 10
7876
7900 JAMES CLERK 7698 03-DEC-81 950 30
7902 FORD ANALYST 7566 03-DEC-81 3000 20
7934 MILLER CLERK 7782 23-JAN-82 1300 10
11 rows selected.
SQL> select SQL_TEXT,VERSION_COUNT,HASH_VALUE,to_char(HASH_VALUE,'xxxxxxxxxx') HEX,ADDRESS
2 from v$sqlarea where sql_text like 'select * from emp%';
SQL_TEXT VERSION_COUNT HASH_VALUE HEX ADDRESS
-------------------- ------------- ---------- ----------- --------
select * from emp 2 2648707557 9de011e5 52D9EA28
SQL> select sql_text,username,ADDRESS,HASH_VALUE,to_char(HASH_VALUE,'xxxxxxxxxx') HEX_HASH_VALUE,CHILD_NUMBER,CHILD_LATCH
2 from v$sql a,dba_users b where a.PARSING_USER_ID = b.user_id and sql_text like 'select * from emp%';
SQL_TEXT USERNAME ADDRESS HASH_VALUE HEX_HASH_VA CHILD_NUMBER CHILD_LATCH
-------------------- ------------------------------ -------- ---------- ----------- ------------ -----------
select * from emp SCOTT 52D9EA28 2648707557 9de011e5 0 1
select * from emp EYGLE 52D9EA28 2648707557 9de011e5 1 1
注意:這裡我們可以看出v$sqlarea和v$sql兩個視圖的不同之處
v$sql中為每一條sql保留一個條目,而v$sqlarea中根據sql_text進行group by,通過version_count計算子指針的個數.
我們注意到,這兩條sql語句因為其代碼完全相同,所以其ADDRESS,HASH_VALUE也完全相同.
這就意味著,這兩條sql語句在shared pool中的存儲位置是相同的(盡管其執行計劃可能不同),代碼得以共享.
在此過程中Oracle完成sql解析的第一個步驟:語法解析
Oracle進行語法檢查時遵循自右向左,自下向上的原則,如果發現語法錯誤就馬上返回錯誤.
語法檢查通過以後,Oracle將sql文本轉換為相應的ASCII數值,然後根據數值通過Hash函數計算其HASH_VALUE
在shared pool中尋找是否存在相同的sql語句,如果存在,則進入下一步驟;如果不存在則嘗試獲取shared pool latch
請求內存,存儲該sql代碼
在這裡有一個問題需要說明一下:
因為大小寫字母的ascii值是不同的,所以Oracle會把大小寫不同的代碼作為不同的sql來處理,我們看一下測試:
SQL> select * from scott.dept;
DEPTNO DNAME LOC
---------- -------------- -------------
10 ACCOUNTING
20 RESEARCH
40 OperaTIONS
SQL> select * from scott.DEPT;
DEPTNO DNAME LOC
---------- -------------- -------------
10 ACCOUNTING
20 RESEARCH
30 SALES
40 OperaTIONS
SQL> col sql_text for a30
SQL> select sql_text,hash_value from v$sql where sql_text like 'select * from scott%';
SQL_TEXT HASH_VALUE
------------------------------ ----------
select * from scott.DEPT 4096614922
select * from scott.dept 2089404358
我們注意到,僅只大小寫的不同使得原本相同的sql語句變成了兩條"不同的代碼"
所以從這裡我們可以看出,sql的規范編寫非常重要.
完成這一個步驟以後,Oracle需要進行的是語義分析:
在此步驟中,Oracle需要驗證對象是否存在,相關用戶是否具有權限,引用的是否是相同的對象...
對於我們第一個查詢,實際上emp表來自不同的用戶,那麼sql的執行計劃也就不同了
當然影響sql執行計劃的因素還有很多,包括優化器模式等
SQL> select a.*,to_char(to_hash,'xxxxxxxxxx') Hex_HASH_VALUE
2 from V$OBJECT_DEPENDENCY a where to_name='EMP';
FROM_ADD FROM_HASH TO_OWNER TO_NAME TO_ADDRE TO_HASH TO_TYPE HEX_HASH_VA
-------- ---------- -------------------- -------------------- -------- ---------- ---------- -----------
52D9EA28 2648707557 SCOTT EMP 52D9DEBC 828228010 2 315dc1aa
52D9EA28 2648707557 EYGLE EMP 52D82E58 1930491453 2 7310f63d
SQL> ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level 1';
Session altered.
SQL> ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level 2';
Session altered.
SQL> ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level 4';
Session altered.
SQL> ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level 8';
Session altered.
SQL> ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level 16';
Session altered.
SQL> ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level 32';
Session altered.
SQL> ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level 100';
Session altered.
此處在不同級別對shared pool進行轉儲.
在轉向trace文件之前,我們通過下圖來看一下Library CACHE的結構:<!--[if !vml]--><!--[endif]-->
Library Cache Handle指向Library Cache Object,含對象名,namespace,時間戳,引用列表,鎖定對象及pin對象列表等.
我們從dump文件中看看具體的信息,由以上v$sql視圖我們得到以上查詢的hash_value為9de011e5,ADDRESS為52D9EA28
在bucket 4851中,我們找到了select * from emp 這條sql語句.
BUCKET 4581:
LIBRARY OBJECT HANDLE: handle=52d9ea28
name=select * from emp
hash=9de011e5 timestamp=08-26-2004 10:24:43
==>這個hash正是v$sql中該sql語句的hash_value值
namespace=CRSR flags=RON/TIM/PN0/SML/[12010000]
kkkk-dddd-llll=0000-0001-0001 lock=0 pin=0 latch#=1
lwt=0x52d9ea40[0x52d9ea40,0x52d9ea40] ltm=0x52d9ea48[0x52d9ea48,0x52d9ea48]
pwt=0x52d9ea58[0x52d9ea58,0x52d9ea58] ptm=0x52d9eab0[0x52d9eab0,0x52d9eab0]
ref=0x52d9ea30[0x52d9ea30, 0x52d9ea30] lnd=0x52d9eabc[0x52d9eabc,0x52d9eabc]
LIBRARY OBJECT: object=52d9e7b0
type=CRSR flags=EXS[0001] pflags= [00] status=VALD load=0
CHILDREN: size=16
child# table reference handle
------ -------- --------- --------
0 52d9e96c 52d9e6cc 52d9e4ac
1 52d9e96c 52d9e70c 52d885cc
==>這就是我們前邊提到過的子指針,每個都指向了一個不同的handle
DATA BLOCKS:
data# heap pointer status pins change alloc(K) size(K)
----- -------- -------- ------ ---- ------ -------- --------
0 52d9e9b8 52d9e838 I/-/A 0 NONE 0.86 1.09
==>此處的heap就是指內存地址.
==>這裡存放的就是SQL代碼及用戶連接信息
&n
HEAP DUMP OF DATA BLOCK 0:
******************************************************
HEAP DUMP heap name="library cache" desc=0x52d9e9b8
extent sz=0x224 alt=32767 het=16 rec=9 flg=2 opc=0
parent=0x5000002c owner=0x52d9e7b0 nex=(nil) xsz=0x224
EXTENT 0 addr=0x52d9e558
Chunk 52d9e560 sz= 540 perm "perm " alo=448
52D9E560 5000021D 00000000 52D9E7A0 000001C0 [...P.......R....]
52D9E570 52D9E704 52D9E660 00020002 52D9E57C [...R`..R....|..R]
52D9E580 52D9E57C 00000000 52D9E588 52D9E588 [|..R.......R...R]
52D9E590 00000000 52D9E594 52D9E594 00000000 [.......R...R....]
52D9E5A0 52D9E5A0 52D9E5A0 00000000 52D9E5AC [...R...R.......R]
52D9E5B0 52D9E5AC 00000000 52D9E5B8 52D9E5B8 [...R.......R...R]
…….
52D9E8D0 52D9E630 00000000 52D9E570 00000000 [0..R....p..R....]
52D9E8E0 00000000 00000000 00000000 00000000 [................]
52D9E8F0 00000000 00000000 00000002 0000000D [................]
52D9E900 00000001 00000000 [........]
Permanent space = 900
******************************************************
BUCKET 4581 total object count=1
我們以handle:52d885cc為例看一下Library Cache Object的結構:
******************************************************
LIBRARY OBJECT HANDLE: handle=52d885cc
namespace=CRSR flags=RON/KGHP/PN0/[10010000]
kkkk-dddd-llll=0000-0041-0041 lock=0 pin=0 latch#=1
lwt=0x52d885e4[0x52d885e4,0x52d885e4] ltm=0x52d885ec[0x52d885ec,0x52d885ec]
pwt=0x52d885fc[0x52d885fc,0x52d885fc] ptm=0x52d88654[0x52d88654,0x52d88654]
ref=0x52d885d4[0x52d9e70c, 0x52d9e70c] lnd=0x52d88660[0x52d88660,0x52d88660]
LIBRARY OBJECT: object=52d82a24
type=CRSR flags=EXS[0001] pflags= [00] status=VALD load=0
DEPENDENCIES: count=1 size=16
dependency# table reference handle position flags
----------- -------- --------- -------- -------- -------------------
0 52d82be0 52d82b20 52d82e58 14 DEP[01]
==>在dependency部分我們看到,這個cursor依賴的對象handle: 52d82e58
==>這個handle指向的就是EYGLE.EMP表,如果以上兩個CRSR訪問的是同一個對象,
==>那麼這兩個SQL才會是真的共享.這裡我們的sql雖然是相同的,訪問的卻是不同用戶
==>的數據表, 子指針的概念就出來了.
==>在這裡我們看到52d82e58指向的是EYGLE.EMP這個對象,也就是EYGLE所查詢的數據表.
AccessES: count=1 size=16
dependency# types
----------- -----
0 0009
TRANSLATIONS: count=1 size=16
original final
-------- --------
52d82e58 52d82e58
DATA BLOCKS:
data# heap pointer status pins change alloc(K) size(K)
----- -------- -------- ------ ---- ------ -------- --------
0 52d8c244 52d827e4 I/-/A 0 NONE 1.09 1.64
6 52d82ac0 52d817c4 I/-/A 0 NONE 3.70 3.73
==>這裡的Data Blocks是個重要的部分
==>每個控制塊包含一個heap descriptor,指向相應的heap memory,這個heap memory ==>包含的就是Diana Tree,P-Code,Source Code,Shared Cursor Context area等重要==>數據,也就是我們通常。所說的,解析過的SQL及執行計劃樹,真正到這裡以後,sql才得==>以共享.也就真正的避免了硬解析
HEAP DUMP OF DATA BLOCK 0:
******************************************************
HEAP DUMP heap name="library cache" desc=0x52d8c244
extent sz=0x224 alt=32767 het=16 rec=9 flg=2 opc=0
parent=0x5000002c owner=0x52d82a24 nex=(nil) xsz=0x224
EXTENT 0 addr=0x52d80ff0
Chunk 52d80ff8 sz= 464 free " "
52D80FF0 C00001D1 00000000 [........]
52D81000 52D8C28C 52D8C28C 00000000 00000000 [...R...R........]
52D81010 00000000 00000000 00000000 00000000 [................]
Repeat 26 times
52D811C0 00000000 00000000 [........]
Chunk 52d811c8 sz= 76 freeable "kgltbtab "
52D811C0 1000004D 52D80FF8 [M......R]
52D811D0 0A857928 52D82B68 00000000 00000000 [(y..h+.R........]
52D811E0 00000000 00000000 00000000 00000000 [................]
Repeat 2 times
52D81210 00000000 [....]
EXTENT 1 addr=0x52d827cc
Chunk 52d827d4 sz= 540 perm "perm " alo=532
52D827D0 5000021D 00000000 52D82A14 [...P.....*.R]
52D827E0 00000214 00000000 0000001A 00000069 [............i...]
…………….
52D825D0 00000000 00000000 00000000 52D81ACF [...............R]
52D825E0 00000000 00000000 00000000 00000000 [................]
Repeat 3 times
52D82620 52D8263C 00000010 000006D4 00000010 [<&.R............]
52D82630 000006F4 00000010 000006F8 00000010 [................]
52D82640 000006FC 00000010 00000714 00000010 [................]
52D82650 00000720 00000010 0000072C 00000010 [ .......,.......]
52D82660 00000744 00000010 0000074C 00000010 [D.......L.......]
52D82670 00000764 00000010 0000077C [d.......|...]
Permanent space = 3784
MARKS:
Mark 0x52d8237c
******************************************************這裡的handle=52d82e58就是sql依賴的對象信息:
BUCKET 63037:
LIBRARY OBJECT HANDLE: handle=52d82e58
name=EYGLE.EMP
hash=7310f63d timestamp=08-26-2004 10:23:40
namespace=TABL/PRCD/TYPE flags=KGHP/TIM/SML/[02000000]
kkkk-dddd-llll=0000-0501-0501 lock=0 pin=0 latch#=1
lwt=0x52d82e70[0x52d82e70,0x52d82e70] ltm=0x52d82e78[0x52d82e78,0x52d82e78]
pwt=0x52d82e88[0x52d82e88,0x52d82e88] ptm=0x52d82ee0[0x52d82ee0,0x52d82ee0]
ref=0x52d82e60[0x52d82e60, 0x52d82e60] lnd=0x52d82eec[0x52d7dcf0,0x52d89fc8]
LIBRARY OBJECT: object=52d81594
type=TABL flags=EXS/LOC[0005] pflags= [00] status=VALD load=0
==>Type:對象類型,這裡是一張表
==>flags:代表對象狀態
DATA BLOCKS:
data# heap pointer status pins change alloc(K) size(K)
----- -------- -------- ------ ---- ------ -------- --------
0 52d8c1e4 52d8161c I/-/A 0 NONE 0.66 1.09
8 52d81238 52d80a18 I/-/A 0 NONE 1.10 1.13
10 52d8129c 52d80ea0 I/-/A 0 NONE 0.12 0.37
HEAP DUMP OF DATA BLOCK 0:
******************************************************
HEAP DUMP heap name="library cache" desc=0x52d8c1e4
extent sz=0x224 alt=32767 het=16 rec=9 flg=2 opc=0
parent=0x5000002c owner=0x52d81594 nex=(nil) xsz=0x224
==>每個heap descriptor 都包含一個owner部分,指向所有者,這裡的 52d81594 也就==>是EYGLE.EMP指向的Library對象: LIBRARY OBJECT: object=52d81594
EXTENT 0 addr=0x52d81220
Chunk 52d81228 sz= 540 perm "perm " alo=196
52D81220 5000021D 00000000 [...P....]
52D81230 52D81584 000000C4 5000002C 00000824 [...R....,..P$...]
52D81240 52D81594 52D80A00 52D80A08 00000000 [...R...R...R....]
52D81250 00000000 05010200 00000000 00000000 [................]
52D81260 534C474B 61656820 00000070 00000000 [KGLS heap.......]
52D81270 00107FFF 7FFF7FFF 00000401 00000000 [................]
52D81280 52D81280 52D81280 52D8129C 00000000 [...R...R...R....]
52D81290 52D80EA0 00040000 52FF5C14 5000002C [...R.....\.R,..P]
……..
52D80EE0 00000000 00000000 00000000 00000000 [................]
Repeat 15 times
52D80FE0 00000000 [....]
Total heap size = 340
FREE LISTS:
Bucket 0 size=0
Total free space = 0
UNPINNED RECREATABLE CHUNKS (lru first):
PERMANENT CHUNKS:
Chunk 52d80e90 sz= 340 perm "perm " alo=120
52D80E90 50000155 00000000 00000000 00000078 [U..P........x...]
52D80EA0 00000000 00000002 00000068 00000004 [........h.......]
52D80EB0 52D80EB0 52D80EB0 52D80EB8 52D80EB8 [...R...R...R...R]
52D80EC0 00000000 00000000 00000005 52D80ECC [...............R]
52D80ED0 52D80ECC 52D80ED4 52D80ED4 00000000 [...R...R...R....]
52D80EE0 00000000 00000000 00000000 00000000 [................]
Repeat 15 times
52D80FE0 00000000 [....]
Permanent space = 340
******************************************************
BUCKET 63037 total object count=1
 
52D81230 52D81584 000000C4 5000002C 00000824 [...R....,..P$...]
52D81240 52D81594 52D80A00 52D80A08 00000000 [...R...R...R....]
52D81250 00000000 05010200 00000000 00000000 [................]
52D81260 534C474B 61656820 00000070 00000000 [KGLS heap.......]
52D81270 00107FFF 7FFF7FFF 00000401 00000000 [................]
52D81280 52D81280 52D81280 52D8129C 00000000 [...R...R...R....]
52D81290 52D80EA0 00040000 52FF5C14 5000002C [...R.....\.R,..P]
……..
52D80EE0 00000000 00000000 00000000 00000000 [................]
Repeat 15 times
52D80FE0 00000000 [....]
Total heap size = 340
FREE LISTS:
Bucket 0 size=0
Total free space = 0
UNPINNED RECREATABLE CHUNKS (lru first):
PERMANENT CHUNKS:
Chunk 52d80e90 sz= 340 perm "perm " alo=120
52D80E90 50000155 00000000 00000000 00000078 [U..P........x...]
52D80EA0 00000000 00000002 00000068 00000004 [........h.......]
52D80EB0 52D80EB0 52D80EB0 52D80EB8 52D80EB8 [...R...R...R...R]
52D80EC0 00000000 00000000 00000005 52D80ECC [...............R]
52D80ED0 52D80ECC 52D80ED4 52D80ED4 00000000 [...R...R...R....]
52D80EE0 00000000 00000000 00000000 00000000 [................]
Repeat 15 times
52D80FE0 00000000 [....]
Permanent space = 340
******************************************************
BUCKET 63037 total object count=1
Lock在handle上獲得,在pin一個對象之前,必須首先獲得該handle的鎖定.
鎖定主要有三種模式: Null,share,Exclusive.
在讀取訪問對象時,通常需要獲取Null(空)模式以及share(共享)模式的鎖定.
在修改對象時,需要獲得Exclusive(排他)鎖定.
在鎖定了Library Cache對象以後,一個進程在訪問之前必須pin該對象.
同樣pin有三種模式,Null,shared和exclusive.
只讀模式時獲得共享pin,修改模式獲得排他pin.
通常我們訪問、執行過程、Package時獲得的都是共享pin,如果排他pin被持有,那麼數據庫此時就要產生等待.
在很多statspack的report中,我們可能看到以下等待事件:
Top 5 Wait Events
~~~~~~~~~~~~~~~~~ Wait % Total
Event Waits Time (cs) Wt Time
-------------------------------------------- ------------ ------------ -------
library cache lock 75,884 1,409,500 48.44
latch free 34,297,906 1,205,636 41.43
library cache pin 563 142,491 4.90
db file scattered read 146,283 75,871 2.61
enqueue 2,211 13,003 .45 -------------------------------------------------------------
這裡的library cache lock和library cache pin都是我們關心的.接下來我們就研究一下這幾個等待事件.
(一).LIBRARY CACHE PIN等待事件
Oracle文檔上這樣介紹這個等待事件:
"library cache pin" 是用來管理library cache的並發訪問的,pin一個object會引起相應的heap被載入內存中(如果此前沒有被加載),Pins可以在三個模式下獲得:NULL,SHARE,EXCLUSIVE,可以認為pin是一種特定形式的鎖.
當Library Cache Pin等待事件出現時,通常說明該Pin被其他用戶已非兼容模式持有.
"library cache pin"的等待時間為3秒鐘,其中有1秒鐘用於PMON後台進程,即在取得pin之前最多等待3秒鐘,否則就超時."library cache pin"的參數如下,有用的主要是P1和P2:
P1 - KGL Handle address.
P2 - Pin address
P3 - Encoded Mode & Namespace
"LIBRARY CACHE PIN"通常是發生在編譯或重新編譯PL/SQL,VIEW,TYPES等object時.編譯通常都是顯性的,如安裝應用程序,升級,安裝補丁程序等,另外,"ALTER","GRANT","REVOKE"等操作也會使object變得無效, 可以通過object的"LAST_DDL"觀察這些變化.
當object變得無效時,Oracle 會在第一次訪問此object時試圖去重新編譯它,如果此時其他session已經把此object pin到library cache中,就會出現問題,特別時當有大量的活動session並且存在較復雜的dependence時.在某種情況下,重新編譯object可能會花幾個小時時間,從而阻塞其它試圖去訪問此object的進程.
下面讓我們通過一個例子來模擬及解釋這個等待:
1.創建測試用存儲過程
[Oracle@jumper udump]$ sqlplus "/ as sysdba"
SQL*Plus: Release 9.2.0.3.0 - Production on Mon Sep 6 14:16:57 2004
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Connected to an idle instance.
SQL> startup
Oracle instance started.
Total System Global Area 47256168 bytes
Fixed Size 451176 bytes
Variable Size 29360128 bytes
Database Buffers 16777216 bytes
Redo Buffers 667648 bytes
Database mounted.
Database opened.
SQL> create or replace PROCEDURE pining
2 IS
3 BEGIN
4 NULL;
5 END;
6 /
Procedure created.
SQL>
SQL> create or replace procedure calling
2 is
3 begin
4 &nbs
5 dbms_lock.sleep(3000);
6 end;
7 /
Procedure created.
SQL>
2.模擬
首先執行calling過程,在calling過程中調用pining過程。此時pining過程上獲得共享Pin,如果此時嘗試對pining進行授權或重新編譯,將產生Library Cache Pin等待,直到calling執行完畢.
session 1:
[oracle@jumper Oracle]$ sqlplus "/ as sysdba"
SQL*Plus: Release 9.2.0.3.0 - Production on Mon Sep 6 16:13:43 2004
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.3.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.3.0 - Production
SQL> exec calling
此時calling開始執行
session 2:
[Oracle@jumper udump]$ sqlplus "/ as sysdba"
SQL*Plus: Release 9.2.0.3.0 - Production on Mon Sep 6 16:14:16 2004
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Connected to:
Oracle9i Enterprise Edition Release 9.2.0.3.0 - Production
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.3.0 - Production
SQL> grant execute on pining to eygle;
此時session 2掛起
ok,我們開始我們的研究:
從v$session_wait入手,我們可以得到哪些session正在經歷library cache pin的等待
SQL> select sid,seq#,event,p1,p1raw,p2,p2raw,p3,p3raw,state
2 from v$session_wait where event like 'library%';
SID SEQ# EVENT P1 P1RAW P2 P2RAW P3 WAIT_TIME SECONDS_IN_WAIT STATE
---- ---------- ------------------- ---------- -------- ---------- -------- ---------- ---------- --------------- -------------------
8 268 library cache pin 1389785868 52D6730C 1387439312 52B2A4D0 301 0 2 WAITING
等待3秒就超時,seq#會發生變化
SQL>
SID SEQ# EVENT P1 P1RAW P2 P2RAW P3 WAIT_TIME SECONDS_IN_WAIT STATE
---- ---------- ------------------- ---------- -------- ---------- -------- ---------- ---------- --------------- -------------------
8 269 library cache pin 1389785868 52D6730C 1387439312 52B2A4D0 301 0 2 WAITING
SQL>
SID SEQ# EVENT P1 P1RAW P2 P2RAW P3 WAIT_TIME SECONDS_IN_WAIT STATE
---- ---------- ------------------- ---------- -------- ---------- -------- ---------- ---------- --------------- -------------------
8 270 library cache pin 1389785868 52D6730C 1387439312 52B2A4D0 301 0 0 WAITING
在這個輸出中,P1 列是Library Cache Handle Address,Pn字段是10進制表示,PnRaw字段是16進制表示
我們看到,library cache pin等待的對象的handle地址為:52D6730C
通過這個地址,我們查詢X$KGLOB視圖就可以得到對象的具體信息:
col KGLNAOWN for a10
col KGLNAOBJ for a20
select ADDR,KGLHDADR,KGLHDPAR,KGLNAOWN,KGLNAOBJ,KGLNAHSH,KGLHDOBJ
from X$KGLOB
where KGLHDADR ='52D6730C'
/
ADDR KGLHDADR KGLHDPAR KGLNAOWN KGLNAOBJ KGLNAHSH KGLHDOBJ
-------- -------- -------- ---------- -------------------- ---------- --------
404F9FF0 52D6730C 52D6730C SYS PINING 2300250318 52D65BA4
這裡KGLNAHSH代表該對象的Hash Value
由此我們知道,在PINING對象上正經歷library cache pin的等待.
然後我們引入另外一個內部視圖X$KGLPN:
Note:X$KGLPN--[K]ernel [G]eneric [L]ibrary Cache Manager object [P]i[N]s
select a.sid,a.username,a.program,b.addr,b.KGLPNADR,b.KGLPNUSE,b.KGLPNSES,b.KGLPNHDL,
b.kGLPNLCK, b.KGLPNMOD, b.KGLPNREQ
from v$session a,x$kglpn b
where a.saddr=b.kglpnuse and b.kglpnhdl = '52D6730C' and b.KGLPNMOD<>0
/
SID USERNAME PROGRAM ADDR KGLPNADR KGLPNUSE KGLPNSES KGLPNHDL KGLPNLCK KGLPNMOD KGLPNREQ
----- ----------- ---------------------------------------- -------- -------- -------- -------- -------- -------- ---------- ----------
13 SYS [email protected] (TNS V1-V3) 404FA034 52B2A518 51E2013C 51E2013C 52D6730C 52B294C8 2 0
通過聯合v$session,可以獲得當前持有該handle的用戶信息.對於我們的測試sid=13的用戶正持有該handle
那麼這個用戶正在等什麼呢?
SQL> select * from v$session_wait where sid=13;
SID SEQ# EVENT P1TEXT P1 P1RAW P2TEXT P2 P2RAW P3TEXT P3 P3RAW WAIT_TIME SECONDS_IN_WAIT STATE
---------- ---------- ------------------- --------- ---------- -------- ------- ---------- -------- ------- ---------- -------- ---------- --------------- -------
13 25 PL/SQL lock timer duration 120000 0001D4C0 0 00 0 00 0 1200 WAITING
Ok,這個用戶正在等待一次PL/SQL lock timer計時.得到了sid,我們就可以通過v$session.SQL_HASH_VALUE,v$session.SQL_ADDRESS等字段關聯v$sqltext,v$sqlarea等視圖獲得當前session正在執行的操作.
SQL> select sql_text from v$sqlarea where v$sqlarea.hash_value='3045375777';
SQL_TEXT
--------------------------------------------------------------------------------
BEGIN calling; END;
這裡我們得到這個用戶正在執行calling這個存儲過程,接下來的工作就應該去檢查calling在作什麼了.
我們這個calling作的工作是dbms_lock.sleep(3000)也就是PL/SQL lock timer正在等待的原因。
至此就找到了Library Cache Pin的原因.
簡化一下以上查詢:
1.獲得Library Cache Pin等待的對象
SELECT addr, kglhdadr, kglhdpar, kglnaown, kglnaobj, kglnahsh, kglhdobj
FROM x$kglob
WHERE kglhdadr IN (SELECT p1raw
FROM v$session_wait
WHERE event LIKE 'library%')
/
ADDR KGLHDADR KGLHDPAR KGLNAOWN KGLNAOBJ KGLNAHSH KGLHDOBJ
-------- -------- -------- ---------- -------------------- ---------- --------
404F2178 52D6730C 52D6730C SYS PINING 2300250318 52D65BA4
2.獲得持有等待對象的session信息
SELECT a.SID, a.username, a.program, b.addr, b.kglpnadr, b.kglpnuse,
b.kglpnses, b.kglpnhdl, b.kglpnlck, b.kglpnmod, b.kglpnreq
FROM v$session a, x$kglpn b
WHERE a.saddr = b.kglpnuse
AND b.kglpnmod <> 0
AND b.kglpnhdl IN (SELECT p1raw
FROM v$session_wait
WHERE event LIKE 'library%')
/
SQL>
SID USERNAME PROGRAM ADDR KGLPNADR KGLPNUSE KGLPNSES KGLPNHDL KGLPNLCK KGLPNMOD KGLPNREQ
---------- ---------- ------------------------------------------------ -------- -------- -------- -------- -------- -------- ---------- ----------
13 SYS [email protected] (TNS V1-V3) 404F6CA4 52B2A518 51E2013C 51E2013C 52D6730C 52B294C8 2 0
3.獲得持有對象用戶執行的代碼
SELECT sql_text
FROM v$sqlarea
WHERE (v$sqlarea.address, v$sqlarea.hash_value) IN (
SELECT sql_address, sql_hash_value
FROM v$session
WHERE SID IN (
SELECT SID
FROM v$session a, x$kglpn b
WHERE a.saddr = b.kglpnuse
AND b.kglpnmod <> 0
AND b.kglpnhdl IN (SELECT p1raw
FROM v$session_wait
WHERE event LIKE 'library%')))
/
SQL_TEXT
--------------------------------------------------------------------------------
BEGIN calling; END;
在grant之前和之後我們可以轉儲一下shared pool的內容觀察比較一下:
SQL> ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE level 32';
Session altered.
在grant之前:
從前面的查詢獲得pining的Handle是52D6730C:
*****************************************************
BUCKET 67790:
LIBRARY OBJECT HANDLE: handle=52d6730c
name=SYS.PINING
hash=891b08ce timestamp=09-06-2004 16:43:51
namespace=TABL/PRCD/TYPE flags=KGHP/TIM/SML/[02000000]
kkkk-dddd-llll=0000-0011-0011 lock=N pin=S latch#=1
--在Object上存在共享pin
--在handle上存在Null模式鎖定,此模式允許其他用戶繼續以Null/shared模式鎖定該對象
lwt=0x52d67324[0x52d67324,0x52d67324] ltm=0x52d6732c[0x52d6732c,0x52d6732c]
pwt=0x52d6733c[0x52b2a4e8,0x52b2a4e8] ptm=0x52d67394[0x52d67394,0x52d67394]
ref=0x52d67314[0x52d67314, 0x52d67314] lnd=0x52d673a0[0x52d67040,0x52d6afcc]
LIBRARY OBJECT: object=52d65ba4
type=PRCD flags=EXS/LOC[0005] pflags=NST [01] status=VALD load=0
DATA BLOCKS:
data# heap pointer status pins change alloc(K) size(K)
----- -------- -------- ------ ---- ------ -------- --------
0 52d65dac 52d65c90 I/P/A 0 NONE 0.30 0.55
4 52d65c40 52d67c08 I/P/A 1 NONE 0.44 0.48
在發出grant命令後:
******************************************************
BUCKET 67790:
LIBRARY OBJECT HANDLE: handle=52d6730c
name=SYS.PINING
hash=891b08ce timestamp=09-06-2004 16:43:51
namespace=TABL/PRCD/TYPE flags=KGHP/TIM/SML/[02000000]
kkkk-dddd-llll=0000-0011-0011 lock=X pin=S latch#=1
--由於calling執行未完成,在object上仍讓保持共享pin
--由於grant會導致重新編譯該對象,所以在handle上的排他鎖已經被持有
--進一步的需要獲得object上的Exclusive pin,由於shared pin被calling持有,所以library cache pin等待出現.
lwt=0x52d67324[0x52d67324,0x52d67324] ltm=0x52d6732c[0x52d6732c,0x52d6732c]
pwt=0x52d6733c[0x52b2a4e8,0x52b2a4e8] ptm=0x52d67394[0x52d67394,0x52d67394]ref=0x52d67314[0x52d67314, 0x52d67314] lnd=0x52d673a0[0x52d67040,0x52d6afcc]
LIBRARY OBJECT: object=52d65ba4
type=PRCD flags=EXS/LOC[0005] pflags=NST [01] status=VALD load=0
DATA BLOCKS:
data# heap pointer status pins change alloc(K) size(K)
----- -------- -------- ------ ---- ------ -------- --------
0 52d65dac 52d65c90 I/P/A 0 NONE 0.30 0.55
4 52d65c40 52d67c08 I/P/A 1 NONE 0.44 0.48
實際上recompile過程包含以下步驟,我們看一下lock和pin是如何交替發揮作用的:
1.存儲過程的library cache object以排他模式被鎖定,這個鎖定是在handle上獲得的
exclusive鎖定可以防止其他用戶執行同樣的操作,同時防止其他用戶創建新的引用此過程的對象.
2.以shared模式pin該對象,以執行安全和錯誤檢查.
3.共享pin被釋放,重新以排他模式pin該對象,執行重編譯.
4.使所有依賴該過程的對象失效
5.釋放exclusive lock和exclusive pin
(二).LIBRARY CACHE LOCK等待事件
如果此時我們再發出一條grant或compile的命令,那麼library cache lock等待事件將會出現:
session 3:
[oracle@jumper Oracle]$ sqlplus "/ as sysdba"
SQL*Plus: Release 9.2.0.3.0 - Production on Tue Sep 7 17:05:25 2004
Copyright (c) 1982, 2002, Oracle Corporation. All rights reserved.
Connected to:
Oracle9i
With the Partitioning, OLAP and Oracle Data Mining options
JServer Release 9.2.0.3.0 - Production
SQL> alter procedure pining compile;
此進程掛起,我們查詢v$session_wait視圖可以獲得以下信息:
SID SEQ# EVENT P1TEXT P1 P1RAW P2TEXT P2 P2RAW P3TEXT P3 P3RAW WAIT_TIME
---- ---- ------------------- --------------- ---------- -------- ------------ ---------- -------- ---------------- -------------- ---------- ------ ---
11 143 library cache pin handle address 1390239716 52DD5FE4 pin address 1387617456 52B55CB0 100*mode+namespace 301 0000012D 0 6 WAITING
13 18 library cache lock handle address 1390239716 52DD5FE4 lock address 1387433984 52B29000 100*mode+namespace 301 0000012D 0 3 WAITING
8 415 PL/SQL lock timer duration 120000 0001D4C0 0 00 0 00 0 63 WAITING
....
13 rows selected
由於handle上的lock已經被session 2以exclusive模式持有,所以session 3產生了等待。我們可以看到,在生產數據庫中權限的授予、對象的重新編譯都可能會導致library cache pin等待的出現。所以應該盡量避免在高峰期進行以上操作.
另外我們測試的案例本身就說明:如果Package或過程中存在復雜的、交互的依賴以來關系極易導致library cache pin的出現。所以在應用開發的過程中,我們也應該注意這方面的內容。
研究了幾天shared pool,沒想到忽然就撞到問題上來了。作為一個案例寫出來給大家參考一下吧。
問題起因是公司做短信群發,就是那個18萬買的4000字的短信小說(噓,小聲點,我也沒看過...)。群發的時候每隔一段時間就會發生一次消息隊列擁堵的情況,在數據庫內部實際上是向一個數據表中記錄發送日志。
我們介入來檢查數據庫的問題,在一個擁堵時段我開始診斷:
SQL> select sid,event,p1,p1raw from v$session_wait;
SID EVENT P1 P1RAW
---------- ---------------------------------------------------------------- ---------- --------
76 latch free 2147535824 8000CBD0
83 latch free 2147535824 8000CBD0
148 latch free 3415346832 CB920E90
288 latch free 2147535824 8000CBD0
285 latch free 2147535824 8000CBD0
196 latch free 2147535824 8000CBD0
317 latch free 2147535824 8000CBD0
2 pmon timer 300 0000012C
1 rdbms ipc message 300 0000012C
4 rdbms ipc message 300 0000012C
6 rdbms ipc message 180000 0002BF20
SID EVENT P1 P1RAW
---------- ---------------------------------------------------------------- ---------- --------
18 rdbms ipc message 6000 00001770
102 rdbms ipc message 6000 00001770
311 rdbms ipc message 6000 00001770
194 rdbms ipc message 6000 00001770
178 rdbms ipc message 6000 00001770
3 log file parallel write 1 00000001
13 log file sync 2705 00000A91 16 log file sync 2699 00000A8B
104 log file sync 2699 00000A8B
308 log file sync 2694 00000A86
262 log file sync 2705 00000A91
SID EVENT P1 P1RAW
---------- ---------------------------------------------------------------- ---------- --------
172 log file sync 2689 00000A81
169 log file sync 2705 00000A91
108 log file sync 2694 00000A86
38 log file sync 2707 00000A93
34 db file scattered read 63 0000003F
5 smon timer 300 0000012C
27 SQL*Net message to clIEnt 1413697536 54435000
60 SQL*Net message to clIEnt 1413697536 54435000
239 SQL*Net message to clIEnt 1413697536 54435000
...ignore some idle waiting here...
11 SQL*Net message from clIEnt 675562835 28444553
12 SQL*Net message from clIEnt 1413697536 54435000
170 rows selected.
在這次查詢中,我發現大量的latch free等待,再次查詢時這些等待消失,應用也恢復了正常。
SQL> select sid,event,p1,p1raw from v$session_wait where event not like 'SQL*Net%';
---------- ---------------------------------------------------------------- ---------- --------
2 pmon timer 300 0000012C
1 rdbms ipc message 300 0000012C
4 rdbms ipc message 300 0000012C
6 rdbms ipc message 180000 0002BF20
18 rdbms ipc message 6000 00001770
102 rdbms ipc message 6000 00001770
178 rdbms ipc message 6000 00001770
194 rdbms ipc message 6000 00001770
311 rdbms ipc message 6000 00001770
3 log file parallel write 1 00000001
148 log file sync 2547 000009F3
SID EVENT P1 P1RAW
---------- ---------------------------------------------------------------- ---------- --------
273 log file sync 2544 000009F0
190 log file sync 2545 000009F1
5 smon timer 300 0000012C
14 rows selected.
接下來我們來看這些latch free等待的是哪些latch。
SQL> select addr,latch#,name,gets,spin_gets from v$latch order by spin_gets;
ADDR LATCH# NAME GETS SPIN_GETS
-------- ---------- ------------------------------------------------ ----------
80001398 3 session switching 111937 0
80002010 6 longop free list 37214 0
800023A0 7 cached attr list 0 0
80002628 10 event group latch 2391668 0
.....
80003F3C 28 message pool Operations parent latch 3 0
.....
80006030 60 mostly latch-free SCN 19 0
80005F8C 59 file number translation table 68 0
80005F14 58 dlm cr bast queue latch 0 0
80005E8C 57 name-service request 0 0
80005E14 56 name-service memory objects 0 0
80005DA0 55 name-service namespace bucket 0 0
ADDR LATCH# NAME GETS SPIN_GETS
-------- ---------- ------------------------------------------------ ----------
80005D2C 54 name-service pending queue ; 0 0
80005CB4 53 name-service request queue 0 0
80004E08 52 name-service entry 0 0
80008AB0 76 KCL lock element parent latch 0 0
80008A48 75 KCL instance latch 0 0
80007F18 73 redo copy 816 0
80007BBC 71 archive process latch 0 0
80007B54 70 archive control 1 0
80006A10 68 Active checkpoint queue latch 2003308 0
800064B0 66 large memory latch 0 0
80006448 65 cache protection latch 0 0
ADDR LATCH# NAME GETS SPIN_GETS
-------- ---------- ------------------------------------------------ ----------
800060EC 61 batching SCNs 0 0
8000CAB0 96 global transaction 6833807 0
8000CA48 95 global tx free list 58258 0
8000C238 93 cost function 0 0
80009FCC 91 temp lob duration state obj allocation 0 0
8000995C 87 ktm global data 8118 0
80009228 84 transaction branch allocation 282388 0
80008EC4 80 begin backup scn array 6968 0
80008D54 79 loader state object freelist 42712 0
80008B80 78 KCL freelist latch 0 0
80008B18 77 KCL name table latch 0 0
ADDR LATCH# NAME GETS SPIN_GETS
-------- ---------- ------------------------------------------------ ----------
8000D484 118 presentation list 0 0
8000D41C 117 session timer 855944 0
.....
8000E9D0 129 process queue 44 0
8000E900 127 query server freelists 66 0
8000FC84 140 AQ Propagation Scheduling System Load 0 0
8000E898 126 query server process 10 0
8000E27C 125 job_queue_processes parameter latch 111937 0
8000DA1C 124 NLS data objects 2 0
ADDR LATCH# NAME GETS SPIN_GETS
-------- ---------- ------------------------------------------------ ----------
8000D95C 123 ncodef allocation latch 111937 0
8000D674 122 virtual circuits 0 0
8000D60C 121 virtual circuit queues 159877 0
8000D5A4 120 virtual circuit buffers 0 0
8000D4EC 119 address list 2 0
.....
8000CD70 102 Direct I/O Adaptor 2 0
.....>80002408 8 GDS latch 30 0
800092E4 85 sort extent pool 69834 1
8000EC38 132 parallel query alloc buffer 80 1
8000E968 128 error message lists 22 1
80001400 4 process group creation 2615542 2
8000EAA0 131 parallel query stats 14 2
ADDR LATCH# NAME GETS SPIN_GETS
-------- ---------- ------------------------------------------------ ----------
8000CD08 101 Token Manager 1151107 2
8000CB18 97 global tx hash mapping 507846 2
80006378 63 cache buffer handles 315924 4
8000EA38 130 process queue reference 190993 5
80003E3C 26 channel handle pool latch 2391680 18
80003EAC 27 channel Operations parent latch 4783425 24
80009B90 89 intra txn parallel recovery &n
8000FCF8 141 fixed table rows for x$hs_session 161368 41
800012C8 1 process allocation 2391688 154
80009B28 88 parallel txn reco latch 174519 271
8000CCA0 100 library cache load lock 14947545 5958
ADDR LATCH# NAME GETS SPIN_GETS
-------- ---------- ------------------------------------------------ ----------
8000C8D0 94 user lock 13086412 6078
8000914C 82 list of block allocation 120650357 12024
80006A78 69 Checkpoint queue latch 154361751 17686
80009D34 90 sequence cache 64611720 32027
80009090 81 dml lock allocation 234465024 45351
800091C0 83 transaction allocation 214227648 48345
800096AC 86 undo global data 188271244 49641
800028A0 13 enqueue hash chains 373244264 131322
80007E04 72 redo allocation 439389808 201498
80001468 5 session idle bit 2039097976 204969
80002838 12 enqueues 471338482 273695
ADDR LATCH# NAME GETS SPIN_GETS
-------- ---------- ------------------------------------------------ ----------
80001330 2 session allocation 261826230 428312
800063E0 64 multiblock read objects 1380614923 1366278
800026B8 11 messages 207935758 1372606
80001218 0 latch wait list 203479569 1445342
80006310 62 cache buffers chains 3.8472E+10 2521699
8000A17C 92 row cache objects 1257586714 2555872
80007F80 74 redo writing 264722932 4458044
80006700 67 cache buffers lru chain 5664313769 30046921
8000CBD0 98 shared pool 122433688 59070585
8000CC38 99 library cache 4414533796 1037032730
142 rows selected.
SQL> select startup_time from v$instance;
STARTUP_T---------
13-AUG-04
檢查數據庫啟動時間
我們注意到,在當前數據庫中競爭最嚴重的兩個latch是shared pool和library cache.
顯然這極有可能是SQL的過度解析造成的。進一步我們檢查v$sqlarea發現:
SQL> select sql_text,VERSION_COUNT,INVALIDATIONS,PARSE_CALLS,OPTIMIZER_MODE,
PARSING_USER_ID,PARSING_SCHEMA_ID,ADDRESS,HASH_VALUE
from v$sqlarea where version_count >1000;
2
SQL_TEXT
------------------------------------------------------------------------------------------------------------------------
VERSION_COUNT INVALIDATIONS PARSE_CALLS OPTIMIZER_MODE PARSING_USER_ID
PARSING_SCHEMA_ID ADDRESS HASH_VALUE
------------- ------------- ----------- ------------------------- --------------- ----------------- -------- ----------
insert into sms_log (MSGDATE,MSGTIME,MSGID,MSGKIND,MSGTYPE,MSGTYPE_MOMT,
MSGLEN,MSGSTATUS,AREAID,IFIDDEST,IFIDSRC,ADDRSRC,ADDRDEST,ADDRFEE,
ADDRUSER,SERVICECODE,PLANID,FEETYPE,FEEVALUE,DATACODING,FLAGS,SMLEN,
SMCONT) values (:b0,:b1,:b2,:b3,:b4,:b5,:b6,:b7,:b8,:b9,:b10,:b11,:b12,:b13,:b14,:b15,:b16,:b17,:b18,:b19,:b20,:b21,:b22)
7023 0 1596 MULTIPLE CHILDREN PRESENT 36 36 C82AF1C8 3974744754
這就是寫日志記錄的代碼,這段代碼使用了綁定變量,但是version_count卻有7023個。也就是這個sql有7023個子指針.這是不可想象的。
通過前面幾節的研究我們知道,如果這個sql有7023個子指針,那麼意味著這些子指針都將存在於同一個Bucket的鏈表上。那麼這也就意味著,如果同樣SQL再次執行,Oracle將不得不搜索這個鏈表以尋找可以共享的SQL。這將導致大量的library cache latch的競爭。
這時候我開始猜測原因:
1.可能代碼存在問題,在每次執行之前程序修改某些session參數,導致sql不能共性
2.可能是8.1.5的v$sqlarea記錄存在問題,我們看到的結果是假象:)
3.Bug
Ok,我們的診斷不能停。最直接的我dump內存來看:
SQL> ALTER SESSION SET EVENTS 'immediate trace name LIBRARY_CACHE leve
察看trace文件得到如下結果(摘錄包含該段代碼的片斷):
BUCKET 21049:
LIBRARY OBJECT HANDLE: handle=c82af1c8
name=
insert into sms_log (MSGDATE,MSGTIME,MSGID,MSGKIND,MSGTYPE,MSGTYPE_MOMT,MSGLEN,
MSGSTATUS,AREAID,IFIDDEST,IFIDSRC,ADDRSRC,ADDRDEST,ADDRFEE,ADDRUSER,
SERVICECODE,PLANID,FEETYPE,FEEVALUE,DATACODING,FLAGS,SMLEN,SMCONT) values
(:b0,:b1,:b2,:b3,:b4,:b5,:b6,:b7,:b8,:b9,:b10,:b11,:b12,:b13,:b14,:b15,:b16,:b17,:b18,:b19,:b20,:b21,:b22)
hash=ece9cab2 timestamp=09-09-2004 12:51:29
namespace=CRSR flags=RON/TIM/PN0/LRG/[10010001]
kkkk-dddd-llll=0000-0001-0001 lock=N pin=S latch=5
lwt=c82af1e0[c82af1e0,c82af1e0] ltm=c82af1e8[c82af1e8,c82af1e8]
pwt=c82af1f8[c82af1f8,c82af1f8] ptm=c82af250[c82af250,c82af250]
ref=c82af1d0[c82af1d0,c82af1d0]
LIBRARY OBJECT: object=c1588e84
type=CRSR flags=EXS[0001] pflags= [00] status=VALD load=0
CHILDREN: size=7024
child# table reference handle
------ -------- --------- --------
0 c1589040 c1589008 c668c2bc
1 c1589040 bfd179c4 c6ec9ee8
2 c1589040 bfd179e0 c2dd9b3c
3 c1589040 bfd179fc c5a46614
4 c1589040 bfd17a18 c35f1388
5 c1589040 bfd17a34 c77401bc
6 c1589040 bfd17a50 c4092838
7 c1589040 bfddb310 c6cd5258
8 c1589040 bfddb32c c63c6650
9 c1589040 bfddb348 c7e4e3d0
10 c1589040 bfddb364 c4c4b110
11 c1589040 bfddb380 c5950348
12 c1589040 bfddb39c c6c33aa4
13 c1589040 bfddb3b8 c672b0bc
...........................................
.....ignore losts of child cursor here................................................
7001 bf595bc8 c641fba0 c6467890
7002 bf595bc8 c641fbbc c3417168
7003 bf595bc8 c641fbd8 c3417bb0
7004 bf595bc8 c641fbf4 c2fdccbc
7005 bf595bc8 c641fc10 c7f7ca50
7006 bf595bc8 c641fc2c c7f508ec
7007 bf595bc8 c641fc48 c268d8d8
7008 c641fcb8 c641fc64 bec61ed8
7009 c641fcb8 c641fc80 c4a6cc5c
7010 c641fcb8 c641fc9c c1a8aa34
7011 c641fcb8 c0ae4ea0 c0ae4ddc
7012 c641fcb8 c0ae4ebc bd55fe60
7013 c641fcb8 c0ae4ed8 c226914c
7014 c641fcb8 c0ae4ef4 c51dd2e0
7015 c641fcb8 c0ae4f10 c480c468
7016 c641fcb8 c0ae4f2c c60196d0
7017 c641fcb8 c0ae4f48 c4675d2c
7018 c641fcb8 c0ae4f64 bd5e2750
7019 c641fcb8 c0ae4f80 c09b1bb0
7020 c641fcb8 c0ae4f9c bf2d6044
7021 c641fcb8 c0ae4fb8 c332c1c4
7022 c641fcb8 c0ae4fd4 cbdde0f8
DATA BLOCKS:
data# heap pointer status pins change
----- -------- -------- ------ ---- ------
0 c3ef2c50 c1588f08 I/P/A 0 NONE
這裡確實存在7023個子指針
查詢v$sql得到相同的結果:
SQL> select CHILD_NUMBER,EXECUTIONS,OPTIMIZER_MODE,OPTIMIZER_COST,PARSING_USER_ID,
2 PARSING_SCHEMA_ID,ADDRESS,HASH_VALUE
3 from v$sql where HASH_VALUE='3974744754';
CHILD_NUMBER EXECUTIONS OPTIMIZER_ OPTIMIZER_COST PARSING_USER_ID PARSING_SCHEMA_ID ADDRESS HASH_VALUE
------------ ---------- ------ ---------- ------------- ------------ -------- ----------
1 7111 CHOOSE 238150 36 36 C82AF1C8 3974744754
2 9160 CHOOSE 238150 36 36 C82AF1C8 3974744754
3 9127 CHOOSE 238150 36 36 C82AF1C8 3974744754
4 8109 CHOOSE 238150 36 36 C82AF1C8 3974744754
5 4386 CHOOSE 238150 36 36 C82AF1C8 3974744754
6 4913 CHOOSE 238150 36 36 C82AF1C8 3974744754
7 3764 CHOOSE 238150 36 36 C82AF1C8 3974744754
8 3287 CHOOSE 238150 36 36 C82AF1C8 3974744754
9 3156 CHOOSE 238150 36 36 C82AF1C8 3974744754
.....
7015 1 CHOOSE 238150 36 36 C82AF1C8 3974744754
7016 1 CHOOSE 238150 36 36 C82AF1C8 3974744754
7017 0 CHOOSE 238150 36 36 C82AF1C8 3974744754
7018 9396 NONE 0 0 C82AF1C8 3974744754
7019 5008 CHOOSE 237913 36 36 C82AF1C8 39747447547020 625 CHOOSE 237913 36 36 C82AF1C8 3974744754
7021 10101 CHOOSE 237913 36 36 C82AF1C8 3974744754
7022 7859 CHOOSE 237913 36 36 C82AF1C8 3974744754
7023 rows selected.
這裡確實存在7023個子指針,第二種猜測被否定了,同時研發發過來的代碼也不存在第一種情況。那麼只能是第三種情況了,Oracle的Bug,Ok,那我們需要找到解決辦法。
搜索Metalink,發現Bug:1210242,該Bug描述為:
On certain SQL statements cursors are not shared when TIMED_STATISTICS is enabled.
碰巧我這個數據庫的TIMED_STATISTICS設置為True,修改TIMED_STATISTICS為False以後,觀察v$sql,發現有效子指針很快下降到2個。
SQL> select CHILD_NUMBER,OPTIMIZER_COST,OPTIMIZER_MODE,EXECUTIONS,ADDRESS from v$sql where hash_value=3974744754 and OPTIMIZER_MODE='CHOOSE';
CHILD_NUMBER OPTIMIZER_COST OPTIMIZER_ EXECUTIONS ADDRESS
------------ -------------- ---------- ---------- --------
0 238167 CHOOSE 63943 C82AF1C8
1 238300 CHOOSE 28915 C82AF1C8
第二天下降到只有一個.
SQL> select CHILD_NUMBER,OPTIMIZER_COST,OPTIMIZER_MODE,EXECUTIONS,ADDRESS from v$sql where hash_value=3974744754 and OPTIMIZER_MODE='CHOOSE';
CHILD_NUMBER OPTIMIZER_COST OPTIMIZER_ EXECUTIONS ADDRESS
------------ -------------- ---------- ---------- --------
0 238702 CHOOSE 578124 C82AF1C8
短信群發從此正常.
對於這個問題,另外一個可選的方法是設置一個隱含參數:
_sqlexec_progression_cost = 0
這個參數的具體含義為:
SQL execution progression monitoring cost threshold
即:SQL執行進度監控成本閥值
這個參數根據COST來決定需要監控的SQL.執行進度監控會引入額外的函數調用和Row Sources這可能導致SQL的執行計劃或成本發生改變,從而產生不同的子指針.
_sqlexec_progression_cost 的缺省值為1000,成本大於1000的所有SQL都會被監控
如果該參數設置為0,那麼SQL的執行進度將不會被跟蹤.
執行進度監控信息會被記錄到V$SESSION_LONGOPS視圖中,如果Time_statistics參數設置為False,那麼這個信息就不會被記錄.
所以,Time_statistics參數和_sqlexec_progression_cost是解決問題的兩個途徑。
通過查詢我們也可以看到,在這個數據庫中,OPTIMIZER_COST >1000的SQL主要有以下五個:
SQL_TEXT
--------------------------------------------------------------------------------
insert into sms_detail_error (msgdate,addruser,msgid,areaid,reason,spnumber,msgt
ime,ifiddest,msqkey,servicecode,planid,feetype,feevalue,smcont,submittimes,submi
tdate,submittime,msgstate_resp,errorcode_resp,msgstate_rept,errorcode_rept) valu
es (:b0,:b1,:b2,:b3,:b4,:b5,:b6,:b7,:b8,:b9,:b10,:b11,:b12,:b13,:b14,:b15,:b16,:
b17,:b18,:b19,:b20)
insert into sms_detail_success (msgdate,addruser,msgid,areaid,spnumber,msgtime,i
fiddest,servicecode,planid,feetype,feevalue,smcont,submittimes,submitdate,submit
time,respdate,resptime,reptdate,repttime,msqkey) values (:b0,:b1,:b2,:b3,:b4,:b5
,:b6,:b7,:b8,:b9,:b10,:b11,:b12,:b13,:b14,:b15,:b16,:b17,:b18,:b19)
insert into sms_log (MSGDATE,MSGTIME,MSGID,MSGKIND,MSGTYPE,MSGTYPE_MOMT,
MSGLEN,MSGSTATUS,AREAID,IFIDDEST,IFIDSRC,ADDRSRC,ADDRDEST,ADDRFEE,ADDRUSER,
SERVICECODE,PLANID,FEETYPE,FEEVALUE,DATACODING,FLAGS,SMLEN,SMCONT) values (:b0,:b1,:b2,:b3,:b
4,:b5,:b6,:b7,:b8,:b9,:b10,:b11,:b12,:b13,:b14,:b15,:b16,:b17,:b18,:b19,:b20,:b2
1,:b22)
insert into sms_resprept_error (msgdate,areaid,addruser,msgid,submittimes,submit
date,submittime,msgid_gw,msgstate_resp,errorcode_resp,msgstate_rept,errorcode_re
pt,servicecode) values (:b0,:b1,:b2,:b3,:b4,:b5,:b6,:b7,:b8,:b9,:b10,:b11,:b12)
insert into sms_statusrept (reptdate,addruser,msgid_gw,repttime,statustype,msgid
_stus,msgstate,errorcode) values (:b0,:b1,:b2,:b3,:b4,:b5,:b6,:b7)
而這五個SQL中,在v$sqlarea中,有四個version_count都在10以上:
SQL> select sql_text,version_count from v$sqlarea where version_count>10;
SQL_TEXT VERSION_COUNT
--------------------------------------------------------------------------------
insert into sms_detail_error (msgdate,addruser,msgid,areaid,reason,spnumber,msgt
ime,ifiddest,msqkey,servicecode,planid,feetype,feevalue,smcont,submittimes,submi
tdate,submittime,msgstate_resp,errorcode_resp,msgstate_rept,errorcode_rept) valu
es (:b0,:b1,:b2,:b3,:b4,:b5,:b6,:b7,:b8,:b9,:b10,:b11,:b12,:b13,:b14,:b15,:b16,:b17,:b18,:b19,:b20)
42
insert into sms_log (MSGDATE,MSGTIME,MSGID,MSGKIND,MSGTYPE,MSGTYPE_MOMT,MSGLEN,
MSGSTATUS,AREAID,IFIDDEST,IFIDSRC,ADDRSRC,ADDRDEST,ADDRFEE,ADDRUSER,
SERVICECODE,PLANID,FEETYPE,FEEVALUE,DATACODING,FLAGS,SMLEN,SMCONT) values (:b0,:b1,:b2,:b3,:b4,:b5,:b6,:b7,:b8,:b9,:b10,:b11,:b12,:b13,:b14,:b15,:b16,:b17,:b18,:b19,:b20,:b21,:b22)
7026
insert into sms_resprept_error (msgdate,areaid,addruser,msgid,submittimes,submit
date,submittime,msgid_gw,msgstate_resp,errorcode_resp,msgstate_rept,errorcode_re
pt,servicecode) values (:b0,:b1,:b2,:b3,:b4,:b5,:b6,:b7,:b8,:b9,:b10,:b11,:b12)
301
insert into sms_statusrept (reptdate,addruser,msgid_gw,repttime,statustype,msgid
_stus,msgstate,errorcode) values (:b0,:b1,:b2,:b3,:b4,:b5,:b6,:b7)
41
具體可以參考Metalink: Note 62143
至此這個關於shared pool的問題找到了原因,並得以及時解決。(End)
編者注:您可以通過以下鏈接參與關於本文的討論,直接和作者對話:
http://www.eygle.com/index-internal.htm 或