程序師世界是廣大編程愛好者互助、分享、學習的平台,程序師世界有你更精彩!
首頁
編程語言
C語言|JAVA編程
Python編程
網頁編程
ASP編程|PHP編程
JSP編程
數據庫知識
MYSQL數據庫|SqlServer數據庫
Oracle數據庫|DB2數據庫
 程式師世界 >> 數據庫知識 >> Oracle數據庫 >> Oracle數據庫基礎 >> 為1.7億張記錄表創建快速索引

為1.7億張記錄表創建快速索引

編輯:Oracle數據庫基礎

本文講述了在大表上創建索引需要注意的事項,以及整個過程。



  需求:在STAT_SUBMIT_CENTER表的RECORDTIME字段上面創建一索引。
  環境:SunOS 5.9
     Oracle 9204
     8 cpu
     3G mem

  1. 查看表的具體情況

  是不是分區表,有多少個分區,分區字段:

SQL> col table_name for a20SQL> col column_name for a20SQL> select a.table_name,a.partitioned,b.partition_count,c.column_name  2  from user_tables a, user_part_tables b, user_part_key_columns c  3  where a.table_name='STAT_SUBMIT_CENTER'  4  and b.table_name='STAT_SUBMIT_CENTER'  5  and c.name='STAT_SUBMIT_CENTER';TABLE_NAME           PAR PARTITION_COUNT COLUMN_NAME-------------------- --- --------------- --------------------STAT_SUBMIT_CENTER   YES              50 MSGDATE


  已使用的每個分區的大小:

SQL> select segment_name,partition_name,round(bytes/1024/1024) from user_segments where segment_name ='STAT_SUBMIT_CENTER' and bytes/1024/1024>0.25 order by 3 desc;SEGMENT_NAME               PARTITION_NAME                 ROUND(BYTES/1024/1024)-------------------------- ------------------------------ ----------------------STAT_SUBMIT_CENTER         STAT_SUBMIT_CENTER_20051101                      1722STAT_SUBMIT_CENTER         STAT_SUBMIT_CENTER_20051021                      1488STAT_SUBMIT_CENTER         STAT_SUBMIT_CENTER_20051111                      1440STAT_SUBMIT_CENTER         STAT_SUBMIT_CENTER_20051121                      1355STAT_SUBMIT_CENTER         STAT_SUBMIT_CENTER_20051221                      1335STAT_SUBMIT_CENTER         STAT_SUBMIT_CENTER_20050911                      1309STAT_SUBMIT_CENTER         STAT_SUBMIT_CENTER_20051211                      1253STAT_SUBMIT_CENTER         STAT_SUBMIT_CENTER_20051201                      1247STAT_SUBMIT_CENTER         STAT_SUBMIT_CENTER_20050921                      1198STAT_SUBMIT_CENTER         STAT_SUBMIT_CENTER_20060101                      1151STAT_SUBMIT_CENTER         STAT_SUBMIT_CENTER_20060111                      1068STAT_SUBMIT_CENTER         STAT_SUBMIT_CENTER_20051001                      1018STAT_SUBMIT_CENTER         STAT_SUBMIT_CENTER_20051011                       865STAT_SUBMIT_CENTER         STAT_SUBMIT_CENTER_20060121                       79614 rows selected.

  整個表的大小:

SQL> select segment_name,sum(bytes/1024/1024) from user_segments where segment_name ='STAT_SUBMIT_CENTER' group by segment_name;SEGMENT_NAME                     SUM(BYTES/1024/1024)-------------------------------- --------------------STAT_SUBMIT_CENTER                              17234

  表的記錄數:

SQL> set timing onSQL> select count(*) from STAT_SUBMIT_CENTER;  COUNT(*)---------- 170341007Elapsed: 00:14:18.60

  還有這個表上的索引情況如下:

table   STAT_SUBMIT_CENTER             17234 Mindex   IDX_SUBCEN_ADDRUSER             5155 M        ADDRUSER    PK_STAT_SUBMIT_CENTER          10653 M        MSGDATE,ADDRUSER,MSGID



  然後查看一些數據庫參數情況:

SQL> show  parameter   workNAME                                 TYPE        VALUE------------------------------------ ----------- ------------------------------workarea_size_policy                 string      AUTOSQL> show parameter pgaNAME                                 TYPE        VALUE------------------------------------ ----------- ------------------------------pga_aggregate_target                 big integer 209715200SQL> select * from dba_temp_files;FILE_NAME------------------------------------------------------------------------------------------------------------------------   FILE_ID TABLESPACE_NAME                     BYTES     BLOCKS STATUS    RELATIVE_FNO AUT   MAXBYTES  MAXBLOCKS---------- ------------------------------ ---------- ---------- --------- ------------ --- ---------- ----------INCREMENT_BY USER_BYTES USER_BLOCKS------------ ---------- -----------/bgdata/Oracle/temp01.dbf         1 TEMP                           3563061248     434944 AVAILABLE            1 YES 4294967296     524288        6400 3562012672      434816


  2. 需要考慮的幾個方面

  1)創建的索引需要幾個G的磁盤空間 

  2)創建索引需要排序,使用pga_aggregate_target,要把這個值從200M加大到2G 

  3)如果內存不夠,需要temp表空間,把temp表空間加大到8G。itpub上有一個帖子說過,15億條記錄用了34G空間。 

  4)在線創建,時間會比較長。討論後,停這個表的操作,非online創建。 

  3. 實際操作過程

  1)數據文件夠,不擴展;temp數據文件擴展:

alter database tempfile '/bgdata/Oracle/temp01.dbf' resize 8192m; 

  2)在workarea_size_policy=AUTO的情況下,改pga_aggregate_target=2048m,對於串行操作,一個session能使用的pga=MIN(5%PGA_AGGREGATE_TARGET,100MB),這樣可以使得pga用到最大的值:

alter system set pga_aggregate_target=2048m; 

  3)因為這是一個比較長的過程,所以寫腳本讓後台運行:

nohup time createind.sh &vi createind.sh#!/bin/shsqlplus user/passWord <<EOFcreate index IDX_SUBMIT_RECORDTIME on STAT_SUBMIT_CENTER(RECORDTIME) local;exitEOF 

  4)創建過程中可以觀察v$sort_segment,v$sort_usage看排序情況:

select tablespace_name,current_users,total_blocks,used_blocks,free_blocks from v$sort_segment;select * from v$sort_usage; 

  5)創建完成後,把tempfile和pga_aggregate_target改回原值:

alter database tempfile '/bgdata/Oracle/temp01.dbf' resize 4096m;alter system set pga_aggregate_target=500m; 

  4. 實際創建過程中觀察到的情況

  1)開始之前:

SQL> select tablespace_name,current_users,total_blocks,used_blocks,free_blocks from v$sort_segment;TABLESPACE_NAME CURRENT_USERS TOTAL_BLOCKS USED_BLOCKS FREE_BLOCKS------------------------------- ------------- ------------ ----------- -----------TEMP 0 431360 0 431360SQL> select * from v$sort_usage;no rows selected 

  2)創建之初,抓到這麼一條sql:

insert into obj$(owner#,name,namespace,obj#,type#,ctime,mtime,stime,status,remoteowner,linkname,subname,dataobj#,flags,oid$,spare1,spare2)values(:1,:2,:3,:4,:5,:6,:7,:8,:9,:10,:11,:12,:13,:14,:15,:16, :17) 

  3)然後v$sort_segment.USED_BLOCKS變大,v$sort_usage.BLOCKS變大,一直增長到:

SQL> select tablespace_name,current_users,total_blocks,used_blocks,free_blocks from v$sort_segment;TABLESPACE_NAME CURRENT_USERS TOTAL_BLOCKS USED_BLOCKS FREE_BLOCKS------------------------------- ------------- ------------ ----------- -----------TEMP 1 431360 46720 384640SQL> select * from v$sort_usage;USERNAME USER SESSION_ADDR SESSION_NUM SQLADDR SQLHASH------------------------------ ------------------------------ ---------------- ----------- ---------------- ----------TABLESPACE CONTENTS SEGTYPE SEGFILE# SEGBLK# EXTENTS BLOCKS SEGRFNO#------------------------------- --------- --------- ---------- ---------- ---------- ---------- ----------DPC DPC 00000003974CFFB0 6134 0000000399CAB288 1254950678TEMP TEMPORARY SORT 201 431113 365 46720 1這個過程中抓到的sql:select file# from file$ where ts#=:1 

  4)v$sort_segment.USED_BLOCKS變為0,v$sort_usage.BLOCKS變為0 

  5)重復3,4兩步,估計這個是創建一個分區的索引 

  需要解釋一下的是,上面的sql只是我隨機抓到的運行時間比較長的,整個create index過程會復雜很多,具體怎麼樣可以用sqltrace跟蹤。這裡主要看的是temp表空間的使用情況。 


  同時,在創建的過程中:

SQL> select segment_name,partition_name from user_segments where segment_name='IDX_SUBMIT_RECORDTIME';no rows selectedSQL> select index_name,partition_name from user_ind_partitions where INDEX_NAME='IDX_SUBMIT_RECORDTIME';no rows selected 

  當時忘了查user_segments中其實是有一個segment_name為一串數字的記錄,那個才是正在創建的索引;如果這個事務失敗了,將回滾。 


  最後耗時99分鐘完成。

  5. 創建完成後分析索引

  但是接下來還有一件事。創建完成後要分析索引,否則就是走了索引,查詢也巨慢無比。

SQL> explain plan for select count(*) from stat_submit_center where recordtime>trunc(sysdate);Explained.SQL> @?/rdbms/admin/utlxplp.sqlPLAN_TABLE_OUTPUT-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| Id | Operation | Name | Rows | Bytes | Cost | Pstart| Pstop |-------------------------------------------------------------------------------------------------| 0 | SELECT STATEMENT | | 1 | 9 | 4 | | || 1 | SORT AGGREGATE | | 1 | 9 | | | || 2 | PARTITION RANGE ALL | | | | | 1 | 50 ||* 3 | INDEX FAST FULL SCAN| IDX_SUBMIT_RECORDTIME | 8878K| 76M| 4 | 1 | 50 |-------------------------------------------------------------------------------------------------Predicate Information (identifIEd by Operation id):---------------------------------------------------3 - filter("STAT_SUBMIT_CENTER"."RECORDTIME">TRUNC(SYSDATE@!))Note: cpu costing is off16 rows selected.SQL> set autotrace on explainSQL> set timing onSQL> select count(*) from stat_submit_center where recordtime>trunc(sysdate);aa^Cselect count(*) from stat_submit_center where recordtime>trunc(sysdate)*ERROR at line 1:ORA-01013: user requested cancel of current OperationElapsed: 00:11:49.85SQL>SQL> set autotrace off 

  上面可以看到,因為沒有分析索引,雖然它走的是新建的IDX_SUBMIT_RECORDTIME這個索引,但是查詢巨慢,10分鐘後也沒有結果。下面我們分析一下。

SQL> Analyze index IDX_SUBMIT_RECORDTIME estimate statistics;Index analyzed.Elapsed: 00:00:06.84SQL> set autotrace on explainSQL> select count(*) from stat_submit_center where recordtime>trunc(sysdate);COUNT(*)----------926736Elapsed: 00:00:05.37Execution Plan----------------------------------------------------------0 SELECT STATEMENT Optimizer=CHOOSE (Cost=4360 Card=1 Bytes=9)1 0 SORT (AGGREGATE)2 1 PARTITION RANGE (ALL)3 2 INDEX (RANGE SCAN) OF 'IDX_SUBMIT_RECORDTIME' (NON-UNIQUE) (Cost=4360 Card=8878740 Bytes=79908660)SQL> set autotrace off 

  索引分析之後,查詢時間在5分鐘,效率大大提高。
  1. 上一頁:
  2. 下一頁:
Copyright © 程式師世界 All Rights Reserved