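This article covers the following index access methods:
1. Index unique scan (INDEX UNIQUE SCAN)
2. Index range scan (INDEX RANGE SCAN)
3. Index full scan (INDEX FULL SCAN)
4. Index skip scan (INDEX SKIP SCAN)
5. Index fast full scan (INDEX FAST FULL SCAN)
Index unique scan (INDEX UNIQUE SCAN)
An index unique scan returns at most one row for a given value. The optimizer usually chooses it when the query predicate uses the columns of a UNIQUE or PRIMARY KEY index as an equality condition. The number of blocks visited is always the height of the index plus one, except in a few special cases such as LOBs that are stored out of line.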
SQL> set autotrace traceonly explain
SQL> select * from hr.employees where employee_id = 100;
Execution Plan
----------------------------------------------------------
Plan hash value: 1833546154
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 69 | 1 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| EMPLOYEES | 1 | 69 | 1 (0)| 00:00:01 |
|* 2 | INDEX UNIQUE SCAN | EMP_EMP_ID_PK | 1 | | 0 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("EMPLOYEE_ID"=100)
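The "index height plus one" rule can be pictured with a toy model. The sketch below is only an illustration in Python, not how Oracle stores or reads an index: the two-level structure, the keys, and the rowid strings are all made up. It counts one block visit per index level and one more for the table block reached through the rowid.

```python
from bisect import bisect_right

# Toy two-level index (height 2): one root block above two sorted leaf blocks.
# Keys and rowids are invented for illustration.
LEAVES = [
    [(100, "rowid-a"), (101, "rowid-b")],
    [(102, "rowid-c"), (103, "rowid-d")],
]
ROOT = [leaf[0][0] for leaf in LEAVES]  # lowest key in each leaf: [100, 102]
TABLE = {"rowid-a": "row 100", "rowid-b": "row 101",
         "rowid-c": "row 102", "rowid-d": "row 103"}


def unique_scan(key):
    """Return (row, blocks_visited) for an equality lookup on a unique key."""
    blocks = 1  # read the root block
    leaf_no = bisect_right(ROOT, key) - 1
    if leaf_no < 0:
        return None, blocks  # key is below the lowest indexed value
    blocks += 1  # read one leaf block
    for k, rowid in LEAVES[leaf_no]:
        if k == key:
            blocks += 1  # one table block, located through the rowid
            return TABLE[rowid], blocks
    return None, blocks


print(unique_scan(102))  # ('row 102', 3): index height 2, plus 1
print(unique_scan(999))  # (None, 2): no match, so no table access
```

A lookup that finds no matching key stops at the leaf block, which is why the second call reports only two block visits.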
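Index range scan (INDEX RANGE SCAN)
The optimizer chooses an index range scan when the predicate contains a condition that returns a range of data. The index can be unique or non-unique, and the condition can use operators such as <, >, LIKE, BETWEEN and =. With LIKE, however, a pattern that starts with the wildcard % will most likely not use a range scan, because the condition no longer bounds a range of index keys. Here is an example: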
SQL> select * from hr.employees where DEPARTMENT_ID = 30;
6 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 2056577954
-------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 6 | 414 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| EMPLOYEES | 6 | 414 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | EMP_DEPARTMENT_IX | 6 | | 1 (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("DEPARTMENT_ID"=30)
Statistics
----------------------------------------------------------
8 recursive calls
0 db block gets
7 consistent gets
1 physical reads
0 redo size
1716 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
6 rows processed
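As a rough mental model of the range scan above, the leaf level of a non-unique index can be treated as a list of (key, rowid) pairs kept in key order. The Python sketch below is a simplification under that assumption, with invented keys and rowids: it finds the first matching entry by binary search and then reads forward until the key leaves the range.

```python
from bisect import bisect_left, bisect_right

# Index entries as (key, rowid) pairs in key order (invented sample data).
INDEX = sorted([(30, "r1"), (50, "r2"), (30, "r3"),
                (90, "r4"), (50, "r5"), (30, "r6")])
KEYS = [key for key, _ in INDEX]


def range_scan(low, high):
    """Return the rowids of all entries with low <= key <= high."""
    start = bisect_left(KEYS, low)    # first entry with key >= low
    stop = bisect_right(KEYS, high)   # one past the last entry with key <= high
    return [rowid for _, rowid in INDEX[start:stop]]


print(range_scan(30, 30))  # ['r1', 'r3', 'r6']
print(range_scan(40, 90))  # ['r2', 'r5', 'r4']
```

An equality condition on a non-unique index is just the special case where both bounds are the same value, which matches the access("DEPARTMENT_ID"=30) predicate in the plan.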
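Whether a range scan is chosen depends on an accurate estimate of the number of rows the condition returns; the larger the range, the more likely the optimizer is to fall back to a full table scan: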
SQL> select department_id,count(*) from hr.employees group by department_id order by count(*);
DEPARTMENT_ID COUNT(*)
------------- ----------
10 1
40 1
1
70 1
20 2
110 2
90 3
60 5
30 6
100 6
80 34
50 45
12 rows selected.
-- Use 50, the department with the most rows, to attempt a range scan
SQL> set autotrace traceonly explain
SQL> select * from hr.employees where DEPARTMENT_ID = 50;
45 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1445457117
-------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 45 | 3105 | 3 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| EMPLOYEES | 45 | 3105 | 3 (0)| 00:00:01 |
-------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("DEPARTMENT_ID"=50)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
10 consistent gets
0 physical reads
0 redo size
4733 bytes sent via SQL*Net to client
545 bytes received via SQL*Net from client
4 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
45 rows processed
As you can see, when the range returns a large share of the rows, the optimizer still chooses a full table scan.
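One optimization for index range scans is to use an ascending index to return rows in descending order. This mostly happens when the query has an ORDER BY clause on the indexed column, and it saves a sort operation, as shown here: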
SQL> set autotrace traceonly explain
SQL> select * from hr.employees
2 where department_id in (90, 100)
3 order by department_id desc;
Execution Plan
----------------------------------------------------------
Plan hash value: 3707994525
---------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 9 | 621 | 2 (0)| 00:00:01 |
| 1 | INLIST ITERATOR | | | | | |
| 2 | TABLE ACCESS BY INDEX ROWID | EMPLOYEES | 9 | 621 | 2 (0)| 00:00:01 |
|* 3 | INDEX RANGE SCAN DESCENDING| EMP_DEPARTMENT_IX | 9 | | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - access("DEPARTMENT_ID"=90 OR "DEPARTMENT_ID"=100)
In the example above, the index entries are read in reverse order, which avoids the sort.
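The INLIST ITERATOR combined with a descending range scan can be sketched as follows. This is a toy Python model with invented entries, not Oracle's implementation: it probes the IN-list values from highest to lowest and reads each matching slice of the ascending index backwards, so the rows come out in descending key order without sorting them.

```python
from bisect import bisect_left, bisect_right

# Ascending index entries as (key, rowid) pairs (invented sample data).
INDEX = [(60, "r0"), (90, "r1"), (90, "r2"), (90, "r3"),
         (100, "r4"), (100, "r5"), (110, "r6")]
KEYS = [key for key, _ in INDEX]


def inlist_descending(values):
    """Return entries for the listed keys in descending key order."""
    result = []
    # Only the few IN-list values are ordered here, never the rows themselves.
    for value in sorted(set(values), reverse=True):
        lo = bisect_left(KEYS, value)
        hi = bisect_right(KEYS, value)
        result.extend(reversed(INDEX[lo:hi]))  # read the slice backwards
    return result


print(inlist_descending([90, 100]))
# [(100, 'r5'), (100, 'r4'), (90, 'r3'), (90, 'r2'), (90, 'r1')]
```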
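Index full scan (INDEX FULL SCAN)
An index full scan reads every leaf block of the index structure, takes the ROWID from each entry, and fetches the corresponding row. If it visits every leaf block anyway, what advantage does it have over a full table scan? Index blocks hold fewer columns per entry, usually just the index key and the ROWID, so an index block generally holds more entries than a table block of the same size holds rows. If every column in the select list is part of the index, the table does not need to be accessed at all, and in that case the index full scan is the more efficient method.
An index full scan can occur in many situations. A few typical scenarios:
1. The query has no predicate, but the selected columns can be read directly from the index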
SQL> select email from hr.employees;
Execution Plan
----------------------------------------------------------
Plan hash value: 2196514524
---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 107 | 856 | 1 (0)| 00:00:01 |
| 1 | INDEX FULL SCAN | EMP_EMAIL_UK | 107 | 856 | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------
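2. The query predicate contains a condition on a non-leading column of the index (this also depends on the cardinality of the leading column: if the leading column has few distinct values, a skip scan may occur instead)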
SQL> select first_name, last_name from hr.employees
2 where first_name like 'A%' ;
Execution Plan
----------------------------------------------------------
Plan hash value: 2228653197
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 45 | 1 (0)| 00:00:01 |
|* 1 | INDEX FULL SCAN | EMP_NAME_IX | 3 | 45 | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("FIRST_NAME" LIKE 'A%')
filter("FIRST_NAME" LIKE 'A%')
SQL> SET LONG 2000000
SQL> select dbms_metadata.get_ddl('INDEX','EMP_NAME_IX','HR') from dual;
DBMS_METADATA.GET_DDL('INDEX','EMP_NAME_IX','HR')
--------------------------------------------------------------------------------
CREATE INDEX "HR"."EMP_NAME_IX" ON "HR"."EMPLOYEES" ("LAST_NAME", "FIRST_NAME"
)
PCTFREE 10 INITRANS 2 MAXTRANS 255 NOLOGGING COMPUTE STATISTICS
STORAGE(INITIAL 65536 NEXT 1048576 MINEXTENTS 1 MAXEXTENTS 2147483645
PCTINCREASE 0 FREELISTS 1 FREELIST GROUPS 1 BUFFER_POOL DEFAULT FLASH_CACHE DE
FAULT CELL_FLASH_CACHE DEFAULT)
TABLESPACE "EXAMPLE"
-- EMP_NAME_IX is built on the columns ("LAST_NAME", "FIRST_NAME"), and the predicate uses the non-leading column FIRST_NAME
3. The data is read through an already sorted index, which saves a separate sort operation
SQL> select * from hr.employees order by employee_id ;
Execution Plan
----------------------------------------------------------
Plan hash value: 2186312383
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 107 | 7383 | 3 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| EMPLOYEES | 107 | 7383 | 3 (0)| 00:00:01 |
| 2 | INDEX FULL SCAN | EMP_EMP_ID_PK | 107 | | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
-- An ascending index can likewise return data in descending order
SQL> select employee_id from hr.employees order by employee_id desc ;
Execution Plan
----------------------------------------------------------
Plan hash value: 753568220
--------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 107 | 428 | 1 (0)| 00:00:01 |
| 1 | INDEX FULL SCAN DESCENDING| EMP_EMP_ID_PK | 107 | 428 | 1 (0)| 00:00:01 |
--------------------------------------------------------------------------------------------
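As the examples above show, an index full scan, like a range scan, can return descending data from an ascending index. That is not its only optimization. When a query asks for the maximum or minimum of an indexed column, the index full scan has a very clear advantage: the optimizer does not read every leaf block, but only walks from the root block down to the first or last leaf block, which improves performance significantly.
-- Index full scan to get the minimum value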
SQL> select min(department_id) from hr.employees ;
Execution Plan
----------------------------------------------------------
Plan hash value: 613773769
------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 1 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
| 2 | INDEX FULL SCAN (MIN/MAX)| EMP_DEPARTMENT_IX | 1 | 3 | 1 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
-- If the query asks for both MAX and MIN, the optimizer does not choose the more efficient index full scan on its own
SQL> select min(department_id), max(department_id) from hr.employees ;
Execution Plan
----------------------------------------------------------
Plan hash value: 1756381138
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 3 | 3 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
| 2 | TABLE ACCESS FULL| EMPLOYEES | 107 | 321 | 3 (0)| 00:00:01 |
--------------------------------------------------------------------------------
-- An alternative optimization
SQL> select
2 (select min(department_id) from hr.employees) min_id,
3 (select max(department_id) from hr.employees) max_id
4 from dual;
Execution Plan
----------------------------------------------------------
Plan hash value: 2189307159
------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 2 (0)| 00:00:01 |
| 1 | SORT AGGREGATE | | 1 | 3 | | |
| 2 | INDEX FULL SCAN (MIN/MAX)| EMP_DEPARTMENT_IX | 1 | 3 | 1 (0)| 00:00:01 |
| 3 | SORT AGGREGATE | | 1 | 3 | | |
| 4 | INDEX FULL SCAN (MIN/MAX)| EMP_DEPARTMENT_IX | 1 | 3 | 1 (0)| 00:00:01 |
| 5 | FAST DUAL | | 1 | | 2 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
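Why the MIN/MAX variant is so cheap can be shown with a small counting model. The sketch below is an illustrative Python simplification, assuming the index is just a sorted list of keys; the class name, its methods, and the sample keys are invented. It counts how many index entries are read when the minimum and maximum are taken from the two ends, and how many when every entry is visited.

```python
class SortedIndex:
    """Toy index: keys kept in sorted order, with a counter for entries read."""

    def __init__(self, keys):
        self.keys = sorted(keys)
        self.entries_read = 0

    def _read(self, position):
        self.entries_read += 1
        return self.keys[position]

    def min_key(self):
        return self._read(0)    # leftmost entry only

    def max_key(self):
        return self._read(-1)   # rightmost entry only

    def min_max_by_full_read(self):
        values = [self._read(i) for i in range(len(self.keys))]
        return min(values), max(values)


ends = SortedIndex([50, 10, 80, 110, 30])
print(ends.min_key(), ends.max_key(), ends.entries_read)  # 10 110 2

full = SortedIndex([50, 10, 80, 110, 30])
print(full.min_max_by_full_read(), full.entries_read)     # (10, 110) 5
```

Asking for each end separately, as the two scalar subqueries do, keeps the work constant no matter how many entries the index holds.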
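Index skip scan (INDEX SKIP SCAN)
This scan is also a special case, because in early versions the optimizer refused to use an index when the predicate only referenced a non-leading column. A skip scan applies in a specific situation: when the predicate contains a condition on a non-leading column of the index and the leading column has few distinct values, the optimizer is very likely to choose an index skip scan. Like the index full scan and the range scan, it can read the index in ascending or descending order. The difference is that a skip scan splits the composite index into several smaller logical subindexes, one per distinct value of the leading column. The fewer distinct values the leading column has, the fewer subindexes there are, and the more likely the skip scan is to beat a full table scan.
-- Create a test table, using dba_objects as the source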
SQL> create table test as select * from dba_objects;
Table created.
-- Create a composite index with owner, a column with few distinct values, as the leading column
SQL> create index i_test on test(owner,object_id,object_type) ;
Index created.
-- Analyze the table to gather statistics
SQL> exec dbms_stats.gather_table_stats('SYS','TEST');
PL/SQL procedure successfully completed.
-- Compare the row count with the number of distinct values in the leading column
SQL> select count(*),count(distinct owner) from test;
COUNT(*) COUNT(DISTINCTOWNER)
---------- --------------------
72482 29
-- Query on a non-leading column to trigger a SKIP SCAN
SQL> select * from test where object_id = 46;
Execution Plan
----------------------------------------------------------
Plan hash value: 1001786056
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 97 | 31 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST | 1 | 97 | 31 (0)| 00:00:01 |
|* 2 | INDEX SKIP SCAN | I_TEST | 1 | | 30 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OBJECT_ID"=46)
filter("OBJECT_ID"=46)
Statistics
----------------------------------------------------------
101 recursive calls
0 db block gets
38 consistent gets
0 physical reads
0 redo size
1610 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
3 sorts (memory)
0 sorts (disk)
1 rows processed
-- Now look at the efficiency of a full table scan for the same statement
SQL> select /*+ full(test) */ * from test where object_id = 46;
Execution Plan
----------------------------------------------------------
Plan hash value: 1357081020
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 97 | 282 (1)| 00:00:04 |
|* 1 | TABLE ACCESS FULL| TEST | 1 | 97 | 282 (1)| 00:00:04 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("OBJECT_ID"=46)
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
1037 consistent gets
0 physical reads
0 redo size
1607 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
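Looking at the queries above, the leading column of our index has 29 distinct values, so the skip scan split the index into 29 logical subindexes and needed only 38 logical reads. Compared with the 1037 logical reads of the full table scan, the improvement is substantial.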
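The idea of "one logical subindex per leading value" can be sketched in Python. This is a toy model under simplifying assumptions, not Oracle's algorithm: the composite index is a sorted list of (owner, object_id) tuples with invented data, and each subindex is probed with a binary search before jumping to the next owner.

```python
from bisect import bisect_left, bisect_right

# Composite index on (owner, object_id), kept in key order (invented data).
INDEX = sorted([("HR", 46), ("HR", 51), ("SYS", 12),
                ("SYS", 46), ("SCOTT", 7), ("SCOTT", 99)])


def skip_scan(object_id):
    """Return (matching entries, number of subindex probes)."""
    hits, probes, pos = [], 0, 0
    while pos < len(INDEX):
        owner = INDEX[pos][0]
        probes += 1
        # Probe the logical subindex for this owner.
        lo = bisect_left(INDEX, (owner, object_id))
        hi = bisect_right(INDEX, (owner, object_id))
        hits.extend(INDEX[lo:hi])
        # Skip to the first entry of the next distinct owner.
        pos = bisect_right(INDEX, (owner, float("inf")))
    return hits, probes


print(skip_scan(46))  # ([('HR', 46), ('SYS', 46)], 3)
```

The number of probes equals the number of distinct leading values, which is why a leading column with few distinct values keeps the scan cheap.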
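Index fast full scan (INDEX FAST FULL SCAN)
This access method fetches data the same way a full table scan does, through unordered multiblock reads, so it cannot be used to avoid the cost of a sort. An index fast full scan usually occurs when all the queried columns are in the index and at least one indexed column has a NOT NULL constraint. The same conditions can also lead to an index full scan. It mostly serves as a replacement for a full table scan, since the data can be fetched without visiting the table's data blocks.
-- Keep using the test table created above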
SQL> desc test
Name Null? Type
----------------------------------------- -------- ----------------------------
OWNER VARCHAR2(30)
OBJECT_NAME VARCHAR2(128)
SUBOBJECT_NAME VARCHAR2(30)
OBJECT_ID NOT NULL NUMBER
DATA_OBJECT_ID NUMBER
OBJECT_TYPE VARCHAR2(19)
CREATED DATE
LAST_DDL_TIME DATE
TIMESTAMP VARCHAR2(19)
STATUS VARCHAR2(7)
TEMPORARY VARCHAR2(1)
GENERATED VARCHAR2(1)
SECONDARY VARCHAR2(1)
NAMESPACE NUMBER
EDITION_NAME VARCHAR2(30)
-- Create an index on the object_id column
SQL> create index pri_inx on test (object_id);
Index created.
-- The query goes straight to a full table scan
SQL> select object_id from test;
72482 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 1357081020
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 72482 | 353K| 282 (1)| 00:00:04 |
| 1 | TABLE ACCESS FULL| TEST | 72482 | 353K| 282 (1)| 00:00:04 |
--------------------------------------------------------------------------
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
5799 consistent gets
0 physical reads
0 redo size
1323739 bytes sent via SQL*Net to client
53675 bytes received via SQL*Net from client
4834 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
72482 rows processed
-- Change object_id to NOT NULL
SQL> alter table test modify (object_id not null);
Table altered.
-- Query object_id again; this time the plan uses a fast full scan
SQL> select object_id from test;
72482 rows selected.
Execution Plan
----------------------------------------------------------
Plan hash value: 3806735285
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 72482 | 353K| 45 (0)| 00:00:01 |
| 1 | INDEX FAST FULL SCAN| PRI_INX | 72482 | 353K| 45 (0)| 00:00:01 |
--------------------------------------------------------------------------------
Statistics
----------------------------------------------------------
167 recursive calls
0 db block gets
5020 consistent gets
161 physical reads
0 redo size
1323739 bytes sent via SQL*Net to client
53675 bytes received via SQL*Net from client
4834 SQL*Net roundtrips to/from client
4 sorts (memory)
0 sorts (disk)
72482 rows processed
PS: this INDEX FAST FULL SCAN example was really hard to reproduce; the one above took a long time to get right.
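Two points from this section can be illustrated with a toy Python model. It rests on one known fact, that a single-column B-tree index holds no entry for a row whose key is NULL, and otherwise on invented data and function names. The first part shows why the index can stand in for the table only when no row is missing from it. The second shows why reading index blocks in storage order gives complete but unsorted output.

```python
ROWS = [
    {"object_id": 46, "owner": "SYS"},
    {"object_id": None, "owner": "HR"},   # a NULL key gets no index entry
    {"object_id": 12, "owner": "SYS"},
    {"object_id": 30, "owner": "HR"},
]


def build_index(rows, column):
    """Sorted keys of the rows that have a non-NULL value in the column."""
    return sorted(r[column] for r in rows if r[column] is not None)


def index_covers_all_rows(rows, column):
    """The index can replace the table only if every row has an entry."""
    return len(build_index(rows, column)) == len(rows)


def fast_full_scan(index_keys, block_size, storage_order):
    """Read index blocks in storage order rather than key order."""
    blocks = [index_keys[i:i + block_size]
              for i in range(0, len(index_keys), block_size)]
    result = []
    for block_no in storage_order:
        result.extend(blocks[block_no])
    return result


not_null_rows = [r for r in ROWS if r["object_id"] is not None]
print(index_covers_all_rows(ROWS, "object_id"))           # False
print(index_covers_all_rows(not_null_rows, "object_id"))  # True

keys = build_index(not_null_rows, "object_id")            # [12, 30, 46]
print(fast_full_scan(keys, 2, [1, 0]))                    # [46, 12, 30]
```

With the NOT NULL constraint in place the optimizer knows that every row has an index entry, which is what allows the plan to switch from TABLE ACCESS FULL to INDEX FAST FULL SCAN in the example above.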