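Hidden Risks of Choosing the Wrong Column Data Type in Oracle Database Design
This post looks at one hidden risk of choosing an inaccurate data type. To test it, we build a table that stores the same dates in two columns, one as varchar2 and one as date. I have written two earlier blog posts on the topic:
1. Problems caused by column types that do not match the actual business (1)
2. Problems caused by column types that do not match the actual business (2)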
SQL> drop table test purge;
SQL> create table test as select
to_char(to_date('2014-01-01','yyyy-MM-dd')+rownum,'yyyymmdd') s_date,
to_date('2014-01-01','yyyy-MM-dd')+rownum d_date
from all_objects;
SQL> create index ind_t_sdate on test(s_date) nologging;
SQL> create index ind_t_ddate on test(d_date) nologging;
SQL> exec dbms_stats.gather_table_stats(user,'test',cascade => true);
SQL> set timing on
SQL> set autotrace traceonly
SQL> select * from test where s_date between '20140201' and '20140222';
22 rows selected.
Elapsed: 00:00:00.00
Execution Plan
----------------------------------------------------------
Plan hash value: 953148778
-------------------------------------------------------------------------------------------
| Id  | Operation                   | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |             |     3 |    51 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| TEST        |     3 |    51 |     3   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IND_T_SDATE |     3 |       |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
-- Note that the CBO estimates 3 rows, although the query actually returns 22.
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("S_DATE">='20140201' AND "S_DATE"<='20140222')
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
7 consistent gets
0 physical reads
0 redo size
944 bytes sent via SQL*Net to client
349 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
22 rows processed
SQL> select * from test
where d_date between to_date('20140201', 'yyyymmdd') and
to_date('20140222', 'yyyymmdd');
22 rows selected.
Elapsed: 00:00:00.00
Execution Plan
----------------------------------------------------------
Plan hash value: 112387541
-------------------------------------------------------------------------------------------
| Id  | Operation                   | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |             |    23 |   391 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS BY INDEX ROWID| TEST        |    23 |   391 |     3   (0)| 00:00:01 |
|*  2 |   INDEX RANGE SCAN          | IND_T_DDATE |    23 |       |     2   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------
-- Here the CBO's estimate (23 rows against 22 returned) is essentially accurate.
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("D_DATE">=TO_DATE(' 2014-02-01 00:00:00', 'syyyy-mm-dd hh24:mi:ss')
AND "D_DATE"<=TO_DATE(' 2014-02-22 00:00:00', 'syyyy-mm-dd hh24:mi:ss'))
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
7 consistent gets
0 physical reads
0 redo size
944 bytes sent via SQL*Net to client
349 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
22 rows processed
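A plausible explanation for the gap between the two estimates, offered as a sketch rather than a description of Oracle's exact internals: without a histogram, the optimizer assumes values are spread evenly between a column's low and high values. For d_date that assumption holds, since there is one row per day. For s_date, yyyymmdd strings leave large gaps (no rows lie between '20140131' and '20140201', for instance), so a 22-day window looks like a tiny slice of the overall range. The queries below assume the TEST table built above. The to_number conversion is only an illustration; Oracle's own mapping of strings to numbers for selectivity works differently.

```sql
-- Compare how "wide" each column looks to an optimizer that assumes
-- evenly spread values. Illustration only: to_number() stands in for
-- Oracle's internal string-to-number mapping, which is different.
select count(*)                                            as row_count,
       max(d_date) - min(d_date) + 1                       as day_span,
       to_number(max(s_date)) - to_number(min(s_date)) + 1 as numeric_span
from   test;

-- Column statistics the CBO works from. LOW_VALUE and HIGH_VALUE are
-- stored as RAW; utl_raw.cast_to_varchar2 decodes them for a varchar2 column.
select column_name,
       num_distinct,
       density,
       histogram,
       utl_raw.cast_to_varchar2(low_value)  as low_value_text,
       utl_raw.cast_to_varchar2(high_value) as high_value_text
from   user_tab_col_statistics
where  table_name  = 'TEST'
and    column_name = 'S_DATE';
```

If day_span matches row_count while numeric_span is many times larger, the even-spread assumption fits d_date and is badly off for s_date, which is consistent with the estimates of 23 and 3 seen above.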
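Summary: Although the two statements consume the same resources in this test, the cardinality estimate for the first one is wrong. In a multi-table join this is a real hazard, because a bad estimate can easily push the optimizer into the wrong execution plan.
Beyond that, storing dates in varchar2 invites any number of storage formats. I once saw a system where the date formats were all over the place: year-month-day; year-month-day plus hour; year-month-day plus hour and minute; Chinese and English colons; full-width and half-width characters; nulls; and even "undefine" (probably passed through from JS). The index that had been built on the column could not be used, so in the end the data type had to be changed, and just writing the conversion script took more than a day.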
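For the kind of conversion mentioned in the summary, a minimal sketch might look like the following. It assumes Oracle 12.2 or later (validate_conversion and the DEFAULT ... ON CONVERSION ERROR clause are not available in earlier releases) and reuses the TEST table from this post. The names s_date_conv and ind_t_sdate_conv are hypothetical. A real migration with several formats in play would need one pass per format.

```sql
-- 1. List values that are null or do not parse as yyyymmdd
--    (for example 'undefine' or strings in another format).
select s_date, count(*) as cnt
from   test
where  s_date is null
   or  validate_conversion(s_date as date, 'yyyymmdd') = 0
group  by s_date
order  by cnt desc;

-- 2. Add a real DATE column (hypothetical name) next to the string column.
alter table test add (s_date_conv date);

-- 3. Populate it; values that fail to convert become NULL
--    instead of raising an error.
update test
set    s_date_conv = to_date(s_date default null on conversion error, 'yyyymmdd');

commit;

-- 4. Index the new column and refresh statistics so the CBO can use it.
create index ind_t_sdate_conv on test(s_date_conv);
exec dbms_stats.gather_table_stats(user, 'TEST', cascade => true);
```

Running step 1 first gives a list of the problem values to clean up by hand before the old column is dropped.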