Today my mentor gave me a SQL tuning task (the environment is SQL Developer). One statement runs very slowly: it returns 17,544 rows but takes 970 seconds. The statement is:
SELECT DISTINCT 'AMEND_NEW', reporttitle, reportsubtitle, cab_cab_transactions.branchcode, cab_cab_transactions.prtfo_cd, cab_cab_transactions.sstm_scrty_id, cab_cab_transactions.sstm_trx_id, cab_cab_transactions.trde_dttm, cab_cab_transactions.efcte_dttm, cab_cab_transactions.due_stlmnt_dt, cab_cab_transactions.cncl_efcte_dttm, cab_cab_transactions.trde_sstm_id, cab_cab_transactions.trx_type_cd, cab_cab_transactions.trx_type_dscrn, cab_cab_transactions.trx_subtype_cd, cab_cab_transactions.trde_stat_flg, cab_cab_transactions.csh_cr_dr_indcr, cab_cab_transactions.long_shrt_indcr, cab_cab_transactions.lcl_crncy, cab_cab_transactions.stlmt_crncy, cab_cab_transactions.nomin_qty, cab_cab_transactions.price, cab_cab_transactions.lcl_cst, cab_cab_transactions.prtfo_cst, cab_cab_transactions.lcl_book_cst, cab_cab_transactions.prtfo_book_cst, cab_cab_transactions.lcl_sell_prcds, cab_cab_transactions.prtfo_sell_prcds, cab_cab_transactions.lcl_gnls, cab_cab_transactions.prtfo_gnls, cab_cab_transactions.lcl_acrd_intrt, cab_cab_transactions.prtfo_acrd_intrt, cab_cab_transactions.stlmt_crncy_stlmt_amt, cab_cab_transactions.lcl_net_amt, cab_cab_transactions.prtfo_net_amt, cab_cab_transactions.fx_bght_amt, cab_cab_transactions.fx_sold_amt, cab_cab_transactions.prtfo_crncy_stlmt_amt, cab_cab_transactions.prtfo_net_incme, cab_cab_transactions.dvnd_crncy_net_incme, cab_cab_transactions.dvnd_type_cd, cab_cab_transactions.lcl_intrt_pd_rec, cab_cab_transactions.prtfo_intrt_pd_rec, cab_cab_transactions.lcl_dvdnd_pd_rec, cab_cab_transactions.prtfo_dvdnd_pd_rec, cab_cab_transactions.lcl_sundry_inc_pd_rec, cab_cab_transactions.prtfo_sundry_inc_pd_rec, cab_cab_transactions.bnk_csh_cptl_secid, cab_cab_transactions.bnk_csh_inc_secid, cab_cab_transactions.reportdate, cab_cab_transactions.filename, sysdate, 'e483448' FROM cab_cfg_trx_type_mapping RIGHT JOIN(cab_cab_tran_adjustments INNER JOIN cab_cab_transactions ON(cab_cab_transactions.branchcode = cab_cab_tran_adjustments.branchcode ) 
AND(cab_cab_tran_adjustments.sstm_trx_id = cab_cab_transactions.sstm_trx_id)) ON(cab_cfg_trx_type_mapping.cab_trx_type_cd = cab_cab_transactions.trx_type_cd) AND(nvl(cab_cfg_trx_type_mapping.cab_trx_subtype_cd,' ') = nvl(cab_cab_transactions.trx_subtype_cd,' ') AND (cab_cfg_trx_type_mapping.branchcode=cab_cab_transactions.branchcode)) WHERE cab_cab_transactions.prtfo_cd IN (SELECT DISTINCT prtfo_cd FROM cab_cab_valuations_working WHERE created_by = 'e483448' AND branchcode='ISA') AND cab_cab_tran_adjustments.efcte_dttm > '2011-07-31' AND cab_cab_tran_adjustments.efcte_dttm <= '2011-08-31' AND eff_trde_stat_flg <> 'X' AND cab_cab_transactions.branchcode = 'ISA' AND cab_cab_tran_adjustments.branchcode = 'ISA' AND(cab_cfg_trx_type_mapping.cab_reportgroup = 'CABValuation' OR cab_cfg_trx_type_mapping.cab_reportgroup IS NULL)
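The problem appeared to be the DISTINCT. As I understood it, the statement ends up scanning the tables in full, and the DISTINCT then adds a sort over the whole result set followed by duplicate removal, so it is slow and the DISTINCT is the part to optimize. (Strictly speaking, DISTINCT adds the deduplication step; it does not by itself decide how the tables are accessed.) I read a fair amount of material and tried the suggestions one by one, and eventually found a rewrite with a very substantial gain: use GROUP BY instead. The statement is: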
SELECT 'AMEND_NEW', reporttitle, reportsubtitle, cab_cab_transactions.branchcode, cab_cab_transactions.prtfo_cd, cab_cab_transactions.sstm_scrty_id, cab_cab_transactions.sstm_trx_id, cab_cab_transactions.trde_dttm, cab_cab_transactions.efcte_dttm, cab_cab_transactions.due_stlmnt_dt, cab_cab_transactions.cncl_efcte_dttm, cab_cab_transactions.trde_sstm_id, cab_cab_transactions.trx_type_cd, cab_cab_transactions.trx_type_dscrn, cab_cab_transactions.trx_subtype_cd, cab_cab_transactions.trde_stat_flg, cab_cab_transactions.csh_cr_dr_indcr, cab_cab_transactions.long_shrt_indcr, cab_cab_transactions.lcl_crncy, cab_cab_transactions.stlmt_crncy, cab_cab_transactions.nomin_qty, cab_cab_transactions.price, cab_cab_transactions.lcl_cst, cab_cab_transactions.prtfo_cst, cab_cab_transactions.lcl_book_cst, cab_cab_transactions.prtfo_book_cst, cab_cab_transactions.lcl_sell_prcds, cab_cab_transactions.prtfo_sell_prcds, cab_cab_transactions.lcl_gnls, cab_cab_transactions.prtfo_gnls, cab_cab_transactions.lcl_acrd_intrt, cab_cab_transactions.prtfo_acrd_intrt, cab_cab_transactions.stlmt_crncy_stlmt_amt, cab_cab_transactions.lcl_net_amt, cab_cab_transactions.prtfo_net_amt, cab_cab_transactions.fx_bght_amt, cab_cab_transactions.fx_sold_amt, cab_cab_transactions.prtfo_crncy_stlmt_amt, cab_cab_transactions.prtfo_net_incme, cab_cab_transactions.dvnd_crncy_net_incme, cab_cab_transactions.dvnd_type_cd, cab_cab_transactions.lcl_intrt_pd_rec, cab_cab_transactions.prtfo_intrt_pd_rec, cab_cab_transactions.lcl_dvdnd_pd_rec, cab_cab_transactions.prtfo_dvdnd_pd_rec, cab_cab_transactions.lcl_sundry_inc_pd_rec, cab_cab_transactions.prtfo_sundry_inc_pd_rec, cab_cab_transactions.bnk_csh_cptl_secid, cab_cab_transactions.bnk_csh_inc_secid, cab_cab_transactions.reportdate, cab_cab_transactions.filename, sysdate, 'e483448' FROM cab_cfg_trx_type_mapping RIGHT JOIN(cab_cab_tran_adjustments INNER JOIN cab_cab_transactions ON(cab_cab_transactions.branchcode = cab_cab_tran_adjustments.branchcode ) 
AND(cab_cab_tran_adjustments.sstm_trx_id = cab_cab_transactions.sstm_trx_id)) ON(cab_cfg_trx_type_mapping.cab_trx_type_cd = cab_cab_transactions.trx_type_cd) AND(nvl(cab_cfg_trx_type_mapping.cab_trx_subtype_cd,' ') = nvl(cab_cab_transactions.trx_subtype_cd,' ') AND (cab_cfg_trx_type_mapping.branchcode=cab_cab_transactions.branchcode)) WHERE cab_cab_transactions.prtfo_cd IN (SELECT DISTINCT prtfo_cd FROM cab_cab_valuations_working WHERE created_by = 'e483448' AND branchcode='ISA') AND cab_cab_tran_adjustments.efcte_dttm > '2011-07-31' AND cab_cab_tran_adjustments.efcte_dttm <= '2011-08-31' AND eff_trde_stat_flg <> 'X' AND cab_cab_transactions.branchcode = 'ISA' AND cab_cab_tran_adjustments.branchcode = 'ISA' AND(cab_cfg_trx_type_mapping.cab_reportgroup = 'CABValuation' OR cab_cfg_trx_type_mapping.cab_reportgroup IS NULL) GROUP BY reporttitle, reportsubtitle, cab_cab_transactions.branchcode, cab_cab_transactions.prtfo_cd, cab_cab_transactions.sstm_scrty_id, cab_cab_transactions.sstm_trx_id, cab_cab_transactions.trde_dttm, cab_cab_transactions.efcte_dttm, cab_cab_transactions.due_stlmnt_dt, cab_cab_transactions.cncl_efcte_dttm, cab_cab_transactions.trde_sstm_id, cab_cab_transactions.trx_type_cd, cab_cab_transactions.trx_type_dscrn, cab_cab_transactions.trx_subtype_cd, cab_cab_transactions.trde_stat_flg, cab_cab_transactions.csh_cr_dr_indcr, cab_cab_transactions.long_shrt_indcr, cab_cab_transactions.lcl_crncy, cab_cab_transactions.stlmt_crncy, cab_cab_transactions.nomin_qty, cab_cab_transactions.price, cab_cab_transactions.lcl_cst, cab_cab_transactions.prtfo_cst, cab_cab_transactions.lcl_book_cst, cab_cab_transactions.prtfo_book_cst, cab_cab_transactions.lcl_sell_prcds, cab_cab_transactions.prtfo_sell_prcds, cab_cab_transactions.lcl_gnls, cab_cab_transactions.prtfo_gnls, cab_cab_transactions.lcl_acrd_intrt, cab_cab_transactions.prtfo_acrd_intrt, cab_cab_transactions.stlmt_crncy_stlmt_amt, cab_cab_transactions.lcl_net_amt, cab_cab_transactions.prtfo_net_amt, 
cab_cab_transactions.fx_bght_amt, cab_cab_transactions.fx_sold_amt, cab_cab_transactions.prtfo_crncy_stlmt_amt, cab_cab_transactions.prtfo_net_incme, cab_cab_transactions.dvnd_crncy_net_incme, cab_cab_transactions.dvnd_type_cd, cab_cab_transactions.lcl_intrt_pd_rec, cab_cab_transactions.prtfo_intrt_pd_rec, cab_cab_transactions.lcl_dvdnd_pd_rec, cab_cab_transactions.prtfo_dvdnd_pd_rec, cab_cab_transactions.lcl_sundry_inc_pd_rec, cab_cab_transactions.prtfo_sundry_inc_pd_rec, cab_cab_transactions.bnk_csh_cptl_secid, cab_cab_transactions.bnk_csh_inc_secid, cab_cab_transactions.reportdate, cab_cab_transactions.filename
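The final execution time was only 15.1 seconds, more than 60 times faster (970 / 15.1 ≈ 64), which is a remarkable improvement. However, after reading a lot of material I still have not found a reasonable explanation for why DISTINCT and GROUP BY differ so much in efficiency here. Most of what I read says the two perform about the same and are implemented in much the same way. This remains an open question.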
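One way to start looking for the cause is to compare the execution plans of the two forms rather than only their timings. The sketch below shows the method with Python's built-in `sqlite3` module as a stand-in, because it runs anywhere. The table `trx` and its data are made up for illustration, and SQLite's planner is not Oracle's, so the plans it prints say nothing about the original query. On Oracle, the corresponding step would be `EXPLAIN PLAN FOR <statement>` followed by `SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY)`, run once for each version.

```python
import sqlite3

# Hypothetical stand-in table; not the schema from the post.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trx (sstm_trx_id INTEGER, prtfo_cd TEXT)")
conn.executemany(
    "INSERT INTO trx VALUES (?, ?)",
    [(i % 50, "P%d" % (i % 7)) for i in range(1000)],
)


def plan(sql):
    """Return the human-readable detail of each step in SQLite's query plan."""
    # The detail text is the last column of every EXPLAIN QUERY PLAN row.
    return [row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


distinct_plan = plan("SELECT DISTINCT sstm_trx_id, prtfo_cd FROM trx")
group_plan = plan(
    "SELECT sstm_trx_id, prtfo_cd FROM trx GROUP BY sstm_trx_id, prtfo_cd"
)

# Print both plans side by side; the exact wording depends on the SQLite version.
for label, steps in (("DISTINCT", distinct_plan), ("GROUP BY", group_plan)):
    print(label)
    for step in steps:
        print("   ", step)
```

If the two plans differ in join order or access path, and not only in the deduplication step, the speed-up probably comes from the optimizer choosing a different plan and not from GROUP BY being inherently cheaper than DISTINCT.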
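DISTINCT and GROUP BY are not really comparable in essence: DISTINCT returns unique rows, while GROUP BY groups rows. Still, when a query has no aggregate functions and groups by every selected column, the two return the same result, which is why one can sometimes be substituted for the other when tuning.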
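The equivalence can be checked on a small scale. The following is a minimal sketch using Python's built-in `sqlite3` module. The table `trx` and its rows are invented for the example, and SQLite stands in for Oracle here, so this illustrates the semantics only, not the performance.

```python
import sqlite3

# Hypothetical miniature table with duplicate rows, including duplicates with NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trx (branchcode TEXT, prtfo_cd TEXT, price REAL)")
conn.executemany(
    "INSERT INTO trx VALUES (?, ?, ?)",
    [
        ("ISA", "P1", 10.0),
        ("ISA", "P1", 10.0),  # exact duplicate of the row above
        ("ISA", "P2", 12.5),
        ("ISA", None, 12.5),
        ("ISA", None, 12.5),  # duplicate that contains a NULL
    ],
)

distinct_rows = conn.execute(
    "SELECT DISTINCT branchcode, prtfo_cd, price FROM trx"
).fetchall()

# No aggregate functions, and every selected column appears in GROUP BY.
grouped_rows = conn.execute(
    "SELECT branchcode, prtfo_cd, price FROM trx "
    "GROUP BY branchcode, prtfo_cd, price"
).fetchall()

# Row order is not guaranteed by either query, so compare as sets.
same = set(distinct_rows) == set(grouped_rows)
print(len(distinct_rows), len(grouped_rows), same)  # prints: 3 3 True
```

Both forms treat NULLs as equal to each other for deduplication, so the two rows with a NULL `prtfo_cd` collapse into one in either query.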