在Oracle中,如果要實現行列轉換,較為常見的是用DECODE和CASE語句。對於簡單的行列轉行,DECODE和CASE語句尚能應付。在邏輯比較復雜,分組聚合較多的場景中,DECODE和CASE語句則力有不逮。而pivot則可完美解決這一切。
首先,我們來看看Oracle對於其的解釋:
可見,pivot是數據倉庫中的關鍵技術,它利用交叉查詢(crosstabulation query)將行轉換為列。
基本語法如下:
SELECT .... FROM <table-expr> PIVOT ( aggregate-function(<column>) FOR <pivot-column> IN (<value1>, <value2>,..., <valuen>) ) AS <alias> WHERE .....
下面我們來通過具體的案例對其進行闡述。
首先,構造案例所需的數據,
1> 創建視圖,以EMP表的數據作為源數據。
CREATE VIEW emp_view AS SELECT deptno,job,to_char(hiredate,'yyyy') hiredate, count(*) cnt,sum(sal) sum_sal FROM emp GROUP BY deptno,job,to_char(hiredate,'yyyy');
其中,deptno為部門號,job為工作的類型(即工種),hiredate為雇傭的日期,cnt為特定部門,特定工種在特定年份雇傭的員工的總數,sum_sal為特定部門,特定工種,特定年份雇傭的員工的工資的總和。
2> 視圖的數據如下:
SQL> select * from emp_view; DEPTNO JOB HIRE CNT SUM_SAL ---------- --------- ---- ---------- ---------- 20 CLERK 1980 1 800 20 ANALYST 1981 1 3000 20 ANALYST 1987 1 3000 30 CLERK 1981 1 950 30 MANAGER 1981 1 2850 10 MANAGER 1981 1 2450 30 SALESMAN 1981 4 5600 20 MANAGER 1981 1 2975 10 PRESIDENT 1981 1 5000 10 CLERK 1982 1 1300 20 CLERK 1987 1 1100 11 rows selected.
應用場景一:
基本的Pivot轉換
例1:
SELECT * FROM ( SELECT deptno,hiredate,cnt FROM emp_view ) PIVOT (SUM(cnt) FOR hiredate IN ('1980' AS "1980",'1981' AS "1981", '1982' AS "1982",'1987' AS "1987")) ORDER BY deptno; DEPTNO 1980 1981 1982 1987 ---------- ---------- ---------- ---------- ---------- 10 2 1 20 1 2 2 30 6 3 rows selected.
例2:
SELECT * FROM ( SELECT deptno,job,cnt FROM emp_view ) PIVOT (SUM(cnt) FOR job IN ('CLERK','ANALYST','MANAGER','SALESMAN','PRESIDENT')) ORDER BY deptno; DEPTNO 'CLERK' 'ANALYST' 'MANAGER' 'SALESMAN' 'PRESIDENT' ---------- ---------- ---------- ---------- ---------- ----------- 10 1 1 1 20 2 2 1 30 1 1 4 3 rows selected.
兩例以不同的列進行統計,前者是hiredate,後者是job。
除此之外,前者用了別名,後面沒有用別名,兩者的顯示效果也是不一樣的。
應用場景二:
對多列進行Pivot轉換
SELECT * FROM ( SELECT deptno,job,hiredate,cnt FROM emp_view ) PIVOT (SUM(cnt) FOR (job,hiredate) IN (('CLERK','1980') AS clerk_1980, ('CLERK','1981') AS clerk_1981, ('ANALYST','1987') AS analyst_1987, ('MANAGER','1981') AS manager_1981 ) ) ORDER by deptno; DEPTNO CLERK_1980 CLERK_1981 ANALYST_1987 MANAGER_1981 ---------- ---------- ---------- ------------ ------------ 10 1 20 1 1 1 30 1 1 3 rows selected.
限於篇幅,FOR (job,hiredate) IN語句中沒有列出更多組合,只列出了四組,當然,我們可以根據實際場景需要羅列更多的組合。
從本例中可以看出,對兩個列進行Pivot轉換可從三個維度呈現統計結果。
應用場景三:
用Pivot實現多個聚合
SELECT * FROM ( SELECT deptno,hiredate,cnt,sum_sal FROM emp_view ) PIVOT ( SUM(cnt) AS cnt, SUM(sum_sal) AS sum_sal FOR hiredate IN ('1980','1981','1982','1987')) ORDER BY deptno; DEPTNO '1980'_CNT '1980'_SUM_SAL '1981'_CNT '1981'_SUM_SAL '1982'_CNT '1982'_SUM_SAL '1987'_CNT '1987'_SUM_SAL ---------- ---------- -------------- ---------- -------------- ---------- -------------- ---------- -------------- 10 2 7450 1 1300 20 1 800 2 5975 2 4100 30 6 9400 3 rows selected.
'1981'_CNT指的是1981年雇傭的員工的總數,'1981'_SUM_SAL指的是1981年雇傭員工所開出的工資。
具體到本例中,即1981年10號部門招聘了2位員工,開出的工資合計為7450元,20號部門招聘了2位員工,開出的工資合計為5975元,30號部門招聘了6名員工,開出的工資合計為9400元,依次類推。
既然有pivot將行轉換為列,同樣也有unpivot操作將聚合後的列轉換為行。
UNPIVOT
以上述應用場景三的結果作為源數據進行操作
CREATE TABLE T1 AS SELECT * FROM ( SELECT deptno,hiredate,cnt,sum_sal FROM emp_view ) PIVOT ( SUM(cnt) AS cnt, SUM(sum_sal) AS sum_sal FOR hiredate IN ('1980' AS "1980",'1981' AS "1981", '1982' AS "1982",'1987' AS "1987")) ORDER BY deptno
表T1的結果為:
SQL> select * from t1; DEPTNO 1980_CNT 1980_SUM_SAL 1981_CNT 1981_SUM_SAL 1982_CNT 1982_SUM_SAL 1987_CNT 1987_SUM_SAL ---------- ---------- ------------ ---------- ------------ ---------- ------------ ---------- ------------ 10 2 7450 1 1300 20 1 800 2 5975 2 4100 30 6 9400 3 rows selected.
首先進行一維unpivot
SELECT deptno,DECODE(hiredate,'1980_CNT','1980','1981_CNT','1981','1982_CNT','1982','1987_CNT','1987') AS hiredate,cnt FROM T1 UNPIVOT INCLUDE NULLS ( cnt FOR hiredate IN ("1980_CNT","1981_CNT","1982_CNT","1987_CNT")); DEPTNO HIRE CNT ---------- ---- ---------- 10 1980 10 1981 2 10 1982 1 10 1987 20 1980 1 20 1981 2 20 1982 20 1987 2 30 1980 30 1981 6 30 1982 30 1987 12 rows selected.
輸出的結果為不同部門在不同年份的雇傭人數,
注意:上述SQL語句中UNPIVOT後加了INCLUDE NULLS,當然也可以指定為EXCLUDE NULLS,即排除cnt為空的值,如果不指定,則默認為EXCLUDE NULLS。
UNPIVOT後不指定INCLUDE NULLS的輸入結果為:
DEPTNO HIRE CNT ---------- ---- ---------- 10 1981 2 10 1982 1 20 1980 1 20 1981 2 20 1987 2 30 1981 6 6 rows selected.
下面,我們再進行二維unpivot
SELECT deptno,hiredate,cnt,sum_sal FROM T1 UNPIVOT ( (cnt,sum_sal) FOR hiredate IN (("1980_CNT","1980_SUM_SAL") AS 1980, ("1981_CNT","1981_SUM_SAL") AS 1981, ("1982_CNT","1982_SUM_SAL") AS 1982, ("1987_CNT","1987_SUM_SAL") AS 1987)); DEPTNO HIREDATE CNT SUM_SAL ---------- ---------- ---------- ---------- 10 1981 2 7450 10 1982 1 1300 20 1980 1 800 20 1981 2 5975 20 1987 2 4100 30 1981 6 9400 6 rows selected.
輸入結果為T1表列轉行的結果。
參考文檔:
SQL for Analysis and Reporting