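tpch is a toolkit published by the TPC (Transaction Processing Performance Council). It is used for OLAP testing, to evaluate the performance of decision support systems (DSS) in business analytics. It consists of a suite of business-oriented ad-hoc queries and concurrent data modifications, and it stresses the database, the platform, and I/O, with the focus on query capability.
Official site: http://www.tpc.org/tpch
Download: http://www.tpc.org/tpch/spec/tpch_2_14_3.tgz or http://www.tpc.org/tpch/spec/tpch_2_14_3.zip
1. Build and install
Download the source package, unpack it, then: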
cp makefile.suite makefile
Edit the CC, DATABASE, MACHINE, and WORKLOAD definitions in the makefile:
################
## CHANGE NAME OF ANSI COMPILER HERE
################
CC = gcc
# Current values for DATABASE are: INFORMIX, DB2, ORACLE,
# SQLSERVER, SYBASE, TDAT (Teradata)
# Current values for MACHINE are: ATT, DOS, HP, IBM, ICL, MVS,
# SGI, SUN, U2200, VMS, LINUX, WIN32
# Current values for WORKLOAD are: TPCH
DATABASE= MYSQL
MACHINE = LINUX
WORKLOAD = TPCH
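The same makefile edits can be scripted instead of made by hand. The following is a minimal sketch, assuming each of the four variables is assigned on a line of its own that starts in the first column of makefile.suite; patch_makefile is a hypothetical helper name, not part of the tpch package.

```shell
#!/bin/sh
# patch_makefile SRC DST
# Copy SRC to DST while setting CC, DATABASE, MACHINE and WORKLOAD.
# Assumption: each variable is assigned on its own line starting at column 0.
patch_makefile() {
    sed -E \
        -e 's/^CC[[:space:]]*=.*/CC = gcc/' \
        -e 's/^DATABASE[[:space:]]*=.*/DATABASE = MYSQL/' \
        -e 's/^MACHINE[[:space:]]*=.*/MACHINE = LINUX/' \
        -e 's/^WORKLOAD[[:space:]]*=.*/WORKLOAD = TPCH/' \
        "$1" > "$2"
}
```

Usage: patch_makefile makefile.suite makefile. Lines that do not start with one of the four names, such as the comment block, are copied unchanged.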
Edit tpcd.h and add a few macro definitions:
#ifdef MYSQL
#define GEN_QUERY_PLAN ""
#define START_TRAN "START TRANSACTION"
#define END_TRAN "COMMIT"
#define SET_OUTPUT ""
#define SET_ROWCOUNT "limit %d;\n"
#define SET_DBASE "use %s;\n"
#endif
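Then run make. When the build finishes, two executables are produced:
dbgen: the data generator. It is also needed to generate the tpch table data when testing with the official InfiniDB test scripts.
qgen: the SQL query generator.
Generate the initial test data: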
[root@imysql tpch]# time ./dbgen -s 50
TPC-H Population Generator (Version 2.9.0)
Copyright Transaction Processing Performance Council 1994 - 2008
real 192m43.897s
user 37m45.398s
sys 19m4.132s
[root@imysql tpch]# ls -lh *tbl
-rw-r--r-- 1 root root 1.2G Sep 21 15:23 customer.tbl
-rw-r--r-- 1 root root 1.4G Sep 21 15:23 lineitem.tbl
-rw-r--r-- 1 root root 2.2K Sep 21 15:23 nation.tbl
-rw-r--r-- 1 root root 317M Sep 21 15:23 orders.tbl
-rw-r--r-- 1 root root 504K Sep 21 15:23 partsupp.tbl
-rw-r--r-- 1 root root 464K Sep 21 15:23 part.tbl
-rw-r--r-- 1 root root 389 Sep 21 15:23 region.tbl
-rw-r--r-- 1 root root 69M Sep 21 15:23 supplier.tbl
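The -s option of dbgen sets the scale factor, which is roughly the size of the generated data set in GB (it is not a warehouse count as in TPC-C). I recommend a baseline of 100 or more; in my test environment I usually use 1000.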
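One way to check that a dbgen run completed is to compare the line count of each .tbl file with the row count that TPC-H defines for the chosen scale factor. The sketch below covers the tables whose size is an exact multiple of the scale factor; lineitem is left out because its row count is only approximately 6,000,000 times the scale factor. expected_rows is a hypothetical helper name and it assumes an integer scale factor.

```shell
#!/bin/sh
# expected_rows TABLE SF
# Print the number of rows TPC-H defines for TABLE at integer scale factor SF.
expected_rows() {
    table=$1
    sf=$2
    case "$table" in
        region)   echo 5 ;;                      # fixed size
        nation)   echo 25 ;;                     # fixed size
        supplier) echo $((10000 * sf)) ;;
        customer) echo $((150000 * sf)) ;;
        part)     echo $((200000 * sf)) ;;
        partsupp) echo $((800000 * sf)) ;;
        orders)   echo $((1500000 * sf)) ;;
        *)        echo "unknown table: $table" >&2; return 1 ;;
    esac
}
```

For example, after ./dbgen -s 50 the command [ "$(wc -l < customer.tbl)" -eq "$(expected_rows customer 50)" ] should succeed if customer.tbl was written completely.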
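The tpch table initialization scripts shipped in the source package are not fully compatible with MySQL, so some of the code has to be modified.
First generate the test SQL script: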
[root@imysql tpch]# ./qgen | sed -e 's/\r//' > queries/tpch_queries.sql
Then open tpch_queries.sql in vim and run the following global substitutions:
:%s/;\nlimit/ limit/g
:%s/limit -1/limit 1/g
Search for all statements like the one below and remove the trailing (3):
l_shipdate <= date '1998-12-01' - interval '106' day (3)
=>
l_shipdate <= date '1998-12-01' - interval '106' day
Then edit around line 369:
count(o_orderkey)
=>
count(o_orderkey) as c_count
Edit around line 376:
) as c_orders (c_custkey, c_count)
=>
) as c_orders
Edit around line 431:
drop view revenue0 limit 1;
=>
drop view revenue0;
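The pattern-based edits above (the two vim substitutions, the trailing (3), and the drop view line) can also be applied as a filter. This is a rough sketch, assuming the qgen output has the shape shown above, with each "limit" clause on its own line right after a line ending in ";". fix_queries is a hypothetical helper name. The two c_count edits depend on their position in the file and are still made by hand.

```shell
#!/bin/sh
# fix_queries: read the qgen output on stdin, write the MySQL-friendly form to stdout.
fix_queries() {
    awk '
        # Join ";\nlimit ..." into " limit ..." (first vim substitution).
        /^limit / && prev ~ /;$/ { sub(/;$/, "", prev); prev = prev " " $0; next }
        NR > 1 { print prev }
        { prev = $0 }
        END { if (NR > 0) print prev }
    ' | sed -E \
        -e 's/limit -1/limit 1/' \
        -e "s/(interval '[0-9]+' day) \(3\)/\1/" \
        -e 's/^drop view revenue0 limit 1;/drop view revenue0;/'
}
```

Usage: ./qgen | sed -e 's/\r//' | fix_queries > queries/tpch_queries.sql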
Finally, split the large query script into 22 separate SQL scripts, one per TPC-H query, named tpch_01.sql through tpch_22.sql.
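As an alternative to splitting the combined file by hand, each query can be generated straight into its own file. The sketch below assumes that qgen accepts a query number as an argument and that it is run from the directory that contains it; query_file and generate_queries are hypothetical helper names. The MySQL fixes described above still have to be applied to the generated files.

```shell
#!/bin/sh
# query_file N: print the script name for query number N, e.g. 7 -> tpch_07.sql
query_file() {
    printf 'tpch_%02d.sql\n' "$1"
}

# generate_queries OUTDIR: write the 22 TPC-H queries into OUTDIR, one file each.
# Assumption: ./qgen N prints query N on stdout.
generate_queries() {
    outdir=$1
    n=1
    while [ "$n" -le 22 ]; do
        ./qgen "$n" | sed -e 's/\r//' > "$outdir/$(query_file "$n")"
        n=$((n + 1))
    done
}
```

Usage: generate_queries queries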
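2. Initialize the database and tables
The database initialization scripts shipped with tpch have a few minor problems and need to be modified:
dss.ddl: the DDL script that creates the DSS schema
dss.ri: the script that creates the indexes and foreign keys on the DSS tables
Add a few lines to the dss.ddl script: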
drop database if exists tpch;
create database tpch;
use tpch;
Several places in the dss.ri script need to be changed:
Edit around line 4:
CONNECT TO TPCD;
=>
Use tpch;
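Edit lines 6-13 and add a space after every SQL comment marker "--" (MySQL only treats "--" as a comment when it is followed by whitespace):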
-- ALTER TABLE TPCD.REGION DROP PRIMARY KEY;
-- ALTER TABLE TPCD.NATION DROP PRIMARY KEY;
-- ALTER TABLE TPCD.PART DROP PRIMARY KEY;
-- ALTER TABLE TPCD.SUPPLIER DROP PRIMARY KEY;
-- ALTER TABLE TPCD.PARTSUPP DROP PRIMARY KEY;
-- ALTER TABLE TPCD.ORDERS DROP PRIMARY KEY;
-- ALTER TABLE TPCD.LINEITEM DROP PRIMARY KEY;
-- ALTER TABLE TPCD.CUSTOMER DROP PRIMARY KEY;
Edit line 25:
ADD FOREIGN KEY NATION_FK1 (N_REGIONKEY) references TPCD.REGION;
=>
ADD FOREIGN KEY NATION_FK1 (N_REGIONKEY) references TPCD.REGION(R_REGIONKEY);
Edit line 40:
ADD FOREIGN KEY SUPPLIER_FK1 (S_NATIONKEY) references TPCD.NATION;
=>
ADD FOREIGN KEY SUPPLIER_FK1 (S_NATIONKEY) references TPCD.NATION(N_NATIONKEY);
Edit line 55:
ADD FOREIGN KEY CUSTOMER_FK1 (C_NATIONKEY) references TPCD.NATION;
=>
ADD FOREIGN KEY CUSTOMER_FK1 (C_NATIONKEY) references TPCD.NATION(N_NATIONKEY);
Edit line 73:
ADD FOREIGN KEY PARTSUPP_FK1 (PS_SUPPKEY) references TPCD.SUPPLIER;
=>
ADD FOREIGN KEY PARTSUPP_FK1 (PS_SUPPKEY) references TPCD.SUPPLIER(S_SUPPKEY);
Edit line 78:
ADD FOREIGN KEY PARTSUPP_FK2 (PS_PARTKEY) references TPCD.PART;
=>
ADD FOREIGN KEY PARTSUPP_FK2 (PS_PARTKEY) references TPCD.PART(P_PARTKEY);
Edit line 84:
ADD FOREIGN KEY ORDERS_FK1 (O_CUSTKEY) references TPCD.CUSTOMER;
=>
ADD FOREIGN KEY ORDERS_FK1 (O_CUSTKEY) references TPCD.CUSTOMER(C_CUSTKEY);
Edit line 90:
ADD FOREIGN KEY LINEITEM_FK1 (L_ORDERKEY) references TPCD.ORDERS;
=>
ADD FOREIGN KEY LINEITEM_FK1 (L_ORDERKEY) references TPCD.ORDERS(O_ORDERKEY);
Edit line 96:
TPCD.PARTSUPP;
=>
TPCD.PARTSUPP(PS_PARTKEY,PS_SUPPKEY);
In addition, the table names created by tpch are upper case and need to be changed to lower case, so add a few more lines:
use tpch;
alter table CUSTOMER rename to customer ;
alter table LINEITEM rename to lineitem ;
alter table NATION rename to nation ;
alter table ORDERS rename to orders ;
alter table PART rename to part ;
alter table PARTSUPP rename to partsupp ;
alter table REGION rename to region ;
alter table SUPPLIER rename to supplier ;
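The rename statements follow one pattern, so they can be generated rather than typed. A small sketch, with rename_sql as a hypothetical helper name:

```shell
#!/bin/sh
# rename_sql: print the statements that rename the eight tpch tables to lower case.
rename_sql() {
    echo "use tpch;"
    for t in CUSTOMER LINEITEM NATION ORDERS PART PARTSUPP REGION SUPPLIER; do
        lower=$(echo "$t" | tr 'A-Z' 'a-z')
        echo "alter table $t rename to $lower;"
    done
}
```

Usage: rename_sql >> dss.ri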
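3. Load the data
The test data has been generated and the tables have been initialized, so the data can now be loaded.
Note that if binlog is enabled, it is best to disable it before loading; otherwise the load fails with an error saying that max_binlog_cache_size has been exceeded. If binlog cannot be disabled, split the input files into several smaller files and load them one by one.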
mysql tpch -e "LOAD DATA INFILE 'path/dbgen/customer.tbl' INTO TABLE CUSTOMER FIELDS TERMINATED BY '|';"
mysql tpch -e "LOAD DATA INFILE 'path/dbgen/orders.tbl' INTO TABLE ORDERS FIELDS TERMINATED BY '|';"
mysql tpch -e "LOAD DATA INFILE 'path/dbgen/lineitem.tbl' INTO TABLE LINEITEM FIELDS TERMINATED BY '|';"
mysql tpch -e "LOAD DATA INFILE 'path/dbgen/nation.tbl' INTO TABLE NATION FIELDS TERMINATED BY '|';"
mysql tpch -e "LOAD DATA INFILE 'path/dbgen/partsupp.tbl' INTO TABLE PARTSUPP FIELDS TERMINATED BY '|';"
mysql tpch -e "LOAD DATA INFILE 'path/dbgen/part.tbl' INTO TABLE PART FIELDS TERMINATED BY '|';"
mysql tpch -e "LOAD DATA INFILE 'path/dbgen/region.tbl' INTO TABLE REGION FIELDS TERMINATED BY '|';"
mysql tpch -e "LOAD DATA INFILE 'path/dbgen/supplier.tbl' INTO TABLE SUPPLIER FIELDS TERMINATED BY '|';"
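The eight load statements can be generated in a loop and run in a single session. The sketch below also switches off binary logging for that session with SET sql_log_bin = 0, which is one way to follow the binlog advice above; it is an assumption that the account has the privilege to change this variable. load_sql is a hypothetical helper name.

```shell
#!/bin/sh
# load_sql DIR: print one LOAD DATA statement per tpch table for the .tbl files in DIR.
load_sql() {
    dir=$1
    # Assumption: the session is allowed to disable binary logging for itself.
    echo "SET sql_log_bin = 0;"
    for t in customer orders lineitem nation partsupp part region supplier; do
        upper=$(echo "$t" | tr 'a-z' 'A-Z')
        echo "LOAD DATA INFILE '$dir/$t.tbl' INTO TABLE $upper FIELDS TERMINATED BY '|';"
    done
}
```

Usage: load_sql path/dbgen | mysql tpch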
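4. Run the tpch test
Now the tpch test can be run: execute the 22 query scripts one by one. Restart the MySQL instance before each query so that the memory buffers are clean every time.
A simple loop script: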
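#!/bin/sh
##
## Run the tpch OLAP test
##
## written by yejr(http://imysql.com), 2012/12/14
##
PATH=$PATH:/usr/local/bin
export PATH
. ~/.bash_profile > /dev/null 2>&1
# Keep the original stdout/stderr on fd 3 and 4, then send all output to a timestamped log
exec 3>&1 4>&2 1>> tpch-benchmark-olap-`date +'%Y%m%d%H%M%S'`.log 2>&1
I=1
II=3
# Outer loop: repeat the whole query set II times
while [ $I -le $II ]
do
    N=1
    T=23
    # Inner loop: N runs from 1 to T-1, i.e. tpch_01.sql .. tpch_22.sql
    while [ $N -lt $T ]
    do
        # Zero-pad the query number to two digits
        if [ $N -lt 10 ] ; then
            NN='0'$N
        else
            NN=$N
        fi
        echo "query $NN starting"
        # Restart MySQL so every query starts with clean buffers
        /etc/init.d/mysql restart
        time mysql -f tpch < ./queries/tpch_${NN}.sql
        echo "query $NN ended!"
        N=`expr $N + 1`
    done
    I=`expr $I + 1`
done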
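The log written by the loop script can be reduced to one line per query run. The sketch below assumes that time prints bash-style lines such as "real 1m2.500s" (the format of the dbgen timing shown earlier) and that no query result line starts with "real"; summarize_log is a hypothetical helper name.

```shell
#!/bin/sh
# summarize_log [LOGFILE...]: print "<query number> <elapsed seconds>" for each run.
# Reads stdin when no file is given.
summarize_log() {
    awk '
        /^query [0-9]+ starting/ { q = $2 }
        /^real/ {
            split($2, t, /[ms]/)    # "1m2.500s" -> t[1] = "1", t[2] = "2.500"
            printf "%s %.3f\n", q, t[1] * 60 + t[2]
        }
    ' "$@"
}
```

Usage: summarize_log tpch-benchmark-olap-*.log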
Attachment: an archive with the tpch initialization and automated test scripts, plus a Word manual.