為什麼Mysql explain extended中的filtered列值總是100%
1. 問題
執行Mysql的explain extended的輸出會比單純的explain多一列filtered(MySQL5.7缺省就會輸出filtered),它指返回結果的行占需要讀到的行(rows列的值)的百分比。按說filtered是個非常有用的值,因為對於join操作,前一個表的結果集大小直接影響了循環的次數。但是我的環境下測試的結果卻是,filtered的值一直是100%,也就是說失去了意義。
參考下面mysql 5.6的代碼,filtered值只對index和all的掃描有效(這可以理解,其它場合,通常rows值就等於估算的結果集大小。)。
sql/opt_explain.cc
- bool Explain_join::explain_rows_and_filtered()
- {
- if (table->pos_in_table_list->schema_table)
- return false;
- double examined_rows;
- if (select && select->quick)
- examined_rows= rows2double(select->quick->records);
- else if (tab->type == JT_INDEX_SCAN || tab->type == JT_ALL)
- {
- if (tab->limit)
- examined_rows= rows2double(tab->limit);
- else
- {
- table->pos_in_table_list->fetch_number_of_rows();
- examined_rows= rows2double(table->file->stats.records);
- }
- }
- else
- examined_rows= tab->position->records_read;
- fmt->entry()->col_rows.set(static_cast(examined_rows));
- /* Add "filtered" field */
- if (describe(DESCRIBE_EXTENDED))
- {
- float f= 0.0;
- if (examined_rows)
- f= 100.0 * tab->position->records_read / examined_rows;
- fmt->entry()->col_filtered.set(f);
- }
- return false;
- }
但是,我構造了一個全表掃描後,filtered的結果卻不對,仍然是100%,而我期待的是0.1%。
- mysql> desc tb2;
+-------+--------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+--------------+------+-----+---------+-------+
| id | int(11) | NO | PRI | 0 | |
| c1 | int(11) | YES | | NULL | |
| c2 | varchar(100) | YES | | NULL | |
+-------+--------------+------+-----+---------+-------+
3 rows in set (0.00 sec)
mysql> explain extended select * from tb2 where c1<100;
+----+-------------+-------+------+---------------+------+---------+------+--------+----------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------+---------------+------+---------+------+--------+----------+-------------+
| 1 | SIMPLE | tb2 | ALL | NULL | NULL | NULL | NULL | 996355 | 100.00 | Using where |
+----+-------------+-------+------+---------------+------+---------+------+--------+----------+-------------+
1 row in set, 1 warning (10 min 29.96 sec)
mysql> select count(*) from tb2 where c1<100;
+----------+
| count(*) |
+----------+
| 1001 |
+----------+
1 row in set (1.99 sec) -
通過gdb跟蹤,發現代碼走的分支是對的,但下面的值有問題。
- (gdb) p table->file->stats.records
- $18 = 996355
- (gdb) p tab->position->records_read
- $19 = 996355
上面的tab->position->records_read應該是估算出的返回行數,正確的值應該是1001左右,而不是全表大小996355。
2.原因
為什麼會出現上面的情況呢?後來我查看了下MySQL收集的統計信息就明白了。
MySQL和其它主流數據庫一樣會自動需要收集統計信息以便生成更好的執行計劃,也可以用analyze table手動收集,收集的統計信息存儲在mysql.innodb_table_stats和mysql.innodb_index_stats裡。
參考:http://dev.mysql.com/doc/refman/5.6/en/innodb-persistent-stats.html#innodb-persistent-stats-tables
但這不是重點,重點是,查看這兩個表就會發現MySQL收集的統計信息非常少。
- mysql> select * from mysql.innodb_table_stats where table_name='tb2';
+---------------+------------+---------------------+--------+----------------------+--------------------------+
| database_name | table_name | last_update | n_rows | clustered_index_size | sum_of_other_index_sizes |
+---------------+------------+---------------------+--------+----------------------+--------------------------+
| test | tb2 | 2015-12-02 06:26:54 | 996355 | 3877 | 0 |
+---------------+------------+---------------------+--------+----------------------+--------------------------+
1 row in set (0.00 sec)
mysql> select * from mysql.innodb_index_stats where table_name='tb2';
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| database_name | table_name | index_name | last_update | stat_name | stat_value | sample_size | stat_description |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
| test | tb2 | PRIMARY | 2015-12-02 06:26:54 | n_diff_pfx01 | 996355 | 20 | id |
| test | tb2 | PRIMARY | 2015-12-02 06:26:54 | n_leaf_pages | 3841 | NULL | Number of leaf pages in the index |
| test | tb2 | PRIMARY | 2015-12-02 06:26:54 | size | 3877 | NULL | Number of pages in the index |
+---------------+------------+------------+---------------------+--------------+------------+-------------+-----------------------------------+
3 rows in set (0.00 sec)
重要的信息也就2個,一是表的總記錄數(n_rows),二是索引中的列的唯一值數(n_diff_pfx01)。也就是說MySQL不會統計非索引列的值分布信息,在前面的查詢的例子中,由於c1沒有被索引,所以MySQL無法估算出"c1<100"會最終篩選出多少記錄。這樣一來,filtered的值真正有效的場合就非常少了,要where條件中出現的列上建有索引,又要執行計劃不走索引的range或ref掃描,而走全表掃描或覆蓋索引掃描,所以可以說這個filtered列幾乎無用。
3. 引申
後面我聯系到MySQL匮乏的統計信息會帶來什麼後果?
不難想象,如果缺少索引,MySQL很可能會生成性能糟糕的執行計劃,比如搞錯大表和小表的join順序,就像下面這樣。
- mysql> explain extended select count(*) from tb1,tb2 where tb1.c1=tb2.c1 and tb2.c2='xx';
+----+-------------+-------+------+---------------+------+---------+------+--------+----------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | filtered | Extra |
+----+-------------+-------+------+---------------+------+---------+------+--------+----------+----------------------------------------------------+
| 1 | SIMPLE | tb1 | ALL | NULL | NULL | NULL | NULL | 1000 | 100.00 | NULL |
| 1 | SIMPLE | tb2 | ALL | NULL | NULL | NULL | NULL | 996355 | 100.00 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------+---------------+------+---------+------+--------+----------+----------------------------------------------------+
2 rows in set, 1 warning (0.00 sec)
雖然t1表時小表,tb2表是大表,但是tb2上加上tb2.c2='xx'的條件限制後結果集就變成0了,因此先掃描tb2表才是性能更好的選擇。
相同的查詢,PostgreSQL給出的執行計劃是更好的,先掃描t2表再循環掃描t1表。
- postgres=# explain select count(*) from tb1,tb2 where tb1.c1=tb2.c1 and tb2.c2='xx';
- QUERY PLAN
- -------------------------------------------------------------------
- Aggregate (cost=20865.50..20865.51 rows=1 width=0)
- -> Nested Loop (cost=0.00..20865.50 rows=1 width=0)
- Join Filter: (tb1.c1 = tb2.c1)
- -> Seq Scan on tb2 (cost=0.00..20834.00 rows=1 width=4)
- Filter: ((c2)::text = 'xx'::text)
- -> Seq Scan on tb1 (cost=0.00..19.00 rows=1000 width=4)
- (6 rows)
下面實際對比一下執行時間看看。
MySQL花了0.34s
- mysql> select count(*) from tb1,tb2 where tb1.c1=tb2.c1 and tb2.c2='xx';
- +----------+
- | count(*) |
- +----------+
- | 0 |
- +----------+
- 1 row in set (0.34 sec)
PostgreSQL花了0.139s
- postgres=# select count(*) from tb1,tb2 where tb1.c1=tb2.c1 and tb2.c2='xx';
- count
- -------
- 0
- (1 row)
- Time: 139.600 ms
上面這個例子的性能差別其實不是很大,如果去掉tb2.c2='xx'的條件,差別就非常大了。
Mysql花了1分08秒
- mysql> explain select count(*) from tb1,tb2 where tb1.c1=tb2.c1;
+----+-------------+-------+------+---------------+------+---------+------+--------+----------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+---------------+------+---------+------+--------+----------------------------------------------------+
| 1 | SIMPLE | tb1 | ALL | NULL | NULL | NULL | NULL | 1000 | NULL |
| 1 | SIMPLE | tb2 | ALL | NULL | NULL | NULL | NULL | 996355 | Using where; Using join buffer (Block Nested Loop) |
+----+-------------+-------+------+---------------+------+---------+------+--------+----------------------------------------------------+
2 rows in set (0.00 sec)
mysql> select count(*) from tb1,tb2 where tb1.c1=tb2.c1;
+----------+
| count(*) |
+----------+
| 9949 |
+----------+
1 row in set (1 min 8.26 sec)
PostgreSQL只用了0.163秒
- postgres=# explain select count(*) from tb1,tb2 where tb1.c1=tb2.c1;
- QUERY PLAN
- -------------------------------------------------------------------------
- Aggregate (cost=23502.34..23502.35 rows=1 width=0)
- -> Hash Join (cost=31.50..23474.97 rows=10947 width=0)
- Hash Cond: (tb2.c1 = tb1.c1)
- -> Seq Scan on tb2 (cost=0.00..18334.00 rows=1000000 width=4)
- -> Hash (cost=19.00..19.00 rows=1000 width=4)
- -> Seq Scan on tb1 (cost=0.00..19.00 rows=1000 width=4)
- (6 rows)
- Time: 0.690 ms
- postgres=# select count(*) from tb1,tb2 where tb1.c1=tb2.c1;
- count
- -------
- 10068
- (1 row)
- Time: 163.868 ms
不過這個性能差別和統計信息無關,原因在於PG支持Nest Loop Join,Merge Join和Hash Join,而MySQL只支持Nest Loop Join,缺了索引Nest Loop Join會慢得跟龜似的。
4. 總結
1. MySQL的統計信息非常少,只有表行數和索引列的唯一值數目,這使得MySQL的優化器經常不能對數據規模有一個正確的認識而給出性能不佳的執行計劃。
2.MySQL的join操作的效率非常依賴於索引(我之前兩次幫人調優MySQL的SQL語句都是在加索引)。並不是說PG的join不需要索引,只是不像MySQL缺了索引的反應那麼大。上面那個MySQL執行了1分多鐘的例子,加上索引後,不管是MySQL還是PG的執行時間都立刻降到10毫秒以內。所以,開發人員在設計表的時候應該對可能的查詢方式做個評估,把該建的索引都建上(不能少建也不宜多建)。
3.相比之下,PG不僅統計所有列的值分布,而且除了唯一值還有直方圖,頻繁值等等信息,支撐了PG的優化器做出正確的決策。猜測也是由於這個原因,PG社區認為PG的優化器已經足夠智能,不需要把和Oracle類似的hint功能加到PG的內核裡(因為hint可能會被人濫用,導致系統很難維護;不過,實在想用的話可以自己裝pg_hint_plan插件)。