程式師世界 >> 數據庫知識 >> MYSQL數據庫 >> MySQL綜合教程 >> MySQL高效分頁與分頁優化分析

MySQL高效分頁與分頁優化分析

編輯：MySQL綜合教程

在mysql中分頁優化是一個老生常談的問題，如果我們幾萬條數據直接使用limit不需要做任何操作即可完成分頁並且性能還不錯，但是百萬千萬級數據我們就需要對limit進行有效的優化才能完成高效分頁了，要不就會卡死數據庫哦。在Percona Performance Conference 2009大會上來自yahoo的Surat Singh Bhati ([email protected]) 和 Rick James ([email protected])給大家分享了MySQL高效分頁的經驗。 一、概述

常見分頁方式
schema設計和常見的分頁方式(偏移)
避免分頁偏移過大的技巧
性能對比
重點

二、常見分頁方式
三.前提
大記錄表要高效分頁

WHERE條件使用索引完成
WHERE條件和排序能夠使用同個索引完成

基礎知識

http://dev.mysql.com/doc/refman/5.1/en/mysql-indexes.html
http://dev.mysql.com/doc/refman/5.1/en/order-by-optimization.html
http://dev.mysql.com/doc/refman/5.1/en/limit-optimization.html

索引 a_b_c (a, b, c) 下面的查詢可以使用索引來解決ORDER部分：

ORDER BY a
ORDER BY a,b
ORDER BY a, b, c
ORDER BY a DESC, b DESC, c DESC

下面的查詢可以使用索引來解決WHERE和ORDER部分：:

WHERE a = const ORDER BY b, c
WHERE a = const AND b = const ORDER BY c
WHERE a = const ORDER BY b, c
WHERE a = const AND b > const ORDER BY b, c

下面的查詢無法使用索引完成，需額外排序：

ORDER BY a ASC, b DESC, c DESC /* 混合ASC和DESC */
WHERE g = const ORDER BY b, c /* 字段g不是索引一部分 */
WHERE a = const ORDER BY c /* 沒有使用字段b */
WHERE a = const ORDER BY a, d /* 字段d不是索引的一部分 */

四、Schema 設計

	  
		 代碼如下
		復制代碼
	  
	  
		CREATE TABLE `message` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`title` varchar(255) COLLATE utf8_unicode_ci NOT NULL,
`user_id` int(11) NOT NULL,
`content` text COLLATE utf8_unicode_ci NOT NULL,
`create_time` int(11) NOT NULL,
`thumbs_up` int(11) NOT NULL DEFAULT '0', /* 投票數 */
PRIMARY KEY (`id`),
KEY `thumbs_up_key` (`thumbs_up`,`id`)
) ENGINE=InnoDB

mysql> show table status like 'message' G
Engine: InnoDB
Version: 10
Row_format: Compact
Rows: 50000040 /* 5千萬 */
Avg_row_length: 565
Data_length: 28273803264 /* 26 GB */
Index_length: 789577728 /* 753 MB */
Data_free: 6291456
Create_time: 2009-04-20 13:30:45

	  
	兩個分頁例子：

按照time(發布時間)分頁，新發布的在前面
按照thumps_up(投票數)分頁，票高的在前面

五、典型的分頁查詢
1.統計記錄數量

	  
		 代碼如下
		復制代碼
	  
	  
		SELECT count(*) FROM message

	  
	2. 查詢當前頁

	  
		 代碼如下
		復制代碼
	  
	  
		SELECT * FROM message ORDER BY id DESC LIMIT 0, 20

    
    http://domain.com/message?page=1
    
    ORDER BY id DESC LIMIT 0, 20
    
    http://domain.com/message?page=2
    
    ORDER BY id DESC LIMIT 20, 20
    
    http://domain.com/message?page=3
    
    ORDER BY id DESC LIMIT 40, 20


	  
	提示：id 是自動增長的(auto_increment)，通過id就可以取得最新的列表，不需要創建專門記錄時間的字段。

六、explain

	  
		 代碼如下
		復制代碼
	  
	  
		mysql> explain SELECT * FROM message
ORDER BY id DESC
LIMIT 10000, 20G
***************** 1. row **************
id: 1
select_type: SIMPLE
table: message
type: index
possible_keys: NULL
key: PRIMARY
key_len: 4
ref: NULL
rows: 10020
Extra:
1 row in set (0.00 sec)

它可以使用索引，並且只要找到需要的結果後就停止掃描.
LIMIT 10000, 20 需要讀取前10000行，然後獲取後面的20行

六、瓶頸

較大的偏移(OFFSET)會增加結果集， MySQL has to bring data in memory that is never returned to caller.
Performance issue is more visible when your have database that can’t fit in main memory.
小比例的低效分頁足夠產生磁盤I/O瓶頸
為了顯示“第 21條至 40條 (共 1000000)，需要統計1000000行

七、簡單的解決方法

不顯示記錄總數，沒用戶在乎這個數字
不讓用戶訪問頁數比較大的記錄，重定向他們

八、避免count(*)

不顯示總數，讓用戶通過“下一頁”來翻頁
緩存總數，顯示一個大概值，沒有用戶在乎是324533條還是324633 （譯：測試在乎-_-!!）
Display 41 to 80 of Thousands
單獨統計總數，在插入和刪除時遞增/遞減

九、解決偏移查詢

更改ui，不提供跳到某頁的按鈕
LIMIT N 是高效的, 但不要使用 LIMIT M,N
- 從WHERE條件裡找到分頁(LIMIT N)的線索
- Find the desired records using more restricted WHERE using given clue and ORDER BY and LIMIT N without OFFSET)

十、尋找線索
譯：last_seen是id。這裡的分頁只有“上一頁”、“下一頁” 按鈕
十一、根據線索解決方案
下一頁: http://domain.com/forum?page=2&last_seen=100&dir=next

WHERE id< 100 /* last_seen */
ORDER BY id DESC LIMIT $page_size /* 沒有偏移 */

http://domain.com/forum?page=1&last_seen=98&dir=prev

WHERE id > 98 /* last_seen */
ORDER BY id ASC LIMIT $page_size /* 沒有偏移 */

譯：通過每頁第一條或最後一條記錄的id來做條件篩選，再配合降序和升序獲得上/下一頁的結果集
十二、根據線索解決方案

	  
		 代碼如下
		復制代碼
	  
	  
		mysql> explain
SELECT * FROM message
WHERE id < '49999961'
ORDER BY id DESC LIMIT 20 G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: message
type: range
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: NULL
Rows: 25000020 /* 忽略這裡 */
Extra: Using where
1 row in set (0.00 sec)

	  
	十三、當你排序的字段不是唯一的，怎麼辦?

代碼如下復制代碼 99
99
98 第一頁
98
98 98
98
97 第二頁
97
10 我們不能這樣查詢:

	  
		 代碼如下
		復制代碼
	  
	  
		WHERE thumbs_up< 98
ORDER BY thumbs_up DESC /* 結果將返回重復的記錄 */

	  
	我們可以這樣查詢:

	  
		 代碼如下
		復制代碼
	  
	  
		WHERE thumbs_up <= 98
AND <額外的條件>
ORDER BY thumbs_up DESC

	  
	十四、額外的條件

考慮到 thumbs_up 是“主要字段”，如果我們添加一個“次要字段”，我們可以使用“主要字段”和“次要字段”作為查詢條件
其次，我們可以考慮使用id(primary key)作為我們的次要字段

十五、解決方案
第一頁：

	  
		 代碼如下
		復制代碼
	  
	  
		SELECT thumbs_up, id
FROM message
ORDER BY thumbs_up DESC, id DESC
LIMIT $page_size
+-----------+----+
| thumbs_up | id |
+-----------+----+
| 99 | 14 |
| 99 | 2 |
| 98 | 18 |
| 98 | 15 |
| 98 | 13 |
+-----------+----+

	  
	下一頁：

	  
		 代碼如下
		復制代碼
	  
	  
		SELECT thumbs_up, id
FROM message
WHERE thumbs_up <= 98 AND (id < 13 OR thumbs_up< 98)
ORDER BY thumbs_up DESC, id DESC
LIMIT $page_size
+-----------+----+
| thumbs_up | id |
+-----------+----+
| 98 | 10 |
| 98 | 6 |
| 97 | 17 |

	  
	十六、優化

查詢:

	  
		 代碼如下
		復制代碼
	  
	  
		SELECT * FROM message
WHERE thumbs_up <= 98
AND (id < 13 OR thumbs_up < 98)
ORDER BY thumbs_up DESC, id DESC
LIMIT 20

	  
	我們可以這樣寫:

	  
		 代碼如下
		復制代碼
	  
	  
		SELECT m2.* FROM message m1, message m2
WHERE m1.id = m2.id
AND m1.thumbs_up <= 98
AND (m1.id <13 OR m1.thumbs_up< 98)
ORDER BY m1.thumbs_up DESC, m1.id DESC
LIMIT 20;

	  
	十七、explain

	  
		 代碼如下
		復制代碼
	  
	  
		*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: m1
type: range
possible_keys: PRIMARY,thumbs_up_key
key: thumbs_up_key /* (thumbs_up,id) */
key_len: 4
ref: NULL
Rows: 25000020 /* 忽略這裡 */
Extra: Using where; Using index /* Cover 譯：Cover就是說所需要的數據之從索引裡獲取就可以滿足了 */
*************************** 2. row ***************************
id: 1
select_type: SIMPLE
table: m2
type: eq_ref
possible_keys: PRIMARY
key: PRIMARY
key_len: 4
ref: forum.m1.id
rows: 1
Extra:

	  
	十八、性能提升

十九、吞吐量提升
每頁30條記錄，查看第一頁的話，使用 LIMIT OFFSET, N方式，可以達到 600 次查詢/秒，如果使用 LIMIT N (無偏移)方式，提升到 3.7k 次查詢/秒
二十、Bonus Point
Product issue with LIMIT M, N User is reading a page, in the mean time some records may be added to
previous page. Due to insert/delete pages records are going to move forward/backward
as rolling window:
– User is reading messages on 4th page
– While he was reading, one new message posted (it would be there on page
one), all pages are going to move one message to next page.
– User Clicks on Page 5
– One message from page got pushed forward on page 5, user has to read it
again No such issue with news approach
二十一、不足
SEO專家會說:Let bot reach all you pages with fewer number of deep dive 兩個解決方案：
Two Solutions:
• Read extra rows
– Read extra rows in advance and construct links for few previous & next pages
• Use small offset
– Do not read extra rows in advance, just add links for few past & next pages
with required offset & last_seen_id on current page
– Do query using new approach with small offset to display desired page