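1. This story starts with a production bug on November 15, 2016. The business scenario: archive rows from a main table into a history table and, at the same time, delete them from the main table.
2. The setup, simplified (storage engine InnoDB, transaction isolation level RR, i.e. REPEATABLE READ):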
-- Create table test1
CREATE TABLE test1 (
  id int(11) NOT NULL AUTO_INCREMENT,
  name varchar(10) NOT NULL,
  PRIMARY KEY (id)
);
-- Name the column explicitly so that id is filled in by AUTO_INCREMENT
insert into test1 (name) values ('hello');
-- Create table test2
CREATE TABLE test2 (
  id int(11) NOT NULL AUTO_INCREMENT,
  name varchar(10) NOT NULL,
  PRIMARY KEY (id)
);
-- Transaction 1
begin;
insert into test2 select * from test1 where id = 1;
delete from test1 where id = 1;
-- Transaction 2
begin;
insert into test2 select * from test1 where id = 1;
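3. Actual execution order (T1 = Transaction 1, T2 = Transaction 2; statements are listed in time order)
T1: begin;
T1: -- This statement acquires a shared lock S(id=1) on the primary key index of test1.
T1: -- It also inserts a row with id=1 into test2, which stays uncommitted for now.
T1: insert into test2 select * from test1 where id = 1;
T2: begin;
T2: -- This statement also requests S(id=1) on the primary key index of test1 and then ends up
T2: -- waiting in a lock request queue. Note that two shared locks are compatible, so T1's S lock
T2: -- by itself does not block this request; the wait most likely comes from the duplicate-key
T2: -- check against the uncommitted id=1 row that T1 inserted into test2.
T2: insert into test2 select * from test1 where id = 1;
T1: -- This statement tries to upgrade T1's shared lock S(id=1) on test1 to an exclusive lock X(id=1).
T1: -- When T1 issues this request, MySQL finds that T2 is already involved with an S lock on the
T1: -- same record (id=1) while T2 is itself waiting for T1. A deadlock has formed.
T1: delete from test1 where id = 1;
Once the deadlock is detected, MySQL compares the weight of the two transactions. T2 has the smaller weight, so it is chosen as the deadlock victim and rolled back.
After T2 rolls back, T1 obtains the lock and completes successfully.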
The official MySQL explanation
Deadlock occurs here because client A needs an X lock to delete the row. However, that lock request cannot be granted because client B already has a request for an X lock and is waiting for client A to release its S lock. Nor can the S lock held by A be upgraded to an X lock because of the prior request by B for an X lock. As a result, InnoDB generates an error for one of the clients and releases its locks. The client returns this error.
The real scenario differs slightly from the MySQL documentation: in the documentation the second client requests an X lock, whereas in our case it requests an S lock.
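One common way to avoid this kind of S-to-X upgrade is to take the exclusive lock before doing anything else. The following is only a minimal sketch, assuming the archiving job can be restructured this way; it reuses the test1 and test2 tables from the example above and is not necessarily the fix that was applied to the original incident.

```sql
-- Assumption: each archiving session runs these statements in this order.
START TRANSACTION;

-- Take X(id=1) on test1 first. A second session running the same
-- statements blocks here instead of holding an S lock on the row.
SELECT id FROM test1 WHERE id = 1 FOR UPDATE;

-- The row is already exclusively locked by this transaction,
-- so no lock upgrade is needed for the following statements.
INSERT INTO test2 SELECT * FROM test1 WHERE id = 1;
DELETE FROM test1 WHERE id = 1;

COMMIT;
```

With this ordering, a concurrent session waits at the SELECT ... FOR UPDATE until the first one commits, then finds no row with id = 1 and archives nothing.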
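Now let's analyze MySQL deadlocks step by step.
1. Lock mechanisms of common MySQL storage engines
MyISAM and MEMORY use table-level locking.
BDB uses page-level locking or table-level locking, with page-level locking as the default (the BDB engine was removed in MySQL 5.1).
InnoDB supports row-level locking and table-level locking, with row-level locking as the default.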
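2. Characteristics of each lock type
Table-level locks: low overhead, fast to acquire; no deadlocks; coarsest granularity, so the highest chance of lock conflicts and the lowest concurrency.
Row-level locks: high overhead, slow to acquire; deadlocks can occur; finest granularity, so the lowest chance of lock conflicts and the highest concurrency.
Page-level locks: overhead and acquisition time fall between table locks and row locks; deadlocks can occur; granularity falls between table locks and row locks, with moderate concurrency.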
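To make the difference in granularity concrete, here is a small sketch that contrasts an explicit table-level lock with an InnoDB row-level lock. It assumes the test1 table from the earlier example; the value 'world' is made up for illustration.

```sql
-- Table-level lock: the whole table is locked for this session
-- until UNLOCK TABLES, so other sessions cannot read or write any row.
LOCK TABLES test1 WRITE;
UPDATE test1 SET name = 'world' WHERE id = 1;
UNLOCK TABLES;

-- Row-level lock (InnoDB): only the index record for id = 1 is locked,
-- and the lock is held until the transaction ends.
START TRANSACTION;
SELECT * FROM test1 WHERE id = 1 FOR UPDATE;
UPDATE test1 SET name = 'world' WHERE id = 1;
COMMIT;
```

In the second variant, other sessions can still work on rows other than id = 1 while the transaction is open.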
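3. Where each lock type fits
Table-level locks suit applications that mostly run queries and only occasionally update data by index condition, such as web applications.
Row-level locks suit applications with many concurrent updates by index condition alongside concurrent queries, such as online transaction processing (OLTP) systems.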
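4. Deadlock
A deadlock is a situation in which two or more processes wait on each other while competing for resources during execution; without outside intervention, none of them can make progress.
Table-level locking does not produce deadlocks, because an engine that uses only table locks requests all the locks a statement needs at once. Resolving deadlocks is therefore mainly a concern for InnoDB, the most commonly used engine.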
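The mutual wait described above can be reproduced with two sessions that lock the same two rows in opposite order. This is an illustrative sketch only; it assumes test1 contains rows with id = 1 and id = 2 (the original example inserts only one row), and the new name values are made up.

```sql
-- Session A
START TRANSACTION;
UPDATE test1 SET name = 'a' WHERE id = 1;   -- A now holds X(id=1)

-- Session B
START TRANSACTION;
UPDATE test1 SET name = 'b' WHERE id = 2;   -- B now holds X(id=2)

-- Session A
UPDATE test1 SET name = 'a' WHERE id = 2;   -- A waits for B's lock on id=2

-- Session B
UPDATE test1 SET name = 'b' WHERE id = 1;   -- B waits for A's lock on id=1
-- InnoDB detects the cycle and rolls back one of the two transactions
-- with ERROR 1213 (40001): Deadlock found when trying to get lock.
```

If both sessions touched the rows in the same order (id = 1 first, then id = 2), the second session would simply wait and no cycle could form.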
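5. A worked deadlock example
In MySQL (InnoDB), a row-level lock does not lock the record directly; it locks an index entry. Indexes are either primary key indexes or non-primary-key (secondary) indexes. If a statement operates on the primary key index, MySQL locks that primary key index entry. If a statement operates on a non-primary-key index, MySQL first locks that index entry and then locks the related primary key index entry.
For UPDATE and DELETE, MySQL locks not only every index record scanned for the WHERE condition but also the adjacent key range (the gap next to each record). This is known as next-key locking.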
For example, take a table db.tab_test with the following structure:
id: primary key;
state: status;
time: timestamp;
Index: idx_1(state,time)
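For readers who want to experiment, the structure above could be declared roughly as follows. The column types are assumptions inferred from the 8-byte fields in the deadlock log, and the real table evidently has more columns than the three listed here. The final query relies on performance_schema.data_locks, which exists only in MySQL 8.0 and later.

```sql
-- Hypothetical definition; types are assumed, not taken from the source.
CREATE TABLE tab_test (
  id    BIGINT   NOT NULL,
  state BIGINT   NOT NULL,
  time  DATETIME NOT NULL,
  PRIMARY KEY (id),
  KEY idx_1 (state, time)
);

START TRANSACTION;
UPDATE tab_test SET state = 1064, time = NOW() WHERE state = 1061;

-- MySQL 8.0+: list the locks held by open transactions on tab_test.
SELECT index_name, lock_type, lock_mode, lock_status, lock_data
FROM performance_schema.data_locks
WHERE object_name = 'tab_test';

ROLLBACK;
```

When matching rows exist and the optimizer chooses idx_1, the lock list would typically contain entries for both idx_1 and PRIMARY, which is the two-step locking described in this section.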
The deadlock log looks like this:
*** (1) TRANSACTION:
TRANSACTION 0 677833455, ACTIVE 0 sec, process no 11393, OS thread id 278546 starting index read
mysql tables in use 1, locked 1
LOCK WAIT 3 lock struct(s), heap size 320
MySQL thread id 83, query id 162348740 dcnet03 dcnet Searching rows for update
update tab_test set state=1064,time=now() where state=1061 and time < date_sub(now(), INTERVAL 30 minute) (SQL statement of task 1)
*** (1) WAITING FOR THIS LOCK TO BE GRANTED: (index record that task 1 is waiting for)
RECORD LOCKS space id 0 page no 849384 n bits 208 index `PRIMARY` of table `db/tab_test` trx id 0 677833455 lock_mode X locks rec but not gap waiting
Record lock, heap no 92 PHYSICAL RECORD: n_fields 11; compact format; info bits 0
0: len 8; hex 800000000097629c; asc b ;;
1: len 6; hex 00002866eaee; asc (f ;;
2: len 7; hex 00000d40040110; asc @ ;;
3: len 8; hex 80000000000050b2; asc P ;;
4: len 8; hex 800000000000502a; asc P*;;
5: len 8; hex 8000000000005426; asc T&;;
6: len 8; hex 800012412c66d29c; asc A,f ;;
7: len 23; hex 75706c6f6164666972652e636f6d2f68616e642e706870; asc xxx.com/;;
8: len 8; hex 800000000000042b; asc +;;
9: len 4; hex 474bfa2b; asc GK +;;
10: len 8; hex 8000000000004e24; asc N$;;
*** (2) TRANSACTION:
TRANSACTION 0 677833454, ACTIVE 0 sec, process no 11397, OS thread id 344086 updating or deleting, thread declared inside InnoDB 499
mysql tables in use 1, locked 1
3 lock struct(s), heap size 320, undo log entries 1
MySQL thread id 84, query id 162348739 dcnet03 dcnet Updating
update tab_test set state=1067,time=now () where id in (9921180) (SQL statement of task 2)
*** (2) HOLDS THE LOCK(S): (lock already held by task 2)
RECORD LOCKS space id 0 page no 849384 n bits 208 index `PRIMARY` of table `db/tab_test` trx id 0 677833454 lock_mode X locks rec but not gap
Record lock, heap no 92 PHYSICAL RECORD: n_fields 11; compact format; info bits 0
0: len 8; hex 800000000097629c; asc b ;;
1: len 6; hex 00002866eaee; asc (f ;;
2: len 7; hex 00000d40040110; asc @ ;;
3: len 8; hex 80000000000050b2; asc P ;;
4: len 8; hex 800000000000502a; asc P*;;
5: len 8; hex 8000000000005426; asc T&;;
6: len 8; hex 800012412c66d29c; asc A,f ;;
7: len 23; hex 75706c6f6164666972652e636f6d2f68616e642e706870; asc uploadfire.com/hand.php;;
8: len 8; hex 800000000000042b; asc +;;
9: len 4; hex 474bfa2b; asc GK +;;
10: len 8; hex 8000000000004e24; asc N$;;
*** (2) WAITING FOR THIS LOCK TO BE GRANTED: (lock that task 2 is waiting for)
RECORD LOCKS space id 0 page no 843102 n bits 600 index `idx_1` of table `db/tab_test` trx id 0 677833454 lock_mode X locks rec but not gap waiting
Record lock, heap no 395 PHYSICAL RECORD: n_fields 3; compact format; info bits 0
0: len 8; hex 8000000000000425; asc %;;
1: len 8; hex 800012412c66d29c; asc A,f ;;
2: len 8; hex 800000000097629c; asc b ;;
*** WE ROLL BACK TRANSACTION (1)
(Task 1 was rolled back to break the deadlock.)
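A log like the one above is normally obtained from InnoDB itself. As a brief sketch, the following statements are the usual way to retrieve it; changing the global variable requires sufficient privileges.

```sql
-- Print the InnoDB monitor output; the section titled
-- "LATEST DETECTED DEADLOCK" describes the most recent deadlock.
SHOW ENGINE INNODB STATUS;

-- Optionally record every deadlock in the MySQL error log,
-- not just the latest one (available since MySQL 5.6).
SET GLOBAL innodb_print_all_deadlocks = ON;
```

SHOW ENGINE INNODB STATUS keeps only the latest deadlock, so the second setting is useful when deadlocks are frequent.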
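Cause analysis:
When "update tab_test set state=1064,time=now() where state=1061 and time < date_sub(now(), INTERVAL 30 minute)" runs, MySQL uses the idx_1 index, so it first locks the matching idx_1 index records. Because idx_1 is a non-primary-key index, MySQL then also locks the corresponding primary key index records in order to execute the statement.
Suppose "update tab_test set state=1067,time=now () where id in (9921180)" runs at almost the same moment. This statement first locks the primary key index record and, because it changes the value of state, it also needs to lock the affected records in idx_1.
As a result, the first statement holds the lock on the idx_1 record and waits for the primary key record, while the second statement holds the lock on the primary key record and waits for the idx_1 record. That is how the deadlock arises.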
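6. Solution
Split the first statement: first select the primary key values that match the condition, then update the rows by primary key:
select id from tab_test where state=1061 and time < date_sub(now(), INTERVAL 30 minute);
update tab_test set state=1064,time=now() where id in(......);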
That concludes this worked analysis of MySQL deadlocks and how to resolve them. I hope you found it useful!
MySQL official documentation: http://dev.mysql.com/doc/refman/5.7/en/innodb-deadlock-example.html