mysql連接hang住問題分析 【問題現象】: 1. Linuxc多線程連接mysql數據庫,每次都是短連接,操作完後就釋放連接,有時候會出現mysql_real_connect掛住的現象 2. 掛住超時mysql_real_connect返回後報錯如下:Lostconnection to MySQL server at 'reading initial communication packet', systemerror: 104,返錯後線程號沒變,會繼續往下運行 【初步原因分析】: 1. mysql_real_connect連接數據庫, 沒有顯式調用超時時間,重連什麼的,使用的默認值: 2. 昨天拿c mysqlclient的源碼進行了分析測試,發現以下幾個配置項的默認值(如果調mysql_real_connect之前沒有設置任何屬性,mysql client端的機器上/etc/my.cnf也沒有配置)如下: connect_timeout = 0 read_timeout = 0 write_timeout = 0 reconnect = 0 //1表示自動重連 所以這些參數是需要顯式設置的。 {net = {vio = 0x7b8f150, options = {connect_timeout = 0, read_timeout = 0, write_timeout = 0,} 3. Mysqlclient中報錯的代碼如下(與服務端連接建立後,讀解析包的時候失敗了): /* Part 1: Connection established, read and parse first packet */ if ((pkt_length=cli_safe_read(mysql)) == packet_error) { if (mysql->net.last_errno == CR_SERVER_LOST) set_mysql_extended_error(mysql, CR_SERVER_LOST, unknown_sqlstate, ER(CR_SERVER_LOST_EXTENDED), "reading initial communication packet", errno); fprintf(myfp,"%s after cli_safe_read packet_error \n",cur_time()); goto error; } 線程阻塞時抓到的堆棧如下: #0 0x0000003ebe0c5f3b in read () from /lib64/libc.so.6 #1 0x00007f240dd91430 in vio_read () from /usr/lib/libmysql.so.16.0.0 #2 0x00007f240dcf9b05 in my_real_read () from /usr/lib/libmysql.so.16.0.0 #3 0x00007f240dcf9e38 in my_net_read () from /usr/lib/libmysql.so.16.0.0 #4 0x00007f240dcec396 in cli_safe_read () from /usr/lib/libmysql.so.16.0.0 #5 0x00007f240dceea17 in mysql_real_connect () from /usr/lib/libmysql.so.16.0.0 4. 似乎根本原因是為什麼阻塞在了read函數上,並且沒有超時返回 size_t vio_read(Vio * vio, uchar* buf, size_t size) { size_t r; DBUG_ENTER("vio_read"); DBUG_PRINT("enter", ("sd: %d buf: 0x%lx size: %u", vio->sd, (long) buf, (uint) size)); /* Ensure nobody uses vio_read_buff and vio_read simultaneously */ DBUG_ASSERT(vio->read_end == vio->read_pos); #ifdef __WIN__ r = recv(vio->sd, buf, size,0); #else errno=0; /* For linux */ r = read(vio->sd, buf, size); #endif /* __WIN__ */ #ifndef DBUG_OFF if (r == (size_t) -1) { DBUG_PRINT("vio_error", ("Got error %d during read",errno)); } #endif /* DBUG_OFF */ DBUG_PRINT("exit", ("%ld", (long) r)); DBUG_RETURN(r); } 5. 跟了下mysqlclient中與服務端連接讀寫socket的代碼,發現配置的mysql_options中配置的MYSQL_OPT_READ_TIMEOUT和MYSQL_OPT_WRITE_TIMEOUT正是socket連接中的讀寫超時時間: setsockopt(vio->sd, SOL_SOCKET, which ? SO_SNDTIMEO : SO_RCVTIMEO, IF_WIN(const char*, const void*)&wait_timeout, sizeof(wait_timeout)) 這個超時時間默認是很大的值(31536000s),如果read阻塞了線程就會掛住,直到tcp的超時時間斷開連接(默認2小時) 6. mysql官網上有類似的說明: http://dev.mysql.com/worklog/task/?id=1907 http://bugs.mysql.com/bug.php?id=4143 7. 數據庫讀/寫超時可以通過調用mysql_options()設置,具體參數為MYSQL_OPT_READ_TIMEOUT和MYSQL_OPT_WRITE_TIMEOUT。 MYSQL_OPT_READ_TIMEOUT (argument type: unsigned int *) The timeout in seconds for attempts to read from the server. Each attempt usesthis timeout value and there are retries if necessary, so the total effectivetimeout value is three times the option value. You can set the value so that alost connection can be detected earlier than the TCP/IP Close_Wait_Timeoutvalue of 10 minutes. Before MySQL 5.1.41, this option applies only to TCP/IPconnections and, prior to MySQL 5.1.12, only for Windows. MYSQL_OPT_WRITE_TIMEOUT (argument type: unsigned int *) The timeout in seconds for attempts to write to the server. Each attempt usesthis timeout value and there are net_retry_count retries if necessary, so thetotal effective timeout value is net_retry_count times the option value. BeforeMySQL 5.1.41, this option applies only to TCP/IP connections and, prior toMySQL 5.1.12, only for Windows. 然而,值得注意的是此兩個參數並不是對所有版本都支持: Before MySQL 5.1.41, this option applies only to TCP/IPconnections and, prior to MySQL 5.1.12, only for Windows. 對於linux系統來說,你必須使用官方發布的5.1.12版本之後的客戶端庫,否則你需要自己編譯線程安全的客戶端庫。