MariaDB Thread Pool Source Code Analysis

Editor: MySQL綜合教程

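Before the thread pool was introduced, MySQL supported two thread handling modes, selected by the thread_handling parameter: no-threads and one-thread-per-connection. With no-threads, at most one connection can be connected to the server at any time, so it is generally used only for experiments. With one-thread-per-connection, the server creates a thread for each connection, and that thread handles all of the connection's requests until the connection is closed and the thread ends. This is the default value of thread_handling.

The problem with one-thread-per-connection is that every connection needs its own thread. Once the number of concurrent connections reaches a certain level, performance drops noticeably, because too many threads cause frequent context switches, a lower CPU cache hit rate, and fiercer lock contention.

The way to solve this is to reduce the number of threads, which means several connections must share threads, and that is where the thread pool comes in. Threads in a pool serve requests rather than connections, so several connections may have their requests handled by the same thread.

MariaDB introduced a dynamic thread pool in 5.5 that automatically increases or decreases the number of threads according to the current request concurrency. Fortunately MariaDB is fully open source, so this article walks through the thread pool implementation using the MariaDB code. The code tree used here is MariaDB 10.0.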

1 Related parameters

MySQL's system variables are all defined in the sys_vars.cc file.

static Sys_var_uint Sys_threadpool_idle_thread_timeout( 
  "thread_pool_idle_timeout", 
  "Timeout in seconds for an idle thread in the thread pool." 
  "Worker thread will be shut down after timeout", 
  GLOBAL_VAR(threadpool_idle_timeout), CMD_LINE(REQUIRED_ARG), 
  VALID_RANGE(1, UINT_MAX), DEFAULT(60), BLOCK_SIZE(1) 
); 
static Sys_var_uint Sys_threadpool_oversubscribe( 
  "thread_pool_oversubscribe", 
  "How many additional active worker threads in a group are allowed.", 
  GLOBAL_VAR(threadpool_oversubscribe), CMD_LINE(REQUIRED_ARG), 
  VALID_RANGE(1, 1000), DEFAULT(3), BLOCK_SIZE(1) 
); 
static Sys_var_uint Sys_threadpool_size( 
 "thread_pool_size", 
 "Number of thread groups in the pool. " 
 "This parameter is roughly equivalent to maximum number of concurrently " 
 "executing threads (threads in a waiting state do not count as executing).", 
  GLOBAL_VAR(threadpool_size), CMD_LINE(REQUIRED_ARG), 
  VALID_RANGE(1, MAX_THREAD_GROUPS), DEFAULT(my_getncpus()), BLOCK_SIZE(1), 
  NO_MUTEX_GUARD, NOT_IN_BINLOG, ON_CHECK(0), 
  ON_UPDATE(fix_threadpool_size) 
); 
static Sys_var_uint Sys_threadpool_stall_limit( 
 "thread_pool_stall_limit", 
 "Maximum query execution time in milliseconds," 
 "before an executing non-yielding thread is considered stalled." 
 "If a worker thread is stalled, additional worker thread " 
 "may be created to handle remaining clients.", 
  GLOBAL_VAR(threadpool_stall_limit), CMD_LINE(REQUIRED_ARG), 
  VALID_RANGE(10, UINT_MAX), DEFAULT(500), BLOCK_SIZE(1), 
  NO_MUTEX_GUARD, NOT_IN_BINLOG, ON_CHECK(0),  
  ON_UPDATE(fix_threadpool_stall_limit) 
);

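Each of these parameters carries its own description; here is a little more detail.

thread_pool_size: the number of groups in the thread pool. MariaDB's thread pool is not one big pool. It is split into groups, and connections are assigned to groups in the order they arrive: if the first connection goes to group[0], the second goes to group[1], in round-robin fashion. The default is the number of CPU cores.

thread_pool_idle_timeout: the maximum idle time of a thread, in seconds. A thread that stays idle longer than this exits.

thread_pool_stall_limit: the monitoring interval, in milliseconds. The thread pool has a monitor thread that, at this interval, checks the state of each group, such as whether it has threads available, and reacts accordingly, for example by waking up or creating a thread.

thread_pool_oversubscribe: the number of additional active threads allowed in each group. Note that this is not the maximum number of threads in a group, only a limit on the threads that may be processing requests at the same time.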

2 Setting thread_handling

Thread pool mode is really just an additional thread_handling mode, enabled in the configuration file:

[mysqld] 
thread_handling=pool-of-threads
....

Internally MySQL has a scheduler_functions struct. Whatever the thread_handling mode, scheduling is switched by setting the function pointers in this struct.

/** The scheduler_functions struct */
struct scheduler_functions 
{ 
  uint max_threads, *connection_count; 
  ulong *max_connections; 
  bool (*init)(void); 
  bool (*init_new_connection_thread)(void); 
  void (*add_connection)(THD *thd); 
  void (*thd_wait_begin)(THD *thd, int wait_type); 
  void (*thd_wait_end)(THD *thd); 
  void (*post_kill_notification)(THD *thd); 
  bool (*end_thread)(THD *thd, bool cache_thread); 
  void (*end)(void); 
}; 
static int get_options(int *argc_ptr, char ***argv_ptr) 
{ 
  ... 
  /** Choose the scheduler according to the thread_handling option */
  if (thread_handling <= SCHEDULER_ONE_THREAD_PER_CONNECTION)
    /** one-thread-per-connection mode */
    one_thread_per_connection_scheduler(thread_scheduler, &max_connections, 
                                        &connection_count); 
  else if (thread_handling == SCHEDULER_NO_THREADS) 
    /** no-threads mode */
    one_thread_scheduler(thread_scheduler); 
  else 
    /** thread pool mode */
    pool_of_threads_scheduler(thread_scheduler,  &max_connections, 
                                        &connection_count);  
  ...                                         
} 
static scheduler_functions tp_scheduler_functions= 
{ 
  0,                                  // max_threads 
  NULL,                               // connection_count
  NULL,                               // max_connections
  tp_init,                            // init 
  NULL,                               // init_new_connection_thread 
  tp_add_connection,                  // add_connection 
  tp_wait_begin,                      // thd_wait_begin 
  tp_wait_end,                        // thd_wait_end 
  post_kill_notification,             // post_kill_notification 
  NULL,                               // end_thread 
  tp_end                              // end 
}; 
void pool_of_threads_scheduler(struct scheduler_functions *func, 
    ulong *arg_max_connections, 
    uint *arg_connection_count) 
{ 
  /** Set the scheduler_functions struct to tp_scheduler_functions */
  *func = tp_scheduler_functions; 
  func->max_threads= threadpool_max_threads; 
  func->max_connections= arg_max_connections; 
  func->connection_count= arg_connection_count; 
  scheduler_init(); 
}

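As shown above, the handler functions of thread_scheduler are set to tp_scheduler_functions, which is the thread pool mode. In this mode the init function is tp_init, the function that adds a new connection is tp_add_connection, the wait-begin function is tp_wait_begin, and the wait-end function is tp_wait_end. A word on the wait functions: when a thread is about to wait for disk I/O, a lock, a SLEEP, or a network message, it calls wait_begin, and when the wait is over it calls wait_end. Why call them around a wait? That is covered later.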
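The dispatch pattern can be illustrated with a minimal, self-contained sketch. Every name in it (SchedulerFunctions, install_scheduler, accept_connection and the two handler functions) is hypothetical and greatly simplified. It is not MariaDB's code, only the same idea of a function-pointer table that the caller goes through without knowing which scheduler is installed.

```cpp
#include <cassert>
#include <string>

// Hypothetical, trimmed-down stand-in for scheduler_functions.
struct SchedulerFunctions {
  const char *name;
  std::string (*add_connection)(int connection_id);
};

// Two illustrative handlers; they only describe what a real scheduler would do.
static std::string per_connection_add(int id) {
  return "spawn dedicated thread for connection " + std::to_string(id);
}

static std::string pool_add(int id) {
  return "queue connection " + std::to_string(id) + " in a thread group";
}

static const SchedulerFunctions per_connection_scheduler = {
    "one-thread-per-connection", per_connection_add};
static const SchedulerFunctions pool_scheduler = {"pool-of-threads", pool_add};

// Same idea as pool_of_threads_scheduler(): copy a prebuilt table into the
// active one.
void install_scheduler(SchedulerFunctions *active, bool use_pool) {
  *active = use_pool ? pool_scheduler : per_connection_scheduler;
}

// The caller never names a concrete scheduler; it only uses the table.
std::string accept_connection(const SchedulerFunctions &active, int id) {
  assert(active.add_connection != nullptr);
  return active.add_connection(id);
}
```

Calling install_scheduler(&active, true) and then accept_connection(active, 7) goes through pool_add; switching the flag changes the behaviour without touching the caller.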

The above is not closely tied to the thread pool itself. The thread pool flow starts here. The relevant source is in sql/threadpool_common.cc and sql/threadpool_unix.cc, plus sql/threadpool_win.cc on Windows.

3 Thread pool initialization: tp_init

>tp_init 
| >thread_group_init 
| >start_timer

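tp_init is very simple. It first calls thread_group_init to initialize the groups, then calls start_timer to start the monitor thread timer_thread. At this point the only thread pool thread running is the monitor thread; there are no worker threads until a new connection arrives.

4 Adding a new connection: tp_add_connection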

void tp_add_connection(THD *thd) 
{ 
  DBUG_ENTER("tp_add_connection"); 
  threads.append(thd); 
  mysql_mutex_unlock(&LOCK_thread_count); 
  connection_t *connection= alloc_connection(thd); 
  if (connection) 
  { 
    thd->event_scheduler.data= connection; 
    /* Assign connection to a group. */ 
    thread_group_t *group=  
      &all_groups[thd->thread_id%group_count]; 
    connection->thread_group=group; 
    mysql_mutex_lock(&group->mutex); 
    group->connection_count++; 
    mysql_mutex_unlock(&group->mutex); 
    /* 
       Add connection to the work queue.Actual logon  
       will be done by a worker thread. 
    */ 
   queue_put(group, connection); 
  } 
  else 
  { 
    /* Allocation failed */ 
    threadpool_remove_connection(thd); 
  }  
  DBUG_VOID_RETURN; 
}

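When the server's main listening thread sees a client connect, it calls tp_add_connection. The function first takes thread_id modulo group_count to find the group the connection belongs to, then calls queue_put to place the connection in that group's queue. Two new structs appear here, connection_t and thread_group_t.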
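The modulo-based assignment can be sketched as follows. The helper names group_index and connections_per_group are hypothetical; the only thing taken from the source is the expression thread_id % group_count. The sketch shows that consecutive thread ids are spread evenly across the groups, which is the round-robin effect described earlier.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Index of the group a connection lands in, given its thread id.
std::size_t group_index(unsigned long thread_id, unsigned group_count) {
  assert(group_count > 0);
  return thread_id % group_count;
}

// For each group, count how many of the given thread ids map to it.
std::vector<int> connections_per_group(
    const std::vector<unsigned long> &thread_ids, unsigned group_count) {
  assert(group_count > 0);
  std::vector<int> counts(group_count, 0);
  for (unsigned long id : thread_ids)
    counts[group_index(id, group_count)]++;
  return counts;
}
```

With four groups, thread ids 1 through 8 put two connections in every group. The balance depends on thread ids being handed out sequentially; it says nothing about how busy each connection is.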

struct connection_t 
{ 
  THD *thd; 
  thread_group_t *thread_group; 
  connection_t *next_in_queue; 
  connection_t **prev_in_queue; 
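  ulonglong abs_wait_timeout; // absolute wait timeout
  bool logged_in; // whether login authentication has been done
  bool bound_to_poll_descriptor; // whether the socket has been added to epoll
  bool waiting; // whether the connection is waiting, e.g. on I/O or sleep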
}; 
struct thread_group_t  
{ 
  mysql_mutex_t mutex; 
  connection_queue_t queue;  // queue of connections with pending requests
  worker_list_t waiting_threads; // threads in the group waiting to be woken
  worker_thread_t *listener;  // the thread currently listening for the group
  pthread_attr_t *pthread_attr;
  int  pollfd;  // epoll file descriptor that all connections in the group are bound to
  int  thread_count;  // number of threads
  int  active_thread_count; // number of active threads
  int  connection_count; // number of connections
  /* Stats for the deadlock detection timer routine.*/
  int io_event_count;  // number of events produced by epoll
  int queue_event_count; // number of events consumed by worker threads
  ulonglong last_thread_creation_time;
  int  shutdown_pipe[2];
  bool shutdown;
  bool stalled; // whether the worker threads are stalled
} MY_ALIGNED(512);

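The comments above describe each field. Understanding them is the key to understanding how this dynamic thread pool is managed, because each of them affects how the pool grows or shrinks.

With the structs covered, back to the arrival of a new connection. queue_put is called to place the connection in the group's queue.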

static void queue_put(thread_group_t *thread_group, connection_t *connection) 
{ 
  DBUG_ENTER("queue_put"); 
  mysql_mutex_lock(&thread_group->mutex); 
  thread_group->queue.push_back(connection); 
  if (thread_group->active_thread_count == 0) 
    wake_or_create_thread(thread_group); 
  mysql_mutex_unlock(&thread_group->mutex); 
  DBUG_VOID_RETURN; 
}

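Note the check on active_thread_count: if there is no active thread, nothing can handle the newly arrived request, so wake_or_create_thread is called. That function first tries to wake a thread from the group's waiting_threads list; if no thread is waiting, it creates one. The new connection is now on the group's queue. How does it get processed? Read on.

5 Worker threads: worker_main

Since this is the first connection, there are certainly no waiting_threads, so create_worker is called to create a worker thread. Let's look directly at the worker thread.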
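The decision made by queue_put and wake_or_create_thread can be modelled without any real threads. The sketch below is a single-threaded simplification under stated assumptions: GroupModel, Action and the bookkeeping inside wake_or_create are hypothetical, and in particular the model bumps active_thread_count immediately, which is not necessarily how the real code accounts for it.

```cpp
#include <cassert>
#include <deque>

enum class Action { None, WokeWaiter, CreatedThread };

// Hypothetical, simplified stand-in for thread_group_t.
struct GroupModel {
  std::deque<int> queue;        // pending connection ids
  int waiting_threads = 0;      // idle workers parked in the group
  int thread_count = 0;         // all workers in the group
  int active_thread_count = 0;  // workers currently able to run requests
};

// Prefer waking a parked worker; create a new one only if none is waiting.
Action wake_or_create(GroupModel &g) {
  assert(g.waiting_threads >= 0);
  if (g.waiting_threads > 0) {
    g.waiting_threads--;
    g.active_thread_count++;
    return Action::WokeWaiter;
  }
  g.thread_count++;
  g.active_thread_count++;
  return Action::CreatedThread;
}

// Enqueue the connection and make sure somebody will pick it up.
Action queue_put(GroupModel &g, int connection_id) {
  g.queue.push_back(connection_id);
  if (g.active_thread_count == 0)
    return wake_or_create(g);
  return Action::None;
}
```

On an empty group the first queue_put creates a thread, a second one does nothing because a worker is already active, and once that worker has parked itself a later queue_put wakes it instead of creating another thread.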

static void *worker_main(void *param) 
{ 
 ... 
  DBUG_ENTER("worker_main"); 
  thread_group_t *thread_group = (thread_group_t *)param; 
  /* Run event loop */ 
  for(;;) 
  { 
    connection_t *connection; 
    struct timespec ts; 
    set_timespec(ts,threadpool_idle_timeout); 
    connection = get_event(&this_thread, thread_group, &ts); 
    if (!connection) 
      break; 
    this_thread.event_count++; 
    handle_event(connection); 
  } 
  .... 
  my_thread_end(); 
  return NULL; 
}

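That is the whole logic of a worker thread: a loop in which get_event fetches the next connection that needs processing and handle_event processes it. Each thread in one-thread-per-connection mode is also a loop. The biggest difference is that a thread pool worker waits for any available event, not for events of one fixed connection, whereas a one-thread-per-connection thread waits only for events on its own connection.

6 Getting events: get_event

Worker threads obtain the connection to process through get_event: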

connection_t *get_event(worker_thread_t *current_thread,  
  thread_group_t *thread_group,  struct timespec *abstime) 
{  
  ... 
  for(;;)  
  { 
  ... 
      /** Get a connection from the queue */
      connection = queue_get(thread_group); 
      if(connection) { 
        fprintf(stderr, "Thread %x get a new connection.\n", (unsigned int)pthread_self()); 
        break; 
      } 
     ... 
      /** Listen on epoll */
    if(!thread_group->listener) 
    { 
      thread_group->listener= current_thread; 
      thread_group->active_thread_count--; 
      mysql_mutex_unlock(&thread_group->mutex); 
      fprintf(stderr, "Thread %x waiting for a new event.\n", (unsigned int)pthread_self()); 
      connection = listener(current_thread, thread_group); 
      fprintf(stderr, "Thread %x get a new event for connection %p.\n", 
              (unsigned int)pthread_self(), connection); 
      mysql_mutex_lock(&thread_group->mutex); 
      thread_group->active_thread_count++; 
      /* There is no listener anymore, it just returned. */ 
      thread_group->listener= NULL; 
      break; 
    } 
    ...
  }
  ...
}

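get_event contains a fair amount of logic; only the two points where events are obtained are excerpted here. Continuing with the first-connection scenario: when the first connection arrives, the queue holds one connection, so get_event takes it from the queue and returns it to worker_main, which then calls handle_event to process it.

After a new connection has connected to the server, its socket is bound to the group's epoll instance. So when the queue holds no connection, the next one has to come from epoll. The sockets of all connections in a group are bound to that group's epoll, and at any moment at most one thread listens on it. When epoll reports an event, the corresponding connection is returned and handle_event is called to process it.

7 Event handling: handle_event

handle_event is fairly simple. It branches on whether the connection_t has logged in: if not, the connection is new and is authenticated; otherwise the request is processed directly.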

static void handle_event(connection_t *connection) 
{ 
  DBUG_ENTER("handle_event"); 
  int err; 
  if (!connection->logged_in) // handle login
  { 
    err= threadpool_add_connection(connection->thd); 
    connection->logged_in= true; 
  } 
  else  // handle a request
  { 
    err= threadpool_process_request(connection->thd); 
  } 
  if(err) 
    goto end; 
  set_wait_timeout(connection); 
  /** Register the socket with epoll */
  err= start_io(connection); 
end: 
  if (err) 
    connection_abort(connection); 
  DBUG_VOID_RETURN; 
} 
static int start_io(connection_t *connection) 
{  
  int fd = mysql_socket_getfd(connection->thd->net.vio->mysql_socket); 
  ... 
  /* Bind to epoll */
  if (!connection->bound_to_poll_descriptor) 
  { 
    connection->bound_to_poll_descriptor= true; 
    return io_poll_associate_fd(group->pollfd, fd, connection); 
  } 
  return io_poll_start_read(group->pollfd, fd, connection); 
}

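Note that handle_event finishes by calling start_io. This function is important: it binds the socket of a new connection to the group's epoll so that it is monitored. For a connection that is already bound, it re-arms reading through io_poll_start_read.

8 Thread waiting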
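The two-step pattern in start_io (associate once, then re-arm after every event) maps naturally onto one-shot epoll registrations. The Linux-only sketch below demonstrates that pattern on a pipe. It is an assumption-laden illustration: the article does not show the bodies of io_poll_associate_fd and io_poll_start_read, so the event flags (EPOLLIN | EPOLLONESHOT) and the names Conn, associate_fd, start_read, wait_for_event and oneshot_demo are choices made here, not MariaDB's.

```cpp
#include <cassert>
#include <sys/epoll.h>
#include <unistd.h>

struct Conn { int id; };  // hypothetical stand-in for connection_t

// First registration: attach the Conn pointer as user data (EPOLL_CTL_ADD).
int associate_fd(int epfd, int fd, Conn *c) {
  assert(c != nullptr);
  epoll_event ev{};
  ev.events = EPOLLIN | EPOLLONESHOT;
  ev.data.ptr = c;
  return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

// Re-arm a descriptor that a one-shot event has disabled (EPOLL_CTL_MOD).
int start_read(int epfd, int fd, Conn *c) {
  assert(c != nullptr);
  epoll_event ev{};
  ev.events = EPOLLIN | EPOLLONESHOT;
  ev.data.ptr = c;
  return epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev);
}

// Wait up to timeout_ms; return the Conn of the ready descriptor or nullptr.
Conn *wait_for_event(int epfd, int timeout_ms) {
  epoll_event ev{};
  int n = epoll_wait(epfd, &ev, 1, timeout_ms);
  return n == 1 ? static_cast<Conn *>(ev.data.ptr) : nullptr;
}

// Runs the associate / wait / re-arm cycle on a pipe; true if every step
// behaved as expected.
bool oneshot_demo() {
  int fds[2];
  if (pipe(fds) != 0) return false;
  int epfd = epoll_create1(0);
  if (epfd < 0) {
    close(fds[0]);
    close(fds[1]);
    return false;
  }
  Conn conn{42};
  bool ok = associate_fd(epfd, fds[0], &conn) == 0;
  ok = ok && write(fds[1], "x", 1) == 1;
  ok = ok && wait_for_event(epfd, 100) == &conn;  // delivered once
  ok = ok && wait_for_event(epfd, 0) == nullptr;  // disabled until re-armed
  ok = ok && start_read(epfd, fds[0], &conn) == 0;
  ok = ok && wait_for_event(epfd, 100) == &conn;  // byte still unread
  close(epfd);
  close(fds[0]);
  close(fds[1]);
  return ok;
}
```

The point of a one-shot registration is that while one worker is handling a connection, no other thread is told about new data on the same socket; the descriptor becomes visible again only when the worker re-arms it.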

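When the threads of a group have nothing to do, they all wait in get_event, in one of two ways. One is waiting for events on epoll; only one thread per group does this, and it waits indefinitely until a new event arrives. The other is a timed wait of thread_pool_idle_timeout; if that time elapses, the thread's get_event returns NULL and the worker_main thread exits. If the thread is woken during the wait, it continues looping in get_event, waiting for new events.

9 Waking waiting threads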
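The timed wait can be sketched with standard C++ threads. This is a rough model of the second kind of waiting only: it has no epoll listener, and MiniGroup, run_pool and the simplified get_event and worker_main below are hypothetical stand-ins that borrow the names for readability. Workers drain a shared queue and exit once they have been idle for the whole timeout.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

struct MiniGroup {
  std::mutex mutex;
  std::condition_variable cond;
  std::queue<int> queue;  // pending events
  int handled = 0;        // events processed so far (guarded by mutex)
};

// Returns true and fills *event if work arrived before the idle timeout;
// returns false when the worker should exit.
bool get_event(MiniGroup &g, std::chrono::milliseconds idle_timeout,
               int *event) {
  std::unique_lock<std::mutex> lock(g.mutex);
  if (!g.cond.wait_for(lock, idle_timeout,
                       [&] { return !g.queue.empty(); }))
    return false;  // idle for the whole timeout
  *event = g.queue.front();
  g.queue.pop();
  return true;
}

void worker_main(MiniGroup &g, std::chrono::milliseconds idle_timeout) {
  int event;
  while (get_event(g, idle_timeout, &event)) {
    std::lock_guard<std::mutex> lock(g.mutex);
    g.handled++;  // stands in for handle_event()
  }
}

// Queues the events, starts the workers, waits until all have idled out.
int run_pool(int workers, int events,
             std::chrono::milliseconds idle_timeout) {
  assert(workers > 0);
  MiniGroup g;
  for (int e = 0; e < events; e++) g.queue.push(e);
  std::vector<std::thread> threads;
  for (int i = 0; i < workers; i++)
    threads.emplace_back(worker_main, std::ref(g), idle_timeout);
  for (auto &t : threads) t.join();
  return g.handled;
}
```

run_pool(4, 100, std::chrono::milliseconds(20)) processes all 100 events and returns only after every worker has timed out, which mirrors how an idle pool shrinks on its own.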

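Waiting threads are woken in two ways. One is the monitor thread timer_thread. The other is an active thread that is about to wait: it calls tp_wait_begin, and if that function finds that there is no active thread left and no thread listening on epoll, it calls wake_or_create_thread.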
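The wait_begin and wait_end bookkeeping can be modelled in a few lines. This is a sketch of the rule as described in the paragraph above, not a copy of the real functions: GroupState and its fields are hypothetical, and the exact condition in the MariaDB source may differ in detail.

```cpp
#include <cassert>

// Hypothetical, simplified view of a thread group.
struct GroupState {
  int thread_count = 0;
  int active_thread_count = 0;
  int waiting_threads = 0;
  bool has_listener = false;
  int wakeups = 0;  // how often wake_or_create was invoked
};

// Wake a parked worker if there is one, otherwise create a new worker.
void wake_or_create(GroupState &g) {
  g.wakeups++;
  if (g.waiting_threads > 0)
    g.waiting_threads--;
  else
    g.thread_count++;
  g.active_thread_count++;
}

// Called just before a worker blocks on a lock, disk I/O, sleep and so on.
void wait_begin(GroupState &g) {
  assert(g.active_thread_count > 0);
  g.active_thread_count--;  // the caller no longer counts as active
  if (g.active_thread_count == 0 && !g.has_listener)
    wake_or_create(g);  // keep the group able to serve other connections
}

// Called when the blocking operation has finished.
void wait_end(GroupState &g) {
  g.active_thread_count++;
}
```

Starting from one active thread, wait_begin brings in a second worker, and after wait_end both count as active. This is how a group can temporarily exceed one running thread, and it is why the wait functions matter to the pool.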

The monitor thread timer_thread periodically checks thread usage in each group; the check itself is done by check_stall.

void check_stall(thread_group_t *thread_group) 
{ 
  ... 
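  /** If no thread is listening on epoll and no new event has arrived since the
  last check, all active threads are busy with long-running work, so a worker
  thread must be woken or created */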
  if (!thread_group->listener && !thread_group->io_event_count) 
  { 
    wake_or_create_thread(thread_group); 
    mysql_mutex_unlock(&thread_group->mutex); 
    return; 
  } 
  /*  Reset io event count */ 
  thread_group->io_event_count= 0; 
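  /** If the queue has pending requests and none has been consumed since the
  last check, all active threads are busy with long-running work, so a worker
  thread must be woken or created */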
  if (!thread_group->queue.is_empty() && !thread_group->queue_event_count) 
  { 
    thread_group->stalled= true; 
    wake_or_create_thread(thread_group); 
  } 
  /* Reset queue event count */ 
  thread_group->queue_event_count= 0; 
  mysql_mutex_unlock(&thread_group->mutex); 
}
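To see how the two counters behave across successive timer ticks, here is a small single-threaded model of the excerpt above. StallModel and its fields are hypothetical stand-ins for thread_group_t, locking is omitted, and wake_or_create_thread is reduced to a counter, so this only illustrates the control flow.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, lock-free stand-in for the fields check_stall looks at.
struct StallModel {
  bool has_listener = false;
  int io_event_count = 0;     // events seen by epoll since the last tick
  int queue_event_count = 0;  // events dequeued since the last tick
  std::size_t queued = 0;     // connections still sitting in the queue
  bool stalled = false;
  int wake_or_create_calls = 0;
};

// One timer tick, following the structure of the excerpt.
void check_stall(StallModel &g) {
  assert(g.io_event_count >= 0 && g.queue_event_count >= 0);
  // Nobody is listening and nothing arrived: bring in a worker.
  if (!g.has_listener && g.io_event_count == 0) {
    g.wake_or_create_calls++;
    return;
  }
  g.io_event_count = 0;
  // Work is queued but nothing was dequeued during the interval.
  if (g.queued > 0 && g.queue_event_count == 0) {
    g.stalled = true;
    g.wake_or_create_calls++;
  }
  g.queue_event_count = 0;
}
```

A group with a listener, queued work and some dequeue activity passes the first tick untouched. If nothing is dequeued before the next tick, the group is marked stalled and a worker is requested, so the stall is detected roughly one interval after progress stops.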