pthread_cond_wait Explained

mrjbydd 2012-09-24


1. First, the definition of pthread_cond_wait is as follows:

The pthread_cond_wait() and pthread_cond_timedwait() functions are used to block on a condition variable. They are called with mutex locked by the calling thread or undefined behaviour will result.

These functions atomically release mutex and cause the calling thread to block on the condition variable cond ; atomically here means "atomically with respect to access by another thread to the mutex and then the condition variable". That is, if another thread is able to acquire the mutex after the about-to-block thread has released it, then a subsequent call to pthread_cond_signal() or pthread_cond_broadcast() in that thread behaves as if it were issued after the about-to-block thread has blocked.

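2. As the explanation above shows, pthread_cond_wait() must be used together with a pthread_mutex. (Internally, wait unlocks the mutex as soon as it enters the waiting state and locks it again before the wait returns.)

pthread_cond_wait() automatically releases the mutex as soon as it enters the waiting state.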

In Thread1:

pthread_mutex_lock(&m_mutex);
pthread_cond_wait(&m_cond,&m_mutex);
pthread_mutex_unlock(&m_mutex);

In Thread2:

pthread_mutex_lock(&m_mutex);
pthread_cond_signal(&m_cond);
pthread_mutex_unlock(&m_mutex);

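Why must it be used together with a pthread_mutex? The mutex handles the case in which thread 1 has called pthread_cond_wait() but has not yet actually started waiting on the condition, and thread 2 calls pthread_cond_signal() at that moment. Without the mutex, that signal would be lost. With the mutex, thread 2 cannot call pthread_cond_signal() until the mutex has been released, that is, until pthread_cond_wait() has entered the waiting state and automatically released the mutex (provided that thread 2 also locks the same mutex). Note that a signal sent before thread 1 has locked the mutex at all is still lost, which is why real code also keeps a shared flag or predicate under the mutex and checks it before waiting.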
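To make the lost-signal argument concrete, here is a minimal sketch of the Thread1/Thread2 handshake with a predicate added. The names `ready`, `waiter`, and `run_handshake` are hypothetical and not part of the original example; the point is only that the state change and the signal both happen under `m_mutex`, so the waiter either sees `ready` already set or is guaranteed to be woken.

```c
#include <pthread.h>

/* Hypothetical names for illustration: ready, waiter, run_handshake. */
static pthread_mutex_t m_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  m_cond  = PTHREAD_COND_INITIALIZER;
static int ready = 0;                 /* predicate, protected by m_mutex */

static void *waiter(void *arg)
{
    int *seen = arg;

    pthread_mutex_lock(&m_mutex);
    while (!ready)                    /* skip the wait if the signal already happened */
        pthread_cond_wait(&m_cond, &m_mutex);
    *seen = ready;                    /* read the predicate while still holding the mutex */
    pthread_mutex_unlock(&m_mutex);
    return NULL;
}

/* Starts the waiter, signals it, and returns the value the waiter observed
 * (or -1 if the thread could not be created). */
int run_handshake(void)
{
    pthread_t t;
    int seen = 0;

    if (pthread_create(&t, NULL, waiter, &seen) != 0)
        return -1;

    pthread_mutex_lock(&m_mutex);
    ready = 1;                        /* change the shared state under the mutex */
    pthread_cond_signal(&m_cond);
    pthread_mutex_unlock(&m_mutex);

    pthread_join(t, NULL);
    return seen;
}
```

Whichever thread gets the mutex first, the waiter returns: if the signaller runs first, the waiter finds `ready` set and never blocks; if the waiter runs first, it is already blocked on `m_cond` by the time the signaller can take the mutex.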

3. Once pthread_cond_wait() has been woken up on the condition, it automatically locks the mutex again before returning.

This leads to another problem, because:

pthread_cond_wait() and pthread_cond_timedwait() are cancellation points.

In Thread3:

pthread_cancel(m_thread);

pthread_join(m_thread, NULL);

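Because pthread_cond_wait() and pthread_cond_timedwait() are cancellation points, Thread3 can call pthread_cancel() to terminate thread 1. Thread 1 is then cancelled inside pthread_cond_wait(&m_cond,&m_mutex), before it ever reaches pthread_mutex_unlock(&m_mutex). Since pthread_cond_wait() locks the mutex again before the cancellation is acted upon, thread 1 terminates while still holding the mutex (it never runs pthread_mutex_unlock()), and from then on Thread2 can never acquire the lock.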

The usual way to solve this problem is as follows:

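void cleanup(void *arg)
{
    /* Runs if the thread is cancelled while the handler is installed. */
    pthread_mutex_unlock(&mutex);
}

void *thread1(void *arg)
{
    pthread_mutex_lock(&mutex);
    pthread_cleanup_push(cleanup, NULL);  /* install the handler once the mutex is held */
    pthread_cond_wait(&cond, &mutex);     /* cancellation point */
    pthread_cleanup_pop(0);               /* remove the handler without running it */
    pthread_mutex_unlock(&mutex);
    return NULL;
}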

The same approach can be used for other threads that may terminate abnormally or exit.
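The effect of the cleanup handler can be checked with a small experiment. The sketch below is an assumption-laden illustration, not the original author's code: `waiting`, `never_true`, and `cancel_waiter_and_check_mutex` are hypothetical names. It cancels a thread blocked in pthread_cond_wait() and then uses pthread_mutex_trylock() to see whether the mutex was released.

```c
#include <pthread.h>
#include <sched.h>

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond  = PTHREAD_COND_INITIALIZER;
static int waiting = 0;      /* hypothetical flag: waiter has locked the mutex */
static int never_true = 0;   /* hypothetical predicate that is never set */

static void cleanup(void *arg)
{
    (void)arg;
    pthread_mutex_unlock(&mutex);   /* release the mutex re-acquired by the cancelled wait */
}

static void *thread1(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&mutex);
    pthread_cleanup_push(cleanup, NULL);
    waiting = 1;
    while (!never_true)
        pthread_cond_wait(&cond, &mutex);   /* cancellation point */
    pthread_cleanup_pop(1);                 /* normal path: pop and run the handler */
    return NULL;
}

/* Returns 0 if the mutex is free after the waiter was cancelled,
 * a non-zero error code otherwise. */
int cancel_waiter_and_check_mutex(void)
{
    pthread_t t;
    int started = 0;
    int rc;

    if (pthread_create(&t, NULL, thread1, NULL) != 0)
        return -1;

    /* Poll until the waiter has taken the mutex and released it again
     * by entering pthread_cond_wait(). */
    while (!started) {
        pthread_mutex_lock(&mutex);
        started = waiting;
        pthread_mutex_unlock(&mutex);
        if (!started)
            sched_yield();
    }

    pthread_cancel(t);
    pthread_join(t, NULL);

    rc = pthread_mutex_trylock(&mutex);     /* succeeds only if the handler unlocked */
    if (rc == 0)
        pthread_mutex_unlock(&mutex);
    return rc;
}
```

If the pthread_cleanup_push()/pthread_cleanup_pop() pair is removed from `thread1`, the same check would be expected to fail with EBUSY, because the cancelled thread terminates while holding the mutex.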

For more on the thread cleanup stack, see "Thread termination: pthread_cleanup_push/pthread_cleanup_pop()": http://www.cnblogs.com/mydomain/archive/2011/08/15/2139826.html

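Multithreaded programming on Linux will sooner or later call for condition variables, and with them the pthread_cond_wait() function. How this function actually executes, however, is rather hard to understand.
    The workflow of pthread_cond_wait() is as follows (taking the EXAMPLE from the man page):
       Consider two shared variables x and y, protected by the mutex mut, and a condition variable cond that is to be signaled whenever x becomes greater than y.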

              int x,y;
              pthread_mutex_t mut = PTHREAD_MUTEX_INITIALIZER;
              pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

Waiting until x is greater than y is performed as follows:

              pthread_mutex_lock(&mut);
              while (x <= y) {
                      pthread_cond_wait(&cond, &mut);
              }
              /* operate on x and y */
              pthread_mutex_unlock(&mut);

Modifications on x and y that may cause x to become greater than y should signal the condition if needed:

              pthread_mutex_lock(&mut);
              /* modify x and y */
              if (x > y) pthread_cond_broadcast(&cond);
              pthread_mutex_unlock(&mut);

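In this example, two threads work on the values of x and y. The first thread suspends itself while x <= y and continues only once x > y (the second thread may modify x and y, and wakes the first thread when x > y). First an ordinary mutex mut and a condition variable cond are initialized; after that, the two threads execute the following code, respectively: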

              pthread_mutex_lock(&mut);
              while (x <= y) {
                      pthread_cond_wait(&cond, &mut);
              }
              /* operate on x and y */
              pthread_mutex_unlock(&mut);

and:

              pthread_mutex_lock(&mut);
              /* modify x and y */
              if (x > y) pthread_cond_signal(&cond);
              pthread_mutex_unlock(&mut);
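The execution of the function is actually quite simple. When the first thread reaches pthread_cond_wait(&cond,&mut) with x <= y, the function unlocks the mutex mut and blocks the thread on the condition variable cond in one atomic step (the condition variable itself is not "locked"). The first thread is then suspended and consumes no CPU cycles.

The second thread, which was blocked because the first thread held mut, can now acquire mut, since it has been released, and modify x and y. After the modification, an if statement checks whether x > y; if so, pthread_cond_signal() wakes the first thread, and the next statement releases mut. The first thread then resumes inside pthread_cond_wait(): it must first lock mut again, and once that succeeds the call returns and the condition is tested again (why a while loop is used, that is, why the condition is rechecked after waking, is analyzed below). If the condition holds, the thread goes on to do its processing and finally releases mut.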
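The walkthrough above can be turned into a small complete program. This is only a sketch built around the man page fragment: the initial values, the thread functions `wait_for_x_gt_y` and `increment_x`, and the driver `run_xy_example` are assumptions added for illustration.

```c
#include <pthread.h>

static int x = 0, y = 5;    /* assumed starting values, so that x <= y at first */
static pthread_mutex_t mut = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

/* First thread: sleeps until x > y, then records x - y. */
static void *wait_for_x_gt_y(void *arg)
{
    int *diff = arg;

    pthread_mutex_lock(&mut);
    while (x <= y)
        pthread_cond_wait(&cond, &mut);   /* unlocks mut while blocked, relocks before returning */
    *diff = x - y;                        /* operate on x and y with mut held */
    pthread_mutex_unlock(&mut);
    return NULL;
}

/* Second thread: modifies x and signals once the condition may hold. */
static void *increment_x(void *arg)
{
    (void)arg;
    for (int i = 0; i < 10; i++) {
        pthread_mutex_lock(&mut);
        x++;
        if (x > y)
            pthread_cond_signal(&cond);
        pthread_mutex_unlock(&mut);
    }
    return NULL;
}

/* Runs both threads and returns the difference seen by the waiter,
 * or -1 if a thread could not be created. */
int run_xy_example(void)
{
    pthread_t waiter, modifier;
    int diff = 0;

    if (pthread_create(&waiter, NULL, wait_for_x_gt_y, &diff) != 0)
        return -1;
    if (pthread_create(&modifier, NULL, increment_x, NULL) != 0) {
        pthread_cancel(waiter);           /* waiter can never be satisfied now */
        pthread_join(waiter, NULL);
        return -1;
    }

    pthread_join(waiter, NULL);
    pthread_join(modifier, NULL);
    return diff;
}
```

The exact value returned depends on scheduling, since the modifier may increment x several more times before the waiter manages to relock mut, but it is always positive because the waiter only leaves the loop when x > y.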

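As for why the condition must be checked again after being woken (that is, why a while loop is used to test it), the reason is a possible "thundering herd" effect. Some people assume that a thread that has been woken must find the condition satisfied, but that is not so. If several threads are waiting on the same condition and only one of them may do the processing at a time, the condition has to be checked again so that only one thread enters the critical section. On this point, here is a quoted passage:

Quoting the POSIX RATIONALE: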

Condition Wait Semantics

It is important to note that when pthread_cond_wait() and pthread_cond_timedwait() return without error, the associated predicate may still be false. Similarly, when pthread_cond_timedwait() returns with the timeout error, the associated predicate may be true due to an unavoidable race between the expiration of the timeout and the predicate state change.

The application needs to recheck the predicate on any return because it cannot be sure there is another thread waiting on the thread to handle the signal, and if there is not then the signal is lost. The burden is on the application to check the predicate.

Some implementations, particularly on a multi-processor, may sometimes cause multiple threads to wake up when the condition variable is signaled simultaneously on different processors.

In general, whenever a condition wait returns, the thread has to re-evaluate the predicate associated with the condition wait to determine whether it can safely proceed, should wait again, or should declare a timeout. A return from the wait does not imply that the associated predicate is either true or false.

It is thus recommended that a condition wait be enclosed in the equivalent of a "while loop" that checks the predicate.
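The same advice applies to pthread_cond_timedwait(). The sketch below shows one way to loop on the predicate with an absolute deadline and to decide the outcome from the predicate rather than from the return code alone. The names `done`, `wait_done_with_timeout`, and `set_done` are hypothetical, and the condition variable is assumed to use the default CLOCK_REALTIME clock.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <errno.h>
#include <time.h>

static pthread_mutex_t mut = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static int done = 0;                    /* hypothetical predicate, protected by mut */

/* Waits up to timeout_ms for done to become true.
 * Returns 0 if the predicate holds, otherwise the error from the wait
 * (ETIMEDOUT when the deadline passed). */
int wait_done_with_timeout(long timeout_ms)
{
    struct timespec deadline;
    int rc = 0;
    int result;

    /* Compute the absolute deadline once, so that spurious wakeups
     * do not extend the total waiting time. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&mut);
    while (!done && rc == 0)
        rc = pthread_cond_timedwait(&cond, &mut, &deadline);
    /* The predicate decides: it may be true even if the wait timed out. */
    result = done ? 0 : rc;
    pthread_mutex_unlock(&mut);
    return result;
}

/* Sets the predicate and wakes every waiter. */
void set_done(void)
{
    pthread_mutex_lock(&mut);
    done = 1;
    pthread_cond_broadcast(&cond);
    pthread_mutex_unlock(&mut);
}
```

Checking `done` after the loop covers the race described in the rationale: a timeout and a predicate change can happen almost at the same moment, and in that case the caller is better served by the predicate's actual value.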

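From the above we can see:
1. On a multiprocessor, pthread_cond_signal may wake several threads at the same time. When only one thread is allowed to handle a given task, the other woken threads have to go back to waiting, and this is exactly where the while loop matters. The specification only requires pthread_cond_signal to wake at least one of the threads blocked in pthread_cond_wait (if any are waiting), and some implementations, for the sake of simplicity, reportedly wake several threads even on a single processor.
2. In some applications, such as a thread pool, pthread_cond_broadcast wakes all threads, but usually only some of them are needed to carry out tasks, so the rest must go on waiting. For this reason a while loop is strongly recommended here.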
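The thread pool case in point 2 can be sketched as follows. This is a toy illustration under stated assumptions, not a real pool: `pending`, `shutting_down`, `processed`, `worker`, and `run_pool` are hypothetical names, and a "task" is just a counter decrement. Every broadcast wakes all workers, and the while loop sends back to sleep those that find nothing to do.

```c
#include <pthread.h>

#define NUM_WORKERS 4
#define NUM_TASKS   8

static pthread_mutex_t pool_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  pool_cond  = PTHREAD_COND_INITIALIZER;
static int pending = 0;         /* tasks not yet taken by a worker */
static int shutting_down = 0;   /* set when no more tasks will arrive */
static int processed = 0;       /* tasks taken so far */

static void *worker(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&pool_mutex);
    for (;;) {
        /* Recheck after every wakeup: another worker may already
         * have taken the task this thread was woken for. */
        while (pending == 0 && !shutting_down)
            pthread_cond_wait(&pool_cond, &pool_mutex);
        if (pending == 0)
            break;              /* shutting down and nothing left to do */
        pending--;              /* take exactly one task */
        processed++;            /* stand-in for the real work */
    }
    pthread_mutex_unlock(&pool_mutex);
    return NULL;
}

/* Runs the toy pool and returns how many tasks were processed. */
int run_pool(void)
{
    pthread_t workers[NUM_WORKERS];
    int created = 0;

    for (int i = 0; i < NUM_WORKERS; i++)
        if (pthread_create(&workers[created], NULL, worker, NULL) == 0)
            created++;

    pthread_mutex_lock(&pool_mutex);
    pending = NUM_TASKS;                  /* publish the tasks */
    pthread_cond_broadcast(&pool_cond);   /* wake every worker */
    pthread_mutex_unlock(&pool_mutex);

    pthread_mutex_lock(&pool_mutex);
    shutting_down = 1;                    /* workers drain pending, then exit */
    pthread_cond_broadcast(&pool_cond);
    pthread_mutex_unlock(&pool_mutex);

    for (int i = 0; i < created; i++)
        pthread_join(workers[i], NULL);
    return created > 0 ? processed : -1;
}
```

With an `if` in place of the inner `while`, a worker woken by the broadcast after the last task was taken would decrement `pending` below zero; the loop is what keeps each task handled exactly once.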