程式師世界 >> 編程語言 >> C語言 >> 關於C語言 >> GLIBC strlen源代碼分析

GLIBC strlen源代碼分析

編輯：關於C語言

tony bai

直接操作C標准庫提供的字符串操作函數是有一定風險的，稍有不慎就會導致內存問題。這周用業余時間寫了一個小型的安全字符串操作庫，但是測試之後才發現自己的實現有很大的性能缺陷。

在Solaris上初步做了一個簡單的性能比對，以下是得到的性能數據(以strlen的數據為例)：
當傳入的字符串長度為10時，執行100w次：
strlen 執行時間是：32762毫秒
my_strlen執行時間是：491836毫秒

當傳入的字符串長度為20時，執行100w次：
strlen 執行時間是：35075毫秒
my_strlen執行時間是：770397毫秒

很顯然，標准庫中strlen的消耗僅是my_strlen的十分之一不到，且其性能消耗隨著字符串長度的增加並未有近線性的增加，而my_strlen則是變化明顯。想必大家這時也能猜到my_strlen采用了傳統的實現的方式，即采用逐個字節判斷是否為方式，這也與測試出的現象相符。本著刨根問底的精神，我在網上找到了GNU提供的C標准庫中strlen實現的源碼，要看看GLIBC中strlen究竟采用何種技巧才達到了那麼高的性能。說實話在性能優化這方面自己一直還處於比較初級的位置，這也將是自己將來努力的一個方向。

下載了全部GLIBC的代碼包，這個包還真不小。在string子目錄下找到strlen.c，這就是大多數UNIX平台、Linux平台以及絕大多數GNU軟件使用的strlen的實現源碼了。這份代碼由Torbjorn Granlund(還實現了memcpy)編寫，Jim Blandy和Dan Sahlin提供了幫助和注釋。包括注釋在內，GLIBC的strlen的代碼足足有近130行，大致浏覽一下，沒有怎麼看懂，可耐下心來細致閱讀，還是有些心得的。下面是strlen源碼摘要版，後面我將針對這段代碼寫一些我的理解：

1 /* Return the length of the null-terminated string STR. Scan for
2    the null terminator quickly by testing four bytes at a time. */
3 size_t strlen (str) const char *str;
4 {
5         const char *char_ptr;
6         const unsigned long int *longword_ptr;
7         unsigned long int longword, magic_bits, himagic, lomagic;
8
9         /* Handle the first few characters by reading one character at a time.
10            Do this until CHAR_PTR is aligned on a longword boundary. */
11
12         for (char_ptr = str; ((unsigned long int) char_ptr
13              & (sizeof (longword) - 1)) != 0;
14              ++char_ptr)
15                 if (*char_ptr == )
16                         return char_ptr - str;
17
18         /* All these elucidatory comments refer to 4-byte longwords,
19            but the theory applies equally well to 8-byte longwords. */
20
21         longword_ptr = (unsigned long int *) char_ptr;
22
23         himagic = 0x80808080L;
24         lomagic = 0x01010101L;
25
26         if (sizeof (longword) > 8)
27                 abort ();
28
29         /* Instead of the traditional loop which tests each character,
30            we will test a longword at a time. The tricky part is testing
31            if *any of the four* bytes in the longword in question are zero. */
32
33         for (;;)
34         {
35                 longword = *longword_ptr++;
36
37                 if ( ((longword - lomagic) & himagic) != 0)
38                 {
39                         /* Which of the bytes was the zero? If none of them were, it was
40                            a misfire; continue the search. */
41
42                         const char *cp = (const char *) (longword_ptr - 1);
43
44                         if (cp[0] == 0)
45                                 return cp - str;
46                         if (cp[1] == 0)
47                                 return cp - str + 1;
48                         if (cp[2] == 0)
49                                 return cp - str + 2;
50                         if (cp[3] == 0)
51                                 return cp - str + 3;
52                         if (sizeof (longword) > 4)
53                         {
54                                 if (cp[4] == 0)
55                                         return cp - str + 4;
56