[root@localhost ~]# vmstat 2
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0 400776  55292  82416    0    0    33     5  103   87  0  6 94  0  0
 0  0      0 400768  55292  82416    0    0     0     0   54   65  0  2 98  0  0
 0  0      0 400768  55292  82416    0    0     0     0   69   72  0  3 97  0  0
 0  0      0 400644  55300  82416    0    0     0    18   67   79  0  3 97  0  0
 0  0      0 400644  55300  82416    0    0     0     0   51   61  0  2 98  0  0
 0  0      0 400644  55300  82416    0    0     0     0   64   69  0  2 98  0  0
 0  0      0 400644  55308  82416    0    0     0    20   58   73  0  2 98  0  0
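The swap columns of output like this can also be checked mechanically instead of by eye. The following is a minimal sketch, assuming the 17-column vmstat layout shown here, where si is field 7 and so is field 8; the function name swap_activity is made up for this example.

```shell
# Read vmstat output on stdin and print only the rows where si or so is non-zero.
# Assumption: the 17-column layout shown in this article (si = field 7, so = field 8).
# Returns 0 if any swapping was seen, 1 otherwise.
swap_activity() {
    awk '
        # Data rows start with a number; the two header rows do not.
        $1 ~ /^[0-9]+$/ && NF >= 8 && ($7 > 0 || $8 > 0) { n++; print }
        END { exit (n > 0 ? 0 : 1) }
    '
}
```

It could be used as `vmstat 2 5 | swap_activity`; an empty result means no swap in or swap out occurred during the sampling window.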
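In the swap columns, si is the amount of memory swapped in per second and so is the amount swapped out per second. The vmstat man page describes them as follows:
Swap
    si: Amount of memory swapped in from disk (/s).
    so: Amount of memory swapped to disk (/s).
Use sar -B to look at page-out activity across the whole system: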
[root@localhost ~]# sar -B
Linux 2.6.32-504.el6.i686 (localhost.localdomain)  10/01/2015  _i686_  (1 CPU)

10:57:33 AM       LINUX RESTART

11:00:01 AM  pgpgin/s pgpgout/s   fault/s  majflt/s  pgfree/s pgscank/s pgscand/s pgsteal/s    %vmeff
11:10:01 AM     39.84      4.85    340.32      0.21     39.40      0.00      0.00      0.00      0.00
11:20:01 AM      0.06      2.76     10.69      0.00      3.21      0.00      0.00      0.00      0.00
11:30:01 AM      0.14      2.68     10.16      0.00      3.08      0.00      0.00      0.00      0.00
11:40:01 AM     69.58     13.07    154.16      0.01     47.29      0.00      0.00      0.00      0.00
11:50:01 AM      1.84      3.93     28.39      0.02      9.17      0.00      0.00      0.00      0.00
12:00:01 PM      0.00      3.20     19.70      0.00     10.87      0.00      0.00      0.00      0.00
12:10:01 PM      0.01      2.90     31.96      0.00      8.77      0.00      0.00      0.00      0.00
12:20:01 PM      0.06      3.06     40.04      0.00     10.98      0.00      0.00      0.00      0.00
12:30:02 PM      2.17      3.81     81.19      0.02     21.63      0.00      0.00      0.00      0.00
Average:        12.62      4.47     79.63      0.03     17.15      0.00      0.00      0.00      0.00

03:01:38 PM       LINUX RESTART

03:10:01 PM  pgpgin/s pgpgout/s   fault/s  majflt/s  pgfree/s pgscank/s pgscand/s pgsteal/s    %vmeff
03:20:01 PM      6.22      3.99     93.05      0.04     22.89      0.00      0.00      0.00      0.00
Average:         6.22      3.99     93.05      0.04     22.89      0.00      0.00      0.00      0.00

[root@localhost ~]# sar -B 2 3
Linux 2.6.32-504.el6.i686 (localhost.localdomain)  10/01/2015  _i686_  (1 CPU)

03:24:05 PM  pgpgin/s pgpgout/s   fault/s  majflt/s  pgfree/s pgscank/s pgscand/s pgsteal/s    %vmeff
03:24:07 PM      0.00      0.00     26.63      0.00     30.15      0.00      0.00      0.00      0.00
03:24:09 PM      0.00      0.00     19.70      0.00     30.30      0.00      0.00      0.00      0.00
03:24:11 PM      0.00      0.00     15.00      0.00     30.00      0.00      0.00      0.00      0.00
Average:         0.00      0.00     20.44      0.00     30.15      0.00      0.00      0.00      0.00
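Run without arguments, sar -B prints the samples already recorded in today's system activity file (here one every 10 minutes after each restart), followed by their average. sar -B 2 3 takes a sample every 2 seconds, 3 times in total. The output fields are described in the man page as follows: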
-B  Report paging statistics. Some of the metrics below are available only with post 2.5 kernels. The following values are displayed:

    pgpgin/s
        Total number of kilobytes the system paged in from disk per second. Note: With old kernels (2.2.x) this value is a number of blocks per second (and not kilobytes).

    pgpgout/s
        Total number of kilobytes the system paged out to disk per second. Note: With old kernels (2.2.x) this value is a number of blocks per second (and not kilobytes).

    fault/s
        Number of page faults (major + minor) made by the system per second. This is not a count of page faults that generate I/O, because some page faults can be resolved without I/O.

    majflt/s
        Number of major faults the system has made per second, those which have required loading a memory page from disk.

    pgfree/s
        Number of pages placed on the free list by the system per second.

    pgscank/s
        Number of pages scanned by the kswapd daemon per second.

    pgscand/s
        Number of pages scanned directly per second.

    pgsteal/s
        Number of pages the system has reclaimed from cache (pagecache and swapcache) per second to satisfy its memory demands.

    %vmeff
        Calculated as pgsteal / pgscan, this is a metric of the efficiency of page reclaim. If it is near 100% then almost every page coming off the tail of the inactive list is being reaped. If it gets too low (e.g. less than 30%) then the virtual memory is having some difficulty. This field is displayed as zero if no pages have been scanned during the interval of time.
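Since majflt/s counts the faults that actually went to disk, it can be useful to pull out only the sar -B samples where it rises above some limit. The sketch below is one way to do that; it finds the majflt/s column by its header name instead of a fixed position. The function name major_faults and its limit argument are assumptions made for this example, not part of sar.

```shell
# Read `sar -B` output on stdin and print the samples whose majflt/s exceeds a limit.
# Usage: sar -B | major_faults 0.1      (the limit defaults to 0)
major_faults() {
    awk -v limit="${1:-0}" '
        /majflt\/s/ {
            # Header row: locate the majflt/s column by name and remember the row width.
            for (i = 1; i <= NF; i++) if ($i == "majflt/s") col = i
            width = NF
            next
        }
        col && NF == width && $1 != "Average:" && $col + 0 > limit + 0 {
            # majflt/s is the 4th metric, so the timestamp occupies fields 1..col-4.
            ts = $1
            for (i = 2; i <= col - 4; i++) ts = ts " " $i
            print ts, "majflt/s=" $col
        }
    '
}
```

Restart markers and the Average row are skipped, so only real samples are reported.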
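pgpgout/s is the number of kilobytes paged out per second. majflt/s is also an extremely important metric; it is tied to the page fault mechanism of virtual memory.

The virtual memory page fault mechanism: Linux uses a virtual memory layer to map the physical address space. When a process starts running, the kernel maps only the parts the process needs. When the process touches a page, the kernel first looks in the CPU cache and in physical memory. If the page is not there, the kernel raises a major page fault (MPF), which is a request to the disk subsystem to read the page from disk into RAM. Once pages are in the buffer cache, the kernel tries to reuse them; a fault satisfied this way is a minor page fault (MnPF), and it saves kernel time because the page is reused from memory.

The file buffer cache (disk cache) lets the kernel reduce MPFs by satisfying more accesses as MnPFs. As the system keeps doing I/O, the cache keeps growing until free memory runs short and the kernel starts reclaiming it.

Use free to check free memory: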
[root@localhost ~]# free
             total       used       free     shared    buffers     cached
Mem:       1030548     630284     400264        220      55388      82428
-/+ buffers/cache:     492468     538080
Swap:      1048572          0    1048572
[root@localhost ~]# free -m
             total       used       free     shared    buffers     cached
Mem:          1006        616        390          0         54         80
-/+ buffers/cache:        481        524
Swap:         1023          0       1023
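The "-/+ buffers/cache" row is derived from the "Mem:" row: its free value is free + buffers + cached, because buffers and cache can be reclaimed when applications need memory. With the numbers above, 400264 + 55388 + 82428 = 538080 kB. A minimal sketch of the same arithmetic follows, assuming the older six-column free layout shown here (newer versions of free print different columns); the function name effective_free is made up for this example.

```shell
# Read old-style `free` output on stdin and print free + buffers + cached in kB.
# Assumption: the "Mem:" row has the columns total, used, free, shared, buffers, cached.
effective_free() {
    awk '$1 == "Mem:" && NF >= 7 { print $4 + $6 + $7 }'
}
```

On a system with this older layout it could be run as `free | effective_free`.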
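The machine has 1 GB of RAM and a 1 GB swap partition. 616 MB of RAM is in use and 390 MB is free; the swap partition is unused and entirely free.
A small free value does not by itself indicate a problem, but a large free value does show that memory is plentiful.
If most or all of swap is in use, that also points to heavy swapping, although it is best to confirm this together with vmstat.
Use ps -mp 1959 -o THREAD,pmem,rss,vsz,tid,pid to check the memory and CPU usage of mysqld: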
[root@localhost ~]# pidof -s mysqld
1959
[root@localhost ~]# ps -mp 1959 -o THREAD,pmem,rss,vsz,tid,pid
USER     %CPU PRI SCNT WCHAN  USER SYSTEM %MEM    RSS    VSZ   TID   PID
mysql     0.6   -    - -         -      - 42.8 441212 752744     -  1959
mysql     0.1  19    - -         -      -    -      -      -  1959     -
mysql     0.0  19    - -         -      -    -      -      -  1962     -
mysql     0.0  19    - -         -      -    -      -      -  1963     -
mysql     0.0  19    - -         -      -    -      -      -  1964     -
mysql     0.0  19    - -         -      -    -      -      -  1965     -
mysql     0.0  19    - -         -      -    -      -      -  1966     -
mysql     0.0  19    - -         -      -    -      -      -  1967     -
mysql     0.0  19    - -         -      -    -      -      -  1968     -
mysql     0.0  19    - -         -      -    -      -      -  1969     -
mysql     0.0  19    - -         -      -    -      -      -  1970     -
mysql     0.0  19    - -         -      -    -      -      -  1971     -
mysql     0.0  19    - -         -      -    -      -      -  1973     -
mysql     0.0  19    - -         -      -    -      -      -  1974     -
mysql     0.0  19    - -         -      -    -      -      -  1975     -
mysql     0.0  19    - -         -      -    -      -      -  1976     -
mysql     0.0  19    - -         -      -    -      -      -  1977     -
mysql     0.0  19    - -         -      -    -      -      -  1978     -
mysql     0.0  19    - -         -      -    -      -      -  1979     -
mysql     0.0  19    - -         -      -    -      -      -  1980     -
mysql     0.0  19    - -         -      -    -      -      -  1981     -
mysql     0.0  19    - -         -      -    -      -      -  1982     -
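The %MEM value can be cross-checked against the free output: it is the process RSS divided by total physical memory, here 441212 kB out of 1030548 kB. A small sketch of that calculation, with a helper name (pmem) chosen only for this example:

```shell
# Percentage of physical memory held by a process: RSS / total * 100.
# Both arguments are in kB.
pmem() {
    awk -v rss="$1" -v total="$2" 'BEGIN { printf "%.1f\n", rss / total * 100 }'
}
```

Calling `pmem 441212 1030548` reproduces the 42.8 that ps reports for mysqld.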
Use pmap to see how the process's memory is laid out:
[root@localhost ~]# pmap -x 1959
1959:   /usr/local/mysql/bin/mysqld --basedir=/usr/local/mysql --datadir=/var/lib/mysql --plugin-dir=/usr/local/mysql/lib/plugin --user=mysql --log-error=/var/log/mysqld.log --pid-file=/var/mysql/mysqld.pid --socket=/var/lib/mysql/mysql.sock
Address   Kbytes     RSS   Dirty Mode   Mapping
00297000       4       4       0 r-x--    [ anon ]
002e0000      48      20       0 r-x--  libnss_files-2.12.so
002ec000       4       4       4 r----  libnss_files-2.12.so
002ed000       4       4       4 rw---  libnss_files-2.12.so
003fb000     116      60       0 r-x--  libgcc_s-4.4.7-20120601.so.1
00418000       4       4       4 rw---  libgcc_s-4.4.7-20120601.so.1
0041b000      28       8       0 r-x--  libcrypt-2.12.so
00422000       4       4       4 r----  libcrypt-2.12.so
00423000       4       4       4 rw---  libcrypt-2.12.so
00424000     156       0       0 rw---    [ anon ]
0044d000     368     148       0 r-x--  libfreebl3.so
004a9000       4       0       0 -----  libfreebl3.so
004aa000       4       4       4 r----  libfreebl3.so
004ab000       4       4       4 rw---  libfreebl3.so
004ac000      16      12      12 rw---    [ anon ]
0053e000     120     100       0 r-x--  ld-2.12.so
0055c000       4       4       4 r----  ld-2.12.so
0055d000       4       4       4 rw---  ld-2.12.so
00560000       4       4       0 r-x--  libaio.so.1.0.1
00561000       4       4       4 rw---  libaio.so.1.0.1
00564000    1600     680       0 r-x--  libc-2.12.so
006f4000       8       8       8 r----  libc-2.12.so
006f6000       4       4       4 rw---  libc-2.12.so
006f7000      12      12      12 rw---    [ anon ]
006fc000      92      84       0 r-x--  libpthread-2.12.so
00713000       4       4       4 r----  libpthread-2.12.so
00714000       4       4       4 rw---  libpthread-2.12.so
00715000       8       4       4 rw---    [ anon ]
00719000      12       8       0 r-x--  libdl-2.12.so
0071c000       4       4       4 r----  libdl-2.12.so
0071d000       4       4       4 rw---  libdl-2.12.so
00720000      28      16       0 r-x--  librt-2.12.so
00727000       4       4       4 r----  librt-2.12.so
00728000       4       4       4 rw---  librt-2.12.so
0072b000     160      28       0 r-x--  libm-2.12.so
00753000       4       4       4 r----  libm-2.12.so
00754000       4       4       4 rw---  libm-2.12.so
07b14000     900     400       0 r-x--  libstdc++.so.6.0.13
07bf5000      16      16      12 r----  libstdc++.so.6.0.13
07bf9000       8       8       8 rw---  libstdc++.so.6.0.13
07bfb000      24       8       8 rw---    [ anon ]
08048000   12096    4284       0 r-x--  mysqld
08c18000    1224     468     304 rw---  mysqld
08d4a000     256     252     252 rw---    [ anon ]
0a809000    5492    5396    5396 rw---    [ anon ]
8abfd000       4       0       0 -----    [ anon ]
8abfe000   10240       4       4 rw---    [ anon ]
8b5fe000       4       0       0 -----    [ anon ]
8b5ff000   10240       4       4 rw---    [ anon ]
8bfff000       4       0       0 -----    [ anon ]
8c000000   10240       8       8 rw---    [ anon ]
8ca00000    1024     436     436 rw---    [ anon ]
8cbf7000       4       0       0 -----    [ anon ]
8cbf8000   10240      16      16 rw---    [ anon ]
8d5f8000       4       0       0 -----    [ anon ]
8d5f9000   10240       8       8 rw---    [ anon ]
8dff9000       4       0       0 -----    [ anon ]
8dffa000   10240       4       4 rw---    [ anon ]
8e9fa000       4       0       0 -----    [ anon ]
8e9fb000   10240       4       4 rw---    [ anon ]
8f3fb000       4       0       0 -----    [ anon ]
8f3fc000   10240       4       4 rw---    [ anon ]
8fdfc000       4       0       0 -----    [ anon ]
8fdfd000   12720    2468    2468 rw---    [ anon ]
90c00000     132       4       4 rw---    [ anon ]
90c21000     892       0       0 -----    [ anon ]
90d04000       4       0       0 -----    [ anon ]
90d05000     192      12      12 rw---    [ anon ]
90d35000       4       0       0 -----    [ anon ]
90d36000   10240       4       4 rw---    [ anon ]
91736000       4       0       0 -----    [ anon ]
91737000   10240       4       4 rw---    [ anon ]
92137000       4       0       0 -----    [ anon ]
92138000   10240       4       4 rw---    [ anon ]
92b38000       4       0       0 -----    [ anon ]
92b39000   10240       4       4 rw---    [ anon ]
93539000       4       0       0 -----    [ anon ]
9353a000   10240       4       4 rw---    [ anon ]
93f3a000       4       0       0 -----    [ anon ]
93f3b000   10240       4       4 rw---    [ anon ]
9493b000       4       0       0 -----    [ anon ]
9493c000   10240       4       4 rw---    [ anon ]
9533c000       4       0       0 -----    [ anon ]
9533d000   10240       4       4 rw---    [ anon ]
95d3d000       4       0       0 -----    [ anon ]
95d3e000   10240       8       8 rw---    [ anon ]
9673e000       4       0       0 -----    [ anon ]
9673f000  133548   19940   19940 rw---    [ anon ]
9e9ab000  407108  406096  406096 rw---    [ anon ]
b774b000       4       4       4 rw---    [ anon ]
bfc28000      84      56      56 rw---    [ stack ]
-------- ------- ------- ------- -------
total kB  752740       -       -       -
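With a listing this long it helps to total the RSS column by mapping type, to see how much resident memory is anonymous (allocated) memory and how much belongs to file mappings and the stack. The sketch below is one way to do that under the pmap -x column layout shown here; the function name anon_rss and its output format are assumptions made for this example.

```shell
# Read `pmap -x` output on stdin and total the RSS column (kB),
# split into [ anon ] mappings and everything else (files and stack).
anon_rss() {
    awk '
        # Mapping rows start with a hex address followed by a numeric Kbytes field;
        # the command line, header, separator, and total rows do not.
        $1 ~ /^[0-9a-f]+$/ && $2 ~ /^[0-9]+$/ && NF >= 6 {
            if ($6 == "[" && $7 == "anon") anon += $3
            else other += $3
        }
        END { printf "anon_rss_kB=%d other_rss_kB=%d\n", anon, other }
    '
}
```

For the process above it could be run as `pmap -x 1959 | anon_rss`.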
The fields above are described in the pmap man page as follows:
EXTENDED AND DEVICE FORMAT FIELDS
    Address:  start address of map
    Kbytes:   size of map in kilobytes
    RSS:      resident set size in kilobytes
    Dirty:    dirty pages (both shared and private) in kilobytes
    Mode:     permissions on map: read, write, execute, shared, private (copy on write)
    Mapping:  file backing the map, or '[ anon ]' for allocated memory, or '[ stack ]' for the program stack
    Offset:   offset into the file
    Device:   device name (major:minor)
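The Mapping column shows whether the memory comes from a file mapping, from [ anon ] allocated memory, or from the [ stack ] used by the program stack.
The total kB of 752740 on the last line essentially matches the VSZ (virtual memory size) of 752744 reported by the previous command.

5. Memory tuning

As discussed above, a memory bottleneck shows up mainly as swap out, page out, and major page faults. All of them hurt performance badly, swap out most of all, so memory tuning means reducing or preventing them.

1) Use huge pages, which cannot be swapped out. Huge pages have a cost, though (they can intensify contention for pages), so always test beforehand.
2) Lower vm.swappiness so that the kernel prefers flushing the disk cache, reducing page out and swap out as far as possible. Flushing the disk cache can in turn cause major page faults.
3) Two kernel parameters control how the disk cache is flushed to disk. vm.dirty_background_ratio=10 (the default) means that when dirty pages reach 10%, the pdflush kernel thread (replaced by per-device flusher threads in newer kernels) is woken to flush the disk cache asynchronously. vm.dirty_ratio=20 (the default) means that when dirty pages reach 20%, flushing becomes synchronous, which blocks the I/O of application processes. Lowering vm.dirty_background_ratio reduces the effect of the disk cache on the memory available to MySQL, but may increase disk I/O.
4) Add memory.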
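Before changing any of these parameters it is worth recording their current values. The following is a minimal sketch that reads them from /proc/sys/vm; the function name show_vm_tunables and its optional directory argument are hypothetical and exist only so the example can be tried against a copy of the files.

```shell
# Print the current values of the VM tunables discussed in this section.
# The optional first argument (a directory) is a hypothetical hook for testing;
# on a live system the values are read from /proc/sys/vm.
show_vm_tunables() {
    dir="${1:-/proc/sys/vm}"
    for name in swappiness dirty_background_ratio dirty_ratio nr_hugepages; do
        # Skip tunables that this kernel does not expose.
        if [ -r "$dir/$name" ]; then
            printf 'vm.%s = %s\n' "$name" "$(cat "$dir/$name")"
        fi
    done
}
```

A value can then be changed at runtime with, for example, `sysctl -w vm.dirty_background_ratio=5` (the 5 is only an illustration, not a recommendation) and made persistent by adding the same setting to /etc/sysctl.conf.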