Monday, July 26, 2010

Dissect Linux memory cache

The “free” command has buffers and cached columns. What is the difference? And how can we dig further to find the sizes of the dentry cache, inode cache and page cache?

Difference between Buffers and cached


$free -m
total       used       free     shared    buffers     cached
Mem:          3777       3746         31          0        160        954
-/+ buffers/cache:       2631       1145
Swap:          753          1        751

Buffer Pages
Whenever the kernel must individually address a block, it refers to the buffer page that holds the block buffer and checks the corresponding buffer head.
Here are two common cases in which the kernel creates buffer pages:
· When reading or writing pages of a file that are not stored in contiguous disk blocks. This happens either because the filesystem has allocated noncontiguous blocks to the file, or because the file contains "holes".
· When accessing a single disk block (for instance, when reading a superblock or an inode block).


Raw disk operations such as dd use buffers.
Read 10 MB of raw disk blocks:
$dd if=/dev/sda6 of=/dev/zero bs=1024k count=10
10+0 records in
10+0 records out
10485760 bytes (10 MB) copied, 0.209051 seconds, 50.2 MB/s

The buffers size increased by 10 MB (170M - 160M):
$free -m
total       used       free     shared    buffers     cached
Mem:          3777       3754         23          0        170        952
-/+ buffers/cache:       2631       1145
Swap:          753          1        751
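
The before/after comparison above can also be done without reading the free output by eye. This is only a rough sketch: the helper names meminfo_kb and buffers_delta_kb and the snapshot file paths are made up for illustration, and it assumes the usual "Field: value kB" layout of /proc/meminfo.

```shell
#!/bin/sh
# meminfo_kb FIELD [FILE]: print the value (in kB) of one /proc/meminfo field.
# FILE defaults to /proc/meminfo; passing a saved copy lets two snapshots
# be compared later. (Hypothetical helper, not a standard tool.)
meminfo_kb() {
    awk -v field="$1:" '$1 == field { print $2 }' "${2:-/proc/meminfo}"
}

# buffers_delta_kb BEFORE AFTER: growth of Buffers (kB) between two snapshots.
buffers_delta_kb() {
    echo $(( $(meminfo_kb Buffers "$2") - $(meminfo_kb Buffers "$1") ))
}

# Example session (reading a raw device usually needs root):
#   cat /proc/meminfo > /tmp/before
#   dd if=/dev/sda6 of=/dev/null bs=1024k count=10
#   cat /proc/meminfo > /tmp/after
#   buffers_delta_kb /tmp/before /tmp/after
```

On an otherwise idle machine the delta should be close to the amount read (about 10240 kB for the dd run above), though other disk activity will blur the number.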

Dentry cache, inode cache and page cache
Dentry cache: directory entry cache, used for pathname (filename) lookups.
Inode cache: cache for inodes, not the actual data blocks.
Page cache: cache for the actual data blocks.
[FROM: http://www.mjmwired.net/kernel/Documentation/filesystems/vfs.txt ]
The combined size of the dentry and inode caches cannot exceed the total slab size:
$grep -i slab /proc/meminfo 
Slab: 183896 kB

Examine in detail by checking /proc/slabinfo.
$ awk '/dentry|inode/ { print $1,$2,$3,$4}' /proc/slabinfo 
nfs_inode_cache 122787 123312 984
rpc_inode_cache 24 25 768
ext3_inode_cache 9767 9770 776
mqueue_inode_cache 1 4 896
isofs_inode_cache 0 0 624
minix_inode_cache 0 0 640
hugetlbfs_inode_cache 1 7 576
ext2_inode_cache 0 0 728
shmem_inode_cache 441 455 776
sock_inode_cache 231 235 704
proc_inode_cache 670 756 608
inode_cache 2415 2415 576
dentry_cache 99060 110162 200
#sum them up (bytes)
$awk '/dentry|inode/ { x=x+$3*$4} END {print x }' /proc/slabinfo 
153348952
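
That total is 153348952 / 1048576, or about 146 MB, out of the roughly 180 MB (183896 kB) reported for Slab. The same sum can be wrapped in a small function so other cache names can be tried. This is a sketch only: slab_bytes is a made-up name, it matches the pattern against the cache name column rather than the whole line, and on newer kernels /proc/slabinfo is typically readable by root only.

```shell
#!/bin/sh
# slab_bytes PATTERN [FILE]: sum num_objs * objsize (columns 3 and 4 of
# /proc/slabinfo) over every cache whose name matches PATTERN.
# FILE defaults to /proc/slabinfo. (Hypothetical helper for illustration.)
slab_bytes() {
    awk -v pat="$1" '$1 ~ pat { sum += $3 * $4 }
                     END { printf "%.0f\n", sum }' "${2:-/proc/slabinfo}"
}

# Example usage:
#   slab_bytes 'dentry|inode'     # dentry and inode caches together, in bytes
#   slab_bytes '^dentry'          # dentry cache only
```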

View live stats with slabtop, sorted by cache size:
$slabtop -s c
Active / Total Objects (% used) : 332028 / 361849 (91.8%)
Active / Total Slabs (% used) : 45630 / 45631 (100.0%)
Active / Total Caches (% used) : 102 / 139 (73.4%)
Active / Total Size (% used) : 171443.16K / 175338.12K (97.8%)
Minimum / Average / Maximum Object : 0.02K / 0.48K / 128.00K
OBJS   ACTIVE USE  OBJ SIZE SLABS OBJ/SLAB CACHE SIZE NAME
125948 125864 99%  0.96K    31487 4        125948K    nfs_inode_cache
113962 105262 92%  0.20K    5998  19       23992K     dentry_cache
14609  14460  98%  0.52K    2087  7        8348K      radix_tree_node
9770   9765   99%  0.76K    1954  5        7816K      ext3_inode_cache
59048  44057  74%  0.09K    1342  44       5368K      buffer_head
2415   2410   99%  0.56K    345   7        1380K      inode_cache

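I haven’t found a way to get the page cache size directly from free. In /proc/meminfo, Cached counts page cache pages (excluding buffers and swap cache), while the dentry and inode caches are accounted under Slab, so on this system page_cache =~ cached.
Alternatively, observe how the values change when releasing the page cache: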

#write dirty pages to disk
sync
#To free pagecache:
echo 1 > /proc/sys/vm/drop_caches
#To free dentries and inodes:
echo 2 > /proc/sys/vm/drop_caches
#To free pagecache, dentries and inodes:
echo 3 > /proc/sys/vm/drop_caches
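
To see what each of the drop_caches values actually releases, it helps to print Buffers, Cached and Slab from /proc/meminfo before and after. A minimal sketch, assuming the usual "Field: value kB" layout; cache_snapshot is a made-up helper name.

```shell
#!/bin/sh
# cache_snapshot [FILE]: print Buffers, Cached and Slab (in kB) on one line.
# FILE defaults to /proc/meminfo. (Hypothetical helper for illustration.)
cache_snapshot() {
    awk '$1 == "Buffers:" { b = $2 }
         $1 == "Cached:"  { c = $2 }
         $1 == "Slab:"    { s = $2 }
         END { printf "Buffers=%d Cached=%d Slab=%d\n", b, c, s }' \
        "${1:-/proc/meminfo}"
}

# Example session (as root). Left as comments so that pasting this block
# does not drop any caches by accident:
#   cache_snapshot
#   sync
#   echo 1 > /proc/sys/vm/drop_caches
#   cache_snapshot
```

Cached should fall sharply after writing 1, while Slab should shrink mostly after writing 2 or 3.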

Dropping caches is a non-destructive operation, but dirty objects are not freeable, so it is highly recommended to run the “sync” command first.
#Make the kernel prefer reclaiming page cache over swapping.
sysctl -w vm.swappiness=0

vm.swappiness  # value range: 0 - 100. A lower value makes the kernel tend to shrink the page cache to get free memory; a higher value makes it tend to use swap to get free memory.
