/proc/meminfo

/proc/meminfo Explained

March 2003

"Free," "buffer," "swap," "dirty." What does it all mean? If you said, "something to do with the Summer of '68", you may need a primer on 'meminfo'.

The entries in /proc/meminfo can help explain what's going on with your memory usage, if you know how to read them.

Example of "cat /proc/meminfo":

          total:        used:       free:  shared:    buffers:     cached:
Mem:  1055760384   1041887232    13873152        0   100417536   711233536
Swap: 1077501952      8540160  1068961792

MemTotal:      1031016 kB
MemFree:         13548 kB
MemShared:           0 kB
Buffers:         98064 kB
Cached:         692320 kB
SwapCached:       2244 kB
Active:         563112 kB
Inact_dirty:    309584 kB
Inact_clean:     79508 kB
Inact_target:   190440 kB
HighTotal:      130992 kB
HighFree:         1876 kB
LowTotal:       900024 kB
LowFree:         11672 kB
SwapTotal:     1052248 kB
SwapFree:      1043908 kB
Committed_AS:   332340 kB

The information comes in the form of both high-level and low-level statistics. At the top you see a quick summary of the most common values people would like to look at. Below you find the individual values we will discuss. First we will discuss the high-level statistics.

High-Level Statistics

MemTotal: Total usable RAM (i.e. physical RAM minus a few reserved bits and the kernel binary code).
MemFree: The sum of LowFree + HighFree (the overall free-memory figure).
MemShared: Always 0; kept only for compatibility reasons.
Buffers: Memory in the buffer cache. Mostly useless as a metric nowadays.
Cached: Memory in the page cache (disk cache), minus SwapCached.
SwapCached: Memory that was once swapped out and has been swapped back in, but is still also in the swap file. If that memory is needed again, it doesn't have to be swapped out AGAIN because it is already in the swap file. This saves I/O.
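
As a rough illustration of reading these fields programmatically, the sketch below parses meminfo-style text into a dictionary and checks that MemFree equals LowFree + HighFree for the sample values shown earlier. The function name `parse_meminfo` and the embedded sample are illustrative assumptions, not part of any standard tool.

```python
def parse_meminfo(text):
    """Parse 'Name: value kB' lines into a dict of integers (values in kB)."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        fields = rest.split()
        # Skip blank lines and the old-style summary table: a per-field line
        # has a number followed by at most one unit token.
        if not sep or not fields or len(fields) > 2 or not fields[0].isdigit():
            continue
        info[name.strip()] = int(fields[0])
    return info


# A few lines copied from the sample output above.
SAMPLE = """\
          total:        used:       free:  shared:    buffers:     cached:
Mem:  1055760384   1041887232    13873152        0   100417536   711233536
Swap: 1077501952      8540160  1068961792
MemTotal:      1031016 kB
MemFree:         13548 kB
Buffers:         98064 kB
Cached:         692320 kB
HighFree:         1876 kB
LowFree:         11672 kB
"""

info = parse_meminfo(SAMPLE)
print(info["MemFree"], info["LowFree"] + info["HighFree"])
```

On a live system the same function could be fed the contents of /proc/meminfo instead of the embedded sample.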

Detailed Level Statistics
VM Statistics

The VM splits the cache pages into "active" and "inactive" memory. The idea is that if memory is needed and some cache has to be sacrificed for it, it is taken from inactive, since that is expected not to be in use. The VM checks what is used on a regular basis and moves pages around.

When you use memory, the CPU sets a bit in the page table. The VM checks that bit occasionally and, based on it, can move pages back to active. Within active there is an order of "longest ago not used" (roughly; it's a little more complex in reality). The pages used longest ago can get moved to inactive. Inactive is split into two in the above kernel (2.4.18-24.8.0); some kernels split it into three.

Active: Memory that has been used more recently and is usually not reclaimed unless absolutely necessary.
Inact_dirty: Dirty means "might need writing to disk or swap." Takes more work to free. Examples might be file data that has been modified but not yet written out. It isn't written to disk too soon, in order to keep I/O down. For instance, if you're writing logs, it might be better to wait until you have a complete log ready before sending it to disk.
Inact_clean: Assumed to be easily freeable. The kernel always tries to keep some clean pages around to have a bit of breathing room.
Inact_target: Just a goal metric the kernel uses to make sure there are enough inactive pages around. When it is exceeded, the kernel does no work to move pages from active to inactive. A page can also become inactive in a few other ways; e.g. if you do a long sequential I/O, the kernel assumes you're not going to use that memory again and makes it inactive preventively. So you can end up with more inactive pages than the target, because the kernel marks some cache as "more likely to be never used" and lets it cheat in the "last used" order.

Memory Statistics

HighTotal: The total amount of memory in the high region. Highmem is all memory above (approx.) 860MB of physical RAM. The kernel uses indirect tricks to access the high memory region. Data cache can go in this memory region.
LowTotal: The total amount of non-highmem memory.
LowFree: The amount of free memory in the low memory region. This is the memory the kernel can address directly. All kernel data structures need to go into low memory.
SwapTotal: Total amount of physical swap memory.
SwapFree: Total amount of swap memory free.
Committed_AS: An estimate of how much RAM you would need to make a 99.99% guarantee that there is never an OOM (out of memory) for this workload. Normally the kernel will overcommit memory. That means that when you do, say, a 1GB malloc, nothing really happens. Only when you start USING that malloc'ed memory do you get real memory, on demand, and only as much as you use. So you sort of take out a mortgage and hope the bank doesn't go bust. Another case is when you mmap a file that is normally shared between processes: only when you write to it do you get a private copy of that data. Committed_AS is a guesstimate of how much RAM/swap you would need in the worst case.

On Linux, memory is usually checked with the free command:
[root@scs-2 tmp]# free
             total       used       free     shared    buffers     cached
Mem:       3266180    3250004      16176          0     110652    2668236
-/+ buffers/cache:     471116    2795064
Swap:      2048276      80160    1968116

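Here is what the values mean:
total: Total physical memory.
used: Memory in use.
free: Memory not in use.
shared: Total memory shared by multiple processes.
buffers/cached: Size of the disk cache.
The -/+ buffers/cache line:
used: Memory in use, not counting buffers and cache.
free: Memory available, counting buffers and cache.
The Swap line needs no further explanation.
The difference between used/free on the Mem line and used/free on the -/+ buffers/cache line is one of perspective. The Mem line takes the OS's point of view: to the OS, buffers and cached count as used, so free memory is 16176 KB and used memory is 3250004 KB, which includes memory used by the kernel (OS), by applications (X, Oracle, etc.), plus buffers and cached.
The -/+ buffers/cache line takes the application's point of view: to an application, buffers and cached count as available, because they exist to speed up file reads, and when an application needs memory they are reclaimed quickly.
So from the application's perspective, available memory = free + buffers + cached.
In the example above:
2795064 = 16176 + 110652 + 2668236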
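
The arithmetic behind the -/+ buffers/cache line can be sketched in a few lines. The function name `app_view` is made up for illustration; the numbers are the ones from the free output above.

```python
def app_view(used, free, buffers, cached):
    """Return (used, free) as applications see them, in the unit of the inputs."""
    return used - buffers - cached, free + buffers + cached


# Values (in kB) from the Mem: line of the free output above.
app_used, app_free = app_view(used=3250004, free=16176,
                              buffers=110652, cached=2668236)
print(app_used, app_free)  # 471116 2795064
```

Both results match the -/+ buffers/cache line, and together they add up to the total of 3266180.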

Next: when memory gets swapped out, and how. Swapping starts once available memory falls below a threshold.
To see the relevant values:
cat /proc/meminfo

[root@scs-2 tmp]# cat /proc/meminfo
MemTotal:      3266180 kB
MemFree:         17456 kB
Buffers:        111328 kB
Cached:        2664024 kB
SwapCached:          0 kB
Active:         467236 kB
Inactive:      2644928 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:      3266180 kB
LowFree:         17456 kB
SwapTotal:     2048276 kB
SwapFree:      1968116 kB
Dirty:               8 kB
Writeback:           0 kB
Mapped:         345360 kB
Slab:           112344 kB
Committed_AS:   535292 kB
PageTables:       2340 kB
VmallocTotal: 536870911 kB
VmallocUsed:    272696 kB
VmallocChunk: 536598175 kB
HugePages_Total:     0
HugePages_Free:      0
Hugepagesize:     2048 kB

Output of free -m:
[root@scs-2 tmp]# free -m
             total       used       free     shared    buffers     cached
Mem:          3189       3173         16          0        107       2605
-/+ buffers/cache:        460       2729
Swap:         2000         78       1921

Size of the /proc/kcore file (the memory image):
[root@scs-2 tmp]# ll -h /proc/kcore
-r-------- 1 root root 4.1G Jun 12 12:04 /proc/kcore

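Notes:

Measuring memory usage

To measure how much memory a process uses, Linux provides a convenient mechanism: the /proc directory exposes all of the information, and tools such as top actually get their data from it too.

/proc/meminfo: memory usage for the whole machine

/proc/pid/maps: pid is the process ID; shows the virtual addresses the process occupies

/proc/pid/statm: memory used by the process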

[root@localhost ~]# cat /proc/self/statm

654 57 44 0 0 334 0

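Explanation of the output

The fields are, from left to right (all values are in pages; the /4 converts the kB values in status to 4 kB pages):

Field             Meaning                                                      Equivalent in /proc//status
Size (pages)      Size of the task's virtual address space                     VmSize/4
Resident (pages)  Physical memory the application is currently using           VmRSS/4
Shared (pages)    Number of shared pages                                       0
Trs (pages)       Size of the executable virtual memory owned by the program   VmExe/4
Lrs (pages)       Size of the libraries mapped into the task's virtual memory  VmLib/4
Drs (pages)       Size of the program's data segment and user-mode stack       (VmData + VmStk)/4
dt (pages)        Dirty pages                                                  0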
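
A minimal sketch of turning a statm line into kB figures, assuming the seven-field order listed above and a 4 kB page size (the same assumption behind the /4 conversions). The name `parse_statm` is hypothetical.

```python
# Field order as listed in the table above.
STATM_FIELDS = ("size", "resident", "shared", "trs", "lrs", "drs", "dt")


def parse_statm(text, page_kb=4):
    """Map each statm field name to its size in kB (pages * page_kb)."""
    pages = [int(value) for value in text.split()]
    return {name: count * page_kb for name, count in zip(STATM_FIELDS, pages)}


# The sample line shown above.
usage = parse_statm("654 57 44 0 0 334 0")
print(usage["size"], usage["resident"])  # 2616 228
```

For a real process the input would be the contents of /proc/pid/statm, and the page size would have to be checked on that machine rather than assumed.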

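Checking the machine's available memory

/proc/28248/>free
             total       used       free     shared    buffers     cached
Mem:       1023788     926400      97388          0     134668     503688
-/+ buffers/cache:     288044     735744
Swap:      1959920      89608    1870312

When checking a machine's free memory with free, the free value often turns out to be very small. This is mainly because Linux takes the view that unused memory is wasted memory, so it caches and buffers as much data as it can for later reuse. In practice that memory can still be handed out immediately.

So free memory = free + buffers + cached = total - used, where used is the figure on the -/+ buffers/cache line (735744 = 97388 + 134668 + 503688 = 1023788 - 288044).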

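The following explains the kernel information read by
> cat /proc/meminfo
drawing mainly on the kernel documentation and the Red Hat documentation. A follow-up article will briefly analyze the code that produces this information.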

MemTotal:       507480 kB
MemFree:         10800 kB
Buffers:         34728 kB
Cached:          98852 kB
SwapCached:        128 kB
Active:         304248 kB
Inactive:        46192 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:       507480 kB
LowFree:         10800 kB
SwapTotal:      979956 kB
SwapFree:       941296 kB
Dirty:              32 kB
Writeback:           0 kB
AnonPages:      216756 kB
Mapped:          77560 kB
Slab:            22952 kB
SReclaimable:    15512 kB
SUnreclaim:       7440 kB
PageTables:       2640 kB
NFS_Unstable:        0 kB
Bounce:              0 kB
CommitLimit:   1233696 kB
Committed_AS:   828508 kB
VmallocTotal:   516088 kB
VmallocUsed:      5032 kB
VmallocChunk:   510580 kB

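Most readers probably already know what these fields mean; if anything in the descriptions below is wrong, please point it out:

    MemTotal: Total usable RAM (physical memory minus a few reserved bits and the kernel binary code).

    MemFree: The sum of LowFree and HighFree; memory the system has left unused.

    Buffers: Relatively temporary storage for raw disk blocks.

    Cached: Memory used by the page cache (the disk cache minus SwapCached).

    SwapCached: Memory that was once swapped out and has been swapped back in,
    but is still also in the swap file. If it is needed again it can be dropped
    quickly without another round of swap-out I/O.

    Active: Buffer or page cache memory that is in active use; it is not reclaimed for other purposes unless absolutely necessary.

    Inactive: Buffer or page cache memory that is used less often; it may be reclaimed for other purposes.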

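    HighTotal:
    HighFree: Highmem is not mapped directly into kernel space, so the kernel has to use different techniques to access it.

    LowTotal:
    LowFree: Lowmem can be used for everything highmem can be used for, and
    the kernel can also use it for its own data structures. Among many other
    things, it is where everything from the Slab is allocated. Bad things
    happen when you're out of lowmem.

    SwapTotal: Total size of swap space.

    SwapFree: Size of unused swap space.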

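    Dirty: Memory waiting to be written back to disk.

    Writeback: Memory actively being written back to disk.

    AnonPages: Non-file-backed pages mapped into user-space page tables.

    Mapped: Size of memory-mapped files, devices, and so on.

    Slab: In-kernel data structure cache, which reduces the overhead of allocating and freeing memory.

    SReclaimable: The part of Slab that can be reclaimed.

    SUnreclaim: The part of Slab that cannot be reclaimed (SUnreclaim + SReclaimable = Slab).

    PageTables: Memory used by the page tables that index memory pages.

    NFS_Unstable: NFS pages sent to the server but not yet committed to stable storage.

    Bounce: Memory used for block device bounce buffers.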

    CommitLimit: Based on the overcommit ratio ('vm.overcommit_ratio'),
              this is the total amount of memory currently available to
              be allocated on the system. This limit is only adhered to
              if strict overcommit accounting is enabled (mode 2 in
              'vm.overcommit_memory').
              The CommitLimit is calculated with the following formula,
              where 'vm.overcommit_ratio' is a percentage:
              CommitLimit = ('vm.overcommit_ratio' / 100 * Physical RAM) + Swap
              For example, on a system with 1G of physical RAM and 7G
              of swap with a 'vm.overcommit_ratio' of 30 it would
              yield a CommitLimit of 0.3 * 1G + 7G = 7.3G.
              For more details, see the memory overcommit documentation
              in vm/overcommit-accounting.
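
The formula can be checked against the 1G RAM / 7G swap example with a short sketch. The function name `commit_limit` is illustrative, and the calculation deliberately ignores any further adjustments a real kernel may apply (huge pages, for instance).

```python
def commit_limit(ram, swap, overcommit_ratio):
    """CommitLimit in the unit of ram/swap; overcommit_ratio is a percentage."""
    return ram * overcommit_ratio / 100 + swap


# 1G of RAM, 7G of swap, vm.overcommit_ratio = 30, as in the example above.
limit = commit_limit(ram=1.0, swap=7.0, overcommit_ratio=30)
print(f"{limit:.1f}G")  # 7.3G
```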

    Committed_AS: The amount of memory presently allocated on the system.
              The committed memory is a sum of all of the memory which
              has been allocated by processes, even if it has not been
              "used" by them as of yet. A process which malloc()'s 1G
              of memory, but only touches 300M of it will only show up
              as using 300M of memory even if it has the address space
              allocated for the entire 1G. This 1G is memory which has
              been "committed" to by the VM and can be used at any time
              by the allocating application. With strict overcommit
              enabled on the system (mode 2 in 'vm.overcommit_memory'),
              allocations which would exceed the CommitLimit (detailed
              above) will not be permitted. This is useful if one needs
              to guarantee that processes will not fail due to lack of
              memory once that memory has been successfully allocated.

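    VmallocTotal: Total size of the vmalloc memory area.

    VmallocUsed: Amount of the vmalloc area that is in use.

    VmallocChunk: Largest contiguous block of the vmalloc area that is free.

Here is a simple example that reports used memory and total physical memory: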

#define _POSIX_C_SOURCE 200809L  /* for popen()/pclose() */
#include <stdio.h>

/* Writes "Used <n>M/Total <n>M\n" into info (capacity len bytes).
   Returns 0 on success, -1 on failure. */
int MemInfo(char *info, size_t len);

int main(void)
{
    char buf[128];

    if (MemInfo(buf, sizeof(buf)) != 0) {
        fprintf(stderr, "could not read memory information\n");
        return 1;
    }
    printf("%s", buf);
    return 0;
}

int MemInfo(char *info, size_t len)
{
    FILE *fp;
    int total_mem, used_mem, n;

    /* Line 2 of "free -m" is the Mem: row; $2 is total and $3 is used.
       Reading through a pipe avoids the temporary "mem" file. */
    fp = popen("free -m | awk 'NR==2 {print $2, $3}'", "r");
    if (fp == NULL)  /* a failed open returns NULL; "fp < 0" never detects it */
        return -1;

    n = fscanf(fp, "%d %d", &total_mem, &used_mem);
    pclose(fp);
    if (n != 2)
        return -1;

    /* snprintf never overruns info and always NUL-terminates it. */
    n = snprintf(info, len, "Used %dM/Total %dM\n", used_mem, total_mem);
    if (n < 0 || (size_t)n >= len)
        return -1;
    return 0;
}

Result: Used 488M/Total 495M

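Introduction to VMSTAT

STATSPACK collects server information mainly by gathering VMSTAT output, which shows the state of the server. VMSTAT is one of the most common UNIX monitoring tools; it reports server status values at a given time interval.

VMSTAT is usually run with two numeric arguments: the first is the sampling interval in seconds, and the second is the number of samples. For example: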
[oracle@brucelau oracle]$ vmstat 1 2
   procs                      memory      swap          io     system         CPU
 r  b  w   swpd   free   buff  cache   si   so    bi    bo   in    cs  us  sy  id
 1  0  0      0 271844 186052 255852    0    0     2     6  102    10   0   0 100
 0  0  0      0 271844 186052 255852    0    0     0     0  104    11   0   0 100

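(Note: the system is almost idle at the moment, and VMSTAT output differs between operating systems.)

The metrics that are currently most useful for server monitoring are:

r (run queue)
pi (page-in; the Linux output above reports swap-ins in the si column)
us (user CPU)
sy (system CPU)
id (idle)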

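Identifying a CPU bottleneck with VMSTAT

r (run queue) shows the number of tasks that are running or waiting for CPU. When this value exceeds the number of CPUs, there is a CPU bottleneck.
Command to get the number of CPUs (Linux):
cat /proc/cpuinfo|grep processor|wc -l
There are roughly three ways to resolve a CPU bottleneck:
1. The simplest is to add CPUs.
2. Reschedule tasks, e.g. run large jobs when the system is not busy, to balance the load.
3. Adjust the priority of existing tasks.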
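Identifying a fully loaded CPU with VMSTAT

First, note that the CPU metrics in vmstat are percentages. When us + sy approaches 100, the CPU is approaching full load. Bear in mind, though, that a fully loaded CPU does not by itself indicate a problem: UNIX always tries to keep the CPU as busy as possible to maximize task throughput. The only reliable indicator of a CPU bottleneck is still the value of r (run queue).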

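Identifying a RAM bottleneck with VMSTAT

A database server has only a limited amount of RAM, and memory contention is a common problem with Oracle.
First check the amount of RAM with the following command (Linux):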
[root@brucelau root]#free
             total       used       free     shared    buffers     cached
Mem:       1027348     873312     154036     185736     187496     293964
-/+ buffers/cache:     391852     635496
Swap:      2096440          0    2096440

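Other commands such as top can of course also be used to display RAM.

When memory demand exceeds the amount of RAM, the server falls back on virtual memory, which moves segments of RAM to a special area of disk, the swap disk. This produces page-outs and page-ins. Page-outs alone do not indicate a RAM bottleneck, since the virtual memory system pages memory out routinely; page-ins, however, show that the server needs more memory: they require copying memory segments from the swap disk back into RAM, which slows the server down.

There are several remedies:
1. The simplest: add RAM.
2. Shrink the SGA to reduce the demand for RAM.
3. Reduce RAM demand in other ways (e.g. reduce the PGA).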
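Now that we have a basic understanding of how VMSTAT works, here is how STATSPACK collects server performance data through vmstat.

Collecting server information with STATSPACK via vmstat
First create a table under the perfstat user to hold the server information, for example:
Table creation: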
create table stats$vmstat
(
start_date date,          -- system time
duration date,            -- sampling interval
server_name varchar2(20), -- server name
runque_waits number,      -- run queue
page_in number,           -- page-ins
page_out number,          -- page-outs
user_cpu number,          -- user CPU
system_cpu number,        -- system CPU
idle_cpu number,          -- idle CPU
wait_cpu number           -- wait CPU (exists only on AIX)
)
tablespace perfstat;
Then, using UNIX/Linux shell programming, extract the relevant server information from the vmstat output and store it in the table.

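Actual memory utilization on Linux

The example in the figure is typical: on most Linux systems, free shows very little free memory, even though the user has not started many programs or services.

The correct explanation is as follows:

Linux's memory management differs from Windows'. We don't need to know the exact mechanism; what matters is that one of its guiding ideas (I wouldn't claim it is the only one) is to maximize memory utilization. The kernel turns spare memory into cache (cached), and cached memory does not count as free. After the system has been running for a while, cached becomes large, and this is even more pronounced on systems with frequent file reads and writes.

At a glance, free memory is very small, but that does not mean little memory is available. When a program requests a large amount of memory and free memory is insufficient, the kernel reclaims part of the cached memory and hands it to the application. So on Linux, the memory available for allocation is not just free memory but also cached memory (and, in fact, buffers).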

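1. Obtain current memory usage by periodically sampling the meminfo file in the /proc file system:

The proc file system is a pseudo file system that exists only in memory and takes up no disk space. It provides a file-system-style interface for accessing kernel data. Users and applications can obtain system information through proc and can change certain kernel parameters. Because system information such as processes changes dynamically, the proc file system reads the required information from the kernel on the fly whenever a user or application reads a proc file.

The /proc/meminfo information is as follows:

The metrics needed are: MemTotal, MemFree, Buffers, Cached

MemTotal: total memory
MemFree: free memory
Buffers and Cached: size of the disk cache

The difference between Buffers and Cached:

buffers is memory used as a buffer for block devices; it only records file system metadata and tracks in-flight pages.
cached is used to cache file contents.
buffers holds things like what a directory contains, permissions, and so on.
cached, by contrast, directly remembers the files we have opened: for example, run "man X" twice in a row and the second run opens noticeably faster.
buffers grows all the time: for example, run "ls /dev" twice and the second run is faster than the first.
That is the difference between buffers and cached.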

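2. Buffers and Cached from the operating system's and the application's point of view

The free command shows this.

To the operating system, Buffers and Cached are already in use (the Mem: line in the figure above):

MemFree = total - used
314952 = 24946552 - 24631600

To an application (the -/+ buffers/cache line in the figure above):

MemFree = buffers + cached + free
19536392 = 152116 + 19069324 + 314952

So, since the goal is to monitor how much physical memory applications use, the following calculation is adopted:

Memory usage (MEMUsedPerc) = 100 * (MemTotal - MemFree - Buffers - Cached) / MemTotal

Here, for the sake of PatrolAgent's monitoring performance, the values of MemTotal, MemFree, Buffers, and Cached are obtained by periodically reading /proc/meminfo.
The algorithm is implemented in MEMORY.km.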
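
The MEMUsedPerc formula is easy to try with the figures quoted above. This is only a sketch of the arithmetic, not the MEMORY.km implementation; the function name `mem_used_perc` is made up.

```python
def mem_used_perc(mem_total, mem_free, buffers, cached):
    """Percentage of physical memory used by applications (inputs in kB)."""
    return 100 * (mem_total - mem_free - buffers - cached) / mem_total


# Figures (in kB) from the free output discussed above.
perc = mem_used_perc(mem_total=24946552, mem_free=314952,
                     buffers=152116, cached=19069324)
print(f"{perc:.1f}%")
```

For these figures the result is roughly 21.7%, even though the Mem: line alone would suggest the machine is almost out of memory.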

===============================================================

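The documentation for free shows that its numbers are read from /proc/meminfo. Reading the source in free's source package makes the origin of each value clear (for details, see the walkthrough of free's source code: Procps free.c).
When computing memory utilization we sometimes read the output of free and sometimes read /proc/meminfo directly; after all, free gets its data from meminfo.

However, Linux distributions may modify parts of the source when building their systems. On most systems (such as Debian), computing memory utilization from free and from meminfo gives the same result. But on some systems, such as SUSE (not necessarily every version; this refers to SUSE Enterprise Server 11), the cached value shown by free is not equal to Cached in meminfo, but to the sum of Cached and SReclaimable in meminfo.

In other words, Debian-like systems consider:

available memory = free memory + cached memory + buffers memory

whereas SUSE-like systems consider:

available memory = free memory + cached memory + buffers memory + SReclaimable memory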
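
The two conventions differ by a single term, which a small sketch makes explicit. The function name and the `include_sreclaimable` flag are illustrative; the values are taken from the second meminfo sample earlier in this post.

```python
def available_kb(info, include_sreclaimable=False):
    """Available memory in kB, with or without the reclaimable slab."""
    total = info["MemFree"] + info["Buffers"] + info["Cached"]
    if include_sreclaimable:
        total += info.get("SReclaimable", 0)
    return total


# Values (in kB) from the meminfo sample earlier in this post.
info = {"MemFree": 10800, "Buffers": 34728, "Cached": 98852, "SReclaimable": 15512}

debian_style = available_kb(info)
suse_style = available_kb(info, include_sreclaimable=True)
print(debian_style, suse_style)  # 144380 159892
```

A monitoring script that reads /proc/meminfo directly has to pick one convention explicitly if its numbers are meant to agree with the local free command.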

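PS: What is SReclaimable? The Linux kernel has many small objects that are created and destroyed very frequently, such as inodes and dentries. If each such object took a whole page from memory when it was created, although its actual size may be only a few bytes, a great deal would be wasted. To solve this, a mechanism was introduced for allocating small memory areas within a single page frame, which reduces the overhead of allocating and freeing memory. The memory holding these small areas is called Slab. meminfo reports the size of Slab, and SReclaimable is the part of Slab that can be reclaimed.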

posted @ 2013-06-29 18:57 blockcipher
