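The brk() system call changes the size of a process's data segment. That covers both directions: growing it (the request side, as when malloc() needs more memory) and shrinking it (the release side, as when free() gives memory back).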

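    Ordinary applications rarely call brk() directly, yet it is very likely among the most heavily used system calls, because malloc() calls it. How malloc() works is covered later. Reportedly, the C library provides applications with a memory manager, and when the memory it manages runs short, the library requests a batch of memory from the kernel. That request has to be page-aligned, that is, the allocated space should be a whole multiple of the page size.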
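
    The page-alignment rule can be illustrated with a small helper. This is only a sketch: round_up_to_page() is a hypothetical name, and it assumes the page size is a power of two (4096 on the system discussed here).

```c
#include <stddef.h>

/*
 * Round `len` up to the next multiple of `page_size`.
 * Assumes `page_size` is a power of two, e.g. 4096.
 */
size_t round_up_to_page(size_t len, size_t page_size)
{
    return (len + page_size - 1) & ~(page_size - 1);
}
```

    With 4 KB pages, a request for a single byte is rounded up to one full page, and a request for 4097 bytes to two pages.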

    When inspecting a process's memory usage, we can read the VSZ value from top, or use the following command to get the process's other parameters:

root@catalyst_24FD52F24E00:/tmp/log# cat /proc/`pidof snmpd`/status
Name:   snmpd
State:  S (sleeping)
Tgid:   4258
Pid:    4258
PPid:   1
TracerPid:      0
Uid:    0       0       0       0
Gid:    0       0       0       0
FDSize: 32
Groups:
VmPeak:     3748 kB
VmSize:     3748 kB
VmLck:         0 kB
VmHWM:      1308 kB
VmRSS:      1308 kB
VmData:      968 kB
VmStk:       136 kB
VmExe:       716 kB
VmLib:      1768 kB
VmPTE:        16 kB
VmSwap:        0 kB
Threads:        1
SigQ:   0/475
SigPnd: 00000000000000000000000000000000
ShdPnd: 00000000000000000000000000000000
SigBlk: 00000000000000000000000000000000
SigIgn: 00000000000000000000000000001004
SigCgt: 0000000000000000000000004000c003
CapInh: 0000000000000000
CapPrm: ffffffffffffffff
CapEff: ffffffffffffffff
CapBnd: ffffffffffffffff
Cpus_allowed:   1
Cpus_allowed_list:      0
voluntary_ctxt_switches:        32233
nonvoluntary_ctxt_switches:     12093
root@catalyst_24FD52F24E00:/tmp/log#
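
    To pick a single value such as VmPeak out of this output from a program, a small parser is enough. The sketch below works on text that has already been read into memory; status_field_kb() is a hypothetical helper, not part of any library, and it assumes the "Key:<whitespace>value kB" layout shown above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return the numeric value of field `key` (for example "VmPeak") found in
 * `text`, or -1 if the field is missing or has no numeric value.
 */
long status_field_kb(const char *text, const char *key)
{
    size_t klen = strlen(key);
    const char *line = text;

    while (line && *line) {
        /* A field name matches only at the start of a line, before ':'. */
        if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
            long kb;
            if (sscanf(line + klen + 1, "%ld", &kb) == 1)
                return kb;
            return -1;
        }
        line = strchr(line, '\n');
        if (line)
            line++;             /* step past the newline */
    }
    return -1;
}
```

    Reading /proc/<pid>/status into a buffer is left to the caller, so the same function also works on saved output.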

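    Here VmPeak is the high-water mark of the process's total virtual memory size, and VmSize is the current value. In the kernel, both start out from mm->total_vm: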

[fs/proc/task_mmu.c: task_mem()]

void task_mem(struct seq_file *m, struct mm_struct *mm)
{
    unsigned long data, text, lib, swap;
    unsigned long hiwater_vm, total_vm, hiwater_rss, total_rss;

    /*
     * Note: to minimize their overhead, mm maintains hiwater_vm and
     * hiwater_rss only when about to *lower* total_vm or rss.  Any
     * collector of these hiwater stats must therefore get total_vm
     * and rss too, which will usually be the higher.  Barriers? not
     * worth the effort, such snapshots can always be inconsistent.
     */
    hiwater_vm = total_vm = mm->total_vm;
    ...
    seq_printf(m,
        "VmPeak:\t%8lu kB\n"
        "VmSize:\t%8lu kB\n"
        "VmLck:\t%8lu kB\n"
        "VmHWM:\t%8lu kB\n"
        "VmRSS:\t%8lu kB\n"
        "VmData:\t%8lu kB\n"
        "VmStk:\t%8lu kB\n"
        "VmExe:\t%8lu kB\n"
        "VmLib:\t%8lu kB\n"
        "VmPTE:\t%8lu kB\n"
        "VmSwap:\t%8lu kB\n",
        hiwater_vm << (PAGE_SHIFT-10),
        ...
}

    This function mainly generates the memory statistics shown in the status file under the process's /proc directory.
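
    The arithmetic behind the VmPeak line can be sketched in a few lines of C. pages_to_kb() and vm_peak_kb() are hypothetical names. The "larger of the two" rule is an assumption drawn from the kernel comment quoted above, since the lines that would apply it are elided in the excerpt.

```c
/* Convert a page count to kB, as "pages << (PAGE_SHIFT-10)" does. */
unsigned long pages_to_kb(unsigned long pages, unsigned int page_shift)
{
    return pages << (page_shift - 10);
}

/*
 * Assumed reporting rule: the peak is whichever is larger, the current
 * total_vm or the recorded high-water mark (both counted in pages).
 */
unsigned long vm_peak_kb(unsigned long total_vm, unsigned long hiwater_vm,
                         unsigned int page_shift)
{
    unsigned long peak = total_vm > hiwater_vm ? total_vm : hiwater_vm;

    return pages_to_kb(peak, page_shift);
}
```

    With 4 KB pages (PAGE_SHIFT of 12), the 3748 kB reported for snmpd corresponds to 937 pages.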

    To pin down where VmPeak comes from, we cannot avoid the malloc() library function, which in turn calls the brk() system call. So let's begin.


    Part of the brk() system call looks like this:

[mm/mmap.c: brk()]

SYSCALL_DEFINE1(brk, unsigned long, brk)
{
    ...
    /* Ok, looks good - let it rip. */
    if (do_brk(oldbrk, newbrk-oldbrk) != oldbrk)
        goto out;

set_brk:
    mm->brk = brk;
out:
    retval = mm->brk;
    up_write(&mm->mmap_sem);
    return retval;
}

    From the call sequence above it is easy to see that brk() does its main work, growing the data segment, through do_brk().
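
    The effect of the system call can be observed from user space by moving the program break and reading it back. The sketch below assumes Linux with glibc and uses the legacy sbrk() wrapper; move_break() is a hypothetical helper name. It should not be mixed with heavy malloc() use in the same program, because the C library manages the same break.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* exposes sbrk() in <unistd.h> on glibc */
#endif
#include <stdint.h>
#include <unistd.h>

/*
 * Move the program break by `delta` bytes and return how far it actually
 * moved, or 0 if the kernel refused the request.
 */
intptr_t move_break(intptr_t delta)
{
    char *before = sbrk(0);          /* sbrk(0) reads the current break */
    char *after;

    if (sbrk(delta) == (void *)-1)   /* sbrk() returns (void *)-1 on error */
        return 0;
    after = sbrk(0);
    return (intptr_t)(after - before);
}
```

    Calling move_break(8192) followed by move_break(-8192) grows the data segment by two 4 KB pages and then gives them back, which exercises both directions of the system call.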

    Part of the do_brk() code is as follows:

[mm/mmap.c: brk()->do_brk()]

/*
 *  this is really a simplified "do_mmap".  it only handles
 *  anonymous maps.  eventually we may be able to do some
 *  brk-specific accounting here.
 */
unsigned long do_brk(unsigned long addr, unsigned long len)
{
    ...
    /*
     * Clear old maps.  this also does some error checking for us
     */
 munmap_back:
    vma = find_vma_prepare(mm, addr, &prev, &rb_link, &rb_parent);
    if (vma && vma->vm_start < addr + len) {
        if (do_munmap(mm, addr, len))
            return -ENOMEM;
        goto munmap_back;
    }

    /* Check against address space limits *after* clearing old maps... */
    if (!may_expand_vm(mm, len >> PAGE_SHIFT))
        return -ENOMEM;

    if (mm->map_count > sysctl_max_map_count)
        return -ENOMEM;

    if (security_vm_enough_memory(len >> PAGE_SHIFT))
        return -ENOMEM;

    /* Can we just expand an old private anonymous mapping? */
    vma = vma_merge(mm, prev, addr, addr + len, flags,
                    NULL, NULL, pgoff, NULL);
    if (vma)
        goto out;

    /*
     * create a vma struct for an anonymous mapping
     */
    vma = kmem_cache_zalloc(vm_area_cachep, GFP_KERNEL);
    if (!vma) {
        vm_unacct_memory(len >> PAGE_SHIFT);
        return -ENOMEM;
    }
    INIT_LIST_HEAD(&vma->anon_vma_chain);
    vma->vm_mm = mm;
    vma->vm_start = addr;
    vma->vm_end = addr + len;
    vma->vm_pgoff = pgoff;
    vma->vm_flags = flags;
    vma->vm_page_prot = vm_get_page_prot(flags);
    vma_link(mm, vma, prev, rb_link, rb_parent);
out:
    perf_event_mmap(vma);
    mm->total_vm += len >> PAGE_SHIFT;
    if (flags & VM_LOCKED) {
        if (!mlock_vma_pages_range(vma, addr, addr + len))
            mm->locked_vm += (len >> PAGE_SHIFT);
    }
    return addr;
}

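    The code above unmaps any old mappings in the requested range, checks the request against the limits, and sets up the new memory mapping, either by merging into an adjacent VMA or by creating a new one. These parts are broken down later.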
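
    The body of may_expand_vm() is not shown in the excerpt. As far as I understand kernels of this era, it compares mm->total_vm plus the requested pages against the RLIMIT_AS resource limit. A minimal sketch under that assumption follows; toy_may_expand_vm() and its parameters are hypothetical stand-ins, not the kernel's code.

```c
#include <stdbool.h>

/*
 * Toy version of the address-space check: may the process map `npages`
 * more pages without exceeding its limit?  `limit_bytes` stands in for
 * RLIMIT_AS, and `total_vm_pages` for mm->total_vm.
 */
bool toy_may_expand_vm(unsigned long total_vm_pages, unsigned long npages,
                       unsigned long limit_bytes, unsigned int page_shift)
{
    unsigned long limit_pages = limit_bytes >> page_shift;

    return total_vm_pages + npages <= limit_pages;
}
```

    If this reading is right, total_vm is more than a statistic for /proc: it is also the running total that a brk() request is checked against before any mapping is created.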

    For now, the focus is on something else: the total_vm member of mm.

    The code above shows that, at the end, total_vm is increased by the length of this adjustment, and that its unit is pages. Here, on 2.6.36, a page is 4 KB.
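
    The page accounting can be modelled in a few lines. This is a toy model only: the growth branch mirrors the "mm->total_vm += len >> PAGE_SHIFT" line in do_brk(), while the shrink branch is an assumption about what the unmap path does, since that code is not shown here. The names and the 4 KB page size are illustrative.

```c
#define TOY_PAGE_SHIFT 12UL                     /* assumes 4 KB pages */
#define TOY_PAGE_SIZE  (1UL << TOY_PAGE_SHIFT)
#define TOY_PAGE_ALIGN(x) (((x) + TOY_PAGE_SIZE - 1) & ~(TOY_PAGE_SIZE - 1))

/*
 * Return the new total_vm (in pages) after the break moves from
 * `old_brk` to `new_brk` (both byte addresses).
 */
unsigned long toy_total_vm_after_brk(unsigned long total_vm,
                                     unsigned long old_brk,
                                     unsigned long new_brk)
{
    unsigned long oldbrk = TOY_PAGE_ALIGN(old_brk);
    unsigned long newbrk = TOY_PAGE_ALIGN(new_brk);

    if (newbrk > oldbrk)    /* growth: whole pages are added */
        return total_vm + ((newbrk - oldbrk) >> TOY_PAGE_SHIFT);
    if (newbrk < oldbrk)    /* shrink: assumed to subtract whole pages */
        return total_vm - ((oldbrk - newbrk) >> TOY_PAGE_SHIFT);
    return total_vm;        /* same page: nothing to account */
}
```

    Note that moving the break within the same page leaves total_vm unchanged, because both addresses round up to the same page boundary.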

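    The following point has not been verified. Judging from the calculation above, total_vm ought to become smaller on free. My first thought was that len would be negative at that point, but len in do_brk() is an unsigned long and cannot be negative, so the decrease cannot come from the code shown above and would have to happen on the unmap side (do_munmap()). As noted earlier, however, the C library provides the application with a memory manager and only requests memory from the kernel when its own pool runs short, then hands the application the amount it asked for. If so, the library does not return memory freed by the application to the system's free memory, which would make total_vm keep growing. That raises several questions. Is it the library that makes total_vm grow? What is total_vm really for? When does total_vm decrease? Why is it that, on some systems,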