问题
I would like calculate the memory usage for single process. So after a little bit of research I came across over smaps and statm.
First of all what is smaps and statm? What is the difference?
statm has a field RSS and in smaps I sum up all RSS values. But those values are different for the same process. I know that statm measures in pages. For comparison purposes I converted that value in kb as in smaps. But those values are not equal.
Why do these two values differ, even though they represent the rss value for the same process?
statm
232214 80703 7168 27 0 161967 0 (measured in pages, pages size is 4096)
smaps
Rss 1956
My aim is to calculate the memory usage for a single process. I am interested in two values. USS and PSS.
Can I gain those two values by just using smaps? Is that value correct?
Also, I would like to return that value as percentage.
回答1:
I think statm is an approximated simplification of smaps, which is more expensive to get. I came to this conclusion after I looked at the source:
smaps
The information you see in smaps is defined in /fs/proc/task_mmu.c:
static int show_smap(struct seq_file *m, void *v, int is_pid)
{
(...)
struct mm_walk smaps_walk = {
.pmd_entry = smaps_pte_range,
.mm = vma->vm_mm,
.private = &mss,
};
memset(&mss, 0, sizeof mss);
walk_page_vma(vma, &smaps_walk);
show_map_vma(m, vma, is_pid);
seq_printf(m,
(...)
"Rss: %8lu kB\n"
(...)
mss.resident >> 10,
The information in mss is used by walk_page_vma defined in /mm/pagewalk.c. However, the mss member resident is not filled in walk_page_vma - instead, walk_page_vma calls callback specified in smaps_walk:
.pmd_entry = smaps_pte_range,
.private = &mss,
like this:
if (walk->pmd_entry)
err = walk->pmd_entry(pmd, addr, next, walk);
So what does our callback, smaps_pte_range in /fs/proc/task_mmu.c, do?
It calls smaps_pte_entry and smaps_pmd_entry in some circumstances, out of which both call statm_account(), which in turn... upgrades resident size! All of these functions are defined in the already linked task_mmu.c so I didn't post relevant code snippets as they can be easily seen in the linked sources.
PTE stands for Page Table Entry and PMD is Page Middle Directory. So basically we iterate through the page entries associated with given process and update RAM usage depending on the circumstances.
statm
The information you see in statm is defined in /fs/proc/array.c:
int proc_pid_statm(struct seq_file *m, struct pid_namespace *ns,
struct pid *pid, struct task_struct *task)
{
unsigned long size = 0, resident = 0, shared = 0, text = 0, data = 0;
struct mm_struct *mm = get_task_mm(task);
if (mm) {
size = task_statm(mm, &shared, &text, &data, &resident);
mmput(mm);
}
seq_put_decimal_ull(m, 0, size);
seq_put_decimal_ull(m, ' ', resident);
seq_put_decimal_ull(m, ' ', shared);
seq_put_decimal_ull(m, ' ', text);
seq_put_decimal_ull(m, ' ', 0);
seq_put_decimal_ull(m, ' ', data);
seq_put_decimal_ull(m, ' ', 0);
seq_putc(m, '\n');
return 0;
}
This time, resident is filled by task_statm. This one has two implementations, one in /fs/proc/task_mmu.c and second in /fs/proc/task_nomm.c. Since they're almost surely mutually exclusive, I'll focus on the implementation in task_mmu.c (which also contained task_smaps). In this implementation we see that
unsigned long task_statm(struct mm_struct *mm,
unsigned long *shared, unsigned long *text,
unsigned long *data, unsigned long *resident)
{
*shared = get_mm_counter(mm, MM_FILEPAGES);
(...)
*resident = *shared + get_mm_counter(mm, MM_ANONPAGES);
return mm->total_vm;
}
it queries some counters, namely, MM_FILEPAGES and MM_ANONPAGES. These counters are modified during different operations on memory such as do_wp_page defined at /mm/memory.c. All of the modifications seem to be done by the files located in /mm/ and there seem to be quite a lot of them, so I didn't include them here.
Conclusion
smaps does complicated iteration through all referenced memory regions and updates resident size using the collected information. statm uses data that was already calculated by someone else.
The most important part is that while smaps collects the data each time in an independent manner, statm uses counters that get incremented or decremented during process life cycle. There are a lot of places that need to do the bookkeeping, and perhaps some places don't upgrade the counters like they should. That's why IMO statm is inferior to smaps, even if it takes fewer CPU cycles to complete.
Please note that this is the conclusion I drew based on common sense, but I might be wrong - perhaps there are no internal inconsistencies in counter decrementing and incrementing, and instead, they might count some pages differently than smaps. At this point I believe it'd be wise to take it to some experienced kernel maintainers.
回答2:
I would suggest looking through 'top' command line source code. It gets all its info from /proc as well so jt may be a good reference.
来源:https://stackoverflow.com/questions/30744333/linux-os-proc-pid-smaps-vs-proc-pid-statm