Handling low memory conditions in iOS and Mavericks


http://newosxbook.com/articles/MemoryPressure.html

 

No pressure, Mon!

 


Jonathan Levin, http://newosxbook.com/ - 11/03/13

1. About

Memory pressure in OS X and iOS is a very important aspect of virtual memory management which has been explored little in my book1. While I refer to Jetsam/memorystatus, the mechanism has undergone significant changes over time, culminating in a few very important sysctls and system calls recently introduced in Mavericks. While working on my version of Process Explorer for OS X and iOS, I got to encounter these new additions head on - and am therefore documenting them here. This is meant as an addendum to chapter 12 of the book, but can be read on its own, as well.

Why should you care? (Target Audience)

Physical memory (RAM), alongside CPU, is the scarcest resource in the system, and the one most likely to cause contention as apps vie for every bit available. More memory for an app directly correlates to better performance - usually at the cost of others. In iOS, where there is no swap space to fall back on, RAM is an even more critical resource. This article is meant to make you think twice before the next time you call malloc() or mmap(), as well as elucidate the most common cause of crashes on iOS - those corresponding to low system memory.

Prerequisite: Virtual Memory in a nutshell

Whatever it is an application is programmed for, it must operate in a memory space. This space is where an application may hold its own code, data, and state. Naturally, one benefits if such a space is isolated from other applications, to provide for more security and stability. We call this space the virtual memory of the application, and it is one of the defining characteristics of the application as a process: All of an application's threads will share the same virtual memory space, and are thus defined to be in the same process.

The term "virtual" in virtual memory implies that the memory space, while very tangible to the process in question, does not exactly correspond to real memory on the system. This manifests itself in several ways:

  • The virtual memory space can exceed the amount of real memory available - Depending on the processor word size and OS in question, the virtual memory space can be up to 4GB (32-bits), or 256TB (64 bits)1. This, especially in the latter case, can far exceed the amount of actual memory available.
  • Virtual memory can, in fact, not exist at all: Given such a huge memory space which exceeds the physical memory backing capabilities, the system will only bother backing virtual memory with physical memory if an application explicitly requested it (that is, allocated it). A process virtual memory image is, therefore, quite sparse, with "islands" of memory inside a vast ocean of nothingness.
  • Even when allocated, virtual memory may still be quite virtual - Just because you call malloc(3) doesn't mean the system should jump to it and physically commit your memory by finding the appropriate amount of RAM to back it. Most often, programmers allocate far more than they need. The malloc(3) operation, therefore, only allocates page table entries, but seldom commits the memory itself. It is actually accessing the memory (say, by memset(3)-ing it) which will cause the physical allocation.
  • The system may back up memory on the disk or network - Otherwise known as "swapping" memory out to a backing store. OS X traditionally uses swap files (in /var/vm). iOS has no swap.
  • Virtual memory you use may or may not be shared - The operating system reserves the right to implicitly share your virtual memory with other processes. This applies to file-backed memory you use (that is, memory claimed by a call to mmap(2)). If your process and another process mmap(2) the same file, the OS can give you each your private virtual copy, which is in fact backed by a single physical copy. Said physical copy will be marked unwritable. So long as everyone reads from the memory, a single copy suffices. If anyone, however, writes to such implicitly shared memory, the writing process will trigger a page fault, which will cause the kernel to perform a copy-on-write (COW), which produces a new physical copy whose contents may be modified.

Putting the above together, we can arrive at the following "formula": 

VSS = RSS + LSS + SwSS

Where:

  • VSS - Virtual Set Size, as reported by top(1), ps(1), and others
  • RSS - Resident Set Size: the actual RAM footprint of the process. Also shown in top(1), ps(1), etc.
  • LSS - "Lazy" Set Size: memory which the system has agreed to allocate, but has not yet allocated
  • SwSS - "Swap" Set Size: memory which was previously in RAM, but has been pushed out to swap. In iOS, this is always 0

 

All the above can be demonstrated succinctly by a simple example - using vmmap(1) on any random process, in this case the shell itself:


morpheus@Zephyr (~/Documents) %vmmap -interleaved $$

Virtual Memory Map of process 480 (zsh)
Output report format:  2.2  -- 64-bit process

==== regions for process 480  (non-writable and writable regions are interleaved)
__TEXT                 000000010f2b7000-000000010f32b000 [  464K] r-x/rwx SM=COW  /bin/zsh
__DATA                 000000010f32b000-000000010f337000 [   48K] rw-/rwx SM=COW  /bin/zsh
__LINKEDIT             000000010f337000-000000010f347000 [   64K] r--/rwx SM=COW  /bin/zsh
MALLOC metadata        000000010f347000-000000010f348000 [    4K] r--/rwx SM=COW  
MALLOC metadata        000000010f348000-000000010f349000 [    4K] rw-/rwx SM=COW  
MALLOC guard page      000000010f349000-000000010f34a000 [    4K] ---/rwx SM=NUL  
MALLOC metadata        000000010f34a000-000000010f35f000 [   84K] rw-/rwx SM=COW  
MALLOC guard page      000000010f35f000-000000010f361000 [    8K] ---/rwx SM=NUL  
MALLOC metadata        000000010f361000-000000010f376000 [   84K] rw-/rwx SM=COW  
MALLOC guard page      000000010f376000-000000010f377000 [    4K] ---/rwx SM=NUL  
MALLOC metadata        000000010f377000-000000010f378000 [    4K] r--/rwx SM=COW  
VM_ALLOCATE            000000010f378000-000000010f379000 [    4K] r--/rw- SM=ALI  
MALLOC metadata        000000010f37d000-000000010f37e000 [    4K] r--/rwx SM=COW  
MALLOC metadata        000000010f37e000-000000010f37f000 [    4K] rw-/rwx SM=COW  
MALLOC guard page      000000010f37f000-000000010f380000 [    4K] ---/rwx SM=NUL  
MALLOC metadata        000000010f380000-000000010f395000 [   84K] rw-/rwx SM=COW  
MALLOC guard page      000000010f395000-000000010f397000 [    8K] ---/rwx SM=NUL  
MALLOC metadata        000000010f397000-000000010f3ac000 [   84K] rw-/rwx SM=COW  
MALLOC guard page      000000010f3ac000-000000010f3ad000 [    4K] ---/rwx SM=NUL  
__TEXT                 000000010f3ad000-000000010f3d3000 [  152K] r-x/rwx SM=COW  /usr/lib/zsh/4.3.11/zsh/zle.so
__DATA                 000000010f3d3000-000000010f3db000 [   32K] rw-/rwx SM=COW  /usr/lib/zsh/4.3.11/zsh/zle.so
__LINKEDIT             000000010f3db000-000000010f3e9000 [   56K] r--/rwx SM=COW  /usr/lib/zsh/4.3.11/zsh/zle.so
__TEXT                 000000010f3e9000-000000010f400000 [   92K] r-x/rwx SM=COW  /usr/lib/zsh/4.3.11/zsh/complete.so
__DATA                 000000010f400000-000000010f402000 [    8K] rw-/rwx SM=COW  /usr/lib/zsh/4.3.11/zsh/complete.so
__LINKEDIT             000000010f402000-000000010f40a000 [   32K] r--/rwx SM=COW  /usr/lib/zsh/4.3.11/zsh/complete.so
__TEXT                 000000010f40a000-000000010f414000 [   40K] r-x/rwx SM=COW  /usr/lib/zsh/4.3.11/zsh/compctl.so
__DATA                 000000010f414000-000000010f415000 [    4K] rw-/rwx SM=COW  /usr/lib/zsh/4.3.11/zsh/compctl.so
__LINKEDIT             000000010f415000-000000010f41a000 [   20K] r--/rwx SM=COW  /usr/lib/zsh/4.3.11/zsh/compctl.so
MALLOC_TINY            00007fb430c00000-00007fb430d00000 [ 1024K] rw-/rwx SM=COW  DefaultMallocZone_0x10f347000
MALLOC_TINY            00007fb430d00000-00007fb430e00000 [ 1024K] rw-/rwx SM=COW  DispatchContinuations_0x10f37d000
MALLOC_SMALL           00007fb431000000-00007fb431800000 [ 8192K] rw-/rwx SM=PRV  DefaultMallocZone_0x10f347000
STACK GUARD            00007fff4c949000-00007fff50149000 [ 56.0M] ---/rwx SM=NUL  stack guard for thread 0
Stack                  00007fff50149000-00007fff50949000 [ 8192K] rw-/rwx SM=COW  thread 0
__TEXT                 00007fff6eeb7000-00007fff6eeec000 [  212K] r-x/rwx SM=COW  /usr/lib/dyld
__DATA                 00007fff6eeec000-00007fff6ef28000 [  240K] rw-/rwx SM=COW  /usr/lib/dyld
__LINKEDIT             00007fff6ef28000-00007fff6ef3b000 [   76K] r--/rwx SM=COW  /usr/lib/dyld
__DATA                 00007fff71c5d000-00007fff71c5f000 [    8K] rw-/rwx SM=COW  /usr/lib/libauto.dylib
__DATA                 00007fff71cf8000-00007fff71cf9000 [    4K] rw-/rwx SM=COW  /usr/lib/system/libkeymgr.dylib
__DATA                 00007fff71d08000-00007fff71d19000 [   68K] rw-/rwx SM=COW  /usr/lib/system/libsystem_c.dylib
...
__TEXT                 00007fff8da49000-00007fff8da51000 [   32K] r-x/r-x SM=COW  /usr/lib/system/libcopyfile.dylib
__TEXT                 00007fff8da52000-00007fff8da75000 [  140K] r-x/r-x SM=COW  /usr/lib/system/libxpc.dylib
__LINKEDIT             00007fff8e5da000-00007fff919d9000 [ 52.0M] r--/r-- SM=COW  /usr/lib/libDiagnosticMessagesClient.dylib
shared memory          00007fffffe00000-00007fffffe01000 [    4K] r--/r-- SM=SHM  
shared memory          00007fffffe74000-00007fffffe75000 [    4K] r-x/r-x SM=SHM  

==== Legend
SM=sharing mode:  
        COW=copy_on_write PRV=private NUL=empty ALI=aliased 
        SHM=shared ZER=zero_filled S/A=shared_alias

==== Summary for process 480
ReadOnly portion of Libraries: Total=58.6M resident=19.6M(34%) swapped_out_or_unallocated=39.0M(66%)
Writable regions: Total=18.7M written=132K(1%) resident=488K(3%) swapped_out=104K(1%) unallocated=18.2M(97%)

REGION TYPE                      VIRTUAL
===========                      =======
MALLOC                             10.0M        see MALLOC ZONE table below
MALLOC guard page                    32K
MALLOC metadata                     356K
STACK GUARD                        56.0M
Stack                              8192K
VM_ALLOCATE                           4K
__DATA                              708K
__LINKEDIT                         52.2M
__TEXT                             6532K
shared memory                         8K
===========                      =======
TOTAL                             133.7M

                                     VIRTUAL ALLOCATION      BYTES
MALLOC ZONE                             SIZE      COUNT  ALLOCATED  % FULL
===========                          =======  =========  =========  ======
DefaultMallocZone_0x10f347000          9216K       4430       323K      3%
DispatchContinuations_0x10f37d000      1024K          5        288      0%
===========                          =======  =========  =========  ======
TOTAL                                  10.0M       4435       323K      3%
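As a rough sanity check, the "formula" can be applied to the "Writable regions" line of the summary above. This assumes that vmmap's "unallocated" figure plays the role of LSS, which is my reading of the output rather than something the tool states:

```latex
\begin{aligned}
\mathrm{VSS} &= \mathrm{RSS} + \mathrm{LSS} + \mathrm{SwSS} \\
18.7\,\mathrm{M} &\approx 488\,\mathrm{K} + 18.2\,\mathrm{M} + 104\,\mathrm{K} \approx 18.8\,\mathrm{M}
\end{aligned}
```

The two sides agree to within the rounding of vmmap's one-decimal figures.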

Nomenclature

Throughout this article, the following terms are used:

  • Page - The basic unit of memory management. On Intel and ARM, commonly 4K (4096 bytes), or 16K on ARM64. You can use the pagesize(1) command on OS X (or sysctl hw.pagesize on either OS) to figure out what the default page size is. Intel architectures also support larger pages (2MB, and 1GB on x86_64), but in practice those are relatively few and far between.
  • Physical Memory/RAM - The finite amount of memory installed on a host (Mac or i-Device). You can use the hostinfo(1) command to obtain this value.
  • Virtual Memory - Memory allocated by programs or the system itself, usually by a call to malloc(3), mmap(2), or higher level calls (e.g. Objective-C's alloc, etc). Virtual memory may be private (owned by a single process) or shared (owned by 2+ processes). Shared memory may be either explicitly or implicitly shared.
  • Page Fault - occurs when the memory management unit (MMU) detects access to virtual memory which is a violation, namely one of:
    • Accessing unallocated memory: Dereferencing a pointer to memory which has not previously been allocated - XNU translates that to an EXC_BAD_ACCESS exception, and the process receives a segmentation fault (SIGSEGV, Signal #11).
    • Accessing allocated, but not committed memory: Dereferencing a pointer to memory which has previously been allocated, but not yet used (or madvise(2)d accordingly) - XNU intercepts that and realizes that it can no longer procrastinate, and must allocate the physical page(s). The thread which caused the fault is frozen while those pages are allocated.
    • Accessing memory, but failing to comply with its permissions: Memory pages are protected by r/w/x in a similar manner to standard UNIX file permissions. Attempting to write to a read-only page (r-- or r-x) will cause a page fault which XNU will either translate to a Bus Error (SIGBUS, Signal #10 on Darwin) or force a Copy-On-Write (COW) operation (if implicitly shared).

 

Tools:

Apple provides several important tools to inspect virtual memory:

  • vmmap(1) - Inspects the virtual memory of a single process, laying out its "map" in a manner akin to Linux's /proc/<pid>/maps.
  • vm_stat(1) - Provides statistics on virtual memory from a system-wide perspective. This is essentially just a wrapper which calls the Mach host_statistics64 API and prints out the resulting vm_statistics64_t (from <mach/vm_statistics.h>).
  • top(1) - Provides system-wide and per-process statistics relating to performance. In it, the MemRegions, PhysMem and VM statistics pertain to virtual memory.

And, of course, I am shamelessly promoting my own tool here, process explorer (procexp), which provides what (IMHO) are better capabilities (including richer memory statistics) than top(1).
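To make the vm_stat(1) remark concrete, here is a minimal sketch of the underlying Mach call. It is guarded so that it compiles (but does nothing useful) outside OS X and iOS; the wrapper name read_vm_stats and the choice of the two counters returned are mine.

```c
#include <stdint.h>
#if defined(__APPLE__)
#include <mach/mach.h>
#endif

/* Fills *free_pages and *active_pages from the host-wide VM statistics.
 * Returns 0 on success, -1 on failure or on non-Mach platforms. */
int read_vm_stats(uint64_t *free_pages, uint64_t *active_pages) {
#if defined(__APPLE__)
    vm_statistics64_data_t st;
    mach_msg_type_number_t count = HOST_VM_INFO64_COUNT;
    kern_return_t kr = host_statistics64(mach_host_self(), HOST_VM_INFO64,
                                         (host_info64_t)&st, &count);
    if (kr != KERN_SUCCESS) return -1;
    *free_pages = st.free_count;      /* pages on the free list */
    *active_pages = st.active_count;  /* pages recently referenced */
    return 0;
#else
    (void)free_pages;
    (void)active_pages;
    return -1;                        /* no Mach host port here */
#endif
}
```

Multiplying the counts by the page size (sysctl hw.pagesize) gives the figures in bytes.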

Memory Pressure

Memory pressure is defined by two counters Mach keeps internally:

  • vm_page_free_count: How many pages of RAM are presently free
  • vm_page_free_target: How many pages of RAM, at a minimum, should optimally be free.

You can see these easily using sysctl:

 
morpheus@Zephyr (~/Documents) % sysctl -a vm | grep page_free
vm.vm_page_free_target: 2000
vm.page_free_wanted: 0
vm.page_free_count: 73243

If the number of free pages falls below the target amount, we have a pressure situation (there are other potential cases, but I'm omitting them here for the sake of simplicity2). You can also use sysctl(8) to query the value of vm.memory_pressure. In OS X 10.9 and later, you can also query kern.memorystatus_vm_pressure_level, whose value is 1 (NORMAL), 2 (WARN) or 4 (CRITICAL).
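The same check can be made programmatically. This is a sketch which assumes the sysctl names shown in the output above are present and hold 32-bit integers; the helper names (pressure_level_name, read_int_sysctl, under_pressure) are made up for the example, and on non-Apple systems the reads simply fail.

```c
#include <stddef.h>
#if defined(__APPLE__)
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/* Maps a kern.memorystatus_vm_pressure_level value to its name. */
const char *pressure_level_name(int level) {
    switch (level) {
    case 1:  return "NORMAL";
    case 2:  return "WARN";
    case 4:  return "CRITICAL";
    default: return "UNKNOWN";
    }
}

/* Reads an integer sysctl by name. Returns 0 on success, -1 otherwise. */
int read_int_sysctl(const char *name, int *value) {
#if defined(__APPLE__)
    size_t len = sizeof(*value);
    return sysctlbyname(name, value, &len, NULL, 0) == 0 ? 0 : -1;
#else
    (void)name;
    (void)value;
    return -1;   /* these sysctl names only exist on OS X / iOS */
#endif
}

/* 1 if free pages are below the target, 0 if not, -1 if unknown. */
int under_pressure(void) {
    int free_count, free_target;
    if (read_int_sysctl("vm.page_free_count", &free_count) != 0 ||
        read_int_sysctl("vm.vm_page_free_target", &free_target) != 0)
        return -1;
    return free_count < free_target;
}
```

With the figures from the output above (73243 free, target 2000), under_pressure() would return 0.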

Following kernel initialization, the main thread becomes vm_pageout, and spawns a dedicated thread, aptly called vm_pressure_thread, to monitor pressure events. This thread is idle (blocking on its own continuation) until vm_pageout wakes it up when pressure is detected. This behavior has been modified in XNU 2422/3 (OS X 10.9/iOS 7), with the changes most notably packaged in vm_pressure_response.

As a side note, VM pressure handling is conditionally compiled into XNU, assuming VM_PRESSURE_EVENTS is #defined (which it is). If it isn't (say, in a custom-compiled kernel), vm_pressure_thread does nothing in 2050, and will not even be started in 2422/3. Also, in iOS kernels, defining CONFIG_JETSAM changes some of the behavior by dispatching memory handling to the memorystatus thread more frequently, as well as updating its counters (more on that later).

[mach]_vm_pressure_monitor

XNU exports the undocumented system call #296, vm_pressure_monitor (bsd/vm/vm_unix.c), which is a wrapper over mach_vm_pressure_monitor (osfmk/vm/vm_pageout.c). The system call (and, consequently, the internal Mach call) is defined as follows:
 

int vm_pressure_monitor(int wait_for_pressure, int nsecs_monitored, uint32_t *pages_reclaimed);

 

The call will either return immediately, or block (if wait_for_pressure is non-zero). It will return in pages_reclaimed how many physical pages were freed over the course of nsecs_monitored (not really nsecs so much as loop iterations). As its return value, it will provide how many pages were wanted (vm.page_free_wanted in the sysctl(8) output, above). Calling the system call is straightforward, and does not require root privileges. (Again, note you can use sysctl(8) to query vm.memory_pressure, as well, though that will not wait for pressure).
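A minimal sketch of invoking the call through syscall(2) follows. It assumes the number (296) and prototype quoted above still hold on your kernel; the wrapper name pressure_monitor is mine, and syscall(2) itself is flagged as deprecated on recent OS X releases. On other systems the sketch just fails with ENOSYS.

```c
#include <errno.h>
#include <stdint.h>
#include <unistd.h>

#define VM_PRESSURE_MONITOR_SYSCALL 296   /* number taken from the text above */

/* Returns the number of pages wanted (>= 0) and stores the number of pages
 * reclaimed in *reclaimed. Returns -1 with errno set on failure. */
int pressure_monitor(int wait_for_pressure, int nsecs_monitored,
                     uint32_t *reclaimed) {
#if defined(__APPLE__)
    return syscall(VM_PRESSURE_MONITOR_SYSCALL, wait_for_pressure,
                   nsecs_monitored, reclaimed);
#else
    (void)wait_for_pressure;
    (void)nsecs_monitored;
    (void)reclaimed;
    errno = ENOSYS;                       /* XNU-only system call */
    return -1;
#endif
}
```

Passing 0 as wait_for_pressure gives the "oneshot" behavior described below; a non-zero value blocks the calling thread until the kernel detects pressure.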

You can run process explorer with the "vmmon" argument to try this system call (otherwise, process explorer will do this for you in a separate thread when in interactive mode, to show pressure warnings). Specifying an additional parameter of "oneshot" will run the call without waiting for pressure. Otherwise, the call will wait until pressure is detected: 


morpheus@Zephyr (~) % procexp vmmon oneshot
No pressure, mon!
morpheus@Zephyr (~) % procexp vmmon
Running in VM Pressure Monitor mode; Press CTRL-C to exit
# consume memory, either by memory_pressure(1) on Mavericks, or by starting VM instances..
Wanted 734 pages, Reclaimed: 20
Wanted 714 pages, Reclaimed: 0
..

But how does the system actually reclaim the memory? For that, we need to involve memorystatus.

MemoryStatus and Jetsam

When XNU was ported for iOS, Apple encountered a significant challenge which arose from the mobile device constraints - no swap space. Unlike a desktop, wherein virtual memory can "spill over" to external storage, the same does not hold true here (largely due to limitations of flash memory). Memory, therefore, has become an even more important (and more scarce) resource.

Enter: MemoryStatus. This mechanism, originally introduced in iOS, is a kernel thread responsible for handling low RAM events in the only way iOS deems possible: Jettison (eject) as much RAM as possible in order to free it up for applications - even if it means killing applications along the way. This is what iOS refers to as jetsam, and can be seen in the XNU source code as #if CONFIG_JETSAM. In OS X, memorystatus instead kills only those processes marked for idle exit, which is a somewhat more gentle approach, more suitable for a desktop environment3. You can probably see memorystatus in action if you use dmesg, with grep:


bash-3.2# dmesg | grep memorystatus
memorystatus_thread: idle exiting pid 1586 [com.apple.audio.]
memorystatus_thread: idle exiting pid 1584 [com.apple.audio.]
memorystatus_thread: idle exiting pid 1583 [com.apple.qtkits]
memorystatus_thread: idle exiting pid 1570 [accountsd]
memorystatus_thread: idle exiting pid 1383 [CalendarAgent]
memorystatus_thread: idle exiting pid 1379 [pbs]
memorystatus_thread: idle exiting pid 1378 [AppleIDAuthAgent]
memorystatus_thread: idle exiting pid 1374 [com.apple.hiserv]
memorystatus_thread: idle exiting pid 1367 [xpcd]

The memorystatus thread is a separate thread (that is, not directly related to vm_pressure_thread), which is started in the BSD portion of XNU (by a call to memorystatus_init in bsd/kern/bsd_init.c). If CONFIG_JETSAM is defined (iOS), memorystatus starts another thread, memorystatus_jetsam_thread, which will essentially run in a blocking loop, waking up when necessary to kill the top processes on the memory list, as long as memorystatus_available_pages <= memorystatus_available_pages_critical, before blocking again.

In iOS, memorystatus/jetsam does not print out messages, but certainly leaves a trail of its victims' carcasses in /Library/Logs/CrashReporter/LowMemory-YYYY-MM-DD-hhmmss.plist - These logs are generated by the CrashReporter, and similar to crash logs they contain a dump. If you have a Jailbroken device, an easy way to force mass executions by jetsam is to run a small binary which keeps on allocating and memset()ing memory in chunks of 8MB (left as an exercise for the avid reader), and run it. You will see applications die, until the offending binary is (eventually) slain. The Logs will look something like:


HodgePodge:/Library/Logs/CrashReporter root#  ls -l
total 24
-rw-r----- 1 root wheel 6555 Nov  3 00:16 LowMemory-2013-11-03-001603.plist # Innocents die...
-rw-r----- 1 root wheel 5223 Nov  3 00:16 LowMemory-2013-11-03-001619.plist # More innocents.. 
-rw-r----- 1 root wheel 5231 Nov  3 00:16 LowMemory-2013-11-03-001623.plist # massacre ends here


HodgePodge:/Library/Logs/CrashReporter root# cat LowMemory-2013-11-03-001623.plist 
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
	<key>AutoSubmitted</key>
	<true/> Report gets sent automatically to Apple
	<key>SysInfoCrashReporterKey</key>
	<string>3f75cb56ee42b76d5e7830102a52dd2b741a9ecd</string>
	<key>bug_type</key>
	<string>198</string>
	<key>description</key>
	<string>Incident Identifier: BA7E76BC-1F87-4C47-8244-E04B7EE40C6D Now the guys @Infinite Loop know it's me :-) 
CrashReporter Key:   3f75cb56ee42b76d5e7830102a52dd2b741a9ecd
Hardware Model:      iPod5,1
OS Version:          iPhone OS 6.1 (10B141)
Kernel Version:      Darwin Kernel Version 13.0.0: Sun Dec 16 19:59:15 PST 2012; root:xnu-2107.7.55~11/RELEASE_ARM_S5L8942X
Date:                2013-11-03 00:16:23 -0400
Time since snapshot: 216 ms

Free pages:        781
Active pages:      3515
Inactive pages:    1846
Throttled pages:   103879
Purgeable pages:   0
Wired pages:       18150
Largest process:   a

Processes
     Name                    <UUID>                       rpages       recent_max       [reason]          (state)

           timed <129ae7acc9bc3209a60ac42d49b0d89f>          287              287         [vm]         (daemon) (idle)
            ptpd <096297a7a40f318290a972274cc44d87>           39               39         [vm]         (daemon)
       locationd <bc63a21ef4c93e109b2bdb84582db2dc>          366              366         [vm]         (daemon)
      backboardd <8db19add2bf937628cd17abdf8931372>         1189             1189         [vm]         (daemon)
       securityd <eeb8bd36f685306db25c302d7f3e18b4>          198              198         [vm]         (daemon)
            misd <e0961741ebe538ff94607ae3ae117d51>          250              250         [vm]         (daemon) <--hapless victim
               a <0bb1c477079e379195bbda3fa41206ab>        97889            97889         [vm]         (daemon) <-- culprit
      aslmanager <9c5983efd2273fb1beecd58f0836983c>          105              105         [vm]         (daemon)
         lockbot <1bc46f93162c33588d86caa3f223777c>          159              159         [vm]         (daemon)
CommCenterRootHe <1d3144da79743ac299c8332460c90977>          208              208         [vm]         (daemon)
            awdd <56af91d9e6a13f479b482cad51d733e7>          265              265         [vm]         (daemon)
..
.. These processes are survivors 
..
  UserEventAgent <7ee3410c25e4372d84e93318fe42696b>          607              607                      (daemon)
   fairplayd.N78 <133d3a7920833e65a42ab38c32865174>          394              394                      (daemon)
         notifyd <67a17b0c297e3785a9e09b8e72f3636a>          197              197                      (daemon)
     ReportCrash <68e323272a9d37c58ba4cdf1279764c4>          333              333                      (daemon)

**End**
</string>
	<key>displayName</key>
	<string>a</string>
	<key>name</key>
	<string>a</string>
	<key>os_version</key>
	<string>iPhone OS 6.1 (10B141)</string>
	<key>system_ID</key>
	<string></string>
	<key>version</key>
	<string>104</string>
</dict>
</plist>

(Note that you can do this on a non-jailbroken device as well: if you've configured it for development, you can create a simple iOS app in Objective-C which does the same allocations, then collect the logs via Xcode's Organizer.)

It should be noted that outright killing a process with Jetsam, while ruthless, is not all that unusual: Linux (and, by inheritance, Android) has a similar mechanism in its "OOM" (out-of-memory) killer, which keeps a (possibly adjustable) score for each process, and kills processes with a high score when a memory shortage is encountered. In desktop Linux, OOM wakes up when the system runs out of swap; In Android, a lot sooner, when RAM is running low. Whereas Android's method is score driven (the score, in effect being a heuristic of how much RAM was used, and how frequently), iOS's approach is priority based.

As of XNU 2423, Jetsam uses "priority bands" (q.v. <sys/kern_memorystatus.h> JETSAM_PRIORITY constants), which is another way of saying that jetsam tracked processes are maintained in an array of 21 linked lists in kernel space (memstat_bucket). Jetsam will pick the first process in the lowest priority bucket (starting with 0, or JETSAM_PRIORITY_IDLE), moving to the next priority list if the current priority is empty (q.v. memorystatus_get_first_proc_locked, in bsd/kern/kern_memorystatus.c). The default priority for processes is set at 18, allowing for jetsam to choose idle and background processes before interactive and potentially important ones. This is shown in the figure below:

[Figure: Jetsam Priority Bands]
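The selection logic described above can be modeled in a few lines of userspace C. This is only a sketch of the idea (an array of per-priority linked lists, scanned from the idle band upwards), not the kernel's code: the struct and function names are hypothetical, and only the band count, the idle band (0) and the default band (18) come from the text.

```c
#include <stddef.h>

#define BAND_COUNT       21   /* number of priority lists, per the text */
#define PRIORITY_IDLE     0   /* lowest band, considered first */
#define PRIORITY_DEFAULT 18   /* default band, per the text */

struct tracked_proc {         /* hypothetical stand-in for a tracked process */
    int pid;
    struct tracked_proc *next;
};

struct band_table {
    struct tracked_proc *head[BAND_COUNT];   /* one linked list per band */
    struct tracked_proc *tail[BAND_COUNT];
};

/* Appends p to the list for the given priority.
 * Returns 0, or -1 if the priority is out of range. */
int band_insert(struct band_table *t, struct tracked_proc *p, int priority) {
    if (priority < 0 || priority >= BAND_COUNT) return -1;
    p->next = NULL;
    if (t->tail[priority]) t->tail[priority]->next = p;
    else                   t->head[priority] = p;
    t->tail[priority] = p;
    return 0;
}

/* Returns the first process in the lowest non-empty band, or NULL. */
struct tracked_proc *first_victim(const struct band_table *t) {
    for (int band = PRIORITY_IDLE; band < BAND_COUNT; band++)
        if (t->head[band]) return t->head[band];
    return NULL;
}
```

With one process in the default band and two in the idle band, first_victim() returns the idle process which was inserted first, matching the "idle and background before interactive" ordering.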

Jetsam has another modus operandi, which uses a process memory "high water mark", and will outright kill processes exceeding their HWM. The HWM mode in Jetsam is triggered when a task's RSS ledger exceeds a system wide limit (more accurately, this would be the task phys_footprint ledger, which accounts for RSS, but also compressed and I/O Kit related memory). The HWM can be set with memorystatus_control operation #5 (MEMORYSTATUS_CMD_SET_JETSAM_HIGH_WATER_MARK, discussed later).

On iOS, launchd can set jetsam priority bands. Originally this was done on a per-daemon basis (i.e. in each daemon's plist). It seems that nowadays the settings have been moved to com.apple.jetsamproperties.model.plist, where model is the device model (e.g. N51 (5s), J71 (iPad Air), etc). This looks like the following:

<plist version="1.0">
<dict>
        <key>CachedDefaults</key>
	<!-- Array of dict entries, with key being daemon name e.g. -->
        <dict>

                <key>com.apple.usb.networking.addNetworkInterface</key> 
                <dict>
                        <key>JetsamMemoryLimit</key>
                        <integer>6</integer>
                        <key>JetsamPriority</key> 
                        <integer>3</integer>
                        <key>WellBehaved</key> 
                        <true/>
                </dict>
..

Killing a process outright because of RAM consumption may seem overly harsh, but for lack of swap, there is really little else which can be done. Prior to killing a process with Jetsam, however, memorystatus does allow a process to "redeem itself", and avoid untimely termination, by getting the memorystatus thread to first send a kernel note (a.k.a. kevent) to processes which are "candidates" for termination. This knote (NOTE_VM_PRESSURE, <sys/event.h>) will be picked up by EVFILT_VM kevent() filters; UIKit, for example, translates it to the didReceiveMemoryWarning notification, which is undoubtedly familiar to (and loathed by) iOS App developers. Darwin's LibC, LibCache and GCD are all laced with memory pressure handlers, specifically:

  • Darwin's LibC ( <malloc/malloc.h>) defines a malloc_zone_pressure_relief (as of OSX 10.7/iOS 4.3)
  • LibCache (<cache.h>) defines a cache cost (for cache_set_and_retain), which allows caches to be purged automatically when a pressure event is encountered
  • GCD (<dispatch/source.h>) defines a DISPATCH_SOURCE_TYPE_MEMORYPRESSURE (as of OSX 10.9)
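Two of these can be combined: a GCD source to hear about pressure, and the LibC call to respond to it. The sketch below is one way to wire them together in plain C (using the function-pointer variant of the handler API, so no blocks are needed). It is guarded so that it compiles elsewhere, and the names on_memory_pressure and install_pressure_handler are mine.

```c
#include <stddef.h>
#if defined(__APPLE__)
#include <dispatch/dispatch.h>
#include <malloc/malloc.h>

/* Runs on the queue whenever the kernel reports WARN or CRITICAL pressure. */
static void on_memory_pressure(void *context) {
    dispatch_source_t source = (dispatch_source_t)context;
    unsigned long level = dispatch_source_get_data(source);
    if (level & (DISPATCH_MEMORYPRESSURE_WARN | DISPATCH_MEMORYPRESSURE_CRITICAL))
        malloc_zone_pressure_relief(NULL, 0);   /* all zones, as much as possible */
}
#endif

/* Installs the handler. Returns 0 on success, -1 otherwise. */
int install_pressure_handler(void) {
#if defined(__APPLE__)
    dispatch_source_t source = dispatch_source_create(
        DISPATCH_SOURCE_TYPE_MEMORYPRESSURE, 0,
        DISPATCH_MEMORYPRESSURE_WARN | DISPATCH_MEMORYPRESSURE_CRITICAL,
        dispatch_get_global_queue(DISPATCH_QUEUE_PRIORITY_DEFAULT, 0));
    if (source == NULL) return -1;
    dispatch_set_context(source, source);       /* handler needs the source itself */
    dispatch_source_set_event_handler_f(source, on_memory_pressure);
    dispatch_resume(source);                    /* sources are created suspended */
    return 0;
#else
    return -1;                                  /* GCD memory pressure source is Darwin-only */
#endif
}
```

A real application would also drop its own caches in the handler; malloc_zone_pressure_relief only returns memory the allocator itself is holding on to.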

Generally speaking, an application registered for memory pressure (either directly, through the Darwin APIs, or indirectly, via UIKit) should reduce its caching and potentially free unneeded memory (though it should be noted that iterating over memory structures could result in page faults, which would only exacerbate the memory pressure). UIKit is closed source, but jtool provides a nice disassembly which demonstrates its behavior, when the memory warning is encountered by UIApplication:


morpheus@Zephyr (....) % cd "/Developer/Platforms/iPhoneOS.platform/DeviceSupport/7.0.3 (11B508)/Symbols/System/Library/Frameworks/UIKit.framework"
morpheus@Zephyr (.../UIKit.framework) % jtool -arch armv7 -d 0x0288258 UIKit | more 
Disassembling from file offset 0x288258, Address 0x288258
[UIApplication _performMemoryWarning]:
-- 288258       b5f0            PUSH   {r4,r5,r6,r7,lr} 
-- 28825a       4604            MOV    R4, R0           ; R4 = 0x0
-- 28825c       f6412032        MOVW   R0, 0x1a32       ; R0 = 0x1a32
-- 288260       f2c0005a        MOVT   R0, 0x5a         ; R0 += 5a0000 = 5a1a32
-- 288264       af03            ADD    R7, SP, #12      ; R7 += 805d0c = 805d0d
-- 288266       4478            ADD    R0, PC           ; R0 += 28826a = 829c9c; _OBJC_IVAR_$_UIApplication._applicationFlags
...
;
; R0 =  _objc_msgSend(0x1004000,"didReceiveMemoryWarning",6000000);  
;
..
-- 2882b0       f24f210e        MOVW   R1, 0xf20e       ; R1 = 0xf20e
-- 2882b4       4623            MOV    R3, R4           ; R3 = 0x0
-- 2882b6       f2c00157        MOVT   R1, 0x57         ; R1 += 570000 = 57f20e
-- 2882ba       f245629c        MOVW   R2, 0x569c       ; R2 = 0x569c
-- 2882be       4479            ADD    R1, PC           ; R1 += 2882c2 = 8074d0
-- 2882c0       f2c0025a        MOVT   R2, 0x5a         ; R2 += 5a0000 = 5a569c
-- 2882c4       447a            ADD    R2, PC           ; R2 += 2882c8 = 82d964 UIApplicationDidReceiveMemoryWarningNotification
-- 2882c6       6809            LDR    R1, [ R1, #0 ]   ; R1 = *(8074d0) = 0x5f6382
-- 
;
; R0 =  _objc_msgSend(0x1004000,"postNotificationName:object:",UIApplicationDidReceiveMemoryWarningNotification);
;
...

Sometimes, however, freeing memory may not be enough to alleviate the memory pressure. Most often, the case is that the memory freed may simply be consumed by another application, which will not free it as willingly. In those cases, the last resort is to kill the top process in the list of prospective candidates - hence Jetsam.

Controlling memorystatus

Having a thread which can randomly decide on killing processes could be a bit dangerous. Apple therefore uses several APIs to "rein in" Jetsam/memorystatus. Naturally, these are private and undocumented (and Apple will likely kill *your* developer account if you use them in your apps..), but nonetheless, here they are:

  • Using sysctl kern.memorystatus_jetsam_change: Jetsam's priority list can be changed from userspace. This is a bit like Linux's oom_adj, which enables processes to escape the OOM's wrath by specifying a negative adjustment number (effectively reducing their score). Likewise in iOS, launchd (which starts all apps) can set the Jetsam priority list. (As an example, q.v. com.apple.voiced.plist, which specifies JetSamMemoryLimit (8000) and JetsamPriority (-49)). The sysctl internally calls memorystatus_list_change (in bsd/kern/kern_memorystatus.c), which sets the priority and state flags (active, foreground, etc). Again - this is similar to what Linux would do, in this case Android's "Low Memory Killer" (which enables the runtime to tweak the OOM_ADJ according to the application/activity's foreground status, thus preferring to kill backgrounded apps first). This method works up to and including iOS 6.x.
  • Using the memorystatus_control (#440) system call: Introduced somewhere around xnu 2107 (that is, as early as iOS 6 but not until OS X 10.9), this (undocumented) syscall enables you to control both memorystatus and jetsam (the latter, on iOS) with one of several "commands", as shown in the following table:
    MEMORYSTATUS_CMD_ constant | Availability | Usage
    GET_PRIORITY_LIST (1) | OS X 10.9, iOS 6+ | Get priority list - array of memorystatus_priority_entry (from <sys/kern_memorystatus.h>)
    SET_PRIORITY_PROPERTIES (2) | iOS only (or CONFIG_JETSAM) | Update properties for a given process
    GET_JETSAM_SNAPSHOT (3) | iOS only (or CONFIG_JETSAM) | Get Jetsam snapshot - array of memorystatus_jetsam_snapshot_t entries (from <sys/kern_memorystatus.h>)
    GET_PRESSURE_STATUS (4) | iOS (or CONFIG_JETSAM) | Privileged call: returns 1 if memorystatus_vm_pressure_level is not normal
    SET_JETSAM_HIGH_WATER_MARK (5) | iOS (or CONFIG_JETSAM) | Sets the maximum memory utilization for a given PID, after which it may be killed. Used by launchd for processes with a memory limit
    SET_JETSAM_TASK_LIMIT (6) | iOS 8 (or CONFIG_JETSAM) | Sets the maximum memory utilization for a given PID, after which it will be killed. Used by launchd for processes with a memory limit
    SET_MEMLIMIT_PROPERTIES (7) | iOS 9 (or CONFIG_JETSAM) | Sets memory limits + attributes
    GET_MEMLIMIT_PROPERTIES (8) | iOS 9 (or CONFIG_JETSAM) | Retrieves memory limits + attributes
    PRIVILEGED_LISTENER_ENABLE (9) | Xnu-3247 (10.11, iOS 9) | Registers self to receive memory notifications
    PRIVILEGED_LISTENER_DISABLE (10) | | Stops self receiving memory notifications
    TEST_JETSAM (1000) | CONFIG_JETSAM && (DEVELOPMENT || DEBUG) | Test Jetsam, kill specific processes (Debug/Development kernels only)
    TEST_JETSAM_SORT (1001) | iOS 9 && (DEVELOPMENT || DEBUG) | Test Jetsam sorting (Debug/Development kernels only)
    SET_JETSAM_PANIC_BITS (1001/1002) | CONFIG_JETSAM && (DEVELOPMENT || DEBUG) | Alter Jetsam's panic settings (Debug/Development kernels only)
  • Using posix_spawnattr_setjetsam: From the posix_spawnattr family of functions, but undocumented and present only in iOS. (This is how launchd handles Jetsam as of iOS 7.)
  • Using sysctl kern.memorypressure_manual_trigger: Used for simulating memory pressure levels without actually hogging memory, as done by OS X 10.9's memory_pressure utility (-S). The value written is one of the NOTE_MEMORYSTATUS_PRESSURE_[NORMAL|WARN|CRITICAL] constants from <sys/event.h>.
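To make the memorystatus_control interface a little more concrete, here is a minimal sketch of how GET_PRIORITY_LIST (1) might be called. Several things in it are assumptions rather than documented facts: the prototype and the entry layout are copied by hand from the private <sys/kern_memorystatus.h> and may differ between xnu versions, and the "pass a NULL buffer to learn the required size" behavior is assumed as well. The helper names get_priority_list and priority_entry_count are hypothetical. The call is expected to fail without root, and on non-Apple systems a stub simply returns ENOSYS so the sketch still compiles.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>

#define MEMORYSTATUS_CMD_GET_PRIORITY_LIST 1   /* value from the table above */

/* Hand-copied entry layout; ASSUMED to match the private kernel header. */
typedef struct memorystatus_priority_entry {
    pid_t    pid;
    int32_t  priority;
    uint64_t user_data;
    int32_t  limit;
    uint32_t state;
} memorystatus_priority_entry_t;

#ifdef __APPLE__
/* Private prototype (syscall #440); not declared in any public SDK header. */
extern int memorystatus_control(uint32_t command, int32_t pid, uint32_t flags,
                                void *buffer, size_t buffersize);
#else
/* Stub for other platforms: the call simply does not exist there. */
static int memorystatus_control(uint32_t command, int32_t pid, uint32_t flags,
                                void *buffer, size_t buffersize)
{
    (void)command; (void)pid; (void)flags; (void)buffer; (void)buffersize;
    errno = ENOSYS;
    return -1;
}
#endif

/* Number of whole entries contained in a buffer of `bytes` bytes. */
size_t priority_entry_count(size_t bytes)
{
    return bytes / sizeof(memorystatus_priority_entry_t);
}

/* Returns a malloc'd array of entries (caller frees), or NULL on failure.
 * *count_out receives the number of entries, or 0 on failure. */
memorystatus_priority_entry_t *get_priority_list(size_t *count_out)
{
    assert(count_out != NULL);
    *count_out = 0;

    /* ASSUMPTION: a NULL buffer makes the kernel return the size needed. */
    int needed = memorystatus_control(MEMORYSTATUS_CMD_GET_PRIORITY_LIST,
                                      0, 0, NULL, 0);
    if (needed <= 0)
        return NULL;

    memorystatus_priority_entry_t *list = malloc((size_t)needed);
    if (list == NULL)
        return NULL;

    int got = memorystatus_control(MEMORYSTATUS_CMD_GET_PRIORITY_LIST,
                                   0, 0, list, (size_t)needed);
    if (got <= 0) {            /* e.g. EPERM when not running as root */
        free(list);
        return NULL;
    }
    *count_out = priority_entry_count((size_t)got);
    return list;
}
```

A caller would loop over the returned array and print each entry's pid and priority, then free() the buffer. Since the process list can change between the two calls, a robust tool would retry on failure rather than give up.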

Other memorystatus configurable values:

  • Using sysctl kern.memorystatus_purge_on_* values (OS X): These values don't affect memorystatus so much as the pageout daemon, making it purge purgeable memory on warning (2), urgent (5) or critical (8) values. Setting these values to 0 will disable purging.
  • Using memorystatus_get_level (#453): This system call returns (into an int *) a number between 0 and 100 specifying the percentage of free memory. Diagnostics only. Used by Activity Monitor (and my Process Explorer) to show pressure in Mavericks and later.
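As a rough illustration of the last item, the sketch below wraps system call #453 in a small helper. The helper name get_free_memory_level and the macro name are hypothetical, and invoking the call through syscall(2) is just one way to reach it (newer SDKs mark syscall() as deprecated, so expect a compiler warning there). On non-Apple systems the helper fails with ENOSYS.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#ifdef __APPLE__
#include <unistd.h>
#endif

/* System call number as given in the text above. */
#define MEMORYSTATUS_GET_LEVEL_SYSCALL 453

/* Stores the percentage of free memory (0..100) in *level_out.
 * Returns 0 on success, -1 on failure with errno set. */
int get_free_memory_level(int *level_out)
{
    assert(level_out != NULL);
#ifdef __APPLE__
    /* Undocumented call: the kernel copies an int out to the pointer. */
    return syscall(MEMORYSTATUS_GET_LEVEL_SYSCALL, level_out);
#else
    errno = ENOSYS;   /* the call exists only in XNU */
    return -1;
#endif
}
```

A monitoring tool could poll this value periodically and display it, which is presumably close to what a pressure gauge does, though the exact presentation logic of Activity Monitor is not covered here.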

Ledgers

iOS reintroduced ledgers around iOS 5 (or 5.1?), and the concept has since been ported to OS X as well. I say "reintroduced", because ledgers have been around since the original design of Mach, but have never really been properly implemented until that point.

Ledgers can help solve the problem of excessive resource utilization. Unlike the classic UN*X model (setrlimit(2), known to users as ulimit(1)), ledgers have a finer-grained, QoS-like model wherein a ledger allocates a certain quota per resource (RAM, CPU, I/O) per time unit, and "refills" magically. This allows the OS to provide a leaky-bucket type QoS mechanism, guaranteeing service levels. If a process exceeds its ledger, a Mach Exception (EXC_RESOURCE, #11) is generated.
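The quota-and-refill idea can be sketched as a toy model in C. This is not XNU's ledger implementation, and every name in it (toy_ledger, toy_ledger_init, toy_ledger_debit) is made up for illustration. It only shows the accounting: each resource gets a quota per period, the usage counter resets when a period elapses, and a debit that would exceed the quota is refused, which is the point where the real kernel would raise EXC_RESOURCE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy per-resource ledger: `limit` units may be used per `period` ticks. */
typedef struct {
    int64_t  limit;        /* quota per refill period */
    int64_t  used;         /* units consumed in the current period */
    uint64_t period;       /* refill interval, in arbitrary ticks */
    uint64_t last_refill;  /* tick at which the current period began */
} toy_ledger;

void toy_ledger_init(toy_ledger *l, int64_t limit, uint64_t period,
                     uint64_t now)
{
    assert(l != NULL && limit >= 0 && period > 0);
    l->limit = limit;
    l->used = 0;
    l->period = period;
    l->last_refill = now;
}

/* Try to consume `amount` units at time `now`.
 * Returns true if the debit fits in the remaining quota; false means the
 * ledger was exceeded (where a real kernel would deliver an exception). */
bool toy_ledger_debit(toy_ledger *l, int64_t amount, uint64_t now)
{
    assert(l != NULL && amount >= 0 && now >= l->last_refill);

    uint64_t elapsed = now - l->last_refill;
    if (elapsed >= l->period) {
        /* One or more whole periods passed: the quota "refills". */
        l->used = 0;
        l->last_refill += (elapsed / l->period) * l->period;
    }
    if (amount > l->limit - l->used)
        return false;
    l->used += amount;
    return true;
}
```

With a limit of 100 units per 10 ticks, a debit of 60 at tick 0 succeeds, a second debit of 60 at tick 5 is refused, and the same debit at tick 10 succeeds again because the period has rolled over. Contrast this with setrlimit(2), where a limit is a fixed ceiling with no notion of time.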

Down the road, it makes sense for Apple to shift entirely to a ledger based mechanism for RAM management, especially with RAM being such a scarce resource in iOS (and no swap, to boot). Jetsam will likely remain as a method of last resort.

References:

  1. J. Levin, Mac OS X and iOS Internals

ChangeLog

  • 3/1/2014 - Added jetsam properties plist from iPhone5s, and note about ledgers
  • 2/10/2016 - Added jetsam/memorystatus commands for xnu 32xx (iOS 9, OS X 10.11). Also updated procexp to show mem limits on iOS

Footnotes

  1. In an effort to maintain simplicity, let's ignore the fact that some of the virtual memory provided for any given process is actually reserved and mapped for kernel use only. That 256TB for 64-bit, incidentally, is due to hardware imposed limits (plus the fact that nobody would actually use it all, much less the 16EB of a full 64 bits). Mac OS X caps user space virtual memory at 47 bits (0x7fffffffffff) for 128TB, and the topmost (technically, 0xffffffff8...) 128TB are reserved for the kernel.
  2. Again, to simplify, I'm not going into the actual conditions.
  3. I'm not going into the process of idle demotion, wherein (as of 10.9) processes may be moved to the idle band so they can be candidates for idle exit. A process can call proc_info with PROC_INFO_CALL_DIRTYCONTROL to have the kernel track its state, seeking protection from killing when "dirty" and voluntarily consenting to killing when "clean" (idle). This is used with the vproc mechanism (<vproc.h>).


 
