
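(1) Working memory (working area), computational memory (computational area)
Also called Working Storage, working memory, computational memory, or Working Segments. This is computational memory: pages that the paging space manager may page in to or page out of (Page In, Page Out) physical memory. The pageable parts of the AIX kernel also belong to Working Segments.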


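(2) Persistent or permanent: file area / cache area
This area temporarily holds file system data. When a process opens a file, the data is paged in (Page In) to this area. If the process modifies the file, the page is marked as modified (Dirty) and is eventually paged out (Page Out) directly to the location of the file's data blocks (flushing the file on disk) rather than to the ordinary paging space. If the data has not been modified, the page can simply be discarded and is never written to paging space. The data files a program uses and the program code itself are both Persistent Objects; the difference is … the disk write in the Page Out operation (the page can be discarded directly because the original file data is still on disk), which also makes more efficient use of memory.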
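
The page-out rules for the first two page types can be summarized in a few lines of Python. This is only a sketch of the behavior described above: the function name and the string labels are made up for illustration and are not AIX interfaces.

```python
def page_out_destination(kind, dirty):
    """Return where a page goes when it is stolen, per the rules above.

    kind:  "working" or "persistent" (illustrative labels, not AIX identifiers)
    dirty: True if the page was modified since it was paged in
    """
    if kind == "working":
        # Working (computational) pages are backed by paging space.
        return "paging space"
    if kind == "persistent":
        # Dirty file pages are flushed to the file's own disk blocks;
        # clean ones are dropped because the copy on disk is still current.
        return "file blocks on disk" if dirty else "discarded"
    raise ValueError(f"unknown page kind: {kind}")


print(page_out_destination("persistent", dirty=False))
```

The clean persistent case is the cheap one: no disk write is needed at all.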

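(3) Client Objects
… modified, the page is marked and eventually written back to where the data was originally stored (or written back over the network to the NFS server). Unlike Persistent Objects, however, even unmodified data is not freed directly; it is paged out to the local paging space so that the next access is faster (AIX assumes that local disk I/O is faster than re-reading over the network). Client memory used as the JFS2 cache is managed slightly differently: only specially created temporary-type files (a special file mode) are paged out, and everything else is handled the same way as the JFS cache.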





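… connected to two buffer spaces, the paging space and the disk file system. A second outlet, the drain, represents the case where file cache that has not been modified needs to be freed: it can simply be discarded, because …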



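Now let us simulate the whole process of AIX memory allocation and release. At first the tank is completely empty, meaning that every page is available. This is the state just after the operating system boots, when the vmstat output shows a large fre value. … kinds of physical memory are allocated, and the height of the "liquid level" is the combined volume of oil and water. As free memory is gradually used up and the "liquid level" rises to the minfree mark, the system starts lrud. In the diagram this corresponds to the buffer valves opening: the water and oil in the tank move to Paging Space and File respectively, or may simply be released through the drain (for example, file cache that has not been modified).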





$vmstat 1 5
System Configuration :lcpu=16  mem=31488MB
kthr         memory        page                        faults              cpu
____________   __________  ___________________        ________        _________
  r     b      avm   fre   re pi po fr     sr    cy        in    sy    cs     us  sy id  wa
  4     2   3127273  3272  0  4  8  338    937   0        2357  22962  1561   13  1  83  3
  5     3   3127257  1556  0  4  0  0        0   0        2890  15790  2165   20  0  59 21
  2     4   3127903   958  0  7  4 1014   1673   0        2353   7290  1670   17  1  53 29
  3     3   3127948   966  0 23 12 1392   2693   0        2620   7943  1779   18  1  62 19
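  avm: not "available memory" but active virtual memory, that is, the total virtual memory currently allocated by the system, counted in 4 KB pages (including both the part held in physical memory and the part held in paging space). avm does not include the file system cache (in other words, it excludes persistent memory).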
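
Because vmstat reports avm and fre as counts of 4 KB pages, it helps to convert them before comparing them with the mem= figure. A small Python sketch, using the first sample row of the output above (the helper name is illustrative):

```python
PAGE_KB = 4  # vmstat reports avm and fre in 4 KB pages


def pages_to_mb(pages, page_kb=PAGE_KB):
    """Convert a page count from vmstat into megabytes."""
    return pages * page_kb / 1024


# Values taken from the first sample row of the vmstat output above.
avm_pages, fre_pages, mem_mb = 3127273, 3272, 31488

avm_mb = pages_to_mb(avm_pages)
fre_mb = pages_to_mb(fre_pages)
print(f"avm = {avm_mb:.0f} MB ({avm_mb / mem_mb:.0%} of physical memory)")
print(f"fre = {fre_mb:.1f} MB")
```

For this sample, avm is about 12 GB on a machine with roughly 31 GB of memory, while fre is only about 13 MB. The difference is mostly file cache, which avm does not count.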
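  The most confusing thing about AIX is free memory. Nearly every AIX beginner dislikes how little free physical memory the system always shows. To be honest, I did not like it at first either, …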

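  If a memory request exceeds the physical memory currently available (which rarely happens), the system automatically scans all of physical memory; this is the sr column in vmstat. If …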


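The following commands raise the amount of free memory that the system (reports). They mean that when fre falls below 5000 pages (5000 × 4 KB ≈ 20 MB), the system starts freeing physical … by Page Out
#/usr/samples/kernel/vmtune  -F  10000 -f 5000    -> the setting is lost after a reboot

#vmo -p -o maxfree=10000 -o minfree=5000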

Table 3-2 shows how the maxfree and minfree parameters control the system's free memory.

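     Table 3-2  How maxfree and minfree control free memory

  Actual free memory                 System action
  More than maxfree                  No action
  Between minfree and maxfree        If memory is currently being freed, continue until free memory exceeds maxfree; otherwise no action
  Less than minfree                  Start freeing memory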
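
The three rows of the table describe a simple hysteresis: freeing starts below minfree and, once started, continues until free memory exceeds maxfree. The Python sketch below only models that rule; the function name and the sequence of free-page counts are invented for illustration and do not reflect how lrud is implemented.

```python
def stealing_after(free_pages, stealing, minfree=5000, maxfree=10000):
    """Return True if page stealing should be active after this check."""
    if free_pages < minfree:
        return True      # below minfree: start freeing memory
    if free_pages > maxfree:
        return False     # above maxfree: nothing to do
    return stealing      # in between: continue whatever was in progress


# Walk a made-up sequence of free-page counts through the rule.
state = False
history = []
for free in [12000, 7000, 4000, 7000, 9500, 10500, 7000]:
    state = stealing_after(free, state)
    history.append(state)
print(history)
```

Note that the same value (7000 pages) gives different results depending on whether freeing was already in progress, which is the point of having two thresholds.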

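… the way memory is used. The operating system normally sets two parameters, 80% and 20%, for the memory used as file cache (Persistent Objects). The current percentage is calculated as Persistent memory (that is, file …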

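Table 3-3  How maxperm and minperm control the memory used for caching
File memory as a percentage of physical memory     Operating system behavior
> 80% (maxperm)                     Free only Persistent memory (the file system cache) for programs to use
20% < file memory < 80%             Depending on the repage rate, preferentially free (page out) Persistent memory (the file system cache)
< 20% (minperm)                     Free (page out) Persistent and Working memory equally

Note: freeing Persistent memory does not require a write to paging space. A page that has not been modified can be released directly, which saves a disk write and improves efficiency.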
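
The decision in Table 3-3 can be written as a small Python function. This is a sketch of the table only, with the 20% and 80% thresholds from the text as defaults; the function name and the returned labels are illustrative.

```python
def steal_policy(file_pct, minperm_pct=20.0, maxperm_pct=80.0):
    """Map the file-cache share of physical memory to the behavior in Table 3-3."""
    if file_pct > maxperm_pct:
        return "persistent only"
    if file_pct < minperm_pct:
        return "persistent and working equally"
    return "prefer persistent (depends on repage rate)"


for pct in (90, 50, 10):
    print(pct, "->", steal_policy(pct))
```

Lowering both thresholds, for example to 5% and 10%, moves most systems into the first branch, so file cache is given up before program memory.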

To have cache (Persistent) memory taken first and released to programs (Working memory), use one of the following commands:
#/usr/samples/kernel/vmtune  -P 10 -p 5 # older AIX
#vmo -p -o maxperm%=10 -o minperm%=5  # newer AIX, where vmo replaces vmtune

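With this setting, virtual memory management gives even higher priority to freeing file memory (Persistent, cache). Table 3-4 lists typical parameter settings and their explanations.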

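             Table 3-4  Parameter settings and explanations
Parameter            Typical requirement               Explanation
maxclient            = maxperm                         maxclient must be less than or equal to maxperm; maxclient applies to JFS2 file systems
minfree                                                Usually left at the default of 128 (4 KB pages)
maxfree              maxfree = minfree + maxpgahead    maxpgahead is usually 16; use the larger of maxpgahead (JFS file systems) and
                                                       j2_maxPageReadAhead (JFS2 file systems) as the maxpgahead value in the calculation

minperm%              15%
maxperm%              30%                              maxclient must be less than or equal to maxperm

minperm%            use the default
maxperm%            use the default

minperm%              5%
maxperm%             10%                               maxclient must be less than or equal to maxperm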
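
The maxfree rule in the table is simple arithmetic: add the larger of the two read-ahead settings to minfree. A Python sketch follows; the parameter name jfs2_read_ahead is a placeholder for the JFS2 read-ahead setting, and the value 128 used for it in the second call is an assumed example, not a figure from the table.

```python
def recommended_maxfree(minfree, maxpgahead=16, jfs2_read_ahead=0):
    """maxfree = minfree + the larger of the JFS and JFS2 read-ahead values."""
    read_ahead = max(maxpgahead, jfs2_read_ahead)
    return minfree + read_ahead


print(recommended_maxfree(128))           # JFS only, with the values from the table
print(recommended_maxfree(128, 16, 128))  # JFS2 read-ahead of 128 is an assumed example
```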



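… lru_file_repage helps the system recognize this situation and reduce thrashing. If physical memory is severely short, however, no parameter can prevent it; the only remedy is to add physical memory.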

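On AIX V5.3 and later, the tendency is to set a relatively small minperm% (for example 5%) and a relatively large maxperm% (for example 90%), with lru_file_repage set to 1. With this setting, the system …
thrashing occurs, always page out file cache first, that is, set lru_file_repage to 0.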



About Direct I/O

Use Direct I/O to improve performance of your AIX applications

Learn the benefits and the rules and find out when it pays to implement Direct I/O


The alternative I/O technique called Direct I/O was first introduced in AIX 4.3 and has been available in all later releases of AIX, including AIX 5L. It bypasses the Virtual Memory Manager (VMM) altogether and transfers data directly between the disk and the user's buffer. You may find improved performance of your applications when you implement this technique for file handling.

In the following discussion any reference to JFS will imply reference to both JFS and JFS2. JFS (Journaled File System) is native to the POWER-based platform. Although JFS2 (also known as Enhanced Journaled File System) is not native to the POWER-based platform, it is available on POWER. Both JFS and JFS2, as used in AIX, exploit database journaling techniques to maintain their structural consistency. This prevents damage to the file system when the system is halted abnormally.

Normally, when an I/O request to a JFS file is invoked, the I/O goes from the application buffer to the Virtual Memory Manager (VMM) and then from the Virtual Memory Manager to the JFS. When the application makes a request for a file read, if the file page is not in memory, the JFS reads the data from the disk into the file buffer cache, then copies the data from the file buffer cache to the user's buffer. On the other hand, when the application makes a request for a write, the data is copied from the user's buffer into the file buffer cache. The actual writes to disk are done later if the write requests cannot be accommodated immediately.

This type of caching policy can be extremely effective in improving performance of JFS I/O when the cache hit rate is high. It would fully exploit the read-ahead and write-behind features of JFS. This would allow file writes to be asynchronous so that the applications can continue to process without having to wait for I/O requests to complete. On the other hand, if the applications have poor cache hit rates or if they do very large I/Os, such caching policy may not be of much benefit.

If you know that certain files have poor cache-utilization characteristics, then you could open those files as Direct I/O files. Doing this most likely will lead to improved performance of your application.

Direct I/O for files and raw I/O for devices are functionally equivalent, but Direct I/O doesn't impact raw I/O performance. In comparison, raw I/O performance is slightly better than Direct I/O, but Direct I/O does provide the benefits of a JFS as well as enhanced performance.

Enabling your applications to use Direct I/O

At the programming level, Direct I/O access to a file is enabled by passing the O_DIRECT flag to the open function. This flag is defined in fcntl.h. Applications must be compiled with _ALL_SOURCE enabled to have the definition of O_DIRECT available.
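
The article describes the C interface. As a rough illustration of the same idea, the sketch below requests Direct I/O from Python by adding O_DIRECT to the flags given to os.open. It assumes a platform whose Python os module exposes O_DIRECT (not all do) and falls back to a normal cached open otherwise; the function name is made up, and nothing here has been verified on AIX.

```python
import os


def open_direct(path):
    """Open path read-only, requesting Direct I/O where the platform allows it.

    Returns (fd, direct), where direct says whether O_DIRECT was actually used.
    """
    flag = getattr(os, "O_DIRECT", 0)  # not every platform's os module defines it
    if flag:
        try:
            return os.open(path, os.O_RDONLY | flag), True
        except OSError:
            pass  # e.g. the file system rejects O_DIRECT; fall back below
    return os.open(path, os.O_RDONLY), False
```

Reads on a descriptor opened this way still have to follow the alignment rules for buffers and lengths, so the caller should check the returned flag before assuming Direct I/O is in effect.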

At the user level, starting with AIX 5.1D Direct I/O is enabled using the "dio" option on the mount command e.g. mount -odio /xyz where xyz is a filesystem. This works for both JFS and JFS2 filesystems. A filesystem mounted with the "dio" option will have all I/O treated as Direct I/O as long as the alignment requirements are met. The I/O should be aligned to page (4K byte) boundaries and in multiples of the page size. If the I/O doesn't meet those requirements, then the I/O will go through kernel buffers and the buffers will be flushed after the I/O completes. This will result in poor performance. Therefore, you should use the "dio" option on the mount command only if all applications running against the files in the filesystem are well behaved in this respect.

Once Direct I/O is implemented, it's easy to verify if it's working: Mount a filesystem with the dio option and record the number of memory pages used. Repeat the process with the filesystem mounted without the dio option. Notice that for the Direct I/O implemented filesystem, memory pages will NOT be used to cache pages, hence the vmtune parameter 'numperm' would not grow as in the case of normal I/O.

Rules for Direct I/O at the API level

There are very strict rules for Direct I/O at the API level. Buffers for the I/O requests need to be 4K byte aligned, and the I/O lengths must be in multiples of 4K bytes. Failure on either at the API level will bypass Direct I/O. Normally, databases naturally obey these rules as they are true of raw logical I/O, too.
Direct I/O does not bypass i-node locking. If i-node locking is a problem because of writes, it will likely continue to be a problem with Direct I/O.
Direct I/O is unbuffered, so writes are synchronous. If the application does lots of writes which are buffered without Direct I/O, it may run very slowly with Direct I/O.
Direct I/O is unbuffered, so there is no read-ahead. If the application is doing a lot of sequential reads and taking advantage of the filesystem making them into bigger physical I/Os, Direct I/O may be slower.
Direct I/O does not coalesce contiguous I/O's. This would be a possible issue for applications using aio, listio, or readv/writev.
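
The first rule, 4K-aligned buffers and lengths in multiples of 4K, is easy to check before issuing a request. The Python sketch below shows the arithmetic only; the function names are illustrative, and the buffer address is passed in as a plain integer rather than taken from a real buffer.

```python
PAGE = 4096  # 4 KB, the alignment unit stated in the rules above


def is_dio_eligible(buffer_addr, length, page=PAGE):
    """True if the buffer is page aligned and the length is a page multiple."""
    return length > 0 and buffer_addr % page == 0 and length % page == 0


def round_up(n, page=PAGE):
    """Round a byte count up to the next multiple of the page size."""
    return -(-n // page) * page


print(is_dio_eligible(0x10000, 8192))  # aligned buffer, two pages
print(is_dio_eligible(0x10010, 8192))  # buffer is 16 bytes off a page boundary
print(round_up(5000))                  # length padded to a whole number of pages
```

An application that reads arbitrary record sizes can pad each request with round_up and ignore the extra bytes, at the cost of transferring slightly more data.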

Benefits of Direct I/O

Direct I/O is only supported for program working storage, that is, local persistent files. The main benefit of Direct I/O is the reduction in CPU cycles needed for file reads and writes. This results from not having to copy data from the VMM file cache to the user buffer, as happens with normal cached I/O. With normal caching, if the cache hit rate is low, most read requests go to the disk anyway. As mentioned before, these are the ideal situations where applications would benefit from a Direct I/O implementation. However, where the cache hit rate is high under normal caching, applications would see reduced CPU utilization from Direct I/O but would not be able to take advantage of the read-ahead algorithms available under the normal cache policy. Writes are faster with normal cached I/O in most cases. But if a file is opened with O_SYNC or O_DSYNC, then the writes have to go to disk. In these cases, applications would benefit from Direct I/O because the overhead of the data copy is eliminated.

Another benefit of Direct I/O is that it doesn't allow applications to compromise the effectiveness of caching of other files. When a file is read or written, the file competes for space in the file cache, and this could cause other file data to be pushed out of the cache. If you know that certain files have poor cache-utilization characteristics, then only those files could be opened with O_DIRECT.

Performance of Direct I/O reads

Even though the use of Direct I/O has the potential to reduce the need of CPU cycles for application execution, ironically it leads to longer elapsed times in many cases. This is especially true for a series of small I/O requests.

Direct I/O reads from the disk are synchronous, and this can result in poor performance if the data was likely to be in memory under the normal caching policy. Direct I/O bypasses the VMM read-ahead algorithms because the I/Os would not go through the VMM. The read-ahead algorithm can be quite useful for sequential access to files because the VMM can initiate disk requests and have the pages already be resident in memory before the application has requested the pages. Applications can compensate for the loss of this read-ahead feature by using one of the following methods:

Issue larger read requests.
Issue asynchronous Direct I/O read-ahead by the use of multiple threads.
Use the asynchronous I/O facilities such as aio_read() or lio_listio().
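
The first two methods, larger requests and multiple threads, can be combined. The Python sketch below splits a file into large chunks and reads them concurrently with positional reads, so several requests are in flight at once. It uses ordinary cached I/O and a thread pool purely to show the pattern; the function name, chunk size, and worker count are arbitrary choices, and a real Direct I/O version would also need aligned buffers.

```python
import os
from concurrent.futures import ThreadPoolExecutor

CHUNK = 1024 * 1024  # 1 MiB per request: large reads stand in for read-ahead


def read_parallel(path, chunk=CHUNK, workers=4):
    """Read a whole file using several concurrent positional reads."""
    size = os.path.getsize(path)
    fd = os.open(path, os.O_RDONLY)
    try:
        offsets = range(0, size, chunk)
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # os.pread takes an explicit offset, so threads can share one descriptor.
            parts = pool.map(lambda off: os.pread(fd, chunk, off), offsets)
            return b"".join(parts)
    finally:
        os.close(fd)
```

pool.map returns results in offset order, so the chunks can be joined directly without any reordering.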

Performance of Direct I/O writes

Direct I/O writes bypass the VMM and go directly to the disk, so that there can be a significant performance penalty; in normal cached I/O, the writes can go to memory and then get flushed to disk later. Because Direct I/O writes do not get copied into memory, when a sync operation is performed, it will not have to flush these pages to disk, thus reducing the amount of work the syncd daemon has to perform.

Performance example

In the following example, performance is measured on an RS/6000 server running AIX 4.3.1. KBPS is the throughput in kilobytes per second, and %CPU is CPU usage in percent.
Listing 1. Performance example
# of 2.2 GB SSA Disks        1        2        4        6        8
# of PCI SSA Adapters        1        1        1        1        1

        Sequential read throughput, using normal I/O

KBPS                                7108  14170  18725 18519  17892
%CPU                                23.9   56.1   92.1  97.0   98.3

        Sequential read throughput, using Direct I/O
KBPS                                7098  14150  22035  27588  30062
%CPU                                 4.4    9.1   22.0   39.2   54.4

        Sequential read throughput, using raw I/O

KBPS                                7258  14499  28504  30946  32165
%CPU                                 1.6    3.2   10.0   20.9   24.5
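
One way to read Listing 1 is to divide throughput by CPU usage, which gives the kilobytes per second delivered for each percent of CPU. The Python sketch below does this with the figures from the listing; the metric itself is an illustrative derived number and is not part of the original measurements.

```python
disks = [1, 2, 4, 6, 8]

# (KBPS, %CPU) per disk count, copied from Listing 1.
results = {
    "normal I/O": ([7108, 14170, 18725, 18519, 17892], [23.9, 56.1, 92.1, 97.0, 98.3]),
    "Direct I/O": ([7098, 14150, 22035, 27588, 30062], [4.4, 9.1, 22.0, 39.2, 54.4]),
    "raw I/O":    ([7258, 14499, 28504, 30946, 32165], [1.6, 3.2, 10.0, 20.9, 24.5]),
}


def kb_per_cpu_pct(kbps, cpu):
    """Throughput delivered per percent of CPU, rounded to one decimal."""
    return [round(k / c, 1) for k, c in zip(kbps, cpu)]


efficiency = {mode: kb_per_cpu_pct(k, c) for mode, (k, c) in results.items()}
for mode, values in efficiency.items():
    print(f"{mode:11s}", values)
```

With eight disks, normal I/O is CPU bound at about 182 KB/s per CPU percent, Direct I/O reaches about 553, and raw I/O about 1313, which matches the article's point that the copy through the file cache is what consumes the CPU.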


Conflicting file access modes

In order to avoid consistency issues between programs that use Direct I/O and programs that use normal cached I/O, Direct I/O is by default used in an exclusive use mode. If there are multiple opens of a file and some of them are direct and others are not, the file will stay in its normal cached access mode. Only when the file is open exclusively by Direct I/O programs will the file be placed in Direct I/O mode.

Similarly, if the file is mapped into virtual memory via the shmat() or mmap() system calls, then the file will stay in normal cached mode.

The JFS or JFS2 will attempt to move the file into Direct I/O mode any time the last conflicting, non-direct access is eliminated (either by the close(), munmap(), or shmdt() subroutines). Changing the file from normal mode to Direct I/O mode can be rather expensive since it requires writing all modified pages to disk and removing all the file's pages from memory.
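
The exclusive-use rule amounts to: a file is in Direct I/O mode only while every open of it is a direct open and nothing has it mapped. A Python sketch of that rule, with an invented function name and a list of booleans standing in for the file's current opens:

```python
def file_access_mode(open_is_direct, mapped=False):
    """Return "direct" or "cached" for a file, per the exclusive-use rule.

    open_is_direct: one boolean per current open, True if it used O_DIRECT
    mapped:         True if the file is mapped via shmat() or mmap()
    """
    if mapped or not open_is_direct:
        return "cached"
    return "direct" if all(open_is_direct) else "cached"


print(file_access_mode([True, True]))         # every open is direct
print(file_access_mode([True, False]))        # one cached open keeps the file cached
print(file_access_mode([True], mapped=True))  # a mapping also keeps it cached
```

When the last cached open in the second case is closed, the file switches to direct mode, which is the point at which the flush and invalidation cost described above is paid.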

Candidates for Direct I/O

I/O-intensive applications that don't benefit much from the normal caching policy are likely to see improved performance when Direct I/O is implemented.

Programs that are typically CPU-limited and perform lots of disk I/O are good candidates for Direct I/O. Codes that have large sequential I/Os are good candidates as well. Applications that do numerous small I/Os will typically see less performance benefit, since Direct I/O is unable to exploit read-ahead or write-behind algorithms available under normal caching policy. Applications that benefit from striping are also good candidates.



AIX 5L Version 5.1 Performance Management Guide

About the author

Shiv Dutta is a technical consultant for IBM eServer group where he assists independent software vendors with the enablement of their applications on pSeries servers. Shiv has considerable experience as a software developer, system administrator and an instructor. He provides AIX support in the areas of system administration, problem determination, performance tuning and sizing guides. Shiv has worked with AIX from its inception. He holds a Ph.D. in Physics from Ohio University and can be reached at


