Study Notes on "Windows NT File System Internals": An Introduction to Physical Memory Management

from fengcaho

Page Frames and the Page Frame Database

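The NT VMM (Virtual Memory Manager) must manage the physical memory available in the system. The approach it uses is similar to that of other modern commercial operating systems.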

 

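The NT VMM divides the available RAM into fixed-size page frames. The page frame size can range from 4K to 64K; on the Intel x86 architecture the page size is 4K. Every page frame has a corresponding entry in the page frame database (PFN database). The PFN database resides in nonpaged pool and is an array of page frame entries. For each page frame it maintains the following information: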

 

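•   The physical address of the page frame. This field is 20 bits wide; combined with a 12-bit offset, it can address 4GB of physical address space.

•   A set of attributes for the page frame:

    - A modified bit, marking whether the page contents have been modified

    - A status flag, indicating whether a read or a write operation is in progress on the page

    - The page color associated with the page

    - Information marking whether the page is shared or private to a process

•   A back pointer to the PTE or PPTE (Prototype Page Table Entry) that points to this page. This pointer is used to map backward from the physical page to the corresponding virtual address.

•   The reference count of the page. The VMM uses this value to determine whether any PTE still refers to the page.

•   An event pointer. Whenever paging I/O is in progress for the page, or data is being read from disk into memory, this pointer refers to an event.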
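To make the field list concrete, here is a minimal sketch in C of what such an entry might look like and how a 20-bit frame number combines with a 12-bit offset. The struct `pfn_entry_t`, its field names and widths, and the helper functions are hypothetical and chosen only for illustration; the real NT PFN entry is laid out differently.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12u                        /* 4K pages on x86 */
#define SKETCH_PAGE_SIZE  (1u << SKETCH_PAGE_SHIFT)
#define SKETCH_PFN_BITS   20u                        /* 20-bit frame number */

/* Hypothetical, simplified PFN database entry (not the real NT layout). */
typedef struct pfn_entry {
    unsigned int frame_number : 20; /* physical page frame number */
    unsigned int modified     : 1;  /* contents changed since last write-out */
    unsigned int io_state     : 2;  /* 0 idle, 1 read in progress, 2 write in progress */
    unsigned int shared       : 1;  /* 1 shared page, 0 process-private page */
    unsigned int color        : 8;  /* page color */
    void        *pte;               /* back pointer to the PTE or PPTE */
    unsigned int reference_count;   /* nonzero while the frame is in use */
    void        *event;             /* non-NULL while paging I/O is in flight */
} pfn_entry_t;

/* Combine a 20-bit frame number and a 12-bit offset into a 32-bit address. */
uint32_t physical_address(uint32_t frame_number, uint32_t offset)
{
    assert(frame_number < (1u << SKETCH_PFN_BITS));
    assert(offset < SKETCH_PAGE_SIZE);
    return (frame_number << SKETCH_PAGE_SHIFT) | offset;
}

/* Split a physical address back into its two parts. */
uint32_t frame_of(uint32_t physical)  { return physical >> SKETCH_PAGE_SHIFT; }
uint32_t offset_of(uint32_t physical) { return physical & (SKETCH_PAGE_SIZE - 1u); }
```

Since 20 + 12 = 32 bits, the largest address this scheme can form is the last byte of a 4GB physical address space.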

 

 

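A page frame whose reference count is nonzero is a valid, in-use page frame. Each time a PTE stops pointing to the page frame, the reference count is decremented by 1. When the reference count reaches 0, the page frame is no longer in use. Depending on its state, every unused page frame sits on one of the following five lists: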

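•   Bad page list: page frames with parity errors

•   Free page list: pages that can be reused immediately but have not been initialized to zero

•   Zeroed list: pages that have been zeroed and can be reused immediately

•   Modified list: page frames that are no longer referenced but cannot be reclaimed until their contents have been written to disk. The Modified Page Writer / Mapped Page Writer normally writes modified pages to disk asynchronously.

•   Standby list: page frames that have been removed from a process's working set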

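Based on each process's memory access pattern, the NT VMM tries to minimize the number of page frames allocated to that process. The set of pages allocated to a process at a given instant is called its working set. By trimming working sets wherever possible, the NT VMM improves the utilization of physical memory. When a page frame is taken away from a process for this reason, the VMM does not reclaim it immediately but places it on the standby list, which gives the process a chance to get the same page frame back. A page frame placed on the standby list is marked as being in the transition state, because it has not been freed, yet it no longer belongs to any process.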
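The list transitions described above can be sketched as a small state machine. This is a simplified user-mode model, not NT code: the type and function names are hypothetical, and the rule that a trimmed dirty page goes to the modified list while a clean one goes to the standby list is my reading of the two list descriptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    PAGE_ACTIVE,    /* in some working set */
    PAGE_STANDBY,   /* trimmed, contents still intact (transition state) */
    PAGE_MODIFIED,  /* trimmed, must be written to disk before reuse */
    PAGE_FREE,      /* reusable, not zeroed */
    PAGE_ZEROED,    /* reusable, zero-filled */
    PAGE_BAD        /* unusable */
} page_state_t;

typedef struct {
    page_state_t state;
    bool dirty;
} frame_t;

/* Working-set trimming: the frame leaves the process but is not freed yet. */
void trim_frame(frame_t *f)
{
    assert(f != NULL && f->state == PAGE_ACTIVE);
    f->state = f->dirty ? PAGE_MODIFIED : PAGE_STANDBY;
}

/* Modified page writer: once written out, the frame can sit on standby. */
void write_out_frame(frame_t *f)
{
    assert(f != NULL && f->state == PAGE_MODIFIED);
    f->dirty = false;
    f->state = PAGE_STANDBY;
}

/* The process touches the page again before it was reused: no disk I/O. */
bool soft_fault(frame_t *f)
{
    assert(f != NULL);
    if (f->state == PAGE_STANDBY || f->state == PAGE_MODIFIED) {
        f->state = PAGE_ACTIVE;
        return true;
    }
    return false;   /* contents are gone; a real fault would read from disk */
}

/* The VMM finally takes the standby frame for another purpose. */
void reclaim_frame(frame_t *f)
{
    assert(f != NULL && f->state == PAGE_STANDBY);
    f->state = PAGE_FREE;
}
```

The point of the intermediate states is visible in `soft_fault`: as long as the frame has not been reclaimed, handing it back to the process is only a state change.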

 

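The NT VMM sets a maximum and a minimum for the total number of page frames in the free and standby states. When a page frame is placed on the free or standby list and the total lies between the minimum and the maximum, a VMM global event is set to the signaled state. The VMM uses these events to determine whether the system has enough available physical pages.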
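One plausible way to picture the thresholds and events is a counter with two limits and one flag per limit. The sketch below is an assumption-laden user-mode model: `page_pool_t`, its functions, and the limit values passed in are all hypothetical, and the flags are cleared when a page is taken, which is simpler than what the kernel does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the available-page count and its two events. */
typedef struct {
    unsigned available;   /* pages on the free and standby lists */
    unsigned low_limit;   /* minimum */
    unsigned high_limit;  /* maximum */
    bool low_event;       /* "signaled" while available >= low_limit */
    bool high_event;      /* "signaled" while available >= high_limit */
} page_pool_t;

void pool_init(page_pool_t *p, unsigned low_limit, unsigned high_limit)
{
    assert(p != NULL && low_limit <= high_limit);
    p->available = 0;
    p->low_limit = low_limit;
    p->high_limit = high_limit;
    p->low_event = (p->available >= low_limit);
    p->high_event = (p->available >= high_limit);
}

/* A page frame was just put on the free or standby list. */
void pool_page_freed(page_pool_t *p)
{
    assert(p != NULL);
    p->available += 1;
    if (p->available >= p->low_limit)  p->low_event = true;
    if (p->available >= p->high_limit) p->high_event = true;
}

/* A page frame was just taken off the lists; returns false if none left. */
bool pool_take_page(page_pool_t *p)
{
    assert(p != NULL);
    if (p->available == 0) {
        return false;
    }
    p->available -= 1;
    if (p->available < p->high_limit) p->high_event = false;
    if (p->available < p->low_limit)  p->low_event = false;
    return true;
}
```

A waiter that only needs a page for a system-space fault would look at the low flag, while an ordinary request would look at the high flag, which leaves a small reserve for the system.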

 

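The VMM frequently calls an internal function to check whether enough memory is available to satisfy a request. For example, suppose your driver calls MmAllocateNonCachedMemory(), which needs a certain number of free page frames. That function calls MiEnsureAvailablePageOrWait() to check whether the free and standby lists hold enough available pages. If they do not, the routine blocks, waiting on one of two events until enough pages become available. If the event is not signaled within a fixed period, the system bug-checks through KeBugCheck() (the Win2K listing waits seven minutes and then calls KeBugCheckEx() with NO_PAGES_AVAILABLE).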
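The check, wait, and give-up flow can be sketched in user-mode C as follows. Everything here is hypothetical scaffolding: the wait is a callback standing in for the kernel wait with a timeout, and the fatal case is a return code where the kernel would bug-check.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { WAIT_SIGNALED, WAIT_TIMED_OUT } wait_result_t;
typedef enum { ENSURE_NO_WAIT, ENSURE_WAITED, ENSURE_FATAL } ensure_result_t;

/* Hypothetical stand-in for the kernel wait; it receives the page counter
 * so that a demo waiter can pretend other threads freed pages meanwhile. */
typedef wait_result_t (*wait_fn_t)(unsigned *available_pages);

ensure_result_t ensure_available_page(unsigned *available_pages,
                                      unsigned limit,
                                      wait_fn_t wait)
{
    ensure_result_t result = ENSURE_NO_WAIT;

    assert(available_pages != NULL && wait != NULL);

    /* Re-check after every wakeup: being signaled is no guarantee that
     * enough pages are still there by the time this thread runs again. */
    while (*available_pages < limit) {
        if (wait(available_pages) == WAIT_TIMED_OUT) {
            return ENSURE_FATAL;    /* the kernel would bug-check here */
        }
        result = ENSURE_WAITED;
    }
    return result;
}

/* Demo waiter: one page is freed during each wait. */
wait_result_t wait_page_freed(unsigned *available_pages)
{
    *available_pages += 1;
    return WAIT_SIGNALED;
}

/* Demo waiter: the timeout expires and nothing was freed. */
wait_result_t wait_timed_out(unsigned *available_pages)
{
    (void)available_pages;
    return WAIT_TIMED_OUT;
}
```

The three return codes correspond to the FALSE return, the TRUE return, and the bug check of the real routine.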

 

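The VMM uses a global spin lock to synchronize access to the page frame database. Whenever the PFN database is accessed, this spin lock must be acquired at an appropriate IRQL (<= DISPATCH_LEVEL).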
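As a rough user-mode analogy, a single spin lock guarding a shared page counter can be written with C11 atomics. This sketch assumes a C11 compiler with `<stdatomic.h>`; the names `pfn_db_t`, `pfn_lock`, and `pfn_take_page` are hypothetical, and there is no equivalent of raising IRQL here, which the kernel lock also does.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the PFN database and its global lock. */
typedef struct {
    atomic_flag locked;
    unsigned free_pages;
} pfn_db_t;

void pfn_db_init(pfn_db_t *db, unsigned free_pages)
{
    assert(db != NULL);
    atomic_flag_clear(&db->locked);
    db->free_pages = free_pages;
}

/* Spin until the flag was previously clear, i.e. until we own the lock. */
void pfn_lock(pfn_db_t *db)
{
    while (atomic_flag_test_and_set_explicit(&db->locked,
                                             memory_order_acquire)) {
        /* busy-wait; the holder is expected to release quickly */
    }
}

void pfn_unlock(pfn_db_t *db)
{
    atomic_flag_clear_explicit(&db->locked, memory_order_release);
}

/* Every access to the shared state happens between lock and unlock. */
bool pfn_take_page(pfn_db_t *db)
{
    bool ok;

    assert(db != NULL);
    pfn_lock(db);
    ok = (db->free_pages > 0);
    if (ok) {
        db->free_pages -= 1;
    }
    pfn_unlock(db);
    return ok;
}
```

Because every waiter spins instead of sleeping, code holding such a lock must be short and must never block, which is why the Win2K routine releases the PFN lock before it waits for pages.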

 

 

The code of the MmAllocateNonCachedMemory() function from the Win2K source is as follows:

PVOID MmAllocateNonCachedMemory (

    IN SIZE_T NumberOfBytes

    )

/*++

Routine Description:

    This function allocates a range of noncached memory in

    the non-paged portion of the system address space.

    This routine is designed to be used by a driver's initialization

    routine to allocate a noncached block of virtual memory for

    various device specific buffers.

Arguments:

    NumberOfBytes - Supplies the number of bytes to allocate.

Return Value:

    NON-NULL - Returns a pointer (virtual address in the nonpaged portion

               of the system) to the allocated physically contiguous

               memory.

    NULL - The specified request could not be satisfied.

 

Environment:

    Kernel mode, IRQL of APC_LEVEL or below.

--*/

{

    PMMPTE PointerPte;

    MMPTE TempPte;

    PFN_NUMBER NumberOfPages;

    PFN_NUMBER PageFrameIndex;

    PVOID BaseAddress;

    KIRQL OldIrql;

 

    ASSERT (NumberOfBytes != 0);

 

    NumberOfPages = BYTES_TO_PAGES(NumberOfBytes);

    //

    // Obtain enough virtual space to map the pages.

    //

    PointerPte = MiReserveSystemPtes ((ULONG)NumberOfPages,

                                      SystemPteSpace,

                                      0,

                                      0,

                                      FALSE);

    if (PointerPte == NULL) {

        return NULL;

    }

    //

    // Obtain backing commitment for the pages.

    //

    if (MiChargeCommitmentCantExpand (NumberOfPages, FALSE) == FALSE) {

        MiReleaseSystemPtes (PointerPte, (ULONG)NumberOfPages, SystemPteSpace);

        return NULL;

    }

    MM_TRACK_COMMIT (MM_DBG_COMMIT_NONCACHED_PAGES, NumberOfPages);

    MmLockPagableSectionByHandle (ExPageLockHandle);

    //

    // Acquire the PFN mutex to synchronize access to the PFN database.

    //

    LOCK_PFN (OldIrql);

    //

    // Obtain enough pages to contain the allocation.

    // Check to make sure the physical pages are available.

    //

    if ((SPFN_NUMBER)NumberOfPages > MI_NONPAGABLE_MEMORY_AVAILABLE()) {

        UNLOCK_PFN (OldIrql);

        MmUnlockPagableImageSection (ExPageLockHandle);

        MiReleaseSystemPtes (PointerPte, (ULONG)NumberOfPages, SystemPteSpace);

        MiReturnCommitment (NumberOfPages);

        return NULL;

    }

#if defined(_IA64_)

    KeFlushEntireTb(FALSE, TRUE);

#endif

    MmResidentAvailablePages -= NumberOfPages;

    MM_BUMP_COUNTER(4, NumberOfPages);

    BaseAddress = (PVOID)MiGetVirtualAddressMappedByPte (PointerPte);

    do {

        ASSERT (PointerPte->u.Hard.Valid == 0);

        MiEnsureAvailablePageOrWait (NULL, NULL);

        PageFrameIndex = MiRemoveAnyPage (MI_GET_PAGE_COLOR_FROM_PTE (PointerPte));

        MI_MAKE_VALID_PTE (TempPte,

                           PageFrameIndex,

                           MM_READWRITE,

                           PointerPte);

        MI_SET_PTE_DIRTY (TempPte);

        MI_DISABLE_CACHING (TempPte);

        MI_WRITE_VALID_PTE (PointerPte, TempPte);

        MiInitializePfn (PageFrameIndex, PointerPte, 1);

        PointerPte += 1;

        NumberOfPages -= 1;

    } while (NumberOfPages != 0);

    //

    // Flush any data for this page out of the dcaches.

    //

#if !defined(_IA64_)

    //

    // Flush any data for this page out of the dcaches.

    //

    KeSweepDcache (TRUE);

#else

    MiSweepCacheMachineDependent(BaseAddress, NumberOfBytes, MmNonCached);

#endif

    UNLOCK_PFN (OldIrql);

    MmUnlockPagableImageSection (ExPageLockHandle);

 

    return BaseAddress;

}
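The first step of the routine converts the byte count into a page count, rounding up so that a partial page still gets a whole frame. A small stand-alone sketch of that arithmetic, assuming 4K pages; the function name `pages_for_request` and the `SKETCH_` constants are hypothetical and only mimic what BYTES_TO_PAGES does in the listing:

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12u                              /* 4K pages */
#define SKETCH_PAGE_SIZE  ((size_t)1 << SKETCH_PAGE_SHIFT)

/* Number of whole pages needed to hold number_of_bytes, rounded up. */
size_t pages_for_request(size_t number_of_bytes)
{
    /* The listing asserts the same precondition on NumberOfBytes. */
    assert(number_of_bytes != 0);

    return (number_of_bytes >> SKETCH_PAGE_SHIFT) +
           ((number_of_bytes & (SKETCH_PAGE_SIZE - 1)) != 0 ? 1u : 0u);
}
```

A request of 4097 bytes therefore costs two page frames, two system PTEs, and two pages of commitment, and the do-while loop in the listing runs twice.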

 

The MiEnsureAvailablePageOrWait function from the Win2K source is as follows:

 

ULONG FASTCALL

MiEnsureAvailablePageOrWait (

    IN PEPROCESS Process,

    IN PVOID VirtualAddress

    )

/*++

Routine Description:

    This procedure ensures that a physical page is available on

    the zeroed, free or standby list such that the next call the remove a

    page absolutely will not block.  This is necessary as blocking would

    require a wait which could cause a deadlock condition.

    If a page is available the function returns immediately with a value

    of FALSE indicating no wait operation was performed.  If no physical

    page is available, the thread enters a wait state and the function

    returns the value TRUE when the wait operation completes.

Arguments:

    Process - Supplies a pointer to the current process if, and only if,

              the working set mutex is held currently held and should

              be released if a wait operation is issued.  Supplies

              the value NULL otherwise.

    VirtualAddress - Supplies the virtual address for the faulting page.

                     If the value is NULL, the page is treated as a

                     user mode address.

Return Value:

    FALSE - if a page was immediately available.

    TRUE - if a wait operation occurred before a page became available.

Environment:

    Must be holding the PFN database mutex with APCs disabled.

--*/

{

    PVOID Event;

    NTSTATUS Status;

    KIRQL OldIrql;

    KIRQL Ignore;

    ULONG Limit;

    ULONG Relock;

    PFN_NUMBER StrandedPages;

    LOGICAL WsHeldSafe;

    PMMPFN Pfn1;

    PMMPFN EndPfn;

    LARGE_INTEGER WaitBegin;

    LARGE_INTEGER WaitEnd;

 

    MM_PFN_LOCK_ASSERT();

    if (MmAvailablePages >= MM_HIGH_LIMIT) {

        //

        // Pages are available.

        //

        return FALSE;

    }

    //

    // If this fault is for paged pool (or pagable kernel space,

    // including page table pages), let it use the last page.

    //

#if defined(_IA64_)

    if (MI_IS_SYSTEM_ADDRESS(VirtualAddress) ||

        (MI_IS_HYPER_SPACE_ADDRESS(VirtualAddress))) {

#else

    if (((PMMPTE)VirtualAddress > MiGetPteAddress(HYPER_SPACE)) ||

        ((VirtualAddress > MM_HIGHEST_USER_ADDRESS) &&

         (VirtualAddress < (PVOID)PTE_BASE))) {

#endif

        //

        // This fault is in the system, use 1 page as the limit.

        //

        if (MmAvailablePages >= MM_LOW_LIMIT) {

            //

            // Pages are available.

            //

            return FALSE;

        }

        Limit = MM_LOW_LIMIT;

        Event = (PVOID)&MmAvailablePagesEvent;

    } else {

        Limit = MM_HIGH_LIMIT;

        Event = (PVOID)&MmAvailablePagesEventHigh;

    }

    while (MmAvailablePages < Limit) {

        KeClearEvent ((PKEVENT)Event);

        UNLOCK_PFN (APC_LEVEL);

        if (Process == HYDRA_PROCESS) {

            UNLOCK_SESSION_SPACE_WS (APC_LEVEL);

        }

        else if (Process != NULL) {

            //

            // The working set lock may have been acquired safely or unsafely

            // by our caller.  Handle both cases here and below.

            //

            UNLOCK_WS_REGARDLESS (Process, WsHeldSafe);

        }

        else {

            Relock = FALSE;

            if (MmSystemLockOwner == PsGetCurrentThread()) {

                UNLOCK_SYSTEM_WS (APC_LEVEL);

                Relock = TRUE;

            }

        }

        KiQueryInterruptTime(&WaitBegin);

        //

        // Wait 7 minutes for pages to become available.

        //

        Status = KeWaitForSingleObject(Event,

                                       WrFreePage,

                                       KernelMode,

                                       FALSE,

                                       (PLARGE_INTEGER)&MmSevenMinutes);

        if (Status == STATUS_TIMEOUT) {

            KiQueryInterruptTime(&WaitEnd);

            //

            // See how many transition pages have nonzero reference counts as

            // these indicate drivers that aren't unlocking the pages in their

            // MDLs.

            //

            Limit = 0;

            StrandedPages = 0;

            do {

       

                Pfn1 = MI_PFN_ELEMENT (MmPhysicalMemoryBlock->Run[Limit].BasePage);

                EndPfn = Pfn1 + MmPhysicalMemoryBlock->Run[Limit].PageCount;

 

                while (Pfn1 < EndPfn) {

                    if ((Pfn1->u3.e1.PageLocation == TransitionPage) &&

                        (Pfn1->u3.e2.ReferenceCount != 0)) {

                            StrandedPages += 1;

                    }

                    Pfn1 += 1;

                }

                Limit += 1;

            } while (Limit != MmPhysicalMemoryBlock->NumberOfRuns);

            //

            // This bugcheck can occur for the following reasons:

            //

            // A driver has blocked, deadlocking the modified or mapped

            // page writers.  Examples of this include mutex deadlocks or

            // accesses to paged out memory in filesystem drivers, filter

            // drivers, etc.  This indicates a driver bug.

            //

            // The storage driver(s) are not processing requests.  Examples

            // of this are stranded queues, non-responding drives, etc.  This

            // indicates a driver bug.

            //

            // Not enough pool is available for the storage stack to write out

            // modified pages.  This indicates a driver bug.

            //

            // A high priority realtime thread has starved the balance set

            // manager from trimming pages and/or starved the modified writer

            // from writing them out.  This indicates a bug in the component

            // that created this thread.

            //

            // All the processes have been trimmed to their minimums and all

            // modified pages written, but still no memory is available.  The

            // freed memory must be stuck in transition pages with non-zero

            // reference counts - thus they cannot be put on the freelist.

            // A driver is neglecting to unlock the pages preventing the

            // reference counts from going to zero which would free the pages.

            // This may be due to transfers that never finish and the driver

            // never aborts or other driver bugs.

            //

            KeBugCheckEx (NO_PAGES_AVAILABLE,

                          MmModifiedPageListHead.Total,

                          MmTotalPagesForPagingFile,

                          (MmMaximumNonPagedPoolInBytes >> PAGE_SHIFT) - MmAllocatedNonPagedPool,

                          StrandedPages);

            if (!KdDebuggerNotPresent) {

                DbgPrint ("MmEnsureAvailablePageOrWait: 7 min timeout %x %x %x %x\n", WaitEnd.HighPart, WaitEnd.LowPart, WaitBegin.HighPart, WaitBegin.LowPart);

                DbgBreakPoint ();

            }

        }

        if (Process == HYDRA_PROCESS) {

            LOCK_SESSION_SPACE_WS (Ignore);

        }

        else if (Process != NULL) {

 

            //

            // The working set lock may have been acquired safely or unsafely

            // by our caller.  Reacquire it in the same manner our caller did.

            //

 

            LOCK_WS_REGARDLESS (Process, WsHeldSafe);

        }

        else {

            if (Relock) {

                LOCK_SYSTEM_WS (Ignore);

            }

        }

        LOCK_PFN (OldIrql);

    }

    return TRUE;

}
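The timeout path walks every run of physical memory and counts transition pages that still have a nonzero reference count. The same scan can be sketched over plain arrays; the types and names below are hypothetical simplifications of the structures used in the listing.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { LOC_ACTIVE, LOC_TRANSITION } page_location_t;

/* Hypothetical, reduced PFN entry: only the two fields the scan reads. */
typedef struct {
    page_location_t location;
    unsigned reference_count;
} frame_info_t;

/* One contiguous run of physical pages: [base_page, base_page + page_count). */
typedef struct {
    size_t base_page;
    size_t page_count;
} memory_run_t;

/* Count transition pages that are still referenced, run by run.
 * pfn_db is indexed by page frame number; gaps between runs are skipped. */
size_t count_stranded_pages(const frame_info_t *pfn_db,
                            const memory_run_t *runs,
                            size_t number_of_runs)
{
    size_t stranded = 0;
    size_t r;

    assert(pfn_db != NULL && runs != NULL);

    for (r = 0; r < number_of_runs; r++) {
        const frame_info_t *pfn = pfn_db + runs[r].base_page;
        const frame_info_t *end = pfn + runs[r].page_count;

        for (; pfn < end; pfn++) {
            if (pfn->location == LOC_TRANSITION && pfn->reference_count != 0) {
                stranded += 1;
            }
        }
    }
    return stranded;
}
```

A large count points at the last cause named in the listing's comment: a driver that never unlocks the pages in its MDLs, so trimmed pages can never reach the free list.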

 
