[Memory] A look at the x86 "Self-referencing Page Directory trick"

Theory Concretion: A look at the x86 "Self-referencing Page Directory trick"

Since questions about this are asked all the time, it is best to simply explain it very clearly and get it out of the way.

This section of the article looks specifically at the x86-32 architecture and seeks to explain the "self-referencing page directory" trick. On x86-32, the processor takes after the translation fault model above which faults after *both* not finding an entry in the TLB *and* not finding a translation entry in the software constructed page tables which it walks for you since it's just nice like that. Like any other architecture with an MMU model that has hardware-assisted TLB loading (that is, the processor walks the page table for you), you are required to give the processor the address of the top level table that begins describing the address space's translations so it will know where to start walking from. This is essentially the page-directory's physical address in CR3. The processor walks and as it does so, it *interprets* the data at the addresses it finds as either Page-Directory Entries, or Page Table Entries.

Get that very solidly, please: bytes in RAM are not purposeful; they do not inherently imply any meaning. Page tables are just bytes in RAM. You could very well decide to give a page table to a network card for a DMA transfer. Just as you could put the address of a network frame into CR3 and have the processor walk that. It's very possible that your blunderous CR3 value would have, contiguous from it at the right offsets, the right bits set (PRESENT, WRITE, etc etc), and some frame address such that the CPU may even walk all the way to what it sees as a page table and find an entry to translate your fault address. This does not mean that this translation data the CPU interpreted as translation data was indeed translation data. The CPU takes your byte address in RAM and walks from there, *interpreting* those bytes as translation information.

Take the following example, then: You have a page directory at physical address 0x12345000. This page directory would of course have 1024 entries, numbered 0 to 1023. Since I want to jump into the meat of the subject, let's imagine that this page directory at 0x12345000's last entry, index 1023, points back to 0x12345000.

That is, pdir[1023] == 0x12345xxx, where 'xxx' represents the permission bits. Let us also imagine that pdir[0] points to a frame at 0x12344000, which the processor would interpret as a page table. This table has of course, it own 1024 page table entries. They map of course, the virtual addresses 0x0 to 0x3FFFFF to various frames in RAM. So right now, our example page directory looks like this:

[Page directory at 0x12345000]:
entry 0000 | phys: (0x12344 << 12) | perms 0bxxxxxxxxxxxx |
entry ...
entry ...
entry 1023 | phys: (0x12345 << 12) | perms 0bxxxxxxxxxxxx |

[Page table at 0x12344000 that pdir[0] points to]:
entry 0000 | phys: (0x34567 << 12) | perms 0bxxxxxxxxxxxx |
entry ...
entry ...
entry 512  | phys: (0x72445 << 12) | perms 0bxxxxxxxxxxxx |

This setup is such that: pdir[0] == 0x12344xxx, pdir[0], ptab[0] == 0x34567xxx, pdir[0], ptab[512] == 0x72445xxx, pdir[1023] == 0x12345xxx.

Let us simulate a page table walk to find out what physical address entry{pdir[0], ptab[512]} maps to; i.e, what physical address virtual address 0x200000 is mapped to.

  1. Processor encounters an instruction that references 0x200xxx while paging is turned on. This reference must pass through the MMU.
  2. Assume this entry was not found in the TLB. Fault #1 occurs. On x86-32, this causes the processor to walk the page tables rather than to trap into the OS.
  3. The CPU takes the virtual address, 0x00200xxx and splits it up, 10 bits, 10 bits, and 12 bits.
  4. The CPU now knows that it must index into entry 0x0 of the page directory pointed to by CR3, and then index into entry 0x200 (512) of the page table pointed to by the page directory in CR3.
  5. CPU begins page table walk. The OS has written 0x12345xxx into CR3. The CPU assumes that this is the address of a valid page directory and sends out (0x12345000 + 0x0 * sizeof(pdir_entry_t)) onto the address bus as a uint32_t fetch. That computes to the physical address 0x12345000 being sent out on the address bus.
  6. CPU gets the 4 bytes from the memory controller on the data bus, and interprets the 4bytes it just got from our page directory index 0 as a page directory entry.
  7. CPU sees that this page directory entry is 0x12344xxx, as in our example above. 'xxx' are the permission bits. The CPU checks these, and since we're just gansta like that, they're correct, and the entry is PRESENT.
  8. CPU now, having validated the permission bits, will extract the physical address from the page directory entry. It will get 0x12344000.
  9. CPU now confident that 0x12344000 is the start of a page table, decides to index into this page table by computing the offset from the base, 0x12344000. It needs to get index 0x200. It then computes: (0x12344000 + (0x200 * sizeof(ptab_entry_t)), and sends the result out onto the address bus.
  10. Memory controller returns the 4 bytes at 0x12344800.
  11. CPU interprets this as a page table entry, and checks the bits. Once again, we are some real Gs, so the permissions bits turn out to be valid.
  12. CPU is now at the leaf level, and extracts the frame address for the virtual address 0x200xxx. This turns out to be 0x72445000.
  13. CPU evicts some entry from the TLB to make space for this new translation data that maps 0x200000 in the current address space to the frame 0x72445000.
  14. Program execution continues. Note that translation fault number two, what people call the x86 Page Fault, never occurs since the CPU *did* find a translation from its walk.

Now that we have a firm grasp of how a page table walk works, and what the idea of "interpretation" is, let us simulate a walk of a self-referencing page directory entry.

  1. Processor encounters an instruction that references 0xFFFFFxxx while paging is turned on. This reference must pass through the MMU. This is also our self-referencing entry, as you can see from the example page-table setup above.
  2. Assume this entry was not found in the TLB. Fault #1 occurs. On x86-32, this causes the processor to walk the page tables rather than to trap into the OS.
  3. The CPU takes the virtual address, 0xFFFFFxxx and splits it up, 10 bits, 10 bits, and 12 bits.
  4. The CPU now knows that it must index into entry 0x3FF (1023) of the page directory pointed to by CR3, and then index into entry 0x3FF (1023) of the page table pointed to by the page directory in CR3.
  5. CPU begins page table walk. The OS has written 0x12345xxx into CR3. The CPU assumes that this is the address of a valid page directory and sends out (0x12345000 + 0x3FF * sizeof(pdir_entry_t)) onto the address bus as a uint32_t fetch. That computes to the physical address 0x12345FFC being sent out on the address bus.
  6. CPU gets the 4 bytes from the memory controller on the data bus, and correctly interprets the 4bytes it just got from our page directory index 1023 as a page directory entry.
  7. CPU sees that this page directory entry is 0x12345xxx, as in our example above: this entry references itself! But the CPU does not check that; it simply checks the permissions, and gets ready to interpret that address in the page directory entry as a page table's physical address. 'xxx' are the permission bits. The CPU checks these, and since we're just gansta like that, they're correct, and the entry is PRESENT.
  8. CPU now, having validated the permission bits, will extract the physical address from the page directory entry. It will gets 0x12345000 as a result of our self-referencing trick, and prepares to read *AGAIN* from the page directory, instead of reading from a page table. It will *interpret* the page directory's bytes now as a page table.
  9. CPU now confident that 0x12345000 is the start of a page table, decides to index into this page table by computing the offset from the base, 0x12345000. It needs to get index 0x3FF. It then computes: (0x12345000 + (0x3FF * sizeof(ptab_entry_t)), and sends the result out onto the address bus.
  10. Memory controller returns the 4 bytes at 0x12345FFC.
  11. CPU interprets this as a page table entry, and checks the bits. Just like last time, since we're reading the same bytes, the permissions turn out to be valid.
  12. CPU is now at the leaf level (or so it thinks), and extracts the frame address for the virtual address 0xFFFFFxxx. This turns out to be 0x12345000, since entry 0x3FF (1023) in the page directory (which the CPU is currently interpreting as a page table) points to the page directory, 0x12345000.
  13. CPU evicts some entry from the TLB to make space for this new translation data that maps 0xFFFFF000 in the current address space to the frame 0x12345000.
  14. Program execution continues. Note that translation fault number two, what people call the x86 Page Fault, never occurs since the CPU *did* find a translation from its walk.
  15. More importantly, the program that sent out the address, 0xFFFFF000, is about the receieve the data that was at physical frame 0x12345000, since the CP walked to that extent, and has a TLB entry that maps it as such. This program can now read and write from/to the page directory simply by accessing offsets from 0xFFFFFxxx. Accessing 0x12345000 will get page directory entry 0. Accessing 0x12345004 will get entry 1, and so on.

And now that we understand how the self-referencing page table trick can be used to read from and write to the current page directory, we'll examine how to read from and write to the page tables in the current process next. A note: 0xFFFFF000 is a valid address to any program unless you map it as SUPERVISOR, or in other words, you leave the USER bit unset. Otherwise programs in userspace will be able to edit their own page tables. Imagine a program deciding to start mapping pages in its address space to the kernel at physical address 0x100000? It could very well now decide to zero out the kernel in RAM.


http://wiki.osdev.org/Memory_Management_Unit

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值