With hardware vMMUs, the guest is left alone to update its own page tables, which avoids the VMM overhead of shadow paging, while the hardware maintains its own page tables that map gpa to hpa. Intel calls these Extended Page Tables (EPT). Having two sets of page tables means that when a guest translates an address, both sets must be walked (sometimes referred to as a 2D page walk). http://blog.chinaunix.net/uid-1858380-id-3205061.html
So for cache-unfriendly programs with poor locality, hardware support can cost more than its software equivalent. When a TLB miss occurs and the guest does a page walk, the entire EPT must be walked as well for each level of the guest's hierarchical page tables in order to obtain the hpa. This is worse for 64-bit guests than for 32-bit ones, as the 64-bit address space requires more levels of translation (PML4, PDP, PD, PTE).
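To put a number on that cost, the worst case can be counted directly. The sketch below is only an illustration: the function name nested_walk_refs() is made up for this note, and it assumes that nothing hits in the TLB or in any paging-structure cache, so every table entry costs one memory reference.

```c
#include <assert.h>

/*
 * Worst-case memory references for one nested (2D) page walk.
 * Hypothetical helper, not part of KVM.
 *
 * Each guest level holds a guest physical pointer to the next table,
 * so it costs a full EPT walk (ept_levels references) plus one
 * reference to read the guest entry itself. The final gpa produced by
 * the guest walk then needs one more EPT walk to become an hpa.
 */
unsigned int nested_walk_refs(unsigned int guest_levels,
			      unsigned int ept_levels)
{
	assert(guest_levels > 0 && ept_levels > 0);

	unsigned int per_guest_level = ept_levels + 1;

	return guest_levels * per_guest_level + ept_levels;
}
```

With a 4-level guest and a 4-level EPT, nested_walk_refs(4, 4) gives 24 references, against 4 for a native 4-level walk. A 2-level 32-bit guest on the same EPT needs 14.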
KVM's implementation of EPT is unusual in that it uses both the guest's tables and the hardware's to translate addresses. When a guest needs to translate virtual addresses to physical ones, the gva_to_gpa() function is called:
static gpa_t FNAME(gva_to_gpa)(struct kvm_vcpu *vcpu, gva_t vaddr)
{
	struct guest_walker walker;
	gpa_t gpa = UNMAPPED_GVA;
	int r;

	r = FNAME(walk_addr)(&walker, vcpu, vaddr, 0, 0, 0);
	if (r) {
		gpa = gfn_to_gpa(walker.gfn);
		gpa |= vaddr & ~PAGE_MASK;
	}
	return gpa;
}
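The last two assignments in that function combine the frame number found by the walk with the offset inside the page. A minimal standalone sketch of that arithmetic follows, assuming 4 KiB pages. The toy_ names are invented for this note so the snippet compiles outside the kernel; they mirror gfn_to_gpa() and PAGE_MASK but are not the kernel definitions themselves.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages, as on x86 without large pages. */
#define TOY_PAGE_SHIFT	12
#define TOY_PAGE_SIZE	((uint64_t)1 << TOY_PAGE_SHIFT)
#define TOY_PAGE_MASK	(~(TOY_PAGE_SIZE - 1))

/* Guest frame number -> guest physical address of the frame start. */
uint64_t toy_gfn_to_gpa(uint64_t gfn)
{
	return gfn << TOY_PAGE_SHIFT;
}

/*
 * Rebuild the full gpa: frame base from the walk, low 12 bits taken
 * unchanged from the virtual address (the in-page offset is the same
 * in the virtual and physical address).
 */
uint64_t toy_compose_gpa(uint64_t gfn, uint64_t vaddr)
{
	uint64_t gpa = toy_gfn_to_gpa(gfn);

	gpa |= vaddr & ~TOY_PAGE_MASK;
	return gpa;
}
```

For example, a walk that resolves to gfn 0x1234 for the virtual address 0x7fffdeadbeef yields the gpa 0x1234eef.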
If the guest's walk fails because the gva-to-gpa mapping is not present, a page fault is raised, and tdp_page_fault() (TDP stands for two-dimensional paging) is invoked through the EPT violation handler, handle_ept_violation(), to translate gpa to hpa. A new page table entry is created by reusing the shadow page code through mmu_set_spte(), and it is added to the beginning of the page list through pte_list_add(). This way, the next time the guest virtual address is accessed, the mapping will already be in place, walk_addr() will succeed, and the gpa can be returned without further ado.
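The fault-then-fill pattern described here can be shown with a toy model. This is a sketch under loud assumptions: it uses a flat array instead of real multi-level tables, every toy_ name is hypothetical, and the host frame is chosen by a fixed offset rather than by a real allocator. It shows only the idea that the first access pays for a fault and later accesses find the entry already installed, not how KVM implements it.

```c
#include <stdint.h>
#include <stddef.h>

#define TOY_NR_GFNS	16
#define TOY_UNMAPPED	UINT64_MAX

/* Toy gpa->hpa table: one slot per guest frame, filled on demand. */
struct toy_tdp {
	uint64_t hfn[TOY_NR_GFNS];	/* TOY_UNMAPPED until first fault */
	unsigned int faults;		/* how many fills were needed */
};

void toy_tdp_init(struct toy_tdp *t)
{
	for (size_t i = 0; i < TOY_NR_GFNS; i++)
		t->hfn[i] = TOY_UNMAPPED;
	t->faults = 0;
}

/* Stand-in for host memory allocation (assumption: fixed offset). */
static uint64_t toy_host_frame_for(uint64_t gfn)
{
	return gfn + 0x100;
}

/*
 * Translate a guest frame to a host frame. A missing entry plays the
 * role of the EPT violation: the entry is installed once, and every
 * later lookup of the same frame succeeds without a fault.
 */
uint64_t toy_tdp_translate(struct toy_tdp *t, uint64_t gfn)
{
	if (gfn >= TOY_NR_GFNS)
		return TOY_UNMAPPED;

	if (t->hfn[gfn] == TOY_UNMAPPED) {
		t->faults++;
		t->hfn[gfn] = toy_host_frame_for(gfn);
	}
	return t->hfn[gfn];
}
```

After toy_tdp_init(), translating the same frame twice returns the same host frame both times while the faults counter stays at 1.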