Xen Hypercalls Explained

A system call, or syscall, is the mechanism used by an application program to request service from the operating system.

 

A hypervisor call, or hypercall, is the paravirtualization interface through which a guest operating system accesses hypervisor services.

How hypercalls are implemented

1 Xen 2.0 implementation

On the guest side, in xen-2.0/linux-2.6.11-xen-sparse/include/asm-xen/hypervisor.h:

/* And the trap vector is... */

#define TRAP_INSTR "int $0x82"

static inline int HYPERVISOR_xen_version(int cmd)

{

   int ret;

   unsigned long ignore;

 

    __asm__ __volatile__ (

       TRAP_INSTR

       : "=a" (ret), "=b" (ignore)

       :"0" (__HYPERVISOR_xen_version), "1" (cmd)

       :"memory" );

 

   return ret;

}

 

On the Xen side

The trap_init function in xen-2.0/xen/arch/x86/traps.c contains the following code:

#define HYPERCALL_VECTOR   0x82

/* Only ring 1 can access Xen services. */

 _set_gate(idt_table+HYPERCALL_VECTOR,14,1,&hypercall);

 

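xen-2.0/xen/arch/x86/x86_32/entry.S defines hypercall_table, which holds the addresses of the implementing functions. The hypercall entry point jumps through this table using the value in the eax register.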

ENTRY(hypercall)

    …..  

       call *SYMBOL_NAME(hypercall_table)(,%eax,4)

ENTRY(hypercall_table)

       .long SYMBOL_NAME(do_set_trap_table)    /*  0 */

       …..

       .long SYMBOL_NAME(do_xen_version)

       .long SYMBOL_NAME(do_console_io)
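
The table-driven jump can be pictured in C as an array of function pointers indexed by the hypercall number. This is only a minimal sketch: the numbering, the single-argument handler signature, the stand-in handler bodies, the bounds check, and the error value are all assumptions made for illustration, not the Xen 2.0 code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hypercall numbers, used only by this sketch. */
enum { HC_SET_TRAP_TABLE = 0, HC_XEN_VERSION = 1, HC_CONSOLE_IO = 2, HC_COUNT };

/* Assumed handler shape: one argument in, one status value out. */
typedef long (*hypercall_fn)(long arg);

/* Stand-in handlers; the real do_* functions do actual work. */
static long sketch_set_trap_table(long arg) { (void)arg; return 0; }
static long sketch_xen_version(long arg)    { (void)arg; return 2; }
static long sketch_console_io(long arg)     { return arg; }

/* Plays the role of hypercall_table: slot N holds the handler for call N. */
static const hypercall_fn sketch_table[HC_COUNT] = {
    [HC_SET_TRAP_TABLE] = sketch_set_trap_table,
    [HC_XEN_VERSION]    = sketch_xen_version,
    [HC_CONSOLE_IO]     = sketch_console_io,
};

#define SKETCH_ENOSYS 38  /* assumed "no such call" error code */

/* Equivalent in spirit to "call *hypercall_table(,%eax,4)",
 * with nr standing in for the value the guest placed in eax. */
long dispatch_hypercall(unsigned long nr, long arg)
{
    if (nr >= HC_COUNT)
        return -SKETCH_ENOSYS;
    assert(sketch_table[nr] != NULL);  /* every in-range slot is populated */
    return sketch_table[nr](arg);
}
```

Calling dispatch_hypercall(HC_CONSOLE_IO, 7) runs the third slot's handler, just as loading the matching number into eax selects the matching .long entry in the assembly table.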

       

      

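2 Xen 3.1 implementation

Principle:

Going by the explanations quoted below, the idea is this: a guest can switch into Xen in several ways (int, sysenter, and so on). With the Xen 2.0 approach, a guest migrated to a slightly different platform would have to be recompiled. Letting Xen supply the hypercall page solves that problem. One thing still puzzles me: sysenter/syscall can only be used to jump from ring 3 into ring 0, so in most cases a guest cannot use them (a guest kernel normally runs in ring 1). So far the only user I have found is hypercall_page_initialise_ring3_kernel on the x86_64 architecture; otherwise the hypercall page is still basically filled with int 82h instructions.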

 

In more recent versions of Xen, hypercalls are issued via an extra layer of indirection. The guest kernel calls a function in a shared memory page (mapped by the hypervisor) with the arguments passed in registers. This allows more efficient mechanisms to be used for hypercalls on systems that support them, without requiring the guest kernel to be recompiled for every minor variation in architecture. Newer chips from AMD and Intel provide mechanisms for fast transitions to and from ring 0 (that is, sysenter on Intel, syscall on AMD, and the like). This layer of indirection allows these to be used when available.

 

Hypercalls were generated by a guest kernel in almost the same way as system calls are generated by userspace applications, the difference being that interrupt 82h, instead of 80h, is used.

 

This still works as of Xen 3, but is now deprecated. Instead, hypercalls are issued indirectly via the hypercall page. This is a memory page mapped into the guest's address space when the system is started.

 

The [Xen-devel] thread "Why using hypercall_page" explains:

This allows a guest to be migrated to a newer/older Xen with a different hypercall invocation convention. Xen fills the hypercall page according to its own convention, and thus releases the guest from hardcoding a specific flow.

 

Hypercalls are issued by CALLing an address within this page. As you know, the old Intel/AMD x86 CPUs use INT to invoke the kernel's services. But the newer CPUs introduce two instruction pairs: syscall/sysret and sysenter/sysexit. Because the hypercall page is filled by Xen, it can hide the difference between these two types. The guest OS only needs one uniform format to invoke a hypercall.

 

BTW, for HVM guest's hypercall, we don't use int 0x82 or the sysXXX instructions; we use VMCALL inside VMX guest or something similar (VMMCALL? I'm not sure) inside SVM guest.
Even for PV guest, the hypercall stub codes may have different formats/versions...
We can see these differences in the function hypercall_page_initialise().

So considering compatibility and portability, it's really not OK for a guest to assume the underlying stub codes or to do hard coding.
Using the hypercall-page method, various guests can use one unified method to invoke hypercalls.

The figure is from http://www.sprg.uniroma2.it/kernelhacking2008/lectures/lkhc08-06.pdf; the exits.s shown in it should probably be entry.s.

Implementation of hypercall_page (Xen 3.1)

Initialization

Hypercall_page is actually a code page made up of 32-byte hypercall entries.

Every entry is something like

 

"mov $__HYPERVISOR_xxx,%eax

int $0x82 "

 

It is initialized in hypercall_page_initialise(void *hypercall_page) at the time when the control panel creates the domain. Later, the domain can simply call the corresponding entry to issue a hypercall.

The implementation of hypercall_page_initialise is platform-specific.

 

void hypercall_page_initialise(struct domain *d, void *hypercall_page)

{

   if ( is_hvm_domain(d) )

       hvm_hypercall_page_initialise(d, hypercall_page);

   else if ( supervisor_mode_kernel )

       hypercall_page_initialise_ring0_kernel(hypercall_page);

   else

       hypercall_page_initialise_ring1_kernel(hypercall_page);

}

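Each hypercall occupies 32 bytes in hypercall_page. These 32 bytes hold the stub for that call: the instruction that loads the hypercall number and the trap instruction (for example int 82h).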
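
To make the 32-byte layout concrete, here is a minimal sketch of how such a page could be filled for the ring 1 (int 82h) case. It writes the x86 machine-code bytes for `mov $nr,%eax` and `int $0x82` into every slot. The 4096-byte page size, the int3 (0xcc) padding, the trailing `ret` (needed because the guest enters the stub with CALL), and all names below are assumptions of this sketch rather than a copy of hypercall_page_initialise_ring1_kernel.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size */
#define SKETCH_STUB_SIZE 32u    /* bytes reserved per hypercall */

/* Fill a page-sized buffer with one stub per hypercall number. */
void fill_hypercall_page(uint8_t *page)
{
    assert(page != NULL);

    /* Pad everything with int3 so stray jumps into unused bytes trap. */
    memset(page, 0xcc, SKETCH_PAGE_SIZE);

    for (uint32_t nr = 0; nr < SKETCH_PAGE_SIZE / SKETCH_STUB_SIZE; nr++) {
        uint8_t *p = page + nr * SKETCH_STUB_SIZE;

        p[0] = 0xb8;                          /* mov $nr,%eax (opcode)   */
        p[1] = (uint8_t)(nr & 0xff);          /* imm32, little-endian    */
        p[2] = (uint8_t)((nr >> 8) & 0xff);
        p[3] = (uint8_t)((nr >> 16) & 0xff);
        p[4] = (uint8_t)((nr >> 24) & 0xff);
        p[5] = 0xcd;                          /* int ...                 */
        p[6] = 0x82;                          /* ... $0x82               */
        p[7] = 0xc3;                          /* ret back to the caller  */
    }
}
```

With these sizes the page holds 4096 / 32 = 128 slots, and only the first 8 bytes of each slot carry instructions; a different entry mechanism would change the bytes written but not the slot layout the guest relies on.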

Implementation in the guest

Generally, the guest has wrappers of the form HYPERVISOR_XXX (in linux-2.6-xen-sparse/include/asm-i386/mach-xen/asm/hypercall.h) that invoke _hypercallN. Below is _hypercall0 as an example.

 

#define HYPERCALL_STR(name)                                   \

       "call hypercall_page + ("STR(__HYPERVISOR_##name)" * 32)"

 

#define _hypercall0(type, name)                  \

({                                       \

       long __res;                         \

       asm volatile (                      \

              HYPERCALL_STR(name)            \

              :"=a" (__res)                \

              :                           \

              :"memory" );                \

       (type)__res;                         \

})
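
The address arithmetic hidden inside HYPERCALL_STR can be spelled out in plain C. This is a small sketch under assumptions: the helper name, the 4096-byte page size, and the range check are hypothetical; only the "base + number * 32" rule comes from the macro above.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size */
#define SKETCH_STUB_SIZE 32u    /* the "* 32" in HYPERCALL_STR */

/* Return the address the guest would CALL for hypercall number nr,
 * given the address at which the hypercall page is mapped. */
uintptr_t hypercall_stub_address(uintptr_t page_base, unsigned int nr)
{
    /* The stub must lie inside the single hypercall page. */
    assert(nr < SKETCH_PAGE_SIZE / SKETCH_STUB_SIZE);
    return page_base + (uintptr_t)nr * SKETCH_STUB_SIZE;
}
```

For a page mapped at 0x1000, hypercall number 17 resolves to 0x1000 + 17 * 32 = 0x1220. The guest never needs to know which instruction sits at that address, which is the point of the indirection.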

Implementation in Xen

xen/arch/x86/x86_32/traps.c initializes the interrupt vector:

void __init percpu_traps_init(void)

{..

     /* The hypercall entry vector is only accessible from ring 1. */

   _set_gate(idt_table+HYPERCALL_VECTOR, 14, 1, &hypercall);

 

/xen/arch/x86/x86_32/entry.S contains the following code:

ENTRY(hypercall)

       ……

       call *hypercall_table(,%eax,4)

For each HYPERVISOR_XXX in the guest there is a corresponding do_XXX implementation in Xen; see hypercall_table.

Example

HYPERVISOR_set_trap_table (in the guest) => do_set_trap_table() (file xen/arch/x86/traps.c, in Xen)

 

A question and answer

http://old-list-archives.xen.org/archives/html/xen-devel/2008-10/msg00515.html

   In xen/arch/x86/x86_32/traps.c, if supervisor_mode_kernel is true, the hypercall_page will be initialized by hypercall_page_initialise_ring0_kernel.

   My question is, does supervisor_mode_kernel mean that the guest kernel is also running in ring 0, the same privilege level as the Xen hypervisor?

   The book "The Definitive Guide to the Xen Hypervisor" (on page 30) says hypercall through int82 is now deprecated, and replaced by hypercall_page.

   But int82 can still be found in hypercall_page_initialise_ring1_kernel. In what situation will it be used?

 

 

Yes, supervisor_mode_kernel means that the dom0 kernel runs in ring 0. It also means that other guests cannot be run. It's not really very useful these days.

To your other question: guests are supposed to call into the hypervisor via the hypercall page, but actually the underlying mechanism is still int 0x82 for 32-bit PV guests. It's just hidden in the hypervisor-provided hypercall page now.

Applications and hypercalls

Hypercalls are normally meant for the guest kernel, but sometimes an application needs this service too. For example:

 libxenctrl (tools/libxc/xenctrl.h) is a library for low-level access to the Xen control interfaces.

 libxenguest (tools/libxc/xenguest.h) is a library for guest domain management in Xen.

http://www.tumblr.com/tagged/xen?before=1307176245

An application issues a hypercall as follows:

  1. Open the kernel driver that Xen provides: /proc/xen/privcmd;
  2. Invoke the hypercall indirectly through the ioctl system call:

/* message is the buffer handed to the hypercall as its first argument */
int fd = open("/proc/xen/privcmd", O_RDWR);
privcmd_hypercall_t hcall = {
       __HYPERVISOR_print_string,
       {message, 0, 0, 0, 0}
   };
ioctl(fd, IOCTL_PRIVCMD_HYPERCALL, &hcall);
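
The request handed to ioctl is just a hypercall number plus up to five arguments, as the initializer above suggests. The sketch below builds such a request in a locally defined structure so that it compiles without the Xen headers. The structure name, the field names, the 64-bit field widths, and the helper are assumptions made for illustration and may not match the real privcmd_hypercall_t; the sketch does not open the device or issue any ioctl.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout: one operation number followed by five arguments,
 * mirroring the { op, { a0, a1, a2, a3, a4 } } initializer. */
typedef struct sketch_privcmd_hypercall {
    uint64_t op;
    uint64_t arg[5];
} sketch_privcmd_hypercall_t;

/* Fill req with op and the first nargs arguments; the rest stay zero. */
void build_hypercall_request(sketch_privcmd_hypercall_t *req, uint64_t op,
                             const uint64_t *args, size_t nargs)
{
    assert(req != NULL && nargs <= 5);

    memset(req, 0, sizeof(*req));
    req->op = op;
    for (size_t i = 0; i < nargs; i++)
        req->arg[i] = args[i];
}
```

A caller would pass the address of its message buffer, cast to an integer, as the first argument and leave the remaining four at zero, which matches the {message, 0, 0, 0, 0} pattern in the snippet.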

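A more involved hypercall request proceeds as follows (using the __HYPERVISOR_domctl hypercall as an example):

  1. pyxc_domain_create() gathers the information about the domain to be created;
  2. xc_domain_create() creates the control structure variable domctl;
  3. do_domctl() builds the hypercall request;
  4. The request is passed to the OS kernel: do_xen_hypercall()
  5. do_privcmd uses ioctl to make the transition from ring 3 to ring 1 and completes the hypercall.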