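I think calling mmap from user space is somewhat similar to calling malloc: both end up asking the kernel for a range of virtual address space (malloc does so indirectly, through brk or mmap), and the kernel creates a struct vm_area_struct to describe what the user asked for.
For malloc, the newly created struct vm_area_struct may have the same attributes as an adjacent struct vm_area_struct the process already owns, in which case the kernel merges the two.
For mmap, there is usually no existing struct vm_area_struct with matching attributes, so the new one is added to the struct vm_area_struct linked list and red-black tree in the memory-management field (mm, a struct mm_struct) of the process descriptor (struct task_struct).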
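In both cases what you get back is only virtual addresses; the physical-page fields of the corresponding page-table entries are still empty. If the CPU accesses one of these virtual addresses, a page fault occurs.
How is this page fault handled? As I see it, whatever else happens, the job of the handler is to establish the virtual address --> physical address mapping.
According to the operating-system textbooks, the handler first brings the page back from secondary storage into memory and then fills the physical page frame into the page table.
But isn't that strange? Whether the addresses come from mmap or from malloc, they should be blank (currently unused) memory, so how could that memory already exist on secondary storage?
I think the page fault in this case must be handled differently from the textbook description, that is, there should be two kinds of page-fault handling.
In the first, the page being accessed is on secondary storage, and it is handled as described above: fill the page first, then map it. In the second, the page being accessed is already in memory, and only the mapping needs to be established.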
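The second kind of fault can be observed from user space. The following is only a minimal sketch, assuming a Linux-like system with MAP_ANONYMOUS and getrusage; the function name minor_faults_on_first_touch is made up for this example. It maps fresh anonymous memory, touches each page once, and reports how many minor (no disk I/O) faults the process took while doing so.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Hypothetical helper: map npages of anonymous memory, touch every page once,
 * and return the number of minor faults taken meanwhile (-1 on error). */
long minor_faults_on_first_touch(size_t npages)
{
    long page_size = sysconf(_SC_PAGESIZE);
    struct rusage before, after;
    volatile unsigned char *mem;
    size_t len, i;

    if (page_size <= 0 || npages == 0)
        return -1;
    len = npages * (size_t)page_size;

    /* Only virtual address space is reserved here; no page is mapped yet. */
    mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    if (getrusage(RUSAGE_SELF, &before) != 0) {
        munmap((void *)mem, len);
        return -1;
    }
    for (i = 0; i < npages; i++) {
        /* Fresh anonymous pages read as zero: nothing comes from disk. */
        assert(mem[i * (size_t)page_size] == 0);
        mem[i * (size_t)page_size] = 1;   /* first write to the page */
    }
    if (getrusage(RUSAGE_SELF, &after) != 0) {
        munmap((void *)mem, len);
        return -1;
    }

    munmap((void *)mem, len);
    return after.ru_minflt - before.ru_minflt;
}
```

Calling minor_faults_on_first_touch(16) from a small main() and printing the result should show the count growing with the number of pages touched, although the exact number depends on the kernel and its configuration.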
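When writing the driver's mmap method, I tried two ways of establishing the virtual-to-physical mapping: one uses the remap_pfn_range function, the other relies on the page-fault path calling the nopage method.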
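#define SIZE (2*PAGE_SIZE)
#define VMALLOC
static char *buffer;    /* declaration assumed; allocated once, e.g. in the module's init function */
#ifdef VMALLOC
buffer = vmalloc(SIZE);             /* vmalloc() takes only a size, no GFP flags */
#else
buffer = kmalloc(SIZE, GFP_KERNEL);
#endif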
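static int test_mmap(struct file *filp, struct vm_area_struct *vma)
{
    unsigned long pfn, start = vma->vm_start;
    unsigned long size = vma->vm_end - vma->vm_start;
    unsigned long offset = vma->vm_pgoff << PAGE_SHIFT;
#ifdef VMALLOC
    char *vaddr;
#endif

    /* Reject requests that would run past the end of the buffer. */
    if (offset > SIZE || size > SIZE - offset)
        return -EINVAL;

#ifdef VMALLOC
    /* vmalloc pages need not be physically contiguous: map them one by one.
     * Walk a local pointer so the global buffer pointer is left untouched. */
    vaddr = (char *)buffer + offset;
    while (size > 0) {
        pfn = vmalloc_to_pfn(vaddr);
        if (remap_pfn_range(vma, start, pfn, PAGE_SIZE, vma->vm_page_prot))
            return -EAGAIN;
        vaddr += PAGE_SIZE;
        start += PAGE_SIZE;
        size -= PAGE_SIZE;
    }
#else
    /* kmalloc memory is physically contiguous: a single call maps all of it. */
    pfn = __pa(buffer + offset) >> PAGE_SHIFT;
    if (remap_pfn_range(vma, start, pfn, size, vma->vm_page_prot))
        return -EAGAIN;
#endif
    return 0;
}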
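test_mmap establishes the mapping by calling remap_pfn_range. I verified that calling mmap from user space raises no exception and that buffer is successfully mapped into user space.
If buffer is allocated by kmalloc, it is a kernel logical address and the physical memory behind it is contiguous as well, so a single remap_pfn_range call can map all of it.
If buffer is allocated by vmalloc, it is a kernel virtual address and the physical pages behind it are not necessarily contiguous. To be correct, the mapping has to be built one page at a time, as above. Once the page-table entries are in place, accesses no longer fault.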
void test_vma_open(struct vm_area_struct *vma)
{
    /* vm_start, vm_end and vm_pgoff are unsigned long, so print them with %lx / %lu. */
    printk(KERN_INFO "this is test_vma_open,vm_statr = %lx,vm_end = %lx,vm_pgoff = %lu\n",
           vma->vm_start, vma->vm_end, vma->vm_pgoff);
}
void test_vma_close(struct vm_area_struct *vma)
{
    printk(KERN_INFO "this is test_vma_close,vm_statr = %lx,vm_end = %lx,vm_pgoff = %lu\n",
           vma->vm_start, vma->vm_end, vma->vm_pgoff);
}
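/* nopage is the old 2.6-era fault hook; later kernels replaced it with .fault. */
struct page *test_vma_nopage(struct vm_area_struct *vma, unsigned long address, int *type)
{
    struct page *pageptr;
    unsigned long offset = vma->vm_pgoff << PAGE_SHIFT;
    /* __pa() is only valid here when buffer comes from kmalloc, not vmalloc. */
    unsigned long physaddr = __pa(buffer + address - vma->vm_start + offset);
    unsigned long pfn = physaddr >> PAGE_SHIFT;

    printk("test_vma_nopage,address = %lx,physaddr = %lx\n", address, physaddr);
    if (!pfn_valid(pfn))
        return NOPAGE_SIGBUS;
    pageptr = pfn_to_page(pfn);
    get_page(pageptr);              /* take a reference for the new mapping */
    if (type)                       /* type may be NULL: test the pointer, not *type */
        *type = VM_FAULT_MINOR;
    return pageptr;
}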
static struct vm_operations_struct test_vma_ops = {
    .open = test_vma_open,
    .close = test_vma_close,
    .nopage = test_vma_nopage,
};
static int s3c2410led_mmap(struct file *filp, struct vm_area_struct *vma)
{
    unsigned long size = vma->vm_end - vma->vm_start;
    unsigned long offset = vma->vm_pgoff << PAGE_SHIFT;

    printk("s3c2410led_mmap,size = %lu,pgoff = %lu \n", size, vma->vm_pgoff);
    /* Reject requests that would run past the end of the buffer. */
    if (offset > SIZE || size > SIZE - offset)
        return -EINVAL;
    if (offset >= __pa(high_memory) || (filp->f_flags & O_SYNC))
        vma->vm_flags |= VM_IO;
    vma->vm_flags |= VM_RESERVED;
    /* No page-table entries are created here; nopage fills them in on first access. */
    vma->vm_ops = &test_vma_ops;
    test_vma_open(vma);
    return 0;
}
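s3c2410led_mmap does not establish any mapping itself; it only attaches the custom test_vma_ops methods to the struct vm_area_struct object that describes the newly obtained virtual address range. When a page fault occurs, test_vma_nopage is called automatically.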
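The source of the ./app test program is not shown here, so the following is only a sketch of what the user-space side of such a test could look like. The helper name map_and_peek and any device path passed to it are hypothetical. It opens a file or device node, maps it with MAP_SHARED at a page offset, and reads the first byte; with a nopage-style driver, that first read is the access that triggers the fault.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: map len bytes of path starting at page number pgoff
 * and return the first mapped byte (0..255), or -1 on any failure. */
int map_and_peek(const char *path, size_t len, long pgoff)
{
    long page_size = sysconf(_SC_PAGESIZE);
    unsigned char *p;
    int fd, byte;

    assert(path != NULL);
    if (page_size <= 0 || len == 0 || pgoff < 0)
        return -1;

    fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    /* The file offset passed to mmap must be a multiple of the page size. */
    p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd,
             (off_t)pgoff * page_size);
    close(fd);                      /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return -1;

    byte = p[0];                    /* first access to the mapped page */
    munmap(p, len);
    return byte;
}
```

A call such as map_and_peek(path, 4096, 1) corresponds to the size = 4096, pgoff = 1 request seen in the log below, with path standing in for whatever device node the driver registers.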
result:
[root yaffs]$ ./app
prepare to mmappis3c2410led_mmap,size = 4096,pgoff = 1
this is test_vma_open,vm_statr = 40016000,vm_end = 40017000,vm_pgoff = 1
test_vma_nopage,address = 40016000,physaddr = 33ab5000
Bad page state in process 'app'
page:c03166a0 flags:0x00000084 mapping:00000000 mapcount:0 count:0
Trying to fix it up, but a reboot is needed
Backtrace:
[<c0024ba8>] (dump_stack+0x0/0x14) from [<c005a3dc>] (bad_page+0x68/0xa4)
[<c005a374>] (bad_page+0x0/0xa4) from [<c005a988>] (free_hot_cold_page+0x84/0x13c)
r7 = 00000000 r6 = C001A17C r5 = 00000000 r4 = C03166A0
[<c005a904>] (free_hot_cold_page+0x0/0x13c) from [<c005aa90>] (free_hot_page+0x14/0x18)
r8 = 40016000 r7 = 00000000 r6 = C001A17C r5 = C3AAF858
r4 = C03166A0
[<c005aa7c>] (free_hot_page+0x0/0x18) from [<c005dc08>] (__page_cache_release+0xb0/0xc4)
[<c005db58>] (__page_cache_release+0x0/0xc4) from [<c005dc78>] (put_page+0x5c/0x64)
[<c005dc1c>] (put_page+0x0/0x64) from [<c0062c50>] (unmap_vmas+0x3a4/0x58c)
[<c00628ac>] (unmap_vmas+0x0/0x58c) from [<c006748c>] (exit_mmap+0x70/0x124)
[<c006741c>] (exit_mmap+0x0/0x124) from [<c0034334>] (mmput+0x40/0xa4)
r7 = 00000001 r6 = C03A02E0 r5 = C001CB14 r4 = C001CAE0
[<c00342f4>] (mmput+0x0/0xa4) from [<c00388cc>] (exit_mm+0xc8/0xd4)
r4 = C001CAE0
[<c0038804>] (exit_mm+0x0/0xd4) from [<c0038e5c>] (do_exit+0x190/0x7e8)
r6 = C03A02E0 r5 = 00000000 r4 = 00000000
[<c0038ccc>] (do_exit+0x0/0x7e8) from [<c0039578>] (do_group_exit+0x8c/0x90)
[<c00394ec>] (do_group_exit+0x0/0x90) from [<c0039594>] (sys_exit_group+0x18/0x1c)
r4 = 401402FC
[<c003957c>] (sys_exit_group+0x0/0x1c) from [<c0020da0>] (ret_fast_syscall+0x0/0x2c)
this is test_vma_close,vm_statr = 40016000,vm_end = 40017000,vm_pgoff = 1
ng
this is a mmaping test,buffer physcial address = c3ab4000
!
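Something that looks like an exception did occur, but buffer was still mapped into user space successfully. My understanding is this: calling the mmap method only serves to obtain a range of virtual address space, and whether the page-table entries for that range exist yet does not matter. Because vm_ops has been set, when user space accesses a page and faults, the nopage method is called to establish the mapping, and user space can then access the page normally. Note, however, that the trace comes from process exit: exit_mmap ends up in put_page on the mapped page, the page's count drops to zero, and the allocator refuses to free it ("Bad page state"). So the mapping itself works, but the reference counting on this page is evidently not right in the nopage version, and I have not yet sorted out why.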
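The starting physical address of a mapping must be page-aligned. For example, mapping a region that starts at pa = 0x12345678 into user space cannot be done, because page tables work at page granularity. The mapping would actually start at 0x12345000, so the physical range user space really gets is [0x12345000, 0x12345fff] and the start is shifted, which makes the request meaningless. This is why the offset argument of the mmap system call must be page-aligned.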
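The arithmetic in that example can be written out as a small sketch. It assumes 4 KiB pages (a page shift of 12), which matches the numbers above; the EX_ macros and the split_phys_addr helper are names invented for this illustration, not kernel APIs.

```c
#include <assert.h>

#define EX_PAGE_SHIFT 12ul                      /* assumption: 4 KiB pages */
#define EX_PAGE_SIZE  (1ul << EX_PAGE_SHIFT)
#define EX_PAGE_MASK  (~(EX_PAGE_SIZE - 1))

struct page_split {
    unsigned long base;     /* page-aligned address the mapping really starts at */
    unsigned long offset;   /* distance of the requested address into that page */
};

/* Split an address into its page base and its offset within the page. */
struct page_split split_phys_addr(unsigned long pa)
{
    struct page_split s;

    s.base = pa & EX_PAGE_MASK;
    s.offset = pa & (EX_PAGE_SIZE - 1);
    assert(s.base + s.offset == pa);    /* the two parts always recombine */
    return s;
}
```

For pa = 0x12345678 this gives base 0x12345000 and offset 0x678. A caller that really needs the byte at 0x12345678 would map the page at the base and add the offset to the returned pointer itself.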
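The above is my own understanding, and it surely contains quite a few mistakes, so please bear with me. I am just a beginner who has been learning Linux drivers for three months. I hope everyone will point out where I went wrong so we can discuss and analyze it together, and I look forward to comments from the experts!
As an aside, this is the first blog post of my life...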