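1. mmap analysis
Memory mapping. For a driver, memory mapping gives user programs the ability to access device memory directly. A character device driver can provide an mmap interface that maps the physical address range behind a region of kernel-space memory into user space a second time. The same physical memory then has two mappings, that is, two virtual addresses: one in kernel space and one in user space. The user program can then reach the physical memory directly by operating on the mapped region in user space, which improves data transfer efficiency.
(Figure borrowed from the web.)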
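The "one piece of memory, two virtual addresses" idea can be tried out entirely in user space. The sketch below is only an analogy, not driver code: it assumes a temporary file stands in for the driver's buffer, and the function name two_views_share_memory is made up for this illustration. It maps the same page twice and checks that a write through one address shows up at the other.

```c
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map one page of a temporary file twice and check that a write through
 * the first mapping is visible through the second.
 * Returns 1 if both views share the same memory, 0 if not, -1 on error.
 * The temporary file is an assumption standing in for a driver buffer. */
int two_views_share_memory(void)
{
    long page = sysconf(_SC_PAGE_SIZE);
    FILE *tmp = tmpfile();
    if (tmp == NULL || page <= 0)
        return -1;

    int fd = fileno(tmp);
    if (ftruncate(fd, page) == -1) {
        fclose(tmp);
        return -1;
    }

    char *a = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    char *b = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (a == MAP_FAILED || b == MAP_FAILED) {
        if (a != MAP_FAILED)
            munmap(a, (size_t)page);
        if (b != MAP_FAILED)
            munmap(b, (size_t)page);
        fclose(tmp);
        return -1;
    }

    strcpy(a, "hello");                        /* write through the first address */
    int shared = (a != b) && strcmp(b, "hello") == 0;  /* read through the second */

    munmap(a, (size_t)page);
    munmap(b, (size_t)page);
    fclose(tmp);
    return shared;
}
```

In a driver the two addresses would be a kernel virtual address and a user virtual address; here both are user addresses, but the sharing of the underlying page works the same way.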
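2. What mmap is for
Example: a device such as a graphics card has a large block of video memory, and the driver maps it into the kernel address space so that it is easy to work with. Without mmap, a user program that wants to draw on the screen has to allocate a buffer of at least the same size in user space, fill it with the image data, and then call the write system call so that the data is copied through kernel space to the graphics card. With mmap, the program writes into the mapped video memory directly and this extra copy is avoided.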
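To make the difference concrete, here is a rough user-space sketch of the two paths. It assumes an ordinary file stands in for the display device, and the helper names (open_standin, set_byte_write, map_display) are invented for this example. The write() path needs a system call and a copy for every update, while the mmap path turns an update into a plain store once the mapping exists.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Open (creating if needed) an ordinary file of `len` bytes. In this sketch
 * it stands in for the display device; that is an assumption. */
int open_standin(const char *path, size_t len)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd != -1 && ftruncate(fd, (off_t)len) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Without mmap: one lseek() plus one write() per update, and each write()
 * copies the data from user space into the kernel. */
int set_byte_write(int fd, off_t pos, unsigned char value)
{
    if (lseek(fd, pos, SEEK_SET) == (off_t)-1)
        return -1;
    return write(fd, &value, 1) == 1 ? 0 : -1;
}

/* With mmap: set the mapping up once; afterwards fb[pos] = value is enough.
 * Returns NULL on failure. */
unsigned char *map_display(int fd, size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

After fb = map_display(fd, len), drawing is simply fb[pos] = value, with no system call per update.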
3. Parameters of the mmap interface function
int (*mmap) (struct file *, struct vm_area_struct *);
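struct vm_area_struct is passed in by the kernel after the user-space system call. During the system call the kernel has already worked out the user-space address range and stored it here. The actual mapping uses only some of the fields of this structure: the start address, the size of the region, and similar information can be read straight from it, because the kernel has already filled them in.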
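As a small illustration of which fields a simple handler typically reads, the sketch below uses a mock structure, not the real kernel type. The names demo_vma, DEMO_PAGE_SHIFT, and the three helpers are hypothetical, and a 4 KiB page size is assumed.

```c
#define DEMO_PAGE_SHIFT 12   /* assumption: 4 KiB pages */

/* Hypothetical stand-in holding only the fields a simple mmap handler reads.
 * The field names mirror struct vm_area_struct, but this is not the kernel type. */
struct demo_vma {
    unsigned long vm_start;   /* first address of the region */
    unsigned long vm_end;     /* first byte after the end of the region */
    unsigned long vm_pgoff;   /* offset requested by the caller, in pages */
};

/* Size of the region in bytes. */
unsigned long demo_vma_size(const struct demo_vma *vma)
{
    return vma->vm_end - vma->vm_start;
}

/* Number of pages covered by the region. */
unsigned long demo_vma_pages(const struct demo_vma *vma)
{
    return demo_vma_size(vma) >> DEMO_PAGE_SHIFT;
}

/* The caller's offset converted from pages back to bytes. */
unsigned long demo_vma_offset_bytes(const struct demo_vma *vma)
{
    return vma->vm_pgoff << DEMO_PAGE_SHIFT;
}
```

Because vm_end points one byte past the region, the size is simply vm_end - vm_start with no +1 correction.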
/*
* This struct defines a memory VMM memory area. There is one of these
* per VM-area/task. A VM area is any part of the process virtual memory
* space that has a special rule for the page-fault handlers (ie a shared
* library, the executable area etc).
*/
struct vm_area_struct {
struct mm_struct * vm_mm; /* The address space we belong to. */
unsigned long vm_start; /* Our start address within vm_mm. */
unsigned long vm_end; /* The first byte after our end address
within vm_mm. */
/* linked list of VM areas per task, sorted by address */
struct vm_area_struct *vm_next;
pgprot_t vm_page_prot; /* Access permissions of this VMA. */
unsigned long vm_flags; /* Flags, see mm.h. */
struct rb_node vm_rb;
/*
* For areas with an address space and backing store,
* linkage into the address_space->i_mmap prio tree, or
* linkage to the list of like vmas hanging off its node, or
* linkage of vma in the address_space->i_mmap_nonlinear list.
*/
union {
struct {
struct list_head list;
void *parent; /* aligns with prio_tree_node parent */
struct vm_area_struct *head;
} vm_set;
struct raw_prio_tree_node prio_tree_node;
} shared;
/*
* A file's MAP_PRIVATE vma can be in both i_mmap tree and anon_vma
* list, after a COW of one of the file pages. A MAP_SHARED vma
* can only be in the i_mmap tree. An anonymous MAP_PRIVATE, stack
* or brk vma (with NULL file) can only be in an anon_vma list.
*/
struct list_head anon_vma_node; /* Serialized by anon_vma->lock */
struct anon_vma *anon_vma; /* Serialized by page_table_lock */
/* Function pointers to deal with this struct. */
struct vm_operations_struct * vm_ops;
/* Information about our backing store: */
unsigned long vm_pgoff; /* Offset (within vm_file) in PAGE_SIZE
units, *not* PAGE_CACHE_SIZE */
struct file * vm_file; /* File we map to (can be NULL). */
void * vm_private_data; /* was vm_pte (shared mem) */
unsigned long vm_truncate_count;/* truncate_count or restart_addr */
#ifndef CONFIG_MMU
struct vm_region *vm_region; /* NOMMU mapping region */
#endif
#ifdef CONFIG_NUMA
struct mempolicy *vm_policy; /* NUMA policy for the VMA */
#endif
};
The mapping function used inside mmap:
int remap_pfn_range(struct vm_area_struct *, unsigned long addr,
unsigned long pfn, unsigned long size, pgprot_t);
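/*
addr - user-space address at which the mapping starts
pfn - page frame number of the physical memory to map
size - size of the mapping in bytes
pgprot_t - page protection; it is stored in vm_area_struct (vm_page_prot), so just take the value from there
*/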
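The pfn argument is just the physical address with the in-page offset bits dropped. The arithmetic can be sketched as follows, assuming 4 KiB pages; the DEMO_ macros and demo_ helpers are names invented for this illustration and are not kernel APIs.

```c
#define DEMO_PAGE_SHIFT 12                       /* assumption: 4 KiB pages */
#define DEMO_PAGE_SIZE  (1UL << DEMO_PAGE_SHIFT)

/* Physical address -> page frame number (drops the offset inside the page). */
unsigned long demo_phys_to_pfn(unsigned long phys)
{
    return phys >> DEMO_PAGE_SHIFT;
}

/* Page frame number -> physical address of the start of that page. */
unsigned long demo_pfn_to_phys(unsigned long pfn)
{
    return pfn << DEMO_PAGE_SHIFT;
}

/* Non-zero if addr sits exactly on a page boundary. */
int demo_is_page_aligned(unsigned long addr)
{
    return (addr & (DEMO_PAGE_SIZE - 1)) == 0;
}
```

One consequence worth keeping in mind: if the buffer does not start on a page boundary, the shift discards the low bits, so the mapping begins at the start of the page that contains the buffer rather than at the buffer itself.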
4. mmap implementation
static int my_map(struct file *filp, struct vm_area_struct *vma)
{
    unsigned long page;
    unsigned char i;
    unsigned long start = (unsigned long)vma->vm_start;
    unsigned long size = (unsigned long)(vma->vm_end - vma->vm_start);

    // Get the physical address (buffer and array are defined elsewhere in the driver)
    page = virt_to_phys(buffer);
    // Map the user-space vma onto the contiguous physical pages starting at page.
    // The third argument is the page frame number: the physical address shifted
    // right by PAGE_SHIFT. The protection value is taken from the vma itself.
    if (remap_pfn_range(vma, start, page >> PAGE_SHIFT, size, vma->vm_page_prot))
        return -EAGAIN;
    // Write 10 bytes of data into this memory
    for (i = 0; i < 10; i++)
        buffer[i] = array[i];
    return 0;
}
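5. Calling mmap from a user program
// Memory mapping. offset is the offset into the mapped object (for a device, the offset within its physical address range); it must be a multiple of the page size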
void *mmap(void *addr, size_t length, int prot, int flags,
int fd, off_t offset);
p_map = (unsigned char *)mmap(0, PAGE_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED,fd, 0);
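A slightly fuller version of this call might look like the sketch below. It is written as a helper, read_mapped, a name invented here, so that it can be pointed either at the driver's device node (whose path is not given in this article) or at an ordinary file for testing. It uses sysconf(_SC_PAGE_SIZE) because PAGE_SIZE is a kernel macro and is usually not defined for user programs.

```c
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map the first page of `path` and copy `n` bytes from the start of the
 * mapping into `out`. Returns 0 on success, -1 on error.
 * `path` may be a device node or, for testing, an ordinary file. */
int read_mapped(const char *path, unsigned char *out, size_t n)
{
    long page = sysconf(_SC_PAGE_SIZE);
    if (page <= 0 || n > (size_t)page)
        return -1;

    int fd = open(path, O_RDWR);
    if (fd == -1)
        return -1;

    unsigned char *p_map = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                                MAP_SHARED, fd, 0);
    close(fd);                      /* the mapping stays valid after close() */
    if (p_map == MAP_FAILED)
        return -1;

    memcpy(out, p_map, n);          /* e.g. the 10 bytes the driver wrote */
    return munmap(p_map, (size_t)page);
}
```

Always compare the result of mmap against MAP_FAILED rather than NULL, and release the mapping with munmap when it is no longer needed.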
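6. mmap result
Operating on the address returned by mmap in user space is equivalent to operating on the physical memory that the driver mapped, which is the same memory that sits behind the kernel virtual address returned by kmalloc.
7. Function description from man mmap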
SYNOPSIS
#include <sys/mman.h>
void *mmap(void *addr, size_t length, int prot, int flags,
int fd, off_t offset);
int munmap(void *addr, size_t length);
DESCRIPTION
mmap() creates a new mapping in the virtual address space of the calling
process. The starting address for the new mapping is specified in
addr. The length argument specifies the length of the mapping.
If addr is NULL, then the kernel chooses the address at which to create
the mapping; this is the most portable method of creating a new mapping.
If addr is not NULL, then the kernel takes it as a hint about
where to place the mapping; on Linux, the mapping will be created at a
nearby page boundary. The address of the new mapping is returned as
the result of the call.
The contents of a file mapping (as opposed to an anonymous mapping; see
MAP_ANONYMOUS below), are initialized using length bytes starting at
offset offset in the file (or other object) referred to by the file
descriptor fd. offset must be a multiple of the page size as returned
by sysconf(_SC_PAGE_SIZE).
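The page-alignment rule for offset is easy to satisfy by rounding the wanted offset down to a page boundary. A minimal sketch follows; the helper name page_align_down is made up for this illustration.

```c
#include <sys/types.h>
#include <unistd.h>

/* Round `offset` down to the nearest multiple of the page size, so that the
 * result is acceptable as the offset argument of mmap(). */
off_t page_align_down(off_t offset)
{
    off_t page = (off_t)sysconf(_SC_PAGE_SIZE);
    return offset & ~(page - 1);    /* the page size is a power of two */
}
```

When mapping from the aligned offset, the byte that was originally wanted is found at addr + (offset - page_align_down(offset)), and the mapping length has to grow by the same difference.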
The prot argument describes the desired memory protection of the mapping
(and must not conflict with the open mode of the file). It is
either PROT_NONE or the bitwise OR of one or more of the following
flags:
PROT_EXEC Pages may be executed.
PROT_READ Pages may be read.
PROT_WRITE Pages may be written.
PROT_NONE Pages may not be accessed.
The flags argument determines whether updates to the mapping are visible
to other processes mapping the same region, and whether updates are
carried through to the underlying file. This behavior is determined by
including exactly one of the following values in flags:
MAP_SHARED
Share this mapping. Updates to the mapping are visible to other
processes that map this file, and are carried through to the
underlying file. (To precisely control when updates are carried
through to the underlying file requires the use of msync(2).)
MAP_PRIVATE
Create a private copy-on-write mapping. Updates to the mapping
are not visible to other processes mapping the same file, and
are not carried through to the underlying file. It is unspecified
whether changes made to the file after the mmap() call are
visible in the mapped region.
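The difference between MAP_SHARED and MAP_PRIVATE can be observed with a small experiment on an ordinary file. The sketch below assumes a Linux system, and the function name store_then_read_file is invented for this example. It stores a byte through a mapping created with the given flag and then reads the file back with pread() to see whether the store reached it.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Store `value` into byte 0 of `path` through a mapping created with `flags`
 * (MAP_SHARED or MAP_PRIVATE), then read byte 0 of the file with pread().
 * Returns the byte found in the file, or -1 on error. */
int store_then_read_file(const char *path, int flags, unsigned char value)
{
    long page = sysconf(_SC_PAGE_SIZE);
    int fd = open(path, O_RDWR);
    if (fd == -1 || page <= 0)
        return -1;

    unsigned char *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                            flags, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    p[0] = value;                   /* the store goes through the mapping */

    unsigned char in_file = 0;
    ssize_t got = pread(fd, &in_file, 1, 0);

    munmap(p, (size_t)page);
    close(fd);
    return got == 1 ? in_file : -1;
}
```

With MAP_PRIVATE the store lands in a copy-on-write page, so the file keeps its old byte; with MAP_SHARED the file shows the new byte. A driver mapping like the one in this article is meant to be shared with the kernel, which is why the user program passes MAP_SHARED.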
In addition, zero or more of the following values can be ORed in flags:
MAP_32BIT (since Linux 2.4.20, 2.6)
Put the mapping into the first 2 Gigabytes of the process
address space. This flag is supported only on x86-64, for
64-bit programs. It was added to allow thread stacks to be
allocated somewhere in the first 2GB of memory, so as to improve
context-switch performance on some early 64-bit processors.
Modern x86-64 processors no longer have this performance problem,
so use of this flag is not required on those systems. The
MAP_32BIT flag is ignored when MAP_FIXED is set.
MAP_ANON
Synonym for MAP_ANONYMOUS. Deprecated.
MAP_ANONYMOUS
The mapping is not backed by any file; its contents are initialized
to zero. The fd and offset arguments are ignored; however,
some implementations require fd to be -1 if MAP_ANONYMOUS (or
MAP_ANON) is specified, and portable applications should ensure
this. The use of MAP_ANONYMOUS in conjunction with MAP_SHARED
is supported on Linux only since kernel 2.4.
MAP_DENYWRITE
This flag is ignored. (Long ago, it signaled that attempts to
write to the underlying file should fail with ETXTBUSY. But
this was a source of denial-of-service attacks.)
MAP_EXECUTABLE
This flag is ignored.
MAP_FILE
Compatibility flag. Ignored.
MAP_FIXED
Don't interpret addr as a hint: place the mapping at exactly
that address. addr must be a multiple of the page size. If the
memory region specified by addr and len overlaps pages of any
existing mapping(s), then the overlapped part of the existing
mapping(s) will be discarded. If the specified address cannot
be used, mmap() will fail. Because requiring a fixed address
for a mapping is less portable, the use of this option is
discouraged.
MAP_GROWSDOWN
Used for stacks. Indicates to the kernel virtual memory system
that the mapping should extend downward in memory.
MAP_HUGETLB (since Linux 2.6.32)
Allocate the mapping using "huge pages." See the Linux kernel
source file Documentation/vm/hugetlbpage.txt for further information,
as well as NOTES, below.
MAP_HUGE_2MB, MAP_HUGE_1GB (since Linux 3.8)
Used in conjunction with MAP_HUGETLB to select alternative
hugetlb page sizes (respectively, 2 MB and 1 GB) on systems that
support multiple hugetlb page sizes.
More generally, the desired huge page size can be configured by
encoding the base-2 logarithm of the desired page size in the
six bits at the offset MAP_HUGE_SHIFT. (A value of zero in this
bit field provides the default huge page size; the default huge
page size can be discovered via the Hugepagesize field exposed
by /proc/meminfo.) Thus, the above two constants are defined
as:
#define MAP_HUGE_2MB (21 << MAP_HUGE_SHIFT)
#define MAP_HUGE_1GB (30 << MAP_HUGE_SHIFT)
The range of huge page sizes that are supported by the system
can be discovered by listing the subdirectories in
/sys/kernel/mm/hugepages.
MAP_LOCKED (since Linux 2.5.37)
Mark the mmaped region to be locked in the same way as mlock(2).
This implementation will try to populate (prefault) the whole
range but the mmap call doesn't fail with ENOMEM if this fails.
Therefore major faults might happen later on. So the semantic
is not as strong as mlock(2). One should use mmap(2) plus
mlock(2) when major faults are not acceptable after the initialization
of the mapping. The MAP_LOCKED flag is ignored in older
kernels.
MAP_NONBLOCK (since Linux 2.5.46)
Only meaningful in conjunction with MAP_POPULATE. Don't perform
read-ahead: create page tables entries only for pages that are
already present in RAM. Since Linux 2.6.23, this flag causes
MAP_POPULATE to do nothing. One day, the combination of
MAP_POPULATE and MAP_NONBLOCK may be reimplemented.
MAP_NORESERVE
Do not reserve swap space for this mapping. When swap space is
reserved, one has the guarantee that it is possible to modify
the mapping. When swap space is not reserved one might get
SIGSEGV upon a write if no physical memory is available. See
also the discussion of the file /proc/sys/vm/overcommit_memory
in proc(5). In kernels before 2.6, this flag had effect only
for private writable mappings.
MAP_POPULATE (since Linux 2.5.46)
Populate (prefault) page tables for a mapping. For a file mapping,
this causes read-ahead on the file. This will help to
reduce blocking on page faults later. MAP_POPULATE is supported
for private mappings only since Linux 2.6.23.
MAP_STACK (since Linux 2.6.27)
Allocate the mapping at an address suitable for a process or
thread stack. This flag is currently a no-op, but is used in
the glibc threading implementation so that if some architectures
require special treatment for stack allocations, support can
later be transparently implemented for glibc.
MAP_UNINITIALIZED (since Linux 2.6.33)
Don't clear anonymous pages. This flag is intended to improve
performance on embedded devices. This flag is honored only if
the kernel was configured with the CONFIG_MMAP_ALLOW_UNINITIALIZED
option. Because of the security implications, that option
is normally enabled only on embedded devices (i.e., devices
where one has complete control of the contents of user memory).
Of the above flags, only MAP_FIXED is specified in POSIX.1-2001 and
POSIX.1-2008. However, most systems also support MAP_ANONYMOUS (or its
synonym MAP_ANON).
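8. An unexpected bonus
While reading the mmap manual page I found that it includes a usage example, explained in detail.
The lesson: when a library function is unclear, look up the example in its man page first. It is accurate and quick.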
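#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define handle_error(msg) \
    do { perror(msg); exit(EXIT_FAILURE); } while (0)

int
main(int argc, char *argv[])
{
    char *addr;
    int fd;
    struct stat sb;
    off_t offset, pa_offset;
    size_t length;
    ssize_t s;

    if (argc < 3 || argc > 4) {
        fprintf(stderr, "%s file offset [length]\n", argv[0]);
        exit(EXIT_FAILURE);
    }

    fd = open(argv[1], O_RDONLY);
    if (fd == -1)
        handle_error("open");
    if (fstat(fd, &sb) == -1)           /* To obtain file size */
        handle_error("fstat");

    offset = atoi(argv[2]);
    /* offset for mmap() must be page aligned */
    pa_offset = offset & ~(sysconf(_SC_PAGE_SIZE) - 1);
    if (offset >= sb.st_size) {
        fprintf(stderr, "offset is past end of file\n");
        exit(EXIT_FAILURE);
    }

    if (argc == 4) {
        length = atoi(argv[3]);
        if (offset + length > sb.st_size)
            length = sb.st_size - offset;   /* Can't display bytes past end of file */
    } else {                                /* No length arg ==> display to end of file */
        length = sb.st_size - offset;
    }

    addr = mmap(NULL, length + offset - pa_offset, PROT_READ,
                MAP_PRIVATE, fd, pa_offset);
    if (addr == MAP_FAILED)
        handle_error("mmap");

    s = write(STDOUT_FILENO, addr + offset - pa_offset, length);
    if (s != length) {
        if (s == -1)
            handle_error("write");
        fprintf(stderr, "partial write");
        exit(EXIT_FAILURE);
    }

    exit(EXIT_SUCCESS);
}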