Whenever you look at /proc/meminfo you can see CMA listed, and it often looks completely used up. What is it for, and why can a reserved memory region end up fully consumed? Let's take a look!
CmaTotal: 438272 kB
CmaFree: 0 kB
CMA, the Contiguous Memory Allocator, is used to allocate large physically contiguous blocks of memory.
It reserves a region of physical memory:
- when no device driver is using it, the memory management subsystem uses the region to allocate and manage movable pages;
- when a device driver needs it, the region is used for contiguous allocation, and pages already handed out from it must first be migrated away or dropped. The CMA allocator is also integrated with the DMA subsystem, so drivers that use DMA do not need to call a separate CMA API.
msm-4.19/kernel/dma/contiguous.c
=> struct page *dma_alloc_from_contiguous(struct device *dev, size_t count,
unsigned int align, gfp_t gfp_mask);
=> cma_alloc(dev_get_cma_area(dev), count, align, no_warn);
DMA release likewise goes through CMA:
=> bool dma_release_from_contiguous(struct device *dev, struct page *pages,
int count);
A driver that takes memory straight from the DMA path works with the physical pages and needs no page tables of its own; but when the region is lent to the buddy system, accesses presumably still go through MMU translation.
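As a rough illustration of how a driver sits on top of these wrappers, here is a minimal sketch following the msm-4.19 signatures quoted above; the function name, page count and alignment are made up for the example, and real drivers usually reach this path indirectly through dma_alloc_coherent():
#include <linux/dma-contiguous.h>
#include <linux/gfp.h>

/* Hypothetical example: allocate 16 physically contiguous pages from the
 * device's CMA area (or the default area if the device has none), use
 * them, then hand them back. Error handling is trimmed to the minimum. */
static int example_dma_contig(struct device *dev)
{
	size_t count = 16;              /* 16 pages = 64 KiB with 4 KiB pages */
	unsigned int align = 0;         /* alignment order, in pages */
	struct page *pages;

	pages = dma_alloc_from_contiguous(dev, count, align, GFP_KERNEL);
	if (!pages)
		return -ENOMEM;

	/* ... program the device with page_to_phys(pages) ... */

	if (!dma_release_from_contiguous(dev, pages, count))
		pr_warn("pages did not come from the CMA area\n");

	return 0;
}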
Structure definition: msm-4.19/mm/cma.h
struct cma {
unsigned long base_pfn; // starting page frame number (PFN) of the CMA region
unsigned long count; // total number of pages in the region
unsigned long *bitmap; // allocation bitmap for the pages
unsigned int order_per_bit; /* Order of pages represented by one bit */
// each bit in the bitmap covers 2^order_per_bit physical pages
struct mutex lock;
#ifdef CONFIG_CMA_DEBUGFS
struct hlist_head mem_head;
spinlock_t mem_head_lock;
#endif
const char *name;
};
extern struct cma cma_areas[MAX_CMA_AREAS];
extern unsigned cma_area_count;
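To make order_per_bit and the bitmap size concrete, here is a tiny user-space calculation; the numbers are hypothetical and simply mirror the cma_bitmap_maxno() formula, one bit per 2^order_per_bit pages:
#include <stdio.h>

int main(void)
{
	/* hypothetical region: 438272 kB of CMA, 4 kB pages, order_per_bit = 0 */
	unsigned long count = 438272 / 4;        /* pages covered by the area */
	unsigned int order_per_bit = 0;          /* one bit == 2^0 = 1 page */

	/* same arithmetic as cma_bitmap_maxno(): bits = count >> order_per_bit */
	unsigned long bits = count >> order_per_bit;

	printf("pages=%lu bitmap_bits=%lu bitmap_bytes=%lu\n",
	       count, bits, (bits + 7) / 8);
	return 0;
}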
CMA creation
CMA regions are parsed from the dtsi files; on Qualcomm platforms, for example, they sit in android/vendor/xxxx/proprietary/devicetree-4.19/xxxx/kona.dtsi:
/* global autoconfigured region for contiguous allocations */
linux,cma {
compatible = "shared-dma-pool";
alloc-ranges = <0x0 0x00000000 0x0 0xffffffff>;
reusable;
alignment = <0x0 0x400000>;
size = <0x0 0x2000000>;
linux,cma-default;
};
- static int __init rmem_cma_setup(struct reserved_mem *rmem)
- cma_init_reserved_mem(rmem->base, rmem->size, 0, rmem->name, &cma); initializes the CMA area: alignment checks, then the struct cma fields are filled in
- memblock_is_region_reserved(base, size) checks whether the memory given to the CMA region has already been reserved (see the sketch below)
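How a "shared-dma-pool" node reaches this code: the handler is registered with RESERVEDMEM_OF_DECLARE, so the reserved-memory framework invokes it for every matching node at early boot. A simplified, paraphrased sketch of the 4.19 setup path (the alignment checks and the rmem->ops wiring are trimmed):
/* paraphrased from msm-4.19 kernel/dma/contiguous.c, not a verbatim copy */
static int __init rmem_cma_setup(struct reserved_mem *rmem)
{
	struct cma *cma;
	int err;

	/* base and size must sit on the CMA granularity (pageblock/MAX_ORDER
	 * aligned) and the node must carry "reusable"; otherwise it is rejected */

	err = cma_init_reserved_mem(rmem->base, rmem->size, 0, rmem->name, &cma);
	if (err)
		return err;

	/* remember the area; a node tagged "linux,cma-default" also becomes
	 * the global default CMA area returned by dev_get_cma_area(NULL) */
	rmem->priv = cma;
	return 0;
}

/* every reserved-memory node with compatible = "shared-dma-pool" is handed
 * to rmem_cma_setup() while the flattened device tree is scanned */
RESERVEDMEM_OF_DECLARE(cma, "shared-dma-pool", rmem_cma_setup);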
Adding the region to the buddy system
This is what distinguishes CMA from ordinary reserved memory: a plain carve-out (pure NHLOS reservation) is not counted in MemTotal at all, whereas CMA memory is counted in MemTotal and can actually be used, serving the allocation and management of movable pages.
cma_activate_area() is executed for every registered cma area:
static int __init cma_init_reserved_areas(void)
{
int i;
for (i = 0; i < cma_area_count; i++) {
int ret = cma_activate_area(&cma_areas[i]);
if (ret)
return ret;
}
show_mem_notifier_register(&cma_nb);
return 0;
}
core_initcall(cma_init_reserved_areas);
Inside it, __free_pages() hands the pages to the buddy system, and adjust_managed_page_count() updates the total number of pages managed by the zone; a condensed sketch follows.
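This sketch paraphrases cma_activate_area() (mm/cma.c) and init_cma_reserved_pageblock() (mm/page_alloc.c) in 4.19; the zone-spanning checks, PageReserved handling and error paths are omitted:
/* paraphrased sketch, not verbatim kernel code */
static int __init cma_activate_area(struct cma *cma)
{
	unsigned long pfn = cma->base_pfn;
	unsigned long nr_blocks = cma->count >> pageblock_order;

	/* allocate the allocation bitmap for this area */
	cma->bitmap = kzalloc(BITS_TO_LONGS(cma_bitmap_maxno(cma)) * sizeof(long),
			      GFP_KERNEL);
	if (!cma->bitmap)
		return -ENOMEM;

	/* release the whole area to the buddy allocator, one pageblock at a time */
	while (nr_blocks--) {
		init_cma_reserved_pageblock(pfn_to_page(pfn));
		pfn += pageblock_nr_pages;
	}

	mutex_init(&cma->lock);
	return 0;
}

/* mm/page_alloc.c: mark the pageblock MIGRATE_CMA, free it into buddy and
 * account it as managed memory -- this is why CmaTotal counts toward
 * MemTotal and why CmaFree can later drop to 0 */
void __init init_cma_reserved_pageblock(struct page *page)
{
	set_pageblock_migratetype(page, MIGRATE_CMA);
	__free_pages(page, pageblock_order);
	adjust_managed_page_count(page, pageblock_nr_pages);
}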
CMA allocation
/**
* cma_alloc() - allocate pages from contiguous area
* @cma: Contiguous memory region for which the allocation is performed.
* @count: Requested number of pages.
* @align: Requested alignment of pages (in PAGE_SIZE order).
* @no_warn: Avoid printing message about failed allocation
*
* This function allocates part of contiguous memory on specific
* contiguous memory area.
*/
struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
bool no_warn)
{
unsigned long mask, offset;
unsigned long pfn = -1;
unsigned long start = 0;
unsigned long bitmap_maxno, bitmap_no, bitmap_count;
size_t i;
struct page *page = NULL;
int ret = -ENOMEM;
int retry_after_sleep = 0;
int max_retries = 20;
int available_regions = 0;
mask = cma_bitmap_aligned_mask(cma, align); // bitmap mask for the requested alignment
offset = cma_bitmap_aligned_offset(cma, align); // bitmap offset of base_pfn relative to the 2^align boundary
bitmap_maxno = cma_bitmap_maxno(cma); // total number of usable bits in the bitmap
bitmap_count = cma_bitmap_pages_to_bits(cma, count); // number of bitmap bits needed for the requested pages
for (;;) {
mutex_lock(&cma->lock); // lock the cma area
/* look for a run of zero bits, i.e. a free contiguous range in the bitmap */
bitmap_no = bitmap_find_next_zero_area_off(cma->bitmap,
bitmap_maxno, start, bitmap_count, mask,
offset);
/* nothing free found: wait a bit and retry */
if (bitmap_no >= bitmap_maxno) {
if ((retry_after_sleep < max_retries) &&
(ret == -EBUSY)) {
start = 0;
/*
* update max retries if available free regions
* are less.
*/
if (available_regions < 3)
max_retries = 25;
available_regions = 0;
/*
* Page may be momentarily pinned by some other
* process which has been scheduled out, eg.
* in exit path, during unmap call, or process
* fork and so cannot be freed there. Sleep
* for 100ms and retry twice to see if it has
* been freed later.
*/
mutex_unlock(&cma->lock);
msleep(100);
retry_after_sleep++;
continue;
} else {
mutex_unlock(&cma->lock);
break;
}
}
/* found a free region, count it */
available_regions++;
/* set the bits to 1 to mark this range as in use */
bitmap_set(cma->bitmap, bitmap_no, bitmap_count);
/*
* It's safe to drop the lock here. We've marked this region for
* our exclusive use. If the migration fails we will take the
* lock again and unmark it.
* Once the bits are set the range counts as used; no other task will grab it, so the lock can be dropped.
*/
mutex_unlock(&cma->lock);
/* turn the bit offset into a physical page frame number */
pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit);
mutex_lock(&cma_mutex);
/* let the page allocator claim the range: start pfn, end pfn, migrate type, and so on */
ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA,
GFP_KERNEL | (no_warn ? __GFP_NOWARN : 0));
mutex_unlock(&cma_mutex);
if (ret == 0) {
/* success: return the struct page pointer of the first page */
page = pfn_to_page(pfn);
break;
}
/* allocation failed: clear the bitmap bits again */
cma_clear_bitmap(cma, pfn, count);
/* try again with a bit different memory target */
start = bitmap_no + mask + 1;
}
return page;
}
A struct page effectively is the physical page: converting between it and the page frame number is just an offset calculation. Do not confuse it with a process's virtual pages.
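On arm64 with SPARSEMEM_VMEMMAP the translation really is just pointer arithmetic on the vmemmap array of struct page; the generic definitions look like this:
/* include/asm-generic/memory_model.h, SPARSEMEM_VMEMMAP flavour */
#define __pfn_to_page(pfn)	(vmemmap + (pfn))
#define __page_to_pfn(page)	(unsigned long)((page) - vmemmap)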
CMA release
Release is much simpler; the comments below say it all.
/**
* cma_release() - release allocated pages
* @cma: Contiguous memory region for which the allocation is performed.
* @pages: Allocated pages.
* @count: Number of allocated pages.
*
* This function releases memory allocated by alloc_cma().
* It returns false when provided pages do not belong to contiguous area and
* true otherwise.
*/
bool cma_release(struct cma *cma, const struct page *pages, unsigned int count)
{
unsigned long pfn;
/* get the page frame number of the pages being released */
pfn = page_to_pfn(pages);
/* make sure the pfn lies inside this CMA region */
if (pfn < cma->base_pfn || pfn >= cma->base_pfn + cma->count)
return false;
/* an out-of-range release triggers a VM_BUG_ON dump */
VM_BUG_ON(pfn + count > cma->base_pfn + cma->count);
/* loop count times and __free_page each page */
free_contig_range(pfn, count);
/* clear the corresponding bits in the CMA bitmap */
cma_clear_bitmap(cma, pfn, count);
return true;
}
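Putting the two together, a caller that holds a struct cma (for example via dev_get_cma_area()) uses the pair roughly as follows; this is a hypothetical sketch, and the count and alignment are made-up values:
#include <linux/cma.h>
#include <linux/dma-contiguous.h>

static int example_cma_cycle(struct device *dev)
{
	struct cma *cma = dev_get_cma_area(dev); /* device's area or the default */
	size_t count = 8;                        /* 8 pages, made-up size */
	struct page *page;

	page = cma_alloc(cma, count, 0 /* align order */, false /* no_warn */);
	if (!page)
		return -ENOMEM;

	/* ... the pages are physically contiguous starting at page_to_phys(page) ... */

	if (!cma_release(cma, page, count))
		pr_warn("pages were not part of this CMA area\n");

	return 0;
}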
DMA and cache coherency
DMA moves data between memory and a peripheral directly, while the CPU goes through the MMU when it accesses memory. DMA cannot see the CPU's internal caches, which is where cache incoherency comes from: when the CPU reads or writes memory and hits in the cache, it does not touch DRAM at all.
When the CPU writes to memory the cache follows one of two policies, write-back or write-through; write-back is the common choice. The cache hardware uses an LRU-style policy to evict cache lines back to main memory when it needs the space.
The coherency problem is mainly handled with the following two classes of API:
- coherent DMA buffer API: dma_alloc_coherent, etc.
- streaming DMA mapping API: dma_map_sg / dma_map_single
The CPU accesses the DMA buffer through the MMU, and the page table entries can mark the region as cached or uncached.
On modern SoCs the DMA engine can be hardware cache-coherent and keep the caches in sync automatically.
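A minimal sketch of the two API families on a hypothetical device (buffer sizes and DMA direction are illustrative only):
#include <linux/dma-mapping.h>

/* Coherent buffer: mapped uncached (or kept coherent by hardware), so CPU
 * and device always see the same data and no explicit sync is needed. */
static void *coherent_example(struct device *dev, dma_addr_t *dma_handle)
{
	return dma_alloc_coherent(dev, 64 * 1024, dma_handle, GFP_KERNEL);
}

/* Streaming mapping: the buffer keeps its normal cached mapping; the map/
 * unmap (or dma_sync_*) calls clean/invalidate the cache lines around each
 * transfer instead. */
static int streaming_example(struct device *dev, void *buf, size_t len)
{
	dma_addr_t dma = dma_map_single(dev, buf, len, DMA_TO_DEVICE);

	if (dma_mapping_error(dev, dma))
		return -ENOMEM;

	/* ... kick off the transfer using 'dma' as the bus address ... */

	dma_unmap_single(dev, dma, len, DMA_TO_DEVICE);
	return 0;
}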
Original author: 辣鸡工程师
Original post: CMA预留内存 (copyright belongs to the original author; leave a message for removal in case of infringement)