- How a slave driver uses DMA
1.Slave-DMA
By direction, DMA transfers fall into four categories:
- memory到device
- device到memory
- device到device
- memory到memory
From the Linux kernel's point of view, all peripherals are slaves, so transfers that involve a device (MEM2DEV, DEV2MEM, DEV2DEV) are called Slave-DMA transfers, while memory-to-memory transfers are called Async TX.
2.Slave Driver
Below is a guide for device driver writers on how to use the Slave-DMA API of the DMA Engine; it is applicable to slave DMA usage only.
Slave DMA usage consists of the following steps:
- Allocate a DMA slave channel
- Set slave and controller specific parameters
- Get a descriptor for transaction
- Submit the transaction
- Issue pending requests and wait for callback notification
Refer to Documentation/driver-api/dmaengine/client.rst for details.
2.1. Allocate DMA channel
Before starting a DMA transfer, every consumer must request a DMA channel. A channel is represented by struct dma_chan:
struct dma_chan {
	struct dma_device *device;
	dma_cookie_t cookie;
	dma_cookie_t completed_cookie;

	/* sysfs */
	int chan_id;
	struct dma_chan_dev *dev;

	struct list_head device_node;
	struct dma_chan_percpu __percpu *local;
	int client_count;
	int table_count;

	/* DMA router */
	struct dma_router *router;
	void *route_data;

	void *private;
};
Allocate DMA channel functions:
Channel allocation is slightly different in the slave DMA context: client drivers typically need a channel from one particular DMA controller, and in some cases even a specific channel is desired. API interface:
struct dma_chan *dma_request_chan(struct device *dev, const char *name);
struct dma_chan *dma_request_chan_by_mask(const dma_cap_mask_t *mask);
- dma_request_chan_by_mask() requests any channel matching the given capabilities; it can be used to request a channel for memcpy, memset, xor, etc., where no hardware synchronization is needed.
- dma_request_chan() requests a slave channel. It tries to find the channel via DT or ACPI; if the kernel booted in non-DT/ACPI mode, it falls back to a filter lookup table and retrieves the needed information from the dma_slave_map provided by the DMA driver.
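Putting the allocation step into code, a client driver's probe routine might look like the sketch below. The channel name "tx" is an assumption; it must match the dma-names entry in the device tree (or the ACPI/dma_slave_map equivalent).

```c
#include <linux/dmaengine.h>

struct dma_chan *chan;

/* Request the slave channel named "tx" bound to this device. */
chan = dma_request_chan(dev, "tx");
if (IS_ERR(chan))
	return PTR_ERR(chan);	/* may be -EPROBE_DEFER; propagate it */
```

Propagating the error code unchanged matters because dma_request_chan() can return -EPROBE_DEFER when the DMA controller has not probed yet, and the driver core will then retry the probe later.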
The struct dma_device structure has a field called cap_mask that holds the various transaction types supported; this mask is modified with the dma_cap_set() function, passing flags for the transaction types supported.
The transaction types are:
enum dma_transaction_type {
	DMA_MEMCPY,
	DMA_XOR,
	DMA_PQ,
	DMA_XOR_VAL,
	DMA_PQ_VAL,
	DMA_MEMSET,
	DMA_INTERRUPT,
	DMA_SG,
	DMA_PRIVATE,
	DMA_ASYNC_TX,
	DMA_SLAVE,
	DMA_CYCLIC,
	DMA_INTERLEAVE,
	/* last transaction type for creation of the capabilities mask */
	DMA_TX_TYPE_END,
};
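On the capability-mask path, a client that only needs a memory-to-memory copy channel can build such a mask itself and pass it to dma_request_chan_by_mask(). A minimal sketch:

```c
#include <linux/dmaengine.h>

dma_cap_mask_t mask;
struct dma_chan *chan;

dma_cap_zero(mask);		/* clear every capability bit */
dma_cap_set(DMA_MEMCPY, mask);	/* ask for a memcpy-capable channel */

chan = dma_request_chan_by_mask(&mask);
if (IS_ERR(chan))
	return PTR_ERR(chan);
```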
Currently, the types available are:
* DMA_MEMCPY
- The device is able to do memory to memory copies
* DMA_XOR
- The device is able to perform XOR operations on memory areas
- Used to accelerate XOR intensive tasks, such as RAID5
* DMA_XOR_VAL
- The device is able to perform parity check using the XOR
algorithm against a memory buffer.
* DMA_PQ
- The device is able to perform RAID6 P+Q computations, P being a
simple XOR, and Q being a Reed-Solomon algorithm.
* DMA_PQ_VAL
- The device is able to perform parity check using RAID6 P+Q
algorithm against a memory buffer.
* DMA_INTERRUPT
- The device is able to trigger a dummy transfer that will
generate periodic interrupts
- Used by the client drivers to register a callback that will be
called on a regular basis through the DMA controller interrupt
* DMA_PRIVATE
    - The device only supports slave transfers, and as such isn't
available for async transfers.
* DMA_ASYNC_TX
- Must not be set by the device, and will be set by the framework
if needed
- /* TODO: What is it about? */
* DMA_SLAVE
- The device can handle device to memory transfers, including
scatter-gather transfers.
- While in the mem2mem case we were having two distinct types to
deal with a single chunk to copy or a collection of them, here,
we just have a single transaction type that is supposed to
handle both.
- If you want to transfer a single contiguous memory buffer,
simply build a scatter list with only one item.
* DMA_CYCLIC
- The device can handle cyclic transfers.
- A cyclic transfer is a transfer where the chunk collection will
loop over itself, with the last item pointing to the first.
- It's usually used for audio transfers, where you want to operate
on a single ring buffer that you will fill with your audio data.
* DMA_INTERLEAVE
- The device supports interleaved transfer.
- These transfers can transfer data from a non-contiguous buffer
to a non-contiguous buffer, opposed to DMA_SLAVE that can
transfer data from a non-contiguous data set to a continuous
destination buffer.
- It's usually used for 2d content transfers, in which case you
want to transfer a portion of uncompressed data directly to the
display to print it
Release DMA channel function:
void dma_release_channel(struct dma_chan *chan);
2.2.Set slave and controller specific parameters
The next step is always to pass some specific information to the DMA driver. Most of the generic information a slave DMA can use is in struct dma_slave_config. It allows the client to specify the DMA direction, DMA addresses, bus widths, DMA burst lengths, etc. for the peripheral.
If a DMA controller has more parameters to be passed, it should embed struct dma_slave_config in its controller-specific structure. That gives the client the flexibility to pass more parameters, if required.
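A sketch of that embedding convention; the structure name and the extra field are hypothetical:

```c
/* Hypothetical controller-specific config: the generic
 * dma_slave_config comes first, followed by extra knobs
 * this particular controller understands. */
struct my_dma_slave_config {
	struct dma_slave_config cfg;	/* generic part, passed through */
	u32 hw_request_line;		/* controller-specific extra */
};
```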
int dmaengine_slave_config(struct dma_chan *chan, struct dma_slave_config *config);
struct dma_slave_config {
	enum dma_transfer_direction direction;
	dma_addr_t src_addr;
	dma_addr_t dst_addr;
	enum dma_slave_buswidth src_addr_width;
	enum dma_slave_buswidth dst_addr_width;
	u32 src_maxburst;
	u32 dst_maxburst;
	bool device_fc;
	unsigned int slave_id;
};
- direction: transfer direction, one of:
DMA_MEM_TO_MEM,
DMA_MEM_TO_DEV,
DMA_DEV_TO_MEM,
DMA_DEV_TO_DEV,
- src_addr: the address to copy from
- dst_addr: the address to copy to
- src_addr_width: width of a single transfer unit on the source side; possible values:
enum dma_slave_buswidth {
DMA_SLAVE_BUSWIDTH_UNDEFINED = 0,
DMA_SLAVE_BUSWIDTH_1_BYTE = 1,
DMA_SLAVE_BUSWIDTH_2_BYTES = 2,
DMA_SLAVE_BUSWIDTH_3_BYTES = 3,
DMA_SLAVE_BUSWIDTH_4_BYTES = 4,
DMA_SLAVE_BUSWIDTH_8_BYTES = 8,
DMA_SLAVE_BUSWIDTH_16_BYTES = 16,
DMA_SLAVE_BUSWIDTH_32_BYTES = 32,
DMA_SLAVE_BUSWIDTH_64_BYTES = 64,
};
- src_maxburst: maximum burst length on the source side, in units of src_addr_width; bursting reduces the number of memory accesses
- dst_maxburst: maximum burst length on the destination side
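The fields above can be filled in like this for a memory-to-device transfer; the FIFO address, bus width, and burst size are assumptions for illustration and must come from the peripheral's datasheet:

```c
#include <linux/dmaengine.h>

struct dma_slave_config cfg = { };
int ret;

cfg.direction      = DMA_MEM_TO_DEV;
cfg.dst_addr       = fifo_dma_addr;		 /* peripheral FIFO (bus address) */
cfg.dst_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES; /* 4 bytes moved per beat */
cfg.dst_maxburst   = 8;				 /* up to 8 beats per burst */

ret = dmaengine_slave_config(chan, &cfg);
if (ret)
	return ret;
```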
2.3.Get a descriptor for transaction
For slave usage the various modes of slave transfers supported by the DMA-engine are:
- slave_sg: DMA a list of scatter gather buffers from/to a peripheral
- dma_cyclic: Perform a cyclic DMA operation from/to a peripheral till the operation is explicitly stopped.
- interleaved_dma: common to slave as well as M2M clients. For slave usage, the address of the device's FIFO may already be known to the driver. Various types of operations can be expressed by setting the appropriate members of struct dma_interleaved_template.
struct dma_async_tx_descriptor {
	dma_cookie_t cookie;
	enum dma_ctrl_flags flags; /* not a 'long' to pack with cookie */
	dma_addr_t phys;
	struct dma_chan *chan;
	dma_cookie_t (*tx_submit)(struct dma_async_tx_descriptor *tx);
	int (*desc_free)(struct dma_async_tx_descriptor *tx);
	dma_async_tx_callback callback;
	dma_async_tx_callback_result callback_result;
	void *callback_param;
	struct dmaengine_unmap_data *unmap;
#ifdef CONFIG_ASYNC_TX_ENABLE_CHANNEL_SWITCH
	struct dma_async_tx_descriptor *next;
	struct dma_async_tx_descriptor *parent;
	spinlock_t lock;
#endif
};
// DMA between a list of scatter-gather buffers and the device
struct dma_async_tx_descriptor *dmaengine_prep_slave_sg(
		struct dma_chan *chan, struct scatterlist *sgl,
		unsigned int sg_len, enum dma_transfer_direction direction,
		unsigned long flags);
// Commonly used for audio: transfers the buffer described by buf_addr/buf_len in a loop, invoking the completion callback each time period_len bytes have been transferred
struct dma_async_tx_descriptor *dmaengine_prep_dma_cyclic(
		struct dma_chan *chan, dma_addr_t buf_addr, size_t buf_len,
		size_t period_len, enum dma_transfer_direction direction,
		unsigned long flags);
// Performs non-contiguous, interleaved DMA transfers; typically used for image processing and display
struct dma_async_tx_descriptor *dmaengine_prep_interleaved_dma(
		struct dma_chan *chan, struct dma_interleaved_template *xt,
		unsigned long flags);
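A sketch of preparing a scatter-gather descriptor and attaching a completion callback; sgl, sg_len, my_dma_complete, and my_data are placeholders for the driver's own scatterlist and callback:

```c
#include <linux/dmaengine.h>

struct dma_async_tx_descriptor *desc;

desc = dmaengine_prep_slave_sg(chan, sgl, sg_len, DMA_MEM_TO_DEV,
			       DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
if (!desc)
	return -ENOMEM;

/* Called from the controller's tasklet once the transfer completes. */
desc->callback = my_dma_complete;
desc->callback_param = my_data;
```

DMA_PREP_INTERRUPT asks the controller to raise an interrupt (and thus run the callback) on completion; DMA_CTRL_ACK marks the descriptor as reusable by the engine once it has completed.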
2.4.Submit the transaction
Once the descriptor has been prepared and the callback information added, it must be placed on the DMA engine driver's pending queue.
//dmaengine_submit() will not start the DMA operation, it merely adds it to the pending queue.
dma_cookie_t dmaengine_submit(struct dma_async_tx_descriptor *desc);
2.5.Issue pending DMA requests and wait for callback notification
The transactions in the pending queue can be activated by calling the issue_pending API. If the channel is idle, the first transaction in the queue is started and subsequent ones are queued up.
On completion of each DMA operation, the next in queue is started and a tasklet triggered. The tasklet will then call the client driver completion callback routine for notification, if set.
void dma_async_issue_pending(struct dma_chan *chan);
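Steps 2.4 and 2.5 together look like the sketch below; desc is the descriptor prepared earlier, and the returned cookie can later be used to query completion status:

```c
#include <linux/dmaengine.h>

dma_cookie_t cookie;

cookie = dmaengine_submit(desc);	/* queue only; nothing starts yet */
if (dma_submit_error(cookie))
	return -EIO;

dma_async_issue_pending(chan);		/* actually kick off the queue */
```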
2.6. Wait for callback notification
After the transfer request has been submitted, the client driver learns of completion through the callback. Alternatively, it can test whether the transfer has finished with APIs such as dma_async_is_tx_complete().
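A polling sketch using dma_async_is_tx_complete(), as an alternative to the callback; cookie is the value returned by dmaengine_submit():

```c
#include <linux/dmaengine.h>

enum dma_status status;

/* The last two arguments may receive the last-completed and last-used
 * cookies; pass NULL when they are not needed. */
status = dma_async_is_tx_complete(chan, cookie, NULL, NULL);
if (status == DMA_COMPLETE)
	pr_debug("transfer done\n");
```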
References:
- Documentation/dmaengine.txt
- Documentation/driver-api/dmaengine/provider.rst
- https://www.kernel.org/doc/Documentation/dmaengine/provider.txt
- https://www.kernel.org/doc/html/v4.15/driver-api/dmaengine/client.html