Things get more complicated once paged data comes into use. For the most part, the ability to use [page, offset, len] tuples for SKB data came about so that file system file contents could be sent directly over a socket. But, as it turns out, it is sometimes beneficial to use this for normal buffering of process sendmsg() data.
It must be understood that once paged data starts to be used on an SKB, this puts a specific restriction on all future SKB data area operations. In particular, it is no longer possible to do skb_put() operations.
There are actually two length variables associated with an SKB, len and data_len. The latter only comes into play when there is paged data in the SKB. skb->data_len tells how many bytes of paged data there are in the SKB. From this we can derive a few more things:
- The existence of paged data in an SKB is indicated by skb->data_len being non-zero. This is codified in the helper routine skb_is_nonlinear(), so that is the function you should use to test this.
- The amount of non-paged data at skb->data can be calculated as skb->len - skb->data_len. Again, there is a helper routine already defined for this called skb_headlen(), so please use that.
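The two rules above can be sketched in user space. This is only an illustration: struct mock_skb_lens and both mock_ functions are hypothetical stand-ins that mirror the arithmetic described here, not the kernel's struct sk_buff or its real helpers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in holding only the two length fields
 * discussed above. This is NOT the kernel's struct sk_buff. */
struct mock_skb_lens {
    unsigned int len;      /* total bytes: linear plus paged */
    unsigned int data_len; /* bytes held in paged fragments only */
};

/* Mirrors the test the text attributes to skb_is_nonlinear(). */
static bool mock_skb_is_nonlinear(const struct mock_skb_lens *skb)
{
    return skb->data_len != 0;
}

/* Mirrors the arithmetic the text attributes to skb_headlen(). */
static unsigned int mock_skb_headlen(const struct mock_skb_lens *skb)
{
    assert(skb->data_len <= skb->len); /* paged bytes are part of len */
    return skb->len - skb->data_len;
}
```

For instance, an SKB with len 1500 and data_len 1400 is non-linear and has 100 bytes available at skb->data.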
Each chunk of paged data in an SKB is described by the following structure:

    struct skb_frag_struct {
        struct page *page;
        __u16 page_offset;
        __u16 size;
    };

There is a pointer to the page (which you must hold a proper reference to), the offset within the page where this chunk of paged data starts, and how many bytes are there.
The paged frags are organized into an array in the shared SKB area, defined by this structure:

    #define MAX_SKB_FRAGS (65536/PAGE_SIZE + 2)

    struct skb_shared_info {
        atomic_t dataref;
        unsigned int nr_frags;
        unsigned short tso_size;
        unsigned short tso_segs;
        struct sk_buff *frag_list;
        skb_frag_t frags[MAX_SKB_FRAGS];
    };

The nr_frags member states how many frags are active in the frags[] array. The tso_size and tso_segs members are used to convey information to the device driver for TCP segmentation offload. The frag_list member is used to maintain a chain of SKBs organized for fragmentation purposes; it is _not_ used for maintaining paged data. And finally, frags[] holds the frag descriptors themselves.
A helper routine is available to help you fill in page descriptors.
void skb_fill_page_desc(struct sk_buff *skb, int i, struct page *page, int off, int size)
This fills the i'th page vector to point to page at offset off of size size. It also updates the nr_frags member to be one past i.
If you wish to simply extend an existing frag entry by some number of bytes, increment the size member by that amount.
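The behaviour just described can be modelled with a small user-space sketch. Everything prefixed mock_ or MOCK_ is hypothetical: the structures only imitate the shape of the kernel ones shown earlier, a void pointer stands in for struct page *, and the array bound is an arbitrary value chosen for the example.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_MAX_FRAGS 18 /* arbitrary bound for this sketch */

struct mock_frag {
    void *page;                 /* stands in for struct page * */
    unsigned short page_offset; /* where the chunk starts in the page */
    unsigned short size;        /* bytes in the chunk */
};

struct mock_shared_info {
    unsigned int nr_frags;
    struct mock_frag frags[MOCK_MAX_FRAGS];
};

struct mock_skb_frags {
    struct mock_shared_info shinfo;
};

/* Imitates what the text says skb_fill_page_desc() does: fill slot i
 * and set nr_frags to one past i. */
static void mock_fill_page_desc(struct mock_skb_frags *skb, int i,
                                void *page, int off, int size)
{
    assert(i >= 0 && i < MOCK_MAX_FRAGS);
    struct mock_frag *frag = &skb->shinfo.frags[i];

    frag->page = page;
    frag->page_offset = (unsigned short)off;
    frag->size = (unsigned short)size;
    skb->shinfo.nr_frags = (unsigned int)i + 1;
}

/* Extending an existing frag is just an increment of its size member. */
static void mock_extend_frag(struct mock_skb_frags *skb, int i, int extra)
{
    assert(i >= 0 && (unsigned int)i < skb->shinfo.nr_frags);
    skb->shinfo.frags[i].size =
        (unsigned short)(skb->shinfo.frags[i].size + extra);
}
```

The sketch deliberately leaves out page reference counting and any updating of the SKB length fields; it only shows how the descriptor slot and nr_frags move together.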
With all of the complications imposed by non-linear SKBs, it may seem difficult to inspect areas of a packet in a straightforward way, or to copy data out from a packet into another buffer. This is not the case. There are two helper routines available which make this pretty easy.
First, we have:
void *skb_header_pointer(const struct sk_buff *skb, int offset, int len, void *buffer)
You give it the SKB, the offset (in bytes) to the piece of data you are interested in, the number of bytes you want, and a local buffer which is to be used _only_ if the data you are interested in resides in the non-linear data area.
You are returned a pointer to the data item, or NULL if you asked for an invalid offset and len parameter. This pointer could be one of two things. First, if what you asked for is directly in the skb->data linear data area, you are given a direct pointer into there. Else, the data is copied into your buffer and you are given the buffer pointer you passed in.
Code inspecting packet headers, especially on the output path, should use this routine to read and interpret protocol headers. The netfilter layer uses this function heavily.
For larger pieces of data other than protocol headers, it may be more appropriate to use the following helper routine instead.
int skb_copy_bits(const struct sk_buff *skb, int offset, void *to, int len);
This copies the specified number of bytes, starting at the specified offset, from the given SKB into the 'to' buffer. It is used for copies of SKB data into kernel buffers, and therefore it must not be used for copying SKB data into userspace. There is another helper routine for that:
int skb_copy_datagram_iovec(const struct sk_buff *from, int offset, struct iovec *to, int size);
Here, the user's data area is described by the given IOVEC. The other parameters are nearly identical to those passed in to skb_copy_bits() above.
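To make the difference between the two kernel-buffer helpers concrete, here is a rough user-space model of their semantics as described above. It is a sketch under simplifying assumptions, not the kernel implementation: struct mock_skb_buf is a hypothetical buffer with a linear head and exactly one paged chunk, and mock_copy_bits() and mock_header_pointer() are made-up names that imitate skb_copy_bits() and skb_header_pointer().

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy buffer: a linear head followed by one "paged" chunk. */
struct mock_skb_buf {
    const unsigned char *head; /* linear area, like skb->data */
    int headlen;               /* bytes in the linear area */
    const unsigned char *frag; /* the single paged chunk */
    int fraglen;               /* bytes in the paged chunk */
};

/* Copy len bytes starting at offset into 'to', crossing from the linear
 * area into the paged chunk if needed. Returns 0, or -1 if out of range. */
static int mock_copy_bits(const struct mock_skb_buf *skb, int offset,
                          void *to, int len)
{
    unsigned char *dst = to;

    if (offset < 0 || len < 0 ||
        offset > skb->headlen + skb->fraglen - len)
        return -1;

    /* Part of the request that lies in the linear area, if any. */
    if (offset < skb->headlen) {
        int n = skb->headlen - offset;

        if (n > len)
            n = len;
        memcpy(dst, skb->head + offset, (size_t)n);
        dst += n;
        offset += n;
        len -= n;
    }

    /* Whatever remains comes from the paged chunk. */
    if (len > 0)
        memcpy(dst, skb->frag + (offset - skb->headlen), (size_t)len);
    return 0;
}

/* Return a direct pointer when the range is entirely linear; otherwise
 * copy into the caller's buffer and return that. NULL if out of range. */
static const void *mock_header_pointer(const struct mock_skb_buf *skb,
                                       int offset, int len, void *buffer)
{
    if (offset >= 0 && len >= 0 && offset <= skb->headlen - len)
        return skb->head + offset; /* no copy needed */
    if (mock_copy_bits(skb, offset, buffer, len) < 0)
        return NULL;
    return buffer;
}
```

With a 4-byte head and a 4-byte paged chunk, asking for 2 bytes at offset 1 yields a pointer into the head, while asking for 4 bytes at offset 2 straddles the boundary and yields the caller's buffer filled with the copied bytes.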