Subject: Linux Kernel Module Programming Guide (Part 5)
Linux Kernel Module Programming Guide
Author: Ori Pomerantz  Chinese translator: 谭志 (lkmpg@21cn.com)
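Translator's notes:
1. LKMPG is a free book. The English edition is distributed and modified
under the GPL version 2. To save time I translated only the gist of most
of it, so this is closer to my own study notes in Chinese than to a strict
translation, but I think that is enough. This text may also be
redistributed free of charge, but please contact me before doing so and do
not use it for commercial purposes. Given my limited ability, mistakes are
inevitable; corrections are welcome.
2. The examples in this text were tested on Linux (kernel version 2.2.10).
Your Linux must support loadable kernel modules. If it does not, enable
module support when compiling the kernel, or upgrade to a kernel version
that supports modules.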
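Chapter 5: Writing to the /proc File System
So far we have two ways to generate output from a kernel module: register
a device driver and mknod a device file, or create a /proc file. Either
one lets the kernel module tell us what we want to know. The only problem
is that we have no way to talk back to the module. The first method for
sending input to a kernel module is to write to a /proc file.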
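Because the /proc file system exists mainly to report system information
to processes, it has no special provision for input. The proc_dir_entry
struct holds no pointer to an input function, only a pointer to an output
function, so to write to a /proc file we have to use the standard file
system mechanism.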
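Linux has a standard mechanism for registering file systems. Because every
file system has its own functions for handling inode and file operations,
there is a dedicated struct, inode_operations, that holds pointers to all
of those functions, including a pointer to a file_operations struct. In
the /proc file system, whenever we register a file we may specify which
inode_operations struct is used to access it. The file_operations struct
it points to holds the pointers to our module_input and module_output
functions.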
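To make the chain of pointers concrete, here is a minimal userspace sketch in C. It is not kernel code: every demo_* struct and function is a hypothetical stand-in with only the slots needed for the illustration, and memcpy stands in for the user-memory accessors discussed later. It only shows how an entry leads to an inode operations table, then to a file operations table, then to the handler.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>   /* ssize_t */

#define DEMO_MESSAGE_LENGTH 80

static char demo_message[DEMO_MESSAGE_LENGTH];

/* "write" handler: the user writes, the module receives input. */
static ssize_t demo_module_input(const char *buf, size_t len)
{
    size_t n = len < DEMO_MESSAGE_LENGTH - 1 ? len : DEMO_MESSAGE_LENGTH - 1;

    memcpy(demo_message, buf, n);
    demo_message[n] = '\0';
    return (ssize_t)n;
}

/* "read" handler: the user reads, the module produces output. */
static ssize_t demo_module_output(char *buf, size_t len)
{
    size_t n = strlen(demo_message);

    if (n > len)
        n = len;
    memcpy(buf, demo_message, n);
    return (ssize_t)n;
}

/* Hypothetical, cut-down versions of the three structs in the text. */
struct demo_file_ops {
    ssize_t (*read)(char *buf, size_t len);
    ssize_t (*write)(const char *buf, size_t len);
};

struct demo_inode_ops {
    const struct demo_file_ops *default_file_ops;
};

struct demo_entry {
    const char *name;
    const struct demo_inode_ops *ops;
};

static const struct demo_file_ops demo_fops = {
    demo_module_output,   /* read  */
    demo_module_input     /* write */
};
static const struct demo_inode_ops demo_iops = { &demo_fops };
const struct demo_entry demo_rw_test = { "rw_test", &demo_iops };

/* Follow entry -> inode ops -> file ops -> handler, as a file system would. */
ssize_t demo_dispatch_write(const struct demo_entry *e,
                            const char *buf, size_t len)
{
    assert(e && e->ops && e->ops->default_file_ops);
    if (!e->ops->default_file_ops->write)
        return -1;   /* NULL slot: operation not supported */
    return e->ops->default_file_ops->write(buf, len);
}

ssize_t demo_dispatch_read(const struct demo_entry *e, char *buf, size_t len)
{
    assert(e && e->ops && e->ops->default_file_ops);
    if (!e->ops->default_file_ops->read)
        return -1;   /* NULL slot: operation not supported */
    return e->ops->default_file_ops->read(buf, len);
}
```

Calling demo_dispatch_write(&demo_rw_test, "hello", 5) stores the text, and a later demo_dispatch_read copies it back out. A NULL slot in the table is how an entry says it does not handle an operation.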
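Note that inside the kernel the roles of read and write are the reverse
of what you might expect. A read function is used for output (to the user
process), and a write function is used for input. The names follow the
user's point of view: when a process reads from the kernel (input for the
process), the kernel has to output something, and whatever a process
writes is received by the kernel as input. (kevintz: I hope this does not
leave anyone dizzy :)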
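Another point of interest is the module_permission function. It is called
whenever a process tries to do something with the /proc file, and it
decides whether to allow the access. Right now the decision is based only
on the operation and the user id (the example checks current->euid), but
we could base it on anything we like, for example what other processes
are doing with the same file, the time of day, or the last input received.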
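The decision itself is easy to try out in userspace. The sketch below is an assumption-laden model, not the kernel function: demo_permission and the DEMO_MAY_* constants are hypothetical names, the effective uid is passed in as a parameter instead of being read from current, and the operation is treated as a bit mask, whereas the listing in this chapter compares op for equality.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical names for the operation bits (values assumed to match
 * the 2 = write, 4 = read convention used in this chapter). */
#define DEMO_MAY_EXEC  1
#define DEMO_MAY_WRITE 2
#define DEMO_MAY_READ  4

/* Return 0 to allow the operation, -EACCES to refuse it.
 * Policy: anyone may read, only euid 0 may write, nothing else is allowed. */
int demo_permission(int op, unsigned int euid)
{
    assert(op >= 0);

    /* Execute, or any bit we do not know about: refuse. */
    if (op & ~(DEMO_MAY_READ | DEMO_MAY_WRITE))
        return -EACCES;

    /* Writing is reserved for root. */
    if ((op & DEMO_MAY_WRITE) && euid != 0)
        return -EACCES;

    return 0;
}
```

With this policy, demo_permission(DEMO_MAY_READ, 1000) allows the read while demo_permission(DEMO_MAY_WRITE, 1000) refuses the write. Other criteria, such as the time of day, would simply become extra parameters and extra checks.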
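The reason for put_user and get_user is that Linux memory is segmented.
This means a pointer by itself does not refer to a unique location in
memory, only to a location within some memory segment (the same pointer
value can name one place in the kernel segment and another in a user
segment), so you need to know which segment is meant. Linux has one
memory segment for the kernel and one for each process.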
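An ordinary process can access only its own memory segment, so programs
that run as processes do not need to worry about segments. Writing a
kernel module is different: normally it works in the kernel memory
segment, which the system handles automatically. However, when the
contents of a buffer must pass between the kernel module and a process,
the module receives an address in the process's memory segment, and you
should use the put_user and get_user macros to access it.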
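The byte-by-byte pattern this leads to can be modelled in userspace. In the sketch below, demo_get_user_byte is a hypothetical stand-in for get_user: it only copies one byte and reports success, whereas the real macro does the cross-segment access and reports a failed access through its return value. demo_store_input is likewise an illustrative name, showing a bounded copy that always leaves a terminated string.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for get_user(): fetch one byte from the "user" buffer.
 * Returns 0 on success, mirroring the convention of the real macro. */
static int demo_get_user_byte(char *dst, const char *user_src)
{
    *dst = *user_src;
    return 0;
}

/* Copy at most dst_size - 1 bytes of user input into dst and terminate it.
 * Returns the number of input bytes actually consumed. */
size_t demo_store_input(char *dst, size_t dst_size,
                        const char *user_buf, size_t length)
{
    size_t i;

    assert(dst_size > 0);   /* need room for the terminator */

    for (i = 0; i < dst_size - 1 && i < length; i++) {
        if (demo_get_user_byte(&dst[i], user_buf + i) != 0)
            break;          /* stop at the first byte we cannot read */
    }
    dst[i] = '\0';
    return i;
}
```

Given an 8-byte destination and the 11-byte input "hello world", the function consumes 7 bytes and leaves "hello w" in the buffer. Reserving the last byte for the terminator is what keeps over-long input from overflowing the buffer.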
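kevintz's note: Special thanks to fellow netizen tengel for sending me the
newer source code for this book, which saved me the trouble of fixing up
the old source. I have also removed the code for older kernel versions to
make the listing shorter and easier to read. The code is meant to compile
on kernel versions 2.2.0 and later.
Example: procfs.c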
/* procfs.c - create a "file" in /proc, which allows
 * both input and output. */
/* Copyright (C) 1998-1999 by Ori Pomerantz */
/* Copyright (C) 2000 by Kevin T.Z */
/* kevin's note: the English comments are easy to follow,
 * so I have not added Chinese ones */

/* The necessary header files */

/* Standard in kernel modules */
#include <linux/kernel.h>   /* We're doing kernel work */
#include <linux/module.h>   /* Specifically, a module */

/* Deal with CONFIG_MODVERSIONS */
#if CONFIG_MODVERSIONS==1
#define MODVERSIONS
#include <linux/modversions.h>
#endif

/* Necessary because we use proc fs */
#include <linux/proc_fs.h>
#include <asm/uaccess.h>    /* for get_user and put_user */

/* The module's file functions ********************** */

/* Here we keep the last message received, to prove
 * that we can process our input */
#define MESSAGE_LENGTH 80
static char Message[MESSAGE_LENGTH];

/* Since we use the file operations struct, we can't
 * use the special proc output provisions - we have to
 * use a standard read function, which is this function */
static ssize_t module_output(
    struct file *file,   /* The file read */
    char *buf,           /* The buffer to put data to (in the
                          * user segment) */
    size_t len,          /* The length of the buffer */
    loff_t *offset)      /* Offset in the file - ignore */
{
    static int finished = 0;
    int i;
    char message[MESSAGE_LENGTH+30];

    /* We return 0 to indicate end of file, that we have
     * no more information. Otherwise, processes will
     * continue to read from us in an endless loop. */
    if (finished) {
        finished = 0;
        return 0;
    }

    /* We use put_user to copy the string from the kernel's
     * memory segment to the memory segment of the process
     * that called us. get_user, BTW, is
     * used for the reverse. */
    sprintf(message, "Last input:%s", Message);
    for(i=0; i<len && message[i]; i++)
        put_user(message[i], buf+i);

    /* Notice, we assume here that the size of the message
     * is below len, or it will be received cut. In a real
     * life situation, if the size of the message is more
     * than len then we'd return len and on the second call
     * start filling the buffer with the len+1'th byte of
     * the message. */
    finished = 1;

    return i;  /* Return the number of bytes "read" */
}

/* This function receives input from the user when the
 * user writes to the /proc file. */
static ssize_t module_input(
    struct file *file,   /* The file itself */
    const char *buf,     /* The buffer with input */
    size_t length,       /* The buffer's length */
    loff_t *offset)      /* offset to file - ignore */
{
    int i;

    /* Put the input into Message, where module_output
     * will later be able to use it */
    for(i=0; i<MESSAGE_LENGTH-1 && i<length; i++)
        get_user(Message[i], buf+i);

    /* In version 2.2 the semantics of get_user changed,
     * it no longer returns a character, but expects a
     * variable to fill up as its first argument and a
     * user segment pointer to fill it from as its
     * second.
     *
     * The reason for this change is that the version 2.2
     * get_user can also read a short or an int. The way
     * it knows the type of the variable it should read
     * is by using sizeof, and for that it needs the
     * variable itself.
     */

    Message[i] = '\0';  /* we want a standard, zero
                         * terminated string */

    /* We need to return the number of input characters
     * used */
    return i;
}

/* This function decides whether to allow an operation
 * (return zero) or not allow it (return a non-zero
 * which indicates why it is not allowed).
 *
 * The operation can be one of the following values:
 * 1 - Execute (run the "file" - meaningless in our case)
 * 2 - Write (input to the kernel module)
 * 4 - Read (output from the kernel module)
 *
 * This is the real function that checks file
 * permissions. The permissions returned by ls -l are
 * for reference only, and can be overridden here.
 */
static int module_permission(struct inode *inode, int op)
{
    /* We allow everybody to read from our module, but
     * only root (uid 0) may write to it */
    if (op == 4 || (op == 2 && current->euid == 0))
        return 0;

    /* If it's anything else, access is denied */
    return -EACCES;
}

/* The file is opened - we don't really care about
 * that, but it does mean we need to increment the
 * module's reference count. */
int module_open(struct inode *inode, struct file *file)
{
    MOD_INC_USE_COUNT;

    return 0;
}

/* The file is closed - again, interesting only because
 * of the reference count. */
int module_close(struct inode *inode, struct file *file)
{
    MOD_DEC_USE_COUNT;

    return 0;  /* success */
}

/* Structures to register as the /proc file, with
 * pointers to all the relevant functions. */

/* File operations for our proc file. This is where we
 * place pointers to all the functions called when
 * somebody tries to do something to our file. NULL
 * means we don't want to deal with something. */
static struct file_operations File_Ops_4_Our_Proc_File =
{
    NULL,           /* lseek */
    module_output,  /* "read" from the file */
    module_input,   /* "write" to the file */
    NULL,           /* readdir */
    NULL,           /* select */
    NULL,           /* ioctl */
    NULL,           /* mmap */
    module_open,    /* Somebody opened the file */
    NULL,           /* flush, added here in version 2.2 */
    module_close,   /* Somebody closed the file */
    /* etc. etc. etc. (they are all given in
     * /usr/include/linux/fs.h). Since we don't put
     * anything here, the system will keep the default
     * data, which in Unix is zeros (NULLs when taken as
     * pointers). */
};

/* Inode operations for our proc file. We need it so
 * we'll have some place to specify the file operations
 * structure we want to use, and the function we use for
 * permissions. It's also possible to specify functions
 * to be called for anything else which could be done to
 * an inode (although we don't bother, we just put
 * NULL). */
static struct inode_operations Inode_Ops_4_Our_Proc_File =
{
    &File_Ops_4_Our_Proc_File,
    NULL,  /* create */
    NULL,  /* lookup */
    NULL,  /* link */
    NULL,  /* unlink */
    NULL,  /* symlink */
    NULL,  /* mkdir */
    NULL,  /* rmdir */
    NULL,  /* mknod */
    NULL,  /* rename */
    NULL,  /* readlink */
    NULL,  /* follow_link */
    NULL,  /* readpage */
    NULL,  /* writepage */
    NULL,  /* bmap */
    NULL,  /* truncate */
    module_permission  /* check for permissions */
};

/* Directory entry */
static struct proc_dir_entry Our_Proc_File =
{
    0,  /* Inode number - ignore, it will be filled by
         * proc_register[_dynamic] */
    7,  /* Length of the file name */
    "rw_test",  /* The file name */
    S_IFREG | S_IRUGO | S_IWUSR,
        /* File mode - this is a regular file which
         * can be read by its owner, its group, and everybody
         * else. Also, its owner can write to it.
         *
         * Actually, this field is just for reference, it's
         * module_permission that does the actual check. It
         * could use this field, but in our implementation it
         * doesn't, for simplicity. */
    1,  /* Number of links (directories where the
         * file is referenced) */
    0, 0,  /* The uid and gid for the file -
            * we give it to root */
    80,  /* The size of the file reported by ls. */
    &Inode_Ops_4_Our_Proc_File,
        /* A pointer to the inode structure for
         * the file, if we need it. In our case we
         * do, because we need a write function. */
    NULL
        /* The read function for the file. Irrelevant,
         * because we put it in the inode structure above */
};

/* Module initialization and cleanup ***** */

/* Initialize the module - register the proc file */
int init_module()
{
    /* Success if proc_register[_dynamic] is a success,
     * failure otherwise */
    /* In version 2.2, proc_register assigns a dynamic
     * inode number automatically if it is zero in the
     * structure, so there's no more need for
     * proc_register_dynamic
     */
    return proc_register(&proc_root, &Our_Proc_File);
}

/* Cleanup - unregister our file from /proc */
void cleanup_module()
{
    proc_unregister(&proc_root, Our_Proc_File.low_ino);
}
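The comment in module_output mentions that a real read function would continue from where the previous call stopped instead of relying on the finished flag. A minimal userspace sketch of that idea follows. It is an illustration under stated assumptions, not part of the module: demo_read_at is a hypothetical name, the offset is a plain size_t rather than the kernel's loff_t, and memcpy stands in for the put_user loop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>   /* ssize_t */

/* Copy the next chunk of msg into buf, starting at *offset.
 * Returns the number of bytes copied, or 0 once everything was read. */
ssize_t demo_read_at(const char *msg, size_t msg_len,
                     char *buf, size_t len, size_t *offset)
{
    size_t remaining;

    assert(msg && buf && offset);

    if (*offset >= msg_len)
        return 0;                      /* end of file */

    remaining = msg_len - *offset;
    if (len > remaining)
        len = remaining;               /* last, shorter chunk */

    memcpy(buf, msg + *offset, len);   /* stand-in for the put_user loop */
    *offset += len;                    /* remember where we stopped */
    return (ssize_t)len;
}
```

Reading the 14-byte message "Last input:abc" through a 5-byte buffer then takes calls that return 5, 5, 4 and finally 0. Because the position lives in the caller's offset rather than in a static flag, two readers do not disturb each other.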
Testing:
Run each test both as root and as an ordinary user.
1. vi /proc/rw_test
2. echo "my input is here" >/proc/rw_test
3. more /proc/rw_test
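The echo and more steps can also be driven from a small program. The helper below is only a sketch: demo_write_then_read is a hypothetical name, and it uses nothing beyond the ordinary open, write, read and close calls, so it works on any path. Pointed at /proc/rw_test with the module loaded, the write would reach module_input and the read would reach module_output.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Write input to path, then reopen the file and read it back into out.
 * Returns the number of bytes read, or -1 if any step fails
 * (for example, when permission is refused). */
ssize_t demo_write_then_read(const char *path,
                             const char *input, size_t input_len,
                             char *out, size_t out_size)
{
    int fd;
    ssize_t n;

    assert(out_size > 0);   /* need room for the terminator */

    fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;
    n = write(fd, input, input_len);
    close(fd);
    if (n < 0)
        return -1;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    n = read(fd, out, out_size - 1);
    close(fd);
    if (n < 0)
        return -1;

    out[n] = '\0';          /* make the result printable as a string */
    return n;
}
```

Run as an ordinary user against /proc/rw_test, the helper should return -1, since module_permission lets only root write.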
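Something to think about:
Work through in your head what the kernel module does, and what the proc
file system does, while we run the tests above. Exciting, isn't it? :-)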
Posted by: kevintz()  Compiled by: kevintz (2000-06-24 00:40:31), on-site mail