Original post, 2007-09-20 14:11:00

When I first started in the Solaris group, I was faced with two equally difficult tasks: learning the development model, and understanding the source code. For both these tasks, the recommended method is usually picking a small bug and working through the process. For the curious, the first bug I putback to ON was 4912227 (ptree call returns zero on failure), a simple bug with near zero risk. It was the first step down a very long road.

As another first step, someone suggested adding a very simple system call to the kernel. This turned out to be a whole lot harder than one would expect, and has so many subtle aspects that experienced Solaris engineers (myself included) still miss some of the necessary changes. With that in mind, I thought a reasonable first OpenSolaris blog would be describing exactly how to add a new system call to the kernel.

For the purposes of this post, we will assume that it's a simple system call that lives in the generic kernel code, and we'll put the code into an existing file to avoid having to deal with Makefiles. The goal is to print an arbitrary message to the console whenever the system call is issued.

1. Picking a syscall number

Before writing any real code, we first have to pick a number that will represent our system call. The main source of documentation here is syscall.h, which describes all the available system call numbers, as well as which ones are reserved. The maximum number of syscalls is currently 256 (NSYSCALL), which doesn't leave much space for new ones. This could theoretically be extended - I believe the hard limit is in the size of sysset_t, whose 16 integers must be able to represent a complete bitmask of all system calls. This puts our actual limit at 16*32, or 512, system calls. But for the purposes of our tutorial, we'll pick system call number 56, which is currently unused. For my own amusement, we'll name our (my?) system call 'schrock'. So first we add the following line to syscall.h:

#define SYS_uadmin      55
#define SYS_schrock     56
#define SYS_utssys      57
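The 16*32 arithmetic behind the sysset_t limit can be sketched in user space. This is only a model: the demo_ names are hypothetical, and the zero-based bit numbering is an assumption rather than the layout the real sysset_t macros use.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for sysset_t: 16 words of 32 bits each. */
#define DEMO_SET_WORDS		16
#define DEMO_BITS_PER_WORD	32
#define DEMO_MAX_SYSCALLS	(DEMO_SET_WORDS * DEMO_BITS_PER_WORD)	/* 512 */

typedef struct {
	uint32_t word[DEMO_SET_WORDS];
} demo_sysset_t;

/* Clear every bit in the set. */
static void
demo_emptyset(demo_sysset_t *sp)
{
	memset(sp, 0, sizeof (*sp));
}

/* Add syscall number num (0-based here) to the set; -1 if it cannot fit. */
static int
demo_addset(demo_sysset_t *sp, unsigned int num)
{
	if (num >= DEMO_MAX_SYSCALLS)
		return (-1);
	sp->word[num / DEMO_BITS_PER_WORD] |=
	    (uint32_t)1 << (num % DEMO_BITS_PER_WORD);
	return (0);
}

/* Return 1 if syscall number num is in the set, 0 otherwise. */
static int
demo_ismember(const demo_sysset_t *sp, unsigned int num)
{
	if (num >= DEMO_MAX_SYSCALLS)
		return (0);
	return ((sp->word[num / DEMO_BITS_PER_WORD] >>
	    (num % DEMO_BITS_PER_WORD)) & 1);
}
```

With this layout, number 56 lands in word 1, bit 24, and any number of 512 or above has no bit to live in, which is where the hard limit comes from.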

2. Writing the syscall handler

Next, we have to actually add the function that will get called when we invoke the system call. What we should really do is add a new file schrock.c to usr/src/uts/common/syscall, but I'm trying to avoid Makefiles. Instead, we'll just stick it in getpid.c:

#include <sys/cmn_err.h>

int
schrock(void *arg)
{
        char buf[1024];
        size_t len;

        /* Copy the user's string into kernel memory. */
        if (copyinstr(arg, buf, sizeof (buf), &len) != 0)
                return (set_errno(EFAULT));

        cmn_err(CE_WARN, "%s", buf);

        return (0);
}

Note that declaring a buffer of 1024 bytes on the stack is a very bad thing to do in the kernel. We have limited stack space, and a stack overflow will result in a panic. We also don't check that the length of the string was less than our scratch space. But this will suffice for illustrative purposes. The cmn_err() function is the simplest way to display messages from the kernel.
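The handler above folds every copy failure into EFAULT. A minimal user-space sketch of a bounded string copy shows how the "string too long for the scratch space" case could be told apart from a bad pointer. The function demo_copystr is hypothetical and is not the kernel's copyinstr(); reporting the overflow as ENAMETOOLONG is an assumption made for illustration.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical user-space model of a length-limited string copy.
 * Returns 0 on success, EFAULT for a NULL pointer, and ENAMETOOLONG
 * when no terminator is found within maxlen bytes.
 */
static int
demo_copystr(const char *src, char *dst, size_t maxlen, size_t *lenp)
{
	size_t i;

	if (src == NULL || dst == NULL)
		return (EFAULT);

	for (i = 0; i < maxlen; i++) {
		dst[i] = src[i];
		if (src[i] == '\0') {
			if (lenp != NULL)
				*lenp = i + 1;	/* count includes the terminator */
			return (0);
		}
	}

	/* Ran out of room; dst is NOT NUL-terminated in this case. */
	return (ENAMETOOLONG);
}
```

A caller that cares about the difference would check the return value instead of treating any nonzero result as a fault.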

3. Adding an entry to the syscall table

We need to place an entry in the system call table. This table lives in sysent.c, and makes heavy use of macros to simplify the source. Our system call takes a single argument and returns an integer, so we'll need to use the SYSENT_CI macro. We need to add a prototype for our syscall, and add an entry to the sysent and sysent32 tables:

int     rename();
void    rexit();
int     schrock();
int     semsys();
int     setgid();

/* ... */

/* 54 */ SYSENT_CI("ioctl", ioctl, 3),
/* 55 */ SYSENT_CI("uadmin", uadmin, 3),
/* 56 */ SYSENT_CI("schrock", schrock, 1),
/* 57 */ IF_LP64(
                SYSENT_2CI("utssys", utssys64, 4),
                SYSENT_2CI("utssys", utssys32, 4)),

/* ... */

/* 54 */ SYSENT_CI("ioctl", ioctl, 3),
/* 55 */ SYSENT_CI("uadmin", uadmin, 3),
/* 56 */ SYSENT_CI("schrock", schrock, 1),
/* 57 */ SYSENT_2CI("utssys", utssys32, 4),
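To make the role of the table concrete, here is a simplified user-space model of a dispatch table indexed by system call number. It is only a sketch: struct demo_sysent, demo_dispatch, and the handlers are hypothetical, and the real sysent entries built by the SYSENT_* macros carry more information than this.

```c
#include <stddef.h>

/* Hypothetical handler signature: one pointer argument, integer result. */
typedef int (*demo_handler_t)(void *arg);

/* Simplified table entry: name, argument count, and handler. */
struct demo_sysent {
	const char	*name;
	int		nargs;
	demo_handler_t	handler;
};

#define DEMO_NSYSCALL	256

/* Stand-in for "no such system call". */
static int
demo_nosys(void *arg)
{
	(void) arg;
	return (-1);
}

/* Stand-in for our handler; succeeds when given a non-NULL argument. */
static int
demo_schrock(void *arg)
{
	return (arg != NULL ? 0 : -1);
}

/* Slots without an explicit entry are zero-filled, so handler is NULL. */
static struct demo_sysent demo_table[DEMO_NSYSCALL] = {
	[56] = { "schrock", 1, demo_schrock },
};

/* Look up the handler by number and call it. */
static int
demo_dispatch(unsigned int num, void *arg)
{
	if (num >= DEMO_NSYSCALL || demo_table[num].handler == NULL)
		return (demo_nosys(arg));
	return (demo_table[num].handler(arg));
}
```

The point of the model is that the number chosen in syscall.h is nothing more than an index: if slot 56 is empty, the call fails no matter how correct the handler is.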

4. /etc/name_to_sysnum

At this point, we could write a program to invoke our system call, but the point here is to illustrate everything that needs to be done to integrate a system call, so we can't ignore the little things. One of these little things is /etc/name_to_sysnum, which provides a mapping between system call names and numbers, and is used by dtrace(1M), truss(1), and friends. Of course, there is one version for x86 and one for SPARC, so you will have to add the following lines to both the Intel and SPARC versions:

ioctl                   54
uadmin                  55
schrock                 56
utssys                  57
fdsync                  58
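For a sense of how a tool might consume this file, here is a minimal sketch that parses one line of it. It assumes only what the excerpt above shows, namely a name and a decimal number separated by whitespace; demo_parse_sysnum and DEMO_NAMELEN are hypothetical names, and this is not the parser that dtrace or truss actually use.

```c
#include <stdio.h>

#define DEMO_NAMELEN	64

/*
 * Parse a single "name number" line. The name buffer must hold at least
 * DEMO_NAMELEN bytes. Returns 0 on success, -1 if the line is malformed.
 */
static int
demo_parse_sysnum(const char *line, char *name, int *nump)
{
	/* %63s bounds the name to DEMO_NAMELEN - 1 characters plus NUL. */
	if (sscanf(line, "%63s %d", name, nump) != 2)
		return (-1);
	return (0);
}
```

Given the line "schrock 56", the sketch yields the name schrock and the number 56; a line with no number is rejected.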

5. truss(1)

Truss does fancy decoding of system call arguments. In order to do this, we need to maintain a table in truss that describes the type of each argument for every syscall. This table is found in systable.c. Since our syscall takes a single string, we add the following entry:

{"ioctl",       3, DEC, NOV, DEC, IOC, IOA},                    /*  54 */
{"uadmin", 3, DEC, NOV, DEC, DEC, DEC}, /* 55 */
{"schrock", 1, DEC, NOV, STG}, /* 56 */
{"utssys", 4, DEC, NOV, HEX, DEC, UTS, HEX}, /* 57 */
{"fdsync", 2, DEC, NOV, DEC, FFG}, /* 58 */

Don't worry too much about the different constants. But be sure to read up on the truss source code if you're adding a complicated system call.

6. proc_names.c

This is the file that gets missed the most often when adding a new syscall. Libproc uses the table in proc_names.c to translate between system call numbers and names. Why it doesn't make use of /etc/name_to_sysnum is anybody's guess, but for now you have to update the systable array in this file:

        "ioctl",                /* 54 */
"uadmin", /* 55 */
"schrock", /* 56 */
"utssys", /* 57 */
"fdsync", /* 58 */

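The consequence of forgetting this table can be sketched with a small number-to-name lookup. Everything here is hypothetical, including the demo_ names and the "SYS#" placeholder format; it only models the idea that a tool without an entry for number 56 has nothing better to print than the raw number.

```c
#include <stdio.h>

/* Hypothetical excerpt of a name table indexed by system call number. */
static const char *const demo_systable[] = {
	[54] = "ioctl",
	[55] = "uadmin",
	[56] = "schrock",
	[57] = "utssys",
	[58] = "fdsync",
};

#define DEMO_NSYS	(sizeof (demo_systable) / sizeof (demo_systable[0]))

/*
 * Write the name for syscall number num into buf. Numbers with no table
 * entry fall back to a placeholder built from the number itself.
 */
static char *
demo_sysname(int num, char *buf, size_t bufsz)
{
	if (num >= 0 && (size_t)num < DEMO_NSYS && demo_systable[num] != NULL)
		(void) snprintf(buf, bufsz, "%s", demo_systable[num]);
	else
		(void) snprintf(buf, bufsz, "SYS#%d", num);
	return (buf);
}
```

Remove the [56] entry from the model and the lookup for our call degrades to the placeholder, which is the kind of symptom a missed proc_names.c update would produce.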
7. Putting it all together

Finally, everything is in place. We can test our system call with a simple program:

#include <sys/syscall.h>

int
main(int argc, char **argv)
{
        /* Invoke our new system call by number. */
        syscall(SYS_schrock, "OpenSolaris Rules!");
        return (0);
}

If we run this on our system, we'll see the following output on the console:

June 14 13:42:21 halcyon genunix: WARNING: OpenSolaris Rules!

Because we did all the extra work, we can actually observe the behavior using truss(1), mdb(1), or dtrace(1M). As you can see, adding a system call is not as easy as it should be. One of the ideas that has been floating around for a while is the Grand Unified Syscall(tm) project, which would centralize all this information as well as provide type information for the DTrace syscall provider. But until that happens, we'll have to deal with this process.


This article was originally posted at http://blogs.sun.com/eschrock/date/20050614#how_to_add_a_system


