Adding a System Call to the OpenSolaris Kernel


When I first started in the Solaris group, I was faced with two equally difficult tasks: learning the development model, and understanding the source code. For both these tasks, the recommended method is usually picking a small bug and working through the process. For the curious, the first bug I putback to ON was 4912227 (ptree call returns zero on failure), a simple bug with near zero risk. It was the first step down a very long road.

As another first step, someone suggested adding a very simple system call to the kernel. This turned out to be a whole lot harder than one would expect, and it has so many subtle aspects that experienced Solaris engineers (myself included) still miss some of the necessary changes. With that in mind, I thought a reasonable first OpenSolaris blog post would describe exactly how to add a new system call to the kernel.

For the purposes of this post, we will assume that it's a simple system call that lives in the generic kernel code, and we'll put the code into an existing file to avoid having to deal with Makefiles. The goal is to print an arbitrary message to the console whenever the system call is issued.

1. Picking a syscall number

Before writing any real code, we first have to pick a number that will represent our system call. The main source of documentation here is syscall.h, which describes all the available system call numbers, as well as which ones are reserved. The maximum number of syscalls is currently 256 (NSYSCALL), which doesn't leave much space for new ones. This could theoretically be extended. I believe the hard limit is the size of sysset_t, whose 16 integers must be able to represent a complete bitmask of all system calls. With 32 bits per integer, that puts our actual limit at 16 * 32, or 512, system calls. But for the purposes of our tutorial, we'll pick system call number 56, which is currently unused. For my own amusement, we'll name our (my?) system call 'schrock'. So first we add the following line to syscall.h:

#define	SYS_uadmin	55
#define	SYS_schrock	56
#define	SYS_utssys	57
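
To make the 16 * 32 arithmetic concrete, here is a minimal user-space sketch of a bitmask with one bit per system call. Everything prefixed with sketch_ or SKETCH_ is a hypothetical name invented for this illustration, and the 1-based numbering is an assumption of the sketch; it is not the kernel's sysset_t code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define	SKETCH_WORDS	16			/* the "16 integers" */
#define	SKETCH_MAXSYS	(SKETCH_WORDS * 32)	/* 16 * 32 = 512 bits */

/* Hypothetical stand-in for a syscall set: one bit per syscall number. */
typedef struct {
	uint32_t word[SKETCH_WORDS];
} sketch_sysset_t;

/* Clear every bit in the set. */
void
sketch_emptyset(sketch_sysset_t *set)
{
	assert(set != NULL);
	(void) memset(set, 0, sizeof (*set));
}

/* Add syscall number num (1-based here); return -1 if it cannot fit. */
int
sketch_addset(sketch_sysset_t *set, int num)
{
	if (num < 1 || num > SKETCH_MAXSYS)
		return (-1);
	set->word[(num - 1) / 32] |= (uint32_t)1 << ((num - 1) % 32);
	return (0);
}

/* Return 1 if num is in the set, 0 otherwise. */
int
sketch_ismember(const sketch_sysset_t *set, int num)
{
	if (num < 1 || num > SKETCH_MAXSYS)
		return (0);
	return ((set->word[(num - 1) / 32] >> ((num - 1) % 32)) & 1);
}
```

In this sketch, number 56 lands in word 1 at bit 23, and any number above 512 is rejected because there is no bit left to hold it.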

2. Writing the syscall handler

Next, we have to actually add the function that will get called when we invoke the system call. What we should really do is add a new file schrock.c to usr/src/uts/common/syscall, but I'm trying to avoid Makefiles. Instead, we'll just stick it in getpid.c:

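#include <sys/cmn_err.h>

int
schrock(void *arg)
{
	char buf[1024];		/* scratch space for the user's string */
	size_t len;

	/* Copy the NUL-terminated string in from user space. */
	if (copyinstr(arg, buf, sizeof (buf), &len) != 0)
		return (set_errno(EFAULT));

	/* Print the message to the console and system log. */
	cmn_err(CE_WARN, "%s", buf);

	return (0);
}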

Note that declaring a buffer of 1024 bytes on the stack is a very bad thing to do in the kernel. We have limited stack space, and a stack overflow will result in a panic. We also don't check that the length of the string was less than our scratch space. But this will suffice for illustrative purposes. The cmn_err() function is the simplest way to display messages from the kernel.
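
As a rough illustration of the two caveats above, the following user-space sketch moves the scratch buffer to the heap and reports an oversized string as its own error. It is a mock, not kernel code: mock_copyinstr() and sketch_schrock() are hypothetical names, malloc() and free() stand in for a kernel allocator such as kmem_alloc() and kmem_free(), and fprintf() stands in for cmn_err(). The mock's error codes are an approximation of how copyinstr() is expected to behave.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

#define	SKETCH_MAXMSG	1024	/* same scratch size as the handler above */

/*
 * Hypothetical user-space stand-in for copyinstr(): copy at most max bytes,
 * terminator included, and fail with ENAMETOOLONG if no terminator fits.
 */
int
mock_copyinstr(const char *src, char *dst, size_t max, size_t *lenp)
{
	size_t i;

	if (src == NULL)
		return (EFAULT);	/* crude model of a bad user address */
	for (i = 0; i < max; i++) {
		dst[i] = src[i];
		if (src[i] == '\0') {
			*lenp = i + 1;	/* length counts the terminator */
			return (0);
		}
	}
	return (ENAMETOOLONG);
}

/* Same flow as the handler, with the scratch buffer on the heap. */
int
sketch_schrock(const char *arg)
{
	char *buf = malloc(SKETCH_MAXMSG);	/* stands in for kmem_alloc() */
	size_t len = 0;
	int err;

	if (buf == NULL)
		return (ENOMEM);
	err = mock_copyinstr(arg, buf, SKETCH_MAXMSG, &len);
	if (err == 0) {
		assert(len >= 1 && len <= SKETCH_MAXMSG);
		/* Stands in for cmn_err(CE_WARN, "%s", buf). */
		(void) fprintf(stderr, "WARNING: %s\n", buf);
	}
	free(buf);	/* stands in for kmem_free() */
	return (err);
}
```

A real handler would pass the error through set_errno() rather than returning it directly; the sketch returns the raw code only so that it can be exercised outside the kernel.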

3. Adding an entry to the syscall table

We need to place an entry in the system call table. This table lives in sysent.c, and makes heavy use of macros to simplify the source. Our system call takes a single argument and returns an integer, so we'll need to use the SYSENT_CI macro. We need to add a prototype for our syscall, and add an entry to the sysent and sysent32 tables:

int	rename();
void	rexit();
int	schrock();
int	semsys();
int	setgid();

/* ... */

/* 54 */ SYSENT_CI("ioctl", ioctl, 3),
/* 55 */ SYSENT_CI("uadmin", uadmin, 3),
/* 56 */ SYSENT_CI("schrock", schrock, 1),
/* 57 */ IF_LP64(
		SYSENT_2CI("utssys", utssys64, 4),
		SYSENT_2CI("utssys", utssys32, 4)),

/* ... */

/* 54 */ SYSENT_CI("ioctl", ioctl, 3),
/* 55 */ SYSENT_CI("uadmin", uadmin, 3),
/* 56 */ SYSENT_CI("schrock", schrock, 1),
/* 57 */ SYSENT_2CI("utssys", utssys32, 4),
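
For readers who have not seen a table-driven dispatcher before, here is a small user-space model of the idea: the syscall number indexes an array whose entries carry a name, an argument count, and a handler. All names here (sketch_sysent, sketch_table, sketch_dispatch, and the two toy handlers) are hypothetical, and returning a negative ENOSYS for an empty slot is a choice made for this sketch, not a description of what sysent.c does.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef long (*sketch_handler_t)(long a0, long a1, long a2);

struct sketch_sysent {
	const char *name;	/* name, for diagnostics */
	int nargs;		/* number of arguments the handler expects */
	sketch_handler_t fn;	/* NULL marks an unused slot */
};

/* Hypothetical handlers, only here to give the table something to call. */
long
sketch_sum(long a0, long a1, long a2)
{
	return (a0 + a1 + a2);
}

long
sketch_echo(long a0, long a1, long a2)
{
	(void) a1;
	(void) a2;
	return (a0);
}

/* The array index is the syscall number. */
const struct sketch_sysent sketch_table[] = {
	{ "sum", 3, sketch_sum },	/* 0 */
	{ NULL, 0, NULL },		/* 1: unused */
	{ "echo", 1, sketch_echo },	/* 2 */
};
#define	SKETCH_NENT	(sizeof (sketch_table) / sizeof (sketch_table[0]))

/* Look up num, copy in only as many arguments as declared, then call. */
long
sketch_dispatch(int num, const long *args, int nargs)
{
	long a[3] = { 0, 0, 0 };
	int i;

	if (num < 0 || (size_t)num >= SKETCH_NENT ||
	    sketch_table[num].fn == NULL)
		return (-ENOSYS);
	assert(sketch_table[num].nargs <= 3);
	for (i = 0; i < nargs && i < sketch_table[num].nargs; i++)
		a[i] = args[i];
	return (sketch_table[num].fn(a[0], a[1], a[2]));
}
```

The declared argument count matters: the dispatcher only copies that many arguments, which is why the count in each table entry has to match the handler's real signature.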

4. /etc/name_to_sysnum

At this point, we could write a program to invoke our system call, but the point here is to illustrate everything that needs to be done to integrate a system call, so we can't ignore the little things. One of these little things is /etc/name_to_sysnum, which provides a mapping between system call names and numbers, and is used by dtrace(1M), truss(1), and friends. Of course, there is one version for x86 and one for SPARC, so you will have to add the following lines to both the Intel and SPARC versions:

ioctl			54
uadmin			55
schrock			56
utssys			57
fdsync			58
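
As a rough sketch of how a tool might consume a file in this shape, the function below scans whitespace-separated "name number" pairs in an in-memory string and returns the number for a given name. The function name sketch_name_to_sysnum() is hypothetical, and the sketch assumes the file contains nothing but such pairs; it says nothing about how dtrace(1M) or truss(1) actually read the file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return the number paired with want in a buffer of "name number" lines,
 * or -1 if the name is absent or a malformed entry stops the scan.
 */
int
sketch_name_to_sysnum(const char *text, const char *want)
{
	char name[64];
	int num, used;

	assert(text != NULL && want != NULL);
	/* %n records how many bytes this pair consumed so we can advance. */
	while (sscanf(text, " %63s %d%n", name, &num, &used) == 2) {
		if (strcmp(name, want) == 0)
			return (num);
		text += used;
	}
	return (-1);
}
```

Given the five lines shown above as input, looking up "schrock" yields 56, and a name that is not listed yields -1.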

5. truss(1)

Truss does fancy decoding of system call arguments. In order to do this, we need to maintain a table in truss that describes the type of each argument for every syscall. This table is found in systable.c. Since our syscall takes a single string, we add the following entry:

{"ioctl",	3, DEC, NOV, DEC, IOC, IOA},		/*  54 */
{"uadmin",	3, DEC, NOV, DEC, DEC, DEC},		/*  55 */
{"schrock",	1, DEC, NOV, STG},			/*  56 */
{"utssys",	4, DEC, NOV, HEX, DEC, UTS, HEX},	/*  57 */
{"fdsync",	2, DEC, NOV, DEC, FFG},			/*  58 */

Don't worry too much about the different constants. But be sure to read up on the truss source code if you're adding a complicated system call.
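
To give a feel for what a per-argument type code buys a tracer, here is a toy formatter that renders one argument according to a code. The enum values SK_DEC, SK_HEX, and SK_STG and the function sketch_format_arg() are hypothetical names loosely modelled on the constants above; the real truss reads strings out of the traced process, whereas this sketch simply dereferences a pointer in its own address space.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical argument type codes, loosely modelled on DEC, HEX, STG. */
enum sketch_argtype { SK_DEC, SK_HEX, SK_STG };

/*
 * Render one raw argument into out according to its type code.
 * Returns the snprintf() length, or -1 for an unknown code.
 */
int
sketch_format_arg(char *out, size_t outsz, enum sketch_argtype type,
    uintptr_t val)
{
	assert(out != NULL && outsz > 0);
	switch (type) {
	case SK_DEC:
		return (snprintf(out, outsz, "%ld", (long)val));
	case SK_HEX:
		return (snprintf(out, outsz, "0x%lX", (unsigned long)val));
	case SK_STG:
		/* Treat the raw value as a pointer to a local string. */
		return (snprintf(out, outsz, "\"%s\"", (const char *)val));
	}
	return (-1);
}
```

With a string code for its single argument, a call like ours can be shown with the message text instead of a bare pointer value, which is the reason the table entry uses STG.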

6. proc_names.c

This is the file that is missed most often when adding a new syscall. libproc uses the table in proc_names.c to translate between system call numbers and names. Why it doesn't make use of /etc/name_to_sysnum is anybody's guess, but for now you have to update the systable array in this file:

	"ioctl",		/*  54 */
	"uadmin",		/*  55 */
	"schrock",		/*  56 */
	"utssys",		/*  57 */
	"fdsync",		/*  58 */

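The positional nature of such a table can be sketched in a few lines of user-space C: the array index is the syscall number, so a slot that nobody fills in has no name to report. The names sketch_systable and sketch_sysname() are hypothetical, as is the "SYS#n" fallback format; this is an illustration of the lookup pattern, not libproc's code.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical excerpt of a number-to-name table; gaps default to NULL. */
const char *const sketch_systable[] = {
	[54] = "ioctl",
	[55] = "uadmin",
	[56] = "schrock",
	[57] = "utssys",
	[58] = "fdsync",
};
#define	SKETCH_NSYS	(sizeof (sketch_systable) / sizeof (sketch_systable[0]))

/* Write the name for num into buf, or a numeric placeholder if unknown. */
const char *
sketch_sysname(int num, char *buf, size_t bufsz)
{
	assert(buf != NULL && bufsz > 0);
	if (num >= 0 && (size_t)num < SKETCH_NSYS &&
	    sketch_systable[num] != NULL)
		(void) snprintf(buf, bufsz, "%s", sketch_systable[num]);
	else
		(void) snprintf(buf, bufsz, "SYS#%d", num);
	return (buf);
}
```

If the entry for 56 were left out of a table like this, tools built on it would have nothing better than a placeholder to print for our new call, which is why this file is worth remembering.
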
7. Putting it all together

Finally, everything is in place. We can test our system call with a simple program:

#include <sys/syscall.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
	/* Invoke system call 56 directly with our message. */
	syscall(SYS_schrock, "OpenSolaris Rules!");
	return (0);
}

If we run this on our system, we'll see the following output on the console:

June 14 13:42:21 halcyon genunix: WARNING: OpenSolaris Rules!

Because we did all the extra work, we can actually observe the behavior using truss(1), mdb(1), or dtrace(1M). As you can see, adding a system call is not as easy as it should be. One of the ideas that has been floating around for a while is the Grand Unified Syscall(tm) project, which would centralize all this information as well as provide type information for the DTrace syscall provider. But until that happens, we'll have to deal with this process.

-------------------------------------------------

This article was originally posted at http://blogs.sun.com/eschrock/date/20050614#how_to_add_a_system