Unreliable Guide To Hacking The Linux Kernel
Paul Rusty Russell
rusty@linuxcare.com
Copyright © 2000 by Paul Russell
This documentation is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc.,
59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
For more details see the file COPYING in the source distribution of Linux.
Table of Contents
1. Introduction
2. The Players
    User Context
    Hardware Interrupts (Hard IRQs)
    Software Interrupt Context: Bottom Halves, Tasklets, softirqs
3. Some Basic Rules
4. ioctls: Not writing a new system call
5. Recipes for Deadlock
6. Common Routines
    printk() include/linux/kernel.h
    copy_[to/from]_user() / get_user() / put_user() include/asm/uaccess.h
    kmalloc()/kfree() include/linux/slab.h
    current include/asm/current.h
    local_irq_save()/local_irq_restore() include/asm/system.h
    local_bh_disable()/local_bh_enable() include/asm/softirq.h
    smp_processor_id()/cpu_[number/logical]_map() include/asm/smp.h
    __init/__exit/__initdata include/linux/init.h
    __initcall()/module_init() include/linux/init.h
    module_exit() include/linux/init.h
    MOD_INC_USE_COUNT/MOD_DEC_USE_COUNT include/linux/module.h
7. Wait Queues include/linux/wait.h
    Declaring
    Queuing
    Waking Up Queued Tasks
8. Atomic Operations
9. Symbols
    EXPORT_SYMBOL() include/linux/module.h
    EXPORT_SYMTAB
10. Routines and Conventions
    Double-linked lists include/linux/list.h
    Return Conventions
    Breaking Compilation
    Initializing structure members
    GNU Extensions
    C++
    #if
11. Putting Your Stuff in the Kernel
12. Kernel Cantrips
13. Thanks
Chapter 1. Introduction
Welcome, gentle reader, to Rusty’s Unreliable Guide to Linux Kernel Hacking. This document describes
the common routines and general requirements for kernel code: its goal is to serve as a primer for Linux
kernel development for experienced C programmers. I avoid implementation details: that’s what the code
is for, and I ignore whole tracts of useful routines.
Before you read this, please understand that I never wanted to write this document, being grossly
under-qualified, but I always wanted to read it, and this was the only way. I hope it will grow into a
compendium of best practice, common starting points and random information.
Chapter 2. The Players
At any time each of the CPUs in a system can be:
• not associated with any process, serving a hardware interrupt;
• not associated with any process, serving a softirq, tasklet or bh;
• running in kernel space, associated with a process;
• running a process in user space.
There is a strict ordering between these: other than the last category (userspace) each can only be
pre-empted by those above. For example, while a softirq is running on a CPU, no other softirq will
pre-empt it, but a hardware interrupt can. However, any other CPUs in the system execute independently.
We’ll see a number of ways that the user context can block interrupts, to become truly non-preemptable.
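The ordering above can be written down as a tiny userspace model. This is only a sketch of the rule as stated in this chapter: the enum and the function name are my own inventions for illustration, not kernel identifiers, and the model ignores other CPUs and anything that disables interrupts.

```c
#include <stdbool.h>

/* Hypothetical model of the four states listed above, highest first.
 * These names are illustrative only; they do not exist in the kernel. */
enum cpu_state {
	CPU_HARD_IRQ,      /* serving a hardware interrupt */
	CPU_SOFT_IRQ,      /* serving a softirq, tasklet or bh */
	CPU_USER_CONTEXT,  /* kernel space, associated with a process */
	CPU_USERSPACE      /* running a process in user space */
};

/* Return true if work of kind 'incoming' may pre-empt 'running'
 * on the same CPU, according to the rule in the text. */
bool can_preempt(enum cpu_state running, enum cpu_state incoming)
{
	/* Userspace is the exception: anything may pre-empt it,
	 * including another process. */
	if (running == CPU_USERSPACE)
		return true;

	/* Everything else yields only to states strictly above it. */
	return incoming < running;
}
```

So a softirq cannot pre-empt another softirq on the same CPU, a hardware interrupt can, and nothing in this model pre-empts a hardware interrupt handler.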
User Context
User context is when you are coming in from a system call or other trap: you can sleep, and you own the
CPU (except for interrupts) until you call schedule(). In other words, user context (unlike userspace)
is not pre-emptable.
Note: You are always in user context on module load and unload, and on operations on the block
device layer.
In user context, the current pointer (indicating the task we are currently executing) is valid, and
in_interrupt() (include/asm/hardirq.h) is false.
Caution
Beware that if you have interrupts or bottom halves disabled (see below),
in_interrupt() will return a false positive.
Hardware Interrupts (Hard IRQs)
Timer ticks, network cards and keyboards are examples of real hardware which produce interrupts at any
time. The kernel runs interrupt handlers, which service the hardware. The kernel guarantees that a
handler is never re-entered: if another interrupt arrives, it is queued (or dropped). Because it disables
interrupts, this handler has to be fast: frequently it simply acknowledges the interrupt, marks a ‘software
interrupt’ for execution and exits.
You can tell you are in a hardware interrupt, because in_irq() returns true.
Caution
Beware that this will return a false positive if interrupts are disabled (see below).
Software Interrupt Context: Bottom Halves, Tasklets, softirqs
Whenever a system call is about to return to userspace, or a hardware interrupt handler exits, any
‘software interrupts’ which are marked pending (usually by hardware interrupts) are run
(kernel/softirq.c).
Much of the real interrupt handling work is done here. Early in the transition to SMP, there were only
‘bottom halves’ (BHs), which didn’t take advantage of multiple CPUs. Shortly after we switched from
wind-up computers made of match-sticks and snot, we abandoned this limitation.
include/linux/interrupt.h lists the different BHs. No matter how many CPUs you have, no two
BHs will run at the same time. This made the transition to SMP simpler, but sucks hard for scalable
performance. A very important bottom half is the timer BH (include/linux/timer.h): you can
register to have it call functions for you in a given length of time.
2.3.43 introduced softirqs, and re-implemented the (now deprecated) BHs underneath them. Softirqs are
fully-SMP versions of BHs: they can run on as many CPUs at once as required. This means they need to
deal with any races in shared data using their own locks. A bitmask is used to keep track of which are
enabled, so the 32 available softirqs should not be used up lightly. (Yes, people will notice).
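To see why a bitmask makes the slots a scarce resource, here is a minimal userspace sketch of the ‘mark pending, run later’ bookkeeping. Everything in it is an assumption for illustration: the names (register_softirq_model() and friends) are made up, there is a single pending word rather than anything per-CPU, and there is no locking at all. The real thing lives in kernel/softirq.c.

```c
#include <stdint.h>

#define NR_SOFTIRQS_MODEL 32	/* one bit per slot in a 32-bit word */

typedef void (*softirq_fn)(void);

static softirq_fn handlers[NR_SOFTIRQS_MODEL];
static uint32_t pending;	/* bit n set means slot n wants to run */

int demo_hits;			/* counts calls to the demo handler */

/* Example handler; a real one would do the deferred interrupt work. */
void demo_handler(void)
{
	demo_hits++;
}

/* Claim slot nr. Returns 0, or -1 if nr is out of range or taken. */
int register_softirq_model(unsigned int nr, softirq_fn fn)
{
	if (nr >= NR_SOFTIRQS_MODEL || handlers[nr] != 0)
		return -1;
	handlers[nr] = fn;
	return 0;
}

/* Mark slot nr pending; marking it twice still runs it once. */
void raise_softirq_model(unsigned int nr)
{
	if (nr < NR_SOFTIRQS_MODEL)
		pending |= (uint32_t)1 << nr;
}

/* Run every pending handler once; returns how many ran. */
int run_pending_model(void)
{
	uint32_t todo = pending;
	unsigned int nr;
	int ran = 0;

	pending = 0;	/* handlers may re-raise themselves for next time */
	for (nr = 0; todo != 0; nr++, todo >>= 1) {
		if ((todo & 1) && handlers[nr] != 0) {
			handlers[nr]();
			ran++;
		}
	}
	return ran;
}
```

Once all 32 slots are registered there is simply nowhere to put a 33rd, which is why new users are pointed at tasklets instead.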
Tasklets (include/linux/interrupt.h) are like softirqs, except they are dynamically-registrable
(meaning you can have as many as you want), and they also guarantee that any tasklet will only run on
one CPU at any time, although different tasklets can run simultaneously (unlike different BHs).
Caution
The name ‘tasklet’ is misleading: they have nothing to do with ‘tasks’, and probably
more to do with some bad vodka Alexey Kuznetsov had at the time.
You can tell you are in a softirq (or bottom half, or tasklet) using the in_softirq() macro
(include/asm/softirq.h).
Caution
Beware that this will return a false positive if a bh lock (see below) is held.
Chapter 3. Some Basic Rules
No memory protection
If you corrupt memory, whether in user context or interrupt context, the whole machine will crash.
Are you sure you can’t do what you want in userspace?
No floating point or MMX
The FPU context is not saved; even in user context the FPU state probably won’t correspond with
the current process: you would mess with some user process’ FPU state. If you really want to do
this, you would have to explicitly save/restore the full FPU state (and avoid context switches). It is
generally a bad idea; use fixed point arithmetic first.
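As a sketch of what ‘use fixed point arithmetic’ can look like, here is an unsigned 16.16 format using only integer operations. The type and helper names (fx_t, fx_mul() and so on) are my own for illustration and are not kernel API; the sketch assumes non-negative values that fit in 16 integer bits.

```c
#include <stdint.h>

typedef uint32_t fx_t;		/* unsigned 16.16 fixed point */

#define FX_SHIFT 16
#define FX_ONE   ((fx_t)1 << FX_SHIFT)		/* represents 1.0 */
#define FX_THREE_QUARTERS (3 * FX_ONE / 4)	/* represents 0.75 */

/* Convert a small non-negative integer (below 65536) to fixed point. */
fx_t fx_from_int(uint32_t n)
{
	return n << FX_SHIFT;
}

/* Drop the fractional part (truncates toward zero). */
uint32_t fx_to_int(fx_t x)
{
	return x >> FX_SHIFT;
}

/* Multiply two fixed point values. The product is formed in 64 bits
 * so the intermediate result cannot overflow, then scaled back down. */
fx_t fx_mul(fx_t a, fx_t b)
{
	return (fx_t)(((uint64_t)a * b) >> FX_SHIFT);
}
```

For example, fx_to_int(fx_mul(fx_from_int(200), FX_THREE_QUARTERS)) scales 200 by 0.75 without touching the FPU.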
A rigid stack limit
The kernel stack is about 6K in 2.2 (for most architectures: it’s about 14K on the Alpha), and
shared with interrupts so you can’t use it all. Avoid deep recursion and huge local arrays on the
stack (allocate them dynamically instead).
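The shape of ‘allocate them dynamically instead’ is sketched below in plain userspace C. Note the assumptions: malloc() and free() stand in for the kernel allocator here, and the function itself (sum_filled_scratch()) is a made-up example whose only job is to need a scratch buffer bigger than the stack described above.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define SCRATCH_SIZE 8192	/* bigger than the whole stack in the text */

/* Fill a scratch buffer with 'fill' and sum its bytes into *sum_out.
 * Returns 0 on success or -ENOMEM if the buffer cannot be allocated. */
int sum_filled_scratch(unsigned char fill, unsigned long *sum_out)
{
	unsigned char *scratch;
	unsigned long sum = 0;
	size_t i;

	/* NOT "unsigned char scratch[SCRATCH_SIZE];": that would live
	 * on the stack. malloc() is a userspace stand-in for the
	 * kernel's allocator in this sketch. */
	scratch = malloc(SCRATCH_SIZE);
	if (scratch == NULL)
		return -ENOMEM;

	memset(scratch, fill, SCRATCH_SIZE);
	for (i = 0; i < SCRATCH_SIZE; i++)
		sum += scratch[i];

	free(scratch);	/* every exit path after allocation must free */
	*sum_out = sum;
	return 0;
}
```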
The Linux kernel is portable
Let’s keep it that way. Your code should be 64-bit clean, and endian-independent. You should also
minimize CPU specific stuff, e.g. inline assembly should be cleanly encapsulated and minimized to
ease porting. Generally it should be restricted to the architecture-dependent part of the kernel tree.
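One way to stay endian-independent and 64-bit clean when parsing a fixed on-the-wire layout is to assemble values byte by byte into fixed-width types, rather than casting a buffer to an int pointer. The helpers below are a hypothetical sketch with names of my own choosing; the kernel has its own byte-order helpers, which this example deliberately does not use so that it compiles anywhere.

```c
#include <stdint.h>

/* Read a 32-bit little-endian value from p. The result is the same
 * on big-endian and little-endian hosts, and does not depend on the
 * size of int or long or on the alignment of p. */
uint32_t load_le32(const unsigned char *p)
{
	return (uint32_t)p[0] |
	       ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[3] << 24);
}

/* Write v to p as 32-bit little-endian, one byte at a time. */
void store_le32(unsigned char *p, uint32_t v)
{
	p[0] = (unsigned char)(v & 0xff);
	p[1] = (unsigned char)((v >> 8) & 0xff);
	p[2] = (unsigned char)((v >> 16) & 0xff);
	p[3] = (unsigned char)((v >> 24) & 0xff);
}
```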
Chapter 4. ioctls: Not writing a new system call
A system call generally looks like this:
	asmlinkage int sys_mycall(int arg)
	{
		return 0;
	}
First, in most cases you don’t want to create a new system call. You create a character device and
implement an appropriate ioctl for it. This is much more flexible than system calls, doesn’t have to be
entered in every architecture’s include/asm/unistd.h and arch/kernel/entry.S file, and is
much more likely to be accepted by Linus.
Inside the ioctl you’re in user context to a process. When an error occurs you return a negated errno (see
include/linux/errno.h), otherwise you return 0.
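The return convention can be illustrated with a userspace mock of an ioctl-style handler. This is a sketch under loud assumptions: the device structure, the command numbers and the function name are all hypothetical, and the argument is an ordinary pointer, whereas a real handler receives a userspace address that has to be copied in and out.

```c
#include <errno.h>

/* Hypothetical command numbers and limit, for illustration only. */
#define DEMO_GET_LEVEL 1
#define DEMO_SET_LEVEL 2
#define DEMO_MAX_LEVEL 10

struct demo_dev {
	int level;
};

/* Returns 0 on success, or a negated errno value on failure. */
int demo_ioctl(struct demo_dev *dev, unsigned int cmd, int *arg)
{
	switch (cmd) {
	case DEMO_GET_LEVEL:
		*arg = dev->level;
		return 0;
	case DEMO_SET_LEVEL:
		if (*arg < 0 || *arg > DEMO_MAX_LEVEL)
			return -EINVAL;	/* bad argument: state untouched */
		dev->level = *arg;
		return 0;
	default:
		return -ENOTTY;		/* command we do not implement */
	}
}
```

The caller only has to test for a negative result; the magnitude tells it what went wrong.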
After you have slept you should check if a signal occurred: the Unix/Linux way of handling signals is to
temporarily exit the system call with the -ERESTARTSYS error. The system call entry code will switch
back to user context, process the signal handler and then your system call will be restarted (unless the
user disabled that). So you should be prepared to process the restart, e.g. if you’re in the middle of
manipulating some data structure.
	if (signal_pending(current))
		return -ERESTARTSYS;
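What ‘be prepared to process the restart’ means in practice is that every early exit must leave your data structure consistent, so that the restarted call can carry on. The userspace mock below is only a sketch of that idea. All names are hypothetical: the signal check is a function pointer standing in for the real test, the constant mirrors the kernel-internal ERESTARTSYS value (which userspace headers do not provide), and progress is kept in a structure that outlives the call.

```c
/* Stand-in for the kernel-internal ERESTARTSYS in this sketch. */
#define ERESTARTSYS_MODEL 512

struct demo_queue {
	int drained;	/* items processed so far; survives a restart */
	int total;	/* items to process in all */
};

/* Demo signal source: reports a signal on the check after this many
 * clean checks; a negative value means no signal will arrive. */
int demo_checks_until_signal = -1;

int demo_signal_pending(void)
{
	if (demo_checks_until_signal < 0)
		return 0;
	if (demo_checks_until_signal == 0) {
		demo_checks_until_signal = -1;
		return 1;
	}
	demo_checks_until_signal--;
	return 0;
}

/* Process the queue one item at a time. Returns 0 when finished, or
 * -ERESTARTSYS_MODEL if interrupted; q is consistent either way. */
int demo_drain(struct demo_queue *q, int (*signal_is_pending)(void))
{
	while (q->drained < q->total) {
		/* A real routine would sleep here waiting for work. */
		if (signal_is_pending())
			return -ERESTARTSYS_MODEL;
		q->drained++;	/* one whole item, never half of one */
	}
	return 0;
}
```

Calling demo_drain() again after an interrupted attempt resumes from q->drained rather than redoing or skipping work.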
If you’re doing longer computations: first think userspace. If you really want to do it in the kernel you should
regularly check if you need to give up the CPU (remember there is cooperative multitasking per CPU).
Idiom:
if (current-