The sysfs Filesystem
Abstract
sysfs is a feature of the Linux 2.6 kernel that allows kernel code to export information to user processes via an in-memory filesystem. The organization of the filesystem directory hierarchy is strict, and based on the internal organization of kernel data structures. The files that are created in the filesystem are (mostly) ASCII files with (usually) one value per file. These features ensure that the information exported is accurate and easily accessible, making sysfs one of the most intuitive and useful features of the 2.6 kernel.
Introduction
sysfs is a mechanism for representing kernel objects, their attributes, and their relationships with each other. It provides two components: a kernel programming interface for exporting these items via sysfs, and a user interface for viewing and manipulating these items that maps back to the kernel objects they represent. The table below shows the mapping between internal (kernel) constructs and their external (userspace) sysfs mappings.
Internal | External |
Kernel Objects | Directories |
Object Attributes | Regular Files |
Object Relationships | Symbolic Links |
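As a minimal userspace sketch of this mapping, the Python function below labels a path under a sysfs mount by the construct it represents. The function name classify_entry and the label strings are illustrative assumptions, not part of sysfs or of any existing tool.

```python
import os


def classify_entry(path):
    """Return the kernel construct that the sysfs entry at `path` represents."""
    # Test for a symbolic link first: os.path.isdir() and os.path.isfile()
    # follow links, so a link to a directory would otherwise look like one.
    if os.path.islink(path):
        return "object relationship"
    if os.path.isdir(path):
        return "kernel object"
    if os.path.isfile(path):
        return "object attribute"
    return "unknown"
```

Calling classify_entry on a top-level directory of a mounted sysfs should return "kernel object", since each such directory corresponds to a registered subsystem.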
sysfs is a core piece of kernel infrastructure, which means that it provides a relatively simple interface to perform a simple task. Rarely is the code overly complicated, or the descriptions obtuse. However, like many core pieces of infrastructure, it can get a bit too abstract and far removed to keep track of. To help alleviate that, this paper takes a gradual approach to sysfs before getting to the nitty-gritty details.
First, a short but touching history describes its origins. Then crucial information about mounting and accessing sysfs is included. Next, the directory organization and layout of subsystems in sysfs is described. This provides enough information for a user to understand the organization and content of the information that is exported through sysfs, though for reasons of time and space constraints, not every object and its attributes are described.
The primary goal of this paper is to provide a technical overview of the internal sysfs interface -- the data structures and the functions that are used to export kernel constructs to userspace. It describes the functions among the three concepts mentioned above -- Kernel Objects, Object Attributes, and Object Relationships -- and dedicates a section to each one. It also provides a section for each of the two additional regular file interfaces created to simplify some common operations -- Attribute Groups and Binary Attributes.
sysfs is a conduit of information between the kernel and user space. There are many opportunities for user space applications to leverage this information. Existing uses include tuning I/O scheduler parameters and the udev program. The final section describes a sampling of the current applications that use sysfs and attempts to provide enough inspiration to spawn more development in this area.
Because it is a simple and mostly abstract interface, much time can be spent describing its interactions with each subsystem that uses it. This is especially true for the kobject and driver models, which are both new features of the 2.6 kernel and heavily intertwined with sysfs. It would be impossible to do those topics justice in such a medium, so they are left as subjects for other documents. Readers still curious about these and related topics are encouraged to read [4].
1 The History of sysfs
sysfs is an in-memory filesystem that was originally based on ramfs. ramfs was written around the time the 2.4.0 kernel was being stabilized. It was an exercise in elegance, as it showed just how easy it was to write a simple filesystem using the then-new VFS layer. Because of its simplicity and use of the VFS, it provided a good base from which to derive other in-memory based filesystems.
sysfs was originally called ddfs (Device Driver Filesystem) and was written to debug the new driver model as it was being written. That debug code had originally used procfs to export a device tree, but under strict urging from Linus Torvalds, it was converted to use a new filesystem based on ramfs.
By the time the new driver model was merged into the kernel around 2.5.1, it had changed names to driverfs to be a little more descriptive. During the next year of 2.5 development, the infrastructural capabilities of the driver model and driverfs began to prove useful to other subsystems. kobjects were developed to provide a central object management mechanism and driverfs was converted to sysfs to represent its subsystem agnosticism.
2 Mounting sysfs
sysfs can be mounted from userspace just like any other memory-based filesystem. The command for doing so is listed in Table 1.
mount -t sysfs sysfs /sys
Table 1: A sysfs mount command
sysfs can also be mounted automatically on boot using the file /etc/fstab. Most distributions that support the 2.6 kernel have entries for sysfs in /etc/fstab. An example entry is shown in Table 2.
sysfs /sys sysfs noauto 0
Table 2: A sysfs entry in /etc/fstab
Note the directory that sysfs is mounted on: /sys. That is the de facto standard location for the sysfs mount point, and it was adopted without objection by every major distribution.
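To check where sysfs is currently mounted, one option is to scan the kernel's mount table. The sketch below parses text in the /proc/mounts format (device, mount point, filesystem type, options, and two numeric fields); /proc/mounts is not discussed in this paper, so treat its use here, and the function name sysfs_mount_points, as assumptions.

```python
def sysfs_mount_points(mounts_text):
    """Return the mount point of every sysfs entry in /proc/mounts-style text."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # Field order: device, mount point, filesystem type, options, dump, pass.
        if len(fields) >= 3 and fields[2] == "sysfs":
            points.append(fields[1])
    return points
```

On a system where procfs is mounted, passing the contents of /proc/mounts to this function would be expected to yield a list containing /sys.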
3 Navigating sysfs
Since sysfs is simply a collection of directories, files, and symbolic links, it can be navigated and manipulated using simple shell utilities. The author recommends the tree(1) utility. It was an invaluable aid during the development of the core kernel object infrastructure.
/sys/
|-- block
|-- bus
|-- class
|-- devices
|-- firmware
|-- module
`-- power
Table 3: Top level sysfs directories
At the top level of the sysfs mount point are a number of directories. These directories represent the major subsystems that are registered with sysfs. At the time of publication, this consisted of the directories listed in Table 3. These directories are created at system startup when the subsystems register themselves with the kobject core. After they are initialized, they begin to discover objects, which are registered within their respective directories.
The method by which objects register with sysfs and how directories are created is explained later in the paper. In the meantime, the curious are encouraged to meander on their own through the sysfs hierarchy. The meaning of each subsystem and its contents follows.
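For readers without tree(1) at hand, the following Python sketch produces a similar listing to a limited depth. It is a rough stand-in written for this paper, not a reimplementation of tree(1); the name render_tree and the max_depth parameter are assumptions. Symbolic links are printed with their targets but never followed, which avoids loops in the hierarchy.

```python
import os


def render_tree(root, max_depth=1):
    """Return tree(1)-style lines for `root`, descending at most `max_depth` levels."""
    lines = [root.rstrip("/") + "/"]

    def walk(path, prefix, depth):
        if depth > max_depth:
            return
        try:
            names = sorted(os.listdir(path))
        except OSError:
            return  # unreadable directory: skip it silently
        for index, name in enumerate(names):
            last = index == len(names) - 1
            full = os.path.join(path, name)
            label = name
            if os.path.islink(full):
                label += " -> " + os.readlink(full)
            lines.append(prefix + ("`-- " if last else "|-- ") + label)
            # Descend into real directories only, never through symlinks.
            if os.path.isdir(full) and not os.path.islink(full):
                walk(full, prefix + ("    " if last else "|   "), depth + 1)

    walk(root, "", 1)
    return lines
```

Printing "\n".join(render_tree("/sys")) on a system with sysfs mounted at /sys should give a listing of the top-level subsystem directories.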
3.1 block
The block directory contains subdirectories for each block device that has been discovered in the system. In each block device's directory are attributes that describe many things, including the size of the device and the dev_t number that it maps to. There is a symbolic link that points to the physical device that the block device maps to (in the physical device tree, which is explained later). There is also a directory that exposes an interface to the I/O scheduler. This interface provides some statistics about the device request queue and some tunable features that a user or administrator can use to optimize performance, including the ability to dynamically change which I/O scheduler is used.
Each partition of each block device is represented as a subdirectory of the block device. Included in these directories are read-only attributes about the partitions.
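The one-value-per-file convention makes these attributes easy to consume from a script. The sketch below reads two attributes for every block device; it assumes the attribute files are named size and dev, and that size is reported in 512-byte sectors. The helper names read_attr and block_devices are invented for this example.

```python
import os


def read_attr(path):
    """Read a single-value sysfs attribute and strip the trailing newline."""
    with open(path) as f:
        return f.read().strip()


def block_devices(sys_root="/sys"):
    """Map each block device name to its size in sectors and its dev_t string."""
    block_dir = os.path.join(sys_root, "block")
    devices = {}
    for name in sorted(os.listdir(block_dir)):
        dev_dir = os.path.join(block_dir, name)
        devices[name] = {
            # Assumed to be a count of 512-byte sectors.
            "sectors": int(read_attr(os.path.join(dev_dir, "size"))),
            # Formatted as "major:minor".
            "dev": read_attr(os.path.join(dev_dir, "dev")),
        }
    return devices
```

The sys_root parameter exists only so the function can be pointed at a test directory; on a real system the default of /sys is the expected value.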
3.2 bus
The bus directory contains subdirectories for each physical bus type that has support registered in the kernel (either statically compiled or loaded via a module). Partial output is listed in Table 4.
bus/
|-- ide
|-- pci
|-- scsi
`-- usb
Table 4: The bus directory
Each bus type that is represented has two subdirectories: devices and drivers. The devices directory contains a flat listing of every device discovered on that type of bus in the entire system. The devices listed are actually symbolic links that point to the device's directory in the global device tree. An example listing is shown in Table 5.
bus/pci/devices/
|-- 0000:00:00.0 -> ../../../devices/pci0000:00/0000:00:00.0
|-- 0000:00:01.0 -> ../../../devices/pci0000:00/0000:00:01.0
|-- 0000:01:00.0 -> ../../../devices/pci0000:00/0000:00:01.0/0000:01:00.0
|-- 0000:02:00.0 -> ../../../devices/pci0000:00/0000:00:1e.0/0000:02:00.0
|-- 0000:02:00.1 -> ../../../devices/pci0000:00/0000:00:1e.0/0000:02:00.1
|-- 0000:02:01.0 -> ../../../devices/pci0000:00/0000:00:1e.0/0000:02:01.0
`-- 0000:02:02.0 -> ../../../devices/pci0000:00/0000:00:1e.0/0000:02:02.0
Table 5: PCI devices represented in bus/pci/devices/
The drivers directory contains directories for each device driver that has been registered with the bus type. Within each of the drivers' directories are attributes that allow viewing and manipulation of driver parameters, and symbolic links that point to the physical devices (in the global device tree) that the driver is bound to.
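Because each entry in a bus's devices directory is a symbolic link into the global device tree, resolving the links recovers each device's position in that tree. The following sketch does so for one bus; the function name bus_devices and its parameters are assumptions made for illustration.

```python
import os


def bus_devices(bus, sys_root="/sys"):
    """Map each device name on `bus` to its path in the global device tree."""
    devices_dir = os.path.join(sys_root, "bus", bus, "devices")
    root = os.path.realpath(sys_root)
    result = {}
    for name in sorted(os.listdir(devices_dir)):
        link = os.path.join(devices_dir, name)
        if not os.path.islink(link):
            continue  # entries here are expected to be symbolic links
        # Resolve the link, then express the target relative to the mount point.
        result[name] = os.path.relpath(os.path.realpath(link), root)
    return result
```

For a PCI listing like the one shown for bus/pci/devices/, bus_devices("pci") would be expected to map 0000:00:00.0 to devices/pci0000:00/0000:00:00.0.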
3.3 class
The class directory contains representations of every device class that is registered with the kernel. A device class describes a functional type of device. Examples of classes are shown in Table 6.
class/
|-- graphics
|-- input
|-- net
|-- printer
|-- scsi_device
|-- sound
`-- tty
Table 6: The class directory
Each device class contains subdirectories for each class object that has been allocated and registered with that device class. For most class device objects, their directories contain symbolic links to the device and driver directories (in the global device hierarchy and the bus hierarchy respectively) that are associated with that class object.
Note that there is not necessarily a 1:1 mapping between class objects and physical devices; a physical device may contain multiple class objects that each perform a different logical function. For example, a physical mouse device might map to a kernel mouse object, as well as a generic "input event" device and possibly an "input debug" device.
Each class and class object may contain attributes exposing parameters that describe or control the class object. The contents and format, though, are completely class dependent and depend on the support present in one's kernel.
3.4 devices
The devices directory contains the global device hierarchy. This contains every physical device that has been discovered by the bus types registered with the kernel. It represents them in an ancestrally correct way--each device is shown as a subordinate device of the device that it is physically (electrically) subordinate to.
There are two types of devices that are exceptions to this representation: platform devices and system devices. Platform devices are peripheral devices that are inherent to a particular platform. They usually have some I/O ports, or MMIO, that exist at a known, fixed location. Examples of platform devices are legacy x86 devices like a serial controller or a floppy controller, or the embedded devices of a SoC solution.
System devices are non-peripheral devices that are integral components of the system. In many ways, they are nothing like any other device. They may have some hardware register access for configuration, but do not have the capability to transfer data. They usually do not have drivers which can be bound to them. However, at least those represented through sysfs have some architecture-specific code that configures them and treats them enough as objects to export them. Examples of system devices are CPUs, APICs, and timers.
3.5 firmware
The firmware directory contains interfaces for viewing and manipulating firmware-specific objects and attributes. In this case, 'firmware' refers to the platform-specific code that is executed on system power-on, like the x86 BIOS, OpenFirmware on PPC platforms, and EFI on ia64 platforms.
Each directory contains a set of objects and attributes that is specific to the firmware "driver" in the kernel. For example, in the case of ACPI, every object found in the ACPI DSDT table is listed in the firmware/acpi/namespace/ directory.
3.6 module
The module directory contains subdirectories for each module that is loaded into the kernel. The name of each directory is the name of the module -- both the name of the module object file and the internal name of the module. Every module is represented here, regardless of the subsystem it registers an object with. Note that the kernel has a single global namespace for all modules.
Within each module directory is a subdirectory called sections. This subdirectory contains attributes about the module sections. This information is used for debugging and is generally not very interesting.
Each module directory also contains at least one attribute: refcnt. This attribute displays the current reference count, or number of users, of the module. This is the same value shown in the third ("Used by") column of lsmod(8) output.
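A short sketch of how the refcnt attribute might be collected for every loaded module follows. The function name module_refcounts is an assumption, and so is the decision to skip any directory that lacks a refcnt file rather than treat it as an error.

```python
import os


def module_refcounts(sys_root="/sys"):
    """Map each module name to the integer stored in its refcnt attribute."""
    module_dir = os.path.join(sys_root, "module")
    counts = {}
    for name in sorted(os.listdir(module_dir)):
        refcnt_path = os.path.join(module_dir, name, "refcnt")
        try:
            with open(refcnt_path) as f:
                counts[name] = int(f.read().strip())
        except FileNotFoundError:
            # Assumption: a directory without refcnt is skipped, not an error.
            continue
    return counts
```

Sorting the resulting dictionary by value gives a quick view of which modules are most heavily used.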
3.7 power
The power directory represents the under-used power subsystem. It currently contains only two attributes: disk, which controls the method by which the system will suspend to disk; and state, which allows a process to enter a low power state. Reading the state file displays which states the system supports.
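Reading the supported states can be sketched in a few lines. The example assumes the state file holds a whitespace-separated list of state names; the function name supported_power_states is invented here. It only reads the file and never writes to it, since a write would request a state transition.

```python
import os


def supported_power_states(sys_root="/sys"):
    """Return the state names listed in the power/state attribute."""
    with open(os.path.join(sys_root, "power", "state")) as f:
        # Assumed format: names separated by whitespace on a single line.
        return f.read().split()
```

Reading the attribute may require no special privileges, but that depends on the file's permissions on the system in question.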
4 General Kernel Information
4.1 Code Organization
The code for sysfs resides in fs/sysfs/ and its shared function prototypes are in include/linux/sysfs.h. It is relatively small (~2000 lines), but it is divided up among 9 files, including the shared header file. The organization of these files is listed below. The contents of each of these files is described in the next section.
- include/linux/sysfs.h - Shared header file containing function prototypes and data structure definitions.
- fs/sysfs/sysfs.h - Internal header file for sysfs. Contains function definitions shared locally among the sysfs source.
- fs/sysfs/mount.c - This contains the data structures, methods, and initialization functions necessary for interacting with the VFS layer.
- fs/sysfs/inode.c - This file contains internal functions shared among the sysfs source for allocating and freeing the core filesystem objects.
- fs/sysfs/dir.c - This file contains the externally visible sysfs interface responsible for creating and removing directories in the sysfs hierarchy.
- fs/sysfs/file.c - This file contains the externally visible sysfs interface responsible for creating and removing regular, ASCII files in the sysfs hierarchy.
- fs/sysfs/group.c - This file contains a set of externally visible helpers that aid in the creation and deletion of multiple regular files at a time.
- fs/sysfs/symlink.c - This file contains the externally visible interface responsible for creating and removing symlinks in the sysfs hierarchy.
- fs/sysfs/bin.c - This file contains the externally visible sysfs interface responsible for creating and removing binary (non-ASCII) files.
4.2 Initialization
sysfs is initialized in fs/sysfs/mount.c, via the sysfs_init function. This function is called directly by the VFS initialization code. It must be called early, since many subsystems depend on sysfs being initialized to register objects with. This function is responsible for doing three things.
- Creating a kmem_cache. This cache is used for the allocation of sysfs_dirent objects. These are discussed in a later section.
- Registering with the VFS. register_filesystem() is called with the sysfs_fs_type object. This sets up the appropriate super block methods and adds a filesystem with the name sysfs.
- Mounting itself internally. This is done to ensure that it is always available for other kernel code to use, even early in the boot process, instead of depending on user interaction to explicitly mount it.
Once these actions complete, sysfs is fully functional and ready to use by all internal code.
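The registration step has a visible consequence on a running system: the filesystem name appears in the kernel's list of registered filesystem types. The sketch below parses text in the /proc/filesystems format, where a leading "nodev" marks a type that needs no block device. /proc/filesystems is not covered in this paper, so its use, like the function name registered_filesystems, is an assumption of this example.

```python
def registered_filesystems(text):
    """Parse /proc/filesystems-style text into {name: needs_block_device}."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "nodev" and len(fields) == 2:
            # "nodev <name>": the filesystem is not backed by a block device.
            result[fields[1]] = False
        else:
            result[fields[-1]] = True
    return result
```

On a kernel with sysfs enabled, the parsed result would be expected to contain the key "sysfs" with the value False, matching its in-memory nature.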
4.3 Configuration
sysfs is compiled into the kernel by default. It is dependent on the configuration option CONFIG_SYSFS. CONFIG_SYSFS is only visible if the CONFIG_EMBEDDED option is set, which provides many options for configuring the kernel for size-constrained environments. In general, it is considered a good idea to leave sysfs configured in a custom-compiled kernel. Many tools currently do, and probably will in the future, depend on sysfs being present in the system.
4.4 Licensing
The sysfs code is licensed under the GPLv2. While most of it is now original, it did originate as a clone of ramfs, which is licensed under the same terms. All of the externally visible interfaces are original works, and are of course also licensed under the GPLv2.
The external interfaces are exported to modules, but only to GPL-compatible modules, using the macro EXPORT_SYMBOL_GPL. This is done for reasons of maintainability and derivability. sysfs is a core component of the kernel. Many subsystems rely on it, and while it is a stable piece of infrastructure, it occasionally must change. In order to develop the best possible modifications, it's imperative that all callers of sysfs interfaces be audited and updated in lock-step with any sysfs interface changes. By requiring that all users be licensed in a GPL manner, and hopefully merged into the kernel, the level of difficulty of an interface change can be greatly reduced.
Also, since sysfs was developed initially as an extension of the driver model and has gone through many iterations of evolution, it has a very explicit interaction with its users. To develop code that used sysfs but was not copied or derived from an existing in-kernel GPL-based user would be difficult, if not impossible. By requiring GPL-compatibility in the users of sysfs, this can be made explicit and help prevent falsification of derivability.
5 Kernel Interface Overview
The sysfs functions visible to kernel code are divided into three categories, based on the type of object they are exporting to userspace (and the type of object in the filesystem they create).
- Kernel Objects (Directories).
- Object Attributes (Regular Files).
- Object Relationships (Symbolic Links).
There are also two other sub-categories of exporting attributes that were developed to accommodate users that needed to export files other than single-value ASCII files. Both of these categories result in regular files being created in the filesystem.
- Attribute Groups
- Binary Files
The first parameter to all sysfs functions is the kobject being manipulated (hereafter referred to as k). The sysfs core assumes that this kobject will remain valid throughout the function; i.e., it will not be freed. The caller is always responsible for ensuring that any locks needed to protect the object from modification are held across all calls into sysfs.
For almost every function (the exception being sysfs_create_dir), the sysfs core assumes that k->dentry is a pointer to a valid dentry that was previously allocated and initialized.
All sysfs function calls must be made from process context. They should also not be called with any spinlocks held, as many of them take semaphores directly and all call VFS functions which may also take semaphores and cause the process to sleep.
6 Kernel Objects
Kernel objects are exported as directories via sysfs. The functions for manipulating these directories are listed in Table 7.
6.1 Creating Directories
sysfs_create_dir is the only sysfs function that does not rely on a directory having already been created in sysfs for the kobject (since it performs the crucial action of creating that directory). It does rely on the following parameters being valid:
- k->parent
- k->name
These parameters control where the directory will be located and what it will be called. The location of the new directory is implied by the value of k->parent; it is created as a subdirectory of that. In all cases, the subsystem (not a low-level driver) will fill in that field with information it knows about the object when the object is registered with the subsystem. This provides a simple mechanism for creating a complete user-visible object tree that accurately represents the internal object tree within the kernel.
It is possible to call sysfs_create_dir without k->parent set; it will simply create a directory at the very top level of the sysfs filesystem. This should be avoided unless one is writing or porting a new top-level subsystem using the kobject/sysfs model.
When sysfs_create_dir() is called, a dentry (the object necessary for most VFS transactions) is allocated for the directory, and is placed in k->dentry. An inode is created, which makes a user-visible entity, and that is stored in the new dentry. sysfs fills in the file_operations for the new directory with a set of internal methods that exhibit standard behavior when called via the VFS system call interface. The return value is 0 on success and a negative errno code if an error occurs.
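The way k->parent and k->name determine a directory's location can be imitated in userspace. The toy model below is Python, not kernel code: ToyKobject and toy_create_dir are hypothetical names, an ordinary directory stands in for the sysfs mount, and the path attribute plays the role that k->dentry plays in the kernel. It mirrors only the behavior described above, including the 0 or negative errno return convention; returning -ENOENT when the parent has no directory yet is a choice made for this sketch.

```python
import errno
import os


class ToyKobject:
    """Userspace stand-in for a kobject: just a name and an optional parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.path = None  # set once a directory exists for this object


def toy_create_dir(k, mount_point):
    """Create a directory for `k` under its parent; return 0 or a negative errno."""
    if k.parent is None:
        base = mount_point  # no parent: create at the top level
    elif k.parent.path is None:
        return -errno.ENOENT  # toy rule: the parent's directory must exist first
    else:
        base = k.parent.path
    path = os.path.join(base, k.name)
    try:
        os.mkdir(path)
    except OSError as exc:
        return -exc.errno
    k.path = path
    return 0
```

Creating a "bus" object with no parent and then a "pci" object whose parent is "bus" yields bus/pci under the chosen directory, which is the same shape as the user-visible tree described above.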
6.2 Removing Directories
sysfs_remove_dir will remove an object's directory. It will also remove any regular files that reside in the directory. This was an original feature of the filesystem to make it easier to use (so all code that created attributes for an object would not be required to be called when an object was removed). However, this feature has been a source of several race conditions over the years and should not be relied on, in the hope that it can one day be removed. All code that adds attributes to an object's directory should explicitly remove those attributes when the object is removed.
6.3 Renaming Directories
sysfs_rename_dir is used to give a directory a new name. When this function is called, sysfs will allocate a new dentry for the kobject and call the kobject routine to change the object's name. If the rename succeeds, this function will return 0. Otherwise, it will return a negative errno value specifying the error that occurred.
It is not possible at this time to move a sysfs directory from one parent to another.
7 Object Attributes
8 Object Relationships
9 Attribute Groups
10 Binary Attributes
11 Current sysfs Users
12 Conclusion
References