1.7 --1.8 SDK-RMIOS

1.7 Scanf/Print family of routines

The input and output support in RMIOS is through newlib, that provides

APIs and functionality similar to that provided by glibc.

rmios的输入输出支持,是newlib,API和glbc类似。

 

In RMIOS environment, the standard input, output and error streams are

mapped to UART device 0.

RMIOS下的标准输入输出以及错误流,都是映射到UART设备0上的。

 

 

The functions provided by newlib are not synchronized, and attempts by

multiple instances of an RMIOS application to write and read using

newlib functions might result in garbled output and bad input. Hence,

for synchronizing output, RMIOS lib provides the ’printk’ macro, which

holds a spinlock while the output is written to the UART device. Please

consult the API guide for its usage and the header files to include.

 

 

WARNING: ’printk’ does not use a recursive spinlock and might lead to

a deadlock.

printk没用循环的 旋转锁,可能会导致死锁。

For example:

f()

{

printk ("g() returns %d", g());

}

g()

{

printk ("this will not be prined due to deadlock/n");

return 0;

}

BUGS:

scanf() doesn’t read the floating point value with precision. For example,

a value of 4.58 is read as 4.5498. This will be investigated in a subsequent

release

 

*********************************************

1.8 Ext2 File System

*********************************************

 

是个RAM disk,使用的是ext2rd 库。

 

RMI OS EXT2 RAM DISK  

RMI OS ram disk is a library implementing an ext2 file system

to make file I/O functions available to PSB applications.

Although, the current implementation provides support for a ram

disk device. It can also be applied to other devices as well.

Using ext2rd library, PSB applications can use familiar file

I/O functions to store and retrieve temporary result in an

also-familiar hierarchical分级的 file system.

 

The below diagram illustrates the software layer architecture of

RMI OS EXT2-ram disk library:

 

下面是EXT2-RAM DISK 库的 软件体系结构:

+--------------------------------------+

| PSB Applications |  PSB应用程序

| |

+--------------------------------------+

| File I/O Functions |  文件I/O函数

| |

| fopen(), fclose(), fread(), fwrite() |

| ................................... |

----- +--------------------------------------+

E | EXT2 File System |  EXT2文件系统

X | |

T | File/Dir <---> Inode Mapping |

2 +--------------------------------------+

| RAM DISK DRIVER |   RAM DISK 驱动

L | |

I | rd_init(), rd_read(), rd_write(),... |

B | |

----- +--------------------------------------+

 

 

Ram disk driver will provide underlying support for logical sector addressing

initiated from the EXT2 file system module. The file I/O functions, in turn,

use EXT2 file system API to access files and directories stored on the ram disk.

RAM disk 驱动提供底层支持,实现EXT2文件系统模块的逻辑扇区寻址的功能。

文件 I/O 函数,使用EXT2文件系统的API 访问存储在ram disk上的文件和目录。

 

 

The ram disk will be constructed from a continuous memory region by

paritioning the memory region into fixed-size and linear sectors. In current

implementation, more than one file systems is supported and therefore, can be

simultaneously mounted.

ram disk 从一个连续的内存区域

 

The followings identify the features and limitations of the PSBAPP ext2 FS

implementation:

1. Support multiple file systems. However, mounting a file system on a directory

of another file system has not been supported yet. Mount point can be any string

in file system expression, e.g. /, /mnt, /fs1, /fsN, /log/cpu1, ...

支持多种文件系统。

 

However, if you have a mounted file system as /mnt and a directory named ’dir’

on /mnt, you can still mount another file system with the name ’/mnt/dir’.

However, a list of ’/mnt/dir’ will list the content of the root directory of

the later file system instead of the ’dir’ directory of the former FS.

 

2. Currently, the current implementation will provide support for only 1 block group.

Support for multiple block group can be provided with some changes.

 

3. Access control will be supported to avoid multiple application instances

from overwriting each other files. However, such a support requires the PSBAPP

operating environment to provide support for process/thread identifier.

The getpid(), getuid() functions are currently defined in ext2fs.h to get the

code compiled without error. They can be replaced by the real functions when

RMI OS provides support for the notion of user/process/thread id.

 

4. We also need a system timer to provide timestamp for file creation/access/modification.

Currently, file timestamp are set to 0 (see the definition of time() in ext2fs.h).

我们需要一个系统定时器提供文件创建/访问/修改 的时间戳。

当前,文件时间戳被设置为0(参见ex2fs.h中的time()的定义)。

 

5. The some improvements and error handlings need to be taken care of for a robust i

implementation. These are documented in the source code with a prefix前缀 of "TODO".

A grep of this string will reveal显示、泄露 all things that need to be improved.

 

6. File I/O API is extensively implemented to maximum compatibility with Linux applications.

However, they are not yet integrated to newlib. To do this, you simply remove the

"pp_" prefix in the functions in fileio.c.

FILE I/O API 扩展实现了最大限度的与linux应用程序的兼容。然而,还未被集成进newlib。要想集成进去,应该在fileio.c

函数中去除pp_ 前缀。

 

FILES:

RMI OS EXT2 file system is implemented as a library, which can later be integrated to

RMI OS core function if so desired. Currently, to take advantage of file I/O functions,

PSB applications link itself to the library. The source files for this library is in

ext2rd_lib directory.

RMIOS的EXT2文件系统被实现为一个库,可以被集成到RMIOS 核心函数中。当前,要想利用I/O函数,PSB应用程序把自己连接到库中。

库的源码文件都是在ext2rd_lib目录下。

 

ext_io.c implement EXT2 file system API functions. These functions are used by

file I/O layer to access files/directories.

ext_io.c 实现EXT2文件系统API ,这些函数,文件I/O层会用到以访问文件/目录。

 

ramdisk.c implement a simple ram disk driver.

ramdisk.c实现了一个简单的ram disk 驱动。

 

ext2_inode.c implement inode-related functions of the EXT2 FS module.

ext2_inode.c 实现的是inode相关的EXT2文件系统模块函数。

 

ext2_block.c implement block-related functions of the EXT2 FS module.

ext2_block.c 块设备函数。

 

ext2_dir.c implement directory traversal, file/dir creation/delete functions.

ext2_dir.c 目录移动,文件/目录 创建/删除 函数。

 

fileio.c implement file I/O API.  

fileio.c 实现文件I/O API

 

ext2fs.h is the only EXT2-related include files used by the library source files.

ext2fs.h 是库源文件唯一使用的EXT2相关的包含文件。

 

Makefile make file to build the library

makefile文件是用来编译库的。

 

There is also a test program under directory ’ext2rd’.

ext2rd是个测试程序

 

The Second Extended Filesystem

==============================

ext2 was originally released in January 1993. Written by R/’emy Card,

Theodore Ts’o and Stephen Tweedie, it was a major rewrite of the

Extended Filesystem. It is currently still (April 2001) the predominant

filesystem in use by Linux. There are also implementations available

for NetBSD, FreeBSD, the GNU HURD, Windows 95/98/NT, OS/2 and RISC OS.

Options

=======

When mounting an ext2 filesystem, the following options are accepted.

Defaults are marked with (*).

bsddf (*) Makes ‘df’ act like BSD.

minixdf Makes ‘df’ act like Minix.

check=none, nocheck (*) Don’t do extra checking of bitmaps on mount

(check=normal and check=strict options removed)

debug Extra debugging information is sent to the

kernel syslog. Useful for developers.

 

errors=continue (*) Keep going on a filesystem error.

errors=remount-ro Remount the filesystem read-only on an error.

errors=panic Panic and halt the machine if an error occurs.

grpid, bsdgroups Give objects the same group ID as their parent.

nogrpid, sysvgroups (*) New objects have the group ID of their creator.

resuid=n The user ID which may use the reserved blocks.

resgid=n The group ID which may use the reserved blocks.

sb=n Use alternate superblock at this location.

grpquota,noquota,quota,usrquota Quota options are silently ignored by ext2.

 

 

Specification

=============

ext2 shares many properties with traditional Unix filesystems. It has

the concepts of blocks, inodes and directories. It has space in the

specification for Access Control Lists (ACLs), fragments, undeletion and

compression though these are not yet implemented (some are available as

separate patches). There is also a versioning mechanism to allow new

features (such as journalling) to be added in a maximally compatible

manner.

 

Blocks

------

The space in the device or file is split up into blocks. These are

a fixed size, of 1024, 2048 or 4096 bytes (8192 bytes on Alpha systems),

which is decided when the filesystem is created. Smaller blocks mean

less wasted space per file, but require slightly more accounting overhead,

and also impose other limits on the size of files and the filesystem.

 

Block Groups

 

------------

Blocks are clustered into block groups in order to reduce fragmentation

and minimise the amount of head seeking when reading a large amount

of consecutive data. Information about each block group is kept in a

descriptor table stored in the block(s) immediately after the superblock.

Two blocks near the start of each group are reserved for the block usage

bitmap and the inode usage bitmap which show which blocks and inodes

are in use. Since each bitmap is limited to a single block, this means

that the maximum size of a block group is 8 times the size of a block.

The block(s) following the bitmaps in each block group are designated

as the inode table for that block group and the remainder are the data

blocks. The block allocation algorithm attempts to allocate data blocks

in the same block group as the inode which contains them.

 

The Superblock超级块

--------------

The superblock contains all the information about the configuration of

the filing system.

 超级快含有文件系统的所有信息。

The primary copy of the superblock is stored at an

offset of 1024 bytes from the start of the device, and it is essential

to mounting the filesystem.

 超级块的主要copy放在距离设备起始地址1024字节的地方,这对mount文件系统非常重要。

Since it is so important, backup copies of

the superblock are stored in block groups throughout the filesystem.

The first version of ext2 (revision 0) stores a copy at the start of

every block group, along with backups of the group descriptor block(s).

 超级块的重要性,使得超级快的备份存储在文件系统的block groups中。

 

Because this can consume a considerable amount of space for large

filesystems, later revisions can optionally reduce the number of backup

copies by only putting backups in specific groups (this is the sparse

superblock feature). The groups chosen are 0, 1 and powers of 3, 5 and 7.

The information in the superblock contains fields such as the total

number of inodes and blocks in the filesystem and how many are free,

how many inodes and blocks are in each block group, when the filesystem

was mounted (and if it was cleanly unmounted), when it was modified,

what version of the filesystem it is (see the Revisions section below)

and which OS created it.

If the filesystem is revision 1 or higher, then there are extra fields,

such as a volume name, a unique identification number, the inode size,

and space for optional filesystem features to store configuration info.

All fields in the superblock (as in all other ext2 structures) are stored

on the disc in little endian format, so a filesystem is portable between

machines without having to know what machine it was created on.

 

Inodes 目录结点

------

The inode (index node) is a fundamental concept in the ext2 filesystem.

Each object in the filesystem is represented by an inode. The inode

structure contains pointers to the filesystem blocks which contain the

data held in the object and all of the metadata about an object except

its name. The metadata about an object includes the permissions, owner,

group, flags, size, number of blocks used, access time, change time,

modification time, deletion time, number of links, fragments, version

(for NFS) and extended attributes (EAs) and/or Access Control Lists (ACLs).

There are some reserved fields which are currently unused in the inode

structure and several which are overloaded. One field is reserved for the

directory ACL if the inode is a directory and alternately for the top 32

bits of the file size if the inode is a regular file (allowing file sizes

larger than 2GB). The translator field is unused under Linux, but is used

by the HURD to reference the inode of a program which will be used to

interpret this object. Most of the remaining reserved fields have been

used up for both Linux and the HURD for larger owner and group fields,

The HURD also has a larger mode field so it uses another of the remaining

fields to store the extra more bits.

 

There are pointers to the first 12 blocks which contain the file’s data

in the inode. There is a pointer to an indirect block (which contains

pointers to the next set of blocks), a pointer to a doubly-indirect

block (which contains pointers to indirect blocks) and a pointer to a

trebly-indirect block (which contains pointers to doubly-indirect blocks).

The flags field contains some ext2-specific flags which aren’t catered

for by the standard chmod flags. These flags can be listed with lsattr

and changed with the chattr command, and allow specific filesystem

behaviour on a per-file basis. There are flags for secure deletion,

undeletable, compression, synchronous updates, immutability, append-only,

dumpable, no-atime, indexed directories, and data-journaling. Not all

of these are supported yet.

 

Directories

-----------

A directory is a filesystem object and has an inode just like a file.

It is a specially formatted file containing records which associate

each name with an inode number. Later revisions of the filesystem also

encode the type of the object (file, directory, symlink, device, fifo,

socket) to avoid the need to check the inode itself for this information

(support for taking advantage of this feature does not yet exist in

Glibc 2.2).

 

The inode allocation code tries to assign inodes which are in the same

block group as the directory in which they are first created.

The current implementation of ext2 uses a singly-linked list to store

 

the filenames in the directory; a pending enhancement uses hashing of the

filenames to allow lookup without the need to scan the entire directory.

The current implementation never removes empty directory blocks once they

have been allocated to hold more files.

 

Special files

-------------

Symbolic links are also filesystem objects with inodes. They deserve

special mention because the data for them is stored within the inode

itself if the symlink is less than 60 bytes long. It uses the fields

which would normally be used to store the pointers to data blocks.

This is a worthwhile optimisation as it we avoid allocating a full

block for the symlink, and most symlinks are less than 60 characters long.

Character and block special devices never have data blocks assigned to

them. Instead, their device number is stored in the inode, again reusing

the fields which would be used to point to the data blocks.

 

Reserved Space

--------------

Inext2, there is a mechanism for reserving a certain number of blocks

for a particular user (normally the super-user). This is intended to

allow for the system to continue functioning even if non-priveleged users

fill up all the space available to them (this is independent of filesystem

quotas). It also keeps the filesystem from filling up entirely which

helps combat fragmentation.

 

Filesystem check

----------------

At boot time, most systems run a consistency check (e2fsck) on their

filesystems. The superblock of the ext2 filesystem contains several

fields which indicate whether fsck should actually run (since checking

the filesystem at boot can take a long time if it is large). fsck will

run if the filesystem was not cleanly unmounted, if the maximum mount

count has been exceeded or if the maximum time between checks has been

exceeded.

 

Feature Compatibility

---------------------

The compatibility feature mechanism used in ext2 is sophisticated.

It safely allows features to be added to the filesystem, without

unnecessarily sacrificing compatibility with older versions of the

filesystem code. The feature compatibility mechanism is not supported by

the original revision 0 (EXT2_GOOD_OLD_REV) of ext2, but was introduced in

revision 1. There are three 32-bit fields, one for compatible features

(COMPAT), one for read-only compatible (RO_COMPAT) features and one for

incompatible (INCOMPAT) features.

 

These feature flags have specific meanings for the kernel as follows:

A COMPAT flag indicates that a feature is present in the filesystem,

but the on-disk format is 100% compatible with older on-disk formats, so

a kernel which didn’t know anything about this feature could read/write

the filesystem without any chance of corrupting the filesystem (or even

making it inconsistent). This is essentially just a flag which says

"this filesystem has a (hidden) feature" that the kernel or e2fsck may

want to be aware of (more on e2fsck and feature flags later). The ext3

HAS_JOURNAL feature is a COMPAT flag because the ext3 journal is simply

a regular file with data blocks in it so the kernel does not need to

take any special notice of it if it doesn’t understand ext3 journaling.

An RO_COMPAT flag indicates that the on-disk format is 100% compatible

with older on-disk formats for reading (i.e. the feature does not change

the visible on-disk format). However, an old kernel writing to such a

filesystem would/could corrupt the filesystem, so this is prevented. The

most common such feature, SPARSE_SUPER, is an RO_COMPAT feature because

sparse groups allow file data blocks where superblock/group descriptor

backups used to live, and ext2_free_blocks() refuses to free these blocks,

which would leading to inconsistent bitmaps. An old kernel would also

get an error if it tried to free a series of blocks which crossed a group

boundary, but this is a legitimate layout in a SPARSE_SUPER filesystem.

 

An INCOMPAT flag indicates the on-disk format has changed in some

way that makes it unreadable by older kernels, or would otherwise

cause a problem if an old kernel tried to mount it. FILETYPE is an

INCOMPAT flag because older kernels would think a filename was longer

than 256 characters, which would lead to corrupt directory listings.

The COMPRESSION flag is an obvious INCOMPAT flag - if the kernel

doesn’t understand compression, you would just get garbage back from

read() instead of it automatically decompressing your data. The ext3

RECOVER flag is needed to prevent a kernel which does not understand the

ext3 journal from mounting the filesystem without replaying the journal.

For e2fsck, it needs to be more strict with the handling of these

flags than the kernel. If it doesn’t understand ANY of the COMPAT,

RO_COMPAT, or INCOMPAT flags it will refuse to check the filesystem,

because it has no way of verifying whether a given feature is valid

or not. Allowing e2fsck to succeed on a filesystem with an unknown

feature is a false sense of security for the user. Refusing to check

a filesystem with unknown features is a good incentive for the user to

update to the latest e2fsck. This also means that anyone adding feature

flags to ext2 also needs to update e2fsck to verify these features.

Metadata

--------

It is frequently claimed that the ext2 implementation of writing

asynchronous metadata is faster than the ffs synchronous metadata

scheme but less reliable. Both methods are equally resolvable by their

respective fsck programs.

 

If you’re exceptionally paranoid, there are 3 ways of making metadata

writes synchronous on ext2:

per-file if you have the program source: use the O_SYNC flag to open()

per-file if you don’t have the source: use "chattr +S" on the file

per-filesystem: add the "sync" option to mount (or in /etc/fstab)

the first and last are not ext2 specific but do force the metadata to

be written synchronously. See also Journaling below.

Limitations

-----------

There are various limits imposed by the on-disk layout of ext2. Other

limits are imposed by the current implementation of the kernel code.

Many of the limits are determined at the time the filesystem is first

created, and depend upon the block size chosen. The ratio of inodes to

data blocks is fixed at filesystem creation time, so the only way to

increase the number of inodes is to increase the size of the filesystem.

No tools currently exist which can change the ratio of inodes to blocks.

Most of these limits could be overcome with slight changes in the on-disk

format and using a compatibility flag to signal the format change (at

the expense of some compatibility).

 

 

Filesystem block size: 1kB 2kB 4kB 8kB

File size limit: 16GB 256GB 2048GB 2048GB

Filesystem size limit: 2047GB 8192GB 16384GB 32768GB

There is a 2.4 kernel limit of 2048GB for a single block device, so no

filesystem larger than that can be created at this time. There is also

an upper limit on the block size imposed by the page size of the kernel,

so 8kB blocks are only allowed on Alpha systems (and other architectures

which support larger pages).

There is an upper limit of 32768 subdirectories in a single directory.

There is a "soft" upper limit of about 10-15k files in a single directory

with the current linear linked-list directory implementation. This limit

stems from performance problems when creating and deleting (and also

finding) files in such large directories. Using a hashed directory index

(under development) allows 100k-1M+ files in a single directory without

performance problems (although RAM size becomes an issue at this point).

The (meaningless) absolute upper limit of files in a single directory

(imposed by the file size, the realistic limit is obviously much less)

is over 130 trillion files. It would be higher except there are not

enough 4-character names to make up unique directory entries, so they

have to be 8 character filenames, even then we are fairly close to

running out of unique filenames.

 

Journaling

 

 

----------

A journaling extension to the ext2 code has been developed by Stephen

Tweedie. It avoids the risks of metadata corruption and the need to

wait for e2fsck to complete after a crash, without requiring a change

to the on-disk ext2 layout. In a nutshell, the journal is a regular

file which stores whole metadata (and optionally data) blocks that have

been modified, prior to writing them into the filesystem. This means

it is possible to add a journal to an existing ext2 filesystem without

the need for data conversion.

When changes to the filesystem (e.g. a file is renamed) they are stored in

a transaction in the journal and can either be complete or incomplete at

the time of a crash. If a transaction is complete at the time of a crash

(or in the normal case where the system does not crash), then any blocks

in that transaction are guaranteed to represent a valid filesystem state,

and are copied into the filesystem. If a transaction is incomplete at

the time of the crash, then there is no guarantee of consistency for

the blocks in that transaction so they are discarded (which means any

filesystem changes they represent are also lost).

The ext3 code is currently (Apr 2001) available for 2.2 kernels only,

and not yet available for 2.4 kernels.

References

==========

 

References

==========

The kernel source file:/usr/src/linux/fs/ext2/

e2fsprogs (e2fsck) http://e2fsprogs.sourceforge.net/

Design & Implementation http://e2fsprogs.sourceforge.net/ext2intro.html

Journaling (ext3) ftp://ftp.uk.linux.org/pub/linux/sct/fs/jfs/

Hashed Directories http://kernelnewbies.org/˜phillips/htree/

Filesystem Resizing http://ext2resize.sourceforge.net/

Extended Attributes &

Access Control Lists http://acl.bestbits.at/

Compression (*) http://www.netspace.net.au/˜reiter/e2compr/

Implementations for:

Windows 95/98/NT/2000 http://uranus.it.swin.edu.au/˜jn/linux/Explore2fs.htm

Windows 95 (*) http://www.yipton.demon.co.uk/content.html#FSDEXT2

DOS client (*) ftp://metalab.unc.edu/pub/Linux/system/filesystems/ext2/

OS/2 http://perso.wanadoo.fr/matthieu.willm/ext2-os2/

RISC OS client ftp://ftp.barnet.ac.uk/pub/acorn/armlinux/iscafs/

(*) no longer actively developed/supported (as of Apr 2001)

评论 4
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值