Large File Support in Linux

From: http://users.suse.com/~aj/linux_lfs.html

To support files larger than 2 GiB on 32-bit systems, e.g. x86, PowerPC and MIPS, a number of changes to the kernel and the C library had to be made. This is called Large File Support (LFS). LFS support should now be complete in Linux, and this article gives a short overview of the current status.

64-bit systems like Alpha, IA64 and x86-64 don't have problems with large files, but they also support the new interfaces. In this case the new interface is mainly an alias for the normal interface.

LFS support is provided by the Linux kernel and the GNU C library (aka glibc).

Limits

LFS raises the maximum file size. On 32-bit systems the limit is 2^31 bytes (2 GiB), but by using the LFS interface on filesystems that support LFS, applications can handle files as large as 2^63 bytes.

On 64-bit systems the file size limit is 2^63 bytes unless a filesystem (like NFSv2) supports less.
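
Which of the two limits applies to a given build can be read off the width of off_t. The following is a minimal sketch, assuming a glibc system; the helper names are made up for illustration and are not part of any library.

```c
#define _FILE_OFFSET_BITS 64 /* must precede every system header */
#include <inttypes.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>

/* Number of value bits in off_t, not counting the sign bit. */
static int off_t_value_bits(void)
{
    return (int)(sizeof(off_t) * CHAR_BIT) - 1;
}

/* Largest offset off_t can hold: 2^31 - 1 or 2^63 - 1. */
static int64_t max_file_offset(void)
{
    int bits = off_t_value_bits();
    uint64_t half = ((uint64_t)1 << (bits - 1)) - 1; /* 2^(bits-1) - 1 */
    return (int64_t)(half * 2 + 1);                  /* 2^bits - 1 */
}

/* Hypothetical helper that reports the result on stdout. */
static void print_offset_limit(void)
{
    printf("off_t has %d value bits, largest offset is %" PRId64 "\n",
           off_t_value_bits(), max_file_offset());
}
```

Removing the first line on a 32-bit system should make the same code report 31 value bits, which is the 2 GiB limit described above.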

LFS in Glibc 2.1.3 and Glibc 2.2

The LFS interface in glibc 2.1.3 is complete, but the implementation is not. The implementation in 2.1.3 also contains some bugs, e.g. ftello64 is broken. If you want to use the LFS interface, you need a glibc that has been compiled against headers from a kernel with LFS support.

Since glibc 2.1.3 was released before LFS support went into Linux 2.3.X/2.4.0-testX, some fixes had to be made to glibc to support the kernel routines. The current stable release of glibc is glibc 2.2.3 (2.2 was released in November 2000), and it supports all the features of Linux 2.4.0. Glibc 2.2.x is now used by most of the major distributions in their latest release (e.g. SuSE 7.2, Red Hat 7.1). Glibc 2.2 supports the following features that glibc 2.1.3 doesn't support:

  • getdents64 system call
  • 64 bit file locking interface (see below for details)

Programs compiled against glibc 2.1.3 will work on an LFS system; there's no need to recompile them (with the exception of programs that use the 64-bit fcntl locking). Only glibc needs to be updated to support LFS.

Note that glibc 2.0 and libc5 do not support LFS at all.

Locking on Large Files is Not Supported with fcntl/lockf in Glibc 2.1.x

Locking via fcntl/lockf doesn't work with large files in glibc 2.1.3. Kernel support was added in Linux 2.4.0-test7 and required incompatible changes to glibc; only glibc 2.2 handles them. This means:

  • You can't use the flags F_GETLK64, F_SETLK64 and F_SETLKW64 with fcntl when you use glibc 2.1.x. If your programs use them now, they fail. They also need to be recompiled with glibc 2.2, which supports these fcntl flags.
  • lockf64 only works on files < 2 GiB with glibc 2.1.x. It does work with glibc 2.2, and no recompilation is needed.
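
As an illustration of the 64-bit locking interface, here is a minimal sketch that assumes glibc 2.2 or newer. When _FILE_OFFSET_BITS is 64, the plain F_SETLK command and struct flock use 64-bit offsets, so a region beyond 2 GiB can be locked without naming F_SETLK64 explicitly. The function name lock_region is hypothetical.

```c
#define _FILE_OFFSET_BITS 64 /* must precede every system header */
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Try to take a write lock on len bytes starting at start.
 * Returns 0 on success and -1 with errno set on failure.
 * The call does not block; use F_SETLKW to wait for the lock.
 */
static int lock_region(int fd, off_t start, off_t len)
{
    struct flock fl = {0};

    fl.l_type = F_WRLCK;    /* exclusive (write) lock */
    fl.l_whence = SEEK_SET; /* l_start is an absolute offset */
    fl.l_start = start;
    fl.l_len = len;
    return fcntl(fd, F_SETLK, &fl);
}
```

A call such as lock_region(fd, (off_t)3 << 30, 4096) locks 4 KiB starting at the 3 GiB mark; the region may lie beyond the current end of the file.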

LFS in the Linux Kernel

Since Linux 2.4.0-test7 most of the LFS kernel interface is included in the kernel. The open problems and restrictions are described below.

File Systems

We can separate two levels of LFS compliance in the file systems:

  1. Full support for files > 2 GiB and O_LARGEFILE
  2. Limited LFS support: the file system returns proper EINVAL/EFBIG/EOVERFLOW errors when you try to use O_LARGEFILE or positions > 2 GiB.

At least the second level should be generally reachable, but it takes some work to audit all the weird file systems.

Some bugs in NFSv2 regarding (2) have already been fixed, but some fixes are still missing (like the O_LARGEFILE check). Other file systems probably lack it too. A complete audit of all file systems is needed (see also the 2.4 kernel TODO page at http://linux24.sourceforge.net/).
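
From the application side, the three error codes named above can be told apart after a failed call. The sketch below is only an illustration: the function name and the hint texts are made up here, and which code a given file system returns in which situation is exactly what the audit mentioned above has to establish.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Map the errno values relevant to limited LFS support to a short hint.
 * Returns NULL for every other error code.
 */
static const char *lfs_errno_hint(int err)
{
    switch (err) {
    case EOVERFLOW:
        return "value does not fit the 32-bit interface; "
               "rebuild with LFS or use O_LARGEFILE";
    case EFBIG:
        return "file would exceed the maximum size allowed here";
    case EINVAL:
        return "invalid argument, for example an offset "
               "the file system cannot handle";
    default:
        return NULL;
    }
}
```

A caller would typically pass errno right after open, lseek or write has failed and fall back to strerror when the function returns NULL.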

The situation for the different filesystems used in Linux 2.4.0 and later can be summarized as follows:

ext2/ext3
    Full support for LFS
NFSv2
    Cannot handle LFS due to protocol restrictions (limited to 2 GiB - 1); limited LFS support but expect some bugs
NFSv3
    The protocol is ok, but I'm not sure about the Linux implementation status
ReiserFS 3.5.x (not part of the kernel, separate patch)
    Does not support LFS
ReiserFS 3.6.x (part of kernel 2.4.1 and newer)
    Full support for LFS if the new on-disk format is used. This format is incompatible with the format used by 3.5.x (see below for some more details).
coda
    Does not work with LFS (local cache issues, protocol is ok)
UFS
    Full support for LFS (although not complete with regard to O_LARGEFILE flag use)
minix
    limited to 2 GiB - 1 (file size is limited to 65804 MiB but note that filesystem size is limited to 64 MiB - but holes are allowed)
SysV (aka SCO)
    limited to 2 GiB - 1
msdos
    limited to 2 GiB - 1
umsdos
    based on msdos, limited to 2 GiB - 1
smbfs
    Older protocols are limited to 4 GiB - 1. SMB extensions allow 64-bit filesystems. The Linux smbfs implementation is currently limited to 2 GiB - 1.
NCPfs
    protocol is limited to 4 GiB - 1, Linux implementation to 2 GiB - 1
JFS
    Should work with LFS (for details about JFS see http://oss.software.ibm.com/developer/opensource/jfs)
XFS
    Should work with LFS (for details about XFS see http://linux-xfs.sgi.com/projects/xfs/)
other file systems
    I don't have any information yet, feel free to send me updates.

Note for ext2

When files > 2 GiB are created in ext2, a read-only compatibility flag is set, so older kernels will mount the file system only read-only.

Note for ReiserFS

Chris Mason wrote:

Disks formatted with the current 2.2 code are called our 3.5 disk format. They will not support large files under any kernel (even the 2.4 code).

But you can mount a 3.5 disk format under the 2.4 kernel code and use -o conv. This will turn on large file support for the old disks, but only new files will be allowed to grow past 2 GiB.

Once you mount with -o conv, you can't mount under 2.2 any more. We are testing a back port of the LFS disk format to 2.2; it should be ready soon. It has the same -o conv mount option that our 2.4 code has, so all the same rules will apply.

rlimit64 Is Not Supported

The Linux kernel doesn't support a 64-bit rlimit system call yet. glibc supports getrlimit64 and setrlimit64 but maps values that are too large to RLIM_INFINITY.
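
For illustration, a program can query the file size limit as sketched below. This is a minimal sketch assuming glibc: with _FILE_OFFSET_BITS set to 64, getrlimit uses the 64-bit rlim_t, and a limit the kernel cannot represent shows up as RLIM_INFINITY. The function name file_size_limit is hypothetical.

```c
#define _FILE_OFFSET_BITS 64 /* must precede every system header */
#include <sys/resource.h>

/*
 * Query the soft limit on file size for the current process.
 * Returns 1 if the limit is RLIM_INFINITY, 0 if a finite limit
 * was stored in *bytes, and -1 if getrlimit failed.
 */
static int file_size_limit(unsigned long long *bytes)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_FSIZE, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        return 1;
    *bytes = (unsigned long long)rl.rlim_cur;
    return 0;
}
```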

Using LFS

To use LFS in user programs, the programs have to use the LFS API. This involves recompiling and possibly changing the programs. The API is documented in the glibc manual (the libc info pages), which can be read with e.g. "info libc".

In a nutshell, to use LFS you can choose any of the following:

  • Compile your programs with "gcc -D_FILE_OFFSET_BITS=64". This forces all file access calls to use the 64-bit variants. Several types change as well, e.g. off_t becomes off64_t. It's therefore important to always use the correct types and not to use e.g. int instead of off_t. For portability to other platforms you should use getconf LFS_CFLAGS, which returns -D_FILE_OFFSET_BITS=64 on Linux platforms but might return something else on e.g. Solaris. For linking, you should use the link flags reported by getconf LFS_LDFLAGS. On Linux systems, you do not need special link flags.
  • Define _LARGEFILE_SOURCE and _LARGEFILE64_SOURCE. With these defines you can use the LFS functions like open64 directly.
  • Use the O_LARGEFILE flag with open to operate on large files.
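
The first option can be sketched as follows. This is a minimal example, not a complete program: it defines _FILE_OFFSET_BITS in the source file instead of on the gcc command line, and the function name seek_past_2gib is made up for illustration. Note that the offset is computed in off_t rather than int, as recommended above.

```c
#define _FILE_OFFSET_BITS 64 /* same effect as gcc -D_FILE_OFFSET_BITS=64 */
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Open (or create) path and move the file position to 3 GiB.
 * Returns the new position, or (off_t)-1 on failure.
 * Seeking past the end of the file does not write any data.
 */
static off_t seek_past_2gib(const char *path)
{
    off_t target = (off_t)3 << 30; /* 3 GiB, does not fit in 32 bits */
    off_t pos;
    int fd = open(path, O_RDWR | O_CREAT, 0600);

    if (fd < 0)
        return (off_t)-1;
    pos = lseek(fd, target, SEEK_SET);
    close(fd);
    return pos;
}
```

Built without the define on a 32-bit system, the shift would overflow the 32-bit off_t, which is why the type and the macro have to be used together.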

Complete documentation of the feature test macros like _FILE_OFFSET_BITS and _LARGEFILE_SOURCE is in the glibc manual (run e.g. "info libc 'Feature Test Macros'").

The LFS API is also documented in the LFS standard, which is available at http://ftp.sas.com/standards/large.file/x_open.20Mar96.html.

LFS and Libraries other than Glibc

Be careful when using _FILE_OFFSET_BITS=64 to compile a program that calls a library, or a library itself, if any of the interfaces uses off_t. With _FILE_OFFSET_BITS=64 glibc changes the type of off_t to off64_t. You can either change the interface to always use off64_t, or use a different function if _FILE_OFFSET_BITS=64 is used (like glibc does). Otherwise take care that both library and program have the same _FILE_OFFSET_BITS setting. Note that glibc is aware of the _FILE_OFFSET_BITS setting; there's no problem with it, but there might be problems with other libraries.
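
One way to catch a mismatch early is a run-time check, sketched below under the assumption that the library can export one extra function. The name mylib_sizeof_off_t is hypothetical and does not belong to any real library. In practice the two halves live in separately compiled files; they are shown in one file here only to keep the sketch short.

```c
#include <stddef.h>
#include <sys/types.h>

/* Library side: reports the off_t size the library was built with. */
size_t mylib_sizeof_off_t(void)
{
    return sizeof(off_t);
}

/*
 * Program side: returns 1 if the program was built with the same
 * off_t size as the library, 0 otherwise.
 */
static int off_t_matches_library(void)
{
    return mylib_sizeof_off_t() == sizeof(off_t);
}
```

A program would call off_t_matches_library once at startup and refuse to continue when it returns 0.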

Distributions with LFS Support

SuSE 7.0

Release 7.0 of SuSE Linux supports LFS on all supported platforms. The kernel of SuSE 7.0 is based on Linux 2.2.16.

The LFS support in the SuSE Linux kernel is the same as in the development kernel 2.4.0-test1 for the file systems which are in both kernels, and glibc supports all the features of the kernel. The file systems that differ are ReiserFS (so far only in SuSE; the 2.2 port doesn't support LFS) and NFSv3 (not available in SuSE 7.0). This means that you need to use ext2 as the file system for LFS.

Both Linux 2.4.0-test1 and SuSE 7.0 do not support the getdents64 system call and the 64-bit locking interface. These are only implemented in Linux 2.4.0-test8 and newer.

SuSE 7.1

Release 7.1 of SuSE Linux supports LFS on all supported platforms. SuSE 7.1 comes with kernels based on 2.4.0 and 2.2.18.

The 2.2.18 kernel supports LFS with the ext2 file system. The 2.4.0 kernel supports LFS with the ext2 and NFSv3 filesystems and additionally with the ReiserFS filesystem if the new ReiserFS format (incompatible with the 2.2 format) is used instead of the default 2.2 format.

SuSE 7.1 comes with glibc 2.2, which supports the full LFS interface. However, the 2.2.18 kernel does not support 64-bit file locking and the getdents64 call.

SuSE 7.2 and newer

The kernel support for LFS is like the one in 7.1.

Other Distributions

Since I can't verify each and every distribution, I have to trust others for the following information.

Debian

The current stable release (Debian 3.0, codename "woody") has LFS support.

Red Hat

The beta called Fisher was the first to have LFS support (thanks to Russ Marshall). Current Red Hat releases like Red Hat 8 have LFS support.

Tim Small <tim@digitalbrain.com> sent the following special combo-gotcha for Red Hat 6.2 (and probably other older distros as well):

The 'ulimit' command which is built into bash 1.x (the default for Red Hat 6.2) uses the 32-bit versions of the system calls. The way that glibc currently behaves means that requests to the 32-bit setrlimit or getrlimit will translate 'unlimited' to '2^31 - 1' in both directions (I would argue that setting a limit to RLIM_INFINITY using the 32-bit interface should end up in a call to the 64-bit setrlimit variant with the 64-bit RLIM_INFINITY).

The default PAM configuration for sshd (/etc/pam.d/sshd), includes the line:

session    required     /lib/security/pam_limits.so

Which fiddles about with various limits (using the 32-bit versions of the calls).

If you log in using ssh and use bash 1.x to view the limits, you will be told that your file size is unlimited, when it is in fact set to 2097151 (1024 byte) blocks!
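
The figure 2097151 is what a limit of 2^31 - 1 bytes looks like when it is expressed in whole 1024-byte blocks. The arithmetic can be sketched as follows; the function name is made up for illustration.

```c
#include <stdint.h>

/* Whole 1024-byte blocks that fit under a byte limit (rounds down). */
static int64_t limit_in_blocks(int64_t limit_bytes)
{
    return limit_bytes / 1024;
}
```

Calling limit_in_blocks(INT32_MAX) gives 2097151, and 2097151 blocks of 1024 bytes are 2147482624 bytes, which is 1 KiB short of 2 GiB.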

Workaround:

  • Either:
    • Comment out the line in /etc/pam.d/sshd (note that limits set in /etc/security/limits.conf will no longer be effective for ssh logins)
    • Or: Rebuild the pam package with 64 bit support
  • Install the bash2 RPM
  • Either:
    • rename the old bash, and symlink /bin/bash2 to /bin/bash (you may want to keep /bin/sh pointing at the old bash, if you are worried about compatibility)
    • Or: use vipw to change users over to /bin/bash2

Other...

I don't have any other information yet. Feel free to send me detailed information about distributions that support LFS.

Some Other Often Requested Data about Filesystems

Please send me information to fill in the missing bits.

Maximum On-Disk Sizes of the Filesystems

Filesystem                                                             | File Size Limit      | Filesystem Size Limit
ext2/ext3 with 1 KiB blocksize                                         | 16448 MiB (~ 16 GiB) | 2048 GiB (= 2 TiB)
ext2/3 with 2 KiB blocksize                                            | 256 GiB              | 8192 GiB (= 8 TiB)
ext2/3 with 4 KiB blocksize                                            | 2048 GiB (= 2 TiB)   | 8192 GiB (= 8 TiB)
ext2/3 with 8 KiB blocksize (Systems with 8 KiB pages like Alpha only) | 65568 GiB (~ 64 TiB) | 32768 GiB (= 32 TiB)
ReiserFS 3.5                                                           | 2 GiB                | 16384 GiB (= 16 TiB)
ReiserFS 3.6 (as in Linux 2.4)                                         | 1 EiB                | 16384 GiB (= 16 TiB)
XFS                                                                    | 8 EiB                | 8 EiB
JFS with 512 Bytes blocksize                                           | 8 EiB                | 512 TiB
JFS with 4 KiB blocksize                                               | 8 EiB                | 4 PiB
NFSv2 (client side)                                                    | 2 GiB                | 8 EiB
NFSv3 (client side)                                                    | 8 EiB                | 8 EiB
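
The odd-looking ext2 file size figures for 1 KiB and 8 KiB blocks can be reproduced from the inode block map. The sketch below rests on an assumption that is not stated in this article: an ext2 inode addresses 12 direct blocks plus one single, one double and one triple indirect block, and each block number takes 4 bytes. The function names are made up for illustration.

```c
#include <stdint.h>

/* Blocks reachable through the block map (assumed ext2 inode layout). */
static uint64_t ext2_max_file_blocks(uint64_t block_size)
{
    uint64_t ptrs = block_size / 4; /* 4-byte block numbers per block */

    /* 12 direct + single + double + triple indirect */
    return 12 + ptrs + ptrs * ptrs + ptrs * ptrs * ptrs;
}

/* The same limit expressed in KiB; block_size is a multiple of 1024. */
static uint64_t ext2_max_file_kib(uint64_t block_size)
{
    return ext2_max_file_blocks(block_size) * (block_size / 1024);
}
```

For 1 KiB blocks this gives 16843020 KiB, about 16448 MiB, and for 8 KiB blocks about 65568 GiB, matching the table. For 4 KiB blocks the formula gives more than the 2 TiB listed, so a different limit presumably applies in that row.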

Note on kernel limitations: The table above describes limitations of the on-disk format. The following kernel limits exist:

  • On 32-bit systems with Kernel 2.4.x: The size of a file and a block device is limited to 2 TiB. By using LVM several block devices can be combined enabling the handling of larger file systems.
  • 64-bit systems: The sizes of a filesystem and of a file are limited to 2^63 bytes (8 EiB). But there might be hardware driver limits that do not allow access to such large devices.
  • Kernel 2.6: For both 32-bit systems with option CONFIG_LBD set and for 64-bit systems: The size of a file system is limited to 2^73 bytes (far too much for today). On 32-bit systems (without CONFIG_LBD set) the size of a file is limited to 2 TiB. Note that not all filesystems and hardware drivers might handle such large filesystems.

Note: in the above, 1024 Bytes = 1 KiB; 1024 KiB = 1 MiB; 1024 MiB = 1 GiB; 1024 GiB = 1 TiB; 1024 TiB = 1 PiB; 1024 PiB = 1 EiB (check http://physics.nist.gov/cuu/Units/binary.html)

Maximum Number of Partitions

An IDE disk has 64 minors; one is used for the full disk, so 63 partitions are possible. A SCSI disk has 16 minors and therefore allows at most 15 partitions.

Thanks

Thanks to Andi Kleen, Matti Aarnio, Rogier Wolff, Chris Mason, Andreas Schwab, Lenz Grimmer, Andries Brouwer, Urban Widmark, Bruce Allen and Jana Jaeger for additions to and comments on the contents of this page.

Translations

Belorussian translation

