CMU Computer Systems: System-Level I/O

I/O

  • Unix I/O
  • RIO (robust I/O) package
  • Metadata, sharing, and redirection
  • Standard I/O
  • Closing remarks
Unix I/O Overview
  • A Linux file is a sequence of m bytes
  • Cool fact: All I/O devices are represented as files
  • Even the kernel is represented as a file
  • Elegant mapping of files to devices allows kernel to export simple interface called Unix I/O
File Types
  • Each file has a type indicating its role in the system
    • Regular file: Contains arbitrary data
    • Directory: Index of a related group of files
    • Socket: For communicating with a process on another machine
  • Other file types beyond our scope
    • Named pipes
    • Symbolic links
    • Character and block devices
Regular Files
  • A regular file contains arbitrary data
  • Applications often distinguish between text files and binary files
    • Text files are regular files with only ASCII or Unicode characters
    • Binary files are everything else
    • Kernel doesn’t know the difference
  • A text file is a sequence of text lines
    • A text line is a sequence of characters terminated by a newline character (’\n’)
  • End of line indicators in other systems
    • Linux and Mac OS: ‘\n’
    • Windows and Internet protocols: ‘\r\n’
Directories
  • Directory consists of an array of links
    • Each link maps a filename to a file
  • Each directory contains at least two entries
    • . is a link to itself
    • .. is a link to the parent directory in the directory hierarchy
  • Commands for manipulating directories
    • mkdir: create empty directory
    • ls: view directory contents
    • rmdir: delete empty directory
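The same view of a directory is available from C. The following is a minimal sketch, assuming a POSIX system; the function name list_dir is hypothetical and is not part of the original notes. On typical Linux file systems the output includes the . and .. entries.

```c
#include <dirent.h>
#include <stdio.h>

/* Print every entry in the directory at path, one per line.
 * Returns the number of entries seen, or -1 if the directory
 * cannot be opened. (list_dir is a hypothetical helper name.) */
int list_dir(const char *path)
{
    DIR *dir = opendir(path);
    struct dirent *ent;
    int count = 0;

    if (dir == NULL) {
        perror("opendir");
        return -1;
    }
    while ((ent = readdir(dir)) != NULL) {
        printf("%s\n", ent->d_name);   /* d_name is the filename of the link */
        count++;
    }
    closedir(dir);
    return count;
}
```

Calling list_dir(".") prints roughly what ls -a prints for the current working directory.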
Directory Hierarchy
  • All files are organized as a hierarchy anchored by root directory named /
  • Kernel maintains current working directory for each process
Opening Files
  • Opening a file informs the kernel that you are getting ready to access that file
  • Returns a small identifying integer file descriptor
    • fd == -1 indicates that an error occurred
  • Each process created by a Linux shell begins life with three open files associated with a terminal
    • 0: standard input
    • 1: standard output
    • 2: standard error
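As a minimal sketch of the points above, opening a file and checking the returned descriptor might look like this. The helper name open_readonly is an assumption made for illustration; open and perror are the standard calls.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Open path for reading. Returns a small non-negative file
 * descriptor, or -1 after printing the reason for the failure.
 * (open_readonly is a hypothetical helper name.) */
int open_readonly(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd == -1)
        perror("open");
    return fd;
}
```

Because descriptors 0, 1, and 2 are normally already in use, the first file a program opens this way usually receives descriptor 3.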
Closing Files
  • Closing a file informs the kernel that you are finished accessing that file
  • Closing an already closed file is a recipe for disaster in threaded programs
  • Moral: Always check return codes, even for seemingly benign functions such as close()
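The moral can be sketched as a small wrapper that never ignores the result of close. The name close_checked is hypothetical.

```c
#include <stdio.h>
#include <unistd.h>

/* Close fd and report any failure instead of silently ignoring it.
 * Returns 0 on success, -1 on error.
 * (close_checked is a hypothetical helper name.) */
int close_checked(int fd)
{
    if (close(fd) < 0) {
        perror("close");
        return -1;
    }
    return 0;
}
```

In a single-threaded program, closing the same descriptor twice makes the second call fail, and the check catches it. In a threaded program the descriptor number may already have been reused by another thread, so the second close can silently close the wrong file.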
Reading Files
  • Reading a file copies bytes from the current file position to memory, and then updates file position
  • Returns number of bytes read from file fd into buf
    • Return type ssize_t is signed integer
    • nbytes < 0 indicates that an error occurred
    • Short counts (nbytes < sizeof (buf) ) are possible and are not errors
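A minimal sketch of a read loop that follows these rules is shown below. It treats a negative return as an error, zero as EOF, and any positive count (short or not) as valid data. The function name count_bytes and the buffer size are assumptions for illustration.

```c
#include <errno.h>
#include <unistd.h>

/* Read fd until EOF and return the total number of bytes seen,
 * or -1 on error. (count_bytes is a hypothetical helper name.) */
ssize_t count_bytes(int fd)
{
    char buf[512];
    ssize_t total = 0;
    ssize_t nbytes;

    while ((nbytes = read(fd, buf, sizeof(buf))) != 0) {
        if (nbytes < 0) {
            if (errno == EINTR)   /* interrupted by a signal: retry */
                continue;
            return -1;            /* real error */
        }
        total += nbytes;          /* a short count is still valid data */
    }
    return total;                 /* read returned 0: EOF */
}
```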
Writing Files
  • Writing a file copies bytes from memory to the current file position, and then updates current file position
  • Returns number of bytes written from buf to file fd
    • nbytes < 0 indicates that an error occurred
    • As with reads, short counts are possible and are not errors
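Since a short count on write is not an error, a caller that needs all bytes written has to loop. The following is a sketch under that assumption; write_all is a hypothetical name, not a function from the notes.

```c
#include <errno.h>
#include <unistd.h>

/* Write all n bytes of buf to fd, retrying after short counts.
 * Returns n on success, -1 on error.
 * (write_all is a hypothetical helper name.) */
ssize_t write_all(int fd, const void *buf, size_t n)
{
    const char *bufp = buf;
    size_t nleft = n;

    while (nleft > 0) {
        ssize_t nwritten = write(fd, bufp, nleft);
        if (nwritten < 0) {
            if (errno == EINTR)   /* interrupted by a signal: retry */
                continue;
            return -1;
        }
        nleft -= (size_t)nwritten; /* short count: advance and go again */
        bufp += nwritten;
    }
    return (ssize_t)n;
}
```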
On Short Counts
  • Short counts can occur in these situations
    • Encountering EOF on reads
    • Reading text lines from a terminal
    • Reading and writing network sockets
  • Short counts never occur in these situations
    • Reading from disk files (except for EOF)
    • Writing to disk files
  • Best practice is to always allow for short counts
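One way to allow for short counts on input is to keep reading until the requested number of bytes has arrived or EOF is reached. The sketch below illustrates the idea; the name read_n is hypothetical and this is not presented as the RIO source code.

```c
#include <errno.h>
#include <unistd.h>

/* Read up to n bytes from fd into usrbuf, looping over short counts.
 * Returns the number of bytes read (less than n only at EOF),
 * or -1 on error. (read_n is a hypothetical helper name.) */
ssize_t read_n(int fd, void *usrbuf, size_t n)
{
    char *bufp = usrbuf;
    size_t nleft = n;

    while (nleft > 0) {
        ssize_t nread = read(fd, bufp, nleft);
        if (nread < 0) {
            if (errno == EINTR)   /* interrupted by a signal: retry */
                continue;
            return -1;
        }
        if (nread == 0)           /* EOF: the only legitimate short count */
            break;
        nleft -= (size_t)nread;
        bufp += nread;
    }
    return (ssize_t)(n - nleft);
}
```

With this wrapper the caller sees a short count only when the input has ended, which is much easier to reason about than a short count that can happen at any time.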
The RIO Package
  • RIO is a set of wrappers that provide efficient and robust I/O in apps, such as network programs that are subject to short counts
  • RIO provides two different kinds of function
    • Unbuffered input and output of binary data
      • rio_readn and rio_writen
    • Buffered input of text lines and binary data
      • rio_readlineb and rio_readnb
      • Buffered RIO routines are thread-safe and can be interleaved arbitrarily on the same descriptor
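To illustrate what buffered input of text lines involves, here is a minimal sketch of a line reader with its own internal buffer. The type and function names (lbuf_t, lbuf_init, lbuf_readline) and the buffer size are hypothetical; they are not the RIO names or the RIO implementation, only an example of the same technique.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>

#define LBUF_SIZE 8192

typedef struct {
    int fd;               /* descriptor this buffer reads from */
    ssize_t cnt;          /* unread bytes remaining in buf */
    char *ptr;            /* next unread byte in buf */
    char buf[LBUF_SIZE];  /* internal buffer */
} lbuf_t;

/* Associate an empty buffer with descriptor fd. */
void lbuf_init(lbuf_t *lp, int fd)
{
    lp->fd = fd;
    lp->cnt = 0;
    lp->ptr = lp->buf;
}

/* Copy up to n bytes from the internal buffer into usrbuf, refilling
 * the buffer with one read call whenever it is empty.
 * Returns bytes copied, 0 on EOF, -1 on error. */
static ssize_t lbuf_read(lbuf_t *lp, char *usrbuf, size_t n)
{
    size_t cnt;

    while (lp->cnt <= 0) {
        lp->cnt = read(lp->fd, lp->buf, sizeof(lp->buf));
        if (lp->cnt < 0) {
            if (errno != EINTR)
                return -1;
        } else if (lp->cnt == 0) {
            return 0;             /* EOF */
        } else {
            lp->ptr = lp->buf;    /* buffer refilled */
        }
    }
    cnt = n < (size_t)lp->cnt ? n : (size_t)lp->cnt;
    memcpy(usrbuf, lp->ptr, cnt);
    lp->ptr += cnt;
    lp->cnt -= (ssize_t)cnt;
    return (ssize_t)cnt;
}

/* Read one text line of at most maxlen-1 bytes and NUL-terminate it.
 * Returns the line length including the newline, 0 on EOF with no
 * data, or -1 on error. */
ssize_t lbuf_readline(lbuf_t *lp, char *usrbuf, size_t maxlen)
{
    char c, *bufp = usrbuf;
    size_t n;

    for (n = 1; n < maxlen; n++) {
        ssize_t rc = lbuf_read(lp, &c, 1);
        if (rc == 1) {
            *bufp++ = c;
            if (c == '\n') {
                n++;
                break;
            }
        } else if (rc == 0) {
            if (n == 1)
                return 0;         /* EOF, nothing read */
            break;                /* EOF after a partial line */
        } else {
            return -1;
        }
    }
    *bufp = '\0';
    return (ssize_t)(n - 1);
}
```

The point of the design is that fetching one character at a time from the internal buffer costs a memory copy, not a system call, so reading a line needs far fewer calls to read.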
File Metadata
  • Metadata is data about data, in this case file data
  • Per-file metadata maintained by kernel
    • accessed by users with the stat and fstat functions
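As a small sketch of querying metadata, the function below calls stat and inspects the st_mode field to classify a file. The name file_type and the three result strings are assumptions made for illustration.

```c
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Classify the file at path using its metadata.
 * Returns "regular", "directory", or "other", or NULL if stat fails.
 * (file_type is a hypothetical helper name.) */
const char *file_type(const char *path)
{
    struct stat st;

    if (stat(path, &st) < 0)
        return NULL;
    if (S_ISREG(st.st_mode))      /* regular file */
        return "regular";
    if (S_ISDIR(st.st_mode))      /* directory */
        return "directory";
    return "other";               /* socket, pipe, device, ... */
}
```

The same struct stat also carries the file size (st_size) and the access permission bits (the low bits of st_mode). fstat fills in the same structure given an open descriptor instead of a filename.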
Pros and Cons of Unix I/O
  • Pros
    • Unix I/O is the most general and lowest overhead form of I/O
      • All other I/O packages are implemented using Unix I/O functions
    • Unix I/O provides functions for accessing file metadata
    • Unix I/O functions are async-signal-safe and can be used safely in signal handlers
  • Cons
    • Dealing with short counts is tricky and error prone
    • Efficient reading of text lines requires some form of buffering, also tricky and error prone
    • Both of these issues are addressed by the standard I/O and RIO packages
Pros and Cons of Standard I/O
  • Pros
    • Buffering increases efficiency by decreasing the number of read and write system calls
    • Short counts are handled automatically
  • Cons
    • Provides no function for accessing file metadata
    • Standard I/O functions are not async-signal-safe, and not appropriate for signal handlers
    • Standard I/O is not appropriate for input and output on network sockets
      • There are poorly documented restrictions on streams that interact badly with restrictions on sockets
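For comparison with the Unix I/O calls, here is a minimal sketch of line-oriented input through a standard I/O stream. The library does the buffering and hides short counts behind fgets. The name count_lines and the 256-byte line buffer are assumptions.

```c
#include <stdio.h>
#include <string.h>

/* Count the newline-terminated lines available on stream fp.
 * A final fragment without a trailing newline is not counted.
 * (count_lines is a hypothetical helper name.) */
long count_lines(FILE *fp)
{
    char line[256];
    long lines = 0;

    while (fgets(line, sizeof(line), fp) != NULL) {
        size_t len = strlen(line);
        /* A long line arrives in several chunks; only the last ends in '\n'. */
        if (len > 0 && line[len - 1] == '\n')
            lines++;
    }
    return lines;
}
```

Nothing in this code deals with descriptors, buffer refills, or retries, which is the convenience the Pros list describes.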
Choosing I/O Functions
  • General rule: use the highest-level I/O functions you can
    • Many C programmers are able to do all of their work using the standard I/O functions
  • When to use standard I/O
    • When working with disk or terminal files
  • When to use raw Unix I/O
    • Inside signal handlers, because Unix I/O is async-signal-safe
    • In rare cases when you need absolute highest performance
  • When to use RIO
    • When you are reading and writing network sockets
    • Avoid using standard I/O on sockets
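The signal-handler case can be sketched as follows: the handler reports through write rather than printf. The handler name, the flag variable, the choice of SIGUSR1, and the message text are all assumptions for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/* Set by the handler so the main program can notice the signal. */
static volatile sig_atomic_t got_signal = 0;

/* Handler that uses only async-signal-safe operations. */
static void on_sigusr1(int sig)
{
    static const char msg[] = "caught SIGUSR1\n";
    ssize_t rc;

    (void)sig;
    got_signal = 1;
    /* write is async-signal-safe; printf is not. */
    rc = write(STDERR_FILENO, msg, sizeof(msg) - 1);
    (void)rc;
}

/* Install the handler for SIGUSR1. Returns 0 on success, -1 on error.
 * (install_handler is a hypothetical helper name.) */
int install_handler(void)
{
    struct sigaction sa;

    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGUSR1, &sa, NULL);
}
```

After install_handler() succeeds, sending the process SIGUSR1 (for example with raise(SIGUSR1)) runs the handler, which sets got_signal and writes the message to standard error.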