Storage and File Structure
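Storage hierarchy (moving up the hierarchy, storage becomes faster and more expensive, but also volatile)
Primary storage: cache and main memory
Secondary storage, also called online storage: the level below primary storage, e.g., magnetic disk
Tertiary storage, also called offline storage: the lowest level, e.g., magnetic tape or optical disk
Storage from main memory upward is volatile storage: its contents are lost when the device loses power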
Redundant Arrays of Independent Disks (RAID)
Mean time to failure (MTTF)
RAID level:
Choice of RAID Level
Factors in choosing RAID level
- Monetary cost
- Performance: Number of I/O operations per second, and bandwidth during normal operation
- Performance during failure
- Performance during rebuild of a failed disk, including the time taken to rebuild it
RAID 0 is used only when data safety is not important
- E.g., data can be recovered quickly from other sources
Levels 2 and 4 are never used since they are subsumed by levels 3 and 5
Level 3 is no longer used since bit striping forces single-block reads to access all disks, wasting disk arm movement, which block striping (level 5) avoids
Level 6 is rarely used since levels 1 and 5 offer adequate safety for almost all applications
So the competition is between levels 1 and 5 only
Level 1 provides much better write performance than level 5
- Level 5 requires at least 2 block reads and 2 block writes to write a single block, whereas Level 1 only requires 2 block writes
- Level 1 preferred for high update environments such as log disks
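The I/O counts above can be checked with a small simulation. The following is a minimal sketch, not a real RAID implementation: `CountingDisk`, `raid1_write`, `raid5_small_write`, and `io_counts` are hypothetical names, and parity for the one stripe modeled here is kept on a single fixed disk, whereas a real level 5 array rotates parity across disks. The level 5 write uses the standard read-modify-write rule: new parity = old parity XOR old data XOR new data.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-sized blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


class CountingDisk:
    """Hypothetical in-memory disk that counts block reads and writes."""

    def __init__(self, n_blocks: int, block_size: int):
        self.blocks = [bytes(block_size) for _ in range(n_blocks)]
        self.reads = 0
        self.writes = 0

    def read(self, i: int) -> bytes:
        self.reads += 1
        return self.blocks[i]

    def write(self, i: int, data: bytes) -> None:
        self.writes += 1
        self.blocks[i] = data


def raid1_write(mirrors, i: int, data: bytes) -> None:
    """Level 1: write the same block to every mirror (2 writes for 2 mirrors)."""
    for disk in mirrors:
        disk.write(i, data)


def raid5_small_write(data_disk, parity_disk, i: int, data: bytes) -> None:
    """Level 5: update one data block and its parity (2 reads + 2 writes)."""
    old_data = data_disk.read(i)
    old_parity = parity_disk.read(i)
    # Remove the old data's contribution from the parity, then add the new one.
    new_parity = xor_blocks(xor_blocks(old_parity, old_data), data)
    data_disk.write(i, data)
    parity_disk.write(i, new_parity)


def io_counts(disks):
    """Total (reads, writes) across a list of disks."""
    return sum(d.reads for d in disks), sum(d.writes for d in disks)
```

Calling `raid1_write` on two mirrors and `raid5_small_write` on one data disk plus the parity disk, then comparing `io_counts`, shows the extra two reads that level 5 pays for each small write.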
Level 1 has a higher storage cost than level 5
- Disk drive capacities are increasing rapidly (50%/year), whereas disk access times have decreased much less (a factor of 3 in 10 years)
- I/O requirements have increased greatly, e.g. for Web servers
- When enough disks have been bought to satisfy required rate of I/O, they often have spare storage capacity
- So there is often no extra monetary cost for Level 1!
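The argument above is simple arithmetic: the number of disks to buy is the larger of the number needed for the I/O rate and the number needed for capacity. The sketch below illustrates this under stated assumptions. All function names and the example figures are hypothetical, the level 5 model assumes one parity disk's worth of space per group of `group_size` disks, and the model ignores the fact that level 5 needs more physical I/Os per write.

```python
import math


def disks_for_io(required_iops: float, iops_per_disk: float) -> int:
    """Disks needed to sustain the required I/O rate."""
    return math.ceil(required_iops / iops_per_disk)


def disks_for_capacity(data_gb: float, disk_gb: float,
                       raid_level: int, group_size: int = 5) -> int:
    """Disks needed to hold the data, including redundancy overhead."""
    data_disks = math.ceil(data_gb / disk_gb)
    if raid_level == 1:
        return 2 * data_disks  # every block is mirrored
    if raid_level == 5:
        # Assumption: one disk's worth of parity per (group_size - 1) data disks.
        return data_disks + math.ceil(data_disks / (group_size - 1))
    raise ValueError("only RAID levels 1 and 5 are modeled")


def disks_to_buy(required_iops, iops_per_disk, data_gb, disk_gb, raid_level):
    """The binding constraint decides how many disks are purchased."""
    return max(disks_for_io(required_iops, iops_per_disk),
               disks_for_capacity(data_gb, disk_gb, raid_level))
```

With made-up figures of 10,000 required I/Os per second, 200 I/Os per second per disk, 4,000 GB of data, and 1,000 GB disks, the I/O rate alone calls for 50 disks, while capacity needs only 8 disks under level 1 and 5 under level 5. In that case both levels require the same 50 disks, so mirroring costs nothing extra.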
Level 5 is preferred for applications with low update rate, and large amounts of data.
Level 1 is preferred for all other applications.
Buffer Manager
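When the buffer does not have enough free space to hold a newly read block, an existing block in the buffer must be overwritten (replaced). The main strategies are:
- LRU strategy (least recently used): replace the block that was least recently used.
- MRU strategy (most recently used): the system must pin the block currently being processed. After the final tuple of that block has been processed, the block is unpinned and becomes the most recently used block, which makes it the first candidate for replacement. MRU is the best choice when the same blocks are scanned repeatedly in the same order.
- Toss-immediate strategy: free the space occupied by a block as soon as the final tuple of that block has been processed.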
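The difference between LRU and MRU shows up clearly on a repeated sequential scan. The following is a minimal sketch that counts buffer misses for a given reference string. `count_misses` is a hypothetical helper, and pinning is not modeled: each block is treated as unpinned as soon as the next reference arrives.

```python
from collections import OrderedDict


def count_misses(references, capacity: int, policy: str) -> int:
    """Count buffer misses for a sequence of block references.

    policy is "lru" (evict least recently used) or "mru" (evict most
    recently used). Keys in the OrderedDict run from least to most
    recently used.
    """
    if policy not in ("lru", "mru"):
        raise ValueError("policy must be 'lru' or 'mru'")
    buffer = OrderedDict()
    misses = 0
    for block in references:
        if block in buffer:
            buffer.move_to_end(block)  # mark as most recently used
            continue
        misses += 1
        if len(buffer) >= capacity:
            # last=True pops the most recently used entry, last=False the least.
            buffer.popitem(last=(policy == "mru"))
        buffer[block] = None
    return misses


# Scan the same 4 blocks three times with room for only 3 blocks in the buffer.
scan = [0, 1, 2, 3] * 3
lru_misses = count_misses(scan, capacity=3, policy="lru")
mru_misses = count_misses(scan, capacity=3, policy="mru")
```

On this reference string LRU always evicts the block that will be needed soonest, so every reference misses, while MRU keeps most of the scanned blocks resident and misses only half as often.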
File Organization
- Fixed-length records
- Variable-length records
Organization of Records in Files
- Heap file: a record can be placed anywhere in the file where there is space.
- Sequential file: records are stored in sequential order, based on the value of the search key of each record.
- Hashing file: a hash function is computed on some attribute of each record; the result specifies in which block of the file the record should be placed.
- Clustering file organization: records of several different relations can be stored in the same file. Motivation: store related records of different relations on the same block to minimize I/O.
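To make the hashing file organization concrete, the sketch below places each record in the block selected by a hash of one attribute. It is an illustration under assumptions, not a real storage engine: `HashFile` and `block_for` are hypothetical names, blocks are in-memory lists with unlimited room (no overflow handling), and `zlib.crc32` stands in for whatever hash function a system would actually use.

```python
import zlib


def block_for(key: str, n_blocks: int) -> int:
    """Map a key to a block number with a deterministic hash."""
    return zlib.crc32(key.encode("utf-8")) % n_blocks


class HashFile:
    """Hypothetical hashing file: one list of records per block."""

    def __init__(self, n_blocks: int, key_attr: str):
        self.key_attr = key_attr
        self.blocks = [[] for _ in range(n_blocks)]

    def insert(self, record: dict) -> int:
        """Store the record in the block chosen by hashing its key attribute."""
        b = block_for(str(record[self.key_attr]), len(self.blocks))
        self.blocks[b].append(record)
        return b

    def lookup(self, value) -> list:
        """Only the single block the key hashes to has to be examined."""
        b = block_for(str(value), len(self.blocks))
        return [r for r in self.blocks[b] if r[self.key_attr] == value]


f = HashFile(n_blocks=8, key_attr="id")
for i in range(20):
    f.insert({"id": i, "name": f"student{i}"})
found = f.lookup(7)
```

A lookup on the hashed attribute reads one block instead of scanning the whole file, which is the point of this organization. A heap file, by contrast, would have to examine every block to find the same record.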