File System Review Note - Operating System

Cindy Chen's notes: https://docs.google.com/document/d/14qvHObfoNGrAA6SB6C688ErOzDby8A-5wxqLD_97-p8/edit?usp=sharing

 

Disk I/O

  1. Understand the memory hierarchy concept, locality

  2. Physical disk structure

    1. platters

    2. surfaces

    3. tracks

    4. sectors

    5. cylinders

    6. arms

    7. heads

 

  1. disk scheduling

    1. FCFS

      1. reasonable when load is low

      2. long waiting times for long request queues

    2. SSTF (shortest seek time first)

      1. minimize arm movement (seek time), maximize request rate

      2. favors middle blocks

    3. SCAN (elevator)

      1. service requests in one direction until done, then reverse

    4. C-SCAN

      1. like SCAN, but service requests in one direction only; after reaching the end, return to the start and sweep again

    5. in general, unless there are request queues, disk scheduling does not have much impact

    6. modern disks often do the disk scheduling themselves

      1. disks know their layout better than the OS, so they can optimize better

      2. the disk ignores or undoes any scheduling done by the OS
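The four scheduling policies above can be compared by counting total head movement over the same request queue. The following is a minimal sketch, not a definitive implementation: the function names and the request queue are assumptions made for illustration, and SCAN/C-SCAN here reverse (or wrap) at the last pending request rather than at the physical edge of the disk.

```python
def fcfs(start, requests):
    """FCFS: service requests in arrival order."""
    return list(requests)


def sstf(start, requests):
    """SSTF: always pick the pending request closest to the current head position."""
    pending, pos, order = list(requests), start, []
    while pending:
        nxt = min(pending, key=lambda cyl: abs(cyl - pos))
        pending.remove(nxt)
        order.append(nxt)
        pos = nxt
    return order


def scan(start, requests):
    """SCAN (elevator): sweep upward until done, then reverse and sweep downward.

    Assumption: the head turns around at the last pending request,
    not at the edge of the disk.
    """
    up = sorted(c for c in requests if c >= start)
    down = sorted((c for c in requests if c < start), reverse=True)
    return up + down


def c_scan(start, requests):
    """C-SCAN: sweep upward only, then jump back and sweep upward again."""
    up = sorted(c for c in requests if c >= start)
    wrapped = sorted(c for c in requests if c < start)
    return up + wrapped


def head_movement(start, order):
    """Total number of cylinders the arm travels to service `order`."""
    total, pos = 0, start
    for cyl in order:
        total += abs(cyl - pos)
        pos = cyl
    return total


if __name__ == "__main__":
    queue = [98, 183, 37, 122, 14, 124, 65, 67]  # hypothetical cylinder numbers
    for policy in (fcfs, sstf, scan, c_scan):
        print(policy.__name__, head_movement(53, policy(53, queue)))
```

With the head starting at cylinder 53 and this hypothetical queue, FCFS travels 640 cylinders while SSTF travels 236, which illustrates why SSTF minimizes arm movement when the queue is long.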

  2. Disk interface

    1. How does the OS make requests to the disk?

  3. Disk performance

    1. What steps determine disk request performance?

    2. What are seek, rotation, transfer?

 

File systems

    1. Topics

      1. files

      2. directories

      3. sharing

      4. protection

      5. layouts

      6. buffer cache

    2. What is a file system?

      1. implement an abstraction (files) for secondary storage

      2. organize files logically (directories)

      3. permit sharing of data between processes, people and machines

      4. protect data from unwanted access (security)

    3. Why are file systems useful (why do we have them)?

 

Files and directories

What is a file?

    1. What characteristics do they have?

      1. data with some properties

      2. can have a type

        1. understood by the file system: block, character, device, portal, link, etc.

        2. understood by other parts of the OS or runtime libraries

      3. a file's type can be encoded in its name or contents

    2. What are file access methods?

      1. sequential access

      2. direct access

      3. indexed sequential access

  1. What is a directory?

    1. What are they used for?

      1. for users, they provide a structured way to organize files

      2. for the file system, they provide a convenient naming interface that allows the implementation to separate logical file organization from physical file placement on the disk

    2. How are they implemented?

      1. <name, location>

      2. name is just the name of the file or directory

      3. location depends upon how file is represented on disk

    3. What is a directory entry?

      1. a struct of inode and filename: just enough information to translate from a filename to an inode and get to the actual file

  2. How are directories used to do path name translation?

 

Protection

  1. What is file protection used for?

    1. a protection system dictates whether a given action performed by a given subject on a given object should be allowed

  2. How is it implemented?

  3. What are access control lists (ACLs)?

    1. for each object, maintain a list of subjects and their permitted actions

  4. What are capabilities?

    1. for each subject, maintain a list of objects and their permitted actions

  5. What are the advantages/disadvantages of each

    1. capabilities: compact enough to fit in just a few bytes; not very expressive

    2. access control list: a per-file list that tells who can access that file: highly expressive; harder to represent in a compact way

    3. Which one is easier for deleting an object (file)? - ACL
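The difference between the two schemes can be sketched by storing the same permission matrix two ways: per object (ACL) or per subject (capabilities). This is a minimal illustration only; the class names, method names, and sample users are hypothetical and not taken from any real OS.

```python
from collections import defaultdict


class AclStore:
    """Permission matrix stored per object: object -> {subject: actions}."""

    def __init__(self):
        self.acl = defaultdict(dict)

    def grant(self, subject, obj, action):
        self.acl[obj].setdefault(subject, set()).add(action)

    def allowed(self, subject, obj, action):
        return action in self.acl.get(obj, {}).get(subject, set())

    def delete_object(self, obj):
        # All permissions for the object live in one list, so one removal is enough.
        self.acl.pop(obj, None)


class CapabilityStore:
    """Permission matrix stored per subject: subject -> {object: actions}."""

    def __init__(self):
        self.caps = defaultdict(dict)

    def grant(self, subject, obj, action):
        self.caps[subject].setdefault(obj, set()).add(action)

    def allowed(self, subject, obj, action):
        return action in self.caps.get(subject, {}).get(obj, set())

    def delete_object(self, obj):
        # Every subject's list must be searched for the object.
        touched = 0
        for objects in self.caps.values():
            if objects.pop(obj, None) is not None:
                touched += 1
        return touched
```

Deleting an object touches a single list in `AclStore` but requires visiting every subject in `CapabilityStore`, which is why the ACL is the easier choice for deletion.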

 

File system layouts

  1. What are file system layouts used for? - to store files

    1. file systems define a block size (like 4 KB)

      1. disk space is allocated in granularity of blocks

    2. a "master block" determines location of root directory

    3. a free map determines which blocks are free, allocated

      1. usually a bitmap, one bit per block on the disk

      2. also stored on disk, cached in memory for performance

    4. remaining disk blocks used to store files
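The free map described above can be sketched as a bitmap with one bit per block. This is a minimal in-memory sketch; the class name `FreeMap` and the first-fit allocation policy are assumptions made for illustration, not how any particular file system does it.

```python
class FreeMap:
    """One bit per disk block: 0 = free, 1 = allocated."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.bits = bytearray((num_blocks + 7) // 8)  # round up to whole bytes

    def is_allocated(self, block):
        return bool(self.bits[block // 8] & (1 << (block % 8)))

    def allocate(self):
        """Return the lowest-numbered free block and mark it allocated (first fit)."""
        for block in range(self.num_blocks):
            if not self.is_allocated(block):
                self.bits[block // 8] |= 1 << (block % 8)
                return block
        raise OSError("no free blocks")

    def free(self, block):
        """Mark a block as free again."""
        self.bits[block // 8] &= ~(1 << (block % 8)) & 0xFF
```

As a size check under assumed numbers: a 1 TB disk with 4 KB blocks has 2^28 blocks, so its bitmap needs 2^28 bits, or 32 MB, which is small enough to cache in memory.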

  2. What are the general strategies?

    1. contiguous allocation

      1. like memory

      2. fast, simplifies directory access

      3. inflexible, causes fragmentation, needs compaction

    2. linked structure

      1. each block points to the next, directory points to the first

      2. good for sequential access, bad for all others

    3. indexed structure (indirection, hierarchy)

      1. an "index block" contains pointers to many other blocks

      2. handles random access better, still good for sequential access

      3. may need multiple index blocks

  3. What are the tradeoffs for those strategies?

    1. contiguous allocation

      1. pros:

        1. simple

        2. performance

      2. cons:

        1. fragmentation

        2. usability

        3. hard for a file to dynamically grow

      3. used in CDROMs, DVDs

    2. linked list allocation

      1. pros:

        1. no space lost to external fragmentation

        2. the directory entry only needs to store the first block of each file

      2. cons:

        1. random access is costly

        2. overheads of pointers

  4. MS-DOS file system

    1. file allocation table (FAT)

    2. take pointers away from blocks, store in this table

    3. pros:

      1. entire block is available for data

      2. random access is faster than linked list

    4. cons:

      1. many file seeks unless entire FAT is in memory
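The FAT idea can be sketched as a table indexed by block number, where each entry holds the number of the next block in the file. This is a simplified sketch: the `EOF` marker value and the function names are assumptions, and real FAT variants use reserved entry values and fixed-width entries.

```python
EOF = -1  # hypothetical end-of-chain marker


def file_blocks(fat, first_block):
    """Follow the chain in the table and return every block of the file in order."""
    blocks, block = [], first_block
    while block != EOF:
        blocks.append(block)
        block = fat[block]
    return blocks


def nth_block(fat, first_block, n):
    """Random access: find the n-th block (0-based) by walking the table only."""
    block = first_block
    for _ in range(n):
        block = fat[block]
        if block == EOF:
            raise IndexError("offset is past the end of the file")
    return block


if __name__ == "__main__":
    fat = [EOF] * 8
    fat[2], fat[5], fat[3] = 5, 3, EOF  # a file stored in blocks 2 -> 5 -> 3
    print(file_blocks(fat, 2))
    print(nth_block(fat, 2, 2))
```

The walk in `nth_block` touches only table entries, never data blocks. If the table is in memory this is fast; if not, each step may cost a disk seek, which is the drawback listed above.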

  5. How do those strategies reflect file access methods?

  6. What is an inode?

    1. How are inodes different from directories?

      1. unix inodes implement an indexed structure for files

      2. also store metadata info (protection, timestamps, length, ref count, ...)

      3. each inode contains 15 block pointers

      4. first 12 are direct blocks, then single, double, and triple indirect
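The pointer layout above determines which level of indirection holds a given logical block of a file. The sketch below assumes 4 KB blocks and 4-byte block pointers (so 1024 pointers per index block); both numbers and the function names are assumptions made for illustration.

```python
BLOCK_SIZE = 4096                           # assumed block size in bytes
PTR_SIZE = 4                                # assumed size of one block pointer
PTRS_PER_BLOCK = BLOCK_SIZE // PTR_SIZE     # 1024 pointers per index block
NUM_DIRECT = 12                             # direct pointers in the inode


def locate(block_index):
    """Map a logical block index to (level, path of indices through index blocks)."""
    if block_index < 0:
        raise ValueError("negative block index")
    if block_index < NUM_DIRECT:
        return "direct", [block_index]
    rem = block_index - NUM_DIRECT
    for depth, level in enumerate(("single", "double", "triple"), start=1):
        span = PTRS_PER_BLOCK ** depth      # blocks reachable at this level
        if rem < span:
            path = []
            for _ in range(depth):          # split rem into base-1024 digits
                path.append(rem % PTRS_PER_BLOCK)
                rem //= PTRS_PER_BLOCK
            return level, path[::-1]
        rem -= span
    raise ValueError("block index beyond the triple indirect range")


def max_file_blocks():
    """Largest number of data blocks one inode can address."""
    return NUM_DIRECT + sum(PTRS_PER_BLOCK ** d for d in (1, 2, 3))
```

Under these assumptions, small files need no index blocks at all, while a block reached through the triple indirect pointer costs three extra index-block reads.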

    2. How are inodes and directories used to do path resolution, find files?

      1. read inode for ...

      2. read data block for ...

      3. read inode for ...

      4. read data block for ...
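The alternating "read inode, read data block" pattern above can be sketched with a toy in-memory inode table. Everything here is hypothetical: the inode numbers, the sample path, the choice of 2 as the root inode number, and the dictionary layout are assumptions made for illustration.

```python
ROOT_INODE = 2  # assumed root inode number for this toy example

# Toy inode table: directories map names to inode numbers, files hold data.
INODES = {
    2: {"type": "dir", "entries": {"home": 5}},
    5: {"type": "dir", "entries": {"alice": 9}},
    9: {"type": "dir", "entries": {"notes.txt": 14}},
    14: {"type": "file", "data": b"hello"},
}


def resolve(path):
    """Translate an absolute path to an inode number, recording each read."""
    inum = ROOT_INODE
    steps = [("inode", inum)]                # start by reading the root inode
    for name in [part for part in path.split("/") if part]:
        inode = INODES[inum]
        if inode["type"] != "dir":
            raise NotADirectoryError(name)
        steps.append(("data", inum))         # read the directory's data block
        if name not in inode["entries"]:
            raise FileNotFoundError(path)
        inum = inode["entries"][name]        # directory entry: name -> inode
        steps.append(("inode", inum))        # read the next inode
    return inum, steps


if __name__ == "__main__":
    inum, steps = resolve("/home/alice/notes.txt")
    print(inum, steps)
```

Each path component costs one directory data read plus one inode read, so deep paths are expensive unless these blocks are already cached.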

 

A hard link creates a new directory entry that points to the same underlying inode. Deleting the original file only decreases the inode's reference count by one, so the newly created hard link still exists and the data remains accessible through it.


A soft link is just a link to another path name. If the original file is removed, that path no longer exists, and the soft link becomes a dangling pointer.
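The behaviour of the two link types can be checked with a short experiment in a temporary directory. This sketch assumes a POSIX-like system where `os.link` and `os.symlink` are permitted; the file names and the `link_demo` function are hypothetical.

```python
import os
import tempfile


def link_demo():
    """Create a file, a hard link, and a soft link, then remove the original."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "original.txt")
        hard = os.path.join(tmp, "hard.txt")
        soft = os.path.join(tmp, "soft.txt")
        with open(original, "w") as f:
            f.write("data")

        os.link(original, hard)       # second name for the same inode
        os.symlink(original, soft)    # small file holding the path name

        same_inode = os.stat(original).st_ino == os.stat(hard).st_ino
        links_before = os.stat(original).st_nlink

        os.remove(original)           # drops one reference to the inode

        links_after = os.stat(hard).st_nlink
        with open(hard) as f:
            hard_content = f.read()
        # The symlink itself still exists, but its target path does not.
        soft_dangling = os.path.islink(soft) and not os.path.exists(soft)

        return {
            "same_inode": same_inode,
            "links_before": links_before,
            "links_after": links_after,
            "hard_content": hard_content,
            "soft_dangling": soft_dangling,
        }


if __name__ == "__main__":
    print(link_demo())
```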



shared files

  1. if B wants to share a file owned by C

    1. one solution: copy disk addresses in B's directory entry

    2. problem: modification by one not reflected in other user's view

 

File buffer cache

  1. What is the file buffer cache, and why do operating systems use one?

    1. applications exhibit significant locality for reading and writing files

    2. cache file blocks in memory to capture locality

    3. file buffer cache

    4. cache is system wide, used and shared by all processes

    5. reading from the cache makes a disk perform like memory

    6. even a 4 MB cache can be very effective

  2. What is the difference between caching reads and caching writes?

    1. on a write, some applications assume that data makes it through the buffer cache and onto the disk

    2. as a result, writes are often slow even with caching

    3. ways to compensate:

      1. write-behind

        1. maintain a queue of uncommitted blocks

        2. periodically flush the queue to disk

        3. unreliable

      2. battery backed-up RAM (NVRAM)

        1. as with write-behind, but maintain queue in NVRAM

        2. expensive

      3. log-structured file system

        1. always write next block after last block written

        2. complicated

    4. read ahead

      1. FS predicts that the process will request the next block

      2. FS goes ahead and requests it from the disk

      3. this can happen while the process is computing on the previous block

      4. when the process requests the block, it will already be in the cache

      5. complements the disk cache, which also does read ahead

      6. big win for sequentially accessed files

  3. What are the tradeoffs of using memory for a file buffer cache vs. VM?

    1. file buffer competes with VM

    2. like VM, it has limited size

    3. need replacement algorithms again (LRU)
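A buffer cache with LRU replacement can be sketched in a few lines. This is a read-only sketch under stated assumptions: the class name `BufferCache`, the `read_block` callback standing in for the disk, and the capacity measured in blocks are all hypothetical, and write handling (write-behind, flushing) is left out.

```python
from collections import OrderedDict


class BufferCache:
    """System-wide cache of file blocks with LRU replacement (reads only)."""

    def __init__(self, capacity, read_block):
        self.capacity = capacity          # maximum number of cached blocks
        self.read_block = read_block      # callback: block number -> bytes
        self.blocks = OrderedDict()       # oldest entry first, newest last
        self.hits = 0
        self.misses = 0

    def read(self, block_no):
        if block_no in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(block_no)   # mark as most recently used
            return self.blocks[block_no]
        self.misses += 1
        data = self.read_block(block_no)        # slow path: go to the disk
        self.blocks[block_no] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)     # evict least recently used
        return data
```

Because the cache has a fixed capacity, every block it holds is memory that VM cannot use for pages, which is the tradeoff noted above.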

 

Advanced topics

    1. Distributed/parallel systems

      1. benefits of distributed systems

        1. performance: parallelism across multiple nodes

          1. Google File System, Bigtable, MapReduce, Hadoop, etc.

        2. reliability and fault tolerance

          1. redundancy

          2. e.g., the Google search engine

        3. scalability by adding more nodes

      2. benefits of RPC (remote procedure calls)

        1. most common model for communication in distributed applications

        2. RPC is essentially language support for distributed programming

        3. relies upon a stub compiler to automatically generate client/server stubs from the IDL server descriptions

          1. these stubs do the marshalling/unmarshalling, message sending/receiving/replying

        4. NFS uses RPC to implement remote file systems
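What the generated stubs do can be sketched by hand. In this minimal illustration the stubs are written manually rather than produced by a stub compiler, JSON stands in for the wire format, and a direct function call stands in for the network; all names (`server_dispatch`, `make_client_stub`, `PROCEDURES`) are hypothetical.

```python
import json


# --- server side -----------------------------------------------------------
def add(a, b):
    """The actual procedure implemented by the server."""
    return a + b


PROCEDURES = {"add": add}


def server_dispatch(request_bytes):
    """Server stub: unmarshal the request, call the procedure, marshal the reply."""
    request = json.loads(request_bytes.decode("utf-8"))
    result = PROCEDURES[request["proc"]](*request["args"])
    return json.dumps({"result": result}).encode("utf-8")


# --- client side -----------------------------------------------------------
def make_client_stub(proc, transport):
    """Client stub: looks like a local function, but sends a message instead."""
    def stub(*args):
        request = json.dumps({"proc": proc, "args": list(args)}).encode("utf-8")
        reply = transport(request)       # assumption: stands in for the network
        return json.loads(reply.decode("utf-8"))["result"]
    return stub


if __name__ == "__main__":
    remote_add = make_client_stub("add", server_dispatch)
    print(remote_add(2, 3))
```

The caller of `remote_add` never sees the marshalling or the message exchange, which is the sense in which RPC acts as language support for distributed programming.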

    2. Big data and cloud

      1. main goals/objectives of Hadoop/MapReduce

      2. main challenges

 

Reposted from: https://www.cnblogs.com/yxcindy/p/10223522.html
