Happy Apache Cassandra 2: File Store Format

h3. Model Review

Cassandra data model

Column Family



Super Column Family



h3. File Format

h4. Keyspace

  • Each Keyspace(Eg. Lobs)in separated directory
  • Each ColumnFamily(Eg. object) in separated sstable files
    • ColumnFamilyName-version-#-Data.db
    • ColumnFamilyName-version-#-Index.db
    • ColumnFamilyName-version-#-Filter.db

h4. Index File 

  • Each entry is [Key, Position], order by key
  • Index Summary(pre-loaded into memory), every 1 entry in 128 entries
  • Key Cache, cache the entry after access it


h4. data file

  • Each entry is [Row], order by key
  • Each Row contains columns, order by column name
  • Each Row contains column index (in range)


h4. others

  • Bloom filter file
    • Given {an element}, answer {is contained in a set} .
    • Given a {key}, answer {is contained in current sstable file}
  • Commit log file
    • Used in write, avoid data lost 
  • Statistical file

h3. CRUD

h4. write

BigTable

  • First write to a disk commit log (sequential) 
  • Update to appropriate memtables
  • Memtables are flushed to disk into SSTable ( immutable)


h4. read

  • Check row cache
  • Cassandra will read all the SSTables for that Column Family
    • Bloom Filter for each SSTable to determine whether this SSTable contains the key
    • Use index in SSTable to locate the data (check key cache)
    • Read from data file

h4. update

  • Same sequence with write
    • SSTable is immutable
    • Update is write into new SSTable
  • Column Reconcile
    • Read columns from all the SSTables
    • Merge/Reduced columns with same name, CollationController
      • Use timestamp, return latest column

h4. delete

  • Same sequence with write
    • SSTable is immutable
    • Delete is write into new SSTable
  • Delete a column
    • Column flag is set to delete flag
  • Delete a Row
    • markedForDeleteAt  is set to delete timestamp
    • Column timestamp is compared with markedForDeleteAt when reducing columns in read

h4. drop

  • Drop a Column Family
    • Take snapshots (move to snapshot dir)
    • Remove column family definition
  • Drop a Keyspace
    • Take snapshots (move to snapshot dir)
    • Remove keyspace definition

h4. Compaction
Periodically data files are merged sorted into a new file (and creates new index)
  • Merge keys 
  • Combine columns 
  • Discard tombstones


References

http://jonathanhui.com/how-cassandra-read-persists-data-and-maintain-consistency

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 0
    评论
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

FireCoder

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值