-
Overview
In C-Store, the primary representation of data on disk is a set of column files, called the “read-optimized store” (ROS).
Each column file contains data from one column, compressed using a column-specific compression method, and sorted according to some attribute in the table that the column belongs to.
Newly loaded data is stored in a write-optimized store (WOS) where data is uncompressed and not vertically partitioned.
Periodically, data is moved from the WOS into the ROS via a background “tuple mover” process, which sorts, compresses, and writes re-organized data to disk in a columnar form.
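The WOS-to-ROS move described above can be sketched roughly as follows. This is only an illustration: the function names are hypothetical, run-length encoding stands in for whatever column-specific compression a real system would pick, and in-memory lists stand in for files on disk.

```python
from itertools import groupby


def run_length_encode(values):
    """Compress a list into (value, run_length) pairs (assumed compression scheme)."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def tuple_mover(wos_rows, column_names, sort_key):
    """Turn row-oriented WOS tuples into sorted, compressed columns."""
    # 1. Sort the uncompressed rows on the chosen attribute.
    ordered = sorted(wos_rows, key=lambda row: row[sort_key])
    # 2. Vertically partition into one value list per column,
    # 3. and compress each column independently.
    return {
        name: run_length_encode([row[name] for row in ordered])
        for name in column_names
    }


wos = [
    {"region": "west", "product": "b", "qty": 3},
    {"region": "east", "product": "a", "qty": 5},
    {"region": "east", "product": "c", "qty": 1},
]
ros = tuple_mover(wos, ["region", "product", "qty"], sort_key="region")
```

Sorting before compressing is what makes the sort column compress well here: equal values end up adjacent, so they collapse into a single run.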
-
Projections
Groups of columns sorted on the same attribute are referred to as “projections”.
Typically there is at least one projection that contains all columns and can therefore be used to answer any query.
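As a rough illustration of the idea, the sketch below builds two projections over the same table: one with all columns, and a narrower one with a different sort order. The helper names `build_projection` and `covers` are made up for this example and are not part of C-Store.

```python
def build_projection(rows, columns, sort_key):
    """Store the chosen columns, all ordered by the same sort attribute."""
    ordered = sorted(rows, key=lambda row: row[sort_key])
    return {
        "sort_key": sort_key,
        "columns": {name: [row[name] for row in ordered] for name in columns},
    }


def covers(projection, needed_columns):
    """A projection can answer a query only if it holds every column the query needs."""
    return set(needed_columns) <= set(projection["columns"])


table = [
    {"id": 2, "region": "west", "qty": 3},
    {"id": 1, "region": "east", "qty": 5},
    {"id": 3, "region": "east", "qty": 1},
]
full = build_projection(table, ["id", "region", "qty"], sort_key="id")
by_region = build_projection(table, ["region", "qty"], sort_key="region")
```

Here `full` can serve any query, while `by_region` only serves queries over `region` and `qty`.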
-
Index
C-Store does not support secondary indices on tables, but it does support efficient indexing into sorted projections through sparse indexes.
A sparse index is a small tree-based index that stores the first value contained on each physical page of a column.
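A minimal sketch of a sparse-index lookup, assuming a sorted column split into fixed-size pages. The page size and function names are hypothetical, and a sorted list searched with `bisect` stands in for the tree structure.

```python
from bisect import bisect_right

PAGE_SIZE = 4  # hypothetical number of values per physical page


def paginate(sorted_values, page_size=PAGE_SIZE):
    """Split a sorted column into fixed-size pages."""
    return [sorted_values[i:i + page_size]
            for i in range(0, len(sorted_values), page_size)]


def build_sparse_index(pages):
    """Keep only the first value of each page."""
    return [page[0] for page in pages]


def find_page(sparse_index, value):
    """Return the last page whose first value is <= value (page 0 if none)."""
    return max(bisect_right(sparse_index, value) - 1, 0)


def contains(pages, sparse_index, value):
    """Check membership by reading a single page instead of the whole column."""
    return value in pages[find_page(sparse_index, value)]


column = [3, 7, 8, 12, 15, 21, 22, 30, 41, 47]
pages = paginate(column)
index = build_sparse_index(pages)
```

The index holds one entry per page rather than one per value, which is why it stays small. It only works because the column is sorted.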
-
No-overwrite
C-Store uses a “no-overwrite” storage representation, where updates are treated as deletes followed by inserts.
Deletes are processed by storing a special “delete column” that records the time every tuple was deleted.
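The no-overwrite scheme can be sketched as an append-only table with a parallel delete column. The class name, the logical clock, and the position-based addressing below are assumptions made for illustration only.

```python
class NoOverwriteTable:
    """Append-only table: tuples are never modified in place."""

    def __init__(self):
        self.rows = []        # tuple values, append-only
        self.deleted_at = []  # the "delete column": deletion time, None while live
        self.clock = 0        # hypothetical logical clock

    def _tick(self):
        self.clock += 1
        return self.clock

    def insert(self, row):
        self._tick()
        self.rows.append(row)
        self.deleted_at.append(None)
        return len(self.rows) - 1  # position of the new tuple

    def delete(self, position):
        # The tuple stays on disk; only its deletion time is recorded.
        self.deleted_at[position] = self._tick()

    def update(self, position, new_row):
        # An update is a delete followed by an insert.
        self.delete(position)
        return self.insert(new_row)

    def live_rows(self):
        return [row for row, t in zip(self.rows, self.deleted_at) if t is None]


table = NoOverwriteTable()
alice = table.insert(("alice", 10))
table.insert(("bob", 20))
alice_v2 = table.update(alice, ("alice", 15))
```

After the update, the old version of the tuple is still stored, but it carries a deletion time and is filtered out of `live_rows()`.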