Just to keep track of the course for myself
文章目录
Lecture
- Lecture 1: Course Introduction and the Relational Model
- Lecture 2: Advanced SQL
- Lecture 3: Database Storage I
- Lecture 4: Database Storage II
- Lecture 5: Buffer Pools
- Lecture 6: Hash Tables
- Lecture 7: Tree Indexes
- Lecture 8: Index Concurrency Control
- Lecture 9: Sorting + Aggregations
- Lecture10: Joins Algorithms
- Lecture 11: Query Execution I
- Lecture 12: Query Execution II
- Lecture 13: Query Planning & Optimization I (13:11)
Lecture1: Course Introduction and the Relational Model
Databases、Flat File Strawman、Database Management System、Relational Mode、Data Manipulation Languages (DMLs)、Relational Algebra
Relational Algebra
Select、Projection、Union、Itersection、Difference、Product、Join、Rename、Assignment、Duplicate Elimination、Aggregation、Sorting、Division
Lecture 2: Advanced SQL
Relational Languages、SQL History、Joins、Aggregates、String Operations、Date and Time、Output Redirection、Output Control、Nested Queries、Window Functions、Common Table Expressions
Lecture 3: Database Storage I
Storage, Disk-Oriented DBMS Overview, DBMS vs. OS, File Storage, Database Pages, Database Heap, Page Layout, Tuple Layout,
DBMS vs OS
Why not use the OS? Why not use the mmap?
File Storage
The DBMS stores a database as one or more files on disk.
Storage Manager
The storage manager is responsible for maintaining a database’s files.
It organizes the files as a collection of pages.
Pages
A page is fixed-size block of data. It contrain tuples, meta-data, indexes, log records…
Each page is given a unique identifier.
Hardware Page(usually 4KB) & OS Page(usually 4KB) & Database Page(512B-16KB)
Different DBMS manage pages in files on disk in different ways:
- Heap File Organization (Linked List & Page Directory)
- Sequential / Sorted File Organization
- Hashing File Organization
Page Layout
Two approaches about organizing the data stored inside of the page:
- Tuple-oriented
- Log-structured
Page Header: Page Size、Checksum、DBMS Version、Transaction Visibility、Compression Information
Tuple Storage & Slotted Pages
Record ID
Each tuple is assigned a unique record identifier.
- Most common: page_id + offset/slot
- Can also contain file location info
Tuple Layout
Each tuple is prefixed with a header that contains meta-data about it.
- Visibility info (concurrency control)
- Bit Map for NULL values
We do not need to store meta-data about the schema.
physical denormalization??
Log-structured File Organization
The system appends log records to the file of how the database was modified:
- Inserts store the entire tuple.
- Deletes mark the tuple as deleted.
- Updates contain the delta of just the attributes that were modified.
Why better write performance? avoid random io
To read a record, the DBMS scans the log backwards and “recreates” the tuple to find what it needs.
leveldb/rocksdb/hbase/cassandra
Lecture 4: Database Storage II
Data Representation, Workloads, Storage Models,
Variable Precision Numbers
Typically faster than arbitrary precision numbers but can have rouding errors…
Examples: FLOAT, REAL/DOUBLE
Fixed Precision Numbers
Examples: NUMERIC, DECIMAL
Large Values
To store values that are larger than a page, the DBMS uses separate overflow storage pages.
Example: BLOB
System Catalogs
Almost every DBMS stores the database’s catalog inside itself (i.e., as tables).
Database Workloads
On-Line Transaction Processing (OLTP): Fast operations that only read/update a small amount of data each time.
On-Line Analytical Processing (OLAP): Complex queries that read a lot of data to compute aggregates.
Hybrid Transacntion + Analytical Processing (HTAP): OLTP + OLAP together on the same database instance
N-ARY Storage Model (NSM)
The DBMS stores all attributes for a single tuple contiguously in a page.
Decomposition Storage Model (DSM)
The DBMS stores the values of a single attribute for all tuples contiguously in a page.
Also known as a column store.
Tuple Identification
Fixed-length Offesets & Embedded Tuple Ids
Lecture 5: Buffer Pools
Introduction, Locks vs. Latches, Buffer Pool, Buffer Pool Optimizations, OS Page Cache, Buffer Replacement Policies, Other Memory Pools
Buffer Pool Organization
Memory region organized as an array of fixed-size pages.
An array entry is called a frame.
Buffer Pool Meta-Data
The page table keeps track of pages that are currently in memory.
Also maintains additional meta-data per page: Dirty Flag, Pin/Reference Counter
Allocation Policies
Global Policies & Local Policies
Multiple Buffer Pools
Multiple buffer pool instances, Per-database buffer pool, Per-page type buffer pool
How to choose buffer pools: Object Id & Hashing
Pre-Fetching
The DBMS can also prefetch pages based on a query plan.
Scan Sharing
Queries can reuse data retrieved from storage or operator computations.
Aslo called synchronized scans.
Buffer Pool Bypass
The sequential scan operator will not store fetched pages in the buffer pool to avoid overhead.
OS Page Cache
Most DBMSs use direct I/O to bypass the OS’s page cache.
Buffer Replacement Policies
LRU, CLOCK, LRU-K
Localization, Priority hints
Dirty Pages
FAST: If a page in the buffer pool is not dirty, then the DBMS can simply drop it.
SLOW: If a page is dirty, then the DBMS must write back to disk to ensure that its changes are persisted.
Background Writing
The DBMS can periodically walk through the page table and write dirty pages to disk.
Other Memory Pools
The DBMS needs memory for things other than just tuples and indexes.
- Sorting + Join Buffers
- Query Caches
- Maintenance Buffers
- Log Buffers
- Dictionary Caches
Lecture 6: Hash Tables
Data Structures, Hash Table, Hash Functions, Static Hashing Schemes, Dynamic Hashing Schemes
Data Structures
Internal Meta-Data, Core Data Storage, Temporary Data Structures, Table Indices
Hash Table
Trade-off: Hash Function & Hash Scheme
Hash Functions
CRC-64、MurmurHash、Google CityHash、Facebook XXHash、Google FarmHash
Static Hashing Schemes
Linear Probe Hashing
Robin Hood Hashing
Cuckoo Hashing
Dynamic Hashing Schemes
Chained Hashing
Extendible Hashing
Linear Hashing
Lecture 7: Tree Indexes
Table Indexes, B+Tree, B+Tree Design Choices, Optimizations
Table Indexes
A table index is a replica of a subset of a table’s attributes that are organized and/or sorted for efficient access using those attributes.
Leaf Node Values
- Record IDs: A pointer to the location of the tuple to which the index entry corresponds.
- Tuple Data: The leaf nodes store the actual contents of the tuple; Secondary indexes must store the Record ID as their values.
Clustered Indexes
The table is stored in the sort order specified by the primary key.
B+Tree Design Choices
- Node Size
- Merge Threshold
- Variable-Length Keys
- Intra-Node Search
Lecture 14: Query Planner and Op
rule:
cost-based search: histogram, sampling, dynamic programming
Homework
Project
Reference
https://15445.courses.cs.cmu.edu/fall2021/