Reading Notes of Database System Implementation

 

Chapter 1 Introduction

1.1 Introducing: The Megatron 2000 Database System

The capabilities that a DBMS provides the user are:

Persistent storage/Programming interface/Transaction management

1.1.1 Megatron 2000 Implementation Details
1.1.2 How Megatron 2000 Executes Queries
1.1.3 What’s Wrong with Megatron 2000?

1.2 Overview of a Database Management System

 

1.2.1 Data-Definition Language Commands

DBMS components (Figure 1.1)

 

1.2.2 Overview of Query Processing

 

1.2.3 Main-Memory Buffers and the Buffer Manager

 

Information that various components may need: Data/Metadata/Statistics/Indexes

 

1.2.4 Transaction Processing

 

Tasks for transaction processor: Logging/Concurrency control/Deadlock resolution

 

1.2.5 The Query Processor

Two components: query compiler (query parser/query preprocessor/query optimizer)/execution engine

1.3 Outline of This Book

3 parts: Storage management/Query processing/Transaction management

 

1.3.1 Prerequisites

<A First Course in Database Systems>

Database Design / Database Programming

1.3.2 Storage-Management Overview

 

1.3.3 Query-Processing Overview

 

1.3.4 Transaction-Processing Overview

 

1.3.5 Information Integration Overview

 

1.4 Review of Database Models and Languages

 

1.4.1 Relational Model Review

 

1.4.2 SQL Review

 

1.4.3 Relational and Object-Oriented Data

 

1.5 Summary of Chapter 1

Database Management Systems:

Comparison With File Systems:

Components of a DBMS:

The Storage Manager:

The Query Processor:

The Transaction Manager:

SQL:

Data Concepts:

 

1.6 References for Chapter 1

 

 


 

 

Chapter 2 Data Storage

2.1 The Memory Hierarchy

The memory hierarchy (Figure 2.1)

2.1.1 Cache

 

2.1.2 Main Memory

 

2.1.3 Virtual Memory

 

2.1.4 Secondary Storage

 

2.1.5 Tertiary Storage

 

2.1.6 Volatile and Nonvolatile Storage

Access time versus capacity for various levels of the memory hierarchy

 

2.1.7 Exercise

 

 

2.2 Disks

2.2.1 Mechanics of Disks

Tracks/sector/gap

Disk head

 

2.2.2 The Disk Controller

 

2.2.3 Disk Storage Characteristics

Rotation Speed of the Disk Assembly

Capacity = Number of Platters per Unit * Number of Surfaces per Platter * Number of Tracks per Surface * Number of Bytes per Track

Density of bits:

 

2.2.4 Disk Access Characteristics

 

Average time to read a block (the latency of the disk) = seek time + rotational latency + transfer time
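As a rough illustration of the sum above, the following Python sketch adds the three components. It assumes that the average rotational latency is half a rotation; the function name and the sample figures in the note after the code are hypothetical and not taken from the book.

```python
def average_block_read_ms(seek_ms: float, rpm: float, transfer_ms: float) -> float:
    """Estimate the average time (in ms) to read one block.

    Assumption: on average the head waits half a rotation before the
    first sector of the block arrives under it.
    """
    rotation_ms = 60_000.0 / rpm             # time for one full rotation
    rotational_latency_ms = rotation_ms / 2  # average wait: half a rotation
    return seek_ms + rotational_latency_ms + transfer_ms
```

For a hypothetical disk with a 6 ms average seek, 6000 rpm and a 0.5 ms transfer time, `average_block_read_ms(6.0, 6000, 0.5)` evaluates to 11.5 ms.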

 

2.2.5 Writing Blocks

 

 

2.2.6 Modifying Blocks

 

 

2.3 Using Secondary Storage Effectively

 

2.3.1 The I/O Model of Computation

 

2.3.2 Sorting Data in Secondary Storage

 

2.3.3 Merge-Sort

 

2.3.4 Two-Phase, Multiway Merge-Sort

Main-memory organization for multiway merging (Figure 2.11)
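The two phases can be sketched in Python as follows. This is an in-memory simulation only: plain lists stand in for sorted sublists on disk, `memory_capacity` (a hypothetical parameter) plays the role of the number of records that fit in main memory, and `heapq.merge` stands in for the multiway merge that keeps one input buffer per sublist.

```python
import heapq
from itertools import islice


def two_phase_merge_sort(records, memory_capacity):
    """Sort `records` via sorted sublists of at most `memory_capacity` items."""
    # Phase 1: fill "main memory", sort it, and write out a sorted sublist.
    source = iter(records)
    sublists = []
    while True:
        chunk = list(islice(source, memory_capacity))
        if not chunk:
            break
        chunk.sort()
        sublists.append(chunk)

    # Phase 2: merge all sorted sublists in a single multiway merge.
    return list(heapq.merge(*sublists))
```

A real implementation would read and write blocks rather than Python lists; the sketch only shows the control flow of the two phases.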

 

2.3.5 Extension of Multiway Merging to Larger Relations

The total number of records we can sort by 2-phase, multiway merge-sort
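A small sketch of that bound, under the usual assumptions: main memory holds M bytes, a block holds B bytes, and a record takes R bytes. Phase 1 produces sublists of about M/R records, and phase 2 can merge at most M/B - 1 sublists (one buffer is reserved for output), giving roughly M^2/(R*B) records. The function and parameter names are illustrative.

```python
def max_sortable_records(memory_bytes: int, block_bytes: int, record_bytes: int) -> int:
    """Upper bound on records sortable by two-phase, multiway merge-sort."""
    buffers = memory_bytes // block_bytes               # M / B buffers in memory
    records_per_sublist = memory_bytes // record_bytes  # M / R records per sorted sublist
    max_sublists = buffers - 1                          # one buffer reserved for output
    return records_per_sublist * max_sublists
```

With 100 MiB of memory, 16 KiB blocks and 160-byte records (sample figures chosen for illustration), the bound is 655,360 * 6,399 = 4,193,648,640 records, about 4.2 billion.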

 

 

2.4 Improving the Access Time of Secondary Storage

 

2.4.1 Organizing Data by Cylinders

 

2.4.2 Using Multiple Disks

 

2.4.3 Mirroring Disks

 

2.4.4 Disk Scheduling and the Elevator Algorithm

 

2.4.5 Prefetching and Large-Scale Buffering

 

2.4.6 Summary of Strategies and Tradeoffs

 

2.5 Disk Failures

 

2.5.1 Intermittent Failures

 

2.5.2 Checksums

 

2.5.3 Stable Storage

 

2.5.4 Error-Handling Capabilities of Stable Storage

 

2.6 Recovery from Disk Crashes

 

2.6.1 The Failure Model for Disks

 

2.6.2 Mirroring as a Redundancy Technique

RAID level 1

Probability

 

2.6.3 Parity Blocks

RAID level 4

 

2.6.4 An Improvement: RAID 5

 

2.6.5 Coping With Multiple Disk Crashes

RAID level 6

Hamming Code

 

2.7 Summary of Chapter 2

Memory Hierarchy:

Tertiary Storage:

Disk/Secondary Storage:

Blocks and Sectors:

Disk Controller:

Disk Access Time:

Moore’s Law:

Algorithms Using Secondary Storage:

Two-Phase, Multiway Merge-Sort:

Speeding Up Disk Access:

Elevator Algorithms:

Disk Failure Modes:

Checksums:

Stable Storage:

RAID:

 

 

Chapter 3 Representing Data Elements

3.1 Data Elements and Fields

 

3.1.1 Representing Relational Database Elements

 

3.1.2 Representing Objects

 

3.1.3 Representing Data Elements

 

Fixed-length

Variable-length: Length plus content / Null-terminated string
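The two variable-length representations can be illustrated with a short Python sketch. It assumes a one-byte length prefix (so at most 255 bytes of content) and UTF-8 text; both choices, and all function names, are assumptions made for the example.

```python
def encode_length_prefixed(text: str) -> bytes:
    """Length plus content: one length byte followed by the data."""
    data = text.encode("utf-8")
    if len(data) > 255:
        raise ValueError("content too long for a one-byte length prefix")
    return bytes([len(data)]) + data


def decode_length_prefixed(buf: bytes) -> str:
    length = buf[0]
    return buf[1:1 + length].decode("utf-8")


def encode_null_terminated(text: str) -> bytes:
    """Null-terminated string: the data followed by a zero byte."""
    data = text.encode("utf-8")
    if b"\x00" in data:
        raise ValueError("content must not contain the terminator byte")
    return data + b"\x00"


def decode_null_terminated(buf: bytes) -> str:
    end = buf.index(b"\x00")  # the first zero byte marks the end
    return buf[:end].decode("utf-8")
```

The length-prefixed form lets a reader skip the field without scanning it, while the null-terminated form needs no length byte but rules out the terminator inside the content.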

 

Dates and Times / Bits / Enumerated Types

 

 

3.2 Records

3.2.1 Building Fixed-Length Records

 

3.2.2 Record Headers

 

3.2.3 Packing Fixed-Length Records into Blocks

 

3.3 Representing Block and Record Addresses

 

3.3.1 Client-Server Systems

Database address space: physical address & logical address

 

3.3.2 Logical and Structured Addresses

 

3.3.3 Pointer Swizzling

Strategies

 

3.3.4 Returning Blocks to Disk

 

3.3.5 Pinned Records and Blocks

 

3.4 Variable-Length Data and Records

Types of variable-length data: data items whose size varies / Repeating fields / Variable-format records / enormous fields

 

3.4.1 Records With Variable-Length Fields

 

3.4.2 Records With Repeating Fields

 

3.4.3 Variable-Format Records

tag records

 

3.4.4 Records That Do Not Fit in a Block

 

3.4.5 BLOBS

Binary, large objects

 

 

3.5 Record Modifications

3.5.1 Insertion

 

Problems: make room for new records

Solving:

1

2

3.5.2 Deletion

 

3.5.3 Update

 

3.6 Summary of Chapter 3

Fields:

Records:

Variable-Length Records:

Blocks:

Spanned Records:

BLOBS:

Offset Tables:

Overflow Blocks:

Database Addresses:

Structured Addresses:

Pointer Swizzling:

Tombstones:

Pinned Blocks:

 

 

Chapter 4 Index Structures

 

4.1 Indexes on Sequential Files

 

4.1.1 Sequential Files

 

4.1.2 Dense Indexes

 

4.1.3 Sparse Indexes

 

4.1.4 Multiple Levels of Index

 

4.1.5 Indexes With Duplicate Search Keys

 

4.1.6 Managing Indexes During Data Modifications

 

4.2 Secondary Indexes

 

4.2.1 Design of Secondary Indexes

 

4.2.2 Applications of Secondary Indexes

 

4.2.3 Indirection in Secondary Indexes

 

4.2.4 Document Retrieval and Inverted Indexes

 

4.3 B-Trees

B+ tree

4.3.1 The Structure of B-Trees

 

4.3.2 Applications of B-trees

 

4.3.3 Lookup in B-Trees

 

4.3.4 Range Queries

 

4.3.5 Insertion Into B-Trees

Principle (recursive):

Overflow

4.3.6 Deletion From B-Trees

 

4.3.7 Efficiency of B-Trees

 

4.4 Hash Tables

 

4.4.1 Secondary-Storage Hash Tables

 

4.4.2 Insertion Into a Hash Table

 

4.4.3 Hash-Table Deletion

 

4.4.4 Efficiency of Hash Table Indexes

Static/Dynamic hash tables

4.4.5 Extensible Hash Tables

 

4.4.6 Insertion Into Extensible Hash Tables

 

4.4.7 Linear Hash Tables

Defects of Extensible hash tables:

Linear hashing grows the number of buckets more slowly

 

4.4.8 Insertion Into Linear Hash Tables

 

4.5 Summary of Chapter 4

Sequential Files:

Dense Indexes:

Sparse Indexes:

Multilevel Indexes:

Expanding Files:

Secondary Indexes:

Inverted Indexes:

B-trees:

Range Queries:

Hash Tables:

Dynamic Hashing:

Extensible Hashing:

Linear Hashing:

 

 

 

 

 

Chapter 5 Multidimensional Indexes

 

5.1 Applications Needing Multiple Dimensions

 

5.1.1 Geographic Information Systems

Partial match queries / Range queries / Nearest-neighbor queries / Where-am-I queries

 

5.1.2 Data Cubes

 

5.1.3 Multidimensional Queries in SQL

 

5.1.4 Executing Range Queries Using Conventional Indexes

 

5.1.5 Executing Nearest-Neighbor Queries Using Conventional Indexes

 

5.1.6 Other Limitations of Conventional Indexes

 

5.1.7 Overview of Multidimensional Index Structures

Data structures for supporting queries on multidimensional data:

1. Hash-table-like approaches

2. Tree-like approaches

 

5.2 Hash-Like Structures for Multidimensional Data

 

5.2.1 Grid Files

 

5.2.2 Lookup in a Grid File

 

5.2.3 Insertion Into Grid Files

 

5.2.4 Performance of Grid Files

 

5.2.5 Partitioned Hash Functions

 

5.2.6 Comparison of Grid Files and Partitioned Hashing

 

5.3 Tree-Like Structures for Multidimensional Data

 

5.3.1 Multiple-Key Indexes

 

5.3.2 Performance of Multiple-Key Indexes

 

5.3.3 kd-Trees

 

5.3.4 Operations on kd-Trees

 

5.3.5 Adapting kd-Trees to Secondary Storage

 

5.3.6 Quad Trees

 

5.3.7 R-Trees

 

5.3.8 Operations on R-trees

 

5.4 Bitmap Indexes

 

5.4.1 Motivation for Bitmap Indexes

 

5.4.2 Compressed Bitmaps

Run-length encoding
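One possible run-length scheme for sparse bit-vectors is sketched below in Python. Each run of i zeros followed by a 1 is written as a unary length marker (j-1 ones and a zero, where j is the number of bits in the binary form of i) followed by i in binary. Treat the details as an assumption about the encoding rather than a definitive description; the function names are illustrative.

```python
def encode_run(zeros: int) -> str:
    """Encode a run of `zeros` 0-bits that is terminated by a single 1-bit."""
    binary = format(zeros, "b")  # "0" when the run is empty
    j = len(binary)              # number of bits needed for the run length
    return "1" * (j - 1) + "0" + binary


def encode_bitvector(bits: str) -> str:
    """Concatenate the codes of all runs; trailing 0s are not encoded."""
    codes = []
    run = 0
    for bit in bits:
        if bit == "0":
            run += 1
        else:
            codes.append(encode_run(run))
            run = 0
    return "".join(codes)


def decode_bitvector(code: str) -> str:
    """Inverse of encode_bitvector (up to the dropped trailing 0s)."""
    out = []
    pos = 0
    while pos < len(code):
        j = 1
        while code[pos] == "1":  # unary part: j-1 ones ...
            j += 1
            pos += 1
        pos += 1                 # ... followed by a single zero
        zeros = int(code[pos:pos + j], 2)
        pos += j
        out.append("0" * zeros + "1")
    return "".join(out)
```

For example, `encode_bitvector("000101")` yields `"101101"`: the run of three zeros becomes `1011` and the run of one zero becomes `01`. Because trailing zeros are dropped, the original vector length has to be known separately.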

 

5.4.3 Operating on Run-Length-Encoded Bit-Vectors

 

5.4.4 Managing Bitmap Indexes

 

Finding Bit-Vectors / Finding Records / Handling Modifications to the Data File

 

 

5.5 Summary of Chapter 5

Multidimensional Data:

Queries Needing Multidimensional Indexes:

Executing Nearest-Neighbor Queries:

Grid Files:

Partitioned Hash Tables:

Multiple-Key Indexes:

kd-Trees:

Quad Trees (Octrees):

R-Trees:

Bitmap Indexes:

Compressed Bitmaps:

 

 

Chapter 6 Query Execution

 

Logical query plans

Query optimizer

Parsing -> Query rewrite -> Physical plan generation

 

The major parts of the query processor (Figure 6.1)

Outline of query compilation (Figure 6.2)

 

6.1 An Algebra for Queries

 

Relational algebra operators: Union, intersection, and difference / Selection / Projection / Product / Join / Duplicate elimination / Grouping / Sorting

6.1.1 Union, Intersection and Difference

Set and Bag versions
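The difference between the set and bag versions can be shown with Python's `collections.Counter`, which behaves like a bag: under bag semantics union adds multiplicities, intersection takes the minimum, and difference subtracts without going below zero. The helper names are illustrative.

```python
from collections import Counter


def bag_union(r, s):
    """A tuple appearing n times in r and m times in s appears n + m times."""
    return Counter(r) + Counter(s)


def bag_intersection(r, s):
    """A tuple appears min(n, m) times."""
    return Counter(r) & Counter(s)


def bag_difference(r, s):
    """A tuple appears max(0, n - m) times."""
    return Counter(r) - Counter(s)


def set_union(r, s):
    """Set version: duplicates are eliminated."""
    return set(r) | set(s)
```

With `r = ["a", "a", "b"]` and `s = ["a", "b", "b", "c"]`, the bag union holds `a` and `b` three times each and `c` once, while the set union is just `{"a", "b", "c"}`.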

6.1.2 The Selection Operator

 

6.1.3 The Projection Operator

 

6.1.4 The Product of Relations

 

6.1.5 Joins

 

6.1.6 Duplicate Elimination

 

6.1.7 Grouping and Aggregation

 

6.1.8 The Sorting Operator

 

6.1.9 Expression Trees

 

 

6.2 Introduction to Physical-Query-Plan Operators

 

6.2.1 Scanning Tables

table-scan / index-scan

 

6.2.2 Sorting While Scanning Tables

 

6.2.3 The Model of Computation for Physical Operators

 

6.2.4 Parameters for Measuring Costs

B(R), T(R), V(R, a)
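These parameters feed the size estimates used later for cost-based optimization. A minimal sketch, assuming values are uniformly distributed: an equality selection on attribute a keeps about T(R)/V(R, a) tuples, and a natural join of R and S on attribute y produces about T(R)*T(S)/max(V(R, y), V(S, y)) tuples. The function names are hypothetical.

```python
def estimate_equality_selection(t_r: int, v_r_a: int) -> float:
    """Estimated tuples in the selection a = constant on R: T(R) / V(R, a)."""
    return t_r / v_r_a


def estimate_natural_join(t_r: int, t_s: int, v_r_y: int, v_s_y: int) -> float:
    """Estimated tuples in the natural join of R and S on a shared attribute y."""
    return t_r * t_s / max(v_r_y, v_s_y)
```

For instance, with T(R) = 1000 and V(R, a) = 50, the selection estimate is 20 tuples.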

6.2.5 I/O Cost for Scan Operators

 

6.2.6 Iterators for Implementation of Physical Operators

Open / GetNext / Close
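The three iterator methods can be sketched in Python as follows. The classes below are illustrative only: `TableScan` reads from an in-memory list instead of disk blocks, `Select` filters whatever its child produces, and `None` is used as the "no more tuples" signal.

```python
class TableScan:
    """Leaf iterator over an in-memory list standing in for a stored relation."""

    def __init__(self, rows):
        self._rows = rows
        self._pos = None

    def open(self):
        self._pos = 0

    def get_next(self):
        if self._pos is None:
            raise RuntimeError("iterator is not open")
        if self._pos >= len(self._rows):
            return None  # signals that the input is exhausted
        row = self._rows[self._pos]
        self._pos += 1
        return row

    def close(self):
        self._pos = None


class Select:
    """Selection operator that pulls tuples from a child iterator."""

    def __init__(self, child, predicate):
        self._child = child
        self._predicate = predicate

    def open(self):
        self._child.open()

    def get_next(self):
        while True:
            row = self._child.get_next()
            if row is None or self._predicate(row):
                return row

    def close(self):
        self._child.close()


def run(plan):
    """Drive a plan to completion and collect its output."""
    plan.open()
    result = []
    row = plan.get_next()
    while row is not None:
        result.append(row)
        row = plan.get_next()
    plan.close()
    return result
```

Because each operator only calls `get_next` on its child, tuples flow through the plan one at a time without materializing intermediate results.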

 

6.3 One-Pass Algorithms for Database Operations

 

Classify operators into three broad groups

 

6.3.1 One-Pass Algorithms for Tuple-at-a-Time Operations

 

6.3.2 One-Pass Algorithms for Unary, Full-Relation Operations

 

6.3.3 One-Pass Algorithms for Binary Operations

 

6.4 Nested-Loop Joins

 

6.4.1 Tuple-Based Nested-Loop Join

 

6.4.2 An Iterator for Tuple-Based Nested-Loop Join

 

6.4.3 A Block-Based Nested-Loop Join Algorithm

 

6.4.4 Analysis of Nested-Loop Join

 

6.4.5 Summary of Algorithms so Far

Main memory and disk I/O requirements for one-pass and nested-loop algorithms (Figure 6.14)

 

6.5 Two-Pass Algorithms Based on Sorting

 

6.5.1 Duplicate Elimination Using Sorting

 

6.5.2 Grouping and Aggregation Using Sorting

 

6.5.3 A Sort-Based Union Algorithm

 

6.5.4 Sort-Based Algorithms for Intersection and Difference

 

6.5.5 A Simple Sort-Based Join Algorithm

 

6.5.6 Analysis of Simple Sort-Join

 

6.5.7 A More Efficient Sort-Based Join

 

6.5.8 Summary of Sort-Based Algorithms

 

Figure 6.16

 

6.6 Two-Pass Algorithms Based on Hashing

 

6.6.1 Partitioning Relations by Hashing

 

6.6.2 A Hash-Based Algorithm for Duplicate Elimination

 

6.6.3 A Hash-Based Algorithm for Grouping and Aggregation

 

6.6.4 Hash-Based Algorithms for Union, Intersection, and Difference

 

6.6.5 The Hash-Join Algorithm

 

6.6.6 Saving Some Disk I/O’s

 

6.6.7 Summary of Hash-Based Algorithms

Figure 6.18

 

6.7 Index-Based Algorithms

 

6.7.1 Clustering and Nonclustering Indexes

 

6.7.2 Index-Based Selection

 

6.7.3 Joining by Using an Index

 

6.7.4 Joins Using a Sorted Index

zig-zag Join

 

6.8 Buffer Management

 

6.8.1 Buffer Management Architecture

Buffer pool

6.8.2 Buffer Management Strategies

LRU / FIFO / The “Clock” Algorithm / System Control
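Of these strategies, the "clock" algorithm is the least obvious, so here is a minimal Python sketch of it. It ignores pinned blocks and dirty-block write-back, and the class and method names are hypothetical. Each frame carries a reference bit; the hand clears bits as it sweeps and evicts the first frame whose bit is already 0.

```python
class ClockBufferPool:
    """Simplified clock (second-chance) replacement; no pinning, no write-back."""

    def __init__(self, capacity):
        self.frames = [None] * capacity  # block held by each frame
        self.ref = [0] * capacity        # reference bit of each frame
        self.where = {}                  # block -> frame index
        self.hand = 0

    def access(self, block):
        """Touch `block`; return True on a buffer hit, False on a miss."""
        if block in self.where:
            self.ref[self.where[block]] = 1
            return True

        n = len(self.frames)
        # Sweep: give referenced frames a second chance by clearing their bit.
        while self.frames[self.hand] is not None and self.ref[self.hand] == 1:
            self.ref[self.hand] = 0
            self.hand = (self.hand + 1) % n

        victim = self.frames[self.hand]
        if victim is not None:
            del self.where[victim]
        self.frames[self.hand] = block
        self.ref[self.hand] = 1
        self.where[block] = self.hand
        self.hand = (self.hand + 1) % n
        return False
```

With three frames and the access sequence A, B, C, A, D, B, E, the pool ends up holding B, D and E: D replaces A after a full sweep clears every bit, and E replaces C because B was referenced again in the meantime.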

 

6.8.3 The Relationship Between Physical Operator Selection and Buffer Management

 

 

6.9 Algorithms Using More Than Two Passes

 

6.9.1 Multipass Sort-Based Algorithms

Basis -> Induction

6.9.2 Performance of Multipass, Sort-Based Algorithms

 

6.9.3 Multipass Hash-Based Algorithms

 

6.9.4 Performance of Multipass Hash-Based Algorithms

 

6.10 Parallel Algorithms for Relational Operations

 

6.10.1 Models of Parallelism

Shared Memory / Shared Disk / Shared Nothing

 

6.10.2 Tuple-at-a-Time Operations in Parallel

 

6.10.3 Parallel Algorithms for Full-Relation Operations

 

6.10.4 Performance of Parallel Algorithms

 

6.11 Summary of Chapter 6

Query Processing:

Query Plans:

Extended Relational Algebra:

Table Scanning:

Cost Measures for Physical Operators:

Iterators:

One-Pass Algorithms:

Nested-Loop Join:

Two-Pass Algorithms:

Sort-Based Algorithms:

Hash-Based Algorithms:

Hash- Versus Sort-Based Algorithms:

The Buffer Manager:

Coping With Variable Numbers of Buffers:

Multipass Algorithms:

Parallel Machines:

Parallel Algorithms:

 

 

Chapter 7 The Query Compiler

 

7.1 Parsing

From a query to a logical query plan (Figure 7.1)

 

7.1.1 Syntax Analysis and Parse Trees

 

7.1.2 A Grammar for a Simple Subset of SQL

<SFW> -> Parse Tree

 

7.1.3 The Preprocessor

 

 

7.2 Algebraic Laws for Improving Query Plans

 

7.2.1 Commutative and Associative Laws

 

7.2.2 Laws Involving Selection

 

7.2.3 Pushing Selection

 

7.2.4 Laws Involving Projection

 

7.2.5 Laws About Joins and Products

 

7.2.6 Laws Involving Duplicate Elimination

 

7.2.7 Laws Involving Grouping and Aggregation

 

7.3 From Parse Trees to Logical Query Plans

 

7.3.1 Conversion to Relational Algebra

 

7.3.2 Removing Subqueries From Conditions

 

7.3.3 Improving the Logical Query Plan

 

7.3.4 Grouping Associative / Commutative Operators

 

7.4 Estimating the Cost of Operations

 

7.4.1 Estimating Sizes of Intermediate Relations

 

7.4.2 Estimating the Size of a Projection

 

7.4.3 Estimating the Size of a Selection

 

7.4.4 Estimating the Size of a Join

 

7.4.5 Natural Joins With Multiple Join Attributes

 

7.4.6 Joins of Many Relations

 

7.4.7 Estimating Sizes for Other Operations

 

7.5 Introduction to Cost-Based Plan Selection

 

7.5.1 Obtaining Estimates for Size Parameters

Most common types of histograms: Equal-width / Equal-height / Most-frequent-values

 

7.5.2 Incremental Computation of Statistics

 

7.5.3 Heuristics for Reducing the Cost of Logical Query Plans

 

7.5.4 Approaches to Enumerating Physical Plans

Heuristic Selection / Branch-and-Bound Plan Enumeration / Hill Climbing / Dynamic Programming / Selinger-Style Optimization

 

7.6 Choosing an Order for Joins

 

7.6.1 Significance of Left and Right Join Arguments

 

7.6.2 Join Trees

 

7.6.3 Left-Deep Join Trees

 

7.6.4 Dynamic Programming to Select a Join Order and Grouping

 

7.6.5 Dynamic Programming With More Detailed Cost Functions

 

7.6.6 A Greedy Algorithm for Selecting a Join Order

 

7.7 Completing the Physical-Query-Plan Selection

 

7.7.1 Choosing a Selection Method

 

7.7.2 Choosing a Join Method

 

7.7.3 Pipelining Versus Materialization

 

7.7.4 Pipelining Unary Operations

 

7.7.5 Pipelining Binary Operations

 

7.7.6 Notation for Physical Query Plans

 

7.7.7 Ordering of Physical Operations

 

7.8 Summary of Chapter 7

Compilation of Queries:

The Parser:

Semantic Checking:

Conversion to a Logical Query Plan:

Algebraic Transformations:

Choosing a Logical Query Plan:

Estimating Sizes of Relations:

Histograms:

Cost-Based Optimization:

Plan-Enumeration Strategies:

Left-Deep Join Trees:

Physical Plans for Selection:

Pipelining Versus Materialization:

 

Chapter 8 Coping With System Failures

 

8.1 Issues and Models for Resilient Operation

 

8.1.1 Failure Modes

 

Erroneous Data Entry / Media Failures / Catastrophic Failure / System Failures

 

8.1.2 More About Transactions

 

 

COMMIT ROLLBACK

 

The log manager and transaction manager (Figure 8.1)

 

8.1.3 Correct Execution of Transactions

 

Fundamental assumption about transaction: The Correctness Principle

 

8.1.4 The Primitive Operations of Transactions

INPUT READ WRITE OUTPUT

 

8.2 Undo Logging

 

8.2.1 Log Records

 

Log manager

START COMMIT ABORT

 

8.2.2 The Undo-Logging Rules

 

8.2.3 Recovery Using Undo Logging
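A minimal sketch of undo recovery in Python, under simplifying assumptions: the log is a list of tuples, update records carry the old value of the element, and the database is a dictionary. The record layout and names are invented for the example. The log is scanned from the end toward the beginning, and every update by a transaction with no COMMIT or ABORT record is reversed.

```python
def undo_recover(log, db):
    """Restore old values written by incomplete transactions.

    Assumed record shapes (hypothetical):
      ("START", T), ("COMMIT", T), ("ABORT", T), ("UPDATE", T, element, old_value)
    """
    finished = set()  # transactions with a COMMIT or ABORT record
    for record in reversed(log):
        kind, txn = record[0], record[1]
        if kind in ("COMMIT", "ABORT"):
            finished.add(txn)
        elif kind == "UPDATE" and txn not in finished:
            _, _, element, old_value = record
            db[element] = old_value  # undo: put the old value back
    return db
```

Scanning backwards matters: if an incomplete transaction wrote the same element twice, the earliest old value is restored last and therefore wins. A full implementation would also append an ABORT record for each incomplete transaction and flush the log.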

 

8.2.4 Checkpointing

 

8.2.5 Nonquiescent Checkpointing

START CKPT / END CKPT

 

8.3 Redo Logging

Differences between undo and redo

 

8.3.1 The Redo-Logging Rules

 

8.3.2 Recovery With Redo Logging

 

8.3.3 Checkpointing a Redo Log

 

8.3.4 Recovery With a Checkpointed Redo Log

 

8.4 Undo/Redo Logging

 

8.4.1 The Undo/Redo Rules

 

8.4.2 Recovery With Undo/Redo Logging

 

8.4.3 Checkpointing an Undo/Redo Log

 

8.5 Protecting Against Media Failures

 

8.5.1 The Archive

 

Full dump / incremental dump

 

8.5.2 Nonquiescent Archiving

 

8.5.3 Recovery Using an Archive and Log

 

8.6 Summary of Chapter 8

Transaction Management:

Database Elements:

Logging:

Recovery:

Logging Methods:

Undo Logging:

Redo Logging:

Undo/Redo Logging:

Checkpointing:

Nonquiescent Checkpointing:

Archiving:

Incremental Backups:

Nonquiescent Archiving:

Recovery From Media Failures:

 

 

Chapter 9 Concurrency Control

 

9.1 Serial and Serializable Schedules

 

9.1.1 Schedules

 

9.1.2 Serial Schedules

 

9.1.3 Serializable Schedules

 

9.1.4 The Effect of Transaction Semantics

 

9.1.5 A Notation for Transactions and Schedules

 

9.2 Conflict-Serializability

 

9.2.1 Conflicts

 

9.2.2 Precedence Graphs and a Test for Conflict-Serializability

 

Precedence graph
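The precedence-graph test can be sketched in Python as follows. A schedule is assumed to be a list of `(transaction, action, element)` triples with action `"r"` or `"w"`; this representation and the function names are choices made for the example. An edge Ti -> Tj is added whenever an action of Ti precedes a conflicting action of Tj, and the schedule is conflict-serializable exactly when the graph has no cycle.

```python
def precedence_edges(schedule):
    """Edges Ti -> Tj for each pair of conflicting actions, Ti's action first."""
    edges = set()
    for i, (ti, ai, xi) in enumerate(schedule):
        for tj, aj, xj in schedule[i + 1:]:
            # Conflict: different transactions, same element, at least one write.
            if ti != tj and xi == xj and "w" in (ai, aj):
                edges.add((ti, tj))
    return edges


def is_conflict_serializable(schedule):
    """True if the precedence graph is acyclic (checked by topological removal)."""
    edges = precedence_edges(schedule)
    nodes = {txn for txn, _, _ in schedule}
    indegree = {node: 0 for node in nodes}
    for _, target in edges:
        indegree[target] += 1

    ready = [node for node in nodes if indegree[node] == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for source, target in edges:
            if source == node:
                indegree[target] -= 1
                if indegree[target] == 0:
                    ready.append(target)
    return removed == len(nodes)  # leftover nodes would lie on a cycle
```

For the schedule r2(A); r1(B); w2(A); r3(A); w1(B); w3(A); r2(B); w2(B), the edges are T1 -> T2 and T2 -> T3, so the schedule is conflict-serializable in the order T1, T2, T3.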

 

9.2.3 Why the Precedence-Graph Test Works

Basis -> Induction

 

9.3 Enforcing Serializability by Locks

 

9.3.1 Locks

 

9.3.2 The Locking Scheduler

 

9.3.3 Two-Phase Locking
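The two-phase condition says that, within one transaction, every lock request comes before every unlock request. A minimal Python check, assuming a transaction is given as a list of `(operation, element)` pairs with operations such as `"lock"`, `"unlock"`, `"read"` and `"write"` (a hypothetical encoding):

```python
def is_two_phase(actions):
    """True if no lock is requested after the first unlock."""
    unlocked = False
    for operation, _element in actions:
        if operation == "unlock":
            unlocked = True    # the shrinking phase has begun
        elif operation == "lock" and unlocked:
            return False       # a lock after an unlock violates 2PL
    return True
```

For example, lock A, lock B, unlock A, unlock B satisfies the condition, while lock A, unlock A, lock B, unlock B does not.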

 

9.3.4 Why Two-Phase Locking Works

 

9.4 Locking Systems With Several Lock Modes

 

9.4.1 Shared and Exclusive Locks

Consistency and 2PL for transactions, and legality for schedules

 

9.4.2 Compatibility Matrices

 

9.4.3 Upgrading Locks

 

9.4.4 Update Locks

 

9.4.5 Increment Locks

 

9.5 An Architecture for a Locking Scheduler

 

9.5.1 A Scheduler That Inserts Lock Actions

 

9.5.2 The Lock Table

Structures of lock-table entries (Figure 9.26)

 

9.6 Managing Hierarchies of Database Elements

 

9.6.1 Locks With Multiple Granularity

 

9.6.2 Warning Locks

 

9.6.3 Phantoms and Handling Insertions Correctly

 

9.7 The Tree Protocol

 

9.7.1 Motivation for Tree-Based Locking

 

9.7.2 Rules for Access to Tree-Structured Data

 

9.7.3 Why the Tree Protocol Works

 

 

9.8 Concurrency Control by Timestamps

 

9.8.1 Timestamps

 

9.8.2 Physically Unrealizable Behaviors

 

9.8.3 Problems With Dirty Data

 

9.8.4 The Rules for Timestamp-Based Scheduling

 

9.8.5 Multiversion Timestamps

 

9.8.6 Timestamps and Locking

 

9.9 Concurrency Control by Validation

 

9.9.1 Architecture of a Validation-Based Scheduler

 

9.9.2 The Validation Rules

 

9.9.3 Comparison of Three Concurrency-Control Mechanisms

 

9.10 Summary of Chapter 9

Consistent Database:

Consistency of Concurrent Transactions:

Schedules:

Serial Schedules:

Serializable Schedules:

Conflict-Serializability:

Precedence Graphs:

Locking:

Two-Phase Locking:

Lock Modes:

Compatibility Matrices:

Update Locks:

Increment Locks:

Locking Elements With a Granularity Hierarchy:

Locking Elements Arranged in a Tree:

Optimistic Concurrency Control:

Timestamp-Based Schedulers:

Validation-Based Schedulers:

Multiversion Timestamps:

 

 

Chapter 10 More About Transaction Management

 

10.1 Transactions that Read Uncommitted Data

 

10.1.1 The Dirty-Data Problem

 

10.1.2 Cascading Rollback

 

10.1.3 Managing Rollbacks

 

10.1.4 Group Commit

 

10.1.5 Logical Logging

 

10.2 View Serializability

 

10.2.1 View Equivalence

 

10.2.2 Polygraphs and the Test for View-Serializability

 

10.2.3 Testing for View-Serializability

 

10.3 Resolving Deadlocks

 

10.3.1 Deadlock Detection by Timeout

 

10.3.2 The Waits-For Graph

 

10.3.3 Deadlock Prevention by Ordering Elements

 

10.3.4 Detecting Deadlocks by Timestamps

 

10.3.5 Comparison of Deadlock-Management Methods

 

10.4 Distributed Databases

 

10.4.1 Distribution of Data

 

10.4.2 Distributed Transactions

 

10.4.3 Data Replication

 

10.4.4 Distributed Query Optimization

 

10.5 Distributed Commit

 

10.5.1 Supporting Distributed Atomicity

 

10.5.2 Two-Phase Commit

Coordinator
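The coordinator's decision logic can be sketched in a few lines of Python. This is a deliberately stripped-down model: each participant is a callable that returns True when it is ready to commit, and logging, timeouts and site failures are left out. All names are hypothetical.

```python
def two_phase_commit(participants):
    """Run both phases and return (decision, votes).

    `participants` maps a site name to a callable returning True ("ready")
    or False ("don't commit").
    """
    # Phase 1: the coordinator asks every site to prepare and collects votes.
    votes = {site: bool(vote()) for site, vote in participants.items()}

    # Phase 2: commit only if every site voted ready; otherwise abort everywhere.
    decision = "commit" if all(votes.values()) else "abort"
    return decision, votes
```

A single negative vote is enough to abort the whole distributed transaction, which is what keeps the outcome atomic across sites.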

 

10.5.3 Recovery of Distributed Transactions

 

10.6 Distributed Locking

 

10.6.1 Centralized Lock Systems

 

10.6.2 A Cost Model for Distributed Locking Algorithms

 

10.6.3 Locking Replicated Elements

 

10.6.4 Primary-Copy Locking

 

10.6.5 Global Locks From Local Locks

Read-Locks-One / Majority Locking

 

10.7 Long-Duration Transactions

 

10.7.1 Problems of Long Transactions

 

10.7.2 Sagas

 

10.7.3 Compensating Transactions

 

10.7.4 Why Compensating Transactions Work

 

10.8 Summary of Chapter 10

Dirty Data:

Cascading Rollback:

Strict Locking:

Group Commit:

Restoring Database State After an Abort:

Logical Logging:

View Serializability:

Polygraphs:

Deadlocks:

Waits-For Graphs:

Deadlock Avoidance by Ordering Resources:

Timestamp-Based Deadlock Avoidance:

Distributed Data:

Distributed Transactions:

Two-Phase Commit:

Distributed Locks:

Locking Replicated Data:

Sagas:

Compensating Transactions:

 

Chapter 11 Information Integration

 

11.1 Modes of Information Integration

Federated databases / Warehousing / Mediation

 

11.1.1 Problems of Information Integration

 

Data type differences / Value differences / Semantic differences / Missing values

 

11.1.2 Federated Database Systems

 

11.1.3 Data Warehouses

 

11.1.4 Mediators

 

11.2 Wrappers in Mediator-Based Systems

 

11.2.1 Templates for Query Patterns

 

11.2.2 Wrapper Generators

 

11.2.3 Filters

 

11.2.4 Other Operations at the Wrapper

 

11.3 On-Line Analytic Processing

OLAP

OLTP

11.3.1 OLAP Applications

 

11.3.2 A Multidimensional View of OLAP Data

 

ROLAP MOLAP

 

11.3.3 Star Schemas

 

11.3.4 Slicing and Dicing

 

11.4 Data Cubes

MOLAP

11.4.1 The Cube Operator

 

11.4.2 Cube Implementation by Materialized Views

 

11.4.3 The Lattice of Views

 

11.5 Data Mining

KDD

 

11.5.1 Data-Mining Applications

 

11.5.2 Association-Rule Mining

 

11.5.3 The A-Priori Algorithm
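A level-wise sketch of the A-Priori idea in Python: an itemset can only be frequent if all of its subsets are frequent, so candidates of size k are built only from frequent itemsets of size k-1. The function below is a generic in-memory illustration with hypothetical names, not the formulation used in the book; it assumes items are sortable (for example strings).

```python
from itertools import combinations


def apriori(baskets, min_support):
    """Return {itemset: support count} for all itemsets meeting `min_support`."""
    baskets = [frozenset(basket) for basket in baskets]

    # Pass 1: count single items.
    counts = {}
    for basket in baskets:
        for item in basket:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s for s, c in counts.items() if c >= min_support}
    result = {s: counts[s] for s in frequent}

    k = 2
    while frequent:
        items = sorted({item for itemset in frequent for item in itemset})
        # Candidate k-sets: every (k-1)-subset must already be frequent.
        candidates = [
            frozenset(combo)
            for combo in combinations(items, k)
            if all(frozenset(sub) in frequent for sub in combinations(combo, k - 1))
        ]
        # Pass k: count only the surviving candidates.
        level = {}
        for basket in baskets:
            for candidate in candidates:
                if candidate <= basket:
                    level[candidate] = level.get(candidate, 0) + 1
        frequent = {s for s, c in level.items() if c >= min_support}
        result.update({s: level[s] for s in frequent})
        k += 1
    return result
```

The pruning step is what saves work: a pair is never counted unless both of its items were frequent in the previous pass.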

 

11.6 Summary of Chapter 11

Integration of Information:

Approaches to Information Integration:

Extractors and Wrappers:

Wrapper Generators:

OLAP:

ROLAP and MOLAP:

Star Schemas:

The Cube Operator:

Dimension Lattices and Materialized Views:

Data Mining:

The A-Priori Algorithm:

 

 

 

 

 

 

 

 

 
