Database Principles Review Notes

songtitan

Chapter 1: Introduction

File System vs. Database System

Drawbacks of using a file system to store data:

Data redundancy and inconsistency

Difficulty in accessing data

Data isolation

Integrity problems

Atomicity of updates

Concurrent access by multiple users

Security problems

Levels of abstraction in a database system: Physical, Logical, View

Some concepts:

Schema: the logical structure of the database

Instance: the actual content of the database at a particular point in time

Physical Data Independence – the ability to modify the physical schema without changing the logical schema

Entity Relationship Model

Entities and Relationships between entities.

DDL

DDL compiler generates a set of tables stored in a data dictionary

DML

SQL: widely used non-procedural language
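
The DDL/DML split can be tried out with Python's built-in sqlite3 module. This is only a sketch: the account table and its rows are made-up examples, and SQLite's sqlite_master catalog is used here as a stand-in for the data dictionary mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# DDL: define the schema (the logical structure).
conn.execute(
    "CREATE TABLE account (account_number TEXT PRIMARY KEY, balance INTEGER)"
)

# DML: change and query the instance (the content at this point in time).
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [("A-101", 500), ("A-215", 700)],
)
rows = conn.execute(
    "SELECT account_number FROM account WHERE balance > 600"
).fetchall()

# The catalog records which tables the DDL statements created.
catalog = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

print(rows)     # [('A-215',)]
print(catalog)  # [('account',)]
```

The SELECT statement only says which rows are wanted, not how to find them, which is what "non-procedural" means here.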

Chapter 2: Entity-Relationship Model

Entity: an object that is distinguishable from other objects

Entity Set: a set of entities of the same type that share the same properties

Domain – the set of permitted values for each attribute

Attribute types: simple and composite attributes; single-valued and multi-valued attributes; derived attributes

Relationship: an association among several entities

Relationship Set: a mathematical relation among n >= 2 entities, each taken from an entity set

Degree of a Relationship Set:

Relationship sets that involve two entity sets are binary (or degree two). Generally, most relationship sets in a database system are binary.

E-R Diagrams:

Rectangles represent entity sets.

Chapter 13: Query Processing

Basic Steps (see the picture in the ppt):

1. Parsing and translation

Parsing checks syntax and verifies relations. The query is translated into its internal form, which is then translated into relational algebra.

2. Optimization

A query can be written as many equivalent relational-algebra expressions, each of which can be evaluated in several ways; the optimizer chooses the evaluation plan with the lowest estimated cost.

3. Evaluation

The query-execution engine takes a query-evaluation plan, executes that plan, and returns the answers to the query.

Measure of Query cost:

Time cost: disk accesses, CPU time, or network communication

Selection Operation:

File Scan:

(br denotes the number of blocks containing records from relation r)

A1 (Linear search): scan each file block and test every record against the selection condition. Cost = br, or br / 2 on average when the selection is an equality on a key attribute (the scan can stop at the first match).

A2 (Binary search): applicable if the selection is an equality comparison on the attribute on which the file is ordered. Cost = ⌈log2(br)⌉
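
The two file-scan costs can be compared with a small calculation. This is only a sketch: the function names and the sample value b_r = 1024 are illustrative assumptions, not part of the notes.

```python
import math


def linear_search_cost(b_r, on_key=False):
    """A1: block reads for a linear scan.

    An equality test on a key attribute stops at the first match,
    so on average only half of the blocks are read.
    """
    return b_r / 2 if on_key else b_r


def binary_search_cost(b_r):
    """A2: block reads to locate the first matching block of an ordered file."""
    return math.ceil(math.log2(b_r))


b_r = 1024  # hypothetical number of blocks in relation r
print(linear_search_cost(b_r))               # 1024
print(linear_search_cost(b_r, on_key=True))  # 512.0
print(binary_search_cost(b_r))               # 10
```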

Index scan – search algorithms that use an index

A3 (primary index on candidate key, equality):

Cost = HTi + 1 (HTi denotes the height of index i)

A4 (primary index on non-key, equality):

Cost = HTi + number of blocks containing retrieved records

A5 (equality on search-key of secondary index):

Retrieve a single record if the search-key is a candidate key

Cost = HTi + 1

Retrieve multiple records if the search-key is not a candidate key

Cost = HTi + number of records retrieved (each record may lie in a different block)
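
The index-scan formulas differ only in what is counted after the index has been traversed. A rough sketch, where the function names and the sample numbers (index height 3, 50 matching records packed into 5 blocks) are assumptions chosen for illustration:

```python
def primary_index_cost(height, matching_blocks=1):
    """A3/A4: traverse the index, then read the consecutive blocks that hold
    the matching records (one block when the search key is a candidate key)."""
    return height + matching_blocks


def secondary_index_cost(height, matching_records=1):
    """A5: traverse the index, then follow one pointer per matching record,
    since each record may sit in a different block."""
    return height + matching_records


print(primary_index_cost(3))                         # 4
print(primary_index_cost(3, matching_blocks=5))      # 8
print(secondary_index_cost(3, matching_records=50))  # 53
```

With the same 50 matching records, the secondary index costs far more than the primary index because the records are not stored next to each other.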

Selections Involving Comparisons (relation is sorted on A)

A6 (primary index, comparison):

For σA≥v(r), use the index to find the first tuple >= v and scan the relation sequentially from there.

For σA≤v(r), just scan the relation sequentially until the first tuple > v; do not use the index.

A7 (secondary index, comparison):

For σA≥v(r), use the index to find the first index entry >= v and scan the index sequentially from there to find pointers to records.

For σA≤v(r), just scan the leaf pages of the index, collecting pointers to records, until the first entry > v.
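
The A6 strategy can be imitated on an in-memory sorted list. This is a sketch under simplifying assumptions: a Python list stands in for the sorted relation, bisect_left stands in for the index lookup, and the function names are made up.

```python
from bisect import bisect_left


def select_ge(sorted_values, v):
    """A >= v: use the 'index' to jump to the first value >= v, then scan on."""
    start = bisect_left(sorted_values, v)  # stand-in for the index lookup
    return sorted_values[start:]


def select_le(sorted_values, v):
    """A <= v: scan from the beginning and stop at the first value > v."""
    result = []
    for value in sorted_values:
        if value > v:
            break  # everything after this point is larger, so stop scanning
        result.append(value)
    return result


data = [10, 20, 30, 40, 50]
print(select_ge(data, 30))  # [30, 40, 50]
print(select_le(data, 30))  # [10, 20, 30]
```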

Complex Selections: Conjunction

A8(conjunctive selection using one index).

Select a combination of θi and algorithms A1 through A7 that results in the least cost, then test the remaining conditions on each tuple after fetching it into memory.

A9 (conjunctive selection using multiple-key index).

Use appropriate composite (multiple-key) index if available

A10 (conjunctive selection by intersection of identifiers).

Requires indices with record pointers.

Use corresponding index for each condition, and take intersection of all the obtained sets of record pointers.

Then fetch records from file
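
A small sketch of A10, assuming each index is a dictionary from attribute value to a list of record identifiers (positions in a list). The helper build_index, the sample records, and the attribute names are all hypothetical.

```python
def build_index(records, attr):
    """Map each value of attr to the list of record identifiers holding it."""
    index = {}
    for rid, record in enumerate(records):
        index.setdefault(record[attr], []).append(rid)
    return index


def conjunctive_select(records, indices, conditions):
    """Intersect the pointer sets of all conditions, then fetch the records."""
    pointer_sets = [set(indices[attr].get(value, ())) for attr, value in conditions]
    rids = set.intersection(*pointer_sets)
    return [records[rid] for rid in sorted(rids)]


records = [
    {"branch": "Perryridge", "balance": 500},
    {"branch": "Perryridge", "balance": 700},
    {"branch": "Downtown", "balance": 500},
]
indices = {attr: build_index(records, attr) for attr in ("branch", "balance")}
matches = conjunctive_select(
    records, indices, [("branch", "Perryridge"), ("balance", 500)]
)
print(matches)  # [{'branch': 'Perryridge', 'balance': 500}]
```

Only the records whose identifiers survive the intersection are fetched, which is the point of working on pointers first.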

Complex Selections: Disjunction

A11 (disjunctive selection by union of identifiers).

Applicable if all conditions have available indices.

Otherwise use linear scan.

Use corresponding index for each condition, and take union of all the obtained sets of record pointers.

Then fetch records from file
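
A sketch of A11 with the linear-scan fallback, again assuming that an index is a dictionary from attribute value to record identifiers. The sample records and the function name are hypothetical.

```python
def disjunctive_select(records, indices, conditions):
    """Union of record pointers if every condition has an index, else linear scan."""
    if any(attr not in indices for attr, _ in conditions):
        # At least one condition has no index: a full scan is needed anyway.
        return [
            record
            for record in records
            if any(record[attr] == value for attr, value in conditions)
        ]
    rids = set()
    for attr, value in conditions:
        rids.update(indices[attr].get(value, ()))
    return [records[rid] for rid in sorted(rids)]


records = [
    {"branch": "Perryridge", "balance": 500},
    {"branch": "Perryridge", "balance": 700},
    {"branch": "Downtown", "balance": 500},
]
indices = {
    "branch": {"Perryridge": [0, 1], "Downtown": [2]},
    "balance": {500: [0, 2], 700: [1]},
}
conditions = [("branch", "Downtown"), ("balance", 700)]
with_indices = disjunctive_select(records, indices, conditions)
without_index = disjunctive_select(records, {"branch": indices["branch"]}, conditions)
print(with_indices == without_index)  # True
```

One missing index forces the scan because the unindexed condition could match any record in the file.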

Negation:

Use linear scan on file

----------------------------------------------------------------------------------------------------------------

Sorting:

For relations that fit in memory, techniques like quicksort can be used. For relations that don’t fit in memory, external sort-merge is a good choice.

Let M denote memory size (in pages).

External Sort-Merge

Cost: total number of disk accesses for external sorting:

br (2 ⌈log M-1 (br / M)⌉ + 1), where the logarithm is taken to base M-1
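
The sketch below imitates external sort-merge in memory and evaluates the cost formula. It rests on simplifying assumptions: each "page" holds one value, M >= 3, and the number of merge passes is counted with integer arithmetic, which gives the same value as the ceiling of the logarithm whenever br > M. The function names are illustrative.

```python
import heapq


def external_sort(values, M):
    """Create sorted runs of M pages, then merge M - 1 runs at a time."""
    if M < 3:
        raise ValueError("need at least 3 pages: 2 inputs and 1 output")
    runs = [sorted(values[i:i + M]) for i in range(0, len(values), M)]
    while len(runs) > 1:
        runs = [
            list(heapq.merge(*runs[i:i + M - 1]))
            for i in range(0, len(runs), M - 1)
        ]
    return runs[0] if runs else []


def external_sort_cost(b_r, M):
    """b_r * (2 * merge_passes + 1) block transfers."""
    runs = -(-b_r // M)  # ceiling division: number of initial runs
    passes = 0
    while runs > 1:
        runs = -(-runs // (M - 1))  # each pass merges M - 1 runs into one
        passes += 1
    return b_r * (2 * passes + 1)


print(external_sort([5, 3, 8, 1, 9, 2, 7], M=3))  # [1, 2, 3, 5, 7, 8, 9]
print(external_sort_cost(1000, 11))               # 5000
```

For br = 1000 and M = 11 there are 91 initial runs, which need 2 merge passes (91 to 10 to 1), giving 1000 * (2 * 2 + 1) = 5000.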

Join Operation:

( r is called the outer relation and s the inner relation of the join.)

Nested-loop join :

for each tuple tr in r do begin
    for each tuple ts in s do begin
        test pair (tr, ts) to see if they satisfy the join condition θ
        if they do, add tr • ts to the result
    end
end

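Requires no indices and can be used with any kind of join condition, but it is expensive because it examines every pair of tuples.

Worst case (memory holds only one block of each relation): Cost = nr * bs + br disk accesses.

Best case (the smaller relation fits entirely in memory; use it as the inner relation): Cost = br + bs disk accesses.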
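
A runnable version of the nested-loop join, as a sketch: relations are assumed to be Python lists of tuples, tuple concatenation plays the role of tr • ts, and the sample sizes in the cost call (5000 tuples in 100 blocks for r, 400 blocks for s) are made-up numbers.

```python
def nested_loop_join(r, s, theta):
    """Test every pair (tr, ts) and keep the concatenation of matching pairs."""
    return [tr + ts for tr in r for ts in s if theta(tr, ts)]


def nested_loop_cost(n_r, b_r, b_s, inner_fits_in_memory=False):
    """Disk accesses: b_r + b_s at best, n_r * b_s + b_r at worst."""
    if inner_fits_in_memory:
        return b_r + b_s
    return n_r * b_s + b_r


r = [(1, "a"), (2, "b")]
s = [(1, "x"), (3, "y")]
print(nested_loop_join(r, s, lambda tr, ts: tr[0] == ts[0]))
# [(1, 'a', 1, 'x')]

print(nested_loop_cost(n_r=5000, b_r=100, b_s=400))  # 2000100
print(nested_loop_cost(5000, 100, 400, inner_fits_in_memory=True))  # 500
```

Because theta is an arbitrary predicate, the same function handles non-equality join conditions as well.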

Block nested-loop join :

for each block Br of r do begin
    for each block Bs of s do begin
        for each tuple tr in Br do begin
            for each tuple ts in Bs do begin
                check if (tr, ts) satisfy the join condition
                if they do, add tr • ts to the result
            end
        end
    end
end

Worst case estimate: Cost = br * bs + br block accesses.

Best case: br + bs block accesses.
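
The block variant can be sketched the same way, with a counter that shows the inner relation being read once per outer block rather than once per outer tuple. The helper blocks and the choice of 2 tuples per block are assumptions for illustration.

```python
def blocks(relation, tuples_per_block):
    """Split a list of tuples into consecutive blocks."""
    for i in range(0, len(relation), tuples_per_block):
        yield relation[i:i + tuples_per_block]


def block_nested_loop_join(r, s, theta, tuples_per_block=2):
    """Return the join result and the number of inner-block reads."""
    result = []
    inner_block_reads = 0
    for block_r in blocks(r, tuples_per_block):
        for block_s in blocks(s, tuples_per_block):
            inner_block_reads += 1  # one read of Bs per (Br, Bs) pair
            for tr in block_r:
                for ts in block_s:
                    if theta(tr, ts):
                        result.append(tr + ts)
    return result, inner_block_reads


r = [(1,), (2,), (3,), (4,)]               # 2 blocks
s = [(2,), (4,), (6,), (8,), (1,), (3,)]   # 3 blocks
joined, reads = block_nested_loop_join(r, s, lambda tr, ts: tr[0] == ts[0])
print(sorted(joined))  # [(1, 1), (2, 2), (3, 3), (4, 4)]
print(reads)           # 6
```

The 6 inner reads equal br * bs = 2 * 3; the plain nested-loop join would have scanned s once for each of the 4 tuples of r.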

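Improvements to nested loop and block nested loop algorithms:

Use M-2 memory blocks as the blocking unit for the outer relation (the remaining two blocks buffer the inner relation and the output):

Cost = ⌈br / (M-2)⌉ * bs + br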
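
The effect of the extra memory on the block nested-loop cost can be checked numerically. A sketch, assuming M >= 3 and using made-up sizes (br = 100, bs = 400); the function name is illustrative.

```python
def block_nested_loop_cost(b_r, b_s, M):
    """ceil(b_r / (M - 2)) * b_s + b_r block accesses."""
    if M < 3:
        raise ValueError("need at least 3 memory blocks")
    outer_chunks = -(-b_r // (M - 2))  # ceiling division
    return outer_chunks * b_s + b_r


print(block_nested_loop_cost(100, 400, M=3))    # 40100
print(block_nested_loop_cost(100, 400, M=12))   # 4100
print(block_nested_loop_cost(100, 400, M=102))  # 500
```

With M = 3 the formula reduces to the worst case br * bs + br, and once the whole outer relation fits in M-2 blocks it reaches the best case br + bs.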

Indexed Nested-Loop Join

Cost of the join: br + nr * c, where c is the cost of a single index lookup on s using the join condition

note: If indices are available on join attributes of both r and s, use the relation with fewer tuples as the outer relation.
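
A sketch of the indexed nested-loop join, assuming an equi-join and using a dictionary as a stand-in for the index on the inner relation s. The function names, key positions, and the sample figures in the cost call (br = 100, nr = 5000, c = 5) are illustrative assumptions.

```python
def indexed_nested_loop_join(r, s, r_key=0, s_key=0):
    """For each outer tuple, look up matching inner tuples through an index."""
    index = {}  # stand-in for an existing index on s
    for ts in s:
        index.setdefault(ts[s_key], []).append(ts)
    return [tr + ts for tr in r for ts in index.get(tr[r_key], ())]


def indexed_nested_loop_cost(b_r, n_r, c):
    """Read r once, plus one index lookup of cost c per tuple of r."""
    return b_r + n_r * c


r = [(1, "a"), (2, "b"), (2, "c")]
s = [(2, "x"), (3, "y")]
print(indexed_nested_loop_join(r, s))
# [(2, 'b', 2, 'x'), (2, 'c', 2, 'x')]

print(indexed_nested_loop_cost(b_r=100, n_r=5000, c=5))  # 25100
```

Since the cost grows with nr, putting the relation with fewer tuples on the outside keeps the number of lookups down.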

Merge-Join

Sort both relations on their join attribute (if not already sorted on the join attributes).

Can be used only for equi-joins and natural joins

Cost = br + bs + the cost of sorting if relations are unsorted.
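
A compact merge-join sketch for an equi-join on one attribute. Relations are assumed to be lists of tuples that fit in memory, so sorted() stands in for the external sort; the key positions are parameters introduced for the example.

```python
def merge_join(r, s, r_key=0, s_key=0):
    """Sort both inputs on the join attribute, then merge them in one pass."""
    r = sorted(r, key=lambda t: t[r_key])
    s = sorted(s, key=lambda t: t[s_key])
    result = []
    i = j = 0
    while i < len(r) and j < len(s):
        a, b = r[i][r_key], s[j][s_key]
        if a < b:
            i += 1
        elif a > b:
            j += 1
        else:
            # Find the group of s tuples that share this join value.
            j_end = j
            while j_end < len(s) and s[j_end][s_key] == a:
                j_end += 1
            # Pair every r tuple with this value against the whole group.
            while i < len(r) and r[i][r_key] == a:
                result.extend(r[i] + ts for ts in s[j:j_end])
                i += 1
            j = j_end
    return result


r = [(2, "b"), (1, "a"), (2, "c")]
s = [(2, "x"), (3, "y"), (2, "z")]
print(merge_join(r, s))
# [(2, 'b', 2, 'x'), (2, 'b', 2, 'z'), (2, 'c', 2, 'x'), (2, 'c', 2, 'z')]
```

Each relation is traversed once after sorting, which is where the br + bs term comes from; the comparison a < b is also why the method is limited to equi-joins and natural joins.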

hybrid merge-join:

Hash-Join

Applicable for equi-joins and natural joins.
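
A minimal hash-join sketch: both relations are partitioned with the same hash function on the join attribute, and each pair of partitions is then joined with an in-memory hash table (s as the build input, r as the probe input). The number of partitions and the key positions are assumptions made for the example.

```python
def hash_join(r, s, r_key=0, s_key=0, n_partitions=4):
    """Partition r and s on the join attribute, then build and probe per partition."""
    parts_r = [[] for _ in range(n_partitions)]
    parts_s = [[] for _ in range(n_partitions)]
    for tr in r:
        parts_r[hash(tr[r_key]) % n_partitions].append(tr)
    for ts in s:
        parts_s[hash(ts[s_key]) % n_partitions].append(ts)

    result = []
    for part_r, part_s in zip(parts_r, parts_s):
        # Build phase: in-memory hash table on this partition of s.
        table = {}
        for ts in part_s:
            table.setdefault(ts[s_key], []).append(ts)
        # Probe phase: tuples of r can only match within the same partition.
        for tr in part_r:
            for ts in table.get(tr[r_key], ()):
                result.append(tr + ts)
    return result


r = [(1, "a"), (2, "b"), (2, "c")]
s = [(2, "x"), (3, "y")]
print(sorted(hash_join(r, s)))
# [(2, 'b', 2, 'x'), (2, 'c', 2, 'x')]
```

Tuples with equal join values always hash to the same partition, so no matches are lost by comparing partitions pairwise; this only works for equality conditions, hence the restriction to equi-joins and natural joins.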

2004-11-23 11:07:52

songtitan
