200403 Introduction to Database Ch. 15 Query Processing and Optimization

🚩2020/04/03

现在是5月3日,终于补到一个月前的课了哈哈哈哈哈
视频前1h57min都在讲期中考试的题目 先直接跳过 之后复习再看 先把课上完🐶


Indices on Multiple Keys

  • Composite search keys are search keys containing more than one attribute
    • E.g., (dept_name, salary)
  • Lexicographic(adj. 词典式的) ordering: (a1, a2) < (b1, b2) if either
    • a1<b1,or
    • a1=b1 and a2 < b2

Indices on Multiple Attributes

在这里插入图片描述

Covering indices

• Add extra attributes to index so (some) queries can avoid fetching the actual records (从index文件query而不是数据)

Creation of Indices

在这里插入图片描述

  • example
create index takes_pk on takes(ID, course_ID, semester, section)
drop index takes_pk
  • Indices can greatly speed up lookups, but impose cost on updates

Query Processing and Optimization

(Database Systems Concepts, V7, Ch. 15)

在这里插入图片描述

  • Parsing and translation
    • translate the query into its internal form. This is then translated into relational algebra.
    • Parser(语法分析程序) checks syntax, verifies relations
  • Evaluation
    • The query-execution engine takes a query-evaluation plan, executes that plan, and returns the answers to the query.

Query Optimization:

Amongst all equivalent evaluation plans choose
the one with lowest cost.
• Cost is estimated using statistical information from the database catalog. e.g… number of tuples in each relation, size of tuples, etc.

Measures of Query Cost

  • Many factors contribute to time cost
    disk access, CPU, and network communication
  • Cost can be measured based on
    response time, i.e. total elapsed time for answering query, or
    • total resource consumption

P.S. 类比你在机场值机派对,
resource consumption是 when u get to the head of the line. How long does it take the person to check u in?
response time = resource consumption + how many people are in line

  • We use total resource consumption as cost metric
    • Response time harder to estimate, and minimizing resource consumption is a good idea in a shared database

  • We ignore CPU costs for simplicity
    • Real systems do take CPU cost into account
    • Network costs must be considered for parallel systems

  • We describe how estimate the cost of each operation
    • We do not include cost to writing output to disk

  • Required data may be buffer resident already, avoiding disk I/O
    • But hard to take into account for cost estimation

  • Several algorithms can reduce disk IO by using extra buffer space
    • Amount of real memory available to buffer depends on other concurrent queries and OS processes, known only during execution

  • Worst case estimates assume that no data is initially in buffer and only the minimum amount of memory needed for the operation is available
    • But more optimistic estimates are used in practice

Selection Operation

Which index do u pick and is the index clustered or not?

在这里插入图片描述在这里插入图片描述在这里插入图片描述
在这里插入图片描述
在这里插入图片描述

在这里插入图片描述

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值