Understanding cache lines, tuple-at-a-time processing, and clock cycles

  • Overview

    Those who cannot remember the past are condemned to repeat it - George Santayana

  • Cache

    A cache is a hardware or software component that stores data so that future requests for that data can be served faster.

  • CPU cache

    A CPU cache is a hardware cache used by the central processing unit (CPU) of a computer to reduce the average cost (time or energy) to access data from the main memory.

    When trying to read from or write to a location in the main memory, the processor checks whether the data from that location is already in the cache. If so, the processor will read from or write to the cache instead of the much slower main memory.
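    The "average cost" mentioned above is often estimated with the standard average memory access time formula: hit time plus miss rate times miss penalty. The sketch below is only an illustration; the function name `average_access_time` and all latency and miss-rate numbers are assumptions chosen for the example, not measurements of any real CPU.

```python
def average_access_time(hit_time_ns, miss_rate, miss_penalty_ns):
    """Average memory access time = hit time + miss rate * miss penalty."""
    return hit_time_ns + miss_rate * miss_penalty_ns


# Assumed, illustrative numbers: a 1 ns cache hit, a 100 ns trip to main
# memory on a miss, and 5% of accesses missing the cache.
with_cache = average_access_time(hit_time_ns=1.0, miss_rate=0.05, miss_penalty_ns=100.0)

# Without a cache, every access pays the full main-memory latency.
without_cache = 100.0

print(f"with cache:    {with_cache:.1f} ns per access on average")
print(f"without cache: {without_cache:.1f} ns per access")
```

    Under these assumed numbers the cache pays off even though one access in twenty still goes to main memory.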

    Most modern desktop and server CPUs have at least three independent caches:

    • an instruction cache to speed up executable instruction fetch
    • a data cache to speed up data fetch and store
    • a translation lookaside buffer (TLB) to speed up virtual-to-physical address translation for both executable instructions and data
  • Cache line

    Data is transferred between memory and cache in blocks of fixed size, called cache lines or cache blocks.

    When a cache line is copied from memory into the cache, a cache entry is created.

    The cache entry will include the copied data as well as the requested memory location (called a tag).
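    To make the tag idea concrete, here is a minimal sketch of a toy direct-mapped cache. It assumes 64-byte cache lines and 512 entries (both values are assumptions; real CPUs differ and are usually set-associative), and the names `split_address` and `DirectMappedCache` are hypothetical. It only tracks which line occupies each entry, not the data itself.

```python
LINE_SIZE = 64    # bytes per cache line (assumed)
NUM_LINES = 512   # entries in this toy cache (assumed), i.e. 32 KiB in total


def split_address(addr):
    """Split a byte address into (tag, index, offset) for a direct-mapped cache."""
    offset = addr % LINE_SIZE          # byte position inside the cache line
    line_number = addr // LINE_SIZE    # which memory block the byte belongs to
    index = line_number % NUM_LINES    # which cache entry that block maps to
    tag = line_number // NUM_LINES     # identifies the block held in that entry
    return tag, index, offset


class DirectMappedCache:
    def __init__(self):
        self.tags = [None] * NUM_LINES  # one stored tag per cache entry
        self.misses = 0

    def access(self, addr):
        """Return True on a hit; on a miss, load the line and record its tag."""
        tag, index, _ = split_address(addr)
        if self.tags[index] == tag:
            return True
        self.tags[index] = tag          # a cache entry is created for the line
        self.misses += 1
        return False


def count_misses(addresses):
    cache = DirectMappedCache()
    for addr in addresses:
        cache.access(addr)
    return cache.misses


# 1024 consecutive 4-byte reads touch only 64 distinct cache lines.
sequential_misses = count_misses(range(0, 4096, 4))
# 1024 reads spaced 64 bytes apart touch a different cache line every time.
strided_misses = count_misses(range(0, 1024 * 64, 64))

print(sequential_misses, strided_misses)
```

    Both loops issue the same number of reads, but the sequential one reuses each loaded line for 16 reads, which is the effect that cache-friendly code tries to exploit.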

  • Query Processing

    The vectorization model aims to increase the efficiency of the materialization model with a better use of the CPU caches.

    There are three ways for a DBMS to execute a query plan:

    • Tuple-at-a-time: Each operator calls next on its child to get the next tuple to process. Also known as the Volcano iterator model;

    • Operator-at-a-time: Each operator materializes its entire output for its parent operator. It is well suited to in-memory OLTP engines;

    • Vector-at-a-time: Each operator calls next on its child to get the next batch of data to process;
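    The difference between the tuple-at-a-time and vector-at-a-time models can be sketched with two tiny operator pipelines. This is a simplified illustration, not the code of any real DBMS: the class names (`Scan`, `Filter`, `VectorScan`, `VectorFilter`), the `next_batch` method, and the batch size of 4 are all assumptions made for the example.

```python
class Scan:
    """Tuple-at-a-time leaf operator: returns one row per next() call."""
    def __init__(self, rows):
        self.rows, self.pos, self.calls = rows, 0, 0

    def next(self):
        self.calls += 1
        if self.pos >= len(self.rows):
            return None                 # None signals end of input
        row = self.rows[self.pos]
        self.pos += 1
        return row


class Filter:
    """Pulls rows from its child until one satisfies the predicate."""
    def __init__(self, child, predicate):
        self.child, self.predicate = child, predicate

    def next(self):
        while True:
            row = self.child.next()
            if row is None or self.predicate(row):
                return row


class VectorScan:
    """Vector-at-a-time leaf operator: returns a batch of rows per call."""
    def __init__(self, rows, batch_size):
        self.rows, self.batch_size, self.pos, self.calls = rows, batch_size, 0, 0

    def next_batch(self):
        self.calls += 1
        if self.pos >= len(self.rows):
            return None
        batch = self.rows[self.pos:self.pos + self.batch_size]
        self.pos += self.batch_size
        return batch


class VectorFilter:
    """Filters a whole batch in one tight loop; skips batches that end up empty."""
    def __init__(self, child, predicate):
        self.child, self.predicate = child, predicate

    def next_batch(self):
        while True:
            batch = self.child.next_batch()
            if batch is None:
                return None
            selected = [row for row in batch if self.predicate(row)]
            if selected:
                return selected


def is_even(value):
    return value % 2 == 0


rows = list(range(10))

# Tuple-at-a-time: the root is asked for one row at a time.
scan = Scan(rows)
root = Filter(scan, is_even)
tuple_result = []
row = root.next()
while row is not None:
    tuple_result.append(row)
    row = root.next()

# Vector-at-a-time: the root is asked for one batch at a time.
vector_scan = VectorScan(rows, batch_size=4)
vector_root = VectorFilter(vector_scan, is_even)
vector_result = []
batch = vector_root.next_batch()
while batch is not None:
    vector_result.extend(batch)
    batch = vector_root.next_batch()

print(tuple_result, scan.calls)
print(vector_result, vector_scan.calls)
```

    Both pipelines produce the same rows, but the batched version makes far fewer calls between operators and processes each batch in a tight loop over contiguous data, which is where the better use of CPU caches comes from.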

  • Volcano iterator model

    OceanBase: The Evolution of the Database Query Engine

    Volcano–An Extensible and Parallel Query Evaluation System

  • Clock cycles

    A clock signal oscillates between a high and a low state and is used like a metronome to coordinate actions of digital circuits.

    A clock cycle is a single electronic pulse of a CPU. During each cycle, a CPU can perform a basic operation such as fetching an instruction, accessing memory, or writing data.

    In physics, the frequency of a signal is measured in cycles per second, or hertz (Hz). Similarly, the frequency of a processor is measured in clock cycles per second.

    The speed of a computer processor, or CPU, depends on its clock cycle time, which is the amount of time between two pulses of an oscillator. The cycle time is the reciprocal of the clock frequency, so a higher frequency means a shorter cycle.
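    As a small worked example of the relationship between frequency and cycle time, the sketch below converts a clock frequency into a cycle time and a latency into a cycle count. The 3 GHz frequency and the 100 ns memory latency are assumed, illustrative values, and the function names are hypothetical.

```python
def cycle_time_ns(frequency_hz):
    """Length of one clock cycle in nanoseconds (the reciprocal of the frequency)."""
    return 1e9 / frequency_hz


def cycles_for(latency_ns, frequency_hz):
    """Number of clock cycles that elapse during a given latency."""
    return latency_ns * frequency_hz / 1e9


FREQUENCY_HZ = 3e9          # assumed 3 GHz processor
MEMORY_LATENCY_NS = 100.0   # assumed main-memory latency, for illustration only

print(f"one cycle lasts about {cycle_time_ns(FREQUENCY_HZ):.3f} ns")
print(f"a {MEMORY_LATENCY_NS:.0f} ns memory access spans about "
      f"{cycles_for(MEMORY_LATENCY_NS, FREQUENCY_HZ):.0f} cycles")
```

    Under these assumptions a single trip to main memory costs hundreds of cycles, which is why the caches described earlier matter so much.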
