Stanford Cryptography I Study Notes 4: Lecture 3 Message Integrity

R.X. NLOS

Last modified 2022-06-03 11:16:49

Category: Cryptography (Stanford, Dan Boneh). Tags: cryptography, Dan Boneh, message integrity, notes, Stanford

First published 2022-06-03 10:47:19

Article link: https://blog.csdn.net/qazwsxrx/article/details/125110770

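This series contains study notes for Professor Dan Boneh's Stanford course "Cryptography I"
Course URL: http://www.coursera.org/lecture/crypto/course-overview-lboqg

The content is published in parallel on CSDN, Zhihu, and a WeChat official account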

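The Markdown source files are not public yet; contact the author by email if you need them
These notes inevitably contain mistakes; corrections by email are welcome

The complete course outline is as follows

This post covers Chapter 3, Message Integrity, including: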

Table of contents

3 Message Integrity

3 Message Integrity

This chapter:

First:
stop talking about encryption
instead, discuss Message Integrity

Next:
come back to encryption
show how to provide both encryption and integrity

3.1 Message Integrity: Definitions

3.1.1 Message Authentication Codes

Message Integrity

Goal: integrity, no confidentiality
Examples:
- Protecting public binaries on disk
  - such as operating system files on your disk
    - they are not confidential
    - but it is important to make sure that they are not modified by a virus or other malware
- Protecting banner ads on web pages
  - The provider of the ads does not care at all if someone copies them
    - No confidentiality at all
    - But they do care about modifying those ads
    - where integrity matters

Message integrity: MACs

MAC: message authentication code

Alice sends a message to Bob
- integrity: make sure that an attacker along the way cannot modify this message
- Alice: generate tag:
  - using a MAC signing algorithm
  - tag $\leftarrow S(k,m)$
  - takes the key and the message as input
  - output: a very short tag
    - like 90 bits or 100 bits
    - even though the message is gigabytes long
- Then append the tag to the message
  - and send the combination of them to Bob
- Bob: verify tag:
  - using a MAC verification algorithm on this tag
  - input: k, m, tag
  - output: yes / no

Def:
- MAC $I = (S, V)$ defined over $(K, M, T)$ is a pair of algs:
- S(k,m) outputs t in T
- V(k,m,t) outputs “yes” or “no”

Consistency requirement that must hold:

$\forall k \in \mathcal{K}, m\in \mathcal{M}$:
$V(k, m, S(k, m)) = \text{``yes''}$

Integrity requires a secret key

If Alice uses a CRC instead of a MAC:
- the CRC is keyless
- Bob checks whether tag == CRC(m)

The attacker can easily modify the message m and recompute the CRC
- i.e., send (m’||CRC(m’))
CRC is designed to detect random errors, not malicious ones
By introducing the key, Alice can do something that the attacker cannot do!
- which is what makes integrity possible!
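
To make the keyless-CRC problem concrete, here is a minimal sketch using Python's standard `zlib.crc32`. The message contents and the helper names `crc_tag` and `bob_accepts` are illustrative assumptions, not part of the lecture.

```python
import zlib

def crc_tag(message: bytes) -> bytes:
    """Keyless 'tag': the CRC-32 of the message, as 4 bytes."""
    return zlib.crc32(message).to_bytes(4, "big")

def bob_accepts(message: bytes, tag: bytes) -> bool:
    """Bob's check: recompute the CRC and compare."""
    return tag == crc_tag(message)

# Alice sends (m, CRC(m)).
m = b"pay Bob 100 dollars"
sent = (m, crc_tag(m))

# The attacker replaces m with m' and simply recomputes the keyless CRC.
m_forged = b"pay Eve 100 dollars"
forged = (m_forged, crc_tag(m_forged))

print(bob_accepts(*sent))    # the honest message passes
print(bob_accepts(*forged))  # the modified message passes as well
```

Nothing in `crc_tag` depends on a secret, so anyone who can modify the message can also produce a matching tag.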

Secure MACs

Attacker’s power:
- chosen message attack
  - the attacker can give Alice arbitrary messages of his choice
  - and Alice will compute the tag for the attacker
  - Why would Alice do that?
    - In practice, it’s normal
    - e.g., with email again: the attacker sends Alice an email, Alice may attach a tag to it for safety, and the attacker then gets hold of that tag
- for $m_1,m_2, \ldots , m_q$ attacker is given $t_i \leftarrow S(k,m_i)$
Attacker’s goal:
- existential forgery
- produce some new valid message/tag pair (m,t)
- $(m,t) \notin \{(m_1,t_1),\ldots,(m_q,t_q)\}$
  - this counts even for a message that is complete gibberish (meaningless data)!
    - Why, if gibberish has no value to the attacker?
    - because what we want to send might be a random secret key
    - which looks like gibberish
    - yet with a forgery the attacker can fool a user into using the wrong secret key!

$\Rightarrow$ attacker cannot produce a valid tag for a new message
- $\Rightarrow$ given (m,t), the attacker cannot even produce (m,t’) for $t' \neq t$
- if the attacker has a tag t for a message m, the attacker must not be able to produce another tag t’ for the same message
- Because: in many applications it is really important that the attacker cannot produce a new tag for a previously signed message
  - in particular, when we combine encryption and integrity

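More precise definition
- For a MAC I = (S,V) and adv. A define a MAC game as:
  - the challenger picks a random key k in K
  - A submits messages $m_1, \ldots, m_q$ of its choice and receives $t_i \leftarrow S(k,m_i)$
  - A outputs a pair (m,t)
  - the challenger outputs 1 if V(k,m,t) = “yes” and $(m,t) \notin \{(m_1,t_1),\ldots,(m_q,t_q)\}$, and 0 otherwise

Def: $I = (S, V)$ is a secure MAC if for all “efficient” A:
- $Adv_{MAC}[A,I]$ = Pr[Chal. outputs 1] is “negligible”
i.e., no efficient adversary can win this game with non-negligible probability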

Example on the MAC security

Example 1

Let $I = (S, V)$ be a MAC
Suppose an attacker is able to find $m_0 \neq m_1$ such that
- $S(k,m_0)=S(k,m_1)$ for 1/2 of the keys k in K
Can this MAC be secure?
Ans: No, this MAC can be broken using a chosen msg attack!
- the attacker asks for the tag on $m_0$
- and receives $(m_0,t)$
- then the attacker outputs $(m_1,t)$ as his existential forgery
  - $(m_1,t)$ is different from $(m_0,t)$
- So the advantage of the attacker is 1/2
- non-negligible

Example 2

Let $I = (S, V)$ be a MAC
Suppose S(k,m) is always 5 bits long
Can this MAC be secure?
Ans:
- No, an attacker can simply guess the tag for a message
- What will the attacker do?
  - ask no queries!
  - just output an existential forgery as follows:
    - choose a random tag $t \xleftarrow{R} \{0,1 \}^{5}$
    - output: (0,t)
- And the adv = 1/32
  - Non-negligible
MAC tags must not be too short
- typical tag lengths: 64, 96, 128 bits
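
The MAC game and the guessing attack of Example 2 can be sketched in a few lines. This is only an illustration: the 5-bit toy MAC is built here from HMAC-SHA256 (an assumption, the lecture does not fix a construction), and the names `mac_game`, `guessing_adversary`, and `estimate_advantage` are hypothetical.

```python
import hashlib
import hmac
import secrets

TAG_BITS = 5  # deliberately tiny, as in Example 2

def S(k: bytes, m: bytes) -> int:
    """Toy signing algorithm: the top 5 bits of HMAC-SHA256(k, m)."""
    digest = hmac.new(k, m, hashlib.sha256).digest()
    return digest[0] >> (8 - TAG_BITS)

def V(k: bytes, m: bytes, t: int) -> bool:
    return t == S(k, m)

def mac_game(adversary, k: bytes) -> bool:
    """Challenger: True iff the adversary outputs a new valid (m, t) pair."""
    seen = set()

    def sign_oracle(m: bytes) -> int:
        t = S(k, m)
        seen.add((m, t))
        return t

    m, t = adversary(sign_oracle)
    return V(k, m, t) and (m, t) not in seen

def guessing_adversary(sign_oracle):
    """Asks no queries; outputs message 0 with a uniformly random tag."""
    return b"\x00", secrets.randbelow(2 ** TAG_BITS)

def estimate_advantage(trials: int = 20000) -> float:
    """Fraction of games won, each with a fresh random key."""
    wins = sum(mac_game(guessing_adversary, secrets.token_bytes(16))
               for _ in range(trials))
    return wins / trials

print(estimate_advantage())  # should be close to 1/32
```

Exactly one of the 32 possible tags verifies for any fixed key and message, which is where the advantage of 1/32 comes from.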

Example: Protecting system files

Suppose at install time the system computes:
- a key k derived from the user’s password
- a tag for each one of the files
- then erases the key k
  - the key k is no longer stored on disk

Later, a virus infects the system and modifies system files
- How can the user detect which files were tampered with?
- The user reboots into a clean OS and supplies his password
  - Then, secure MAC $\Rightarrow$ all modified files will be detected
    - meaning the virus could not create a new pair (F’, t’) that passes the MAC verification alg.

Next:

Try to build a secure MAC algorithm

3.1.2 MACs based on PRFs

Review: Secure MAC

MAC $I = (S, V)$ defined over $(K, M, T)$ is a pair of algs:
- Signing S(k,m) outputs t in T
- Verification V(k,m,t) outputs “yes” or “no”
Attacker’s power:
- chosen message attack
- for $m_1,m_2, \ldots , m_q$ attacker is given $t_i \leftarrow S(k,m_i)$
Attacker’s goal:
- existential forgery
- produce some new valid message/tag pair (m,t)
- $(m,t) \notin \{(m_1,t_1),\ldots,(m_q,t_q)\}$

$\Rightarrow$ attacker cannot produce a valid tag for a new message

How to build a MAC?

Secure PRF $\Rightarrow$ Secure MAC

For a PRF $F: K \times X \rightarrow Y$ define a MAC $I_F = (S,V)$ as:
- $S (k, m) : = F (k, m)$
- $V (k, m, t)$ : output ‘yes’ if $t = F (k, m)$ and ‘no’ otherwise
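
The derived MAC $I_F$ is short enough to write out directly. In this sketch HMAC-SHA256 stands in for the PRF F; that choice is an assumption made only so the example runs with the standard library, and any secure PRF with a large output space would fit the definition.

```python
import hashlib
import hmac

def F(k: bytes, x: bytes) -> bytes:
    """Stand-in PRF (an assumption of this sketch): HMAC-SHA256."""
    return hmac.new(k, x, hashlib.sha256).digest()

def S(k: bytes, m: bytes) -> bytes:
    """S(k, m) := F(k, m)."""
    return F(k, m)

def V(k: bytes, m: bytes, t: bytes) -> bool:
    """Output 'yes' iff t = F(k, m); the comparison is constant time."""
    return hmac.compare_digest(t, F(k, m))

key = b"0" * 32
tag = S(key, b"hello")
print(V(key, b"hello", tag), V(key, b"hellp", tag))
```

Verification simply recomputes the tag, so the scheme needs no separate verification key or state.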

A bad example

Suppose $F:K\times X \rightarrow Y$ is a secure PRF with $Y=\{0,1 \}^{10}$
Is the derived MAC $I_F$ a secure MAC system?
Ans: No, the tags are too short: anyone can guess the tag for any msg
- just like the last example
- Adv = 1/1024

Security

Thm: if $F: K \times X \rightarrow Y$ is a secure PRF and 1/|Y| is negligible
- then $I_F$ is a secure MAC
In particular, for every eff. MAC adversary A attacking $I_F$
- there exists an eff. PRF adversary B attacking F, s.t.:
- $\operatorname{Adv}_{\mathrm{MAC}}\left[\mathrm{A}, \mathrm{I}_{\mathrm{F}}\right] \leq \operatorname{Adv}_{\mathrm{PRF}}[\mathrm{B}, \mathrm{F}]+1 /|\mathrm{Y}|$
  - $\operatorname{Adv}_{\mathrm{PRF}}[\mathrm{B}, \mathrm{F}]$ is negligible
$\Rightarrow$ $I_F$ is secure as long as |Y| is large, say $|Y| = 2^{80}$

Proof Sketch

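Suppose $f: X \rightarrow Y$ is a truly random function
Then MAC adversary A must win the following game:
- A queries $m_1, \ldots, m_q$, receives $f(m_1), \ldots, f(m_q)$, and outputs (m,t)
A wins if t=f(m) and $m\notin \{m_1, \ldots , m_q\}$
Because $f: X \rightarrow Y$ is a truly random function, f(m) is independent of everything A has seen
$\Rightarrow$ Pr[A wins] = 1/|Y|
The same must hold for F(k,x), otherwise A could be used to distinguish F from a random function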

Examples

AES: a MAC for 16-byte messages
Main question: how to convert a Small-MAC into a Big-MAC?
- the input message can be very large, rather than, e.g., just 16 bytes
Two main constructions used in practice:
- CBC-MAC:
  - banking: ANSI X9.9, X9.19, FIPS 186-3
- HMAC
  - Internet protocols: SSL, IPsec, SSH
Both convert a small PRF into a big PRF

Truncating MACs based on PRFs

Easy lemma: suppose $F:K\times X \rightarrow \{0,1 \}^{n}$ is a secure PRF
- Then so is $F_t(k,m) = F(k,m)[1:t]$ for all $1 \leq t\leq n$
  - just output the first t bits
$\Rightarrow$ if (S,V) is a MAC based on a secure PRF outputting n-bit tags
- the truncated MAC outputting w bits is secure
  - as long as $1/2^w$ is still negligible (say $w \geq 64$)

Next segment:

See how CBC-MAC works

3.2 Message Integrity 2: Construction (Sequential MAC Construction)

3.2.1 CBC-MAC and NMAC

In this segment:

Construct two classic MACs
The CBC-MAC and the NMAC

MACs and PRFs

Recall: secure PRF F $\Rightarrow$ secure MAC
- as long as |Y| is large
- S(k,m) = F(k,m)
Our goal:
- given a PRF for short messages (AES)
  - it can only process 16-byte messages
- construct a PRF for long messages
(shorthand for what’s coming)
- From here on let $X = \{0,1 \}^{n}$
  - e.g. n = 128

Construction 1: CBC-MAC

also called encrypted CBC-MAC
- ECBC
Let $F: K \times X \rightarrow X$ be a PRP
- Define new PRF $F_{ECBC}: K^2 \times X^{\leq L} \rightarrow X$
  - $X^{\leq L} = \bigcup_{i=1}^{L} X^i$
  - L: the bound on the maximum length (in blocks)

Two stages:
- $F(k,\cdot)$: many rounds
  - each message block is XORed into the running state, which is then passed through $F(k,\cdot)$
  - this first stage is called raw CBC
  - CBC-MAC with only this stage is NOT secure!!
- and $F(k_1,\cdot)$
  - k1 is a key independent of k
  - it outputs the final tag (n bits long)
  - it’s ok to truncate the tag to fewer than n bits
  - this step is critical for making the MAC secure

Another construction for converting a small PRF to a large PRF:

Called NMAC

Construction 2: NMAC (nested MAC)

Let $F: K \times X \rightarrow K$ be a PRF
- Define new PRF $F_{NMAC}: K^2 \times X^{\leq L} \rightarrow K$
  - outputs an element of the KEY space!

Process
- 1 take our message and break it into blocks
  - each block is as long as the block length of the underlying PRF
- 2 feed the key as the key input to F
  - the first message block from step 1 acts as the data input to F
  - this generates a new key, which is used as the key for the next block …
  - finally we get the output $t\in \mathcal{K}$
  - Just like before:
    - if we stop here, the function we obtain is called the cascade function, which is not secure!
- 3 map the element $t \in \mathcal{K}$ into the set X (by appending a fixed pad)
  - because the last application of F needs a data input in X
- 4 apply F once more with the independent key k1 to get the final output tag

Why the last encryption step in ECBC-MAC and NMAC?

NMAC: Suppose we define a MAC $I = (S, V)$ where
- S(k,m) = cascade(k,m)

Question: Is the MAC I secure? Why?

Ans: This MAC can be forged with one chosen msg query!
- Query: Cascade(k,m)
- $\Rightarrow$ the attacker can calculate Cascade(k, m || w) for any w
- where Cascade(k, m || w) = F(Cascade(k,m), w)
  - the attacker knows Cascade(k,m), w, F
  - therefore the attacker can forge (m||w, Cascade(k, m || w))
- Not secure even with only one chosen msg query!!
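
The extension attack on the cascade can be replayed in code. This sketch assumes a stand-in PRF $F: K \times X \rightarrow K$ built from truncated HMAC-SHA256 with 16-byte keys and blocks; the concrete key, message blocks, and function names are illustrative.

```python
import hashlib
import hmac

BLOCK = 16  # bytes per message block; keys are also 16 bytes here

def F(k: bytes, x: bytes) -> bytes:
    """Stand-in PRF F: K x X -> K (assumption: truncated HMAC-SHA256)."""
    return hmac.new(k, x, hashlib.sha256).digest()[:BLOCK]

def cascade(k: bytes, blocks: list[bytes]) -> bytes:
    """Feed each block through F, using the previous output as the next key."""
    for block in blocks:
        k = F(k, block)
    return k

secret_key = b"K" * BLOCK           # unknown to the attacker
m = [b"A" * BLOCK, b"B" * BLOCK]    # the one chosen-message query
t = cascade(secret_key, m)          # tag returned by the signer

# The attacker extends m by any block w using only the public F and t.
w = b"W" * BLOCK
forged_tag = F(t, w)

print(forged_tag == cascade(secret_key, m + [w]))
```

The forgery never touches `secret_key`: the tag t itself is the key the cascade would use for the next block.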

ECBC: Suppose we define a MAC $I_{RAW} = (S,V)$ where
- S(k,m) = rawCBC(k,m)
Then $I_{RAW}$ is easily broken using a 1-chosen msg attack!
- The reason is different from the NMAC case:
  - the attacker does not know k in raw CBC!
  - in the cascade: the next key is the query output t itself!
- However, the attack works as follows:
  - Choose an arbitrary one-block message $m\in X$
  - Request the tag for m
    - Get t = F(k,m)
  - Output t as a MAC forgery for the 2-block message $(m,\ t \oplus m)$
  - Indeed: $\text{rawCBC}(k, (m,\ t\oplus m)) = F(k, F(k,m)\oplus (t\oplus m)) = F(k, t\oplus(t\oplus m)) = F(k,m) = t$
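
The raw CBC forgery can be checked the same way. In this sketch the block cipher is replaced by truncated HMAC-SHA256 (an assumption; the attack only needs F to be a deterministic keyed function, so invertibility is not used), and the message bytes are arbitrary.

```python
import hashlib
import hmac

BLOCK = 16

def F(k: bytes, x: bytes) -> bytes:
    """Stand-in for the block cipher (assumption: truncated HMAC-SHA256)."""
    return hmac.new(k, x, hashlib.sha256).digest()[:BLOCK]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def raw_cbc(k: bytes, blocks: list[bytes]) -> bytes:
    """Raw CBC: chain F over the blocks, with no final encryption step."""
    state = bytes(BLOCK)  # zero IV: the first block goes straight into F
    for block in blocks:
        state = F(k, xor(state, block))
    return state

secret_key = b"K" * BLOCK
m = b"one-block-msg!!!"             # exactly 16 bytes
t = raw_cbc(secret_key, [m])        # the single chosen-message query

forged_message = [m, xor(t, m)]     # two-block message (m, t XOR m)
print(raw_cbc(secret_key, forged_message) == t)
```

The second block is chosen so that it cancels the chaining value, which sends m through F a second time and reproduces t.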

ECBC-MAC and NMAC analysis

Theorem:
- For any $L > 0$
- For every eff. q-query PRF adv. A attacking $F_{ECBC}$ or $F_{NMAC}$
- there exists an eff. adversary B, s.t.:
  - $\operatorname{Adv}_{PRF}\left[A, F_{ECBC}\right] \leq \operatorname{Adv}_{PRP}[B, F]+2 q^{2} /|X|$
  - $\operatorname{Adv}_{PRF}\left[A, F_{NMAC}\right] \leq q \cdot L \cdot \operatorname{Adv}_{PRF}[B, F]+q^{2} / 2|K|$
CBC-MAC is secure as long as $q \ll |X|^{1/2}$
NMAC is secure as long as $q \ll |K|^{1/2}$
- $2^{64}$ for AES-128

An example

$\operatorname{Adv}_{\mathrm{PRF}}\left[\mathrm{A}, \mathrm{F}_{\mathrm{ECBC}}\right] \leq \operatorname{Adv}_{\mathrm{PRP}}[\mathrm{B}, \mathrm{F}]+2 \mathrm{q}^{2} /|\mathrm{X}|$
- q = # messages MAC-ed with k
Suppose we want $Adv_{PRF} [A,F_{ECBC}] \leq 1/2^{32}$
- $\Leftarrow q^2 / |X| < 1/2^{32}$
For AES:
- $|X| = 2^{128} \Rightarrow q<2^{48}$
- So after $2^{48}$ messages we must change the key!!!
But for 3DES:
- $|X| = 2^{64}$
- $\Rightarrow q<2^{16}$
- Too small!
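
The arithmetic behind these two limits, spelled out as a short worked derivation. Like the example, it drops the constant factor 2 from the theorem's bound; keeping it would lower each limit by a factor of $\sqrt{2}$.

$q^2/|X| < 1/2^{32} \iff q < \sqrt{|X|/2^{32}}$
AES: $|X| = 2^{128} \Rightarrow q < \sqrt{2^{128-32}} = 2^{48}$
3DES: $|X| = 2^{64} \Rightarrow q < \sqrt{2^{64-32}} = 2^{16}$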

The security bounds are tight: an attack

Theory

After signing:
- $|X|^{1/2}$ messages with ECBC-MAC or
- $|K|^{1/2}$ messages with NMAC
the MACs become insecure
Suppose the underlying PRF F is a PRP
- e.g., AES
- Then both PRFs (ECBC and NMAC) have the following extension property:
- $\forall x, y, w: \quad F_{BIG}(k, x)=F_{BIG}(k, y) \Rightarrow F_{BIG}(k, x \| w)=F_{BIG}(k, y \| w)$
  - namely, if you give me a collision on x and y
  - then that also implies a collision on any extension of x and y

An example

Let $F_{BIG}: K \times X \rightarrow Y$ be a PRF that has the extension property
- $F_{BIG}(k, x)=F_{BIG}(k, y) \Rightarrow F_{BIG}(k, x \| w)=F_{BIG}(k, y \| w)$
Generic attack on the derived MAC

Step 1: issue $|Y|^{1/2}$ message queries for rand. messages in X

Obtain $(m_i,t_i)$ for $i= 1, 2, \ldots, |Y|^{1/2}$

Step 2: find a collision $t_u = t_v$ for $u \neq v$

One exists w.h.p. by the birthday (b-day) paradox

Step 3: choose some w and query for $t:=F_{BIG}\left(k, m_{u} \| w\right)$
Step 4: output forgery $(m_v \| w, t)$. Indeed $t = F_{BIG}(k,m_v \| w)$

Comparison

ECBC-MAC:
- is commonly used as an AES-based MAC
- CCM encryption mode (used in 802.11i)
- NIST standard called CMAC
NMAC:
- not usually used with AES or 3DES
- Main reason:
  - need to change the AES key on every block
    - requires re-computing the AES key expansion
    - AES is not designed to perform well when the key changes very rapidly!
- But NMAC is the basis for a popular MAC called HMAC
  - Will be introduced later

3.2.2 MAC padding

Last segment:

talk about CBC-MAC and NMAC
assumed that the message length was a multiple of the block length

In this segment:

See What to do when the message length is not a multiple of the block size

Recall ECBC-MAC

ECBC MAC:
- Let $F: K \times X \rightarrow X $ be a PRP
- Define new PRF $F_{ECBC}: K^2 \times X^{\leq L} \rightarrow X$

CBC MAC padding: What if the msg. len. is not a multiple of the block size?

Bad idea:
- pad m with 0’s
- m[0] || m[1] $\Rightarrow$ m[0] || m[1] || 0000
Is the resulting MAC secure?
- No! given the tag on msg m, the attacker obtains the tag on m||0
- The problem is:
  - it is easy to come up with a msg. m such that m and m||0 are identical after padding
  - which means that both m and m||0 have the same tag!
  - and therefore the attacker can mount an existential forgery.

a concrete example:
- we write a check for 100 dollars
- but the tag of 100 dollars = the tag of 1000 dollars
- Terrible!

CBC MAC padding

For security, padding must be invertible!
- namely, $m_0 \neq m_1 \Rightarrow pad(m_0) \neq pad(m_1)$
- different messages must have different padding results!
ISO (proposed by the International Organization for Standardization)
- pad with “100 … 00”
  - the pad fills the last block up to the block size
- add a new dummy block if needed
  - i.e., when the message length is already an exact multiple of the block size
- The ‘1’ indicates the beginning of the pad

On the case where the message length is an exact multiple of the block size (second row of the original figure):
- if we did not add the dummy block
  - the pad would not be invertible
  - the MAC becomes insecure!
  - for example, if the last bits of m’[1] happen to be ‘100’
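
A byte-oriented sketch of the “100 … 00” padding and its inverse. The byte granularity (0x80 followed by zero bytes), the 16-byte block size, and the names `iso_pad` / `iso_unpad` are assumptions made for illustration; the lecture describes the scheme at the bit level.

```python
BLOCK = 16  # block size in bytes (e.g. AES)

def iso_pad(message: bytes) -> bytes:
    """Append 0x80 (bits 1000 0000), then zero bytes up to a block boundary.

    If the message is already a multiple of the block size, this adds a
    whole dummy block, which is what keeps the padding invertible.
    """
    pad_len = BLOCK - (len(message) % BLOCK)
    return message + b"\x80" + bytes(pad_len - 1)

def iso_unpad(padded: bytes) -> bytes:
    """Invert iso_pad: strip trailing zeros, then the 0x80 marker."""
    stripped = padded.rstrip(b"\x00")
    if not stripped or stripped[-1] != 0x80:
        raise ValueError("invalid padding")
    return stripped[:-1]

print(len(iso_pad(b"x" * 15)), len(iso_pad(b"x" * 16)))
```

A message that itself ends in the bytes of a pad is still recovered exactly, because the unpadding only removes the last marker it finds.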

Is there a padding scheme that never needs to add a dummy block?

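The answer: no. With a deterministic padding function there will always be cases where we need to add a dummy block!
- the set of messages whose length is an exact multiple of the block length (the outputs of padding) is much smaller than the set of all messages (the inputs to padding)
- so there is no one-to-one map from the bigger set into the smaller set!

But: by using a randomized padding function, we can!
see the next part (CMAC)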

CMAC (NIST standard)

Variant of CBC-MAC where key = $(k,k_1,k_2)$
- (sometimes called a three-key construction)
- k: used in the standard CBC-MAC algorithm
- k1 and k2: used just for the padding scheme at the very last block!
  - k1 and k2 are derived from k through a PRG
- No final encryption step
  - extension attack thwarted by the last keyed xor
- No dummy block
  - ambiguity resolved by use of $k_1$ or $k_2$
Principle:
- If the msg. length is not a multiple of the block length:
  - pad with 100…
  - and XOR k1 into the last block
- else (the msg. length is a multiple of the block length):
  - no dummy block
  - XOR k2 into the last block!
This is secure
- the attacker does not know k1 and k2
Benefits
- 1 No final encryption layer
- 2 The two distinct keys resolve the ambiguity between the 2 cases
  - and the scheme is secure!
CMAC: the standard!
- when you use CBC-MAC in practice
- you would actually be using CMAC, the standard way to do it
- with F = AES

This & Last segment:

Sequential MACs
CBC-MAC
NMAC
How to pad

Next segment:

Parallel MAC

3.3 More constructions (Parallel or One-time MAC): PMAC and the Carter Wegman MAC

Parallel MAC
also converts a small PRF into a large PRF
but does it in a parallelizable fashion

Construction 3: PMAC - Parallel MAC

Let $F: K \times X \rightarrow X$ be a PRF
- Define new PRF $F_{PMAC}: K^2 \times X^{\leq L} \rightarrow X$

The construction (figure in the original post)
- P: a function P(k, i)
  - First: if we delete the P’s and feed each m[i] directly into F(k1, $\cdot$ )
    - the resulting MAC is completely insecure!
    - Reason: no order is enforced between the message blocks!
      - block swapping attack
      - swapping m[i] and m[j] does not change the resulting tag!!
      - Not secure!
  - P’s input: the key k and the block index i
  - P(k,i): an easy-to-compute function
- F: PRF
  - the last block does not even need to go through the PRF
- key = (k,k1)
- padding similar to CMAC

PMAC: Analysis

PMAC theorem:
- For any L > 0,
- if F is a secure PRF over (K,X,X) then
  - $F_{PMAC}$ is a secure PRF over (K, $X^{\leq L}$, X)
- For every eff. q-query PRF adv. A attacking $F_{PMAC}$
  - there exists an eff. PRF adversary B s.t.:
  - $\operatorname{Adv}_{\mathrm{PRF}}\left[\mathrm{A}, \mathrm{F}_{\mathrm{PMAC}}\right] \leq \operatorname{Adv}_{\mathrm{PRF}}[\mathrm{B}, \mathrm{F}]+2 \mathrm{q}^{2} \mathrm{~L}^{2} /|\mathrm{X}|$
PMAC is secure as long as $qL \ll |X|^{1/2}$
- so that $\mathrm{q}^{2} \mathrm{~L}^{2} /|\mathrm{X}|$ is negligible

PMAC is incremental

Suppose F is a PRP
When m[1] $\rightarrow$ m’[1]
- (one message block of this long message changes)
- Can we quickly update the tag?
  - For other MACs (e.g., CBC-MAC) we have to recompute the tag on the entire message! (O(N))
- Note: the function F is a PRP
  - Invertible!
Ans:
- compute $F^{-1}(k_1,\text{tag}) \oplus F\big(k_1,\ m[1] \oplus P(k,1)\big) \oplus F\big(k_1,\ m'[1] \oplus P(k,1)\big)$
- and apply $F(k_1,\cdot)$ to the result

Next:

Switch topics a little bit
talk about the concept of a one time MAC

One time MAC (analog of one time pad)

create a MAC used for the integrity of a single message
- every time we compute the MAC for a particular message, we also change the key!
- for applications that authenticate only a single message, a one-time MAC is useful
  - just like the one-time pad / stream cipher is useful!

Security:
- the attacker gets to see only one message
- only one chosen message query
  - then tries to forge a message/tag pair
- The attacker’s goal: the forged message/tag pair verifies correctly and is different from the given pair

Def: I=(S, V) is a secure one-time MAC if for all “efficient” A:
$Adv_{MAC}[A,I]$ = Pr[Chal. outputs 1] is negligible

One-time MAC: an example

Can be secure against all adversaries and faster than PRF-based MACs
Let q be a large prime (e.g., $q = 2^{128} + 51$)
- (q is slightly larger than the number of possible block values)
  - in this case we use 128-bit blocks
  - so q is chosen slightly larger than $2^{128}$
- key = (k,a) $\in \{1,2, \ldots, q\}^2$
  - two random ints. in [1,q]
- msg = (m[1],m[2], … , m[L])
  - where each block is 128 bits
- each block is then treated as an integer in the range 0 to $2^{128}-1$
$S(\text{key}, \text{msg})=P_{msg}(k)+a \pmod q$
- where $P_{msg}(x)=m[L] \cdot x^{L}+\ldots+m[1] \cdot x$ is a poly. of deg. L
The MAC:
- 1st: take the polynomial that corresponds to the message
  - evaluate it at the point k
    - k: one half of the secret key
- 2nd: add the value a
  - a: the second half of the secret key
- 3rd: reduce mod q to get the MAC
A fact:
- given S(key, $msg_1$), the adv. has no info about S(key, $msg_2$)
  - the MAC for one message tells you nothing about the MAC for another message!
- namely: there is no way of forging this MAC for another new message!
- Secure (as a one-time MAC)!
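
The polynomial MAC above fits in a few lines of Python using built-in big integers. This is a sketch under stated assumptions: the message is taken to be a whole number of 16-byte blocks read as big-endian integers, and `keygen` and `one_time_mac` are hypothetical names.

```python
import secrets

Q = 2**128 + 51  # the prime from the notes
BLOCK = 16       # 128-bit message blocks

def keygen() -> tuple[int, int]:
    """key = (k, a): two random integers in [1, q]."""
    return secrets.randbelow(Q) + 1, secrets.randbelow(Q) + 1

def one_time_mac(key: tuple[int, int], msg: bytes) -> int:
    """S(key, msg) = P_msg(k) + a (mod q), P_msg(x) = m[L]x^L + ... + m[1]x."""
    k, a = key
    if len(msg) % BLOCK != 0:
        raise ValueError("this sketch expects whole 128-bit blocks")
    blocks = [int.from_bytes(msg[i:i + BLOCK], "big")
              for i in range(0, len(msg), BLOCK)]
    # Horner's rule from m[L] down to m[1]: ((m[L]k + m[L-1])k + ...)k
    acc = 0
    for block in reversed(blocks):
        acc = (acc + block) * k % Q
    return (acc + a) % Q

# Tiny hand-checkable case: m[1] = 1, m[2] = 2, k = 3, a = 5
msg = (1).to_bytes(BLOCK, "big") + (2).to_bytes(BLOCK, "big")
print(one_time_mac((3, 5), msg))  # 2*3^2 + 1*3 + 5 = 26
```

Each block costs one modular multiplication and one addition, which is why this is much faster than running a PRF over the whole message. The key must never be reused for a second message.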

One-time MAC $\Rightarrow$ Many-time MAC

Let (S,V) be a secure one-time MAC over $(K_I, M, \{0,1 \}^{n})$
Let $F:K_F\times \{0,1 \}^{n} \rightarrow \{0,1 \}^{n}$ be a secure PRF
Carter-Wegman MAC:
- $CW((k_1,k_2),m) = (r, F(k_1,r)\oplus S(k_2,m))$
- for random $r\leftarrow \{0,1 \}^{n}$
- where
  - $F(k_1,r)$ : slow but short input
  - $S(k_2,m)$ : fast, long input
- Process:
  - 1 apply the one-time MAC to the message m
  - 2 encrypt the result using the PRF
    - How to encrypt the result?
      - choose a random r
        - a fresh r for every message is why the Carter-Wegman MAC can be used as a many-time MAC
      - then apply the PRF to this r and use the output as a one-time pad:
        $F(k_1,r)\oplus S(k_2,m)$
  - Summary:
    - the fast one-time MAC is applied to the long msg
      - e.g., gigabytes long
    - the slower PRF is only applied to the nonce r
Thm:
- If (S,V) is a secure one-time MAC and F a secure PRF, then CW is a secure MAC outputting tags in $\{0,1 \}^{2n}$

A Practice of Carter-Wegman MAC

$\operatorname{CW}\left(\left(k_{1}, k_{2}\right), m\right)=\left(r, F\left(k_{1}, r\right) \oplus S\left(k_{2}, m\right)\right)$

How would you verify a CW tag (r,t) on message m?
- Recall that $V(k_2, m, .)$ is the verification alg. for the one time MAC
Ans:
- Run $V(k_2, m, (F(k_1,r) \bigoplus t))$
  - where $(F(k_1,r)\bigoplus t)= S\left(k_{2}, m\right)$
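
Signing and verification can be sketched together. Several choices here are assumptions made only so the example is self-contained: the one-time MAC is the polynomial MAC with $q = 2^{128}+51$, the PRF is HMAC-SHA256 truncated to 17 bytes (so the pad covers values just above 128 bits), and all function names are hypothetical.

```python
import hashlib
import hmac
import secrets

Q = 2**128 + 51   # prime used by the one-time MAC in the notes
BLOCK = 16
TAG_LEN = 17      # bytes needed to hold a value below Q

def one_time_mac(key, msg: bytes) -> int:
    """Polynomial one-time MAC: P_msg(k) + a mod Q (whole blocks only)."""
    k, a = key
    acc = 0
    for i in reversed(range(0, len(msg), BLOCK)):
        acc = (acc + int.from_bytes(msg[i:i + BLOCK], "big")) * k % Q
    return (acc + a) % Q

def F(k1: bytes, r: bytes) -> int:
    """Stand-in PRF (assumption): HMAC-SHA256 truncated to TAG_LEN bytes."""
    digest = hmac.new(k1, r, hashlib.sha256).digest()
    return int.from_bytes(digest[:TAG_LEN], "big")

def cw_sign(k1: bytes, k2, m: bytes):
    """CW((k1, k2), m) = (r, F(k1, r) XOR S(k2, m)) for a fresh random r."""
    r = secrets.token_bytes(BLOCK)
    return r, F(k1, r) ^ one_time_mac(k2, m)

def cw_verify(k1: bytes, k2, m: bytes, tag) -> bool:
    """Unmask with F(k1, r), then compare against the one-time MAC of m."""
    r, t = tag
    return (F(k1, r) ^ t) == one_time_mac(k2, m)

k1, k2 = b"p" * 32, (123456789, 987654321)
tag = cw_sign(k1, k2, b"A" * 32)
print(cw_verify(k1, k2, b"A" * 32, tag), cw_verify(k1, k2, b"B" * 32, tag))
```

Signing the same message twice gives different tags because r is fresh each time, so the MAC is randomized and is not a PRF.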

Construction HMAC (Hash-MAC)

Most widely used MAC on the Internet
- But, we first need to discuss hash function

3.4 Collision Resistance 1: What is a collision resistant function

This module:

talk about a new concept: collision resistance
it plays important role in providing message integrity
Then build HMAC
which is built from collision resistant hash function

3.4.1 Introduction

Recap: message integrity

so far, four MAC constructions:

based on PRFs
- ECBC-MAC, CMAC: (sequential)
  - commonly used with AES
    - e.g., 802.11i
- NMAC: basis of HMAC (sequential)
  - (this segment)
- PMAC
  - a parallel MAC
randomized MAC (not a PRF)
- Carter-Wegman MAC: built from a fast one-time MAC

This module: create MACs from collision resistance
- The first step: construct collision resistant hash functions

Collision Resistance

What does it mean for a hash function to be collision resistant?

Let $H: M \rightarrow T$ be a hash function
- |M| >> |T|
A collision for H is a pair $m_0, m_1 \in M$ such that:
- $H(m_0)=H(m_1)$ and $m_0 \neq m_1$
Collisions are guaranteed to exist:
- because the input space is much larger than the output space
A function H is collision resistant if for all (explicit) eff. algs. A:
- “explicit”: it is not enough to just say that an algorithm exists
  - many collisions necessarily exist, so some algorithm that outputs one exists too; what matters is that nobody can explicitly write down such an algorithm A
- $Adv_{CR}[A,H]$ = Pr[A outputs collision for H]
- is “negligible”
Example: SHA-256 (outputs 256 bits)

MACs from Collision Resistance

An application of Collision Resistance:

How we can trivially build a MAC given a collision resistant hash function

Let I = (S,V) be a MAC for short messages over $(K, M, T)$ (e.g., AES)
Let $H:M^{big}\rightarrow M$
- a collision resistant hash function
Def: a new MAC $I^{big} = (S^{big}, V^{big})$ over $(K,M^{big},T)$ as:
- $S^{big}(k,m) = S(k,H(m))$
- $V^{big}(k,m,t) = V(k,H(m),t)$
The new MAC $I^{big} = (S^{big}, V^{big})$ is for long messages!
- The collision resistant hash function can be used to expand the input space!

Thm: If I is a secure MAC and H is collision resistant
- then $I^{big}$ is a secure MAC
Example:
- $S_{\text{2-block-cbc}}(k, \text{SHA-256}(m))$ is a secure MAC!

MACs from Collision Resistance

$S^{big}(k,m) = S(k,H(m))$
$V^{big}(k,m,t) = V(k,H(m),t)$

Collision resistance is necessary for security:
- Suppose the adversary can find $m_0 \neq m_1$, s.t.
  - $H(m_0)=H(m_1)$
- Then:
  - $S^{big}$ (the combined MAC) is insecure under a 1-chosen msg attack
  - step 1: the adversary asks for $t\leftarrow S^{big}(k,m_0)$
  - step 2: output $(m_1,t)$ as a forgery!
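
To see why collision resistance is necessary, the two-step attack can be run against a hash that is deliberately weak. In this sketch the weak hash is SHA-256 truncated to 16 bits and the short-message MAC is HMAC-SHA256; both are stand-ins chosen for illustration, and the function names are hypothetical.

```python
import hashlib
import hmac

def weak_hash(m: bytes) -> bytes:
    """Deliberately NOT collision resistant: only 16 bits of SHA-256."""
    return hashlib.sha256(m).digest()[:2]

def S(k: bytes, x: bytes) -> bytes:
    """Stand-in secure MAC for short inputs (assumption: HMAC-SHA256)."""
    return hmac.new(k, x, hashlib.sha256).digest()

def S_big(k: bytes, m: bytes) -> bytes:
    """S_big(k, m) = S(k, H(m))."""
    return S(k, weak_hash(m))

def V_big(k: bytes, m: bytes, t: bytes) -> bool:
    return hmac.compare_digest(t, S_big(k, m))

def find_collision():
    """Try counter-valued messages until two share a weak_hash value."""
    seen = {}
    i = 0
    while True:
        m = str(i).encode()
        h = weak_hash(m)
        if h in seen:
            return seen[h], m
        seen[h] = m
        i += 1

key = b"k" * 32
m0, m1 = find_collision()        # needs no key at all
t = S_big(key, m0)               # step 1: one chosen-message query on m0
print(m0 != m1 and V_big(key, m1, t))  # step 2: (m1, t) verifies
```

The inner MAC is never attacked; the forgery works purely because the two messages hash to the same short input.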

therefore:

Collision resistance is a very useful primitive!

Protecting file integrity using C.R. hash

Software packages:

target:
- ensure that the packages a user downloads are the correct ones, not some version that an attacker tampered with

Methods:
- rely on a read-only public space
  - this space holds small hashes of the software packages
    - so the space can be small
  - read-only:
    - the attacker cannot modify the hashes stored in this space
- When a user downloads a package, the user can verify that its contents are valid!
- H collision resistant
  - $\Rightarrow$ attacker cannot modify the package without detection
- no key needed
  - (public verifiability: everyone can verify the package)
  - but requires read-only space
  - Later: we will see that digital signatures provide public verifiability without needing read-only (extra) space
This segment:

introduce collision resistance

Next segment:

talk about generic attack on collision resistance

3.4.2 Generic birthday attack

In block ciphers:

There is the exhaustive search attack
which puts a lower bound on the key length.

Similarly, for collision resistance:

There is a general attack called the birthday attack
which forces the output of collision resistant hash functions to be longer than a certain bound

Generic attack on C.R. functions

Let H: $M\rightarrow \{0,1 \}^{n}$ be a hash function
- |M| >> $2^n$
Generic alg. to find a collision in time $O(2^{n/2})$ hashes!

Algorithm:

Choose $2^{n/2}$ random messages in M:
- $m_{1}, \ldots, m_{2^{n / 2}}$
- distinct w.h.p.
For $i=1, \ldots, 2^{n / 2}$ compute $t_i = H(m_i) \in \{0,1 \}^{n}$
Look for a collision $(t_i = t_j)$
- If not found, go back to the first step

How well will this work?
- Ans: the expected number of iterations is very small!
- which means the time complexity of finding a collision is about $O(2^{n/2})$

The birthday paradox

Let $r_{1}, \ldots, r_{n} \in\{1, \ldots, B\}$ be independent identically distributed integers
Thm: when $n = 1.2 \times B^{1 / 2}$,
- then $\Pr[\exists i \neq j: r_i = r_j] \geq 1/2$
  - in fact, $n = 1.2 \times B^{1 / 2}$ is the worst case
    - it is what is needed when the $r_i$ are sampled uniformly
  - if the $r_i$ are not sampled uniformly, an even smaller $n$ suffices
Proof:
- for uniform indep. $r_{1}, \ldots, r_{n}$
- $\Pr[\exists i \neq j: r_i = r_j] = 1- \Pr[\forall i \neq j: r_i \neq r_j] = 1- \left(\frac{B-1}{B}\right)\left(\frac{B-2}{B}\right)\cdots\left(\frac{B-n+1}{B}\right)$
- $= 1 - \prod_{i=1}^{n-1}\left(1-\frac{i}{B}\right) \geq 1- \prod_{i=1}^{n-1} e^{-i/B}$
  - using $1-x \leq e^{-x} = 1- x + \frac{x^2}{2} - \ldots$
- $=1-e^{-\frac{1}{B}\sum_{i=1}^{n-1}i }$
- $\approx 1 - e^{-n^2/2B}$, since $\sum_{i=1}^{n-1} i = n(n-1)/2 \approx n^2/2$
- when $n = 1.2 \times B^{1 / 2}$, this equals
- $1 - e^{-0.72} \approx 0.51 > 1/2$
Why is it called a “paradox”?
- because it is counter-intuitive how few samples are needed: the square root function grows very slowly
- for birthdays:
  - among just $1.2 \times \sqrt{365} \approx 23$ people, the probability that two of them share a birthday already exceeds one half
  - amazing (paradoxical)
  - (since birth dates are actually not uniform, the actual bound is even smaller than $1.2 \times \sqrt{365}$)
The graph of the collision probability when $B=10^6$
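
The numbers in the proof are easy to check numerically. This small sketch compares the exact collision probability with the approximation $1 - e^{-n^2/2B}$; the function names are illustrative.

```python
import math

def collision_probability(n: int, B: int) -> float:
    """Exact Pr[some two of n uniform samples from B values coincide]."""
    p_distinct = 1.0
    for i in range(1, n):
        p_distinct *= (B - i) / B
    return 1.0 - p_distinct

def approximation(n: int, B: int) -> float:
    """The estimate 1 - exp(-n^2 / 2B) used in the proof."""
    return 1.0 - math.exp(-n * n / (2 * B))

for n in (22, 23):
    print(n, round(collision_probability(n, 365), 4),
          round(approximation(n, 365), 4))
```

For B = 365 the exact probability crosses one half between n = 22 and n = 23, matching the $1.2 \times \sqrt{365} \approx 23$ rule of thumb.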

Generic attack

H: $M\rightarrow \{0,1 \}^{n}$
Collision finding algorithm:
1. Choose $2^{n/2}$ random messages in M: $m_{1}, \ldots, m_{2^{n / 2}}$
2. For $i=1, \ldots, 2^{n / 2}$ compute $t_i = H(m_i) \in \{0,1 \}^{n}$
3. Look for a collision $(t_i = t_j)$. Go back to step 1 if not found
Q: What is the expected number of iterations?
Ans: about 2
- since each iteration succeeds with probability about 1/2
- therefore: about 2 iterations!
Running time: $O(2^{n/2})$
- Space: $O(2^{n/2})$
Sample C.R. hash functions:

Measured on: AMD Opteron, 2.2 GHz
- Linux

speed:
- how much data can be hashed per second
- the bigger the output size, the slower the algorithm is
- But security is more important!
Best known collision finder for SHA-1 requires $2^{51}$ hash evaluations

Next segment:

building collision resistant function

3.5 Collision Resistance 2: constructions

3.5.1 The Merkle-Damgard Paradigm

This segment:

look at a very general paradigm: the Merkle-Damgard paradigm
used for constructing collision-resistant hash functions

Collision resistance: review

Let H: $M\rightarrow T$ be a hash function
- |M| >> |T|
A collision for H is a pair $m_0, m_1 \in M$ such that:
- $H(m_0) = H(m_1)$ and $m_0 \neq m_1$
Our goal: construct collision resistant (C.R.) hash functions
- even though many collisions exist, no efficient algorithm can output even a single collision
Step 1: Given a C.R. function for short messages
- construct a C.R. function for long messages
- will be done in this segment!

Next segment:

step 2:
build C.R. functions for short messages (compression functions)

The Merkle-Damgard iterated construction

Note:
- IV is fixed:
  - IV is basically embedded in the code and in the standards
  - just a fixed value as part of the definition of the function
- each intermediate output of h: chaining variables
- PB in the last block:
  - padding block
Given $h:T\times X \rightarrow T$
- compression function
we obtain $H:X^{\leq L} \rightarrow T$
- up to L blocks of X
- output t: a tag in the tag space T
- $H_i$ : chaining variables
PB: padding block
- 1000..00 || (msg length)
- The “msg length” field has 64 bits
  - so the maximum message length in the Merkle-Damgard hash function is $2^{64}-1$ bits
- If there is no space for PB: add another block
MD collision resistance

Thm: if h is collision resistant then so is H.

Proof:
- prove the contrapositive: collision on H $\Rightarrow$ collision on h
- Suppose H(M) = H(M’) for some $M \neq M'$
  - then we build a collision for h
- Recall how H works:
  - IV = $H_0$ -> $H_1$ -> $H_2$ -> … -> $H_t$ -> $H_{t+1}$ = H(M)
  - IV = $H_0'$ -> $H_1'$ -> $H_2'$ -> … -> $H_r'$ -> $H_{r+1}'$ = H(M’)
  - the two chains do not have to have the same length
- Therefore, if H(M) = H(M’), then $H_{t+1}$ = $H_{r+1}'$
  - $h\left(H_{t}, M_{t} \| P B\right)=H_{t+1}=H_{r+1}^{\prime}=h\left(H_{r}^{\prime}, M_{r}^{\prime} \| P B^{\prime}\right)$
- Consider $h\left(H_{t}, M_{t} \| P B\right) = h\left(H_{r}^{\prime}, M_{r}^{\prime} \| P B^{\prime}\right)$
  - if $H_t \neq H_r'$ or $M_t \neq M_r'$ or $PB \neq PB'$
    - then we have found a collision for h!
- What if none of these inequalities holds?
  - i.e., $H_{t}=H_{r}^{\prime}$ and $M_{t}=M_{r}^{\prime}$ and $PB=PB^{\prime}$: what then?
  - $PB=PB^{\prime}$ $\Rightarrow$ t = r (M and M’ have the same length)
  - Then $h\left(H_{t-1}, M_{t-1}\right)=H_{t}=H_{t}^{\prime}=h\left(H_{t-1}^{\prime}, M_{t-1}^{\prime}\right)$
    - if the two inputs differ, we have again found a collision for h
      - if they are equal, repeat the argument one block earlier
    - unless M = M’ (which contradicts the assumption), we always find a collision for h
      - i.e., there are only two possible outcomes:
        (1) we find a collision for h, or
        (2) $\forall i: M_i = M_i' \Rightarrow M = M'$
- Done!
Therefore:
- To construct a C.R. function,
  - it suffices to construct a C.R. compression function
As long as h is C.R., we can build a C.R. H

Next segment:

how to build a collision resistant h

3.5.2 Constructing Compression Functions

Our goal for this segment:

build a secure (collision resistant) compression function

The Merkle-Damgard iterated construction

Thm: h collision resistant $\Rightarrow$ H collision resistant
Goal: construct compression function $h:T\times X \rightarrow T$

Next:

see a couple of constructions

Compression function from a block cipher

$E: K \times\{0,1\}^{n} \longrightarrow\{0,1\}^{n}$ a block cipher
The Davies-Meyer compression function:
- $h(H, m) = E(m, H) \oplus H$
- encrypt the chaining variable using the message block as the key, then XOR the chaining variable back in

Thm: Suppose E is an ideal cipher
- (a collection of |K| random permutations)
- Finding a collision h(H,m) = h(H’,m’) takes $O(2^{n/2})$ evaluations of (E,D)
- this is the best possible, because there is always a generic birthday attack

Davies-Meyer compression function is very popular
- the SHA functions all use Davies-Meyer

Example

Suppose we define h(H,m) = E(m,H)
- i.e., Davies-Meyer without the final $\oplus H$
- we show below that h(H,m) = E(m,H) is not collision resistant
Then the resulting h(.,.) is not collision resistant:
- to build a collision (H,m) and (H’,m’)
- choose random (H,m,m’) and construct H’ as follows:

Ans:
- Just let H’=D(m’,E(m,H))
  - then E(m’,H’) = E(m,H)
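
The collision can be demonstrated with any invertible cipher. Since the Python standard library has no block cipher, this sketch uses a toy 4-round Feistel network keyed through SHA-256. That cipher is purely hypothetical and for illustration only; the point is just that D exists and is public.

```python
import hashlib

HALF = 8  # bytes; the toy cipher works on 16-byte blocks

def _round(key: bytes, i: int, half: bytes) -> bytes:
    """Round function of the toy Feistel cipher."""
    return hashlib.sha256(key + bytes([i]) + half).digest()[:HALF]

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def E(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel 'block cipher' (hypothetical, illustration only)."""
    left, right = block[:HALF], block[HALF:]
    for i in range(4):
        left, right = right, _xor(left, _round(key, i, right))
    return left + right

def D(key: bytes, block: bytes) -> bytes:
    """Inverse of E: undo the rounds in reverse order."""
    left, right = block[:HALF], block[HALF:]
    for i in reversed(range(4)):
        left, right = _xor(right, _round(key, i, left)), left
    return left + right

def h_bad(H: bytes, m: bytes) -> bytes:
    """The broken compression function h(H, m) = E(m, H), with no XOR of H."""
    return E(m, H)

H, m, m2 = b"H" * 16, b"message-block-1", b"message-block-2"
H2 = D(m2, E(m, H))          # H' = D(m', E(m, H))
print(h_bad(H, m) == h_bad(H2, m2), (H, m) != (H2, m2))
```

With the Davies-Meyer XOR in place this trick fails, because decrypting no longer lets the attacker solve for a matching H’.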

Other block cipher constructions

Let $E:\{0,1\}^{n} \times\{0,1\}^{n} \longrightarrow\{0,1\}^{n}$ for simplicity
Miyaguchi-Preneel:
- $h(H,m) = E(m,H) \oplus H \oplus m$ (Whirlpool)
- $h(H,m) = E(H\oplus m, m) \oplus m$
  - a total of 12 variants like this
Other natural variants are insecure
- the 12 variants above are fine, the others are not
- e.g.: $h(H,m) = E(m,H) \oplus m$ (homework)

Now:

We have all the ingredients to describe the SHA-256 hash function

Case study: SHA-256

Merkle-Damgard function
Davies-Meyer compression function
Block cipher: SHACAL-2
- the underlying block cipher for Davies-Meyer
- the 512-bit key of SHACAL-2: msg
- 256-bit block of SHACAL-2: the chaining variable
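The two sizes quoted above can be read off Python's `hashlib`, which exposes the message block size and the digest size of SHA-256. This is only a quick sanity check of the parameters, not a view into SHACAL-2 itself.

```python
import hashlib

sha = hashlib.sha256()

# Message block absorbed per compression call (the SHACAL-2 key in Davies-Meyer).
block_bits = sha.block_size * 8

# Chaining variable / final output size (the SHACAL-2 block).
digest_bits = sha.digest_size * 8

print(block_bits, digest_bits)
```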

[Figure unavailable: http://qiniu.ruixu.top/1650071158974----cryptographyI_csdnimg.png]

Compression functions:

One class is built from block ciphers (covered above)
Another class is built from hard problems in number theory
We briefly show one example in the following

Provable compression function

- provable: if you can find a collision for this compression function, then you can solve a very hard number-theoretic problem that is believed to be intractable
Choose a random 2000-bit prime p and random $1 \leq u,v \leq p$
For $m, H \in \{0,1,2, \ldots , p-1\}$ define:
- $h(H, m)=u^{H} \cdot v^{m} \pmod p$
- takes two numbers in 0 to p-1 as input and outputs one number in 0 to p-1
  - compression ratio is 2.
Fact:
- finding collision for $h (., .)$ is as hard as
- solving “discrete-log” modulo p
- Proof:
  - comes later, when we get to the number-theoretic part of the course!
Problem with this method:
- slow!
- therefore
  - not really used for any compression functions
  - but if the message to sign is short and plenty of time is available, it can still be used, because it is provable
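A toy-sized sketch of the link between collisions and discrete log, under loud assumptions: the prime is tiny (`P = 1019` instead of 2000 bits), `U` and `V` are taken in the subgroup of prime order `Q = 509`, and inputs are restricted to `0..Q-1` so that the exponent arithmetic is cleanly mod `Q`. The secret exponent `A` is used only to manufacture a collision for the demo; the point is that any collision with different m values hands back `A`.

```python
# Toy parameters (assumptions for illustration; the lecture uses a 2000-bit prime).
P = 1019            # small safe prime, P = 2*Q + 1
Q = 509             # prime order of the subgroup of squares mod P
U = 4               # 2^2 is a square, so U has order Q
A = 123             # secret discrete log of V to base U
V = pow(U, A, P)    # V = U^A mod P

def h(H, m):
    """h(H, m) = u^H * v^m mod p."""
    return (pow(U, H, P) * pow(V, m, P)) % P

def dlog_from_collision(H1, m1, H2, m2):
    # u^H1 v^m1 = u^H2 v^m2  =>  u^(H1-H2) = v^(m2-m1)
    # =>  A = (H1 - H2) / (m2 - m1) mod Q   (needs m1 != m2 mod Q)
    return ((H1 - H2) * pow(m2 - m1, -1, Q)) % Q

# Manufacture a collision using the trapdoor A: shift m by 1, compensate in H.
H1, m1 = 10, 20
m2 = m1 + 1
H2 = (H1 - A * (m2 - m1)) % Q

recovered = dlog_from_collision(H1, m1, H2, m2)
```

Without knowledge of `A`, finding such a pair is exactly what is believed to be hard at real parameter sizes.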

Next segment:

going to talk about HMAC

3.6 A MAC from a hash function

3.6.1 HMAC: a MAC from SHA-256

The Merkle-Damgard iterated construction

Thm: h collision resistant $\Rightarrow$ H collision resistant

Can we use H(.) to directly build a MAC?

without having to rely on a PRF

MAC from a Merkle-Damgard Hash Function

$\mathrm{H}: \mathrm{X}^{\leq \mathrm{L}} \longrightarrow \mathrm{T}$ a C.R. Merkle-Damgard Hash Function
Attempt #1: S(k,m) = H(k||m)
This MAC is insecure. Prove this!

Ans: Given H(k||m) anyone can compute H(k||m||PB||w) for any w!

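Here PB is the padding block that Merkle-Damgard appends to k||m. The adversary picks any w, forms the message (m||PB||w), and computes its tag H(k||m||PB||w) from H(k||m) without knowing k, which is an existential forgery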
so this is totally insecure and should never be used
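The extension attack can be sketched on a toy Merkle-Damgard hash. `hashlib` does not expose SHA-256's internal chaining state, so the code below assumes a stand-in construction: a compression function `compress` built from SHA-256, a 64-byte block, and a simple length padding `pad`. All names are hypothetical, and the attacker is assumed to know the length of k||m.

```python
import hashlib
import struct

BLOCK = 64          # block size in bytes (assumption for this toy hash)
IV = bytes(32)      # fixed initial chaining value

def compress(H, block):
    # Stand-in compression function h(H, block).
    return hashlib.sha256(H + block).digest()

def pad(total_len):
    # PB: 0x80, zero bytes, then the 8-byte message length.
    zeros = (-(total_len + 1 + 8)) % BLOCK
    return b"\x80" + bytes(zeros) + struct.pack(">Q", total_len)

def md_hash(msg, state=IV, prefix_len=0):
    # prefix_len counts bytes already absorbed into `state` (a multiple of BLOCK).
    data = msg + pad(prefix_len + len(msg))
    for i in range(0, len(data), BLOCK):
        state = compress(state, data[i:i + BLOCK])
    return state

def extend(tag, known_len, w):
    """Given tag = H(k||m) and len(k||m), return (PB, H(k||m||PB||w))."""
    pb = pad(known_len)
    # Resume hashing from the tag, which is the chaining value after k||m||PB.
    forged_tag = md_hash(w, state=tag, prefix_len=known_len + len(pb))
    return pb, forged_tag

# Honest side: tag on m under the insecure MAC S(k, m) = H(k || m).
k = b"secret-key-16byt"
m = b"amount=10"
tag = md_hash(k + m)

# Attacker side: never touches k, only its length.
w = b"&amount=1000000"
pb, forged_tag = extend(tag, len(k) + len(m), w)
forged_msg = m + pb + w
```

The forged pair `(forged_msg, forged_tag)` verifies under the key even though the attacker only ever saw `m` and `tag`.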

Standardized method: HMAC (Hash-MAC)

Most widely used MAC on the Internet
H: hash function
- example: SHA-256
- output is 256 bits
  - the 256 bits can be regarded as pseudorandom numbers!
Building a MAC out of a hash function:
- HMAC: $S(k,m) = H(k \oplus \text{opad} \,\|\, H(k \oplus \text{ipad} \,\|\, m))$
- 1st: XOR k with an internal pad (ipad)
  - the result forms one block of the Merkle-Damgard construction
    - would be 512 bits in the case of SHA-256
- 2nd: append the message $m$ to $k \oplus \text{ipad}$
- 3rd: hash!
- 4th: prepend $k \oplus \text{opad}$ to the result and hash once more with SHA-256
HMAC in pictures:

[Figure unavailable: http://qiniu.ruixu.top/1650073014663----cryptographyI_csdnimg.png]

Similar to the NMAC PRF
- main difference: the two keys k1 and k2 are dependent
- as shown in the figure below

[Figure unavailable]

ipad and opad are constants specified in the HMAC standard
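The construction can be written out in a few lines and compared against Python's built-in `hmac` module. This is a minimal sketch for HMAC-SHA256; the function name `hmac_sha256` is illustrative, and the pad bytes 0x36 and 0x5c are the constants from the HMAC standard (RFC 2104).

```python
import hashlib
import hmac

def hmac_sha256(key, msg):
    block = 64                                   # SHA-256 block size in bytes
    if len(key) > block:
        key = hashlib.sha256(key).digest()       # long keys are hashed first
    key = key.ljust(block, b"\x00")              # then zero-padded to one block
    k_ipad = bytes(b ^ 0x36 for b in key)        # k XOR ipad
    k_opad = bytes(b ^ 0x5C for b in key)        # k XOR opad
    inner = hashlib.sha256(k_ipad + msg).digest()
    return hashlib.sha256(k_opad + inner).digest()

key = b"my secret key"
msg = b"hello, world"
tag = hmac_sha256(key, msg)
reference = hmac.new(key, msg, hashlib.sha256).digest()
```

In practice one would call `hmac.new` directly; the hand-written version is only there to make the two nested hash calls visible.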

HMAC properties

HMAC is assumed to be a secure PRF
- Can be proven under certain PRF assumptions about h(.,.)
- Security bounds similar to NMAC
  - Need $q^2/|T|$ to be negligible ($q \ll |T|^{1/2}$)
    - q: the number of messages you are MACing
    - T: the output tag space
In TLS: must support HMAC-SHA1-96
- SHA1: SHA1 hash function, output 160bits
- -96: truncated to 96 bits
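A short sketch of what the "-96" means in practice, assuming truncation keeps the first 12 bytes of the 20-byte HMAC-SHA1 output. The helper names are hypothetical.

```python
import hashlib
import hmac

def hmac_sha1_96(key, msg):
    full = hmac.new(key, msg, hashlib.sha1).digest()   # 160-bit (20-byte) tag
    return full[:12]                                   # keep the first 96 bits

def verify_sha1_96(key, msg, tag):
    # The verifier recomputes the truncated tag and compares in constant time.
    return hmac.compare_digest(hmac_sha1_96(key, msg), tag)

key = b"shared key"
msg = b"record payload"
short_tag = hmac_sha1_96(key, msg)
```

Truncation is safe here for the reason given earlier in the notes: a truncated PRF is still a PRF, as long as the remaining tag is long enough.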

Next segment:

timing attack on HMAC

3.6.2 Timing attacks on MAC verification

a general attack that affects many implementations of MAC alg.
a lesson for us!

Warning: verification timing attacks

Example: Keyczar crypto library (Python)
- e.g., HMAC(key,msg) and sig_bytes are both 16 bytes

def Verify(key, msg, sig_bytes):
  return HMAC(key,msg) == sig_bytes

The problem: ‘==’ implemented as a byte-by-byte comparison
- comparator returns false when first inequality found

How to attack?

Warning: verification timing attacks

[Figure unavailable: http://qiniu.ruixu.top/1650074120941----cryptographyI_csdnimg.png]

Timing attack: to compute tag for target message m do:
- Available resource: an HMAC verification server that holds the secret key
- Step1: Query server with random tag
- Step2: Loop over all possible first bytes and query server
  - Stop when verification takes longer than in step 1
- Step3: repeat for all tag bytes until valid tag found
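The three steps can be simulated without measuring real time. In the sketch below the server returns the number of byte comparisons it performed, which stands in for the response time an attacker would measure (that leak is the simulation's assumption; a real attack needs many noisy timing samples per guess). Names such as `make_leaky_verifier` and `recover_tag` are hypothetical.

```python
import hashlib
import hmac

def make_leaky_verifier(key):
    """Server with an early-exit comparison; `steps` models elapsed time."""
    def verify(msg, tag):
        expected = hmac.new(key, msg, hashlib.sha256).digest()[:16]
        steps = 0
        if len(tag) != len(expected):
            return False, steps
        for x, y in zip(expected, tag):
            steps += 1
            if x != y:
                return False, steps      # returns at the first mismatch
        return True, steps
    return verify

def recover_tag(verify, msg, tag_len=16):
    known = b""
    for i in range(tag_len):
        for guess in range(256):
            # Known prefix, one guessed byte, zero filler for the rest.
            candidate = known + bytes([guess]) + bytes(tag_len - i - 1)
            ok, steps = verify(msg, candidate)
            # A correct byte makes the server compare at least one byte more.
            if ok or steps > i + 1:
                known += bytes([guess])
                break
    return known

server = make_leaky_verifier(b"server-side secret")
forged = recover_tag(server, b"target message")
```

The attacker needs at most 16 x 256 queries instead of the $2^{128}$ guesses a 16-byte tag should require.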

How to defend?

Defense #1

Make string comparator always take same time (python)
- in fact, Keyczar lib. exactly implemented this defense

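def Verify(key, msg, sig_bytes):
  mac = HMAC(key, msg)
  # reject tags of the wrong length up front
  if len(sig_bytes) != len(mac):
    return False
  result = 0
  # zip: create pairs between the bytes of HMAC(key,msg) and sig_bytes
  for x, y in zip(mac, sig_bytes):
    # OR together the XOR of every byte pair, no early exit (ord: Python 2 str)
    result |= ord(x) ^ ord(y)
  return result == 0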

can be difficult to ensure due to optimizing compiler
- an optimizing compiler may turn the loop back into an early exit, breaking out of the for loop at the first unequal byte pair!
- which defeats the defense
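In current Python, a safer route than hand-rolling the loop is the standard library's `hmac.compare_digest`, which is documented as a comparison designed to resist timing analysis. A minimal sketch, assuming HMAC-SHA256 as the MAC (the function name `verify` is illustrative):

```python
import hashlib
import hmac

def verify(key, msg, sig_bytes):
    expected = hmac.new(key, msg, hashlib.sha256).digest()
    # compare_digest does not stop at the first mismatching byte.
    return hmac.compare_digest(expected, sig_bytes)

key = b"k" * 32
msg = b"some message"
good_tag = hmac.new(key, msg, hashlib.sha256).digest()
```

Relying on a vetted library primitive also sidesteps the question of what a compiler or interpreter does to a hand-written loop.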

Defense #2

Make string comparator always take same time (python)
- hide the strings that are actually being compared

def Verify(key, msg, sig_bytes):
  mac = HMAC(key,msg)
  return HMAC(key,mac) == HMAC(key,sig_bytes)

Attacker does not know values being compared!

Lesson:

Mistakes can creep in anywhere when implementing cryptographic algorithms
Be very cautious about implementing crypto yourself!
