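This series is a set of study notes for Professor Dan Boneh's Stanford course "Cryptography I"
Course URL: http://www.coursera.org/lecture/crypto/course-overview-lboqg
The content is updated in sync on CSDN, Zhihu, and a WeChat official account
- The Markdown source files are not open-sourced yet; contact me by email if you need them
- The notes inevitably contain mistakes; corrections by email are welcome
The complete course outline is as follows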
- 课程大纲
- 0 Introduction
- 1 Stream Ciphers
- 2 Block Ciphers
- 3 Message Integrity
- 4 Authenticated Encryption
- 5 Basic Key Exchange
- 6 Public-Key Encryption
This article covers Chapter 3, Message Integrity, including:
Contents
- 3 Message Integrity
- 3.1 Message Integrity: Definitions
- 3.2 Message Integrity 2: Construction (Sequential MAC Construction)
- 3.3 More constructions (Parallel or One-time MAC): PMAC and the Carter Wegman MAC
- 3.4 Collision Resistance 1: What is a collision resistant function
- 3.5 Collision Resistance 2: constructions
- 3.6 A MAC from a hash function
3 Message Integrity
This chapter:
- First:
- stop talking about encryption
- instead, discuss Message Integrity
- Next:
- come back to encryption
- show how to provide both encryption and integrity
3.1 Message Integrity: Definitions
3.1.1 Message Authentication Codes
Message Integrity
- Goal: integrity, no confidentiality
- Examples:
  - Protecting public binaries on disk
    - such as operating system files on your disk
      - they are not confidential
      - but it is important to make sure that they are not modified by a virus or other malware
  - Protecting banner ads on web pages
    - The provider of the ads does not care at all if someone copies them
      - no confidentiality is needed at all
    - But they do care about someone modifying those ads
      - this is where integrity matters
Message integrity: MACs
- MAC: message authentication code
- Alice sends a message to Bob
  - integrity: make sure that an attacker along the way cannot modify this message
- Alice: generate tag
  - using a MAC signing algorithm
  - $tag \leftarrow S(k,m)$
    - input: the key and the message
    - output: a very short tag
      - e.g., 90 or 100 bits
      - even though the message may be gigabytes long
  - Then append the tag to the message
  - and send the combination of the two to Bob
- Bob: verify tag
  - using a MAC verification algorithm on this tag
  - input: k, m, tag
  - output: yes / no
- Def:
  - A MAC $I = (S,V)$ defined over $(K,M,T)$ is a pair of algorithms:
    - S(k,m) outputs t in T
    - V(k,m,t) outputs “yes” or “no”
  - Consistency requirement: $\forall k \in \mathcal{K},\ m \in \mathcal{M}$: $V(k,m,S(k,m)) = \text{“yes”}$
Integrity requires a secret key
- If we use a CRC (which is keyless) instead of a keyed MAC:
  - Bob checks: tag == CRC(m)
  - The attacker can easily modify the message m and re-compute the CRC
    - i.e., send (m' || CRC(m'))
  - CRC is designed to detect random errors, not malicious ones
- By introducing the key, Alice can do something that the attacker cannot do!
  - which is what makes integrity achievable
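To make the CRC weakness concrete, here is a minimal Python sketch. It uses `zlib.crc32` from the standard library as the keyless checksum; the function names `crc_tag` and `bob_accepts` and the sample messages are made up for illustration.

```python
import zlib

def crc_tag(message: bytes) -> int:
    """Keyless 'tag': the CRC-32 checksum of the message."""
    return zlib.crc32(message)

def bob_accepts(message: bytes, tag: int) -> bool:
    """Bob's check: recompute the CRC and compare."""
    return zlib.crc32(message) == tag

# Alice sends (m, CRC(m)).
m = b"pay Bob 100"
t = crc_tag(m)

# The attacker replaces the message and recomputes the CRC; no key is needed.
m_forged = b"pay Eve 100"
t_forged = crc_tag(m_forged)

print(bob_accepts(m, t))                # True
print(bob_accepts(m_forged, t_forged))  # True: the forgery is accepted
```

Because anyone can compute `crc_tag`, Bob has no way to tell Alice's pair from the attacker's pair.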
Secure MACs
- Attacker's power:
  - chosen message attack
    - the attacker can give Alice arbitrary messages of his choice
    - and Alice will compute the tag for the attacker
    - Why would Alice do that?
      - In practice, this is normal
      - For example, with email: the attacker sends an email to Alice, Alice attaches a tag to it for safety, and the attacker then obtains that tag
    - for $m_1, m_2, \ldots, m_q$ the attacker is given $t_i \leftarrow S(k,m_i)$
  - Security must hold even for a message that is complete gibberish (meaningless data)
    - Why? Gibberish seems to have no value to the attacker
    - But suppose that what we want to send is a random secret key
      - it looks like gibberish
      - yet with a forgery the attacker could fool a user into using the wrong secret key!
- Attacker's goal:
  - existential forgery
    - produce some new valid message/tag pair (m,t)
    - $(m,t) \notin \{(m_1,t_1), \ldots, (m_q,t_q)\}$
  - $\Rightarrow$ the attacker cannot produce a valid tag for a new message
  - $\Rightarrow$ given (m,t), the attacker cannot even produce (m,t') for $t' \neq t$
    - if the attacker has a tag t for a message m, he must not be able to produce another tag t' for the same message
    - Because: there are many applications where it is really important that the attacker cannot produce a new tag for a previously signed message
    - In particular, when we combine encryption and integrity
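- More precise definition
  - For a MAC I = (S,V) and an adversary A, define a MAC game as:
    - the challenger picks a random key k; A submits messages $m_1, \ldots, m_q$ and receives $t_i \leftarrow S(k,m_i)$
    - A outputs a pair (m,t); the challenger outputs 1 if V(k,m,t) = “yes” and $(m,t) \notin \{(m_1,t_1), \ldots, (m_q,t_q)\}$, and 0 otherwise
  - Def: $I=(S,V)$ is a secure MAC if for all “efficient” A:
    - $Adv_{MAC}[A,I] = \Pr[\text{Chal. outputs 1}]$ is “negligible”
    - i.e., no efficient adversary can win this game with non-negligible probability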
Example on the MAC security
Example 1
- Let $I=(S,V)$ be a MAC
- Suppose an attacker is able to find $m_0 \neq m_1$ such that
  - $S(k,m_0)=S(k,m_1)$ for 1/2 of the keys k in K
- Can this MAC be secure?
- Ans: No, this MAC can be broken using a chosen message attack!
  - the attacker asks for the tag on $m_0$
  - and receives $(m_0,t)$
  - the attacker then outputs $(m_1,t)$ as his existential forgery
  - $(m_0,t)$ is different from $(m_1,t)$, so the forgery is a new pair
  - So the advantage of the attack is 1/2
    - which is non-negligible
Example 2
- Let $I=(S,V)$ be a MAC
- Suppose S(k,m) is always 5 bits long
- Can this MAC be secure?
- Ans:
  - No, an attacker can simply guess the tag for messages
  - What will the attacker do?
    - ask no queries!
    - just output an existential forgery as follows:
      - choose a random tag $t \xleftarrow{R} \{0,1\}^5$
      - output: (0,t)
  - The advantage is $1/2^5 = 1/32$
    - which is non-negligible
- The MAC tag must not be too short
  - typical tag lengths: 64, 96, or 128 bits
Example: Protecting system files
- Suppose at install time the system computes:
  - a key k derived from the user's password
  - a tag for each one of the files
  - then erases the key k
    - the key k is no longer stored on disk
- Later, a virus infects the system and modifies system files
  - How can the user detect which files have been tampered with?
- The user reboots into a clean OS and supplies his password
  - Then, secure MAC $\Rightarrow$ all modified files will be detected
    - meaning the virus could not create a new pair (F', t') that fools the MAC verification algorithm
Next:
- Try to build a secure MAC algorithm
3.1.2 MAC based on PRFs
Review: Secure MAC
- MAC $I = (S,V)$ defined over $(K,M,T)$ is a pair of algorithms:
  - Signing S(k,m) outputs t in T
  - Verification V(k,m,t) outputs “yes” or “no”
- Attacker's power:
  - chosen message attack
    - for $m_1, m_2, \ldots, m_q$ the attacker is given $t_i \leftarrow S(k,m_i)$
- Attacker's goal:
  - existential forgery
    - produce some new valid message/tag pair (m,t)
    - $(m,t) \notin \{(m_1,t_1), \ldots, (m_q,t_q)\}$
  - $\Rightarrow$ cannot produce a valid tag for a new message
How to build a MAC?
Secure PRF $\Rightarrow$ Secure MAC
- For a PRF $F: K \times X \rightarrow Y$ define a MAC $I_F = (S,V)$ as:
  - $S(k,m) := F(k,m)$
  - $V(k,m,t)$: output ‘yes’ if $t = F(k,m)$ and ‘no’ otherwise
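The PRF-based MAC above can be sketched in a few lines of Python. As an assumption for illustration only, HMAC-SHA256 from the standard library stands in for the PRF F; the notes themselves do not fix a concrete F at this point.

```python
import hashlib
import hmac

def F(k: bytes, m: bytes) -> bytes:
    """Stand-in PRF (an assumption): HMAC-SHA256 keyed with k."""
    return hmac.new(k, m, hashlib.sha256).digest()

def S(k: bytes, m: bytes) -> bytes:
    """Signing: the tag is simply F(k, m)."""
    return F(k, m)

def V(k: bytes, m: bytes, t: bytes) -> bool:
    """Verification: recompute F(k, m) and compare in constant time."""
    return hmac.compare_digest(F(k, m), t)

key = b"\x01" * 32
msg = b"hello"
tag = S(key, msg)
print(V(key, msg, tag))       # True
print(V(key, b"hellp", tag))  # False
```

Using `hmac.compare_digest` rather than `==` is a design choice that avoids leaking, through timing, how many tag bytes matched.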
A bad example
- Suppose $F:K\times X \rightarrow Y$ is a secure PRF with $Y=\{0,1\}^{10}$
- Is the derived MAC $I_F$ a secure MAC system?
- Ans: No, the tags are too short: anyone can guess the tag for any message
  - just like the last example
  - Adv = $1/2^{10}$ = 1/1024
Security
- Thm: if $F:K \times X \rightarrow Y$ is a secure PRF and 1/|Y| is negligible
  - then $I_F$ is a secure MAC
- In particular, for every eff. MAC adversary A attacking $I_F$
  - there exists an eff. PRF adversary B attacking F, s.t.:
  - $\operatorname{Adv}_{\mathrm{MAC}}[A, I_F] \leq \operatorname{Adv}_{\mathrm{PRF}}[B, F] + 1/|Y|$
  - $\operatorname{Adv}_{\mathrm{PRF}}[B, F]$ is negligible
- $\Rightarrow$ $I_F$ is secure as long as |Y| is large, say $|Y| = 2^{80}$
Proof Sketch
- Suppose $f:X \rightarrow Y$ is a truly random function
- Then the MAC adversary A must win the following game:
  - A wins if t = f(m) and $m \notin \{m_1, \ldots, m_q\}$
- Because $f:X \rightarrow Y$ is a truly random function, f(m) is independent of everything A has seen
  - $\Rightarrow$ Pr[A wins] = 1/|Y|
- The same must hold for F(k,x), up to the PRF advantage
Examples
- AES: a MAC for 16-byte messages
- Main question: how to convert a Small-MAC into a Big-MAC?
  - the input message can be very big, instead of, for example, only 16 bytes
- Two main constructions used in practice:
  - CBC-MAC:
    - banking: ANSI X9.9, X9.19, FIPS 186-3
  - HMAC:
    - Internet protocols: SSL, IPsec, SSH
- Both convert a small PRF into a big PRF
Truncating MACs based on PRFs
- Easy lemma: suppose $F:K\times X \rightarrow \{0,1\}^n$ is a secure PRF
  - Then so is $F_t(k,m) = F(k,m)[1:t]$ for all $1 \leq t \leq n$
    - i.e., just output the first t bits
- $\Rightarrow$ if (S,V) is a MAC based on a secure PRF outputting n-bit tags
  - the truncated MAC outputting w bits is secure
    - as long as $1/2^w$ is still negligible (say $w \geq 64$)
Next segment:
- See how CBC-MAC works
3.2 Message Integrity 2: Construction (Sequential MAC Construction)
3.2.1 CBC-MAC and NMAC
In this segment:
- Construct two classic MACs
- The CBC-MAC and the NMAC
MACs and PRFs
- Recall: secure PRF F $\Rightarrow$ secure MAC
  - as long as |Y| is large
  - S(k,m) = F(k,m)
- Our goal:
  - given a PRF for short messages (AES)
    - it can only process 16-byte messages
  - construct a PRF for long messages
- (shorthand for what's coming)
  - From here on let $X = \{0,1\}^n$
    - e.g., n = 128
Construction 1: CBC-MAC
- also called encrypted CBC-MAC (ECBC)
- Let $F: K \times X \rightarrow X$ be a PRP
  - Define a new PRF $F_{ECBC}: K^2 \times X^{\leq L} \rightarrow X$
    - $X^{\leq L} = \bigcup_{i=1}^{L} X^i$
    - L: the bound on the maximum message length (in blocks)
- Two stages:
  - $F(k,\cdot)$: many rounds
    - each round XORs the next message block into the previous output and applies $F(k,\cdot)$
    - This first stage is called raw CBC
    - CBC-MAC with only this stage is NOT secure!!
  - then $F(k_1,\cdot)$
    - $k_1$ is a key independent of k
    - it outputs the final tag (n bits long)
      - it is fine to truncate the tag to fewer than n bits
    - this step is critical for making the MAC secure
Another construction for converting a small PRF to a large PRF:
- Called NMAC
Construction 2: NMAC (nested MAC)
- Let $F: K \times X \rightarrow K$ be a PRF
  - Define a new PRF $F_{NMAC}: K^2 \times X^{\leq L} \rightarrow K$
    - it outputs an element of the KEY space!
- Process
  - 1. Take the message and break it into blocks
    - each block is as long as the block length of the underlying PRF
  - 2. Feed the key k as the key input to F
    - the message block from step 1 acts as the data input to F
    - the output is a new key, which becomes the key input for the next block, and so on
    - finally we get an output $t\in \mathcal{K}$
    - Just like before: if we stop here, the function we obtain is called the cascade function, which is NOT secure!
  - 3. Map the element $t \in \mathcal{K}$ into the set X
    - because the last application of F takes its data input from X and its key from K
  - 4. Apply F once more, keyed with the second key, to get the final output tag
Why the last encryption step in ECBC-MAC and NMAC?
- NMAC: Suppose we define a MAC $I=(S,V)$ where
  - S(k,m) = cascade(k,m)
  - Question: Is the MAC I secure? Why?
  - Ans: This MAC can be forged with one chosen message query!
    - Query: cascade(k,m)
    - $\Rightarrow$ the attacker can calculate cascade(k, m || w) for any w
      - since cascade(k, m || w) = F(cascade(k,m), w)
      - and the attacker knows cascade(k,m), w, and F
    - therefore the attacker can forge (m || w, cascade(k, m || w))
    - Not secure with only one chosen message query!!
- ECBC: Suppose we define a MAC $I_{RAW} = (S,V)$ where
  - S(k,m) = rawCBC(k,m)
  - Then $I_{RAW}$ is easily broken using a 1-chosen message attack!
  - The reason is different from the NMAC case:
    - in the cascade, the query output t is itself the key for the next block
    - in raw CBC, the attacker does not know k!
  - However, the attack works as follows:
    - Choose an arbitrary one-block message $m\in X$
    - Request the tag for m
      - Get t = F(k,m)
    - Output t as the MAC forgery for the 2-block message $(m, t \oplus m)$
    - Indeed: $rawCBC(k, (m, t \oplus m)) = F(k, F(k,m)\oplus (t\oplus m)) = F(k, t\oplus(t\oplus m)) = F(k,m) = t$
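The raw CBC forgery can be checked with a short Python sketch. As an assumption, a truncated HMAC-SHA256 stands in for the block cipher $F(k,\cdot)$; it is not a PRP, but raw CBC only ever evaluates F in the forward direction, so the identity above still holds. The key and message values are made up.

```python
import hashlib
import hmac

BLOCK = 16  # block size in bytes

def F(k: bytes, x: bytes) -> bytes:
    """Toy stand-in for F(k, .): HMAC-SHA256 truncated to one block."""
    return hmac.new(k, x, hashlib.sha256).digest()[:BLOCK]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def raw_cbc(k: bytes, blocks: list[bytes]) -> bytes:
    """Raw CBC-MAC: chain F(k, .) over the blocks, with no final encryption step."""
    state = bytes(BLOCK)
    for block in blocks:
        state = F(k, xor(state, block))
    return state

k = b"secret key known only to Alice and Bob"
m = b"one-block msg!!!"        # exactly 16 bytes
t = raw_cbc(k, [m])            # the single chosen-message query

# Forgery: the attacker never touches k, yet t is a valid tag for (m, t XOR m).
forged_msg = [m, xor(t, m)]
print(raw_cbc(k, forged_msg) == t)  # True
```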
ECBC-MAC and NMAC analysis
- Theorem:
  - For any $L>0$
  - For every eff. q-query PRF adv. A attacking $F_{ECBC}$ or $F_{NMAC}$
    - there exists an eff. adversary B, s.t.:
    - $\operatorname{Adv}_{\mathrm{PRF}}[A, F_{ECBC}] \leq \operatorname{Adv}_{\mathrm{PRP}}[B, F] + 2q^2/|X|$
    - $\operatorname{Adv}_{\mathrm{PRF}}[A, F_{NMAC}] \leq q \cdot L \cdot \operatorname{Adv}_{\mathrm{PRF}}[B, F] + q^2/(2|K|)$
- CBC-MAC is secure as long as $q \ll |X|^{1/2}$
- NMAC is secure as long as $q \ll |K|^{1/2}$
  - $2^{64}$ for AES-128
An example
- $\operatorname{Adv}_{\mathrm{PRF}}[A, F_{ECBC}] \leq \operatorname{Adv}_{\mathrm{PRP}}[B, F] + 2q^2/|X|$
  - q = # messages MAC-ed with k
- Suppose we want $\operatorname{Adv}_{\mathrm{PRF}}[A,F_{ECBC}] \leq 1/2^{32}$
  - $\Leftarrow q^2/|X| < 1/2^{32}$
- For AES:
  - $|X| = 2^{128} \Rightarrow q < 2^{48}$
  - So after $2^{48}$ messages we must change the key!
- But for 3DES:
  - $|X| = 2^{64} \Rightarrow q < 2^{16}$
  - Far too few messages!
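The arithmetic behind these two limits is $q^2/|X| < 2^{-32} \iff q < 2^{(\log_2|X| - 32)/2}$. A tiny Python sketch of that calculation follows; the helper name `max_log2_queries` is made up, and, like the notes, it ignores the constant factor 2 in the bound.

```python
def max_log2_queries(log2_domain: int, log2_target: int = 32) -> float:
    """Largest log2(q) with q^2 / |X| < 2^-target, i.e. (log2|X| - target) / 2."""
    return (log2_domain - log2_target) / 2

print(max_log2_queries(128))  # AES:  48.0
print(max_log2_queries(64))   # 3DES: 16.0
```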
The security bounds are tight: an attack
Theory
- After signing:
  - $|X|^{1/2}$ messages with ECBC-MAC or
  - $|K|^{1/2}$ messages with NMAC
- the MACs become insecure
- Suppose the underlying PRF F is a PRP
  - e.g., AES
  - Then both PRFs (ECBC and NMAC) have the following extension property:
  - $\forall x, y, w: \quad F_{BIG}(k, x)=F_{BIG}(k, y) \Rightarrow F_{BIG}(k, x \| w)=F_{BIG}(k, y \| w)$
    - namely, if you give me a collision on x and y
    - then that also implies a collision on any extension of x and y
An example
- Let $F_{BIG}: K \times X \rightarrow Y$ be a PRF that has the extension property
  - $F_{BIG}(k, x)=F_{BIG}(k, y) \Rightarrow F_{BIG}(k, x \| w)=F_{BIG}(k, y \| w)$
- Generic attack on the derived MAC
  - Step 1: issue $|Y|^{1/2}$ message queries for random messages in X
    - Obtain $(m_i,t_i)$ for $i = 1, 2, \ldots, |Y|^{1/2}$
  - Step 2: find a collision $t_u = t_v$ for $u \neq v$
    - One exists w.h.p. by the birthday paradox
  - Step 3: choose some w and query for $t := F_{BIG}(k, m_u \| w)$
  - Step 4: output the forgery $(m_v \| w, t)$. Indeed $t = F_{BIG}(k, m_v \| w)$
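The four steps can be run end to end on a toy function with the extension property. The sketch below is an illustration under stated assumptions: `f_big` is a cascade of truncated HMAC-SHA256 calls with 16-bit tags (so a collision appears after a few hundred queries), and the queried messages come from a counter rather than being random, which keeps the run deterministic. The attack only uses tag collisions, never the internal state.

```python
import hashlib
import hmac

TAG_LEN = 2  # toy 16-bit tags so that a collision is found quickly

def F(k: bytes, x: bytes) -> bytes:
    """Toy PRF whose output is reused as the next key (cascade style)."""
    return hmac.new(k, x, hashlib.sha256).digest()[:TAG_LEN]

def f_big(k: bytes, blocks: list[bytes]) -> bytes:
    """Cascade of F; it has the extension property by construction."""
    state = k
    for block in blocks:
        state = F(state, block)
    return state

k = b"secret key"

# Steps 1-2: query one-block messages until two of them share a tag.
seen: dict[bytes, bytes] = {}
i = 0
while True:
    m = i.to_bytes(16, "big")
    tag = f_big(k, [m])
    if tag in seen:
        m_u, m_v = seen[tag], m
        break
    seen[tag] = m
    i += 1

# Step 3: one more query, on m_u || w.
w = b"any extension..."
t = f_big(k, [m_u, w])

# Step 4: (m_v || w, t) was never queried, yet it verifies.
print(m_u != m_v, f_big(k, [m_v, w]) == t)  # True True
```

With n-bit tags the same loop needs about $2^{n/2}$ queries, which is why the bounds in the theorem are tight.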
Comparison
- ECBC-MAC:
  - is commonly used as an AES-based MAC
    - CCM encryption mode (used in 802.11i)
    - NIST standard called CMAC
- NMAC:
  - not usually used with AES or 3DES
  - Main reason:
    - need to change the AES key on every block
      - requires re-computing the AES key expansion
      - AES is not designed to perform well when the key changes very rapidly
  - But NMAC is the basis for a popular MAC called HMAC
    - which will be introduced later
3.2.2 MAC padding
Last segment:
- talk about CBC-MAC and NMAC
- assumed that the message length was a multiple of the block length
In this segment:
- See What to do when the message length is not a multiple of the block size
Recall ECBC-MAC
- ECBC MAC:
- Let $F: K \times X \rightarrow X$ be a PRP
- Define a new PRF $F_{ECBC}: K^2 \times X^{\leq L} \rightarrow X$
CBC MAC padding: What if msg. len. is not multiple of block-size
- Bad idea:
  - pad m with 0's
    - m[0] || m[1] $\Rightarrow$ m[0] || m[1] || 0000
  - Is the resulting MAC secure?
    - No! Given a tag on message m, the attacker obtains a tag on m||0
    - The problem is:
      - it is possible to come up with a message m such that m and m||0 are padded to exactly the same string
        - as shown in the figure
      - which means that m and m||0 have the same tag!
      - and therefore the attacker can mount an existential forgery
  - a concrete example:
    - we write a check for 100 dollars
    - but the tag of 100 equals the tag of 1000
    - Terrible!
CBC MAC padding
- For security, padding must be invertible!
  - namely, $m_0 \neq m_1 \Rightarrow pad(m_0) \neq pad(m_1)$
  - different messages must have different padding results!
- ISO (proposed by the International Organization for Standardization)
  - pad with “100 … 00”
    - when the last block is shorter than the block size
    - add a new dummy block if needed
      - i.e., when the message length is already an exact multiple of the block size
    - The ‘1’ indicates the beginning of the pad
- Second row of the figure above:
  - if the dummy block were not added
    - the pad would not be invertible
    - and the MAC would become insecure!
    - for example, when the last bits of m'[1] happen to be ‘100’
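The difference between the two padding rules can be seen in a byte-oriented Python sketch. As an assumption for illustration, the pad is applied at byte granularity: the leading ‘1’ bit becomes the byte `0x80`, followed by zero bytes. The function names are made up.

```python
BLOCK = 16  # block size in bytes

def iso_pad(msg: bytes) -> bytes:
    """Append a 1 bit (byte 0x80) followed by zeros up to a block boundary.

    If msg is already block-aligned, this adds a whole dummy block.
    """
    pad_len = BLOCK - (len(msg) % BLOCK)
    return msg + b"\x80" + bytes(pad_len - 1)

def iso_unpad(padded: bytes) -> bytes:
    """Invert iso_pad: strip trailing zeros, then the 0x80 marker."""
    stripped = padded.rstrip(b"\x00")
    if not stripped.endswith(b"\x80"):
        raise ValueError("invalid padding")
    return stripped[:-1]

def zero_pad(msg: bytes) -> bytes:
    """The bad idea: pad with zeros only."""
    return msg + bytes(-len(msg) % BLOCK)

m = b"pay 100"
print(zero_pad(m) == zero_pad(m + b"\x00"))  # True: two messages collide
print(iso_pad(m) == iso_pad(m + b"\x00"))    # False
print(iso_unpad(iso_pad(m)) == m)            # True
print(len(iso_pad(bytes(BLOCK))))            # 32: a dummy block was added
```

The existence of `iso_unpad` is exactly the invertibility requirement: no such inverse can be written for `zero_pad`.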
Is there a padding scheme that never needs to add a dummy block?
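- The answer: with a deterministic padding function, there will always be cases where we need to add a dummy block!
  - the number of messages whose length is an exact multiple of the block length (after padding) is much smaller than the number of messages whose length is not a multiple (before padding)
  - so there is no one-to-one map from the bigger set into the smaller set!
- But: by using a randomized padding function, we can!
  - see the next part (CMAC)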
CMAC (NIST standard)
- Variant of CBC-MAC where key = $(k,k_1,k_2)$
  - (sometimes called the three-key construction)
  - k: used in the standard CBC-MAC algorithm
  - $k_1$ and $k_2$: used only for the padding scheme at the very last block!
    - $k_1$ and $k_2$ are derived from k through a PRG
  - No final encryption step
    - the extension attack is thwarted by the last keyed xor
  - No dummy block
    - the ambiguity is resolved by the use of $k_1$ or $k_2$
- Principle:
  - If the message length is not a multiple of the block length:
    - pad with 100…
    - and use $k_1$
  - Otherwise (the message length is a multiple of the block length):
    - no dummy block
    - use $k_2$!
- This is secure
  - The attacker does not know $k_1$ and $k_2$
- Benefits
  - 1. No final encryption layer
  - 2. The two distinct keys resolve the ambiguity between the two cases
    - and the scheme is secure!
- CMAC: the standard!
  - when using CBC-MAC
  - you would actually be using CMAC as the standard way to do it
  - F --> AES
This & Last segment:
- Sequential MAC
  - CBC-MAC
  - NMAC
  - How to pad
Next segment:
- Parallel MAC
3.3 More constructions (Parallel or One-time MAC): PMAC and the Carter Wegman MAC
- Parallel MAC
- also converts a small PRF into a large PRF
- but does it in a parallelizable fashion
Construction 3: PMAC - Parallel MAC
- Let $F: K \times X \rightarrow X$ be a PRF
- Define a new PRF $F_{PMAC}: K^2 \times X^{\leq L} \rightarrow X$
- The construction (in the figure above)
  - P: a function P(k,i)
    - First: suppose we delete these P's and just feed each block m[i] directly into $F(k_1,\cdot)$
      - The resulting MAC is completely insecure!
      - Reason: no order is enforced between the message blocks!
        - block swapping attack
          - If we swap m[i] and m[j]
          - the resulting tag does not change!!
          - Not secure!
    - P's inputs: the key k and the block index i
    - P(k,i): an easy-to-compute function
  - F: PRF
    - The last block does not even need the PRF
  - key = $(k,k_1)$
  - padding similar to CMAC
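The block swapping attack can be illustrated with a simplified parallel MAC. This sketch is not PMAC itself: as assumptions, a truncated HMAC-SHA256 stands in for F, the mask `P` is a hypothetical position-dependent value computed with the same toy PRF (real PMAC uses a much cheaper function), and every block, including the last, goes through F.

```python
import hashlib
import hmac

BLOCK = 16  # block size in bytes

def F(k: bytes, x: bytes) -> bytes:
    """Toy stand-in PRF: HMAC-SHA256 truncated to one block."""
    return hmac.new(k, x, hashlib.sha256).digest()[:BLOCK]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def P(k: bytes, i: int) -> bytes:
    """Hypothetical position mask P(k, i)."""
    return F(k, i.to_bytes(BLOCK, "big"))

def parallel_mac(k: bytes, k1: bytes, blocks: list[bytes], use_mask: bool) -> bytes:
    """XOR the per-block PRF outputs together, then apply F(k1, .) once more."""
    acc = bytes(BLOCK)
    for i, block in enumerate(blocks):
        x = xor(block, P(k, i)) if use_mask else block
        acc = xor(acc, F(k1, x))
    return F(k1, acc)

k, k1 = b"mask key", b"prf key"
a, b = b"A" * BLOCK, b"B" * BLOCK

# Without P, swapping two blocks leaves the tag unchanged.
print(parallel_mac(k, k1, [a, b], False) == parallel_mac(k, k1, [b, a], False))  # True
# With P, the swapped message gets a different tag.
print(parallel_mac(k, k1, [a, b], True) == parallel_mac(k, k1, [b, a], True))    # False
```

Each loop iteration is independent of the others, which is what makes this style of MAC parallelizable.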
PMAC: Analysis
- PMAC theorem:
  - For any L > 0,
  - if F is a secure PRF over (K,X,X) then
    - $F_{PMAC}$ is a secure PRF over $(K, X^{\leq L}, X)$
  - For every eff. q-query PRF adv. A attacking $F_{PMAC}$
    - there exists an eff. PRF adversary B s.t.:
    - $\operatorname{Adv}_{\mathrm{PRF}}[A, F_{PMAC}] \leq \operatorname{Adv}_{\mathrm{PRF}}[B, F] + 2q^2L^2/|X|$
- PMAC is secure as long as $qL \ll |X|^{1/2}$
  - so that $2q^2L^2/|X|$ is negligible
PMAC is incremental
- Suppose F is a PRP
- When m[1] $\rightarrow$ m'[1]
  - (one message block of this long message changes)
  - Can we quickly update the tag?
    - For other MACs (e.g., CBC-MAC) we have to recompute the tag on the entire message, which takes O(N) time
    - Note: the function F is a PRP
      - so it is invertible!
- Ans:
  - compute $F^{-1}(k_1,\text{tag}) \oplus F(k_1, m[1] \oplus P(k,1)) \oplus F(k_1, m'[1] \oplus P(k,1))$
  - and apply $F(k_1,\cdot)$ to the result
Next:
- Switch topics a little bit
- talk about the concept of a one time MAC
One time MAC (analog of one time pad)
- create a MAC used for the integrity of a single message
  - every time we compute the MAC for a particular message, we also change the key!
  - a one-time MAC is useful for applications that authenticate only a single message
    - just like the one-time pad / stream cipher is useful!
- Security:
  - the attacker gets to see only one message
    - only one chosen message query
    - and then forges a message/tag pair
  - The attacker's goal: the forged message/tag pair verifies correctly and is different from the given pair
Def: I=(S, V) is a secure one-time MAC if for all “efficient” A (making a single query):
$Adv_{MAC}[A,I]$ = Pr[Chal. outputs 1] is negligible
One-time MAC: an example
- Can be secure against all adversaries and faster than PRF-based MACs
- Let q be a large prime (e.g., $q = 2^{128} + 51$)
  - (q is slightly larger than the block size)
    - In this case we use 128-bit blocks
    - so q is chosen slightly larger than $2^{128}$
  - key = $(k,a) \in \{1,2,\ldots,q\}^2$
    - two random integers in [1,q]
  - msg = (m[1], m[2], … , m[L])
    - where each block is 128 bits
    - from here on, each block is treated as an integer in the range $0$ to $2^{128}-1$
- $S(key, msg) = P_{msg}(k) + a \pmod q$
  - where $P_{msg}(x) = m[L] \cdot x^L + \ldots + m[1] \cdot x$ is a polynomial of degree L
- The MAC:
  - 1st: take the polynomial that corresponds to the message
    - and evaluate it at the point k
      - k: one half of the secret key
  - 2nd: add the value a
    - a: the second half of the secret key
  - 3rd: reduce mod q to get the MAC
- A fact:
  - given S(key, $msg_1$), the adversary has no information about S(key, $msg_2$)
    - the MAC for one message tells you nothing about the MAC for another message!
    - namely: there is no way of forging this MAC for another new message!
    - Secure!
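The polynomial one-time MAC translates almost directly into Python, since Python integers have arbitrary precision. The sketch below takes the message as a list of integers (one per 128-bit block) and evaluates the polynomial with Horner's rule; the function name and the small sample values are made up for illustration.

```python
Q = 2**128 + 51  # the modulus q given in the notes

def one_time_mac(key: tuple[int, int], blocks: list[int], q: int = Q) -> int:
    """S(key, msg) = m[L]*k^L + ... + m[1]*k + a  (mod q), via Horner's rule."""
    k, a = key
    acc = 0
    for block in reversed(blocks):  # m[L], ..., m[1]
        acc = (acc + block) * k % q
    return (acc + a) % q

# Tiny worked example: msg = (1, 2), k = 3, a = 5.
print(one_time_mac((3, 5), [1, 2]))  # 2*3^2 + 1*3 + 5 = 26
```

In real use the key (k, a) must be fresh and random for every message; reusing it lets an attacker solve for k and a.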
One-time MAC $\Rightarrow$ Many-time MAC
- Let (S,V) be a secure one-time MAC over $(K_I, M, \{0,1\}^n)$
- Let $F:K_F\times \{0,1\}^n \rightarrow \{0,1\}^n$ be a secure PRF
- Carter-Wegman MAC:
  - $CW((k_1,k_2),m) = (r, F(k_1,r)\oplus S(k_2,m))$
    - for random $r\leftarrow \{0,1\}^n$
  - where
    - $F(k_1,r)$: slow, but the input is short
    - $S(k_2,m)$: fast, and the input is long
  - Process:
    - 1. apply the one-time MAC to the message m
    - 2. encrypt the result using the PRF
      - How to encrypt the result?
        - choose a random r
          - a fresh r is chosen for every message, which is why the Carter-Wegman MAC can be used as a many-time MAC
        - then apply the PRF to this r and use the output as a one-time pad:
          - $F(k_1,r)\oplus S(k_2,m)$
  - Summary:
    - the fast one-time MAC is applied to the long message
      - e.g., gigabytes long
    - the slower PRF is applied only to the nonce r
- Thm:
  - If (S,V) is a secure one-time MAC and F a secure PRF, then CW is a secure MAC outputting tags in $\{0,1\}^{2n}$
A Practice of Carter-Wegman MAC
$\operatorname{CW}((k_1, k_2), m) = (r, F(k_1, r) \oplus S(k_2, m))$
- How would you verify a CW tag (r,t) on message m?
  - Recall that $V(k_2, m, \cdot)$ is the verification algorithm for the one-time MAC
- Ans:
  - Run $V(k_2, m, F(k_1,r) \oplus t)$
    - where $F(k_1,r)\oplus t = S(k_2, m)$ for a valid tag
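Signing and verification for the Carter-Wegman MAC can be sketched together in Python. The following is an illustration under assumptions, not a standard implementation: the one-time MAC is the polynomial MAC with $q = 2^{128}+51$, HMAC-SHA256 truncated to 17 bytes stands in for the PRF F, and both halves of the tag are 17 bytes so that any value below q fits.

```python
import hashlib
import hmac
import secrets

Q = 2**128 + 51  # modulus from the one-time MAC example
N = 17           # bytes per tag half; 17 bytes hold any value below Q

def S(key: tuple[int, int], blocks: list[int]) -> bytes:
    """One-time polynomial MAC, encoded as N bytes."""
    k, a = key
    acc = 0
    for block in reversed(blocks):
        acc = (acc + block) * k % Q
    return ((acc + a) % Q).to_bytes(N, "big")

def F(k1: bytes, r: bytes) -> bytes:
    """Stand-in PRF (an assumption): HMAC-SHA256 truncated to N bytes."""
    return hmac.new(k1, r, hashlib.sha256).digest()[:N]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def cw_sign(k1: bytes, k2: tuple[int, int], blocks: list[int]) -> tuple[bytes, bytes]:
    r = secrets.token_bytes(N)  # fresh random nonce for every message
    return r, xor(F(k1, r), S(k2, blocks))

def cw_verify(k1: bytes, k2: tuple[int, int], blocks: list[int],
              tag: tuple[bytes, bytes]) -> bool:
    r, t = tag
    recovered = xor(F(k1, r), t)  # equals S(k2, m) for a valid tag
    return hmac.compare_digest(recovered, S(k2, blocks))

k1 = b"prf key"
k2 = (123456789, 987654321)
msg = [1, 2, 3]
tag = cw_sign(k1, k2, msg)
print(cw_verify(k1, k2, msg, tag))        # True
print(cw_verify(k1, k2, [1, 2, 4], tag))  # False
```

Because this particular one-time MAC is deterministic, its verification algorithm is just recomputation and comparison, which is what `cw_verify` does after removing the pad.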
Construction HMAC (Hash-MAC)
- The most widely used MAC on the Internet
- But, we first need to discuss hash function
Further reading
- 2005 CBC MACs for Arbitrary-Length Messages: The three-key constructions
- 2006 A tight bound for EMAC
- 2002 A block-cipher mode of operation for parallelizable message authentication
  - PMAC
- 2006 New Proofs for NMAC and HMAC: Security without collision resistance
  - security of NMAC and HMAC
- 2008 A new mode of operation for block ciphers and length-preserving MACs
  - If AES is only an unpredictable function, but not a secure PRF, can we still build MACs for long messages?
In the next segment:
- talk about collision resistance
3.4 Collision Resistance 1: What is a collision resistant function
This module:
- talk about a new concept: collision resistance
- it plays important role in providing message integrity
- Then build HMAC
- which is built from collision resistant hash function
3.4.1 Introduction
Recap: message integrity
so far, four MAC constructions:
- based on PRFs
  - ECBC-MAC, CMAC: (sequential)
    - commonly used with AES
      - e.g., 802.11i
  - NMAC: basis of HMAC (sequential)
    - (this segment)
  - PMAC
    - a parallel MAC
- randomized MAC (not a PRF)
  - Carter-Wegman MAC: built from a fast one-time MAC
- This module: create MACs from collision resistance
  - The first step: construct collision resistant hash functions
Collision Resistance
What does it mean for a hash function to be collision resistant?
- Let $H: M \rightarrow T$ be a hash function
  - |M| >> |T|
- A collision for H is a pair $m_0, m_1 \in M$ such that:
  - $H(m_0)=H(m_1)$ and $m_0 \neq m_1$
- In fact, collisions must exist:
  - because the input space is much larger than the output space
- A function H is collision resistant if for all (explicit) eff. algs. A:
  - $Adv_{CR}[A,H]$ = Pr[A outputs collision for H]
    - is “negligible”
  - “explicit”: it is not enough to just say that an algorithm exists
    - many collisions necessarily exist; H counts as collision resistant as long as nobody can explicitly write down an algorithm A that finds one
- Example: SHA-256 (outputs 256 bits)
MACs from Collision Resistance
An application of Collision Resistance:
- How we can trivially build a MAC given a collision resistant hash function
- Let I = (S,V) be a MAC for short messages over $(K,M,T)$ (e.g., AES)
- Let $H:M^{big}\rightarrow M$
  - a collision resistant hash function
- Def: a new MAC $I^{big} = (S^{big}, V^{big})$ over $(K,M^{big},T)$ as:
  - $S^{big}(k,m) = S(k,H(m))$
  - $V^{big}(k,m,t) = V(k,H(m),t)$
- The new MAC $I^{big} = (S^{big}, V^{big})$ is for long messages!
  - The collision resistant hash function can be used to expand the input space!
Thm: If I is a secure MAC and H is collision resistant
- then $I^{big}$ is a secure MAC
- Example:
  - $S(k, m) = AES_{2\text{-block-cbc}}(k, \text{SHA-256}(m))$ is a secure MAC!
MACs from Collision Resistance
- $S^{big}(k,m) = S(k,H(m))$
- $V^{big}(k,m,t) = V(k,H(m),t)$
- Collision resistance is necessary for security:
  - Suppose an adversary can find $m_0 \neq m_1$ s.t.
    - $H(m_0)=H(m_1)$
  - Then:
    - $S^{big}$ (the combined MAC) is insecure under a 1-chosen message attack
      - step 1: the adversary asks for $t\leftarrow S^{big}(k,m_0)$
      - step 2: output $(m_1,t)$ as the forgery!
therefore:
- Collision resistance is a very useful primitive!
Protecting file integrity using C.R. hash
- Software packages:
- target:
- ensure that the package the user downloads is the genuine one, not some version that the attacker tampered with
- Methods:
- basically rely on a read-only public space
- this space: holds small hashes of these software packages
- so the space is small
- read-only:
- attacker cannot modify hashes stored in this space
- When user downloads a package, the user can verify that its contents are valid!
- H collision resistant
- $\Rightarrow$ attacker cannot modify package without detection
- no key needed
- (public verifiability: everyone can check a package against the space)
- but requires read-only space
- Later: we will see that digital signatures give public verifiability with no read-only (extra) space
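The download check described above might look like the following. It is a minimal sketch: the function names are hypothetical, SHA-256 is assumed as the C.R. hash, and the "read-only space" is modelled as a plain string the attacker cannot change.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def verify_package(package: bytes, published_hash: str) -> bool:
    # published_hash is assumed to come from the read-only public space.
    return sha256_hex(package) == published_hash

genuine = b"contents of the genuine package"
published = sha256_hex(genuine)          # stored once in the read-only space

print(verify_package(genuine, published))                 # genuine download
print(verify_package(genuine + b" + malware", published)) # tampered download
```

No key is involved, so anyone can run the check. Its security rests entirely on the hash being collision resistant and on the published value being unmodifiable.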
This segment:
- introduce collision resistance
Next segment:
- talk about generic attack on collision resistance
3.4.2 Generic birthday attack
In block cipher:
- There is the exhaustive search attack
- it forces a lower bound on the key length.
Similarly, for collision resistance:
- There is a general attack called the birthday attack
- it forces the output of collision resistant hash functions to be longer than a certain bound
Generic attack on C.R. functions
- Let H: $M\rightarrow \{0,1\}^{n}$ be a hash function
- $|M| \gg 2^n$
- Generic alg. to find a collision in time $O(2^{n/2})$ hashes!
Algorithm:
- Choose $2^{n/2}$ random messages in M:
- $m_{1}, \ldots, m_{2^{n/2}}$
- distinct w.h.p
- For $i=1, \ldots, 2^{n/2}$ compute $t_i = H(m_i) \in \{0,1\}^{n}$
- Look for a collision $(t_i = t_j)$
- If not found, go back to step 1
- How well will this work?
- ANS: the expected number of iterations is very small!
- that means the time complexity of finding a collision is about $O(2^{n/2})$
The birthday paradox
- Let $r_{1}, \ldots, r_{n} \in\{1, \ldots, B\}$ be independent identically distributed integers
- Thm: when $n=1.2 \times B^{1/2}$,
- then $\Pr[\exists i \neq j: r_i = r_j] \geq 1/2$
- in fact, $n=1.2 \times B^{1/2}$ is the worst case
- it is the value needed when the $r_i$ are sampled uniformly
- if the $r_i$ are sampled non-uniformly, an even smaller $n$ is enough
- Proof:
- for uniform indep. $r_{1}, \ldots, r_{n}$
- $\Pr[\exists i \neq j: r_i = r_j] = 1- \Pr[\forall i \neq j: r_i \neq r_j] = 1- (\frac{B-1}{B})(\frac{B-2}{B})\cdots(\frac{B-n+1}{B})$
- $= 1 - \prod_{i=1}^{n-1}(1-\frac{i}{B}) \geq 1- \prod_{i=1}^{n-1} e^{-i/B}$
- using $1-x \leq e^{-x} = 1- x + \frac{x^2}{2} - \cdots$
- $=1-e^{-\frac{1}{B}\sum_{i=1}^{n-1}i}$
- $\approx 1 - e^{-n^2/2B}$ (since $\sum_{i=1}^{n-1} i = n(n-1)/2 \approx n^2/2$)
- when $n=1.2 \times B^{1/2}$, the expression above equals
- $1 - e^{-0.72} \approx 0.51 > 1/2$
Why is it called "paradox"?
- because the result is counterintuitive: the square root function grows very slowly, so collisions show up much earlier than one would expect
- for birthday:
- among $1.2 \times \sqrt{365} \approx 23$ people, the probability that two of them share a birthday is already more than one half
- amazing (paradoxical)
- (since birth dates are actually not uniform, the actual bound is going to be smaller than $1.2 \times \sqrt{365}$)
The graph of the birthday paradox when $B=10^6$
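The numbers in the birthday example can be checked directly. The sketch below evaluates the exact product from the proof and the exponential lower bound obtained from $1-x \leq e^{-x}$; the function names are illustrative only.

```python
import math

def collision_prob(n: int, B: int) -> float:
    # Exact value of 1 - prod_{i=1}^{n-1} (1 - i/B) for uniform samples.
    p_distinct = 1.0
    for i in range(1, n):
        p_distinct *= 1 - i / B
    return 1 - p_distinct

def exp_lower_bound(n: int, B: int) -> float:
    # 1 - exp(-(1/B) * sum_{i=1}^{n-1} i), with the sum equal to n(n-1)/2.
    return 1 - math.exp(-n * (n - 1) / (2 * B))

print(round(collision_prob(23, 365), 4))    # → 0.5073
print(exp_lower_bound(23, 365) <= collision_prob(23, 365))
```

So 23 people already push the collision probability just above one half, in line with $1.2 \times \sqrt{365} \approx 23$.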
Generic attack
- H: $M\rightarrow \{0,1\}^{n}$
- Collision finding algorithm:
- Choose $2^{n/2}$ random messages in M: $m_{1}, \ldots, m_{2^{n/2}}$
- For $i=1, \ldots, 2^{n/2}$ compute $t_i = H(m_i) \in \{0,1\}^{n}$
- Look for a collision $(t_i = t_j)$. Go back to step 1 if not found
- Q: Expected number of iterations is about ?
- Ans: 2
- Since each iteration succeeds with probability about 1/2
- therefore: about 2 iterations!
- Running time: $O(2^{n/2})$
- Space: $O(2^{n/2})$
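The generic attack is short enough to run on a toy hash. In the sketch below the hash is SHA-256 truncated to `n_bits` bits (an assumption made only so that $2^{n/2}$ is small); each round hashes $2^{n/2}$ random messages and looks for two with the same tag.

```python
import hashlib
import os

def H(msg: bytes, n_bits: int = 24) -> int:
    # Toy hash: the top n_bits of SHA-256 (truncation is an assumption).
    digest = int.from_bytes(hashlib.sha256(msg).digest(), "big")
    return digest >> (256 - n_bits)

def birthday_collision(n_bits: int = 24):
    batch = 2 ** (n_bits // 2)          # 2^(n/2) messages per iteration
    while True:                          # "go back to step 1 if not found"
        seen = {}                        # tag -> message, O(2^(n/2)) space
        for _ in range(batch):
            m = os.urandom(16)
            t = H(m, n_bits)
            if t in seen and seen[t] != m:
                return seen[t], m
            seen[t] = m

m0, m1 = birthday_collision()
print(m0 != m1, H(m0) == H(m1))
```

For a 24-bit output each round costs only 4096 hashes. For a 256-bit output the same loop would need about $2^{128}$ hashes per round, which is why the output length is chosen that large.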
Sample C.R. hash functions:
- benchmark machine: AMD Opteron, 2.2 GHz, Linux
- speed:
- how much data can be hashed per second
- the bigger the digest (output) size, the slower the algorithm is
- But security is more important!
- Best known collision finder for SHA-1 requires $2^{51}$ hash evaluations
Next segment:
- building collision resistant function
3.5 Collision Resistance 2: constructions
3.5.1 The Merkle-Damgard Paradigm
Paradigm: a general pattern or template
This segment:
- look at a very general paradigm: the Merkle-Damgard paradigm
- used for constructing collision-resistant hash functions
Collision resistance: review
- Let H: $M\rightarrow T$ be a hash function
- $|M| \gg |T|$
- A collision for H is a pair $m_0, m_1 \in M$ such that:
- $H(m_0) = H(m_1)$ and $m_0 \neq m_1$
- Our goal: construct collision resistant (C.R.) hash functions
- even though many collisions exist, no efficient algorithm can output even a single collision
- Step 1: Given C.R. function for short messages
- construct C.R. function for long messages
- will be done in this segment!
Next segment:
- step 2:
- build C.R. functions for short messages
The Merkle-Damgard iterated construction
- Note:
- IV is fixed:
- IV is basically embedded in the code and in the standards
- just a fixed value that is part of the definition of the function
- the output of each intermediate h: chaining variables
- PB in the last block:
- padding block
- Given $h:T\times X \rightarrow T$
- compression function
- we obtain $H:X^{\leq L} \rightarrow T$
- up to L blocks of X
- output t: a tag in the tag space T
- $H_i$: chaining variables
- PB: padding block
1000..00 || (msg length)
- The "msg length" field has 64 bits
- so the message length is limited to less than $2^{64}$ bits in the Merkle-Damgard hash function
- If no space for PB: Add another block
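The iterated construction, including the padding block, can be sketched as follows. The compression function `h` here is a toy built from SHA-256, and the 64-byte block size, the all-zero `IV` and the byte-level padding layout are assumptions chosen for illustration, not the parameters of any standard.

```python
import hashlib
import struct

BLOCK = 64            # bytes per message block (assumption)
IV = bytes(32)        # fixed IV, part of the function definition (assumption)

def h(chain: bytes, block: bytes) -> bytes:
    # Toy compression function h: T x X -> T.
    return hashlib.sha256(chain + block).digest()

def padding_block(msg_len: int) -> bytes:
    # 1000..00 || 64-bit message length (in bits).
    zeros = (-(msg_len + 1 + 8)) % BLOCK
    return b"\x80" + bytes(zeros) + struct.pack(">Q", 8 * msg_len)

def merkle_damgard(msg: bytes) -> bytes:
    data = msg + padding_block(len(msg))
    chain = IV                                   # H_0 = IV
    for i in range(0, len(data), BLOCK):
        chain = h(chain, data[i:i + BLOCK])      # H_{i+1} = h(H_i, m_i)
    return chain                                 # final chaining variable

print(len(b"a" * 55 + padding_block(55)))  # → 64
print(len(b"a" * 56 + padding_block(56)))  # → 128
```

The second line shows the "no space for PB" case: a 56-byte message leaves only 8 bytes in its block, which cannot hold the marker plus the 64-bit length, so an extra block is added.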
MD collision resistance
Thm: if h is collision resistant so is H.
- Proof:
- Prove the contrapositive: collision on H $\Rightarrow$ collision on h
- Suppose H(M) = H(M') with M $\neq$ M'
- then we build a collision for h
- Recall how H works:
- IV = $H_0$ -> $H_1$ -> $H_2$ -> … -> $H_t$ -> $H_{t+1}$ = H(M)
- IV = $H_0'$ -> $H_1'$ -> $H_2'$ -> … -> $H_r'$ -> $H_{r+1}'$ = H(M')
- The lengths of the two chains do not have to be the same
- Therefore, if H(M) = H(M'), then $H_{t+1} = H_{r+1}'$
- $h(H_{t}, M_{t} \| PB)=H_{t+1}=H_{r+1}'=h(H_{r}', M_{r}' \| PB')$
- if $H_t \neq H_r'$ or $M_t \neq M_r'$ or $PB \neq PB'$
- then we have found a collision for h!
- What if ($H_t \neq H_r'$ or $M_t \neq M_r'$ or $PB \neq PB'$) is false?
- i.e., what if $H_{t}=H_{r}'$ and $M_{t}=M_{r}'$ and $PB=PB'$?
- $PB=PB'$ $\Rightarrow$ t = r (M and M' have the same length)
- Then $h(H_{t-1}, M_{t-1})=H_{t}=H_{t}'=h(H_{t-1}', M_{t-1}')$
- this is another candidate collision for h:
- if the two inputs happen to be equal again, keep going backwards in the same way
- unless M = M' (which contradicts the assumption), we always find a collision for h
- i.e., there are only two possible outcomes
- (1) find a collision for h, or
- (2) $\forall i: M_i = M_i' \Rightarrow M = M'$
- Done!
- Therefore:
- To construct C.R. function,
- suffices to construct compression function
- As long as h is C.R., we can build a C.R. H
Next segment:
- how to build a collision resistant h
3.5.2 Constructing Compression Functions
Our goal for this segment:
- build secure (collision resistant) compression functions
The Merkle-Damgard iterated construction
- Thm: h collision resistant $\Rightarrow$ H collision resistant
- Goal: construct compression function $h:T\times X \rightarrow T$
Next:
- see a couple of constructions
Compression function from a block cipher
- $E: K \times\{0,1\}^{n} \longrightarrow\{0,1\}^{n}$ a block cipher
- The Davies-Meyer compression function:
- $h(H, m)=E(m, H) \oplus H$
- encrypt the chaining variable using the message block as the key
- Thm: Suppose E is an ideal cipher
- collection of |K| random permutations
- Finding a collision h(H,m) = h(H',m') takes $O(2^{n/2})$ evaluations of (E,D)
- this is the best possible, because there is always a generic birthday attack
- Davies-Meyer compression function is very popular
- SHA functions all used Davies-Meyer
Example
- Suppose we define h(H,m) = E(m,H)
- i.e., Davies-Meyer ($h(H, m)=E(m, H) \oplus H$) without the XOR with H
- we now show that h(H,m) = E(m,H) is not collision resistant
- Then the resulting h(.,.) is not collision resistant:
- to build a collision (H,m) and (H',m')
- choose random (H,m,m') and construct H' as follows:
- Ans:
- Just let H'=D(m',E(m,H))
- then E(m',H') = E(m,H), i.e., h(H',m') = h(H,m)
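The attack on the XOR-free variant can be reproduced with any invertible cipher. Since the Python standard library has no block cipher, the sketch below uses a toy 4-round Feistel network (an assumption for illustration only, not a secure cipher) to get a pair `E`, `D`, then builds the collision `H' = D(m', E(m, H))` and compares with Davies-Meyer.

```python
import hashlib

HALF = 8  # bytes per Feistel half, so the block is 16 bytes

def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def _round(key: bytes, i: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([i]) + half).digest()[:HALF]

def E(key: bytes, block: bytes) -> bytes:
    L, R = block[:HALF], block[HALF:]
    for i in range(4):
        L, R = R, _xor(L, _round(key, i, R))
    return L + R

def D(key: bytes, block: bytes) -> bytes:
    L, R = block[:HALF], block[HALF:]
    for i in reversed(range(4)):
        L, R = _xor(R, _round(key, i, L)), L
    return L + R

def h_no_xor(H: bytes, m: bytes) -> bytes:
    return E(m, H)                       # the broken variant

def davies_meyer(H: bytes, m: bytes) -> bytes:
    return _xor(E(m, H), H)              # h(H, m) = E(m, H) xor H

H, m, m2 = bytes(range(16)), b"message block 1!", b"message block 2!"
H2 = D(m2, E(m, H))                      # H' = D(m', E(m, H))
print(h_no_xor(H, m) == h_no_xor(H2, m2))          # collision found
print(davies_meyer(H, m) == davies_meyer(H2, m2))  # same trick fails here
```

The final XOR with H is what blocks this shortcut: the attacker would need H' to satisfy an equation in which H' appears both inside and outside the cipher call.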
Other block cipher constructions
- Let $E:\{0,1\}^{n} \times\{0,1\}^{n} \longrightarrow\{0,1\}^{n}$ for simplicity
- Miyaguchi-Preneel:
- $h(H, m)=E(m, H) \oplus H \oplus m$ (Whirlpool)
- $h(H,m) = E(H\oplus m, m) \oplus m$
- total of 12 variants like this
- Other natural variants are insecure
- the 12 variants above are fine, the others are not
- e.g.: h(H,m) = E(m,H) XOR m (homework)
Now:
- We have all the ingredients to describe the SHA 256 hash function
Case study: SHA-256
- Merkle-Damgard function
- Davies-Meyer compression function
- Block cipher: SHACAL-2
- the underlying block cipher for Davies-Meyer
- the 512-bit key of SHACAL-2: the message block
- the 256-bit block of SHACAL-2: the chaining variable
![SHA-256 compression function](http://qiniu.ruixu.top/1650071158974----cryptographyI_csdnimg.png)
Compression functions:
- one class is built from block ciphers (discussed above)
- the other class: built from hard problems in number theory
- briefly show one example in the following
Provable compression function
- provable: if you can find a collision for this compression function, then you can solve a very hard number-theoretic problem that is believed to be intractable
- choose a random 2000-bit prime p and random $1 \leq u,v \leq p$
- For $m,H \in \{0,1,2, \ldots , p-1\}$ define:
- $h(H, m)=u^{H} \cdot v^{m} \pmod p$
- input: two numbers between 0 and p-1; output: one number between 0 and p-1
- compression ratio is 2.
- Fact:
- finding a collision for $h(.,.)$ is as hard as
- solving "discrete-log" modulo p
- Proof:
- see later, when we get to the number theoretic part of the course!
- The problem with this method:
- slow!
- therefore
- not really used for any compression functions
- but if the msg to sign is short and plenty of time is available, it can still be used, because it is provable
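Evaluating this compression function takes one line with modular exponentiation. In the sketch below the prime and the bases are placeholders: `P` is the Mersenne prime $2^{127}-1$ standing in for a random 2000-bit prime, and `U`, `V` are fixed small values rather than random ones, so this only illustrates the formula and is not a secure instance.

```python
P = 2**127 - 1   # assumption: small stand-in for a random 2000-bit prime
U, V = 3, 7      # assumption: fixed stand-ins for random u, v

def h(H: int, m: int) -> int:
    # h(H, m) = u^H * v^m mod p, for H and m in {0, ..., p-1}
    return (pow(U, H, P) * pow(V, m, P)) % P

print(h(1, 1))   # → 21
print(0 <= h(123456789, 987654321) < P)
```

Each call costs two modular exponentiations with exponents as long as the prime, which is why this construction is far slower than block-cipher-based compression functions.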
Next segment:
- going to talk about HMAC
3.6 A MAC from a hash function
3.6.1 HMAC: a MAC from SHA-256
The Merkle-Damgard iterated construction
- Thm: h collision resistant $\Rightarrow$ H collision resistant
Can we use H(.) to directly build a MAC?
- without having to rely on a PRF
MAC from a Merkle-Damgard Hash Function
- $H: X^{\leq L} \longrightarrow T$ a C.R. Merkle-Damgard Hash Function
- Attempt #1: S(k,m) = H(k||m)
- This MAC is insecure. Prove this!
Ans: Given H(k||m) anyone can compute H(k||m||PB||w) for any w!
- H(k||m) is exactly the chaining variable reached after processing k||m||PB, so the adversary just keeps applying h to the blocks of w, without knowing k
- The adversary can then output the message m||PB||w with tag H(k||m||PB||w) as an existential forgery
- so this is totally insecure and should never be used
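The extension attack on S(k,m) = H(k||m) can be demonstrated on a toy Merkle-Damgard hash whose chaining state is accessible. Everything below is a sketch under assumptions: the compression function, block size, IV and padding layout are illustrative, and the attacker is assumed to know the length of k||m but not k itself.

```python
import hashlib
import os
import struct

BLOCK = 64
IV = bytes(32)

def h(chain: bytes, block: bytes) -> bytes:
    return hashlib.sha256(chain + block).digest()   # toy compression function

def pad(total_len: int) -> bytes:
    zeros = (-(total_len + 1 + 8)) % BLOCK
    return b"\x80" + bytes(zeros) + struct.pack(">Q", 8 * total_len)

def md(msg: bytes, state: bytes = IV, already: int = 0) -> bytes:
    # `already` = number of bytes absorbed before `state` was reached.
    data = msg + pad(already + len(msg))
    for i in range(0, len(data), BLOCK):
        state = h(state, data[i:i + BLOCK])
    return state

def naive_mac(k: bytes, m: bytes) -> bytes:
    return md(k + m)                                 # Attempt #1: H(k || m)

def extend(tag: bytes, km_len: int, m: bytes, w: bytes):
    # Forge without the key: resume hashing from the published tag.
    glue = pad(km_len)                               # the PB of k || m
    return m + glue + w, md(w, state=tag, already=km_len + len(glue))

k = os.urandom(16)
m = b"pay 10 to bob"
t = naive_mac(k, m)
forged_msg, forged_tag = extend(t, len(k) + len(m), m, b" and 1000 to eve")
print(naive_mac(k, forged_msg) == forged_tag)
```

The forged pair verifies because the tag is nothing more than the internal state after the last block, and that state is all the compression function needs to continue.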
Standardized method: HMAC (Hash-MAC)
- Most widely used MAC on the Internet
- H: hash function
- example: SHA-256
- output is 256 bits
- the 256 bits can be regarded as pseudorandom!
- Building a MAC out of a hash function:
- HMAC: $S(k, m)=H(k \oplus opad \,\|\, H(k \oplus ipad \,\|\, m))$
- 1st: XOR k with the inner pad (ipad)
- the result fills one block of the Merkle-Damgard construction
- would be 512 bits in the case of SHA-256
- 2nd: append the message $m$ to $k \oplus ipad$
- 3rd: hash!
- 4th: prepend $k\oplus opad$ to that hash and run SHA-256 once more
- HMAC in pictures:
![HMAC in pictures](http://qiniu.ruixu.top/1650073014663----cryptographyI_csdnimg.png)
- Similar to the NMAC PRF
- main difference: the two keys k1 and k2 are dependent
- as the figure shows
- ipad and opad are constants specified in the HMAC standard
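The four steps can be written out by hand and compared against Python's `hmac` module. The constants 0x36 and 0x5C are the ipad and opad bytes from the HMAC standard; the function name `hmac_sha256` is just a label for this sketch.

```python
import hashlib
import hmac

BLOCK = 64  # SHA-256 block size in bytes (512 bits)

def hmac_sha256(key: bytes, msg: bytes) -> bytes:
    if len(key) > BLOCK:                      # long keys are hashed first
        key = hashlib.sha256(key).digest()
    key = key.ljust(BLOCK, b"\x00")           # pad the key to one full block
    k_ipad = bytes(b ^ 0x36 for b in key)     # k xor ipad
    k_opad = bytes(b ^ 0x5C for b in key)     # k xor opad
    inner = hashlib.sha256(k_ipad + msg).digest()
    return hashlib.sha256(k_opad + inner).digest()

key, msg = b"secret key", b"attack at dawn"
print(hmac_sha256(key, msg) == hmac.new(key, msg, hashlib.sha256).digest())
```

Because the outer hash takes the inner digest as input rather than exposing it, the extension trick against H(k||m) no longer applies. In practice the library routine should be used instead of a hand-written version.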
HMAC properties
- HMAC is assumed to be a secure PRF
- Can be proven under certain PRF assumptions about h(.,.)
- Security bounds similar to NMAC
- Need $q^2/|T|$ to be negligible $(q \ll |T|^{1/2})$
- q: the number of messages you're MACing
- T: the output tag space
- In TLS: must support HMAC-SHA1-96
- SHA1: SHA1 hash function, output 160 bits
- -96: truncated to 96 bits
Next segment:
- timing attack on HMAC
3.6.2 Timing attacks on MAC verification
- a general attack that affects many implementations of MAC alg.
- a lesson for us!
Warning: verification timing attacks
- Example: Keyczar crypto library (Python)
- e.g.: HMAC(key,msg) and sig_bytes are both 16 bytes
def Verify(key, msg, sig_bytes):
    return HMAC(key,msg) == sig_bytes
- The problem: '==' implemented as a byte-by-byte comparison
- comparator returns false when first inequality found
How to attack?
![Timing attack on MAC verification](http://qiniu.ruixu.top/1650074120941----cryptographyI_csdnimg.png)
- Timing attack: to compute tag for target message m do:
- what the attacker can use: an HMAC verification server that holds the key
- Step1: Query server with random tag, and measure how long verification takes
- Step2: Loop over all possible first bytes and query server
- Stop when verification takes longer than in step 1
- Step3: repeat for all tag bytes until valid tag found
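The byte-by-byte recovery can be simulated without measuring real time. In the sketch below the server's early-exit comparison reports how many bytes it compared, and that count stands in for the elapsed time; the key, the 16-byte tag length and all function names are assumptions made for the demo.

```python
import hashlib
import hmac

KEY = b"server-secret-key"   # hypothetical key, known only to the server

def true_tag(msg: bytes) -> bytes:
    return hmac.new(KEY, msg, hashlib.sha256).digest()[:16]

def leaky_verify(msg: bytes, tag: bytes):
    # Early-exit comparison; `steps` plays the role of the response time.
    expected = true_tag(msg)
    steps = 0
    for x, y in zip(expected, tag):
        steps += 1
        if x != y:
            return False, steps
    return len(tag) == len(expected), steps

def recover_tag(msg: bytes, tag_len: int = 16) -> bytes:
    guess = bytearray(tag_len)
    for pos in range(tag_len):
        for b in range(256):                 # loop over all values of this byte
            guess[pos] = b
            ok, steps = leaky_verify(msg, bytes(guess))
            if ok or steps > pos + 1:        # "took longer": byte is correct
                break
    return bytes(guess)

target = b"transfer 1000 to eve"
print(recover_tag(target) == true_tag(target))
```

The attacker needs at most 256 queries per byte, so about 16 × 256 queries in total instead of $2^{128}$ guesses. A real attack has to average many timing measurements per guess to overcome noise.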
How to defend?
Defense #1
- Make string comparator always take same time (python)
- in fact, the Keyczar lib. implemented exactly this defense
def Verify(key, msg, sig_bytes):
    mac = HMAC(key, msg)
    # reject tags of the wrong length up front
    if len(sig_bytes) != len(mac):
        return False
    result = 0
    # zip: create pairs between HMAC(key,msg) and sig_bytes
    for x, y in zip(mac, sig_bytes):
        result |= x ^ y  # bytes iterate as ints in Python 3
    return result == 0
- can be difficult to ensure due to optimizing compiler
- an optimizing compiler may end the for loop as soon as it sees an unequal byte pair
- which defeats the defense
Defense #2
- Make string comparator always take same time (python)
- hide the string that is actually being compared
def Verify(key, msg, sig_bytes):
    mac = HMAC(key,msg)
    return HMAC(key,mac) == HMAC(key,sig_bytes)
- Attacker does not know values being compared!
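Both defenses can be written against the standard library. In this sketch `HMAC` is defined as HMAC-SHA256 (an assumption, since the notes leave it abstract), the first function uses `hmac.compare_digest`, which the standard library provides for constant-time comparison, and the second mirrors the double-HMAC idea.

```python
import hashlib
import hmac

def HMAC(key: bytes, msg: bytes) -> bytes:
    return hmac.new(key, msg, hashlib.sha256).digest()

def verify_constant_time(key: bytes, msg: bytes, sig_bytes: bytes) -> bool:
    # Defense #1 style: comparison time does not depend on where bytes differ.
    return hmac.compare_digest(HMAC(key, msg), sig_bytes)

def verify_double_hmac(key: bytes, msg: bytes, sig_bytes: bytes) -> bool:
    # Defense #2 style: the attacker cannot predict the bytes being compared.
    mac = HMAC(key, msg)
    return HMAC(key, mac) == HMAC(key, sig_bytes)

key, msg = b"k" * 16, b"hello"
good = HMAC(key, msg)
bad = bytes([good[0] ^ 1]) + good[1:]
print(verify_constant_time(key, msg, good), verify_constant_time(key, msg, bad))
print(verify_double_hmac(key, msg, good), verify_double_hmac(key, msg, bad))
```

Using the library comparator avoids relying on how a hand-written loop gets compiled or interpreted.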
Lesson:
- there are many places to go wrong when implementing cryptographic algorithms
- be very cautious about implementing crypto yourself!