1. Desirable Goals
- Detect whether a piece of data has been altered
- Be assured that the data is authentic (e.g., was created by the person claiming to have created it)
- Prevent others from making sense of the data, even if they obtain access to it
Practical Example:
Subresource integrity checking for websites
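As a rough illustration of how subresource integrity works: the browser hashes the fetched resource and compares the result with the value given in the tag's integrity attribute. The following minimal Python sketch computes such a value, assuming SHA-384 and a made-up script body. The function name sri_value and the sample bytes are hypothetical and serve only as an illustration.

```python
import base64
import hashlib

def sri_value(resource: bytes, algorithm: str = "sha384") -> str:
    """Return an SRI-style string: '<algorithm>-<base64 of the raw digest>'."""
    digest = hashlib.new(algorithm, resource).digest()
    return f"{algorithm}-{base64.b64encode(digest).decode('ascii')}"

# Hypothetical script contents, used only for illustration.
script = b"console.log('hello');"
expected = sri_value(script)
print(expected)

# A browser-style check: any change to the resource changes the value.
tampered = script + b" "
print(sri_value(tampered) == expected)
```

If the recomputed value does not match the expected one, the resource has been altered and should be rejected.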
2. Hash Functions
Applying a hash function H to an input x of any length produces a fixed-length message digest, or hash, H(x)
Required Properties
- Pre-image resistance: given hash h, it is computationally infeasible to find x such that H(x) = h
- Second pre-image resistance: given input x, it is computationally infeasible to find input y such that y ≠ x and H(y) = H(x)
- Collision resistance: it is computationally infeasible to find any pair of different inputs {x, y} for which H(x) = H(y)
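To see why digest length matters for collision resistance, the sketch below searches for a collision on a deliberately weakened hash: SHA-256 truncated to its first 2 bytes (16 bits). The truncation, the helper name weak_hash and the counter-based inputs are assumptions made purely for illustration. With only 65,536 possible outputs a collision turns up almost immediately, whereas the same search against a full 256-bit digest is computationally infeasible.

```python
import hashlib

def weak_hash(data: bytes) -> bytes:
    """SHA-256 truncated to 2 bytes: a deliberately weak toy hash."""
    return hashlib.sha256(data).digest()[:2]

def find_collision():
    """Try inputs b'0', b'1', ... until two share the same weak hash."""
    seen = {}  # maps weak hash -> first input that produced it
    i = 0
    while True:
        x = str(i).encode()
        h = weak_hash(x)
        if h in seen:
            return seen[h], x
        seen[h] = x
        i += 1

x, y = find_collision()
print(x, y, weak_hash(x).hex())
```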
Exercise 1: Introduction to Hash Functions
Use GCHQ’s browser-based tool CyberChef: CyberChef (gchq.github.io)
Using Hash Functions
CyberChef computing an MD5 hash
c7cd4529f9ebe5b40ff061188ec6f5c1
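The same kind of digest can be computed outside CyberChef. Here is a minimal sketch using Python's hashlib, with hypothetical inputs (not the input that produced the hash shown above). Whatever the input length, the MD5 digest is 128 bits, i.e. 32 hexadecimal characters.

```python
import hashlib

# Hypothetical inputs of very different lengths, for illustration only.
inputs = [b"", b"hello", b"a" * 10_000]

for data in inputs:
    digest = hashlib.md5(data).hexdigest()
    # Prints the input length, the digest length (always 32) and the digest.
    print(len(data), len(digest), digest)
```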
Hash Collisions
This is an example of an MD5 collision. It was constructed in 2005 by Magnus Daum of Ruhr-Universität Bochum and Stefan Lucks of the University of Mannheim. The increasing feasibility of finding such collisions is the key reason why MD5 and, subsequently, SHA-1 are no longer considered usable for security purposes.
Exercise 2: Hash Functions in Python
1. Enter python in the terminal window to run Python’s REPL (read-eval-print loop). Read the contents of mes.txt into a byte string using the following code:
import pathlib
path = pathlib.Path("mes.txt")
data = path.read_bytes()
2. Use the hashlib module to compute and display a SHA-256 hash of the file contents, like so:
import hashlib
h = hashlib.sha256()
h.update(data)
print(h.hexdigest())
Note that the update() method can be called repeatedly, to feed the hash function with multiple items of data. If you have a single chunk of data, this example can be shortened to:
print(hashlib.sha256(data).hexdigest())
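Because update() can be called repeatedly, a large file can be hashed without loading it into memory all at once. The sketch below assumes a chunk size of 64 KiB and a hypothetical helper name sha256_file. It writes a small temporary file with made-up contents so that it runs on its own, but the same function could be pointed at mes.txt.

```python
import hashlib
import tempfile
from pathlib import Path

def sha256_file(path: Path, chunk_size: int = 65536) -> str:
    """Hash a file incrementally, reading chunk_size bytes at a time."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()

# Demonstration on a temporary file with made-up contents.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.bin"
    sample.write_bytes(b"x" * 200_000)
    chunked = sha256_file(sample)
    one_shot = hashlib.sha256(sample.read_bytes()).hexdigest()

print(chunked == one_shot)  # incremental and one-shot hashing agree
```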
Exercise 3: Computing HMACs in Python
1. Start a Python 3 REPL from a terminal window and use the secrets module to create a key consisting of 16 random bytes:
import secrets
key = secrets.token_bytes(16)
key.hex()
2. Now read the contents of mes1.txt and mes2.txt like so:
from pathlib import Path
message1 = Path("mes1.txt").read_bytes()
message2 = Path("mes2.txt").read_bytes()
3. To compute an authentication tag for mes1.txt, create an HMAC object from the key and file contents, specifying the required hash function. Then call the digest() method to retrieve the tag as a sequence of bytes:
import hmac
h = hmac.new(key, message1, digestmod="sha256")
tag1 = h.digest()
tag1.hex()
4. Compute a tag for the second message in the same way and compare it with the first. Notice that we deliberately do not test for tag equality using tag1 == tag2. The compare_digest function performs a more careful constant-time comparison, as a defence against timing attacks:
h = hmac.new(key, message2, digestmod="sha256")
tag2 = h.digest()
print(hmac.compare_digest(tag1, tag2))
If you try this, it should print False, indicating that the HMACs are different, hence mes2.txt is different from mes1.txt. Although the attacker has been able to change the message, they cannot forge a valid authentication tag for it because they do not have the key needed to generate a valid tag.
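The sender and receiver roles described above can be sketched as a pair of helper functions. The names make_tag and verify, and the sample messages, are hypothetical. The sketch assumes both parties already share the key and that the attacker does not have it.

```python
import hashlib
import hmac
import secrets

def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 authentication tag for message."""
    return hmac.new(key, message, digestmod=hashlib.sha256).digest()

def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)

key = secrets.token_bytes(16)
original = b"Pay Alice 10 pounds"  # made-up message
tag = make_tag(key, original)

# The genuine message verifies; an altered message with the old tag does not.
ok = verify(key, original, tag)
tampered_ok = verify(key, b"Pay Mallory 1000 pounds", tag)

# An attacker without the key can only tag the altered message with a guessed key.
forged_tag = make_tag(secrets.token_bytes(16), b"Pay Mallory 1000 pounds")
forged_ok = verify(key, b"Pay Mallory 1000 pounds", forged_tag)

print(ok, tampered_ok, forged_ok)
```

Only the first check should succeed: the second fails because the message changed, and the third because the tag was produced with a different key.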