Introduction to Format Preserving Encryption
1. Introduction
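A block cipher mode of operation is built on a block cipher algorithm such as AES. It makes it possible to encrypt binary data of arbitrary length, which removes the limitation that a block cipher on its own only handles fixed-length input (for example, AES encrypts exactly one 128-bit block of plaintext at a time). Non-binary data can be converted to binary first and then encrypted, but the result does not keep the original format. For example, a US Social Security Number (SSN) consists of nine decimal digits; after conventional encryption the ciphertext is far longer than nine digits, so the traditional modes of operation are not suitable. Format Preserving Encryption (FPE) was designed to handle this class of problems. FPE is commonly used in data masking, where plaintext and ciphertext must share the same format: an encrypted SSN is not an opaque fixed-length blob but a scrambled number that still has the format of an SSN, as the figure below shows: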
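This keeps the ciphertext meaningful. It can still be used for broader purposes such as data mining, which balances confidentiality against usability.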
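To make the length problem concrete, here is a minimal sketch that only computes sizes and performs no encryption. It assumes a padded block mode (such as CBC) with PKCS#7 padding and a 16-byte AES block; the helper name `padded_ciphertext_len` and the sample SSN are made up for illustration.

```python
AES_BLOCK_BYTES = 16  # AES always works on 128-bit (16-byte) blocks


def padded_ciphertext_len(plaintext_len: int, block: int = AES_BLOCK_BYTES) -> int:
    """Ciphertext length in bytes for a block mode with PKCS#7 padding.

    PKCS#7 always appends between 1 and `block` bytes, so the result is the
    next multiple of `block` strictly greater than `plaintext_len`.
    """
    return (plaintext_len // block + 1) * block


ssn = "123456789"  # made-up nine-digit value, stored as 9 ASCII bytes
ct_len = padded_ciphertext_len(len(ssn.encode("ascii")))

print(ct_len)      # 16 bytes of ciphertext for 9 bytes of plaintext
print(ct_len * 2)  # 32 characters once the ciphertext is hex-encoded
```

The ciphertext is therefore longer than the input and consists of arbitrary bytes rather than digits, so it can no longer be stored in a field that expects a nine-digit number.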
2. Format Preserving Encryption Algorithms
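Two FPE modes are in common use today, FF1 and FF3, and both are Feistel-based encryption modes. FF2 did not provide the expected 128 bits of security strength as designed, so it was dropped. Each FPE mode fits within a larger framework called FFX. FF1 and FF3 both use a Feistel structure whose round function is built on an approved block cipher with a 128-bit block size, which in practice means AES.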
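FF1 and FF3 both take an additional input, the "tweak", which does not need to be kept secret. The tweak can be regarded as a changeable part of the key, because the key and the tweak together determine the encryption and decryption functions. Tweaks that vary can be especially important for implementations of FPE modes, because the number of possible values for the confidential data is often relatively small.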
FF1 and FF3 offer somewhat different performance advantages. FF1 supports a greater range of lengths for the protected, formatted data, as well as flexibility in the length of the tweak. FF3 achieves greater throughput, mainly because its round count is eight, compared to ten for FF1.
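The sketch below illustrates the general idea of a tweakable, Feistel-based format-preserving cipher on decimal strings. It is not FF1 or FF3 and must not be used to protect real data: the round function uses HMAC-SHA-256 from the Python standard library instead of AES, and every name in it (`toy_fpe_encrypt`, `toy_fpe_decrypt`, `_round_value`, `_split`) is hypothetical. The ten rounds merely echo the FF1 round count mentioned above.

```python
import hashlib
import hmac

RADIX = 10   # decimal digits only, to keep the sketch short
ROUNDS = 10  # even, so both halves end with their original widths


def _round_value(key: bytes, tweak: bytes, round_no: int, half: int, modulus: int) -> int:
    """Pseudorandom value derived from the key, tweak, round index and one half."""
    msg = len(tweak).to_bytes(4, "big") + tweak + bytes([round_no]) + str(half).encode()
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % modulus


def _split(digits: str):
    """Split a digit string into left/right widths and their integer values."""
    if len(digits) < 2 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected a string of at least two decimal digits")
    u = len(digits) // 2
    return u, len(digits) - u, int(digits[:u]), int(digits[u:])


def toy_fpe_encrypt(key: bytes, tweak: bytes, digits: str) -> str:
    u, v, a, b = _split(digits)
    for i in range(ROUNDS):
        # The half being updated alternates between width u and width v.
        modulus = RADIX ** (u if i % 2 == 0 else v)
        c = (a + _round_value(key, tweak, i, b, modulus)) % modulus
        a, b = b, c
    return f"{a:0{u}d}{b:0{v}d}"


def toy_fpe_decrypt(key: bytes, tweak: bytes, digits: str) -> str:
    u, v, a, b = _split(digits)
    for i in reversed(range(ROUNDS)):
        modulus = RADIX ** (u if i % 2 == 0 else v)
        # Undo one round: subtract the same round value that was added.
        a, b = (b - _round_value(key, tweak, i, a, modulus)) % modulus, a
    return f"{a:0{u}d}{b:0{v}d}"


key = b"demo key, not a real secret"
ct = toy_fpe_encrypt(key, b"tweak-A", "123456789")
print(ct)                                     # another nine-digit string
print(toy_fpe_decrypt(key, b"tweak-A", ct))   # 123456789
```

Because the tweak is mixed into every round value, encrypting the same digits under a different tweak will in general give a different ciphertext. That is the property that matters when the set of possible plaintexts is small.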
2.1 Preliminary
Alphabet: A finite set of two or more symbols is called an alphabet.
Character: The symbols in an alphabet are called the characters of the alphabet.
Radix: The number of characters in an alphabet is called the base, denoted by radix; thus, radix ≥ 2.
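As a small illustration of these definitions, the sketch below computes the radix of two alphabets and the number of distinct strings of a given length, which is radix raised to that length. The helper names `radix_of` and `domain_size` are made up for this example.

```python
import math
import string


def radix_of(alphabet: str) -> int:
    """Return the radix (number of distinct characters) of an alphabet."""
    if len(set(alphabet)) != len(alphabet):
        raise ValueError("alphabet characters must be distinct")
    if len(alphabet) < 2:
        raise ValueError("an alphabet needs at least two characters")
    return len(alphabet)


def domain_size(alphabet: str, length: int) -> int:
    """Number of distinct strings of the given length over the alphabet."""
    return radix_of(alphabet) ** length


print(radix_of(string.digits))           # 10
print(radix_of(string.ascii_lowercase))  # 26

ssn_space = domain_size(string.digits, 9)
print(ssn_space)                         # 1000000000
print(round(math.log2(ssn_space), 1))    # 29.9 bits, far below a 128-bit block
```

The last line shows why the plaintext space in FPE applications is often small: all nine-digit numbers fit in fewer than 30 bits.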
Before characters can be encrypted with FPE, they must first be encoded. For example, for the lowercase English letters: