[crypto]-30-The Armv8 Cryptographic Extension在linux中的应用

快速链接:
.
👉👉👉 个人博客笔记导读目录(全部) 👈👈👈


相关推荐:
Armv8 Cryptographic Extension介绍
Linux Kernel aarch64 Crypto原理和框架介绍
Linux kernel内核调用crypto算法的方法
Linux Kernel aarch64的ARM-CE aes-ecb的底层代码导读


说明: 在无特别的说明下,本文讲述得都是armv8-aarch64体系、linux kernel 4.14 arm64软件环境!

在这里插入图片描述

1、在linux crypto底层,实现aes/hash的算法有三种方式:
(1)、cpu的纯软实现,使用cpu的ALU,x0-x30等寄存器,加加减减的计算。(本文不讨论此项)
(2)、ARM-CE,就是The Armv8 Cryptographic Extension了,调用arm-ce的指令和寄存器,进行加加减减计算
(3)、ARM-NEON : 调用arm neon指令(128bit的寄存器v0-v31),进行加加减减计算

再进一步阐述ARM-CE,其实也调用NEON的浮点型运算器,读写arm-ce的寄存器、以前使用arm-ce的指令,进行加解密运算。

2、需要明确一点
The Armv8 Cryptographic Extension provides instructions for the acceleration of encryption and decryption
Armv8 Cryptographic Extension 并不是单独的硬件 ,只是ARM扩展了一套寄存器和命令,依然还是cpu计算的
ARM NEON也是,也是一套寄存器和命令,依然还是cpu在下执行

3、以aes为例,arm-ce和arm-neon的crypto的暴露给上层的接口代码,都在下列文件中
在aes-glue.c中, 注册了aes算法接口:

static struct crypto_alg aes_algs[] = { {
	.cra_name		= "__ecb-aes-" MODE,
	.cra_driver_name	= "__driver-ecb-aes-" MODE,
	.cra_priority		= 0,
	.cra_flags		= CRYPTO_ALG_TYPE_BLKCIPHER |
				  CRYPTO_ALG_INTERNAL,
	.cra_blocksize		= AES_BLOCK_SIZE,
	.cra_ctxsize		= sizeof(struct crypto_aes_ctx),
	.cra_alignmask		= 7,
	.cra_type		= &crypto_blkcipher_type,
	.cra_module		= THIS_MODULE,
	.cra_blkcipher = {
		.min_keysize	= AES_MIN_KEY_SIZE,
		.max_keysize	= AES_MAX_KEY_SIZE,
		.ivsize		= 0,
		.setkey		= aes_setkey,
		.encrypt	= ecb_encrypt,
		.decrypt	= ecb_decrypt,
	},
}, {
	.cra_name		= "__cbc-aes-" MODE,
	.cra_driver_name	= "__driver-cbc-aes-" MODE,
	.cra_priority		= 0,
	.cra_flags		= CRYPTO_ALG_TYPE_BLKCIPHER |
				  CRYPTO_ALG_INTERNAL,
	.cra_blocksize		= AES_BLOCK_SIZE,
	.cra_ctxsize		= sizeof(struct crypto_aes_ctx),
	.cra_alignmask		= 7,
	.cra_type		= &crypto_blkcipher_type,
	.cra_module		= THIS_MODULE,
	.cra_blkcipher = {
		.min_keysize	= AES_MIN_KEY_SIZE,
		.max_keysize	= AES_MAX_KEY_SIZE,
		.ivsize		= AES_BLOCK_SIZE,
		.setkey		= aes_setkey,
		.encrypt	= cbc_encrypt,
		.decrypt	= cbc_decrypt,
	},
}, {
	.cra_name		= "__ctr-aes-" MODE,
	.cra_driver_name	= "__driver-ctr-aes-" MODE,
	.cra_priority		= 0,
	.cra_flags		= CRYPTO_ALG_TYPE_BLKCIPHER |
				  CRYPTO_ALG_INTERNAL,
	.cra_blocksize		= 1,
	.cra_ctxsize		= sizeof(struct crypto_aes_ctx),
	.cra_alignmask		= 7,
	.cra_type		= &crypto_blkcipher_type,
	.cra_module		= THIS_MODULE,
	.cra_blkcipher = {
		.min_keysize	= AES_MIN_KEY_SIZE,
		.max_keysize	= AES_MAX_KEY_SIZE,
		.ivsize		= AES_BLOCK_SIZE,
		.setkey		= aes_setkey,
		.encrypt	= ctr_encrypt,
		.decrypt	= ctr_encrypt,
	},
}, {
	.cra_name		= "__xts-aes-" MODE,
	.cra_driver_name	= "__driver-xts-aes-" MODE,
	.cra_priority		= 0,
	.cra_flags		= CRYPTO_ALG_TYPE_BLKCIPHER |
				  CRYPTO_ALG_INTERNAL,
	.cra_blocksize		= AES_BLOCK_SIZE,
	.cra_ctxsize		= sizeof(struct crypto_aes_xts_ctx),
	.cra_alignmask		= 7,
	.cra_type		= &crypto_blkcipher_type,
	.cra_module		= THIS_MODULE,
	.cra_blkcipher = {
		.min_keysize	= 2 * AES_MIN_KEY_SIZE,
		.max_keysize	= 2 * AES_MAX_KEY_SIZE,
		.ivsize		= AES_BLOCK_SIZE,
		.setkey		= xts_set_key,
		.encrypt	= xts_encrypt,
		.decrypt	= xts_decrypt,
	},
}, {
	.cra_name		= "ecb(aes)",
	.cra_driver_name	= "ecb-aes-" MODE,
	.cra_priority		= PRIO,
	.cra_flags		= CRYPTO_ALG_TYPE_ABLKCIPHER|CRYPTO_ALG_ASYNC,
	.cra_blocksize		= AES_BLOCK_SIZE,
	.cra_ctxsize		= sizeof(struct async_helper_ctx),
	.cra_alignmask		= 7,
	.cra_type		= &crypto_ablkcipher_type,
	.cra_module		= THIS_MODULE,
	.cra_init		= ablk_init,
	.cra_exit		= ablk_exit,
	.cra_ablkcipher = {
		.min_keysize	= AES_MIN_KEY_SIZE,
		.max_keysize	= AES_MAX_KEY_SIZE,
		.ivsize		= 0,
		.setkey		= ablk_set_key,
		.encrypt	= ablk_encrypt,
		.decrypt	= ablk_decrypt,
	}
}, {
	.cra_name		= "cbc(aes)",
	.cra_driver_name	= "cbc-aes-" MODE,
	.cra_priority		= PRIO,
	.cra_flags		= CRYPTO_ALG_TYPE_ABLKCIPHER|CRYPTO_ALG_ASYNC,
	.cra_blocksize		= AES_BLOCK_SIZE,
	.cra_ctxsize		= sizeof(struct async_helper_ctx),
	.cra_alignmask		= 7,
	.cra_type		= &crypto_ablkcipher_type,
	.cra_module		= THIS_MODULE,
	.cra_init		= ablk_init,
	.cra_exit		= ablk_exit,
	.cra_ablkcipher = {
		.min_keysize	= AES_MIN_KEY_SIZE,
		.max_keysize	= AES_MAX_KEY_SIZE,
		.ivsize		= AES_BLOCK_SIZE,
		.setkey		= ablk_set_key,
		.encrypt	= ablk_encrypt,
		.decrypt	= ablk_decrypt,
	}
}, {
	.cra_name		= "ctr(aes)",
	.cra_driver_name	= "ctr-aes-" MODE,
	.cra_priority		= PRIO,
	.cra_flags		= CRYPTO_ALG_TYPE_ABLKCIPHER|CRYPTO_ALG_ASYNC,
	.cra_blocksize		= 1,
	.cra_ctxsize		= sizeof(struct async_helper_ctx),
	.cra_alignmask		= 7,
	.cra_type		= &crypto_ablkcipher_type,
	.cra_module		= THIS_MODULE,
	.cra_init		= ablk_init,
	.cra_exit		= ablk_exit,
	.cra_ablkcipher = {
		.min_keysize	= AES_MIN_KEY_SIZE,
		.max_keysize	= AES_MAX_KEY_SIZE,
		.ivsize		= AES_BLOCK_SIZE,
		.setkey		= ablk_set_key,
		.encrypt	= ablk_encrypt,
		.decrypt	= ablk_decrypt,
	}
}, {
	.cra_name		= "xts(aes)",
	.cra_driver_name	= "xts-aes-" MODE,
	.cra_priority		= PRIO,
	.cra_flags		= CRYPTO_ALG_TYPE_ABLKCIPHER|CRYPTO_ALG_ASYNC,
	.cra_blocksize		= AES_BLOCK_SIZE,
	.cra_ctxsize		= sizeof(struct async_helper_ctx),
	.cra_alignmask		= 7,
	.cra_type		= &crypto_ablkcipher_type,
	.cra_module		= THIS_MODULE,
	.cra_init		= ablk_init,
	.cra_exit		= ablk_exit,
	.cra_ablkcipher = {
		.min_keysize	= 2 * AES_MIN_KEY_SIZE,
		.max_keysize	= 2 * AES_MAX_KEY_SIZE,
		.ivsize		= AES_BLOCK_SIZE,
		.setkey		= ablk_set_key,
		.encrypt	= ablk_encrypt,
		.decrypt	= ablk_decrypt,
	}
} };

4、aes-glue.c文件中并且定义了每一个接口指向的底层函数, 这里根据USE_V8_CRYPTO_EXTENSIONS分为了两中情况:
(1)、使用arm的crypto extension硬件算aes
(2)、使用arm的SIMD指令(ARM NEON)来算aes

#ifdef USE_V8_CRYPTO_EXTENSIONS
#define MODE			"ce"
#define PRIO			300
#define aes_setkey		ce_aes_setkey
#define aes_expandkey		ce_aes_expandkey
#define aes_ecb_encrypt		ce_aes_ecb_encrypt
#define aes_ecb_decrypt		ce_aes_ecb_decrypt
#define aes_cbc_encrypt		ce_aes_cbc_encrypt
#define aes_cbc_decrypt		ce_aes_cbc_decrypt
#define aes_ctr_encrypt		ce_aes_ctr_encrypt
#define aes_xts_encrypt		ce_aes_xts_encrypt
#define aes_xts_decrypt		ce_aes_xts_decrypt
MODULE_DESCRIPTION("AES-ECB/CBC/CTR/XTS using ARMv8 Crypto Extensions");
#else
#define MODE			"neon"
#define PRIO			200
#define aes_setkey		crypto_aes_set_key
#define aes_expandkey		crypto_aes_expand_key
#define aes_ecb_encrypt		neon_aes_ecb_encrypt
#define aes_ecb_decrypt		neon_aes_ecb_decrypt
#define aes_cbc_encrypt		neon_aes_cbc_encrypt
#define aes_cbc_decrypt		neon_aes_cbc_decrypt
#define aes_ctr_encrypt		neon_aes_ctr_encrypt
#define aes_xts_encrypt		neon_aes_xts_encrypt
#define aes_xts_decrypt		neon_aes_xts_decrypt
MODULE_DESCRIPTION("AES-ECB/CBC/CTR/XTS using ARMv8 NEON");
#endif

5、以ce_aes_cbc_encrypt为例,看下crypto extension的硬件实现:
在aes-ce-core.S中, 由此可以看出,这里最终调用的都是aesd、aesmc…等armv8 crypto extension的汇编指令

ENTRY(ce_aes_cbc_encrypt)
	push		{r4-r6, lr}
	ldrd		r4, r5, [sp, #16]
	vld1.8		{q0}, [r5]
	prepare_key	r2, r3
.Lcbcencloop:
	vld1.8		{q1}, [r1, :64]!	@ get next pt block
	veor		q0, q0, q1		@ ..and xor with iv
	bl		aes_encrypt
	vst1.8		{q0}, [r0, :64]!
	subs		r4, r4, #1
	bne		.Lcbcencloop
	vst1.8		{q0}, [r5]
	pop		{r4-r6, pc}
ENDPROC(ce_aes_cbc_encrypt)
aes_encrypt:
	add		ip, r2, #32		@ 3rd round key
.Laes_encrypt_tweak:
	do_block	enc_dround, enc_fround
ENDPROC(aes_encrypt)
.macro		enc_dround, key1, key2
enc_round	q0, \key1
enc_round	q0, \key2
.endm
	.macro		enc_round, state, key
aese.8		\state, \key
aesmc.8		\state, \state
.endm

6、以ce_aes_cbc_encrypt为例,看下NEON的硬件实现:
在aes-neon.S/aes-modes.S中, 由此可以看出,这里最终操作的都是v0-v31等128bit的SIMD寄存器

AES_ENTRY(aes_cbc_encrypt)
	cbz		w6, .Lcbcencloop

	ld1		{v0.16b}, [x5]			/* get iv */
	enc_prepare	w3, x2, x6

	/* do preload for encryption */
	.macro		enc_prepare, ignore0, ignore1, temp
	prepare		.LForward_Sbox, .LForward_ShiftRows, \temp
	.endm
/* preload the entire Sbox */
.macro		prepare, sbox, shiftrows, temp
adr		\temp, \sbox
movi		v12.16b, #0x40
ldr		q13, \shiftrows
movi		v14.16b, #0x1b
ld1		{v16.16b-v19.16b}, [\temp], #64
ld1		{v20.16b-v23.16b}, [\temp], #64
ld1		{v24.16b-v27.16b}, [\temp], #64
ld1		{v28.16b-v31.16b}, [\temp]
.endm

7、疑问与答案
kernel进程A在cpu0上跑的时候,被调度了,此时会将通用寄存器(如x0-x30,sp,pc等)保存起来,然后调度。等再回来时,(假设被cpu1接着执行了)再恢复下这些寄存器。程序就可以接着跑。
调度时只是保存了x0-x30等通用寄存器,并没有保存NEON寄存器v0-v31、ARM-CE的寄存器。
假如kernel线程A正在cpu0上执行浮点型运算(ARM NEON运算),然后被调度了(没有保存v0-v31),等再回来时是cpu1再来执行该线程,那么怎么可以回复之前的状态呢?

答案在Documentation/arm/kernel_mode_neon.txt中:
Use only NEON instructions, or VFP instructions that don’t rely on support
(1)Isolate your NEON code in a separate compilation unit, and compile it with ‘-mfpu=neon -mfloat-abi=softfp’
(2)Put kernel_neon_begin() and kernel_neon_end() calls around the calls into your NEON code
(3)Don’t sleep in your NEON code, and be aware that it will be executed with preemption disabled

也就是说在执行NEON的代码中 ,需要使用kernel_neon_begin()/kernel_neon_end()包上,这样这段代码就不会被抢占、属于原子操作了。就没有调度之说了
示例:请看arch/arnm64/crypto/aes-glue.c,在该代码中,在调用ARM-NEON或ARM-CE时,都是使用kernel_neon_begin()/kernel_neon_end()包上了

如下列代码所示,arm-neon和arm-ce的ecb_encrypt都是调用的下列函数,然后aes_ecb_encrypt是一个宏,有USE_V8_CRYPTO_EXTENSIONS宏来决定,其要么指向ARM-CE的函数,要么指向ARM-NEON的函数
ecb_encrypt的主体是被kernel_neon_begin()/kernel_neon_end()包裹着的,所以这段函数是原子操作。

static int ecb_encrypt(struct blkcipher_desc *desc, struct scatterlist *dst,
		       struct scatterlist *src, unsigned int nbytes)
{
	struct crypto_aes_ctx *ctx = crypto_blkcipher_ctx(desc->tfm);
	int err, first, rounds = 6 + ctx->key_length / 4;
	struct blkcipher_walk walk;
	unsigned int blocks;

	desc->flags &= ~CRYPTO_TFM_REQ_MAY_SLEEP;
	blkcipher_walk_init(&walk, dst, src, nbytes);
	err = blkcipher_walk_virt(desc, &walk);

	kernel_neon_begin();
	for (first = 1; (blocks = (walk.nbytes / AES_BLOCK_SIZE)); first = 0) {
		aes_ecb_encrypt(walk.dst.virt.addr, walk.src.virt.addr,
				(u8 *)ctx->key_enc, rounds, blocks, first);
		err = blkcipher_walk_done(desc, &walk, walk.nbytes % AES_BLOCK_SIZE);
	}
	kernel_neon_end();
	return err;
}

8、userspace调用、物理地址的连续性、对齐、block
linux kernel crypto和用户空间通过netlink交互,在linux kernel的algif_skcipher.c程序中,会将userspace sock传来的数据,保存在scatterlist结构体(离散的物理块).
然后在aes-ce-glue.c中,依次对每一块连续的物理地址数据调用底层的加解密函数
在底层的加解密函数中,需要自行解决对齐的处理、分block的处理


相关推荐:
         [crypto]-01-对称加解密AES原理概念详解
         [crypto]-02-非对称加解密RSA原理概念详解
         [crypto]-03-数字摘要HASH原理概念详解
         [crypto]-04-国产密码算法(国密算法sm2/sm3/sm4)介绍
         [crypto]-05-转载:PKCS #1 RSA Encryption Version 1.5介绍
         [crypto]-05.1-PKCS PKCS#1 PKCS#7 PKCS#11的介绍
         [crypto]-06-CA证书介绍和使用方法


         [crypto]-30-The Armv8 Cryptographic Extension在linux中的应用
         [crypto]-31-crypto engion的学习和总结


         [crypto]-50-base64_encode和base64_decode的C语言实现
         [crypto]-51-RSA私钥pem转换成der, 在将der解析出n e d p q dp dq qp
         [crypto]-52-python3中rsa(签名验签加密解密)aes(ecb cbc ctr)hmac的使用,以及unittest测试用
         [crypto]-53-openssl命令行的使用(aes/rsa签名校验/rsa加密解密/hmac)


         [crypto]-90-crypto的一些术语和思考

  • 1
    点赞
  • 10
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 3
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论 3
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

代码改变世界ctw

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值