Learning AVX Vectorization (Part 1)


Basic operations with the AVX instruction set

Adding two arrays of double with the AVX instruction set

My blog
https://amicoyuan.github.io/

The AVX intrinsics used

1.

__m256d _mm256_loadu_pd (double const * mem_addr)

Description

Load 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from memory into dst. mem_addr does not need to be aligned on any particular boundary.

Operation

dst[255:0] := MEM[mem_addr+255:mem_addr]
dst[MAX:256] := 0

2.

__m256d _mm256_add_pd (__m256d a, __m256d b)

Description

Add packed double-precision (64-bit) floating-point elements in a and b, and store the results in dst.

Operation

FOR j := 0 to 3
	i := j*64
	dst[i+63:i] := a[i+63:i] + b[i+63:i]
ENDFOR
dst[MAX:256] := 0

3.

void _mm256_storeu_pd (double * mem_addr, __m256d a)

Description

Store 256-bits (composed of 4 packed double-precision (64-bit) floating-point elements) from a into memory. mem_addr does not need to be aligned on any particular boundary.

Operation

MEM[mem_addr+255:mem_addr] := a[255:0]
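
The three intrinsics above can be combined into a single vector step. The following is a minimal sketch, not a definitive implementation: the helper name add4_pd and the AVX_TARGET macro are hypothetical, and the target attribute is a GCC/Clang extension assumed here so that the function compiles even when the -mavx flag is not passed.

```c
#include <immintrin.h>

/* Assumption: GCC/Clang-only attribute that enables AVX for one function,
   so the sketch builds even without -mavx. Other compilers get nothing. */
#if defined(__GNUC__)
#define AVX_TARGET __attribute__((target("avx")))
#else
#define AVX_TARGET
#endif

/* Hypothetical helper: adds exactly four doubles, c[k] = a[k] + b[k]. */
AVX_TARGET
void add4_pd(const double *a, const double *b, double *c)
{
    __m256d va = _mm256_loadu_pd(a);     /* unaligned load of a[0..3]   */
    __m256d vb = _mm256_loadu_pd(b);     /* unaligned load of b[0..3]   */
    __m256d vc = _mm256_add_pd(va, vb);  /* four additions at once      */
    _mm256_storeu_pd(c, vc);             /* unaligned store to c[0..3]  */
}
```

Calling add4_pd(a, b, c) on three arrays of at least four doubles writes the element-wise sums into c. All three pointers must have four readable or writable elements, because each load and store always touches 256 bits.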

Without AVX vectorization

Program source code

#include <stdio.h>

int main(void)
{
	double a[9] = {1.1,2.2,3.3,4.4,5.5,6.6,7.7,8.8,2.1};
	double b[9] = {2.1,3.2,6.4,8.6,3.7,9.9,5.1,4.2,6.6};
	double c[9] = {0};

	/* Scalar loop: one addition per iteration. */
	for (int i = 0; i < 9; i++)
	{
		c[i] = a[i] + b[i];
	}

	printf("this is c.\n");
	for (int i = 0; i < 9; i++)
	{
		printf("%lf\n", c[i]);
	}

	return 0;
}

Program output

this is c.
3.200000
5.400000
9.700000
13.000000
9.200000
16.500000
12.800000
13.000000
8.700000

With AVX vectorization

Program source code

#include <stdio.h>
#include <immintrin.h>

int main(void)
{
	double a[9] = {1.1,2.2,3.3,4.4,5.5,6.6,7.7,8.8,2.1};
	double b[9] = {2.1,3.2,6.4,8.6,3.7,9.9,5.1,4.2,6.6};
	double c[9] = {0};
	__m256d v0;
	__m256d v1;
	__m256d v2;
	int i = 0;

	/* Vector loop: 4 doubles per iteration, as long as a full
	   group of 4 elements remains (i = 0 and i = 4 here). */
	for (; i <= 9 - 4; i += 4)
	{
		v0 = _mm256_loadu_pd(a + i);
		v1 = _mm256_loadu_pd(b + i);
		v2 = _mm256_add_pd(v0, v1);
		_mm256_storeu_pd(c + i, v2);
	}

	/* Scalar tail: the leftover elements (only index 8 here). */
	for (; i < 9; i++)
	{
		c[i] = a[i] + b[i];
	}

	printf("this is c with AVX.\n");
	for (int i = 0; i < 9; i++)
	{
		printf("%lf\n", c[i]);
	}

	return 0;
}

Program output

this is c with AVX.
3.200000
5.400000
9.700000
13.000000
9.200000
16.500000
12.800000
13.000000
8.700000
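
The same pattern (a vector loop over groups of 4 followed by a scalar tail) also works for an array length that is not fixed at 9. The following is a rough sketch under stated assumptions: the function name add_arrays_avx and the AVX_TARGET macro are hypothetical, and the target attribute is a GCC/Clang extension used only so that the function compiles without the -mavx flag.

```c
#include <stddef.h>
#include <immintrin.h>

/* Assumption: GCC/Clang-only attribute that enables AVX for one function. */
#if defined(__GNUC__)
#define AVX_TARGET __attribute__((target("avx")))
#else
#define AVX_TARGET
#endif

/* Hypothetical helper: c[k] = a[k] + b[k] for k in [0, n). */
AVX_TARGET
void add_arrays_avx(const double *a, const double *b, double *c, size_t n)
{
    size_t i = 0;

    /* Vector loop: runs while a full group of 4 doubles remains.
       Writing the bound as i + 4 <= n avoids unsigned underflow
       that n - 4 would cause when n < 4. */
    for (; i + 4 <= n; i += 4) {
        __m256d va = _mm256_loadu_pd(a + i);
        __m256d vb = _mm256_loadu_pd(b + i);
        _mm256_storeu_pd(c + i, _mm256_add_pd(va, vb));
    }

    /* Scalar tail: at most 3 leftover elements. */
    for (; i < n; i++) {
        c[i] = a[i] + b[i];
    }
}
```

With the arrays from the program above, add_arrays_avx(a, b, c, 9) would handle indices 0 to 7 in two vector iterations and index 8 in the scalar tail.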

Related links

Intel® Intrinsics Guide: https://software.intel.com/sites/landingpage/IntrinsicsGuide/
