Highly recommended: https://www.codeproject.com/Articles/874396/Crunching-Numbers-with-AVX-and-AVX

1. Check which instruction sets your CPU supports:

Look it up directly on Intel's official site:

https://ark.intel.com/content/www/cn/zh/ark.html#@Processors

For example, this processor:

https://ark.intel.com/content/www/cn/zh/ark/products/75131/intel-core-i7-4900mq-processor-8m-cache-up-to-3-80-ghz.html
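
Besides looking the processor up on the website, the CPU can also be queried at run time. The following is a minimal sketch, assuming GCC or Clang on x86, where the `__builtin_cpu_init` and `__builtin_cpu_supports` builtins are available. The function names `cpu_has_avx`, `cpu_has_avx2` and `report_simd_support` are made up for this illustration.

```c
#include <stdio.h>

/* Returns 1 if the running CPU reports AVX support, 0 otherwise.
   Relies on GCC/Clang x86 builtins (an assumption of this sketch). */
int cpu_has_avx(void)
{
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx") ? 1 : 0;
}

/* Returns 1 if the running CPU reports AVX2 support, 0 otherwise. */
int cpu_has_avx2(void)
{
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") ? 1 : 0;
}

/* Prints a short summary of what the CPU reports. */
void report_simd_support(void)
{
    printf("AVX:  %s\n", cpu_has_avx()  ? "yes" : "no");
    printf("AVX2: %s\n", cpu_has_avx2() ? "yes" : "no");
}
```

Calling `report_simd_support()` from `main` prints one line per instruction set. This checks the machine the program runs on, which may differ from the machine it was compiled on.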

2. Test example:

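#include <stdio.h>
#include <immintrin.h>

int main(int argc, char* argv[])
{
    /* Arguments are given highest lane first, so lane 0 holds 40. */
    __m256i first = _mm256_set_epi64x(10, 20, 30, 40);
    __m256i second = _mm256_set_epi64x(5, 5, 5, 5);
    __m256i result = _mm256_add_epi64(first, second);

    /* Each lane is 64 bits wide; long long is 64-bit on Linux and Windows alike. */
    long long* values = (long long*) &result;
    printf("==%zu \n", sizeof(long long));
    for (int i = 0; i < 4; i++)
    {
        printf("%lld ", values[i]);
    }
    return 0;
}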
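
One detail of the example above is easy to miss: `_mm256_set_epi64x` takes its arguments highest lane first, while `_mm256_setr_epi64x` takes them in memory order. The sketch below stores both results with `_mm256_storeu_si256` so the lane order can be inspected. The function name `lane_order_demo` is made up for this illustration, and the `target("avx2")` attribute is a GCC/Clang extension used here only so the function builds without `-mavx2`; it can be dropped when the flag is passed.

```c
#include <immintrin.h>

/* Adds 5 to every lane of two vectors built from the same arguments,
   once with set (highest lane first) and once with setr (lowest lane first),
   and stores the lanes in memory order. Requires a CPU with AVX2. */
__attribute__((target("avx2")))
void lane_order_demo(long long set_out[4], long long setr_out[4])
{
    __m256i five = _mm256_set1_epi64x(5);

    __m256i a = _mm256_set_epi64x(10, 20, 30, 40);   /* lane 0 = 40 */
    __m256i b = _mm256_setr_epi64x(10, 20, 30, 40);  /* lane 0 = 10 */

    /* Unaligned stores avoid any alignment requirement on the output arrays. */
    _mm256_storeu_si256((__m256i *)set_out,  _mm256_add_epi64(a, five));
    _mm256_storeu_si256((__m256i *)setr_out, _mm256_add_epi64(b, five));
}
```

After the call, `set_out` holds 45, 35, 25, 15 and `setr_out` holds 15, 25, 35, 45, which is why the test example prints its sums in reverse of the order the arguments were written.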

For the meaning and usage of intrinsics such as _mm256_set_epi64x() and _mm256_add_epi64(), see:

https://software.intel.com/sites/landingpage/IntrinsicsGuide

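Note: after you tick a filter in the left sidebar, the results on the right are not always accurate. For example, the SSE instruction addss becomes vaddss on AVX-capable machines, yet it only shows up in the search when AVX512 is ticked.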

Compile commands:

gcc -mavx2 -S -fverbose-asm fun.c   # emit annotated assembly (fun.s) for inspection

gcc -mavx2 fun.c

One more example:

#include <stdio.h>
#include <immintrin.h>

float aa[] = {10, 20, 30, 40, 50, 60, 70, 80};
float bb[] = {0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5};
float cc[] = {0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5, 0.5};

int main(int argc, char* argv[])
{
    /* Unaligned loads: the arrays are not guaranteed to be 32-byte aligned. */
    __m256 first = _mm256_loadu_ps (aa);
    __m256 second = _mm256_loadu_ps (bb);
    __m256 result = _mm256_add_ps (first, second);

    /* Write the 8 sums back to cc. */
    _mm256_storeu_ps (cc, result);
    printf("==%zu \n", sizeof(float));
    for (int i = 0; i < 8; i++)
    {
        printf("%f\n", cc[i]);
    }
    return 0;
}
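
The example above handles exactly 8 floats, one full 256-bit register. For arrays of arbitrary length, a common pattern is to process 8 elements per iteration and finish the remainder with scalar code. The following is a sketch of that pattern and not part of the original example; the function name `add_arrays_avx` is made up, and the `target("avx")` attribute is a GCC/Clang extension used so the function builds without `-mavx`.

```c
#include <immintrin.h>
#include <stddef.h>

/* out[i] = a[i] + b[i] for i in [0, n). Requires a CPU with AVX. */
__attribute__((target("avx")))
void add_arrays_avx(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;

    /* Main loop: 8 floats (256 bits) per iteration, unaligned access. */
    for (; i + 8 <= n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);
        __m256 vb = _mm256_loadu_ps(b + i);
        _mm256_storeu_ps(out + i, _mm256_add_ps(va, vb));
    }

    /* Scalar tail: the remaining n % 8 elements. */
    for (; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}
```

With `n = 11`, for instance, the vector loop runs once for elements 0 to 7 and the scalar loop handles elements 8 to 10.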

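Troubleshooting:

AVX vector return without AVX enabled changes the ABI: the -mavx2 flag is missing from the compile command.

inlining failed in call to always_inline 'xxx': target specific option mismatch: the intrinsic needs an instruction set that is not enabled for this compilation. Pass the matching flag (e.g. -mavx2) and check that the CPU actually supports AVX2.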
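
Both messages come from the compiler's target options, not from the CPU itself. One way to see what the compiler has enabled is to test the predefined macros `__AVX__` and `__AVX2__`, which GCC and Clang define when the corresponding instruction set is switched on (for example by `-mavx2`). A minimal sketch, with the function names `built_with_avx` and `built_with_avx2` made up for this illustration:

```c
#include <stdio.h>

/* Returns 1 if this file was compiled with AVX enabled, 0 otherwise. */
int built_with_avx(void)
{
#ifdef __AVX__
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if this file was compiled with AVX2 enabled, 0 otherwise. */
int built_with_avx2(void)
{
#ifdef __AVX2__
    return 1;
#else
    return 0;
#endif
}

/* Prints which of the two instruction sets the compiler had enabled. */
void report_build_flags(void)
{
    printf("compiled with AVX:  %s\n", built_with_avx()  ? "yes" : "no");
    printf("compiled with AVX2: %s\n", built_with_avx2() ? "yes" : "no");
}
```

If `report_build_flags()` prints "no" for AVX2, the build is missing `-mavx2` (or an equivalent such as `-march=native` on an AVX2 machine), which matches the errors listed above.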

References:

https://software.intel.com/content/www/cn/zh/develop/articles/introduction-to-intel-advanced-vector-extensions.html

https://zhuanlan.zhihu.com/p/94649418

https://www.codeproject.com/Articles/874396/Crunching-Numbers-with-AVX-and-AVX

https://software.intel.com/sites/landingpage/IntrinsicsGuid