AVX512F instructions

Hi, 

First of all - thanks for creating (and open-sourcing) this swift code! Looks great!

I was looking through the SIMD wrappers for `AVX512F` in `vector.h` and noticed a few that refer to intrinsics that do not exist in `AVX512F`, or that could have better implementations. In particular, `vec_and` maps to `_mm512_and_ps`, which is not listed under `AVX512F` in the [Intel Intrinsics Guide](https://software.intel.com/sites/landingpage/IntrinsicsGuide/#avx512techs=AVX512F); it appears to require `AVX512DQ`. From the looks of it, the `and/or` operations in `AVX512F` itself are only available for masks and integer vectors, not for floating-point data types.
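To make the suggestion concrete, here is a minimal sketch of one possible workaround: doing the float `and` through the integer domain, which only needs `AVX512F`. The names `vec_and_f32`, `abs16_avx512` and `abs16` are hypothetical and not taken from `vector.h`. I am assuming GCC or Clang on x86-64 (for the `target` attribute and `__builtin_cpu_supports`), and I have not benchmarked this against the current wrappers.

```c
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__)
#include <immintrin.h>

/* Hypothetical stand-in for vec_and: bitwise AND of two float vectors using
 * only AVX512F intrinsics, by casting to integer vectors and back. */
__attribute__((target("avx512f")))
static __m512 vec_and_f32(__m512 a, __m512 b) {
  return _mm512_castsi512_ps(
      _mm512_and_epi32(_mm512_castps_si512(a), _mm512_castps_si512(b)));
}

/* Absolute value of 16 floats: clear the sign bit with the AND above. */
__attribute__((target("avx512f")))
static void abs16_avx512(float *x) {
  const __m512 no_sign = _mm512_castsi512_ps(_mm512_set1_epi32(0x7fffffff));
  _mm512_storeu_ps(x, vec_and_f32(_mm512_loadu_ps(x), no_sign));
}
#endif

/* Dispatcher: uses AVX512F when the CPU reports it, scalar code otherwise. */
void abs16(float *x) {
#if defined(__x86_64__)
  if (__builtin_cpu_supports("avx512f")) {
    abs16_avx512(x);
    return;
  }
#endif
  for (int i = 0; i < 16; i++) {
    uint32_t bits;
    memcpy(&bits, &x[i], sizeof bits); /* reinterpret float as raw bits */
    bits &= 0x7fffffffu;               /* clear the sign bit */
    memcpy(&x[i], &bits, sizeof bits);
  }
}
```

The casts are free at the instruction level (they only change the type seen by the compiler), so this should cost a single `and` per vector.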

I also saw that `vec_fabs` is implemented via two intrinsics -- is the new `_mm512_abs_ps` intrinsic too slow? 

I am also curious - I do not see any references to the `mask(z)_load` intrinsics (e.g. `_mm512_mask_load_ps` / `_mm512_maskz_load_ps`). I found those masked loads quite useful for staying in SIMD mode and eliminating the serial part of the code (the remainder loops for array lengths not divisible by the SIMD width).
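For illustration, a minimal sketch of the masked-remainder pattern I have in mind. The function names (`add_arrays`, `add_arrays_avx512`) are hypothetical and not from this repository. I use the unaligned variants (`_mm512_maskz_loadu_ps`, `_mm512_mask_storeu_ps`) so the sketch does not assume 64-byte aligned arrays, and I am again assuming GCC or Clang on x86-64.

```c
#include <stddef.h>

#if defined(__x86_64__)
#include <immintrin.h>

/* a[i] += b[i] for n floats, with the tail handled by a lane mask instead of
 * a scalar remainder loop. */
__attribute__((target("avx512f")))
static void add_arrays_avx512(float *a, const float *b, size_t n) {
  size_t i = 0;
  for (; i + 16 <= n; i += 16) {
    __m512 va = _mm512_loadu_ps(a + i);
    __m512 vb = _mm512_loadu_ps(b + i);
    _mm512_storeu_ps(a + i, _mm512_add_ps(va, vb));
  }
  if (i < n) {
    /* Set the low (n - i) bits: only those lanes are loaded and stored. */
    __mmask16 tail = (__mmask16)((1u << (n - i)) - 1u);
    __m512 va = _mm512_maskz_loadu_ps(tail, a + i);
    __m512 vb = _mm512_maskz_loadu_ps(tail, b + i);
    _mm512_mask_storeu_ps(a + i, tail, _mm512_add_ps(va, vb));
  }
}
#endif

/* Dispatcher: uses AVX512F when the CPU reports it, scalar code otherwise. */
void add_arrays(float *a, const float *b, size_t n) {
#if defined(__x86_64__)
  if (__builtin_cpu_supports("avx512f")) {
    add_arrays_avx512(a, b, n);
    return;
  }
#endif
  for (size_t i = 0; i < n; i++) {
    a[i] += b[i];
  }
}
```

Masked-off lanes are neither read nor written, so the tail iteration does not touch memory past the end of the arrays.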

Once again, the performance gains look awesome! 

