Home
Povilas Kanapickas edited this page Mar 30, 2014 · 26 revisions
The library is developed in C++11. A separate, C++03 branch is provided for compatibility with older compilers. Note that the master branch is unstable. If unsure, use one of the releases or at least the latest beta.
Supported instruction sets: SSE2, SSE3, SSSE3, SSE4.1, AVX, AVX2, FMA3, FMA4, XOP and NEON
Supported compilers:
- C++11 version:
  - GCC: 4.8.1, 4.7.3
  - Clang: 3.3, 3.4
- C++98 version:
  - GCC: 4.8.1, 4.7.3
  - Clang: 3.3, 3.4
A 2.0 release of the library is planned. It contains lots of new features and a different high-level architecture which necessitated a major API break.
Implemented changes:
- Expression template-based backend. It is used only for functions that may benefit from micro-optimizations.
- Support for vectors much longer than the native vector type. The only limitation is that the length must be a power of 2. The widest available instructions are used for the particular vector type.
- Vector initialization is simplified, for example: `int32<8> v = make_uint(2);` or `int* p = ...; v = load(p);`.
- The curiously recurring template pattern (CRTP) is used to categorize vector types. Function templates no longer need to be written for each vector type or combination of types; instead, an appropriate vector category may be used.
- Each vector type can be explicitly constructed from any other vector with the same size.
- Most functions accept a much wider range of vector type combinations. For example, bitwise functions accept any two vectors of the same size.
- If different vector types are used as arguments to such functions, the return type is computed as if one or both of the arguments were "promoted" according to certain rules. For example, `int32 + int32 --> int32`, whereas `uint32 + int32 --> uint32`, and `uint32 + float32 --> float32`. See simdpp/types/tag.h for more information.
- API break: `int128` and `int256` types have been removed. On some architectures such as AVX512 it's more efficient to have different physical representations for vectors with different element widths. E.g. 8-bit integer elements would use 256-bit vectors and 32-bit integer elements would use 512-bit vectors.
- API break: `basic_int##` types have been removed. The CRTP-based type categorization and promotion rules make a second, inheritance-based vector categorization system impossible. In the majority of cases `basic_int##` can be straightforwardly replaced with `uint##`.
- API break: the 'broadcast' family of functions has been renamed to 'splat'.
- API break: 'permute' family of functions has been renamed to 'permute2' and 'permute4' depending on the number of template arguments taken.
- API break: value conversion functions such as to_float32x4 have been renamed and now return a vector with the same number of elements as the source vector.
- API break: saturated add and sub are now called `sat_add` and `sat_sub`.
- More API breaks... (grep for 'API break' or 'API-break' in the commit logs)
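
The CRTP-based categorization and the promotion rules described in the list above can be illustrated with a small standalone sketch. It does not use the library and is not its implementation: `any_vec`, `vec_i32`, `vec_u32`, `vec_f32`, `promotion_rank`, `promote_t` and `add` are hypothetical names invented for this example, and plain arrays stand in for SIMD registers. The only behaviour taken from the text is the three promotion examples (`int32 + int32`, `uint32 + int32`, `uint32 + float32`).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical category base (CRTP): every N-element vector type derives
// from any_vec<N, Derived>, so a single template can accept all of them.
template <std::size_t N, class Derived>
struct any_vec {
    const Derived& self() const { return static_cast<const Derived&>(*this); }
};

// Hypothetical concrete vector types; std::array stands in for a register.
template <std::size_t N>
struct vec_i32 : any_vec<N, vec_i32<N>> {
    using element = std::int32_t;
    std::array<element, N> d{};
};
template <std::size_t N>
struct vec_u32 : any_vec<N, vec_u32<N>> {
    using element = std::uint32_t;
    std::array<element, N> d{};
};
template <std::size_t N>
struct vec_f32 : any_vec<N, vec_f32<N>> {
    using element = float;
    std::array<element, N> d{};
};

// Assumed promotion order, chosen to match the examples in the text:
// signed < unsigned < float. The higher-ranked operand type wins.
template <class V> struct promotion_rank;
template <std::size_t N> struct promotion_rank<vec_i32<N>> : std::integral_constant<int, 0> {};
template <std::size_t N> struct promotion_rank<vec_u32<N>> : std::integral_constant<int, 1> {};
template <std::size_t N> struct promotion_rank<vec_f32<N>> : std::integral_constant<int, 2> {};

template <class A, class B>
using promote_t = typename std::conditional<
    (promotion_rank<A>::value >= promotion_rank<B>::value), A, B>::type;

// One function template covers every pair of same-sized vector types.
// Operands with different element counts fail to deduce N and are rejected.
template <std::size_t N, class A, class B>
promote_t<A, B> add(const any_vec<N, A>& a, const any_vec<N, B>& b) {
    using R = promote_t<A, B>;
    using E = typename R::element;
    R r;
    for (std::size_t i = 0; i < N; ++i)
        r.d[i] = static_cast<E>(a.self().d[i]) + static_cast<E>(b.self().d[i]);
    return r;
}

// Example use: unsigned + float operands produce a float vector.
inline vec_f32<4> promotion_example() {
    vec_u32<4> u; u.d = {{1u, 2u, 3u, 4u}};
    vec_f32<4> f; f.d = {{0.5f, 0.5f, 0.5f, 0.5f}};
    vec_f32<4> r = add(u, f);
    assert(r.d[0] == 1.5f);
    return r;
}
```

Because `add` takes its arguments by category (`any_vec<N, A>`) rather than by concrete type, no overload is needed per type pair; the return type falls out of the promotion table. The real rules are documented in simdpp/types/tag.h and may differ in detail from this ranking.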
Planned changes:
- Visual Studio support
- AVX512 support
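
For readers unfamiliar with the saturated arithmetic behind `sat_add` and `sat_sub` (mentioned among the API breaks above), the following scalar sketch shows the per-element behaviour usually meant by the term: results are clamped to the representable range instead of wrapping around. The function names `scalar_sat_add_i8` and `scalar_sat_sub_u8` are hypothetical and are not part of the library; the library functions operate on whole vectors and their exact signatures are not shown here.

```cpp
#include <cassert>
#include <cstdint>

// Scalar model of saturating addition for signed 8-bit elements:
// compute in a wider type, then clamp to [-128, 127].
inline std::int8_t scalar_sat_add_i8(std::int8_t a, std::int8_t b) {
    int wide = static_cast<int>(a) + static_cast<int>(b);
    if (wide > INT8_MAX) wide = INT8_MAX;
    if (wide < INT8_MIN) wide = INT8_MIN;
    return static_cast<std::int8_t>(wide);
}

// Scalar model of saturating subtraction for unsigned 8-bit elements:
// the result is clamped at zero instead of wrapping to a large value.
inline std::uint8_t scalar_sat_sub_u8(std::uint8_t a, std::uint8_t b) {
    return a > b ? static_cast<std::uint8_t>(a - b) : std::uint8_t{0};
}

// Example use: 100 + 100 does not fit in int8_t, so the result clamps.
inline void saturation_example() {
    assert(scalar_sat_add_i8(100, 100) == 127);
    assert(scalar_sat_sub_u8(5, 10) == 0);
}
```

A SIMD implementation would apply the same rule to every element of the vector in a single instruction where the instruction set provides one.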