File simdlib_avx2.h

namespace faiss

Copyright (c) Facebook, Inc. and its affiliates.

This source code is licensed under the MIT license found in the LICENSE file in the root directory of this source tree.

Implementation of k-means clustering with many variants.

IDSelector is intended to define a subset of vectors to handle (for removal or as subset to search)

PQ4 SIMD packing and accumulation functions

The basic kernel accumulates nq query vectors with bbs = nb * 2 * 16 vectors and produces an output matrix for that. It is worthwhile only for nq * nb <= 4; beyond that, register spilling becomes too costly.
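The arithmetic in the paragraph above can be sketched as follows. The helper names `block_size` and `fits_basic_kernel` are hypothetical and are not part of Faiss; they only restate the two relations bbs = nb * 2 * 16 and nq * nb <= 4.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of database vectors in one block,
// following bbs = nb * 2 * 16 from the text above.
constexpr std::size_t block_size(std::size_t nb) {
    return nb * 2 * 16;
}

// Hypothetical helper: true when the (nq, nb) pair stays within the
// nq * nb <= 4 budget that the text gives for the basic kernel.
constexpr bool fits_basic_kernel(std::size_t nq, std::size_t nb) {
    return nq * nb <= 4;
}

static_assert(block_size(1) == 32, "one sub-block holds 32 vectors");
```

For example, nq = 2 with nb = 2 stays within the budget (bbs = 64), whereas nq = 3 with nb = 2 does not.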

The implementation of these functions is spread over 3 cpp files to reduce parallel compile times. Templates are instantiated explicitly.

This file contains callbacks for kernels that compute distances.

The SIMDResultHandler object is intended to be templated and inlined. Methods:

  • handle(): called when 32 distances are computed and provided in two simd16uint16. (q, b) indicate which entry it is in the block.

  • set_block_origin(): set the sub-matrix that is being computed
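As an illustration of the two methods listed above, here is a minimal scalar sketch of a handler that writes every distance into a dense table. It is not the Faiss implementation: `Lane16` is a hypothetical stand-in for `simd16uint16`, `StoreAllHandler` is an invented name, and the parameter lists, as well as the reading that b counts groups of 32 database vectors within the block, are assumptions.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical scalar stand-in for simd16uint16 (16 lanes of uint16).
using Lane16 = std::array<uint16_t, 16>;

// Hypothetical handler that stores all distances in a row-major nq * nb table.
struct StoreAllHandler {
    std::size_t nq, nb;          // table dimensions
    std::size_t i0 = 0, j0 = 0;  // origin of the sub-matrix being computed
    std::vector<uint16_t> dis;   // nq * nb distances

    StoreAllHandler(std::size_t nq_in, std::size_t nb_in)
            : nq(nq_in), nb(nb_in), dis(nq_in * nb_in, 0) {}

    // Record where the current block sits in the full distance table.
    void set_block_origin(std::size_t i0_in, std::size_t j0_in) {
        i0 = i0_in;
        j0 = j0_in;
    }

    // Receive 32 distances (two groups of 16) for query q of the block and
    // for the b-th group of 32 database vectors of the block (assumed meaning).
    void handle(std::size_t q, std::size_t b, const Lane16& d0, const Lane16& d1) {
        const std::size_t row = i0 + q;
        const std::size_t col = j0 + 32 * b;
        assert(row < nq && col + 32 <= nb);
        uint16_t* out = dis.data() + row * nb + col;
        std::copy(d0.begin(), d0.end(), out);
        std::copy(d1.begin(), d1.end(), out + 16);
    }
};
```

Because the handler is meant to be a template parameter of the kernel, calls to `handle()` can be inlined, which is why it is written as a plain struct rather than behind a virtual interface.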

Throughout the library, vectors are provided as float * pointers. Most algorithms can be optimized when several vectors are processed (added/searched) together in a batch. In this case, they are passed in as a matrix. When n vectors of size d are provided as float * x, component j of vector i is

x[ i * d + j ]

where 0 <= i < n and 0 <= j < d. In other words, matrices are always compact. When specifying the size of the matrix, we call it an n*d matrix, which implies a row-major storage.
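The indexing rule above can be sketched in a few lines. The helpers `component` and `make_example_matrix` are hypothetical names used only for illustration; they are not Faiss functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: component j of vector i in a compact,
// row-major n*d matrix stored at x.
inline float component(const float* x, std::size_t d, std::size_t i, std::size_t j) {
    assert(j < d);
    return x[i * d + j];
}

// Hypothetical helper: build an n*d matrix whose entry (i, j) is 10*i + j,
// so that the layout is easy to check by eye.
inline std::vector<float> make_example_matrix(std::size_t n, std::size_t d) {
    std::vector<float> x(n * d);
    for (std::size_t i = 0; i < n; i++) {
        for (std::size_t j = 0; j < d; j++) {
            x[i * d + j] = static_cast<float>(10 * i + j);
        }
    }
    return x;
}
```

With n = 3 and d = 4, vector 2 starts at offset 2 * 4 = 8, so `component(x.data(), 4, 2, 1)` reads `x[9]`.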

I/O functions can read/write to a filename, a file handle or to an object that abstracts the medium.

The read functions return objects that should be deallocated with delete. All references within these objects are owned by the object.

Definition of inverted lists + a few common classes that implement the interface.

Since IVF (inverted file) indexes are of so much use for large-scale use cases, we group a few functions related to them in this small library. Most functions work both on IndexIVFs and IndexIVFs embedded within an IndexPreTransform.

In this file are the implementations of extra metrics beyond L2 and inner product

Defines a few objects that apply transformations to a set of vectors. Often these are pre-processing steps.

Functions

inline simd16uint16 min(simd16uint16 a, simd16uint16 b)
inline simd16uint16 max(simd16uint16 a, simd16uint16 b)
inline simd16uint16 combine2x2(simd16uint16 a, simd16uint16 b)
inline uint32_t cmp_ge32(simd16uint16 d0, simd16uint16 d1, simd16uint16 thr)
inline uint32_t cmp_le32(simd16uint16 d0, simd16uint16 d1, simd16uint16 thr)
inline simd16uint16 hadd(const simd16uint16 &a, const simd16uint16 &b)
inline void cmplt_min_max_fast(const simd16uint16 candidateValues, const simd16uint16 candidateIndices, const simd16uint16 currentValues, const simd16uint16 currentIndices, simd16uint16 &minValues, simd16uint16 &minIndices, simd16uint16 &maxValues, simd16uint16 &maxIndices)
inline simd32uint8 uint16_to_uint8_saturate(simd16uint16 a, simd16uint16 b)
inline uint32_t get_MSBs(simd32uint8 a)

get most significant bit of each byte

inline simd32uint8 blendv(simd32uint8 a, simd32uint8 b, simd32uint8 mask)

use MSB of each byte of mask to select a byte between a and b
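The two byte-wise operations described above can be emulated in scalar code. The sketch below is not the Faiss implementation: `Bytes32` is a hypothetical stand-in for `simd32uint8`, and the bit ordering (bit k of the result comes from byte k) and the selection direction (b is taken where the mask MSB is set) are assumed to follow the AVX2 `_mm256_movemask_epi8` and `_mm256_blendv_epi8` intrinsics.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar stand-in for simd32uint8 (32 lanes of uint8).
using Bytes32 = std::array<uint8_t, 32>;

// Gather the most significant bit of each byte; bit k of the result
// is the MSB of byte k (assumed ordering, as in _mm256_movemask_epi8).
inline uint32_t get_msbs_scalar(const Bytes32& a) {
    uint32_t out = 0;
    for (std::size_t k = 0; k < 32; k++) {
        out |= static_cast<uint32_t>(a[k] >> 7) << k;
    }
    return out;
}

// Per byte: take b[k] when the MSB of mask[k] is set, otherwise a[k]
// (assumed direction, as in _mm256_blendv_epi8).
inline Bytes32 blendv_scalar(const Bytes32& a, const Bytes32& b, const Bytes32& mask) {
    Bytes32 out{};
    for (std::size_t k = 0; k < 32; k++) {
        out[k] = (mask[k] & 0x80) ? b[k] : a[k];
    }
    return out;
}
```

Only the top bit of each mask byte matters: a mask byte of 0x7F selects from a, while 0x80 or 0xFF selects from b.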

inline void cmplt_min_max_fast(const simd8uint32 candidateValues, const simd8uint32 candidateIndices, const simd8uint32 currentValues, const simd8uint32 currentIndices, simd8uint32 &minValues, simd8uint32 &minIndices, simd8uint32 &maxValues, simd8uint32 &maxIndices)
inline simd8float32 hadd(simd8float32 a, simd8float32 b)
inline simd8float32 unpacklo(simd8float32 a, simd8float32 b)
inline simd8float32 unpackhi(simd8float32 a, simd8float32 b)
inline simd8float32 fmadd(simd8float32 a, simd8float32 b, simd8float32 c)
inline void cmplt_and_blend_inplace(const simd8float32 candidateValues, const simd8uint32 candidateIndices, simd8float32 &lowestValues, simd8uint32 &lowestIndices)
inline void cmplt_min_max_fast(const simd8float32 candidateValues, const simd8uint32 candidateIndices, const simd8float32 currentValues, const simd8uint32 currentIndices, simd8float32 &minValues, simd8uint32 &minIndices, simd8float32 &maxValues, simd8uint32 &maxIndices)
struct simd256bit
#include <simdlib_avx2.h>

256-bit representation without interpretation as a vector

Simple wrapper around the AVX 256-bit registers

The objective is to separate the different interpretations of the same registers (as a vector of uint8, uint16 or uint32), to provide printing functions, and to give more readable names to the AVX intrinsics. It does not pretend to be exhaustive; functions are added as needed.
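The idea of one set of bits with several typed interpretations can be sketched without AVX at all. The following portable emulation is an assumption-laden illustration, not the Faiss code: `Bits256`, `U16x16` and `U32x8` are hypothetical names, and the raw storage is a byte array rather than a `__m256i` register.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a 256-bit register: 32 uninterpreted bytes.
struct Bits256 {
    unsigned char raw[32];

    Bits256() { std::memset(raw, 0, sizeof(raw)); }
    explicit Bits256(const void* p) { std::memcpy(raw, p, sizeof(raw)); }

    // Unaligned store of all 32 bytes.
    void storeu(void* p) const { std::memcpy(p, raw, sizeof(raw)); }

    // Bitwise equality, independent of how the bits are interpreted.
    bool is_same_as(const Bits256& o) const {
        return std::memcmp(raw, o.raw, sizeof(raw)) == 0;
    }
};

// The same bits viewed as 16 unsigned 16-bit elements.
struct U16x16 : Bits256 {
    U16x16() = default;
    explicit U16x16(const Bits256& b) : Bits256(b) {}
    uint16_t operator[](int i) const {
        assert(i >= 0 && i < 16);
        uint16_t v;
        std::memcpy(&v, raw + 2 * i, sizeof(v));
        return v;
    }
};

// The same bits viewed as 8 unsigned 32-bit elements.
struct U32x8 : Bits256 {
    U32x8() = default;
    explicit U32x8(const Bits256& b) : Bits256(b) {}
    uint32_t operator[](int i) const {
        assert(i >= 0 && i < 8);
        uint32_t v;
        std::memcpy(&v, raw + 4 * i, sizeof(v));
        return v;
    }
};
```

Making the converting constructors explicit means a change of interpretation is always visible at the call site, which is the point of keeping the typed views separate.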

Subclassed by faiss::simd16uint16, faiss::simd32uint8, faiss::simd8float32, faiss::simd8uint32

Public Functions

inline simd256bit()
inline explicit simd256bit(__m256i i)
inline explicit simd256bit(__m256 f)
inline explicit simd256bit(const void *x)
inline void clear()
inline void storeu(void *ptr) const
inline void loadu(const void *ptr)
inline void store(void *ptr) const
inline void bin(char bits[257]) const
inline std::string bin() const
inline bool is_same_as(simd256bit other) const

Public Members

__m256i i
__m256 f
union faiss::simd256bit::[anonymous] [anonymous]
struct simd16uint16 : public faiss::simd256bit
#include <simdlib_avx2.h>

vector of 16 elements in uint16

Public Functions

inline simd16uint16()
inline explicit simd16uint16(__m256i i)
inline explicit simd16uint16(int x)
inline explicit simd16uint16(uint16_t x)
inline explicit simd16uint16(simd256bit x)
inline explicit simd16uint16(const uint16_t *x)
inline explicit simd16uint16(uint16_t u0, uint16_t u1, uint16_t u2, uint16_t u3, uint16_t u4, uint16_t u5, uint16_t u6, uint16_t u7, uint16_t u8, uint16_t u9, uint16_t u10, uint16_t u11, uint16_t u12, uint16_t u13, uint16_t u14, uint16_t u15)
inline std::string elements_to_string(const char *fmt) const
inline std::string hex() const
inline std::string dec() const
inline void set1(uint16_t x)
inline simd16uint16 operator*(const simd16uint16 &other) const
inline simd16uint16 operator>>(const int shift) const
inline simd16uint16 operator<<(const int shift) const
inline simd16uint16 operator+=(simd16uint16 other)
inline simd16uint16 operator-=(simd16uint16 other)
inline simd16uint16 operator+(simd16uint16 other) const
inline simd16uint16 operator-(simd16uint16 other) const
inline simd16uint16 operator&(simd256bit other) const
inline simd16uint16 operator|(simd256bit other) const
inline simd16uint16 operator^(simd256bit other) const
inline simd16uint16 operator~() const
inline uint16_t get_scalar_0() const
inline uint32_t ge_mask(simd16uint16 thresh) const
inline uint32_t le_mask(simd16uint16 thresh) const
inline uint32_t gt_mask(simd16uint16 thresh) const
inline bool all_gt(simd16uint16 thresh) const
inline uint16_t operator[](int i) const
inline void accu_min(simd16uint16 incoming)
inline void accu_max(simd16uint16 incoming)
inline void storeu(void *ptr) const
inline void loadu(const void *ptr)
inline void store(void *ptr) const
inline bool is_same_as(simd256bit other) const

Public Members

__m256i i
__m256 f
union faiss::simd256bit::[anonymous] [anonymous]

Friends

inline friend simd16uint16 operator==(const simd256bit lhs, const simd256bit rhs)
struct simd32uint8 : public faiss::simd256bit

Public Functions

inline simd32uint8()
inline explicit simd32uint8(__m256i i)
inline explicit simd32uint8(int x)
inline explicit simd32uint8(uint8_t x)
inline explicit simd32uint8(simd256bit x)
inline explicit simd32uint8(const uint8_t *x)
inline std::string elements_to_string(const char *fmt) const
inline std::string hex() const
inline std::string dec() const
inline void set1(uint8_t x)
inline simd32uint8 operator&(simd256bit other) const
inline simd32uint8 operator+(simd32uint8 other) const
inline simd32uint8 lookup_2_lanes(simd32uint8 idx) const
inline simd16uint16 lane0_as_uint16() const
inline simd16uint16 lane1_as_uint16() const
inline simd32uint8 operator+=(simd32uint8 other)
inline uint8_t operator[](int i) const
inline void storeu(void *ptr) const
inline void loadu(const void *ptr)
inline void store(void *ptr) const
inline bool is_same_as(simd256bit other) const

Public Members

__m256i i
__m256 f
union faiss::simd256bit::[anonymous] [anonymous]

Public Static Functions

template<uint8_t _0, uint8_t _1, uint8_t _2, uint8_t _3, uint8_t _4, uint8_t _5, uint8_t _6, uint8_t _7, uint8_t _8, uint8_t _9, uint8_t _10, uint8_t _11, uint8_t _12, uint8_t _13, uint8_t _14, uint8_t _15, uint8_t _16, uint8_t _17, uint8_t _18, uint8_t _19, uint8_t _20, uint8_t _21, uint8_t _22, uint8_t _23, uint8_t _24, uint8_t _25, uint8_t _26, uint8_t _27, uint8_t _28, uint8_t _29, uint8_t _30, uint8_t _31>
static inline simd32uint8 create()
struct simd8uint32 : public faiss::simd256bit
#include <simdlib_avx2.h>

vector of 8 unsigned 32-bit integers

Public Functions

inline simd8uint32()
inline explicit simd8uint32(__m256i i)
inline explicit simd8uint32(uint32_t x)
inline explicit simd8uint32(simd256bit x)
inline explicit simd8uint32(const uint8_t *x)
inline explicit simd8uint32(uint32_t u0, uint32_t u1, uint32_t u2, uint32_t u3, uint32_t u4, uint32_t u5, uint32_t u6, uint32_t u7)
inline simd8uint32 operator+(simd8uint32 other) const
inline simd8uint32 operator-(simd8uint32 other) const
inline simd8uint32 &operator+=(const simd8uint32 &other)
inline bool operator==(simd8uint32 other) const
inline bool operator!=(simd8uint32 other) const
inline std::string elements_to_string(const char *fmt) const
inline std::string hex() const
inline std::string dec() const
inline void set1(uint32_t x)
inline simd8uint32 unzip() const
inline void storeu(void *ptr) const
inline void loadu(const void *ptr)
inline void store(void *ptr) const
inline bool is_same_as(simd256bit other) const

Public Members

__m256i i
__m256 f
union faiss::simd256bit::[anonymous] [anonymous]
struct simd8float32 : public faiss::simd256bit

Public Functions

inline simd8float32()
inline explicit simd8float32(simd256bit x)
inline explicit simd8float32(__m256 x)
inline explicit simd8float32(float x)
inline explicit simd8float32(const float *x)
inline explicit simd8float32(float f0, float f1, float f2, float f3, float f4, float f5, float f6, float f7)
inline simd8float32 operator*(simd8float32 other) const
inline simd8float32 operator+(simd8float32 other) const
inline simd8float32 operator-(simd8float32 other) const
inline simd8float32 &operator+=(const simd8float32 &other)
inline bool operator==(simd8float32 other) const
inline bool operator!=(simd8float32 other) const
inline std::string tostring() const
inline void storeu(void *ptr) const
inline void loadu(const void *ptr)
inline void store(void *ptr) const
inline bool is_same_as(simd256bit other) const

Public Members

__m256i i
__m256 f
union faiss::simd256bit::[anonymous] [anonymous]