File distances.h

namespace faiss

Copyright (c) Facebook, Inc. and its affiliates.

This source code is licensed under the MIT license found in the LICENSE file in the root directory of this source tree.

Throughout the library, vectors are provided as float * pointers. Most algorithms can be optimized when several vectors are processed (added/searched) together in a batch. In this case, they are passed in as a matrix. When n vectors of size d are provided as float * x, component j of vector i is

x[ i * d + j ]

where 0 <= i < n and 0 <= j < d. In other words, matrices are always compact. When specifying the size of the matrix, we call it an n*d matrix, which implies a row-major storage.

Copyright (c) Facebook, Inc. and its affiliates.

This source code is licensed under the MIT license found in the LICENSE file in the root directory of this source tree. I/O functions can read/write to a filename, a file handle or to an object that abstracts the medium.

The read functions return objects that should be deallocated with delete. All references within these objectes are owned by the object.

Copyright (c) Facebook, Inc. and its affiliates.

This source code is licensed under the MIT license found in the LICENSE file in the root directory of this source tree. Definition of inverted lists + a few common classes that implement the interface.

Copyright (c) Facebook, Inc. and its affiliates.

This source code is licensed under the MIT license found in the LICENSE file in the root directory of this source tree. Since IVF (inverted file) indexes are of so much use for large-scale use cases, we group a few functions related to them in this small library. Most functions work both on IndexIVFs and IndexIVFs embedded within an IndexPreTransform.

Copyright (c) Facebook, Inc. and its affiliates.

This source code is licensed under the MIT license found in the LICENSE file in the root directory of this source tree. In this file are the implementations of extra metrics beyond L2 and inner product

Copyright (c) Facebook, Inc. and its affiliates.

This source code is licensed under the MIT license found in the LICENSE file in the root directory of this source tree. Defines a few objects that apply transformations to a set of vectors Often these are pre-processing steps.

Functions

float fvec_L2sqr(const float *x, const float *y, size_t d)

Squared L2 distance between two vectors.

float fvec_inner_product(const float *x, const float *y, size_t d)

inner product

float fvec_L1(const float *x, const float *y, size_t d)

L1 distance.

float fvec_Linf(const float *x, const float *y, size_t d)

infinity distance

void pairwise_L2sqr(int64_t d, int64_t nq, const float *xq, int64_t nb, const float *xb, float *dis, int64_t ldq = -1, int64_t ldb = -1, int64_t ldd = -1)

Compute pairwise distances between sets of vectors

Parameters
  • d – dimension of the vectors

  • nq – nb of query vectors

  • nb – nb of database vectors

  • xq – query vectors (size nq * d)

  • xb – database vectors (size nb * d)

  • dis – output distances (size nq * nb)

  • ldq, ldb, ldd – strides for the matrices

void fvec_inner_products_ny(float *ip, const float *x, const float *y, size_t d, size_t ny)
void fvec_L2sqr_ny(float *dis, const float *x, const float *y, size_t d, size_t ny)
float fvec_norm_L2sqr(const float *x, size_t d)

squared norm of a vector

void fvec_norms_L2(float *norms, const float *x, size_t d, size_t nx)

compute the L2 norms for a set of vectors

Parameters
  • norms – output norms, size nx

  • x – set of vectors, size nx * d

void fvec_norms_L2sqr(float *norms, const float *x, size_t d, size_t nx)

same as fvec_norms_L2, but computes squared norms

void fvec_renorm_L2(size_t d, size_t nx, float *x)
void inner_product_to_L2sqr(float *dis, const float *nr1, const float *nr2, size_t n1, size_t n2)
void fvec_add(size_t d, const float *a, const float *b, float *c)

compute c := a + b for vectors

c and a can overlap, c and b can overlap

Parameters
  • a – size d

  • b – size d

  • c – size d

void fvec_add(size_t d, const float *a, float b, float *c)

compute c := a + b for a, c vectors and b a scalar

c and a can overlap

Parameters
  • a – size d

  • c – size d

void fvec_sub(size_t d, const float *a, const float *b, float *c)

compute c := a - b for vectors

c and a can overlap, c and b can overlap

Parameters
  • a – size d

  • b – size d

  • c – size d

void fvec_inner_products_by_idx(float *ip, const float *x, const float *y, const int64_t *ids, size_t d, size_t nx, size_t ny)
void fvec_L2sqr_by_idx(float *dis, const float *x, const float *y, const int64_t *ids, size_t d, size_t nx, size_t ny)
void pairwise_indexed_L2sqr(size_t d, size_t n, const float *x, const int64_t *ix, const float *y, const int64_t *iy, float *dis)

compute dis[j] = L2sqr(x[ix[j]], y[iy[j]]) forall j=0..n-1

Parameters
  • x – size (max(ix) + 1, d)

  • y – size (max(iy) + 1, d)

  • ix – size n

  • iy – size n

  • dis – size n

void pairwise_indexed_inner_product(size_t d, size_t n, const float *x, const int64_t *ix, const float *y, const int64_t *iy, float *dis)
void knn_inner_product(const float *x, const float *y, size_t d, size_t nx, size_t ny, float_minheap_array_t *res)

Return the k nearest neighors of each of the nx vectors x among the ny vector y, w.r.t to max inner product

Parameters
  • x – query vectors, size nx * d

  • y – database vectors, size ny * d

  • res – result array, which also provides k. Sorted on output

void knn_L2sqr(const float *x, const float *y, size_t d, size_t nx, size_t ny, float_maxheap_array_t *res, const float *y_norm2 = nullptr)

Same as knn_inner_product, for the L2 distance

Parameters

y_norm2 – norms for the y vectors (nullptr or size ny)

void knn_inner_products_by_idx(const float *x, const float *y, const int64_t *ids, size_t d, size_t nx, size_t ny, float_minheap_array_t *res)
void knn_L2sqr_by_idx(const float *x, const float *y, const int64_t *ids, size_t d, size_t nx, size_t ny, float_maxheap_array_t *res)
void range_search_L2sqr(const float *x, const float *y, size_t d, size_t nx, size_t ny, float radius, RangeSearchResult *result)

Return the k nearest neighors of each of the nx vectors x among the ny vector y, w.r.t to max inner product

Parameters
  • x – query vectors, size nx * d

  • y – database vectors, size ny * d

  • radius – search radius around the x vectors

  • result – result structure

void range_search_inner_product(const float *x, const float *y, size_t d, size_t nx, size_t ny, float radius, RangeSearchResult *result)

same as range_search_L2sqr for the inner product similarity

void compute_PQ_dis_tables_dsub2(size_t d, size_t ksub, const float *centroids, size_t nx, const float *x, bool is_inner_product, float *dis_tables)

specialized function for PQ2

Variables

FAISS_API int distance_compute_blas_threshold
FAISS_API int distance_compute_blas_query_bs
FAISS_API int distance_compute_blas_database_bs
FAISS_API int distance_compute_min_k_reservoir