Struct faiss::IndexBinaryHash

struct IndexBinaryHash : public faiss::IndexBinary

just uses the b first bits as a hash value

Public Types

using InvertedListMap = std::unordered_map<idx_t, InvertedList>
using component_t = uint8_t
using distance_t = int32_t

Public Functions

IndexBinaryHash(int d, int b)
IndexBinaryHash()
virtual void reset() override

Removes all elements from the database.

virtual void add(idx_t n, const uint8_t *x) override

Add n vectors of dimension d to the index.

Vectors are implicitly assigned labels ntotal .. ntotal + n - 1

Parameters:

x – input matrix, size n * d / 8

virtual void add_with_ids(idx_t n, const uint8_t *x, const idx_t *xids) override

Same as add, but stores xids instead of sequential ids.

The default implementation fails with an assertion, as it is not supported by all indexes.

Parameters:

xids – if non-null, ids to store for the vectors (size n)

virtual void range_search(idx_t n, const uint8_t *x, int radius, RangeSearchResult *result, const SearchParameters *params = nullptr) const override

Query n vectors of dimension d to the index.

return all vectors with distance < radius. Note that many indexes do not implement the range_search (only the k-NN search is mandatory). The distances are converted to float to reuse the RangeSearchResult structure, but they are integer. By convention, only distances < radius (strict comparison) are returned, ie. radius = 0 does not return any result and 1 returns only exact same vectors.

Parameters:
  • x – input vectors to search, size n * d / 8

  • radius – search radius

  • result – result table

virtual void search(idx_t n, const uint8_t *x, idx_t k, int32_t *distances, idx_t *labels, const SearchParameters *params = nullptr) const override

Query n vectors of dimension d to the index.

return at most k vectors. If there are not enough results for a query, the result array is padded with -1s.

Parameters:
  • x – input vectors to search, size n * d / 8

  • labels – output labels of the NNs, size n*k

  • distances – output pairwise distances, size n*k

void display() const
size_t hashtable_size() const
virtual void train(idx_t n, const uint8_t *x)

Perform training on a representative set of vectors.

Parameters:
  • n – nb of training vectors

  • x – training vecors, size n * d / 8

void assign(idx_t n, const uint8_t *x, idx_t *labels, idx_t k = 1) const

Return the indexes of the k vectors closest to the query x.

This function is identical to search but only returns labels of neighbors.

Parameters:
  • x – input vectors to search, size n * d / 8

  • labels – output labels of the NNs, size n*k

virtual size_t remove_ids(const IDSelector &sel)

Removes IDs from the index. Not supported by all indexes.

virtual void reconstruct(idx_t key, uint8_t *recons) const

Reconstruct a stored vector.

This function may not be defined for some indexes.

Parameters:
  • key – id of the vector to reconstruct

  • recons – reconstucted vector (size d / 8)

virtual void reconstruct_n(idx_t i0, idx_t ni, uint8_t *recons) const

Reconstruct vectors i0 to i0 + ni - 1.

This function may not be defined for some indexes.

Parameters:

recons – reconstucted vectors (size ni * d / 8)

virtual void search_and_reconstruct(idx_t n, const uint8_t *x, idx_t k, int32_t *distances, idx_t *labels, uint8_t *recons, const SearchParameters *params = nullptr) const

Similar to search, but also reconstructs the stored vectors (or an approximation in the case of lossy coding) for the search results.

If there are not enough results for a query, the resulting array is padded with -1s.

Parameters:

recons – reconstructed vectors size (n, k, d)

virtual void merge_from(IndexBinary &otherIndex, idx_t add_id = 0)

moves the entries from another dataset to self. On output, other is empty. add_id is added to all moved ids (for sequential ids, this would be this->ntotal)

virtual void check_compatible_for_merge(const IndexBinary &otherIndex) const

check that the two indexes are compatible (ie, they are trained in the same way and have the same parameters). Otherwise throw.

Public Members

InvertedListMap invlists
int b
int nflip
int d = 0

vector dimension

int code_size = 0

number of bytes per vector ( = d / 8 )

idx_t ntotal = 0

total nb of indexed vectors

bool verbose = false

verbosity level

bool is_trained = true

set if the Index does not require training, or if training is done already

MetricType metric_type = METRIC_L2

type of metric this index uses for search

struct InvertedList

Public Functions

void add(idx_t id, size_t code_size, const uint8_t *code)

Public Members

std::vector<idx_t> ids
std::vector<uint8_t> vecs