Struct faiss::ProgressiveDimClustering

struct ProgressiveDimClustering : public faiss::ProgressiveDimClusteringParameters

K-means clustering that uses progressively increasing dimensions.

The clustering first happens in dim 1, then with exponentially increasing dimension until d (I steps). This is typically applied after a PCA transformation (optional). Reference:

“Improved Residual Vector Quantization for High-dimensional Approximate Nearest Neighbor Search”

Shicong Liu, Hongtao Lu, Junru Shao, AAAI’15

https://arxiv.org/abs/1509.05195
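The dimension schedule described above can be sketched as follows. This is only an illustration under stated assumptions: `progressive_dims` is a hypothetical helper, not a Faiss function, and the geometric formula d^(i/steps) is one plausible reading of "exponentially increasing dimension"; the library's exact rounding and number of steps may differ.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper (not part of Faiss): returns a geometric schedule of
// working dimensions that starts at 1 and ends at d after `steps` increases.
std::vector<int> progressive_dims(int d, int steps) {
    assert(d >= 1 && steps >= 1);
    std::vector<int> dims;
    for (int i = 0; i <= steps; i++) {
        // Assumed schedule: d^(i / steps), rounded to the nearest integer.
        double exact = std::pow(static_cast<double>(d),
                                static_cast<double>(i) / steps);
        int di = static_cast<int>(std::lround(exact));
        // Clamp so that rounding never leaves the valid range [1, d].
        if (di < 1) di = 1;
        if (di > d) di = d;
        dims.push_back(di);
    }
    return dims;
}
```

With `d = 64` and `steps = 3`, this sketch yields the dimensions 1, 4, 16, 64. At each stage the clustering would run on the first `di` coordinates only, which is why a PCA rotation beforehand (putting most of the variance in the leading coordinates) is helpful.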

Public Functions

ProgressiveDimClustering(int d, int k)
ProgressiveDimClustering(int d, int k, const ProgressiveDimClusteringParameters &cp)
void train(idx_t n, const float *x, ProgressiveDimIndexFactory &factory)
inline virtual ~ProgressiveDimClustering()

Public Members

size_t d

dimension of the vectors

size_t k

number of centroids

std::vector<float> centroids

centroids (k * d floats)

std::vector<ClusteringIterationStats> iteration_stats

stats at every iteration of clustering

int progressive_dim_steps

number of incremental steps

bool apply_pca

apply PCA on input

int niter

number of clustering iterations

int nredo

redo clustering this many times and keep best

bool verbose
bool spherical

do we want normalized centroids?

bool int_centroids

round centroid coordinates to integers

bool update_index

re-train index after each iteration?

bool frozen_centroids

use the centroids provided as input and do not change them during iterations

int min_points_per_centroid

minimum number of training points per centroid; otherwise you get a warning

int max_points_per_centroid

maximum number of training points per centroid, to limit the size of the dataset
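The interplay of the two per-centroid bounds can be sketched as below. This is an assumption-laden illustration: `check_training_set` and `TrainingSetCheck` are hypothetical names, and the logic (warn below `k * min_points_per_centroid`, subsample above `k * max_points_per_centroid`) reflects how these parameters are commonly understood to behave in Faiss k-means rather than a verbatim copy of the library code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical result type (not part of Faiss).
struct TrainingSetCheck {
    size_t n_used;        // number of points that would be used for training
    bool too_few_points;  // true if a "too few points" warning is expected
};

// Hypothetical helper: applies the assumed per-centroid bounds to a
// training set of n points clustered into k centroids.
TrainingSetCheck check_training_set(size_t n, size_t k,
                                    int min_points_per_centroid,
                                    int max_points_per_centroid) {
    assert(k >= 1);
    assert(min_points_per_centroid >= 0 && max_points_per_centroid >= 1);

    size_t lower = k * static_cast<size_t>(min_points_per_centroid);
    size_t upper = k * static_cast<size_t>(max_points_per_centroid);

    TrainingSetCheck res;
    // Above the upper bound, the training set is assumed to be subsampled.
    res.n_used = n > upper ? upper : n;
    // Below the lower bound, training proceeds but a warning is assumed.
    res.too_few_points = n < lower;
    return res;
}
```

For example, with `k = 10`, a minimum of 39 and a maximum of 256 points per centroid, a set of 100000 points would be cut down to 2560, while a set of 100 points would be used in full but flagged as too small.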

int seed

seed for the random number generator

size_t decode_block_size

how many vectors at a time to decode