Struct faiss::ProgressiveDimClustering
-
struct ProgressiveDimClustering : public faiss::ProgressiveDimClusteringParameters
K-means clustering with progressively increasing dimensions.
The clustering first happens in dimension 1, then continues with exponentially increasing dimension until d is reached (progressive_dim_steps steps in total). It is typically applied after an optional PCA transformation. Reference:
“Improved Residual Vector Quantization for High-dimensional Approximate Nearest Neighbor Search”
Shicong Liu, Hongtao Lu, Junru Shao, AAAI’15
https://arxiv.org/abs/1509.05195
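The progressive schedule described above can be illustrated with a small sketch. The helper `progressive_dims` below is hypothetical and not part of faiss; it assumes a geometric schedule in which step i works in roughly d^((i+1)/steps) dimensions, so the first stage uses a small dimension (not necessarily exactly 1). The exact formula used by the library is not stated on this page, so treat this as one plausible reading.

```cpp
#include <cmath>
#include <vector>

// Hypothetical helper (not a faiss function): returns the working dimension
// for each of `steps` progressive stages, growing geometrically up to d.
std::vector<int> progressive_dims(int d, int steps) {
    std::vector<int> dims;
    for (int i = 0; i < steps; i++) {
        // Assumed schedule: d^((i + 1) / steps), truncated to an integer.
        int di = static_cast<int>(std::pow(static_cast<double>(d), (1.0 + i) / steps));
        if (di < 1) {
            di = 1; // never go below one dimension
        }
        if (i == steps - 1) {
            di = d; // the final stage always uses all d dimensions
        }
        dims.push_back(di);
    }
    return dims;
}
```

Under these assumptions, `progressive_dims(64, 4)` starts with a very low dimension and ends at 64, so most of the early iterations run on short vectors and only the last stage pays the full cost.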
Public Functions
-
ProgressiveDimClustering(int d, int k)
-
ProgressiveDimClustering(int d, int k, const ProgressiveDimClusteringParameters &cp)
-
void train(idx_t n, const float *x, ProgressiveDimIndexFactory &factory)
-
inline virtual ~ProgressiveDimClustering()
Public Members
-
size_t d
dimension of the vectors
-
size_t k
number of centroids
-
std::vector<float> centroids
centroids, stored as a flat array of size k * d
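As a minimal sketch of how a k * d buffer like this is usually read, the hypothetical helper below (not part of faiss) returns a pointer to centroid i, assuming the centroids are stored contiguously one after another (row-major).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: pointer to the first coordinate of centroid i,
// assuming a contiguous row-major layout of k vectors of dimension d.
const float* centroid_ptr(const std::vector<float>& centroids, size_t d, size_t i) {
    return centroids.data() + i * d;
}
```

With a trained object `clus`, a call such as `centroid_ptr(clus.centroids, clus.d, i)` would then give the d coordinates of centroid i, valid for i < clus.k.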
-
std::vector<ClusteringIterationStats> iteration_stats
stats at every iteration of clustering
-
int progressive_dim_steps
number of incremental steps
-
bool apply_pca
apply PCA on input
-
int niter
number of clustering iterations
-
int nredo
redo clustering this many times and keep the best result
-
bool verbose
-
bool spherical
do we want normalized centroids?
-
bool int_centroids
round centroid coordinates to integers
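To make the spherical and int_centroids flags concrete, here is a rough sketch of the two post-processing operations they suggest. Both helpers are hypothetical and not faiss functions; the sketch assumes "normalized" means unit L2 norm and that rounding is to the nearest integer, neither of which is spelled out on this page.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: scale each d-dimensional centroid to unit L2 norm
// (assumed meaning of `spherical`). All-zero rows are left unchanged.
void normalize_rows(std::vector<float>& c, size_t d) {
    for (size_t i = 0; i + d <= c.size(); i += d) {
        float norm2 = 0.f;
        for (size_t j = 0; j < d; j++) {
            norm2 += c[i + j] * c[i + j];
        }
        float norm = std::sqrt(norm2);
        if (norm > 0.f) {
            for (size_t j = 0; j < d; j++) {
                c[i + j] /= norm;
            }
        }
    }
}

// Hypothetical helper: round every coordinate to the nearest integer value
// (assumed meaning of `int_centroids`).
void round_coords(std::vector<float>& c) {
    for (float& v : c) {
        v = std::round(v);
    }
}
```

The two operations are shown separately because they serve different purposes: normalization suits cosine-style comparisons, while integer rounding suits data that is integer-valued to begin with.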
-
bool update_index
re-train index after each iteration?
-
bool frozen_centroids
use the centroids provided as input and do not change them during iterations
-
int min_points_per_centroid
minimum number of training points per centroid; with fewer points you get a warning
-
int max_points_per_centroid
maximum number of training points per centroid, used to limit the size of the dataset
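The effect of the two per-centroid bounds can be sketched as follows. `TrainingPlan` and `plan_training` are hypothetical names introduced for illustration; the sketch assumes the bounds are applied as k * min_points_per_centroid and k * max_points_per_centroid against the number of training vectors n, which is an interpretation rather than something this page states.

```cpp
#include <cstddef>

// Hypothetical result type: how many points would be used, and whether
// the training set is small enough that a warning would be expected.
struct TrainingPlan {
    size_t n_used;
    bool too_few;
};

// Hypothetical helper: apply the assumed per-centroid bounds to n points.
TrainingPlan plan_training(size_t n, size_t k, int min_ppc, int max_ppc) {
    TrainingPlan p;
    // Fewer than k * min_ppc points: clustering proceeds, with a warning.
    p.too_few = n < k * static_cast<size_t>(min_ppc);
    // More than k * max_ppc points: only that many are kept (assumed cap).
    size_t cap = k * static_cast<size_t>(max_ppc);
    p.n_used = n > cap ? cap : n;
    return p;
}
```

For example, with k = 100 and bounds of 39 and 256 (illustrative values), one million input vectors would be cut down to 25600, while 1000 vectors would all be used but flagged as too few.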
-
int seed
seed for the random number generator
-
size_t decode_block_size
how many vectors at a time to decode
-