Struct faiss::Clustering1D
-
struct Clustering1D : public faiss::Clustering
Exact 1D clustering algorithm
Since it does not use an index, it does not overload the train() function
Public Functions
-
explicit Clustering1D(int k)
-
Clustering1D(int k, const ClusteringParameters &cp)
-
void train_exact(idx_t n, const float *x)
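For intuition, exact 1D k-means can be sketched with a plain dynamic program over the sorted points, because an optimal 1D clustering always consists of contiguous ranges of the sorted input. The function below is a hypothetical, self-contained illustration (the name exact_kmeans_1d is an assumption, and this is not the algorithm or code used by train_exact); it runs in O(k * n^2) time.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper, not part of faiss: exact 1D k-means by dynamic
// programming. Returns the k centroids in increasing order.
std::vector<float> exact_kmeans_1d(std::vector<float> x, int k) {
    const std::size_t n = x.size();
    assert(k > 0 && n >= static_cast<std::size_t>(k));
    const std::size_t kk = static_cast<std::size_t>(k);
    std::sort(x.begin(), x.end());

    // Prefix sums of x and x^2, so any segment cost is O(1).
    std::vector<double> s1(n + 1, 0.0), s2(n + 1, 0.0);
    for (std::size_t i = 0; i < n; i++) {
        s1[i + 1] = s1[i] + x[i];
        s2[i + 1] = s2[i] + double(x[i]) * x[i];
    }
    // Sum of squared distances to the mean for the sorted range [i, j).
    auto cost = [&](std::size_t i, std::size_t j) {
        const double s = s1[j] - s1[i];
        return (s2[j] - s2[i]) - s * s / double(j - i);
    };

    const double inf = std::numeric_limits<double>::infinity();
    // best[c][j]: minimal cost of splitting the first j points into c clusters.
    // split[c][j]: start of the last cluster in that optimal split.
    std::vector<std::vector<double>> best(kk + 1, std::vector<double>(n + 1, inf));
    std::vector<std::vector<std::size_t>> split(kk + 1, std::vector<std::size_t>(n + 1, 0));
    best[0][0] = 0.0;
    for (std::size_t c = 1; c <= kk; c++) {
        for (std::size_t j = c; j <= n; j++) {
            for (std::size_t i = c - 1; i < j; i++) {
                const double v = best[c - 1][i] + cost(i, j);
                if (v < best[c][j]) {
                    best[c][j] = v;
                    split[c][j] = i;
                }
            }
        }
    }

    // Walk the split points backwards; each centroid is its segment mean.
    std::vector<float> centroids(kk);
    std::size_t j = n;
    for (std::size_t c = kk; c >= 1; c--) {
        const std::size_t i = split[c][j];
        centroids[c - 1] = float((s1[j] - s1[i]) / double(j - i));
        j = i;
    }
    return centroids;
}
```

Unlike iterative k-means, this kind of procedure needs no initialization and no assignment index, which is consistent with train_exact taking only n and x.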
-
inline virtual ~Clustering1D()
-
virtual void train(idx_t n, const float *x, faiss::Index &index, const float *x_weights = nullptr)
run k-means training
- Parameters:
x – training vectors, size n * d
index – index used for assignment
x_weights – weight associated to each vector: NULL or size n
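As a rough picture of what one k-means iteration does with these parameters, the sketch below assigns each vector to its nearest centroid and then replaces each centroid by the weighted mean of its assigned vectors. It is a hypothetical, self-contained illustration: the function name kmeans_iteration is an assumption, a brute-force search stands in for the index, and empty clusters simply keep their previous centroid, which is a simplification and not necessarily what the library does.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical sketch, not faiss code: one weighted k-means iteration.
// x holds n vectors of dimension d (size n * d); centroids holds k * d
// values and is updated in place; x_weights is empty or has size n.
// Returns the weighted sum of squared distances measured before the update.
double kmeans_iteration(std::size_t n, std::size_t d,
                        const std::vector<float>& x,
                        std::vector<float>& centroids,
                        const std::vector<float>& x_weights) {
    assert(d > 0 && x.size() == n * d);
    assert(!centroids.empty() && centroids.size() % d == 0);
    assert(x_weights.empty() || x_weights.size() == n);
    const std::size_t k = centroids.size() / d;

    std::vector<double> sum(k * d, 0.0), mass(k, 0.0);
    double objective = 0.0;
    for (std::size_t i = 0; i < n; i++) {
        // Assignment step: brute force here, the index plays this role in train().
        std::size_t best = 0;
        double best_dist = std::numeric_limits<double>::infinity();
        for (std::size_t c = 0; c < k; c++) {
            double dist = 0.0;
            for (std::size_t j = 0; j < d; j++) {
                const double diff = double(x[i * d + j]) - centroids[c * d + j];
                dist += diff * diff;
            }
            if (dist < best_dist) {
                best_dist = dist;
                best = c;
            }
        }
        const double w = x_weights.empty() ? 1.0 : x_weights[i];
        objective += w * best_dist;
        mass[best] += w;
        for (std::size_t j = 0; j < d; j++) {
            sum[best * d + j] += w * x[i * d + j];
        }
    }
    // Update step: weighted mean; clusters with no mass are left unchanged.
    for (std::size_t c = 0; c < k; c++) {
        if (mass[c] > 0.0) {
            for (std::size_t j = 0; j < d; j++) {
                centroids[c * d + j] = float(sum[c * d + j] / mass[c]);
            }
        }
    }
    return objective;
}
```

Calling this in a loop niter times, starting from some initial centroids, gives the basic shape of the training procedure.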
-
void train_encoded(idx_t nx, const uint8_t *x_in, const Index *codec, Index &index, const float *weights = nullptr)
run with encoded vectors
In addition to train()’s parameters, takes a codec as parameter to decode the input vectors.
- Parameters:
codec – codec used to decode the vectors (nullptr = vectors are in fact floats)
-
void post_process_centroids()
Post-process the centroids after each centroid update. Includes optional L2 normalization and rounding to the nearest integer.
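The two optional steps can be sketched as follows. This is a hypothetical, self-contained stand-in (the name post_process_sketch and the explicit flag arguments are assumptions, mirroring the spherical and int_centroids members); applying normalization before rounding is also an assumption of the sketch.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical stand-in, not the faiss implementation: post-process a
// flat array of centroids (k * d values) in place.
void post_process_sketch(std::vector<float>& centroids, std::size_t d,
                         bool spherical, bool int_centroids) {
    assert(d > 0 && centroids.size() % d == 0);
    const std::size_t k = centroids.size() / d;

    if (spherical) {
        // Scale each centroid to unit L2 norm; all-zero centroids are left as-is.
        for (std::size_t c = 0; c < k; c++) {
            float* v = centroids.data() + c * d;
            double norm2 = 0.0;
            for (std::size_t j = 0; j < d; j++) {
                norm2 += double(v[j]) * v[j];
            }
            if (norm2 > 0.0) {
                const float inv = float(1.0 / std::sqrt(norm2));
                for (std::size_t j = 0; j < d; j++) {
                    v[j] *= inv;
                }
            }
        }
    }
    if (int_centroids) {
        // Round every coordinate to the nearest integer value.
        for (float& v : centroids) {
            v = std::round(v);
        }
    }
}
```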
Public Members
-
size_t d
dimension of the vectors
-
size_t k
number of centroids
-
std::vector<float> centroids
centroids (k * d). If centroids are set on input to train, they will be used as initialization.
-
std::vector<ClusteringIterationStats> iteration_stats
stats at every iteration of clustering
-
int niter
number of clustering iterations
-
int nredo
redo clustering this many times and keep the best result
-
bool verbose
-
bool spherical
do we want normalized centroids?
-
bool int_centroids
round centroid coordinates to the nearest integer
-
bool update_index
re-train index after each iteration?
-
bool frozen_centroids
use the centroids provided as input and do not change them during iterations
-
int min_points_per_centroid
minimum number of training points per centroid; with fewer points you get a warning
-
int max_points_per_centroid
maximum number of training points per centroid, to limit the size of the dataset
-
int seed
seed for the random number generator
-
size_t decode_block_size
how many vectors at a time to decode
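The interplay of min_points_per_centroid, max_points_per_centroid and seed can be illustrated with the sketch below. It is hypothetical and self-contained: both function names are assumptions, and so are the exact thresholds (k times the per-centroid limit) and the use of a seeded shuffle to pick the subset. It shows one plausible way such limits could be applied, not the library's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper, not a faiss function: pick which of the n training
// points to keep so that at most k * max_points_per_centroid remain.
std::vector<std::size_t> select_training_points(std::size_t n, std::size_t k,
                                                std::size_t max_points_per_centroid,
                                                int seed) {
    assert(k > 0 && max_points_per_centroid > 0);
    std::vector<std::size_t> ids(n);
    std::iota(ids.begin(), ids.end(), std::size_t{0});
    const std::size_t limit = k * max_points_per_centroid;
    if (n <= limit) {
        return ids;  // small enough: keep everything
    }
    // Seeded shuffle, so the same seed always selects the same subset.
    std::mt19937 rng(static_cast<unsigned>(seed));
    std::shuffle(ids.begin(), ids.end(), rng);
    ids.resize(limit);
    std::sort(ids.begin(), ids.end());  // restore the original ordering
    return ids;
}

// Hypothetical check: true when the training set is small enough that a
// warning would be expected (assumed threshold: k * min_points_per_centroid).
bool too_few_points(std::size_t n, std::size_t k,
                    std::size_t min_points_per_centroid) {
    return n < k * min_points_per_centroid;
}
```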