Namespace faiss::gpu
-
namespace faiss::gpu
Enums
-
enum class DistanceDataType
Values:
-
enumerator F32
-
enumerator F16
-
enum class IndicesDataType
Values:
-
enumerator I64
-
enumerator I32
-
enum IndicesOptions
How user vector index data is stored on the GPU.
Values:
-
enumerator INDICES_CPU
The user indices are only stored on the CPU; the GPU returns (inverted list, offset) to the CPU which is then translated to the real user index.
-
enumerator INDICES_IVF
The indices are not stored at all, on either the CPU or GPU. Only (inverted list, offset) is returned to the user as the index.
-
enumerator INDICES_32_BIT
Indices are stored as 32 bit integers on the GPU, but returned as 64 bit integers
-
enumerator INDICES_64_BIT
Indices are stored as 64 bit integers on the GPU.
-
enum AllocType
Values:
-
enumerator Other
Unknown allocation type or miscellaneous (not currently categorized)
-
enumerator FlatData
Primary data storage for GpuIndexFlat (the raw matrix of vectors and vector norms if needed)
-
enumerator IVFLists
Primary data storage for GpuIndexIVF* (the storage for each individual IVF list)
-
enumerator Quantizer
Quantizer (PQ, SQ) dictionary information.
-
enumerator QuantizerPrecomputedCodes
For GpuIndexIVFPQ, “precomputed codes” for more efficient PQ lookup require the use of possibly large tables. These are marked separately from Quantizer as these can frequently be 100s - 1000s of MiB in size
-
enumerator TemporaryMemoryBuffer
StandardGpuResources implementation-specific type. When using StandardGpuResources, temporary memory allocations (MemorySpace::Temporary) come out of a stack region of memory that is allocated up front for each GPU (e.g., 1.5 GiB upon initialization). This allocation by StandardGpuResources is marked with this AllocType.
-
enumerator TemporaryMemoryOverflow
When using StandardGpuResources, any MemorySpace::Temporary allocations that cannot be satisfied within the TemporaryMemoryBuffer region fall back to cudaMalloc calls that are sized to just the request at hand. These “overflow” temporary allocations are marked with this AllocType.
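The relationship between TemporaryMemoryBuffer and TemporaryMemoryOverflow can be pictured with a small host-only model. This is a sketch under stated assumptions, not the StandardGpuResources implementation: `TempSource` and `TempStackModel` are hypothetical names, and no GPU memory is allocated.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical tags mirroring the two AllocType values described above.
enum class TempSource { StackBuffer, Overflow };

// Toy model of a fixed-size temporary stack region (assumption: a simple
// byte counter stands in for the real up-front device allocation).
class TempStackModel {
public:
    explicit TempStackModel(std::size_t capacityBytes)
            : capacity_(capacityBytes), used_(0) {}

    // Serve the request from the stack region if it fits; otherwise report
    // that it would overflow to a separate allocation sized to the request.
    TempSource request(std::size_t bytes) {
        if (bytes <= capacity_ - used_) {
            used_ += bytes;
            return TempSource::StackBuffer;
        }
        return TempSource::Overflow;
    }

    // Release a stack allocation of the given size.
    void release(std::size_t bytes) {
        assert(bytes <= used_);
        used_ -= bytes;
    }

    std::size_t used() const { return used_; }

private:
    std::size_t capacity_;
    std::size_t used_;
};
```

In this model an overflow request leaves the stack region untouched, which matches the description of overflow allocations being sized to just the request at hand.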
-
enum MemorySpace
Memory regions accessible to the GPU.
Values:
-
enumerator Temporary
Temporary device memory (guaranteed to no longer be used upon exit of a top-level index call, and where the streams using it have completed GPU work). Typically backed by Device memory (cudaMalloc/cudaFree).
-
enumerator Device
Managed using cudaMalloc/cudaFree (typical GPU device memory)
-
enumerator Unified
Managed using cudaMallocManaged/cudaFree (typical Unified CPU/GPU memory)
Functions
-
faiss::Index *index_gpu_to_cpu(const faiss::Index *gpu_index)
converts any GPU index inside gpu_index to a CPU index
-
faiss::Index *index_cpu_to_gpu(GpuResourcesProvider *provider, int device, const faiss::Index *index, const GpuClonerOptions *options = nullptr)
converts any CPU index that can be converted to GPU
-
faiss::Index *index_cpu_to_gpu_multiple(std::vector<GpuResourcesProvider*> &provider, std::vector<int> &devices, const faiss::Index *index, const GpuMultipleClonerOptions *options = nullptr)
-
void bfKnn(GpuResourcesProvider *resources, const GpuDistanceParams &args)
A wrapper for gpu/impl/Distance.cuh to expose direct brute-force k-nearest neighbor searches on an externally-provided region of memory (e.g., from a pytorch tensor). The data (vectors, queries, outDistances, outIndices) can be resident on the GPU or the CPU, but all calculations are performed on the GPU. If the result buffers are on the CPU, results will be copied back when done.
All GPU computation is performed on the current CUDA device, and ordered with respect to resources->getDefaultStreamCurrentDevice().
For each vector in queries, searches all of vectors to find its k nearest neighbors with respect to the given metric.
-
void bruteForceKnn(GpuResourcesProvider *resources, faiss::MetricType metric, const float *vectors, bool vectorsRowMajor, int numVectors, const float *queries, bool queriesRowMajor, int numQueries, int dims, int k, float *outDistances, Index::idx_t *outIndices)
Deprecated legacy implementation.
-
std::string allocTypeToString(AllocType t)
Convert an AllocType to string.
-
std::string memorySpaceToString(MemorySpace s)
Convert a MemorySpace to string.
-
AllocInfo makeDevAlloc(AllocType at, cudaStream_t st)
Create an AllocInfo for the current device with MemorySpace::Device.
-
AllocInfo makeTempAlloc(AllocType at, cudaStream_t st)
Create an AllocInfo for the current device with MemorySpace::Temporary.
-
AllocInfo makeSpaceAlloc(AllocType at, MemorySpace sp, cudaStream_t st)
Create an AllocInfo for the current device.
-
void newTestSeed()
Generates and displays a new seed for the test.
-
void setTestSeed(long seed)
Uses an explicit seed for the test.
-
float relativeError(float a, float b)
Returns the relative error between a and b: |a - b| / (0.5 * (|a| + |b|))
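The formula can be written out as a short function. This is a minimal sketch, not the library's code: `relativeErrorSketch` is a hypothetical name, and the guard for a zero denominator is an assumption added for the example.

```cpp
#include <cassert>
#include <cmath>

// Sketch of the documented formula: |a - b| / (0.5 * (|a| + |b|)).
inline float relativeErrorSketch(float a, float b) {
    assert(std::isfinite(a) && std::isfinite(b));
    const float denom = 0.5f * (std::fabs(a) + std::fabs(b));
    if (denom == 0.0f) {
        return 0.0f; // a == b == 0: treated as no error (assumption)
    }
    return std::fabs(a - b) / denom;
}
```

Dividing by the mean magnitude rather than by one operand makes the measure symmetric in a and b.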
-
int randVal(int a, int b)
Generates a random integer in the range [a, b].
-
bool randBool()
Generates a random bool.
-
template<typename T>
T randSelect(std::initializer_list<T> vals)
Select a random value from the given list of values provided as an initializer_list.
-
std::vector<float> randVecs(size_t num, size_t dim)
Generates a collection of random vectors in the range [0, 1].
-
std::vector<unsigned char> randBinaryVecs(size_t num, size_t dim)
Generates a collection of random bit vectors.
-
void compareIndices(const std::vector<float> &queryVecs, faiss::Index &refIndex, faiss::Index &testIndex, int numQuery, int dim, int k, const std::string &configMsg, float maxRelativeError = 6e-5f, float pctMaxDiff1 = 0.1f, float pctMaxDiffN = 0.005f)
Compare two indices via query for similarity, with a user-specified set of query vectors
-
void compareIndices(faiss::Index &refIndex, faiss::Index &testIndex, int numQuery, int dim, int k, const std::string &configMsg, float maxRelativeError = 6e-5f, float pctMaxDiff1 = 0.1f, float pctMaxDiffN = 0.005f)
Compare two indices via query for similarity, generating random query vectors
-
void compareLists(const float *refDist, const faiss::Index::idx_t *refInd, const float *testDist, const faiss::Index::idx_t *testInd, int dim1, int dim2, const std::string &configMsg, bool printBasicStats, bool printDiffs, bool assertOnErr, float maxRelativeError = 6e-5f, float pctMaxDiff1 = 0.1f, float pctMaxDiffN = 0.005f)
Display specific differences in the two (distance, index) lists.
-
template<typename A, typename B>
void testIVFEquality(A &cpuIndex, B &gpuIndex)
Compare IVF lists between a CPU and GPU index.
-
int getCurrentDevice()
Returns the current thread-local GPU device.
-
void setCurrentDevice(int device)
Sets the current thread-local GPU device.
-
int getNumDevices()
Returns the number of available GPU devices.
-
void profilerStart()
Starts the CUDA profiler (exposed via SWIG)
-
void profilerStop()
Stops the CUDA profiler (exposed via SWIG)
-
void synchronizeAllDevices()
Synchronizes the CPU against all devices (equivalent to cudaDeviceSynchronize for each device)
-
const cudaDeviceProp &getDeviceProperties(int device)
Returns a cached cudaDeviceProp for the given device.
-
const cudaDeviceProp &getCurrentDeviceProperties()
Returns the cached cudaDeviceProp for the current device.
-
int getMaxThreads(int device)
Returns the maximum number of threads available for the given GPU device
-
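int getMaxThreadsCurrentDevice()
Equivalent to getMaxThreads(getCurrentDevice())
-
size_t getMaxSharedMemPerBlock(int device)
Returns the maximum shared memory (smem) available for the given GPU device.
-
size_t getMaxSharedMemPerBlockCurrentDevice()
Equivalent to getMaxSharedMemPerBlock(getCurrentDevice())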
-
int getDeviceForAddress(const void *p)
For a given pointer, returns the ID of the device on which it is located (deviceId >= 0), or -1 if it is located on the host.
-
bool getFullUnifiedMemSupport(int device)
Does the given device support full unified memory sharing host memory?
-
bool getFullUnifiedMemSupportCurrentDevice()
Equivalent to getFullUnifiedMemSupport(getCurrentDevice())
-
bool getTensorCoreSupport(int device)
Does the given device support tensor core operations?
-
bool getTensorCoreSupportCurrentDevice()
Equivalent to getTensorCoreSupport(getCurrentDevice())
-
int getMaxKSelection()
Returns the maximum k-selection value supported based on the CUDA SDK that we were compiled with. .cu files can use DeviceDefs.cuh, but this is for non-CUDA files
-
template<typename L1, typename L2>
void streamWaitBase(const L1 &listWaiting, const L2 &listWaitOn)
Call for a collection of streams to wait on.
-
template<typename L1>
void streamWait(const L1 &a, const std::initializer_list<cudaStream_t> &b)
These versions allow usage of initializer_list as arguments, since otherwise {…} doesn’t have a type
-
template<typename L2>
void streamWait(const std::initializer_list<cudaStream_t> &a, const L2 &b)
-
inline void streamWait(const std::initializer_list<cudaStream_t> &a, const std::initializer_list<cudaStream_t> &b)
-
struct AllocInfo
- #include <GpuResources.h>
Information on what/where an allocation is.
Subclassed by faiss::gpu::AllocRequest
Public Functions
-
inline AllocInfo()
-
inline AllocInfo(AllocType at, int dev, MemorySpace sp, cudaStream_t st)
-
std::string toString() const
Returns a string representation of this info.
Public Members
-
AllocType type
The internal category of the allocation.
-
int device
The device on which the allocation is happening.
-
MemorySpace space
The memory space of the allocation.
-
cudaStream_t stream
The stream on which new work on the memory will be ordered (e.g., if a piece of memory cached and to be returned for this call was last used on stream 3 and a new memory request is for stream 4, the memory manager will synchronize stream 4 to wait for the completion of stream 3 via events or other stream synchronization).
The memory manager guarantees that the returned memory is free to use without data races on the specified stream.
-
struct AllocRequest : public faiss::gpu::AllocInfo
- #include <GpuResources.h>
Information on what/where an allocation is, along with how big it should be.
Public Functions
-
inline AllocRequest()
-
inline AllocRequest(const AllocInfo &info, size_t sz)
-
inline AllocRequest(AllocType at, int dev, MemorySpace sp, cudaStream_t st, size_t sz)
-
std::string toString() const
Returns a string representation of this request.
Public Members
-
size_t size
The size in bytes of the allocation.
-
AllocType type
The internal category of the allocation.
-
int device
The device on which the allocation is happening.
-
MemorySpace space
The memory space of the allocation.
-
cudaStream_t stream
The stream on which new work on the memory will be ordered (e.g., if a piece of memory cached and to be returned for this call was last used on stream 3 and a new memory request is for stream 4, the memory manager will synchronize stream 4 to wait for the completion of stream 3 via events or other stream synchronization).
The memory manager guarantees that the returned memory is free to use without data races on the specified stream.
-
class CpuTimer
- #include <Timer.h>
CPU wallclock elapsed timer.
Public Functions
-
CpuTimer()
Creates and starts a new timer.
-
float elapsedMilliseconds()
Returns elapsed time in milliseconds.
Private Members
-
std::chrono::time_point<std::chrono::steady_clock> start_
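A wall-clock timer with this interface can be sketched on top of std::chrono::steady_clock, matching the type of the start_ member. `CpuTimerSketch` is a hypothetical name used for illustration; it is not the class declared in Timer.h.

```cpp
#include <cassert>
#include <chrono>

// Sketch of a wall-clock timer that starts on construction.
class CpuTimerSketch {
public:
    CpuTimerSketch() : start_(std::chrono::steady_clock::now()) {}

    // Milliseconds elapsed since construction, as a float.
    float elapsedMilliseconds() const {
        const auto end = std::chrono::steady_clock::now();
        assert(end >= start_); // steady_clock never goes backwards
        const std::chrono::duration<float, std::milli> ms = end - start_;
        return ms.count();
    }

private:
    std::chrono::time_point<std::chrono::steady_clock> start_;
};
```

A steady clock is used here because it is monotonic, so elapsed times are not affected by system clock adjustments.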
-
class CublasHandleScope
- #include <DeviceUtils.h>
RAII object to manage a cublasHandle_t.
Public Functions
-
CublasHandleScope()
-
~CublasHandleScope()
-
inline cublasHandle_t get()
Private Members
-
cublasHandle_t blasHandle_
-
class CudaEvent
Public Functions
-
explicit CudaEvent(cudaStream_t stream, bool timer = false)
Creates an event and records it in this stream.
-
CudaEvent(const CudaEvent &event) = delete
-
CudaEvent(CudaEvent &&event) noexcept
-
~CudaEvent()
-
inline cudaEvent_t get()
-
void streamWaitOnEvent(cudaStream_t stream)
Wait on this event in this stream.
-
void cpuWaitOnEvent()
Have the CPU wait for the completion of this event.
Private Members
-
cudaEvent_t event_
-
class DeviceScope
- #include <DeviceUtils.h>
RAII object to set the current device, and restore the previous device upon destruction
Public Functions
-
explicit DeviceScope(int device)
-
~DeviceScope()
Private Members
-
int prevDevice_
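The set-and-restore behavior described for DeviceScope is a standard RAII pattern. The sketch below illustrates it on the host only: the `sketch` namespace, its `getCurrentDevice`/`setCurrentDevice` stand-ins (which track a thread-local integer instead of calling CUDA), and `DeviceScopeSketch` are all assumptions made for the example.

```cpp
#include <cassert>

namespace sketch {

// Stand-in for the thread-local current device (assumption: no CUDA calls).
inline int& currentDeviceSlot() {
    thread_local int device = 0;
    return device;
}

inline int getCurrentDevice() { return currentDeviceSlot(); }

inline void setCurrentDevice(int device) {
    assert(device >= 0);
    currentDeviceSlot() = device;
}

// Switches to `device` on construction and restores the previously
// current device on destruction.
class DeviceScopeSketch {
public:
    explicit DeviceScopeSketch(int device) : prevDevice_(getCurrentDevice()) {
        setCurrentDevice(device);
    }
    ~DeviceScopeSketch() { setCurrentDevice(prevDevice_); }

    // Non-copyable: copying would restore the previous device twice.
    DeviceScopeSketch(const DeviceScopeSketch&) = delete;
    DeviceScopeSketch& operator=(const DeviceScopeSketch&) = delete;

private:
    int prevDevice_;
};

} // namespace sketch
```

Declaring such an object at the top of a block means the previous device is restored on every exit path, including early returns and exceptions.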
-
struct GpuClonerOptions
- #include <GpuClonerOptions.h>
set some options on how to copy to GPU
Subclassed by faiss::gpu::GpuMultipleClonerOptions, faiss::gpu::ToGpuCloner
Public Functions
-
GpuClonerOptions()
Public Members
-
IndicesOptions indicesOptions
how should indices be stored on index types that support indices (anything but GpuIndexFlat*)?
-
bool useFloat16CoarseQuantizer
is the coarse quantizer in float16?
-
bool useFloat16
for GpuIndexIVFFlat, is storage in float16? for GpuIndexIVFPQ, are intermediate calculations in float16?
-
bool usePrecomputed
use precomputed tables?
-
long reserveVecs
reserve vectors in the invfiles?
-
bool storeTransposed
For GpuIndexFlat, store data in transposed layout?
-
bool verbose
Set verbose options on the index.
-
struct GpuDistanceParams
- #include <GpuDistance.h>
Arguments to brute-force GPU k-nearest neighbor searching.
Public Functions
-
inline GpuDistanceParams()
Public Members
-
faiss::MetricType metric
Search parameter: distance metric.
-
float metricArg
Search parameter: distance metric argument (if applicable). For metric == METRIC_Lp, this is the p-value.
-
int k
Search parameter: return k nearest neighbors. If the value provided is -1, then we report all pairwise distances without top-k filtering.
-
int dims
Vector dimensionality.
-
const void *vectors
If vectorsRowMajor is true, this is numVectors x dims, with dims innermost; otherwise, dims x numVectors, with numVectors innermost
-
DistanceDataType vectorType
-
bool vectorsRowMajor
-
int numVectors
-
const float *vectorNorms
Precomputed L2 norms for each vector in vectors, which can be optionally provided in advance to speed computation for METRIC_L2
-
const void *queries
If queriesRowMajor is true, this is numQueries x dims, with dims innermost; otherwise, dims x numQueries, with numQueries innermost
-
DistanceDataType queryType
-
bool queriesRowMajor
-
int numQueries
-
float *outDistances
A region of memory size numQueries x k, with k innermost (row major) if k > 0, or if k == -1, a region of memory of size numQueries x numVectors
-
bool ignoreOutDistances
Do we only care about the indices reported, rather than the output distances? Not used if k == -1 (all pairwise distances)
-
IndicesDataType outIndicesType
Data type of the elements of outIndices.
-
void *outIndices
A region of memory size numQueries x k, with k innermost (row major). Not used if k == -1 (all pairwise distances)
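The buffer shapes and layouts described for GpuDistanceParams can be summarized as a few helper functions. This is a sketch for illustration; `outDistancesCount`, `outIndicesCount`, and `elementOffset` are hypothetical helpers and not part of GpuDistance.h.

```cpp
#include <cassert>
#include <cstddef>

// Number of elements outDistances must hold: numQueries x k when k > 0,
// or numQueries x numVectors when k == -1 (all pairwise distances).
inline std::size_t outDistancesCount(int numQueries, int numVectors, int k) {
    assert(numQueries >= 0 && numVectors >= 0);
    assert(k > 0 || k == -1);
    const std::size_t perQuery = (k == -1)
            ? static_cast<std::size_t>(numVectors)
            : static_cast<std::size_t>(k);
    return static_cast<std::size_t>(numQueries) * perQuery;
}

// Number of elements outIndices must hold; it is unused when k == -1.
inline std::size_t outIndicesCount(int numQueries, int k) {
    assert(numQueries >= 0);
    assert(k > 0 || k == -1);
    if (k == -1) {
        return 0;
    }
    return static_cast<std::size_t>(numQueries) * static_cast<std::size_t>(k);
}

// Flat offset of component `dim` of vector `vec` in a vectors or queries
// buffer, for the row-major and non-row-major layouts described above.
inline std::size_t elementOffset(
        bool rowMajor, int numVectors, int dims, int vec, int dim) {
    assert(vec >= 0 && vec < numVectors && dim >= 0 && dim < dims);
    if (rowMajor) {
        // numVectors x dims, with dims innermost
        return static_cast<std::size_t>(vec) * dims + dim;
    }
    // dims x numVectors, with numVectors innermost
    return static_cast<std::size_t>(dim) * numVectors + vec;
}
```

The counts are in elements, so the byte size of each buffer also depends on the element type in use (for example the type selected by outIndicesType for outIndices).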
-
class GpuIcmEncoder : public IcmEncoder
- #include <GpuIcmEncoder.h>
Perform LSQ encoding on GPU.
Split input vectors to different devices and call IcmEncoderImpl::encode to encode them
Public Functions
-
GpuIcmEncoder(const LocalSearchQuantizer *lsq, const std::vector<GpuResourcesProvider*> &provs, const std::vector<int> &devices)
-
~GpuIcmEncoder()
-
GpuIcmEncoder(const GpuIcmEncoder&) = delete
-
GpuIcmEncoder &operator=(const GpuIcmEncoder&) = delete
-
void set_binary_term() override
-
void encode(int32_t *codes, const float *x, std::mt19937 &gen, size_t n, size_t ils_iters) const override
Private Members
-
std::unique_ptr<IcmEncoderShards> shards
-
struct GpuIcmEncoderFactory : public IcmEncoderFactory
Public Functions
-
explicit GpuIcmEncoderFactory(int ngpus = 1)
-
lsq::IcmEncoder *get(const LocalSearchQuantizer *lsq) override
-
class GpuIndex : public faiss::Index
Subclassed by faiss::gpu::GpuIndexFlat, faiss::gpu::GpuIndexIVF
Public Types
-
using idx_t = int64_t
all indices are this type
-
using component_t = float
-
using distance_t = float
Public Functions
-
int getDevice() const
Returns the device that this index is resident on.
-
std::shared_ptr<GpuResources> getResources()
Returns a reference to our GpuResources object that manages memory, stream and handle resources on the GPU
-
void setMinPagingSize(size_t size)
Set the minimum data size for searches (in MiB) for which we use CPU -> GPU paging
-
size_t getMinPagingSize() const
Returns the current minimum data size for paged searches.
-
virtual void add(Index::idx_t, const float *x) override
x can be resident on the CPU or any GPU; copies are performed as needed. Handles paged adds if the add set is too large; calls addInternal_
-
virtual void add_with_ids(Index::idx_t n, const float *x, const Index::idx_t *ids) override
x and ids can be resident on the CPU or any GPU; copies are performed as needed. Handles paged adds if the add set is too large; calls addInternal_
-
virtual void assign(Index::idx_t n, const float *x, Index::idx_t *labels, Index::idx_t k = 1) const override
x and labels can be resident on the CPU or any GPU; copies are performed as needed
-
virtual void search(Index::idx_t n, const float *x, Index::idx_t k, float *distances, Index::idx_t *labels) const override
x, distances and labels can be resident on the CPU or any GPU; copies are performed as needed
-
virtual void compute_residual(const float *x, float *residual, Index::idx_t key) const override
Overridden to force GPU indices to provide their own GPU-friendly implementation
-
virtual void compute_residual_n(Index::idx_t n, const float *xs, float *residuals, const Index::idx_t *keys) const override
Overridden to force GPU indices to provide their own GPU-friendly implementation
-
virtual void train(idx_t n, const float *x)
Perform training on a representative set of vectors
- Parameters
n – nb of training vectors
x – training vectors, size n * d
-
virtual void range_search(idx_t n, const float *x, float radius, RangeSearchResult *result) const
query n vectors of dimension d to the index.
return all vectors with distance < radius. Note that many indexes do not implement the range_search (only the k-NN search is mandatory).
- Parameters
x – input vectors to search, size n * d
radius – search radius
result – result table
-
virtual void reset() = 0
removes all elements from the database.
-
virtual size_t remove_ids(const IDSelector &sel)
removes IDs from the index. Not supported by all indexes. Returns the number of elements removed.
-
virtual void reconstruct(idx_t key, float *recons) const
Reconstruct a stored vector (or an approximation if lossy coding)
this function may not be defined for some indexes
- Parameters
key – id of the vector to reconstruct
recons – reconstructed vector (size d)
-
virtual void reconstruct_n(idx_t i0, idx_t ni, float *recons) const
Reconstruct vectors i0 to i0 + ni - 1
this function may not be defined for some indexes
- Parameters
recons – reconstructed vector (size ni * d)
-
virtual void search_and_reconstruct(idx_t n, const float *x, idx_t k, float *distances, idx_t *labels, float *recons) const
Similar to search, but also reconstructs the stored vectors (or an approximation in the case of lossy coding) for the search results.
If there are not enough results for a query, the resulting arrays are padded with -1s.
- Parameters
recons – reconstructed vectors size (n, k, d)
-
virtual DistanceComputer *get_distance_computer() const
Get a DistanceComputer (defined in AuxIndexStructures) object for this kind of index.
DistanceComputer is implemented for indexes that support random access of their vectors.
-
virtual size_t sa_code_size() const
size of the produced codes in bytes
-
virtual void sa_encode(idx_t n, const float *x, uint8_t *bytes) const
encode a set of vectors
- Parameters
n – number of vectors
x – input vectors, size n * d
bytes – output encoded vectors, size n * sa_code_size()
-
virtual void sa_decode(idx_t n, const uint8_t *bytes, float *x) const
decode a set of vectors
- Parameters
n – number of vectors
bytes – input encoded vectors, size n * sa_code_size()
x – output vectors, size n * d
Public Members
-
int d
vector dimension
-
idx_t ntotal
total nb of indexed vectors
-
bool verbose
verbosity level
-
bool is_trained
set if the Index does not require training, or if training is done already
-
MetricType metric_type
type of metric this index uses for search
-
float metric_arg
argument of the metric type
Protected Functions
-
virtual bool addImplRequiresIDs_() const = 0
Does addImpl_ require IDs? If so, and no IDs are provided, we will generate them sequentially based on the order in which the vectors are added
Protected Attributes
-
std::shared_ptr<GpuResources> resources_
Manages streams, cuBLAS handles and scratch memory for devices.
-
const GpuIndexConfig config_
Our configuration options.
-
size_t minPagedSize_
Size above which we page copies from the CPU to GPU.
Private Functions
-
void addPaged_(int n, const float *x, const Index::idx_t *ids)
Handles paged adds if the add set is too large, passes to addImpl_ to actually perform the add for the current page
-
void addPage_(int n, const float *x, const Index::idx_t *ids)
Calls addImpl_ for a single page of GPU-resident data.
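The paging idea described for addPaged_ and addPage_ amounts to walking the input in bounded chunks. The sketch below shows only that chunking step. `forEachPage` is a hypothetical helper, and the page size is passed in as a parameter because this section does not say how the real page size is chosen.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>

// Calls addPage(offset, count) for consecutive chunks of at most
// maxPerPage vectors, in order. Returns the number of pages issued.
inline int forEachPage(
        int n,
        int maxPerPage,
        const std::function<void(int offset, int count)>& addPage) {
    assert(n >= 0 && maxPerPage > 0);
    int pages = 0;
    int offset = 0;
    while (offset < n) {
        const int count = std::min(maxPerPage, n - offset);
        addPage(offset, count); // stand-in for the per-page add
        offset += count;
        ++pages;
    }
    return pages;
}
```

With n = 10 and maxPerPage = 4, the callback receives chunks of 4, 4, and 2 vectors.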
-
class GpuIndexBinaryFlat : public faiss::IndexBinary
- #include <GpuIndexBinaryFlat.h>
A GPU version of IndexBinaryFlat for brute-force comparison of bit vectors via Hamming distance
Public Types
-
using component_t = uint8_t
-
using distance_t = int32_t
Public Functions
-
GpuIndexBinaryFlat(GpuResourcesProvider *resources, const faiss::IndexBinaryFlat *index, GpuIndexBinaryFlatConfig config = GpuIndexBinaryFlatConfig())
Construct from a pre-existing faiss::IndexBinaryFlat instance, copying data over to the given GPU
-
GpuIndexBinaryFlat(GpuResourcesProvider *resources, int dims, GpuIndexBinaryFlatConfig config = GpuIndexBinaryFlatConfig())
Construct an empty instance that can be added to.
-
~GpuIndexBinaryFlat() override
-
int getDevice() const
Returns the device that this index is resident on.
-
std::shared_ptr<GpuResources> getResources()
Returns a reference to our GpuResources object that manages memory, stream and handle resources on the GPU
-
void copyFrom(const faiss::IndexBinaryFlat *index)
Initialize ourselves from the given CPU index; will overwrite all data in ourselves
-
void copyTo(faiss::IndexBinaryFlat *index) const
Copy ourselves to the given CPU index; will overwrite all data in the index instance
-
virtual void add(faiss::IndexBinary::idx_t n, const uint8_t *x) override
Add n vectors of dimension d to the index.
Vectors are implicitly assigned labels ntotal .. ntotal + n - 1
- Parameters
x – input matrix, size n * d / 8
-
virtual void reset() override
Removes all elements from the database.
-
virtual void search(faiss::IndexBinary::idx_t n, const uint8_t *x, faiss::IndexBinary::idx_t k, int32_t *distances, faiss::IndexBinary::idx_t *labels) const override
Query n vectors of dimension d to the index.
return at most k vectors. If there are not enough results for a query, the result array is padded with -1s.
- Parameters
x – input vectors to search, size n * d / 8
labels – output labels of the NNs, size n*k
distances – output pairwise distances, size n*k
-
virtual void reconstruct(faiss::IndexBinary::idx_t key, uint8_t *recons) const override
Reconstruct a stored vector.
This function may not be defined for some indexes.
- Parameters
key – id of the vector to reconstruct
recons – reconstructed vector (size d / 8)
-
virtual void train(idx_t n, const uint8_t *x)
Perform training on a representative set of vectors.
- Parameters
n – nb of training vectors
x – training vectors, size n * d / 8
-
virtual void add_with_ids(idx_t n, const uint8_t *x, const idx_t *xids)
Same as add, but stores xids instead of sequential ids.
The default implementation fails with an assertion, as it is not supported by all indexes.
- Parameters
xids – if non-null, ids to store for the vectors (size n)
-
virtual void range_search(idx_t n, const uint8_t *x, int radius, RangeSearchResult *result) const
Query n vectors of dimension d to the index.
Return all vectors with distance < radius. Note that many indexes do not implement the range_search (only the k-NN search is mandatory). The distances are converted to float to reuse the RangeSearchResult structure, but they are integers. By convention, only distances < radius (strict comparison) are returned, i.e., radius = 0 does not return any result and 1 returns only exact same vectors.
- Parameters
x – input vectors to search, size n * d / 8
radius – search radius
result – result table
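The strict-comparison convention can be checked with a small filter over precomputed integer distances. `withinRadius` is a hypothetical helper written for this illustration. It does not use RangeSearchResult or any index.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the positions whose distance is strictly less than radius.
inline std::vector<std::size_t> withinRadius(
        const std::vector<std::int32_t>& distances, int radius) {
    assert(radius >= 0);
    std::vector<std::size_t> hits;
    for (std::size_t i = 0; i < distances.size(); ++i) {
        if (distances[i] < radius) { // strict: distance == radius is excluded
            hits.push_back(i);
        }
    }
    return hits;
}
```

With radius = 0 nothing passes the filter, and with radius = 1 only entries at distance 0 (identical vectors) pass, as stated above.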
-
void assign(idx_t n, const uint8_t *x, idx_t *labels, idx_t k = 1) const
Return the indexes of the k vectors closest to the query x.
This function is identical to search but only returns labels of neighbors.
- Parameters
x – input vectors to search, size n * d / 8
labels – output labels of the NNs, size n*k
-
virtual size_t remove_ids(const IDSelector &sel)
Removes IDs from the index. Not supported by all indexes.
-
virtual void reconstruct_n(idx_t i0, idx_t ni, uint8_t *recons) const
Reconstruct vectors i0 to i0 + ni - 1.
This function may not be defined for some indexes.
- Parameters
recons – reconstructed vectors (size ni * d / 8)
-
virtual void search_and_reconstruct(idx_t n, const uint8_t *x, idx_t k, int32_t *distances, idx_t *labels, uint8_t *recons) const
Similar to search, but also reconstructs the stored vectors (or an approximation in the case of lossy coding) for the search results.
If there are not enough results for a query, the resulting array is padded with -1s.
- Parameters
recons – reconstructed vectors size (n, k, d)
-
void display() const
Display the actual class name and some more info.
Public Members
-
int d
vector dimension
-
int code_size
number of bytes per vector ( = d / 8 )
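Since each bit vector occupies d / 8 bytes, the Hamming distance between two stored vectors can be computed byte by byte. The following is a plain CPU sketch for illustration. `hammingDistance` is a hypothetical helper and does not reflect how the GPU implementation computes distances.

```cpp
#include <cassert>
#include <cstdint>

// Hamming distance between two bit vectors of dimension d, each packed
// into d / 8 bytes (assumption: d is a multiple of 8).
inline std::int32_t hammingDistance(
        const std::uint8_t* a, const std::uint8_t* b, int d) {
    assert(d >= 0 && d % 8 == 0);
    const int codeSize = d / 8;
    std::int32_t dist = 0;
    for (int i = 0; i < codeSize; ++i) {
        std::uint8_t diff = static_cast<std::uint8_t>(a[i] ^ b[i]);
        while (diff != 0) {
            // Clear the lowest set bit; one iteration per differing bit.
            diff = static_cast<std::uint8_t>(diff & (diff - 1));
            ++dist;
        }
    }
    return dist;
}
```

The result type matches distance_t (int32_t) as declared for this index.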
-
idx_t ntotal
total nb of indexed vectors
-
bool verbose
verbosity level
-
bool is_trained
set if the Index does not require training, or if training is done already
-
MetricType metric_type
type of metric this index uses for search
Protected Functions
-
void searchFromCpuPaged_(int n, const uint8_t *x, int k, int32_t *outDistancesData, int *outIndicesData) const
Called from search when the input data is on the CPU; potentially allows for pinned memory usage
-
void searchNonPaged_(int n, const uint8_t *x, int k, int32_t *outDistancesData, int *outIndicesData) const
Protected Attributes
-
std::shared_ptr<GpuResources> resources_
Manages streams, cuBLAS handles and scratch memory for devices.
-
const GpuIndexBinaryFlatConfig binaryFlatConfig_
Configuration options.
-
std::unique_ptr<BinaryFlatIndex> data_
Holds our GPU data containing the list of vectors.
-
struct GpuIndexBinaryFlatConfig : public faiss::gpu::GpuIndexConfig
Public Members
-
int device
GPU device on which the index is resident.
-
MemorySpace memorySpace
What memory space to use for primary storage. On Pascal and above (CC 6+) architectures, allows GPUs to use more memory than is available on the GPU.
-
struct GpuIndexConfig
Subclassed by faiss::gpu::GpuIndexBinaryFlatConfig, faiss::gpu::GpuIndexFlatConfig, faiss::gpu::GpuIndexIVFConfig
Public Functions
-
inline GpuIndexConfig()
Public Members
-
int device
GPU device on which the index is resident.
-
MemorySpace memorySpace
What memory space to use for primary storage. On Pascal and above (CC 6+) architectures, allows GPUs to use more memory than is available on the GPU.
-
class GpuIndexFlat : public faiss::gpu::GpuIndex
- #include <GpuIndexFlat.h>
Wrapper around the GPU implementation that looks like faiss::IndexFlat; copies over centroid data from a given faiss::IndexFlat
Subclassed by faiss::gpu::GpuIndexFlatIP, faiss::gpu::GpuIndexFlatL2
Public Types
-
using idx_t = int64_t
all indices are this type
-
using component_t = float
-
using distance_t = float
Public Functions
-
GpuIndexFlat(GpuResourcesProvider *provider, const faiss::IndexFlat *index, GpuIndexFlatConfig config = GpuIndexFlatConfig())
Construct from a pre-existing faiss::IndexFlat instance, copying data over to the given GPU
-
GpuIndexFlat(GpuResourcesProvider *provider, int dims, faiss::MetricType metric, GpuIndexFlatConfig config = GpuIndexFlatConfig())
Construct an empty instance that can be added to.
-
~GpuIndexFlat() override
-
void copyFrom(const faiss::IndexFlat *index)
Initialize ourselves from the given CPU index; will overwrite all data in ourselves
-
void copyTo(faiss::IndexFlat *index) const
Copy ourselves to the given CPU index; will overwrite all data in the index instance
-
size_t getNumVecs() const
Returns the number of vectors we contain.
-
virtual void reset() override
Clears all vectors from this index.
-
virtual void train(Index::idx_t n, const float *x) override
This index is not trained, so this does nothing.
-
virtual void reconstruct(Index::idx_t key, float *out) const override
Reconstruction methods; prefer the batch reconstruct as it will be more efficient
-
virtual void reconstruct_n(Index::idx_t i0, Index::idx_t num, float *out) const override
Batch reconstruction method.
-
virtual void compute_residual(const float *x, float *residual, Index::idx_t key) const override
Compute residual.
-
virtual void compute_residual_n(Index::idx_t n, const float *xs, float *residuals, const Index::idx_t *keys) const override
Compute residual (batch mode)
-
inline FlatIndex *getGpuData()
For internal access.
-
int getDevice() const
Returns the device that this index is resident on.
-
std::shared_ptr<GpuResources> getResources()
Returns a reference to our GpuResources object that manages memory, stream and handle resources on the GPU
-
void setMinPagingSize(size_t size)
Set the minimum data size for searches (in MiB) for which we use CPU -> GPU paging
-
size_t getMinPagingSize() const
Returns the current minimum data size for paged searches.
-
virtual void add_with_ids(Index::idx_t n, const float *x, const Index::idx_t *ids) override
x and ids can be resident on the CPU or any GPU; copies are performed as needed. Handles paged adds if the add set is too large; calls addInternal_
-
virtual void assign(Index::idx_t n, const float *x, Index::idx_t *labels, Index::idx_t k = 1) const override
x and labels can be resident on the CPU or any GPU; copies are performed as needed
-
virtual void search(Index::idx_t n, const float *x, Index::idx_t k, float *distances, Index::idx_t *labels) const override
x, distances and labels can be resident on the CPU or any GPU; copies are performed as needed
-
virtual void range_search(idx_t n, const float *x, float radius, RangeSearchResult *result) const
query n vectors of dimension d to the index.
return all vectors with distance < radius. Note that many indexes do not implement the range_search (only the k-NN search is mandatory).
- Parameters
x – input vectors to search, size n * d
radius – search radius
result – result table
-
virtual size_t remove_ids(const IDSelector &sel)
removes IDs from the index. Not supported by all indexes. Returns the number of elements removed.
-
virtual void search_and_reconstruct(idx_t n, const float *x, idx_t k, float *distances, idx_t *labels, float *recons) const
Similar to search, but also reconstructs the stored vectors (or an approximation in the case of lossy coding) for the search results.
If there are not enough results for a query, the resulting arrays are padded with -1s.
- Parameters
recons – reconstructed vectors size (n, k, d)
-
virtual DistanceComputer *get_distance_computer() const
Get a DistanceComputer (defined in AuxIndexStructures) object for this kind of index.
DistanceComputer is implemented for indexes that support random access of their vectors.
-
virtual size_t sa_code_size() const
size of the produced codes in bytes
-
virtual void sa_encode(idx_t n, const float *x, uint8_t *bytes) const
encode a set of vectors
- Parameters
n – number of vectors
x – input vectors, size n * d
bytes – output encoded vectors, size n * sa_code_size()
-
virtual void sa_decode(idx_t n, const uint8_t *bytes, float *x) const
decode a set of vectors
- Parameters
n – number of vectors
bytes – input encoded vectors, size n * sa_code_size()
x – output vectors, size n * d
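The buffer contract shared by sa_encode and sa_decode (n * sa_code_size() bytes on the encoded side, n * d floats on the decoded side) can be shown with a stand-in codec. `RawFloatCodec` below is hypothetical and not part of faiss; it simply stores each vector as its raw float bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in codec: each vector is stored as its raw float bytes,
// so the code size is d * sizeof(float). Real indexes may use lossy codes.
struct RawFloatCodec {
    size_t d;

    size_t codeSize() const { return d * sizeof(float); }

    // bytes must hold n * codeSize() bytes; x holds n * d floats.
    void encode(size_t n, const float* x, uint8_t* bytes) const {
        assert(x != nullptr && bytes != nullptr);
        std::memcpy(bytes, x, n * codeSize());
    }

    // x must hold n * d floats; bytes holds n * codeSize() bytes.
    void decode(size_t n, const uint8_t* bytes, float* x) const {
        assert(x != nullptr && bytes != nullptr);
        std::memcpy(x, bytes, n * codeSize());
    }
};
```

With a lossless codec like this one, decoding returns the input exactly; with a lossy codec the decoded vectors would only approximate it.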
Public Members
-
int d
vector dimension
-
idx_t ntotal
total nb of indexed vectors
-
bool verbose
verbosity level
-
bool is_trained
set if the Index does not require training, or if training is done already
-
MetricType metric_type
type of metric this index uses for search
-
float metric_arg
argument of the metric type
Protected Functions
-
virtual bool addImplRequiresIDs_() const override
Flat index does not require IDs as there is no storage available for them
-
virtual void addImpl_(int n, const float *x, const Index::idx_t *ids) override
Called from GpuIndex for add.
Protected Attributes
-
const GpuIndexFlatConfig flatConfig_
Our configuration options.
-
std::unique_ptr<FlatIndex> data_
Holds our GPU data containing the list of vectors.
-
std::shared_ptr<GpuResources> resources_
Manages streams, cuBLAS handles and scratch memory for devices.
-
const GpuIndexConfig config_
Our configuration options.
-
size_t minPagedSize_
Size above which we page copies from the CPU to GPU.
-
struct GpuIndexFlatConfig : public faiss::gpu::GpuIndexConfig
Public Functions
-
inline GpuIndexFlatConfig()
Public Members
-
bool useFloat16
Whether or not data is stored as float16.
-
bool storeTransposed
Whether or not data is stored (transparently) in a transposed layout, enabling use of the NN GEMM call, which is ~10% faster. This will improve the speed of the flat index, but will substantially slow down any add() calls made, as all data must be transposed, and will increase storage requirements (we store data in both transposed and non-transposed layouts).
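The cost described for storeTransposed comes from keeping a second, transposed copy of the data. A minimal sketch of that transposition in plain C++ follows; `transposeRowMajor` is a hypothetical helper, not the faiss routine.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Transpose a row-major n x d matrix into a row-major d x n matrix.
// Keeping both the input and the output doubles the storage used.
std::vector<float> transposeRowMajor(const std::vector<float>& data, size_t n, size_t d) {
    assert(data.size() == n * d);
    std::vector<float> out(n * d);
    for (size_t i = 0; i < n; ++i) {
        for (size_t j = 0; j < d; ++j) {
            out[j * n + i] = data[i * d + j];
        }
    }
    return out;
}
```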
-
int device
GPU device on which the index is resident.
-
MemorySpace memorySpace
What memory space to use for primary storage. On Pascal and above (CC 6+) architectures, allows GPUs to use more memory than is available on the GPU.
-
class GpuIndexFlatIP : public faiss::gpu::GpuIndexFlat
- #include <GpuIndexFlat.h>
Wrapper around the GPU implementation that looks like faiss::IndexFlatIP; copies over centroid data from a given faiss::IndexFlat
Public Types
-
using idx_t = int64_t
all indices are this type
-
using component_t = float
-
using distance_t = float
Public Functions
-
GpuIndexFlatIP(GpuResourcesProvider *provider, faiss::IndexFlatIP *index, GpuIndexFlatConfig config = GpuIndexFlatConfig())
Construct from a pre-existing faiss::IndexFlatIP instance, copying data over to the given GPU
-
GpuIndexFlatIP(GpuResourcesProvider *provider, int dims, GpuIndexFlatConfig config = GpuIndexFlatConfig())
Construct an empty instance that can be added to.
-
void copyFrom(faiss::IndexFlat *index)
Initialize ourselves from the given CPU index; will overwrite all data in ourselves
-
void copyTo(faiss::IndexFlat *index)
Copy ourselves to the given CPU index; will overwrite all data in the index instance
-
void copyFrom(const faiss::IndexFlat *index)
Initialize ourselves from the given CPU index; will overwrite all data in ourselves
-
void copyTo(faiss::IndexFlat *index) const
Copy ourselves to the given CPU index; will overwrite all data in the index instance
-
size_t getNumVecs() const
Returns the number of vectors we contain.
-
virtual void reset() override
Clears all vectors from this index.
-
virtual void train(Index::idx_t n, const float *x) override
This index is not trained, so this does nothing.
-
virtual void reconstruct(Index::idx_t key, float *out) const override
Reconstruction methods; prefer the batch reconstruct as it will be more efficient
-
virtual void reconstruct_n(Index::idx_t i0, Index::idx_t num, float *out) const override
Batch reconstruction method.
-
virtual void compute_residual(const float *x, float *residual, Index::idx_t key) const override
Compute residual.
-
virtual void compute_residual_n(Index::idx_t n, const float *xs, float *residuals, const Index::idx_t *keys) const override
Compute residual (batch mode)
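For a flat index, where reconstruction returns the stored vector itself, the residual for a key is the input vector minus the stored vector. The sketch below shows that arithmetic in plain C++; `computeResidual` is a hypothetical free function and the flat row-major storage is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// residual = x - stored[key], for a flat row-major store of dimension d.
std::vector<float> computeResidual(const std::vector<float>& stored,
                                   size_t d,
                                   const std::vector<float>& x,
                                   int64_t key) {
    assert(d > 0 && x.size() == d);
    assert(key >= 0 && (static_cast<size_t>(key) + 1) * d <= stored.size());
    const float* base = stored.data() + static_cast<size_t>(key) * d;
    std::vector<float> residual(d);
    for (size_t j = 0; j < d; ++j) {
        residual[j] = x[j] - base[j];
    }
    return residual;
}
```

The batch variant would apply the same subtraction to each of the n (vector, key) pairs.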
-
inline FlatIndex *getGpuData()
For internal access.
-
int getDevice() const
Returns the device that this index is resident on.
-
std::shared_ptr<GpuResources> getResources()
Returns a reference to our GpuResources object that manages memory, stream and handle resources on the GPU
-
void setMinPagingSize(size_t size)
Set the minimum data size for searches (in MiB) for which we use CPU -> GPU paging
-
size_t getMinPagingSize() const
Returns the current minimum data size for paged searches.
-
virtual void add_with_ids(Index::idx_t n, const float *x, const Index::idx_t *ids) override
x and ids can be resident on the CPU or any GPU; copies are performed as needed. Handles paged adds if the add set is too large; calls addInternal_.
-
virtual void assign(Index::idx_t n, const float *x, Index::idx_t *labels, Index::idx_t k = 1) const override
x and labels can be resident on the CPU or any GPU; copies are performed as needed.
-
virtual void search(Index::idx_t n, const float *x, Index::idx_t k, float *distances, Index::idx_t *labels) const override
x, distances and labels can be resident on the CPU or any GPU; copies are performed as needed.
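What an inner-product k-NN search computes can be sketched with a brute-force version for a single query. This is plain C++ and not the GPU implementation; `topKInnerProduct` is a hypothetical name, and padding missing results with -1 is assumed here by analogy with the note on search_and_reconstruct.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <vector>

// Brute-force top-k by inner product for one query over a row-major
// database. Labels are padded with -1 when fewer than k vectors exist.
std::vector<int64_t> topKInnerProduct(const std::vector<float>& database,
                                      const std::vector<float>& query,
                                      size_t k) {
    size_t d = query.size();
    assert(d > 0 && database.size() % d == 0);
    size_t nb = database.size() / d;

    std::vector<float> scores(nb, 0.0f);
    for (size_t i = 0; i < nb; ++i) {
        for (size_t j = 0; j < d; ++j) {
            scores[i] += database[i * d + j] * query[j];
        }
    }

    std::vector<int64_t> order(nb);
    std::iota(order.begin(), order.end(), int64_t{0});
    // A larger inner product means a better match, so sort descending.
    std::stable_sort(order.begin(), order.end(),
                     [&](int64_t a, int64_t b) { return scores[a] > scores[b]; });

    std::vector<int64_t> labels(k, -1);
    for (size_t j = 0; j < k && j < nb; ++j) {
        labels[j] = order[j];
    }
    return labels;
}
```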
-
virtual void range_search(idx_t n, const float *x, float radius, RangeSearchResult *result) const
Query n vectors of dimension d against the index.
Returns all vectors with distance < radius. Note that many indexes do not implement range_search (only the k-NN search is mandatory).
- Parameters
x – input vectors to search, size n * d
radius – search radius
result – result table
-
virtual size_t remove_ids(const IDSelector &sel)
removes IDs from the index. Not supported by all indexes. Returns the number of elements removed.
-
virtual void search_and_reconstruct(idx_t n, const float *x, idx_t k, float *distances, idx_t *labels, float *recons) const
Similar to search, but also reconstructs the stored vectors (or an approximation in the case of lossy coding) for the search results.
If there are not enough results for a query, the resulting arrays are padded with -1s.
- Parameters
recons – reconstructed vectors size (n, k, d)
-
virtual DistanceComputer *get_distance_computer() const
Get a DistanceComputer (defined in AuxIndexStructures) object for this kind of index.
DistanceComputer is implemented for indexes that support random access of their vectors.
-
virtual size_t sa_code_size() const
size of the produced codes in bytes
-
virtual void sa_encode(idx_t n, const float *x, uint8_t *bytes) const
encode a set of vectors
- Parameters
n – number of vectors
x – input vectors, size n * d
bytes – output encoded vectors, size n * sa_code_size()
-
virtual void sa_decode(idx_t n, const uint8_t *bytes, float *x) const
decode a set of vectors
- Parameters
n – number of vectors
bytes – input encoded vectors, size n * sa_code_size()
x – output vectors, size n * d
Public Members
-
int d
vector dimension
-
idx_t ntotal
total nb of indexed vectors
-
bool verbose
verbosity level
-
bool is_trained
set if the Index does not require training, or if training is done already
-
MetricType metric_type
type of metric this index uses for search
-
float metric_arg
argument of the metric type
Protected Functions
-
virtual bool addImplRequiresIDs_() const override
Flat index does not require IDs as there is no storage available for them
Protected Attributes
-
const GpuIndexFlatConfig flatConfig_
Our configuration options.
-
std::unique_ptr<FlatIndex> data_
Holds our GPU data containing the list of vectors.
-
std::shared_ptr<GpuResources> resources_
Manages streams, cuBLAS handles and scratch memory for devices.
-
const GpuIndexConfig config_
Our configuration options.
-
size_t minPagedSize_
Size above which we page copies from the CPU to GPU.
-
class GpuIndexFlatL2 : public faiss::gpu::GpuIndexFlat
- #include <GpuIndexFlat.h>
Wrapper around the GPU implementation that looks like faiss::IndexFlatL2; copies over centroid data from a given faiss::IndexFlat
Public Types
-
using idx_t = int64_t
all indices are this type
-
using component_t = float
-
using distance_t = float
Public Functions
-
GpuIndexFlatL2(GpuResourcesProvider *provider, faiss::IndexFlatL2 *index, GpuIndexFlatConfig config = GpuIndexFlatConfig())
Construct from a pre-existing faiss::IndexFlatL2 instance, copying data over to the given GPU
-
GpuIndexFlatL2(GpuResourcesProvider *provider, int dims, GpuIndexFlatConfig config = GpuIndexFlatConfig())
Construct an empty instance that can be added to.
-
void copyFrom(faiss::IndexFlat *index)
Initialize ourselves from the given CPU index; will overwrite all data in ourselves
-
void copyTo(faiss::IndexFlat *index)
Copy ourselves to the given CPU index; will overwrite all data in the index instance
-
void copyFrom(const faiss::IndexFlat *index)
Initialize ourselves from the given CPU index; will overwrite all data in ourselves
-
void copyTo(faiss::IndexFlat *index) const
Copy ourselves to the given CPU index; will overwrite all data in the index instance
-
size_t getNumVecs() const
Returns the number of vectors we contain.
-
virtual void reset() override
Clears all vectors from this index.
-
virtual void train(Index::idx_t n, const float *x) override
This index is not trained, so this does nothing.
-
virtual void reconstruct(Index::idx_t key, float *out) const override
Reconstruction methods; prefer the batch reconstruct as it will be more efficient
-
virtual void reconstruct_n(Index::idx_t i0, Index::idx_t num, float *out) const override
Batch reconstruction method.
-
virtual void compute_residual(const float *x, float *residual, Index::idx_t key) const override
Compute residual.
-
virtual void compute_residual_n(Index::idx_t n, const float *xs, float *residuals, const Index::idx_t *keys) const override
Compute residual (batch mode)
-
inline FlatIndex *getGpuData()
For internal access.
-
int getDevice() const
Returns the device that this index is resident on.
-
std::shared_ptr<GpuResources> getResources()
Returns a reference to our GpuResources object that manages memory, stream and handle resources on the GPU
-
void setMinPagingSize(size_t size)
Set the minimum data size for searches (in MiB) for which we use CPU -> GPU paging
-
size_t getMinPagingSize() const
Returns the current minimum data size for paged searches.
-
virtual void add_with_ids(Index::idx_t n, const float *x, const Index::idx_t *ids) override
x and ids can be resident on the CPU or any GPU; copies are performed as needed. Handles paged adds if the add set is too large; calls addInternal_.
-
virtual void assign(Index::idx_t n, const float *x, Index::idx_t *labels, Index::idx_t k = 1) const override
x and labels can be resident on the CPU or any GPU; copies are performed as needed.
-
virtual void search(Index::idx_t n, const float *x, Index::idx_t k, float *distances, Index::idx_t *labels) const override
x, distances and labels can be resident on the CPU or any GPU; copies are performed as needed.
-
virtual void range_search(idx_t n, const float *x, float radius, RangeSearchResult *result) const
Query n vectors of dimension d against the index.
Returns all vectors with distance < radius. Note that many indexes do not implement range_search (only the k-NN search is mandatory).
- Parameters
x – input vectors to search, size n * d
radius – search radius
result – result table
-
virtual size_t remove_ids(const IDSelector &sel)
removes IDs from the index. Not supported by all indexes. Returns the number of elements removed.
-
virtual void search_and_reconstruct(idx_t n, const float *x, idx_t k, float *distances, idx_t *labels, float *recons) const
Similar to search, but also reconstructs the stored vectors (or an approximation in the case of lossy coding) for the search results.
If there are not enough results for a query, the resulting arrays are padded with -1s.
- Parameters
recons – reconstructed vectors size (n, k, d)
-
virtual DistanceComputer *get_distance_computer() const
Get a DistanceComputer (defined in AuxIndexStructures) object for this kind of index.
DistanceComputer is implemented for indexes that support random access of their vectors.
-
virtual size_t sa_code_size() const
size of the produced codes in bytes
-
virtual void sa_encode(idx_t n, const float *x, uint8_t *bytes) const
encode a set of vectors
- Parameters
n – number of vectors
x – input vectors, size n * d
bytes – output encoded vectors, size n * sa_code_size()
-
virtual void sa_decode(idx_t n, const uint8_t *bytes, float *x) const
decode a set of vectors
- Parameters
n – number of vectors
bytes – input encoded vectors, size n * sa_code_size()
x – output vectors, size n * d
Public Members
-
int d
vector dimension
-
idx_t ntotal
total nb of indexed vectors
-
bool verbose
verbosity level
-
bool is_trained
set if the Index does not require training, or if training is done already
-
MetricType metric_type
type of metric this index uses for search
-
float metric_arg
argument of the metric type
Protected Functions
-
virtual bool addImplRequiresIDs_() const override
Flat index does not require IDs as there is no storage available for them
Protected Attributes
-
const GpuIndexFlatConfig flatConfig_
Our configuration options.
-
std::unique_ptr<FlatIndex> data_
Holds our GPU data containing the list of vectors.
-
std::shared_ptr<GpuResources> resources_
Manages streams, cuBLAS handles and scratch memory for devices.
-
const GpuIndexConfig config_
Our configuration options.
-
size_t minPagedSize_
Size above which we page copies from the CPU to GPU.
-
class GpuIndexIVF : public faiss::gpu::GpuIndex
Subclassed by faiss::gpu::GpuIndexIVFFlat, faiss::gpu::GpuIndexIVFPQ, faiss::gpu::GpuIndexIVFScalarQuantizer
Public Types
-
using idx_t = int64_t
all indices are this type
-
using component_t = float
-
using distance_t = float
Public Functions
-
GpuIndexIVF(GpuResourcesProvider *provider, int dims, faiss::MetricType metric, float metricArg, int nlist, GpuIndexIVFConfig config = GpuIndexIVFConfig())
-
~GpuIndexIVF() override
-
int getNumLists() const
Returns the number of inverted lists we’re managing.
-
virtual int getListLength(int listId) const = 0
Returns the number of vectors present in a particular inverted list.
-
virtual std::vector<uint8_t> getListVectorData(int listId, bool gpuFormat = false) const = 0
Return the encoded vector data contained in a particular inverted list, for debugging purposes. If gpuFormat is true, the data is returned as it is encoded in the GPU-side representation. Otherwise, it is converted to the CPU-compliant format, which may differ from the native GPU format.
-
virtual std::vector<Index::idx_t> getListIndices(int listId) const = 0
Return the vector indices contained in a particular inverted list, for debugging purposes.
-
GpuIndexFlat *getQuantizer()
Return the quantizer we’re using.
-
void setNumProbes(int nprobe)
Sets the number of list probes per query.
-
int getNumProbes() const
Returns our current number of list probes per query.
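The number of probes controls how many inverted lists are scanned for each query. A minimal sketch of the selection step, in plain C++ and independent of faiss, is shown below; `selectProbes` is a hypothetical name and squared L2 distance to the centroids is an assumption of the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Return the ids of the nprobe centroids closest to the query (squared L2),
// nearest first. centroids is row-major with one centroid per inverted list.
std::vector<int> selectProbes(const std::vector<float>& centroids,
                              const std::vector<float>& query,
                              int nprobe) {
    size_t d = query.size();
    assert(d > 0 && centroids.size() % d == 0 && nprobe >= 0);
    int nlist = static_cast<int>(centroids.size() / d);

    std::vector<float> dist(nlist, 0.0f);
    for (int c = 0; c < nlist; ++c) {
        for (size_t j = 0; j < d; ++j) {
            float diff = query[j] - centroids[static_cast<size_t>(c) * d + j];
            dist[c] += diff * diff;
        }
    }

    std::vector<int> order(nlist);
    std::iota(order.begin(), order.end(), 0);
    int count = std::min(nprobe, nlist);
    // Only the first `count` entries need to be in sorted order.
    std::partial_sort(order.begin(), order.begin() + count, order.end(),
                      [&](int a, int b) { return dist[a] < dist[b]; });
    order.resize(count);
    return order;
}
```

Raising nprobe scans more lists per query, which generally trades search speed for recall.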
-
int getDevice() const
Returns the device that this index is resident on.
-
std::shared_ptr<GpuResources> getResources()
Returns a reference to our GpuResources object that manages memory, stream and handle resources on the GPU
-
void setMinPagingSize(size_t size)
Set the minimum data size for searches (in MiB) for which we use CPU -> GPU paging
-
size_t getMinPagingSize() const
Returns the current minimum data size for paged searches.
-
virtual void add(Index::idx_t, const float *x) override
x can be resident on the CPU or any GPU; copies are performed as needed. Handles paged adds if the add set is too large; calls addInternal_.
-
virtual void add_with_ids(Index::idx_t n, const float *x, const Index::idx_t *ids) override
x and ids can be resident on the CPU or any GPU; copies are performed as needed. Handles paged adds if the add set is too large; calls addInternal_.
-
virtual void assign(Index::idx_t n, const float *x, Index::idx_t *labels, Index::idx_t k = 1) const override
x and labels can be resident on the CPU or any GPU; copies are performed as needed.
-
virtual void search(Index::idx_t n, const float *x, Index::idx_t k, float *distances, Index::idx_t *labels) const override
x, distances and labels can be resident on the CPU or any GPU; copies are performed as needed.
-
virtual void compute_residual(const float *x, float *residual, Index::idx_t key) const override
Overridden to force GPU indices to provide their own GPU-friendly implementation
-
virtual void compute_residual_n(Index::idx_t n, const float *xs, float *residuals, const Index::idx_t *keys) const override
Overridden to force GPU indices to provide their own GPU-friendly implementation
-
virtual void train(idx_t n, const float *x)
Perform training on a representative set of vectors
- Parameters
n – number of training vectors
x – training vectors, size n * d
-
virtual void range_search(idx_t n, const float *x, float radius, RangeSearchResult *result) const
Query n vectors of dimension d against the index.
Returns all vectors with distance < radius. Note that many indexes do not implement range_search (only the k-NN search is mandatory).
- Parameters
x – input vectors to search, size n * d
radius – search radius
result – result table
-
virtual void reset() = 0
removes all elements from the database.
-
virtual size_t remove_ids(const IDSelector &sel)
removes IDs from the index. Not supported by all indexes. Returns the number of elements removed.
-
virtual void reconstruct(idx_t key, float *recons) const
Reconstruct a stored vector (or an approximation if lossy coding)
this function may not be defined for some indexes
- Parameters
key – id of the vector to reconstruct
recons – reconstructed vector (size d)
-
virtual void reconstruct_n(idx_t i0, idx_t ni, float *recons) const
Reconstruct vectors i0 to i0 + ni - 1
this function may not be defined for some indexes
- Parameters
recons – reconstructed vectors (size ni * d)
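The range covered by reconstruct_n (vectors i0 through i0 + ni - 1, written into ni * d floats) can be illustrated with a flat row-major store. This is a sketch under that storage assumption, not how an IVF index keeps its data; `reconstructRange` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Copy vectors i0 .. i0 + ni - 1 out of a flat row-major store of
// dimension d. The result holds ni * d floats.
std::vector<float> reconstructRange(const std::vector<float>& stored,
                                    size_t d,
                                    int64_t i0,
                                    int64_t ni) {
    assert(d > 0 && i0 >= 0 && ni >= 0);
    size_t begin = static_cast<size_t>(i0) * d;
    size_t end = begin + static_cast<size_t>(ni) * d;
    assert(end <= stored.size());
    return std::vector<float>(stored.begin() + begin, stored.begin() + end);
}
```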
-
virtual void search_and_reconstruct(idx_t n, const float *x, idx_t k, float *distances, idx_t *labels, float *recons) const
Similar to search, but also reconstructs the stored vectors (or an approximation in the case of lossy coding) for the search results.
If there are not enough results for a query, the resulting arrays are padded with -1s.
- Parameters
recons – reconstructed vectors size (n, k, d)
-
virtual DistanceComputer *get_distance_computer() const
Get a DistanceComputer (defined in AuxIndexStructures) object for this kind of index.
DistanceComputer is implemented for indexes that support random access of their vectors.
-
virtual size_t sa_code_size() const
size of the produced codes in bytes
-
virtual void sa_encode(idx_t n, const float *x, uint8_t *bytes) const
encode a set of vectors
- Parameters
n – number of vectors
x – input vectors, size n * d
bytes – output encoded vectors, size n * sa_code_size()
-
virtual void sa_decode(idx_t n, const uint8_t *bytes, float *x) const
decode a set of vectors
- Parameters
n – number of vectors
bytes – input encoded vectors, size n * sa_code_size()
x – output vectors, size n * d
Public Members
-
ClusteringParameters cp
Exposing this like the CPU version for manipulation.
-
int nlist
Exposing this like the CPU version for query.
-
int nprobe
Exposing this like the CPU version for manipulation.
-
GpuIndexFlat *quantizer
Exposing this like the CPU version for query.
-
int d
vector dimension
-
idx_t ntotal
total nb of indexed vectors
-
bool verbose
verbosity level
-
bool is_trained
set if the Index does not require training, or if training is done already
-
MetricType metric_type
type of metric this index uses for search
-
float metric_arg
argument of the metric type
Protected Functions
-
virtual bool addImplRequiresIDs_() const override
Does addImpl_ require IDs? If so, and no IDs are provided, we will generate them sequentially based on the order in which the IDs are added
Protected Attributes
-
const GpuIndexIVFConfig ivfConfig_
Our configuration options.
-
std::shared_ptr<GpuResources> resources_
Manages streams, cuBLAS handles and scratch memory for devices.
-
const GpuIndexConfig config_
Our configuration options.
-
size_t minPagedSize_
Size above which we page copies from the CPU to GPU.
Private Functions
-
void init_()
Shared initialization functions.
-
struct GpuIndexIVFConfig : public faiss::gpu::GpuIndexConfig
Subclassed by faiss::gpu::GpuIndexIVFFlatConfig, faiss::gpu::GpuIndexIVFPQConfig, faiss::gpu::GpuIndexIVFScalarQuantizerConfig
Public Functions
-
inline GpuIndexIVFConfig()
Public Members
-
IndicesOptions indicesOptions
Index storage options for the GPU.
-
GpuIndexFlatConfig flatConfig
Configuration for the coarse quantizer object.
-
int device
GPU device on which the index is resident.
-
MemorySpace memorySpace
What memory space to use for primary storage. On Pascal and above (CC 6+) architectures, allows GPUs to use more memory than is available on the GPU.
-
class GpuIndexIVFFlat : public faiss::gpu::GpuIndexIVF
- #include <GpuIndexIVFFlat.h>
Wrapper around the GPU implementation that looks like faiss::IndexIVFFlat
Public Types
-
using idx_t = int64_t
all indices are this type
-
using component_t = float
-
using distance_t = float
Public Functions
-
GpuIndexIVFFlat(GpuResourcesProvider *provider, const faiss::IndexIVFFlat *index, GpuIndexIVFFlatConfig config = GpuIndexIVFFlatConfig())
Construct from a pre-existing faiss::IndexIVFFlat instance, copying data over to the given GPU, if the input index is trained.
-
GpuIndexIVFFlat(GpuResourcesProvider *provider, int dims, int nlist, faiss::MetricType metric, GpuIndexIVFFlatConfig config = GpuIndexIVFFlatConfig())
Constructs a new instance with an empty flat quantizer; the user provides the number of lists desired.
-
~GpuIndexIVFFlat() override
-
void reserveMemory(size_t numVecs)
Reserve GPU memory in our inverted lists for this number of vectors.
-
void copyFrom(const faiss::IndexIVFFlat *index)
Initialize ourselves from the given CPU index; will overwrite all data in ourselves
-
void copyTo(faiss::IndexIVFFlat *index) const
Copy ourselves to the given CPU index; will overwrite all data in the index instance
-
size_t reclaimMemory()
After adding vectors, one can call this to reclaim device memory to exactly the amount needed. Returns space reclaimed in bytes
-
virtual void reset() override
Clears out all inverted lists, but retains the coarse centroid information
-
virtual void train(Index::idx_t n, const float *x) override
Trains the coarse quantizer based on the given vector data.
-
virtual int getListLength(int listId) const override
Returns the number of vectors present in a particular inverted list.
-
virtual std::vector<uint8_t> getListVectorData(int listId, bool gpuFormat = false) const override
Return the encoded vector data contained in a particular inverted list, for debugging purposes. If gpuFormat is true, the data is returned as it is encoded in the GPU-side representation. Otherwise, it is converted to the CPU-compliant format, which may differ from the native GPU format.
-
virtual std::vector<Index::idx_t> getListIndices(int listId) const override
Return the vector indices contained in a particular inverted list, for debugging purposes.
-
int getNumLists() const
Returns the number of inverted lists we’re managing.
-
GpuIndexFlat *getQuantizer()
Return the quantizer we’re using.
-
void setNumProbes(int nprobe)
Sets the number of list probes per query.
-
int getNumProbes() const
Returns our current number of list probes per query.
-
int getDevice() const
Returns the device that this index is resident on.
-
std::shared_ptr<GpuResources> getResources()
Returns a reference to our GpuResources object that manages memory, stream and handle resources on the GPU
-
void setMinPagingSize(size_t size)
Set the minimum data size for searches (in MiB) for which we use CPU -> GPU paging
-
size_t getMinPagingSize() const
Returns the current minimum data size for paged searches.
-
virtual void add(Index::idx_t, const float *x) override
x can be resident on the CPU or any GPU; copies are performed as needed. Handles paged adds if the add set is too large; calls addInternal_.
-
virtual void add_with_ids(Index::idx_t n, const float *x, const Index::idx_t *ids) override
x and ids can be resident on the CPU or any GPU; copies are performed as needed. Handles paged adds if the add set is too large; calls addInternal_.
-
virtual void assign(Index::idx_t n, const float *x, Index::idx_t *labels, Index::idx_t k = 1) const override
x and labels can be resident on the CPU or any GPU; copies are performed as needed.
-
virtual void search(Index::idx_t n, const float *x, Index::idx_t k, float *distances, Index::idx_t *labels) const override
x, distances and labels can be resident on the CPU or any GPU; copies are performed as needed.
-
virtual void compute_residual(const float *x, float *residual, Index::idx_t key) const override
Overridden to force GPU indices to provide their own GPU-friendly implementation
-
virtual void compute_residual_n(Index::idx_t n, const float *xs, float *residuals, const Index::idx_t *keys) const override
Overridden to force GPU indices to provide their own GPU-friendly implementation
-
virtual void range_search(idx_t n, const float *x, float radius, RangeSearchResult *result) const
Query n vectors of dimension d against the index.
Returns all vectors with distance < radius. Note that many indexes do not implement range_search (only the k-NN search is mandatory).
- Parameters
x – input vectors to search, size n * d
radius – search radius
result – result table
-
virtual size_t remove_ids(const IDSelector &sel)
removes IDs from the index. Not supported by all indexes. Returns the number of elements removed.
-
virtual void reconstruct(idx_t key, float *recons) const
Reconstruct a stored vector (or an approximation if lossy coding)
this function may not be defined for some indexes
- Parameters
key – id of the vector to reconstruct
recons – reconstructed vector (size d)
-
virtual void reconstruct_n(idx_t i0, idx_t ni, float *recons) const
Reconstruct vectors i0 to i0 + ni - 1
this function may not be defined for some indexes
- Parameters
recons – reconstructed vectors (size ni * d)
-
virtual void search_and_reconstruct(idx_t n, const float *x, idx_t k, float *distances, idx_t *labels, float *recons) const
Similar to search, but also reconstructs the stored vectors (or an approximation in the case of lossy coding) for the search results.
If there are not enough results for a query, the resulting arrays are padded with -1s.
- Parameters
recons – reconstructed vectors size (n, k, d)
-
virtual DistanceComputer *get_distance_computer() const
Get a DistanceComputer (defined in AuxIndexStructures) object for this kind of index.
DistanceComputer is implemented for indexes that support random access of their vectors.
-
virtual size_t sa_code_size() const
size of the produced codes in bytes
-
virtual void sa_encode(idx_t n, const float *x, uint8_t *bytes) const
encode a set of vectors
- Parameters
n – number of vectors
x – input vectors, size n * d
bytes – output encoded vectors, size n * sa_code_size()
-
virtual void sa_decode(idx_t n, const uint8_t *bytes, float *x) const
decode a set of vectors
- Parameters
n – number of vectors
bytes – input encoded vectors, size n * sa_code_size()
x – output vectors, size n * d
Public Members
-
ClusteringParameters cp
Exposing this like the CPU version for manipulation.
-
int nlist
Exposing this like the CPU version for query.
-
int nprobe
Exposing this like the CPU version for manipulation.
-
GpuIndexFlat *quantizer
Exposing this like the CPU version for query.
-
int d
vector dimension
-
idx_t ntotal
total nb of indexed vectors
-
bool verbose
verbosity level
-
bool is_trained
set if the Index does not require training, or if training is done already
-
MetricType metric_type
type of metric this index uses for search
-
float metric_arg
argument of the metric type
Protected Functions
-
virtual void addImpl_(int n, const float *x, const Index::idx_t *ids) override
Called from GpuIndex for add/add_with_ids.
-
virtual void searchImpl_(int n, const float *x, int k, float *distances, Index::idx_t *labels) const override
Called from GpuIndex for search.
-
virtual bool addImplRequiresIDs_() const override
Does addImpl_ require IDs? If so, and no IDs are provided, we will generate them sequentially based on the order in which the IDs are added
Protected Attributes
-
const GpuIndexIVFFlatConfig ivfFlatConfig_
Our configuration options.
-
size_t reserveMemoryVecs_
Desired inverted list memory reservation.
-
std::unique_ptr<IVFFlat> index_
Instance that we own; contains the inverted list.
-
const GpuIndexIVFConfig ivfConfig_
Our configuration options.
-
std::shared_ptr<GpuResources> resources_
Manages streams, cuBLAS handles and scratch memory for devices.
-
const GpuIndexConfig config_
Our configuration options.
-
size_t minPagedSize_
Size above which we page copies from the CPU to GPU.
-
struct GpuIndexIVFFlatConfig : public faiss::gpu::GpuIndexIVFConfig
Public Functions
-
inline GpuIndexIVFFlatConfig()
Public Members
-
bool interleavedLayout
Use the alternative memory layout for the IVF lists (currently the default)
-
IndicesOptions indicesOptions
Index storage options for the GPU.
-
GpuIndexFlatConfig flatConfig
Configuration for the coarse quantizer object.
-
int device
GPU device on which the index is resident.
-
MemorySpace memorySpace
What memory space to use for primary storage. On Pascal and above (CC 6+) architectures, allows GPUs to use more memory than is available on the GPU.
-
class GpuIndexIVFPQ : public faiss::gpu::GpuIndexIVF
- #include <GpuIndexIVFPQ.h>
IVFPQ index for the GPU.
Public Types
-
using idx_t = int64_t
all indices are this type
-
using component_t = float
-
using distance_t = float
Public Functions
-
GpuIndexIVFPQ(GpuResourcesProvider *provider, const faiss::IndexIVFPQ *index, GpuIndexIVFPQConfig config = GpuIndexIVFPQConfig())
Construct from a pre-existing faiss::IndexIVFPQ instance, copying data over to the given GPU, if the input index is trained.
-
GpuIndexIVFPQ(GpuResourcesProvider *provider, int dims, int nlist, int subQuantizers, int bitsPerCode, faiss::MetricType metric, GpuIndexIVFPQConfig config = GpuIndexIVFPQConfig())
Construct an empty index.
-
~GpuIndexIVFPQ() override
-
void copyFrom(const faiss::IndexIVFPQ *index)
Initialize ourselves from the given CPU index; will overwrite all data in ourselves
-
void copyTo(faiss::IndexIVFPQ *index) const
Copy ourselves to the given CPU index; will overwrite all data in the index instance
-
void reserveMemory(size_t numVecs)
Reserve GPU memory in our inverted lists for this number of vectors.
-
void setPrecomputedCodes(bool enable)
Enable or disable pre-computed codes.
-
bool getPrecomputedCodes() const
Are pre-computed codes enabled?
-
int getNumSubQuantizers() const
Return the number of sub-quantizers we are using.
-
int getBitsPerCode() const
Return the number of bits per PQ code.
-
int getCentroidsPerSubQuantizer() const
Return the number of centroids per PQ code (2^bits per code)
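The relationship between bits per code and centroids per sub-quantizer is a power of two, and it also determines how large each encoded vector is. The sketch below works through that arithmetic; both function names are hypothetical, and the code-size formula assumes codes are tightly bit-packed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Number of centroids per sub-quantizer: 2^bitsPerCode.
int64_t centroidsPerSubQuantizer(int bitsPerCode) {
    assert(bitsPerCode >= 0 && bitsPerCode < 63);
    return int64_t{1} << bitsPerCode;
}

// Bytes per encoded vector, assuming tightly bit-packed PQ codes
// (rounded up to a whole number of bytes).
size_t pqCodeBytes(int subQuantizers, int bitsPerCode) {
    assert(subQuantizers > 0 && bitsPerCode > 0);
    return (static_cast<size_t>(subQuantizers) * static_cast<size_t>(bitsPerCode) + 7) / 8;
}
```

For example, 8 bits per code gives 256 centroids per sub-quantizer, and 16 sub-quantizers at 8 bits each give 16 bytes per vector under this assumption.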
-
size_t reclaimMemory()
After adding vectors, one can call this to reclaim device memory to exactly the amount needed. Returns space reclaimed in bytes
-
virtual void reset() override
Clears out all inverted lists, but retains the coarse and product centroid information
-
virtual void train(Index::idx_t n, const float *x) override
Trains the coarse and product quantizer based on the given vector data.
-
virtual int getListLength(int listId) const override
Returns the number of vectors present in a particular inverted list.
-
virtual std::vector<uint8_t> getListVectorData(int listId, bool gpuFormat = false) const override
Return the encoded vector data contained in a particular inverted list, for debugging purposes. If gpuFormat is true, the data is returned as it is encoded in the GPU-side representation. Otherwise, it is converted to the CPU-compliant format, which may differ from the native GPU format.
-
virtual std::vector<Index::idx_t> getListIndices(int listId) const override
Return the vector indices contained in a particular inverted list, for debugging purposes.
-
int getNumLists() const
Returns the number of inverted lists we’re managing.
-
GpuIndexFlat *getQuantizer()
Return the quantizer we’re using.
-
void setNumProbes(int nprobe)
Sets the number of list probes per query.
-
int getNumProbes() const
Returns our current number of list probes per query.
-
int getDevice() const
Returns the device that this index is resident on.
-
std::shared_ptr<GpuResources> getResources()
Returns a reference to our GpuResources object that manages memory, stream and handle resources on the GPU
-
void setMinPagingSize(size_t size)
Set the minimum data size for searches (in MiB) for which we use CPU -> GPU paging
-
size_t getMinPagingSize() const
Returns the current minimum data size for paged searches.
-
virtual void add(Index::idx_t, const float *x) override
x can be resident on the CPU or any GPU; copies are performed as needed. Handles paged adds if the add set is too large; calls addInternal_
-
virtual void add_with_ids(Index::idx_t n, const float *x, const Index::idx_t *ids) override
x and ids can be resident on the CPU or any GPU; copies are performed as needed. Handles paged adds if the add set is too large; calls addInternal_
-
virtual void assign(Index::idx_t n, const float *x, Index::idx_t *labels, Index::idx_t k = 1) const override
x and labels can be resident on the CPU or any GPU; copies are performed as needed
-
virtual void search(Index::idx_t n, const float *x, Index::idx_t k, float *distances, Index::idx_t *labels) const override
x, distances and labels can be resident on the CPU or any GPU; copies are performed as needed
-
virtual void compute_residual(const float *x, float *residual, Index::idx_t key) const override
Overridden to force GPU indices to provide their own GPU-friendly implementation
-
virtual void compute_residual_n(Index::idx_t n, const float *xs, float *residuals, const Index::idx_t *keys) const override
Overridden to force GPU indices to provide their own GPU-friendly implementation
-
virtual void range_search(idx_t n, const float *x, float radius, RangeSearchResult *result) const
Query n vectors of dimension d against the index and return all vectors with distance < radius. Note that many indexes do not implement range_search (only the k-NN search is mandatory).
- Parameters
x – input vectors to search, size n * d
radius – search radius
result – result table
-
virtual size_t remove_ids(const IDSelector &sel)
removes IDs from the index. Not supported by all indexes. Returns the number of elements removed.
-
virtual void reconstruct(idx_t key, float *recons) const
Reconstruct a stored vector (or an approximation if lossy coding)
this function may not be defined for some indexes
- Parameters
key – id of the vector to reconstruct
recons – reconstructed vector (size d)
-
virtual void reconstruct_n(idx_t i0, idx_t ni, float *recons) const
Reconstruct vectors i0 to i0 + ni - 1
this function may not be defined for some indexes
- Parameters
recons – reconstructed vectors (size ni * d)
-
virtual void search_and_reconstruct(idx_t n, const float *x, idx_t k, float *distances, idx_t *labels, float *recons) const
Similar to search, but also reconstructs the stored vectors (or an approximation in the case of lossy coding) for the search results.
If there are not enough results for a query, the resulting arrays are padded with -1s.
- Parameters
recons – reconstructed vectors size (n, k, d)
-
virtual DistanceComputer *get_distance_computer() const
Get a DistanceComputer (defined in AuxIndexStructures) object for this kind of index.
DistanceComputer is implemented for indexes that support random access of their vectors.
-
virtual size_t sa_code_size() const
size of the produced codes in bytes
-
virtual void sa_encode(idx_t n, const float *x, uint8_t *bytes) const
encode a set of vectors
- Parameters
n – number of vectors
x – input vectors, size n * d
bytes – output encoded vectors, size n * sa_code_size()
-
virtual void sa_decode(idx_t n, const uint8_t *bytes, float *x) const
decode a set of vectors
- Parameters
n – number of vectors
bytes – input encoded vectors, size n * sa_code_size()
x – output vectors, size n * d
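The paged adds mentioned for add and add_with_ids can be pictured as splitting a large host-side batch into bounded chunks that are copied and added one at a time. The following is a host-only sketch of that idea, not the Faiss implementation: `planPages` and `maxPageBytes` are hypothetical names, and the actual threshold and copy logic are internal to the GPU index.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: split n vectors of dimension d into pages whose
// float payload does not exceed maxPageBytes.
// Returns (first vector offset, vector count) for each page.
std::vector<std::pair<std::size_t, std::size_t>> planPages(
        std::size_t n, std::size_t d, std::size_t maxPageBytes) {
    assert(d > 0);
    const std::size_t bytesPerVec = d * sizeof(float);
    // Always make progress, even if one vector exceeds the page budget.
    const std::size_t vecsPerPage =
            std::max<std::size_t>(1, maxPageBytes / bytesPerVec);

    std::vector<std::pair<std::size_t, std::size_t>> pages;
    for (std::size_t start = 0; start < n; start += vecsPerPage) {
        pages.emplace_back(start, std::min(vecsPerPage, n - start));
    }
    return pages;
}
```

Each returned pair could then drive one copy-and-add step over `x + offset * d`, with the matching slice of `ids` when IDs are supplied.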
Public Members
-
ProductQuantizer pq
Like the CPU version, we expose a publicly visible ProductQuantizer for manipulation
-
ClusteringParameters cp
Exposing this like the CPU version for manipulation.
-
int nlist
Exposing this like the CPU version for query.
-
int nprobe
Exposing this like the CPU version for manipulation.
-
GpuIndexFlat *quantizer
Exposing this like the CPU version for query.
-
int d
vector dimension
-
idx_t ntotal
total nb of indexed vectors
-
bool verbose
verbosity level
-
bool is_trained
set if the Index does not require training, or if training is done already
-
MetricType metric_type
type of metric this index uses for search
-
float metric_arg
argument of the metric type
Protected Functions
-
virtual void addImpl_(int n, const float *x, const Index::idx_t *ids) override
Called from GpuIndex for add/add_with_ids.
-
virtual void searchImpl_(int n, const float *x, int k, float *distances, Index::idx_t *labels) const override
Called from GpuIndex for search.
-
void verifySettings_() const
Throws errors if configuration settings are improper.
-
void trainResidualQuantizer_(Index::idx_t n, const float *x)
Trains the PQ quantizer based on the given vector data.
-
virtual bool addImplRequiresIDs_() const override
Does addImpl_ require IDs? If so, and no IDs are provided, we will generate them sequentially based on the order in which the IDs are added
Protected Attributes
-
const GpuIndexIVFPQConfig ivfpqConfig_
Our configuration options that we were initialized with.
-
bool usePrecomputedTables_
Runtime override: whether or not we use precomputed tables.
-
int subQuantizers_
Number of sub-quantizers per encoded vector.
-
int bitsPerCode_
Bits per sub-quantizer code.
-
size_t reserveMemoryVecs_
Desired inverted list memory reservation.
-
std::unique_ptr<IVFPQ> index_
The product quantizer instance that we own; contains the inverted lists
-
const GpuIndexIVFConfig ivfConfig_
Our configuration options.
-
std::shared_ptr<GpuResources> resources_
Manages streams, cuBLAS handles and scratch memory for devices.
-
const GpuIndexConfig config_
Our configuration options.
-
size_t minPagedSize_
Size above which we page copies from the CPU to GPU.
-
using idx_t = int64_t
-
struct GpuIndexIVFPQConfig : public faiss::gpu::GpuIndexIVFConfig
Public Functions
-
inline GpuIndexIVFPQConfig()
Public Members
-
bool useFloat16LookupTables
Whether or not float16 residual distance tables are used in the list scanning kernels. When subQuantizers * 2^bitsPerCode > 16384, this is required.
-
bool usePrecomputedTables
Whether or not we enable the precomputed table option for search, which can substantially increase the memory requirement.
-
bool interleavedLayout
Use the alternative memory layout for the IVF lists. WARNING: this is a feature under development, do not use!
-
bool useMMCodeDistance
Use GEMM-backed computation of PQ code distances for the no precomputed table version of IVFPQ. This is for debugging purposes; it should not substantially affect the results one way or another.
Note that MM code distance is enabled automatically if one uses a number of dimensions per sub-quantizer that is not natively specialized (an odd number like 7 or so).
-
IndicesOptions indicesOptions
Index storage options for the GPU.
-
GpuIndexFlatConfig flatConfig
Configuration for the coarse quantizer object.
-
int device
GPU device on which the index is resident.
-
MemorySpace memorySpace
What memory space to use for primary storage. On Pascal and above (CC 6+) architectures, allows GPUs to use more memory than is available on the GPU.
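The useFloat16LookupTables condition above combines two quantities documented for GpuIndexIVFPQ: the number of sub-quantizers and the number of centroids per sub-quantizer (2^bitsPerCode). A small sketch of that arithmetic follows; the function names are hypothetical and only restate the documented threshold of 16384 entries, they are not part of the Faiss API.

```cpp
#include <cassert>

// Number of centroids addressed by one PQ code of the given width.
int centroidsPerSubQuantizer(int bitsPerCode) {
    assert(bitsPerCode > 0 && bitsPerCode < 31);
    return 1 << bitsPerCode;
}

// Restates the documented rule: float16 lookup tables are required
// when subQuantizers * 2^bitsPerCode exceeds 16384.
bool requiresFloat16LookupTables(int subQuantizers, int bitsPerCode) {
    const long long entries = static_cast<long long>(subQuantizers) *
            centroidsPerSubQuantizer(bitsPerCode);
    return entries > 16384;
}
```

For example, 64 sub-quantizers at 8 bits give exactly 16384 entries, which does not exceed the threshold, while 96 sub-quantizers at 8 bits give 24576 entries and would need the float16 tables.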
-
class GpuIndexIVFScalarQuantizer : public faiss::gpu::GpuIndexIVF
- #include <GpuIndexIVFScalarQuantizer.h>
Wrapper around the GPU implementation that looks like faiss::IndexIVFScalarQuantizer
Public Types
-
using idx_t = int64_t
all indices are this type
-
using component_t = float
-
using distance_t = float
Public Functions
-
GpuIndexIVFScalarQuantizer(GpuResourcesProvider *provider, const faiss::IndexIVFScalarQuantizer *index, GpuIndexIVFScalarQuantizerConfig config = GpuIndexIVFScalarQuantizerConfig())
Construct from a pre-existing faiss::IndexIVFScalarQuantizer instance, copying data over to the given GPU, if the input index is trained.
-
GpuIndexIVFScalarQuantizer(GpuResourcesProvider *provider, int dims, int nlist, faiss::ScalarQuantizer::QuantizerType qtype, faiss::MetricType metric = MetricType::METRIC_L2, bool encodeResidual = true, GpuIndexIVFScalarQuantizerConfig config = GpuIndexIVFScalarQuantizerConfig())
Constructs a new instance with an empty flat quantizer; the user provides the number of lists desired.
-
~GpuIndexIVFScalarQuantizer() override
-
void reserveMemory(size_t numVecs)
Reserve GPU memory in our inverted lists for this number of vectors.
-
void copyFrom(const faiss::IndexIVFScalarQuantizer *index)
Initialize ourselves from the given CPU index; will overwrite all data in ourselves
-
void copyTo(faiss::IndexIVFScalarQuantizer *index) const
Copy ourselves to the given CPU index; will overwrite all data in the index instance
-
size_t reclaimMemory()
After adding vectors, one can call this to reclaim device memory to exactly the amount needed. Returns space reclaimed in bytes
-
virtual void reset() override
Clears out all inverted lists, but retains the coarse and scalar quantizer information
-
virtual void train(Index::idx_t n, const float *x) override
Trains the coarse and scalar quantizer based on the given vector data.
-
virtual int getListLength(int listId) const override
Returns the number of vectors present in a particular inverted list.
-
virtual std::vector<uint8_t> getListVectorData(int listId, bool gpuFormat = false) const override
Return the encoded vector data contained in a particular inverted list, for debugging purposes. If gpuFormat is true, the data is returned as it is encoded in the GPU-side representation. Otherwise, it is converted to the CPU-compliant format, which may differ from the native GPU format.
-
virtual std::vector<Index::idx_t> getListIndices(int listId) const override
Return the vector indices contained in a particular inverted list, for debugging purposes.
-
int getNumLists() const
Returns the number of inverted lists we’re managing.
-
GpuIndexFlat *getQuantizer()
Return the quantizer we’re using.
-
void setNumProbes(int nprobe)
Sets the number of list probes per query.
-
int getNumProbes() const
Returns our current number of list probes per query.
-
int getDevice() const
Returns the device that this index is resident on.
-
std::shared_ptr<GpuResources> getResources()
Returns a reference to our GpuResources object that manages memory, stream and handle resources on the GPU
-
void setMinPagingSize(size_t size)
Set the minimum data size for searches (in MiB) for which we use CPU -> GPU paging
-
size_t getMinPagingSize() const
Returns the current minimum data size for paged searches.
-
virtual void add(Index::idx_t, const float *x) override
x can be resident on the CPU or any GPU; copies are performed as needed. Handles paged adds if the add set is too large; calls addInternal_
-
virtual void add_with_ids(Index::idx_t n, const float *x, const Index::idx_t *ids) override
x and ids can be resident on the CPU or any GPU; copies are performed as needed. Handles paged adds if the add set is too large; calls addInternal_
-
virtual void assign(Index::idx_t n, const float *x, Index::idx_t *labels, Index::idx_t k = 1) const override
x and labels can be resident on the CPU or any GPU; copies are performed as needed
-
virtual void search(Index::idx_t n, const float *x, Index::idx_t k, float *distances, Index::idx_t *labels) const override
x, distances and labels can be resident on the CPU or any GPU; copies are performed as needed
-
virtual void compute_residual(const float *x, float *residual, Index::idx_t key) const override
Overridden to force GPU indices to provide their own GPU-friendly implementation
-
virtual void compute_residual_n(Index::idx_t n, const float *xs, float *residuals, const Index::idx_t *keys) const override
Overridden to force GPU indices to provide their own GPU-friendly implementation
-
virtual void range_search(idx_t n, const float *x, float radius, RangeSearchResult *result) const
Query n vectors of dimension d against the index and return all vectors with distance < radius. Note that many indexes do not implement range_search (only the k-NN search is mandatory).
- Parameters
x – input vectors to search, size n * d
radius – search radius
result – result table
-
virtual size_t remove_ids(const IDSelector &sel)
removes IDs from the index. Not supported by all indexes. Returns the number of elements removed.
-
virtual void reconstruct(idx_t key, float *recons) const
Reconstruct a stored vector (or an approximation if lossy coding)
this function may not be defined for some indexes
- Parameters
key – id of the vector to reconstruct
recons – reconstructed vector (size d)
-
virtual void reconstruct_n(idx_t i0, idx_t ni, float *recons) const
Reconstruct vectors i0 to i0 + ni - 1
this function may not be defined for some indexes
- Parameters
recons – reconstructed vectors (size ni * d)
-
virtual void search_and_reconstruct(idx_t n, const float *x, idx_t k, float *distances, idx_t *labels, float *recons) const
Similar to search, but also reconstructs the stored vectors (or an approximation in the case of lossy coding) for the search results.
If there are not enough results for a query, the resulting arrays are padded with -1s.
- Parameters
recons – reconstructed vectors size (n, k, d)
-
virtual DistanceComputer *get_distance_computer() const
Get a DistanceComputer (defined in AuxIndexStructures) object for this kind of index.
DistanceComputer is implemented for indexes that support random access of their vectors.
-
virtual size_t sa_code_size() const
size of the produced codes in bytes
-
virtual void sa_encode(idx_t n, const float *x, uint8_t *bytes) const
encode a set of vectors
- Parameters
n – number of vectors
x – input vectors, size n * d
bytes – output encoded vectors, size n * sa_code_size()
-
virtual void sa_decode(idx_t n, const uint8_t *bytes, float *x) const
decode a set of vectors
- Parameters
n – number of vectors
bytes – input encoded vectors, size n * sa_code_size()
x – output vectors, size n * d
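For compute_residual_n, the residual of a vector is conventionally the vector minus the centroid of the inverted list it is assigned to. The host-side sketch below illustrates that arithmetic only; `computeResidualN` and the row-major `centroids` array are assumptions made for the example, whereas the GPU indices supply their own device-side implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical host-side illustration.
// xs:        n * d input vectors, row-major
// centroids: nlist * d coarse centroids, row-major (assumed layout)
// keys:      n list ids, one per input vector
// residuals: n * d output, residual = x - centroid[key]
void computeResidualN(
        std::size_t n,
        std::size_t d,
        const float* xs,
        const float* centroids,
        const std::int64_t* keys,
        float* residuals) {
    for (std::size_t i = 0; i < n; ++i) {
        assert(keys[i] >= 0);
        const float* c = centroids + static_cast<std::size_t>(keys[i]) * d;
        for (std::size_t j = 0; j < d; ++j) {
            residuals[i * d + j] = xs[i * d + j] - c[j];
        }
    }
}
```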
Public Members
-
faiss::ScalarQuantizer sq
Exposed like the CPU version.
-
bool by_residual
Exposed like the CPU version.
-
ClusteringParameters cp
Exposing this like the CPU version for manipulation.
-
int nlist
Exposing this like the CPU version for query.
-
int nprobe
Exposing this like the CPU version for manipulation.
-
GpuIndexFlat *quantizer
Exposing this like the CPU version for query.
-
int d
vector dimension
-
idx_t ntotal
total nb of indexed vectors
-
bool verbose
verbosity level
-
bool is_trained
set if the Index does not require training, or if training is done already
-
MetricType metric_type
type of metric this index uses for search
-
float metric_arg
argument of the metric type
Protected Functions
-
virtual void addImpl_(int n, const float *x, const Index::idx_t *ids) override
Called from GpuIndex for add/add_with_ids.
-
virtual void searchImpl_(int n, const float *x, int k, float *distances, Index::idx_t *labels) const override
Called from GpuIndex for search.
-
void trainResiduals_(Index::idx_t n, const float *x)
Called from train to handle SQ residual training.
-
virtual bool addImplRequiresIDs_() const override
Does addImpl_ require IDs? If so, and no IDs are provided, we will generate them sequentially based on the order in which the IDs are added
Protected Attributes
-
const GpuIndexIVFScalarQuantizerConfig ivfSQConfig_
Our configuration options.
-
size_t reserveMemoryVecs_
Desired inverted list memory reservation.
-
std::unique_ptr<IVFFlat> index_
Instance that we own; contains the inverted list.
-
const GpuIndexIVFConfig ivfConfig_
Our configuration options.
-
std::shared_ptr<GpuResources> resources_
Manages streams, cuBLAS handles and scratch memory for devices.
-
const GpuIndexConfig config_
Our configuration options.
-
size_t minPagedSize_
Size above which we page copies from the CPU to GPU.
-
struct GpuIndexIVFScalarQuantizerConfig : public faiss::gpu::GpuIndexIVFConfig
Public Functions
-
inline GpuIndexIVFScalarQuantizerConfig()
Public Members
-
bool interleavedLayout
Use the alternative memory layout for the IVF lists (currently the default)
-
IndicesOptions indicesOptions
Index storage options for the GPU.
-
GpuIndexFlatConfig flatConfig
Configuration for the coarse quantizer object.
-
int device
GPU device on which the index is resident.
-
MemorySpace memorySpace
What memory space to use for primary storage. On Pascal and above (CC 6+) architectures, allows GPUs to use more memory than is available on the GPU.
-
struct GpuMemoryReservation
- #include <GpuResources.h>
A RAII object that manages a temporary memory request.
Public Functions
-
GpuMemoryReservation()
-
GpuMemoryReservation(GpuResources *r, int dev, cudaStream_t str, void *p, size_t sz)
-
GpuMemoryReservation(GpuMemoryReservation &&m) noexcept
-
~GpuMemoryReservation()
-
GpuMemoryReservation &operator=(GpuMemoryReservation &&m)
-
inline void *get()
-
void release()
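The member list of GpuMemoryReservation (move constructor, move assignment, get, release, destructor) follows the usual shape of a move-only RAII handle. The sketch below shows that pattern in isolation using host memory; `FakeResources` and `ScopedReservation` are hypothetical stand-ins and do not reproduce the real class, which also carries a device and a stream.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <utility>

// Hypothetical stand-in for a resources object; counts live allocations.
struct FakeResources {
    int live = 0;
    void* alloc(std::size_t bytes) { ++live; return std::malloc(bytes); }
    void dealloc(void* p) { --live; std::free(p); }
};

// Move-only handle that returns its memory when it goes out of scope.
class ScopedReservation {
   public:
    ScopedReservation() = default;
    ScopedReservation(FakeResources* r, std::size_t bytes)
            : res_(r), ptr_(r->alloc(bytes)) {}
    ScopedReservation(ScopedReservation&& o) noexcept
            : res_(std::exchange(o.res_, nullptr)),
              ptr_(std::exchange(o.ptr_, nullptr)) {}
    ScopedReservation& operator=(ScopedReservation&& o) noexcept {
        if (this != &o) {
            release(); // give back what we currently hold
            res_ = std::exchange(o.res_, nullptr);
            ptr_ = std::exchange(o.ptr_, nullptr);
        }
        return *this;
    }
    ScopedReservation(const ScopedReservation&) = delete;
    ScopedReservation& operator=(const ScopedReservation&) = delete;
    ~ScopedReservation() { release(); }

    void* get() { return ptr_; }

    // Returns the memory early; safe to call more than once.
    void release() {
        if (ptr_) {
            res_->dealloc(ptr_);
            ptr_ = nullptr;
            res_ = nullptr;
        }
    }

   private:
    FakeResources* res_ = nullptr;
    void* ptr_ = nullptr;
};
```

Deleting the copy operations means exactly one handle owns a given allocation, so the memory is returned once regardless of how the handle is moved around.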
-
struct GpuMultipleClonerOptions : public faiss::gpu::GpuClonerOptions
Subclassed by faiss::gpu::ToGpuClonerMultiple
Public Functions
-
GpuMultipleClonerOptions()
Public Members
-
bool shard
Whether to shard the index across GPUs, versus replication across GPUs
-
int shard_type
IndexIVF::copy_subset_to subset type.
-
IndicesOptions indicesOptions
how should indices be stored on index types that support indices (anything but GpuIndexFlat*)?
-
bool useFloat16CoarseQuantizer
is the coarse quantizer in float16?
-
bool useFloat16
for GpuIndexIVFFlat, is storage in float16? for GpuIndexIVFPQ, are intermediate calculations in float16?
-
bool usePrecomputed
use precomputed tables?
-
long reserveVecs
reserve vectors in the invfiles?
-
bool storeTransposed
For GpuIndexFlat, store data in transposed layout?
-
bool verbose
Set verbose options on the index.
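The shard option chooses between giving every GPU a full copy (replication) and giving each GPU a disjoint subset of the vectors (sharding). The sketch below shows two simple ways a set of ntotal vectors could be partitioned across shards. Both helpers are hypothetical illustrations; this page does not specify how shard_type values map to partitioning schemes, so no such mapping is implied here.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Contiguous split: shard i of nShards covers positions [begin, end).
std::pair<std::int64_t, std::int64_t> contiguousShard(
        std::int64_t ntotal, int nShards, int i) {
    assert(nShards > 0 && i >= 0 && i < nShards);
    return {i * ntotal / nShards, (i + 1) * ntotal / nShards};
}

// Modulo split: the shard that owns a given non-negative id.
int moduloShard(std::int64_t id, int nShards) {
    assert(nShards > 0 && id >= 0);
    return static_cast<int>(id % nShards);
}
```

With 10 vectors and 3 shards, the contiguous split yields the ranges [0, 3), [3, 6) and [6, 10), which together cover every vector exactly once.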
-
struct GpuParameterSpace : public faiss::ParameterSpace
- #include <GpuAutoTune.h>
parameter space and setters for GPU indexes
Public Functions
-
virtual void initialize(const faiss::Index *index) override
initialize with reasonable parameters for the index
-
virtual void set_index_parameter(faiss::Index *index, const std::string &name, double val) const override
set a combination of parameters on an index
-
size_t n_combinations() const
nb of combinations, = product of values sizes
-
bool combination_ge(size_t c1, size_t c2) const
returns whether combinations c1 >= c2 in the tuple sense
-
std::string combination_name(size_t cno) const
get string representation of the combination
-
void display() const
print a description on stdout
-
ParameterRange &add_range(const std::string &name)
add a new parameter (or return it if it exists)
-
void set_index_parameters(Index *index, size_t cno) const
set a combination of parameters on an index
-
void set_index_parameters(Index *index, const char *param_string) const
set a combination of parameters described by a string
-
void update_bounds(size_t cno, const OperatingPoint &op, double *upper_bound_perf, double *lower_bound_t) const
find an upper bound on the performance and a lower bound on t for configuration cno given another operating point op
-
void explore(Index *index, size_t nq, const float *xq, const AutoTuneCriterion &crit, OperatingPoints *ops) const
explore operating points
- Parameters
index – index to run on
xq – query vectors (size nq * index.d)
crit – selection criterion
ops – resulting operating points
Public Members
-
std::vector<ParameterRange> parameter_ranges
all tunable parameters
-
int verbose
verbosity during exploration
-
int n_experiments
nb of experiments during optimization (0 = try all combinations)
-
size_t batchsize
maximum number of queries to submit at a time.
-
bool thread_over_batches
use multithreading over batches (useful to benchmark independent single-searches)
-
double min_test_duration
run tests several times until they reach at least this duration (to avoid jittering in MT mode)
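n_combinations is described as the product of the value-list sizes, and combination_ge compares two combinations "in the tuple sense". One way to picture this is a mixed-radix encoding of a combination number into one value index per parameter. The sketch below assumes such an encoding, with the first parameter varying fastest; the function names and the `sizes` vector are hypothetical and stand in for the sizes of the parameter ranges.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// sizes[i] = number of candidate values for parameter i.
std::size_t nCombinations(const std::vector<std::size_t>& sizes) {
    std::size_t n = 1;
    for (std::size_t s : sizes) {
        n *= s;
    }
    return n;
}

// Decode a combination number into one value index per parameter
// (assumed mixed-radix encoding, first parameter varies fastest).
std::vector<std::size_t> decodeCombination(
        const std::vector<std::size_t>& sizes, std::size_t cno) {
    std::vector<std::size_t> idx;
    idx.reserve(sizes.size());
    for (std::size_t s : sizes) {
        assert(s > 0);
        idx.push_back(cno % s);
        cno /= s;
    }
    return idx;
}

// True if every component of c1 is >= the matching component of c2.
bool combinationGe(
        const std::vector<std::size_t>& sizes, std::size_t c1, std::size_t c2) {
    const auto a = decodeCombination(sizes, c1);
    const auto b = decodeCombination(sizes, c2);
    for (std::size_t i = 0; i < sizes.size(); ++i) {
        if (a[i] < b[i]) {
            return false;
        }
    }
    return true;
}
```

Under this encoding, with sizes {3, 4}, combination 7 decodes to (1, 2) and combination 5 to (2, 1), so 7 is not >= 5 in the tuple sense even though it is numerically larger.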
-
struct GpuProgressiveDimIndexFactory : public faiss::ProgressiveDimIndexFactory
- #include <GpuCloner.h>
index factory for the ProgressiveDimClustering object
Public Functions
-
explicit GpuProgressiveDimIndexFactory(int ngpu)
-
virtual Index *operator()(int dim) override
ownership transferred to caller
-
virtual ~GpuProgressiveDimIndexFactory() override
Public Members
-
GpuMultipleClonerOptions options
-
std::vector<GpuResourcesProvider*> vres
-
std::vector<int> devices
-
int ncall
-
class GpuResources
- #include <GpuResources.h>
Base class of GPU-side resource provider; hides provision of cuBLAS handles, CUDA streams and all device memory allocation performed
Subclassed by faiss::gpu::StandardGpuResourcesImpl
Public Functions
-
virtual ~GpuResources()
-
virtual void initializeForDevice(int device) = 0
Call to pre-allocate resources for a particular device. If this is not called, then resources will be allocated on first demand
-
virtual cublasHandle_t getBlasHandle(int device) = 0
Returns the cuBLAS handle that we use for the given device.
-
virtual cudaStream_t getDefaultStream(int device) = 0
Returns the stream that we order all computation on for the given device
-
virtual void setDefaultStream(int device, cudaStream_t stream) = 0
Overrides the default stream for a device to the user-supplied stream. The resources object does not own this stream (i.e., it will not destroy it).
-
virtual std::vector<cudaStream_t> getAlternateStreams(int device) = 0
Returns the set of alternative streams that we use for the given device.
-
virtual void *allocMemory(const AllocRequest &req) = 0
Memory management: returns an allocation from the given memory space, ordered with respect to the given stream (i.e., the first user will be a kernel in this stream). All allocations are sized internally to be the next highest multiple of 16 bytes, and all allocations returned are guaranteed to be 16 byte aligned.
-
virtual void deallocMemory(int device, void *in) = 0
Returns a previous allocation.
-
virtual size_t getTempMemoryAvailable(int device) const = 0
For MemorySpace::Temporary, how much space is immediately available without cudaMalloc allocation?
-
virtual std::pair<void*, size_t> getPinnedMemory() = 0
Returns the available CPU pinned memory buffer.
-
virtual cudaStream_t getAsyncCopyStream(int device) = 0
Returns the stream on which we perform async CPU <-> GPU copies.
-
cublasHandle_t getBlasHandleCurrentDevice()
Functions provided by default: calls getBlasHandle with the current device.
-
cudaStream_t getDefaultStreamCurrentDevice()
Calls getDefaultStream with the current device.
-
size_t getTempMemoryAvailableCurrentDevice() const
Calls getTempMemoryAvailable with the current device.
-
GpuMemoryReservation allocMemoryHandle(const AllocRequest &req)
Returns a temporary memory allocation via a RAII object.
-
void syncDefaultStream(int device)
Synchronizes the CPU with respect to the default stream for the given device
-
void syncDefaultStreamCurrentDevice()
Calls syncDefaultStream for the current device.
-
std::vector<cudaStream_t> getAlternateStreamsCurrentDevice()
Calls getAlternateStreams for the current device.
-
cudaStream_t getAsyncCopyStreamCurrentDevice()
Calls getAsyncCopyStream for the current device.
-
class GpuResourcesProvider
- #include <GpuResources.h>
Interface for a provider of a shared resources object.
Subclassed by faiss::gpu::StandardGpuResources
Public Functions
-
virtual ~GpuResourcesProvider()
-
virtual std::shared_ptr<GpuResources> getResources() = 0
Returns the shared resources object.
-
template<typename GpuIndex>
struct IndexWrapper
Public Functions
-
IndexWrapper(int numGpus, std::function<std::unique_ptr<GpuIndex>(GpuResourcesProvider*, int)> init)
-
void runOnIndices(std::function<void(GpuIndex*)> f)
-
void setNumProbes(int nprobe)
Public Members
-
std::vector<std::unique_ptr<faiss::gpu::StandardGpuResources>> resources
-
std::vector<std::unique_ptr<GpuIndex>> subIndex
-
std::unique_ptr<faiss::IndexReplicas> replicaIndex
-
class KernelTimer
- #include <Timer.h>
Utility class for timing execution of a kernel.
Public Functions
-
KernelTimer(cudaStream_t stream = 0)
Constructor starts the timer and adds an event into the current device stream
-
~KernelTimer()
Destructor releases event resources.
-
float elapsedMilliseconds()
Adds a stop event then synchronizes on the stop event to get the actual GPU-side kernel timings for any kernels launched in the current stream. Returns the number of milliseconds elapsed. Can only be called once.
Private Members
-
cudaEvent_t startEvent_
-
cudaEvent_t stopEvent_
-
cudaStream_t stream_
-
bool valid_
-
class StackDeviceMemory
- #include <StackDeviceMemory.h>
Device memory manager that provides temporary memory allocations out of a region of memory, for a single device
Public Functions
-
StackDeviceMemory(GpuResources *res, int device, size_t allocPerDevice)
Allocate a new region of memory that we manage.
-
StackDeviceMemory(int device, void *p, size_t size, bool isOwner)
Manage a region of memory for a particular device, with or without ownership
-
~StackDeviceMemory()
-
int getDevice() const
-
void *allocMemory(cudaStream_t stream, size_t size)
All allocations requested should be a multiple of 16 bytes.
-
void deallocMemory(int device, cudaStream_t, size_t size, void *p)
-
size_t getSizeAvailable() const
-
std::string toString() const
-
struct Range
- #include <StackDeviceMemory.h>
Previous allocation ranges and the streams for which synchronization is required
Public Functions
-
inline Range(char *s, char *e, cudaStream_t str)
Public Members
-
char *start_
-
char *end_
-
cudaStream_t stream_
-
struct Stack
Public Functions
-
Stack(GpuResources *res, int device, size_t size)
Constructor that allocates memory via cudaMalloc.
-
~Stack()
-
size_t getSizeAvailable() const
Returns how much size is available for an allocation without calling cudaMalloc
-
char *getAlloc(size_t size, cudaStream_t stream)
Obtains an allocation; all allocations are guaranteed to be 16 byte aligned
-
void returnAlloc(char *p, size_t size, cudaStream_t stream)
Returns an allocation.
-
std::string toString() const
Returns the stack state.
Public Members
-
GpuResources *res_
Our GpuResources object.
-
int device_
Device this allocation is on.
-
char *alloc_
Where our temporary memory buffer is allocated; we allocate starting 16 bytes into this
-
size_t allocSize_
Total size of our allocation.
-
char *start_
Our temporary memory region; [start_, end_) is valid.
-
char *end_
-
char *head_
Stack head within [start, end)
-
std::list<Range> lastUsers_
List of previous last users of allocations on our stack, for possible synchronization purposes
-
size_t highWaterMemoryUsed_
What’s the high water mark in terms of memory used from the temporary buffer?
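The Stack members above (a fixed region, a head pointer, and a high water mark) describe a bump allocator whose allocations are returned in reverse order. The sketch below mimics that behavior in host memory, with request sizes rounded up to a multiple of 16 bytes. `HostStack` is a hypothetical stand-in: it omits the per-stream synchronization tracked by lastUsers_, and it reports failure with nullptr where the documented behavior is to fall back to cudaMalloc.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Round a size up to the next multiple of 16 bytes.
inline std::size_t roundUp16(std::size_t n) {
    return (n + 15) / 16 * 16;
}

// Host-memory stand-in for a device stack region.
class HostStack {
   public:
    explicit HostStack(std::size_t size) : buf_(roundUp16(size)) {}

    std::size_t sizeAvailable() const { return buf_.size() - head_; }

    // Bump-allocates from the region; returns nullptr if it does not fit.
    char* getAlloc(std::size_t size) {
        size = roundUp16(size);
        if (size > sizeAvailable()) {
            return nullptr;
        }
        char* p = buf_.data() + head_;
        head_ += size;
        highWater_ = std::max(highWater_, head_);
        return p;
    }

    // Allocations must be returned in reverse (LIFO) order.
    void returnAlloc(char* p, std::size_t size) {
        size = roundUp16(size);
        assert(p + size == buf_.data() + head_);
        head_ -= size;
    }

    std::size_t highWater() const { return highWater_; }

   private:
    std::vector<char> buf_;
    std::size_t head_ = 0;
    std::size_t highWater_ = 0;
};
```

Because every size is rounded up, offsets within the region stay multiples of 16, which is what lets an aligned base address produce aligned allocations.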
-
class StandardGpuResources : public faiss::gpu::GpuResourcesProvider
- #include <StandardGpuResources.h>
Default implementation of GpuResources that allocates a cuBLAS stream and 2 streams for use, as well as temporary memory. Internally, the Faiss GPU code uses the instance managed by getResources, but this is the user-facing object that is internally reference counted.
Public Functions
-
StandardGpuResources()
-
~StandardGpuResources() override
- virtual std::shared_ptr<GpuR