Class faiss::gpu::StandardGpuResourcesImpl
-
class StandardGpuResourcesImpl : public faiss::gpu::GpuResources
Standard implementation of the GpuResources object that provides for a temporary memory manager
Public Functions
-
StandardGpuResourcesImpl()
-
~StandardGpuResourcesImpl() override
-
virtual bool supportsBFloat16(int device) override
Does the given GPU support bfloat16?
-
void noTempMemory()
Disable allocation of temporary memory; all temporary memory requests will call cudaMalloc / cudaFree at the point of use
-
void setTempMemory(size_t size)
Specify that we wish to use a certain fixed size of memory on all devices as temporary memory. This is the upper bound for the GPU memory that we will reserve. We will never go above 1.5 GiB on any GPU; smaller GPUs (with <= 4 GiB or <= 8 GiB) will use less memory than that. To avoid any temporary memory allocation, pass 0.
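The capping rule described above can be sketched as a pure function. This is an illustrative model rather than the library's code: the name `cappedTempMemory` is hypothetical, and the 512 MiB and 1 GiB tiers for the two smaller GPU classes are assumptions, since the description only states the 1.5 GiB ceiling and that smaller GPUs get less.

```cpp
#include <cstdint>

constexpr std::uint64_t kMiB = std::uint64_t(1) << 20;
constexpr std::uint64_t kGiB = std::uint64_t(1) << 30;

// Hypothetical helper: clamp a requested temporary memory size (bytes)
// according to the total memory of the GPU (bytes).
inline std::uint64_t cappedTempMemory(std::uint64_t totalMem,
                                      std::uint64_t requested) {
    std::uint64_t cap;
    if (totalMem <= 4 * kGiB) {
        cap = 512 * kMiB;   // assumed tier for GPUs with <= 4 GiB
    } else if (totalMem <= 8 * kGiB) {
        cap = 1024 * kMiB;  // assumed tier for GPUs with <= 8 GiB
    } else {
        cap = 1536 * kMiB;  // the 1.5 GiB ceiling stated in the description
    }
    // The request is an upper bound: never reserve more than was asked for.
    return requested < cap ? requested : cap;
}
```

Under these assumptions, a 4 GiB request on a 16 GiB GPU is reduced to 1.5 GiB, while a request of 0 stays 0 and so disables the reservation entirely.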
-
void setPinnedMemory(size_t size)
Set amount of pinned memory to allocate, for async GPU <-> CPU transfers
-
virtual void setDefaultStream(int device, cudaStream_t stream) override
Called to change the stream for work ordering. We do not own stream; i.e., it will not be destroyed when the GpuResources object gets cleaned up. We are guaranteed that all Faiss GPU work is ordered with respect to this stream upon exit from an index or other Faiss GPU call.
-
void revertDefaultStream(int device)
Revert the default stream to the original stream managed by this resources object, in case someone called setDefaultStream.
-
virtual cudaStream_t getDefaultStream(int device) override
Returns the stream for the given device on which all Faiss GPU work is ordered. We are guaranteed that all Faiss GPU work is ordered with respect to this stream upon exit from an index or other Faiss GPU call.
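The interplay of setDefaultStream, revertDefaultStream and getDefaultStream can be modelled with two lookup tables. The sketch below is a simplified model, not the library's implementation: `StreamTable`, `addOwnedStream` and the integer `StreamHandle` (a stand-in for `cudaStream_t`, so that the code compiles without CUDA) are hypothetical, and it assumes that a user-supplied stream simply shadows the owned one. Any synchronization a real implementation performs when switching streams is omitted.

```cpp
#include <unordered_map>

// Stand-in for cudaStream_t so the sketch compiles without CUDA.
using StreamHandle = int;

// Hypothetical model of the set / revert / get behaviour.
class StreamTable {
public:
    // Register the stream owned by the resources object for a device.
    void addOwnedStream(int device, StreamHandle s) { owned_[device] = s; }

    // A user-supplied stream takes precedence; it is not owned here
    // and is never destroyed by this object.
    void setDefaultStream(int device, StreamHandle s) { user_[device] = s; }

    // Drop any user override so the owned stream is used again.
    void revertDefaultStream(int device) { user_.erase(device); }

    // User stream if one was set, otherwise the owned stream.
    // Throws std::out_of_range if the device was never registered.
    StreamHandle getDefaultStream(int device) const {
        auto it = user_.find(device);
        if (it != user_.end()) {
            return it->second;
        }
        return owned_.at(device);
    }

private:
    std::unordered_map<int, StreamHandle> owned_;
    std::unordered_map<int, StreamHandle> user_;
};
```

Keeping the override in a separate table means reverting is just an erase, and the owned stream never needs to be recreated.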
-
void setDefaultNullStreamAllDevices()
Called to change the work ordering streams to the null stream for all devices
-
void setLogMemoryAllocations(bool enable)
If enabled, will print every GPU memory allocation and deallocation to standard output
-
virtual void initializeForDevice(int device) override
Internal system calls.
Initialize resources for this device
-
virtual cublasHandle_t getBlasHandle(int device) override
Returns the cuBLAS handle that we use for the given device.
-
virtual std::vector<cudaStream_t> getAlternateStreams(int device) override
Returns the set of alternative streams that we use for the given device.
-
virtual void *allocMemory(const AllocRequest &req) override
Allocate non-temporary GPU memory.
-
virtual void deallocMemory(int device, void *in) override
Returns a previous allocation.
-
virtual size_t getTempMemoryAvailable(int device) const override
For MemorySpace::Temporary, how much space is immediately available without cudaMalloc allocation?
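To illustrate what "immediately available" means for a pre-reserved region, here is a minimal byte-accounting model of a stack-style temporary memory provider. `TempStack`, `tryPush` and `pop` are hypothetical names; the sketch tracks sizes only, ignores alignment, and is not the library's StackDeviceMemory.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of a per-device stack of temporary memory.
class TempStack {
public:
    explicit TempStack(std::size_t capacity) : capacity_(capacity) {}

    // Bytes that can be handed out without a fresh device allocation.
    std::size_t available() const { return capacity_ - used_; }

    // Returns true if the request fits in the reserved region; false means
    // the caller would have to fall back to a direct device allocation.
    bool tryPush(std::size_t bytes) {
        if (bytes > available()) {
            return false;
        }
        used_ += bytes;
        return true;
    }

    // Release the most recent reservation of the given size (LIFO order).
    void pop(std::size_t bytes) {
        assert(bytes <= used_);
        used_ -= bytes;
    }

private:
    std::size_t capacity_;
    std::size_t used_ = 0;
};
```

In this model a capacity of 0 behaves like the "no temporary memory" configuration: every request is reported as needing a direct allocation.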
-
std::map<int, std::map<std::string, std::pair<int, size_t>>> getMemoryInfo() const
Export a description of memory used for Python.
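The nested return type is easier to read with an alias. The sketch below walks such a structure to total the bytes for one device. It assumes the outer key is the device, the inner key is an allocation category name, and the pair holds (number of allocations, total bytes); that reading of the pair, the `MemoryInfo` alias and the helper `totalBytesOnDevice` are assumptions rather than documented facts.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Alias for the type returned by getMemoryInfo().
using MemoryInfo =
        std::map<int, std::map<std::string, std::pair<int, std::size_t>>>;

// Hypothetical helper: sum the byte totals of every category on one device,
// assuming pair = (allocation count, total bytes).
inline std::size_t totalBytesOnDevice(const MemoryInfo& info, int device) {
    auto it = info.find(device);
    if (it == info.end()) {
        return 0;  // no allocations recorded for this device
    }
    std::size_t total = 0;
    for (const auto& entry : it->second) {
        total += entry.second.second;
    }
    return total;
}
```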
-
virtual std::pair<void*, size_t> getPinnedMemory() override
Returns the available CPU pinned memory buffer.
-
virtual cudaStream_t getAsyncCopyStream(int device) override
Returns the stream on which we perform async CPU <-> GPU copies.
-
bool supportsBFloat16CurrentDevice()
Does the current GPU support bfloat16?
Functions provided by default
-
cublasHandle_t getBlasHandleCurrentDevice()
Calls getBlasHandle with the current device.
-
cudaStream_t getDefaultStreamCurrentDevice()
Calls getDefaultStream with the current device.
-
size_t getTempMemoryAvailableCurrentDevice() const
Calls getTempMemoryAvailable with the current device.
-
GpuMemoryReservation allocMemoryHandle(const AllocRequest &req)
Returns a temporary memory allocation via a RAII object.
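The RAII idea behind allocMemoryHandle can be sketched without any GPU dependency. In the model below, `MockResources` and `Reservation` are hypothetical stand-ins (using host `malloc` / `free` instead of device memory); they are not the library's GpuResources or GpuMemoryReservation types, and only show how a handle can return its allocation automatically when it goes out of scope.

```cpp
#include <cstddef>
#include <cstdlib>
#include <utility>

// Hypothetical stand-in for a resources object, backed by host memory.
struct MockResources {
    int outstanding = 0;  // number of live allocations

    void* allocMemory(std::size_t bytes) {
        void* p = std::malloc(bytes);
        if (p) {
            ++outstanding;
        }
        return p;
    }

    void deallocMemory(void* p) {
        if (p) {
            --outstanding;
            std::free(p);
        }
    }
};

// Hypothetical RAII handle: gives the memory back in its destructor.
class Reservation {
public:
    Reservation(MockResources& res, std::size_t bytes)
            : res_(&res), ptr_(res.allocMemory(bytes)) {}

    // Non-copyable, so the allocation is released exactly once.
    Reservation(const Reservation&) = delete;
    Reservation& operator=(const Reservation&) = delete;

    // Movable: the moved-from handle no longer owns anything.
    Reservation(Reservation&& other) noexcept
            : res_(other.res_), ptr_(std::exchange(other.ptr_, nullptr)) {}
    Reservation& operator=(Reservation&&) = delete;

    ~Reservation() { res_->deallocMemory(ptr_); }

    void* get() const { return ptr_; }

private:
    MockResources* res_;
    void* ptr_;
};
```

With this pattern the caller never pairs an allocation with a manual deallocation; leaving the scope, whether normally or through an exception, releases the memory.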
-
void syncDefaultStream(int device)
Synchronizes the CPU with respect to the default stream for the given device
-
void syncDefaultStreamCurrentDevice()
Calls syncDefaultStream for the current device.
-
std::vector<cudaStream_t> getAlternateStreamsCurrentDevice()
Calls getAlternateStreams for the current device.
-
cudaStream_t getAsyncCopyStreamCurrentDevice()
Calls getAsyncCopyStream for the current device.
Protected Functions
-
bool isInitialized(int device) const
Have GPU resources been initialized for this device yet?
Protected Attributes
-
std::unordered_map<int, std::unordered_map<void*, AllocRequest>> allocs_
Set of currently outstanding memory allocations per device: device -> (allocated ptr -> alloc request)
-
std::unordered_map<int, std::unique_ptr<StackDeviceMemory>> tempMemory_
Temporary memory provider, per each device.
-
std::unordered_map<int, cudaStream_t> defaultStreams_
Our default stream that work is ordered on, one per each device.
-
std::unordered_map<int, cudaStream_t> userDefaultStreams_
This contains particular streams as set by the user for ordering, if any
-
std::unordered_map<int, std::vector<cudaStream_t>> alternateStreams_
Other streams we can use, per each device.
-
std::unordered_map<int, cudaStream_t> asyncCopyStreams_
Async copy stream to use for GPU <-> CPU pinned memory copies.
-
void *pinnedMemAlloc_
Pinned memory allocation for use with this GPU.
-
size_t pinnedMemAllocSize_
-
size_t tempMemSize_
Fixed amount of temporary memory to use on all devices, if one was specified (see setTempMemory)
-
size_t pinnedMemSize_
Amount of pinned memory we should allocate.
-
bool allocLogging_
Whether or not we log every GPU memory allocation and deallocation.
Protected Static Functions
-
static size_t getDefaultTempMemForGPU(int device, size_t requested)
Adjust the default temporary memory allocation based on the total GPU memory size