Class faiss::gpu::StackDeviceMemory

class StackDeviceMemory

Device memory manager that provides temporary memory allocations out of a region of memory, for a single device.

Public Functions

StackDeviceMemory(GpuResources *res, int device, size_t allocPerDevice)

Allocate a new region of memory that we manage.

StackDeviceMemory(int device, void *p, size_t size, bool isOwner)

Manage a region of memory for a particular device, with or without ownership.

~StackDeviceMemory()

int getDevice() const

void *allocMemory(cudaStream_t stream, size_t size)

All allocations requested should be a multiple of 16 bytes.
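A caller whose natural size is not already a multiple of 16 bytes can round it up before requesting memory. The following is a minimal sketch; `roundUpTo16` is a hypothetical helper introduced here for illustration and is not part of the documented interface.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (an assumption, not a Faiss function): round a byte
// count up to the next multiple of 16 so it satisfies the size requirement.
// Does not guard against overflow for sizes near the maximum of std::size_t.
constexpr std::size_t roundUpTo16(std::size_t size) {
    return (size + 15) / 16 * 16;
}

// Compile-time check that sizes already on a 16-byte boundary are unchanged.
static_assert(roundUpTo16(32) == 32, "multiples of 16 are left as-is");
```

For example, a request for 100 bytes would be padded to 112 bytes before being passed as the `size` argument.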

void deallocMemory(int device, cudaStream_t, size_t size, void *p)

size_t getSizeAvailable() const

std::string toString() const

Protected Attributes

int device_

Our device.

Stack stack_

Memory stack.

struct Range

Previous allocation ranges and the streams for which synchronization is required.

Public Functions

inline Range(char *s, char *e, cudaStream_t str)

Public Members

char *start_

char *end_

cudaStream_t stream_

struct Stack

Public Functions

Stack(GpuResources *res, int device, size_t size)

Constructor that allocates memory via cudaMalloc.

~Stack()
size_t getSizeAvailable() const

Returns how much space is available for an allocation without calling cudaMalloc.

char *getAlloc(size_t size, cudaStream_t stream)

Obtains an allocation; all allocations are guaranteed to be 16-byte aligned.

void returnAlloc(char *p, size_t size, cudaStream_t stream)

Returns an allocation.

std::string toString() const

Returns the stack state.
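The stack discipline behind getSizeAvailable, getAlloc, and returnAlloc can be modeled on the host without CUDA. The sketch below is an illustration under stated assumptions, not the Faiss implementation: `HostStackSketch` is a hypothetical type, it uses a `std::vector` in place of device memory, it omits the stream argument, it returns `nullptr` when a request does not fit, and it does not model alignment of the base pointer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-only model of a stack (bump) allocator. Member names
// follow this page, but the behavior is an assumption for illustration.
struct HostStackSketch {
    std::vector<char> buffer_;  // stands in for the device region
    char* start_;               // [start_, end_) is the usable region
    char* end_;
    char* head_;                // next free byte within [start_, end_)
    std::size_t highWaterMemoryUsed_ = 0;

    explicit HostStackSketch(std::size_t size)
            : buffer_(size),
              start_(buffer_.data()),
              end_(buffer_.data() + size),
              head_(buffer_.data()) {}

    // Copying would leave the raw pointers aimed at the wrong buffer.
    HostStackSketch(const HostStackSketch&) = delete;
    HostStackSketch& operator=(const HostStackSketch&) = delete;

    // Bytes that can still be handed out from the region.
    std::size_t getSizeAvailable() const {
        return static_cast<std::size_t>(end_ - head_);
    }

    // Hands out the next `size` bytes, or nullptr if the size is not a
    // multiple of 16 or does not fit (an assumption of this sketch).
    char* getAlloc(std::size_t size) {
        if (size % 16 != 0 || size > getSizeAvailable()) {
            return nullptr;
        }
        char* p = head_;
        head_ += size;
        highWaterMemoryUsed_ = std::max(
                highWaterMemoryUsed_,
                static_cast<std::size_t>(head_ - start_));
        return p;
    }

    // Allocations are returned in the reverse order they were obtained,
    // so the returned block must end exactly at the current head.
    void returnAlloc(char* p, std::size_t size) {
        assert(p + size == head_);
        head_ = p;
    }
};
```

In this model, releasing memory only moves the head pointer back, which is why a stack allocator is cheap for short-lived temporary buffers but requires last-in, first-out release order.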

Public Members

GpuResources *res_

Our GpuResources object.

int device_

Device this allocation is on.

char *alloc_

Where our temporary memory buffer is allocated; we allocate starting 16 bytes into this.

size_t allocSize_

Total size of our allocation.

char *start_

Our temporary memory region; [start_, end_) is valid.

char *end_

char *head_

Stack head within [start_, end_).

std::list<Range> lastUsers_

List of previous last users of allocations on our stack, for possible synchronization purposes.
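One way to read this bookkeeping: when a region of the stack is reused, work previously queued on a different stream may still be touching the same bytes, so the new user may need to wait for it. The sketch below illustrates that overlap test only. It is an assumption about intent rather than the Faiss logic; `StreamId`, `RangeSketch`, and `needsSync` are hypothetical names, and an `int` stands in for `cudaStream_t` so the example builds without CUDA.

```cpp
#include <cassert>
#include <list>

// Hypothetical stand-in for cudaStream_t.
using StreamId = int;

// Hypothetical mirror of the Range struct: a half-open byte range
// [start_, end_) and the stream that last used it.
struct RangeSketch {
    const char* start_;
    const char* end_;
    StreamId stream_;
};

// Returns true if the new allocation [start, end) overlaps any previously
// used range whose last user ran on a different stream.
inline bool needsSync(
        const std::list<RangeSketch>& lastUsers,
        const char* start,
        const char* end,
        StreamId stream) {
    for (const auto& r : lastUsers) {
        // Two half-open ranges overlap when each starts before the other ends.
        bool overlaps = r.start_ < end && start < r.end_;
        if (overlaps && r.stream_ != stream) {
            return true;
        }
    }
    return false;
}
```

Work issued on the same stream is already ordered, which is why this sketch only flags overlaps that come from a different stream.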

size_t highWaterMemoryUsed_

The high-water mark of memory used from the temporary buffer.