Adding to an Array From Multiple Threads - Lock Free
To add an item to the array, atomically retrieve the next free index and increment it in a single step. Only then, once you hold the retrieved index, write your content into that slot.
// This variable is shared between threads.
// Represents the next index to add an item into.
int index = 0;
// Returns the old value of index and increments it, all in one atomic step.
int oldIndex = atomicAdd(&index, 1);
array[oldIndex] = 23234321;
If your threads need to synchronize, they often have to perform some operation serially, one thread at a time. Try to make that operation as small as possible, so that it consumes as little execution time as possible.
Here’s an easy way to add items to an array from multiple threads, without using a lock. You will use an atomic int.
The atomic int represents the next spot (cell) in the array to write an item into. Every time a thread wants to add an item to the array, it will atomically increment this int, and get the old value (all in one step). It will then proceed to write its content in the “old value” spot.
// Shared between all threads: the next free index in the array.
int index = 0;
// thread 1
int oldIndex = atomicAdd(&index, 1);
array[oldIndex] = 23234321;
// thread 2
int oldIndex = atomicAdd(&index, 1);
array[oldIndex] = 1234;
// etc
The key thing here is that the next index to insert an item into is retrieved and incremented atomically, so no two threads can ever receive the same index.
This is efficient because the serialized operation is super quick (simply incrementing an int). Compare this with using a lock: each thread would have to acquire the lock, then write its content into the new spot, and then release the lock. See how much more work is serialized in that case?
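To make the idea concrete, here is a minimal sketch of the technique as a complete CUDA kernel plus a host helper. The names `appendEvens`, `collectEvens`, `nextIndex`, and `capacity` are made up for this illustration, and the choice to keep only even numbers is just an excuse to have some threads append and others not. Error checking of the CUDA calls is left out for brevity, so treat this as a starting point rather than production code.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel: every thread looks at one input element and,
// if it is even, appends it to the output array without taking a lock.
__global__ void appendEvens(const int* input, int n,
                            int* output, int* nextIndex, int capacity)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;               // thread has no element to process
    if (input[i] % 2 != 0) return;    // odd values are not appended

    // Atomically reserve a slot: atomicAdd returns the old value of
    // *nextIndex and increments it, all in one step.
    int slot = atomicAdd(nextIndex, 1);

    // Defensive bound check (an assumption of this sketch): never write
    // past the end of the output array.
    if (slot < capacity) {
        output[slot] = input[i];
    }
}

// Hypothetical host helper: copies the input to the GPU, runs the kernel,
// and returns the appended values. Their order is not deterministic.
std::vector<int> collectEvens(const std::vector<int>& input)
{
    int n = static_cast<int>(input.size());
    if (n == 0) return {};

    int* dInput = nullptr;
    int* dOutput = nullptr;
    int* dNextIndex = nullptr;
    cudaMalloc(&dInput, n * sizeof(int));
    cudaMalloc(&dOutput, n * sizeof(int));   // worst case: every value is even
    cudaMalloc(&dNextIndex, sizeof(int));

    cudaMemcpy(dInput, input.data(), n * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemset(dNextIndex, 0, sizeof(int));  // the array starts out empty

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    appendEvens<<<blocks, threads>>>(dInput, n, dOutput, dNextIndex, n);

    // This blocking copy on the default stream waits for the kernel to finish.
    int count = 0;
    cudaMemcpy(&count, dNextIndex, sizeof(int), cudaMemcpyDeviceToHost);

    std::vector<int> result(count);
    cudaMemcpy(result.data(), dOutput, count * sizeof(int), cudaMemcpyDeviceToHost);

    cudaFree(dInput);
    cudaFree(dOutput);
    cudaFree(dNextIndex);
    return result;
}
```

Calling `collectEvens` with the values 1 through 8 should give back 2, 4, 6, and 8 in some order. The order can change from run to run, because the slot each thread receives depends on which thread reaches the `atomicAdd` first. If you need a stable order, sort the result afterwards.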