concurrentqueue
A fast multi-producer, multi-consumer lock-free concurrent queue for C++11
A single-header C++ library for passing data between threads without locks, built for high throughput in multi-threaded applications.
This is a C++ library that solves a specific problem: passing data between multiple threads safely and quickly, without using traditional locks. A "queue" in programming is a data structure where items are added at one end and removed from the other, like a line at a store. When multiple threads (independent execution paths in a program) all need to read from and write to the same queue at the same time, you normally need a lock to prevent them from colliding. This library avoids that lock entirely, which is why it is described as "lock-free."
The entire library is a single header file, so you drop one file into your project and include it. It handles memory management for you, works with any data type, and places no artificial limits on how many items can be in the queue. A blocking variant is also included for cases where a consumer thread should wait for an item to arrive, rather than return immediately with "nothing found," when the queue is empty.
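The difference between the two consumer behaviours described above can be sketched with the standard library alone. This is a minimal illustration, not the library's code: `WaitableQueueSketch` and `sum_with_blocking_consumer` are hypothetical names, and the sketch uses a mutex and condition variable, so it is deliberately not lock-free. Only the method names `try_dequeue` and `wait_dequeue` are chosen to mirror the library's.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical illustration only: a lock-based queue, NOT the library's
// lock-free implementation. It contrasts the two consumer behaviours.
template <typename T>
class WaitableQueueSketch {
public:
    void enqueue(T item) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(item));
        }
        not_empty_.notify_one();  // wake one waiting consumer, if any
    }

    // Non-blocking: returns false immediately when the queue is empty.
    bool try_dequeue(T& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop();
        return true;
    }

    // Blocking: sleeps until an item is available, then removes it.
    void wait_dequeue(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return !items_.empty(); });
        out = std::move(items_.front());
        items_.pop();
    }

private:
    std::mutex mutex_;
    std::condition_variable not_empty_;
    std::queue<T> items_;
};

// A producer thread enqueues 1..count while the calling thread consumes.
// The consumer blocks in wait_dequeue instead of polling an empty queue.
inline long sum_with_blocking_consumer(int count) {
    WaitableQueueSketch<int> queue;
    std::thread producer([&queue, count] {
        for (int i = 1; i <= count; ++i) queue.enqueue(i);
    });
    long total = 0;
    for (int i = 0; i < count; ++i) {
        int item = 0;
        queue.wait_dequeue(item);
        total += item;
    }
    producer.join();
    return total;
}
```

With the non-blocking call, the consumer has to decide what to do when it gets `false` back (retry, yield, or do other work). With the blocking call, that decision is replaced by a wait, which is usually the simpler choice for a dedicated worker thread.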
Performance is the primary reason this library exists. The author found that other lock-free queues for C++ either imposed restrictions on the types they could hold or were not actually lock-free. This one supports bulk enqueue and dequeue operations, which are significantly faster than adding or removing items one at a time, especially when many threads are contending for the queue simultaneously.
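One way to see why bulk operations help is to count how often the queue has to synchronize. The sketch below is an analogy under stated assumptions, not the library's implementation: `CountingQueueSketch` is a hypothetical mutex-based queue, and "synchronization step" here means one lock acquisition, whereas a lock-free queue would pay in atomic operations instead. The method names are modeled on the library's `enqueue_bulk` and `try_dequeue_bulk`.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <vector>

// Hypothetical illustration only: counts synchronization steps on a
// lock-based queue to show why batching helps. Not the library's code.
class CountingQueueSketch {
public:
    // One synchronization step per item.
    void enqueue(int item) {
        std::lock_guard<std::mutex> lock(mutex_);
        ++sync_steps_;
        items_.push_back(item);
    }

    // One synchronization step for the whole batch.
    void enqueue_bulk(const int* first, std::size_t count) {
        std::lock_guard<std::mutex> lock(mutex_);
        ++sync_steps_;
        items_.insert(items_.end(), first, first + count);
    }

    // Removes up to `max` items in one step; returns how many were removed.
    std::size_t try_dequeue_bulk(int* out, std::size_t max) {
        std::lock_guard<std::mutex> lock(mutex_);
        ++sync_steps_;
        const std::size_t n = items_.size() < max ? items_.size() : max;
        for (std::size_t i = 0; i < n; ++i) {
            out[i] = items_.front();
            items_.pop_front();
        }
        return n;
    }

    // Read without locking; only meaningful once all threads are done.
    std::size_t sync_steps() const { return sync_steps_; }

private:
    std::mutex mutex_;
    std::deque<int> items_;
    std::size_t sync_steps_ = 0;
};

// Enqueues n items one at a time and reports the steps used.
inline std::size_t steps_one_at_a_time(std::size_t n) {
    CountingQueueSketch q;
    for (std::size_t i = 0; i < n; ++i) q.enqueue(static_cast<int>(i));
    return q.sync_steps();
}

// Enqueues n items as a single batch and reports the steps used.
inline std::size_t steps_bulk(std::size_t n) {
    CountingQueueSketch q;
    std::vector<int> batch(n, 0);
    q.enqueue_bulk(batch.data(), batch.size());
    return q.sync_steps();
}
```

In this sketch, enqueuing 100 items one at a time costs 100 synchronization steps, while one bulk call costs a single step. Under contention each step is a point where threads can interfere with one another, which is why amortizing it over a batch tends to pay off.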
There are meaningful trade-offs to know about. The queue is not linearizable: items enqueued by different producers are not guaranteed to come out in any particular order relative to one another, although items from a single producer keep their relative order. It is not NUMA-aware (NUMA being a multi-processor hardware layout in which the cost of a memory access depends on which processor touches which memory), so it may not scale well on such machines. It is also not sequentially consistent, so advanced usage patterns, such as draining the queue until it is empty, require care. The README includes sample code and links to detailed design documentation for developers who need to understand exactly how it behaves.
Internally, items are stored in contiguous blocks of memory rather than linked lists, which improves cache performance. Each producer gets its own sub-queue, and consumers cycle through all sub-queues to find an item to process. The library requires a C++11-capable compiler to build.
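The sub-queue layout described above can be sketched in a few lines. This is a simplified, single-threaded model with no synchronization at all, written only to show the shape of the design: `SubQueueSketch`, `drain_order_demo`, and the explicit `producer_id` argument are all hypothetical, and the strict round-robin rotation is an assumption made for the illustration rather than a description of how the library picks a sub-queue.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical, single-threaded illustration of the layout described above.
// It has no synchronization and is NOT the library's implementation.
class SubQueueSketch {
public:
    explicit SubQueueSketch(std::size_t producers) : sub_queues_(producers) {}

    // Each producer appends only to its own sub-queue.
    void enqueue(std::size_t producer_id, int item) {
        sub_queues_[producer_id].push_back(item);
    }

    // The consumer visits sub-queues in rotation, starting where it left
    // off, and takes the first item it finds.
    bool try_dequeue(int& out) {
        for (std::size_t tried = 0; tried < sub_queues_.size(); ++tried) {
            std::deque<int>& sub = sub_queues_[cursor_];
            cursor_ = (cursor_ + 1) % sub_queues_.size();
            if (!sub.empty()) {
                out = sub.front();
                sub.pop_front();
                return true;
            }
        }
        return false;  // every sub-queue was empty
    }

private:
    std::vector<std::deque<int>> sub_queues_;
    std::size_t cursor_ = 0;
};

// Producer 0 enqueues 1, 2 and producer 1 enqueues 10, 20. Returns the
// order in which a single consumer drains them.
inline std::vector<int> drain_order_demo() {
    SubQueueSketch q(2);
    q.enqueue(0, 1);
    q.enqueue(0, 2);
    q.enqueue(1, 10);
    q.enqueue(1, 20);
    std::vector<int> order;
    int item = 0;
    while (q.try_dequeue(item)) order.push_back(item);
    return order;
}
```

In this model the consumer sees `1, 10, 2, 20`. Each producer's items stay in order (1 before 2, 10 before 20), but how the two streams interleave depends on where the consumer happens to look, which is the intuition behind the ordering trade-off. Because producers never write to each other's sub-queues, they also do not contend with one another on enqueue.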
Where it fits
- Replace a mutex-protected queue in a multi-threaded C++ app with a lock-free queue to reduce contention and improve throughput
- Implement a fast producer-consumer pipeline in a game engine or real-time system where latency and throughput matter
- Use bulk enqueue and dequeue operations to batch-process work items across multiple threads more efficiently than handling them one at a time