[med-svn] [Git][med-team/libatomic-queue][master] 13 commits: update changelog.
Étienne Mollier (@emollier)
gitlab at salsa.debian.org
Tue Jul 18 13:17:57 BST 2023
Étienne Mollier pushed to branch master at Debian Med / libatomic-queue
d5eb2d32 by Étienne Mollier at 2023-07-18T13:33:49+02:00
update changelog.
- - - - -
1e62ea14 by Étienne Mollier at 2023-07-18T13:34:23+02:00
New upstream version 0.0+git20230629.b770bb2
- - - - -
83c8e21f by Étienne Mollier at 2023-07-18T13:34:23+02:00
routine-update: New upstream version
- - - - -
e9a16e6a by Étienne Mollier at 2023-07-18T13:34:30+02:00
Update upstream source from tag 'upstream/0.0+git20230629.b770bb2'
Update to upstream version '0.0+git20230629.b770bb2'
with Debian dir 672de571604e8fd1a5e1cfb14ca463d059678e81
- - - - -
39328352 by Étienne Mollier at 2023-07-18T13:34:30+02:00
routine-update: Standards-Version: 4.6.2
- - - - -
83f5d84f by Étienne Mollier at 2023-07-18T13:34:39+02:00
Remove field Section on binary package libatomic-queue0 that duplicates source.
Changes-By: lintian-brush
Fixes: lintian: installable-field-mirrors-source
See-also: https://lintian.debian.org/tags/installable-field-mirrors-source.html
- - - - -
fa5e4061 by Étienne Mollier at 2023-07-18T13:35:09+02:00
generate-shared-library.patch: refresh.
- - - - -
0bbd064e by Étienne Mollier at 2023-07-18T13:38:06+02:00
no-native patch: refresh.
- - - - -
b850a859 by Étienne Mollier at 2023-07-18T13:38:26+02:00
no_thin_archives.patch: refresh.
- - - - -
7ff5a619 by Étienne Mollier at 2023-07-18T13:38:42+02:00
compiler.patch: refresh.
- - - - -
f747e9e1 by Étienne Mollier at 2023-07-18T14:12:11+02:00
concurrentqueue.patch: new: fix include issue.
- - - - -
63d0dc14 by Étienne Mollier at 2023-07-18T14:13:42+02:00
update changelog.
- - - - -
625bcc87 by Étienne Mollier at 2023-07-18T14:16:26+02:00
leave todo items.
- - - - -
20 changed files:
- .github/workflows/c-cpp.yml
- Makefile
- debian/changelog
- debian/control
- debian/patches/compiler.patch
- + debian/patches/concurrentqueue.patch
- debian/patches/generate-shared-library.patch
- debian/patches/no-native
- debian/patches/no_thin_archives.patch
- debian/patches/series
- html/benchmarks.css
- html/benchmarks.html
- html/benchmarks.js
- html/theme.js
- include/atomic_queue/atomic_queue.h
- include/atomic_queue/defs.h
- + results/results-16.20220720T220953.txt
- scripts/run-benchmarks.sh
- src/example.cc
@@ -9,15 +9,19 @@ on:
- runs-on: ubuntu-18.04
+ runs-on: ubuntu-20.04
- - uses: actions/checkout@v2
- - name: Install Boost
+ - uses: actions/checkout@v3
+ - name: Install Boost.Test
run: sudo apt-get --quiet --yes install libboost-test-dev
- name: Environment variables
run: make env; make TOOLSET=gcc versions; make TOOLSET=clang versions
- name: Unit tests with gcc
run: make -rj2 TOOLSET=gcc example run_tests
+ - name: Unit tests with gcc thread sanitizer
+ run: make -rj2 TOOLSET=gcc BUILD=sanitize run_tests
- name: Unit tests with clang
run: make -rj2 TOOLSET=clang example run_tests
+ - name: Unit tests with clang thread sanitizer
+ run: make -rj2 TOOLSET=clang BUILD=sanitize run_tests
@@ -4,6 +4,7 @@
# time make -rC ~/src/atomic_queue -j8 run_benchmarks
# time make -rC ~/src/atomic_queue -j8 TOOLSET=clang run_benchmarks
# time make -rC ~/src/atomic_queue -j8 BUILD=debug run_tests
+# time make -rC ~/src/atomic_queue -j8 BUILD=sanitize run_tests
SHELL := /bin/bash
BUILD := release
@@ -28,13 +29,18 @@ AR := ${ar.${TOOLSET}}
cxxflags.gcc.debug := -Og -fstack-protector-all -fno-omit-frame-pointer # -D_GLIBCXX_DEBUG
cxxflags.gcc.release := -O3 -mtune=native -ffast-math -falign-{functions,loops}=64 -DNDEBUG
+cxxflags.gcc.sanitize := ${cxxflags.gcc.release} -fsanitize=thread
cxxflags.gcc := -pthread -march=native -std=gnu++14 -W{all,extra,error,no-{maybe-uninitialized,unused-variable,unused-function,unused-local-typedefs}} -g -fmessage-length=0 ${cxxflags.gcc.${BUILD}}
+ldflags.gcc.sanitize := ${ldflags.gcc.release} -fsanitize=thread
+ldflags.gcc := ${ldflags.gcc.${BUILD}}
cflags.gcc := -pthread -march=native -W{all,extra} -g -fmessage-length=0 ${cxxflags.gcc.${BUILD}}
cxxflags.clang.debug := -O0 -fstack-protector-all
cxxflags.clang.release := -O3 -mtune=native -ffast-math -falign-functions=64 -DNDEBUG
+cxxflags.clang.sanitize := ${cxxflags.clang.release} -fsanitize=thread
cxxflags.clang := -stdlib=libstdc++ -pthread -march=native -std=gnu++14 -W{all,extra,error,no-{unused-variable,unused-function,unused-local-typedefs}} -g -fmessage-length=0 ${cxxflags.clang.${BUILD}}
+ldflags.clang.sanitize := ${ldflags.clang.release} -fsanitize=thread
ldflags.clang := -stdlib=libstdc++ ${ldflags.clang.${BUILD}}
# Additional CPPFLAGS, CXXFLAGS, CFLAGS, LDLIBS, LDFLAGS can come from the command line, e.g. make CPPFLAGS='-I<my-include-dir>', or from environment variables.
@@ -1,30 +1,45 @@

# atomic_queue
C++14 multiple-producer-multiple-consumer *lockless* queues based on circular buffer with [`std::atomic`][3].
It has been developed, tested and benchmarked on Linux, but should support any C++14 platforms which implement `std::atomic`.
-The main design principle these queues follow is _minimalism_: the bare minimum of atomic operations, fixed size buffer, value semantics.
+These queues have been designed with a goal to minimize the latency between one thread pushing an element into a queue and another thread popping it from the queue.
-These qualities are also limitations:
+## Design Principles
+When minimizing latency a good design is not when there is nothing left to add, but rather when there is nothing left to remove, as these queues exemplify.
-* The maximum queue size must be set at compile time or construction time. The circular buffer side-steps the memory reclamation problem inherent in linked-list based queues for the price of fixed buffer size. See [Effective memory reclamation for lock-free data structures in C++][4] for more details. Fixed buffer size may not be that much of a limitation, since once the queue gets larger than the maximum expected size that indicates a problem that elements aren't processed fast enough, and if the queue keeps growing it may eventually consume all available memory which may affect the entire system, rather than the problematic process only. The only apparent inconvenience is that one has to do an upfront back-of-the-envelope calculation on what would be the largest expected/acceptable queue size.
+The main design principle these queues follow is _minimalism_, which results in such design choices as:
+* Bare minimum of atomic instructions.
+* Explicit contention/false-sharing avoidance.
+* Fixed size buffer.
+* Value semantics. Meaning that the queues make a copy/move upon `push`/`pop`, no reference/pointer to elements in the queue can be obtained.
+The impact of each of these small design choices on their own is barely measurable, but their total impact is much greater than a simple sum of the constituents' impacts, aka super-scalar compounding or synergy (a layman's term). The synergy emerging from combining multiple of these small design choices together is what allows CPUs to perform at their peak capacities least impeded.
+These design choices are also limitations:
+* The maximum queue size must be set at compile time or construction time. The circular buffer side-steps the memory reclamation problem inherent in linked-list based queues for the price of fixed buffer size. See [Effective memory reclamation for lock-free data structures in C++][4] for more details. Fixed buffer size may not be that much of a limitation, since once the queue gets larger than the maximum expected size that indicates a problem that elements aren't consumed fast enough, and if the queue keeps growing it may eventually consume all available memory which may affect the entire system, rather than the problematic process only. The only apparent inconvenience is that one has to do an upfront calculation on what would be the largest expected/acceptable number of unconsumed elements in the queue.
* There are no OS-blocking push/pop functions. This queue is designed for ultra-low-latency scenarios and using an OS blocking primitive would be sacrificing push-to-pop latency. For lowest possible latency one cannot afford blocking in the OS kernel because the wake-up latency of a blocked thread is about 1-3 microseconds, whereas this queue's round-trip time can be as low as 150 nanoseconds.
Ultra-low-latency applications need just that and nothing more. The minimalism pays off, see the [throughput and latency benchmarks][1].
Available containers are:
* `AtomicQueue` - a fixed size ring-buffer for atomic elements.
-* `OptimistAtomicQueue` - a faster fixed size ring-buffer for atomic elements which busy-waits when empty or full.
+* `OptimistAtomicQueue` - a faster fixed size ring-buffer for atomic elements which busy-waits when empty or full. It is `AtomicQueue` used with `push`/`pop` instead of `try_push`/`try_pop`.
* `AtomicQueue2` - a fixed size ring-buffer for non-atomic elements.
-* `OptimistAtomicQueue2` - a faster fixed size ring-buffer for non-atomic elements which busy-waits when empty or full.
-In the above, _atomic elements_ are those, for which [`std::atomic<T>{T{}}.is_lock_free()`][10] returns `true`. In other words, the CPU can load, store and compare-and-exchange such elements atomically natively. On x86-64 such elements are all the [C++ standard arithmetic and pointer types][11].
+* `OptimistAtomicQueue2` - a faster fixed size ring-buffer for non-atomic elements which busy-waits when empty or full. It is `AtomicQueue2` used with `push`/`pop` instead of `try_push`/`try_pop`.
These containers have corresponding `AtomicQueueB`, `OptimistAtomicQueueB`, `AtomicQueueB2`, `OptimistAtomicQueueB2` versions where the buffer size is specified as an argument to the constructor.
@@ -32,12 +47,15 @@ Totally ordered mode is supported. In this mode consumers receive messages in th
Single-producer-single-consumer mode is supported. In this mode, no expensive atomic read-modify-write CPU instructions are necessary, only the cheapest atomic loads and stores. That improves queue throughput significantly.
-A few other thread-safe containers are used for reference in the benchmarks:
+Move-only queue element types are fully supported. For example, a queue of `std::unique_ptr<T>` elements would be `AtomicQueue2B<std::unique_ptr<T>>` or `AtomicQueue2<std::unique_ptr<T>, CAPACITY>`.
+## Role Models
+Several other well established and popular thread-safe containers are used for reference in the [benchmarks][1]:
* `std::mutex` - a fixed size ring-buffer with `std::mutex`.
* `pthread_spinlock` - a fixed size ring-buffer with `pthread_spinlock_t`.
* `boost::lockfree::spsc_queue` - a wait-free single-producer-single-consumer queue from Boost library.
* `boost::lockfree::queue` - a lock-free multiple-producer-multiple-consumer queue from Boost library.
-* `moodycamel::ConcurrentQueue` - a lock-free multiple-producer-multiple-consumer queue used in non-blocking mode.
+* `moodycamel::ConcurrentQueue` - a lock-free multiple-producer-multiple-consumer queue used in non-blocking mode. This queue is designed to maximize throughput at the expense of latency and eschewing the global time order of elements pushed into one queue by different threads. It is not equivalent to other queues benchmarked here in this respect.
* `moodycamel::ReaderWriterQueue` - a lock-free single-producer-single-consumer queue used in non-blocking mode.
* `xenium::michael_scott_queue` - a lock-free multi-producer-multi-consumer queue proposed by [Michael and Scott](http://www.cs.rochester.edu/~scott/papers/1996_PODC_queues.pdf) (this queue is similar to `boost::lockfree::queue` which is also based on the same proposal).
* `xenium::ramalhete_queue` - a lock-free multi-producer-multi-consumer queue proposed by [Ramalhete and Correia](http://concurrencyfreaks.blogspot.com/2016/11/faaarrayqueue-mpmc-lock-free-queue-part.html).
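Illustration (not part of the commit): the README hunk above states that single-producer-single-consumer mode needs only atomic loads and stores, with no read-modify-write instructions. The sketch below shows that idea in its simplest textbook form. The `SpscRing` class and the `spsc_round_trip` helper are hypothetical names introduced here; they are not atomic_queue's API or its implementation.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <utility>

// Hypothetical illustration only; not the atomic_queue implementation.
// A fixed-size single-producer-single-consumer ring buffer that uses
// nothing but atomic loads and stores (no compare-and-exchange, no fetch_add).
template<class T, std::size_t N>
class SpscRing {
    static_assert(N >= 2, "one slot is kept free to tell full from empty");
    T buf_[N];
    std::atomic<std::size_t> head_{0}; // next slot to write; stored only by the producer
    std::atomic<std::size_t> tail_{0}; // next slot to read; stored only by the consumer

public:
    // Value semantics: the element is moved into the buffer.
    bool try_push(T v) {
        std::size_t const h = head_.load(std::memory_order_relaxed);
        std::size_t const next = (h + 1) % N;
        if (next == tail_.load(std::memory_order_acquire))
            return false; // full
        buf_[h] = std::move(v);
        head_.store(next, std::memory_order_release); // publish the element
        return true;
    }

    // Value semantics: the element is moved out of the buffer.
    bool try_pop(T& out) {
        std::size_t const t = tail_.load(std::memory_order_relaxed);
        if (t == head_.load(std::memory_order_acquire))
            return false; // empty
        out = std::move(buf_[t]);
        tail_.store((t + 1) % N, std::memory_order_release); // hand the slot back
        return true;
    }
};

// Pushes 1..count from the calling thread, pops them in a second thread,
// and returns the sum observed by the consumer.
inline long long spsc_round_trip(int count) {
    SpscRing<int, 8> q;
    long long sum = 0;
    std::thread consumer([&] {
        for (int i = 0; i < count; ++i) {
            int v = 0;
            while (!q.try_pop(v)) {} // busy-wait, no OS blocking
            sum += v;
        }
    });
    for (int i = 1; i <= count; ++i)
        while (!q.try_push(i)) {} // busy-wait while the ring is full
    consumer.join(); // join makes the consumer's writes to sum visible here
    return sum;
}
```

The release store in `try_push` paired with the acquire load in `try_pop` is what lets the consumer see the element contents, which is the same kind of synchronize-with relationship the README describes for its own `push`/`pop` operations. One slot is left unused here so that full and empty can be told apart; the real library may well handle this differently.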
@@ -88,8 +106,12 @@ The containers support the following APIs:
* `was_full` - Returns `true` if the container was full during the call. The state may have changed by the time the return value is examined.
* `capacity` - Returns the maximum number of elements the queue can possibly hold.
+_Atomic elements_ are those, for which [`std::atomic<T>{T{}}.is_lock_free()`][10] returns `true`, and, when C++17 features are available, [`std::atomic<T>::is_always_lock_free`][16] evaluates to `true` at compile time. In other words, the CPU can load, store and compare-and-exchange such elements atomically natively. On x86-64 such elements are all the [C++ standard arithmetic and pointer types][11]. The queues for atomic elements reserve one value to serve as an empty element marker `NIL`, its default value is `0`. `NIL` value must not be pushed into a queue and there is an [`assert`][13] statement in `push` functions to guard against that in debug mode builds. Pushing `NIL` element into a queue in release mode builds results in undefined behaviour, such as deadlocks and/or lost queue elements.
Note that _optimism_ is a choice of a queue modification operation control flow, rather than a queue type. An _optimist_ `push` is fastest when the queue is not full most of the time, an optimistic `pop` - when the queue is not empty most of the time. Optimistic and not so operations can be mixed with no restrictions. The `OptimistAtomicQueue`s in [the benchmarks][1] use only _optimist_ `push` and `pop`.
+`push` and `try_push` operations _synchronize-with_ (as defined in [`std::memory_order`][17]) with any subsequent `pop` or `try_pop` operation of the same queue object.
See [example.cc](src/example.cc) for a usage example.
TODO: full API reference.
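Illustration (not part of the commit): the README hunk above defines _atomic elements_ through `std::atomic<T>::is_lock_free()` and, with C++17, `is_always_lock_free`, and it reserves one `NIL` value (default `0`) that must not be pushed. A minimal sketch of both checks using only the standard library follows. The helper names `is_atomic_element` and `is_pushable` are hypothetical and do not exist in atomic_queue.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: run-time check mirroring the README's definition
// of an atomic element.
template<class T>
bool is_atomic_element() {
    return std::atomic<T>{T{}}.is_lock_free();
}

#if __cplusplus >= 201703L
// Compile-time variant, available from C++17 onwards.
static_assert(std::atomic<int>::is_always_lock_free,
              "int is expected to be lock-free on mainstream platforms");
#endif

// Hypothetical helper: a value equal to the reserved NIL marker must not be
// pushed into a queue for atomic elements. NIL defaults to T{} here, which
// matches the README's stated default of 0 for arithmetic types.
template<class T>
bool is_pushable(T value, T nil = T{}) {
    return value != nil;
}
```

A caller that cannot rule out `0` as a legitimate element would, under these assumptions, either choose a different `NIL` value or use one of the queues for non-atomic elements, which the README does not describe as reserving a marker.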
@@ -130,7 +152,7 @@ There are a few OS behaviours that complicate benchmarking:
* CPU scheduler can place threads on different CPU cores each run. To avoid that the threads are pinned to specific CPU cores.
* CPU scheduler can preempt threads. To avoid that real-time `SCHED_FIFO` priority 50 is used to disable scheduler time quantum expiry and make the threads non-preemptable by lower priority processes/threads.
* Real-time thread throttling disabled.
-* Adverse address space randomisation may cause extra CPU cache conflicts. To minimise effects of that `benchmarks` executable is run at least 33 times and then the results with the highest throughput / lowest latency are selected.
+* Adverse address space randomisation may cause extra CPU cache conflicts, as well as other processes running on the system. To minimise effects of that `benchmarks` executable is run at least 33 times. The benchmark charts show the average; the standard deviation, minimum and maximum values are shown in the chart tooltips.
I only have access to a few x86-64 machines. If you have access to different hardware feel free to submit the output file of `scripts/run-benchmarks.sh` and I will include your results into the benchmarks page.
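Illustration (not part of the commit): the updated README text above says the benchmark charts now show the average of at least 33 runs, with the standard deviation, minimum and maximum in the tooltips. A minimal sketch of such a summary is shown below. The `Summary` struct and `summarize` function are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the source does not say which estimator the benchmark scripts use.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical summary record for one benchmark series.
struct Summary {
    double min;
    double max;
    double mean;
    double stddev;
};

// Computes min, max, mean and the sample standard deviation of the runs.
// Assumption: n - 1 denominator; a single run yields a deviation of 0.
inline Summary summarize(std::vector<double> const& runs) {
    assert(!runs.empty());
    auto const mm = std::minmax_element(runs.begin(), runs.end());
    double const n = static_cast<double>(runs.size());

    double sum = 0;
    for (double r : runs)
        sum += r;
    double const mean = sum / n;

    double squares = 0;
    for (double r : runs)
        squares += (r - mean) * (r - mean);
    double const stddev = runs.size() > 1 ? std::sqrt(squares / (n - 1)) : 0.0;

    return Summary{*mm.first, *mm.second, mean, stddev};
}
```

Reporting the spread alongside the mean makes it easier to judge whether a difference between two queues exceeds run-to-run noise, which the earlier best-of-N selection could not show.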
@@ -140,6 +162,7 @@ When huge pages are available the benchmarks use 1x1GB or 16x2MB huge pages for
sudo hugeadm --pool-pages-min 1GB:1 --pool-pages-max 1GB:1
sudo hugeadm --pool-pages-min 2MB:16 --pool-pages-max 2MB:16
+Alternatively, you may like to enable [transparent hugepages][15] in your system and use a hugepage-aware allocator, such as [tcmalloc][14].
### Real-time thread throttling
By default, Linux scheduler throttles real-time threads from consuming 100% of CPU and that is detrimental to benchmarking. Full details can be found in [Real-Time group scheduling][2]. To disable real-time thread throttling do:
@@ -171,3 +194,9 @@ Copyright (c) 2019 Maxim Egorushkin. MIT License. See the full licence in file L
[9]: https://stackoverflow.com/a/25168942/412080
[10]: https://en.cppreference.com/w/cpp/atomic/atomic/is_lock_free
[11]: https://en.cppreference.com/w/cpp/language/type
+[12]: https://en.cppreference.com/w/cpp/types/is_arithmetic
+[13]: https://en.cppreference.com/w/cpp/error/assert
+[14]: https://google.github.io/tcmalloc/temeraire.html
+[15]: https://www.kernel.org/doc/html/latest/admin-guide/mm/transhuge.html
+[16]: https://en.cppreference.com/w/cpp/atomic/atomic/is_always_lock_free
+[17]: https://en.cppreference.com/w/cpp/atomic/memory_order
@@ -1,3 +1,25 @@
+libatomic-queue (0.0+git20230629.b770bb2-1) UNRELEASED; urgency=medium
+ [ Nilesh Patra ]
+ * [ci skip] Remove myself from uploaders
+ [ Étienne Mollier ]
+ * New upstream version.
+ * Standards-Version: 4.6.2 (routine-update)
+ * Remove field Section on binary package libatomic-queue0 that duplicates
+ source.
+ * generate-shared-library.patch: refresh.
+ * no-native patch: refresh.
+ * no_thin_archives.patch: refresh.
+ * compiler.patch: refresh.
+ * concurrentqueue.patch: new: fix include issue.
+ + Still ftbfs with gcc-13; issue open upstream[1].
+ + Upstream moved to proper git tags; d/watch will benefit from adjustments.
+ [1]: https://github.com/max0x7ba/atomic_queue/issues/55
+ -- Étienne Mollier <emollier at debian.org> Tue, 18 Jul 2023 14:12:29 +0200
libatomic-queue (0.0+git20220518.83774a2-1) unstable; urgency=medium
* Team upload.
@@ -12,7 +12,7 @@ Build-Depends: debhelper-compat (= 13),
-Standards-Version: 4.6.1
+Standards-Version: 4.6.2
Vcs-Browser: https://salsa.debian.org/med-team/libatomic-queue
Vcs-Git: https://salsa.debian.org/med-team/libatomic-queue.git
Homepage: https://github.com/max0x7ba/atomic_queue
@@ -20,7 +20,6 @@ Rules-Requires-Root: no
Package: libatomic-queue0
Architecture: any
-Section: libs
Depends: ${shlibs:Depends},
Description: C++ atomic_queue library
@@ -8,7 +8,7 @@ Last-Update: 2022-07-01
This patch header follows DEP-3: http://dep.debian.net/deps/dep3/
--- libatomic-queue.orig/Makefile
+++ libatomic-queue/Makefile
-@@ -22,8 +22,8 @@
+@@ -23,8 +23,8 @@
ld.clang := clang++
ar.clang := ar
@@ -0,0 +1,22 @@
+Description: fix concurrentqueue.h import
+ For some reason, possibly Debian specific, the concurrentqueue include has
+ a moodycamel namespace straight in the libconcurrentqueue-dev path, which is
+ not compatible with what is currently specified in upstream source code. It's
+ a bit unclear right now whether the issue is in concurrentqueue or in the
+ libatomic-queue.
+Author: Étienne Mollier <emollier at debian.org>
+Forwarded: no
+Last-Update: 2023-07-18
+This patch header follows DEP-3: http://dep.debian.net/deps/dep3/
+--- libatomic-queue.orig/src/moodycamel.h
++++ libatomic-queue/src/moodycamel.h
+@@ -6,7 +6,7 @@
+ #include "benchmarks.h"
+-#include <concurrentqueue/concurrentqueue.h>
++#include <concurrentqueue/moodycamel/concurrentqueue.h>
+ #include <readerwriterqueue/readerwriterqueue.h>
+ #include "atomic_queue/defs.h"
@@ -3,9 +3,9 @@ Author: Nilesh Patra <npatra974 at gmail.com>,
Last-Update: Fri, 23 Oct 2020 22:10:01 +0200
Description: Fix unused variable
---- a/Makefile
-+++ b/Makefile
-@@ -10,6 +10,7 @@ BUILD := release
+--- libatomic-queue.orig/Makefile
++++ libatomic-queue/Makefile
+@@ -11,6 +11,7 @@
TOOLSET := gcc
build_dir := ${CURDIR}/build/${BUILD}/${TOOLSET}
@@ -13,7 +13,7 @@ Description: Fix unused variable
cxx.gcc := g++
cc.gcc := gcc
-@@ -54,21 +55,30 @@ ldlibs.moodycamel :=
+@@ -60,21 +61,30 @@
cppflags.xenium := -I${abspath ../xenium}
ldlibs.xenium :=
@@ -46,7 +46,7 @@ Description: Fix unused variable
benchmarks_src := benchmarks.cc cpu_base_frequency.cc huge_pages.cc
${build_dir}/benchmarks : cppflags += ${cppflags.tbb} ${cppflags.moodycamel} ${cppflags.xenium}
${build_dir}/benchmarks : ldlibs += ${ldlibs.tbb} ${ldlibs.moodycamel} ${ldlibs.xenium} -ldl
-@@ -88,9 +98,10 @@ ${build_dir}/example : ${example_src:%.c
+@@ -94,9 +104,10 @@
$(strip ${LINK.EXE})
-include ${example_src:%.cc=${build_dir}/%.d}
@@ -60,7 +60,7 @@ Description: Fix unused variable
${build_dir}/%.a : Makefile | ${build_dir}
$(strip ${LINK.A})
-@@ -115,6 +126,13 @@ ${build_dir}/%.o : src/%.cc Makefile | $
+@@ -121,6 +132,13 @@
${build_dir}/%.o : src/%.c Makefile | ${build_dir}
$(strip ${COMPILE.C})
@@ -74,7 +74,7 @@ Description: Fix unused variable
%.S : cppflags += ${cppflags.tbb} ${cppflags.moodycamel} ${cppflags.xenium}
%.S : src/%.cc Makefile | ${build_dir}
$(strip ${COMPILE.S})
-@@ -125,11 +143,14 @@ ${build_dir}/%.o : src/%.c Makefile | ${
+@@ -131,11 +149,14 @@
${build_dir} :
mkdir -p $@
@@ -4,25 +4,29 @@ Bug-Debian: https://bugs.debian.org/987532
Forwarded: not-needed
It violates Debian's architectual baseline and causes reproducibilty problems
---- a/Makefile
-+++ b/Makefile
-@@ -28,14 +28,14 @@ LD := ${ld.${TOOLSET}}
+--- libatomic-queue.orig/Makefile
++++ libatomic-queue/Makefile
+@@ -29,18 +29,18 @@
AR := ${ar.${TOOLSET}}
cxxflags.gcc.debug := -Og -fstack-protector-all -fno-omit-frame-pointer # -D_GLIBCXX_DEBUG
-cxxflags.gcc.release := -O3 -mtune=native -ffast-math -falign-{functions,loops}=64 -DNDEBUG
--cxxflags.gcc := -pthread -march=native -std=gnu++14 -W{all,extra,error,no-{maybe-uninitialized,unused-variable,unused-function,unused-local-typedefs}} -g -fmessage-length=0 ${cxxflags.gcc.${BUILD}}
+cxxflags.gcc.release := -O3 -ffast-math -falign-{functions,loops}=64 -DNDEBUG
+ cxxflags.gcc.sanitize := ${cxxflags.gcc.release} -fsanitize=thread
+-cxxflags.gcc := -pthread -march=native -std=gnu++14 -W{all,extra,error,no-{maybe-uninitialized,unused-variable,unused-function,unused-local-typedefs}} -g -fmessage-length=0 ${cxxflags.gcc.${BUILD}}
+cxxflags.gcc := -pthread -std=gnu++14 -W{all,extra,error,no-{maybe-uninitialized,unused-variable,unused-function,unused-local-typedefs}} -g -fmessage-length=0 ${cxxflags.gcc.${BUILD}}
+ ldflags.gcc.sanitize := ${ldflags.gcc.release} -fsanitize=thread
+ ldflags.gcc := ${ldflags.gcc.${BUILD}}
-cflags.gcc := -pthread -march=native -W{all,extra} -g -fmessage-length=0 ${cxxflags.gcc.${BUILD}}
+cflags.gcc := -pthread -W{all,extra} -g -fmessage-length=0 ${cxxflags.gcc.${BUILD}}
cxxflags.clang.debug := -O0 -fstack-protector-all
-cxxflags.clang.release := -O3 -mtune=native -ffast-math -falign-functions=64 -DNDEBUG
--cxxflags.clang := -stdlib=libstdc++ -pthread -march=native -std=gnu++14 -W{all,extra,error,no-{unused-variable,unused-function,unused-local-typedefs}} -g -fmessage-length=0 ${cxxflags.clang.${BUILD}}
+cxxflags.clang.release := -O3 -ffast-math -falign-functions=64 -DNDEBUG
-+cxxflags.clang := -stdlib=libstdc++ -pthread -std=gnu++14 -W{all,extra,error,no-{unused-variable,unused-function,unused-local-typedefs}} -g -fmessage-length=0 ${cxxflags.clang. ${BUILD}}
+ cxxflags.clang.sanitize := ${cxxflags.clang.release} -fsanitize=thread
+-cxxflags.clang := -stdlib=libstdc++ -pthread -march=native -std=gnu++14 -W{all,extra,error,no-{unused-variable,unused-function,unused-local-typedefs}} -g -fmessage-length=0 ${cxxflags.clang.${BUILD}}
++cxxflags.clang := -stdlib=libstdc++ -pthread -std=gnu++14 -W{all,extra,error,no-{unused-variable,unused-function,unused-local-typedefs}} -g -fmessage-length=0 ${cxxflags.clang.${BUILD}}
+ ldflags.clang.sanitize := ${ldflags.clang.release} -fsanitize=thread
ldflags.clang := -stdlib=libstdc++ ${ldflags.clang.${BUILD}}
- # Additional CPPFLAGS, CXXFLAGS, CFLAGS, LDLIBS, LDFLAGS can come from the command line, e.g. make CPPFLAGS='-I<my-include-dir>', or from environment variables.
@@ -4,9 +4,9 @@ Description: The build system is set up to produce "thin" archives
that can't stand on their own
Origin: https://lists.debian.org/debian-med/2021/12/msg00131.html
---- a/Makefile
-+++ b/Makefile
-@@ -62,7 +62,7 @@ PREPROCESS.CXX = ${CXX} -o $@ -E ${cppfl
+--- libatomic-queue.orig/Makefile
++++ libatomic-queue/Makefile
+@@ -68,7 +68,7 @@
COMPILE.C = ${CC} -o $@ -c ${cppflags} ${cflags} -MD -MP $(abspath $<)
LINK.EXE = ${LD} -o $@ $(ldflags) $(filter-out Makefile,$^) $(ldlibs)
LINK.SO = ${LD} -o $@.$(SOVERSION) -shared -Wl,-soname,`basename $@`.$(SOVERSION) $(ldflags) $(filter-out Makefile,$^) $(ldlibs)
@@ -3,3 +3,4 @@ generate-shared-library.patch
@@ -1,27 +1,62 @@
body {
+ visibility: hidden;
background-color: black;
color: white;
font-family: 'Roboto Slab', sans-serif;
+ overflow-y: scroll;
+h1, h2, h3 {
+ margin-left: 20px;
+p, li {
+ color: #A0A0A0;
+ margin-left: 20px;
+ul {
+ margin-top: .5em;
+ margin-bottom: 1em;
div.chart {
height: 500px;
- margin-top: 2em;
+ margin-bottom: 1em;
p.copyright {
color: #A0A0A0;
- text-align: left;
font-size: 0.8em;
-h1, h2, h3 {
- margin-left: 20px;
+h1.homepage {
+ margin-bottom: 0em;
-p, li {
+p.homepage {
color: #A0A0A0;
- margin-left: 20px;
+ margin-top: 0em;
+ font-weight: bold;
+.view-toggle {
+ cursor: pointer;
+ margin-bottom: 0em;
+ user-select: none;
+h3.view-toggle {
+ margin-left: 24px;
+ margin-top: 0em;
+svg.arrow-down-circle {
+ fill: currentColor;
+ width: 1em;
+ height: 1em;
+ color: #A0A0A0;
+ vertical-align: -.1em;
span.tooltip_scalability_title {
@@ -13,7 +13,7 @@
<link href="https://fonts.googleapis.com/css?family=Roboto+Slab:400,700&display=swap" rel="stylesheet">
<link rel="stylesheet" href="benchmarks.css">
- <script src="https://code.jquery.com/jquery-3.4.1.slim.min.js" integrity="sha256-pasqAKBDmFT4eHoN2ndd6lN370kFiGUFyTiUHWhU7k8=" crossorigin="anonymous"></script>
+ <script src="https://code.jquery.com/jquery-3.6.0.min.js" integrity="sha256-/xUj+3OJU5yExlq6GSYGSHk7tPXikynS7ogEvDej/m4=" crossorigin="anonymous"></script>
<script src="https://code.highcharts.com/highcharts.js"></script>
<script src="https://code.highcharts.com/highcharts-more.js"></script>
<script src="https://code.highcharts.com/modules/pattern-fill.js"></script>
@@ -23,52 +23,76 @@
<title>Scalaibilty and Latency Benchmarks</title>
- <h1>Scalability Benchmark</h1>
- <p>N producer threads push a 4-byte integer into one same queue, N consumer threads pop the integers from the queue. All producers posts 1,000,000 messages in total. Total time to send and receive all the messages is measured. The benchmark is run for from 1 producer and 1 consumer up to (total-number-of-cpus / 2) producers/consumers to measure the scalabilty of different queues.</p>
- <div class="chart" id="scalability-9900KS-5GHz"></div>
- <div class="chart" id="scalability-xeon-gold-6132"></div>
- <div class="chart" id="scalability-ryzen-5950x"></div>
- <h1>Latency Benchmark</h1>
- <p>One thread posts a 4-byte integer to another thread through one queue and waits for a reply from another queue (2 queues in total). The benchmark measures the total time of 100,000 ping-pongs, best of 10 runs. Contention is minimal here (1-producer-1-consumer, 1 element in the queue) to be able to achieve and measure the lowest latency. Reports the average round-trip time.</p>
- <div class="chart" id="latency-9900KS-5GHz"></div>
- <div class="chart" id="latency-xeon-gold-6132"></div>
- <div class="chart" id="latency-ryzen-5950x"></div>
- <h2>Systems details</h2>
- <h3>Intel i9-9900KS system</h3>
- <ul>
- <li>OS: Ubuntu-18.04.4 LTS
- <li>Compiler: gcc-8.4.0
- <li>atomic_queue version: commit 7e138d21fcd4bad95e030d8d6c8b77d5a4538baa
- <li>Boost version: 1.65.1
- <li>TBB version: 2019_U7, commit 4233fef583b4f8cbf9f781311717600feaaa0694
- <li>moodycamel concurrentqueue version: commit dea078cf5b6e742cd67a0d725e36f872feca4de4
- <li>moodycamel readerwriterqueue version: commit 2ae710de996a1d02bbc7696b2cdff2c6078e76f8
- <li>xenium library version: commit f6416d30043a7d025405038d5ddd4794aaaab4a3
- </ul>
- <h3>Intel Xeon Gold 6132 system</h3>
- <ul>
- <li>OS: Red Hat Enterprise Linux Server release 6.10 (Santiago)
- <li>Compiler: gcc-8.4.0
- <li>atomic_queue version: commit 7e138d21fcd4bad95e030d8d6c8b77d5a4538baa
- <li>Boost version: 1.65.1
- <li>TBB version: 2019_U7, commit 4233fef583b4f8cbf9f781311717600feaaa0694
- <li>moodycamel concurrentqueue version: commit dea078cf5b6e742cd67a0d725e36f872feca4de4
- <li>moodycamel readerwriterqueue version: commit 2ae710de996a1d02bbc7696b2cdff2c6078e76f8
- <li>xenium library version: commit f6416d30043a7d025405038d5ddd4794aaaab4a3
- </ul>
- <h3>AMD Ryzen 9 5950X system</h3>
- <ul>
- <li>OS: KDE Neon based on Ubuntu 20.04 and XanMod Kernel 5.10 LTS.
- <li>Compiler: gcc-9.3.0
- <li>atomic_queue version: commit e02078c14cab70f0df594ea3406f1240297e11d7
- <li>Boost version: 1.71.0
- <li>TBB version: 2019_U7, commit 4233fef583b4f8cbf9f781311717600feaaa0694
- <li>moodycamel concurrentqueue version: commit dea078cf5b6e742cd67a0d725e36f872feca4de4
- <li>moodycamel readerwriterqueue version: commit 2ae710de996a1d02bbc7696b2cdff2c6078e76f8
- <li>xenium library version: commit f6416d30043a7d025405038d5ddd4794aaaab4a3
- </ul>
- <h3>Source Code</h3>
- <p><a href="https://github.com/max0x7ba/atomic_queue">github.com/max0x7ba/atomic_queue</a></p>
- <p class="copyright">Copyright (c) 2019 Maxim Egorushkin. MIT License. See the full licence in file LICENSE.</p>
+ <h1 class="view-toggle">Scalability Benchmark</h1>
+ <div>
+ <p>N producer threads push a 4-byte integer into one same queue, N consumer threads pop the integers from the queue. All producers posts 1,000,000 messages in total. Total time to send and receive all the messages is measured. The benchmark is run for from 1 producer and 1 consumer up to (total-number-of-cpus / 2) producers/consumers to measure the scalabilty of different queues. The minimum, maximum, mean and standard deviation of at least 33 runs are reported in the tooltip.</p>
+ <h3 class="view-toggle">Scalability on Intel i9-9900KS</h3><div class="chart" id="scalability-9900KS-5GHz"></div>
+ <h3 class="view-toggle">Scalability on AMD Ryzen 7 5825U</h3><div class="chart" id="scalability-ryzen-5825u"></div>
+ <h3 class="view-toggle">Scalability on Intel Xeon Gold 6132</h3><div class="chart" id="scalability-xeon-gold-6132"></div>
+ <h3 class="view-toggle">Scalability on AMD Ryzen 9 5950X</h3><div class="chart" id="scalability-ryzen-5950x"></div>
+ </div>
+ <h1 class="view-toggle">Latency Benchmark</h1>
+ <div>
+ <p>One thread posts a 4-byte integer to another thread through one queue and waits for a reply from another queue (2 queues in total). The benchmark measures the total time of 100,000 ping-pongs, best of 10 runs. Contention is minimal here (1-producer-1-consumer, 1 element in the queue) to be able to achieve and measure the lowest latency. Reports the average round-trip time, i.e. the time it takes to post a message to another thread and receive a reply. The minimum, maximum, mean and standard deviation of at least 33 runs are reported in the tooltip.</p>
+ <h3 class="view-toggle">Latency on Intel i9-9900KS</h3><div class="chart" id="latency-9900KS-5GHz"></div>
+ <h3 class="view-toggle">Latency on AMD Ryzen 7 5825U</h3><div class="chart" id="latency-ryzen-5825u"></div>
+ <h3 class="view-toggle">Latency on Intel Xeon Gold 6132</h3><div class="chart" id="latency-xeon-gold-6132"></div>
+ <h3 class="view-toggle">Latency on AMD Ryzen 9 5950X</h3><div class="chart" id="latency-ryzen-5950x"></div>
+ </div>
+ <h1 class="view-toggle">Systems details</h1>
+ <div>
+ <h3 class="view-toggle">Intel i9-9900KS system</h3>
+ <ul>
+ <li>OS: Kubuntu-18.04.4 LTS
+ <li>Compiler: gcc-8.4.0
+ <li>atomic_queue version: commit 7e138d21fcd4bad95e030d8d6c8b77d5a4538baa
+ <li>Boost version: 1.65.1
+ <li>TBB version: 2019_U7, commit 4233fef583b4f8cbf9f781311717600feaaa0694
+ <li>moodycamel concurrentqueue version: commit dea078cf5b6e742cd67a0d725e36f872feca4de4
+ <li>moodycamel readerwriterqueue version: commit 2ae710de996a1d02bbc7696b2cdff2c6078e76f8
+ <li>xenium library version: commit f6416d30043a7d025405038d5ddd4794aaaab4a3
+ </ul>
+ <h3 class="view-toggle">AMD Ryzen 7 5825U system</h3>
+ <ul>
+ <li>OS: Kubuntu 22.04 LTS
+ <li>Compiler: gcc-11.2.0
+ <li>atomic_queue version: commit 7d75e9ed0359650224b29cdf6728c5fe0a19fffb
+ <li>Boost version: 1.74.0
+ <li>TBB version: 2021.5.0
+ <li>moodycamel concurrentqueue version: commit dea078cf5b6e742cd67a0d725e36f872feca4de4
+ <li>moodycamel readerwriterqueue version: commit 2ae710de996a1d02bbc7696b2cdff2c6078e76f8
+ <li>xenium library version: commit f6416d30043a7d025405038d5ddd4794aaaab4a3
+ </ul>
+ <h3 class="view-toggle">Intel Xeon Gold 6132 system</h3>
+ <ul>
+ <li>OS: Red Hat Enterprise Linux Server release 6.10 (Santiago)
+ <li>Compiler: gcc-8.4.0
+ <li>atomic_queue version: commit 7e138d21fcd4bad95e030d8d6c8b77d5a4538baa
+ <li>Boost version: 1.65.1
+ <li>TBB version: 2019_U7, commit 4233fef583b4f8cbf9f781311717600feaaa0694
+ <li>moodycamel concurrentqueue version: commit dea078cf5b6e742cd67a0d725e36f872feca4de4
+ <li>moodycamel readerwriterqueue version: commit 2ae710de996a1d02bbc7696b2cdff2c6078e76f8
+ <li>xenium library version: commit f6416d30043a7d025405038d5ddd4794aaaab4a3
+ </ul>
+ <h3 class="view-toggle">AMD Ryzen 9 5950X system</h3>
+ <ul>
+ <li>OS: KDE Neon based on Ubuntu 20.04 and XanMod Kernel 5.10 LTS.
+ <li>Compiler: gcc-9.3.0
+ <li>atomic_queue version: commit e02078c14cab70f0df594ea3406f1240297e11d7
+ <li>Boost version: 1.71.0
+ <li>TBB version: 2019_U7, commit 4233fef583b4f8cbf9f781311717600feaaa0694
+ <li>moodycamel concurrentqueue version: commit dea078cf5b6e742cd67a0d725e36f872feca4de4
+ <li>moodycamel readerwriterqueue version: commit 2ae710de996a1d02bbc7696b2cdff2c6078e76f8
+ <li>xenium library version: commit f6416d30043a7d025405038d5ddd4794aaaab4a3
+ </ul>
+ </div>
+ <h1 class="view-toggle homepage">Homepage</h1>
+ <div>
+ <p class="homepage"><a href="https://github.com/max0x7ba/atomic_queue">github.com/max0x7ba/atomic_queue</a></p>
+ <p class="copyright">Copyright (c) 2019 Maxim Egorushkin. MIT License. See the full licence in file LICENSE.</p>
+ </div>
@@ -35,7 +35,7 @@ $(function() {
"OptimistAtomicQueueB2": ['#FFBFBF', 18]
- function plot_scalability(div_id, results, title_suffix, max_lin, max_log) {
+ function plot_scalability(div_id, results, max_lin, max_log) {
const modes = [
{type: 'linear', title: { text: 'throughput, msg/sec (linear scale)'}, max: max_lin, min: 0 },
{type: 'logarithmic', title: { text: 'throughput, msg/sec (logarithmic scale)'}, max: max_log, min: 100e3},
@@ -87,13 +87,10 @@ $(function() {
const chart = Highcharts.chart(div_id, {
chart: {
events: {
- click: function() {
- mode ^= 1;
- chart.yAxis[0].update(modes[mode]);
- }
+ click: function() { this.yAxis[0].update(modes[mode ^= 1]); }
- title: { text: 'Scalability on ' + title_suffix },
+ title: { text: undefined },
subtitle: { text: "click on the chart background to switch between linear and logarithmic scales" },
xAxis: {
title: { text: 'number of producers, number of consumers' },
@@ -110,7 +107,7 @@ $(function() {
- function plot_latency(div_id, results, title_suffix) {
+ function plot_latency(div_id, results) {
const series = Object.entries(results).map(entry => {
const [name, stats] = entry;
const s = settings[name];
@@ -140,7 +137,7 @@ $(function() {
series: { stacking: 'normal'},
bar: { dataLabels: { enabled: true, align: 'left', inside: false } }
- title: { text: 'Latency on ' + title_suffix },
+ title: { text: undefined },
xAxis: { categories: categories },
yAxis: {
title: { text: 'latency, nanoseconds/round-trip' },
@@ -158,13 +155,33 @@ $(function() {
const scalability_9900KS = {"AtomicQueue": [[1, 52660493, 286258811, 74231130, 46923128], [2, 11670323, 12511844, 12011858, 270810], [3, 9791407, 10870735, 10354387, 423144], [4, 8124141, 8262334, 8192020, 23767], [5, 7882302, 8164594, 8058345, 45565], [6, 7536832, 7993441, 7709403, 113618], [7, 7011413, 8020563, 7552220, 427030], [8, 6291117, 7515622, 6885968, 545237]], "AtomicQueue2": [[1, 22787102, 61696929, 23153888, 2262406], [2, 11251529, 12267302, 11657086, 212493], [3, 9250720, 10001213, 9472512, 131865], [4, 7958528, 8157226, 8055508, 33266], [5, 7784153, 8097440, 7972636, 61800], [6, 7450035, 7952026, 7641924, 130961], [7, 7005546, 7995642, 7509325, 381599], [8, 6349759, 7441272, 6854003, 471089]], "AtomicQueueB": [[1, 42613077, 228034973, 48968374, 17271281], [2, 11307287, 12122517, 11654762, 192294], [3, 9978460, 11117123, 10580691, 418664], [4, 7820303, 8149391, 8038875, 49723], [5, 7393617, 7922868, 7706848, 116543], [6, 7044646, 7623977, 7432887, 119697], [7, 6771050, 7812016, 7300722, 426304], [8, 6167485, 7214447, 6685564, 449080]], "AtomicQueueB2": [[1, 31747483, 44550020, 34684489, 1949026], [2, 11004660, 11624801, 11264944, 159388], [3, 9311302, 9898647, 9585552, 81750], [4, 7583514, 8026821, 7885529, 68419], [5, 7318917, 7806120, 7600268, 122098], [6, 7004711, 7518179, 7348211, 105453], [7, 6760542, 7775829, 7294366, 408721], [8, 6203358, 7175857, 6682430, 396215]], "OptimistAtomicQueue": [[1, 487380322, 829842979, 661556071, 100346674], [2, 31797501, 32761745, 32437895, 262498], [3, 36537452, 37548890, 37008138, 364848], [4, 39195547, 39453579, 39332552, 57506], [5, 37390896, 48677211, 44454166, 2490283], [6, 41443858, 50559092, 46326029, 3930139], [7, 43825547, 53156863, 48061575, 3621601], [8, 46177415, 50602252, 47828080, 1452954]], "OptimistAtomicQueue2": [[1, 25703634, 682547965, 230538256, 211766068], [2, 21661800, 29516399, 24851671, 1493004], [3, 29291342, 33834235, 30273240, 524342], [4, 32920458, 36241653, 33343018, 441670], [5, 
36830993, 43357072, 38976054, 1862089], [6, 39747081, 49741386, 44704047, 4504426], [7, 42479711, 51839802, 46362844, 3648632], [8, 43732450, 49877392, 46347786, 2371894]], "OptimistAtomicQueueB": [[1, 75661057, 738447042, 124305321, 83621261], [2, 31477141, 32474220, 32144227, 176354], [3, 36019269, 37037279, 36563374, 322208], [4, 38357209, 38905937, 38647013, 72549], [5, 36246828, 47608460, 43165102, 2491292], [6, 39494986, 49368578, 44976208, 4044505], [7, 41252863, 51655899, 46076590, 4108616], [8, 43899112, 49215349, 46213653, 1857294]], "OptimistAtomicQueueB2": [[1, 31441458, 495211858, 59246349, 27593701], [2, 21826376, 29825513, 26058597, 2081213], [3, 28756903, 34057706, 29794288, 839909], [4, 31084544, 33672715, 32858135, 485076], [5, 33366524, 40347303, 36955446, 2416293], [6, 36837801, 42786274, 39860539, 2457925], [7, 39946444, 45751323, 42359860, 2112179], [8, 41740252, 46736438, 43950268, 1704291]], "boost::lockfree::queue": [[1, 6746684, 8277185, 7092878, 418709], [2, 7312023, 7803259, 7553075, 87733], [3, 7263517, 7648842, 7476500, 91860], [4, 6359882, 7098293, 6610597, 192715], [5, 6367348, 6773852, 6457372, 46054], [6, 5927503, 6298061, 6055700, 68494], [7, 5746691, 6154693, 5964947, 83543], [8, 5331463, 5801836, 5535251, 89204]], "boost::lockfree::spsc_queue": [[1, 64923339, 78317500, 69086959, 2160846]], "moodycamel::ConcurrentQueue": [[1, 20190901, 29453011, 24985741, 1594915], [2, 14337151, 52431952, 16261043, 4078346], [3, 15291705, 43648056, 17046353, 4143492], [4, 15736506, 45837232, 18228886, 5125409], [5, 16888207, 47841058, 19245549, 5379950], [6, 16998837, 63384866, 20186438, 6382091], [7, 17716036, 66347129, 21038132, 6921929], [8, 17924728, 64375322, 22382013, 8285161]], "moodycamel::ReaderWriterQueue": [[1, 43356419, 538733018, 256503633, 185340411]], "pthread_spinlock": [[1, 23507277, 29932694, 27413691, 1797342], [2, 14270085, 18312194, 16382070, 769144], [3, 8211868, 12289865, 10189163, 1848412], [4, 6395961, 9383867, 7773828, 
1275888], [5, 8442872, 10466994, 9009726, 423856], [6, 8112952, 9328919, 8527056, 234738], [7, 7189956, 8492547, 7685023, 190137], [8, 6576974, 7596251, 6917365, 230403]], "std::mutex": [[1, 5006882, 9199394, 6838493, 652022], [2, 4687459, 6598427, 5749404, 387982], [3, 4580302, 6900299, 5685428, 464037], [4, 4941923, 7100935, 6086683, 325998], [5, 5151696, 6739344, 5986755, 186929], [6, 5521016, 6571707, 5918632, 116062], [7, 5532592, 6378700, 5826170, 88618], [8, 5438188, 6181434, 5704761, 76268]], "tbb::concurrent_bounded_queue": [[1, 10925661, 14807665, 13187267, 1088087], [2, 12352037, 15166768, 13521906, 612838], [3, 11099805, 12535211, 11630738, 279433], [4, 9929811, 10656023, 10303443, 177287], [5, 9349138, 10217187, 9704186, 183365], [6, 8548656, 9516659, 8863967, 196987], [7, 7358384, 8693321, 7958661, 218257], [8, 6615544, 8013655, 7136724, 350688]], "tbb::spin_mutex": [[1, 32588344, 41937261, 36432718, 2291145], [2, 17753221, 21806602, 19845873, 1357076], [3, 7201937, 11563566, 9346899, 1335282], [4, 2900531, 6495310, 4753237, 1579671], [5, 5103017, 5929302, 5552236, 189032], [6, 4254932, 5441256, 4834876, 480630], [7, 4223732, 4907625, 4560981, 246626], [8, 3338874, 4286720, 4138009, 129870]], "xenium::michael_scott_queue": [[1, 8417342, 10161353, 9493893, 327033], [2, 8230532, 8706024, 8488596, 76740], [3, 7071683, 7702336, 7404448, 172642], [4, 6177715, 6500382, 6329812, 50090], [5, 6227656, 6844074, 6487028, 190493], [6, 6408222, 7118668, 6666732, 183381], [7, 6220683, 6728490, 6410011, 115700], [8, 5906991, 6324097, 6072896, 89071]], "xenium::ramalhete_queue": [[1, 26889784, 33285933, 31963600, 729718], [2, 22883173, 24719839, 23562698, 341416], [3, 28121330, 29464259, 28838631, 366336], [4, 33312793, 34047588, 33650956, 184508], [5, 31808107, 38717573, 34327553, 2297341], [6, 33560480, 40481895, 36597565, 2593281], [7, 34734954, 42470849, 38204151, 3109357], [8, 35105293, 44944634, 39750343, 4246943]], "xenium::vyukov_bounded_queue": [[1, 
60523731, 122827707, 104853037, 23546237], [2, 17367563, 29204433, 25098906, 2910703], [3, 14333973, 16468857, 15718588, 266421], [4, 11678227, 12747022, 12409949, 196985], [5, 10112556, 11532118, 11083680, 290177], [6, 9709516, 12829017, 10969926, 1069776], [7, 9061926, 10421370, 9652587, 457388], [8, 8187699, 8591244, 8371133, 91811]]};
const scalability_xeon_gold_6132 = {"AtomicQueue": [[1, 8058966, 85486744, 19861417, 13465781], [2, 2774121, 5150399, 3716822, 529166], [3, 2234209, 3581321, 2844019, 297103], [4, 2189691, 2797820, 2500767, 141748], [5, 2000160, 2556556, 2239114, 108475], [6, 1800361, 2193952, 1967523, 85069], [7, 1339017, 2052080, 1747440, 113355], [8, 499239, 1790395, 1251368, 376126], [9, 457147, 1554831, 1065501, 317655], [10, 499701, 1497940, 933685, 296414], [11, 471438, 1317111, 758521, 284702], [12, 472731, 1223669, 645847, 211406], [13, 475966, 1051905, 607384, 154227], [14, 447298, 915959, 542223, 81608]], "AtomicQueue2": [[1, 6014132, 112250995, 11860821, 13520637], [2, 2828684, 4803110, 3861060, 547933], [3, 2370797, 3402752, 2907770, 290882], [4, 2198966, 2893203, 2481239, 168783], [5, 1922906, 2473517, 2215197, 120928], [6, 1700174, 2163119, 1957391, 98690], [7, 1584156, 1904525, 1752509, 71870], [8, 497167, 1692471, 1211725, 399956], [9, 492465, 1637918, 1032783, 355535], [10, 498320, 1502601, 894903, 322686], [11, 496862, 1287595, 740572, 255373], [12, 479471, 1142817, 669465, 220449], [13, 490420, 1087423, 564978, 132699], [14, 484859, 853987, 561566, 95000]], "AtomicQueueB": [[1, 11312440, 21089399, 14319386, 2322974], [2, 2828641, 4395539, 3598695, 363396], [3, 2383683, 3335368, 2837469, 222254], [4, 2194149, 2838158, 2479930, 155470], [5, 1961892, 2545450, 2206488, 124696], [6, 1704523, 2207219, 1965343, 113058], [7, 1400922, 2184936, 1760002, 125320], [8, 498481, 1680613, 1093922, 406887], [9, 495736, 1581164, 956214, 328532], [10, 498850, 1444846, 840343, 308105], [11, 483922, 1277870, 700261, 269404], [12, 487609, 1134736, 616528, 192809], [13, 494557, 857638, 544687, 81207], [14, 483041, 850197, 558294, 95879]], "AtomicQueueB2": [[1, 7460755, 14951085, 10960441, 1884733], [2, 2741293, 4471488, 3421984, 442894], [3, 2351790, 3354557, 2754730, 237182], [4, 2126512, 2763650, 2451035, 148674], [5, 2033646, 2434559, 2185096, 106060], [6, 1749020, 2318698, 
1968299, 112029], [7, 1352736, 1922994, 1752021, 107017], [8, 479497, 1649868, 1094885, 411721], [9, 486573, 1566955, 964595, 345537], [10, 498586, 1511963, 858856, 312525], [11, 484384, 1295858, 693007, 252815], [12, 491452, 1155658, 619410, 194677], [13, 442994, 1058050, 576966, 133949], [14, 469414, 882437, 539996, 70095]], "OptimistAtomicQueue": [[1, 56698745, 429583640, 175629468, 86409817], [2, 6408754, 11931110, 8798271, 1427113], [3, 8066359, 13129768, 10458901, 1514753], [4, 8298306, 13581897, 11250748, 1640968], [5, 8932051, 13944639, 12365031, 1196775], [6, 9446462, 14000610, 12900019, 1207077], [7, 9778505, 14314352, 13477473, 850012], [8, 9215134, 11865416, 10467114, 722175], [9, 8102279, 11617885, 10064154, 979170], [10, 7755919, 11379025, 10007986, 1069232], [11, 7809733, 11642631, 10059359, 1147829], [12, 7678745, 11785406, 10015423, 1121277], [13, 7891823, 11650001, 9852053, 1038603], [14, 7931500, 12177433, 9759040, 1154347]], "OptimistAtomicQueue2": [[1, 13352047, 166577270, 79006910, 30513135], [2, 5809820, 10117510, 7296714, 983486], [3, 7359997, 12559722, 9306742, 1644149], [4, 7729367, 12734246, 10474524, 1667974], [5, 8256529, 13316977, 11173176, 1704466], [6, 8427196, 13658790, 12145214, 1423602], [7, 8972407, 13954602, 12800483, 941189], [8, 8306345, 11031293, 10007828, 701969], [9, 7781010, 11330468, 9562517, 884767], [10, 7270803, 10842898, 9535466, 1017074], [11, 7306288, 11400679, 9630510, 1113066], [12, 7615179, 10905131, 9599169, 993126], [13, 7768507, 10951419, 9495167, 927146], [14, 7939789, 11593058, 9363004, 1002168]], "OptimistAtomicQueueB": [[1, 18005087, 461920680, 43299949, 58590278], [2, 7918458, 13244281, 10554149, 1412045], [3, 8566563, 13834992, 11664903, 1605994], [4, 8776970, 13733282, 12143773, 1339924], [5, 9080446, 14486100, 12540476, 1136728], [6, 9031510, 14144692, 12968928, 1144476], [7, 10260978, 14264523, 13401276, 578048], [8, 7860310, 11677713, 10338906, 733228], [9, 8037599, 11536671, 10046625, 980055], [10, 
7666387, 11483247, 9974741, 1077884], [11, 7773342, 11518370, 10097099, 1148028], [12, 7708761, 11962418, 10143672, 1169123], [13, 7725882, 11194790, 9873433, 1054815], [14, 7855188, 11275014, 9646028, 1118131]], "OptimistAtomicQueueB2": [[1, 11400233, 27116940, 21484544, 4456865], [2, 6565091, 11622771, 9409379, 1434258], [3, 7435746, 12559877, 10522656, 1516744], [4, 7776622, 12750010, 10260559, 1589501], [5, 7964167, 13270039, 11437117, 1346754], [6, 8849023, 13722187, 11756287, 1234538], [7, 8997751, 13835002, 12188309, 1192711], [8, 7756541, 10713723, 9591582, 747240], [9, 7314675, 11263412, 9209092, 948300], [10, 7352487, 10748888, 9264018, 1017641], [11, 7141749, 10896155, 9260621, 1076754], [12, 7063191, 10471776, 9248261, 984638], [13, 7358863, 10459869, 9071272, 961738], [14, 7490258, 10858481, 8986939, 1056811]], "boost::lockfree::queue": [[1, 1934482, 3335118, 2968513, 267417], [2, 2020556, 2714547, 2380363, 166177], [3, 1766944, 2481333, 2277536, 154223], [4, 1927815, 2468139, 2215008, 117101], [5, 1913080, 2341598, 2154795, 109277], [6, 1737937, 2239840, 2067750, 101330], [7, 1685532, 2158493, 1965928, 102944], [8, 476300, 1588449, 1057234, 312540], [9, 504256, 1466335, 882380, 236710], [10, 495183, 1249404, 733720, 210184], [11, 496163, 1173368, 615041, 163022], [12, 483550, 1080338, 576774, 125017], [13, 479449, 942173, 552191, 90608], [14, 444801, 789696, 538890, 64254]], "boost::lockfree::spsc_queue": [[1, 21589958, 35612264, 26701941, 3432048]], "moodycamel::ConcurrentQueue": [[1, 5031299, 13152497, 7231628, 2054206], [2, 3106244, 21840508, 5669989, 2480503], [3, 4039871, 18242902, 7384110, 3603375], [4, 4487792, 21071736, 8181695, 3838323], [5, 5209580, 24290350, 9672263, 5127482], [6, 5202954, 24160723, 8472347, 4567541], [7, 5415473, 26165080, 9754203, 5527832], [8, 4290069, 18526789, 7646915, 3740996], [9, 4479809, 35353993, 7585632, 6194437], [10, 4727037, 23405328, 7617742, 4615300], [11, 4631325, 30337177, 8709014, 6268210], [12, 4473005, 
27300920, 8026322, 5175124], [13, 4555975, 27789293, 8331006, 5575842], [14, 4102221, 43489396, 11921415, 9787758]], "moodycamel::ReaderWriterQueue": [[1, 12713140, 254602528, 122153284, 81114699]], "pthread_spinlock": [[1, 4306958, 8535650, 5905333, 840994], [2, 2839333, 4736775, 4053457, 456568], [3, 2548628, 3614912, 3201805, 248819], [4, 2087992, 2959824, 2605329, 165780], [5, 1983329, 2542321, 2248467, 138984], [6, 1783286, 2276326, 1986022, 112386], [7, 1536216, 2018246, 1766854, 112798], [8, 507415, 1499893, 1072692, 193480], [9, 501385, 1152617, 766700, 218876], [10, 489327, 1025270, 609721, 149499], [11, 497072, 858980, 604787, 120507], [12, 475489, 849693, 593343, 102672], [13, 463691, 888711, 574088, 96224], [14, 373441, 833012, 549424, 69983]], "std::mutex": [[1, 442267, 6858037, 5283864, 1863950], [2, 4162864, 4959039, 4478520, 180618], [3, 2575706, 3420067, 2946085, 152139], [4, 2601420, 3137460, 2858986, 96306], [5, 3392974, 3797099, 3577014, 80921], [6, 4370258, 4891290, 4579916, 108823], [7, 4837222, 6248120, 5845232, 326581], [8, 4675007, 7221265, 6303575, 552163], [9, 4517060, 6675754, 5604113, 611225], [10, 4450885, 6593358, 5396274, 618943], [11, 4666608, 6758794, 5363476, 530564], [12, 4662177, 7071927, 5362666, 566952], [13, 4496056, 7270498, 5446862, 629130], [14, 4471558, 7214091, 5489034, 703952]], "tbb::concurrent_bounded_queue": [[1, 2741938, 6390144, 4991431, 1081767], [2, 3694771, 5634833, 5092675, 420218], [3, 3475746, 4391484, 4044394, 228584], [4, 2964563, 3890751, 3477907, 203006], [5, 2600081, 3341203, 3069347, 157629], [6, 2448135, 3072604, 2752748, 131448], [7, 2331329, 2770486, 2526461, 106497], [8, 1032645, 2367531, 1609048, 398019], [9, 768399, 2133918, 1378943, 297095], [10, 886747, 1960986, 1287592, 241557], [11, 852994, 1572988, 1213625, 141077], [12, 905349, 1536817, 1207538, 119201], [13, 672137, 1425158, 1150131, 125239], [14, 568180, 1255046, 1002357, 146505]], "tbb::spin_mutex": [[1, 21210988, 25406844, 23208893, 
942349], [2, 7466066, 15461111, 13086723, 1647857], [3, 6548025, 10474300, 8916823, 708177], [4, 3503017, 7794311, 6294651, 966794], [5, 2153878, 5637630, 4544841, 631651], [6, 1922531, 4200007, 3254751, 437747], [7, 1534161, 2793915, 2246670, 284381], [8, 767030, 1603044, 1236223, 188171], [9, 664685, 1136499, 875213, 112513], [10, 503884, 920905, 710065, 93160], [11, 429966, 825839, 612632, 95126], [12, 328981, 741818, 536929, 89893], [13, 360477, 620612, 498964, 64207], [14, 343378, 562153, 446904, 49826]], "xenium::michael_scott_queue": [[1, 1770874, 4922580, 3393287, 798045], [2, 1987279, 3672290, 2760207, 374957], [3, 2000056, 2824672, 2385886, 152176], [4, 1827185, 2416437, 2127391, 115719], [5, 1702595, 2145286, 1919895, 91485], [6, 1536137, 1930985, 1748041, 79961], [7, 1426820, 1834610, 1643576, 81903], [8, 498697, 1628919, 1118063, 276128], [9, 452869, 1380436, 834411, 255185], [10, 494632, 1118414, 682696, 203418], [11, 490195, 1028229, 585071, 155611], [12, 484824, 889727, 574498, 120673], [13, 497397, 848913, 548659, 87463], [14, 498987, 845423, 541580, 77173]], "xenium::ramalhete_queue": [[1, 3243963, 16649455, 9804049, 4323515], [2, 4857860, 10891091, 6531145, 1101794], [3, 5681860, 10963393, 7152903, 886425], [4, 6453166, 11687397, 8090624, 1227694], [5, 7515932, 11465916, 8472107, 1003833], [6, 7603204, 11843149, 8816720, 1186933], [7, 7778687, 11444208, 8969099, 1200481], [8, 6620873, 8934784, 7893553, 554709], [9, 7110063, 8505487, 7938195, 307016], [10, 7332561, 8873905, 8083197, 302364], [11, 7650290, 8835820, 8195968, 282168], [12, 7663185, 8824693, 8282478, 271141], [13, 7786817, 9767663, 8710633, 459364], [14, 7888409, 11483491, 9499927, 1182102]], "xenium::vyukov_bounded_queue": [[1, 6620293, 58918128, 36338730, 16662346], [2, 3698951, 10319122, 6978079, 1806086], [3, 3321190, 5064399, 4427496, 329624], [4, 3526724, 4346643, 3923541, 164522], [5, 3316072, 3924131, 3551537, 117605], [6, 3114542, 3481877, 3279592, 91098], [7, 2784557, 
3242623, 3020950, 108825], [8, 1278721, 2800348, 1844408, 521532], [9, 1103213, 2357968, 1486304, 324785], [10, 1025767, 1973106, 1342701, 256232], [11, 732921, 1613235, 1194292, 156458], [12, 494928, 1408766, 1053087, 242590], [13, 479926, 1216268, 994219, 184954], [14, 433322, 1122701, 804412, 232255]]};
const scalability_ryzen_5950x = {"AtomicQueue": [[1, 21295543, 35842740, 28172806, 2479430], [2, 3394180, 3469545, 3429817, 31948], [3, 2490960, 2650028, 2569420, 72586], [4, 1958844, 2071203, 2015194, 50782], [5, 959794, 1321468, 1126614, 166128], [6, 869656, 1175329, 1000005, 125300], [7, 855015, 965053, 908419, 36067], [8, 844334, 912315, 876586, 28265], [9, 809251, 897138, 861923, 22639], [10, 814346, 915698, 850386, 28166], [11, 815696, 868686, 838888, 17771], [12, 806778, 841887, 821050, 9212], [13, 764646, 873229, 809823, 38277], [14, 729665, 855147, 788284, 51654], [15, 695598, 835945, 756640, 58734], [16, 661065, 791738, 725248, 63022]], "AtomicQueue2": [[1, 13741574, 17195880, 15113722, 898672], [2, 3381441, 3471028, 3421654, 33824], [3, 2475969, 2634066, 2551893, 70606], [4, 1953014, 2063644, 2007001, 49925], [5, 943112, 1294239, 1103442, 160861], [6, 853470, 1164837, 987698, 137147], [7, 835922, 965368, 893078, 54994], [8, 844750, 886186, 857632, 10032], [9, 801647, 888990, 848813, 30479], [10, 803405, 879079, 839813, 30698], [11, 823949, 865885, 842995, 14384], [12, 816404, 861072, 825843, 8589], [13, 767201, 872807, 810309, 30929], [14, 735623, 850997, 789084, 45640], [15, 699825, 835390, 759058, 56215], [16, 661971, 790639, 725334, 61434]], "AtomicQueueB": [[1, 14989926, 23658435, 17992206, 2005217], [2, 3373949, 3431118, 3403367, 23733], [3, 2472153, 2638609, 2553529, 77513], [4, 1944419, 2067157, 2004659, 55955], [5, 951805, 1296560, 1105857, 156017], [6, 870340, 1162566, 993857, 122141], [7, 855897, 953367, 907695, 31505], [8, 841758, 907218, 872646, 27313], [9, 812440, 888656, 862920, 17825], [10, 820732, 882242, 851002, 23411], [11, 805128, 863252, 830031, 23033], [12, 816971, 850702, 824932, 6673], [13, 777096, 870856, 815340, 29018], [14, 748360, 851750, 794741, 42345], [15, 715862, 831582, 767599, 50018], [16, 690548, 789919, 739527, 47887]], "AtomicQueueB2": [[1, 15498703, 24531125, 17630225, 3208077], [2, 3383522, 3452964, 3420211, 28460], 
[3, 2464641, 2630256, 2545318, 77418], [4, 1937012, 2061155, 1998631, 58068], [5, 946180, 1301124, 1108917, 163042], [6, 866737, 1168507, 992803, 129307], [7, 853516, 960067, 900516, 45029], [8, 841111, 889749, 865698, 21556], [9, 818644, 882937, 854443, 19498], [10, 809528, 879137, 842534, 27852], [11, 809393, 864336, 832605, 20330], [12, 817922, 871669, 832346, 8592], [13, 791483, 859286, 822050, 18530], [14, 759673, 850982, 801021, 35280], [15, 723614, 834150, 771812, 45975], [16, 694540, 792854, 742429, 46964]], "OptimistAtomicQueue": [[1, 341389907, 602527240, 431387408, 54886177], [2, 13658253, 14614962, 13949143, 201327], [3, 16319281, 17757417, 17054184, 645787], [4, 16148553, 18212036, 17179994, 974951], [5, 9894838, 12210197, 10472612, 609243], [6, 10356881, 10735298, 10498062, 111092], [7, 11017236, 11645993, 11314419, 248835], [8, 11786496, 16693227, 13944166, 2140883], [9, 11942489, 13140510, 12475175, 455537], [10, 12120525, 14400357, 12993033, 854450], [11, 12648815, 15865295, 13838569, 1169719], [12, 12421828, 17662658, 14233404, 1777116], [13, 12956135, 17792566, 14506748, 1511164], [14, 13234379, 16612985, 14786609, 1504615], [15, 13404066, 17334728, 15272480, 1839210], [16, 13504328, 18321377, 15854345, 2330212]], "OptimistAtomicQueue2": [[1, 31960034, 49261316, 33040287, 2321274], [2, 12799633, 13213681, 13007056, 140064], [3, 15100651, 16045366, 15585577, 343314], [4, 13614811, 16242316, 14843225, 1132210], [5, 8366155, 8835665, 8669958, 150651], [6, 9158774, 9872443, 9468583, 254072], [7, 9595378, 10934328, 10264693, 593720], [8, 11230442, 12794788, 11602139, 344962], [9, 10612974, 12375119, 11430411, 772028], [10, 11410749, 13288461, 12242493, 793520], [11, 12387424, 14844253, 13305768, 869435], [12, 12178191, 15481166, 13210012, 1036054], [13, 12418970, 15999603, 13507395, 1068528], [14, 12393032, 15754443, 13833432, 1390840], [15, 12505339, 16395039, 14365960, 1823382], [16, 12644793, 17221525, 14814177, 2139400]], "OptimistAtomicQueueB": 
[[1, 193339455, 320812451, 228051227, 40791027], [2, 13810661, 15092576, 14109785, 314653], [3, 16250475, 17769573, 17019086, 689761], [4, 16017464, 18302827, 17159665, 1096356], [5, 9872996, 11366391, 10376422, 502319], [6, 10353096, 10805529, 10539297, 162907], [7, 10996126, 11788483, 11362136, 306538], [8, 11920386, 16987547, 14152970, 2166020], [9, 11988500, 13283407, 12530977, 479361], [10, 12176298, 14105969, 12993058, 784412], [11, 12668042, 15634003, 13773366, 1058175], [12, 12422798, 16930548, 14206839, 1748461], [13, 12977710, 17091121, 14554113, 1545085], [14, 13341569, 17588355, 15089321, 1715757], [15, 13512972, 17828656, 15607935, 2054751], [16, 13639536, 18659398, 16087331, 2421507]], "OptimistAtomicQueueB2": [[1, 47683902, 50193979, 48491157, 486231], [2, 12921392, 13307473, 13116503, 145565], [3, 14597155, 16224783, 15446962, 717840], [4, 13573261, 16437085, 14919440, 1278663], [5, 8238326, 8720676, 8528381, 174136], [6, 9101014, 9938667, 9437470, 308194], [7, 9508249, 10870736, 10185328, 600649], [8, 11118015, 12507639, 11482125, 348372], [9, 10629493, 12234906, 11385874, 700054], [10, 11392711, 13153599, 12181063, 757937], [11, 12340067, 14362440, 13152426, 776840], [12, 12280642, 15962685, 13624433, 1334359], [13, 12395574, 15947344, 13760740, 1338907], [14, 12425126, 16042384, 14065211, 1609272], [15, 12556442, 16805575, 14575409, 1974290], [16, 12689671, 17509368, 15028324, 2290207]], "boost::lockfree::queue": [[1, 2476768, 2578307, 2503077, 16146], [2, 1764933, 1793899, 1778281, 7907], [3, 1452498, 1540432, 1489512, 32216], [4, 1289983, 1320360, 1304907, 11257], [5, 794082, 885520, 834379, 32180], [6, 784462, 872385, 823576, 22372], [7, 416269, 866025, 631022, 209631], [8, 345719, 789266, 559187, 215472], [9, 412693, 600916, 509010, 83014], [10, 455346, 614859, 539671, 61639], [11, 482268, 605688, 550803, 31603], [12, 488322, 590606, 541703, 31476], [13, 490772, 555296, 516438, 15719], [14, 482188, 528380, 499080, 10088], [15, 462991, 480529, 
470421, 3138], [16, 430559, 457354, 441835, 8790]], "boost::lockfree::spsc_queue": [[1, 27550404, 37754826, 34917096, 2778590]], "moodycamel::ConcurrentQueue": [[1, 4348326, 5145652, 4486582, 117288], [2, 4125791, 8552994, 4625689, 1157979], [3, 4740871, 9154007, 5234179, 1086936], [4, 5316360, 14791421, 6209350, 1734265], [5, 3358216, 16406122, 5960866, 2699185], [6, 3619221, 15491041, 5178497, 2464331], [7, 4005591, 19010685, 5995624, 2843800], [8, 4140544, 16920564, 5792536, 2643294], [9, 4091094, 20905155, 5607134, 3082470], [10, 4291133, 24730378, 5891180, 3204591], [11, 4387165, 17283594, 5748309, 2475644], [12, 4421958, 26077368, 5694981, 3123088], [13, 4509866, 22863671, 6270463, 3959906], [14, 4589085, 29264650, 6461633, 4793550], [15, 4630216, 27186852, 6477999, 4394472], [16, 4631448, 29668126, 6301973, 4224746]], "moodycamel::ReaderWriterQueue": [[1, 10345494, 271338758, 30878566, 32277140]], "pthread_spinlock": [[1, 4619756, 5117377, 4716207, 91856], [2, 2070835, 3034473, 2469041, 166182], [3, 1665854, 2391068, 2138608, 182368], [4, 253013, 1815428, 1033847, 632107], [5, 780890, 1478238, 1114289, 292954], [6, 763786, 1158471, 946051, 166469], [7, 831271, 901787, 870529, 18553], [8, 794907, 913088, 856103, 48738], [9, 766231, 871972, 840333, 18588], [10, 702292, 860919, 791159, 41583], [11, 644438, 819517, 745228, 47786], [12, 576335, 764183, 721761, 51020], [13, 581230, 765015, 697292, 51708], [14, 576006, 707635, 646152, 40113], [15, 537165, 616377, 578816, 18093], [16, 484658, 569050, 522612, 14623]], "std::mutex": [[1, 1030724, 2794303, 1769715, 592242], [2, 2872751, 3101101, 2979788, 49573], [3, 2058240, 2528068, 2151124, 81022], [4, 2079950, 2399277, 2145246, 54704], [5, 1006844, 1248402, 1072375, 36656], [6, 866281, 1116492, 937202, 41734], [7, 815880, 1089136, 907324, 57254], [8, 808443, 1134573, 921787, 81834], [9, 815507, 987125, 871383, 28846], [10, 810619, 917804, 843131, 17107], [11, 803806, 914404, 837592, 19204], [12, 811334, 908695, 
845070, 17766], [13, 790324, 871971, 816022, 13983], [14, 775684, 844939, 796763, 11494], [15, 763740, 821326, 781607, 10176], [16, 750030, 806035, 771454, 9235]], "tbb::concurrent_bounded_queue": [[1, 2953202, 3854497, 3433498, 87332], [2, 3616439, 4018613, 3938994, 65277], [3, 2430449, 2519805, 2491612, 23967], [4, 2033947, 2078719, 2055753, 19612], [5, 1023855, 1109947, 1048028, 28041], [6, 890014, 1079888, 957052, 66613], [7, 841605, 972391, 881371, 37058], [8, 823983, 922690, 872088, 46797], [9, 820321, 868417, 833919, 9042], [10, 822001, 882898, 837252, 12251], [11, 807773, 841548, 815495, 6617], [12, 792136, 840341, 818179, 13891], [13, 777670, 810566, 790222, 6504], [14, 738670, 793129, 763006, 20175], [15, 704751, 769505, 735903, 29394], [16, 681614, 749735, 715320, 32827]], "tbb::spin_mutex": [[1, 13303338, 14029398, 13686140, 155000], [2, 5333560, 5970925, 5580405, 119385], [3, 3606876, 3839295, 3688735, 47090], [4, 3085486, 3271458, 3167074, 39386], [5, 1573380, 2057170, 1790334, 180012], [6, 1447557, 1664776, 1541561, 61537], [7, 1287956, 1480278, 1339968, 33728], [8, 1227257, 1406359, 1290200, 41626], [9, 1284532, 1412123, 1360067, 24278], [10, 1150930, 1396416, 1318055, 58110], [11, 1088297, 1302474, 1201405, 50038], [12, 933730, 1048920, 971454, 27315], [13, 986239, 1130819, 1033081, 25392], [14, 955548, 1139028, 1037207, 37897], [15, 887563, 1020685, 958823, 23060], [16, 825635, 918252, 860462, 19686]], "xenium::michael_scott_queue": [[1, 3309205, 3474663, 3386276, 34038], [2, 2939859, 3097449, 3026075, 59730], [3, 2182347, 2260268, 2219558, 19901], [4, 1747719, 1787606, 1767293, 9477], [5, 1062719, 1367870, 1173337, 113551], [6, 981986, 1161172, 1048359, 54987], [7, 939234, 1018805, 971332, 23650], [8, 774566, 889174, 830525, 49362], [9, 867727, 917999, 891369, 15851], [10, 869390, 924190, 891855, 14970], [11, 831869, 902709, 861005, 21628], [12, 767728, 863786, 809430, 34192], [13, 746657, 847725, 795403, 39397], [14, 715075, 824983, 772419, 
43997], [15, 676553, 805652, 737369, 52001], [16, 628572, 740667, 687693, 50360]], "xenium::ramalhete_queue": [[1, 4042746, 4646301, 4303546, 139605], [2, 7565053, 9792901, 8819711, 641889], [3, 10561308, 12152128, 11701045, 392608], [4, 13117708, 14858321, 14462221, 282769], [5, 7623394, 8304795, 7870542, 157566], [6, 8570963, 8874821, 8740212, 54124], [7, 9463588, 10809493, 10035877, 504461], [8, 9980896, 14838853, 12373227, 2306465], [9, 9811109, 11433399, 10683409, 605592], [10, 9140696, 10710778, 10278780, 380932], [11, 9649405, 11154997, 10284158, 492901], [12, 9996820, 12086850, 10722357, 576643], [13, 10483396, 12062364, 10893390, 295764], [14, 10880710, 12142606, 11184796, 222352], [15, 11233547, 11713937, 11448521, 144645], [16, 11354170, 11964172, 11803898, 101828]], "xenium::vyukov_bounded_queue": [[1, 11639244, 92175617, 30578122, 21038427], [2, 4495006, 5852154, 5544747, 178825], [3, 3201919, 3760923, 3493329, 211310], [4, 3031056, 3701524, 3418315, 248777], [5, 1975489, 3137414, 2520679, 536231], [6, 1931764, 2644079, 2250631, 293724], [7, 1909656, 2199846, 2066973, 112186], [8, 1982604, 2380313, 2131389, 146327], [9, 1900776, 2064193, 1987146, 54495], [10, 1935005, 2096700, 2023293, 55435], [11, 1920842, 2233286, 2053693, 70996], [12, 1969133, 2167953, 2073549, 63446], [13, 1950805, 2169376, 2052178, 68364], [14, 1934051, 2160168, 2037191, 82253], [15, 1915321, 2100003, 2005809, 76568], [16, 1894059, 2075978, 1987443, 75691]]};
+ const scalability_ryzen_5825u = {"AtomicQueue": [[1, 55681891, 60041308, 58033830, 672502], [2, 6032352, 7729016, 6827208, 759431], [3, 4348284, 5441096, 4868191, 488660], [4, 3358073, 4382714, 3848656, 478491], [5, 2792030, 3550837, 3155651, 346933], [6, 2366449, 3028353, 2688865, 307039], [7, 2063667, 2556640, 2301961, 227573], [8, 1798670, 2261120, 2015791, 208284]], "AtomicQueue2": [[1, 50196015, 54036178, 52824559, 800008], [2, 5908377, 7279604, 6441551, 467690], [3, 4312270, 5267471, 4755379, 418098], [4, 3349460, 4262979, 3793190, 430592], [5, 2783915, 3462103, 3108096, 314989], [6, 2357307, 2954668, 2653777, 279505], [7, 2054859, 2501509, 2266143, 200203], [8, 1798762, 2194250, 1992150, 186571]], "AtomicQueueB": [[1, 39643148, 42024046, 41162268, 385369], [2, 6017022, 7661399, 6802266, 743608], [3, 4350144, 5392349, 4849789, 475058], [4, 3395240, 4381701, 3876227, 463985], [5, 2802325, 3528062, 3151107, 336997], [6, 2390880, 3027093, 2701368, 298756], [7, 2064459, 2542535, 2297759, 220502], [8, 1809410, 2258609, 2024007, 201956]], "AtomicQueueB2": [[1, 38786208, 41516618, 40819475, 437916], [2, 5927246, 7359978, 6465820, 477245], [3, 4308403, 5250665, 4763354, 436614], [4, 3368458, 4291513, 3812437, 431723], [5, 2782654, 3452306, 3106543, 314979], [6, 2373755, 2961284, 2662782, 278165], [7, 2058192, 2486711, 2264062, 199084], [8, 1804061, 2191338, 1995011, 183449]], "OptimistAtomicQueue": [[1, 463929482, 512583935, 483151974, 7528021], [2, 25319763, 54347471, 36218287, 10618584], [3, 25525793, 34384668, 29134435, 2886810], [4, 26303177, 34879227, 30417309, 3702132], [5, 25808636, 32315314, 28869598, 2796883], [6, 24717966, 33390676, 28903511, 3791484], [7, 23761953, 30367028, 26597532, 2489181], [8, 22990905, 30587213, 26685968, 3536207]], "OptimistAtomicQueue2": [[1, 186084594, 227616220, 211686676, 6574938], [2, 15483159, 19320065, 17369035, 1516286], [3, 19087520, 23282798, 21007797, 1369941], [4, 23351206, 32309760, 27314794, 3493765], [5, 24754166, 
30961765, 26870172, 2034111], [6, 23802299, 32447975, 27318336, 3413398], [7, 23404578, 29657332, 25882546, 2446325], [8, 22267102, 29926124, 25643849, 3246300]], "OptimistAtomicQueueB": [[1, 176387374, 253473859, 187452402, 10545580], [2, 25323918, 51619670, 35573230, 9881317], [3, 25472385, 34817961, 29185357, 3052881], [4, 22966046, 34965988, 30197922, 3820679], [5, 25254349, 32504786, 28721515, 3082816], [6, 24288490, 33384032, 28829505, 3881602], [7, 23718061, 30711556, 26978986, 3028989], [8, 22909412, 30545543, 26604909, 3498309]], "OptimistAtomicQueueB2": [[1, 64368869, 96851362, 70298561, 4530679], [2, 15754447, 39573116, 17967736, 2343056], [3, 19234515, 23362235, 21070906, 1328729], [4, 22788999, 32099884, 26886263, 3625041], [5, 24233717, 30576981, 26397641, 2187497], [6, 23842126, 32293666, 26816203, 3145991], [7, 22941534, 29568045, 25524514, 2658011], [8, 22064329, 29795300, 25260406, 3164114]], "boost::lockfree::queue": [[1, 9247323, 9448402, 9331344, 19733], [2, 3263237, 3616504, 3384659, 38439], [3, 2915963, 3191315, 2976663, 29461], [4, 2285434, 2421581, 2311034, 19687], [5, 1683941, 2028624, 1929837, 28369], [6, 1301594, 1662104, 1396381, 99571], [7, 1087308, 1439668, 1159916, 62179], [8, 900277, 1275780, 983677, 80879]], "boost::lockfree::spsc_queue": [[1, 83439997, 98767383, 89214804, 3142447]], "moodycamel::ConcurrentQueue": [[1, 13417379, 15852809, 15561682, 151930], [2, 7696680, 17987151, 9005042, 1886147], [3, 7435683, 15381982, 8455015, 1500296], [4, 7072461, 21679709, 9352204, 2706833], [5, 7793702, 22526104, 9989506, 3053747], [6, 7489252, 27583509, 10116824, 3716468], [7, 8040376, 27615722, 11030219, 4510711], [8, 8065127, 27946590, 11611944, 5003140]], "moodycamel::ReaderWriterQueue": [[1, 128804233, 323718882, 293358466, 11686559]], "pthread_spinlock": [[1, 18410266, 20589290, 19151247, 378223], [2, 3684887, 6131002, 5665496, 238617], [3, 4066493, 4959688, 4634936, 165566], [4, 3337256, 4177216, 3844221, 190184], [5, 2137013, 
3256171, 2659468, 274747], [6, 921277, 3141106, 2119059, 638203], [7, 738309, 2917023, 2206764, 540103], [8, 1048309, 2648085, 1812082, 651443]], "std::mutex": [[1, 6684247, 7686634, 7384149, 74825], [2, 4217735, 5751179, 4554084, 272538], [3, 2397265, 2722273, 2522838, 63802], [4, 2160526, 2680899, 2343582, 75578], [5, 2059731, 2430107, 2256463, 54749], [6, 1964551, 2306694, 2130121, 40715], [7, 1862222, 2141343, 2013189, 32615], [8, 1818728, 2118912, 1923708, 30985]], "tbb::concurrent_bounded_queue": [[1, 11644333, 13314827, 13146084, 174361], [2, 5932401, 6662916, 6292982, 252817], [3, 4504200, 4888417, 4676683, 123172], [4, 3379150, 4077660, 3712786, 290498], [5, 2830644, 3542565, 3144252, 292676], [6, 2340466, 2942154, 2615201, 253416], [7, 2031361, 2551135, 2276578, 232636], [8, 1757877, 2159176, 1938058, 163357]], "tbb::spin_mutex": [[1, 38310301, 42457508, 41008657, 728284], [2, 9071808, 10227257, 9634109, 406714], [3, 5360082, 6070810, 5653082, 212940], [4, 3709439, 4116450, 3864827, 101956], [5, 2960383, 3230438, 3041167, 47109], [6, 2463207, 2691476, 2509318, 33639], [7, 1930054, 2188069, 2022118, 57612], [8, 1548662, 1891360, 1668129, 96598]], "xenium::michael_scott_queue": [[1, 9175638, 10224426, 9843078, 149835], [2, 4857285, 6719928, 5751581, 802189], [3, 3461614, 4175499, 3822846, 260934], [4, 2728189, 3384167, 3037596, 262919], [5, 2218077, 2831734, 2470086, 171358], [6, 1850617, 2394157, 2076029, 170384], [7, 1573296, 2082487, 1769514, 151651], [8, 1358972, 1775376, 1542209, 135217]], "xenium::ramalhete_queue": [[1, 15907542, 30410628, 20331577, 1303107], [2, 10874252, 28393635, 20124615, 3756994], [3, 15166537, 25344045, 18202079, 1824075], [4, 15177639, 20153375, 18241744, 918473], [5, 16025687, 19663077, 18213127, 577247], [6, 16300682, 19346263, 18290831, 502058], [7, 16399722, 18808605, 17734945, 445536], [8, 16584330, 18031318, 17349257, 316550]], "xenium::vyukov_bounded_queue": [[1, 48893108, 59537772, 53168996, 2313982], [2, 9812158, 
23409423, 13321595, 3424800], [3, 7329622, 9499618, 8339020, 894073], [4, 6165754, 8617429, 7363692, 1142496], [5, 5657679, 6969348, 6300161, 570940], [6, 4655862, 5845351, 5223735, 448221], [7, 4221090, 4967796, 4603430, 298930], [8, 3695220, 4298271, 3997092, 230714]]};
const latency_9900KS = {"AtomicQueue": [157, 171, 166, 0], "AtomicQueue2": [173, 177, 175, 0], "AtomicQueueB": [171, 184, 179, 3], "AtomicQueueB2": [175, 192, 180, 3], "OptimistAtomicQueue": [148, 160, 153, 3], "OptimistAtomicQueue2": [167, 176, 173, 1], "OptimistAtomicQueueB": [140, 154, 141, 1], "OptimistAtomicQueueB2": [149, 155, 150, 1], "boost::lockfree::queue": [310, 338, 319, 4], "boost::lockfree::spsc_queue": [129, 135, 132, 0], "moodycamel::ConcurrentQueue": [208, 254, 231, 7], "moodycamel::ReaderWriterQueue": [110, 167, 137, 12], "pthread_spinlock": [226, 308, 279, 25], "std::mutex": [411, 525, 465, 20], "tbb::concurrent_bounded_queue": [268, 307, 287, 9], "tbb::spin_mutex": [246, 309, 275, 18], "xenium::michael_scott_queue": [357, 407, 371, 6], "xenium::ramalhete_queue": [255, 282, 267, 4], "xenium::vyukov_bounded_queue": [183, 227, 212, 11]};
const latency_xeon_gold_6132 = {"AtomicQueue": [231, 479, 321, 72], "AtomicQueue2": [307, 556, 394, 86], "AtomicQueueB": [344, 588, 423, 80], "AtomicQueueB2": [403, 711, 491, 111], "OptimistAtomicQueue": [283, 459, 346, 55], "OptimistAtomicQueue2": [315, 562, 392, 78], "OptimistAtomicQueueB": [321, 507, 378, 69], "OptimistAtomicQueueB2": [345, 572, 409, 84], "boost::lockfree::queue": [726, 1151, 869, 154], "boost::lockfree::spsc_queue": [269, 507, 356, 69], "moodycamel::ConcurrentQueue": [427, 789, 547, 120], "moodycamel::ReaderWriterQueue": [207, 552, 328, 94], "pthread_spinlock": [623, 1899, 946, 308], "std::mutex": [1859, 3202, 2340, 463], "tbb::concurrent_bounded_queue": [565, 993, 683, 155], "tbb::spin_mutex": [561, 1069, 741, 156], "xenium::michael_scott_queue": [733, 1255, 879, 196], "xenium::ramalhete_queue": [493, 887, 596, 139], "xenium::vyukov_bounded_queue": [436, 685, 521, 89]};
const latency_ryzen_5950x = {"AtomicQueue": [353, 370, 365, 4], "AtomicQueue2": [375, 396, 386, 4], "AtomicQueueB": [365, 371, 368, 1], "AtomicQueueB2": [380, 387, 381, 1], "OptimistAtomicQueue": [342, 346, 343, 0], "OptimistAtomicQueue2": [296, 321, 309, 6], "OptimistAtomicQueueB": [318, 327, 325, 2], "OptimistAtomicQueueB2": [337, 353, 345, 3], "boost::lockfree::queue": [741, 747, 743, 1], "boost::lockfree::spsc_queue": [403, 405, 404, 0], "moodycamel::ConcurrentQueue": [539, 623, 587, 18], "moodycamel::ReaderWriterQueue": [355, 415, 374, 16], "pthread_spinlock": [737, 747, 742, 1], "std::mutex": [1462, 1624, 1513, 22], "tbb::concurrent_bounded_queue": [971, 1000, 974, 3], "tbb::spin_mutex": [638, 646, 643, 1], "xenium::michael_scott_queue": [940, 1061, 994, 26], "xenium::ramalhete_queue": [607, 659, 629, 11], "xenium::vyukov_bounded_queue": [469, 521, 476, 7]};
- plot_scalability('scalability-9900KS-5GHz', scalability_9900KS, "Intel i9-9900KS", 60e6, 1000e6);
- plot_scalability('scalability-xeon-gold-6132', scalability_xeon_gold_6132, "Intel Xeon Gold 6132", 15e6, 300e6);
- plot_scalability('scalability-ryzen-5950x', scalability_ryzen_5950x, "AMD Ryzen 9 5950X", 20e6, 500e6);
- plot_latency('latency-9900KS-5GHz', latency_9900KS, "Intel i9-9900KS");
- plot_latency('latency-xeon-gold-6132', latency_xeon_gold_6132, "Intel Xeon Gold 6132");
- plot_latency('latency-ryzen-5950x', latency_ryzen_5950x, "AMD Ryzen 9 5950X");
+ const latency_ryzen_5825u = {"AtomicQueue": [67, 93, 87, 7], "AtomicQueue2": [71, 100, 97, 3], "AtomicQueueB": [79, 97, 95, 1], "AtomicQueueB2": [99, 103, 100, 0], "OptimistAtomicQueue": [71, 72, 71, 0], "OptimistAtomicQueue2": [82, 83, 82, 0], "OptimistAtomicQueueB": [73, 75, 74, 0], "OptimistAtomicQueueB2": [80, 86, 80, 1], "boost::lockfree::queue": [205, 211, 205, 0], "boost::lockfree::spsc_queue": [108, 111, 109, 0], "moodycamel::ConcurrentQueue": [127, 134, 130, 1], "moodycamel::ReaderWriterQueue": [94, 96, 94, 0], "pthread_spinlock": [195, 213, 212, 1], "std::mutex": [590, 631, 600, 5], "tbb::concurrent_bounded_queue": [190, 195, 191, 0], "tbb::spin_mutex": [196, 202, 198, 1], "xenium::michael_scott_queue": [283, 295, 286, 1], "xenium::ramalhete_queue": [140, 159, 144, 2], "xenium::vyukov_bounded_queue": [127, 134, 130, 1]};
+ plot_scalability('scalability-9900KS-5GHz', scalability_9900KS, 60e6, 1000e6);
+ plot_scalability('scalability-ryzen-5825u', scalability_ryzen_5825u, 40e6, 1000e6);
+ plot_scalability('scalability-xeon-gold-6132', scalability_xeon_gold_6132, 15e6, 300e6);
+ plot_scalability('scalability-ryzen-5950x', scalability_ryzen_5950x, 20e6, 500e6);
+ plot_latency('latency-9900KS-5GHz', latency_9900KS);
+ plot_latency('latency-ryzen-5825u', latency_ryzen_5825u);
+ plot_latency('latency-xeon-gold-6132', latency_xeon_gold_6132);
+ plot_latency('latency-ryzen-5950x', latency_ryzen_5950x);
+ $(".view-toggle")
+ .html((index, html) => `<svg class="arrow-down-circle" viewBox="0 0 16 16"><path fill-rule="evenodd" d="M1 8a7 7 0 1 0 14 0A7 7 0 0 0 1 8zm15 0A8 8 0 1 1 0 8a8 8 0 0 1 16 0zM8.5 4.5a.5.5 0 0 0-1 0v5.793L5.354 8.146a.5.5 0 1 0-.708.708l3 3a.5.5 0 0 0 .708 0l3-3a.5.5 0 0 0-.708-.708L8.5 10.293V4.5z"/></svg> ${html}`)
+ .on("click", function() {
+ const toggle = $(this);
+ toggle.next().slideToggle();
+ toggle.children("svg.arrow-down-circle").each(function() {
+ $(this).animate(
+ { hidden: this.hidden ^ 1 },
+ { step: function(f) { $(this).css("transform", `rotate(${-90 * f}deg)`); } }
+ );
+ });
+ });
+ $("body").css("visibility", "visible");
@@ -197,7 +197,8 @@ Highcharts.theme = {
maskColor: 'rgba(255,255,255,0.3)',
lang: { thousandsSep: ',' },
- credits: { enabled: false }
+ credits: { enabled: false },
+ accessibility: { enabled: false }
// Apply the theme
@@ -56,14 +56,11 @@ struct GetIndexShuffleBits<false, array_size, elements_per_cache_line> {
// minimizes contention. This is done by swapping the lowest order N bits (which are the index of
// the element within the cache line) with the next N bits (which are the index of the cache line)
// of the element index.
-template<int BITS>
-constexpr unsigned remap_index_with_mix(unsigned index, unsigned mix) {
- return index ^ mix ^ (mix << BITS);
template<int BITS>
constexpr unsigned remap_index(unsigned index) noexcept {
- return remap_index_with_mix<BITS>(index, (index ^ (index >> BITS)) & ((1u << BITS) - 1));
+ unsigned constexpr mix_mask{(1u << BITS) - 1};
+ unsigned const mix{(index ^ (index >> BITS)) & mix_mask};
+ return index ^ mix ^ (mix << BITS);
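Note on the hunk above: the inlined remap_index swaps two adjacent BITS-wide bit fields with an XOR trick. A minimal standalone sketch follows. The name remap_index_sketch and the chosen test values are illustrative and not part of the commit, and the closing brace is restored by assumption because the diff context omits it.

```cpp
#include <cassert>

// Standalone copy of the new remap_index body, renamed to mark it as a sketch.
template<int BITS>
constexpr unsigned remap_index_sketch(unsigned index) noexcept {
    constexpr unsigned mix_mask = (1u << BITS) - 1;            // Selects the lowest BITS bits.
    unsigned const mix = (index ^ (index >> BITS)) & mix_mask; // XOR of the two BITS-wide fields.
    return index ^ mix ^ (mix << BITS);                        // XOR-ing mix into both fields swaps them.
}

// With BITS = 2: 0b0110 (fields 01|10) becomes 0b1001 (fields 10|01).
static_assert(remap_index_sketch<2>(6u) == 9u, "low and next 2-bit fields are swapped");
static_assert(remap_index_sketch<2>(1u) == 4u, "index 1 maps to index 4");
static_assert(remap_index_sketch<2>(remap_index_sketch<2>(6u)) == 6u, "the mapping is its own inverse");
static_assert(remap_index_sketch<2>(0x16u) == 0x19u, "bits above 2*BITS are left untouched");
```

Because the mapping is its own inverse, applying it twice returns the original index, so it is a permutation of the index space and no two indexes collide.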
@@ -120,6 +117,16 @@ constexpr uint64_t round_up_to_power_of_2(uint64_t a) noexcept {
+template<class T>
+constexpr T nil() noexcept {
+#if __cpp_lib_atomic_is_always_lock_free // Better compile-time error message requires C++17.
+ static_assert(std::atomic<T>::is_always_lock_free, "Queue element type T is not atomic. Use AtomicQueue2/AtomicQueueB2 for such element types.");
+ return {};
} // namespace details
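Note on the hunk above: nil<T>() appears intended to produce the default NIL value while rejecting non-lock-free element types at compile time when C++17 is available. A minimal sketch, assuming the omitted diff context closes the conditional with #endif. nil_sketch is a hypothetical name, and the assertion message is shortened here.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

template<class T>
constexpr T nil_sketch() noexcept {
#if __cpp_lib_atomic_is_always_lock_free // Feature-test macro, defined since C++17.
    // Fails at compile time, with a readable message, for element types that need a lock.
    static_assert(std::atomic<T>::is_always_lock_free, "Queue element type T is not atomic.");
#endif
    return {}; // A value-initialized T serves as the default NIL.
}

// For uint32_t the default NIL is 0, so 0 cannot be pushed unless NIL is overridden.
static_assert(nil_sketch<std::uint32_t>() == 0u, "default NIL for uint32_t is 0");
```

This matches the example program later in the diff, which overrides NIL with static_cast<Element>(-1) so that 0 remains available as the stop signal.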
@@ -158,9 +165,9 @@ protected:
static T do_pop_atomic(std::atomic<T>& q_element) noexcept {
if(Derived::spsc_) {
for(;;) {
- T element = q_element.load(X);
+ T element = q_element.load(A);
if(ATOMIC_QUEUE_LIKELY(element != NIL)) {
- q_element.store(NIL, R);
+ q_element.store(NIL, X);
return element;
@@ -169,7 +176,7 @@ protected:
else {
for(;;) {
- T element = q_element.exchange(NIL, R); // (2) The store to wait for.
+ T element = q_element.exchange(NIL, A); // (2) The store to wait for.
return element;
// Do speculative loads while busy-waiting to avoid broadcasting RFO messages.
@@ -264,7 +271,7 @@ public:
do {
if(static_cast<int>(head - tail_.load(X)) >= static_cast<int>(static_cast<Derived&>(*this).size_))
return false;
- } while(ATOMIC_QUEUE_UNLIKELY(!head_.compare_exchange_strong(head, head + 1, A, X))); // This loop is not FIFO.
+ } while(ATOMIC_QUEUE_UNLIKELY(!head_.compare_exchange_strong(head, head + 1, X, X))); // This loop is not FIFO.
static_cast<Derived&>(*this).do_push(std::forward<T>(element), head);
@@ -283,7 +290,7 @@ public:
do {
if(static_cast<int>(head_.load(X) - tail) <= 0)
return false;
- } while(ATOMIC_QUEUE_UNLIKELY(!tail_.compare_exchange_strong(tail, tail + 1, A, X))); // This loop is not FIFO.
+ } while(ATOMIC_QUEUE_UNLIKELY(!tail_.compare_exchange_strong(tail, tail + 1, X, X))); // This loop is not FIFO.
element = static_cast<Derived&>(*this).do_pop(tail);
@@ -298,7 +305,7 @@ public:
head_.store(head + 1, X);
else {
- constexpr auto memory_order = Derived::total_order_ ? std::memory_order_seq_cst : std::memory_order_acquire;
+ constexpr auto memory_order = Derived::total_order_ ? std::memory_order_seq_cst : std::memory_order_relaxed;
head = head_.fetch_add(1, memory_order); // FIFO and total order on Intel regardless, as of 2019.
static_cast<Derived&>(*this).do_push(std::forward<T>(element), head);
@@ -311,7 +318,7 @@ public:
tail_.store(tail + 1, X);
else {
- constexpr auto memory_order = Derived::total_order_ ? std::memory_order_seq_cst : std::memory_order_acquire;
+ constexpr auto memory_order = Derived::total_order_ ? std::memory_order_seq_cst : std::memory_order_relaxed;
tail = tail_.fetch_add(1, memory_order); // FIFO and total order on Intel regardless, as of 2019.
return static_cast<Derived&>(*this).do_pop(tail);
@@ -337,7 +344,7 @@ public:
-template<class T, unsigned SIZE, T NIL = T{}, bool MINIMIZE_CONTENTION = true, bool MAXIMIZE_THROUGHPUT = true, bool TOTAL_ORDER = false, bool SPSC = false>
+template<class T, unsigned SIZE, T NIL = details::nil<T>(), bool MINIMIZE_CONTENTION = true, bool MAXIMIZE_THROUGHPUT = true, bool TOTAL_ORDER = false, bool SPSC = false>
class AtomicQueue : public AtomicQueueCommon<AtomicQueue<T, SIZE, NIL, MINIMIZE_CONTENTION, MAXIMIZE_THROUGHPUT, TOTAL_ORDER, SPSC>> {
friend Base;
@@ -364,8 +371,8 @@ public:
using value_type = T;
AtomicQueue() noexcept {
- assert(std::atomic<T>{NIL}.is_lock_free()); // This queue is for atomic elements only. AtomicQueue2 is for non-atomic ones.
- if(T{} != NIL)
+ assert(std::atomic<T>{NIL}.is_lock_free()); // Queue element type T is not atomic. Use AtomicQueue2/AtomicQueueB2 for such element types.
+ if(details::nil<T>() != NIL)
for(auto& element : elements_)
element.store(NIL, X);
@@ -412,7 +419,7 @@ public:
-template<class T, class A = std::allocator<T>, T NIL = T{}, bool MAXIMIZE_THROUGHPUT = true, bool TOTAL_ORDER = false, bool SPSC = false>
+template<class T, class A = std::allocator<T>, T NIL = details::nil<T>(), bool MAXIMIZE_THROUGHPUT = true, bool TOTAL_ORDER = false, bool SPSC = false>
class AtomicQueueB : public AtomicQueueCommon<AtomicQueueB<T, A, NIL, MAXIMIZE_THROUGHPUT, TOTAL_ORDER, SPSC>>,
private std::allocator_traits<A>::template rebind_alloc<std::atomic<T>> {
using Base = AtomicQueueCommon<AtomicQueueB<T, A, NIL, MAXIMIZE_THROUGHPUT, TOTAL_ORDER, SPSC>>;
@@ -453,7 +460,7 @@ public:
AtomicQueueB(unsigned size)
: size_(std::max(details::round_up_to_power_of_2(size), 1u << (SHUFFLE_BITS * 2)))
, elements_(AllocatorElements::allocate(size_)) {
- assert(std::atomic<T>{NIL}.is_lock_free()); // This queue is for atomic elements only. AtomicQueueB2 is for non-atomic ones.
+ assert(std::atomic<T>{NIL}.is_lock_free()); // Queue element type T is not atomic. Use AtomicQueue2/AtomicQueueB2 for such element types.
for(auto p = elements_, q = elements_ + size_; p < q; ++p)
p->store(NIL, X);
@@ -14,7 +14,7 @@ static inline void spin_loop_pause() noexcept {
} // namespace atomic_queue
-#elif defined(__arm__) || defined(__aarch64__)
+#elif defined(__arm__) || defined(__aarch64__) || defined(_M_ARM64)
namespace atomic_queue {
constexpr int CACHE_LINE_SIZE = 64;
static inline void spin_loop_pause() noexcept {
@@ -30,6 +30,8 @@ static inline void spin_loop_pause() noexcept {
defined(__ARM_ARCH_8A__) || \
asm volatile ("yield" ::: "memory");
+#elif defined(_M_ARM64)
+ __yield();
asm volatile ("nop" ::: "memory");
@@ -55,7 +57,11 @@ static inline void spin_loop_pause() noexcept {
} // namespace atomic_queue
+#ifdef _MSC_VER
+#pragma message("Unknown CPU architecture. Using L1 cache line size of 64 bytes and no spinloop pause instruction.")
#warning "Unknown CPU architecture. Using L1 cache line size of 64 bytes and no spinloop pause instruction."
namespace atomic_queue {
constexpr int CACHE_LINE_SIZE = 64; // TODO: Review that this is the correct value.
static inline void spin_loop_pause() noexcept {}
@@ -82,6 +88,7 @@ auto constexpr A = std::memory_order_acquire;
auto constexpr R = std::memory_order_release;
auto constexpr X = std::memory_order_relaxed;
auto constexpr C = std::memory_order_seq_cst;
+auto constexpr AR = std::memory_order_acq_rel;
The diff for this file was not included because it is too large.
@@ -1,4 +1,4 @@
+#!/bin/bash -x
# Copyright (c) 2019 Maxim Egorushkin. MIT License. See the full licence in file LICENSE.
@@ -8,7 +8,7 @@ exe="$(dirname "$0")/../benchmarks"
function benchmark() {
lb="/usr/bin/stdbuf -oL"
- ((N=33))
+ let N=${N:-33}
for((i=1;i<=N;++i)); do
$lb echo -n "[$i/$N] "
sudo chrt -f 50 "$exe"
@@ -11,56 +11,65 @@
int main() {
int constexpr PRODUCERS = 1; // Number of producer threads.
int constexpr CONSUMERS = 2; // Number of consumer threads.
- unsigned constexpr N = 1000000; // Pass this many elements from producers to consumers.
- unsigned constexpr CAPACITY = 1024; // Queue capacity. Since there are more consumers than producers this doesn't have to be large.
+ unsigned constexpr N = 1000000; // Each producer pushes this many elements into the queue.
+ unsigned constexpr CAPACITY = 1024; // Queue capacity. Since there are more consumers than producers the queue doesn't need to be large.
using Element = uint32_t; // Queue element type.
Element constexpr NIL = static_cast<Element>(-1); // Atomic elements require a special value that cannot be pushed/popped.
using Queue = atomic_queue::AtomicQueueB<Element, std::allocator<Element>, NIL>; // Use heap-allocated buffer.
- // Create a queue object shared between producers and consumers.
+ // Create a queue object shared between all producers and consumers.
Queue q{CAPACITY};
// Start the consumers.
- uint64_t results[CONSUMERS];
+ uint64_t sums[CONSUMERS];
std::thread consumers[CONSUMERS];
for(int i = 0; i < CONSUMERS; ++i)
- consumers[i] = std::thread([&q, &r = results[i]]() {
- uint64_t sum = 0;
- while(Element n = q.pop()) // Stop when 0 is received.
- sum += n;
- r = sum;
+ consumers[i] = std::thread([&q, &sum = sums[i]]() {
+ uint64_t s = 0; // New object with automatic storage duration. Not aliased or false-shared by construction.
+ while(Element n = q.pop()) // Break the loop when 0 is pop'ed.
+ s += n;
+ // Store into sum only once because it is an element of the sums array and false-shares a cache line with the elements used by other threads.
+ // Updating sum in the loop above saturates the inter-core bus with cache coherence protocol messages.
+ sum = s;
// Start the producers.
std::thread producers[PRODUCERS];
for(int i = 0; i < PRODUCERS; ++i)
producers[i] = std::thread([&q]() {
+ // Each producer pushes range [1, N] elements into the queue.
+ // Ascending order [1, N] requires comparing with N at each loop iteration. Ascending order isn't necessary here.
+ // Push elements in descending order, range [N, 1] with step -1, so that CPU decrement instruction sets zero/equal flag
+ // when 0 is reached, which breaks the loop without having to compare n with N at each iteration.
for(Element n = N; n; --n)
- // Wait till producers complete and terminate.
+ // Wait till producers have terminated.
for(auto& t : producers)
- // Tell each consumer to complete and terminate.
+ // Tell consumers to terminate by pushing one 0 element for each consumer.
for(int i = CONSUMERS; i--;)
- // Wait till consumers complete and terminate.
+ // Wait till consumers have terminated.
for(auto& t : consumers)
+ // When all consumers have terminated the queue is empty.
- // Verify that each message was received exactly by one consumer only.
- uint64_t result = 0;
- for(auto& r : results) {
- result += r;
- if(!r)
- std::cerr << "WARNING: consumer " << (&r - results) << " received no messages.\n";
+ // Sum up the consumers' sums of received elements.
+ uint64_t total_sum = 0;
+ for(auto& sum : sums) {
+ total_sum += sum;
+ if(!sum) // Verify that each consumer received at least one element.
+ std::cerr << "WARNING: consumer " << (&sum - sums) << " received no elements.\n";
- uint64_t constexpr expected_result = (N + 1) / 2. * N * PRODUCERS;
- if(int64_t result_diff = result - expected_result) {
- std::cerr << "ERROR: unexpected result difference " << result_diff << '\n';
+ // Verify that each element has been pop'ed exactly once; not corrupted, dropped or duplicated.
+ uint64_t constexpr expected_total_sum = (N + 1) / 2. * N * PRODUCERS;
+ if(int64_t total_sum_diff = total_sum - expected_total_sum) {
+ std::cerr << "ERROR: unexpected total_sum difference " << total_sum_diff << '\n';
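Note on the verification step above: the expected total is the arithmetic series 1 + 2 + ... + N = N(N + 1)/2 multiplied by the number of producers, and the terminating 0 elements add nothing to it. A minimal integer-only sketch of that arithmetic follows. The helper expected_total_sum is hypothetical and not part of the commit, which computes the value with a floating-point division by 2.

```cpp
#include <cassert>
#include <cstdint>

// Sum of 1..n pushed by each of `producers` threads. n * (n + 1) is always even,
// so the integer division by 2 is exact.
constexpr std::uint64_t expected_total_sum(std::uint64_t n, std::uint64_t producers) noexcept {
    return n * (n + 1) / 2 * producers;
}

// Values used by the example program: N = 1000000, PRODUCERS = 1.
static_assert(expected_total_sum(1000000u, 1u) == 500000500000ull, "1 + 2 + ... + 1000000");
// Small case that is easy to check by hand: 1 + 2 + 3 + 4 = 10.
static_assert(expected_total_sum(4u, 1u) == 10u, "1 + 2 + 3 + 4");
```

A dropped, duplicated or corrupted element changes the consumers' combined sum, so a non-zero difference from this value signals a queue error, although errors that cancel each other out would go unnoticed.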
View it on GitLab: https://salsa.debian.org/med-team/libatomic-queue/-/compare/9355308007d30083c0d9b42374b1f79bc0587d91...625bcc87530286d68fb214ee50c31706d0512844