-r--r--r-- 12334 djbsort-20260127/doc/html/refs.html raw
<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<style type="text/css">
html{overflow-y:scroll;background-color:#012345}
body{font-family:"Noto Sans","Droid Sans","DejaVu Sans","Arial",sans-serif;line-height:1.5}
tt,code{background-color:#f0f0f0;font-family:"Noto Sans Mono","Droid Sans Mono","DejaVu Sans Mono","Courier New",monospace,sans-serif;font-size:1em;}
pre{margin-left:3em}
p,ul,ol,blockquote,pre{font-size:1.0em;line-height:1.6}
li p{font-size:1.0em}
blockquote p{font-size:1.0em}
h1{font-size:1.5em}
h2{font-size:1.3em}
h3{font-size:1.0em}
h1 a{text-decoration:none}
table{border-collapse:collapse}
th,td{border:1px solid black}
table a{text-decoration:none}
table tr{font-size:1.0em;line-height:1.6em}
table tr{font-size:1.0em;line-height:1.5}
tbody tr:nth-child(2n+1){background-color:#f0ffff}
tbody tr:nth-child(2n+2){background-color:#fffff0}
#headline{display:block;margin:0;padding:0;color:#ffffff;background-color:#012345}
#headline .text{font-weight:bold;font-size:1.0em}
#headline input{display:none}
#nav ul{margin:0;padding:0}
#nav li{list-style-type:none;margin:0;padding:0}
.navtop{padding-bottom:0.5em;font-weight:bold;font-size:1.0em}
.navtop{background-color:#012345;color:#ffffff}
#nav .here{background-color:#012345;color:#ffffff}
#nav .away{background-color:#012345;color:#ffffff}
#nav .away a{text-decoration:none;display:block;color:#ffffff}
#nav .away a:hover,.away a:active{text-decoration:underline}
#hidemenu{visibility:hidden;display:none;overflow:hidden;position:fixed;top:0;left:0;height:100%;width:100%}
.main{padding:5px}
.main{background-color:#ffffff}
.pagetitle{font-size:1.4em;font-weight:bold}
@media only screen and (min-width:512px) {
.navtop{padding-top:5px}
#headline{top:0;margin:0;width:160px;height:100%;position:fixed;overflow:auto}
#headline .noselect{display:none}
#headline #nav{visibility:visible;display:block;width:auto;height:auto}
.main{margin-left:170px}
#headline #hidemenu{visibility:hidden}
}
@media not screen and (min-width:512px) {
#headline .noselect{-webkit-user-select:none;-ms-user-select:none;user-select:none;}
#headline #nav #navbot{visibility:hidden;position:fixed;top:0;left:-70%;z-index:2;transition:0.2s;margin:0;padding:0}
#headline input:checked ~ #nav #navbot{height:100%;position:fixed;top:0;left:0;visibility:visible;display:block;box-sizing:border-box;-moz-box-sizing:border-box;-webkit-box-sizing:border-box;vertical-align:center;font-size:1.0em;width:70%;overflow:auto}
#headline input:checked ~ #hidemenu{visibility:visible;display:block;background:black;opacity:0.3;z-index:1}
}
</style>
<title>
djbsort: References</title>
</head>
<body>
<label id=headline>
<input type=checkbox />
<nav id=nav>
<div class=navtop>
<span class=noselect>≡</span>
djbsort</div>
<ul id=navbot>
<li class=away><a href=index.html>Intro</a>
</li><li class=away><a href=download.html>Download</a>
</li><li class=away><a href=install.html>Install</a>
</li><li class=away><a href=test.html>Test</a>
</li><li class=away><a href=api.html>API</a>
</li><li class=away><a href=speed.html>Speed</a>
</li><li class=away><a href=security.html>Security</a>
</li><li class=away><a href=verif.html>Verification</a>
</li><li class=away><a href=internals.html>Internals</a>
</li><li class=away><a href=license.html>License</a>
</li><li class=here>References
</li><li class=away><a href=comparison.html>Comparison</a>
</li></ul></nav>
<div id=hidemenu></div>
</label>
<div class=main>
<div class=pagetitle>djbsort: References</div>
<p>There is a vast literature on sorting,
so this page cannot hope to be comprehensive.
Reasons to include something here:</p>
<ul>
<li>
<p>This is, or could be, something to credit for ideas used in djbsort.</p>
</li>
<li>
<p>The results are, or have been claimed to be,
competitive with djbsort in <a href="speed.html">speed</a>.
Speed results are not excluded on the basis of
<a href="security.html">security</a> or
<a href="verif.html">verification</a>.
Results are also not excluded on the basis of array-size limitations
(only multiples of 64, only powers of 2, etc.) or array-alignment requirements.</p>
</li>
</ul>
<h2>Available software</h2>
<p><a name="2001-herf"></a>
2001 Herf
<a href="http://stereopsis.com/radix.html">"radix"</a></p>
<p><a name="2016-santurkar"></a>
2016 Santurkar
<a href="https://github.com/sid1607/avx2-merge-sort">"avx2-merge-sort"</a></p>
<p><a name="2017-gueron"></a>
2017 Gueron–Krasnov
<a href="https://github.com/vkrasnov/avx_qsort">"avx_qsort"</a></p>
<p><a name="2017-bernstein"></a>
2017 Bernstein–Chuengsatiansup–Lange–van Vredendaal
<a href="https://ntruprime.cr.yp.to/software.html">"ntruprime-20170815.tar.gz"</a>:
constant time;
direct ancestor of djbsort;
see also
<a href="https://ntruprime.cr.yp.to/papers.html">paper</a>
"NTRU Prime: reducing attack surface at low cost"</p>
<p>2017 Bramas
<a href="https://gitlab.inria.fr/bramas/avx-512-sort">"avx-512-sort"</a>:
only for AVX-512;
see also
<a href="https://thesai.org/Downloads/Volume8No10/Paper_44-A_Novel_Hybrid_Quicksort_Algorithm_Vectorized.pdf">paper</a>
"A novel hybrid quicksort algorithm vectorized using AVX-512 on Intel Skylake"</p>
<p><a name="2018-hou"></a>
2018 Hou–Wang–Feng
<a href="https://github.com/vtsynergy/aspas_sort">"aspas"</a>:
see also paper
"A framework for the automatic vectorization of parallel sort on x86-based processors"</p>
<p><a name="2020-blacher"></a>
2020 Blacher–Giesen–Kühne
<a href="https://github.com/simd-sorting/fast-and-robust">"fast-and-robust"</a>:
see also
<a href="https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2021.3">paper</a>
"Fast and robust vectorized in-place sorting of primitive types"</p>
<p><a name="2020-shechter"></a>
2020 Shechter
<a href="https://github.com/damageboy/vxsort-cpp">"vxsort"</a>:
see also
<a href="https://bits.houmus.org/2020-01-28/this-goes-to-eleven-pt1">blog post</a></p>
<p>2020 Bingmann–Marianczuk–Sanders
<a href="https://github.com/JMarianczuk/SmallSorters">"SmallSorters"</a>:
only for small sizes;
see also
<a href="https://arxiv.org/abs/2002.05599">paper</a>
"Engineering faster sorters for small sets of items"</p>
<p><a name="2022-blacher"></a>
2022 Blacher–Giesen–Sanders–Wassenberg
<a href="https://github.com/google/highway">"vqsort"/"highway"</a>:
see also
<a href="https://arxiv.org/abs/2205.05982">paper</a>
"Vectorized and performance-portable Quicksort"</p>
<p><a name="2022-intel"></a>
2022 Intel
<a href="https://github.com/intel/x86-simd-sort.git">"x86-simd-sort"</a>
(development taken over by numpy in 2025)</p>
<h2>Harder-to-verify performance claims</h2>
<p><a href="https://www.sciencedirect.com/science/article/pii/0167819187900627">1987 Rönsch–Strauss</a>
"Timing results of some internal sorting algorithms on vector computers"</p>
<p><a href="https://webdocs.cs.ualberta.ca/~amaral/papers/furtak-spaa07.pdf">2007 Furtak–Amaral–Niewiadomski</a>
"Using SIMD registers and instructions to enable instruction-level parallelism in sorting algorithms"</p>
<p><a href="https://web.archive.org/web/20120508095814/http://www.vldb.org/pvldb/1/1454171.pdf">2008 Chhugani–Macy–Baransi–Nguyen–Hagog–Kumar–Lee–Chen–Dubey</a>
"Efficient implementation of sorting on multi-core SIMD CPU architecture"</p>
<p><a href="https://web.archive.org/web/20170809023503/http://olab.is.s.u-tokyo.ac.jp/~kamil.rocki/xiaochen_rocki_IA3_SC13.pdf">2013 Xiaochen–Rocki–Suda</a>
"Register level sort algorithm on multi-core SIMD processors"</p>
<p><a href="https://www.semanticscholar.org/paper/VSR-sort%3A-A-novel-vectorised-sorting-algorithm-%26-Hayes-Palomar/dbbd5cccbbae8bb7ebed4854371901e763d2c08c">2015 Hayes–Palomar–Unsal–Cristal–Valero</a>
"VSR sort: A novel vectorised sorting algorithm & architecture extensions for future microprocessors"</p>
<p><a href="https://synergy.cs.vt.edu/pubs/papers/hou-aspas-ics15.pdf">2015 Hou–Wang–Feng</a>
"ASPaS: a framework for automatic SIMDization of parallel sorting on x86-based many-core processors":
reports, e.g.,
about 220 million cycles (55 cycles/byte) on 1.05GHz Xeon Phi 5110P (Knights Corner) for n=1000000</p>
<p><a href="https://web.archive.org/web/20210418053109/https://journal.uad.ac.id/index.php/TELKOMNIKA/article/view/2741">2016 Xu–Xu–Yin–Zheng</a>
"Optimized merge sort on modern commodity multi-core CPUs",
Telkomnika <strong>14</strong> (2016), 209–318:
reports, e.g.,
about 110 million cycles (27 cycles/byte) on 2.4GHz Core i7 4700MQ (Haswell) for n=1048576</p>
<p><a href="https://web.archive.org/web/20240815205312/https://www.imada.sdu.dk/u/lcf/pubs/paper36.pdf">2016 Codish–Cruz-Filipe–Nebel–Schneider-Kamp</a>
"Optimizing sorting algorithms by using sorting networks",
Formal Aspects of Computing <strong>29</strong> (2017), 559–579:
reports, e.g.,
about 150 cycles (2.3 cycles/byte) on unspecified "Core i7" for n=16;
about 120000 cycles (3 cycles/byte) for n=10000</p>
<p><a href="https://web.archive.org/web/20260123055800/https://valeriyvan.com/assets/docs/Fast%20Quicksort%20Implementation%20Using%20AVX%20Instructions.pdf">2016 Gueron–Krasnov</a>
"Fast quicksort implementation using AVX instructions",
The Computer Journal <strong>59</strong> (2016), 83–90:
reports, e.g.,
about 160000 cycles (40 cycles/byte) for n=1000 <code>int32</code> on Haswell (Figure 4),
and
about 130000 cycles (32 cycles/byte) for n=1000 <code>int32</code> on Haswell
using radix sort in Intel's Integrated Performance Primitives library,
"the fastest in-memory sort we are aware of"</p>
<p><a href="https://ieeexplore.ieee.org/abstract/document/8855628">2019 Yin–Zhang–Müller–Liu–Wei–Schmidt–Liu</a>
"Efficient parallel sort on AVX-512-based multi-core and many-core Architectures"</p>
<h2>Sorting networks</h2>
<p><a href="https://www.cs.kent.edu/~batcher/sort.pdf">1968 Batcher</a>
"Sorting networks and their applications"</p>
<p><a href="https://www.ics.uci.edu/~dan/pubs/BatcherMerge.pdf">1983 Kumar–Hirschberg</a>
"An efficient implementation of Batcher's odd-even merge algorithm
and its application in parallel sorting schemes"</p>
<p><a href="https://web.archive.org/web/20250221203539/https://www.cal.cs.ucla.edu/tech-report/198_-reports/860043.pdf">1986 Marberg–Gafni</a>
"Sorting in constant number of row and column phases on a mesh"</p>
<p><a name="1987-chung"></a>
<a href="https://www.sciencedirect.com/science/article/pii/0012365X9090173F">1987 Chung–Ravikumar</a>
"Bounds on the size of test sets for sorting and related networks",
Discrete Mathematics (1990)</p>
<p><a href="https://link.springer.com/chapter/10.1007/3-540-58325-4_233">1994 Lee–Batcher</a>
"A multiway merging network"</p>
<p><a name="1998-lang"></a>
<a href="https://hwlang.de/algorithmen/sortieren/bitonic/oddn.htm">1998 Lang</a>
"Bitonic sorting network for n not a power of 2"</p>
<p><a href="https://www.cs.dartmouth.edu/~thc/papers/slabpose.pdf">2006 Chaudhry–Cormen</a>
"Slabpose columnsort: a new oblivious algorithm for out-of-core sorting on distributed-memory clusters"</p>
<p><a name="2007-even"></a>
<a href="https://web.archive.org/web/20250103112116/https://www.eng.tau.ac.il/~guy/Papers/conclusive.pdf">2007 Even–Levi–Litman</a>
"Optimal conclusive sets for comparator networks",
Theoretical Computer Science (2009)</p>
<p><a href="https://www.cs.au.dk/~gerth/advising/thesis/kris-vestergaard-ebbesen.pdf">2015 Ebbesen</a>
"On the practicality of data-oblivious sorting"</p>
<h2>Symbolic execution</h2>
<p><a name="2007-nethercote"></a>
<a href="https://www.valgrind.org/docs/valgrind2007.pdf">2007 Nethercote–Seward</a>
"Valgrind: a framework for heavyweight dynamic binary instrumentation",
ACM SIGPLAN Conference on Programming Language Design and Implementation</p>
<p><a name="2016-shoshitaishvili"></a>
<a href="https://www.cs.ucsb.edu/~vigna/publications/2016_SP_angrSoK.pdf">2016 Shoshitaishvili–Wang–Salls–Stephens–Polino–Dutcher–Grosen–Feng–Hauser–Kruegel–Vigna</a>
"SoK: (State of) The Art of War: Offensive Techniques in Binary Analysis",
IEEE Symposium on Security and Privacy</p><hr><font size=1><b>Version:</b>
This is version 2026.01.23 of the "References" web page.
</font>
</div>
</body>
</html>