-r--r--r-- 12334 djbsort-20260127/doc/html/refs.html raw
<html> <head> <meta http-equiv="content-type" content="text/html; charset=utf-8"> <meta name="viewport" content="width=device-width, initial-scale=1"> <style type="text/css"> html{overflow-y:scroll;background-color:#012345} body{font-family:"Noto Sans","Droid Sans","DejaVu Sans","Arial",sans-serif;line-height:1.5} tt,code{background-color:#f0f0f0;font-family:"Noto Sans Mono","Droid Sans Mono","DejaVu Sans Mono","Courier New",monospace,sans-serif;font-size:1em;} pre{margin-left:3em} p,ul,ol,blockquote,pre{font-size:1.0em;line-height:1.6} li p{font-size:1.0em} blockquote p{font-size:1.0em} h1{font-size:1.5em} h2{font-size:1.3em} h3{font-size:1.0em} h1 a{text-decoration:none} table{border-collapse:collapse} th,td{border:1px solid black} table a{text-decoration:none} table tr{font-size:1.0em;line-height:1.6em} table tr{font-size:1.0em;line-height:1.5} tbody tr:nth-child(2n+1){background-color:#f0ffff} tbody tr:nth-child(2n+2){background-color:#fffff0} #headline{display:block;margin:0;padding:0;color:#ffffff;background-color:#012345} #headline .text{font-weight:bold;font-size:1.0em} #headline input{display:none} #nav ul{margin:0;padding:0} #nav li{list-style-type:none;margin:0;padding:0} .navtop{padding-bottom:0.5em;font-weight:bold;font-size:1.0em} .navtop{background-color:#012345;color:#ffffff} #nav .here{background-color:#012345;color:#ffffff} #nav .away{background-color:#012345;color:#ffffff} #nav .away a{text-decoration:none;display:block;color:#ffffff} #nav .away a:hover,.away a:active{text-decoration:underline} #hidemenu{visibility:hidden;display:none;overflow:hidden;position:fixed;top:0;left:0;height:100%;width:100%} .main{padding:5px} .main{background-color:#ffffff} .pagetitle{font-size:1.4em;font-weight:bold} @media only screen and (min-width:512px) { .navtop{padding-top:5px} #headline{top:0;margin:0;width:160px;height:100%;position:fixed;overflow:auto} #headline .noselect{display:none} #headline #nav{visibility:visible;display:block;width:auto;height:auto} .main{margin-left:170px} #headline #hidemenu{visibility:hidden} } @media not screen and (min-width:512px) { #headline .noselect{-webkit-user-select:none;-ms-user-select:none;user-select:none;} #headline #nav #navbot{visibility:hidden;position:fixed;top:0;left:-70%;z-index:2;transition:0.2s;margin:0;padding:0} #headline input:checked ~ #nav #navbot{height:100%;position:fixed;top:0;left:0;visibility:visible;display:block;box-sizing:border-box;-moz-box-sizing:border-box;-webkit-box-sizing:border-box;vertical-align:center;font-size:1.0em;width:70%;overflow:auto} #headline input:checked ~ #hidemenu{visibility:visible;display:block;background:black;opacity:0.3;z-index:1} } </style> <title> djbsort: References</title> </head> <body> <label id=headline> <input type=checkbox /> <nav id=nav> <div class=navtop> <span class=noselect>≡</span> djbsort</div> <ul id=navbot> <li class=away><a href=index.html>Intro</a> </li><li class=away><a href=download.html>Download</a> </li><li class=away><a href=install.html>Install</a> </li><li class=away><a href=test.html>Test</a> </li><li class=away><a href=api.html>API</a> </li><li class=away><a href=speed.html>Speed</a> </li><li class=away><a href=security.html>Security</a> </li><li class=away><a href=verif.html>Verification</a> </li><li class=away><a href=internals.html>Internals</a> </li><li class=away><a href=license.html>License</a> </li><li class=here>References </li><li class=away><a href=comparison.html>Comparison</a> </li></ul></nav> <div id=hidemenu></div> </label> <div class=main> <div class=pagetitle>djbsort: References</div> <p>There is a vast literature on sorting, so this page cannot hope to be comprehensive. Reasons to include something here:</p> <ul> <li> <p>This is, or could be, something to credit for ideas used in djbsort.</p> </li> <li> <p>The results are, or have been claimed to be, competitive with djbsort in <a href="speed.html">speed</a>. Speed results are not excluded on the basis of <a href="security.html">security</a> or <a href="verif.html">verification</a>. Results are also not excluded on the basis of array-size limitations (only multiples of 64, only powers of 2, etc.) or array-alignment requirements.</p> </li> </ul> <h2>Available software</h2> <p><a name="2001-herf"></a> 2001 Herf <a href="http://stereopsis.com/radix.html">"radix"</a></p> <p><a name="2016-santurkar"></a> 2016 Santurkar <a href="https://github.com/sid1607/avx2-merge-sort">"avx2-merge-sort"</a></p> <p><a name="2017-gueron"></a> 2017 Gueron–Krasnov <a href="https://github.com/vkrasnov/avx_qsort">"avx_qsort"</a></p> <p><a name="2017-bernstein"></a> 2017 Bernstein–Chuengsatiansup–Lange–van Vredendaal <a href="https://ntruprime.cr.yp.to/software.html">"ntruprime-20170815.tar.gz"</a>: constant time; direct ancestor of djbsort; see also <a href="https://ntruprime.cr.yp.to/papers.html">paper</a> "NTRU Prime: reducing attack surface at low cost"</p> <p>2017 Bramas <a href="https://gitlab.inria.fr/bramas/avx-512-sort">"avx-512-sort"</a>: only for AVX-512; see also <a href="https://thesai.org/Downloads/Volume8No10/Paper_44-A_Novel_Hybrid_Quicksort_Algorithm_Vectorized.pdf">paper</a> "A novel hybrid quicksort algorithm vectorized using AVX-512 on Intel Skylake"</p> <p><a name="2018-hou"></a> 2018 Hou–Wang–Feng <a href="https://github.com/vtsynergy/aspas_sort">"aspas"</a>: see also paper "A framework for the automatic vectorization of parallel sort on x86-based processors"</p> <p><a name="2020-blacher"></a> 2020 Blacher–Giesen–Kühne <a href="https://github.com/simd-sorting/fast-and-robust">"fast-and-robust"</a>: see also <a href="https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SEA.2021.3">paper</a> "Fast and robust vectorized in-place sorting of primitive types"</p> <p><a name="2020-shechter"></a> 2020 Shechter <a href="https://github.com/damageboy/vxsort-cpp">"vxsort"</a>: see also <a href="https://bits.houmus.org/2020-01-28/this-goes-to-eleven-pt1">blog post</a></p> <p>2020 Bingmann–Marianczuk–Sanders <a href="https://github.com/JMarianczuk/SmallSorters">"SmallSorters"</a>: only for small sizes; see also <a href="https://arxiv.org/abs/2002.05599">paper</a> "Engineering faster sorters for small sets of items"</p> <p><a name="2022-blacher"></a> 2022 Blacher–Giesen–Sanders–Wassenberg <a href="https://github.com/google/highway">"vqsort"/"highway"</a>: see also <a href="https://arxiv.org/abs/2205.05982">paper</a> "Vectorized and performance-portable Quicksort"</p> <p><a name="2022-intel"></a> 2022 Intel <a href="https://github.com/intel/x86-simd-sort.git">"x86-simd-sort"</a> (development taken over by numpy in 2025)</p> <h2>Harder-to-verify performance claims</h2> <p><a href="https://www.sciencedirect.com/science/article/pii/0167819187900627">1987 Rönsch–Strauss</a> "Timing results of some internal sorting algorithms on vector computers"</p> <p><a href="https://webdocs.cs.ualberta.ca/~amaral/papers/furtak-spaa07.pdf">2007 Furtak–Amaral–Niewiadomski</a> "Using SIMD registers and instructions to enable instruction-level parallelism in sorting algorithms"</p> <p><a href="https://web.archive.org/web/20120508095814/http://www.vldb.org/pvldb/1/1454171.pdf">2008 Chhugani–Macy–Baransi–Nguyen–Hagog–Kumar–Lee–Chen–Dubey</a> "Efficient implementation of sorting on multi-core SIMD CPU architecture"</p> <p><a href="https://web.archive.org/web/20170809023503/http://olab.is.s.u-tokyo.ac.jp/~kamil.rocki/xiaochen_rocki_IA3_SC13.pdf">2013 Xiaochen–Rocki–Suda</a> "Register level sort algorithm on multi-core SIMD processors"</p> <p><a href="https://www.semanticscholar.org/paper/VSR-sort%3A-A-novel-vectorised-sorting-algorithm-%26-Hayes-Palomar/dbbd5cccbbae8bb7ebed4854371901e763d2c08c">2015 Hayes–Palomar–Unsal–Cristal–Valero</a> "VSR sort: A novel vectorised sorting algorithm &amp; architecture extensions for future microprocessors"</p> <p><a href="https://synergy.cs.vt.edu/pubs/papers/hou-aspas-ics15.pdf">2015 Hou–Wang–Feng</a> "ASPaS: a framework for automatic SIMDization of parallel sorting on x86-based many-core processors": reports, e.g., about 220 million cycles (55 cycles/byte) on 1.05GHz Xeon Phi 5110P (Knights Corner) for n=1000000</p> <p><a href="https://web.archive.org/web/20210418053109/https://journal.uad.ac.id/index.php/TELKOMNIKA/article/view/2741">2016 Xu–Xu–Yin–Zheng</a> "Optimized merge sort on modern commodity multi-core CPUs", Telkomnika <strong>14</strong> (2016), 209–318: reports, e.g., about 110 million cycles (27 cycles/byte) on 2.4GHz Core i7 4700MQ (Haswell) for n=1048576</p> <p><a href="https://web.archive.org/web/20240815205312/https://www.imada.sdu.dk/u/lcf/pubs/paper36.pdf">2016 Codish–Cruz-Filipe–Nebel–Schneider-Kamp</a> "Optimizing sorting algorithms by using sorting networks", Formal Aspects of Computing <strong>29</strong> (2017), 559–579: reports, e.g., about 150 cycles (2.3 cycles/byte) on unspecified "Core i7" for n=16; about 120000 cycles (3 cycles/byte) for n=10000</p> <p><a href="https://web.archive.org/web/20260123055800/https://valeriyvan.com/assets/docs/Fast%20Quicksort%20Implementation%20Using%20AVX%20Instructions.pdf">2016 Gueron–Krasnov</a> "Fast quicksort implementation using AVX instructions", The Computer Journal <strong>59</strong> (2016), 83–90: reports, e.g., about 160000 cycles (40 cycles/byte) for n=1000 <code>int32</code> on Haswell (Figure 4), and about 130000 cycles (32 cycles/byte) for n=1000 <code>int32</code> on Haswell using radix sort in Intel's Integrated Performance Primitives library, "the fastest in-memory sort we are aware of"</p> <p><a href="https://ieeexplore.ieee.org/abstract/document/8855628">2019 Yin–Zhang–Müller–Liu–Wei–Schmidt–Liu</a> "Efficient parallel sort on AVX-512-based multi-core and many-core Architectures"</p> <h2>Sorting networks</h2> <p><a href="https://www.cs.kent.edu/~batcher/sort.pdf">1968 Batcher</a> "Sorting networks and their applications"</p> <p><a href="https://www.ics.uci.edu/~dan/pubs/BatcherMerge.pdf">1983 Kumar–Hirschberg</a> "An efficient implementation of Batcher's odd-even merge algorithm and its application in parallel sorting schemes"</p> <p><a href="https://web.archive.org/web/20250221203539/https://www.cal.cs.ucla.edu/tech-report/198_-reports/860043.pdf">1986 Marberg–Gafni</a> "Sorting in constant number of row and column phases on a mesh"</p> <p><a name="1987-chung"></a> <a href="https://www.sciencedirect.com/science/article/pii/0012365X9090173F">1987 Chung–Ravikumar</a> "Bounds on the size of test sets for sorting and related networks", Discrete Mathematics (1990)</p> <p><a href="https://link.springer.com/chapter/10.1007/3-540-58325-4_233">1994 Lee–Batcher</a> "A multiway merging network"</p> <p><a name="1998-lang"></a> <a href="https://hwlang.de/algorithmen/sortieren/bitonic/oddn.htm">1998 Lang</a> "Bitonic sorting network for n not a power of 2"</p> <p><a href="https://www.cs.dartmouth.edu/~thc/papers/slabpose.pdf">2006 Chaudhry–Cormen</a> "Slabpose columnsort: a new oblivious algorithm for out-of-core sorting on distributed-memory clusters"</p> <p><a name="2007-even"></a> <a href="https://web.archive.org/web/20250103112116/https://www.eng.tau.ac.il/~guy/Papers/conclusive.pdf">2007 Even–Levi–Litman</a> "Optimal conclusive sets for comparator networks", Theoretical Computer Science (2009)</p> <p><a href="https://www.cs.au.dk/~gerth/advising/thesis/kris-vestergaard-ebbesen.pdf">2015 Ebbesen</a> "On the practicality of data-oblivious sorting"</p> <h2>Symbolic execution</h2> <p><a name="2007-nethercote"></a> <a href="https://www.valgrind.org/docs/valgrind2007.pdf">2007 Nethercote–Seward</a> "Valgrind: a framework for heavyweight dynamic binary instrumentation", ACM SIGPLAN Conference on Programming Language Design and Implementation</p> <p><a name="2016-shoshitaishvili"></a> <a href="https://www.cs.ucsb.edu/~vigna/publications/2016_SP_angrSoK.pdf">2016 Shoshitaishvili–Wang–Salls–Stephens–Polino–Dutcher–Grosen–Feng–Hauser–Kruegel–Vigna</a> "SoK: (State of) The Art of War: Offensive Techniques in Binary Analysis", IEEE Symposium on Security and Privacy</p><hr><font size=1><b>Version:</b> This is version 2026.01.23 of the "References" web page. </font> </div> </body> </html>