Pengfei WANG

Deduplicating and sorting large sets of integers is a common requirement in big data processing, and it is particularly challenging in single-threaded or resource-constrained environments. This paper proposes the SlowSort algorithm as an improvement and engineering optimization of the classic BitSort idea, which naturally includes deduplication. SlowSort supports negative integers through offset mapping; it significantly compresses space requirements by using the second smallest and second largest values to determine the bit vector range, excluding the influence of outliers; and it substantially improves single-threaded execution efficiency through optimizations such as bitwise operations. Comparative experiments implemented in Java and C++ show that: on high-density integer datasets within the 10^7 range, SlowSort's average performance surpasses parallel sorting algorithms and is on par with radix sort; on integer sets within the 10^9 range, SlowSort's performance is comparable to parallel sorting and radix sort; on large-scale datasets of one billion elements or more, SlowSort's performance is 50% lower than parallel sorting and radix sort, but its peak memory overhead is reduced by 50%; across all three scales, SlowSort is more than 8 times faster than standard library sorting algorithms (Java Arrays.sort, C++ std::sort). SlowSort therefore offers a solution that balances performance and engineering practicality for large-scale integer deduplication and sorting in suitable scenarios.
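To make the underlying idea concrete, the following is a minimal sketch of the bit-vector deduplicate-and-sort technique the abstract builds on, written in Java. It is not the authors' SlowSort implementation: for simplicity it sizes the bit vector from the true minimum and maximum rather than the second smallest and second largest values (so the outlier-exclusion optimization is omitted), but it does show the offset mapping for negative integers and the word-level bitwise operations.

```java
import java.util.Arrays;

public class BitSortSketch {
    // Deduplicate and sort an int array using a bit vector.
    // Negative values are supported by offsetting every value by -min,
    // so bit index 0 corresponds to the smallest input value.
    // NOTE: the real SlowSort sizes the vector from the second smallest /
    // second largest values to exclude outliers; this sketch uses min/max.
    public static int[] dedupSort(int[] a) {
        int min = Arrays.stream(a).min().orElseThrow();
        int max = Arrays.stream(a).max().orElseThrow();
        long range = (long) max - min + 1;
        // One bit per possible value, packed 64 per long word.
        long[] bits = new long[(int) ((range + 63) >>> 6)];
        for (int v : a) {
            long idx = (long) v - min;              // offset mapping
            bits[(int) (idx >>> 6)] |= 1L << (idx & 63);
        }
        // Count set bits to size the output exactly (duplicates collapse).
        int count = 0;
        for (long w : bits) count += Long.bitCount(w);
        int[] out = new int[count];
        int k = 0;
        for (long i = 0; i < range; i++) {
            if ((bits[(int) (i >>> 6)] & (1L << (i & 63))) != 0) {
                out[k++] = (int) (i + min);         // undo the offset
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] result = dedupSort(new int[]{5, -3, 5, 0, -3, 7});
        System.out.println(Arrays.toString(result)); // prints [-3, 0, 5, 7]
    }
}
```

Scanning the bit vector in index order yields the values already sorted, which is why this family of algorithms gets deduplication "for free": each distinct value occupies exactly one bit regardless of how many times it appears in the input.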