Daniel Lemire

Full Professor, University of Quebec

Daniel Lemire is a computer science professor at the Data Science Laboratory of the University of Quebec (TELUQ). He is ranked in the top 2% of all scientists (Stanford University/Elsevier ranking, 2024). He is among the 1000 most followed programmers on GitHub, a platform with over 100 million developers. He has published over 85 peer-reviewed research papers. His work is found in many standard libraries (.NET, Rust, GCC/libstdc++, LLVM/libc, Go, Node.js, etc.) and in the major Web browsers (Safari, Chrome, etc.). He is an editor at the journal Software: Practice and Experience (Wiley, established in 1971). In 2020, he received the University of Quebec’s Award of Excellence for Achievement in Research for his work on the acceleration of JSON parsing. His research interests include high-performance programming.

Canada

Public Documents 6

Scanning HTML at Tens of Gigabytes per Second on ARM Processors

Daniel Lemire

September 18, 2024

Modern processors have instructions to process 16 bytes or more at once. These instructions are called SIMD, for single instruction, multiple data. Recent advances have leveraged SIMD instructions to accelerate parsing of common Internet formats such as JSON and base64. The two major Web browser engines (WebKit and Blink) have adopted SIMD algorithms for parsing HTML on 64-bit ARM processors. During HTML parsing, they quickly identify specific characters with a strategy called vectorized classification. We review their techniques and compare them with a faster alternative. We measure a 20-fold performance improvement in HTML scanning compared to traditional methods on recent ARM processors. Our findings highlight the potential of SIMD-based algorithms for optimizing Web browser performance.
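As a rough illustration of vectorized classification, the sketch below emulates in portable scalar C++ what a 16-byte table lookup followed by a byte-wise comparison would compute. It assumes that the characters of interest are `<`, `&`, the carriage return and the null byte, which have distinct low nibbles. The names `kLowNibbleTable`, `is_special` and `find_special` are hypothetical, and this is neither the browsers' code nor the paper's code. A real implementation would replace the inner loop with SIMD instructions.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Table indexed by the low nibble of a byte (assumed target set):
// 0x0 -> '\0' (0x00), 0x6 -> '&' (0x26), 0xC -> '<' (0x3C), 0xD -> '\r' (0x0D).
// All other entries are 0, which can never equal a byte with a nonzero low nibble.
static const std::uint8_t kLowNibbleTable[16] = {
    0x00, 0, 0, 0, 0, 0, 0x26, 0, 0, 0, 0, 0, 0x3C, 0x0D, 0, 0};

// A byte is a target exactly when the table entry for its low nibble equals it.
inline bool is_special(std::uint8_t byte) {
  return kLowNibbleTable[byte & 0x0F] == byte;
}

// Returns the index of the first target character, or html.size() if none.
std::size_t find_special(const std::string& html) {
  std::size_t i = 0;
  // Process 16-byte blocks; each block yields a 16-bit match mask,
  // mimicking what a SIMD lookup plus comparison would produce.
  for (; i + 16 <= html.size(); i += 16) {
    std::uint16_t mask = 0;
    for (std::size_t lane = 0; lane < 16; ++lane) {
      if (is_special(static_cast<std::uint8_t>(html[i + lane]))) {
        mask = static_cast<std::uint16_t>(mask | (1u << lane));
      }
    }
    if (mask != 0) {
      std::size_t lane = 0;
      while (((mask >> lane) & 1u) == 0) ++lane;  // first set bit
      return i + lane;
    }
  }
  // Tail: fewer than 16 bytes remain.
  for (; i < html.size(); ++i) {
    if (is_special(static_cast<std::uint8_t>(html[i]))) return i;
  }
  return html.size();
}
```

For instance, `find_special("FLV6 &")` returns 5: the letters `F`, `V` and `6` share a low nibble with `&`, and `L` shares one with `<`, but none of them equals the table entry, so only the `&` matches.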

Parsing Millions of DNS Records per Second

Jeroen Koekkoek

and 1 more

September 06, 2024

The Domain Name System (DNS) plays a critical role in the functioning of the Internet. It provides a hierarchical name space for locating resources. Data is typically stored in plain text files, possibly spanning gigabytes. Frequent parsing of these files to refresh the data is computationally expensive: processing a zone file can take minutes. We propose a novel approach called simdzone to enhance DNS parsing throughput. We use data parallelism, specifically the Single Instruction Multiple Data (SIMD) instructions available on commodity processors. We show that we can multiply the parsing speed compared to state-of-the-art parsers found in Knot DNS and the NLnet Labs Name Server Daemon (NSD). The resulting software library replaced the parser in NSD.

Batched Ranged Random Integer Generation

Nevin Brackett-Rozinsky

and 1 more

August 27, 2024

Pseudorandom values are often generated as 64-bit binary words. These random words need to be converted into ranged values without statistical bias. We present an efficient algorithm to generate multiple independent uniformly-random bounded integers from a single uniformly-random binary word, without any bias. In the common case, our method uses one multiplication and no division operations per value produced. In practice, our algorithm can triple the speed of unbiased random shuffling for small to moderately large arrays.
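The idea of drawing several bounded integers from one random word can be sketched for the two-value case as follows. This is an illustrative reading of the abstract that chains two multiply-and-shift range reductions and adds a rejection test. It is not the authors' reference code. The function name `two_bounded` is hypothetical, the product of the two bounds is assumed to fit in 64 bits, the generator is assumed to return uniform 64-bit words, and `unsigned __int128` is a GCC/Clang extension.

```cpp
#include <cstdint>
#include <random>
#include <utility>

using u128 = unsigned __int128;  // compiler extension (GCC, Clang)

// Returns (r1, r2) with r1 uniform in [0, s1) and r2 uniform in [0, s2).
// Assumptions: s1 >= 1, s2 >= 1, s1 * s2 does not overflow 64 bits,
// and rng() yields uniformly random 64-bit words.
template <typename Rng>
std::pair<std::uint64_t, std::uint64_t> two_bounded(Rng& rng,
                                                    std::uint64_t s1,
                                                    std::uint64_t s2) {
  const std::uint64_t bound = s1 * s2;
  while (true) {
    const std::uint64_t x = rng();
    // First value: high 64 bits of x * s1; the low bits are reused.
    const u128 m1 = static_cast<u128>(x) * s1;
    const std::uint64_t r1 = static_cast<std::uint64_t>(m1 >> 64);
    const std::uint64_t lo1 = static_cast<std::uint64_t>(m1);
    // Second value: high 64 bits of lo1 * s2.
    const u128 m2 = static_cast<u128>(lo1) * s2;
    const std::uint64_t r2 = static_cast<std::uint64_t>(m2 >> 64);
    const std::uint64_t lo2 = static_cast<std::uint64_t>(m2);
    // Common case: no division needed.
    if (lo2 >= bound) return {r1, r2};
    // Rare case: compute 2^64 mod bound and reject biased draws.
    const std::uint64_t threshold = (0 - bound) % bound;
    if (lo2 >= threshold) return {r1, r2};
  }
}
```

With a `std::mt19937_64` generator, `two_bounded(rng, 6, 6)` simulates a pair of dice with faces numbered 0 to 5. The division is reached only when the low word falls below `s1 * s2`, which is very unlikely for small bounds.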

Parsing Millions of URLs per Second

Yagiz Nizipli

and 1 more

June 02, 2023

URLs are fundamental elements of web applications. By applying vector algorithms, we built a fast standard-compliant C++ implementation. Our parser uses three times fewer instructions than competing parsers following the WHATWG URL standard (e.g., Servo’s rust-url) and up to eight times fewer instructions than the popular curl parser. The Node.js environment adopted our C++ library. In our tests on realistic data, a recent Node.js version (20.0) with our parser is four to five times faster than the last version with the legacy URL parser.

Transcoding Unicode Characters with AVX-512 Instructions

Daniel Lemire

and 1 more

December 09, 2022

Intel's recent processors include a powerful set of instructions (AVX-512) that operate on 512-bit registers. Some of these instructions have no equivalent in earlier instruction sets. We leverage these instructions to efficiently transcode strings between the most common formats: UTF-8 and UTF-16. With our novel algorithms, we are often twice as fast as the previous best solutions. For example, we transcode Chinese text from UTF-8 to UTF-16 at more than 5 GiB/s using fewer than 2 CPU instructions per character. To ensure reproducibility, we make our software freely available as an open source library.
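To make the transcoding task concrete, here is a minimal scalar baseline that converts UTF-8 to UTF-16 one character at a time. It is not the AVX-512 algorithm from the paper, only a sketch of the operation that such algorithms accelerate. The function name `utf8_to_utf16` is hypothetical, and the input is assumed to be valid UTF-8 (production transcoders also validate).

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Scalar sketch: assumes `in` is valid UTF-8; stops early on truncated input.
std::u16string utf8_to_utf16(const std::string& in) {
  std::u16string out;
  std::size_t i = 0;
  while (i < in.size()) {
    const std::uint8_t lead = static_cast<std::uint8_t>(in[i]);
    std::uint32_t cp;  // decoded code point
    std::size_t len;   // number of bytes in this sequence
    if (lead < 0x80) {
      cp = lead;
      len = 1;
    } else if ((lead & 0xE0) == 0xC0) {
      cp = lead & 0x1F;
      len = 2;
    } else if ((lead & 0xF0) == 0xE0) {
      cp = lead & 0x0F;
      len = 3;
    } else {
      cp = lead & 0x07;
      len = 4;
    }
    if (i + len > in.size()) break;  // truncated sequence
    // Each continuation byte contributes its low 6 bits.
    for (std::size_t k = 1; k < len; ++k) {
      cp = (cp << 6) | (static_cast<std::uint8_t>(in[i + k]) & 0x3F);
    }
    i += len;
    if (cp < 0x10000) {
      out.push_back(static_cast<char16_t>(cp));
    } else {
      // Code points above U+FFFF become a surrogate pair.
      cp -= 0x10000;
      out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
      out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
    }
  }
  return out;
}
```

A typical Chinese character such as U+4E2D occupies three bytes in UTF-8 (`E4 B8 AD`) and a single 16-bit unit in UTF-16, so this loop performs a branch and several shifts per character. That per-character cost is what a vectorized transcoder aims to cut.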

On-Demand JSON: A Better Way to Parse Documents?

Daniel Lemire

and 1 more

January 04, 2023

JSON is a popular standard for data interchange on the Internet. Ingesting JSON documents can be a performance bottleneck. A popular parsing strategy consists of converting the input text into a tree-based data structure, sometimes called a Document Object Model or DOM. We designed and implemented a novel JSON parsing interface, called On-Demand, that appears to the programmer like a conventional DOM-based approach. However, the underlying implementation is a pointer iterating through the content, only materializing the results (objects, arrays, strings, numbers) lazily. On recent commodity processors, an implementation of our approach provides superior performance in multiple benchmarks. To ensure reproducibility, our work is freely available as open source software.
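The lazy idea can be illustrated with a toy cursor over a flat JSON array of non-negative integers. This is a deliberately simplified sketch and not the On-Demand interface itself: the class `LazyIntArray` and its methods are hypothetical, and the input is assumed to be well formed, with only spaces and commas between elements. The cursor moves forward through the text, and digits are converted to a number only when the caller asks for the value.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>

// Toy forward-only cursor over text such as "[10, 20, 30]".
// No tree is built; elements are located first and converted on request.
class LazyIntArray {
 public:
  explicit LazyIntArray(std::string text) : text_(std::move(text)) {
    if (!text_.empty() && text_[0] == '[') pos_ = 1;  // step past '['
  }

  // Moves to the next element without converting it.
  // Returns false when no further element is found.
  bool advance() {
    while (pos_ < text_.size() && (text_[pos_] == ' ' || text_[pos_] == ',')) {
      ++pos_;
    }
    begin_ = pos_;
    while (pos_ < text_.size() && text_[pos_] >= '0' && text_[pos_] <= '9') {
      ++pos_;
    }
    end_ = pos_;
    return end_ > begin_;
  }

  // Materializes the current element; the conversion cost is paid only here.
  std::uint64_t value() const {
    std::uint64_t v = 0;
    for (std::size_t i = begin_; i < end_; ++i) {
      v = v * 10 + static_cast<std::uint64_t>(text_[i] - '0');
    }
    return v;
  }

 private:
  std::string text_;
  std::size_t pos_ = 0;
  std::size_t begin_ = 0;
  std::size_t end_ = 0;
};
```

A caller interested only in the second element of `[10, 20, 30]` would call `advance()` twice and then `value()`, so the first element is skipped without ever being converted to a number.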