Grammar-based Fuzzing of Data Integration Parsers in Computational Materials Science
Jan Arne Sparka
Sebastian Müller

and 4 more

March 17, 2023
Context: Computational materials science (CMS) focuses on in silico experiments to compute the properties of known and novel materials, and the community relies on many different software packages to do so. The NOMAD Laboratory offers to store the input and output files of these packages in its FAIR data repository. Because the file formats of these software packages are not standardized, parsers are used to provide the results in a normalized format. Objective: The main goal of this article is to report our experience and findings from applying grammar-based fuzzing to these parsers. Method: We constructed an input grammar for four common software packages in the CMS domain and experimentally evaluated the capability of grammar-based fuzzing to detect failures in the NOMAD parsers. Results: With our approach, we identified three unique critical bugs affecting service availability, as well as several additional syntactic, semantic, logical, and downstream bugs in the investigated NOMAD parsers. We reported all issues to the developer team prior to publication. Conclusion: Based on the experience gained, we recommend grammar-based fuzzing for other research software packages as well, to improve trust in the correctness of the produced results.
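
To illustrate the general idea behind grammar-based fuzzing of a parser, the following is a minimal sketch and not the setup used in the article. The grammar, the key names, and the functions `generate`, `toy_parser`, and `fuzz` are hypothetical and chosen for illustration only; they do not model any real CMS file format or any NOMAD parser. The sketch derives random inputs from a small context-free grammar, feeds them to a parser, and records every input that raises an exception.

```python
import random

# Hypothetical toy grammar for a "key = value" input file. It is an
# assumption made for this sketch, not one of the grammars from the article.
GRAMMAR = {
    "<file>": [["<line>"], ["<line>", "<file>"]],
    "<line>": [["<key>", " = ", "<value>", "\n"]],
    "<key>": [["energy"], ["steps"], ["cutoff"]],
    "<value>": [["<int>"], ["<int>", ".", "<int>"], ["-", "<int>"], [""]],
    "<int>": [["<digit>"], ["<digit>", "<int>"]],
    "<digit>": [[d] for d in "0123456789"],
}


def generate(symbol, rng, depth=0, max_depth=8):
    """Expand a grammar symbol into a random string."""
    if symbol not in GRAMMAR:
        return symbol  # terminal: emit as-is
    alternatives = GRAMMAR[symbol]
    if depth >= max_depth:
        # Past the depth limit, take the shortest alternative so that
        # recursive rules are guaranteed to terminate.
        alternatives = [min(alternatives, key=len)]
    expansion = rng.choice(alternatives)
    return "".join(generate(s, rng, depth + 1, max_depth) for s in expansion)


def toy_parser(text):
    """Hypothetical parser under test: maps each key to a float value."""
    result = {}
    for line in text.splitlines():
        key, value = line.split(" = ")
        result[key] = float(value)  # fails on an empty value
    return result


def fuzz(parser, runs=200, seed=0):
    """Feed generated inputs to the parser and collect failing inputs."""
    rng = random.Random(seed)
    failures = []
    for _ in range(runs):
        sample = generate("<file>", rng)
        try:
            parser(sample)
        except Exception as exc:
            failures.append((type(exc).__name__, sample))
    return failures


if __name__ == "__main__":
    found = fuzz(toy_parser)
    print(f"{len(found)} failing inputs out of 200")
    for name, sample in found[:3]:
        print(name, repr(sample))
```

Because every generated input is syntactically valid with respect to the grammar, failures found this way point at inputs the parser should arguably handle, such as the empty value in this toy case, rather than at arbitrary malformed bytes.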
