João Caldeira

and 4 more

Context: Profiling developers is challenging because many factors, such as their skills, experience, development environment, and behaviors, may influence a detailed analysis and the delivery of coherent interpretations. Objective: We aim to profile software developers by mining their software development process. To do so, we performed a controlled experiment in which, within a Python programming contest, a group of developers was given the same well-defined set of requirements specifications and a well-defined sprint schedule. Events were collected from the PyCharm IDE and from the Mooshak automatic jury, where subjects checked in their code. Method: We used n-gram language models and text mining to characterize developers’ profiles, and process mining algorithms to discover their overall workflows and extract the corresponding metrics for further evaluation. Additionally, we evaluated a textual abstraction of software process smells and assessed the results. Results: Findings show that we can clearly characterize most developers with a coherent rationale and distinguish the top performers from those with more challenging behaviors. This approach may ultimately lead to a standard catalog of software development process smells and their corresponding textual abstractions, which is fundamentally useful in conjunction with Large Language Models. Conclusions: A developer’s profile gives a software project manager a clue for selecting the tasks that developer should be assigned. With the increasing use of low-code and no-code platforms, where code is automatically generated from a higher abstraction layer, mining developers’ actions in the development platforms is a promising approach not only for detecting behaviors early but also for assessing project complexity and modeling effort. Results are promising; however, further testing is needed to support this approach.
If this proves useful, large language models can be trained specifically to discover development process patterns within teams and organizations.
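
To make the n-gram idea in the Method concrete, the following is a minimal sketch of how a per-developer profile could be built from a sequence of IDE events. It is an illustration under stated assumptions, not the authors' implementation: the event names, the function names `ngrams` and `ngram_profile`, and the choice of bigrams with relative frequencies are all hypothetical, since the abstract does not specify the event vocabulary or the model order.

```python
from collections import Counter
from typing import Iterable, Sequence


def ngrams(events: Sequence[str], n: int) -> Iterable[tuple[str, ...]]:
    """Yield every contiguous run of n events from the log."""
    return (tuple(events[i:i + n]) for i in range(len(events) - n + 1))


def ngram_profile(events: Sequence[str], n: int = 2) -> dict[tuple[str, ...], float]:
    """Return the relative frequency of each n-gram in one developer's event log."""
    counts = Counter(ngrams(events, n))
    total = sum(counts.values())
    if total == 0:
        # Log shorter than n events: no n-grams to profile.
        return {}
    return {gram: count / total for gram, count in counts.items()}


# Hypothetical event names; the real PyCharm event vocabulary is an assumption here.
dev_a = ["edit", "run", "edit", "run", "submit"]
dev_b = ["edit", "edit", "edit", "debug", "edit", "submit"]

profile_a = ngram_profile(dev_a)
profile_b = ngram_profile(dev_b)
```

Under these assumptions, two developers can be compared by looking at which event transitions dominate their profiles, for example a high share of `("edit", "run")` versus long runs of `("edit", "edit")`.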