The CASP16 experiment provided the first opportunity to benchmark AlphaFold3. In contrast to AlphaFold2, AlphaFold3 can predict the structure of non-protein molecules, and according to the benchmark presented by its developers, it should perform slightly better than AlphaFold2 for proteins. In this study, we assess the performance of AlphaFold3 using both automatic server submissions and manual predictions from the Elofsson group. All predictions were generated via the AlphaFold3 web server, with manual interventions applied to large targets and ligands. We found that AlphaFold3 performs slightly better than AlphaFold2-based methods for protein complexes. However, when massive sampling is applied to AlphaFold2, the difference disappears. In the official CASP ranking, AlphaFold3 performs better than AlphaFold2 for easier targets, but not for harder targets. Further, the performance of the AlphaFold3 server is comparable to that of the best methods when only the top-ranked predictions are considered, but slightly behind when the best of the five submitted models is considered. There are targets where AlphaFold3 makes a good prediction and the top-ranked method fails, and vice versa, indicating that one avenue for progress could be to develop better strategies for identifying the best model. When AlphaFold3 is used to predict the stoichiometry of larger protein complexes, the accuracy is limited, especially for heteromeric targets. For targets that include nucleic acids, the accuracy is, in general, relatively low, but AlphaFold3 is not far behind the top-ranked method. In summary, AlphaFold3 provides an easy-to-use method that offers close to state-of-the-art predictions in all categories of CASP.

Gabriele Pozzati

and 2 more