Introduction
Besides being a challenging and interesting problem in itself, computational modeling of protein structure has significant practical impact on the biomedical field1-3. The most direct application is in structural biology, where models are used to help determine protein structures by experimental methods including X-ray crystallography, cryo-electron microscopy (cryo-EM) and NMR spectroscopy. In X-ray crystallography, models are often used to solve the phase problem by molecular replacement (MR), which relies on the existence of similar protein structures or accurate models that serve as templates to be placed in the crystal cell, consistent with the diffraction data4. In NMR, models can assist with the prediction of chemical shifts and NMR spectra, with the interpretation of real spectra (i.e., chemical shift assignments and then NOE assignments), and with building structures that satisfy experimentally derived distance and angle restraints5,6. In cryo-EM, models are of value for backbone tracing and fitting sequence into a map, especially at low and moderate resolution (3.5-5.0 Å)7,8. Regardless of the structure determination technique, models can be used to identify and sometimes fix problematic regions in experimental structures9. With the recent major advances in protein structure modeling10-13, it is clear that in the future models will play a substantially larger role in determining and validating experimental structures.
In CASP, structures that are not yet solved or not yet released are solicited from the experimental community as modeling targets. The suitability of a structure as a target is largely determined by three factors: estimated modeling difficulty (some may be too easy), whether there is sufficient time available before experimental structure release, and conversely, whether the experimental structure will be solved in time for model assessment. Inevitably, some targets encounter problems and normally have to be abandoned. There were eleven such targets in CASP14, including seven for which experimental data had been collected but the structure nevertheless could not be determined. Because of the very high accuracy of many submitted models on other targets, especially those from the AlphaFold2 group11,12, the organizers decided to see how many of the challenging structures could be resolved with the aid of models. In previous CASPs, generated models have occasionally helped solve structures; for example, the crystal structure of the Sla2 ANTH domain of Chaetomium thermophilum (CASP11 target T0839) was determined by molecular replacement using CASP models14. Such cases, however, have been exceptions. In CASP14, four structures were solved with the aid of AlphaFold2 models. A post-CASP analysis has shown15 that models from other groups would also have been effective in some cases. In the three remaining unsolved cases, poor data quality appears to have been the issue. The four solved structures are all ‘hard’ targets with limited or no homology information available for at least some domains, demonstrating the power of the new methods for all classes of modeling difficulty. For one other target, provision of the models resulted in correction of a local experimental error. This paper discusses these success stories, with content for each target provided by the corresponding experimental group.