Introduction
Besides being a challenging and interesting problem in itself,
computational modeling of protein structure has significant practical
impact on the biomedical field1-3. The most direct
application is in structural biology where models are used to help
determine protein structures by experimental methods including X-ray
crystallography, cryo-electron microscopy and NMR spectroscopy. In X-ray
crystallography, models are often used to solve the phase problem by
molecular replacement (MR), which relies on the existence of similar
protein structures or accurate models that serve as templates to be
placed in the crystal cell, consistent with the diffraction data4. In NMR, models can
assist with the prediction of chemical shifts and NMR spectra, or the
interpretation of real spectra (i.e., chemical shift assignments and
then NOE assignments), and with building structures that satisfy
experimentally derived distance and angle restraints5,6.
In cryo-EM, models are of value for backbone tracing and fitting
sequence into a map, especially at low and moderate resolution (3.5-5.0
Å)7,8.
Regardless of the structure determination technique, models can be used
to identify and sometimes fix problematic regions in experimental
structures9. With the
recent major advances in protein structure modeling10-13, it is clear that
in the future models will play a substantially larger role in determining
and validating experimental structures.
In CASP, not-yet solved or not-yet released structures are solicited
from the experimental community as modeling targets. The suitability of
a structure as a target is largely determined by three factors:
estimated modeling difficulty (some may be too easy), whether there is
sufficient time available before experimental structure release, and
conversely, whether the experimental structure will be solved in time
for model assessment. Inevitably, some targets encounter problems
and normally have to be abandoned. There were eleven such targets in
CASP14, including seven where experimental data had been collected
but the structure nevertheless could not be determined. Because of the
very high accuracy of many submitted models on other targets, especially
those from the AlphaFold2 group11,12,
the organizers decided to see how many of the challenging structures
could be resolved with the aid of models. In previous CASPs, generated
models have occasionally helped solve structures. For example, the
crystal structure of the Sla2 ANTH domain of Chaetomium thermophilum (CASP11
target T0839) was determined by molecular replacement using CASP
models14, but such cases
have been exceptions. In CASP14, four structures were solved with the
aid of AlphaFold2 models. A post-CASP analysis has
shown15 that models
from other groups would also have been effective in some cases. In the
three remaining unsolved cases, poor data quality appears to have been
the issue. These are all ‘hard’ targets with limited or no homology
information available for at least some domains, demonstrating the power
of the new methods for all classes of modeling difficulty. For one other
target, provision of the models resulted in correction of a local
experimental error. This paper discusses these success stories, with
content for each target provided by the corresponding experimental
group.