3.1. Brief description of the target
As a member of the recently defined Kuttervirus genus, the Escherichia coli O157:H7 bacteriophage CBA120 infects multiple
hosts using four tailspike proteins (TSP1-4). Each TSP has a distinct
endo-glycosidase activity specific to the lipopolysaccharides of
different bacterial hosts. The four phage CBA120 TSPs are the best
characterized so far, and thus serve as a paradigm for understanding the
infection mechanism and host range expansion characteristic of the Kuttervirus genus. All TSPs assemble into trimers and employ the
same overall fold of their catalytic domains (trimers of β-helix
subunits). Nevertheless, within this fold, the different active site
architectures confer different endo-glycosidase substrate specificities,
which in turn facilitates the host range expansion of the phage [37-40]. The four TSPs
form a complex, seen on negative-stained electron micrographs as a
branched appendage emanating from the phage tail [41]. The 335 N-terminal
amino acids of TSP4 mediate this assembly and anchoring function. The
sequence of this region (hereafter termed TSP4-N) comprises the target
submitted for CASP14 structure prediction (target T1070). The crystal
structure of TSP4-N was determined initially at a resolution limit of
3.2 Å using single-wavelength anomalous dispersion (SAD) at the Se absorption
edge of crystals containing SeMet protein. This structure served as a
molecular replacement search model to determine the crystal structure of
the wild-type TSP4-N using crystals that diffracted to a resolution
limit of 2.6 Å. Structure refinement of this crystal form yielded R = 0.206 and Rfree = 0.229.
Consistent with the full-length TSP4, the TSP4-N also assembled into
trimers. The structure revealed four domains connected by flexible
linkers. The 75 N-terminal amino acids comprise the domain that anchors
TSP4 to the phage tail baseplate (hereafter termed AD). Of these,
approximately 50 amino acid residues fold into an intertwined triple
β-helix. The three chains then disengage, and the ensuing 25 residues
form an antiparallel β-prism II, with each subunit contributing a three-stranded
antiparallel β-sheet to the trimer prism (Fig. 5A). This was the most
challenging region for structure prediction because of its lack of
sequence homology to proteins of known structure. Following a
short linker region, the polypeptide chain folds into three domains
(hereafter termed XD1-3) that recruit the partner TSPs. While XD1
exhibits a low but clear sequence identity to a domain of gp9 from the phage
T4 baseplate (18% over 95 of 100 shared amino acid residues), XD2 and
XD3 exhibit only remote sequence homology to proteins of known crystal
structure, which can be detected by Hidden Markov Model methods. Domain
XD1 adopts a mixed β-sandwich fold, while both XD2 and XD3 adopt a
jellyroll fold. In the crystal structures, regardless of whether the trimer
is related by a crystallographic or a non-crystallographic 3-fold axis, all
domains obey the same 3-fold symmetry axis. The XD1 and XD3 monomers
form closely packed trimeric assemblies. However, XD2 subunits splay
apart and do not interact with one another even though they remain
related by the 3-fold symmetry axis. This spatial separation of XD2
subunits prevents binding of a trimeric partner TSP, and is probably a
crystal packing artifact. Indeed, a crystal structure of a protein
construct lacking the XD3 domain revealed closely packed XD2 subunits,
as necessary for binding of a trimeric TSP partner.