Introduction

(Figure 1)

The residues that form the obligatory folding nucleus are highly conserved within immunoglobulin domains but are conserved only in terms of residue type in fibronectin type III (fnIII) domains. However, there are rare examples of proteins that appear to have a disparate nucleation pattern. Here, we have identified an fnIII domain in which one of the hydrophobic residues in the conserved folding nucleus has been replaced by a surface polar residue, and we ask how the folding mechanism has been affected. An extensive protein engineering Φ-value analysis reveals that the folding mechanism is unaltered, but that a spatially different set of core residues is used to form the obligatory folding nucleus, where interactions within each sheet establish the correct hydrogen bond registry between the core β-strands. Subsequent interactions between two such pairs are able to bring the β-sheets together and set up the complex Greek key topology.

Results

Residue conservation within the folding nucleus of fnIII domains

(Figure 2)

Selection of CAfn2 as a candidate

(Bacillus circulans; Figures 1 and 3; Supplementary Figure 1)

Characterization of wild-type CAfn2

(Figure 4)

Effect of mutations

(Tables 1 and 2; Figure 4)

Structure of the transition state

(Table 2; Figures 3 and 5)

A and G-strands

(Tables 1 and 2; Figures 3 and 4)

Strands B, C, E and F

(Table 2; Figures 3 and 4)

C′-strand

(Tables 1 and 2; Figures 3 and 4)

Discussion

CAfn2 folds by a nucleation-condensation mechanism

Identification of the obligate (embryonic) folding nucleus

(Figure 6)

There is a caveat we should make.
We note that in CAfn2 it is more difficult to select which residues are likely to form part of the obligate nucleus than it was in TNfn3. Consider the B-strand. I20(B2) exhibits a Φ-value that is slightly lower than, but within error of, that of L22(B3). However, I20 forms no contacts with the nucleating residues from the opposite sheet, V38(C4) and V68(E4), suggesting that it does not form part of the obligatory nucleus that establishes the topology of the molecule. Furthermore, as was the case in TNfn3, we were unable to determine a Φ-value for the highly conserved Trp in position B4. Simulations confirmed for TNfn3 that this Trp residue had a low Φ-value (as we had inferred from the pattern of Φ-values surrounding the Trp residue). Trp24 makes 150 side-chain–side-chain contacts in CAfn2, and 65% of these contacts are with residues that have Φ-values that are (or are predicted to be) low (Φ ∼ 0.15, 52 contacts) or zero (46 contacts). Less than one-third of the contacts made by Trp24 are with residues in the putative obligatory nucleus (with L22, V38 and V68) and no contact is made with I55. Thus, we tentatively propose that if Trp24 does have a role in the folding nucleus, it is more likely to be in the critical layer than in the topology-defining obligate nucleus. Also consider residue A70 in the F-strand, in position F5. The Φ-value for this residue is very slightly higher than that for V68 in layer F4. However, an Ala to Gly mutation must be considered non-conservative and, furthermore, Ala70 makes no contact within the proposed obligate nucleus; that is, it cannot have a role in establishing the Greek key topology.

Comparison of the transition states of CAfn2 and TNfn3

(Figures 3, 6, 7 and 8)

In summary, for both proteins we observe the formation of a specific nucleus in the core of the protein involving formation of long-range tertiary contacts between a single residue from each of the B, C, E and F-strands.
Formation of this “obligate” nucleus establishes the topology of the protein. Other residues pack around this obligate nucleus to form the critical contact layer until sufficient contacts have formed to surmount the free-energy barrier. This is typical of a nucleation-condensation folding mechanism. The peripheral strands and the loops pack late, mainly after the rate-limiting step for folding.

Conclusion: plasticity within the obligatory folding nucleus in Ig-like domains

Unlike other classes of proteins, such as the homeodomain proteins, all Ig-like proteins appear to fold by the same nucleation-condensation mechanism. The obligate nucleus is defined by the interactions that are necessary to establish the complex Greek key β-sheet topology of the native state. Previous biophysical studies of members of the Ig-like fold have shown that this folding nucleus always comprises a ring of interacting residues within the hydrophobic core: one residue from each of the B, C, E and F-strands. Whereas the obligatory nucleus in the immunoglobulin superfamily proteins is highly conserved and is based around the invariant tryptophan located within the C-strand, members of the fnIII superfamily show more variability: rather than a particular structural position being restricted to a specific amino acid, each position simply needs to be occupied by a hydrophobic residue. Here, we have shown that the fnIII nucleus is more flexible still: when this pattern of residue conservation is lost upon mutation, fnIII proteins can “migrate” the folding nucleus, revealing plasticity in the early stages of the folding process while retaining the same folding mechanism.

Materials and Methods

Protein expression and purification

(Bacillus circulans; Escherichia coli)

Equilibrium measurements

Change of free energy on mutation

$$\Delta\Delta G_{D-N} = \langle m \rangle \left( [\mathrm{urea}]_{50\%}^{\mathrm{wt}} - [\mathrm{urea}]_{50\%}^{\mathrm{mut}} \right) \qquad (1)$$

Kinetic measurements

Φ-Value analysis

$$\Phi = \frac{\Delta\Delta G_{D-\ddagger}}{\Delta\Delta G_{D-N}} \qquad (2)$$

$$\Delta\Delta G_{D-\ddagger} = RT \ln \left( k_f / k_f' \right) \qquad (3)$$

Appendix A. Supplementary data

Supplementary Figure 1

doi:10.1016/j.jmb.2007.09.088
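
As an illustration of the Φ-value analysis defined by equations (1) to (3) in Materials and Methods, the calculation can be sketched in a few lines of Python. This is a minimal sketch and not the analysis code used in this work. It assumes that k_f refers to the wild-type protein and k_f′ to the mutant, that energies are expressed in kcal mol⁻¹, and that the temperature is 298.15 K; the function names and all numerical inputs are hypothetical and chosen only to show the arithmetic.

```python
import math

# Gas constant in kcal mol^-1 K^-1. The choice of kcal units is an
# assumption of this sketch.
R_KCAL = 1.987204e-3


def ddg_d_n(mean_m, urea50_wt, urea50_mut):
    """Equation (1): change in stability on mutation.

    mean_m is the mean m-value <m> (kcal mol^-1 M^-1); urea50_wt and
    urea50_mut are the denaturation midpoints (M) of wild type and mutant.
    """
    return mean_m * (urea50_wt - urea50_mut)


def ddg_d_ts(kf_wt, kf_mut, temperature_k=298.15):
    """Equation (3): change in the D-to-transition-state free energy.

    Assumes kf_wt is k_f (wild type) and kf_mut is k_f' (mutant).
    The default temperature is an assumption, not taken from the text.
    """
    if kf_wt <= 0 or kf_mut <= 0:
        raise ValueError("folding rate constants must be positive")
    return R_KCAL * temperature_k * math.log(kf_wt / kf_mut)


def phi_value(ddg_ts, ddg_n, min_ddg=1e-9):
    """Equation (2): ratio of the two free-energy changes."""
    if abs(ddg_n) < min_ddg:
        raise ValueError("phi is undefined when the stability change is zero")
    return ddg_ts / ddg_n


# Hypothetical inputs, for illustration only.
example_ddg_n = ddg_d_n(mean_m=1.2, urea50_wt=5.0, urea50_mut=4.0)
example_ddg_ts = ddg_d_ts(kf_wt=10.0, kf_mut=5.0)
example_phi = phi_value(example_ddg_ts, example_ddg_n)
print(f"ddG(D-N) = {example_ddg_n:.2f} kcal/mol, phi = {example_phi:.2f}")
```

Because Φ is a ratio, its uncertainty grows rapidly as ΔΔG D–N becomes small. The sketch rejects only a change of exactly zero; a practical analysis would apply a larger cutoff, which is left as a parameter here.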