NMRanalyst for 3D NMR Spectral Analysis

The 3D shape of bio-molecules determines their function in organisms. Molecular structure determination by NMR requires the analysis of a series of multidimensional spectra. Varian, Inc. provides ProteinPack used below to facilitate the data acquisition. We have extended NMRanalyst for the automated analysis of 3D data. Given the number of spectra to be analyzed and the huge number of resonances per spectrum, this 3D analysis module provides an important advance for the routine use of 3D NMR.

Application I: 3D HNCO Spectrum of Human Ubiquitin

HNCO Schematic

A standard compound for testing 3D NMR is human ubiquitin, a 76-residue protein (8.5 kDa). The experimental 600 MHz spectra below of 0.2 mM 15N- and 13C-labelled ubiquitin (90% H2O/10% D2O, pH 5.8, 20 °C) are courtesy of George Gray, Varian, Inc. The HNCO triple resonance experiment is illustrated in the schematic above. The HNCO data were acquired in 1 h 13 min with ProteinPack's ghn_co pulse sequence (32 x 16 x 512 phase sensitive points, two transients per increment, zerofilled to 128 x 64 x 512 points).

The analysis of all eight phase components of this 128 MByte spectrum takes 2 hours on a 3.2 GHz laptop. NMRanalyst exhaustively searches all spectral positions and produces a complete numerical description (signal integral, resonance frequencies, linewidths, phases, and corresponding error values) of identified spectral resonances. This information can be summarized in a simulated spectrum as shown below. The following residual spectrum illustrates which experimental signals are not completely explained by the reported spin systems.
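The quoted spectrum size can be checked with a few lines of arithmetic. The sketch below assumes that each value is stored as a 4-byte floating point number, which the text does not state; the grid size and the number of phase components are taken from the description above.

```python
from itertools import product

# Zero-filled grid from the text: 128 x 64 x 512 points.
shape = (128, 64, 512)

# A phase sensitive 3D spectrum has a real (R) and an imaginary (I) part
# along each of its three axes, giving 2**3 = 8 phase components.
components = ["".join(c) for c in product("RI", repeat=len(shape))]

points_per_component = shape[0] * shape[1] * shape[2]
bytes_per_value = 4  # assumption: 32-bit floating point storage

total_bytes = points_per_component * len(components) * bytes_per_value
total_mbyte = total_bytes / 2**20

print(components)
print(points_per_component, total_mbyte)
```

Under this assumption the eight components together occupy exactly the 128 MByte mentioned above.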

Experimental HNCO Spectrum

Simulated HNCO Spectrum

Residual HNCO Spectrum
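The relation between the three spectra can be illustrated with a one-dimensional toy example. This is only a sketch of the general idea (residual = experimental minus simulated) and not NMRanalyst's 3D model: the Lorentzian line shape, the frequency grid, and all peak parameters are assumptions chosen for illustration.

```python
def lorentzian(f, f0, width, height):
    """Absorptive Lorentzian line of given full width at half height."""
    half = width / 2.0
    return height * half * half / ((f - f0) ** 2 + half * half)

# Hypothetical frequency axis: 0 to 99.5 Hz in 0.5 Hz steps.
freqs = [i * 0.5 for i in range(200)]

# "Experimental" data contain two resonances ...
experimental = [
    lorentzian(f, 30.0, 4.0, 1.0) + lorentzian(f, 70.0, 4.0, 0.3) for f in freqs
]
# ... but only the first one was identified and is therefore simulated.
simulated = [lorentzian(f, 30.0, 4.0, 1.0) for f in freqs]

# The residual shows what the reported resonances do not explain.
residual = [e - s for e, s in zip(experimental, simulated)]

peak_index = max(range(len(residual)), key=lambda i: abs(residual[i]))
unexplained_freq = freqs[peak_index]
print(unexplained_freq)
```

In this toy case the largest residual sits at the position of the resonance that was left out of the simulation, which is how the residual spectra in this document should be read.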

Application II: 3D HNCACB Spectrum of Human Ubiquitin

HNCACB Schematic

The HNCACB triple resonance experiment is illustrated in the schematic above. HNCACB spectra depend on the small 1J(N,Cα) (8-12 Hz) and 2J(N,Cα) (7 Hz) coupling constants, resulting in a low sensitivity for this experiment. The 200 µM human ubiquitin sample, measured in 5 h 12 min using ProteinPack's ghn_cacb pulse sequence (64 x 16 x 512 points, four transients per increment, 128 x 64 x 512 after zerofilling), therefore yields a challenging S/N for this data analysis. The ghn_cacb pulse sequence uses inversion during the Cα to Cβ magnetization transfer. In plots, Cα resonances are shown as yellow and Cβ resonances as blue isosurfaces. The spectral analysis takes about 4 hours and 45 minutes on a 3.2 GHz laptop, and the experimental, simulated, and residual spectra are shown below.

Experimental HNCACB Spectrum

Simulated HNCACB Spectrum

Residual HNCACB Spectrum

Graphical illustrations are convenient for evaluating the obtained analysis results, but most applications should be based on the numerical spectral description. The following table shows how the software summarizes the backbone shift information determined from the HNCO and HNCACB spectral analyses. All shift values are given in ppm. Neither the amino acid types nor the residue positions in the protein sequence are known at this stage, so the determined residues are sorted by increasing 15N shift and numbered sequentially as shown in column #n. An HNCO signal provides the amide proton shift (column H), the amide nitrogen shift (column N), and the carbonyl shift of the preceding amino acid (column CO-1). The HNCACB spectrum provides the corresponding Cα and Cβ connectivities for the given residue (columns Ca and Cb) and the corresponding weaker resonances of the sequentially preceding residue (columns Ca-1 and Cb-1).

Traditionally, strip plots are used to graphically determine the amino acid sequence in a protein. Based on the determined shifts given in the table, the strong and weak Cα and Cβ resonances of different residues can instead be matched numerically. The #n-1 column shows the resulting position number(s) of the possible preceding residue. Finally, the manually added REMARK column shows the best match of the shifts in each line to the published assignments for ubiquitin in Wang, A.C.; Grzesiek, S.; Tschudin, R.; Lodi, P.J.; Bax, A. J. Biomol. NMR 1995, 5, 376-382.


 #n     N     H     CO-1    Ca     Cb    Ca-1   Cb-1  #n-1                  REMARK
----------------------------------------------------------                  ----------
  1  102.28  8.13  177.36  45.46  16.68    -      -     -                   G47
  2  103.11  7.00  175.26  57.48  63.48  65.50    -     -                   S20(->P19)
  3  105.64  7.61  178.90  61.54  69.14  57.44    -    2 31 38 56           T9->L8
  4  108.52  8.79  175.32  59.80  72.36    -    32.77  30 35                T55->R54
  5  108.73  8.48  177.95  46.17    -    55.39    -    11 25 47 51          G35->E34
  6  108.80  7.87  176.30  59.80  71.23  55.90  40.95  34 53                T22->D21
  7  108.99  7.79  175.54  45.46    -      -      -     -                   G10
  8  110.94  8.49  176.94  45.34  30.62  56.75    -    24 34 42 44 60       G75->R74
  9  113.27  8.46  180.80  61.16    -    58.80    -    22 23 45             S57->L56
 10  113.41  8.51  178.29  55.93  39.90    -      -     -                   D39
 11  114.00  8.70  177.83  55.45  33.41  58.37    -    12 16 18 22          E34->K33
 12  114.31  9.30  175.77  58.19  26.01    -    32.63  30 35                E64->K63
 13  114.73  7.64  175.21  61.05  65.03    -      -     -                   S65
 14  114.74  8.26  176.05  59.76  42.19  55.23  30.69  47                   I3->Q2
 15  114.81  7.94  173.67  46.08    -      -      -     -                   G76
 16  115.20  7.38  177.33  58.24  34.12    -      -     -                   K33
 17  115.29  8.73  177.12  60.62  70.62    -      -     -                   T7
 18  115.55  7.22  177.46  58.32  40.20    -      -     -                   Y59
 19  115.76  8.12  174.66  54.27  37.43    -      -     -                   N60
 20  116.63  7.79  177.06  55.78  30.15    -      -     -                   Q40
 21  117.14  8.73  172.03  62.60  70.36  61.11  64.98  9 13                 T66->S65
 22  117.26  8.91  175.80  58.57  36.53  54.95  29.87   -                   V17(->E16)
 23  117.75  8.11  176.54  58.79  40.40    -    72.44  4                    L56->T55
 24  117.84  7.46  175.44  56.77  31.64    -      -     -                   Q41
 25  118.26  8.58  172.33  55.21  41.44    -      -     -                   F4
 26  118.66  7.22  174.27  62.53  37.00    -      -     -                   I61
 27  118.68  8.53  177.92  59.30  33.89  67.99    -    46                   K27->V26
 28  118.75  9.20  175.41  55.84  31.79  53.96  44.52  54 63                H68->L67
 29  118.93  8.61  174.03  52.84  30.96  58.45  36.55  22                   E18->V17
 30  119.15  7.43  174.79  54.43  32.80  45.27    -    1 7 8                R54(->G53)
 31  119.50  8.00  178.88  57.54  41.12  60.22    -    32 37 64             D32(->Q31)
 32  119.98  7.82  180.23  59.88  33.50    -      -     -                   K29
 33  120.11  6.11  173.94  57.96  40.60  46.08    -    5 15                 I36->G35
 34  120.13  8.15  175.49  56.42  40.82    -    31.92  24 28 50 58          D52->E51
 35  120.21  8.48  175.75  57.98  32.68    -      -     -                   K63
 36  120.40  8.64  175.76  62.56  69.80  56.33  33.43  42                   T12->K11
 37  120.97  9.27  175.19  60.57  34.39  55.20  41.39  25                   V5->F4
 38  121.09  9.12  176.94  57.62  41.99  60.63  70.61  17                   L8->T7
 39  121.13  8.26  180.34  66.25  36.89  59.93    -    4 6 14 32 64         I30->K29
 40  121.14  7.90  178.99  56.14  38.55    -      -     -                   N25
 41  121.45  8.74  175.17  62.16  69.63  60.28  40.92  64                   T14->I13
 42  121.67  7.23  173.98  56.45  33.50  45.50    -    1 7 8                K11->G10
 43  121.83  7.95  173.74  54.64  34.58  45.70    -    1 7 8                K48->G47
 44  121.84  8.44  177.45  56.72  30.71  54.97  42.51  57                   R74(->L73)
 45  121.87  9.05  175.26  59.00  41.30  53.10  45.83  55                   I44->L43
 46  121.89  8.08  178.32  67.74    -      -      -     -                   V26
 47  122.57  8.93  170.48  55.06  30.75    -    33.15  11 30 32 42          Q2(->M1)
 48  122.58  8.11  174.25  54.23  42.88  60.71  34.98  37 62                L71->V70
 49  122.71  8.63  174.62  56.03  29.19    -    34.59  37 43 62 65          Q49->K48
 50  122.87  8.38  176.65    -    32.12    -      -     -                   E51
 51  123.23  7.94  180.50  55.46  17.78    -      -     -                   A28
 52  123.55  8.58  177.88  54.16  31.45    -    42.91  48 57                R72->L71
 53  123.67  8.03  174.63  55.96  40.97    -      -     -                   D21
 54  123.76  8.26  173.57  54.00  44.31    -    31.78  24 28 50 52 58       L73->R72
 55  124.16  8.82  173.90  53.20  45.90    -    31.81  24 28 50 52 58       L43(->R42)
 56  124.27  7.91  178.33  57.49  40.37    -      -     -                   D58
 57  124.41  8.36  175.36  54.94  42.58    -    31.45  24 28 52 58          L69->H68
 58  124.69  7.60  174.53  53.68  31.74  62.55    -    21 26 36             Q62->I61
 59  124.85  8.71  173.75  52.92  46.99  62.14    -    41                   L15->T14
 60  124.91  8.81  175.86  56.81  43.77    -      -     -                   F45
 61  125.44  8.53  175.61  54.32  41.55    -      -     -                   L50
 62  126.45  9.16  175.43  60.68  34.96  54.00    -    19 48 52 54 58 61 63 V70(->L69)
 63  127.40  9.39  173.77  53.94  44.47  62.57    -    21 26 36             L67->T66
 64  127.47  9.51  174.38  60.19  40.85  62.39    -    21 26 36 41          I13->T12
 65  127.54  8.86  174.80  54.68  34.54  60.60    -    17 37 62             K6->V5
 66  132.73  8.94  174.58  52.68  16.60  56.84  43.83  60                   A46->F45
----------------------------------------------------------
 Residues not reported: M1, E16, P19, I23, E24, Q31, P37, P38, R42, G53
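Since further processing should start from the numerical description, a table row can be read into a simple record. The following is a minimal sketch that assumes the whitespace-separated layout shown above; the function name `parse_row` and the dictionary keys are hypothetical and not part of NMRanalyst.

```python
def parse_row(line):
    """Parse one table row into a dict; '-' entries become None."""
    tokens = line.split()
    names = ["N", "H", "CO-1", "Ca", "Cb", "Ca-1", "Cb-1"]
    row = {"n": int(tokens[0])}
    for name, tok in zip(names, tokens[1:8]):
        row[name] = None if tok == "-" else float(tok)
    rest = tokens[8:]
    # Integers are #n-1 candidates; a remaining non-numeric token is the REMARK.
    row["prev"] = [int(t) for t in rest if t.isdigit()]
    remark = [t for t in rest if not t.isdigit() and t != "-"]
    row["remark"] = remark[0] if remark else None
    return row

example = parse_row(
    "  3  105.64  7.61  178.90  61.54  69.14  57.44    -    2 31 38 56           T9->L8"
)
print(example)
```

Missing shifts are kept as `None` rather than 0.0 so that later numerical matching cannot mistake an absent resonance for a real one.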
			

This table does not contain 10 of the 76 ubiquitin residues. Proline resonances (P19, P37, P38) are not detectable in HNCO and HNCACB spectra due to the lack of an amide proton. The remaining unobserved ubiquitin resonances are not sufficiently separated from other resonances for identification. The spectral areas in which such problems occur can be identified from the residual spectra above. All reported backbone shift assignments are in agreement with published values, with one exception. Residues 1 and 8 in the table correspond to G47 and G75, and glycine residues have no Cβ resonance. The reported 16.68 and 30.62 ppm shifts belong to the preceding residue and should be listed in the Cb-1 column rather than in the Cb column. Such glycine-specific ambiguities can be resolved once the amino acid type is assigned.
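The glycine correction described above could be applied once residue types are known. The sketch below is an illustration only: the function `fix_glycine`, the one-letter residue code argument, and the dictionary layout are assumptions, not the software's actual interface.

```python
def fix_glycine(row, residue_type):
    """Move a Cb shift to Cb-1 when the residue is known to be glycine.

    Glycine has no C-beta, so a shift reported in the Cb column must stem
    from the preceding residue. Rows that already have a Cb-1 value are
    left unchanged, because the conflict would need manual inspection.
    """
    fixed = dict(row)
    if residue_type == "G" and fixed["Cb"] is not None and fixed["Cb-1"] is None:
        fixed["Cb-1"] = fixed["Cb"]
        fixed["Cb"] = None
    return fixed

# Row 1 of the table (G47), reduced to the carbon columns.
row1 = {"n": 1, "Ca": 45.46, "Cb": 16.68, "Ca-1": None, "Cb-1": None}
fixed = fix_glycine(row1, "G")
print(fixed)
```

After the correction, the 16.68 ppm value in Cb-1 lies close to the 16.60 ppm Cb shift of row 66 (A46), which is the residue preceding G47.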

Correct sequential assignments (column #n-1) are obtained for over half of the reported residues. Due to the low spectral S/N, only the Ca-1 or the Cb-1 shift is detected for most residues, resulting in several possible matches. The REMARK column indicates the cases in which the #n-1 column does not contain an appropriate match by listing the sequential residue assignment in parentheses. Most of these cases result from references to one of the ten ubiquitin residues that are not contained in the table, as described above. For a more complete sequential assignment, an HNCACB dataset with higher S/N should be used. In summary, the presented 3D analysis has moderate resource requirements and can be fully automated. The obtained analysis results are exceptionally complete given the analyzed 3D NMR datasets.
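The numerical matching behind the #n-1 column can be sketched as follows. The 0.35 ppm tolerance and the function name are assumptions chosen for illustration; the tolerance actually used by the software is not stated here. Only four rows of the table are included, so candidates outside this subset cannot appear.

```python
# Subset of the table: (n, Ca, Cb, Ca-1, Cb-1); None marks a missing shift.
ROWS = [
    (4, 59.80, 72.36, None, 32.77),
    (14, 59.76, 42.19, 55.23, 30.69),
    (23, 58.79, 40.40, None, 72.44),
    (47, 55.06, 30.75, None, 33.15),
]
TOL = 0.35  # ppm; assumed matching tolerance

def preceding_candidates(row, rows, tol=TOL):
    """Return positions whose Ca/Cb agree with this row's Ca-1/Cb-1."""
    n, _, _, ca_prev, cb_prev = row
    if ca_prev is None and cb_prev is None:
        return []  # nothing to match against
    hits = []
    for m, ca, cb, _, _ in rows:
        if m == n:
            continue
        # Every detected "-1" shift must agree; undetected ones are ignored.
        ca_ok = ca_prev is None or (ca is not None and abs(ca - ca_prev) <= tol)
        cb_ok = cb_prev is None or (cb is not None and abs(cb - cb_prev) <= tol)
        if ca_ok and cb_ok:
            hits.append(m)
    return hits

for row in ROWS:
    print(row[0], preceding_candidates(row, ROWS))
```

For rows 14 and 23 this reproduces the single candidates 47 and 4 listed in the table. When only one of the two "-1" shifts is available, the condition is weaker and several candidates typically remain, as discussed above.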

Application III: 3D HCCH-TOCSY Spectrum of Human Ubiquitin

The HCCH-TOCSY spectrum of the human ubiquitin sample described above was acquired in 9 hours and 42 minutes using ProteinPack's ghcch_tocsy pulse sequence (64 x 32 x 512 phase sensitive points, four transients per increment). The data were extended by linear prediction to 128 x 64 x 512 points, and no line broadening was used.
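The quoted acquisition times of the three experiments can be compared with a rough calculation. The sketch assumes that each pair of indirect increments is recorded as real and imaginary parts in both indirect dimensions (four FIDs per pair) and that overhead between transients is negligible; neither assumption is stated in the text, so the result is only a plausibility check.

```python
# (t1 increments, t2 increments, transients per increment, total seconds)
experiments = {
    "HNCO": (32, 16, 2, 1 * 3600 + 13 * 60),
    "HNCACB": (64, 16, 4, 5 * 3600 + 12 * 60),
    "HCCH-TOCSY": (64, 32, 4, 9 * 3600 + 42 * 60),
}
FIDS_PER_INCREMENT_PAIR = 4  # assumption: real + imaginary in both indirect dimensions

seconds_per_transient = {}
for name, (n1, n2, nt, seconds) in experiments.items():
    transients = n1 * n2 * FIDS_PER_INCREMENT_PAIR * nt
    seconds_per_transient[name] = seconds / transients
    print(name, transients, round(seconds_per_transient[name], 2))
```

Under these assumptions all three experiments come out at slightly more than one second per transient, so the three quoted durations are mutually consistent.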

Experimental HCCH-TOCSY Spectrum

Simulated HCCH-TOCSY Spectrum

Residual HCCH-TOCSY Spectrum

The identification of amino acids based on the known Cα and Cβ resonances and the observed HCCH-TOCSY peak patterns, and the derivation of their side chain carbon and proton assignments, have not yet been completed.

Conclusions

Application Results: NMRanalyst yields excellent spectral descriptions as shown by the HNCO and HNCACB residual spectra. Most residuals could be eliminated by using a spectrum acquired with higher resolution and by using more sample or longer acquisition times (further phase cycling). The generated table of determined backbone chemical shifts provides the HNCO and HNCACB analysis results in a format needed for the further protein structure determination.

Super Peak Picker: Acquisition time on a high-field NMR instrument is expensive. To maximize the information extracted from the acquired NMR spectra, simplistic peak picking should be replaced by the more sensitive, selective, and reliable spin system modeling used by NMRanalyst. NMRanalyst's 3D module can analyze all types of phase sensitive 3D NMR spectra. Functioning as a "super peak picker", it provides clean input for post-processing software or for further human interpretation.

3D Phasing: Since NMRanalyst analyzes all spectral phase components, there is no need to optimize the data acquisition for minimal phase distortions or to attempt manual phase correction. NMRanalyst determines better phase functions than obtainable visually (U.S. Patents 5,218,299 & 5,572,125). The determination of phase functions is optional and if available they are used to increase the analysis speed.

Software Availability: The NMRanalyst 3D analysis is available for Red Hat Linux and Microsoft Windows. The Varian and Bruker (AMX and DMX) FID and spectrum formats are implemented, but the Bruker formats have not been tested. If you have access to 3D Varian or Bruker data and are interested in beta-testing this software, please contact dunkel@ScienceSoft.net.

Summary: The identification of spin systems (resonances) in real-world noisy 3D NMR data can be automated! A visual data analysis clearly excels compared to simplistic peak picking. But analyze the 3D data with NMRanalyst and take a look at the residual spectrum. All spectral signals not shown in the residual spectrum were properly characterized and reported. The structure determination of proteins by NMR involves further interpretation steps. But for the data acquisition and identification and characterization of spectral signals, we believe we have found an adequate solution.


© 1999-2015 ScienceSoft, LLC. All Rights Reserved.