NMR Analysis, Processing and Prediction: NMR

Thursday, 10 February 2011

Alignment of NMR spectra – Part VI: Reaction Monitoring (II)

Peak crossover is very common in Reaction Monitoring (RM) experiments. When it happens, the automatic alignment algorithm discussed in previous posts (here and here) might not work properly. To illustrate this issue, as I did not have a real experiment at hand, I used Mnova to simulate a very simple data set comprising a triplet and a singlet, in which the chemical shift of the triplet moves from 1.4 ppm to 0.7 ppm and its intensity decays exponentially from spectrum to spectrum. This is depicted in the figure below, both as a stacked plot and as a bitmap plot.
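For readers who want to play with a similar data set outside Mnova, here is a minimal Python sketch of such a simulation. It is not the Mnova simulation itself: the Lorentzian line shape, the number of spectra (15), the singlet position (1.15 ppm), the line width, the coupling and the decay rate are all assumptions chosen for illustration. Only the 1.4 ppm to 0.7 ppm drift and the exponential decay of the triplet come from the description above.

```python
import math

def lorentzian(x, x0, width):
    """Unit-height Lorentzian centred at x0 with full width at half height `width` (ppm)."""
    half = width / 2.0
    return half * half / ((x - x0) ** 2 + half * half)

def simulate_series(n_spectra=15, n_points=1024, ppm_max=1.6, ppm_min=0.4,
                    start=1.4, end=0.7, singlet=1.15, j_ppm=0.014,
                    width=0.004, decay=0.25):
    """Return (ppm_axis, spectra): a drifting, decaying 1:2:1 triplet plus a fixed singlet.

    All default values except `start` and `end` are hypothetical.
    """
    step = (ppm_max - ppm_min) / (n_points - 1)
    ppm = [ppm_max - i * step for i in range(n_points)]  # high ppm first, as in NMR plots
    spectra = []
    for k in range(n_spectra):
        centre = start + (end - start) * k / (n_spectra - 1)  # linear drift of the triplet
        amp = math.exp(-decay * k)                             # exponential decay
        row = []
        for x in ppm:
            triplet = amp * (lorentzian(x, centre + j_ppm, width)
                             + 2.0 * lorentzian(x, centre, width)
                             + lorentzian(x, centre - j_ppm, width))
            row.append(triplet + lorentzian(x, singlet, width))
        spectra.append(row)
    return ppm, spectra

ppm, spectra = simulate_series()
```

With these assumed defaults the triplet passes over the singlet in the sixth spectrum (index 5), which gives a crossover of the kind discussed in this post.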

Now let’s say you are interested in extracting the intensities of the triplet as the reaction progresses. There is actually no need to pre-align the spectra algorithmically; it is much simpler to have some kind of graphical tool to instruct the software which peaks (or multiplets) need to be used for the reaction monitoring analysis. Let me show you how this works in Mnova:

First of all, in the Data Analysis module you select the region to be analyzed. As a starting point, the region will have a rectangular shape (green rectangle in the figure below):

The graph shows an exponential decay, but the values themselves are clearly wrong: the integration region bounded by the green rectangle includes both the triplet and the singlet, whereas we are interested in the triplet resonances only. Now let’s change this…

The selection rectangle has a number of handles (small green boxes) that you can drag freely so that the selection region follows the triplet. The number of handles is adjustable: in this case there are 6, but higher numbers are also permitted. The figure below shows the result of adjusting the handles to follow the triplet:
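The idea of a region that follows a moving multiplet can be sketched in a few lines of Python. This is only an illustration of the concept, not Mnova's implementation: it assumes that the window centre is interpolated linearly between handles, that the window has a constant half-width, and that integrals are plain rectangle-rule sums. The toy data, the three handle positions and all function names are hypothetical.

```python
import math

def lorentzian(x, x0, width=0.004):
    """Unit-height Lorentzian with full width at half height `width` (ppm)."""
    half = width / 2.0
    return half * half / ((x - x0) ** 2 + half * half)

def centre_from_handles(handles, index):
    """Piecewise-linear window centre (ppm) for a spectrum index.

    `handles` is a list of (spectrum_index, centre_ppm) pairs sorted by index.
    """
    for (i0, c0), (i1, c1) in zip(handles, handles[1:]):
        if i0 <= index <= i1:
            return c0 + (c1 - c0) * (index - i0) / (i1 - i0)
    raise ValueError("spectrum index is outside the range covered by the handles")

def integrate(ppm, row, centre, half_width):
    """Rectangle-rule integral of `row` over centre +/- half_width (ppm)."""
    step = abs(ppm[1] - ppm[0])
    return step * sum(y for x, y in zip(ppm, row) if abs(x - centre) <= half_width)

# Toy data (assumed values): triplet drifting 1.4 -> 0.7 ppm, singlet fixed at 1.15 ppm.
ppm = [1.6 - 0.001 * i for i in range(1201)]
spectra = []
for k in range(15):
    centre, amp = 1.4 - 0.05 * k, math.exp(-0.25 * k)
    spectra.append([amp * (lorentzian(x, centre + 0.014) + 2 * lorentzian(x, centre)
                           + lorentzian(x, centre - 0.014)) + lorentzian(x, 1.15)
                    for x in ppm])

handles = [(0, 1.40), (7, 1.05), (14, 0.70)]        # hypothetical handle positions
fixed = [integrate(ppm, row, 1.05, 0.45) for row in spectra]        # plain rectangle
tracked = [integrate(ppm, row, centre_from_handles(handles, k), 0.03)
           for k, row in enumerate(spectra)]                        # region follows the triplet
```

In this toy example every value in `fixed` is inflated by the singlet, whereas `tracked` decays smoothly except at index 5, where the triplet sits on top of the singlet.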

Now there is an outlier in the exponential curve, caused by the singlet overlapping with the triplet in spectrum number 6 (data point #5, since the numbering in the graph starts from zero). The figure below shows that particular spectrum, with the singlet overlapping the triplet:

At this stage, there are several approaches. The simplest one is to just discard that point for the analysis, for example, by right clicking on that point in the graph and disabling it:

As soon as that point is deleted / disabled, Mnova will update the graph automatically. This is the new result:
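To see why disabling a single point matters, here is a small Python sketch that fits an exponential decay with and without an outlier. How Mnova fits the curve internally is not described in this post, so the log-linear least-squares fit below is an assumption, and the integral values are made up for illustration.

```python
import math

def fit_exponential(values, disabled=()):
    """Fit y = a * exp(-k * t) by least squares on ln(y), skipping disabled indices.

    Returns (a, k). The index of each value is used as the time coordinate t.
    """
    pts = [(t, math.log(y)) for t, y in enumerate(values)
           if t not in disabled and y > 0]
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_ly = sum(ly for _, ly in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (ly - mean_ly) for t, ly in pts)
    slope = sxy / sxx
    intercept = mean_ly - slope * mean_t
    return math.exp(intercept), -slope

# Hypothetical integrals: a clean decay with rate 0.25, plus an overlap artefact at index 5.
integrals = [math.exp(-0.25 * t) for t in range(10)]
integrals[5] += 0.5

a_all, k_all = fit_exponential(integrals)                  # outlier included
a_clean, k_clean = fit_exponential(integrals, disabled={5})  # outlier disabled
```

With the outlier disabled the fit recovers the rate used to generate the data, while the fit over all points is biased by the single contaminated integral.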

Another approach would involve using GSD to eliminate the singlet from the triplet so that it would not be necessary to discard the information from that particular spectrum. However, this is something I will blog about in a future post.

Tuesday, 8 February 2011

Alignment of NMR spectra – Part V: Reaction Monitoring (I)

Previous posts in this series:
1. Alignment of NMR spectra – Part I: The problem
2. Alignment of NMR spectra – Part II: Binning / Bucketing
3. Alignment of NMR spectra – Part III: Global Alignment
4. Alignment of NMR spectra – Part IV: Advanced Alignment

Following the progression of chemical reactions by NMR is becoming more and more popular. Quoting Michael A. Bernstein et al. (Magn. Reson. Chem. 2007; 45: 564–571):

(…)The technique is rich in structural information, and can uniquely provide subtle information on speciation, protonation sites, and intermediate compound production. NMR measurements can be made under quantitative conditions, and one can be confident that all organic species will be observed. These factors combine to make NMR a very attractive tool for these analyses, and address many of the shortcomings in traditional spectroscopic measurements (…)

As a reaction proceeds, it is very common for a given resonance to shift significantly, for example because of changes in pH or protonation of the starting material. These changes in chemical shift can be so large that extracting relevant information from the spectra (e.g. intensities/integrals across the data set) becomes difficult, so aligning the spectra can be helpful. Let me illustrate this with an example exhibiting clear nonlinear misalignments: the peaks at ca. 11.6 ppm do not move, whilst the peaks at higher field move very significantly:

Instead of displaying the data set as a stacked plot as above, it might be more convenient to display it as an intensity or bitmap plot, because this plotting mode shows the alignment/misalignment profiles more clearly:

It’s evident that correcting the data using a single reference peak (or a global shift) is not sufficient. In order to align this data set, we can follow two different strategies:

Strategy 1:

Starting with the raw spectra (1), it is possible to perform a full-spectrum correction (global alignment) before the individual intervals are aligned:

After applying the global alignment, most of the peaks in (2) are properly aligned. The exception is the peaks on the left, which were aligned before this operation and are now misaligned. This problem is dealt with in the next step.
Once the spectra have been aligned ‘globally’, the user just needs to select the interval containing the peaks that are still to be aligned, as depicted in (1). (2) shows the final result once both the global and the local alignment have been applied:

Strategy 2:

A different, although analogous, strategy consists of aligning two different spectral intervals separately, without resorting to a global alignment, as shown in (1) below. Note that the peaks in the interval on the left are already well aligned, so selecting this region is optional; if there were some minor misalignment, the algorithm would correct it.

(2) shows the final result after the two intervals have been aligned. It is completely equivalent to the result obtained with Strategy 1.
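The interval-by-interval idea behind Strategy 2 can be sketched in Python as follows. This is not the algorithm used by Mnova, which is not described in this post. The sketch assumes the simplest possible approach: for each interval, try every integer point shift within a limit and keep the one that maximises the overlap (cross-correlation) with a reference spectrum. The Gaussian test peaks, the interval limits and the function names are hypothetical.

```python
import math

def best_shift(reference, target, lo, hi, max_shift):
    """Integer shift s (in points) that best matches target to reference inside [lo, hi)."""
    best, best_score = 0, float("-inf")
    for s in range(-max_shift, max_shift + 1):
        # Overlap between the reference and the target moved by s points.
        score = sum(reference[i] * target[i - s]
                    for i in range(lo, hi) if lo <= i - s < hi)
        if score > best_score:
            best, best_score = s, score
    return best

def shift_interval(target, lo, hi, s):
    """Copy of target with the points in [lo, hi) moved by s points (zero-filled at the edges)."""
    out = list(target)
    for i in range(lo, hi):
        out[i] = target[i - s] if lo <= i - s < hi else 0.0
    return out

def peak(n, centre, sigma=2.0):
    """Gaussian test peak on an axis of n points."""
    return [math.exp(-((i - centre) / sigma) ** 2) for i in range(n)]

# The peak at point 20 does not move; the second peak moves from point 70 to point 78.
reference = [a + b for a, b in zip(peak(100, 20), peak(100, 70))]
target = [a + b for a, b in zip(peak(100, 20), peak(100, 78))]

aligned = target
for lo, hi in [(0, 50), (50, 100)]:      # two intervals, aligned independently
    s = best_shift(reference, aligned, lo, hi, max_shift=15)
    aligned = shift_interval(aligned, lo, hi, s)
```

Here the left interval is left untouched (shift 0) and the right interval is moved by 8 points, which a single global shift could not achieve without misaligning the stationary peak.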

Conclusion
In this post I showed how the automatic alignment algorithm can be used to align RM data sets prior to any further analysis. However, there is a better way to extract NMR descriptors from Reaction Monitoring experiments that does not require any prior alignment. In fact, I believe that this method, which I will present in my next post, has several advantages in the context of reaction monitoring, especially in those cases where the chemical shift ordering of some peaks changes during the reaction, a situation that automatic alignment algorithms usually have great trouble dealing with. An example of a reaction monitoring data set showing peaks crossing over is shown below, in bitmap mode (it’s a simulated data set):

Therefore, in my next post I will show how to analyze RM data sets with large peak fluctuations and crossovers.

Sunday, 30 January 2011

Alignment of NMR spectra – Part II: Binning / Bucketing

In my last post, I wrote that spectra of biological samples are usually poorly aligned due to wide changes in chemical shift arising from small variations in pH or other sample conditions such as ionic strength or temperature.

The most widely used method of addressing this chemical shift variability across spectra is the so-called binning (or bucketing) procedure, which consists of segmenting a spectrum into small regions (bins / buckets) and taking the area under the spectrum for each segment. Ideally, the bins should be large enough that a given peak remains in its bin despite small shifts from spectrum to spectrum, but not so large that a single bin includes peaks belonging to several compounds.
As a simple example to illustrate how binning works, let’s consider the spectrum of Taurine (Fig. 1).

Fig. 1: 1H-NMR spectrum of Taurine synthesized with Mnova NMRPredict. Only the spectral region corresponding to the methylene protons is shown.

Taking the spectrum shown in Fig. 1 which has been predicted using Mnova NMRPredict, seven additional spectra were created by changing the chemical shift of the CH2 protons randomly in an effort to simulate the chemical shift variability observed in real life biofluid NMR spectra.

Fig. 2: Synthesized data set comprising 8 simulated spectra of Taurine with random chemical shifts for the CH2 protons, displayed in superimposed mode in Mnova.

These spectra have been synthesized using 32768 data points and a spectral width of 6001.6 Hz with a spectrometer frequency of 500.13 MHz. If the size of each bin is set to 0.02 ppm (represented by the vertical grid lines in Fig. 2), this will result in the generation of 6001.6 / (0.02 × 500.13) ≈ 600 bins.
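The arithmetic can be checked with a few lines of Python. The function name `bin_layout` is just a label for this sketch; the only fact it relies on is that 1 ppm corresponds to the spectrometer frequency in MHz, expressed in Hz.

```python
def bin_layout(spectral_width_hz, spectrometer_mhz, bin_width_ppm, n_points):
    """Return (number of bins, bin width in Hz, average number of points per bin)."""
    bin_width_hz = bin_width_ppm * spectrometer_mhz  # 1 ppm = spectrometer_mhz Hz
    n_bins = round(spectral_width_hz / bin_width_hz)
    points_per_bin = n_points / n_bins
    return n_bins, bin_width_hz, points_per_bin

n_bins, bin_width_hz, points_per_bin = bin_layout(6001.6, 500.13, 0.02, 32768)
```

For the values quoted above this gives 600 bins, each about 10 Hz wide and containing roughly 54.6 of the original data points on average.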

When the binning command is issued in Mnova, a new spectrum with 600 data points is produced, in which every point is the sum of all the points within the corresponding bin. The result of this binning or bucketing operation applied to one single spectrum of the synthetic Taurine data set is depicted in Fig. 3, where the circles correspond to the area of each bucket in the original spectrum. Fig. 4 shows the result applied to all spectra in superimposed mode. The digital resolution of the resulting binned spectrum is 10 Hz/point.
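A bare-bones version of uniform binning can be written in Python as shown below. It is a sketch of the general technique, not Mnova's binning command: the function `uniform_binning`, the axis limits (10 ppm at the left edge), the Lorentzian line shape and the two peak positions are assumptions made for the example.

```python
import math

def uniform_binning(ppm, intensities, bin_width_ppm):
    """Sum intensities into consecutive bins of `bin_width_ppm`, starting at the highest ppm.

    Returns (bin_centres_ppm, bin_sums).
    """
    top = max(ppm)
    n_bins = max(1, round((top - min(ppm)) / bin_width_ppm))
    sums = [0.0] * n_bins
    for x, y in zip(ppm, intensities):
        k = min(int((top - x) / bin_width_ppm), n_bins - 1)  # clamp the last few points
        sums[k] += y
    centres = [top - (k + 0.5) * bin_width_ppm for k in range(n_bins)]
    return centres, sums

def lorentzian(x, x0, width=0.002):
    """Unit-height Lorentzian with full width at half height `width` (ppm)."""
    half = width / 2.0
    return half * half / ((x - x0) ** 2 + half * half)

# Axis with the acquisition parameters quoted in the post; the 10 ppm left edge is assumed.
sw_ppm = 6001.6 / 500.13
n_points = 32768
ppm = [10.0 - i * sw_ppm / (n_points - 1) for i in range(n_points)]

# Two single-line test spectra whose peak differs by 0.004 ppm (2 Hz).
spectrum_a = [lorentzian(x, 3.410) for x in ppm]
spectrum_b = [lorentzian(x, 3.414) for x in ppm]

centres, sums_a = uniform_binning(ppm, spectrum_a, 0.02)
_, sums_b = uniform_binning(ppm, spectrum_b, 0.02)
```

Both test peaks end up in the same 0.02 ppm bin, so after binning the two spectra have their maximum at the same point even though the original peaks were not aligned.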

Fig. 3: Methylene region of one synthetic 1H-NMR spectrum of Taurine after data reduction by uniform binning

Fig. 4: Result of applying data reduction by uniform binning to the 8 1H-NMR spectra of Taurine

Once the spectra have been binned, they are ready to be exported in a convenient format (e.g. ASCII) for further statistical analysis (e.g. PCA).
Binning greatly reduces the effect of variations in peak positions (in this case, all peaks become perfectly aligned). Additionally, binning reduces the data size for multivariate statistical analyses, although today’s computers and optimized linear algebra algorithms are able to handle large data volumes very efficiently.

The major drawback of this procedure is the loss of a considerable amount of the information contained in the original spectra. In this particular case, the fine structure of the two triplets is totally lost (the coupling constant is 6.6 Hz whilst the digital resolution is 10 Hz), which precludes the direct interpretation of multivariate models. In addition, peaks that move across the border between two bins can cause artifacts. Information is also lost when, for example, peaks belonging to several compounds fall within a single bin.
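The border effect is easy to quantify for an idealised line. The sketch below assumes a Lorentzian of unit area with a width of 0.002 ppm (both assumptions, as are the bin limits) and uses its analytical integral to compute how much of the peak falls inside a given bin.

```python
import math

def area_in_bin(centre, bin_lo, bin_hi, width=0.002):
    """Fraction of a unit-area Lorentzian (full width `width`, ppm) inside [bin_lo, bin_hi]."""
    half = width / 2.0
    return (math.atan((bin_hi - centre) / half)
            - math.atan((bin_lo - centre) / half)) / math.pi

# Peak in the middle of the 3.40-3.42 ppm bin.
inside = area_in_bin(3.410, 3.40, 3.42)

# Same peak moved by 0.01 ppm so that it sits exactly on the border at 3.42 ppm.
border_low = area_in_bin(3.420, 3.40, 3.42)    # share in the 3.40-3.42 ppm bin
border_high = area_in_bin(3.420, 3.42, 3.44)   # share in the 3.42-3.44 ppm bin
```

In this idealised case a shift of only 0.01 ppm makes the bin value drop from more than 90 % of the peak area to just under half, with the other half appearing in the neighbouring bin.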

There exist several better alternatives to binning, typically involving some form of peak alignment without data reduction. But this will be the subject of my next post …