Thursday, 3 February 2011

Alignment of NMR spectra – Part III: Global Alignment

Previous posts in this series:
  1. Alignment of NMR spectra – Part I: The problem
  2. Alignment of NMR spectra – Part II: Binning / Bucketing

We have seen that binning helps to minimize, for example, the effect of pH-induced fluctuations in chemical shift, so that in NMR-based metabonomics studies the signals of a given metabolite appear at the same location in all spectra. One evident disadvantage of binning is that it greatly reduces the spectral resolution: on a 500 MHz instrument, for example, a typical 64K-point NMR spectrum with SW = 12 ppm would be reduced to 300 points (bins) if a bin width of 0.04 ppm (20 Hz, ~218 points) is used.
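The numbers in this example can be checked with a few lines of Python. This is only a worked calculation using the values quoted above; the variable names are mine.

```python
# Worked arithmetic for the binning example (values taken from the text).
spectrometer_mhz = 500.0   # 1H observe frequency in MHz, so 1 ppm = 500 Hz
n_points = 64 * 1024       # 64K data points in the spectrum
sw_ppm = 12.0              # spectral width in ppm
bin_width_ppm = 0.04       # chosen bin width in ppm

n_bins = round(sw_ppm / bin_width_ppm)           # number of bins after binning
bin_width_hz = bin_width_ppm * spectrometer_mhz  # bin width expressed in Hz
points_per_bin = n_points / n_bins               # original points merged per bin

print(n_bins, bin_width_hz, round(points_per_bin))
```

In other words, each bin replaces a little over two hundred original data points with a single value.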
This loss of resolution is undesirable and, considering that today’s powerful computers can handle large data matrices, there is an increasing tendency to perform multivariate analysis at the maximum spectral resolution possible. Alternatives to binning typically involve some form of peak alignment procedure, and in this post I will cover the simplest one, global alignment. The purpose of this post is simply to illustrate the concept of alignment; it is important to note that this method is not generally applicable to the misalignment problems found in metabonomics NMR data sets, although it might be useful in many other contexts.

The idea of global alignment is very simple and corresponds to the well-known chemical shift referencing method in which the user sets the internal reference peak (e.g. TMS, DSS, TSP, etc.) of each spectrum to, for example, 0 ppm. In order to cope with small fluctuations in chemical shifts, this method searches for the highest peak within a narrow interval (user-defined, with an auto-tuning option in Mnova), as depicted in the figure below:
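As a rough illustration of this idea, here is a minimal NumPy sketch. It is not Mnova's implementation: the function name, the `half_width_ppm` parameter and the toy Lorentzian line are my own assumptions, and the shift is done with a circular `np.roll`, which is only reasonable when the shift is small and the spectrum edges contain baseline.

```python
import numpy as np

def align_to_reference_peak(ppm, spectrum, target_ppm=0.0, half_width_ppm=0.05):
    """Shift `spectrum` so the tallest point near `target_ppm` lands on it.

    Hypothetical helper: returns the shifted spectrum and the shift in points.
    """
    ppm = np.asarray(ppm, dtype=float)
    spectrum = np.asarray(spectrum, dtype=float)
    # Points inside the narrow search interval around the reference position.
    window = np.flatnonzero(np.abs(ppm - target_ppm) <= half_width_ppm)
    if window.size == 0:
        raise ValueError("search interval contains no data points")
    # Highest point inside the interval is taken as the reference peak.
    peak_idx = int(window[np.argmax(spectrum[window])])
    # Index where the reference peak should end up.
    target_idx = int(np.argmin(np.abs(ppm - target_ppm)))
    shift = target_idx - peak_idx
    # Circular shift: points pushed off one edge reappear at the other.
    return np.roll(spectrum, shift), shift

# Toy data: one Lorentzian line sitting at 0.02 ppm instead of 0 ppm.
ppm = np.linspace(1.0, -1.0, 2001)  # NMR axes usually run high to low
spec = 1.0 / (1.0 + ((ppm - 0.02) / 0.002) ** 2)
aligned, shift = align_to_reference_peak(ppm, spec)
print(shift, ppm[np.argmax(aligned)])
```

The same shift is applied to every point of the spectrum, which is exactly why the method is called global.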



Clearly, this method will not work properly for data sets with local misalignments, that is, when the signals of one metabolite move in one direction whilst the peaks of a different metabolite move differently. As an example, let’s consider again the simulated data set of taurine used in my previous post, which I copy below for convenience:



Remember that this data set was generated by randomly changing the chemical shifts of the two CH2 groups. Now, let’s apply the global alignment procedure using a chemical shift reference value of 3.25 ppm, as shown in the picture below:



As expected, all peaks corresponding to the triplet at 3.25 ppm become perfectly aligned, but the other multiplet remains misaligned (see below).



One could devise an extension to this global alignment procedure in which the same procedure is applied to different segments of the spectrum. In this particular case, one could select two different windows, one for each triplet, and apply the same algorithm locally to each segment. However, having to manually select the chemical shift reference for each segment is not very practical and, in addition, relying only on a simple search for the maximum peak within each segment is not a very robust method for automatic alignment. In my next post, I will present a much more powerful automatic alignment method in which the user will not need to define the reference chemical shift value for each segment / window. Before that, and as an introduction to that post, let me show you another global automatic alignment method.
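Before moving on, the segment-wise extension just described can be sketched as follows. This is only an illustration under my own assumptions: the function name, the `(low, high, target)` segment format and the toy data (two single lines rather than real triplets, with a made-up second reference at 3.42 ppm) are hypothetical, and each segment is shifted circularly within its own window.

```python
import numpy as np

def align_segments(ppm, spectrum, segments):
    """Apply the 'highest peak' alignment separately inside each segment.

    `segments` is a list of (low_ppm, high_ppm, target_ppm) tuples chosen by
    the user; the target must lie inside its segment. Hypothetical helper.
    """
    ppm = np.asarray(ppm, dtype=float)
    spec = np.asarray(spectrum, dtype=float)
    out = spec.copy()
    shifts = []
    for low, high, target in segments:
        # Contiguous block of points belonging to this segment.
        idx = np.flatnonzero((ppm >= low) & (ppm <= high))
        peak_idx = int(idx[np.argmax(spec[idx])])
        target_idx = int(np.argmin(np.abs(ppm - target)))
        shift = target_idx - peak_idx
        # Shift only the points of this segment (circular within the segment).
        out[idx] = np.roll(spec[idx], shift)
        shifts.append(shift)
    return out, shifts

# Toy data: two lines displaced in opposite directions.
ppm = np.linspace(3.6, 3.1, 501)

def line(center):
    return 1.0 / (1.0 + ((ppm - center) / 0.002) ** 2)

spec = line(3.26) + line(3.41)
segments = [(3.20, 3.30, 3.25), (3.37, 3.47, 3.42)]
aligned, shifts = align_segments(ppm, spec, segments)
print(shifts)
```

Each window gets its own shift, but note that the user still has to supply a window and a reference value for every segment, which is the practical limitation mentioned above.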
Let’s assume that we have several spectra which we want to align automatically in such a way that we first manually reference the chemical shift of one of these spectra (e.g. the first one in the series) and then ask the software (e.g. Mnova) to automatically align all the other spectra using this one as a reference spectrum. The idea behind such an algorithm is to find the optimal amount by which a spectrum has to be shifted (left or right) so that the difference between it and the reference spectrum is minimal.



This alternative ‘global method’ was implemented in Mnova several years ago and is based on the maximization of the cross-correlation between the reference spectrum and the spectrum/spectra to be aligned. This procedure is the essential foundation for the advanced alignment method which I will present in my next post.
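To make the cross-correlation idea concrete, here is a minimal NumPy sketch. It is not Mnova's code: the function name, the `max_shift` parameter and the toy spectra are assumptions of mine. It computes the cross-correlation for all lags with `np.correlate` and picks the lag at which it is largest.

```python
import numpy as np

def xcorr_shift(reference, spectrum, max_shift=None):
    """Return the shift (in points) that best aligns `spectrum` to `reference`.

    Hypothetical helper: a positive result means the spectrum must be moved
    towards higher indices, e.g. with np.roll(spectrum, shift).
    """
    ref = np.asarray(reference, dtype=float)
    spec = np.asarray(spectrum, dtype=float)
    # Remove the mean so a constant offset does not dominate the correlation.
    ref = ref - ref.mean()
    spec = spec - spec.mean()
    # cc[k] = sum_n ref[n + k] * spec[n] for every possible lag k.
    cc = np.correlate(ref, spec, mode="full")
    lags = np.arange(-(len(spec) - 1), len(ref))
    if max_shift is not None:
        # Only consider physically sensible (small) shifts.
        keep = np.abs(lags) <= max_shift
        cc, lags = cc[keep], lags[keep]
    return int(lags[np.argmax(cc)])

# Toy data: the spectrum is the reference displaced by 15 points.
x = np.arange(2000)
reference = (1.0 / (1.0 + ((x - 1000) / 3.0) ** 2)
             + 0.5 / (1.0 + ((x - 1200) / 3.0) ** 2))
spectrum = np.roll(reference, -15)

shift = xcorr_shift(reference, spectrum, max_shift=50)
aligned = np.roll(spectrum, shift)
print(shift)
```

Unlike the reference-peak approach, nothing here requires the user to pick a peak or a window: the whole spectral profile decides the shift. The direct `np.correlate` call scales poorly with spectrum length, so for full-resolution spectra an FFT-based correlation would probably be the better choice.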
