NMR Analysis, Processing and Prediction: 2010

Saturday, 23 October 2010

Conformational analysis of cyclic compounds using Mspin and RDCs

On the occasion of the release of a new version of Mspin (BTW, this is the very first multiplatform version of Mspin: it now works on Windows, Mac OS X and Linux), I would like to bring to your attention one of the many applications where this software plays an instrumental role: the application of Mspin to the study of seven-membered ring compounds by NMR.

The NMR study of seven-membered ring compounds is a classical problem in conformational analysis. They are commonly studied by means of NOE-based experiments or 3J coupling analysis using Karplus-Altona relationships. In a work recently published in Chemical Communications [Chem. Commun., 2010, 46, 5879–5881], the groups of Roberto Gil (Carnegie Mellon University) and Navarro-Vázquez (Universidade de Vigo) have demonstrated that the conformation of a 3-benzazepine compound can be completely determined by using one-bond C-H (1DCH) residual dipolar couplings (RDCs). These RDCs were easily obtained by performing HSQC experiments coupled in the direct dimension, using a polydimethylsiloxane gel as the orienting medium.

A conformational search indicated the presence of 11 possible conformations for this molecule. In agreement with the computed DFT energies for these conformers, as well as the observed 3J couplings, chemical shifts, and NOEs, the RDC analysis shows the preference of the system for a crown-chair conformation with equatorial disposition of the substituents.

Here you can download the RDC data file (click here) in the MSpin format and a multiconformer XYZ file with the DFT-optimized conformations (click here). Load them into the RDC module of Mspin, select the singular value decomposition (SVD) method, click the Calculate button and see how convenient it is to perform the RDC analysis with Mspin.

Friday, 30 July 2010

New Fast NMR technique

Instrument time is precious, and a plethora of different fast NMR experiments are continuously being proposed in order to reduce the time required to record an NMR spectrum. Actually, money is not the only reason; there are many other factors which motivate the development of techniques to increase the speed of data collection. For example, if one wants to make real-time studies of kinetic processes or protein folding, it’s pivotal to speed up the acquisition of NMR data, in particular multidimensional spectra.
We have just contributed our bit to this field and published an article which describes the use of localized spectroscopy for parallel multidimensional NMR data acquisition. The key idea is to interleave the data acquisition at a variety of localized bands within a given interscan repetition time.

In other words, the method is based on MRI-type slice selection techniques (e.g. spin-echo multislice and gradient-echo multislice) where nuclear spins in different parts of the tube are excited and detected during subsequent transients, while the previously used spins have time to relax towards equilibrium before being excited again, hence achieving a considerable time saving in the overall acquisition.

We believe that this method, named PALSY, is a very powerful yet simple and general technique to reduce experimental time. Of course, there is a sensitivity penalty approximately proportional to the number of slices chosen, but the good thing is that the achievable resolution in any dimension is not compromised in any way.

Another point of interest is that it does not require any fancy data processing, just a simple data shuffling operation to extract the different sub-spectra contained in the acquired raw data matrix. This operation has been implemented in the Mnova Alpha version and will be available in the next official release. Meanwhile, as always, should anyone be interested in evaluating this alpha version, just drop a comment here and I will get in touch to provide an executable.
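To give a rough idea of what such a data shuffling step could look like, here is a minimal Python sketch. It assumes (this is not stated in the post or taken from Mnova) that the raw data matrix stores one transient per row and that the slices are visited in a fixed round-robin order; the function name `deinterleave` is hypothetical.

```python
def deinterleave(transients, n_slices):
    """Split an interleaved list of transients into one list per slice.

    ASSUMPTION: row i of the raw matrix belongs to slice (i % n_slices).
    The real ordering used by the pulse sequence may differ.
    """
    if n_slices < 1:
        raise ValueError("n_slices must be at least 1")
    if len(transients) % n_slices != 0:
        raise ValueError("number of transients is not a multiple of n_slices")
    # Every n_slices-th row, starting at offset s, belongs to slice s.
    return [transients[s::n_slices] for s in range(n_slices)]


# Toy raw matrix: 3 slices, 2 transients per slice, stored interleaved.
raw = ["s0_t0", "s1_t0", "s2_t0", "s0_t1", "s1_t1", "s2_t1"]
sub_experiments = deinterleave(raw, 3)
```

Each entry of `sub_experiments` can then be processed as an ordinary, independent data set.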

The article can be accessed here:

Fast multidimensional localized parallel NMR spectroscopy for the analysis of samples

I would like to take this opportunity to acknowledge and congratulate my friend Manolo for this work. He is the intellectual father of this pulse sequence and is currently extending this idea further to cover other NMR experiments.

Tuesday, 27 July 2010

Riding up the peaks …

… around the Pyrenees. NMR peaks are not the only ones that interest us at Mestrelab; we also love the Cols in the Tour de France, and I have some proof :-) :

Quoting a friend of mine: I had never seen the Yellow Jersey before. It is a bit surprising that the competitors would fight so hard for the right to wear it :-)

Even though I’m not a great fan of Lance Armstrong, I reckon he has done an incredible job of popularizing cycling in the US. The guy in the photo has been following him for several years already in the Tour de France, running alongside the riders. It looks easier than it actually is, as the riders go faster than 20 km/h (i.e. 3 min/km) on fairly hefty slopes, so you have to be in pretty good shape to keep up with them for 200 m (that is approximately the distance he covers with them).

Well, enough for this off topic post. I will follow up later on tonight with some real NMR stuff.

Friday, 2 July 2010

Fermentanomics

We are now in an era in which a plethora of domains of scientific investigation are combined with the suffix ‘omics’. These include neologisms such as genomics, proteomics, metabonomics, pharmacogenomics, nutrigenomics and many others.

Scientists at Lilly have now established the basis of a new –omics related discipline which they have dubbed Fermentanomics, and which consists of a new rapid and robust NMR method for monitoring mammalian cell cultures.
This work has been published as a JACS communication and I’m delighted to see that they have used our Global Spectral Deconvolution (GSD) technique, available in Mnova, to extract the concentrations of the components from the NMR spectra of the spent media of mammalian cell cultures.

Friday, 21 May 2010

Bruker Smiles

The above title is not aimed to mislead and it does not refer to Bruker’s state of mind: Quite simply, the purpose of this post is merely technical and the relation between Bruker and its smiles will be apparent in a moment… Just keep reading…

NMR instruments have used receiver systems equipped with digital filters for a relatively long time. The advantages of such digital filters (generally designed as low-pass filters and applied together with oversampling and decimation methods) are manifold, ranging from higher-quality spectral baselines to improvements in SNR and effective dynamic range, enhanced reduction of potential sources of folded signals, etc.

It's not all about advantages though … I’m sure most of you are already very well aware of the pesky problem infamously known as the group-delay artifact in Bruker (and JEOL) data, which has plagued the NMR community since these companies switched to digital receivers. In short, the FID resulting from the digital filter does not start at time = 0 but only after a long and slowly rising oscillation of length G (G = Group Delay).

Some empirical procedures to correct it were presented on the internet but they are palliative and do not resolve the problem completely, particularly when apodization is applied.
Typically and depending on how the FID is processed, the spectrum might exhibit smiles (baseline artifacts pointing up) or frowns (baseline artifacts pointing down) at the outer regions of the spectrum as depicted below:

The ultimate solution
These small artifacts are in general not a big problem as one could use a spectral width large enough so that the peaks of interest in the spectrum will not be affected by these artifacts (although some processing algorithms such as backward Linear Prediction could be somewhat problematic with the Group Delay). In any case, we did not feel very comfortable with present solutions to this problem. A few months ago, I went for dinner with Stan and right after it, the power of the red wine and above all, the Galician octopus inspired Stan in such a way that he managed to understand the engineering drawback and proposed a new correction algorithm which we implemented together in Mnova just a few minutes later (whilst still under the influence of the wine :-) ).
Basically we have now a new pre-processing algorithm that corrects in a totally automatic way any Bruker FID corrupted by the group-delay artifact, producing a normal and physically correct FID so that the smiles will not be seen in the f-domain spectrum. The performance of the new algorithm is illustrated in the figure below:

This enhanced correction is available in Mnova from version 6.1.1 onwards, although it is not the default processing method for the moment. In order to activate it, it is necessary to select it via the Processing/Group Delay menu command.

I guess the take-home message from the story is: never underestimate the power of red wine and Galician octopus :-)

Thursday, 20 May 2010

Back in the blogosphere

After a 3 month-long hiatus, I'm back in the blogosphere. I’ve been travelling and working very hard on several exciting projects, but my entries have consequently suffered.

Although my workload has not decreased, I am not planning to travel for the next few weeks. Hopefully I will manage to blog on a more regular basis from now on, especially now that there are many things we have been working on lately that I hope will be of interest to the NMR community.

For now, I’d just like to point out a very interesting blog entry written by my friend Stan on his well-known NMR blog. One of the tools in Mnova of which we are most proud is GSD (Global Spectral Deconvolution) which, like any other fitting process, requires the definition of a line shape model. GSD uses a Lorentzian model and some of the reasons for this choice have been elegantly explained on his blog.

Why spectral lines are Lorentzian

Thursday, 4 February 2010

Learning NMR with Deep Purple

It’s not that I’ve gone crazy (well, I hope not :-) ) or that Deep Purple has moved from Hard Rock to Science (or at least I’m not aware of this), but after my last post about the acoustic reproduction of NMR FIDs, I thought that it would be fun to compose a simple song and, at the same time, create some stuff which can serve as an educational tool for some very basic NMR.

I won’t actually compose anything original, but rather make an NMR cover of the famous Smoke on the Water riff by Deep Purple. It’s very simple, with a central theme consisting of a four-note "blues scale" melody. If you don’t know this song, you can watch and listen here:

These are the notes for the central theme I learnt by ear. I know it’s not 100% accurate, but for the purpose of the exercise this should do just fine:

DFG DFAG DFG FD

You can play this riff with the virtual piano below, just click on the picture and click the notes above. It’s fun!

NMRing Smoke on the water

Before any further ado, these are the instructions to listen to the NMR version of Smoke on the Water:

First you will need Mnova. If you don’t own a license, you can download a free, fully functional demo from our web site.
Download this NMR document (SmokeOnTheWater.mnova)
Download this script (playArray.qs)
Finally, put on your earphones or amplify the volume on your computer speakers, open SmokeOnTheWater.mnova file with Mnova and run the script.

If everything works as I hope, you should be listening to 12 different pings pretending to be the melody of Smoke on the Water riff. I hope you won’t get too disappointed and more importantly, I hope that Deep Purple won’t detest me for ruining their song :)

Behind the scenes

As Dr. Walter Bauer explained on his web site, the most convenient way to create melodies with NMR is by means of pulse programmers to define the appropriate frequencies and delays. I’m taking a much simpler approach which, of course, cannot be used to create complex harmonies.

The basic idea is to synthesize NMR FIDs in such a way that each FID will represent a note of the song. For example, in this particular case I have simulated 4 FIDs with frequencies corresponding to A, D, F and G. Next I stacked these 4 FIDs to create a new stacked item with 12 FIDs formed by the combination of the 4 main FIDs (main tones) and sorted to yield DFG DFAG DFG FD. The result is illustrated in the figure below.

As you can see, not all the FIDs have the same length. This is, of course, because some notes have to last longer than others. For example, we can consider the first FID (D) as the quarter note (crotchet), the third FID a half note (minim) and the sixth FID an eighth note (quaver). Next, I will comment on how the duration of the FIDs is controlled.

Some points of interest

At first sight, it may seem that in order to create the NMR version of Smoke on the Water one simply has to create the 4 FIDs with the ‘real’ frequencies corresponding to A, D, F and G notes and then organize them accordingly. For example, using the A220 pitch, the frequencies for the different notes should be:


A = 220 Hz
D = 146.83 Hz
F = 174.61 Hz
G = 196 Hz

However, if these notes are used in this way, the resulting song will be totally different than expected! The reason for that is very simple: NMR FIDs are expected to be in the so-called Quadrature Detection mode, that is, zero frequency in the center of the spectral width. Thus, it becomes necessary to translate the original frequencies into the NMR quadrature frame. The equation for such transformation is very simple:

NMR Note Frequency = Note Frequency + SpectralWidth/2

For example, in this case, as I have used a spectral width of 3000 Hz, the NMR frequencies for the 4 notes will be:


A = 220 + 3000/2 = 1720 Hz
D = 146.83 + 3000/2 = 1646.83 Hz
F = 174.61 + 3000/2 = 1674.61 Hz
G = 196 + 3000/2 = 1696 Hz

Another point worth mentioning concerns the way to modulate the duration of the different FIDs to create crotchets, quavers, etc. In NMR terms, this is equivalent to the acquisition time (AT) which is defined as:

AT = N / SW

Where N is the number of points and SW is the spectral width. We can modify any of these two values to change the duration of the note (= FID acquisition time). In this example, I have kept the spectral width constant and modified the number of points.
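The two relations above (the frequency translation and AT = N / SW) can be put together in a few lines of Python. This is only a sketch of the arithmetic, not the script or the Mnova document used for the post; the function names are hypothetical, and the 0.5 s crotchet used in the usage line is an arbitrary assumption, since the actual note durations are not given here.

```python
SPECTRAL_WIDTH_HZ = 3000.0  # spectral width used in the post

# Note frequencies quoted above (A220 pitch).
NOTE_HZ = {"A": 220.0, "D": 146.83, "F": 174.61, "G": 196.0}

# The 12-note riff as written in the post.
RIFF = "DFG DFAG DFG FD".replace(" ", "")


def to_nmr_frequency(note_hz, spectral_width_hz=SPECTRAL_WIDTH_HZ):
    """NMR Note Frequency = Note Frequency + SpectralWidth/2."""
    return note_hz + spectral_width_hz / 2.0


def points_for_duration(duration_s, spectral_width_hz=SPECTRAL_WIDTH_HZ):
    """Number of points N giving the requested duration (AT = N / SW)."""
    return round(duration_s * spectral_width_hz)


def acquisition_time(n_points, spectral_width_hz=SPECTRAL_WIDTH_HZ):
    """AT = N / SW, in seconds."""
    return n_points / spectral_width_hz


# One translated frequency per note of the riff.
riff_frequencies = [to_nmr_frequency(NOTE_HZ[note]) for note in RIFF]

# ASSUMPTION: a crotchet lasting 0.5 s, chosen only for illustration.
crotchet_points = points_for_duration(0.5)
```

Halving or doubling `crotchet_points` while keeping the spectral width fixed gives the quaver and the minim, which is the approach described above.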

Finally …

In this example I have created pure tone notes (i.e. one single frequency for each note), but it would also be possible to create some kind of guitar chords (e.g. power chords) to make the song more realistic by simply combining the root tone and a fifth. This can be easily done by using the Spin Simulation toolkit available in Mnova.

Monday, 18 January 2010

Listening to NMR FIDs

About ten years ago I implemented a command in MestReC for the acoustic reproduction of an NMR FID. This was motivated by the suggestion of Javier Sardina and also after I found the Web page by Walter Bauer (an excellent musician by the way).

This feature was missing in Mnova which I think is a shame as in my opinion it is a very valuable educational tool. For example, it’s a beautiful way to show that measured NMR frequencies lie in the audio frequency region.
So I have decided to write a script to fill this gap in Mnova. If I remember correctly, the MestReC command played just the real part of the FID. As a minor improvement, I have now added stereo capabilities, basically by using the real and imaginary components of the FID as the two stereo sound channels. Another enhancement is that the sampling rate used to reproduce the FID corresponds to the actual acquired spectral width.
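For readers who want to experiment outside Mnova, the same idea can be sketched in Python with the standard library only. This is not the Mnova script mentioned above, just an illustration under a few assumptions: the FID is available as a list of complex numbers, the samples are normalised to 16-bit integers, and the function name `fid_to_wav_bytes` and the synthetic 440 Hz test FID are made up for the example.

```python
import cmath
import io
import math
import struct
import wave


def fid_to_wav_bytes(fid, spectral_width_hz):
    """Encode a complex FID as a 16-bit stereo WAV (real = left, imaginary = right).

    The WAV sampling rate is set to the spectral width, as described above.
    """
    # Normalise so that the largest component uses the full 16-bit range.
    peak = max(max(abs(p.real), abs(p.imag)) for p in fid) or 1.0
    scale = 32767.0 / peak
    frames = bytearray()
    for p in fid:
        left = int(round(p.real * scale))
        right = int(round(p.imag * scale))
        frames += struct.pack("<hh", left, right)
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(2)   # stereo: real and imaginary parts
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(int(round(spectral_width_hz)))
        wav.writeframes(bytes(frames))
    return buffer.getvalue()


# Synthetic test FID (an assumption, not real data): a decaying 440 Hz signal
# sampled for 1 s with a 3000 Hz spectral width.
SW = 3000.0
example_fid = [
    cmath.exp(2j * math.pi * 440.0 * k / SW) * math.exp(-k / 1500.0)
    for k in range(3000)
]
wav_bytes = fid_to_wav_bytes(example_fid, SW)
```

Writing `wav_bytes` to a file with a `.wav` extension should give something any audio player can open.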

Download the script

Anyone interested in this feature can download the script from this link
As always, your feedback will be very welcome

Thursday, 14 January 2010

On integrating overlapped peaks

Following up from the integration problem raised in my previous post and before I delve into Line Fitting, I would like to give you a quick update on some progress we have recently made in Mnova to facilitate the accurate integration of peaks in those cases in which a multiplet is contaminated by one or several extraneous peaks (e.g. a solvent peak).
Consider the following spectrum predicted using NMRPredict Desktop:

As expected (this is a perfect synthetic spectrum and therefore noise-free with no phase or baseline distortions and enough separation between the 3 multiplets) the relative integrals are in agreement with the structure, 1:2:1.

I will now modify this spectrum by adding an extra peak in the H-5 multiplet:

As a result, the multiplet corresponding to H-5 will show a relative integral of 2 instead of 1. This problem can be tackled by using, for example, some signal suppression algorithm to get rid of this extra peak or by deconvolving the multiplet and then summing up the individual deconvolved peaks without the extra peak.
The aim of this post is to let you know that we have just automated this whole process via the powerful GSD algorithm (more about this in a later post) so that dealing with this kind of problem has become much easier than before. As I will describe in depth once the new version with this functionality is released, the user just has to select which peak or group of peaks needs to be excluded from the integration and the program will do the rest. This is illustrated in the figure below, where the extra peak (in red) is not used for the integral calculation.
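The bookkeeping behind "sum the deconvolved peaks, skipping the unwanted ones" can be sketched in Python as follows. This is not the GSD implementation, only an illustration: it assumes the deconvolution has already produced a list of Lorentzian peaks, uses the standard result that a Lorentzian of height H and width at half height w has area πwH/2, and the peak list and the multiplet labels other than H-5 are invented for the example.

```python
import math
from collections import defaultdict


def lorentzian_area(height, width_hz):
    """Area of a Lorentzian line: pi * w * H / 2."""
    return math.pi * width_hz * height / 2.0


def multiplet_integrals(peaks, skip_excluded=True):
    """Sum peak areas per multiplet, optionally skipping excluded peaks."""
    totals = defaultdict(float)
    for peak in peaks:
        if skip_excluded and peak["excluded"]:
            continue
        totals[peak["multiplet"]] += lorentzian_area(peak["height"], peak["width_hz"])
    return dict(totals)


def relative_integrals(integrals, reference):
    """Normalise all integrals to the chosen reference multiplet."""
    return {name: value / integrals[reference] for name, value in integrals.items()}


def make_peak(multiplet, height, excluded=False):
    return {"multiplet": multiplet, "height": height, "width_hz": 1.0, "excluded": excluded}


# Hypothetical deconvolved peak list for a 1:2:1 spectrum ("H-a" and "H-b" are
# made-up labels) with one extraneous peak sitting inside the H-5 multiplet.
peaks = [
    make_peak("H-a", 0.5), make_peak("H-a", 0.5),
    make_peak("H-b", 1.0), make_peak("H-b", 1.0),
    make_peak("H-5", 0.5), make_peak("H-5", 0.5),
    make_peak("H-5", 1.0, excluded=True),  # the extra peak
]

contaminated = relative_integrals(multiplet_integrals(peaks, skip_excluded=False), "H-a")
corrected = relative_integrals(multiplet_integrals(peaks), "H-a")
```

With the extra peak included, H-5 integrates to twice its expected value; once it is flagged as excluded, the 1:2:1 ratio is recovered.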

Even though this new functionality is not available in the current official release of Mnova, it’s already fully operative in our internal version (alpha). Anyone interested in trying this new feature out is more than welcome. Just drop me a line (support at mestrelab.com ) and I will give more detail.

Monday, 11 January 2010

Basics on qNMR: Integration Rudiments (Part II)

My last post was a basic survey of different measurement strategies for peak areas. Manual methods such as counting squares or cutting and weighing, known as ‘boundary methods’, were introduced for historical reasons. These methods were first used by engineers, cartographers, etc., and then quickly adopted by spectroscopists and chromatographers.

In the digital era, the most common peak area measurement involves the calculation of the running sum of all points within the peak boundaries, or the use of another quadrature method (e.g. Trapezoid, Simpson, etc. [1]). Obviously, the digital resolution, i.e. the number of discrete points that define a peak, is a very important factor in minimizing the integration error. Intuitively, it’s easy to understand that the higher the number of acquired data points, the lower the integration error. It’s therefore very important to avoid any under-digitization when an FID is acquired, a problem which is unfortunately more common than many chemists realize.
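The effect of digitization on these quadrature methods is easy to try out numerically. The Python sketch below is only an illustration under stated assumptions: it samples an ideal, noise-free Lorentzian of unit height and 1 Hz width, centred on a grid point, integrates it over ±20 Hz, and compares each method with the analytical area over the same limits. All function names are hypothetical.

```python
import math


def lorentzian(x, height=1.0, width=1.0):
    """Lorentzian centred at 0 with the given height and width at half height."""
    return height / (1.0 + 4.0 * x * x / (width * width))


def sample_lorentzian(points_per_linewidth, half_range=20.0, width=1.0):
    """Sample the line on a regular grid from -half_range to +half_range."""
    step = width / points_per_linewidth
    n_intervals = int(round(2.0 * half_range / step))
    ys = [lorentzian(-half_range + i * step, width=width) for i in range(n_intervals + 1)]
    return step, ys


def running_sum(ys, step):
    return step * sum(ys)


def trapezoid(ys, step):
    return step * (sum(ys) - 0.5 * (ys[0] + ys[-1]))


def simpson(ys, step):
    if len(ys) % 2 == 0:
        raise ValueError("Simpson's rule needs an odd number of samples")
    return step / 3.0 * (ys[0] + ys[-1] + 4.0 * sum(ys[1:-1:2]) + 2.0 * sum(ys[2:-1:2]))


def exact_area(half_range=20.0, height=1.0, width=1.0):
    """Analytical integral of the Lorentzian between -half_range and +half_range."""
    return height * width * math.atan(2.0 * half_range / width)


def integration_errors(points_per_linewidth, half_range=20.0):
    """Relative error of each quadrature method for a given digitization."""
    step, ys = sample_lorentzian(points_per_linewidth, half_range)
    exact = exact_area(half_range)
    return {
        "running_sum": abs(running_sum(ys, step) - exact) / exact,
        "trapezoid": abs(trapezoid(ys, step) - exact) / exact,
        "simpson": abs(simpson(ys, step) - exact) / exact,
    }


for ppl in (1, 2, 5, 10):
    print(ppl, integration_errors(ppl))
```

Running it shows how quickly the error drops once the line is described by several points per line width; real spectra, with noise and peaks that fall between grid points, will of course behave less ideally.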

As described by F. Malz and H. Jancke [2], at least five data points must appear above the half width for each resonance for a precise and reliable subsequent integration. What does this mean in practical terms? Typically, acquisition parameters are defined according to the Nyquist condition: the spectral width (SW) and the number of data points (N, total number of complex points) determine the total acquisition time AQ:

AQ = N/SW

And the digital resolution (DR) is the inverse of the acquisition time, the latter being the product of the dwell time (DW) and the number of acquired time-domain points (TD):

DR = SW/N = 1/(DW x TD) = 1/AQ

If we consider a typical 500 MHz 1H-NMR spectrum with a line width at half height of 0.4 Hz (this is a common manufacturer specification) and a spectral width of 10 ppm (5000 Hz), the minimum number of acquired data points required to satisfy the five points rule should be:
5 pt x 5000 Hz / 0.4 Hz = 62500 complex points.

This number is not suitable for the FFT algorithm which requires, generally, a length equal to a power of two. This is solved by padding the FID with zeroes up to the next power of two, in this case 65536 (64K points).
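The arithmetic of this example can be reproduced with a short Python sketch. The function names are hypothetical, and the rounding step is only there to guard against floating-point noise before taking the ceiling.

```python
import math


def min_complex_points(spectral_width_hz, linewidth_hz, points_per_linewidth=5):
    """Minimum number of complex points satisfying the five points rule."""
    exact = points_per_linewidth * spectral_width_hz / linewidth_hz
    # Round first so that tiny floating-point errors do not add a spurious point.
    return math.ceil(round(exact, 9))


def next_power_of_two(n):
    """Smallest power of two that is greater than or equal to n (n >= 1)."""
    return 1 << (n - 1).bit_length()


SW = 5000.0        # 10 ppm at 500 MHz
LINEWIDTH = 0.4    # Hz, width at half height

n_acquired = min_complex_points(SW, LINEWIDTH)
n_fft = next_power_of_two(n_acquired)
acquisition_time_s = n_acquired / SW         # AQ = N/SW
digital_resolution_hz = SW / n_acquired      # DR = SW/N = 1/AQ
```

For the numbers used above this gives 62500 acquired points, an FFT size of 65536, an acquisition time of 12.5 s and a digital resolution of 0.08 Hz per point.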

Furthermore, in order to get the most out of the acquired data points, zero filling once (adding as many zeros as acquired data points) has been found (see [3]) to incorporate information from the dispersive component into the absorptive component, and hence it is useful to zero fill at least once (which is exactly what Mnova does). For example, as S. Bourg and J. M. Nuzillard have shown [4], even though zero-filling does not improve the spectral signal-to-noise ratio, it may increase the integral precision by a factor of up to 2^(1/2) when the time-domain noise is not correlated.

Regardless of the quadrature method, they all share the same systematic problem: in order to integrate one or several peaks it’s necessary to specify the integration limits. In qNMR assays, this is an evaluation parameter whose effect can be estimated using the theoretical line shape of an NMR signal. To a good approximation (assuming proper shimming), the shape of an NMR line can be expressed as a Lorentzian function:

L(x) = H / (1 + 4x^2/w^2)

Where w is the peak width at half height and H is its height value (the peak is taken to be centred at x = 0). When L(x) is integrated between -infinity and +infinity, the total integrated area becomes:

Area = π w H / 2

Obviously, it is unreasonable to integrate digitally from -infinity to +infinity, so an approximation must be made by choosing limits. This has been studied by Griffiths and Irving [5], who showed that for a maximum error of 1%, integration limits of 25 times the line width in both directions must be employed. If errors of less than 0.1% are desired, the integral width has to be +/-76 times the peak width. For example, in a 500 MHz NMR spectrum with a peak width of 1 Hz, the integrated region should be 152 Hz (~0.30 ppm), as illustrated in the image below.

But in general, peaks are not so well separated and, for example, when studying complex mixtures or impurities related to the main compound, wide integrals cannot be used. In general, integration by direct summation is not suited to partially overlapping peaks.

For example, just consider the simple case of peak overlap where, for instance, one peak of a double doublet overlaps with a triplet:

The theoretical relative integrals for the two multiplets should be 1:1. However, the area of the triplet calculated via the standard running sum method will be overestimated because of the contamination caused by one of the peaks of the double doublet, which in turn will be underestimated. This is illustrated in the figure below, where the green lines correspond to the triplet, the blue lines to the double doublet, and the red line is the actual spectrum (sum of all individual peaks).

The question is: how to overcome this problem? The answer is, of course, Line Fitting (Deconvolution) which will be the subject of my next post.

References:

[1] Jeffrey C. Hoch and Alan S. Stern, NMR Data Processing, Wiley-Liss, New York (1996)

[2] F. Malz, H. Jancke, J. Pharm. Biomed. Anal. 38, 813-823 (2005)

[3] E. Bartholdi and R. R. Ernst, "Fourier spectroscopy and the causality principle", J. Magn. Reson. 11, 9-19 (1973)
doi:10.1016/0022-2364(73)90076-0

[4] S. Bourg, J. M. Nuzillard, "Influence of Noise on Peak Integrals Obtained by Direct Summation", J. Magn. Reson. 134, 184-188 (1998)
doi:10.1006/jmre.1998.1500

[5] Lee Griffiths and Alan M. Irving, "Assay by nuclear magnetic resonance spectroscopy: quantification limits", Analyst 123 (5), 1061–1068 (1998)