Setting the scene
Over the last 25 years, during my bachelor's degree, PhD, postdoc, and now as director of R&D at Mestrelab, I have had the opportunity to interact with many organic chemists. Most of them, each with their own peculiarities, share relatively similar procedures and workflows, with their strengths and weaknesses. I have witnessed many advances in the way they conduct their research, but I must also say that some areas of it remain firmly rooted in the past.
An example of the latter, which I still see in many labs, is the issue of data loss. In the particular case of academia, research teams are typically made up of (pre)doctoral students or postdoctoral researchers whose residence time is usually between 3 and 8 years, roughly speaking.
During that period, they produce an enormous amount of spectroscopic data (NMR, GC/LC/MS, UV/IR, etc.) to characterize their molecules. Whilst some groups have sophisticated IT infrastructures equipped with either in-house or third-party databases (including Mnova DB for analytical data), I think it is not unreasonable to say that most of them save their spectroscopy data on their personal computers (e.g. laptops) or in shared folders of their research group (e.g. Dropbox).
The result is that data is lost as students leave.
If you're a principal investigator, I'm sure you've found yourself in the following situation: one of your students synthesized a compound some time ago. However, for some reason, you are now considering the possibility that the proposed structure may not be the right one. Obviously, to review this structure, you need to have access to the original spectroscopic data, but unfortunately, the student is no longer part of your research group and you have no way of locating the NMR spectra.
In the same vein, some students only keep the spectroscopic data of the products that they have successfully synthesized but discard the data of those reactions that did not work in the way they had planned.
These are just two examples of what I consider to be a more general problem associated with the difficulty of efficiently managing analytical information in an organic chemistry laboratory.
Nowadays, many labs are moving from paper-based to electronic laboratory notebooks (ELNs) that offer significant benefits for long-term storage. However, most of them lack the capability to understand and handle spectroscopy data in an integrated manner. Some of them are just repositories of PDFs of analytical data generated by some specialized software. This is, in my opinion, a very limited, unproductive and inefficient solution, to the extent that data stored in this form has been dubbed “dead data”: all the valuable spectroscopy information has been removed, reducing it to an unstructured set of images and text strings. Stored this way, analytical data is virtually unusable, and issues like the ones listed below cannot be addressed:
- NMR data could have been processed incorrectly, making a comprehensive analysis of the data unfeasible.
- Only some parts of the spectrum could have been reported, or the resolution could be too low to characterize a compound unambiguously. For instance, accurate determination of coupling constants or inspection of possible impurities or side products in a reaction would not be possible.
- Spectroscopic data search: Do I have any spectrum that contains a triplet at 3.5 ppm? This is a question that could not be answered with dead data.
- Do I have any spectrum similar to this one?
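To make the contrast with dead data concrete, here is a minimal sketch of how the triplet question above could be answered once peak information is kept as structured records rather than images. The `Multiplet` and `SpectrumRecord` classes, the `find_spectra` function, the 0.05 ppm tolerance and the sample values are all hypothetical and chosen for illustration only; they do not describe how Mbook or Mnova DB store or search data.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Multiplet:
    """One assigned multiplet: chemical shift plus a multiplicity label."""
    shift_ppm: float
    multiplicity: str  # e.g. "s", "d", "t", "q", "m"


@dataclass
class SpectrumRecord:
    """Hypothetical structured record for one processed spectrum."""
    name: str
    multiplets: List[Multiplet] = field(default_factory=list)


def find_spectra(records, multiplicity, shift_ppm, tolerance_ppm=0.05):
    """Return the names of records holding a matching multiplet.

    A multiplet matches when its label equals `multiplicity` and its
    shift lies within `tolerance_ppm` of `shift_ppm`.
    """
    return [
        record.name
        for record in records
        if any(
            m.multiplicity == multiplicity
            and abs(m.shift_ppm - shift_ppm) <= tolerance_ppm
            for m in record.multiplets
        )
    ]


# Made-up example library (illustrative values, not real measurements).
library = [
    SpectrumRecord("compound_A", [Multiplet(3.52, "t"), Multiplet(1.20, "d")]),
    SpectrumRecord("compound_B", [Multiplet(3.50, "s")]),
    SpectrumRecord("compound_C", [Multiplet(7.26, "m"), Multiplet(3.47, "t")]),
]

# "Do I have any spectrum that contains a triplet at 3.5 ppm?"
print(find_spectra(library, "t", 3.5))
```

With a PDF or an image, none of the fields used in this query exist, which is why the same question cannot be asked of dead data.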
Some ELNs, in addition to PDFs or plain images, also store raw data but do not offer a solution with real spectroscopy intelligence capabilities within a searchable and homogeneous environment.
Mbook 2.0: A spectroscopy-aware ELN
Mbook 2.0, our ELN, is our answer to those issues.
It has been designed to take advantage of all the power of Mnova, which is tightly integrated with Mbook and is responsible for processing the analytical data acquired by the chemist. The scientist only needs to send the data in a zip file, and Mnova will automatically recognize the file format (NMR data such as those from Bruker, JEOL, Varian / Agilent, Magritek, Thermo picoSpin and Nanalysis, as well as many LC/GC/MS and UV/IR files) and process it in a fully unattended way. As a result, a new Mnova document is generated on the fly and saved into the ELN.
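As an illustration of what recognizing a format from a zip file can involve, the following sketch guesses the NMR vendor from the file names inside an archive. It is not Mnova's actual detection logic, which is not described here. It relies only on common on-disk conventions: Bruker datasets contain an `acqus` file next to `fid` or `ser`, Varian / Agilent datasets contain `procpar` next to `fid`, and JEOL Delta files use the `.jdf` extension. The function names `guess_nmr_vendor` and `make_zip` are hypothetical, and the other vendors mentioned above are deliberately left out.

```python
import io
import zipfile
from pathlib import PurePosixPath


def guess_nmr_vendor(zip_bytes: bytes) -> str:
    """Guess the NMR vendor from file names inside a zip archive.

    Simplified heuristic for illustration; real datasets may need
    more careful checks (nested experiments, mixed content, etc.).
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        paths = [
            PurePosixPath(name)
            for name in archive.namelist()
            if not name.endswith("/")  # skip directory entries
        ]
    basenames = {p.name.lower() for p in paths}
    suffixes = {p.suffix.lower() for p in paths}

    if "acqus" in basenames and {"fid", "ser"} & basenames:
        return "Bruker"
    if "procpar" in basenames and "fid" in basenames:
        return "Varian/Agilent"
    if ".jdf" in suffixes:
        return "JEOL"
    return "unknown"


def make_zip(file_names) -> bytes:
    """Build an in-memory zip with empty files, for demonstration only."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in file_names:
            archive.writestr(name, b"")
    return buffer.getvalue()


print(guess_nmr_vendor(make_zip(["sample/1/acqus", "sample/1/fid"])))
print(guess_nmr_vendor(make_zip(["sample.fid/procpar", "sample.fid/fid"])))
print(guess_nmr_vendor(make_zip(["sample_proton.jdf"])))
```

The point of the sketch is only that the chemist does not have to declare the format: the archive contents carry enough information for software to decide how to process the data.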
This file can be accessed and viewed directly from within Mbook with a new spectral viewer which provides basic navigation tools such as zooming in and out.
At present, Mbook 2.0 does not include spectral search capabilities, but we expect to offer this feature shortly, once the integration of Mbook with Mnova DB is completed.