The death of my paper lab book?

Nature recently had a feature on the “paperless” lab which mostly focused on electronic laboratory notebooks (ELNs). As a computer nerd, I’ve been thinking about using one for a while.

ELNs have lots of advantages over paper notebooks. They’re searchable, easily backed up and can automatically incorporate data from instruments—no more cutting and pasting. Businesses like them as it’s easier to find out what an ex-employee did in an ELN than in loads of paper notebooks.

I’ve always used my department’s standard synthetic chemistry lab book, which has a risk assessment and reaction scheme on every left page and lines on every right. It works quite well. I number every reaction TWP001, TWP002, etc. and samples are labelled TWP001-A, TWP001-B, etc. Spectra follow a similar convention, e.g. TWP001-A_em_spec.txt or TWP001-A_abs_spec.txt, and all data and code used for data analysis are kept in a folder called TWP001_brief_description.[^git]
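A naming convention like this is easy for a script to read back. The following is only a minimal sketch: the regular expression assumes three-digit reaction numbers, a single capital letter per sample and a lowercase spectrum type, and the names `SPECTRUM_NAME`, `Spectrum` and `parse_spectrum_name` are made up for illustration.

```python
import re
from typing import NamedTuple, Optional

# Matches names such as "TWP001-A_em_spec.txt".
# Assumed layout: <initials><3 digits>-<sample letter>_<type>_spec.txt
SPECTRUM_NAME = re.compile(
    r"^(?P<reaction>[A-Z]+\d{3})-(?P<sample>[A-Z])_(?P<kind>[a-z]+)_spec\.txt$"
)


class Spectrum(NamedTuple):
    reaction: str  # e.g. "TWP001"
    sample: str    # e.g. "A"
    kind: str      # e.g. "em" or "abs"


def parse_spectrum_name(filename: str) -> Optional[Spectrum]:
    """Split a spectrum file name into its parts, or return None if it does not fit."""
    match = SPECTRUM_NAME.match(filename)
    if match is None:
        return None
    return Spectrum(match["reaction"], match["sample"], match["kind"])


print(parse_spectrum_name("TWP001-A_em_spec.txt"))
```

Returning `None` for names that do not fit keeps the function safe to run over a whole folder of mixed files.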

But there are a few things that I really hate about paper lab books. Going back through my notes when writing up work is a real chore, especially with seemingly never ending notes along the lines of “same as TWP050 except…”. Reaction TWP050 says: “same as TWP049 except…”. With an ELN you can just copy and paste.

The inherent linearity of a paper lab book is a pain. Entries are in chronological order, which assumes reactions are performed sequentially, one at a time, but I usually work on two or three reactions at once. Leaving blank pages looks sloppy, but cramming notes into small gaps is messy.

The biggest problem is that paper notebooks have become incomplete records of research in the modern laboratory. A lab book should be a complete record of your thoughts, observations, measurements and results. However, with modern lab instrumentation it’s impractical or impossible to include all the data by printing, cutting and sticking it in. For example, a search on my computer (not a look in my lab book) reveals 510 UV-vis absorption, fluorescence and excitation spectra recorded since August 2010. There’s no way I could print all of that out (and even if I did, the data would be useless in that format). Furthermore, a paper lab book can’t capture any of the data analysis done on the computer. My MATLAB (and now Python) code is riddled with comments. With paper lab books, this information is highly fragmented.
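As a rough illustration of that kind of search (not the one actually used for the figure above), a few lines of Python can tally spectra by type. The sketch assumes every spectrum file ends in `_spec.txt` with the type just before it, as in TWP001-A_abs_spec.txt; the function name `count_spectra` and the folder path in the usage line are hypothetical.

```python
from collections import Counter
from pathlib import Path


def count_spectra(root):
    """Count files named like '<sample>_<type>_spec.txt' under root, grouped by type."""
    counts = Counter()
    for path in Path(root).rglob("*_spec.txt"):
        if not path.is_file():
            continue
        # "TWP001-A_abs_spec" -> ["TWP001-A", "abs", "spec"] -> "abs"
        parts = path.stem.split("_")
        if len(parts) >= 3:
            counts[parts[-2]] += 1
    return counts


if __name__ == "__main__":
    # "." is a placeholder; point this at the folder holding the experiment folders.
    for kind, number in sorted(count_spectra(".").items()):
        print(kind, number)
```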

Considering these problems, I’ve been looking at electronic alternatives for some time, but what I’ve disliked about them boils down to two things: inflexibility and how they handle data. They seem to try to fit everything into a particular template or form. With a paper lab book, I can write and draw whatever I want, which is important to me as I’m not a “normal” synthetic chemist: I work with flow reactors and I’m more interested in my residence time than yield.

I want to be able to access my plain text data files as plain text files and not have them converted into horrible proprietary binary formats subject to the whims of the ELN vendor. Think of the hassle caused when Microsoft switched from .doc to .docx—I don’t want this happening with my data. Plain text files from 30 years ago can still be read today and will be readable for longer than I’ll be alive. It also worries me that a web based ELN could disappear and leave me with a load of horribly formatted files to wade through.

Researching online, I found advocates of open notebook science (the left-field practice of making your entire lab book and data available online as it is recorded) who use blogs and wikis as ELNs. Cameron Neylon’s blog-like open lab book used the University of Southampton’s free LabTrove software. Lab book entries are like blog posts, with attachments for data, and you can organise posts using tags, e.g. “NMR”, or categories, perhaps to group posts related to a single reaction. Jean-Claude Bradley’s group notebook, called the UsefulChem Project, used a wiki. I really like Bradley’s wiki and there are lots of nice examples if you click about on the list of reactions. His group upload and link to spectra and photographs, which makes for a complete research record.

I did a bit more research into using a wiki as an ELN and it seems to be the perfect match. Wikis are flexible, so I can organise data however I want, and pages are versioned, so you can see what was written when. There are loads of different wiki applications available, so I narrowed the possibilities down with the following criteria:

  • active development
  • proven large scale deployment for stability and reliability
  • open source and free
  • page access control
  • supports attachments
  • self-hosted because I don’t trust anyone
  • written in a nice programming language
  • stores data nicely, i.e. not binary formats

This boiled down to MediaWiki (runs Wikipedia), Foswiki (used for loads of corporate intranets) and MoinMoin (large scale deployments include the Apache Software Foundation, Python and Ubuntu wikis).

MediaWiki doesn’t handle attachments very well for ELNs since attachments are available globally, i.e. across the whole wiki at the top level, rather than being linked to individual pages. The latter makes more sense to me as spectra or photos (the attachment) are related to the experiment (the page) rather than the whole notebook (the wiki). MediaWiki is designed for open content, so it doesn’t do access control without dodgy extensions. It’s also written in PHP, which I have no intention of learning. So that’s MediaWiki struck off.

Foswiki is aimed at corporations, which I think you can tell from its look and feature list. It’s also written in Perl, which I really don’t want to learn. So that’s Foswiki gone.

Last is MoinMoin. Unlike MediaWiki, attachments are linked to pages. MoinMoin is written in Python, a really nice language I’ve started to use instead of MATLAB, so there’s the possibility of writing my own extensions. It’s currently at version 1.9.4, so it should be very stable, and version 2.0 is under active development. It’s very clean and tidy.
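To give a flavour of what page-level attachments could look like, here is a small sketch that builds the wiki text for a reaction page. It doesn’t touch MoinMoin’s Python API at all: it only produces a string in MoinMoin’s default markup (`= Heading =` and `[[attachment:...]]` links), and the page layout and the function name `reaction_page` are assumptions made up for illustration.

```python
def reaction_page(reaction_id, description, attachments):
    """Return wiki text for one reaction page that links its own attachments.

    The layout (a title heading followed by a "Data" list) is hypothetical.
    """
    lines = [
        f"= {reaction_id}: {description} =",
        "",
        "== Data ==",
    ]
    for name in sorted(attachments):
        # MoinMoin bullet items start with a space before the asterisk.
        lines.append(f" * [[attachment:{name}]]")
    return "\n".join(lines) + "\n"


print(reaction_page(
    "TWP001",
    "brief description",
    ["TWP001-A_em_spec.txt", "TWP001-A_abs_spec.txt"],
))
```

Because the attachment links are relative to the page, the spectra stay with the experiment they belong to.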

I spoke to my supervisor about an ELN and he was extremely keen, so I’ve decided to give MoinMoin a go. I’ve installed it on a Linode virtual server running Ubuntu Linux.[^VPS] It took about 6 hours to install the whole server from scratch, which is not bad having never administered a server before! Initially I was a little worried about security, with data being on an internet server, but I’ve locked down the server pretty tight and am going to make off-site backups to my office machine. If anyone is interested, I’ll write up how to set it up.

It would be cool to make MoinMoin chemically savvy, perhaps by pulling in data from ChemSpider or Wolfram Alpha, or COSHH info from Sigma-Aldrich. I think this could be done with a little Python scripting. I’ll open source anything good for others to use. I’m also planning on setting up an old scanner in the lab to upload paper drawings.

This could all prove to be an embarrassing experiment or even a complete nightmare, ending with me dusting off my most recent lab book and finding a pen. On the other hand, it could be great. We’ll have to wait and see!

[^git]: I use Git to keep track of all changes to the files in each experiment folder. I don’t know anyone in my department who has heard of it, which is a shame as it’s a great tool. I’ll blog about it sometime.

[^VPS]: I could have installed it on a dedicated machine in the office, but we’re a bit short on machines and I didn’t want to have to deal with hardware.

*[COSHH]: control of substances hazardous to health
*[ELN]: electronic lab notebook