Open science with figshare and object-orientated programming

Update: I’m pleased to say that I was awarded Imperial’s Bradley-Mason Prize for Open Chemistry — see Professor Rzepa’s blog post for more info.

From 1st May 2015, the EPSRC requires that all publications include a statement saying how the underlying research data can be accessed. Technically, you can simply include an email address to contact for the data, but I think that’s hardly in the spirit of open science. In this post, I want to describe how I used object-orientated programming (OOP) and figshare to meet this requirement for my latest paper in Lab on a Chip. You can download the data and MATLAB code to reproduce the graphs at figshare.

Graphical abstract for Microscale extraction and phase separation using a porous capillary.
Graphical abstract for my paper.

In OOP, you create classes that define objects and their properties. For example, if you had a class Animal, instances of this class could be cat and dog. For the Animal class the properties might be legs (an integer) or dateofbirth (a date). The class also defines methods, which are functions that operate on instances of a class. For example, Animal.age() might use the dateofbirth property to return the age of the animal.
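A minimal Python sketch of the Animal example might look like this. The optional today argument is an assumption added here so that the result does not depend on the day the code is run.

```python
from datetime import date


class Animal:
    """A minimal class with two properties and one method."""

    def __init__(self, legs, dateofbirth):
        self.legs = legs                # an integer
        self.dateofbirth = dateofbirth  # a datetime.date

    def age(self, today=None):
        """Return the animal's age in whole years."""
        today = today or date.today()
        # Subtract one if this year's birthday has not happened yet.
        had_birthday = (today.month, today.day) >= (
            self.dateofbirth.month, self.dateofbirth.day)
        return today.year - self.dateofbirth.year - (0 if had_birthday else 1)


# cat and dog are two instances of the same class.
cat = Animal(legs=4, dateofbirth=date(2012, 6, 1))
dog = Animal(legs=4, dateofbirth=date(2009, 11, 20))
print(cat.age(today=date(2015, 5, 1)))
print(dog.age(today=date(2015, 5, 1)))
```

Both objects share the one definition of age(), but each carries its own dateofbirth, so the same method gives a different answer for each.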

For my paper I defined a class called sepexp (short for separation experiment, the subject of the paper) with properties corresponding to the independent and dependent variables. My class definition also included two methods: runall, to run the experiments (which were, thankfully, automated, one of the joys of flow chemistry), and plot, to plot the data.

To start an experiment, I would create an instance of my sepexp class. For example, let’s call it exp1, and during its creation I specify the independent variables. Executing exp1.runall() runs all the experiments defined by my properties. The details aren’t relevant here—see the paper if you’re interested—but the key thing is that it saves the results in the properties mass_initial and mass_final.
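To make that workflow concrete, here is a hypothetical Python stand-in for such a class. It is not the MATLAB code from the archive: the independent variables (flow_rates, collection_time), the helper _weigh_vial and the simulated masses are all assumptions for illustration. Only the names runall, mass_initial and mass_final come from the description above.

```python
import random


class SepExp:
    """Hypothetical stand-in for the sepexp class described above."""

    def __init__(self, flow_rates, collection_time):
        # Independent variables, fixed when the object is created.
        self.flow_rates = list(flow_rates)      # assumed units: mL/min
        self.collection_time = collection_time  # assumed units: min
        # Dependent variables, filled in by runall().
        self.mass_initial = []
        self.mass_final = []

    def _weigh_vial(self, flow_rate, rng):
        # Placeholder for the real apparatus: returns simulated masses in g.
        empty = 10.0 + rng.uniform(-0.05, 0.05)
        collected = flow_rate * self.collection_time  # assumes 1 g/mL
        return empty, empty + collected

    def runall(self, seed=0):
        """Run one experiment per flow rate and store the results."""
        rng = random.Random(seed)
        self.mass_initial, self.mass_final = [], []
        for flow_rate in self.flow_rates:
            before, after = self._weigh_vial(flow_rate, rng)
            self.mass_initial.append(before)
            self.mass_final.append(after)


exp1 = SepExp(flow_rates=[0.1, 0.2, 0.5], collection_time=10)
exp1.runall()
print(len(exp1.mass_initial), len(exp1.mass_final))
```

After runall() the object holds both the settings it was created with and the measurements they produced, so the two cannot drift apart.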

Now that I’ve got an object that defines the experiment and contains the results, I can save it, e.g. using save in MATLAB or pickle in Python.
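In Python, saving and reloading such an object with pickle could be sketched as follows. The cut-down SepExp class, the file name exp1.pkl and the numbers are made up for illustration.

```python
import os
import pickle
import tempfile


class SepExp:
    """Hypothetical, cut-down experiment object."""

    def __init__(self, flow_rates):
        self.flow_rates = flow_rates
        self.mass_initial = []
        self.mass_final = []


exp1 = SepExp(flow_rates=[0.1, 0.2])
exp1.mass_initial = [10.0, 10.0]  # made-up numbers
exp1.mass_final = [11.0, 12.0]

# Write the whole object (settings and results) to disk.
path = os.path.join(tempfile.mkdtemp(), "exp1.pkl")
with open(path, "wb") as f:
    pickle.dump(exp1, f)

# Read it back later, for example in a plotting script.
with open(path, "rb") as f:
    restored = pickle.load(f)

print(restored.flow_rates, restored.mass_final)
```

pickle stores the object's data plus a reference to its class, not the class code itself, so whoever loads the file needs the class definition too. Unpickling can also execute arbitrary code, so only load pickle files from sources you trust.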

The next step is to plot it, so I execute exp1.plot(), which does a straightforward calculation on the data collected to get the volumetric collection rate at the outlet and plots it. I then repeated this for each experiment.
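The analysis step can be sketched in the same way. The formula below (mass gained divided by density and collection time) is an assumption about what such a calculation might involve and may differ from the one in the paper. The class, labels and numbers are hypothetical, and the sketch prints the rates rather than plotting them.

```python
class SepExp:
    """Hypothetical experiment object with one analysis method."""

    def __init__(self, label, collection_time, mass_initial, mass_final):
        self.label = label
        self.collection_time = collection_time  # assumed units: min
        self.mass_initial = mass_initial        # assumed units: g
        self.mass_final = mass_final            # assumed units: g

    def collection_rates(self, density=1.0):
        """Volumetric collection rate (mL/min) for each run.

        Assumes rate = mass gained / (density * collection time).
        """
        return [
            (after - before) / (density * self.collection_time)
            for before, after in zip(self.mass_initial, self.mass_final)
        ]


experiments = [
    SepExp("exp1", 10, [10.0, 10.0], [11.0, 12.0]),
    SepExp("exp2", 5, [10.0, 10.0], [10.5, 12.5]),
]

# The analysis is written once and applied to every experiment.
for exp in experiments:
    print(exp.label, exp.collection_rates())
```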

What does this approach give you? You end up with a class definition and a series of objects that contain the parameters of each experiment, how it was carried out, the results, and a means to reproduce the analyses. You can zip this up, upload it to figshare, and you’ve got a publicly accessible link to your data with a DOI.

An OOP approach saves time when analysing data, because you define how the data is analysed once in the class definition, and apply it repeatedly to every object/experiment. It’s easy to iterate through all your objects (see the scripts in the /plotting_scripts folder). Distributing the class definition and the objects together means others can reproduce your analysis. I think that’s pretty cool. If you’ve got MATLAB, download my archive and give it a go.

Or even better, try it out with your next project. There are lots of resources for learning OOP in your language of choice online. The MATLAB OOP documentation is good (although I think MATLAB’s OOP syntax is horrible). I personally like books and learnt about OOP for the first time in the excellent book Learning Python by Mark Lutz.

Routine operations

On Friday I went to a talk by Steven Ley titled Going with the Flow: Enabling Technologies for Molecule Makers. His group at Cambridge have done a lot of impressive work on flow chemistry over many years, both developing the technology and using it to synthesise organic molecules.

He covered a lot of ground in the talk, but one of his main points was that it is “unsustainable to use people for routine operations”. Chemists train for 10 years to then stand in front of a fume hood running columns. Ley wants to develop tools that allow researchers to make better use of their time in the laboratory. Flow chemistry has many benefits over batch chemistry, one of them being that it is easy to automate.

His talk left me wondering where I’m particularly inefficient in the lab. Sample collection and recording absorption spectra are particularly time-consuming. Last year I started to build an (Arduino-powered) automatic sample collector, but made it far too complicated and never finished it. Now I’ve drastically simplified it (to the design my supervisor said I should use in the first place, as he often likes to remind me) and hope to have it working by the end of next week. I reckon it could save me between 5 and 10 hours a week of standing around swapping vials. I’m also going to make a start on recording absorption spectra inline. Again, this will save me a few hours a week, leaving me to do something more valuable.

I completely agree with Ley about the benefits of flow chemistry, but you can’t ignore that all this equipment costs money. Ley’s group use a lot of commercially available equipment and it’s not cheap. In my group, we build a lot of apparatus ourselves because we can tailor it to our needs and it’s a lot more “hackable” (as well as cheaper).

Someone in the audience tried to make the point during questions that funding is tight, especially for those working in organic synthesis. How are they meant to afford equipment like £40,000 inline infrared spectrometers? Ley didn’t really answer this question (and I’m not sure he can). He’s obviously very well funded so he can build and develop the “lab of the future” [1]. A lot of this technology might be out of the budget of the chemists who will benefit from it the most. Unfortunately they might be performing “routine operations” for some time to come.

[1]: M.D. Hopkin, I.R. Baxendale, S.V. Ley, Chim. Oggi./Chemistry Today, 2011, 29, 28-32.

MSci Project Part 1: Quantum Dots

I don’t start my PhD until October so I won’t be posting much about it for a couple of months. In the meantime, I thought it would be nice to talk about what I did for my final year research project as part of my MSci degree.

The aim was to synthesise (core-shell and ternary) quantum dots using microfluidic reactors. It sounds complicated, but really it’s quite straightforward! An explanation of it all in one post would be rather long so I’m going to break it down into two posts, starting with quantum dots and then moving on to microfluidic reactors.

What are Quantum Dots?

Quantum dots are nanoparticles (particles only a few billionths of a metre in size) made from semiconductors. Semiconductors are materials whose electrical conductivity lies between that of insulators and conductors. They are the foundation of modern electronics and without them we wouldn’t have components like transistors and diodes, which are essential building blocks of the technology we use every day.

All materials have particular physical properties—such as the melting point or density—that are independent of how much of the material you have. For example, if you measured the melting point of a material, cut it in half, then remeasured the melting point, the melting point would not change. Properties like these are called intensive properties.

Imagine you had a piece of semiconductor and repeatedly measured an intensive property, such as melting point, then cut it in half. You would expect intensive properties to stay the same, regardless of the amount of material. However, if you carried on doing this for quite some time, so that your semiconductor was just a few billionths of a metre across, you would find that its properties would start to change: properties which were intensive become dependent on how much of the material you have, that is, on the size of the particle. Chemists can take advantage of this phenomenon to tune the properties of semiconductors for particular applications by controlling the particle size.

Making Quantum Dots

Rather than breaking down macro- or microscopic bits of semiconductor to make nanoparticles (“top-down”), chemists usually make quantum dots from individual atoms (“bottom-up”). This is most commonly achieved by injecting the appropriate reagents into a hot solvent. The quantum dots spontaneously form in the hot solvent and are left to grow to the desired size.

The photo below is of some cadmium selenide quantum dots that I made last year. I think it’s a wonderful example of their size-dependent properties.

CdSe Quantum Dots
CdSe quantum dots fluorescing under UV light.

Each vial contains quantum dots that were removed from the reaction vessel at regular intervals. The vial on the far left hand side contains quantum dots grown for 30 seconds and the vial on the far right hand side contains quantum dots grown for 3 hours. The mean size of the particles grown for 30 seconds and 3 hours was 2.8 nm and 4.2 nm respectively, so the nanoparticle size increases from left to right.

The colour arises from a process called fluorescence. The vials are sat on top of an ultraviolet lamp which causes the quantum dots to fluoresce and emit light, the wavelength of which is dependent on the size of the quantum dots.
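For a rough feel for how size sets the wavelength, a commonly used effective-mass estimate (the Brus equation, which is a textbook approximation and not something taken from my project) gives the energy of the lowest excited state of a dot of radius R. Here E_g is the band gap of the bulk semiconductor, m_e^* and m_h^* are the effective masses of the electron and hole, and \varepsilon_r is the relative permittivity of the material.

```latex
E(R) \approx E_g
  + \frac{\hbar^2 \pi^2}{2 R^2}\left(\frac{1}{m_e^*} + \frac{1}{m_h^*}\right)
  - \frac{1.8\, e^2}{4 \pi \varepsilon_0 \varepsilon_r R},
\qquad
\lambda = \frac{h c}{E(R)}
```

The second term grows as 1/R^2, so it dominates for small dots: the smaller the particle, the larger the energy and the shorter the wavelength of the emitted light. The approximation is known to become less accurate for the very smallest particles, so treat it as a guide to the trend rather than a way to predict exact colours.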

These unique optical properties make quantum dots very attractive for use in solar cells, displays and even in medical imaging. The trouble is that high-quality quantum dots are quite tricky to make, especially on an industrial scale. In part 2, I’ll talk a bit more about the applications of quantum dots, what microfluidics is and why it’s great for making quantum dots. If anyone has any questions, please don’t hesitate to ask!