Correcting the literature

Mathias Brust in Chemistry World:

Ideally, science ought to be self-correcting. … In general, once a new phenomenon has been described in print, it is almost never challenged unless contradicting direct experimental evidence is produced. Thus, it is almost certain that a substantial body of less topical but equally false material remains archived in the scientific literature, some of it perhaps forever.

Philip Moriarty expresses similar concern in a post at Physics Focus. Openly criticising other scientists’ work is generally frowned upon—flaws in the literature are “someone else’s problem”. Erroneous papers sit in the scientific record, accumulating a few citations. Moriarty thinks this is a problem because bibliometrics are (unfortunately) used to assess the performance of scientists.

I think this is a problem too, although for a different reason. During my MRes I wasted a lot of time trying to replicate a nanoparticle synthesis that I’m now convinced is totally wrong. Published in June 2011, it now has five citations according to Web of Knowledge. I blogged about it and asked what I should do. The overall response was to email the authors but in the end I didn’t bother. I wanted to cut my losses and move on. But it still really bugs me that other people could be wasting their limited time and money trying to repeat it when all along it’s (probably) total crap.

I did take my commenters’ advice and email an author about another reaction that has turned out to be a “bit of an art”. (Pro tip: if someone tells you a procedure is a bit of an art, find a different procedure.) I asked some questions about a particular procedure and quoted a couple of contradictions in their papers, asking for clarification/correction. His responses were unhelpful and after a couple of exchanges he stopped replying. Unlike the first case, I don’t believe the results are flat out wrong. Instead I suspect a few experimental details are missing or they don’t really know what happens. I think I’ll get to the bottom of it eventually, but it’s frustrating.

What are your options if you can’t replicate something or think it’s wrong? I can think of four (excluding doing nothing):

  1. Email the corresponding author. They don’t have an incentive to take it seriously. You are ignored.

  2. Email the journal editor. Again, unless they’re receiving a lot of emails, what incentive does the journal have to take it seriously? I suspect you’d be referred to the authors.

  3. Try and publish a rebuttal. Can you imagine the amount of work this would entail? Last time I checked, research proposals don’t get funded to disprove papers. This is only really a viable option if it’s something huge, e.g. arsenic life.

  4. Take to the Internet. Scientists, being irritatingly conservative, think you’re crazy. Potentially career damaging.

With these options, science is hardly self-correcting. I’d like to see a fifth: a proper mechanism for post-publication review. Somewhere it’s academically acceptable to ask questions and present counter results. I think discussion should be public (otherwise authors have little incentive to be involved) and comments signed (to discourage people from writing total nonsense). Publishers could easily integrate such a system into their web sites.

Do you think this would work? Would you use it? This does raise another question: should science try and be self-correcting at all?

Thanks to Adrian for bringing Mathias Brust’s article to my attention.

Open Access: Going for Gold?

Tonight the Science Communication Forum at Imperial College held a debate called Open Access: Going for Gold? with Stephen Curry (Imperial) and Mark Thorley (NERC, RCUK). The debate was chaired by Richard Van Noorden (Nature News).

Update 2 (28th September): you can listen to the debate on Figshare and here’s a useful link to RCUK’s open access policy (PDF).

Lots of things were discussed, but a couple of things in particular stuck in my mind as I write this on the way home.

RCUK require CC-BY for gold, but only CC-BY-NC for green

Under the new RCUK policy researchers must either pay a fee to publish in a gold open access journal or alternatively publish in a closed access journal and then deposit the article in a repository within 6 months [1].

Gold articles must be published with a CC-BY licence. This is good as it means anyone can do what they want with the work as long as the original authors are attributed. However, green articles deposited in a repository after the embargo period are only required to have a CC-BY-NC licence, meaning that you cannot use the work for commercial purposes.

This is very disappointing. Sadly it wasn’t discussed in the debate. CC-BY-NC is, as tweeted during the debate, a licence of fear. All it says is that the authors couldn’t think of a way to make money out of the work, so they’ll be damned if anyone else does. The work might as well have never happened.

Thorley talked about open access benefiting “UK PLC”, but CC-BY-NC is at complete odds with this. CC-BY-NC stifles innovation and progress. Furthermore, if the state funded the research, then the state and the rest of society should benefit from it. Under CC-BY-NC, no one benefits.

Green is of poorer quality than gold?

A couple of people doubted the quality of papers published straight to repositories like arXiv. I’m not so convinced. Firstly, they assume the reader is stupid and can’t work out for themselves if a paper is a load of nonsense. Secondly, they assume that peer review weeds out all the bad papers. It doesn’t. Someone suggested a kitemark to say that a particular paper in a repository is trustworthy. I hope I don’t have to explain why that’s an awful idea.

Thorley did at one point say something about gold papers being better for the lay person. Curry looked quite surprised. This is a completely different debate. Just because a paper is literally accessible to the public doesn’t mean the information contained within it is intelligible to the public. But if someone is interested enough to be reading papers, I don’t think gold/green will really make that much of a difference to them, certainly not enough to justify an APC. I wonder what percentage of papers even undergo any major revisions between submission and publication.

Concluding thoughts

CC-BY-NC for green is a real disaster. I sincerely hope RCUK revise their policy so that it’s the same as gold.

I still can’t make up my mind about green versus gold. On the one hand, I think everything should go straight into repositories like arXiv. Forget journals and use the money we save to help fund and develop repositories (although I know this is really very unlikely to ever happen). But on the other, if we are going to pay journals to publish work, we should expect more in return. Not just PDFs, but high quality (interactive?) documents including data and code in reusable formats and tools to help us do things like text mining. I can’t help but think there’s very little innovation in publishing, especially considering the size of their profit margins.

It’s clear a lot more will happen in the open access debate. As Thorley said, this isn’t an event, it’s a journey. Hopefully it won’t be too arduous.

Update: gold—a free market for innovation?

Having slept on it I can see where RCUK are coming from with their preference for gold, but I think they’re overestimating what publishers actually offer at the moment. Do most journals currently add enough value for it to be worth the APC? I’m not sure. I get the feeling people tend to think that every journal produces papers as beautiful as NPG. Authors will be paying the journal to publish, therefore we should expect more in return, especially considering the tidy profit margins. At present, I don’t think gold is that much better than green in that respect.

If, as Curry said, scientists end their addiction to impact factors (increasingly likely as HEFCE will be enforcing their ban on them), gold might lead to a more free market-like situation. Scientists will look around for journals that offer the best value for money. This could really drive innovation in scientific publishing as publishers are going to be competing in terms of what they can offer scientists rather than what the journal can do for an author’s career.

(Updated at 09:02 on 27th September 2012 with additional section.)

[1]: Thorley said that the embargo periods vary in length from publisher to publisher. He was pretty clear about 6 months and said 12 months was “a joke”. Personally I think 6 months is still far too long. It also raises the question: do publishers add so little value that it’s only worth 6 months? Why bother with it in the first place?

Negative results and dodgy papers: keep quiet or publish?

Negative results are very rarely published in the literature. After all, the literature is bursting with new positive results and we don’t have enough time to read all of these, let alone papers describing what doesn’t work. Negative results are dull—who would want to read anything in the Journal of Negative Results?

Up until recently I haven’t had a problem with the status quo. I’m afraid the following discussion is a bit vague because I’m (still) not sure how much detail I can go into about my work, but please bear with me.

I came across a paper published this year which describes the effect of doing something quite specific in a synthesis on nanoparticle shape. Do the thing, get a particular nanoparticle shape (usually quite challenging to obtain); stop doing the thing, you get another shape (easy to obtain). I was quite excited because if it worked it would get around a major barrier to my desired nanoparticles.

I repeated the reaction exactly as the paper described, but it didn’t work.

I repeated the reaction in a flow reactor as it would make it easy to intensify the “thing”. According to the paper, this should definitely give the desired nanoparticles because the morphology selectivity/yield is directly proportional to the intensity of the “thing”. But it still didn’t work.

I’ve now given up on the reaction and moved on to something else. But that my results will not be published means that someone else could also waste a lot of time and money—on equipment, reagents, electron microscopy—repeating the experiment.

What can I do? I think I have three options:

Option 1: Do nothing.

I’ve already made it clear that I don’t like this option. I’m fairly sure the paper is wrong. It bugs me that it exists without some kind of mark against it.

Option 2: Email the authors.

I’m not too keen on this either. I suspect that my email would be ignored. Plus, I would rather any discussion happened in the open, which brings me on to…

Option 3: Blog about it (and possibly email the authors telling them that I blogged about it).

I feel uneasy about this. Could it be perceived as confrontational? Would I get a reputation as a troublemaker? I feel like it is the proper, scientific and open thing to do, but in reality it is absolutely not the done thing. I suspect most researchers would go for option one and do nothing. I could be right and the paper is wrong, but I’d be very happy to be proven wrong and get the reaction working.

What do you think? Keep quiet, email or blog? Any other suggestions are welcome.

On the value of journal editors (and why green open access won’t work)

Previously I argued that traditional journals should be abandoned and green open access repositories like arXiv are the way forward. More recently I praised the “DIY” open access journal The Journal of Machine Learning Research run by researchers, writing that chemists should do something similar.

But now I think I’m wrong because I’ve underestimated the value of journal editors.

On Stephen Curry’s blog a commenter said:

“The current system of peer-reviewed journals is altogether very flawed. … [A]t the end of the day, the journals make millions just formatting, laying them out and sending a few emails. This just cannot be right.”

6 months ago I would have probably agreed. Anonymous Publishing Employee replied (it’s worth reading in full) saying that they are wrong because they underestimate the work a journal really does. Editors have to decide whether a paper fits in with their journal and is worth sending for review, obviously requiring technical knowledge. If it is worth sending they have to decide who to send it to, requiring personal knowledge of the scientists. A lot of administrative time is spent chasing up reviewers, but once the reviews are in the editor has to make a decision or repeat the review process again. If accepted, subsequent copy editing and layout takes time (money) and there are other indirect costs too, e.g. IT and rent. The main expense, they believe, is salaries (not that surprising).

I’ve said before that peer review would work in green OA repos, but now I think I was wrong. Editors have a lot of specialist knowledge that ensures the right people review papers. That knowledge is also needed to make the final decision to accept or reject a paper. I now doubt that a comparable level of peer review would happen in a repository. There’s no incentive for scientists to review post-publication. With a journal, there’s a certain amount of flattery involved when a scientist is asked to review by an editor. In effect, the editors drive the peer review process forwards, whereas it might never get started in a repo.

Furthermore, if we only had green OA repositories there would be another loss that I’ve never considered before: the commentaries, reviews, editorials and research highlights that complement the original research articles.

Screenshot from the current issue web page for the latest issue of Nature Chemistry

These are written or commissioned by editors. Recently I’ve really enjoyed the extra content in Nature Chemistry. An interview with Chief Editor Stuart Cantrill goes into more depth about the work behind the scenes. Lab on a Chip is another journal that I like to keep track of—obviously much more specialised than Nature Chemistry—and it has similar articles.

A complete shift to green OA would result in the loss of this valuable content. Websites or blogs might spring up to take its place, but I doubt they would be of the same calibre. It would be a real shame to lose it because it’s a great way to broaden one’s knowledge and stumble across interesting work.

Overall I think I was wrong about green OA repositories. Journal editors (rather than the “journal” in itself) are a valuable asset to the peer review process and scientific endeavour as a whole. More could still be done to enhance the transparency of peer review, but I think that open access publication simply won’t succeed with post-publication peer review in green repositories.

DIY open access for chemists

Never before have I had the urge to start writing a blog post on the tube on a Monday morning, but after reading a post about the do-it-yourself open access Journal of Machine Learning Research, I have now.

To summarise the post by Stuart Shieber (although I encourage you to read it for yourself): the Journal of Machine Learning Research formed after the entire editorial board of Machine Learning quit. Since October 2000, JMLR has peer-reviewed and published over 1000 open access articles, at no cost to authors. It’s successful and well-respected: it has the highest impact factor of all journals in its category on Web of Science.

How do they do it? Volunteers. The same volunteers that peer reviewed for Kluwer’s (now Springer’s) Machine Learning and helped the publisher to make huge profits.

There are some costs, of course. The domain name—about £10 a year for .org. Hosting is provided free by MIT, but you can get fairly decent hosting for about £10 per month. Their biggest cost was a tax accountant! So far it has cost them about $10 per article—a far cry from the thousands of dollars most publishers want to publish OA.

This makes me wonder what on earth publishers are doing (aside from profiteering) charging at least $1500 per OA article. The JMLR demonstrates that the whole argument of scholarly publishing necessarily being an expensive process is patently false. Publishers rely on academics for their entire publishing process. The Internet, together with the backing of a university like MIT (which would cost them far less than a typical journal subscription), provides academics with all they need to take control of their field.

Computer scientists have the advantage over chemists of being highly proficient computer users and hence find it a lot easier to sort out typesetting with LaTeX and the installation of one of many available open source journal publishing systems. But with a bit of assistance, most chemists could submit articles in LaTeX, which is really quite straightforward. Furthermore, you only need a couple of people to get the web site going and maintain it.

The whole DIY ethos of JMLR is brilliant. Academics put so much work into their research, give it all up to publishers and then pay to read their community’s work. It’s great to see scientists publish their work themselves.

I would love chemists to start a DIY OA journal, though I think chemists, compared with physicists (who have arXiv) and biologists and medics (who have PLoS), are a conservative bunch. I’m not sure why. Shieber wrote that computer scientists are used to volunteering; I don’t think chemists are. Nonetheless, I think it could happen, especially with the backing of a department or university (I think libraries are in a good position to help here). I think this might be the way scholarly publishing moves in the future. I’d be more than happy to help get a DIY OA chemistry journal going.

Questionable research practices, peer review and an open access future?

Blimey, it’s been five weeks since my last post and I’m now five weeks into my postgraduate studies. It’s gone quickly and I’ve been very busy.

As part of the doctoral training centre’s new/modern/[positive adjective] approach to a PhD we get (well, have) to take courses that ‘round us out’ as modern researchers. A few weeks ago, we had a course on research ethics taught by Marianne Talbot. I did Philosophy A-level and especially enjoyed moral philosophy, so I was looking forward to it.

The course was attended not just by PE DTC students but also by the CQD and TMS DTCs. Rather unsurprisingly (but disappointingly) there was a bit of an unfriendly vibe between the different DTCs. “We get MacBook Pros!” said one, “we don’t have to do experiments!” said another, to which we all replied “we get £18,000 to spend and we like lab experiments!” The conversation never progressed any further…

Overall the course was excellent and very enjoyable. I loved how Marianne dealt efficiently and firmly with the few people who wanted to deny the existence of everything! One of the afternoon sessions was on open access publishing, a topic I already had an interest in. I’ve read about it before but have never been entirely convinced (I’m not sure why). Marianne made a strong case that open access is good. She referenced this website as a good overview. If you don’t know what open access is, then it’s worth a quick read. There was unanimous support for the open access concept.

Marianne then introduced a distinction I had never heard of before: green and gold open access methods. In the green method, papers are deposited in a public online repository. Papers are not peer reviewed prior to being published and anyone can upload an article. The most famous example of this is probably arXiv. In the gold method, you submit a paper to a journal, it’s peer reviewed, and if accepted it’s published in a journal that is either entirely open access or permits some open access articles. An example of the former type is PLoS ONE.

The question Marianne asked us to discuss was “Do you think it is acceptable for scientists to self-archive pre-prints in repositories without peer-review?” The answers from students were quite vague. But generally it seemed that peer review was held in extremely high, almost reverent, regard.

I found this odd considering we had just been discussing questionable research practices. One example of a questionable research practice that stuck out to me was:

leaving important information out of methodology section of a manuscript or refusing to give peers reasonable access to unique research materials or data that support published papers.

One would expect that if peer review functioned as well as my fellow students said then readers would rarely come across this practice in the literature. Yet in my field of research, I encounter it all the time! Authors brag that they’ve found the way to make the biggest, smallest, longest or generally ‘best’ nanoparticle but then fail to tell you crucial information, such as the number of moles of reagents, reaction times and temperatures, that would allow you to repeat the work. I spent an unbelievable amount of time last year trying to figure out the required conditions to synthesise heterostructured quantum dots. If peer review did its job, then things like this wouldn’t get through.

Other students were arguing that because anyone can publish a paper in a green OA repository there is no quality control. I disagree. I think a lot of students are assuming that readers are idiots and need peer review. If you uncritically read a paper or think that because it’s in a journal it must be true then you’re at best naive or at worst incompetent. Decent researchers will spot questionable claims and results.

Is peer review even really that good a quality control method? Typically you only have two reviewers. Can you be sure they read the paper instead of giving it to a PhD student or postdoc?

Imagine that rather than submitting papers to traditional peer reviewed journals researchers published their work in open access green repositories. No real scientist is going to post rubbish because their reputation is on the line. Rather than having only two reviewers as with traditional journals, you could have tens or even hundreds of reviewers. They could post their comments—the peer review—publicly on the repository article page (I’m thinking more along the lines of threaded discussions rather than linear blog-style comments).

I think it would be awesome. The authors could respond to readers’ questions, for example, asking for clarification of an experimental technique or reagent used, or post new versions of the article correcting mistakes or providing further information.

At present, reviewers’ comments are made privately and anonymously. These comments would be useful to the scientific community. There’s no reason why they should stay private. Science is all about debate, questioning and (a moderate dose of) scepticism. At conferences and in department presentations, researchers handle criticism and questions. There’s no reason why journal articles should be any different.

I do wonder whether I’m being overly optimistic or if I’ve missed out something crucial. What do you think? I’d like to know…

Profiteering, Secretive Chemists and Open Access

Yesterday George Monbiot published a scathing piece in the Guardian about academic publishers, writing that they are the “most ruthless capitalists in the western world” and that “the racket they run is most urgently in need of referral to the competition authorities”.

I agree that journal pricing is absurd. Viewing a single article will cost you around $30–40. I’ve never understood how it can cost that much to publish and provide one-time access to a single article considering distribution is electronic and journals don’t pay for peer-review.

Libraries spend a large proportion of their budgets on journal subscription deals where they get access to thousands of journals, but are tied into 6% yearly price increases. One wonders why libraries agreed to such high yearly increases, well above the rate of inflation, in the first place. Imperial’s library spends £3.8 million (43% of its budget) on journal subscriptions every year. However, Deborah Shorley, Director of Imperial’s Library, isn’t going to let this continue: she is trying to get publishers to accept payments in sterling and reduce subscription fees by 15%.

My biggest gripe is that research funded by the taxpayer isn’t freely available to the public. I agree with Monbiot that all research funded by the taxpayer should be freely available to the public. It seems that private individuals all too often make vast profits from public investment.

Where’s the chemistry arXiv?

Nature recently published a piece about the pre-print server arXiv on its 20th anniversary. ArXiv seems like an excellent resource but chemistry has nothing like it. Why? Derek Lowe wrote today that he doesn’t know; nor do I. I think The Curious Wavefunction is on to something in that chemists are more secretive than physicists.

Perhaps it’s because cutting edge physics experiments are large and require lots of collaboration, unlike most chemistry research. A big development in chemistry could come from a small group working in a couple of fume hoods. They are much more easily beaten to publication by a competing group (and consequently lose out on any subsequent recognition) than physicists working on something like the LHC, so they are secretive until their work is published.

I hope that there will be a shift to open access but my feeling is there won’t. There’s no incentive for those in a position to bring about such a change to actually do so. Widely read, high impact journals are closed access and make lots of money from subscription fees, so there isn’t an incentive for publishers to switch to open access and charge authors to publish instead. Additionally, only well established researchers can afford to publish in a low impact open access journal rather than a high impact closed journal.