From time to time examples of scientific fraud come to light and raise questions about the integrity of scientific endeavour. The best-known example of recent years must surely be South Korean stem cell biologist Hwang Woo-Suk, whose ground-breaking discoveries in the field of therapeutic cloning were exposed as bogus. (In addition to seeing his scientific reputation left in tatters, Hwang was convicted in October 2009 of embezzlement and violation of bioethical laws, although he escaped a custodial sentence.)
In physics, the repeated re-use of the same graphs as data for entirely different experiments led to the downfall of a leading young nanoscientist, Jan Hendrik Schön (this was the subject of a 2004 episode of the BBC’s Horizon series, “The dark secret of Hendrik Schön”). Are Hwang and Schön rare examples bringing unwarranted criticism to a body of otherwise exemplary scientists, or are their crimes indicative of much wider malpractice within the scientific community?
University of Edinburgh researcher Daniele Fanelli has shed some light on the extent of scientific fraud in an article entitled “How many scientists fabricate and falsify research? A systematic review and meta-analysis of survey data”. Published in the open access journal PLoS ONE in May 2009, the research brought together data from a number of earlier, smaller studies on scientific misconduct to generate “the first meta-analysis of these surveys” (p1).
Fanelli was interested in examining the rates of self-reporting of scientific misconduct and knowledge about the misconduct of colleagues. Recognising that “any boundary defining misconduct will be arbitrary” (p9), he limited discussion to incidents where there was clear “intention to deceive” (p1, p9) rather than generation of incorrect results as a consequence of shoddy experimental design and/or accidental misinterpretation of the data. For the purposes of this study, Fanelli also excluded plagiarism and other examples of “questionable research practices” (QRPs; such as failure to include a contributor amongst the list of authors for a paper) from his definition of scientific misconduct, which was instead limited to fabrication and falsification. The grounds for this decision seem valid; whereas fabrication (the invention of data) and falsification (the wilful distortion of results) change the actual body of scientific knowledge, these other unprofessional activities lead instead to changes in the distribution of credit for the work, the substance of which remains unaltered.
To identify the previous studies of misconduct, Fanelli conducted a search of citation databases, scientific journals, “grey literature” databases and internet search engines using the terms “research misconduct”, “research integrity”, “research malpractice”, “scientific fraud”, “fabrication, falsification” and “falsification, fabrication”. An initial search generated 3276 potentially relevant studies. The vast majority (3207) were easily excluded because they were not surveys of research misconduct.
The author then applied very strict criteria to limit the meta-analysis to genuinely appropriate studies. For example, papers were excluded if there were no quantitative data, if the data included no clear category of never/none/nobody (e.g. if only mean values were shown), if the sample had been generated in a non-random manner, or if undergraduates and/or other non-researchers were included in a manner that did not permit their removal from the dataset. Having done so, the initial pool of potential papers was whittled right down to 18 suitable studies.
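The selection process described above can be tallied in a few lines. This is only an illustrative sketch using the three counts quoted in the text (3276, 3207 and 18); the function and key names are hypothetical, and it assumes that the 18 retained studies were all drawn from the records left after the first round of exclusions.

```python
def screening_funnel(initial_hits, excluded_first_pass, retained):
    """Summarise a two-stage literature screen from three counts.

    Assumption (not stated in the source): every retained study comes
    from the pool that survived the first pass.
    """
    remaining = initial_hits - excluded_first_pass
    excluded_second_pass = remaining - retained
    return {
        "remaining_after_first_pass": remaining,
        "excluded_by_strict_criteria": excluded_second_pass,
        "retained": retained,
        # Share of the initial search results that reached the meta-analysis.
        "retained_share_of_initial": retained / initial_hits,
    }


# Counts quoted in the article text.
funnel = screening_funnel(initial_hits=3276, excluded_first_pass=3207, retained=18)

for stage, value in funnel.items():
    print(f"{stage}: {value}")
```

On these figures, 69 records survived the first pass and the stricter criteria removed a further 51, so roughly half a percent of the initial search results reached the meta-analysis.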
Quantifying research malpractice
What were the conclusions of Fanelli’s analysis? The main issues addressed were the proportion of respondents admitting to misconduct or questionable practices of their own on at least one occasion, and the proportion reporting knowledge of similar behaviour by colleagues.
In the various studies reviewed, between 0.3% and 4.9% of respondents confirmed that they had modified results to improve the outcomes. This led to an average of about 2% self-reporting of misconduct (although it was nearer 1% if the responses were limited to those that specifically mentioned ‘falsification’ or ‘fabrication’).
A rather larger number, 9.5%, were willing to admit that they had carried out broader questionable practices. Again, however, the phrasing was important, with more respondents willing to say they had “modified research results” than to admit that they had reported results that they “knew to be untrue”. This may fit with an underlying assumption that it is okay to omit data that you “know” are outliers or otherwise “wrong”. As Fanelli puts it, “many did not think that the data they ‘improved’ were falsified” (p9).
When asked about the actions of others, a crude average of around 16.7% of scientists (range 5.2% to 33.3%; Fanelli elects to report this statistic as “up to 34%” (p10)) said they had personal knowledge that a colleague had fabricated or falsified data on at least one occasion. A much wider range (6.2% to 72%; crude mean 28.5%) said that they were aware of peers who had indulged in QRPs.
So, were Hwang and Schön isolated miscreants or does their identification mark the tip of an iceberg of scientific misconduct? The truth seems to lie somewhere in between. As Fanelli notes, the usual rules of self-reporting bias, in which some people (typically older women) under-report criminal behaviour whereas others (typically younger males) over-report such activity, do not apply here. It is highly unlikely that anyone in a community where trust is taken seriously will over-report their own wrong actions. It is likely, therefore, that the calculated values of self-reported malpractice are underestimates.
The data regarding knowledge of other researchers’ actions are harder to validate. It is theoretically possible, for example, that more than one respondent might be describing the wrongdoings of the same colleague, which would inflate the figures. Conversely, the criterion was knowledge of malpractice on “at least one occasion”, and therefore the data may not take into account serial offences.
Other approaches to measuring misconduct, as reviewed by Fanelli, have generated a range of figures which might be seen as very broadly equivalent. For example, about 0.02% of papers are retracted from PubMed due to misconduct (Claxton, 2005), 1% of papers submitted to the Journal of Cell Biology were found to have been inappropriately manipulated (Steneck, 2006), and 2% of clinical researchers were found guilty of serious scientific misconduct in routine US Food and Drug Administration audits (Glick, 1992).
Whatever the accuracy of these numbers, however, it remains true that the vast majority of science is carried out in a spirit of accuracy and integrity.