In Reply: The letters mostly present a classical view of rigor in evaluation without addressing my central thesis: that the power of any method of evaluation varies with the object of evaluation. In suitable contexts, RCTs are unmatched; in other contexts, however, using an RCT is like searching for oil with a metal detector. It is not a bad tool, just usually the wrong one for the job when the objects of study involve complex social systems.
Dr Durieux interestingly points out that the same may sometimes be true in studying biological responses. That this may be the case for rapid response teams is suggested by the wide gap between, on one hand, some studies (such as the MERIT trial) and, on the other, many systematic local observations, some published reports, and common sense. Criteria recently proposed for publishing reports of improvement work have not yet been widely adopted.1 Given customary publication norms, which make it very difficult to get even carefully documented local reports and qualitative research studies into print, the inconclusiveness of systematic reviews such as those noted by Drs Fan and Needham is no surprise. After all, reviews use published papers as their inputs. In their masterful book, Pawson and Tilley call such findings the "ironic anticlimax" of many forms of classical evaluation of social programs.2
The Science of Quality Improvement—Reply. JAMA. 2008;300(4):390-392. doi:10.1001/jama.300.4.391-c