Posted on 19/04/2024
Dear Rahel,
Thanks for posting this blog about how far "we" have come on impact evaluation. Let me be terse with my answer: not much, if at all, and for the following three reasons:
- CGD's "When Will We Ever Learn" (WWWEL) is a throw back to Vedungs' first scientific wave of evaluation - Vedung, E. (2010) Four Waves of Evaluation Diffusion, Evaluation, Sage Publications, 16: 263 pp. 263-277. During the 1960s and even earlier, advanced evaluative thinking and practice was driven by a notion of scientification of public policy and public administration. It was argued this would make government more rational, scientific and grounded in facts. Its technocratic thrust sought to isolate public policy decisions from the messy, complex world we live in. Evaluation was to be performed by professional academic researchers (often masquerading as evaluators).Spitting roast for the labs and units you list, and many others. Towards the mid-1970s, confidence in experimental evaluation faded however. Voices started communicating how Evaluation should be more diverse and inclusive. Those other than academic researchers should be involved. Ring bells for today's debates on de-colonisation, localisation and Indigenous Evaluation?
2. CGD's self-serving basic thesis:
- "persistent shortcomings in our knowledge of the effects of social policies and programs reflect a gap in both the quantity and quality of impact evaluations.’
- the authors argued: "An 'evaluation gap' has emerged because governments, official donors, and other funders do not demand or produce enough impact evaluations and because those that are conducted are often methodologically flawed." They ascribe the evaluation gap to the public-good nature of impact measurement; and
- "that governments and development agencies are better at monitoring and process evaluations than at accountability or measuring impact"’ - this may be so but, monitoring, long neglected by the evaluation community, as practiced by most govts and dev agencies, is done far from well and is deliberately held down as routine reporting process (pers comm Michael Quinn Patton, April 2024).
James Morton, in his 2009 paper "Why We Will Never Learn", provides a wonderfully lettered critique of the above: "the Public Good concept is a favourite resort of academics making the case for public funding of their research. It has the politically useful characteristic of avoiding blame. No one is at fault for the 'evaluation gap' if evaluation is, by its very nature, something that will be underfunded. Comfortable as this is, there are immediate problems. For example, it is difficult to argue that accountability is a public good. Why does the funding agency concerned not have a direct, private-good interest in accountability?"
Having effectively sidelined monitoring and process evaluation, WWWEL goes on to focus almost entirely on measuring outcomes and impact, leaving the "monitoring gap" conveniently alone. And while avoiding any discussion of methodologies (randomised controlled trials, quasi-experimental double-difference and the like), many of the discussions WWWEL encouraged took on the abstruse, even semantic, character of the technical debates which dominate discussion about impact measurement.
3. Pawson and Tilley's exposé, in their masterful 1997 publication "Realistic Evaluation", of experimentalists and the RCT's intrinsic limits, chief among them its weak external validity and hence narrow range of use. They challenge the orthodox view of experimentation: constructing equivalent experimental and control groups, applying the intervention to the experimental group only, and comparing the changes that have taken place in the two groups as a method of finding out what effect the intervention has had. Their position throws into doubt experimental methods as a way of finding out which programmes do, and which do not, produce intended and unintended consequences; they maintain it is not a sound way of deriving sensible lessons for policy and practice.
In sum then, CGD's proposition of RCTs is, to cite Paul Krugman, like a cockroach policy: it was flushed away in the 1970s but returned forty years later with its significant limits intact; and CGD missed the most significant gap. From the above, one could get the impression that development aid has lost the capacity to learn: it suppresses, rather than takes heed of, lessons.
I hope the above is seen as a constructive contribution to the debate your blog provokes; and my seeming pessimism simply qualifies my optimism: only yesterday a book was launched on monitoring systems in Africa.
Best wishes and good luck,
Daniel
United Kingdom
Daniel Ticehurst
Monitoring > Evaluation Specialist
freelance
Posted on 23/08/2024
Just pitching in, like Silva, to congratulate Musta on making such a great point: the seemingly marginal value and high opportunity costs of EAs.
At the 2022 European Evaluation Society conference, the keynote by Estelle Raimondo and Peter Dahler-Larsen was striking. They rehearsed an interesting analysis of the indiscriminate application of, and diminishing returns to, evaluation practice of late through its "performative" use: bureaucratic capture.
Some argue EAs are the least of today’s evaluation community’s concerns.
The keynote's reference to how "...sometimes, agencies can reduce reputational risk and draw legitimacy from having an evaluation system rather than from using it" reminds me of the analogy the famous classicist and poet A. E. Housman made in 1903:
"...gentlemen who use manuscripts as drunkards use lamp-posts,—not to light them on their way but to dissimulate their instability.”
United Kingdom
Daniel Ticehurst
Monitoring > Evaluation Specialist
freelance
Posted on 06/08/2024
Dear Amy,
Thanks for taking time to read through and reply.
My apologies, but let me be terse and honest...
Many thanks for explaining what Mr Scriven wrote. I now understand. That said, I remain none the wiser as to the import and significance of what he wrote: motherhood and apple pie, clever but a bit thin. 😉
On EAs themselves, and as I alluded to, the purpose and scope of an EA's inquiry appear to be part and parcel of what most people would refer to as a competent ex ante evaluation or appraisal. As you say, it is great to have an evaluator or two on the team, yet... how could you not look at the "evaluability" of the investment by appraising it and the evidential reasoning that informs its rationale and design factors, including a ToC and/or a results framework? Or are we saying that an appraisal which finds an investment not worthwhile can nonetheless judge it evaluable or, indeed, vice versa (assuming the EA is conducted after the appraisal)?
Thus, and as Hadera points out, the incremental value generated by carrying out a discrete (some would say contrived) EA solely by evaluators appears marginal at best; it potentially fragments the team and comes across as rather extravagant and indulgent.
Many thanks for the post and the discussions that have prompted enquiry, debate, skepticism and doubt about EAs.
With best wishes,
Daniel
United Kingdom
Daniel Ticehurst
Monitoring > Evaluation Specialist
freelance
Posted on 31/07/2024
Dear Amy,
Thanks for posting. I remember well reading Rick D's Synthesis of the Literature back in 2013. I had four observations:
Finally, some help: you quoted Michael Scriven as saying that "evaluability is analogous to requiring serviceability in a new car and may be thought of as 'the first commandment in accountability'". I know this must be a significant saying, but I don't understand it or its importance. What do you think he means?
Best wishes and thanks again,
Daniel