

Does Foreign Aid Work? Efforts to Evaluate
U.S. Foreign Assistance
Marian Leonardo Lawson
Analyst in Foreign Assistance
June 23, 2016
Congressional Research Service
7-5700
www.crs.gov
R42827
Summary
In most cases, the success or failure of U.S. foreign aid programs is not entirely clear, in part
because historically, most aid programs have not been evaluated for the purpose of determining
their actual impact. Many programs are not even evaluated on basic performance. The purpose
and methodologies of foreign aid evaluation have varied over the decades, responding to political
and fiscal circumstances. Aid evaluation practices and policies have variously focused on meeting
program management needs, building institutional learning, accounting for resources, informing
policymakers, and building local oversight and project design capacity. Challenges to meaningful
aid evaluation have varied as well, but several are recurring. Persistent challenges to effective
evaluation include unclear aid objectives, funding and personnel constraints, emphasis on
accountability for funds, methodological challenges, compressed timelines, country ownership
and donor coordination commitments, security, and agency and personnel incentives. As a result
of these challenges, aid agencies do not undertake evaluation of all foreign aid activities, and
evaluations, when carried out, may differ considerably in quality.
The Obama Administration has taken several steps to enhance foreign assistance evaluation.
The 2010 Quadrennial Diplomacy and Development Review (QDDR) resulted in,
among other things, a stated commitment to plan foreign aid budgets “based not
on dollars spent, but on outcomes achieved.”
USAID introduced a new evaluation policy in January 2011.
The State Department, which began to manage a growing portion of foreign
assistance in the 21st century, introduced a new evaluation policy in February
2012, which was updated in January 2015.
The Millennium Challenge Corporation revised its evaluation policy in 2012, and
soon after began releasing its first evaluation reports.
The agency evaluation policies differ in several respects, including their support for impact
evaluation, but reflect a common emphasis on evaluation planning as a part of initial program
design, transparency and accessibility of evaluation findings, and the application of data to inform
future project design and policy decisions. Aspects of the three evaluation policies are compared
in the Appendix.
Recent reports and policy reviews suggest that aid evaluation frequency and quality have
improved in recent years, though progress has been uneven. Attention to this issue remains
strong, both within the Administration and among Members of Congress. The 2015 QDDR
reemphasizes the role of evaluation, calling for more evaluation training, more strategic use of
data, and more timely analysis of lessons learned, among other things. Though recent evaluation
reform efforts have been agency-driven, Congress has considerable influence over their impact.
Legislators may mandate a particular approach to evaluation directly through legislation (e.g.,
H.R. 3766 and S. 2184 in the 114th Congress), or may support or fail to support Administration
policies by controlling the appropriations necessary to implement the policies. Furthermore,
Congress will largely determine how, or if, any actionable information resulting from the new
approach to evaluations will influence the nation’s foreign assistance policy priorities.
Contents
Introduction
Why Evaluation?
Impact and Performance Evaluations
History of U.S. Foreign Assistance Evaluation
Evaluation Challenges
Applying Evaluation Findings to Policy
Current Agency Evaluation Policies
Issues for Congress
Conclusion
Appendixes
Appendix. Select Aspects of Current USAID, State Department, and MCC Evaluation Policies
Contacts
Author Contact Information
Introduction
In considering budget issues, Congress has long been interested in the relative efficiency and
effectiveness of federal programs, including foreign assistance. Foreign assistance evaluation is
one aspect of a government-wide effort to link program effectiveness to budgeting decisions. It is
also an element of broader foreign aid reforms implemented in recent years. The 2010
Quadrennial Diplomacy and Development Review (QDDR), the basis of many aid policy
initiatives, called for the State Department and the U.S. Agency for International Development
(USAID) to plan foreign aid budgets and programs “based not on dollars spent, but on outcomes
achieved,” and for USAID to become “the world leader in monitoring and evaluation.”1 The 2015
QDDR continued this emphasis on evaluation, stressing the strategic use of data and the need
to build agency evaluation capacity.2 Rigorous evaluation is also a cornerstone of the Millennium
Challenge Corporation (MCC), established in 2004 to promote a new model of development
assistance.3 According to former USAID Administrator Rajiv Shah, global development policies
and practices are experiencing a “transformation based on absolute demand for results.”4 That
demand comes, in part, from some Members of Congress as they scrutinize the Administration’s
international affairs budget request and consider foreign aid spending priorities.5 It also comes
from aid beneficiaries and American taxpayers who want to know what impact, if any, foreign aid
dollars are having and whether foreign aid programs are achieving their intended objectives.
The current emphasis on evaluation is not new. The importance, purpose and methodologies of
foreign aid evaluation have varied over the decades since USAID was established in 1961,
responding to political and fiscal circumstances, as well as evolving development theories. There
are a number of reasons that this issue has again gained prominence in recent years. For one,
foreign aid funding levels increased significantly in the first decade of the 21st century, while
evaluations decreased, raising questions about the knowledge basis for aid policy.6 Analysts have
noted that after decades of aid agencies spending billions of dollars on assistance programs, very
little is known about the impact of these programs.7 Some wonder how policymakers can develop
effective foreign aid strategies without a clear understanding of how and why prior assistance has
succeeded or failed.
This report focuses primarily on U.S. bilateral assistance, not on the work of multilateral aid
entities, such as the World Bank, to which the United States contributes. While a wide range of
1 U.S. Department of State, Quadrennial Diplomacy and Development Review, 2010, Leading Through Civilian Power,
p. 103.
2 Enduring Leadership in a Dynamic World, the Quadrennial Diplomacy and Development Review, 2015, p. 13.
3 For more information about the MCC model, see CRS Report RL32427, Millennium Challenge Corporation, by Curt
Tarnoff.
4 Statement of USAID Administrator Rajiv Shah to The Cable, as reported in The Cable, June 13, 2012.
5 While not often discussing evaluation policy per se, some Members appear to be influenced in their policy decisions
by their sense of what aid is working and what is not. For example, when introducing her subcommittee’s FY2013
proposal at full-committee mark-up on May 17, 2012, House State-Foreign Operations Appropriations Subcommittee
Chairwoman Kay Granger remarked that the legislation “only supports programs that work.” Senator Lindsey Graham
of the Senate State-Foreign Operations Appropriations Subcommittee, explaining the sharp reduction in aid for Iraq in
the Senate’s FY2013 proposal at a May 22, 2012, mark-up, said “there’s no point in throwing good money after bad.”
6 For historic information on foreign aid spending, see CRS Report R40213, Foreign Aid: An Introduction to U.S.
Programs and Policy, by Curt Tarnoff and Marian L. Lawson.
7 When Will We Ever Learn?: Improving Lives Through Impact Evaluation, Report of the Evaluation Gap Working
Group, Center for Global Development, May 2006, p. 1.
federal agencies provide foreign assistance in some form,8 this report focuses on the three
agencies that have primary policy authority and implementation responsibility for U.S. foreign
assistance—USAID, the State Department, and the Millennium Challenge Corporation (MCC). It
discusses past efforts to improve aid evaluation, as well as ongoing issues that make evaluation
challenging in the foreign assistance context. The report also provides an overview of the current
evaluation policies of the primary implementing agencies, and discusses related issues for
Congress, including recent legislation.
Program Evaluation Government-Wide
Program evaluation is an important issue throughout the U.S. government, and foreign assistance evaluation is just
one part of a broader effort by the federal government to improve accountability and program performance through
stronger evaluation processes. With the Government Performance and Results Act (GPRA) of 1993, Congress
established unprecedented statutory requirements regarding the establishment of goals, performance measurement
indicators, and submission of related plans and reports to Congress for its potential use in policy development and
program oversight. The GPRA Modernization Act of 2010 updated the original law, requiring more frequent plan
updates and on-line posting of data.9 State Department and USAID strategic planning and assessment documents
required by GPRA are available at Performance.gov. The agency-specific evaluation plans discussed in this report are
intended to comply with and build upon this government-wide effort.
Why Evaluation?
To know whether aid is successful, one must understand its purpose. The Foreign Assistance Act
(FAA) of 1961 (P.L. 87-195), as amended, is the authorizing legislation for most modern foreign
aid programs. The FAA declared that
the principal objective of the foreign policy of the United States is the encouragement
and sustained support of the people of developing countries in their efforts to acquire the
knowledge and resources essential to development, and to build the economic, political,
and social institutions that will improve the quality of their lives.10
The original legislation lists five principal goals for foreign aid: (1) the alleviation of the worst
physical manifestations of poverty among the world’s poor majority; (2) the promotion of
conditions enabling developing countries to achieve self-sustaining economic growth and
equitable distribution of benefits; (3) the encouragement of development processes in which
individual civil and economic rights are respected and enhanced; (4) the integration of the
developing countries into an open and equitable international economic system; and (5) the
promotion of good governance through combating corruption and improving transparency and
accountability.11 Amending legislation over the years added dozens of new, though often
overlapping, aid objectives. For example, “the suppression of the illicit manufacturing of and
trafficking in narcotic and psychotropic drugs” was added in 1971,12 “to alleviate human suffering
caused by natural and manmade disasters” was added in 1975,13 and “to enhance the antiterrorism
skills of friendly countries by providing training and equipment” and “to strengthen the bilateral
8 According to ForeignAssistance.gov, 22 U.S. government agencies reported obligating foreign assistance in FY2015.
9 For more on current GPRA requirements, see CRS Report R42379, Changes to the Government Performance and
Results Act (GPRA): Overview of the New Framework of Products and Processes, by Clinton T. Brass.
10 Foreign Assistance Act of 1961 (P.L. 87-195), §101(a).
11 Ibid.
12 FAA, as amended, §481(a)(1)(C).
13 FAA, as amended, §491(a).
ties of the United States with friendly governments by offering concrete [antiterrorism]
assistance”14 were added in 1983. In short, U.S. foreign aid is intended to be a tool for fighting
poverty, enhancing bilateral relationships, and/or protecting U.S. security and commercial
interests.
Taken broadly, some specific development assistance projects and programs are
widely regarded as successful. The largest aid program of the last century, the Marshall Plan (1948-
1952), for example, is acclaimed as a key factor in the post-World War II reconstruction of
European states that have gone on to become major strategic and trade partners of the United
States. In the late 1960s and 1970s, aid associated with the “green revolution” was credited with
greatly improving agricultural productivity and addressing hunger and malnutrition in parts of
Asia, and global health programs were credited with virtually eradicating smallpox. Korea,
Taiwan, and Botswana are often cited as aid success stories as a result of remarkable economic
progress following significant aid infusions. More recently, unquestionable progress in battling
public health crises, such as HIV/AIDS, across the globe can be largely attributed to massive
foreign assistance programs, both bilateral and multilateral. Recent studies have also shown a
positive but modest impact of aid on economic growth rates.15 Even in these instances, however,
close analysis often reveals many caveats.
In other specific instances foreign aid programs and projects have been considered to be
conspicuously unsuccessful, or even harmful to intended beneficiaries. Critics of foreign
assistance cite decades of aid to corrupt governments in Africa, which enriched corrupt leaders
and did little to improve the lives of the poor.16 In Latin America, U.S. aid to anti-communist
rebels and regimes during the Cold War was associated with brutal violence and believed by
many to have damaged U.S. credibility as a champion of democracy. Numerous examples exist of
hospitals, schools, and other facilities that were built with donor funds and left to rot, unused, in
developing countries that did not have the resources or will to maintain them. In some instances,
critics assert that foreign aid may do more harm than good, by reducing recipient government
accountability, fueling corruption, damaging export competitiveness, creating dependence, and
undermining incentives for adequate taxation.17
The most notable successes and conspicuous failures of foreign aid give fodder to both aid
advocates and detractors, but in all likelihood represent just a small segment of assistance
activities. In most cases, clear evidence of the success or failure of U.S. assistance programs is
lacking, both at the program level and in aggregate. One reason for this is that aid provided for
development objectives is often conflated with aid provided for political and security purposes.
Another reason is that historically, most foreign assistance programs have never been evaluated for the
purpose of determining their impact, either at the time of implementation or retrospectively.
Furthermore, evaluation practices are not consistent enough to allow for the use of project-level
data as the basis for broader, strategic evaluations. A 2009 review of monitoring and evaluation of
U.S. foreign assistance described the evaluation effort at that time as “uneven across agencies,
rarely assesses impact, lacks sufficient rigor, and does not produce the necessary analysis to
14 FAA, as amended, §572 (1) and (2).
15 “The $138.5 Billion Question: When Does Aid Work (And When Doesn’t It)?,” Center for Global Development
Policy Paper 049, Sect. 3.1.
16 Several examples of this are discussed in, Economic Gangsters: Corruption, Violence and the Poverty of Nations, by
Raymond Fisman and Edward Miguel, Princeton University Press, 2008.
17 See Dambisa Moyo, Dead Aid: Why Aid is Not Working and How There Is a Better Way for Africa, Farrar, Straus
and Giroux, New York, 2009, p. 48.
inform strategic decision making.”18 In recent years, however, aid-implementing agencies have
taken steps to improve both the quantity and quality of aid evaluations, and to make better use of
the information gleaned from those efforts. A 2016 USAID review identified notable
improvements in evaluation practices at USAID since implementation of a new evaluation policy
in 2011.19
Impact and Performance Evaluations
The Department of State, USAID, and other U.S. agencies implementing foreign assistance
programs have long evaluated the performance of their own personnel and contractors in meeting
discrete objectives. Depending on the nature of the project or program, staff and contractors
might monitor the miles of road built, number of police officers trained, or changes in the use of
fertilizers by farmers. These results can be compared to the initial program goals and expectations
to determine whether the project or contract has been performed successfully. This type of
oversight is called performance monitoring, and if the resulting data are analyzed in an effort to
explain how and why a program meets or fails to meet strategic objectives, this is called
performance evaluation. Performance monitoring and evaluation are widely viewed as essential
aspects of oversight, and performance evaluations represent the vast majority of foreign aid
evaluations. Financial audits by agency Inspectors General, which examine whether funds are
being used as intended, are also a common form of evaluation, particularly at the State
Department. These audits are in addition to regular financial audits required by agencies of
contractors, aid-implementing partners, and host government entities.
Performance evaluation and financial audits play an important part in project management but do
little to answer questions about foreign aid effectiveness. Addressing these questions, some argue,
requires impact evaluations. Impact evaluations can take many forms, but their common element
is that they use a defined counterfactual, or control group, and baseline data to measure change
that can be attributed to an aid intervention.20 Impact evaluations look not at the output of an
activity, but rather at its impact on a development objective. For example, while a performance
evaluation of an education program may look at the number of textbooks provided and teachers
trained, an impact evaluation may determine whether, and by how much, literacy or math skills improved for
the target group as compared to a similar group that did not receive the textbooks or teacher
training. A performance evaluation of an HIV prevention project may report the number of public
awareness events held or condoms distributed, while an impact evaluation of the same program
would monitor changes in the HIV/AIDS infection rate of the targeted population relative to a
control group. An impact evaluation of a police training program would look at the program’s
impact on civil order and public safety rather than simply report how many officers were trained
or the value of equipment supplied. Randomized controlled trials, in which beneficiaries are
randomly selected from a prequalified group and compared before and after the program to those
18 Beyond Success Stories: Monitoring and Evaluation For Foreign Assistance Results, Evaluator Views of Current
Practice and Recommendations for Change, by Richard Blue, Cynthia Clapp-Wincek and Holly Benner, May 2009, p.
ii.
19 “Strengthening Evidence Based Development: Five Years of Better Evaluation Practices at USAID, 2011-2016,”
available at https://www.usaid.gov/sites/default/files/documents/1870/Strengthening%20Evidence-Based%20Development%20-%20Five%20Years%20of%20Better%20Evaluation%20Practice%20at%20USAID.pdf.
20 For a thorough, yet non-technical, discussion of the use of impact/attribution evaluation, see “An introduction to the
use of randomized control trials to evaluate development interventions,” by Howard White, International Initiative for
Impact Evaluation, Working Paper 9, February 2011.
not selected, are widely viewed as best practice for impact evaluation, but less rigorous methods
are used as well.
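The difference between reporting outputs and attributing change can be illustrated with a short sketch. It assumes hypothetical test scores, invented purely for illustration, for a group that received a program and a comparison group that did not, each measured at baseline and at follow-up. The function name and data are assumptions, and difference-in-differences is only one simple estimator among the methods an impact evaluation might use; it is not drawn from any agency's evaluation policy.

```python
from statistics import mean


def difference_in_differences(treated_before, treated_after,
                              control_before, control_after):
    """Change in the treated group minus change in the comparison group."""
    treated_change = mean(treated_after) - mean(treated_before)
    control_change = mean(control_after) - mean(control_before)
    return treated_change - control_change


# Hypothetical literacy test scores (0-100), invented for illustration only.
treated_before = [40, 45, 50, 55]
treated_after = [55, 60, 65, 70]
control_before = [42, 44, 52, 54]
control_after = [47, 49, 57, 59]

# A simple before/after comparison of the treated group alone.
naive_change = mean(treated_after) - mean(treated_before)

# The estimate that nets out the change the comparison group saw anyway.
estimated_impact = difference_in_differences(
    treated_before, treated_after, control_before, control_after)

print(naive_change, estimated_impact)
```

In this invented example the treated group's scores rise by 15 points, but the comparison group's scores rise by 5 points without the program, so only 10 points would be attributed to the intervention. Without baseline data and a comparison group, that distinction cannot be drawn.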
Impact evaluations can be key to determining whether a foreign assistance program “works.”
However, impact evaluations are generally far more complex and resource-intensive than
performance evaluations, and usually must be planned before an activity begins. Agencies
implementing foreign assistance must balance the potential knowledge to be gained from impact
evaluation with the additional resources necessary to carry out such evaluations. As a result,
while the potential learning benefits of impact evaluation have long been recognized by aid
officials, the use of rigorous impact evaluation has been, and continues to be, very limited. More
typically, agencies aim for evaluation practices that are, as one expert has put it, “cost-effectively
rigorous,” and, at minimum, “independent, transparent, and consistent, thus persuasive.”21
Primary School Deworming in Kenya (1997-2001)22
One well-known example of an impact evaluation that yielded useful information looked at a World Bank-supported
project in Kenya that treated children for intestinal worms, a prevalent affliction that results in listlessness,
diarrhea, abdominal pain, and anemia. The stated development objective was to increase the number of children
completing their primary education. In collaboration with the local health ministry, NGO implementers treated
30,000 children in 75 schools with a drug that cost $3.27 annually per child, using baseline data and a random
phase-in approach that allowed for a controlled comparison. The evaluation found that the deworming resulted in a
25% reduction in absenteeism, or 10-15 more days of school attendance per child per year. This case is also an
example of the value of consistent methodology and the use of sector- or region-wide evaluation that looks at
results beyond the project level. Similar evaluation methods were used for other interventions (providing free
uniforms, textbooks, and/or meals) with the same goal and in the same region, allowing evaluators to do a
comparative analysis and determine that the deworming intervention was the most effective of these interventions
in increasing school participation.23
History of U.S. Foreign Assistance Evaluation
The practice of foreign assistance evaluation has changed over time to reflect evolving, or some might say
cyclical, attitudes about the purpose and relative importance of evaluation.24 This is evident both in the United
States and internationally. Aid evaluation practices and policies have variously focused on different evaluation
objectives, including meeting program management needs, institutional learning, accountability for resources,
informing policymakers, and building local oversight and project design capacity.
The history of U.S. foreign assistance evaluation begins with USAID, which implemented the vast majority of U.S.
foreign assistance prior to the last decade. In its early years, USAID was primarily involved in large
capital and infrastructure projects, for which evaluations focused on financial and economic rates
of return were appropriate. However, the agency soon shifted focus towards smaller and more
diverse projects to address basic human needs, and found that the rate of return evaluation model
was no longer sufficient.25 The agency established its first Office of Evaluation in 1968, and used
21 Clemens, Michael. “Impact Evaluation in Aid: What For? How Rigorous?” Presentation at the Overseas
Development Institute, July 3, 2012, video recording available at http://www.cgdev.org/content/multimedia/detail/1426372/.
22 For an overview of this evaluation, as well as links to related studies, see
http://www.povertyactionlab.org/evaluation/primary-school-deworming-kenya.
23 Roetman, Eric. A Can of Worms? Implications of Rigorous Impact Evaluations for Development Agencies,
International Initiative for Impact Evaluations, Working Paper 11, March 2011, p. 5.
24 Trends in Development Evaluation Theory, Policies and Practices, USAID, 17 August 2009, p. 4.
25 The USAID Evaluation System: Past Performance and Future Direction, Bureau for Program and Policy
(continued...)
a Logical Framework (LogFrame) model as its primary system for monitoring and evaluation.26
The LogFrame approach, subsequently adopted by many international development agencies,
employed a matrix to identify project goals, purposes, results, and activities, with corresponding
indicators, verification methods, and important assumptions. Baseline data were to be used for
each indicator, and results were reported at quarterly points during the life of a project. However,
these data were not analyzed to look for competing explanations of the results or unintended
consequences of activities. In many respects, the LogFrame approach was quite similar to the
current GPRA requirements (discussed in the “Program Evaluation Government-Wide” text box
above).
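The structure of such a matrix can be sketched as a simple data type. The sketch below is illustrative only: the class name, field names, and sample row are assumptions and are not taken from USAID guidance. It records one row of a LogFrame-style matrix (level, indicator, baseline, verification method, assumptions) and the values reported each quarter.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# The four levels named in the text: goals, purposes, results, activities.
LEVELS = ("goal", "purpose", "results", "activities")


@dataclass
class LogFrameRow:
    """One row of a LogFrame-style matrix (hypothetical field names)."""
    level: str
    summary: str
    indicator: str
    baseline: float
    verification_method: str
    assumptions: str
    quarterly_values: List[float] = field(default_factory=list)

    def __post_init__(self):
        if self.level not in LEVELS:
            raise ValueError(f"unknown level: {self.level}")

    def report_quarter(self, value: float) -> None:
        """Record the indicator value observed at a quarterly point."""
        self.quarterly_values.append(value)

    def change_from_baseline(self) -> Optional[float]:
        """Latest reported value minus the baseline, or None if unreported."""
        if not self.quarterly_values:
            return None
        return self.quarterly_values[-1] - self.baseline


# Hypothetical row, invented for illustration only.
row = LogFrameRow(
    level="activities",
    summary="Train primary school teachers",
    indicator="Number of teachers trained",
    baseline=0,
    verification_method="Training attendance records",
    assumptions="Trained teachers remain in project schools",
)
row.report_quarter(40)
row.report_quarter(95)
print(row.change_from_baseline())
```

A record of this kind tracks progress against an indicator but, as the text notes, says nothing by itself about competing explanations for the change or about unintended consequences.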
Testing Family Planning Project Design in Thailand, 1979
Many evaluations are designed to answer specific questions about project design. One example is the Family
Planning Health and Hygiene Project, a 1979 independent evaluation of USAID support for the government of
Thailand’s family planning policy. Implemented by the American Public Health Association, the evaluation used a
baseline survey and experimental design to test the hypothesis that contraception services would be more
cost-effective and acceptable to communities if combined with basic health services rather than implemented in
isolation. Obtaining the appropriate information to inform resource allocation was a primary objective of the
evaluation. According to the report, “the evaluation was implemented with sufficient precision and adherence to
experimental requirements to provide information on which to make management decisions about the best use of
resources.” Evaluators found that the hypothesis was not supported by the evidence. Adding basic health services
doubled the cost of programs but was not associated with increased contraceptive use. As a result, the evaluators
recommended that future decisions about family planning and basic health services programs be considered without
any assumption that a linkage between the two would increase the acceptance of contraception use.27
While the LogFrame approach established USAID as a thought leader with respect to evaluation
policy, in practice, evaluation quality varied significantly from project to project. A 1970
evaluation handbook included a diagram of the “ideal” program evaluation design, which resembles a
randomized controlled trial, but noted that “there are a great many reasons why it may not be
possible to reach the ideal.”28 Reviews of foreign assistance evaluation over decades revealed
shortcomings. For one, the system had become decentralized over time, suitable for meeting the
information needs of project managers in the field but not for contributing to broader learning or
policy making. A 1982 report by the General Accounting Office (now the Government Accountability
Office, GAO) found that “AID staff does not apply lessons learned in the development of new
projects,” and that “lessons learned are neither systematically nor comprehensively identified or
recorded by those who are directly involved.”29 In response to the GAO report’s recommendation that
USAID build an “information analysis capability,” the agency created the Center for Development
Information and Evaluation
(...continued)
Coordination, USAID, September 1990, p. 9.
26 That same year, the Foreign Assistance Act of 1961 (P.L. 87-195) was amended by the Foreign Assistance Act of
1968 (P.L. 90-554) to add Section 621A, which calls for “strengthened management practices,” including defined
objectives, quantitative indicators of progress, and means for comparing anticipated results with actual results.
27 The Community-Based Family Planning Services Family Planning Health and Hygiene Project, prepared by Bruce
Carlson, MSPH, and Malcolm Potts, M.D. under the auspices of The American Public Health Association, USAID,
1979, pp. 5, 7.
28 Evaluation Handbook, Office of Program Evaluation, USAID, November 1970, p. 40.
29 Experience – A Potential Tool for Improving U.S. Assistance Abroad, U.S. Government Accountability Office,
GAO-ID-82-36, June 15, 1982, p. i (summary).
(CDIE) in 1983, with a mandate to “foster the use of development information in support of
AID’s assistance efforts.”30 CDIE carried out meta-evaluations to reveal broader trends in aid
impact, provided information and training on evaluation best practices to mission staff, and made
a wide range of evaluation reports accessible to implementers in the field. Aid officials suggest
that CDIE’s evaluation work played a significant role in shaping USAID strategies and priorities
in many sectors over decades.
An internal USAID review in 1988 found that CDIE had greatly increased the use of aid
evaluation information by implementers, but also identified a need to improve the quality and
timeliness of evaluation reports.31 While the evaluation policy at the time still called for rigorous,
statistical methods of evaluation, it was found that this approach was never actually widely used
at USAID because the required skills, time, and expense made implementation difficult.32 As one
internal review noted, “statistical rigor in evaluation methods was deemphasized in favor of
‘reasonably’ valid evidence about project performance.”33 Guidance to missions encouraged the
use of low-cost and timely qualitative evaluation methodologies, including the use of key
informant interviews, focus group discussions, community meetings, and informal surveys.34
In the early 1990s, accountability for funds became a primary focus of aid evaluation. After a
1990 GAO review concluded that USAID evaluation practices made it difficult or impossible to
account for use of aid funds,35 attention turned to tracking where aid money was going, not
measuring what it was accomplishing. At the same time, USAID was facing increasing budgetary
pressure and increasing congressional and public concern about what was being achieved through
foreign assistance.36 In response, USAID carried out an Evaluation Initiative from 1990 to 1992,
greatly expanding the staff and budget of CDIE and making significant investments in rigorous
evaluation designs and innovative methods to evaluate sector-wide results.37 However, by the
mid-1990s the priorities changed once again. A 1993 agency reorganization led to the 1994
elimination of an Office of Evaluation within CDIE, a reduction of overall CDIE staff,38 and a
new emphasis on “rapid appraisal techniques,” which guidance documents describe as a
compromise between slow, costly, and credible formal evaluation methods and cheap, quick,
informal methods (focus group, etc.) that may be less reliable.39
In 1995, USAID replaced the requirement to conduct mid-term and final evaluations of all
projects with a policy calling for evaluation only when necessary to address a specific
management question.40 The rationale was that the required evaluations had become pro forma, as
30 The History of CDIE, CDIEHIST.017/SESmith;JREriksson/10-17-94, p. 4; available through the Development
Experience Clearinghouse on the USAID website.
31 Ibid.
32 The A.I.D. Evaluation System: Past Performance and Future Directions, Bureau for Program and Policy
Coordination, Agency for International Development, September 1990, p. 10.
33 Ibid., p. 11.
34 Ibid., p. 11.
35 Accountability and Control Over Foreign Assistance, GAO/T-NSIAD-90-25, March 29, 1990, pp. 6, 11. The review
found that military assistance managed by State and the Department of Defense was also inadequately monitored and
accounted for.
36 The History of CDIE, p. 6; The A.I.D. Evaluation System, p. 11.
37 Ibid., pp. 6-7.
38 Ibid., p. 8.
39 The Role of Evaluation in USAID, Performance Monitoring and Evaluation TIPS, USAID CDIE, 1997, Number 11,
p. 3.
40 Beyond Success Stories, p.7; Evaluation of Recent USAID Evaluation Experience, Cynthia Clapp-Wincek and
(continued...)
GAO reviews had suggested, and that fewer, more comprehensive evaluations would be a better
use of time and resources. As a result, the number of completed evaluations dropped from 425 in
1993 to an estimated 138 in 1999,41 but the depth and scope of new evaluations reportedly did not
change.42 One study suggests that inconsistent guidance on evaluation in these years allowed
many already overburdened mission staff to ignore agency-wide requirements, but noted that the
Global Health, Africa, and Europe & Eurasia bureaus, which had their own evaluation
procedures, continued to carry out quality evaluation work.43
Foreign assistance levels grew rapidly starting in 2003 to support military activities in
Afghanistan and Iraq, as well as the President’s Emergency Plan for AIDS Relief (PEPFAR) and
the creation in 2004 of the Millennium Challenge Corporation (MCC). Accountability to
Congress became a major evaluation priority. In 2005, inspired by remarks made by then House
Foreign Operations Appropriations Subcommittee Chairman Jim Kolbe regarding the importance
of being able to clearly demonstrate results of aid expenditures, USAID Administrator Andrew
Natsios sought to revitalize evaluation within the agency. He sent a cable to all mission directors
calling for the inclusion of evaluation plans, and higher quality evaluations, in all program
designs; designated monitoring and evaluation officers at each post; and set aside funding for
evaluations and incentives for employees who do evaluations, among other things.44
In 2006, in further pursuit of accountability, as well as a desire to rationalize the bilateral
assistance efforts of multiple U.S. agencies, Secretary of State Condoleezza Rice created the
Office of the Director of Foreign Assistance (F Bureau) at the State Department. In addition to
consolidating many USAID and State policy and planning functions for foreign assistance, the F
Bureau established an extensive set of standard performance indicators “to measure both what is
being accomplished with U.S. Government foreign assistance funds and the collective impact of
foreign and host-government efforts to advance country development.”45 Prior to this initiative,
the State Department, which traditionally had managed a much smaller aid portfolio than USAID,
is said to have made a de facto decision not to evaluate its assistance programs on a systematic
basis.46 The data collected through the “F process,” which remains in place today, allow for a
marked improvement in aid transparency, demonstrating comprehensively where and for what
(...continued)
Richard Blue, Working Paper No. 320, U.S. Agency for International Development, Center for Development
Information and Evaluation, June 2001, p. 31.
41 Evaluation of Recent USAID Evaluation Experience, p. 5. The report authors note that while some of the declining
numbers can be attributed to missions not submitting their evaluations to the Development Experience Clearinghouse,
as policy required, making the specific numbers unreliable, the trend of decline is unmistakable.
42 Evaluation of Recent USAID Evaluation Experience, p. 12.
43 The Evaluation of USAID’s Evaluation Function: Recommendations for Reinvigorating the Evaluation Culture
Within the Agency, Janice M. Weber, Bureau for Program and Policy Coordination, USAID, September 2004, pp. 5, 10.
44 Actions Required to Implement the Initiative to Revitalize Evaluation in the Agency, UNCLAS STATE 127594, July
8, 2005.
45 See http://www.state.gov/f/indicators/index.htm. It was originally expected by many that the F Bureau would
eventually track all foreign assistance provided by U.S. agencies, not just State and USAID. As of 2012, some MCC
data has been added to the Bureau’s public database (www.foreignassistance.gov), but there does not appear to be
momentum toward any expansion of F Bureau authority.
46 Beyond Success Stories, p. 14. The State Department traditionally has used a variety of resources for monitoring its
foreign assistance programs, including Mission and Bureau Strategic Plans, annual performance and accountability
reports, and Office of Inspector General and Government Accountability Office reports, but had no systematic
evaluation process (Department of State Program Evaluation Plan, FY2007-2012 Department of State and USAID
Strategic Plan, Bureau of Resource Management, May 2007, Appendix II).
purpose aid funds are allocated by State and USAID as of FY2006.47 However, the demands of F
process reporting were believed by some to have interfered with more results-oriented evaluation
work at USAID, and a 2008 assessment of State’s evaluation capacity found that several bureaus,
including those that manage State’s security assistance programs, still had little or no evaluation
capacity.48
The structural reforms of the F Bureau came at a time of heightened congressional scrutiny of
foreign aid. In 2004, Congress established the Helping to Enhance the Livelihood of People
(HELP) Around the Globe Commission, through a provision in P.L. 108-199, to independently
review foreign assistance policy decisions, delivery challenges, methodology, and measurement
of results. After nearly two years of work, the HELP Commission released its report in late 2007.
On the subject of evaluation, the report noted that “everyone to whom members of the
Commission spoke about monitoring and evaluation expressed concern about the inadequacy of
the existing process” and concluded that “unless our government better evaluates projects based
on the outcomes they achieve, it will not improve the effectiveness of taxpayer dollars.”49 The
commission recommended creation of a unified foreign assistance policy, budgeting, and
evaluation system within State, quite similar to the F process, which was established before the
report was released. Other HELP Commission recommendations included ensuring that
evaluation strategies use control groups and randomization as much as possible; considering new
evaluation methods, such as the use of professional associations or accreditation agencies; and
building, in collaboration with other donors, the capacities of recipient governments to provide
reliable baseline data.50
At the same time the F Bureau was established, and the HELP Commission was active, the
international donor community began to prioritize aid effectiveness, sparking renewed interest in
rigorous impact evaluation (see the “A Global Perspective on Aid Evaluation” text box below).
Some aid professionals viewed the F process as an opportunity to build a cross-agency aid
evaluation practice focused on impact, and were disappointed that the common indicators used by
the F Bureau, while an improvement with respect to comparability, measured outputs rather than
impact. Furthermore, the use of more rigorous evaluation methodologies was not a focus of the
reform.
These issues were revisited by the Obama Administration when it embarked in 2009 on a
Quadrennial Diplomacy and Development Review (QDDR) to examine how State and USAID
could be better prepared for current and future challenges. As a result of that review, the
Administration committed itself in December 2010 to several principles of foreign assistance
effectiveness, including “focusing on outcomes and impact rather than inputs and outputs, and
ensuring that the best available evidence informs program design and execution.”51 The first
QDDR became the basis of many changes at State and USAID, including the creation of a new
Office of Learning, Evaluation and Research at USAID and a new USAID evaluation policy,
which took effect in January 2011.52 A second QDDR, in 2015, called for training to deepen
47 The data is publicly available at http://www.foreignassistance.gov.
48 Beyond Success Stories, p. 8.
49 Beyond Foreign Assistance: The HELP Commission Report on Foreign Assistance Reform, The United States
Commission on Helping to Enhance the Livelihood of People (HELP) Around the Globe Commission, December 7,
2007, p. 15.
50 HELP Report, p. 99.
51 QDDR, p. 110.
52 A second QDDR, completed in 2015, continues to emphasize the need for better evaluation practices, calling for a
“data-driven, evidence-based” approach to development and diplomacy policymaking, increasing evaluation training
(continued...)
evaluation expertise at both USAID and State, and for adding “rigor” to evaluations through
better use of diagnostics and data analysis.53
The State Department adopted an evaluation policy similar to that of USAID in February 2012,
requiring all large projects and programs to be evaluated at least once in their lifetime or five-year
period, all State bureaus to complete two to four evaluations before the end of 2012-2013, and
posts to do the same in 2013-2014. The 2012 policy also called for 3%-5% of program resources
to be identified for evaluation purposes. It appears, however, that some of these requirements
were not met, and in January 2015, State revised its policy, paring it down to a less directive form
that was thought to be more appropriate for the wide range of State activities, from diplomatic
engagement to foreign assistance, and to reflect ongoing challenges in evaluating particularly
sensitive activities such as security assistance (see the “Evaluating Security Assistance” text box
below).54 The new policy removed the requirement that all large projects be evaluated, requires
one evaluation per bureau per year, and does not require any evaluations at the post level. Further
details of the new policy are provided in the Appendix.
MCC Rural Water Supply Project in Mozambique, 2008-2013
One MCC impact evaluation looked at a rural water supply project that was part of the $507 million Mozambique
compact that ended in 2013. The $200 million project installed water points (mostly hand pumps) in 614 poor, rural
communities, with the expectation that better access to improved water sources would reduce waterborne disease
rates and allow women and girls to spend less time fetching water and more time on education or economically
productive activities. The program met or exceeded most of its output targets, which related to water points
constructed, number of people trained in sanitary best practices, percentage of population with improved water
access, and time saved to get to primary water source. From a performance perspective, it was a success. The
independent impact evaluation, however, showed that improved access to clean water did not have any statistically
significant impact on beneficiary health or income, which were the ultimate objectives. Analysis of the results revealed
that while water quality was high at the collection point, it often became contaminated at the household level,
possibly negating the health benefits of the improved water points. The evaluation did not discuss potential reasons
why the average of an hour saved every day in water collection did not translate into higher household income.
Nevertheless, this evaluation challenged assumptions on which the project was designed, offering significant learning
value. In response to the evaluation findings, MCC reported that it would take steps to enhance peer review of
critical assumptions, improve understanding of local community water sanitation knowledge and practices before
designing future water supply projects, and consider how evaluators can assign value to time savings beyond income
generation. Evaluators also suggested that a longer time frame may be necessary to observe income-related results,
and MCC reports that it may conduct a survey in 2016 to assess the longer-term impacts of this project.
Source: Measuring Results of the Mozambique Rural Water Supply Project, MCC, August 11, 2014, available at
https://www.mcc.gov/resources/doc/summary-measuring-results-of-the-mozambique-rwsa.
The Millennium Challenge Corporation, established in 2004, has been regarded by many as a
leader in aid evaluation, largely as a result of its demanding evaluation policy. MCC provides
funding and technical assistance to support five-year development plans, called “compacts,”
created and submitted by partner countries. Since its inception, MCC policy has required that
every project in a compact be evaluated by independent evaluators, using pre-intervention
baseline data. MCC has also put a stronger emphasis on impact evaluation than State and USAID;
of the 48 completed evaluations as of April 2016, 13 are described as impact evaluations (as are
(...continued)
and capacity building, and noting that State’s Bureau of Political-Military Affairs is developing a comprehensive
approach to monitoring and evaluating security assistance programs.
53 2015 QDDR, pp. 13, 57.
54 Conversations between CRS and State Department officials, February 2015, May 2016.
about 40 of the 101 planned evaluations), a much higher proportion than at other aid agencies.55
Despite this emphasis, the overall impact of MCC assistance remains unclear. Individual project
evaluations have demonstrated successful project implementation, but often little evidence of
progress toward the overarching objective of raising household incomes in targeted areas. Such
evidence, however, may only be apparent many years after compact completion.
Evaluation Challenges
The current evaluation emphasis on measuring impact and broader learning about what works is
not new; as discussed above, it was the basis of USAID evaluation policy in the 1970s and at
various times since. Nevertheless, a 2009 meta-evaluation of U.S. foreign aid programs indicated
that rigorous impact evaluation (the kind that could determine with credibility whether a specific
aid intervention or broader sector strategy worked to produce a specific development outcome) was
rarely attempted. Of the 296 evaluations posted between 2005 and 2008 to USAID’s Development
Experience Clearinghouse website, an independent reviewer found only 9% reported on a comparison
group and only one used an experimental design involving randomized assignment, the method most
likely to produce accurate data.56 A 2005 review of USAID evaluations (focused on democracy and
governance programs) found that “as a group, they lacked information that is critical to
demonstrating the results of USAID projects, let alone whether the projects were the real cause of
whatever change the evaluation reported.”57 A meta-evaluation covering the period 2009-2012 found
a notable increase in evaluation following the new evaluation policy and found improvements in 68%
of quality factors examined, including the inclusion of recommendations. For most factors,
however, the improvements were less than 15%, and most evaluations met USAID quality standards in
only a few of the 37 criteria reviewed.58 USAID anticipates completing a second meta-evaluation,
covering the period 2012-2016, in 2017.
Evaluating Security Assistance
Foreign assistance evaluation efforts have focused almost exclusively on development assistance and, to a far lesser
degree, humanitarian assistance. Military and security assistance programs under State Department authority have
gone largely unevaluated. The strategic and diplomatic sensitivities of this type of aid present significant challenges
for evaluators. Past efforts by State to contract independent evaluators for these programs were reportedly
unsuccessful, with the unprecedented nature of the work creating high levels of uncertainty and perceived risk among
potential bidders. These challenges may be one reason that State loosened its evaluation requirements in 2015 and
why proposed legislation calling for more stringent and comprehensive aid evaluation has typically excluded security
assistance. The 2015 QDDR, however, noted that the State Department’s Bureau of Political-Military Affairs was
developing a comprehensive approach to monitoring and evaluation of security assistance. A working group is
reportedly tasked with establishing a feasible, incremental approach to security assistance evaluation, starting with
the limited collection of baseline data. Initial pilot evaluations of Foreign Military Financing programs may occur as
early as 2017.
Source: 2015 QDDR, p. 34; CRS conversations with State Department officials.
55 This data was provided to CRS by MCC on April 15, 2016. Includes evaluations of both compacts and threshold
programs.
56 Trends in International Development Evaluation Theory, Policies and Practices, USAID, 17 August 2009, p. 46.
57 Trends in International Development Evaluation Theory, Policies and Practices; USAID, 17 August 2009, p. 13.
The report was prepared for USAID by Molly Hageboeck of Management Systems International.
58 A summary of the 2009-2012 meta-evaluation is available at http://usaidlearninglab.org/sites/default/files/resource/
files/Meta%20Evaluation%20Presentation.pdf.
The gap between evaluation goals and actual practices has been documented repeatedly over the
history of U.S. foreign assistance. So, too, have the challenges that make it difficult for
implementers to achieve ideal evaluation practices in the field. Some of these challenges are
discussed below.
Mixed Objectives. The U.S. foreign assistance program has dozens of official objectives written
into statute, and many aid programs are designed to meet multiple objectives. Often there are both
strategic objectives and development objectives attached to an aid intervention, which may or
may not be acknowledged in budget and planning documents. For example, assistance to
Uzbekistan may have been requested and appropriated for specific agriculture sector activities,
but may have been motivated primarily by a desire to secure U.S. overflight privileges for
military aircraft bringing troops and supplies to Afghanistan. An evaluation of the agricultural
impact may be of no use to policymakers who are more interested in the strategic goal, nor to aid
professionals who are unlikely to view any lessons learned in these circumstances as applicable to
agricultural development projects if political needs overrode the development rationale for the
program.
Another example is the Food for Peace program, which provides U.S. agricultural commodities to
countries facing food insecurity. One objective of the program is to feed hungry people, but long-
standing requirements that most of the food be provided by U.S. agribusiness and be shipped by
U.S.-flagged vessels make clear that supporting the U.S. agriculture and shipping industries is a
program objective as well, and a potentially conflicting one. Studies have shown that the buy and
ship America provisions, as they are known, may lessen the hunger-alleviation impact of food aid
by up to 40%.59
Despite the political and diplomatic considerations that arguably underlie the majority of foreign
aid, evaluations that examine those strategic objectives are rare (or at least not publicly available).
This may be understandable, as such evaluations would often be politically and diplomatically
sensitive. Nevertheless, evaluation that focuses only on the development or humanitarian impact
of a particular program or project, when broader strategic objectives are drivers of the aid, may
largely miss the point. For example, a 2015 Mercy Corps evaluation of youth employment
programs in Afghanistan (funded by the United Kingdom, not the United States) tested the
assumption that a program to create economic opportunities for youth would promote stability by
lessening participants’ support for political violence. Contrary to expectation, the evaluation
found that the employment, economic confidence, and business connections fostered by the
program made participants more likely to express support for political violence.60
Funding and Personnel Constraints. The more rigorous and extensive an evaluation, the
costlier it tends to be, both in funds and staff time. Impact evaluations are particularly costly and
require specially trained implementers. Absent a directive from agency leadership, aid
implementers are unlikely to make resources available for evaluation at the expense of other
program components. As one internal USAID review explained, “since USAID’s development
professionals have limited staff, limited budget, and copious priorities, unfortunately, due to lack
of training on the crucial role of evaluation in the development process, most have chosen to
eliminate evaluation from their programs.”61 Competitive contracting plays a role as well. At a
59 The Developmental Effectiveness of Untied Aid, OECD, p.1, available at http://www.oecd.org/dataoecd/5/22/
41537529.pdf.
60 “Does Youth Employment Build Stability?,” Evidence From Impact Evaluation of Vocational Training in
Afghanistan, Mercy Corps 2015.
61 An Evaluation of USAID’s Evaluation Function, p. 5.
time when most program implementation is contracted out, and cost is a key factor in winning
contract bids, some argue that there is little incentive to invest in the up-front costs, such as
baseline surveys, of a well-designed evaluation plan in the absence of an enforced requirement.62
As a result, ad hoc evaluations of limited scope and learning value—as one report describes it, the
“do the best you can in three weeks” approach—often prevail by default.63 “It is rare,” according
to one report, “that the resources provided for an evaluation are sufficient to develop and apply
more rigorous research methods that would produce valid empirical evidence regarding outcomes
and attributable impact.”64 While MCC has the benefit of compacts being fully funded up front,
which may account in part for its more comprehensive evaluation practices, State and USAID
cannot count on receiving requested project funding from year to year, creating a challenge for all
aspects of program implementation, including evaluation.
Sometimes the limited resource is personnel, rather than funding. Past reviews of assistance
evaluation repeatedly cite lack of trained evaluation personnel as a problem. USAID has tried to
address this problem by training 1,600 staff in evaluation design and implementation since 2011
and producing a number of evaluation tools, publications, and webinars available to staff. USAID
has also recently recruited monitoring and evaluation fellows, who are placed for six months to
two years in offices that need additional expertise.65 Another part of this effort is building strong
relationships with other entities focused on aid evaluation, including aid agencies of other donor
countries and the International Initiative for Impact Evaluation (3ie).66 Some experts have
suggested that greater emphasis on collective evaluations—donor countries and foundations
contributing to an independent organization that conducts evaluations of aid crossing many donor
portfolios—could address resource and expertise limitations as well as allow for generalization of
evaluation findings and policy relevance.67
Emphasis on Accountability of Funds. Aid monitoring and evaluation efforts over the past
decade have primarily focused on accountability of funds because that is what stakeholders,
including Congress, generally ask about. Concerned about corruption and waste, bound by
allocation limits, and required by law to report on various aspects of aid administration,
implementing agencies have developed monitoring, evaluation, and data collection practices that
are geared toward tracking where funds go and what they have purchased rather than the impact
of funds on development or strategic objectives. For example, the F Bureau’s Foreign Assistance
Framework, launched in 2006, was created largely to address the information demands of
stakeholders, who wanted more data on how aid funds are being spent. It worked, to the extent
that it is now easier to find information on how much aid is being spent in a given year on
counterterrorism activities in Kenya, for example, or on agricultural growth programs in
Guatemala.68 But little if any of the resulting data addresses the impact of aid programs.
Methodological Challenges. In the complex environment in which many aid projects are carried
out, it can be challenging to employ high quality evaluation methods. U.S. agency policies allow
62 Beyond Success Stories, p. 16.
63 Ibid.
64 Ibid.
65 Strengthening Evidence Based Development, p. 12
66 For more information on 3ie, see the “A Global Perspective on Aid Evaluation” text box below.
67 The Future of Aid: Building Knowledge Collectively, Center for Global Development Policy Paper 050, January
2015.
68 Foreign aid data from FY2006-FY2012 estimates, sorted by recipient country, year, agency (only State, USAID and
MCC), appropriations account, and objective is readily available through the “Foreign Assistance Dashboard” at
http://www.foreignaid.gov.
for a variety of evaluation methods (see Appendix), acknowledging that the most rigorous
methods are not always practical. Sometimes it is impossible to identify a comparable control
group for an impact evaluation, or unethical to exclude people from a humanitarian intervention
for the purpose of comparison. Sometimes the goals are intangible and cannot be accurately
documented through metrics. For example, it may be much harder to measure the impact of
programs such as the Middle East Partnership Initiative, designed to strengthen relationships, than
to measure more concrete objectives, such as reducing malaria prevalence. This may be one
reason why reviews have found that global health assistance has a stronger evaluation history
than other aid sectors;69 disease prevalence and mortality rates lend themselves to quantification
better than military personnel attitudes towards human rights or the strength of civil society.
Rigorous methodology can also limit program flexibility, as making program changes mid-
course, in response to changed circumstances or early results, can compromise the evaluation
design. Some MCC evaluation reports note that information gleaned from early project
implementation resulted in mid-course changes that improved program logic but undermined
impact evaluation plans.
Even when metrics and baselines are well established, it can still be very difficult to attribute
impact to a specific U.S. aid intervention when such programs are often carried out in the context
of a broader trade, investment, political, and multi-donor environment.70 A 2016 SIGAR report,
for example, notes that while USAID frequently cites improvements in Afghanistan’s education
sector among the highlights of U.S. reconstruction efforts, the agency is unable to establish a link
between U.S. assistance and trends in the sector, in which many donors are active.71 Also, some
aid professionals see broader drawbacks to rigorous impact evaluation methods. Some assert that
the use of randomized control groups, which generally require the use of independent evaluators,
limits the participation of affected individuals and communities in project design. They argue that
community participation in project planning and evaluation, which can lead to greater buy-in and
local capacity building, is more valuable in the development context than high-quality evaluation
findings.72 Others counter that more participatory methodologies are often weakened by bias, and
that it is unwise and even unethical to replicate programs, which may profoundly affect
participants, without having properly evaluated them.73
Compressed Timelines. While development assistance, in particular, is recognized as a long-
term endeavor, aid strategies can be trumped by political pressures, which can influence
evaluation. In 2001, a USAID survey report stated that “the pattern found was that evaluation
work responds to the more immediate pressures of the day.”74 Policymakers facing relatively
short budget and election cycles do not always allow adequate time for programs to demonstrate
their potential impact. Such pressures have only increased over the past 15 years, particularly in
the politically charged environments of Iraq, Afghanistan, and Pakistan. As a Senate Foreign
Relations Committee majority-staff report on aid to Afghanistan found, “the U.S. Government
has strived for quick results to demonstrate to Afghans and Americans alike that we are making
69 Beyond Success Stories, p. 9.
70 The QDDR states that “we know that in many cases the outcome-level results are not solely attributable to U.S.
government investments and activities; we will focus on outcome-level progress in locations and subsectors where the
U.S. government is concentrating support.” (QDDR 2010, p. 104).
71 SIGAR Education report 16-32-AR p. 16. The report also notes that the education data used by USAID is provided
by the Afghan government and has not been independently verified.
72 A Can of Worms, p. 8; Beyond Success Stories, p. 17.
73 Improving Lives Through Impact Evaluation, p. 15
74 Evaluation of Recent USAID Evaluation Experience, p. 26.
progress. Indeed, the constant demand for immediate results prevented the implementation of
programs that could have met long-term goals and would now be bearing fruit.”75
The type of evaluation necessary to determine whether aid has real impact is both hard to do and
of limited use in a short-term context. Timelines are particularly restrictive for MCC, which
originally intended to complete evaluations during the compact implementation period. This goal,
which reflects broad support for limited timeframes on foreign assistance, was found not to be
feasible during implementation of MCC’s first compacts in Cape Verde and Honduras.76 Baseline
data and evaluation models can be rendered worthless if program timelines change. For example,
an MCC evaluation of a farmer training program in Armenia found that the planned impact
evaluation model—a phased roll-out—was compromised by a delay in implementing one
component of the program and the five-year compact timeline.77 The long-term impacts of aid
may be the most significant in judging effectiveness, but are least likely to be evaluated.
Sector Evaluation Example: Trade Capacity Building
Many analysts have suggested that cross-country evaluations of aid for a specific sector may be more useful for
shaping policy than the more common individual project evaluations. One example of this approach is an evaluation
commissioned by USAID to look at the impact of 256 U.S. trade capacity building (TCB) assistance projects in 78
countries from 2002 to 2006. The United States obligated about $5 billion during this period for TCB activities,
through several federal agencies, including assistance to help developing countries strengthen their public institutions
and policies related to trade, as well as programs to make private industries more knowledgeable about and
competitive in global markets. The evaluation was designed after the fact, making a randomized controlled trial
unfeasible, and had to account for variations in reporting across projects. Much of the report highlights anecdotal
examples of issues that could not be analyzed systematically as a result of inconsistent data collection methodologies
across projects. However, using regression analysis, evaluators found a relationship suggesting that each additional $1
invested in U.S. aid (from all agencies) for TCB is associated with a $53 increase in the value of recipient country
exports two years later. For TCB aid specifically managed by USAID, the relationship was $1 invested for $42 in
increased exports. No similar association was found between TCB assistance and recipient country imports or
foreign direct investment. While this evaluation’s methodology was not sufficient to demonstrate actual aid impact or
causation, its findings may be useful to policymakers in both demonstrating a correlation between TCB aid and export
growth, as well as forming the basis of a discussion about the comparative advantages of various U.S. agencies in
managing TCB aid.
Source: From Aid to Trade: Delivering Results. A Cross-Country Evaluation of USAID Trade Capacity Building, prepared for
USAID by Molly Hageboeck of Management Systems International, November 24, 2010; Executive Summary.
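The reported association can be read as a simple linear relationship. The Python sketch below is illustrative only: the function name, the constant names, and the $10 million input are hypothetical, and the calculation restates a correlation from the evaluation rather than a causal estimate or a forecast.

```python
# Associations reported in the TCB evaluation summary above: dollars of
# additional recipient-country exports two years later per $1 of TCB aid.
EXPORTS_PER_DOLLAR_ALL_AGENCIES = 53
EXPORTS_PER_DOLLAR_USAID = 42


def associated_export_change(tcb_aid_dollars, exports_per_dollar):
    """Return the export change statistically associated with a TCB outlay.

    This reads a correlation off a regression; it is not a causal prediction.
    """
    if tcb_aid_dollars < 0:
        raise ValueError("aid amount must be non-negative")
    return tcb_aid_dollars * exports_per_dollar


# Hypothetical input chosen only for illustration: $10 million of TCB aid.
hypothetical_aid = 10_000_000
all_agencies = associated_export_change(
    hypothetical_aid, EXPORTS_PER_DOLLAR_ALL_AGENCIES
)
usaid_only = associated_export_change(hypothetical_aid, EXPORTS_PER_DOLLAR_USAID)

print(f"All agencies:  ${all_agencies:,}")
print(f"USAID-managed: ${usaid_only:,}")
```

Because the evaluation was designed after the fact, these figures describe a pattern across countries and should not be taken as the expected return on any single project.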
Country Ownership and Donor Coordination. The United States and other aid donor countries
have made pledges to both coordinate their efforts and increase recipient country control, or
“ownership,” over the planning of aid projects and the management of aid funds. Country
ownership is believed by many to increase the odds that positive results will be sustained over
time both by ensuring aid projects are consistent with recipient priorities and by helping to build
the budget and project management capacity of recipient country governments and non-
governmental organizations (NGOs) that administer the assistance. Donor coordination of
assistance efforts is supposed to promote efficiency, ease administrative burdens on aid recipients,
and avoid duplication, among other things. USAID, as part of its ongoing procurement reform
process, aims to channel an increasing portion of contract and grant aid directly to governments
and local organizations. However, greater country ownership, and the pooled funds that may
75 S.Prt. 112-21, Evaluating U.S. Foreign Assistance to Afghanistan, June 8, 2011, p. 14.
76 Millennium Challenge Corporation: Compacts in Cape Verde and Honduras Achieved Reduced Target, GAO-11-728, p. 33.
77 Measuring Results of the Armenia Farmer Training Investment, October 23, 2012, p. 4, available at
http://www.mcc.gov/documents/reports/results-2012-002-1196-01-armenia-results-country-summary.pdf.
result from donor coordination, generally means diminished donor control, and a lesser ability to
evaluate how U.S. funds contributed to a particular outcome. Accountability concerns often
greatly overshadow the learning aspects of evaluation in such a context, as Congress has
expressed concern about the heightened potential for corruption and mismanagement when funds
flow directly to recipient country institutions. A 2016 report of the Special Inspector General for
Afghanistan Reconstruction (SIGAR), for example, notes that while an increasing portion of U.S.
aid to Afghanistan is being provided through Afghan government ministries, these ministries
struggle with staffing, technical skills, management, and accountability.78
Security. Over the past 15 years, a significant percentage of foreign aid has been allocated to
countries where security concerns have presented major obstacles to implementing, monitoring
and evaluating foreign aid. A 2012 evaluation of a USAID agricultural development program in
rural Pakistan, for example, states “the operating environment for development projects has been
especially testing in recent years in the presence of an insurgency and frequent targeted killings
and kidnappings.”79 Development staff in Afghanistan and Iraq in particular have not always been
able to safely visit project sites to verify that a structure has been built or supplies delivered,
much less be out on the streets conducting the types of surveys that certain evaluations would
normally call for. A 2011 USAID Inspector General report noted that more than half of
performance audits in Iraq at that time indicated security concerns, and a 2016 SIGAR report
noted that the drawdown of U.S. and coalition military personnel in Afghanistan, and the
deteriorating security situation, made it difficult or impossible for civilian agency personnel to
oversee projects first-hand.80 Even in less hostile environments, security concerns can undermine
evaluation quality. For example, a 2011 evaluation of Office of Transition Initiatives governance
activities in Colombia noted that “security considerations limited to some degree the evaluation
team’s freedom to interview community members in project sites at will. This fact made it
difficult to be certain that field research did not suffer from a form of sampling bias.”81 While
security challenges may weigh against the use of aid in certain regions, the most insecure places
are sometimes where the U.S. foreign policy interests are greatest, and policymakers must
consider whether the risk of being unable to evaluate even the performance of an aid intervention
is worth taking for other reasons.
Agency and Personal Incentives. Given discretion in the use and conduct of evaluations,
observers have noted the inclination of foreign assistance officials to avoid formal evaluation for
fear of drawing attention to the shortcomings of the programs on which they work. While agency
staff are clearly interested in learning about program results, many are reportedly defensive about
evaluation, concerned that evaluations identifying poor program results may have personal career
implications, such as loss of control over a project, damage to professional reputation, budget
cuts, or other potential career repercussions.82 As explained by one USAID direct-hire in response
to a 2001 survey, “if you don’t ask [about results], you don’t fail, and your budget isn’t cut.”83
That same study revealed that staff felt more pressure to produce success stories than to produce
78 Challenges to Effective Oversight of Afghanistan Reconstruction Grow as High-Risk Areas Persist, SIGAR, 2/24/16,
pp. 9-10.
79 United States Assistance to Balochistan Border Areas: Evaluation Report, Prepared by Management Systems
International for USAID, January 16, 2012, p. vi.
80 SIGAR 2/16 report, p. 14.
81 USAID/OTI’s Integrated Governance Response Program in Colombia, Final Evaluation, prepared by Caroline
Hartzell et al., April 2011, p. 7.
82 Evaluation of Recent USAID Evaluation Experiences, p. 22.
83 Ibid., p. 24.
balanced and rigorous evaluations, and that “professional staff do not see any Agency-wide
incentive to advance learning through evaluations.”84 Few observers consider risk taking and
accepting failure as a necessary component of learning to be hallmarks of USAID or State
Department culture, but a shift in this attitude may be in progress. According to USAID
Administrator Gayle Smith, there has been “a cultural shift from checking the box that everything
is fine to here’s what we’re learning and here’s what happened.”85 Other experts have suggested
that there remains a reluctance within USAID to hold staff responsible for poor evaluation
practices.86
Evaluating Humanitarian Assistance
Humanitarian assistance can present unique evaluation challenges, and is evaluated less frequently than development
assistance. Available evaluation reports show significant shortfalls in this area. For example, a 2015 evaluation report
of a State Department Bureau of Population, Refugees and Migration (PRM)-funded program to boost employment
skills and opportunities for refugees living in camps in Ethiopia, implemented by three partners under the auspices of
the United Nations High Commissioner for Refugees (UNHCR), found anecdotal evidence of positive program
impacts but little basis for assessing program effectiveness. Neither PRM nor UNHCR at the time required more than
basic monitoring of program outputs (individuals trained), and implementers could provide no data on livelihood or
education outcomes, which were the objective of the programs. This was in part because no system was in place to
collect the necessary data, and in part because the camp population was fluid: many program participants left the
camp soon after participating in the program and were not tracked. Despite the many challenges, U.S. agencies and
other donors are making efforts to improve evaluation of humanitarian aid. Among the priorities that emerged from
the 2016 World Humanitarian Summit consultative process is development of a framework and mechanisms for
better evaluating the quality and effectiveness of humanitarian assistance by all donors.
Source: Evaluating the Effectiveness of Livelihood Programs for Refugees in Ethiopia, U.S. Department of State, available at
http://www.state.gov/documents/organization/252133.pdf.
Applying Evaluation Findings to Policy
A consistent theme in past reviews of foreign aid evaluation practices is that even when quality
evaluation takes place, the resulting information and analysis are often not considered and applied
beyond the immediate project management team. Evaluations are rarely designed or used to
inform policy. Lack of faith in the quality of the evaluation, irregular dissemination practices, and
resistance to criticism may all contribute to this problem, as does lack of time on the part of aid
implementers and policymakers alike to read and digest evaluation reports. A 2009 survey of U.S.
aid agencies found that “bureaucratic incentives do not support rigorous evaluation or use of
findings,” “evaluation reports are often too long or technical to be accessible to policymakers and
agency leaders with limited time,” and learning that takes place, if any, is “largely confined to the
immediate operational unit that commissioned the evaluation.”87 The shift in recent decades
towards the use of contractors and implementing partners for most project implementation, and
most project evaluation, may also impact the learning process. As one report notes, “partner
84 Ibid., pp. 26-27.
85 USAID Administrator Gayle Smith at a forum on “Assessing the Impact of Foreign Assistance: The Role of
Evaluation,” the Brookings Institution, March 30, 2016. See
http://www.brookings.edu/events/2016/03/30-impact-foreign-assistance.
86 Ruth Levine, Global Development and Population Program Director, Hewlett Foundation, at a forum on “Assessing
the Impact of Foreign Assistance: The Role of Evaluation,” the Brookings Institution, March 30, 2016. See
http://www.brookings.edu/events/2016/03/30-impact-foreign-assistance.
87 Beyond Success Stories, p. iv.
organizations are learning from the experience, but USAID is not,” and most evaluation work
does not circulate beyond the partner.88
Congress expressed some interest in this issue with the Initiating Foreign Assistance Reform Act
of 2009 (H.R. 2139 in the 111th Congress, introduced by Representative Howard Berman), which
called for “a process for applying the lessons learned and results from evaluation activities,
including the use and results of impact evaluation research, into future budgeting, planning,
programming, design and implementation of such United States foreign assistance programs.”
The government-wide GPRA performance planning and assessment requirements mentioned
earlier (see “Program Evaluation Government-Wide” text box above) also attempted to mandate
better use of evaluation data in policymaking government-wide. Aid agencies have addressed this
issue with renewed focus and mixed results. USAID reviewed the utilization of evaluation data
over the first several years under its new policy and found that 90% of surveyed evaluation
findings and recommendations had some impact on program-level decisionmaking, mostly for
project design and modification.89 USAID requires that its five-year Country Development
Cooperation Strategies (CDCS) cite evidence as the basis of their development hypothesis, and
60% of the CDCS in 2015 cited evaluation reports as evidence. However, there is no USAID
requirement that new policies draw on evaluation findings, and the study found little evidence
linking evaluations to higher-level policy decisions.90
The learning aspect of evaluation relies heavily on agency culture, which may be shaped more by
leadership than policy. The effective application of evaluation information depends also on the
details of implementation, such as evaluation questions being based on the information needs of
policymakers and program managers, and information being presented in a format and to a scale
that is useful. Policymakers, for example, may be much better able to make actionable use of a
meta-evaluation of microfinance programs, presented in a short report highlighting key findings,
than a whole database of detailed analysis of single projects, the results of which may or may not
be more broadly applicable. Experts have pointed out that individual project evaluations, even
when well done, do not roll up nicely into a document showing what works and what does not.
They contend that for maximum learning, an effort must be made at the cross-agency or even
whole-of-government level to develop evaluation meta-data that is responsive not only to the
needs of a project manager interested in the impact of a particular activity, but also to agency
leadership and policymakers who want to know, more broadly, what foreign assistance is most
effective.
This view has been reflected in legislation introduced in recent Congresses. The Foreign
Assistance Revitalization and Accountability Act of 2009 (S. 1524 in the 111th Congress,
introduced by then Senator Kerry) called for the creation of a Council on Research and
Evaluation of Foreign Policy to do cross-agency evaluation of aid programs. The Foreign Aid
Transparency and Accountability Act (S. 2184/ H.R. 3766 in the 114th Congress, introduced by
Senator Marco Rubio and Representative Ted Poe, respectively), would direct the President to
establish guidelines for the consistent evaluation of foreign assistance across federal agencies.
As important as evaluation can be to improving aid effectiveness, not every aid project has broad
learning potential. Knowing which potential evaluations could have the greatest policy
implications may be key to maximizing evaluation resources. Many USAID projects, for
example, are designed with no intention that they be scaled up or replicated elsewhere. In other
88 Evaluation of Recent USAID Evaluation Experiences, p. 27.
89 Evaluation Utilization at USAID, February 23, 2016, p. 10.
90 Ibid., p. 12.
situations, an approach may have already been well proven. In such instances, a basic
performance evaluation for accountability may be appropriate, but rigorous evaluation may be a
poor use of resources. A 2012 USAID “Decision Tree for Selecting the Evaluation Design” asks
staff to first consider whether an evaluation is needed, and decline to evaluate if the timing is not
right, if there are no unanswered questions for the evaluation to address, or if there is no demand
from stakeholders.91
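The screening step described above can be expressed as a short decision rule. The Python sketch below is a hypothetical rendering of only that first step as summarized in this report; the function and parameter names are assumptions and are not taken from the USAID document, which covers further design choices not modeled here.

```python
def should_evaluate(timing_is_right, has_unanswered_questions, has_stakeholder_demand):
    """Return (decision, reason) for the initial 'is an evaluation needed?' screen.

    Mirrors the three conditions summarized in the text: decline if the timing
    is not right, if there are no unanswered questions, or if stakeholders have
    expressed no demand.
    """
    if not timing_is_right:
        return False, "timing is not right"
    if not has_unanswered_questions:
        return False, "no unanswered questions to address"
    if not has_stakeholder_demand:
        return False, "no demand from stakeholders"
    return True, "proceed to selecting an evaluation design"


# Example: a well-proven approach with no open questions is screened out.
decision, reason = should_evaluate(
    timing_is_right=True,
    has_unanswered_questions=False,
    has_stakeholder_demand=True,
)
print(decision, "-", reason)
```

The checks are ordered as they appear in the text; since any single failed condition leads to declining, the order affects only which reason is reported.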
Current Agency Evaluation Policies
The primary U.S. government agencies managing foreign assistance each have their own distinct
evaluation policies, with varying degrees of specificity. The Quadrennial Diplomacy and
Development Review (QDDR) report of December 2010 stated the intent that USAID would
reclaim its leadership role with respect to international development evaluation and learning, and
referenced a new USAID evaluation policy in the works to reflect the growing demand for results
data and attempt to address some persistent evaluation challenges. That policy took effect January
2011. The State Department followed suit in February 2012 with a new evaluation policy that was
similar in many respects to the USAID policy, and MCC updated its policy in May 2012. State
then updated its policy again in early 2015, apparently paring down several requirements in the
2012 policy, though the 2015 QDDR reaffirmed the State Department’s commitment to building
evaluation capacity. The Appendix table compares key provisions of the current evaluation
policies of USAID, State, and MCC.
The State and USAID policies share much in common, balancing the costs and expected gains
from evaluation. For example, both require performance evaluations of all larger-than-average
projects and experimental/pilot projects, but not all projects. The policies share an emphasis on
accessibility of information, with provisions to promote consistent and timely dissemination of
evaluation reports, though State only requires public dissemination of foreign assistance
evaluations, and summaries rather than full reports. In their introductory language, both policies
emphasize the learning benefits of evaluation, in addition to accountability. The USAID policy is
notably more detailed than State’s on many of the issues. The USAID policy establishes required
features for evaluation reports, and specifies that evaluation questions be identified in the design
phase of projects, issues which the State policy does not address. USAID states that most
evaluations will be conducted by third party contractors or grantees, to promote independence,
while State’s policy does not require independent evaluators. While USAID suggests a target
allocation of 3% of program funds for program evaluation, the State policy provides no such
target and the guidance suggests that such a target may not be realistic. Perhaps most
significantly, USAID’s policy calls for impact evaluation whenever feasible, while the State
policy sets a clear expectation that impact evaluation will be rare.
MCC’s evaluation policy shares many elements of the State and USAID policies, but goes farther
in many respects. MCC requires independent evaluations of all compact projects, using indicators
and baselines established prior to project implementation. The agency has also made a practice of
including a “lessons learned” section in its evaluation reports. It may be, however, that first-hand
experience with the challenges of evaluation is bringing MCC policy and practice closer to that of
USAID over time. MCC’s 2012 policy revision adopts definitions from USAID’s 2011 evaluation
policy and includes a section on institutional learning. The update also appears to move closer to
the USAID model with respect to impact evaluation, calling for impact evaluations “when their
91 Decision Tree for Selecting the Evaluation Design, USAID, June 2012, p. 1, available on USAID’s Development
Experience Clearinghouse website.
costs are warranted,” whereas the previous iteration referred to independent impact evaluations as
an “integral part” of MCC’s focus on results.92 The MCC policy still appears to have the strongest
enforcement mechanism among the three agency policies, conditioning the release of quarterly
disbursements on substantial compliance with the policy. USAID’s policy, in contrast, calls only
for occasional compliance audits, and State’s policy does not address compliance at all.
While some experts have called for greater uniformity of evaluation practices across agencies to
allow for comparative analysis, others view the differences in State, USAID, and MCC evaluation
policies as reflecting the different experience, scope of work, and priorities of the agencies.
USAID, with the largest and most diverse assistance portfolio among the agencies, and numerous
small projects, may require a more flexible approach to evaluation than MCC, which is narrowly
focused on economic growth and recipient government ownership. At State, foreign assistance is
just one part of a broader portfolio (including diplomatic activities), potentially impacting what
type and scope of evaluation is useful or possible. State is also responsible for many military and
security assistance programs, which present unique challenges, as discussed in the “Evaluation
Challenges” section above.
These current evaluation policies may represent a step towards improving knowledge of foreign
assistance measures of effectiveness at the program or project level, and increasing transparency
of the evaluation process. They do not, however, attempt to establish a systemic approach to aid
evaluation that would make country-wide, sector-wide, or cross-agency evaluation of aid more
feasible. They look similar to earlier initiatives to improve aid evaluation. Many aspects of the
2011 USAID policy, for example, are strikingly similar to the required actions called for in the
2005 cable to USAID missions (e.g., evaluation planning as part of all program designs,
designated evaluation officers at each post, and set-aside evaluation funds). It may be too early to
know whether this new multiagency initiative will have more real or lasting impact than its
predecessors. A meta-evaluation examining USAID evaluations from 2009 to 2012 indicates that
both the number and quality of evaluations increased significantly in that period, but most
evaluations in 2012 still failed to meet evaluation standards.93
92 Policy for Monitoring and Evaluation of Compacts and Threshold Programs, MCC, May 1, 2012, p. 18; Policy for
Monitoring and Evaluation of Compacts and Threshold Programs, MCC, May 12, 2009, p. 17.
93 Meta-Evaluation of Quality and Coverage of USAID Evaluations: 2009-2012, August 2013, p. 7.
A Global Perspective on Aid Evaluation
U.S. foreign assistance evaluation efforts have evolved in the context of a global movement by public and private aid
donors to improve aid effectiveness, with improved evaluation practices as one of many strategies. Representatives of
aid donor countries meet regularly under the auspices of the OECD Development Assistance Committee (DAC) to
discuss evaluation practices, among other things, as a means of implementing the aid effectiveness agenda laid out in
the 2005 Paris Declaration on Aid Effectiveness and the 2008 Accra Agenda for Action. A 2010 OECD/DAC survey
and report on evaluation in the development agencies of major donor countries highlighted several issues that are
common to U.S.-specific aid evaluation.94 The report found a heavy reliance on measuring outputs, but also a trend
toward measuring aid impact and larger strategic questions of development effectiveness. It identified new emphasis
on dissemination of evaluation findings, and found that while bilateral aid agencies on average allocated 0.1% of their
development assistance budget to evaluation, lack of human resources (in particular, people qualified to do rigorous
impact evaluations, evaluations of direct budget support, or evaluations requiring specific language skills) presented a
bigger obstacle to evaluation goals than did financial constraints.
Non-governmental organizations have focused on evaluation in recent years, as well. In 2004, an Evaluation Gap
Working Group was convened by the Center for Global Development with support from the Bill & Melinda Gates
Foundation and the William and Flora Hewlett Foundation. The Working Group focused on why rigorous impact
evaluations of development assistance were so rare. The resulting report, “When Will We Ever Learn?,” is a key
resource for this report. The group made two recommendations: (1) that donors invest more in their own evaluation
capacity, and (2) that an independent institution be created to evaluate aid.95 The offshoot of the latter
recommendation is the International Initiative for Impact Evaluation (3ie), established in 2009, with a mission to use
impact evaluations, specifically, to generate high quality evidence for use in shaping effective development policies. 3ie
both funds evaluations and produces extensive materials on evaluation methods, implementation practices, and
application to policy, as a means to improve evaluators’ technical capacity. USAID and MCC are official partners of
3ie, as are many other official aid agencies, private foundations, and non-profit organizations such as the Hewlett and
Gates foundations and Save the Children.
Issues for Congress
While some momentum on foreign aid evaluation reform has originated within the
Administration, Congress may have significant influence on this process. Not only can Congress
mandate or promote a certain approach to evaluation directly through legislation, as has been
proposed, it can modulate Administration policies by controlling the appropriations necessary to
implement the policies. Congress may also influence how, or if, the information resulting from
evaluations will impact foreign assistance policy priorities. These issues are discussed in greater
detail below.
Reform Authorization Legislation. In the 112th and 113th Congresses, legislation was introduced
that focused specifically on foreign aid evaluation. The Foreign Aid Transparency and
Accountability Act (H.R. 3159/ S. 3310 in the 112th, S. 1271/H.R. 2638 in the 113th Congress)
sought to evaluate the performance of U.S. foreign assistance programs and improve program
effectiveness by requiring the President to establish guidelines on measurable goals, performance
metrics, and monitoring and evaluation plans for foreign assistance programs that can be applied
on a consistent basis across implementing agencies.96 The legislation also called for the creation
of a website that would make detailed, program-level information on foreign assistance, including
94 Evaluation in Development Agencies, Better Aid, OECD Publishing, 2010, available at
http://dx.doi.org/10.1787/9789264094857-en.
95 When Will We Ever Learn?: Improving Lives Through Impact Evaluation, Report of the Evaluation Gap Working Group,
Center for Global Development, May 2006.
96 The House and Senate proposals were similar but not identical. For example, H.R. 3159, as passed by the House,
called for evaluation guidelines to be applied “with reasonable consistency,” while S. 3310 called for the guidelines to
be applied “on a uniform basis.”
country strategies, budget documents, budget justifications, actual expenditures, and program
reports and evaluations available to the public. The legislation was reintroduced in the 114th
Congress (H.R. 3766/S. 2184) with some modifications, including the exclusion of security
assistance.
The general focus of these proposals is on codifying evaluation requirements and extending them
across the various federal agencies that administer aid programs. The benefit of such broad
uniformity, arguably, is that it could enable policymakers, the public, and other stakeholders to
better compare the activities of various agencies and get a more comprehensive picture of total
U.S. foreign assistance. A potential drawback is the effort and expense required to impose such
uniformity on agencies with different objectives, management structures, and information
technology systems. These proposals also focus on transparency and accountability rather than
effectiveness, and do not explicitly promote the use of impact evaluation, though they call for the
use of rigorous methodologies, including impact evaluation. If performance evaluation continues
to comprise the vast majority of aid evaluations, such a cross-agency requirement may provide
comparable information on aid management from agency to agency, but is not likely to facilitate
comparative analysis of what aid channels are most effective.
Appropriations for Enhanced Evaluation. Increasing the number and quality of foreign aid
evaluations, while potentially cost effective in the long run, requires an investment of resources.
For the most part, evaluation costs are integrated into program accounts at the various
implementing agency budgets and are not scrutinized specifically by Congress. Annual funding
levels established by Congress, together with any related legislative directives that limit the use of
funds, may play a role in determining the extent of the Administration’s efforts and capacity to
strengthen evaluation practice. Congress may also wish to specify in appropriations legislation a
portion of funds to be used for evaluation purposes.
Impact of Evidence-Based Approach on Congressional Priorities. Congress has long exerted
control over foreign assistance not only through appropriated funds and restrictions, but also by
directing foreign assistance funds to certain sectors, countries, or even specific projects through
bill or report language. For example, the committee reports accompanying the annual State-
Foreign Operations appropriation proposals provide specific funding levels for microfinance,
basic education, water and sanitation, women’s leadership training, people-to-people
reconciliation programs in the Middle East, and other sectors of particular interest to Members of
Congress. Should credible information about the relative effectiveness of these programs be made
available as a result of improved evaluation practices, Congress can weigh the importance of the
data, among other considerations, in establishing aid priorities. Some congressional directives on
aid are less likely than others to be affected by evaluation results. The availability of actionable
evaluation data may not result in a maximization of aid effectiveness, but may allow Congress to
make more deliberate trade-offs between effectiveness and other objectives.
Conclusion
The primary U.S. agencies charged with implementing foreign assistance have taken significant
steps in the last several years to address ongoing deficiencies in evaluation practices that make it
difficult to judge whether foreign assistance is achieving its various objectives. There is
widespread agreement on the need for more consistent performance evaluation of aid programs.
The value of rigorous impact evaluation is broadly recognized as well, though the agencies differ
in their capabilities and aspirations in this respect. Past policies and evaluation reform efforts,
however, have been similarly focused but not sustained in the face of persistent challenges, many
of which remain today. Other reforms, such as the establishment of centralized evaluation
processes or the creation of an independent evaluation entity, have been proposed in legislation
but not yet enacted. Growing emphasis in Congress and the Administration on results-based
budgeting, as well as movement within the international aid donor community toward more
rigorous aid evaluation practices, may provide the context for sustained progress. The 114th
Congress continues to have opportunities to influence how U.S. foreign assistance is evaluated
through legislative proposals, appropriations, and oversight activities.
Appendix. Select Aspects of Current USAID, State
Department, and MCC Evaluation Policies
Effective Date
  USAID: January 2011
  State: January 29, 2015
  MCC: May 1, 2012
Responsible Personnel
  USAID: PPL/LER responsible for system implementation, while missions and functional bureaus responsible for conducting evaluations. All Bureaus and operating units must designate an evaluation point of contact.
  State: F oversees planning and implementation of foreign assistance evaluations, BP for diplomatic engagement evaluations. Each Bureau is responsible for conducting its own evaluations and must appoint a Bureau Evaluation Coordinator.
  MCC: Primary lead is MCA (host country entity) M&E, with input from MCC M&E.
Evaluation Requirement
  USAID: Operating units must conduct at least one performance evaluation of each project that equals or exceeds average project size. Projects involving an untested hypothesis or new approach, and that are anticipated to expand in scale or scope, will undergo an impact evaluation, if feasible. All evaluations will share certain basic features, including a full description of methodology; standardized recording and maintenance of records from evaluation; evaluation findings based on facts, evidence, and data, sex-disaggregated data; and an explanation of the limitations of the data. Key evaluation questions will be identified during the design phase of every project.
  State: All programs/projects/activities greater than or equal to the median size (using dollar value or staff resources as the measure) for the Bureau must be evaluated at least once in their lifetime. All pilot programs must be evaluated before being replicated. Each Bureau or office should conduct at least one evaluation each fiscal year.
  MCC: All Compacts and Threshold Agreements include monitoring and evaluation plans, which identify the evaluations to be conducted for each project, the key evaluation questions and methodologies, and the data collection strategies that will be used. Final evaluations are required for all projects in a Compact upon completion or termination; mid-term evaluations are discretionary. Selected indicators must have baselines established prior to the start of the corresponding activity.
Evaluation Type
  USAID: Emphasis on quality evaluation methods and favoring random assignment/experimental methods for impact evaluations when feasible.
  State: Evaluations should be based on verifiable data and information that have been gathered using the standards of professional evaluation organizations. According to the guidance, counterfactual data required for impact evaluation “cannot be collected for the overwhelming majority of the evaluations of management processes, delivery system and programs – unlike in other fields, control groups are not established when projects or programs are initiated at the Department. Even when data can be generated, the cost of collecting can be prohibitive.”
  MCC: Impact evaluations performed “when their costs are warranted by the expected accountability and learning.”
Evaluator Type
  USAID: Policy states that most evaluations will be conducted by third party contractors or grantees managed by USAID, but evaluation teams may be composed primarily of USAID staff, led by an outside expert, when it is determined that this will facilitate institutional learning.
  State: Suggests that evaluators should be “free from any pressure and/or bureaucratic interference,” but does not require the use of outside evaluators. Bureaus and offices may conduct evaluations with their own staff as long as the staff have the appropriate training and experience and are not accountable to the managers of the program being evaluated.
  MCC: Independent evaluators required for final evaluations of Compacts. Mid-term compact evaluations and final threshold program evaluations can be done independently or by MCC/MCA staff.
Funding Requirement
  USAID: Recommends that an average of 3% of program budgets be dedicated specifically to external evaluation, distinct from monitoring. Resources for evaluation should be concentrated on large projects and those that are innovative or pilot approaches.
  State: Calls for program managers to identify resources to conduct evaluations during program planning, but does not specify an amount or portion of funds to be used for evaluation, and the guidance suggests that the international standard of 3-5% of program costs is unrealistic.
  MCC: Does not specify a portion of funds that should be used for evaluation.
Reporting Requirement
  USAID: Public availability of evaluation reports and summaries, within 3 months of completion, on the Development Experience Clearinghouse website.
  State: Bureaus and posts must post summaries of evaluation results internally, unless they are classified or sensitive but unclassified (SBU). Summaries of foreign assistance evaluations must be posted publicly on the F Bureau web page of the state.gov website.
  MCC: MCAs must post their approved Compact M&E plans on their website. MCC and MCAs must “regularly” publish results information on their websites.
Compliance Enforcement
  USAID: PPL/LER will organize occasional external technical audits of operating unit compliance with the policy.
  State: No reference to compliance enforcement.
  MCC: Substantial compliance required for approval of quarterly disbursements requested by recipient country.
Source: Policy for Monitoring and Evaluation of Compacts and Threshold Programs, MCC, May 1, 2012; Department of
State Evaluation Policy, Bureau of Resource Management, February 23, 2012; Evaluation: Learning from Experience,
USAID Evaluation Policy, January 2011.
Notes: PPL/LER = USAID Office of Learning, Evaluation and Research; F Bureau = Office of Foreign Assistance
Resources; RM = State Department Bureau of Resource Management; MCA = the Millennium Challenge Account
implementing entity in each compact country; M&E = monitoring and evaluation. The information in the table
refers only to what is in the actual evaluation policy document of each agency, as cited above. Information
available outside of these documents, which may provide greater details about aspects of the policies, is not
reflected here.
Author Contact Information
Marian Leonardo Lawson
Analyst in Foreign Assistance
mlawson@crs.loc.gov, 7-4475