Does Foreign Aid Work? Efforts to Evaluate U.S. Foreign Assistance

Changes from November 19, 2012 to February 13, 2013

Marian Leonardo Lawson
Analyst in Foreign Assistance
November 19, 2012

Congressional Research Service
7-5700
www.crs.gov
R42827

CRS Report for Congress
Prepared for Members and Committees of Congress


Summary
Congress’s recent focus on reducing federal spending raises questions about the relative
efficiency and effectiveness of all federal programs. In this context, evaluation of foreign
assistance programs is of growing interest to many Members of Congress as they scrutinize the
Administration’s international affairs budget request and debate foreign aid spending priorities.
Policymakers, taxpayers, and aid recipients alike want to know what impact, if any, foreign aid
dollars are having, and whether foreign aid programs are achieving their intended objectives.
    February 13, 2013
            (R42827)
          
Contents
Introduction
Does Aid Work? A Brief Summary
Impact and Performance Evaluations
History of U.S. Foreign Assistance Evaluation
Evaluation Challenges
Applying Evaluation Findings to Policy
Current Agency Evaluation Policies
Issues for Congress
Conclusion

      Appendixes
      Appendix A. Select Aspects of Current USAID, State Department, and MCC Evaluation Policies

    
Summary
In most cases, the success or failure of U.S. foreign aid programs is not entirely clear, in part
 because historically, most aid programs have not been evaluated for the purpose of determining
 their actual impact. The purpose and methodologies of foreign aid evaluation have varied over the
 decades, responding to political and fiscal circumstances. Aid evaluation practices and policies
 have variously focused on meeting program management needs, building institutional learning,
 accounting for resources, informing policymakers, and building local oversight and project design
 capacity. Challenges to meaningful aid evaluation have varied as well, but several are recurring.
 Persistent challenges to effective evaluation include unclear aid objectives, funding and personnel
 constraints, emphasis on accountability for funds, methodological challenges, compressed
 timelines, country ownership and donor coordination commitments, security, and agency and
 personnel incentives. As a result of these challenges, aid agencies do not undertake rigorous
 evaluation for all foreign aid activities.

    The U.S. government agencies managing foreign assistance each have their own distinct
 evaluation policies; these policies have come into closer alignment in the last two years than in
 the past. The Obama Administration's Quadrennial Diplomacy and Development Review
 (QDDR) resulted in, among other things, a stated commitment to plan foreign aid budgets
 "based not on dollars spent, but on outcomes achieved." This focus on evaluating the impact of foreign
 assistance reflects an international trend. USAID put this idea into practice by introducing a new
 evaluation policy in January 2011. The State Department, which began to manage a growing
 portion of foreign assistance over the past decade, followed suit with a similar policy in February
 2012. The Millennium Challenge Corporation, notable for its demanding but little-tested
 approach to evaluation, also recently revised its policy. While differing in several respects,
 including their support for impact evaluation, the policies reflect a common emphasis on
 evaluation planning as a part of initial program design, transparency and accessibility of
 evaluation findings, and the application of data to inform future project design and allocation
 decisions. Aspects of the three evaluation policies are compared in Appendix A.
    Though recent evaluation reform efforts have been agency-driven, Congress has considerable
 influence over their impact. Legislators may mandate a particular approach to evaluation directly
 through legislation (e.g., H.R. 3159, S. 3310 in the 112th Congress), or can support or undermine Administration
 policies by controlling the appropriations necessary to implement the policies. Furthermore,
 Congress will largely determine how, or if, any actionable information resulting from the new
 approach to evaluations will influence the nation's foreign assistance policy priorities.

  Introduction
    Congress's strong focus on reducing federal spending raises questions about the relative
 efficiency and effectiveness of all federal programs, and foreign assistance is a subject often
 raised in broad budget debates. Foreign assistance evaluation is one aspect of a government-wide
 effort to link program effectiveness to budgeting decisions. It is also an element of broader
 foreign aid reforms implemented in recent years. The 2010 Quadrennial Diplomacy and
 Development Review (QDDR), the basis of many recent aid policy initiatives, called for the State
 Department and the U.S. Agency for International Development (USAID) to plan foreign aid
 budgets and programs "based not on dollars spent, but on outcomes achieved," and for USAID to
 become "the world leader in monitoring and evaluation."1 Rigorous evaluation is also a
 cornerstone of the Millennium Challenge Corporation (MCC), established in 2004 to promote a
 new model of development assistance.2 According to USAID Administrator Rajiv Shah, global
 development policies and practices are experiencing a "transformation based on absolute demand
 for results."3 That demand comes, in part, from some Members of Congress as they scrutinize the
 Administration's international affairs budget request and consider foreign aid spending priorities.4
 It also comes from aid beneficiaries and American taxpayers who want to know what impact, if
 any, foreign aid dollars are having and whether foreign aid programs are achieving their intended
 objectives.
 
    The current emphasis on evaluation is not new. The importance, purpose and methodologies of
 foreign aid evaluation have varied over the decades since USAID was established in 1961,
 responding to political and fiscal circumstances, as well as evolving development theories. There
 are a number of reasons that this issue has gained prominence in recent years. For one, foreign aid
 funding levels have increased over the past decade while evaluations have decreased, raising
 questions about the knowledge basis for aid policy.5 Analysts have noted that after decades of aid
 agencies spending billions of dollars on assistance programs, very little is known about the
 impact of these programs.6 Some wonder how policymakers can develop effective foreign aid
 strategies without a clear understanding of how and why prior assistance has succeeded or failed.
 
    This report focuses primarily on U.S. bilateral assistance, and less on the work of multilateral aid
 entities, such as the World Bank, to which the United States contributes. While a wide range of
 federal agencies provide foreign assistance in some form,7 this report focuses on the three
 agencies that have primary policy authority and implementation responsibility for U.S. foreign
 assistance: USAID, the State Department, and the Millennium Challenge Corporation (MCC). It
 discusses past efforts to improve aid evaluation, as well as ongoing issues that make evaluation
 challenging in the foreign assistance context. The report also provides an overview of the current
 evaluation policies of the primary implementing agencies, and discusses related issues for
 Congress, including recent legislation.

1 U.S. Department of State, Quadrennial Diplomacy and Development Review, 2010, Leading Through Civilian Power, p. 103.
2 For more information about the MCC model, see CRS Report RL32427, Millennium Challenge Corporation, by Curt Tarnoff.
3 Statement of USAID Administrator Rajiv Shah to The Cable, as reported in The Cable, June 13, 2012.
4 While not often discussing evaluation policy per se, some Members appear to be influenced in their policy decisions by their sense of what aid is working and what is not. For example, when introducing her subcommittee's FY2013 proposal at full-committee mark-up on May 17, 2012, House State-Foreign Operations Appropriations Subcommittee Chairwoman Kay Granger remarked that the legislation "only supports programs that work." Senator Lindsey Graham of the Senate State-Foreign Operations Appropriations Subcommittee, explaining the sharp reduction in aid for Iraq in the Senate's FY2013 proposal at a May 22, 2012, mark-up, said "there's no point in throwing good money after bad."
5 For historic information on foreign aid spending, see CRS Report R40213, Foreign Aid: An Introduction to U.S. Programs and Policy, by Curt Tarnoff and Marian Leonardo Lawson.
6 When Will We Ever Learn?: Improving Lives Through Impact Evaluation, Report of the Evaluation Gap Working Group, Center for Global Development, May 2006, p. 1.
7 According to U.S. Overseas Loans and Grants, 21 U.S. Government agencies reported disbursing foreign assistance in
(continued...)

    
      
        
          
            Program Evaluation Government-Wide

            Program evaluation is an important issue throughout the U.S. government, and foreign assistance evaluation is just
 one part of a broader effort by the federal government to improve accountability and program performance through
 stronger evaluation processes. With the Government Performance and Results Act (GPRA) of 1993, Congress
 established unprecedented statutory requirements regarding the establishment of goals, performance measurement
 indicators, and submission of related plans and reports to Congress for its potential use in policy development and
 program oversight. The GPRA Modernization Act of 2010 updated the original law, requiring more frequent plan
 updates and on-line posting of data.8 The agency-specific evaluation plans discussed in this report are intended to
 comply with and build upon this government-wide effort. Most recently, in a May 18, 2012, memorandum, the Office
 of Management and Budget (OMB) directed all federal agencies to demonstrate the use of evidence from rigorous
 evaluation throughout their FY2014 budget submissions.9 While OMB has emphasized use of evidence in prior years,
 this memorandum appears to take the issue to a more formal level, and suggests that evaluation data may be closely
 linked to budget approval in future fiscal years.

 
          
        
      
    
    Does Aid Work? A Brief Summary

    To know whether aid is successful, one must understand its purpose. The Foreign Assistance Act
 (FAA) of 1961 (P.L. 87-195), as amended, is the authorizing legislation for most modern foreign
 aid programs. The FAA declared that

     the principal objective of the foreign policy of the United States is the encouragement and
 sustained support of the people of developing countries in their efforts to acquire the
 knowledge and resources essential to development, and to build the economic, political, and
 social institutions that will improve the quality of their lives.10

    The original legislation lists five principal goals for foreign aid: (1) the alleviation of the worst
 physical manifestations of poverty among the world's poor majority; (2) the promotion of
 conditions enabling developing countries to achieve self-sustaining economic growth and
 equitable distribution of benefits; (3) the encouragement of development processes in which
 individual civil and economic rights are respected and enhanced; (4) the integration of the
 developing countries into an open and equitable international economic system; and (5) the
 promotion of good governance through combating corruption and improving transparency and
 accountability.11 Amending legislation over the years added dozens of new, though often
 overlapping, aid objectives. For example, "the suppression of the illicit manufacturing of and
 trafficking in narcotic and psychotropic drugs" was added in 1971,12 "to alleviate human suffering
 caused by natural and manmade disasters" was added in 1975,13 and "to enhance the antiterrorism
 skills of friendly countries by providing training and equipment" and "to strengthen the bilateral
 ties of the United States with friendly governments by offering concrete [antiterrorism]
 assistance"14 were added in 1983. In short, U.S. foreign aid is intended to be a tool for fighting
 poverty, enhancing bilateral relationships, and/or protecting U.S. security and commercial
 interests.

(...continued)
FY2010. See http://gbk.eads.usaidallnet.gov/data/fast-facts.html.
8 For more on current GPRA requirements, see CRS Report R42379, Changes to the Government Performance and Results Act (GPRA): Overview of the New Framework of Products and Processes, by Clinton T. Brass.
9 Use of Evidence and Evaluation in the FY2014 Budget, Memorandum to the Heads of Executive Departments and Agencies, Jeffrey D. Zients, Acting Director, Office of Management and Budget, May 18, 2012.
10 Foreign Assistance Act of 1961 (P.L. 87-195), §101(a).
11 Ibid.

    In this broad view, some instances of specific development assistance projects and programs are
 widely viewed as successful. The largest aid program of the last century, the Marshall Plan (1948-1952), for example, is acclaimed as a key factor in the post-World War II reconstruction of
 European states that have gone on to become major strategic and trade partners of the United
 States. In the late 1960s and 1970s, aid associated with the "green revolution" was credited with
 greatly improving agricultural productivity and addressing hunger and malnutrition in parts of
 Asia, and global health programs were credited with virtually eradicating smallpox. Korea,
 Taiwan, and Botswana are often cited as aid success stories as a result of remarkable economic
 progress following significant aid infusions. More recently, unquestionable progress in battling
 public health crises, such as HIV/AIDS, across the globe can be largely attributed to massive
 foreign assistance programs, both bilateral and multilateral. Even in these instances, however,
 close analysis often reveals many caveats.
 
    In other specific instances foreign aid programs and projects have been considered to be
 conspicuously unsuccessful, or even harmful to intended beneficiaries. Critics of foreign
 assistance cite decades of aid to corrupt governments in Africa, which enriched corrupt leaders
 and did little to improve the lives of the poor.15 In Latin America, U.S. aid to anti-communist
 rebels and regimes during the Cold War was associated with brutal violence and believed by
 many to have damaged U.S. credibility as a champion of democracy. Numerous examples exist of
 hospitals, schools, and other facilities that were built with donor funds and left to rot, unused in
 developing countries that did not have the resources or will to maintain them. In some instances,
 critics assert that foreign aid may do more harm than good, by reducing government
 accountability, fueling corruption, damaging export competitiveness, creating dependence, and
 undermining incentives for adequate taxation.16

    The most notable successes and conspicuous failures of foreign aid give fodder to both aid
 advocates and detractors, but in all likelihood represent just a small segment of assistance
 activities. In most cases, clear evidence of the success or failure of U.S. assistance programs is
 lacking, both at the program level and in aggregate. One reason for this is that aid provided for
 development objectives is often conflated with aid provided for political and security purposes.
 Another reason is that historically, most foreign assistance programs are never evaluated for the
 purpose of determining their impact, either at the time or retrospectively. Furthermore, evaluation
 practices are not consistent enough to allow for the use of project level data as the basis for
 broader, strategic evaluations. According to one 2009 review of monitoring and evaluation across
 U.S. foreign assistance implementing agencies, evaluation of foreign assistance programs "is
 uneven across agencies, rarely assesses impact, lacks sufficient rigor, and does not produce the
 necessary analysis to inform strategic decision making."17

12 FAA, as amended, §481(a)(1)(C).
13 FAA, as amended, §491(a).
14 FAA, as amended, §572 (1) and (2).
15 Several examples of this are discussed in Economic Gangsters: Corruption, Violence and the Poverty of Nations, by Raymond Fisman and Edward Miguel, Princeton University Press, 2008.
16 See Dambisa Moyo, Dead Aid: Why Aid is Not Working and How There Is a Better Way for Africa, Farrar, Straus and Giroux, New York, 2009, p. 48.

    Impact and Performance Evaluations

    The Department of State, USAID, and other U.S. agencies implementing foreign assistance
 programs have long evaluated the performance of their own personnel and contractors in meeting
 discrete objectives. Depending on the nature of the project or program, staff and contractors
 might monitor the miles of road built, number of police officers trained, or changes in the use of
 fertilizers by farmers. These results can be compared to the initial program goals and expectations
 to determine whether the project or contract has been performed successfully. This type of
 oversight is called performance monitoring, and if the resulting data are analyzed in an effort to
 explain how and why a program meets or fails to meet strategic objectives, this is called
 performance evaluation. Performance monitoring and evaluation are widely viewed as essential
 aspects of oversight, and performance evaluations represent the vast majority of foreign aid
 evaluation to date. Financial audits by agency Inspectors General, which examine whether funds
 are being used as intended, are also a common form of evaluation, particularly at the State
 Department.
 
    Performance evaluation and financial audits play an important part in project management but do
 little to answer questions about foreign aid effectiveness. Addressing this question, some argue,
 requires impact evaluations. Impact evaluations can take many forms, but their common element
 is that they use a defined counterfactual, or control group, and baseline data to measure change
 that can be attributed to an aid intervention.18 Impact evaluations look not at the output of an
 activity, but rather at its impact on a development objective. For example, while a performance
 evaluation of an education program may look at the number of textbooks provided and teachers
 trained, an impact evaluation may determine how or if literacy or math skills had improved for
 the target group as compared to a similar group that did not receive the textbooks or teacher
 training. A performance evaluation of an HIV prevention project may report the number of public
 awareness events held or condoms distributed, while an impact evaluation of the same program
 would monitor changes in the HIV/AIDS infection rate of the targeted population. An impact
 evaluation of a police training program would look at the program's impact on civil order and
 public safety rather than simply report how many officers were trained or the value of equipment
 supplied. Randomized controlled trials, in which beneficiaries are randomly selected from a
 prequalified group and compared before and after the program to those not selected, are widely
 viewed as best practice for impact evaluation, but less rigorous methods are used as well.
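
The comparison logic described above can be illustrated with a small arithmetic sketch. It uses a difference-in-differences calculation, which is only one of several possible estimation approaches and is not drawn from any agency's evaluation guidance. The function name and the literacy scores are hypothetical, chosen to mirror the education example in the text.

```python
from statistics import mean


def difference_in_differences(treated_before, treated_after,
                              control_before, control_after):
    """Change in the treated group minus change in the comparison group.

    The comparison group stands in for the counterfactual: what would
    likely have happened to the treated group without the program.
    """
    treated_change = mean(treated_after) - mean(treated_before)
    control_change = mean(control_after) - mean(control_before)
    return treated_change - control_change


# Hypothetical literacy test scores (0-100), invented for illustration.
# "Before" values are the baseline data; "after" values are follow-up data.
treated_before = [40, 45, 50]   # students who later received textbooks
treated_after = [60, 65, 70]
control_before = [42, 44, 49]   # similar students who did not
control_after = [50, 52, 57]

impact = difference_in_differences(
    treated_before, treated_after, control_before, control_after
)
print(impact)  # 12
```

In this invented example, scores in the treated group rose by 20 points, but scores in the comparison group rose by 8 points without the program, so only 12 points of the improvement would be attributed to the intervention. A performance evaluation reporting textbooks delivered would capture none of this.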
 
    Impact evaluations can be key to determining whether a foreign assistance program "works."
 However, impact evaluations are generally far more complex and resource-intensive than
 performance evaluations. Agencies implementing foreign assistance must balance the potential
 knowledge to be gained from impact evaluation with the additional resources necessary to carry
 out such evaluations. As a result, while the potential learning benefits of impact evaluation have
 long been recognized by aid officials, the use of rigorous impact evaluation has been, and
 continues to be, very limited. More typically, agencies aim for evaluation practices that are, as
 one expert has put it, "cost-effectively rigorous," and, at minimum, "independent, transparent,
 and consistent, thus persuasive."19

17 Beyond Success Stories: Monitoring and Evaluation For Foreign Assistance Results, Evaluator Views of Current Practice and Recommendations for Change, by Richard Blue, Cynthia Clapp-Wincek and Holly Benner, May 2009, p. ii.
18 For a thorough, yet non-technical, discussion of the use of impact/attribution evaluation, see "An introduction to the use of randomized control trials to evaluate development interventions," by Howard White, International Initiative for Impact Evaluation, Working Paper 9, February 2011.

    History of U.S. Foreign Assistance Evaluation

    The practice of foreign assistance evaluation has changed over time to reflect evolving, or some
 might say cyclical, attitudes about the purpose and relative importance of evaluation.20 This is
 evident both in the United States and internationally. Aid evaluation practices and policies have
 variously focused on different evaluation objectives, including meeting program management
 needs, institutional learning, accountability for resources, informing policymakers, and building
 local oversight and project design capacity.
 
    The history of U.S. foreign assistance evaluation begins with USAID, which implemented the
 vast majority of U.S. foreign assistance prior to the last decade. In its early years, USAID was
 primarily involved in large capital and infrastructure projects, for which evaluations focused on
 financial and economic rates of return were appropriate. However, the agency soon shifted focus
 towards smaller and more diverse projects to address basic human needs, and found that the rate
 of return evaluation model was no longer sufficient.21 The agency established its first Office of
 Evaluation in 1968, and used a Logical Framework (LogFrame) model as its primary system for
 monitoring and evaluation. The LogFrame approach, subsequently adopted by many international
 development agencies, employed a matrix to identify project goals, purposes, results, and
 activities, with corresponding indicators, verification methods, and important assumptions.
 Baseline data were to be used for each indicator, and results were reported at quarterly points
 during the life of a project. However, these data were not analyzed to look for competing
 explanations of the results or unintended consequences of activities.

    While the LogFrame approach established USAID as a thought leader with respect to evaluation
 policy, in practice, evaluations varied significantly from project to project. A 1970 evaluation
 handbook included a diagram of the "ideal" program evaluation design, which resembles a
 randomized controlled trial, but noted that "there are a great many reasons why it may not be
 possible to reach the ideal."22 Reviews of foreign assistance evaluation over decades revealed
 shortcomings. For one, the system had become decentralized over time, suitable to meet the
 information needs of project managers in the field but not to contribute to broader learning or policy
 making. A 1982 report by the General Accounting Office (now the Government Accountability
 Office, GAO) found that "AID staff does not apply lessons learned in the development of new
 projects," and that "lessons learned are neither systematically nor comprehensively identified or
 recorded by those who are directly involved."23 In response to the GAO report's recommendation
 that USAID build an "information analysis capability," the agency created the Center for
 Development Information and Evaluation (CDIE) in 1983, with a mandate to "foster the use of
 development information in support of AID's assistance efforts."24 CDIE carried out
 meta-evaluations to reveal broader trends in aid impact, provided information and training on
 evaluation best practices to mission staff, and made a wide range of evaluation reports accessible
 to implementers in the field. Aid officials suggest that CDIE's evaluation work played a
 significant role in shaping USAID strategies and priorities in many sectors over decades.

19 Clemens, Michael. "Impact Evaluation in Aid: What For? How Rigorous?" Presentation at the Overseas Development Institute, July 3, 2012, video recording available at http://www.cgdev.org/content/multimedia/detail/1426372/.
20 Trends in Development Evaluation Theory, Policies and Practices, USAID, 17 August 2009, p. 4.
21 The USAID Evaluation System: Past Performance and Future Direction, Bureau for Program and Policy Coordination, USAID, September 1990, p. 9.
22 Evaluation Handbook, Office of Program Evaluation, USAID, November 1970, p. 40.

    An internal USAID review in 1988 found that CDIE had greatly increased the use of aid
 evaluation information by implementers, but also identified a need to improve the quality and
 timeliness of evaluation reports.26 While the evaluation policy at the time still called for
 rigorous, statistical methods of evaluation, it was found that this approach was never actually
 widely used at USAID because the required skills, time, and expense made implementation
 difficult.27 As one internal review noted, "statistical rigor in evaluation methods was
 deemphasized in favor of 'reasonably' valid evidence about project performance."28 Guidance to
 missions encouraged the use of low-cost and timely qualitative evaluation methodologies,
 including the use of key informant interviews, focus group discussions, community meetings,
 and informal surveys.29

    
      
        
          
            Testing Family Planning Project Design
 in Thailand, 1979

            Many evaluations are designed to answer specific
 questions about project design. One example is the
 Family Planning Health and Hygiene Project, a 1979
 independent evaluation of USAID support for the
 government of Thailand's family planning policy.
 Implemented by the American Public Health Association,
 the evaluation used a baseline survey and experimental
 design to test the hypothesis that contraception services
 would be more cost effective and acceptable to
 communities if combined with basic health services
 rather than implemented in isolation. Obtaining the
 appropriate information to inform resource allocation
 was a primary objective of the evaluation. According to
 the report, "the evaluation was implemented with
 sufficient precision and adherence to experimental
 requirements to provide information on which to make
 management decisions about the best use of resources."
 Evaluators found that the hypothesis was not supported
 by the evidence. Adding basic health services doubled the
 cost of programs but was not associated with increased
 contraceptive use. As a result, the evaluators
 recommended that future decisions about family planning
 and basic health services programs be considered
 without any assumption that a linkage between the two
would increase the acceptance of contraception use.25

In the early 1990s, accountability for funds
became a primary focus of aid evaluation.
After a 1990 GAO review concluded that
USAID evaluation practices made it difficult or impossible to account for use of aid funds,30
attention turned to tracking where aid money was going, not measuring what it was
accomplishing. At the same time, USAID was facing increasing budgetary pressure and
increasing congressional and public concern about what was being achieved through foreign
assistance.31 In response, USAID carried out an Evaluation Initiative from 1990 to 1992, greatly
expanding the staff and budget of CDIE and making significant investments in rigorous
evaluation designs and innovative methods to evaluate sector-wide results.32 However, by the
mid-1990s the priorities changed once again. A 1993 agency reorganization led to the 1994
elimination of an Office of Evaluation within CDIE, a reduction of overall CDIE staff,33 and a
new emphasis on "rapid appraisal techniques," which guidance documents describe as a
compromise between slow, costly, and credible formal evaluation methods and cheap, quick,
informal methods (focus groups, etc.) that may be less reliable.34

23 Experience – A Potential Tool for Improving U.S. Assistance Abroad, U.S. Government Accountability Office, GAO-ID-82-36, June 15, 1982, p. i (summary).
24 The History of CDIE, CDIEHIST.017/SESmith;JREriksson/10-17-94, p. 4; available through the Development Experience Clearinghouse on the USAID website.
25 The Community-Based Family Planning Services Family Planning Health and Hygiene Project, prepared by Bruce Carlson, MSPH, and Malcolm Potts, M.D. under the auspices of The American Public Health Association, USAID, 1979, pp. 5, 7.
26 Ibid.
27 The A.I.D. Evaluation System: Past Performance and Future Directions, Bureau for Program and Policy Coordination, Agency for International Development, September 1990, p. 10.
28 Ibid., p. 11.
29 Ibid., p. 11.
30 Accountability and Control Over Foreign Assistance, GAO/T-NSIAD-90-25, March 29, 1990, pp. 6, 11. The review
(continued...)

In 1995, USAID replaced the requirement to conduct mid-term and final evaluations of all
projects with a policy calling for evaluation only when necessary to address a specific
management question.35 The rationale was that the required evaluations had become pro forma, as
GAO reviews had suggested, and that fewer, more comprehensive evaluations would be a better
use of time and resources. As a result, the number of completed evaluations dropped from 425 in
1993 to an estimated 138 in 1999,36 but the depth and scope of new evaluations reportedly did not
change.37 One study suggests that inconsistent guidance on evaluation in these years allowed
many already overburdened mission staff to ignore agency-wide requirements, but noted that the
Global Health, Africa, and Europe & Eurasia bureaus, which had their own evaluation
procedures, continued to carry out quality evaluation work.38
Foreign assistance levels grew rapidly starting in 2003 to support military activities in
Afghanistan and Iraq, as well as the President’s Emergency Plan for AIDS Relief (PEPFAR) and
the creation in 2004 of the Millennium Challenge Corporation (MCC). Accountability to
Congress became a major evaluation priority. In 2005, inspired by remarks made by House
Foreign Operations Appropriations Subcommittee Chairman Jim Kolbe regarding the importance
of being able to clearly demonstrate results of aid expenditures, USAID Administrator Andrew
Natsios sought to revitalize evaluation within the agency. He sent a cable to all mission directors
calling for the inclusion of evaluation plans, and higher quality evaluations, in all program

(...continued)
found that military assistance managed by State and the Department of Defense was also inadequately monitored and
accounted for.
31
The History of CDIE, p.6; The A.I.D. Evaluation System, p. 11.
32
Ibid, pp. 6-7.
33
Ibid. p. 8.
34
The Role of Evaluation in USAID, Performance Monitoring and Evaluation TIPS, USAID CDIE, 1997, Number 11,
p. 3.
35
Beyond Success Stories, p.7; Evaluation of Recent USAID Evaluation Experience, Cynthia Clapp-Wincek and
Richard Blue, Working Paper No. 320, U.S. Agency for International Development, Center for Development
Information and Evaluation, June 2001, p. 31.
36
Evaluation of Recent USAID Evaluation Experience, p. 5. The report authors note that while some of the declining
numbers can be attributed to missions not submitting their evaluations to the Development Experience Clearinghouse,
as policy required, making the specific numbers unreliable, the trend of decline is unmistakable.
37
Evaluation of Recent USAID Evaluation Experiences, p. 12.
38
The Evaluation of USAID’s Evaluation Function: Recommendations for Reinvigorating the Evaluation Culture
Within the Agency, Janice M. Weber, Bureau for Program and Policy Coordination, USAID, September 2004, pp. 5, 10.

Congressional Research Service

7

 Does Foreign Aid Work? Efforts to Evaluate U.S. Foreign Assistance

designs; designated monitoring and evaluation officers at each post; and set aside funding for
evaluations and incentives for employees who do evaluations; among other things.39

39

Actions Required to Implement the Initiative to Revitalize Evaluation in the Agency, UNCLAS STATE 127594, July
8, 2005.
40
For an overview of this evaluation, as well as links to related studies, see
http://www.povertyactionlab.org/evaluation/primary-school-deworming-kenya.
41
Roetman, Eric. A Can of Worms? Implications of Rigorous Impact Evaluations for Development Agencies,
International Initiative for Impact Evaluations, Working Paper 11, March 2011, p. 5.
42
See http://www.state.gov/f/indicators/index.htm. It was originally expected by many that the F Bureau would
eventually track all foreign assistance provided by U.S. agencies, not just State and USAID. As of 2012, some MCC
data has been added to the Bureau’s public database (www.foreignassistance.gov), but there does not appear to be
momentum toward any expansion of F Bureau authority.
43
Beyond Success Stories, p. 14. The State Department traditionally has used a variety of resources for monitoring its
foreign assistance programs, including Mission and Bureau Strategic Plans, annual performance and accountability
reports, and Office of Inspector General and Government Accountability Office reports, but had no systematic
evaluation process (Department of State Program Evaluation Plan, FY2007-2012 Department of State and USAID
Strategic Plan, Bureau of Resource Management, May 2007, Appendix II).
44
The data is publicly available at http://www.foreignassistance.gov.
45
Beyond Success Stories, p. 8.

An internal USAID review in 1988 found that CDIE had greatly increased the use of aid evaluation information by implementers, but also identified a need to improve the quality and timeliness of evaluation reports.26 While the evaluation policy at the time still called for rigorous, statistical methods of evaluation, it was found that this approach was never actually widely used at USAID because the required skills, time, and expense made implementation difficult.27 As one internal review noted, "statistical rigor in evaluation methods was deemphasized in favor of 'reasonably' valid evidence about project performance."28 Guidance to missions encouraged the use of low-cost and timely qualitative evaluation methodologies, including the use of key informant interviews, focus group discussions, community meetings, and informal surveys.29 
In the early 1990s, accountability for funds became a primary focus of aid evaluation. After a 1990 GAO review concluded that USAID evaluation practices made it difficult or impossible to account for use of aid funds,30 attention turned to tracking where aid money was going, not measuring what it was accomplishing. At the same time, USAID was facing increasing budgetary pressure and increasing congressional and public concern about what was being achieved through foreign assistance.31 In response, USAID carried out an Evaluation Initiative from 1990 to 1992, greatly expanding the staff and budget of CDIE and making significant investments in rigorous evaluation designs and innovative methods to evaluate sector-wide results.32 However, by the mid-1990s the priorities changed once again. A 1993 agency reorganization led to the 1994 elimination of an Office of Evaluation within CDIE, a reduction of overall CDIE staff,33 and a new emphasis on "rapid appraisal techniques," which guidance documents describe as a compromise between slow, costly, and credible formal evaluation methods and cheap, quick, informal methods (focus group, etc.) that may be less reliable.34 
In 1995, USAID replaced the requirement to conduct mid-term and final evaluations of all projects with a policy calling for evaluation only when necessary to address a specific management question.35 The rationale was that the required evaluations had become pro forma, as GAO reviews had suggested, and that fewer, more comprehensive evaluations would be a better use of time and resources. As a result, the number of completed evaluations dropped from 425 in 1993 to an estimated 138 in 1999,36 but the depth and scope of new evaluations reportedly did not change.37 One study suggests that inconsistent guidance on evaluation in these years allowed many already overburdened mission staff to ignore agency-wide requirements, but noted that the Global Health, Africa, and Europe & Eurasia bureaus, which had their own evaluation procedures, continued to carry out quality evaluation work.38 
Foreign assistance levels grew rapidly starting in 2003 to support military activities in Afghanistan and Iraq, as well as the President's Emergency Plan for AIDS Relief (PEPFAR) and the creation in 2004 of the Millennium Challenge Corporation (MCC). Accountability to Congress became a major evaluation priority. In 2005, inspired by remarks made by House Foreign Operations Appropriations Subcommittee Chairman Jim Kolbe regarding the importance of being able to clearly demonstrate results of aid expenditures, USAID Administrator Andrew Natsios sought to revitalize evaluation within the agency. He sent a cable to all mission directors calling for the inclusion of evaluation plans, and higher quality evaluations, in all program designs; designated monitoring and evaluation officers at each post; and set aside funding for evaluations and incentives for employees who do evaluations; among other things.39 

      
        
          Primary School Deworming in Kenya (1997-2001)40
One well-known example of an impact evaluation that yielded useful information looked at a World Bank-supported project in Kenya that treated children for intestinal worms, a prevalent affliction that results in listlessness, diarrhea, abdominal pain, and anemia. The stated development objective was to increase the number of children completing their primary education. In collaboration with the local health ministry, NGO implementers treated 30,000 children in 75 schools with a drug that cost $3.27 annually per child, using baseline data and a random phase-in approach that allowed for a controlled comparison. The evaluation found that the de-worming resulted in a 25% reduction in absenteeism, or 10-15 more days of school attendance per child per year. This case is also an example of the value of consistent methodology and the use of sector- or region-wide evaluation that looks at results beyond the project level. Similar evaluation methods were used for other interventions (providing free uniforms, textbooks, and/or meals) with the same goal and in the same region, allowing evaluators to do a comparative analysis and determine that the de-worming intervention was the most effective of these interventions in increasing school participation.41
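The random phase-in design described above can be illustrated with a short sketch. It is not the evaluators' actual procedure: the number of phases (three), the function names, and the absenteeism rates are assumptions chosen only to show how randomly ordering the rollout creates a comparison group and how a relative reduction such as the reported 25% is computed.

```python
import random

def assign_phases(school_ids, n_phases, seed=0):
    """Randomly split schools into equally sized phase-in groups."""
    rng = random.Random(seed)
    shuffled = list(school_ids)
    rng.shuffle(shuffled)
    # Every n-th school goes to the same phase, so group sizes stay balanced.
    return [shuffled[i::n_phases] for i in range(n_phases)]

def relative_reduction(control_rate, treated_rate):
    """Fractional drop in a rate for treated schools relative to comparison schools."""
    return (control_rate - treated_rate) / control_rate

schools = range(1, 76)                       # 75 schools, as in the Kenya project
phases = assign_phases(schools, n_phases=3)  # three phases is an assumption

# In the first period only phase 0 is treated; the later phases,
# not yet treated, serve as the comparison group.
treated = phases[0]
comparison = phases[1] + phases[2]

# Hypothetical absenteeism rates for illustration only (not study data).
example = relative_reduction(control_rate=0.20, treated_rate=0.15)
print(f"{len(treated)} treated, {len(comparison)} comparison schools")
print(f"relative reduction in absenteeism: {example:.0%}")
```

Because the order of treatment is set by chance rather than by school characteristics, differences in attendance between the groups during the phase-in period can more plausibly be attributed to the intervention itself.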

        

    
In 2006, in further pursuit of accountability, as well as a desire to rationalize the bilateral assistance efforts of multiple U.S. agencies, Secretary of State Condoleezza Rice created the Office of the Director of Foreign Assistance (F Bureau) at the State Department. In addition to consolidating many USAID and State policy and planning functions for foreign assistance, the F Bureau established an extensive set of standard performance indicators "to measure both what is being accomplished with U.S. Government foreign assistance funds and the collective impact of foreign and host-government efforts to advance country development."42 Prior to this initiative, the State Department, which traditionally had managed a much smaller aid portfolio than USAID, is said to have made a de facto decision not to evaluate its assistance programs on a systematic basis.43 As a result, the data collected through the "F process," which remains in place today, allow for a marked improvement in aid transparency, demonstrating comprehensively where and for what purpose aid funds are allocated by State and USAID as of FY2006.44 However, the demands of F process reporting were believed by some to have interfered with more results-oriented evaluation work at USAID, and a 2008 assessment of State's evaluation capacity found that several bureaus, including those that manage State's security assistance programs, still had little or no evaluation capacity.45 

    The structural reforms of the F Bureau came at a time of heightened congressional scrutiny of foreign aid. In 2004, Congress established the Helping to Enhance the Livelihood of People (HELP) Around the Globe Commission, through a provision in P.L. 108-199, to independently
 review foreign assistance policy decisions, delivery challenges, methodology, and measurement
 of results. After nearly two years of work, the HELP Commission released its report in late 2007.
 On the subject of evaluation, the report noted that "everyone to whom members of the
 Commission spoke about monitoring and evaluation expressed concern about the inadequacy of
 the existing process" and concluded that "unless our government better evaluates projects based
 on the outcomes they achieve, it will not improve the effectiveness of taxpayer dollars."46 The
 commission recommended creation of a unified foreign assistance policy, budgeting, and
 evaluation system within State, quite similar to the F process, which was established before the
 report was released. Other HELP Commission recommendations included ensuring that
 evaluation strategies use control groups and randomization as much as possible; considering new
 evaluation methods, such as the use of professional associations or accreditation agencies; and
 building, in collaboration with other donors, the capacities of recipient governments to provide
 reliable baseline data.47
    At the same time the F Bureau was established, and the HELP Commission was active, the
 international donor community began to prioritize aid effectiveness, sparking renewed interest in
 rigorous impact evaluation (see the "A Global Perspective on Aid Evaluation" text box below).
 Some aid professionals viewed the F process as an opportunity to build a cross-agency aid
 evaluation practice focused on impact, and were disappointed that the common indicators used by
 the F Bureau, while an improvement with respect to comparability, measured outputs rather than
 impact. Furthermore, the use of more rigorous evaluation methodologies was not a focus of the
 reform. These issues were revisited by the Obama Administration when it embarked in 2009 on a
 Quadrennial Diplomacy and Development Review (QDDR) to examine how State and USAID
 could be better prepared for current and future challenges. As a result of that review, the
 Administration committed itself in December 2010 to several principles of foreign assistance
 effectiveness, including "focusing on outcomes and impact rather than inputs and outputs, and
 ensuring that the best available evidence informs program design and execution."48 The QDDR
 became the basis of many recent and ongoing changes at State and USAID, including the creation
 of a new Office of Learning, Evaluation and Research at USAID and a new USAID evaluation
 policy, which took effect in January 2011. State followed suit and adopted an evaluation policy
 similar to that of USAID in February 2012. These policies are discussed later in this report.

    The Millennium Challenge Corporation is a relative newcomer to foreign assistance, and has a
 very limited evaluation history. Nevertheless, since its establishment in 2004, MCC has been
 regarded by many as a leader in aid evaluation, largely as a result of its demanding evaluation
 policy. MCC provides funding and technical assistance to support five-year development plans,
 called "compacts," created and submitted by partner countries. Since its inception, MCC policy
 has required that every project in a compact be evaluated by independent evaluators, using pre-intervention baseline data. MCC has also put a stronger emphasis on impact evaluation than State
 and USAID; of the 25 MCC impact evaluation plans (not completed evaluations) made publicly
 available, 11 employ a rigorous randomized control trial methodology rarely used by other aid
 agencies.49 MCC to date has released five evaluations, all related to specific farmer training
 activities, and has not completed any final compact evaluations. A GAO report on the first two
 completed MCC compacts suggests that significant changes were made to the original evaluation
 plans, raising questions about whether the agency's practices will reflect its policy over the long
 term.50
46
Beyond Foreign Assistance: The HELP Commission Report on Foreign Assistance Reform, The United States
Commission on Helping to Enhance the Livelihood of People (HELP) Around the Globe Commission, December 7,
2007, p. 15.
47
HELP Report, p. 99.
48
QDDR, p. 110.
    
      
        
          
            MCC's First Impact Evaluations

            MCC released its first set of independent impact evaluations on October 23, 2012.51 While the evaluations all look at
 farmer training activities, and reflect a small portion of MCC compacts in the respective countries (Armenia, Ghana,
 El Salvador, Honduras, and Nicaragua), they were much anticipated in the development community as harbingers of
 the success or failure of MCC's evidence-based approach to evaluation. The evaluation results were mixed. MCC
 reports meeting or exceeding output and outcome targets for most of the evaluated activities, but not seeing
 measurable changes in household incomes, which was the intended impact. The reports also describe some problems
 with evaluation design and implementation. Many development experts praised MCC's transparency about both the
 successes and shortcomings of its programs, and apparent commitment to continuous improvement.52 The evaluation
 reports were published in full on MCC's website, along with MCC analysis of lessons learned (e.g., phased
 implementation doesn't work well on a tight schedule, as delays undermine the entire evaluation model) and
 questions raised (e.g., should the assumption that increased farm income leads to increased household income be
 reconsidered?). According to at least one development professional, this first set of evaluations is a "game changer"
 that has set a new standard for development agencies.53

          
        
      
    
    
    Evaluation Challenges
    The current evaluation emphasis on measuring impact and broader learning about what works is
 not new; as discussed above, it was the basis of USAID evaluation policy in the 1970s and at
 various times since. Nevertheless, a 2009 meta-evaluation of U.S. foreign aid programs indicated
 that rigorous impact evaluation—the kind that could determine with credibility whether a specific
 aid intervention or broader sector strategy worked to produce a specific development outcome—
was rarely attempted. Of the 296 evaluations reviewed, only 9% reported on a comparison group
 and only one used an experimental design involving randomized assignment, the method most
 likely to produce accurate data.54 A 2005 review of USAID evaluations (focused on democracy
 and governance programs) found that "as a group, they lacked information that is critical to
 demonstrating the results of USAID projects, let alone whether the projects were the real cause of
 whatever change the evaluation reported."55 This gap between evaluation goals and actual practices has been documented repeatedly over the history of U.S. foreign assistance; so too have
 the challenges that make it difficult for implementers to achieve ideal evaluation practices in the
 field. Some of these challenges are discussed below.
49
See http://www.mcc.gov/pages/activities/activity/impact-evaluation.
50
Millennium Challenge Corporation: Compacts in Cape Verde and Honduras Achieved Reduced Targets, GAO-11-728, pp. 32-38.
51
MCC’s statement on the release, which summarizes the findings, is available at
http://www.mcc.gov/pages/press/release/statement-102312-evaluations.
52
Statements of various leaders in the development community with respect to the MCC evaluations are available at
http://www.modernizeaid.net/2012/10/23/mfan-statement-new-evaluations-advance-transparency-and-providevaluable-guidance-for-future-programs/.
53
See comments of William Savedoff from the Center for Global Development at
http://blogs.cgdev.org/mca-monitor/2012/11/the-biggest-experiment-in-evaluation-mcc-and-systematic-learning.php.
54
Trends in International Development Evaluation Theory, Policies and Practices, USAID, 17 August 2009, p. 46.
55
Trends in International Development Evaluation Theory, Policies and Practices, USAID, 17 August 2009, p. 13.
The report was prepared for USAID by Molly Hageboeck of Management Systems International.
OTI Consolidation in Colombia, 2007-2011
A 2011 evaluation of USAID's Office of Transition Initiatives (OTI) Integrated Governance Response Program (IGRP) in Colombia demonstrates the difficulty in quantifying the success of certain types of foreign aid. The IGRP was intended to strengthen the government of Colombia's credibility and legitimacy in communities once controlled by rebels, a process known as "consolidation." When the Colombian military re-established control over a community, OTI provided funds and technical assistance to support rapid-response community-based projects, such as school rehabilitation, and small income-generation programs, such as providing agricultural inputs, designed to increase citizen confidence in, and cooperation with, the government. The loosely defined objectives and ex-post approach to evaluation, however, made it difficult to determine the program's effectiveness. As the evaluation report notes, without a defined endpoint for the consolidation process or concrete indicators for what constitutes success, the evaluation is "necessarily impressionistic in nature." While a more rigorous evaluation methodology would be possible with better planning (for example, using a pre-intervention survey as a baseline to measure changing attitudes), it may not be practical. Rapid response was a key element of the OTI approach, which focused on citizens seeing an immediate and beneficial impact of government control, and delay for the sake of rigorous evaluation design could have undermined that strategy. Evaluators used literature reviews, interviews, and site visits to find that the program was a success because it "nurtured a mindset" among both Colombians and Americans working on consolidation that is valuable in achieving policy objectives in conflict zones.56
56
All information in this text box is based on USAID/OTI’s Integrated Governance Response Program in Colombia, A
Final Evaluation, produced for USAID by Caroline Hartzell, Robert Lamb, Phillip McLean and Johanna Mendelson
Forman, April 2011. Direct quotes, in order of appearance, are from pages 20 and 13.
57
The Developmental Effectiveness of Untied Aid, OECD, p. 1, available at
http://www.oecd.org/dataoecd/5/22/41537529.pdf.
Emphasis on Accountability of Funds. Aid evaluations in recent years have primarily focused
on accountability of funds because that is what stakeholders, including Congress, generally ask
about. Concerned about corruption and waste, bound by allocation limits, and required by law to
report on various aspects of aid administration, implementing agencies have developed
monitoring, evaluation, and data collection practices that are geared toward tracking where funds
go and what they have purchased rather than the impact of funds on development or strategic
objectives. For example, the F Bureau’s Foreign Assistance Framework, launched in 2006, was
created largely to address the information demands of stakeholders, who wanted more data on
how aid funds are being spent. It worked, to the extent that it is now easier to find information on
how much aid is being spent in a given year on counterterrorism activities in Kenya, for example,
or on agricultural growth programs in Guatemala.62 But little if any of the resulting data addresses
the impact of aid programs. If stakeholders had instead expressed sustained interest in aid impact,
the so-called “F process” may have taken a different form.
Methodological Challenges. In the complex environment in which many aid projects are carried
out, it can be challenging to employ high quality evaluation methods. U.S. agency policies allow
Mixed Objectives. The U.S. foreign assistance program has dozens of official objectives written into statute, and many aid programs are designed to meet multiple objectives. Often there are both strategic objectives and development objectives attached to an aid intervention, which may or may not be acknowledged in budget and planning documents. For example, assistance to Uzbekistan may be requested and appropriated for specific agriculture sector activities, but may be motivated primarily by a desire to secure U.S. overflight privileges for military aircraft bringing troops and supplies to Afghanistan. An evaluation of the agricultural impact may be of no use to policymakers who are more interested in the strategic goal, nor to aid professionals who are unlikely to view any lessons learned in these circumstances as applicable to agricultural development projects in a less politically affected environment. Another example is the Food for Peace program, which provides U.S. agricultural commodities to countries facing food insecurity. One objective of the program is to feed hungry people, but long-standing requirements that most of the food be provided by U.S. agribusiness and be shipped by U.S.-flagged vessels make clear that supporting the U.S. agriculture and shipping industries is a program objective as well, and a potentially conflicting one. Studies have shown that the buy and ship America provisions, as they are known, may lessen the hunger-alleviation impact of food aid by up to 40%.57 
Despite the political and diplomatic considerations that arguably underlie the majority of foreign aid, strategic evaluations that examine those objectives are rare (or at least not publicly available). This may be understandable, as such evaluations would often be politically and diplomatically sensitive. Nevertheless, evaluation that focuses only on the development or humanitarian impact of a particular program or project, when broader strategic objectives are drivers of the aid, may largely miss the point. 
Funding and Personnel Constraints. The more rigorous and extensive an evaluation, the costlier it tends to be, both in funds and staff time. Impact evaluations are particularly costly and require specially trained implementers. Absent a directive from agency leadership, aid implementers are unlikely to make resources available for evaluation at the expense of other program components. As one internal USAID review explained, "since USAID's development professionals have limited staff, limited budget, and copious priorities, unfortunately, due to lack of training on the crucial role of evaluation in the development process, most have chosen to eliminate evaluation from their programs."58 Competitive contracting plays a role as well. At a time when most program implementation is contracted out, and cost is a key factor in winning contract bids, some argue that there is little incentive to invest in the up-front costs, such as baseline surveys, of a well-designed evaluation plan in the absence of an enforced requirement.59 As a result, ad hoc evaluations of limited scope and learning value—as one report describes it, the "do the best you can in three weeks" approach—often prevail by default.60 "It is rare," according to one report, "that the resources provided for an evaluation are sufficient to develop and apply more rigorous research methods that would produce valid empirical evidence regarding outcomes and attributable impact."61 Sometimes the limited resource is personnel, rather than funding. Reviews of assistance evaluation repeatedly cite lack of trained evaluation personnel as a problem. 
Emphasis on Accountability of Funds. Aid evaluations in recent years have primarily focused on accountability of funds because that is what stakeholders, including Congress, generally ask about. Concerned about corruption and waste, bound by allocation limits, and required by law to report on various aspects of aid administration, implementing agencies have developed monitoring, evaluation, and data collection practices that are geared toward tracking where funds go and what they have purchased rather than the impact of funds on development or strategic objectives. For example, the F Bureau's Foreign Assistance Framework, launched in 2006, was created largely to address the information demands of stakeholders, who wanted more data on how aid funds are being spent. It worked, to the extent that it is now easier to find information on how much aid is being spent in a given year on counterterrorism activities in Kenya, for example, or on agricultural growth programs in Guatemala.62 But little if any of the resulting data addresses the impact of aid programs. If stakeholders had instead expressed sustained interest in aid impact, the so-called "F process" may have taken a different form.
Methodological Challenges. In the complex environment in which many aid projects are carried out, it can be challenging to employ high quality evaluation methods. U.S. agency policies allow for a variety of evaluation methods (see Appendix A), acknowledging that the most rigorous
 methods are not always practical. Sometimes it is impossible to identify a comparable control
 group for an impact evaluation, or unethical to exclude people from a humanitarian intervention
 for the purpose of comparison. Sometimes the goals are intangible and cannot be accurately
 documented through metrics. For example, it may be much harder to measure the impact of
 programs such as the Middle East Partnership Initiative, designed to strengthen relationships, than
 to measure more concrete objectives, such as reducing malaria prevalence. This may be one
 reason why reviews have found that global health assistance has a stronger evaluation history
 than other aid sectors;63 disease prevalence and mortality rates lend themselves to quantification
 better than military personnel attitudes towards human rights or the strength of civil society.
 Rigorous methodology can also limit program flexibility, as making program changes mid-course, in response to changed circumstances or early results, can compromise the evaluation
 design. Even MCC, with its emphasis on rigorous evaluation, has chosen to use less rigorous
 qualitative methods for certain projects that do not, in the agency's opinion, lend themselves to
 quantitative evaluation.64

58 An Evaluation of USAID’s Evaluation Function, p. 5.
59 Beyond Success Stories, p. 16.
60 Ibid.
61 Ibid.
62 Foreign aid data from FY2006-FY2012 estimates, sorted by recipient country, year, agency (only State, USAID and
MCC), appropriations account, and objective is readily available through the “Foreign Assistance Dashboard” at
http://www.foreignaid.gov.

    Even when metrics and baselines are well established, it can still be very difficult to attribute
 impact to a specific U.S. aid intervention when such programs are often carried out in the context
 of a broader trade, investment, political, and multi-donor environment.65 Also, some aid
 professionals see broader drawbacks to rigorous impact evaluation methods. Some assert that the
 use of randomized control groups, which generally require the use of independent evaluators,
 limits the participation of affected individuals and communities in project design. They argue that
 community participation in project planning and evaluation, which can lead to greater buy-in and
 local capacity building, is more valuable in the development context than high-quality evaluation
 findings.66 Others counter that more participatory methodologies are often weakened by bias, and
 that it is unwise and even unethical to replicate programs, which may profoundly affect
 participants, without having properly evaluated them.67
    Compressed Timelines. While development assistance, in particular, is recognized as a long-term endeavor, aid strategies can be trumped by political pressures, which can influence
 evaluation. In 2001, a USAID survey report stated that "the pattern found was that evaluation
 work responds to the more immediate pressures of the day."68 Policymakers facing relatively
 short budget and election cycles do not always allow adequate time for programs to demonstrate
 their potential impact. Such pressures have only increased over the last decade, particularly in the
 politically charged environments of Iraq, Afghanistan, and Pakistan. As a Senate Foreign
 Relations Committee majority-staff report on aid to Afghanistan found, "the U.S. Government has strived for
 quick results to demonstrate to Afghans and Americans alike that we are making progress. Indeed,
 the constant demand for immediate results prevented the implementation of programs that could
 have met long-term goals and would now be bearing fruit."69

63 Beyond Success Stories, p. 9.
64 Millennium Challenge Corporation: Compacts in Cape Verde and Honduras Achieved Reduced Target, GAO-11-728, p. 33.
65 The QDDR states that “we know that in many cases the outcome-level results are not solely attributable to U.S.
government investments and activities; we will focus on outcome-level progress in locations and subsectors where the
U.S. government is concentrating support.” (QDDR 2010, p. 104).
66 A Can of Worms, p. 8; Beyond Success Stories, p. 17.
67 Improving Lives Through Impact Evaluation, p. 15.
68 Evaluation of Recent USAID Evaluation Experiences, p. 26.
69 S.Prt. 112-21, Evaluating U.S. Foreign Assistance to Afghanistan, June 8, 2011, p. 14.

    The type of evaluation necessary to determine whether aid has real impact is both hard to do and
 of limited use in a short-term context. Timelines are particularly restrictive for MCC, which
 originally intended to complete evaluations during the compact implementation period. This goal,
 which reflects broad support for limited timeframes on foreign assistance, was found not to be
 feasible during implementation of MCC's first compacts in Cape Verde and Honduras.70 Baseline
 data and evaluation models can be rendered worthless if program timelines change. For example,
 an MCC evaluation of a farmer training program in Armenia found that the planned impact
 evaluation model—a phased roll-out—was compromised by a delay in implementing one
 component of the program and the five-year compact timeline.71

            Sector Evaluation Example: Trade Capacity Building

            Many analysts have suggested that cross-country
 evaluations of aid for a specific sector may be more
 useful for shaping policy than the more common
 individual project evaluations. One example of this
 approach is an evaluation commissioned by USAID to
 look at the impact of 256 U.S. trade capacity building
 (TCB) assistance projects in 78 countries from 2002 to
 2006. The United States obligated about $5 billion during
 this period for TCB activities, through several federal
 agencies, including assistance to help developing
 countries strengthen their public institutions and policies
 related to trade, as well as programs to make private
 industries more knowledgeable about and competitive in
 global markets. The evaluation was designed after the
 fact, making a randomized controlled trial unfeasible, and
 had to account for variations in reporting across
 projects. Much of the report highlights anecdotal
 examples of issues that could not be analyzed
 systematically as a result of inconsistent data collection
 methodologies across projects. However, using
 regression analysis, evaluators found a relationship
 suggesting that each additional $1 invested in U.S. aid
 (from all agencies) for TCB is associated with a $53
 increase in the value of recipient country exports two
 years later. For TCB aid specifically managed by USAID,
 the relationship was $1 invested for $42 in increased
 exports. No similar association was found between TCB
 assistance and recipient country imports or foreign
 direct investment. While this evaluation's methodology
 was not sufficient to demonstrate actual aid impact or
 causation, its findings may be useful to policymakers
 both in demonstrating a correlation between TCB aid and
 export growth and in forming the basis of a discussion about the comparative advantages of various
 U.S. agencies in managing TCB aid.72

70 Millennium Challenge Corporation: Compacts in Cape Verde and Honduras Achieved Reduced Target, GAO-11-728, p. 33.
71 Measuring Results of the Armenia Farmer Training Investment, October 23, 2012, p. 4, available at
http://www.mcc.gov/documents/reports/results-2012-002-1196-01-armenia-results-country-summary.pdf.

    Country Ownership and Donor Coordination. The United States and other
 aid donor countries have made pledges in recent years to both coordinate their efforts and
 increase recipient country control, or "ownership," over the planning of aid projects and the
 management of aid funds. The QDDR also promotes these objectives.73 Country ownership is
 believed by many to increase the odds that positive results will be sustained over time both by
 ensuring aid projects are consistent with recipient priorities and by helping to build the budget
 and project management capacity of recipient country governments and non-governmental
 organizations (NGOs) that administer the assistance. Donor coordination of assistance efforts is
 supposed to promote efficiency, ease administrative burdens on aid recipients, and avoid
 duplication, among other things. USAID, as part of its ongoing procurement reform process, aims
 to channel 30% of aid directly to governments and local organizations in developing countries by
 2015. However, greater country ownership, and the pooled funds that may result from donor
 coordination, generally means diminished donor control, and a lesser ability to evaluate how U.S.
 funds contributed to a particular outcome. Accountability concerns often greatly overshadow the
 learning aspects of evaluation in such a context, as Congress has expressed concern about the
 heightened potential for corruption and mismanagement when funds flow directly to recipient
 country institutions.
    Security. Over the past decade, a significant percentage of foreign aid has been allocated to
 countries where security concerns have presented major obstacles to implementing, monitoring
 and evaluating foreign aid. A 2012 evaluation of a USAID agricultural development program in
 rural Pakistan, for example, states "the operating environment for development projects has been
 especially testing in recent years in the presence of an insurgency and frequent targeted killings
 and kidnappings."74 Development staff in Afghanistan and Iraq have not always been able to
 safely visit project sites to verify that a structure has been built or supplies delivered, much less
 be out on the streets conducting the types of surveys that certain evaluations would normally call
 for. A 2011 USAID Inspector General report noted that more than half of performance audits in
 Iraq indicated security concerns. In the most insecure environments, monitoring and evaluation of
 aid programs have often fallen by the wayside. Even in less hostile environments, security
 concerns can undermine evaluation quality. For example, a 2011 evaluation of Office of
 Transition Initiatives governance activities in Colombia noted that "security considerations
 limited to some degree the evaluation team's freedom to interview community members in
 project sites at will. This fact made it difficult to be certain that field research did not suffer from
 a form of sampling bias."75 While security challenges may weigh against the use of aid in certain
 regions, the most insecure places are sometimes where the U.S. foreign policy interests are
 greatest, and policymakers must consider whether the risk of being unable to evaluate even the
 performance of an aid intervention is worth taking for other reasons.

72 From Aid to Trade: Delivering Results. A Cross-Country Evaluation of USAID Trade Capacity Building, prepared for
USAID by Molly Hageboeck of Management Systems International, November 24, 2010; Executive Summary.
73 Leading Through Civilian Power, U.S. Department of State, Quadrennial Diplomacy and Development Review,
2010, p. 95.
74 United States Assistance to Balochistan Border Areas: Evaluation Report, Prepared by Management Systems
International for USAID, January 16, 2012, p. vi.
75 USAID/OTI’s Integrated Governance Response Program in Colombia, Final Evaluation, prepared by Caroline
Hartzell et al., April 2011, p. 7.

    Agency and Personal Incentives. Given discretion in the use and conduct of evaluations,
 observers have noted the inclination of foreign assistance officials to avoid formal evaluation for
 fear of drawing attention to the shortcomings of the programs on which they work. While agency
 staff are clearly interested in learning about program results, many are reportedly defensive about
 evaluation, concerned that evaluations identifying poor program results may have personal career
 implications, such as loss of control over a project, damage to professional reputation, budget
 cuts, or other potential career repercussions.76 As explained by one USAID direct-hire in response
 to a survey, "if you don't ask [about results], you don't fail, and your budget isn't cut."77 That
 same study revealed that staff felt more pressure to produce success stories than to produce
 balanced and rigorous evaluations, and that "professional staff do not see any Agency-wide
 incentive to advance learning through evaluations."78 Few observers consider risk taking and
 accepting failure as a necessary component of learning to be hallmarks of USAID or State
 Department culture. MCC's institutional attitude toward adverse results may be tested in the
 coming year, as its first evaluations are made public.


    Applying Evaluation Findings to Policy

    A consistent theme in past reviews of foreign aid evaluation practices is that even when quality
 evaluation takes place, the resulting information and analysis are often not considered and applied
 beyond the immediate project management team. Evaluations are rarely designed or used to
 inform policy. Lack of faith in the quality of the evaluation, irregular dissemination practices, and
 resistance to criticism may all contribute to this problem, as does lack of time on the part of aid
 implementers and policymakers alike to read and digest evaluation reports. A survey of U.S. aid
 agencies found that "bureaucratic incentives do not support rigorous evaluation or use of
 findings," "evaluation reports are often too long or technical to be accessible to policymakers and
 agency leaders with limited time," and learning that takes place, if any, is "largely confined to the
 immediate operational unit that commissioned the evaluation."79 The shift in recent decades
 towards the use of contractors and implementing partners for most project implementation, and
 most project evaluation, may also impact the learning process. As one report notes, "partner
 organizations are learning from the experience, but USAID is not," and most evaluation work
 does not circulate beyond the partner.80
    The lack of a "learning culture," as some describe it, has been a perennial criticism that agencies
 appear to have been largely unsuccessful in addressing in the past, though the prominent "lessons
 learned" sections in the first batch of MCC evaluations may set a new standard. Some assert that
 outside pressure, such as a legislative mandate, may be necessary. Congress expressed some
 interest in this issue with the Initiating Foreign Assistance Reform Act of 2009 (H.R. 2139 in the
 111th Congress), which called for "a process for applying the lessons learned and results from
 evaluation activities, including the use and results of impact evaluation research, into future
 budgeting, planning, programming, design and implementation of such United States foreign
 assistance programs." No such requirements were enacted in the 111th Congress, but the May
 2012 memorandum from OMB, calling on all agencies to use evaluation data in their FY2014
 budget submissions, may have similar impact.81

76 Evaluation of Recent USAID Evaluation Experiences, p. 22.
77 Ibid., p. 24.
78 Ibid., pp. 26-27.
79 Beyond Success Stories, p. iv.
80 Evaluation of Recent USAID Evaluation Experiences, p. 27.

    The learning aspect of evaluation relies heavily on agency culture, which may be shaped more by
 leadership than policy. The effective application of evaluation information depends also on the
 details of implementation, such as evaluation questions being based on the information needs of
 policymakers and program managers, and information being presented in a format and to a scale
 that is useful. Policymakers, for example, may be much better able to make actionable use of a
 meta-evaluation of microfinance programs, presented in a short report highlighting key findings,
 than a whole database of detailed analysis of single projects, the results of which may or may not
 be more broadly applicable. Experts have pointed out that individual project evaluations, even
 when well done, do not roll up nicely into a document showing what works and what does not.
 They contend that for maximum learning, an effort must be made at the cross-agency or even
 whole-of-government level to develop evaluation meta-data that is responsive not only to the
 needs of a project manager interested in the impact of a particular activity, but also to agency
 leadership and policymakers who want to know, more broadly, what foreign assistance is most
 effective. This view has been reflected in legislation introduced in recent years, including the
 Foreign Assistance Revitalization and Accountability Act of 2009 (S. 1524 in the 111th Congress),
 which called for the creation of a Council on Research and Evaluation of Foreign Policy to do
 cross-agency evaluation of aid programs.

    As important as evaluation can be to improving aid effectiveness, not every aid project has broad
 learning potential. Knowing which potential evaluations could have the greatest policy
 implications may be key to maximizing evaluation resources. Many USAID projects, for
 example, are designed as small-scale demonstrations, with no intention that they be scaled up or
 replicated elsewhere. In other situations, an approach may have already been well proven. In such
 instances, a basic performance evaluation for accountability may be appropriate, but rigorous
 evaluation may be a poor use of resources. A 2012 USAID "Decision Tree for Selecting the
 Evaluation Design" asks staff to first consider whether an evaluation is needed, and to decline to
 evaluate if the timing is not right, if there are no unanswered questions for the evaluation to
 address, or if there is no demand from stakeholders.82

    Current Agency Evaluation Policies

    The primary U.S. government agencies managing foreign assistance each have their own distinct
 evaluation policies, but these policies have come into closer alignment in the last two years. The
 Quadrennial Diplomacy and Development Review (QDDR) report of December 2010 stated the
 intent that USAID would reclaim its leadership role with respect to evaluation and learning, and
 referenced a new USAID evaluation policy in the works to reflect the growing demand for results
 data and attempt to address some persistent evaluation challenges. That policy took effect January
 2011. The State Department followed suit in February 2012 with a new evaluation policy that is
 similar in many respects to the USAID policy, and MCC updated its policy in May 2012.
 Appendix A compares key provisions of the current evaluation policies of USAID, State, and
 MCC.

81 This memo is discussed in the text box on page 2. See Use of Evidence and Evaluation in the FY2014 Budget,
Memorandum to the Heads of Executive Departments and Agencies, Jeffrey D. Zients, Acting Director, Office of
Management and Budget, May 18, 2012.
82 Decision Tree for Selecting the Evaluation Design, USAID, June 2012, p. 1, available on USAID’s Development
Experience Clearinghouse website.

    The new State and USAID policies share much in common, balancing the costs and expected
 gains from evaluation. For example, both require performance evaluations of all larger-than-average projects and experimental/pilot projects, but not all projects. Both also include a target
 allocation of funds for program evaluation: 3% for USAID and 3%-5% for State. The policies
 share an emphasis on accessibility of information, with provisions to promote consistent and
 timely dissemination of evaluation reports. In their introductory language, both policies
 emphasize the learning benefits of evaluation, in addition to accountability. The USAID policy is
 notably more detailed than State's on many of the issues. The USAID policy establishes required
 features for evaluation reports, and specifies that evaluation questions be identified in the design
 phase of projects, issues which the State policy does not address. USAID states that most
 evaluations will be conducted by third party contractors or grantees, to promote independence,
 while State's policy does not explicitly mention use of independent evaluators. State's evaluation
 reporting requirements also focus on internal dissemination, while USAID requires public
 availability. According to State officials, however, many of these issues are fleshed out in
 subsequent internal guidance documents and the State and USAID policies, in practice, differ
 only on the use of impact evaluation. USAID's policy calls for impact evaluation whenever
 feasible, while the State policy sets a clear expectation that impact evaluation will be rare.83
    MCC's evaluation policy shares many elements of the State and USAID policies, but goes farther
 in many respects. MCC requires independent evaluations of all compact projects, using indicators
 and baselines established prior to project implementation. It may be, however, that first-hand
 experience with the challenges of evaluation is bringing MCC policy and practice closer to that of
 USAID over time. MCC's 2012 policy revision adopts definitions from USAID's 2011 evaluation
 policy and includes a new section on institutional learning. The update also appears to move
 closer to the USAID model with respect to impact evaluation, calling for impact evaluations
 "when their costs are warranted," whereas the previous iteration referred to independent impact
 evaluations as an "integral part" of MCC's focus on results.84 The MCC policy still appears to
 have the strongest enforcement mechanism among the three agency policies, conditioning the
 release of quarterly disbursements on substantial compliance with the policy. USAID's policy, in
 contrast, calls only for occasional compliance audits, and State's policy does not address
 compliance at all.

    While some experts have called for greater uniformity of evaluation practices across agencies to
 allow for comparative analysis, others view the differences in State, USAID, and MCC evaluation
 policies as reflecting the different experience, scope of work, and priorities of the agencies.
 USAID, with the largest and most diverse assistance portfolio among the agencies, and numerous
 small projects, may require a more flexible approach to evaluation than MCC, which is narrowly
 focused on economic growth and recipient government ownership. At State, foreign assistance is
 just one part of a broader portfolio (including diplomatic activities), potentially impacting what
 type and scope of evaluation is useful or possible.

83 Author’s communication with State officials via e-mail, October 10, 2012.
84 Policy for Monitoring and Evaluation of Compacts and Threshold Programs, MCC, May 1, 2012, p. 18; Policy for
Monitoring and Evaluation of Compacts and Threshold Programs, MCC, May 12, 2009, p. 17.

    These current evaluation policies represent a step towards improving knowledge of foreign
 assistance measures of effectiveness at the program or project level, and increasing transparency
 of the evaluation process. They do not, however, attempt to establish a systemic approach to aid
 evaluation that would make country-wide, sector-wide, or cross-agency evaluation of aid more
 feasible. They look similar to earlier initiatives to improve aid evaluation. Many aspects of the
 new USAID policy, for example, are strikingly similar to the required actions called for in the
 2005 cable to USAID missions (e.g., evaluation planning as part of all program designs,
 designated evaluation officers at each post, and set-aside evaluation funds). It is too early to know
 whether this new initiative will have more real or lasting impact than its predecessors. The State
 Department policy has only recently taken effect. MCC just released its first five project
 evaluation reports in October 2012,85 and has yet to produce a compact evaluation. USAID, a
 year into implementation of its policy, reports that insufficient time has passed to document any
 changes in evaluation quality, as no evaluations have gone from start to finish under the new
 requirements. However, the quantity of USAID evaluations has increased notably, from 89 in
 2010 to 295 in 2011,86 and the agency aims to complete 250 "high quality" evaluations by
 January 2013.

85 See http://www.mcc.gov/pages/activities/activity/impact-evaluation.
86 USAID Evaluation Policy: Year One, First Annual Report and Plan for 2012 and 2013, p. 2.

            A Global Perspective on Aid Evaluation

            U.S. foreign assistance evaluation efforts have evolved in the context of a global movement by public and private aid
 donors to improve aid effectiveness, with improved evaluation practices as one of many strategies. Representatives of
 aid donor countries meet regularly under the auspices of the OECD Development Assistance Committee (DAC) to
 discuss evaluation practices, among other things, as a means of implementing the aid effectiveness agenda laid out in
 the 2005 Paris Declaration on Aid Effectiveness and the 2008 Accra Agenda for Action. A 2010 OECD/DAC survey
 and report on evaluation in the development agencies of major donor countries highlighted several issues that are
 common to U.S.-specific aid evaluation.87 The report found a heavy reliance on measuring outputs, but also a trend
 toward measuring aid impact and larger strategic questions of development effectiveness. It identified new emphasis
 on dissemination of evaluation findings, and found that while bilateral aid agencies on average allocated 0.1% of their
 development assistance budget to evaluation, lack of human resources—people qualified to do rigorous impact
 evaluations, evaluations of direct budget support, or requiring specific language skills, in particular—presented a bigger
 obstacle to evaluation goals than did financial constraints.
 
            Non-governmental organizations have focused on evaluation in recent years, as well. In 2004, an Evaluation Gap
 Working Group was convened by the Center for Global Development with support from the Bill & Melinda Gates
 Foundation and the William and Flora Hewitt Foundation. The Working Group focused on why rigorous impact
 evaluations of development assistance were so rare. The resulting report, “"When Will We Ever Learn?,”" is a key
 resource for this report. The group made two recommendations: (1) that donors invest more in their own evaluation
 capacity, and (2) that an independent institution be created to evaluate aid.8888 The offshoot of the latter
 recommendation is the International Initiative for Impact Evaluation (3ie), established in 2009, with a mission to use
 impact evaluations, specifically, to generate high quality evidence for use in shaping effective development policies. 3ie
 both funds evaluations and produces extensive materials on evaluation methods, implementation practices, and
 application to policy, as a means to improve evaluators' technical capacity. USAID and MCC are official partners of
 3ie, as are many other official aid agencies, private foundations, and non-profit organizations such as the Hewlett and
 Gates foundations and Save the Children.

 
          
        
      
    
    Issues for Congress

    While recent momentum on foreign aid evaluation reform has originated within the
 Administration, Congress may have significant influence on this process. Not only can Congress
 mandate or promote a certain approach to evaluation directly through legislation, as has been
 proposed, it can modulate Administration policies by controlling the appropriations necessary to
 implement the policies. Congress may also influence how, or if, the information resulting from
 evaluations will impact foreign assistance policy priorities. These issues are discussed in greater
detail below.
    Reform Authorization Legislation. In the 112th Congress, legislation was introduced that focused specifically on foreign aid evaluation. The Foreign Aid Transparency and Accountability
 Act of 2012 (H.R. 3159; S. 3310) sought to evaluate the performance of U.S. foreign assistance
 programs and improve program effectiveness by requiring the President to establish guidelines on
 measurable goals, performance metrics, and monitoring and evaluation plans for foreign
 assistance programs that can be applied on a consistent basis across implementing agencies.89
 The legislation also called for the creation of a website, within two years of
 enactment, that would make detailed, program-level information on foreign assistance, including
 country strategies, budget documents, budget justifications, actual expenditures, and program
 reports and evaluations, available to the public. The general focus of these proposals was similar in many respects to the F Process initiated years ago by the State Department, but would codify the requirements and extend them across the various federal agencies that administer aid programs. The benefit of such broad uniformity,
 arguably, is that it could enable policymakers, the public, and other stakeholders to better
 compare the activities of various agencies and get a more comprehensive picture of total U.S.
 foreign assistance. A potential drawback is the effort and expense required to impose such
 uniformity on agencies with different objectives, management structures, and information
 technology systems. These proposals also focused on transparency and accountability rather than
 effectiveness, and did not promote the use of impact evaluation. If performance evaluation
 continues to comprise the vast majority of aid evaluations, such a cross-agency requirement may
 provide comparable information on aid management from agency to agency, but is not likely to
 facilitate comparative analysis of what aid is most effective. H.R. 3159 was approved by the House in December 2012, but no action was taken by the Senate before the 112th Congress adjourned. Similar legislation has not yet been introduced in the 113th Congress, but likely will be.
    Appropriations for Enhanced Evaluation. Increasing the number and quality of foreign aid
 evaluations, while potentially cost effective in the long run, requires an investment of resources.
 For the most part, evaluation costs are integrated into program accounts at the various
 implementing agency budgets and are not scrutinized specifically by Congress. However,
 USAID, in conjunction with its new policy, started in the FY2012 budget request to identify
 resource needs for centralized evaluation and learning through a "Learning, Evaluation and
 Research" (LER) line item. LER is one of the seven focus areas of the USAID Forward reform
 agenda, and is intended both to enhance USAID's ability to conduct rigorous evaluations and to
 apply the knowledge gained through evaluation to improve future assistance strategies and
 design. The Administration requested $19.7 million for this purpose, through the Development
 Assistance appropriations account, for FY2012. Congress provided $12.26 million. For FY2013,
 USAID requested $26.67 million, to expand the number of priority evaluations it can carry out,
 improve staff training, and support evaluation collaborations with international partners. The
 ultimate funding level established by Congress, together with any related legislative directives,
 may play a role in determining the extent of the Administration's efforts to strengthen evaluation
 practice.
 
    Impact of Evidence-Based Approach on Congressional Priorities. Congress has long exerted
 control over foreign assistance not only through appropriated funds and restrictions, but also by
 directing foreign assistance funds to certain sectors, countries, or even specific projects through
 bill or report language. For example, the committee reports accompanying the FY2013 House and
 Senate State-Foreign Operations appropriation proposals (H.Rept. 112-494; S.Rept. 112-172),
 like most of their predecessors, provide specific funding levels for microfinance, basic education,
 water and sanitation, women's leadership training, people-to-people reconciliation programs in
 the Middle East, and other sectors of particular interest to Members of Congress. Should credible
 information about the relative effectiveness of these programs be made available as a result of
 improved evaluation practices, Congress can weigh the importance of the data, among other
 drivers, in establishing aid priorities. Some congressional directives on aid are less likely than
 others to be affected by evaluation results. The availability of actionable evaluation data may not
 result in a maximization of aid effectiveness, but may allow Congress to make more deliberate
 trade-offs between effectiveness and other objectives.

    Conclusion
    The primary U.S. agencies charged with implementing foreign assistance have made significant
 steps in the last two years to address ongoing deficiencies in evaluation practices that make it
 difficult to judge whether foreign assistance is achieving its various objectives. There is
 widespread agreement, reflected in new policies, on the need for consistent performance
 evaluation of aid programs. The value of rigorous impact evaluation is broadly recognized as
 well, though the agencies differ in their capabilities and aspirations in this respect. Past policies
 and evaluation reform efforts, however, have been similarly focused but not sustained in the face
 of persistent challenges, many of which remain today. Other reforms, such as the establishment of
 centralized evaluation processes or the creation of an independent evaluation entity, have been
 proposed in legislation yet not addressed in agency policies. Growing emphasis in Congress and
 the Administration on results-based budgeting, as well as movement within the international aid
 donor community toward more rigorous aid evaluation practices, may provide the context for
 future change. The 113th Congress will have multiple opportunities to influence how U.S. foreign
 assistance is evaluated through legislative proposals, appropriations, and oversight activities.


 
    Appendix A. Select Aspects of Current USAID,
 State Department, and MCC Evaluation Policies
Effective Date
   USAID: January 2011
   State: February 15, 2012
   MCC: May 1, 2012

Responsible Personnel
   USAID: PPL/LER is responsible for system implementation, while missions and functional bureaus are responsible for conducting evaluations. All Bureaus and operating units must designate an evaluation point of contact.
   State: F and RM Bureaus monitor and report on evaluation plans. Each Bureau should identify a senior staffer to serve as evaluation point of contact.
   MCC: Primary lead is MCA (host country entity) M&E, with input from MCC M&E.

Evaluation Requirement
   USAID: Operating units must conduct at least one performance evaluation of each project that equals or exceeds average project size. Projects involving an untested hypothesis or new approach, and that are anticipated to expand in scale or scope, will undergo an impact evaluation, if feasible. All evaluations will share certain basic features, including a full description of methodology; standardized recording and maintenance of records from evaluation; evaluation findings based on facts, evidence, and data; sex-disaggregated data; and an explanation of the limitations of the data. Key evaluation questions will be identified during the design phase of every project.
   State: All programs/projects/activities greater than or equal to the median size (generally using dollar value as the measure) for the Bureau must be evaluated at least once in their lifetime or every five years, whichever is less. All pilot programs must be evaluated once every five years. Each Bureau must evaluate 2 to 4 projects/programs/activities in FY2012-FY2013, with this requirement extending to all posts in the FY2013-FY2014 period.
   MCC: All Compacts and Threshold Agreements include monitoring and evaluation plans, which identify the evaluations to be conducted for each project, the key evaluation questions and methodologies, and the data collection strategies that will be used. Final evaluations are required for all projects in a Compact upon completion or termination; mid-term evaluations are discretionary. Selected indicators must have baselines established prior to the start of the corresponding activity.

Evaluation Type
   USAID: Emphasis on quality evaluation methods and favoring random assignment/experimental methods for impact evaluations when feasible.
   State: Bureau's discretion, based on context, but the policy establishes an expectation that the "great majority" of evaluations will be performance evaluations because "impact evaluations are more time consuming, costly, and often difficult to successfully design for State programs, projects and activities."
   MCC: Impact evaluations performed "when their costs are warranted by the expected accountability and learning."

Evaluator Type
   USAID: Policy states that most evaluations will be conducted by third party contractors or grantees managed by USAID, but evaluation teams may be composed primarily of USAID staff, led by an outside expert, when it is determined that this will facilitate institutional learning.
   State: Suggests that evaluators should be "free from any pressure and/or bureaucratic interference," but does not explicitly call for the use of outside evaluators.
   MCC: Independent evaluators required for final evaluations of Compacts. Mid-term compact evaluations and final threshold program evaluations can be done independently or by MCC/MCA staff.

Funding Requirement
   USAID: Recommends an average 3% of program budgets be dedicated specifically to external evaluation, distinct from monitoring. Resources for evaluation should be concentrated on large projects and those that are innovative or pilot approaches.
   State: Program managers "should identify resources of up to 3-5% for evaluation activities."
   MCC: Does not specify a portion of funds that should be used for evaluation.

Reporting Requirement
   USAID: Public availability of evaluation reports and summaries, within 3 months of completion, on the Development Experience Clearinghouse website.
   State: Bureaus and posts must electronically transmit final evaluation reports as cables and post reports on their OpenNet or ClassNet websites.
   MCC: MCAs must post their approved Compact M&E plans on their website. MCC and MCAs must "regularly" publish results information on their websites.

Compliance Enforcement
   USAID: PPL/LER will organize occasional external technical audits of operating unit compliance with the policy.
   State: No reference to compliance enforcement.
   MCC: Substantial compliance required for approval of quarterly disbursements requested by recipient country.

Source: Policy for Monitoring and Evaluation of Compacts and Threshold Programs, MCC, May 1, 2012; Department of
 State Evaluation Policy, Bureau of Resource Management, February 23, 2012; Evaluation: Learning from Experience,
 USAID Evaluation Policy, January 2011.
        Notes: PPL/LER = USAID Office of Learning, Evaluation and Research; F Bureau = Office of Foreign Assistance
 Resources; RM = State Department Bureau of Resource Management; MCA = the Millennium Challenge Account
 implementing entity in each compact country; M&E = monitoring and evaluation. The information in the table
 refers only to what is in the actual evaluation policy document of each agency, as cited above. Information
 available outside of these documents, which may provide greater details about aspects of the policies, is not
reflected here.


Author Contact Information
Marian Leonardo Lawson
Analyst in Foreign Assistance
mlawson@crs.loc.gov, 7-4475

      
    
    
    
    
  
  
    
  
  
    Footnotes
1. U.S. Department of State, Quadrennial Diplomacy and Development Review, 2010, Leading Through Civilian Power, p. 103.
2. For more information about the MCC model, see CRS Report RL32427, Millennium Challenge Corporation, by [author name scrubbed].
3. Statement of USAID Administrator Rajiv Shah to The Cable, as reported in The Cable, June 13, 2012.
4. While not often discussing evaluation policy per se, some Members appear to be influenced in their policy decisions by their sense of what aid is working and what is not. For example, when introducing her subcommittee's FY2013 proposal at full-committee mark-up on May 17, 2012, House State-Foreign Operations Appropriations Subcommittee Chairwoman Kay Granger remarked that the legislation "only supports programs that work." Senator Lindsey Graham of the Senate State-Foreign Operations Appropriations Subcommittee, explaining the sharp reduction in aid for Iraq in the Senate's FY2013 proposal at a May 22, 2012, mark-up, said "there's no point in throwing good money after bad."
5. For historic information on foreign aid spending, see CRS Report R40213, Foreign Aid: An Introduction to U.S. Programs and Policy, by [author name scrubbed] and Marian Leonardo Lawson.
6. When Will We Ever Learn?: Improving Lives Through Impact Evaluation, Report of the Evaluation Gap Working Group, Center for Global Development, May 2006, p. 1.
7. According to U.S. Overseas Loans and Grants, 21 U.S. Government agencies reported disbursing foreign assistance in FY2010. See http://gbk.eads.usaidallnet.gov/data/fast-facts.html.
8. For more on current GPRA requirements, see CRS Report R42379, Changes to the Government Performance and Results Act (GPRA): Overview of the New Framework of Products and Processes, by [author name scrubbed].
9. Use of Evidence and Evaluation in the FY2014 Budget, Memorandum to the Heads of Executive Departments and Agencies, Jeffrey D. Zients, Acting Director, Office of Management and Budget, May 18, 2012.
10. Foreign Assistance Act of 1961 (P.L. 87-195), §101(a).
11. Ibid.
12. FAA, as amended, §481(a)(1)(C).
13. FAA, as amended, §491(a).
14. FAA, as amended, §572 (1) and (2).
15. Several examples of this are discussed in Economic Gangsters: Corruption, Violence and the Poverty of Nations, by Raymond Fisman and Edward Miguel, Princeton University Press, 2008.
16. See Dambisa Moyo, Dead Aid: Why Aid is Not Working and How There Is a Better Way for Africa, Farrar, Straus and Giroux, New York, 2009, p. 48.
17. Beyond Success Stories: Monitoring and Evaluation For Foreign Assistance Results, Evaluator Views of Current Practice and Recommendations for Change, by Richard Blue, Cynthia Clapp-Wincek and Holly Benner, May 2009, p. ii.
18. For a thorough, yet non-technical, discussion of the use of impact/attribution evaluation, see "An introduction to the use of randomized control trials to evaluate development interventions," by Howard White, International Initiative for Impact Evaluation, Working Paper 9, February 2011.
19. Clemens, Michael. "Impact Evaluation in Aid: What For? How Rigorous?" Presentation at the Overseas Development Institute, July 3, 2012, video recording available at http://www.cgdev.org/content/multimedia/detail/1426372/.
20. Trends in Development Evaluation Theory, Policies and Practices, USAID, 17 August 2009, p. 4.
21. The USAID Evaluation System: Past Performance and Future Direction, Bureau for Program and Policy Coordination, USAID, September 1990, p. 9.
22. Evaluation Handbook, Office of Program Evaluation, USAID, November 1970, p. 40.
23. Experience – A Potential Tool for Improving U.S. Assistance Abroad, U.S. Government Accountability Office, GAO-ID-82-36, June 15, 1982, p. i (summary).
24. The History of CDIE, CDIEHIST.017/SESmith;JREriksson/10-17-94, p. 4; available through the Development Experience Clearinghouse on the USAID website.
25. The Community-Based Family Planning Services Family Planning Health and Hygiene Project, prepared by Bruce Carlson, MSPH, and Malcolm Potts, M.D. under the auspices of The American Public Health Association, USAID, 1979, pp. 5, 7.
26. Ibid.
27. The A.I.D. Evaluation System: Past Performance and Future Directions, Bureau for Program and Policy Coordination, Agency for International Development, September 1990, p. 10.
28. Ibid., p. 11.
29. Ibid., p. 11.
30. Accountability and Control Over Foreign Assistance, GAO/T-NSIAD-90-25, March 29, 1990, pp. 6, 11. The review found that military assistance managed by State and the Department of Defense was also inadequately monitored and accounted for.
31. The History of CDIE, p. 6; The A.I.D. Evaluation System, p. 11.
32. Ibid., pp. 6-7.
33. Ibid., p. 8.
34. The Role of Evaluation in USAID, Performance Monitoring and Evaluation TIPS, USAID CDIE, 1997, Number 11, p. 3.
35. Beyond Success Stories, p. 7; Evaluation of Recent USAID Evaluation Experience, Cynthia Clapp-Wincek and Richard Blue, Working Paper No. 320, U.S. Agency for International Development, Center for Development Information and Evaluation, June 2001, p. 31.
36. Evaluation of Recent USAID Evaluation Experience, p. 5. The report authors note that while some of the declining numbers can be attributed to missions not submitting their evaluations to the Development Experience Clearinghouse, as policy required, making the specific numbers unreliable, the trend of decline is unmistakable.
37. Evaluation of Recent USAID Evaluation Experience, p. 12.
38. The Evaluation of USAID's Evaluation Function: Recommendations for Reinvigorating the Evaluation Culture Within the Agency, Janice M. Weber, Bureau for Program and Policy Coordination, USAID, September 2004, pp. 5, 10.
39. Actions Required to Implement the Initiative to Revitalize Evaluation in the Agency, UNCLAS STATE 127594, July 8, 2005.
40. For an overview of this evaluation, as well as links to related studies, see http://www.povertyactionlab.org/evaluation/primary-school-deworming-kenya.
41. Roetman, Eric. A Can of Worms? Implications of Rigorous Impact Evaluations for Development Agencies, International Initiative for Impact Evaluation, Working Paper 11, March 2011, p. 5.
42. See http://www.state.gov/f/indicators/index.htm. It was originally expected by many that the F Bureau would eventually track all foreign assistance provided by U.S. agencies, not just State and USAID. As of 2012, some MCC data has been added to the Bureau's public database (www.foreignassistance.gov), but there does not appear to be momentum toward any expansion of F Bureau authority.
43. Beyond Success Stories, p. 14. The State Department traditionally has used a variety of resources for monitoring its foreign assistance programs, including Mission and Bureau Strategic Plans, annual performance and accountability reports, and Office of Inspector General and Government Accountability Office reports, but had no systematic evaluation process (Department of State Program Evaluation Plan, FY2007-2012 Department of State and USAID Strategic Plan, Bureau of Resource Management, May 2007, Appendix II).
44. The data is publicly available at http://www.foreignassistance.gov.
45. Beyond Success Stories, p. 8.
46. Beyond Foreign Assistance: The HELP Commission Report on Foreign Assistance Reform, The United States Commission on Helping to Enhance the Livelihood of People (HELP) Around the Globe Commission, December 7, 2007, p. 15.
47. HELP Report, p. 99.
48. QDDR, p. 110.
49. See http://www.mcc.gov/pages/activities/activity/impact-evaluation.
50. Millennium Challenge Corporation: Compacts in Cape Verde and Honduras Achieved Reduced Target, GAO-11-728, pp. 32-38.
51. MCC's statement on the release, which summarizes the findings, is available at http://www.mcc.gov/pages/press/release/statement-102312-evaluations.
52. Statements of various leaders in the development community with respect to the MCC evaluations are available at http://www.modernizeaid.net/2012/10/23/mfan-statement-new-evaluations-advance-transparency-and-provide-valuable-guidance-for-future-programs/.
53. See comments of William Savedoff from the Center for Global Development at http://blogs.cgdev.org/mca-monitor/2012/11/the-biggest-experiment-in-evaluation-mcc-and-systematic-learning.php.
54. Trends in Development Evaluation Theory, Policies and Practices, USAID, 17 August 2009, p. 46.
55. Trends in International Development Evaluation Theory, Policies and Practices; USAID, 17 August 2009, p. 13. The report was prepared for USAID by Molly Hageboeck of Management Systems International.
56. All information in this text box is based on USAID/OTI's Integrated Governance Response Program in Colombia, A Final Evaluation, produced for USAID by Caroline Hartzell, Robert Lamb, Phillip McLean and Johanna Mendelson Forman, April 2011. Direct quotes, in order of appearance, are from pages 20 and 13.
57. The Developmental Effectiveness of Untied Aid, OECD, p. 1, available at http://www.oecd.org/dataoecd/5/22/41537529.pdf.
58. An Evaluation of USAID's Evaluation Function, p. 5.
59. Beyond Success Stories, p. 16.
60. Ibid.
61. Ibid.
62. Foreign aid data from FY2006-FY2012 estimates, sorted by recipient country, year, agency (only State, USAID and MCC), appropriations account, and objective is readily available through the "Foreign Assistance Dashboard" at http://www.foreignaid.gov.
63. Beyond Success Stories, p. 9.
64. Millennium Challenge Corporation: Compacts in Cape Verde and Honduras Achieved Reduced Target, GAO-11-728, p. 33.
65. The QDDR states that "we know that in many cases the outcome-level results are not solely attributable to U.S. government investments and activities; we will focus on outcome-level progress in locations and subsectors where the U.S. government is concentrating support." (QDDR 2010, p. 104).
66. A Can of Worms, p. 8; Beyond Success Stories, p. 17.
67. Improving Lives Through Impact Evaluation, p. 15.
68. Evaluation of Recent USAID Evaluation Experience, p. 26.
69. S.Prt. 112-21, Evaluating U.S. Foreign Assistance to Afghanistan, June 8, 2011, p. 14.
70. Millennium Challenge Corporation: Compacts in Cape Verde and Honduras Achieved Reduced Target, GAO-11-728, p. 33.
71. Measuring Results of the Armenia Farmer Training Investment, October 23, 2012, p. 4, available at http://www.mcc.gov/documents/reports/results-2012-002-1196-01-armenia-results-country-summary.pdf.
72. From Aid to Trade: Delivering Results. A Cross-Country Evaluation of USAID Trade Capacity Building, prepared for USAID by Molly Hageboeck of Management Systems International, November 24, 2010; Executive Summary.
73. Leading Through Civilian Power, U.S. Department of State, Quadrennial Diplomacy and Development Review, 2010, p. 95.
74. United States Assistance to Balochistan Border Areas: Evaluation Report, Prepared by Management Systems International for USAID, January 16, 2012, p. vi.
75.
            
        
           USAID/OTI's Integrated Governance Response Program in Colombia, Final Evaluation, prepared by Caroline Hartzell et al., April 2011, p. 7.
76.
            
        
           Evaluation of Recent USAID Evaluation Experiences, p. 22.
77.
            
        
           Ibid., p. 24.
78.
            
        
           Ibid., pp. 26-27.
79.
            
        
           Beyond Success Stories, p. iv.
80.
            
        
           Evaluation of Recent USAID Evaluation Experiences, p. 27.
81.
            
        
           This memo is discussed in the text box on page 2. See Use of Evidence and Evaluation in the FY2014 Budget, Memorandum to the Heads of Executive Departments and Agencies, Jeffrey D. Zients, Acting Director, Office of Management and Budget, May 18, 2012.
82.
            
        
           Decision Tree for Selecting the Evaluation Design, USAID, June 2012, p. 1, available on USAID's Development Experience Clearinghouse website. 
83.
            
        
           Author's communication with State officials via e-mail, October 10, 2012.
84.
            
        
           Policy for Monitoring and Evaluation of Compacts and Threshold Programs, MCC, May 1, 2012, p. 18; Policy for Monitoring and Evaluation of Compacts and Threshold Programs, MCC, May 12, 2009, p. 17.
85.
            
        
           See http://www.mcc.gov/pages/activities/activity/impact-evaluation.
86.
            
        
           USAID Evaluation Policy: Year One, First Annual Report and Plan for 2012 and 2013, p. 2.
87.
            
        
           Evaluation in Development Agencies, Better Aid, OECD Publishing, 2010, available at http://dx.doi.org/10.1787/9789264094857-en.
88.
            
        
           When Will We Ever Learn?: Improving Lives Through Impact Evaluation, Report of the Evaluation Working Group, Center for Global Development, May 2006.
89.
            
        
           The House and Senate proposals were similar but not identical. For example, H.R. 3159, as passed by the House, called for evaluation guidelines to be applied "with reasonable consistency," while S. 3310 called for the guidelines to be applied "on a uniform basis."