Transcript of An Evaluation Workshop on
Planning and Constructing Performance-based Evaluations
by
Joseph S. Wholey
and
John A. McLaughlin
Project Directors' Annual Meeting
Renaissance Washington DC Hotel
Washington, DC
June 10, 1998
This is a product of the National Transition Alliance for Youth with Disabilities (NTA), Cooperative Agreement Number H158M50001. The NTA is jointly funded by the U.S. Departments of Education and Labor, including the Office of Special Education and Rehabilitative Services, and the National School-to-Work Office. Contents of this document do not necessarily reflect the views or policies of the Departments of Education or Labor, nor does the mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government.
The partners of the National Transition Alliance are the Transition Research Institute at the University of Illinois, the National Transition Network, Institute on Community Integration at the University of Minnesota, the Disabilities Studies and Services Center at the Academy for Educational Development, the Council of Chief State School Officers, the National Alliance of Business, and the National Association of State Directors of Special Education. Collaborators of the National Transition Alliance are equal opportunity employers and educators.
This product was produced for the National Transition Alliance by the Transition Research Institute at the University of Illinois.
Published October 1998.
Foreword
The demand for federally funded program evaluations began to be heard in the 1960s during the Great Society as a result of large social programs (e.g., education, housing, health, criminal justice) initiated by President Kennedy and expanded under Presidents Johnson and Nixon. This demand for program evaluation continues and is reinforced with the recent passage of the Government Performance and Results Act (GPRA) of 1993. GPRA requires all agencies within the federal government to carry out evaluations of the programs and projects they administer. This means that all federally funded programs and projects must now collect and report information on the effectiveness and efficiency of their activities.
As noted in the Government Performance and Results Act, there is declining public confidence in government's ability to adequately address vital public needs; there are increasing calls to improve the efficiency of federally funded programs; and congressional policy-making, spending decisions, and program oversight are seriously hampered by inadequate information on program performance and insufficient attention to results.
It is time for directors and managers of federal and state funded programs to design evaluations that articulate program goals, measure program performance against those goals, and provide information useful for improving program performance. Project directors and managers need to be able to tell others outside of their programs about the value of their programs and to report on service quality, customer satisfaction, and most importantly, results.
In keeping with the intent of GPRA, and in support of the federal government's efforts to ensure that programs are properly evaluated, the National Transition Alliance provides evaluation technical assistance to directors of OSERS funded programs, including model transition projects, school-to-work systems, and state system change grantees. This transcript of a workshop on planning and constructing performance-based evaluations is intended to give you something that you will be able to use as a guide to design and implement your own performance-based evaluation. I hope that this transcript will provide you with information that will enable you to manage programs more effectively and efficiently and that, ultimately, will help you to improve your program's performance.
Thomas E. Grayson, Director
National Transition Alliance at the University of Illinois
Champaign-Urbana
Table of Contents
Welcome and Introduction
Workshop Objectives
Essential Evaluation Questions
The Government Performance and Results Act (GPRA) - History and Politics
GPRA's Beginning
GPRA: Context, Purposes, and Requirements
Implementing the Results Act
Implementing and Managing a Performance-based Plan
Theory of Change
Bridge or Information Spectrum
Logic Models
Use of Logic Models
Logic Model Elements
Inputs, Activities, Outputs, Outcomes, and External Factors
External Factors
Examples of Logic Models
Group Exercise in Constructing a Logic Model
Workshop Participants: Responses to Group Exercise
Measuring Performance
Collect the Right Information
Report Results and Meaningful Information
Why Collect Performance Information?
The Measurement Challenge
External Influences
Uncertainties
Validity Assumption
Determine What is Essential
Relevance Realism
Logic Model Chart
Key Terms
Design Evaluation and Implementation Evaluation
Customer Feedback
Measurable Objectives
Conclusion
Questions From Workshop Participants
Appendix
A - Critical Evaluation Questions
B - Government Performance and Results Act
C - Effectively Implementing the Results Act
D - References
E - Theory of Change
F - Performance Information Spectrum
G - Using Logic Models in Developing Performance Measurement Systems
H - Typical Logic Model: Inputs, Activities, Outputs, Outcomes, and External Factors
I - Program Logic Model Example
J - Example of Logic Models
K - Exercise: the Logic Model Shuffle (United Way of America)
L - Simple Logic Chart and Simple Logic Chart Example
M - Key Terms
N - Components of a Measurable Objective
O - The Evolution of a Good Objective
P - Remaining Handouts
PLANNING AND CONSTRUCTING
PERFORMANCE-BASED EVALUATIONS
THOMAS E. GRAYSON: We are very fortunate to have two experts on performance-based evaluation speaking today, Joe Wholey and John McLaughlin. Both are very experienced and knowledgeable about GPRA, the Government Performance and Results Act. Both have conducted training at the federal, private agency, state, and school district levels.
Joe Wholey is Professor of Public Administration at the University of Southern California and Senior Advisor for Evaluation Methodology at the U.S. General Accounting Office. His work focuses on performance-based management and accountability in the public and not-for-profit sectors. His interest is in the use of strategic planning and performance measurement to improve policy decision-making, government performance, and public confidence in government.
John McLaughlin is a colleague, not only in evaluation but also in special education and research. He has 25 years of experience in this field of evaluation and design, and has worked extensively with private foundations to help them design performance-based evaluations and look at the results of their programs.
Workshop Objectives
JOHN A. McLAUGHLIN: You have in your handout our objectives. We have long-term and short-term objectives because we knew that we weren't going to be able to wave a magic wand and have you up to speed with regard to performance-based management. In terms of long range, there are two critical objectives we're reaching for. The first is to enable you to better communicate the worth, the value of your programs--the results, the pay-off.
The second, and I think this is the most important aspect of the Government Performance and Results Act, is to enable you to develop the skills to manage your programs based on the appropriate feedback in order for you to improve your programs and make mid-course corrections when your data tell you to do that. That's one of the biggest changes in the evaluation philosophy as it relates to projects like the ones that you work on.
We have three short-term objectives. First of all, we want to give you a quick background of the rationale at the national level for performance-based management. We know that you've heard a lot about the Government Performance and Results Act, so we're just going to give you a quick overview to connect to that, because it's an important driving force.
Secondly, we want to give you a tool, a process, to enable you to describe your program in a logical fashion so that you can communicate to people not only what you're trying to accomplish, but also how you're going to accomplish it.
And then thirdly, if that's the logical pathway to success, we're going to help you to think through what the appropriate measures are that you want to take to monitor progress and assess success.
Note how those short-term objectives connect to the long-term objectives. That's kind of an implicit way of guiding us to what we're really trying to accomplish with you today.
So if those are our dependent variables, what are our independent variables? As an orienting framework we're going to talk about four questions that we think you want to address when you think about designing a program and when you're creating the evaluation mechanism around that program. We'll also talk about the bigger national picture with respect to the Government Performance and Results Act. I want to build a connection between the GPRA and OSERS and the requirements of GPRA, point out why you're so important to OSERS, and then move into some didactic exercises regarding building a logic model for your program. Then we will finally move into performance measurement.
Essential Evaluation Questions
I want to start with the essential evaluation questions presented in your handout as an organizing framework for this workshop (see Appendix A). These questions were not generated while sitting in our offices and thinking about the Government Performance and Results Act, thinking about performance-based management. Rather, they are a part of what we have learned as we've worked with people such as yourselves, attempting to respond to those requirements. We have also looked at what kinds of questions Congress is asking. Is there a major stakeholder in the Government Performance and Results Act? Yes: not only Congress, but also people at the state level and people at the local level are major stakeholders as well. We have to remember that performance-based management came from the localities. It didn't necessarily come from the federal government, and so these are questions that we think all program managers need to address.
The biggest question is: Am I addressing the right results? Now, what does that mean? In your programs, you are interested in transition from secondary schools to the world of work for persons with special needs. You're also interested in the factors that contribute to, that lead to, that cause a person to be able to transition to the world of work with ease. On the other hand, what are the factors that serve as barriers? What are the factors that serve as driving forces to that successful transition?
It's these factors that you address in your projects. What we want to describe is what Michael Patton would call a validity assumption. The funding agency (e.g., OSERS) wants you to be able to demonstrate that the relationship between these factors and the right results (i.e., transitioning from school to work) is reasonable, that it is valid. So that's the first question that we want you to address.
If we can determine that we're addressing the right factors, then our next challenge is to address four other critical questions. The first is communicating to people what our program is trying to achieve, for whom, when, through what strategies, and with what resources. We've got to tell them what we're trying to change, in what people, how we're going to do it, and with what resources.
Secondly, we know that there are a lot of other people who are working with the same population as we are. Therefore, we need to figure out who our partners are and how we can partner with them. For example, if we want to reach that long-term goal of transition to the world of work for individuals with disabilities, then, we must try to do all we can to prepare students with disabilities to enter the world of work. However, if someone isn't out there working with the employers to be receptive to employing these persons, our program is dead in the water, or certainly less effective. So, we need to figure out who our partners are (e.g., employers) and how we can partner with them.
After communicating what it is we're trying to accomplish, thirdly, we need to address the question: How will I know how well I am doing? We want to be able to tell people how we're going to monitor and evaluate our program.
The fourth and final question is: How are we doing now? In other words, we want to give the person who's reviewing our project a status check. This is where we're starting out from. This is the jumping-off point. This is where we are as we begin our program.
So these are the four other critical questions that we want people to answer once they've established the answer to that big question: Are we addressing the right results?
We've always had the elements of planning and conducting a performance evaluation. We've always had to demonstrate the need. We have always had to demonstrate what factors are related to the problem. We have always had to include a plan of action; we have had to have some operational objectives. We had to have tasks, and so forth. The big change is the link. No longer do we address these elements individually; we now have to build a connection among all of those elements, including results or intended outcomes, and a logical argument for our project.
Now, I would like to ask Joe to talk a little bit about the Government Performance and Results Act.
The Government Performance and Results Act (GPRA)
History and Politics
JOSEPH S. WHOLEY: This is just a quick overview. You've heard for the last couple of days a lot about the Government Performance and Results Act, also known as GPRA. It came, as John said, from the state and local level, and also from experience in other countries around the world that were leading the United States in this regard for performance and results information.
GPRA's Beginning
I've been associated with GPRA since 1990. The person who pushed for it was the former mayor of Sunnyvale, California, John Mercer, who was working for Senator William Roth of Delaware at the time, a member of the Republican minority in the United States Senate. Mercer was trying to interest Senator Roth in a piece of legislation that would try to accomplish at the federal level what Mercer had seen the city manager accomplish in Sunnyvale. That is, to run the whole government as a set of projects with measurable goals and measures of success, and so forth. I was enlisted in Mercer's army. Before that I was very interested in the topic. I am an evaluator by background, and I had noticed that evaluations are done rarely and used very infrequently.
I've been interested for the past 15 years or so in repetitive measurement of results as perhaps a better way to get information used, especially to improve programs. That is my main interest, to improve programs. So as you learn how this history of GPRA develops, you will see a certain set of purposes for the statute emerge.
Senator Roth became interested, and he put this statute in a bill. Many pieces of legislation introduced in the Congress never go anywhere. This bill at one time was called the Bang for the Buck Act, an engaging title for a proposed federal statute.
Some hearings were held. I testified a couple of times, as I'm a member of the National Academy of Public Administration. John Glenn was chairman of the committee, and he also became interested. Other people also started to show interest -- managers, managers of agencies and programs. John Glenn was going to sign on and co-sponsor the legislation, and so the people from OMB, the Office of Management and Budget, quickly redrafted Senator Roth's bill. Senators Glenn and Roth as co-sponsors reintroduced it as a much more management-friendly statute, no longer simply the Bang for the Buck Act, but now with features that would interest managers. Freedom from red tape and so forth is part of the spirit and even the letter of the law to help people achieve better results.
The bill passed the Senate of the United States unanimously in 1992, a Republican Senator getting a statute through the Senate unanimously. It was considered non-controversial, good government legislation. It was not considered in the House, and it died at the end of the 102nd Congress.
Some people elected in the '92 election showed interest in this bill as a total quality management (TQM) move, so the new administration put its muscle behind this. This time the House considered the bill. It passed the House and Senate unanimously in '93, and was signed by the President in '93.
I recommend that you read it some day. It's very short. I will give you the reference to it later on. It was considered very sensible, non-controversial, and well drafted, good government legislation, Public Law 103-62. That's the statute that is driving a lot of activity now around town.
Two successive Directors of the Office of Management and Budget, Alice Rivlin, who brought me over to OMB, and Frank Raines, who just left OMB, were very interested in the statute as a way to improve budget choices and program accomplishment, so they pushed it very hard.
In the '96 election, Republicans, who had closed down the government of the United States of America a few times, were stunned by the electorate -- almost lost the House -- so the House Majority Leader began to look around for a "kinder, gentler," more friendly House of Representatives. He seized on the Government Performance and Results Act as a way of providing a more professional form of oversight.
So we have divided government, a Republican Congress and a Democratic administration as twins working together on this: to find a way to demonstrate to the American people what is being accomplished with the resources entrusted to us.
The public, which is unwilling to tax itself, still wants good services. So those are the pressures, and the Congress of the United States noticed those things and came up with this statute.
GPRA: Context, Purposes, and Requirements
These are the contexts, purposes and requirements of the statute (see Appendix B). I think they are goals that you can identify with also. You want to produce better results, better programs and "improved public accountability." I call it communicating the value of what we do, which John announced as the first long-term objective for the workshop today. You can call it improved public accountability, or you can be a little more proactive and say we need to learn how to communicate the value of what we do.
Most people who run programs are inarticulate, tongue-tied. They know they're doing something important, but they can't convince other people of it. This statute provides an opportunity, a vehicle to communicate to others the value of what you do.
Most resource allocation decision-making in a time of finite and restricted resources goes on right within the agency. For example, within the United States Department of Education there are a lot of trade-offs, because they have to meet a budget mark every year when they propose their budget to OMB. So your program is fighting not only with the highway programs, but also especially with the other education programs for resources. The only place to get resources for education -- and I'm exaggerating slightly -- is from education. We have to learn how to be more cost-effective in our programs.
John didn't want me to take the whole day on this, so I'm going to run through a few of the overheads and skip others, but I wanted to just touch on what the statute requires. Agencies conduct strategic planning, agencies set long-term goals, and agencies say how the goals are going to be achieved. We call what agencies do "strategies." The statute calls them the processes -- how we are going to achieve the goals -- and the statute asks that you identify what resources you need to achieve the goals.
What are the required resources, and what are the key external factors? All those points we're going to talk about this morning are the things we're going to keep track of in a logic model. Every agency produces an annual performance plan with its goals and strategies, and the resources required to achieve those goals. The statute is fair-minded. If the Congress doesn't appropriate every dime that the agencies ask for, the agencies are allowed to go back and revise their goals.
The last piece of the annual plan is very interesting. Why should anybody believe your data, when you actually measure results? What will the agency do to verify and validate the performance data?
After each fiscal year is over, the agency reports results, actual performance compared to the goals and a summary of all the evaluations that were completed in the prior year. So the annual report is not limited to a comparison of actual performance with the goals. It also summarizes results of evaluations that were done. The Congress knows that you can't capture all complicated realities in a few numbers, so the statute is very open to program evaluation, including such information in each year's annual report that the agency will file.
There is a proposal in OMB to combine the last two into one. That is, the agency will tell about its goals for the future year and report how it was doing in the past all in the same document, instead of two separate documents. That's only a refinement. The statute has them in separate places.
Implementing the Results Act
The General Government Division turned out the 118th GAO report in fiscal year '96. This is a maroon document, which describes how federal agencies are successfully doing all the things needed to manage for results (see Appendix C). There are three things to do: (a) define the mission and desired outcomes, (b) measure your performance, and (c) use the information. There are 12 critical practices organized in a harmonious way under steps 1, 2, and 3, and a few supporting practices, 9 through 12 down below. It is a useful and helpful document and I suggest you get a copy.
Other information on GPRA is available from the GAO. The helpful documents the GAO has turned out are available on the Internet, where you can download them for free, or you can call 202-512-6000 and ask them to mail you a copy, also free of charge.
I don't believe in being compliance-oriented. Nor does OMB or the GAO, or anybody in the Congress. Instead, we want to achieve the purposes of the statute: better management, better programs, communicating the value of the programs, better budget choices, and so forth. How can we achieve -- with all of this planning and reporting -- the purposes of the statute? That's what's important.
Since we didn't know exactly how to do this, we had a pilot testing phase. GPRA was adopted in '93. The statute required 10 pilot projects, and we wound up conducting 68 projects; so we had 68 pilot projects on how to set goals and measure results, then we had their strategic plans.
Congress has to be consulted when the agency is developing its strategic plan. I told the Congress, or at least a key staff person working for Dick Armey, you can't be consulted if you haven't seen a draft. The agencies had to turn in, by July, their draft strategic plan.
GAO analyzed the plans, briefed them, and so forth. Congress commented and said, you people are doing a tragically inadequate job in A, B, C, and D, and they scored political points, but they had a fair-minded scoring system. Education, not the best political currency in the Republican Congress, scored very high on their draft strategic plan, their final strategic plan, and their annual performance plan. Education was up there with the Transportation Department.
So, in draft form, the strategic plans were turned in, and then the performance plans go to OMB. They are private, privileged communications. Nobody can see them. Every September the agencies and OMB argue over the budget, and then the public version of the performance plan comes out with the President's budget in February or so each year.
And then, as I told you already, the agency is allowed to revise its performance plan if Congress doesn't appropriate every dime that was requested. This year, because everybody is learning how to do it for the first time, almost every agency will probably revise its plan just to have a better set of goals and strategies.
The annual performance reporting officially is supposed to start in March of the year 2000. In other words we have a little lag period and then have to report what happened in fiscal '99. This first plan is for fiscal '99, and then it goes on. Similarly, the first report is a report on fiscal '99 and then it goes on. But OMB has proposed that we pilot this right away, because many agencies have had measurable goals already, and so we're going to start piloting this performance reporting in February of '99.
I'm going to stop here. I put down some challenges in the document, but this is enough, I think, to orient us for this morning. There are references listed in the handouts (see Appendix D) which you can get a hold of, and we can talk about those things later on. So I will turn the discussion back to John now.
Implementing and Managing a Performance-based Plan
JOHN A. McLAUGHLIN: Joe has just given you the context in which OSERS is operating. It was meant to give you a sense of the pressures that are on OSERS, and give you an idea of why they're asking the questions that they're asking of you.
OSERS has a major problem, and only you can help them out. As Joe indicated, they had to put together a strategic plan. They had to develop an annual performance plan and will have to report the product of their efforts.
Their problem is that all of their programs are implemented beyond their control. For example, they're implemented in your projects. They're implemented in the State Departments of Education and local education agencies, and in other research projects similar to yours, projects and programs that are beyond their influence once they fund them. Although they do monitoring, technical assistance, and so forth, in reality, much of the work that is supposed to achieve their goals happens in your projects.
Theory of Change
In your handouts you have a very busy-looking chart (see Appendix E). I developed it based upon the new revisions of IDEA, trying to get a sense of the big picture, the theory of change, if you will. We're going to talk about that in just a second in terms of, if this is our long-term goal, how do we achieve it?
The point that I want to make is that OSERS achieves its long-term goal through you. OSERS has to develop a formal partnership with you. They have to make explicit what they're going to do with regard to developing the resources to support, drive, and energize your programs. In the other part of that partnership you make some promises that say, essentially, if you give us these resources we will help you achieve this long-term goal.
Bridge or Information Spectrum
One of the points that I wanted to emphasize in this bridge between the context within which OSERS works and what we want to encourage you to do with regard to performance-based management is the new working relationship that you have with OSERS. This bridge or information spectrum is demonstrated in this overhead, which is also in your package (see Appendix F). We develop program goals and initiatives nationally, and those are actually implemented in the end back down at the classroom, and at each one of these levels there are programs.
When you are providing information on results via your final report or your interim reports, you're not reporting only from the perspective of accountability. You're helping OSERS manage your programs. Your information is not only accountability information but management information, because with your information OSERS manages programs, OSERS makes program improvements, they make mid-course corrections. For example, they may give your data that demonstrates a successful practice for transition to the world of work to another section, which will disseminate the data back down to the field to try it out in different environments and different contexts. Rather than the old way we used to think -- let's give them our results so we're accountable for the funds we spent -- the new thinking is, let's give them our results so we can help them manage their programs better, and as a result of their improved management practices, the lives of the people we're interested in will be better.
This is the bridge we have to form with OSERS to enable them to manage their programs and to communicate the value of their programs to Congress and to others, as Joe indicated, in the Department of Education. OSERS needs your data to help them convince people to allocate very scarce resources to the kinds of programs that you need, and in the end that communication of value will bring more resources to you and your colleagues to do the kinds of things that you want to do.
I want to read something that Michael Patton wrote in his latest book, Utilization-Focused Evaluation, as a bridge to the next segment. He has a way of writing that captures the imagination and the spirit of the reader. Aleksii Antedilluvian Prelapsarianov, the oldest living Bolshevik, speaks about a world without theory.
How are we to proceed without theory? What systems of Thought have these reformers to present to this mad swirling, planetary disorganization to the inevident welter of fact, event, phenomenon, calamity? Do they have, as we did, a beautiful theory, as bold, as grand, as comprehensive a construct...?
You can't imagine, when we first read the classic texts, when in the dark, vexed night of our ignorance and terror the seed-words sprouted and shoved incomprehension aside, when the incredible bloody vegetable struggled up and through into the red blooming, gave us praxis, true praxis, true theory married to actual life.
You who live in this sour little world and cannot imagine the grandeur of the prospect we gazed upon: like standing atop the highest peak in the mighty Caucasus and viewing in one all-knowing glance the mountainous, the granite order of creation. You can't imagine it. I weep for you.
In support of the development of theory for our programs, I give you Joe Wholey.
Logic Models
JOSEPH S. WHOLEY: We'll move now to an abbreviated version of the logic model piece. Evaluators have been using logic models since the world began.
In essence, there's nothing so new about them. But we're talking about using them not for evaluation studies but to develop agency plans. I thought of another use of logic models that I might ask you to write in your notes: using logic models to develop goals, strategies for achieving goals, and estimates of required resources to achieve goals.
Use of Logic Models
Before developing measurement systems, we need to develop a reasonable level of agreement on what we're doing, why we are doing it, and what resources we need to do it successfully. It's planning. So the missing bullet in your handout (see Appendix G) is use of logic models in consultative strategic planning, consultative annual performance planning. It isn't just the Congress that needs to be consulted, it's your partners. Often when people do planning they go on a retreat, or lock themselves in a room and develop a plan. That won't do any more. The Results Act emphasizes consultation with the Congress and emphasizes soliciting the views of those affected by or interested in your programs.
Eisenhower said years ago that, "planning is everything; the plan is nothing." The logic model is going to turn out to be a vehicle to help you do a more successful job of energizing what I call "partnerships for results." We can achieve no interesting result by ourselves; any interesting outcome goal needs the help of others to achieve it.
I'm talking about bringing together what I call "big people," policy people, managers, people who control needed resources -- potential partners -- to haggle over and come to agreement on goals, strategies for achieving the goals, and resources required to achieve the goals. So the most important thing on the first sheet (handout) is missing, so you need to write it in. It would say something like: using logic models in developing strategic and annual plans, and especially in developing them in a collaborative fashion where you bring in powerful people whose help you need.
There are several kinds of powerful people whose help you need. First, you need the help of people who influence the allocation of needed resources.
I call those policy-level people, people who have to say yes when we want to do something. They might be in our own university, school district, or community college, or maybe they're in the welfare department. They are the individuals whose help we need; they're the key stakeholders outside our program.
We also need to bring into the discussion on the goals, and so forth, program managers, program staff, either the people we serve or representatives of the people we serve. Those are key stakeholders, too. And then potential partners. Potential partners have to be brought into our setting, be involved in our goals, in developing our strategies as to how we're going to achieve our goals, and in identifying the resources we require to achieve our goals.
We need to do this, not locked in a windowless room where nobody can peek in at us, but in a collaborative fashion with draft plans, second draft, third draft, and so forth, just as I encouraged Congress to demand to see the executive agencies' draft plans. Your partners should see your draft plans and have a chance to comment, and so forth, and the logic model is going to help us engage others in the consultation.
Logic Model Elements
I'm going to give you a couple of examples of logic models, and then we'll do a brief exercise in developing logic models. It all seems simple and easy until we try to do it.
Again, it is an inadequate title, but the first bullet on your handout (see Appendix G) is what I'm talking about. We're going to clarify the expectations and priorities of others that are important to our program success. That is what Dick Armey calls a Results Act frame of mind; it is simply good management.
We have to get a reasonable level of agreement on what our goals are. Frank Raines, who's heading Fannie Mae, suggests walking into an organization and asking people what the priority goals are. See if they all say the same thing or if they say entirely different things. That's a quick way to diagnose and evaluate any organization, to ask the people in it what priority goals they are working toward.
NASA turned out four strategic plans in four years because they're trying to deepen the commitment to shared goals. Commitment to shared goals is what we need. We need a reasonable level of agreement on what we're trying to accomplish.
We also need to be alert to why we might not succeed. Client characteristics, the economy in the local area -- there are many reasons why we might not succeed, or there may be helpful factors in the external environment. We need to keep track of the key factors likely to affect our results.
And naturally, we will explore costs and we will look at different or alternative measurement systems. A typical logic model consists of all these elements.
Inputs, Activities, Outputs, Outcomes, and External Factors
I flip back and forth on what the first element is. I sometimes call it resources (see Appendix H). But I tend to call it inputs now, because it includes staff, dollars, equipment, and so forth. The President and Vice President are fascinated with computers and wiring to connect to the Internet, and so on, but there are other inputs in the way of support. For example, support may come from important people, laws that may be passed at the state, local, or federal level, regulations, and so on. So I don't always call the first part of the logic model resources. There's a broader set of things that can help us achieve success.
Some of them can be captured under the external factors. To have a convenient place to put client characteristics you can just add another bullet. The clients, the people we serve, must be an important part of the theory of change. Their characteristics may vary from one year to the next, so we've got to have a place to put them in the model. I changed the name of the first piece of the model to put in the clients, too, as inputs, but you may prefer, just as John prefers, to keep them as a separate element.
John was also talking about theory of change as including the goals. That is what we are trying to accomplish, which is typically also called end outcomes. In other words, what results do we try to produce?
So, in the theory of change, on the right-hand side of these diagrams (see Appendix I) is what are we trying to achieve; and on the left-hand side we have the resources we need and other inputs; and then, what are we going to do? Teaching, job placement, coaching of people after they're on the job, socializing employers so that they will be public-spirited, or compelling them to be public-spirited with suitable laws, whatever it may be.
There are different processes we go through. We want to win with our program. So what are the things that are needed to achieve our end outcome results?
In the Results Act environment people are all tangled up and puzzled by the distinction between outputs and outcomes. Outputs are physical things: products and services. Outcomes are results that occur in the people we serve, in the community. So there are a lot of people having to learn terminology. Economists call them all outputs. Economists don't make that distinction, so I think it is somewhat ideological.
Evaluators make the distinction. Evaluators and GPRA make the distinction. Interestingly enough, the Results Act puts together, into one thing, teaching and the people taught. Teaching the students is here, and students taught are here. The Results Act squashes them together because it wants to distinguish between process-type stuff and whether anybody is better off as a result, which is called the outcomes.
The key to useful performance measurement, which I probably revealed at your evaluation workshop in Arlington two years ago, is finding intermediate outcome goals, intermediate outcome objectives on which you can demonstrate the results. Intermediate outcomes in the theory of change, in John's terminology, connect the teaching, the job placement services, and so forth, to the end result that we're trying to achieve.
The reason it's the key is that that's where you can establish either accountability by reporting results and showing that you make a difference or, for management purposes, showing if people are better off. If we talk only about the end outcomes, all of our partners and the external factors have such a big influence that we're not going to be able to distinguish what our contribution is.
For example, the Office of National Drug Control Policy has developed a gigantic logic model to show how America's war on drugs looks. It has 150 agencies coming together to haggle over the logic models, the theory of change, how we're going to reduce the supply of drugs in this country, how we're going to reduce the demand for drugs in this country.
Just yesterday, I heard a briefing by the Office of National Drug Control Policy on how important the logic models are to getting agreement on what we are trying to accomplish, and further, on getting subsidiary agreements on how we will measure and demonstrate results. That's what a typical model includes.
External Factors
I like this form of the diagram (see Appendix I), which I came across in a U.S. Department of Transportation workshop years ago. The external factors are up above. They're like the local economic conditions, or the framework of laws and regulations that are going to affect our ability to succeed, or the climate and attitude toward persons with disabilities.
The external factors that are going to influence our success in achieving our goals and achieving our intermediate outcome goals and objectives are up at the top. They are the other forces. I've seen these models drawn, and you should probably draw them this way, too, so that the external factors are gigantic in size, and our little tiny program is down here.
There's a big world out there, and we've got a tiny little program, and the external factors are very important in determining whether we succeed or not. So the external factors should be part of the logic model. The Government Performance and Results Act requires agencies to state which external factors will affect their ability to achieve their strategic goals and objectives. As mentioned, therefore, you will be consulted when the Education Department revises its plan.
The statute requires the plan be revised by September of 2000. There will be a presidential election held at that time. It is highly likely that the agency is going to revise its strategic plan well before September of the year 2000. You can read on the Internet the Education Department's strategic plan. You can read on the Internet the Education Department's fiscal '99 performance plan.
You can see what you think is missing (in particular, what external factors should have been listed as affecting your ability to succeed) and also a lot of other things. The point is that all of this is public information, partly because of the enthusiasm of the Vice President, partly because technology has marched on.
So the logic model has in it the resources or the inputs, the activities and processes, the strategies to achieve the intermediate outcomes, and eventually the end outcomes. The first thing we do is produce some products and services like manuals, research reports, children taught, adolescents taught, whatever they are.
John McLaughlin always talks about what proportion of the target population you are reaching. You could conveniently put that in your output box (see Appendix I). You might even have a measure of what proportion of your target population you are reaching. Typically, written down on sheets of paper somewhere, is information that goes into boxes 2, 3, and 4; and written down on sheets of paper somewhere else is information that goes in box 8; but not written down on any sheets of paper is information for boxes 5, 6, and 7, which are the key to your success in improving your program and in communicating the value of your program to skeptical others.
So you've got a problem here. Nobody knows how the program is trying to succeed, nobody knows the intermediate results that are needed to go from teaching a child, or giving a little bit of job training or whatever, to a young adult, or whatever it might be, to some end result over here. You've got to work all of that out, and these theories of change become very complicated, because you bring in your partners. Among the external factors necessary to succeed would be the work of other people to help your program succeed, so you can list those up at the top, at least in narrative fashion.
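As an editorial illustration (not part of the workshop handouts), the elements described above can be written down as a simple record. This is a minimal sketch in Python; the class name `LogicModel`, the method `missing_elements`, and the example entries are assumptions chosen for illustration, drawn loosely from the transition examples mentioned in this session.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LogicModel:
    """One person's picture of how a program is supposed to succeed."""
    inputs: List[str] = field(default_factory=list)
    activities: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)
    intermediate_outcomes: List[str] = field(default_factory=list)
    end_outcomes: List[str] = field(default_factory=list)
    external_factors: List[str] = field(default_factory=list)

    def missing_elements(self) -> List[str]:
        """Return the names of elements that nobody has written down yet."""
        return [name for name, items in vars(self).items() if not items]


# Hypothetical entries for illustration only.
example = LogicModel(
    inputs=["staff", "dollars", "equipment"],
    activities=["teaching", "job placement", "coaching after placement"],
    outputs=["students taught"],
    end_outcomes=["young adults enter the workforce"],
    external_factors=["local economic conditions", "laws and regulations"],
)

print(example.missing_elements())
```

In this sketch the intermediate outcomes are deliberately left blank, so the check reports the gap that still has to be worked out with partners.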
You do not need to measure anything. This is not a menu for $10 trillion worth of measurement. This is just a picture of how the program is supposed to succeed, and when drawn by different people, not surprisingly, the picture will be different. An interesting thing we use these models for is to compare and contrast logic models; e.g., compare a logic model as seen by one person who may control resources your program needs to live and flourish, with the logic model drawn by your staff, or the logic model drawn by interest groups, (maybe the parents or whatever), or the logic model as drawn by a key state legislator. Of course, they don't draw these models. You have to interview them and get the information from them.
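As an editorial illustration, comparing and contrasting logic models gathered from different people can be sketched as a set comparison. The function name `compare_views`, the stakeholder labels, and the goals listed are all hypothetical; in practice the entries would come from interviews, as described above.

```python
from typing import Dict, Set


def compare_views(views: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Split goals into those everyone named and those only some named."""
    all_goals = set().union(*views.values())
    shared = set.intersection(*views.values()) if views else set()
    return {"shared": shared, "contested": all_goals - shared}


# Hypothetical interview results, one set of goals per stakeholder.
views = {
    "policy-level official": {"more graduates employed", "lower program cost"},
    "program staff": {"more graduates employed", "job-readiness skills learned"},
    "parents": {"more graduates employed", "independent living"},
}

result = compare_views(views)
print(sorted(result["shared"]))
print(sorted(result["contested"]))
```

The goals that land under "contested" are the ones to haggle over when seeking a reasonable level of agreement.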
Examples of Logic Models
I'm going to ask you to develop one of these models in just a second. First, here is an example you have to listen carefully to. It is an example of a car-pooling program. For some reason they call it a ride-sharing program, but I call it a car-pooling program. It's chart 6 in my set of handouts (see Appendix J).
The car-pooling program has in it what the program wants to achieve; typically the end result goals are wonderful. I call them rhetorical goals. Hopefully your program will never be held accountable for your stated goals. It's usually a better America and a better world and a better galaxy, all going to be achieved for $750,000 and seven staff. Better quality of life for everybody in the metropolitan area who drives, or has ever thought of driving, and better air for all the people in the metropolitan area who happen to breathe.
It would be not very desirable to evaluate the program or measure its performance in terms of those goals on the right-hand side. We had better find intermediate outcome goals where the program can demonstrate some success, and maybe even some output goals.
They're going to set up a car-pool program. They're going to try to get people to use the Pentagon's parking lot, which is very big. Maybe the people in car-pools could park in the front row. They do park in the front row at the Pentagon. The generals have to park in the second row.
Since the people in the car-pools can park in the first row, that would be an example of an incentive to encourage people to car pool. Maybe the ride-sharing program convinced the Pentagon to let those who car-pool park in the front row, so that would be a considerable achievement on the part of the program. You could call it an outcome, or you could call it an output. The Transportation Department happened to call it an output. It's a rule. It's a piece of paper that the Pentagon passed saying car-poolers get to park in the front row.
Another example of a logic model is from Harry Hatry (see Appendix J). I had to rewrite a couple of the words, but this logic model represents a water quality program, and the goal of the program -- this is quite interesting -- is not the item furthest along the causal chain. Its goal is better water quality. Of course, these are just bent around, they are really further off to the right-hand side of the diagram, so there's another outcome beyond the stated goal of the program.
They're all outcomes, but the reason we want higher quality water, besides its aesthetic quality -- ugly brown is not as nice as other colors that water might be -- is because of another outcome, healthier fish and healthier people.
We don't like it if we enthusiastically consume raw hard shell clams, for example, then become sick and die. So there's another goal beyond the goal of the program. The inputs are not drawn in here, so they need to be put in. I'll call that box zero. The external factors are not listed, but as you may know storms can foul up sewer systems. Sometimes they overload them, as will an increase in economic activity. Population growth is another external factor that affects the performance of a water quality program.
Group Exercise in Constructing a Logic Model
Now we're going to have a quick exercise that was developed by the United Way of America (see Appendix K). We're going to do this in about 15 minutes; we need to do it in order to have a chance to discuss. I would like three or four of you to group yourselves, and another two or three people in a different group. I'm asking you to do two things. After picking somebody at your table to be your spokesperson the minimal thing you do in the next 15 minutes is the second bullet on this sheet, or the second bullet on the overhead. Do not spend the whole 15 minutes on the first bullet.
The second bullet is the key part of the assignment. Arrange those 10 little cards in a sensible logic model, causal model, in some kind of a program, trying to accomplish something. It has inputs, processes, outputs, and outcomes. There's more than one right answer. If you get the second bullet done in the next 13 minutes, think of and write down on a sheet of paper external factors that would affect the ability of the program to succeed. So we have now 12 minutes.
Can we come back together? We've all been doing the same exercise but as mentioned before, there's more than one right answer to it. A form of merit for your work would be if people do bring up interesting issues, problems, and ways to do this sort of thing.
Workshop Participants: Responses to Group Exercise
We heard people saying, it seemed so easy until we tried to do it, so are there points people would bring up from the tables, either by way of issues, questions, problems, or suggestions for how to do this?
VOICE: We used something of a concrete and categorical approach by labeling the water goblets, so that we had categories to place our cards in to help us think more clearly. We also learned that in getting the value of your project across, it would be very easy for us to do, if we had designed this ourselves.
At any rate, we took the cards and argued among ourselves what we thought was a resource, process, output, intermediate outcomes, and the outcomes, and we identified resources. Unfortunately, we only had one. We thought that there might be some others, but the MSW program manager --
JOSEPH S. WHOLEY: Your suggestion was that they sort things into a box of inputs and a box of outcomes, and so forth?
VOICE: We had difficulty, however, in discriminating, and discerning.
JOSEPH S. WHOLEY: That is why I wanted you to do this exercise. If you look at one of the earlier charts, down at the bottom of the logic model (see Appendix I) it has two questions, how and why. The how question asks how will we achieve the goals, what processes and what inputs are needed to achieve the goals? The why question works from left to right and asks why are we bothering to do this stuff. The why question informs us of the goals and outcomes that we're seeking to achieve. So when you are arguing, trying to decide in what order to put things, essentially, causal order is the order -- one thing produces or influences or causes another thing to happen.
Are there other points people would like to bring up -- either problems you ran into or suggestions for how to do it?
VOICE: In fourth grade I was taught to read everything first before starting any work, so what was really important was that we read every single card first and then see if there were any relationships between them, the way you were suggesting, causal. Also, it was helpful that all of the cards were printed on a sheet in the packet; putting them all on one page made it a little bit simpler, versus using separate cards to read and then sort into areas.
JOSEPH S. WHOLEY: Because the mind is a wonderful instrument, it begins to think of connections among the cards if you absorb them all. That's good. Are there other points, suggestions?
VOICE: I found that your description connecting teaching to what's learned was a really important key in making some distinctions here.
JOSEPH S. WHOLEY: Harry Hatry says that outputs are still part of the activities and processes, but the outcomes are outside of you, outside of the program -- either it's learning that has occurred, or jobs obtained, or things like that.
VOICE: We tried using your suggestion about the rhetorical questions. We thought there were two rhetorical questions of true justice and the American way kind of stuff, healthy babies and the appropriate milestones. So we put those on the end as the rhetorical questions.
JOSEPH S. WHOLEY: That's good. Any other points people would like to bring up?
VOICE: We had, or at least I had, a little bit of a problem distinguishing outputs from outcomes because in a way you can almost blur that distinction. I think I understand it, but could you give us some sort of a definitional way?
JOSEPH S. WHOLEY: Well, outputs are products produced and services delivered. Outcomes are typically changes in the clients, or changes in the community, so that's a way to distinguish the two. Having said that, however, I must admit that sometimes there are arguments in a voluntary program. In a voluntary program, if clients come for training, you may treat that as an outcome of maybe a very good outreach component, or you may treat it as an output. Clients are coming and being taught, so sometimes you run into a tangle, as to how you're going to count timely delivery of service, or high-quality service, and so forth, so thinking of a causal sequence is also helpful.
JOHN A. McLAUGHLIN: Those terms are defined in my overheads and we will get to all of those different variables, outputs, outcomes, and so forth in the next segment.
VOICE: We know a lot about path planning and this is very similar to systems change, and so to arrive at the end outcome, it was important to find the North Star first.
JOSEPH S. WHOLEY: Very good, and then a path to get there. Some people call the whole thing we're looking at here the road map to results. It is kind of like a road map for finding that star, or how you would get to that star?
It's very important to know and to engrave in your mind that the logic model in the policymaker's mind may be very different from the logic model in the interest group's mind, or the logic model in the program staff's mind. So it is a good idea to develop separate logic models first and then compare and contrast and try to reach what I call a reasonable level of agreement on what is the program to be evaluated, measured, or, more importantly, run successfully.
VOICE: In that sense, if you had different levels within an organization constructing their own logic models, would you recommend an outside facilitator who could understand the general theme of what each level was proposing, or could they do it themselves? I see the need for some kind of facilitator who ties it all together.
JOSEPH S. WHOLEY: It has been done both ways. It can't be done successfully the first or even the second time. Most people have said that planning, to be effective, has to be simultaneously top-down and bottom up, and you can't exactly be simultaneous. I've been very successful extracting the model from working-level people, floating it up to higher bureaucratic levels, getting them to comment on what is missing, and then bringing it back down.
The policy-level people can tell if an important goal is missing, or an important factor is missing necessary to success. The people running the program, managers and staff may be too close to it. The policy people know about inputs and outcomes. People running programs know about inputs and outputs. So often there's a way to hook together the two in a sensible fashion and people like the people in this room can help in that process.
Measuring Performance
JOHN A. McLAUGHLIN: I would like to shift gears now. We said there are a number of questions we wanted to address: (a) Am I addressing the right results? (b) What is my program trying to achieve for whom, when, and so forth? and (c) Who are our partners?
What Joe provided for you was a mechanism for describing the value of your program, at least how it's supposed to be working. We call that an espoused theory. It is what we believe to be true about our program, what we're aiming toward, and how we're going to achieve that aim. It's what you write in your proposal to OSERS. That is, you say to them, this is what I plan to achieve or accomplish, and this is how I'm going to do it.
Collect the Right Information
Now we're going to talk about the evidence that you're going to collect to monitor the progress you're making, and also to develop information about the degree to which you achieved the aim that you set forth in your proposal.
A question that came up in the break involved the collaboration that goes on in some of our projects, particularly those that deal with persons with disabilities. Specifically, the person asked, how do we convince OSERS and other outside stakeholders that that's an important outcome?
The answer is write it in as part of your logic model. You say, to achieve this long-term outcome, one of my intermediate or short-term objectives is to form a partnership that will lead to shared information, shared responsibility, shared leadership, and you measure that. If something is very important to the success of your program, build it into your logic so that it will enable you to know what to measure.
That leads me to my other purpose for developing the logic chart. The first purpose was to communicate value. The second purpose is to determine what is to be measured. By having a performance spectrum -- from the outcome, the long-term outcome that we're aiming toward, to the resources that we're going to use to develop our programs, and so forth -- written as a public document -- we, as researchers and as project managers, can look at it and ask, what is it that we need to know, and what is it that other people need to know about this project? Our interest as managers is what we need to know to provide information that we're going to be successful, that we're on this path; that we're following the map and that the objective is in sight.
A point I want to make while it's on the tip of my tongue relates to a question that was addressed to me during the break. It involves you as researchers.
A lot of you have set as your long-term goal the facilitation of individuals with disabilities in their transition from school to the world of work, or something in that vein. The project you might be working on involves helping elementary school teachers to infuse the value of work into their students.
If your measure is on transition from school to the world of work, guess where you are: dead in the water. Congress wants to know, because that's what they allocated the money for, if more students and people with disabilities enter into the workforce. That's what we're all aiming for, but you were funded to be successful in teaching teachers to work with their students to help them understand the value of work or to become self-determined. What you need to do in your proposals is to convince OSERS that the data you will provide them show whether you were successful in those short-term goals. Don't let yourself get caught with a requirement to provide data on that final strategic goal.
From the standpoint of this logic modeling, it is very important to understand what is reasonable within the time frame you have.
Report Results and Meaningful Information
VOICE: Just a point of clarification. Along similar lines, when researching something to find out if it works, you may find out it doesn't work. You don't get the results, but it's meaningful data.
JOHN A. McLAUGHLIN: Let me repeat the question very quickly. It's something that we all, as researchers, understand. It's like taking good pictures; in a roll of 35 shots, we've got maybe a chance of 1 in 35 of getting a really good shot. What do we say to OSERS and other funding agencies with respect to our research? We set a goal for our research that will produce this particular product or result, but we know that there's a chance that the result will not come out the way we planned it to.
OSERS wants to know not only that outcome, they also want to know the result, and they want to know how you're going to collect data around your performance logic. What they are asking for and what you're promising is not only that you're aiming for the result, but that you're going to be good managers. You're going to be good stewards of the OSERS money; that's one of the points Joe was making in pointing out that this whole idea is aiming for results by using good management. So even though we have failed to hit our mark, we want to show people that we did it in a logical fashion, we collected the data, and we're making improvements.
Why Collect Performance Information?
Here's a question I always ask people when talking to them about performance-based management. Why do you collect performance information?
Think about your project. Why are you collecting performance information? Who is the audience? Write that down on a piece of paper. Who's the primary audience? If you had only one audience for your performance data, who would the audience be? Okay. How many people, by show of hands, wrote OSERS?
(A show of hands.)
JOHN A. McLAUGHLIN: How many wrote, "audience external to your program?"
(A show of hands.)
JOHN A. McLAUGHLIN: How many people wrote, "me and the project staff?"
(A show of hands.)
JOHN A. McLAUGHLIN: Only three. Let me talk to you about two very different perspectives on evaluation that sound very much alike.
The first is a "results orientation." That's what you find a lot of people talking about today as it relates to the Government Performance and Results Act. We're asking the question, is the planned result being achieved -- that's a formative question as it is unfolding. Or we might ask the question, was the planned result achieved, which is more summative. If it is at the end of the program, we might ask what unintended results were observed.
Those are three very good questions. They are good accountability questions. They're historical. Some might say hysterical. The reason is that there's another orientation that I would like to encourage you to use.
And that's a "learning orientation." This is the orientation that I encourage people to describe in their proposals. It is just a little tweak of the previous questions, but notice it says, what factors in our program activities and resources are influencing results in what ways? Similarly, what external factors may have influenced the results in what ways? And finally, what unintended results were observed?
What's the difference there? The difference is that these data can be used for both accountability and program improvement, and for management. If we get caught in the results orientation, as a manager we can't do anything with that information. All we can tell people is whether or not we hit the mark.
I'm asking you to be more clinical. Be like the researchers that you are. Look at a lot of information. Certainly, you have to look at the result, both the intermediate and the long-term result, but be sure that you collect data on the factors of your program, the internal or the external factors, and so forth, that could have influenced your results.
I am suggesting that you propose a hypothesis to yourself: if I put this program in place, then I will achieve these outcomes. As you work with your graduate students and as you conduct your own research, one of the first things you do is to take a good look at your research procedures and when you're very certain as to what you're going to do in your research, then you look at the dependent variable.
I'm asking you to do the same thing in your programs. Mary Ann Scheirer has written extensively about this. In fact, she has a chapter in Joe's book titled, Handbook of Practical Program Evaluation. I think she makes a statement that is absolutely right-on with regard to the concept I'm trying to get across here. I would like to read that excerpt.
"Process evaluation verifies what the program is, and whether or not it is delivered as intended to the targeted recipients, and in the intended "dosage." In order to undertake process measurement, the program itself must be specified in detail."
In your performance measurement, measure the independent variable, the things you do in your program, as well as the dependent variable.
Mary Ann goes on to say:
"Because the extent of program delivery cannot be assumed, an impact evaluation that does not include process evaluation component will seldom provide information in which decision-makers can have confidence."
When you go to someone and you show them your results, they say, prove it. In the learning orientation it is having data to prove it.
The Measurement Challenge
The measurement challenge. This is the challenge that all of you have to face. I usually say, collect all of the information, but Joe quickly reminds me that we don't have to collect all the information. You should collect information that will enable you to implement program improvements and allow the manager to communicate value as well. Sometimes our data will influence the development of new programs.
First, we recommend starting with the short-term outcome, the immediate outcome. For example, if you're working with students with disabilities in order to increase the probability that they will get jobs, then we want you to back off and start with the immediate short-term outcome, which might be job-training skills.
Using the example of finding the North Star and then developing a path to the North Star, we have found that sometimes the North Star is too far away, so we may need to get our sextant out and find an interim star, and then work toward the North Star.
Remember Joe's how and why example? If we start at the short-term outcome we can ask ourselves, why is that short-term outcome important? Why are learning job or employment readiness skills important for individuals with disabilities? Because it will lead to sustained employment. That's your next level of outcome, the intermediate outcome.
And at the next level we ask, why is it important that students with disabilities sustain their employment? Because, then, they will be able to live independently. And why is that important...you can keep on going, as Joe has demonstrated. So sometimes finding the North Star is not as important as finding out the thing you want to change in order to get there.
Second, keep an eye on the strategic outcome. The strategic outcome is to make sure that students with disabilities find their way into the workforce. But your project may be working with something down the line, and it may take a long time to prove that it did indeed lead to students getting into the workforce. That doesn't mean you can't keep an eye on employment, because that is your baseline, which shows you what the change is in the long term that you're going to make. For example, I'm currently working with the evaluation of dropout prevention programs in Virginia, where we have $10 million going to dropout prevention.
What data do you think they're collecting at the local level? The dropout rates of students in middle school through high school. But that's their strategic outcome. The legislature keeps coming back saying, we just put $10 million in that this year and we don't see any gain. Let's not do that any more. The problem is, we're not reporting what we're really doing with those $10 million on an interim short-term basis. We want to keep an eye on the strategic outcome, but we really want to measure where we want to be when our direct influence on customers ends.
Finally, we want to collect explanatory information, information that will focus, as Mary Ann mentioned, on the implementation of the program. We call it explanatory information, because it helps to explain the results. It helps you to tell your story about what happened, and why it might have happened.
External Influences
Notice that we included the external influences in our model of performance. I teach a course in organizational behavior, and one of the things that always gets my students upset is that I talk about a theory, like a theory of motivation for staff or employees, and then I say, but it's conditional. It's conditional probability.
Why? Because, as researchers and program people, we know things work differently in different environments, so from an explanatory perspective it's important for you to collect data on a regular basis about those external influences. That's why Joe was encouraging you to build external influences into your logic model.
Uncertainties
One of my favorite people is an economist by the name of Eliyahu M. Goldratt who has written several books, including The Goal: Excellence in Manufacturing, It's Not Luck, and The Critical Path. He talks about the things we're talking about but in a story form, so he communicates the concept of program logic as he's teaching a class in economics at a university.
One of the things that he says with regard to program management is that the essential variable in success is uncertainty. Think about your projects and how well you plan them. You can put these great logic charts up there, but remember I talked about conditions. What I recommend is that once you develop this logic chart, look at it and ask yourself, where are the uncertainties? Where don't we really know what we need to know? Where are the dependency relationships?
As Joe suggested, if it's a key factor that parents attend a training program, for example, but we're uncertain whether there will be enough incentive for them to do it, then that's what you want to measure.
So one of the things you may want to do when you go back home and look at your projects is to think, what are the uncertainties here?
And, secondly, Goldratt would also say, look at the relationships. Don't look at the elements. Look at the relationships. We developed a logic chart for you and we said, this activity leads to this short-term outcome, which leads to this intermediate outcome, which leads to this long-term outcome.
What we normally do is monitor the elements of that performance. What I'm suggesting and what Goldratt is suggesting, from a measurement perspective, is yes, you have to measure those elements, or the essential features or characteristics of the elements, but you fail the game if you don't also look at the relationships you posited.
That is why you believe that a particular activity or set of activities will lead to that particular output, which will lead to that particular outcome, and so forth.
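As an illustration of looking at the relationships and not only the elements, a logic chart can be written down so that every arrow carries its own stated rationale. The following is a minimal sketch under stated assumptions, not the presenters' tool: the `Link` class, the `build_links` helper, the element labels, and the wording of the rationales are all hypothetical, loosely following the job-readiness example used earlier in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """One arrow in a logic chart: the posited relationship between two elements."""
    source: str
    target: str
    rationale: str  # the "why" behind the arrow (hypothetical wording)


def build_links(elements, rationales):
    """Pair each element with the next one and attach the stated rationale."""
    if len(rationales) != len(elements) - 1:
        raise ValueError("need exactly one rationale per arrow")
    return [Link(a, b, r) for a, b, r in zip(elements, elements[1:], rationales)]


# Elements (the boxes); labels are assumed for illustration.
elements = [
    "job-readiness training",               # activity
    "students learn job-readiness skills",  # short-term outcome
    "students sustain employment",          # intermediate outcome
    "students live independently",          # long-term outcome
]

# One rationale per arrow; these are placeholders, not findings.
rationales = [
    "training is delivered as intended to the targeted students",
    "readiness skills are what employers require to keep staff",
    "steady income makes independent living possible",
]

links = build_links(elements, rationales)

for link in links:
    print(f"{link.source} -> {link.target}: {link.rationale}")
```

Writing the chart this way forces one explicit rationale per arrow, which makes it easier to see where the uncertainties and dependency relationships sit.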
Validity Assumption
I mentioned Michael Patton earlier and the concept of validity assumption. The validity assumption is that causal relationship. It's the why of your program. If someone says to you from a funding agency, why is it that you need certified teachers in this class or in this program, your answer is to connect the certified teacher to an activity and respond that, if I don't have that certified teacher, then I won't be able to do this activity, and here is my rationale.
That's the validity assumption. That's where the true measurement for program improvement comes from. It comes from the relationship between the elements of your program.
Determine What Is Essential
Here are some things we have to think about with regard to measurement. With respect to focusing on the essential outcomes, and realism, one of the things Michael Patton suggests -- and I will ask you to do the same -- is the following. Take your left hand and look at it. Describe the patterns on your hand, maybe the length of your fingers, and the broken lines and the complete lines, and, in my hand, the calluses.
Now take out your right hand and look for the same factors. Compare, or look for the lines and the length of your finger, and so forth. Now, put both your hands out in front of you. See what happens? You can't keep them both in focus. You've got to look from one to the other, and back again.
This demonstrates that you can't look at everything in your project. You've got to determine what's essential. What do you need to communicate value, like what do the stakeholders need, and what do you need as a manager to help you make those mid-course connections, to connect the process to the emerging outcomes, which gets me to relevance.
Relevance and Realism
One of the major benefits of doing the logic model is that you're able to get data that are relevant to what it is you're doing. In your field, that's known as treatment validity. When you measure something, you want to be sure it's measuring what you're trying to do. It's a valid measure of what you're trying to achieve; it relates to the intervention you're implementing, so that's focusing on the essential outcomes, the relevance of measurement.
Another point that is very important as you begin to think about collecting data -- notice that a lot of you said, I'm collecting data for audiences external to my program -- is that when you're collecting performance information, make sure it's information you can do something with. The worst thing you can do is to collect information that has no value to you, and more importantly, that your staff and people that are important to you, can't connect with. So collect information relevant to what you're doing.
And another realism perspective that Joe often talks about is that, to the maximum extent appropriate, we want to collect data from existing sources. Don't go out and collect new information until you've taken a look at what you already have.
One of the great things about special education is that we have a lot of data. Unfortunately, we rarely use those data to make programming decisions, either at the local school system or the state level, because we're typically shipping the data off to the federal government. But we have to take a look at what existing data we have that can be used in the evaluation and the monitoring of our programs.
Logic Model Chart
I want to give you a quick example of a very simple logic chart (see Appendix L), for a couple of reasons. First, Joe mentioned the importance of looking at your participants, what I call your customers. I actually break it out, as Joe indicated, because I think that if we're aiming for them, we need to pull the customers out, take a look at them, to see what are their characteristics, what are their needs, how is it best to deliver the intervention that I'm thinking about, and so on. So make sure your participants/customers become a primary focus for what it is you're doing. We solve problems through people. What I'm saying with regard to process evaluation and outcome evaluation is measure across the spectrum, as it relates to other programs and external influences.
You now have a very rough program logic for your programs.
And I realize that this rough program logic isn't necessarily representative of what you do, but it is an idea. The long-term outcome or goal you're aiming for is that students with severe disabilities are fully employed. The way you achieve it is by working with -- one thing you could do is work with transition specialists, say in James City County, and this is the county that I live in, so those are the customers for your program.
And the short-term immediate outcome is that teachers or transition specialists are skilled in teaching students with severe disabilities employment-seeking skills that will lead to students with severe disabilities acquiring the skills necessary to locate and secure employment, which enables them to become fully employed.
The program for doing that is a weekly employment-seeking skill training for transition teachers, and classes led by project staff. The outcome is all identified transition teachers attend weekly education programs, and these are just rough things.
Down here for external influences and related programs we might have tax incentives. We all know that, if they will hire persons with disabilities, employers will gain a tax incentive. It might be local, it might be state, and it might be federal tax incentives.
One of the related programs that could be a partnering program would be other school-based transition programs.
That gives you a sense of the scope. Your hypothesis is, if I do this, then I'm going to achieve these outcomes.
Now, the claim that you're going to be able to make, and I'm getting to the measurement aspect of this, is that these teachers actually acquired the skills and, as a result of acquiring the skills, they applied them in the classes with students with severe disabilities, who then acquired the job-seeking skills they needed, and then they got the full-time employment.
When you make a claim, that, say, students with severe disabilities acquire those skills, you have to have evidence to show how they might have gotten those skills. Notice the how question.
So your data or your evidence is along this performance spectrum. When you're looking at the boxes and measuring performance in the boxes, you're monitoring. When you're looking at the arrows or connecting links between the boxes, you're evaluating. That's a distinction I make.
If, for example, I set up a performance goal for my teachers that 80 percent of them will acquire the skill, and I compare the actual number of teachers who acquire the skill to that standard and look for a discrepancy, someone might say that's evaluation. From the standpoint of program improvement, I make the distinction that's monitoring. Monitoring for results. What I want to know is, what's the relationship? What's the test of the logic in your logic chart?
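The monitoring comparison described here, actual performance against a standard, reduces to simple arithmetic. The sketch below assumes a hypothetical group of 25 teachers, 18 of whom acquire the skill; the function name `monitor_against_target` and the counts are illustrative only, while the 80 percent standard comes from the example above.

```python
def monitor_against_target(achieved: int, total: int, target_rate: float) -> dict:
    """Compare an observed rate with a performance standard (monitoring for results)."""
    if total <= 0:
        raise ValueError("total must be positive")
    actual_rate = achieved / total
    return {
        "actual_rate": actual_rate,
        "target_rate": target_rate,
        "discrepancy": actual_rate - target_rate,  # negative means below standard
        "met": actual_rate >= target_rate,
    }


# Hypothetical counts: 18 of 25 teachers acquired the skill; the standard is 80 percent.
result = monitor_against_target(achieved=18, total=25, target_rate=0.80)
print(result)
```

A result like this says whether the mark was hit for one box in the chart. It does not test the link between boxes, which is the evaluation question raised above.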
I'm sure many of you have read Robert Yin's book on case study methodology. A lot of us in special education find that the case study is the primary mechanism for evaluating our work because it is unique and very context intertwined.
One of the techniques that Robert Yin talks about with respect to studying a case is pattern matching. The program logic model that you espouse is the pattern that you're going to look for in your case study, so you're going to set up this experimental design, if you will, because remember, we called it a hypothesis.
You're going to write it out, very specifically and that's the pattern you're going to use as your template when you do your case study. So this whole logic ties very closely to measurement and program evaluation.
Key Terms
I mentioned key terms, and I'm not going to go over those, but in your handouts (see Appendix M), I define some key terms with regard to short-term, intermediate, and long-term outcomes as well as indicators for those.
Design Evaluation and Implementation Evaluation
I now want to shift to a big picture look at the evaluation questions that could drive your evaluation method and your evaluation plan.
The first is design evaluation. What happens in design evaluation is that when you put together your proposal, as Joe mentioned, you involve the stakeholders, you involve others, and you get them to help you to determine whether the design of your program, this logic, is theoretically and practically sound.
The second phase of the design evaluation is when you submit your proposal. That's when the panel, usually in Washington, DC, goes back to the room and reads proposals. They're doing a design evaluation. They're looking for the theoretical significance of the model you put together. They're also looking at some of the practical considerations -- how many of us go right to the budget and ask, can you do that with the amount of money that is allocated and within the time that is allotted?
So that's design evaluation. By the way, one of the things that a program logic leads to is implementation evaluation. As you can see, I put down inputs, processes, emerging outputs and outcomes, and external factors. The reason that I say that to you now is that the paper program is never the program that actually happens. How many of you as evaluators have gone into a program, evaluated it based on the paper program and come back to the person and said, nothing's going on that you said was going to go on.
The reason is that the program typically isn't fully implemented and one of the reasons for that is few proposals are ever fully funded.
For example, you write a proposal for $150,000 and they say, we've got $50,000 for that program. No problem. We'll do it. We don't redesign the program. We say, it's okay, we'll do it for $50,000. Now, you either built in a lot of fat, or you're making a terrible error.
You want to be able to build in design evaluation and implementation evaluation, and you certainly want to keep an eye on outcomes. Cost is a factor, but not necessarily as much of a factor in the programs that you work in, unless you want to disseminate your program. Then, potential adopters want to know about the cost. They want to know two things: what's the start-up cost, how much it is going to take to get this program running, and what's the operational cost.
So as an evaluator, if you're in that mode you're going to be challenged to develop cost data.
Now, you may also think about cost monitoring, which is just describing what costs what.
Customer Feedback
Customer feedback is essential anywhere we go now, and that could be customer feedback across multiple levels. It could be our immediate target, like we talked about the teachers that were involved. It could also be the students that the teachers work with, or the parents of the students that the teachers work with. It could be -- and we know in special education this is so important -- other teachers and staff in the building, including the school bus drivers.
In addition to customer feedback, we also want to collect data from our partners. We want to find out the extent to which our partners are seeing their relationship as a positive relationship, what are they getting out of it, what they are putting into it.
The one thing that is not here, but that I encourage people to include, more from an organizational perspective than a project or a research perspective, is to check employee satisfaction, worker satisfaction. You've all heard the story about killing the golden goose. Oftentimes we don't measure our geese very well.
Measurable Objectives
I want to talk very quickly about a measurable objective.
If you're writing measurable objectives, and we're evaluating them, we will want to see certain things (see Appendix N from United Way of Central Indiana).
First of all, we want a statement about the program. It could be the program name, it could be a descriptive phrase about the program, or it could be more specific, for example, the transition teacher training program in James City County.
Another element that needs to be included is the expected result of the program. That is, what are people going to benefit from? What are they going to take away? We are now talking about the means of measuring that result, a standard for success, such as the number of program participants or recipients, and the percent of recipients reached.
For example, Joe and I do a lot of work in United Way programs. The United Way might fund a program on parent education, serving 25 parents. That's very good for those 25 parents, but remember, United Way is interested in community change. Their long-term objective is to enhance the health and quality of the community, so the percent of recipients served is an important one, because it lets them know the degree to which we're bridging the gap and meeting all of the needs of parents of children with disabilities, if that's the component of the measurable objective. Let me take you through the evolution of constructing a good objective (see Appendix O).
We might say in stage 1 that our objective is to improve the reading skills of at-risk students. To make that a little better we would say, to improve the reading skills of at-risk students ages 14 to 18. To make it better yet we would say, to improve the reading skills of at-risk students ages 14 to 18 through tutoring. Now we put the program in.
Getting more specific, we could say, to improve the reading skills of at-risk students ages 14 to 18 through tutoring as measured by performance on the school district's reading comprehension test. Now I'm saying how I'm going to measure it. Then I say, to improve the reading skills of at-risk students ages 14 to 18 through tutoring as measured by performance on the school district's reading comprehension test to be administered before and after the program, which means that I'm now saying not only what's measured, but also when it's measured.
Finally, I want to put the standard in. So as you see in the handout, I've said, "as measured by an average increase of 5 percent on the school district's reading comprehension test," and then I come back and say the number, "25 at-risk students." This illustrates the evolution of that objective. I started out with a very simple objective, which some people would say is okay. I would call that an expressive objective, to improve the reading skills of at-risk students, but when I get down to specifying, to improve the reading skills of 25 at-risk students ages 14 to 18 through tutoring and as measured by an average increase of 5 percent, notice what is happening. I am enhancing my communication with people with regard to what it is I want to accomplish, for whom, and through what means.
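Once the objective names a measure, a schedule, a standard, and a number of students, checking it becomes a small calculation. The sketch below is one possible reading, not the handout's method: it assumes "an average increase of 5 percent" means the percent change in the group's mean score between the pre-test and the post-test, and the function names and scores are hypothetical.

```python
from statistics import mean


def average_percent_increase(pre_scores, post_scores):
    """Percent change in the group mean from pre-test to post-test (assumed reading)."""
    if not pre_scores or len(pre_scores) != len(post_scores):
        raise ValueError("need matching, non-empty pre and post score lists")
    pre_mean = mean(pre_scores)
    if pre_mean == 0:
        raise ValueError("pre-test mean must be non-zero")
    return (mean(post_scores) - pre_mean) / pre_mean * 100


def objective_met(pre_scores, post_scores, required_increase_pct=5.0, required_students=25):
    """True when enough students were tested and the standard was reached."""
    enough_students = len(pre_scores) >= required_students
    increase = average_percent_increase(pre_scores, post_scores)
    return enough_students and increase >= required_increase_pct


# Hypothetical scores for 25 at-risk students on the district reading comprehension test.
pre = [60] * 25
post = [66] * 25

print(average_percent_increase(pre, post))
print(objective_met(pre, post))
```

A different reading, such as averaging each student's individual percent gain, would need a different calculation, so the objective should say which one is intended.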
Conclusion
There's a lot more here that we're not going to be able to go over. I would encourage you to look over the rest of the handouts (see Appendix P). These handouts address performance objectives, key issues in data collection procedures, what you don't know can hurt you, please, do not skip the trail run, the trial run, and some options for using a subset of participants in a trial run.
I want to end this with a quote from George Bernard Shaw, which comes out of Goldratt, by the way: "The reasonable man adapts himself to the world. The unreasonable one persists in trying to adapt the world to himself. Therefore, all progress depends on the unreasonable man."
So those of you who are a little bit chagrined about this new focus we're taking, take heart. It's progress.
Questions From Workshop Participants
We can now accept questions.
VOICE: When you talked about the connection, connectedness, and how much OSERS needs input, is that unique -- or more characteristic of a Department of Education program, since Department of Education programs depend on implementation at a totally different political level, than perhaps a DOT program, or Agriculture, or something like that -- because we at the state level have the same problem with the districts in terms of adaptability, and so we need their cooperation. Is that the model differential, and is that more characteristic of the DOE, rather than the other federal government entities?
JOHN A. McLAUGHLIN: The question involves the connectivity and partnership between OSERS and the field, and the fact that OSERS depends on you to provide data to them about the success of the programs they would like to see. Is that specific to education? No, it is not.
For example, I recently finished the strategic planning for the Centers for Disease Control, and they have a very important connection between their federal-level programs and programs that are implemented at the state level.
I'm working now with the Department of Energy, and a lot of the energy impact, that is, saving of energy is obtained through you taking care of your lights and not keeping them on unnecessarily. So most federal programs have that need for connectivity.
JOSEPH S. WHOLEY: In this country, contrary to other national governments, a lot of program implementation takes place through others, point 1, so that's reinforcing John's comment.
Secondly, the federal government has shed, as they laughingly call it, hundreds of thousands of staff members, or how other governments, both at the state and the local level. The number of people working here is politically a figure of merit, and smaller is better.
More and more things are being done through what might be called partners. Sometimes they're actually contractors, and performance-based service contracting is an entire emerging field in the human services arena. So I would say that one of the key management challenges for most federal agencies is how you measure performance in programs where the services are not delivered by the federal agency itself.
VOICE: But what I was referring to is the fact that we've got a political system out there where school boards are elected.
JOSEPH S. WHOLEY: It also happens in the highway programs, which you brought up as an example. We have legislators who are elected. So it does introduce a complexity.
Now, in the child support enforcement arena, as an example, the federal agency tried to jump-start the process by developing a draft plan with goals and measures. It was rejected by the states, and it took them an extra year or two to come together on what I and they call national, not federal goals.
My understanding is that when the child support enforcement system agreed on the national, not federal goals, the room burst into spontaneous applause. The states were no longer being treated as children by the federal agency. They were being treated as partners. As an aside, it was considered too difficult to develop the measures in the same cycle, so it was separated out and done as a second step after agreement on the goals, and that's a good idea.
So I would say, develop the logic model. Even if you never measure anything you're getting agreement among a lot of important people on what your business is.
JOHN A. McLAUGHLIN: Other questions or comments?
VOICE: You're talking about how hard it is to measure the results on outcomes when you don't employ the staff, but we all do that all the time and industry has been doing it for years. Ford, for example, is subcontracting out parts, and stuff like that, so I think there's previous learning we can turn to. I mean, our division has to subcontract out to providers for service.
JOSEPH S. WHOLEY: There's a cost issue. Part of why it is hard is that it's going to cost something.
When the Office of Management and Budget and the General Accounting Office testified before Congress in favor of the proposed Results Act, Congress asked them how much it was going to cost to implement this. Both agencies, OMB and GAO, said zero, meaning that Congress didn't have to appropriate any extra money, that agencies are expected to take it out of their normal management efforts. Data collection efforts, research efforts, evaluation efforts can also serve to cover the collection of data on results. So that's one thing.
The second thing is deciding which result to measure. If it's too far down the causal chain, my agency or program has such a small degree of influence that it's laughable, as people only later found out in the state of Oregon.
They had the Oregon benchmarks, which are famous worldwide. It was foolish to try to use those as the management objectives, because many different things affect those scores. So the people in Oregon had to regroup and see if they could develop some intermediate outcome objectives and intermediate outcome measures, and so forth.
The other point John made is very important. Can we get the outcome data out of existing data systems? Often we're using computerized exchange systems now, the state employment security system, for example, to track what's happening to the people we train, instead of having to do telephone surveys. Through matching records we can get fairly good information using existing data. It requires an agreement between the Education Department and the people who run the unemployment insurance system, so there are lots of technical and political issues that we face when we try to measure results.
JOHN A. McLAUGHLIN: Other questions or comments? Those are good observations.
VOICE: Yesterday Tom Hehir mentioned that OSEP has been reorganized into state monitoring and research-to-practice divisions. Given your interest in seeing the results of evaluations having wider impact, what kind of linkages would you suggest between those two divisions so that there would be some synergy?
JOHN A. McLAUGHLIN: Let me take a shot at it, and then maybe Joe could do it too. The question is what can those two divisions do: the Division of Research To Practice, and what was the second one?
VOICE: The State monitoring.
JOHN A. McLAUGHLIN: First of all, they need to talk to each other and develop a logic. I know they already have, because I've seen their logic. They need to first set out their pathways and then look where their pathways logically connect and build the necessary communication linkages.
And notice that both of those are sensing what's going on up in the world. The monitoring division has a tentacle down into what we do with students with disabilities, the other has a tentacle down into the research, and so forth. They need to sense what is going on and come back and tie it together.
JOSEPH S. WHOLEY: You can tell that John McLaughlin's very interested in program evaluation just from his talk.
One thing the Congress said when they adopted the statute is that you can't measure everything with a few repetitive numerical measures, and that to measure performance you may need to do research and evaluation studies. They call them program evaluation studies, and so one linkage is at the measurement level.
But a more important link is the idea of working together in partnership to improve performance in terms of shared outcome goals. This is a so-called cross-cutting program issue. When we share common goals, we can work together, not particularly to measure things, although that is important so we get credit for what we do, but more importantly to improve the performance of those two bureaus or divisions, or whatever they're called.
So within the one agency, the important thing is to come together and share their logic models, in John's terminology, to find out how substantively they can do their work differently to improve progress toward shared goals and to help each other on the intermediate outcome goals that each one would have separately.
THOMAS E. GRAYSON: Let's give Joe Wholey and John McLaughlin a thank you and a round of applause.
Just as a quick follow-up. As most of you know, in addition to these evaluation workshops, we provide evaluation technical assistance site visits. We will be sending out a letter probably in six weeks asking if you want a site visit from us. It will cost you nothing. We can come out and work with you on site. We can look at your program, meet with your staff, or any internal or external evaluator you may have and discuss any aspect, issue, or concern you might have regarding your evaluation plans or implementation efforts. So if you would like us to visit your project, just let us know.
Thank you for coming and have a safe journey home.
Appendix A
Critical Evaluation Questions
The Big Question
4 Essential Questions
Appendix B
Government Performance and Results Act
- Context
- Purposes
- Requirements
Context for the Results Act
Purposes of the Results Act
Requirements of the Results Act
Appendix C
Effectively Implementing the Results Act
Process Steps
Appendix D
References
Congress
1. Government Performance and Results Act (P.L. 103-62, August 3, 1993)
2. Senate Report 103-58, Report to accompany S. 20, June 16, 1993
Office of Management and Budget
1. Circular A-11 (Part 2), Preparation and Submission of Strategic Plans and Annual Performance Plans, June 1997
General Accounting Office
1. GAO/GGD-96-118, June 1996
2. GAO/HEHS/GGD-97-138, May 1997
3. GAO/GGD-97-109, June 1997
4. GAO/AIMD-97-146, August 1997
5. GAO/GGD-97-180, September 1997
6. GAO/GGD-98-44, January 1998
7. GAO/GGD-10.1.16, Version 1, May 1997
8. GAO/GGD/AIMD-10.1.18, Version 1, February 1998
9. GAO/GGD-10.1.20, Version 1, April 1998
National Academy of Public Administration
1. Toward Useful Performance Measurement, November 1994
2. Effective Implementation of the Government Performance and Results Act, January 1998
United Way of America
1. Focusing on Program Outcomes: Summary Guide, 1996
2. Measuring Program Outcomes: A Practical Approach, 1996
Joseph S. Wholey
1. Zero-Base Budgeting and Program Evaluation, 1978
2. Evaluation: Promise and Performance, 1979
3. Evaluation and Effective Public Management, 1983
4. Performance and Credibility (with Mark Abramson, Christopher Bellavita, and others), 1986
5. Improving Government Performance (with Kathryn Newcomer and others), 1989
6. Handbook of Practical Program Evaluation (with Harry Hatry, Kathryn Newcomer, and others), 1994
Appendix E
Theory of Change

Appendix F
Performance Information Spectrum

Appendix G
Using Logic Models in Developing Performance Measurement Systems
Appendix H
Typical Logic Model: Inputs, Activities, Outputs, Outcomes, and External Factors
- Inputs
- Activities (processes)
- Outputs (products and services)
- Intermediate outcomes (results)
- End outcomes (results)
- Key external factors
Appendix I
Program Logic Model Example

Appendix J
Example of Logic Models
- Employer Rideshare Program
- Water Quality Program

EXAMPLE OF LOGIC MODEL: Water Quality Program

Appendix K
Exercise: The Logic Model Shuffle
(United Way of America)
Issues, Problems, and Suggestions
Appendix L
Simple Logic Chart and Simple Logic Chart Example


Appendix M
KEY TERMS
Outcomes: Benefits resulting from the program
- Short-term: Immediate changes in the customer; e.g., teachers acquire new knowledge or skills
- Intermediate: Changes in those touched by the customer; e.g., students learn new employment-seeking skills
Outcome Indicators: The specific information collected to track a program's success on outcomes; e.g., the number of teachers passing the final exam; the number of students passing a test on employment skills; the number of students getting jobs compared to those in the past.
Outcome Targets (sometimes referred to as standards): Numerical objectives for a program's level of achievement on its outcomes; e.g., 95% of the teachers participating in the class score 90% or higher on the final exam.
Note: Indicators and targets can be set for any element of the performance spectrum.
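As an illustration of how an indicator and a target relate, the teacher example above can be worked through in a few lines of Python. This is only a sketch: the function names and the exam scores are hypothetical and do not come from the workshop materials. The indicator is the share of teachers scoring at or above the cutoff, and the target is the level that share is expected to reach.

```python
def share_at_or_above(scores, cutoff):
    """Outcome indicator: fraction of participants scoring at or above the cutoff."""
    if not scores:
        raise ValueError("at least one score is required")
    return sum(1 for s in scores if s >= cutoff) / len(scores)


def target_met(scores, cutoff=90.0, target_share=0.95):
    """Outcome target: is the indicator at or above the target share?"""
    return share_at_or_above(scores, cutoff) >= target_share


# Hypothetical final exam scores for ten teachers (illustration only).
final_exam_scores = [92, 95, 90, 88, 97, 91, 93, 96, 94, 99]

indicator = share_at_or_above(final_exam_scores, 90.0)
print(f"Share scoring 90 or higher: {indicator:.0%}")
print("Target of 95% met:", target_met(final_exam_scores))
```

Keeping the indicator and the target as separate pieces mirrors the definitions above: the indicator is what is collected, and the target is the level it is compared against.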
Components of a Measurable Objective
Appendix O
The Evolution of a Good Objective
To improve the reading skills of at-risk students
To improve the reading skills of at-risk students, ages 14 to 18
To improve the reading skills of at-risk students, ages 14 to 18, through tutoring
To improve the reading skills of at-risk students, ages 14 to 18, through tutoring, as measured by performance on the school district's reading comprehension test
To improve the reading skills of at-risk students, ages 14 to 18, through tutoring, as measured by performance on the school district's reading comprehension test to be administered before and after the program
To improve the reading skills of at-risk students, ages 14 to 18, through tutoring, and as measured by an average increase of five percent on the school district's reading comprehension test to be administered before and after the program
To improve the reading skills of 25 at-risk students, ages 14 to 18, through tutoring, and as measured by an average increase of five percent on the school district's reading comprehension test to be administered before and after the program
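The final version of the objective can be checked with simple arithmetic once pre-test and post-test scores are in hand. The Python sketch below is one possible reading, not the presenters' method: it assumes "average increase of five percent" means the mean of each student's percentage change from pre-test to post-test, and the scores shown are hypothetical (four students rather than the 25 named in the objective).

```python
def average_percent_increase(pre_scores, post_scores):
    """Mean of each student's percentage change from pre-test to post-test."""
    if not pre_scores or len(pre_scores) != len(post_scores):
        raise ValueError("need matched, non-empty pre and post score lists")
    if any(pre <= 0 for pre in pre_scores):
        raise ValueError("pre-test scores must be positive to compute a percent change")
    changes = [(post - pre) / pre * 100.0 for pre, post in zip(pre_scores, post_scores)]
    return sum(changes) / len(changes)


def objective_met(pre_scores, post_scores, required_increase=5.0):
    """True if the average percent increase reaches the stated objective."""
    return average_percent_increase(pre_scores, post_scores) >= required_increase


# Hypothetical matched scores for four students (illustration only).
pre_test = [50, 60, 40, 80]
post_test = [55, 63, 44, 80]

print(f"Average increase: {average_percent_increase(pre_test, post_test):.2f}%")
print("Objective met:", objective_met(pre_test, post_test))
```

If the objective were instead read as a five percent rise in the group's average score, the calculation would differ, which is one reason to state the measure precisely in the objective itself.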
MEASURABILITY
Remaining Handouts
PERFORMANCE OBJECTIVES
Key Issues in Data Collection Procedures
What You Don't Know CAN Hurt You
Measurement Problems, e.g.
Administration Problems, e.g.
PLEASE!!!! DO NOT SKIP THE TRIAL RUN
The Trial Run
Does not have to involve the entire program, but must . . . .
Some Options for Using a Subset of Participants in a Trial Run
The subset must be representative of all participants!