Assessing the Effectiveness of Organizational Coaching Programs
In this essay I identify and review a series of appreciative concepts and tools that can open opportunities and reduce threat by making the evaluative process clearer and more supportive when reviewing an organizational coaching program. I will also identify feedback strategies that enable the program evaluation process to be constructive. Effective program evaluation is a process that can be uncomfortable, for all growth and change involve some pain. Program evaluation, however, can be constructive. Furthermore, if it is appreciative, this evaluation process can meet the needs of both those who are providing and those who are being served by an organizational coaching program.
A Brief History of Program Evaluation
First, a brief excursion through the history of program evaluation and, in particular, through the major issues regarding the purposes that program evaluation serves and the various forms that program evaluation takes in serving these purposes. Probably the most important fact to keep in mind is that program evaluation is now commonly used in most sectors of contemporary society. Some of the work done in this area focused on educational programs and, in particular, programs being evaluated for funding purposes or for continuing accreditation or authorization.1 Many of the advances in program evaluation have been made by members of or consultants to major philanthropic foundations or the United States Federal government. They are asked to determine the worth of a program that has been funded or may be funded by their institution. Other advances have been made by those given the task of determining if a school or college should be granted a specific accreditation status.
Program evaluation has also been widely used in the sciences, criminal justice, medicine and social welfare, once again often associated with the assessment of program worth by governmental funding agencies. Following Sputnik, increasing attention was given to the achievements of American research initiatives, while attention also increased with regard to the success of heavily-funded social programs under the banners of The Great Society and War on Poverty.
In recent years, program evaluation has become more common within corporations and nonprofit organizations. In most cases, this growing interest is unrelated to outside funding sources; rather, it emerges from a growing concern about the quality of products and services and about assessing the costs and benefits associated with specific program offerings.
Accompanying this expansion in the size and scope of program evaluation initiatives is the maturation of the field. A clearer understanding of the differing functions played by specific evaluation strategies has been complemented by a clearer sense of those features that are common to all forms of program evaluation. The most important distinction to draw is between the use of evaluation processes to determine the worth of a program and the use of evaluation processes to assist in the improvement of this program. The terms used to identify these two functions are summative and formative.
Formative and Summative Program Evaluations
Paul Dressel differentiates between summative evaluation that involves “judgment of the worth or impact of a program” and formative evaluation that Dressel defines as “the process whereby that judgment is made”. The evaluator who is usually identified as the author of this distinction, Michael Scriven, offers the following description of these two terms. According to Scriven, formative evaluation
. . . is typically conducted during the development or improvement of a program or product (or person, and so on) and it is conducted, often more than once, for the in-house staff of the program with the intent to improve. The reports normally remain in-house; but serious formative evaluation may be done by an internal or external evaluator or (preferably) a combination; of course, many program staff are, in an informal sense, constantly doing formative evaluation.
As described by Scriven, summative evaluation:
. . . is conducted after completion of the program (for ongoing programs, that means after stabilization) and for the benefit of some external audience or decision-maker (for example, funding agency, oversight office, historian, or future possible users), though it may be done by either internal or external evaluators or a mixture. The decisions it serves are most often decisions between these options: export (generalize), increase site support, continue site support, continue with conditions (probationary status), continue with modifications, discontinue. For reasons of credibility, summative evaluation is much more likely to involve external evaluators than is a formative evaluation.
Scriven borrows from Bob Stake in offering a less formal but perhaps more enlightening distinction between formative and summative: “When the cook tastes the soup, that’s formative; when the guests taste the soup, that’s summative”. From an appreciative perspective, formative evaluation can be said to be an exercise in fully understanding the complex dynamics and causal factors influencing the operation of a program. By contrast, a summative evaluation allows one to identify and build on the successes and strong features of a specific program unit. Both formative and summative evaluations can be appreciative and the comprehensive appreciation of any program unit involves both formative and summative evaluation processes.
Judgment vs. Process
Ed Kelly, an experienced program evaluator from Syracuse University, further differentiated between judgments concerning the extent to which the intentions of the program were satisfied and judgments concerning whether or not the program was any good. Concern for judgment necessarily involves issues of values, criteria, goals, customers and audience; concern for evaluative process necessarily involves issues of method, instrumentation and resources. Both approaches to evaluation require a clear definition of clientship, a precise sense of the role of evaluation, and an explicit understanding of the way the judgment or process will be used by the program staff and others.
In essence, program evaluation involves the development of a process whereby program activities can be interrelated and compared to program expectations, goals and values. The nature of this interrelationship will vary considerably. In some instances, external assistance will be required to establish the process, while in other instances the external assistance will be used to provide the interrelationship (judgments) once the process has been defined. In yet other instances, the external assistant (evaluator) both identifies the process and provides the judgments.
Regardless of the process being used, an effective program evaluation effort will commence with the initial planning of the program. In planning for any organizational coaching program, or in deciding on the initiation of a proposed program, the processes of evaluation are inevitably engaged. Those who plan the program will be concerned with the validity of their assumptions about needs, strategies and resources. Those who review their proposal will ask questions about feasibility, attractiveness, and probable success. Others will ask how program achievement is to be measured. Program evaluation is not a topic to be addressed at the end of a planning process. Program evaluation should be a vital and influential element that is given serious consideration throughout the process.
Four Levels of Program Evaluation
There are four basic types (and levels) of program evaluation: (1) documentation, (2) description, (3) determination, and (4) diagnosis. An outcome determination evaluation is conducted primarily for the purpose of judging the degree to which an organizational coaching program achieved its intended goals and outcomes. This summative approach aids decision-making about the continuation of the organizational coaching program. Ongoing decision-making concerning the nature, content and scope of an organizational coaching program is best addressed through use of diagnostic evaluation. This type of evaluation is formative in nature, since it is conducted while a program is in progress and is used to continually or intermittently refine and improve the program.
Program evaluations often are of greatest value when they aid the dissemination of organizational coaching program results. Descriptive and documentary approaches to program evaluation are most often employed when dissemination is critical. Descriptive evaluation tells other people about the nature and scope of an organizational coaching program. Documentary evaluation provides evidence for the existence of the organizational coaching program and its outcomes, and illustrates the nature of the program and its impact. Following is a more detailed description of each of these four types or levels of organizational coaching program evaluation.
Level One: Documentation
The most straightforward type of evaluation is documentation. When someone asks what has happened in an organizational coaching program or whether a program has been successful, the program staff can present the inquirer with evidence of organizational coaching program activity and accomplishment. Program evaluations that do not include some documentation run the risk of appearing sterile or contrived. One reads descriptions of an organizational coaching program and one even reviews tables of statistics and return-on-investment calculations concerning program outcomes, but never sees real evidence of the program’s existence. An appreciative evaluation always provides this real evidence. It points to the footprints left by an organizational coaching program and appreciates the meaning of these footprints.
Some program evaluators suggest that we are eventually led in program documentation to a goal-free evaluation. The documents speak for themselves and there is little need for an often biasing and limiting set of goals by which and through which an evaluator observes a specific program. Program documents often reveal much more about a program than is identified by a set of goals. Through the documents, one sees how a coaching program is actually living, and what emanates from the program that may or may not conform to its pre-specified goals.
Often after an organizational coaching program has been developed, someone will collect all the documents that have been accumulating during the course of the program. This may include minutes from major meetings, important memos and letters, reports, formal and informal communications about specific organizational coaching program activities or products, productions of the program, audio or video recordings of specific organizational coaching program activities, and so forth. These documents are usually stored in some file cabinet or digital file for vaguely defined use in the future. Often one suspects that the documents are stored to avoid the arduous task of sifting through them and throwing away the old, useless ones. Unfortunately, archives frequently are not used at a later date. As a result, the collection and storage of documents is rarely a rewarding or justifiable procedure in program evaluation.
Several problems are inherent in typical documentation processes. First, the documents often are stored with no master code. One can retrieve a document only by combing through vast arrays of irrelevant material. Second, and even more importantly, there is rarely a summary documentation report that highlights the richness and value of the stored documents. Nothing entices one to explore the documents. Third, the documentation is usually not linked directly to the purposes or expected outcomes of the organizational coaching program and remains isolated from other aspects of the total evaluation. Many of the problems usually associated with documentation can be avoided if a systematic and comprehensive documentation procedure is implemented.
Organizational coaching documentation is likely to appeal most to, and be most effectively used by, those in the diffusion of innovation literature who are called the innovators (1-5% of the population in most organizations). They make use of anything new and interesting that they can find around them (“opportunity junkies”). They don’t need much of an incentive to try out a “new thing” such as organizational coaching. They are innovators, but not necessarily inventors (who are often academicians, R&D folks or “loners”). One makes a convincing argument by building a portfolio that contains documents (emails, course designs, video recordings of coaching sessions, testimonials, etc.). Show people that organizational coaching exists (“just look around!”) and that it is likely to linger as a viable strategy alongside and complementing other strategies such as: (1) leadership development, (2) 360° feedback, (3) institutional planning and (4) organization development.
Level Two: Description
One of the fundamental features in any program evaluation, according to Scriven, is the identification of the program unit(s) being evaluated.10 He suggests that this identification should be based in a comprehensive description of the program being evaluated. Any final evaluation report will typically contain a description of the organizational coaching program being evaluated. Consequently, there is little need to spend much time advocating the importance of or identifying procedures for the description of a program. Nevertheless, most organizational coaching program descriptions can be improved. Given the importance of dissemination, one must be certain not only that information about the program is accurate and complete, but also that other people understand the program description.
A successful program description is something more than just the labeling of organizational coaching program components. An appreciative approach to program evaluation requires something more than a cursory classification or labeling of a program. It requires that the distinctive and most salient features of the organizational coaching program be identified and carefully described. A program description can often serve as a valuable guidebook for successful program replication if it has been prepared in an appreciative manner. It also often probes into the true function and meaning of a specific organizational coaching program.
Edward Kelly takes description and appreciative evaluation a step further in suggesting that one of the most important purposes of an evaluation is the provision of a sufficient depiction or reconstruction of complicated social realities. Those people who are not present when an event occurs should nevertheless have a valid and useful understanding of what it must have been like to be there.12 According to Kelly, a portrayal is, literally, an effort to prepare a rendering of an object or set of circumstances. Portrayal evaluation is the process of vividly capturing the complexity of social truth. Kelly suggests that things change depending on the angle from which they are viewed: multiple renderings or multiple portrayals are intended to capture the complexity of what has occurred.
In order to prepare an accurate description of an organizational coaching program, it is necessary not only to trace the history and context of the program and describe its central activities and/or products, but also to provide a portrait of the program (brief descriptions, quotations, paraphrases, and observations). What was it like being a client in this program? What did a typical client do on a daily basis as a result of involvement in this coaching program? What was it like to walk into an office after an organizational coaching program has been implemented? How has this program affected the life of a specific manager in this corporation?
Rather than always focusing on specific organizational coaching program activities, it is often valuable to focus on a specific participant in the coaching program. Pick a typical coaching client. In what activities did she engage? What did she miss? What didn’t she like? Why? One might even want to create a hypothetical client who represents normal involvement in the program. A case history can be written that describes this hypothetical participant in the program. This case history can be much more interesting and in some sense more real than dry statistics, though the case still needs to be supported by statistics to ensure that this typical person is, in fact, typical.
Organizational program descriptions are likely to be most often reviewed and used by those people in the diffusion of innovation literature who are called the early adopters (5-10% of population in most organizations). They want to know what coaching looks like and how it is engaged. They want a description, including detailed case studies, so that they can replicate this strategy. Like the innovators, they don’t need much of an incentive to try out a carefully narrated organizational coaching strategy. They would like to see the taxonomy of organizational coaching strategies, however, to know which of several different coaching strategies to engage in response to a specific organizational issue. A convincing argument can be made (at least to the early adopters) by providing a detailed description of the coaching services being offered, including a description of the outcomes that can be expected from these services. If possible, build one or more hypothetical coaching engagements, complete with sample dialogues between client and coach and examples of how a client might make use of the insights gained from the coaching sessions.
Level Three: Determination of Outcomes
This third level of program evaluation is both the most obvious and most difficult. It is the most obvious because the term “evaluation” immediately elicits for many of us the image of judgment and assignment of worth. Has this organizational coaching program done what it was designed to do? Has this program done something that is worthwhile? Outcome determination evaluation is difficult because the two questions just cited look quite similar on the surface, but are, in fact, quite different. To know whether an organizational coaching program has done what it was supposed to do is quite different from knowing whether what it has done is of any value. The old axiom to the effect that “something not worth doing is not worth doing well” certainly applies to this type of evaluation. The problem is further compounded when an appreciative approach is taken to program evaluation, for both questions are important when seeking to appreciate an organizational coaching program. In the summative appreciation of a coaching program’s distinctive characteristics and strengths, we must assess not only the outcomes of the program, but also the value to be assigned to each of these outcomes.
The first of these two outcome determination questions is researchable. We usually can determine whether or not a specific set of outcomes has been achieved. The second question requires an imposition of values. Hence, it is not researchable. We can’t readily answer this question without substantial clarification of organizational intentions. Yet the issue of values and organizational intentions cannot be avoided in the determination of outcomes. In this essay, we explore ways in which the first question regarding achievement of pre-specified outcomes can be addressed.
Determining achievement of pre-specified outcomes. There are two levels at which a program can be evaluated regarding the achievement of predetermined outcomes. At the first level, one can determine whether the outcomes have been achieved, without any direct concern for the role of the program in achieving these outcomes. This type of outcome-determining evaluation requires only an end-of-program assessment of specific outcomes that have been identified as part of a program planning process.
To the extent that minimally-specified levels have been achieved, the organizational coaching program can be said to have been successful; though, of course, other factors may have contributed to, or even been primarily responsible for, the outcomes. If one needs to know specifically if the organizational coaching program contributed to the achievement of those outcomes, then a second set of procedures must be used.
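To illustrate this first level of outcome determination, the brief sketch below (in Python) compares end-of-program results against pre-specified minimum levels; the outcome names and figures are invented for the example and do not describe any actual coaching program.

```python
# Hypothetical pre-specified minimum outcome levels and end-of-program results
# for a coaching program; the outcome names and numbers are invented.
minimum_levels = {"manager 360 rating": 3.8, "team engagement score": 70, "retention rate": 0.85}
achieved_results = {"manager 360 rating": 4.1, "team engagement score": 68, "retention rate": 0.90}

for outcome, minimum in minimum_levels.items():
    achieved = achieved_results[outcome]
    status = "met" if achieved >= minimum else "not met"
    print(f"{outcome}: achieved {achieved} (minimum {minimum}) -> {status}")
```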
Determining an organizational coaching program’s contribution to the achievement of pre-specified outcomes. This type of assessment requires considerably more attention to issues of design and measurement than does an assessment devoted exclusively to determination of outcomes. In order to show that a specific organizational coaching program contributed to the outcomes that were achieved, a program evaluator should be able to demonstrate a causal connection. For example, the evaluation should show that one or more comparable groups of customers, production lines or competitive organizations that were not directly or indirectly exposed to the organizational coaching program did not achieve the pre-specified outcomes to the extent achieved by one or more groups that were exposed directly or indirectly to the organizational coaching program.
Several research design decisions must be made in order to achieve this comparison between a group that has participated in an organizational coaching program, called the experimental group, and a group that hasn’t participated in this program, called the control group. Most evaluators try to employ a design in which people are assigned randomly to the experimental and control groups, and in which both groups are given pre- and post-program evaluations that assess the achievement of specific outcomes. Typically, the control group is not exposed to any organizational coaching program. Alternatively, the control group is exposed to a similar organizational coaching program that has already been offered in or by the organization. In this situation ideally there should be at least two control groups, one that receives no program services and the other that receives an alternative to the program being evaluated.
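A minimal sketch of the resulting comparison, assuming invented pre- and post-program ratings for a hypothetical experimental group and control group, might look like the following; the difference between the two groups' changes is the portion of improvement that can plausibly be attributed to the program.

```python
# Invented pre/post ratings (e.g., leadership-effectiveness scores on a 1-10 scale) for a
# group that received coaching (experimental) and a comparable group that did not (control).
experimental = {"pre": [5.2, 6.1, 4.8, 5.5, 6.0], "post": [6.8, 7.4, 6.2, 6.9, 7.1]}
control = {"pre": [5.4, 5.9, 5.1, 5.6, 5.8], "post": [5.7, 6.1, 5.3, 5.9, 6.0]}

def mean(values):
    return sum(values) / len(values)

def pre_post_change(group):
    """Average improvement from the pre-program to the post-program measurement."""
    return mean(group["post"]) - mean(group["pre"])

coached_change = pre_post_change(experimental)
uncoached_change = pre_post_change(control)

# The difference between the two changes is the portion of improvement that can
# plausibly be attributed to the coaching program rather than to other factors.
print(f"Experimental group change: {coached_change:.2f}")
print(f"Control group change: {uncoached_change:.2f}")
print(f"Change attributable to the program: {coached_change - uncoached_change:.2f}")
```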
While this experimental design is classic in evaluation research, it is difficult to achieve in practice. First, people often can’t be assigned randomly to alternative programs. Second, a control group may not provide an adequate comparison for an experimental group. If participants in a control group know that they are controls, this will influence their attitudes about and subsequently their participation in the alternative coaching program that serves as the control. Conversely, an experimental group is likely to put forth an extra effort if it knows its designation. This is often called the Hawthorne effect. It may be difficult to keep information about involvement in an experiment from participants in either the experimental or control group, particularly in small organizations. Some people even consider the withholding of this type of information to be unethical.
Third, test and retest procedures are often problematic. In assessing the attitudes, knowledge or skills of employees before and after an organizational coaching program, one cannot always be certain that the two assessment procedures actually are comparable. Furthermore, if there is no significant change in pre- and post-program outcome measurements, one rarely can be certain that the organizational coaching program had no impact. The measuring instruments may be insensitive to changes that have occurred. On the other hand, the employees already may be operating at a high level at the time when the pre-test is taken and hence there is little room for improvement in retest results. This is the so-called ceiling effect.
A control group can solve some of these test/retest problems, because if the problems are methodological, they should show up in the assessment of both groups. However, one must realize that the pretest can itself influence the effectiveness of both the experimental and control group programs and thus influence the two groups in different ways. Furthermore, several logistical problems often are encountered when a classic experimental design is employed. In all but the largest organizations there may not be a sufficient number of people for a control group. There also may not be enough time or money to conduct two assessments with both an experimental and control group.
Given these difficult problems with a classic experimental design, many organizational coaching program leaders and program evaluators may have to adopt alternative designs that are less elegant but more practical. In some cases, leaders and evaluators have returned to the more basic level of outcome determination. They have restricted their assessment to outcome measures. For example, they determine the level of performance achieved by recipients of organizational coaching services and use this information to determine the relative success of the organizational coaching program being evaluated. This type of information is subject to many misinterpretations and abuses, though it is the most common evaluation design being used in contemporary organizations. The information is flawed because one doesn’t know if differences in performance of coaching recipients can be attributed to the organizational coaching program being reviewed or to the entering characteristics of those who choose to be coached. In order to be fair in the assessment of a program’s effectiveness, one must at the very least perform a value-added assessment. This type of assessment requires that performance be measured at the start of a program and again at the end of the program to determine the value that has been added, or more specifically the improvement in performance that has been recorded.
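A value-added assessment of this kind can be sketched very simply; the client names and performance scores below are hypothetical, and the point is only that the improvement recorded for each client, rather than the end score alone, is what gets compared.

```python
# Hypothetical start- and end-of-program performance scores for the same coaching clients.
start_scores = {"client_a": 62, "client_b": 71, "client_c": 58, "client_d": 66}
end_scores = {"client_a": 74, "client_b": 78, "client_c": 69, "client_d": 70}

# Value added is the improvement recorded for each client, not the end score alone, so
# clients who entered the program at different levels can be compared fairly.
value_added = {client: end_scores[client] - start_scores[client] for client in start_scores}
average_gain = sum(value_added.values()) / len(value_added)

print("Per-client value added:", value_added)
print(f"Average value added across clients: {average_gain:.1f}")
```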
Fortunately, there are ways in which to assess program outcomes accurately and fairly, without having to engage a pure experimental design that may be neither feasible nor ethical. Donald Campbell and Julian Stanley have described a set of quasi-experimental designs that allow one to modify some of the conditions of the classic experimental design without sacrificing the clarity of results obtained.14 Campbell and Stanley’s brief monograph on experimental and quasi-experimental designs is a classic in the field. Any program evaluator who wishes to design an outcome determination evaluation should consult this monograph. Three of the most widely used of these quasi-experimental designs are time series, nonequivalent control group design and rotational/counterbalanced design.
Campbell and Stanley’s time-series design requires that some standard measure be taken periodically throughout the life of the organizational coaching program: for example, personal productivity, departmental profitability, divisional morale or customer ratings of product or service quality. If such a measurement relates directly to one of the anticipated outcomes of the organizational coaching program being evaluated, there may be a significant change in this measurement. This change will occur after the program has been in place for a given amount of time among those employees or units of the organization that are participating in the program. With this design, a sufficient number of measures must be taken before and after the organizational coaching program is initiated in order to establish a comparative base. At least three measures should be taken before and two measures after program initiation.
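The arithmetic of this time-series logic can be sketched as follows, assuming a purely hypothetical monthly productivity index and a program introduced after the fifth measurement; a fuller analysis would also examine pre-existing trends rather than average levels alone.

```python
# Hypothetical monthly productivity index for one department; the coaching program
# is introduced after the fifth measurement.
measurements = [98, 101, 99, 100, 102,  # before program initiation (at least three needed)
                104, 107, 106, 109]     # after program initiation (at least two needed)
program_start_index = 5

before = measurements[:program_start_index]
after = measurements[program_start_index:]
assert len(before) >= 3 and len(after) >= 2, "too few measurements to establish a comparative base"

def mean(values):
    return sum(values) / len(values)

# A shift in the average level after the program begins is the simplest signal that the
# change coincides with the program.
print(f"Average before program: {mean(before):.1f}")
print(f"Average after program: {mean(after):.1f}")
print(f"Observed shift: {mean(after) - mean(before):.1f}")
```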
The second quasi-experimental design (nonequivalent control group design) will in some cases help the evaluator to partially overcome the Hawthorne effect among experimental group members and the sense of inferiority and guinea-pig status among control group members. Rather than randomly selecting people into an experimental or control group, the evaluator can make use of two or more existing groups. Two coaching programs, for instance, that offer the same type of services might be identified. Clients would select one or the other program on the basis of time preference, convenience of location, etc. It is hoped that these reasons would function independently of the outcomes being studied in the evaluation. Some of the prospective coaching clients would be provided with the new coaching services, while the other prospective clients (the control group) receive the coaching services already provided by the organization.
The prospective coaching clients may need to be informed of the differences between the experimental and control groups before signing up, based on an understandable concern for their welfare. If this is the case, then a subset of the clients from the experimental and control groups can be paired on the basis of specific characteristics (e.g. length of time in the organization, level of organizational responsibility or personality type) that might affect comparisons between the self-selected groups. The two subgroups that are paired thus become the focus of outcome determination evaluation, while the remaining participants in the two groups are excluded from this aspect of the overall program evaluation.
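The pairing step might be sketched as follows; the client names, tenure figures, organizational levels and the matching rule (same level, closest tenure) are all assumptions made for illustration rather than a prescribed procedure.

```python
# Hypothetical clients who self-selected into the new coaching program (experimental)
# or the existing program (control), with characteristics that might bias a comparison.
experimental_clients = [
    {"name": "E1", "tenure": 3, "level": "manager"},
    {"name": "E2", "tenure": 8, "level": "director"},
    {"name": "E3", "tenure": 5, "level": "manager"},
]
control_clients = [
    {"name": "C1", "tenure": 7, "level": "director"},
    {"name": "C2", "tenure": 3, "level": "manager"},
    {"name": "C3", "tenure": 12, "level": "vice president"},
]

def best_match(client, candidates):
    """Pick the unmatched control client at the same level with the closest tenure."""
    same_level = [c for c in candidates if c["level"] == client["level"]]
    if not same_level:
        return None
    return min(same_level, key=lambda c: abs(c["tenure"] - client["tenure"]))

pairs, unmatched_controls = [], list(control_clients)
for client in experimental_clients:
    partner = best_match(client, unmatched_controls)
    if partner is not None:
        pairs.append((client["name"], partner["name"]))
        unmatched_controls.remove(partner)

print("Matched pairs used for outcome determination:", pairs)
print("Control clients excluded from this comparison:", [c["name"] for c in unmatched_controls])
```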
A rotational/counterbalanced design also can be used in place of a classic experimental design, especially if no control group can be obtained and if the evaluators are particularly interested in specific aspects or sequences of activities in the organizational coaching program being evaluated. The rotational/counterbalanced design requires that the program be broken into three or four units. One group of program participants would be presented with one sequence of these units (e.g. Unit 1, Unit 3, Unit 2), a second group of participants being presented with a second sequence (e.g. Unit 3, Unit 2, Unit 1) and so forth. Ideally, each possible sequence of units should be offered. Outcomes are assessed at the end of each unit. An evaluator who makes use of this design will obtain substantial information about program outcomes, as well as some indication about interaction between organizational coaching program activities.
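The mechanics of offering every possible sequence can be sketched in a few lines; the unit labels and group assignments below are placeholders, and the outcome scores would be filled in from the assessments administered at the end of each unit.

```python
from itertools import permutations

# Three program units; ideally every possible ordering of the units is offered to some group.
units = ["Unit 1", "Unit 2", "Unit 3"]
sequences = list(permutations(units))

# Assign each participant group one sequence; outcome scores would be recorded after
# each unit from the assessments administered at that point (None until collected).
groups = {
    f"Group {number}": {"sequence": sequence, "scores_after_each_unit": [None] * len(units)}
    for number, sequence in enumerate(sequences, start=1)
}

for group, record in groups.items():
    print(group, "->", " then ".join(record["sequence"]))
```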
The rotational/counterbalanced design might be used successfully in the assessment of an organizational coaching program that offers several different services (e.g. reflective, instrumented and organizational coaching) in several different sequences. It would yield information not only about the overall success of the new program but also suggest which sequence of program services is most effective. Campbell and Stanley describe a variety of other designs, indicating the strengths and weaknesses of each. They show that some designs are relatively more effective than others in certain circumstances, such as those involving limited resources and complex program outcomes. In addition, they suggest alternatives to the classic experimental design for situations in which that design may be obtrusive to the program being evaluated or otherwise not feasible.
Organizational coaching evaluation initiatives that focus on the determination of outcomes are likely to be very attractive to those in the diffusion of innovation literature who are called the early majority (30-50% of population in most organizations). They want to see evidence of coaching effectiveness before participating in a coaching program. This evidence need not be quantitative in nature; it can be based on qualitative data that are “trusted” by the early majority (trust being based on the belief that the data are being collected by either neutral sources or those who are committed to the program, but are open to new learning with regard to this program’s successes or failures).
One makes a convincing argument when addressing the concerns of the early majority by identifying the criteria for measuring success of the organizational coaching program, this success being defined either by return on investment (amount of money devoted to the coaching program as related to revenue generation or cost savings assignable to the coaching services) or return on expectations (what the client system expected the outcome(s) to be from the coaching session). Develop and systematically implement assessment tools that relate directly to the criteria of success. Base the determination of outcomes on these assessment measures. Be sure to measure both before and after the coaching program has been initiated to determine if this program has “added value” to the baseline.
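As a rough illustration of these two criteria, the sketch below computes a return on investment and a simple return-on-expectations score; the costs, benefits and expectation ratings are invented, and an actual evaluation would derive them from the organization's own records and the client system's stated expectations.

```python
# Hypothetical return-on-investment calculation for a coaching program.
program_cost = 80_000            # total spent on coaching services
attributable_benefit = 125_000   # revenue gains or cost savings assignable to the coaching

roi = (attributable_benefit - program_cost) / program_cost
print(f"Return on investment: {roi:.0%}")

# Hypothetical return on expectations: the client system rates, for each expected outcome,
# how fully that expectation was met (0.0 = not at all, 1.0 = fully met).
expectation_ratings = {
    "improved delegation by coached managers": 0.8,
    "reduced turnover in coached teams": 0.6,
    "clearer succession plans": 0.9,
}
return_on_expectations = sum(expectation_ratings.values()) / len(expectation_ratings)
print(f"Return on expectations: {return_on_expectations:.0%}")
```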
Level Four: Diagnosis
Many times evaluations of coaching programs are unsatisfactory, not because they fail to determine whether or not an outcome has been achieved or an impact observed, but rather because they tell us very little about why a particular outcome or impact occurred. At the end of an organizational coaching program we may be able to determine that it has been successful and has yielded an impressive return on our investment in the program. However, if we do not know the reasons for this success, that is, if we have not fully appreciated the complex dynamics operating within and upon this organizational coaching program, then we have little information that is of practical value. We have very few ideas about how to sustain or improve the program, or about how to implement a successful organizational coaching program somewhere else. All we can do is continue doing what we already have done. This choice is fraught with problems, for conditions can change rapidly. Programs that were once successful may no longer be so.
Michael Quinn Patton is among the most influential evaluators in his emphasis on the pragmatic value inherent in a diagnostic focus. Coining the phrase utilization-focused evaluation, Patton suggests that:
Unless one knows that a program is operating according to design, there may be little reason to expect it to produce the desired outcomes. . . . When outcomes are evaluated without knowledge of implementation, the results seldom provide a direction for action because the decision maker lacks information about what produced the observed outcomes (or lack of outcomes). Pure pre-post outcomes evaluation is the “black box” approach to evaluation.
A desire to know the causes of organizational coaching program success or failure may be of minimal importance if an evaluation is being performed only to determine success or failure, or if there are no plans to continue or replicate the program in other settings. However, if the evaluation is to be conducted while the program is in progress, or if there are plans for repeating the organizational coaching program somewhere else, evaluation should include appreciative procedures for diagnosing the causes of success and failure.
What are the characteristics of a diagnostic evaluation that is appreciative in nature? First, this type of evaluation necessarily requires qualitative analysis.17 Whereas evaluation that focuses on outcomes or that is deficit-oriented usually requires some form of quantifiable measurement, diagnostic evaluation is more often qualitative or a mixture of qualitative and quantitative. Numbers in isolation rarely yield appreciative insights, nor do they tell us why something has or has not been successful. This does not mean that quantification is inappropriate to diagnostic evaluation. It only suggests that quantification is usually not sufficient. Second, the appreciative search for causes to such complex social issues as the success or failure of an organizational coaching program requires a broad, systemic look at the program being evaluated in its social milieu. Program diagnosis must necessarily involve a description of the landscape. The organizational coaching program must be examined in its social and historical context.
Third, an appreciative approach to diagnostic evaluation requires a process of progressive focusing: successively more accurate analyses of causes and effects in the organizational coaching program are engaged. Since a diagnostic evaluation is intended primarily for the internal use of the program’s staff and advisors, it must be responsive to the specific questions these people have asked about the program. Typically, a chicken-and-egg dilemma is confronted: the questions to be asked often become clear only after some initial information is collected. Thus, a diagnostic evaluation is likely to be most effective if it is appreciative in focusing on a set of increasingly precise questions.
Appreciative focusing takes place in a progressive manner during the information collection phase of the diagnostic evaluation process. Malcolm Parlett, the developer of a diagnostically oriented procedure called illuminative evaluation, describes appreciative focusing as a three-stage information collection process.18 During the first stage:
. . . the researcher is concerned to familiarize himself thoroughly with the day-to-day reality of the setting or settings he is studying. In this he is similar to social anthropologists or to natural historians. Like them he makes no attempt to manipulate, control or eliminate situational variables, but takes as given the complex scene he encounters. His chief task is to unravel it; isolate its significant features; delineate cycles of cause and effect; and comprehend relationships between beliefs and practices, and between organizational patterns and the responses of individuals.
The second stage involves the selection of specific aspects of the organizational coaching program for more sustained and intensive inquiry. The questioning process in the second stage of an illuminative evaluation becomes more focused and, in general, observations and inquiry become more directed, systematic and selective. During the third stage, general principles that underlie the organization and dynamics of the program are identified, described and, as a result, appreciated. Patterns of cause and effect are identified within the program, and individual findings are placed in a broader explanatory context.
The three stages of progressive focusing have been summarized by Parlett:
Obviously, the three stages overlap and functionally interrelate. The transition from stage to stage, as the investigation unfolds, occurs as problem areas become progressively clarified and re-defined. The course of the study cannot be charted in advance. Beginning with an extensive data base, the researchers systematically reduce the breadth of their inquiry to give more concentrated attention to the emerging issues. This progressive focusing permits unique and unpredicted phenomena to be given due weight. It reduces the problem of data overload and prevents the accumulation of a mass of unanalyzed material.
These three appreciative characteristics of diagnostic evaluation (qualitative analysis, systematic perspectives and progressive focusing) are often troublesome for both inexperienced and traditional evaluators. These characteristics appear to fly in the face of a contemporary emphasis on precision, measurement, objectivity and the discovery of deficits. Such is not the case, however, for these three characteristics can serve to enhance rather than take the place of a more traditional so-called scientific evaluation.
In looking appreciatively at cause and effect relationships in a complex social setting, a whole variety of tools and concepts must be considered. In attempting to better understand the workings of a specific program or culture, the evaluator, like the anthropologist, uses a variety of data collection methods, ranging from participant-observation and interviews to questionnaires and activity logs.21 Parlett suggests that the experienced evaluator also emulates the anthropologist in making use of various data analysis methods, ranging from narration and metaphor to multivariate statistics.
One is likely to have an impact on one of the most difficult populations, the group known in the diffusion of innovation literature as the late majority (30-40% of the population in most organizations), only with the engagement of diagnostic program evaluation initiatives. The late majority do things (such as obtain a coach) not because they understand what is happening, but because this is the “right” thing to do (the “bandwagon” effect). There is a high possibility that the innovation will be misused at this point. It is particularly important at this phase to understand why something works. Program diagnosis helps to better prepare users and screen out inappropriate users.
An evaluation of an organizational coaching program is likely to be successful, and useful to most members of an organization (including the late majority), only if it is not only diagnostic but also appreciative in nature. This fourth level of analysis can be best achieved through solicitation of appreciative narratives (“catching people when they’re doing it right”) and through dialogue regarding why this level of success was achieved. One can also make use of sophisticated quantitative tools, such as multivariate analysis and factor analysis, to tease out the contributing factors (often called moderator variables), such as the amount of experience of coaching clients in a leadership role or the specific coaching strategy being used. Eventually, highly sophisticated tools will be used that mix together qualitative and quantitative methods: cross-impact analysis, spread-of-impact analysis, organizational mapping (causal loop diagrams), and computer modeling.
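As one illustration of how a moderator variable might be teased out quantitatively, the sketch below fits an ordinary least-squares regression with an interaction term between coaching hours and prior leadership experience; the data are simulated and the variable names and effect sizes are assumptions chosen purely for demonstration.

```python
import numpy as np

# Simulated (not real) data: the outcome is a post-program performance gain, the predictor
# is hours of coaching received, and the hypothesized moderator is years of prior
# leadership experience. The interaction term tests whether the effect of coaching
# depends on that experience.
rng = np.random.default_rng(0)
n = 40
coaching_hours = rng.uniform(5, 30, n)
leadership_years = rng.uniform(0, 15, n)
gain = (0.3 * coaching_hours + 0.2 * leadership_years
        + 0.05 * coaching_hours * leadership_years + rng.normal(0, 2, n))

# Design matrix: intercept, both main effects, and their interaction (the moderation effect).
X = np.column_stack([
    np.ones(n),
    coaching_hours,
    leadership_years,
    coaching_hours * leadership_years,
])
coefficients, *_ = np.linalg.lstsq(X, gain, rcond=None)

for label, value in zip(
    ["intercept", "coaching hours", "leadership years", "interaction (moderation)"],
    coefficients,
):
    print(f"{label:>25s}: {value:6.3f}")
```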
Valid, Useful and Appreciative Information
The four types of program evaluation described above all contribute to the decision-making and dissemination processes that inevitably attend the ongoing planning and development of any organizational coaching program. An optimally effective and appreciative program evaluation will draw all four types into a single, comprehensive design. This design brings together the valid, useful and appreciative information collected from the documentary, descriptive, diagnostic and outcome determination evaluations. It also combines this program information with a clear and consensus-based agreement concerning the purposes and desired outcomes of the program, yielding results that translate readily into program decisions and dissemination.
Initially, the idea of incorporating all four types of evaluation into a single, comprehensive project may not seem feasible. However, with careful planning, all four types can be employed at relatively low cost. First, the same information sources and data gathering procedures can be used to collect several different kinds of information. Careful, integrated planning not only saves time for the evaluators. It also reduces the chance that program participants will feel over-evaluated and under-appreciated.
Second, the whole evaluation process can be spread out over a fairly long period of time, if planning begins early enough in the development of a new organizational coaching program. The long-term planning of widely dispersed evaluative interventions makes a major evaluation project possible and reduces negative reactions to those interventions that are made. In general, an outcome determination evaluation will require extensive attention at the start and end of a program, whereas program description and diagnosis require greatest attention during the middle of the program. Program documentation requires some attention before the program begins and during the program. The most extensive documentation work is required after the outcome determination evaluation is completed and before a final report is prepared.
Any evaluation design team must keep a third point in mind. It concerns the “friendly” (or not so friendly) opposition: the stakeholders who never seem to be convinced by any evaluation data (who, in this instance, never seem to come around to acknowledging the benefits of organizational coaching). We know from research on the diffusion of innovation that the final and most resistant population in an organization is composed of the so-called laggards or recalcitrants (5-15% of the population in most organizations).
They will never support this “new thing” — unless they can be “co-opted”. They often hate what the innovator is doing and the innovation that the innovator wishes to implement, because they themselves are former innovators. Thus, when they see a new innovation that could be successful, it reminds them of their own failure and they don’t want someone else “showing them up”. We “co-opt” the laggards by inviting them into our design process. We gather their recollections and lessons learned, along with documents, descriptions, summative evaluations and formative evaluations from the innovations they introduced or championed that did not work.
How can they contribute their wisdom to the current innovative initiative (organizational coaching)? How can they finally be part of a success? How can their talents finally be appreciated? Along with the other groups (innovators, early adopters, early majority and late majority), the recalcitrants can be won over to the side of effective organizational coaching. It is always a matter of understanding and appreciation in gaining the support of any members of an organization for any new initiative, such as organizational coaching.
A Ten Step Program Evaluation Process
Regardless of the focus of the evaluation, there are certain key features that the organizational coaching program evaluation should incorporate. Michael Scriven identifies fourteen key features of any successful program evaluation.22 I have built on, modified and distilled these fourteen features, identifying a set of ten program evaluation steps. Each of these steps should be included in any organizational coaching program evaluation that seeks to be appreciative in form and purpose.
The following ten steps are appropriate in addressing the needs of either an outcome determination evaluation or diagnostic evaluation, and incorporate both program description and program documentation. These ten steps are: (1) identification of program unit(s) to be evaluated, (2) formulation of an evaluation strategy and identification of personnel to conduct this evaluation, (3) formulation of an initial set of evaluation questions, (4) collection of information regarding these questions, (5) analysis of information collected, (6) reporting back of analyzed information, (7) evaluation of the evaluation, (8) reformulation of evaluation questions, (9) repetition of steps three, four, five and six for an indeterminate number of iterations, and (10) preparation of final evaluation report.
Step One: Identification of Coaching Program Unit(s) to be Evaluated
Scriven first points to the identification of the program unit to be evaluated. In systems terms, this would be described as the differentiation of the boundaries of the system. What is to be included and what is to be excluded in the evaluation of a specific organizational coaching program or unit of an organizational coaching program? What are the overall nature, function and operation of the coaching program unit being evaluated? What are the components, their functions and relationships? What is the nature of the system by which the product or service is being delivered to the coaching client? What is the nature of the support system and environment in which the coaching program unit is situated? These questions are all appreciative in character and relevant to this initial program identification step.
Step Two: Formulation of Evaluation Strategy and Identification of Personnel
Once the coaching program unit is identified and boundaries have been drawn, some planning must take place. This planning focuses on issues of clientship, audience, purpose of evaluation, and personnel to conduct the evaluation. The following specific questions should be addressed during this step:
Identification of Client: For whom (person, department, staff, committee) is this evaluation specifically being conducted? Who should formulate researchable questions for this evaluation?
Identification of Audience: Who will be present for the reporting back of analyzed information from this evaluation? Who will receive copies of any reports that are produced from this evaluation?
Purpose of Evaluation: What is the overall purpose of this evaluation? Will it primarily be used for formative purposes and therefore be focused on diagnostic issues? Will it primarily be used for summative purposes and therefore be focused on the determination of program outcomes? In general, how might an evaluation benefit this coaching program? How might an evaluation benefit the stakeholders for this coaching program, including the people being served by the program, the people who are responsible for the program, and the people who must make decisions about the program? How can we justify spending time and money on this evaluation project?
The first of these questions must be answered to ensure some clarity regarding responsibility for the evaluation project and for results produced by the project. Somewhere between one and four people should be identified as the client for a program evaluation. As the client, these people must make all decisions regarding the questions to be asked in an evaluation. The client may consult with other people regarding the questions to be asked, but has ultimate responsibility for selecting these questions. Only a small number of people should be designated as part of the client system. If more than four or five people are involved in the decision-making process, the program evaluation is likely to be diffuse and useless.
The client for an organizational coaching program evaluation may be the program director, a committee, the department that is responsible for the program, or a specific member of the program staff. If the evaluation is focused on the determination of outcomes, then the client group should include the administrator or group to which the coaching program unit is accountable. If the evaluation is summative and outcome-focused, a vice president to whom those running the organizational coaching program report might also be an appropriate member of the client group.
Other stakeholders are also appropriate, such as members of an advisory group that oversees this organizational coaching program. These people or organizations are appropriate clients for outcome determination evaluation. However, they are rarely appropriate if the evaluation is primarily diagnostic and formative in nature. These external stakeholders tend to add an element of threat and judgment to the formative evaluation process that will often distort the results of this evaluation and will usually make it difficult for those who are running the coaching program to benefit from the formative findings. While these external stakeholders may be part of the audience for a diagnostic evaluation, they should not be designated as the client.
The question of audience is separate from the question of client. The audience for a program evaluation is defined as the group of people who will receive copies of all or most reports produced by the evaluator or evaluation team. A smaller audience may be identified for oral reports on the project. The audience for a program evaluation may change as the purposes of the evaluation become clearer and as the causes of program strength and difficulty become better known. The audience for a program evaluation should be identified at an early point in the planning process. The nature of the audience and its interests and concerns will influence the type of questions that need to be asked and the amount of candor that will go into any oral or written report.
This leads us to the third question regarding the overall purpose for an appreciative evaluation. While the client should select specific questions, there must be a wider ownership for the overall design and purpose of an appreciative evaluation. Members of the audience for this evaluation, as well as other influential people or groups within the organization, should be informed of the identity of the client and the audience. They should also acknowledge and understand the nature and purpose of the evaluation. Frequently, the purpose of an appreciative evaluation is misunderstood, which can cause the results to be disappointing or misinterpreted by many people who read or hear of the report. Some organization-wide dissemination of information about appreciative program evaluation processes and purposes is advisable to reduce misunderstanding and disappointment.
A broadly representative constituency might be involved even more centrally in a final decision that must be made during this step, if the culture of the organization encourages collaboration: Who should conduct the evaluation? Someone who knows a great deal about organizational coaching might be selected. Alternatively, someone might be selected who is knowledgeable about program evaluation. Should one or more people from within the organization be selected? What about an outside evaluator?
A broad constituency is often of value in selecting a program evaluator, because if in-house resource people are used they often must be given release time, compensation or at least recognition. These in-house resource people also must be given legitimacy so that they do not suffer the fate of most prophets-in-their-own-land. This legitimating process may be unnecessary if someone in the organization has been formally assigned this role or works in an office for program review or evaluation.
The development of a broad constituency might also be necessary if the decision is made to bring in one or more resource people from outside the organization, for these people cost money, which typically requires deliberation regarding priorities of the organization. An appreciative program evaluation can also have an impact on the organization in wide and unanticipated ways. There should therefore be broad-based participation in and commitment to the appreciative evaluation process.
Several factors should be kept in mind regarding the selection of a program evaluator or team. First, the person(s) should know or be given sufficient opportunity to become knowledgeable about the organization. Without some sense of the organization’s culture, an evaluator can rarely offer anything other than mundane or inaccurate observations. Second, the evaluator should be able to offer a fresh perspective. She should be a cosmopolitan who knows what is happening in other similar organizational coaching programs. A parochial perspective yields useless and often self-serving analyses. Third, the evaluator(s) should have access to a variety of evaluation tools and strategies so that the investigation is not tightly constrained by the limitations of the evaluator.
Usually no one person will meet all of these requirements. An evaluation team of four to five members is often appropriate. This team might include an outside evaluator with expertise in formative or summative evaluation processes or an outside evaluator with specific expertise in organizational coaching. This evaluator typically heads the team. The team might also include an internal person with evaluation expertise, who serves as coordinator and liaison person for the team, and one or two internal people with broad organizational perspectives and widely acknowledged credibility. Such a team is small enough to do an efficient job, yet large and diverse enough to meet the wide-ranging needs of a diagnostic evaluation. Larger teams are rarely appropriate even when engaged in major evaluation tasks, for considerable attention usually has to be devoted to the coordination of activities and administration when a large evaluation team is employed. Time is drained away from the evaluation itself.
If a program does not warrant a four- to five-person team, or if such a team is not feasible, given budgets and time constraints, then a two-person team might be considered. One of these individuals might be an external evaluation expert, the other a respected in-house resource person with a broad inter-organizational perspective concerning organizational coaching. This team might be given assistance by other members of the organization in specific operations for short periods of time.
Step Three: Formulation of Initial Evaluation Questions
Once the program unit, client and audience have been identified, attention turns to the formulation of researchable questions for the evaluation. The term researchable question identifies an important limitation in any program evaluation. This type of evaluation can be used to answer only questions regarding the current status of a program. Questions regarding intentions (in which directions should we move?) are not appropriate for this step of an evaluation process.
A researchable question is one that potentially can yield an answer that everyone will accept, at least tentatively, as being valid. Thus, a question regarding the attitudes of clients toward a specific coaching activity is researchable. Information can be collected to answer this question to almost everyone's satisfaction. On the other hand, a question regarding whether these attitudes should make a difference in plotting future program directions is not researchable, for this is a question regarding values and can only be answered through discussion and, if necessary, mediation or negotiation among members of the organization.
The client should make an initial attempt to identify major concerns about the organizational coaching program that can be resolved at least in part by obtaining more information about some aspect of the program. Typically, this will involve one of the following concerns:
Attitudes about the program: how do people who are affiliated with the organizational coaching program (staff, participants, advisers) and/or are acquainted with the coaching program feel about the program as a whole or about specific program events or products?
Causes of specific attitudes: what seem to be the sources of certain attitudes about the coaching program and/or specific coaching events or products?
Effects of specific attitudes: what are likely to be the consequences of certain attitudes held by people about the coaching program or specific coaching activities?
Program success or failure: given certain criteria for program success, to what extent is this coaching program successful (summative evaluation) and what forces operating within the coaching program or operating on the program from outside seem to facilitate or hinder program success or failure (formative evaluation)?
The client should identify one or more concerns in one or more of these four areas. While the client should give some attention to these issues, even before meeting with the program evaluators, much of the clarification will occur in interaction with the evaluation team. As noted above, increased clarification will occur as data are gathered, initially analyzed and reported back to the client through an appreciative process. At this preliminary stage, it is only necessary for the client and evaluators to arrive at a common understanding about several researchable questions to give the evaluation team some direction for the initial collection of information.
Step Four: Collection of Information Concerning the Coaching Program
During this step, the progressive focusing process that Parlett describes goes into effect, especially if the evaluation is to have an appreciative focus. Typically, the program evaluators seek to answer the specific questions that have been posed by the client, as well as to set a context in which answers to these questions can be interpreted and understood. This context-setting activity relates directly to the first step in the appreciative evaluation process: description of the program's support system and environment.
The concept of triangulation is important for this stage of an appreciative evaluation. An evaluator should gather information from at least three different sources, using at least three different methods so that results are unlikely to be biased by a particular source or methodology. The three sources might include program participants, program staff and neutral outside observers. Formal interviews, questionnaires and performance tests are particularly appropriate methods to use in conducting a triangulated program evaluation, though other methods such as document review, participant observation, informal interviewing, and polling may also be appropriate.
When only one source or method is used, an evaluator can’t account for any distortion that might be caused by the source or method itself. When two sources or methods are used, different results can either be interpreted as substantive or as a product of source or methodological biases. When three or more sources and methods are used, source and methodological biases usually can be sorted out and separated from more important, substantive concerns.
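For evaluators who want a concrete way to keep this triangulation bookkeeping straight, the short sketch below is a hypothetical illustration only, not part of the appreciative method itself; the findings, sources and methods are invented, and Python is used simply as convenient notation. It tallies which sources and methods support each finding and flags a finding as triangulated only when at least three distinct sources and three distinct methods corroborate it.

```python
from collections import defaultdict

# Each observation records a finding together with the source and method
# that produced it; the data here are invented for illustration.
observations = [
    ("coaches feel under-supported", "participants", "interview"),
    ("coaches feel under-supported", "staff", "questionnaire"),
    ("coaches feel under-supported", "outside observer", "performance test"),
    ("sessions start late", "participants", "interview"),
    ("sessions start late", "participants", "questionnaire"),
]

# Group the distinct sources and methods that support each finding.
support = defaultdict(lambda: {"sources": set(), "methods": set()})
for finding, source, method in observations:
    support[finding]["sources"].add(source)
    support[finding]["methods"].add(method)

# A finding counts as triangulated only when at least three distinct
# sources and at least three distinct methods corroborate it.
for finding, s in support.items():
    triangulated = len(s["sources"]) >= 3 and len(s["methods"]) >= 3
    print(f"{finding}: triangulated={triangulated}")
```

In this invented example, the first finding would be flagged as triangulated while the second, supported by a single source, would not; the same tally could of course be kept on paper or in a spreadsheet.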
The length of the information-collecting phase will vary widely depending on the size and complexity of the organizational coaching program. Usually twenty to forty hours of information collection are required for a thorough appreciative evaluation. In most instances, information begins to be highly redundant after ten to fifteen interviews or six to eight hours of observing the same type of event, provided there is a random sampling of interviews and observations.
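The point about redundancy can also be made concrete with a simple saturation check. The sketch below is again only an illustrative assumption, using invented interview themes: it counts how many new themes each successive interview contributes and suggests stopping once two interviews in a row add nothing new, which is one informal way of recognizing that the information has become redundant.

```python
# Themes coded from each successive interview (invented data).
interviews = [
    {"workload", "trust"},
    {"workload", "scheduling"},
    {"trust", "scheduling"},
    {"scheduling"},
    {"workload"},
]

seen = set()
consecutive_without_new = 0
for i, themes in enumerate(interviews, start=1):
    new = themes - seen          # themes not heard in any earlier interview
    seen |= themes
    consecutive_without_new = 0 if new else consecutive_without_new + 1
    print(f"interview {i}: {len(new)} new theme(s)")
    # Stop collecting once, say, two interviews in a row add no new themes.
    if consecutive_without_new >= 2:
        print("saturation reached; further interviews are likely redundant")
        break
```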
Step Five: Analysis of Information Concerning Coaching Program
The resources of a program evaluation team are not fully brought to bear at the time of information collection but rather at the time of analysis and interpretation. The resources of an external expert in the program area being studied or of an internal or external evaluator with extensive experience with coaching programs in similar organizations can be particularly valuable in the analysis of collected information.
Two main issues need to be kept in mind.26 First, analysis should be focused on the researchable questions offered by the client. Second, an effort should be made to preserve the richness of the collected information, while distilling and refining that information so that recipients of this analysis are not overwhelmed with detail or complexity. This second concern resides at the heart of any appreciative evaluation process. Appreciative evaluators often blend a macro-analysis of overall themes and issues as they relate to the client’s questions, with a micro-analysis of specific events, case histories or critical incidents that relate to the client’s questions.
By focusing on questions asked by the client, appreciative evaluators can demonstrate concern for the interests of the client and program, and thereby establish greater credibility and trust. One should avoid evaluators who continually insert their own definitions of the problem and speak about representing the overall interests of the organization, without being clear whose interests (other than their own) are really being served. Evaluators can be of valuable assistance to the client in helping to clarify the questions to be asked and to broaden the range of questions that are addressed. This discussion should occur, however, after information about the initial questions has been reported, the step to which we now turn.
Step Six: Reporting Back Analyzed Information
Ideally, the reporting back process begins before all of the information has been collected. The evaluators meet informally with the client to report on early observations, intuitions, tentative conclusions and so forth. The client's reactions can be used to add greater precision to the remaining information collection. This is part of the progressive focusing process associated with appreciative evaluations. Typically, the formal reporting back step does not commence until all the information has been collected and analyzed, so that the evaluators can be given an opportunity to pull together their ideas.
As a general rule, the evaluators' analysis of program information should be presented first in an oral report to the client and, in some cases, the audience. An oral report enables the client to ask questions, gain clarification on specific points and react to specific conclusions in interaction with the evaluators. An oral presentation also enables the evaluators to receive immediate feedback on the nature and results of their evaluative inquiry. The oral report usually begins with a review of the client's questions and the methods used to collect program information. A few general observations and comments on the program often follow, not so much to inform the client about the program as to inform the client about the particular perspective, orientation or bias that the evaluators bring to the program diagnosis. Each of the client's questions is then examined in some detail, often with an appreciative blend of macro (themes, issues) and micro (events, case histories, critical incidents) analyses.
Step Seven: Evaluation of the Evaluation
After the client has been given an opportunity to comment on the presentation, attention should shift to next steps. If no further information collection and analysis is warranted then the evaluators should move to the tenth step, preparing a final written report. If the initial program evaluation seems to have been of value, and if further information collection and analysis seems warranted, then the client and evaluators should first discuss revisions, if any, in the evaluative procedures being used.
At this point, the client is evaluating the evaluation. Scriven believes that this meta-evaluation is a professional imperative for any program evaluator. It is needed for the ongoing professional development of the evaluator and, more broadly, the continuing maturation of the field of program evaluation. In this way, a meta-evaluation serves an ongoing formative purpose. Meta-evaluations, however, also serve ethical purposes and help to establish the credibility of the profession: one should be open to having one’s own work evaluated if there is a commitment to evaluate the work of other people. This should help not only to improve the procedures that were used, but also to enhance trust and openness between the two parties. Upon completion of this evaluation of the evaluation, the client and evaluators are ready either to conclude the evaluation with a written report or to move to a second cycle in the appreciative process of progressive focusing.
While evaluating the evaluation, the client and evaluators probably will encounter the following two issues:
To what extent did the initial questions capture the central concerns of the client about the coaching program being evaluated?
To what extent did answers to the questions that were asked yield important new insights about the coaching program? Do the answers produce recommendations for changes in the program (formative evaluation)? Do the answers provide stakeholders with information that increases the chances that thoughtful and balanced decisions will be made about the program (summative evaluation)?
To the extent that the questions didn’t capture central concerns or generate new insights and recommendations, the evaluators might be faulted in part for inappropriate information collecting procedures or inadequate analysis. In most instances, however, the questions themselves are partially responsible for the relative failure of initial data collection and analysis.
Step Eight: Reformulation of Evaluation Questions
One will sometimes find that the initial questions have to be reformulated, based on the client's and evaluators' further experience with the program evaluation procedure and the answers given to the initial questions. The reformulation, as in the case of the initial formulation, should involve both the client and evaluators, with the latter helping the former to consider a wide array of alternative questions, to clarify the questions that are chosen, and to ensure that the questions are posed in a researchable manner.
Step Nine: Recycling through Collection, Analysis and Reporting of Program Information
With a new or revised set of questions, the evaluation team proceeds through one or more additional cycles of the appreciative evaluation process. Each cycle will begin with an increasingly precise and focused set of questions. Usually, less information needs to be collected with each cycle, for information collected in previous cycles will be relevant to the new questions, and the amount of information needed to answer precise questions is less than is needed to answer more diffuse questions. In many instances, the information collection step is reduced in length, because the same people will be interviewed, the same activities observed, and so forth as in earlier cycles. New information usually will be integrated with previously collected information, eliminating the need for identification of new themes or integrating principles.
When reporting back in cycles after the initial one, the evaluators will want to review the findings from previous cycles briefly before moving on to new information. Each reporting session should end with an evaluation of the evaluation and a decision either to conclude the evaluation or to recycle through the fourth, fifth and sixth steps with newly reformulated evaluation questions. Usually, an appreciative program evaluation will require two or three cycles. Rarely are more than five cycles required to yield valid and useful information for the client.
Step Ten: Preparation of Final Evaluation Report
Once the client and evaluators have decided to bring the appreciative evaluation to a close, the final written report should be prepared. The evaluators should prepare an initial written draft that is reviewed by the client. They then prepare a final draft for formal submission to the client and distribution to the designated audience.
The client is asked to review an initial draft for two reasons. First, nothing should catch the client by surprise when the final report is prepared, particularly if the evaluation contains a summative component. Differences of opinion between the client and evaluators should be discussed at this point or earlier during one or more of the oral report sessions. Differences should not be deferred until after the final report is submitted. Second, an initial draft is submitted to ensure that factual errors are corrected and that specific observations or conclusions are not likely to be misinterpreted. If the final report is likely to be controversial, several members of the audience might even review the initial draft to ensure minimal errors or misunderstanding.
The written report in an appreciative evaluation process usually begins with a general statement about the type of questions that were asked and the purpose of the evaluation, as determined during the first step. The evaluation procedures that were used are then reviewed, followed by a listing and discussion of the specific questions asked by the client. The questions can be grouped either by cycle or by general theme. A commentary on each question is then presented. Supporting documentation, such as quotations, statistics, and transcripts, should be included in the text if brief, or placed in a table or appendix if longer than one page. The appreciative report will often come alive and be both more informative and influential if it incorporates additional modes, such as photo essays, audio or video recordings, multimedia presentations, product displays, scenarios, case studies, portrayals, questions and answers, and even computer-based simulations that dramatically illustrate the primary dynamic of the program.
If the evaluation report is to stand alone, it should begin with a brief description of the program. It might also conclude with a set of recommendations made by the evaluators based on an analysis of the program. Usually, the evaluators should avoid making a single, specific recommendation, for reactions against this recommendation can serve to invalidate all the work done on the program diagnosis. Given the evaluators' unique and appreciative perspective on the program, their recommendations can be of value to a program even after the formal program evaluation is completed. A program evaluator can provide insights and advice for several years to come if the appreciative evaluation has been thoughtfully conceived and executed.
Conclusions
“The tragedies of science,” according to Thomas Huxley, “are the slayings of beautiful hypotheses by ugly facts”. The leaders and managers of organizational coaching programs in many organizations face this prospect when confronted with the need to design or select ways of evaluating their efforts. Organizational coaching program evaluation may indeed be threatening to their cherished notions about how human and organizational resources are developed and about how change and stabilization actually take place. More immediately, evaluation can be threatening to one’s beliefs regarding how a particular coaching program is impacting a particular client or department. All of this suggests that program evaluation in this field (and most other fields) must be designed and managed carefully and with full recognition of both the opportunities and threats associated with any program evaluation process.
- Posted by Bill Bergquist
- On August 15, 2011