Three quick judgments, then the list. Full recap tomorrow, but:

1. HAHAHAHAHAHAHAHA, Reed Grimm's schtick is schticked out! You cannot just ab-flash your way into the top 13!

2. Hallie Day > Hollie Cavanagh. Or > Shannon, but they probably had a one H_llie limit, and besides, Shannon most likely got the megachurch vote, something I'm astonished I didn't see coming earlier.

3. Full disclosure: I will be stanning for Erika Van Pelt from here on out.

Now then! Your top 13! It's almost perfect:

1. Phillip Phillips

2. Jessica Sanchez

3. Hollie Cavanagh

4. Joshua Ledet

5. Heejun Han

6. Shannon Magrane

7. Skylar Laine

8. Elise Testone

9. Colton Dixon

10. Jermaine Jones

11. Erika Van Pelt

12. Jeremy Rosado

13. Deandre Brackensick

Measuring the outcomes of leadership development programs.

Journal of Leadership & Organizational Studies November 1, 2009 | Black, Alice M.; Earnest, Garee W.

The lack of research evaluating the outcomes of leadership development programs and the lack of a suitable evaluation instrument are evident in the literature. This study represents the first attempt at providing a comprehensive method to evaluate and measure leadership development programs on a post-program level. Social learning theory, adult learning theory, and the EvaluLEAD framework influenced the theoretical model developed in this research. The EvaluLEAD principles provide a basis for the conceptual model and result in the development of a program evaluation instrument named the Leadership Program Outcomes Measure. Finally, the application of this measure to one statewide leadership development program is presented.

The foundation of leadership development programs began in 1983 with the vision of the W. K. Kellogg Foundation (WKKF). The WKKF (2001) funded the creation of the first organized statewide leadership development programs to exist in the United States. These programs were geared toward participants primarily from rural areas. They presently extend to 32 states and three countries, lay claim to thousands of alumni, and receive an immense amount of stakeholder support. However, the lack of research evaluating the outcomes of leadership development programs and the lack of a suitable evaluation instrument are evident in the literature.

This dearth of evaluation mechanisms has not gone unnoticed in the research community. Pointing to this serious shortage of rigorous, systematic evidence in program evaluation, Carman (2007) and the WKKF (2001) have called for an increase in leadership development program evaluation.

This study, Measuring the Outcomes of Leadership Development Programs, sought to assist in leadership program evaluation through the creation of an instrument that measured program outcomes on the individual, organizational, and community levels. We used the early WKKF model to measure the outcomes of one statewide leadership development program created in 1985, and we built on an evaluation framework called EvaluLEAD proposed by Grove, Kibel, and Haas (2005). The results of this study provide the first examination of the effect of a leadership development program at the post-program evaluation level. The study employed a comprehensive instrument called the Leadership Program Outcomes Measure (LPOM). The LPOM was developed by Black (2006) to gain insight into alumni outcomes and program achievements. It has important implications for those who manage leadership development programs and who wish to evaluate post-program outcomes.

Carter and Rudd (2000) suggest that the two primary goals of early leadership development programs were to develop leadership skills in the participants and to enhance participants' knowledge of topics. Social learning theory (SLT) of Bandura (1986), adult learning theory (Birkenholz, 1999; Caffarella, 2002; Knowles, 1984; Lieb, 1991), and Rost's (1993) leadership paradigm influenced the theoretical model that was developed in our research for evaluating leadership development programs. The researchers acknowledge the existence of other exemplary leadership research; however, we believe that these theories were best suited to capturing leadership program development outcomes. The proposed model (see Figure 1) attempts to capture the elements relating to participants of leadership programs, which in turn leads to a theory-driven evaluation approach (Bledsoe & Graham, 2005).

This study is designed to address the dearth of evaluation methods available to those who plan and administer leadership development programs. There are relatively few published studies designed to measure the level of change that a participant experiences from his or her leadership program experience and to what degree this change radiates from the participant to the community in which he or she interacts. This article will provide an example of research conducted with a new instrument designed to address this gap.

[FIGURE 1 OMITTED]

We add to the literature by seeking to capture the level of change on personal, business, and community levels experienced by the participants. The study combines quantitative measures of degree of change ratings with qualitative measures to triangulate the data and provide a higher degree of reliability and validity to the instrument. The instrument resulting from this research is applicable to publicly or privately sponsored leadership development programs seeking to evaluate the program's effect. The findings from this research add to the understanding of the effect of leadership development programs post-program and present a novel approach to leadership program evaluation.

Literature Review

Social Learning Theory

As researchers, we often heard from participants of leadership development programs that the "group" influenced personal growth. Bandura (1977) was the first theorist to develop the concept of "imitation" as modeling behavior, where individuals learn from one another by observing behaviors and imitating them.

This process, called social learning theory or "observational learning," was first identified by Bandura (1977, 1986), who points out that the observation of others can help individuals learn from example. According to Bandura, needless errors are eliminated when individuals learn by observing others and then thinking about their actions before performing them. In SLT, modeling behaviors assists the individual's learning through exposure to guides; this is a process that Bandura calls "informative learning." In addition, a person can learn a behavior but may wait until a later time to display that behavior. Bandura proposed that a person's thought processes affect his or her behavior when coupled with exposure to social experiences. These observations and experiences are then drawn on to establish new patterns of behavior that often go beyond those of the exposed levels (Bandura, 1986).

Bandura (1986) found that individuals change because the skills needed to be effective in their efforts to bring about change are demonstrated. He notes that, by empowering people with creative mechanisms, people can exercise influence in areas of their life. Thus, individuals can be empowered with the ability to exercise influence in areas of their lives through social experience and modeling. This modeling helps an individual develop the belief that she or he, too, can accomplish what someone else is observed accomplishing (McGowan, 1986).

"Through modeling we can transmit skills, attitudes, values, and emotional proclivities" (Bandura, 1986, p. 5). SLT argues that individuals should interact and exhibit behavior that enhances self, ability, and role performance.

Based on this theory, a significant difference in activities should be identified in participants of leadership development programs. SLT contends that individuals should interact and exhibit behavior, which enhances self and ability. Therefore, participants in leadership development programs should exhibit behaviors that indicate increased ability in role performance, increased involvement in community activities, and increased perceptions of reality that make them more aware of cultural differences.

Adult Learning

Another area of leadership program planning and delivery occurs through the application of adult learning theories, or "andragogy" (Caffarella, 2002). Knowles (1984), the pioneer of the principles of andragogy, theorized that adults learn experientially and use a problem-solving approach to their pursuit of knowledge. Furthermore, for a program to be successful, adults need to be informed as to why they need to learn and why a topic is of value to them (Caffarella, 2002).

Finally, adults learn best by interaction through hands-on experiences related first to their knowledge. Lieb (1991) then links Bandura's (1977) SLT to adult learning theory by explaining that adults are motivated by social relationships and the need for associations and friendships. Furthermore, Gibson (2004) states that SLT is very evident in adult learning and emphasizes Bandura's (1977) theory related to adult learning in the areas of attention, retention, performance and motivation, reciprocal determinism, self-regulation, and self-efficacy.

Rost Paradigm

Because leadership development programs are never unidirectional but are, rather, complex webs of relationships, motivation, and interaction, the Rost paradigm influenced the researchers in this study. Rost's (1993) paradigm indicates that leadership is a relationship of influence where real change occurs with a mutual purpose in the context of relationships. Rost explains that in the model, leaders and followers give evidence of their intention in action or words. These relationships of influence can be multidirectional, vertical, horizontal, diagonal, or circular and are noncoercive in nature. Rost notes that real change occurs within relationships that are purposeful and future oriented. However, Rost also states that although change may be produced in the leadership relationship, it is not essential to it.

Theoretical Model of Leadership

Applying the research constructs of SLT (Bandura, 1977, 1986), adult learning theory (Birkenholz, 1999; Caffarella, 2002; Knowles, 1984; Lieb, 1991), and the EvaluLEAD framework (Grove et al., 2005) along with the Rost (1993) paradigm, the researchers developed a theoretical model of leadership (Figure 1) for this research. The model is fixed in the context of leadership programs beginning with a group of individuals motivated to learn. The individuals undergo learning activities that form social relationships. The participant's experiences occur through observation, modeling, cognition, and environment. The observed results are self-confidence, behavior change, motivation, action, influential relationships, and mutual purpose. These areas interact and lead to transformation within the individual, the organization, and the community. The model may have processes cycling back through different levels as individual needs or new avenues of experience occur.

This study explored three levels of outcomes on the individual, organizational, and community levels. The researchers developed the EvaluLEAD Conceptual Model (see Figure 2) to present visually the context for the study based on the framework developed by Grove et al. (2005). The individual, organizational, and community domains indicated in the model are the areas where program outcomes occur. The individual domain is where most of the direct benefits of the leadership development program will occur and where the most program-associated results might be expected (Grove et al., 2005). The organizational domain is where results occur within the organizations where the program participants work. Results can also occur in outside organizations where the participants have contact (Grove et al., 2005). Finally, the community domain refers to communities--social or professional networks--to which the program participants' influences extend either directly or through their organizational work (Grove et al., 2005).

The program outcomes that occur from the leadership development program may be on the episodic, developmental, or transformative level as outlined in the EvaluLEAD Conceptual Model (Figure 2). Grove et al. (2005) offer examples of episodic results being the actions of the participants, which are well defined and time-bound. Developmental results occur across time and at different speeds and are represented as steps taken by an individual who may reach some challenging outcome, such as a sustained change in behavior or a new strategy (Grove et al., 2005). The transformative area is where fundamental shifts occur in behavior or performance. They are the "prize" (p. 11) to which programs aspire. The leadership program outcomes identified in the model on the individual, organizational, and community levels are examples of outcomes that Grove et al. (2005) mention could occur at these levels.

[FIGURE 2 OMITTED]

Leadership Program Evaluation

The forms of inquiry that Grove et al. (2005) describe as evidential and evocative take the form of evaluation methodology. Evidential inquiry seeks to capture the facts of what is occurring to an individual through the gathering of "hard evidence" (p. 13). On the other hand, evocative inquiry seeks the person's viewpoints and feedback through such methods as open-ended surveys, case studies, and so on (Grove et al., 2005). Finally, according to Bledsoe and Graham (2005), one can use a consumer approach to evaluation to determine the needs and opinions of those receiving services and those who would be most able to determine how well a program is meeting those needs.

The use of multiple methods allows for triangulation of results. One method can offset another method's weakness or complement a method's strengths. Relying solely on quantitative data can mask great differences among participants (Patton, 1990). Patton indicates that qualitative data from the same study can show the real meaning of the program for participants. Patton points out that a dynamic evaluation is not tied to a single treatment, predetermined goals, or outcomes; rather, it focuses on the actual operations and effects of a process, program, or intervention over time. Evaluators focus on capturing process, documenting variations, and exploring individual differences in experiences and outcomes (Patton, 1990).

Conflicting results between the qualitative and quantitative data also may be found (Kan & Parry, 2004; Wall & Kelsey, 2004). Therefore, the importance of triangulating data and using multiple methods in program evaluation is emphasized. Furthermore, Martineau and Hannum (2004) suggest that some researchers doubt the merits of the retrospective assessment because it could create an increase in the ratings from the "before" to the "now." They argue for the validity of retrospective surveys by pointing out that ratings of change are highly correlated with objective measures of change such as performance appraisals (Martineau & Hannum, 2004; Rockwell & Kohn, 1989).

Martineau and Hannum (2004) further suggest that evaluation techniques should measure more than just the participant's perception of the program. Therefore, the evaluation of leadership development programs can be much more difficult because the programs produce intangible results, such as increased leadership capacity (Martineau & Hannum, 2004). Finally, significant changes of performance in leadership development may only be revealed as long-term effects over time (Martineau & Hannum, 2004).

In program evaluation, one must be careful of response-shift bias when using quantitative methods. Response-shift bias occurs when individuals have rated themselves at one time, from one perspective, and then change their responses later because their perspectives have changed (Martineau & Hannum, 2004). Retrospective pretest and posttest assessments require two ratings: One rating focuses on the individual before the program, and the other rating assesses the person's skill and behaviors after the program is complete (Martineau & Hannum, 2004). Response-shift bias is avoided when participants rate themselves within a single frame of reference (Pratt, McGuigan, & Katzev, 2000). Pratt et al. (2000) state that retrospective designs produce a more legitimate assessment of program outcomes than traditional pretest-posttest methodology. They suggest that collecting outcome information at the end of the program can respond to the dynamic, evolving needs of the participants to reflect the actual program content as it evolved over time.

Furthermore, Rockwell and Kohn (1989) argue that program participants may have limited knowledge at the beginning of a program, which prevents them from determining their baseline behaviors. By a program's end, the content may have affected their responses. Therefore, if a pretest were used, the participants would have no way to know if they have made an accurate assessment, and this would cause response-shift bias.

The limitations of the retrospective design are memory-related problems, and there might be a subjective motivation to make the program look good on the part of the participants (Pratt et al., 2000). Questions should be formulated to enhance the recall of events. Pratt and colleagues believe behaviors that are more specific are easier to recall and assess than behaviors that are more global. The authors state that any self-report must be considered a form of estimation and may contain subject bias.

Wall and Kelsey (2004) found an overestimation of the knowledge and skills gained--along with social desirability and effort justification--in a statewide leadership development program based on a retrospective survey. They state that researchers need to be aware that self-reporting surveys may be inadequate for determining program effects.

Degree-of-change ratings, where individuals rate their degree of change using a 5-point response scale ranging from no change to great change, seem to be better for assessing change across rater groups such as peers, direct reports, and supervisors (Martineau & Hannum, 2004). The degree-of-change rating indicates the amount of change better than evaluations that measure change using pre- and posttest ratings (Martineau & Hannum, 2004). Finally, Grove et al. (2005) urge the collection of actual named events or points of outcome from the participants, which controls for response-shift bias.

Leadership development improves activities that "sustain the achievement of positive outcomes for organizations, communities, and countries by individuals" (Grove & PLP Team, 2002, p. 2). Grove and PLP Team (2002) point out that leadership development relies on group processes and occurs through a multitude of experiences rather than at a static point in time. "Leadership is a result of the individual's placement with and among others involved in actions oriented toward meaningful change" (p. 7).

Russon and Reinelt (2004) further emphasize that methods of evaluation must be clarified in determining what the audiences of the evaluation really want to know. Reinelt, Foster, and Sullivan (2002) note that a future task might be the development of assessment tools that could be used across leadership programs, especially areas of overlapping program interests. Finally, Carman (2007) suggests that most evaluations and data collection in community-based organizations are conducted internally by program staff with little funding or support. Flexibility in application is important because one single model of evaluation cannot be applied across the many different contexts, goals, and outcomes of the myriad of leadership development programs (Grove & PLP Team, 2002).

Research Focus

The work of Bandura (1986) in SLT emphasizes self-efficacy, motivation, observational learning, and the behaviors that accompany outcome identification. All of these areas interact in the area of reciprocal determinism, where learning takes place on the environmental, behavioral, and personal levels. The EvaluLEAD framework allows for outcomes that will vary across many different program levels and concepts. Grove and PLP Team (2002) explain that the relationship between the program and the observable result may not be direct. They suggest that evaluation approaches must explore learning as well as job and career performance. To this end, descriptive data serve a critical purpose for triangulation as well as stories and interpretive techniques. To build on the W. K. Kellogg Scan (Reinelt et al., 2002), the PLP Team has developed and proposed the EvaluLEAD Framework (Grove et al., 2005). The term framework is used instead of the word model to allow for flexibility in the EvaluLEAD application (Grove & PLP Team, 2002).

The EvaluLEAD framework assumes that the evaluation of leadership development programs will lead to findings that could not be foreseen (Grove et al., 2005). Therefore, stakeholders and administrators will be informed about the program's effects, and the program will produce better results.

Combined with the EvaluLEAD model's individual, organizational, and community outcomes of leadership programming, the researchers sought to determine the results of the leadership program on the three levels of self, organization, and community. We also sought to develop an effective instrument to evaluate leadership programs. In review of the research, a satisfactory measurement instrument for program outcomes did not exist. Therefore, a method of program measurement needed to be developed, which resulted in the production of the LPOM (Black, 2006) instrument to measure leadership program outcomes.

Method and Sample

The researchers found that an instrument did not exist to measure leadership program outcomes after the participants leave the program. Therefore, they developed the LPOM (Black, 2006). The LPOM focuses on self-assessment measures, which are frequently used in summative program evaluation (Stufflebeam, 2001) and are conceptualized by the EvaluLEAD conceptual framework presented by Grove et al. (2005).

A first step in program evaluation is identification of the program goals, which serves to pinpoint the path on which a program works (Bledsoe & Graham, 2005). The outcome goals expected of these rural leadership development programs have been identified by Carter and Rudd (2000). They state that when WKKF established these rural leadership development programs, the goals for the programs were to first develop leadership skills in the participants and then to enhance their knowledge on topics (Carter & Rudd, 2000).

This research focuses on the dilemma faced by administrators of leadership development programs, which is the lack of an instrument to allow for program outcome evaluation. Research to date has not focused on real-world, cost-effective outcome measures for these programs. Because a tested program evaluation instrument does not exist to determine program outcomes, this research, which tests an instrument and reports results toward determining leadership program outcomes, is the first stage in program evaluation.

As suggested by Patton (1990), focus groups were used to generate the original scale items. Focus group results were categorized using NQR 6.0 software with the EvaluLEAD model (Figure 2) as a guide. In the individual domain, 78 possible outcomes were identified from the focus groups. In the organizational domain, 31 outcomes were identified. Finally, in the community domain, 15 outcomes were identified. A group of leadership program directors and faculty from across the United States judged the content and face validity of each item of the item pool for inclusion in the final survey instrument.

After examination, the individual outcomes were collapsed into 12 items focusing on self-confidence, interpersonal skills, organizational skills, community involvement, and creative thinking. In the organizational domain, outcomes were combined into 11 items of business decision making, innovativeness, use of business resources, new leadership skills, and improved management skills. In the community domain, 8 items were identified dealing with leadership roles, increased involvement, increased awareness of time, and appreciation of cultural differences. This resulted in the generation and testing of the final survey instrument.

The final instrument included three sections corresponding to the main variables in the study: individual-level outcomes, organizational-level outcomes, and community-level outcomes, plus demographic data. The measures are one-dimensional for each domain. Scoring consists of summing the response items for each dimension. Each section included a 5-point Likert-type scale presented for the participant's selection of the following: (1) none/none at all, (2) a little, (3) some, (4) much, and (5) a great deal. As suggested by Martineau and Hannum (2004), the Likert-type scale questions were used to measure the extent of participant agreement and to measure a degree of change. The degree-of-change rating was the participant's self-rating for the amount of change that he or she believed resulted from the program. To determine attitude strength, reduce response-shift bias, and triangulate the data from the degree-of-change rating, open-ended questions were included (Grove et al., 2005; Martineau & Hannum, 2004).
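The scoring rule described above (summing 1-5 ratings within each domain) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the example respondent are hypothetical, the item counts follow the 12, 11, and 8 items reported for the pre-trimming instrument, and the ratings are invented.

```python
from typing import Dict, List

# Response labels as given in the text (1 = none/none at all ... 5 = a great deal).
SCALE_LABELS = {
    1: "none/none at all",
    2: "a little",
    3: "some",
    4: "much",
    5: "a great deal",
}


def score_subscale(responses: List[int]) -> int:
    """Sum the 1-5 ratings for one domain (a one-dimensional subscale)."""
    for rating in responses:
        if rating not in SCALE_LABELS:
            raise ValueError(f"rating {rating} is outside the 1-5 scale")
    return sum(responses)


def score_instrument(answers: Dict[str, List[int]]) -> Dict[str, int]:
    """Return one summed score per domain."""
    return {domain: score_subscale(items) for domain, items in answers.items()}


# Hypothetical respondent; the ratings are invented for illustration only.
example = {
    "individual": [4, 5, 3, 4, 4, 5, 3, 4, 4, 3, 5, 4],
    "organizational": [3, 4, 3, 3, 4, 2, 3, 4, 3, 3, 4],
    "community": [2, 3, 2, 3, 4, 2, 3, 2],
}
scores = score_instrument(example)
print(scores)
```

Because each domain has a different number of items, summed scores are comparable within a domain but not across domains without further normalization.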

The instrument was checked for face and content validity by a panel of experts that included internal administrators familiar with the program as well as directors of other similar leadership development programs. The instrument was field tested with a similar leadership development program in a different state. Reliability estimates of internal consistency were demonstrated with Cronbach's alpha levels of .91 on the individual-level section, .92 on the organizational-level section, and .79 on the community-level section. As recommended by Netemeyer, Bearden, and Sharma (2003), there was no effect from socioeconomic factors when entered as predictors into the multiple regression analysis. Correlations with the open-ended questions indicated predictive validity and a minimization of response bias. The reliability and dimensionality of the instrument were supported. All item-to-total correlations were .30 and higher, and alpha levels of near .80 and above for the three sections resulted in all scale items being retained for the next round of study (Netemeyer et al., 2003).
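For readers unfamiliar with the internal-consistency estimate reported here, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k-1) * (1 - sum of item variances / variance of summed scores). It is not the authors' analysis code, and the small response matrix is hypothetical data invented for illustration.

```python
from statistics import variance
from typing import List


def cronbach_alpha(rows: List[List[int]]) -> float:
    """Cronbach's alpha; rows[i][j] is respondent i's rating on item j."""
    k = len(rows[0])                                # number of items
    items = list(zip(*rows))                        # one tuple of ratings per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])    # variance of summed scores
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical ratings: 5 respondents x 4 items on the 1-5 scale.
sample = [
    [4, 5, 4, 4],
    [3, 3, 2, 3],
    [5, 5, 4, 5],
    [2, 3, 2, 2],
    [4, 4, 3, 4],
]
alpha = cronbach_alpha(sample)
print(round(alpha, 3))
```

The sketch uses sample variances (n - 1 in the denominator) for both the items and the totals, which is the usual convention.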

Due to the strong theoretical base, confirmatory factor analysis (CFA) with maximum likelihood estimation was used as a priori for each scale using the structural equation modeling (SEM) program AMOS 7.0 (DeVellis, 2003; Netemeyer et al., 2003). Each of the three scales was tested on several factors. Scale items were then trimmed of redundant questions (DeVellis, 2003; Netemeyer et al., 2003). What emerged as the best fit for each subscale were one-factor models. The absolute fit indices of the chi-squared statistic and the goodness-of-fit index (GFI) as well as the incremental fit statistics, such as the comparative fit index (CFI) and the root mean square error of approximation (RMSEA), were used to establish model fit and are reported in the Results section of this article.

The instrument was administered as an online survey to 262 leadership development program alumni. Participants were asked to self-assess the outcomes of their leadership program experience. To control for response-shift bias, open-ended questions were used to allow participants to list facts supporting their report of outcomes on each subscale of the instrument.

Results

To assess how the model represented the data and to determine construct validity, the data underwent CFA. Netemeyer et al. (2003) recommend that standardized regression weights (SRW) should be higher than .30 with GFI values greater than .95, CFI values close to 1 (.90 and above), and RMSEA values less than .08 to represent acceptable fit.

Table 1 presents the descriptive statistics and SRW after applying CFA to the three subscales. After examination, higher scoring items were selected to be fixed-factor coefficients (Stapleton, 1997). The first subscale identifying individual outcomes indicated that the model that best fit the data was a one-factor model. An acceptable fit was produced after scale modification where 2 of the 12 items were deemed redundant, had lower mean values and SRW, and were eliminated from the scale (see Table 1). The individual outcomes model yielded χ² = 42.8, df = 30, p > .05, GFI = .949, CFI = .982, and RMSEA = .051. SRW were all greater than .30 (range = .48-.80).

The organizational outcomes subscale model that best fit the data was a one-factor model. An acceptable fit was produced after 2 items were deleted from the 11-item scale, due to low means and SRW (Table 1). The organizational outcomes model yielded χ² = 30.19, df = 21, p > .05, GFI = .968, CFI = .988, and RMSEA = .052. SRW were all greater than .30 (range = .63-.86).

Finally, the community outcomes subscale model that best fit the data was a one-factor model. An acceptable fit was produced after two of eight items were trimmed from the scale (Table 1). The community outcomes model yielded χ² = 15.6, df = 8, p > .05, GFI = .972, CFI = .984, and RMSEA = .077. SRW were all greater than .30 (range = .39-.96).
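As a reading aid, the reported fit indices can be compared against the cutoffs quoted earlier from Netemeyer et al. (2003). The sketch below is illustrative only: the function name and data layout are hypothetical, the cutoffs are applied literally as strict inequalities (an assumption about how to read them), and the numbers are copied from the three paragraphs above.

```python
from typing import Dict


def check_fit(gfi: float, cfi: float, rmsea: float, min_srw: float) -> Dict[str, bool]:
    """Compare one model's indices with the cutoffs quoted in the text."""
    return {
        "gfi": gfi > 0.95,       # GFI greater than .95
        "cfi": cfi >= 0.90,      # CFI .90 and above
        "rmsea": rmsea < 0.08,   # RMSEA less than .08
        "srw": min_srw > 0.30,   # lowest standardized regression weight above .30
    }


# Values as reported for the three one-factor models.
reported = {
    "individual": dict(gfi=0.949, cfi=0.982, rmsea=0.051, min_srw=0.48),
    "organizational": dict(gfi=0.968, cfi=0.988, rmsea=0.052, min_srw=0.63),
    "community": dict(gfi=0.972, cfi=0.984, rmsea=0.077, min_srw=0.39),
}
checks = {name: check_fit(**vals) for name, vals in reported.items()}
for name, result in checks.items():
    print(name, result)
```

Read this strictly, the individual subscale's GFI of .949 sits just under the .95 cutoff while every other index passes. Since .949 rounds to .95, the authors' description of the fit as acceptable is plausibly a rounding matter, though that reading is an inference rather than something the article states.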

Furthermore, scale convergent validity was determined by open-ended questions being included (Netemeyer et al., 2003). Scale concurrent validity was demonstrated by redundant questions and comparison to the focus group results (Netemeyer et al., 2003).

Program Evaluation Outcomes

As described in the conceptual framework of this study, when the EvaluLEAD model is applied as a unit of measure to the leadership program, outcomes should be identified on the individual, organizational, and community levels. The findings of this study are based on a program alumni response rate of 75% (n = 196). Seventy-three percent of those responding were male and 27% female. The majority of participants were 40 to 59 years of age (79%), and the majority (86%) had incomes greater than $40,000 per year. Fifty-one percent were from rural farm areas.

Means, standard deviations, and SRW for the Likert-type scale item results indicate that the program has a positive effect on the individual and organizational levels (Table 1). Less effect was detected on the community level, with the exception of cultural awareness. Furthermore, when referenced with the open-ended questions as a check on Likert-type scale responses, the instrument identified leadership program outcomes, which strengthened validity.

Individual-Level Effects

Individual outcomes. The individual domain is where most of the direct benefits of leadership development occur and where the most program-associated results might be expected (Grove et al., 2005).

The evaluations of the individual-level effects indicate that participants were most affected by the program at the individual level. This finding is not surprising because Grove et al. (2005) indicate that most leadership programs should work at this level.

Participants were asked to respond to 12 items dealing with how they individually changed due to their leadership program experience (Table 1, Subscale 1). Seven scales show a majority of participants reporting a positive relationship between the program and individual-level outcomes. The remaining five scales fell in the middle range or the "some" category for the majority reporting. No scales in the negative range had a majority.

The strength of the individual outcomes varied according to the individual. All, however, had some type of outcomes on the individual level from their program participation. Eighty-eight percent of the participants described ways in which they personally changed due to their leadership program experience. At the individual level, outcomes occurred in the areas of personal growth, self-confidence, personal power, creative thinking, valuing of time, business skill-building, and modeling behaviors. Patterns that emerged on this level from the open-ended questions were increased confidence, increased communications skills, better ability to network, and more awareness of cultural factors.

Organizational-Level Effects

Organizational outcomes. These are program-associated results that occur either within the organizations where the program participants work or in outside organizations where the participants have contact (Grove et al., 2005).

The organizational domain was designed to measure the participants' outcomes on a subscale made up of 11 variables (Table 1, Subscale 2). Five scales showed a majority reporting a positive relationship between the program and this level of outcome. Five scales fell in the middle range or the "some" category for the majority reporting. The remaining scale did not have a majority reporting in any of the summed areas. The relationship of these variables to self-efficacy indicates the influence of group situations where the social comparison of one's own performance to that of one's peers and the persuasion of others lead to greater self-efficacy (Brown, 1999). The increased self-efficacy relates to increased self-motivation (Brown, 1999). This then relates to career choice, job attitude, learning and achievement, training proficiency, task persistence, and goal-directed behaviors. Self-efficacy is related to important organizational outcomes (Van Knippenberg, Van Knippenberg, De Cremer, & Hogg, 2004). In evaluating organizational-level outcomes, Grove et al. (2005) note that individuals "may have license to initiate changes on their own or they may first need to build support and constituencies for their ideas" (p. 13).

On the organizational level, participants reported in the open-ended questions improvement in networking, improved understanding of the "big picture," better communications skills in business, and improved management skills. Eighty percent of the participants described ways they improved on a professional, organizational, or business level due to their leadership program experience.

The majority of participants reported that they frequently experienced positive program outcomes occurring at the organizational level. The ability to network in terms of business and keeping contacts was a highly reported outcome. Participants reported that the increased networking benefited them in the business arena because of the support from other participants. They also reported an increase in problem-solving skills. Because of the leadership program, participants reported that they were able to improve business skills and bring new perspectives and new ideas to their businesses.

Community-Level Effects

Community outcomes. The community level of outcomes is the community where the program participants have influence either individually, directly, or indirectly through the organizations with which they work or are affiliated. Grove et al. (2005) believe that the mission or "reason for being" (p. 9) for most leadership development programs is to influence this domain.

The community domain refers to neighborhoods, communities, or sectors of society to which the influences of participants may extend (Grove et al., 2005). The community-level items were designed to measure how participation in the community changed after the leadership program experience. The section included eight items (Table 1, Subscale 3). The community outcome section had lower reports of participant change than the other two areas.

The most unexpected occurrence at the community level was the very high number of participants (75%) who indicated that their awareness of cultural diversity changed. Fifty-seven percent reported being involved with their church, and 47% reported being involved with their local farm bureau. Other patterns show that participants had a low involvement level in politics but a high involvement level with organizations within their areas of expertise. Those responding indicated working on the local and community levels (45%) because of their involvement with the program.

To triangulate the data further, participants were asked to self-report what new community projects they championed because of their leadership program experience. Forty percent indicated that they championed community projects. These ranged from small community projects to large international projects.

Seventy percent of the alumni described, in detail, their involvement in organizations on the local, state, and national levels, and 70% held offices in these organizations. On the open-ended question, 72% of the participants indicated being involved in their community as volunteers. In addition, 58% reported holding board of director positions, which elevated them to the level of an opinion leader within their communities. Sixty-six percent felt that they do make a difference within their communities. In the political area, 23% reported holding elected or governmental positions. From those reporting, the positions held are primarily on the local level such as county commissioner and township trustee. If a goal of this program is to increase political activity, then an effort must be made to mold political office holders. If the goal is to increase civic awareness and consciousness, the focus groups and answers to the open-ended questions indicated that this is occurring. The community outcome showed lower reports of participant change than the other two outcome areas. These results are consistent with expected leadership program outcomes (Grove et al., 2005).

Participants not reporting involvement in the community area indicated that they decided to cut back on their involvement due to either family or business considerations. This correlates with the individual-level scale where 60% of those responding indicated higher awareness of the value of their time. Several participants noted that they now choose where to get involved and that they have learned to say no to those requesting their time.

Areas That Decreased or Worsened for Program Participants

One primary area emerged in the open-ended question where participants were queried as to what decreased or worsened because of their program experience. Thirty-eight percent indicated relationships with their spouse, family, and/or farm being affected by their program involvement. One person indicated, "marriage ended because I grew and my spouse would not." Another stated, "there was tension at times with my spouse. This is probably due to my spouse's attitude and/or lack of interest."

Summary

The purpose of this research was to develop a method to determine leadership program outcomes after the participants leave the program. Leadership programs gather formative evaluation data as participants go through the programs, but a summative evaluation mechanism does not exist. This dearth of research in leadership program evaluation has led to the development of the LPOM (Black, 2006). The LPOM includes summated items and open-ended items, which form three subscales that measure individual, organizational, and community outcomes, respectively. The scale was developed to measure leadership program outcomes after participants leave a program. Participants are asked to rate the outcomes of their leadership program experience using a Likert-type scale of (1) none/none at all, (2) a little, (3) some, (4) much, and (5) a great deal. The instrument includes open-ended questions to triangulate and validate self-reports. SEM was used to verify the scales included in the instrument. SEM analysis resulted in reducing the scale from 31 to 25 Likert-type scale items.

The next step needed for further scale validation is to administer the LPOM to several other leadership programs and conduct both exploratory and confirmatory factor analyses. This step will also serve to increase the sample size, which will assist in further evaluation of the scales. No effect from socioeconomic factors was indicated when entered as predictors into multiple regression analysis. This is important because the analysis indicates that the effects of the program were not confined to a particular group of individuals who were older, younger, male, female, or from a particular region. The EvaluLEAD framework (Grove et al., 2005) worked as a model of determining leadership program outcomes.

To summarize the implications of this study, the instrument was found to measure the outcomes of this leadership program. It provided data and insights not obvious to stakeholders and administrators. This research provides a first look at the outcomes of a statewide leadership development program and, more broadly, at the results of leadership programs.

Consistent with Bandura's (1986) SLT, this pilot study of leadership program outcomes shows that participants improved in cultural awareness. This indicates that when people are exposed to individuals different from themselves, their attitude about others will change. SLT suggests that observation of a behavior is a strong influencer and has greater value than verbal instruction. The data collected indicate that people who are in the position of learning as participants in leadership education gain knowledge of self, improve in business, are active in the local community, and are more aware of cultural differences.

Eighty-six percent of the participants (n = 166) reported being changed by the program. Because 60% of those reporting were rural residents and 19% were small town residents, it is noteworthy that an appreciation for cultural differences was expressed. The outcomes of this research triangulate with the literature. Social and adult learning theory (Bandura, 1986; Knowles, 1984) indicates that adults learn when they self-select the environment in which they learn. If expectancy of outcome influences a person's behavior, those individuals in the leadership development program should have the expectancy of success because of the program.

Bandura (2000) notes that a person's belief in his or her ability to be successful indicates a high level of self-efficacy. Because they believe in their success, people will put forth more effort and show more persistence toward a goal. Conclusions from the study indicate that the program generated positive outcomes on the individual and organizational levels. On the community level, one positive result existed.

The outcomes noted in this study are consistent with the WKKF (2001) findings, which note that leadership development programs should work on the personal, professional, policy, and practice levels. Personal growth is achieved by broadening a leader's perspectives, by increasing self-confidence, and by giving individuals a clearer sense of self-purpose or self-efficacy. The concept of professional growth is achieved by learning innovative approaches to management and business and increasing industry representation and participation in leadership roles (WKKF, 2001). Policy and practice achievements are made and strong networks of resources are formed among the participants.

In this study, the data captured the statistics and the outcomes of the program, while testing the LPOM instrument. The quantitative data provided detail concerning the outcomes being experienced in each of three levels. The qualitative data provided the personal story of how this was accomplished and how the experience affected both the participant and the community, thereby allowing triangulation and bringing depth and texture to the research study.

Ultimately, the development of the LPOM evaluation instrument provides a novel approach to leadership program evaluation and seeks to address the gap in evaluation instruments. Thus, this research successfully resulted in a valuable evaluation instrument that can measure leadership program outcomes using both qualitative and quantitative data collection techniques (Tashakkori & Creswell, 2007). Therefore, this research contributes to the leadership literature and is well positioned to advance leadership program evaluation by providing a useful and easy-to-use instrument for leadership program administrators and researchers.

Use of this instrument has the potential for the collection of rich data that can help to explain the outcomes of leadership development programs in the lives of the people served by such programs on the personal, professional, and community levels. Identification of these factors will assist program administrators and others as they seek to achieve excellence in these programs and to document program effects and outcomes.

Finally, evaluation can assist with a program's improvement and provide a basis for reporting theoretically substantiated results. These results are then reported to stakeholders, participants, funders, and others. Effective evaluation provides decision-making data, which affects program change, expansion, or even abolishment. In addition, many grant programs now require program evaluation data. These researchers found that research-based measurement and evaluation tools did not exist for effective leadership program evaluation. Therefore, for decision makers in leadership programs to collect data to assist in their efforts, the LPOM (Black, 2006) was developed.

This research builds on the EvaluLEAD framework (Grove et al., 2005) in hopes of designing an instrument for collecting data that provides an easy-to-use format. Global fit indices used in CFA (root mean square residual [RMSR], RMSEA, GFI, and CFI) showed that the proposed models explain the participants' self-assessments. However, more research is needed on leadership program outcomes to compare results and to define the evaluation scales further.
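For readers less familiar with these indices, two of them can be computed directly from model chi-square statistics. The sketch below uses the conventional textbook formulas for RMSEA and CFI and is offered only as an illustration; it is not the software or procedure used in this study. The function names and all input values are hypothetical and are not results reported here.

```python
import math


def rmsea(chi2, df, n):
    """Root mean square error of approximation (common N - 1 form)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative fit index relative to a baseline (independence) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, d_model, 0.0)
    return 1.0 if d_base == 0 else 1.0 - d_model / d_base


# Invented values for illustration only (not taken from the article).
print(round(rmsea(chi2=90.0, df=45, n=200), 3))
print(round(cfi(90.0, 45, 900.0, 55), 3))
```

Lower RMSEA values and CFI values closer to 1 are conventionally read as indicating better model fit.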

Cross-validation of the instrument is necessary in further research. This will assist with criterion validity, convergent validity, and discriminant validity and will further develop the instrument to better capture leadership program outcomes.

Furthermore, the LPOM (Black, 2006) can assist program administrators by providing a tool that is cost-effective and can gather data via an online instrument. A baseline evaluation now exists that will allow program decision makers, funders, and stakeholders to determine improvements to the program, make changes, and discuss the outcomes.

Another important step for further research is to include stakeholders, funders, and others so that they can determine their outcome perspectives compared with those reported by program alumni. There is a need to delve more deeply into leadership program outcomes after participants leave the program at timed intervals. The idea that leadership program effects are felt long after a program ends is not yet researched. Therefore, a justifiable research goal has emerged to have a mechanism in place to evaluate these outcomes.

DOI: 10.1177/1548051809339193

References

Bandura, A. (1977). Social learning theory. Englewood Cliffs, NJ: Prentice Hall.

Bandura, A. (1986). Social foundations of thought and action. Englewood Cliffs, NJ: Prentice Hall.

Bandura, A. (2000). Cultivate self-efficacy for individual and organizational effectiveness. In E. A. Locke (Ed.), The Blackwell handbook of principles of organizational behavior. (pp. 120-136). Oxford, UK: Blackwell.

Birkenholz, R. J. (1999). Effective adult learning. Danville, IL: Interstate.

Black, A. M. (2006, October). Leadership Program Outcomes Measure--LPOM. Paper presented at ILA Conference, Chicago, IL.

Bledsoe, K. L., & Graham, J. A. (2005). The use of multiple evaluation approaches in program evaluation. American Journal of Evaluation, 26, 302-319.

Caffarella, R. S. (2002). Planning programs for adult learners (2nd ed.). San Francisco: Jossey-Bass.

Carman, J. G. (2007). Evaluation practice among community-based organizations: Research into reality. American Journal of Evaluation, 28, 60-75.

Carter, H. S., & Rudd, R. D. (2000). Evaluation of the Florida leadership program for agriculture and natural resources. Journal of Southern Agricultural Education Research, 50(1), 193-199.

DeVellis, R. F. (2003). Scale development: Theory and application (2nd ed.). Thousand Oaks, CA: Sage.

Gibson, S. K. (2004). Social learning (cognitive) theory and implications for human resources development. Advances in Developing Human Resources, 6(2), 193-210.

Grove, J. T., Kibel, B. M., & Haas, T. (2005). EvaluLEAD: A guide for shaping and evaluating leadership development programs. Oakland, CA: Sustainable Leadership Initiative, Public Health Institute.

Kan, M. M., & Parry, K. W. (2004). Identifying paradox: A grounded theory of leadership in overcoming resistance to change. The Leadership Quarterly, 15, 467-491.

Knowles, M. (1984). The adult learner: A neglected species (3rd ed.). Houston, TX: Gulf Publishing.

Martineau, J., & Hannum, K. (2004). Evaluating the impact of leadership development: A professional guide. Greensboro, NC: Center for Creative Leadership.

Netemeyer, R. G., Bearden, W. O., & Sharma, S. (2003). Scaling procedures: Issues and applications. Thousand Oaks, CA: Sage.

Patton, M. Q. (1990). Qualitative evaluation and research methods (2nd ed.). Thousand Oaks, CA: Sage.

Pratt, C. C., McGuigan, W. M., & Katzev, A. R. (2000). Measuring program outcomes: Using retrospective pretest methodology. American Journal of Evaluation, 21(3), 341-349.

Rost, J. C. (1993). Leadership for the twenty-first century. Westport, CT: Praeger.

Russon, C., & Reinelt, C. (2004). The results of an evaluation scan of 55 leadership development programs. Journal of Leadership & Organizational Studies, 10(3), 105-107.

Stufflebeam, D. L. (2001). Evaluation models: New directions for evaluation. San Francisco: Jossey-Bass.

Tashakkori, A., & Creswell, J. W. (2007). A new era of mixed methods. Journal of Mixed Methods Research, 1(1), 3-7.

Van Knippenberg, D., Van Knippenberg, B., De Cremer, D., & Hogg, M. A. (2004). Leadership, self, and identity: A review and research agenda. The Leadership Quarterly, 15(6), 825-856.

W. K. Kellogg Foundation. (2001). Leadership scan. The legacy of the ag leadership development program: Rich heritage cultivates future opportunities. Battle Creek, MI: Author.

Wall, L. J., & Kelsey, K. D. (2004). When findings collide: Examining survey vs. interview data in leadership education research. Journal of Southern Agricultural Education Research, 54(1), 180-193.

Alice M. Black
Garee W. Earnest
The Ohio State University

Table 1 Subscale Items, Descriptive Statistics, and Regression Weights (N = 196)

Subscale 1: Individual Outcomes Subscale

Personal Factor/Item   Growth   Self-Confidence   Power   Imitate Success
Mean                   3.9      3.6               3.5     3.5
Standard deviation     0.86     0.91              1.00    1.00
SRW (12 items)         0.74     0.67              0.78    0.65
SRW (10 items)         0.76     0.65              0.80    0.67

Personal Factor/Item   Time Value   Biz Skills   Creative Thinking   Community Involvement
Mean                   3.5          3.4          3.4                 3.3
Standard deviation     1.10         0.96         0.92                0.97
SRW (12 items)         11.40        0.73         0.66                0.48
SRW (10 items)         --           0.74         0.70                0.48

Personal Factor/Item   Family Value   Life Change   Control   Change
Mean                   3.2            3.0           2.9       2.7
Standard deviation     1.20           1.20          1.20      1.10
SRW (12 items)         0.70           0.71          0.62      0.61
SRW (10 items)         0.65           0.69          --        0.60

Subscale 2: Organizational Outcomes Subscale

Personal Factor/Item   Build Contacts   Network Skills   Facilitate Change   Problem Respond
Mean                   3.9              3.9              3.9                 3.4
Standard deviation     0.99             0.93             0.84                0.91
SRW (11 items)         0.58             0.68             0.70                0.85
SRW (9 items)          0.52             0.63             0.63                0.88

Personal Factor/Item   Problem Solve   Decision Making   Time Use   Confidence
Mean                   3.4             3.5               3.2        3.6
Standard deviation     0.94            0.94              1.00       1.10
SRW (11 items)         0.77            0.84              0.70       0.68
SRW (9 items)          0.79            0.86              0.73       0.65

Personal Factor/Item   Organizations   Resource Use   Career Change
Mean                   3.1             3.3            2.7
Standard deviation     1.10            0.09           1.20
SRW (11 items)         0.61            0.70           0.56
SRW (9 items)          --              0.66           --

Subscale 3: Community Outcomes Subscale

Personal Factor/Item   Cultural Difference   Time Value   Local Groups   Community Groups
Mean                   4.0                   3.2          3.0            2.9
Standard deviation     1.0                   1.0          1.1            1.1
SRW (8 items)          0.39                  0.56         0.95           0.90
SRW (6 items)          0.39                  0.56         0.96           0.90

Personal Factor/Item   Reduced Commit   State Level   National Level   Other Countries
Mean                   2.7              2.5           2.0              1.7
Standard deviation     1.2              1.3           1.3              1.2
SRW (8 items)          0.24             0.58          0.48             0.30
SRW (6 items)          --               0.56          0.50             --

Note: SRW = standardized regression weights.
