Posts Tagged ‘multisource feedback’
Here’s another NYTimes Corner Office offering, featuring Laszlo Bock, SVP of People Operations at Google. (http://www.nytimes.com/2013/06/20/business/in-head-hunting-big-data-may-not-be-such-a-big-deal.html?pagewanted=1). The first half is about hiring with some interesting observations (especially if you have responsibilities in that area). The second half describes their Upward Feedback process, along with other HR systems. And, no, they are not a client.
I offer these observations for your consideration:
- Big Data is the new fad, but many of us have been using large databases to understand the impact of our change processes for a long time, whether at the organizational level (employee surveys) or the individual level (360 Feedback).
- Your organization is not using “Big Data” (at least in the way Laszlo is describing) if you are using external norms. Note that Google is using internal norms very aggressively, tracking progress in moving the norm over time AND giving percentile rankings for each leader.
- The challenges he describes regarding hiring practices are very interesting, and it appears they are making some progress in implementing processes that are more predictive and more consistent. That said, hiring is always a challenge, which underscores the importance of using processes such as multisource (360) feedback to identify and either improve or weed out poor managers.
- He speaks to the importance of consistency in leaders. 360 Feedback promotes consistency in a number of ways. First, it defines the behaviors that describe successful leaders, a form of alignment. One of the behaviors can relate to consistency itself, i.e., providing feedback to the leader about whether he/she is consistent. In addition, an organization-wide 360 process that is administered and used in a consistent manner can only help in reinforcing the views of employees that decisions are being made on a fair basis. Organization-wide implementation is the key to success in creating change, acceptance and sustainability.
- Back to the percentile rankings. I have found organizations strangely averse to this practice of letting the leader know where he/she ranks against peers. As Laszlo notes, the challenge is to give the leader a realistic view of how he/she is perceived, and to create some motivation to change. By the way, these rankings are one “solution” to leniency trends, that is, saying to the leader, “You may think you are hot stuff because you got a 4.0 rating (out of 5) on that behavior, but you are still lower than 80% of your peers.” That scenario is common in areas such as Integrity where we expect high scores from our leaders.
- I am a little surprised that he believes that the managers can “self-motivate” in the way he describes. I am usually skeptical that leaders will change without accountability. I would like to know more about that. I have already noted the use of percentile rankings, which most organizations dismiss but which are seen as powerful motivators in this process. Laszlo also describes a dialog of sorts with the leader at the 8th percentile. Who is that conversation with? If it is with another person (boss, coach, HR manager), that alone creates a form of accountability and an implied consequence if improvement isn’t seen. If the conversation is just in the leader’s head, it speaks to the power of the information provided by the percentile score. Creating awareness is one thing. Awareness with context (e.g., comparison to others) is much more powerful. (Maybe like, “That’s a nice pair of pants! If it were the 60’s.”)
- Lastly, Laszlo speaks to the uniqueness of his and other organizations regarding what the organization needs from its leaders and how an individual employee might fit in and contribute. This clearly speaks to the need for custom designed content for hiring practices and then internal assessments once an employee is onboard.
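The percentile-ranking scenario in the list above (a 4.0 out of 5 that still trails 80% of peers) can be sketched in a few lines of Python. This is only an illustration: the peer scores and the function name `percent_of_peers_above` are invented for the example and say nothing about how Google actually computes its rankings.

```python
def percent_of_peers_above(leader_score, peer_scores):
    """Return the percentage of peers whose score is strictly higher than the leader's."""
    if not peer_scores:
        raise ValueError("peer_scores must not be empty")
    higher = sum(1 for score in peer_scores if score > leader_score)
    return 100.0 * higher / len(peer_scores)


# Hypothetical internal norm group: ten peers' ratings (1-5 scale) on an Integrity item.
peer_integrity_scores = [4.6, 4.5, 4.4, 4.4, 4.3, 4.2, 4.2, 4.1, 3.9, 3.8]
leader_score = 4.0

share_above = percent_of_peers_above(leader_score, peer_integrity_scores)
print(f"A {leader_score} rating is lower than {share_above:.0f}% of peers")
```

With these made-up numbers, a rating that looks strong in isolation sits near the bottom of the internal norm group, which is the kind of context an external norm or a raw score cannot provide.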
Google is doing some very interesting research regarding leadership. Go back and look at their work on leadership competencies that they publicized a couple years ago. http://www.nytimes.com/2011/03/13/business/13hire.html?pagewanted=all
Beyond the research, Google is actually using their Big Data to create a culture, define the leaders they require, and put some teeth into the theory with upward feedback at the forefront. Yet, at the end, he notes that all the measurement must be viewed through the lens of human insight. The context is deeper than just organization; it is also moderated by the current version of strategy, the team requirements, the job requirements, and the personal situation, all of which are in a constant state of flux.
©2013 David W. Bracken
My good friend and collaborator, Dale Rose, dropped me a note regarding his plans to do another benchmarking study on 360 Feedback processes. His company, The 3D Group, has done a couple of these studies before and Dale has been generous in sharing his results with me, which I have cited in some of my workshops and webinars. The studies are conducted by interviewing coordinators of active 360 systems. Given that they are verbal, some of the results have appeared somewhat internally inconsistent and difficult to reconcile, though the general trends are useful and informative.
Many of the topics are useful for practitioners to gauge their program design, such as the type of instrument, number of items, rating scales, rater selection, and so on. For me, the most interesting data relates to the various uses of 360 results.
Respondents in the 2004 and 2009 studies report many uses. In both studies, “development” is the most frequent response, and that’s how it should be. In fact, I’m amazed that the responses weren’t 100% since a 360 process should be about development. The fact that in 2004 only 72% of answers included development as a purpose is troubling whether we take the answers as factual or if they didn’t understand the question. The issue at hand here is not whether 360’s should be used for development; it is what else they should, can, and are used for in addition to “development.”
In 2004, the next most frequent use was “career development;” that makes sense. In 2009, the next most frequent was “performance management,” and career development dropped way down. Other substantial uses include high potential identification, direct link to performance measurement, succession planning, and direct link to pay.
But when asked whether the feedback is used “for decision making or just for development”, about 2/3 of the respondents indicated “development only” and only 1/3 for “decision making.” I believe these numbers understate the actual use of 360 for “decision making” (perhaps by a wide margin), though (as I will propose), it can depend on how we define what a “decision” is.
To “decide” is “to select as a course of action,” according to Merriam-Webster (in this context). I would build on that definition that one course of action is to do nothing, i.e., don’t change the status quo or don’t let someone do something. It is impossible to know what goes on in a person’s mind when he/she speaks of development, but it seems reasonable to suppose that it involves doing something beyond just leaving the person alone, i.e., maintaining the status quo. But doing nothing is a decision. So almost any developmental use is making a decision as to what needs to be done, what personal (time) and organizational (money) resources are to be devoted to that person. Conversely, denying an employee access to developmental resources that another employee does get access to is a decision, with results that are clearly impactful but difficult to measure.
To further complicate the issues, it is one thing to say your process is for “development only,” and another to know how it is actually used. Every time my clients have looked behind the curtain of actual use of 360 data, they unfailingly find that managers are using it for purposes that are not supported. For example, in one client of mine, anecdotal evidence repeatedly surfaced that the “development only” participants were often asked to bring their reports with them to internal interviews for new jobs within the organization. The bad news was that this was outside of policy; the good news was that leaders saw the data as useful in making decisions, though (back to bad news) they may have been untrained to correctly interpret the reports.
Which brings us to why this is an important issue. There are legitimate “development only” 360 processes where the participant has no accountability for using the results and, in fact, is often actively discouraged from sharing the results with anyone else. Since there are no consequences, there are few, if any, consequential actions or decisions required. But most 360 processes (despite the benchmark results suggesting otherwise) do result in some decisions being made, which might include doing nothing by denying an employee access to certain types of development.
The Appendix of The Handbook of Multisource Feedback is titled, “Guidelines for Multisource Feedback When Used for Decision Making.” My sense is many designers and implementers of 360 (multisource) processes feel that these Guidelines don’t apply because their system isn’t used for decision making. Most of them are wrong about that. Their systems are being used for decision making, and, even if not, why would we design an invalid process? And any system that involves the manager of the participant (which it should) creates the expectation that direct or indirect decision making will result.
So Dale’s question to me (remember Dale?) is how would I suggest wording a question in his new benchmarking study that would satisfy my curiosity regarding the use of 360 results. I proposed this wording:
“If we define a personnel decision as something that affects an employee’s access to development, training, jobs, promotions or rewards, is your 360 process used for personnel decisions?”
Dale hasn’t committed to using this question in his study. What do you think?
©2012 David W. Bracken
Being the 360 Feedback nerd I am, I love it when some new folks get active on the LinkedIn 360 discussion group. One discussion emerged recently that caught my eye, and I have been watching it with interest, mulling over the perspectives and knowing I had to get my two cents in at some point.
Here is the question:
How many raters are too many raters?
We normally recommend 20 as a soft limit. With too many, we find the feedback gets diluted and you have too many people that don’t work closely enough with you to provide good feedback. I’d be curious if there are any suggestions for exceptions.
This is an important decision amongst the dozens that need to be made in the course of designing and implementing 360 processes. The question motivated me to pull out The Handbook of Multisource Feedback and find the excellent chapter on this topic by James Farr and Daniel Newman (2001), which reminded me of the complexity of this decision. Let me also reiterate that this is another decision that has different implications for “N=1” 360 processes (i.e., feedback for a single leader on an ad hoc basis) versus “N>1” systems (i.e., feedback for a group of participants); this blog and discussion is focused on the latter.
Usually people argue that too many surveys will cause disruption in the organization and unnecessary “soft costs” (i.e., time). The author of this question poses a different argument for limiting the rater population, which he calls “dilution” due to inviting unknowledgeable raters. For me, one of the givens of any 360 system is that the raters must have sufficient experience with the ratee to give reliable feedback. One operationalization of that concept is to require that an employee must have worked with/for the ratee for some minimum amount of time (e.g., 6 months or even 1 year), even if he/she is a direct report. Having the ratee select the raters (with manager approval) is another practice that is designed to help get quality raters that then also facilitate the acceptance of the feedback by the ratee. So “dilution” due to unfamiliarity can be combated with that requirement, at least to some extent.
One respondent to this question offers this perspective:
The number of raters depends on the number of people that deal with this individual through important business interactions and can pass valuable feedback based on real experience. There is no one set answer.
I agree with that statement. Though, while there is no one set answer, some answers are better than others (see below).
In contrast, someone else states:
We have found effective to use minimum 3 and maximum 5 for any one rater category.
The minimum of 3 is standard practice these days as a “necessary but not sufficient” answer to the number of raters. As for the maximum of 5, this is also not uncommon but seems to ignore the science that supports larger numbers. When clients seek my advice on this question of number of raters, I am swayed by the research published by Greguras and Robie (1998), who examined the reliability of various rater sources (i.e., subordinates, peers and managers). They came to the conclusion that different rater groups provide differing levels of reliable feedback, probably because of the number of “agendas” lurking within the various types of raters. The least reliable are the subordinates, followed by the peers, and then the managers, the most reliable rater group.
One way to address rater unreliability is to increase the size of the group (another might be rater training, for example). Usually there is only one manager and best practice is to invite all direct reports (who meet the tenure guidelines), so the main question is the number of peers. This research suggests that 7-9 is where we need to aim, noting also that that is the number of returns needed, so inviting more is probably a good idea if you expect less than a 100% response rate.
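To make the arithmetic behind "more raters" and "invite more than you need" concrete, here is a rough sketch. It uses the Spearman-Brown formula, a standard psychometric shortcut that I am swapping in for illustration; it is not necessarily the analysis Greguras and Robie ran. The single-rater reliability of 0.2, the 0.7 target, and all function names are assumptions chosen for the example, not figures from their study.

```python
import math


def aggregate_reliability(single_rater_reliability, n_raters):
    """Spearman-Brown: reliability of the average of n_raters comparable ratings."""
    r = single_rater_reliability
    return n_raters * r / (1 + (n_raters - 1) * r)


def raters_needed(single_rater_reliability, target_reliability):
    """Smallest number of returned surveys whose average reaches the target reliability."""
    r, t = single_rater_reliability, target_reliability
    return math.ceil(t * (1 - r) / (r * (1 - t)))


def invitations_needed(returns_needed, expected_response_rate):
    """Number of raters to invite so the expected number of returns meets the goal."""
    return math.ceil(returns_needed / expected_response_rate)


# Hypothetical inputs: single-rater reliability of 0.2 and a target of 0.7.
needed = raters_needed(0.2, 0.7)
print(needed, round(aggregate_reliability(0.2, needed), 3))

# If 9 returned peer surveys are wanted and a 75% response rate is expected:
print(invitations_needed(9, 0.75))
```

The point of the sketch is only the shape of the relationship: the less reliable a single rater is, the more returns are needed, and the lower the expected response rate, the more invitations must go out to get them.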
Another potential rater group is external customers. Recently I was invited to participate in a forum convened by the American Board of Internal Medicine (ABIM) to discuss the use of multisource feedback in physician recertification processes. ABIM is one of 24 member Boards of the American Board of Medical Specialties (ABMS), which has directed that some sort of multisource (or 360) feedback be integrated into recertification.
The participants in this forum included many knowledgeable, interesting researchers on the use of 360 in the context of medicine (a whole new world for me, which was very energizing). I was invited to represent the industry (“outside”) perspective. One of the presenters spoke to the challenge of collecting input from their customers (i.e., patients), a requirement for them. She offered up the number of 25 as the number of patients needed to create a reliable result, using very similar rationale as Greguras and Robie regarding the many individual agendas of raters.
Back to LinkedIn, there was then this opinion:
I agree that having too many raters in any one rater group does dilute the feedback and make it much harder to see subtleties. There is also a risk that too many raters may ‘drown out’ key feedback.
This is when my head started spinning like Linda Blair in The Exorcist. This perspective is SO contrary to my 25 years of experience in this field that I had to prevent myself from discounting it as my head continued to rotate. I have often said that a good day for me includes times when I have said, “Gee, I have never thought of (insert topic) in that way.” I really do like hearing new and different views, but it’s difficult when they challenge some foundational belief.
For me, maybe THE most central tenet of 360 Feedback is the reliance on rater anonymity in the expectation (or hope) that it will promote honesty. This goes back to the first book on 360 Feedback by Edwards and Ewen (1996) where 360’s were designed with this need for anonymity being in the forefront. That is why we use the artificial form of communication of using anonymous questionnaires and usually don’t report in groups of less than 3. We know that violations of the anonymity promise result in less honesty and reduced response rates, with the grapevine (and/or social media) spreading violated trust throughout the organization.
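The reporting rule mentioned above (no breakout for a rater group with fewer than 3 responses) is simple enough to sketch. The function name, the group labels, and the choice to return `None` for a suppressed group are all hypothetical; a real 360 system might instead roll small groups into another category or handle the manager separately.

```python
MIN_GROUP_SIZE = 3  # reporting threshold described in the text


def reportable_groups(responses_by_group, min_size=MIN_GROUP_SIZE):
    """Return the mean rating per rater group, suppressing groups below min_size."""
    report = {}
    for group, ratings in responses_by_group.items():
        if len(ratings) >= min_size:
            report[group] = sum(ratings) / len(ratings)
        else:
            report[group] = None  # suppressed to protect rater anonymity
    return report


# Hypothetical responses on one item: four peers replied, only two direct reports did.
responses = {"peers": [4, 5, 3, 4], "direct_reports": [2, 5]}
print(reportable_groups(responses))
```

In this made-up case the peer average is reported, while the two direct reports' ratings are withheld because either rater could too easily guess the other's answer.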
The notion that too many raters will “drown out key feedback” seems to me to be a total reversal of this philosophy of protecting anonymity. It also seems to place an incredible amount of emphasis on the report itself where the numbers become the sole source of insight. Other blog entries of mine have proposed that the report is just the conversation starter, and that true insight is achieved in the post-survey discussions with raters and manager.
I recall that in past articles (see Bracken, Timmreck, Fleenor and Summers, 2001) we made the point that every decision requires what should be a conscious value judgment as to who the most important “customer” is for that decision, whether it be the rater, ratee, or the organization. For example, limiting the number of raters to a small number (e.g., 5 per group or not all Direct Reports) indicates that the raters and organization are more important than the ratee, that is, that we believe it is more important to minimize the time required of raters than it is to provide reliable feedback for the ratee. In most cases, my values cause me to lobby on behalf of the ratee as the most important customer in design decisions. The time that I will rally to the defense of the rater as the most important customer in a decision is when anonymity (again, real or perceived) is threatened. And I see these arguments for creating more “insight” by keeping rater groups small or subdivided as misguided IF these practitioners share the common belief that anonymity is critical.
Finally (yes, it’s time to wrap this up), Larry Cipolla, an extremely experienced and respected practitioner in this field, offers some sage advice with some comments, including the folly of increasing rater group size by combining rater groups. As he says, that is pure folly. But I do take issue with one of his practices:
We recommend including all 10 raters (or whatever the n-count is) and have the participant create two groups–Direct Reports A and Direct Reports B.
This seems to me to be a variation on the theme of breaking out groups and reducing group size with the risk of creating suspicions and problems with perceived (or real) anonymity. Larry, you need to show that doing this kind of subdividing creates higher reliability in a statistical sense that can overcome the threats to reliability created by using smaller N’s.
Someone please stop my head from spinning. Do I just need to get over this fixation with anonymity in 360 processes?
Bracken, D.W., Timmreck, C.W., and Church, A.H. (2001). The Handbook of Multisource Feedback. San Francisco: Jossey-Bass.
Bracken, D.W., Timmreck, C.W., Fleenor, J.W., and Summers, L. (2001). 360 feedback from another angle. Human Resource Management, 1, 3-20.
Edwards, M. R., and Ewen, A.J. (1996). 360° Feedback: The powerful new model for employee assessment and performance improvement. New York: AMACOM.
Farr, J.L., and Newman, D.A. (2001). Rater selection: Sources of feedback. In Bracken, D.W., Timmreck, C.W., and Church, A.H. (eds.), The Handbook of Multisource Feedback. San Francisco: Jossey-Bass.
Greguras, G.J., and Robie, C. (1998). A new look at within-source interrater reliability of 360-degree feedback ratings. Journal of Applied Psychology, 83, 960-968.
©2012 David W. Bracken
I used my last blog (http://dwbracken.wordpress.com/2011/08/09/so-now-what/) to start LinkedIn discussions in the 360 Feedback and I/O Practitioners group, asking the question: What is a “valid” 360 process? The response from the 360 group was tepid, maybe because the group has a more general population that might not be that concerned with “classic” validity issues (which is basically why I wrote the blog in the first place). But the I/O community went nuts (45 entries so far) with comments running the gamut from constructive to dismissive to deconstructive.
Here is a sample of some of the “deconstructive” comments:
…I quickly came to conclusion it was a waste of good money…and only useful for people who could (or wanted to) get a little better.
It is all probably a waste of time and money. Good luck!!
There is nothing “valid” about so-called 360 degree feedback. Technically speaking, it isn’t even feedback. It is a thinly veiled means of exerting pressure on the individual who is the focal point.
My position regarding performance appraisal is the same as it has been for many years: Scrap It. Ditto for 360.
Actually, I generally agree with these statements in that many 360 processes are a waste of time and money. It’s not surprising that these sentiments are out there and probably quite prevalent. I wonder, though, if we are all on the same page. In another earlier blog, I suggested that discussions about the use and effectiveness of 360’s should be separated by those that are designed for feedback to a single individual (N=1) and those that are designed to be applied to groups (N>1).
But the fact is that HR professionals have to help their management make decisions about people, starting with hiring and then progressing through placement, staffing, promotions, compensation, rewards/recognition, succession planning, potential designation, development opportunities, and maybe even termination.
Nothing is perfect, especially so when it comes to matters that involve people. As an example, look to the U.S. Constitution, an enduring document that has withstood the test of time. Yet the Founding Fathers were the first to realize that they needed to make provisions for the addition of amendments to make further refinements. Of course, some of those amendments were imperfect themselves and were later rescinded.
But we haven’t thrown out the Constitution because it is imperfect. Nor do we find it easy to come to agreement on what the revisions should be. But one of the many good things about humans is a seemingly natural desire to make things better.
Ever since I read Mark Edwards and Ann Ewen’s seminal book, 360 Degree Feedback, I have believed that 360 Feedback has the potential to improve personnel decision making when done well. The Appendix of The Handbook of Multisource Feedback is titled, “Guidelines for multisource feedback when used for decision making,” coauthored with Carol Timmreck, where we made a stab at defining what “done well” can mean.
In our profession, we have an obligation to constantly seek ways of improving personnel decision making. There are two major needs we are trying to meet, which sometimes cause tensions. One is to provide the organization with more accurate information on which to base these decisions, which we define as increased reliability (accurate measurement) and validity (relevant to job performance). Accurate decision making is good for both the organization and the individual.
The second need is to simultaneously use methods that promote fairness. This notion of fairness is particularly salient in the U.S. where we have “protected classes” (i.e., women, minorities, older workers), but hopefully fairness is a universal concept that applies in many cultures.
Beginning with the Edwards & Ewen book and progressing from there, we can find more and more evidence that 360 done well can provide decision makers with better information (i.e., valid and fair) than traditional sources (e.g., supervisory evaluations). I actually heard a lawyer state that organizations could be legally exposed for not using 360 feedback because it is more valid and fair than methods currently in use.
I have quoted Smither, London and Reilly (2005) before, but here it is again:
We therefore think it is time for researchers and practitioners to ask “Under what conditions and for whom is multisource feedback likely to be beneficial?” (rather than asking “Does multisource feedback work?”).
©2011 David W. Bracken
This is the one-year anniversary of this blog. This is the 44th post. We have had 2,026 views, though the biggest day was the first with 38 views. I have had fewer comments than I had hoped (only 30), though some LinkedIn discussions have resulted. Here is my question: Where to go from here? Are there topics that are of interest to readers?
Meanwhile, here is my pet peeve of the week/month/year: I was recently having an exchange with colleagues regarding a 360 topic on my personal Gmail account and up pop ads in the margin for various 360 vendors (which is interesting in itself), the first of which is from Qualtrics (www.qualtrics.com) with the heading, “Create 360s in Minutes.”
The topic of technology run amok has been covered before here (When Computers Go Too Far, http://wp.me/p10Xjf-3G), but my peevery was piqued (piqued peevery?) when I explored their website and saw this claim: “USE VALIDATED QUESTIONS, FORMS and REPORTS.”
What the heck does that mean? What are “validated” forms and reports, for starters?
The bigger question is, what is “validity” in a 360 process? Colleagues and I (Bracken, Timmreck, Fleenor and Summers, 2001; contact me if you want a copy) have offered up a definition of validity for 360’s that holds that it consists of creating sustainable change in behaviors valued by the organization. Reliable items, user friendly forms and sensible reports certainly help to achieve that goal, but certainly cannot be said to be “valid” as standalone steps in the process.
The Qualtrics people don’t share much about who they are. Evidently their founder is named Scott and teaches MBA’s. They appear to have a successful enterprise, so kudos! I would like to know how technology vendors claim to have “valid” tools and what definition of validity they are using.
Hey, maybe I will get my 31st comment?
©2011 David W. Bracken
My colleague, Jeff Saltzman, has a great blog that is much more diverse than mine (http://jeffreysaltzman.wordpress.com ). His most recent entry begins with this little gem of a story that I want to plagiarize and take in my own direction:
“Which is more important, the sun or the moon?” a citizen of a small town not noted for its intellectual prowess asked. “Why the moon of course,” was the reply. “It shines at night when it is needed. The sun shines only during the day, when there is no need of it at all!” (Ausubel, N., A Treasury of Jewish Folklore, 1948)
I have touched on the topic of importance in some past blogs (http://dwbracken.wordpress.com/2011/04/07/worst-is-not-first/), and the folly of asking raters what is “important”. This little story made me think of that issue once again from a slightly different angle. My stance has been, and still is, that raters are in a very poor position to judge the importance of a competency/behavior in the context of the need of the ratee and the organization.
There is really no way to know what is going through a rater’s mind if/when we ask him/her to give us importance ratings. There may be some research on this question (e.g., correlation between importance and effectiveness ratings), but I will hazard a guess that importance ratings are more a function of rater needs than the needs of the ratee or organization/team.
Jeff’s story also makes me wonder how qualified raters are to provide importance ratings when they are most likely not given any instruction as to what “importance” means (as rater training might attempt to do). And their rationale for importance ratings may well be as convoluted as the small town citizen’s is.
The question of importance is useful in helping prioritize actions. So, if it is not the raters who should indicate importance, who is it? The manager (“boss”), of course, partnering with the ratee. Hopefully the boss and ratee have a history of development discussions on a personal level, and about organization/team priorities to create alignment. If they have not been having those discussions, maybe a 360 process tied to performance management and development might create some mutual accountability for doing so.
The importance of the “boss” in the 360 process and employee development in general is so critical that it boggles the mind to think of 360’s that totally bypass (exclude) the manager. I was equally dismayed to read of a major 360 process described on LinkedIn that makes boss input optional. Really? I have always thought that manager input is the most useful feedback many ratees get out of 360’s, to the extent that a best practice is to require that the boss complete their input in order for a report to be generated.
I will go as far as to say the manager ratings are more important than participant self-ratings. Ideally both will happen but, as I mentioned in a recent blog, self-ratings are more an indication of commitment to the process than a true evaluation of self-competence in many, many cases. I will acknowledge that sometimes bosses use their ratings to send a message to the ratee, but even then the resulting discussion is often very enlightening for the ratee.
©2011 David W. Bracken