
Lies, Damn Lies and… Where are the Statistics?

A really interesting article caught my attention in the Globe & Mail on Monday, suggesting that your most engaged employees may also be your lowest performing employees. And your high performers may be feeling completely powerless. I can only imagine the legion of managers and supervisors who snorted their morning coffee or choked on their bagel as that suggestion permeated their start-of-week consciousness.

The article is based upon a study published recently by a US-based consulting organization called Leadership IQ. Wanting to know more about the results, how the study was done and the basis of how the results were interpreted, I went to the source for the original report. The results are published in an 11-page white paper that, in part, is designed to promote Leadership IQ’s assessment products.

Consultants publishing research is not new, of course. In fact, conducting research and promoting the results is a time-honoured approach to marketing that, done well, demonstrates the competency and credibility of the consultant, showcases their work, and not-too-subtly suggests that “we’re smart” and “we could do this kind of awesome analysis for you, too!” I should know. I’ve been doing it for fifteen years or so now.

What has to be asked about any research in order to evaluate its relevance and value is, “where is the bias?”. In much the same manner that Deep Throat exhorted Woodward and Bernstein to “follow the money” to get to the bottom of Watergate, if you want to get to the bottom of research the best advice is to “follow the bias.” In this instance, the bias is hiding in plain sight: before determining whether the findings of their study are relevant for your organization, they encourage you to “…conduct the same type of analysis, preferably using Leadership IQ’s survey and analytics.” (p.10)

Before we figure out whether the findings are relevant for your organization, however, we have to figure out whether they are relevant at all. And that is, to be frank, where I start to have some problems. The language used in the white paper implies that the study employed quantitative analysis (a fancy term for ‘statistics’). Liberally peppered through the document are references to “industry-leading statistical techniques”, “statistical normalcy” (which isn’t a statistical term at all), “multivariate statistics” and “analytics”. Big, juicy words, all of them.

Looking at what is reported, however, doesn’t suggest any use of statistics whatsoever. If you want to understand the value of research, and particularly quantitative research, there are a few fundamental things you need to know. Sample size (how many people were involved) is particularly important, and for statistical analysis you need a relatively large number to draw reliable conclusions. It is helpful to understand the specific techniques that were used. And most importantly, you need some measure of the uncertainty in the results: whether the differences found are statistically significant, what the margin of error is around the reported numbers, and the ‘power’ of the study (the likelihood it would detect a real effect if one actually exists). This is why, when you read opinion polls in the newspaper, you will see things like “based upon a sample size of 1000; accurate to within three percentage points, 19 times out of 20”. Those qualifiers are a measure of the confidence you can place in the results, suggesting how representative they actually are.
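To make that poll disclaimer concrete, here is a rough sketch in Python of the standard margin-of-error calculation for a proportion from a simple random sample. The sample sizes are the newspaper example above, not anything drawn from the Leadership IQ paper.

```python
import math

# Margin of error for a proportion from a simple random sample,
# at 95% confidence (the familiar "19 times out of 20").
def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of the 95% confidence interval for a proportion.

    n: sample size
    p: observed proportion (0.5 gives the worst case, i.e. the widest margin)
    z: critical value for the confidence level (1.96 for 95%)
    """
    return z * math.sqrt(p * (1 - p) / n)

print(f"n = 1000: +/- {margin_of_error(1000):.1%}")  # roughly +/- 3.1%
print(f"n = 100:  +/- {margin_of_error(100):.1%}")   # roughly +/- 9.8%
```

The same survey question asked of 100 people instead of 1,000 carries roughly three times the uncertainty, which is exactly why the sample size matters and why its absence from the white paper is a problem.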

In the Leadership IQ paper, we don’t know the sample size. We know there are 1000 employees in the case study organization, but there is no indication of how many of them participated. At the same time, some of the wording of the paper suggests the results are broader than the organization, but we don’t know that for sure, and there is no indication of the number of organizations or participating employees in the larger sample.

More problematic is that there is absolutely no discussion of statistical techniques. What has been employed is what is called “descriptive statistics”: for the categories of high, medium and low performing employees, they indicate the average response of that group to the question being asked. In the first question, for example, they indicate that the low performing group scored 5.99 on a scale of 1 to 7 on whether they are “motivated to give 100% effort when at work”, while the high performers reported 5.36. In the analysis, they make a very big deal about the fact that there is a 0.63 difference in these numbers. Is it significant? It might be, or it might not; that’s actually what statistical analysis would tell us, but it hasn’t been used.
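To illustrate the point, here is a sketch of the kind of test that would answer the question: a simple two-sample t-test on the reported means. The group sizes and standard deviations below are invented purely for illustration; none of them appear in the white paper, which is precisely the problem.

```python
from scipy.stats import ttest_ind_from_stats

# The paper reports only two averages: 5.99 (low performers) and 5.36
# (high performers). Whether that 0.63 gap is statistically significant
# depends on group sizes and the spread of responses, neither of which
# is reported. The values below are made up for illustration only.
scenarios = [
    ("small groups, wide spread of answers", 30, 30, 2.0),
    ("large groups, tight spread of answers", 300, 300, 1.0),
]

for label, n_low, n_high, sd in scenarios:
    stat, p_value = ttest_ind_from_stats(
        mean1=5.99, std1=sd, nobs1=n_low,    # low performers
        mean2=5.36, std2=sd, nobs2=n_high,   # high performers
    )
    print(f"{label}: p = {p_value:.3f}")
```

Run with the first set of assumptions, the 0.63 difference is indistinguishable from noise; with the second, it is unmistakable. Both scenarios are entirely consistent with the two averages the paper actually reports.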

The largest problem that I have with the study, however, is that the report makes some pretty concrete and judgemental statements in interpreting the results. They interpret the results reported above, for example, to assert “…high and middle performers aren’t reaching their full potential.” This might be true, and it might not. Without first understanding whether the correlation being identified is meaningful, we don’t know whether there is any material difference to speak of. And secondly, even if we were to accept at face value that the differences are important (and I don’t recommend it), we don’t know why that difference is there.

What this particular report shows is correlation. It does not show causation. A (mildly ridiculous) example might help. Let’s say someone did research, and discovered that the higher someone’s income, the more they spend on cars. Not particularly surprising, and not an insight that is going to shock the world at large. You’ve shown a relationship that most people expect. How believable would it be, however, if Mercedes were to publish a study that said, “If you drive a Mercedes S-class, you’ll make more money! Research proves it”?
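The asymmetry is easy to demonstrate. Here is a toy simulation, with entirely made-up numbers, in which income drives car spending; the correlation comes out just as strong whichever way you read it, and nothing in the data tells you which way the causal arrow points.

```python
import random
import statistics  # statistics.correlation requires Python 3.10 or later

random.seed(1)

# A toy world in which income causes car spending (plus some noise).
incomes = [random.uniform(40_000, 200_000) for _ in range(1_000)]
car_spending = [0.15 * income + random.gauss(0, 5_000) for income in incomes]

# The correlation is strong, as expected: we built the relationship in.
r = statistics.correlation(incomes, car_spending)
print(f"correlation of income and car spending: {r:.2f}")

# Correlation is symmetric: the number is identical computed the other
# way around. Nothing in it says that buying a more expensive car will
# raise your income; the causal direction came from how the data was
# generated, not from the correlation itself.
print(f"computed the other way around:          "
      f"{statistics.correlation(car_spending, incomes):.2f}")
```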

In essence, that’s what is happening here. The report is saying there is a difference between results for different groups, and therefore claiming a relationship (correlation) exists. They are taking that presumed (but untested) correlation and stating their interpretation of why they think that correlation exists. The important words there are *their interpretation*. If we accept there is a relationship, what they suggest is one reason that relationship might exist. It is not the only one.

Let’s just take the question of whether employees are motivated to give 100% at work. The medium and high performance groups report a lower average score than the low performing group. The study claims these groups “…aren’t reaching their full potential.” (p. 4)  It could also be interpreted that the medium and high performance groups have a differing interpretation of what ‘giving 100%’ actually means. Or that they are able to do the same (or a better) job than the low performers with less effort. Or that the low performers, wanting to be more favourably viewed, are giving you the answer you want to hear. Or that they interpret 7-point answer scales differently than the other groups. All of these are plausible explanations. Any of them could be true. Without first demonstrating that the differences are meaningful, and *then* exploring why the differences exist, taking action based upon any interpretation would be questionable.

You may argue that it is unfair to pick on this one study (and there are any number of others that I could have used). At the same time, the organization is making a very big deal about it, claiming that “In the past week alone, nearly every major business media outlet has featured our research study…” They would very much like you to believe that the results of their study are true. They very strongly suggest that you should start managing differently. And they would really, really like you to hire them to do a study of your organization. In this example of research-as-marketing, the marketing is much stronger than the research.

My earlier caution still holds: If you want to be an informed consumer of research, follow the bias.
