What Homer Simpson and Frank Underwood can teach you about loyalty data analytics.

Updated: Feb 8

Data without analytics is like a Bentley without an engine: powerless numbers rather than powerful knowledge. Those numbers can also mislead you, costing you time and money and, in the worst case, ruining your company. So what should you consider when building a strong marketing strategy, especially in this unpredictable VUCA world?

I want to take you on a short journey with three heroes who will help you spot common mistakes when analyzing loyalty data, or basically any data on the market. Starting today, whenever you see Homer Simpson, a hipster girl, or Frank Underwood, they should remind you of a fallacy or paradox linked to data analysis.

So let's start with Homer's story or maybe not really his story but ours…

What Homer Simpson can teach you about data trends interpretation.

It was a sunny morning last spring, sometime in mid-April. It was Monday, the “favourite” day when everyone is extremely busy. It was time to summarize the week that had just passed, and the executive meeting was under way. We delivered the weekly report that covers results against pre-set KPIs. Every department's team brought its slides and discussed the trends on the market. Among the many KPIs we deliver, two attract a lot of attention: the average basket of loyalty cardholders and the average spend in all the remaining anonymous transactions, where no card is presented. Luckily, during week 13, both the anonymous and the loyalty basket increased significantly!

But then the CEO of our customer presented his slide. He was seriously disappointed with week 13. Taking the helicopter view, he focused on one clearly presented indicator that is always essential for him, the average receipt value, and it had gone down.

When they combined their charts, it seemed obvious that something was terribly wrong with the data. Someone had surely messed up the report. The meeting was stopped, to be resumed once the agency delivered a report that no longer contradicted the company's internal data.

I was the one who received the call that put our cooperation at serious risk. I listened carefully and immediately apologized, as it must have been some silly mistake in the report. Then we started to verify it. We opened the file. We pulled the raw data. We recalculated everything, but the more we looked, the clearer it became that there was no error. As it turned out, we were not the only ones to have had this "wtf" moment...

Comparison of treatment of renal calculi by open surgery, percutaneous nephrolithotomy, and extracorporeal shockwave lithotripsy.

A similar situation happened to British researchers who, in 1986, published a study comparing treatments for kidney stones. They were particularly interested in whether one of the treatments is better for stones of a particular size. What they found, as the table shows, was that Treatment A was superior in both groups, for small and for large kidney stones alike. But when both groups were combined, Treatment B performed better.
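
To see the reversal in numbers, here is a minimal Python sketch. The counts are the ones widely quoted for this study (for example on the Wikipedia page listed at the end of this post); they are typed in here for illustration rather than read from the table, and the names results, success_rate and overall_rate are made up for this example.

```python
# Successful treatments and total patients per stone size.
# Figures as widely quoted for the kidney stone study; treat them as illustrative.
results = {
    "Treatment A": {"small stones": (81, 87), "large stones": (192, 263)},
    "Treatment B": {"small stones": (234, 270), "large stones": (55, 80)},
}


def success_rate(successes, patients):
    """Share of patients treated successfully."""
    return successes / patients


def overall_rate(groups):
    """Success rate after pooling all stone sizes of one treatment."""
    total_successes = sum(s for s, _ in groups.values())
    total_patients = sum(n for _, n in groups.values())
    return total_successes / total_patients


for treatment, groups in results.items():
    for size, (successes, patients) in groups.items():
        print(f"{treatment}, {size}: {success_rate(successes, patients):.0%}")
    print(f"{treatment}, all stones: {overall_rate(groups):.0%}")
```

With these counts, Treatment A wins within each stone size, yet Treatment B comes out ahead once the groups are pooled, because A was used far more often on the harder, large-stone cases.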

Simpson's paradox

This phenomenon, in which a trend present in every subgroup reverses or disappears when the groups are combined, is called Simpson's paradox. It was described in the early 1950s by the British statistician Edward Hugh Simpson. He started his career as a code breaker at Bletchley Park, the wartime centre where Alan Turing worked on breaking the Enigma code. So whenever you see Homer Simpson again, think about Edward Simpson, or at least about Simpson's paradox. That's why, in our case, both basket values (with and without a card) went up, yet the average basket across all transactions went down: the share of transactions without a card, which have a lower average value, increased significantly.
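
The arithmetic behind our week 13 can be sketched in a few lines of Python. All figures below are hypothetical, chosen only to reproduce the pattern (both baskets up, the card-free share up, the overall average down); they are not the client's data, and the function name average_receipt is an assumption made for this example.

```python
# Hypothetical weekly figures: (average basket value, number of transactions).
week_12 = {"loyalty": (50.0, 600), "anonymous": (20.0, 400)}
week_13 = {"loyalty": (52.0, 400), "anonymous": (22.0, 600)}


def average_receipt(week):
    """Average value over all transactions, weighted by transaction count."""
    total_value = sum(avg * count for avg, count in week.values())
    total_count = sum(count for _, count in week.values())
    return total_value / total_count


for name, week in (("week 12", week_12), ("week 13", week_13)):
    baskets = {segment: avg for segment, (avg, _) in week.items()}
    print(name, baskets, "overall:", round(average_receipt(week), 2))
```

Each segment's basket grows by 2, but the overall average receipt falls from 38 to 34, because the mix shifts towards the smaller anonymous baskets.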

And now, let me take you on the next journey with another hero – Linda.

What a hipster girl can teach you about data samples testing.

Linda is 31. She's single, outspoken, and very bright. She majored in philosophy. As a student, she was deeply concerned with discrimination and social justice issues, and she also actively participated in anti-nuclear demonstrations.

Looking at that description, which statement is more probable:


A. Linda is a bank teller or

B. Linda is a bank teller, and she's also active in the feminist movement

85 per cent of people who were presented with Linda's story said it is more probable that she's both a bank teller and an active feminist, which is a little surprising, right? When you look at those two groups, the group of bank tellers must be greater than or, in a strictly mathematical sense, not smaller than the group of bank tellers who also happen to be active in the feminist movement, as the diagram shows.

This experiment was created by Daniel Kahneman, who, together with his friend Amos Tversky, spent much of his career investigating the biases and fallacies of the human mind. So if you haven't read "Thinking, Fast and Slow," which summarizes the scientific path that led to the Nobel Prize in economics, I strongly recommend it.
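
The rule that statement B can never be more probable than statement A can be checked with a short Python sketch. The probabilities are hypothetical and deliberately generous to option B; the function name joint_probability is an assumption made for this example.

```python
def joint_probability(p_a, p_b_given_a):
    """P(A and B) = P(A) * P(B given A)."""
    return p_a * p_b_given_a


# Hypothetical numbers, chosen only for illustration.
p_teller = 0.05                 # P(Linda is a bank teller)
p_feminist_given_teller = 0.95  # P(active feminist, given she is a bank teller)

p_both = joint_probability(p_teller, p_feminist_given_teller)
print(f"P(bank teller) = {p_teller:.4f}")
print(f"P(bank teller and feminist) = {p_both:.4f}")

# The conditional probability is at most 1, so the joint probability
# can never exceed the probability of being a bank teller alone.
for p_cond in (0.0, 0.25, 0.5, 0.75, 1.0):
    assert joint_probability(p_teller, p_cond) <= p_teller
```

However well the description fits a feminist, adding a second condition can only keep the probability the same or shrink it.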

What does this have to do with loyalty data analytics? Two things for sure. First, as the number of dimensions grows, people tend to include every possible variable in their models and become suspicious when we try to simplify them. Many of our customers feel the need for over-complicated data dashboards and explanations. Second, many loyalty managers take an unhealthy pleasure in slicing data sets into segments of just a few customers and drawing far-reaching conclusions from them. Some even end up in rabbit holes: tiny micro-segments that are carefully crafted but not actionable in any way. On the other hand, they ask us why we need all those numbers if, in the end, we use just a regression with a few coefficients and variables.

And that's the difficult art of separating signal from noise. There is a famous saying that we often use: if you torture the data long enough, it will confess to basically anything. So whenever you meet a hipster girl again, remind yourself that the sample behind any hypothesis test needs to be checked carefully for its size, its definition, and what the data-driven results actually mean.

Now let's set off on the last journey with our final hero, Frank.
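
Why micro-segments are so noisy can be illustrated with a rough Python sketch of the margin of error of an average. The standard deviation and segment sizes are hypothetical, the function name margin_of_error is an assumption, and the normal approximation used here understates the uncertainty for very small segments.

```python
import math


def margin_of_error(std_dev, sample_size, z=1.96):
    """Approximate 95% margin of error for a sample mean (normal approximation)."""
    return z * std_dev / math.sqrt(sample_size)


# Hypothetical spread of basket values within a segment, in currency units.
basket_std_dev = 20.0

for customers in (5, 50, 500, 5000):
    margin = margin_of_error(basket_std_dev, customers)
    print(f"{customers:>5} customers: average basket known to within +/- {margin:.2f}")
```

The margin shrinks only with the square root of the segment size, so a segment of a handful of customers can easily show a "difference" that is nothing but noise.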

What Frank Underwood can teach you about data plotting.

Frank is pragmatic. Everyone who's ever watched "House of Cards" knows that the only metrics he cares about are numbers. And actually, we often do the same whenever we analyze aggregates or statistics: we try to picture the world behind them. So imagine the following.

You have analyzed the receipt data of a British discount retailer. On average, people buy nine items per transaction and spend seven and a half pounds. We have the standard deviations of both variables. We know the formula of the linear regression. Everything needed to describe this phenomenon is available, right? So which is it? Which chart is described by those metrics?

Let's say your boss wants you to raise the average number of items bought by customers to 10. Looking at chart 1, it seems doable; looking at chart 4, rather impossible.

The problem is that all four data sets can be described by the very same statistics. But as you can see from how the data is distributed visually, the configurations of the plotted dots are completely different. Again, as with Simpson, it's not really about our guide Frank Underwood but about a different Frank. His name was Frank Anscombe, and in 1973 he drew those four charts to remind us of the importance of plotting data. So if there is one thing that Frank can teach you, it is this: always plot the data. Do not rely on statistics and aggregates alone.
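
You can verify this yourself with a small Python sketch. The values below are the published Anscombe's quartet; reading x as items and y as pounds, and labelling the sets "chart 1" to "chart 4" in Anscombe's original order, are assumptions made to match this post. The helper names mean and summarize are made up for this example.

```python
import math

# Anscombe's quartet (1973): the first three data sets share the same x values.
x_123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "chart 1": (x_123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "chart 2": (x_123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "chart 3": (x_123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "chart 4": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
                [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def mean(values):
    return sum(values) / len(values)


def summarize(x, y):
    """Return mean of x, mean of y, correlation, and least-squares slope and intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    correlation = sxy / math.sqrt(sxx * syy)
    return mx, my, correlation, slope, intercept


for name, (x, y) in quartet.items():
    mx, my, r, slope, intercept = summarize(x, y)
    print(f"{name}: mean x={mx:.2f}, mean y={my:.2f}, r={r:.3f}, "
          f"y = {intercept:.2f} + {slope:.3f}x")
```

All four lines of output are practically identical, which is exactly why the summary alone cannot tell you which chart you are looking at. A scatter plot of each pair of lists is the natural next step.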


That's the end of our data science journey guided by our three heroes. I hope that the next time you see Homer Simpson, a hipster girl, or Frank Underwood, they will remind you of the following:

1. The trend may reverse. When you combine data from several groups, the overall result can contradict what you see in each group.

2. Watch out for conclusions drawn from small samples that you try to extrapolate to big groups.

3. No matter what, try to plot the data!

Big data exploration through loyalty data analytics can fuel your brand with meaningful knowledge, so that you'll be able to adapt your strategy to any circumstances and transform #NewNormal challenges into #NewPossible solutions.


  • Thinking, Fast and Slow by Daniel Kahneman

  • Comparison Of Treatment Of Renal Calculi By Open Surgery, Percutaneous Nephrolithotomy, And Extracorporeal Shockwave Lithotripsy by C. R. Charig, D. R. Webb, S. R. Payne, and J. E. A. Wickham

  • https://en.wikipedia.org/wiki/Simpson%27s_paradox

  • https://en.wikipedia.org/wiki/Frank_Anscombe
