Customer Lifetime Value Analytics: Case Study

Supply Chain Analytics
Customer Lifetime Value
Customer Acquisition
Customer lifetime value (CLTV) is the estimated total amount a customer will spend on a business throughout their relationship with that business. It takes into account the revenue generated by the customer as well as the costs associated with acquiring and serving that customer.
Author

Olumide Oyalola

Published

July 5, 2023

Introduction 😉

Customer lifetime value (CLTV) is the estimated total amount a customer will spend on a business throughout their relationship with that business. It takes into account the revenue generated by the customer as well as the costs associated with acquiring and serving that customer. By analyzing the relationship between customer acquisition costs and revenue generated, we can determine which channels are the most cost-effective for acquiring and retaining high-value customers.

The given data (available from the link below) includes information about the customer’s channel, cost of acquisition, conversion rate, and revenue generated.

Download Data

Goal 🥅

The goal is to analyze the CLTV of customers across different channels and identify the most profitable channels for the business.

Load Packages ♎

Load Dataset 📅

Code
customer_tbl <- read_csv("customer_acquisition_data.csv") %>% 
  mutate_if(is.character,as.factor)

glimpse(customer_tbl)
Rows: 800
Columns: 5
$ customer_id     <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,~
$ channel         <fct> referral, paid advertising, email marketing, social me~
$ cost            <dbl> 8.320327, 30.450327, 5.246263, 9.546326, 8.320327, 30.~
$ conversion_rate <dbl> 0.12314498, 0.01634149, 0.04382223, 0.16759225, 0.1231~
$ revenue         <dbl> 4199, 3410, 3164, 1520, 2419, 3856, 1172, 700, 2137, 9~

Exploratory Data Analysis 🏁

Code
summary(customer_tbl[,-1])
             channel         cost        conversion_rate      revenue    
 email marketing :214   Min.   : 5.246   Min.   :0.01634   Min.   : 500  
 paid advertising:194   1st Qu.: 5.246   1st Qu.:0.04382   1st Qu.:1694  
 referral        :207   Median : 8.320   Median :0.04382   Median :2764  
 social media    :185   Mean   :13.148   Mean   :0.08630   Mean   :2769  
                        3rd Qu.: 9.546   3rd Qu.:0.12314   3rd Qu.:3824  
                        Max.   :30.450   Max.   :0.16759   Max.   :4998  

Let’s start by visualizing the distribution of acquisition cost and revenue generated by the customer using histograms:

Code
cost <- customer_tbl %>% 
  ggplot(aes(cost)) +
  geom_histogram(fill = "steelblue", color = "white", bins = 30) +
  labs(title = "Distribution of acquisition cost",
       x = "Acquisition Cost",
       y = "Frequency")


rev <- customer_tbl %>% 
  ggplot(aes(revenue)) +
  geom_histogram(fill = "tomato", color = "white", bins = 30) +
  labs(title = "Distribution of revenue generated",
       x = "Revenue",
       y = "Frequency") +
  scale_x_continuous(labels = scales::comma)

cost/rev

Research Questions ❓

Q1

  • What’s the cost of acquisition across different channels?

Now, let’s examine the cost of acquisition across different channels and identify the most and least profitable channels:

Code
customer_tbl %>% 
  group_by(channel) %>% 
  summarise(Ave_cost = mean(cost, na.rm = TRUE)) %>% 
  ggplot(aes(x = reorder(channel, Ave_cost), y = Ave_cost)) +
  geom_bar(stat ="identity", width = 0.6) +
  labs(x = "Channel",
       y = "Average cost of acquisition",
       title = "Average cost of acquisition across channels") +
  geom_label_repel(aes(label = round(Ave_cost)))

Insight ℹ️

  • Paid advertisement is the most expensive channel whereas email marketing is the least expensive channel.

Q2

  • Which channels are most and least effective at converting customers?

Now, let’s see which channels are most and least effective at converting customers:

Code
customer_tbl %>% 
  group_by(channel) %>% 
  summarise(Ave_rate = mean(conversion_rate, na.rm = TRUE)) %>% 
  ggplot(aes(x = reorder(channel, Ave_rate), y = Ave_rate)) +
  geom_bar(stat ="identity", width = 0.6, fill = "steelblue", color = "white") +
  labs(x = "Channel",
       y = "Average conversion rate",
       title = "Average conversion rate across channels") +
  geom_label_repel(aes(label = round(Ave_rate,2)))

Insight ℹ️

  • Social media is the most effective channel for converting customers, whereas paid advertising is the least effective.

Q3

  • What are the most and least profitable channels in terms of generating revenue?

Now, let’s calculate the total revenue by channel and have a look at the most and least profitable channels in terms of generating revenue:

Code
customer_tbl %>% 
  group_by(channel) %>% 
  summarise(sum_rev = sum(revenue, na.rm = TRUE)) %>%
  mutate(Prop = sum_rev/sum(sum_rev),
         channel = factor(channel, c("email marketing", "referral",
                                     "paid advertising", "social media") )) %>%
  arrange(Prop, channel) %>% 
  ggplot(aes(x = 2, y = Prop, fill = channel)) +
  geom_bar(stat ="identity", width = 1, color = "white") +
  xlim(0.3, 2.5) +
  coord_polar(theta = "y",  start = 0) +
  theme_void() +
  scale_y_continuous(labels = scales::percent) +
  labs(x = "",
       y = "",
       title = "Revenue contribution by channels",
       fill = "") +
  geom_text_repel(aes(label = paste0(round(Prop*100,1),"%")), size = 3, position = position_stack(vjust = 0.5)) +
    scale_fill_brewer(palette = "Blues", direction = -1) +
    theme(legend.position = "right")

Insight ℹ️

So, email marketing is the most profitable channel in terms of generating revenue.

Q4

  • What’s the return on investment for each channel?

Let’s examine the return on investment (ROI) for each channel

Code
customer_tbl %>% 
  mutate(roi = revenue/cost) %>% 
  group_by(channel) %>% 
  summarise(Ave_ROI = mean(roi, na.rm = TRUE)) %>% 
  ggplot(aes(x= reorder(channel, Ave_ROI), y = Ave_ROI)) +
  geom_bar(stat = "identity", fill = "steelblue", color = "white",
           width = 0.6) +
  labs(x = "Channel",
       y = "Average Return on Investment",
       title =  "Average Return on Investment (ROI) by Channel")

Insight ℹ️

The ROI from email marketing is way higher than all other channels, whereas the ROI from paid advertising is the lowest.

Q5

  • What is the distribution of customer lifetime value from each channel?

Let’s calculate the customer lifetime value from each channel.

Based on the data we have, we can use the formula below to calculate CLTV

\[ CLTV=(revenue − Cost)∗ \frac{conversionRate}{cost} \]

Code
customer_tbl %>% 
  mutate(CLTV = (revenue - cost)* (conversion_rate/cost)) %>% 
  group_by(channel) %>% 
  summarise(cltv = mean(CLTV, na.rm = TRUE)) %>% 
  ggplot(aes(x = reorder(channel, cltv), y = cltv, fill = channel)) +
  geom_bar(stat = "identity", width = 0.6, show.legend = FALSE) +
  labs(x = "Channel",
       y = "CLTV",
       title = "Customer Lifetime Value by Channel")

Insight ℹ️

So, the customer lifetime value from social media and the referral channels are the most significant.

Let’s compare the CLTV distributions of the CLTV across channels

Code
customer_tbl %>% 
  mutate(CLTV = (revenue - cost)* (conversion_rate/cost)) %>% 
  ggplot(aes(x = channel, y = CLTV, fill =  channel)) +
  geom_boxplot(alpha = 0.5, show.legend = FALSE, color = "black") +
  labs(x ="Channel",
       y = "CLTV",
       title = "CLTV distribution by Channel")

Insight ℹ️

The boxplot above suggested that there’s a difference in the distribution of the customer lifetime value from different channels.

Hypotheses Testing 🧪

The following research hypotheses would be considered in the subsection below;

  1. is there is a significant difference in the revenue generated by channels?
  2. does the CLTV differ significantly across the channels?

Hypothesis 1

  • Null hypothesis: There’s no significant difference in the revenue generated by channels

  • Alternative Hypothesis: There’s a significant different in the revenue generated by channels

Code
model1 <- aov(lm(revenue ~ channel, data = customer_tbl))
summary(model1)
             Df     Sum Sq Mean Sq F value Pr(>F)
channel       3    3476233 1158744    0.73  0.534
Residuals   796 1264097594 1588062               

Insight ℹ️

  • The test result above suggested that there is no significant difference in the revenue across channel at 5% level of significant. As a result, we cannot reject the null hypothesis.

Hypothesis 2

  • Null Hypothesis: There’s no significant difference in the CLTV across channels

  • Alternative Hypothesis: There’s a significant difference in the CLTV across channels

Code
customer_tbl %<>% 
  mutate(CLTV = (revenue - cost)* (conversion_rate/cost))

model2 <- aov(lm(CLTV ~ channel, data = customer_tbl))
summary(model2)
             Df Sum Sq Mean Sq F value              Pr(>F)    
channel       3 237204   79068   339.6 <0.0000000000000002 ***
Residuals   796 185312     233                                
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Insight ℹ️

  • The test result above suggest that there’s a significant difference in the customer lifetime value across channels. The test is significant at 5% level of significance. As a result, the null hypothesis cannot be accepted.

As a follow-up question, it would be great to know where the difference lies. Below is a post-hoc test conducted to know where the difference lies.

Code
out <- duncan.test(model2, "channel", main = "CLTV across channels")
plot(out, variation = "IQR", main = "CLTV across channels")

Code
print(out$groups)
                      CLTV groups
social media     46.584325      a
referral         40.599817      b
email marketing  23.559595      c
paid advertising  1.500682      d
  • The post hoc test shows that there’s a significant different across all the channels at 5% level of significance.

Summary 👍

Customer lifetime value analysis is used to estimate the total value of customers to the business over the lifetime of their relationship. It helps companies determine how much to invest in customer acquisition and retention, as well as identify the most valuable customers to prioritize for retention efforts.

Take Away 🕶️

It was observed that email marketing has the least cost of acquisition with a better conversion rate than paid advertisement which cost the most among all the channels. Social media has the best conversion rate with email marketing contributing the most to the revenue generated and also has a better return on investment (ROI) than others. Social media has a better CLTV among all the channels and this is found to be statistically significant.