Reducing Churn and Saving Thousands of Dollars — Case Study

A case study was done for a telecom company using the simplest models.

Customer retention, loyalty, and churn are receiving attention in many industries, largely because of their effect on customer lifetime value. Knowing the lifetime value gives a company a sense of how much is really being lost to churn and how large a retention campaign is justified. A mass-marketing approach cannot succeed given the diversity of today's consumer businesses.

There are various approaches to churn management. A broad strategy is improving product quality or running mass advertising. A targeted strategy could be sending messages to particular customers.

There are two ways to address churn: reactive and proactive.

The reactive strategy means getting in touch with customers at the time they churn or after they have churned. The company may try to understand their problem and offer a solution, or simply offer discounts or other benefits. This approach not only incurs cost but also incentivizes churning. The word spreads: “threaten to leave and the company will reward you”. Recently, when I threatened to cancel Grammarly, they gave me an immediate 40% discount. Now I am telling everyone about it, and hopefully you too can take advantage of it if you write a lot.

Now, what if we could predict each customer’s churn probability (risk) within a time window and take action before they threaten to leave? This is the proactive approach.

The company here is a small telecom shop in the US that we consulted for and that currently uses our product. We have simplified the data for explanation purposes and show only a small sample of it. The data covers 35,000 people; the churn rate after 2 years is 50%, and the monthly churn is 2%.

We built tens of hundreds of models for this client, but in this article we use a logistic regression model because it is easier to explain and lets us include the odds ratio. We also want to understand the factors behind the churn.

Our target variable is churn. The ID is customer.

To keep this article simple, we attach only two brief exploratory data analysis plots:

Numeric Variables Correlation Plot

Bi-Variate Pairplot

We now need to understand what drives churn, and by how much. The coefficient of each variable in the logistic regression equation, its odds ratio, and its importance will help us with that. Keep in mind that before we built the model the data was very messy, and we had to do a lot of feature engineering, including standardization and handling missing values, NaNs, text variables, etc.

The odds ratio is calculated as:

odds ratio = exp(variable coefficient)

The importance of a variable is calculated as:

importance =
if odds ratio > 1 then odds ratio;
else 1 / odds ratio;
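The two formulas above can be sketched in a few lines of Python. This is only an illustration: the coefficients are hypothetical, back-derived as the natural log of the odds ratios quoted later in this article, since the fitted model's actual coefficients are not shown.

```python
import math

# Hypothetical coefficients, back-derived from the odds ratios quoted in this
# article (coef = ln(odds ratio)); the real fitted values are not published.
coefficients = {
    "highcreditr": math.log(0.47),
    "month": math.log(0.66),
    "eqpdays": math.log(2.19),
}


def odds_ratio(coef):
    """Odds ratio for a one-unit increase in the variable."""
    return math.exp(coef)


def importance(coef):
    """Fold odds ratios below 1 onto the same scale as those above 1."""
    ratio = odds_ratio(coef)
    return ratio if ratio > 1 else 1 / ratio


# Rank variables from most to least important.
ranked = sorted(coefficients, key=lambda name: importance(coefficients[name]), reverse=True)
for name in ranked:
    coef = coefficients[name]
    print(f"{name}: odds ratio={odds_ratio(coef):.2f}, importance={importance(coef):.2f}")
```

Taking the reciprocal for odds ratios below 1 puts protective and risk factors on a common scale, so a variable that halves the odds ranks the same as one that doubles them.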

The above shows the most important variables.

Three simple interpretations:

highcreditr: for customers who have a high credit rating, the odds of churn are multiplied by 0.47 (a 53% decrease).

month: for each additional month the customer has had service, the odds of churn are multiplied by 0.66 (a 34% decrease).

eqpdays: for each additional day the customer has owned the current equipment, the odds of churn are multiplied by 2.19 (a 119% increase).
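An odds ratio acts on odds, not directly on probability, so it can help to convert back and forth. The sketch below is a rough illustration only: it takes the 2% monthly churn mentioned earlier as an assumed baseline probability, and the helper name `apply_odds_ratio` is made up for this example.

```python
def apply_odds_ratio(probability, odds_ratio):
    """Return the probability obtained after multiplying the odds by odds_ratio."""
    odds = probability / (1 - probability)
    new_odds = odds * odds_ratio
    return new_odds / (1 + new_odds)


# Assumed baseline: the 2% monthly churn reported for the whole customer base.
baseline = 0.02
for name, ratio in [("highcreditr", 0.47), ("month", 0.66), ("eqpdays", 2.19)]:
    print(f"{name}: {baseline:.2%} -> {apply_odds_ratio(baseline, ratio):.2%}")
```

For small probabilities the odds and the probability are nearly equal, which is why the change in churn probability is close to, but not exactly, the odds ratio itself.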

As we can see, the three most important variables are eqpdays, followed by highcreditr and months. If we can group customers by certain characteristics (e.g. having owned the current equipment for a long time), we can contact them with targeted messages and avoid churn in the first place.

A simple action plan: we will call customers who have owned their current equipment for more than 360 days (approximately the average equipment ownership length) and ask them to renew their equipment. We assume the following:


  • 10% of customers who receive the call will switch to a new phone.
  • The cost for the company is a one-time cost of $150 (including phone cost and marketing cost).
  • The average revenue per customer in the group remains the same.

The base churn rate for customers who have owned their current equipment for longer than 360 days is 2.75%.

We then randomly select 10% of this subgroup, set their eqpdays value to 0, and use the logistic model to re-estimate the churn rate and compute its average. The new average churn rate is estimated to be 2.14%.
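The re-scoring step can be sketched as follows. Everything here is hypothetical: the single-feature logistic model, its intercept and per-day coefficient, and the synthetic eqpdays values are made up for illustration and are not the fitted model or the client's data.

```python
import math
import random


def churn_probability(eqpdays, intercept, coef_per_day):
    """Logistic model with a single feature (a simplification of the full model)."""
    return 1 / (1 + math.exp(-(intercept + coef_per_day * eqpdays)))


# Made-up parameters for illustration only.
INTERCEPT = -4.5
COEF_PER_DAY = 0.002

rng = random.Random(0)
# Synthetic subgroup: customers who have owned their equipment for more than 360 days.
subgroup = [rng.randint(361, 900) for _ in range(1000)]

base_rate = sum(churn_probability(d, INTERCEPT, COEF_PER_DAY) for d in subgroup) / len(subgroup)

# Randomly pick 10% of the subgroup and reset their equipment age to 0 days.
upgraded = set(rng.sample(range(len(subgroup)), k=len(subgroup) // 10))
simulated = [0 if i in upgraded else d for i, d in enumerate(subgroup)]

new_rate = sum(churn_probability(d, INTERCEPT, COEF_PER_DAY) for d in simulated) / len(simulated)
print(f"base churn rate: {base_rate:.2%}, simulated churn rate: {new_rate:.2%}")
```

With a real model, the same idea applies: copy the feature matrix, overwrite eqpdays for the sampled rows, and average the predicted probabilities before and after.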

The average revenue of this subgroup of customers is $50.41. Based on the above outcomes and assumptions, we calculated the new LTV per customer and compared it to the baseline.

The plot depicts the estimated lifetime value over a 60-month period.

Keeping all else equivalent:

  • The 12-month average LTV for a customer will reach $497.77, an increase of 0.25% compared to the base LTV;
  • The 24-month average LTV for a customer will reach $857.17, an increase of 4.61% compared to the base LTV;
  • The 36-month average LTV for a customer will reach $1109.14, an increase of 7.74% compared to the base LTV;
  • The 48-month average LTV for a customer will reach $1285.80, an increase of 10.27% compared to the base LTV;
  • The 60-month average LTV for a customer will reach $1409.64, an increase of 12.33% compared to the base LTV.
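One plausible way to set up such an LTV comparison is sketched below. The exact formula behind the figures above is not stated, so this sketch is not expected to reproduce them. It assumes that the $50.41 revenue and both churn rates are monthly, that churn is constant over time, that there is no discounting, and that the expected campaign cost per called customer is 10% of $150.

```python
def cumulative_ltv(monthly_revenue, monthly_churn, months, upfront_cost=0.0):
    """Expected cumulative revenue per customer over `months`, minus an upfront cost.

    Assumes a constant monthly churn rate, revenue collected at the start of each
    month from customers who are still active, and no discounting.
    """
    survival = 1.0
    total = 0.0
    for _ in range(months):
        total += monthly_revenue * survival
        survival *= 1 - monthly_churn
    return total - upfront_cost


REVENUE = 50.41             # average revenue of the subgroup (assumed to be monthly)
BASE_CHURN = 0.0275         # base churn rate of the subgroup (assumed to be monthly)
NEW_CHURN = 0.0214          # re-estimated churn rate after the campaign
EXPECTED_COST = 0.10 * 150  # assumption: 10% of called customers upgrade at $150 each

for horizon in (12, 24, 36, 48, 60):
    base = cumulative_ltv(REVENUE, BASE_CHURN, horizon)
    new = cumulative_ltv(REVENUE, NEW_CHURN, horizon, EXPECTED_COST)
    print(f"{horizon} months: base ${base:.2f}, new ${new:.2f}, change {(new - base) / base:+.2%}")
```

Under these assumptions the gain grows with the horizon, because the one-time cost is paid upfront while the lower churn rate compounds month after month.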


This proactive approach benefits the organization: it can prevent customer churn beforehand, without giving incentives to customers who are already leaving.


  • There were customer chats and email records as part of the dataset. We didn't use them for this article, since that would have made explaining the odds ratio and variable importance a bit more difficult. A case study on sentiment analysis can be found here.
  • This model had 65% accuracy. The winning model was an ensemble of gradient boosting and a neural network with an accuracy of 89%, which we aren’t using in this article.

About the Author: Harsh Gupta has more than 7 years of experience building and directing AI initiatives across diverse industries, amounting to $10M+ in additional revenue during this period. He has served in technical roles such as Data Scientist at WWF and client-facing roles such as Consultant for Johns Hopkins, Grofers, and OSUgiving. He is currently CEO of protonAutoML, a full-service data science consultancy and autoML software provider.

He can be reached here for any advice or consultation.

Reducing Churn and Saving Thousands of Dollars — Case Study was originally published in The Startup on Medium, where people are continuing the conversation by highlighting and responding to this story.
