Project Description¶

In this project, we will analyze the subscription data of a SaaS (Software as a Service) company to understand what encourages clients to renew their subscriptions. We aim to combine and examine different datasets to find patterns and trends in customer behavior.

The company has collected data on client information, subscription details, and economic indicators. By connecting these datasets, we will try to understand how internal factors (like company size or industry) and external factors (like inflation or GDP growth) influence subscription renewals.

Our main goal is to identify the key factors that affect a client's decision to continue using the service. This analysis will help us understand:

  • Which types of customers are more likely to renew
  • What business or economic conditions support higher renewal rates
  • How the company can improve customer retention and loyalty

The insights we discover can help the company create better strategies to increase subscription renewals and support long-term business growth.


The Data¶

We will use three datasets for our analysis. Each one provides important information that, when combined, can give us a clearer picture of customer behavior.

client_details.csv¶

This file contains background information about each client.

Column Description
client_id A unique ID for each client.
company_size Size of the client company (Small, Medium, Large).
industry The industry the client belongs to (e.g., Fintech, Gaming, etc.).
location The geographic location of the client (e.g., New York, etc.).

subscription_records.csv¶

This file tracks subscription activity for each client.

Column Description
client_id A unique ID for each client (matches with the ID in client_details).
subscription_type The type of subscription plan (Yearly or Monthly).
start_date The date when the subscription started (format: YYYY-MM-DD).
end_date The date when the subscription ended (format: YYYY-MM-DD).
renewed Whether the subscription was renewed (True or False).

economic_indicators.csv¶

This file includes external economic factors during the subscription period.

Column Description
start_date Start date of the economic period (quarterly, format: YYYY-MM-DD).
end_date End date of the economic period (quarterly, format: YYYY-MM-DD).
inflation_rate The inflation rate during the period.
gdp_growth_rate The GDP growth rate during the period.

By analyzing all three datasets together, we will try to discover patterns in client renewals and provide clear, data-driven suggestions for improving the company's renewal strategy.

In [2]:
# Import required library
import pandas as pd

# Import data
client_details = pd.read_csv('client_details.csv')
subscription_records = pd.read_csv('subscription_records.csv', parse_dates = ['start_date','end_date'])
economic_indicators = pd.read_csv('economic_indicators.csv', parse_dates = ['start_date','end_date'])

How many total Fintech and Crypto clients does the company have?¶

In [3]:
# Define a function that returns 1 if the input is either 'Fintech' or 'Crypto', otherwise returning 0
def is_fintech_or_crypto(x):
    if x in ['Fintech','Crypto']:
        return 1
    else:
        return 0

# Loop through the 'industry' column in client_details and count Fintech or Crpyto client
total_fintech_crypto_clients = 0
for industry in client_details['industry']:
    total_fintech_crypto_clients += is_fintech_or_crypto(industry)

total_fintech_crypto_clients
Out[3]:
47

Which industry has the highest renewal rate?¶

In [4]:
# Merge client details with subscription records
subscriptions = pd.merge(subscription_records, client_details, on = 'client_id', how = 'left')

# Group by industry and calculate renewal rate
industry_renewal_rates = subscriptions.groupby('industry')['renewed'].mean()

# Find the industry with the highest renewal rate
top_industry = industry_renewal_rates.sort_values(ascending = False).index[0]

top_industry
Out[4]:
'Gaming'

For clients that renewed their subscriptions, what was the average inflation rate when their subscriptions were renewed?¶

In [5]:
# Merge subscription records with economic indicators
subscriptions_with_inflation = pd.merge_asof(subscription_records.sort_values(by='end_date'), 
                                             economic_indicators, 
                                             left_on='end_date', 
                                             right_on='start_date', 
                                             direction='backward')


# Calculate the average inflation rate for renewed subscriptions
average_inflation_for_renewals = subscriptions_with_inflation[subscriptions_with_inflation['renewed'] == True].inflation_rate.mean()
average_inflation_for_renewals
Out[5]:
4.418909090909092