Aim -
- To find key drivers that influence the output of an Artificial Neural Network
- To determine the relative importance of these influencing factors
- Project Description
- About Data Set
- Data Exploration
- Modeling Approach
- Model Performance
- Feature Importance
- Summary
- Author
- License
- Acknowledgments
The bank in this case wants to predict whether a customer will subscribe to a term deposit. To run a successful telemarketing campaign, the bank would like to know which customers are highly likely to subscribe to its offer.
The dataset provided by the bank contains the number of days since the last contact, which captures the recency aspect, and the number of contacts performed during the present and previous campaigns, which captures the frequency aspect of the marketing campaign.
For modelling purposes, I have used the recency and frequency metrics to train my model because these metrics have very high predictive power. The model is a binary classifier that outputs either 1 (the customer will subscribe) or 0 (the customer will not subscribe).
- Title: Bank Telemarketing (with social/economic context)
- Past Usage: The full dataset (bank-additional-full.csv) was described and analyzed in:
  S. Moro, P. Cortez and P. Rita. A Data-Driven Approach to Predict the Success of Bank Telemarketing. Decision Support Systems (2014), doi:10.1016/j.dss.2014.03.001.
- All records are ordered by date (from May 2008 to November 2010). A detailed description can be found in Data_Dictionary.md.
In this phase of the project, I have tried to resolve common data challenges such as poor data quality, multicollinearity, and correlation between pairs of variables. The key insights are as follows -
From the preliminary data analysis, I concluded that the duration of the last call, the outcome of the previous campaign, and the month in which the customer was contacted have a significant impact on the final outcome. However, exploratory analysis alone cannot establish how important these factors are relative to one another.
A detailed description can be found in Bank_Marketing_Exploratory_Analysis.ipynb.
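As an illustration of the pairwise-correlation check mentioned above, here is a minimal sketch that lists strongly correlated numeric column pairs. The helper name `highly_correlated_pairs`, the 0.9 threshold, and the synthetic demo columns are assumptions made for this example; they are not taken from the notebook.

```python
import numpy as np
import pandas as pd


def highly_correlated_pairs(df, threshold=0.9):
    """Return (col_a, col_b, |r|) for numeric column pairs with |r| >= threshold."""
    corr = df.select_dtypes(include="number").corr().abs()
    cols = list(corr.columns)
    pairs = []
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):  # upper triangle only, skip the diagonal
            r = float(corr.iloc[i, j])
            if r >= threshold:
                pairs.append((cols[i], cols[j], r))
    # Strongest correlations first
    return sorted(pairs, key=lambda p: p[2], reverse=True)


# Synthetic demo data (not the bank dataset): "b" is almost a rescaled copy of "a".
rng = np.random.default_rng(0)
a = rng.normal(size=200)
demo = pd.DataFrame({
    "a": a,
    "b": 2 * a + rng.normal(scale=0.01, size=200),
    "c": rng.normal(size=200),
})
pairs = highly_correlated_pairs(demo)
print(pairs)
```

Pairs flagged this way are candidates for dropping or combining one of the two columns before modelling; the right threshold depends on the data.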
Steps -
- Feature Engineering and Data Transformation of Categorical and Numerical Attributes
- Split the dataset into predictors and the response variable. In this case, the response is whether the customer subscribes or not.
- Split the dataset into training set (75%) and test set (25%)
- Create Model
- Fine-tune the hyperparameters
- Test Performance of the Final Model
- Report Performance Metrics
- Identify most important factors based on socioeconomic characteristics of the customers
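The first three steps above can be sketched roughly as follows. This is a hedged example using scikit-learn, not the project's actual code: the function name `split_and_transform`, the choice of one-hot encoding and standard scaling, the stratified split, and the tiny demo frame are all assumptions. Only the 75%/25% split and the yes/no response come from this README.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def split_and_transform(df, target="y", test_size=0.25, seed=42):
    """Separate predictors/response, split 75/25, then encode and scale."""
    X = df.drop(columns=[target])
    y = (df[target] == "yes").astype(int)  # 1 = subscribes, 0 = does not

    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed
    )

    numerical = [c for c in X.columns if pd.api.types.is_numeric_dtype(X[c])]
    categorical = [c for c in X.columns if c not in numerical]
    transformer = ColumnTransformer(
        [
            ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
            ("num", StandardScaler(), numerical),
        ],
        sparse_threshold=0.0,  # always return a dense array
    )
    # Fit on the training set only so no test-set statistics leak into training.
    X_train_t = transformer.fit_transform(X_train)
    X_test_t = transformer.transform(X_test)
    return X_train_t, X_test_t, y_train.to_numpy(), y_test.to_numpy()


# Tiny made-up frame for illustration (not rows from the bank dataset).
demo = pd.DataFrame({
    "month": ["may", "jun", "may", "nov", "jun", "may", "nov", "jun"],
    "duration": [120, 640, 85, 910, 300, 45, 720, 510],
    "y": ["no", "yes", "no", "yes", "no", "no", "yes", "yes"],
})
X_train_t, X_test_t, y_train, y_test = split_and_transform(demo)
print(X_train_t.shape, X_test_t.shape)
```

Fitting the encoder and scaler on the training portion only is a design choice that keeps the test set untouched until evaluation.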
Overall, our model achieved an accuracy of 91.66% on the test set.
The confusion matrix for this classification model is shown below.
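For reference, a confusion matrix and the accuracy and per-class precision figures reported in this README can be derived from test-set predictions as in the sketch below. The labels are a made-up toy example, not the project's predictions, and the helper names `confusion_counts` and `summarize` are hypothetical.

```python
import numpy as np


def confusion_counts(y_true, y_pred):
    """Return [[TN, FP], [FN, TP]] for binary labels (rows = actual, columns = predicted)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    return np.array([[tn, fp], [fn, tp]])


def summarize(cm):
    """Accuracy and per-class precision from a 2x2 confusion matrix."""
    (tn, fp), (fn, tp) = cm.tolist()
    return {
        "accuracy": (tn + tp) / (tn + fp + fn + tp),
        "precision_0": tn / (tn + fn),  # of all predicted 0, the share that is truly 0
        "precision_1": tp / (tp + fp),  # of all predicted 1, the share that is truly 1
    }


# Toy labels for illustration only.
y_true = [0, 0, 0, 0, 1, 1, 0, 1]
y_pred = [0, 0, 0, 1, 1, 0, 0, 1]
cm = confusion_counts(y_true, y_pred)
metrics = summarize(cm)
print(cm)
print(metrics)
```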
- The optimal set of hyperparameters for our neural network is given by -
- neurons = 31
- learning_rate = 0.1
- batch_size = 2048
- optimizer = adam
- epochs = 100
- Accuracy = 91.65%
- Precision for (y=0) = 95%
- Precision for (y=1) = 63%
- Duration of the last contact is the most influential factor in determining whether a customer will subscribe to a term deposit.
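This README does not state which framework or importance method produced the result above. As a hedged illustration, the sketch below maps the listed hyperparameters onto scikit-learn's `MLPClassifier` (assuming a single hidden layer of 31 neurons) and ranks features with permutation importance, one common model-agnostic option. The data is synthetic and built so that the column named `driver` determines the label; it does not reproduce the project's figures.

```python
import numpy as np
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic data (assumption): only the first column influences the label.
rng = np.random.default_rng(0)
n = 6000
feature_names = ["driver", "noise_a", "noise_b"]
X = rng.normal(size=(n, 3))
y = (X[:, 0] + 0.1 * rng.normal(size=n) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Hyperparameters taken from the list above; the single-layer layout is an assumption.
model = MLPClassifier(
    hidden_layer_sizes=(31,),   # neurons = 31
    solver="adam",              # optimizer = adam
    learning_rate_init=0.1,     # learning_rate = 0.1
    batch_size=2048,            # batch_size = 2048
    max_iter=100,               # epochs = 100 (max_iter counts epochs for adam)
    random_state=0,
)
model.fit(X_train, y_train)

# Permutation importance: shuffle one column at a time and measure the drop in accuracy.
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=0
)
ranking = sorted(
    zip(feature_names, result.importances_mean),
    key=lambda item: item[1],
    reverse=True,
)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

A larger mean drop in accuracy after shuffling a column indicates that the model relies more heavily on that feature, which gives the relative ranking that exploratory analysis alone cannot provide.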
The MIT License (MIT)
Copyright (c) 2023 Abbas Singapurwala
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Inspiration, code snippets, etc.