Business Analytics Using Data Mining

Final Project: 2Q-CSC550X-A1-07-Data Mining and Distributed Computing-Spring 2013
Student ID: nmallu8596
Predicting Average Basket Value
Business Analytics Using Data Mining
Table of Contents
I. Executive Summary
A. Problem Description
B. Brief description of the data, its source, key characteristics, and chart(s)
C. High level Description (Prediction Methodology)
D. Technical Summary
E. Performance Metrics
F. Limitations
G. Recommendations
II. APPENDIX
A. CART Model
B. Multiple Regression Model
C. KNN Model
D. Naive Model
I. Executive Summary
A. Problem Description
Retailers spend considerable time, effort, and money to acquire a new customer. However,
once a customer has been acquired, the maximum value can be derived only if the customer
becomes a repeat buyer whose purchase amounts increase over time. Identifying which
customers should qualify for a promotion is key to this problem, and our study attempts to
address it. Our model predicts the future shopping basket value of a customer. Based on
this prediction, the top 10% of customers will be identified every week, and the store
will email promotional discount coupons to them. The accuracy of the model is another
deciding factor in the overall business strategy: missing a likely loyal customer could
reduce the associated long-term customer lifetime value. As the hypermarket, we are
interested in predicting the average basket value of the next customer who walks in, based
on his/her demographic data as well as his/her purchase pattern prior to this visit.
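The weekly selection step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `top_percent`, the customer IDs, and the predicted values are all hypothetical, and the sketch simply assumes that a predicted basket value is already available for each customer.

```python
def top_percent(predictions, percent=10):
    """Return the customer IDs with the highest predicted basket value.

    predictions: dict mapping customer_id -> predicted basket value.
    percent: share of customers to select (10 means the top 10%).
    """
    if not predictions:
        return []
    # Integer arithmetic avoids floating-point rounding surprises;
    # at least one customer is selected when any predictions exist.
    n_selected = max(1, len(predictions) * percent // 100)
    ranked = sorted(predictions, key=predictions.get, reverse=True)
    return ranked[:n_selected]


# Hypothetical predictions for one week (illustrative values only).
weekly_predictions = {
    "C001": 120.0, "C002": 45.5, "C003": 310.0, "C004": 80.0, "C005": 15.0,
    "C006": 220.0, "C007": 60.0, "C008": 95.0, "C009": 130.0, "C010": 40.0,
}

# IDs that would receive the promotional coupon email this week.
coupon_recipients = top_percent(weekly_predictions)
print(coupon_recipients)
```

Because only the ranking matters for this step, a model that orders customers correctly can still be useful for targeting even when its absolute predictions are off.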
B. Brief description of the data, its source, key characteristics, and chart(s)
The transaction data for ABC include information at the SKU level: customer ID, purchase
date, extended price, quantity sold, item description, department, sub-department, class
and subclass. The data cover a period of 13 months, from Aug 2011 to Aug 2012.
Customer demographic information has also been provided, which includes enrolment date,
date of birth, sex, marital status and customer ID.
Source: Data collected by HansaCequity Solutions
Key Characteristics: The available data was at the SKU and customer level. However, we
needed basket level data for our analysis. In order to efficiently filter and sort the data to
make it suitable for basket level analysis, we used BASE SAS v9.1.
· Compiled data at the basket level for each customer: merged customer information with
the transaction data to get customer_id and transaction_data level data
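The roll-up from SKU-level rows to basket-level rows can be sketched as follows. The authors did this work in BASE SAS; the Python below is only an illustration under stated assumptions. It assumes that a basket is everything one customer bought on one purchase date, and that the extended price is the line total for an item. The field names and sample rows are hypothetical, modelled on the columns listed earlier.

```python
from collections import defaultdict

# Hypothetical SKU-level rows; field names are assumptions based on the text.
transactions = [
    {"customer_id": "C001", "purchase_date": "2011-08-03", "extended_price": 12.5, "quantity": 1},
    {"customer_id": "C001", "purchase_date": "2011-08-03", "extended_price": 7.5, "quantity": 2},
    {"customer_id": "C001", "purchase_date": "2011-09-10", "extended_price": 30.0, "quantity": 1},
    {"customer_id": "C002", "purchase_date": "2011-08-03", "extended_price": 5.0, "quantity": 4},
]

# Hypothetical demographic records keyed by customer ID.
demographics = {
    "C001": {"sex": "F", "marital_status": "Married"},
    "C002": {"sex": "M", "marital_status": "Single"},
}


def build_baskets(transactions, demographics):
    """Aggregate SKU rows to one row per (customer_id, purchase_date)."""
    totals = defaultdict(lambda: {"basket_value": 0.0, "items": 0})
    for row in transactions:
        key = (row["customer_id"], row["purchase_date"])
        totals[key]["basket_value"] += row["extended_price"]
        totals[key]["items"] += row["quantity"]

    baskets = []
    for (customer_id, purchase_date), agg in sorted(totals.items()):
        basket = {"customer_id": customer_id, "purchase_date": purchase_date, **agg}
        # Left-join style merge: a basket is kept even if the customer
        # has no demographic record.
        basket.update(demographics.get(customer_id, {}))
        baskets.append(basket)
    return baskets


baskets = build_baskets(transactions, demographics)
for basket in baskets:
    print(basket)
```

Keeping baskets for customers without demographic records is a design choice in this sketch; an inner join would instead drop them, which may or may not match what the original analysis did.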