
Santander Product Recommendation

The dataset is available on Kaggle for the product recommendation problem.

Problem Statement:
Santander has provided 1.5 years of customer behaviour data to predict which new products customers will purchase. The data starts at 2015-01-28 and has monthly records of the products a customer holds, such as "credit card", "savings account", etc. The details of the dataset can be found at: https://www.kaggle.com/c/santander-product-recommendation/data

Points to be considered:
- The training dataset covers 1.5 years and is approximately 2.5 GB of CSV files.
- It has 24 features, which contribute to approximately 24 products.
- Overall, the training dataset has 48 columns, i.e. 24 features and 24 product recommendations.
- The column (feature) names provided are not in a standard form and need to be renamed according to the descriptions given on Kaggle.

Design/Approach:
The major problem with the dataset is its size (~2.5 GB), so code is needed to optimize the size without losing any data or features and u