Market Basket Analysis of an SAP HANA table (invoices) using an R script
Objective: 

If you have an SAP HANA database that stores your enterprise transaction data and you want to apply predictive or machine learning algorithms to its tables or views using R, this blog walks through the steps to connect to SAP HANA from R, retrieve data from tables, views, or procedures, and apply R's statistical or machine learning techniques to gain insight into the data.

Background and use case:

R is a programming language and environment for statistical computing and graphics. It is used in business and commercial applications as well as in research, education, training, rapid prototyping, and application development, and its packages cover the steps of the machine learning process, including data preparation, text mining, results visualization, model validation, and optimization.

SAP HANA is an in-memory, column-oriented, relational database management system developed and marketed by SAP SE. Its primary function as a database server is to store and retrieve data as requested by the applications.

With structured and unstructured data available on the SAP HANA data platform, I wanted to build a proof of concept in R for three data science use cases: text analysis, market basket analysis, and customer segmentation.

The step-by-step procedure below connects R to HANA DB tables, views, and procedures.

DS Examples using R: the steps to connect to SAP HANA DB and run the analytics are:
1. Install the HANA client tools and RStudio
2. Request READ access to the HANA database
3. Set up the data source using the HDBODBC driver (with the SAP HANA DB server details)
4. Install the RODBC package in RStudio
5. Establish the database connection using R commands
6. Run SQL statements from R with the sqlQuery function
7. Feed the data to the required algorithms to analyze the data/results
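library(RODBC)          # ODBC connectivity to HANA
library(arules)         # association rule mining
library(arulesViz)      # visualization of association rules
library(Matrix)
library(xlsx)
library(plotrix)
library(RColorBrewer)

# Open the connection; replace <DSN> with the data source name set up in step 3
ch <- odbcConnect("<DSN>", uid = "REDDYJ", pwd = "password")

# Fetch the invoice data; the column list is an assumption, adjust it to your table
res <- sqlQuery(ch, "SELECT INVOICE, PRODUCT, ORDERS FROM REDDYJ.CBC_MBA")
summary(res)
list(res)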
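
A slightly more defensive version of the connect-and-query steps could look like the sketch below. It rests on assumptions: the DSN name `HANA_DSN`, the environment variable `HANA_PWD`, and the helper name `fetch_table` are hypothetical and not part of the original setup. It relies on two documented RODBC behaviours: `odbcConnect` returns -1 rather than an `RODBC` object when the connection fails, and `sqlQuery` returns a character vector of error messages when the query fails.

```r
# Hypothetical helper: open a connection, run one SELECT, always close the channel.
fetch_table <- function(query, dsn = "HANA_DSN", uid = "REDDYJ",
                        pwd = Sys.getenv("HANA_PWD")) {
  ch <- RODBC::odbcConnect(dsn, uid = uid, pwd = pwd)
  if (!inherits(ch, "RODBC")) {
    stop("Could not connect to DSN '", dsn, "'")
  }
  # Close the channel when the function exits, even after an error.
  on.exit(RODBC::odbcClose(ch), add = TRUE)

  res <- RODBC::sqlQuery(ch, query, stringsAsFactors = FALSE)
  # For a SELECT, a character result means error messages, not data.
  if (is.character(res)) {
    stop("Query failed: ", paste(res, collapse = " | "))
  }
  res
}

# Usage (needs a reachable HANA server, so it is left commented out):
# invoices <- fetch_table("SELECT * FROM REDDYJ.CBC_MBA")
```

Reading the password from an environment variable keeps it out of the script, and `on.exit` avoids leaving connections open on the HANA side.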

This example fetches invoices by product as the input for the market basket analysis.
Market Basket Analysis (MBA) is one of the key techniques used by large retailers to uncover associations between items. It works by looking for combinations of items that occur together frequently in transactions. To put it another way, it allows retailers to identify relationships between the items that people buy. MBA applies Association Rule Mining to the given transaction data.
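
As a rough illustration of how invoice rows become baskets and then rules, here is a minimal sketch. The toy data frame stands in for the query result, and the column names `INVOICE` and `PRODUCT` as well as the support and confidence thresholds are assumptions chosen for the example, not values from the original analysis. The rule mining part only runs if the arules package is installed.

```r
# Toy stand-in for the query result: one row per (invoice, product) pair.
# INVOICE and PRODUCT are assumed column names.
res <- data.frame(
  INVOICE = c(1001, 1001, 1002, 1002, 1002, 1003, 1004, 1004, 1005, 1005),
  PRODUCT = c("milk", "butter", "milk", "butter", "bread",
              "milk", "bread", "butter", "milk", "butter"),
  stringsAsFactors = FALSE
)

# Group products by invoice: each list element is one basket.
# unique() guards against the same product appearing twice on an invoice.
baskets <- lapply(split(res$PRODUCT, res$INVOICE), unique)

# Number of baskets each product appears in (base R only).
item_counts <- table(unlist(baskets))
print(item_counts)

# Mine association rules with apriori when arules is available.
if (requireNamespace("arules", quietly = TRUE)) {
  library(arules)
  trans <- as(baskets, "transactions")
  rules <- apriori(trans,
                   parameter = list(support = 0.4, confidence = 0.6, minlen = 2),
                   control = list(verbose = FALSE))
  inspect(sort(rules, by = "lift"))
}
```

On real invoice data the thresholds usually need to be much lower than in this toy example, because individual product combinations are rare relative to the total number of invoices.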
An example of association rules: assume there are 100 customers.
  • 10 of them bought milk, 8 bought butter, and 6 bought both
  • Rule: bought milk => bought butter
  • support = P(Milk & Butter) = 6/100 = 0.06
  • confidence = support/P(Milk) = 0.06/0.10 = 0.60
  • lift = confidence/P(Butter) = 0.60/0.08 = 7.5
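
The same arithmetic can be checked in a few lines of R. This is a sketch using the standard definitions for a rule A => B, with milk as the antecedent A and butter as the consequent B, so confidence divides by P(Milk) and lift divides by P(Butter). The function name `rule_metrics` and its parameter names are made up for this illustration.

```r
# Support, confidence and lift for a rule lhs => rhs, computed from raw counts.
rule_metrics <- function(n_total, n_lhs, n_rhs, n_both) {
  support    <- n_both / n_total                 # P(lhs & rhs)
  confidence <- support / (n_lhs / n_total)      # P(rhs | lhs)
  lift       <- confidence / (n_rhs / n_total)   # confidence relative to P(rhs)
  c(support = support, confidence = confidence, lift = lift)
}

# milk => butter: 100 customers, 10 bought milk, 8 bought butter, 6 bought both
m <- rule_metrics(n_total = 100, n_lhs = 10, n_rhs = 8, n_both = 6)
print(round(m, 2))  # support 0.06, confidence 0.6, lift 7.5
```

A lift well above 1, as here, means that customers who buy milk buy butter far more often than customers overall.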
Use cases / Benefits:
  • Store layout
  • Marketing, promotions
  • Website content placement
  • Recommendation engine
  • Cross-selling

You can find code here


