Big Data, AI and Machine Learning
The role of AI and Machine Learning: can we automate and improve our approaches?
Data science uses automated methods to analyse
vast amounts of data and extract knowledge
Google Analytics Customers
Characteristics of Big Data
Predictive Analytics for
Marketers
Managing Churn
Predicting component failure
Establishing Segments
Examples in
Retail Banking
Mobile Telecoms
Customer Analysis
Public Sector etc
It also discusses Pricing and Social Network analysis issues and what software is
available.
Predictive vs Descriptive
Analysis and Modelling
An introduction
All models are wrong but
some are useful
Become familiar with the tools that help you
Identify populations
Predict behaviour
Recommend activities to improve performance
Track these
Principles of data
management
Leventhal’s examples of data
Examples of Data Sources
Leventhal’s take on the role of the analyst
She has to discuss with the business experts what the issues are
Understand the nature of the data that supports the business processes
Brief IT so the data can be collected and prepared, which will take about 70% of the time
Decide on the modelling techniques, then build and test the model
Data quality Audit
◦ Values analysis – is each variable well represented?
◦ Statistical analysis – what is the envelope of values represented by the different means and SDs?
◦ What does a decile histogram tell us about the shape of the data?
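The three audit questions above can be sketched in a few lines of Python using only the standard library. This is a minimal illustration, not Leventhal's procedure: the `monthly_spend` values are invented, `audit_variable` is a hypothetical helper, and treating the "envelope" as the mean plus or minus two standard deviations is an assumption.

```python
import statistics

def audit_variable(values):
    """Summarise one numeric variable for a data quality audit."""
    # Values analysis: how many records actually carry a value?
    present = [v for v in values if v is not None]
    mean = statistics.mean(present)
    sd = statistics.stdev(present)
    # Nine cut points that split the observed values into ten equal-sized groups
    deciles = statistics.quantiles(present, n=10)
    return {
        "fill_rate": len(present) / len(values),
        "mean": mean,
        "sd": sd,
        # Assumption: the envelope is taken as mean +/- 2 standard deviations
        "envelope": (mean - 2 * sd, mean + 2 * sd),
        "deciles": deciles,
    }

# Hypothetical monthly spend values; None marks a missing record
monthly_spend = [12, 15, None, 18, 20, 22, 25, 30, None, 45, 60, 120]
report = audit_variable(monthly_spend)
for name, value in report.items():
    print(name, value)
```

Widely spaced upper deciles next to tightly packed lower ones suggest a long right tail, which is typical of spend data.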
Data preparation and cleansing –
about 60-70% of the work
An example: inspecting the dataset
The analytic modelling toolkit
The same analysis from a machine
learning perspective looks like this
Classification Tools
Decision Trees – Supervised learning
Turn a classification into questions
Types of Decision Trees
When to use them
Rule induction trees: CHAID uses chi-square tests to decide on the splits
Many trees can be combined into a random forest, which averages the predictions of trees fitted to different random samples of the data
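To see how a chi-square test can choose between candidate splits, here is a minimal sketch in plain Python. It computes only the Pearson chi-square statistic for each candidate and picks the largest. A full CHAID implementation also merges similar categories and compares adjusted p-values, which is not shown here. The counts and the variable names `contract` and `region` are invented for illustration.

```python
def chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the split and the outcome were unrelated
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical counts: rows are the two branches of a candidate split,
# columns are (churned, stayed)
candidates = {
    "contract": [[40, 60], [10, 90]],
    "region": [[26, 74], [24, 76]],
}

scores = {name: chi_square(table) for name, table in candidates.items()}
best = max(scores, key=scores.get)
print(scores, best)
```

The split with the larger statistic separates churners from stayers more sharply, so it is the better question to ask first.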
There are decision trees in
Google Analytics
PCA (principal component analysis) is about reducing the variables to
something manageable – start by removing highly correlated variables
◦ Highlight the key values that describe what’s going on
◦ It requires business acumen to see which things can safely be ignored
Cluster analysis enables you to see which customers fall into natural
segments.
The self-organising map is basically an unsupervised neural net
Affinity analysis delivers the beer and nappies correlation
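Affinity analysis is usually summarised with three numbers for a rule such as "nappies, therefore beer": support, confidence and lift. The sketch below computes them in plain Python. The baskets are invented and `affinity` is a hypothetical helper name, not part of any library.

```python
def affinity(baskets, a, b):
    """Support, confidence and lift for the rule 'a -> b'."""
    n = len(baskets)
    with_a = sum(1 for basket in baskets if a in basket)
    with_b = sum(1 for basket in baskets if b in basket)
    with_both = sum(1 for basket in baskets if a in basket and b in basket)
    support = with_both / n          # share of all baskets containing both items
    confidence = with_both / with_a  # share of baskets with a that also contain b
    lift = confidence / (with_b / n) # confidence relative to b's overall popularity
    return support, confidence, lift

# Hypothetical shopping baskets
baskets = [
    {"nappies", "beer", "crisps"},
    {"nappies", "beer"},
    {"nappies", "milk"},
    {"beer", "crisps"},
    {"milk", "bread"},
    {"nappies", "beer", "milk"},
    {"bread"},
    {"milk", "crisps"},
]

support, confidence, lift = affinity(baskets, "nappies", "beer")
print(support, confidence, lift)
```

A lift above 1 means the two items appear together more often than their individual popularity would predict.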
Cluster Analysis
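The slides do not name a clustering algorithm, so as an assumption this sketch uses k-means, one of the simplest. It is written in plain Python with fixed starting centroids so the result is repeatable. The customer points (visits per month, average spend) are invented.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then re-average."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else c
            for cluster, c in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical customers: (visits per month, average spend)
customers = [(1, 10), (2, 12), (1, 8), (10, 80), (12, 95), (11, 90)]
centroids, clusters = kmeans(customers, centroids=[(1, 10), (10, 80)])
print(centroids)
print(clusters)
```

In practice the variables should be put on a comparable scale first, otherwise the one with the largest numbers dominates the distance.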
Support Vector Machines – a
Supervised classification routine
Neural Networks let the modelling
software take the strain
Neural nets are black boxes. They learn by refining the weights in the
hidden layers to deliver a more accurate output, but there is no
explicit model you can inspect.
Predicting Lifetimes – in customers (churn)
and in machines
The cumulative hazard function rises as the survival function
falls: H(t) = -ln S(t)
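A small numerical sketch may help relate the two curves. It uses the standard relationship H(t) = -ln S(t) and an invented cohort of subscribers. The monthly figures are hypothetical and the code is plain Python.

```python
import math

# Hypothetical cohort: customers still subscribed at the start of each month
still_active = [1000, 900, 820, 760, 715]

# Survival function: share of the original cohort still active
survival = [n / still_active[0] for n in still_active]

# Cumulative hazard: H(t) = -ln S(t), written as ln(1 / S(t))
cumulative_hazard = [math.log(1 / s) for s in survival]

# Monthly churn rate (discrete hazard): share of active customers lost that month
monthly_hazard = [
    (still_active[t] - still_active[t + 1]) / still_active[t]
    for t in range(len(still_active) - 1)
]

for month, (s, h) in enumerate(zip(survival, cumulative_hazard)):
    print(month, round(s, 3), round(h, 3))
```

The same arithmetic applies to component failure: replace subscribers with machines still running.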
Acquisition and churn require
a model of customer lifecycle
Regression for Prediction
Straightforward but time-consuming
Who is going to buy what under which
circumstances and how do we track it?
Churn Prediction with
Regression
Spotting individuals who will cancel a
subscription, based on their behaviour. It’s a
binary classification task
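For a binary outcome such as churn, the usual form of regression is logistic regression. The slides do not say which variant is used, so that choice is an assumption here. The sketch below fits one by simple gradient descent in plain Python. The behaviour features (logins and complaints), the data, and the function names are all invented for illustration. A real project would more likely use a library such as scikit-learn.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, learning_rate=0.1, epochs=2000):
    """Fit weights (bias first) by batch gradient descent on the log-loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    n = len(rows)
    for _ in range(epochs):
        gradient = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            features = [1.0] + list(x)  # leading 1.0 is the bias term
            error = sigmoid(sum(w * f for w, f in zip(weights, features))) - y
            for j, f in enumerate(features):
                gradient[j] += error * f / n
        weights = [w - learning_rate * g for w, g in zip(weights, gradient)]
    return weights

def churn_probability(weights, x):
    features = [1.0] + list(x)
    return sigmoid(sum(w * f for w, f in zip(weights, features)))

# Hypothetical behaviour: (logins last month / 10, support complaints)
customers = [(0.2, 3), (0.5, 2), (0.3, 4), (2.5, 0), (3.0, 1), (2.0, 0)]
churned = [1, 1, 1, 0, 0, 0]

weights = fit_logistic(customers, churned)
at_risk = churn_probability(weights, (0.1, 3))
loyal = churn_probability(weights, (2.8, 0))
print(weights, at_risk, loyal)
```

The output is a probability per customer, so the business still has to choose the threshold above which someone is treated as likely to churn.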
An example of using
regression to predict churn
Comparison of Approaches
Software Solutions – data mining assumes
too much data to fit on one computer
How targeting models are built and deployed
Precision and Recall
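As a reminder of the definitions: precision is the share of customers flagged by the model who really did churn, and recall is the share of real churners that the model managed to flag. A minimal sketch in plain Python, with invented labels (1 = churned, 0 = stayed):

```python
def precision_recall(actual, predicted):
    """Precision and recall for binary labels where 1 is the positive class."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical outcomes and model predictions for ten customers
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]

precision, recall = precision_recall(actual, predicted)
print(precision, recall)
```

Here the model flags three customers, two of them correctly, and misses two of the four real churners.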
Data Measures
Machine Learning Libraries
There are many libraries available if you wish to write machine learning code. Amongst the
most popular general purpose libraries are:
Weka is a Java library developed at the University of Waikato in New Zealand. It has a GUI,
which is very useful, but can also be called from within Java code. These characteristics
make it an ideal starting point for machine learning.
The R programming language is very popular for machine learning. There is no single library,
but rather hundreds of packages provided for free in a decentralised manner such that anyone
can use them.
scikit-learn is a Python library for machine learning. It is a popular choice in industry,
and hides a lot of the mathematics.
TensorFlow (advanced) is an open source library initially developed by the Google Brain team,
which can be used from a variety of programming languages including Python, C++, Java,
Haskell and Go. It is used for general machine learning, and deep neural networks in particular.
PyTorch (advanced) is a high performance Python library, optimised to take advantage of
advances in the use of GPU processing.
Choosing your tool
http://scikit-learn.org/stable/tutorial/machine_learning_map/index.html
What Leventhal says about testing
Building Customer Segmentation
Relevant / Identifiable / Viable /
Distinctive / Complete / Exhaustive
What level of complexity can your staff manage
◦ Are the segments meaningful AND doable – how
often does the model need updating
Assess business needs and available data.
Define criteria
1. Demographic / psychographic /
behavioural
What do they need vs what do they
Classification is
about Segmentation
Analysis by Segmentation
1) Understand GDPR
2) Data mining is separate from modelling – try and keep
it as simple as possible
3) Don’t lose sight of the principles of statistics
4) The most valuable data is from your own customers
5) Modelled data is not the same as actual or raw data
6) You have to manage a combination of what data you
have and what techniques you can use
7) Try and use predictive techniques to manage stock
8) Robust is better than sophisticated
9) Try and simplify and make usable the social graph
10) Testing is important but is about deciding where to
bet your fiver
11) Get on with it – look for quick wins and follow the
money
Marketing with Smart Machines
Alexander Borek and Joerg Reinold
The Marketing Manager’s day – quite
soon.
Algorithms and Data
Real time bidding
Automated multivariate site testing
Monitoring of most promising sales leads.
Personalised offers like in Amazon
Threads personal shopping adviser
Every company is a media company
Autonomous logistics
Pay as you go vs ownership
Kit as a service – Rolls Royce aero engines
Fully automated supply chains
COPYRIGHT DR ALAN RAE 2018
Recognition of individuals by face and voice across channels and partners
Business content will be authored by machines
Our challenge is to keep up with all this.
It’s cloud and API driven and probably on Amazon Web Services
Machines deliver actions based on statistical probability, so
statistics will be a core competency for marketers