How to choose what data analysis to do?

Data-Driven Decision Making, Week 3: Data Analysis, Data Software

Data Analysis: Process

Data preparation

Exploratory Data Analysis

Plotting

Distributions

Correlations

Advanced analytics

Descriptive modeling

Categorization

Predictive Modeling

Forecasting

Recommendation Systems

Business Question/Need

Business Decision

Data collection

Never actually this linear

Any step may be your stopping point

Simple analyses are great!

Will jump in and out of it

Data Preparation

Exploratory Data Analysis

Plotting

Distributions

Correlations

Continuous data, calculated for every year, 1928-2017

Mean return, 12%
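
The distribution and correlation steps of exploratory analysis can be sketched in a few lines. This is a minimal illustration, not the analysis behind the slide: the two return series are hypothetical values made up for the example (they are not the 1928-2017 data), and `pearson_correlation` is a helper name introduced here.

```python
import math
import statistics

def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = statistics.fmean(xs)
    mean_y = statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical illustrative values, NOT the 1928-2017 return data from the slide.
stock_returns = [0.10, -0.05, 0.20, 0.15, -0.10, 0.25]
bond_returns = [0.03, 0.06, 0.01, 0.02, 0.08, 0.00]

# Distribution summary for one continuous variable.
summary = {
    "mean": statistics.fmean(stock_returns),
    "median": statistics.median(stock_returns),
    "stdev": statistics.stdev(stock_returns),
}

# Relationship between two continuous variables.
corr = pearson_correlation(stock_returns, bond_returns)

print(summary)
print(f"correlation: {corr:.2f}")
```

A summary like this is often a reasonable stopping point: if the mean and spread already answer the business question, no further modeling is needed.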

Learning Data Methods

For each method you learn consider:

What sorts of data would I apply this method to?

Types of data

E.g. seasonal and frequency analyses would be used for time series data

Data from particular industries or specialties

E.g. Markov chain modeling for financial growth modeling

E.g. Logistics and Supply chain modeling

What do I have to look out for with this analysis?

Who do I know who’s an expert in this?

Data analysis

Machine Learning Artificial Intelligence

Regression and Statistical Modeling

What do you know about regression and statistical modeling?

What is Machine Learning? Artificial Intelligence?

“Formally”

Artificial Intelligence (AI)

Coined in 1956 by John McCarthy

“machines that can perform tasks that are characteristic of human intelligence” – McCarthy

General AI – AI that has all aspects of human intelligence

Narrow AI – AI that expresses some facet(s) of human intelligence, e.g. facial recognition

Machine learning (ML)

Coined in 1959 by Arthur Samuel

“the ability to learn without being explicitly programmed” – Samuel

Saves the billions of lines of code you’d need to procedurally create AI

“Machine learning is simply a way of achieving AI” – McClelland

Narrow AI

McClelland, Calum. (Dec 4, 2017) The Difference Between Artificial Intelligence, Machine Learning, and Deep Learning. Medium.com. Retrieved from https://medium.com/iotforall/the-difference-between-artificial-intelligence-machine-learning-and-deep-learning-3aa67bff5991

What is Machine Learning? Artificial Intelligence?

In usual conversation

Artificial Intelligence = Machine learning

With a computer scientist

Depends on the era they came up in; generally more likely to say ML, unless talking to potential funders or writing think pieces / TED Talks

Press/public thinker

More likely to use AI

Start ups

More likely to use AI

From Rob Tibshirani

http://statweb.stanford.edu/~tibs/stat315a/

What is Machine Learning?

Statistics with different terminology?

What is Machine Learning?

Supervised vs. Unsupervised learning

Supervised learning – you know the labels for the data

Regression

Classification

Unsupervised learning – you don’t know the labels

(probability) density estimation

clustering

https://commons.wikimedia.org/wiki/File:Cluster-2.svg
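
The difference between the two settings can be sketched on the same numbers. In the supervised half the labels are given and a simple nearest-centroid classifier is fit to them; in the unsupervised half the labels are withheld and a tiny one-dimensional k-means has to find the groups on its own. The data, the labels, and all function names are hypothetical and chosen only for illustration; nearest-centroid and k-means are just one possible choice of method for each setting.

```python
def fit_centroids(values, labels):
    """Supervised: use the known labels to compute one mean per class."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    return {label: sum(vs) / len(vs) for label, vs in groups.items()}

def classify(value, centroids):
    """Assign a new value to the class whose mean is closest."""
    return min(centroids, key=lambda label: abs(value - centroids[label]))

def kmeans_1d(values, k=2, iterations=20):
    """Unsupervised: find k cluster centers without using any labels."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centers across the sorted data.
    centers = [ordered[(i * (len(ordered) - 1)) // max(k - 1, 1)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return sorted(centers)

# Hypothetical order sizes with hand-assigned labels.
values = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]
labels = ["small", "small", "small", "large", "large", "large"]

centroids = fit_centroids(values, labels)   # supervised: labels are used
predicted = classify(4.0, centroids)

centers = kmeans_1d(values, k=2)            # unsupervised: labels are ignored

print(centroids, predicted)
print(centers)
```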

In both cases, you’re fitting the weights/parameters of a model

You may be familiar with

Using Solver in Excel

Fitting a regression line to data

Forecasting
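
Fitting a regression line is the simplest case of fitting parameters: the model has two weights, a slope and an intercept, and ordinary least squares picks the pair that minimizes the squared error. The sketch below uses the closed-form solution; the data points and the name `fit_line` are hypothetical and only for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical observations that lie roughly on y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")

# The fitted parameters can then be used to predict an unseen x.
prediction = slope * 6 + intercept
```

Solver in Excel does the same job numerically: it searches for the parameter values that minimize an error cell instead of using a formula.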

What is Machine Learning?

Once you’ve fit your model, you need to see how it performs on new data.

Is it overfit?

Set aside a subset of your data as a “test set” to see how the model performs on data that you didn’t fit on
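
A holdout check along these lines might look like the following sketch. It shuffles the data, sets aside a quarter as the test set, fits a line on the rest, and compares the error on the two parts; a test error far above the training error is a common sign of overfitting. The data are synthetic, and the function names, the 25% split, and the fixed seed are assumptions made for the example.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and hold out a fraction as the test set."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def mean_squared_error(points, slope, intercept):
    """Average squared gap between the line's prediction and the observed y."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in points) / len(points)

# Synthetic data: y = 2x + 1 plus a small alternating disturbance.
data = [(x, 2 * x + 1 + ((-1) ** x) * 0.5) for x in range(20)]

train, test = train_test_split(data)
slope, intercept = fit_line(train)          # fit on the training rows only

train_mse = mean_squared_error(train, slope, intercept)
test_mse = mean_squared_error(test, slope, intercept)
print(f"train MSE={train_mse:.3f}, test MSE={test_mse:.3f}")
```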

What is Deep Learning?

Deep learning is a method of Machine Learning (that is hot right now)

Deep learning is a type of (Artificial) Neural Network

Primarily used for supervised learning

Aka Regression with A LOT OF WEIGHTS

What is Deep Learning?

(Artificial) Neural Network

Lots of parameters to fit

Deep learning

ANN with many layers

More difficult to train (including the time it takes)

Computer scientists figured out how to train them effectively around 2006

Advanced the field in many difficult ML tasks, e.g. object recognition

https://commons.wikimedia.org/wiki/File:Colored_neural_network.svg

https://www.dtreg.com/solution/view/21
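
To make "lots of parameters" and "many layers" concrete, the sketch below counts the weights and biases of a fully connected network and runs one forward pass. It is an untrained toy with random weights: the layer sizes, the tanh activation (applied to every layer, including the output), and all function names are assumptions made for the example, and no training step is shown.

```python
import math
import random

def count_parameters(layer_sizes):
    """Weights plus biases in a fully connected network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

def init_network(layer_sizes, seed=0):
    """Random weights and zero biases for each layer (untrained)."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers

def forward(layers, inputs):
    """Pass the inputs through every layer: weighted sum, then tanh."""
    activations = inputs
    for weights, biases in layers:
        activations = [
            math.tanh(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations

shallow = [3, 4, 2]          # one hidden layer
deep = [3, 4, 4, 4, 2]       # three hidden layers

shallow_params = count_parameters(shallow)
deep_params = count_parameters(deep)
outputs = forward(init_network(deep), [0.5, -0.2, 0.1])

print(shallow_params, deep_params)
print(outputs)
```

Even at this toy scale, adding two hidden layers more than doubles the number of parameters to fit, which is part of why deeper networks take longer to train.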
