# Data Science ProDegree

### Introduction to Data Science

What is Data Science?

Analytics Landscape

Life Cycle of a Data Science Project

Data Science Tools & Technologies

### R for Data Science

Intro to R Programming

R Base Software

Understanding CRAN

RStudio: The IDE

Basic Building Blocks in R

Understanding Vectors in R

Basic Operations, Operators and Types

Handling Missing Values in R

Subsetting Vectors in R

Matrices and Data Frames in R

lapply, sapply, vapply and tapply Functions

### Data Visualization using R

Grammar of Graphics

Bar Charts

Histograms

Pie Charts

Scatter Plots

Line Plots and Regression

Word Clouds

Box Plots

ggplot2

### Statistical Learning - 1 (Including ANOVA)

Measures of Central Tendency in Data

Measures of Dispersion

Understanding Skewness in Data

Probability Theory

Bayes Theorem

Probability Distributions

Hypothesis Testing

### Statistical Learning - 2 (Including ANOVA)

Analysis of Variance and Covariance

One-Way Analysis of Variance

Assumptions of ANOVA

Statistics Associated with One-Way Analysis of Variance

Interpreting the ANOVA Results

Two-Way Analysis of Variance

Interpreting the ANOVA Results

Analysis of Covariance

### Exploratory Data Analysis with R

Merge, Rollup, Transpose and Append

Missing Analysis and Treatment

Outlier Analysis and Treatment

Summarizing and Visualizing the Important Characteristics of Data

Univariate, Bivariate Analysis

Crosstabs, Correlation

### Linear Regression

What is Regression Analysis?

Limitations of Regression

Covariance and Correlation

Multivariate Analysis

Assumptions of Linearity, Hypothesis Testing

Implementing Simple & Multiple Linear Regression

Making sense of result parameters

Model validation

Handling other issues/assumptions in Linear Regression

Handling outliers, categorical variables, autocorrelation, multicollinearity, heteroskedasticity

Prediction and Confidence Intervals

### Project 1

Property Price Prediction using Linear Regression in R

### Logistic Regression

Implementing Logistic Regression

Making sense of result parameters: Wald Test, Likelihood Ratio Test Statistic, Chi-square Test

Goodness of fit measures

Model validation: Cross Validation, ROC Curve, Confusion Matrix

### Project 2

Bank Credit Card Default Prediction using Logistic Regression in R

### Decision Trees

Introduction to Predictive Modeling with Decision Trees

Entropy & Information Gain

Standard Deviation Reduction (SDR)

Overfitting Problem

Cross Validation for Overfitting Problem

Pruning as a Solution for Overfitting

### Project 3

Churn Analysis in Telecom Industry (Regression Trees)

### Random Forest

Random Forest

Project 4 – Churn Analysis in Telecom Industry (Regression Trees & Classification Trees)

### Linear Discriminant Analysis

LDA Objective

Why Discriminant Analysis?

Discriminant Function

Assumptions of LDA

Advantages & Disadvantages of LDA

Applications of LDA

### Project : Wine classification with Linear Discriminant Analysis

### Data Science with Python

### Basics of Python for Data Science

Python Basics

Data Structures in Python

Control & Loop Statements in Python

Functions & Classes in Python

Working with Data

### Data Frame Manipulation with Pandas

Data Acquisition (Import & Export)

Indexing

Selection, Filtering and Sorting

Descriptive Statistics

Combining and Merging Data Frames

Removing Duplicates

Discretization and Binning

String Manipulation

### Exploratory Data Analysis with Python

What is EDA?

Processes in EDA

Handling Data Types

Univariate and Bivariate Analysis

Hypothesis Testing

### Time Series Forecasting

Understanding Time Series Data

Visualizing Time Series Components

Exponential Smoothing

Holt’s Model

Holt-Winters Model

ARIMA

### Project : Forecasting and Predicting Furniture Sales using ARIMA

### Clustering

What is Clustering?

K-means Algorithm

Types of Clustering

Evaluating K-means Clusters

### Project : Grouping teen students for targeted marketing campaigns

### Dimensionality Reduction

Principal Component Analysis (PCA)

Scatter plot

Eigenvalue-One Criterion

Factor Analysis

Project : Reduce Data Dimensionality for a House Attribute Dataset using PCA

### Machine Learning & Linear Regression

Machine Learning Modelling Flow

How to treat Data in ML

Parametric & Non-parametric ML

Types of Machine Learning

Introduction to Linear Regression

Linear Regression using Gradient Descent

Linear Regression using OLS

Linear Regression using Stochastic Gradient Descent

Project : Real Estate Price Prediction using Linear Regression

### Logistic Regression

Introduction to Logistic Regression

Logistic Regression using Stochastic Gradient Descent

Project 8: Real Estate Price Prediction using Linear Regression

### Model Tuning

Performance Measures

Bias-Variance Trade-Off

Overfitting & Underfitting

Optimization Techniques

Project : Identifying good and bad customers for granting credit

### K Nearest Neighbor

K Nearest Neighbor

Understanding KNN

Voronoi Tessellation

Choosing K

Distance Metrics – Euclidean, Manhattan, Chebyshev

Project : Case Study: Breast Cancer

### Decision Tree & Random Forest

Decision Tree & Random Forest

Fundamental concepts of Ensemble

Hyper-Parameters

Project 11: Case Study : Predicting bank term deposit subscription based on marketing data

### Support Vector Machine

Support Vector Machines

What is SVM?

When to use SVM?

What is Support Vector?

Understanding Hyperplane

Project 12: Predicting credibility of the credit card customers

### SQL

Basic SQL

Introduction to SQL

DDL Statements

DML Statements

DQL Statements

Aggregate Functions

### Advanced SQL

Date functions

Union, Union All & Intersect Operators

Joins

Views & Indexes

Sub-Queries

Project 13: SQL Practice Exercises: Creating a Database Schema and Table Relationships for a Logistics Company’s Data

### Tableau

Introduction to Visualization

Working with Tableau

Visualization in Depth

Data Organisation

Advanced Visualization

Mapping

Enterprise Dashboards, Data Presentation

Project 14: Best Practices for Dashboarding and Reporting, and Case Study

Have a Methodology

Know Your Audience

Define Resulting Actions

Classify Your Dashboard

Profile Your Data

Use Visual Features Properly

Design Iteratively

Project : Building Tableau Dashboard

### Resume Building and Interview Prep

1:1 Mock Interviews with Industry Veterans to Help You Clear the Technical Round of Interviews and Give You Confidence to Face Real-World Scenarios

### 1:1 Mock Interviews
