# Deep Dive Weeks 2 & 3

The past two weeks were all about supervised learning. After discovering NumPy and doing some feature engineering in the homework, we proceeded to supervised learning, where a target variable y is mapped from the inputs X.

## Week 2: Intro to Supervised Learning

An excerpt from our class IPython notebooks:

More formally, this is expressed as

$$y = f(X)$$

where f is the unknown function that maps X to y. Our task as data scientists is to come up with g that captures f closely. Formally,

$$g \approx f$$

Depending on the task, g may need to output continuous or discrete values. These two tasks are called regression and classification, respectively. Formally, we'll describe both tasks as creating models that capture the underlying function f. In the literature, you'll often see models called hypotheses as well.
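
To make "g captures f closely" a bit more concrete, here is a minimal sketch that is not from the class notebooks. It assumes a made-up f (in practice f is unknown and we only see samples of it), two hypothetical candidate models `g_good` and `g_poor`, and mean squared error as one possible way to measure closeness. It also shows the same model producing a continuous output (regression) and, after thresholding, a discrete one (classification).

```python
# Hypothetical "true" function f. In a real problem f is unknown
# and we only observe (x, y) samples generated by it.
def f(x):
    return 2.0 * x + 1.0

# Two hypothetical candidate models (hypotheses) g.
def g_good(x):
    return 1.9 * x + 1.2

def g_poor(x):
    return 0.5 * x

def mean_squared_error(g, xs, ys):
    """Average squared gap between g's predictions and the observed targets."""
    return sum((g(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Observed samples of f (assumed noise-free for simplicity).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [f(x) for x in xs]

mse_good = mean_squared_error(g_good, xs, ys)
mse_poor = mean_squared_error(g_poor, xs, ys)

# Regression: g outputs a continuous value.
regression_output = g_good(2.5)

# Classification: g outputs a discrete label, here by thresholding a score.
# The threshold of 5.0 is an arbitrary choice for illustration.
def g_classifier(x, threshold=5.0):
    return 1 if g_good(x) >= threshold else 0

classification_output = g_classifier(2.5)
```

Here `mse_good` comes out much smaller than `mse_poor`, which is one way of saying that `g_good` captures f more closely on these samples.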

For studying regression, we continued to use our board game dataset and modeled linear relationships between the features. Since this is a machine learning class, I didn't expound on the analytical solution. Instead, we used gradient descent as our optimization technique and visualized the resulting lines. We also computed R², the coefficient of determination, to measure how well each line fit, just to look at the problem from the perspective of statistics.

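
The gradient descent approach can be sketched roughly as follows. This is an illustration under assumptions, not the code from class: it uses plain Python lists instead of NumPy to stay dependency-free, a tiny made-up dataset in place of the board game data, mean squared error as the cost, and a learning rate and step count picked by hand.

```python
def fit_line_gradient_descent(xs, ys, learning_rate=0.01, n_steps=5000):
    """Fit y = w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(n_steps):
        # Residuals of the current line on every data point.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of MSE = (1/n) * sum(errors^2) with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

def r_squared(xs, ys, w, b):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (w * x + b)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Made-up toy data (not the board game dataset), roughly y = 3x + 2.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 4.9, 8.2, 10.8, 14.1, 17.0]

w, b = fit_line_gradient_descent(xs, ys)
r2 = r_squared(xs, ys, w, b)
```

On this toy data the fitted slope and intercept land near 3 and 2, and R² is close to 1 because the points are nearly collinear. A learning rate that is too large for the scale of the features would make the updates diverge, so it usually needs some tuning.
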
After regression, we applied logistic regression to the Iris dataset, the “hello world” problem for novice data scientists. We dissected the logistic regression algorithm, cost function and all.

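
For readers who want to see the moving parts, here is a minimal sketch of binary logistic regression trained with gradient descent on the cross-entropy cost. It rests on several assumptions: the data are a handful of made-up two-feature points standing in for two flower classes (not actual Iris measurements), the problem is simplified to two classes, and the learning rate and step count are hand-picked. It is not the notebook code from class.

```python
import math

def sigmoid(z):
    """Squash a real-valued score into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, features):
    """Probability that the example belongs to class 1."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)

def cross_entropy_cost(weights, bias, X, y):
    """Average negative log-likelihood of the labels under the model."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for features, label in zip(X, y):
        p = predict_proba(weights, bias, features)
        total += -(label * math.log(p + eps)
                   + (1 - label) * math.log(1 - p + eps))
    return total / len(X)

def fit_logistic_regression(X, y, learning_rate=0.1, n_steps=5000):
    """Batch gradient descent on the cross-entropy cost."""
    n_samples, n_features = len(X), len(X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(n_steps):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in zip(X, y):
            # The gradient of the cost reduces to (prediction - label) * input.
            error = predict_proba(weights, bias, features) - label
            for j in range(n_features):
                grad_w[j] += error * features[j]
            grad_b += error
        weights = [w - learning_rate * g / n_samples
                   for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n_samples
    return weights, bias

# Made-up, linearly separable toy data: two features, two classes.
X = [[1.4, 0.2], [1.3, 0.2], [1.5, 0.3], [1.6, 0.2],
     [4.5, 1.5], [4.7, 1.4], [4.0, 1.3], [4.9, 1.5]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

initial_cost = cross_entropy_cost([0.0, 0.0], 0.0, X, y)
weights, bias = fit_logistic_regression(X, y)
final_cost = cross_entropy_cost(weights, bias, X, y)
predictions = [1 if predict_proba(weights, bias, row) >= 0.5 else 0 for row in X]
```

With all weights at zero, every prediction is 0.5, so the starting cost is ln 2. Training pushes the cost down from there, and on this easy toy data the thresholded predictions match the labels.
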
Lastly, I assigned homework on Kaggle: the Titanic dataset. Kaggle is a site for data science competitions, as well as a good resource for novices. The Titanic dataset is a rather somber record of who survived the fateful day the Titanic sank, and the task is to predict who survived. Everyone submitted their solutions to Kaggle, and some even got hooked on the whole gamification concept.