Introduction to Deep Learning in Python
Run the hidden code cell below to import the data used in this course.
# Import pandas
import pandas as pd
# Import the course datasets
wages = pd.read_csv('datasets/hourly_wages.csv')
mnist = pd.read_csv('datasets/mnist.csv')
titanic = pd.read_csv('datasets/titanic_all_numeric.csv')
Take Notes
Add notes about the concepts you've learned and code cells with code you want to keep.
Deep Learning
- Deep learning uses neural networks
- They often work better than simple classification and regression algorithms because they capture interactions among multiple features well
- Deep learning models can find useful features and the interactions between them without explicit feature engineering
- The modeler doesn't need to specify the interactions
- When we train the model, the neural network learns weights that pick up the relevant patterns to make better predictions
Below is an example of a deep learning model built using Keras:
import numpy as np
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
predictors = np.loadtxt('predictors_data.csv', delimiter=',')
n_cols = predictors.shape[1]
model = Sequential()
model.add(Dense(100, activation='relu', input_shape=(n_cols,)))
model.add(Dense(100, activation='relu'))
model.add(Dense(1))
Forward Propagation Algorithm
- Multiply-add process (or dot product)
- Forward propagation for one data point at a time
- Output is the prediction for that data point
Example of forward propagation algorithm in a neural network:
Example of forward propagation algorithm as code:
import numpy as np
input_data = np.array([2, 3])
weights = {'node_0': np.array([1, 1]),
           'node_1': np.array([-1, 1]),
           'output': np.array([2, -1])}
node_0_value = (input_data * weights['node_0']).sum()
node_1_value = (input_data * weights['node_1']).sum()
hidden_layer_values = np.array([node_0_value, node_1_value])
output = (hidden_layer_values * weights['output']).sum()
Activation Functions
- Applied to a node's input (the weighted sum of its incoming values) to produce the node's output
- tanh was a common choice earlier
- Now ReLU (Rectified Linear Activation) is more common
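As a minimal sketch of where an activation function sits in forward propagation, the snippet below reuses the weights from the forward propagation example and applies either ReLU or tanh to the hidden node values. The `relu` and `forward` helpers and the second input `[3, 2]` are assumptions made for illustration; they are not part of the course code.

```python
import numpy as np

def relu(x):
    """Rectified linear activation: negative inputs become 0."""
    return np.maximum(0, x)

def forward(input_data, weights, activation):
    """Forward propagation through one hidden layer of two nodes (hypothetical helper)."""
    node_0 = activation(np.dot(input_data, weights['node_0']))
    node_1 = activation(np.dot(input_data, weights['node_1']))
    hidden = np.array([node_0, node_1])
    # The output node is left without an activation
    return np.dot(hidden, weights['output'])

weights = {'node_0': np.array([1, 1]),
           'node_1': np.array([-1, 1]),
           'output': np.array([2, -1])}

# Same input as the earlier example: both hidden values are positive,
# so ReLU leaves them unchanged
out_relu = forward(np.array([2, 3]), weights, relu)

# tanh squashes each hidden value into the range (-1, 1)
out_tanh = forward(np.array([2, 3]), weights, np.tanh)

# Hypothetical second input: node_1 receives -1, which ReLU clips to 0
out_relu_clipped = forward(np.array([3, 2]), weights, relu)
```

With the input `[3, 2]`, the second hidden node would contribute a negative value without an activation; ReLU sets it to 0, so the prediction changes.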
Why deep learning is also called representation learning
- Deep networks internally build representations of patterns in the data
- Partially replace the need for feature engineering
- Subsequent layers build increasingly sophisticated representations of raw data
Creating a Keras model
Model building steps
- Specify architecture
- Compile
- Fit
- Predict
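The Fit step is where the weights are learned. As a rough illustration of what happens there, the sketch below runs plain gradient descent on a single linear node with a mean squared error loss. The toy data, learning rate, and number of iterations are made-up assumptions, and Keras optimizers such as 'adam' are more elaborate than this loop.

```python
import numpy as np

# Toy data (hypothetical): the target is exactly 2*x0 + 3*x1
predictors = np.array([[1.0, 2.0],
                       [2.0, 1.0],
                       [3.0, 3.0],
                       [0.0, 1.0]])
target = predictors @ np.array([2.0, 3.0])

weights = np.zeros(2)      # start from all-zero weights
learning_rate = 0.05       # assumed step size

def mse(w):
    """Mean squared error of the predictions made with weights w."""
    errors = predictors @ w - target
    return (errors ** 2).mean()

initial_loss = mse(weights)

for _ in range(2000):
    errors = predictors @ weights - target
    # Gradient of the mean squared error with respect to the weights
    gradient = 2 * predictors.T @ errors / len(target)
    # Step against the gradient to reduce the loss
    weights = weights - learning_rate * gradient

final_loss = mse(weights)
```

After the loop, `weights` should be close to the values used to generate the toy target, and `final_loss` should be far below `initial_loss`.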
A regression model
import numpy as np
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
predictors = np.loadtxt('predictors_data.csv', delimiter=',')
n_cols = predictors.shape[1]
model = Sequential()
model.add(Dense(100, activation='relu', input_shape=(n_cols,)))
model.add(Dense(100, activation='relu'))
model.add(Dense(1))
# 'adam' is usually a good optimizer choice
# 'mean_squared_error' is a common loss function for regression
model.compile(optimizer='adam', loss='mean_squared_error')
# fitting a model means applying backpropagation and gradient descent with our data to update the weights
# scaling the data before fitting can ease optimization
# target is assumed to be an array of the values to predict, defined elsewhere
model.fit(predictors, target)
Classification models
- 'categorical_crossentropy' is a common loss function
- metrics=['accuracy'] should be included for easy-to-understand diagnostics
- output layer has separate node for each possible outcome, and uses 'softmax' activation function
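To make the three points above concrete, here is a small NumPy sketch of one-hot targets, the softmax activation, and the categorical cross-entropy loss. The helper functions, labels, and raw scores are hypothetical and only mirror what Keras does in spirit; this is not the library's implementation.

```python
import numpy as np

def one_hot(labels, num_classes):
    """One column per possible outcome, similar in spirit to to_categorical."""
    encoded = np.zeros((len(labels), num_classes))
    encoded[np.arange(len(labels)), labels] = 1
    return encoded

def softmax(scores):
    """Turn each row of raw scores into probabilities that sum to 1."""
    shifted = scores - scores.max(axis=1, keepdims=True)  # for numerical stability
    exp_scores = np.exp(shifted)
    return exp_scores / exp_scores.sum(axis=1, keepdims=True)

def categorical_crossentropy(y_true, y_prob):
    """Mean negative log-probability assigned to the true class."""
    return -(y_true * np.log(y_prob)).sum(axis=1).mean()

labels = np.array([1, 0, 1])              # hypothetical outcomes
raw_scores = np.array([[0.5, 2.0],        # hypothetical output-node values
                       [1.5, 0.2],
                       [0.0, 0.0]])

target = one_hot(labels, 2)
probabilities = softmax(raw_scores)
loss = categorical_crossentropy(target, probabilities)
```

Each row of `probabilities` sums to 1, and the loss shrinks as the probability assigned to the true class grows.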
import pandas as pd
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
from tensorflow.keras.utils import to_categorical
data = pd.read_csv('basketball_shot_long.csv')
# drop the shot_result column from the predictors, since it is the target
predictors = data.drop(['shot_result'], axis=1).values
n_cols = predictors.shape[1]
# to_categorical gives the target a separate column for each of the two outcomes - 0 and 1
target = to_categorical(data['shot_result'])
model = Sequential()
model.add(Dense(100, activation='relu', input_shape=(n_cols,)))
model.add(Dense(100, activation='relu'))
model.add(Dense(100, activation='relu'))
model.add(Dense(2, activation='softmax'))
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(predictors, target)
Using models
- Save
- Reload
- Make predictions
from tensorflow.keras.models import load_model
model.save('model_file.h5')
my_model = load_model('model_file.h5')
predictions = my_model.predict(data_to_predict_with)
# column 1 holds the predicted probability of the outcome being 1
probability_true = predictions[:, 1]
# check the structure of the reloaded model
my_model.summary()
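A short sketch of how the predicted probabilities might be turned into class labels. The `predictions` array here is made up to stand in for the output of `predict` on a two-class softmax model, and the 0.6 threshold is an arbitrary assumption.

```python
import numpy as np

# Hypothetical softmax output for three rows; columns are [class 0, class 1]
predictions = np.array([[0.80, 0.20],
                        [0.30, 0.70],
                        [0.45, 0.55]])

# Probability of the outcome being 1 for each row
probability_true = predictions[:, 1]

# Most likely class per row
predicted_class = predictions.argmax(axis=1)

# Alternative: require at least 0.6 probability before predicting class 1
predicted_with_threshold = (probability_true >= 0.6).astype(int)
```

Taking the argmax picks the most probable class, while a custom threshold lets you trade off false positives against false negatives.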