library(tidyverse)
library(keras)
Supervised learning: deep learning
Introduction
In this practical, we will create a feed-forward neural network as well as an optional convolutional neural network to analyze the famous MNIST dataset.
Let’s set the seed; using the same value as below should reproduce the same results.
set.seed(45)
In this section, we will develop a deep feed-forward neural network for MNIST.
Data preparation
It is usually a good idea to normalize your features to a manageable, standard range before feeding them into a neural network.
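For MNIST, the pixel values range from 0 to 255, so dividing by 255 rescales them to the interval [0, 1]. A minimal sketch (the variable names x_train, y_train, x_test, and y_test are our own choices):
mnist <- dataset_mnist()        # built-in MNIST loader from keras
x_train <- mnist$train$x / 255  # rescale pixel values from [0, 255] to [0, 1]
y_train <- mnist$train$y        # integer labels 0-9
x_test  <- mnist$test$x / 255
y_test  <- mnist$test$y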
Multi-layer perceptron: multinomial logistic regression
The simplest neural network model is a multi-layer perceptron with no hidden layers: only an input and an output layer. We can call this a multinomial logistic regression model, with 10 softmax outputs (each between 0 and 1, one per digit class) for our MNIST data. That model is shown below.
multinom <-
  # initialize a sequential model
  keras_model_sequential(input_shape = c(28, 28)) %>%
  # flatten 28*28 matrix into single vector
  layer_flatten() %>%
  # softmax outcome == probability for each of 10 outputs
  layer_dense(10, activation = "softmax")

multinom$compile(
  loss = "sparse_categorical_crossentropy", # loss function for multinomial outcome
  optimizer = "adam", # we use this optimizer because it works well
  metrics = list("accuracy") # we want to know training accuracy in the end
)
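Training then proceeds with fit(). A sketch, assuming the x_train and y_train objects from the data preparation step; the number of epochs and the validation split are our own illustrative choices:
multinom %>% fit(
  x = x_train,             # input pixels
  y = y_train,             # integer class labels
  epochs = 10,             # number of passes over the training data
  validation_split = 0.2   # hold out 20% of the training data for validation
)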
Deep feed-forward neural networks
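A deep version adds one or more hidden layers with a nonlinear activation between the flatten and output layers. The sketch below uses two hidden layers; the layer sizes and ReLU activations are our own illustrative choices, not prescribed settings:
ffn <-
  keras_model_sequential(input_shape = c(28, 28)) %>%
  layer_flatten() %>%
  # hidden layers with ReLU activation
  layer_dense(64, activation = "relu") %>%
  layer_dense(32, activation = "relu") %>%
  # softmax output over the 10 digit classes
  layer_dense(10, activation = "softmax")

ffn$compile(
  loss = "sparse_categorical_crossentropy",
  optimizer = "adam",
  metrics = list("accuracy")
)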
OPTIONAL: convolutional neural network
Convolution layers in Keras need a specific form of data input. For each example, they need a (width, height, channels) array (tensor). For a colour image with 28*28 dimensions, that shape is usually (28, 28, 3), where the channels indicate red, green, and blue. MNIST has no colour info, but we still need the channel dimension to enter the data into a convolution layer, so each image gets shape (28, 28, 1). The training dataset x_train should thus have shape (60000, 28, 28, 1).
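A sketch of one way to add the channel dimension, followed by a small convolutional model; the filter count, kernel size, and pooling settings are our own illustrative choices:
# append a trailing channel dimension: (60000, 28, 28) -> (60000, 28, 28, 1)
dim(x_train) <- c(dim(x_train), 1)
dim(x_test)  <- c(dim(x_test), 1)

cnn <-
  keras_model_sequential(input_shape = c(28, 28, 1)) %>%
  # convolution: 32 filters sliding a 3*3 window over the image
  layer_conv_2d(filters = 32, kernel_size = c(3, 3), activation = "relu") %>%
  # downsample feature maps by taking the max over 2*2 blocks
  layer_max_pooling_2d(pool_size = c(2, 2)) %>%
  layer_flatten() %>%
  layer_dense(10, activation = "softmax")

cnn$compile(
  loss = "sparse_categorical_crossentropy",
  optimizer = "adam",
  metrics = list("accuracy")
)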