A Simple Convolutional Neural Network (using Keras)

Explaining the code


Import the required libraries:

  1. NumPy: a library for the Python programming language, adding support for large, multi-dimensional arrays and matrices, along with a large collection of high-level mathematical functions to operate on these arrays.

  2. Keras: a high-level neural networks API, written in Python.

    • Layers
      • Activation : applies an activation function to the output of a layer.
      • MaxPool2D : creates a pooling layer that performs the max-pooling operation.
      • Conv2D : creates a convolution layer.
      • Flatten : flattens the input (converts it to one dimension) without affecting the batch size.
      • Dense : creates a fully connected (densely connected) layer.
    • Models
      • Sequential : the Sequential model is a linear stack of layers. We can create a Sequential model either by passing a list of layer instances to the constructor or by adding layers one at a time via the .add() method.
import numpy as np
from keras.layers import Conv2D, Activation, MaxPool2D, Flatten, Dense
from keras.models import Sequential

The Convolutional Neural Network Architecture

A typical Convolutional Neural Network consists of the following parts:

1. The Hidden Layers/ Feature Extraction Part

(The Convolution and Pooling Layers)

In this part of the network a series of convolution and pooling operations take place, which detect different features of the input image.

  • In the code snippet below, we initialise a Sequential model. We don't provide the constructor with any layers; instead we add layers to the model using the .add() method, as can be seen in the following snippets of code.
  • This lets us build the model by adding one layer at a time.
# Images fed into this model are 28 x 28 pixels with 1 channel (greyscale)
img_shape = (28,28,1)
# Set up the model
model = Sequential()
  • In the lines of code above, we first define img_shape, the shape of the input images to the model. This matters because the model needs to know what input shape to expect; the first layer in a Sequential model therefore always has to receive information about the input shape.
# Add a convolutional layer with 3 filters, each 3 by 3
model.add(Conv2D(3, kernel_size=3, input_shape=img_shape))

Conv2D()

This function creates a 2D Convolution Layer.

What does a Convolution Layer do?

  • A convolution layer scans a source image with a filter to extract features which may be important for classification. (The filter is also called the convolution kernel.)
  • The kernel also contains weights, which are tuned during the training of the model to achieve the most accurate predictions.
  • Parameters :
    • The first parameter (3 in this case) defines the number of filters the layer learns; each filter produces one feature map in the output.
    • kernel_size defines the filter size: the area in square pixels the model will use to scan the image. A kernel size of 3 means the model looks at a square of 3×3 pixels at a time.
    • input_shape is the shape of the input images (here 28 × 28 pixels with a single channel).
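To make the arithmetic concrete, here is a minimal NumPy sketch (not the Keras implementation) of a single 3×3 filter sliding over a 28×28 single-channel image with no padding and stride 1 — the output shrinks to 26×26, which is exactly the spatial size the Conv2D layer above produces:

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide a kernel over the image (no padding, stride 1)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Weighted sum of the patch under the kernel
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.random.rand(28, 28)
kernel = np.random.rand(3, 3)           # the 3x3 filter weights
print(convolve2d(image, kernel).shape)  # (26, 26)
```

With 3 such filters, the layer's output is 26 × 26 × 3: one feature map per filter.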

MaxPool2D()

This function creates a 2D Pooling Layer.

What does a Pooling Layer do?

  • Its function is to progressively reduce the spatial size of the representation to reduce the amount of parameters and computation in the network, and hence to also control overfitting.
  • The Pooling Layer operates independently on every depth slice of the input and resizes it spatially, using the MAX operation.
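As a quick illustration (plain NumPy, not the Keras layer), 2×2 max pooling keeps only the largest value in each non-overlapping 2×2 block, halving each spatial dimension:

```python
import numpy as np

x = np.array([[1, 3, 2, 4],
              [5, 6, 1, 2],
              [7, 2, 9, 1],
              [3, 4, 6, 8]])

# Split the 4x4 array into 2x2 blocks and take the max of each block
pooled = x.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)
# [[6 4]
#  [7 9]]
```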

Activation()

The Activation() method applies a given activation function to the output of the previous layer. Here a relu activation is applied to the output of the convolution layer, and a pooling layer is added after it.

# Add relu activation to the layer 
model.add(Activation('relu'))
#Pooling
model.add(MaxPool2D(2))

2. Classification Layer

After the convolution layers there is a classification part consisting of fully connected layers, where every neuron of one layer is connected to every activation from the previous layer. The thing about fully connected layers is that they can only be fed 1-dimensional data. Since the output of the previous layer is 3-dimensional, it needs to be flattened out (converted into one dimension) before it can be fed into the classification layers.

For this a Flatten layer is added, which takes the output of the convolution layer and turns it into a format that can be used by the densely connected neural layer.

#Fully connected layers
# Use Flatten to convert 3D data to 1D
model.add(Flatten())
# Add dense layer with 10 neurons
model.add(Dense(10))
# we use the softmax activation function for our last layer
model.add(Activation('softmax'))

Dense()

The final layer of the model is of type 'Dense', a densely-connected neural layer which will give the final classification/prediction.

  • The parameter (10 in this case) refers to the number of output nodes, one per digit class.
  • After that, a softmax activation function is added to the output layer. It takes the output of the Dense layer and converts it into meaningful probabilities that ultimately help in making the final prediction.
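A minimal NumPy sketch of what softmax does to the 10 raw outputs (the actual computation happens inside Keras; the logits below are made up for illustration):

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability, then normalise
    e = np.exp(z - np.max(z))
    return e / e.sum()

logits = np.array([1.2, 0.3, 4.0, -1.0, 0.0, 2.1, 0.5, 0.9, -0.3, 1.1])
probs = softmax(logits)
print(round(probs.sum(), 6))    # 1.0 -- valid probabilities
print(int(np.argmax(probs)))    # 2  -- the predicted digit
```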

summary()

Now, to get an overview of the model we have created, we will use the summary() function. It lists, for each layer, its output shape and its number of parameters.

# give an overview of our model
model.summary()

Compilation:

Before the training process, we have to configure the learning process. This is done via compile(), which takes 3 elements:

  • An optimizer : the string identifier of an existing optimizer, or an optimizer instance
  • A loss function : the string identifier of an existing loss function, or a loss function object
  • A list of metrics : string identifiers of existing metrics, or metric functions
    • The metrics define how the success of the model is evaluated.
    • We will use the 'accuracy' metric to calculate an accuracy score on the testing/validation set of images.
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
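Conceptually, the 'accuracy' metric is just the fraction of predictions whose most probable class matches the true label. A NumPy sketch of that idea (not the Keras internals; the labels and outputs below are invented):

```python
import numpy as np

y_true = np.array([7, 2, 1, 0])       # true digits
probs = np.eye(10)[[7, 2, 1, 5]]      # fake model outputs (one-hot rows)

# A prediction counts as correct when its argmax equals the true digit
accuracy = np.mean(np.argmax(probs, axis=1) == y_true)
print(accuracy)  # 0.75 -- 3 of 4 predictions correct
```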

Our model is complete.

The Training Phase:

Now we will train the model.

We will use the MNIST dataset, a benchmark deep-learning dataset containing 70,000 images of handwritten digits from 0-9.

#importing the datasets
from keras.datasets import mnist
#loading the datasets
(X_train, y_train), (X_test, y_test) = mnist.load_data()
import matplotlib.pyplot as plt
plt.imshow(X_train[0])
print(X_train[0].shape)

In the code snippet below, reshaping is done.

But WHY?

The two sets of images, X_train and X_test, are reshaped so that their shape matches the input shape expected by our CNN model.

X_train = X_train.reshape(60000,28,28,1)
X_test = X_test.reshape(10000,28,28,1)
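mnist.load_data() returns arrays of shape (60000, 28, 28) and (10000, 28, 28), with no channel axis; the reshape just appends the single channel dimension that the Conv2D layer expects. A standalone NumPy sketch with a small stand-in batch:

```python
import numpy as np

# Stand-in for the MNIST arrays: a small batch of greyscale 28x28 images
images = np.zeros((4, 28, 28))
print(images.shape)          # (4, 28, 28) -- no channel axis yet

# -1 lets NumPy infer the batch dimension, so the same call
# works for both the 60,000-image and 10,000-image sets
with_channel = images.reshape(-1, 28, 28, 1)
print(with_channel.shape)    # (4, 28, 28, 1) -- what the model expects
```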

One-hot encoding:

In one-hot encoding we create a column for each classification category, with each column containing binary values indicating if the current image belongs to that category or not.

from keras.utils import to_categorical
y_train = to_categorical(y_train)
y_test = to_categorical(y_test)
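For illustration, here is roughly what to_categorical does, written in plain NumPy so the sketch is self-contained (one_hot is a hypothetical helper, not a Keras function):

```python
import numpy as np

def one_hot(labels, num_classes=10):
    # One column per class; a 1 marks the true class of each row
    out = np.zeros((len(labels), num_classes))
    out[np.arange(len(labels)), labels] = 1
    return out

print(one_hot([2, 0, 9]))
```

For the label 2, every column is 0 except column 2, which is 1.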

fit()

  • This method trains the model for a fixed number of epochs.
  • An epoch is an iteration over the entire x and y data provided.
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10, verbose=2)

Now it's time to test our model!

We can test our model on specific inputs with the help of the predict() function, as done in the code snippet below.

  • Output format : the output consists of 10 probabilities per image, each representing the probability that the input is a particular digit from 0-9.
model.predict(X_test[:4])

Improving the display of the results:

We can display the predicted digit directly instead of the probabilities (i.e. just refine the output).

def return_digit(x):
    # The index of the highest probability is the predicted digit
    return int(np.argmax(x))

prediction = model.predict(X_test[:4])
print(np.apply_along_axis(return_digit, axis=1, arr=prediction))
