Short Assignment Requirements

Deep learning assignment: programming a Siamese CNN in Python 3.6, using Spyder in an Anaconda environment

Assignment Description

Last updated on Wednesday, September 27, 2017

2017 IFN680 - Assignment Two (Siamese network)

Assessment information  

        Code and report submission due on Monday 30th October,  08.30am 

        Use Blackboard to submit your work

        Group size: three people per submission. Smaller group sizes allowed (1 or 2 people).


        You will implement a deep neural network classifier to predict whether two images belong to the same class. The dataset you will use is a set of images of handwritten digits.  

        The approach you are asked to follow is quite generic and can be applied to problems where we seek to determine whether two inputs belong to the same equivalence class.

        You are provided with scaffolding code that you will need to complete with your own functions.

        You will also perform experiments and report on your results.


Despite impressive results in object classification, verification and recognition, most deep neural network based recognition systems become brittle when the viewpoint of the camera changes dramatically. Robustness to geometric transformations is highly desirable for applications like wildlife monitoring, where there is no control over the pose of the objects of interest. The images of different objects viewed from various observation points define equivalence classes where, by definition, two images are said to be equivalent if they are views of the same object.

These equivalence classes can be learned via embeddings that map the input images to vectors of real numbers. During training, equivalent images are mapped to vectors that get pulled closer together, whereas if the images are not equivalent their associated vectors get pulled apart.


Common machine learning tasks like classification and recognition involve learning an appearance model. These tasks can be interpreted as, and even reduced to, the problem of learning manifolds from a training set. Useful appearance models create an invariant representation of the objects of interest under a range of conditions. A good representation should combine invariance and discriminability. For example, in facial recognition, where the task is to compare two images and determine whether they show the same person, the output of the system should be invariant to the pose of the heads. More generally, the category of an object contained in an image should be invariant to viewpoint changes.

This assignment borrows ideas we developed for a manta ray recognition system. The motivation for our research work is the lack of fully automated identification systems for manta rays. The techniques developed for such systems can also potentially be applied to other marine species that bear a unique pattern on their surface. The task of recognizing manta rays is challenging because of the heterogeneity of photographic conditions and equipment used in acquiring manta ray photo ID images like those in the figures below.

Two images of the same Manta ray

Many of those pictures are submitted by recreational divers. For those pictures, the camera parameters are generally not known. The state of the art for manta ray recognition is a system that requires user input to manually align and normalize the 2D orientation of the ray within the image. Moreover, the user has to select a rectangular region of interest containing the spot pattern. The images must also be of high quality. In practice, marine biologists still prefer to use their own decision tree that they run manually.

In order to develop robust algorithms for recognizing manta spot patterns, a research student and I have considered the problem of recognizing artificially generated patterns subjected to projective transformations that simulate changes in the camera viewpoint.

Artificial data allowed us to experiment with a large number of patterns and compare different network architectures to select the most suitable for learning geometric equivalence. Our experiments have demonstrated that Siamese[1] convolutional neural networks are able to discriminate between patterns subjected to large homographic transformations.

In this assignment, you will work with a simpler dataset, namely the MNIST dataset. You will build a classifier to predict whether two images are warped views of digits of the same class or not.

Learning equivalence classes

A Siamese network consists of two identical subnetworks that share the same weights, followed by a distance calculation layer. The input of a Siamese network is a pair of images Pi and Pj. If the two images belong to the same equivalence class, the pair is called a positive pair, whereas a pair of images from different equivalence classes is called a negative pair.

The input images Pi and Pj are fed to the twin subnetworks to produce two vector representations f(Pi) and f(Pj) that are used to calculate a proxy distance. The training of a Siamese network is done on a collection of positive and negative pairs. Learning is performed by optimizing a contrastive loss function (see code documentation). The aim is to minimize the distance between a pair of images from the same equivalence class while maximizing the distance between a pair of images from different equivalence classes.
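
As a rough illustration of the objective described above, the following NumPy sketch evaluates a contrastive loss on a few hand-picked distances. It follows the same formula as the scaffolding function contrastive_loss (label 1 for a positive pair, margin of 1). The function name contrastive_loss_np and the sample distances are assumptions made for this illustration only.

```python
import numpy as np

def contrastive_loss_np(y_true, dist, margin=1.0):
    """Mean contrastive loss for labels y_true (1 = positive pair) and distances dist."""
    y_true = np.asarray(y_true, dtype=float)
    dist = np.asarray(dist, dtype=float)
    # Positive pairs are penalized for being far apart.
    positive_term = y_true * dist ** 2
    # Negative pairs are penalized only when closer than the margin.
    negative_term = (1.0 - y_true) * np.maximum(margin - dist, 0.0) ** 2
    return float(np.mean(positive_term + negative_term))

labels = [1, 1, 0, 0]
well_separated = [0.0, 0.0, 1.5, 2.0]    # positives collapsed, negatives beyond the margin
poorly_separated = [1.0, 1.0, 0.0, 0.0]  # positives far apart, negatives collapsed

loss_good = contrastive_loss_np(labels, well_separated)
loss_bad = contrastive_loss_np(labels, poorly_separated)
print(loss_good, loss_bad)
```

An embedding that pulls positive pairs together and pushes negative pairs beyond the margin gets a lower loss than one that does the opposite, which is the behaviour the training procedure tries to encourage.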

Your tasks

        You are provided with scaffolding code and an example of a Siamese network based on a multi-layer perceptron network.  You need to familiarize yourself with this code, then adapt it to your needs.

        Write code to create a Siamese network based on a convolutional network with at least 3 convolutional layers, and 2 fully connected layers. No need to use more than 5 convolutional layers!

        Create a dataset of 100000 warped images using the provided function random_deform. Use the call  im2 = assign2_utils.random_deform(im1, 45, 0.3)   to warp the image im1 with a rotation of at most 45 degrees and a projective transformation with a “strength” of 0.3. 

        Find a suitable architecture for the base convolutional neural network of your Siamese network.

        Train your Siamese network on the original dataset (without warping) and report the performance of your network.

        Train your Siamese network on the warped dataset and report the performance of your network.

        Investigate whether starting the training of the network on easier pairs improves the classifier.  That is, start training the network on images that have a rotation significantly smaller than 45 degrees and a strength parameter smaller than 0.3.

        Describe your findings in the report.  Use tables and figures to support your arguments.
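
One possible way to organise the "easier pairs first" experiment is to define a short schedule of warp parameters that grows towards the target values of 45 degrees and 0.3. The sketch below is only an assumption about how such a schedule could look: the function name curriculum_stages, the linear growth and the number of stages are hypothetical choices, not part of the provided scaffolding.

```python
def curriculum_stages(max_rotation=45.0, max_variation=0.3, n_stages=3):
    """Return a list of (rotation, variation) pairs growing linearly to the maxima."""
    if n_stages < 1:
        raise ValueError("n_stages must be at least 1")
    return [
        (max_rotation * k / n_stages, max_variation * k / n_stages)
        for k in range(1, n_stages + 1)
    ]

stages = curriculum_stages()
for rotation, variation in stages:
    # Each stage would generate warped pairs with these limits and continue
    # training the same model on them.
    print("rotation <= %.1f degrees, strength <= %.2f" % (rotation, variation))
```

Each pair of values could be passed to random_deform to build the training set for that stage, with the network weights carried over from one stage to the next. A two-stage schedule (small warps, then the full deformation) is the minimum suggested by the task description.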


You should submit via Blackboard a zip file containing:

        A report in pdf format, strictly limited to 8 pages in total. The report should explain clearly your methodology for your experiments, and present your experimental results using tables and figures.

        Your Python file my_submission.py

Marking Guide Focus


        Structure (sections, page numbers), grammar, no typos.

        Clarity of explanations.

        Figures and tables  (use for explanations and to report performance).

        Code quality:    

        Readability, meaningful variable names.

        Proper use of Python constructs like numpy arrays, dictionaries and list comprehension.

        Header comments in classes and functions.

        Function parameter documentation.

        In-line comments.


        Soundness of the methodology

        Evidence based discussion/conclusion

Final Remarks

        Do not underestimate the workload. Start early. You are strongly encouraged to ask questions during the practical sessions.

        Email questions to ...

[1] Siamese networks are defined in the next section.

Marking Guide and Criterion List

Assignment Two 2017 IFN680

        Report:   10 marks

        Structure (sections, page numbers), grammar, no typos.

        Clarity of explanations.

        Figures and tables  (use for explanations and to report performance).

Levels of Achievement

10 Marks: Report written at the highest professional standard with respect to spelling, grammar, formatting, structure, and language terminology.

7 Marks: Report is very well written and understandable throughout, with only a few insignificant presentation errors.

5 Marks: The report is generally well written and understandable but with a few small presentation errors that make one or two points unclear. Clear figures and tables.

3 Marks: Large parts of the report are poorly written, making many parts difficult to understand. Use of sections with proper section titles. No figures or tables.

1 Mark: The entire report is poorly written and/or incomplete and/or impossible to understand. The report is in pdf format.

To get “i Marks”, the report needs to satisfy all the positive items and none of the negative items of the columns “j Marks” for all j<i.  For example, if your report is not in pdf format, you will not be awarded more than 1 mark.

        Code quality:   10 marks

        Readability, meaningful variable names.

        Proper use of Python constructs like numpy arrays, dictionaries and list comprehension.

        Header comments in classes and functions.

        Function parameter documentation.

        In-line comments.

Levels of Achievement

10 Marks

7 Marks

5 Marks

3 Marks

1 Mark

Code is generic. Minimal changes would be needed to run same experiments on a different dataset.

Proper use of numpy array operations. Avoid unnecessary loops.

Useful in-line comments.

Code structured so that it is straightforward to repeat the experiments.

No magic numbers (that is, all numerical constants have been assigned to variables).

Appropriate use of auxiliary functions. Each function parameter documented (including type and shape).

Header comments with instructions on how to run the code to repeat the experiments.

Code looks like a random spaghetti plate

To get “i Marks”, the code needs to satisfy all the positive items and none of the negative items of the columns “j Marks” for all j<i.

        Experiments 20 marks

Levels of Achievement

20 Marks: Successfully train a CNN based Siamese network on the warped dataset in two phases. First small warps, then larger deformations. The recommendations are supported by references to tables and/or figures.

15 Marks: Successfully train a CNN based Siamese network on the warped dataset. Methodology, experiments and recommendations are clear.

10 Marks: Successfully train a CNN based Siamese network on the original (not warped) dataset.

5 Marks: Partial description of the experiments. Critical information is missing to repeat the experiments.

0 Marks: No experiments described in the report.

To get “i Marks”, the report needs to satisfy all the positive items and none of the negative items of the columns “j Marks” for all j<i.

Assignment Code


'''
This module contains functions
- to load the original image dataset
- to generate random homographies
- to randomly warp images with random homographies
'''


import numpy as np

import random

from tensorflow.contrib import keras
from tensorflow.contrib.keras import backend as K

from skimage import transform


def load_dataset():
    '''
    Load the dataset, shuffled and split between train and test sets,
    and return the numpy arrays x_train, y_train, x_test, y_test.
    The dtype of all returned arrays is uint8.

    Returns:
        x_train, y_train, x_test, y_test
    '''
    (x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
    return x_train, y_train, x_test, y_test

def random_homography(variation, image_side):
    '''
    Generate a random homography.
    The larger the value of variation, the more deformation is applied.
    The homography is defined by 4 random points.

    Parameters:
           variation:    percentage (in decimal notation from 0 to 1)
                         relative size of a circle region where centre is projected
           image_side:   length of the side of an input square image in pixels
    Returns:
           tform:        object from skimage.transform
    '''
    d = image_side * variation
    top_left =    (random.uniform(-0.5*d, d), random.uniform(-0.5*d, d))  # Top left corner
    bottom_left = (random.uniform(-0.5*d, d), random.uniform(-0.5*d, d))   # Bottom left corner
    top_right =   (random.uniform(-0.5*d, d), random.uniform(-0.5*d, d))     # Top right corner
    bottom_right =(random.uniform(-0.5*d, d), random.uniform(-0.5*d, d))  # Bottom right corner

    tform = transform.ProjectiveTransform()

    # Estimate the homography that sends the four perturbed corners
    # onto the four corners of the image
    tform.estimate(np.array((
            top_left,
            (bottom_left[0], image_side - bottom_left[1]),
            (image_side - bottom_right[0], image_side - bottom_right[1]),
            (image_side - top_right[0], top_right[1])
        )), np.array((
            (0, 0),
            (0, image_side),
            (image_side, image_side),
            (image_side, 0)
        )))

    return tform

def random_deform(image, rotation, variation):
    '''
    Apply a random warping deformation to the input image.
      image : square 2D array
      rotation : maximum rotation in degrees
      variation : strength of the random projective transformation
    '''
    image_side = image.shape[0]
    assert image.shape[0]==image.shape[1]
    cval = 0
    rhom = random_homography(variation, image_side)
    image_warped = transform.rotate(
        image,
        random.uniform(-rotation, rotation), 
        resize = False,
        mode='constant', cval=cval)
    image_warped = transform.warp(image_warped, rhom, mode='constant', cval=cval)
    return image_warped
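
To make the idea of "a homography defined by 4 points" more concrete, here is a small NumPy-only sketch that solves for the 3x3 matrix sending four source points onto the four image corners, then checks the result. It is not the skimage implementation used by random_homography; the function names homography_from_points and apply_homography and the sample corner offsets are assumptions chosen for illustration.

```python
import numpy as np

def homography_from_points(src, dst):
    """Solve for the 3x3 matrix H (with H[2, 2] = 1) mapping four src points onto dst."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear equations in the
        # eight unknown entries of H.
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    h = np.linalg.solve(np.array(rows, dtype=float), np.array(rhs, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

def apply_homography(H, points):
    """Apply H to an (n, 2) array of points using homogeneous coordinates."""
    points = np.asarray(points, dtype=float)
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    mapped = homogeneous @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

side = 28
# Slightly perturbed corners (hypothetical offsets), in the same order as dst.
src = [(2.0, 1.0), (3.0, side - 2.0), (side - 1.0, side - 3.0), (side - 2.0, 2.0)]
dst = [(0, 0), (0, side), (side, side), (side, 0)]

H = homography_from_points(src, dst)
mapped_corners = apply_homography(H, src)
print(np.round(mapped_corners, 6))
```

Four point correspondences give eight equations, which is exactly enough to fix the eight free entries of a homography, provided no three of the points are collinear.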


Assignment Code


'''
2017 IFN680 Assignment Two

Scaffolding code to get you started for the 2nd assignment.
'''

import random
import numpy as np

#import matplotlib.pyplot as plt

from tensorflow.contrib import keras

from tensorflow.contrib.keras import backend as K
import assign2_utils


def euclidean_distance(vects):
    '''
    Auxiliary function to compute the Euclidean distance between two vectors
    in a Keras layer.
    '''
    x, y = vects
    return K.sqrt(K.maximum(K.sum(K.square(x - y), axis=1, keepdims=True), K.epsilon()))


def contrastive_loss(y_true, y_pred):
    '''
    Contrastive loss from Hadsell-et-al.'06
      y_true : true label 1 for positive pair, 0 for negative pair
      y_pred : distance output of the Siamese network    
    '''
    margin = 1
    # if positive pair, y_true is 1, penalize for large distance returned by Siamese network
    # if negative pair, y_true is 0, penalize for distance smaller than the margin
    return K.mean(y_true * K.square(y_pred) +
                  (1 - y_true) * K.square(K.maximum(margin - y_pred, 0)))

def compute_accuracy(predictions, labels):
    '''
    Compute classification accuracy with a fixed threshold on distances.
      predictions : values computed by the Siamese network
      labels : 1 for positive pair, 0 otherwise
    '''
    # the commented-out formula below only measures the proportion of
    # true positives among the pairs predicted as positive
    #    return labels[predictions.ravel() < 0.5].mean()
    n = labels.shape[0]
    acc =  (labels[predictions.ravel() < 0.5].sum() +  # count True Positive
               (1-labels[predictions.ravel() >= 0.5]).sum() ) / n  # True Negative
    return acc


def create_pairs(x, digit_indices):
    '''
    Positive and negative pair creation.
    Alternates between positive and negative pairs.
    Parameters:
      x : array of images
      digit_indices : list of lists
         digit_indices[k] is the list of indices of occurrences of digit k in 
         the dataset
    Returns:
      P, L 
      where P is an array of pairs and L an array of labels
      L[i] == 1 if P[i] is a positive pair
      L[i] == 0 if P[i] is a negative pair
    '''
    pairs = []
    labels = []
    n = min([len(digit_indices[d]) for d in range(10)]) - 1
    for d in range(10):
        for i in range(n):
            z1, z2 = digit_indices[d][i], digit_indices[d][i + 1]
            pairs += [[x[z1], x[z2]]]
            # z1 and z2 form a positive pair
            inc = random.randrange(1, 10)
            dn = (d + inc) % 10
            z1, z2 = digit_indices[d][i], digit_indices[dn][i]
            # z1 and z2 form a negative pair
            pairs += [[x[z1], x[z2]]]
            labels += [1, 0]
    return np.array(pairs), np.array(labels)

def simplistic_solution():
    '''
    Train a Siamese network to predict whether two input images correspond to the 
    same digit.

    Note:
        in your submission, you should use auxiliary functions to create the 
        Siamese network, to train it, and to compute its performance.
    '''
    def create_simplistic_base_network(input_dim):
        '''
        Base network to be shared (eq. to feature extraction).
        '''
        seq = keras.models.Sequential()
        seq.add(keras.layers.Dense(128, input_shape=(input_dim,), activation='relu'))
        seq.add(keras.layers.Dense(128, activation='relu'))
        seq.add(keras.layers.Dense(128, activation='relu'))
        return seq
    # . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 
    # load the dataset
    x_train, y_train, x_test, y_test  = assign2_utils.load_dataset()

    # Example of magic numbers (60000, 784)
    # This should be avoided. Here we could/should have retrieved the
    # dimensions of the arrays using the numpy ndarray attribute shape 
    x_train = x_train.reshape(60000, 784) 
    x_test = x_test.reshape(10000, 784)
    x_train = x_train.astype('float32')
    x_test = x_test.astype('float32')
    x_train /= 255 # normalized the entries between 0 and 1
    x_test /= 255
    input_dim = 784 # 28x28

    epochs = 20

    # create training+test positive and negative pairs
    digit_indices = [np.where(y_train == i)[0] for i in range(10)]
    tr_pairs, tr_y = create_pairs(x_train, digit_indices)
    digit_indices = [np.where(y_test == i)[0] for i in range(10)]
    te_pairs, te_y = create_pairs(x_test, digit_indices)
    # network definition
    base_network = create_simplistic_base_network(input_dim)
    input_a = keras.layers.Input(shape=(input_dim,))
    input_b = keras.layers.Input(shape=(input_dim,))
    # because we re-use the same instance `base_network`,
    # the weights of the network
    # will be shared across the two branches
    processed_a = base_network(input_a)
    processed_b = base_network(input_b)
    # node to compute the distance between the two vectors
    # processed_a and processed_b
    distance = keras.layers.Lambda(euclidean_distance)([processed_a, processed_b])
    # Our model takes as input a pair of images input_a and input_b
    # and outputs the Euclidean distance of the mapped inputs
    model = keras.models.Model([input_a, input_b], distance)

    # train
    rms = keras.optimizers.RMSprop()
    model.compile(loss=contrastive_loss, optimizer=rms)
    model.fit([tr_pairs[:, 0], tr_pairs[:, 1]], tr_y,
              epochs=epochs,
              validation_data=([te_pairs[:, 0], te_pairs[:, 1]], te_y))

    # compute final accuracy on training and test sets
    pred = model.predict([tr_pairs[:, 0], tr_pairs[:, 1]])
    tr_acc = compute_accuracy(pred, tr_y)
    pred = model.predict([te_pairs[:, 0], te_pairs[:, 1]])
    te_acc = compute_accuracy(pred, te_y)
    print('* Accuracy on training set: %0.2f%%' % (100 * tr_acc))
    print('* Accuracy on test set: %0.2f%%' % (100 * te_acc))

if __name__=='__main__':
    simplistic_solution()

# ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
#                               CODE CEMETERY        
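
The thresholding rule used in compute_accuracy can be checked by hand on a tiny example. The sketch below restates it with NumPy as "a pair is predicted positive when its distance is below the threshold"; the function name pair_accuracy and the four sample distances are assumptions made for this illustration.

```python
import numpy as np

def pair_accuracy(distances, labels, threshold=0.5):
    """Fraction of pairs whose thresholded distance agrees with the label."""
    distances = np.asarray(distances).ravel()
    labels = np.asarray(labels).ravel()
    # A pair is predicted positive when the two embeddings are close.
    predicted_positive = distances < threshold
    return float(np.mean(predicted_positive == (labels == 1)))

# Column vector of distances, shaped like the output of model.predict.
distances = np.array([[0.1], [0.7], [0.9], [0.2]])
labels = np.array([1, 1, 0, 0])

accuracy = pair_accuracy(distances, labels)
print(accuracy)
```

Here the first pair is a true positive and the third a true negative, while the second and fourth are misclassified, so half of the pairs are handled correctly. Keeping the threshold as a named parameter also avoids repeating 0.5 as a magic number.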

Assignment Code


'''
A short script to illustrate the warping functions of 'assign2_utils'
'''


#import numpy as np
import matplotlib.pyplot as plt

from tensorflow.contrib import keras
#from tensorflow.contrib.keras import backend as K
import assign2_utils

(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()

im1 = x_train[20]

im2 = assign2_utils.random_deform(im1,45,0.3)
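
The single call above extends naturally to a whole array of images. The following sketch shows one way this might be done; make_warped_dataset is a hypothetical helper, and placeholder_deform is only a stand-in so that the example runs without assign2_utils. In the assignment, assign2_utils.random_deform would be passed in its place.

```python
import numpy as np

def make_warped_dataset(images, deform, rotation, variation):
    """Apply deform(image, rotation, variation) to every image and stack the results."""
    warped = [deform(image, rotation, variation) for image in images]
    return np.stack(warped).astype('float32')

def placeholder_deform(image, rotation, variation):
    # Stand-in for assign2_utils.random_deform: it ignores the warp
    # parameters and only rescales the pixel values to [0, 1].
    return image / 255.0

rng = np.random.RandomState(0)
images = rng.randint(0, 256, size=(5, 28, 28)).astype('uint8')  # fake 28x28 images

warped = make_warped_dataset(images, placeholder_deform, 45, 0.3)
print(warped.shape, warped.dtype)
```

Passing the deformation function as an argument keeps the helper independent of the dataset and of the warp parameters. It is worth checking the dtype and value range of the warped images before mixing them with images that were normalized separately.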



