EEE3032: Computer Vision and Pattern Recognition

 

1.

(a)   Apply two iterations of the k-Means algorithm to the 2D data distribution X, where k = 2 and the initial cluster centres are (28, 38) and (41, 70).


(b)   Given a colour image, explain how the k-Means algorithm can be used to produce spatially coherent segmentation.

(c)   Explain the purpose of the normalization step in a normalized graph cut (NCut), within the context of image segmentation.

(d)   Calculate the value of the normalized cut NCut(A, B) indicated by the dashed line on the following lattice:

 

(e)   Consider a video sequence in which a stationary camera captures images of a multi-coloured moving object. Briefly suggest a method for extracting a binary segmentation of that moving foreground object.



2.

(a)   An Eigenmodel is computed from data X using its mean µ and a decomposition of its covariance (C) such that C = UVU^T. What do U and V represent with respect to the distribution of X?

(b)   Calculate the mean (µ) and covariance (C) of the 2D point distribution X = [x1, x2, ..., x5] described below.


(c)   Consider the problem of detecting and tracking aeroplanes based on the shape of their 2D contours.

(i)    Briefly describe how an Eigenmodel can be used to build a Point Distribution Model (PDM) to model the statistics of a training set of aeroplane shapes.

(ii)   What is meant by the null space of the PDM? What would shapes look like sampled from within the null space of the PDM?

(iii)  Explain, with suitable mathematics, how the variation within the shapes used to train the PDM could be visualized.

(iv)  Explain how a tracking algorithm could be combined with a PDM in order to robustly track a shape.

 

3.

You are building a system for sign language recognition from video. You must write the Computer Vision software to determine if a hand is present in the video, and if so which of five kinds of shape (A, B, C, D or E) the hand is making. You should use the techniques taught on this module that you feel are most appropriate.

(a)   Assume that a contour of the hand is available. Recommend, and describe in full mathematical detail, an appropriate shape descriptor capable of discriminating between the five hand shapes.

(b)   You decide to approach the shape recognition problem as a supervised classification task.  Describe a suitable classifier using full mathematical detail and explain how your system would be trained, and how the overall performance of the system could be evaluated.

(c)   State the product and sum rules for combining probabilities, and derive from these an expression for Bayes’ law.

(d)   Describe, with full mathematical detail, the SELECT-UPDATE-MEASURE cycle of the particle filter tracker and how this applies Bayes’ law.

 

4.

(a)   Explain what is meant by the convolution theorem. What are its practical implications when performing filter operations on large images?

(b)   Given the following greyscale image

Write down a 3 × 3 convolution kernel that is commonly used to low-pass filter an image, and then perform that convolution on image I(x, y).

(c)   Explain, with appropriate diagrams, how the Bag of Visual Words (BoVW) representation can be trained and used to recognise objects in images.

(d)   How do deep learning approaches differ from classical approaches applied to the task of object recognition?

(e)   Sketch the architecture of the AlexNet convolutional neural network (CNN) [Krizhevsky et al., NIPS 2012]. Ensure you name and label the layers.

(f)    What is the significance of the ReLU layer in a CNN?

(g)   Deep learning models typically perform well only when trained using a large amount of data. Suggest one way to mitigate this dependence when only limited training data is available.