ECS797P Machine Learning for Visual Data Analysis Main Examination Period 2023

Published: 2024-05-30

Question 1

A company that designs a multimedia management system wants to add the functionality of automatically labelling an image with respect to whether it depicts an indoor or an outdoor scene. The Computer Vision engineer is asked to implement this feature and decides to do so using a Bag-of-Words representation. There are 500 indoor and 500 outdoor images in the training dataset. For each image, local descriptors are extracted at 100 positions, and all of the local descriptors are clustered into 512 clusters.

a) Which of the following local descriptors are applicable and useful for the problem in hand, and why? Histogram of Oriented Gradients (HoG), Histogram of Optical Flow (HoF), local colour histogram.

[6 marks]

b) Give a precise description of the Machine Learning problem that the Computer Vision engineer has in hand, including the number of samples, the dimensionality of each sample, the input and the output of the Machine Learning module.

[6 marks]

c) Is there any issue/problem given the specific number of examples and dimensionality? What are the implications on the classification method that you would recommend and why?

[6 marks]

d) At test time, for a specific user who has 100 outdoor and 900 indoor images in their collection, the system correctly labels 90 out of the 100 outdoor images and 810 out of the 900 indoor images.

i) What is the overall accuracy of the system?

ii) Give the confusion matrix.

[7 marks]
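
The arithmetic in part d) can be sketched as follows. This is an illustrative sketch rather than a model answer: the layout (rows as true classes, columns as predicted classes) and the variable names are assumptions, and only the counts come from the question.

```python
# Counts from part d). Rows are true classes, columns are predicted classes
# (this orientation is an assumed convention, not stated in the question).
classes = ["outdoor", "indoor"]
confusion = {
    "outdoor": {"outdoor": 90, "indoor": 100 - 90},   # 10 outdoor images mislabelled
    "indoor": {"outdoor": 900 - 810, "indoor": 810},  # 90 indoor images mislabelled
}

total = sum(sum(row.values()) for row in confusion.values())
correct = sum(confusion[c][c] for c in classes)
accuracy = correct / total

# Fraction of images predicted as "outdoor" that really are outdoor.
predicted_outdoor = sum(confusion[c]["outdoor"] for c in classes)
outdoor_precision = confusion["outdoor"]["outdoor"] / predicted_outdoor

print("accuracy:", accuracy)
for true_class in classes:
    print(true_class, [confusion[true_class][p] for p in classes])
print("precision for 'outdoor':", outdoor_precision)
```

Because the test collection is imbalanced, the overall accuracy alone hides how many of the images labelled "outdoor" are actually indoor, which is why the sketch also prints the precision for that class.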

Question 2

A Computer Vision company is designing a system that monitors the behaviour of lions around water ponds in Africa's savannas. Assume that the image sequences at each pond are recorded from a camera that is located at the top of a high tree with clear view of the pond and the surroundings. Some of the behaviours that the zoologists are interested in are:

1. Statistics of the distances that the lions keep from each other.

2. Statistics of the length of their sleep.

This question is about using object detection and tracking to recognise some, or all, of the behaviours above.

a) Can object detection and/or tracking components be used to recognise all the above behaviours? If yes, how will the system be deployed at test time? If not, why not?

[10 marks]

b) Explain how you will train the components of the system that will be used for the first behaviour.

[8 marks]

c) Give two appropriate measures and describe the process you would use to evaluate the performance of the system with respect to the first behaviour. Define each measure using a formula.

[7 marks]

Question 3

a) Consider a database of 2400 face images of size 40 by 50. This database contains images of 40 people each having 60 images. Now consider applying Principal Component Analysis (PCA) to the database to construct 12 eigenfaces for face recognition.

i) What is the dimensionality of the covariance matrix of the dataset?

ii) What is the dimensionality of the mean face?

iii) What is the dimensionality of the eigenface?

iv) What is the dimensionality of the pattern vector?

[8 marks]
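
The dimensionalities asked for in part a) follow from the image size and the number of retained components. The sketch below works through the bookkeeping under two assumptions that the question does not state: each image is flattened into a single vector of pixel values, and the covariance is taken over pixel dimensions (not the smaller image-by-image matrix sometimes used as a computational shortcut). All variable names are hypothetical.

```python
# Figures taken from the question.
side_a, side_b = 40, 50      # image size, 40 by 50 pixels
num_people, images_per_person = 40, 60
num_eigenfaces = 12

num_images = num_people * images_per_person
d = side_a * side_b          # length of one flattened image vector

shapes = {
    "data matrix (images x pixels)": (num_images, d),
    "covariance matrix": (d, d),
    "mean face": (d,),
    "one eigenface": (d,),
    "pattern vector (projection onto the eigenfaces)": (num_eigenfaces,),
}

for name, shape in shapes.items():
    print(f"{name}: {shape}")
```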

b) Answer the following questions on using histograms as an image representation.

i) How can an intensity value histogram representation be made to be insensitive to illumination changes?

ii) What is the histogram intersection distance between the following two histograms: [0.2 0.3 0.5] and [0.4 0.2 0.4]?

iii) What is the Euclidean distance of the two histograms above?

iv) You have a feature space of 5 dimensions and you want to build a histogram with each dimension quantised into 10 values. With a joint histogram, what will be the total number of bins in your histogram? What is the total bin number if you decide to build a marginal histogram instead?

[12 marks]
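
The calculations in parts ii) to iv) can be sketched as below. One point is an assumption: the histogram intersection is computed as the sum of bin-wise minima, which is a similarity score, and the sketch also reports one minus that value because "intersection distance" is sometimes defined that way. The helper names are hypothetical.

```python
import math

h1 = [0.2, 0.3, 0.5]
h2 = [0.4, 0.2, 0.4]


def histogram_intersection(a, b):
    """Sum of bin-wise minima; equals 1 for identical normalised histograms."""
    return sum(min(x, y) for x, y in zip(a, b))


def euclidean_distance(a, b):
    """Square root of the summed squared bin differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


intersection = histogram_intersection(h1, h2)
intersection_distance = 1.0 - intersection  # assumed "distance" convention
euclidean = euclidean_distance(h1, h2)

# Part iv): 5 feature dimensions, each quantised into 10 values.
num_dims, levels = 5, 10
joint_bins = levels ** num_dims     # one bin per combination of values
marginal_bins = levels * num_dims   # one 10-bin histogram per dimension

print(intersection, intersection_distance, euclidean)
print(joint_bins, marginal_bins)
```

The gap between the two bin counts illustrates why joint histograms become sparse quickly as the number of dimensions grows.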

c) The PCA (eigenface) technique is not limited to face recognition. It has been used for other object recognition tasks. Given a dataset that consists of images of the Eiffel Tower and some other towers, you want to use PCA (eigenface) and the nearest neighbour method to build a classifier that predicts whether new images depict the Eiffel Tower. Some samples of your input training images are given in Figure 1 below. In order to get reasonable performance from the eigenface algorithm, what preprocessing steps will be required on these images?

Eiffel Tower 1

Other Tower 1

Eiffel Tower 2

Other Tower 2

Eiffel Tower 3

Other Tower 3

Figure 1

[5 marks]

Question 4

a) Explain why the Viola-Jones algorithm is slow in training but very fast in detection.

[7 marks]

b) Fig. 2(a) depicts a Haar feature pattern (blue colour denotes +1 and black denotes -1) and Fig. 2(b) depicts the pixel values of a 4x5 image. Fig. 2(a) and Fig. 2(b) are of the same size. Compute the Haar feature with the following steps.

i) Compute the integral image of Fig. 2(b)

ii) Show the procedure to compute the Haar feature value from the integral image.


-1	-1	-1
+1	+1	+1

Fig. 2(a) Haar feature

Fig. 2(b) Image

[11 marks]
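
The procedure in part b) can be sketched in code. The pixel values of Fig. 2(b) are not available here, so the image below is a purely hypothetical placeholder, and the split of the pattern (top two rows weighted -1, bottom two rows weighted +1, with 4x5 read as 4 rows by 5 columns) is likewise an assumption about Fig. 2(a). Only the technique is the point: build the integral image once, then obtain each rectangle sum from four lookups.

```python
def integral_image(img):
    """Summed-area table with an extra leading row and column of zeros."""
    rows, cols = len(img), len(img[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            ii[r + 1][c + 1] = img[r][c] + ii[r][c + 1] + ii[r + 1][c] - ii[r][c]
    return ii


def rect_sum(ii, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


# Hypothetical 4x5 image; NOT the values from Fig. 2(b).
image = [
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 10],
    [11, 12, 13, 14, 15],
    [16, 17, 18, 19, 20],
]

ii = integral_image(image)

# Assumed pattern: top half weighted -1, bottom half weighted +1.
top_half = rect_sum(ii, 0, 0, 2, 5)
bottom_half = rect_sum(ii, 2, 0, 4, 5)
haar_value = bottom_half - top_half

print(ii)
print(haar_value)
```

Padding the table with a zero row and column avoids special cases for rectangles that touch the top or left image border.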

c) An organisation that monitors giraffe populations in a park in Kenya asks a company to design a giraffe detector for them. The company is told that the camera needs to be placed on a path where several animals walk and that giraffes often walk close to each other in small groups (5-10). Is a part-based detector or a sliding window approach more appropriate for the problem in question? Justify your answer.

[7 marks]