ECMM426 Computer Vision
ECMM426
COLLEGE OF ENGINEERING, MATHEMATICS
AND PHYSICAL SCIENCES
COMPUTER SCIENCE
Examination, May 2020
Computer Vision
Duration: TWO HOURS + 30 MINUTES UPLOAD TIME
Answer ALL the questions.
Question 1 is worth 80 marks, while question 2 is worth 20 marks.
The marks for this module are calculated from 40% of the percentage mark for
this paper plus 60% of the percentage mark for associated coursework.
This is an OPEN BOOK examination.
SECTION A (Multiple Choice Questions)
Question 1
There are FORTY multiple choice questions with several possible choices each.
Clearly mark or write all the correct choices. Please note these questions might
have multiple correct answers, with partial marking.
1. Consider a grayscale image of size 200 × 300. How much space, in kilobytes
(KB), would this image require when stored on disk?
(i) 20 KB
(ii) 60 KB
(iii) 300 KB
(iv) 100 KB
(2 marks)
2. Which of the following is a challenge when dealing with computer vision
problems?
(i) Variations due to geometric changes (like pose, scale etc)
(ii) Variations due to photometric factors (like illumination, appearance etc)
(iii) Background clutter
(iv) All of the above
(2 marks)
3. Convolution of a Gaussian filter with another Gaussian filter generates:
(i) Box filter
(ii) Unsharp filter
(iii) Gaussian filter
(iv) None of the above
(2 marks)
4. Suppose we have the following noisy image (Figure 1):
Figure 1: salt and pepper noise
This type of noise in the image is called ‘salt & pepper’ noise. Which type
of filter should be applied to denoise the image?
(i) Linear filter
(ii) Median filter
(iii) Sobel filter
(iv) None of the above
(2 marks)
5. ‘Ringing’ is an image artefact generated by:
(i) Box filter
(ii) Gaussian filter
(iii) Unsharp filter
(iv) All of the above
(2 marks)
6. What would be the relation between the original and modified image if the
original image is convolved with the following filter (Figure 2)?
Figure 2: filter
(i) Blurred image
(ii) Sharpened image
(iii) Inverted image
(iv) Rotated image
(2 marks)
7. If we convolve an image with the filter given below (Figure 3), what would
be the relation between the original and modified image?
Figure 3: filter
(i) The original image will be shifted to the right by 1 pixel
(ii) The original image will be shifted down by 1 pixel
(iii) The original image will be shifted to the left by 1 pixel
(iv) The original image will be shifted up by 1 pixel
(2 marks)
8. In Canny edge detection, we will get more continuous edges if we make the
following change to the hysteresis thresholding:
(i) increase the high threshold
(ii) decrease the high threshold
(iii) increase the low threshold
(iv) decrease the low threshold
(2 marks)
9. In the following image (Figure 4), you can find an edge labelled in the red
region. Which form of discontinuity creates this kind of edge?
Figure 4: chair
(i) Depth Discontinuity
(ii) Surface colour Discontinuity
(iii) Illumination discontinuity
(iv) None of the above
(2 marks)
10. What kind of edges would the Canny edge detector generate without doing
the non-maximum suppression step?
(i) Very thin edges
(ii) Thick edge regions
(iii) Perfect edges
(iv) None of the above
(2 marks)
11. What are the main benefits of detecting image edges using the zero-crossings
of Laplacian of Gaussian (LoG) of the image rather than thresholding its
gradient magnitude?
(i) Zero-crossing produces contours instead of regions
(ii) Zero-crossing is less sensitive to image noise
(iii) Zero-crossing is independent of threshold parameter
(iv) All of the above
(2 marks)
12. Let $\lambda_1$ and $\lambda_2$ be the eigenvalues of the second order moment matrix $M$,
from which we can compute the measure for detecting Harris corners as
$R = \lambda_1 \lambda_2 - k(\lambda_1 + \lambda_2)^2$, where $k$ is a small constant. Which of the
following criteria on $R$ lead to a region being rejected as a corner?
(i) R > 0
(ii) |R| is small
(iii) R < 0
(iv) All of the above
(2 marks)
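The response defined in this question can be evaluated numerically. The sketch below is only an illustration: the 2 × 2 matrices are hypothetical values chosen to show three eigenvalue regimes, and `k = 0.05` is an assumed value for the "small constant" (the question does not fix one).

```python
import numpy as np

def harris_response(M, k=0.05):
    """Return R = l1*l2 - k*(l1 + l2)**2 for a symmetric 2x2 matrix M.

    k = 0.05 is an assumed value; the question only says k is small.
    """
    l1, l2 = np.linalg.eigvalsh(M)  # eigenvalues of a symmetric matrix
    return l1 * l2 - k * (l1 + l2) ** 2

# Hypothetical second-moment matrices, chosen only for illustration.
examples = {
    "both eigenvalues large": np.array([[10.0, 0.0], [0.0, 8.0]]),
    "one eigenvalue dominant": np.array([[10.0, 0.0], [0.0, 0.1]]),
    "both eigenvalues small": np.array([[0.1, 0.0], [0.0, 0.1]]),
}

responses = {name: harris_response(M) for name, M in examples.items()}
for name, R in responses.items():
    print(f"{name}: R = {R:.4f}")
```

Comparing the sign and magnitude of the three printed values shows how R separates the regimes.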
13. Which of the following transformations is the Harris corner detector
invariant to?
(i) Translation
(ii) Scaling
(iii) Rotation
(iv) Photometric
(2 marks)
14. Let $f_1^1$ be a SIFT descriptor from an image $I_1$, and let $f_2^1$ and $f_2^2$ be two SIFT
descriptors from another image $I_2$, which are respectively the nearest and
second nearest neighbours (in L2 distance) of $f_1^1$ in $I_2$. $f_1^1$ from $I_1$ is said to
be matched to $f_2^1$ in $I_2$ if it satisfies the following criterion, where $\|\cdot\|$ denotes
L2 distance:
(i) $\dfrac{\|f_1^1 - f_2^1\|}{\|f_1^1 - f_2^2\|} \approx 0$
(ii) $\dfrac{\|f_1^1 - f_2^1\|}{\|f_1^1 - f_2^2\|} \approx 1$
(iii) $\dfrac{\|f_1^1 - f_2^1\|}{\|f_1^1 - f_2^2\|}$ [?] $1$
(iv) $\dfrac{\|f_1^1 - f_2^1\|}{\|f_1^1 - f_2^2\|}$ [?] $1$
(2 marks)
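The quantity compared in the options above is the ratio of the distance to the nearest neighbour over the distance to the second nearest neighbour. A minimal sketch of how that ratio could be computed follows. The function name `distance_ratio` and the 4-dimensional toy vectors are assumptions for illustration only (real SIFT descriptors are 128-dimensional), and the sketch assumes the second nearest distance is non-zero.

```python
import numpy as np

def distance_ratio(query, candidates):
    """Ratio of nearest to second-nearest L2 distance from query to candidates."""
    distances = np.linalg.norm(candidates - query, axis=1)
    nearest, second = np.sort(distances)[:2]
    return nearest / second  # assumes second > 0

# Toy descriptor from image I1 (hypothetical values).
query = np.array([1.0, 0.0, 0.0, 0.0])

# Case A: one candidate in I2 is much closer than all the others.
distinct = np.array([[1.0, 0.1, 0.0, 0.0],
                     [0.0, 1.0, 0.0, 0.0],
                     [0.0, 0.0, 1.0, 0.0]])

# Case B: two candidates in I2 are equally close.
ambiguous = np.array([[1.0, 0.5, 0.0, 0.0],
                      [1.0, 0.0, 0.5, 0.0]])

ratio_distinct = distance_ratio(query, distinct)
ratio_ambiguous = distance_ratio(query, ambiguous)
print(ratio_distinct, ratio_ambiguous)
```

The two cases show how the ratio behaves when the nearest neighbour is clearly separated from the rest and when it is not.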
15. Suppose you have to rotate an image (Figure 5). Image rotation amounts to
multiplying the image coordinates by a specific matrix to obtain the transformed
image.
Figure 5: rotation
For simplicity, consider rotating a single point in the image from coordinates
(1, 0) to coordinates (0, 1). Which of the following matrices would we
have to multiply by?
(i) $\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$
(ii) $\begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix}$
(iii) $\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$
(iv) $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$
(2 marks)
16. The Cartesian coordinate of the homogeneous coordinate (x, y, w) is
(i) $\left(\frac{x}{w}, \frac{y}{w}\right)$
(ii) $\left(\frac{x}{w}, \frac{y}{w}, 1\right)$
(iii) (x, y, 1)
(iv) (x, y)
(2 marks)
17. Let $R_1$ and $R_2$ be two matrices that define two different rotation
transformations. Which one of the following is true about them?
(i) $R_1 R_2 \neq R_2 R_1$
(ii) $R_1 R_2 R_1 = R_2 R_1 R_2$
(iii) $R_2 R_1 > R_1 R_2$
(iv) $R_1 R_2 < R_2 R_1$
(2 marks)
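18. In a 2D coordinate system, mirroring about the line y = x can be achieved by
the following transformation matrix:
(i) $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$
(ii) $\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$
(iii) $\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$
(iv) $\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$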
(2 marks)
19. Let $O$ be the origin of a 2D coordinate system $C$ and $P$ ($\neq O$) be any point in
$C$. Further assume that $R$ is a rotation about $O$ and $T$ is the translation
from the point $P$ to $O$. The transformation matrix that achieves rotation $R$
about the point $P$ can be written as:
(i) $R T R^{-1}$
(ii) $T^{-1} R T^{-1}$
(iii) $T^{-1} R T$
(iv) $T R T$
(2 marks)
20. Which of the following could affect the intrinsic parameters of a camera?
(i) A crooked lens system
(ii) Diamond/rhombus-shaped pixels with non-right angles
(iii) The aperture configuration and construction
(iv) Any offset of the image sensor from the lens’s optical centre
(2 marks)
21. Which of the following statements describes an affine camera but not a
general perspective camera?
(i) Relative sizes of visible objects in a scene can be determined without
prior knowledge
(ii) Can be used to determine the distance from an object of a known height
(iii) Approximates the human visual system
(iv) An infinitely long plane can be viewed as a line from the right angle
(2 marks)
22. Let us assume that $P$ unknown 3D points are projected into $F$
images, where the 2D coordinates of those $P$ points and their
correspondences are known. Taking $W$ (shape: $2F \times P$) as the 2D
coordinates of those $P$ points in the $F$ images, $R$ (shape: $2F \times 3$) as the camera
rotation matrix for the $F$ images and $S$ as the reconstructed 3D real world points,
their relation can be expressed as $W = R \times S$, where $W$ and $R$ are known and
$S$ is unknown. The solution for $S$ is given by:
(i) $R^{-1} W$
(ii) $W^T R^{-1} W^T W$
(iii) $W^T R^{-1} W^T R^{-1}$
(iv) None of the above
(2 marks)
23. Let us assume that $P$ unknown 3D points are projected into $F$
images, where the 2D coordinates of those $P$ points and their
correspondences are known. Taking $W$ (shape: $2F \times P$) as the 2D
coordinates of those $P$ points in the $F$ images, $R$ (shape: $2F \times 3$) as the camera
rotation matrix for the $F$ images and $S$ as the reconstructed 3D real world points,
their relation can be expressed as $W = R \times S$, where $W$ is known and $R$, $S$
are unknown. The solutions for $R$ and $S$ can be estimated by:
(i) Random matrices that satisfy the expression
(ii) Singular value decomposition (SVD) and then selecting appropriate
submatrix depending on matrix rank
(iii) Selecting those rows and columns that respectively maximise and
minimise the matrix rank
(iv) None of the above
(2 marks)
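For readers who want to experiment with the relation W = R × S when both factors are unknown, the sketch below applies a rank-3 truncated SVD to a synthetic measurement matrix. It is a minimal illustration under stated assumptions, not a reference solution: the data are random, the sizes `F = 5` and `P = 20` are arbitrary, and the recovered factors are only determined up to an invertible 3 × 3 transformation (resolving that ambiguity needs extra constraints not shown here).

```python
import numpy as np

rng = np.random.default_rng(0)
F, P = 5, 20  # arbitrary illustrative sizes

# Synthetic ground truth (random, not real camera or scene data).
R_true = rng.standard_normal((2 * F, 3))
S_true = rng.standard_normal((3, P))
W = R_true @ S_true  # measurement matrix of shape (2F, P)

# Noise-free W has rank 3, so only three singular values are significant.
U, s, Vt = np.linalg.svd(W, full_matrices=False)
rank = int(np.sum(s > 1e-9 * s[0]))

# Split the three dominant singular values evenly between the two factors.
root = np.sqrt(s[:3])
R_hat = U[:, :3] * root           # shape (2F, 3)
S_hat = root[:, None] * Vt[:3, :]  # shape (3, P)

print("numerical rank:", rank)
print("max reconstruction error:", np.abs(R_hat @ S_hat - W).max())
```

Note that `R_hat` and `S_hat` reproduce W but generally differ from `R_true` and `S_true`, which reflects the 3 × 3 ambiguity mentioned above.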
24. Recognising an ‘Armchair’ among a collection of ‘Wing chair’, ‘Deck
chair’, ‘Desk chair’, ‘Barber chair’, ‘Operator chair’, ‘Armchair’,
‘Executive chair’, ‘Garden chair’ is known as:
(i) Instance recognition
(ii) Category recognition
(iii) Deep recognition
(iv) None of the above
(2 marks)
25. Which one of the following steps is not involved in the bag-of-words model?
(i) Feature extraction
(ii) Feature quantisation
(iii) Non-maximum suppression
(iv) Visual vocabulary creation
(2 marks)
26. Let us assume that for creating a bag-of-visual-words (BoVW) model, we
have created a visual vocabulary of size 300. Now if we want to create
a bag-of-visual-words image descriptor with a 4 × 4 spatial pyramid, the
dimension of the feature should be:
(i) 4800
(ii) 1200
(iii) 300
(iv) 2400
(2 marks)
27. In a bag-of-visual-words model, the optimal size of the visual vocabulary
should be determined from the evaluation performance on the following data
split:
(i) Train set
(ii) Validation set
(iii) Test set
(iv) Both Validation and Test set
(2 marks)
28. What is the standard practice for using a linear SVM to classify two classes
that are not linearly separable?
(i) Cross validation
(ii) Kernel trick
(iii) Neural neighbour trick
(iv) None of the above
(2 marks)
29. In the Viola-Jones face detection algorithm, how does one implement a ‘weak
classifier’?
(i) SIFT feature with thresholding
(ii) Rectangular feature with thresholding
(iii) HOG feature with SVM
(iv) Rectangular feature with SVM
(2 marks)
30. Suppose we have the following image (Figure 6):
Figure 6: image
Our task is to segment the objects in the image. A simple way to do this
is to represent the image in terms of pixel intensity and then cluster the pixels
according to their intensity values. Doing this, we obtain the following histogram
(Figure 7) of pixel intensity.
Figure 7: histogram
Suppose we choose k-means clustering to solve the problem. What would be
the appropriate value of k from just a visual inspection of the pixel intensity
histogram?
(i) 1
(ii) 2
(iii) 3
(iv) 4
(2 marks)
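The clustering procedure described in this question can be sketched in a few lines. The example below is an assumption-laden illustration: it uses a synthetic two-region image (not Figure 6), a hypothetical helper `kmeans_1d`, and an arbitrary quantile-based initialisation. The value `k = 2` matches the synthetic data only and says nothing about the histogram in Figure 7.

```python
import numpy as np

def kmeans_1d(values, k, iters=20):
    """Plain k-means (Lloyd's algorithm) on a 1D array of intensities."""
    # Initialise centres at evenly spaced quantiles (an arbitrary choice).
    centres = np.quantile(values, np.linspace(0, 1, k + 2)[1:-1])
    for _ in range(iters):
        # Assign every value to its closest centre.
        labels = np.argmin(np.abs(values[:, None] - centres[None, :]), axis=1)
        # Move each centre to the mean of its members (skip empty clusters).
        for j in range(k):
            if np.any(labels == j):
                centres[j] = values[labels == j].mean()
    return centres, labels

# Synthetic image: dark background with one bright square (illustrative only).
rng = np.random.default_rng(0)
image = rng.normal(50, 5, size=(20, 20))
image[5:15, 5:15] = rng.normal(200, 5, size=(10, 10))

centres, labels = kmeans_1d(image.ravel(), k=2)
segmentation = labels.reshape(image.shape)  # one cluster label per pixel
print("cluster centres:", np.sort(centres))
```

Reshaping the labels back to the image shape gives a segmentation map in which every pixel carries the index of its intensity cluster.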
2026-03-11