ECMM426 Computer Vision
Course Assessment
This is an autogradable course assessment (CA) for the ECMM426 Computer Vision module, which represents 60% of the overall module assessment.
This is an individual exercise and your attention is drawn to the College and University guidelines on
collaboration and plagiarism, which are available from the University of Exeter website (https://www.exeter.ac.uk/students/administration/complaintsandappeals/academicmisconduct/).
Important:
1. Do not change the name of this notebook and the containing folder. The notebook and the folder should respectively be named as CA.ipynb and CA.
2. Do not add or remove/delete any cell. You can work on a draft notebook and copy only the functions/implementations here.
3. Do not add your name or student code in the notebook or in the file name.
4. Each question asks for one or more functions to be implemented.
5. Each question is associated with appropriate marks and clearly specifies the marking criteria. Some of the questions have partial grading.
6. Each question specifies a particular type of inputs and outputs, which you must follow.
7. Each question specifies data that you can use for experimentation and testing.
8. A hidden unit test is going to evaluate whether the required function(s) meet all the desired properties.
9. If the test passes, all the associated marks will be awarded; if it fails, 0 marks will be awarded.
10. There is no restriction on the usage of any function from the packages from pip3 distribution.
11. While uploading your work on e-Bart, please do not upload the EXCV10 and MaskedFace datasets you use for training your model.
Question 1 (3 marks)
Write a function add_gaussian_noise(im, m, std) which will add Gaussian noise with mean m and standard deviation std to the input image im and will return the noisy image. Note that the output image must be of uint8 type and the pixel values should be normalized in [0, 255].
Inputs
im is a 3 dimensional numpy array of type uint8 with values in [0, 255].
m is a real number.
std is a real number.
Outputs
The expected output is a 3 dimensional numpy array of type uint8 with values in [0, 255].
Data
You can work with the image at data/books.jpg .
Marking Criteria
The output with a particular m and std should exactly match with the correct noisy image with that m and std to obtain the full marks. There is no partial marking for this question.
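A minimal NumPy sketch of one possible approach is below. Note that it assumes "normalized in [0, 255]" means clipping before the uint8 cast; the hidden test may expect a different normalization, so treat this as an illustration, not the graded solution.

```python
import numpy as np

def add_gaussian_noise(im, m, std):
    # Draw per-pixel Gaussian noise with mean m and standard deviation std
    noise = np.random.normal(m, std, im.shape)
    noisy = im.astype(np.float64) + noise
    # Assumption: "normalized in [0, 255]" = clip, then cast back to uint8
    return np.clip(noisy, 0, 255).astype(np.uint8)
```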
In [ ]:
# Gaussian noise
In [ ]:
In [ ]:
Question 2 (3 marks)
Speckle noise is multiplicative noise with a granular pattern; it is an inherent property of Synthetic Aperture Radar (SAR) imagery. More details on Speckle noise can be found here (https://en.wikipedia.org/wiki/Speckle_(interference)). Write a function add_speckle_noise(im, m, std) which will add Speckle noise with mean m and standard deviation std to the input image im and will return the noisy image. Note that the output image must be of uint8 type and the pixel values should be normalized in [0, 255].
Inputs
im is a 3 dimensional numpy array of type uint8 with values in [0, 255].
m is a real number.
std is a real number.
Outputs
The expected output is a 3 dimensional numpy array of type uint8 with values in [0, 255].
Data
You can work with the image at data/books.jpg .
Marking Criteria
The output with a particular m and std should exactly match with the correct noisy image with that m and std to obtain the full marks. There is no partial marking for this question.
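A possible sketch, assuming the common multiplicative formulation noisy = im + im * noise (and, as in the previous question, clipping as the normalization step, which the hidden test may or may not share):

```python
import numpy as np

def add_speckle_noise(im, m, std):
    # Multiplicative (speckle) noise: each pixel is perturbed by im * noise
    noise = np.random.normal(m, std, im.shape)
    noisy = im.astype(np.float64) + im.astype(np.float64) * noise
    # Assumption: clip to [0, 255] before casting back to uint8
    return np.clip(noisy, 0, 255).astype(np.uint8)
```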
In [ ]:
# Speckle noise
In [ ]:
Question 3 (2 marks)
Write a function cal_image_hist(gr_im) which will calculate the histogram of pixel intensities of a gray image gr_im . Note that the histogram will be a one dimensional array whose length must be equal to v+1 , where v is the maximum intensity value of gr_im .
Inputs
gr_im is a 2 dimensional numpy array of type uint8 with values in [0, 255].
Outputs
The expected output is a 1 dimensional numpy array of type int64 .
Data
You can play with the image at data/books.jpg .
Marking Criteria
The output should exactly match with the correct histogram of a given gray image gr_im to obtain the full marks. There is no partial marking for this question.
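One compact way to build such a histogram, using np.bincount so the output length is exactly v+1:

```python
import numpy as np

def cal_image_hist(gr_im):
    v = int(gr_im.max())
    # One bin per intensity level from 0 up to the maximum value v
    return np.bincount(gr_im.ravel(), minlength=v + 1).astype(np.int64)
```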
In [ ]:
# Image histogram
In [ ]:
Question 4 (3 marks)
Write a function compute_gradient_magnitude(gr_im, kx, ky) to compute gradient magnitude of the gray image gr_im with the horizontal kernel kx and vertical kernel ky .
Inputs
gr_im is a 2 dimensional numpy array of data type uint8 with values in [0, 255].
kx and ky are 2 dimensional numpy arrays of data type uint8 .
Outputs
The expected output is a 2 dimensional numpy array of the same shape as of gr_im and of data type
float64 .
Data
You can work with the image at data/shapes.png .
Marking Criteria
The output should exactly match with the correct gradient magnitude of a given gray image gr_im to obtain the full marks. There is no partial marking for this question.
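A pure-NumPy sketch is given below. The border handling (replicate padding, cross-correlation) is an assumption on my part; a reference implementation built on, say, cv2.filter2D could use a different border mode, so the exact values near image edges may differ.

```python
import numpy as np

def _filter2d(img, k):
    # Cross-correlation with replicate ("edge") padding; the padding mode
    # is an assumption and may differ from the grader's convention
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img.astype(np.float64), ((ph, ph), (pw, pw)), mode='edge')
    out = np.zeros(img.shape, dtype=np.float64)
    for i in range(kh):
        for j in range(kw):
            out += k[i, j] * padded[i:i + img.shape[0], j:j + img.shape[1]]
    return out

def compute_gradient_magnitude(gr_im, kx, ky):
    gx = _filter2d(gr_im, kx.astype(np.float64))
    gy = _filter2d(gr_im, ky.astype(np.float64))
    # Magnitude = sqrt(gx^2 + gy^2), elementwise
    return np.sqrt(gx ** 2 + gy ** 2)
```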
In [ ]:
# Image gradient magnitude
In [ ]:
Question 5 (2 marks)
Write a function compute_gradient_direction(gr_im, kx, ky) to compute direction of gradient of the gray image gr_im with the horizontal kernel kx and vertical kernel ky .
Inputs
gr_im is a 2 dimensional numpy array of data type uint8 with values in [0, 255].
kx and ky are 2 dimensional numpy arrays of data type uint8 .
Outputs
The expected output is a 2 dimensional numpy array of same shape as of gr_im and of data type
float64 .
Data
You can work with the image at data/shapes.png .
Marking Criteria
The output should exactly match with the correct gradient direction of a given gray image gr_im to obtain the full marks. There is no partial marking for this question.
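A sketch along the same lines as the magnitude question, using np.arctan2. It returns radians in (-pi, pi]; degrees would be an equally plausible convention, so check which one the hidden test expects. The padding mode in the helper is again an assumption.

```python
import numpy as np

def _filter2d(img, k):
    # Cross-correlation with replicate padding (assumed border handling)
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img.astype(np.float64), ((ph, ph), (pw, pw)), mode='edge')
    out = np.zeros(img.shape, dtype=np.float64)
    for i in range(kh):
        for j in range(kw):
            out += k[i, j] * padded[i:i + img.shape[0], j:j + img.shape[1]]
    return out

def compute_gradient_direction(gr_im, kx, ky):
    gx = _filter2d(gr_im, kx.astype(np.float64))
    gy = _filter2d(gr_im, ky.astype(np.float64))
    # Quadrant-aware angle of the gradient vector, in radians
    return np.arctan2(gy, gx)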
In [ ]:
# Image gradient direction
In [ ]:
Question 6 (8 marks)
Write a function detect_harris_corner(im, ksize, sigmaX, sigmaY, k) which will detect the corners in the image im . Here ksize is the kernel size for smoothing the image, sigmaX and sigmaY are respectively the standard deviations of the kernel along the horizontal and vertical direction, and k is the constant in the Harris criteria. Experiment with your corner detection function on the following image (located at data/shapes.png ):
Adjust the parameters of your function so that it can detect all the corners in that image. Please feel free to change the given default parameters and set your best parameters as default. You must not resize the above image, and note that the returned output should be an n × 2 array of type int64, where n is the total number of existing corner points in the image; each row of that n × 2 array should be a Cartesian coordinate of the form (x, y). Also please make sure that your function is rotation invariant, which is a fundamental property of the Harris corner detection algorithm.
Inputs
im is a 3 dimensional numpy array of type uint8 with values in [0, 255].
ksize is an integer number.
sigmaX is an integer number.
sigmaY is an integer number.
k is a floating point number.
Outputs
The expected output is a 2 dimensional numpy array of data type int64 of size n × 2. Each row of that array should be a Cartesian coordinate of the form (x, y).
Data
You can work with the image at data/shapes.png .
Marking Criteria
You will obtain full marks if your function can detect all the existing corners in the image while it is rotated to different angles. There is partial marking for this question, which will depend on the performance of the function on that image rotated to different angles.
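A self-contained NumPy sketch of the Harris pipeline follows: grayscale conversion, Sobel gradients, Gaussian-smoothed structure tensor, response R = det(M) - k * trace(M)^2, then thresholding plus 3 × 3 non-maximum suppression. The relative threshold 0.01 * R.max() and the NMS scheme are illustrative choices, not part of the specification, and will need tuning on data/shapes.png.

```python
import numpy as np

def _filter2d(img, k):
    # Cross-correlation with replicate padding (assumed border handling)
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(img.astype(np.float64), ((ph, ph), (pw, pw)), mode='edge')
    out = np.zeros(img.shape, dtype=np.float64)
    for i in range(kh):
        for j in range(kw):
            out += k[i, j] * padded[i:i + img.shape[0], j:j + img.shape[1]]
    return out

def _gaussian_1d(ksize, sigma):
    ax = np.arange(ksize, dtype=np.float64) - ksize // 2
    g = np.exp(-ax ** 2 / (2.0 * sigma ** 2))
    return g / g.sum()

def detect_harris_corner(im, ksize=5, sigmaX=3, sigmaY=3, k=0.01):
    gray = im.astype(np.float64).mean(axis=2)
    sobel = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
    Ix, Iy = _filter2d(gray, sobel), _filter2d(gray, sobel.T)

    # Smooth the structure-tensor entries with a separable Gaussian
    gx = _gaussian_1d(ksize, sigmaX)[np.newaxis, :]
    gy = _gaussian_1d(ksize, sigmaY)[:, np.newaxis]

    def smooth(a):
        return _filter2d(_filter2d(a, gx), gy)

    Sxx, Syy, Sxy = smooth(Ix * Ix), smooth(Iy * Iy), smooth(Ix * Iy)
    # Harris response R = det(M) - k * trace(M)^2
    R = (Sxx * Syy - Sxy ** 2) - k * (Sxx + Syy) ** 2

    # Keep strong responses that are also 3x3 local maxima
    pad = np.pad(R, 1, mode='constant', constant_values=-np.inf)
    local_max = np.ones(R.shape, dtype=bool)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy or dx:
                local_max &= R >= pad[1 + dy:1 + dy + R.shape[0],
                                      1 + dx:1 + dx + R.shape[1]]
    ys, xs = np.where((R > 0.01 * R.max()) & local_max)
    # Rows are (x, y) Cartesian coordinates, as the question requires
    return np.stack([xs, ys], axis=1).astype(np.int64)
```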
In [ ]:
# Harris corner detection
In [ ]:
In [ ]:
# This cell is reserved for the unit tests. Please leave this cell as it is.
In [ ]:
In [ ]:
In [ ]:
In [ ]:
In [ ]:
In [ ]:
Question 7 (6 marks)
Write a function compute_homogeneous_rotation_matrix(points, theta) to compute the rotation matrix in the homogeneous coordinate system that rotates a shape, depicted with 2 dimensional (x, y) coordinates points , by an angle theta in the anticlockwise direction about the center of the shape.
Inputs
points is a 2 dimensional numpy array of data type uint8 with shape n × 2. Each row of points is a Cartesian coordinate (x, y).
theta is a floating point number denoting the angle of rotation in degrees.
Outputs
The expected output is a 2 dimensional numpy array of data type float64 with shape 3 × 3 .
Data
You can work with the 2 dimensional numpy array at data/points.npy .
Marking Criteria
You will obtain full marks if your rotation matrix exactly matches the actual rotation matrix; otherwise you will get no marks. There is no partial marking for this question.
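The standard construction is translate-rotate-translate in homogeneous coordinates. In the sketch below, "center of the shape" is assumed to mean the centroid of points; a bounding-box center is another plausible reading, so verify against the expected matrix.

```python
import numpy as np

def compute_homogeneous_rotation_matrix(points, theta):
    # Assumption: the center of the shape is the centroid of the points
    cx, cy = points.astype(np.float64).mean(axis=0)
    t = np.deg2rad(theta)
    c, s = np.cos(t), np.sin(t)
    # Translate the center to the origin, rotate anticlockwise, translate back
    T1 = np.array([[1, 0, -cx], [0, 1, -cy], [0, 0, 1]], dtype=np.float64)
    R = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]], dtype=np.float64)
    T2 = np.array([[1, 0, cx], [0, 1, cy], [0, 0, 1]], dtype=np.float64)
    return T2 @ R @ T1
```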
In [ ]:
# Homogeneous rotation matrix
In [ ]:
Question 8 (5 marks)
Write a function compute_sift(im, x, y, feature_width) to compute a basic version of SIFT-like local features at the locations (x, y) of the RGB image im as described in the lecture materials and chapter 7.1.2 of the 2nd edition of Szeliski's book. The parameter feature_width is an integer representing the local feature width in pixels. You can assume that feature_width will be a multiple of 4 (i.e. every cell of your local SIFT-like feature will have an integer width and height). This is the initial window size you examine around each keypoint. Your implemented function should return a numpy array of shape n × 128, where n is the number of keypoints (x, y) input to the function.
Please feel free to follow all the minute details of the SIFT paper (https://www.cs.ubc.ca/~lowe/papers/ijcv04.pdf) in your implementation, but please note that your implementation does not need to match every detail to achieve a good performance. Instead, a basic version of SIFT is asked for in this exercise, which should achieve a reasonable result. The following three steps can be considered the basic ones: (1) a 4 × 4 grid of cells, each of width feature_width/4; "cell" is simply the terminology used in the feature literature to describe the spatial bins in which gradient distributions are described. (2) Each cell should have a histogram of the local distribution of gradients in 8 orientations. Appending these histograms together will give you 4 × 4 × 8 = 128 dimensions. (3) Each feature should be normalized to unit length.
Inputs
im is a 3 dimensional numpy array of data type uint8 with values in [0, 255].
x is a 2 dimensional numpy array of data type float64 with shape n × 1.
y is a 2 dimensional numpy array of data type float64 with shape n × 1.
feature_width is an integer.
Outputs
The expected output is a 2 dimensional numpy array of data type float64 with shape n × k, where k = 128 is the length of the SIFT feature vector.
Data
You can tune your algorithm/parameters with the image at data/notre_dame_1.jpg and interest points at data/notre_dame_1_to_notre_dame_2.pkl .
Marking Criteria
You will get full marks if your output is shape wise consistent with the expected output. This function will further be tested together with the feature matching function to be implemented in the next question. There is no partial marking for this question.
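The three basic steps can be sketched as below. This is heavily simplified relative to full SIFT: no Gaussian weighting of the window, no trilinear interpolation across bins, and no dominant-orientation normalization, all of which the full algorithm adds; the grayscale conversion by channel mean is also an assumption.

```python
import numpy as np

def compute_sift(im, x, y, feature_width=16):
    gray = im.astype(np.float64).mean(axis=2)
    gx = np.gradient(gray, axis=1)
    gy = np.gradient(gray, axis=0)
    mag = np.sqrt(gx ** 2 + gy ** 2)
    # Quantize gradient orientation into 8 bins over [-pi, pi)
    bins = (((np.arctan2(gy, gx) + np.pi) / (2 * np.pi)) * 8).astype(int) % 8

    half, cell = feature_width // 2, feature_width // 4
    feats = np.zeros((len(x), 128), dtype=np.float64)
    for i, (xi, yi) in enumerate(zip(x.ravel().astype(int),
                                     y.ravel().astype(int))):
        hist = np.zeros((4, 4, 8))
        for r in range(feature_width):
            for c in range(feature_width):
                rr, cc = yi - half + r, xi - half + c
                if 0 <= rr < gray.shape[0] and 0 <= cc < gray.shape[1]:
                    # Accumulate gradient magnitude into the cell's orientation bin
                    hist[r // cell, c // cell, bins[rr, cc]] += mag[rr, cc]
        v = hist.ravel()
        n = np.linalg.norm(v)
        feats[i] = v / n if n > 0 else v  # step (3): unit length
    return feats
```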
In [ ]:
# SIFT like features
In [ ]:
Question 9 (10 marks)
Write a function match_features(features1, features2, x1, y1, x2, y2, threshold) to implement the "ratio test" or "nearest neighbor distance ratio test" method of matching two sets of local features features1 at the locations (x1, y1) and features2 at the locations (x2, y2) as described in the lecture materials and in the chapter 7.1.3 of the 2nd edition of Szeliski's book.
The parameters features1 and features2 are numpy arrays of shape n × 128, each representing one set of features. x1 and x2 are two numpy arrays of shape n × 1 respectively containing the x-locations of features1 and features2 . y1 and y2 are two numpy arrays of shape n × 1 respectively containing the y-locations of features1 and features2 . threshold is another parameter that validates matches based on the ratio test explained in the lecture or in the book of Richard Szeliski (equation 7.18 in section 7.1.3). Your function should return two outputs: matches and confidences , where matches is a numpy array of shape k × 2, where k is the number of matches. The first column of matches is an index into features1 , and the second column is an index into features2 . confidences is a numpy array of shape k × 1 with the real-valued confidence for every match.
This function does not need to be symmetric (e.g. it can produce different numbers of matches depending on the order of the arguments). To start with, simply implement the "ratio test", equation 7.18 in section 7.1.3 of Szeliski. There are a lot of repetitive features in these images, and all of their descriptors will look similar. The ratio test helps us resolve this issue (also see Figure 11 of David Lowe's IJCV paper (https://www.cs.ubc.ca/~lowe/papers/ijcv04.pdf)). Please try to tune your SIFT descriptors and matching algorithm together to obtain a better matching score. You can use the images and correspondences below to tune your algorithm.
Inputs
features1 is a 2 dimensional numpy array of data type float64 with shape n1 × 128.
features2 is a 2 dimensional numpy array of data type float64 with shape n2 × 128.
x1 is a 2 dimensional numpy array of data type float64 with shape n1 × 1.
y1 is a 2 dimensional numpy array of data type float64 with shape n1 × 1.
x2 is a 2 dimensional numpy array of data type float64 with shape n2 × 1.
y2 is a 2 dimensional numpy array of data type float64 with shape n2 × 1.
threshold is a real number of data type float64 .
Outputs
matches is a 2 dimensional numpy array of data type int64 .
confidences is a 1 dimensional numpy array of data type float64 .
Data
You can tune your algorithm on the images at data/notre_dame_1.jpg and
data/notre_dame_2.jpg , and interest points at data/notre_dame_1_to_notre_dame_2.pkl and also on the images at data/mount_rushmore_1.jpg and data/mount_rushmore_2.jpg , and interest points at data/mount_rushmore_1_to_mount_rushmore_2.pkl . Note that the corresponding points within the pickle files are the matching points.
Marking Criteria
The marking will be based on matching accuracy obtained by the feature description and matching algorithm implemented by you respectively in the previous and this question. There are two test cases (5 marks each) with two different pairs of images and corresponding points, which are provided in the Data section. You will obtain 60% marks if your algorithm can obtain matching accuracy greater than or equal to 50%, 80% marks if your algorithm obtains 70% accuracy or more, and full marks if your algorithm secures 90% matching accuracy or more. You will not obtain any mark if your algorithm can not achieve 50% matching accuracy.
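The ratio test itself fits in a few lines. In the sketch below the confidence 1 - ratio is an illustrative choice (any monotonically decreasing function of the ratio would do), the locations are accepted but unused, and features2 is assumed to contain at least two descriptors so a second nearest neighbour exists.

```python
import numpy as np

def match_features(features1, features2, x1, y1, x2, y2, threshold=0.8):
    # Pairwise Euclidean distances between descriptor sets (n1 x n2)
    d = np.sqrt(((features1[:, None, :] - features2[None, :, :]) ** 2).sum(axis=2))
    matches, confidences = [], []
    for i in range(d.shape[0]):
        order = np.argsort(d[i])
        nn1, nn2 = d[i, order[0]], d[i, order[1]]
        ratio = nn1 / nn2 if nn2 > 0 else 1.0
        # Ratio test (Szeliski eq. 7.18): keep only unambiguous matches
        if ratio < threshold:
            matches.append([i, order[0]])
            confidences.append(1.0 - ratio)
    return (np.array(matches, dtype=np.int64).reshape(-1, 2),
            np.array(confidences, dtype=np.float64))
```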
In [ ]:
# Feature matching
In [ ]:
In [ ]:
In [ ]:
In [ ]:
In [ ]:
In [ ]:
2022-02-23