ENGN4528   Practice Exam Questions

S1 2021

College of Engineering and Computer Science

Practice Questions 2021

ENGN4528 Computer Vision

Question Booklet

Reading time: 15 minutes

Writing time: 1.5 hours

Uploading time: 15 minutes

(That is 2.0 hours in total)

There are 10 questions in total.

(Q1-Q10)

Please name your submission as

ENGN4528_exam_u1234567.docx

Q1: (21 marks) [3D SFM and Image formation question]

Answer the following questions concisely. Write down working, and if you are unsure about some part along the way, state your best assumption and use it for the remaining parts. Similarly, if you think some aspect is ambiguous, state your assumption and write the answer as clearly as you can.

(a) Consider two calibrated cameras, C1 and C2. C1 has a focal length of 500 in x and 375 in y (in pixel units), a resolution of 512x512, no skew, and its camera centre projects to the image at (249, 249). C2 has the same image resolution and focal length as C1, but its camera centre projects to the image at (251, 252). Write down the calibration matrices K1 and K2 for C1 and C2 respectively. (Hint: please only write down the final two 3x3 matrices.) [3 marks]

(b) Suppose that a 3D world coordinate system ((X, Y, Z) coordinates, as in the diagram below from the lecture notes) is aligned with the camera coordinate system of C1. More specifically, the world origin is at the camera centre of C1, the Z axis is aligned with the optical (principal) axis, and the X and Y world axes are parallel to the x and y axes of the image of C1. Write down the matrix K[R|t] that defines the projection of a point in the world coordinate system to the image of C1. (Hint: please only write down the final 3x4 matrix.) [3 marks]

(c) Suppose that the scene contains a point, P1, that lies at (39, 35, 100) in the world coordinate system defined above. Note that points in the world coordinate system are measured in cm. What location (to the nearest pixel) will P1 map to in the image of C1? [2 marks]
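The computation behind parts (a) to (c) can be sketched as follows. This is a minimal sketch, assuming the standard pinhole model x ~ K[R|t]X with zero skew and, because the world frame coincides with the camera frame of C1, R = I and t = 0. The helper names (calibration_matrix, matmul, project) are illustrative and do not come from the lecture notes.

```python
def calibration_matrix(fx, fy, cx, cy):
    """3x3 pinhole calibration matrix with zero skew (assumed model)."""
    return [[fx, 0.0, cx],
            [0.0, fy, cy],
            [0.0, 0.0, 1.0]]


def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def project(P, X):
    """Project a 3D point X with a 3x4 matrix P and return the pixel (x, y)."""
    Xh = list(X) + [1.0]  # homogeneous coordinates
    u, v, w = (sum(row[j] * Xh[j] for j in range(4)) for row in P)
    return u / w, v / w


K1 = calibration_matrix(500.0, 375.0, 249.0, 249.0)
K2 = calibration_matrix(500.0, 375.0, 251.0, 252.0)

# World frame == C1 camera frame, so [R|t] = [I|0].
Rt1 = [[1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0]]
P_C1 = matmul(K1, Rt1)

x, y = project(P_C1, (39.0, 35.0, 100.0))  # P1 in cm
print(round(x), round(y))  # 444 380
```

The units of P1 cancel in the division by depth, so it does not matter here that the point is given in cm.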

(d) Suppose that, with respect to the world coordinate system aligned with camera C1, camera C2 starts out aligned with C1 and is then rotated by 45 degrees about its vertical axis (Y axis) (as shown below). Subsequently, the centre of C2 is translated by 0.2 m to the left of C1 (along the X axis of C1), then moved forward by 0.2 m parallel to the optical axis of C1.

Write down the matrix K[R|t] that defines the projection of points in the world coordinate system (i.e., the same coordinate system as C1) to the image of C2. (Hint: please only write down the final 3x4 matrix.) [3 marks]

(e) What is the location (to the nearest pixel) that P1 maps to in the image of Camera C2? (Hint: Please write down only the final result.) [2 marks]
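One way to set up parts (d) and (e) is sketched below. Several conventions are assumptions here, because the figure is not reproduced in this text: the X axis points to the right (so "left" is negative X), "forward" is positive Z, 0.2 m is converted to 20 cm to match P1, and the 45 degree rotation is taken as positive about Y under the right-hand rule. If the figure shows the opposite turn, use rot_y(-45.0). The helper names are illustrative.

```python
import math


def rot_y(deg):
    """Rotation by `deg` degrees about the Y axis (right-hand rule)."""
    t = math.radians(deg)
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s],
            [0.0, 1.0, 0.0],
            [-s, 0.0, c]]


def projection_matrix(K, R_cam, C):
    """Build P = K [R | t], where R_cam is the camera's orientation in the
    world, R = R_cam^T maps world to camera, and t = -R C."""
    R = [[R_cam[j][i] for j in range(3)] for i in range(3)]
    t = [-sum(R[i][j] * C[j] for j in range(3)) for i in range(3)]
    Rt = [R[i] + [t[i]] for i in range(3)]
    return [[sum(K[i][k] * Rt[k][j] for k in range(3)) for j in range(4)]
            for i in range(3)]


def project(P, X):
    """Project a 3D point X with a 3x4 matrix P and return the pixel (x, y)."""
    Xh = list(X) + [1.0]
    u, v, w = (sum(row[j] * Xh[j] for j in range(4)) for row in P)
    return u / w, v / w


K2 = [[500.0, 0.0, 251.0],
      [0.0, 375.0, 252.0],
      [0.0, 0.0, 1.0]]

# Assumed camera centre of C2 in world coordinates (cm): 20 left, 20 forward.
C2_centre = [-20.0, 0.0, 20.0]

P_C2 = projection_matrix(K2, rot_y(45.0), C2_centre)
x2, y2 = project(P_C2, (39.0, 35.0, 100.0))
print(round(x2), round(y2))
```

The key step is independent of the sign conventions: the translation column is t = -RC, where C is the camera centre in world coordinates, not the displacement itself.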

(f) Define the term epipole. [2 points]

(g) For camera C1, there is an epipole (or epipolar point) that relates to camera C2. For this two-camera structure-from-motion setup, what is the position in camera C1 of the epipole associated with camera C2? (Hint: It is a point in the image coordinates of camera C1.) [2 points]

(h) A point P2 appears in camera C1 at image location (x1, y1) and in camera C2 at image location (x2, y2). How would you find the world coordinates of point P2? [4 points]
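One possible approach to part (h) is triangulation: back-project each pixel to a ray in world coordinates and find the point closest to both rays. The sketch below uses the midpoint method, which is only one of several options (a linear least-squares solve on the two projection equations is another). The second camera in the demo is a hypothetical side-by-side placement chosen for a simple round trip, not the pose described in (d), and all function names are illustrative.

```python
def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def back_project(pixel, fx, fy, cx, cy, R):
    """World-frame ray direction for a pixel; R is the world-to-camera
    rotation and the calibration matrix is assumed to have zero skew."""
    d_cam = [(pixel[0] - cx) / fx, (pixel[1] - cy) / fy, 1.0]
    # Rotate back to the world frame with R^T.
    return [sum(R[k][i] * d_cam[k] for k in range(3)) for i in range(3)]


def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the shortest segment between rays c1 + s*d1 and c2 + t*d2."""
    w = [p - q for p, q in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel; the depth cannot be recovered")
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = [ci + s * di for ci, di in zip(c1, d1)]
    p2 = [ci + t * di for ci, di in zip(c2, d2)]
    return [(u + v) / 2.0 for u, v in zip(p1, p2)]


I3 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# C1 at the world origin; hypothetical second camera 10 cm to its left.
c1, c2 = [0.0, 0.0, 0.0], [-10.0, 0.0, 0.0]

# Exact (unrounded) pixel coordinates of the point (39, 35, 100) in each view.
d1 = back_project((444.0, 380.25), 500.0, 375.0, 249.0, 249.0, I3)
d2 = back_project((496.0, 383.25), 500.0, 375.0, 251.0, 252.0, I3)

X = triangulate_midpoint(c1, d1, c2, d2)
print([round(v, 3) for v in X])  # approximately [39.0, 35.0, 100.0]
```

With noisy or rounded pixel coordinates the two rays no longer intersect exactly, which is why a closest-point or least-squares formulation is used rather than a direct intersection.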

Q2: (10 Marks) [Shape-from-X, Stereo]

(a) Shape-from-Shading approaches predict the brightness of an image pixel. Given a point light source at infinity (a distant light source), write down the equation that defines the brightness at an image pixel, assuming that the camera views a Lambertian surface. Please also define the terms of the equation. [2 marks]

(b) Suppose that we have used some other method to determine the brightness of the lighting, its direction, and the reflectance properties of the surface in the above scenario, but we only have intensity information for this particular pixel of the surface. What can we say about the surface orientation? [2 marks]
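The setting of parts (a) and (b) can be explored numerically. The sketch below assumes the common Lambertian model I = albedo * light_intensity * max(0, n . s), with n the unit surface normal and s the unit direction towards the distant light. The symbol and parameter names are assumptions and may differ from the notation in the lecture notes.

```python
import math


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return [c / length for c in v]


def lambertian_brightness(albedo, light_intensity, normal, light_dir):
    """Brightness of a Lambertian surface patch under a distant point light.
    The max(0, .) clamps patches that face away from the light."""
    n = normalize(normal)
    s = normalize(light_dir)
    cos_angle = sum(a * b for a, b in zip(n, s))
    return albedo * light_intensity * max(0.0, cos_angle)


light = [0.0, 0.0, 1.0]
tilt = math.radians(30.0)

# Two different normals, both tilted 30 degrees away from the light direction.
n_a = [math.sin(tilt), 0.0, math.cos(tilt)]
n_b = [0.0, math.sin(tilt), math.cos(tilt)]

i_a = lambertian_brightness(0.5, 2.0, n_a, light)
i_b = lambertian_brightness(0.5, 2.0, n_b, light)
print(i_a, i_b)
```

Both normals produce the same brightness because the model depends only on the angle between the normal and the light direction.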

(c) The images (a and b) shown below are the left and right images of an ideal stereo pair, taken with two identical cameras (A and B) mounted at the same horizontal level and with their optical axes parallel.

Draw a plan view (i.e., a top-down bird's-eye view) of the scene showing roughly what the spatial arrangement of the three objects is. Only relative (rather than accurate) positions are required.

Q3: (8 marks) [basic design problem]

Given below is a single node in a neural network. Suppose that d is 4, x = {2, 1, 2, 3}, w = {0.3, 0.4, 0.1, -0.4}, b = 0.1, and that the activation function is a standard ReLU, that is, f(x) = max(0, x), where x is the input to the activation function.

(a) What is the output of this node? [2 marks]
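For part (a), the arithmetic can be checked with a short script. The node figure is not reproduced in this text, so this sketch assumes the usual form: a weighted sum of the inputs plus the bias, passed through the ReLU. The function name node_output is illustrative.

```python
def relu(z):
    return max(0.0, z)


def node_output(x, w, b):
    """Single neuron, assumed form: ReLU(sum_i w_i * x_i + b)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return relu(z)


x = [2, 1, 2, 3]
w = [0.3, 0.4, 0.1, -0.4]
b = 0.1

# Pre-activation: 0.6 + 0.4 + 0.2 - 1.2 + 0.1 = 0.1, which is positive,
# so the ReLU passes it through unchanged.
out = node_output(x, w, b)
print(round(out, 6))  # 0.1
```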

(b) Describe the difference between recognition and detection in terms of how you would use a Deep Convolutional Network to solve each problem. [2 marks]

(c) Two cascaded 3x3 layers and a single 5x5 layer both result in the same number of pixels in the input image impacting the result. Why might you prefer one representation over the other? [2 marks]
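The premise of part (c) can be checked with a small calculation. This is a sketch under stated assumptions: stride 1, no dilation, the same number of input and output channels in every layer, and biases ignored. The channel count of 64 is an arbitrary illustrative value.

```python
def receptive_field(kernel_sizes):
    """Receptive field of stacked stride-1, undilated convolutions."""
    rf = 1
    for k in kernel_sizes:
        rf += k - 1
    return rf


def weight_count(kernel_sizes, channels):
    """Number of weights, assuming `channels` inputs and outputs per layer."""
    return sum(k * k * channels * channels for k in kernel_sizes)


C = 64  # illustrative channel count
for name, stack in (("two 3x3", [3, 3]), ("one 5x5", [5])):
    print(name, receptive_field(stack), weight_count(stack, C))
# two 3x3 5 73728
# one 5x5 5 102400
```

Under these assumptions the two stacks see the same 5x5 input window, while the cascaded version uses 18 weights per channel pair instead of 25 and places an additional non-linearity between the two layers.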

Q4: (2 marks) (questions with short answers) Given a dataset that consists of images of the Eiffel Tower, your task is to learn a classifier to detect the Eiffel Tower in new images. You implement PCA to reduce the dimensionality of your data, but find that your performance in detecting the Eiffel Tower drops significantly compared with your method applied to the original input data. Samples of your input training images are given in the following figures. Why is the performance suffering? [Hint: describe in two sentences.]

Q5: (10 marks) (algorithm design) Turn your phone into a GPS in an art museum or a library. GPS usually does not work well in an indoor environment. The goal of this algorithm is to localize your position from a few images taken around you in the museum. Please briefly describe the key steps of your method.