Fundamental of Computer Vision Problem Set 0
The Image Processing Pipeline
Instructions
1. Submissions: Save your report containing the written answers in a file named FirstName_LastName_PS0. Submit your code and input/output images in a zip file named FirstName_LastName_PS0_matlab.zip if using Matlab, or FirstName_LastName_PS0_python.zip if using Python. Do not create subdirectories within the main directory. If plots are required, you must include them in your report (PDF or Word file), and your code must display them when run. Points will be deducted for not following this requirement. The report must be submitted on Canvas. Hard copies will not be accepted.
2. Integrity and collaboration: Students may collaborate with other students, but each student needs to write and implement their own work. Please list the names of the students you discussed the assignment with in your report.
3. File paths: Make sure that any file paths you use are relative and not absolute. Example: Do not use imread('/name/Documents/subdirectory/ps1/data/xyz.jpg') but imread('../data/xyz.jpg'). Make sure your code is bug-free and works out of the box. Be sure to submit all main and helper functions. Points will be deducted if your code does not run out of the box.
The purpose of this assignment is to introduce you to Matlab as a tool for manipulating images. For this, you will build your own version of a very basic image processing pipeline. As we will see later in the class, the image processing pipeline is the sequence of steps that happens inside a camera in order to convert a RAW image (roughly speaking, the values measured by the camera sensor) into a regular 8-bit image that you can display on a computer monitor or print on paper. There is a “Hints and Information” section at the end of this document that is likely to help. Uniquely to this assignment, we also provide a solution in the directory of the problem set ZIP file, but you should really, really try to solve the assignment on your own before looking at the solution. You still need to upload your own solution on Canvas.
Throughout this problem, you will use the file banana_slug.tiff included in the ./data directory of the problem set ZIP file. This is a RAW image that was captured with a Canon EOS T3 Rebel camera. The original RAW file was slightly pre-processed in order to convert it to .tiff format. At the end of this assignment, the image should look something like what is shown in Figure 1. The exact result can vary greatly, depending on the choices you make in your implementation.
Figure 1: One possible rendition of the RAW image provided with the assignment.
Initials. Load the image into Matlab. Originally, it will be in the form of a 2D array of unsigned integers. Check and report how many bits per integer the image has, and what its width and height are. Then, convert the image into a double-precision array. (See help for functions imread, size, class and double.)
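As a rough sketch of this step, the snippet below writes a small synthetic 16-bit image to a temporary TIFF file and reads it back, so that it runs without the assignment data. The temporary file and its pixel values are made up for illustration; for the assignment you would call imread on the provided file instead.

```matlab
% Hypothetical stand-in for the RAW file: a tiny 16-bit image in a temp file.
tmp_file = [tempname '.tiff'];
imwrite(uint16([2047 8000 15000; 100 16383 0]), tmp_file);

im = imread(tmp_file);          % 2D array of unsigned integers
bit_type = class(im);           % integer class; 'uint16' means 16 bits per value
[height, width] = size(im);     % size returns rows (height) first, then columns (width)
im = double(im);                % convert to double precision for later arithmetic

fprintf('class: %s, width: %d, height: %d\n', bit_type, width, height);
delete(tmp_file);               % clean up the temporary file
```

Note that size reports the number of rows before the number of columns, so the height comes first.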
Linearization. The resulting 2D array is not a linear function of the true “energy” each pixel receives. It may have an offset due to dark noise (intensity values produced even if no light reaches the pixel), and saturated pixels due to overexposure. Additionally, even though the original data type of the image was 16 bits, only 14 of those carry meaningful information, meaning that the maximum possible value for pixels is 16383 (that is, 2^14 − 1). For the provided image file, you can assume the following: All pixels with a value lower than 2047 correspond to pixels that would be black, were it not for dark noise. All pixels with a value above 15000 are over-exposed pixels. (The values 2047 for the black level and 15000 for saturation are taken from the camera manufacturer.)
Convert the image into a linear array within the range [0, 1]. Do this by applying a linear transform (shift and scale) to the image, so that the value 2047 is mapped to 0, and the value 15000 is mapped to 1. Then, clip negative values to 0, and values greater than 1 to 1. (See help for functions min and max.)
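The shift, scale, and clip described above can be sketched as follows. The input array holds example values only; in the assignment, im would be the double-precision RAW image.

```matlab
% Example values only; in the assignment, im is the double-precision RAW image.
im = [1000 2047 8523.5; 15000 16383 5000];

black = 2047;         % black level given in the text
saturation = 15000;   % saturation level given in the text

im_linear = (im - black) / (saturation - black);   % shift and scale
im_linear = max(0, min(1, im_linear));             % clip to the range [0, 1]
```

Values below the black level end up at 0, values above the saturation level end up at 1, and everything in between is mapped linearly.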
Identifying the correct Bayer pattern. Most cameras do not capture true RGB images. Instead, they use a process called mosaicing, where each pixel captures only one of the three color channels. The most common spatial arrangement of pixels in terms of what color channel they capture is called the Bayer pattern, which says that in each 2 × 2 neighborhood of pixels, 2 pixels capture green, 1 pixel captures red, and 1 pixel captures blue measurements (see Figure 2). The same is true for the camera used to capture our RAW image: If you zoom into the image, you will see the 2 × 2 patches corresponding to the Bayer pattern.
Figure 2: The Bayer pattern.
However, we do not know how the Bayer pattern is positioned relative to our image: If you look at the top-left 2 × 2 image square, it can correspond to any of the four red-green-blue patterns shown in Figure 3.
Figure 3: From left to right: 'grbg', 'rggb', 'bggr', 'gbrg'.
Think of a way for identifying which version of the Bayer patterns applies to our image file, and report which version you identified. (See Hints and Information.)
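One possible heuristic, offered only as a sketch and not necessarily the intended answer: the two green positions in each 2 × 2 tile sit on a diagonal and see the same color, so their statistics should be similar. The snippet below uses a synthetic mosaic with a known 'rggb' layout (an assumption made purely for illustration) and checks which diagonal has the closer means. It does not tell red from blue; for that you would still need to inspect the final colors.

```matlab
% Synthetic 'rggb' mosaic for illustration: constant level per color site.
im = zeros(4, 6);
im(1:2:end, 1:2:end) = 0.60;   % red sites
im(1:2:end, 2:2:end) = 0.40;   % green sites
im(2:2:end, 1:2:end) = 0.40;   % green sites
im(2:2:end, 2:2:end) = 0.20;   % blue sites

% Mean of each of the four positions in the 2 x 2 tile.
m = [mean(mean(im(1:2:end, 1:2:end))), mean(mean(im(1:2:end, 2:2:end))); ...
     mean(mean(im(2:2:end, 1:2:end))), mean(mean(im(2:2:end, 2:2:end)))];

% The diagonal whose two means agree more closely is the likelier green pair.
main_diag_gap = abs(m(1, 1) - m(2, 2));
anti_diag_gap = abs(m(1, 2) - m(2, 1));
green_on_anti_diag = anti_diag_gap < main_diag_gap;
```

If green_on_anti_diag is true, the candidates narrow to 'rggb' and 'bggr'; otherwise they narrow to 'grbg' and 'gbrg'.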
White balancing. Before converting the mosaiced image into a proper RGB image, cameras first apply a step called white balancing. Roughly speaking, this involves changing the relative weights of the red, green, and blue measurements, to make sure that materials that are normally white also appear white in the image.
Figure 4: White balancing using the gray world assumption (left) and the white world assumption (right).
Two of the most common algorithms for white balancing use the so-called gray world and white world assumptions, and correspond to the transformations shown in Figure 4. Implement both gray world and white world white balancing. At the end of the assignment, check what the image looks like under both white balancing algorithms, and decide which one you like best. (See help for function mean.)
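The sketch below shows one common formulation of the two assumptions, applied directly to the mosaiced image: gray world scales red and blue so that their means match the green mean, and white world scales them so that their maxima match the green maximum. The transformations in Figure 4 are not reproduced in this text, so treat the normalization to green as an assumption, along with the 'rggb' layout and the synthetic pixel values.

```matlab
% Synthetic linear mosaic, assumed to be 'rggb' for illustration only.
im = [0.20 0.40 0.30 0.40; ...
      0.40 0.10 0.60 0.20; ...
      0.10 0.50 0.20 0.30; ...
      0.30 0.05 0.30 0.25];

r = im(1:2:end, 1:2:end);                            % red sites
g = [im(1:2:end, 2:2:end), im(2:2:end, 1:2:end)];    % both green sub-grids
b = im(2:2:end, 2:2:end);                            % blue sites

% Gray world: scale red and blue so their means match the green mean.
im_gw = im;
im_gw(1:2:end, 1:2:end) = r * (mean(g(:)) / mean(r(:)));
im_gw(2:2:end, 2:2:end) = b * (mean(g(:)) / mean(b(:)));

% White world: scale red and blue so their maxima match the green maximum.
im_ww = im;
im_ww(1:2:end, 1:2:end) = r * (max(g(:)) / max(r(:)));
im_ww(2:2:end, 2:2:end) = b * (max(g(:)) / max(b(:)));
```

The green sites are left untouched in both variants, so only the relative weights of red and blue change.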
Demosaicing. After white balancing, you want to demosaic the image. This means that you want to convert the partial red, green, and blue color channels you have available because of mosaicing into three full-resolution color channels. Use bilinear interpolation for demosaicing, as shown in Figure 5. Do not implement bilinear interpolation manually! Instead, read the documentation to learn how to use Matlab's interp2 function for this purpose.
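As a minimal sketch of how interp2 can fill in one channel, the snippet below interpolates only the red samples of a synthetic mosaic. The 'rggb' layout and the test image (each pixel equals x + 10*y, which bilinear interpolation reproduces exactly) are assumptions chosen for illustration. The green channel needs extra care because its samples lie on two separate sub-grids; that part is left out here.

```matlab
% Synthetic mosaic where every pixel equals x + 10*y; 'rggb' is assumed.
h = 6;
w = 6;
[xq, yq] = meshgrid(1:w, 1:h);        % full-resolution pixel coordinates
im = xq + 10 * yq;

% In an 'rggb' layout, red samples sit at odd rows and odd columns.
[xr, yr] = meshgrid(1:2:w, 1:2:h);    % coordinates of the red samples
red_samples = im(1:2:end, 1:2:end);

% Bilinear interpolation of the sparse red samples onto the full grid.
red_full = interp2(xr, yr, red_samples, xq, yq, 'linear');
```

With the 'linear' method, interp2 returns NaN for query points outside the sampled grid, which here is the last row and the last column. Those border pixels need separate handling, for example by passing an extrapolation value as an extra argument to interp2.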
Brightness adjustment and gamma correction (20 points). You now have a 16-bit, full-resolution, linear RGB image. Because of the scaling you did at the start of the assignment, the image pixel values may not be in a range appropriate for display. As a result, when displaying the image, it will appear very dark.
Figure 5: Demosaicing is the process of filling in gaps in the three color channels to obtain a full-resolution RGB image. This can be done by performing bilinear interpolation in each of the partial color channels measured by the sensor.
Brighten the image by linearly scaling it by some number. Select the scale as a percentage of the pre-brightening maximum grayscale value. (See help for rgb2gray for converting the image to grayscale.) The correct percentage is highly subjective, so you should experiment with many different percentages, then report and use the percentage that looks best to you.
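The instruction above can be read in more than one way. The sketch below takes one reading, stated here as an assumption: choose the scale so that the brightest grayscale value before brightening lands at a chosen fraction pct of full scale, then clip to [0, 1]. The image values and the choice pct = 0.5 are illustrative only.

```matlab
% Small synthetic linear RGB image (values are illustrative).
im_rgb = cat(3, [0.10 0.05; 0.02 0.08], ...
                [0.20 0.10; 0.04 0.16], ...
                [0.05 0.02; 0.01 0.04]);

pct = 0.5;                             % hypothetical choice; tune by eye
im_gray = rgb2gray(im_rgb);            % grayscale version of the linear image
scale = pct / max(im_gray(:));         % brightest gray value maps to pct
im_bright = min(1, im_rgb * scale);    % scale linearly, then clip to [0, 1]
```

Larger values of pct give a brighter image but push more pixels into clipping at 1.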
Before having an image that can be properly displayed, the last step you need to do is tone reproduction, also known as gamma correction. This is a non-linear transformation of intensity values, which roughly speaking is used to better match sensor measurements to the response function of common displays. For this, implement the following non-linear transform, then apply it to the image:
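$$C_{\text{non-linear}} = \begin{cases} 12.92\, C_{\text{linear}}, & C_{\text{linear}} \le 0.0031308 \\ (1 + 0.055)\, C_{\text{linear}}^{1/2.4} - 0.055, & C_{\text{linear}} > 0.0031308 \end{cases} \qquad (1)$$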
where C = {R, G, B} is each of the red, green, and blue channels. This function may look completely arbitrary, but it comes from the sRGB standard. It is a good default choice if the camera’s true gamma correction curve is not known. Please read about sRGB on Wikipedia.
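A minimal sketch of the sRGB transfer function, applied element-wise: values at or below 0.0031308 are multiplied by 12.92, and larger values are mapped to 1.055 C^(1/2.4) − 0.055. The constants are those of the sRGB standard; the function name srgb_gamma and the input values are made up for illustration, and the input is assumed to lie in [0, 1].

```matlab
% sRGB tone curve, applied element-wise; input assumed to lie in [0, 1].
srgb_gamma = @(c) (c <= 0.0031308) .* (12.92 * c) + ...
                  (c >  0.0031308) .* (1.055 * c .^ (1 / 2.4) - 0.055);

im_linear = [0 0.002 0.0031308; 0.2 0.5 1];   % illustrative linear values
im_display = srgb_gamma(im_linear);           % gamma-corrected values
```

Because the function works element-wise, the same call applies to a full H × W × 3 image, which covers the R, G, and B channels at once.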
Compression. Finally, it is time to store the image, either with or without compression. Use the imwrite command to store the image in .PNG format (lossless, so it serves as the uncompressed reference here), and also in .JPEG format with quality setting 95. This setting determines the amount of compression. Can you tell the difference between the two files? The compression ratio is the ratio between the size of the uncompressed file (in bytes) and the size of the compressed file (in bytes). What is the compression ratio?
By changing the JPEG quality settings, determine the lowest setting for which the compressed image is indistinguishable from the original. What is the compression ratio?
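The file sizes can be compared programmatically, as in the sketch below. It writes a synthetic gradient image to temporary files so that it runs on its own; the image and file names are placeholders, and in the assignment you would write your final gamma-corrected image instead.

```matlab
% Illustrative image; in the assignment this is the final gamma-corrected result.
im_out = repmat(linspace(0, 1, 64), [64 1 3]);

png_file = [tempname '.png'];
jpg_file = [tempname '.jpg'];
imwrite(im_out, png_file);                   % PNG: lossless
imwrite(im_out, jpg_file, 'Quality', 95);    % JPEG with quality setting 95

png_info = dir(png_file);                    % dir reports the file size in bytes
jpg_info = dir(jpg_file);
compression_ratio = png_info.bytes / jpg_info.bytes;
fprintf('PNG: %d bytes, JPEG: %d bytes, ratio: %.2f\n', ...
        png_info.bytes, jpg_info.bytes, compression_ratio);

delete(png_file);
delete(jpg_file);
```

Repeating the JPEG write with different Quality values gives the sizes needed to find the lowest acceptable setting.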
Hints and Information
• To get help on a particular function in Matlab, type help <function> or doc <function>. To get a list of functions related to a particular keyword, use the lookfor function. We will be making extensive use of the Image Processing Toolbox, and a list of the functions in that toolbox is generated by typing help images. Print your results (to a file) using the print command, and save space by using subplot whenever possible. As an example, the following Matlab script loads three images, displays them in a figure, and prints the figure to a PNG file.
% read three images from current directory
im1 = imread('image1.tiff');
im2 = imread('image2.tiff');
im3 = imread('image3.tiff');
% display images in a figure, side-by-side
figure;                           % create a new figure
subplot(1, 3, 1); imshow(im1);    % display an image
title('Image 1');                 % add a title
subplot(1, 3, 2); imshow(im2);
title('Image 2');
subplot(1, 3, 3); imshow(im3);
title('Image 3');
% print the displayed figure to a PNG file. You can also print from the figure menubar.
print -dpng output.png
• You will find it very helpful to display intermediate results while you are implementing the image processing pipeline. However, before you apply the brightening and gamma correction, you will find that displayed images will look completely black. To be able to see something more meaningful, you can use the following command to display an intermediate image im_intermediate.
figure; imshow(min(1, im_intermediate * 5));
For example, Figure 6 shows what you should see using this command after the linearization step.
Figure 6: Left: The linear RAW image (brightness increased by 5). Right: Crop showing the Bayer pattern.
• The colon operator : lets you form arrays out of subsets of other arrays. In the following example, given an original image im, it creates three other images, each with only one-fourth of the pixels of the original. You can also use the function cat to combine these three images into a single 3-channel RGB image.
% create three sub-images of im as shown in figure below
im1 = im(1:2:end, 1:2:end);
im2 = im(1:2:end, 2:2:end);
im3 = im(2:2:end, 1:2:end);
% combine the above images into an RGB image, such that im1 is the red,
% im2 is the green, and im3 is the blue channel
im_rgb = cat(3, im1, im2, im3);
2022-05-23