闪电代写 -代写CS作业_CS代写_Finance代写_Economic代写_Statistics代写_代码代做_IT代写_加急帮助

Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit

ST 114: Project 1

2022

Instructions:

In this project, you will write functions for smoothing time series data. A Python ﬁle smoothers.py has been provided for you to complete. Do not change the docstring! You will need to edit the ﬁle and submit it to gradescope.

● You may work together, but you must list all collaborators in your smoothers.py ﬁle.

● Again, do not change the docstring!

● DO NOT COPY AND PASTE EACH OTHER’S CODE! You’ll learn better if you write your code on your own.

● If you use any resources (online or other), be sure to list them in your smoothers.py ﬁle.

Assignment:

1. Complete the function sort data that takes in a tuple data whose ﬁrst element is a list of x coordinates and whose second element is a list of y coordinates. The function should return a copy of the input tuple sorted in non-decreasing order with respect to the x coordinates of the input tuple.

Below are some example evaluations of the function.

>>> sort data(([5, 1, 7], [1, 2, 3]))

([1, 5, 7], [2, 1, 3])

>>> sort data(([2, 4, 8], [1, 2, 3]))

([2, 4, 8], [1, 2, 3])

>>> sort data(([11, 4, -5], [1, 2, 3]))

([-5, 4, 11], [3, 2, 1])

Hint: Use the numpy function argsort to get the indices that would sort the x coordinates, e.g., the indices that would sort x = [5, 1, 7] in ascending order would be [1, 0, 2] since x[1]

< x[0] < x[2].

2. Complete the function find index that takes in a list x and a number x new and returns the small- est index i such that x[i] < x new < x[i+1]. Assume that i) there are no duplicated values in x. ii) x is sorted in ascending order, and iii) x new is a number between min(x) and max(x), inclusive. Below are some example evaluations of the function when x = [1, 5, 7, 9].

>>> find_index(x, 1)

>>> find_index(x, 2)

>>> find_index(x, 6)

>>> find_index(x, 7)

>>> find_index(x, 8)

>>> find_index(x, 9)

3. Complete the function linear predict that takes in two parameters data and x new.

● data is a tuple of two items. The ﬁrst item is a list of x coordinates which are numbers, and the second item is a list of y coordinates which are also numbers. Assume that data[0] is sorted in ascending order with no duplicated values.

● x new is a number between min(data[0]) and max(data[0]), inclusive.

● The function returns the linearly interpolated value y new at the point x new:

yi+1 x yi

xi+1 x xi

where xi = data[0][i], yi = data[1][i] and i is the smallest index such that xi < x new < xi+1. For example, suppose data = ([0, 5, 10],[1, 7, -5]) and x new = 2. Then linear predict(data, x new) produces the value y new = 3.4. In the plot below, the blue solid circles are the points (data[0][i], data[1][i]) for i = 0, 1, 2. The red solid circle is the point (x new, y new) = (2, 3.4).

As another example, suppose data = ([0, 5, 10], [1, 7, -5]) and x new = 9. Then linear predict(data, x new) produces the value y new = -2.6. In the corresponding plot, the blue solid circles are the points (data[0][i], data[1][i]) for i = 0, 1, 2. The red solid circle is the point (x new, y new) = (9, -2.6).

4. Complete the function linear interpolate that takes in two parameters data and x new list.

● data is a tuple of two items. The ﬁrst item is a list of x coordinates which are numbers, and the second is a list of y coordinates which are numbers. Assume that data[0] is sorted in ascending order.

● x new list is a list whose values are numbers betweenmin(data[0]) and max(data[0]) inclusive.

● The function returns a list whose items are the linearly interpolated values at the points in x new list.

For example, suppose data = ([0, 5, 10], [1, 7, -5])and x new list = [2, 9]. Then linear interpolate(data, x new list) should return the list [3.4, -2.6]. The red solid circles are the interpolated data points (2, 3.4) and (9, -2.6).

5. Complete the function kernel predict that takes in three parameters data, x new, and h.

● x new is a number between min(data[0]) and max(data[0]) inclusive.

● h is a positive number.

● The function returns the following predicted value y new at the point x new:

n_1

ynew = 匕 wiyi

i=0

where n = len(data[0]), xi = data[0][i], yi = data[1][i]. The numbers wi are nonnegative weights that add up to 1. Thus, y new is a weighted average of the y i. To get the weights, you will use a Gaussian kernel function Kh(u, v) to compute weights where:

Kh(u, v) = exp ╱ x 、

Here, the weight wi is given by

Kh(xnew, xi)

j=0 Kh(xnew, xj)

For example, suppose data = ([0, 5, 10], [1, 7, -5]), x new = 2, and h = 6. Then kernel predict(data, x new, h) requires computing

K6(2, 0) = exp ╱ x 、 = 0.7165313

K6(2, 5) = exp ╱ x 、 = 0.4723666

K6(2, 10) = exp ╱ x 、 = 0.0048279

to compute the weights

w0 = w1 =

w2 =

K6(2, 0)

K6(2, 0) + K6(2, 5) + K6(2, 10)

K6(2, 5)

K6(2, 0) + K6(2, 5) + K6(2, 10)

K6(2, 10)

K6(2, 0) + K6(2, 5) + K6(2, 10)

s 0.6002478

s 0.3957077

s 0.0040444

which are then used to compute a weighted average

ynew = w0 · 1 + w1 · 7 + w2 · x5 s 3.3499799.

Note: This weighted average is called the Nadaraya-Watson estimator.

6. Complete the function kernel smooth that takes in three parameters: data, x new list, and h.

● x new list is a list whose values are numbers betweenmin(data[0]) and max(data[0]) inclusive

● h is a positive number.

● The function returns a list whose items are the kernel predicted values at the points in x new list

For example, suppose data = ([0, 5, 10], [1, 7, -5]), x new list is a list of 30 equally spaced points between 0 and 10, and h = 6. In the plot below, the red solid circles are the kernel predicted values at those 30 equally spaced points using bandwidth parameter h = 6.

7. Complete the function knn predict that takes in three parameters: data, x new, and k.

● data is a tuple of two items. The ﬁrst item is a list of x coordinates which are numbers, and the second item is a list of y coordinates, which are also numbers. Assume that data[0] is sorted in ascending order with no duplicated values.

● x new is a number between min(data[0]) and max(data[0]) inclusive.

● k is a positive integer.

● The function returns the following predicted value y new at the point x new:

ynew = 匕 yi ,

ieN怎(xnew)

whereyi = data[1][i]. The set Nk(xnew) is the set of indices of the k items in the x coordi- nates data[0] that are closest to x new. Thus, y new is the average of the y values of the k nearest neighbors to x new. The acronym knn stands for k-nearest neighbors.

For example, suppose data = ([0, 5, 10, 15], [1, 7, -5, 11]). Let’s see what Nk(xnew) and ynew are for a few different choices for x new and k.

● If x new = 2 and k = 2 then

N2(2) = [0, 1] and ynew = = 4.

● If x new = 2 and k = 3 then

N3(2) = [0, 1, 2] and ynew = = 1.

● If x new = 8 and k = 2 then

N2(8) = [1, 2] and ynew = = 1.

● If x new = 8 and k = 3 then

N3(8) = [1, 2, 3] and ynew = s 4.33333.

8. Complete the function knn smooth that takes in three parameters: data, x new list, and k.

● x new list is a list whose values are numbers betweenmin(data[0]) and max(data[0]) inclusive.

● k is a positive integer.

● The function returns a list whose items are the knn predicted values at the points in x new list using number of neighbors parameter k.

For example, suppose data = ([0, 5, 10, 15], [1, 7, -5, 11]). In the plot below, the blue solid circles are the points (data[0], data[1]). The red solid circles are the knn predicted values at 30 equally spaced points between 0 and 10 using a neighbor parameter of k = 2.

9. Let x1 , . . . , xn be n numbers stored in the list x. The median of x, denoted median(x), is the “middle value” of the numbers x in the sense that half of the numbers in x are less than the median and the other half of the numbers in x are greater than the median. Let y denote a copy of x sorted in

ascending order. Then

median(x) = ,

y │ x 1┐ y [ 垃(ì) _1]+y [ 垃(ì)]

if n is odd.

if n is even.

Complete the function my median with the following speciﬁcations:

● It takes in one parameter, x, which is a list of numbers.

● It returns median(x) as a ﬂoat.

● Do not alter x. Evaluating your function my median(x) should not change the value of the list x.

● Do not use Python’s built-in median function. You will get zero credit on this problem if you do.

10. Complete the function median filter that takes in two parameters, y and W.

● y is a list of y coordinates which are numbers.

● W is a positive integer.

● The function returns a list where the ith element in y is amapped to a median value, as a ﬂoat, over a sliding window containing y[i] speciﬁcally.

,．median(y[i-W], . . . , y[i-1], y[i], y[i+1], . . . , y[i+W]) y[i] 1- ．median(y[0],y[1], . . . , y[i-1], y[i], y[i+1], . . . , y[i+W])

．．median(y[i-W], . . . , y[i-1], y[i], y[i+1], . . . , y[n-2], y[n-1])

if i - W > 0and i + W < len(y) if i - W < 0and i + W < len(y) if i - W > 0and i + W > len(y)

The basic idea of the median ﬁlter is that it replaces the ith point with the median of the points in a window of size 2 ·W around the ith point. (Note: The window has 2 ·W items in the ﬁrst case above: the window is smaller for the second and third cases.) An example function call on an input list of 7 items is given below.

>>> y = [-1, 6, 7, -2, 0, 8, 13]

>>> z = median_filter(y, 1)

>>> z

[2.5, 6.0, 6.0, 0.0, 0.0, 8.0, 10.5]

Below is the assignments that median filter(y, 1) makes to each slot in z.

z[0] = median([-1, 6])

z[1] = median([-1, 6, 7])

z[2] = median([6, 7, -2])

z[3] = median([7, -2, 0])

z[4] = median([-2, 0, 8])

z[5] = median([0, 8, 13])

z[6] = median([8, 13])

The plot below shows a longer list y of 50 elements (daily Air Quality Index measurements) plotted against index values 0, 1, 2, . . . , 49 as blue solid circles. The plot also shows the items in the list median filter(y, 1) plotted against the index values 0, 1, 2, . . . , 49 as semitransparent brown solid circles. We can see that the median ﬁlter “removes” unusual or outlying observations.

The plot below shows the same list y of 50 elements (daily Air Quality Index measurements) plotted against index values 0, 1, 2, . . . , 49 as blue solid circles. The plot also shows the items in the list median filter(y, 6) plotted against the index values 0, 1, 2, . . . , 49 as semitransparent brown solid circles. We can see that the broader median ﬁlter smooths the data to a much greater extent.

Note: Air Quality Index data from https://archive.ics.uci.edu/ml/datasets/Air+ Quality

11. In this problem, you’ll put everything together! Complete the function smoother that takes in two parameters data and x new list that the user must specify and ﬁve default parameters: method, filter flag, W, h, and k.

● data is a tuple of two items. The ﬁrst item is a list of x coordinates which are numbers, and the second item is a list of y coordinates which are also numbers. Assume that data[0] has no duplicated values. Do not assume that the tuple data is sorted in non-decreasing order with respect to the x coordinates of the input tuple data[0].

● x new list is a list whose values are numbers betweenmin(data[0]) and max(data[0]) inclusive.

● method is a string that is ’interpolate’ , ’kernel’ , or ’knn’ . The default value of method is ’interpolate’ .

● filter flag is a Boolean with a default value of False.

● W is a positive integer with a default value of 5.

● h is a positive number with a default value of 1.

● k is a positive integer with a default value of 1.

● If method has a value other than ’interpolate’ , ’kernel’ , or ’knn’ , the function prints the string ’Unsupported method.’ and returns None

● If method is ’interpolate’ and filter flag is False, the function returns the list of linearly interpolated values over the points x new list.

● If method is ’interpolate’ and filter flag is True, the function returns the list of linearly interpolated values over the points x new list after applying median filter on the y values in data with window parameter W.

● If method is ’kernel’ and filter flag is False, the function returns the list of kernel smoothed values over the points x new list with bandwidth parameter h.

● If method is ’kernel’ and filter flag is True, the function returns the list of ker- nel smoothed values over the points x new list with bandwidth parameter h after applying median filter on the y values in data with window parameter W.

● If method is ’knn’ and filter flag is False, the function returns the list of knn smoothed values over the points x new list with nearest neighbor parameter k.

● If method is ’knn’ and filter flag is True, the function returns the list of knn smoothed

values over the points x new list with nearest neighbor parameter k after applying median filter on the y values in data with window parameter W.

12. The variable AQI data is a tuple whose ﬁrst element is a list of x coordinates and whose second el- ement is a list of y coordinates. The y coordinates are the Air Quality Index values (CO concentration per hour) per day in an Italian city. The x coordinates are days, and, in order, they take the values 1-50. Create plots under the speciﬁed scenarios below using an x new list of 100 equally spaced points between 1 and 50. For all scenarios, plot the original data points in AQI data as black squares with markers of size 5.

Scenarios:

(a) Without median ﬁltering.

● Linear interpolation (green circles with dashed line)

● A kernel smoother with h = 0.1, 1, and 10. Overlay all three kernel smoothers on the same plot. Use a solid blue line for h = 0.1, solid green line for h = 1, and solid magenta line for h = 10.

● A knn smoother with k = 1, 5, and 15. Overlay all three knn smoothers on the same plot.

Use a solid blue line for k = 1, solid green line for k = 5, and solid magenta line for k = 15. (b) With median ﬁltering (W = 5).

● Linear interpolation (green circles with dashed line)

● A kernel smoother with h = 0.1, 1, and 10. Overlay all three kernel smoothers on the same plot. Use a solid blue line for h = 0.1, solid green line for h = 1, and solid magenta line for h = 10.

● A knn smoother with k = 1, 5, and 15. Overlay all three knn smoothers on the same plot. Use a solid blue line for k = 1, solid green line for k = 5, and solid magenta line for k = 15.

13. Write 5-8 sentences in a block comment at the bottom of your code that reﬂect on the project you have completed. Here are some questions to consider in your answer:

● What did you learn that was particularly interesting?

● What result did you get that was surprising to you?

● Which of the smoothers do you like the most? the least?

● What are you most proud of yourself for doing in this project?