Econ 117

Intro to Data Analysis and Econometrics

Midterm Exam — Spring 2020

1. Rules

● You can use any notes you have written or printed on both sides of a single sheet of paper.

● You can use a calculator, but it cannot have communication or internet capabilities.

● You must have your mobile phone put away for the entire exam.

● If you leave to use the restroom, do not take your phone with you.

● The exam will last 75 minutes. Do not open the exam until asked to do so.

● There are 5 questions totaling 75 points on the exam (1 point per minute).

● Somewhat challenging questions are labeled with “*” and challenging questions are labeled with “***”. We recommend doing harder questions last.

2. Before the exam

● Write your name on all of your blue books (you need 4).

● Label books “Q1 & Q2”, “Q3”, “Q4”, and “Q5”.

● Answer the questions in their corresponding blue books.

3. During the exam

● Inside each book, letter each sub-question clearly.

● Please write neatly and show your work. If you show your work completely, but do not calculate the final numeric answer, you will still receive full points.

● You do NOT need to show work on the TRUE or FALSE questions unless the problem states that you need to justify your answer. Only answer TRUE or FALSE (please write out the whole word).

● Make sure your final answer is clearly marked.

● If you need to make assumptions, please clearly state them in your answer.

4. When the exam ends

● Stop writing immediately when the exam ends.

● Hand in both this exam and your blue books.

● If you borrowed a calculator, hand that in, too.

Question 1. True or False [Book 1, 10 points]

For the questions below, please answer whether each statement is TRUE or FALSE. You do not need to show work or justify your answer. Please write out the whole word “True” or “False” and not “T” or “F”.

(a) p-values cannot exceed 1.

(b) Var(X + Y) is always greater than Var(X − Y).

(c) … variance, then they have the same probability density function (PDF).

(d) Consider a random sample of N > 1 independent and identically distributed observations, all of which have expected value µ and variance σ². The variance of the sample mean will always be strictly smaller than σ², for any value of σ² > 0 and any value of µ.

(e) You compute a 95% confidence interval for the mean hourly wage in the US. Your confidence interval is [20, 40]. This means that with 0.95 probability, µ = $30.

(f) The probability mass function (PMF) of a discrete random variable must always take on values > 0 and < 1.

(g) For any two independent events A and B , Pr(A|B) = Pr(A).

(h) For the normal distribution, both the mean and the median occur at the same point.

(i) For any two events A and B with Pr(B) > 0, Pr(A ∩ B) < Pr(A|B).

(j) You work at a firm and are worried about being paid fairly. The CEO sends an email disclosing the average salary in the firm. Equipped with this information (and knowing your own salary) you can determine whether the majority of your coworkers (more than half) earn more than you do.

Question 2. Short Answer [Book 1, 24 points]

(a) Let Y be a continuous random variable distributed according to the probability density function (PDF) f_Y(y).

i. True or False (no justification needed):

A. Pr(Y = 15) = f_Y(15)

B. Pr(10 < Y < 30) = f_Y(30) − f_Y(10)

C. It is always the case that 0 < f_Y(y) < 1 for all values of y.

D. The area under the PDF must be equal to 1 (i.e., ∫_{−∞}^{∞} f_Y(y) dy = 1).

ii. Suppose that the cumulative distribution function (CDF) of Y is given by

F_Y(y) = 0 for y < 10

F_Y(y) = … for y ∈ [10, 100]

F_Y(y) = 1 for y > 100

A. What is the probability that Y is greater than 30 but less than or equal to 70?

B. * What is the median of Y?

(b) Suppose we have two random variables X and Y and we know that the correlation between X and Y is cor(X, Y) = 0.8. We additionally know E[X] = 2, E[Y] = 10, var(X) = 4, var(Y) = 4.

i. Define Z = X − Y. What is Var(Z)?

ii. Now suppose we have a random independent and identically distributed sample of X and Y: {(X_1, Y_1), ..., (X_N, Y_N)}, with N = 100. How would I use the sample to calculate an estimator of the population mean of Z? (provide a formula)

iii. How would I use the sample to calculate an estimator of the population variance of Z? (provide a formula)

iv. True or False: The law of large numbers tells us that E[X̄_N] will converge in probability to the true sample mean of 2 (justify your answer).

v. What distribution does the central limit theorem tell us √N (X̄_N − 2) will converge to (your answer should also include the mean and the variance)?
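The estimators asked about in part (b) can be tried out on simulated data. The following is only a minimal sketch in R: the vectors `u` and `v` and their parameters are made up for illustration and are not the X and Y of the question. It shows a sample mean and a sample variance of a difference, and checks the identity var(U − V) = var(U) + var(V) − 2 cov(U, V) on the sample.

```r
set.seed(1)
n <- 100

# Hypothetical correlated draws (assumed parameters, not the exam's X and Y).
u <- rnorm(n, mean = 1, sd = 3)
v <- 0.5 * u + rnorm(n, mean = 0, sd = 1)
d <- u - v

d_bar <- mean(d)  # sample mean: sum(d) / n
d_var <- var(d)   # sample variance: sum((d - d_bar)^2) / (n - 1)

# The variance of a difference matches the covariance expansion.
all.equal(d_var, var(u) + var(v) - 2 * cov(u, v))
```

Note that R's `var()` and `cov()` both divide by n − 1, which is why the identity holds exactly on the sample and not only in the population.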

(c) Suppose I define the variable x in R as x <- c(-10, 1, 2, 3, NA, 12) and y as y <- 1:6. Report what would be returned by each of the following commands (note, this does not need to be formatted as R code or output).

i. mean(x)

ii. cor(x[2:4], y[1:3])

iii. y > 3

iv. x[y < 3]

v. pnorm(0)
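The kinds of R behavior that part (c) relies on (missing values, element-wise comparison, logical indexing, the normal CDF) can be illustrated with a short sketch. The vectors `a` and `b` below are hypothetical and deliberately differ from the x and y in the question.

```r
# Hypothetical vectors, chosen to differ from the ones in the question.
a <- c(4, NA, 6, 8)
b <- 1:4

mean(a)                # NA: a missing value propagates through mean()
mean(a, na.rm = TRUE)  # 6: the NA is dropped before averaging
b >= 3                 # FALSE FALSE TRUE TRUE: comparison is element-wise
a[b >= 3]              # 6 8: keeps the elements where the mask is TRUE
pnorm(1.96)            # about 0.975: standard normal CDF evaluated at 1.96
```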

Question 3. SAT Retaking and the College Enrollment Gap. [Book 2, 14 points]

This question is based on a forthcoming paper by Goodman, Gurantz, and Smith called “Take Two! SAT Retaking and College Enrollment Gaps”, which studies differences in retaking behavior for the SAT test across demographic groups, and how that may affect college enrollment.

Table 1 (reproduced below) provides several means and conditional means of demographic and performance variables for SAT test takers.

● Column (2) reports the expected value of key variables.

● Column (4) reports the conditional expected value of the same variables, conditional on being a low-income student (i.e. E[X|low income student]).

● We will focus on columns (2) and (4); no other column is needed to answer any of the following questions. In particular, we will not use column (1), and references to “all students” refer to column (2).

(a) Using the table above, among those taking the SAT exam:

i. What proportion of all students are low income?

ii. Let R be a random variable that is equal to 1 if a student retook the SAT, and otherwise equal to 0. What is E[R]?

iii. What is the variance of R?

iv. What proportion of low-income students re-took the SAT?

v. What is the probability a student is not low income?

vi. What is the probability a student is low income and an under-represented minority (URM)?

vii. Let the random variable URM be equal to 1 if the student is an underrepresented minority, and otherwise equal to 0. Let L be a random variable that is equal to 1 if a student is low income and otherwise equal to 0. Are L and URM independent? (justify your answer)

(b) Using the definitions of the random variables defined above, what is Pr(L = 1|R = 1)?

(c) *** How many times more likely is a non-low-income student (L = 0) to retake the SAT than a low-income student (L = 1)?

Question 4. Wages, Education, and Hypothesis Testing [Book 3, 13 points]

Suppose we have a data set with 200 randomly drawn observations from the 1988 Current Population Survey.

The scatter plot above shows the random variables Edu and logwage for the 200 observations in our data.

(a) Based on this plot, would you say Edu is a discrete or continuous random variable (provide 1 sentence justification)?

(b) Would you describe the correlation between Edu and logwage as positive, negative, or around zero? [provide 1 sentence justification]

(d) Using the data set stored in the data.frame called data containing the variable Edu, we find the following in R:

> mean(data$Edu)
[1] 13.7
> var(data$Edu)
[1] 5.246231
> sd(data$Edu)
[1] 2.290465
> median(data$Edu)
[1] 13
# max value of Edu variable in data.frame data:
> max(data$Edu)
[1]
# Number of rows in data.frame data:
> nrow(data)
[1] 200
# Square root of 200:
> sqrt(200)
[1] 14.14214

Note: when performing calculations, you can round the numbers above to 2 decimal places.

i. What is the standard error of the sample mean of Edu?

Suppose we want to test the null hypothesis that the true population mean of Edu is 13.4 against the alternative hypothesis that it is not equal to 13.4:

H0 : μ = 13.4

HA : μ ≠ 13.4

with α = 0.05.

ii. What is the test statistic for this test?

iii. What is the distribution of the test statistic if the null hypothesis is true (and we assume N is large enough)? [report parameters such as mean and variance as well]

iv. What is the p-value?

v. Interpret this p-value in words. [2 sentences max].

vi. Do we reject the null hypothesis? [provide 1 sentence justification for your answer].
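The mechanics of a two-sided large-sample test of a mean, as used in part (d), can be sketched in R from summary statistics alone. The helper name `z_test_from_summary` is hypothetical (it is not a built-in R function), and the numbers passed to it are made up so that they differ from the ones in this question. The sketch assumes N is large enough for the normal approximation.

```r
# Hypothetical helper (not part of base R): two-sided z-test from summary statistics.
z_test_from_summary <- function(x_bar, s, n, mu0) {
  se <- s / sqrt(n)          # standard error of the sample mean
  z  <- (x_bar - mu0) / se   # test statistic, approximately N(0, 1) under H0
  p  <- 2 * pnorm(-abs(z))   # two-sided p-value
  list(se = se, z = z, p_value = p)
}

# Made-up inputs, deliberately different from the exam's values.
res <- z_test_from_summary(x_bar = 50, s = 10, n = 400, mu0 = 49)
res$se       # 0.5
res$z        # 2
res$p_value  # about 0.0455
```

The null hypothesis is rejected at level α when the p-value falls below α, which is equivalent to |z| exceeding the two-sided critical value `qnorm(1 - α / 2)`.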

Question 5. Firm Hiring Decisions [Book 4, 14 points]

A firm is looking to hire and has many applications. The firm screens applicants based on two dimensions: years of education, E, and years of experience in the sector, X. The joint distribution of education and experience among applicants is the following:

             Experience X
Educ E      0      10     20
12        0.10   0.05   0.05
16        0.05    ?     0.30
18        0.15   0.05   0.10

The firm will hire all applicants who either have twenty years of experience (X = 20), or who have at least a BA degree (E ≥ 16), or both. Other applicants will not be hired.

(a) What is the ? in the joint distribution table of applicants’ education and experience? That is, what is Pr(E = 16, X = 10)?

(b) What is the marginal distribution of education E for applicants? That is, what is Pr(E = 12), Pr(E = 16), and Pr(E = 18) among all applicants?

(d) *** What is the marginal distribution of education E among those who are hired? That is, what is Pr(E = 12|H = 1), Pr(E = 16|H = 1), and Pr(E = 18|H = 1), where H is an indicator variable that is equal to 1 if hired and otherwise equal to 0?

(e) *** What is the expected value of education E among those who are hired?

(f) * The firm pays compensation C according to the following function:

C = 12 + 10 · X + 5 · E

What would the expected compensation C be among applicants who were hired and had 0 years of experience (X = 0)? [Hint: you do not specifically need to use your answers for (d) and (e) to answer this question.]
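The general steps behind this question (marginalizing a joint table, conditioning on an event, taking a conditional expectation) can be sketched in R. The 2×2 table below is hypothetical, with made-up variables A and B and made-up probabilities. It is not the applicant table from this question.

```r
# Hypothetical joint PMF of two discrete variables A (rows) and B (columns).
joint <- matrix(c(0.10, 0.20,
                  0.30, 0.40),
                nrow = 2, byrow = TRUE,
                dimnames = list(A = c("1", "2"), B = c("0", "5")))

stopifnot(isTRUE(all.equal(sum(joint), 1)))  # a joint PMF must sum to 1

marg_A <- rowSums(joint)      # Pr(A = a): 0.3 and 0.7
keep   <- joint[, "5"]        # Pr(A = a, B = 5)
cond_A <- keep / sum(keep)    # Pr(A = a | B = 5): 1/3 and 2/3

a_vals  <- as.numeric(names(cond_A))
e_A_cond <- sum(a_vals * cond_A)  # E[A | B = 5] = 5/3
```

Conditioning on a more complicated event works the same way: zero out the cells outside the event, then divide the remaining cells by their total.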
