APEC 3003

Final

May 6th, 2023

Instructions

This exam has 4 questions and one extra credit opportunity. Please answer all questions. You may use external resources that comply with the campus academic integrity policy to answer these questions but must state all responses in your own words. Please type your responses on a separate document, noting which question you are responding to, then upload your responses document onto Canvas.

Some of the parts – not all! – of this exam build on each other. If you come to a part of the exam and get stuck, I strongly recommend the following:

1. Read the rest of the parts of the exam, and consider if you can answer the question without the previous parts.

Answer all parts of the exam you can. For instance, just because 1(b) comes after 1(a) does not mean that you have to answer 1(a) in order to answer 1(b).

2. If you believe a part of a question depends on a previous question, you can still obtain partial credit on that part.

The strategy here is to say “suppose I had found x in the previous part. Then my answer would be ...” Show as much of what you know as possible.

For questions 1 through 3, you will work with replication data from a paper entitled “The Price of Political Opposition: Evidence from Venezuela’s ‘Maisanta’.” The authors look at the effects of the publication of the names of those who had signed a (failed) recall petition against Chávez in 2003. Chávez won 59% of the vote in the recall election triggered by this petition, so he remained in power. In 2004, a database of all registered voters was released, which identified voters who had signed the third of 3 recall petitions. This clearly identified political opponents of the Chávez regime, and made this information broadly available. The authors argue that political leanings were referenced for job applicants, as well as by friends and neighbors. This allowed Chávez’s regime to retaliate against political opponents, particularly since he had consolidated power and had no need to conciliate the opposition. We will examine whether there is evidence of political retribution showing up in the earnings of opponents of the Chávez regime whose identities were revealed.

The data maisanta.rdata are available on Canvas in the same space as this final exam paper. You can find a description of the variables in the data below.

Variable name	Class	Description	Range
ingreso_wk	numeric	Log of annual income (measured in 2000 thousand bolívares)	-5.069 to 11.579
maisanta	numeric	Identity as signer of 3rd round petition has been revealed	0 to 1
female	numeric	Sex is female	0 to 1
educ	numeric	Years of schooling	0 to 18
year	numeric	Year of sample	1997 to 2006
caracas	numeric	Person lives in Caracas	0 to 1
edad	numeric	Age	0 to 99
chavista	numeric	Signed for Chavez	0 to 1
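
As a quick sanity check on a table like the one above, one might compare each column's class and observed range against the listed values. The sketch below uses a simulated stand-in called toy (a hypothetical name) with the same column names; none of its values come from maisanta.rdata. With the real file, load("maisanta.rdata") would be the first step, and the name of the object it creates is not stated in this exam.

```r
# Hypothetical stand-in for the exam data: column names follow the variable
# table, but every value is simulated, not taken from maisanta.rdata.
set.seed(1)
n <- 200
toy <- data.frame(
  ingreso_wk = rnorm(n, mean = 7, sd = 1),
  maisanta   = rbinom(n, 1, 0.2),
  female     = rbinom(n, 1, 0.5),
  educ       = sample(0:18, n, replace = TRUE),
  year       = sample(1997:2006, n, replace = TRUE),
  caracas    = rbinom(n, 1, 0.3),
  edad       = sample(18:65, n, replace = TRUE),
  chavista   = rbinom(n, 1, 0.2)
)

# Storage class of each column (numeric or integer both count as numbers in R)
col_classes <- sapply(toy, class)

# Observed minimum and maximum of each column, to compare with the Range column
col_ranges <- t(sapply(toy, range))
colnames(col_ranges) <- c("min", "max")

print(col_classes)
print(col_ranges)
```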

Question 1: Univariate OLS regression (35 points)

(a) (5 points) Write down the linear model for a regression of an individual’s log wages per year on a constant and an indicator for whether that person’s participation in the third petition to recall Chávez (the maisanta) was publicly available in that year. Be sure to include all relevant notation, including coefficients and subscripts.

(b) (5 points) Explain how the phrase “least squares” in Ordinary Least Squares (OLS) provides an intuitive explanation for the mechanism by which OLS produces estimates of coefficients in the model.

(c) (5 points) Estimate the regression you described in part 1(a). Report and interpret the coefficient you estimate for the maisanta on annual log wages.

Note: the left-hand-side variable is measured as the natural logarithm of bolívares per year. This is not a purely linear model!
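
Because the outcome is in logs, the coefficient on a 0/1 regressor is a difference in log points rather than in levels. A generic worked version of that reasoning, using placeholder symbols y, D, and β that are assumptions for illustration and not the notation required in part 1(a), might read:

```latex
\begin{align*}
\ln(y) &= \beta_0 + \beta_1 D + u, \qquad D \in \{0, 1\} \\
\frac{y \mid D = 1}{y \mid D = 0} &= e^{\beta_1} \quad \text{(holding } u \text{ fixed)} \\
\%\Delta y &= 100\,(e^{\beta_1} - 1) \approx 100\,\beta_1 \quad \text{for small } |\beta_1|
\end{align*}
```

For an illustrative value β1 = -0.05 (not an estimate from the data), the exact change is 100(e^{-0.05} - 1) ≈ -4.88%, close to the -5% approximation.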

(d) (10 points) What assumption(s) need(s) to be true for your estimate from 1(c) to provide an unbiased estimate of the effect of having your opposition to the Chávez regime known on earnings?

Do you think that this assumption is plausible? Why or why not?

(e) (5 points) Report a heteroskedasticity robust standard error for the estimate you computed in part 1(c).

Using the heteroskedasticity robust standard error, construct and report a 95% confidence interval around the estimated coefficient on maisanta from part 1(c).

What does this confidence interval tell you about the statistical significance of the estimate?
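
A minimal base-R sketch of one common robust variance estimator (the HC1 "sandwich" form) and the resulting 95% interval is shown here on simulated data. The variables d and y, the simulated coefficients, and the choice of HC1 are all assumptions for illustration; the exam does not say which variant or package to use, and packages such as sandwich offer the same calculation through vcovHC.

```r
# Simulated data with heteroskedastic errors; this is not the exam data.
set.seed(42)
n <- 500
d <- rbinom(n, 1, 0.3)                    # hypothetical binary regressor
y <- 1 + 0.2 * d + rnorm(n, sd = 1 + d)   # error spread differs between groups
fit <- lm(y ~ d)

X <- model.matrix(fit)         # n x k design matrix: intercept and d
e <- resid(fit)                # OLS residuals
k <- ncol(X)

bread <- solve(crossprod(X))   # (X'X)^{-1}
meat  <- crossprod(X * e)      # sum over i of e_i^2 * x_i x_i'
V_hc1 <- (n / (n - k)) * bread %*% meat %*% bread   # HC1 small-sample scaling

se_hc1 <- sqrt(diag(V_hc1))
names(se_hc1) <- colnames(X)

# 95% interval using the normal critical value 1.96
b  <- unname(coef(fit)["d"])
ci <- b + c(-1, 1) * 1.96 * unname(se_hc1["d"])

print(se_hc1)
print(ci)
```

If the interval excludes zero, the estimate is statistically different from zero at the 5% level under this approximation.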

(f) (5 points) What assumption are you no longer making about the structure of the error terms by using heteroskedasticity robust standard errors?

Explain informally what that assumption says about the structure of the error terms.

Question 2: Multivariate OLS regression (35 points)

(a) (5 points) We’ll expand the regression from 1(a) to include year fixed effects, an indicator that the individual is female, and years of education.

$$\ln(\text{earnings})_{it} = \alpha_0 + \alpha_1 \text{maisanta}_{it} + \alpha_2 \text{female}_i + \alpha_3 \text{educ}_{it} + \gamma_t + e_{it}$$

Estimate the expanded model and report your results.
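
Year fixed effects such as γt are commonly implemented as a set of year dummies, which in R can be requested with factor(year) inside the formula. The sketch below shows that pattern on simulated data; the data frame toy, its coefficients, and the rule that maisanta is zero before 2004 are assumptions made for illustration, and no maisanta effect is built into the simulation.

```r
# Simulated stand-in with the exam's column names; not the exam data.
set.seed(7)
n <- 400
toy <- data.frame(
  year   = sample(1997:2006, n, replace = TRUE),
  female = rbinom(n, 1, 0.5),
  educ   = sample(0:18, n, replace = TRUE)
)
# Assumption: disclosure can only be 1 from 2004 onward
toy$maisanta <- as.numeric(toy$year >= 2004) * rbinom(n, 1, 0.3)
# Outcome depends on educ, female, and a year trend, but not on maisanta
toy$ingreso_wk <- 5 + 0.08 * toy$educ - 0.2 * toy$female +
  0.03 * (toy$year - 1997) + rnorm(n)

# factor(year) adds one dummy per year, omitting a base year
fit_fe <- lm(ingreso_wk ~ maisanta + female + educ + factor(year), data = toy)

# Names of the year-dummy coefficients that lm created
year_terms <- grep("^factor\\(year\\)", names(coef(fit_fe)), value = TRUE)
print(year_terms)
print(coef(summary(fit_fe))[c("maisanta", "female", "educ"), ])
```

One year is dropped as the reference category, so the number of year dummies is one fewer than the number of distinct years in the sample.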

(b) (10 points) Interpret the coefficient on maisanta from the results obtained in 2(a). Also, note how the interpretation of the coefficient changes compared to part 1(c).

(c) (5 points) Is the effect of having one’s opposition to Chávez revealed on earnings statistically significant? Why or why not?

(d) (10 points) Is the effect practically significant? Give your reasoning.

(e) (5 points) Suppose you added an indicator that the respondent was male (equal to 1 when the indicator for female is equal to zero, and equal to 0 when the indicator for female is 1) to this regression.

$$\ln(\text{earnings})_{it} = \alpha_0 + \alpha_1 \text{maisanta}_{it} + \alpha_2 \text{female}_i + \alpha_3 \text{educ}_{it} + \alpha_4 \text{male}_i + \gamma_t + e_{it}$$

What will R report for the coefficient α4 on male? Why?

Question 3: Difference-in-differences (15 points)

The authors of this study obtain their data by linking two data sources together. This link gives them a unique data set - they can observe which people in the earnings data are potentially exposed to political retribution. However, the earnings data come from a survey with a high attrition rate. That is, many people who start out in the survey sample stop responding to the survey over time. To maintain the sample size of the survey, the people who run the survey incorporate new respondents when others leave. This means that the structure of the survey data is a repeated cross-section rather than a panel.
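
The distinction between a panel and a repeated cross-section can be made concrete by counting how many survey years each respondent appears in. The toy example below is a deliberately stylized sketch with hypothetical id and year columns: in the real survey some respondents do stay for more than one wave, whereas here the repeated cross-section draws entirely new respondents every year.

```r
# Panel: the same 5 hypothetical respondents are observed in each of 3 years
panel <- data.frame(id = rep(1:5, times = 3),
                    year = rep(2003:2005, each = 5))

# Repeated cross-section: 5 new respondents are drawn every year
rcs <- data.frame(id = 1:15,
                  year = rep(2003:2005, each = 5))

# Number of distinct years in which each respondent appears
years_seen <- function(df) tapply(df$year, df$id, function(x) length(unique(x)))

panel_years <- years_seen(panel)
rcs_years   <- years_seen(rcs)

print(panel_years)   # every respondent is followed across all years
print(rcs_years)     # every respondent is seen in a single year only
```

Both data sets have 15 rows and 5 observations per year, but only the panel lets an analyst compare the same person before and after an event.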

(a) (5 points) Suppose that the authors had panel data and were able to reliably observe the same individuals for each year.

Then they could estimate the regression specification

$$\ln(\text{earnings})_{it} = \beta_0 + \beta_1 \text{maisanta}_{it} + \delta_i + \tau_t + \varepsilon_{it}$$

where i indexes individuals and t indexes years.

How would you interpret the coefficient β1 in this case?

(b) (5 points) What assumption is necessary for the coefficient β1 to represent the causal effect of maisanta disclosure on earnings?

Informally describe what this assumption says about the data.

Question 4: Simulations (15 points)

In case you hadn’t heard, job cuts at Disney mean that ABC has fired Nate Silver, who is behind 538, a firm that is a major player in election forecasting. Suppose Nate is starting a new company and you are interviewing with them.

Someone didn’t do a good job paying attention to who owns what intellectual property at 538, and Nate still owns the simulation code shown on the last page of the exam.

This code is used to test how sensitive their predictions are to the relationship between independent vote share and predicted Republican vote share.

Your interviewer asks you the following questions to assess your skill at interpreting code and understanding the underlying statistics.

(a) (5 points) What is the purpose of the command set.seed(2168) on line 1?

(b) (5 points) There will be 1,000 observed values of the t statistic in the data frame null at the end of the simulation.

For how many of those 1,000 observed t statistics do you expect the absolute value to be more than 1.96?

Why is this your expectation? And what line(s) of code makes you think this?

Extra credit (10 points)

Make a meme about econometrics.

You can’t use one that already exists.

Explain to me, as if I am your less econometrically savvy relative, why the meme is funny.

set.seed(2168)
load("observedvoteshares.rdata")
sim <- 1000
betanull <- 0
betanotnull <- 0.1

# make a place to put results to compare them
null <- data.frame(iteration = 1:sim,
                   betahat = numeric(sim),
                   tstat = numeric(sim),
                   prederror = numeric(sim))
notnull <- data.frame(iteration = 1:sim,
                      betahat = numeric(sim),
                      tstat = numeric(sim),
                      prederror = numeric(sim))

for (s in 1:sim) {
  # generate outcome under null
  # (each continued line ends with an operator so R reads the sum as one expression)
  sharernull <- 0.05 + 0.3*observedvoteshares$incumbr +
    0.9*observedvoteshares$prevsharer +
    betanull*observedvoteshares$prevsharei -
    0.95*observedvoteshares$prevshared +
    rnorm(nrow(observedvoteshares), 0, 0.25)

  # generate outcome under not null
  sharernotnull <- 0.05 + 0.3*observedvoteshares$incumbr +
    0.9*observedvoteshares$prevsharer +
    betanotnull*observedvoteshares$prevsharei -
    0.95*observedvoteshares$prevshared +
    rnorm(nrow(observedvoteshares), 0, 0.25)

  # Run OLS regression for null
  nullmodel <- lm(sharernull ~ incumbr + prevsharer + prevsharei +
                    prevshared, data = observedvoteshares)

  # Store null coefficient in results data frame
  null[s, "betahat"] <- coef(summary(nullmodel))[2, "Estimate"]
  null[s, "tstat"] <- coef(summary(nullmodel))[2, "t value"]

  # compare predicted values under null to observed shares in 2020
  # (obssharer is not defined anywhere in this listing)
  null[s, "prederror"] <- mean(predict(nullmodel) - obssharer)

  # (the listing ends here; the matching steps for the notnull model are not shown)
}