PLSC 30600: Midterm 2022
Problem 1 (15 points)
Consider the following directed acyclic graph, which models dependence over time between treatments D, outcomes y, and unobserved variables U that affect y but not D. The subscript t denotes time (so D_{t-1} is the random variable denoting treatment in the period before t, D_t is the treatment at period t, and D_{t+1} is the treatment one period after t).
Part A (3 points)
Suppose we are interested in identifying the effect of an intervention on D_t on y_t, that is, the “instantaneous”
effect of the treatment on the outcome. Does the empty set satisfy the backdoor criterion? In other words, can we identify the effect without conditioning on any other variables? Explain why or why not.
Part B (5 points)
Is {y_{t-1}, y_{t-2}} a valid adjustment set under the backdoor criterion? In other words, is conditioning on both
of the lagged outcomes y_{t-2} and y_{t-1} sufficient to identify the effect of D_t on y_t? Explain why or why not.
Part C (5 points)
Find the minimal sufficient adjustment set for the effect of D_t on y_t. In other words, what is the smallest set of variables that would be sufficient to satisfy the backdoor criterion and to identify the effect of D_t on y_t?
Explain your reasoning.
Part D (2 points)
Suppose you were to add y_{t-1} to the set you found in Part C. Would that set still be an admissible adjustment set under the backdoor criterion? That is, would conditioning on that larger set still identify the effect of D_t on y_t? Explain why or why not.
Problem 2 (20 points)
Despite its significant importance to many political debates, there are few causal estimates of the effect of expanded healthcare insurance on healthcare outcomes. One landmark study, the Oregon Health Insurance Experiment, covered new ground by utilizing a randomized controlled trial implemented by the state government of Oregon. To allocate a limited number of eligible coverage slots for the state’s Medicaid expansion, about 30,000 low-income, uninsured adults (out of about 90,000 wait-list applicants) were randomly selected by lottery to be allowed to apply for Medicaid coverage. Researchers collected observable measures of health (blood pressure, cholesterol, and blood sugar levels), as well as hospital visitation and healthcare expenses for
6,387 selected adults and 5,842 not selected adults.
For this problem, you will need the OHIE.dta file. Start by loading the packages and the data:
library(estimatr)
library(tidyverse)
library(haven)
ohie <- haven::read_dta("OHIE.dta")
The variables you will need are:
● treatment - Selected in the lottery
● ohp_all_ever_admin - Ever enrolled in Medicaid from matched notification date to September 30, 2009 (actually had Medicaid insurance)
● tab2bp_hyper - Outcome: Binary indicator for elevated blood pressure (defined as a systolic pressure of 140 mm Hg or more and a diastolic pressure of 90 mm Hg or more)
● tab2phqtot_high - Outcome: Binary indicator for a positive screening result for depression (defined as a score of 10 or higher on the Patient Health Questionnaire-8)
● tab4_catastrophic_exp_inp - Outcome: Indicator for catastrophic medical expenditure (total out-of-pocket medical expenses ≥ 30% of household income)
● tab1_gender_inp - gender (0 - Male, 1 - Female, 2 - Transgender)
● tab1_age_19_34_inp - Age 19-34
● tab1_age_35_49_inp - Age 35-49
● tab1_race_black_inp - Race/ethnicity is Black
● tab1_race_nwother_inp - Race/ethnicity is non-White/other
● tab1_race_white_inp - Race/ethnicity is White
● tab1_hispanic_inp - Hispanic/Latino
We’ll start by subsetting the data down to only those observations where all three outcomes and all seven covariates are non-missing.
# Which observations have all non-missing outcomes + covariates
ohie$nonmissing <- complete.cases(
  ohie %>% dplyr::select(tab2bp_hyper, tab2phqtot_high, tab4_catastrophic_exp_inp,
                         tab1_gender_inp, tab1_age_19_34_inp, tab1_age_35_49_inp,
                         tab1_race_black_inp, tab1_race_nwother_inp,
                         tab1_race_white_inp, tab1_hispanic_inp))
# Subset down to complete cases
ohie_complete <- ohie %>% filter(nonmissing == 1)
Part A (5 points)
Using the complete case data and the pre-treatment covariates, assess whether you think randomization of the selection lottery was successfully carried out. Explain why or why not.
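One way to begin — a sketch, not the only valid approach — is to regress each pre-treatment covariate on the lottery indicator and inspect the estimated differences in means (using lm_robust and tidy from the estimatr package loaded above):

```r
# Sketch: covariate balance check via difference-in-means for each covariate
covariates <- c("tab1_gender_inp", "tab1_age_19_34_inp", "tab1_age_35_49_inp",
                "tab1_race_black_inp", "tab1_race_nwother_inp",
                "tab1_race_white_inp", "tab1_hispanic_inp")
balance <- lapply(covariates, function(v) {
  fit <- lm_robust(as.formula(paste(v, "~ treatment")), data = ohie_complete)
  tidy(fit)[2, ]  # row 2 is the coefficient on treatment = difference in means
})
do.call(rbind, balance)
```

Under successful randomization, we would expect these differences to be small and (roughly 5% of the time) statistically significant only by chance; a joint F-test of all covariates on treatment is an alternative.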
Part B (5 points)
Estimate the average intent-to-treat effect on each of the three separate outcomes: elevated blood pressure, depression, and catastrophic medical expenditure. Provide a 95% asymptotic confidence interval for each and assess, for each outcome, whether you would reject the null of no intent-to-treat effect at α = 0.05. Briefly discuss your findings and provide a substantive interpretation of your results.
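As a sketch of one possible approach, the ITT for each outcome is the difference in means by lottery status, which can be obtained by regressing each outcome on the treatment indicator with robust standard errors:

```r
# Sketch: ITT estimates as difference-in-means regressions (robust SEs, 95% CIs)
itt_bp  <- lm_robust(tab2bp_hyper ~ treatment, data = ohie_complete)
itt_dep <- lm_robust(tab2phqtot_high ~ treatment, data = ohie_complete)
itt_cat <- lm_robust(tab4_catastrophic_exp_inp ~ treatment, data = ohie_complete)
summary(itt_dep)  # repeat for the other two outcomes
```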
Part C (3 points)
Estimate the average effect of being selected in the lottery on actual enrollment in Medicaid. Provide a 95% confidence interval and determine whether you would reject the null of no average effect of selection on enrollment at α = 0.05. Based on your results, discuss whether you think selection in the lottery had a meaningful effect on treatment uptake.
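A sketch of this “first-stage” comparison, using the same regression approach as above:

```r
# Sketch: effect of lottery selection on actual Medicaid enrollment
first_stage <- lm_robust(ohp_all_ever_admin ~ treatment, data = ohie_complete)
summary(first_stage)  # coefficient on treatment = effect of selection on uptake
```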
Part D (5 points)
Suppose that a researcher instead chose to estimate the effect of Medicaid enrollment using a “per-protocol” analysis - comparing participants assigned to treatment (selected in the lottery) who did enroll in Medicaid to those assigned to control (not selected) who did not enroll. Use this “per-protocol” analysis to estimate
the average treatment effect of Medicaid enrollment on depression, provide a 95% asymptotic confidence interval and compare your results to the ITT estimate from Part B.
Does the “per-protocol” analysis provide an unbiased estimator of the average treatment effect of Medicaid? Explain why or why not.
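A sketch of the per-protocol comparison described above (keeping only selected-and-enrolled and not-selected-and-not-enrolled respondents):

```r
# Sketch: per-protocol analysis for the depression outcome
pp_data <- ohie_complete %>%
  filter((treatment == 1 & ohp_all_ever_admin == 1) |
         (treatment == 0 & ohp_all_ever_admin == 0))
pp_fit <- lm_robust(tab2phqtot_high ~ ohp_all_ever_admin, data = pp_data)
summary(pp_fit)
```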
Part E (2 points)
In the analyses above, you examined only those observations where respondents could be reached and where follow-up data was available. Does being selected in the lottery affect whether data is missing for a respondent?
What assumption(s) do we need to make so that analyzing only the non-missing respondents will not bias the intent-to-treat effect estimator?
Problem 3 (15 points)
How do people translate personal experiences into political attitudes? Exploring this question has been frustrated by the non-random assignment of social and economic phenomena such as crime, the economy,
education, health care, or taxation. In “Turning personal experience into political attitudes: The effect of
local weather on Americans’ perceptions about global warming,” Egan and Mullin (2012) look specifically at
the topic of Americans’ beliefs about the evidence for global warming.
They examine whether exposure to abnormally warm temperatures has an effect on whether Americans believe that there is solid evidence that the earth is getting warmer. They use Pew survey data from five
months between June 2006 and April 2008.
The variables of interest are:
● ddt_week - Average daily departure from normal local temperature (in Fahrenheit) in week prior to survey
● getwarmord - Opinion on whether there is “solid evidence” for global warming i.e., the earth getting
warmer (no = 1, mixed/some/don’t know = 2, yes = 3).
● wave - Month in which survey was conducted (1=June 2006, 2=July 2006, 3=August 2006, 4=January 2007, 5=April 2008).
Below is the code to import the dataset into R
### Load in the Egan and Mullin (2012) dataset
gwdataset <- read_dta("gwdataset.dta")
Part 1 (5 points)
Let’s define our outcome of interest as a binary indicator that takes a value of 1 when a respondent answers that “yes” they believe that there is “solid evidence” that the earth is getting warmer and 0 otherwise. Let’s define our treatment of interest as exposure to a “heat wave,” defined as a week with an average daily departure from normal local temperature above 10 degrees.
Under the assumption of complete ignorability of treatment, estimate the average treatment effect of exposure to a heat wave on individuals’ belief that there is solid evidence for global warming. Provide a 95% confidence interval and interpret your results. Given a rejection threshold of α = 0.05, do we conclude that there is sufficient evidence to reject the null of no average treatment effect?
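As a sketch of one possible approach (assuming estimatr and tidyverse are loaded as in Problem 2; the variable names warm_yes and heat_wave are our own labels for the constructed outcome and treatment):

```r
# Sketch: construct the binary outcome and treatment as defined above
gwdataset <- gwdataset %>%
  mutate(warm_yes  = as.numeric(getwarmord == 3),   # "yes, solid evidence"
         heat_wave = as.numeric(ddt_week > 10))      # departure above 10 degrees

# Difference-in-means ATE under complete ignorability, with robust SEs
ate_fit <- lm_robust(warm_yes ~ heat_wave, data = gwdataset)
summary(ate_fit)
```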
Part 2 (5 points)
This paper combines data from five different Pew surveys taken at different times of year between 2006 and 2008. It may be the case that something differs across survey waves such that complete ignorability is an unreasonable assumption. Choose an appropriate set of analyses to evaluate whether survey wave confounds the relationship between treatment and outcome. Interpret your results and discuss whether complete ignorability of our treatment is a reasonable assumption.
Note: Remember that wave is a discrete indicator variable for survey month/year - it is not a meaningful numeric value.
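One possible starting point — a sketch only, with wave treated as a discrete grouping variable and warm_yes/heat_wave as hypothetical names for the Part 1 variables:

```r
# Sketch: do treatment exposure and the outcome both vary by survey wave?
gwdataset %>%
  group_by(wave) %>%
  summarize(share_heat_wave = mean(ddt_week > 10, na.rm = TRUE),
            share_believe   = mean(getwarmord == 3, na.rm = TRUE))
```

If both the share exposed to heat waves and the share believing in global warming differ markedly across waves, wave is a plausible confounder.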
Part 3 (5 points)
Suppose instead that we assume that excess temperatures are conditionally ignorable given survey wave. Using an appropriate estimation strategy, estimate the average treatment effect of exposure to a heat wave on individuals’ belief that there is solid evidence for global warming. Provide a 95% confidence interval and interpret your results. Compare your findings to your answer from Question 1 and discuss any differences you find.
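A minimal sketch of one such strategy, assuming the binary outcome and treatment from Part 1 (here called warm_yes and heat_wave) have been constructed:

```r
# Sketch: adjust for survey wave with discrete indicators
# (as.factor ensures wave enters as dummies, not a numeric trend)
cond_fit <- lm_robust(warm_yes ~ heat_wave + as.factor(wave), data = gwdataset)
summary(cond_fit)
```

Note that this additive regression adjustment weights waves by within-stratum variance; a fully interacted model or a stratified difference-in-means averaged over waves is an alternative way to target the ATE.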
2022-04-29