POLI 7000 Writing Assignment 2

Soren Jordan

Spring 2021


This will be due March 26 (Friday) by 11:59 PM to Canvas as a .pdf. You lose 10 points for each day it is late. For Writing Assignment 2, a complete assignment consists of two parts:

1. First, the general questions and R script. You will be asked a series of questions about the statistics. Each of these questions should be answered as a comment (a line beginning with a # in the R script).

2. Second, once all of the questions are answered, please run your R script to execute it in the R console. Copy and paste the R console at the end.

Thus, when turning in the assignment, you should have the script, followed by the console, and save the whole thing as a .pdf. This is still a writing assignment. That means you should write in complete sentences that are free of spelling and grammatical errors.

Recall: you are allowed to consult with one another on coding for homework assignments, as long as

1. You never “divide and conquer” the assignment. All students are responsible for all portions of each assignment, and

2. You are not allowed to collaborate on the “applied” or “discussion” portions of questions. You can code together, but as soon as you start writing sentences, you must use your own words and your own words alone, and

3. You explain your answers. Even if an answer is a simple mathematical solution, explain how you arrived at it. I can’t give partial credit for wrong numbers, but I can give partial credit for a thought process.

If you have questions . . .

Overall, please feel free to post to the relevant Discussion page on Canvas.

About R, please feel free to email me directly.

The assignment starts on the next page.


R Instructions and Questions

Reminder: to download data and read it into R

1. Download the dataset (usually XXX.csv) from Canvas

2. Move it to wherever all of your course materials are on your computer

3. Once it is on your computer, copy the filepath and read it into R

On a Mac, you can use option + command + c to copy a file path

On Windows, hold down shift and right-click the file, then choose “Copy as path”. See tinyurl.com/windows-file-path. Also, on Windows you must reverse the direction of the slashes in the filepath (from \ to /)

As a reminder, there are scripts on Canvas to use as a helpful example
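The steps above can be sketched in R as follows. This is only an illustration: the file name example_data.csv and its columns are made up, and the file is written to a temporary folder so the example runs on its own. In your script, replace path with the filepath you copied.

```r
# Hypothetical example: create a tiny CSV in a temporary folder,
# then read it back in. Replace `path` with your own copied filepath.
path <- file.path(tempdir(), "example_data.csv")
write.csv(data.frame(id = 1:3, score = c(2.5, 4.0, 3.5)),
          path, row.names = FALSE)

# On Windows, a copied path such as C:\Users\me\example_data.csv must be
# typed with forward slashes: "C:/Users/me/example_data.csv".
example <- read.csv(path)

# Check that the data look the way you expect
head(example)
str(example)
```

Looking at head() and str() right after reading the file is a quick way to confirm that R found the file and that the columns have the types you expect.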

This homework assignment will use three different datasets for practice. You’ve already seen one before!

Druckman and Shafranek (2020) (DOI: 10.1086/708776). Download the data from Canvas: ds_data.csv. It contains eight variables:

X: A counter variable for each row

condition: The condition of the experiment for each observation (see Druckman and Shafranek, page 1603)

bouncedback: An indicator for if the email bounced back (i.e. was emailed to an illegitimate address)

0 = No

1 = Yes

realresponse: An indicator for if the authors received a real response to the email

0 = No

1 = Yes

autoresponse: An indicator for if the authors received an automated response to the email

0 = No

1 = Yes

daystorespond: The number of days it took for a response to be received

0-52 = Response time (in days)

emailengaged: An indicator for if the response email engaged the email sender (i.e. “why don’t you talk to us more?”)

0 = No

1 = Yes

dayofweeksent: An indicator for the day of the week the email was sent to the college

(Day of week in characters)

As before, create the following two variables to aid in analysis:

black: a dummy variable that is 1 for all conditions with a Black email sender and 0 otherwise.

dem: a dummy variable that is 1 for all conditions with a Democrat email sender and 0 otherwise.
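One way to build dummy variables like these is sketched below. The condition labels in this toy data frame are invented for illustration; check the actual values of condition in ds_data.csv (for example with table()) and adapt the matching rule accordingly.

```r
# Toy data with HYPOTHETICAL condition labels (not the real coding)
toy <- data.frame(
  condition = c("black_dem", "white_dem", "black_rep", "white_rep")
)

# 1 if the label mentions a Black sender, 0 otherwise
toy$black <- ifelse(grepl("black", toy$condition), 1, 0)

# 1 if the label mentions a Democrat sender, 0 otherwise
toy$dem <- ifelse(grepl("dem", toy$condition), 1, 0)

# Cross-tabulate to confirm that each condition was coded as intended
table(toy$condition, toy$black)
table(toy$condition, toy$dem)
```

The cross-tabulations at the end are worth keeping in your script: they show at a glance whether any condition ended up in the wrong category.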

1. Run the difference of means test between daystorespond and the black variable you created.

(a) What is the average days to receive a response for emails from Black senders?

(b) What is the average days to receive a response for emails from white senders?

(c) Interpret the p-value on the difference between the means.

(d) Connect this to the 95% confidence interval reported. What number does this interval include?

2. Execute the multivariate regression of daystorespond (Y) on experimental condition black (X1) and dem (X2).

(a) Interpret the intercept from this regression. Is it useful here? Why or why not?

(b) Interpret the coefficient on black from this regression (including its statistical significance).

(c) Interpret the coefficient on dem from this regression (including its statistical significance).

(d) Interpret the RMSE from this regression. Be sure to indicate what “units” means (i.e. the scale of the RMSE) in your interpretation. Is this RMSE “good”? How would you know?

(e) Interpret the R² from this regression.

(f) Generate and interpret the following predictions (discussing them on the scale of the Y variable):

i. The predicted days to respond to an email from a Black Democrat.

ii. The predicted days to respond to an email from a Black non-Democrat.

iii. The predicted days to respond to an email from a white Democrat.

iv. The predicted days to respond to an email from a white non-Democrat.
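The workflow in these two questions (a difference of means test, then a regression with two dummy variables, then predictions for each combination) might look like the sketch below. The data frame toy and its columns y, x1, and x2 are made up for illustration, and the sketch assumes that the RMSE discussed in class is what R reports as the residual standard error.

```r
# Toy data: y is an outcome, x1 and x2 are 0/1 dummies (all hypothetical)
toy <- data.frame(
  y  = c(3, 5, 4, 6, 8, 7, 9, 10),
  x1 = c(0, 0, 0, 0, 1, 1, 1, 1),
  x2 = c(0, 1, 0, 1, 0, 1, 0, 1)
)

# Difference of means in y across the two groups of x1
# (t.test uses the Welch version by default)
tt <- t.test(y ~ x1, data = toy)
tt$estimate   # mean of y in each group
tt$p.value    # p-value on the difference
tt$conf.int   # 95% confidence interval for the difference

# Regression of y on both dummies
fit <- lm(y ~ x1 + x2, data = toy)
summary(fit)
rmse <- summary(fit)$sigma       # "Residual standard error" in the output
r2   <- summary(fit)$r.squared   # "Multiple R-squared" in the output

# Predictions for all four combinations of the two dummies
newdata <- expand.grid(x1 = c(0, 1), x2 = c(0, 1))
newdata$predicted <- predict(fit, newdata = newdata)
newdata
```

Using expand.grid() builds every combination of the values you list, so the four predictions come out in one table rather than four separate calls.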

Jones and Brewer (2019) (DOI: 10.1086/701835). Download the data from Canvas: jb_data.csv. It contains eight variables:

expt: The condition of the experiment for each observation (see Jones and Brewer, page 698)

0 = Did not get the treatment

1 = Got the treatment

pool.vote: The respondent’s propensity to vote for the hypothetical candidate

1 = Very unlikely

2 = Somewhat unlikely

3 = Somewhat likely

4 = Very likely

pool.ideo: The respondent’s rating of the hypothetical candidate’s ideology (as a scale, where you select from numbers in between)

1 = Very liberal

...

4 = Moderate

...

7 = Very conservative

pool.tworthy: The respondent’s rating of how “trustworthy” describes the hypothetical candidate

1 = Not at all well

2 = Somewhat not well

3 = Somewhat well

4 = Very well

pool.moral: The respondent’s rating of how “moral” describes the hypothetical candidate

1 = Not at all well

2 = Somewhat not well

3 = Somewhat well

4 = Very well

pool.repsyou: The respondent’s rating of how “represents you” describes the hypothetical candidate

1 = Not at all well

2 = Somewhat not well

3 = Somewhat well

4 = Very well

ideo7_libhigh: The respondent’s own ideology

1 = Very conservative

2 = Somewhat conservative

3 = Moderate, leaning conservative

4 = Moderate

5 = Moderate, leaning liberal

6 = Somewhat liberal

7 = Very liberal

pid7_demhigh: The respondent’s own party identification

1 = Strong Republican

2 = Weak Republican

3 = Independent, leaning Republican

4 = Independent

5 = Independent, leaning Democrat

6 = Weak Democrat

7 = Strong Democrat

1. Run a summary of each of the variables in the dataset.

(a) Describe the ideology of the overall sample.

(b) Describe the party identification of the overall sample.

(c) Why do you think the treatment variable (expt) has a mean so close to 0.50?

(d) On which of the three rating variables (pool.tworthy, pool.moral, and pool.repsyou) does the hypothetical candidate have the lowest average evaluation? Why do you think this is? (Remember, these averages combine across the experimental conditions.)

2. Execute the appropriate bivariate test to investigate the relationship between the experimental condition, expt, and the respondent’s willingness to vote for the candidate, pool.vote.

(a) Why did you choose this bivariate test?

(b) Interpret the p-value of the bivariate test.

(c) Connect the confidence interval reported by R to the p-value.

3. Execute the bivariate regression of pool.vote (Y) on experimental condition expt (X).

(a) Interpret the intercept from this regression. Is it useful here? Why or why not?

(b) Interpret the coefficient on expt from this regression (including its statistical significance).

(c) Interpret the RMSE from this regression. Be sure to indicate what “units” means (i.e. the scale of the RMSE) in your interpretation. Is this RMSE “good”? How would you know?

(d) Interpret the R² from this regression.

(e) Is this relationship between expt and pool.vote substantively significant? Justify your answer.

4. Execute the multivariate regression of pool.vote (Y) on experimental condition expt (X1) and ideo7_libhigh (X2) but only for Democrats (pid7_demhigh > 4).

(a) Interpret the intercept from this regression. Is it useful here? Why or why not?

(b) Interpret the coefficient on expt from this regression (including its statistical significance).

(c) Interpret the coefficient on ideo7_libhigh from this regression (including its statistical significance).

(d) Interpret the RMSE from this regression. Be sure to indicate what “units” means (i.e. the scale of the RMSE) in your interpretation.

(e) Interpret the R² from this regression.

(f) Generate and interpret two sets of predictions (discussing them on the scale of the Y variable):

i. The predicted value of pool.vote for a very conservative person, a moderate person, and a very liberal person, all of whom got the treatment.

ii. The predicted value of pool.vote for someone who did versus did not get the treatment, both of whom are moderate.

iii. Of the five predictions you just generated, which makes the least sense, given what we know about American politics and who is in the model?

5. Execute the multivariate regression of pool.vote (Y) on experimental condition expt (X1) and ideo7_libhigh (X2) but only for Republicans (pid7_demhigh < 4).

(a) Interpret the intercept from this regression. Is it useful here? Why or why not?

(b) Interpret the coefficient on expt from this regression (including its statistical significance).

(c) Interpret the coefficient on ideo7_libhigh from this regression (including its statistical significance).

(d) Interpret the RMSE from this regression. Be sure to indicate what “units” means (i.e. the scale of the RMSE) in your interpretation.

(e) Interpret the R² from this regression.

(f) Generate and interpret two sets of predictions (discussing them on the scale of the Y variable):

i. The predicted value of pool.vote for a very conservative person, a moderate person, and a very liberal person, all of whom got the treatment.

ii. The predicted value of pool.vote for someone who did versus did not get the treatment, both of whom are moderate.

iii. Of the five predictions you just generated, which makes the least sense, given what we know about American politics and who is in the model?

6. Explain why the sign (positive versus negative) on expt flipped across the two regressions. What does this mean for how Republicans and Democrats reacted to the treatment?

7. Which group, Republicans or Democrats, had a stronger reaction to the treatment? How do you know?
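Running the same regression on two subsets of a dataset and then generating predictions, as in the questions above, can be sketched as follows. Everything here is simulated: the data frame toy and the names treat, ideo, party, and vote are hypothetical stand-ins, and the simulated outcome has no real relationship to the treatment, so the numbers carry no substantive meaning.

```r
# Simulated toy data (hypothetical names and values)
set.seed(7000)
n <- 200
toy <- data.frame(
  treat = rep(c(0, 1), times = n / 2),        # 0/1 treatment indicator
  ideo  = sample(1:7, n, replace = TRUE),     # 7-point scale
  party = sample(1:7, n, replace = TRUE),     # 7-point scale
  vote  = sample(1:4, n, replace = TRUE)      # 4-point outcome
)

# Summary of every variable in the data frame
summary(toy)

# The same model, estimated separately for each end of the party scale
fit_high <- lm(vote ~ treat + ideo, data = subset(toy, party > 4))
fit_low  <- lm(vote ~ treat + ideo, data = subset(toy, party < 4))

# Predictions for treated respondents at ideo = 1, 4, and 7
p_ideo <- predict(fit_high,
                  newdata = data.frame(treat = 1, ideo = c(1, 4, 7)))

# Predictions for untreated versus treated respondents at ideo = 4
p_treat <- predict(fit_high,
                   newdata = data.frame(treat = c(0, 1), ideo = 4))

# Compare the treatment coefficient across the two subsets
c(high = unname(coef(fit_high)["treat"]),
  low  = unname(coef(fit_low)["treat"]))
```

Note that subset() with party > 4 and party < 4 leaves out respondents at exactly 4, so the two models together use fewer observations than the full data.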

Lajevardi and Abrajano (2019) (DOI: 10.1086/700001). Download the data from Canvas: la_data.csv. It contains 11 variables:

X: A counter variable for each row

age: The respondent’s age

21-82 = Age (in years)

income: The respondent’s income

1 = Lowest income (code not provided by authors)

...

6 = Highest income (code not provided by authors)

female: An indicator for if the respondent is female

0 = No

1 = Yes

black: An indicator for if the respondent is Black

0 = No

1 = Yes

democrat: An indicator for if the respondent is a Democrat

0 = No

1 = Yes

independent: An indicator for if the respondent is an Independent

0 = No

1 = Yes

supportDT_primary: The level of support the respondent gave to Donald Trump in the 2016 primary

0 = No support at all

...

100 = Highest level of support

z_fav_muslim: Standardized feelings towards Muslims

-2 = 2.5th percentile (less favorable towards Muslims than 97.5% of people)

...

0 = 50th percentile (as favorable towards Muslims as the average person)

...

2 = 97.5th percentile (more favorable towards Muslims than 97.5% of people)

z_mrr: Standardized Muslim racial resentment (the MAR scale from page 297)

-2 = 2.5th percentile (less resentful of Muslims than 97.5% of people)

...

0 = 50th percentile (as resentful of Muslims as the average person)

...

2 = 97.5th percentile (more resentful of Muslims than 97.5% of people)

otherrace2: An indicator for if the respondent is another race (other than Black or white)

0 = No

1 = Yes

1. Run a summary of each of the variables in the dataset.

(a) What percent of the sample is female?

(b) What percent of the sample is Black?

(c) What percent of the sample is a Democrat?

(d) How would you describe the sample’s feelings towards supporting Donald Trump?

2. Execute the appropriate bivariate test to investigate the relationship between Muslim racial resentment, z_mrr, and the respondent’s support for Donald Trump in the primary, supportDT_primary.

(a) Why did you choose this bivariate test?

(b) Interpret the substantive significance of the test.

(c) Interpret the p-value of the bivariate test.

(d) Connect the confidence interval reported by R to the p-value.

3. Execute the bivariate regression of supportDT_primary (Y) on Muslim racial resentment z_mrr (X).

(a) Interpret the intercept from this regression. Is it useful here? Why or why not?

(b) Interpret the coefficient on z_mrr from this regression (including its statistical significance).

(c) Interpret the RMSE from this regression. Be sure to indicate what “units” means (i.e. the scale of the RMSE) in your interpretation. Is this RMSE “good”? How would you know?

(d) Interpret the R² from this regression.

(e) What does this bivariate regression tell you that the bivariate test did not?

4. Execute the multivariate regression of supportDT_primary (Y) on Muslim racial resentment z_mrr (X1), age (X2), income (X3), female (X4), black (X5), democrat (X6), independent (X7), whether the respondent is another race otherrace2 (X8), and the respondent’s favorability of Muslims z_fav_muslim (X9).

(a) Interpret the intercept from this regression. Is it useful here? Why or why not?

(b) Interpret the coefficient on z_mrr from this regression (including its statistical significance).

(c) Interpret the coefficient on female from this regression (including its statistical significance).

(d) Interpret the coefficient on democrat from this regression (including its statistical significance).

(e) Interpret the RMSE from this regression. Be sure to indicate what “units” means (i.e. the scale of the RMSE) in your interpretation.

(f) Interpret the R² from this regression.

(g) Generate and interpret two sets of predictions (discussing them on the scale of the Y variable):

i. The predicted value of supportDT_primary for someone with the minimum Muslim racial resentment, average Muslim racial resentment, and the maximum Muslim racial resentment, holding all other variables at their means.

ii. The predicted value of supportDT_primary for someone with the minimum Muslim racial resentment, average Muslim racial resentment, and the maximum Muslim racial resentment, holding age, income, and z_fav_muslim at their means and female, black, democrat, independent, and otherrace2 at their modes.

(h) Which variable is more substantively significant: z_mrr or democrat? Justify your answer (potentially using the predictions you just made).
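Holding control variables at their means or modes while one predictor moves from its minimum to its maximum can be sketched as below. The data are simulated, the names resent, age, female, and support are hypothetical, and stat_mode() is a small helper written for this example (base R has no built-in function for the statistical mode). The sketch also assumes, for illustration only, that a correlation test is a reasonable bivariate test for two continuous variables; you still need to justify your own choice.

```r
# Simulated toy data (hypothetical names and values)
set.seed(2021)
n <- 150
toy <- data.frame(
  resent = rnorm(n),                           # standardized scale
  age    = sample(21:82, n, replace = TRUE),
  female = rbinom(n, 1, 0.5)                   # 0/1 dummy
)
toy$support <- 50 + 10 * toy$resent - 5 * toy$female + rnorm(n, sd = 15)

# A bivariate test for two continuous variables (assumed here: correlation)
ct <- cor.test(toy$resent, toy$support)
ct$estimate; ct$p.value; ct$conf.int

fit <- lm(support ~ resent + age + female, data = toy)

# Helper written for this sketch: the most common value of x
stat_mode <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[which.max(counts)])
}

# The predictor of interest at its minimum, mean, and maximum
resent_values <- c(min(toy$resent), mean(toy$resent), max(toy$resent))

# Set 1: every other variable held at its mean
at_means <- data.frame(resent = resent_values,
                       age    = mean(toy$age),
                       female = mean(toy$female))

# Set 2: continuous variables at their means, dummies at their modes
at_modes <- data.frame(resent = resent_values,
                       age    = mean(toy$age),
                       female = stat_mode(toy$female))

pred_means <- predict(fit, newdata = at_means)
pred_modes <- predict(fit, newdata = at_modes)
```

Holding a dummy at its mean describes a respondent who cannot exist (for example, someone who is 0.48 female), which is why the second set of predictions uses the mode for dummies instead.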