R Quiz 1: ECON-UA 266 - Intro to Econometrics Fall 2025
You have 24 hours to finish the exam. Write all of your answers in a single .R file named “firstname_lastname_RExam1”. When submitting your quiz, submit only your .R file on Brightspace.
[IMPORTANT] For all exercises, set the seed to 123 (for example, with set.seed(123)).
Before you start, make sure you have installed and loaded the following packages: ggplot2, dplyr, tidyverse, foreign, stargazer.
Question 1: ggplot and descriptive statistics [15 points]
In this question, you will generate a sample from a random variable, visualize the sample, and produce a table of descriptive statistics.
(a) [5 points] Generate a sample of 1000 observations from a Bernoulli distribution where the variable takes the value one with probability p = 0.3 and zero with probability 1 - p = 0.7.
(b) [5 points] Plot a histogram of the data from (a). Make sure the y-axis shows the share, not the count.
(c) [5 points] Create a table showing the descriptive statistics (mean, standard deviation, min and max) of the sample you generated in (a); make sure to label the variable Bernoulli.
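One possible way to approach the three parts of Question 1 is sketched below. It is only an illustration, not an official solution: it assumes that a Bernoulli draw can be generated with rbinom() using size = 1, and it builds the descriptive table with base R rather than with any particular table package. The object names bern, p and desc are made up for this sketch.

```r
library(ggplot2)

set.seed(123)  # seed required by the quiz

# (a) 1000 Bernoulli(0.3) draws; a binomial with size = 1 is a Bernoulli
bern <- data.frame(Bernoulli = rbinom(1000, size = 1, prob = 0.3))

# (b) histogram with shares (not counts) on the y-axis
p <- ggplot(bern, aes(x = Bernoulli)) +
  geom_histogram(aes(y = after_stat(count / sum(count))), bins = 2) +
  labs(x = "Bernoulli", y = "Share")
print(p)

# (c) descriptive statistics, with the variable labelled "Bernoulli"
desc <- data.frame(
  Variable = "Bernoulli",
  Mean     = mean(bern$Bernoulli),
  SD       = sd(bern$Bernoulli),
  Min      = min(bern$Bernoulli),
  Max      = max(bern$Bernoulli)
)
print(desc)
```

For a 0/1 variable the sample mean is the share of ones, so the Mean column should land close to 0.3.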
Question 2: Data [35 points]
Download the 2016 CPS data (the March CPS) from Brightspace (the dataset is named morg16). You should be familiar with morg16, since it has been used several times in the homework. You will be asked to run an OLS regression on specific variables, report the results and interpret the coefficients.
(a) [5 points] Restrict the data as follows:
(i) keep only the variables for weekly earnings, sex, age, race and education (the corresponding variable names are earnwke, sex, age, race, grade92).
(ii) keep only respondents aged 25-64.
(iii) keep only respondents from New York State (hint: stfips == 36).
(b) [5 points] Data cleaning: Replace all values equal to 0 with NA, then drop all observations that contain an NA.
(c) [10 points] Take the log of individuals' weekly earnings and store it in a new column/vector called logincome. Plot the log of weekly earnings against age for respondents identified as male only. Do the same for respondents identified as female.
(d) [5 points] Create a dummy variable equal to one if an individual is aged 45 or older and zero otherwise.
(e) [10 points] Regress the log of weekly earnings (dependent variable) on the age dummy you created, separately for the sample of men and the sample of women. Report your OLS results in a formatted table using stargazer.
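The cleaning and regression steps of Question 2 can be sketched on a small simulated data frame, as below. This is an illustration under several assumptions, not an official solution: the data frame toy is made up and only borrows the morg16 column names listed above, the coding sex == 1 for men and sex == 2 for women is assumed, and the name age45 for the dummy is hypothetical. Loading the real file and plotting are left out.

```r
set.seed(123)

# Toy stand-in for morg16: all values are simulated for illustration only
n <- 200
toy <- data.frame(
  earnwke = ifelse(runif(n) < 0.1, 0, round(runif(n, 400, 2500))),
  sex     = sample(1:2, n, replace = TRUE),  # assumed coding: 1 = male, 2 = female
  age     = sample(18:80, n, replace = TRUE),
  race    = sample(1:4, n, replace = TRUE),
  grade92 = sample(31:46, n, replace = TRUE),
  stfips  = sample(c(34, 36), n, replace = TRUE)
)

# (a) New York respondents aged 25-64, keeping only the five variables
ny <- subset(toy, stfips == 36 & age >= 25 & age <= 64,
             select = c(earnwke, sex, age, race, grade92))

# (b) recode zeros as NA, then drop incomplete rows
ny[ny == 0] <- NA
ny <- na.omit(ny)

# (c) log of weekly earnings
ny$logincome <- log(ny$earnwke)

# (d) dummy equal to one for respondents aged 45 or older
ny$age45 <- as.integer(ny$age >= 45)

# (e) separate regressions for men and women
fit_men   <- lm(logincome ~ age45, data = subset(ny, sex == 1))
fit_women <- lm(logincome ~ age45, data = subset(ny, sex == 2))
print(coef(fit_men))
print(coef(fit_women))

# With the stargazer package loaded, a formatted table could be produced with:
# stargazer(fit_men, fit_women, type = "text")
```

Because the regressor is a dummy, the intercept in each regression is the mean of logincome for respondents under 45, and the slope is the difference in mean logincome between the 45-and-older group and the under-45 group.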
Question 3: Simulate OLS Coefficients [50 points]
In this question, you are going to simulate OLS estimates step by step.
Suppose the true relationship between Yi and Xi is captured by:
Yi = Xi + εi
where E[εi | Xi] = 0; εi is drawn from a normal distribution with mean equal to 0 and standard deviation equal to 1, and Xi is drawn from a normal distribution with mean equal to 2 and standard deviation equal to 2.
(a) [10 points] Generate a sample of 1000 observations from a normal distribution with mean equal to 2 and standard deviation equal to 2, and name the variable covariate. Generate a sample of 1000 observations from a normal distribution with mean equal to 0 and standard deviation equal to 1, and name the variable error. Generate the outcome variable using the relationship above and name it outcome. Save the three variables in a data frame.
(b) [5 points] Plot the sample of outcome variable (Y-axis) against the covariate (X-axis) using a scatterplot. Make sure to label the Y-axis “outcome” and X-axis “covariate.”
(c) [10 points] Compute the OLS estimates of the intercept and the slope using the sample you generated from the population. Then show the results in a formatted table using stargazer.
(d) [10 points] Use a for loop to simulate the OLS regression from part (c) 1000 times (generating a new sample of X, Y each time). Save the slope and intercept coefficients from each iteration, then plot a histogram of the estimates (one histogram for the slope estimates and one for the intercept estimates). Print the mean and the standard deviation of the estimators.
(e) [15 points] Repeat parts (a) to (d) with a sample of 100 observations.
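The simulation described in parts (a) and (d) could be organized roughly as follows. This is a sketch rather than an official solution: the helper draw_sample() and the names n_obs, n_sims, intercepts and slopes are hypothetical, and base R hist() is used for brevity where ggplot2 would work equally well.

```r
set.seed(123)

# Hypothetical helper: draw one sample of size n from Y = X + e,
# with X ~ N(mean 2, sd 2) and e ~ N(mean 0, sd 1)
draw_sample <- function(n) {
  covariate <- rnorm(n, mean = 2, sd = 2)
  error     <- rnorm(n, mean = 0, sd = 1)
  data.frame(covariate = covariate, error = error,
             outcome = covariate + error)
}

n_obs  <- 1000  # observations per sample; set to 100 for part (e)
n_sims <- 1000  # number of simulated regressions

intercepts <- numeric(n_sims)
slopes     <- numeric(n_sims)

for (i in seq_len(n_sims)) {
  # A fresh sample of X and Y is drawn in every iteration
  fit <- lm(outcome ~ covariate, data = draw_sample(n_obs))
  intercepts[i] <- coef(fit)[["(Intercept)"]]
  slopes[i]     <- coef(fit)[["covariate"]]
}

hist(slopes, main = "Slope estimates", xlab = "slope")
hist(intercepts, main = "Intercept estimates", xlab = "intercept")

cat("slope:     mean =", mean(slopes), " sd =", sd(slopes), "\n")
cat("intercept: mean =", mean(intercepts), " sd =", sd(intercepts), "\n")
```

Since the true model has intercept 0 and slope 1, the two histograms should be centered near those values. Rerunning with n_obs set to 100 should give wider histograms, because the estimators are less precise in smaller samples.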
2025-11-05