

ECON 3338: Introduction to Econometrics I

Fall 2021

Project Instructions

Read the entire document carefully


1. Introduction

In this course, you are required to work on a project that applies econometric analysis to economic data.

Throughout the course, you will learn how to use regression analysis to evaluate an economic relationship, such as the effect of one or more variables on another variable of interest. More specifically, you can estimate the effect by applying OLS to sample data. One example in our textbook is the effect of education on wages. You can extend this analysis by including additional explanatory variables such as experience and firm size.

This project requires you to select a topic of your own in which an economic relationship can be analyzed. You will be responsible for determining the research question, formulating the regression model, finding the relevant data, performing the econometric analysis and interpreting the empirical results.


2. Technical Details-Formats

Your project report (sometimes called the term paper) should be typed.

1. Cover page. You should have a cover page that should be structured as follows:

Name

B00#

Date

Project Title

Prepared for ECON 3338: Introduction to Econometrics

Section 01

2. Length. The maximum length including figures, tables and references should not exceed 10 pages.

3. Font Size and Space. You should use either Times New Roman or Garamond fonts, size 12. The text should be double-spaced.

4. Equations. Use an equation editor (built into MS Word) to show your regression(s), and number all equations in your text sequentially (e.g., the first equation will be (1), the second will be (2), etc.).

5. Color vs Greyscale Graphs/Figures. You can use color graphs/figures for your e-copy. However, as the printouts will probably be in black and white, use different line styles (e.g., solid, dotted, etc.) whenever a graph/figure contains two or more lines.


3. Proposed Outline

1. Introduction

This section must contain a brief description of what you will do in your project without getting into details. Note that you should also briefly review the literature that is closely related to your project. By "briefly" we mean that the review should be to the point and should not contain irrelevant explanations.


2. Methodology

This section must discuss what you will do in this project. You should state your research questions and how to answer them. For example, you wish to know how education and experience affect wages. You can use a linear model and estimate it by using OLS. If there is a similar paper in the literature that is relevant, you must review and cite it and list the source of the paper in your references. It is important that you emphasize the difference between your work and the cited paper. Is it in the methodology? Do you include more independent variables in your analysis? Do you use a different estimation technique? Do you have a different dataset? Do you cover a different sample period? If any of the above is true, you are on the right track because your project might add new evidence to the existing literature.

Remember to write the regression that you plan to run in a format like:

$wage_i = \alpha + \beta \, edu_i + u_i$ (1)
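As a rough illustration of how a simple regression of wage on education can be estimated by OLS, the sketch below computes the intercept and slope estimates from the usual closed-form formulas. The function name `ols_simple` and all data values are hypothetical and serve only as an example; in your project you would use your own data and the software used in the course.

```python
# Minimal OLS sketch for a simple regression y = alpha + beta * x + u.
# All numbers below are made up for illustration only.

def ols_simple(x, y):
    """Return (alpha_hat, beta_hat) from the closed-form OLS formulas."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta_hat = s_xy / s_xx               # slope: sample covariance / sample variance
    alpha_hat = y_bar - beta_hat * x_bar  # intercept: line passes through the means
    return alpha_hat, beta_hat

edu = [8, 10, 12, 12, 14, 16, 16, 18]                    # hypothetical years of education
wage = [9.0, 11.5, 13.0, 14.5, 16.0, 19.5, 18.0, 22.5]   # hypothetical hourly wage

alpha_hat, beta_hat = ols_simple(edu, wage)
residuals = [w - alpha_hat - beta_hat * e for e, w in zip(edu, wage)]
print(f"alpha_hat = {alpha_hat:.3f}, beta_hat = {beta_hat:.3f}")
```

Here `beta_hat` is the estimated change in the wage associated with one more year of education, holding nothing else fixed because the model has only one regressor.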


3. Description of the data

In this section, you describe your data set: the variables and any transformations (log, squared, etc.), their nature (continuous or binary (0/1)), time period that they span (or the number of observations), and their source.

Summary statistics should be provided either in tables or figures, depending on their nature. The full range of summary statistics (mean/variance/min/max/skewness/kurtosis/number of observations) can be provided for continuous variables. Binary variables like gender (male/female) can be summarized in figures in the form of a pie/bar chart. You may try other data visualization approaches to provide the relevant information of your data set. Remember that this is not the core requirement for your project. If you notice some patterns in your data that are interesting or unusual, please discuss them.

Furthermore, you can also provide some preliminary analysis about the relationship between variables of interest using scatterplots between pairs of variables of interest.
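The summary statistics listed above can be computed along the following lines. This is only a sketch: the function names `summarize` and `correlation` and the sample values are hypothetical, the kurtosis is the raw (non-excess) version that equals 3 for a normal distribution, and the correlation coefficient is offered as a numerical companion to a scatterplot, not a replacement for it.

```python
import math

def summarize(values):
    """Summary statistics for one continuous variable."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    m2 = sum(d ** 2 for d in dev) / n   # second central moment
    m3 = sum(d ** 3 for d in dev) / n   # third central moment
    m4 = sum(d ** 4 for d in dev) / n   # fourth central moment
    return {
        "n": n,
        "mean": mean,
        "variance": m2 * n / (n - 1),   # sample variance (divides by n - 1)
        "min": min(values),
        "max": max(values),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,       # raw kurtosis, 3 for a normal distribution
    }

def correlation(x, y):
    """Pearson correlation between two variables."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_yy = sum((b - y_bar) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

# Hypothetical data for illustration only.
wage = [9.0, 11.5, 13.0, 14.5, 16.0, 19.5, 18.0, 22.5]
edu = [8, 10, 12, 12, 14, 16, 16, 18]
female = [0, 1, 1, 0, 1, 0, 0, 1]       # binary variable coded 0/1

print(summarize(wage))
print("share female:", sum(female) / len(female))  # the mean of a 0/1 variable is a share
print("corr(wage, edu):", correlation(wage, edu))
```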


4. Empirical Results

In section 2 you have explained your research methodology. In this section, you should estimate your models based on your data and report the empirical results. The regression outputs must be provided in the table form as shown in our textbook and the empirical findings must be fully discussed. In the course, you will learn how to estimate the models and how to do inference for the models (i.e., testing hypotheses about the values of the coefficients of your model based on your OLS estimates). You are asked to use what you have learned to estimate, and make inference for, your models. In this section, you will also discuss the model specification and potential biases in your analysis. You may consider additional independent variables that would also affect the dependent variable and you may want to see if they are correlated with the included independent variables.

If you have regressed the same dependent variable on different sets of independent variables, you should discuss which resulting model is better in terms of goodness of fit (e.g., R² and adjusted R²).
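To make the comparison concrete, the sketch below computes R-squared and adjusted R-squared from their standard definitions. The function names and the example values (sample size, number of regressors, R-squared) are hypothetical. The point of the example is that adding regressors never lowers R-squared, while adjusted R-squared can fall if the added variables contribute little.

```python
def r_squared(y, y_hat):
    """R-squared = 1 - SSR/SST for actual values y and fitted values y_hat."""
    y_bar = sum(y) / len(y)
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ssr / sst

def adjusted_r_squared(r2, n, k):
    """Adjusted R-squared for n observations and k slope coefficients."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical comparison: model B adds three regressors to model A
# but raises R-squared only from 0.30 to 0.31.
adj_a = adjusted_r_squared(0.30, n=100, k=2)
adj_b = adjusted_r_squared(0.31, n=100, k=5)
print(f"model A: {adj_a:.4f}, model B: {adj_b:.4f}")
```

In this made-up case the larger model has the higher R-squared but the lower adjusted R-squared, so the extra variables do not improve the fit once the loss in degrees of freedom is taken into account.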

Be aware that your empirical results are reliable only if most of the OLS assumptions are satisfied. After running the regressions, you should check whether the OLS assumptions are indeed satisfied. If they are not, your model may be misspecified. There are remedies for some of these violations; they will be discussed later in the course and should be considered in your analysis.


5. Conclusion

This is the final section of your project. You should provide a summary of what you have done and what you have found. In one or at most two paragraphs, state your research questions and interpret the empirical results. You may also provide suggestions for further research (e.g., including variables other than those in the dataset, considering different functional forms, or using different estimators).


6. References

You should list all cited papers in the literature that are related to your research questions following the Chicago Manual of Style as follows:

Zivot, E., and D. W. K. Andrews, (1992), Further Evidence on the Great Crash, the Oil-Price Shock, and the Unit-Root Hypothesis, Journal of Business & Economic Statistics, 10, 251-270.


Appendix

Most of you will probably use MS Word or a similar program to type up the project report. These programs can create layout difficulties if you include tables and figures in the main text: tables may jump to the next page, leaving a large blank space on the previous page, and figures may move to different pages when you resize them. One way to deal with these problems is to put all your tables and figures at the end of the file. One word of caution: all tables and figures should be labeled (e.g., Table 1, Figure 5, etc.) and given titles. When you discuss the empirical results in the text, use the table and figure numbers to refer to the results shown in those tables and figures. In Section 4, you should refer to the relevant tables and figures when you discuss the empirical results, as follows:

“The results of running the regression … can be found in Table 2. We notice that the coefficient estimate for education is statistically insignificant …”

Similarly, you can refer to figures by their numbers in your discussion.

If you move all tables and figures into an Appendix at the end of the file, you should have two sections: one for tables and one for figures. The tables must be put in order of appearance in the text: Table 1 should be the first table mentioned in the text, and so on. Proceed similarly for the figures.

It is important that tables and figures are not a direct copy-paste from the software output. You should create your own tables and graphs. If you do not do so, there will be a point deduction.

The following can be considered as rough guidelines on the above structure:

1. Short introduction (2-3 paragraphs) on: what is(are) the research question(s)? Why do we care? What does the literature say about your question(s)? Focus on the dependent variable.

2. The model (1 page+): What regression model do you use to answer your research question(s)? Write down the relevant equation(s) and present the independent variables you are using. What impact do you expect the independent variables to have on the dependent variable? That is, will each of these independent variables have a positive or a negative impact on the dependent variable? Focus on the independent variables.

3. Data description (2-3 paragraphs): What is the source of the data that you are using? Provide some discussion of the descriptive statistics of the dependent and independent variables that you will be using to estimate your regression model.

4. Empirical Results (2-3 pages): you can divide the analysis in the following distinct parts

Misspecification Testing

➢ Estimate your regression model in the original form

➢ Perform testing for assumptions (RESET, B-P and White’s tests) in the following order:

If RESET indicates that functional form is a problem, transform the model using logarithms or by adding squared terms of some of your independent variables.

Estimate the “updated” model and check again for functional form using RESET.

Regardless of whether you find that there is still a problem, move forward by checking the “updated” model for heteroskedasticity (B-P and White’s tests).

If heteroskedasticity is not an issue, then you are ready to discuss your regression output results.

If heteroskedasticity is found, then re-estimate your “updated” model using Robust Standard Errors and proceed to discuss your regression results.

Discussion of Regression Output

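➢ Overall significance (the F-test in the regression output tests the null hypothesis that all slope coefficients are jointly equal to zero, i.e., that the independent variables are jointly insignificant).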

➢ Goodness-of-fit: what is the adjusted R-squared found? Do you deem it to be large enough that the model has explanatory power? Could you add more independent variables to the model?

Note that if you decide to explore using more independent variables by including them in the “updated” model, please evaluate these additions with the general-to-specific approach discussed in the next section.

➢ Individual coefficient estimates: Statistical interpretation

If you have found a coefficient estimate associated with a variable to be significant, please state the finding (mention the significance level).

If you have found a coefficient estimate associated with a variable to be insignificant, please state the finding. There is nothing wrong with an insignificant finding. It means that, based on the dataset, you have found no statistical evidence of any effect of this independent variable on the dependent variable. It is also of interest to find out whether some independent variables have any explanatory power or not.

If you find that the coefficient estimates associated with two or more independent variables are insignificant, you may want to check whether they are jointly insignificant. You can perform an F-test of this restriction.

➢ Individual coefficient estimates: Economic interpretation

You can now interpret the coefficient estimates in economic terms, i.e., what effects the independent variables have on the dependent variable. Do you think that the effects are large?

Please do so for both statistically significant and insignificant coefficient estimates.

5. Conclusion (2 paragraphs): Briefly report the model (in words) you wish to estimate and explain why you care. What are your main findings (the independent variables that have some effects on the dependent variables and the magnitude of each effect)? How could these empirical results be used for policy making (practical purposes)?
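The heteroskedasticity step of the misspecification testing in item 4 can be sketched as follows for a model with a single regressor. This is an illustration under stated assumptions, not the procedure of any particular software package: the data are made up so that the error variance grows with x, the function names are hypothetical, the B-P statistic is computed in its LM form (n times the R-squared from regressing the squared residuals on x), and the robust standard error is the basic White (HC0) version. RESET and White's test are not covered here.

```python
import math

def ols_simple(x, y):
    """Return (alpha_hat, beta_hat) for y = alpha + beta * x + u."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = s_xy / s_xx
    return y_bar - beta * x_bar, beta

def breusch_pagan(x, resid):
    """LM statistic and p-value; H0: homoskedasticity (chi-square, 1 df)."""
    n = len(x)
    u2 = [e ** 2 for e in resid]
    a, b = ols_simple(x, u2)                      # auxiliary regression of u^2 on x
    u2_bar = sum(u2) / n
    sst = sum((v - u2_bar) ** 2 for v in u2)
    ssr = sum((v - a - b * xi) ** 2 for v, xi in zip(u2, x))
    lm = max(n * (1 - ssr / sst), 0.0)
    p_value = math.erfc(math.sqrt(lm / 2))        # chi-square(1) upper-tail probability
    return lm, p_value

def slope_standard_errors(x, resid):
    """Conventional and heteroskedasticity-robust (HC0) s.e. of the slope."""
    n = len(x)
    x_bar = sum(x) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    sigma2 = sum(e ** 2 for e in resid) / (n - 2)
    conventional = math.sqrt(sigma2 / s_xx)
    robust = math.sqrt(sum(((xi - x_bar) * e) ** 2 for xi, e in zip(x, resid))) / s_xx
    return conventional, robust

# Hypothetical data: the size of the error term grows with x.
x = list(range(1, 21))
y = [2.0 + 1.5 * xi + (-1) ** xi * 0.3 * xi for xi in x]

alpha_hat, beta_hat = ols_simple(x, y)
resid = [yi - alpha_hat - beta_hat * xi for xi, yi in zip(x, y)]
lm, p_value = breusch_pagan(x, resid)
se_conv, se_robust = slope_standard_errors(x, resid)

# Follow the order in the guidelines: use robust s.e. only if H0 is rejected.
se_used = se_robust if p_value < 0.05 else se_conv
print(f"LM = {lm:.2f}, p-value = {p_value:.4f}, s.e. used = {se_used:.4f}")
```

Note that the coefficient estimates themselves do not change when robust standard errors are used; only the standard errors, and therefore the t-statistics and p-values, are affected.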

Important: whenever you perform hypothesis testing, interpret the p-value relative to the null hypothesis:

p-value<1%: reject null hypothesis at 1% significance level

1%<p-value<5%: reject null hypothesis at 5% significance level

5%<p-value<10%: reject null hypothesis at 10% significance level

p-value>10%: do not reject null hypothesis.

In misspecification tests, the null hypothesis is that there is no problem with the specification or no misspecification. In regression output, the null hypothesis is that the individual coefficient is equal to zero (statistically insignificant).
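The decision rule above can be written as a small helper. The function name `decision` is hypothetical, and the treatment of p-values that fall exactly on a boundary (e.g., exactly 5%) is an assumption, since the rule above does not specify it.

```python
def decision(p_value):
    """Map a p-value to the conclusion about the null hypothesis H0."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value < 0.01:
        return "reject H0 at the 1% significance level"
    if p_value < 0.05:
        return "reject H0 at the 5% significance level"
    if p_value < 0.10:
        return "reject H0 at the 10% significance level"
    return "do not reject H0"

# In a misspecification test, H0 is "no misspecification",
# so a large p-value is the outcome you hope for.
print("RESET, p = 0.32:", decision(0.32))
# In the regression output, H0 is "the coefficient equals zero",
# so a small p-value means the coefficient is statistically significant.
print("coefficient, p = 0.03:", decision(0.03))
```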


4. Which variables should you include in your models?

Sometimes you have a theory about which independent variables you should include in your model (for example, when estimating the coefficients of a Cobb-Douglas production function, the independent variables are labor and physical capital). In other cases, you do not have such a theory but have a large set of independent variables that you believe determine or affect your dependent variable. In such a case, you may not be sure which ones to include. You can proceed in the following way:

Begin with all independent variables that you think are relevant and that do not have near multicollinearity issues. If applicable, you may consider cross products of these variables (you need to justify why you do so).

Estimate your model with all of them.

Check the output. Some of the coefficient estimates may be statistically insignificant. Consider testing the null hypothesis that these coefficients are jointly insignificant. If you cannot reject this null hypothesis, remove the independent variables associated with these coefficients from your model. Otherwise, consider testing the null hypotheses related to subgroups of these coefficients.

Estimate your model again and repeat the hypothesis testing process. The remaining coefficient estimates (especially their signs) should not change considerably compared to the previous model; if they do, you may have omitted variable bias. Examine again whether the OLS assumptions are satisfied.

In the end, you will be left only with the independent variables with statistically significant coefficient estimates. This will be your final version of the model. Examine if the OLS assumptions are valid for this model.

This procedure is called the “general to specific” approach and it is a process in which researchers remove, at each step, the independent variables with insignificant coefficient estimates from the initially proposed model.
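One step of this procedure, the joint F-test used to decide whether a group of variables can be removed, can be sketched as follows. The example assumes NumPy is available; the data are simulated, the variable names are hypothetical, and the 5% critical values are approximate figures of the kind you would look up in an F table for the relevant degrees of freedom.

```python
import numpy as np

def ssr(X, y):
    """Sum of squared residuals from OLS of y on X (X includes a constant column)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def joint_f_stat(X_unrestricted, X_restricted, y):
    """F statistic for H0: the coefficients on the dropped variables are all zero."""
    n, k_u = X_unrestricted.shape
    q = k_u - X_restricted.shape[1]            # number of restrictions
    ssr_u = ssr(X_unrestricted, y)
    ssr_r = ssr(X_restricted, y)
    return ((ssr_r - ssr_u) / q) / (ssr_u / (n - k_u))

# Simulated data for illustration: only x1 truly affects y.
rng = np.random.default_rng(0)
n = 200
x1, x2, x3 = rng.normal(size=(3, n))
y = 1.0 + 2.0 * x1 + rng.normal(size=n)

const = np.ones(n)
X_full = np.column_stack([const, x1, x2, x3])   # general model
X_no_x2_x3 = np.column_stack([const, x1])       # candidate specific model
X_no_x1 = np.column_stack([const, x2, x3])      # drops the relevant variable

f_irrelevant = joint_f_stat(X_full, X_no_x2_x3, y)   # H0: coefficients on x2, x3 are zero
f_relevant = joint_f_stat(X_full, X_no_x1, y)        # H0: coefficient on x1 is zero

CRIT_2_196 = 3.04   # approximate 5% critical value of F(2, 196), from a table
CRIT_1_196 = 3.89   # approximate 5% critical value of F(1, 196), from a table
print("drop x2 and x3?", "yes" if f_irrelevant < CRIT_2_196 else "no")
print("drop x1?", "yes" if f_relevant < CRIT_1_196 else "no")
```

If the F statistic is below the critical value, the null hypothesis of joint insignificance is not rejected and the variables can be removed; the reduced model is then re-estimated and checked again, as described above.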

Finally, study the materials on the OLS specification. You will find it useful in building your model and interpreting your empirical results.


5. Project Rubric



Component                                       Approximate Weight

Technical
  Model Description                             7.5
  Data Discussion                               10
  Control Variables (check)                     15
  Misspecification Testing                      12.5
  Statistical Significance                      15

Qualitative
  Motivation/Originality of Research Question   10
  Literature Review                             10
  Economic Significance (Discussion)            10
  Layout-Format                                 10