Research project guidelines

1 Overview

You will complete a research project on a topic of your choice, using data compiled from existing databases and analyzed in Stata (if you’d like to use statistical software other than Stata, please let me know first). The project has three parts: (1) topic and variable description, (2) simple regression model, and (3) multiple regression model. Projects based on collecting original data via self-designed survey instruments are not allowed.

1.1 Format requirements

Each submission should have a professional appearance. Use a 12-point font and double-spacing. Use complete sentences: no bullet points! Use the spell checker. Integrate the statistical output into the text; do not simply attach output from the statistical software you use to the end of your document.

Present your work in a logical order that is easy for the reader (me) to follow. I am not grading on grammar per se, but persistently sloppy grammar will result in point deductions, as will poor presentation of your results.

1.2 Interpretation of statistical output

All statistical analyses in your project should be interpreted. Use the statistical terminology from the course, along with an interpretation that is understandable by a general audience. For example: “I reject the null hypothesis that the two means are equal, leading me to conclude that the selling price of the houses in gated communities is greater than the selling price of houses in non-gated communities.” Presenting results with no discussion will result in a loss of points.

You do not need to conduct or present the results of a statistical test that we have not gone over in class, e.g., test for normality, test for multicollinearity.

2 Project requirements

2.1 Topic and Variable Description

2.1.1 Overview

The choice of topic is possibly the most challenging part of this project. A general principle to follow in selecting a topic: do not overthink this! The purpose of this project is for you to apply the concepts we are learning in this course. Period. You are not expected to expand the frontiers of knowledge. If you like sports, consider a topic in your sport of interest. If you came across something interesting in one of your economics courses, explore it in more detail in this project. If you are getting stressed out picking a topic, particularly if you think it isn’t original enough, schedule a meeting with me.

Use cross-sectional data—no time series and no panel datasets. (These are topics for EC 451.) Part of the purpose of this project is to give you the experience of assembling a dataset to explore a topic of your choosing.

2.1.2 Requirements

1. Describe your topic: Discuss what question you are trying to answer and why it is important to people interested in that field. Your topic should express a specific hypothesis, e.g., higher teacher salaries are a positive and significant determinant of student performance on standardized tests.

In this example, teacher salaries are the independent, or X, variable and the variable of interest; standardized test scores are the dependent, or Y, variable. You will need at least 4-5 independent variables, at least one of which must be a dummy variable. Get help choosing those variables by reviewing what other researchers have used in similar projects.

2. Data: The data set that you will build for the project must contain at least 5-6 variables (1 dependent variable and 4-5 independent variables). Before you proceed further, you should check that data is available for each of your variables. Missing values are not allowed. If you have a missing value for one variable, you will have to throw out the entire observation. Describe your variables with enough specificity to make me confident that you know what you are doing.

• Considerations for the dependent variable (Y):

– Cannot be binary, e.g., win/loss, success/failure, loan/no loan, white/non-white.

– Cannot be categorical, for example, a team’s ranking in its conference.

– Cannot be a percentage.

– Can be a ratio, e.g., crime rate (number of violent crimes per 100,000 people), mortality rate (number of deaths per 100,000 people), etc.

• Considerations for the independent variables:

– Most should be discrete or continuous variables; no more than 2 binary (dummy) variables.

– You can use a variable measured in percentages here.

– Include at least one binary variable that splits your sample into 2 sub-samples (not necessarily of equal size).

* Base the split on a logical division within your sample where there are intuitive differences between the two groups. For example, if you are using sports teams, break the sample into NFC vs. AFC, NL vs. AL, or large-market teams vs. small-market teams. Countries could be divided into developed and less-developed countries.

* Categorical variables (e.g., ethnicity, level of education, region of country) must be converted into binary variables.

* If the dependent variable can be represented as an identity, you cannot use its defining variables as independent variables.

· For example, GDP = C + I + G + X – M, so C, I, G, X and M cannot be your independent variables.

· For a per capita variable such as GDP per capita, neither GDP nor population can be an independent variable.
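As an illustration of two of the data rules above (throwing out any observation with a missing value, and converting a categorical variable into a binary one), here is a minimal sketch in Python. The course itself uses Stata; the dataset, the variable names (`wins`, `payroll`, `nfc`) and both helper functions are hypothetical and only meant to show the logic.

```python
# Hypothetical raw observations; None marks a missing value.
rows = [
    {"team": "A", "wins": 10, "payroll": 150.0, "conf": "NFC"},
    {"team": "B", "wins": 7, "payroll": None, "conf": "AFC"},
    {"team": "C", "wins": 12, "payroll": 180.5, "conf": "AFC"},
    {"team": "D", "wins": 5, "payroll": 95.0, "conf": "NFC"},
]


def drop_incomplete(observations):
    """Keep only observations that have a value for every variable."""
    return [obs for obs in observations
            if all(value is not None for value in obs.values())]


def add_dummy(observations, source, new_name, one_if):
    """Return copies of the observations with a 0/1 variable added.

    The dummy equals 1 when obs[source] == one_if and 0 otherwise.
    """
    return [{**obs, new_name: 1 if obs[source] == one_if else 0}
            for obs in observations]


# Team B is dropped entirely because its payroll is missing.
complete = drop_incomplete(rows)

# nfc = 1 for NFC teams, 0 for AFC teams.
data = add_dummy(complete, source="conf", new_name="nfc", one_if="NFC")

for obs in data:
    print(obs["team"], obs["wins"], obs["payroll"], obs["nfc"])
```

Stating the conversion rule this explicitly (which category is coded 1) is also what the variable description should report.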

3. Data sources: What is the source or sources of your data? Will you be able to get the data for your variables directly or will you have to transform data from your sources to get the desired variable? If you have to transform the data, how will you do so?

2.2 Simple Regression Model

1. Variable description: Describe each of your variables. Imagine you are describing your project to someone who knows nothing about it and give them enough information to know what your variables are. An adequate description should include:

• Definition: what the variable represents, its units of measurement, and its scaling (hundreds, millions, etc.). Note that income variables should always be adjusted for inflation; the adjusted variable is often referred to as real income or income measured in constant dollars.

• Create a short descriptive name (around 6 characters) for each variable for use in your code and the regression equation (I will deduct points for using long descriptions as variable names).

• For each categorical variable, provide the rule used to convert it to a binary variable.

• For each non-binary variable, create a histogram and comment on its characteristics. For example, a histogram of income will most likely be positively skewed. The majority of the observations will be on the left side of the histogram, and the histogram will have a long right tail as there is a small number of people at the top of the earnings distribution. Include the output of the summary statistics from Stata (or the statistical software you use) and discuss the summary statistics of each variable.

• For each independent variable, specify the sign you expect its coefficient to have and explain why you expect it to have that sign. In other words, provide a hypothesis and tell a story justifying that hypothesis.
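The inflation adjustment mentioned in the variable definition can be sketched as follows. This is only an illustration in Python (the course uses Stata), assuming you have a price index such as the CPI for both the year of the observation and a base year of your choosing; the function name `to_real` and the index values are made up for the example and are not actual CPI figures.

```python
def to_real(nominal, index_obs, index_base):
    """Express a nominal dollar amount in base-period (constant) dollars.

    real = nominal * (price index in base period / price index in observed period)
    """
    return nominal * index_base / index_obs


# Hypothetical price index values, NOT actual CPI data.
price_index = {2010: 100.0, 2020: 125.0}

nominal_income_2020 = 50_000.0
real_income = to_real(nominal_income_2020,
                      index_obs=price_index[2020],
                      index_base=price_index[2010])

print(real_income)  # 40000.0, i.e. income measured in constant 2010 dollars
```

Whichever base year you choose, state it in the variable definition so the units of the coefficient can be interpreted.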

2. Simple linear regression: Run a simple linear regression of your dependent variable on a constant and your variable of interest (pick one that you’re most interested in). Don’t forget to use robust standard errors.

• Interpret the coefficient: Does the coefficient have the expected sign? Is it statistically significant? Is it economically significant?

Note: You only need to present ONE simple linear regression (do NOT present more than one).
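To make concrete what the simple regression and its robust standard error compute, here is a minimal sketch in plain Python, not a substitute for your Stata output. It assumes the heteroskedasticity-robust variance with the n/(n - 2) small-sample correction (often labeled HC1); in Stata, robust standard errors are requested with the `vce(robust)` option of `regress`. The data and the names `salary` and `score` are hypothetical.

```python
import math


def simple_ols_robust(x, y):
    """OLS of y on a constant and x, with an HC1 robust SE for the slope.

    Returns (intercept, slope, robust SE of slope, t-statistic of slope).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]          # deviations of x from its mean
    sxx = sum(d * d for d in dx)
    slope = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Robust (sandwich) variance of the slope, scaled by n / (n - 2).
    var_slope = (n / (n - 2)) * sum((d * e) ** 2 for d, e in zip(dx, resid)) / sxx ** 2
    se_slope = math.sqrt(var_slope)
    return intercept, slope, se_slope, slope / se_slope


# Hypothetical data: teacher salary (thousands of dollars) and average test score.
salary = [40, 45, 50, 55, 60, 65]
score = [70, 74, 73, 80, 82, 85]

intercept, slope, se_slope, t_stat = simple_ols_robust(salary, score)
print(f"intercept = {intercept:.3f}")
print(f"slope     = {slope:.3f} (robust SE {se_slope:.3f}, t = {t_stat:.2f})")
```

The slope is the estimated change in the dependent variable for a one-unit change in the variable of interest, so here it would be read as test-score points per additional thousand dollars of salary.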

2.3 Multiple Regression Model

• Run a regression of your dependent variable on all of your independent variables. Interpret the statistically significant coefficients only (those with p-values of .10 or less). Your interpretation should include the effect of a one-unit change in your variables of interest on the dependent variable in terms of the units your data are measured in. You should also note whether the coefficients have the expected sign.

• Compare this regression with the simple linear regression above: how well does this model fit the data?

• Perform at least one F-test of joint hypotheses.
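For reference, the two quantities most often used for the last two items can be written out as below. This is a sketch under assumed notation that does not come from this handout: n is the number of observations, k the number of independent variables in the full model, SSR the sum of squared residuals, TSS the total sum of squares, and q the number of restrictions in the joint null hypothesis. The F-statistic shown is the homoskedasticity-only form; when you estimate with robust standard errors, the statistic your software reports is a robust version and will generally differ from this formula. In Stata, a joint test is run with the `test` command after `regress`.

```latex
% Adjusted R-squared: penalizes adding regressors, so it is a fairer basis than
% R-squared for comparing the simple and the multiple regression.
\[
\bar{R}^2 = 1 - \frac{n-1}{n-k-1}\cdot\frac{SSR}{TSS}
\]

% Homoskedasticity-only F-statistic for q joint restrictions, comparing the
% restricted model (r) with the unrestricted model (ur).
\[
F = \frac{\left(SSR_{r} - SSR_{ur}\right)/q}{SSR_{ur}/(n-k-1)}
\]
```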

2.4 Conclusion

• Summarize your project: what are your findings? Explain the policy or practical implication of your findings to the audience relevant to your research question.