
ECO205

1st SEMESTER 2022/23 Group Coursework

BSc Actuarial Science Year 3

BSc Economics Year 3

BSc Financial Mathematics – Year 3

BA English and Finance Year 3

ECONOMETRICS I

Group Coursework

General guidelines

This group project is an integral component of ECO205 and contributes 35% of your module mark. Please choose a socioeconomic phenomenon or relationship (see the guidelines below on choosing topics) that involves two or more variables, and study it using real-world data and the statistical models you learn in ECO205. As a stand-alone empirical study, your report is expected to follow the structure of a typical academic research paper (the recommended structure is given under the guidelines on format). Your submission will be checked for similarity through Turnitin. Cases of academic dishonesty will be penalized according to university policy.

The topic may come from your own experience or knowledge (as an economist), textbook examples (with proper modification), or the academic literature. You are free to choose any topic, but please bear in mind that (1) it must make use of regression models, and (2) it must be properly motivated (i.e., why is it important or useful to investigate the specific problem). Please also note that even though the statistical methods and models presented in ECO205 are sufficient to produce many interesting results, you are free to use more advanced statistical methods if they provide additional information or fit your purpose.

Guidelines on choosing topics

If you don't know where to start, the following sources are good places to look for research topics.

1. The first source is your textbooks in other fields of study (micro/macroeconomics, labor economics, international economics, finance, etc.). These textbooks usually cover a wide range of economic or financial theories that you can test with real-world data. For example, you learned the concept of the production function in micro/macroeconomics, and you may want to estimate a parametric form using city-level data on capital stock, labor input, and output for a given year.

2. A second source of topics is the academic literature. Google Scholar is the best place to search the academic literature. Type a keyword and it will return hundreds of articles. You may read an article arguing that urban land use is determined by income, population, and urban transportation conditions. Following this article, you can collect data from China City Statistical Yearbook 2018 on (1) urban population, (2) per capita income, (3) transport infrastructure, and (4) urban land use, and analyze how the first three factors may affect urban land use.

3. A third source is textbooks in econometrics. Most econometric textbooks emphasize empirical examples or exercises. Thus, they provide a large pool of potential topics. The easiest approach is to take one of the problems and apply the empirical model to your own data.

4. Of course, your topics are not restricted to the sources mentioned above. I also encourage you to find your own topics through deep thinking. Deep thinking produces interesting research questions. To give an example, you may model housing price as jointly determined by demand and supply factors. However, there are many such factors. It is then your job to narrow them down to a few major ones and collect data accordingly. This cannot be done without deep thinking. Even if you adopt a research question raised by others, deep thinking will help you refine the question and generate new insights. For instance, in the model of Chinese housing price, you may want to consider factors that others have overlooked but that may be important in the Chinese context, such as administrative hierarchy and geographical location. These factors may bring further insights into your results.

Below are a few exemplary topics:

•    Estimate aggregate production function using regional (province- or city-level) data.

•    Estimate determinants of pollutants emission using regional data.

•    Estimate determinants of housing price using regional data.

•    Estimate β-convergence using national data.

•    Estimate the environmental Kuznets curve using national data.
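As a rough guide to how the last two topics might be written as regression models, one common specification for each is sketched below. The symbols are illustrative assumptions and are not prescribed by this brief: $y$ is income per capita, $E$ is emissions per capita, $T$ is the number of years between the initial and final observation, and $u_i$, $v_i$ are error terms.

```latex
% beta-convergence: average growth rate regressed on initial income
\frac{1}{T}\ln\!\left(\frac{y_{i,T}}{y_{i,0}}\right) = a + b \ln y_{i,0} + u_i,
\qquad b < 0 \text{ indicates convergence}

% Environmental Kuznets curve: quadratic in log income
\ln E_i = \gamma_0 + \gamma_1 \ln y_i + \gamma_2 (\ln y_i)^2 + v_i,
\qquad \gamma_1 > 0,\ \gamma_2 < 0 \text{ gives an inverted U}
```

In the second equation, the implied turning point is at $\ln y^{*} = -\gamma_1 / (2\gamma_2)$. Other functional forms and additional control variables are equally possible; the choice should follow from your research question.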

Although there is no restriction on the scope of topics you may try, please adhere to the following principles to ensure that you obtain meaningful results from the analysis.

1. Please make sure you test an economic model, rather than an accounting identity. An economic model is a hypothesized functional form (derived from some theory) that describes how one variable is determined by other variables. The exact form of this function is unknown and must be estimated using real-world data. For instance, economists often view the entire economy as a factory, where inputs (capital and labor) are converted into output (GDP) using a certain technology. A commonly adopted functional form is the Cobb-Douglas one, i.e., $Y = AK^{\alpha}L^{\beta}$, where $Y$ stands for GDP, $K$ for capital stock, $L$ for labor input, and $A$ for total factor productivity (TFP). In this formulation, the parameters $\alpha$ and $\beta$ are unknown and can be estimated using real-world data. Within the regression model, we can test whether the technology exhibits constant returns to scale ($\alpha + \beta = 1$), increasing returns to scale ($\alpha + \beta > 1$), or decreasing returns to scale ($\alpha + \beta < 1$).
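To see why the Cobb-Douglas form can be handled with linear regression tools, take logarithms of both sides. The sketch below writes the two exponents as $\alpha$ and $\beta$ and assumes a multiplicative error term $e^{u_i}$ with observations indexed by $i$; the error term and the index are assumptions added here for illustration.

```latex
\ln Y_i = \ln A + \alpha \ln K_i + \beta \ln L_i + u_i

% Constant returns to scale: H_0 : \alpha + \beta = 1
% Subtracting \ln L_i from both sides gives an equivalent form
\ln\!\left(\frac{Y_i}{L_i}\right) = \ln A + \alpha \ln\!\left(\frac{K_i}{L_i}\right) + \theta \ln L_i + u_i,
\qquad \theta = \alpha + \beta - 1
```

In the second form, constant returns to scale corresponds to $\theta = 0$, which can be checked with an ordinary t test on the coefficient of $\ln L_i$; $\theta > 0$ and $\theta < 0$ correspond to increasing and decreasing returns to scale. A joint test of $\alpha + \beta = 1$ on the first form is an equivalent alternative.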

Accounting identities, on the other hand, are known formulas that must be universally true. This statement has two implications. First, the parameters of the formula are all known, which means there is no need to estimate them. Second, the relationship must hold exactly for any data set.

To illustrate, let's consider the well-known GDP decomposition by expenditure type: $Y = C + I + G + NX$, where $Y$ stands for GDP, $C$ for personal consumption expenditures, $I$ for private investment, $G$ for government spending, and $NX$ for net exports. This is an accounting identity because every use of output must fall into one of these four categories. Thus, their sum must be GDP. Here we have a linear function in $C$, $I$, $G$, and $NX$, but their coefficients are known to be unity. Hence, it is meaningless to estimate this equation.
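The point can be checked numerically. The sketch below builds GDP from made-up component values (random numbers, not real national accounts) and fits a least-squares regression using only the Python standard library. The helper `ols` and all variable names are hypothetical and exist only for this illustration; in the coursework itself you would run the regression in Stata.

```python
import random

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y,
    solved by Gauss-Jordan elimination with partial pivoting."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

rng = random.Random(0)  # fixed seed; the values are purely illustrative
components = []
for _ in range(40):
    cons = rng.uniform(50, 100)   # C
    inv = rng.uniform(10, 40)     # I
    gov = rng.uniform(10, 30)     # G
    netx = rng.uniform(-10, 10)   # NX
    components.append((cons, inv, gov, netx))

X = [[1.0, *row] for row in components]   # intercept, C, I, G, NX
y = [sum(row) for row in components]      # Y constructed from the identity

coefs = ols(X, y)
residuals = [yi - sum(b * x for b, x in zip(coefs, row)) for row, yi in zip(X, y)]
print("intercept and slopes:", [round(b, 6) for b in coefs])
print("largest absolute residual:", max(abs(e) for e in residuals))
```

Up to floating-point rounding, the intercept is zero, every slope is one, and the residuals vanish. The regression only returns what the identity already states, so there is nothing to learn from it.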

2. Data must be available for all the variables in your model. You cannot perform econometric analysis without data. Data availability is usually a major challenge for empirical studies. Using the Cobb-Douglas production function as an example, data on GDP (or value added) and labor input (employment) are usually relatively easy to obtain, but data on capital stock are seldom provided by statistical bureaus. If data on capital stock are unavailable, in principle the estimation cannot be done. In this particular example there are ways to overcome the data problem, but I don't plan to elaborate on them here.

As another example, you may conceptualize a relationship between IQ and students' academic performance, controlling for effort. Although measures of effort are relatively easy to construct (attendance, hours of study, etc.), a reliable measure of IQ is usually difficult to obtain. You would presumably need to ask each subject to take an IQ test, which is very costly and difficult to implement.

If your study employs country-level, province-level, or city-level aggregate data, please keep in mind that government agencies or international organizations are your only data source. Please check their websites or publications (statistical yearbooks) to verify that the data you need are available. If you plan to collect data by a survey, please think carefully about implementation issues.

If data availability is a problem, you have three options. First, you can change the proxy you are using for the variable of interest. For instance, if you need data on the number of permanent residents in cities, but such information is not provided, you can use the number of registered residents instead. Second, you can modify your topic by using a different variable. As an example, you may want to study the production function for the economy as a whole. In that situation you need the productive capital stock for the entire economy. Suppose those data are unavailable but the statistical yearbooks do provide data on the capital stock of the secondary industry; then you can narrow your topic down to the production function of the secondary industry. Third, if neither of the first two options is feasible, you had better think about a different topic for which data are available.

Guidelines on using data

A large sample is always recommended. Although the lecture mentioned that the minimum sample size could be as small as 50, in empirical studies you are strongly advised to have far more observations. A sample size of a few hundred or more is preferred.

Aggregate socioeconomic data at the city-, province-, or country-level can be downloaded from online sources. Below are some frequently used ones.

Statistical yearbooks offered by CNKI (access from XJTLU library link):

https://data-oversea-cnki-net.ez.xjtlu.edu.cn/chn/

Data offered by the National Statistics Bureau (register to download):

http://data.stats.gov.cn/index.htm

World Bank Open Data (all indicators):

https://data.worldbank.org/indicator?tab=all

IMF data:

https://www.imf.org/en/Data#global

Eurostat:

https://ec.europa.eu/eurostat/data/database

OECD.Stat:

https://stats.oecd.org/index.aspx?lang=en

FRED Economic data:

https://fred.stlouisfed.org/

A rich collection of online data sources (including U.S. labor survey data) compiled by the American Economic Association:

https://www.aeaweb.org/resources/data

Please note: Some data sources cannot be accessed from China. Please find a technical solution if you need them.

Guidelines on designing the analysis

This module covers quite a few important methods, including the OLS regression model, tests of a single parameter, tests of joint hypotheses, tests for heteroskedasticity and WLS, nonlinear models, instrumental variables, etc. You are expected to employ appropriate methods (potentially including statistical methods not covered by this module) in your empirical analysis. Although there is no fixed rule for good research design, high-quality studies share these common features:

1.   The analytical framework is carefully chosen to answer the research question and to analyze the data.

2.   Alternative model specifications or extensions of the model are explored to extract further information from the data, to address data problems, and to consolidate the main findings.

3.   The results are interpreted and analyzed in detail.

Please avoid these common mistakes among past students:

1.   Trying all the regression models or analytical methods learned in this module. Please bear in mind that your ultimate objective is to answer research questions. The coursework is not supposed to be an exercise on everything you learn. Content that is unrelated to the research question damages the quality of your work.

2.   Presenting the analytical results without much interpretation. It is the interpretation, not the numerical results generated by software, that answers the research question. Without proper interpretation, the results make little sense.

3.   Copying the analytical framework of a past student's work that earned a high mark. Their analysis serves their research question and their data, which are different from yours. Blindly copying other students' analytical framework often results in a poor report.

Guidelines on format

1.   I suggest no more than 2,000 words for the report (excluding front page, appendix, references, tables, and footnotes or endnotes). The mark is not explicitly linked to the word count. However, if your report is very short, it is unlikely to meet the marking criteria (see the marking scheme).

2.   I recommend the following structure for the final report:

a.    Title

b.   Motivation and research question

c.    Description of data sources, variable measurement, and empirical model (why the regressors are important determinants of the dependent variable and what are their expected signs)

d.   Presentation of analytical results, interpretations, and statistical inferences

e.    Discussion of results and conclusion

f.    References (if any)

g.   Appendix (see below)

3.   All Stata code and regression output must be reported in the appendix, placed at the end of the report. You should also include figures and tables in the main text, and tables should be formatted like those in the textbook (for example, Table 8.3, though you can skip the 95% confidence intervals). Please do not use screenshots of any kind in your report.

4.   Please use the accompanying MS Word template to prepare your final report. Please insert your digital signature as a picture on the cover page. Please do not alter the format (font, line spacing, page margin, etc.) of the first two pages of the document. Please submit your final report as an MS Word document. PDF files are not accepted.

Important Dates

Event | Open date | Due date
Report submission | November 1, 2022 (Tuesday) | December 4, 2022 (Sunday) 24:00

Marking components and weight

Component | Weight
Motivation of research question | 10%
Data description | 15%
Empirical model and method | 25%
Results, interpretations, and inferences | 30%
Discussion and conclusion | 10%
Structure, format, and writing | 10%
Total | 100%

Peer assessment and individual marks

Your individual mark will be jointly determined by the base mark (the mark awarded to the report) and peer assessment. In principle, the individual mark can be higher or lower than the base mark, depending on your relative contribution. The peer assessment weight (20%) determines the magnitude of these deviations. The algorithm used by the Learning Mall peer assessment function is described here:

http://webpaproject.lboro.ac.uk/academic-guidance/a-worked-example-of-the-scoring-algorithm/
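For intuition only, the sketch below shows how a WebPA-style calculation might combine a base mark with peer ratings. It assumes that each assessor's ratings are normalised to sum to one, that self-ratings are included, and that every member submits; penalties, caps, and other settings are not modeled. The 20% weight is taken from this brief, while the function names, the ratings, and the base mark of 70 are hypothetical. The linked page and the Learning Mall configuration remain the authoritative description.

```python
def webpa_scores(ratings):
    """ratings[assessor][member] = points that assessor awarded to that member.
    Returns, for each member, the sum of the normalised shares received."""
    scores = {member: 0.0 for member in ratings}
    for awarded in ratings.values():
        total = sum(awarded.values())  # assumed to be positive
        for member, points in awarded.items():
            scores[member] += points / total
    return scores

def individual_marks(base_mark, ratings, weight=0.20):
    """Keep (1 - weight) of the base mark fixed and scale the rest by the score."""
    scores = webpa_scores(ratings)
    fixed_part = (1 - weight) * base_mark
    return {m: fixed_part + weight * base_mark * s for m, s in scores.items()}

# Hypothetical three-person group with a base mark of 70
equal_ratings = {
    "A": {"A": 3, "B": 3, "C": 3},
    "B": {"A": 3, "B": 3, "C": 3},
    "C": {"A": 3, "B": 3, "C": 3},
}
uneven_ratings = {
    "A": {"A": 4, "B": 4, "C": 2},
    "B": {"A": 4, "B": 4, "C": 2},
    "C": {"A": 3, "B": 3, "C": 3},
}

equal_marks = individual_marks(70, equal_ratings)
uneven_marks = individual_marks(70, uneven_ratings)
print(equal_marks)
print(uneven_marks)
```

Under these assumptions, equal ratings leave every member at the base mark, while uneven ratings move marks above or below it with the group average unchanged. With a 20% weight, only one fifth of the base mark is exposed to the peer ratings.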

Please give your evaluations truthfully, with full respect for your teammates' effort. I will not override peer assessment results even if disputes arise after the release of component marks.