GROUP PROJECT, ESSENTIALS OF ECONOMETRICS

Deadline: Thursday November 9th 2023 at 12:00 Noon

INSTRUCTIONS:

•  The term “regression” was first used by Sir Francis Galton in the late 19th century when he studied the relationship between the heights of adult children and their parents. This project has two parts. In part 1, you are asked to analyse this question using Galton’s original data set. In part 2, you are asked to collect a new data set on heights and re-analyse this question.

•  The project has been divided into 8 short questions, equally weighted. Please note that many questions can be answered with a self-contained table and a short and concise text. We will cover self-contained tables in our lectures and you can also find more information in the STATA guide. Please limit your answers to ONE page per question maximum, including text (font size 12, double spaced), graphs and tables.

•  The data collection has to be conducted under the ethical guidelines of the School of Economics as per https://www.ed.ac.uk/economics/research/ethics. It can only involve individuals 18 years old and over. The information collected cannot be sensitive in any way that can harm the well-being and dignity of the subjects interviewed. To maintain confidentiality, no names or contact details may be collected. All interviewed subjects must be told that this is part of the EofE course project. At the end of the semester, the data need to be erased.

Every project has to include the following declaration.

Declaration: “Group XX, composed of students BXX, BXX, etc., confirms that the data collection has been conducted under the ethical guidelines of the School of Economics. The data collection has only involved individuals 18 years old and over. The information collected is not sensitive in any way that can harm the well-being and dignity of the subjects interviewed. To maintain confidentiality, no names or any contact details have been collected. All interviewed subjects have been told that this is part of the EofE course project. At the end of the semester, the data need to be erased.”

•  You also need to provide an Appendix that contains a STATA log file (resulting from running a do file) that shows how you have solved each question.

•  Please show your calculations in the main text of the project (even if they are performed in the do file).

•  Submission: ONE electronic copy per group should be submitted via LEARN. Please submit ONE single PDF file for the whole project that includes your answers and the Appendix containing the log file.

•  Please ensure that you are aware of the requirements for appropriate citation of references and data sources and have read the guidance on plagiarism in section 4.4.1 of the Economics Honours Handbook and/or the general University guidance at:

http://www.aaps.ed.ac.uk/regulations/Plagiarism/Guidance/StudentGuidance.doc


•  This project relies on team work. All members of a team will be awarded the same common mark for the project. If a team unfortunately suffers from free-riding issues, please follow the guidelines set out in the EofE Handbook.

•  You have already learned all the STATA commands that you need for this project in the labs. These are sufficient to undertake this project; please refer to the STATA guide. If you decide to use other commands in STATA, please make sure you know the assumptions behind them and their different options in order to use them properly.

•  Recall that the Econometrics Team holds helpdesks every day of the week! Please come along with any questions on theoretical and empirical issues. Please visit the teaching assistants during their office hours for any Stata and computing issues related to this project.

•  Please note that the above submission deadline will be strictly applied. Following standard University-wide policy, a penalty of 5 percentage points per working day, or part thereof, will be applied, up to a maximum of 5 days, after which a mark of zero will be awarded. Extensions will only be granted where there are substantial and properly authenticated special circumstances (e.g. serious illness).

Part 1: Analysis with Galton’s original data set

Galton’s work on children’s and parents’ heights was published in: Galton, F. (1886): “Regression towards mediocrity in hereditary stature”, Journal of the Anthropological Institute, 15: 246-63. In this first part of the project you are asked to reconstruct the original data from this article and replicate his analysis.

•  Question 1. Find Galton’s original article (on jstor.org or LEARN). Table I of his article summarises the data used. You need to create a STATA data set that contains the 928 observations that Galton collected. It is recommended that you first type the data in an Excel file and then have STATA read that file. Some versions of the Galton data set are available online. You are advised NOT to use them. It is part of this project that you show that you understand how to make a data set from such a table. There are important conceptual issues that you will miss if you borrow the data from somewhere else.

For those observations reported in Table I of Galton’s article as below or above the minimum and maximum height values, you need to assume some particular values. Please state these explicitly in a table and provide a justification with one sentence. Define “tall parents” and “short parents” according to your data. Then divide your sample into these two groups and report relevant statistics for the adult children and for parents in each group. Report this information in a table and comment on it.
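As an illustration only, the step from a table of counts to one row per observation can be sketched as follows. The sketch is in Python rather than STATA, all counts and the value assumed for the open-ended category are made up (they are NOT taken from Galton’s Table I), and the names `toy_counts`, `assumed_values` and `expand` are hypothetical.

```python
# Toy cross-tabulation: keys are (parent_height, child_height) in inches,
# values are the number of adult children counted in that cell.
# These numbers are invented for illustration, not Galton's.
toy_counts = {
    (66.5, 65.2): 2,
    (66.5, 66.2): 1,
    (68.5, 67.2): 3,
    ("above", 67.2): 1,  # open-ended top category for parents
}

# Assumed numeric value for the open-ended category. This is the analyst's
# choice and must be stated and justified in the write-up.
assumed_values = {"above": 73.0}


def expand(counts, assumed):
    """Return one (parent, child) tuple per counted individual."""
    rows = []
    for (parent, child), n in counts.items():
        # Replace open-ended labels by their assumed numeric values.
        parent = assumed.get(parent, parent)
        child = assumed.get(child, child)
        rows.extend([(parent, child)] * n)
    return rows


observations = expand(toy_counts, assumed_values)
print(len(observations), "observations")
```

The number of rows produced should equal the sum of the cell counts, which is a useful check after typing in the full table. In STATA, the `expand` command replicates rows in a similar way.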

•  Question 2. Galton was the first to describe and explain the phenomenon of “regression towards the mean”. Being concerned about the height of the English aristocracy, he interpreted his results as “regression to mediocrity” (hence the name “regression”).

Regress the height of adult children against the height of parents.  Report your results in a table and interpret the estimated coefficients.  What can you say about the relationship between the height of parents and their children? Are children of tall (short) parents as tall (short) as their parents?

•  Question 3. Taking your regression results from question 2, and using your definition of “tall parents” and “short parents” from question 1:

Calculate the predicted height of adult children whose parents are “tall” after 1, 2, 3, ..., Z generations. Do the same for adult children of “short” parents. Report your results in a table. Is there convergence in heights? If so, how many generations does it take? Is Galton’s prediction correct?
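One possible way to organise this calculation is sketched below in Python. It assumes that the predicted height of one generation is fed back in as the parental height of the next, which is a modelling assumption. The intercept, slope and starting heights are hypothetical placeholders, not estimates from Galton’s data.

```python
def predict_generations(start_height, intercept, slope, generations):
    """Iterate height = intercept + slope * parent_height over generations."""
    heights = []
    h = start_height
    for _ in range(generations):
        h = intercept + slope * h  # prediction becomes next parent's height
        heights.append(h)
    return heights


# Hypothetical coefficients and group heights (inches), for illustration only.
intercept, slope = 24.0, 0.65
tall = predict_generations(72.0, intercept, slope, 10)
short = predict_generations(64.0, intercept, slope, 10)

# With 0 < slope < 1 the iteration approaches this fixed point.
fixed_point = intercept / (1.0 - slope)

for g, (t, s) in enumerate(zip(tall, short), start=1):
    print(f"generation {g}: tall line {t:.2f}, short line {s:.2f}")
print(f"limit implied by the coefficients: {fixed_point:.2f}")
```

Under these assumptions the distance to the fixed point shrinks by the factor `slope` each generation, so the speed of convergence depends directly on the estimated slope.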

•  Question 4. Using the same data set:

Regress the height of parents against the height of adult children. Report your results in a table. Is this regression equivalent to that in question 2? Are the estimated parameters the same? Why or why not?
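A small numerical sketch in Python may help when thinking about this comparison. The data are invented and the helper names `ols_slope` and `correlation` are hypothetical. It uses the standard results that the OLS slope of y on x is cov(x, y)/var(x) and that the product of the two slopes equals the squared correlation.

```python
import math


def ols_slope(x, y):
    """OLS slope from regressing y on x (with an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def correlation(x, y):
    """Sample correlation coefficient between x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Invented heights in inches, for illustration only.
parent = [64, 66, 67, 68, 70, 72]
child = [66, 66, 68, 67, 69, 70]

b_child_on_parent = ols_slope(parent, child)
b_parent_on_child = ols_slope(child, parent)
r = correlation(parent, child)

print("child on parent slope:", b_child_on_parent)
print("parent on child slope:", b_parent_on_child)
print("product of slopes:", b_child_on_parent * b_parent_on_child)
print("squared correlation:", r ** 2)
```

In this toy sample the second slope is not the reciprocal of the first; the two would coincide only if the points lay exactly on a line.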

Part 2: Your own sample collection

In this part of the project you are being asked to collect your own sample of heights for adult children and their parents (father and mother). Your aim is to define a population of interest and collect data for a random sample of 100 individuals. In order to improve some of Galton’s analysis you should also collect information on the gender of the adult children as well as any other variable you think is crucial given the population of interest you have chosen.  Please ensure that your data collection process is in accordance with the ethical approval from the School of Economics.

•  Question 5. Using your collected data:

Describe the chosen population of interest and your data collection process as well as your strategy for trying to achieve a random sample. Define “tall parents” and “short parents” according to your data, report summary statistics in a table as in question 1, and comment on it. Also comment on any relevant comparisons to Galton’s data.

•  Question 6. Using your collected data:

Regress the height of adult children against the height of parents using the same specification that Galton used. Report your results in a table. Propose a test to check if “regression towards the mean” is present in your data. Provide some intuition for the proposed test. Then, undertake the proposed test, report your results in a table and interpret your results.
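One candidate test, shown here only as a sketch, is a one-sided t-test of the null hypothesis that the slope equals 1 against the alternative that it is below 1. The Python code below uses invented data, assumes homoskedastic errors for the standard error, and the function name `slope_t_stat` is hypothetical. Other tests may be equally defensible.

```python
import math


def slope_t_stat(x, y, null_slope=1.0):
    """Return (slope, standard error, t statistic, degrees of freedom)
    for H0: slope == null_slope in a simple OLS regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual sum of squares and the usual homoskedastic standard error.
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)
    return slope, se, (slope - null_slope) / se, n - 2


# Invented heights in inches, for illustration only.
parent = [64, 66, 67, 68, 70, 72]
child = [66, 66, 68, 67, 69, 70]

slope, se, t_stat, df = slope_t_stat(parent, child)
print(f"slope = {slope:.3f}, se = {se:.3f}, t = {t_stat:.3f}, df = {df}")
```

The t statistic would then be compared with the one-sided critical value of the t distribution with `df` degrees of freedom. A sufficiently negative value is evidence that the slope is below 1.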

•  Question 7. Your population of interest is likely to differ from that of Galton. How would you improve Galton’s specification with your data set?

Propose and implement improved specification(s) using your data and report your results in a table. Compare the results of your proposed specification(s) to those with Galton’s original specification used in question 6.

•  Question 8. Since Galton, regression towards the mean has been analysed in other areas beyond the ‘height’ of parents and children.

Identify another topic in Economics in which regression towards the mean has also been analysed. Provide a short summary of this literature, including the regression run, the key findings and the reasons why this phenomenon could be present.