HW1: Empirical study: does R&D investment improve corporate performance?

Hints: Provide your answers to the following questions in a Word file, and submit your do-file, raw data file (the final merged data file in .dta format) and log file.

1. Download raw data from the CSMAR database, specifically data on the balance sheet, income statement, valuation, innovation and industry information. (Please drop observations of financial companies and ST/SST companies, and restrict the sample period to 2010 through 2019);

2. Import the data into Stata, generate numeric firm-identifier and year variables for each dataset, and save each dataset as a .dta file;

3. Reduce each dataset (.dta file) to unique firm-year observations on an annual basis, keeping only report type A (A = Consolidated Statement);

4. Merge these datasets with the "merge" command and make sure the merging process goes flawlessly;

5. Generate the following variables: R&D intensity (R&D expenditures over sales), log-transformed total assets as firm size, capital structure, asset tangibility, current ratio, growth of operating income and Tobin's Q (using tobinq_a); winsorize all continuous variables at the 1% and 99% levels;

6. Summarize these variables, generate descriptive statistics (including number of observations, mean, standard deviation, min, max) and provide a brief description of them;

7. Generate a Pearson correlation matrix (when a pair of variables is significantly correlated at the 1% level, put a star after the correlation coefficient) and analyze two things: (i) is the relationship between y and our test variable significant? (ii) is there a serious multicollinearity problem?

8. Draw a scatter plot of Tobin's Q against R&D intensity, together with the linear and quadratic fitted lines;

9. Generate year and industry dummy variables;

10. Design your empirical model and run OLS (do not forget to control for year and industry fixed effects);

11. Analyze the main specification, discussing the economic and statistical significance of our variable of interest, R&D intensity, and give your judgment on the contribution of R&D investment;

12. Run post-estimation tests to check whether the full rank condition and the homoskedasticity condition hold. If not, recommend how to improve the empirical specification;

13. Design your own moderator and justify the reasoning behind it, run the moderating effects test, report your results and interpret them.
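
As a rough illustration of how steps 5 through 8, 10, 12 and 13 might look in a do-file, here is a minimal sketch. It does not use CSMAR data: it simulates a toy firm-year panel so that it runs on its own, and every variable name except tobinq_a (firm, year, ind, rd_exp, sales, total_assets, rd_intensity, size) is a made-up placeholder, not a CSMAR field name. Using size as the moderator is only a stand-in for whatever moderator you choose and justify. Winsorizing is done here with built-in commands; a user-written command would work as well.

```stata
* Toy panel standing in for the merged CSMAR file.
* ASSUMPTION: all variable names and values below are invented for illustration.
clear
set seed 12345
set obs 1000
* 100 hypothetical firms with 10 yearly rows each (2010-2019)
gen firm = ceil(_n / 10)
bysort firm: gen year = 2009 + _n
* 5 hypothetical industries
gen ind = mod(firm, 5) + 1
gen rd_exp = 50 * runiform()
gen sales = 500 + 1000 * runiform()
gen total_assets = exp(rnormal(21, 1))
gen tobinq_a = 1.5 + 5 * rd_exp / sales + rnormal(0, 0.5)

* Step 5: construct variables, then winsorize at the 1st and 99th percentiles
gen rd_intensity = rd_exp / sales
gen size = ln(total_assets)
foreach v of varlist tobinq_a rd_intensity size {
    quietly summarize `v', detail
    local lo = r(p1)
    local hi = r(p99)
    replace `v' = `lo' if `v' < `lo'
    replace `v' = `hi' if `v' > `hi' & !missing(`v')
}

* Step 6: descriptive statistics
tabstat tobinq_a rd_intensity size, statistics(n mean sd min max) columns(statistics)

* Step 7: Pearson correlations, starred when significant at the 1% level
pwcorr tobinq_a rd_intensity size, star(0.01)

* Step 8: scatter plot with linear and quadratic fitted lines
twoway (scatter tobinq_a rd_intensity) ///
       (lfit tobinq_a rd_intensity) ///
       (qfit tobinq_a rd_intensity)

* Step 10: OLS with year and industry dummies via factor-variable notation
regress tobinq_a rd_intensity size i.year i.ind

* Step 12: variance inflation factors (multicollinearity) and
* the Breusch-Pagan test (heteroskedasticity), run after plain OLS
estat vif
estat hettest

* One possible remedy if homoskedasticity is rejected: robust standard errors
regress tobinq_a rd_intensity size i.year i.ind, vce(robust)

* Step 13: moderating effect through an interaction term
* (size is only a placeholder moderator here)
regress tobinq_a c.rd_intensity##c.size i.year i.ind, vce(robust)
```

With real data, the simulation block at the top would be replaced by loading your merged .dta file, and the variable list would be extended to the remaining controls (capital structure, asset tangibility, current ratio, growth of operating income).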