BUSANA 7001 - Predictive and Visual Analytics for Business

2022 S2, Individual Assignment

Instructions

1. This is an individual assignment.

2. The maximum score is 25 points.

3. The presentation of your write-up is important.   Poorly formatted reports might lose up to 5 points.

4. All numerical analysis, all tables and figures need to be done using SAS or SAS Visual Analytics (however, you may use Excel or Word etc. to make tables for regressions as the standard SAS output for regressions is not very nice).

5. Please retain your SAS code and make sure that it is user-friendly (use comments where necessary). Using your submitted code, one should be able to reproduce all your results, tables, and figures.

6. Please retain a copy of the problem set that is submitted.

7. You should submit 3 files (feel free to combine them into a single file):

● ‘Assignment Cover Sheet’, which must be signed (electronic signature is okay) and dated

● the report (in pdf format) for Task 1

● the report (in doc, docx, or pdf format) for Tasks 2 and 3; the report should be properly formatted and be similar to a business report; font: 12 pt Times New Roman; maximum number of pages: 10 (no penalty for exceeding this limit); at the end of the report (in the appendix) include your SAS code.

8. The lecturer can refuse to accept assignments that do not have a signed acknowledgment of the University’s policy on plagiarism.

9. Any suspected plagiarism will be severely punished. This includes any student that submits copied work or any student that allows their work to be copied.

10. You must acknowledge any external material you use in your answers, e.g., material from websites, textbooks, academic journals and newspaper articles.

11. All queries (including deadline extensions) for this project should be directed to the lecturer.

12. The submission deadline for the problem set is 6pm, Friday the 26th of September, 2022.

13. The submission must be done through MyUni.

14. Late submission will be penalized 2.5 points per day.

1    Visual Analytics (8 points)

Assume that you are a real estate analyst. Your goal is to analyze the recent sales of residential property in Melbourne. Using the dataset ‘Housing.sas7bdat’, create a report with various figures and tables (around 6 objects) that summarize the sample (i.e., perform descriptive statistics). Briefly describe your results (using the ‘Text’ object, which is available under the ‘Content’ group).

Estimate an OLS regression model where the dependent variable is the natural logarithm of sale price (you will need to generate this variable). Motivate your choice of the independent variables and discuss the results (using the ‘Text’ object, which is available under the ‘Content’ group).
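If you generate the log variable in a SAS data step rather than as a calculated item in SAS Visual Analytics, a minimal sketch might look as follows. The library path, the output dataset name `housing_ln`, and the column name `price` are assumptions for illustration; check the actual variable names in the dataset.

```sas
/* Sketch only: the path and the column name "price" are hypothetical. */
libname mydata "/path/to/folder";   /* folder that holds Housing.sas7bdat */

data housing_ln;
  set mydata.housing;
  /* log() is the natural logarithm; it is undefined for non-positive values */
  if price > 0 then ln_price = log(price);
run;
```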

2    Sample and descriptive statistics (8 points)

Assume that you are a bond analyst and you have been asked to focus on the U.S. corporate bond market. You have been provided 2 files with the bond data (‘Sample_a.csv’ and ‘Sample_b.csv’). First, you should prepare your data for the analysis:

● remove duplicates

● merge the files using the bond_id variable

● remove observations with missing values of any variable

● check for outliers and take necessary actions to deal with them

● remove bonds not denominated in US dollars (i.e., your sample should include only bonds whose currency is the US dollar)

● remove putable bonds from the sample

● remove convertible bonds from the sample.
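The preparation steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the file paths, the dataset names, the columns `currency`, `putable`, and `convertible`, and their codings ("USD", "Y") are hypothetical, and "duplicates" is taken to mean repeated values of bond_id. Adjust all of these to the actual data.

```sas
/* Sketch only: paths, dataset names, and the columns currency,
   putable, and convertible (and their codings) are assumptions. */
proc import datafile="/path/to/Sample_a.csv" out=sample_a dbms=csv replace;
  guessingrows=max;
run;

proc import datafile="/path/to/Sample_b.csv" out=sample_b dbms=csv replace;
  guessingrows=max;
run;

/* Keep one row per bond_id; sorting also prepares both files for the merge. */
proc sort data=sample_a nodupkey; by bond_id; run;
proc sort data=sample_b nodupkey; by bond_id; run;

data bonds;
  merge sample_a(in=in_a) sample_b(in=in_b);
  by bond_id;
  if in_a and in_b;                     /* keep bonds present in both files  */
  if cmiss(of _all_) > 0 then delete;   /* drop rows with any missing value  */
  if currency ne "USD" then delete;     /* keep US dollar bonds only         */
  if putable = "Y" then delete;         /* drop putable bonds                */
  if convertible = "Y" then delete;     /* drop convertible bonds            */
run;

/* Extreme percentiles give a first look at potential outliers. */
proc means data=bonds n min p1 median p99 max;
run;
```

How to treat any outliers found (for example, deleting or winsorizing them) is left to your judgment and should be explained in the report.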

Then create the following variables:

1. years to maturity:

● SAS code: maturity2=(maturity-today())/365;

2. amount outstanding in billions of USD (amount2)

3. a natural logarithm of amount outstanding (ln_amount)

4. a dummy if a bond is callable

5. a dummy if seniority is Senior Unsecured

6. the following dummy variables:

● aaa_d=1 if credit rating is Aaa; SAS code:

aaa_d=0;

if Moodys_cred_rat="Aaa" then aaa_d=1;

● aa_d=1 if credit rating is Aa1, Aa2, Aa3; SAS code:

aa_d=0;

if Moodys_cred_rat in ("Aa1" "Aa2" "Aa3") then aa_d=1;

● a_d=1 if credit rating is A1, A2, A3

● baa_d=1 if credit rating is Baa1, Baa2, Baa3

● ba_d=1 if credit rating is Ba1, Ba2, Ba3

● b_d=1 if credit rating is B1, B2, B3

● c_d=1 for all other values of ‘Moodys_cred_rat’.
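Following the pattern given for aaa_d and aa_d, the remaining rating dummies could be built as in the sketch below. The dataset names `bonds` and `bonds2` and the raw column name `amount` (assumed to be in US dollars) are hypothetical; Moodys_cred_rat, amount2, ln_amount, and the dummy names come from the list above.

```sas
/* Sketch only: dataset names and the raw column "amount" are assumptions. */
data bonds2;
  set bonds;

  amount2   = amount / 1e9;   /* amount outstanding in billions of USD */
  ln_amount = log(amount);    /* natural logarithm of amount           */

  aaa_d=0; if Moodys_cred_rat="Aaa" then aaa_d=1;
  aa_d=0;  if Moodys_cred_rat in ("Aa1" "Aa2" "Aa3") then aa_d=1;
  a_d=0;   if Moodys_cred_rat in ("A1" "A2" "A3") then a_d=1;
  baa_d=0; if Moodys_cred_rat in ("Baa1" "Baa2" "Baa3") then baa_d=1;
  ba_d=0;  if Moodys_cred_rat in ("Ba1" "Ba2" "Ba3") then ba_d=1;
  b_d=0;   if Moodys_cred_rat in ("B1" "B2" "B3") then b_d=1;

  /* c_d captures every rating not matched by the dummies above */
  c_d=0;
  if aaa_d=0 and aa_d=0 and a_d=0 and baa_d=0 and ba_d=0 and b_d=0 then c_d=1;
run;
```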

Briefly discuss your sample, including the number of observations and any outliers. Provide the descriptive statistics of the sample. How you choose to do this is entirely at your discretion. However, it is recommended that you consider using both summary statistics and graphical methods (this task should include at least one properly formatted table, one pie chart, one histogram, and one scatter plot) while also noting any peculiarities within the data set. You should put more emphasis on variables that are the dependent variables in the regressions estimated in the next task.

3    Estimating yield for a hypothetical bond (9 points)

Lastly, you need to estimate the yield for a bond with the following characteristics:

● maturity: 10 years

● coupon: 2.5%

● amount outstanding: $750,000,000

● seniority: senior unsecured

● Moody’s (Issue) credit rating (‘Moodys_cred_rat’): Aa2

● sector: Electronics

● callable: yes

● market of issue: global.

Use the sample from the previous task. To ensure that the results are robust, estimate at least 3 regression models (e.g., the first regression model includes the amount in $, the second model uses the natural logarithm of the amount in $, and the third model features something else). To ensure that regression residuals “behave well,” you may need to scale or transform one or more variables, for example by using the natural logarithm of a variable instead of its raw value. Do not forget to include credit rating dummies in the regression models as independent variables (i.e., credit rating fixed effects).

Briefly discuss the determinants of yield.

Using one of the regression models, compute two additional yields:

1. the amount is $1,000,000,000, other bond characteristics the same as above

2. Moody’s (Issue) credit rating is A2, other bond characteristics the same as above (i.e., amount: $750,000,000 etc.).

Are the results the same as the main estimate? Why?
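One possible way to obtain the fitted yields is to append the hypothetical bonds to the estimation sample with the dependent variable left missing: PROC REG excludes such rows from estimation but still computes predicted values for them. The sketch below shows only the log-amount specification and is not a complete answer. The names `bonds2`, `yield`, `coupon`, `callable_d`, `senior_d`, and `scenario` are assumptions, the coupon is assumed to be stored in percent, c_d is used as the omitted rating category, and sector and market-of-issue controls are left out for brevity.

```sas
/* Sketch only: dataset and variable names not given in the brief are
   hypothetical. yield is left missing for the bonds to be scored. */
data to_score;
  length scenario $20;
  maturity2=10; coupon=2.5; callable_d=1; senior_d=1;
  aaa_d=0; aa_d=1; a_d=0; baa_d=0; ba_d=0; b_d=0;

  scenario="base";       ln_amount=log(750000000);  output;
  scenario="amount_1bn"; ln_amount=log(1000000000); output;
  scenario="rating_A2";  ln_amount=log(750000000);
                         aa_d=0; a_d=1;             output;
run;

/* Stack the estimation sample and the bonds to be scored. */
data reg_input;
  set bonds2 to_score;
run;

proc reg data=reg_input;
  /* c_d is omitted, so it is the baseline rating category */
  model yield = maturity2 coupon ln_amount callable_d senior_d
                aaa_d aa_d a_d baa_d ba_d b_d;
  output out=scored p=yield_hat;
run;
quit;

/* Show the predicted yield for the three hypothetical bonds only. */
proc print data=scored;
  where scenario ne "";
  var scenario yield_hat;
run;
```

Comparing yield_hat across the three scenarios shows how the estimate responds to the amount outstanding and to the rating dummy under this particular specification.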