ECON 5060

HKUST Department of Economics

2023/24 Fall

Group Project (Instructions and Guidelines)

Choose one of the topics below:

| Topics | Target Variables | File Names |
| --- | --- | --- |
| P2P Lending Default Prediction | Status: Late, Repaid, Current. Define a numeric variable as follows: Late = 1 (default); Repaid = 0 (not default) | P2P Lending Dataset |
| Predicting US Corporate Bankruptcy | status_label: alive or failed. Define a numeric variable as follows: failed = 1 (bankrupt); alive = 0 (not bankrupt) | US Bankruptcy Dataset |
| Predicting Taiwan Corporate Bankruptcy | Bankrupt: 1 = bankrupt; 0 = not bankrupt | Taiwan Bankruptcy Dataset |
| Forecasting US Inflation Rate | CPIAUCSL. Define inflation as the log first difference of CPIAUCSL: Inflation(t) = log(CPIAUCSL(t)) - log(CPIAUCSL(t-1)) | US FRED-MD Macro Dataset; US FRED-MD_Appendix |
| Forecasting US Interest Rate | FEDFUNDS: Federal Funds Rate | US FRED-MD Macro Dataset; US FRED-MD_Appendix |
| Financial Crisis Prediction | crisisJST: 1 = financial crisis; 0 = no crisis | Financial Crisis Dataset; Financial Crisis Chronology |
| Forecasting Direction of Stock Market with Technical Indicators | Define the weekly directional movement of the "Close" price of the S&P 500 as follows: 1 if the weekly return of the index is positive, i.e., Close(t) > Close(t-5); 0 otherwise | S&P 500 Index Dataset |
| Predicting Stock Return with Fundamental Indicators | return_adj_12m: 12-month adjusted return of the share price after the financial reporting period (i.e., the difference between the stock return and the S&P 500 Index return) | US Stock Fundamentals Dataset |
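
The target definitions above can be sketched in Python with pandas. This is only a minimal illustration on made-up numbers: the small data frames are hypothetical stand-ins for the actual dataset files, the new column names (`default`, `inflation`, `direction`) are assumptions, and only `Status`, `CPIAUCSL`, and `Close` come from the topic list. The sketch assumes the S&P 500 data has one row per trading day, so that five rows back corresponds to one week. The instructions do not say how to treat `Current` loans, so the sketch simply leaves them as missing.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the real values come from the dataset files.
loans = pd.DataFrame({"Status": ["Late", "Repaid", "Current", "Repaid"]})
macro = pd.DataFrame({"CPIAUCSL": [100.0, 101.0, 102.01]})
sp500 = pd.DataFrame({"Close": [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 9.0]})

# P2P lending: Late = 1 (default), Repaid = 0 (not default).
# "Current" has no mapping in the instructions, so it becomes NaN here.
loans["default"] = loans["Status"].map({"Late": 1, "Repaid": 0})

# Inflation: log first difference of CPIAUCSL (first row has no lag, so NaN).
macro["inflation"] = np.log(macro["CPIAUCSL"]).diff()

# Weekly direction: 1 if Close(t) > Close(t-5), 0 otherwise.
# Assumes one row per trading day; the first 5 rows have no lag and stay missing.
lagged = sp500["Close"].shift(5)
sp500["direction"] = (sp500["Close"] > lagged).astype(float).where(lagged.notna())
```

Rows with a missing target (the first observation for inflation, the first five for the weekly direction) would normally be dropped before modelling.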

Content Requirements:

· Choose one of the topics listed above

· Formulate the ML procedures or methodology for addressing the topic

· Collect, compile, preprocess, and analyze the data

· Apply at least five different ML methods that you learned in this course to solve your ML task

· Summarize the findings, draw conclusions, and make recommendations

Format Requirements:

· Word or PDF

· A cover page with title and group information (group number, student names and numbers).

· The paper should include an introduction (or executive summary), a main body, a conclusion, and a list of references.

· A maximum of 16 pages including the cover page, tables, charts, and references

· Font size 11 or 12, double spacing

Submission of Paper:

· Please email your term paper together with the code file to me by December 13.

Guidelines on Data Preprocessing

· Data preprocessing, such as removing unnecessary features and transforming features so that they are comparable across observations, is highly recommended.

· You may need to transform some categorical features into numeric ones. Three common ways to do that are ordinal encoding, one-hot encoding, and dummy variable encoding. See https://machinelearningmastery.com/one-hot-encoding-for-categorical-data/ for details.

· If there is severe class imbalance in the data (e.g., fewer than 5% of instances belong to one class and the rest belong to the other), you may undersample the majority class or oversample the minority class.

· If you undersample the majority class, randomly select majority-class instances so that their number matches the number of minority-class instances.

· If you oversample the instances in the minority class, you can apply the Synthetic Minority Oversampling Technique (SMOTE).
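
The encoding and resampling options above can be sketched as follows. This is a hedged illustration on made-up data, not a required implementation: the data frame, the column names (`grade`, `x1`, `x2`, `y`), and the helper `smote_sketch` are all hypothetical. The helper is a simplified version of the SMOTE idea (interpolating between a minority point and one of its nearest minority neighbours), written with NumPy only so that the mechanics are visible.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical toy data: 'grade' is a made-up categorical feature, 'y' the class label.
df = pd.DataFrame({
    "grade": ["A", "B", "C", "A", "B", "C", "A", "B", "C", "A"],
    "x1": [0.1, 0.4, 0.9, 0.2, 0.5, 0.8, 0.3, 0.6, 0.7, 0.0],
    "x2": [1.0, 2.0, 3.0, 1.5, 2.5, 3.5, 1.2, 2.2, 3.2, 0.9],
    "y": [0, 0, 1, 0, 0, 1, 0, 0, 1, 0],
})

# Ordinal encoding: map categories to integers (sensible only if they are ordered).
ordinal = df["grade"].map({"A": 0, "B": 1, "C": 2})

# One-hot encoding: one indicator column per category.
one_hot = pd.get_dummies(df["grade"], prefix="grade")

# Dummy variable encoding: drop one category to avoid perfect collinearity.
dummy = pd.get_dummies(df["grade"], prefix="grade", drop_first=True)

# Random undersampling: keep as many majority rows as there are minority rows.
minority = df[df["y"] == 1]
majority = df[df["y"] == 0].sample(n=len(minority), random_state=0)
undersampled = pd.concat([minority, majority])


def smote_sketch(X_min, n_new, k, rng):
    """Simplified SMOTE: interpolate between a minority point and one of its
    k nearest minority neighbours. X_min is an (n, d) array of minority rows."""
    n = len(X_min)
    # Pairwise Euclidean distances within the minority class.
    dist = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbour
    neighbours = np.argsort(dist, axis=1)[:, :k]
    base = rng.integers(0, n, size=n_new)   # random minority points
    pick = rng.integers(0, k, size=n_new)   # which neighbour to use for each
    nb = neighbours[base, pick]
    gap = rng.random((n_new, 1))            # interpolation weight in [0, 1)
    return X_min[base] + gap * (X_min[nb] - X_min[base])


# Oversampling: create enough synthetic minority rows to balance the classes.
X_min = minority[["x1", "x2"]].to_numpy()
n_new = int((df["y"] == 0).sum()) - len(minority)
synthetic = smote_sketch(X_min, n_new, k=2, rng=rng)
```

In practice, the `SMOTE` class in the imbalanced-learn package offers a full implementation. Resampling is usually applied to the training split only, so that the test data keeps the original class proportions.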