MAS202 Intermediate Business Statistics Project Description

Project Due 12/4

Objectives

You will work in a group of four students to complete the project. The goal of your project is to analyze an interesting dataset by applying regression analysis using Excel and report your findings.

Structure of the Project Report

The final deliverable of the project is the project report. Your final report should include the following components:

1. Introduction (15%)

In this part, you should give the background and context of the research problem and clearly state the research question or objectives of the project. You may briefly discuss what makes the dataset interesting and what motivates or justifies your analysis.

2. Dataset Selection and Exploration (15%)

In this part, the report should describe the source of the data and the dataset itself. A statistical summary (e.g., a descriptive statistics table) and a brief discussion of the variables should be included.
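The project itself is to be done in Excel, so the following is only a minimal Python sketch of what a descriptive statistics table contains. The column name `wages` and its values are made up for illustration and do not come from any of the suggested datasets.

```python
import statistics

def describe(values):
    """Return a small descriptive-statistics summary for one numeric column."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation (n - 1)
        "min": min(values),
        "max": max(values),
    }

# Hypothetical column, invented for illustration only.
wages = [2, 4, 4, 4, 5, 5, 7, 9]
summary = describe(wages)
for name, value in summary.items():
    print(f"{name:>6}: {value}")
```

A table like this, with one row or column per variable, gives the reader the scale and spread of each variable before any regression is run.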

3. Data Preparation and Method (15%)

In this part, the report should state how the data were prepared for further analysis (e.g., deleting outliers, filling in missing values, or transforming the data). It should also describe the methods you used (e.g., variable selection, regression performance measures) and discuss why they are suitable for the goal of the analysis.
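The three preparation steps named above can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not a required procedure: median imputation, the 1.5 × IQR outlier rule, and a natural-log transform are just one common choice for each step, and the data values are invented. Quartile conventions differ between tools, so the cut-offs may not match Excel's exactly.

```python
import math
import statistics

def fill_missing_with_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    med = statistics.median(observed)
    return [med if v is None else v for v in values]

def drop_iqr_outliers(values, k=1.5):
    """Keep only values within [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]

def log_transform(values):
    """Natural-log transform; only valid for strictly positive values."""
    return [math.log(v) for v in values]

# Hypothetical raw column with one missing value and one extreme value.
raw = [10, 12, None, 11, 13, 12, 200, 11]
filled = fill_missing_with_median(raw)
cleaned = drop_iqr_outliers(filled)
logged = log_transform(cleaned)
print(len(raw), len(cleaned))
```

Whatever rules you choose, the report should state them explicitly and say how many observations each step changed or removed.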

4. Data Analysis Results (40%)

In this part, the report should present the results of the data analysis. The group is encouraged to use appropriate visualizations (tables, charts, graphs) to enhance understanding and check assumptions for the analysis method. The results should also be interpreted in the context of the research question.
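As a rough illustration of the quantities a regression output reports, the sketch below fits a one-predictor least-squares line and computes R-squared and the residuals. It is a simplified Python sketch with invented data; the project is expected to use Excel and will usually involve several predictors.

```python
def fit_simple_regression(x, y):
    """Ordinary least squares for one predictor: y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    fitted = [intercept + slope * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    ss_res = sum(r ** 2 for r in residuals)        # unexplained variation
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)   # total variation
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared, residuals

# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept, r_squared, residuals = fit_simple_regression(x, y)
print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={r_squared:.4f}")
```

Plotting the residuals against the fitted values or against each predictor is one common way to check assumptions such as linearity and constant variance.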

5. Discussion (15%)

In this part, the report should discuss the broader implications of the results and draw conclusions about your research questions or objectives. The report may also compare the findings with existing literature, address any unexpected or contradictory results, and discuss limitations and possible directions for future analysis.

6. References

List references, if any.

Datasets

The dataset should be open or published and well documented, should support meaningful research questions, and should have either a large number of predictors/features or a large number of observations. One source of such data is https://www.kaggle.com/datasets. Another is the UCI Machine Learning Repository at http://archive.ics.uci.edu/ml/. Read the citation policy if you use this site.

Some example datasets include:

· Online News Popularity http://archive.ics.uci.edu/dataset/332/online+news+popularity

· Border Wait Times, provided by U.S. Customs and Border Protection (CBP) https://awt.cbp.gov/

· World Happiness Report https://www.kaggle.com/datasets/unsdsn/world-happiness

· Football Wages Dataset https://www.kaggle.com/datasets/ultimus/football-wages-prediction

· IMDB Top 250 Movies https://www.kaggle.com/datasets/rajugc/imdb-top-250-movies-dataset