DTS001TC Data Analytic for Entrepreneurship

Final Coursework

Submission deadline: 4th Nov.

Percentage in final mark: 100%

Learning outcomes assessed:

A: Preprocess, analyse and interpret data using a modern computer package

B: Summarize and visualize data using a modern computer package

C: Present findings to a business audience in a suitable format

Late policy: 5% of the total marks available for the assessment shall be deducted from the assessment mark for each working day after the submission date, up to a maximum of five working days.
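The arithmetic of the late policy can be illustrated with a short Python sketch. The function names and the default of 100 total marks are assumptions made for illustration only. The policy as stated here does not say what happens after five working days, so the sketch declines to compute that case rather than guess.

```python
def late_penalty(days_late: int, total_marks: float = 100.0) -> float:
    """Marks deducted for a submission `days_late` working days late.

    Assumes 5% of the total available marks per working day, as stated
    in the late policy. `total_marks` defaults to 100 (an assumption).
    """
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    if days_late > 5:
        # The policy text only covers up to five working days; the outcome
        # beyond that is not stated, so this sketch does not guess it.
        raise ValueError("more than five working days late is not covered")
    return total_marks * 5 * days_late / 100


def mark_after_penalty(raw_mark: float, days_late: int,
                       total_marks: float = 100.0) -> float:
    """Apply the late deduction to a raw mark, never going below zero."""
    return max(0.0, raw_mark - late_penalty(days_late, total_marks))


if __name__ == "__main__":
    # Hypothetical example: a raw mark of 68 submitted 3 working days late.
    print(mark_after_penalty(68, 3))
```

Note that the deduction is a percentage of the total marks available, not of the mark achieved, so each late working day costs the same fixed number of marks.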

Risks:

Please read the coursework instructions and requirements carefully. Not following these instructions and requirements may result in loss of marks.

Plagiarism results in the award of a ZERO mark.

The formal procedure for submitting coursework at XJTLU is strictly followed. A submission link on Learning Mall will be provided in due course. The submission timestamp on Learning Mall will be used to check for late submission.

All students must download their file and check that it is viewable after submission. Documents may become corrupted during the uploading process (e.g. due to slow internet connections). However, students themselves are responsible for submitting a functional and correct file for assessments.

Overview

In this coursework, you are required to complete two tasks based on the given dataset and submit a compressed document that includes two files:

1. Task 1: An Excel file (in xlsx format) containing your visualization and modeling process and results for the given dataset.

2. Task 2: A report (in PDF format) analyzing the visualization and modeling results.

The assignment must be submitted via Learning Mall Online to the correct drop box. Only electronic submissions are accepted; hard copy submissions are not.

Task 1 (50 marks)

You are given a dataset of weather information for various regions in China over a number of years. You need to design and create a visualization and a model based on the dataset. The visualization should show the impact of different factors on the next day's temperature, while the model needs to consider multiple factors to predict whether there will be a significant change in temperature the next day. The task specifications are as follows:

Target for visualization: You are asked to use Excel to create a visualization that completes the following tasks:

o Clean and preprocess the original dataset (9 marks)

o Select appropriate charts and data formats for visualizing the data (8 marks)

o Show the impact of the current date on the temperature change the next day (3 marks)

o Show the impact of the current wind direction on the temperature change the next day (3 marks)

o Show the impact of the current wind speed on the temperature change the next day (3 marks)

o Show the impact of the current temperature on the temperature the next day (3 marks)

Target for model: You are asked to use Excel to construct a model that can predict whether there will be a huge temperature change the next day, based on the weather factors of the current day. Your model needs to complete the following tasks:

o Choose the appropriate dependent variable for the appropriate model (13 marks)

o Achieve the highest prediction accuracy possible (8 marks)
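Although the model itself must be built in Excel, the logic behind a binary dependent variable for "huge temperature change the next day" can be sketched in Python. This is only an illustration under stated assumptions: the brief does not define what counts as a huge change, so the 5-degree threshold, the function name and the sample temperatures below are all hypothetical. Because the dataset covers several regions, the same step would presumably be applied within each region separately so that the "next day" always belongs to the same location.

```python
def label_big_change(temps, threshold=5.0):
    """Return one 0/1 label per day, except the last day.

    A label of 1 means the absolute temperature change from that day to
    the following day is at least `threshold` (a hypothetical cut-off,
    not one given in the brief). The last day has no next day, so it
    receives no label.
    """
    labels = []
    for today, tomorrow in zip(temps, temps[1:]):
        labels.append(1 if abs(tomorrow - today) >= threshold else 0)
    return labels


if __name__ == "__main__":
    # Hypothetical daily temperatures for a single region, in date order.
    daily_temps = [12.0, 13.5, 20.0, 19.0, 11.0]
    print(label_big_change(daily_temps))  # [0, 1, 0, 1]
```

A 0/1 target of this kind suits a classification-style model, whereas using the next day's temperature directly would suit a regression-style model. Which one is "appropriate" is the choice the task asks you to justify.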

The submitted Excel file should include:

o The original dataset

o The dataset after data preprocessing

o All visualized tables and charts

o Summary output of the constructed model

Detailed Requirements:

o The formulas and functions used in data preprocessing need to be retained in your xlsx file. You need to demonstrate through formulas how the processed data was transformed step by step.

o Visual charts and tables need to be generated by Excel and remain in an editable state in your xlsx file. Screenshots will not be accepted.

Additional notes:

o The use of add-ins that have not been mentioned in lectures is allowed, but you must cite the source and ensure that the add-in is publicly available.

o Newly constructed features may be used during model construction, but these features must be based on the original dataset, and the process of constructing the new features needs to be retained.

Task 2 (50 marks)

In this task, you need to write a report based on your visualization and modeling results.

Target for report: You are asked to write a report (in PDF) to analyze your visualization results and evaluate your model's prediction results. The report should consist of the following contents:

o Analysis of each visualization table and chart (16 marks)

o Conclude which factor/feature has the greatest impact on the temperature change the next day and provide corresponding evidence (6 marks)

o Evaluation of the predictive results of the model (5 marks)

o Evaluation of the fitness of the model (5 marks)

o Elaborate on the potential of your predictions in commercial applications (10 marks)

o Discuss the limitations of the model and potential directions for improvement (8 marks)
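As one possible way to approach the evaluation of predictive results listed above, the sketch below computes a confusion matrix and accuracy for 0/1 predictions, and compares the accuracy with a simple majority-class baseline. It is an illustration only: the function names and the sample labels are hypothetical, and the brief does not prescribe any particular metric.

```python
def confusion_counts(actual, predicted):
    """Return (tp, fp, fn, tn) for 0/1 labels, where 1 means 'big change'."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            tp += 1
        elif a == 0 and p == 1:
            fp += 1
        elif a == 1 and p == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(actual, predicted):
    """Share of days on which the prediction matches the actual label."""
    if not actual:
        raise ValueError("no observations to evaluate")
    tp, fp, fn, tn = confusion_counts(actual, predicted)
    return (tp + tn) / len(actual)


def majority_baseline(actual):
    """Accuracy of always predicting the more common class."""
    if not actual:
        raise ValueError("no observations to evaluate")
    ones = sum(actual)
    return max(ones, len(actual) - ones) / len(actual)


if __name__ == "__main__":
    # Hypothetical labels: 1 = big temperature change the next day.
    actual = [1, 0, 1, 1, 0, 0]
    predicted = [1, 0, 0, 1, 1, 0]
    print(confusion_counts(actual, predicted))
    print(accuracy(actual, predicted), majority_baseline(actual))
```

Comparing against the baseline matters when big changes are rare: a model that always predicts "no change" can score a high accuracy without being useful.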

The formatting requirements in the report:

o Font: Times New Roman

o Page limit: 1 page

o Line Spacing: single space

o Spacing Before: 0pt

o Spacing After: 12pt

Notes:

o Newly created features can be included in the discussion

o You can evaluate your model by comparing different models

o Discussions on directions for improvement can include discussions on improving the dataset

o Marks may be deducted if your report is longer than 1 page

Marking Criteria

Tasks    100   Components                          Description                                                 Maximum Credit  Mark
Task 1   50    Data Preprocessing [9 marks]        Missing value handling                                      3
               Outlier handling                                                                                3
               Text Data handling                                                                              3
               Data Visualization [20 marks]       Data Choice                                                 4
                                                   Chart Choice                                                4
                                                   Pivot Table                                                 4
                                                   Pivot Chart                                                 4
                                                   Data representation format Choice                           4
               Model Construction [21 marks]       Model Choice                                                8
                                                   Feature engineering                                         5
                                                   Prediction Accuracy                                         8
Task 2   50    Visualization Analysis [22 marks]   Analysis of pivot table                                     8
                                                   Analysis of pivot chart                                     8
                                                   Analysis of the impact level on various factors/features    6
               Model Evaluation [20 marks]         Evaluation of the predictive result                         5
                                                   Evaluation of the fitness of the model                      5
                                                   Potential of commercial applications                        10
               Discussion [8 marks]                Limitations                                                 4
                                                   Future improvement directions                               4

Late Submission?  ☐ Yes  ☐ No

Days late:

Final Marks: