FIT5147 Project Proposal and Data Exploration Project


In this assignment, you are asked to analyse and explore data about a topic of your choice.

Please note that your topic is subject to your tutor’s approval. Do not seek approval from the lecturer.

It is an individual assignment and worth 35% of your total mark for FIT5147.


Relevant learning outcome

● Perform exploratory data analysis using a range of visualisation tools.


Overview of the tasks

1. Identify the project topic, questions that you want to address, and data source(s) that you will be using to answer those questions.

2. Submit your Project Proposal in the Assessment block of Moodle in Week 3.

3. Wait for approval before proceeding further. You will receive feedback in Week 4.

4. Collect data and wrangle it into a suitable form for analysis using whatever tools you like (e.g., Excel, R, Python).

5. Explore the data to answer your original question and/or to find something interesting using Tableau or R. The exploration should use appropriate visualisations and statistical analyses.

6. Submit a report detailing your findings and the method(s) that you used.

7. The Data Exploration is due in Week 7.


Project Proposal (2%)

Write a document consisting of the following sections:

1. Your full name, student ID, tutors’ names, and tutorial number.

2. Project title.

3. Brief introduction of your topic and motivation (maximum of 2 paragraphs).

4. Up to 3 questions you wish to answer. The number of questions depends on the scope of the question itself. You can have one general question or three more detailed ones.

5. Data source(s) you plan to use to answer these questions, including a brief description of the data in each data source (e.g. number of rows, number of columns, type of data: tabular, spatial, network, textual or other; URL).

Note: The topic and questions should allow for interesting and detailed analysis for the Data Exploration Project and the subsequent Visualisation Project (due at the end of semester). Questions that are too easy to answer (e.g., what is the correlation between x and y, what are the top N values in the data), too difficult to answer, not relevant to the unit, or impossible to answer from the data sources provided will be rejected (and will not receive the Suitability and Clarity mark; see the marking rubric). If in doubt, talk to your tutor during their consultation before the due date.

Please ensure you read the entire document before deciding on your project, as the proposal is for the entire Data Exploration Project. See the end of this document for an example proposal and potential data sources that you may look at to get you started.


Data Exploration (33%)

The report should have the following structure:

1. Introduction

Problem description, question, and motivation.

2. Data Wrangling

Description of the data sources with links if available, the steps in data wrangling (including data cleaning and data transformations), and tools that you used.

3. Data Checking

Description of the data checking that you performed, errors that you found, your method and justification for how you corrected them, and the tools that you used.

4. Data Exploration

Description of the data exploration process with details of the visualisations and statistical tests (if applicable) you used, what you discovered, and tools that you used.

5. Conclusion

Summary of what you learned from the data and how your data exploration process answered (or did not answer) your original questions.

6. Reflection

Brief description of what you learned in this project and what you might have done differently in hindsight.

7. Bibliography

Appropriate references and bibliography (this includes acknowledgements to online references or sources that have influenced your exploration) using either the APA or IEEE referencing system.

The written report should be no more than 10 pages for all sections mentioned above, excluding cover page and appendix (see below). Your written report will be the sole basis for judging the quality of the data checking, data wrangling, data exploration, as well as the degree of difficulty. Thus, please include sufficient information in the report. It should, for instance, contain images of visualisations used for exploration and the results of any statistical analysis. You should include any analysis that you carry out even if it is incomplete or inconclusive as it demonstrates that you have thoroughly explored the data set.

If you wish to provide additional material, an Appendix of up to 5 pages may be added at the end of the document. However, the Appendix will not be graded, so you should only use it for supplementary material that is not essential to the report or the reader's understanding. Clearly title this section as Appendix.


Marking Rubric

Project Proposal (2%)

● Completeness and Timeliness [1%]: All components of the Proposal are included and it is submitted on time.

● Suitability and Clarity [1%]: Clear motivation, valid and suitable question(s) and data source(s).

You will be meeting with your tutor to discuss your Project Proposal and receive feedback during your Tutorial in Week 3. If your proposal is rejected, your tutor will specify the reason(s) they have done so and suggest areas for improvement.


Data Exploration (33%)

● Data Checking and Wrangling [5%]: appropriate checking, cleaning and reformatting; managing to get data into Tableau or R.

● Visualisation Design [5%]: visualisations that are appropriate for the intended purpose; readable and interpretable; appropriate labeling of axes; clear legends; saliency of patterns and trends.

● Analytical Methods and Interpretations [6%]: analysis that is appropriate for the intended purpose; justification and explanation of exploration process and use of statistical measures; identification of trends, patterns, and insights.

● Degree of Difficulty:

 Data Complexity [4%]: e.g. significant wrangling or cleaning required; good use of non-tabular data (e.g. spatial, relational, textual); large datasets (observations or dimensions) and/or multiple data sets; data scraping.

 Advanced Analysis [2%]: e.g. clustering; dimensionality reduction; sophisticated aggregation and/or filtering; non-linear model fitting; correct use of statistical tests; complex timeseries analysis.

 Visualisation Complexity [3%]: e.g. implementation difficulty; variety of good visualisations; attention to visual detail; complex visualisations.

 Thoroughness of Interpretation [3%]: e.g. clearly articulated findings; awareness of limitations; deep exploration; thorough conclusions.

● Written Report [5%]: completeness; quality of writing and images; logical structure; correct referencing of figures and tables; correct academic referencing of sources.


Submission and due dates

● Project Proposal: Submit a one-page PDF. Due Week 3. See Moodle for date and time.

● Exploration Project: Submit a PDF of no more than 10 pages (excluding cover page and appendix). Due Week 7. See Moodle for date and time.


Late submissions

● There will be zero marks for late Project Proposal submissions. Everyone must submit the Project Proposal. Even if the deadline has passed, you must still submit a proposal (with a grade of 0) as your project must be approved before you can continue working on the Data Exploration.

● For Data Exploration, submissions received after the deadline (or after an extended deadline for those with an extension or special consideration) will be penalised at 10% of the total mark [33%] per day, up to a maximum of 7 days. Submissions received more than 7 days late will receive zero marks and no feedback.

● For further information on eligibility for Extensions or Special Consideration, see the Moodle unit page under Assessments > Extensions and Special Consideration.
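
As a rough illustration of the Data Exploration late penalty described above, the arithmetic might be sketched in Python as follows. This is only one reading of the rule, not an official calculator: the function name and parameters are hypothetical, and it assumes that lateness is counted in whole days, that the penalty is deducted from the mark you earned, and that the result cannot fall below zero. Check Moodle or ask your tutor for the authoritative interpretation.

```python
def late_penalised_mark(raw_mark, days_late, total=33.0, rate=0.10, max_days=7):
    """Return the Data Exploration mark after an assumed late penalty.

    raw_mark  -- mark earned before any penalty (out of `total`)
    days_late -- whole days after the (possibly extended) deadline
    total     -- total mark available for the Data Exploration (33%)
    rate      -- penalty per day as a fraction of `total` (10%)
    max_days  -- submissions later than this receive zero
    """
    if days_late <= 0:
        # Submitted on time: no penalty applies.
        return raw_mark
    if days_late > max_days:
        # More than 7 days late: zero marks.
        return 0.0
    # Assumption: deduct 10% of the total (3.3 marks) per day, floored at zero.
    penalty = rate * total * days_late
    return max(0.0, raw_mark - penalty)


if __name__ == "__main__":
    # Example: a report that earned 25 out of 33, submitted 2 days late.
    print(round(late_penalised_mark(25.0, 2), 2))
```

Under these assumptions, each day late costs 3.3 of the 33 available marks, so even a strong report loses most of its value after a few days.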


Resubmissions

If you are retaking this unit from a previous semester, please ensure you choose a completely new topic and dataset.