

FIT5149 S2 2021 Assessment 1

Productivity Prediction of Garment Employees

Aug 2021


Marks: 15% of all marks for the unit

Due Date: Friday, 3 September 2021, 12:59 PM

Extension: An extension may be granted under certain circumstances. A special consideration application form must be submitted. Please refer to the university webpage on special consideration.

Lateness: For all assessment items handed in after the official due date without an agreed extension, a 10% penalty applies to the student’s mark for each day after the due date (including weekends and public holidays), for up to 5 days. Assessment items handed in after 5 days will not be considered.

Authorship: This assignment is an individual assignment and the final submission must be identifiably your own work. Breaches of this requirement will result in an assignment not being accepted for assessment and may result in disciplinary action.

Submission: You are required to submit two files: one is either a Jupyter notebook or an R Markdown file, and the other is the PDF file generated from it. The two files must be submitted via Moodle. Students are required to accept the terms and conditions on the Moodle submission page. A draft submission won’t be marked.

Programming language: R in Jupyter Notebook or R Markdown


Introduction

The Garment Industry is one of the key examples of the industrial globalization of this modern era. It is a highly labour-intensive industry with lots of manual processes. Satisfying the huge global demand for garment products is mostly dependent on the production and delivery performance of the employees in the garment manufacturing companies. So, it is highly desirable among the decision makers in the garments industry to track, analyse and predict the productivity performance of the working teams in their factories.

        In this assignment, the task is to build models to predict employee productivity given some other factors in a factory. In order to maintain real-time capability, model sizes should be as small as possible. Note that the employee productivity estimation in production will be deployed on best-cost hardware of traction drives in an automotive environment, where lean computation and lightweight implementation are key.

        Specifically, the problem you are going to solve is: Can you

● accurately predict the actual productivity given the collected data?

● explain your predictions and the associated findings well? For example, identify the key factors that are strongly associated with the response variable, i.e., actual productivity.


Data set

The data set contains 1197 instances, each of which has 15 columns: the first 14 columns correspond to the attributes, and the 15th column, “actual productivity”, is the variable that we will predict. The details of the data set can be found in the original UCI Repository. Note that we will use a revised data set in which there are no missing values.


Task description

In this assessment, you will focus on the following two tasks.


Prediction task

For the prediction task, the underlying problem is to predict the actual productivity using the 14 collected attributes. The provided data sets are well organised, so you do not need to wrangle the data. But make sure you understand the intuition behind these attributes.

        To measure the performance of your model(s), it is suggested to shuffle the data and split the data into training and testing sets, fit the model using the training set, do the prediction on the test set and compute some performance metrics.
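The evaluation workflow suggested above (shuffle, split, fit, predict, score) might look roughly like the following in base R. This is only a minimal sketch: the data frame is synthetic stand-in data, the column names `x1` and `x2` are hypothetical, and the 80/20 split and the choice of RMSE and MAE are assumptions rather than requirements of this assessment.

```r
# Minimal sketch of shuffle -> split -> fit -> predict -> score.
# ASSUMPTION: `garment` is synthetic stand-in data, not the real data set;
# x1 and x2 are hypothetical attribute names.
set.seed(5149)  # fix the seed so the shuffle is reproducible

n <- 200
garment <- data.frame(x1 = runif(n), x2 = rnorm(n))
garment$actual_productivity <-
  0.5 + 0.3 * garment$x1 - 0.1 * garment$x2 + rnorm(n, sd = 0.05)

# Shuffle the row indices, then use the first 80% for training (assumed ratio).
shuffled  <- sample(nrow(garment))
n_train   <- floor(0.8 * nrow(garment))
train_set <- garment[shuffled[1:n_train], ]
test_set  <- garment[shuffled[(n_train + 1):nrow(garment)], ]

# Fit on the training set only, then predict on the held-out test set.
fit  <- lm(actual_productivity ~ ., data = train_set)
pred <- predict(fit, newdata = test_set)

# Two common regression metrics computed on the test set.
rmse <- sqrt(mean((test_set$actual_productivity - pred)^2))
mae  <- mean(abs(test_set$actual_productivity - pred))
cat("Test RMSE:", rmse, " Test MAE:", mae, "\n")
```

With the real data, the synthetic data frame would be replaced by the provided file, and any categorical attributes would need to be stored as factors before fitting.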

        In this task, you are required to develop models that can accurately predict the actual productivity. To finish the task, you should

1. develop and compare at least 3 models;

2. describe and justify the choice of your models;

3. analyze and interpret your results.
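One possible way to organise a comparison of three models is to fit them on the same training set and tabulate test error next to model size. The sketch below is illustrative only: the data are synthetic, the attribute names `x1`, `x2`, `x3` are hypothetical, and the three candidates (a full linear model, an AIC stepwise-reduced model, and a model with a quadratic term) are arbitrary examples, not the models expected in a submission.

```r
# Sketch: compare three candidate models on one train/test split.
# ASSUMPTION: synthetic data with hypothetical attributes x1, x2, x3.
set.seed(5149)
n   <- 300
dat <- data.frame(x1 = runif(n), x2 = rnorm(n), x3 = rnorm(n))  # x3 is pure noise
dat$actual_productivity <-
  0.5 + 0.3 * dat$x1 - 0.1 * dat$x2^2 + rnorm(n, sd = 0.05)

idx   <- sample(n, size = floor(0.8 * n))
train <- dat[idx, ]
test  <- dat[-idx, ]

# Three illustrative candidates, all fitted on the training set.
models <- list(
  linear    = lm(actual_productivity ~ ., data = train),
  stepwise  = step(lm(actual_productivity ~ ., data = train), trace = 0),
  quadratic = lm(actual_productivity ~ x1 + x2 + I(x2^2) + x3, data = train)
)

# Root mean squared error of a fitted model on new data.
rmse <- function(model, newdata) {
  sqrt(mean((newdata$actual_productivity - predict(model, newdata = newdata))^2))
}

# One row per model: number of coefficients (a crude size measure) and test RMSE.
comparison <- data.frame(
  model     = names(models),
  n_coef    = sapply(models, function(m) length(coef(m))),
  test_rmse = sapply(models, rmse, newdata = test),
  row.names = NULL
)
print(comparison[order(comparison$test_rmse), ])
```

Reporting the coefficient count alongside the error makes the trade-off between accuracy and model size explicit, which matters here because small models are preferred.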


Description task

The purpose of the description task is to identify the key factors that have a strong effect on the actual productivity. In other words, which property contributes the most to your model’s performance? Descriptions can be based on variable correlation analysis, regression equations, or any other form. The description and the accompanying interpretation must be comprehensible, useful and statistically supported wherever possible. To finish this task, you should use proper data analysis techniques to

1. identify a subset of attributes that have a significant impact on the prediction of the actual productivity;

2. give statistical reasons for your findings.
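As a rough illustration of the two kinds of evidence mentioned above (correlation analysis and regression output), the sketch below ranks attributes by absolute Pearson correlation with the response and then reads p-values from a linear model summary. The data are synthetic, the attribute names are hypothetical, and the 0.05 significance threshold is an assumption; other techniques are equally acceptable.

```r
# Sketch: correlation ranking plus regression-based significance.
# ASSUMPTION: synthetic data with hypothetical numeric attributes x1, x2, x3.
set.seed(5149)
n   <- 300
dat <- data.frame(x1 = runif(n), x2 = rnorm(n), x3 = rnorm(n))  # x3 is pure noise
dat$actual_productivity <-
  0.5 + 0.3 * dat$x1 - 0.1 * dat$x2 + rnorm(n, sd = 0.05)

# Pearson correlation of each attribute with the response, strongest first.
predictors <- setdiff(names(dat), "actual_productivity")
cors <- sapply(dat[predictors], cor, y = dat$actual_productivity)
print(sort(abs(cors), decreasing = TRUE))

# Regression evidence: coefficient estimates with t-tests.
fit        <- lm(actual_productivity ~ ., data = dat)
coef_table <- summary(fit)$coefficients
print(coef_table)

# Attributes whose coefficients differ from zero at the (assumed) 5% level.
significant <- rownames(coef_table)[coef_table[, "Pr(>|t|)"] < 0.05]
significant <- setdiff(significant, "(Intercept)")
print(significant)
```

Pearson correlation only applies to numeric attributes. For categorical attributes stored as factors, an F-test such as `anova()` on nested models is one alternative way to obtain statistical support.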


Files to be submitted

Two files must be submitted:

● The R implementation of the tasks in one file.

– The file must be either a Jupyter notebook or an R Markdown file. Besides the R code, all the discussions must also be included in the file.

– The name of the file must be in one of the following formats:

 XXXXXXXX_FIT5149_Ass1.ipynb

 XXXXXXXX_FIT5149_Ass1.Rmd

You should replace “XXXXXXXX” with your student ID.

● A PDF file generated from the Jupyter notebook or R Markdown file. The name of the PDF file must be in the following format:

– XXXXXXXX_FIT5149_Ass1.pdf

Before you generate the PDF file, please clear all the outputs. Please note that the PDF file will be used by Turnitin for the purpose of plagiarism checking. It is your full responsibility to make sure that all the outputs are cleared before the PDF file is generated, as the outputs can contribute significantly to the Turnitin scores.

Please refer to the Moodle page for Assessment 1 for how to submit the two files. Please note that if you do not name your files as specified above, your submission will not be marked and will receive a mark of 0.


Academic integrity

Please be aware of the University’s policy on academic integrity. Monash University takes academic misconduct very seriously. You can learn from the above materials and understand the principle of how the analysis was done. However, you must complete this assessment with your own work.