
Final Project Description

Fundamentals of Business Analytics, MAN2188 Semester 1, 2022-23

Essential Information:

·  Students need to prepare the final project individually and submit it before 4:00 pm on 3 January 2023 via SurreyLearn.

·  The project consists of applying statistical analyses based on a real-world dataset using statistical software (IBM SPSS Statistics). The expected level of statistical analyses will be based on the lectures and lab sessions.

· IBM SPSS Statistics is available on all FASS lab and all central lab computers (including those in the library). The network version is also accessible via the Surrey virtual desktop.

·  All documents and files (e.g., Dataset, Assessment Brief, Sample Report) related to the final project are available from SurreyLearn: Course Materials -> Assessment Information.

·  Please ensure you are aware of the assessment regulations and submission procedures: https://exams.surrey.ac.uk/assessments

Context and Dataset:

The dataset contains data related to the salaries of graduates working in technology-related sectors during the time period 2017-2020. Overall, the objective of the report is to evaluate the impact of several factors (i.e., predictors) on the annual salary of a graduate employee. In other words, the objective of this report is to estimate the effect of total work experience, current work experience, and bonus on the graduate employee’s annual salary.

Please note that some numbers have been amended to preserve the confidentiality of the data; therefore, this dataset is not suitable for research purposes.

Below follows the description of the variables in the dataset.

· year: year of data collection

· id: unique identifier for each employee

· company: name of the company

· sector: category of the sector that the company belongs to (e.g., Aerospace)

· seniority_level: level of the employee in the company (e.g., entry level position)

· title: title of job position of employee (e.g., software engineer)

· annualsalary: annual salary of the employee in thousand dollars (USD)

· totalexperience: total working experience of the employee

· currentexperience:  work experience of the employee in the current company

· bonus: the sum of money the employee received that year as a reward for good performance in thousand dollars (USD)

· gender: gender of the employee

· country: location of company

Content and Structure of the Assignment:

Introduction

1. Please report the word count, your name and student number at the beginning of your project.

2. Provide a brief explanation of the methodology, including the data, the definitions of the dependent and independent variables, the objective of the analyses (i.e., purpose), and the regression model.

Descriptive Analysis

1. Provide a table showing summary statistics of the variables for the entire sample. Discuss the results.

2. Provide a classification of employees in three groups according to their seniority levels. Recode the seniority_level variable into three groups and assign relevant labels:

- 1 = Entry-level: less than 4

- 2 = Mid-level: 5 to 8

- 3 = Senior-level: 9 or higher

Provide a table showing summary statistics for the new variable. Briefly discuss the results.

3. Provide a classification of employees in three groups based on their total working experience. Recode the totalexperience variable into three groups and assign relevant labels:

- 1 = Entry: 0 to 5

- 2 = Average: 6 to 15

- 3 = High: 16 or higher

Provide a table showing summary statistics for the new variable. Briefly discuss the results.

4. Based on the classifications you have just developed, please answer the following:

o Does the mean salary of employees increase as total working experience increases? Explain briefly, providing relevant tables.

o Does the mean bonus of employees decrease as seniority increases? Explain briefly, providing relevant tables.

o Do men have a higher mean salary than women? Explain briefly, providing relevant tables.

5. Create relevant graphs to answer the following questions:  

o How well paid in bonuses are men and women across different seniority levels? Discuss your results. What conclusions can you derive from the visualisation?

o How do annual salaries compare across different years and job titles? Discuss your results. What conclusions can you derive from the visualisation?
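
The project itself must be carried out in IBM SPSS Statistics. Purely as an illustration of the logic behind the total-experience recode and the grouped means in item 4, the steps can be sketched in Python. This is a minimal sketch, not a prescribed method: the function names and the sample rows are hypothetical, and it assumes that totalexperience is recorded in whole years.

```python
from collections import defaultdict
from statistics import mean


def experience_group(years):
    """Return the brief's code for total experience: 1 = Entry, 2 = Average, 3 = High."""
    if years <= 5:
        return 1  # Entry: 0 to 5
    if years <= 15:
        return 2  # Average: 6 to 15
    return 3      # High: 16 or higher


def mean_salary_by_group(rows):
    """Average annualsalary within each recoded experience group."""
    salaries = defaultdict(list)
    for row in rows:
        group = experience_group(row["totalexperience"])
        salaries[group].append(row["annualsalary"])
    return {group: mean(values) for group, values in sorted(salaries.items())}


# Hypothetical rows for illustration only; they are not taken from the project dataset.
sample = [
    {"totalexperience": 2, "annualsalary": 80},
    {"totalexperience": 5, "annualsalary": 100},
    {"totalexperience": 10, "annualsalary": 150},
    {"totalexperience": 20, "annualsalary": 210},
]
group_means = mean_salary_by_group(sample)
```

Comparing the entries of `group_means` across codes 1 to 3 mirrors the kind of table requested in item 4; the same pattern applies to bonus by seniority group or salary by gender.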

Exploratory Analysis

1. Inspect the dataset graphically, such as checking the distribution of all variables, checking the possibility of outliers, and pre-checking the relationship/association between the dependent and all independent variables. The details and types of graphs are your decision - the objective is to provide a concise yet informative inspection of the data before running the regression. You may select any graphs that we have produced in the labs, which efficiently describe various aspects of the data. Make sure to provide adequate discussion and explanation.

2. Evaluate the skewness and kurtosis of relevant variables in the dataset. Compute descriptive statistics (showing mean, standard deviation, min, max). Produce relevant graphs. Discuss your results.

3. Estimate the correlation between annualsalary and currentexperience. Produce a relevant graph to inspect the relationship visually. What is the strength and direction of the relationship? Discuss your results.

4. Create a new variable called status that refers to the receipt of bonus per employee. The variable takes two values: 0 and 1. It is 0 when an employee received 2000 dollars or less that year and 1 when the employee received more than 2000 dollars as a bonus that year. Produce a table showing summary statistics for the new variable. Produce a relevant graph showing the average salary for employees who received 2000 dollars or less and for employees who received more than 2000 dollars as a bonus. Discuss your results.
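
As a rough illustration of items 3 and 4, the Pearson correlation and the status variable can be written out by hand in Python. This is a sketch under stated assumptions: the function names are hypothetical, and because the variable list describes bonus as recorded in thousand dollars, the 2000-dollar cut-off is assumed to correspond to a stored value of 2.0. Check the units in the actual dataset before applying the threshold in SPSS.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares around the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


def bonus_status(bonus_thousands, threshold_thousands=2.0):
    """Return 1 if the bonus exceeds the threshold, otherwise 0.

    Assumption: bonus is stored in thousand USD, so 2000 dollars is 2.0.
    """
    return 1 if bonus_thousands > threshold_thousands else 0
```

The sign of `pearson_r` gives the direction of the relationship and its absolute value the strength; `bonus_status` assigns exactly 2000 dollars to group 0, as the item specifies.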

Inferential Statistics:

1. Apply a statistical test and evaluate if there are any significant differences across years regarding the annual salary. If there are significant differences, evaluate which groups differ significantly with each other. You may use a relevant graphical illustration to enhance your discussion. Justify the selection of the statistical test(s). Discuss your results.

2. Apply a statistical test and evaluate if there are any significant differences between genders regarding the bonus. Again, you may use graphical illustration to enhance the interpretation of your results. Justify the selection of the statistical test. Discuss your results.

3. Apply a statistical test to evaluate whether there are any significant differences across seniority_levels regarding the annual salary. Justify the selection of the statistical test. Produce a boxplot graph and explain the results. Discuss your overall results.
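
If a one-way ANOVA is the test selected for comparing annual salary across years or seniority levels (the brief leaves the choice and its justification to you), the F statistic it reports can be reproduced by hand. The sketch below is illustrative only: the function name and the salary figures are hypothetical, and it does not compute the p-value or any post hoc comparison, which SPSS provides.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples, one list per group."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Between-group variation: group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group variation: observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical salaries (thousand USD) for three years; not taken from the project dataset.
salaries_by_year = [
    [100, 110, 120],
    [110, 120, 130],
    [140, 150, 160],
]
f_stat, df_between, df_within = one_way_anova_f(salaries_by_year)
```

A large F relative to the F distribution with (`df_between`, `df_within`) degrees of freedom indicates that at least one group mean differs; identifying which groups differ requires a follow-up comparison.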

Regression Analysis:

1. Conduct a simple regression to estimate the effect of bonus on the annual salary. Carefully interpret and discuss the results (e.g., R-squared, the statistical significance of coefficients and the effect size of the independent variables).

2. Conduct a multiple regression to estimate the effect of bonus, total work experience, and current work experience on the annual salary. This is the baseline model. Carefully interpret and discuss the results (e.g., R-squared, the statistical significance of coefficients and the effect size of independent variables). Compare your results with the previous model you produced (using relevant goodness of fit indices to compare the two models).

3.  Apply diagnostic analyses on the baseline model to check for potential multicollinearity and suggest potential appropriate remedies that could be applied if needed. Briefly discuss the results.

4. Apply diagnostic analyses on the baseline model to check the homoscedasticity assumption (i.e., to test for potential heteroscedasticity). Briefly discuss your results.

5. Overall, what other variables, that may be affecting a graduate employee’s salary, are not included in the dataset and/or the baseline model? How may the baseline model improve if these additional variables are included in the regression? Discuss briefly.
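
To make the quantities in items 1 and 2 concrete, the simple regression of annual salary on bonus can be computed by hand, together with the adjusted R-squared that is commonly used when comparing models with different numbers of predictors. This is a sketch only: the function names and data are hypothetical, adjusted R-squared is one possible goodness of fit index rather than the one the brief mandates, and significance tests are left to SPSS.

```python
def simple_ols(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (intercept, slope, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the variation in y explained by the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return intercept, slope, 1 - ss_res / ss_tot


def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Penalise R-squared for the number of predictors in the model."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical values in thousand USD; not taken from the project dataset.
bonus = [1, 2, 3]
annualsalary = [100, 110, 110]
intercept, slope, r_squared = simple_ols(bonus, annualsalary)
```

Here `slope` is the estimated change in annual salary (in thousand USD) for a one-unit increase in bonus, which is the effect size the interpretation in item 1 should address.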

Appendix

Include your IBM SPSS Statistics output results in your submission. You can upload the output results along with your report on SurreyLearn as a separate file. Please do not copy the output results as screenshots in the report. Please make sure that the output file you will submit includes only the correct answers to the questions given to you.

Format

· The project file should be in Microsoft Word format, in Times New Roman 12-point font, double spaced. The word count of the project should be no more than 3500 words.

· The word count includes everything from the first word of the introduction to the last word of the conclusion. The word count does not include tables, figures or images, and appendices. It does not include the abstract, table of contents, abbreviation pages, or references (though these are not mandatory in this project). You should report the word count, your name and student number at the beginning of your project. According to university policy, exceeding the word count limit is subject to a 10-point penalty.

Guidelines 

·  Apply the required analyses in order, section by section (from the Introduction through the Descriptive Analysis to the Regression Analysis).

·  The report should stand alone as a self-sufficient document for readers who do not have access to the project description. Thus, the report, including the writing, explanations, tables and graphs, should be clear and informative.

·  In the introduction, you need to explain the aim of this empirical report, the sample and data, providing the definition of all variables incorporated in the dataset. Some of this information (such as sample and variables definition) has been provided to you, but you need to summarise them in your report concisely.

·  All tables and graphs should be numbered and titled (with captions if an additional explanation is required) and should be referred to in the report accordingly. The label of the variables in tables and graphs should be informative.

·  Graphs should be visually clear (axis title, colour, legend, axis scale, etc.). You can use image format for your graphs. Please try not to populate the report with lots of graphs; be selective and use the most informative ones for your purposes.

·  Tables can be exported from the statistical software to a Word format or image format.

·  In the regression tables, coefficients should be reported along with the significance level of the coefficient. The R-squared and number of observations for each model should be reported too.  

·  The SPSS output for all tables, graphs, regressions, etc. should be provided in a clear and readable format as a separate sav file.

·  You do not need to cite any references; if you choose to do so, use a proper citation style and provide the reference list in the appendix.

·  Overall, the project’s quality (i.e., clarity, rigour, precision, and depth) is more important than the length.