
MBAD6224.10P – Decision Making and Data Analysis Fall 2023

Published: 2023-09-22



Individual Assignment 1

General Instructions

Please read carefully: You are to work strictly individually on this assignment. You must not receive or give any help in working on this assignment, and you must not discuss or exchange any information (electronic or other) with anyone about it. Having your work automatically backed up to a drive accessible to others is a violation of this rule. Read the assignment entirely, including the instructions at the end, before starting to work. Provide explanations for your answers to the questions.

Due date: The assignment is due on Sunday 24 September, 11:59pm EST.

What to turn in: Submit a single Excel file containing all your work, as detailed in the Instructions at the end. When finished, write the following statement at the top of the first worksheet: “I, your full name, attest that I did not receive or provide any help in working on this assignment.”

How to turn in your work: Submit your work on Blackboard by clicking on “Submit Assignment” in the left pane menu. If you submit your work multiple times, only your last submission will be considered.

Medical Care Costs

A health insurance company would like to predict the medical care costs that it will have to pay, based on characteristics of the health policy beneficiary. Data on the following variables were gathered:

-     Gender: gender of primary beneficiary; 1 = Female, 0 = Male

-     Age: age of primary beneficiary

-     BMI: Body Mass Index of primary beneficiary. Calculated from a person's weight and height, BMI is an approximate measure of total body fat

-     Smoker: binary variable indicating whether the primary beneficiary is a smoker; 1 = Smoker, 0 = Non-smoker

-     Dependents: number of dependents (e.g., children) covered under the health insurance policy

-     Charges: medical care costs (in USD) billed to health insurance over a calendar year

The data on the above variables for a sample of 1338 policy holders is in the worksheet named ‘Sheet1’ in the accompanying Excel file.

a)    Using the data in Sheet1, build a linear regression model to relate Charges, as the dependent variable, to the other variables. Show the regression output only for the model you choose to retain, and explain the steps you took to arrive at your model. Write down the regression equation resulting from your model.
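For readers who want to see what a multiple regression computes behind Excel's output, the following is a minimal Python sketch and not part of the required Excel workflow. It fits ordinary least squares to synthetic data generated inside the block; the function name `fit_ols`, the two predictors used, and the coefficients that generate the data are all assumptions made for illustration and have no connection to the values in Sheet1.

```python
import numpy as np

def fit_ols(X, y):
    """Fit y = b0 + b1*x1 + ... + bk*xk by ordinary least squares.

    Returns (beta, r2, adj_r2), where beta[0] is the intercept.
    """
    n, k = X.shape
    A = np.column_stack([np.ones(n), X])          # prepend intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # least-squares solution
    resid = y - A @ beta
    ss_res = float(resid @ resid)                 # residual sum of squares
    ss_tot = float(((y - y.mean()) ** 2).sum())   # total sum of squares
    r2 = 1 - ss_res / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    return beta, r2, adj_r2

# Synthetic illustration only: these numbers are invented, not from Sheet1.
rng = np.random.default_rng(0)
n = 200
age = rng.integers(18, 65, n).astype(float)
smoker = rng.integers(0, 2, n).astype(float)
charges = 1000 + 250 * age + 20000 * smoker + rng.normal(0, 500, n)

X = np.column_stack([age, smoker])
beta, r2, adj_r2 = fit_ols(X, charges)
print("intercept, age, smoker coefficients:", beta)
print("R^2:", r2, "adjusted R^2:", adj_r2)
```

The adjusted R-squared is included because it penalizes extra predictors, which makes it one common yardstick when deciding whether a variable is worth keeping in a model.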

Sheet2 of the data file includes an additional variable called ‘Smoker x BMI’. This new variable is the product of the variables ‘Smoker’ and ‘BMI’. Such a product variable can capture the possible interaction of two variables, for example, the fact that the combination of being a smoker and having a high BMI may have an important effect beyond either factor by itself.
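The product variable described above can be illustrated with a short Python sketch. The sample values and the helper `bmi_slope` are hypothetical and chosen only to show the mechanics: because Smoker is 0 or 1, the product equals 0 for non-smokers and equals BMI for smokers, so its coefficient acts as an extra BMI slope that applies to smokers only.

```python
import numpy as np

# Made-up values for four beneficiaries (not taken from the data file).
bmi = np.array([22.0, 35.0, 28.5, 31.0])
smoker = np.array([0, 1, 1, 0])

# The interaction column is the element-wise product of the two variables.
smoker_x_bmi = smoker * bmi
print(smoker_x_bmi)

def bmi_slope(b_bmi, b_inter, is_smoker):
    """Effect on the prediction of one extra BMI unit in a model
    containing the terms b_bmi*BMI + b_inter*(Smoker*BMI)."""
    return b_bmi + b_inter * is_smoker

# With hypothetical coefficients b_bmi=10 and b_inter=5, the BMI slope
# differs between non-smokers and smokers.
print(bmi_slope(10.0, 5.0, 0), bmi_slope(10.0, 5.0, 1))
```

When predicting for a particular person, the interaction input is computed the same way from that person's own Smoker and BMI values before it is multiplied by its coefficient.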

b)   Working with the data in Sheet2, build a new regression model to predict Charges. Again, keep only the model you retain in the end, and explain all the steps you took to select a model. Write down the resulting regression equation.

c)    According to the model you obtained in Part (b), what are the predicted annual charges on a policy in which the primary beneficiary is a 50-year-old male, smoker, with a BMI of 35, with 3 dependents?

d)   Of the two models you obtained in (a) and (b), which would you recommend using? Why?

Instructions. Your Excel file should contain separate worksheets for Parts (a) and (b), showing the regression output for your final model only, and describing the steps you took to arrive at your final model. For Parts (c) and (d), type your answers and/or calculations somewhere on the same worksheet as for Part (b). Clearly identify your answers by question number, so that I do not have to guess where to find them in your worksheet. Create an additional worksheet for any other material that you want to show, but please remove outputs of intermediate regression runs, so that I will not be confused about what I need to look at in your file.

When you are done with your work, include the honor code statement at the top of your file.