
MATH6006 Computer Workshop (2021-22 Class)

At the end of the session you should be:

1. Familiar with Minitab and capable of using the Help files to find out how to do new things.

2. Able to input data into Minitab.

3. Able to use Minitab to find descriptive statistics for univariate data; plot data; perform simple linear regression; test for normality; perform one-way ANOVA tests.

1 Getting Used to Using Minitab

Minitab consists of a Session Window and a Worksheet Window. Results of analyses will usually be output to the session window and data are stored in the worksheet window. Complex plots are output in their own windows. Minitab describes the combination of the session, worksheet and graph windows as a project.

Install the latest version of Minitab from http://www.software.soton.ac.uk.

1.1 Using the Help Menu

Minitab has excellent Help files, some useful tutorials and a searchable database. There is also a detailed online manual available at

http://www.minitab.com/en-us/support/documentation/

1.2 Inputting Data

Data can be input in several different ways:

• Directly into the worksheet using the keyboard

• Copying from MS Excel and pasting into Minitab

• Importing from a data file.

We will be using the following three data sets in this computer session:

eggs.txt,        olympics.xls,        ShippingData.MTW,

which can be found on the Blackboard site in the Course Documents folder. The eggs data are stored in a text file and the Olympic results data in an Excel file. Details of how to import the data are below.

(a) Open the eggs data in Minitab: Download the eggs data from the Blackboard site and save it somewhere on your computer. Open the file in Notepad and look at the format of the data, e.g. is it comma, tab or space delimited? To open it in Minitab, use File → Open. In the pop-up window, browse to the folder containing the eggs file, select it and click OK to confirm.

(b) Open the Olympic Results data in Minitab: Download the Olympic results data from the Blackboard site and save it somewhere on your computer. Open the file in Excel and look at the format of the data. There are two ways of transferring the data from Excel to Minitab. The first is identical to opening a txt file, as described above. Alternatively, copy and paste the data into the Minitab worksheet.

(c) Saving Your Work: To save your work in Minitab, go to File → Save Project As and then choose a location and name for the file. This will save the worksheet, session window and graph windows. It is possible to save only the worksheet window.
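The delimiter check in step (a) can also be done programmatically. The following is a minimal Python sketch and is not part of the Minitab workflow; the `guess_delimiter` helper and the sample lines are hypothetical, and the real layout of eggs.txt may differ.

```python
def guess_delimiter(line):
    """Return the most frequent candidate delimiter in one line of text."""
    candidates = {"tab": "\t", "comma": ",", "space": " "}
    counts = {name: line.count(char) for name, char in candidates.items()}
    best = max(counts, key=counts.get)
    # If no candidate appears at all, the line holds a single field.
    return best if counts[best] > 0 else "none"


# Hypothetical sample lines; the real file contents are not shown in this worksheet.
print(guess_delimiter("1\t2.15"))  # tab
print(guess_delimiter("1,2.15"))   # comma
print(guess_delimiter("1 2.15"))   # space
```

Checking a single line is a deliberate simplification; inspecting several lines in Notepad, as described above, remains the more reliable check.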

2 Plotting Data

Consider initially the eggs data. Draw the following plots:

1. Histogram: Graph → Histogram. There are lots of options for histograms. Try initially drawing a histogram of all of the data by selecting a simple graph and clicking OK. In the next command window select “egg” for the “Graph variables” box, then click OK to draw the histogram.

To draw histograms for each of the different groups, again choose Graph → Histogram. Select the Groups Overlaid button and choose the Y-variable to be “egg” and Group Var. to be “species” to draw the three histograms in one window, or select Groups Displayed Separately to make three separate graph windows. Click OK to return to the main command window, then click OK again to draw the graph. Investigate some of the other options available to you.

2. Box and Whiskers Plot: Graph → Boxplot. Start by drawing a boxplot for the whole dataset: select “One Y simple” in the first command window and then press OK. Select “egg” as the Graph variable in the second command window and then click OK. To draw a Boxplot that gives boxes for the 3 different species, select “One Y groups” in the first command window, then click OK. In the second window, select “egg” as the graph variable and “species” as the categorical variable, then click OK to draw the plot. Put a title on this graph and label the axes (Hint: use the Labels button).

3. Other Plots: Try drawing a dot plot and a stem-and-leaf plot of the eggs data. What are these graphs useful for? How would you interpret them?
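To help interpret the boxplot, it can be useful to see the numbers it summarises. The following is a minimal Python sketch of a five-number summary and the 1.5 × IQR fences commonly used to flag outliers. The function names and the sample values are hypothetical, and Minitab's own quartile convention may differ slightly from the one used by Python's `statistics.quantiles`.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, median, q3, max(values)


def outlier_fences(q1, q3):
    """Return the 1.5 * IQR fences commonly used to flag boxplot outliers."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


# Hypothetical values, not the worksheet's egg data.
sample = [1, 2, 3, 4, 5, 6, 7]
print(five_number_summary(sample))  # (1, 2.0, 4.0, 6.0, 7)
print(outlier_fences(2.0, 6.0))     # (-4.0, 12.0)
```

The box spans Q1 to Q3, the line inside it marks the median, and points outside the fences would typically be drawn as individual outliers.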

Save the eggs project and open the Olympic results project.

1. Scatter Plot: A scatter plot is not a useful tool to use for the eggs data, but would be useful for the Olympics results data. To draw a scatter plot, go to Graph → Scatterplot. Start with a simple scatterplot. In the table, select “long” as a y-variable and “year” as an x-variable for row 1. In row 2, put “high” as the y-variable and “year” as the x-variable. In row 3, put “discus” as the y-variable and “year” as the x-variable. Click on the Multiple Graphs button and choose how you wish the data to be displayed. If you are not sure what some of the options are, try them and see what the resulting graphs look like. Next, try drawing scatterplots with regression lines superimposed on them by choosing “with regression” in the first command window.

2. 3D Scatter Plot: Graph → 3D Scatterplot. Choose a simple plot and click OK. Choose “long”, “high” and “discus” as the z, y and x variables to observe the relationships between these variables. Click OK to draw the plot. A taskbar will appear with the plot, with buttons that allow you to rotate the plot and zoom in and out. Play around with these buttons to give the orientation that best demonstrates the patterns in the data.

Save the Olympics results project, close it and open the eggs project.

3 Descriptive Statistics for Univariate Data

Return to the eggs data for this part of the worksheet.

Go to Stat → Basic Statistics → Display Descriptive Statistics. Initially, produce descriptive statistics for all of the eggs data, by selecting “egg” as the variable and leaving the “By variables” box empty. Click Statistics and select which statistics you would like it to display. Clicking the Help button will give a definition of each of these statistics. Click OK when you have made your selection. Click Graphs and select “Histogram of data, with fit”. Do the data look normal?

Next, produce statistics for the eggs data by species, by selecting “species” for the “By variables” box.
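As a cross-check on what the “By variables” box does, the same grouping can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Minitab's implementation: `describe_by_group` is a hypothetical helper, and the measurements and species codes below are invented.

```python
import statistics
from collections import defaultdict


def describe_by_group(values, groups):
    """Return {group: (n, mean, sample standard deviation)}."""
    buckets = defaultdict(list)
    for value, group in zip(values, groups):
        buckets[group].append(value)
    return {
        group: (len(items), statistics.mean(items), statistics.stdev(items))
        for group, items in buckets.items()
    }


# Hypothetical measurements and species codes, for illustration only.
egg = [2.0, 2.2, 2.4, 1.8, 2.0, 2.2]
species = [1, 1, 1, 2, 2, 2]
for group, (n, mean, sd) in describe_by_group(egg, species).items():
    print(f"species {group}: n={n}, mean={mean:.2f}, sd={sd:.2f}")
```

Leaving the “By variables” box empty corresponds to putting every observation in a single group.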

3.1 Comparison of Means

1. Test for Value of Mean: Use Stat → Basic Statistics → 1-Sample t to determine whether the mean of the eggs data is equal to 2.10 or not. Select “egg” for the “Samples in Columns” box and put 2.10 in the “Test Mean” box. Click OK. Is the mean equal to 2.10 at the 5% significance level (95% confidence)? Now try with 2.20. To alter the significance level, click the Options button. How does this affect the results?

2. Test for Differences in the Mean: Use Stat → Basic Statistics → 2-Sample t to test whether the means of species 1 and 2 are the same. This test compares two means, and as we have three species in our data, we need to do some manipulation of the data before we can perform the test. The best way is to unstack the columns. Go to Data → Unstack Columns and select “egg” for the “Unstack data in” box and “species” for the “Using subscripts in” box. To keep all the data in the same worksheet, select the “After the last column in use” option and tick the “Name the columns” box. Click OK and look at the results in your worksheet.

Now, try to perform the 2-sample t-test. Select “Samples in different columns” and put “egg_1” in the “First” box and “egg_2” in the “Second” box. Click OK. Look at the confidence interval for the difference in the means. Are the means equal at the 5% significance level? Remember that the null hypothesis is that the means are equal, so if p < 0.05, the means are different at the 5% significance level (95% confidence). Now compare species 2 and 3.

3. Test for Equal Variances: We assume when performing the test for equal means that the variances of the two samples are equal. To test this, go to Stat → Basic Statistics → 2 Variances. Select “Samples in different columns” and put “egg_1” in the “First” box and “egg_2” in the “Second” box. Click OK. Are the variances equal at the 5% significance level?
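The test statistics behind the three tests above can be sketched in Python as follows. This is a sketch under stated assumptions rather than a reproduction of Minitab's output: the function names and samples are hypothetical, the two-sample version uses the pooled (equal-variance) form described above, and only the statistics are computed. Turning them into p-values needs the t and F distributions, which Minitab reports directly and which Python's standard library does not provide.

```python
import math
import statistics


def one_sample_t(values, mu0):
    """t statistic for H0: population mean equals mu0 (df = n - 1)."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return (statistics.mean(values) - mu0) / standard_error


def pooled_two_sample_t(a, b):
    """t statistic for H0: equal means, equal variances assumed (df = na + nb - 2)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    standard_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


def variance_ratio(a, b):
    """F statistic for H0: equal variances (df = na - 1 and nb - 1)."""
    return statistics.variance(a) / statistics.variance(b)


# Hypothetical samples, not the worksheet's data.
first = [1, 2, 3, 4, 5]
second = [2, 4, 6, 8, 10]
print(one_sample_t(first, 3))
print(pooled_two_sample_t(first, second))
print(variance_ratio(first, second))
```

A t statistic far from zero, or a variance ratio far from one, is evidence against the corresponding null hypothesis; the p-value in the Minitab session window quantifies how far is far enough.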

4 Fitting Statistical Models to Data

Here, we consider tests for normality, simple regression and (as an extension) fitting data to other statistical models.

4.1 Tests for Normality

Go to Stat → Basic Statistics → Normality Test. Select “egg” as the variable, and perform an Anderson-Darling test on the data. If the data are normal, they should follow the blue line. Do they look like they follow it? Consider the p-value (remember that the null hypothesis is that the data do follow a normal distribution, so high p-values suggest that there is no evidence that the data are not normally distributed). Does this suggest that the data are normal?
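For intuition about what the Anderson-Darling test measures, the statistic can be sketched in Python as below. This is a minimal illustration, not Minitab's implementation: `anderson_darling_normal` is a hypothetical helper, the two samples are built artificially from quantiles so that the output is reproducible, and no p-value is computed.

```python
import math
from statistics import NormalDist, mean, stdev


def anderson_darling_normal(values):
    """Anderson-Darling A^2 statistic against a normal fit (mean and sd estimated)."""
    data = sorted(values)
    n = len(data)
    fitted = NormalDist(mean(data), stdev(data))
    cdf = [fitted.cdf(x) for x in data]
    total = 0.0
    for i in range(n):
        # Pair the i-th smallest point with the i-th largest.
        total += (2 * i + 1) * (math.log(cdf[i]) + math.log(1 - cdf[n - 1 - i]))
    return -n - total / n


# Hypothetical samples built from quantiles, not the worksheet's egg data.
n = 50
probs = [(i + 0.5) / n for i in range(n)]
normal_like = [NormalDist().inv_cdf(p) for p in probs]
skewed = [-math.log(1 - p) for p in probs]
print(anderson_darling_normal(normal_like))  # small: points sit close to the fitted line
print(anderson_darling_normal(skewed))       # larger: clear departure from normality
```

A larger statistic corresponds to a smaller p-value, so the skewed sample would be the one for which normality is rejected.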

Save the eggs project.

4.2 Simple Regression Analysis: Olympic Gold Medals

Open the Olympic results project.

We wish to determine regression equations for the winning lengths for the three events: long jump, high jump and discus. Consider initially the scatterplots produced at the start of the worksheet. Do these suggest that there are straight line relationships between the year and the length?

Assuming a straight line as a first approximation, perform simple regression analysis to find the regression equations for these events. Use Stat → Regression → Regression. Initially, select “Long” as the response and “Year” as the predictor. Click Graphs to choose which residual plots to display. Select “Four in One” to show all four on one graph. Click OK. Click Results to choose which results to display. Click OK. Click OK again to perform the analysis.

Interpret these results using the work that we have done in the lecture. Perform a similar analysis for the high jump and discus results. Do you think that this is a good model for these data?

Finally, using Minitab, predict the results in the long jump, high jump and discus for 2000 and 2004. To do this, use Stat → Regression → Regression as before and then click on Options. In the “Prediction Intervals for New Observations” box, put in 100 to find the predicted value for the year 2000, and 104 for the predicted value for the year 2004. (To input two or more values at a time, you will need to put these values in a column in the worksheet.)

Ask a demonstrator what the actual results were in 2000 and 2004. Do they fall within your prediction intervals?
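The fitted line and the point predictions can be sketched by hand in Python as follows. This is an illustration only: `fit_line` is a hypothetical helper, the data are invented, and the years are assumed to be coded as years since 1900 (which is what entering 100 for the year 2000 suggests). The sketch gives point predictions only, not the prediction intervals that Minitab reports.

```python
def fit_line(x, y):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical data: years coded as years since 1900, distances invented.
year = [0, 20, 40, 60, 80]
long_jump = [7.0, 7.4, 7.8, 8.2, 8.6]
intercept, slope = fit_line(year, long_jump)
for new_year in (100, 104):
    print(new_year, round(intercept + slope * new_year, 2))
```

Both new values lie beyond the range of the observed years, so the predictions are extrapolations and rely on the straight-line relationship continuing to hold.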

5 ANOVA

For the ANOVA tests, we will focus only on One-Way ANOVA. The Minitab Getting Started guide (page 23: Compare Two or More Means) has an excellent demonstration of this on the Shipping Centers Data (all available from Blackboard).
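For reference, the F statistic that a one-way ANOVA reports can be sketched in Python as below. This is a minimal illustration with a hypothetical helper, `one_way_anova_f`, and invented samples; it does not use the Shipping Centers data and does not compute the p-value.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical samples, not the Shipping Centers data.
print(one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]]))  # (3.0, 2, 6)
```

A large F means that the group means differ by more than the within-group scatter would explain; Minitab converts F and the two degrees of freedom into the p-value shown in the session window.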