STAT3102/6102: Graphics, Multivariate Methods and Data Mining


SGTA Exercises

Week 11

Question 1

These data, stored in climate.sav, concern the climate at particular places in NSW. Each variable is a long-term average of a particular measurement, as follows:

janmax      Average Daily Maximum Temperature in January (°C)

julmax      Average Daily Maximum Temperature in July (°C)

janmin      Average Daily Minimum Temperature in January (°C)

julmin      Average Daily Minimum Temperature in July (°C)

jan9hum     Average 9am Relative Humidity in January (%)

jul9hum     Average 9am Relative Humidity in July (%)

jan3hum     Average 3pm Relative Humidity in January (%)

jul3hum     Average 3pm Relative Humidity in July (%)

janrain     Average Rainfall in January (mm)

julrain     Average Rainfall in July (mm)

janclear    Average number of clear days in January

julclear    Average number of clear days in July

Source: Bureau of Meteorology, http://www.bom.gov.au/climate

Perform a Principal Components Analysis using SPSS. Extract all SPSS outputs as required and provide your comments.
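When commenting on the Total Variance Explained output, it can help to see how the percentage columns follow from the eigenvalues. The sketch below is only an illustration in Python, not part of the SPSS workflow: the eigenvalues are made-up placeholders (not results from climate.sav), and the function names variance_table and kaiser_count are hypothetical. It assumes the analysis is run on the correlation matrix, so the 12 eigenvalues sum to 12, and it applies the eigenvalue-greater-than-1 rule as one possible retention criterion.

```python
# Hypothetical eigenvalues for illustration only; NOT results from climate.sav.
# For 12 standardised variables, correlation-matrix eigenvalues sum to 12.
eigenvalues = [5.4, 3.1, 1.6, 0.8, 0.4, 0.3, 0.2, 0.1, 0.05, 0.03, 0.01, 0.01]


def variance_table(eigs):
    """Return rows of (component, eigenvalue, % of variance, cumulative %)."""
    total = sum(eigs)
    rows = []
    cumulative = 0.0
    for i, ev in enumerate(eigs, start=1):
        pct = 100.0 * ev / total
        cumulative += pct
        rows.append((i, ev, pct, cumulative))
    return rows


def kaiser_count(eigs):
    """Count components whose eigenvalue exceeds 1 (Kaiser criterion)."""
    return sum(1 for ev in eigs if ev > 1.0)


table = variance_table(eigenvalues)
for comp, ev, pct, cum in table:
    print(f"{comp:>2}  {ev:6.2f}  {pct:6.2f}  {cum:7.2f}")
print("Components with eigenvalue > 1:", kaiser_count(eigenvalues))
```

Replacing the placeholder list with the eigenvalues from your own SPSS output should reproduce the "% of Variance" and "Cumulative %" columns, up to rounding.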

Question 2

This question uses the data set iris.sav, which can be found under Week 5/Data on iLearn.

The data set contains measurements of four variables (sepal length and width, petal length and width) for three species of Iris. The species are coded as follows: ‘setosa’ as 1, ‘versicolour’ as 2 and ‘virginica’ as 3.

Perform a discriminant analysis, with species as the grouping variable and the remaining four available variables as independents.

To do this, select Analyze > Classify > Discriminant…. Enter species as the Grouping Variable: and click on Define Range…; enter 1 for Minimum: and 3 for Maximum:, then click on Continue. Now enter the four remaining variables as the Independents:.

Click on Statistics…. Under Descriptives choose Means and Univariate ANOVAs. Under Function Coefficients choose Unstandardised. Click on Continue.

Click on Classify…. For Prior Probabilities select All groups equal. Also select the Territorial Map. Under Display select Casewise results and Summary table. Click on Continue. Click on OK.

(a) From the univariate ANOVAs, which variables have significantly different means across the three species? Discuss the normality of each variable individually.

(b) Write down the canonical discriminant function(s), using the unstandardised coefficients.

(c) Compute the value of the function for Case 1, and confirm that it approximately matches the value in the output.

(d) Indicate where Case 1 would fall on the territorial map.
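
For part (c), an unstandardised discriminant score is the constant plus the sum of each coefficient multiplied by the corresponding measurement. The Python sketch below shows only that arithmetic. The constant, coefficients, case values and variable names are all hypothetical placeholders, not values from iris.sav or from SPSS output, and the helper discriminant_score is an illustrative name rather than anything SPSS provides.

```python
def discriminant_score(constant, coefficients, case):
    """Unstandardised discriminant score: constant + sum of b_j * x_j."""
    if coefficients.keys() != case.keys():
        raise ValueError("coefficients and case must cover the same variables")
    return constant + sum(coefficients[name] * case[name] for name in coefficients)


# Placeholder values for illustration only (assumed, not taken from SPSS output).
constant = -2.0
coefficients = {
    "sepal_length": 0.5,
    "sepal_width": 1.0,
    "petal_length": -1.5,
    "petal_width": -2.0,
}
# Placeholder measurements for one case (assumed, not read from iris.sav).
case_1 = {
    "sepal_length": 5.1,
    "sepal_width": 3.5,
    "petal_length": 1.4,
    "petal_width": 0.2,
}

score = discriminant_score(constant, coefficients, case_1)
print(f"Discriminant score: {score:.3f}")
```

Substituting the constant and coefficients from the Canonical Discriminant Function Coefficients table, along with the Case 1 measurements, should give a score close to the one in the Casewise Statistics table. If the output reports two functions, repeat the calculation with each function's coefficients; the resulting pair of scores gives the coordinates for part (d).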