
MTH6134 / MTH6134P: Statistical Modelling II Main Examination period 2019

Published: 2024-01-02



Question 1. [29 marks] The root yield of sugar beet, in tons per acre, was recorded for plots fertilised with six levels of nitrogen. Below are the data.

Nitrogen	Yield					Total
1	31.3	33.4	29.2	32.2	33.9	160.0
2	38.8	37.5	37.4	35.8	38.4	187.9
3	40.9	39.2	39.5	38.6	39.8	198.0
4	40.9	41.7	39.4	40.1	40.0	202.1
5	39.7	40.6	39.2	38.7	41.9	200.1
6	40.6	41.0	41.5	41.1	39.8	204.0

The sum of squares of the observations is $\sum_{i=1}^{6} \sum_{j=1}^{5} y_{ij}^2 = 44{,}555.61$.

(a) Write down a suitable model for these data and any necessary assumptions, explaining your notation. [5]

(b) Compute the analysis of variance table and test factor nitrogen. [10]

(d) Describe how the model and analysis would have changed if the levels of nitrogen had been selected at random from a large number of possible levels available. [4]
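
The arithmetic behind the analysis of variance table in part (b) can be sketched as follows. This is a minimal illustration, assuming the usual one-way fixed-effects decomposition (total = treatment + residual) with the correction factor G²/n; the function name `one_way_anova` and the dictionary keys are made up for this sketch. The F ratio it returns would still need to be compared with the tabulated F distribution on 5 and 24 degrees of freedom.

```python
# Sugar beet root yields (tons per acre), one list per nitrogen level,
# copied from the table in Question 1.
yields = [
    [31.3, 33.4, 29.2, 32.2, 33.9],
    [38.8, 37.5, 37.4, 35.8, 38.4],
    [40.9, 39.2, 39.5, 38.6, 39.8],
    [40.9, 41.7, 39.4, 40.1, 40.0],
    [39.7, 40.6, 39.2, 38.7, 41.9],
    [40.6, 41.0, 41.5, 41.1, 39.8],
]


def one_way_anova(groups):
    """Sums of squares, degrees of freedom and F ratio for a one-way layout.

    Hypothetical helper for illustration; assumes every group is non-empty.
    """
    n = sum(len(g) for g in groups)
    grand_total = sum(sum(g) for g in groups)
    correction = grand_total ** 2 / n  # correction factor G^2 / n
    ss_total = sum(y * y for g in groups for y in g) - correction
    ss_treat = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    ss_resid = ss_total - ss_treat
    df_treat = len(groups) - 1
    df_resid = n - len(groups)
    f_ratio = (ss_treat / df_treat) / (ss_resid / df_resid)
    return {
        "ss_treat": ss_treat,
        "ss_resid": ss_resid,
        "ss_total": ss_total,
        "df_treat": df_treat,
        "df_resid": df_resid,
        "f_ratio": f_ratio,
    }


table = one_way_anova(yields)
for name, value in table.items():
    print(f"{name:>9}: {value:.4f}")
```

The raw sum of squares computed inside the function agrees with the value 44,555.61 quoted in the question, which is a useful check that the data were transcribed correctly.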

Question 2. [23 marks] An experiment was designed to study the performance of four different detergents in cleaning clothes. The cleanness readings were obtained with specially designed equipment for three different types of common stains. The data are given below, where a higher reading indicates that the clothes are cleaner.

Detergent

Stain	1	2	3	4
1	45	47	48	42
2	43	46	50	37
3	51	52	55	49

The model to be used for these data is

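$y_{ij} = \mu + \alpha_i + \beta_j + \varepsilon_{ij}$

for $i = 1, 2, 3, 4$ and $j = 1, 2, 3$, where $\alpha_i$ is the effect of the $i$th detergent and $\beta_j$ is the effect of the $j$th stain, subject to the constraints $\sum_{i=1}^{4} \alpha_i = 0$ and $\sum_{j=1}^{3} \beta_j = 0$, and it is assumed that

$\varepsilon_{ij} \sim N(0, \sigma^2)$, all independent.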

(a) Derive expressions for the least squares estimates of $\mu$, $\alpha_i$ and $\beta_j$. [8]

(b) Derive expressions for the fitted values and residuals. [3]

(d) Using the above constraints, write down in matrix form the multiple regression model that is equivalent to the analysis of variance model for the data. [8]
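
As a numerical companion to parts (a) and (b), the sketch below evaluates the standard least squares estimates for a balanced additive two-way layout under sum-to-zero constraints: the grand mean for µ, detergent mean minus grand mean for each αi, and stain mean minus grand mean for each βj. The variable names are chosen for this illustration only, and the code shows the result of the derivation rather than the derivation itself.

```python
# Cleanness readings from Question 2.
# readings[j][i] is the reading for stain j+1 and detergent i+1.
readings = [
    [45, 47, 48, 42],
    [43, 46, 50, 37],
    [51, 52, 55, 49],
]

n_stains = len(readings)
n_detergents = len(readings[0])

grand_mean = sum(sum(row) for row in readings) / (n_stains * n_detergents)
detergent_means = [
    sum(row[i] for row in readings) / n_stains for i in range(n_detergents)
]
stain_means = [sum(row) / n_detergents for row in readings]

# Least squares estimates under the sum-to-zero constraints.
mu_hat = grand_mean
alpha_hat = [m - grand_mean for m in detergent_means]
beta_hat = [m - grand_mean for m in stain_means]

# Fitted values and residuals, laid out like the data table.
fitted = [
    [mu_hat + alpha_hat[i] + beta_hat[j] for i in range(n_detergents)]
    for j in range(n_stains)
]
residuals = [
    [readings[j][i] - fitted[j][i] for i in range(n_detergents)]
    for j in range(n_stains)
]

print("mu_hat   :", round(mu_hat, 4))
print("alpha_hat:", [round(a, 4) for a in alpha_hat])
print("beta_hat :", [round(b, 4) for b in beta_hat])
```

The residuals sum to zero across every row and every column of the table, which follows from the constraints and is a quick way to check a hand calculation.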

Question 3. [15 marks] A bacteriologist is interested in the effect of two different culture mediums at two different times on the growth of a particular virus. She performs six replicates of a factorial design, making the runs in random order. Below are the results, in plaque-forming units.

	Culture Medium
Time	1		2
12	21	22	25	26
	23	28	24	25
	20	26	29	27
18	37	39	31	34
	38	38	29	33
	35	36	30	35

An analysis of variance was performed using GenStat and the output is as follows:

Analysis of variance
Variate: growth
Source of variation	d.f.	s.s.	m.s.	v.r.	F pr.
medium	1	9.375	9.375	1.84	0.191
time	1	590.042	590.042	115.51	<.001
medium.time	1	92.042	92.042	18.02	<.001
Residual	20	102.167	5.108
Total	23	793.625

(a) Briefly explain how you would enter these data into GenStat. What expression should be entered in Treatment Structure in the analysis of variance Dialogue Box? [4]

(b) Draw conclusions from the above output, illustrating your answer by calculating and commenting on the treatment means. [5]

(d) Explain why the runs are made in random order. [2]
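
The treatment means asked for in part (b) can be computed with a short script such as the one below. It assumes the data table is read as two time levels (12 and 18) crossed with two culture media, six observations per cell; that reading is an interpretation of the printed layout, and it is supported by the fact that the sums of squares recomputed here match the GenStat output. All variable names are illustrative.

```python
# Growth data from Question 3, keyed by (time, medium).
# Assumption: rows of the printed table belong to time levels 12 and 18.
growth = {
    (12, 1): [21, 22, 23, 28, 20, 26],
    (12, 2): [25, 26, 24, 25, 29, 27],
    (18, 1): [37, 39, 38, 38, 35, 36],
    (18, 2): [31, 34, 29, 33, 30, 35],
}
times = [12, 18]
media = [1, 2]
reps = 6
n = reps * len(times) * len(media)

# Treatment (cell) means, used to interpret the interaction.
cell_means = {key: sum(values) / len(values) for key, values in growth.items()}

# Marginal totals for each factor.
time_totals = {t: sum(sum(growth[(t, m)]) for m in media) for t in times}
medium_totals = {m: sum(sum(growth[(t, m)]) for t in times) for m in media}
grand_total = sum(time_totals.values())
correction = grand_total ** 2 / n

# Sums of squares for the two-factor design with interaction.
ss_time = sum(v ** 2 for v in time_totals.values()) / (reps * len(media)) - correction
ss_medium = sum(v ** 2 for v in medium_totals.values()) / (reps * len(times)) - correction
ss_cells = sum(sum(v) ** 2 for v in growth.values()) / reps - correction
ss_inter = ss_cells - ss_time - ss_medium
ss_total = sum(y * y for v in growth.values() for y in v) - correction
ss_resid = ss_total - ss_cells

for (t, m), mean in sorted(cell_means.items()):
    print(f"time {t}, medium {m}: mean = {mean:.2f}")
print(f"SS medium = {ss_medium:.3f}, SS time = {ss_time:.3f}")
print(f"SS medium.time = {ss_inter:.3f}, SS residual = {ss_resid:.3f}")
```

Comparing the four cell means shows how the difference between media changes from one time to the other, which is what the significant medium.time term in the output reflects.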

Question 4. [22 marks] Three regional health authorities were chosen at random to participate in a health awareness programme. Within each authority, three cities were randomly selected for participation. To evaluate the effectiveness of the programme, five households within each city were randomly selected. All members of the selected households were interviewed before and after participation, and a composite index was formed for each household in order to measure the impact of the programme. The data are given below, where the larger the index, the greater the awareness.

Authority	1			2			3
City	1	2	3	1	2	3	1	2	3
42	26	34	47	56	68	19	18	16
56	38	51	58	43	51	36	40	28
35	42	60	39	65	49	24	27	45
40	35	29	62	70	71	12	31	30
28	53	44	65	59	57	33	23	21

The sums of squares of the observations and the treatment totals are $\sum_{i=1}^{3} \sum_{j=1}^{3} \sum_{k=1}^{5} y_{ijk}^2 = 89{,}246$ and $\sum_{i=1}^{3} \sum_{j=1}^{3} T_{ij}^2 = 426{,}764$.

(a) Write down a suitable model for these data and any necessary assumptions, explaining your notation. [5]

(b) Compute the analysis of variance table, and test factors authority and city. [13]
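
The sums of squares needed for part (b) can be sketched as below, assuming the usual two-stage nested decomposition with cities nested within authorities: authorities, cities within authorities, and households (residual). Because cities are a random factor nested in authorities, the sketch tests authority against the cities-within-authorities mean square and cities against the residual mean square. Variable names are illustrative only.

```python
# Awareness index from Question 4: index[i][j] holds the five households
# sampled in city j+1 of authority i+1.
index = [
    [[42, 56, 35, 40, 28], [26, 38, 42, 35, 53], [34, 51, 60, 29, 44]],
    [[47, 58, 39, 62, 65], [56, 43, 65, 70, 59], [68, 51, 49, 71, 57]],
    [[19, 36, 24, 12, 33], [18, 40, 27, 31, 23], [16, 28, 45, 30, 21]],
]
a = len(index)        # number of authorities
b = len(index[0])     # cities per authority
r = len(index[0][0])  # households per city

city_totals = [[sum(city) for city in authority] for authority in index]
authority_totals = [sum(totals) for totals in city_totals]
grand_total = sum(authority_totals)
correction = grand_total ** 2 / (a * b * r)

sum_sq_obs = sum(y * y for authority in index for city in authority for y in city)
sum_sq_city_totals = sum(t * t for totals in city_totals for t in totals)
sum_sq_auth_totals = sum(t * t for t in authority_totals)

ss_authority = sum_sq_auth_totals / (b * r) - correction
ss_city = sum_sq_city_totals / r - sum_sq_auth_totals / (b * r)
ss_resid = sum_sq_obs - sum_sq_city_totals / r

df_authority, df_city, df_resid = a - 1, a * (b - 1), a * b * (r - 1)
ms_authority = ss_authority / df_authority
ms_city = ss_city / df_city
ms_resid = ss_resid / df_resid

# Nested design: authority is tested against cities within authorities.
f_authority = ms_authority / ms_city
f_city = ms_city / ms_resid

print(f"authority: SS = {ss_authority:.2f}, df = {df_authority}, F = {f_authority:.2f}")
print(f"city(auth): SS = {ss_city:.2f}, df = {df_city}, F = {f_city:.3f}")
print(f"residual : SS = {ss_resid:.2f}, df = {df_resid}")
```

The two raw sums computed from the data agree with the values 89,246 and 426,764 quoted in the question.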

Question 5. [11 marks] Consider a completely randomised design with two treatments. Let the data vector be $\mathbf{y} = (y_{11}, y_{12}, y_{13}, y_{21}, y_{22}, y_{23})$.

(a) Define the treatment subspace $V_T$ and the null subspace $V_0$. [3]

(b) Compute the projections $P_{V_T}\mathbf{y}$ and $P_{V_0}\mathbf{y}$. [4]

(d) In the usual analysis of variance notation, what are the formulae for the quantities that you defined in part (c)? [2]
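
To make the projections in part (b) concrete, here is a small sketch. It assumes the usual definitions: the null subspace is spanned by the all-ones vector, so projecting onto it replaces every entry by the grand mean, and the treatment subspace is spanned by the treatment indicator vectors, so projecting onto it replaces every entry by its treatment mean. The function names and the numbers in `y` are hypothetical and serve only as an illustration.

```python
def project_onto_null(y):
    """Projection onto V_0: every coordinate becomes the grand mean."""
    grand_mean = sum(y) / len(y)
    return [grand_mean] * len(y)


def project_onto_treatment(y, labels):
    """Projection onto V_T: every coordinate becomes its treatment mean."""
    totals, counts = {}, {}
    for value, label in zip(y, labels):
        totals[label] = totals.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return [totals[label] / counts[label] for label in labels]


# Hypothetical observations, ordered as (y11, y12, y13, y21, y22, y23).
labels = [1, 1, 1, 2, 2, 2]
y = [4.0, 5.0, 6.0, 8.0, 9.0, 13.0]

p_t = project_onto_treatment(y, labels)
p_0 = project_onto_null(y)

# The residual y - P_{V_T} y is orthogonal to the treatment subspace.
residual = [yi - fi for yi, fi in zip(y, p_t)]
inner = sum(ri * fi for ri, fi in zip(residual, p_t))

print("P_VT y:", p_t)
print("P_V0 y:", p_0)
print("<y - P_VT y, P_VT y> =", inner)
```

The squared length of the difference between the two projections is the treatment sum of squares for these illustrative numbers, and the squared length of the residual is the residual sum of squares.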
