STA304
Chapter 6: Ratio Estimation
Fall 2021
Outline
1 Introduction
2 Ratio Estimation
Estimation of R
Estimation of τy
Estimation of µy
Sample Size Determination
3 Regression (Linear) Estimation
4 Difference Estimation
5 Comparison between Estimators
Introduction
Ratio Estimation
• The estimators of µ, τ and p were based on an SRS of responses,
y1, y2, . . . , yn, selected from the population (Chapter 4) or on L SRSs
selected from L strata (Chapter 5).
• The emphasis in Chapter 4 and Chapter 5 was placed on sample
selection (design of the sample survey).
• In Chapter 6, a new estimation technique is presented which, under
certain conditions, yields “better” estimates (i.e., estimates with
smaller variances).
• The new estimation technique is called ratio estimation.
• The ratio estimator makes use of auxiliary (subsidiary or ancillary)
information to improve estimation of the population parameters.
• An ancillary variate xi, correlated with yi, is obtained for each
element in the sample.
• We must also be able to obtain the population total for x, τx.
Motivating Example
• The wholesale price paid for oranges in large shipments is based on
the sugar content of the load.
• The exact sugar content cannot be determined prior to the purchase
and extraction of the juice from the entire load.
• It can be estimated, but how?
• Estimate the mean sugar content per orange, µy, and then multiply
by the number of oranges N in the load.
• Thus, we could randomly sample n oranges from the load to determine
the sugar content y for each.
• Take a sample: y1, y2, . . . , yn.
• An estimate of the total sugar content for the load is \(\hat{\tau}_y = N\bar{y}\).
• But how to determine N?
• Count the total number of oranges in the load.
• This method is not feasible because it is too time-consuming and
costly.
• We can avoid the need to know N by noting the following two facts.
1 The sugar content of an individual orange, y, is closely related to its
weight x.
2 The ratio of the total sugar content τy to the total weight of the
truckload τx is equal to the ratio of the mean sugar content per
orange, µy, to the mean weight µx.
\[ \frac{\mu_y}{\mu_x} = \frac{N\mu_y}{N\mu_x} = \frac{\tau_y}{\tau_x}. \]
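• Solving for τy, we have
\[ \tau_y = \frac{\mu_y}{\mu_x}\,\tau_x. \]
• We can replace µy and µx by \(\bar{y}\) (the average of the sugar contents
in the sample) and \(\bar{x}\) (the average of the weights in the sample).
• We get
\[ \hat{\tau}_y = \frac{\bar{y}}{\bar{x}}\,\tau_x. \]
• Note that
\[ \hat{\tau}_y = \frac{\bar{y}}{\bar{x}}\,\tau_x = \frac{n\bar{y}}{n\bar{x}}\,\tau_x = \frac{\sum_{i=1}^{n} y_i}{\sum_{i=1}^{n} x_i}\,\tau_x. \]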
Another Example
• If y is the expenditure on textbooks by a college student then x
could be the number of courses the student is taking.
Ratio Estimation
• For each member of the population, two variables are measured:
xi and yi.
• Ratio estimation is used when the relationship between y and x
is linear and the line passes through the origin (y = 0, x = 0).
Ratio Estimation
• There are two situations in which a researcher may want to use a
ratio estimator:
1 To estimate the ratio of two population characteristics. The most
common case is the population ratio R of means or totals:
\[ R = \frac{\tau_y}{\tau_x} = \frac{\mu_y}{\mu_x}. \]
2 To use the relationship between x and y to improve estimation of
µy or τy.
• Examples:
• If y is the total income earned by all adults in the household and x
is the total number of adults in the household, then R is the average
income per adult in a household.
• If y is weekly food expenditure and x is number of inhabitants, then
R is weekly food cost per inhabitant.
• If y is the number of motor vehicles and x is the number of inhabitants
of driving age, then R is the number of motor vehicles per inhabitant
of driving age.
Ratio Estimation Estimation of R
Ratio Estimation of the Population Ratio R
In this sampling plan we take a simple random sample of size n from a
population of size N and measure both yi and xi.
• Let
\[ R = \frac{\tau_y}{\tau_x} = \frac{\mu_y}{\mu_x} \]
be the population ratio.
• A sample-based estimator of R is given by
\[ r = \frac{\sum_{i=1}^{n} y_i}{\sum_{i=1}^{n} x_i} = \frac{\bar{y}}{\bar{x}}. \]
Is r an unbiased estimator of R?
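• r is not an unbiased estimator of R, but it is approximately unbiased
(for large sample sizes) since
\[ E(\hat{R}) = E(r) = E\!\left(\frac{\bar{y}}{\bar{x}}\right) \approx E\!\left(\frac{\bar{y}}{\mu_x}\right) = \frac{1}{\mu_x}\,E(\bar{y}) = \frac{\mu_y}{\mu_x} = R. \]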
• Here the population is
\[ (u_1, v_1), \ldots, (u_N, v_N), \]
with
\[ \mu_x = \frac{1}{N}\sum_{i=1}^{N} u_i \quad\text{and}\quad \mu_y = \frac{1}{N}\sum_{i=1}^{N} v_i. \]
Question
1 Show that
\[ \text{bias} = E(\hat{R} - R) = E(r - R) = -\frac{\operatorname{cov}(r, \bar{x})}{\mu_x}. \]
2 Show that
\[ \frac{|E(r - R)|}{\sigma_r} \le \frac{\sigma_{\bar{x}}}{|\mu_x|}. \]
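Before attempting the proofs, both results in the question above can be checked numerically. The sketch below is not a proof. It assumes a tiny made-up population of N = 6 pairs (the values are hypothetical) and enumerates every possible SRS of size n = 3, so that all expectations are exact averages over the 20 equally likely samples.

```python
from itertools import combinations
from statistics import mean, pstdev

# Hypothetical toy population of N = 6 (x, y) pairs, made up for illustration.
population = [(2, 5), (3, 7), (4, 8), (5, 12), (6, 13), (8, 15)]
n = 3  # sample size

mu_x = mean(x for x, _ in population)
mu_y = mean(y for _, y in population)
R = mu_y / mu_x

# Under SRS without replacement every subset of size n is equally likely,
# so expectations are plain averages over all subsets.
r_vals, xbar_vals = [], []
for sample in combinations(population, n):
    xbar = mean(x for x, _ in sample)
    ybar = mean(y for _, y in sample)
    r_vals.append(ybar / xbar)
    xbar_vals.append(xbar)

E_r = mean(r_vals)
E_xbar = mean(xbar_vals)
cov_r_xbar = mean(r * xb for r, xb in zip(r_vals, xbar_vals)) - E_r * E_xbar
sigma_r = pstdev(r_vals)        # standard deviation of r over all samples
sigma_xbar = pstdev(xbar_vals)  # standard deviation of xbar over all samples

bias = E_r - R
print(f"R = {R:.4f}, E(r) = {E_r:.4f}, bias = {bias:.6f}")
print(f"-cov(r, xbar)/mu_x = {-cov_r_xbar / mu_x:.6f}")
print(f"|bias|/sigma_r = {abs(bias) / sigma_r:.4f}, "
      f"sigma_xbar/|mu_x| = {sigma_xbar / abs(mu_x):.4f}")
```

The first two printed bias values should agree up to floating-point rounding, and the last line should show the left-hand ratio no larger than the right-hand one.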
• Estimated variance of r:
\[ \hat{V}(r) = \left(1 - \frac{n}{N}\right)\frac{1}{\mu_x^2}\,\frac{s_r^2}{n}, \]
where
\[ s_r^2 = \frac{\sum_{i=1}^{n}(y_i - r x_i)^2}{n-1}. \]
• If µx is unknown, we estimate it by \(\bar{x}\).
• Bound on the error of estimation:
\[ B = 2\sqrt{\hat{V}(r)}. \]
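The estimate r, its estimated variance, and the bound B can be computed directly from paired sample data. The following is a minimal sketch. The function name `ratio_estimate` and the five data pairs are hypothetical, chosen only to illustrate the formulas. When `mu_x` is not supplied, the sample mean of x is used in its place, as described above.

```python
import math

def ratio_estimate(y, x, N, mu_x=None):
    """Return (r, estimated variance of r, bound B) for an SRS of (x, y) pairs."""
    n = len(y)
    if n != len(x) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    r = sum(y) / sum(x)
    # s_r^2: squared deviations of y from the line y = r x, divided by n - 1
    s2_r = sum((yi - r * xi) ** 2 for xi, yi in zip(x, y)) / (n - 1)
    # Fall back to the sample mean when the population mean of x is unknown
    denom = mu_x if mu_x is not None else sum(x) / n
    var_r = (1 - n / N) * s2_r / (n * denom ** 2)
    bound = 2 * math.sqrt(var_r)
    return r, var_r, bound

# Hypothetical sample (made-up numbers) from a population of N = 200
x = [10, 12, 15, 20, 25]
y = [21, 23, 31, 39, 52]
r, var_r, B = ratio_estimate(y, x, N=200)
print(f"r = {r:.4f}, V(r) = {var_r:.6f}, B = {B:.4f}")
```

If the sample points lie exactly on a line through the origin, every residual is zero and the estimated variance is zero, which matches the intuition that ratio estimation works best in that setting.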
Other Forms of Vˆ (r)
• The estimated variance of r can be written in many forms.
• One that is particularly useful involves the correlation
coefficient ρ between x and y.
• This correlation ρ can be estimated by
\[ \hat{\rho} = \frac{s_{xy}}{s_x s_y}, \]
where
\[ s_{xy} = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}), \]
\[ s_x^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2, \]
\[ s_y^2 = \frac{1}{n-1}\sum_{i=1}^{n}(y_i - \bar{y})^2. \]
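• Thus,
\[ \hat{V}(r) = \frac{1-f}{n}\,\frac{1}{\mu_x^2}\left(s_y^2 + r^2 s_x^2 - 2r\hat{\rho}\,s_x s_y\right), \]
where f = n/N.
• If µx is replaced by \(\bar{x}\), then
\[ \hat{V}(r) = \frac{1-f}{n}\,r^2\left(\frac{s_y^2}{\bar{y}^2} + \frac{s_x^2}{\bar{x}^2} - 2\hat{\rho}\,\frac{s_x s_y}{\bar{x}\,\bar{y}}\right). \]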
Question
Show that
\[ s_r^2 = \frac{\sum_{i=1}^{n}(y_i - r x_i)^2}{n-1} = s_y^2 + r^2 s_x^2 - 2r\hat{\rho}\,s_x s_y. \]
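One possible route to this identity, offered as a hint rather than a complete solution, starts from the fact that \(\bar{y} - r\bar{x} = 0\) by the definition of r, so each residual can be rewritten in terms of deviations from the sample means:

```latex
\begin{align*}
\sum_{i=1}^{n}(y_i - r x_i)^2
  &= \sum_{i=1}^{n}\bigl[(y_i - \bar{y}) - r(x_i - \bar{x})\bigr]^2
     \qquad \text{(since } \bar{y} - r\bar{x} = 0\text{)} \\
  &= \sum_{i=1}^{n}(y_i - \bar{y})^2
     + r^2\sum_{i=1}^{n}(x_i - \bar{x})^2
     - 2r\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}).
\end{align*}
% Dividing by n - 1 and using s_{xy} = \hat{\rho}\, s_x s_y:
\[
s_r^2 = s_y^2 + r^2 s_x^2 - 2r\,s_{xy}
      = s_y^2 + r^2 s_x^2 - 2r\hat{\rho}\,s_x s_y .
\]
```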
Example
Suppose that 100 people who recently bought houses are surveyed, and
the monthly mortgage payment and gross income of each buyer are
determined. Let y denote the mortgage payment and x the gross income.
Suppose that
x̄ = $3100    ȳ = $868
sx = $1200   sy = $250
ρ̂ = 0.85     n = 100
(a) Estimate the ratio of the mortgage payment to the gross income
and place a bound on the error of estimation.
(b) Find a 95% confidence interval for the ratio of the mortgage payment
to the gross income.
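The example can be worked from the summary statistics alone, using the correlation form of the estimated variance. The sketch below rests on two assumptions that the example does not state: the population size N is not given, so the finite population correction 1 − n/N is taken to be 1, and µx is unknown, so it is replaced by x̄. The interval in part (b) is taken as r ± B, which treats two standard errors as an approximate 95% interval.

```python
import math

# Summary statistics given in the example
n = 100
xbar, ybar = 3100.0, 868.0
s_x, s_y = 1200.0, 250.0
rho_hat = 0.85

# (a) Point estimate of the ratio and bound on the error of estimation
r = ybar / xbar
s2_r = s_y**2 + r**2 * s_x**2 - 2 * r * rho_hat * s_x * s_y
# Assumption: N is not given, so the factor (1 - n/N) is taken as 1;
# mu_x is unknown and replaced by xbar.
var_r = s2_r / (n * xbar**2)
B = 2 * math.sqrt(var_r)

# (b) Approximate 95% confidence interval, r +/- B
ci = (r - B, r + B)

print(f"r = {r:.4f}, s_r^2 = {s2_r:.1f}, B = {B:.4f}")
print(f"approximate 95% CI: ({ci[0]:.4f}, {ci[1]:.4f})")
```

Under these assumptions r = 0.28, so the mortgage payment is estimated at 28% of gross income, with B about 0.0116 and an interval of roughly (0.268, 0.292).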
Exercise- Try it!
The Toyota Company wants to estimate the ratio of the number of
man-hours lost due to sickness of its employees at one of its branches.
It has N = 7000 employees and takes a sample of n = 10 employees and
obtains the following data:
Employee        1   2   3   4   5   6   7   8   9  10
Previous year  15  18  30  25  10  20  16  12  13   2
Current year   14  20  34  18  15  25  20  15  10   5
(a) Plot the data and describe the main features of the plot.
(b) Obtain an estimate of the desired ratio and set up a 95% confidence
interval for it.
Ratio Estimation Estimation of τy
Ratio Estimation of the Population Total τy
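• Recall: \( R = \dfrac{\tau_y}{\tau_x} = \dfrac{\mu_y}{\mu_x} \).
• Ratio estimator of the population total τy:
\[ \hat{\tau}_y = \frac{\sum_{i=1}^{n} y_i}{\sum_{i=1}^{n} x_i}\,\tau_x = r\tau_x. \]
• Note: We do not need to know N or µx but we must know τx.
• Estimated variance of \(\hat{\tau}_y\):
\[ \hat{V}(\hat{\tau}_y) = \tau_x^2\,\hat{V}(r) = (N\mu_x)^2\left(1 - \frac{n}{N}\right)\frac{1}{\mu_x^2}\,\frac{s_r^2}{n} = N^2\left(1 - \frac{n}{N}\right)\frac{s_r^2}{n}. \]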
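The estimator of the total and its estimated variance translate into a few lines of code. This is a minimal sketch: the function name `ratio_total_estimate` and the sample values are hypothetical. Note that the point estimate needs only τx, while the final variance form shown above also uses N.

```python
def ratio_total_estimate(y, x, tau_x, N):
    """Return (tau_hat_y, estimated variance of tau_hat_y) from an SRS of (x, y) pairs."""
    n = len(y)
    if n != len(x) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    r = sum(y) / sum(x)
    tau_hat = r * tau_x  # point estimate: N is not needed here
    # s_r^2: squared deviations of y from the line y = r x, divided by n - 1
    s2_r = sum((yi - r * xi) ** 2 for xi, yi in zip(x, y)) / (n - 1)
    # Final form of the estimated variance: N^2 (1 - n/N) s_r^2 / n
    var_tau = N**2 * (1 - n / N) * s2_r / n
    return tau_hat, var_tau

# Hypothetical sample (made-up numbers); tau_x and N are assumed known
x = [10, 12, 15, 20, 25]
y = [21, 23, 31, 39, 52]
tau_hat, var_tau = ratio_total_estimate(y, x, tau_x=3200, N=200)
print(f"tau_hat_y = {tau_hat:.1f}, estimated variance = {var_tau:.1f}")
```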