
Salty chips

Measuring sodium content

Background

In real-world applications, it is important to have high-quality measuring systems. For example, chemists find it convenient to have on hand specimens of an alloy of known composition. Such specimens make it easy for the chemist to calibrate the equipment used in the analysis of metals and make sure that it is working properly.

In food science, the amount of salt, or sodium chloride (NaCl), is an important measure in determining a healthy diet. According to the 2010 Dietary Guidelines for Americans, a generally healthy person could consume up to 2,300 milligrams of sodium per day. People suffering from hypertension, for example, will want to reduce their daily sodium intake.

Snack foods and other processed foods may have higher sodium levels than you might expect, simply to add flavour. An ounce of potato chips, of some flavours, might contain as much as 10 percent of your daily maximum for a healthy diet – generally, the more flavourful the chips, the higher the sodium content. It is important, therefore, to have good measurements of the sodium content.

A measurement system can be assessed by measuring a known amount in some sample. The known amount may have been determined by a very accurate, and usually expensive, system. If a cheaper, faster, or more convenient system is available, then it will be acceptable if it produces measurements very close to the actual values.

In this question, you will examine realistic data (i.e., not real data) from a study to develop a new and inexpensive way to measure the sodium content of snack foods such as potato chips.

Ten samples with known sodium content were prepared and the content was measured by the new procedure.

All measurements are in milligrams of sodium per ounce and are available as the R data file chips.Rda. The two variates are actual for the actual sodium content and measured for the sodium content determined using the new procedure.

directory <- "../data" # directory where you have saved "chips.Rda"

load(file.path(directory, "chips.Rda"))

26 marks

a. (3 marks) Produce a scatterplot of the data with measured as the vertical or y axis, and actual as the horizontal or x axis (use cex = 2). Make sure the axes are meaningfully labelled and give the plot a meaningful title as well.

In "red", add a dashed line (of type lty = 2) having zero intercept and unit slope.

● Show your plot.

● Comment on why the line added might be of particular interest.

● Based on this plot alone, comment on the quality of the new measurement procedure.
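
The plotting calls for part (a) can be sketched as follows. Because chips.Rda is not reproduced here, the data frame below is a simulated stand-in (hypothetical values, not the study data), and the axis labels and title are only one possible wording.

```r
# Hypothetical stand-in for chips.Rda: simulated values, NOT the study data.
set.seed(1)
chips <- data.frame(actual = seq(140, 230, by = 10))
chips$measured <- chips$actual + rnorm(nrow(chips), sd = 3)

# Scatterplot with measured on the vertical axis and actual on the horizontal axis.
plot(measured ~ actual, data = chips, cex = 2,
     xlab = "Actual sodium (mg per ounce)",
     ylab = "Measured sodium (mg per ounce)",
     main = "Measured versus actual sodium content")

# Dashed red line with intercept 0 and slope 1:
# a point lies on it exactly when measured equals actual.
abline(a = 0, b = 1, lty = 2, col = "red")
```

With the real data loaded, only the two plotting calls are needed.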

b. (3 marks) Fit the measured amount of sodium, being the response y, as a straight line model with the explanatory variate x being the actual amount.

● comment on the quality of the fitted straight line; justify your comments by making reference to appropriate statistics found in the summary of the fitted model.
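
A minimal sketch of the fit in part (b) and of where the usual summary statistics live, again on a simulated stand-in for chips.Rda (hypothetical values). The object names fit and s are illustrative choices.

```r
# Hypothetical stand-in for chips.Rda: simulated values, NOT the study data.
set.seed(1)
chips <- data.frame(actual = seq(140, 230, by = 10))
chips$measured <- chips$actual + rnorm(nrow(chips), sd = 3)

# Straight line model: measured as response, actual as explanatory variate.
fit <- lm(measured ~ actual, data = chips)
s <- summary(fit)

coef(s)       # estimates, standard errors, t values and p-values
s$sigma       # residual standard error (estimate of sigma)
s$r.squared   # proportion of variation in measured explained by the line
```

The row of coef(s) labelled (Intercept) also carries the p-value for the hypothesis of a zero intercept.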

c. (2 marks) Having fitted the model, it is of interest to test the hypothesis that the intercept of the line is zero.

● formally, write down the hypothesis

● report the observed significance level (p-value)

● what do you conclude about the evidence against the hypothesis?

d. (3 marks) Having fitted the model, it is also of interest whether the slope of the line is 1.

● write down a formal hypothesis to be assessed

● mathematically write down the test statistic (or discrepancy measure) for testing this hypothesis.

● calculate its value in R. Show your code.

● determine the observed significance level (p-value)

● on the basis of these findings, what do you conclude about the evidence against the hypothesis?
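
One common way to carry out the slope test in part (d), sketched under the usual assumption of independent normal residuals with constant variance, is to compare t = (estimated slope − 1) / (standard error of the slope) with a Student t distribution on n − 2 degrees of freedom. The data below are a simulated stand-in (hypothetical values), and the variable names are illustrative.

```r
# Hypothetical stand-in for chips.Rda: simulated values, NOT the study data.
set.seed(1)
chips <- data.frame(actual = seq(140, 230, by = 10))
chips$measured <- chips$actual + rnorm(nrow(chips), sd = 3)

fit <- lm(measured ~ actual, data = chips)
est <- coef(summary(fit))["actual", "Estimate"]
se  <- coef(summary(fit))["actual", "Std. Error"]

# Discrepancy measure for the hypothesis that the slope equals 1.
t_obs <- (est - 1) / se

# Two-sided observed significance level from the t distribution on n - 2 df.
p_value <- 2 * pt(-abs(t_obs), df = df.residual(fit))
c(t = t_obs, p = p_value)
```

The p-value printed by summary(fit) for the slope tests a slope of 0, not 1, which is why the statistic is computed by hand here.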

e. (4 marks) Construct 99% prediction intervals for a new value of the measurement for all values of x in the sequence x = 140, 141, 142, ..., 230. Again, draw a scatterplot of the data as in part (a), but this time overlay the prediction intervals just determined on top of the plot. Make sure the plot has xlim and ylim values large enough to accommodate all the data and all of the prediction intervals.

● construct the plot as described above;

● print/give the prediction interval values when the actual value is 140, and again when it is 230. Which, if either, is larger? Why is the one larger than the other, or, why are they the same length?

● comment on the quality of the new measuring system. Justify your comments.
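
The construction in part (e) can be sketched with predict() and its interval = "prediction" option. The data are a simulated stand-in for chips.Rda (hypothetical values), and the names new_x and pred, the colour, and the labels are illustrative choices.

```r
# Hypothetical stand-in for chips.Rda: simulated values, NOT the study data.
set.seed(1)
chips <- data.frame(actual = seq(140, 230, by = 10))
chips$measured <- chips$actual + rnorm(nrow(chips), sd = 3)

fit <- lm(measured ~ actual, data = chips)

# 99% prediction intervals at x = 140, 141, ..., 230.
new_x <- data.frame(actual = 140:230)
pred <- predict(fit, newdata = new_x, interval = "prediction", level = 0.99)

# Axis limits wide enough for the data and for every interval.
plot(measured ~ actual, data = chips, cex = 2,
     xlim = range(new_x$actual, chips$actual),
     ylim = range(pred, chips$measured),
     xlab = "Actual sodium (mg per ounce)",
     ylab = "Measured sodium (mg per ounce)",
     main = "Measured versus actual sodium with 99% prediction intervals")
abline(a = 0, b = 1, lty = 2, col = "red")
lines(new_x$actual, pred[, "lwr"], col = "blue")
lines(new_x$actual, pred[, "upr"], col = "blue")

# Intervals at actual = 140 (first row) and actual = 230 (last row).
pred[c(1, nrow(pred)), ]
```

For a straight line model the interval width at x grows with the distance of x from the average of the observed actual values, so that distance is the quantity to compare at 140 and 230.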

f. (3 marks) Suppose instead of taking only a single measurement at some new actual value x, we imagine taking two new values at that same x and averaging. Mathematically, derive a prediction interval for the average of two measurements at x.

● show your derivation

● how does this prediction interval compare to simply predicting one observation?

● on the basis of your findings in this question, would you rather average two observations at a new value of x? Or a single value? Explain and mathematically support your reasoning.
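
An outline of one way the derivation in part (f) might go, assuming independent normal residuals with common variance and new measurements independent of the data used for the fit. The symbols (beta_0, beta_1, R, sigma, S_xx, c) are notational assumptions, not taken from the assignment.

```latex
% Assumed model: Y_i = \beta_0 + \beta_1 x_i + R_i, with R_i independent N(0, \sigma^2).
% S_{xx} = \sum_i (x_i - \bar{x})^2, and c is the quantile of t_{n-2}
% with P(|T_{n-2}| \le c) equal to the desired coverage (e.g. 0.99).
\begin{align*}
\bar{Y}_{\text{new}}
  &= \tfrac{1}{2}\bigl(Y_{\text{new},1} + Y_{\text{new},2}\bigr)
   = \mu(x) + \tfrac{1}{2}(R_1 + R_2), \\
\operatorname{Var}\bigl(\bar{Y}_{\text{new}} - \hat{\mu}(x)\bigr)
  &= \frac{\sigma^2}{2}
   + \sigma^2\left(\frac{1}{n} + \frac{(x - \bar{x})^2}{S_{xx}}\right), \\
\text{interval:}\quad
  & \hat{\mu}(x) \pm c\,\hat{\sigma}
    \sqrt{\frac{1}{2} + \frac{1}{n} + \frac{(x - \bar{x})^2}{S_{xx}}}.
\end{align*}
```

The interval for a single new observation has the same form with 1 in place of 1/2 under the square root.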

g. (4 marks) A straight line model with zero intercept, namely,

Y = μ(x) + R

with

μ(x) = β_1 x

can be fitted in R as

fit_nointercept <- lm(measured ~ actual - 1 , data = chips)

# or equivalently as

fit_nointercept <- lm(measured ~ 0 + actual, data = chips)

The -1 term in the formula removes the intercept from the model definition; the 0 term indicates a zero intercept model.

Fit the no intercept model and test the hypothesis that the slope is 1.

● show how to derive the p-value mathematically,

● calculate the p-value in R and report the result

● what do you conclude about the evidence against the hypothesis?

● comment on whether this finding corroborates or contradicts your findings in earlier parts of this question.
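
A sketch of the p-value calculation for part (g), under the same normal-residual assumptions as before. With no intercept only one parameter is estimated, so the reference distribution is Student t on n − 1 degrees of freedom. The data are a simulated stand-in (hypothetical values).

```r
# Hypothetical stand-in for chips.Rda: simulated values, NOT the study data.
set.seed(1)
chips <- data.frame(actual = seq(140, 230, by = 10))
chips$measured <- chips$actual + rnorm(nrow(chips), sd = 3)

# Zero intercept model.
fit_nointercept <- lm(measured ~ 0 + actual, data = chips)
est <- coef(summary(fit_nointercept))["actual", "Estimate"]
se  <- coef(summary(fit_nointercept))["actual", "Std. Error"]

# Discrepancy measure for slope = 1, referred to t on n - 1 df.
t_obs <- (est - 1) / se
p_value <- 2 * pt(-abs(t_obs), df = df.residual(fit_nointercept))
c(t = t_obs, df = df.residual(fit_nointercept), p = p_value)
```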

h. (2 marks) A straight line model with slope forced to be one, namely,

Y = μ(x) + R

with

μ(x) = β_0 + x

can be fitted in R as

fit_slope1 <- lm( (measured - actual) ~ . , data = chips)

# or equivalently

fit_slope1 <- lm( (measured - actual) ~ 1 , data = chips)

The formula was manipulated to fit the model

y − x = β_0 + r

The difference appears on the left of ~ in the formula, and . (or simply 1) appears on the right to indicate only the default (or intercept) term remains.

Fit this unit slope model and test the hypothesis that the intercept is 0.

● calculate the p-value in R and report the result

● what do you conclude about the evidence against the hypothesis?

● comment on whether this finding corroborates or contradicts your findings in earlier parts of this question.
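
A sketch of part (h) on a simulated stand-in for chips.Rda (hypothetical values). In the intercept-only fit the p-value for the intercept comes straight from the summary, and it coincides with a one-sample t test of the differences measured − actual against zero, which is shown as a cross-check.

```r
# Hypothetical stand-in for chips.Rda: simulated values, NOT the study data.
set.seed(1)
chips <- data.frame(actual = seq(140, 230, by = 10))
chips$measured <- chips$actual + rnorm(nrow(chips), sd = 3)

# Unit slope model: only the intercept is estimated.
fit_slope1 <- lm((measured - actual) ~ 1, data = chips)
p_lm <- coef(summary(fit_slope1))["(Intercept)", "Pr(>|t|)"]

# Cross-check: one-sample t test that the mean difference is zero.
p_t <- t.test(chips$measured - chips$actual, mu = 0)$p.value

c(lm = p_lm, t.test = p_t)
```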

i. (2 marks) The hypotheses H_0 : β_0 = 0 and H_1 : β_1 = 1 were each tested twice on the same data. Once in each of parts (c) and (h) for H_0, and once in each of parts (d) and (g) for H_1.

Depending on your results, explain why each time testing either of the hypotheses must yield the same conclusion, OR, why it is possible that different conclusions can be drawn when testing the same hypothesis, OR, having observed different conclusions, why there must be something wrong with the data and/or model since such contradictions must be impossible.

2023-02-18
