Math 181A HW Problems Complete List

General instructions:

• Clearly and thoroughly write your solutions on blank paper, showing all your work. See the syllabus for instructions for uploading to Gradescope. See the calendar for due dates/times.

• You may list answers in exact form (e.g., π) or round to three decimal places (e.g., 3.142), unless the problem says otherwise. If rounding to three decimal places would result in the number 0 (e.g., with 0.00012345), instead use scientific notation and write three decimal places (e.g., $1.235 \cdot 10^{-4}$).

• On any problem involving R, you must include your code and output as part of your answer. You may take a screenshot of the code/output, or write it by hand.

• Problems tend to focus on content from the two or three previous lectures and never require ideas from a lecture that falls on the day a problem is due. For example, if a problem is due on Friday the 19th, it is likely to use ideas from lectures on Wednesday the 17th, Monday the 15th, and/or Friday the 12th. It will not use ideas from the lecture on Friday the 19th. It is possible that a problem requires knowledge from earlier in the course, or from prerequisite courses. If some prerequisite knowledge is required which you have forgotten, you should feel free to consult books/internet to learn this knowledge (e.g., Taylor series, improper integrals, L’Hôpital’s Rule, etc.). Expect prerequisite knowledge to be drawn on frequently.

• At the end of the course calendar, you should see a phrase like “Problems XX-XX not collected”. This refers to the problems at the end of this packet that are here to help you learn the material but cannot be collected/graded because of union rules related to UCSD graders. You should work these problems to develop your mastery of topics from the last few lectures in the course as you prepare for the final exam.
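The rounding rule in the instructions above can be sketched in a few lines. This is only an illustration in Python (the course itself uses R), and the helper name `format_answer` is hypothetical. It assumes that "would result in the number 0" means the value rounds to 0.000 at three decimal places.

```python
def format_answer(x: float) -> str:
    """Format a numeric answer following the rounding rule described above."""
    if x == 0:
        return "0"
    # Usual case: the value survives rounding, so report three decimal places.
    if round(x, 3) != 0:
        return f"{x:.3f}"
    # Rounding would give 0, so switch to scientific notation
    # with three decimal places in the mantissa.
    mantissa, exponent = f"{x:.3e}".split("e")
    return f"{mantissa} · 10^{int(exponent)}"


print(format_answer(3.14159265))  # three-decimal form
print(format_answer(0.00012))     # scientific-notation form
```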

1. The simplest random variable  (RV) follows the Bernoulli distribution.   This is a RV with two possible values: success (which we think of as 1), which appears with probability p, and failure (0), which appears with probability 1 − p.

(a) Explain why the pmf can be written in this surprising way: $f(x;p) = p^x (1-p)^{1-x}$.

(b) For many students, the above pmf feels like pure magic. Explain how you can come up with this if you happen to know the pmf for the Binom(n,p) distribution.

(c) Explicitly calculate the mean and variance of the Bernoulli RV with parameter p using the definitions of mean and variance.

(d) If $X_1, \ldots, X_n$ are iid (independent and identically distributed) Bernoulli(p) RVs, and Y ∼ Binom(n,p) is Binomial, write a formula that relates Y and the $X_i$s. Then, explain how the formula can help you easily remember the mean and variance of a Binomial RV.

2. “The Kernel Technique”. One of the most helpful tricks in mathematical statistics is to use the fact that all probability density functions must integrate to 1 over their support. That is, $\int_{\text{support}} f(x)\,dx = 1$. For example, if we take $X \sim \text{Exp}(\lambda = 4)$, then we know $\int_0^{\infty} 4e^{-4x}\,dx = 1$ since the pdf is $f(x) = 4e^{-4x}$. Now, each pdf (or pmf) can be separated into two parts: the constant(s) and the terms with the variable (known as the “kernel”). For the exponential distribution above, the kernel is $e^{-4x}$. This distinction is useful because:

$$\int_{\text{support}} \text{kernel} = \frac{1}{\text{constant(s)}}$$
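As a quick numerical sanity check of the Exp(λ = 4) example above, the sketch below integrates only the kernel and compares the result with one over the constant. It is a Python illustration (not R), and `integrate_midpoint` is a hypothetical helper written for this check. The upper limit 10 is an assumption standing in for infinity.

```python
import math


def integrate_midpoint(f, a, b, steps=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


rate = 4.0  # the constant in front of the Exp(4) pdf

# Integrate only the kernel e^{-4x}. The upper limit 10 stands in for
# infinity because the tail beyond it is negligible (about e^{-40}).
kernel_integral = integrate_midpoint(lambda x: math.exp(-rate * x), 0.0, 10.0)

print(kernel_integral, 1 / rate)  # the two values should nearly agree
```

The point of the technique is that you never need to run such an integral: recognizing the kernel gives the answer immediately.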

(a) Find $\int_{-\infty}^{\infty} e^{-x^2/2}\,dx$. Do not try to use integration techniques from calculus. Instead, think of a RV with a pdf whose kernel looks like $e^{-x^2/2}$ and use the above comments to immediately write the answer. (Mention the RV in your answer on all parts!)

(b) Find $\int_0^{\infty} x^4 e^{-3x}\,dx$. (Integration by parts 4 times? Nope. The kernel technique.)

(c) $\sum_{x=0}^{\infty} \frac{2^x}{x!}$ (You could do this problem using Taylor series, but use the kernel technique. Note that because we have a sum instead of an integral, you should be thinking about discrete random variables here, not continuous random variables.)

(d) (x 1)(0.7) 4  (This is very scary without the kernel technique.)

3. Let X1 , . . . ,Xn  be iid from the distribution modeled by

$f_X(x;\theta) = (\theta^2 + \theta)\,x^{\theta-1}(1-x)$ where $0 < x < 1$ and $\theta > 0$

Find the MME (method of moments estimate/estimator) for θ . (Note: We always assume a pdf is 0 outside of the zone specified. For example, here we assume fX (x;θ) = 0 if x ≤ 0 or x ≥ 1.)


4. In the 2017 video game hit Legend of Zelda: Breath of the Wild, you must collect star fragments to upgrade your armor to the highest levels. You decide to explore the mechanic behind how these rare items are generated in the game. Suppose you have this partial knowledge:

• A star fragment will appear once per night, sometime after 9 PM (using the in-game clock).

• Once the clock reads θ (a particular, unknown time of day), star fragments no longer appear.

• The game uses a random number generator to decide on the spawn time for the star fragment where each time between 9 PM and θ is equally likely.

It is important for the gaming community to learn what θ is because this helps users understand the game and saves people time: If you know the time is past θ, you will stop waiting for the star fragment (which you missed!)  and plan on trying again the next night.  To help the community, you plan to record the appearance time for 6 star fragments on 6 random (in-game) nights.  You get: x1 =11:20 PM, x2 = 1:20 AM, x3 =12:20 AM, x4 = 10:00 PM, x5 = 1:05 AM, and x6 = 11:55 PM. Find the MME for θ based on these six data points.
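Because the observations are clock times that cross midnight, it can help to convert them to a single numeric scale before doing any calculation. The sketch below is one way to do this in Python. The helper `hours_after_9pm` and the choice of "hours after 9 PM" as the scale are assumptions made for illustration, not part of the problem.

```python
def hours_after_9pm(clock: str) -> float:
    """Convert a 12-hour clock string like '11:20 PM' to hours after 9 PM."""
    time_part, meridiem = clock.split()
    hour, minute = (int(piece) for piece in time_part.split(":"))
    # Convert to minutes since midnight on a 24-hour clock.
    hour24 = hour % 12 + (12 if meridiem.upper() == "PM" else 0)
    minutes = 60 * hour24 + minute
    # Shift so that 9 PM is time zero; times after midnight wrap around.
    offset = (minutes - 21 * 60) % (24 * 60)
    return offset / 60


for t in ["11:20 PM", "1:20 AM", "12:20 AM", "10:00 PM", "1:05 AM", "11:55 PM"]:
    print(t, round(hours_after_9pm(t), 3))
```

Any estimate computed on this scale has to be converted back to a clock time at the end.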


τ '(') 3 ,

'(')τ

5.  Suppose a discrete RV is modeled by pX (x;τ) = τ(6) , '(') 4 ,

''(1 ,

x = 1

x = 2

x = 3

x = 4

Suppose you observe the sample x1  = 4, x2  = 3, x3  = 4, x4  = 2, x5  = 2, x6  = 2, and x7  = 2. Find the MME for τ . After this, find the set of values τ could actually take on given that pX (x;τ) must be a valid pmf and list your answer in interval notation.  (Note: It is possible that some data sets will give rise to an MME value that is outside the set of possible values for the parameter! This is not a problem with the above data, but it is one drawback to MME in general.)

6. It is important to note that the MME for a parameter is not a unique idea (despite me writing “the MME” on problems). Suppose X1,X2 , . . . ,Xn  are iid from X ∼ Pois(λ).

(a) Find an MME for λ using the first moment of X (which is what people typically use).

(b) Find an MME using the second (!) moment of X. Then, see if the estimators in parts a and b give the same estimate for λ using the data $X_1 = 1$, $X_2 = 2$, $X_3 = 2$.

7. Engineers will often use this distribution to model the lifetime of electronics: $f_Y(y;\alpha,\beta) = \alpha\beta y^{\beta-1} e^{-\alpha y^{\beta}}$, where $y > 0$, $\alpha > 0$, $\beta > 0$

Assuming $Y_1, \ldots, Y_n$ are iid from this distribution, find the MME for α assuming that β is fixed (known). (Hint: After setting up an integral, try a u-substitution with $u = \alpha y^{\beta}$. Remember to switch the bounds to u bounds, and then switch the dy as well. Your answer will have a $\Gamma\left(1 + \frac{1}{\beta}\right)$ in it. Also, this problem might be the first time in your life that you’ve seen exponents inside exponents. Often, such expressions can be tough to read, so mathematicians will use the notation $\exp(a)$ to mean $e^a$. With this notation, we can write the pdf as $\alpha\beta y^{\beta-1}\exp(-\alpha y^{\beta})$, which is a little easier to read.)
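The exp notation carries over directly to code. The sketch below writes the pdf above with `math.exp` and checks numerically that it integrates to about 1. It is a Python illustration only: the helper names, the parameter values α = 2 and β = 1.5, and the upper limit 10 (standing in for infinity) are all assumptions chosen for the example.

```python
import math


def lifetime_pdf(y: float, alpha: float, beta: float) -> float:
    """Evaluate alpha * beta * y^(beta - 1) * exp(-alpha * y^beta) for y > 0."""
    return alpha * beta * y ** (beta - 1) * math.exp(-alpha * y ** beta)


def integrate_midpoint(f, a, b, steps=100_000):
    """Approximate the integral of f over [a, b] with the midpoint rule."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


alpha, beta = 2.0, 1.5  # illustrative values, not from the problem
total = integrate_midpoint(lambda y: lifetime_pdf(y, alpha, beta), 0.0, 10.0)

print(total)  # should be close to 1 for a valid pdf
```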

8. Let Y be a CRV with density fY (y;θ) = e−y2 /θ  where y > 0,θ > 0.  Given a random sample

(a) Find the MLE for θ. (As always with one parameter, you must check the second derivative condition!)

(b) Find the MME for θ using first moments. You should get a different answer from part a, hence showing the MME and MLE may be different.

9. One common distribution that appears in branching process theory is a DRV with pmf:

fX (x;µ) =                      where x ∈ {1, 2, . . .} and µ ∈ (0, 1)

(a) Find the MLE for µ given iid $X_1, \ldots, X_n$. Then, find the MLE for the particular data $x_1 = 2$, $x_2 = 1$, $x_3 = 6$.

(b) Using Desmos, draw a graph of the likelihood function (not log-likelihood) for the data $x_1 = 2$, $x_2 = 1$, $x_3 = 6$. It should be maximal at the µ value you found in part a. Include a sketch of the graph from Desmos (or a screenshot if you’re tech-fancy). (Note: In Desmos, if you click on the wrench icon in the upper-right, you can change the range of values on the x and y axes.)

10. Economists frequently use the CRV X with pdf:

$f_X(x;\alpha,\beta) = \beta\alpha^{\beta} x^{-\beta-1}$ where $x \geq \alpha > 0$ and $\beta > 1$

Find the MLE for α and β .  (As with all multivariable maximization problems in this class, you need NOT show your MLE is maximal via higher derivatives. Also, as on all MLE problems, if no data values are explicitly given, you should begin by naming them for your use:  “Let x1 , . . . ,xn be a random sample of data.”.)

11. Suppose Y is a CRV whose pdf is pictured below. Find the MME and MLE for w given the small sample: y1  = 1, y2  = 3.  (Technology may be useful on the MLE. Do not try to find a formula for the MLE in the general case of n data; no closed-form solution exists.)

12. Let’s start using R to see estimators in action. While an estimator looks like a formula, it is actually a random variable because as different random samples come out of a distribution, they combine (via the formula) to make a random value. Different data give rise to different values of the estimator. Let’s consider estimating the parameters from $N(\mu,\sigma^2)$. In class, we learned the MLEs are:

$$\hat{\mu} = \bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i \qquad \text{and} \qquad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n} (X_i - \bar{X})^2$$

(a) Imagine we are collecting data on IQ scores at UCSD, and suppose these are $N(\mu = 106, \sigma^2 = 14^2)$. I have to give you values for µ and $\sigma^2$ so we can run a simulation, but pretend we don’t know them! Using the function rnorm in R, generate a random sample of 23 IQ scores from this distribution, and then write code that finds $\hat{\mu}$ and $\hat{\sigma}^2$. Include your code and results. (Note: When calculating $\hat{\sigma}^2$, do not use the built-in function var, as this does a slightly different calculation than our formula above and because I want you to see how straightforward it is to calculate the formula for $\hat{\sigma}^2$ in a vectorized language like R.)

(b) Now, let’s imagine that instead of one sample of size 23, we collected 1000 samples, each of

size 23.  Each sample gives a value for , so we have 1000 different values for .  Using the replicate function in R, create these 1000 values for , and then use the hist function to make a histogram.  Include your code and a rough sketch (or screenshot) of the histogram.  This picture allows you to see as a random variable.  In R, you can type ?replicate to read the documentation for the replicate function.
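The note in part (a) about var is worth seeing on a tiny example. The sketch below uses Python rather than R, and the data are made up for illustration. Python's `statistics.variance`, like R's `var`, divides by n − 1, while the MLE for the variance divides by n.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up toy data, not from the problem
n = len(data)

mu_hat = sum(data) / n
# MLE of the variance: divide the sum of squared deviations by n.
sigma2_hat = sum((x - mu_hat) ** 2 for x in data) / n

# statistics.variance divides by n - 1 instead, so it differs slightly.
sample_var = statistics.variance(data)

print(mu_hat, sigma2_hat, sample_var)
```

The two variance values differ by the factor (n − 1)/n, which matters for small samples and fades as n grows.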

13. Suppose we have a random sample Y1 , . . . ,Yn  from a CRV with density

$f_Y(y;\theta) = \frac{\theta}{(y+1)^{\theta+1}}$ where $y > 0$, $\theta > 1$

Find the MME and MLE for θ .

14. Suppose that the time it takes your computer to load R Studio on a random day is normally distributed with unknown mean, µ, and variance 1.2 seconds². You’d like to build a 95% confidence interval for µ, so you time your load speeds on 6 random days: 2.1, 1.7, 3.3, 2, 2.1, 1.9 (seconds).

(a) Find your 95% CI for µ.

(b) Suppose you had instead created an interval using an 80% confidence procedure. What CI do you get now?

(c) Suppose you’re unhappy with the width of the interval in part a and want it to be one-third its current size. Assuming you use the 95% confidence procedure, how many total data would you need to collect to achieve this?
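The general shape of a known-variance interval for a normal mean can be sketched as follows. This is a Python illustration under the usual assumption that the interval is $\bar{x} \pm z^* \sigma/\sqrt{n}$. The function name `z_interval` is hypothetical, and the numbers are deliberately not the data from this problem.

```python
import math
from statistics import NormalDist


def z_interval(xbar, sigma, n, level):
    """Two-sided CI for a normal mean when sigma is known."""
    # Critical value z* with area `level` between -z* and z*.
    z_star = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z_star * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width


# Illustrative numbers only (not the data from the problem).
low, high = z_interval(xbar=10.0, sigma=2.0, n=16, level=0.95)
print(low, high)
```

Since the half-width is proportional to $1/\sqrt{n}$, shrinking the interval by a given factor requires growing n by the square of that factor.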

15. Suppose $X_1, \ldots, X_n$ are iid from $X \sim N(\mu,\sigma^2)$, where $\sigma^2$ is known. Consider the confidence procedure that generates intervals of the form $\left(\bar{X} - 2\frac{\sigma}{\sqrt{n}},\ \bar{X} + c\frac{\sigma}{\sqrt{n}}\right)$, where c is a constant.

(a) What value must c be for this to be a 70% confidence procedure? (Also, see problem 16.)

(b) Your friend collects some data in the above setting, builds a 70% CI, and writes in a journal article: “Given our data, we find there is a 70% chance that the unknown µ is in our interval.” Critique this statement and offer an improved statement.

16. Let’s check your answer to 15a. Below is an outline of some R code that does this. Fill in the missing parts, and then type the code into R and run it to see if your answer from 15a was correct. Our setup will assume the n = 25 data come from the distribution $N(\mu = 7, \sigma^2 = 16)$ (we must set a value for µ to run the simulation!). We make 50000 intervals and then see which capture µ and which don’t. Finally, we calculate the confidence level.
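For readers who want to see the idea of a coverage simulation before filling in the R outline, here is a separate sketch in Python. It is not the course's R code. It checks the familiar symmetric interval $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$ rather than the interval from problem 15, and the function name `coverage`, the seed, and the number of intervals are assumptions made for illustration.

```python
import math
import random


def coverage(mu, sigma, n, z_star, num_intervals, seed=0):
    """Fraction of simulated intervals xbar +/- z_star*sigma/sqrt(n) containing mu."""
    rng = random.Random(seed)
    half_width = z_star * sigma / math.sqrt(n)
    hits = 0
    for _ in range(num_intervals):
        # Draw one sample of size n and compute its mean.
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        # Count the interval as a hit if it captures the true mean.
        if xbar - half_width < mu < xbar + half_width:
            hits += 1
    return hits / num_intervals


estimate = coverage(mu=7, sigma=4, n=25, z_star=1.96, num_intervals=5000)
print(estimate)  # should land near 0.95
```

The same loop structure works for any interval procedure: only the two endpoints in the comparison change.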

17. Suppose $X \sim \text{Unif}(0, \theta^2)$ where θ > 0 is unknown. In this problem, we find point and interval estimators for θ.

(a) Find an MME for θ using first moments for a sample X1 , . . . ,Xn .

(b) Suppose your answer from part a is called $\hat{\theta}_{MME}$. You design an interval estimator to try to capture θ: $(\hat{\theta}_{MME},\ 2 \cdot \hat{\theta}_{MME})$. Find the confidence level for this confidence procedure assuming a sample size of n = 1. (Also, see question 18.)

18. Using the programming skills you gained in question 16, write code to check your answer for question 17b. Include your code in your answer.

19. Suppose you are drawing a random sample of size n > 0 from $N(\mu,\sigma^2)$ where σ > 0 is known. Decide if the following statements are true or false and explain your reasoning. Assume our 95% confidence procedure is $\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right)$.

(a) If (3.2, 5.1) is a 95% CI from a particular random sample, then there is a 95% chance that µ is in this interval.

(b) If (3.2, 5.1) is a 95% CI from a particular random sample, then there is a 95% chance that the mean from our next random sample will be in this interval.

(c) A 95% CI will contain 95% of the possible values from the population distribution we are studying.

(d) If we generate 400 random CIs using our 95% confidence procedure, we expect about 20 intervals to not contain µ .

20. In the modern political era, campaign rallies are being infiltrated by people who do not support the speaker (e.g., to sow dissent, to study those who do support the speaker, etc.). You’re curious about this, so you decide to attend a political rally. Your plan is simple: you’ll choose 70 random people in the audience and use hidden cameras to videotape them during the rally. When the crowd breaks into a chant, you will use the video footage to see what proportion of your random subjects actually engage in the chant.  Suppose you do this and find only 58 of the 70 people took part in the chant.

(a) Assign notation to and define the population parameter we are trying to study. Then, create an approximate 92% CI for this parameter.

(b) You may have noticed that your approximate 92% CI from part a did not include 100% (or 1, if you are using decimals). Suppose you change the confidence level to C% and the upper bound of the approximate CI exactly equals 100%. Find C.

21. One thing that often surprises San Diego newcomers is how present the U.S. military is here.  As a statistician at DQ Industries, your boss has tasked you with finding the proportion of San Diego jobs that are connected to the military (this includes jobs at bases, contractors, military R & D, etc.). You are required to draw a large-enough sample so that the sample proportion will be within 1.5% of the true value 90% of the time.

(a) Suppose you have no information about this proportion and that it costs $5 to contact each person in your sample. What is the least amount of money you can spend to meet your requirements?

(b) Your boss is horrified by the cost estimate in part a. You decide to do some Google searching to get an estimate for the proportion. At this website, you see that 1 in 5 jobs in San Diego is linked to the military sector. Since this is a pro-military group, you figure this number can serve as an upper bound on the true proportion. What is the new minimal cost estimate?
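The sample-size reasoning in this problem can be sketched in code. This Python illustration assumes the usual large-sample margin of error $z^*\sqrt{p(1-p)/n}$ and solves for the smallest n. The function name `sample_size` and its parameters are hypothetical, and the numbers are chosen to differ from the ones in the problem.

```python
import math
from statistics import NormalDist


def sample_size(margin, level, p_bound=0.5):
    """Smallest n with z* * sqrt(p(1-p)/n) <= margin, evaluated at p = p_bound.

    p_bound = 0.5 is the worst case. A smaller value is only a safe
    bound when the true proportion is known to be at most p_bound <= 0.5.
    """
    z_star = NormalDist().inv_cdf(0.5 + level / 2)
    return math.ceil(z_star ** 2 * p_bound * (1 - p_bound) / margin ** 2)


print(sample_size(margin=0.03, level=0.95))               # no prior information
print(sample_size(margin=0.03, level=0.95, p_bound=0.2))  # p assumed at most 0.2
```

Using an upper bound below 0.5 reduces the required sample size because p(1 − p) is increasing on the interval from 0 to 0.5.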

22. One frustrating issue with proportions is that your mathematics might give a CI like: (−2%, 7%) or 100%. This makes it particularly difficult to study rare or hyper-prevalent phenomena.

(a) Suppose you are building an approximate 95% CI for a parameter p using a sample of size 100. If the lower bound of your CI exactly equals 0%, what is your sample proportion? (Note: It may not be possible to actually get this sample proportion because with 100 people the only possible sample proportions are 0, 0.01, 0.02, . . . , 0.99, and 1.)

(b) Suppose you are trying to decide on a sample size for a study to determine what percentage of Americans self-identify as transgender. Based on previous studies, this proportion is somewhere around 1%, so you decide you’d like to be within 0.1% of the true proportion 95% of the time. If you take 7% as an upper bound on the transgender proportion, what is the smallest sample size you can draw?

23. In this problem, we explore more estimators for µ and σ 2  in the distribution N(µ,σ2 ).

(a) Typically, people use 1  = X as an estimator for µ .  You might also use 2  =                  or

3  = . Show that all three of these estimators are unbiased.

(Note: 2  might be used if you didn’t trust data X3 , . . . ,Xn, while 3  might be used if you wanted to give increasing importance to data collected later in the process!)

(b) In class, we showed that $\hat{\sigma}_1^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2$ is a biased estimator for $\sigma^2$ when both µ and σ are unknown. Suppose, however, that µ is known and so we can use $\hat{\sigma}_2^2 = \frac{1}{n}\sum_{i=1}^{n}(X_i - \mu)^2$. Show that $\hat{\sigma}_2^2$ is actually unbiased.

24. Suppose you decide to randomly generate numbers from X ∼ Unif(0,θ). Your friend will ask for n numbers and then use this information to guess what value you (secretly) chose for θ. Typically, one might use $\hat{\theta}_{MLE} = \max X_i = X_{(n)}$ to estimate θ. Your friend, however, has meganumerophobia, and is afraid to say the maximum number in the random sample. Instead, he’ll say the second largest number: $\hat{\theta} = X_{(n-1)}$. Determine the bias of this estimator by carefully finding the density function for $X_{(n-1)}$ and continuing from there. If the estimator is biased, check if it is asymptotically unbiased, and also modify it to create a new unbiased estimator.

25. Suppose you have X ∼ Binom(n,p) where n is known and p is unknown. Typically, people use $\hat{p} = \frac{X}{n}$ to estimate p, where $X = X_1$ is simply a sample of size 1. (Note: A sample of size 1 from a Binomial RV is equivalent to n Bernoulli trials.) This might represent simultaneously flipping n coins (just once!) and counting the number of heads you see, where each coin has $p_{\text{heads}} = p$. Now,

if both n and p are known, we know the variance, V , of X is just np(1 p). If p is unknown, you it is biased, determine if it is asymptotically unbiased, and also modify to create a new unbiased

estimator.

26. The number of times, X, a particular first-year college student calls home during a random week is a Poisson RV with mean λ: X ∼ Poisson(λ). Curious to find the value for λ, you break into the NSA (!) and access phone records for this student on n random weeks. You record the number of calls home and get the random sample X1 , . . . ,Xn .

(a) Find an unbiased estimator of λ and prove it is unbiased.

(b) You’re curious how many total minutes, M, these X calls amount to in a week, and you read a recent journal article that suggests the model $M = 2X + 3X^2$. Find the expected number of weekly minutes as an expression involving λ.

(c) Find an unbiased estimator of E(M) (your answer from part b) based on the random sample X1,X2 , . . . ,Xn .

27. Let X ∼ Exp(λ) with λ unknown, and suppose $X_1, X_2$ is a random sample of size 2. Show that $M = \sqrt{X_1 \cdot X_2}$ is a biased estimator of $\frac{1}{\lambda}$ and modify it to create an unbiased estimator. (Hint: During your journey, you’ll need the help of the gamma distribution, the gamma function, and the knowledge that $\Gamma(1/2) = \sqrt{\pi}$.)

28. Suppose that X ∼ Unif(0, 3θ) and we draw a random sample X1 , . . . ,Xn .  Find the MME and compute its relative efficiency to 2 = 2X1 − X2 .

29. In class, I showed the below picture.  Here, I have changed the vertical axis from variance to SD. In this new picture, how can we visualize the MSE? How does this way of seeing the MSE help us decide which of two (possibly biased) estimators is more efficient?

30. Let X be a continuous random variable with $E(X) = \mu$ and $\text{Var}(X) = \sigma^2 < \infty$. Suppose we try to estimate µ using these two estimators from a random sample $X_1, \ldots, X_n$ (where $n \geq 3$):

$\hat{\mu}_1 = \bar{X}$

$\hat{\mu}_2 = 2X_1 + aX_2 + bX_3$

For what a and b are both estimators unbiased and the relative efficiency of $\hat{\mu}_1$ to $\hat{\mu}_2$ is 45n?

31. Find the Fisher Information and the Cramer-Rao lower bound for the variance of an unbiased estimator of θ given a random sample X1 , . . . ,Xn  from the density

$f(x;\theta) = \frac{1}{\theta}e^{-x/\theta}$ where $x > 0$ and $\theta > 0$.

32. Find the Fisher Information and the Cramer-Rao lower bound for the variance of an unbiased estimator of θ given a random sample X1 , . . . ,Xn  from the density

f(x;θ) = where − ∞ < x < ∞ and − ∞ < θ < ∞ .

You should use WolframAlpha.com to evaluate the complicated integral that will arise.

33. Let X1 , . . . ,Xn be iid based on f(x;θ) = ex2 /θ where x > 0. Show that = Xi(2) is efficient.

34. Let Y1 , . . . ,Yn  be a random sample from Y with pdf

fY (y;θ) =                where 0 < y < θ .

35. Let Y1 , . . . ,Yn  be iid based on Y with pdf

$f_Y(y;\beta) = \frac{1}{\beta}e^{-(y-3)/\beta}$ where $y > 3$, $\beta > 0$.

(a) Find $\hat{\beta}_{MME}$ for β using first moments.

(b) Show $\hat{\beta}_{MME}$ is MSE-consistent, and hence, consistent. (Note that problems 34 and 35b are training two ways to show consistency: via the ε-definition, and through MSE-consistency. Be skilled at both.)

36. Let X1 , . . . ,Xn  be a random sample from the discrete RV X with pmf

f(x;α) =                       where x = 0, 1, 2, . . .

Find the MLE for α and use it to create a formula for an approximate 82% MLE CI for α . (Recall the note on exp notation from problem 7.)

37. Suppose we try to model the test-taking abilities of a given student by the CRV X with pdf

$f(x;\theta) = (\theta + 3)\,x^{\theta+2}$ where $0 < x < 1$ and $\theta > -3$

Here, the constant θ is unknown and is determined by the work ethic and background training of the student.  Design an approximate 93% MLE CI for θ and use it to build a CI for the data x1  = 0.8,x2  = 0.92,x3  = 0.81,x4  = 0.96 (which represent random test scores of the student: 80%, 92%, 81%, and 96%).

38. Consider a RV modeled by the density $f(x;\theta) = \frac{1}{\theta}x^{(1-\theta)/\theta}$ where $0 < x < 1$ and $\theta > 0$.

(a) Find the MLE for θ based on a sample X1 , . . . ,Xn .

(b) According to MLE theory, $\hat{\theta}_{MLE}$ should be asymptotically unbiased and consistent. Explicitly show that both of these are true for your result from part a.

39. State the decision rule (i.e., test) that would be used to test the following hypotheses for the specific test statistic  mentioned.  Then, make a decision using the data provided and write a conclusion. Assume the data come from a normal distribution with unknown µ and known σ . Include a picture (OK to draw by hand, doing this in R is inefficient) of the sampling distribution for the test statistic and label the critical region.

(a) $H_0: \mu = 20$, $H_1: \mu < 20$, n = 16, σ = 3, and α = 0.06. Test stat: $\bar{x}$. Data: $\bar{x} = 18.5$

x 20

(c) $H_0: \mu = 10$, $H_1: \mu \neq 10$, n = 100, σ = 0.4, and α = 0.12. Test stat: $\bar{x}$. Data: $\bar{x} = 11$

(d) $H_0: \mu = 50$, $H_1: \mu > 50$, n = 60, σ = 4, and α = 0.08. Test stat: $3\bar{x}$. Data: $\bar{x} = 50.5$

Note: Life is about trade-offs. This problem helps you see this. For example, part a has a nicer-looking test stat, but the distribution it follows is a little messier. Part b has a messy test stat, but the distribution it follows is very nice. Part d is here to remind you that just about any expression can act as a test stat, as long as you can determine its distribution. Since we never know what expression might arise from MME or MLE, this reminder is comforting.

40. Calculate the P-values for problems 39b and 39c. Does using these P-values lead you to the same conclusions as the critical regions did?

41. Suppose you wanted to alter problem 39a so that the P-value, when calculated, would equal 0.04. If you could only change σ, what value would it need to equal to get the P-value to be 0.04?

42. In December 2017, the J-RPG Xenoblade Chronicles 2 was released for Nintendo Switch. The game is epic in its number of main quests and side quests.  Those that try to finish every aspect of the game are known as “completionists”. What is the average time for all completionists in the world (currently)? Assume completion times are normally distributed with unknown mean and standard deviation 50 hours (a reasonable estimate for J-RPGs). Before you collect data, your friend claims this average time is 250 hours (based on her personal experience). You think the value is something different and go to HowLongToBeat.com to find some data.  Based on when I looked at this page (don’t use more recent data!), 96 completionists had submitted their times for an average of 254 hours.   Define parameter(s), write hypotheses, draw a sampling distribution, and decide which hypothesis to support using α = 0.01 (and any one of the three methods shown in class). [For those curious, my completion time was around 225 hours, and my current play time is around 700 hours because of expansion pass content!]
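The P-value calculations used in problems 40 through 42 all follow the same pattern when σ is known. The sketch below shows that pattern in Python. The function name `z_test_p_value` and the `alternative` labels are hypothetical, and the numbers are made up so that the standardized statistic is exactly 2.

```python
import math
from statistics import NormalDist


def z_test_p_value(xbar, mu0, sigma, n, alternative):
    """P-value for H0: mu = mu0 with known sigma.

    alternative is 'less', 'greater', or 'two-sided'.
    """
    # Standardize the sample mean under the null hypothesis.
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    cdf = NormalDist().cdf
    if alternative == "less":
        return cdf(z)
    if alternative == "greater":
        return 1 - cdf(z)
    if alternative == "two-sided":
        return 2 * (1 - cdf(abs(z)))
    raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")


# Made-up numbers: z = (52 - 50) / (5 / sqrt(25)) = 2.
p_one = z_test_p_value(xbar=52, mu0=50, sigma=5, n=25, alternative="greater")
p_two = z_test_p_value(xbar=52, mu0=50, sigma=5, n=25, alternative="two-sided")
print(p_one, p_two)
```

Comparing the P-value with α gives the same decision as checking whether the test statistic falls in the critical region.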

43. Students often wonder what to do if you get a P-value of exactly 0.05 when α = 0.05. In truth, it doesn’t matter if you suggest rejecting $H_0$ or keeping it, because the probability your P-value exactly equals α is 0 (since the P-value is actually a continuous random variable). Let’s say you wanted to be evil and design a problem for your next statistics exam where the P-value would exactly equal 0.05. You plan to make a problem where we study µ from $X \sim N(\mu, 3^2)$ with data $\bar{x} = 7$ and n = 28. What value(s) should you have students use for the null hypothesis to get your P-value to be 0.05 assuming a two-sided $H_1$?

(a) Suppose a problem has $H_0: \mu = \mu_0$ and $H_1: \mu \neq \mu_0$. If a given data set causes us to reject $H_0$ when α = 0.02, would the same data force us to reject $H_0$ if α = 0.05?

(b) Suppose a problem has $H_0: \mu = \mu_0$ and $H_1: \mu > \mu_0$. If a given data set causes us to reject $H_0$ for some α, would the same data force us to reject $H_0$ if we change $H_1$ to $\mu \neq \mu_0$? Assume α remains the same.

45. You’ve just made the best app ever! You plan to upload it to the app store and are curious how many reviews you might get from users. The histogram of review counts for various apps in the Apple store is very right-skewed: most apps get a small number of reviews, but some apps

like Pandora, PayPal, and LinkedIn get millions.  It turns out that ln(revi