
Lab Activity #8:  Probability and Sampling Distributions

STAT 1350, Spring 2023

This lab activity focuses on the content from Chapters 17 and 18.  We strongly encourage you to read through these chapters in your textbook before you start working on this assignment and to review notes from lecture as you work toward completing the assignment.  

There are 17 questions in this activity.  You will need to submit the answers to these questions no later than 11:59 p.m. on Friday, March 10th.   You will share the answers via a Word or PDF file that you submit through the course website (either by going to the “Submit” link within the Week 9 Overview, next to the place where you downloaded this lab activity handout, or by going to the Assignments link on the left side of the course page and clicking on Lab Activity #8).

Your answers to each question do not have to be long, but they should be as complete as possible.  Aim to be concise but thorough in your answers.  As always, you will be graded based on effort/completeness and the correctness of a selection of problems that we choose to grade.  For this reason, please try to complete the entire lab, and ask for help if you get stuck along the way!

IMPORTANT:  If any problem requires a calculation to answer, please attempt to write out or type out how you arrived at your answer so we can see your thought process.  We cannot give you full credit if no work is shown.  You will need to use Table B for part of this assignment, and you can find a copy at the end of this assignment handout.  Also, in the event you cannot easily see symbols or formulas that are presented in the Word version of this assignment, please review the PDF copy of the assignment.  

Part 1:  Probability Review

Dr. White teaches a chemistry course, and he begins each new semester by giving his students a 6-question pre-test.  The table below was put together based on the pre-test results from all students who have ever taken Dr. White’s course.  The table shows each possible number of pre-test questions answered correctly, along with the probability that a student answers exactly that many questions correctly.  Please use this information to answer Questions 1 through 5.

Number of questions answered correctly    0      1      2      3      4      5      6
Probability                               0.02   0.14   0.33   0.21   0.13   0.12   0.05
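As a rough illustration of how a table like this can be used, the sketch below stores the model as a Python dictionary and adds the probabilities of the outcomes that make up an event.  The names `pretest_model` and `event_probability`, and the event "between 2 and 4 questions correct," are our own choices for illustration and are not part of the questions that follow.

```python
# Probability model copied from the table above:
# number of questions answered correctly -> probability.
pretest_model = {0: 0.02, 1: 0.14, 2: 0.33, 3: 0.21, 4: 0.13, 5: 0.12, 6: 0.05}


def event_probability(model, outcomes):
    """Add the probabilities of the listed outcomes.

    The outcomes cannot happen at the same time (a student has exactly one
    score), so the probability of the event is the sum of their probabilities.
    """
    return sum(model[outcome] for outcome in outcomes)


# Hypothetical event, chosen only for illustration: 2, 3, or 4 questions correct.
p_two_to_four = event_probability(pretest_model, [2, 3, 4])
print(round(p_two_to_four, 2))  # prints 0.67
```

The same helper works for any event you can describe as a list of outcomes from the table.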

1. What makes the above probability model a legitimate model? 

2. What is the probability that if we randomly select one of Dr. White’s students, that student answered at least 4 questions correctly on the pre-test?   

3. What is the probability that if we randomly select one of Dr. White’s students, that student answered at most 2 questions correctly on the pre-test?

4. What is the probability that if we randomly select one of Dr. White’s students, that student answered either 0 questions correctly or 6 questions correctly on the pre-test?

5. What percentage of Dr. White’s students answered 0 questions correctly?  

6. Dr. Chance has just graded the midterm exams in her engineering course.  If 18% of Dr. Chance’s students earned a grade of A on this exam, this means that if we randomly select any one student from Dr. Chance’s course, the probability that student will have earned an A on the midterm exam is equal to ___________.

7. You have tossed a fair coin 12 times, and all 12 tosses have landed with the “heads” side up. When we say the coin is “fair,” it means that when you toss it, it should have an equal chance of landing with either side up (i.e., either “heads” up or “tails” up).   You are now preparing to toss the coin again, for a 13th time. When asked what the probability is of the 13th toss also landing with the “heads” side up, Mary says the probability is exactly equal to 0.5, Julie says the probability is greater than 0.5, and Sandy says the probability is less than 0.5.  Who is correct?  Please explain.  

Part 2:  Sampling Distributions

In our Chapter 18 lecture, we introduced you to the idea of a sampling distribution.  Here, we will again discuss that idea, and we will do so by focusing on an example that you might recall from earlier in the semester.  In Lab Activity #3, we asked you to think about what would happen if we could repeatedly sample from the population of OSU students in order to estimate the percentage of all OSU students who participated in some kind of music education (e.g., band, choir, private music lessons, etc.) while in high school.  Let’s now revisit this example.

8. Suppose you survey a random sample of n = 40 current OSU students, and you find that 11 of these students participated in some kind of music education while in high school.  What will your sample proportion, or p̂ ("p-hat"), be?  Please calculate this number below, and present your answer to three decimal places.  

9. The number you wrote in response to Question 8 describes a sample.  We call a number that describes a sample a _________________________.

10. In the population of all current OSU students, 30% participated in some kind of music education while in high school.  Written as a proportion, this number would be 0.30, and we would use the symbol p to denote this population proportion.  We would consider the population proportion to be a _____________________ because it’s a number that describes a population.  

Imagine now that 12 of your STAT 1350 classmates each survey a random sample of n = 40 OSU students about their high school music education experiences.  Each classmate then determines the proportion of students in their sample who participated in some kind of music education while in high school.  The graph below, called a dot plot, shows each of these sample proportions.  Each proportion appears as a dot above a number line, and the number line gives us different values of our sample proportions.  If any dots are stacked in a column, it would mean that more than one sample had the same sample proportion.  For example, there are two dots stacked in a column at 0.325, and this means that two of the sample proportions were equal to 0.325.

[Figure: dot plot of the 12 sample proportions]

11. What does each dot in the above plot represent?

A. A sample statistic

B. A population parameter

C. A survey response from exactly one current OSU student

12. Look again carefully at the dot plot above.  Does it surprise you to see variability in the sample proportions that were obtained?  Please explain why or why not.

What do you think would happen if we were able to survey many more random samples of size n = 40 from the population of current OSU students?  In order to better understand how the sample proportion of students who have participated in high school music education will vary from sample to sample, we have to think about what we would expect to see in the long run, if we could randomly sample many times from this population, compute a sample proportion from each of our samples, and then examine the resulting distribution of sample statistics.   

Remember that a distribution shows us all values a variable can take on and how often it can take on different values.  A sampling distribution is a collection of the statistics of all possible samples of a particular size taken from a particular population.  If we want to use one sample to make an inference about an unknown parameter, we need to understand how our one sample would compare to all other possible samples we could have drawn randomly from the population.  

To get a sense of what that underlying sampling distribution will look like in this example, we will next examine what would happen if we could survey many random samples of current OSU students.  We’ll focus first on the sample size of n = 40, and then we’ll explore what happens as the sample size gets bigger.  

The histogram below shows you what happens when we draw thousands of random samples of size n = 40 OSU students from the population of all current OSU students and then determine the proportion of students in each sample who participated in some kind of music education while in high school.  The histogram displays these many sample proportions.

[Figure: histogram of sample proportions from thousands of random samples of size n = 40]

13. How would you describe the shape of the distribution that is displayed in the above histogram?

It turns out that the mean, or the average, of all the sample proportions in the above histogram is 0.30.  You might recall that 0.30 is p, or the population proportion.  All of the sample proportions in the distribution average to a value that is equal to the population proportion.  This is an important characteristic of a sampling distribution.  
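If you would like to see this idea in action, the short simulation below is one way to sketch it.  It is not the tool we used to make the histogram.  It assumes each sampled student independently has a 0.30 chance of having participated in music education (a reasonable simplification when the population is very large), and the function name `simulate_sample_proportions`, the number of samples, and the seed are our own illustrative choices.

```python
import random
import statistics


def simulate_sample_proportions(p, n, num_samples, seed=1350):
    """Draw num_samples random samples of size n; return each sample proportion.

    Assumption: each student in a sample independently counts as a "yes"
    with probability p.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    proportions = []
    for _ in range(num_samples):
        yes_count = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(yes_count / n)
    return proportions


p_hats = simulate_sample_proportions(p=0.30, n=40, num_samples=10_000)
print("mean of sample proportions:", round(statistics.mean(p_hats), 3))
print("standard deviation of sample proportions:", round(statistics.pstdev(p_hats), 4))
```

The mean of the simulated sample proportions should land very close to the population proportion of 0.30, although the exact numbers depend on the seed and the number of samples.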

The standard deviation of the above distribution of sample proportions is equal to approximately 0.0725.  We explain in our Chapter 18 coverage that the formula you would use to determine the standard deviation of the sampling distribution is:

$$\sqrt{\frac{p(1-p)}{n}}$$

Here, we know our sample size is n = 40 and the claimed population proportion is p = 0.30, so the standard deviation of the sampling distribution is:

$$\sqrt{\frac{0.30(1-0.30)}{40}} = \sqrt{\frac{0.21}{40}} \approx 0.0725$$

How does sample size affect the resulting sampling distribution?  Below, we compare three sampling distributions.  Each distribution is based on taking samples of a particular size from the population of all current OSU students and then computing the proportion of students in each sample who participated in music education while in high school.  We start with a smaller sample size of n = 40 and then increase the sample size, first to n = 70 and then to n = 100.

[Figure: three sampling distributions of the sample proportion, for n = 40, n = 70, and n = 100]
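The standard deviation of each of these three sampling distributions can also be computed directly.  The sketch below assumes the Chapter 18 formula, the square root of p(1 - p)/n, with p = 0.30.  The function name `sampling_sd` is our own.

```python
import math


def sampling_sd(p, n):
    """Standard deviation of the sample proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)


# The three sample sizes compared in this lab, with p = 0.30.
for n in (40, 70, 100):
    print(f"n = {n:3d}: standard deviation = {sampling_sd(0.30, n):.4f}")
```

For n = 40 and n = 100, the printed values should match the 0.0725 and 0.0458 used elsewhere in this handout.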

 

14. As you look at the three distributions presented in the image above, we hope you are noticing that they all have similar shapes, and they all have the same mean.  However, the standard deviation is different for each distribution of sample proportions.  How is the standard deviation changing as the sample size increases, and why do you think it changes in this way?

15. Let’s focus specifically now on the distribution of sample proportions based on samples of size n = 100.  Because that distribution is approximately Normal, with a mean of 0.30 and a standard deviation of 0.0458, we can apply the Empirical Rule. Use that rule to fill in the blanks below.

A. Approximately 68% of the sample proportions are between __________ and __________.

B. Approximately 95% of the sample proportions are between __________ and __________.

C. Approximately 99.7% of the sample proportions are between __________ and __________.

16. Think about some of the examples we worked through in our lecture coverage of Chapter 18 that involved converting a sample statistic to a z-score and then using Table B.   You can even find a similar example in your textbook, in Chapter 18 (see Example 5 in Chapter 18).  Based on what you learned from that coverage, what is the probability of obtaining a sample proportion of 0.24 or smaller from a sample size of n = 100?  As you answer this, remember that the sampling distribution for samples of size n = 100 has a mean of 0.30 and a standard deviation of 0.0458.   Please show your work below as you attempt to answer this question.

17. What is the probability of obtaining a sample proportion of 0.34 or larger from a sample of size n = 100?  Again, please show all work below as you are figuring out the answer to this question, and remember that the sampling distribution for samples of size n = 100 has a mean of 0.30 and a standard deviation of 0.0458.
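For those curious how a computer would handle this kind of z-score calculation, here is a minimal sketch.  It uses a made-up sample proportion of 0.27 (not one of the values asked about above) and computes the area under the Normal curve with Python's `math.erf` instead of Table B.  The function names `z_score` and `normal_cdf` are our own.  Because Table B lists only selected standard scores, a computer value may differ slightly from what you read off the table, and you should still show your own work using Table B.

```python
import math


def z_score(value, mean, sd):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean) / sd


def normal_cdf(z):
    """Proportion of a standard Normal distribution that falls at or below z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


# Sampling distribution for n = 100, as given in this lab.
mean, sd = 0.30, 0.0458

# Hypothetical sample proportion, chosen only for illustration.
z = z_score(0.27, mean, sd)
prob_at_or_below = normal_cdf(z)
prob_at_or_above = 1 - prob_at_or_below

print(f"z = {z:.2f}")
print(f"P(sample proportion <= 0.27) is about {prob_at_or_below:.3f}")
print(f"P(sample proportion >= 0.27) is about {prob_at_or_above:.3f}")
```

Notice that an "or larger" probability is found by subtracting the "or smaller" probability from 1.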

For fun:  If you would like to explore sampling distributions in more detail, check out the Art of Stat sampling distribution of the sample proportion tool.  Here is a link:

https://istats.shinyapps.io/SampDist_Prop/

We used this tool to create some of the images that you have viewed in this lab activity.  Note the tool allows you to set a particular population proportion, to select a sample size, and to draw many samples of your chosen sample size from the population.  We hope you’ll take some time to play with this tool and to notice what happens to the resulting distribution of sample proportions as you vary the size of the sample.