
STAT 207

Exam 1

Use SPSS and the Titanic Passengers.sav data set for all problems unless specifically told otherwise.

1.   Consider the variable embarked (boarding city).

a.   What type of graphical display would be appropriate for this variable?

Bar chart or pie chart.

b.   How many passengers boarded the Titanic in Queenstown?

123

2.   Consider the variable sibsp (number of siblings or spouses on board).

a.   What percentage of passengers in the data set had at least one sibling or spouse on board?

100% - 68.1% = 31.9%

b.   What is the percentile rank associated with a person who had 2 siblings or spouses on board?

92.4%
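Both answers in this problem appear to be read from the Cumulative Percent column of a frequency table. A minimal Python sketch of that calculation follows. The counts are made up for illustration (they are not the Titanic counts), and the percentile rank is assumed to mean the percent of cases at or below the value.

```python
from itertools import accumulate

# Hypothetical frequency table: value -> count (NOT the Titanic counts).
counts = {0: 60, 1: 25, 2: 10, 3: 5}

def cumulative_percent(freq):
    """Return {value: cumulative percent}, like a Cumulative Percent column."""
    values = sorted(freq)
    total = sum(freq.values())
    running = accumulate(freq[v] for v in values)
    return {v: 100.0 * c / total for v, c in zip(values, running)}

cum = cumulative_percent(counts)
at_least_one = 100.0 - cum[0]  # complement of the row for 0
rank_of_two = cum[2]           # percent of cases at or below 2
```

With the real data, cum[0] would be the 68.1% used in part a and cum[2] the 92.4% reported in part b.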

3.   Consider the variable fare.

a.   Properly use SPSS to determine the 80th percentile of the variable fare.

42.40

b.   Using the z-score criterion for outliers, how many outliers are there for the variable fare?

54
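The z-score criterion can be sketched in Python as follows. The exam does not state the cutoff, so the |z| > 3 threshold below is an assumption (a common convention), and the fares list is hypothetical data, not the Titanic file.

```python
from statistics import mean, stdev

def count_z_outliers(values, cutoff=3.0):
    """Count values whose z-score exceeds the cutoff in absolute value.

    cutoff=3.0 is an assumed convention; use the course's own rule if it differs.
    """
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    return sum(1 for x in values if abs((x - m) / s) > cutoff)

# Hypothetical fares: twenty cheap tickets and one very expensive one.
fares = [10.0] * 20 + [500.0]
n_outliers = count_z_outliers(fares)
```

Here only the 500.0 fare lies more than 3 standard deviations from the mean, so it is the single value counted.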

4.   Use the Explore feature to create a comparative boxplot for the variable age that compares distributions of this variable by gender.  Base your answers for the parts of this problem solely on the boxplot.

a.   Based on the plot, which gender’s distribution of age was typically higher?  Cite the specific aspect of the plot that you used as the basis of your answer.

They appear roughly equal, medians = 28.

b.   Based on the plot, which gender has more consistent values of the variable age?  Use the BEST choice of statistic that is evident in the boxplot for your answer.

Best choice is IQR… both look roughly equal to 18.

(Range is higher for males, but range is not the best statistic to use.)

c.   Based only on the plot, describe the shape of the males’ distribution of age.  Cite the aspects of the plot upon which you base your answer.

Positively skewed due to large outliers.

d.   Based only on the plots, which sex has the more skewed distribution for the variable age? Give the reason for your answer.

Males.  More outliers.

5.   Create a stem-and-leaf display for variable fare (cost of ticket).  Answer the following questions based only on this graph.

a.   Refer to a relevant feature of the graph and tell me if the distribution is positively skewed, negatively skewed, or roughly symmetric.

Positively skewed, due to outliers (Extremes).

b.   According to the stem-and-leaf plot alone, what was the highest recorded value of fare?

Can’t be determined, but it is at least $67.

6.   For this problem, use relevant statistics, as in Chapter 3, not plots as the basis of your responses.

a.   Consider the variable age factored by whether the person survived (survived).  Which group is more skewed? Your answer should include the name and value of the statistic upon which you base your response.

Skewness ratio of those who died = skewness / (std. error of skewness) = 5.77

Skewness ratio of those who survived = skewness / (std. error of skewness) = 2.20

The distribution of those who died is more skewed.
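For reference, the skewness ratio can be sketched in Python. This assumes the ratio means the skewness statistic divided by its standard error, and it uses the adjusted sample skewness and standard error formulas commonly documented for SPSS output; treat both as assumptions rather than the course's stated definition. The sample lists are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

def skew_ratio(values):
    """Skewness divided by its standard error (assumed definition)."""
    n = len(values)
    m, s = mean(values), stdev(values)
    # Adjusted sample skewness (assumed to match the SPSS statistic).
    skew = n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in values)
    # Standard error of skewness; it depends only on the sample size n.
    se = sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    return skew / se

symmetric = [1, 2, 3, 4, 5]      # hypothetical, balanced around 3
right_tailed = [1, 1, 1, 2, 10]  # hypothetical, one large value
ratio_symmetric = skew_ratio(symmetric)
ratio_right = skew_ratio(right_tailed)
```

The group with the larger ratio in absolute value is the more skewed one, which is the comparison made in the answer above.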

b.   Consider the variable age again.  Using the best statistic as the basis for your answer, which group, those who died or those who survived, has the higher typical age?  You must name the statistic you use and report its value.

Due to skew, compare medians … both are 28, so equal.

c.   Using the best statistic on which to base the comparison, which group has a more consistent distribution of ages?

Due to skew, compare IQRs… both equal 18, equally consistent.

You do not need SPSS for the rest of the exam.

7.   Two friends taking different sections of the same course wish to determine who did better on the last exam relative to the classmates in their section.  The summary statistics of each of the two sections are given below:

Section 1

Mean = 77

S.D.   = 10

Skew-ratio = 2.50

Section 2

Mean = 70

S.D.  = 20

Skew-ratio = 3.10


a.   Suppose student 1 (from Section 1) scored an 80 on the last exam.  What is this student’s z-score?

z = (score - mean) / S.D. = (80 - 77) / 10 = 0.30

b.   Suppose student 2 (from Section 2) has a z-score of 0.35.  What is this student’s raw exam score?

score = z(S.D.) + mean = 0.35(20) + 70 = 77
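The two conversions in parts a and b are inverses of each other. A short Python sketch, using the section statistics given in the problem (the function names are illustrative only):

```python
def z_score(x, mean, sd):
    """Standardize a raw score: how many SDs it lies from the mean."""
    return (x - mean) / sd

def raw_score(z, mean, sd):
    """Invert the z-score formula to recover the raw score."""
    return z * sd + mean

z_student_1 = z_score(80, mean=77, sd=10)      # Section 1: mean 77, S.D. 10
x_student_2 = raw_score(0.35, mean=70, sd=20)  # Section 2: mean 70, S.D. 20
```

Comparing z-scores rather than raw scores is what lets the two friends judge their results relative to their own sections.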

c.   If the instructor of Section 1 realizes a mistake in his recording of the grades, and changes everyone’s score by adding 3 points, how does this affect the skew-ratio of the distribution of test scores?

It doesn’t.  Adding the same constant to every score shifts the distribution without changing its shape.
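A quick numerical check of this answer, as a Python sketch with hypothetical scores: adding 3 to every score raises the mean by 3 but leaves the standard deviation and every standardized deviation unchanged. Assuming the skew-ratio is built from those standardized deviations and a standard error that depends only on the number of scores, it cannot change either.

```python
from statistics import mean, stdev

def standardized(values):
    """Return each value's deviation from the mean in SD units."""
    m, s = mean(values), stdev(values)
    return [(x - m) / s for x in values]

scores = [55, 62, 70, 71, 75, 78, 80, 96]  # hypothetical exam scores
shifted = [x + 3 for x in scores]          # every score raised by 3 points

mean_change = mean(shifted) - mean(scores)
sd_change = stdev(shifted) - stdev(scores)
max_z_change = max(
    abs(a - b) for a, b in zip(standardized(scores), standardized(shifted))
)
```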