
MA EXAMINATION 2020-2021 Academic Year

Course: Quantitative Methods

Course Code: SEES0083


PART A: Please Answer FOUR out of FIVE Exercises

Exercise 1 (4 points)

The amount of time you have to wait at a dentist's office before you are called in is uniformly distributed between zero and 60 minutes.

[Figure: uniform density with constant height f(x) = 1/60 for x between 0 and 60 minutes]


(a) What is the probability that you have to wait more than 40 minutes? 1/3 = (60-40)/60

(b) What is the probability that you have to wait between 20 and 40 minutes? 1/3 = (40-20)/60

(c) What are the first, second and third quartiles of this uniform distribution? 1st 0.25*60=15; 2nd 0.5*60=30; 3rd 0.75*60=45
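The three answers above can be checked numerically. The following is a minimal sketch, assuming the waiting time is Uniform(0, 60); the helper names `cdf` and `quantile` are illustrative and not part of the exam.

```python
# Uniform(0, 60) waiting time: F(x) = (x - a) / (b - a) on [a, b].
a, b = 0.0, 60.0

def cdf(x):
    """CDF of the uniform distribution on [a, b], clamped to [0, 1]."""
    return min(max((x - a) / (b - a), 0.0), 1.0)

def quantile(p):
    """Inverse CDF: the waiting time below which a fraction p of waits fall."""
    return a + p * (b - a)

p_more_than_40 = 1.0 - cdf(40)        # (60 - 40) / 60
p_between_20_40 = cdf(40) - cdf(20)   # (40 - 20) / 60
quartiles = [quantile(p) for p in (0.25, 0.50, 0.75)]

print(p_more_than_40, p_between_20_40, quartiles)
```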

Exercise 2 (4 points)

Consider the following probability distribution function of the random variable X assuming values 0 to 6 and with the associated probabilities, P(x), written below:

x       0      1      2      3      4      5      6

P(x)    0.07   0.19   0.23   0.17   0.16   0.14   0.04

(a) What is P(X > 0)? 0.93 = 0.19 + 0.23 + 0.17 + 0.16 + 0.14 + 0.04 (equivalently, 1 - P(X=0) = 1 - 0.07)

(b) What is P(1 < X < 3)? 0.23 = P(X=2)

(c) What is P(2 < X ≤ 4)? P(X=3) + P(X=4) = 0.17 + 0.16 = 0.33

(d) What is P(X ≥ 5)? P(X=5) + P(X=6) = 0.14 + 0.04 = 0.18

(e) What is P(X < 6)? 0.96 = 1 - P(X=6) = 1 - 0.04
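Each answer in this exercise is a sum of table entries over the outcomes that satisfy the event. A small sketch of that idea follows; the dictionary `pmf` and the helper `prob` are illustrative names, and the probabilities are the ones given in the table.

```python
# Probability mass function from the exercise table.
pmf = {0: 0.07, 1: 0.19, 2: 0.23, 3: 0.17, 4: 0.16, 5: 0.14, 6: 0.04}

def prob(event):
    """Sum P(x) over the outcomes x for which event(x) is true."""
    return sum(p for x, p in pmf.items() if event(x))

answers = {
    "a": prob(lambda x: x > 0),        # P(X > 0)
    "b": prob(lambda x: 1 < x < 3),    # P(1 < X < 3)
    "c": prob(lambda x: 2 < x <= 4),   # P(2 < X <= 4)
    "d": prob(lambda x: x >= 5),       # P(X >= 5)
    "e": prob(lambda x: x < 6),        # P(X < 6)
}

for part, value in answers.items():
    print(part, round(value, 2))
```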

Exercise 3 (4 points)

In a recent survey of 30 teenagers, 62% of them indicated that they saw a movie within the past month. 75% of those teenagers who saw a movie also went out to dinner in the past month, while only 64% of the teenagers who had not seen a movie had been out to dinner in the past month. Define the random variables as follows:

X = 1 if teenager had been to movie; X = 0 otherwise

Y = 1 if teenager had been out to dinner; Y = 0 otherwise

(a) Find the joint probability function of X and Y.

(b) Find the conditional probability function of X, given Y = 1.

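Each joint probability is P(X = x) multiplied by P(Y = y | X = x), e.g. P(X=1, Y=1) = 0.62*0.75 = 0.465.

X\Y                 Out to dinner: yes (Y=1)    Out to dinner: no (Y=0)    Marginal

Movie: yes (X=1)    0.4650                      0.1550                     0.62

Movie: no (X=0)     0.2432                      0.1368                     0.38

Marginal            0.7082                      0.2918                     1.00

P(X=1|Y=1) = 0.4650/0.7082 ≈ 0.657; P(X=0|Y=1) = 0.2432/0.7082 ≈ 0.343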
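The joint and conditional probabilities follow from the multiplication rule P(X = x, Y = y) = P(X = x) P(Y = y | X = x). The sketch below works this through from the three percentages stated in the exercise; the variable names are illustrative.

```python
# Figures stated in the exercise.
p_movie = 0.62                    # P(X = 1)
p_dinner_given_movie = 0.75       # P(Y = 1 | X = 1)
p_dinner_given_no_movie = 0.64    # P(Y = 1 | X = 0)

# Joint probability function, keyed by (x, y).
joint = {
    (1, 1): p_movie * p_dinner_given_movie,
    (1, 0): p_movie * (1 - p_dinner_given_movie),
    (0, 1): (1 - p_movie) * p_dinner_given_no_movie,
    (0, 0): (1 - p_movie) * (1 - p_dinner_given_no_movie),
}

# Marginal P(Y = 1), then the conditional distribution of X given Y = 1.
p_y1 = joint[(1, 1)] + joint[(0, 1)]
cond_x_given_y1 = {x: joint[(x, 1)] / p_y1 for x in (0, 1)}

print(joint)
print(p_y1, cond_x_given_y1)
```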

Exercise 4 (4 points)

A random variable x has the following probability density function:

f(x) = 0.25x         for 0 < x < 2

f(x) = 1 - 0.25x     for 2 ≤ x < 4

f(x) = 0             otherwise.

a)   Graph the probability density function for X.

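[Figure: triangular density rising linearly from f(0) = 0 to a peak of f(2) = 0.5, then falling linearly to f(4) = 0; horizontal axis x with ticks at 0, 2 and 4]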

b)  Show that the density function has the properties of a probability density function (hint: what would be the area underneath the density function? Which are the boundaries of this area?).

•   for example, by measuring the area below the triangle geometrically, [(4-0)*0.5]/2=1

•   by using the mathematical integral: $\int_{-\infty}^{+\infty} f(x)\,dx = \int_0^4 f(x)\,dx = \int_0^2 0.25x\,dx + \int_2^4 (1 - 0.25x)\,dx = \left[\frac{0.25x^2}{2}\right]_0^2 + \left[x - \frac{0.25x^2}{2}\right]_2^4 = \frac{0.25 \cdot 2^2}{2} - 0 + \left[4 - \frac{0.25 \cdot 4^2}{2} - 2 + \frac{0.25 \cdot 2^2}{2}\right] = 0.5 + 0.5 = 1$

•   The function takes the value 0 for x ≤ 0 and for x ≥ 4, but the boundaries are -infinity and +infinity. In other words, the function is well defined for x ≤ 0 and for x ≥ 4 too, not only for x between 0 and 4

c)  Find the probability that X takes value between 1 and 3.

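Students should compute the area between 1 and 3. Result: 0.75.

•   Geometrically, the area between 1 and 2 is a rectangle plus a triangle, and by symmetry the area between 2 and 3 is the same: (1*0.25 + 1*0.25/2)*2 = 0.75

•   With integrals, $\int_1^3 f(x)\,dx = \int_1^2 0.25x\,dx + \int_2^3 (1 - 0.25x)\,dx = \left[\frac{0.25x^2}{2}\right]_1^2 + \left[x - \frac{0.25x^2}{2}\right]_2^3 = \left(\frac{0.25 \cdot 2^2}{2} - \frac{0.25 \cdot 1^2}{2}\right) + \left[3 - \frac{0.25 \cdot 3^2}{2} - 2 + \frac{0.25 \cdot 2^2}{2}\right] = 0.375 + 0.375 = 0.75$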
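The two areas in this exercise can also be checked numerically. The sketch below uses a midpoint rule, which is a swapped-in numerical technique rather than the exam's method, and assumes the density is 0.25x up to x = 2 and 1 - 0.25x from 2 to 4; the function names are illustrative.

```python
def f(x):
    """Triangular density from the exercise (zero outside 0 < x < 4)."""
    if 0 < x < 2:
        return 0.25 * x
    if 2 <= x < 4:
        return 1 - 0.25 * x
    return 0.0

def integrate(func, lo, hi, steps=4000):
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    h = (hi - lo) / steps
    return sum(func(lo + (i + 0.5) * h) for i in range(steps)) * h

total_area = integrate(f, 0, 4)      # part b): should be 1
p_between_1_3 = integrate(f, 1, 3)   # part c): should be 0.75

print(round(total_area, 6), round(p_between_1_3, 6))
```

Because the density is piecewise linear and the kink at x = 2 falls on a cell boundary, the midpoint rule is essentially exact here.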

Exercise 5 (4 points)

An administrator in charge of undergraduate education on a large campus wants to estimate the average number of books required by instructors. Using bookstore data, he drew a random sample of 26 courses for which he obtained a sample mean of 4 books and a sample Standard Deviation (not Standard Error) of 0.6 (the SD of the population is unknown, so the students should use the Student's t distribution, not the z-score). Construct a 95% confidence interval to estimate (i.e. inference) the mean number of books assigned by instructors on campus.

The degrees of freedom are 25 (n - 1 = 26 - 1), which is available as a row in the table from the book; the column is 0.025 (being (1-.95)/2)

Table result 2.060 (row 25 and column 0.025)

Confidence Interval = 4 +/- 2.060(0.6/√26) = 4 +/- 2.060*0.118 = (3.76, 4.24)
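The interval above can be reproduced in a few lines. This sketch takes the critical value 2.060 directly from the table lookup described in the answer, rather than computing it; the variable names are illustrative.

```python
import math

n, sample_mean, sample_sd = 26, 4.0, 0.6
t_crit = 2.060  # table value quoted in the answer: 25 df, upper-tail area 0.025

standard_error = sample_sd / math.sqrt(n)   # s / sqrt(n)
margin = t_crit * standard_error
ci = (sample_mean - margin, sample_mean + margin)

print(round(standard_error, 3), tuple(round(v, 2) for v in ci))
```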

Exercise 6 (4 points)

A local traffic enforcement department attempted to estimate the average rate of speed (μ) of vehicles along a strip of Fantasy Street. With hidden radar, the speed of a random selection of 250 vehicles was measured, which yielded a sample mean of 50 mph and a SD of 15 mph (the SD of the population is unknown, so the students should use the Student's t distribution, not the z-score).

Estimate the standard error of the mean

SE (of the mean) =SD(sample)/√n, i.e. s/√n=15/√250=0.949

(a) Find the 90% Confidence Interval for the population mean

The degrees of freedom are 249; the closest row in the table from the book is infinity, and the column is 0.05 (being (1-.90)/2), which gives 1.645 (we can allow students to have some small decimal inconsistencies here)

Confidence Interval=50 +/- 1.645*0.949= 50 +/- 1.561105= (48.438895, 51.561105)

(b) Find the 95% Confidence Interval for the population mean

The degrees of freedom are 249; the closest row in the table from the book is infinity, and the column is 0.025 (being (1-.95)/2), which gives 1.960 (we can allow students to have some small decimal inconsistencies here)

Confidence Interval=50 +/- 1.960*0.949= 50 +/- 1.86004= (48.13996, 51.86004)

(c) Find the 98% Confidence Interval for the population mean

The degrees of freedom are 249; the closest row in the table from the book is infinity, and the column is 0.01 (being (1-.98)/2), which gives 2.326 (we can allow students to have some small decimal inconsistencies here)

Confidence Interval=50 +/- 2.326*0.949= 50 +/- 2.207374= (47.792626,52.207374)

(d) Find the 99% Confidence Interval for the population mean

The degrees of freedom are 249; the closest row in the table from the book is infinity, and the column is 0.005 (being (1-.99)/2), which gives 2.576 (we can allow students to have some small decimal inconsistencies here)

Confidence Interval=50 +/- 2.576*0.949= 50 +/- 2.444624= (47.555376, 52.444624)

Is there a lot of difference between c) and d)? Why? Explain briefly.

There is not much difference: the change in the Confidence Interval is minimal because the distribution is very "thin" in the tails, which translates into very small probability mass there. Note that we are using a probability distribution with this characteristic, but this is NOT necessarily always the case.
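The four intervals can be recomputed together to see how slowly the width grows between 98% and 99%. This sketch assumes, as the answers do, that with 249 degrees of freedom the "infinity" row of the t table (i.e. the standard normal quantile) is an acceptable critical value; the names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, sample_mean, sample_sd = 250, 50.0, 15.0
standard_error = sample_sd / sqrt(n)

critical = {}
intervals = {}
for level in (0.90, 0.95, 0.98, 0.99):
    # Two-sided interval: put (1 - level) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * standard_error
    critical[level] = z
    intervals[level] = (sample_mean - margin, sample_mean + margin)

for level, (lo, hi) in intervals.items():
    print(f"{level:.0%}: z = {critical[level]:.3f}, CI = ({lo:.3f}, {hi:.3f})")
```

The bounds differ from the hand calculation only in the later decimals, because the hand calculation rounds the standard error to 0.949 and the critical values to three decimals.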

PART B: answer all questions (total 60 points).

1.   Open STATA.

2.   Open the dataset EXAM.dta (located on the SEES0083 Moodle page)

3.   Set a seed that is composed in this way and in this sequence

•   Your own year of birth (4-digit)

•   your own month of birth (2-digit)

•   your own day of birth (2-digit)

Example: if the date of birth is 1972 May the 9th

Into STATA type: set seed 19720509

•   generate a random number

Into STATA type: generate u = runiform()

•   drop the observations in which the newly created variable "u" (it should be in your list of variables) is greater than or equal to 0.5

Into STATA type: drop if u>=0.5

Congratulations! You have now created a unique dataset for your exercise (in other words, no other student has the same dataset) in which you need to perform the tasks below.

Exercise 7 (20 points)

A researcher is trying to understand whether the average sales in Spain (ES) are comparable to the average sales in Italy (IT). For this purpose, she is using your firm-level database. The variable for sales is called "r_OperRevTurnThEur" (expressed in '000 of Euros) and the variable for the number of employees is called "numberofemployees". The first output analysed by the researcher is produced by the following command (summary stats for the two variables only for Spain):

Into STATA type: sum r_OperRevTurnThEur numberofemployees if country_ACRONYM=="ES"

•   (1 point) Copy and paste the result into the exam sheet.

SPAIN


Variable |        Obs        Mean    Std. Dev.       Min        Max

-------------+---------------------------------------------------------

r_OperRevT~r |     13,669    7323.658    61903.99   .9090909    2992946

numberofem~s |     12,770    29.23759    143.8255          1       8331

The second output analysed by the researcher is produced by the following command (summary stats for the two variables only for Italy):

Into STATA type: sum r_OperRevTurnThEur numberofemployees if country_ACRONYM=="IT"

•   (1 point) Copy and paste the result into the exam sheet.

ITALY

Variable |        Obs        Mean    Std. Dev.       Min        Max

-------------+---------------------------------------------------------

r_OperRevT~r |     17,046    8252.208    66708.15   .9165903    4054171

numberofem~s |      9,206    44.53237    290.8417          1      13978

•   How would the researcher interpret the output of these tables?

(1 point) Can the researcher conclude that there is a systematic (statistically significant) difference between the sales in Spain and Italy? No, she cannot conclude there is a systematic difference just by looking at the raw comparison of the means. The comparison of the means does not take into account the distributions in the two samples.

(1 point) Can the researcher conclude that there is a systematic (statistically significant) difference between the number of employees in Spain and Italy? No, she cannot conclude there is a systematic difference just by looking at the raw comparison of the means. The comparison of the means does not take into account the distributions in the two samples.

•   The researcher runs a more appropriate test, i.e. a two-sample t-test for the comparison of the mean of sales in Spain (ES) vs. Italy (IT). The researcher sets her alpha at 0.01 (1%)

Into STATA type: ttest r_OperRevTurnThEur, by(country_ACRONYM)

•   (1 point) Copy and paste the result into the exam sheet.

Two-sample t test with equal variances

------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
      ES |  13,669    7323.658    529.4809    61903.99    6285.803    8361.513
      IT |  17,046    8252.208    510.9373    66708.15    7250.718    9253.698
combined |  30,715    7838.978    368.6867    64614.91    7116.337    8561.619
---------+--------------------------------------------------------------------
    diff |           -928.5497    741.8641               -2382.634    525.5345

    diff = mean(ES) - mean(IT)                                    t =  -1.2516
Ho: diff = 0                                     degrees of freedom =    30713

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.1054         Pr(|T| > |t|) = 0.2107          Pr(T > t) = 0.8946
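The t statistic in this output can be reconstructed from the summary statistics alone. The sketch below applies the pooled-variance (equal variances) formula to the Obs, Mean and Std. Dev. values reported for ES and IT. The p-value uses a normal approximation, which is an assumption that is reasonable only because there are 30,713 degrees of freedom; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics copied from the Stata output for sales.
n_es, mean_es, sd_es = 13669, 7323.658, 61903.99
n_it, mean_it, sd_it = 17046, 8252.208, 66708.15

df = n_es + n_it - 2

# Pooled variance under the equal-variances assumption.
pooled_var = ((n_es - 1) * sd_es**2 + (n_it - 1) * sd_it**2) / df
se_diff = sqrt(pooled_var) * sqrt(1 / n_es + 1 / n_it)

diff = mean_es - mean_it
t_stat = diff / se_diff

# Two-sided p-value via the normal approximation (assumption: df is very large).
p_two_sided = 2 * (1 - NormalDist().cdf(abs(t_stat)))

print(df, round(se_diff, 3), round(t_stat, 4), round(p_two_sided, 4))
```

With alpha set at 0.01, a two-sided p-value of about 0.21 means the null hypothesis of equal mean sales is not rejected.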

•      (3 points) What is the t-stat in this test?  -1.2516

•      (3 points) What are the three null hypotheses and the three alternative hypotheses?







diff <= 0.