Economics 502 - Fall 2022

R project 3

R tutorial & questions: Turn in a printout of your R commands and your answers to the questions.

- To count the number of combinations of n items taken k at a time, use the choose function: choose(n, k)

> choose(5, 3) # How many ways can we select 3 items from 5 items?

[1] 10
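
As a quick check on what choose computes, the count can be reproduced from the formula n! / (k! (n - k)!). This is a minimal sketch; the variable names are chosen here for illustration only, and prod(1:n) is used to write out n!.

```r
# choose(n, k) counts unordered selections: n! / (k! * (n - k)!)
n <- 5
k <- 3
by_formula <- prod(1:n) / (prod(1:k) * prod(1:(n - k)))  # 120 / (6 * 2)
by_choose  <- choose(n, k)
by_formula == by_choose   # TRUE: both are 10
```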

- To generate all combinations of n items taken k at a time, use the combn function: combn(n, k)

The following generates all combinations of the numbers 1 through 5 taken three at a time; each column of the result is one combination:

> combn(1:5, 3)

     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]    1    1    1    1    1    1    2    2    2     3
[2,]    2    2    2    3    3    4    3    3    4     4
[3,]    3    4    5    4    5    5    4    5    5     5
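
Since combn returns one combination per column of a matrix, the number of columns should agree with choose. The sketch below checks this and also uses combn's FUN argument to apply a function (here sum, picked only as an example) to every combination.

```r
combos <- combn(1:5, 3)          # 3 x 10 matrix, one combination per column
ncol(combos) == choose(5, 3)     # TRUE: 10 combinations
combos[, 1]                      # the first combination: 1 2 3

# FUN is applied to each combination in turn; here, the sum of its members
sums <- combn(1:5, 3, FUN = sum)
sums                             # 6 7 8 8 9 10 9 10 11 12
```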

Question #1

a) In how many ways can we pick 5 students out of 40 students?

b) Of the 40 students, 25 are 22 years old and 15 are older than 22. What is the probability that 3 of the 5 selected students are 22 years old and 2 are older than 22?

- Common discrete distributions

Discrete distribution   R name   Parameters
Binomial                binom    n = number of trials (R argument size); p = probability of success for one trial (R argument prob)
Geometric               geom     p = probability of success for one trial (R argument prob)
Hypergeometric          hyper    m = number of white balls in urn; n = number of black balls in urn; k = number of balls drawn from urn
Negative binomial       nbinom   size = number of successful trials; either prob = probability of successful trial or mu = mean
Poisson                 pois     lambda = mean
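
The R names in the table are not functions on their own: R prepends a letter to each, such as r for random draws or d for the probability mass function. As an illustration of how the listed parameters are passed, the sketch below evaluates one probability each with dbinom and dhyper and compares it with the counting formula. The numbers are made up for the example.

```r
# Binomial: probability of exactly 3 successes in 5 trials with p = 0.5
p_binom <- dbinom(3, size = 5, prob = 0.5)
all.equal(p_binom, choose(5, 3) * 0.5^5)    # TRUE (both 0.3125)

# Hypergeometric: urn with m = 3 white and n = 2 black balls, k = 2 drawn;
# probability that exactly 1 of the drawn balls is white
p_hyper <- dhyper(1, m = 3, n = 2, k = 2)
all.equal(p_hyper, choose(3, 1) * choose(2, 1) / choose(5, 2))   # TRUE (both 0.6)
```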

- The uniform distribution is a continuous distribution.

Its R name is unif, with parameters min = lower limit and max = upper limit.

- Generating Random Numbers in R:

R can generate random numbers from different distributions. For a given distribution, the name of the random number generator is “r” prefixed to the distribution’s abbreviated name (for example, runif for the uniform distribution). Here are some examples:

> runif(1) # creates one realization of uniform distribution between 0 and 1.

[1] 0.05024443

> runif(10) # creates 10 realizations of uniform distribution between 0 and 1.

[1] 0.82300291 0.37334368 0.09734816 0.10543026 0.08772198 0.35740546
[7] 0.12568500 0.58146755 0.07868367 0.12366968

> runif(4, min=-3, max =3) # creates four realizations of uniform distribution between -3 and 3.

[1]  0.7591449 -2.0561143  1.0841136  1.3391422

> rnorm(5, mean=100, sd=15) # creates 5 realizations of normal distribution with mean 100 & standard deviation 15

[1] 114.04903  75.87155 126.50714  99.84988 115.27328

> rbinom(6, size=1, prob=0.5) # one Bernoulli trial (probability of success = 0.5), repeated 6 times; each time the number of successes is recorded.

[1] 0 0 0 1 0 1

> rbinom(6, size=10, prob=0.5) # 10 Bernoulli trials (probability of success = 0.5), repeated 6 times; each time the number of successes is recorded.

[1] 3 5 4 6 4 4

> rpois(4, lambda=10) # creates 4 realizations of Poisson distribution with mean 10

[1]  6 10 11  4
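
Because these generators are random, the outputs shown above will differ from run to run. Calling set.seed before generating makes a sequence of draws reproducible. The sketch below illustrates this; the seed value 42 is arbitrary.

```r
set.seed(42)                 # fix the generator state (any integer works)
first  <- runif(3)
set.seed(42)                 # reset to the same state
second <- runif(3)
identical(first, second)     # TRUE: same seed, same draws

third <- runif(3)            # no reset, so the random stream simply continues
identical(first, third)      # FALSE
```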

Question #2

a) Generate 100 random numbers from a Bernoulli distribution with probability of success equal to 0.4. Use the sum command to calculate the proportion of successes in the 100 trials. Is it close to 0.4?

b) Repeat (a) with 1,000 and 10,000 trials, and each time calculate the proportion of successes. Is it getting closer to 0.4?

- The command rle(x)

rle(x) stands for 'run length encoding'. A run means a streak of repeats of the same number. It will be easiest to explain what rle means through an example.

First, let's make a small sequence where we can see the runs:

> x = c(1,1,1,2,3,3,3,1,1)

We can describe this sequence as: three 1's, then one 2, then three 3's, and then two 1's. This is exactly what rle(x) shows us:

> y = rle(x)

> y

Run Length Encoding

lengths: int [1:4] 3 1 3 2 # The lengths vector shows the lengths of the runs of each value.

values : num [1:4] 1 2 3 1 # The values vector shows the values in the order they appeared.

To pick out just the lengths vector you use the syntax y$lengths

> y$lengths

[1] 3 1 3 2

Let's look for streaks in a sequence of Bernoulli trials.

# We simulate 20 Bernoulli(.5) trials using rbinom(20,1,.5).

> set.seed(1)

> y = rbinom(20,1,.5)

y is a vector of 0's and 1's of length 20. We can use rle() to find the length of the longest run in y.

> max(rle(y)$lengths)

We can count the number of runs of more than 3.

> sum(rle(y)$lengths > 3)

We can count the number of runs of exactly length 3.

> sum(rle(y)$lengths == 3)
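
The lengths and values components can also be combined, for instance to find which value forms the longest run. The short sketch below uses the small sequence x from earlier so the results can be checked by eye; the names r and longest are illustrative.

```r
x <- c(1, 1, 1, 2, 3, 3, 3, 1, 1)
r <- rle(x)

max(r$lengths)                     # 3: the longest run has length 3
longest <- which.max(r$lengths)    # index of the first longest run
r$values[longest]                  # 1: that run consists of 1's
sum(r$lengths >= 3)                # 2: two runs of length 3 or more

# inverse.rle() rebuilds the original vector from the encoding
identical(inverse.rle(r), x)       # TRUE
```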

Question #3

In 100 coin tosses, what is the probability of having the same side come up 10 times in a row?

To answer this question, use the R commands for (i in 1:n), max, rbinom, rle, and $lengths. Set the number of simulations to 100,000.
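
The commands listed above fit together in a simulation loop. The sketch below shows the pattern on a deliberately smaller, different problem: a streak of at least 5 in 20 tosses, with 2,000 simulations. The variable names, these numbers, and the "at least" reading of a streak are assumptions made for illustration, not the settings the question asks for.

```r
set.seed(1)                          # arbitrary seed, for reproducibility
n_sims   <- 2000                     # number of simulated experiments
n_tosses <- 20                       # tosses per experiment
streak   <- 5                        # streak length of interest

hits <- 0
for (i in 1:n_sims) {
  tosses <- rbinom(n_tosses, size = 1, prob = 0.5)   # 0 = tails, 1 = heads
  # rle() reports runs of either value, so heads and tails streaks both count
  if (max(rle(tosses)$lengths) >= streak) {
    hits <- hits + 1
  }
}
hits / n_sims                        # estimated probability of such a streak
```

The estimate is the fraction of simulated experiments in which the longest run reached the target length, so it becomes more stable as the number of simulations grows.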