
Statistics 5350/7110 Assignment 4

Published: 2022-12-07



21 November 2022

Assignment 4—due 6 December at 11:59 pm

This problem deals with the daily TSA throughput at U.S. airports for the period 14 February 2021 to 20 November 2021, a period of 40 weeks.  The data are in the file TSAnew4.txt.

1.  Begin by giving plots of the daily TSA throughput and the log daily TSA throughput.  Discuss the plots in detail, and relate your discussion to the effects that COVID had on TSA passenger throughput during 2021.

2.  Plot the spectral density of the log daily TSA throughput data.  Explain carefully what the plot reveals about the structure of the data.
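As a hedged illustration of what a weekly pattern looks like in the frequency domain, the sketch below applies base R's `spec.pgram` to simulated data, not to TSAnew4.txt. The series `x` and its day-of-week effects are invented for the example. With daily data and a 7-day cycle, peaks are expected near 1/7, 2/7, and 3/7 cycles per day.

```r
# Simulated stand-in for a log daily series: 40 weeks, weekly pattern plus noise.
# (Hypothetical data; replace x with the log throughput series.)
set.seed(1)
n <- 280
dow_effect <- c(0.10, 0.02, -0.15, -0.10, 0.03, 0.06, 0.04)  # sums to 0
x <- rep(dow_effect, n / 7) + rnorm(n, sd = 0.03)

# Smoothed periodogram; detrend = TRUE removes a linear trend first.
sp <- spec.pgram(x, spans = c(3, 3), taper = 0.1, detrend = TRUE, plot = FALSE)
plot(sp$freq, sp$spec, type = "l", log = "y",
     xlab = "Frequency (cycles per day)", ylab = "Spectral density")
abline(v = (1:3) / 7, lty = 2)  # weekly frequency and its two harmonics

# Frequency at which the estimated spectral density is largest
peak_freq <- sp$freq[which.max(sp$spec)]
```

The choice of `spans` controls the amount of smoothing and is a judgment call; narrower spans give sharper but noisier peaks.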

3.  Fit a regression model to the log data, to account for trend, seasonality, and any outlier values which require attention.  Interpret your fitted model, and present static seasonal estimates with a table and a plot.  What do you conclude from the seasonality estimation about the days during which passengers tend to embark on and return from airline flights?
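One possible shape for such a regression is sketched below on simulated data. The linear trend, the single outlier dummy `out100`, and all variable names are assumptions made for the illustration, not the required specification. Sum-to-zero contrasts are used so that the day-of-week coefficients can be read directly as static seasonal estimates on the log scale.

```r
# Simulated stand-in data (hypothetical): linear trend + day-of-week effects + noise.
set.seed(2)
n <- 280
time_idx <- 1:n
day_names <- c("Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat")
day <- factor(rep(day_names, n / 7), levels = day_names)
true_s <- c(0.10, 0.02, -0.15, -0.10, 0.03, 0.06, 0.04)   # sums to 0
y <- 13.5 + 0.002 * time_idx + true_s[as.integer(day)] + rnorm(n, sd = 0.03)
y[100] <- y[100] - 0.5                  # one artificial outlier
out100 <- as.numeric(time_idx == 100)   # dummy variable for that day

# Sum-to-zero contrasts: seasonal coefficients are deviations from the overall level.
fit <- lm(y ~ time_idx + day + out100, contrasts = list(day = "contr.sum"))

# contr.sum reports 6 coefficients (day1..day6); the 7th is minus their sum.
s6 <- coef(fit)[paste0("day", 1:6)]
seas_log <- c(s6, -sum(s6))
names(seas_log) <- day_names
seas_index <- exp(seas_log)             # multiplicative indices on the original scale

print(round(seas_index, 3))             # table of static seasonal indices
barplot(seas_index - 1, ylab = "Seasonal index minus 1")
```

An index above 1 marks a day with above-average throughput after allowing for trend; the real data may call for a higher-order trend or several outlier dummies.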

4.  Present a thorough residual analysis of your regression model.  What do you conclude?
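A typical set of residual diagnostics in base R is sketched below. The residuals here are simulated AR(1) noise standing in for `resid(regmodel)`, where `regmodel` is a hypothetical name for your fitted regression; the lags shown are arbitrary choices.

```r
# Hypothetical residuals: AR(1) noise standing in for resid(regmodel).
set.seed(3)
res <- as.numeric(arima.sim(list(ar = 0.6), n = 280, sd = 0.03))

par(mfrow = c(2, 2))
plot(res, type = "l", xlab = "Day", ylab = "Residual")  # time plot
acf(res, lag.max = 28)                                   # autocorrelations
pacf(res, lag.max = 28)                                  # partial autocorrelations
qqnorm(res); qqline(res)                                 # normal quantile plot

# Ljung-Box test of no autocorrelation up to lag 14
lb <- Box.test(res, lag = 14, type = "Ljung-Box")
lb$p.value
```

A small p-value, as these autocorrelated residuals should produce, indicates that the regression has left time series structure unmodeled.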

5.  Fit a seasonal ARIMAX model to the log data.  Describe your fitted model carefully.  Perform a thorough residual analysis of your model.
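The sketch below shows one way to fit a seasonal ARIMA model with a regressor using `arima` from base R. The data are simulated from an ARIMA(1,0,0)(0,1,1) model with period 7, and the orders, the outlier dummy, and the names `logx` and `outlier` are assumptions for the illustration; the orders suited to the TSA data must be chosen from the data.

```r
# Simulate from a seasonal ARIMA(1,0,0)(0,1,1)[7] model (hypothetical data).
set.seed(4)
n <- 280
true_s <- c(0.10, 0.02, -0.15, -0.10, 0.03, 0.06, 0.04)
u <- as.numeric(arima.sim(list(ar = 0.5), n = n, sd = 0.02))
w <- u[8:n] - 0.6 * u[1:(n - 7)]                   # seasonal MA(1), Theta = -0.6
logx <- diffinv(w, lag = 7, xi = 13.5 + true_s)    # undo the seasonal difference
logx[100] <- logx[100] - 0.5                       # one artificial outlier
outlier <- as.numeric(seq_len(n) == 100)

# ARIMAX fit: the outlier dummy enters through xreg.
fit <- arima(logx, order = c(1, 0, 0),
             seasonal = list(order = c(0, 1, 1), period = 7),
             xreg = cbind(outlier = outlier), method = "ML")
print(fit)

pred <- logx - resid(fit)    # fitted (one-step predicted) values
# Ljung-Box test on the residuals, adjusting for the 2 ARMA parameters
Box.test(resid(fit), lag = 14, type = "Ljung-Box", fitdf = 2)
```

The `pred` line mirrors the expression `logTravelers-resid(arimaxmodel)` used in the code for question 7.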

6.  Compare the goodness of fit of the regression and ARIMAX models.  Give a discussion.

7.  Explore dynamic seasonality for these data, as in recent notes.  Give a thorough discussion of what you find.  You may use the following code.  Be careful—this code differs somewhat from that used in some examples in the notes; the reason is that differencing is used to remove the trend of the ARIMAX predicted values.

> #drop the first week of ARIMAX predicted values

> y<-(logTravelers-resid(arimaxmodel))[8:280]

> #next, difference to remove trend, losing one data point, for Sunday of the original second week; drop the remaining 6 days of this second week; this leaves 266 points, 38 weeks

> y<-diff(y)[7:272]

> #to start, for each week adjust the differences to add to 0

> #use the adjusted differences to construct seasonal estimates

> seasm<-matrix(rep(0,266),ncol=38)

> j <- -6

> for(ii in 1:38){

+ j<-j+7;j2<-j+6

+ y[j:j2]<-y[j:j2]-mean(y[j:j2])

+ #construct S7

+ j1<-j+1

+ seasm[7,ii]<-0

+ for(i in j1:j2){

+ sub<-y[i:j2]

+ seasm[7,ii]<-seasm[7,ii]+sum(sub)

+ }

+ seasm[7,ii]<-seasm[7,ii]/7

+ #find other S values

+ j3<-j+5

+ ir<-0

+ for(i in j:j3){

+ ir<-ir+1

+ sub<-y[j:i]

+ seasm[ir,ii]<-seasm[7,ii]+sum(sub)

+ }

+ }

> #static seasonal

> seasstatic<-rowMeans(seasm)-mean(rowMeans(seasm))

> seasstatic<-exp(seasstatic)

To print the plots:

> week<-seq(1,38)

> seasstaticm<-matrix(rep(seasstatic,38),ncol=38)

> name<-c("Sunday","Monday","Tuesday","Wednesday","Thursday","Friday","Saturday")

> par(mfrow=c(3,3))

> for(i in 1:7){

+ plot(week,exp(seasm)[i,],xlab="Week",ylab="Indices",main=name[i],type="l",lwd=2,col="red")

+ lines(week,seasstaticm[i,],lty=1,lwd=2,col="blue")

+ }
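The seasonal construction in the code for question 7 can be checked on a small example. Within one week the centered differences are d1 = S1 - S7, d2 = S2 - S1, ..., d7 = S7 - S6, and the seasonal values are required to sum to 0. The sketch below uses invented seasonal values `S` (an assumption for illustration), rebuilds them from their differences with the same S7 formula as the loop, and compares the result with an equivalent shortcut based on `cumsum`.

```r
# Hypothetical seasonal effects for one week (Sunday..Saturday), summing to 0
S <- c(0.10, 0.02, -0.15, -0.10, 0.03, 0.06, 0.04)

# Day-to-day differences, starting with Sunday minus the previous Saturday
# (the previous Saturday is taken to have the same effect as this Saturday)
d <- diff(c(S[7], S))
d <- d - mean(d)             # adjust to add to 0 (already true here)

# Construction as in the assignment code: S7 first, then S1..S6
S7 <- 0
for (i in 2:7) S7 <- S7 + sum(d[i:7])
S7 <- S7 / 7
S_loop <- c(S7 + cumsum(d)[1:6], S7)

# Equivalent shortcut: cumulative sums re-centered to average 0
S_short <- cumsum(d) - mean(cumsum(d))

all.equal(S_loop, S)
all.equal(S_short, S)
```

Both comparisons should agree with `S` up to floating-point error, which confirms that the double loop is solving for the unique sum-to-zero seasonal values implied by the weekly differences.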

8.  Present some closing remarks to discuss how daily TSA throughput evolved during 2021.  What are some factors which were responsible for the fluctuation over time?