
STATS 787 TERM TEST

1. [10 marks]

(a) [5 marks]

NOTE that this would require a geom_col rather than a geom_bar because the data set already contains the counts. Also note that the x-axis scale is x_continuous, even though it represents a date, because the values are just week numbers.

geom: col

aesthetics: x = Week, y = Count

stat: identity

coord: cartesian

scale: x = x_continuous, y = y_continuous

 


(b) [5 marks]

There is no need to provide code to generate the data set.

library(ggplot2)

ggplot(scotland) +
    geom_col(aes(x = Week, y = Count)) +
    ggtitle("Deaths per week involving Covid-19") +
    ylab(NULL) +
    scale_x_continuous(breaks = 12:37) +
    scale_y_continuous(expand = expansion(add = 0))
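The plot code in part (b) assumes a data frame called scotland with columns Week and Count. To try the code, a stand-in data frame is enough. The following is a minimal sketch: the counts are placeholder values generated from a formula so that the code runs, not the real figures used in the test. The last lines check that geom_col uses the identity stat, so the bar heights are the Count values themselves.

```r
library(ggplot2)

# PLACEHOLDER data (an assumption): counts come from a formula, not from
# the real data set used in the test.
scotland <- data.frame(Week = 12:37)
scotland$Count <- round(600 * exp(-((scotland$Week - 16) / 4)^2))

p <- ggplot(scotland) +
    geom_col(aes(x = Week, y = Count)) +
    ggtitle("Deaths per week involving Covid-19") +
    ylab(NULL) +
    scale_x_continuous(breaks = 12:37) +
    scale_y_continuous(expand = expansion(add = 0))

# geom_col() uses stat "identity": the drawn heights equal Count,
# with no counting of rows as geom_bar() would do by default.
built <- layer_data(p)
stopifnot(isTRUE(all.equal(built$y, scotland$Count)))

print(p)
```

Replacing geom_col with geom_bar(aes(x = Week)) would instead count rows per week, which gives bars of height 1 for data that is already aggregated.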


 

 

 

2. [10 marks]

The legend can be drawn using a layout of 3 rows and 2 columns, with the left column being fixed width (e.g., 1 inch) and the right column filling the remaining space ("null" units).

We push a viewport with that layout and then a subsequent viewport for each “cell” to draw the individual components of the legend.

In the first column, there are two rectangle grobs, one filled with a dark blue and the other grey, plus a segments grob, with a thick dashed line style.

The text is all left-justified.

This is a fair bit of code, but it has a simple, repetitive structure.

library(grid)

grid.newpage()

blue <- "#284f99"
grey <- "#b9b9b9"

# 3 rows by 2 columns: fixed-width key column, text column fills the rest
lay <- grid.layout(3, 2,
                   widths = unit(c(1.25, 1), c("in", "null")),
                   heights = unit(2, "lines"))

pushViewport(viewport(layout = lay))

# Column 1: the keys (two rectangles and a dashed line)
pushViewport(viewport(layout.pos.col = 1, layout.pos.row = 1))
grid.rect(width = .8, height = unit(1, "lines"),
          gp = gpar(col = blue, fill = blue))
popViewport()

pushViewport(viewport(layout.pos.col = 1, layout.pos.row = 2))
grid.rect(width = .8, height = unit(1, "lines"),
          gp = gpar(col = grey, fill = grey))
popViewport()

pushViewport(viewport(layout.pos.col = 1, layout.pos.row = 3))
grid.segments(.1, .5, .9, .5,
              gp = gpar(lwd = 3, lty = "dashed", lineend = "butt"))
popViewport()

# Column 2: the left-justified labels
pushViewport(viewport(layout.pos.col = 2, layout.pos.row = 1))
grid.text("Deaths involving COVID-19", x = unit(5, "mm"), just = "left")
popViewport()

pushViewport(viewport(layout.pos.col = 2, layout.pos.row = 2))
grid.text("Other causes of death", x = unit(5, "mm"), just = "left")
popViewport()

pushViewport(viewport(layout.pos.col = 2, layout.pos.row = 3))
grid.text("Average deaths per corresponding week over previous 5 years",
          x = unit(5, "mm"), just = "left")
popViewport()

# Pop the layout viewport
popViewport()

It is also acceptable to position each element explicitly as long as sensible units are used.
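As an illustration of that alternative, here is a rough sketch that draws the same legend with explicit positions and no layout. The helper rowY and the variables keyLeft, keyWidth and textX are names invented for this sketch; their values are chosen to mimic the layout version (rows 2 lines tall, a 1 inch key inside a 1.25 inch column, text offset by 5 mm).

```r
library(grid)

blue <- "#284f99"
grey <- "#b9b9b9"

# Hypothetical helper (not part of the model answer): vertical centre of
# legend row i, for rows 2 lines tall, measured down from the top of the page.
rowY <- function(i) unit(1, "npc") - unit(2 * i - 1, "lines")

keyLeft  <- unit(0.125, "in")               # left edge of each key
keyWidth <- unit(1, "in")                   # width of each key
textX    <- unit(1.25, "in") + unit(5, "mm") # left edge of the labels

grid.newpage()

# Two filled rectangles, left-justified at keyLeft, in rows 1 and 2
grid.rect(x = keyLeft, y = rowY(1:2),
          width = keyWidth, height = unit(1, "lines"),
          just = "left",
          gp = gpar(col = c(blue, grey), fill = c(blue, grey)))

# Thick dashed line in row 3
grid.segments(keyLeft, rowY(3), keyLeft + keyWidth, rowY(3),
              gp = gpar(lwd = 3, lty = "dashed", lineend = "butt"))

# Left-justified labels, one per row
grid.text(c("Deaths involving COVID-19",
            "Other causes of death",
            "Average deaths per corresponding week over previous 5 years"),
          x = textX, y = rowY(1:3), just = "left")
```

Absolute units ("in", "mm", "lines") keep the keys and text the same size when the page is resized, which is the sense in which the units need to be sensible; "npc" is used only to locate the top of the page.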


 

 

 

3. [10 marks]

 


(a) [3 marks]


Redundant coding is used in both the barplot and the multi-panel plot by adding text labels.  Each bar in the barplot represents a number of deaths by its height and by the text label on the bar. Each panel in the multi-panel plot represents the relative deaths due to  COVID-19 versus Other causes using shaded areas and a text label describing the percentages.

(b) [4 marks]

If we overlapped the areas representing COVID-19 and Other deaths, we would create a problem of not being able to see both areas at once. We would at least have to draw the COVID-19 deaths on top of the Other deaths. We could also just draw a line (rather than filling an area) for each source of deaths and/or we could use a semitransparent fill for each area. Finally, we could use (more) multi-panel figures, producing a panel for COVID-19 deaths and another panel for Other deaths, for each age group. I think that the last option would be the most effective.
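Two of these options can be sketched in ggplot2. This is only an illustration under assumptions: the data frame deaths, its column names (Day, AgeGroup, Cause, Deaths) and its values are all invented placeholders, with the values generated from a formula rather than taken from real data.

```r
library(ggplot2)

# PLACEHOLDER data (an assumption): one row per day, age group and cause.
deaths <- expand.grid(Day = 1:60,
                      AgeGroup = c("Ages 0-14", "Ages 15-44"),
                      Cause = c("COVID-19", "Other"),
                      stringsAsFactors = FALSE)
deaths$Deaths <- ifelse(deaths$Cause == "COVID-19",
                        20 * exp(-((deaths$Day - 30) / 10)^2),
                        10)

fills <- c("COVID-19" = "#284f99", "Other" = "#b9b9b9")

# Option 1: a separate panel for each cause within each age group,
# so no area can hide another.
panels <- ggplot(deaths) +
    geom_area(aes(x = Day, y = Deaths, fill = Cause)) +
    facet_grid(AgeGroup ~ Cause) +
    scale_fill_manual(values = fills) +
    theme(legend.position = "none")

# Option 2: overlapping (not stacked) areas with a semitransparent fill.
overlaid <- ggplot(deaths) +
    geom_area(aes(x = Day, y = Deaths, fill = Cause),
              position = "identity", alpha = 0.5) +
    facet_wrap(~ AgeGroup) +
    scale_fill_manual(values = fills)

print(panels)
print(overlaid)
```

In the first plot every area sits on its own zero baseline, so comparisons within a panel are positions on a common scale. In the second, position = "identity" is what makes the areas overlap instead of stack.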

(c) [3 marks]

Yes (as far as we can tell in the barplot), the Principle of proportional ink is being obeyed in both plots because the y-axis scales start at zero. This means that the height of the bars in the barplot and the height/area of the regions in the multi-panel plot correspond to the data values that they represent (number of deaths in a day in the barplot and number of deaths/total deaths from each source in the multi-panel plot). The width of the bars in the barplot is redundant (so the data-ink ratio is not the best), but because it is consistent that does not cause a major problem.


 

 

 

4. [10 marks]

 


(a) [5 marks]

Visual tasks:


● comparing the number of Other deaths on different days within the same age group involves position on a common scale;

● comparing the number of COVID-19 deaths versus Other deaths on the same day within the same age group involves position on unaligned scales (because the COVID-19 deaths are “stacked” on top of the Other deaths);

● comparing the number of COVID-19 deaths on different days within the same age group also involves position on unaligned scales (again because of the “stacking”);

● comparing the number of Other deaths on a single day in Ages 0-14 with the number of Other deaths on a single day in Ages 15-44 involves position on a common scale;

● comparing the total number of deaths from COVID-19 versus the total number of deaths from Other causes within the same age group involves area (which we are poor at, particularly with irregular regions like these).

(b) [3 marks]

Visual perception concepts:

● The Gestalt rule of similarity is being used to identify areas representing COVID-19 deaths in each panel (and on the legend). All of these areas are coloured blue. Preattentive colour (or brightness in greyscale) is also used here to make the identification of these areas immediate and effortless.

● The Gestalt rule of proximity is being used to associate labels such as “Ages 0-14” with the relevant panel (although this could be more effective if each of those labels were shifted down a little bit more).

(c) [2 marks]

The Gestalt Law of “Figure and Ground” (visual elements are taken to be either in the foreground or the background) might explain why the COVID-19 death regions can look like they are behind the Other death regions in the multi-panel plot.