
STATS 787

1. [10 marks]

The data visualisation below was produced with the ggplot2 package using the singapore data frame.


[Figure: data visualisation; axis label: age]

(a) [5 marks]

Write R code to produce this data visualisation.

(b) [5 marks]

Identify all of the Grammar of Graphics concepts (geoms, aesthetics, stats, scales, coords, and facetting) that are being used to produce this data visualisation.

2. [10 marks]

The image below was produced with the grid package using the singapore data frame.


Write R code that only uses the grid package to produce this image.

3. [10 marks]

(a) [5 marks]

This question relates to the data visualisation below.

[Figure: data visualisation; axis label: age]

For each of the following “Effective Data Visualisation” guidelines, write a paragraph explaining how the guideline is being used in this data visualisation and whether that guideline is making the data visualisation more effective.

● Small multiples.

● Redundant coding.

● The principle of proportional ink.

● The data-ink ratio.

(b) [5 marks]

This question relates to the data visualisation below.

For each of the following questions of interest, write a paragraph explaining which perceptual visual tasks or “channels” are involved with answering the question using the data visualisation above.

● Is there a trend towards younger or older ages over time?

● Do the females tend to be younger or older than the males?

● Which age occurs most often?

The visual tasks/channels for continuous values are: Position on common scale; Position on unaligned scale; Length; Angle; Area; Lightness (of colour); Saturation (of colour).

The visual tasks/channels for categorical values are: Position (in 2D space); Hue (of colour); Shape.

4. [10 marks]

(a) [5 marks]

This question relates to the data visualisation below.

For each of the “CRAP” design guidelines, write a paragraph describing at least one example of how the guideline has been applied in the data visualisation above and commenting on whether the application of the guideline makes the data visualisation better.

(b) [5 marks]

The code below shows a code chunk from an R Markdown document.

```{r dev = "png", dev.args = list(type = "cairo")}
library(ggtext)
g + ggtitle(paste('COVID cases in',
                  '<span style="color: red; font-family: Ubuntu">', 'Singapore',
                  '</span>')) +
    theme(plot.title = element_markdown())
```

Explain in detail the purpose of every line of the code above, including the meaning of all functions and arguments and the code chunk options. You can assume that the symbol g is a "ggplot" object.

5. [10 marks]

This question relates to the following code, which produces an interactive scatterplot that looks similar to the image below (except that it is interactive).

library(ggplot2)
library(plotly)
sharedf <- highlight_key(singapore, ~ gender)
gg <- ggplot(sharedf) +
    geom_point(aes(x=1:30, y=age, text=gender),
               color=c(3, 4, rep(1, 28)), size=3)
pg <- ggplotly(gg)
highlight(pg, on = "plotly_selected", color = "red")

(a) [7 marks]

Explain in detail the purpose of every line of the code above, including the meaning of all functions and arguments.

(b) [3 marks]

Describe what would happen in the interactive scatterplot for each of the following interactions (if we perform these interactions one after the other, in the order below):

● We hover the mouse cursor over the green data symbol in the scatterplot.

● We select the green data symbol in the scatterplot (by clicking and dragging a rectangular selection around just that data symbol).

● We select the blue data symbol (by clicking and dragging a rectangular selection around just that data symbol).

6. [10 marks]

(a) [7 marks]

This question relates to the code below, which draws a map showing China, Singapore, and Indonesia, with the fill colour for each country reflecting the number of cases that originated from that country. The map is shown below the code.

library(ggplot2)
library(scales)
gmap <- ggplot(asiaCounts) +
    geom_sf(aes(fill=freq, colour=freq, size=weight)) +
    geom_label(aes(X, Y, label=name),
               color="white", fill=rgb(0,0,0,.5), label.size=0,
               hjust=0, vjust=0) +
    scale_colour_gradient(low=hcl(60, 60, 40),
                          high=hcl(60, 80, 80),
                          aesthetics=c("colour", "fill"))

Explain in detail the purpose of every line of the code above, including the meaning of all functions and arguments.

(b) [3 marks]

Suppose that we are only interested in comparing the three countries in terms of the number of COVID cases that originated in each country.

Describe an alternative data visualisation that would make this comparison easier and write R code that would produce such a visualisation.

7. [10 marks]

This question relates to the code below, which draws a graph of the connections between the 30 COVID cases, with the nodes arranged in order from left to right. If case i is a contact of case j, there is an edge between node i and node j.  The graph is shown below the code.

library(ggraph)
ggraph <- ggraph(graph, "linear") +
    geom_edge_arc() +
    geom_node_point(aes(colour=gender), size=4) +
    geom_node_text(aes(label=Case), size=3, nudge_y=-.5) +
    scale_colour_manual(values=c(Male=4, Female=2))

(a) [5 marks]

Explain in detail the purpose of every line of the code above, including the meaning of all functions and arguments.

(b) [5 marks]

For each of the questions below, write a paragraph explaining which visual perception concepts are involved with answering the question using the graph above.

● What is the case number for each case?

● Which cases are female and which are male?

● Which cases are contacts of each other?

The visual perception concepts to choose from are: Preattentive processing; Separable dimensions; and the “Gestalt Laws” of Proximity, Connection, Similarity, Continuity, Closure, and Figure and Ground.

8. [10 marks]

(a) [5 marks]

This question relates to the following SVG code.

<?xml version="1.0" encoding="UTF-8"?>
<svg xmlns="http://www.w3.org/2000/svg" version="1.1"
     width="310" height="310">
  <line x1="100" x2="200" y1="300" y2="300"
        stroke="black" stroke-width="1"/>
  <line x1="100" x2="100" y1="300" y2="102"
        stroke="grey" stroke-width="1"/>
  <line x1="200" x2="200" y1="300" y2="141"
        stroke="grey" stroke-width="1"/>
  <circle cx="100" cy="102" r="5" fill="#2297E6"/>
  <circle cx="200" cy="141" r="5" fill="#DF536B"/>
  <text x="100" y="90" font-size="12"
        text-anchor="middle" fill="black">66</text>
  <text x="200" y="130" font-size="12"
        text-anchor="middle" fill="black">53</text>
</svg>

This SVG code produces the image below.

Explain in detail the purpose of every line of the SVG code above, including the meaning of all elements and attributes.

(b) [5 marks]

Write R code that uses the xml2 package to create an SVG file similar to the one in the previous question, but with an image that draws values for all 30 COVID cases.

The SVG code that your R code creates should produce an image like the one below.