COMP373601 Information Visualization 2019
January 2019
Question 1
(a) The figure below contains three charts that show data for the regions of England.
(i) For each of the variables that are shown in the figure, classify the data type (categorical, ordinal, or numerical) and state how the variable is visually encoded. [5 marks]
(ii) Provide constructive feedback about the encoding that is used in the “Number of jobs in different industries” chart. [4 marks]
(iii) Provide constructive feedback about the type of chart that is used to show the “Median annual salary” data. [2 marks]
(b) A dataset has 10 million records and two variables (price and product). There are 50 different categories of product. Compare and contrast the use of histograms vs. box plots for providing an overview visualization of the dataset. [3 marks]
(c) Imagine that you have a dataset that contains 20 numerical variables. Discuss the advantages and disadvantages that scatterplot matrices and parallel coordinates each have for visualizing the data. [6 marks]
[question 1 total: 20 marks]
Question 2
(a) There are more than 25,000 universities in the world and, in total, they publish 3 million research papers per year. One measure of the quality of a university’s research is the collaborations that it has with other universities around the world. To investigate that quality, you have obtained a dataset that contains the following information for every research paper that was published last year:
Paper ID
Paper title
And for every co-author:
o Person’s name
o University
o Country
(i) This part of the question is about the Top 100 universities (you should ignore co-authors who work at other universities). What type of network visualization would you use to show the number of papers that are co-authored by people at each pair of universities, on a single sheet of A4 paper? You may find it useful to illustrate your answer with a diagram. [4 marks]
(ii) Describe two problems that users may have reading your visualization. [4 marks]
(iii) Now imagine that you have been asked to provide your network visualization on an interactive website. Choose a suitable network metric and explain how you would use it to allow users to interactively explore the data for every university in the world (not just the Top 100). [5 marks]
(b) This part of the question is about user evaluation and the following extract from the Method section of a paper. The evaluation was designed to measure the time that participants took to compare sequences of categorical events with two different types of visualization.
Participants: Thirteen university undergraduates (7 men and 6 women) took part. They were all right-handed, gave their informed consent, and were paid £10 for their participation.
Procedure: The evaluation used a within participants design. The evaluation was divided into two stages, and participants were run individually. The first stage involved 10 trials and used colour to encode different types of event in a sequence. For each trial, three sequences were presented (see Figure), and the participant had to select whether A or B was more similar to the Target. Then the participant took part in Stage 2. This also involved 10 trials, with shape used to encode different types of event (see Figure). The trials were different to those used in Stage 1, but of similar difficulty.
[Figure: Stage 1: Sequences of four events that are encoded using colour]
[Figure: Stage 2: Sequences of four events that are encoded using shape]
(i) State two important pieces of information that are missing from the Participants section of the Method. [2 marks]
(ii) Explain why the experiment’s design is flawed. [3 marks]
(iii) Describe how you would correct the design. [2 marks]
[question 2 total: 20 marks]
2023-01-14