COMPSCI 3007, 7059 Artificial Intelligence Semester 1 2015
Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit
Semester 1 2015
Artificial Intelligence
COMPSCI 3007, 7059
Problem solving by searching
Question 1
(a) The following diagram shows a highway network that connects several cities in a certain country. The distance between any two cities connected by a highway is shown on the network.
You are currently in San Francisco and you wish to travel via road to Boston. You wish to use search methods to find a path that connects the two cities.
i. What is the branching factor of the search tree?
ii. Draw the search tree generated by depth first search (DFS) until a path between San Francisco and Boston is found. Re- member to avoid repeated states. The order of expansion is based on the alphabetical order of the name of the cities.
iii. Give one advantage of depth first search (DFS) over breadth first search (BFS).
iv. In the course of generating the search tree by DFS in part ii above, what was the highest number of nodes you needed to store in the fringe (the list of yet to be expanded nodes)?
Outline in the tree you drew in part ii the set of nodes that exist in the fringe when its size was the largest.
(b) The direct flight distance (DFD) (distance travelled by an air plane)
between several cities and Boston is given in Table 1 in Page 3.
For the other cities in the network, their DFD to Boston is unknown or not available.
City |
DFD to Boston |
San Francisco |
2700 |
Portland |
2530 |
Reno |
2520 |
Las Vegas |
2370 |
Los Angeles |
2600 |
Table 1: DFD between several cities and Boston.
A function h(n) that takes as input a node n is defined as follows:
h(n) =
where city(n) gives the city encapsulated in node n.
If the h(n) defined above is used as a heuristic function in A* search to find a path between San Francisco and Boston, is the solution found guaranteed to be the shortest path? Justify your answer.
Adversarial search
Question 2
(a) The following diagram shows a game tree to be searched based on the minimax algorithm. The first move belongs to Max. In the diagram, +Inf means +o and -Inf means -o.
Copy this game tree into your answer book. Then,
i. Write the minimax value at each node.
ii. If alpha-beta pruning is to be used to search the game tree, clearly circle the branches that are pruned (i.e, branches that do not need to be searched). Note: at each node, search the child nodes from left to right.
(b) Here’s a partially completed tic-tac-toe game. Player Max uses sym-
bol X while player Min uses symbol O. The utility of Max wins is +1, draw is 0, and Min wins is -1.
|
x |
|
x |
o |
x |
|
o |
|
It’s Min’s turn to choose a move.
i. Draw the game tree starting from the state above until the end of the game. Write clearly the minimax value at each node of the tree.
ii. Starting from the state above, if both Min and Max play opti- mally, what is the best possible final outcome for Min?
Decision trees
Question 3
(a) Why is it important to conduct pruning in decision tree learning?
(b) Post-pruning is a framework to prune decision trees. Briefly explain i. What kinds of data are required to conduct post-pruning; and
ii. How to conduct post-pruning given the required data.
(c) Prof Dumbledore teaches the Artificial Intelligence (AI) course. In this semester, he wishes to predict whether the students will pass the course, before they take the final exam. He collected a set of data from 20 students who took the AI course last semester.
Specifically, he recorded the marks they had obtained in the data structure course, the amount of time they had spent on studying for the AI course, and whether the students eventually passed AI. The following diagram plots the collected data.
15 20 25 30 35 40 45
Hours spent on studying for AI
As his loyal student, help Prof Dumbledore to build a decision tree that can predict whether a student from this semester will pass the AI course, given the same type of data.
i. What is the information content at the root node of the de- cision tree? Leave your answer in the form of an arithmetic expression.
ii. Calculate the information gain for the following two candidate tests at the root node:
• Final marks in data structure course < 57.
• Hours spent studying for AI < 31.
Leave your answers in the form of arithmetic expressions.
iii. If we split based on the attribute ‘Hours spent on studying for AI’ at the root node,
• What is the minimum number of unique split values that we need to examine in order to find the test that maximises the information gain?
• Write down a set of unique split values that we can exam- ine to find the test that maximises the information gain.
Clustering
Question 4
(a) i. What is unsupervised learning? Please describe in terms of the
input (or features) and labels (class) of the training data.
ii. What is clustering? Is it unsupervised or supervised?
iii. Describe K-means algorithm. Suppose we know there are K clusters.
iv. Name one problem with K-means algorithm (from many po- tential problems).
(b) i. When Mean Shift is used in clustering, does it require to know
how many clusters (like K-means does)? Please explain why.
ii. Assume there are N many data {xi }, and the kernel function is k(xi , xj ). If an old mean vector is z, please write down the mathematical expression for the updated mean vector znew .
Probabilistic Graphical Models
Question 5
(a) The joint distribution for three boolean variables A, B, C is given
in Table 2. Please compute the following probabilities.
a |
b |
c |
0.01 |
a |
b |
-c |
0.01 |
a |
-b |
c |
0.06 |
a |
-b |
-c |
0.02 |
-a |
b |
c |
0.04 |
-a |
b |
-c |
0.04 |
-a |
-b |
c |
0.80 |
-a |
-b |
-c |
0.02 |
Table 2: P (A, B, C)
i. What is P (B = b, C = -c)?
ii. What is P (C = -c)?
iii. What is P (B = bIC = -c)?
(b) A Bayesian network with 3 boolean variables A, B, C has a graph
in Figure 1 with the local (conditional) distributions provided in the Table 3.
Figure 1: Bayesian Network
Table 3: Local (conditional) distributions P (A), P (BIA), P (CIA, B)
|
c |
-c |
a, b |
0.9 |
0.1 |
a, -b |
0.4 |
0.6 |
-a, b |
0.3 |
0.7 |
-a, -b |
0.2 |
0.8 |
i. What is P (A = a, B = -b, C = -c)? Please write down the derivation and intermediate result instead of the final number alone.
ii. What is P (A = aIC = -c)? Let x = P (A = -a, C = -c). You can figure out the exact number of x from the table. But to save you some time, please express P (A = aIC = -c) as a function of x ( do not give us the number). Please write down the derivation and intermediate result instead of the final function alone.
iii. If the edge from A to B is deleted, will the local distribution tables be changed? If not, why? If yes, which one will be changed and becomes what?
iv. If the edge from A to B is deleted, is A independent to B? Prove it.
v. Which one of the following two inference methods can also be used to compute P (C): max-product or sum-product?
Markov Decision Processes and Particle Filters
Question 6
(a) i. What is the Markov property?
ii. In Markov Decision Processes, what is the meaning of a policy?
iii. Let [1, 2, 3] be the rewards for a sequence of states. Let the discount factor be 0.5. What is the discount utility for this sequence? What is the (additive) utility for this sequence?
(b) Table 4 shows a map of a robot’s world in which each grid square represents a discrete state of the robot. There are 5 states a, b, ..., e in total where each state represents the robot’s location. The re- wards are shown on each state/location. Action: the robot can
only move WEST or EAST, or EXIT ( only available in exit states a and e).
5 |
0 |
0 |
0 |
1 |
a b c d e
Table 4: Rewards on a robot’s map
i. If the discount factor γ = 1, what are the preferred actions (i.e. the output of the optimal policy) in states c and d?
ii. If the discount factor γ = 0.1, what are the preferred actions (i.e. the output of the optimal policy) in states c and d?
iii. What is the value of γ that makes WEST and EAST equally good in state d?
2022-06-20