
COMP3308/3608 Introduction to Artificial Intelligence

(regular and advanced)

semester 1, 2022

Information about the exam

Question 1. (Type 2 problem solving/calculation)

In the tree below the step costs are shown along the edges and the h values are shown next to each node. The goal nodes are double-circled: F and D.

Write down the order in which nodes are expanded using:

a)  Breadth-first search

b)  Depth-first search

c)  Uniform cost search

d)  Iterative deepening search

e)   Greedy search

f)   A*

In case of ties, expand the nodes in alphabetical order.

Answer:

a)  Breadth-first search: ABCD

b)  Depth-first search: ABEF

c)  Uniform cost search: ABEF

d)  Iterative deepening search: A; ABCD (depth limit 0 expands A; depth limit 1 expands A, B, C, D)

e)   Greedy search: AD

f)   A*: ABEF

Review:

• BFS: Expands the shallowest unexpanded node

• DFS: Expands the deepest unexpanded node

• UCS: Expands the node with the smallest path cost g(n)

• IDS: DFS at levels l = 0, 1, 2, etc. Expands the deepest unexpanded node within level l

• Greedy: Expands the node with the smallest heuristic value h(n)

• A*: Expands the node with the smallest f(n) = g(n) + h(n) (see the sketch below)
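These strategies differ only in how the frontier is ordered. As an illustration (added to these notes, not part of the original solutions), here is a minimal generic best-first search in Python; the step costs and h values are a hypothetical reconstruction chosen to be consistent with the answers above, since the exam's tree figure is not reproduced in this document.

```python
import heapq

def best_first_search(start, goals, successors, h, f):
    """Generic best-first search; f(g, h_n) defines the frontier ordering:
    UCS -> g, greedy -> h_n, A* -> g + h_n. Ties on priority are broken
    alphabetically because the node name is the next element in the tuple."""
    frontier = [(f(0, h[start]), start, 0)]       # (priority, node, path cost g)
    expanded = []
    while frontier:
        _, node, g = heapq.heappop(frontier)
        expanded.append(node)                     # goal test at expansion time
        if node in goals:
            return expanded
        for child, cost in successors.get(node, []):
            heapq.heappush(frontier, (f(g + cost, h[child]), child, g + cost))
    return expanded

# Hypothetical step costs and h values, consistent with the answers above
# (the actual exam figure is not reproduced in this document):
successors = {"A": [("B", 1), ("C", 5), ("D", 6)], "B": [("E", 1), ("F", 2)]}
h = {"A": 2, "B": 3, "C": 4, "D": 0, "E": 1, "F": 0}
print(best_first_search("A", {"D", "F"}, successors, h, lambda g, hn: g))       # UCS: A B E F
print(best_first_search("A", {"D", "F"}, successors, h, lambda g, hn: hn))      # greedy: A D
print(best_first_search("A", {"D", "F"}, successors, h, lambda g, hn: g + hn))  # A*: A B E F
```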

Question 2. (Type 1 short answers)

Answer briefly and concisely:

a)  A* uses admissible heuristics. What happens if we use a non-admissible one? Is it still useful to use A* with a non-admissible heuristic?

b)  What is the advantage of choosing a dominant heuristic in A* search?

c)  What is the main advantage of hill climbing search over A* search?

Answers:

a)  Not optimal anymore. But it could still find a reasonably good solution in acceptable time, depending on how good the heuristic is.

b)  Fewer nodes expanded. As a result, the optimal solution will be found quicker.

c)   Space complexity – keeps only the current node in memory.
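To make the contrast in c) concrete, here is a minimal hill-climbing sketch (an illustration added to these notes, not part of the original solutions): the search keeps only the current state, whereas A* must store the entire frontier.

```python
def hill_climbing(state, neighbors, score):
    """Greedy local search: stores only the current state (O(1) space),
    unlike A*, which keeps the whole frontier in memory.
    The trade-off: it can get stuck in a local maximum."""
    while True:
        best = max(neighbors(state), key=score)
        if score(best) <= score(state):
            return state              # no uphill neighbor: local maximum
        state = best

# E.g. maximizing f(x) = -(x - 3)^2 over the integers:
print(hill_climbing(0, lambda x: [x - 1, x + 1], lambda x: -(x - 3) ** 2))  # -> 3
```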

Question 3. (Type 2 problem solving/calculation)

Consider the following game in which the evaluation function values for player MAX are shown at the leaf nodes. MAX is the maximizing player and MIN is the minimizing player. The first player is MAX.

a)  What will be the backed-up value of the root node computed by the minimax algorithm?

b)  Which move should MAX choose based on the minimax algorithm: to node B, C or D?

c)  Assume that we now use the alpha-beta algorithm. List all branches that will be pruned, e.g. AB etc. Assume that the children are visited left-to-right (as usual).

Solution:

a)   The value of A is 4.

b)  To node C

c)  LD and MD will be pruned.
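As an aside (added to these notes, not part of the original solutions), a minimal alpha-beta implementation is sketched below. The tree is a hypothetical reconstruction chosen to be consistent with the answers above, since the exam's game-tree figure is not reproduced in this document.

```python
def alphabeta(node, children, value, maximizing, alpha=float("-inf"), beta=float("inf")):
    """Minimax with alpha-beta pruning; children are visited left to right."""
    kids = children.get(node, [])
    if not kids:
        return value[node]            # leaf: return its evaluation value
    best = float("-inf") if maximizing else float("inf")
    for child in kids:
        v = alphabeta(child, children, value, not maximizing, alpha, beta)
        if maximizing:
            best, alpha = max(best, v), max(alpha, v)
        else:
            best, beta = min(best, v), min(beta, v)
        if alpha >= beta:             # cutoff: the remaining children are pruned
            break
    return best

# Hypothetical tree consistent with the answers above: A is MAX; B, C, D are MIN.
children = {"A": ["B", "C", "D"], "B": ["E", "F"], "C": ["G", "H"], "D": ["K", "L", "M"]}
value = {"E": 3, "F": 5, "G": 7, "H": 4, "K": 2, "L": 9, "M": 1}
print(alphabeta("A", children, value, maximizing=True))  # -> 4; L and M under D are pruned
```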

Question 4. (Type 1 short answers)

Answer briefly and concisely:

a)  The 1R algorithm generates a set of rules. What do these rules test?

b) Gain ratio is a modification of Gain used in decision trees. What is its advantage?

c)  Propose two strategies for dealing with missing attribute values in learning algorithms.

d)  Why do we need to normalize the attribute values in the k-nearest-neighbor algorithm?

e)  What is the main limitation of the perceptron?

f)   Describe an early stopping method used in the backpropagation algorithm to prevent overfitting.

g)  The problem of finding a decision boundary in a support vector machine can be formulated as an optimisation problem using Lagrange multipliers. What are we maximizing?

h)  In linear support vector machines, we use dot products both during training and during classification of a new example. What vectors are these products of?

During training:

During classification of a new example:

Answers:

a) They test the values of a single attribute.

b) It penalizes highly branching attributes by taking into account the number and size of the branches.

c) Strategy 1: Use the attribute mean to fill in the missing value.

Strategy 2: Use the attribute mean computed only over the examples of the same class.

d) Different attributes are measured on different scales; without normalization, attributes with a smaller range of values would contribute less to the distance than attributes with a larger range, so the latter would dominate the classification (see the normalization sketch after these answers).

e) Can separate only linearly separable data.

f) The available data is divided into three subsets: a training set, used for updating the weights; a validation set, used for early stopping; and a test set, used to evaluate the final network. Training is stopped when the error on the validation set increases for a pre-specified number of iterations.

g) The margin of the hyperplane.

h) During training: Pairs of training examples.

During classification of a new example: The new example and the support vectors.
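To make answer d) concrete, below is a minimal min-max normalization sketch (an illustration added to these notes, not part of the original solutions; the example numbers are hypothetical).

```python
def min_max_normalize(rows):
    """Rescale each attribute (column) to [0, 1] so that attributes measured
    on larger scales do not dominate the distance computation in k-NN."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - l) / (h - l) if h != l else 0.0
             for v, l, h in zip(row, lo, hi)]
            for row in rows]

# Hypothetical data: age in years vs. income in dollars. Without rescaling,
# income differences would swamp age differences in the Euclidean distance.
print(min_max_normalize([[25, 50000], [40, 80000], [35, 65000]]))
```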

Question 5. (Type 2 problem solving/calculation)

Consider the task of learning to classify mushrooms as safe or poisonous based on the following four features: stem = {short, long}, bell = {rounded, flat}, texture = {plain, spots, bumpy, ruffles} and number = {single, multiple}.

The training data consists of the following 10 examples (5 safe, 5 poisonous), shown in the table below:

n   stem   bell     texture  number    class
1   short  rounded  spots    single    safe
2   long   flat     ruffles  single    safe
3   long   flat     ruffles  multiple  safe
4   long   rounded  plain    single    safe
5   long   flat     plain    single    safe
6   short  rounded  plain    single    poisonous
7   short  flat     plain    single    poisonous
8   short  rounded  bumpy    single    poisonous
9   long   rounded  spots    single    poisonous
10  long   rounded  bumpy    single    poisonous

a)  Use Naïve Bayes to predict the class of the following new example: stem=long, bell=flat, texture=spots, number=single. Show your calculations.

b)  How would 3-Nearest Neighbor using the Hamming distance classify the same example as above? Explain your answer. (Hint: The Hamming distance is the number of differing feature values.)

c)  Consider building a decision tree. Calculate the information gain for texture and number, and briefly show your calculations. Which one of these two features will be selected and why?
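The printed solutions stop before this question's answers. As a sketch of how part a) could be worked (the arithmetic only, not an official answer), the following computes the unnormalised Naïve Bayes scores directly from the table above.

```python
# Training data from the table above: (stem, bell, texture, number, class)
data = [
    ("short", "rounded", "spots",   "single",   "safe"),
    ("long",  "flat",    "ruffles", "single",   "safe"),
    ("long",  "flat",    "ruffles", "multiple", "safe"),
    ("long",  "rounded", "plain",   "single",   "safe"),
    ("long",  "flat",    "plain",   "single",   "safe"),
    ("short", "rounded", "plain",   "single",   "poisonous"),
    ("short", "flat",    "plain",   "single",   "poisonous"),
    ("short", "rounded", "bumpy",   "single",   "poisonous"),
    ("long",  "rounded", "spots",   "single",   "poisonous"),
    ("long",  "rounded", "bumpy",   "single",   "poisonous"),
]

new = ("long", "flat", "spots", "single")   # the new example from part a)

for cls in ("safe", "poisonous"):
    rows = [r for r in data if r[-1] == cls]
    score = len(rows) / len(data)           # prior P(class)
    for i, v in enumerate(new):             # likelihoods P(attribute = v | class)
        score *= sum(1 for r in rows if r[i] == v) / len(rows)
    print(cls, score)
# safe:      0.5 * 4/5 * 3/5 * 1/5 * 4/5 = 0.0384
# poisonous: 0.5 * 2/5 * 1/5 * 1/5 * 5/5 = 0.008   -> predict safe
```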