Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit

FUNDAMENTALS OF ARTIFICIAL INTELLIGENCE (COMP 1032)

2017-2018


1.      Search (25 marks)


i.    This is about general search.

a)  How problem solving is done through search? Explain your answer in the context of problem formulation and its mains components. K

[6 marks]

Answer:

Problem solving via search involves formulating the problem in a manner so that the relevant information that describe the states of the world (associated with solving the

problem) and their transitions can be encoded (abstracted).

The four main components are:

Initial State - the starting state of the problem, defined in a suitable manner. Operator - a set of actions that moves the problem from one state to another.

Goal Test - a test applied to a state that returns true if a state that solves the problem has been reached.

Path Cost - the costs to take a particular sequence of actions.

[6 marks]

b) Explain about the approach for problem representation and strategies for search.

C

[5 marks]

Answer:

One aspect ofproblem formulation is the state space representation that takes the form of a graph (each state represented by a vertex), with state transitions given by the edges connecting the pairs of states. Problem solving then becomes searching for a path from the initial state to the solution state. The search strategy determines the choice for expansion of states. The corresponding sequences of state expansion is represented by a data structure known as a search tree.

[5 marks]


c) The two  main search approaches are goal-driven and data-driven.  Provide a specific problem example for each approach and explain how it can be formulated in a manner that can be solved through the particular search approach. C

[4 marks]

Answer:

In applying a goal-driven search approach to path-finding problems, the goal (end state) can be clearly and easily formulated (represented). For data-driven search, one starts from the initial state and go forward in some systematic manner to the goal state (as in most search tree algorithms for two-player games). Note: Any example can be provided as long as explanation is given.

[4 marks]


ii.    Figure 1 is a graph representation of cities {a,b,c,d,e,f,g,h,i} interconnected by highways in a country. The number for each edge is the distance in km. For example, the distance between a and b is  100  km. The straight-line distance v(n)  between the city n ∈ {a,b,c,d,e,f,g,h,i} and i is given as follows: v(a)=300, v(b)=195, v(c)=120, v(d)=123, v(e)=125, v(f)=260, v(g)=235, v(h)=61, and v(i)=0.


Figure 1

a) Let a be the starting city and i be the ending city. Construct a tree generated as the result of using greedy search. Label the nodes expanded with the appropriate costs.


A

Answer:


b) Repeat (a) using A*. A

Answer:


[4 marks]



[4 marks]

[4 marks]


[4 marks]


c)  Which search algorithm gives a lower total distance travelled? A

[2 marks]

Answer:

Greedy search returns the path a-b-c-i with a total distance of 100+80+145 = 325 km. A* search returns the path a-b-d-h-i with a total distance of 100+70+60+60 = 290 km. A* search returns a lower total distance travelled.

[2 marks]



2.      Artificial Neural Networks (25 marks)


i.     Formulate the mathematical model of a neuron. Explain your formulation with the help of a diagram of the neuron model. K

[12 marks]

Answer:

= (in),

in = , .

[2 marks]

is the activation value of neuron (unit) j.

, is the weight of the link from unit j to unit i.

in is the weighted sum of activation values ofjs (inputs) to unit i.

is the activation value of unit i.

(∗) is the activation function.

[5 marks]


Sample diagram

[5 marks]


ii.    Demonstrate how the XOR problem can be solved with a neural network.

a) Construct the appropriate multilayer perceptrons of logical functions. Specify its input-output response. A

[3 marks]

Answer:

The input-output response of the multilayer perceptron of two inputs X1 and X2 has the form ofX1 XOR X2 = (X1 AND (NOT X2)) OR ((NOT X1) AND X2).

[3 marks]


b) Verify your answer by testing the input-output responses with appropriate logical tables. A

[6 marks]

Answer:



X1

X2

NOTX2

(X1 AND (NOT

X2))

1

1

0

0

1

0

1

1

0

1

0

0

0

0

1

0



X1

X2

NOTX1

((NOT X1) AND

X2)

1

1

0

0

1

0

0

0

0

1

1

1

0

0

1

0



(X1 AND (NOT X2))

((NOT X1) AND X2)

(X1 AND (NOT X2)) OR ((NOT X1) AND X2)

0

0

0

1

0

1

0

1

1

0

0

0



[6 marks]


c)  Explain how it solves the problem with the help of an appropriate illustration. C [4 marks]

Answer:

The input-output response separating the two classes of the 2D feature space can be illustrated from the formation of the two hyperplanes (one for (X1 AND (NOT X2)) and another for ((NOT X1) AND X2)).



0,1



0,0                          1,0


[4 marks]


3.      Machine Learning and Probabilistic Reasoning (25 marks)


i.    Answer the following questions on Machine Learning principles.

a)  Explain these two learning categories: supervised and unsupervised. K

[4 marks]

Answer:

In supervised learning, one is given input samples x and their corresponding labelled outputs y. The problem is then to learn the function y = f(x) that explains the input- output pairs (x,y) and evaluate the function on new data (x’,y’).

In unsupervised learning, one is given only samples x to infer a model y that describes the hidden structure of the data in x.

[4 marks]

b) For each learning category in (a), provide two tasks and explain them. K

[8 marks]

Answer:

For supervised learning:


Classification y is discrete (class labels) with the task involves learning decision boundaries that separates x into different classes y’.

Regression y is continuous with the task involves learning some (typically parametric) functions (input-output mappings) f that best (in some sense) explain (x’,y’).

[4 marks]

For unsupervised learning:

Clustering y is discrete with the learning task involves discovering intrinsic structure underlying (grouping) the data.

Dimensional reduction y is continuous with the learning task involves discovering a lower dimensional surface (described by some principal features) on which the data lives.

[4 marks]

c)  Briefly explain what is the main difference behind the principles on supervised and unsupervised learning tasks. Provide real-world examples to help illustrate your    answer. C

[3 marks]

Answer:

The main difference is on data labelling supervised learning requires having a set of labelled training pair (x,y). For example, classifying images iflabels are provided. Else, one would perform some clustering task that best group images.

[3 marks]

ii.     Let U and V be discrete random variables. U takes on a value from {A,B,C}. V takes on a value from {1,2}. Answer the following questions.


a) Assume that both P(U=A) and P(U=B) is 1/5. What is P(U=C)? Briefly explain your answer. C

[2 marks]

Answer:

P(U=C) = 1 (P(U=A) + P(U=B)) = 1 (1/5 + 1/5) = 3/5,

since the total probability () = 1.

[2 marks]

b) Assume that you have two jointly distributed random variables U and V. Given the following conditional probabilities P(V|U):



U=A

U=B

U=C

V=1

9/10

7/10

4/10

V=2

1/10

3/10

6/10





First verify that the P(V|U)s are valid. Second, compute the joint probabilities and construct the associated joint  probability table for  P(V,U). Third, verify your computations for P(V,U)s are correct. A

[5 marks]

Answer:

For two jointly distributed random variables U and V, given a value taken by U, the sum over all values that V takes must be one. For each U = u, u{A,B,C} (| = ) = 1, e.g., P(V=1|U=B) + P(V=2|U=B) = 7/10 + 3/10 = 1.

[1 mark]

The joint probability table can be constructed from computed P(V,U)s



U=A

U=B

U=C

V=1

9/50

7/50

12/50

V=2

1/50

3/50

18/50

The individual joint probability can be computed as P(V=v,U=u) = P(V=v|U=u)P(U=u). For example, P(V=1,U=A) = P(V=1|U=A)P(U=A) = (9/10)(1/5) = 9/50.

[3 marks]

The total probabilities given in the joint probability table must be one. For the table above, this is indeed true since 9/50 + 1/50 + 7/50 + 3/50 + 12/50 + 18/50 = 50/50 = 1.

[1 mark]

c)  Compute the marginal probabilities P(V). Verify your answer that they constitute a distribution. A,C

[3 marks]

Answer:

P(V=1) = P(V=1,U=A) + P(V=1,U=B) + P(V=1,U=C) = 9/50 + 7/50 + 12/50 = 28/50.

P(V=2) = P(V=2,U=A) + P(V=2,U=B) + P(V=2,U=C) = 1/50 + 3/50 + 18/50 = 22/50.

Note that P(V=1) + P(V=2) = 28/50 + 22/50 = 1.

[3 marks]