COIY065H7 Machine Learning 2022
Question 1 [60 marks]
(1a1) Provide the mathematical formula of an error/loss function that is suitable for training neural networks in a classification problem. Explain the role of the various terms in this equation.
(5 marks)
(1a2) Provide the mathematical formula of an error/loss function that is suitable for training neural networks in a regression problem. Explain the role of the various terms in this equation.
(5 marks)
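As an illustration for parts (1a1) and (1a2), the sketch below computes mean categorical cross-entropy, a common classification loss, and mean squared error, a common regression loss. The function names and sample values are hypothetical and are not part of the exam paper. Other loss functions would also be valid answers.

```python
import math

def cross_entropy(targets, probs, eps=1e-12):
    """Mean categorical cross-entropy: -(1/N) * sum_n sum_k t_nk * ln(y_nk)."""
    total = 0.0
    for t_row, p_row in zip(targets, probs):
        # eps guards against log(0) when a predicted probability is zero
        total -= sum(t * math.log(max(p, eps)) for t, p in zip(t_row, p_row))
    return total / len(targets)

def mean_squared_error(targets, outputs):
    """Mean squared error: (1/N) * sum_n (t_n - y_n)^2."""
    return sum((t - y) ** 2 for t, y in zip(targets, outputs)) / len(targets)

# Hypothetical data: one-hot targets with predicted class probabilities,
# and real-valued targets with real-valued network outputs.
ce = cross_entropy([[1, 0], [0, 1]], [[0.9, 0.1], [0.2, 0.8]])
mse = mean_squared_error([1.0, 2.0, 3.0], [1.5, 2.0, 2.0])
```

In both formulas N is the number of training patterns, t is the target and y is the network output. The cross-entropy sum over k runs over the classes.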
(1b) During the COVID-19 outbreak, a local dealership launched online click-and-collect services, where customers can browse cars and order online, and then collect the car from the dealership office. The following figure shows data on customers’ visiting times and transactions after just a couple of days of online operation. The x-axis shows the time in minutes that customers spent on the website. The y-axis shows the amount of money these customers spent in thousands of pounds (£K).
[Figure: £K spent (y-axis, ticks 25 to 150) against time spent on visiting the website in minutes (x-axis, ticks 5 to 30).]
(1b1) How many features are used to describe customers’ behaviour in this case? Explain why.
(1 mark)
(1b2) Looking at these data, how many features does one have to select in order to accurately predict a customer’s spending and why?
(1 mark)
(1b3) How can one measure in computational terms the similarity between customers of the same group and between customers of distinct groups?
(3 marks)
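One common way to approach part (1b3) is a distance measure such as the Euclidean distance, averaged within a group and between groups. The sketch below assumes this measure. The customer points are hypothetical, since the figure data are not reproduced here.

```python
import math
from itertools import combinations, product

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean_intra_distance(group):
    """Average distance over all pairs of customers within one group."""
    pairs = list(combinations(group, 2))
    return sum(euclidean(a, b) for a, b in pairs) / len(pairs)

def mean_inter_distance(group1, group2):
    """Average distance over all pairs drawn from two different groups."""
    pairs = list(product(group1, group2))
    return sum(euclidean(a, b) for a, b in pairs) / len(pairs)

# Hypothetical customers: (minutes on website, £K spent)
short_visits = [(5, 25), (6, 30), (7, 28)]
long_visits = [(25, 120), (27, 125), (30, 140)]

intra = mean_intra_distance(short_visits)
inter = mean_inter_distance(short_visits, long_visits)
```

Well-separated groups give a small intra-group distance and a large inter-group distance. In practice the two features would usually be rescaled first so that neither dominates the distance.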
(1c) A fuzzy rule-based system is controlling the speed of a train. The system can smoothly slow and stop the train from any speed and distance from a station. The design of the system has been based on the following input and output variables and ranges:
Input 1: SPEED
Linguistic Range    Low     High
Fast                26.5    70
Medium Fast         6.5     46.5
Slow                2.5     10.5
Very Slow           1       4
Stopped             0       2
Output 1: THROTTLE
Linguistic Range    Low     High
Full                60%     100%
Medium              20%     80%
Slight              3%      30%
Very Slight         1%      5%
No                  0       2%
(1c1) Based on the ranges and values shown above, explain what the universe of discourse of each input and output variable is. What could the fuzzy membership functions of the input and output variables be, if the system designers have used continuous triangular functions, or a combination of continuous triangular and trapezoidal functions, when defining fuzzy sets for each variable?
(10 marks)
(1c2) Give examples of two fuzzy rules that can be included in the rule base of the system to cover the scenario that the train is near a station.
(5 marks)
(1c3) Use the example you provided above to explain what rules can be triggered when the train speed is 3 km/h and the distance to the station is 1.8 m.
(5 marks)
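To illustrate parts (1c1) and (1c3), the sketch below evaluates triangular membership functions for the SPEED variable at 3 km/h. The Low and High values come from the SPEED table. Placing each peak at the midpoint of its range is an assumption, as is using triangles for the edge sets Fast and Stopped, where a trapezoid could equally be used. The distance input is not specified in the paper, so it is left out.

```python
def triangular(x, low, peak, high):
    """Triangular membership: 0 at or outside low/high, rising to 1 at peak."""
    if x <= low or x >= high:
        return 0.0
    if x <= peak:
        return (x - low) / (peak - low)
    return (high - x) / (high - peak)

# (low, high) per linguistic value, taken from the SPEED table.
SPEED_SETS = {
    "Fast": (26.5, 70),
    "Medium Fast": (6.5, 46.5),
    "Slow": (2.5, 10.5),
    "Very Slow": (1, 4),
    "Stopped": (0, 2),
}

def speed_memberships(speed):
    """Membership degree of a crisp speed in every SPEED fuzzy set."""
    result = {}
    for name, (low, high) in SPEED_SETS.items():
        peak = (low + high) / 2  # assumption: symmetric triangle
        result[name] = triangular(speed, low, peak, high)
    return result

memberships = speed_memberships(3)
active = [name for name, degree in memberships.items() if degree > 0]
```

Under these assumptions a speed of 3 km/h belongs to Slow with degree 0.125 and to Very Slow with degree of about 0.667, so only rules whose antecedent mentions one of these two sets can fire on the speed input.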
(1d) The weights of a neural network topology are encoded in a chromosome with 16 genes.
(1d1) Create your own example and explain how this approach can be applied to two different neural networks of the same topology, when the weights are real numbers in the interval (-1, +1).
(10 marks)
(1d2) Use the two neural networks you have encoded above as a pair of parents, i.e. Parent-1 and Parent-2. Apply uniform crossover with mask vector [0100110001001100], where 0 refers to inheriting a gene from Parent-1 and 1 refers to inheriting a gene from Parent-2, to generate offspring, Child-1 and Child-2. Show and explain the process you followed to create the children and the way these can be mapped back into neural networks.
(10 marks)
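The process in (1d1) and (1d2) can be sketched as follows. The mask is the one given in the question. Everything else is an assumption made for illustration: the parent weight values, the 3-4-1 topology without biases (3×4 + 4×1 = 16 weights), and the convention that Child-2 takes the complementary gene at every position.

```python
MASK = [int(bit) for bit in "0100110001001100"]

def uniform_crossover(parent1, parent2, mask):
    """Child-1 follows the mask (0 = Parent-1, 1 = Parent-2).
    Child-2 takes the opposite parent at each position (assumed convention)."""
    child1 = [g2 if m else g1 for g1, g2, m in zip(parent1, parent2, mask)]
    child2 = [g1 if m else g2 for g1, g2, m in zip(parent1, parent2, mask)]
    return child1, child2

def decode(genes):
    """Map 16 genes back to a hypothetical 3-4-1 network without biases."""
    hidden = [genes[i * 3:(i + 1) * 3] for i in range(4)]  # 4 hidden nodes x 3 inputs
    output = genes[12:16]                                  # 1 output node x 4 hidden
    return hidden, output

# Hypothetical parents with weights inside (-1, +1)
parent1 = [round(0.05 * (i + 1), 2) for i in range(16)]  # 0.05 ... 0.80
parent2 = [-g for g in parent1]                          # -0.05 ... -0.80

child1, child2 = uniform_crossover(parent1, parent2, MASK)
hidden_weights, output_weights = decode(child1)
```

With these parents, Child-1 carries Parent-2 genes (the negative values) exactly at the positions where the mask is 1, which makes the inheritance pattern easy to check by eye.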
(1e) In an environmental AI application, a chemical sensor array measures a set of features for further processing by an intelligent system. One of the sensors appears to have a fault and requires maintenance urgently. The machine learning engineers believe that the sensing environment is stable, and that the statistical characteristics will not change significantly during maintenance. Instead of taking the intelligent system off-line, they suggest using historic data to estimate the sensor’s input to the system and integrating it with the other sensors’ data. From a set of 16 samples in a randomly selected time window they calculate a sample mean of 1.35 for the feature measured by this sensor, and from a longer set of 1,000,000 measurements from the operation of the sensor last month they estimate a mean of 1.4. To evaluate whether this plan is acceptable, they conduct a statistical hypothesis test at the 0.05 significance level, which produces the acceptance interval [1.237, 1.463]. Does the outcome of the hypothesis test support the suggestion to take the sensor off-line for maintenance and in the meantime feed the estimated mean value to the AI system? Explain your view. What is the percentage of confidence in that decision?
(5 marks)
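The arithmetic behind part (1e) can be checked with a short sketch. It uses only the numbers given in the question. The variable names are hypothetical, and the interpretation is an assumption: the null hypothesis is taken to be that the mean is still 1.4, and it is not rejected when 1.4 lies inside the stated interval.

```python
alpha = 0.05             # significance level from the question
sample_mean = 1.35       # mean of the 16 samples
historic_mean = 1.4      # mean of the 1,000,000 measurements
low, high = 1.237, 1.463 # acceptance interval from the question

# The interval is centred on the sample mean, with half-width 0.113.
centre = (low + high) / 2
half_width = (high - low) / 2

# Assumed reading: fail to reject the null hypothesis if the historic
# mean lies inside the interval.
fail_to_reject = low <= historic_mean <= high

# Confidence level that corresponds to the significance level.
confidence_percent = round((1 - alpha) * 100, 1)
```

Under this reading the null hypothesis is not rejected, and a significance level of 0.05 corresponds to 95% confidence.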
Question 2 [20 marks]
(2a) Discuss three techniques for knowledge acquisition covered in the module. Explain how one could use each one of them to capture knowledge for modelling user behaviour in an interactive intelligent system.
(10 marks)
(2b) Explain the notion of a swarm. Use high-level description or pseudo-code to explain the overall operation of the particle swarm algorithm. Present each step of the algorithm and the operations that take place, explaining how one can use this method for training a neural network.
(10 marks)
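A minimal sketch of a global-best particle swarm, as one possible illustration for part (2b). The inertia weight `w`, the acceleration coefficients `c1` and `c2`, the swarm size and the toy task (fitting a single linear neuron to y = 2x + 1) are all assumptions chosen for brevity. The module may present a different variant.

```python
import random

def pso(loss, dim, n_particles=20, iterations=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Global-best PSO: each particle is a candidate weight vector."""
    rng = random.Random(seed)
    # Step 1: initialise positions (weights) and velocities
    pos = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Step 2: evaluate and record personal and global bests
    pbest = [p[:] for p in pos]
    pbest_loss = [loss(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_loss[i])
    gbest, gbest_loss = pbest[g][:], pbest_loss[g]
    for _ in range(iterations):
        for i in range(n_particles):
            # Step 3: update velocity (inertia + cognitive + social), then position
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            # Step 4: re-evaluate and update the bests
            current = loss(pos[i])
            if current < pbest_loss[i]:
                pbest[i], pbest_loss[i] = pos[i][:], current
                if current < gbest_loss:
                    gbest, gbest_loss = pos[i][:], current
    return gbest, gbest_loss

# Hypothetical training set for one linear neuron y = w0 * x + w1
DATA = [(x, 2 * x + 1) for x in (-1.0, -0.5, 0.0, 0.5, 1.0)]

def neuron_loss(weights):
    """Mean squared error of the neuron over the training set."""
    return sum((weights[0] * x + weights[1] - t) ** 2 for x, t in DATA) / len(DATA)

best_weights, best_loss = pso(neuron_loss, dim=2)
```

For a larger network the only change is the loss function: it would decode the particle's position into the network's weights, run a forward pass over the training set, and return the training error.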
Question 3 [20 marks]
(3a) Neuro-fuzzy systems have been proposed as a promising approach to build intelligent systems that would combine the advantages of both neural networks and fuzzy systems.
(3a1) What is a neuro-fuzzy system?
(2 marks)
(3a2) In the context of this module, discuss two alternative ways to combine neural networks and fuzzy systems to form a neuro-fuzzy system.
(6 marks)
(3b) What is competitive learning? Derive a competitive learning algorithm presented in this module, describe its steps and the operators used, and explain the operation of the algorithm.
(12 marks)
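One step of a simple winner-take-all competitive learning rule can be sketched as below. This is an assumed textbook form (winner chosen by Euclidean distance, only the winner's weights moved toward the input) and not necessarily the algorithm presented in the module. The weights, input and learning rate are hypothetical.

```python
def winner_index(weights, x):
    """Index of the unit whose weight vector is closest to input x."""
    def squared_distance(w):
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))
    return min(range(len(weights)), key=lambda j: squared_distance(weights[j]))

def competitive_step(weights, x, lr=0.5):
    """Move only the winning unit's weights toward x: w <- w + lr * (x - w)."""
    j = winner_index(weights, x)
    weights[j] = [wi + lr * (xi - wi) for wi, xi in zip(weights[j], x)]
    return j

# Two competing units with hypothetical initial weights
weights = [[0.0, 0.0], [1.0, 1.0]]
winner = competitive_step(weights, [0.8, 0.6])
```

Here unit 1 wins because it is closer to the input, and its weights move from [1.0, 1.0] to [0.9, 0.8], while unit 0 is left unchanged.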
Question 4 [20 marks]
(4a) In the context of genetic computing, explain two types of encoding covered in the module. Provide your own example for each type assuming that there are three variables to encode in a chromosome.
(6 marks)
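As a possible illustration for part (4a), the sketch below encodes three variables first as a binary string and then as a vector of real numbers. Binary and real-valued encoding are assumed here to be among the types covered in the module. The variable values and the 4-bit width are hypothetical.

```python
def encode_binary(values, bits=4):
    """Concatenate fixed-width binary codes of non-negative integers."""
    return "".join(format(v, f"0{bits}b") for v in values)

def decode_binary(chromosome, bits=4):
    """Split the chromosome into fixed-width genes and convert back to integers."""
    return [int(chromosome[i:i + bits], 2) for i in range(0, len(chromosome), bits)]

# Binary encoding: three integer variables, 4 bits each
variables = [5, 12, 3]
binary_chromosome = encode_binary(variables)  # "010111000011"

# Real-valued encoding: one gene per variable, stored directly
real_chromosome = [0.25, -1.5, 3.0]
```

In the binary case crossover and mutation act on individual bits, whereas in the real-valued case they act on whole numbers.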
(4b) A simple Genetic Algorithm with fitness-proportionate selection, population size 4, single-point crossover, and bitwise mutation is applied to the fitness function f(x) = 2 × (number of 1's in the chromosome x) + (number of 0's in the chromosome x), where x is a chromosome string of length 7. The initial, randomly generated, population of chromosome strings is:
Chromosome label    Chromosome string
A                   0111111
B                   1110111
C                   0010000
D                   0011010
After applying the selection operator, two pairs of chromosomes are chosen as parents: chromosomes B and D constitute the first pair, and chromosomes B and C the second pair of parents. Parents B and D cross over after the first bit position to form offspring E and F, and parents B and C do not cross over, instead forming offspring that are exact copies of B and C. Next, offspring E is mutated at the sixth bit position to form Em, offspring F and C are not mutated at all, and offspring B is mutated at the first bit position to form Bm.
(4b1) What is the fitness of each member of the initial population?
(2 marks)
(4b2) What will the new population be after one generation?
(8 marks)
(4b3) What is the fitness of each member of the new population? Does the population improve? Explain your view.
(4 marks)
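The generation described in (4b) can be traced with a short sketch. The fitness function, chromosomes, crossover point and mutation positions are taken from the question. Two conventions are assumptions: bit positions are counted from the left starting at 1, and E is the offspring that keeps the first bit of B while F keeps the first bit of D.

```python
def fitness(chromosome):
    """f(x) = 2 * (number of 1's) + (number of 0's)."""
    return 2 * chromosome.count("1") + chromosome.count("0")

def crossover(parent1, parent2, point):
    """Single-point crossover after the given bit position."""
    return parent1[:point] + parent2[point:], parent2[:point] + parent1[point:]

def flip(chromosome, position):
    """Flip the bit at a 1-indexed position counted from the left."""
    i = position - 1
    flipped = "1" if chromosome[i] == "0" else "0"
    return chromosome[:i] + flipped + chromosome[i + 1:]

initial = {"A": "0111111", "B": "1110111", "C": "0010000", "D": "0011010"}

# B and D cross over after the first bit; B and C are copied unchanged.
E, F = crossover(initial["B"], initial["D"], 1)

new_population = {
    "Em": flip(E, 6),             # E mutated at the sixth bit
    "F": F,                       # not mutated
    "C": initial["C"],            # not mutated
    "Bm": flip(initial["B"], 1),  # B mutated at the first bit
}

initial_fitness = {label: fitness(c) for label, c in initial.items()}
new_fitness = {label: fitness(c) for label, c in new_population.items()}
```

Under these conventions the total fitness goes from 44 in the initial population to 42 in the new one.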
2022-08-09