Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit


Fall 2021 COSC 3P71 Artificial Intelligence: Assignment 2


Goal: Showcase your understanding of Genetic Algorithms (GA) and prepare a technical report examining the algorithm’s performance in a given problem. Start this assignment soonest because it will require time and cannot be completed on last minute, and without the required report, one cannot score well.

Languages: Any programming language which will compile and/or run on the lab computers (within reason).

Task: Below are two options for problems to implement for assignment 2. You will choose one to implement for this assignment. This assignment has 3 parts. First, you must implement a GA system as described in lecture for one of the described problems. Next, you must perform a number of experiments with your GA system and collect data regarding its performance. The final step will be preparing a technical report to present your findings. Specific details regarding the requirements of the GA implementation, the experiments to be performed, and the format and content of the report will be described under each option.

As a reminder, the basic procedure of a GA is as follows:

Read problem instance data set GA parameters
generate a random initial population, POP, of size
popSize for gen=1 to MAXGEN do
    evaluate fitness of each individual in POP
    select a new population using a selection
    strategy apply crossover and mutation
end for

A GA should have the following components:

 Initial Population Initializer: Creates a population of size popSize of randomized individuals as described in class.

 Chromosome: A chromosome encodes a solution to the problem being solved. Your representation will depend on the assignment option you select. See the option details below for possible chromosome representations.

 Reproduction: Use Tournament Selection (remember K = 2, 3, 4 or 5).

 Crossover: Given two individuals, a crossover creates two offspring. Implement your GA using the following crossover strategies independently:

– Uniform order crossover (UOX) with a bitmask

– A crossover of your choice. For example: 1-point or 2-point crossover, ordered crossover, PMX

 Mutator: Given an individual, a mutator creates a mutated individual. Implement your GA using a mutation operator of your choice (from those discussed in class)

 Fitness evaluation function: A function that utilizes information about the problem to evaluate the strength of a chromosome. Your fitness evaluation function will change depending on the option you select. See option details for specific evaluation functions you can utilize.

 Genetic algorithm system: The implementation of the GA system. This file should glue together the various components of your system.

 User parameters: Population size, maximum generation span, probability of (crossover, mutation, etc.)

Your GA program should permit the user to easily define his/her own genetic parameters and data (e.g., crossover rate, mutation rate, population size, maximum generation span etc.....).

BONUS: (For a bonus of 2% to your total course grade) Incorporate into your experiments your own innovative idea. This could be a different initial population representation and creation strategy, a different selection scheme, a different (third) crossover not discussed in class, etc. Alternatively, you could introduce a local search into your GA.


Assignment 2 Option A

Implementation: You will implement a GA for the travelling salesman problem (TSP). A travelling salesman needs to travel to n cities to do business. Given the location of the cities, find a route that traverses all cities exactly once and returns to the original city the salesman began in. The goal is to minimize the total distance travelled within the route.

Chromosome: The TSP problem can be represented using a permutation of n integers. Each city is assigned a number, and the order of the numbers in your chromosome represent the route that the salesman should take.

Note that for a permutation, each number appears in the chromosome exactly once. When implementing crossover and mutation, take care that your implemented procedures do not break this constraint. It is up to you to ensure that your implemented methods do not create invalid chromosomes, either by modifying the procedures to avoid illegal chromosomes, or by create a repair procedure to fix invalid chromosomes.

Fitness Evaluation: You will evaluate a solution by calculating the total Euclidean distance of a route, where a smaller total distance means a stronger chromosome. You are encouraged to experiment with your own ideas on top of this.


Experimental Analysis

Run your GA to compare the performance of the two crossover operators mentioned above by using the following parameters (and include elitism in all cases):

1) Crossover rate = 100 %, mutation = 0%

2) Crossover rate = 100 %, mutation = 10%

3) Crossover rate = 90 %, mutation 0%

4) Crossover rate = 90 %, mutation 10%

5) Determine your own best parameter settings

(PS. For elitism), first consider an elite strategy where only the best chromosome is replicated to the next generation. Next consider replication criteria where a certain percentage of individuals determined empirically (no greater than 10%) are allowed to replicate (include chromosome of best fitness value for this replication)

For each experiment mentioned above, run your GA at least 5 times. Use the data sets obtained from TSP-online benchmark data at:

http://elib.zib.de/pub/mp-testdata/tsp/tsplib/tsp/ulysses22.tsp

http://elib.zib.de/pub/mp-testdata/tsp/tsplib/tsp/eil51.tsp

Output the following to a file or standard output:

a) All GA parameters, including random number seed

b) Per each generation: best fitness value, average population fitness value

c) Per each run: best solution fitness and its corresponding best solution chromosome

For your analysis, compute for the multiple runs: average of best fitness per generation and average population fitness per generation. Using a graph drawing tool such as excel, plot well labeled graphs for experiment 1, above including experiment 2 (if done). Types of graphs you plot for experiment 2 depend on your incorporated idea, if any. Feel free to experiment with different crossover and mutation rates. You will set your own Pop-Size and generation size.


Assignment 2 Option B

Implementation: You will implement a GA to generate keys responsible for decrypting various pieces of encrypted text.

Cryptanalysis is an active and challenging area of research. The security of cryptographical systems is more important than ever, given how much of our lives take place online. We store our memories online, in the form of pictures and video, our schedules, we make purchases, or transfer money between accounts. No one would feel comfortable announcing their credit card number out loud in a crowded room. However, most people feel comfortable paying online by credit card, even though their credit card number will pass through tens or hundreds of machines owned by complete strangers. This is because there are a number of crypto-systems commonly used today, which we trust to keep our private information private. In order to ensure that these system remain secure, researchers are regularly testing and investigating potential avenues of attack, include the use of evolutionary algorithms, like genetic algorithms, to find the key associated with some encrypted text. In this assignment we will be using a genetic algorithm to find the password used to encrypt some text with a simple crypto system, the Vigenere Cipher.

Chromosome: For a chromosome representation, each chromosome must represent a string to use as a key for encryption and decryption. The simplest representation to use is a character array, mutation and crossover can then be performed on the characters of the array, although other representations are possible. For simplicity your chromosomes should only contain the letters ’a’ to ’z’ and ’-’.

Fitness evaluation function: The fitness function should take an individual and produce a real number describing the suitability of the solution encoded in the individual. In our case the fitness function will describe how well the individual performed at decrypting the text. One possible fitness evaluation function is provided in “Option B Provided Code.java”, but feel free to experiment with other options. With the provided fitness function smaller numbers indicate a more fit solution.

Using your GA implementation perform the following experimentation:

1. Run your GA to compare the performance of the two crossover operators mentioned above by using the following parameters (and include elitism in all cases):

a Crossover rate = 100 %, mutation = 0%

b Crossover rate = 100 %, mutation = 10%

c Crossover rate = 90 %, mutation 0%

d Crossover rate = 90 %, mutation 10%

e Determine your own best parameter settings

For each experiment mentioned above, run your GA at least 5 times with 5 different random number seeds. Encrypted text will be provided. You should repeat the experiments above for both pieces of encrypted text provided. Since GAs are stochastic you will likely not get the same result for each run.

For each run your GA system should output the following to a file or standard out:

– All GA parameters, including random number seed

– Per each generation: best fitness value, average population fitness value

– Per each run: best solution fitness and its corresponding best solution chromosome

Implementation Notes:

For the sake of testing some encrypted text is included below, along with the key used and the plain text.

You will not be told of the length of the key in advance, your GA must determine the length of the password as well. To accommodate this ’-’ will be taken as a null character and the fitness function will simply ignore it. For example, the key ”p-a-s-s-w-o-r-d” will become ”password”. This will allow your GA to test keys up to the length of the chromosome. For example, if a chromosome is simply a character array of length 50 containing ’a’ to ’z’ and ’-’, then the GA will test passwords up to 50 characters long.

The provided fitness function will accept some text, a key, and return a real number, v, indicating how much the decrypted result looks like English. v will be greater than or equal to 0 with smaller numbers indicating a better solution.

There will be code provided for encryption, decryption, and determining the fitness of a solution. Feel free the modify this code if needed.

Examples(Note that spaces were added to the decrypted lines for readability. The provided encrypt and decrypt functions will remove spaces from the text.)


Additional Notes

Your implementation is not expected to find a perfect solution every run, for many runs your implementation may not find the complete password. The purpose of the assignment is to implement a genetic algorithm, and to prepare a scientific report, not to find the best possible solution to the decryption problem.

The provided fitness function is very basic, there is certainly lots of room for improvement. Perhaps your innovative idea could be an improvement to this fitness function. A local search which fine tunes the solution produced by the GA could also be effective.


Assignment Report

Once your data is collected and your analysis is complete, you will prepare a summarized report of your findings using IEEE format introduced to you during tutorial. IEEE format details are found at: https://www.ieee.org/conferences/publishing/templates.html. The report should have the following sections and each section should address the listed points.

– Introduction

∗ BRIEFLY introduce the concepts and topics discussed in the report.

∗ Precisely define the problem you implemented and explain why its solution is important.

Background

∗ This section should explain the algorithms used in the report (pseudo code is helpful) and may provide other information which you feel will be relevant to someone trying to understand your results.

* The goal of a background section is that if someone who did not know anything about GAs read your report, they would have enough details to understand what you did for your experiment.

– Experimental Setup

∗ This section should provide enough information about your experiments to allow someone else to duplicate your results.

∗ This should include algorithm parameters used, the crossover and mutation operators used, and any other relevant implementation details.

– Results.

∗ This section should summarize your findings.

∗ For your multiple runs compute the average of best fitness per generation and average population fitness per generation. Using a graph drawing tool such as excel, plot well labeled graphs for your experiments.

∗ Also include summary tables describing the fitness of your final solution. Summary statistics such as min, max, mean, median, and standard deviation should be included in your tables. Tests for statistical significance would also be appropriate, for example T-Tests or Mann Whitney U tests.

∗ Explain your graphs/data in detail and emphasize the similarities and differences between different algorithm configurations.

– Discussions and Conclusions.

∗ This section should provide a BRIEF summary of what experiments you performed and the results you observed.

∗ Following this BRIEF summary, you should discuss your opinions regarding your results and what conclusions you’ve arrived at.

∗ This could include issues like which crossover performed better. If more than one mutation type was tried, which one performed better. If you included local search, did it help? How did the choice of GA parameters affect the final outcome etc.?

– References.

∗ List your sources here. The text of the report should contain references to your sources.

This report is very important, so be sure to include it. Start early, gathering the data and doing the experimental analysis will take much more time than coding the assignment.


Submission Details:

Your electronic submission should include:

Your source code

An executable

Instructions for compiling/running your program and changing parameters

The data you’ve generated for your report

Your written Latex report as a pdf or doc file.

Submission will be done electronically using Sakai. Any questions regarding submission can be directed to Tyler at [email protected].

Please note that the virtual COMMONS is available to all students at Brock. You are not required, but if you prefer to (or need to) you can use the virtual COMMONS instead of a personal computer. Machines in the virtual COMMONS have IDEs for Java, Python, and C#.

Any questions/concerns regarding the grading of any assignment MUST be raised within 7 days of graded assignment hand-back. In this case, please send your concerns/questions to Course Coordinator: Tyler Crane at: “[email protected]”. To better serve you, please don't send multiple queries on the same topic to Tyler, Professor and other TAs. Tyler will be the point of contact for any such queries and the Professor will receive them all at once from Tyler.

Feel free to use any language (with reason) as long as it can be opened and executed on the lab computers. Examples include Java, C#, C++, and Python. No matter your choice of language, ensure you have provided sufficient comments such that your program can be understood. At minimum, include a comment describing each function/method and class/module. Unity projects are not allowed for this assignment.

This assignment is to be completed individually. Plagiarism detection software will be applied in this course for all submitted work. Additionally, a number of assignments will be randomly selected and the authors will be asked to explain their code and submitted documents.