FIT5226 Multi-agent Systems and Collective Behaviour
FIT5226 Project
| Assessment | Topic | Release | Due | Weight | Assessment mode |
| in-semester | Project Stage 1 (Table-based RL, single agent) | Wk 3 | April 11 | 15% | Individual |
| in-semester | Project Stage 2 (MARL coordination task) | Wk 6 | Wk 9 | 35% | Individual |
| exam | All topics but no programming | | | 50% | |
This project constitutes your entire in-semester assessment for the unit. Stage 1 (the current stage) concerns tabular Q-learning for a single agent.
Tasks for Stage 1
Your agent has 8 actions that it can execute, namely to step to any neighboring field (including along diagonals). It starts at a random location. When it reaches location A, it automatically picks up the item; when it reaches location B, it automatically discharges the item. No specific action is required from the agent for picking up or dropping an item; it only needs to step onto the corresponding field. Once the item has been discharged at B, the agent has completed its task.
The agent is allowed to observe its own location, the location of A and whether it carries an item.
The task the agent has to learn is to pick up the load at A and deliver it to B taking as few steps as possible regardless of its (random) starting position.
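As a minimal sketch of the setup described above, the eight moves can be written as row/column offsets and the permitted observation as a tuple. The names `ACTIONS`, `encode_state` and `count_states` are illustrative assumptions, not part of the assignment.

```python
# Eight moves: every (d_row, d_col) offset except standing still.
ACTIONS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]


def encode_state(agent_pos, a_pos, carrying):
    """Pack what the agent may observe into a hashable tuple (assumed layout)."""
    return (agent_pos[0], agent_pos[1], a_pos[0], a_pos[1], int(carrying))


def count_states(size):
    """Upper bound on distinct observations for a size x size grid:
    agent cell x location of A x carrying flag."""
    return (size * size) * (size * size) * 2
```

For a 5x5 grid this gives at most 1,250 observations and, with 8 actions each, 10,000 table entries, which is why a table-based method is feasible here.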
2. Environment: Implement a grid world, in which the agent can move and execute its task. To make the task manageable, we use a 5x5 grid world. However, your code should be set up to work for any size (so use parameters for the size and for the target location). We simply choose a smaller grid here to limit memory and time requirements for the training. Note that you will have to integrate this with a visualization in the final task. It is highly advisable that you consider this for your code design right from the beginning.
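One possible shape for such an environment is sketched below, with the grid size and the location of B passed as parameters. Several details are assumptions that the task description does not fix: the class and method names, the reward values, clamping moves at the border, keeping A distinct from B, and picking up immediately when the agent starts on A.

```python
import random

# Eight moves as (d_row, d_col) offsets; index 7 is (1, 1).
ACTIONS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]


class GridWorld:
    """Hypothetical pick-up-and-deliver grid world (illustrative only)."""

    def __init__(self, size=5, b_pos=(4, 4), seed=None):
        self.size = size
        self.b_pos = b_pos
        self.rng = random.Random(seed)
        self.reset()

    def _random_cell(self):
        return (self.rng.randrange(self.size), self.rng.randrange(self.size))

    def reset(self):
        """Start a new episode with random A and random agent position."""
        self.a_pos = self._random_cell()
        while self.a_pos == self.b_pos:  # assumption: A and B differ
            self.a_pos = self._random_cell()
        self.agent = self._random_cell()
        self.carrying = self.agent == self.a_pos  # assumption: pick up at start
        return self.observe()

    def observe(self):
        """Own location, location of A, and the carrying flag."""
        return (*self.agent, *self.a_pos, int(self.carrying))

    def step(self, action):
        """Apply action index `action`; return (observation, reward, done)."""
        dr, dc = ACTIONS[action]
        # Assumption: moves that would leave the grid are clamped to the border.
        row = min(max(self.agent[0] + dr, 0), self.size - 1)
        col = min(max(self.agent[1] + dc, 0), self.size - 1)
        self.agent = (row, col)
        if self.agent == self.a_pos:
            self.carrying = True
        done = self.carrying and self.agent == self.b_pos
        reward = 10.0 if done else -1.0  # assumed reward scheme
        return self.observe(), reward, done
```

Keeping all state in plain attributes (`agent`, `a_pos`, `b_pos`, `carrying`) means a later visualization can read them directly without changing the environment.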
3. Learning: Implement a table-based Q-Learning algorithm for this agent in Python as a Jupyter notebook. You are not allowed to use any reinforcement learning libraries for this; your Q-learning must be implemented "from scratch". You are, of course, allowed to use all modules in the standard distribution of Python (e.g. random). The only library you should need beyond this is NumPy (and later Matplotlib for the visualization). If you want to use any additional libraries apart from NumPy and Matplotlib, check with your tutor beforehand whether these are admissible.
4. Training and Testing: Train your Q-Learner. Devise a test procedure and metrics that you can use to show that your agent learns the task successfully and that it learns to solve the task independently of the location of A.
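The core of a from-scratch tabular Q-learner is small. The sketch below shows an epsilon-greedy action choice and the standard Q-learning update, using only the standard library. The function names, the dictionary-based table and the default values for `alpha` and `gamma` are assumptions for illustration; a NumPy array indexed by state and action would serve equally well.

```python
import random
from collections import defaultdict

N_ACTIONS = 8  # one per neighboring field


def make_q_table():
    """Q-table as a dict; unseen states start with all-zero action values."""
    return defaultdict(lambda: [0.0] * N_ACTIONS)


def choose_action(Q, state, epsilon, rng=random):
    """Epsilon-greedy: explore with probability epsilon, else act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(N_ACTIONS)
    values = Q[state]
    best = max(values)
    # Break ties at random so no action is favored by its index.
    return rng.choice([a for a, v in enumerate(values) if v == best])


def q_update(Q, state, action, reward, next_state, done, alpha=0.1, gamma=0.95):
    """Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).

    On a terminal transition there is no future value to bootstrap from.
    """
    target = reward if done else reward + gamma * max(Q[next_state])
    Q[state][action] += alpha * (target - Q[state][action])
```

A training loop would repeatedly reset the environment, call `choose_action`, step the environment, and pass the resulting transition to `q_update` until the episode ends.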
Performance Level and Training Budget
To achieve full marks, your agent must learn to solve 100% of the possible scenarios in 10 steps or fewer, using fewer than 20,000 training episodes.
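One way to check this criterion is to enumerate every combination of start cell and location of A and to run the greedy policy under a step limit. The harness below is a sketch under stated assumptions: B is fixed at `b_pos`, A differs from B, border moves are clamped, and a policy is any function mapping an observation to a `(d_row, d_col)` move. The hand-coded `make_reference_policy` is not a learned agent; it exists only to sanity-check the harness.

```python
from itertools import product


def chebyshev(p, q):
    """Minimum number of 8-directional moves between two cells."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


def optimal_steps(start, a_pos, b_pos):
    """Shortest possible episode: reach A first, then B."""
    return chebyshev(start, a_pos) + chebyshev(a_pos, b_pos)


def run_episode(policy, start, a_pos, b_pos, size, max_steps):
    """Follow `policy` from `start`; return steps used, or None on failure."""
    pos, carrying = start, start == a_pos
    for step in range(1, max_steps + 1):
        dr, dc = policy((pos[0], pos[1], a_pos[0], a_pos[1], int(carrying)))
        pos = (min(max(pos[0] + dr, 0), size - 1),
               min(max(pos[1] + dc, 0), size - 1))
        carrying = carrying or pos == a_pos
        if carrying and pos == b_pos:
            return step
    return None


def success_rate(policy, size=5, b_pos=(4, 4), max_steps=10):
    """Fraction of (start, A) scenarios solved within `max_steps`."""
    cells = list(product(range(size), repeat=2))
    scenarios = [(s, a) for s in cells for a in cells if a != b_pos]
    solved = sum(run_episode(policy, s, a, b_pos, size, max_steps) is not None
                 for s, a in scenarios)
    return solved / len(scenarios)


def make_reference_policy(b_pos):
    """Hand-coded shortest-path policy, used only to test the harness."""
    def sign(x):
        return (x > 0) - (x < 0)

    def policy(state):
        row, col, a_row, a_col, carrying = state
        t_row, t_col = b_pos if carrying else (a_row, a_col)
        return (sign(t_row - row), sign(t_col - col))

    return policy
```

Under these assumptions the longest optimal route on a 5x5 grid takes 8 steps (at most 4 to reach A and at most 4 more to reach B), so the 10-step limit leaves little room for detours. Comparing each episode length with `optimal_steps` gives a finer metric than the success rate alone.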
Use of Generative AI
You must give a declaration that fully explains how and for which components you used generative AI.
Any use of generative AI must be appropriately acknowledged (see Learn HQ).
2026-04-01