COMP226 Assignment 1
Continuous Assessment Number | 1 (of 2)
Weighting | 10%
Assignment Circulated | Monday 23 February 2026
Deadline | 17:00 Fri 20 March 2026
Submission Mode | Submit a single Python file “solution.py” to the CodeGrade Assignment on Canvas
Learning Outcomes Assessed | Have an understanding of market microstructure and its impact on trading.
Goal of Assignment | Reconstruct a limit order book from order messages; compute quantities based on the limit order book
Marking Criteria | Pre-deadline visible CodeGrade tests of correctness of 6 functions (70%); Post-deadline CodeGrade tests of 4 further functions (30%)
Submission necessary in order to satisfy module requirements | No
Expected time taken | Roughly 8-12 hours
Module Coordinator (Contact) | Dr Andrew Roxburgh ([email protected])
Standard UoL policy applies: every student may submit up to 7 calendar days after the deadline without penalty. Students who have a Student Support Information Sheet (SSIS) that includes the "coursework deadlines extensions allowed" adjustment can make use of a further 7 day extension period (a total of 14 calendar days after the deadline). Any attempt to submit more than 7 days (or 14 days with an adjustment) after the deadline will be classed as non-submission and will receive a mark of zero.
Submissions are automatically put through a plagiarism and collusion detection system. Students found to have plagiarized or colluded will likely receive a mark of zero. Do not discuss or show your work to others, and do not search for solutions to the assignment online. In previous years, students have had their studies terminated and left without a degree because of plagiarism or collusion.
In this assignment, the code is run with Python, either from any IDE or in the terminal, e.g., python main.py input/book_1.csv input/empty.txt
As a first step, please download comp226_a1.zip via the assignment page on Canvas. Then unzip comp226_a1.zip, which will yield the following contents in the directory comp226_a1, namely two complete files:
- main.py is the file you will run with Python, by specifying several command line arguments described below (see example above);
- common.py contains complete working functions that are used by main.py in conjunction with the incomplete functions in template.py;
and one incomplete file:
- template.py is the file that you will edit; the distributed version contains 10 empty functions.
Before you run anything, rename template.py to solution.py; only then will it work with main.py. solution.py is also the name of the file you need to submit.
To get 70% of marks, you will need to correctly complete the first 6 functions; if your answer is only partially correct you will get a mark less than 70%. If you have correctly completed all these 6 functions, you can then – and only then – get marks for the 4 “extra” functions, which together account for 30% of the marks.
You should submit a single Python file that contains your implementation of some or all of these 10 functions.
Your submission will be marked via extensive automated tests of correctness across a wide range of example cases, with some chosen randomly at test time:
- The tests for the first 6 functions, which give up to 70% of the marks, will run at the time of submission and are fully visible pre-deadline on CodeGrade (detailed guidance on using CodeGrade to improve your mark can be found at the end of this document);
- The tests for the final 4 functions will only run post-deadline, and only if you got full marks for the first 6 functions.
You can (and if required should) submit multiple times to repeatedly use the CodeGrade pre-deadline tests; for a correct solution CodeGrade will show that you pass all tests and have thus achieved the first 70% of marks. It probably does not make sense for you to work much on the final 4 functions until you have achieved this and submitted completely correct versions of the first 6 functions.
In addition to the visible pre-deadline tests on CodeGrade, for the first 6 functions, correct sample output is provided so that you can check correctness of your implementations “offline” (without submitting to CodeGrade). Offline testing is quick to do once you have set it up, and if you match all the offline examples then chances are that you will also pass the CodeGrade tests.
The first 6 functions to implement
The first 6 functions, which are worth 70% of the marks, are broken down into two groups. The percentages in square brackets show the breakdown of the marks by function.
These first 4 functions are intentionally very easy, and are meant to get you used to the format of the book. The next 2 functions are more involved and relate to reconstructing the order book from an initial book and a file of messages.
As seen in the example above, main.py takes as arguments the paths to two input files, the initial order book followed by the message file (the order of the arguments matters).
Existing code in common.py will read in a file like input/book_1.csv and create the corresponding two (possibly empty) order books as two data frames that will be stored in the dictionary book, a version of which will be passed to all of the functions that you are required to implement.
The first four of the functions that you need to implement compute limit order book stats, and can be developed and tested without parsing the order messages at all. In particular, you can develop and test the first four functions using an empty message file, input/empty.txt, as in the first example above.
Reconstructing the order book from messages
In the next section, we describe the two types of message, “Add” messages and “Reduce” messages. All you need to know to complete the assignment is that messages in the input file are processed in order, i.e., line by line, with “Add” messages passed to add and “Reduce” messages passed to reduce, along with the current book in both cases.
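The processing loop just described can be sketched as follows. This is only an illustration, not the code in main.py or common.py: the tags "A" and "R", the handler signatures, the convention that handlers return the updated book, and the toy handlers are all assumptions made for the example, so check them against the real message format and template.

```python
def process_messages(lines, book, add, reduce):
    """Apply each message to the book in file order.

    Assumption (hypothetical): the first whitespace-separated field is a tag,
    "A" for an Add message and "R" for a Reduce message.
    """
    for line in lines:
        fields = line.split()
        if not fields:  # skip blank lines, e.g. from an empty message file
            continue
        tag, payload = fields[0], fields[1:]
        if tag == "A":
            book = add(book, payload)
        elif tag == "R":
            book = reduce(book, payload)
        else:
            raise ValueError(f"unknown message type: {tag!r}")
    return book


# Toy handlers that only record what they were given, to show the dispatch.
def toy_add(book, payload):
    return book + [("add", payload)]


def toy_reduce(book, payload):
    return book + [("reduce", payload)]


result = process_messages(["A x 1\n", "\n", "R x 1\n"], [], toy_add, toy_reduce)
print(result)
```

The point of the sketch is the ordering: every message sees the book as left by the message before it, so the handlers cannot be applied out of order or in bulk.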
The message file contains one message per line (terminated by a single linefeed character, ’\n’), and each message is a series of fields separated by spaces. Here’s an example, which contains an “Add” message followed by a “Reduce” message:
An “Add” message will either:
• not cross the spread and then add a single row to the book (orders at the same price are stored separately to preserve their distinct “oid”s); or
• cross the spread, in which case it can affect any number of orders on the other side of the book (and may or may not result in a remaining limit order for residual volume).
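To make the two cases concrete, here is a minimal sketch of a buy limit order arriving at a toy book. It deliberately uses plain lists of dictionaries rather than the data frames in book, and the keys "oid", "price" and "size", the function name add_buy, and the convention that the best ask is the first element are assumptions for illustration only.

```python
def add_buy(asks, bids, oid, price, size):
    """Toy buy limit order against lists of dicts with keys oid/price/size.

    Assumes asks is sorted best (lowest) price first, oldest first within a
    price. Returns new (asks, bids) lists and leaves the inputs unmodified.
    """
    asks = [dict(o) for o in asks]
    bids = [dict(o) for o in bids]
    remaining = size
    # Cross the spread: consume asks priced at or below the buy limit price.
    while remaining > 0 and asks and asks[0]["price"] <= price:
        traded = min(remaining, asks[0]["size"])
        asks[0]["size"] -= traded
        remaining -= traded
        if asks[0]["size"] == 0:
            asks.pop(0)  # this ask is fully executed
    if remaining > 0:
        # Residual volume rests in the bid book as a single new row.
        bids.append({"oid": oid, "price": price, "size": remaining})
        # Stable sort: orders at the same price keep their arrival order.
        bids.sort(key=lambda o: -o["price"])
    return asks, bids


asks0 = [{"oid": "a", "price": 105, "size": 100},
         {"oid": "b", "price": 106, "size": 50}]
bids0 = [{"oid": "c", "price": 104, "size": 30}]

no_cross = add_buy(asks0, bids0, "d", 104, 10)   # below the best ask
cross = add_buy(asks0, bids0, "e", 105, 120)     # executes against "a"
```

In the second call the order trades 100 against the ask at 105, cannot trade at 106, and leaves a residual bid of 20 at 105. A sell order is the mirror image, with the roles of the two sides swapped.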
The provided example message files are split into cases that include crosses and those that don’t. This allows you to develop your code incrementally and test it on inputs of differing difficulty. We work through an example of each case, one by one. In each example we start from input/book_1.csv; we only show this initial book in the first case.
We provide sample output for 9 cases, namely all combinations of the 3 initial books (book_1.csv, book_2.csv, book_3.csv) and 3 message files, all found in the input subdirectory. The 3 message files are called:
file | description
messages_a.txt | add messages only, i.e., requires add but not reduce; for all three initial books, none of the messages cross the spread
messages_ar.txt | add and reduce messages, but for the initial book book_3.csv, no add message crosses the spread
messages_arc.txt | add and reduce messages, with some adds that cross the spread for all three initial books
The 9 output files can be found in the output subdirectory of the comp226_a1 directory.
A possible way to implement add and reduce that makes use of the different example message files is the following:
• Best price first, but when two orders have the same price, the earlier one is executed first
We provide sort that respects price-time precedence. It relies on the fact that the order ids increase as follows:
This method will ensure that limit orders are sorted first by price and second by time of arrival (so that for two orders at the same price, the older one is nearer the top of the book). You are encouraged to use sort in your own implementations. In particular, by using it you can avoid having to find exactly where to place an order in the book.
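The ordering that price-time precedence imposes can be illustrated on plain Python lists. This is not the provided sort: the function name price_time_sorted, the dictionary keys, and the use of integer order ids that increase with arrival time are assumptions for the sake of a small example.

```python
def price_time_sorted(orders, side):
    """Sort toy orders by price-time precedence, best order first.

    Assumption (hypothetical): "oid" is an integer that grows with arrival
    time, so a smaller oid means an older order.
    """
    if side == "ask":
        key = lambda o: (o["price"], o["oid"])    # lowest price is best
    elif side == "bid":
        key = lambda o: (-o["price"], o["oid"])   # highest price is best
    else:
        raise ValueError(f"side must be 'bid' or 'ask', got {side!r}")
    return sorted(orders, key=key)


asks = [{"oid": 3, "price": 105}, {"oid": 1, "price": 106}, {"oid": 2, "price": 105}]
bids = [{"oid": 4, "price": 103}, {"oid": 5, "price": 104}, {"oid": 6, "price": 104}]

sorted_asks = price_time_sorted(asks, "ask")
sorted_bids = price_time_sorted(bids, "bid")
print([o["oid"] for o in sorted_asks])
print([o["oid"] for o in sorted_bids])
```

In this sketch the best order always ends up at position 0. That is a property of the sketch only; which end of the sorted data frame holds the best order in the real book has to be checked against the provided sort.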
Be very careful when you index the bid or ask book after the book is sorted. You need to check whether index 0 or -1 points to the first row or the last row of the dataframe.
Hint on using logging in reconstruct
If you want to use this for debugging, you can turn it on with the –log flag, e.g.:
Then summarise(book) is used to give intermediate output after each message is processed.
For these final 30%, there are four functions that you are asked to implement. You can get marks for any one of these independently. The marks available are as follows:
The functions are defined by a full specification, given below, but we intentionally do not give you explicit test cases (you need to create them for yourself if you want) and you cannot find out your mark before the deadline. We will also not offer detailed help on solving these parts, as they are meant to be more challenging, with part of that challenge being to solve them on your own.
If you have achieved the full first 70%, then you can get extra marks for fully or partially correct implementations of any of the 4 extra functions, independent of each other (e.g., you can complete extra3 and not the other extra functions and still get marks).
The first three require you to compute an expectation of a discrete random variable. If you need a refresher, go back to COMP111 Introduction to Artificial Intelligence. For our application such an expectation is just the average (since the probability distribution is uniform) over all the possible values of the random variable.
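As a reminder of what "just the average" means: if the random variable takes the values x_1, ..., x_n, each with probability 1/n, its expectation is (x_1 + ... + x_n) / n. A minimal sketch, in which the helper name uniform_expectation and the sample mid-prices are made up for illustration:

```python
def uniform_expectation(values):
    """Expectation of a random variable that is uniform over `values`.

    Each outcome has probability 1/n, so the expectation is the plain mean.
    """
    values = list(values)
    if not values:
        raise ValueError("need at least one outcome")
    return sum(values) / len(values)


# Hypothetical mid-prices resulting from three equally likely orders.
outcomes = [100.5, 101.5, 102.5]
expected_mid = uniform_expectation(outcomes)
print(expected_mid)
```

Note that repeated values must stay in the list: if two different orders lead to the same mid-price, that mid-price counts twice in the average.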
For the extra tests, you can assume that the bid book is not empty, so that the starting mid-price is only None if the ask book has no orders in it.
The function should return the expected value of the midprice after execution of a buy limit order with size size and price drawn uniformly at random from the set of prices in ask.
Extra problem 2
Hint: For the first 2 extra problems, if size is equal to M then the correct answer is None.
extra3 only takes book as an argument. The function should return the expected value of the midprice after execution of a buy market order with size s, assuming that s is drawn uniformly at random from the set {1, ..., M-1}, where M is the total volume in the ask book.
Hint: For the first 3 extra problems, one unified approach is to simulate the relevant orders using functions that you implemented for reconstruct.
extra4 has two parameters: book (the order book) and k, a non-negative number that is interpreted as a percentage, e.g., k=0.4 corresponds to 0.4%.
The function should return: 0 if the ask book has no orders in it; otherwise the largest amount of buy volume v such that a buy market order with size v causes the mid-price to increase by no more than k % (so an order with size v+1 would either cause the midprice to increase by more than k % or would leave no asks left in the book which means that the mid-price is None).
Note: the return value should be an integer between 0 and the total ask volume (exclusive) in book.
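The percentage condition can be made concrete with a small check. This is only a sketch of the comparison, not of extra4 itself; the helper name within_k_percent is hypothetical, and it treats a mid-price of None (no asks left) as failing the condition, following the description above.

```python
def within_k_percent(mid_before, mid_after, k):
    """True if mid_after exceeds mid_before by no more than k percent.

    k is a percentage, so k=0.4 allows an increase of 0.4%, not 40%.
    A mid_after of None (empty ask book) never satisfies the condition.
    """
    if mid_after is None:
        return False
    return mid_after <= mid_before * (1 + k / 100)


# With a starting mid-price of 100.0 and k=0.4 the limit is about 100.4.
ok = within_k_percent(100.0, 100.3, 0.4)
too_far = within_k_percent(100.0, 100.5, 0.4)
emptied = within_k_percent(100.0, None, 0.4)
print(ok, too_far, emptied)
```

Because the comparison uses floating-point arithmetic, values that sit exactly on the k% boundary deserve a hand-checked test case.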
Hint: For all 4 extra problems, to increase the chance that your implementations are correct, do one or two examples where you compute the correct expectation by hand once and then check that your code produces the correct answer.
2026-03-20