
COMP226 Assignment 1

Distributed code and sample input and output data

As a first step, please download comp226_a1.zip via the assignment page on Canvas. Then unzip comp226_a1.zip, which will yield the following contents in the directory comp226_a1:

comp226_a1
├── common.R
├── input
│   ├── book_1.csv
│   ├── book_2.csv
│   ├── book_3.csv
│   ├── empty.txt
│   ├── message_a.txt
│   ├── message_ar.txt
│   ├── message_arc.txt
│   ├── message_ex_add.txt
│   ├── message_ex_cross.txt
│   ├── message_ex_reduce.txt
│   └── message_ex_same_price.txt
├── main.R
├── output
│   ├── book_1-message_a.out
│   ├── book_1-message_ar.out
│   ├── book_1-message_arc.out
│   ├── book_2-message_a.out
│   ├── book_2-message_ar.out
│   ├── book_2-message_arc.out
│   ├── book_3-message_a.out
│   ├── book_3-message_ar.out
│   └── book_3-message_arc.out
└── template.R

2 directories, 23 files

Brief summary

You are provided with three .R files, two complete, which should not be edited:

• main.R is the file you will run, e.g. with Rscript, by specifying several command line arguments described below (see example above);

• common.R contains complete working functions that are used by main.R in conjunction with the incomplete functions in template.R;

and one incomplete file:

• template.R is the file that you will edit -- the distributed version contains 10 empty functions.

If you run main.R using template.R as it is distributed, it runs without error, but does not produce the desired output because the first 6 functions in template.R are provided empty. To get 70%, you will need to correctly complete these 6 functions; if your answer is only partially correct you will get a mark less than 70%. If you have correctly completed all these 6 functions, you can then -- and only then -- get marks for the 4 "extra" functions, which together account for 30% of the marks.

You should submit a single R file that contains your implementation of some or all of these 10 functions. Your submission will be marked via extensive automated tests of correctness across a wide range of example cases, with some chosen randomly at test time:

• The tests for the first 6 functions, which give up to 70% of the marks will run at the time of submission and are fully visible pre-deadline on CodeGrade (detailed guidance on using CodeGrade to improve your mark can be found at the end of this document);

• The tests for the final 4 functions will only run post-deadline, and only if you got full marks for the first 6 functions.

You can (and if required should) submit multiple times to repeatedly use the CodeGrade pre-deadline tests; for a correct solution CodeGrade will show that you pass all tests and have thus achieved the first 70% of marks. It probably does not make sense for you to work much on the final 4 functions until you have achieved this and submitted completely correct versions of the first 6 functions.

In addition to the visible pre-deadline tests on CodeGrade, for the first 6 functions, correct sample output is provided so that you can check correctness of your implementations "offline" (without submitting to CodeGrade), for example with a tool like diff (https://en.wikipedia.org/wiki/Diff) to compare the output that you produce with the correct output. Offline testing is quick to do once you have set it up, and if you match all the offline examples then chances are that you will also pass the CodeGrade tests (but make sure that you check this to avoid nasty surprises).
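As a minimal sketch of such an offline check done from within R rather than with diff: the helper below, outputs_match, is a hypothetical name and is not part of the distributed code. It only reports whether two text files contain exactly the same lines; it does not show where they differ, which diff would.

```r
# Hypothetical helper (not part of the distributed code): returns TRUE when
# two text files contain exactly the same lines, and FALSE otherwise.
outputs_match <- function(actual_path, expected_path) {
    actual <- readLines(actual_path, warn = FALSE)
    expected <- readLines(expected_path, warn = FALSE)
    identical(actual, expected)
}

# Self-contained demonstration using temporary files.
tmp_a <- tempfile(); tmp_b <- tempfile(); tmp_c <- tempfile()
writeLines(c("Mid-price: 100", "Spread: 10"), tmp_a)
writeLines(c("Mid-price: 100", "Spread: 10"), tmp_b)
writeLines(c("Mid-price: 100", "Spread: 12"), tmp_c)
print(outputs_match(tmp_a, tmp_b))  # files with the same contents
print(outputs_match(tmp_a, tmp_c))  # files that differ in the second line
```

In practice you might redirect the output of a run to a file and compare it with the matching file in the output directory, for example comparing a run on input/book_1.csv and input/message_a.txt against output/book_1-message_a.out. That pairing is an assumption inferred from the file names.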

template.R versus solution.R

Throughout the rest of this handout, we show example output from the incomplete template.R as well as from a full solution file "solution.R" that contains a correct implementation of all the functions. You are not given the file solution.R; you need to create it from template.R.

The first 6 functions to implement

The first 6 functions, which are worth 70% of the marks, are broken down into two groups. The percentages in square brackets show the breakdown of the marks by function.

Order book stats:

1. book.total_volume <- function(book) [5%]

2. book.best_prices <- function(book) [5%]

3. book.midprice <- function(book) [5%]

4. book.spread <- function(book) [5%]

These first 4 functions are intentionally very easy, and are meant to get you used to the format of book. The next 2 functions are more involved and relate to reconstructing the order book from an initial book and a file of messages.

Reconstructing the limit order book:

5. book.reduce <- function(book, message) [15%]

6. book.add <- function(book, message) [35%]

Running main.R with template.R

An example of calling main.R with template.R is as follows.

Rscript main.R template.R input/book_1.csv input/empty.txt

As seen in this example, main.R takes as arguments the paths to three input files (the order of the arguments matters):

1. an R file with the function implementations (template.R in the example)

2. an initial order book (input/book_1.csv in the example)

3. order messages to be processed (input/empty.txt in the example)

Let's see the source code of main.R and the output that it produces.

options(warn=-1)
args <- commandArgs(trailingOnly = TRUE); nargs = length(args)
log <- (nargs == 4) # TRUE if there are exactly 4 arguments
arg_format <- "<--log> <solution_path> <book_path> <messages_path>"
if (nargs < 3 || nargs > 4) # check that there are 3 or 4 arguments
    stop(paste("main.R has 3 required arguments and 1 optional flag:", arg_format))
if (nargs == 4 && args[1] != "--log") # if 4 check that --log is the first
    stop(paste("Bad arguments format, expected:", arg_format))
solution_path <- args[nargs-2]
book_path <- args[nargs-1]
messages_path <- args[nargs]
if (!all(file.exists(c(solution_path, book_path, messages_path))))
    stop("File does not exist at path provided.")
source(solution_path); source("common.R") # source common.R from pwd
book <- book.load(book_path)
book <- book.reconstruct(data.load(messages_path), init=book, log=log)
book.summarise(book)
ret <- book.extra3(book)
cat('ret', ret, '\n')

In short, main.R:

• checks that the command line arguments are ok;

• assigns them to variables (solution_path, book_path, and messages_path respectively);

• sources the file at solution_path and common.R;

• loads the initial book;

• reconstructs the book according to the messages;

• prints out the resulting book;

• prints out the book stats.
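The argument handling in main.R can be tried out without the command line. The sketch below wraps the same checks in a function, parse_main_args, which is a hypothetical name introduced only for illustration; it takes an ordinary character vector in place of commandArgs() and leaves out the file existence check.

```r
# Hypothetical wrapper around the argument checks in main.R, so that they can
# be exercised on a plain character vector. The file.exists() check is omitted.
parse_main_args <- function(args) {
    nargs <- length(args)
    arg_format <- "<--log> <solution_path> <book_path> <messages_path>"
    if (nargs < 3 || nargs > 4)
        stop(paste("main.R has 3 required arguments and 1 optional flag:", arg_format))
    if (nargs == 4 && args[1] != "--log")
        stop(paste("Bad arguments format, expected:", arg_format))
    # The three paths are always the last three arguments, so indexing from
    # the end works both with and without the optional --log flag.
    list(log = (nargs == 4),
         solution_path = args[nargs - 2],
         book_path = args[nargs - 1],
         messages_path = args[nargs])
}

parsed <- parse_main_args(c("--log", "template.R", "input/book_1.csv", "input/empty.txt"))
str(parsed)
```

Indexing from the end of args is what lets the optional --log flag appear first without shifting the positions of the three paths.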

Let's see the output for the example above:

$ Rscript main.R template.R input/book_1.csv input/empty.txt
$ask
  oid price size
1   a   105  100

$bid
  oid price size
1   b    95  100

Total volume:
Best prices:
Mid-price:
Spread:

Now let's see what the output would look like for a correct implementation:

$ Rscript main.R solution.R input/book_1.csv input/empty.txt
$ask
  oid price size
1   a   105  100

$bid
  oid price size
1   b    95  100

Total volume: 100 100
Best prices: 95 105
Mid-price: 100
Spread: 10

You will see that now the order book stats have been included in the output, because the four related functions that are empty in template.R have been implemented in solution.R.
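To see where these numbers come from, here is a minimal sketch of the arithmetic on the example book. It assumes that book is a list holding two data frames, ask and bid, with columns oid, price and size, as the printed book suggests. It also assumes that the bid value is printed before the ask value, which is inferred from "Best prices: 95 105". The variable names are illustrative only, and the exact return values expected from the functions in template.R are not specified here.

```r
# Assumed representation of the example book: a list with data frames `ask`
# and `bid`, each with columns oid, price and size.
book <- list(
    ask = data.frame(oid = "a", price = 105, size = 100, stringsAsFactors = FALSE),
    bid = data.frame(oid = "b", price = 95, size = 100, stringsAsFactors = FALSE)
)

best_bid <- max(book$bid$price)         # highest price among the buy orders
best_ask <- min(book$ask$price)         # lowest price among the sell orders
mid_price <- (best_bid + best_ask) / 2  # average of the two best prices
spread <- best_ask - best_bid           # gap between the two best prices
bid_volume <- sum(book$bid$size)        # total size on the bid side
ask_volume <- sum(book$ask$size)        # total size on the ask side

cat("Total volume:", bid_volume, ask_volume, "\n")
cat("Best prices:", best_bid, best_ask, "\n")
cat("Mid-price:", mid_price, "\n")
cat("Spread:", spread, "\n")
```

This sketch does not handle a book with an empty side, where max and min have no prices to work with.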

The initial order book

Here are the contents of input/book_1.csv, one of 3 provided initial book examples:

oid,side,price,size

a,S,105,100

b,B,