Labs 7 and 8 will provide some background for this assignment.

The question is presented from a Linux point of view using the computer science server mimi.cs.mcgill.ca which you can reach remotely using ssh or putty from your laptop (see lab 1). If you do this assignment from an MS Windows machine, then make sure to provide the DLL libraries your program uses (if any) so that the TA can run it from their MS Windows machine. It is not the TA’s responsibility to make your program run. The TAs will not debug your program.

You must write this assignment in the C Programming language. Use modular programming techniques.

Build your assignment #4 using your solution from Assignment #3 or the official solution posted on myCourses. 

Building a File System 

A file system is composed of three primary things: the data structures that are created on the disk to represent abstract concepts like directories and files. The driver(s) that instruct the machine to read and write data from/to the disk. The higher-level OS software that schedules disk requests from processes. In the OS, many processes can run at the same time requesting access to the disk. The I/O Scheduler controls these requests by ordering them in a queue using a semaphore. The I/O Scheduler also manages the slowness of the device with buffers. A process does not have direct access to the data on disk. Instead the OS uses indirect pointers to private OS data structures to apply rules. For example, “many programs can access the same file at the same time for reading, but only one program can access a file for writing or appending”. If a program had direct access to a file, then the programmer would have to implement the code to adhere to this rule. 

The following steps will add to your OS some of the capabilities outlined above:
DISK_driver.c
The following bullets outline the contents of the DISK_driver.c and DISK_driver.h files. The “Functions” section lists all the function. The “Data structures” section describe all the private data structures. The “Notes” section describes how the functions work. Make sure to read the “Notes” section to understand what is going on. 

Functions:

o // initialize all global data structure and variables

void initIO();

o // create & format partition, return success (1) or failure (0)

int partition(char *name, int blocksize, int totalblocks);

o // load FAT & create buffer_block, return success / failure

int mount(char *name);

o // find filename or creates file if it does not exist, returns file’s FAT index

int openfile(char *name);

o // using the file FAT index number, load buffer with data from blockID

int readBlock(int file);

o // return block data as string from buffer_block

char *returnBlock()

o // sensitive to block size, write data to disk at blockID

int writeBlock(int file, char *data);

Data structures:
o The Partition Table structure
struct PARTITION {
int total_blocks;
int block_size;
} partition;

The PARTITION structure records information about the format of the partition. A disk partition is composed of two areas: the partition header and the partition data area. The partition header contains the PARTITION structure and the FAT structure (described next). When a partition is created information about the format of the partition is recorded at the beginning of the partition, followed by the directory tree (FAT), and finally the largest section is the partition data area where all the files are stored. In our example, total_blocks records the number of blocks available in the partition data area. The integer block_size records the byte size of a block. For example, if total_blocks is 10 and block_size is 5, then the total number of bytes in the partition data area is 50 bytes (divided into 10 blocks). A byte will be represented as a character in our simulation. The structures PARTITION and FAT are fixed size. On disk, the format looks as follows, PARTITION : FAT : BLOCKS, where the colon means concatenation, and the world BLOCKS refers to “partition data area”.

This is a private structure but global within DISK_driver.c.

o The File Allocation Table structure

struct FAT {

char *filename;

int file_length;

int blockPtrs[10];

int current_location;

} fat[20];

The file allocation table (FAT) can store a maximum of 20 files. It is a private structure but global within DISK_driver.c. The field filename is the name of a file in the partition. The field file_length is the length of the file in number of blocks. Each block is equal to block_size bytes (char). The array blockPtrs[] are the pointers to the blocks in the partition populated with data from this file. These pointers are block numbers. The field current_location is a pointer to the current read/write location of the file in block number. It is initialized to -1 when not used. 

o The block buffer

char * block_buffer;
FILE *fp[5];

The block_buffer is malloced to the correct size when the partition() function is called. The block_size parameter is used in the malloc operation. The array fp[] contains all the system wide open file pointers. These are private structures. They are global data structures within DISK_Driver.c. 

Notes:

o The initIO() function initializes all the global data structures and variables to zero or null.

o The partition() function must be called first before any of the other functions in driver. All the other functions depend on the existence of a partition. To simulate a partition, we will use a subdirectory. The partition() function creates (mkdir) a directory called PARTITION (in caps) (mkdir PARTITION) if one does not exist in the current directory. Within PARTITION create a file having the char *name argument as its filename. Format the file as follows: the beginning of the file contains the information from struct partition, and then the information from fat[20] (initially empty information). It then appends the partition data area by writing total_blocks * block_size number of ‘0’ (character zero) to the end of the file. If it is successful, then it returns 1 otherwise 0.

o The mount() function uses the char *name argument as the partition filename. It opens the partition from directory PARTITION and loads the information from the partition into the global structures partition and fat[]. This function will also malloc block_size bytes and assign that to block_buffer. It returns 1 for success or 0 for fail.

o The openfile() function assumes fat[20] contains data. It searches for the string name argument in the filename fields of fat[20]. If it finds the file, then FILE *fp[] is made to point to the first block of the file and the index of the FAT cell is returned, otherwise it creates a new entry in the table (leaving FILE *fp[] = NULL) and returns the FAT index to that new entry. If there are no available cells in fp[], or there is no more space in the partition data area, or fat[20] is full, or some other issue happened, it returns the value -1 to indicate an error occurred. To set FILE *fp[] you will need to fopen() the partition and then fseek() to the block. You will also need another data structure to remember which fp[] belongs to which fat[].

o The functions readBlock() and writeBlock() need the index number returned from openfile() as the input parameter. These functions fail when an invalid index number is given. The index number behaves like a pointer to the file. It tells the function where in the file we are currently at, through fat[], and which file pointer to use, through fp[]. These functions are agnostic to the runtime situation. They perform no system wide checking. They handle only immediate read and write operations.

It is important to note that blockPtrs[] points to the actual blocks in the partition file where the file data is stored. It is also important to note that current_location is used as an index for blockPtrs[].

In the case of readBlock() the current_location is used to load the block_buffer with the data from the block, if not at end of file. It then increments the current_location integer and returns success. If at end of file, then block_buffer does not change, and the function returns failure.

In the case of writeBlock() the current_location is used to write the data argument into the partition. This is a destructive block write, not a destructive file write. This means when a file has many blocks and we write to a particular block, the data in that block is overwritten but the rest of the file is not affected by that write. The fat[] is updated correctly, and current_location is incremented to the next cell.

This is a contiguous file system. (For bonus points make this not contiguous – you have everything for that). 

You know when a block is not free because it is no longer initialized to ‘0’. Since we do not have a delete command, this ‘0’ property always holds true. 

The data stored in your partition is persistent. It is there every time you rerun your assignment.

IOSCHEDULER.C
Write a public function called char *IOscheduler(char *data, PCB *ptr, int cmd). You do not need to implement a semaphore since our simulator does not run programs in parallel. You will need to implement a queue as a circular array to handle up to 10 requests from exec programs. The queue must be able to store the data, ptr, and cmd arguments. The argument cmd = 0 for a read and 1 for a write. The argument data contains the information to write. The function IOscheduler() returns the data read. The argument ptr tracks who this is for. You can write private helper functions. 

THE KERNEL
As already eluded to from IOSCHEDULER.C, this assignment connects with the exec command you wrote in assignment #3. You will create three addition commands for your scripting language. We will not implement a full suite of file command, so our three commands will be limited in functionality. 

Mount partitionName number_of_blocks block_size The command mount uses the partition() and mount() functions. If a partition already exists with the name provided then the existing partition is mounted, otherwise the provided arguments are  used to create and mount the new partition. All the arguments are required.

Write filename [a bunch of words] The command write checks to see if filename is open. If it is already open then [bunch of words] is written to the file (as defined in the driver), otherwise it opens the file and then write to it (as defined in the driver). The [bunch of words] is a series of words separated by spaces with the open square bracket preceding the words and the close square bracket terminating the output.

Read filename variable  The command read checks to see if filename is open. If it is already open then it reads from the file (as defined in the driver), otherwise it opens the file and then reads from it (as defined in the driver). The string that is returned is assigned to the variable. The variable uses the set command.

PROGRAM TERMINATION

Program termination is like in assignment 3. Remember the partitions are persistent. When your program terminates the TA will be able to see those files.

Testing your kernel:

The TAs will use and modify the provided text file to test your kernel. This text file will contain the same tests from assignment 3. You can also use this file to test your kernel or you can test it the old fashion way by typing input from the keyboard. To use the provided text file, you will need to run your program from the OS command line as follows:

$ ./mykernel < testfile.txt 

Each line from testfile.txt will be used as input for each prompt your program displays to the user. Instead of
typing the input from the keyboard the program will take the next line from the file testfile.txt as the keyboard
input.
When testfile.txt is exhausted of input the shell command line prompt is displayed to the user (unless testfile.txt had a quit command, in that case mykernel terminates). Make sure your program works in the above way.

WHAT TO HAND IN

Your assignment has a due date plus four late days. If you choose to submit your assignment during the late days, then your grade will NOT be reduced by -5% per day. Submit your assignment to the assignment #4 submission box in myCourses. You need to submit the following:

Make sure to ZIP your assignment submission into a single file called ass4.zip
A README.TXT file stating what OS you used: mimi.cs.mcgill.ca or MS Windows and any other special instructions you think the TA needs to know to run your program.
Your version of TESTFILE.TXT. This will be you telling the TA that you know for sure that your program can at least do the following. The TA will run your program with this file and they will also run it with their own version of the file.
Submit all the .c file described in the assignment (you may want to create .h files, if so, please hand those in as well)
Submit the executable (compiled on the appropriate machine).
If you used MS Windows and you used a DLL then upload those as well.
You must submit your own work. You can speak to each other for help but copied code will be handled as to McGill regulations.

HOW IT WILL BE GRADED

Your assignment is graded out of 26 points and it will follow this rubric:
The student is responsible to provide a working solution for every requirement.
The TA grades each requirement proportionally. This means, if the requirement is only 40% correct (for instance), then the student receives only 40% of the points assigned to that requirement.
Your program must run to be graded. If it does not run, then the student receives zero for the entire assignment. If your program only runs partially or sometimes, you should still hand it in. You will receive partial points.
The TA looks at your source code only if the program runs (correctly or not). The TA looks at your code to verify that you (A) implemented the requirement as requested, and (B) to check if the submission was copied.
Mark breakdown:
o 1 point - Source file names. The TA must verify that the student created the specified source files as the only source files in the application. In addition, the source files must be populated with at least what was specified in the assignment description for each file. The student is permitted to supply additional helper functions as they see fit, if it does not hinder the assignment requirements.
o 1 point - Modular programming. If the student wrote their application using modular programming techniques as described in lab 2 (or seen from another course) then they receive these points.
o 1 point - The student’s compiled program must be named mykernel2.
o 1 point – A fully working assignment 4.
o 10 points – DISK_driver.c – 1 point for each function and data structure/global variable
o 4 point – Ioscheduler.c
o 2 points – The mount command
o 2 point – The read command
o 2 points – The write command
o 2 points - Program can be tested with the TESTFILE.TXT file
o BONUS POINTS
4 points for non-contiguous file system (add that to your readme file so TA notices)
You have two weeks for this assignment. Please use that time to your advantage.