Hello, dear friend, you can consult us at any time if you have any questions, add WeChat: daixieit

CSE 565M

Acceleration of Algorithms in Reconfigurable Logic

Assignment 3

Due: March 10, 2023

In this third assignment, our interest is in exploring how to accelerate a sorting application. For this assignment, you are allowed to work in groups of 2 or 3 individuals.  As you form groups, please send  me a private Piazza post indicating the members of the group (or just let me know that you are staying  with the same group).

This application will be a pipeline of smaller stages.  Specifically, each stage will be an instance of a merge sort (see Figure 8 of Chamberlain and Ganesan, “Sorting on Architecturally Diverse Computer   Systems,”). These stages will then be pipelined into a collection of merge sorts (see Figure 10).

You can assume that the size of the input is fixed at 1024 integers (which, if only using binary merge sorts, implies 9 pipeline stages).  All of the buffers between stages can be restricted to using on-chip memory (no need to use external memory).  Unlike the parallel prefix, an array may or may not need to be fully partitioned for this to work effectively. Note that a merge groups block only reads from the head of the input FIFO buffer or writes to the tail of the downstream FIFO buffer.

Vitis has already provided the HLS stream library to implement FIFO (https://docs.xilinx.com/r/2021.1-English/ug1399-vitis-hls/HLS-Stream-Library) where a FIFO is also called hls::stream<> object. It introduces how to initialize a FIFO and how to read/write a FIFO. 

To customize the depth and type of the FIFO and execute each submodule concurrently, you may also  want to use the stream pragma (https://docs.xilinx.com/r/2021.1-English/ug1399-vitis-hls/pragma-HLS-stream) along with dataflow pragma (https://docs.xilinx.com/r/2021.1-English/ug1399-vitis-hls/pragma-HLS-dataflow). Here are two examples about how to build a pipelined vector addition kernel and a pipelined N-cascaded pipelined vector kernel, respectively, using FIFO, dataflow, and stream pragmas: (https://github.com/Xilinx/Vitis_Accel_Examples/blob/2021.1/cpp_kernels/dataflow_stream/src/adder. cpp) and

(https://github.com/Xilinx/Vitis_Accel_Examples/tree/2021.1/cpp_kernels/dataflow_stream_array) .    

To implement the pipeline merge sorts, you need to first build merge groups blocks shown in Figure 8 using what you’ve learned from the above materials. Then connect these merge groups blocks using     FIFOs. Note that the number of read accesses and write accesses to the same FIFO mush be equal, otherwise there might be deadlock issues.

What is the predicted execution time of the application when run on an FPGA? Describe this execution time in terms of the initiation interval, I, and latency, L, for the various pipeline stages.

In your report, we are interested in the predicted execution time and resource usage of the application   kernel(s) on the FPGA.