RC1 - Use Spartan-7 by default, documentation updates #1

Merged
merged 2 commits into from
Jul 31, 2019
Binary file added FPGA_Competition_Application_Form.pdf
Binary file not shown.
Binary file not shown.
76 changes: 65 additions & 11 deletions README.md
@@ -1,8 +1,12 @@
# VDF FPGA Competition Baseline Model

This repository contains the modular squaring multiplier baseline design for the VDF low latency multiplier FPGA competition.
This repository contains the modular squaring multiplier baseline design for the VDF (Verifiable Delay Function) low latency multiplier FPGA competition. For more information about the research behind VDFs see <https://vdfresearch.org/>.

The goal of the competition is to create the fastest (lowest latency) 1024 bit modular squaring circuit possible targeting the AWS F1 FPGA platform. Up to $100k in prizes is available across two rounds of the competition. For additional detail see **TODO**.
The goal of the competition is to create the fastest (lowest latency) 1024 bit modular squaring circuit possible targeting the AWS F1 FPGA platform. Up to $100k in prizes is available across two rounds of the competition. For additional detail see [FPGA Contest](https://supranational.atlassian.net/wiki/spaces/VA/pages/36569208/FPGA+Contest) on the [VDF Alliance](https://supranational.atlassian.net/wiki/spaces/VA/overview) page.

Official competition rules can be found in [FPGA_Competition_Official_Rules_and_Disclosures.pdf](FPGA_Competition_Official_Rules_and_Disclosures.pdf).

For a step by step walkthrough of how to get started, with screenshots, see [Getting Started](https://supranational.atlassian.net/wiki/spaces/VA/pages/37847091/Getting+Started).

## Function

@@ -11,13 +15,43 @@ The function to optimize is repeated modular squaring over integers. A random in
```
h = x^(2^t) mod N

y, N are 1024 bits
x, N are 1024 bits

t = 30
t = 2^30

x = random
```

Here is a sample implementation in Python:
```
#!/usr/bin/python3

from Crypto.PublicKey import RSA
from random import getrandbits

# Competition is for 1024 bits
NUM_BITS = 1024

NUM_ITERATIONS = 1000

# Rather than being random each time, we will provide randomly generated values
x = getrandbits(NUM_BITS)
N = RSA.generate(NUM_BITS).n

# t should be small for testing purposes.
# For the final FPGA runs, t will be around 1 billion
t = NUM_ITERATIONS

# Iterative modular squaring t times
# This is the function that needs to be optimized on FPGA
for _ in range(t):
    x = (x * x) % N

# Final result is a 1024b value
h = x
print(h)
```
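
As a quick sanity check on the formula above, the loop of `t` squarings can be compared against Python's built-in three-argument `pow`. This is a minimal sketch and not part of the repository: the helper name `repeated_square`, the small modulus, and the iteration count are illustrative assumptions, chosen so that it runs instantly without the `Crypto` dependency.

```python
def repeated_square(x, t, N):
    """Square x modulo N, t times in a row (the operation to accelerate)."""
    for _ in range(t):
        x = (x * x) % N
    return x


# Illustrative values only; the competition uses 1024-bit x and N.
N = 1000003 * 998244353  # small stand-in for an RSA modulus
x = 123456789
t = 20

h = repeated_square(x, t, N)

# t sequential squarings compute x^(2^t) mod N
assert h == pow(x, 2 ** t, N)
print(h)
```

Here `pow` serves only as an independent check. It still performs on the order of `t` squarings internally, so it does not shorten the sequential work.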

## Interface

The competition uses the AWS F1/Xilinx SDAccel build infrastructure described in [aws_f1](docs/aws_f1.md) to measure performance and functional correctness. If you conform to the following interface your design should function correctly in F1 in the provided software/control infrastructure.
@@ -48,11 +82,11 @@ module modular_square_simple
- **sq_out** - The result of the squaring operation. This should be fed back internally to sq_in for repeated squaring. It will be consumed externally at the clock edge trailing the valid signal pulse.
- **valid** - A one cycle pulse indicating that sq_out is valid.
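
The feedback and `valid` behavior described above can be sketched as a small cycle-level model in Python. This is a hedged illustration rather than the repository's RTL or testbench: the class name `ModularSquareModel`, the `start` method, and the fixed `cycles_per_square` latency are all assumptions made for the example.

```python
class ModularSquareModel:
    """Cycle-level sketch of a squaring unit with a fixed latency (hypothetical)."""

    def __init__(self, modulus, cycles_per_square):
        self.modulus = modulus
        self.cycles_per_square = cycles_per_square
        self.sq_out = 0
        self.valid = False
        self._countdown = 0

    def start(self, sq_in):
        """Load the initial value (assumed to act like a start pulse)."""
        self.sq_out = sq_in
        self.valid = False
        self._countdown = self.cycles_per_square

    def clock(self):
        """Advance one clock cycle and return (valid, sq_out)."""
        self._countdown -= 1
        if self._countdown == 0:
            # Result ready: sq_out is fed back as the next squaring input.
            self.sq_out = (self.sq_out * self.sq_out) % self.modulus
            self.valid = True
            self._countdown = self.cycles_per_square
        else:
            self.valid = False
        return self.valid, self.sq_out


# Placeholder modulus and latency, chosen only for illustration.
model = ModularSquareModel(modulus=1000003, cycles_per_square=8)
model.start(12345)

results = []
for _ in range(24):
    valid, sq_out = model.clock()
    if valid:
        # Consume sq_out only when valid pulses.
        results.append(sq_out)

print(results)
```

With a latency above one cycle, `valid` is high for a single cycle per result, and each result is the square of the previous one, mirroring the internal feedback of `sq_out` to `sq_in`.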

If you have requirements that go beyond this interface, such as loading precomputed values, contact us by email (hello@supranational.net) and we will work with you to determine the best path forward. We are very interested in seeing alternative approaches and algorithms.
If you have requirements that go beyond this interface, such as loading precomputed values, contact us by email (hello@vdfalliance.org) and we will work with you to determine the best path forward. We are very interested in seeing alternative approaches and algorithms.

## Baseline models

Two baseline models are provided. You can start from either design.
Two baseline models are provided for guidance. You can start from either design or create your own that matches the expected interface.

**Simple**

@@ -66,7 +100,14 @@ There are several potential paths for alternative designs and optimizations note

## Step 1 - Develop your multiplier

1. Install [Vivado 2018.3](https://www.xilinx.com/support/download/index.html/content/xilinx/en/downloadNav/vivado-design-tools/2018-3.html). To get started you can use a Xilinx WebPack or 30-day trial license. Extended trial licenses will be made available to registered competitors through Supranational in partnership with Xilinx early in the competition.
1. Install [Vivado 2018.3](https://www.xilinx.com/support/download/index.html/content/xilinx/en/downloadNav/vivado-design-tools/2018-3.html). To get started you can use a Xilinx WebPack or 30-day trial license. **Extended trial licenses will be made available to registered competitors through Supranational in partnership with Xilinx early in the competition.**
1. Install dependencies:
```
source msu/scripts/simulation_setup.sh
```
1. Be sure support for Spartan-7 is included in Vivado. This part should be available with either WebPack or the 30-day trial license. To verify:
    * In Vivado, select Help -> Add Design Tools or Devices, then sign in
    * Expand "7 Series", ensure "Spartan-7" is enabled
1. Depending on your approach choose one of the baseline models to start from. Starting Vivado using the `run_vivado.sh` will automatically generate testbench inputs.

**Simple**
@@ -91,7 +132,12 @@ There are several potential paths for alternative designs and optimizations note
* The test is self checking and should print "SUCCESS".
* The simulation prints cycles per squaring statistics. This, along with synthesis timing results, provides an estimate of latency per squaring.
* You can also use [verilator](docs/verilator.md) if you prefer by running 'cd msu/rtl; make'. No license required.
1. Run out-of-context synthesis + place and route to understand and tune performance. A pblock is set up to mimic the AWS F1 Shell exclusion zone. In our experience these results are pretty close to what you will get on F1 and provide an easier/faster/more intuitive interface for improving the design.
1. Run out-of-context synthesis + place and route to understand and tune performance. A pblock is set up to mimic the AWS F1 Shell exclusion zone. In our experience these results are pretty close to what you will get on F1 and provide an easier/faster/more intuitive interface for improving the design.
1. Once you have the 30-day trial license you can enable the vu9p part, which is the target of the contest.
    * Help -> Add Design Tools or Devices, sign in
    * Enable "Ultrascale+"
    * In your project, select Settings in the upper left, then "Project device"
    * Select Boards, then select VCU118
1. When you are happy with your design move on to Step 2!
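
The latency estimate mentioned in the simulation step (cycles per squaring combined with the achieved clock rate) comes down to simple arithmetic. The sketch below is only illustrative: the function names and the cycle count and clock frequency are placeholder assumptions, not measured results from either baseline.

```python
def squaring_latency_ns(cycles_per_square, clock_mhz):
    """Latency of one modular squaring in nanoseconds."""
    return cycles_per_square * 1000.0 / clock_mhz


def total_runtime_s(cycles_per_square, clock_mhz, t):
    """Wall-clock time for t sequential squarings, in seconds."""
    return squaring_latency_ns(cycles_per_square, clock_mhz) * t * 1e-9


# Placeholder numbers, not measured results.
cycles = 8          # from the simulation's cycles-per-squaring statistic
clock_mhz = 125.0   # from the synthesis / place-and-route timing report
t = 2 ** 30         # number of squarings in the final runs

print(squaring_latency_ns(cycles, clock_mhz))  # ns per squaring
print(total_runtime_s(cycles, clock_mhz, t))   # seconds for the full run
```

Because the squarings are strictly sequential, shaving a cycle off the pipeline or raising the clock changes the total runtime proportionally.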

## Step 2 - SDAccel integration
@@ -123,6 +169,12 @@ There are three ways to test your design in SDAccel:
1. **AWS F1** - Instantiate an AWS EC2 F1 development instance and run the flows yourself. See [aws_f1](docs/aws_f1.md).
1. **On-premise** - You can install SDAccel on-premise and run the same flows locally. See [on-premise](docs/onprem.md).

## Step 3 - Submit

1. Fill in the [FPGA_Competition_Application_Form.pdf](FPGA_Competition_Application_Form.pdf) and email to [email protected], if you haven't already.
1. Fill in the [Submission_Form.txt](Submission_Form.txt). This stays in the repository and helps convey design expectations.
1. Invite 'simonatsn' from Supranational to collaborate on your design repository on GitHub.

## Optimization Ideas

The following are some potential optimization paths.
@@ -131,7 +183,7 @@ The following are some potential optimization paths.
* Shorten the pipeline - we believe a 4-5 cycle pipeline is possible with this design
* Lengthen the pipeline - insert more pipe stages, run with a faster clock
* Change the partial product multiplier size. The DSPs are 26x17 bit multipliers and the modular squaring circuit supports using either by changing a define at the top.
* This design uses lookup tables stored in BlockRAM for the reduction step. These are easy to change to distributed memory and there is support in the model to use UltraRAM. **TODO - point to a branch with this code**
* This design uses lookup tables stored in BlockRAM for the reduction step. These are easy to change to distributed memory and there is support in the model to use UltraRAM. For an example using UltraRAM see https://github.com/supranational/vdf-fpga/tree/f72eb8c06eec94a09142f675cde8d1514fb72e60
* Optimize the compression trees and accumulators to make the best use of FPGA LUTs and CARRY8 primitives.
* Floorplan the design.
* Use High Level Synthesis (HLS) or other techniques.
@@ -154,5 +206,7 @@ AWS online documentation:

## Questions?

Please reach out with any questions, comments, or feedback through **TODO - channels**

Please reach out with any questions, comments, or feedback through any of the following channels:
- Message Board: https://vdfalliance.discourse.group/
- Telegram: https://t.me/joinchat/FoVncxdnEPRGRvkt1OuQmQ
- E-mail: [email protected]
26 changes: 26 additions & 0 deletions modular_square/model/vdf_basic.py
@@ -0,0 +1,26 @@
#!/usr/bin/python3

from Crypto.PublicKey import RSA
from random import getrandbits

# Competition is for 1024 bits
NUM_BITS = 1024

NUM_ITERATIONS = 1000

# Rather than being random each time, we will provide randomly generated values
x = getrandbits(NUM_BITS)
N = RSA.generate(NUM_BITS).n

# t should be small for testing purposes.
# For the final FPGA runs, t will be around 1 billion
t = NUM_ITERATIONS

# Iterative modular squaring t times
# This is the function that needs to be optimized on FPGA
for _ in range(t):
    x = (x * x) % N

# Final result is a 1024b value
h = x
print(h)
11 changes: 7 additions & 4 deletions msu/rtl/vivado_ozturk/msu.tcl
@@ -154,14 +154,13 @@ if { $::argc > 0 } {
set orig_proj_dir "[file normalize "$origin_dir/"]"

# Create project
create_project ${_xil_proj_name_} ./${_xil_proj_name_} -part xcvu9p-flga2104-2L-e
create_project ${_xil_proj_name_} ./${_xil_proj_name_} -part xc7s100fgga676-2

# Set the directory path for the new project
set proj_dir [get_property directory [current_project]]

# Set project properties
set obj [current_project]
set_property -name "board_part" -value "xilinx.com:vcu118:part0:2.0" -objects $obj
set_property -name "board_part_repo_paths" -value "/home/snpeffer/src/vdf/artya7/vivado-boards-master/new/board_files" -objects $obj
set_property -name "default_lib" -value "xil_defaultlib" -objects $obj
set_property -name "dsa.accelerator_binary_content" -value "bitstream" -objects $obj
@@ -183,6 +182,7 @@ set_property -name "enable_vhdl_2008" -value "1" -objects $obj
set_property -name "ip_cache_permissions" -value "read write" -objects $obj
set_property -name "ip_output_repo" -value "$proj_dir/${_xil_proj_name_}.cache/ip" -objects $obj
set_property -name "mem.enable_memory_map_generation" -value "1" -objects $obj
set_property -name "part" -value "xc7s100fgga676-2" -objects $obj
set_property -name "sim.central_dir" -value "$proj_dir/${_xil_proj_name_}.ip_user_files" -objects $obj
set_property -name "sim.ip.auto_export_scripts" -value "1" -objects $obj
set_property -name "simulator_language" -value "Mixed" -objects $obj
@@ -507,6 +507,7 @@ set_property -name "file_type" -value "XDC" -objects $file_obj
# Set 'constrs_1' fileset properties
set obj [get_filesets constrs_1]
set_property -name "target_constrs_file" -value "[get_files *new/user.xdc]" -objects $obj
set_property -name "target_part" -value "xc7s100fgga676-2" -objects $obj
set_property -name "target_ucf" -value "[get_files *new/user.xdc]" -objects $obj

# Create 'sim_1' fileset (if not found)
@@ -549,7 +550,7 @@ set obj [get_filesets utils_1]

# Create 'synth_1' run (if not found)
if {[string equal [get_runs -quiet synth_1] ""]} {
create_run -name synth_1 -part xcvu9p-flga2104-2L-e -flow {Vivado Synthesis 2018} -strategy "Vivado Synthesis Defaults" -report_strategy {No Reports} -constrset constrs_1
create_run -name synth_1 -part xc7s100fgga676-2 -flow {Vivado Synthesis 2018} -strategy "Vivado Synthesis Defaults" -report_strategy {No Reports} -constrset constrs_1
} else {
set_property strategy "Vivado Synthesis Defaults" [get_runs synth_1]
set_property flow "Vivado Synthesis 2018" [get_runs synth_1]
@@ -568,6 +569,7 @@ set_property -name "display_name" -value "synth_1_synth_report_utilization_0" -o

}
set obj [get_runs synth_1]
set_property -name "part" -value "xc7s100fgga676-2" -objects $obj
set_property -name "strategy" -value "Vivado Synthesis Defaults" -objects $obj
set_property -name "steps.synth_design.args.more options" -value "-mode out_of_context" -objects $obj

@@ -576,7 +578,7 @@ current_run -synthesis [get_runs synth_1]

# Create 'impl_1' run (if not found)
if {[string equal [get_runs -quiet impl_1] ""]} {
create_run -name impl_1 -part xcvu9p-flga2104-2L-e -flow {Vivado Implementation 2018} -strategy "Vivado Implementation Defaults" -report_strategy {No Reports} -constrset constrs_1 -parent_run synth_1
create_run -name impl_1 -part xc7s100fgga676-2 -flow {Vivado Implementation 2018} -strategy "Vivado Implementation Defaults" -report_strategy {No Reports} -constrset constrs_1 -parent_run synth_1
} else {
set_property strategy "Vivado Implementation Defaults" [get_runs impl_1]
set_property flow "Vivado Implementation 2018" [get_runs impl_1]
@@ -792,6 +794,7 @@ set_property -name "display_name" -value "impl_1_post_route_phys_opt_report_bus_

}
set obj [get_runs impl_1]
set_property -name "part" -value "xc7s100fgga676-2" -objects $obj
set_property -name "strategy" -value "Vivado Implementation Defaults" -objects $obj
set_property -name "steps.write_bitstream.args.readback_file" -value "0" -objects $obj
set_property -name "steps.write_bitstream.args.verbose" -value "0" -objects $obj
2 changes: 2 additions & 0 deletions msu/rtl/vivado_ozturk/run_vivado.sh
@@ -1,5 +1,7 @@
#!/bin/bash

set -e

# Configuration
# If using 128 bits be sure to change tb.sv as well.
export MOD_LEN=1024
11 changes: 7 additions & 4 deletions msu/rtl/vivado_simple/msu.tcl
@@ -109,14 +109,13 @@ if { $::argc > 0 } {
set orig_proj_dir "[file normalize "$origin_dir/../msu"]"

# Create project
create_project ${_xil_proj_name_} ./${_xil_proj_name_} -part xcvu9p-flga2104-2L-e
create_project ${_xil_proj_name_} ./${_xil_proj_name_} -part xc7s100fgga676-2

# Set the directory path for the new project
set proj_dir [get_property directory [current_project]]

# Set project properties
set obj [current_project]
set_property -name "board_part" -value "xilinx.com:vcu118:part0:2.0" -objects $obj
set_property -name "board_part_repo_paths" -value "/home/snpeffer/src/vdf/artya7/vivado-boards-master/new/board_files" -objects $obj
set_property -name "default_lib" -value "xil_defaultlib" -objects $obj
set_property -name "dsa.accelerator_binary_content" -value "bitstream" -objects $obj
@@ -138,6 +137,7 @@ set_property -name "enable_vhdl_2008" -value "1" -objects $obj
set_property -name "ip_cache_permissions" -value "read write" -objects $obj
set_property -name "ip_output_repo" -value "$proj_dir/${_xil_proj_name_}.cache/ip" -objects $obj
set_property -name "mem.enable_memory_map_generation" -value "1" -objects $obj
set_property -name "part" -value "xc7s100fgga676-2" -objects $obj
set_property -name "sim.central_dir" -value "$proj_dir/${_xil_proj_name_}.ip_user_files" -objects $obj
set_property -name "sim.ip.auto_export_scripts" -value "1" -objects $obj
set_property -name "simulator_language" -value "Mixed" -objects $obj
@@ -225,6 +225,7 @@ set obj [get_filesets constrs_1]

# Set 'constrs_1' fileset properties
set obj [get_filesets constrs_1]
set_property -name "target_part" -value "xc7s100fgga676-2" -objects $obj

# Create 'sim_1' fileset (if not found)
if {[string equal [get_filesets -quiet sim_1] ""]} {
@@ -264,7 +265,7 @@ set obj [get_filesets utils_1]

# Create 'synth_1' run (if not found)
if {[string equal [get_runs -quiet synth_1] ""]} {
create_run -name synth_1 -part xcvu9p-flga2104-2L-e -flow {Vivado Synthesis 2018} -strategy "Vivado Synthesis Defaults" -report_strategy {No Reports} -constrset constrs_1
create_run -name synth_1 -part xc7s100fgga676-2 -flow {Vivado Synthesis 2018} -strategy "Vivado Synthesis Defaults" -report_strategy {No Reports} -constrset constrs_1
} else {
set_property strategy "Vivado Synthesis Defaults" [get_runs synth_1]
set_property flow "Vivado Synthesis 2018" [get_runs synth_1]
@@ -284,13 +285,14 @@ set_property -name "display_name" -value "synth_1_synth_report_utilization_0" -o
}
set obj [get_runs synth_1]
set_property -name "strategy" -value "Vivado Synthesis Defaults" -objects $obj
set_property -name "part" -value "xc7s100fgga676-2" -objects $obj

# set the current synth run
current_run -synthesis [get_runs synth_1]

# Create 'impl_1' run (if not found)
if {[string equal [get_runs -quiet impl_1] ""]} {
create_run -name impl_1 -part xcvu9p-flga2104-2L-e -flow {Vivado Implementation 2018} -strategy "Vivado Implementation Defaults" -report_strategy {No Reports} -constrset constrs_1 -parent_run synth_1
create_run -name impl_1 -part xc7s100fgga676-2 -flow {Vivado Implementation 2018} -strategy "Vivado Implementation Defaults" -report_strategy {No Reports} -constrset constrs_1 -parent_run synth_1
} else {
set_property strategy "Vivado Implementation Defaults" [get_runs impl_1]
set_property flow "Vivado Implementation 2018" [get_runs impl_1]
@@ -506,6 +508,7 @@ set_property -name "display_name" -value "impl_1_post_route_phys_opt_report_bus_

}
set obj [get_runs impl_1]
set_property -name "part" -value "xc7s100fgga676-2" -objects $obj
set_property -name "strategy" -value "Vivado Implementation Defaults" -objects $obj
set_property -name "steps.write_bitstream.args.readback_file" -value "0" -objects $obj
set_property -name "steps.write_bitstream.args.verbose" -value "0" -objects $obj
4 changes: 3 additions & 1 deletion msu/rtl/vivado_simple/run_vivado.sh
@@ -1,5 +1,7 @@
#!/bin/bash

set -e

# Configuration
export MOD_LEN=1024
export SIMPLE_SQ=1
@@ -15,7 +17,7 @@ cd $SCRIPTPATH
rm -f ../msuconfig.vh

# Generate a test
../gen_test.py -c
../gen_test.py -c -s $MOD_LEN

# Generate the Vivado project
if [ ! -d msu ]; then
2 changes: 2 additions & 0 deletions msu/scripts/simulation_setup.sh
@@ -23,3 +25,5 @@ else
sudo yum update -y
sudo yum install -y gmp-devel verilator python36 gtkwave
fi

export PATH=/tools/Xilinx/Vivado/2018.3/bin:$PATH