-
Notifications
You must be signed in to change notification settings - Fork 76
feat: add SystemVerilog backend generator #1054
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
UDB currently allows for 2 schemas for instructions with the on-going sub-type development. Some instructions now use a 'format' field with 'opcodes' instead of the traditional 'encoding' field with 'match' and 'variables'. This caused generators to fail when processing these instructions as they couldn't extract the bit pattern matching information. Changes: - Added build_match_from_format() function to convert format.opcodes to match strings compatible with existing generator logic - Enhanced encoding detection in load_instruction() to handle both old schema (encoding.match) and new schema (format.opcodes) - Maintains full backward compatibility with existing instructions - No functional changes to generated output format The fix ensures generators can process the complete UDB instruction set regardless of which schema format individual instructions use. Signed-off-by: Afonso Oliveira <[email protected]>
Implements a new backend generator for SystemVerilog output, matching the exact format used by riscv-opcodes/inst.sverilog. This provides direct compatibility with hardware designs using the riscv-opcodes SystemVerilog package format. Features: - Generates SystemVerilog package with instruction and CSR definitions - Outputs 32-bit instruction patterns with proper bit encoding - Handles compressed (16-bit) instructions correctly - Supports all standard RISC-V extensions - Integrated with the ./do build system as gen:sverilog task The generator produces output identical to riscv-opcodes format: - Instructions as 'localparam [31:0] NAME = 32'bpattern' - CSRs as 'localparam logic [11:0] CSR_NAME = 12'haddr' - Proper alignment and formatting for readability Tested against riscv-opcodes/inst.sverilog to ensure format compatibility. Signed-off-by: Afonso Oliveira <[email protected]>
CC @jordancarlin If you can test this or spot any errors/improvements, it would be awesome. If you are already working on something similar I can drop this, I just wanted to give a little help and improve UDB's outputs |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #1054 +/- ##
=======================================
Coverage 46.05% 46.05%
=======================================
Files 11 11
Lines 4942 4942
Branches 1345 1345
=======================================
Hits 2276 2276
Misses 2666 2666
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Thanks @AFOliveira. I started on it a while ago, but haven't touched it since I ran into #893, so let's go with this version since it is more up to date. This is looking great so far! I have a few comments with differences from the riscv-opcodes version and then some additional comments for things we should improve compared to the riscv-opcodes version since we're reimplementing it anyway:
I'll leave some implementation specific comments inline. |
if name.startswith("c."): | ||
name = "C_" + name[2:] | ||
# Replace dots with underscores and convert to uppercase | ||
return name.replace(".", "_").upper() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like there is no reason to special-case compressed instructions here? The c.
--> C_
conversion should be handled by the standard name.replace(".", "_").upper()
.
def match_to_sverilog_bits(match_str, is_compressed=False): | ||
"""Convert a match string to SystemVerilog bit pattern.""" | ||
if not match_str: | ||
return "32'b" + "?" * 32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this throw an error/warning instead of silently generating a match string of all wildcards that will match any instruction? That would be very problematic in use because it might shadow an actual match later.
elif len(match_str) < 32: | ||
# For other cases, pad on the right | ||
match_str = match_str + "-" * (32 - len(match_str)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What cases do you envision this matching? If none, we're probably better off throwing an error here.
return "32'b" + "?" * 32 | ||
|
||
# For compressed instructions (16-bit), we need to handle them differently | ||
# The riscv-opcodes format puts the 16-bit pattern in the lower 16 bits |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably better for comments to stand alone instead of justifying choices based on riscv-opcodes.
for bit in match_str: | ||
if bit == "0": | ||
result.append("0") | ||
elif bit == "1": | ||
result.append("1") | ||
else: # '-' or any other character | ||
result.append("?") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could this whole block be simplified to just match_str.replace("-", "?")
?
else: | ||
# If no match field, use all wildcards | ||
match = "-" * 32 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, please don't generate all wildcards if we don't find a match. A warning/error seems like a better choice here.
match = "-" * 32 | ||
|
||
# Check if this is a compressed instruction | ||
is_compressed = name.startswith("c.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the is_compressed
logic necessary since you also check if the instruction encoding is 16 bits in the match_to_sverilog_bits
function? Seems like it should be fine to just use that to check instead of having both the string-based and encoding-based approaches.
) | ||
parser.add_argument( | ||
"--output", | ||
default="inst.sverilog", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about something like this instead?
default="inst.sverilog", | |
default="riscv_decode_package.svh", |
parser.add_argument( | ||
"--extensions", | ||
default="A,D,F,I,M,Q,Zba,Zbb,Zbs,S,System,V,Zicsr,Smpmp,Sm,H,U,Zicntr,Zihpm,Smhpm", | ||
help="Comma-separated list of enabled extensions. Default includes standard extensions.", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not default to a subset of (seemingly arbitrary) extensions. Not sure what the comments means when it says it defaults to the standard extensions. Similar to the generated C header, the default should be to include all extensions.
parser.add_argument( | ||
"--arch", | ||
default="RV64", | ||
choices=["RV32", "RV64", "BOTH"], | ||
help="Target architecture (RV32, RV64, or BOTH). Default is RV64.", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The generated C header does not have this option. I think it makes sense to keep the same CLI interface for both of them. The default output should also include everything, not just RV64.
Thanks for the review @jordancarlin I tend to agree with all of them. I made this on a late night flight thus the ammount of small inconsistencies. I'll fix them ASAP. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work guys. Two high-level comments:
- Once this is ready, I think it'd be a good candidate for golden output checking like we do with the instruction appendix
- Not a must, but I do have a long-term vision of transitioning everything out of backends/ and into tools/ to finish up the file reorg. If it isn't too much work, we could move it there now and save some work down the road.
Sure, no problem.
To keep the commit history organized, can we first merge these two (this PR and #1051) and then I may move the full generator folder from one place to another? |
We should also add the generated C header as a golden output to avoid issues like #893 from popping up again. |
Works for me. |
Superseeded by #1090 |
Implements a new backend generator for SystemVerilog output, matching
the exact format used by riscv-opcodes/inst.sverilog. This provides
direct compatibility with hardware designs using the riscv-opcodes
SystemVerilog package format.
Features:
The generator produces output identical to riscv-opcodes format:
Tested against riscv-opcodes/inst.sverilog to ensure format compatibility.