In which we lay the groundwork for the rest of the book…
While someone with no programming experience could probably learn RISC-V from this book, it is definitely preferable to have at least some experience in a higher-level imperative programming language. I say imperative because programming in assembly is the antithesis of functional programming; everything is about state, with each line changing the state of the CPU and sometimes memory. Given that, experience in functional languages like Lisp, Scheme etc. is less helpful than experience in C/C++, Java, Python, JavaScript etc.
Of all of the latter, C is the best, with C++ being a close second because at least all of C exists in C++. There are many reasons C is the best prior experience when learning assembly (any assembly, not just RISC-V), including the following:
- pointers, concepts and symmetry of "address of" and "dereference" operators
- pointer/array syntax equivalence
- stack allocation as the default
- manual memory management, no garbage collector
- global data
- rough equivalence in structure of a C program and an assembly program (vs. say Java)
- pass by value
There is some overlap between those and there are probably more, but you can see that most other languages that are commonly taught as first languages are missing most, if not all, of those things.
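To make a few of those bullets concrete, here is a small sketch in C. The names (counter, add_one_to_copy, add_one_through_pointer, sum) are made up for illustration and don't come from the book's repo; the point is only to show pass by value, "address of"/"dereference", pointer/array equivalence, and global data side by side.

```c
#include <stddef.h>

/* Global data: one int that exists for the whole run of the program. */
int counter = 0;

/* Pass by value: x is a copy, so the caller's variable is left untouched. */
void add_one_to_copy(int x)
{
    x = x + 1;
}

/* The pointer is also passed by value, but it holds the caller's address,
 * so dereferencing it changes the caller's data. */
void add_one_through_pointer(int *p)
{
    *p = *p + 1;    /* dereference: read and write the int that p points at */
}

/* Pointer/array equivalence: a[i] means the same thing as *(a + i). */
int sum(const int *a, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        total += *(a + i);
    }
    return total;
}
```

Calling add_one_to_copy(v) leaves v alone, while add_one_through_pointer(&v) changes it; the second form is much closer to how assembly code passes data around, which is why this background helps.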
Even C++, which technically has all of them being a superset of C, is usually taught in a way that mostly ignores all of those things. They teach C++ as if it’s Java, never teaching the fundamentals. In any case this is getting into my problems with CS pedagogy of the last 20 years based on my experience as a CS major myself ('12) and as a programming tutor helping college students across the country since 2016, and I should save it for a proper essay/rant sometime.
Long story short, I use C code and C syntax to help explain and teach RISC-V. I’ll try to provide enough explanation regardless of past experience as best I can.
As I tell all of my tutoring students, if you’re majoring in CS or anything related I highly recommend you use Linux. It’s easier in every way to do dev work on Linux vs Windows or Mac. Many assignments require it, which often necessitates using a virtual machine (which is painful, especially on laptops) and/or ssh-ing into a school Linux server, which is also less than ideal. In general, you’ll have to learn how to use the Unix terminal eventually and will probably use it to some extent in your career so it also makes sense to get used to it asap.
That being said, Windows does now have WSL so you can get the full Ubuntu or Debian or Fedora etc. terminal based system on Windows without having to set up a real virtual machine (or deal with the slowdown that would cause). I’ve even heard that they’ll get support for Linux GUI programs soon.
macOS, on the other hand, is technically a Unix based system, and you can use its terminal and install virtually any program from there using MacPorts or Homebrew or similar.
There are a few RISC-V simulators that I know of and have used:
- RARS is a RISC-V port of MARS, a Java GUI based simulator with dozens of extra environment calls, syntactic sugar and features like graphics, memory mapped I/O, etc.
- Venus, a web based simulator used by Berkeley (also has a downloadable jar)
- Ripes, a graphical processor simulator and assembly editor for bare bones assembly programming
Of those three, RARS is by far the most full featured and user friendly for learning since it forked from the venerable MARS MIPS simulator. It is also the most commonly used by students outside of Berkeley. Given that, this book will focus primarily on RARS, though most of it applies equally well to Venus. [_appendix_a_venus] covers the differences between RARS and Venus.
You can download/access both at the following links:
- RARSM RARS iMproved; fork of RARS with fixes/features
- ThaumicMekanism’s Venus. Get the jar here.
- CS 61C’s Venus. Get the jar here.
There are a few references that you should bookmark (or download) before you get started. The first is the RISC-V Greensheet. It’s possible you already have a physical copy of this as it’s actually the tear-out from the Patterson and Hennessy textbook Computer Organization and Design that is commonly used in college courses. Berkeley provides a similar reference sheet with the same information.
There is also a large format of the greensheet.
The second thing is the list of environment calls (aka ecalls, system calls, syscalls) from the RARS wiki.
I recommend you download/bookmark both and keep them open while working because you’ll be referencing them often to remind yourself which instructions and ecalls you have available and how they work.
Let’s start with the classic hello world program, first in C, then in RISC-V, and go over all the pieces in overview. You can copy and paste these into your editor of choice (mine being neovim), or use the files in the associated repo to follow along.
link:code/hello.c[role=include]
It is pretty self-explanatory. You have to include stdio.h so you can use the function printf (though in the real world I’d use puts here). The function main, which returns an int, is where any C/C++ program starts. We call printf to display the string "Hello World!\n" to the user and then return 0 to exit. Returning 0 indicates success, i.e. that there were no errors.
You can compile and run it in a Linux/Unix terminal as shown below. You can substitute clang or another compiler for gcc if you want.
$ gcc -o hello hello.c
$ ./hello
Hello World!
Now, the same program in RISC-V:
link:code/hello.s[role=include]
The .data section is where you declare global variables, which includes string literals as in this case. We’ll cover them in more detail later.
The .text section is where any code goes. Here we declare a single label, main:, indicating the start of our main function.
We then put the number 4 in the a7 register to select the print string system call. The print string system call takes one argument, the address of the string to print, in the a0 register. We do that on the next line. On line 8, we call the system call using the ecall instruction.
Finally we call the exit system call which takes no arguments and exits the program.
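If it helps, the register convention described above can be sketched as a rough C analogy. Everything here is hypothetical: struct regs and model_ecall are names invented for illustration and are not part of RARS, and the exit number 10 is an assumption taken from the RARS ecall list rather than from the text above. It only models the idea that a7 selects the call and a0 carries the argument.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical model of the two registers involved; not a RARS API. */
struct regs {
    uintptr_t a0;   /* argument: for print string, the address of the text */
    uintptr_t a7;   /* selector: which environment call to perform */
};

enum {
    ECALL_PRINT_STRING = 4,   /* the number given in the text above */
    ECALL_EXIT = 10           /* assumption: exit's number in the RARS ecall list */
};

/*
 * Roughly what a simulator does when it reaches an ecall instruction.
 * Returns the number of characters printed, -1 if the program asked to
 * exit, or -2 for a call number this sketch does not model.
 */
int model_ecall(const struct regs *r)
{
    switch (r->a7) {
    case ECALL_PRINT_STRING: {
        const char *s = (const char *)r->a0;   /* a0 holds an address */
        fputs(s, stdout);
        return (int)strlen(s);
    }
    case ECALL_EXIT:
        return -1;
    default:
        return -2;
    }
}
```

In the real program there is no function call at all, of course: you load the two registers and execute ecall, and the simulator (or operating system) takes it from there.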
Again, we’ll cover system calls in a later chapter. This is just an intro/overview so don’t worry if some things aren’t completely clear. This chapter is about getting you up and running, not really about teaching anything specific yet.
Now that we have our hello world RISC-V program, how do we run it? Well the easiest and quickest[1] way is of course to do it on the command line, which can be done like this:
$ java -jar ~/rars_latest.jar hello.s
RARS 1.5 Copyright 2003-2019 Pete Sanderson and Kenneth Vollmar
Hello World!
Program terminated by calling exit
The name of your RARS jar file may be different[2], so be sure to use the correct name and path. For myself, I keep the jar file in my home directory so I can use tilde to access it no matter where I am. You can also copy it into your working directory (i.e. wherever you have your source code) so you don’t have to specify a path at all. There are lots of useful command line options that you can use[3], some of which we’ll touch on later.
Running the jar directly on the command line works even in the Windows/DOS command line though I’ve never done it and it’s probably not worth it.
Alternatively, you can start up RARS like a normal GUI application and then load your source file. RARS requires you to hit "assemble" and then "run".