-
Notifications
You must be signed in to change notification settings - Fork 7
Varargs
Varargs and proper tail calls are not implemented in Sourceror currently. This document describes the necessity of varargs and a possible way of implementing them in Sourceror.
Note: This design might be superceded by other possibilities. In particular, one might want to consider supporting tail calls only on platforms with the new WebAssembly tail calls feature.
A vararg function is a function that takes any number of arguments. The calling convention of a vararg function must be designed such that the callee can figure out the number and types of arguments that it was called with at runtime.
There are two types of situations where we will need varargs:
- To implement first-class support for display()/error()/etc that allows multiple number of arguments
- To implement tail call optimisation
There are two types of function calls:
- Those where we know the function being called (i.e. direct call)
- Those where we don’t, because we call a function via a pointer to it (i.e. indirect call)
For direct calls, we know if the callee is a vararg function, and for non-vararg functions we know the number and types of arguments at compilation time. We can hence check at compilation time if a direct call has the wrong number or types of arguments.
For indirect calls, we don’t know anything about the callee. We have to call the callee using a vararg calling convention and let the callee figure out if the number and type of arguments is correct. As the callee cannot make any guarantees about the types of arguments, all arguments must be widened to Any before being passed through an indirect call. For simplicity, we demand that all vararg calls must have all arguments widened to Any, regardless of whether they are direct or indirect calls. The return type must also be Any.
Since WebAssembly does not have builtin vararg function signatures, we have to utilise the unprotected stack to pass the actual arguments. Then we call the callee with a single i32 parameter, specifying the number of actual arguments to read from the unprotected stack.
For display(), there are two functions display#1() and display#2() implemented in Source. The compiler automatically generates the function display() (unimplementable in Source) that does a direct call to display#1(), display#2(), or trap, depending on the number of arguments. The same applies for error(). Some functions like math_min(), math_max(), list() do not forward their arguments to any function implemented in Source. These functions are self-contained functions automatically generated by the compiler.
Source demands proper tail calls, which is not afforded by WebAsssembly (there is a separate WebAssembly proposal for specialized instructions for tail calls, but it is nowhere near ready yet).
To emulate tail calls in WebAssembly, we in general have to emit a trampoline functions at every normal call site:
val_t trampoline(f_ptr* fn) {
val_t result;
do {
result = (*fn)(&fn);
} while(fn != NULL);
return result;
}
The complementary code at the site of every tail call would also have to be added.
There is a difference in signature of the callee, because the version called by a trampoline needs to take in an extra pointer to function pointer parameter.
Executing the trampoline is expensive compared to normal direct calls. This is because:
- We need to leverage the unprotected stack to pass arguments
- Direct calls have to become indirect calls (except possibly the for the first call)
Hence, we would like to select a minimum number of call sites to emit trampolines; the other call sites should use normal direct calls without trampolines. To do so, we make the following assumption: The size of the call stack is large enough to contain all non-recursive functions. More specifically, we mean: In every possible chain of calls to distinct functions, the size of the chain of calls will not exceed the size available for the call stack.
This assumption means that we only need to guarantee tail calls in situations where recursion might happen AND such recursion is tail call optimisable.
We build a directed graph G = (V, E) where the vertices are the set of functions in our program, and an edge from u to v means that the function body of u contains at least one direct call to v that can be optimised into a tail call (i.e. the return value of the call to v is directly returned by u). Note that this graph might contain self-loops.
We say that v is tail reachable from u if there is a non-empty path from u to v in G. (So u is tail reachable from u if and only if there is a cycle containing u in G.)
We call a function f requires trampoline if f satisfies at least one of the following:
- f is tail reachable from f
- f contains an indirect tail call
- There exists some function v in G such that v is tail reachable from f and v contains an indirect tail call
Intuitively, a function requires trampoline if it could be part of a cycle of tail calls, that could be direct or indirect. Note that some functions that requires trampoline may not actually ever be able to lead to a recursion at runtime. The definition of “requires trampoline” is meant to be a superset of the functions that could ever lead to a recursion at runtime.
The rules for emitting functions are as follows:
- Every function that does not require trampoline shall have a non-trampolined copy (i.e. doesn’t have the pointer to function pointer parameter) emitted; this may or may not use a vararg signature (display()/error()/etc will use vararg even though they do not require trampoline)
- Every function that requires trampoline or has its address taken (i.e. can be a target of an indirect call) shall have a trampolined copy emitted (i.e. has the pointer to function pointer parameter) emitted; note that all trampolined copies of functions have vararg parameters since they have to be called indirectly.
Note that a function that has its address taken and does not require trampoline will have two copies of the function generated.
Hence each function may have up to two versions emitted:
- Non-trampolined non-vararg version: for functions that do not require trampoline and are not pre-declared vararg functions (e.g. display(), etc)
- Non-trampolined vararg version: for functions that do not require trampoline and are pre-declared vararg functions
- Trampolined vararg version: for functions that require trampoline or functions that have their address taken
There are hence two ways to call a function:
- Direct call without trampoline: invokes non-trampolined non-vararg version for functions that are not pre-declared vararg functions, and non-trampolined vararg version for functions that are pre-declared vararg functions
- Direct or indirect call with trampoline: invokes trampolined vararg version
The rules for emitting a trampoline at a call site are as follows:
- If it is a direct call:
- If it is a non-tail call or the caller is a non-trampolined version, then:
- If the callee requires trampoline, emit a direct call with a new trampoline
- Else, emit a direct call without trampoline
- Else:
- If the callee requires trampoline, emit a tail call with the caller’s original trampoline
- Else, emit a direct call without trampoline
- If it is a non-tail call or the caller is a non-trampolined version, then:
- If it is an indirect call:
- If it is a non-tail call or the caller is a non-trampolined version, then emit an indirect call with a new trampoline
- Else, emit a tail call with the caller’s original trampoline
This guarantees that all functions that requires trampoline will only ever be called from a trampoline context. Functions that do not require trampoline will be called from a trampoline context if they are called indirectly, and from a non-trampoline context if they are called directly; they do not make use of their own trampoline context (if any).