Documentation for CSEC machine #72

Draft
wants to merge 1 commit into
base: main
99 changes: 99 additions & 0 deletions docs/unified/docs/csec-machine/architecture.md
@@ -0,0 +1,99 @@
# Architecture

## Entry Point

The CSEC machine module lives in `java-slang/src/ec-evaluator`.

The entry point is in `ec-evaluator/index.ts`, in `runECEvaluator`.

The evaluator loop is in `ec-evaluator/interpreter.ts`, in `evaluate`.

The frontend used for the Source Academy website has a specialised implementation for the Java CSEC machine, which adds support for the 'Classes' component.

## Components

The CSEC machine comprises 4 components:

- Control
- Stash
- Environment
- Classes

All components are declared and implemented in `ec-evaluator/components.ts`.

**Control**

The control is implemented as a simple stack of `ControlItem`s.

**Stash**

The stash is implemented as a simple stack of `StashItem`s.
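
As an illustration of what a "simple stack" means for both the control and the stash, here is a minimal sketch. The class name `SketchStack` and its methods are hypothetical stand-ins; the actual declarations in `ec-evaluator/components.ts` may differ.

```typescript
// Hypothetical generic stack; not the real class from components.ts.
class SketchStack<T> {
  private items: T[] = [];

  // Push one or more items; the last argument ends up on top.
  push(...newItems: T[]): void {
    this.items.push(...newItems);
  }

  // Remove and return the top item, or undefined if the stack is empty.
  pop(): T | undefined {
    return this.items.pop();
  }

  // Return the top item without removing it.
  peek(): T | undefined {
    return this.items[this.items.length - 1];
  }

  isEmpty(): boolean {
    return this.items.length === 0;
  }
}

const demoControl = new SketchStack<string>();
demoControl.push("first", "second");
// "second" was pushed last, so it is popped first.
const poppedFirst = demoControl.pop();
```

The same shape would serve for the stash by choosing a different element type.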

**Environment**

The environment is implemented as a doubly-linked list of frames.

**Classes**

Classes are a component unique to the Java CSEC machine, which is a specialisation of the CSE notional machine (which is, in its theoretical form, language-agnostic). This component was added to reflect class and object-oriented programming semantics with minimal mental overhead, keeping the machine easy for learners to understand. The alternative would have been to represent classes entirely in terms of constructs native to the formal CSE machine (or an existing familiar implementation, like the Source CSE machine).

The `Class` type is defined (as an interface) in `ec-evaluator/types.ts`, like all other CSE machine types.

### Initialisation

When a program is first input for evaluation, it is parsed and transformed into a compilation unit. Normal Java programs are compiled as separate classes, with each top-level class occupying its own file. Once compiled into bytecode, a class file may be run, in which case the entry point is the `main` method of that class file.

The CSEC machine does not permit distinct files for Java programs. Instead, all classes are declared in the same file. The entry point is then the first declared `main` method in the entire file. It is a runtime error if no such method exists.
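
The entry-point rule can be sketched as a search over class declarations in source order. The types `MethodSketch` and `ClassSketch` and the function `findEntryPoint` are hypothetical and heavily simplified; in particular, this sketch matches on the method name only, whereas a real check would presumably also inspect the method's modifiers and signature.

```typescript
// Hypothetical, simplified shapes; the real AST node types are richer.
interface MethodSketch {
  methodName: string;
}

interface ClassSketch {
  className: string;
  methods: MethodSketch[];
}

// Return the first `main` method in declaration order across the whole file.
// Throws at runtime if no class declares one.
function findEntryPoint(classes: ClassSketch[]): { owner: string; method: MethodSketch } {
  for (const cls of classes) {
    for (const method of cls.methods) {
      if (method.methodName === "main") {
        return { owner: cls.className, method };
      }
    }
  }
  throw new Error("No main method found");
}
```

With classes `A` (no `main`), `B` (has `main`) and `C` (has `main`) declared in that order, this sketch selects the `main` of `B`.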

### Evaluation Loop

As is the case with all CSE machines, the evaluator repeatedly pops the current command off the top of the control, and performs actions based on the command seen.

Step-wise evaluation is handled by dispatch to functions of the type

`(command, environment, control, stash) => void`,

which has the type alias `CmdEvaluator` as defined in `interpreter.ts`.

The function for each command (instruction or AST node type) is registered in `cmdEvaluators`, a table also declared in `interpreter.ts`.
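
The pop-and-dispatch loop described above can be sketched as follows. Only the names `CmdEvaluator`, `cmdEvaluators` and `evaluate` come from this document; `SketchCommand`, `SketchEnvironment`, the array-based control and stash, and the two registered handlers are simplified assumptions made for illustration and do not reproduce the real `interpreter.ts`.

```typescript
// Hypothetical, simplified command shape.
interface SketchCommand {
  kind: string;
  value?: number;
}

// Hypothetical environment: a flat name-to-value table.
type SketchEnvironment = Record<string, number>;

type CmdEvaluator = (
  command: SketchCommand,
  environment: SketchEnvironment,
  control: SketchCommand[],
  stash: number[],
) => void;

// Dispatch table keyed by command kind (only two illustrative entries).
const cmdEvaluators: Record<string, CmdEvaluator> = {
  // A literal pushes its value onto the stash.
  Literal: (command, _environment, _control, stash) => {
    stash.push(command.value ?? 0);
  },
  // Pop discards the value on top of the stash.
  Pop: (_command, _environment, _control, stash) => {
    stash.pop();
  },
};

// Repeatedly pop the top command off the control and dispatch on its kind.
function evaluate(
  control: SketchCommand[],
  environment: SketchEnvironment,
  stash: number[],
): void {
  let command = control.pop();
  while (command !== undefined) {
    const evaluator = cmdEvaluators[command.kind];
    if (evaluator === undefined) {
      throw new Error(`No evaluator registered for ${command.kind}`);
    }
    evaluator(command, environment, control, stash);
    command = control.pop();
  }
}
```

Keeping the handlers in a table means that supporting a new command kind only requires registering one more entry; the loop itself stays unchanged.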

The full list of registered command evaluators (and thus the AST node types and instructions that the CSEC machine can presently handle) is:

_AST Node Types_

- `CompilationUnit`
- `NormalClassDeclaration`
- `Block`
- `ConstructorDeclaration`
- `FieldDeclaration`
- `LocalVariableDeclarationStatement`
- `ExpressionStatement`
- `ReturnStatement`
- `Assignment`
- `MethodInvocation`
- `ClassInstanceCreationExpression`
- `ExplicitConstructorInvocation`
- `Literal`
- `Void`
- `ExpressionName`
- `BinaryExpression`

_Instructions_

- `Pop`
- `Assign`
- `BinaryOperation`
- `Invocation`
- `Env`
- `Reset`
- `EvalVariable`
- `Res`
- `Deref`
- `New`
- `ResType`
- `ResTypeCont`
- `ResOverload`
- `ResOverride`
- `ResConOverload`

All AST node types are supplied by (various modules in) `src/ast/types/…`, while all instructions are supplied by `src/ec-evaluator/types.ts`.
19 changes: 19 additions & 0 deletions docs/unified/docs/csec-machine/classes.md
@@ -0,0 +1,19 @@
# Classes

## Classes

CSEC machine classes are stored in the environment, like bindings (in fact, class objects are just bound names in the CSEC machine).

Class declarations merely extend (directly or indirectly) the global environment frame. The `Object` class is initialised first for any program evaluated in the CSEC machine (created as part of the `CompilationUnit` expansion), and extends the global environment.

Each class owns its own environment frame, which contains all class member bindings (static fields, static methods, instance methods, and constructors).

## Instances

Instances of classes are also stored in the environment. For a particular instance of a class, a frame is created for that class and for each of its superclasses; the frames are then linked together to reflect the scoping semantics of classes (hierarchical search).

An empty frame is first created as the `Object` class's instance frame. It is then extended with one frame for each class on the inheritance chain from `Object` down to the class being instantiated. As the frames are created, the instance fields *and* static fields are populated.

The `Object` frame is presently empty.

Static fields are handled by aliasing the variable identifiers. That is, instead of pointing the instance frame to the class' frame (where the static variables were stored), the variables are redeclared in the instance frames and point to the same underlying 'location' (implementation detail).
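
The aliasing of static fields can be illustrated with shared mutable cells. Everything in this sketch (`LocationSketch`, `AliasFrame`, `getLocation`, `makeInstanceFrame`) is a hypothetical simplification: class frames here hold only static fields, and values are plain numbers.

```typescript
// A mutable cell standing in for the underlying 'location' of a variable (hypothetical).
interface LocationSketch {
  value: number;
}

// A frame maps identifiers to locations (hypothetical, simplified).
type AliasFrame = Record<string, LocationSketch>;

// Look up the location bound to a name, failing loudly if it is unbound.
function getLocation(frame: AliasFrame, name: string): LocationSketch {
  const cell = frame[name];
  if (cell === undefined) {
    throw new Error(`Unbound name: ${name}`);
  }
  return cell;
}

// Build an instance frame: static fields are redeclared as aliases of the
// class frame's locations, while instance fields get fresh locations.
function makeInstanceFrame(classFrame: AliasFrame, instanceFields: string[]): AliasFrame {
  const frame: AliasFrame = {};
  for (const name of Object.keys(classFrame)) {
    // Alias: reuse the very same location object rather than copying the value.
    frame[name] = getLocation(classFrame, name);
  }
  for (const field of instanceFields) {
    // Each instance owns its own location for an instance field.
    frame[field] = { value: 0 };
  }
  return frame;
}
```

Because two instance frames built from the same class frame share the static field's location, a write through one instance is visible through the other and through the class frame, while instance fields stay independent.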
66 changes: 66 additions & 0 deletions docs/unified/docs/csec-machine/control.md
@@ -0,0 +1,66 @@
# Control

## Definitions

Control items are split into two kinds:

- AST nodes, and
- instructions.

This specification uses the following terms and definitions to describe the behaviour of the CSEC machine when a control item is executed.

A control item is _executed_ when it is run by the CSEC machine. The CSEC machine can only execute one control item per step.

A control item, when executed, may:

- pop zero or more stash items,
- push zero or more control items, and
- push zero or more stash items.

A control item is _fully executed_ when it has been executed and all of the control items created during its reduction have themselves been fully executed. Where the context is clear, when we use the term 'fully executed' to refer to a particular state of the CSEC machine, we mean the time when a control item is _first_ fully executed.

For each control item, we define:

- _Qualified Name_: the name of the control item and its type (AST node or instruction).
- _Preconditions_: the conditions that must be fulfilled when the control item is executed.
- _Postconditions_: the conditions that are fulfilled when the control item is fully executed.
- _Expansion_: the control items that the control item pushes onto the control when it is executed.
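
To make the difference between _executed_ and _fully executed_ concrete, here is a small sketch of single-step execution. The item shapes and the `step` function are hypothetical; the assumption that a binary expression expands into its two operands followed by a `BinaryOperation` instruction follows the usual CSE machine convention and is not taken from the CSEC implementation.

```typescript
// Hypothetical control items: two AST node kinds and one instruction.
type ControlItemSketch =
  | { kind: "Literal"; value: number }
  | {
      kind: "BinaryExpression";
      operator: "+" | "*";
      left: ControlItemSketch;
      right: ControlItemSketch;
    }
  | { kind: "BinaryOperation"; operator: "+" | "*" };

// Execute exactly one control item (one machine step).
function step(control: ControlItemSketch[], stash: number[]): void {
  const item = control.pop();
  if (item === undefined) {
    return;
  }
  switch (item.kind) {
    case "Literal":
      // Pops nothing, pushes no control items, pushes one stash item.
      stash.push(item.value);
      break;
    case "BinaryExpression":
      // Pushes three control items, in reverse so the left operand runs first.
      control.push({ kind: "BinaryOperation", operator: item.operator }, item.right, item.left);
      break;
    case "BinaryOperation": {
      // Pops two stash items and pushes one.
      const right = stash.pop();
      const left = stash.pop();
      if (left === undefined || right === undefined) {
        throw new Error("Stash underflow");
      }
      stash.push(item.operator === "+" ? left + right : left * right);
      break;
    }
  }
}
```

In this sketch, a `BinaryExpression` over two literals is executed after one step, but is fully executed only after four steps, once its operands and the `BinaryOperation` instruction have also run.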

## Control Item Set

_AST Node Types_

- `CompilationUnit`
- `NormalClassDeclaration`
- `Block`
- `ConstructorDeclaration`
- `FieldDeclaration`
- `LocalVariableDeclarationStatement`
- `ExpressionStatement`
- `ReturnStatement`
- `Assignment`
- `MethodInvocation`
- `ClassInstanceCreationExpression`
- `ExplicitConstructorInvocation`
- `Literal`
- `Void`
- `ExpressionName`
- `BinaryExpression`

_Instructions_

- `Pop`
- `Assign`
- `BinaryOperation`
- `Invocation`
- `Env`
- `Reset`
- `EvalVariable`
- `Res`
- `Deref`
- `New`
- `ResType`
- `ResTypeCont`
- `ResOverload`
- `ResOverride`
- `ResConOverload`
Binary file added docs/unified/docs/csec-machine/ece.pdf
Binary file not shown.
3 changes: 3 additions & 0 deletions docs/unified/docs/csec-machine/environment.md
@@ -0,0 +1,3 @@
# Environment

Environments hold collections of bindings in the CSEC machine. They adopt the familiar linked-frame implementation, which gives rise to scoping semantics.
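
A minimal sketch of how linked frames yield scoping: name lookup starts at the current frame and walks outwards until a binding is found. `ScopeFrame` and `lookupName` are hypothetical names, values are plain numbers, and only the link to the enclosing frame is modelled here.

```typescript
// Hypothetical frame: a table of bindings plus a link to the enclosing frame.
interface ScopeFrame {
  bindings: Record<string, number>;
  parent: ScopeFrame | null;
}

// Walk from the innermost frame outwards; the first binding found wins,
// so inner declarations shadow outer ones.
function lookupName(frame: ScopeFrame | null, name: string): number {
  for (let current = frame; current !== null; current = current.parent) {
    if (Object.prototype.hasOwnProperty.call(current.bindings, name)) {
      return current.bindings[name] as number;
    }
  }
  throw new Error(`Unbound name: ${name}`);
}

const globalFrameSketch: ScopeFrame = { bindings: { x: 1, y: 2 }, parent: null };
const innerFrameSketch: ScopeFrame = { bindings: { x: 10 }, parent: globalFrameSketch };
```

Looking up `x` from `innerFrameSketch` finds the inner binding, while looking up `y` falls through to the global frame.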
17 changes: 17 additions & 0 deletions docs/unified/docs/csec-machine/index.md
@@ -0,0 +1,17 @@
# CSEC Machine

The CSEC Machine is an implementation of the CSE notional machine. This machine extends the CSE notional machine with an additional component, Classes, to better reflect Java's semantics in a learner-friendly manner.

The implementation architecture of the CSEC machine is described in [Architecture](./architecture.md).

## CSEC Machine Components

- [Control](control.md)
- [Stash](stash.md)
- [Environment](environment.md)
- [Classes](classes.md)

## References

- [K. M. Abad, M. Henz, "Beyond SICP — Design and Implementation of a Notional Machine," Proceedings of the 2024 Workshop on Scheme and Functional Programming, 2024.](scheme24.pdf)
- [X. Y. Liew, "Explicit-Control Evaluator for Java," B. Comp. dissertation, School of Computing, National University of Singapore, 2024.](ece.pdf)
Binary file added docs/unified/docs/csec-machine/scheme24.pdf
Binary file not shown.
39 changes: 39 additions & 0 deletions docs/unified/docs/csec-machine/stash.md
@@ -0,0 +1,39 @@
# Stash

## Stash Item Set

- Primitive
- Literal
- Reference (to an object/instance of a class)
- Value
- Variable
- Closure
- Class
- Void
- Type

## Description

**Primitive.**
_Primitives_ are used for direct evaluation with primitive operations. Primitive operations only accept primitive types.

There is only one primitive type: _literals_. Literals are abstract representations of numbers in the CSEC machine.

**Reference.**
_References_ refer to Java objects.

**Value.**
The _Value_ type is a union of three types:

- _Variables_: Todo
    - ~ Messy implementation?
- _Closures_: either methods or constructors.
    - ~ Messy implementation: why should instance and/or static methods and constructors be treated any differently? Why is that information baked into closures, when all a closure really is is a function?
- _Classes_: Java class definitions.
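
One way to picture a three-way union like _Value_ is as a TypeScript discriminated union. The `kind` tag, the field names and `describeValue` are assumptions made for illustration only; the actual definitions in `ec-evaluator/types.ts` may be structured differently.

```typescript
// Hypothetical member types, each tagged with a `kind` discriminant.
type VariableSketch = { kind: "variable"; identifier: string };
type ClosureSketch = { kind: "closure"; parameters: string[] };
type ClassValueSketch = { kind: "class"; className: string };

// The union of the three member types.
type ValueSketch = VariableSketch | ClosureSketch | ClassValueSketch;

// Narrow on the tag to handle each member of the union.
function describeValue(value: ValueSketch): string {
  switch (value.kind) {
    case "variable":
      return `variable ${value.identifier}`;
    case "closure":
      return `closure taking ${value.parameters.length} parameter(s)`;
    case "class":
      return `class ${value.className}`;
  }
}
```

Under this sketch, a closure carries nothing that marks it as a static method, instance method, or constructor, which is the simplification the note on closures above argues for.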

**Void.**
The _Void_ type represents the empty (uninhabited) type.

**Type.**
The _Type_ type is a symbolic reference to a Java type. More specifically, it is a symbolic reference to a Java class.

At runtime, instantiations of this type are resolved symbolically in an environment to a Java class.
9 changes: 9 additions & 0 deletions docs/unified/mkdocs.yml
@@ -3,6 +3,13 @@ site_name: java-slang
nav:
  - Introduction: index.md
  - Contributing: contributing.md
  - CSEC Machine:
      - csec-machine/index.md
      - Architecture: csec-machine/architecture.md
      - Control: csec-machine/control.md
      - Stash: csec-machine/stash.md
      - Environment: csec-machine/environment.md
      - Classes: csec-machine/classes.md

repo_url: https://github.com/source-academy/java-slang
repo_name: source-academy/java-slang
@@ -22,3 +29,5 @@ theme:
toggle:
icon: material/brightness-4
name: Switch to light mode
features:
- navigation.indexes