diff --git a/README.md b/README.md index 3ab0c38..02ce05e 100644 --- a/README.md +++ b/README.md @@ -4,21 +4,26 @@ WAForth is a bootstrapping Forth interpreter and dynamic compiler for [WebAssembly](https://webassembly.org). You can see it in a demo [here](https://el-tramo.be/waforth/). -It is (almost) entirely written in WebAssembly and Forth, and the compiler generates -WebAssembly code on the fly. The only parts for which it relies on external -(JavaScript) code is the dynamic loader (since WebAssembly [doesn't support JIT -yet](https://webassembly.org/docs/future-features/#platform-independent-just-in-time-jit-compilation)), and the I/O primitives to read and write a character. +It is (almost) entirely written in WebAssembly and Forth, and the compiler +generates WebAssembly code on the fly. The only parts for which it relies on +external (JavaScript) code is the dynamic loader (since WebAssembly [doesn't +support JIT +yet](https://webassembly.org/docs/future-features/#platform-independent-just-in-time-jit-compilation)), +and the I/O primitives to read and write a character. -The implementation was influenced by [jonesforth](http://git.annexia.org/?p=jonesforth.git;a=summary), -and I shamelessly stole the Forth implementation of some of its high-level words. +The implementation was influenced by +[jonesforth](http://git.annexia.org/?p=jonesforth.git;a=summary), and I +shamelessly stole the Forth implementation of some of its high-level words. -WAForth is still just an experiment, and doesn't implement all the ANS standard words yet. +WAForth is still just an experiment, and doesn't implement all the ANS standard +words yet. ## Install Dependencies -The build uses [Racket](https://racket-lang.org) for processing the WebAssembly code, -the [WebAssembly Binary Toolkit](https://github.com/WebAssembly/wabt) for converting it in binary -format,and [Yarn](https://yarnpkg.com) for managing the dependencies of the shell. +The build uses [Racket](https://racket-lang.org) for processing the WebAssembly +code, the [WebAssembly Binary Toolkit](https://github.com/WebAssembly/wabt) for +converting it in binary format,and [Yarn](https://yarnpkg.com) for managing the +dependencies of the shell. brew install wabt yarn minimal-racket yarn @@ -42,67 +47,90 @@ The tests are served from `/tests`. ### The Macro Assembler -The WAForth core is written as [a single module](https://github.com/remko/waforth/blob/master/src/waforth.wat) in WebAssembly's [text format](https://webassembly.github.io/spec/core/text/index.html). The -text format isn't really meant for writing code in, so it has no facilities like a real assembler -(e.g. constant definitions, macro expansion, ...) However, since the text format uses S-expressions, -you can do some small modifications to make it extensible with Lisp-style macros. +The WAForth core is written as [a single +module](https://github.com/remko/waforth/blob/master/src/waforth.wat) in +WebAssembly's [text +format](https://webassembly.github.io/spec/core/text/index.html). The text +format isn't really meant for writing code in, so it has no facilities like a +real assembler (e.g. constant definitions, macro expansion, ...) However, since +the text format uses S-expressions, you can do some small modifications to make +it extensible with Lisp-style macros. I added some Racket macros to the module definition, and implemented [a mini assembler](https://github.com/remko/waforth/blob/master/src/tools/assembler.rkt) to print out the resulting s-expressions in the right format. -The result is something that looks like a standard WebAssembly module, but sprinkled with some -macros for convenience. +The result is something that looks like a standard WebAssembly module, but +sprinkled with some macros for convenience. ### The Interpreter The interpreter runs a loop that processes commands, and switches to and from compiler mode. -Contrary to some other Forth systems, this system doesn't use direct threading -for executing code. WebAssembly doesn't allow unstructured jumps, let alone dynamic jumps. -Instead, WAForth uses subroutine threading, where each word is implemented as a single -WebAssembly function, and the system uses -calls and indirect calls (see below) to execute words. +Contrary to some other Forth systems, this system doesn't use direct threading +for executing code. WebAssembly doesn't allow unstructured jumps, let alone +dynamic jumps. Instead, WAForth uses subroutine threading, where each word +is implemented as a single WebAssembly function, and the system uses calls +and indirect calls (see below) to execute words. ### The Compiler -While in compile mode for a word, the compiler generates WebAssembly instructions in -binary format (since there is no assembler infrastructure in the browser). Since WebAssembly -[doesn't support JIT compilation yet](https://webassembly.org/docs/future-features/#platform-independent-just-in-time-jit-compilation), a finished word is bundled into a separate binary WebAssembly module, and -sent to the loader, which dynamically loads it and registers it with a shared -[function table](https://webassembly.github.io/spec/core/valid/modules.html#tables) at the -next offset, which in turn is recorded in the word dictionary. +While in compile mode for a word, the compiler generates WebAssembly +instructions in binary format (since there is no assembler infrastructure in +the browser). Since WebAssembly [doesn't support JIT compilation +yet](https://webassembly.org/docs/future-features/#platform-independent-just-in-time-jit-compilation), +a finished word is bundled into a separate binary WebAssembly module, and sent +to the loader, which dynamically loads it and registers it with a shared +[function +table](https://webassembly.github.io/spec/core/valid/modules.html#tables) at +the next offset, which in turn is recorded in the word dictionary. -Because words reside in different modules, all calls to and from the words need to happen as -indirect `call_indirect` calls through the shared function table. This of course introduces -some overhead, although it seems limited. +Because words reside in different modules, all calls to and from the words need +to happen as indirect `call_indirect` calls through the shared function table. +This of course introduces some overhead, although it seems limited. -As WebAssembly doesn't support unstructured jumps, control flow words (`IF/ELSE/THEN`, -`LOOP`, `REPEAT`, ...) can't be implemented in terms of more basic words, unlike in jonesforth. -However, since Forth only requires structured jumps, the compiler can easily be implemented -using the loop and branch instructions available in WebAssembly. +As WebAssembly doesn't support unstructured jumps, control flow words +(`IF/ELSE/THEN`, `LOOP`, `REPEAT`, ...) can't be implemented in terms of more +basic words, unlike in jonesforth. However, since Forth only requires +structured jumps, the compiler can easily be implemented using the loop and +branch instructions available in WebAssembly. Finally, the compiler adds minimal debug information about the compiled word in -the [name section](https://github.com/WebAssembly/design/blob/master/BinaryEncoding.md#name-section), making it easier for doing some debugging in the browser. +the [name +section](https://github.com/WebAssembly/design/blob/master/BinaryEncoding.md#name-section), +making it easier for doing some debugging in the browser. -![Debugger view of a compiled word](https://el-tramo.be/blog/waforth/debugger.png "Debugger view of a compiled word") +![Debugger view of a compiled +word](https://el-tramo.be/blog/waforth/debugger.png "Debugger view of a +compiled word") ### The Loader -The loader is a small bit of JavaScript that uses the [WebAssembly JavaScript API](https://webassembly.github.io/spec/js-api/index.html) to dynamically load a compiled word (in the form of a WebAssembly module), and ensuring that the shared function table is large enough for the module to -register itself. +The loader is a small bit of JavaScript that uses the [WebAssembly JavaScript +API](https://webassembly.github.io/spec/js-api/index.html) to dynamically load +a compiled word (in the form of a WebAssembly module), and ensuring that the +shared function table is large enough for the module to register itself. ### The Shell -The shell is [a JavaScript class](https://github.com/remko/waforth/blob/master/src/shell/WAForth.js) -that wraps the WebAssembly module, and loads it in the browser. -It provides the I/O primitives to the WebAssembly module to read and write characters to a terminal, -and externally provides a `run()` function to execute a fragment of Forth code. +The shell is [a JavaScript +class](https://github.com/remko/waforth/blob/master/src/shell/WAForth.js) that +wraps the WebAssembly module, and loads it in the browser. It provides the I/O +primitives to the WebAssembly module to read and write characters to a +terminal, and externally provides a `run()` function to execute a fragment of +Forth code. -To tie everything together into an interactive system, there's a small console-based interface around this shell to type -Forth code, which you can see in action [here](https://el-tramo.be/waforth/). +To tie everything together into an interactive system, there's a small +console-based interface around this shell to type Forth code, which you can see +in action [here](https://el-tramo.be/waforth/). ![WAForth Console](https://el-tramo.be/waforth/console.gif "WAForth Console") + +### Misc notes + +- The exposed return stack isn't used. Control flow is kept implicitly in the + code (e.g. through branches, indirect calls, ...). This also means that + control flow can't be influenced by code. diff --git a/src/waforth.wat b/src/waforth.wat index 1ba18bb..0a58767 100644 --- a/src/waforth.wat +++ b/src/waforth.wat @@ -1894,9 +1894,11 @@ EOF ;; - name (n bytes): Name characters. End is 4-byte aligned. ;; - code pointer (4 bytes): Index into the function ;; table of code to execute - ;; - code argument (4 bytes) (optional): In case the function in the - ;; code pointer takes an argument (i.e. 'flags' has fData set). + ;; - code argument (4 bytes) (optional): In case the function is + ;; pushDataValue (used by CONST), contains data used by the function. ;; - data (m bytes) + ;; + ;; Execution tokens are addresses of dictionary entries (data (i32.const !baseBase) "\u000A\u0000\u0000\u0000") (data (i32.const !stateBase) "\u0000\u0000\u0000\u0000")