# Compiler engine

This package implements the Compiler engine for WebAssembly *purely written in Go*.
This document describes the background, the technical difficulties, and some design choices.

## General limitations on pure Go Compiler engines

In a Go program, each goroutine manages its own stack, and each item on a
goroutine's stack is managed by the Go runtime for garbage collection, etc.

This imposes some difficulties on a compiler engine written purely in Go because
we *cannot* use native push/pop instructions to save/restore temporary
variables spilling from registers. As a result, we cannot invoke Go functions
from compiled native code with the native `call` instruction, since it
involves stack manipulation.

*TODO: maybe it is possible to hack the runtime so that function calls can be
made with `call`.*

## How to generate native codes

wazero uses its own assembler, implemented from scratch in the
[`internal/asm`](../../asm/) package. The primary rationales are wazero's
zero-dependency policy and the need to enable concurrent compilation (a feature
the WebAssembly binary format optimizes for).

Before this, wazero used [`twitchyliquid64/golang-asm`](https://github.com/twitchyliquid64/golang-asm).
However, this was not only a dependency (one of our goals is to have zero
dependencies), but also a large one (several megabytes added to the binary).
Moreover, no copy of golang-asm is thread-safe, so it can't be used for
concurrent compilation (see [#233](https://github.com/tetratelabs/wazero/issues/233)).

The assembled native code is represented as a `[]byte`, and the slice region is
marked as executable via the `mmap` system call.

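As an illustration of placing assembled bytes in executable memory, here is a minimal sketch for Unix-like systems. It is not wazero's actual code: the helper name `mapExecutable` is hypothetical, and the sketch maps the region writable first and then switches it to read+execute with `mprotect`, rather than requesting every permission in a single `mmap` call. The bytes are never executed here.

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

// mapExecutable is a hypothetical helper: it copies code into a fresh
// anonymous mapping and then marks that mapping as read+execute.
func mapExecutable(code []byte) ([]byte, error) {
	if len(code) == 0 {
		return nil, errors.New("empty code")
	}
	// Anonymous, private mapping that is writable so we can copy into it.
	region, err := syscall.Mmap(-1, 0, len(code),
		syscall.PROT_READ|syscall.PROT_WRITE,
		syscall.MAP_ANON|syscall.MAP_PRIVATE)
	if err != nil {
		return nil, err
	}
	copy(region, code)
	// Drop write permission and add execute permission.
	if err := syscall.Mprotect(region, syscall.PROT_READ|syscall.PROT_EXEC); err != nil {
		_ = syscall.Munmap(region)
		return nil, err
	}
	return region, nil
}

func main() {
	code := []byte{0xc3} // a single amd64 `ret`, used only as sample bytes
	region, err := mapExecutable(code)
	if err != nil {
		fmt.Println("mapping failed:", err)
		return
	}
	defer syscall.Munmap(region)
	fmt.Println("mapped bytes:", len(region))
}
```

Some hardened environments refuse to make anonymous memory executable, in which case `mapExecutable` returns the error from the kernel.
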
## How to enter native codes

Assuming that we have native code as a `[]byte`, it is straightforward to enter
the native code region via Go assembly code. In this package, we declare a
function without a body called `nativecall`:

```go
func nativecall(codeSegment, engine, memory uintptr)
```

where `codeSegment uintptr` is the first argument. This pointer refers to the
first instruction to be executed. It can be easily derived from the
`[]byte` via `unsafe.Pointer`:

```go
code := []byte{}
/* ...Compilation ...*/
codeSegment := uintptr(unsafe.Pointer(&code[0]))
nativecall(codeSegment, ...)
```

`nativecall` is implemented in Go assembly (see [arch_amd64.s](./arch_amd64.s))
as a convenience layer that complies with Go's official calling convention.
The jump into the code segment is delegated to that assembly code.

## Why it's safe to execute runtime-generated machine codes against async Goroutine preemption

Goroutine preemption is the Go runtime's mechanism for switching goroutine contexts on an OS thread.
There are two types of preemption: cooperative and asynchronous (async). The former happens, for example,
when making a function call, and it is not an issue for our runtime-generated functions because they do not make
direct calls to Go-implemented functions. Async preemption, on the other hand, can be problematic
because it may interrupt a goroutine at any point in a function, and it manipulates the CPU register state.

Fortunately, our runtime-generated machine code does not need to take async preemption into account.
All generated code is entered via a trampoline implemented as a Go assembly function (e.g. [arch_amd64.s](./arch_amd64.s)),
and as of Go 1.20, such assembly functions are considered _unsafe_ for async preemption:
- https://github.com/golang/go/blob/go1.20rc1/src/runtime/preempt.go#L406-L407
- https://github.com/golang/go/blob/9f0234214473dfb785a5ad84a8fc62a6a395cbc3/src/runtime/traceback.go#L227

From the Go runtime's point of view, the execution of runtime-generated machine code is part of
that trampoline function. Therefore, runtime-generated machine code is also correctly considered unsafe for async preemption.

## Why context cancellation is handled in Go code rather than native code

Since [wazero v1.0.0-pre.9](https://github.com/tetratelabs/wazero/releases/tag/v1.0.0-pre.9), the runtime
supports integration with Go contexts to interrupt execution after a timeout or in response to explicit cancellation.
This support is implemented internally as a special opcode, `builtinFunctionCheckExitCode`, that triggers the execution of
a Go function (`ModuleInstance.FailIfClosed`), which atomically checks a sentinel value at strategic points in the code
(e.g. [within loops][checkexitcode_loop]).

[It _is indeed_ possible to check the sentinel value directly, without leaving the native world][native_check], thus sparing some cycles.
However, because native code is never preempted (see the section above), this may lead to a state where other goroutines
never get the chance to run, and thus never get the chance to set the sentinel value, effectively preventing
cancellation from taking place.

[checkexitcode_loop]: https://github.com/tetratelabs/wazero/blob/86444c67a37dbf9e693ae5b365901f64968d9025/internal/wazeroir/compiler.go#L467-L476
[native_check]: https://github.com/tetratelabs/wazero/issues/1409

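The shape of such a sentinel check can be sketched in plain Go. This is only an illustration under assumptions: `failIfClosed`, `runLoop` and `errClosed` are hypothetical names for this example, not wazero's implementation, and the loop stands in for compiled guest code that calls back into Go on every iteration.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
	"time"
)

// errClosed is a hypothetical error used only in this sketch.
var errClosed = errors.New("module closed")

// failIfClosed atomically loads the sentinel and reports an error once
// another goroutine has set it.
func failIfClosed(closed *uint32) error {
	if atomic.LoadUint32(closed) != 0 {
		return errClosed
	}
	return nil
}

// runLoop imitates a long-running guest loop that checks the sentinel on
// every iteration and stops as soon as it is set.
func runLoop(closed *uint32) (uint64, error) {
	var iterations uint64
	for {
		if err := failIfClosed(closed); err != nil {
			return iterations, err
		}
		iterations++
	}
}

func main() {
	var closed uint32
	// Another goroutine requests cancellation after a short delay.
	go func() {
		time.Sleep(10 * time.Millisecond)
		atomic.StoreUint32(&closed, 1)
	}()
	_, err := runLoop(&closed)
	fmt.Println("loop stopped:", err)
}
```

Note that an ordinary Go loop like this one can be preempted asynchronously by the runtime, so the cancelling goroutine always gets a chance to run. The native code described above has no such guarantee, which is why the check is routed through a Go function.
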
## Source Offset Mapping

When translating code from WebAssembly to the wazero IR and compiling it to native
binary, wazero keeps track of two indexes to correlate native program counters
with the original source offsets that they were generated from.

Source offset maps are useful for debugging, but holding indexes in memory for
all instructions can have a significant overhead. To reduce the memory footprint
of compiled modules, wazero uses data structures inspired by
[frame-of-reference and delta encoding][FOR].

Because wazero does not reorder instructions, the source offsets are naturally
sorted during compilation, and the distance between two consecutive offsets is
usually small. Encoding deltas instead of absolute values allows most of
the indexes to store offsets with an overhead of 8 bits per instruction, instead
of recording 64-bit integers for absolute code positions.

[FOR]: https://lemire.me/blog/2012/02/08/effective-compression-using-frame-of-reference-and-delta-coding/
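
To make the space saving concrete, here is a minimal sketch of delta encoding with a per-block base value. It is an assumption-laden illustration rather than wazero's actual data structure: the `deltaBlock` type and the `encodeOffsets`/`decodeOffsets` functions are hypothetical names, and the rule of starting a new block whenever a gap does not fit in 8 bits is a simplification chosen for this example.

```go
package main

import "fmt"

// deltaBlock is a hypothetical block: one absolute 64-bit base offset
// followed by one byte per subsequent offset.
type deltaBlock struct {
	base   uint64
	deltas []uint8
}

// encodeOffsets packs offsets into blocks. A new block starts at the first
// offset, and whenever the gap to the previous offset does not fit in 8 bits.
func encodeOffsets(offsets []uint64) []deltaBlock {
	var blocks []deltaBlock
	var prev uint64
	for i, off := range offsets {
		if i == 0 || off < prev || off-prev > 255 {
			blocks = append(blocks, deltaBlock{base: off})
		} else {
			last := &blocks[len(blocks)-1]
			last.deltas = append(last.deltas, uint8(off-prev))
		}
		prev = off
	}
	return blocks
}

// decodeOffsets rebuilds the absolute offsets by summing deltas onto each base.
func decodeOffsets(blocks []deltaBlock) []uint64 {
	var out []uint64
	for _, b := range blocks {
		cur := b.base
		out = append(out, cur)
		for _, d := range b.deltas {
			cur += uint64(d)
			out = append(out, cur)
		}
	}
	return out
}

func main() {
	offsets := []uint64{1000, 1002, 1003, 1010, 5000, 5004}
	blocks := encodeOffsets(offsets)
	fmt.Println(len(blocks), decodeOffsets(blocks))
}
```

With these six sample offsets the encoder produces two blocks: two 64-bit bases plus four 1-byte deltas, instead of six 64-bit integers.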
