# How do compiler functions work?

WebAssembly runtimes let you call functions defined in wasm. How this works in
wazero depends on your `RuntimeConfig`.

* `RuntimeConfigCompiler` compiles machine code from your wasm, and jumps to
  that when invoking a function.
* `RuntimeConfigInterpreter` does not generate code. It interprets wasm and
  executes Go statements that correspond to WebAssembly instructions.
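
To make the interpreter bullet concrete, here is a minimal sketch of what
"executing Go statements that correspond to instructions" can look like. The
`opcode` and `instr` types and the three-instruction subset are hypothetical
and invented for illustration; this is not wazero's interpreter.

```go
package main

import "fmt"

// opcode is a hypothetical, tiny subset of WebAssembly instructions.
type opcode byte

const (
	opI32Const opcode = iota // push an immediate value
	opI32Add                 // pop two values, push their sum
	opI32Mul                 // pop two values, push their product
)

// instr pairs an opcode with an optional immediate.
type instr struct {
	op  opcode
	imm int32
}

// interpret walks the instructions and, for each one, runs the Go statements
// that implement it. No machine code is generated.
func interpret(code []instr) int32 {
	var stack []int32
	for _, in := range code {
		switch in.op {
		case opI32Const:
			stack = append(stack, in.imm)
		case opI32Add:
			n := len(stack)
			stack = append(stack[:n-2], stack[n-2]+stack[n-1])
		case opI32Mul:
			n := len(stack)
			stack = append(stack[:n-2], stack[n-2]*stack[n-1])
		}
	}
	return stack[len(stack)-1]
}

func main() {
	// Computes (2 + 3) * 4.
	code := []instr{{opI32Const, 2}, {opI32Const, 3}, {opI32Add, 0}, {opI32Const, 4}, {opI32Mul, 0}}
	fmt.Println(interpret(code))
}
```

A compiler, by contrast, would translate the same instructions into native
machine code once and jump to it on each call, which is what the rest of this
page describes.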

How the compiler works precisely is a large topic, and is discussed at length
on this page. For more general information on architecture, etc., please refer
to [Docs](..).

## Engines

Our [Docs](..) introduce the "engine" concept of wazero. More precisely, there
are three types of engines, `Engine`, `ModuleEngine` and `callEngine`. Each has
a different scope and role:

- `Engine` has the same lifetime as `Runtime`. This compiles a `CompiledModule`
  into machine code, which is both cached and memory-mapped as an executable.
- `ModuleEngine` is a virtual machine with the same lifetime as its [Module][api-module].
  Notably, this binds each [function instance][spec-function-instance] to
  corresponding machine code owned by its `Engine`.
- `callEngine` is the implementation of [api.Function][api-function] in a
  [Module][api-module]. This implements `Function.Call(...)` by invoking
  machine code corresponding to a function instance in `ModuleEngine` and
  managing the [call stack][call-stack] representing the invocation.

Here is a diagram showing the relationships of these engines:

```goat
      .-----------> Instantiated module                                 Exported Function
     /1:N                   |                                                  |
    /                       |                                                  v
   |     +----------+       v        +----------------+                  +------------+
   |     |  Engine  |--------------->|  ModuleEngine  |----------------->| callEngine |
   |     +----------+                +----------------+                  +------------+
   |          |                               |                            |      |
   .          |                               |                            |      |
 main.wasm -->|        .--------------------->|          '-----------------+      |
              |       /                       |          |                        |
              v      .                        v          v                        v
      +--------------+      +-----------------------------------+            +----------+
      | Machine Code |      |[(func_instance, machine_code),...]|            |Call Stack|
      +--------------+      +-----------------------------------+            +----------+
                                               ^                                  ^
                                               |                                  |
                                               |                                  |
                                               +----------------------------------+
                                                               |
                                                               |
                                                               |
                                                        Function.Call()
```
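
The ownership and lifetimes described above can be sketched with plain Go
types. Everything here is a hypothetical simplification: the lower-case type
names, the string-based `machineCode`, and the module ID cache key are
assumptions for illustration, not wazero's internal definitions.

```go
package main

import "fmt"

// machineCode stands in for executable memory. Here it is only a label.
type machineCode string

// engine lives as long as the runtime and caches compiled code per module.
type engine struct {
	cache map[string][]machineCode // keyed by a hypothetical module ID
}

// compile returns cached code when available, otherwise "compiles" and caches.
func (e *engine) compile(moduleID string, funcNames []string) []machineCode {
	if code, ok := e.cache[moduleID]; ok {
		return code
	}
	code := make([]machineCode, len(funcNames))
	for i, name := range funcNames {
		code[i] = machineCode("code:" + name)
	}
	e.cache[moduleID] = code
	return code
}

// moduleEngine lives as long as one instantiated module. It binds each
// function instance (here, a name) to machine code owned by the engine.
type moduleEngine struct {
	functions map[string]machineCode
}

func (e *engine) newModuleEngine(moduleID string, funcNames []string) *moduleEngine {
	code := e.compile(moduleID, funcNames)
	me := &moduleEngine{functions: map[string]machineCode{}}
	for i, name := range funcNames {
		me.functions[name] = code[i]
	}
	return me
}

// callEngine backs one exported function and owns the call stack for it.
type callEngine struct {
	code  machineCode
	stack []uint64
}

func (me *moduleEngine) newCallEngine(name string) *callEngine {
	return &callEngine{code: me.functions[name], stack: make([]uint64, 0, 16)}
}

func main() {
	e := &engine{cache: map[string][]machineCode{}}
	names := []string{"_start"}
	m1 := e.newModuleEngine("main.wasm", names)
	m2 := e.newModuleEngine("main.wasm", names) // second instance reuses cached code
	ce := m1.newCallEngine("_start")
	fmt.Println(ce.code, len(e.cache), m1 != m2)
}
```

The point of the sketch is the 1:N relationship: one engine compiles once, and
each instantiated module gets its own module engine that refers to the same
code.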

## Callbacks from machine code to Go

Go source can be compiled to invoke native library functions using CGO.
However, [CGO is not GO][cgo-not-go]. To call native functions in pure Go, we
need a different approach with unique constraints.

The most notable constraints are:
* machine code must not manipulate the Goroutine or system stack
* we cannot modify the signal handler of Go at runtime

### Handling the call stack

One constraint is that the generated machine code must not manipulate the
Goroutine (or system) stack. Otherwise, the Go runtime gets corrupted, which
results in fatal execution errors. This means we cannot[^1] call Go functions
(host functions) directly from machine code (compiled from wasm). Such calls
are routinely needed in WebAssembly, as system calls such as WASI are defined
in Go, but invoked from Wasm. To handle this, we employ a "trampoline strategy".

Let's explain the "trampoline strategy" with an example. `random_get` is a host
function defined in Go, called from machine code compiled from the guest `main`
function. Let's say the wasm function corresponding to that is called `_start`.
The `_start` function is called by wazero by default on `Instantiate`.

Here is a TinyGo source file describing this.
```go
package main

import "unsafe"

// random_get is a function defined on the host, specifically, the wazero
// program written in Go.
//
//go:wasmimport wasi_snapshot_preview1 random_get
func random_get(ptr uintptr, size uint32) (errno uint32)

// main is compiled to wasm, so this is the guest. Conventionally, this ends up
// named `_start`.
func main() {
	// Define a buffer to hold random data
	size := uint32(8)
	buf := make([]byte, size)

	// Fill the buffer with random data using an imported host function.
	// The host needs to know where in guest memory to place the random data.
	// To communicate this, we have to convert buf to a uintptr.
	errno := random_get(uintptr(unsafe.Pointer(&buf[0])), size)
	if errno != 0 {
		panic(errno)
	}
}
```

When `_start` calls `random_get`, the machine code first exits execution,
returning control to wazero. wazero calls the Go function mapped to
`random_get` like a usual Go program. Finally, wazero transfers control back to
machine code again, resuming `_start` after the call instruction to
`random_get`.

Here's what the "trampoline strategy" looks like in a diagram. For simplicity,
we'll say the wasm memory offset of the `buf` is zero, but it will be different
in real execution.
```goat
   |                                     Go              |           Machine Code
   |                                                           (compiled from main.wasm)
   |                                                     |
   v
   |                        `Instantiate(ctx, mainWasm)` |
   |                                     |
   v                                     v               |
   |                            +----------------+                  +------------+
   |                            |func exec_native|-------|--------> |func _start |
   v                            +----------------+                  +------------+
   |                                                     |         /
   |            Go func call    +----------------+                / ptr=0,size=8
   v           .----------------|func exec_native|<------|-------. status=call_host_fn(name=rand_get)
   |          /  ptr=0,size=8   +----------------+     exit
   |         v                                           |
   v   +-------------+          +----------------+
   |   |func rand_get|--------->|func exec_native|-------|-------.
   |   +-------------+ errno=0  +----------------+    continue    \ errno=0
   v                                                     |         \
   |                                                     |          +------------+
   |                                                     |          |func _start |
   v                                                     |          +------------+
```
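
The exit, call, and resume cycle above can be sketched in plain Go. This is a
simulation under stated assumptions: machine code is modelled as an ordinary
Go function with a `resumeAt` field instead of a real jump, and the `status`
constants, `callEngine` fields, and `hostFunctions` table are hypothetical
names rather than wazero's internals. The host function writes a fixed byte
instead of random data so the result is easy to inspect.

```go
package main

import "fmt"

// status is what the simulated machine code reports when it exits.
type status int

const (
	statusReturned   status = iota // the wasm function finished
	statusCallHostFn               // the wasm function needs a host function called
)

// callEngine holds the state shared between Go and the simulated machine code.
type callEngine struct {
	memory   []byte   // guest linear memory
	stack    []uint64 // parameters and results cross the boundary here
	hostFn   string   // which host function was requested on exit
	resumeAt int      // where to continue after the host call
}

// startMachineCode simulates machine code compiled from `_start`. It never
// calls Go directly: it records what it needs and exits with a status.
func startMachineCode(ce *callEngine) status {
	switch ce.resumeAt {
	case 0:
		ce.stack = append(ce.stack[:0], 0, 8) // ptr=0, size=8
		ce.hostFn = "random_get"
		ce.resumeAt = 1
		return statusCallHostFn
	default:
		// Resumed after the call instruction; errno is on the stack.
		return statusReturned
	}
}

// randomGet is the host function. It fills guest memory with a fixed byte
// (an assumption for determinism) and leaves errno=0 on the stack.
func randomGet(ce *callEngine) {
	ptr, size := ce.stack[0], ce.stack[1]
	for i := uint64(0); i < size; i++ {
		ce.memory[ptr+i] = 0xAB
	}
	ce.stack = append(ce.stack[:0], 0)
}

var hostFunctions = map[string]func(*callEngine){"random_get": randomGet}

// execNative is the trampoline: enter machine code, and whenever it exits
// asking for a host call, make that call from Go and re-enter.
func execNative(ce *callEngine) uint64 {
	for {
		switch startMachineCode(ce) {
		case statusCallHostFn:
			hostFunctions[ce.hostFn](ce)
		case statusReturned:
			return ce.stack[0]
		}
	}
}

func main() {
	ce := &callEngine{memory: make([]byte, 8)}
	errno := execNative(ce)
	fmt.Println("errno:", errno, "memory:", ce.memory)
}
```

Because the host function is always invoked from `execNative`, a normal Go
function, the Go runtime sees an ordinary call and its stack is never touched
by machine code.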

### Signal handling

Code compiled to wasm uses [runtime traps][spec-trap] to abort execution. For
example, a `panic` compiled with TinyGo becomes a wasm function named
`runtime._panic`, which issues an [unreachable][spec-unreachable] instruction
after printing the message to STDERR.

```go
package main

func main() {
	panic("help")
}
```

Native JIT compilers set custom signal handlers for [Wasm runtime traps][spec-trap],
such as the [unreachable][spec-unreachable] instruction. However, we cannot
safely [modify the signal handler of Go at runtime][signal-handler-discussion].
As described in the previous section, wazero always exits the execution of
machine code. Machine code sets a status when it encounters an `unreachable`
instruction. This is read by wazero, which propagates it back with
`ErrRuntimeUnreachable`.

Here's a diagram showing this:
```goat
   |                               Go                 |                             Machine Code
   |                                                                          (compiled from main.wasm)
   |                                                  |
   v
   |                   `Instantiate(ctx, mainWasm)`   |
   |                                |
   v                                v                 |
   |                       +----------------+                                     +------------+
   |                       |func exec_native|---------|-------------------------> |func _start |
   v                       +----------------+                                     +------------+
   |                                                  |                                 |
   |                       +----------------+                  exit           +--------------------+
   v                       |func exec_native|<--------|---------------------- |func runtime._panic |
   |                       +----------------+            status=unreachable   +--------------------+
   |                              |                   |
   v                              |
   |                panic(WasmRuntimeErrUnreachable)  |
```

One thing you will notice above is that the calls between wasm functions, such
as from `_start` to `runtime._panic`, do not use a trampoline. The trampoline
strategy is only used between wasm and the host.

## Summary

When an exported wasm function is called, using a wazero API, such as
`Function.Call()`, wazero allocates a `callEngine` and starts invocation. This
begins with jumping to machine code compiled from the Wasm binary. When that
code makes a callback to the host, it exits execution, passing control back to
`exec_native`, which then calls a Go function and resumes the machine code
afterwards. In the face of Wasm runtime errors, we exit the machine code
execution with the proper status, and return control back to the `exec_native`
function, just like host function calls. The only difference is that instead of
calling a Go function, we call `panic` with a corresponding error. This jumping
is why the strategy is called a trampoline, and it is only used between the
guest wasm and the host running it.

For more details, see [RATIONALE.md][compiler-rationale].

[call-stack]: https://en.wikipedia.org/wiki/Call_stack
[api-function]: https://pkg.go.dev/github.com/tetratelabs/wazero@v1.0.0-rc.1/api#Function
[api-module]: https://pkg.go.dev/github.com/tetratelabs/wazero@v1.0.0-rc.1/api#Module
[spec-function-instance]: https://www.w3.org/TR/2019/REC-wasm-core-1-20191205/#function-instances%E2%91%A0
[spec-trap]: https://www.w3.org/TR/2019/REC-wasm-core-1-20191205/#trap
[spec-unreachable]: https://www.w3.org/TR/2019/REC-wasm-core-1-20191205/#syntax-instr-control
[compiler-rationale]: https://github.com/tetratelabs/wazero/blob/v1.0.0-rc.1/internal/engine/compiler/RATIONALE.md
[signal-handler-discussion]: https://gophers.slack.com/archives/C1C1YSQBT/p1675992411241409
[cgo-not-go]: https://www.youtube.com/watch?v=PAAkCSZUG1c&t=757s

[^1]: It's technically possible to call it directly, but that would require performing "stack switching" in the native code.
  It's almost the same as what wazero does: exiting the execution of machine code, then calling the target Go function (using the caller of machine code as a "trampoline").
