# How do compiler functions work?

WebAssembly runtimes let you call functions defined in wasm. How wazero does
this depends on your `RuntimeConfig`:

* `RuntimeConfigCompiler` compiles your wasm into machine code, and jumps to
  that code when invoking a function.
* `RuntimeConfigInterpreter` does not generate code. It interprets wasm and
  executes Go statements that correspond to WebAssembly instructions.
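
To make the interpreter bullet more concrete, here is a minimal sketch of what
"executing Go statements that correspond to instructions" can look like. This
is not wazero's interpreter: the `opcode` constants, the `instruction` type and
the `interpret` function are all hypothetical, and the sketch assumes the code
was validated beforehand, so the stack never underflows.

```go
package main

import "fmt"

// opcode is a hypothetical, tiny subset of WebAssembly-like instructions.
type opcode int

const (
	opI32Const opcode = iota // push an immediate onto the stack
	opI32Add                 // pop two values, push their sum
	opI32Mul                 // pop two values, push their product
)

// instruction pairs an opcode with an optional immediate.
type instruction struct {
	op  opcode
	imm int32
}

// interpret runs each instruction by executing the Go statements that
// correspond to it. No machine code is generated.
func interpret(code []instruction) int32 {
	var stack []int32
	for _, in := range code {
		switch in.op {
		case opI32Const:
			stack = append(stack, in.imm)
		case opI32Add:
			n := len(stack)
			stack = append(stack[:n-2], stack[n-2]+stack[n-1])
		case opI32Mul:
			n := len(stack)
			stack = append(stack[:n-2], stack[n-2]*stack[n-1])
		}
	}
	return stack[len(stack)-1]
}

func main() {
	// (2 + 3) * 4
	code := []instruction{
		{opI32Const, 2}, {opI32Const, 3}, {opI32Add, 0},
		{opI32Const, 4}, {opI32Mul, 0},
	}
	fmt.Println(interpret(code)) // prints 20
}
```

A compiler, by contrast, would translate the same instruction sequence into
native code once, and later jump to it instead of looping over instructions.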

How the compiler works is a large topic, and this page discusses it at length.
For more general information, such as architecture, please refer to
[Docs](..).

## Engines

Our [Docs](..) introduce the "engine" concept of wazero. More precisely, there
are three types of engines: `Engine`, `ModuleEngine` and `callEngine`. Each has
a different scope and role:

- `Engine` has the same lifetime as `Runtime`. It compiles a `CompiledModule`
  into machine code, which is both cached and memory-mapped as an executable.
- `ModuleEngine` is a virtual machine with the same lifetime as its [Module][api-module].
  Notably, it binds each [function instance][spec-function-instance] to the
  corresponding machine code owned by its `Engine`.
- `callEngine` is the implementation of [api.Function][api-function] in a
  [Module][api-module]. It implements `Function.Call(...)` by invoking the
  machine code corresponding to a function instance in `ModuleEngine` and
  managing the [call stack][call-stack] representing the invocation.
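
The three scopes can be modeled in a few lines of plain Go. This is only a
sketch of the ownership described in the list above, not wazero's internal
types: the lowercase `engine`, `moduleEngine` and `callEngine` structs, the
string module ID, and the use of Go closures in place of real machine code are
all assumptions made for illustration.

```go
package main

import "fmt"

// machineCode stands in for native code. Here it is just a Go closure.
type machineCode func(stack []uint64) []uint64

// engine lives as long as the runtime and caches compiled code per module.
type engine struct {
	cache    map[string][]machineCode // keyed by a hypothetical module ID
	compiles int                      // how often compilation actually ran
}

// compile returns cached code when present, otherwise "compiles" it once.
func (e *engine) compile(moduleID string, funcs []machineCode) []machineCode {
	if code, ok := e.cache[moduleID]; ok {
		return code
	}
	e.compiles++
	e.cache[moduleID] = funcs
	return funcs
}

// moduleEngine lives as long as one instantiated module. It binds each
// function instance (identified by name here) to code owned by the engine.
type moduleEngine struct {
	functions map[string]machineCode
}

func (e *engine) newModuleEngine(moduleID string, names []string, funcs []machineCode) *moduleEngine {
	code := e.compile(moduleID, funcs)
	me := &moduleEngine{functions: map[string]machineCode{}}
	for i, name := range names {
		me.functions[name] = code[i]
	}
	return me
}

// callEngine backs a single exported function and owns its call stack.
type callEngine struct {
	code  machineCode
	stack []uint64
}

func (me *moduleEngine) newCallEngine(name string) *callEngine {
	return &callEngine{code: me.functions[name]}
}

// call pushes parameters, runs the code, and returns the resulting stack.
func (ce *callEngine) call(params ...uint64) []uint64 {
	ce.stack = append(ce.stack[:0], params...)
	ce.stack = ce.code(ce.stack)
	return ce.stack
}

func main() {
	add := func(s []uint64) []uint64 { return []uint64{s[0] + s[1]} }
	e := &engine{cache: map[string][]machineCode{}}

	// Two instantiations of the same module share the compiled code (1:N).
	m1 := e.newModuleEngine("main.wasm", []string{"add"}, []machineCode{add})
	m2 := e.newModuleEngine("main.wasm", []string{"add"}, []machineCode{add})

	fmt.Println(m1.newCallEngine("add").call(1, 2))  // prints [3]
	fmt.Println(m2.newCallEngine("add").call(40, 2)) // prints [42]
	fmt.Println(e.compiles)                          // prints 1
}
```

The point of the split is lifetime: compilation happens once per engine,
binding happens once per instantiated module, and stack state exists per call.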

Here is a diagram showing the relationships of these engines:

```goat
      .-----------> Instantiated module                                Exported Function
     /1:N                   |                                                  |
    /                       |                                                  v
   |     +----------+       v        +----------------+                  +------------+
   |     |  Engine  |--------------->|  ModuleEngine  |----------------->| callEngine |
   |     +----------+                +----------------+                  +------------+
   |          |                               |                             |      |
   .          |                               |                             |      |
 main.wasm -->|        .--------------------->|           '-----------------+      |
              |       /                       |           |                        |
              v      .                        v           v                        v
      +--------------+      +-----------------------------------+           +----------+
      | Machine Code |      |[(func_instance, machine_code),...]|           |Call Stack|
      +--------------+      +-----------------------------------+           +----------+
                                              ^                                  ^
                                              |                                  |
                                              |                                  |
                                              +----------------------------------+
                                                               |
                                                               |
                                                               |
                                                        Function.Call()
```

## Callbacks from machine code to Go

Go source can be compiled to invoke native library functions using CGO.
However, [CGO is not Go][cgo-not-go]. To call native functions in pure Go, we
need a different approach with unique constraints.

The most notable constraints are:
* machine code must not manipulate the Goroutine or system stack
* we cannot modify the signal handler of Go at runtime

### Handling the call stack

One constraint is that the generated machine code must not manipulate the
Goroutine (or system) stack. Otherwise, the Go runtime gets corrupted, which
results in fatal execution errors. This means we cannot[^1] call Go functions
(host functions) directly from machine code (compiled from wasm). Such calls
are routinely needed in WebAssembly, as system calls such as WASI are defined
in Go, but invoked from Wasm. To handle this, we employ a "trampoline strategy".

Let's explain the "trampoline strategy" with an example. `random_get` is a host
function defined in Go, called from machine code compiled from the guest `main`
function. Let's say the wasm function corresponding to that is called `_start`.
By default, wazero calls the `_start` function on `Instantiate`.

Here is a TinyGo source file describing this.
```go
package main

import "unsafe"

// random_get is a function defined on the host, specifically, the wazero
// program written in Go.
//
//go:wasmimport wasi_snapshot_preview1 random_get
func random_get(ptr uintptr, size uint32) (errno uint32)

// main is compiled to wasm, so this is the guest. Conventionally, this ends up
// named `_start`.
func main() {
	// Define a buffer to hold random data
	size := uint32(8)
	buf := make([]byte, size)

	// Fill the buffer with random data using an imported host function.
	// The host needs to know where in guest memory to place the random data.
	// To communicate this, we have to convert buf to a uintptr.
	errno := random_get(uintptr(unsafe.Pointer(&buf[0])), size)
	if errno != 0 {
		panic(errno)
	}
}
```

When `_start` calls `random_get`, the machine code first exits execution,
returning control to wazero. wazero then calls the Go function mapped to
`random_get`, like a usual Go program would. Finally, wazero transfers control
back to the machine code, resuming `_start` just after the call instruction to
`random_get`.
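
The exit, call, and resume sequence can be simulated in plain Go. The sketch
below is not how wazero is implemented: the `status` values, the `callEngine`
fields, and the `start`, `randomGet` and `execNative` functions are
hypothetical stand-ins, and the "machine code" is an ordinary Go function that
records where to resume instead of real native code.

```go
package main

import (
	"crypto/rand"
	"fmt"
)

// status is what the simulated machine code reports when it exits.
type status int

const (
	statusReturned   status = iota // the function finished normally
	statusCallHostFn               // the function wants a host function called
)

// callEngine holds state that survives an exit from machine code.
type callEngine struct {
	resumeAt int      // simulated return address inside _start
	stack    []uint64 // parameters and results exchanged with the host
	memory   []byte   // simulated guest linear memory
}

// start simulates machine code compiled from the guest `_start` function.
// It never calls Go directly. Instead, it records its request and exits.
func start(ce *callEngine) status {
	switch ce.resumeAt {
	case 0:
		// Request random_get(ptr=0, size=8), then exit to the caller.
		ce.stack = append(ce.stack[:0], 0, 8)
		ce.resumeAt = 1 // resume after the call instruction
		return statusCallHostFn
	default:
		// Resumed after the call: errno is now on the stack.
		return statusReturned
	}
}

// randomGet is the host function, an ordinary Go function.
func randomGet(mem []byte, ptr, size uint64) (errno uint64) {
	if ptr+size > uint64(len(mem)) {
		return 1 // hypothetical non-zero errno for an out-of-bounds range
	}
	if _, err := rand.Read(mem[ptr : ptr+size]); err != nil {
		return 1
	}
	return 0
}

// execNative plays the trampoline: it enters machine code, and each time the
// code exits asking for a host call, performs that call in Go and re-enters.
func execNative(ce *callEngine) (hostCalls int) {
	for {
		switch start(ce) {
		case statusReturned:
			return hostCalls
		case statusCallHostFn:
			// Only random_get exists in this sketch, so there is no
			// dispatch by function name.
			errno := randomGet(ce.memory, ce.stack[0], ce.stack[1])
			ce.stack = append(ce.stack[:0], errno)
			hostCalls++
		}
	}
}

func main() {
	ce := &callEngine{memory: make([]byte, 16)}
	calls := execNative(ce)
	fmt.Println("host calls:", calls, "errno:", ce.stack[0])
}
```

Because the host function is always called from `execNative`, a normal Go
frame, the simulated machine code never has to touch the Goroutine stack.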

Here's what the "trampoline strategy" looks like in a diagram. For simplicity,
we'll say the wasm memory offset of `buf` is zero, but it will be different
in real execution.
```goat
 |                           Go                             |        Machine Code
 |                                                              (compiled from main.wasm)
 |                                                          |
 v
 |                            `Instantiate(ctx, mainWasm)`  |
 |                                         |
 v                                         v                |
 |                                 +----------------+                  +------------+
 |                                 |func exec_native|-------|--------> |func _start |
 v                                 +----------------+                  +------------+
 |                                                          |         /
 |                 Go func call    +----------------+                / ptr=0,size=8
 v                .----------------|func exec_native|<------|-------. status=call_host_fn(name=random_get)
 |               / ptr=0,size=8    +----------------+  exit
 |              v                                           |
 v      +---------------+          +----------------+
 |      |func random_get|--------->|func exec_native|-------|-------.
 |      +---------------+ errno=0  +----------------+  continue      \ errno=0
 v                                                          |         \
 |                                                          |          +------------+
 |                                                          |          |func _start |
 v                                                          |          +------------+
```

### Signal handling

Code compiled to wasm uses [runtime traps][spec-trap] to abort execution. For
example, a `panic` compiled with TinyGo becomes a wasm function named
`runtime._panic`, which issues an [unreachable][spec-unreachable] instruction
after printing the message to STDERR.

```go
package main

func main() {
	panic("help")
}
```

Native JIT compilers set custom signal handlers for [Wasm runtime traps][spec-trap],
such as the [unreachable][spec-unreachable] instruction. However, we cannot
safely [modify the signal handler of Go at runtime][signal-handler-discussion].
Instead, wazero reuses the approach described in the previous section: machine
code always exits execution rather than raising a signal. When it encounters
an `unreachable` instruction, machine code sets a status before exiting. wazero
reads this status and propagates it back as `ErrRuntimeUnreachable`.
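
The status-to-panic conversion can be sketched without any signal handling at
all. As before, this is a simulation under stated assumptions, not wazero's
code: `errUnreachable` stands in for `ErrRuntimeUnreachable`, the `status`
values and function names are hypothetical, and recovering the panic at the
API boundary to return it as an error is an assumption of this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnreachable is a stand-in for wazero's `ErrRuntimeUnreachable`.
var errUnreachable = errors.New("unreachable")

// status is what the simulated machine code reports when it exits.
type status int

const (
	statusReturned    status = iota // the function finished normally
	statusUnreachable               // an `unreachable` instruction was hit
)

// runtimePanic simulates machine code for `runtime._panic`. On `unreachable`
// it sets a status and exits, instead of raising a signal.
func runtimePanic() status { return statusUnreachable }

// start simulates `_start`, which calls `runtime._panic` directly. No
// trampoline is involved between these two wasm functions.
func start() status { return runtimePanic() }

// execNative enters machine code and converts a trap status into a Go panic.
func execNative() {
	if start() == statusUnreachable {
		panic(errUnreachable)
	}
}

// call recovers the panic at the API boundary and returns it as an error.
func call() (err error) {
	defer func() {
		if r := recover(); r != nil {
			recovered, ok := r.(error)
			if !ok {
				panic(r) // not ours, so re-raise it
			}
			err = recovered
		}
	}()
	execNative()
	return nil
}

func main() {
	err := call()
	fmt.Println(errors.Is(err, errUnreachable)) // prints true
}
```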

Here's a diagram showing this:
```goat
 |                        Go                  |             Machine Code
 |                                                    (compiled from main.wasm)
 |                                            |
 v
 |            `Instantiate(ctx, mainWasm)`    |
 |                         |
 v                         v                  |
 |                 +----------------+                                     +------------+
 |                 |func exec_native|---------|-------------------------> |func _start |
 v                 +----------------+                                     +------------+
 |                                            |                                 |
 |                 +----------------+                   exit          +--------------------+
 v                 |func exec_native|<--------|---------------------- |func runtime._panic |
 |                 +----------------+            status=unreachable   +--------------------+
 |                         |                  |
 v                         |
 |            panic(ErrRuntimeUnreachable)    |
```

One thing you will notice above is that calls between wasm functions, such as
from `_start` to `runtime._panic`, do not use a trampoline. The trampoline
strategy is only used between wasm and the host.

## Summary

When an exported wasm function is called using a wazero API, such as
`Function.Call()`, wazero allocates a `callEngine` and starts the invocation.
This begins by jumping to machine code compiled from the Wasm binary. When that
code makes a callback to the host, it exits execution, passing control back to
`exec_native`, which then calls a Go function and resumes the machine code
afterwards. On a Wasm runtime error, the machine code exits with the
corresponding status and returns control to `exec_native`, just like a host
function call. The difference is that instead of calling a Go function, we
call `panic` with the corresponding error. This jumping back and forth is why
the strategy is called a trampoline. It is only used between the guest wasm and
the host running it.

For more details, see [RATIONALE.md][compiler-rationale].

[call-stack]: https://en.wikipedia.org/wiki/Call_stack
[api-function]: https://pkg.go.dev/github.com/tetratelabs/wazero@v1.0.0-rc.1/api#Function
[api-module]: https://pkg.go.dev/github.com/tetratelabs/wazero@v1.0.0-rc.1/api#Module
[spec-function-instance]: https://www.w3.org/TR/2019/REC-wasm-core-1-20191205/#function-instances%E2%91%A0
[spec-trap]: https://www.w3.org/TR/2019/REC-wasm-core-1-20191205/#trap
[spec-unreachable]: https://www.w3.org/TR/2019/REC-wasm-core-1-20191205/#syntax-instr-control
[compiler-rationale]: https://github.com/tetratelabs/wazero/blob/v1.0.0-rc.1/internal/engine/compiler/RATIONALE.md
[signal-handler-discussion]: https://gophers.slack.com/archives/C1C1YSQBT/p1675992411241409
[cgo-not-go]: https://www.youtube.com/watch?v=PAAkCSZUG1c&t=757s

[^1]: It's technically possible to call Go functions directly, but doing so would require "stack switching" in the native code.
    This is almost the same as what wazero does: exiting the execution of machine code, then calling the target Go function (using the caller of the machine code as a "trampoline").