Stack vs Heap Execution Model

Engineers arriving from C, Rust, or the JVM bring a mental picture of “the stack” and “the heap” as two ends of one address space. WebAssembly breaks that picture in a way that trips up almost everyone: the value stack the virtual machine actually executes on is not addressable memory at all, and the “heap” you malloc into is just a region of a single byte buffer that your own allocator carves up. Get this distinction wrong and you will reach for a pointer to a local, or assume memory.grow keeps your views valid — both of which fail. This guide pins down exactly what the value stack is, what linear memory is, and how bytes move between the two and across to JavaScript.

Prerequisites

  • [ ] wabt 1.0.34+ installed (wat2wasm, wasm2wat, wasm-objdump, wasm-validate): brew install wabt, or build from source
  • [ ] Node.js 20+ (or any browser with the WebAssembly global) to instantiate the examples
  • [ ] Familiarity with reading hand-written wat (the WebAssembly text format basics covers the syntax)
  • [ ] A terminal where you can run wat2wasm module.wat -o module.wasm

Two memories that are nothing alike

WebAssembly runs on a stack machine. Every instruction consumes some operands from an implicit value stack and pushes its results back. i32.add pops two i32 values and pushes their sum; local.get $x pushes the value of a local. This value stack is internal to the engine — it has no addresses, you cannot take a pointer into it, and it never appears in linear memory. The engine is free to keep it in machine registers. Function locals and parameters live in the same world: they are typed slots the engine manages, not bytes at an offset.

The linear memory heap is the opposite in every respect. It is a single contiguous, byte-addressable, growable buffer that JavaScript sees as an ArrayBuffer: index 0 is the first byte, index byteLength - 1 is the last. Only load and store instructions touch it, and every access is bounds-checked against the current length; an out-of-bounds access raises a trap rather than reading host memory. Here is the critical part: WebAssembly has no native heap. The original (MVP) specification allows a module at most one memory, and even under the later multi-memory proposal each memory is still just a flat byte buffer. There is no malloc instruction, no garbage collector for the bytes in it, no built-in allocator. The “heap” is a software convention: a chunk of that one buffer that a library allocator (dlmalloc, wee_alloc, a bump allocator) hands out and reclaims. The engine neither knows nor cares where your heap begins.
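To make the "one flat buffer" point concrete, here is a minimal host-side sketch using the standard WebAssembly.Memory constructor. HEAP_BASE is a hypothetical offset chosen for illustration only; nothing in the buffer or the engine records it.

```javascript
// One page of linear memory created from the host side (a page is 64 KiB).
const memory = new WebAssembly.Memory({ initial: 1 });
const bytes = new Uint8Array(memory.buffer);

const firstIndex = 0;
const lastIndex = memory.buffer.byteLength - 1;   // last valid byte offset

bytes[firstIndex] = 0x2a;                         // plain byte write at offset 0
bytes[lastIndex] = 0xff;                          // plain byte write at the final offset

// HEAP_BASE is a hypothetical constant a toolchain might pick for its allocator.
// It is only a number the software agrees on; the engine never sees it.
const HEAP_BASE = 1024;
const heapBytesAvailable = memory.buffer.byteLength - HEAP_BASE;

console.log(memory.buffer.byteLength, heapBytesAvailable); // 65536 64512
```

The same buffer holds static data, any software stack, and the allocator's region; only the toolchain's layout decisions distinguish them.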

[Figure: Value stack versus linear memory. Left: the engine-internal value stack, holding typed operands (for example i32 7 on top, then i32 12 and f64 3.14) with no addresses; values are pushed and popped per instruction, and locals are managed slots that may live in registers. Right: a single byte-addressable linear memory ArrayBuffer divided into data and globals (from offset 0), a compiler-managed shadow stack, and a malloc/free heap that is a software convention only. Only load and store instructions reach linear memory, which is bounds-checked and grows in pages.]

Notice the third box on the memory side: the shadow stack. Because the engine’s value stack holds only scalar values and you cannot take its address, any C or Rust local whose address is taken (&x, an array, a struct passed by reference) cannot live there. Compilers solve this by reserving a region of linear memory and using a global as a stack pointer — a software stack that grows downward inside the same buffer as the heap. So “stack-allocated” data in your source language often ends up in linear memory too, just in a different region from the malloc heap. The engine-level value stack and the source-level call stack are genuinely different things, and keeping them straight is half the battle.

Building and inspecting a module step by step

The fastest way to internalize the split is to write a tiny module that does both — pure stack arithmetic and an explicit memory store — then read it back with the toolchain.

  1. Write the module. Save this as stack-heap.wat. It computes 12 + 7 entirely on the value stack, then stores the result into linear memory at byte offset 0:

    (module
      (memory (export "memory") 1)            ;; one page = 64 KiB of linear memory
      (func (export "compute") (result i32)
        (local $sum i32)
        ;; --- value stack only: no memory touched ---
        (local.set $sum
          (i32.add (i32.const 12) (i32.const 7)))   ;; push 12, push 7, add -> 19
        ;; --- now cross into linear memory ---
        (i32.store (i32.const 0) (local.get $sum))   ;; store 19 at byte offset 0
        (i32.load (i32.const 0))))                    ;; load it back, leave on stack as result
  2. Assemble it. Turn the text format into a binary:

    wat2wasm stack-heap.wat -o stack-heap.wasm
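  3. Instantiate and call it. In a browser, fetch the binary, run the export, and read back the i32 you stored (in Node, load the bytes with fs.readFile instead of fetch, since a relative URL will not resolve there):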

    const bytes = await (await fetch("/stack-heap.wasm")).arrayBuffer();
    const { instance } = await WebAssembly.instantiate(bytes);
    const result = instance.exports.compute();          // 19
    const mem = new Int32Array(instance.exports.memory.buffer);
    console.log(result, mem[0]);                         // 19 19
  4. Disassemble to confirm the layout. Read the binary back to see the memory section the engine will enforce:

    wasm-objdump -x stack-heap.wasm

Step 1’s i32.add never touches linear memory — its operands are pure stack values. Only the i32.store and i32.load cross into the buffer. That is the whole model in four instructions.
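If you want to try the module in Node without a web server, the following sketch embeds the binary directly and uses the synchronous WebAssembly.Module and WebAssembly.Instance constructors. The byte array is hand-encoded for this illustration and is assumed to be equivalent to what wat2wasm produces for stack-heap.wat (without a name section); check it against your own build before relying on it.

```javascript
// Hand-encoded binary for the stack-heap module (illustrative assumption:
// equivalent to the wat2wasm output for the text module above).
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,             // magic + version
  0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f,                   // type section: () -> i32
  0x03, 0x02, 0x01, 0x00,                                     // function section: func 0 uses type 0
  0x05, 0x03, 0x01, 0x00, 0x01,                               // memory section: minimum 1 page
  0x07, 0x14, 0x02,                                           // export section, 2 entries
  0x06, 0x6d, 0x65, 0x6d, 0x6f, 0x72, 0x79, 0x02, 0x00,       // "memory"  -> memory 0
  0x07, 0x63, 0x6f, 0x6d, 0x70, 0x75, 0x74, 0x65, 0x00, 0x00, // "compute" -> func 0
  0x0a, 0x19, 0x01, 0x17, 0x01, 0x01, 0x7f,                   // code section: 1 body, 1 i32 local
  0x41, 0x0c, 0x41, 0x07, 0x6a, 0x21, 0x00,                   // i32.const 12, i32.const 7, i32.add, local.set 0
  0x41, 0x00, 0x20, 0x00, 0x36, 0x02, 0x00,                   // i32.const 0, local.get 0, i32.store
  0x41, 0x00, 0x28, 0x02, 0x00, 0x0b,                         // i32.const 0, i32.load, end
]);

// Synchronous compile + instantiate is fine for a module this small.
const instance = new WebAssembly.Instance(new WebAssembly.Module(wasmBytes));
const result = instance.exports.compute();
const mem = new Int32Array(instance.exports.memory.buffer);
console.log(result, mem[0]); // 19 19
```

Only the i32.store and i32.load bytes (0x36 and 0x28) reference memory; the addition is encoded as a bare 0x6a with no address operand at all.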

Structured control flow keeps the stack honest

Unlike a hardware CPU, WebAssembly has no arbitrary goto. Control flow is structured: block, loop, and if open lexically nested regions, and br/br_if/br_table can only jump to the end (or, for loop, the start) of an enclosing region — never into the middle of one. This is what lets the engine know the exact value-stack height at every program point, which in turn makes validation fast and ahead-of-time compilation possible without tracking a free-form stack.

(func (export "clamp") (param $x i32) (result i32)
  (block $hi (result i32)
    (drop
      (br_if $hi
        (local.get $x)                                  ;; value carried to $hi if the branch is taken
        (i32.lt_s (local.get $x) (i32.const 100))))     ;; condition: x < 100
    ;; branch not taken: br_if left x on the stack, drop discarded it
    (i32.const 100)))                                   ;; so the block's result is 100

The branch can only target the label $hi, and at that label the stack must hold exactly one i32 — the validator rejects any path that would leave it otherwise. This static stack discipline is why a Wasm trap (an out-of-bounds load, an integer divide-by-zero, an unreachable) cleanly unwinds to the host instead of corrupting state.

How data crosses to JavaScript

JavaScript cannot see the value stack at all; it is invisible engine state. The only shared surface is linear memory. A function call passes integers (almost always pointers, meaning byte offsets into the buffer, paired with a length), and the actual payload lives in memory where both sides can reach it with typed-array views. The example below writes a string into the module’s memory from JavaScript, then asks the module to sum those bytes: the bytes are read from linear memory, while the running total lives on the value stack. It assumes a module that exports memory and a sum_bytes(ptr, len) function:

const { instance } = await WebAssembly.instantiate(bytes);
const memory = instance.exports.memory;
let bytesView = new Uint8Array(memory.buffer);          // view over linear memory

const payload = new TextEncoder().encode("wasm");       // [119, 97, 115, 109]
bytesView.set(payload, 0);                              // write at offset 0
const total = instance.exports.sum_bytes(0, payload.length);
console.log(total);                                     // 440 (119 + 97 + 115 + 109)

The pointer (0) and length (4) ride the value stack as plain i32 operands; the four bytes ride linear memory. This is the entire contract, and it is the same one formalized for strings, structs, and large buffers in linear memory management & allocators, where the JavaScript side learns to call the module’s malloc/free rather than hard-coding offset 0.
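The guide does not show the module behind sum_bytes, so here is one hypothetical implementation to make the contract runnable. The binary is hand-encoded for illustration (an assumption, not the author's module): it exports one page of memory and a function that loops over len bytes starting at ptr, adding each with i32.load8_u.

```javascript
// Hypothetical sum_bytes module, hand-encoded. Locals: 0 = ptr, 1 = len, 2 = total.
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,             // magic + version
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f,       // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00,                                     // func 0 uses type 0
  0x05, 0x03, 0x01, 0x00, 0x01,                               // memory: minimum 1 page
  0x07, 0x16, 0x02,                                           // export section, 2 entries
  0x06, 0x6d, 0x65, 0x6d, 0x6f, 0x72, 0x79, 0x02, 0x00,       // "memory" -> memory 0
  0x09, 0x73, 0x75, 0x6d, 0x5f, 0x62, 0x79, 0x74, 0x65, 0x73, 0x00, 0x00, // "sum_bytes" -> func 0
  0x0a, 0x2d, 0x01, 0x2b, 0x01, 0x01, 0x7f,                   // code: 1 body, 1 extra i32 local
  0x02, 0x40, 0x03, 0x40,                                     // block, loop
  0x20, 0x01, 0x45, 0x0d, 0x01,                               // if len == 0, branch out of the block
  0x20, 0x02, 0x20, 0x00, 0x2d, 0x00, 0x00, 0x6a, 0x21, 0x02, // total += load8_u(ptr)
  0x20, 0x00, 0x41, 0x01, 0x6a, 0x21, 0x00,                   // ptr += 1
  0x20, 0x01, 0x41, 0x01, 0x6b, 0x21, 0x01,                   // len -= 1
  0x0c, 0x00, 0x0b, 0x0b,                                     // br back to loop start, end loop, end block
  0x20, 0x02, 0x0b,                                           // return total
]);

const instance = new WebAssembly.Instance(new WebAssembly.Module(wasmBytes));
const view = new Uint8Array(instance.exports.memory.buffer);

const payload = new TextEncoder().encode("wasm");             // 4 bytes of UTF-8
view.set(payload, 0);                                         // payload travels via linear memory
const total = instance.exports.sum_bytes(0, payload.length);  // ptr and len travel as i32 arguments
console.log(total); // 440
```

Note that the only memory instruction in the body is the single i32.load8_u; the pointer, the remaining length, and the running total are all locals and operands that never acquire an address.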

Tracing one expression through the stack

The stack discipline is easiest to trust once you watch a single expression evaluate. Consider (a + b) * 2 for parameters $a and $b. The folded text format nests each operator around its operands; flattened into execution order, it becomes a post-order sequence that makes the push/pop order explicit:

(func (export "calc") (param $a i32) (param $b i32) (result i32)
  (i32.mul
    (i32.add (local.get $a) (local.get $b))
    (i32.const 2)))

The engine executes the leaves first and walks up. Here are the value-stack contents after each instruction, reading the flattened body in execution order:

step  instruction    stack after (bottom → top)
1     local.get $a   [a]
2     local.get $b   [a, b]
3     i32.add        [a+b]
4     i32.const 2    [a+b, 2]
5     i32.mul        [(a+b)*2]

At the function’s end exactly one i32 remains, matching the declared (result i32). The validator proved that height and type before the module ran — no instruction ever underflows the stack or leaves the wrong type, which is why an engine can compile each instruction to a register operation without runtime stack bookkeeping. None of these five steps touch linear memory; this is pure value-stack arithmetic, and it is the common case for hot inner loops where keeping data off the heap is exactly what makes Wasm fast.
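The trace can be reproduced with a toy interpreter. This is only an illustrative model of the value stack written in JavaScript (the instruction objects and the trace function are made up for this sketch); a real engine validates the stack shape ahead of time and does not interpret instructions this way.

```javascript
// Flattened instruction sequence for (a + b) * 2, in execution order.
const program = [
  { op: "local.get", index: 0 },
  { op: "local.get", index: 1 },
  { op: "i32.add" },
  { op: "i32.const", value: 2 },
  { op: "i32.mul" },
];

// Toy model: run each instruction and record a copy of the stack afterwards.
function trace(instructions, locals) {
  const stack = [];
  const snapshots = [];
  for (const ins of instructions) {
    switch (ins.op) {
      case "local.get": stack.push(locals[ins.index] | 0); break;
      case "i32.const": stack.push(ins.value | 0); break;
      case "i32.add": { const r = stack.pop(); const l = stack.pop(); stack.push((l + r) | 0); break; }
      case "i32.mul": { const r = stack.pop(); const l = stack.pop(); stack.push(Math.imul(l, r)); break; }
      default: throw new Error(`unsupported op: ${ins.op}`);
    }
    snapshots.push([...stack]);
  }
  return snapshots;
}

const steps = trace(program, [3, 4]);              // a = 3, b = 4
steps.forEach((s, i) => console.log(i + 1, program[i].op, JSON.stringify(s)));
// last line printed: 5 i32.mul [14]
```

The model never allocates a byte of linear memory, which mirrors the point above: the whole computation is operands flowing between instructions.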

Frames, calls, and where each kind of stack lives

A function call pushes a new activation: the callee gets fresh locals (its parameters seeded from the operands the caller left on the value stack, the rest zero-initialized) and a fresh, empty operand stack region. When the callee returns, its results are pushed onto the caller’s value stack and its frame is discarded. Crucially, this entire call mechanism is engine-managed — there is no return-address or saved-register area you can read in linear memory, because the value stack and the activation records are not addressable.

That is fine until a function needs addressable locals: a fixed-size array, a struct it passes by pointer, or a variable whose address escapes. Those cannot live on the value stack, so the compiler emits a prologue that subtracts from a global stack-pointer to claim a slice of the shadow stack in linear memory, and an epilogue that restores it. So a single source-level function call can touch three different “stacks”: the engine value stack (operands), the engine activation stack (frames), and the shadow stack in linear memory (addressable locals). Under deep recursion, the value and activation stacks are bounded by engine limits and surface as a clean trap or RangeError, while the shadow stack is bounded only by the region you reserved for it; overrun it and you silently corrupt whatever sits next to it in linear memory (static data or the malloc heap, depending on layout) unless a guard or canary catches it.
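The prologue/epilogue pattern can be modeled from the host side. This is a sketch under stated assumptions: STACK_TOP, stackPointer, and withFrame are hypothetical names for illustration, and a real compiler emits the equivalent global.get/global.set instructions inside each function rather than calling a helper.

```javascript
const memory = new WebAssembly.Memory({ initial: 1 });
const view = new DataView(memory.buffer);

// Hypothetical layout: the shadow stack starts at the top of the page and grows downward.
const STACK_TOP = 65536;
const stackPointer = new WebAssembly.Global({ value: "i32", mutable: true }, STACK_TOP);

function withFrame(frameBytes, body) {
  const saved = stackPointer.value;
  stackPointer.value = saved - frameBytes;   // prologue: claim a slice of linear memory
  try {
    return body(stackPointer.value);         // the frame's base is a real byte address
  } finally {
    stackPointer.value = saved;              // epilogue: release the slice
  }
}

// An "addressable local": a 4-element i32 array that must live in linear memory.
const sum = withFrame(16, (base) => {
  for (let i = 0; i < 4; i++) view.setInt32(base + i * 4, i + 1, true);
  let total = 0;
  for (let i = 0; i < 4; i++) total += view.getInt32(base + i * 4, true);
  return total;
});

console.log(sum, stackPointer.value); // 10 65536
```

Nothing here checks that the stack pointer stays above the region's lower bound, which is exactly why an overrun in the real thing corrupts neighboring data instead of failing.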

Optimization flags & tradeoffs

The boundary between value stack and shadow stack is something you can tune at compile time, and the numbers matter for both size and safety:

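  • Shadow-stack size. Emscripten’s -sSTACK_SIZE=N (default 64 KiB in current releases) sets how much of linear memory is reserved for the software call stack. Too small and deep recursion overflows it, silently clobbering adjacent heap or data and producing memory corruption rather than a clean error; too large and you waste pages. -sSTACK_OVERFLOW_CHECK=1 writes a canary at the end of the stack that is checked periodically; -sSTACK_OVERFLOW_CHECK=2 additionally instruments stack-pointer updates so an overflow aborts where it happens, at a small cost in code size and speed.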
  • Allocator choice. dlmalloc (Emscripten default, ~6 KiB of code) resists fragmentation; wee_alloc in Rust trims roughly 10 KiB off the binary but fragments badly under churn and is not thread-safe. A bump allocator is a few hundred bytes and frees nothing until you reset it — ideal for per-frame scratch.
  • Stack-first layout. The linker option -z stack-first (a flag of wasm-ld, the linker Emscripten uses) places the shadow stack at the lowest addresses, below static data, so an overflow runs off the bottom of memory and traps as an out-of-bounds access instead of silently corrupting data. The tradeoff is a few bytes of layout overhead. Verify the resulting memory and data offsets with wasm2wat app.wasm | grep -A2 "(memory".
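To show why a bump allocator is so small, here is a minimal sketch of one operating on a WebAssembly.Memory from the host side. The class name, the base offset of 1024, and the 8-byte default alignment are assumptions for illustration; a production allocator would live inside the module and would also handle growing memory.

```javascript
class BumpAllocator {
  constructor(memory, base) {
    this.memory = memory;
    this.base = base;   // first byte this allocator may hand out
    this.next = base;   // next free byte
  }

  // Round up to the alignment (a power of two), then advance the cursor.
  alloc(size, align = 8) {
    const start = (this.next + align - 1) & ~(align - 1);
    const end = start + size;
    if (end > this.memory.buffer.byteLength) {
      throw new RangeError("bump allocator out of memory");
    }
    this.next = end;
    return start;       // a "pointer": just a byte offset into linear memory
  }

  // Nothing is freed individually; everything is released at once.
  reset() {
    this.next = this.base;
  }
}

const memory = new WebAssembly.Memory({ initial: 1 });
const scratch = new BumpAllocator(memory, 1024);

const a = scratch.alloc(10);   // 1024
const b = scratch.alloc(4);    // 1040 (1034 rounded up to a multiple of 8)
scratch.reset();               // e.g. at the end of a frame
const c = scratch.alloc(16);   // 1024 again
console.log(a, b, c);
```

The whole allocator is one cursor and one comparison, which is why it suits per-frame scratch data that is all discarded together.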

Gotchas & failure modes

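  • RangeError: Maximum call stack size exceeded means the engine’s native call stack ran out, which is the limit that bounds the value and activation stacks. Runaway JavaScript recursion (including JS→Wasm→JS ping-pong) is the usual cause, though deep pure-Wasm recursion can hit it too. It does not indicate a shadow-stack overflow: recursion through functions with large addressable locals can overrun the shadow stack first and silently corrupt linear memory unless you compiled with -sSTACK_OVERFLOW_CHECK.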
  • RuntimeError: memory access out of bounds is the trap you get when a load/store offset is past the current buffer length. It is a good failure — the bounds check caught a bad pointer before it could read host memory.
  • Stale views after growth. For an ordinary (non-shared) memory, a successful memory.grow detaches the previous ArrayBuffer, so every typed array you built over the old memory.buffer reads as zero-length afterward. Always re-create views from instance.exports.memory.buffer after any call that might grow memory, including calls into the module that may allocate.
  • Taking the address of a value-stack local is impossible. There is no instruction for it. If your source language needs &local, the compiler has already moved that local to the shadow stack in linear memory; reasoning as if it were on the value stack will mislead you.
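The stale-view gotcha is easy to reproduce with a standalone WebAssembly.Memory, no module required. This is a minimal sketch for a non-shared memory; the maximum of 4 pages is an arbitrary choice for the example.

```javascript
const memory = new WebAssembly.Memory({ initial: 1, maximum: 4 });
let view = new Uint8Array(memory.buffer);
view[0] = 7;

const previousPages = memory.grow(1);   // returns the size before growing, in pages
const staleLength = view.length;        // the old view is now detached

view = new Uint8Array(memory.buffer);   // re-create the view from the fresh buffer
console.log(previousPages, staleLength, view.length, view[0]); // 1 0 131072 7
```

The contents survive the growth; only the JavaScript-side handle to them is invalidated, so a small helper that always builds views from memory.buffer on demand avoids the problem.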

Verification

Confirm the stack/heap split is what you think with these checks:

# See the declared memory and how the value stack is used per function
wasm-objdump -d stack-heap.wasm        # disassemble: spot load/store vs pure stack ops
wasm-objdump -x stack-heap.wasm        # section headers: Memory, Data, Global (stack pointer)

# Validate the module's stack discipline statically
wasm-validate stack-heap.wasm          # exits 0 only if every stack height type-checks

In Chrome DevTools, the Memory inspector lets you view linear memory as raw bytes while stepping Wasm frames, so you can watch a store land at the offset you expect and confirm the value stack never appears there.
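The same static check is available from JavaScript through WebAssembly.validate, which returns a boolean instead of throwing. The two byte arrays below are hand-encoded for illustration (they are not output from the guide's module): each declares one function of type () -> i32, and only the first leaves an i32 on the stack at the end of its body.

```javascript
const header = [0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]; // magic + version
const typeAndFunc = [
  0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f, // type section: () -> i32
  0x03, 0x02, 0x01, 0x00,                   // function section: func 0 has type 0
];

// Body "i32.const 19; end": exactly one i32 remains, matching the result type.
const goodCode = [0x0a, 0x06, 0x01, 0x04, 0x00, 0x41, 0x13, 0x0b];
// Body "nop; nop; end": the stack is empty at the end, so the result is missing.
const badCode = [0x0a, 0x06, 0x01, 0x04, 0x00, 0x01, 0x01, 0x0b];

const good = WebAssembly.validate(new Uint8Array([...header, ...typeAndFunc, ...goodCode]));
const bad = WebAssembly.validate(new Uint8Array([...header, ...typeAndFunc, ...badCode]));
console.log(good, bad); // true false
```

The second module is rejected before any code runs, which is the stack-height guarantee described earlier in action.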


Frequently Asked Questions

Does WebAssembly have a heap? Not natively. The specification gives a module linear memory, a flat byte buffer (at most one per module in the original MVP), and no allocator. What everyone calls “the heap” is a region of that buffer managed by a software allocator (dlmalloc, wee_alloc, or a bump allocator) that your toolchain links in. The engine only sees load, store, and memory.grow.

Is the value stack stored in linear memory? No. The value stack is engine-internal state that holds the typed operands flowing between instructions; it has no addresses and you cannot point into it. It may live entirely in machine registers. Only load/store reach linear memory, which is a separate, byte-addressable buffer.

Why can’t I take a pointer to a function local? Because value-stack locals are typed engine slots, not memory. When your source code takes the address of a local, the compiler relocates that variable to the shadow stack — a software stack inside linear memory managed via a global stack-pointer — precisely so it has a real byte address.

What happens when a load goes out of bounds? The access fails its bounds check and the module traps, surfacing in JavaScript as a RuntimeError: memory access out of bounds. Execution unwinds cleanly to the host; the bad access never touches memory outside the module’s buffer, which is the core of the sandbox guarantee.
