Is WebAssembly faster than JavaScript for DOM manipulation?

This page answers one narrow question with numbers: when you need to read or mutate the DOM, is it faster to do it from a WebAssembly module or from plain JavaScript? The short answer is JavaScript — and the reason is structural, not a matter of optimization.

WebAssembly cannot touch document, window, or the render tree directly. The browser sandbox grants DOM access only through imported JavaScript functions, so every DOM operation a module performs is really a JavaScript call wrapped in a boundary crossing. That crossing — not raw compute — is what you end up measuring.

Prerequisites

  • [ ] Chrome/Edge 119+ or Firefox 120+ with DevTools open on the Performance panel
  • [ ] A .wasm module exporting an update_dom_from_wasm(i) function that calls an imported JS DOM setter
  • [ ] wabt ≥ 1.0.34 if you want to inspect the import section with wasm-objdump
  • [ ] A page served over http://localhost (file:// blocks streaming instantiation)
  • [ ] Basic familiarity with performance.now() and the typed-array view pattern over linear memory

How the boundary crossing actually works

Every DOM mutation initiated by Wasm follows the same synchronous chain. A Wasm instruction calls an imported JS function (the import object is the only doorway out of the sandbox); execution transitions from Wasm code into the JavaScript engine; each i32 argument, whether a plain number or a pointer into linear memory, is converted to a JS value; the real DOM API runs; and finally layout or style recalculation may block the main thread. That is four steps and a potential reflow for what would be a single property assignment in pure JS.
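The chain above can be reproduced end to end with a module small enough to write by hand. The sketch below is an illustration, not the article's dom_ops.wasm: the byte array encodes a hypothetical module that imports env.set_text and exports update, and a plain object stands in for the DOM node so the snippet also runs outside a browser.

```javascript
// Hand-assembled module (an assumption for illustration), equivalent to:
//   (import "env" "set_text" (func (param i32)))
//   (func (export "update") (param i32) local.get 0 call 0)
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00,             // magic + version
  0x01, 0x05, 0x01, 0x60, 0x01, 0x7f, 0x00,                   // type 0: (i32) -> ()
  0x02, 0x10, 0x01,                                           // import section, 1 entry
  0x03, 0x65, 0x6e, 0x76,                                     // module name "env"
  0x08, 0x73, 0x65, 0x74, 0x5f, 0x74, 0x65, 0x78, 0x74,       // field name "set_text"
  0x00, 0x00,                                                 // kind: func, type 0
  0x03, 0x02, 0x01, 0x00,                                     // one local function of type 0
  0x07, 0x0a, 0x01, 0x06, 0x75, 0x70, 0x64, 0x61, 0x74, 0x65, // export "update"...
  0x00, 0x01,                                                 // ...as function index 1
  0x0a, 0x08, 0x01, 0x06, 0x00, 0x20, 0x00, 0x10, 0x00, 0x0b, // body: local.get 0; call 0; end
]);

// Synchronous compilation is fine for a module this small.
const wasmModule = new WebAssembly.Module(wasmBytes);

// The import section is the module's complete list of doorways out of the sandbox.
const declaredImports = WebAssembly.Module.imports(wasmModule);

// Stand-in for document.getElementById("target") so the sketch runs without a DOM.
const fakeTarget = { textContent: "" };
const crossings = [];

const crossingInstance = new WebAssembly.Instance(wasmModule, {
  env: {
    // Steps 2-4 of the chain happen here: the engine enters JS, the i32 arrives
    // as a JS number, and the "DOM" write runs.
    set_text(i) {
      crossings.push(i);
      fakeTarget.textContent = `Wasm ${i}`;
    },
  },
});

// Step 1 of the chain: Wasm executes `call 0`, which is the imported function.
crossingInstance.exports.update(7);
console.log(declaredImports, fakeTarget.textContent);
```

WebAssembly.Module.imports reports the same import section that wasm-objdump would show, which makes it a quick way to check from the console what a module can reach.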

Procedure: measure the crossing yourself

  1. Define the imported DOM setter that Wasm will call. Decoding the string out of linear memory is part of the cost you are measuring.

    const target = document.getElementById("target");
    const decoder = new TextDecoder(); // created once and reused by every call
    const importObject = {
      env: {
        // Called from Wasm with a pointer and byte length into linear memory.
        update_dom_text(ptr, len) {
          // `wasm` is assigned in step 2; it exists by the time Wasm calls this.
          const view = new Uint8Array(wasm.exports.memory.buffer, ptr, len);
          target.textContent = decoder.decode(view);
        },
      },
    };

  2. Instantiate the module, capturing the exports so the import can reach memory.

    const { instance: wasm } = await WebAssembly.instantiateStreaming(
      fetch("/dom_ops.wasm"), importObject);

  3. Run the pure-JavaScript baseline in a tight loop and time it.

    const N = 100_000;
    let t0 = performance.now();
    for (let i = 0; i < N; i++) target.textContent = `JS ${i}`;
    const jsMs = performance.now() - t0;

  4. Run the Wasm-driven loop through the glue and time the same work.

    t0 = performance.now();
    for (let i = 0; i < N; i++) wasm.exports.update_dom_from_wasm(i);
    const wasmMs = performance.now() - t0;
    console.log(`JS ${jsMs.toFixed(1)}ms | Wasm+glue ${wasmMs.toFixed(1)}ms`);

  5. Confirm where the time went in the Performance flame graph: record the run, then look for call frames bridging wasm-function[...] and JS execution. High time in and under update_dom_text is the boundary tax (the call plus the decode), not your compute.
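A single timed pass is sensitive to JIT warm-up and background noise. As a hedged refinement of steps 3 and 4, the sketch below runs one untimed warm-up pass and then reports the middle sample of several timed passes. The helper name medianLoopMs and its default of five runs are assumptions, not part of the procedure above.

```javascript
// Hypothetical helper: time `runs` passes of `n` calls each and return the
// middle sample in milliseconds.
function medianLoopMs(fn, n, runs = 5) {
  // Untimed warm-up pass so the engine can optimize the hot path first.
  for (let i = 0; i < n; i++) fn(i);

  const samples = [];
  for (let r = 0; r < runs; r++) {
    const t0 = performance.now();
    for (let i = 0; i < n; i++) fn(i);
    samples.push(performance.now() - t0);
  }

  samples.sort((a, b) => a - b);
  return samples[Math.floor(samples.length / 2)];
}

// In the browser, with `target` and `wasm` from steps 1 and 2:
//   const jsMs = medianLoopMs((i) => { target.textContent = `JS ${i}`; }, 100_000);
//   const wasmMs = medianLoopMs((i) => wasm.exports.update_dom_from_wasm(i), 100_000);
```

Passing both loops through the same helper keeps the harness overhead identical on each side, so the difference between the two results is still the glue.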

Expected output

On a 2023-class laptop in Chrome, 100,000 single text updates land roughly like this. Absolute numbers vary by machine; the ratio is what holds.

Workload (100k iterations)              | JavaScript | Wasm + JS glue | Ratio
----------------------------------------|------------|----------------|-------------
textContent assignment                  | 18 ms      | 64 ms          | 3.6× slower
createElement + appendChild             | 41 ms      | 138 ms         | 3.4× slower
String built in Wasm, decoded per call  | 22 ms      | 96 ms          | 4.4× slower
Pure CPU transform, no DOM              | 4.1 ms     | 0.9 ms         | 4.6× faster

The last row is the point: WebAssembly wins decisively on CPU-bound work and loses just as decisively the moment that work has to cross back into the DOM. Where the crossover sits depends on workload shape:

Workload profile                                            | Faster engine | Why
------------------------------------------------------------|---------------|---------------------------------------------------
More than 5 DOM updates per frame                           | JavaScript    | Boundary crossings dominate the frame budget
Fewer than 5 DOM updates plus more than 100 ms of compute   | WebAssembly   | Compute savings outweigh one batched crossing
Canvas / WebGL rendering                                    | WebAssembly   | Bypasses the DOM; writes pixels to a memory buffer
Virtual-DOM diffing                                         | JavaScript    | DOM APIs are JIT-optimized; Wasm adds serialization

Gotchas

  • You are timing the glue, not Wasm. If you benchmark wasm.exports.update_dom_from_wasm and conclude “Wasm is slow,” you have measured the import call plus the TextDecoder decode and string allocation. Profile the pure-compute export separately to see Wasm’s real speed.

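  • Per-call TextDecoder allocation dominates small payloads. Constructing a new TextDecoder or a new Uint8Array view inside the import on every call adds GC pressure. Reuse one decoder and, while memory is stable, one view; the Wasm+glue column above already does. Note that the step 1 import, as written, reuses the decoder but still creates a view on every call.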

  • memory.grow invalidates your cached view. If the module grows linear memory mid-run, any Uint8Array you cached over the old memory.buffer detaches and reads as zero-length, so your DOM text silently goes blank. Re-create the view after any call that may grow memory.

  • Reflow can swamp both engines. If your DOM setter triggers synchronous layout (reading offsetHeight right after a write), the reflow cost dwarfs the boundary cost and both columns balloon — measure with layout thrashing eliminated first.
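The memory.grow gotcha is easy to reproduce without a module at all. The sketch below uses a standalone WebAssembly.Memory as a stand-in for wasm.exports.memory. The freshView helper is a hypothetical name for one way to recover: rebuild the view only when the buffer identity has changed.

```javascript
// Stand-in for wasm.exports.memory: one 64 KiB page of linear memory.
const linearMemory = new WebAssembly.Memory({ initial: 1 });

let cachedView = new Uint8Array(linearMemory.buffer);
const lengthBeforeGrow = cachedView.length; // 65536

// Growing replaces memory.buffer and detaches the old ArrayBuffer.
linearMemory.grow(1);
const lengthAfterGrow = cachedView.length; // 0: the cached view is now empty

// Hypothetical helper: keep the cached view, but rebuild it whenever the
// memory's current buffer is no longer the one the view was created over.
function freshView() {
  if (cachedView.buffer !== linearMemory.buffer) {
    cachedView = new Uint8Array(linearMemory.buffer);
  }
  return cachedView;
}

const lengthAfterRefresh = freshView().length; // 131072 (two pages)
console.log(lengthBeforeGrow, lengthAfterGrow, lengthAfterRefresh);
```

The identity check costs one comparison per call, which keeps the benefit of a cached view without the silent blank text.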

Performance note

The dominant cost is the boundary, and the boundary cost is roughly fixed per call regardless of engine. In the textContent row, the 46 ms gap between JS and Wasm+glue over 100,000 iterations works out to about 0.46 µs of crossing and glue overhead per call: invisible once, ruinous in a hot loop. The fix is never “make Wasm faster,” it is “cross less”: batch all mutations in linear memory, hand JavaScript one contiguous block, and let JS apply them in a single requestAnimationFrame callback. For a rigorous head-to-head methodology on compute-bound work, see measuring Wasm vs JavaScript throughput.
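The per-call figure is plain arithmetic on the Expected output table, and the same calculation applies to every DOM row. The sketch below copies the table's timings as constants; nothing in it is newly measured.

```javascript
// Timings copied from the Expected output table (ms per 100,000 iterations).
const ITERATIONS = 100_000;
const tableRows = [
  { name: "textContent assignment", jsMs: 18, wasmMs: 64 },
  { name: "createElement + appendChild", jsMs: 41, wasmMs: 138 },
  { name: "String built in Wasm, decoded per call", jsMs: 22, wasmMs: 96 },
];

const overhead = tableRows.map(({ name, jsMs, wasmMs }) => ({
  name,
  ratio: wasmMs / jsMs,
  // Gap in ms, converted to microseconds, spread over every call.
  extraMicrosPerCall: ((wasmMs - jsMs) * 1000) / ITERATIONS,
}));

for (const row of overhead) {
  console.log(
    `${row.name}: ${row.ratio.toFixed(1)}x, +${row.extraMicrosPerCall.toFixed(2)} µs/call`
  );
}
// textContent assignment: 3.6x, +0.46 µs/call
// createElement + appendChild: 3.4x, +0.97 µs/call
// String built in Wasm, decoded per call: 4.4x, +0.74 µs/call
```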

The batching pattern collapses thousands of crossings into one. Have the module write a compact command buffer — a list of (node-index, length, byte-offset) triples plus the string bytes — into linear memory, then call JavaScript exactly once with the buffer’s base pointer. JavaScript decodes the whole buffer and applies every mutation inside a single animation frame:

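function flush() {
  // Ask the module where this frame's command buffer lives and how long it is.
  const ptr = wasm.exports.command_ptr();
  const count = wasm.exports.command_count();
  // Build the view fresh each frame so a memory.grow cannot leave it detached.
  const view = new DataView(wasm.exports.memory.buffer, ptr);
  for (let i = 0; i < count; i++) { /* read one record, mutate one node */ }
  requestAnimationFrame(flush); // schedule the next batch
}
requestAnimationFrame(flush); // start the loop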

One crossing per frame instead of per element turns the 4× penalty above into a rounding error, and the compute that built the command buffer runs at Wasm’s real speed — the row where it wins.
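To make the record loop concrete, here is a minimal sketch of decoding such a command buffer. The layout is an assumption: each record is three little-endian u32 fields (node-index, length, byte-offset), 12 bytes in total, with byte-offset measured from the start of linear memory. The applyCommands name, the standalone memory, and the plain objects standing in for DOM nodes are likewise illustrative; in the demo, JavaScript fills the buffer the way the Wasm side would.

```javascript
const RECORD_BYTES = 12; // assumed layout: three little-endian u32 fields per record

// Decode `count` records starting at `ptr` and apply each to nodes[nodeIndex].
function applyCommands(memory, ptr, count, nodes, decoder = new TextDecoder()) {
  const records = new DataView(memory.buffer, ptr, count * RECORD_BYTES);
  for (let i = 0; i < count; i++) {
    const base = i * RECORD_BYTES;
    const nodeIndex = records.getUint32(base, true);
    const length = records.getUint32(base + 4, true);
    const byteOffset = records.getUint32(base + 8, true);
    const textBytes = new Uint8Array(memory.buffer, byteOffset, length);
    nodes[nodeIndex].textContent = decoder.decode(textBytes);
  }
}

// Demo: JavaScript plays the module's role and writes two commands.
const batchMemory = new WebAssembly.Memory({ initial: 1 });
const encoder = new TextEncoder();
const commandPtr = 0;
let stringPtr = 256; // string bytes are stored after the record table
const writer = new DataView(batchMemory.buffer);

["alpha", "beta"].forEach((text, i) => {
  const encoded = encoder.encode(text);
  new Uint8Array(batchMemory.buffer, stringPtr, encoded.length).set(encoded);
  const base = commandPtr + i * RECORD_BYTES;
  writer.setUint32(base, i, true);                  // node-index
  writer.setUint32(base + 4, encoded.length, true); // length
  writer.setUint32(base + 8, stringPtr, true);      // byte-offset
  stringPtr += encoded.length;
});

const fakeNodes = [{ textContent: "" }, { textContent: "" }]; // stand-ins for DOM nodes
applyCommands(batchMemory, commandPtr, 2, fakeNodes);
console.log(fakeNodes.map((n) => n.textContent)); // [ 'alpha', 'beta' ]
```

In a browser, the body of the flush loop would reduce to one call such as applyCommands(wasm.exports.memory, ptr, count, nodes), where nodes is whatever array of elements the module's node indices refer to.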

Frequently Asked Questions

Will the JS Promise Integration (JSPI) or DOM-from-Wasm proposals change this?
JSPI lets synchronous Wasm code wait on asynchronous JavaScript APIs; it does not target per-call overhead. More direct DOM bindings would reduce per-call glue and shave the ratio, but not the fundamental engine transition and value coercion. For high-frequency mutation, batching across the boundary remains faster than crossing per element.

So is WebAssembly ever the right tool for UI?
Yes, for the compute behind the UI. Run layout math, parsing, image processing, or physics in Wasm, then emit one batched result that JavaScript renders. Keep the DOM writes in JS.

Why is canvas rendering listed as a Wasm win when the DOM is not?
Canvas and WebGL take a memory buffer, not per-node API calls. The module writes pixels into linear memory and JS uploads the whole buffer once, so there is no per-element boundary crossing to pay.
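As a rough sketch of that canvas path, assuming the module has written a width × height RGBA frame at a known pointer: the helper below wraps those bytes without copying and hands them to the 2D context in one call. The blitFrame name and the injectable ImageData constructor are assumptions made so the snippet can be exercised outside a browser.

```javascript
// Hypothetical helper: draw one RGBA frame straight out of linear memory.
// `ImageDataCtor` defaults to the browser's ImageData; it is a parameter only
// so the function can be tested where no canvas exists.
function blitFrame(ctx, memory, ptr, width, height, ImageDataCtor = globalThis.ImageData) {
  // A view over the module's pixels: 4 bytes (R, G, B, A) per pixel, no copy.
  const pixels = new Uint8ClampedArray(memory.buffer, ptr, width * height * 4);
  // One call uploads the whole frame, however many pixels changed.
  ctx.putImageData(new ImageDataCtor(pixels, width, height), 0, 0);
  return pixels.length;
}

// In the browser (canvas and frame_ptr are assumed names):
//   const ctx = canvas.getContext("2d");
//   blitFrame(ctx, wasm.exports.memory, wasm.exports.frame_ptr(), canvas.width, canvas.height);
```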
