Is WebAssembly faster than JavaScript for DOM manipulation?
This page answers one narrow question with numbers: when you need to read or mutate the DOM, is it faster to do it from a WebAssembly module or from plain JavaScript? The short answer is JavaScript — and the reason is structural, not a matter of optimization.
WebAssembly cannot touch document, window, or the render tree directly. The browser sandbox grants DOM access only through imported JavaScript functions, so every DOM operation a module performs is really a JavaScript call wrapped in a boundary crossing. That crossing, not raw compute, is what you end up measuring.
Prerequisites
- [ ] Chrome/Edge 119+ or Firefox 120+ with DevTools open on the Performance panel
- [ ] A `.wasm` module exporting an `update_dom_from_wasm(i)` function that calls an imported JS DOM setter
- [ ] `wabt` ≥ 1.0.34 if you want to inspect the import section with `wasm-objdump`
- [ ] A page served over `http://localhost` (`file://` blocks streaming instantiation)
- [ ] Basic familiarity with `performance.now()` and the typed-array view pattern over linear memory
How the boundary crossing actually works
Every DOM mutation initiated by Wasm follows the same synchronous chain. A Wasm instruction calls an
imported JS function (the import object is the only doorway out of the sandbox); the engine context-switches
from the Wasm execution stack into the JavaScript engine; the i32 argument or pointer is coerced into a
JS value; the real DOM API runs; and finally layout or style recalculation may block the main thread. That
is four transitions and a potential reflow for what would be a single property assignment in pure JS.
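To make the chain concrete, here is a minimal sketch that also runs outside a browser. The module bytes are hand-assembled for illustration only (they are not the `dom_ops.wasm` used later), and `set_value`, `received`, `wasmModule`, and `importList` are hypothetical names. The module imports one function and exports `run`, which forwards its argument to that import; a plain array stands in for the DOM.

```javascript
// Hand-assembled module, equivalent to this text format:
//   (module
//     (import "env" "set_value" (func (param i32)))
//     (func (export "run") (param i32) local.get 0 call 0))
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic + version
  0x01, 0x05, 0x01, 0x60, 0x01, 0x7f, 0x00,       // type section: (i32) -> ()
  0x02, 0x11, 0x01,                               // import section, 1 entry
  0x03, 0x65, 0x6e, 0x76,                         //   module name "env"
  0x09, 0x73, 0x65, 0x74, 0x5f, 0x76, 0x61, 0x6c, 0x75, 0x65, // "set_value"
  0x00, 0x00,                                     //   kind: function, type 0
  0x03, 0x02, 0x01, 0x00,                         // function section: 1 func, type 0
  0x07, 0x07, 0x01, 0x03, 0x72, 0x75, 0x6e, 0x00, 0x01, // export "run" = func 1
  0x0a, 0x08, 0x01, 0x06, 0x00,                   // code section: 1 body, no locals
  0x20, 0x00, 0x10, 0x00, 0x0b,                   //   local.get 0; call 0; end
]);

const wasmModule = new WebAssembly.Module(bytes);

// The import list is the module's complete set of doorways out of the sandbox.
const importList = WebAssembly.Module.imports(wasmModule);

// Stand-in for a DOM setter: record what crosses the boundary.
const received = [];
const instance = new WebAssembly.Instance(wasmModule, {
  env: { set_value: (value) => received.push(value) },
});

instance.exports.run(7); // Wasm -> imported JS function -> JS value
console.log(importList, received);
```

`WebAssembly.Module.imports` gives a quick in-page view of the import list, which is handy when `wasm-objdump` is not installed.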
Procedure: measure the crossing yourself
- Define the imported DOM setter that Wasm will call. Decoding the string out of linear memory is part of the cost you are measuring.

      const target = document.getElementById("target");
      const decoder = new TextDecoder(); // one decoder, reused on every call
      const importObject = {
        env: {
          update_dom_text(ptr, len) {
            // `wasm` is assigned by the instantiation in the next step; this
            // assumes the module does not call the import while instantiating.
            const view = new Uint8Array(wasm.exports.memory.buffer, ptr, len);
            target.textContent = decoder.decode(view);
          },
        },
      };

- Instantiate the module, capturing the exports so the import can reach memory.

      const { instance: wasm } = await WebAssembly.instantiateStreaming(
        fetch("/dom_ops.wasm"), importObject);

- Run the pure-JavaScript baseline in a tight loop and time it.

      const N = 100_000;
      let t0 = performance.now();
      for (let i = 0; i < N; i++) target.textContent = `JS ${i}`;
      const jsMs = performance.now() - t0;

- Run the Wasm-driven loop through the glue and time the same work.

      t0 = performance.now();
      for (let i = 0; i < N; i++) wasm.exports.update_dom_from_wasm(i);
      const wasmMs = performance.now() - t0;
      console.log(`JS ${jsMs.toFixed(1)}ms | Wasm+glue ${wasmMs.toFixed(1)}ms`);

- Confirm where the time went in the Performance flame graph: record the run, then look for `Call` frames bridging `wasm-function[...]` and JS execution. High self-time inside `update_dom_text` is the boundary tax, not your compute.
Expected output
On a 2023-class laptop in Chrome, 100,000 single text updates land roughly like this. Absolute numbers vary by machine; the ratio is what holds.
| Workload (100k iterations) | JavaScript | Wasm + JS glue | Ratio |
|---|---|---|---|
| `textContent` assignment | 18 ms | 64 ms | 3.6× slower |
| `createElement` + `appendChild` | 41 ms | 138 ms | 3.4× slower |
| String built in Wasm, decoded per call | 22 ms | 96 ms | 4.4× slower |
| Pure CPU transform, no DOM | 4.1 ms | 0.9 ms | 4.6× faster |
The last row is the point: WebAssembly wins decisively on CPU-bound work and loses just as decisively the moment that work has to cross back into the DOM. Where the crossover sits depends on workload shape:
| Workload profile | Faster engine | Why |
|---|---|---|
| > 5 DOM updates per frame | JavaScript | Boundary crossings dominate the frame budget |
| < 5 DOM updates + > 100 ms compute | WebAssembly | Compute savings outweigh one batched crossing |
| Canvas / WebGL rendering | WebAssembly | Bypasses the DOM; writes pixels to a memory buffer |
| Virtual-DOM diffing | JavaScript | DOM APIs are JIT-optimized; Wasm adds serialization |
Gotchas
- You are timing the glue, not Wasm. If you benchmark `wasm.exports.update_dom_from_wasm` and conclude “Wasm is slow,” you have measured the import call plus `TextDecoder` allocation. Profile the pure-compute export separately to see Wasm’s real speed.
- Per-call `TextDecoder` allocation dominates small payloads. Constructing a new `TextDecoder` or a new `Uint8Array` view inside the import on every call adds GC pressure. Reuse one decoder and one view; the Wasm+glue column above already does.
- `memory.grow` invalidates your cached view. If the module grows linear memory mid-run, any `Uint8Array` you cached over the old `memory.buffer` detaches and reads as zero-length, so your DOM text silently goes blank. Re-create the view after any call that may grow memory.
- Reflow can swamp both engines. If your DOM setter triggers synchronous layout (reading `offsetHeight` right after a write), the reflow cost dwarfs the boundary cost and both columns balloon. Measure with layout thrashing eliminated first.
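The `memory.grow` gotcha is easy to reproduce without a module at all. The sketch below uses a standalone `WebAssembly.Memory`; `cachedView` and `currentView` are hypothetical names, and the refresh check is one possible guard, not the only one.

```javascript
const memory = new WebAssembly.Memory({ initial: 1 }); // one 64 KiB page

// Cache a view over the current buffer, as a reuse-minded glue layer would.
let cachedView = new Uint8Array(memory.buffer);
const lengthBeforeGrow = cachedView.length; // 65536

// Growing replaces memory.buffer and detaches the old ArrayBuffer.
memory.grow(1);
const lengthAfterGrow = cachedView.length; // 0: the cached view is now useless

// Guard: rebuild the view whenever the underlying buffer has changed.
function currentView() {
  if (cachedView.buffer !== memory.buffer) {
    cachedView = new Uint8Array(memory.buffer);
  }
  return cachedView;
}

const refreshedLength = currentView().length; // 131072 (two pages)
console.log(lengthBeforeGrow, lengthAfterGrow, refreshedLength);
```

Comparing `cachedView.buffer` against `memory.buffer` keeps the reuse benefit in the common case and only reallocates after a grow.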
Performance note
The dominant cost is the boundary, and the boundary cost is roughly fixed per call regardless of engine.
At 100,000 iterations the ~46 ms gap between JS and Wasm+glue works out to about 0.46 µs of pure crossing
overhead per call — invisible once, ruinous in a hot loop. The fix is never “make Wasm faster,” it is
“cross less”: batch all mutations in linear memory, hand JavaScript one contiguous block, and let JS apply
them in a single requestAnimationFrame. For a rigorous head-to-head methodology on compute-bound work,
see measuring Wasm vs JavaScript throughput.
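The per-call figure is plain arithmetic on the Expected output table, and the same calculation applies to the other DOM rows. In this sketch `perCallOverheadUs` is a hypothetical helper, and the inputs are the table's illustrative timings, not fresh measurements.

```javascript
// Per-call overhead in microseconds: (gap in ms * 1000) / iterations.
function perCallOverheadUs(jsMs, wasmMs, iterations) {
  return ((wasmMs - jsMs) * 1000) / iterations;
}

const N = 100_000;
const overhead = {
  textContent: perCallOverheadUs(18, 64, N),   // 46 ms gap
  createAppend: perCallOverheadUs(41, 138, N), // 97 ms gap
  wasmString: perCallOverheadUs(22, 96, N),    // 74 ms gap
};

for (const [row, us] of Object.entries(overhead)) {
  console.log(`${row}: ${us.toFixed(2)} µs per call`);
}
// textContent: 0.46 µs per call
// createAppend: 0.97 µs per call
// wasmString: 0.74 µs per call
```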
The batching pattern collapses thousands of crossings into one. Have the module write a compact command
buffer — a list of (node-index, length, byte-offset) triples plus the string bytes — into linear memory,
then call JavaScript exactly once with the buffer’s base pointer. JavaScript decodes the whole buffer and
applies every mutation inside a single animation frame:
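    function flush() {
      // Ask the module where this frame's command buffer lives.
      const ptr = wasm.exports.command_ptr();
      const count = wasm.exports.command_count();
      // Build the view on every flush: memory.buffer changes if the module grows memory.
      const view = new DataView(wasm.exports.memory.buffer, ptr);
      for (let i = 0; i < count; i++) { /* read one record, mutate one node */ }
      requestAnimationFrame(flush); // schedule the next frame's flush
    }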
One crossing per frame instead of per element turns the 4× penalty above into a rounding error, and the compute that built the command buffer runs at Wasm’s real speed — the row where it wins.
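The record-reading step can be sketched as follows. The layout is an assumption made for illustration: each record is three little-endian `u32` values (node index, string length, absolute byte offset of the string), and `applyCommands`, `RECORD_BYTES`, and the fixed offsets are hypothetical. JavaScript writes the buffer here only to simulate what the module would produce, and plain objects stand in for DOM nodes.

```javascript
const RECORD_BYTES = 12; // assumed layout: 3 x u32 (nodeIndex, length, byteOffset)
const decoder = new TextDecoder();

// Decode `count` records starting at `ptr` and apply each one to its node.
function applyCommands(buffer, ptr, count, nodes) {
  const view = new DataView(buffer, ptr, count * RECORD_BYTES);
  for (let i = 0; i < count; i++) {
    const base = i * RECORD_BYTES;
    const nodeIndex = view.getUint32(base, true);
    const length = view.getUint32(base + 4, true);
    const byteOffset = view.getUint32(base + 8, true);
    const text = decoder.decode(new Uint8Array(buffer, byteOffset, length));
    nodes[nodeIndex].textContent = text;
  }
}

// Simulate the module's side: records at offset 0, string bytes from 1024.
const memory = new WebAssembly.Memory({ initial: 1 });
const encoder = new TextEncoder();
const updates = [[0, "alpha"], [2, "gamma"]];
const commandPtr = 0;
let stringPtr = 1024;
const writer = new DataView(memory.buffer);
updates.forEach(([nodeIndex, text], i) => {
  const encoded = encoder.encode(text);
  new Uint8Array(memory.buffer, stringPtr, encoded.length).set(encoded);
  const recordBase = commandPtr + i * RECORD_BYTES;
  writer.setUint32(recordBase, nodeIndex, true);
  writer.setUint32(recordBase + 4, encoded.length, true);
  writer.setUint32(recordBase + 8, stringPtr, true);
  stringPtr += encoded.length;
});

// Stand-ins for DOM elements; in a page these would be real nodes.
const nodes = [{ textContent: "" }, { textContent: "" }, { textContent: "" }];
applyCommands(memory.buffer, commandPtr, updates.length, nodes);
console.log(nodes.map((n) => n.textContent)); // [ 'alpha', '', 'gamma' ]
```

Keeping the records fixed-width means JavaScript can walk the buffer with simple offset arithmetic, while the variable-length strings sit in a separate region.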
Frequently Asked Questions
Will the JS Promise Integration or DOM-from-Wasm proposals change this?
They reduce per-call glue but not the fundamental context switch and value coercion. Direct DOM bindings would shave the ratio, yet for high-frequency mutation, batching across the boundary remains faster than crossing per element.
So is WebAssembly ever the right tool for UI?
Yes, for the compute behind the UI. Run layout math, parsing, image processing, or physics in Wasm, then emit one batched result that JavaScript renders. Keep the DOM writes in JS.
Why is canvas rendering listed as a Wasm win when the DOM is not?
Canvas and WebGL take a memory buffer, not per-node API calls. The module writes pixels into linear memory
and JS uploads the whole buffer once, so there is no per-element boundary crossing to pay.
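A rough sketch of that hand-off for a 2D canvas, under stated assumptions: `FRAME_PTR`, `WIDTH`, `HEIGHT`, and `present` are hypothetical, and the fill loop is a JavaScript stand-in for the pixel writes a module would do in linear memory. `present` is defined but not called here, since it needs a browser canvas context.

```javascript
const WIDTH = 4;
const HEIGHT = 2;
const FRAME_PTR = 0; // assumed location of the frame buffer in linear memory

const memory = new WebAssembly.Memory({ initial: 1 });

// RGBA view over the frame region: 4 bytes per pixel.
const frame = new Uint8ClampedArray(memory.buffer, FRAME_PTR, WIDTH * HEIGHT * 4);

// Stand-in for the module's render loop: fill every pixel with opaque red.
for (let p = 0; p < frame.length; p += 4) {
  frame[p] = 255;     // R
  frame[p + 1] = 0;   // G
  frame[p + 2] = 0;   // B
  frame[p + 3] = 255; // A
}

// Browser-only: one call uploads the whole frame, however many pixels it holds.
function present(ctx) {
  ctx.putImageData(new ImageData(frame, WIDTH, HEIGHT), 0, 0);
}

console.log(frame.length); // 32 bytes = 8 pixels
```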
Related
- Browser sandbox & security boundaries — why DOM access is gated through imports in the first place.
- Measuring Wasm vs JavaScript throughput — a reproducible harness for compute-bound comparisons.
- Security implications of Wasm in enterprise apps — the wider hardening picture for production modules.