The WebAssembly Memory Dance
Passing arrays between JavaScript and C++ in WebAssembly is about pointers. Not objects. Not types. Raw memory addresses.
This took longer to understand than it should have.
The Problem
We have vector network data in JavaScript:
const vertices = new Float32Array([0, 0, 100, 0, 100, 100, 0, 100]);
const segments = new Uint16Array([0, 1, 1, 2, 2, 3, 3, 0]);
Need to pass it to C++ for processing:
void CfVectorNetwork::addSegments(
    const skia_private::TArray<uint16_t>& fromTo,
    const skia_private::TArray<float>& tangents) {
  // Process segments...
}
Can't just call it. JavaScript arrays aren't C++ arrays. They live in different memory spaces.
First Attempt: embind's val::array
Emscripten has embind—a binding system for exposing C++ to JavaScript. It has emscripten::val for passing JavaScript values.
Seemed straightforward:
.function("addSegments", &CfVectorNetwork::addSegments)
Build, run:
Cannot call addSegments due to unbound types: St6vectorItSaItEE
Translation: "I don't know how to convert JavaScript arrays to std::vector<uint16_t>."
Embind can't automatically convert array types across the WASM boundary. Fair enough.
Second Attempt: Manual val Conversion
Maybe convert the emscripten::val to a C++ vector manually?
.function("addSegments", optional_override([](
CfVectorNetwork& vn,
emscripten::val jsArray) {
skia_private::TArray cppArray;
size_t length = jsArray["length"].as();
for (size_t i = 0; i < length; i++) {
cppArray.push_back(jsArray[i].as());
}
vn.addSegments(cppArray, ...);
}))
This worked. For 10 elements.
For 10,000 elements, it was 30× slower than it should have been. Each jsArray[i] call crosses the WASM boundary—a relatively expensive operation. We were making 10,000 boundary crossings to copy one array.
The Solution: Raw Pointers
WebAssembly has a linear memory model. Both JavaScript and C++ can access the same underlying ArrayBuffer. We just need to:
- Allocate memory in the WASM heap
- Copy JavaScript data into that memory
- Pass the raw pointer to C++
- Free the memory when done
Here's the pattern (modules/canvaskit/memory.js:121-147):
function copy1dArray(arr, dest, ptr) {
  if (!arr || !arr.length) return 0; // nullptr
  var bytesPerElement = CanvasKit[dest].BYTES_PER_ELEMENT;
  if (!ptr) {
    ptr = CanvasKit._malloc(arr.length * bytesPerElement);
  }
  // The WASM heap is a uint8_t* - a byte array.
  // CanvasKit.HEAPF32 is a Float32Array view of that memory.
  // To convert a byte pointer to a float pointer, divide by 4.
  CanvasKit[dest].set(arr, ptr / bytesPerElement);
  return ptr;
}
Now the JavaScript side becomes:
CfVectorNetwork.prototype.addSegments = function(fromTo, tangents) {
  var fromToPtr = copy1dArray(fromTo, 'HEAPU16');
  var tangentsPtr = copy1dArray(tangents, 'HEAPF32');
  try {
    this._['_addSegments'](fromToPtr, fromTo.length,
                           tangentsPtr, tangents.length);
  } finally {
    CanvasKit._free(fromToPtr);
    CanvasKit._free(tangentsPtr);
  }
};
And the C++ binding:
.function("_addSegments", optional_override([](
CfVectorNetwork& vn,
WASMPointerU16 fromToPtr, int fromToLength,
WASMPointerF32 tangentsPtr, int tangentsLength) {
// Convert raw pointers to TArray
const uint16_t* fromToData = reinterpret_cast(fromToPtr);
skia_private::TArray fromToArray;
fromToArray.resize(fromToLength);
for (int i = 0; i < fromToLength; i++) {
fromToArray[i] = fromToData[i];
}
// Same for tangents...
return vn.addSegments(fromToArray, tangentsArray);
}), allow_raw_pointers())
Performance: One boundary crossing. One memcpy. 30× faster.
The Pointer Arithmetic
The confusing part: ptr / bytesPerElement
The WASM heap is exposed as multiple typed array views:
- HEAPU8 - Uint8Array (1 byte per element)
- HEAPU16 - Uint16Array (2 bytes per element)
- HEAPF32 - Float32Array (4 bytes per element)
All these views wrap the same underlying ArrayBuffer. But they have different element sizes.
When _malloc returns a pointer, it returns a byte offset into the heap. To use it with HEAPF32, we need to convert byte offset → float offset:
// _malloc returns a byte offset into the heap, e.g. 4096
var ptr = CanvasKit._malloc(16); // 4 floats × 4 bytes
// HEAPF32 is indexed by floats, not bytes.
// The float starting at byte 4096 lives at HEAPF32 index 4096 / 4 = 1024.
CanvasKit.HEAPF32.set(arr, ptr / 4);
Same pointer, different interpretation depending on which view you use.
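A quick way to convince yourself (a minimal sketch, assuming the CanvasKit module is loaded): write a float through HEAPF32, then read the same four bytes back through HEAPU8.
var ptr = CanvasKit._malloc(4);    // room for one float (4 bytes)
CanvasKit.HEAPF32[ptr / 4] = 1.0;  // write through the float view at index ptr / 4
// Every view wraps the same ArrayBuffer:
console.log(CanvasKit.HEAPU8.buffer === CanvasKit.HEAPF32.buffer);  // true
// The same 4 bytes, read through the byte view (little-endian IEEE 754 for 1.0):
console.log(CanvasKit.HEAPU8.slice(ptr, ptr + 4));  // Uint8Array [0, 0, 128, 63]
CanvasKit._free(ptr);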
Memory Management Hell
The tricky part: who frees what?
JavaScript allocates, JavaScript frees:
var ptr = copy1dArray(vertices, 'HEAPF32');
// Use ptr...
CanvasKit._free(ptr);
Unless C++ takes ownership. Usually it doesn't; the C++ side only borrows the pointer, so JavaScript still frees:
SkPath MakePathFromVerbs(WASMPointerU8 verbsPtr, int verbCount) {
  // We read from verbsPtr but don't own it.
  // JavaScript must free it.
}
Or you use a scratch buffer (modules/canvaskit/memory.js:83-112):
// Pre-allocated at startup, never freed
var _scratchColorPtr;
function copyColorToWasm(color) {
  return copy1dArray(color, 'HEAPF32', _scratchColorPtr);
}
// No need to free - reused for all color operations
We have ~10 scratch buffers for common sizes (4 floats, 9 floats, 16 floats). Avoids malloc/free overhead for frequent small allocations.
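The setup for those buffers isn't shown above. A minimal sketch of the idea (the extra names and exact sizes below are assumptions, not the actual memory.js code) is one malloc per buffer while the module initializes:
// Sketch: run once at startup; these blocks are deliberately never freed.
// _scratchMatrixPtr and _scratch4x4Ptr are hypothetical names.
var _scratchMatrixPtr;
var _scratch4x4Ptr;
function initScratchBuffers() {
  _scratchColorPtr  = CanvasKit._malloc(4 * 4);   // 4 floats (RGBA color)
  _scratchMatrixPtr = CanvasKit._malloc(9 * 4);   // 9 floats (3x3 matrix)
  _scratch4x4Ptr    = CanvasKit._malloc(16 * 4);  // 16 floats (4x4 matrix)
}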
The Malloc Optimization
For user convenience, CanvasKit provides Malloc—a way to allocate once, reuse many times:
const mObj = CanvasKit.Malloc(Float32Array, 20);
const arr = mObj.toTypedArray();
// Fill array
for (let i = 0; i < 20; i++) {
  arr[i] = i * 2;
}
// Pass directly to CanvasKit (it detects the _ck marker)
someCanvasKitFunction(arr);
// Eventually...
CanvasKit.Free(mObj);
The _ck property marks it as "already in WASM memory." Our copy functions check for this and skip the copy:
function copy1dArray(arr, dest, ptr) {
  // If already in WASM heap, just return its pointer
  if (arr && arr['_ck']) {
    return arr.byteOffset;
  }
  // Otherwise malloc and copy...
}
This pattern is for advanced use—render loops that pass the same arrays hundreds of times per second. Allocate once, reuse forever.
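As a sketch of that render-loop use (drawPoints below is a stand-in for whatever CanvasKit call actually consumes the array, not a specific API):
// Allocate once, mutate and reuse every frame, free only when the loop retires.
const mObj = CanvasKit.Malloc(Float32Array, 8);  // 4 (x, y) points
const pts = mObj.toTypedArray();
function frame(now) {
  for (let i = 0; i < 4; i++) {
    pts[2 * i]     = 100 * i;                            // x
    pts[2 * i + 1] = 50 + 25 * Math.sin(now / 500 + i);  // y
  }
  drawPoints(pts);  // no copy happens: the _ck marker says it's already in the heap
  requestAnimationFrame(frame);
}
requestAnimationFrame(frame);
// When the loop is done for good: CanvasKit.Free(mObj);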
What We Learned
The WASM boundary isn't about types, it's about bytes:
- JavaScript: Allocate memory, copy data, pass pointer
- C++: Read from pointer, do work, don't free
- JavaScript: Free memory when done
Typed arrays are just different views of the same memory. A pointer at byte 100 is float index 25 (100 / 4).
Memory management is manual. Forget to free, you leak. Free too early, you crash.
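One way to keep that tractable is to fold the allocate/use/free dance into a helper. Here's a sketch (withWasmCopy is a hypothetical name, not part of CanvasKit):
// Sketch of a hypothetical helper that guarantees the free, mirroring the
// try/finally pattern in addSegments above.
function withWasmCopy(arr, heapName, fn) {
  var ptr = copy1dArray(arr, heapName);
  try {
    return fn(ptr, arr.length);
  } finally {
    // Don't free memory the caller owns (arrays made with CanvasKit.Malloc).
    if (!(arr && arr['_ck'])) {
      CanvasKit._free(ptr);
    }
  }
}
// Usage: withWasmCopy(vertices, 'HEAPF32', function(ptr, len) { /* call a binding */ });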
But once you understand it's just pointer arithmetic and manual malloc/free, it's straightforward. Tedious, but straightforward.
And 30× faster than trying to abstract it away with automatic conversions.
Read next: Sticky-Out Caps: When Round Isn't Round - Why Skia's caps extend beyond endpoints and how we fixed it.