The WebAssembly Memory Dance
Passing arrays between JavaScript and C++ in WebAssembly is about pointers. Not objects. Not types. Raw memory addresses.
This took longer to understand than it should have.
The Problem
We have vector network data in JavaScript:
const vertices = new Float32Array([0, 0, 100, 0, 100, 100, 0, 100]);
const segments = new Uint16Array([0, 1, 1, 2, 2, 3, 3, 0]);
Need to pass it to C++ for processing:
void CfVectorNetwork::addSegments(
    const skia_private::TArray<uint16_t>& fromTo,
    const skia_private::TArray<float>& tangents) {
  // Process segments...
}
Can't just call it. JavaScript arrays aren't C++ arrays. They live in different memory spaces.
First Attempt: embind's val::array
Emscripten has embind—a binding system for exposing C++ to JavaScript. It has emscripten::val for passing JavaScript values.
Seemed straightforward:
.function("addSegments", &CfVectorNetwork::addSegments)
Build, run:
Cannot call addSegments due to unbound types: St6vectorItSaItEE
Translation: "I don't know how to convert JavaScript arrays to std::vector<uint16_t>."
Embind can't automatically convert array types across the WASM boundary. Fair enough.
Second Attempt: Manual val Conversion
Maybe convert the emscripten::val to a C++ vector manually?
.function("addSegments", optional_override([](
CfVectorNetwork& vn,
emscripten::val jsArray) {
skia_private::TArray cppArray;
size_t length = jsArray["length"].as();
for (size_t i = 0; i < length; i++) {
cppArray.push_back(jsArray[i].as());
}
vn.addSegments(cppArray, ...);
}))
This worked. For 10 elements.
For 10,000 elements, it was 30× slower than it should have been. Each jsArray[i] call crosses the WASM boundary—a relatively expensive operation. We were making 10,000 boundary crossings to copy one array.
The Solution: Raw Pointers
WebAssembly has a linear memory model. Both JavaScript and C++ can access the same underlying ArrayBuffer. We just need to:
- Allocate memory in the WASM heap
- Copy JavaScript data into that memory
- Pass the raw pointer to C++
- Free the memory when done
Here's the pattern (modules/canvaskit/memory.js:121-147):
function copy1dArray(arr, dest, ptr) {
  if (!arr || !arr.length) return 0; // nullptr
  var bytesPerElement = CanvasKit[dest].BYTES_PER_ELEMENT;
  if (!ptr) {
    ptr = CanvasKit._malloc(arr.length * bytesPerElement);
  }
  // The WASM heap is a uint8_t* - a byte array.
  // CanvasKit.HEAPF32 is a Float32Array view of that memory.
  // To convert a byte pointer to a float pointer, divide by 4.
  CanvasKit[dest].set(arr, ptr / bytesPerElement);
  return ptr;
}
Now the JavaScript side becomes:
CfVectorNetwork.prototype.addSegments = function(fromTo, tangents) {
  var fromToPtr = copy1dArray(fromTo, 'HEAPU16');
  var tangentsPtr = copy1dArray(tangents, 'HEAPF32');
  try {
    this._['_addSegments'](fromToPtr, fromTo.length,
                           tangentsPtr, tangents.length);
  } finally {
    CanvasKit._free(fromToPtr);
    CanvasKit._free(tangentsPtr);
  }
};
And the C++ binding:
.function("_addSegments", optional_override([](
CfVectorNetwork& vn,
WASMPointerU16 fromToPtr, int fromToLength,
WASMPointerF32 tangentsPtr, int tangentsLength) {
// Convert raw pointers to TArray
const uint16_t* fromToData = reinterpret_cast(fromToPtr);
skia_private::TArray fromToArray;
fromToArray.resize(fromToLength);
for (int i = 0; i < fromToLength; i++) {
fromToArray[i] = fromToData[i];
}
// Same for tangents...
return vn.addSegments(fromToArray, tangentsArray);
}), allow_raw_pointers())
Performance: One boundary crossing. One memcpy. 30× faster.
The Pointer Arithmetic
The confusing part: ptr / bytesPerElement
The WASM heap is exposed as multiple typed array views:
- HEAPU8 - Uint8Array (1 byte per element)
- HEAPU16 - Uint16Array (2 bytes per element)
- HEAPF32 - Float32Array (4 bytes per element)
All these views wrap the same underlying ArrayBuffer. But they have different element sizes.
When _malloc returns a pointer, it returns a byte offset into the heap. To use it with HEAPF32, we need to convert byte offset → float offset:
// _malloc returns a byte offset into the heap, e.g. 4096
var ptr = CanvasKit._malloc(16); // 4 floats × 4 bytes
// HEAPF32 is indexed by floats, not bytes.
// The float starting at byte 4096 lives at HEAPF32 index 4096 / 4 = 1024.
CanvasKit.HEAPF32.set(arr, ptr / 4);
Same pointer, different interpretation depending on which view you use.
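A quick way to convince yourself (a minimal sketch, assuming the CanvasKit module is loaded): write a float through HEAPF32, then read the same four bytes back through HEAPU8.
var ptr = CanvasKit._malloc(4);    // room for one float (4 bytes)
CanvasKit.HEAPF32[ptr / 4] = 1.0;  // write through the float view at index ptr / 4
// Every view wraps the same ArrayBuffer:
console.log(CanvasKit.HEAPU8.buffer === CanvasKit.HEAPF32.buffer);  // true
// The same 4 bytes, read through the byte view (little-endian IEEE 754 for 1.0):
console.log(CanvasKit.HEAPU8.slice(ptr, ptr + 4));  // Uint8Array [0, 0, 128, 63]
CanvasKit._free(ptr);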
Memory Management Hell
The tricky part: who frees what?
JavaScript allocates, JavaScript frees:
var ptr = copy1dArray(vertices, 'HEAPF32');
// Use ptr...
CanvasKit._free(ptr);
Unless C++ takes ownership. Usually it doesn't; the C++ side only borrows the pointer, so JavaScript still frees:
SkPath MakePathFromVerbs(WASMPointerU8 verbsPtr, int verbCount) {
  // We read from verbsPtr but don't own it.
  // JavaScript must free it.
}
Or you use a scratch buffer (modules/canvaskit/memory.js:83-112):
// Pre-allocated at startup, never freed
var _scratchColorPtr;
function copyColorToWasm(color) {
  return copy1dArray(color, 'HEAPF32', _scratchColorPtr);
}
// No need to free - reused for all color operations
We have ~10 scratch buffers for common sizes (4 floats, 9 floats, 16 floats). Avoids malloc/free overhead for frequent small allocations.
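The setup for those buffers isn't shown above. A minimal sketch of the idea (the extra names and exact sizes below are assumptions, not the actual memory.js code) is one malloc per buffer while the module initializes:
// Sketch: run once at startup; these blocks are deliberately never freed.
// _scratchMatrixPtr and _scratch4x4Ptr are hypothetical names.
var _scratchMatrixPtr;
var _scratch4x4Ptr;
function initScratchBuffers() {
  _scratchColorPtr  = CanvasKit._malloc(4 * 4);   // 4 floats (RGBA color)
  _scratchMatrixPtr = CanvasKit._malloc(9 * 4);   // 9 floats (3x3 matrix)
  _scratch4x4Ptr    = CanvasKit._malloc(16 * 4);  // 16 floats (4x4 matrix)
}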
The Malloc Optimization
For user convenience, CanvasKit provides Malloc—a way to allocate once, reuse many times:
const mObj = CanvasKit.Malloc(Float32Array, 20);
const arr = mObj.toTypedArray();
// Fill array
for (let i = 0; i < 20; i++) {
  arr[i] = i * 2;
}
// Pass directly to CanvasKit (it detects the _ck marker)
someCanvasKitFunction(arr);
// Eventually...
CanvasKit.Free(mObj);
The _ck property marks it as "already in WASM memory." Our copy functions check for this and skip the copy:
function copy1dArray(arr, dest, ptr) {
  // If already in WASM heap, just return its pointer
  if (arr && arr['_ck']) {
    return arr.byteOffset;
  }
  // Otherwise malloc and copy...
}
This pattern is for advanced use—render loops that pass the same arrays hundreds of times per second. Allocate once, reuse forever.
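As a sketch of that render-loop use (drawPoints below is a stand-in for whatever CanvasKit call actually consumes the array, not a specific API):
// Allocate once, mutate and reuse every frame, free only when the loop retires.
const mObj = CanvasKit.Malloc(Float32Array, 8);  // 4 (x, y) points
const pts = mObj.toTypedArray();
function frame(now) {
  for (let i = 0; i < 4; i++) {
    pts[2 * i]     = 100 * i;                            // x
    pts[2 * i + 1] = 50 + 25 * Math.sin(now / 500 + i);  // y
  }
  drawPoints(pts);  // no copy happens: the _ck marker says it's already in the heap
  requestAnimationFrame(frame);
}
requestAnimationFrame(frame);
// When the loop is done for good: CanvasKit.Free(mObj);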
What We Learned
The WASM boundary isn't about types, it's about bytes:
- JavaScript: Allocate memory, copy data, pass pointer
- C++: Read from pointer, do work, don't free
- JavaScript: Free memory when done
Typed arrays are just different views of the same memory. A pointer at byte 100 is float index 25 (100 / 4).
Memory management is manual. Forget to free, you leak. Free too early, you crash.
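One way to keep that tractable is to fold the allocate/use/free dance into a helper. Here's a sketch (withWasmCopy is a hypothetical name, not part of CanvasKit):
// Sketch of a hypothetical helper that guarantees the free, mirroring the
// try/finally pattern in addSegments above.
function withWasmCopy(arr, heapName, fn) {
  var ptr = copy1dArray(arr, heapName);
  try {
    return fn(ptr, arr.length);
  } finally {
    // Don't free memory the caller owns (arrays made with CanvasKit.Malloc).
    if (!(arr && arr['_ck'])) {
      CanvasKit._free(ptr);
    }
  }
}
// Usage: withWasmCopy(vertices, 'HEAPF32', function(ptr, len) { /* call a binding */ });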
But once you understand it's just pointer arithmetic and manual malloc/free, it's straightforward. Tedious, but straightforward.
And 30× faster than trying to abstract it away with automatic conversions.
Read next: Sticky-Out Caps: When Round Isn't Round - Why Skia's caps extend beyond endpoints and how we fixed it.