An improved guide for compiling WASM with Emscripten and Embind


In a recent project, I had to consume an existing homegrown C++ library in our web application. This is one of the most common use cases for WebAssembly (also known as WASM), but I ran into several obstacles when using C++ and Emscripten/Embind.

This post is for folks who are not familiar with C++ and are using Emscripten and Embind to compile their code to WASM. The Embind documentation is unclear in certain areas, and I hope this post fills some of those gaps.

A working example and the code from this post can be found in this GitHub repository. Let's dive in.

Flags

We will start with the flags that we pass to the Emscripten compiler. You can edit these in the wasm.cmake file.

set_target_properties(${EXECUTABLE_NAME} PROPERTIES LINK_FLAGS "-s ENVIRONMENT=web -s SINGLE_FILE=1 -s MODULARIZE -s 'EXPORT_NAME=DummyCppLibrary' --bind")

ENVIRONMENT

This flag tells Emscripten in which environment our code will run. Supported values are web, webview, worker, node (NodeJS), and shell.

SINGLE_FILE

If you have worked with WebAssembly before, you have probably seen .wasm files as the output of a compilation step. With this flag, no separate .wasm file is produced. Instead, Emscripten embeds the wasm code as a base64 string and outputs a single .js file. This simplifies integration between your app and the wasm code, since you don't have to cater for another file extension (e.g. by configuring your build tools). If you can't ship this file as part of your asset folder, you can also serve it and import it via a CDN.

MODULARIZE and EXPORT_NAME

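MODULARIZE wraps the generated code in a factory function instead of placing it in the global scope, and EXPORT_NAME sets the name of that factory. With the help of these two flags, I can import the WASM module using the ES6 import syntax.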
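Here is a minimal sketch of how that factory can be consumed from TypeScript. It assumes, as the React snippet later in this post does, that calling the factory returns a Promise resolving to the module. The names createWasmLoader and fakeDummyCppLibrary are hypothetical; the fake factory stands in for the generated file so the sketch runs on its own.

```typescript
// Shape of the factory generated with MODULARIZE: calling it returns a
// Promise that resolves to the initialized module (assumption, see above).
type WasmFactory<M> = () => Promise<M>;

// Wrap a factory so that the module is instantiated at most once,
// no matter how many callers ask for it.
function createWasmLoader<M>(factory: WasmFactory<M>): () => Promise<M> {
    let cached: Promise<M> | undefined;
    return () => {
        if (cached === undefined) {
            cached = factory();
        }
        return cached;
    };
}

// Hypothetical stand-in for `import DummyCppLibrary from './dummy-cpp-library'`.
let instantiations = 0;
const fakeDummyCppLibrary: WasmFactory<{ ready: boolean }> = () => {
    instantiations += 1;
    return Promise.resolve({ ready: true });
};

const loadWasm = createWasmLoader(fakeDummyCppLibrary);
const first = loadWasm();
const second = loadWasm(); // reuses the promise created by the first call
```

Caching the promise rather than the resolved module means that callers arriving while instantiation is still in progress share the same pending load.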

We have covered only these flags, but Emscripten has many more, which you can find in the Emscripten documentation.

Converting C++ types with Embind

Embind is a tool that allows us to call C++ functions as if they were regular Javascript code. Since C++ and Javascript are two different worlds, Embind acts as a "bridge" and translates things like Javascript objects to types that are familiar to C++ e.g. structs.

The tricky part is that we need to write these bindings manually. Let's explore some common types and how to map and use them in Javascript code. The complete bindings can be found in the wasm.cpp file.

Structs

Structs and classes can be either mapped as a class or a value_object. Which of these you choose will depend on your use case. Mapping to a Javascript class is preferable when your C++ struct has some logic inside a constructor or exposes methods as a part of this class. In other words, if you want to keep the "class" behavior then map it to a JS class. Let's discuss the second option.

Vector of structs

In our library, we were exposing a function that accepts a vector of C++ structs. This was probably one of the most challenging parts of working with Embind since the documentation only covers simple type mappings. With vectors involved, there is even more complexity in the mix. Let's take it step by step and try to convert this function signature.

typedef std::vector<Item> ItemVector;

// Our C++ function
static long calculateTotal(const ItemVector& items) {
// Some code...
}

The first task we are going to tackle is the struct. Since we will be passing down a vector of our structs, it is easier to bind this struct as a value object.

struct Item;
typedef std::vector<Item> ItemVector;

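struct Item {
    int32_t id;          // fixed-width integer types come from <cstdint>
    std::string name;
    int32_t price;
    ItemStatus status;   // an enum, bound separately (see the Enums section)
};

// Emscripten binding (goes inside an EMSCRIPTEN_BINDINGS block)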
value_object<Item>("Item")
    .field("id", &Item::id)
    .field("price", &Item::price)
    .field("name", &Item::name)
    .field("status", &Item::status);

This allows us to construct a plain JS object like so.

const item = {id: 1, name: 'Dummy Item', price: 123, status: 0}
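Because the object is plain JavaScript, nothing stops us from forgetting a field. The sketch below mirrors the binding in TypeScript and reports missing fields before the object is handed to the WASM module. It assumes that the value_object conversion expects every bound field to be present; ITEM_FIELDS and missingItemFields are hypothetical helpers, not part of Embind.

```typescript
// TypeScript view of the Item value_object. `status` is typed loosely here
// because the enum binding is only covered later in the post.
interface Item {
    id: number;
    name: string;
    price: number;
    status: unknown;
}

// Field names as registered with .field(...) in the C++ binding.
const ITEM_FIELDS = ["id", "name", "price", "status"] as const;

// Returns the bound field names that are absent from a candidate object.
function missingItemFields(candidate: object): string[] {
    return ITEM_FIELDS.filter(field => !(field in candidate));
}

const complete: Item = { id: 1, name: "Dummy Item", price: 123, status: 0 };
const incomplete = { id: 2, name: "Other dummy Item" };

const missingFromComplete = missingItemFields(complete);
const missingFromIncomplete = missingItemFields(incomplete);
```

Declaring the interface also lets the TypeScript compiler catch most of these mistakes at build time; the runtime check is only useful for data that arrives untyped, for example from an API response.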

This covers the Item part, but our function only accepts an ItemVector, i.e. a std::vector<Item>. In the next part, we tackle the vector portion.

C++ vector

To use a vector on the JavaScript side, we first need to bind it via Embind.

register_vector<Item>("ItemVector");

Now, in our TypeScript code, we need to instantiate it and push items to it. This is the part where we connect our previously bound value_object and the vector. The push_back method is part of the vector binding, so it is available out of the box.

const newItems = [
    {id: 1, name: 'Dummy Item', price: 123, status: 0},
    {id: 2, name: 'Other dummy Item', price: 321, status: 0}
];

const itemVector = new wasmModule.ItemVector();
newItems.forEach(item => itemVector.push_back(item));

We have covered the struct and the vector, so the last remaining bit is our function. As you already know by now, to use a C++ function in our TypeScript code, we need to bind it first.

A small tip on the binding - we can "rename" the exposed functions during binding. In the example below, you can see that I'm renaming the original function to getTotal and this is the name I will use in my Typescript code.

function("getTotal", &Order::calculateTotal);

With everything in place, we can now use the full power of our WASM module.

import React, {useEffect, useState} from 'react';
import DummyCppLibrary from './dummy-cpp-library';

// Other React code omitted for readability...

const [wasmModule, setWasmModule] = useState<WasmModule>();

useEffect(() => {
    DummyCppLibrary().then((wasmModule: WasmModule) => setWasmModule(wasmModule));
}, []);

const newItems = [
    {id: 1, name: 'Dummy Item', price: 123, status: 0},
    {id: 2, name: 'Other dummy Item', price: 321, status: 0}
];

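// wasmModule is undefined until the promise above has resolved
if (wasmModule) {
    const itemVector = new wasmModule.ItemVector();
    newItems.forEach(item => itemVector.push_back(item));

    const total = wasmModule.getTotal(itemVector);
    itemVector.delete(); // objects created through Embind must be freed manually
}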
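One thing the snippet above glosses over is cleanup: objects created through Embind live on the C++ side and are not garbage collected, so each one needs a delete() call. A minimal sketch of a helper that builds the vector, runs a callback, and always frees the vector afterwards could look like this. EmbindVector, withVector, FakeItemVector and fakeGetTotal are hypothetical names; the fakes stand in for wasmModule.ItemVector and wasmModule.getTotal so the sketch runs without the WASM module.

```typescript
interface Item {
    id: number;
    name: string;
    price: number;
    status: unknown;
}

// The subset of the register_vector API that this sketch relies on.
interface EmbindVector<T> {
    push_back(value: T): void;
    get(index: number): T | undefined;
    size(): number;
    delete(): void;
}

// Fill a fresh vector, hand it to `use`, and free it even if `use` throws.
function withVector<T, R>(
    VectorClass: new () => EmbindVector<T>,
    values: readonly T[],
    use: (vector: EmbindVector<T>) => R
): R {
    const vector = new VectorClass();
    try {
        values.forEach(value => vector.push_back(value));
        return use(vector);
    } finally {
        vector.delete();
    }
}

// Hypothetical in-memory stand-in for wasmModule.ItemVector.
class FakeItemVector implements EmbindVector<Item> {
    static liveInstances = 0;
    private readonly items: Item[] = [];
    constructor() { FakeItemVector.liveInstances += 1; }
    push_back(value: Item): void { this.items.push(value); }
    get(index: number): Item | undefined { return this.items[index]; }
    size(): number { return this.items.length; }
    delete(): void { FakeItemVector.liveInstances -= 1; }
}

// Hypothetical stand-in for wasmModule.getTotal: sums the prices.
function fakeGetTotal(vector: EmbindVector<Item>): number {
    let sum = 0;
    for (let i = 0; i < vector.size(); i++) {
        const item = vector.get(i);
        if (item !== undefined) sum += item.price;
    }
    return sum;
}

const items: Item[] = [
    { id: 1, name: "Dummy Item", price: 123, status: 0 },
    { id: 2, name: "Other dummy Item", price: 321, status: 0 },
];
const total = withVector(FakeItemVector, items, fakeGetTotal);
```

With the real module, the call would presumably become withVector(wasmModule.ItemVector, items, v => wasmModule.getTotal(v)).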

Now that we have covered the major types and have everything working end-to-end, there are still a few quirks we need to be aware of when working with Embind.

Enums

In our example above, one of the properties of the Item struct was an enum. You might think of them as numbers only, but we still need to bind them and pass them down when calling our C++ code.

enum_<ItemStatus>("ItemStatus")
    .value("ItemStatus_InStock", ItemStatus::ItemStatus_InStock)
    .value("ItemStatus_OutOfStock", ItemStatus::ItemStatus_OutOfStock)
    .value("ItemStatus_Unknown", ItemStatus::ItemStatus_Unknown);

One might think - enums are only numbers so why do I need to do this mapping - but if you try to pass down numbers as enum arguments to C++ code (as I foolishly tried 😉), that is not going to work. It's best if I illustrate with an example. I've exposed a handy function called printEnum that is called in my web app.

wasmModule.printEnum(2); // Prints 0 to the console
wasmModule.printEnum(wasmModule.ItemStatus.ItemStatus_Unknown); // Prints 2 to the console

The worst part is that this code will not throw any error and still prints, even though I'm passing an incorrect type. Well, I guess that's the joy of working with Emscripten.
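Since Embind will not complain, we can at least make TypeScript and a small runtime guard do it. The sketch below assumes that an Embind enum value is an object exposing its number through a value property, and that the three statuses map to 0, 1 and 2 (only the last one is confirmed by the output above). EmbindEnumValue, EnumAwareModule, printEnumChecked and fakeModule are hypothetical names.

```typescript
// Assumed shape of an Embind enum value.
interface EmbindEnumValue {
    readonly value: number;
}

// Typing printEnum this way makes `printEnum(2)` a compile-time error.
interface EnumAwareModule {
    ItemStatus: {
        ItemStatus_InStock: EmbindEnumValue;
        ItemStatus_OutOfStock: EmbindEnumValue;
        ItemStatus_Unknown: EmbindEnumValue;
    };
    printEnum(status: EmbindEnumValue): void;
}

function isEmbindEnumValue(candidate: unknown): candidate is EmbindEnumValue {
    return typeof candidate === "object"
        && candidate !== null
        && typeof (candidate as { value?: unknown }).value === "number";
}

// Runtime guard for values that reach us untyped (e.g. from JSON).
function printEnumChecked(module: EnumAwareModule, status: unknown): void {
    if (!isEmbindEnumValue(status)) {
        throw new TypeError("Expected an ItemStatus enum value, got " + String(status));
    }
    module.printEnum(status);
}

// Hypothetical stand-in for the real module; it records what gets printed.
const printed: number[] = [];
const fakeModule: EnumAwareModule = {
    ItemStatus: {
        ItemStatus_InStock: { value: 0 },    // numeric value assumed
        ItemStatus_OutOfStock: { value: 1 }, // numeric value assumed
        ItemStatus_Unknown: { value: 2 },
    },
    printEnum: status => { printed.push(status.value); },
};

printEnumChecked(fakeModule, fakeModule.ItemStatus.ItemStatus_Unknown);

let rejectedRawNumber = false;
try {
    printEnumChecked(fakeModule, 2); // a raw number is refused instead of silently misread
} catch (e) {
    rejectedRawNumber = e instanceof TypeError;
}
```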

Debugging

The last part of this guide is about debugging. Chrome offers great support for debugging WebAssembly in its developer tools. To make this work, don't forget to pass a debugging flag (for example -g) to Emscripten so that debug information is kept in the build.

If you prefer old-school debugging with print statements, there are two ways to go about this.

The first one is to use printf statements in C++. Don't forget to add a newline \n at the end. Output is buffered until a newline is written, so without it nothing may show up in the console!

printf("Processing %s with price %d \n", items[i].name.c_str(), items[i].price);

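The second option is handy when exceptions happen in our C++ code. These are usually cryptic, because what arrives on the JavaScript side is just a number (a pointer to the exception object). Emscripten provides a debugging function with which you can catch these errors on the TypeScript side and print a readable message to the console.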

std::string getExceptionMessage(intptr_t exceptionPtr) {
  return std::string(reinterpret_cast<std::exception *>(exceptionPtr)->what());
}
// We have to bind it as usual
emscripten::function("getExceptionMessage", &getExceptionMessage);

Now on the web side of things, we can wrap our code in a try/catch block and print it.

try {
    wasmModule.getTotal(itemVector)
} catch (e) {
    console.log(wasmModule.getExceptionMessage(e))
}
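Repeating this try/catch around every call gets old quickly, so it can be factored into a small wrapper. This is a minimal sketch that assumes, as in the snippet above, that the caught value is the numeric exception pointer; other Emscripten versions or settings may surface exceptions differently. ExceptionAwareModule, callWasm and fakeModule are hypothetical names, and the fake stands in for the real module.

```typescript
// The only part of the module this sketch needs.
interface ExceptionAwareModule {
    getExceptionMessage(exceptionPtr: number): string;
}

// Run `call` and convert numeric C++ exception pointers into regular Errors.
function callWasm<R>(module: ExceptionAwareModule, call: () => R): R {
    try {
        return call();
    } catch (e) {
        if (typeof e === "number") {
            throw new Error(module.getExceptionMessage(e));
        }
        throw e; // anything else is passed through unchanged
    }
}

// Hypothetical stand-in for the real module.
const fakeModule: ExceptionAwareModule = {
    getExceptionMessage: ptr => "C++ exception at pointer " + ptr,
};

const okResult = callWasm(fakeModule, () => 42);

let reported = "";
try {
    callWasm(fakeModule, () => { throw 5247032; }); // simulates a thrown pointer
} catch (e) {
    reported = e instanceof Error ? e.message : String(e);
}
```

With the real module, usage would presumably look like callWasm(wasmModule, () => wasmModule.getTotal(itemVector)).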

More details can be found in the Emscripten documentation.

Final thoughts

Working with C++ was a fun experience since I've never used this language before, even though it took me a while to figure out some things. Having worked with WebAssembly before, when I compare Rust/wasm-bindgen with Emscripten, the latter has much more work to do on the tooling.

As we saw in this article, the binding part is tedious, manual, and highly error-prone. I can't even imagine how much work this is on a big C++ codebase. Wasm-bindgen does a lot of heavy lifting here: you only need to add a macro, and the rest is done for you. There are ongoing discussions to do the same in Emscripten, but nothing concrete at the moment.

Either way, I hope this post helps someone. The snippets shared here are mostly stripped down to illustrate the concepts, but a working example in React, along with how to create TypeScript interfaces for your WebAssembly module, can be found in this repo.

If you see any way that this post can be improved, feel free to drop me an email. Thanks for reading! 😀