An old adage in cryptography is that one should never “roll his or her own crypto.” Besides being really hard to get right, it is stress-inducing, time-consuming, and tedious. It quickly becomes a black hole of code review and worry.
This article looks at how we took good C crypto and called it from our Clojure backend and our Clojurescript frontend without having to change a single line of the trusted base.
Before we get started, we should mention that browser crypto is a bad idea if your goal is to fight the Man. It is, on the other hand, a good idea if the goal is to prevent spreading sensitive data across caches and microservices. For example, if we delete a key we could never read (it’s encrypted client-side), it is now rendered inert everywhere in our backend.
In the case of Balboa, we never want to see our users’ data and we love the literature about crypto providing elegant solutions to access control. Also, since Asm.js keeps getting faster and more available, this may pay greater dividends in future browser releases.
The dream: trust once, run anywhere
Our big question was the following: “Is it possible to take something developers already trust and call it as faithfully as possible from Clojure and Clojurescript?”
After some research, we had a gut feeling it would look something like the above. It seemed sensible enough, so we set to work finding suitable algorithms for the experiment.
After exploring some potential issues with Emscripten compilation (timing attacks, memsets being removed silently, etc.), we went looking for a PRF algorithm that would be simple for emcc to compile faithfully. We decided on Skein/Threefish.
Skein’s NIST x86 implementation had no strange assembly instructions to worry about and had already gone through the review process of the SHA-3 competition. In a later post, we will talk about some of the tricks we used for NaCl and scrypt, which will cover tradeoffs around when to write Clojurescript or C.
uint8_t* as lingua franca
The first hurdle we encountered with JNI and Emscripten is what to do with structs. In the case of something like Skein, there is a struct, Skein1024_Ctxt_t, that contains all of the state for the pseudorandom function. This struct must be initialized and passed around for any incremental hashing operations.
In Emscripten, you can only pass simple values such as numbers and strings across the boundary, not structs. Also, in Java, passing objects through the JNI boundary is taxing.
One way to portably share code across the platforms was to memcpy structs back and forth into uint8_t* buffers. Once a struct is represented as a uint8_t* and passed up to Clojure or Clojurescript, as a byte[] or a Uint8Array respectively, it can no longer be sensibly mutated. Fortunately, all modifications happen in C, where the uint8_t* can be memcpy’d back into a struct before operations are done on it.
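The round trip described above can be sketched in a few lines of C. This is only an illustration: toy_ctx_t, toy_ctx_pack, and toy_ctx_unpack are made-up names standing in for a real context such as Skein1024_Ctxt_t and whatever helpers the actual shim uses.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a hash context such as Skein1024_Ctxt_t. */
typedef struct {
    uint64_t chaining[4];  /* running state words */
    uint8_t  buffer[32];   /* input bytes not yet processed */
    uint32_t buffered;     /* number of valid bytes in buffer */
} toy_ctx_t;

/* Size of the flat byte buffer the caller must provide. */
#define TOY_CTX_BYTES sizeof(toy_ctx_t)

/* Flatten the struct into a caller-owned buffer of TOY_CTX_BYTES bytes. */
void toy_ctx_pack(const toy_ctx_t *ctx, uint8_t *out)
{
    assert(ctx != NULL && out != NULL);
    memcpy(out, ctx, TOY_CTX_BYTES);
}

/* Rebuild the struct from bytes previously written by toy_ctx_pack. */
void toy_ctx_unpack(toy_ctx_t *ctx, const uint8_t *in)
{
    assert(ctx != NULL && in != NULL);
    memcpy(ctx, in, TOY_CTX_BYTES);
}
```

The bytes are a raw image of the struct, padding and endianness included, so they are only meaningful to the same compiled build that produced them. That is fine for this use, because the caller treats the buffer as opaque and hands it straight back to C.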
skein_shim.c wraps the initialization and teardown of these uint8_t* into structs so that the API exposed to both Emscripten and JNI is always uint8_t*. For Emscripten, HEAPU8, the heap in the Asm.js virtual machine, is of type Uint8Array. No translation of the sort done in JNI for uint8_t* is required. Instead, a set call is required to bring the buffer that exists outside of Emscripten’s heap into HEAPU8: a far simpler task.
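To make the shape of such a shim concrete, here is a hedged sketch of an API whose every argument is a uint8_t*. It is not the contents of skein_shim.c: the toy_shim_* functions are hypothetical, and a toy FNV-1a hash stands in for Skein so the example stays short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical context; FNV-1a is a stand-in for the real hash state. */
typedef struct {
    uint64_t hash;
    uint64_t bytes_seen;
} toy_ctx_t;

/* Tells the caller how many bytes to allocate for the opaque state. */
size_t toy_shim_state_size(void)
{
    return sizeof(toy_ctx_t);
}

/* Build a fresh context and write it into the caller's buffer. */
void toy_shim_init(uint8_t *state)
{
    toy_ctx_t ctx = { 14695981039346656037ULL, 0 };  /* FNV-1a 64-bit offset basis */
    assert(state != NULL);
    memcpy(state, &ctx, sizeof ctx);
}

/* Unpack the state, absorb len bytes of msg, and pack the state again. */
void toy_shim_update(uint8_t *state, const uint8_t *msg, size_t len)
{
    toy_ctx_t ctx;
    size_t i;
    assert(state != NULL && (msg != NULL || len == 0));
    memcpy(&ctx, state, sizeof ctx);
    for (i = 0; i < len; i++) {
        ctx.hash ^= msg[i];
        ctx.hash *= 1099511628211ULL;  /* FNV-1a 64-bit prime */
    }
    ctx.bytes_seen += len;
    memcpy(state, &ctx, sizeof ctx);
}

/* Write the 8-byte digest, most significant byte first, into out. */
void toy_shim_final(const uint8_t *state, uint8_t *out)
{
    toy_ctx_t ctx;
    int i;
    assert(state != NULL && out != NULL);
    memcpy(&ctx, state, sizeof ctx);
    for (i = 0; i < 8; i++)
        out[i] = (uint8_t)(ctx.hash >> (56 - 8 * i));
}
```

A caller allocates toy_shim_state_size() bytes, calls toy_shim_init once, toy_shim_update any number of times, and toy_shim_final at the end. Because no struct type appears in any signature, the same functions could be bound from JNI with byte arrays or from Emscripten with offsets into the heap.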
Love means never having to OOM
If you want Emscripten code to run quickly, you generally have to set ALLOW_MEMORY_GROWTH=0 at Emscripten compile time, forcing you to work with a finite amount of heap. Calling malloc means calling free. Clojurescript offers some really elegant means for controlling your memory usage with Emscripten.
Like most cool things with Clojure(script), it involves a macro:
HEAPU8, for the data that merits the treatment.
Calling from Clojurescript
Now that the lower-level plumbing is in place, we can look at an example of making a C HMAC function callable from Clojurescript.
Above is nothing more than a simple wrapper for
Manipulating data inside Emscripten requires copying data in and out of its own heap. We used the above methods for bringing Uint8Array back and forth.
For data being passed into the heap, the method is as follows: after malloc’ing space, a slice of the whole HEAPU8 is set.
In order to take data out of Emscripten, data is copied out of the heap, back into a regular Uint8Array, by representing the “memory address” and range as a slice of HEAPU8, and copying it into a new buffer. The cloned array is returned, and the allocation of Emscripten heap is free’d upon returning.
Our goal was to make sure this copying back and forth was negligible with respect to the amount of work being done in Emscripten-space.
And, now, the payoff: We can generate the HMAC for a given message! About time, if you ask me.