BranchTaken

Hemlock language insights

Work Stealing - Type System

Hemlock’s type system is a critical factor in making some of the language’s best features possible — deterministic, constant-overhead garbage collection, aggressive compile-time optimization, and (today’s topic) work stealing. Essentially, stealable work has the same rules as parallelizable work.

  1. Mutability - The work can not modify state while other, parallel work might observe or modify the same state.
    int arr[42];
    // None of these can be reordered or run in parallel.
    modify_arr0(arr);  // void modify_arr0(int arr[]);
    modify_arr1(arr);  // void modify_arr1(int arr[]);
    observe_arr(arr);  // void observe_arr(const int arr[]);
    
  2. Effects - Work that has side effects are order-dependent and cannot be parallelized since we can’t guarantee that the order of the underlying effects will be preserved.
    const char *contents = "File contents to write and then read.\n";
    char buf[1024];
    
    // This deals with filesystem operations. It's obviously dependent on order of
    // execution.
    fprintf(f, "%s", contents);
    fseek(f, 0, SEEK_SET);
    fread(&buf, sizeof(char), sizeof(buf), f);
    fprintf(stdout, "%s", buf);
    

Fortunately, that’s really it for the rules. Unfortunately, I’m not aware of a single mainstream programming language that can guarantee either of these without extremely deep, brittle compiler introspection of the transitive implementation behind every function call. In cases where the implementation is unavailable at compile time (shared libraries, static libraries without link-time optimization) or the implementation is just unknowable (application of composed functions — functors), it is impossible to determine that such functions can be reordered, parallelized, or stolen. Certainty is an absolute requirement when a compiler attempts to do any of this, and mainstream compilers just can’t do it.

Hemlock easily does better in this regard. Mutability and side effect information is written into all function signatures. We know what inputs are mutable, in which function applications they are mutated, and in which function applications side effects will happen.

Mutability

Starting with mutability. Take the following C function signature.

int foo(char * bar);

The input bar is mutable, but does foo mutate it? We need to check the implementation to find out.

bool baz = false;

size_t foo(char * bar) {
    size_t ret = (size_t) strlen(bar);
    if (baz) {
        baz = false;
        bar[0] = '\0';
    }
    return ret;
}

Yes, it does mutate bar. Even worse, it has the extra surprise mutation of the global variable baz. Ew. Unfortunately, if C implementation can’t be fully introspected, the compiler has to be conservative and assume that something like this is happening.

Here’s a close approximation of what the same function looks like in Hemlock.

# Note that mutable global variables don't exist in Hemlock, but per-actor
# module-level mutable variables do.
# val &baz: bool
let &baz := false

# val foo: !&bytes >mut-> uns
let foo bar =
    let ret = Bytes.length bar
    match baz with
    | false -> ()
    | true ->
        baz := false
        Bytes.set_inplace 0 '\0' bar
    ret

We didn’t need to see the function body to know what is going on. The function signature tells us that the input will be mutated (!&bytes) when the function executes and that additional module-level state will be mutated (>mut->).

We can even track that the input is not mutated during function application, but during a later time as in the application of an encapsulating closure.

val bar: &bytes -> (unit >!&bytes-> uns)

Calling bar by itself does not mutate anything. Calling the returned closure does. Knowing during which function applications mutation occurs provides us valuable information about what can be reordered or parallelized.

let a = Bytes.of_string "I'm a little teapot"
# Reordering or parallelizing via work stealing the
# following 3 calls is valid (though these simple
# functions would likely execute before they can be
# stolen).
let b = bar a
let c = bar a
let d = bar a

# Reordering or parallelizing via work stealing the
# following 3 calls is not valid.
let b = b ()
let c = c ()
let d = d ()

Knowing what will be mutated makes it possible for us to identify work stealing opportunities just by knowing function signatures. It’s extremely easy to implement in the Hemlock compiler and we’re allowed to be very aggressive about it.

Side effects

Some functions don’t mutate any data inside the Hemlock runtime, but have other side effects that must be executed in the correct order or exclusively. Two such side effects are hlt (input may cause the actor to halt) and os (the function interacts with the operating system, and we must assume mutation external to the process).

Using the Hemlock translation of the example in rule #2 at the beginning of this post, we can see how the {hlt|os} set of effects keeps us from reordering or parallelizing function applications that must be run in sequence.

# val write_hlt: _&slice _&bytes -> File.t >{hlt|os}-> unit
# val seek_hd_hlt: uns -> File.t >{hlt|os}-> unit
# val read_into_hlt: !&slice !&bytes -> File.t >{hlt|os}-> unit

let contents = Slice.of_container
  (Bytes.of_string "File contents to write and then read.\n")
write_hlt contents f
seek_hd_hlt 0 f
let buf = Slice.of_container (Bytes.init 1024 ~f:(_ -> '\0'))
read_into_hlt buf f
write_hlt buf File.stdout

Only the read_into_hlt function mutates any data, but every statement must happen in sequence and without being reordered for this code snippet to run as intended.

What exactly can be stolen?

I spent a lot of time showing things that can’t be stolen in this blog post. Here is the punch-line. Any function application with no mutation (e.g. !&t) or side effects (e.g. >os->) can be computed in parallel with any other pure function application that comes before or after it. There is, quite honestly, a lot of work that can be stolen. As demonstrated by the reduce example in the Work Stealing - Intro blog post, there are some pretty fundamental, ubiquitous coding patterns that have great work stealing opportunities. Hemlock is going to find them and take them.