BranchTaken

Hemlock language insights

Explicit Signage

Extreme sign post

Integers seem like a simple concept. Start with natural numbers (ℕ⁺), incorporate 0 (ℕ⁰), then add a sign to create integers (ℤ). In theory, integers range from -∞ to ∞, but computers currently lack infinite memory, so we have to compromise in implementation. And compromise we do! Some high-level scripting languages try pretty hard to make integers behave like the mathematical definition, but at great computational cost. Most systems programming languages instead utilize high-performance CPU features to provide some combination of limited-range 2’s complement signed integers and unsigned integers which provide modular arithmetic semantics (i.e. fixed-width wrap-around behavior).

It’s bad enough that signed integers in languages like C and C++ have undefined underflow/overflow semantics. Adding insult to injury, it’s common for languages to implicitly convert between types of differing bitwidths! This can produce some very surprising results because the implicit conversion doesn’t consistently happen soon enough to maximize the fidelity of intermediate values.

For an interesting example of how implicit conversion can go wrong, see the analysis of Stagefright vulnerabilities in this blog post. The blog post presents an alternative solution to integer conversion, and is worth reading in its entirety.

Hemlock takes a rigid approach to integer types that resolves some historical baggage and removes footguns.

I have literally spent decades now fighting the good fight against implicit integer conversion while writing low-level C/C++ code. In Hemlock, all conversions are explicit, and I look forward to a rich set of numerical types that produce exactly the results one would expect when reading the code.