The Lone Ranges
Hemlock is a mostly functional programming language in which tail recursion is ubiquitous, so it
should come as no surprise that Hemlock has no while
loop construct. Perhaps a bit surprising though is that
it doesn’t even have a for loop construct like
OCaml’s for ... {to,downto} ... do ...
done
. We have instead been using ranges, the
subject of a previous blog post. Unfortunately there was a
rough edge in the original design, but there is a fix! Read on for details.
The problem
It can be useful to work with half-open or full-open ranges, depending on context, and it is not
possible to represent all relevant ranges with only one kind of range. For example, [0..0)
is an
empty half-open range, which is impossible to represent as a full-open range — [0..0]
contains the
value 0
. Conversely, [0..max_value]
contains all values, which cannot be represented as a
half-open range. It’s also important to note that the length of the maximal full-open range is one
more than max_value
, i.e. it overflows the relevant integer type, uns
in this example.
We initially decided to solve the half-open vs full-open range problem by making them distinct
types. That is, for every type in Hemlock’s integer numerical
tower, there was a corresponding
half-open range type and a full-open range type. For example, for the Uns.t
type (aka uns
),
there was RangeH.Uns.t
and RangeF.Uns.t
. But this posed a problem in that functions like
Array.init
could only support half-open ranges. That’s okay most of the time, but sometimes a
range is more naturally expressed as full-open.
# The easy way to create an array of uns values.
a = RangeH.Uns.to_array (0..5)
# [|0; 1; 2; 3; 4|]
# Equivalent to the above.
a' = Array.init (0..5) ~f:id
# A natural use of full-open ranges.
b = RangeF.Uns.to_array (1..=5)
# [|1; 2; 3; 4; 5|]
# Non-obvious simulation of full-open ranges.
b' = Array.init (1..(succ 5)) ~f:id
b'' = Array.init (1..6) ~f:id
Another problem with the ..
and ..=
operators in the original design is that they don’t fit well
with pattern matching. At the time we had some plans for supporting pure function application in
patterns rather than supporting full-open ranges (e.g. 'a'..='z'
), but that pattern matching
design turned out to be be ill-defined. And in contrast to most other code, full-open ranges are the
better default for patterns. The need for ..=
rather than ..
in pattern match syntax would have
been a clear POLA violation, which
left Hemlock with no good default choice for ranges.
The solution
Types
There is no fundamental problem with merging half-open and full-open range types into one type, as
long as the merged type encodes in each value whether the range is half-open or full-open. Where we
previously had past
vs last
limits in separate types we now conceptually have a single variant
type [1]. Furthermore, rather than past
/last
functions, there is a limit
function
that accommodates both range variants.
type t 'a: t a =
| RangeH of {base: a; past: a}
| RangeF of {base: a; last: a}
type limit 'a: limit a =
| Excl of a
| Incl of a
limit 'a: t a -> limit a
type length 'a: length a =
| Overflow
| Length of a
length 'a: t a -> length a
# Halt on overflow. Overflow is impossible for half-open ranges, and many use cases would halt
# anyway due to memory exhaustion long before overflow became an issue.
length_hlt 'a: t a -> a
Obviously it’s a bit more complicated to use limit
than it was to use past
, but the majority of
range-related code does not deal with these functions, rather it creates a range and uses
container-standard iteration functions like fold
and iter
.
Syntax
The ..
/..=
syntax is problematic due to the lack of obvious default range variant. A simple, if
slightly ugly, solution would be to use ..<
/..=
. Indeed, those are the functions that Hemlock
now uses as the basis for range construction, but with a bit of syntactic
sugar things look pretty good! Hemlock will desugar
[base..past)
to (..<) base past
and [base..last]
to (..=) base last
.
# The easy way to create an array of uns values.
a = Range.Uns.to_array [0..5)
# [|0; 1; 2; 3; 4|]
# Equivalent to the above.
a' = Array.init [0..5) ~f:id
# A natural use of full-open ranges.
b = Range.Uns.to_array [1..5]
# [|1; 2; 3; 4; 5|]
# Equivalent to the above.
b' = Array.init [1..5] ~f:id
And only the sugared syntax is supported in pattern matching.
is_cident_cp = function
| '_'
| ['A'..'Z']
| ['a'..'z']
| ['0'..'9'] -> true
| _ -> false
Footnotes
-
The range types are actually monomorphized via functor generation in the current implementation, but the concepts hold in the simplified polymorphic presentation. ↩