I've spent this past summer working on the Sorbet type checker for Ruby at the behest of Stripe's Ruby Infrastructure team. Both Sorbet and Ruby in general have been on my periphery for a long time, but this is the first time I've had the opportunity to explore either in any level of detail.
While I've written my fair share of mypy-annotated Python, this summer marks my first major excursion into using a real gradual type system. To my surprise, writing typed Ruby by way of Sorbet is quite pleasant, despite the absolutely egregious syntax (much to my own horror, I've also become a C++ programmer).
One problem uniquely faced by gradual systems is that it's important for idiomatic code in the host language to also be correct in the typed language as much as possible. While this is usually fine, it can often lead to some awkward results for people used to either fully-static or fully-dynamic languages.
All this is to say, today I want to talk about T.nilable, a type with a five, maybe even six digit number of occurrences in Stripe's codebase.
T.nilable at a glance
The Sorbet docs for nilable types give a great summary, but the tl;dr is that a T.nilable(Typ) can be either a value of type Typ or, well, nil.
T.nilable (and similar types, like Python's Optional, or even C++'s std::optional) are supposedly implementations of the Maybe/option monad of functional programming fame. And it's true that, in a lot of typical cases, it works just as well. But T.nilable differs from a proper optional in a few subtle, but key ways.
What is the difference?
A complaint I see made not-infrequently about languages with proper options is the necessity of writing the word Some to lift a value from type T to Option[T]. In Scala, if you try writing this:
def foo(s: String): Option[String] = s
you'll be greeted with an error message informing you that you've passed a value of type String where an Option[String] was expected.
In the other languages I mentioned (Ruby, Python, C++), this "just works". If you need a value of type T.nilable(String), and you pass a regular String, Sorbet accepts this with no complaints (sorbet.run).
If you're not already familiar with an ML-style option, what we're observing is that Option[T] might contain a value of type T, whereas a T.nilable(U) might be a value of type U.
This "is-a" vs "has-a" distinction matters more than just needing to type the word Some. Consider the type Bool. It has two possible values, true and false. Then the type Option[Bool] has three possible values:
Some(true)
Some(false)
None
Option[Option[Bool]] has four:
Some(Some(true))
Some(Some(false))
Some(None)
None
and so on. With each added Option, we add an extra possibility to the set of possible values. You can guess why, in academic tradition, the type Option[T] is sometimes notated as \(\tau + 1\) (see these old lecture notes for a more thorough treatment of what that + means).
Now, let's consider what happens in the T.nilable case:
- T::Boolean once again has the two values, true and false.
- T.nilable(T::Boolean) can be true, false, or nil.
- What about T.nilable(T.nilable(T::Boolean))?
Well, a value of type T.nilable(U) is a value of type U or nil, so... hey, wait, nil is already a member of T.nilable(T::Boolean)!
What we've just shown is that, as a function on types, T.nilable is idempotent - T.nilable(T.nilable(U)) is the same type as T.nilable(U). Not isomorphic to, not bijected to, but exactly the same type, as we can see here, where a function taking and returning T.nilable(T.nilable(Integer)) is accepted in the same place as one taking and returning T.nilable(Integer) [1].
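The counting argument can be checked mechanically. Below is a minimal sketch in plain Ruby (no Sorbet needed), assuming a hypothetical `Some`/`None` encoding of a proper option; neither name is part of Ruby or Sorbet. It enumerates the inhabitants of the nested types and shows that wrapping in an option always adds a value, while adding `nil` a second time adds nothing.

```ruby
# Hypothetical encoding of a "has-a" option, for illustration only.
Some = Struct.new(:value)
None = Object.new.freeze

# All inhabitants of Option[T], given the inhabitants of T.
def option_values(inner_values)
  inner_values.map { |v| Some.new(v) } + [None]
end

# All inhabitants of a nilable T: nil is added only if not already present.
def nilable_values(inner_values)
  (inner_values + [nil]).uniq
end

bools = [true, false]

opt_bools     = option_values(bools)      # Some(true), Some(false), None
opt_opt_bools = option_values(opt_bools)  # Some(None) is a new, distinct value

nilable_bools         = nilable_values(bools)
nilable_nilable_bools = nilable_values(nilable_bools)

puts opt_bools.size              # 3
puts opt_opt_bools.size          # 4
puts nilable_bools.size          # 3
puts nilable_nilable_bools.size  # 3
```

The second application of `nilable_values` is a no-op, which is the idempotence described above expressed at the value level.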
Of course, how often does this really come up?
Here's a (stripped-down) situation that I've actually dealt with "in the wild".
Suppose we have some JSON data we'd like to process. The exact minutiae of deserializing JSON into a language-native datatype (especially in a typed language) are fiddly and domain-specific, but let's assume we have some schema like this:
{ age: int
, address: string | null
}
which maps nicely to the Ruby class
class MyData
  sig { returns(Integer) }
  attr_reader :age
  sig { returns(T.nilable(String)) }
  attr_reader :address
end
These records will be stored in a database, indexed by name (another string), with a lookup function like
sig { params(name: String).returns(T.nilable(MyData)) }
def lookup(name)
  # ...
end
The problem statement is as follows: Given a list of names, retrieve the corresponding list of (possibly missing) addresses, or :not_found if no record with the given name exists. In other words, write a function with the sig
params(names: T::Array[String])
  .returns(T::Array[T.any(T.nilable(String), NotFound)]) # or Symbol
In Haskell, this is easy:
data AddressResult = Found (Maybe String) | NotFound
fetchAddresses :: [String] -> [AddressResult]
fetchAddresses =
  map (maybe NotFound Found . fmap address . lookup)
Mapping lookup over the names gives us a list of the form [Just p1, Nothing, Just p2, ...], from which fmap address then extracts an address from every Just, giving us something like [Just (Just "An Address"), Nothing, Just Nothing]. Finally, we flatten the outer Maybe with maybe NotFound Found, giving us [Found (Just "An Address"), NotFound, Found Nothing].
If we try to write the same logic in Ruby, however...
def fetch_addresses_bad(names)
  # `&.` here is `map` for `T.nilable`
  names.map { |x| lookup(x)&.address || :not_found }
end
you'll find that you get :not_found for valid records with no address! This is because both lookup and address give the same nil - we've lost the ability to distinguish between "not present" and "present, but no address"!
To fix this, we need to check for the "not present" case explicitly:
def fetch_addresses(names)
  names.map do |x|
    record = lookup(x)
    # Only a missing record maps to :not_found; a nil address passes through.
    record.nil? ? :not_found : record.address
  end
end
If you've ever heard someone say that some pattern doesn't compose, this is what they mean: instead of simply chaining lookup and address, we needed to step in and massage the intermediate result.
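To see the difference concretely, here is a small runnable sketch in plain Ruby. The `MyData` struct, the `RECORDS` hash, and the sample names are hypothetical stand-ins for the real class and database; the sigs are omitted so it runs without Sorbet.

```ruby
# Hypothetical stand-ins for the post's MyData class and database lookup.
MyData = Struct.new(:age, :address)

RECORDS = {
  "alice" => MyData.new(30, "1 Main St"),
  "bob"   => MyData.new(41, nil) # record exists, but has no address
}.freeze

def lookup(name)
  RECORDS[name] # nil when no record with this name exists
end

# Chained version: a nil record and a nil address collapse into one case.
def fetch_addresses_bad(names)
  names.map { |x| lookup(x)&.address || :not_found }
end

# Explicit version: only a missing record becomes :not_found.
def fetch_addresses(names)
  names.map do |x|
    record = lookup(x)
    record.nil? ? :not_found : record.address
  end
end

names = %w[alice bob carol]
p fetch_addresses_bad(names) # => ["1 Main St", :not_found, :not_found]
p fetch_addresses(names)     # => ["1 Main St", nil, :not_found]
```

Only the second version reports "bob" as present with no address; the first cannot tell "bob" and "carol" apart.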
T.nilable specifically
Regardless of whether your Option equivalent is a sum or a union, at some point you need to determine whether x is a "real" value or not. In languages with sum typing, this is done via pattern matching. In Python, we have if x is None, C++ has x.has_value(), and so on. What about Ruby?
Well, the official documentation opens with this snippet:
extend T::Sig
sig { params(x: T.nilable(String)).void }
def foo(x)
  if x
    puts "x was not nil! Got: #{x}"
  else
    puts "x was nil"
  end
end
which seems reasonable enough - unlike Python, Ruby has only two falsy values: false and nil.
Hm... false. What happens if we do this?
extend T::Sig
sig { params(x: T.nilable(T::Boolean)).void }
def foo(x)
  if x
    puts "was x nil?"
  end
end
Turns out, Sorbet is smart enough to account for this case. If we T.reveal_type(x) within the if block, we get TrueClass, not T::Boolean. But that means that the pattern in the docs isn't universal: a false argument skips the if branch just as nil does!
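The runtime behaviour is easy to reproduce without Sorbet. The following sketch (the method names are made up for illustration) compares a truthiness check against an explicit nil check for each value a nilable boolean can take.

```ruby
# Truthiness check, as in the docs' pattern: false and nil are both falsy.
def describe_with_truthiness(x)
  x ? "got a value" : "got nil"
end

# Explicit nil check: only nil itself lands in the "nil" branch.
def describe_with_nil_check(x)
  x.nil? ? "got nil" : "got a value"
end

[true, false, nil].each do |x|
  puts "#{x.inspect}: truthiness says '#{describe_with_truthiness(x)}', " \
       "nil check says '#{describe_with_nil_check(x)}'"
end
```

The two methods agree on true and nil and disagree only on false, which the truthiness check reports as nil.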
When I brought this up to the Sorbet team, the consensus seemed to be that the docs don't advocate for nil-checking in this particular way, only that this is one pattern for doing so. I personally think the site should at least mention it, but have not had the time to write anything up myself, so whatever.
In the big picture, this is an extremely minor edge-case - unlike Python, Ruby doesn't let you define your own falsy classes, so this literally only affects the type T.nilable(T::Boolean).
But it does have consequences for generics. Consider this implementation of the Peekable wrapper over an iterable:
class Peekable
  extend T::Sig
  extend T::Generic
  Elem = type_member(:out)
  # Contains the topmost `Elem` if peeked, `nil` otherwise.
  sig { returns(T.nilable(Elem)) }
  attr_reader :top
  sig { returns(T::Enumerator[Elem]) }
  attr_reader :inner
  sig { returns(T.nilable(Elem)) }
  def peek
    begin; inner.next
    rescue StopIteration; nil
    end
  end
  # sig elided, see
  # https://sorbet.org/docs/stdlib-generics#implementing-the-enumerable-interface
  def each(&blk)
    # ...
  end
end
How should we implement each? We might try something like this:
def each(&blk)
  loop do
    if self.top
      result = T.must(self.top)
      @top = nil
      yield result
    else
      yield inner.next
    end
  end
end
But wait, what if Elem = T::Boolean? If we peek at a stream and see false, then the if self.top branch isn't taken! Instead, we'll have to check for nil explicitly, just like before.
But wait, Ruby has one more surprise. If you actually try it, you'll notice that nil? isn't actually defined. That's because Sorbet doesn't assume that arbitrary type variables inherit from Object, where nil? is defined. We need to add the upper bound on Elem:
Elem = type_member(:out) { {upper: Object} }
But it gets worse. Did you know you can override nil?? You shouldn't do this, obviously, but defensive programming is good programming, and we should behave properly even if one of our consumers is being naughty.
What about self.top == nil or self.top.equal?(nil)? Same problem: both dispatch on self.top, so both can be overridden. Ruby lacks an equivalent of Python's is operator, which always compares object identity (so x is None always gives the right answer). For that matter, === is also overridable, though case self.top; when nil calls nil === self.top, with nil as the receiver, so an override on the element's class is not consulted there.
The actual foolproof thing to do is if nil == self.top, which uses the equals operator from NilClass. A bit anticlimactic of a solution, but it is yet one more papercut [2].
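A quick way to convince yourself is to build a deliberately badly behaved class. In the sketch below, `Naughty` is a hypothetical class that overrides the methods discussed above; the only check it cannot influence is the one where nil is the receiver.

```ruby
# Hypothetical misbehaving class: claims to be nil by every means it controls.
class Naughty
  def nil?
    true
  end

  def ==(_other)
    true
  end

  def equal?(_other)
    true
  end
end

x = Naughty.new

puts x.nil?         # true  (overridden)
puts x == nil       # true  (overridden)
puts x.equal?(nil)  # true  (overridden)
puts nil == x       # false (dispatches to NilClass, not Naughty)
puts nil == nil     # true
```

Putting nil on the left-hand side means the comparison is resolved by NilClass, so nothing defined on the element type gets a say. This still assumes nobody has monkey-patched NilClass itself.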
Closing Thoughts
Let me say explicitly that I don't think that what Ruby does is "wrong", or that sum types are inherently "better". The warts described in this post are minor annoyances at worst. Honestly, having to wrap/unwrap Some is at least as irritating as any of the "problems" described here!
One could envision a language that has the best of both worlds, and automatically inserts Some anywhere an "expected Option[T], found T" error would otherwise be raised. My gut says that this would be silky-smooth to use in the happy cases, but would be deeply unpleasant if inference failed for any reason. Scala's implicits offer a stronger form of this (allowing for implicit conversion of any A to B, so long as a specially-marked conversion function is in scope), and they are widely considered to be an anti-feature. Maybe it would be tolerable when limited to Option, but in my experience this kind of non-local inference always ends up having frustrating pathological cases, no matter how benign it seems.
The initial conception of this post from six months ago (oops...) was a rant devoted to union types in general, and how anybody who thinks they're just as expressive as sum types has never done anything complex with either. The idea languished for a while due to a lack of motivation, until a conversation at work about Ruby vs Scala helped me re-focus. Many of the concepts from that post (is-a vs has-a, the impossibility of nesting) made it here, in a form that was supposed to be less antagonistic (though, reading back over this now, I suspect I've missed the mark). Maybe I'll still write that post someday, if I get sufficiently mad (as a teaser: What's the difference between List[A] | List[B] and List[A | B]?).
[1] For you OOP aficionados out there, what this actually shows is that T.nilable(T.nilable(U)) is both a sub- and supertype of T.nilable(U), as it's accepted in both covariant (positive) and contravariant (negative) positions. In a language where subtyping forms a proper lattice, this implies that the two types are the same. Unfortunately, Ruby's class hierarchy gets a bit tangled when it comes to the "top" types of BasicObject, Class and Module, which can form a cycle. The original conclusion (that T.nilable(T.nilable(U)) and T.nilable(U) are the same type) still holds, but if we were writing an academic paper, we'd need to do a bit more work to show why the cycle doesn't matter.
[2] If you've learned the lesson from the previous section, you might notice that Elem = T.nilable(U) will also behave incorrectly! The actual correct implementation either maintains a separate @peeked field or uses sealed classes to make a proper Option type.
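The separate-flag approach can be sketched in plain Ruby as follows. This is one possible shape under stated assumptions, not the implementation from the post: sigs are omitted so it runs without Sorbet, the constructor is invented for the example, and `inner` is assumed to be an Enumerator. Because "have we peeked?" lives in `@peeked` rather than in the value of `@top`, elements that are false or nil are buffered and replayed correctly.

```ruby
# Sketch: track the "peeked" state in its own flag instead of inferring it
# from whether @top is nil or falsy.
class Peekable
  include Enumerable

  def initialize(inner)
    @inner = inner   # assumed to be an Enumerator
    @peeked = false  # true while @top holds a buffered element
    @top = nil
  end

  # Returns the next element without consuming it. Returns nil at the end of
  # the stream, which callers cannot distinguish from a nil element.
  def peek
    unless @peeked
      @top = @inner.next
      @peeked = true
    end
    @top
  rescue StopIteration
    nil
  end

  def each
    return enum_for(:each) unless block_given?

    # Kernel#loop stops cleanly when @inner.next raises StopIteration.
    loop do
      if @peeked
        result = @top
        @top = nil
        @peeked = false
        yield result
      else
        yield @inner.next
      end
    end
  end
end

stream = Peekable.new([false, nil, true].each)
p stream.peek # => false
p stream.to_a # => [false, nil, true]
```

The remaining ambiguity in `peek` (nil element versus end of stream) is the same union-type problem again; a proper Option return type would remove it.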