Thoughts on Sorbet's T.nilable

I've spent this past summer working on the Sorbet type checker for Ruby at the behest of Stripe's Ruby Infrastructure team. Both Sorbet and Ruby in general have been on my periphery for a long time, but this is the first time I've had the opportunity to explore either in any level of detail.

While I've written my fair share of mypy-annotated Python, this summer marks my first major excursion into using a real gradual type system. To my surprise, writing typed Ruby by way of Sorbet is quite pleasant, despite the absolutely egregious syntax (much to my own horror, I've also become a C++ programmer).

One problem uniquely faced by gradual systems is that, as much as possible, idiomatic code in the host language should also be correct in the typed language. While this usually works out, it can lead to some awkward results for people used to either fully-static or fully-dynamic languages.

All this is to say, today I want to talk about T.nilable, a type with a five, maybe even six digit number of occurrences in Stripe's codebase.

`T.nilable` at a glance

The Sorbet docs for nilable types give a great summary, but the tl;dr is that a T.nilable(Typ) can be either a value of type Typ or, well, nil.

T.nilable (and similar types, like Python's Optional, or even C++'s std::optional) is supposedly an implementation of the Maybe/option monad of functional programming fame. And it's true that, in a lot of typical cases, it works just as well. But T.nilable differs from a proper optional in a few subtle, but key ways.

What is the difference?

A complaint I see made not-infrequently about languages with proper options is the necessity of writing the word Some to lift a value from type T to Option[T]. In Scala, if you try writing this:

def foo(s: String): Option[String] = s

you'll be greeted with an error message informing you that you've passed a value of type String where an Option[String] was expected.

In the other languages I mentioned (Ruby, Python, C++), this "just works". If you need a value of type T.nilable(String), and you pass a regular String, Sorbet accepts this with no complaints (sorbet.run).

If you're not already familiar with an ML-style option, what we're observing is that Option[T] might contain a value of type T, whereas a T.nilable(U) might be a value of type U.

This "is-a" vs "has-a" distinction matters more than just needing to type the word Some. Consider the type Bool. It has two possible values, true and false. Then the type Option[Bool] has three possible values:

Some(true)
Some(false)
None

Option[Option[Bool]] has four:

Some(Some(true))
Some(Some(false))
Some(None)
None

and so on. With each added Option, we add an extra possibility to the set of possible values. You can guess why, in academic tradition, the type Option[T] is sometimes notated as \(\tau + 1\) (see these old lecture notes for a more thorough treatment of what that + means).

Now, let's consider what happens in the T.nilable case:

T::Boolean once again has the two values, true and false.
T.nilable(T::Boolean) can be true, false, or nil.
What about T.nilable(T.nilable(T::Boolean))?

Well, a value of type T.nilable(U) is a value of type U or nil, so... hey, wait, nil is already a member of T.nilable(T::Boolean)!

What we've just shown is that, as a function on types, T.nilable is idempotent - T.nilable(T.nilable(U)) is the same type as T.nilable(U). Not isomorphic to, not bijected to, but exactly the same type, as we can see here, where a function taking and returning T.nilable(T.nilable(Integer)) is accepted in the same place as one taking and returning T.nilable(Integer) ¹.
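To make the counting argument concrete, here is a rough plain-Ruby sketch that models each type as the list of its possible values. The helpers `nilable` and `option` are illustrative stand-ins I've made up for this purpose, not Sorbet APIs, and the `[:some, v]` / `:none` tags are just one hypothetical encoding of a sum type.

```ruby
# Model a type as the array of values that inhabit it.
BOOL = [true, false].freeze

# Union-style: add nil to the set of values.
# If nil is already a member, nothing changes.
def nilable(values)
  (values + [nil]).uniq
end

# Sum-style: tag every existing value, then add one fresh :none.
def option(values)
  values.map { |v| [:some, v] } + [:none]
end

nilable(BOOL).size           # 3: true, false, nil
nilable(nilable(BOOL)).size  # still 3, the second nil collapses into the first
option(BOOL).size            # 3: [:some, true], [:some, false], :none
option(option(BOOL)).size    # 4: the inner :none is wrapped as [:some, :none]
```

The union version stops growing after the first application, while the sum version gains one new value each time.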

Of course, how often does this really come up?

Here's a (stripped-down) situation that I've actually dealt with "in the wild".

Suppose we have some JSON data we'd like to process. The exact minutiae of deserializing JSON into a language-native datatype (especially in a typed language) is fiddly and domain-specific, but let's assume we have some schema like this:

{ age: int
, address: string | null
}

which maps nicely to the Ruby class

class MyData
  sig { returns(Integer) }
  attr_reader :age

  sig { returns(T.nilable(String)) }
  attr_reader :address
end

These records will be stored in a database, indexed by name (another string), with a lookup function like

sig { params(name: String).returns(T.nilable(MyData)) }
def lookup(name)
  # ...
end

The problem statement is as follows: Given a list of names, retrieve the corresponding list of (possibly missing) addresses, or :not_found if no record with the given name exists. In other words, write a function with the sig

params(names: T::Array[String])
  .returns(T::Array[T.any(T.nilable(String), NotFound)]) # or Symbol

In Haskell, this is easy:

data AddressResult = Found (Maybe String) | NotFound

fetchAddresses :: [String] -> [AddressResult]
fetchAddresses =
    map (maybe NotFound Found . fmap address . lookup)

Reading the composition from right to left: lookup turns each name into either Just p or Nothing, so mapping it alone would give a list of the form [Just p1, Nothing, Just p2, ...]. fmap address then extracts the address inside every Just, giving us something like [Just (Just "An Address"), Nothing, Just Nothing]. Finally, maybe NotFound Found collapses the outer Maybe, giving us [Found (Just "An Address"), NotFound, Found Nothing].

If we try to write the same logic in Ruby, however...

def fetch_addresses_bad(names)
  # `&.` here is `map` for `T.nilable`
  names.map { |x| lookup(x)&.address || :not_found }
end

you'll find that you get :not_found for valid records with no address! This is because both lookup and address give the same nil - we've lost the ability to distinguish between "not present" and "present, but no address"!

To fix this, we need to check for the "not present" case explicitly:

def fetch_addresses(names)
  names.map do |x|
    record = lookup(x)
    record.nil? ? :not_found : record.address
  end
end
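For anyone who wants to see the difference at runtime, here is a minimal self-contained sketch. It drops the sigs so it runs without sorbet-runtime, and it swaps the database for a hypothetical in-memory hash (`RECORDS`); the record contents and names are invented for illustration.

```ruby
# Stand-in for the MyData class from above, without Sorbet sigs.
MyData = Struct.new(:age, :address)

# Hypothetical in-memory "database", indexed by name.
RECORDS = {
  "alice" => MyData.new(30, "1 Main St"),
  "bob"   => MyData.new(41, nil) # record exists, but has no address
}.freeze

NAMES = %w[alice bob carol].freeze # "carol" has no record at all

def lookup(name)
  RECORDS[name]
end

# Chained version: both "no record" and "no address" collapse into nil.
def fetch_addresses_bad(names)
  names.map { |x| lookup(x)&.address || :not_found }
end

# Explicit version: check the intermediate result before going further.
def fetch_addresses(names)
  names.map do |x|
    record = lookup(x)
    record.nil? ? :not_found : record.address
  end
end

fetch_addresses_bad(NAMES) # => ["1 Main St", :not_found, :not_found]
fetch_addresses(NAMES)     # => ["1 Main St", nil, :not_found]
```

Only the second version keeps bob's "present, but no address" distinct from carol's "not present".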

If you've ever heard someone say that some pattern doesn't compose, this is what they mean: instead of simply chaining lookup and address, we needed to step in and massage the intermediate result.

`T.nilable` specifically

Regardless of whether your Option equivalent is a sum or a union, at some point you need to determine whether x is a "real" value or not. In languages with sum typing, this is done via pattern matching. In Python, we have if x is None, C++ has x.has_value(), and so on. What about Ruby?

Well, the official documentation opens with this snippet:

extend T::Sig

sig { params(x: T.nilable(String)).void }
def foo(x)
  if x
    puts "x was not nil! Got: #{x}"
  else
    puts "x was nil"
  end
end

which seems reasonable enough - unlike Python, Ruby has only two falsy values: false and nil.

Hm... false. What happens if we do this?

extend T::Sig

sig { params(x: T.nilable(T::Boolean)).void }
def foo(x)
  if x
    puts "was x nil?"
  end
end

Turns out, Sorbet is smart enough to account for this case. If we T.reveal_type(x) within the if block, we get TrueClass, not T::Boolean. But that means that the pattern in the docs isn't universal!
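At runtime, the consequence looks roughly like this. The sketch below is plain Ruby with the sigs removed so it runs on its own; `report` mirrors the truthiness check from the docs' snippet, and `report_explicit` is a hypothetical variant that asks about nil directly.

```ruby
# Truthiness check, as in the docs' snippet.
def report(x)
  if x
    "x was not nil! Got: #{x}"
  else
    "x was nil"
  end
end

# Explicit nil check: false is treated as a real value.
def report_explicit(x)
  if x.nil?
    "x was nil"
  else
    "x was not nil! Got: #{x}"
  end
end

report(true)           # => "x was not nil! Got: true"
report(nil)            # => "x was nil"
report(false)          # => "x was nil" (misleading: x is false, not nil)
report_explicit(false) # => "x was not nil! Got: false"
```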

When I brought this up to the Sorbet team, the consensus seemed to be that the docs don't advocate for nil-checking in this particular way, only that this is one pattern for doing so. I personally think the site should at least mention it, but have not had the time to write anything up myself, so whatever.

In the big picture, this is an extremely minor edge-case - unlike Python, Ruby doesn't let you define your own falsy classes, so this literally only affects the type T.nilable(T::Boolean).

But it does have consequences for generics. Consider this implementation of the Peekable wrapper over an iterable:

class Peekable
  extend T::Sig
  extend T::Generic

  Elem = type_member(:out)

  # Contains the topmost `Elem` if peeked, `nil` otherwise.
  sig { returns(T.nilable(Elem)) }
  attr_reader :top

  sig { returns(T::Enumerator[Elem]) }
  attr_reader :inner

  sig { returns(T.nilable(Elem)) }
  def peek
    begin; inner.next
    rescue StopIteration; nil
    end
  end

  # sig elided, see
  # https://sorbet.org/docs/stdlib-generics#implementing-the-enumerable-interface
  def each(&blk)
    # ...
  end
end

How should we implement each? We might try something like this:

def each(&blk)
  loop do
    if self.top
      result = T.must(self.top)
      @top = nil
      yield result
    else
      yield inner.next
    end
  end
end

But wait, what if Elem = T::Boolean? If we peek at a stream and see false, then the if self.top branch isn't taken! Instead, we'll have to check for nil explicitly, just like before.
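Here is a plain-Ruby sketch of that failure, with no Sorbet sigs or generics. Both class names are made up, and `peek` is simplified to actually store what it sees. For comparison, the second class tracks "have we peeked?" in a separate boolean flag instead of inspecting the stored value, which is one possible way to sidestep the question.

```ruby
# Truthiness-based version: a peeked `false` is silently dropped.
class TruthyPeekable
  include Enumerable

  def initialize(items)
    @inner = items.each # an Enumerator, so #next is available
    @top = nil
  end

  # Simplified: assumes at most one peek between reads.
  def peek
    @top = @inner.next
  rescue StopIteration
    nil
  end

  def each
    loop do # Kernel#loop stops quietly on StopIteration
      if @top
        result = @top
        @top = nil
        yield result
      else
        yield @inner.next
      end
    end
  end
end

# Flag-based version: never asks whether the stored value is truthy or nil.
class FlaggedPeekable
  include Enumerable

  def initialize(items)
    @inner = items.each
    @peeked = false
    @top = nil
  end

  def peek
    unless @peeked
      @top = @inner.next
      @peeked = true
    end
    @top
  rescue StopIteration
    nil
  end

  def each
    loop do
      if @peeked
        @peeked = false
        yield @top
      else
        yield @inner.next
      end
    end
  end
end

lossy = TruthyPeekable.new([false, true])
lossy.peek # => false
lossy.to_a # => [true], the peeked false is gone

safe = FlaggedPeekable.new([false, nil, true])
safe.peek  # => false
safe.to_a  # => [false, nil, true]
```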

But wait, Ruby has one more surprise. If you actually try it, you'll notice that nil? isn't actually defined. That's because Sorbet doesn't assume that arbitrary type variables inherit from Object, where nil? is defined. We need to add the upper bound on Elem:

Elem = type_member(:out) { {upper: Object} }

But it gets worse. Did you know you can override nil?? You shouldn't do this, obviously, but defensive programming is good programming, and we should behave properly even if one of our consumers is being naughty.

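What about self.top == nil or self.top.equal?(nil)? Same problem: both are methods called on self.top, so both can be overridden. Ruby lacks an equivalent of Python's is operator, which always compares object identity and can't be overridden (so x is None always gives the right answer). For that matter, === is also overridable, although case self.top; when nil happens to be safe, since when evaluates nil === self.top with nil as the receiver.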

The actual foolproof thing to do is if nil == self.top, which uses the equals operator from NilClass. A bit anticlimactic of a solution, but it is yet one more papercut².
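A small sketch of why the receiver matters. `Naughty` is a hypothetical, deliberately badly-behaved class; nothing here is specific to Sorbet.

```ruby
# A class that lies about being nil. Don't do this in real code.
class Naughty
  def nil?
    true
  end

  # Claims equality with nil; otherwise falls back to identity.
  def ==(other)
    other.nil? || super
  end
end

NAUGHTY = Naughty.new

# Checks dispatched on the untrusted object can be fooled:
NAUGHTY.nil?   # => true
NAUGHTY == nil # => true

# Checks dispatched on nil itself use NilClass's methods:
nil == NAUGHTY # => false
nil == nil     # => true
```

The only thing that changes between the two groups is which object receives the method call.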

Closing Thoughts

Let me say explicitly that I don't think that what Ruby does is "wrong", or that sum types are inherently "better". The warts described in this post are minor annoyances at worst. Honestly, having to wrap/unwrap Some is at least as irritating as any of the "problems" described here!

One could envision a language that has the best of both worlds, and automatically inserts Some anywhere an "expected Option[T], found T" error would otherwise be raised. My gut says that this would be silky-smooth to use in the happy cases, but would be deeply unpleasant if inference failed for any reason. Scala's implicits offer a stronger form of this (allowing for implicit conversion of any A to B, so long as a specially-marked conversion function is in scope), and they are widely considered to be an anti-feature. Maybe it would be tolerable when limited to Option, but in my experience this kind of non-local inference always ends up having frustrating pathological cases, no matter how benign it seems.

The initial conception of this post from six months ago (oops...) was a rant devoted to union types in general, and how anybody who thinks they're just as expressive as sum types has never done anything complex with either. The idea languished for a while due to a lack of motivation, until a conversation at work about Ruby vs Scala helped me re-focus. Many of the concepts from that post (is-a vs has-a, the impossibility of nesting) made it here, in a form that was supposed to be less antagonistic (though, reading back over this now, I suspect I've missed the mark). Maybe I'll still write that post someday, if I get sufficiently mad (as a teaser: What's the difference between List[A] | List[B] and List[A | B]?).

¹ For you OOP aficionados out there, what this actually shows is that T.nilable(T.nilable(U)) is both a sub- and supertype of T.nilable(U), as it's accepted in both covariant (positive) and contravariant (negative) positions. In a language where subtyping forms a proper lattice, this implies that the two types are the same. Unfortunately, Ruby's class hierarchy gets a bit tangled when it comes to the "top" types of BasicObject, Class and Module, which can form a cycle. The original conclusion (that T.nilable(T.nilable(U)) and T.nilable(U) are the same type) still holds, but if we were writing an academic paper, we'd need to do a bit more work to show why the cycle doesn't matter.
² If you've learned the lesson from the previous section, you might notice that Elem = T.nilable(U) will also behave incorrectly! The actual correct implementation either maintains a separate @peeked field or uses sealed classes to make a proper Option type.
