2009-03-28

Poor man's union types

The original design of the Curl language anticipated support for user-defined "union types", that is, the ability to define an abstract data type representing values that may belong to any of two or more unrelated types. (For an example from another dynamic language, see Dylan's type-union syntax.)

The proposed syntax was:

{one-of T1, T2 [,...]}

In early versions of the language, one-of was defined as a macro that simply resolved to 'any', which is the supertype of all scalar types in Curl, with the intention of implementing a correct version at a later time. Of course, such an implementation did none of the desired compile-time checking and served only as an indication of the programmer's intent. However, when we started looking into implementing this concept for real, it quickly became evident that the effort required was not justified by how infrequently we expected it to be used, so we dropped it from our development schedules and removed the syntactic placeholder.

Nevertheless, there are times when such a type would come in handy. One example can be found in Curl's built-in 'evaluate' function, whose first argument is declared as 'any', but which actually accepts one of 'CurlSource', 'StringInterface', or 'Url'. Because the argument is declared as taking 'any', you can write code that passes an unsupported type and the compiler will not generate an error; an error will not be thrown until the function is actually executed with a bad value. If the function could have been declared as taking a union type, the compiler would be able to detect such errors at compile time.

Fortunately, it turns out that it is possible to define a class type that, while not implementing the full semantics of a union type, still provides the most important feature you want from one: the type checking of assignments. The trick is to define a class with an 'any' field to hold the value, and an implicit constructor for each type in the union:

{define-class public IntOrString
  field public constant value:any
  {constructor public implicit {from-int i:int}
    set self.value = i
  }
  {constructor public implicit {from-String s:String}
    set self.value = s
  }
}

Each implicit constructor supports implicit conversion from its argument type when assigning to a variable or argument of the class type. Since the field is constant and can only be initialized by one of the constructors, you can safely assume that the value is a member of one of the specified types (or null, if 'uninitialized-value-for-type' would return null for one of the types).
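As a rough sketch of how such a class might be used, consider the following. The procedure name 'describe' is made up for illustration, and the code has not been checked against a particular Curl release:

```curl
|| Hypothetical procedure: its argument accepts only values that
|| can be implicitly converted to IntOrString.
{define-proc public {describe x:IntOrString}:String
    || Recover the concrete type of the stored value at run time.
    {type-switch x.value
     case i:int do
        {return {format "int: %d", i}}
     case s:String do
        {return "String: " & s}
     else
        || Not expected to be reached for IntOrString.
        {return "unexpected value"}
    }
}

{describe 42}       || converted through the from-int constructor
{describe "hello"}  || converted through the from-String constructor

|| Expected to be rejected by the compiler, because no implicit
|| constructor of IntOrString takes a Url:
|| {describe {url "http://www.example.com"}}
```

The compile-time check happens at the call site, while the body of the procedure still has to dispatch on the run-time type of the 'any' field. That is the part of a true union type this pattern does not give you.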

It would be tedious to have to define such a class every time you needed to use this pattern, but it is straightforward to define parameterized versions of this class for different numbers of arguments, along with a macro that picks the correct parameterized class based on the number of arguments. This is exactly what I did last week. I added a new ZUZU.LIB.TYPES package to the ZUZU.LIB project, which contains the macros 'One-of' and 'AnyOne-of' along with the associated parameterized value classes. The 'One-of' type represents the value using a field of type '#Object' and can only be used when all of the types in the union are subtypes of 'Object'; 'AnyOne-of' uses an 'any' field and can be used with any types. Here is a small example:


(The example is an embedded Curl applet. To see it, you need to install the Curl RTE.)
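For readers without the RTE, here is a minimal sketch of what a two-type parameterized value class along these lines could look like. The class name 'AnyOneOf2-of' and the constructor names are invented for illustration. This is not the actual ZUZU.LIB.TYPES source, and it has not been compiled:

```curl
|| Hypothetical two-type version of the IntOrString pattern.
|| T1 and T2 are the member types of the union.
{define-class public {AnyOneOf2-of T1:Type, T2:Type}
  || Holds the value; constant, so only a constructor can set it.
  field public constant value:any
  {constructor public implicit {from-first v:T1}
    set self.value = v
  }
  {constructor public implicit {from-second v:T2}
    set self.value = v
  }
}

|| Assumed usage: each assignment goes through an implicit constructor.
let a:{AnyOneOf2-of int, String} = 42
let b:{AnyOneOf2-of int, String} = "hello"
```

A macro such as 'AnyOne-of' would then only need to count its arguments and expand to the class with the matching number of type parameters.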