Tuples
Tuples are a grouping of values (components), kind of like C-style
struct
s. In a
struct
, each component is given a unique name. In a tuple, there are no names; a component is identified solely based on its position in the tuple.
// Tuple type
(int, String, bool)
// Sort-of-but-not-quite-equivalent struct type
struct {
int v1
String v2
bool v3
}
Usually, though, you don’t access tuples like you access
struct
s. You use
parallel assignment syntax to compose and decompose tuples from and to individual elements.
typedef (int, String, bool) MyTuple
// Composing a tuple
MyTuple t = (1,"Hello",false)
// Decomposing a tuple
int i
String s
bool b
(i,s,b) = t
assert(i == 1)
assert(s == "Hello")
assert(b == false)
// A language might provide more convenient syntax to decompose
// a tuple into newly declared variables:
(int i, String s, bool b) = t
Function Parameters and Return Values
Does that last line remind you of anything? It looks a lot like a function parameter list in a C-style language. That’s because C-style function parameter lists
are tuples. Some languages (like the functional programming languages Haskell and ML) let you use tuples all over the place. Most C-style languages only allow them as arguments to functions.
One way of looking at it is that a function always takes
exactly one value and returns
exactly one value. If you want to pass in more than one parameter, you have to use a tuple.
Take this C function signature, for example:
void jumpTo(int x, int y)
The
jump
function takes a single argument. That argument is a tuple of two
int
s. Get it? It’s kind of like having:
struct JumpParams {
int x
int y
}
void jump(JumpParams params)
Wouldn’t it be a pain if you were forced to do that for every function that accepts more than one parameter? Yes it would, which is why C uses tuple-like syntax to help you out. But what if you want to
return more than one value? Let’s say the
jump
function also returned two
float
values (which represent the angle and distance that you need to jump to get to the requested location).
Since most C-style languages only let functions return one value, you have to use a workaround.
The most straightforward method is to wrap the return types in a separate struct:
struct JumpReturnVal {
float angle
float distance
}
JumpReturnVal jump(int x, int y)
This is not a good solution because you’re forced to do different things with function parameters and function return values. Why explicitly wrap return values in a separate
struct
if you don’t have to do that for function parameters?
You could also accept pointers as arguments into the function:
void jump(int x, int y, float *angle, float *distance)
// In C#, you can use "out" parameters
void jump(int x, int y, out float angle, out float distance)
The problem with the pointer method is that it’s not as straightforward. The convention is that a function’s output is sent out through the return value mechanism. With pointers and “
out
” parameters, the return values look kinda like
inputs instead of
outputs.
The C++ Standard Template Library provides a generic
pair
type:
pair<int,int> jump(float angle, float distance)
This solution is insufficient because it only works for functions that return two arguments. You could create more generic types for larger argument lists, but it still wont work as smoothly as function argument lists do. Another problem is that the mechanism for multiple parameters is still different from the mechanism for multiple return values.
With tuples, all you have to do is:
(float, float) jump(int x, int y) {
float angle = cos(x/y)
float distance = sqrt(x*x + y*y)
return (angle, distance)
}
int x = 4
int y = 2
(float angle, float radians) = jump(x, y)
If you’re looking at the code above and thinking, “Man…that function signature doesn’t read well,” you’re right. It’s ugly. But that’s just because the C syntax for type definitions doesn’t work well with tuples. If you use another standard style (putting the type
after the identifier), things look a little cleaner:
jump(int x, int y) : (float, float)
{
/* ... */
}
Isn’t that better?
Tuples are a generalization of a concept programmers are already familiar with (through function argument lists). They should be added, swiftly and mercilessly, to every programming languge in existence. C# missed out on a great opportunity to add some value over Java. On the other hand, tuple syntax doesn’t really conflict too badly with the existing language grammar, so maybe there’s still a chance.
FAQ
The difference between tuples and lists
A list (or array) is a sequence of elements of the same type (aka a sequence of homogenous elements). A tuple is a fixed-size group of elements (like a C-style
struct
without explicit field names). The type of a tuple is more complicated (there is one type entry for each element in the tuple).
Here are examples of values and their types, separated by colons. (Parens are used for tuples, square brackets are used for lists.)
(1, 2, 3) : (int, int, int)
[1, 2, 3] : int[]
(1, "String", true, "Goose") : (int, String, boolean, String)
[1, "String", true, "Goose"] : Object[]
The last example may or may not cause a compile-time error, depending on whether the language will automatically use the common supertype of all of the list’s elements.
Singleton tuples
Singleton tuples (tuples with one element) do not make sense.
At this point, a Python user may feel the need to point out GvR’s explanation for why Python has singleton tuples. The difference is that Python tuples aren’t really tuples. They are immutable lists. I think they’re called tuples because they support parallel assignment, but that’s not what makes something a tuple. You need stronger typing to have real tuples.Why don’t singleton tuples make sense? An N-tuple of integers is “N integers”. If you have a 3-tuple of integers, you have “three integers”. If you have a 1-tuple of integer, all you have is “one integer” (which is just an individual element, not at tuple).
Because of this, tuples are slightly different from records (C
struct
s). Records are objects that contain other objects. A tuple is simply a collection of objects “glued” together – no container. I think the main problem is that the surrounding parens
look like a container. If you think of the comma as the symbol that creates tuples (since the parens are just for generic expression grouping), it’s easy to see that singleton tuples don’t make sense. In OCaml, the “
*
” symbol is used to combine types to form tuple types (which is based on the traditional mathematical notation).
(int, int) --> int * int
(int, String, bool) --> int * String * bool
(int) --> int
int --> int
It turns out to be fortunate that singleton tuples don’t make sense because it avoids the ambiguity caused by other uses of parenthesis: grouping of expressions. Otherwise, we’d have to decide how to deal with the expression
(5+6)
. Is the expression’s type
int
or is it the singleton tuple
(int)
?
Empty tuples
Empty tuples don’t make sense either. Many languages, however, have syntatic sugar to make it seem like they exist. They let you write “
()
”, which looks like an empty tuple, but is actually a value with a special type (often called “unit”).
In some languages, the “unit” type is used in places where you might use “void” in C-style languages:
PrintNewline() : ()
C’s equivalent of “unit” is “void” (though C’s weak tuple support makes “void” seem like a special marker for functions that don’t return a value).
Event Multicast
When you combine tuples with generics, many possibilities open up. In C-style languages, tuples can only be used in function parameter lists, so you end up running into problems when you want to interact generically with functions.
One place where this problem appears is when you’re creating event multicast classes.
The worst possible way
In Java, an event multicast class for mouse click events might be implemented like this:
// Listener objects must implement this interface
interface EventListener
{
void EventOccurred(int x, int y)
}
// This is the multicast class that listeners have to register themselves with.
class EventMulticaster
{
List<EventListener> listeners = new List<EventListener>()
// Listeners will call this to register themselves
void AddListener(EventListener l)
{
listeners.add(l)
}
// The event source will call this method.
void EventOccurred(int x, int y)
{
for (EventListener l : listeners) {
l.EventOccurred(x, y)
}
}
}
Whenever the event source calls
EventOccurred
, all the registered listeners will be notified.
(Though C# delegates make event multicast a little easier, they don’t solve the fundamental problem. I’ll stick to plain old methods to keep things simple).
So what’s the problem? The problem is that this class will only work for events that use two
int
parameters to hold the relevant event information. This works fine for mouse clicks, but you have to
duplicate the entire chunk of code for every set of parameters. I know what you’re thinking. You’re thinking that we should use parametrics types (a.k.a. generics) to abstract all that away.
The generic way
// Holds the context information about our event
class MouseEventInfo
{
int x
int y
MouseEventInfo(int x, int y) { this.x = x; this.y = y }
}
interface EventListener<Param>
{
void EventOccurred(Param p)
}
class EventMulticaster<Param>
{
List<EventListener<Param>> listeners = new ArrayList()
void AddListener(EventListener<Param> l)
{
listeners.add(l)
}
void EventOccurred(Param p)
{
for (EventListener<T> l : listeners)
l.EventOccurred(p)
}
}
To use it:
EventMulitcaster<MouseEventInfo> em = getMouseEventMulticaster()
em.EventOccurred(new MouseEventInfo(4, 6))
class MyMouseListener implements EventListener<MouseEventInfo>
{
void EventOccured(MouseEventInfo params) { /* ... */ }
}
This works. But now we have to create a separate struct (like
MouseEventInfo
) for every event type. We need a way to create these on the fly.
The tuple way
Tuples eliminate the need to explicitly define container classes. You can create them on the fly.
EventMulticaster<(int,int)> em = getMouseEventMulticaster()
em.EventOccurred((4, 6))
class MyMouseListener implements EventListener<(int,int)>
{
void EventOccurred((int, int) params)
{
int x
int y
(x, y) = params
}
// A language could allow special syntax in the argument list
// to unwrap the tuple automatically:
void EventOccurred((int x, int y)) { /* ... */ }
}
Note that we haven’t yet added any new features to the Java/C# style generics model. We’ve added tuples to the language (but other Java-style languages already have this;
Nice,
Nemerle). Those languages (I think) can already support the above implementation. To add this to C# or Java, you don’t have to change the generics semantics.
What? You don’t like the extra set of parentheses? Hmm…they are kinda ugly. We need a way to “inline” a tuple into its container.
Inlining Tuples
I think that a “
..
” suffix is intuitive syntax for inlining. I think that either Perl, Python or Ruby uses a prefix “
*
” operator to accomplish a similar thing.
class EventListener<Param>
{
void EventOccurred(Param..)
}
class EventMulticaster<Param>
{
List<EventListener<Param>> listeners = new List<EventListener<Param>>()
void AddListener(EventListener<Param> l)
{
listeners.add(l)
}
void EventOccurred(Param.. p)
{
for (EventListener<T> l : listeners)
l.EventOccurred(p..)
}
}
Now you can write:
EventMulticaster<(int,int)> em = getMouseEventMulticaster()
em.EventOccurred(4, 6)
class MyMouseListener implements EventListener<(int,int)>
{
void EventOccurred(int x, int y) { /* ... */ }
}
In this situation, the inlining is just syntatic sugar, but I think it’s worth it.
Inlining specifics
Generic type parameter list
If a generic type parameter declaration has a “
..
” suffix, then you don’t need parenthesis when instantiating it. For example, we could have defined the
EventListener
class like this:
class EventListener<Param..> {
void EventOccurred(Param..)
}
// Now, to instantiate for an event with an (int, int) signature, we
// don't need the parens.
EventListener<int, int>
No big deal, just a little convenience.
To avoid ambiguity, you can only have one inlined type parameter in every parameter list.
Parameter lists
A function parameter can also be inlined. This feature was used in the event multicast example.
// Totally trivial use of inlining:
void GotoXY((int, int).. location) {
/* ... */
}
// Equivalent to:
void GotoXY(int x, int y) {
(int, int) location = (x, y)
/* ... */
}
// Inlining is more useful with generic type parameters.
X = (int, int)
void EventOccurred(X..)
void EventOccurred(int, int)
DoSomething(int i, String s, boolean b) { /* ... */ }
DoSomething(123, "Hello", true)
var temp = (123, "Hello")
DoSomething(temp.., true)
var temp = (123, "Hello", true)
DoSomething(temp..)
Value expressions
The event multicast example also used inlining to expand a tuple value.
void EventOccurred(int x, int y)
// Invocations
EventOccurred(1, 2)
EventOccurred((1, 2)..)
(int, int) location = (1, 2)
EventOccurred(location..)
Other uses
This same syntax can be used to implement C#’s “
delegate
” feature cleanly. (That is, ignoring “
out
” parameters, which are stupid and should be shot on sight. Seriously though, tuple return values can easily replace them). Currently, delegates are implemented through back-end hacks in the CLR implementation.