Corrections? Opinions? Insults? Send 'em in. Recent changes
Table of Contents

# Generic Tuple Computation

I don’t really understand this stuff very well. My goal in writing this down was to get some feedback. Since you’re probably the only person who will ever visit this site, your feedback better be really good.

I want to perform computations on tuples generically.

# `zip` in Haskell

There’s a function in the Haskell Prelude called `zip`. It takes a pair of lists and joins their elements together, returning a list of pairs:

```-- The type of 'zip':
zip :: [a] -> [b] -> [(a,b)]

-- Sample invocation of 'zip'
> zip  [1, 2, 3, 4, 5]  ["one", "two", "three"]
[(1, "one"), (2, "two"), (3, "three")]```

Note that `zip` doesn’t take a tuple of two arrays as it’s argument. It uses currying to process two arguments. For our purposes, that wont do. We need a regular tuple. So let’s make a “tuple zip” that looks slightly different:

```-- Type signature:
tzip :: ([a], [b]) -> [(a,b)]

-- Implementation
tzip ((x:xs), (y:ys)) = (x, y) : tzip (xs, ys)
tzip _                = []

-- Sample invocation
> tzip ([1, 2, 3, 4, 5], ["one", "two", "three"])
[(1, "one"), (2, "two"), (3, "three")]```

The problem with the above function is that it only works with two arguments. The Haskell Prelude had a curried 3-argument function called `zip3`, so it’s probably a useful thing to have variants of `tzip` to handle more lists. We could define `tzip3`, `tzip4`, `tzip5` and so on until we get to the point where nobody would need an additional version, but that’s not a real solution. What we should do is handle all the different sizes generically. To make things clearer, lets switch over to a “normal” language.

# `zip` in a "normal" language

Haskell’s pattern maching syntax and native support for lists hide a lot of the messy work. Too much happens automagically. With a Java-like language, we can see exactly what is going on.

The plain old 2-argument zip:

```<X,Y>
List<(X,Y)> zip(List<X> x, List<Y> y)
{
Iterator<X> xi = x.iterator()
Iterator<Y> yi = y.iterator()
List<(X,Y)> result = new List<(X,Y)>()

while (xi.hasNext() && yi.hasNext())
result.add((xi.next(), yi.next())

return result
}```

What a pain. OK, on to the generic version. But first, we need to pretend we have some extra language features.

# Language support

## Type expansion

To write a generic `zip` function, we need to be able to take a tuple of type `(int, bool)` and derive the type `(Iterator<int>, Iterator<bool>`). Here’s the plan:

• When you see an integer in a type expression, it represents a placeholder.
• If a type expression has any placeholders, then the “`*`” operator will let you expand the expression according to a certain tuple’s types.

Here are some examples:

```1*(int,bool)        :  (int, bool)
List<1>*(int,bool)  :  (List<int>, List<bool>)
(1,1)*(int,bool)    :  ((int,int), (bool,bool))

// Assume the following type definitions
X = (a, b, c)
Y = (p, q)

List<1*Y>        :  List<(p, q)>
List<1>*X        :  (List<a>, List<b>, List<c>)

(1,2)*X*Y        :  ((a,2),(b,2),(c,2)) * Y
:  (((a,p),(b,p),(c,p)), ((a,q),(b,q),(c,q)))

(2,1)*Y*X        :  ((2,p),(2,q)) * Y
:  (((a,p),(a,q)), ((b,p),(b,q)), ((c,p),(c,q)))

(1,2)*X.. * Y    :  ((a,2),(b,2),(c,2)).. * Y
:  (((a,p),(b,p),(c,p)).., ((a,q),(b,q),(c,q))..)
:  ((a,p),(b,p),(c,p),(a,q),(b,q),(c,q))```

This looks a lot like function invocation, but I’m pretty sure it isn’t powerful enough to spin out into halting problem land. Then again, I probably would have said that about C++ templates when I first learned about them (but I think it’s the fact that C++ allows templates to perform integer computations that makes them Turing complete).

## Tuple value computation

We need a way to perform a computation on all the elements of a tuple value. The most obvious way of doing this is to copy the two fundamental functions of list computations: “map” and “reduce”.

“map” invokes the same function on multiple elements in parallel. This is essentially the same functionality provided by the type expansion syntax and so we could always use a variation on that. However, that syntax is really tedious and so we could use a simpler syntax that will cover most of our needs: the “`@`” suffix. If a tuple expression is suffixed by a “`@`”, then the surrounding expression will be applied to all of the tuple elements individually and the results will be combined back into a tuple.

```a = (12, "Goose", false)
a@.GetClassName()      : ("int", "String", "bool")
a@ instanceof String   : (false, true, false)

b = (1, 2, 5)
b@ + 10    :  (12, 13, 15)
b@ == 2    :  (false, true, false)```

Similarly, we’ll define a limited subset of “reduce”: when the “`&&`” and “`||`” operators are prefixed to a tuple expression, they perform an AND or an OR on all the values in the tuple.

```&& (true, false)     :  false
|| (true, false)     :  true

&& ((1,2,3)@ == 2)   :  false
|| ((1,2,3)@ == 2)   :  true```

Now that we have the necessary tools, we can define a generic `zip` function.

# Generic `zip`

```<X>
List<X> zip(List<1>*X x)
{
Iterator<1>*X xi = x@.iterator()
List<X> result = new List<X>()

while (&& xi@.hasNext())
result.add(xi@.next())

return result
}```

You may have noticed that we could have problems with empty tuples. Since “`&&`” on an empty tuple will always return “`true`”, the “`while`” loop will go on forever. Luckily, empty tuples don’t really exist. See the tuples page for an attempt at an explanation.

# `unzip`

`unzip` undoes the work that `zip` does. Here’s the two-argument version

```<X,Y>
(List<X>,List<Y>) unzip(List<(X,Y)> l)
{
Iterator<(X,Y)> i = l.iterator();
List<X> resultX = new List<X>();
List<Y> resultY = new List<Y>();

while (i.hasNext()) {
X valX;
Y valY;
(valX, valY) = i.next();
resultX.add(valX);
resultY.add(valY);
}

return (resultX, resultY);
}```

# Generic `unzip`

Um…I don’t know how to do this. I don’t think the syntax we have so far is enough. We need to be able to do the following:

```X  :  (int, String, bool)
Y  :  (List<int>, List<String>, List<bool>)

????? : ((int, List<int>), (String, List<String>), (bool, List<bool>))```

Any ideas?

```<X>
List<1>*X unzip(List<X> x)
{
Iterator<X> xs = x.iterator()
List<1>*X result = new List<X@>()

while (xs.hasNext())
// do some voodoo here

return result
}```

# Miscellaneous details

## When the "`@`" isn’t powerful enough

```b = (1, 2, 3)
c = (10, 20, 30)

b@ + c@      :  ((12, 13, 13), (22, 23, 23), (32, 33, 33))
c@ + b@      :  ((12, 22, 32), (13, 23, 33), (13, 23, 33))

b@ - c@      :  ((-9,-19,-29), (-8,-18,-28),  (-7,-17,-27))
???????      :  (( 9, 19, 29), ( 8, 18, 28),  ( 7, 17, 27))
c@ - b@      :  (( 9,  8,  7), (19, 18, 17),  (29, 28, 27))```

The “`???????`” means that I don’t know an easy way to generate the result on the right. That’s because we have to control the order of expansion of the two tuples, but we can’t. With addition, we can just reverse the arguments because addition commutes. Can’t do the same thing with subtraction.

The long way of getting around this is to create a function that will reverse the arguments:

```int RevSub(int x, int y) { return (x - y) }
RevSub(@c, @b)  :  (( 9, 19, 29), ( 8, 18, 28),  ( 7, 17, 27))```

Do we need special syntax to handle this more concisely?

## Type constraints

This is the syntax that lets you specify what the tuple elements should look like. Though Haskell-style (or even C++-style) implicit constraints may come in handy, the predictability of explicit constraints is sometimes nice to have.

Here are example constraints for a remote function call proxy:

```class RemoteProxy<(Param)>
where Param@ implements Serializable
{
void Invoke(Param.. p)
}```

This constraint can be used to ensure that all the parameter types in the `Param` tuple implement the `Serializable` interface (so that all the parameters can be shipped to a remote target).

Interestingly, in this special case we might be able to avoid the

```class RemoveProxy<Params>
where Params implements Serializable
{
void Invoke(Params p..)
}```

The second example only checks that the entire tuple implements `Serializable`. Since the runtime system creates tuples on the fly, it could treat `Serializable` in a special way by automatically making a tuple `Serializable` if all its constituents are serializable.

But we can’t bake everything into the language. We need allow interface writers to provide implementations for tuples:

```interface Hashable
{
int hash();
}

tuple (T..) implements Hashable
where T@ implements Hashable
{
int hash()
{
// TODO: try to figure out how to do this without degenerating to
// list-style programming.
}
}```

# Implementation

Often, when a new language feature is proposed, the proposer provides a reference implementation. That’s hard and so I’m not going to do that. If you know a language that has similar features, let me know and I’ll link to it here.

data/generic_tuple_computation.txt Last modified: 01.10.2008 00:37 