Generic Tuple Computation
I don’t really understand this stuff very well. My goal in writing this down was to get some feedback. Since you’re probably the only person who will ever visit this site, your feedback better be really good.
I want to perform computations on tuples generically.
zip
in Haskell
There’s a function in the Haskell Prelude called
zip
. It takes a pair of lists and joins their elements together, returning a list of pairs:
-- The type of 'zip':
zip :: [a] -> [b] -> [(a,b)]
-- Sample invocation of 'zip'
> zip [1, 2, 3, 4, 5] ["one", "two", "three"]
[(1, "one"), (2, "two"), (3, "three")]
Note that
zip
doesn’t take a tuple of two arrays as it’s argument. It uses
currying to process two arguments. For our purposes, that wont do. We need a regular tuple. So let’s make a “tuple zip” that looks slightly different:
-- Type signature:
tzip :: ([a], [b]) -> [(a,b)]
-- Implementation
tzip ((x:xs), (y:ys)) = (x, y) : tzip (xs, ys)
tzip _ = []
-- Sample invocation
> tzip ([1, 2, 3, 4, 5], ["one", "two", "three"])
[(1, "one"), (2, "two"), (3, "three")]
The problem with the above function is that it only works with two arguments. The Haskell Prelude had a curried 3-argument function called
zip3
, so it’s probably a useful thing to have variants of
tzip
to handle more lists. We could define
tzip3
,
tzip4
,
tzip5
and so on until we get to the point where nobody would need an additional version, but that’s not a real solution. What we should do is handle all the different sizes generically. To make things clearer, lets switch over to a “normal” language.
zip
in a "normal" language
Haskell’s pattern maching syntax and native support for lists hide a lot of the messy work. Too much happens automagically. With a Java-like language, we can see exactly what is going on.
The plain old 2-argument zip:
<X,Y>
List<(X,Y)> zip(List<X> x, List<Y> y)
{
Iterator<X> xi = x.iterator()
Iterator<Y> yi = y.iterator()
List<(X,Y)> result = new List<(X,Y)>()
while (xi.hasNext() && yi.hasNext())
result.add((xi.next(), yi.next())
return result
}
What a pain. OK, on to the generic version. But first, we need to pretend we have some extra language features.
Language support
Type expansion
To write a generic
zip
function, we need to be able to take a tuple of type
(int, bool)
and derive the type
(Iterator<int>, Iterator<bool>
). Here’s the plan:
- When you see an integer in a type expression, it represents a placeholder.
- If a type expression has any placeholders, then the “
*
” operator will let you expand the expression according to a certain tuple’s types.
Here are some examples:
1*(int,bool) : (int, bool)
List<1>*(int,bool) : (List<int>, List<bool>)
(1,1)*(int,bool) : ((int,int), (bool,bool))
// Assume the following type definitions
X = (a, b, c)
Y = (p, q)
List<1*Y> : List<(p, q)>
List<1>*X : (List<a>, List<b>, List<c>)
(1,2)*X*Y : ((a,2),(b,2),(c,2)) * Y
: (((a,p),(b,p),(c,p)), ((a,q),(b,q),(c,q)))
(2,1)*Y*X : ((2,p),(2,q)) * Y
: (((a,p),(a,q)), ((b,p),(b,q)), ((c,p),(c,q)))
(1,2)*X.. * Y : ((a,2),(b,2),(c,2)).. * Y
: (((a,p),(b,p),(c,p)).., ((a,q),(b,q),(c,q))..)
: ((a,p),(b,p),(c,p),(a,q),(b,q),(c,q))
This looks a lot like function invocation, but I’m pretty sure it isn’t powerful enough to spin out into halting problem land. Then again, I probably would have said that about C++ templates when I first learned about them (but I think it’s the fact that C++ allows templates to perform integer computations that makes them Turing complete).
Tuple value computation
We need a way to perform a computation on all the elements of a tuple value. The most obvious way of doing this is to copy the two fundamental functions of list computations: “map” and “reduce”.
“map” invokes the same function on multiple elements in parallel. This is essentially the same functionality provided by the type expansion syntax and so we could always use a variation on that. However, that syntax is really tedious and so we could use a simpler syntax that will cover most of our needs: the “
@
” suffix. If a tuple expression is suffixed by a “
@
”, then the surrounding expression will be applied to all of the tuple elements individually and the results will be combined back into a tuple.
a = (12, "Goose", false)
a@.GetClassName() : ("int", "String", "bool")
a@ instanceof String : (false, true, false)
b = (1, 2, 5)
b@ + 10 : (12, 13, 15)
b@ == 2 : (false, true, false)
Similarly, we’ll define a limited subset of “reduce”: when the “
&&
” and “
||
” operators are prefixed to a tuple expression, they perform an AND or an OR on all the values in the tuple.
&& (true, false) : false
|| (true, false) : true
&& ((1,2,3)@ == 2) : false
|| ((1,2,3)@ == 2) : true
Now that we have the necessary tools, we can define a generic
zip
function.
Generic zip
<X>
List<X> zip(List<1>*X x)
{
Iterator<1>*X xi = x@.iterator()
List<X> result = new List<X>()
while (&& xi@.hasNext())
result.add(xi@.next())
return result
}
You may have noticed that we could have problems with empty tuples. Since “
&&
” on an empty tuple will always return “
true
”, the “
while
” loop will go on forever. Luckily, empty tuples don’t really exist. See the
tuples page for an attempt at an explanation.
unzip
unzip
undoes the work that
zip
does. Here’s the two-argument version
<X,Y>
(List<X>,List<Y>) unzip(List<(X,Y)> l)
{
Iterator<(X,Y)> i = l.iterator();
List<X> resultX = new List<X>();
List<Y> resultY = new List<Y>();
while (i.hasNext()) {
X valX;
Y valY;
(valX, valY) = i.next();
resultX.add(valX);
resultY.add(valY);
}
return (resultX, resultY);
}
Generic unzip
Um…I don’t know how to do this. I don’t think the syntax we have so far is enough. We need to be able to do the following:
X : (int, String, bool)
Y : (List<int>, List<String>, List<bool>)
????? : ((int, List<int>), (String, List<String>), (bool, List<bool>))
Any ideas?
<X>
List<1>*X unzip(List<X> x)
{
Iterator<X> xs = x.iterator()
List<1>*X result = new List<X@>()
while (xs.hasNext())
// do some voodoo here
return result
}
Miscellaneous details
When the "@
" isn’t powerful enough
b = (1, 2, 3)
c = (10, 20, 30)
b@ + c@ : ((12, 13, 13), (22, 23, 23), (32, 33, 33))
c@ + b@ : ((12, 22, 32), (13, 23, 33), (13, 23, 33))
b@ - c@ : ((-9,-19,-29), (-8,-18,-28), (-7,-17,-27))
??????? : (( 9, 19, 29), ( 8, 18, 28), ( 7, 17, 27))
c@ - b@ : (( 9, 8, 7), (19, 18, 17), (29, 28, 27))
The “
???????
” means that I don’t know an easy way to generate the result on the right. That’s because we have to control the order of expansion of the two tuples, but we can’t. With addition, we can just reverse the arguments because addition commutes. Can’t do the same thing with subtraction.
The long way of getting around this is to create a function that will reverse the arguments:
int RevSub(int x, int y) { return (x - y) }
RevSub(@c, @b) : (( 9, 19, 29), ( 8, 18, 28), ( 7, 17, 27))
Do we need special syntax to handle this more concisely?
Type constraints
This is the syntax that lets you specify what the tuple elements should look like. Though Haskell-style (or even C++-style) implicit constraints may come in handy, the predictability of explicit constraints is sometimes nice to have.
Here are example constraints for a remote function call proxy:
class RemoteProxy<(Param)>
where Param@ implements Serializable
{
void Invoke(Param.. p)
}
This constraint can be used to ensure that all the parameter types in the
Param
tuple implement the
Serializable
interface (so that all the parameters can be shipped to a remote target).
Interestingly, in this special case we might be able to avoid the
class RemoveProxy<Params>
where Params implements Serializable
{
void Invoke(Params p..)
}
The second example only checks that the entire tuple implements
Serializable
. Since the runtime system creates tuples on the fly, it could treat
Serializable
in a special way by automatically making a tuple
Serializable
if all its constituents are serializable.
But we can’t bake everything into the language. We need allow interface writers to provide implementations for tuples:
interface Hashable
{
int hash();
}
tuple (T..) implements Hashable
where T@ implements Hashable
{
int hash()
{
// TODO: try to figure out how to do this without degenerating to
// list-style programming.
}
}
Implementation
Often, when a new language feature is proposed, the proposer provides a reference implementation. That’s hard and so I’m not going to do that. If you know a language that has similar features, let me know and I’ll link to it here.