This post by pydanny prompted me to flesh this article out (previously it had just been floating around in my mind). pydanny’s article reminded me of when I first learned about lambda and how scary it was. That was in about 1988 when I was trying to get to grips with XLISP on the Atari ST (hah! a Lisp dialect so old it still spells its name with capital letters. sweet). Before XLISP I had been programming in (Sinclair) BASIC, Z80 machine code, and 68000 assembler. Oh, and FORTH. Maybe you can imagine how scary and alien a concept like lambda was. I can just barely remember how confusing it was. I don’t think the confusion really passed until I started dabbling with ML, in 1991. By the time I started dabbling with the lambda calculus in 1993 I was actually pretty comfortable with it. So, only 5 years of exposure for lambda to get comfy.
Of course now, I know that:
A lambda is simply a function with no name.
I emphasise the simply because I feel that that’s where the problem of understanding lay (for me). In BASIC and FORTH it was impossible to have a function without a name. In fact, the very thought was literally inconceivable. At least for a few years. Even in machine code where functions don’t have names they do have concrete locations, and they have names in the listing.
It wasn’t until many years later, in fact perhaps relatively recently, that I realised that there was a sort of hierarchy of things that must be named, and things that need not be named. For example, an expression, like «a**2 + b**2 + c**2» is an anonymous value, a value that does not need a name. More importantly, so are all the sub-expressions inside that expression. What would a language look like if it didn’t have anonymous values? Well, something like this:
    a2 = a**2
    b2 = b**2
    c2 = c**2
    s = a2+b2
    s = s + c2
Do you see? Each expression must be given a name (by assigning it to a named variable), and no expression can contain a sub-expression, because that sub-expression would be anonymous.
Programming in assembler is a bit like this. The names are the registers and every expression has to go in a register. If you run out, you have to rename everything by spilling to the stack. It’s partly why it can be so tedious.
So almost everything worthy of being called a language has anonymous values. At least simple values, like numbers. In the BASIC era it was common for strings to be named only. The kinds of restrictions you had to deal with were things like not being able to create a string from part of an existing string. You had to first create a named variable that was the target string, then copy part of the source string into the target string. Maybe you had some string expressions but there were restrictions on where you could use them. For example maybe you could only pass named strings to functions, so you could go «T$=MID$(A$,12,16):PROCFOO(T$)» but not pass the MID$ expression directly: «PROCFOO(MID$(A$,12,16))» (bad bad bad). Life was hard, and we licked coal for breakfast.
Arrays were another thing that you had to name and couldn’t pass to functions. AWK is still like this (arrays cannot be returned from a function, for example).
Speaking of arrays, an element of an array is a bit like an anonymous variable:
a[i] = x
a[i] specifies the place where the computed value, x in this case, goes. I say “a bit like” because you could equally well think of the element being named by the array name and a small integer as a pair. A place where we can store a value that changes is called a variable. You knew that, right? It’s just that most people think of named variables when you say variable. The generalised term for “anything you can put on the left of an assignment” is lvalue, a term you mostly see in the C programming language but often elsewhere too. Common Lisp calls such things places. How sensible.
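To make the idea of a place concrete, here is a small sketch in Python 3 of several things that can sit on the left of an assignment. The class Box and all the variable names are invented for the illustration; only the first assignment targets a plain named variable.

```python
# Several kinds of "place" (lvalue) in Python.
class Box:
    """Hypothetical container, used only for this illustration."""
    def __init__(self):
        self.value = None

x = 42             # a named variable
a = [0, 0, 0]
i = 1
a[i] = x           # an element: identified, if at all, by the pair (a, i)
box = Box()
box.value = x      # an attribute of an object
a[0:1] = [7, 8]    # a slice: one element is replaced by two
```

After this runs, a is [7, 8, 42, 0] and box.value is 42. None of the element, attribute, or slice targets has a name of its own.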
So the hierarchy of things you need not name starts something like:
As you go down the list, you see fewer and fewer languages that give you anonymous versions of the thing in question. Although these days we probably all agree that any decent language will have anonymous versions of all of the above. Python does, and that’s a Good Thing.
One reason for arranging things this way is that we can use it to both explain and motivate things like lambdas. A lambda is an anonymous function, and you remember how useful it was when you realised you could have string expressions and you didn’t have to store intermediate strings in variables? You could just manipulate strings without having to name them? Well, the same is true of functions.
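The parallel with string expressions can be sketched in Python 3. The list words and the helper word_length are made up for the example; the point is only that the named helper and the lambda do the same job.

```python
words = ["pear", "fig", "banana"]

# Named version: the helper needs a name even though it is used only once.
def word_length(w):
    return len(w)

by_length_named = sorted(words, key=word_length)

# Anonymous version: the function is written right where it is used.
by_length_anon = sorted(words, key=lambda w: len(w))
```

Both calls give ['fig', 'pear', 'banana']. The lambda plays the same role as the MID$ expression passed directly to a procedure: no intermediate name is needed.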
So what else could and should be anonymous? Well we can look to other languages for inspiration. ML has anonymous types. Java has anonymous classes. So perhaps the list can continue:
(note that a class is a sort of type, so this order has a natural feel to it).
So everything I said about anonymous functions applies to anonymous classes and anonymous types. They’re great, and every language needs them. Basically my philosophy here is that if there’s a thing in a language and you can name it, then you should also be able to have an anonymous version of it. That will make the language better.
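Python has no dedicated syntax for anonymous classes, but as a rough approximation (an assumption of this sketch, not something the article claims) the built-in type() called with three arguments builds a class as an expression. The label "Point" and the attribute names are invented for the example.

```python
# type(name, bases, namespace) creates a class without a class statement.
# The string "Point" is only a label stored on the class; no variable
# called Point is created in our namespace.
p = type(
    "Point",
    (),
    {"x": 3, "y": 4, "norm2": lambda self: self.x**2 + self.y**2},
)()

squared_length = p.norm2()
```

Here squared_length is 25. The class still carries a label, so this is not quite as anonymous as a lambda, but it is never bound to a name.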
So what other sorts of things does Python have that we haven’t removed the names from yet? Modules. For a long while I wished that Python had anonymous modules, a way of getting hold of a module without it polluting your namespace. Mostly I just wished for this because it would make Python more orthogonal, or better as I like to say. Of course modules in Python are first class citizens. Once you’ve imported the module struct, then «struct» refers to the module as a first class value, and it can be passed around and so on. That’s how «help(struct)» works. Recently I found both the way to get at modules anonymously (without importing them into a namespace), and a reason why you might want to.
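A minimal Python 3 sketch of a module being passed around as an ordinary value. The function public_names is a hypothetical helper written for this example; it works on whatever module it is given.

```python
import struct

def public_names(module):
    """Return the public attribute names of whatever module is passed in."""
    return [name for name in dir(module) if not name.startswith("_")]

# The module is just a value here, handed to a function like any other.
names = public_names(struct)
```

The resulting list includes 'pack' and 'unpack', among others.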
I’m a tidy sort of guy so when I program in Python I put my imports only in the functions that need them (a habit I picked up from Command Line Warriors but buggered if I can find the original inspiration). Well, sometimes I am.
I often find myself debugging my Python by the time honoured tradition of inserting print statements where the sys module is not in scope (in fact, often the only function that has «import sys» is main). So this fails:
print >>sys.stderr, "length:", l, "data:", x
And I’ll be damned if I’m going to add an «import sys» and forget to remove it later. But this works:
print >>__import__('sys').stderr, "length:", l, "data:", x
Not a very nice syntax though, is it?
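The lines above use the Python 2 print statement. For comparison, here is a sketch of the same trick in Python 3, where print is a function and takes a file argument. The helper name debug and the sample values are invented for the example.

```python
def debug(*args):
    """Print to stderr without an `import sys` in the enclosing scope.

    Hypothetical helper: __import__("sys") returns the module object as a
    value and binds no name, so nothing is added to any namespace.
    """
    print(*args, file=__import__("sys").stderr)

l = 3
x = b"abc"
debug("length:", l, "data:", x)
```

The Python documentation discourages calling __import__ directly in ordinary code and points to importlib.import_module instead, but that needs an import of its own, which rather defeats the purpose here.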