Archive for March, 2007

The perils of multiple-value-prog1

2007-03-21

In learning Common Lisp I come across some pretty esoteric stuff (Lisp is like that). I tend to learn a language from specification-like documents. One thing this does is give me a view of the language that is probably different from how a typical programmer in that language sees it. It’s difficult for me to get a pragmatic view of the language. In Lisp for example, do people use PROG (as opposed to LET, PROGN, and TAGBODY)? What is the value of (VALUES)? How often do people put a reader macro on !? And, the subject of this post, does anyone ever use MULTIPLE-VALUE-PROG1?

Well, I turned to a co-worker who gets paid to program in Lisp. He said, “Yes, it’s quite often used in macros”; he turned to his current project and found 5 uses of it, all in macros. It turns out that 2 of those uses were buggy (I never did get paid for improving the quality of that project). I was tempted to humorously advertise MULTIPLE-VALUE-PROG1 with a strapline: “Causes experts to write buggy code 40% of the time”. A typical use is when you have a macro that takes a body (a list of forms), where the expansion of the macro executes the body and then does some more stuff (cleanup perhaps), but wants to pretend that really it’s just the body that is being executed, so the resulting form will yield the same values that the body alone does.

You can’t use just a PROGN like this:

(defmacro foo (... &body body)
  `(progn ,@body (more-stuff ...)))

because PROGN won’t return the values from the body; it’ll return the values from more-stuff. Combining PROG1 and PROGN gives the almost-right:

(defmacro foo (... &body body)
  `(prog1 (progn ,@body) (more-stuff ...)))

but it goes wrong when body returns other than exactly 1 value; in other words, it’s not multiple-value transparent. That’s why you need MULTIPLE-VALUE-PROG1:

(defmacro foo (... &body body)
  `(multiple-value-prog1 (progn ,@body) (more-stuff ...)))

The bug is when you forget the inner PROGN:

(defmacro foo (... &body body)
  `(multiple-value-prog1 ,@body (more-stuff ...)))

This returns multiple values, but unfortunately it returns all the values from the first form of body, not its last form. In the, probably all too common, case where body has only one form, you won’t spot the difference and this bug might go unnoticed for some while.

The amusing icing on the cake is that you can use Google’s codesearch facility to find bugs like this automatically: search for “multiple-value-prog1 ,@”. When I searched I got three results. The first result is a correct implementation of a MULTIPLE-VALUE-PROG2 macro, the other 2 are bugs. The code for the first bug is:

(defmacro waiting-for-token-event ((contact &optional (token-data 0)) &body body)
  `(multiple-value-prog1 ,@body
     (await-token-event ,contact ,token-data)))

Now that you know what to look for, you should be able to see how this follows the typical use pattern outlined above, and is also not correct.

Searching for multiple-value-prog1 alone shows that most of the time this appears in Lisp code it is either to describe how to format MULTIPLE-VALUE-PROG1 forms in a pretty printer or code indenter or similar, or it appears in a compiler test suite. There aren’t very many actual uses of it. Well, that’s the impression I get when viewing the world through Google’s codesearch spectacles.

Stupid const, stupid mutable, stupid C++

2007-03-21

Every now and then someone drags me into a C++ discussion. I have friends who code in C++. Some of them do it for money. I’m sorry about that.

Recently I heard about “const methods”. “What’s that?” I ask; “Methods that are contractually obliged not to change their object.” Oh, I think. That seems really useful, but a feature like that would never make it into C++ (I reason, based on what I know of C). Still, not knowing much about C++ I shut up and go and learn some more C++ instead. I consult my friends who actually know more C++ than I do. The replies are not consistent, not confident, and sometimes go all slippery and vague when I press for details.

Then, it dawns on me. A “const method” is simply a method whose this pointer is const qualified. It’s an inevitable feature of the language once you’ve added both const and this; otherwise how would you invoke a method on a const instance, and what would the type of this be? You have to have methods with const this, otherwise you end up throwing away constness, and I’m sure we can all agree on what a bad idea that is. (Of course, when I say things like const this I mean that this has type const foo * (pointer to const thingy).)
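The point about the type of this can be checked by asking the compiler directly. What follows is a minimal sketch, assuming a C++11 compiler; the struct foo and the function read_through_const_ref are made-up names for illustration, not anything from a real codebase.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical class, used only to inspect the type of this.
struct foo {
  int i;

  foo(): i(0) {}

  int get() const
  {
    // In a const method, this is a pointer to const foo.
    static_assert(std::is_same<decltype(this), const foo *>::value,
                  "this should be const foo * in a const method");
    return i;
  }

  void set(int v)
  {
    // In a non-const method, this is a plain foo *.
    static_assert(std::is_same<decltype(this), foo *>::value,
                  "this should be foo * in a non-const method");
    i = v;
  }
};

// Hypothetical helper: writes through a foo, reads back through a const foo &.
int read_through_const_ref()
{
  foo f;
  f.set(42);
  assert(f.i == 42);

  const foo &cf = f;
  // cf.set(1);   // would not compile: set needs a non-const this
  return cf.get();
}
```

Calling read_through_const_ref() from a main function yields 42; uncommenting the cf.set(1) line should produce a compile error about discarding qualifiers.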

Here’s the most important thing to know about const methods:

They cannot help the compiler; they cannot help you.

(Actually they can help you, if certain conventions are obeyed. The key thing is that the conventions are just that, so const methods cannot be relied upon to help you; there is no contractual obligation). There’s a lot of confusion about const on the web, particularly about what it means for a method to be const. You see stuff like this (from this random webpage):

void inspect() const;   // This member promises NOT to change *this

This is trying to claim that declaring the method to be const means that the method promises not to change the object. This is rot. Aside from naughty const-removing casts that would allow the inspect method to modify the object, the inspect method might simply call a function that happens to use a global pointer (that is not const) to modify the object in question. To illustrate (if my C++ is not idiomatic, that’s because I don’t get paid to program in it; if it’s not correct, I want to know):

#include <iostream>

class counter {
  public:
  int i;

  counter();

  int inspect() const;
  void increment();
};

counter sigma_inspect;

counter::counter()
{
  i = 0;
}

int counter::inspect() const
{
  sigma_inspect.increment();
  return i;
}

void counter::increment()
{
  ++ i;
  return;
}

int main(void)
{
  counter a;

  std::cout << a.inspect() << "\n";

  std::cout << sigma_inspect.inspect() << "\n";
  std::cout << sigma_inspect.inspect() << "\n";
  return 0;
}

inspect is a const method and increment is a non-const method. increment increments the value of a counter, and inspect returns the value of that counter. The inspect method also increments a global counter that records the total number of times that inspect has been called. Consider the two calls to sigma_inspect.inspect in the main function. They will print out as 2 and 3. The value of sigma_inspect.inspect() has changed even though inspect is a const method. This is of course because the object on which we are invoking the inspect method is sigma_inspect, the same object that inspect is using to keep track of the number of times it has been invoked. The reason that this works is that the inspect method is not using its this pointer to modify the sigma_inspect object, it is using the sigma_inspect global directly. It would be an error for the inspect method to try and modify *this because that’s a const lvalue and can’t be used to modify an object. Essentially the sigma_inspect object has been aliased.

The best that can be said for const methods is that they’re a useful convention. It is useful to denote in an interface which methods do not modify an object and which do, and const is a way of doing that. But note that there are no obligations nor guarantees. A const method is not obliged to refrain from modifying the object (as the above example shows); the caller of a const method has no guarantee that the object will not change. const is not magic.
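The other route mentioned earlier, the naughty const-removing cast, is even shorter than the global-pointer trick. This is a sketch under stated assumptions: tally and peek_twice are hypothetical names, and the object is deliberately not declared const, because modifying an object that was itself declared const through such a cast is undefined behaviour.

```cpp
#include <cassert>

// Hypothetical class, for illustration only.
struct tally {
  int hits;

  tally(): hits(0) {}

  int peek() const
  {
    // Cast away the constness of this and modify the object anyway.
    // This is only well defined when the object was not declared const.
    ++const_cast<tally *>(this)->hits;
    return hits;
  }
};

// Hypothetical helper: calls the const method twice on a non-const object.
int peek_twice()
{
  tally t;                  // deliberately not const
  int first = t.peek();
  assert(first == 1);       // the "const" call already changed t
  return t.peek();
}
```

peek_twice() returns 2: each call to the const method peek changes the object it was invoked on, and the compiler raises no complaint.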

I’ve heard some people tout const in C++ as an advantage it has over other languages such as Java or Common Lisp. The claim is that const methods add useful contractual guarantees to the language. Well, there are no guarantees. As with many “my language is better than yours” debates, both sides would probably do well to simply spend more time programming in different languages (not necessarily the ones that are being argued about).

Objective-C doesn’t have const methods, and yet Objective-C programmers using Apple’s Cocoa interfaces enjoy many of the same benefits, in particular a clear separation of mutating and non-mutating methods enforced by the compiler. How is it achieved? Through the class hierarchy. Observe that in C++ if I have a foo instance then I can call const methods as well as non-const methods, whereas if I have a const foo instance I can only call const methods. A foo instance has all the methods of a const foo instance, and some more on top. This is just like inheritance in an ordinary class hierarchy. If A has all the methods of B and some more methods then we can usually implement this by making A a subclass of B. So it is in Cocoa. The Collections classes illustrate this idea best. Cocoa has an Array class (called NSArray for historical reasons) and a Mutable Array class. The NSArray class models read-only arrays. They can be created (and returned from other methods and functions and so on), but not modified. The NSMutableArray class is a subclass of NSArray and it supports methods that can modify the array (such as adding, removing, and so on). A method like count, which counts the number of items in the array, is specified by the NSArray class so is invokable on instances of NSArray and instances of NSMutableArray. You have an NSMutableArray instance in your hand and you want to pass it to a function which is expecting the read-only version NSArray? No problem, casting up the hierarchy is legal and problem free, just like a cast that adds const in C++.
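The same arrangement can be sketched in C++ without writing const on a single method. The classes below are hypothetical stand-ins that only loosely imitate the NSArray and NSMutableArray split; they are not Cocoa’s actual interfaces, and all the names are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical read-only array: it offers inspection methods only.
class read_only_array {
  public:
  explicit read_only_array(std::vector<int> items): items_(items) {}

  std::size_t count() { return items_.size(); }
  int at(std::size_t index) { return items_.at(index); }

  protected:
  std::vector<int> items_;
};

// Hypothetical mutable subclass: inherits the inspection methods
// and adds the mutating ones on top.
class mutable_array: public read_only_array {
  public:
  mutable_array(): read_only_array(std::vector<int>()) {}

  void add(int item) { items_.push_back(item); }
  void remove_last() { items_.pop_back(); }
};

// This function asks for the read-only view, so the mutators are out of reach.
std::size_t count_items(read_only_array &array)
{
  // array.add(1);   // would not compile: read_only_array has no add
  return array.count();
}

// Hypothetical helper: builds a mutable array and hands it to count_items.
std::size_t count_via_read_only_view()
{
  mutable_array m;
  m.add(10);
  m.add(20);
  m.add(30);
  m.remove_last();
  assert(m.at(0) == 10);
  return count_items(m);   // implicit upcast, no explicit cast needed
}
```

count_via_read_only_view() returns 2. As with const, the separation is a convention the compiler checks rather than a guarantee: a determined caller could cast back down the hierarchy, just as a C++ programmer can cast const away.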

Some parts of Java’s JSE class hierarchy do a similar thing, see java.awt.image.Raster and its subclass java.awt.image.WritableRaster for example. And of course, Apple’s Cocoa classes (NSArray and friends) also exist in Java as well as Objective-C.

Using the class hierarchy like this might scare you, especially if you have preconceived ideas about what to use a class hierarchy for. To be honest the only people I know to be scared are C++ programmers that haven’t been exposed to enough other object oriented languages.

So, what about mutable? Well, perhaps by now you’ve guessed that I think it’s stupid and wrong and should not be tolerated. It doesn’t add anything that you can’t do with casts or pointers. Perhaps the simplest illustration is putting a pointer to itself in every object:

#include <iostream>

struct silly {
  silly();
  void foo() const;
  silly *const mutate;
  int i;
};

silly::silly(): mutate(this)
{
  i=0;
}

void silly::foo() const
{
  std::cout << i << "\n";
  ++ mutate->i;
  std::cout << i << "\n";
  return;
}

int main(void)
{
  silly s;
  s.foo();
  return 0;
}

Every silly object has a mutate variable that points to itself. The const foo method can use the mutate variable to change the object, in this case incrementing i. Calling it mutate even makes it kind of self-documenting. If it’s that easy to get round the const restriction when you want, why bother with an entire new keyword, mutable? The only explanation I have is that the people that have influence over the evolution of C++ enjoy making it more complex in ways that don’t give you any more real programming power.
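For comparison, here is roughly what the keyword version of the same trick looks like. It is a sketch only: silly_mutable and bump_const_instance are hypothetical names, and the example is not meant to settle the argument either way.

```cpp
#include <cassert>

// Hypothetical variant of silly that uses mutable instead of a self-pointer.
struct silly_mutable {
  mutable int i;

  silly_mutable(): i(0) {}

  void foo() const
  {
    ++i;   // allowed in a const method because i is declared mutable
  }
};

// Hypothetical helper: calls the const method twice and reports i.
int bump_const_instance()
{
  // The instance is declared const on purpose: a mutable member
  // may still be modified in an object that is itself const.
  const silly_mutable s;
  s.foo();
  assert(s.i == 1);
  s.foo();
  return s.i;
}
```

bump_const_instance() returns 2. The member is changed through a const method on a const instance, with no cast and no extra pointer stored in the object.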

Of course, as a C++ programmer you need to know about const, and you probably need to know about const methods, but that’s mostly because if you don’t you’ll annoy people that try to create const instances (or refs to such) of your classes.

Stupid colour, stupid slime, stupid emacs

2007-03-19

Why is it so many things try and customise the colour of something and so many things get it wrong?

I’d heard lots of good things about slime. I’m learning Common Lisp, so I decided to try it out. The first thing I notice is how much junk this adds to my .emacs. My emacs startup time goes from about 500ms to about 1500ms (oh okay, I actually went and measured it; it turns out that emacs has a command line option specifically for timing how long it takes to start up (well, is there any other use for emacs --kill?). Anyway, old version: 29 ms, new version: 246 ms. To my surprise vi (vim really) is slower at 46 ms. The mighty ed scores 11 ms).

Oh well, I only use emacs for 3 things anyway: reading the info documentation for stupid GNU tools that think that I shouldn’t be using man to read my documentation; testing and debugging what little emacs lisp I write; inferior-lisp-mode. Anyway, the next thing I notice is that my prompt is cyan. This colour just happens to clash horribly with the colour of the Terminal window in which I am running emacs (which is #E78C3F). THIS HAS TO STOP. Coloured slime. Great. Should slime start with colour enabled? No. Should slime provide the ability to decorate its UI in colour? Of course. There’s no way that any particular default colour scheme provided by slime will match an arbitrary colour scheme chosen by me, so that’s why it shouldn’t use colour by default.

I think I have some sort of default colour scheme curse. All you hackers out there writing colour schemes must use ultra-black or something; I generally run my Terminal windows with a light background and black ink, and I often fall foul of default colour schemes for stupid programs that stupidly use colour by default. All you people creating ad-hoc colour schemes for your crappy little tools (slime, emacs, vi, ls) stop it. Make sure the default is to not change any colours.

So I spend another half-hour trying to find and change the relevant parts of customize-group slime (naturally this involves finding a bug in this interface which makes it impossible for me to save any changes the first time). Parts of the customize-group interface are almost unbearably unreadable because of the choice of colour. Great, now my .emacs has grown from 2 lines to 17. It’s the start of a slippery slope.

Would you like to see how bad info-mode looks? Okay, here it is:

[Image: Emacs screenshot with nasty clashing colours]

I knew I was right to fall in love with FreeBSD when I saw Linux embrace colour ls.