Archive for the 'lisp' Category

A little programming puzzle: /=

2007-12-18

/= is Common Lisp’s not-equals function (similar to the != operator from C). In Common Lisp /= is restricted to numbers and it is in fact an error to call it with any other type, but not a lot of people know that. It looks like an innocuous enough function, until one considers that like a lot of Lisp functions it can take any number of arguments:

(/= x1 x2 x3 ... xn)

The semantics of /= are to return True if and only if no two numbers are the same value. Note that (/= 1 2 1) is False; it is insufficient to merely consider adjacent pairs, if any pair have the same value then the function should return False.

Contrast this with the semantics of the function for equality, =, which returns True if and only if all numbers have the same value. This can be implemented by checking adjacent pairs, or comparing each argument against the first, a straightforward O(n) algorithm.

So the puzzle is, how do you implement /=? Given a list of numbers we want to return False if any pair have the same value, returning True otherwise. It’s most interesting if we consider the complexity and therefore the case when we have large numbers of arguments.

Common Lisp implements a wide range of number types: unbounded integers (bignums), unbounded rationals, finite precision floating-point types, and complex versions of all of those. /= compares numbers by value so the integer 0, the floating-point number 0d0, and the complex floating-point number #C(0d0 0d0) are all equal. Note that any real number can be considered as being equal in value to a complex number with a 0 imaginary part; also note that any floating-point number can be converted to a rational having exactly the same value (see my blog article on the subject). That means that for simplification we might want to convert all input numbers to the type (complex rational) before proceeding. Since that’s obviously a bounded time operation it doesn’t affect the complexity of whatever algorithm we come up with. That cuts down on the number of different types of pair-wise comparison operators that we might need.

The first, obviously naïve, approach would be to compare all possible pairs. That obviously works and is obviously O(n2).

If you like a puzzle, try improving upon this naïve algorithm before reading on.

The next approach I came up with is to sort the arguments first, and then do comparisons of all the adjacent pairs. Recall that the input numbers are complex and there’s no natural ordering. Doesn’t matter, we can just pick any old ordering, for example lexicographic ordering, where #C(a b) is less then #C(c d) if a < c, or a = c and b < d. This algorithm is O(n log(n)) for the sort (consult any algorithms book) and O(n) for the final comparison of adjacent pairs, so O(n log(n)) overall.

That seems a lot better. But there is another approach that’s even better, suggested by a good friend:

Create a hash-table and hash each argument into it, checking to see if an entry with that value already exists. If an entry already exists then that’s an equal pair, so return False; if we get to the end of the arguments then there’s no equal pair, so return True. This algorithm is O(n) unless you’re unlucky to run into some pathological hashing collision. So O(n) in all likelihood.

Of course the asymptotic behaviour of /= for large numbers of arguments is unlikely to form part of your decision to use one Lisp system or another, but it makes a nice puzzle. Unsurprisingly, the practice of using more than 2 arguments is very rare. Lisp test suites, that sort of thing. The best example I found is:

(not (/= samples 3 4))

which is true when samples is 3 or 4. This seems obscure to me, but it is more compact than alternatives; maybe it is idiomatic in some circles.

SBCL used in e-counting

2007-08-16

Page 21 of ORG’s full report on their observations of the 2007 elections (held for various purposes across the UK) says that SBCL was used for e-Counting in South Bucks. That’s pretty much the only positive thing I’ve found so far. The report just makes me want to weep.

Of course the usual suspects also appear: Windows Server 2003, .Net 2.0, Oracle, Red Hat, Ubuntu (twice!), Apache, Tomcat, Ruby on Rails, Firefox 2.0, Sun JRE.

Lisp: coerce float to rational

2007-07-10

Since all floats are mathematical rationals you might expect to be able to coerce a float to a rational. But you can’t:

* (coerce 1d0 'rational)

debugger invoked on a SIMPLE-TYPE-ERROR:
  1.0d0 can't be converted to type RATIONAL.

Here’s a function which does what (coerce x 'rational) might do:

(defun coerce-float-to-rational (x)
  (declare (type float x))
  (let ((b (float-radix x)))
    (multiple-value-bind (signif e s) (integer-decode-float x)
      (* s signif (expt b e))))) 

[Edit on 2007-07-12: jaoswald points out that this functionality is provided by the function rational. So my code could be (part of) an implementation of rational.]

Example:

CL-USER> (coerce-float-to-rational 3.141592653589793)
13176795/4194304                                                                
CL-USER> (coerce-float-to-rational 3.141592653589793d0)
884279719003555/281474976710656                                                 

Numbers that are mathematical integers (which includes all sufficiently large ones) come out as integer objects. Note that for very large numbers you quickly exhaust the precision of the underlying float type:

CL-USER> (coerce-float-to-rational 1e10)
10000000000                                                                     
CL-USER> (coerce-float-to-rational 1e20)
100000002004087734272                                                           

The smallest number

2007-07-05

SBCL doesn’t correctly read this number:

* 2.4703282292062328d-324

0.0d0

Bad Lisp. To your room and no snacks.

Python does:

>>> 2.4703282292062328e-324
4.9406564584124654e-324

The reason Python reads «2.47…» but then prints «4.94…» is because the value that it converts the input to is the smallest positive number in IEEE 754 double precision. As I pointed out whilst I was badmouthing the C# standard this is 2-1074. The number immediately below 2-1074 in the double type is +0 and the number immediately above it is 2-1073. Thus for x such that 2-1075 < x < 3*2-1075, x gets converted to 2-1074 because that’s the nearest member of the double type.

The shortest way to write this number (which I suggest is best) is 5e-324. But because the relative input error is so large for this denormal, we can use anything from 3e-324 to 7e-324 (we don’t even have a single digit of precision when our denormals get this small).

Reading the Common Lisp Hyperspec I’m a bit shocked. There doesn’t seem to be any discussion of the meaning of a number when it is read. In other words whilst the spec says that reading «1.1» results in a number, it doesn’t say what number it is (does it? It wouldn’t be the first time that I’ve missed something obvious). It doesn’t say this even for integers. For integers we can take the stunningly obvious as read: an integer when read is converted internally to an integer that has the same mathematical value as its external representation. For reals it’s not so clear. In most (all?) implementations the real number 1.1 will not be representable exactly, we’ll have to choose some number to approximate it. A sensible thing to do is to choose the member of the target type that is closest to the mathematical value of the decimal number. This allows (as long as printing is good enough) floating point numbers to be printed and read without loss.

It seems like an argument could be made that when the reader underflows a number to 0 (as SBCL did with my first example) then it should signal a floating point underflow error. “Should” in the sense of “at safety level 3”. SBCL doesn’t signal an error for this code: «(progn(declare(optimize(safety 3)))(expt 2d0 -1075))», so I don’t hold out much hope of seeing the reader signal an error.

Of course, SBCL is perfectly capable of representing and printing the smallest number:

* (expt 2d0 -1074)

4.9406564584124654d-324

I guess I’ll have to read that Clinger paper after all.

Stop Press: CLHS doesn’t, strictly speaking, support floating point formats that use denormals. Bad standard.

multiple-value-constantly

2007-04-26

Improvement to Common Lisp:

Macro multiple-value-constantly

(multiple-value-constantly form*)

multiple-value-constantly evaluates each form, gathers all the values together, and yields a constant function that takes any number of arguments and returns the gathered values. It’s the spawn of multiple-value-call and constantly.

constantly could be defined as (defun constantly (it) (multiple-value-constantly it)).

Here’s my implementation:

(defmacro multiple-value-constantly (&rest forms)         
  `(multiple-value-call                                                         
    (lambda (&rest l) (lambda (&rest any)                                       
                        (declare (ignore any))                                  
                        (values-list l)))                                       
    ,@forms))

May induce funny feeling in Lisp programmers:

(setf (symbol-function 'constantly-nothing)
      (multiple-value-constantly))

(constantly-nothing)

Of course, you have to be careful using multiple values; they can be tricky.

The perils of multiple-value-prog1

2007-03-21

In learning Common Lisp I come across some pretty esoteric stuff (Lisp is like that). I tend to learn a language from specification-like documents. One thing this does is give me a view of the language that is probably different from how a typical programmer in that language sees it. It’s difficult for me to get a pragmatic view of the language. In Lisp for example, do people use PROG (as opposed to LET, PROGN, and TAGBODY)? What is the value of (VALUES)? How often do people put a reader macro on !? And, the subject of this post, does anyone ever use MULTIPLE-VALUE-PROG1?

Well, I turned to a co-worker who gets paid to program in Lisp. He said, “Yes, it’s quite often used in macros”; he turned to his current project and found 5 uses of it, all in macros. It turns out that 2 of those uses were buggy (I never did get paid for improving the quality of that project). I was tempted to humourously advertise MULTIPLE-VALUE-PROG1 with a strapline: “Causes experts to write buggy code 40% of the time”. A typical use is when you have a macro that takes a body (list of forms), where the expansion of the macro executes the body, and then does some more stuff (cleanup perhaps), but wants to pretend that really it’s just the the body that being executed, so the resulting form will yield the same values that the body alone does.

You can’t use just a PROGN like this:

(defmacro foo (... &body body)
  (progn ,@body (more-stuff ...)))

because PROGN won’t return the values from the body, it’ll return values from more-stuff. Combining PROG1 and PROGN gives the almost right:

(defmacro foo (... &body body)
  (prog1 (progn ,@body) (more-stuff ...)))

but it goes wrong when body returns other than exactly 1 value; it’s not multiple value transparent in other words. That’s why you need MULTIPLE-VALUE-PROG1:

(defmacro foo (... &body body)
  (multiple-value-prog1 (progn ,@body) (more-stuff ...)))

The bug is when you forget the inner PROGN:

(defmacro foo (... &body body)
  (multiple-values-prog1 ,@body (more-stuff ...)))

This returns multiple values, but unfortunately it returns all the values from the first form of body, not its last form. In the, probably all too common, case where body has only one form, you won’t spot the difference and this bug might go unnoticed for some while.

The amusing icing on the cake is that you can use Google’s codesearch facility to find bugs like this automatically: search for “multiple-value-prog1 ,@”. When I searched I got three results. The first result is a correct implementation of a MULTIPLE-VALUE-PROG2 macro, the other 2 are bugs. The code for the first bug is:

(defmacro waiting-for-token-event ((contact &optional (token-data 0)) &body body)
  `(multiple-value-prog1 ,@body
     (await-token-event ,contact ,token-data)))

Now you know what to look for you should be able to see how this follows the typical use pattern outlined above, and is also not correct.

Searching for multiple-value-prog1 alone shows that most of the time this appears in Lisp code it is either to describe how to format MULTIPLE-VALUE-PROG1 forms in a pretty printer or code indenter or similar, or it appears in a compiler test suite. There aren’t very many actual uses of it. Well, that’s the impression I get when viewing the world through Google’s codesearch spectacles.

Stupid colour, stupid slime, stupid emacs

2007-03-19

Why is it so many things try and customise the colour of something and so many things get it wrong?

I’d heard lots of good things about slime. I’m learning Common Lisp, so I decided to try it out. The first thing I notice is how much junk this adds to my .emacs. My emacs startup time goes from about 500ms to about 1500ms (oh okay, I actually went and measured it; it turns out that emacs has a command line option specifically for timing how long it takes to startup (well, is there any other use for emacs --kill ?). Anyway, old version: 29 ms, new version: 246 ms. To my surprise vi (vim really) is slower at 46 ms. The mighty ed scores 11 ms).

Oh well, I only use emacs for 3 things anyway: reading the info documentation for stupid GNU tools that think that I shouldn’t be using man to read my documentation; testing and debugging what little emacs lisp I write; inferior-lisp-mode. Anyway, the next thing I notice is that my prompt is cyan. This colour just happens to clash horribly with the colour of the Terminal window in which I am running emacs (which is #E78C3F). THIS HAS TO STOP. Coloured slime. Great. Should slime start with colour enabled? No. Should slime provide the ability to decorate its UI in colour? Of course. There’s no way that any particular default colour scheme provided by slime will match an arbitrary colour scheme chosen by me, so that’s why it shouldn’t use colour by default.

I think I have some sort of default colour scheme curse. All you hackers out there writing colour schemes must use ultra-black or something; I generally run my Terminal windows with a light background and black ink, and I often fall foul of default colour schemes for stupid programs that stupidly use colour by default. All you people creating ad-hoc colour schemes for your crappy little tools (slime, emacs, vi, ls) stop it. Make sure the default is to not change any colours.

So I spend another half-hour trying to find and change the relevant parts of customize-group slime (naturally this involves finding a bug in this interface which makes it impossible for me to save any changes the first time). Part of the customize-group interface are almost unbearably unreadable because of the choice of colour. Great, now my .emacs has grown from 2 lines to 17. It’s the start of a slippery slope.

Would you like to see how bad info-mode looks? Okay, here it is: Emacs screenshot with nasty clashing colours

I knew I was right to fall in love with FreeBSD when I saw Linux embrace colour ls.