Python: Tail Call Optimisation

2009-04-30

Normally, two words are sufficient to explain why we need tail call optimisation.

Guido still doesn’t get it.

lambda, map, reduce, tail-call optimisation.

I’m going postal. I have shipped a copy of SICP to Guido.

[update 2009-05-20: Guido got the book]

30 Responses to “Python: Tail Call Optimisation”

  1. Gareth Rees Says:

    You know that SICP is available online under a Creative Commons license, right?

    If Guido hasn’t read it online, he’s probably not going to read it in paper form. But you never know, I suppose.

  2. Tony Finch Says:

    Guido’s resistance seems to be based on [a] lambda the ultimate goto (though I’d be surprised but pleased if he has read those papers) and therefore the ultimate spaghetti; and [b] he loves stack backtraces (it’s especially fun to show them to non-developers).

    One of the things Erlang programmers have to get used to is debugging with stack backtraces that have lost parts of the call chain to tail calls. They don’t seem to think it’s as crippling as Guido, but then they have to rely on it to implement loops where Python does not.

  3. drj11 Says:

    @Gareth: I did know that. Shipping an actual artefact is more of an expensive joke.

    @Tony: Yes. I find it more than a little bit ironic that Guido is pushing Python’s good backtraces as an argument for not having proper tail calls.


  4. Personally I’m used to turning of optimization when debugging C, I don’t see why the same shouldn’t apply to other languages. If you get a backtrace in production then you’ve already failed (not that this stops any number of Python developers, to pick quite unfairly on the language).

  5. Nick Barnes Says:

    Guido has a small point. It can be a pain in the arse debugging in a tail-call-“optimised” implementation.
    However, this is really no excuse for not having TCO.

  6. drj11 Says:

    @NickB. Right, but in terms of debugging Python there are way more pains in my arse that I would address before I got round to “a way to disable TCO when I need to”.

  7. Ted Lemon Says:

    Is Python really such a great programming environment that it matters whether or not it has TCO? Why not just program in Scheme?

    (I am doing a lot of programming in Python right now, so you could argue that this is a hypocritical question, but I mean it sincerely – Python is something from which I am ambitious to be liberated.)

    And of course Nick is right about debugging tail call optimized code, but the same can be said of debugging any optimized routine that has branches.

  8. mathew Says:

    I’d just like to point out that Ruby 1.9 has tail call optimization. However, it’s turned off by default for reason [b], that people like stack backtraces. You can enable it in vm_opts.h.

    [Join the light side, we have tail call optimization!]

  9. Vilhelm S Says:

    If you want to combine stack traces and tail call, here is another idea (as opposed to switching a different procedure call mechanism after a certain stack depth):

    Implement tail calls by garbage collecting the stack. So a tail call pushes the return address to the previous stack frame, but does not overwrite the current one, and it’s up to the garbage collector to reclaim the unreachable stack frames. (Some scheme implementations do it this way, e.g. Scheme48 I think. See e.g. http://mumble.net/~kelsey/papers/stack-gc.ps.gz)

    Now, in a debug build, you can have the caller also push a weak pointer to the current stack frame. You could even have a special type of weak pointer that does not get collected in the first generation, in order to ensure that these frames will stick around for a while.

    So now, if you have some procedure foo calling some mutually tail-recursive procedures loop1 and loop2, you will have backtrace information for the most recently done tail recursive calls, but as you go up the stack, at some point you will get to where the garbage collector has had time to collect them, so there will be a gap — but you still know where the first non-tailrecursive call in the stack was, since that’s where the real return pointer points. And the debugger can always know where some intermediate frames were elided due to being garbage collected, so it can show it in the user interface keeping the programmer un-confused. For instance, in this example the backtrace could be displayed as

    foo()

    loop1(640)
    loop2(641)
    loop1(642)
    loop2(643)

    Where the ellipsis means that there is some gap of collected frames between the activation of foo and of loop1(640).

  10. Ted Lemon Says:

    In order to do this in the case of a recursive tail call, you’re forgoing an opportunity to avoid generating some new garbage by reusing the storage from the previous invocation of the function, which seems like it might be expensive.

    However, in general I’d certainly love to see some numbers on this, since it’s also an elegant solution to the problem of remembering continuations. I seem to recall one of the T guys making the claim that this performed poorly, but he didn’t substantiate that claim with a description of his implementation or any numbers (at least not in the paper I was reading at the time).

  11. Nick Barnes Says:

    Although I’ve built, and helped to build, development environments myself, I’m always a bit mystified by the term. What is a “development environment”, and why exactly would I want one?

    You can take it as read that I will continue using emacs to edit my code (real GNU emacs, not the “oh, just like emacs, really, you’ll never tell the difference” garbage that comes with some language implementations). And I’ll always want a fairly functional TTY-based debugging tool (gdb for C; pdb for Python is OK). And Unix, or cygwin if forced to use Windows: make and grep and so on. And of course SCM, for which Perforce is just dandy.

    So I’m not unhappy with my Python “development environment”. What am I missing?

  12. Ted Lemon Says:

    Hm, if you’re referring to my comment about programming environments, I guess I should try to respond. A development environment is the environment in which you develop, which for a lot of us is an ad-hoc collection of tools like Emacs and a python interpreter.

    The things Python is missing that I really care about are:

    – first class continuations
    – real lambdas
    – a free-form syntax, so that when I reindent code I don’t accidentally change its meaning
    – macros

    (And probably lots of other stuff I’m not thinking of.) It’s pretty nice, and I get a lot of work done with it, but if I were going to expend effort on making a program language better, Python is not where I would expend that effort (as you know… :’).

  13. Nick Barnes Says:

    Agree about lambdas: Python’s are not great, especially syntactically. Agree about macros, in the Lisp sense – i.e. functions for generating source code. My settled view is that this is what makes Lisp uniquely the best language, despite all its myriad failings (I’m coding in Lisp this week). Macros in the C sense I have never missed in Python.

    Syntax, meh. Change your “re-indent this stuff” keybinding.

    First-class continuations, also meh. I can’t remember the last time I thought “oh, I wish I had continuations” (unlike good lambdas and good macros).

  14. Ted Lemon Says:

    The problem with syntax is that there is no syntax that denotes nesting – the only denotation for nesting is indentation. So if you get distracted in the middle of reindenting, it can substantially change the meaning of the code. I’ve accidentally done this more than once. You can call me stupid, but for me at least it’s a serious problem.

    I am not sure I actually need first-class continuations, but when I finally grokked what they did a while back, I found them to be a very nice way of encoding non-local exits. Even before I really understood what they were in CS terms, they seemed like a pretty nice way to avoid unrolling linear code flows in GUI implementations.

    Ultimately explicit control over continuations may be unnecessary, but at the moment I don’t believe that – I think they are useful. Granted, I haven’t used them enough to be able to say for sure.

  15. Nick Barnes Says:

    Yes, single-shot continuations are good for non-local exits.

    Don’t get me wrong, I like continuations in theory. I particularly like the way that a continuation-passing style implementation can do without a stack: once the continuation is an object, you can invent any control-flow you can imagine. But in practice it’s way too much rope.

  16. Nick Barnes Says:

    “distracted in the middle of re-indentation”

    The process fix for this is to do your re-indentation all-at-once, rather than a line at a time. C-x r o and C-x r d are my friends. It took a while for me to retrain my fingers.

  17. Ted Lemon Says:

    Umph, reindentation serves a purpose other than moving blocks around – it’s part of my thinking process. Having to switch to a macro that does it instantly is broken.

    Oh, other gripes about Python: backslashes for line continuation? What is this, 1960? Next you’ll tell me that I should put a ‘C’ in column 6. And there’s no visual separation between an if statement and its code block–the conditional test lines right up with the first line of the code block. Having syntax to define blocks makes code more readable. Too much different syntax, like Ruby, is bad, but no syntax is also bad.

  18. Nick Barnes Says:

    Can you give an example of what you mean regarding if statements?

  19. Ted Lemon Says:

    if (a in ['a', 'b', 'c', 'd'] or
        1 == 2):
        a = a + 'abercromby'
    else
        a = a + 'fitescue'
    

    You see, the 1 is directly above the a, so it looks like they’re part of the same code block.

  20. Ted Lemon Says:

    Sigh. No, you don’t see, because wordpress hates me. [ed: I fixed it! use PRE not CODE]

  21. Nick Barnes Says:

    This would be exactly the same in a braces language, and can be fixed in the same way: a larger indentation quantum. I recommend 8.

  22. Ted Lemon Says:

    Oh, now you’re just *trying* to provoke me, aren’t you? :’)

  23. drj11 Says:

    I only indent continuation lines by 2 spaces for exactly this reason. This also helps with the sucky backslash for continuation lines, as you now have 2 (crappy) clues.

  24. drj11 Says:

    «a in ‘abcd’», by the way. [ed: Ooh. Scratch that. It has different semantics.]

  25. drj11 Says:

    Oh yeah. And while we’re whinging about whitespace and continuation lines, I wish Python used the “optional newline” rule from BCPL. Basically if a line ends with a binary operator, then the following newline can’t terminate the statement so the next line continues it. This mans one would be able to go:

    if a in ['a', 'b', 'c', 'd'] or
      1 == 2:
        a = a + 'abercromby'
    else
        a = a + 'fitescue'
    
  26. Ted Lemon Says:

    God yes, that would be a huge improvement. I realize that I’m just being a fussy whiner when I complain about the trailing backslash, but my entire coding style would change (I think for the better) if this were the rule.

    Of course, there’s a case to be made for having the text editor handle all of this, and not even encode line breaks in the source file…

  27. Jason Says:

    Abstract. Python is a terrible programming language, for various reasons. Among which: the language designer won’t take TCO, because he thinks it would be really confusing for Python developers, especially newbies. The hole in his argument is that stack traces are bad and anyone trying to debug a Python program is already screwed anyway because he’s using such a horrible language.

    :-P

  28. Carl Meyer Says:

    Yeah, the line-continuation backslash sucks. In practice, though, I find when I need to wrap lines I can usually do it inside parenths that would be there anyway; and if not, I add some :-)

  29. drj11 Says:

    @Carl: I add parens too, but it makes me feel like I’ve forgotten the Python syntax and reverted to C syntax (where parens are required) just out of familiarity.

    I just did a quick check of some Python code I’ve been working on. For “if” all the “long-line” cases were chains of either “and” or “or”. They would’ve been admirably handled by my BCPL suggestion above.


  30. Grreat blog you have here


Leave a comment