Archive for September, 2008

700,000,000,000 USD


How can we visualise such a large amount?

Well, it’s about a dime for every square meter of land in the US (9.1 cents per square meter of the contiguous 48).

Instead of buying Wall Street, the US taxpayers could’ve bought over 77 Large Hadron Colliders!

Or… enough gold to cover Lake Superior in a gold sheet (16 nanometers thick!).
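These figures are easy to sanity-check. A sketch of the first one, assuming a land area of roughly 7.66 million square kilometres for the contiguous 48 (that area figure is my assumption, not from the original):

```python
bailout_usd = 700e9            # the USD 700,000,000,000 bailout
land_m2 = 7.66e12              # assumed land area of the contiguous 48, in m^2
cents_per_m2 = bailout_usd / land_m2 * 100
print(round(cents_per_m2, 1))  # roughly 9.1 cents per square meter
```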

700 billion dollars, man!

My Battle with Metroids


«Metroid II: Return of Samus» (Game Boy): Got to final boss, haven’t killed it.

«Super Metroid» (SNES): Got to final boss, haven’t killed it.

«Metroid Prime» (GameCube): Got to final boss (*), haven’t killed it.

«Metroid Fusion» (Game Boy Advance): Got to final boss, killed it! Yay!

So, what are my chances with:

«Metroid Prime Hunters» (DS)


(*) I have a sneaking suspicion that the “final boss” I am currently fighting in Metroid Prime may not in fact be the final boss. How can I tell?

I learn Python


It takes a long time to get to the point where you stop learning a language; perhaps you never do stop.

Some useful Python tidbits I’ve recently picked up that I feel I ought to have known much earlier:

Using the dict constructor instead of the literal syntax:

# awkward
{'cht': 'lc', 'chs': '500x600'}
# nice
dict(cht='lc', chs='500x600')

This is mentioned in the tutorial. Bad code monk. (Ah, this is that newfangled Python 2.3; that explains why I previously didn’t know it. Basically I learnt the vast majority of my working Python when lambda was new and «from __future__ import nested_scopes» was necessary.)

When I met Raymond Hettinger at PyCon UK (I was doing a “Code Clinic” with him) one of the first questions I asked him was “when are we going to get unzip?”; when he replied “zip is also unzip, you can just use zip(*l)” I had one of those forehead-slapping moments (quite literally, I think). Of course I knew that “zip(*l)” transposed a matrix (see Norvig’s IAQ), but I wasn’t used to thinking of a list of pairs, say, as being a matrix that I could transpose to get a pair of lists. Kick! Punch! It’s all in the mind.
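A minimal illustration of zip-as-unzip:

```python
pairs = [(1, 'a'), (2, 'b'), (3, 'c')]
# zip(*pairs) transposes: a list of pairs becomes a pair of tuples
numbers, letters = zip(*pairs)
print(numbers)  # (1, 2, 3)
print(letters)  # ('a', 'b', 'c')
```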

Using split instead of literal lists:

# awkward
['F', 'PD', 'AD', 'D', 'TD', 'ED', 'ABO']
# nice
'F PD AD D TD ED ABO'.split()

This isn’t so much a new trick, but I was always a bit embarrassed about it. But now I’ve seen other Python programmers do it too, so I know it’s more socially acceptable.

Now one that I don’t actually use so much, nicked from Thomas Guest’s blog:

"chd=s:%(xs)s,%(ys)s" % locals()

Can you see what’s going on? The variables xs and ys are local variables whose values are spread into the string using «% locals()» as a sort of “interpolate from local variables” operator. I’m not a great fan, but I can see that it is useful. I’m not such a fan because it uses locals(), which is very cool but could interfere with a compiler’s optimisations. Though in this case, with a constant string on the left of the % operator, the compiler has enough information to do a good analysis.
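A self-contained sketch of the idiom (the variable names and values are mine, chosen to match the snippet above):

```python
def chart_data():
    xs = 'Hello'   # hypothetical simple-encoded x data
    ys = 'World'   # hypothetical simple-encoded y data
    # the %(name)s placeholders are filled in from the local namespace
    return 'chd=s:%(xs)s,%(ys)s' % locals()

print(chart_data())  # chd=s:Hello,World
```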

Speaking of Python string formatting, I was disappointed to learn that the optimisation of compiling format strings, which is routine in the Lisp world, is not generally done in Python. The optimisation I am talking about is one where «”some constant string” % stuff» gets converted into «some_function(stuff)» where some_function is a compiled function that does the formatting. It’s just one of the signs of how immature the whole Python thing is.

And one specially for Pythonistas doing Google charts:

>>> d = dict(cht='lc', chs='500x600') # The dict from above
>>> '&'.join(map('='.join, d.items()))

An earlier version of this example involved a lambda: «'&'.join(map(lambda item: '='.join(item), d.items()))», but then just as I was pasting it into this article I realised I could drop the lambda altogether. Bound methods rule!

What Python have you learnt recently?

Putting the Heat on Wheat


Wherein I play with the lovely Google Charts API and expose my total incompetence in statistics, economics, agriculture, and geography. And quite possibly other things too.

So I was reading the Open Knowledge Foundation blog and came across this article featuring US wheat production, which points to this dataset of wheaty goodness. My recent work on Clear Climate Code had made me already aware of the availability of GISTEMP’s summary data products.

So it occurred to me that this could be used to answer the question “when the weather is warmer, does more wheat grow?”.

So the wheat data is US wheat production, including yields in bushels/acre, sigh. GISTEMP even do a dataset that shows the temperature anomaly for the US. I think this is incredibly parochial, but it happens to be just what I want.

So the wheat yield (volume of wheat per harvested unit area) has a general upward trend, at least from the mid-1930s or so. Because I’m only interested in the local variation I have detrended the wheat data:

My hypothesis is that any deviation of the temperature from the long term average will lower wheat yields. I think this because I would expect that over the thousands of years of selection humans will have cultivated a variety of wheat that is optimised to grow at the average temperatures and it will do less well when temperatures deviate.

So what do we see? Here’s wheat yields and temperatures together:

Well, there’s no obvious correlation to eyeball. Scattergram:

(which is almost just changing ‘cht=lc’ to ‘cht=s’ in the above chart URL)

Bit of a blurry mess. If anything there’s a slight negative trend, which would mean that colder temperatures gave a higher wheat yield. And indeed Pearson’s correlation is about -0.3 (assuming my calculations are correct), indicating a weak negative correlation.

There are problems. One problem is that I have no p-value. That’s partly because I haven’t read that far on the Wikipedia page (I’m not using some fancy stats package for my analysis; everything is hand-coded in Python), and partly because I have a degrees of freedom problem. Temperature is autocorrelated, so whilst I have 128 samples, that’s fewer than 128 degrees of freedom, so the standard assumption of independent variables is incorrect.
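For the record, Pearson’s r is straightforward to hand-code; a sketch along the lines I mean (not necessarily the exact code I used):

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = float(len(xs))
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

print(pearson([1, 2, 3, 4], [8, 6, 4, 2]))  # -1.0, a perfect negative correlation
```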

The other problem is that it looks like the detrending might have introduced a bit of an alarming feature into the wheat anomalies. There’s a gentle hump from 1866 to about 1940 and a similar one from about 1940 to 2000. This is almost certainly because I’ve used a cubic polynomial to fit to the data to detrend it. It looks like a two-leg linear fit would be better (with a kink around 1942), but I haven’t found how to do that. I have a sneaking suspicion I have some FORTRAN code lying around here to do it, but I’m too scared to look.
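For what it’s worth, the cubic detrend is only a few lines if you let numpy do the least-squares fit (my analysis was hand-coded, so this is a stand-in, with synthetic data in place of the yield series):

```python
import numpy as np

def detrend_cubic(x, y):
    """Subtract a least-squares cubic fit, leaving the local anomalies."""
    coeffs = np.polyfit(x, y, 3)     # fit a degree-3 polynomial
    trend = np.polyval(coeffs, x)
    return y - trend

# synthetic data: a cubic trend plus a wiggle, standing in for 1866-2000 yields
x = np.arange(135, dtype=float)      # years since 1866
y = 1e-5 * x ** 3 + 0.1 * x + np.sin(x)
anomalies = detrend_cubic(x, y)
# residuals of a least-squares fit (with a constant term) average to zero
print(abs(anomalies.mean()) < 1e-6)  # True
```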

Final tiny problem almost too small to be worth mentioning: the wheat data is for the entire US, whereas the temperature data is for the contiguous 48. I’m guessing that Alaska and Hawaii make so little wheat contribution that it doesn’t matter.

In any case it doesn’t really look like fixing these problems would ever indicate a strong positive trend between temperature anomalies and wheat yields. So we can reject the notion that warmer weather means higher wheat yields. Of course warmer weather might mean we can grow more of something else (possibly just a different variety of wheat); it also might mean that the available belt of land for growing wheat is larger (but this is unlikely since it probably means the available belt of land for growing wheat has moved North).

Ofsted: satisfactory doublethink


Maybe you’ve read the BBC article “how maths teaching is not good enough”? Perhaps you should read the Ofsted report. Perhaps I should.

41% of the maths teaching (in secondary schools) is satisfactory. The tone of the news article is that this is not good enough.

This is characterised by the section headline in the Ofsted report (section 26): «What is not good enough about ‘satisfactory’ teaching?»

I have news for Ofsted. “satisfactory” means almost the same thing as “good enough”. If you’re not satisfied with “satisfactory” teaching, then you set your assessment criteria incorrectly. How unsatisfactory.

My PyCon UK talks


My «Embedded Programming with Python» talk was the first Saturday morning slot, and really it was about manipulating hex-files and ‘scope dumps with Python:

Slides as PDF, 2.5e6 octets.

On Sunday Nick B gave a presentation of Clear Climate Code. I didn’t do much during the presentation; I was the OmniGraffle monkey. Here’s the presentation at the Clear Climate Code site.

Xbox sales


Microsoft: “About 75% of all console sales have been made below [USD 200]”

Another way of saying that is: “In the future we expect to sell no more than 3 times what we already have”.

Reactionary twat!


So Jacqueline Wilson’s novel “My Sister Jodie” features a character that uses the word “twat” in dialogue (apparently, I haven’t read it). After 150,000 copies have been sold, an attentive aunt spots the word whilst reading the entire novel before presenting it as a gift to her niece. Naturally she returns the book to the retailer and complains. The retailer pulls the book and the publisher decides that the next run will have all twats replaced with twits (imagine that!).

I am reminded of the words in the RAF “Arctic Survival” pamphlet, PAM (AIR) 226, published January 1953 (reprinted May 1966):

Section 143, “Final Decision”

Once you have made your decision stick to it. The decision will have been reached after considering all the factors when your minds were fresh. As time goes on, your powers of reasoning will deteriorate and there may be a tendency to consider factors individually instead of collectively.

Of course the pamphlet is talking about whether to leave your downed aircraft when you’re in a survival situation north of the timber line, but I think the same approach applies.

Wilson no doubt considered the impact when she penned her “twat” into the character’s dialogue, and she will have considered it in various rewrites of the drafts. As will her editor. The collective decision of her, her editor, and her publisher seems to have been that it was an appropriate and honest use of the word.

For the publisher to react to a single complaint by changing the next printing seems to be, well, reactionary. Jacqueline Wilson had an opportunity to take a real stand here, to say that her considered words stand, and that it’s her right to publish what she thinks fit. I’m sure Wilson has the best interests of her readership at heart; she is not trying to offend a nation, she is trying to improve it. Well, I thought she was; this move just seems publicity-seeking and reactionary. And if it’s the publisher acting and not Jacqueline Wilson, well, if she can’t keep her publisher on a tight leash, who can?