Talk about Python

2008-05-07

PyConUK 2008 is asking for submissions. You should give a talk.

You should give a talk because I want to hear what you have to say. And so will others. People tend to think that no-one will want to hear what they’ve been doing, or that it won’t be interesting. But on the whole I just don’t think it’s true.

Communication is a skill, and it is improved by practice. I think computer scientists and software engineers should all strive to be better communicators.

Python is not a huge community, we do not have a PR department. If we want Python to grow then we have to grow it ourselves. Conferences like PyConUK are an important part of that community growth, and the speakers—that’s you—that provide content are an important part of the conference.

What should you speak about? Why you love Python, why you hate Python. Your favourite module. How you saved the world from global nuclear catastrophe with a simple Python script. Anything really.

I’ve submitted my talk already, and—shock—it’s not about functional programming.

Look forward to hearing what you have to say in September.


Lost Searchers

2008-05-06

WordPress gives me a very convenient view of what people have been searching for when they reach my blog. I think most readers are regulars; some come from sporadic programming.reddit.com spikes, but some come via search.

I feel sorry for some of these people, coming across my blog in the middle of their search must be like stepping on a rake whilst looking for the lost ball in the garden.

«Element Not Good For Monks» - ok, so the search is a bit wacky, but I’m sure I’m not the blog they’re looking for. There’s a Kryptonite for monks?

«quilting pattern flower uk» - Search engines have obviously decided on the basis of this post about purple things, and this one about multiplication in rings that I have something to do with flower quilting patterns. Sorry, no.

«multiplication table» - this is a relatively popular search term, it crops up a few times a week. Sufficiently popular that I’m thinking of creating a blog post for it.

«c++ matrix multiplication code» - close, but no cigar. Anyway, using Google to do your homework assignments is cheating.

«how to find sex of foetus in ultrasound» - it’s clear what they’re looking for and my blog doesn’t help them at all. If I had to guess I’d say you look for the willy.

«emacs python color» - there are quite a few variations on this theme, obviously looking for emacs modes that do colour syntax highlighting for various languages (not just python, I see fortran, javascript, and so on). Sorry, no idea.

«metroid 2 map» - Well, I suppose I do have a Metroid II map, but really it’s a celebration of how not to do mapping. Sorry, not very useful. Anyway, regardless of what I think about maps in video games, using somebody else’s map is clearly cheating.

«where is the colon» - umm… somewhere behind your belly button?

«most fragile part of the brain» - I have no idea, and I’m not just about to experiment to find out.


The use of blah (_), aka underscore

2008-04-18

I’m part of a tradition of C programmers that pronounce «_» as “blah” (other people might call it “underscore”). I got the habit from Mr Moore and Dr Owen. Given how much the blah character appears, it’s useful to have a one-syllable pronunciation for it. That we chose “blah” is a bit unfortunate as it conflicts with “blah” in ordinary speech meaning “insert extra random stuff here”. Unfortunate, but also I suspect partly deliberate.

In C and friends this character has different connotations. Some are blessed by the standard, some have grown up in the communities that use them.

Blah is used to denote reserved identifiers in C. All identifiers starting with __, two blahs, are reserved for use by the implementation. Identifiers of this form are often used to introduce non-standard extensions without fear of breaking any existing strictly conforming programs. For example, GCC allows assembler to be written in a C function by using the __asm__ keyword. Because this keyword starts with __, GCC can guarantee that no correct portable C program can contain this keyword so GCC won’t break any existing code. If GCC had chosen asm to introduce inline assembler then this would have broken any existing C code that legitimately used asm as an ordinary identifier (for example, the name of a function).

Identifiers beginning with _ and followed by a capital letter are also reserved for the use of the implementation. Clearly no portable C code should use _ followed by a capital letter.

C itself has used this to extend the language in a safe manner that doesn’t upset any pre-existing portable code. In C99 a proper Boolean type was added. Obviously this type couldn’t be called bool as that would upset any code that perfectly reasonably used that name for its own purposes. So in C99 the Boolean type is called _Bool. No portable code can have been using this name already (since it was reserved by the C standard), so it was safe for the new version of the C standard to use.
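A minimal sketch of the C99 type in use. The helper name as_flag is made up for illustration; the point is only that storing any nonzero scalar in a _Bool yields 1, which is what separates it from the int flags that older code used.

```c
/* Hypothetical helper: conversion to _Bool collapses every
   nonzero value to 1, and zero to 0. */
int as_flag(int n)
{
    /* _Bool needs no header; in C99, <stdbool.h> merely defines
       the macro bool as another spelling of it. */
    _Bool b = n;
    return b;
}
```

So as_flag(42) and as_flag(-7) both give 1, where a plain int flag would have kept the original value.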

Another place you often see _ is at the beginning of struct tags. Recall that a struct tag is the foo in «struct foo {int a; void *b;}». Some people code like this:

struct _node
{
    struct _node *parent;
    struct _node *prev;
   ...

As far as I can tell, this is voodoo (maybe it has something to do with C++, how would I know?). In particular there’s nothing wrong with this code:

typedef struct node *node;

(or «typedef struct node node;» if you prefer)

In C it’s totally fine to use node for a struct tag as well as for a typedef. That’s because they’re separate namespaces (see [ISOC1999] section 6.2.3).
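Here is a small sketch of that, using the non-pointer form of the typedef. The members and the list_length function are invented for the example; only the reuse of the name node for both tag and typedef is the point.

```c
#include <stddef.h>

/* "node" the struct tag and "node" the typedef name live in
   separate namespaces, so neither needs a leading blah. */
struct node {
    int value;
    struct node *next;
};
typedef struct node node;

/* Count the nodes in a NULL-terminated list. */
size_t list_length(const node *n)
{
    size_t count = 0;
    for (; n != NULL; n = n->next)
        count++;
    return count;
}
```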

Lots of people use an initial capital letter for their types. If the blah convention for struct tags is used as well then this combines disastrously with the ISO C reserved namespace. This Amaya code is full of it:

typedef struct _Match
  {
     struct _SymbDesc   *MatchSymb;	/* pattern symbol */

Sorry guys, that’s straight-out use of a reserved identifier, and I claim my angry warthog. In all fairness to Amaya, they’re not the only ones doing it.

More boring uses of blah include:

Using it to separate components of a name. The C standard uses this for some of the macros it defines:

CHAR_BIT, DBL_DIG, EXIT_SUCCESS

This is of course a very widely adopted convention. A specialised use of this is where the components have different meanings and the blah is used as a sort of punctuation character. In «size_t», the «size» is really the name of the type, and the «t» suffix denotes that the name is a type (not a function or a variable, for example); there’s no enforcement of this, it’s just a convention. Similarly in «tm_year», «year» is the name of the field, and «tm» is a hint that the field is a member of the «struct tm» type (this is a hangover from pre-ANSI days when all structure member names shared the same namespace).

The standard is hilariously inconsistent in this regard. «size_t» is the simple case, but we have «ptrdiff_t», «sig_atomic_t» (note, blah between «sig» and «atomic», but not between «ptr» and «diff»), and «va_list» (it’s a type, but there’s no “blah t” suffix).
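To see both conventions side by side, here is a short sketch. The function names span and full_year are made up for the example; ptrdiff_t, struct tm and tm_year are the standard’s own.

```c
#include <stddef.h>
#include <time.h>

/* "_t" suffix: ptrdiff_t is a type name, the type of a
   pointer difference. */
ptrdiff_t span(const int *first, const int *last)
{
    return last - first;
}

/* "tm_" prefix: tm_year is a member of struct tm, holding
   years since 1900. */
int full_year(const struct tm *t)
{
    return t->tm_year + 1900;
}
```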

The C standard doesn’t use blah in ordinary identifiers (functions, basically). But loads of other people do. Once you get a sufficiently large C program where you have to think about modularity, it’s a good way to separate names into different (conventional) namespaces. xpidl_parse_iid, ssh_hmac_init, that sort of thing.

A sort of extension of this is using underscore to separate identifier metadata from its name. In «m_var», the m might indicate a member variable.

There are rarer, but possibly more interesting uses of underscore.

One is when you really really want to use a keyword as a name. Actually, for me, this crops up in Python more than C. I find I constantly want to call an input file «in», and quite often want to call a variable that holds a class, «class».

Another is when you have a local variable in a function that’s really just the same as one of the function parameters, but with a different type. In an object oriented language that might happen because of upcasting. In C it commonly happens because a generic pointer is passed as a void * but the called code uses it with some more specific type.

qsort is a good example. Say you want to sort an array of ints in C. You need a function that compares two ints. qsort expects a function that takes two const void * arguments. So this sort of thing is common:

int compare_int(const void *l_, const void *r_)
{
  const int *l = l_;
  const int *r = r_;
  ...
}

(Personally I would tend to call these «lArg» and «rArg», but I’m not going to blink at this use of blah).
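For completeness, one way the elided body might be filled in, with a caller. This is a sketch, not the only way to write the comparison, and the wrapper sort_ints is a name invented here.

```c
#include <stdlib.h>

/* qsort hands us const void *; the blah-suffixed parameters are
   converted once into typed locals with the natural names. */
int compare_int(const void *l_, const void *r_)
{
    const int *l = l_;
    const int *r = r_;
    /* Yields -1, 0 or 1 without the overflow risk of *l - *r. */
    return (*l > *r) - (*l < *r);
}

/* Sort n ints in ascending order. */
void sort_ints(int *a, size_t n)
{
    qsort(a, n, sizeof a[0], compare_int);
}
```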

You can see OpenSSL doing something similar with the «data_» argument to md4_block_data_order. The function takes the argument «const void *data_», but the body of the function actually wants to use data as a char *: «const unsigned char *data=data_;».

Another use of blah is to mark deliberately unused parameters. This can commonly happen in callback schemes. Like the previous qsort example, where the comparison function is constrained to have a particular type, often a callback is constrained to have a particular type, taking particular parameters. Sometimes not all of these parameters are needed. Example: zlib’s zalloc interface. zalloc’s type, when you peer behind the typedefs, is:

void *(*)(void *, unsigned, unsigned);

It’s a function pointer. If you want to use this interface, which you would do when you want zlib to use a custom allocator rather than malloc, then you need to implement a memory allocation function that takes 3 arguments:

void *superalloc(void *opaque, unsigned n, unsigned size) { … }

opaque is an opaque pointer that zlib doesn’t care about. It simply passes it from the struct z_stream_s opaque member to your function. What if you don’t need it? Then stick a blah after the parameter name to indicate that you intend to not use it:

void *superalloc(void *opaque_, unsigned n, unsigned size) { … }
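A sketch of what such a function could look like, written against plain malloc so that it stands alone without zlib. The body is an assumption for illustration; only the signature comes from the interface above.

```c
#include <stdint.h>
#include <stdlib.h>

/* opaque_ is deliberately unused; the trailing blah tells the
   reader so. */
void *superalloc(void *opaque_, unsigned n, unsigned size)
{
    (void)opaque_;      /* also quiets unused-parameter warnings */
    if (size != 0 && n > SIZE_MAX / size)
        return NULL;    /* n * size would overflow size_t */
    return malloc((size_t)n * size);
}
```

The cast to void is belt and braces: the blah tells the human, the cast tells the compiler.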

Anyone know any other uses of blah?


Introduction to Functional Programming (UKUUG type)

2008-04-03

On Wednesday at the UKUUG Spring Conference I gave a talk: «Introduction to Functional Programming in Python». This sounds suspiciously similar to my PyCon UK talk I gave last year. I had intended to only tweak the talk a bit, but in the end quite a lot of the material changed, and there’s not actually all that much overlap.

Slides (769e3 octet PDF) and notes (52e3 octet PDF).

Thanks to those that attended.


Embedding Lua in 5 Minutes

2008-04-03

So at the UKUUG Spring Conference I kind of decided that there weren’t enough different dynamic languages being talked about; in fact it was pretty much divided into Python land and Perl land (at least as far as dynamic languages were concerned). So I decided to give a 5 minute lightning talk at the end of the conference, on embedding Lua into an application in 5 minutes.

This was my first lightning talk and it was a bit scary and a lot of fun. I highly recommend the experience.

I decided that instead of talking about embedding Lua, I would actually do it, standing in front of the conference live. Including downloading the Lua sources and compiling them (yay for working conference wifi). I thought this was hilarious, I have no idea what anyone else thought. I surprised myself by being able to type code in vi and talk at the same time, though I’m not sure I made much sense.

Here’s an example using the new command line option I added to yes:

$ ./a.out -l 'x=x or 1; x=x*2; return x' | head
2
4
8
16
32
64
128
256
512
1024

For the sake of completeness here is the modified version of yes.c that I ended up with:


#include <sys/cdefs.h>

#include <stdio.h>
#include <string.h>
#include "lua.h"
#include "lauxlib.h"
#include "lualib.h"

int main __P((int, char **));

int
main(argc, argv)
        int argc;
        char **argv;
{
  /* -l CODE: run the Lua chunk CODE forever, printing what it returns. */
  if(argc >= 3 && strcmp(argv[1], "-l") == 0) {
    lua_State *l = luaL_newstate();
    luaL_openlibs(l);

    while(1) {
      luaL_dostring(l, argv[2]);
      puts(lua_tostring(l, -1));
      lua_settop(l, 0);
    }
  }

        if (argc > 1)
                for(;;)
                        (void)puts(argv[1]);
        else for (;;)
                (void)puts("y");
}

/*
 * Copyright (c) 1987, 1993
 *      The Regents of the University of California.  All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 * 3. All advertising materials mentioning features or use of this software
 *    must display the following acknowledgement:
 *      This product includes software developed by the University of
 *      California, Berkeley and its contributors.
 * 4. Neither the name of the University nor the names of its contributors
 *    may be used to endorse or promote products derived from this software
 *    without specific prior written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED.  IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 */

Background Checks For Our Corporate Citizens

2008-02-29

Before people are employed it is routine to do some sort of background check on them. Most employers would check a candidate’s references before going ahead and employing. Some jobs (or perhaps employers) have more involved background checks than others. Applicants for local government positions, for example, are required to disclose any previous convictions. Jobs that involve working with children require a CRB check.

I think we should extend this background checking principle to corporations that we contract with. For example, when a school contracts with a caterer it should check that the caterer has not been convicted of pushing alcohol at kids; when government buys software perhaps it could check and see whether the vendor has been convicted of running an illegal monopoly in multiple countries.

I’m looking at you Microsoft.


The GNU GPL is not an EULA!

2008-02-25

MPlayer’s OS X pkg displays the GNU GPL in the license section of the installer. The installer then requires that I click a button labelled “Agree” in order to continue. This is all fine and normal practice for the EULAs that are attached to software. MPlayer is not the only one that does this; quite a lot of open source software packaged for the Mac does it.

But the GPL is not an EULA!

The GPL is not a license to use the software. I can use the software without agreeing to the GPL. It says so, right there, clause 0: “The act of running the Program is not restricted”. The GPL is a license to distribute the software, if I don’t do any distribution I don’t need the license.

This is an important point about the GPL that is not understood by enough people. The GPL is not like (most) other software licenses, because it does not restrict my use. Unlike a traditional EULA which attempts to prevent me from doing something which I might otherwise be able to do, the GPL only licenses me to do something that I otherwise wouldn’t be able to do, namely distribute it. If I don’t want to distribute the software (and I’m certainly not obliged to), then I don’t need to agree to the GPL.

I think the GPL is very cunning in this regard.

So to summarise: The license section of the OS X packager is for EULAs, and I never want to see the GPL in that section again. Okay?


Abuses of Lambda, by Design

2008-02-11

Again and again I see the Greek capital letter lambda, Λ, standing in for the English (Latin) capital letter A. You know, in trendy logos for film studios, web consultants and the like.

This is a sin against typography and it must stop!

When I’m reading, it just trips me up to see a capital lambda in the middle of English text. Yeah yeah, it looks cute and it introduces all sorts of amusing design possibilities, but it’s just bad writing.

I feel a tiny bit guilty about this rant because the most recent example I observed was Transitive:

who just happen to be one of the sponsors for the UKUUG Spring Conference at Birmingham where I am giving a talk. On, guess what? Lambda.

Ooh I just found another one (I knew there was a good reason to delay publishing this article):

Navarre logo

and they commit the additional sin of using a Greek capital letter xi. Is there no end to this madness!

[A couple of months later Dyalog sent me an e-mail inviting me to their corporate headquarters]

Dyalog logo


The perils of going to Canada

2008-02-08

I went to Canada and Jeremy Beadle died! Oh My Gosh! I only just found out, why did no-one tell me?

Obligatory XKCD cartoon.


Canadian foetal gender

2008-02-06

According to the in-flight magazine, in Canada it is illegal for a doctor to disclose the gender of a foetus (to the woman bearing it, or anyone else) until the foetus is 24 weeks into term. Naturally there are walk-in clinics in California and Washington that are prepared to do the ultrasound gender determination for a reasonable fee.

Will they make it illegal to travel to another country in order to access medical facilities that are unavailable indigenously? And illegal to own and operate an OB ultrasound machine (Tom Cruise did it, but then there was a bill before senate to make it illegal; I got bored of following the trail)?

The ultrasound operator knows the gender. How is it ethical to withhold this information from the patient?

The other thing I learned on the plane is that people used to terminate pregnancies using slippery elm bark. Slippery elm bark cannot be sold in the UK.

I find myself wholly unequipped (quite possibly in a physical as well as a mental sense) to think about these issues.