Archive for the 'programming' Category

Why big is bad

2014-09-26

If you know me, you know that I don’t like using /bin/bash for scripting. It’s not that hard to write scripts that are portable, and my earlier “10 tips” article might help.

Why don’t I like /bin/bash? There are many reasons, but it’s mostly about size.


drj$ ls -lL $(which sh bash)
-rwxr-xr-x 1 root root 959120 Sep 22 21:39 /bin/bash
-rwxr-xr-x 1 root root 109768 Mar 29 2012 /bin/sh

/bin/bash is nearly 9 times the size of /bin/sh (which in this case, is dash). It’s bigger because it’s loaded with features that you probably don’t need:

  • An interactive editor (two in fact). That’s great for interactive use, but it’s just a burden for non-interactive scripts.
  • Arrays. Arrays are really super useful and fundamental to many algorithms. In a real programming language. If you need arrays, it’s time for your script to grow up and become a program, in Python, Lua, Go, or somesuch.
  • Ditto job control.
  • Ditto Extended Regular Expression matching.
  • Ditto mapfile.
  • Ditto a random number generator.
  • Ditto a TCP/IP stack.

You might think that these things can’t harm you if you don’t use them. That’s not true.

  • We have a little bit of harm just by being bigger. When one thing is 10 times bigger than it needs to be, no one will notice. When everything is 10 times bigger than it needs to be then it’s wasteful, and extremely difficult to fix.
  • These features take up namespace. Got a shell script called source or complete? Can’t use it, those are builtins in bash.
  • They slow things down. Normally I wouldn’t mention speed, but 8 years ago Ubuntu switched from bash to dash for the standard /bin/sh and the speed increase was enough to affect boot time. Probably part of the reason that bash is slower is simply that it’s bigger. There are more things it has to do or check even though you’re not making use of those features.

If you’re unlucky a feature you’ve never heard of and don’t use will interact with another feature or a part of your system and surprise you. If you’re really unlucky it will be a remote code exploit so easy to use you can tweet exploits, which is what ShellShock is. Did you know you can have functions in bash? Did you know you can export them to the environment? Did you know that the export feature works by executing the definition of the function? Did you know that it’s buggy and can execute more than bash expected? Did you know that with CGI you can set environment variables to arbitrary strings?
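To make the CGI point concrete, here is a minimal Python sketch of how a CGI-style server copies request headers into environment variables. The function names and the sample header are made up for this sketch; the HTTP_ prefix is the CGI convention, and a value starting with "() {" is what unpatched versions of bash treated as an exported function definition.

```python
def cgi_environ(headers):
    """Map HTTP request headers to CGI-style environment variables."""
    env = {}
    for name, value in headers.items():
        key = "HTTP_" + name.upper().replace("-", "_")
        env[key] = value  # copied verbatim: the client controls this string
    return env


def looks_like_exported_function(value):
    """True if an unpatched bash would have parsed this value as a function."""
    return value.startswith("() {")


# Hypothetical request: the client chooses the User-Agent string freely.
env = cgi_environ({"User-Agent": "() { :; }; echo surprise"})
suspicious = {k: v for k, v in env.items() if looks_like_exported_function(v)}
print(suspicious)
```

Nothing here runs a shell. The sketch only shows that the string a remote client sends ends up, unchanged, in the environment that a CGI script (and any bash it starts) inherits.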

There are lots of little pieces to reason about when considering the ShellShock bug because bash is big. And that’s after we know about the bug. What about all those features you don’t use and don’t even know about? Have you read and understood the bash man page? Well, those features you’ve never heard of are probably about as secure as the feature that exports functions to the environment, a feature that few people know about, and fewer people use (and in my opinion, no one should use).

The most important thing about security is attitude. It’s okay to have the attitude that a shell should have lots of useful interactive features; it’s arguable that a shell should have a rich programming environment that includes arrays and hash tables.

It’s not okay to argue that this piece of bloatware should be installed as the standard system shell.

Coding is like Cooking

2014-03-14

Coding is like cooking.

Well, not really. But a bit. This is not an article about how recipes are like programs, it’s about the role that cooking has in our personal lives and in society.

I can cook. A bit. Well enough that I can cook for my household, and friends that might drop by. I don’t always eat frozen pizza. Day to day cooking I can mostly do without a written recipe (spag bol, salmon and broccoli, that kind of thing), but when we entertain I’ll generally use a recipe; we own a few too many cookbooks and I can find recipes online. Perhaps one or two dishes I can make well enough that they’re actually good. So I can cook, but not well enough that anyone would pay me to do it. And as for being a chef there are probably other skills that professional cooks have that are part of the job that are simply not on my radar. Planning a menu, choosing suppliers, managing a kitchen.

I’m not suggesting that because I can cook a bit that I’ll be a cook. But conversely just because I can’t be a chef, that doesn’t mean that cooking is pointless. I don’t cook because it’s useful to the economy or because it helps me get a better job. I don’t do it as a hobby. I cook because it’s useful to me personally. It’s a sort of basic life skill.

I imagine most people are like me in this regard. They can cook, something. The amount of cooking that people do might vary. Some people will do it as a hobby, cooking things for their friends every week. Some people will do it professionally and cook for hundreds or thousands of people.

I would like programming to be like this.

I think most people can program. A bit. Not to a professional level, not to a standard where they would be comfortable getting paid to do it. Some people might like programming a bit more and do it as a hobby. Again, doesn’t mean they would be paid to do it.

You don’t have to be a programmer to program. Just like you don’t have to be a cook to cook.

Tinkering

I have friends who I think of as cooks. Some of them do it as a hobby, some professionally. One thing I’ve noticed is that they enjoy cooking, and they like to tinker. They’ll try cooking something just for fun on a Saturday morning. The professionals will try some new technique or ingredient or idea because it will expand their power. I don’t do that. Not with cooking anyway. But I do with programming. I’ll write a browser-based Logo implementation to learn a bit about SVG and JavaScript. Or I’ll learn a new language because it seems fun and might stretch me intellectually.

Are there problems with this analogy? Yes there are:

  • Cooking is useful in itself. The result is usually a meal; you can eat it. This is not so clearly true of programming. There are useful programs to be written, but by and large the kinds of programs that non-professional coders write are not all that useful (yet; see below).
  • Terminology. Lots of people (professionals and hobbyists alike) seem to think that when you say “teach people to program” you mean “train people to be professional programmers”. We don’t. Or at least, I don’t. No more than “teaching people to cook” means “teaching people to be cooks”.
  • Access

    The access route to cooking is pretty straightforward. Most homes and (all?) schools have a hob or a cooker or a microwave. You can cook something with that. I don’t have to buy a stainless steel counter and professional range to cook. You shouldn’t need “pro-level” tools to program. You shouldn’t have to buy a specialist computer and learn how to use an editor like vi, emacs, or TextMate.

    So I think we have to make the access route to programming simpler.

    Tools like ScraperWiki’s Code in your Browser tool help (full disclosure: I work for the company that made this), as may things like CoffeePlay (note I said things like CoffeePlay, which is a whimsical tool created over a weekend, not a serious product). You can start programming using just your browser. There is a browser based version of MIT’s Scratch programming environment. The Raspberry Pi helps.

    At the moment, I think these things show possible futures. They are hints at how we can make coding more accessible. But I think it is possible to make the route to programming simpler. A child can take the first steps in cooking by mixing flour and butter, and putting scones in the oven. We need to make programming just as easy and accessible.

    We need to make it easier to do more things with code. What I mean here is more hackable things in the spirit of the maker community, hack spaces, and so on. It’s kind of neat that I can log into my Kobo ereader and modify the software; it’s a shame I can’t do that with my Kindle so easily. I really love my label printer, but I wish I could hack the firmware on it.

    I don’t think everyone should be a cook, but I think everyone should cook.

    Traditional approach to ATtiny programming

    2014-03-07

    I recently bought an Adafruit Gemma. It’s a little programming board that is slightly bigger than a 10p coin and it costs about GBP 7.

    It uses an ATtiny85 micro and is Arduino compatible, so the way you’re encouraged to program is to use the Arduino GUI tools and all that good stuff.

    By “traditional approach” I mean grumpy old man approach. I don’t like GUIs much. I can’t use vi and the rendering of the font in the editor is terrible. And the syntax highlighting burns my eyes.

    So you can just use the command line tools, right? Right. On Ubuntu you can apt-get install gcc-avr and then use the avr-gcc compiler to compile your C code. You’ll need avr-objcopy (from the binutils-avr package) to convert your .elf file into an Intel .hex file, and you’ll need avrdude (from the avrdude package) to flash the device. The gory details are captured in this Makefile, and I got those gory details by switching the Arduino GUI into its verbose mode and watching it compile my project.

    My first demo project is also an exercise in avoiding the Arduino libraries. Mostly because when I was working out how to use the command line tools, I didn’t want the hassle of dealing with multiple files and libraries and things.

    So this is also an example of how to program the ATtiny85 (and more or less any AVR type micro) without using heavyweight libraries. The Gemma has a built-in red LED on PB1. This was definitely one of the things that attracted me to the Gemma. I can program it to do something without needing to plug any extra hardware in. Specifically, I can flash the LED.

    Flashing an LED is a matter of using a GPIO pin and driving it high (on) and low (off). The assembler programmer would do that with the SBI (Set BIt) and CBI (Clear BIt) instructions. So I’m thinking “Can we have reasonable looking C code generate The Right Instruction?”.

    The C code to set a bit is generally of the form *p |= b where p is a pointer to some memory location and b is a number with a single bit set (a power of 2 in other words). Similarly the C code to clear a bit is *p &= ~b. As long as we give the compiler enough information, it should be able to compile the code *p |= b into an SBI instruction.

    And so it can. Through some fairly tedious but also fairly ordinary C macros, I can write BIT_SET(PORTB, 1) to set pin 1 of PORTB (PB1, the pin with the LED attached), and it gets converted into roughly: *(volatile uint8_t *)0x38 |= 2; which is basically saying modify memory location 0x38 by setting its bit 1. In a little oddity of the AVR architecture the SBI and CBI instructions operate on IO addresses which are at memory locations 0x20 onwards. The upshot of this is that memory location 0x38 is modified with an SBI 0x18 instruction (this mystified me for about 2 hours last night, and I realised what was wrong just as I was drifting off to sleep).

    Because in the code *(volatile uint8_t *)0x38 |= 2; both the location, 0x38, and the value, 2, are constant, the compiler has everything it needs to generate the right SBI instruction. And it does!
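The address arithmetic is easy to check in a few lines of Python. This is only a sketch of what the macro expansion and the compiler are doing; the names IO_OFFSET, bit_mask, and io_address are made up for illustration and do not come from the Makefile or from avr-libc.

```python
IO_OFFSET = 0x20  # IO registers appear in data memory from 0x20 onwards


def bit_mask(bit):
    """Value with a single bit set, as used with |= and &= ~."""
    return 1 << bit


def io_address(memory_address):
    """Translate a data-memory address into the IO address SBI/CBI use."""
    return memory_address - IO_OFFSET


PORTB_MEM = 0x38  # memory location of PORTB, from the expanded macro
print(hex(io_address(PORTB_MEM)), bit_mask(1))

# Emulating *p |= b and *p &= ~b on an 8-bit register value.
reg = 0x00
reg |= bit_mask(1)            # set bit 1
set_value = reg
reg &= ~bit_mask(1) & 0xFF    # clear bit 1, staying within 8 bits
```

Both the address and the mask are compile-time constants, which is exactly the information the compiler needs to pick a single-bit instruction.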

    drj$ avr-objdump -d main.elf 2>&1 | sed -n '/<main>:/,$p' | sed 9q
    00000040 <main>:
      40:	1f 93       	push	r17
      42:	b9 9a       	sbi	0x17, 1	; 23
      44:	c1 9a       	sbi	0x18, 1	; 24
      46:	88 ec       	ldi	r24, 0xC8	; 200
      48:	90 e0       	ldi	r25, 0x00	; 0
      4a:	f2 df       	rcall	.-28     	; 0x30 <delay>
      4c:	c1 98       	cbi	0x18, 1	; 24
      4e:	88 ec       	ldi	r24, 0xC8	; 200
    

    You can see at the beginning of the disassembly of main the SBI 0x17, 1 instruction which is as a result of the macro BIT_SET(DDRB, 1) (setting the pin to be a digital output). And you can see SBI 0x18, 1 to drive the pin high and light the LED and CBI 0x18, 1 to drive the pin low. The compiler has even subtracted 0x20 from the addresses.
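If you want to double-check the disassembler, the opcode bytes in the listing can be decoded by hand. The sketch below assumes the encodings given in the AVR instruction set manual (1001 1010 AAAA Abbb for SBI and 1001 1000 AAAA Abbb for CBI, stored little-endian); the function name decode is made up for this example.

```python
def decode(opcode_bytes):
    """Decode a 16-bit AVR SBI or CBI opcode given as little-endian bytes."""
    word = int.from_bytes(opcode_bytes, "little")
    kinds = {0x9A00: "sbi", 0x9800: "cbi"}
    kind = kinds.get(word & 0xFF00)
    if kind is None:
        raise ValueError("not an SBI/CBI opcode: %#06x" % word)
    io_addr = (word >> 3) & 0x1F   # five IO address bits
    bit = word & 0x07              # three bit-number bits
    return kind, io_addr, bit


# Byte pairs copied from the avr-objdump listing.
for raw in ("b9 9a", "c1 9a", "c1 98"):
    kind, io_addr, bit = decode(bytes.fromhex(raw))
    print(raw, "->", kind, hex(io_addr), bit, "memory", hex(io_addr + 0x20))
```

With only five address bits in the opcode, SBI and CBI can reach just the first 32 IO registers, which is why the instruction takes an IO address rather than a full memory address.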

    avr-gcc -Os FTW!

    classy enumerations

    2014-02-17

    An enumeration is a term that usually refers to a type consisting of a finite number of distinct members. The members themselves can be tested for equality, but usually their particular value is not important.

    Maybe you’re modelling a Sphex wasp and you have a state variable with values NOTNEARHOME, JUSTLEFTHOME. You could represent that with an enumeration.

    In C the enum keyword assigns (usually small) integers to constant identifiers. It is problematic, chiefly because the members of the enumeration are really just integers. After enum { Foo };, code like Foo / 5 is valid (note: valid, not sensible).

    In Python you could do essentially the same thing:

    NOTNEARHOME = 0
    JUSTLEFTHOME = 1
    
    if self.state == NOTNEARHOME:
        if 'spider' in self.inventory:
            # head towards home
            pass
        else:
            # look for juicy spiders
            pass
    

    You do see this style (ast.PyCF_ONLY_AST Note 1), but it has the same problems as enum in C. The values are just integers (so, for example, print self.state will print 0, or 1).

    You could use strings (like decimal.ROUND_HALF_EVEN):

    NOTNEARHOME = 'notnearhome'
    # and so on...
    

    That’s better because now I might have a clue if I print out a value and it’s 'notnearhome', but it’s only a little bit better, because you still might accidentally use the value inappropriately (opt = decimal.ROUND_HALF_EVEN; opt += 'foo').

    I have a proposal:

    Use a class!

    class NOTNEARHOME: pass         # Note 2
    class JUSTLEFTHOME: pass
    

    Let’s call this classy enumerations.

    Classy enumerations have the advantage that we don’t need to manually assign numbers or strings. Values like Mood.PUZZLED and Mood.CONFUSED (which are actually classes) will be unique, so can be tested using == or is correctly.

    With classy enumerations we get an error if we accidentally use them in an expression:

    >>> PUZZLED+1
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unsupported operand type(s) for +: 'classobj' and 'int'
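The transcript above is from Python 2. As a minimal sketch, the same idea in Python 3 looks like this. The names PUZZLED and CONFUSED follow the examples in the text; the exact wording of the TypeError differs between Python versions, so the code does not depend on it.

```python
class PUZZLED: pass
class CONFUSED: pass

mood = PUZZLED

# Each class is a distinct object, so identity and equality both work.
assert mood is PUZZLED
assert mood == PUZZLED
assert mood != CONFUSED

# Using a member in arithmetic raises TypeError instead of silently working.
try:
    mood + 1
except TypeError as err:
    arithmetic_error = err
else:
    arithmetic_error = None

print(mood.__name__, "->", type(arithmetic_error).__name__)
```

Printing a member gives the class name rather than a bare 0 or 1, which addresses the debugging complaint made about the integer version.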
    

    And to wrap up:

    class True: pass
    class False: pass
    

    This article was inspired by looking at some code by Peter Waller, who seems to have invented the idea of using classes for enumerations.

    Note 1

    Yes this value matches a value in the C header file. Maybe that has some merit, but it doesn’t make for a very Pythonic interface.

    Note 2

    The body of a class definition is just a block of code. When that body is just a simple statement, it can share the line with the class declaration. Hence, class NOTNEARHOME: pass is a compact complete class definition. If you’re in a mood for documentation, replace “pass” with a docstring.

    Explaining p += q in Python

    2013-10-29

    If you’re a Python programmer you should know about the augmented assignment statements:


    i += 1

    This adds one to i. There is a whole host of these augmented operators (-=, *=, /=, %= etc).

    Aside: I used to call these assignment operators which is the C terminology, but in Python assignment is a statement, not an expression (yay!): you can’t go while i -= 1 (and this is a Good Thing).

    An augmented assignment like i += 1 is often described as being the same as i = i + 1, except that i can be a complicated expression and it will not be evaluated twice.

    As Julian Todd pointed out to me (a couple of years ago now), this is not quite right when it comes to lists.

    Recall that when p and q are lists, p + q is a fresh list that is neither p nor q:


    >>> p = [1]
    >>> q = [2]
    >>> r = p + q
    >>> r
    [1, 2]
    >>> r is p
    False
    >>> r is q
    False

    So p += q should be the same as p = p + q, which creates a fresh list and assigns a reference to it to p, right?

    No. It’s a little bit tricky to convince yourself of this fact; you have to keep a second reference to the original p (called op below):


    >>> p = [1]
    >>> op = p
    >>> p += [2]
    >>> p
    [1, 2]
    >>> op
    [1, 2]
    >>> p is op
    True

    Here it is in pictures:
    [Figures: before.dot, fresh.dot, augment.dot]

    Because of this, it’s slightly harder to explain how the += assignment statement behaves. For numbers we can explain it by breaking it down into a + operator and an assignment, but for lists this explanation fails because it doesn’t explain how p (in our p += q example) retains its identity (the curious will have already found out that += is implemented by calling the __iadd__ method of p).
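The __iadd__ mechanism is easier to see with two toy classes. This is a sketch, and Bag and FrozenBag are hypothetical names invented for it: one class defines __iadd__ and so keeps its identity under +=, the other defines only __add__ and so gets rebound to a fresh object.

```python
class Bag:
    """Mutable container: += mutates in place via __iadd__."""
    def __init__(self, items):
        self.items = list(items)

    def __add__(self, other):
        return Bag(self.items + other.items)   # fresh object

    def __iadd__(self, other):
        self.items.extend(other.items)         # mutate self...
        return self                            # ...and hand self back


class FrozenBag:
    """No __iadd__: += falls back to __add__ and rebinds the name."""
    def __init__(self, items):
        self.items = tuple(items)

    def __add__(self, other):
        return FrozenBag(self.items + other.items)


p = Bag([1])
op = p
p += Bag([2])
bag_kept_identity = p is op

f = FrozenBag([1])
of = f
f += FrozenBag([2])
frozen_kept_identity = f is of

print(bag_kept_identity, frozen_kept_identity)
```

In both cases the statement still ends with an assignment to the name on the left; the difference is only in which object __iadd__ (or the __add__ fallback) hands back.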

    What about tuples?

    When p and q are tuples the behaviour of += is more like numbers than lists. A fresh tuple is created. It has to be, since you can’t mutate a tuple.

    This kind of issue, the difference between creating a fresh object and mutating an existing one, lies at the heart of understanding the P languages (perl, Python, PHP, Ruby, JavaScript).

    The keen may wish to fill in this table:

    p      q      p + q  p += q
    list   list   fresh  p mutated
    tuple  tuple  fresh  fresh
    list   tuple  ?      ?
    tuple  list   ?      ?
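If you would rather let Python fill in the table, a small experiment will do it. This is a sketch with made-up helper names (classify, attempt); it keeps a second reference to the original p, as in the op example earlier, and reports whether each operation produced a fresh object, mutated p, or raised TypeError.

```python
def classify(make_p, make_q):
    """Return descriptions of what p + q and p += q do for these types."""
    def attempt(operation):
        p, q = make_p(), make_q()
        original = p                      # second reference to the original p
        try:
            result = operation(p, q)
        except TypeError:
            return "TypeError"
        return "p mutated" if result is original else "fresh"

    def plus(p, q):
        return p + q

    def augmented(p, q):
        p += q
        return p

    return attempt(plus), attempt(augmented)


kinds = {"list": lambda: [1], "tuple": lambda: (2,)}
for p_name in kinds:
    for q_name in kinds:
        print(p_name, q_name, *classify(kinds[p_name], kinds[q_name]))
```

Running it prints one row per combination, in the same column order as the table.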

    On compiling 34 year old C code

    2013-09-01

    The 34 year old C code is the C code for ed and sed from Unix Version 7. I’ve been getting it to compile on a modern POSIXish system.

    Some changes had to be made. But not very many.

    The union hack

    sed uses a union to save space in a struct. Specifically, the reptr union has two sub structs that differ only in that one of them has a char *re1 field where the other has a union reptr *lb1. In the old days it was possible to access members of structs inside unions without having to name the intermediate struct. For example the code in the sed implementation uses rep->ad1 instead of rep->reptr1.ad1. That’s no longer possible (I’m pretty sure this shortcut was already out of fashion by the time K&R was published in 1978, but I don’t have a copy to hand).

    I first changed the union to a struct that had a union inside it only for the two members that differed:

    struct reptr {
    	char	*ad1;
    	char	*ad2;
    	union {
    		char	*re1;
    		struct reptr	*lb1;
    	} u;
    	char	*rhs;
    	FILE	*fcode;
    	char	command;
    	char	gfl;
    	char	pfl;
    	char	inar;
    	char	negfl;
    } ptrspace[PTRSIZE], *rep;
    

    That meant changing a few “union reptr” to “struct reptr”, but most of the member accesses would be unchanged. ->re1 had to be changed to ->u.re1, but that’s a simple search and replace.

    It wasn’t until I was explaining this ghastly use of union to Paul a day later that I realised the union is a completely unnecessary space-saving micro-optimisation. We can just have a plain struct where only one of the two fields re1 and lb1 is ever used. That’s much nicer, and so is the code.
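How much space does the union actually save? One rough way to find out without a C compiler is to model both layouts with Python’s ctypes. This is a sketch under assumptions: the class names are made up, c_void_p stands in for both FILE * and struct reptr *, and the sizes are for whatever machine runs the code, not for a PDP-11.

```python
import ctypes

COMMON_BEFORE = [("ad1", ctypes.c_char_p), ("ad2", ctypes.c_char_p)]
COMMON_AFTER = [
    ("rhs", ctypes.c_char_p),
    ("fcode", ctypes.c_void_p),    # stands in for FILE *
    ("command", ctypes.c_char),
    ("gfl", ctypes.c_char),
    ("pfl", ctypes.c_char),
    ("inar", ctypes.c_char),
    ("negfl", ctypes.c_char),
]


class ReOrLabel(ctypes.Union):
    # re1 and lb1 share the same storage
    _fields_ = [("re1", ctypes.c_char_p), ("lb1", ctypes.c_void_p)]


class ReptrWithUnion(ctypes.Structure):
    _fields_ = COMMON_BEFORE + [("u", ReOrLabel)] + COMMON_AFTER


class ReptrPlain(ctypes.Structure):
    # re1 and lb1 are separate fields; only one is ever used
    _fields_ = (COMMON_BEFORE
                + [("re1", ctypes.c_char_p), ("lb1", ctypes.c_void_p)]
                + COMMON_AFTER)


saving = ctypes.sizeof(ReptrPlain) - ctypes.sizeof(ReptrWithUnion)
print("bytes saved per struct by the union:", saving)
```

The saving is one pointer per element of ptrspace, which is the whole benefit bought by the extra complexity.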

    The rise of headers

    In K&R C if the function you were calling returned int then you didn’t need to declare it before calling it. Many functions that in modern times return void, used to return int (which is to say, they didn’t declare what they returned, so it defaulted to int, and if the function used plain return; then that was okay as long as the caller didn’t use the return value). exit() is such a function. sed calls it without declaring it first, and that generates a warning:

    sed0.c:48:3: warning: incompatible implicit declaration of built-in function ‘exit’ [enabled by default]
    

    I could declare exit() explicitly, but it seemed simpler to just add #include <stdlib.h>. And it is.

    ed declares some of the library functions it uses explicitly. Like malloc():

    char	*malloc();

    That’s a problem because the declaration of malloc() has changed. Now it’s void *malloc(size_t). This is a fatal violation of the C standard that the compiler is not obliged to warn me about, but thankfully it does.

    The modern fix is again to add header files. Amusingly, version 7 Unix didn’t even have a header file that declared malloc().

    When it comes to mktemp() (which is also declared explicitly rather than via a header file), ed has a problem:

    tfname = mktemp("/tmp/eXXXXX");
    

    2 problems in fact. One is that modern mktemp() expects its argument to have 6 “X”s, not 5. But the more serious problem is that the storage pointed to by the argument will be modified, and the string literal that is passed is not guaranteed to be placed in a modifiable data region. I’m surprised this worked in Version 7 Unix. These days not only is it not guaranteed to work, it doesn’t actually work. SEGFAULT because mktemp() tries to write into a read-only page.

    And the 3rd problem is of course that mktemp() is a problematic interface so the better mkstemp() interface made it into the POSIX standard and mktemp() did not.
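The difference between the two interfaces is easiest to see in a language that wraps them. The following is a Python analogue, not the change made to ed: tempfile.mkstemp creates and opens the file in one step, whereas a mktemp-style call only returns a name that something else could grab before you open it. The prefix "e" is borrowed from ed’s template purely for illustration.

```python
import os
import tempfile

# mkstemp returns an open file descriptor and the path it created, so there
# is no gap between choosing the name and opening the file.
fd, path = tempfile.mkstemp(prefix="e", dir=tempfile.gettempdir())
try:
    os.write(fd, b"scratch data\n")
    created = os.path.exists(path)
finally:
    os.close(fd)
    os.remove(path)

print(created)
```

Python’s own tempfile.mktemp is deprecated for the same reason.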

    Which brings me to…

    Unfashionable interfaces

    ed uses gtty() and stty() to get and set the terminal flags (to switch off ECHO while the encryption key is read from the terminal). Not only is gtty() unfashionable in modern Unixes (I replaced it with tcgetattr()), it was unfashionable in Version 7 Unix. It’s not documented in the man pages; instead, tty(4) documents the use of the TIOCGETP command to ioctl().

    ed is already using a legacy interface in 1979.

    CoffeeScript is 4 times shorter than C

    2013-08-30

    Compare FreeBSD’s sed.c (implemented in C) with drj’s sed.js (implemented in CoffeeScript).

    A size comparison

                  lines bytes
    C             2471  62k
    CoffeeScript  533   14k
    

    That right there is CoffeeScript’s chief advantage over C. The CoffeeScript is 4 times smaller, meaning CoffeeScript programs are likely to be faster to write, easier to maintain, and contain fewer bugs.

    Is it an apples to apples comparison? The FreeBSD sed implements a couple more options than my mostly POSIX compliant sed.js but they don’t pad it out much. Neither implementation includes code to handle Regular Expressions. sed.c uses the Unix library and sed.coffee uses JavaScript’s built-in RegExp class. So I think it’s a pretty reasonable comparison.

    What makes the C code bigger? There is a lot of memory management, and an awful lot of string copying and substring extraction. There is an implementation of a hash table (for labels).

    Obviously we all knew that high level languages resulted in smaller programs, but it’s rare to get the opportunity to do such a direct comparison.

    sed, POSIX, and Node.js

    2013-07-11

    I’ve been implementing sed. A POSIX compatible sed in Node.js.

    It just seemed to me that one day soon the world will need a suite of Unix utilities written in Node.js. And I shall be The One.

    The experience has made me a bit sad about the POSIX spec. There are problems. For example, it’s not very good at documenting the actual or desired behaviour of classic Unix utilities:

    sed has a D command.

    This command deletes the initial portion of the pattern space up to the first newline (which may be the entire pattern space if a newline has not been introduced with an editing command or with N); then D begins a new cycle. At the start of this new cycle, the next line of input is loaded into the pattern space, but ONLY IF THE PATTERN SPACE IS EMPTY.

    This last bit is missing from the 2004 edition of the POSIX spec. It’s fixed and documented correctly in the 2013 edition of the POSIX spec.

    The behaviour of sed hasn’t changed since Version 7 in 1979. The D command has always skipped appending input (if there was anything left in the pattern space). Probably no sed ever had its D command behave in the way documented in the 2004 POSIX spec. Maybe if someone was to try building a version of sed from scratch using the 2004 POSIX spec and without reference to any other sed implementations. But who would be mad enough to do that?
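The two readings of D differ by a single boolean. Here is a minimal Python sketch of just that decision, following the description above rather than the spec text; the function names are made up, and this is not code from sed.js.

```python
def big_d(pattern_space):
    """Sketch of sed's D: return (new_pattern_space, load_next_line)."""
    _, _, rest = pattern_space.partition("\n")
    # As described above, input is loaded only if nothing is left over.
    return rest, rest == ""


def big_d_as_misdocumented(pattern_space):
    """The reading attributed to the 2004 wording: always load input."""
    _, _, rest = pattern_space.partition("\n")
    return rest, True


print(big_d("one\ntwo"))
print(big_d("one"))
print(big_d_as_misdocumented("one\ntwo"))
```

When the pattern space holds more than one line, the first version starts the next cycle with the leftover text and reads nothing; the second would load another line of input on top of it.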

    At some point someone drafting the POSIX spec didn’t notice the actual behaviour of sed, made a mistake in documenting the behaviour of its D command, and no one noticed until 2013 (well, a few years before, presumably). Which brings me to…

    The pace of change is glacial.

    Another thing about the POSIX spec which saddened me a little was the way all sorts of bizarre, obscure, and not very useful behaviours get documented and locked-in. You knew that sed has a ! modifier that negates an address. So «sed -n /barf/!p» prints all the lines that do NOT match /barf/. Did you know you can have as many ! as you like? «sed -n /barf/!!!!p» has the same behaviour as the previous program. At least according to the spec, and the Version 7 sed that I tried. There’s no point to this. No real program relies on this behaviour, yet there it is in the spec, so you have to implement it (if you want to comply with the spec). GNU sed (popular on Linux) gives an error instead. Which brings me to…

    You can’t really rely on what you read in the spec being implemented.

    or…

    GNU feel free to depart from the spec whenever they see fit to do so.
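For the record, implementing the repeated ! behaviour is not hard, which makes it odder that implementations disagree. A minimal Python sketch, assuming the reading given earlier (any number of ! negates the address once): the function names are hypothetical, only the p command is handled, and Python’s re module stands in for POSIX Basic Regular Expressions.

```python
import re


def parse_negation(command_text):
    """Split leading '!' characters off a command; any number negates once."""
    bangs, rest = re.match(r"(!*)(.*)", command_text, re.S).groups()
    return bool(bangs), rest


def select(lines, address_regex, command_text):
    """Return the lines that a p command would print under sed -n."""
    negated, command = parse_negation(command_text)
    if command != "p":
        raise ValueError("this sketch only handles the p command")
    pattern = re.compile(address_regex)
    return [line for line in lines if bool(pattern.search(line)) != negated]


lines = ["barf", "ok", "barfly", "fine"]
print(select(lines, "barf", "!p"))
print(select(lines, "barf", "!!!!p"))
```

Both calls select the same lines, matching the behaviour described for the spec and for Version 7 sed.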

    sed is a bit weak. For example, its regular expressions (POSIX Basic Regular Expressions) don’t even support «|» for alternation. POSIX has Extended Regular Expressions. Wouldn’t it be sensible to move towards adding Extended Regular Expressions to all the tools that only had Basic Regular Expressions? Well, maybe yes, but there seems to be no taste for doing that in a POSIX committee. And remember…

    The pace of change is glacial.

    The 64-bit revolution

    2012-02-03

    Basically, wtf happened here? It’s 2012 and I’m chastised (a bit) on Twitter for installing 64-bit Ubuntu on my 64-bit laptop.

    It’s 1993 and as a graduate student in the Computer Lab I get an account on the shared Unix server. A DEC Alpha, running OSF/1. The Alpha was a true 64-bit chip, no 32-bit heritage in its history, no 32-bit mode, and basically no 32-bit arithmetic instructions. Everything was 64-bit, no hostages. As a C programmer this is quite funny. int was still 32-bit (with the usual compiler flags). It was clear that the Alpha with its elegant, very RISC, instruction set was an example of a new breed of processor architecture.

    In 1994 I started my first job and in the next few years I use and program a variety of 64-bit architectures. The Alpha, UltraSPARC, 64-bit MIPS; I read up on various others (HP PA-RISC 2.0, POWER/PowerPC), because it was clear that the future was one where everyone had a 64-bit RISC workstation on their desk. Unix was clearly the only operating system with a chance of running on 64-bit architectures; sure, I saw, and used briefly, Windows NT running on Alpha, but it was a sorry effort. A 32-bit port of NT onto a 64-bit architecture. Microsoft/Intel were barely making it on 32-bit architectures, never mind 64-bit.

    When Intel’s IA-64 architecture came out, I read the manuals. It was clear that: a) Intel hadn’t been paying attention; and, b) no-one would be able to write a decent C compiler for this architecture.

    What happens next? I really don’t know. Sun got sucked into Java and took their eye off the ball, every other manufacturer canned their 64-bit line or sold to someone who didn’t care, Intel’s efforts to actually implement their IA-64 architecture (Itanium) were indeed a massive failure, and Intel had to piss away the R&D for 2 or 3 entire generations of CPU design. The amazing thing is that even though Intel basically failed to innovate for like a decade (between Pentium Pro and Core) everyone worshipped them and gave Intel enough time (aka money) to steal their new 64-bit strategy from their competitor AMD. They had the DEC designers, and the fabs, and they eventually produced Core, which was actually pretty good from a hardware perspective. But still based on the awful Intel software architecture.

    Which is where we are now.

    Does anyone have any idea why the very sensible and nice 64-bit architectures (SPARC, Alpha, MIPS, PowerPC, not PA-RISC) all failed basically together?

    My Favourite Bugs, Part 2

    2011-05-12

    Another one.

    Two of us were working at the client’s site on the software for what was essentially a compact static robot that was responsible for moving small items from a hopper to a delivery point. There was approximately one instance of the hardware that we were writing software for, and it sat on a bench between our two workstations. Most of the mechanical bits were in a more or less finished state. I was particularly impressed with the piston that generated partial vacuum so that items could be picked with a moving arm with suction cups on. Just one or two gears were made of prototyping plastic; and because of a gearing problem the belt didn’t move at the speed that it said it should in the spec. But you know, typical prototype hardware. The electronics were a mixture of off-the-shelf dev kits for 8-bit embedded micros, mini custom circuit boards for novel sensors, and lovingly hand soldered discrete parts. Add to that the fact that as a software guy I didn’t really understand the importance of grounding, and La Machine wasn’t always completely reliable (my colleague had just lent me Tracy Kidder’s Soul of a New Machine).

    Various optical/IR sensors kept track of the items as they moved inside the internals of the machine, various other sensors kept track of motor positions and/or speeds. There was a slightly hairy state machine (documented using OmniGraffle) to keep track of it all. The target pick rate was 5 items per second, and as it took more than 200ms for an item to go from the hopper to the point where it left the machine, there could be several items “in-flight” at any one time (and of course, picking an item was never completely reliable, so the sensors were used to track the items, and determine if a retry was required).

    This day it seemed to be working fine, except for some reason the software was reporting that items were failing to be delivered, when in fact I could plainly see that items were popping out of the top of the machine. This was causing the machine to prematurely stop, as it would, sensibly, stop picking items if it thought that a picked item was stuck inside the machine somewhere. Up to this point it had basically been working fine; it had been working the same morning. I was sat thinking about this and investigating somewhat. I’d even checked the last thing I’d changed. So I called my colleague over (just at the next desk) and he came over to look while I demonstrated the problem. It worked. There was no problem. Flakies (there’s a memorable part of Kidder’s book where one of the engineers in helping a more junior engineer sort out a problem with a wire-wrap memory board, grabs the whole frame and shakes it, claiming that it’s probably just flakies; of course it works after that).

    So that’s okay then. But when I try it again, it’s not working. Colleague comes over. Works. I work at it on my own. Doesn’t work.

    It turned out that it was quite a sunny day. The sun reached the window in the afternoon. Sunlight was falling on the machine near the optical sensor and presumably bouncing around enough to prevent the sensor from registering the occlusion as the item went past. When my colleague came over, he was standing over the machine and his shadow blocked the sunlight. This wouldn’t be a problem in the production hardware, as it was all in a box (and hence, dark inside).

    I constructed an optical shield and installed it on La Machine (a post-it note stuck to the side).

    This was not a problem that could be fixed using print. We did have print via the serial port, but printing more than one character was hazardous because it meant that the time taken to transmit characters on the serial line would interfere with the realtime operation of the rest of the software (when items are being picked every 200ms, every millisecond counts).
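To put rough numbers on that, here is a back-of-envelope Python sketch. The baud rates and the sample message are assumptions chosen for illustration (the actual line speed is not recorded here), and it assumes the firmware waits while each character is sent, with 10 bits on the wire per character (start bit, 8 data bits, stop bit).

```python
def char_time_ms(baud, bits_per_char=10):
    """Milliseconds needed to transmit one character on a serial line."""
    return 1000.0 * bits_per_char / baud


PICK_PERIOD_MS = 200            # from the text: one item every 200ms
MESSAGE = "item 3 lost\n"       # hypothetical debug message

for baud in (9600, 115200):     # assumed rates, not the real configuration
    per_char = char_time_ms(baud)
    whole_message = per_char * len(MESSAGE)
    share = 100.0 * whole_message / PICK_PERIOD_MS
    print(baud, round(per_char, 3), round(whole_message, 2), round(share, 1))
```

The columns are baud rate, milliseconds per character, milliseconds for the whole message, and the percentage of one pick period that the message would occupy.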
