Using the RFC 2397 “data” URL scheme to micro-optimise small images


Curly Logo has a text area with a transparent background (maybe you haven’t noticed, but you can move the turtle “underneath” the purple text area and it is still visible, try «bk 333»). Support for colours with alpha channels (a CSS3 feature) was limited when I tried, so I ended up implementing this using a transparent 1×1 PNG which is repeated across the background.

That PNG file is 95 octets. That’s no big deal, but the HTTP 1.1 headers that are transmitted before the file are about 400 octets. I’m paying to transmit the headers to you, and it takes time. Transmitting the headers takes about 4 times as long as transmitting the file. Time is Money. [edit: 2007-11-16 I do pay to transmit headers, I was wrong when I said I didn't.]

If I could somehow bundle the PNG file inside the only file that uses it (it’s used in a CSS background-image property in an XHTML file) then you could avoid downloading the extra header. Win! This would be a win even if the bundled PNG was slightly larger. Even if the overall transmitted octet count was a bit higher it would probably be a win in elapsed time because we avoid having to do another HTTP round trip for the extra file (and on some browsers that might mean another TCP/IP connexion, so we save all that too). It turns out we can bundle the PNG file inside the CSS.

We use the apparently obscure “data” URL scheme from RFC 2397. It works by having URLs like this:


(this example is actually a graphic from my earlier article on anti-aliasing, paste it into the location field of a browser to try it out)

The media type (“image/png”) is optional; if you omit it the default is “text/plain;charset=US-ASCII”, so you probably need to specify it for almost any practical application.

“;base64” is also optional but if you don’t use it then you need to use the standard %xx URL encoding for any octets outside the range of safe URL characters. For binary data it’s probably saner to use “;base64”. Conceivably there might be binary files for which it was shorter to not use “;base64”.

The comma, “,”, is not optional.
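The pieces described above (optional media type, optional “;base64”, mandatory comma) can be sketched in a few lines of Python. This is only an illustration, not the script I actually use: the function name `data_url` and the sample payloads are made up for the example.

```python
import base64
from urllib.parse import quote


def data_url(payload: bytes, mimetype: str = "", use_base64: bool = True) -> str:
    """Build an RFC 2397 URL of the form data:[<mediatype>][;base64],<data>."""
    if use_base64:
        # b64encode keeps the trailing "=" padding, which must not be stripped.
        body = base64.b64encode(payload).decode("ascii")
        return f"data:{mimetype};base64,{body}"
    # Without ";base64", every octet that is not a safe URL character
    # is written as a %xx escape.
    return f"data:{mimetype},{quote(payload, safe='')}"


# Hypothetical payloads, chosen only to show the two encodings side by side.
print(data_url(b"Hi!", "text/plain"))
print(data_url(b"Hi!", "text/plain", use_base64=False))
print(data_url(b"\x00\xff", "application/octet-stream", use_base64=False))
```

For mostly-ASCII data the %xx form can come out shorter, because it avoids both the “;base64” marker and the 4/3 expansion; for binary data nearly every octet triples in size, so Base64 wins.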

So my CSS changes from:

background-image: url(ts.png);



Once I’ve gzip’d everything (which I used to not do, but is a big win for XML and JavaScript) I end up with an extra 19 octets. Which I pay for to store and transmit. So I’m 19 octets worse off, but you guys lose an entire header so you’re well over 300 octets better off plus an entire round-trip. How good is that?
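For a feel of the sizes before gzip gets involved, here is a rough back-of-the-envelope calculation in Python. It uses only the figures quoted in this article (a 95 octet PNG, roughly 400 octets of headers); the helper name `base64_length` is made up, and the 19 octet figure above is a measurement after gzip that this sketch does not try to reproduce.

```python
PNG_OCTETS = 95       # size of the 1x1 PNG, as quoted in the article
HEADER_OCTETS = 400   # approximate HTTP header size, as quoted in the article


def base64_length(n: int) -> int:
    """Characters needed to Base64-encode n octets, padding included."""
    return 4 * ((n + 2) // 3)


prefix = "data:image/png;base64,"
inline_chars = len(prefix) + base64_length(PNG_OCTETS)
extra_before_gzip = inline_chars - len("ts.png")

print(base64_length(PNG_OCTETS))  # 128
print(inline_chars)               # 150
print(extra_before_gzip)          # 144
```

So before compression the stylesheet grows by 144 characters, still far less than the roughly 400 octets of headers that no longer need to be sent.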

Naturally RFC 2397 is implemented in Safari (3.0.3), Firefox, and Opera.

Now looking at the Base64 encoded version of the 1×1 PNG I can see that the PNG file is mostly overhead. Maybe I can get rid of some of those obviously unused header fields or chunks? Maybe there is some other image file format that would have less overhead for very tiny images (must be able to store at least 1 pixel to 8-bit precision for each of 4 channels). It’s 1-pixel GIFs all over again. Sorry.
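To see where the overhead goes, here is a sketch that assembles a 1×1 RGBA PNG by hand from only the mandatory pieces: the signature and the IHDR, IDAT and IEND chunks. The function names and the colour are hypothetical (I am not claiming this is the colour Curly Logo uses), and the size is printed rather than assumed because it depends on what zlib emits for the pixel data.

```python
import struct
import zlib


def chunk(kind: bytes, data: bytes) -> bytes:
    """One PNG chunk: 4-byte length, 4-byte type, data, 4-byte CRC of type+data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def tiny_png(r: int, g: int, b: int, a: int) -> bytes:
    """Return a 1x1 truecolour-with-alpha PNG holding a single pixel."""
    signature = b"\x89PNG\r\n\x1a\n"
    # width 1, height 1, bit depth 8, colour type 6 (RGBA),
    # compression method 0, filter method 0, no interlace.
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 6, 0, 0, 0)
    # A single scanline: filter-type byte 0, then the four samples.
    scanline = bytes([0, r, g, b, a])
    idat = zlib.compress(scanline, 9)
    return signature + chunk(b"IHDR", ihdr) + chunk(b"IDAT", idat) + chunk(b"IEND", b"")


png = tiny_png(128, 0, 128, 128)  # hypothetical half-transparent purple
print(len(png), "octets, of which 5 are uncompressed pixel data")
```

Everything except the 5 octets of scanline data is framing: 8 octets of signature, 12 octets of overhead per chunk, 13 octets of IHDR fields, and the zlib wrapper around the pixel.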

Appendix – The Script

Happily uuencode turns out to support Base64 (on OS X and Single Unix Specification).

(includes bugfix!)

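#!/bin/sh
# $Id: //depot/prj/logoscript/master/code/dataurl#1 $
# Convert anything to a data URL.
# See
# Base64 is always used.
# dataurl [filename [mimetype]]

# Use the mimetype argument when given (assumed; this line is missing from the listing).
m=$2
# With a filename but no mimetype, guess the mimetype from the extension.
if test "$1" != "" && test "$2" = ""
then
  case "x$1" in
  *.png) m=image/png;;
  *.gif) m=image/gif;;
  esac
fi

# With no filename, encode standard input.
if test "$1" = ""
then
  uuencode -m foo
else
  uuencode -m "$1" foo
fi |
   # Drop uuencode's first and last lines, keep the padding, join the rest into one line.
   { echo data:"${m};base64," ; sed '1d;$d;' ; } | tr -d '\n'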

15 Responses to “Using the RFC 2397 “data” URL scheme to micro-optimise small images”

  1. glorkspangle Says:

    This is cool, way cooler than Logo arity.

  2. glorkspangle Says:

    How come you don’t pay for headers?

  3. drj11 Says:

    Don’t really know how come I don’t pay for headers. In a way my provider chooses the headers (by their choice of webserver) so it would be cruel to make me pay for headers that they make me send. If I was paying for the headers then I’m sure I could find a way to send less than 400 octets per header. On the other hand, I can influence the headers (I think I can add headers for example), and they certainly cost my provider something to send.

    Presumably a suitably abusive server script could put all the content in the headers and then have a suitable AJAX application decode it all on the client side, and thereby avoid paying transmission costs.

  4. Gareth Rees Says:

    I think the smallest possible PNG encoding of a 1×1 truecolour image would be 69 bytes long. That breaks down as follows:

    PNG signature: 8 bytes
    IHDR chunk: 25 bytes (12 overhead + 13 data)
    IDAT chunk: 24 bytes (12 overhead + 12 data)
    IEND chunk: 12 bytes (12 overhead + 0 data)

    Each chunk has 12 bytes of overhead (4 bytes length; 4 bytes chunk type; 4 bytes CRC). The IHDR and IEND chunks are fixed in size. The IDAT chunk has 4 bytes of data (the RGBA samples), which expand to 12 bytes when compressed with deflate.

    I tried this just now with GraphicConverter, and it encodes a 1×1 image in 75 bytes. I’m not sure why this is — in particular, when I decompress the IDAT chunk the result is only 4 bytes long, so for some reason there’s 6 bytes of wastage in GraphicConverter’s implementation of zlib compression.

    Anyway, I was able to make a 69-byte transparent 1×1 PNG, which you can see in all its glory here. (I made it “by hand”, using Python to do the encoding: struct.pack, zlib.compress, and zlib.crc32 were useful routines.)

  5. Gareth Rees Says:

    Apparently a 1×1 transparent GIF is only 43 bytes.

  6. glorkspangle Says:

    Please provide a list of Curly Logo functions. Joe was playing with it this evening and had a number of questions (e.g. about colour).
    Having a “help” command produce a list of functions would seem fairly easy and cool.
    I know that I could probably do this with javascript introspection.

  7. drj11 Says:

    @Gareth: Good work on the PNG stuff, that would’ve taken me ages.

    A transparent GIF has alpha = 0 though does it not?

  8. Gareth Rees Says:

    Yes, in GIF you get to optionally pick one colour in the palette and make that fully transparent. So you only have 1 bit of alpha.

    (It now occurs to me that I could have embedded the 1×1 PNG in my comment rather than uploading it to my website, like this.)

  9. Gareth Rees Says:

    Hmmm, that didn’t work, because WordPress mangled the link, removing the initial “data:”.

  10. drj11 Says:

    @glorkspangle: Done. “opps” outputs the names of all procedures (traditional name). Hover over a name for tooltip help. Sorry about the formatting.

  11. Gareth Rees Says:

    I was wrong in what I wrote above. The IDAT chunk needs five bytes (I forgot that you have to include a filter type on each scanline). Which compresses to 13 bytes. So a transparent PNG must be at least 70 bytes.

    The 69-byte file I created was a valid PNG but it was opaque; I’ve put the 70-byte file here.

  12. drj11 Says:

    @Gareth: I think wordpress mangles data URLs because they could encode JavaScript (it could look at the mime-type but it doesn’t seem to). That would be bad because it would mean you could create a link which when I clicked on it executed some JavaScript that posted my wordpress cookies publicly, and my cookies might contain private information (like passwords or something equivalent). I think this is what a Cross Site Scripting attack is.

    I need more than 1-bit of alpha so GIF no good.

  13. Gareth Rees Says:

    Further surprising developments at my blog.

  14. drj11 Says:

    Turns out I do pay for headers. I guess that means I’ll be stitching my .js files into my .html files.

  15. drj11 Says:

    It turns out that my script to convert a file to a data URL was deleting the trailing padding and that violated section 2.2 of RFC 3548. More importantly Safari would barf on the resulting URLs. Fixed.
