JavaScript: using numbers as table keys considered harmful

2008-07-11

You may or may not know, but in JavaScript all tables (associative arrays) are indexed by strings, and nothing else. So what happens when you go a[0] = thing, isn’t that indexing the table a with the number 0? Well, yes and no. The number gets converted to a string, so a[0] is equivalent to a['0']. In fact all values, of any type, get converted to a string when used as a table key, but I’m only concerned about numbers here (see ECMA 262 section 11.2.1).

Note that this is only how the language is specified, not implemented. Given the importance of arrays in JavaScript, routinely indexed by integers, I would expect that special representations in the implementation would allow efficient indexing by integers.

So with (small) integer keys there are few possible problems. 113 gets converted to ‘113’ without ambiguity, so a[113] and a['113'] are equivalent. What about other number values, like 1.5 or 98.4 (Aside: when I say “98.4” what I mean is the double precision floating point number that is closest to 98.4, this will be a little bit different from exactly 98.4 (see my article on printing floating point numbers))? It would be embarrassing if a[98.4] were equivalent to a['98.400000000000006']. It would be even more embarrassing if the equivalence was allowed to vary between implementations. Well, it’s not. The JavaScript programmer has every right to expect the body of the if in this code to be executed:

a={'98.4':true}
if(a[98.4]) { ... }

When converting numbers to strings JavaScript requires that the parsimonious printing rule is applied. Roughly speaking, a number gets converted to a decimal string that is closer to the source number than any other (floating point) number, and of all such possible decimal strings the shortest one is chosen. The “shortest one” requirement is what gives us ‘98.4’ instead of ‘98.400000000000006’.

However, there is a problem. Sometimes there is more than one possible shortest string. An obvious example is the smallest positive double precision float: 5e-324 (see the smallest number). This number can be represented in decimal as 3e-324, 4e-324, …, 7e-324. These are representations of the floating point number in the sense that the nearest floating point to each of these decimals is in each case the same floating point number, the smallest positive double precision float. There are other numbers. nextafter(1, 2) is one such number: 1.0000000000000002. This number can equally well be represented by the decimal 1.0000000000000003.

So the problem is that when there is more than one possible shortest decimal string the implementation is free to choose amongst them (so my “no variance between implementations”, above, was a small lie). In principle a JavaScript implementation could make a different choice each time it performed the conversion. So a[5e-324] could index up to 5 different keys in the table a. It would of course be insanity for an implementation to do that, but it’s still a loophole in the specification, and it should be closed.

In the 3rd edition of ECMA 262 there’s a note about this at the end of section 9.8.1; the note is “not part of the normative requirements”. The note specifies additional requirements on the choice of the decimal string (namely “nearest”, and, if still ambiguous, “even”) that ensure a unique conversion is performed. So the text that specifies the requirements we need is already written, it should be a simple matter to make this note normative in the next revision, and this should be done.

I thought Java got this right, but on close inspection I see that it also permits the same variation when there is more than one shortest decimal string. In fact, for the smallest positive double precision float 2 decimal digits are required by Java (because a decimal point is required), so permitting many more options for 5e-324. In this case I wouldn’t even like to say whether “4.9E-324″ or “5.0E-324″ should (in the moral sense) be returned. The former has the virtue of being closest to the floating point number, the latter has the virtue of not misleading about available precision.

About these ads

4 Responses to “JavaScript: using numbers as table keys considered harmful”

  1. TNO Says:

    This is good stuff, you should report this to the ES4 mailing list. Maybe they’ll get it fixed in time for the release of ES3.1

    https://mail.mozilla.org/pipermail/es4-discuss/2008-July/date.html

  2. drj11 Says:

    What, you’re suggesting the ES3.1 and ES4 working groups don’t avidly read my blog?

    I have sent an e-mail to the list.

  3. TNO Says:

    FYI, there’s alot of noise on this topic now that might interest you

    Follow the thread: “Thoughts on IEEE P754″

    https://mail.mozilla.org/pipermail/es-discuss/2008-August/subject.html#start

  4. drj11 Says:

    Yeah. I’m subscribed so I have the joy of following along in my inbox.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: