Python: Iterating and Mutating Dictionary

2009-04-02

Whilst browsing Ciaran’s blog I was thinking about the following:

Clearly this Python code is fine:

for k,v in adict.items():
    if f(v):
        del adict[k]

Why wouldn’t it be? Well, it deletes things from the dictionary, adict, as it is iterating over the dictionary. Which should always start the alarm bells ringing. (if you haven’t guessed, f is just some test function that tells us whether we want to delete the dictionary item, based on its value. For example, it could be: def f(x): return len(x) > 1)

This code is fine because it uses items, and items returns a copy of the dictionary’s items, so it doesn’t matter what you do to the dictionary, the iteration will still proceed as intended.

So what about iteritems?:

for k,v in adict.iteritems():
    if f(v):
        del adict[k]

No idea, documentation horribly silent on this point. Ah, but trying it (well, a simpler case):

>>> for k,v in d.iteritems():
...   del d[k]
... 
Traceback (most recent call last):
  File "", line 1, in 
RuntimeError: dictionary changed size during iteration

We get a RuntimeError raised. Would be nice if the documentation for iteritems mentioned that.

So, interesting fact: You can’t always replace items with iteritems.

What about «for k in adict»? Well, for one thing, Python built in dictionaries are not even documented as supporting the __iter__ magic method that for uses in this case. Though of course they do, and lots of people use it all the time. For another thing, the __iter__ protocol documentation is also silent on how safe it is to mutate the underlying object. If you try it out, you get the same exception as when you use iteritems. Not safe to mutate underlying object.

Using «for k in adict.keys()» (which you also see quite a lot) is okay if you want to delete dictionary entries. keys returns a copy of the key list, so the iteration cannot be harmed.

So, interesting fact: You can’t always replace «for k in adict.keys()» with «for k in adict».

I wouldn’t rely on always getting an exception raised for unsafe practices either.

12 Responses to “Python: Iterating and Mutating Dictionary”


  1. I always considered this kind of thing rather unsatisfactory; iterating over some container and deleting things from it, or modifying it in more general ways, are perfectly natural things to want to do, and it’s a shame that predefined general-purpose containers don’t always support it well.

  2. Thomas Guest Says:

    I agree, it’s a bit subtle and the documentation doesn’t tell the whole story. Python 3.0 has no dict.iteritems(), and the dict “view” returned by dict.items() does seem to keep in step with the dict.

    > [View objects] provide a dynamic view on the dictionary’s entries, which means that when the dictionary changes, the view reflects these changes.

    You still can’t change the dict size as you iterate, though.

    I don’t particularly like the modify-dict-in-place code anyway, since you have to think twice about it. I’d prefer to just create a new dict. E.g.

    >>> adict = dict((k, v) for k, v in adict.items() if not f(v))

    Should work for items, iteritems, 2.6, 3.0 (unless f has some horrible side-effects).

  3. Tony Finch Says:

    Lua’s next() function is used for iterating over tables and it explicitly allows deleting keys during iteration – but not adding since that could cause a table resize and muck up the key order.
    (docs)

  4. lorg Says:

    Note that in Python 3 keys(), items() and values() return views. So I wouldn’t rely on it being possible to iterate on them and delete at the same time.

    An easy solution (but not the most efficient one) would be to use list(..) on any of these functions, and make sure you get a copy.

    Also, I remember that once I wanted to delete items from a list in a second pass, so I used reversed(enumerate(..)), and kept the indices to delete in a separate list to go over later.

  5. drj11 Says:

    I didn’t say so in the article, but I think I tend to agree with Richard Kettlewell. The desired semantics are quite easy to specify:

    1. Do not crash;
    2. Do not visit an item more than once;
    3. Do not miss an item unless I deleted it;
    4. Do not visit deleted items.

    That we rarely get these semantics is a case of premature optimisation. The semantics are quite easy to implement by having the iterator take a copy of the items and check before each iteration that the item in hand is still present. Anything else is optimisation gravy.

    I agree with Thomas Guest in that modify-in-place code looks tricky and leads to tricky reasoning, but I should still be allowed to do it. If I wanted my language to be my nanny I’d be programming in Pascal or Ada.


  6. interesting blog post, those corner-case/lack of doc are really ugly. (blog post : Bookmark’d and forwarded)

    The view approach seems a bit strange or at least incomplete. Why forbidding the delete op on the dict ?
    That seems the whole point.

  7. Nick Barnes Says:

    DRJ has the desired semantics right, but incomplete: they should specify whether newly-added items should be included. Either answer is fine, but it should be specified. Does this terminate?

    for k in dict: dict[(k,k)]=True

  8. Nick Barnes Says:

    (also updated items).

  9. Benjamin Peterson Says:

    I’ve updated docs with some of your suggestions. They should be live in a day or two.

  10. drj11 Says:

    @Benjamin: cool!

  11. Jay Says:

    I am trying to add an item dynamically into a dictionary while I am iterating thought that like below: I tried “dictionary.items()”. But that is excluding the newly added items. Can somebody let me know how can I work around this issue:

    Code:

    test_dict = dict();
    test_dict[1] = "abc"
    test_dict[2] = "cde"
    test_dict[3] = "efg"
    test_dict[4] = "hji"
    test_dict[5] = "jkl"
    test_dict[6] = "mno"
    test_dict[7] = "pqr"
    
    for num in test_dict.keys():
        print "the key is : " + str(num) 
        test_dict[8] = "mno"
        test_dict[9] = "xyz"
    
    print "----------------"
    print test_dict
    

    Results I got:

    the key is : 6
    the key is : 5
    the key is : 7
    the key is : 3
    the key is : 2
    the key is : 1
    the key is : 4
    —————-
    {6: ‘mno’, 5: ‘jkl’, 7: ‘pqr’, 8: ‘mno’, 3: ‘efg’, 2: ‘cde’, 9: ‘xyz’, 1: ‘abc’, 4: ‘hji’}

    Thanks,
    Jay

    [ed: 2010-03-05: I added a PRE tag to your code so that the indentation is visible]

  12. Matt L Says:

    I’d point out that it’s easy enough to iterate through a mutable data structure that you’re changing during iteration, Jay to answer your question:

    x = {‘a': 1, ‘b': 2, ‘c': 3, ‘d': 4, ‘e': 5}
    while len(x):
    key, value = x.popitem()
    print(‘The key is %s the value is %s’ % (key, value))
    if value > 3:
    x['z%s' % value] = 0

    returns:
    The key is a the value is 1
    The key is c the value is 3
    The key is b the value is 2
    The key is e the value is 5
    The key is d the value is 4
    The key is z4 the value is 0
    The key is z5 the value is 0

    Of course if you’d like to use the dictionary later on, you’d need to rebuild the dictionary during the loop as needed.


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: