Python, Erlang, Map/Reduce
defaultdict, as seen in the wild, is almost always just a “bag”: a set-like object that allows the same object to appear in it more than once.
I remember the Bag from my old days as a Smalltalk programmer (Digitalk under MS-DOS, man, I feel old), and I have always wondered why supposedly rich collection frameworks like the ones in Java and Python don’t have it. It is exactly what is needed for the reduce task here: counting.
Here I did what the original code was doing, and what both your code and Smalltalk use as the implementation of Bag: use a dictionary to store the counts.
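To make the dictionary-of-counts idea concrete, here is a minimal sketch of a bag built on defaultdict. The class name Bag and its methods add and count are hypothetical, picked for illustration only; this is not the code from the post, nor Smalltalk's actual implementation.

```python
from collections import defaultdict


class Bag(object):
    """Hypothetical minimal bag: a dictionary mapping each item to its count."""

    def __init__(self, items=()):
        # defaultdict(int) yields 0 for a missing key, so += 1 always works
        self._counts = defaultdict(int)
        for item in items:
            self.add(item)

    def add(self, item):
        self._counts[item] += 1

    def count(self, item):
        # .get avoids inserting a zero entry for items never added
        return self._counts.get(item, 0)

    def __len__(self):
        # total number of items, counting duplicates
        return sum(self._counts.values())


words = "to be or not to be".split()
bag = Bag(words)
```

With this sketch, bag.count("to") gives the number of times "to" was added, which is all the reduce step needs.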
I even hesitated to use defaultdict, to keep compatibility with Python versions before 2.5 (where defaultdict was introduced), but now that the major Linux distros are moving to 2.5 I guess it is mainstream enough.
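For comparison, the same counting can be done without defaultdict, which is roughly what staying compatible with older Python versions would look like. This is only a sketch using a plain dict and dict.get; the function name count_items is made up for the example.

```python
def count_items(items):
    """Hypothetical helper: count occurrences using a plain dict."""
    counts = {}
    for item in items:
        # dict.get supplies 0 the first time an item is seen
        counts[item] = counts.get(item, 0) + 1
    return counts


counts = count_items(["a", "b", "a"])
```

The only difference from the defaultdict version is that the default value is supplied at each lookup rather than once at construction.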
Posted by Santiago Gala