Python, Erlang, Map/Reduce
defaultdict, as seen in the wild, is almost always used as just a “bag”: a set-like object that can contain the same object multiple times. Using that structure would make the code a bit shorter. I threw together a little bag implementation: [link]
You should be able to do:
b = Bag(map_lines(file))
for count, page in b.mostcommon():
    print('%40s=%s' % (page, count))
This will probably slow the code down, but if Bag were written in C, like defaultdict, it wouldn’t.
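For illustration, here is a rough sketch of what such a bag might look like. It is not the linked implementation: the class body is hypothetical, and the mostcommon() name and its (count, item) pair order are assumptions taken from the usage above. The sample page names are made up.

```python
from collections import defaultdict


class Bag(object):
    """Hypothetical minimal bag (multiset): counts how often each item was added."""

    def __init__(self, iterable=()):
        # defaultdict(int) starts every unseen item at a count of 0.
        self._counts = defaultdict(int)
        for item in iterable:
            self.add(item)

    def add(self, item):
        self._counts[item] += 1

    def mostcommon(self):
        # Yield (count, item) pairs, highest count first.
        # The method name and pair order are assumed from the usage above.
        pairs = sorted(self._counts.items(), key=lambda kv: kv[1], reverse=True)
        for item, count in pairs:
            yield count, item


# Made-up sample data standing in for map_lines(file).
b = Bag(["/index", "/about", "/index"])
for count, page in b.mostcommon():
    print('%40s=%s' % (page, count))
```

A pure-Python class like this adds a method call per item on top of the dictionary update, which is where the slowdown would likely come from.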
Posted by Ian Bicking