New dict of top n values (and keys) from dictionary (Python)

February 15, 2014

I have a dictionary of names and the number of times each name appears in the phone book:

    names_dict = {
        'adam': 100,
        'anne': 400,
        'britney': 321,
        'george': 645,
        'joe': 200,
        'john': 1010,
        'mike': 500,
        'paul': 325,
        'sarah': 150
    }

Preferably without using sorted(), I want to iterate through the dictionary and create a new dictionary that has the top 5 names only:

    def sort_top_list():
        # create a dict of 5 names first
        new_dict = {}
        for i in list(names_dict.keys())[:5]:
            new_dict[i] = names_dict[i]

        # find the smallest current value in new_dict
        # and compare it to the others in names_dict
        # to find bigger ones; replace the smaller name in new_dict with the bigger name
        for k, v in names_dict.items():
            current_smallest = min(new_dict.values())
            if v > current_smallest:
                # found a bigger value; replace the smaller key/value in new_dict with the larger key/value
                new_dict[k] = v
                # ?? delete the old key/value pair from new_dict somehow

I seem to be able to create a new dictionary that gets a new key/value pair whenever I iterate through names_dict and find a name/count that is higher than what I have in new_dict. I can't figure out, though, how to remove the smaller ones from new_dict after I add the bigger ones from names_dict.

Is there a better way, without having to import special libraries or use sorted(), to iterate through a dict and create a new dict of the top N keys with the highest values?

You should use the heapq.nlargest() function to achieve this:

    import heapq
    from operator import itemgetter

    top_names = dict(heapq.nlargest(5, names_dict.items(), key=itemgetter(1)))

This uses a more efficient algorithm (O(n log k) for a dict of size n and k top items) to extract the top 5 items as (key, value) tuples, which are then passed to dict() to create a new dictionary.

Demo:

    >>> import heapq
    >>> from operator import itemgetter
    >>> names_dict = {'adam': 100, 'anne': 400, 'britney': 321, 'george': 645, 'joe': 200, 'john': 1010, 'mike': 500, 'paul': 325, 'sarah': 150}
    >>> dict(heapq.nlargest(5, names_dict.items(), key=itemgetter(1)))
    {'john': 1010, 'george': 645, 'mike': 500, 'anne': 400, 'paul': 325}
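To illustrate where the O(n log k) bound comes from, here is a minimal sketch of the bounded min-heap idea. It is not the actual source of heapq.nlargest(); the function name top_n_by_value is hypothetical and only for illustration. The heap never holds more than n entries, so each push or replace costs O(log n).

```python
import heapq

def top_n_by_value(mapping, n):
    """Return a dict of the n entries of `mapping` with the largest values.

    Hypothetical helper, written only to illustrate the bounded-heap idea.
    """
    heap = []  # min-heap of (value, key); heap[0] is the smallest kept entry
    for key, value in mapping.items():
        if len(heap) < n:
            # Still filling the heap up to n entries.
            heapq.heappush(heap, (value, key))
        elif heap and value > heap[0][0]:
            # New value beats the smallest kept one: replace it in one step.
            heapq.heapreplace(heap, (value, key))
    # The result is in heap order, not sorted by value.
    return {key: value for value, key in heap}

names_dict = {'adam': 100, 'anne': 400, 'britney': 321, 'george': 645,
              'joe': 200, 'john': 1010, 'mike': 500, 'paul': 325, 'sarah': 150}
top_names = top_n_by_value(names_dict, 5)
```

Unlike heapq.nlargest(), this sketch does not return the entries ordered from largest to smallest; it only guarantees that the right entries are kept.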

You may want to use the collections.Counter() class instead. The Counter.most_common() method would have made your use case trivial to solve. The implementation of that method uses heapq.nlargest() under the hood.
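As a short sketch of the Counter route, assuming the counts already live in a plain dict like names_dict above: Counter accepts an existing mapping, and most_common(5) returns the five (name, count) pairs with the highest counts, largest first.

```python
from collections import Counter

names_dict = {'adam': 100, 'anne': 400, 'britney': 321, 'george': 645,
              'joe': 200, 'john': 1010, 'mike': 500, 'paul': 325, 'sarah': 150}

# Wrap the existing counts in a Counter, then keep the 5 most common.
name_counts = Counter(names_dict)
top_names = dict(name_counts.most_common(5))
```

If the counts were built as a Counter from the start, the extra conversion step would not be needed.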

These are not special libraries; they are part of the Python standard library. Otherwise you would have to implement a binary heap yourself to achieve this. Unless you are studying this algorithm, there is little point in re-implementing your own, as the Python implementation is highly optimised (with an extension written in C for some critical functions).
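For completeness, here is one way the loop from the question could be finished without any imports or sorted(). This is only a sketch, not a recommendation: the function name top_n_without_imports is hypothetical, and because it rescans the kept entries for every item it runs in roughly O(n * k) time rather than O(n log k).

```python
def top_n_without_imports(counts, n):
    """Return a dict of the n entries of `counts` with the largest values.

    Hypothetical helper that completes the question's approach.
    """
    if n <= 0:
        return {}
    top = {}
    for name, count in counts.items():
        if len(top) < n:
            # Fill the result with the first n entries seen.
            top[name] = count
            continue
        # Find the key that currently holds the smallest kept value.
        smallest = min(top, key=top.get)
        if count > top[smallest]:
            del top[smallest]  # remove the old, smaller pair
            top[name] = count  # add the bigger one
    return top

names_dict = {'adam': 100, 'anne': 400, 'britney': 321, 'george': 645,
              'joe': 200, 'john': 1010, 'mike': 500, 'paul': 325, 'sarah': 150}
top_names = top_n_without_imports(names_dict, 5)
```

The del statement is the piece missing from the question's code: the key of the smallest entry has to be looked up, not just its value, so that the pair can be removed.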
