python - Add column with numbering of elements with respect to a groupby operation without loops -


I managed to add a column to a pandas DataFrame that holds an internal numbering of the rows with respect to their groups.

This is the input DataFrame:

    import pandas as pd

    df = pd.DataFrame({
        'name': ['name1', 'name2', 'name3', 'name4', 'name5', 'name6', 'name7', 'name8'],
        'group': ['groupb', 'groupb', 'groupb', 'groupa', 'groupa', 'groupa', 'groupc', 'groupc'],
        'revenue': [1, 2, 3, 4, 5, 6, 11, 22]})

It looks like this:

        group   name  revenue
    0  groupb  name1        1
    1  groupb  name2        2
    2  groupb  name3        3
    3  groupa  name4        4
    4  groupa  name5        5
    5  groupa  name6        6
    6  groupc  name7       11
    7  groupc  name8       22

This is the output I want:

        group   name  revenue  group_internal_id
    0  groupa  name4        4                  0
    1  groupa  name5        5                  1
    2  groupa  name6        6                  2
    3  groupb  name1        1                  0
    4  groupb  name2        2                  1
    5  groupb  name3        3                  2
    6  groupc  name7       11                  0
    7  groupc  name8       22                  1

I managed to get the output I wanted in the DataFrame outdf with the following code:

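    # Number the rows of one group 0, 1, 2, ...
    numbering_function = lambda x: range(len(x.index))

    outdf = pd.DataFrame()
    # groupby iterates over the groups in sorted key order
    for ik, idf in df.groupby('group'):
        tempdf = idf.copy()
        tempdf['group_internal_id'] = numbering_function(tempdf)
        # DataFrame.append was removed in pandas 2.0; there, use
        # pd.concat([outdf, tempdf], ignore_index=True) instead
        outdf = outdf.append(tempdf, ignore_index=True)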

Then outdf looks as follows:

        group   name  revenue  group_internal_id
    0  groupa  name4        4                  0
    1  groupa  name5        5                  1
    2  groupa  name6        6                  2
    3  groupb  name1        1                  0
    4  groupb  name2        2                  1
    5  groupb  name3        3                  2
    6  groupc  name7       11                  0
    7  groupc  name8       22                  1

I would like to find a way to obtain the same output DataFrame without using a loop.

Thanks!

You need cumcount, which numbers the rows within each group, and then sort_values:

    df['new'] = df.groupby('group').cumcount()
    df = df.sort_values('group')
    print (df)
        group   name  revenue  new
    3  groupa  name4        4    0
    4  groupa  name5        5    1
    5  groupa  name6        6    2
    0  groupb  name1        1    0
    1  groupb  name2        2    1
    2  groupb  name3        3    2
    6  groupc  name7       11    0
    7  groupc  name8       22    1
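
The result above keeps the original row labels (3, 4, 5, 0, ...) and names the new column new. As a minimal sketch of how the same two calls might be arranged to match the frame in the question more closely (a fresh 0..7 index and a column called group_internal_id), something like the following could work. The stable `kind='mergesort'` sort and the `reset_index(drop=True)` call are assumptions added here, not part of the answer.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['name1', 'name2', 'name3', 'name4', 'name5', 'name6', 'name7', 'name8'],
    'group': ['groupb', 'groupb', 'groupb', 'groupa', 'groupa', 'groupa', 'groupc', 'groupc'],
    'revenue': [1, 2, 3, 4, 5, 6, 11, 22]})

# A stable sort keeps the original row order inside each group;
# reset_index(drop=True) then relabels the rows 0..n-1.
outdf = df.sort_values('group', kind='mergesort').reset_index(drop=True)

# cumcount numbers the rows of each group 0, 1, 2, ... in order of appearance.
outdf['group_internal_id'] = outdf.groupby('group').cumcount()

print(outdf)
```

Sorting first and numbering afterwards gives the same ids as numbering first, because cumcount only depends on the order of rows within each group, and a stable sort does not change that order.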
