python - How do I select a range of rows between two values using pandas?
I want to slice the sample data at the bottom.
Each session (a session being the events from a login through the last action before the next login) looks like this:
login,4,2016-11-10 05:28:30.396,hbhimani,11/10/2016
getuserpreferences,179,2016-11-10 05:28:30.575,hbhimani,11/10/2016
getpreference,3,2016-11-10 05:28:55.686,hbhimani,11/10/2016
getpreference,4,2016-11-10 05:28:55.961,hbhimani,11/10/2016
constructfromsession,4,2016-11-10 05:28:56.108,hbhimani,11/10/2016
getuserpreferences,4,2016-11-10 05:28:56.112,hbhimani,11/10/2016
getuserpreferences,3,2016-11-10 05:28:56.116,hbhimani,11/10/2016
setbooleanpreference,4,2016-11-10 05:28:56.238,hbhimani,11/10/2016
setbooleanpreference,4,2016-11-10 05:28:56.513,hbhimani,11/10/2016
getquicksearchinitinfo,3,2016-11-10 05:28:58.936,hbhimani,11/10/2016
getquicksearchinitinfo2,4,2016-11-10 05:28:59.315,hbhimani,11/10/2016

I want to count the number of records and the occurrences of the getpreference action. The result should appear as one record that looks like this:
day,user,session_duration(min),getpreference_count,total_session_actions
11/10/2016,hbhimani, 180, 2, 11

My challenge occurs when I have more than one session. I don't know how to slice dynamically on the index.
Sample data:

action,duration,_time,user,day
login,4,2016-11-10 05:28:30.396,hbhimani,11/10/2016
getuserpreferences,179,2016-11-10 05:28:30.575,hbhimani,11/10/2016
getpreference,3,2016-11-10 05:28:55.686,hbhimani,11/10/2016
getpreference,4,2016-11-10 05:28:55.961,hbhimani,11/10/2016
constructfromsession,4,2016-11-10 05:28:56.108,hbhimani,11/10/2016
getuserpreferences,4,2016-11-10 05:28:56.112,hbhimani,11/10/2016
getuserpreferences,3,2016-11-10 05:28:56.116,hbhimani,11/10/2016
setbooleanpreference,4,2016-11-10 05:28:56.238,hbhimani,11/10/2016
setbooleanpreference,4,2016-11-10 05:28:56.513,hbhimani,11/10/2016
getquicksearchinitinfo,3,2016-11-10 05:28:58.936,hbhimani,11/10/2016
getquicksearchinitinfo2,4,2016-11-10 05:28:59.315,hbhimani,11/10/2016
login,3,2016-11-10 05:29:29.202,hbhimani,11/10/2016
getsummary,4042,2016-11-10 05:29:33.246,hbhimani,11/10/2016
getenclosures,457,2016-11-10 05:29:34.372,hbhimani,11/10/2016
getaudittrail,1061,2016-11-10 05:29:36.034,hbhimani,11/10/2016
getrelateddefects,5,2016-11-10 05:29:36.586,hbhimani,11/10/2016
getservicerequests,5,2016-11-10 05:29:36.864,hbhimani,11/10/2016
getforeignbugs,270,2016-11-10 05:29:37.408,hbhimani,11/10/2016
getenclosures,455,2016-11-10 05:29:50.087,hbhimani,11/10/2016
getsummary,5505,2016-11-10 05:32:26.584,hbhimani,11/10/2016
getenclosures,459,2016-11-10 05:32:27.940,hbhimani,11/10/2016
login,997,2016-11-10 05:32:29.480,anshanno,11/10/2016
getrelateddefects,5,2016-11-10 05:32:30.027,anshanno,11/10/2016
getservicerequests,5,2016-11-10 05:32:30.306,anshanno,11/10/2016
getforeignbugs,6,2016-11-10 05:32:30.585,anshanno,11/10/2016
IIUC, you can group the data as follows:
Original df:
In [62]: df
Out[62]:
                     action  duration                   _time      user         day
0                     login         4 2016-11-10 05:28:30.396  hbhimani  2016-11-10
1        getuserpreferences       179 2016-11-10 05:28:30.575  hbhimani  2016-11-10
2             getpreference         3 2016-11-10 05:28:55.686  hbhimani  2016-11-10
3             getpreference         4 2016-11-10 05:28:55.961  hbhimani  2016-11-10
4      constructfromsession         4 2016-11-10 05:28:56.108  hbhimani  2016-11-10
5        getuserpreferences         4 2016-11-10 05:28:56.112  hbhimani  2016-11-10
6        getuserpreferences         3 2016-11-10 05:28:56.116  hbhimani  2016-11-10
7      setbooleanpreference         4 2016-11-10 05:28:56.238  hbhimani  2016-11-10
8      setbooleanpreference         4 2016-11-10 05:28:56.513  hbhimani  2016-11-10
9    getquicksearchinitinfo         3 2016-11-10 05:28:58.936  hbhimani  2016-11-10
10  getquicksearchinitinfo2         4 2016-11-10 05:28:59.315  hbhimani  2016-11-10
11                    login         3 2016-11-10 05:29:29.202  hbhimani  2016-11-10
12               getsummary      4042 2016-11-10 05:29:33.246  hbhimani  2016-11-10
13            getenclosures       457 2016-11-10 05:29:34.372  hbhimani  2016-11-10
14            getaudittrail      1061 2016-11-10 05:29:36.034  hbhimani  2016-11-10
15        getrelateddefects         5 2016-11-10 05:29:36.586  hbhimani  2016-11-10
16       getservicerequests         5 2016-11-10 05:29:36.864  hbhimani  2016-11-10
17           getforeignbugs       270 2016-11-10 05:29:37.408  hbhimani  2016-11-10
18            getenclosures       455 2016-11-10 05:29:50.087  hbhimani  2016-11-10
19               getsummary      5505 2016-11-10 05:32:26.584  hbhimani  2016-11-10
20            getenclosures       459 2016-11-10 05:32:27.940  hbhimani  2016-11-10
21                    login       997 2016-11-10 05:32:29.480  anshanno  2016-11-10
22        getrelateddefects         5 2016-11-10 05:32:30.027  anshanno  2016-11-10
23       getservicerequests         5 2016-11-10 05:32:30.306  anshanno  2016-11-10
24           getforeignbugs         6 2016-11-10 05:32:30.585  anshanno  2016-11-10

Group it:
In [63]: grp = df.groupby(['user', df.action.eq('login').cumsum()])

Print the groups:
In [64]: for g, x in grp:
    ...:     print(x)
    ...:
                     action  duration                   _time      user         day
21                    login       997 2016-11-10 05:32:29.480  anshanno  2016-11-10
22        getrelateddefects         5 2016-11-10 05:32:30.027  anshanno  2016-11-10
23       getservicerequests         5 2016-11-10 05:32:30.306  anshanno  2016-11-10
24           getforeignbugs         6 2016-11-10 05:32:30.585  anshanno  2016-11-10
                     action  duration                   _time      user         day
0                     login         4 2016-11-10 05:28:30.396  hbhimani  2016-11-10
1        getuserpreferences       179 2016-11-10 05:28:30.575  hbhimani  2016-11-10
2             getpreference         3 2016-11-10 05:28:55.686  hbhimani  2016-11-10
3             getpreference         4 2016-11-10 05:28:55.961  hbhimani  2016-11-10
4      constructfromsession         4 2016-11-10 05:28:56.108  hbhimani  2016-11-10
5        getuserpreferences         4 2016-11-10 05:28:56.112  hbhimani  2016-11-10
6        getuserpreferences         3 2016-11-10 05:28:56.116  hbhimani  2016-11-10
7      setbooleanpreference         4 2016-11-10 05:28:56.238  hbhimani  2016-11-10
8      setbooleanpreference         4 2016-11-10 05:28:56.513  hbhimani  2016-11-10
9    getquicksearchinitinfo         3 2016-11-10 05:28:58.936  hbhimani  2016-11-10
10  getquicksearchinitinfo2         4 2016-11-10 05:28:59.315  hbhimani  2016-11-10
                     action  duration                   _time      user         day
11                    login         3 2016-11-10 05:29:29.202  hbhimani  2016-11-10
12               getsummary      4042 2016-11-10 05:29:33.246  hbhimani  2016-11-10
13            getenclosures       457 2016-11-10 05:29:34.372  hbhimani  2016-11-10
14            getaudittrail      1061 2016-11-10 05:29:36.034  hbhimani  2016-11-10
15        getrelateddefects         5 2016-11-10 05:29:36.586  hbhimani  2016-11-10
16       getservicerequests         5 2016-11-10 05:29:36.864  hbhimani  2016-11-10
17           getforeignbugs       270 2016-11-10 05:29:37.408  hbhimani  2016-11-10
18            getenclosures       455 2016-11-10 05:29:50.087  hbhimani  2016-11-10
19               getsummary      5505 2016-11-10 05:32:26.584  hbhimani  2016-11-10
20            getenclosures       459 2016-11-10 05:32:27.940  hbhimani  2016-11-10

Explanation:
In [71]: df['grp_id'] = df.action.eq('login').cumsum()

In [72]: df[['action','user','grp_id']]
Out[72]:
                     action      user  grp_id
0                     login  hbhimani       1
1        getuserpreferences  hbhimani       1
2             getpreference  hbhimani       1
3             getpreference  hbhimani       1
4      constructfromsession  hbhimani       1
5        getuserpreferences  hbhimani       1
6        getuserpreferences  hbhimani       1
7      setbooleanpreference  hbhimani       1
8      setbooleanpreference  hbhimani       1
9    getquicksearchinitinfo  hbhimani       1
10  getquicksearchinitinfo2  hbhimani       1
11                    login  hbhimani       2
12               getsummary  hbhimani       2
13            getenclosures  hbhimani       2
14            getaudittrail  hbhimani       2
15        getrelateddefects  hbhimani       2
16       getservicerequests  hbhimani       2
17           getforeignbugs  hbhimani       2
18            getenclosures  hbhimani       2
19               getsummary  hbhimani       2
20            getenclosures  hbhimani       2
21                    login  anshanno       3
22        getrelateddefects  anshanno       3
23       getservicerequests  anshanno       3
24           getforeignbugs  anshanno       3
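
To get from these groups to the one-row-per-session summary asked for in the question, the same login-based key can be passed to an aggregation. This is only a minimal sketch under some assumptions: the helper column is_getpreference, the key name session, and the output column session_duration_s are names made up here; the sample is a shortened copy of the question's data; and duration is assumed to mean the time elapsed between the first and last event of a session, since the question does not say how its 180 was computed. Named aggregation needs pandas 0.25 or later.

```python
import io
import pandas as pd

# Shortened copy of the question's sample: two sessions for one user, one for another.
csv_text = """action,duration,_time,user,day
login,4,2016-11-10 05:28:30.396,hbhimani,11/10/2016
getpreference,3,2016-11-10 05:28:55.686,hbhimani,11/10/2016
getpreference,4,2016-11-10 05:28:55.961,hbhimani,11/10/2016
login,3,2016-11-10 05:29:29.202,hbhimani,11/10/2016
getsummary,4042,2016-11-10 05:29:33.246,hbhimani,11/10/2016
login,997,2016-11-10 05:32:29.480,anshanno,11/10/2016
getforeignbugs,6,2016-11-10 05:32:30.585,anshanno,11/10/2016
"""
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["_time"])

# Each login starts a new session, so the running count of logins acts as a session id.
# "session" is a hypothetical name chosen for this sketch.
session_id = df["action"].eq("login").cumsum().rename("session")

# Hypothetical helper column: 1 for getpreference rows, 0 otherwise,
# so the per-session count is a plain sum.
df["is_getpreference"] = df["action"].eq("getpreference").astype(int)

summary = (
    df.groupby(["day", "user", session_id])
      .agg(
          getpreference_count=("is_getpreference", "sum"),
          total_session_actions=("action", "size"),  # includes the login row
          first_event=("_time", "min"),
          last_event=("_time", "max"),
      )
      .reset_index()
)

# Assumed definition of duration: last event time minus first event time, in seconds.
summary["session_duration_s"] = (
    summary["last_event"] - summary["first_event"]
).dt.total_seconds()

print(summary)
```

The global cumsum relies on each user's events being contiguous, as they are in the sample. If events from different users were interleaved, counting logins per user, for example df['action'].eq('login').groupby(df['user']).cumsum(), would probably be the safer key.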