r - How to aggregate data.table by applying two functions
I have a data.table that lists the user ID, the week number, whether the user did something (processed, either 0 or 1), and a column I use to count how many values I have, called howmany:

data <- data.table(
  weeknumber = c(33, 33, 33, 34, 34, 33, 33, 34, 34),
  user       = c(1, 1, 1, 1, 1, 2, 2, 2, 2),
  processed  = c(1, 1, 0, 0, 1, 0, 1, 0, 1),
  howmany    = c(1, 1, 1, 1, 1, 1, 1, 1, 1)
)
I want to find, for each week, the sum of things done and not done, like this:

> dcast(setDT(data), weeknumber ~ processed, value.var = "howmany", sum)
   weeknumber 0 1
1:         33 2 3
2:         34 2 2
Now I'd like to find the average number of things done and not done per week. In this case I have to aggregate by user first, and I fail at this step:

> dcast(setDT(data), weeknumber ~ processed + user, value.var = "howmany", mean)
   weeknumber 0_1 0_2 1_1 1_2
1:         33   1   1   1   1
2:         34   1   1   1   1
while the desired result would be:

weeknumber 0   1
33         1 1.5
34         1   1
What about this:

dat[, user_processed := paste(user, processed, sep = "_")]
dcast(dat, weeknumber ~ user_processed, value.var = "processed", length)
Which gives you:

   weeknumber 10001041_1 10001042_0 10001042_1
1:         33          0          3          2
2:         43          5          0          0
Sample data used:

dat <- fread("user processed weeknumber
1: 10001042 0 33
2: 10001042 0 33
3: 10001042 1 33
4: 10001042 0 33
5: 10001042 1 33
870: 10001041 1 43
871: 10001041 1 43
872: 10001041 1 43
873: 10001041 1 43
874: 10001041 1 43")
# fread reads the leading row labels as an extra column (V1); drop it
dat <- dat[, V1 := NULL]
setnames(dat, c("user", "processed", "weeknumber"))
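For the averaged table asked for in the question, one possible approach is to aggregate in two steps: first sum howmany per week, user and processed flag, then let dcast take the mean of those per-user totals. This is only a minimal sketch using the question's own data; the names per_user, total and result are made up for illustration and do not come from the original answer.

```r
library(data.table)

# Data from the question
data <- data.table(
  weeknumber = c(33, 33, 33, 34, 34, 33, 33, 34, 34),
  user       = c(1, 1, 1, 1, 1, 2, 2, 2, 2),
  processed  = c(1, 1, 0, 0, 1, 0, 1, 0, 1),
  howmany    = c(1, 1, 1, 1, 1, 1, 1, 1, 1)
)

# Step 1: total per week, user and processed flag
# (per_user and total are illustrative names)
per_user <- data[, .(total = sum(howmany)), by = .(weeknumber, user, processed)]

# Step 2: average the per-user totals across users,
# with one column per processed flag
result <- dcast(per_user, weeknumber ~ processed,
                value.var = "total", fun.aggregate = mean)
print(result)
```

Note that a user with no rows for a given week and flag is simply left out of that mean rather than counted as zero. If zeros should count, the missing combinations would need to be filled in before step 2.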