R - Packing customers into buckets


I have 25 customers. Each customer has a number of users of our system, e.g. customer 1 has 45 users, customer 2 has 46 users... customer 25 has 1000 users.

I want to bin each customer into a bucket so that each bucket contains an equal number of users. I know that I want 5 buckets in total.

(The buckets here represent servers: I want to apportion clients to different servers so that the total number of users per server is equal, to prevent overloading the servers. One client has to be on the same server, i.e. I can't split one client across two servers.)

Any ideas on suitable methods for apportioning customers to buckets? I thought clustering methods might work (I tried kmeans using R), but I can't seem to find a way of stipulating that the total number of users in each cluster should be the same.

Here's an R code example of what I've done so far:

# create dataset
r <- data.frame(users = c(1000, 960, 920, 870, 850, 700, 600, 550, 520, 500,
                          420, 400, 390, 300, 210, 200, 160, 80, 70, 50,
                          49, 48, 47, 46, 45))

# try kmeans clustering
fit <- kmeans(r, 5)

# get cluster means
aggregate(r, by = list(fit$cluster), FUN = mean)

# append cluster assignment
r <- data.frame(r, fit$cluster)

# plot clusters
library(cluster)
clusplot(r, fit$cluster, color = TRUE, shade = TRUE, labels = 2, lines = 0)
library(fpc)
plotcluster(r, fit$cluster)

This clusters the customers into buckets, but the number of users in each bucket is not equal.
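To see the imbalance concretely, one could sum the users per cluster and compare that with the ideal share per server. This is only a minimal sketch: it rebuilds the `users` vector from the code above, and the names `target` and `totals` as well as the fixed seed are assumptions made for illustration.

```r
users <- c(1000, 960, 920, 870, 850, 700, 600, 550, 520, 500,
           420, 400, 390, 300, 210, 200, 160, 80, 70, 50,
           49, 48, 47, 46, 45)

set.seed(1)                        # assumption: fixed seed so the run is repeatable
fit <- kmeans(users, centers = 5)

target <- sum(users) / 5           # ideal number of users per server
totals <- tapply(users, fit$cluster, sum)  # actual users per k-means cluster

target
totals
totals - target                    # how far each cluster is from the ideal share
```

k-means groups customers of similar size, so it tends to put the largest customers together, which works against equal totals.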

I've tagged this as an R problem, but if there's a simple solution in another package I'm all ears :-)

I don't know what the recommended solution for such 'constant sum sampling' is. Here's my shot at it: sort the items, convert them to a matrix in which each column represents a sample, and reverse every other row.

Here's the code:

set.seed(1024)
r <- data.frame(users = c(1000, 960, 920, 870, 850, 700, 600, 550, 520, 500,
                          420, 400, 390, 300, 210, 200, 160, 80, 70, 50,
                          49, 48, 47, 46, 45))

a <- r$users  # runif(n = 25, 100, 400) # rnorm(25, 100, 100) # 1:25
# hist(a)
df <- data.frame(id = 1:25, x = a)

# sort
x <- df$id[order(df$x)]

# convert to a matrix
# each column of the matrix represents 1 sample
xm <- matrix(x, ncol = 5, byrow = TRUE); xm
oldsum <- apply(matrix(df$x, ncol = 5, byrow = TRUE), 2, sum)

# flip alternate rows of the sorted matrix
# (rev() on the sub-matrix also swaps the selected rows with each other,
#  which does not change which column each item ends up in)
i <- 1:nrow(xm)
im <- i[c(FALSE, TRUE)]
xm[im, ]
xm[im, ] <- rev(xm[im, ])

# new matrix of indices
xm

# hence the new matrix of values
xm2 <- matrix(a[c(xm)], ncol = 5, byrow = FALSE)
xm
xm2

newsum <- apply(xm2, 2, sum)

# improvement
rbind(oldsum, newsum)
barplot(rbind(oldsum, newsum)[1, ])
barplot(rbind(oldsum, newsum)[2, ])

# each column of the following matrix represents 1 sample
# (values are indices in the original vector a)
xm
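For comparison, here is a different heuristic that is not part of the answer above: a greedy "largest first" assignment, where customers are visited from largest to smallest and each one goes to whichever server currently has the fewest users. It is only a sketch, with no guarantee of an optimal split, and the names `n_buckets`, `bucket` and `bucket_load` are made up for illustration.

```r
users <- c(1000, 960, 920, 870, 850, 700, 600, 550, 520, 500,
           420, 400, 390, 300, 210, 200, 160, 80, 70, 50,
           49, 48, 47, 46, 45)
n_buckets <- 5

# visit customers from largest to smallest
ord <- order(users, decreasing = TRUE)

bucket      <- integer(length(users))  # bucket[i] = server assigned to customer i
bucket_load <- numeric(n_buckets)      # running user total per server

for (i in ord) {
  b <- which.min(bucket_load)          # lightest server so far (first one on ties)
  bucket[i] <- b
  bucket_load[b] <- bucket_load[b] + users[i]
}

bucket_load                            # users per server
split(users, bucket)                   # user counts of the customers on each server
```

Unlike the matrix approach, this does not need the number of customers to be a multiple of the number of servers, and the servers may end up with different numbers of customers.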
