How to draw a stratified random sample in R?
I have a data.frame called "per" with 3 variables: nrodocumento, cod_jer (42 groups) and grupo_fict (8 groups). I need to draw a random sample (as a data.frame) for each cod_jer, within each grupo_fict.
> dput(head(per))
structure(list(nrodocumento = c(49574917L, 54692750L, 54731807L,
57364176L, 57364198L, 46867674L), cod_jer = c(1146L, 32L, 0L,
0L, 0L, 0L), grupo_fict = c(3L, 1L, 8L, 1L, 1L, 1L)), .Names = c("nrodocumento",
"cod_jer", "grupo_fict"), row.names = c(NA, 6L), class = "data.frame")

> head(per,n=100)
    nrodocumento cod_jer grupo_fict
  1     49574917    1146          3
  2     54692750      32          1
  3     54731807       0          8
  4     57364176       0          1
  5     57364198       0          1
  6     46867674       0          1
  7     46867668       0          1
  8     57364201       0          1
  9     53767871       0          1
 10     55339012       0          1
 11     49204318       0          8
 12     53743017       0          1
 13     47622958       0          1
 14     49019862       0          1
 15     50167428       0          2
 16     48783260       0          4
 17     52020945     433          5
 18     54486680     236          4
 19     51402916       0          4
 20     48543242       0          2
 21     54671603       0          1
 22     50644599       0          8
 23     53293608       0          1
 24     52742799       0          4
 25     49815210       0          8
 26     50967719     236          3
 27     51938997       0          8
 28     50057188     324          3
 29     52754706       0          6
 30     55322102       0          3
 31     53040748       0          1
 32     50321642       0          5
 33     51621354     236          8
 34     49611806       0          7
 35     53347667       0          8
 36     52462498       0          3
 37     54158570       0          8
 38     54034849       0          8
 39     52507674     321          3
 40     50218598     317          7
 41     45078442     432          7
 42     51491066       0          8
 43     53278953       0          2
 44     52661658       0          2
 45     50092873     236          3
 46     50308064       0          7
 47     51941635       0          7
 48     53527966       0          1
 49     49614579       0          1
 50     49450678     318          8
 51     52953427    1146          7
 52     52133221       0          8
 53     53363128       0          7
 54     52819643       0          1
 55     47516589       0          1
 56     52563137       0          3
 57     49511296       0          7
 58     54154013       0          2
 59     50822420    1349          4
 60     50822408    1349          4
 61     50822414    1349          6
 62     52339683       0          1
 63     50026113       0          7
 64     47328586       0          7
 65     56041961       0          7
 66     47756955     432          8
 67     53158397       0          7
 68     53151167       0          7
 69     54710039       0          3
 70     54408844     114          4
 71     46286323     114          4
 72     50310877       0          1
 73     50929135       0          7
 74     49817218       0          1
 75     53604540       0          8
 76     52812736    1147          1
 77     53726314    1147          1
 78     50835936       0          8
 79     55429334       0          1
 80     48421020     329          8
 81     49800217       0          3
 82     52818263       0          1
 83     45884978       0          1
 84     50203385       0          1
 85     53433610       0          2
 86     54515938       0          1
 87     50263935       0          8
 88     52439152       0          2
 89     48424129     236          3
 90     47031563       0          8
 91     53577610      11          1
 92     48759083      11          1
 93     50344731     432          1
 94     51164013       0          3
 95     52026977     163          7
 96     50965482       0          3
 97     45947594     433          8
 98     53357234       0          7
 99     48367529       0          8
100     54286153       0          3
> table(per$cod_jer,per$grupo_fict)
        1    2    3    4    5    6    7    8
0    3990 2296 1743 1453  356  250 2031 2051
11    149   85   29   34   14    6   34   25
13      2    4    1    0    0    0    1    1
14      3    1    0    0    0    0    0    1
32     37   12   13   10    3    1   23   13
101    19   12    6    5    3    0    6   12
102     2    0    0    0    0    0    0    0
103    11   10    3    3    0    1    3    0
104    17    8    1    7    2    1    7    9
105    11   12    3    3    3    0    6   10
106   147   57   30   29    8    1   43   42
107    33   37    5    9    3    2    8    9
108     6   10    2    3    0    2    3    4
109    44   37   11    9    6    2   14   14
111   112   81   26   28    8    3   22   18
112    21    8    4    8    2    0    3    2
113    94   61   14   16    4    1   17   24
114    60   52   10   14    9    5    8   20
115    72   24   21   13    5    1   11   16
125     5    4    1    0    1    0    0    1
138    15    5    2    2    1    0    2    0
163    50   35   26   26    7   12   43   41
234    51   43   31   32   10    7   49   53
236    78   29   46   35    7    7   39   37
317    44   28   21   13    7    2   28   21
318    20   27    5   10    4    3   12   14
319    45   21   25   19    1    2   26   21
321     6    4    9    3    0    3    8    1
322    43   30   24   16    5    3   16   34
323    30   14   25   15    3    4   24   22
324    59   29   31   27    8    5   28   27
325    15   12    6    5    1    2    8   11
326    18   12   17   13    4    2   20   15
327    45   28   23   26    7    6   25   40
328    52   49   33   32    5    9   31   35
329    42   36   26   20    2    3   23   30
431     6    2    4    1    2    0    2    6
432    39   18   27   24    5    1   28   34
433   139   92   90   89   18   13   61   66
1146   97   49   26   14    7    5   24   29
1147   56   33   26   25    9    0   19   20
1349   15    9   11   10    0    1   10    3
1544   62   33   20   32    4    3   25   43
1545   37   13   22   14    1    3   14   31
1848   16   27   11   15    3    0   10   12
On the other hand, I have a data.frame with vacancies, meaning the size of the sample needed within each group.
> dput(head(vacantes))
structure(list(cod_jer = c(101L, 316L, 325L, 1349L, 1544L, 102L
), vacantes = c(132, 180, 54, 63, 45, 0), vac1 = c(27, 36, 11,
13, 9, 0), vac2 = c(27, 36, 11, 13, 9, 0), vac3 = c(24, 33, 10,
12, 9, 0), vac4 = c(24, 33, 10, 12, 9, 0), vac5 = c(8, 11, 4,
4, 3, 0), vac6 = c(8, 11, 4, 4, 3, 0), vac7 = c(7, 10, 3, 3,
2, 0), vac8 = c(7, 10, 3, 3, 2, 0)), .Names = c("cod_jer", "vacantes",
"vac1", "vac2", "vac3", "vac4", "vac5", "vac6", "vac7", "vac8"
), row.names = c(NA, 6L), class = "data.frame")

> vacantes
   cod_jer vacantes vac1 vac2 vac3 vac4 vac5 vac6 vac7 vac8
 1     101      132   27   27   24   24    8    8    7    7
 2     316      180   36   36   33   33   11   11   10   10
 3     325       54   11   11   10   10    4    4    3    3
 4    1349       63   13   13   12   12    4    4    3    3
 5    1544       45    9    9    9    9    3    3    2    2
 6     102        0    0    0    0    0    0    0    0    0
 7     103        0    0    0    0    0    0    0    0    0
 8     104        0    0    0    0    0    0    0    0    0
 9     105        0    0    0    0    0    0    0    0    0
10     106        0    0    0    0    0    0    0    0    0
11     107        0    0    0    0    0    0    0    0    0
12     108        0    0    0    0    0    0    0    0    0
13     109        0    0    0    0    0    0    0    0    0
14     110        0    0    0    0    0    0    0    0    0
15     111        0    0    0    0    0    0    0    0    0
16     112        0    0    0    0    0    0    0    0    0
17     113        0    0    0    0    0    0    0    0    0
18     114        0    0    0    0    0    0    0    0    0
19     115        0    0    0    0    0    0    0    0    0
20     137        0    0    0    0    0    0    0    0    0
21     138        0    0    0    0    0    0    0    0    0
22     139        0    0    0    0    0    0    0    0    0
23     140        0    0    0    0    0    0    0    0    0
24     234        0    0    0    0    0    0    0    0    0
25     236        0    0    0    0    0    0    0    0    0
26     317        0    0    0    0    0    0    0    0    0
27     318        0    0    0    0    0    0    0    0    0
28     319        0    0    0    0    0    0    0    0    0
29     320        0    0    0    0    0    0    0    0    0
30     321        0    0    0    0    0    0    0    0    0
31     322        0    0    0    0    0    0    0    0    0
32     323        0    0    0    0    0    0    0    0    0
33     324        0    0    0    0    0    0    0    0    0
34     326        0    0    0    0    0    0    0    0    0
35     327        0    0    0    0    0    0    0    0    0
36     328        0    0    0    0    0    0    0    0    0
37     329        0    0    0    0    0    0    0    0    0
38     431        0    0    0    0    0    0    0    0    0
39     432        0    0    0    0    0    0    0    0    0
40     433        0    0    0    0    0    0    0    0    0
41    1146        0    0    0    0    0    0    0    0    0
42    1147        0    0    0    0    0    0    0    0    0
43    1545        0    0    0    0    0    0    0    0    0
44    1630        0    0    0    0    0    0    0    0    0
45    1848        0    0    0    0    0    0    0    0    0
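To make the layout of the vacancies table concrete, here is a small sketch that turns the wide vac1..vac8 columns into one row per (cod_jer, grupo_fict) pair. It assumes that column vacN holds the sample size for grupo_fict N, which the column names suggest but the data does not state. The names `vac_cols`, `sizes` and `n` are hypothetical, and only the first three rows of the table are used.

```r
# Toy subset of `vacantes`: the first three rows shown in the dput above.
vacantes <- data.frame(
  cod_jer = c(101L, 316L, 325L),
  vac1 = c(27, 36, 11), vac2 = c(27, 36, 11),
  vac3 = c(24, 33, 10), vac4 = c(24, 33, 10),
  vac5 = c(8, 11, 4),   vac6 = c(8, 11, 4),
  vac7 = c(7, 10, 3),   vac8 = c(7, 10, 3)
)

vac_cols <- paste0("vac", 1:8)

# One row per (cod_jer, grupo_fict) pair; `n` is the requested sample size.
# as.matrix() is unrolled column by column, so cod_jer cycles inside each group.
# Assumption: vacN is the size for grupo_fict == N.
sizes <- data.frame(
  cod_jer    = rep(vacantes$cod_jer, times = length(vac_cols)),
  grupo_fict = rep(1:8, each = nrow(vacantes)),
  n          = as.vector(as.matrix(vacantes[vac_cols]))
)

# Requested size for cod_jer 325 in grupo_fict 5
sizes$n[sizes$cod_jer == 325 & sizes$grupo_fict == 5]
```

In this long form, a sample size of 0 is simply a row with `n == 0`, so no separate branch is needed for it.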
I want to draw a stratified sample within each combination of the groups cod_jer and grupo_fict; where the number of vacancies is 0, the sample size should be 0.
I am trying this:
size=subset(vacantes,select=c(vac1,vac2,vac3,vac4,vac5,vac6,vac7,vac8))
size=as.matrix(size)
size=as.vector(size)
for(i in 1:length(size)) {
  if (size[i] > 0 ) {
    s=strata(per,c("cod_jer","grupo_fict"),size=size, method="srswor")
  } else {
    s="0"
  }
}
but it doesn't work :(
Any suggestions?
Thanks!
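One thing visible in the attempt above: every pass of the loop hands the full `size` vector to `strata()` and overwrites `s`, so the loop itself never picks one size per stratum. What follows is a minimal sketch of a different route, using base R `split()` and `sample.int()` instead of `strata()`. It runs on made-up toy data with the same column names. Several things here are assumptions rather than facts from the post: that column vacN holds the size for grupo_fict N, that a cod_jer missing from `vacantes` should get size 0, and that a stratum with fewer rows than requested should be taken whole (in the tables above, cod_jer 101 has 19 rows in group 1 but 27 vacancies). The names `n_wanted`, `strata_rows`, `picked` and `sampled` are hypothetical.

```r
set.seed(1)  # only so the toy example is reproducible

# Toy stand-ins for `per` and `vacantes` (made-up rows, same column names).
per <- data.frame(
  nrodocumento = 1:60,
  cod_jer      = rep(c(101L, 325L, 102L), each = 20),
  grupo_fict   = rep(1:2, times = 30)
)
vacantes <- data.frame(
  cod_jer = c(101L, 325L, 102L),
  vac1    = c(3, 50, 0),  # 50 exceeds the 10 rows available for 325 / group 1
  vac2    = c(2, 1, 0)
)

# Requested size for every row of `per`, read from the matching vacN column.
# Assumption: column vacN holds the size for grupo_fict == N.
row_idx <- match(per$cod_jer, vacantes$cod_jer)
col_idx <- match(paste0("vac", per$grupo_fict), names(vacantes))
per$n_wanted <- as.matrix(vacantes)[cbind(row_idx, col_idx)]
per$n_wanted[is.na(per$n_wanted)] <- 0  # no entry in `vacantes`: sample nothing

# Draw without replacement inside each cod_jer x grupo_fict stratum,
# capping the size at the number of rows the stratum actually has.
strata_rows <- split(seq_len(nrow(per)),
                     list(per$cod_jer, per$grupo_fict), drop = TRUE)
picked <- lapply(strata_rows, function(rows) {
  n <- min(per$n_wanted[rows[1]], length(rows))
  rows[sample.int(length(rows), n)]
})
sampled <- per[sort(unlist(picked, use.names = FALSE)), ]

# Sampled rows per stratum, to compare against the requested sizes
table(sampled$cod_jer, sampled$grupo_fict)
```

Strata with 0 vacancies contribute no rows, so they need no special case. If capping is not wanted, the `min()` call is the place to raise an error instead.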