r - Top Athlete from a statistical point of view
Let's say there is a junior track and field athlete specializing in the 100m. I have rankings of 400 junior players for each individual year from 2006 until 2016 (each year is a separate csv file (table)),
and I have rankings of senior players for each individual year from 2006 until 2016 (each year is a separate csv file (table)).
The question I want to answer: is there a correlation between being a junior athlete and the chances of being a world star?
So how should I approach the problem? I have skills in R. Please point me in the right direction.
Is there a correlation between being a junior athlete and the chances of being a world star?
Is being a world star equal to appearing in the second group of csvs?
Is being in the first group of csvs proof of being a junior athlete?
I will suppose that each name is unique and that names don't change over time.
You might want to build a table similar to the one used in a McNemar test.
                   name in top athletes
                     yes  |  no
                   +------+-------
top        yes     | 150  | 250
junior     no      | 250  | 550
Right now, I fail to see a reason why you could not compute an odds ratio to answer the question.
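As a rough sketch of that odds ratio in R, assuming the four counts from the table above (they are only placeholders until the real counts are in), something like this should work. The object names `tab`, `odds_ratio` and `ft` are made up for the example; `fisher.test` is from base R's stats package.

```r
# Placeholder counts taken from the 2x2 table above; swap in the real ones.
tab <- matrix(c(150, 250,
                250, 550),
              nrow = 2, byrow = TRUE,
              dimnames = list(top_junior  = c("yes", "no"),
                              top_athlete = c("yes", "no")))

# Sample odds ratio: (yes/yes * no/no) / (yes/no * no/yes)
odds_ratio <- (tab["yes", "yes"] * tab["no", "no"]) /
              (tab["yes", "no"]  * tab["no", "yes"])
print(odds_ratio)

# Fisher's exact test reports a conditional estimate of the odds ratio,
# a confidence interval for it, and a p-value.
ft <- fisher.test(tab)
print(ft$estimate)
print(ft$conf.int)
print(ft$p.value)
```

With the placeholder counts the sample odds ratio is 150 * 550 / (250 * 250) = 1.32. The estimate from `fisher.test` is a conditional maximum likelihood estimate, so it will usually differ slightly from that hand-computed value.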
All that is needed is to rbind the junior csvs and take the unique names, do the same for the top csvs, and then merge the two with an inner join to find the overlapping names. Joins can be done using merge.
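The rbind / unique / merge steps could look roughly like the sketch below. The small data frames are invented stand-ins so the example runs on its own; the column names `name` and `rank` and the file name in the comment are assumptions, not something taken from the question.

```r
# Invented stand-ins for the yearly tables. With real data each one would
# come from something like read.csv("juniors_2006.csv") (hypothetical name).
juniors_2006 <- data.frame(name = c("A", "B", "C"), rank = 1:3)
juniors_2007 <- data.frame(name = c("B", "D"), rank = 1:2)
seniors_2010 <- data.frame(name = c("B", "E"), rank = 1:2)
seniors_2011 <- data.frame(name = c("D", "E", "F"), rank = 1:3)

# Stack the yearly tables of each group and keep every name only once
juniors <- unique(do.call(rbind, list(juniors_2006, juniors_2007))["name"])
seniors <- unique(do.call(rbind, list(seniors_2010, seniors_2011))["name"])

# merge() does an inner join by default: only names present in both groups
both <- merge(juniors, seniors, by = "name")

# Counts that the two name lists determine directly
n_both        <- nrow(both)
n_junior_only <- nrow(juniors) - n_both
n_senior_only <- nrow(seniors) - n_both
print(c(both = n_both, junior_only = n_junior_only, senior_only = n_senior_only))
```

These three counts fill three cells of the 2x2 table. The no/no cell cannot be read off the two lists alone, so it depends on how the overall population of athletes is defined.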