R: Why does "ifelse" coerce factor into integer?
I'm attempting to change the values of a variable to NA if they're not in a vector:
sample <- factor(c('01', '014', '1', '14', '24'))
df <- data.frame(var1 = 1:6, var2 = factor(c('01', '24', 'none', '1', 'unknown', '24')))
df$var2 <- ifelse(df$var2 %in% sample, df$var2, NA)
For some reason R does not preserve the original values of the factor variable and turns them into a numeric sequence:
> sample <- factor(c('01', '014', '1', '14', '24'))
> df <- data.frame(var1 = 1:6, var2 = factor(c('01', '24', 'none', '1', 'unknown', '24')))
> class(df$var2)
[1] "factor"
> df
  var1    var2
1    1      01
2    2      24
3    3    none
4    4       1
5    5 unknown
6    6      24
> df$var2 <- ifelse(df$var2 %in% sample, df$var2, NA)
> class(df$var2)
[1] "integer"
> df
  var1 var2
1    1    1
2    2    3
3    3   NA
4    4    2
5    5   NA
6    6    3
Why does this happen, and what is the correct way of achieving what I'm trying to do here?
(I need to use factors rather than integers in order not to confuse "01" and "1", and the original data set is large, so using factors rather than characters should save me memory.)
I think one way to achieve what you are trying to do is to change the levels of the factor:
levels(df$var2)[!levels(df$var2) %in% sample] <- NA
By changing the levels, the values that do not match the remaining levels are converted to NA in the factor, and the result will be:
> df
  var1 var2
1    1   01
2    2   24
3    3 <NA>
4    4    1
5    5 <NA>
6    6   24
> df$var2
[1] 01   24   <NA> 1    <NA> 24
Levels: 01 1 24
The unknown and none values are no longer in the factor levels. Or, if you want to keep unknown and none in the levels, try this:
df$var2[!df$var2 %in% sample] <- NA
> df
  var1 var2
1    1   01
2    2   24
3    3 <NA>
4    4    1
5    5 <NA>
6    6   24
> df$var2
[1] 01   24   <NA> 1    <NA> 24
Levels: 01 1 24 none unknown
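The reason why ifelse is changing the class of your data is that ifelse does not maintain class: the result takes its shape and attributes from the test (a plain logical vector) and only the underlying values from the yes and no arguments, so for a factor you get the integer codes back without the levels. Read the second answer here: How to prevent ifelse() from turning Date objects into numeric objects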
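To make that a bit more concrete, here is a small sketch using the data from the question. The names keep, res and fixed are just mine for illustration, and the last step (running ifelse on the labels and rebuilding the factor) is one possible base R workaround, not the only one.

```r
# Same data as in the question.
sample <- factor(c('01', '014', '1', '14', '24'))
var2 <- factor(c('01', '24', 'none', '1', 'unknown', '24'))

# Logical vector used as the 'test' argument of ifelse().
keep <- var2 %in% sample

# A factor is stored as integer codes plus a "levels" attribute.
as.integer(var2)   # the codes
levels(var2)       # the labels the codes point to

# ifelse() fills a vector shaped like 'test' with values taken from 'yes',
# so only the integer codes survive and the levels are dropped.
res <- ifelse(keep, var2, NA)
class(res)

# Possible workaround: apply ifelse() to the character labels,
# then convert the result back into a factor.
fixed <- factor(ifelse(keep, as.character(var2), NA))
class(fixed)
```

Note that rebuilding the factor this way drops the unused none and unknown levels, much like the levels() approach above.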
And a last way, as @tchakravarty mentioned in the comments, is to use if_else from dplyr!
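For the if_else route, something like the following should work, assuming the dplyr package is installed. Treat it as a sketch: na_factor is a helper name I made up, and passing an explicit factor NA for the false branch is my own choice, so that both branches are factors with the same levels and nothing depends on how if_else handles a bare NA.

```r
library(dplyr)  # assumption: dplyr is installed

# Same data as in the question.
sample <- factor(c('01', '014', '1', '14', '24'))
df <- data.frame(var1 = 1:6,
                 var2 = factor(c('01', '24', 'none', '1', 'unknown', '24')))

# A missing value that carries the same levels as var2,
# so the 'true' and 'false' arguments have matching types.
na_factor <- factor(NA, levels = levels(df$var2))

# Keep the value where it appears in sample, otherwise use the factor NA.
df$var2 <- if_else(df$var2 %in% sample, df$var2, na_factor)

# Check the class after the replacement.
class(df$var2)
```

Unlike base ifelse, if_else checks that both branches are compatible, so the result should stay a factor instead of falling back to the integer codes.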