R - Substring in Corpus in tm package
I have created a corpus using the following commands:
corpus_map <- VCorpus(VectorSource(classified_narr_sel$narration))
corpus_map <- tm_map(corpus_map, removeNumbers)
The second command removes numbers from the corpus. Is there a similar command that can take a substring of each word in the corpus? E.g. "travelling" should be converted to its first 3 letters, "tra". Normally, I would use
substr("travelling",1,3)
but I want to do the same thing on a corpus in tm.
You can write a function containing the conversions you want and run it on the corpus, for example:
convertstrings <- function(textinput) {
  textoutput <- gsub("travelling", "tra", textinput)
  textoutput <- gsub("furtherwords", "further", textoutput)
  # ...
  return(textoutput)
}
# Apply through tm_map so the result is still a corpus
corpus_transformed <- tm_map(corpus_map, content_transformer(convertstrings))
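
Listing every word by hand will not scale if the goal is to shorten all words to their first 3 letters. Here is a minimal sketch of a more general approach, assuming the tm package is installed and that words are separated by whitespace. The helper truncate_words and the two-document demo corpus are hypothetical names made up for illustration; content_transformer, tm_map, VCorpus and VectorSource are the regular tm functions.

```r
library(tm)

# Hypothetical helper: cut every whitespace-separated word down to its
# first n characters. Works on a character vector, one element at a time.
truncate_words <- function(text, n = 3) {
  vapply(text, function(line) {
    words <- strsplit(line, "\\s+")[[1]]
    paste(substr(words, 1, n), collapse = " ")
  }, character(1), USE.NAMES = FALSE)
}

# Small demo corpus (an assumption, standing in for the real narrations)
corpus_demo <- VCorpus(VectorSource(c("travelling further", "another narration")))

# content_transformer() wraps the function so tm_map keeps the corpus structure
corpus_short <- tm_map(corpus_demo, content_transformer(truncate_words))

content(corpus_short[[1]])
# "tra fur"
```

A different length can be passed through tm_map, e.g. tm_map(corpus_demo, content_transformer(truncate_words), n = 4). Note that punctuation attached to a word counts toward the n characters, so it may be worth running removePunctuation first.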