R - Substring in Corpus in tm package


I have created a corpus using the following commands:

corpus_map <- VCorpus(VectorSource(classified_narr_sel$narration))
corpus_map <- tm_map(corpus_map, removeNumbers)

The second command removes numbers from the corpus. Is there a similar command that can take a substring of each word in the corpus? E.g. "travelling" should be converted to its first 3 letters, "tra". Normally, I use

substr("travelling",1,3)  

but I want to do the same thing on a corpus in tm.

You can write a function with the conversions you want and run it on the corpus, for example:

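convertstrings <- function(textinput) {
  # replace each listed word with its shortened form
  textoutput <- gsub("travelling", "tra", textinput)
  textoutput <- gsub("furtherwords", "further", textoutput)
  #...
  return(textoutput)
}
# wrap the function with content_transformer() so tm_map() applies it to the text of each document
corpus_transformed <- tm_map(corpus_map, content_transformer(convertstrings))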
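
Listing every word by hand does not scale, so a more general variant may be useful. The following is only a sketch: `truncate_words` is a hypothetical helper name, not part of tm, and it assumes a "word" means a run of regex word characters (`\w`). It keeps the first `n` characters of every word and is applied to a small demo corpus through `tm_map()` and `content_transformer()`.

```r
# Hypothetical helper (not part of tm): keep the first n characters of each word.
truncate_words <- function(text, n = 3) {
  # The capture group keeps up to n word characters; the trailing \w* swallows
  # the rest of the word, which is dropped by replacing the match with \1.
  gsub(sprintf("(\\w{1,%d})\\w*", n), "\\1", text, perl = TRUE)
}

# Works on plain character vectors
truncate_words("travelling by train")

# Apply it to a corpus, if tm is installed
if (requireNamespace("tm", quietly = TRUE)) {
  library(tm)
  docs <- c("travelling by train", "further travelling")  # demo data, assumed
  corpus_demo <- VCorpus(VectorSource(docs))
  # extra arguments to tm_map() are passed on to the wrapped function
  corpus_short <- tm_map(corpus_demo, content_transformer(truncate_words), n = 3)
  print(content(corpus_short[[1]]))
}
```

Words shorter than `n` characters are left unchanged, and punctuation and whitespace between words are preserved.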
