Regex to make the string meaningful (Python)


How can I remove non-English words (words that are not in the vocabulary) from a string?

For example:

puppies monitoring_string = c1299fe10ba49eb54f197dd4f735fcdc dogtime 

How can I remove the non-English word and keep the vocabulary words? The desired result is:

puppies monitoring string dogtime 

or

puppies monitoring string  ....or others 

The purpose is to make the string meaningful.

What I tried was:

re.sub('[^a-zA-Z0-9]+', ' ', string)

Result:

puppies monitoring string c1299fe10ba49eb54f197dd4f735fcdc dogtime

I can't think of any logic that words possess and non-words do not.

To start, maybe you can try removing words that have numbers in them.

The regex \w*\d\w* should find letter combinations that contain numbers, as well as plain numbers.
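
A minimal sketch of that suggestion, applied to the example string from the question. The function name strip_digit_tokens is hypothetical, and the second substitution (turning the leftover "_" and "=" into spaces) is an assumption based on the attempt shown in the question, not part of the suggested regex itself. This only removes tokens that contain digits; it does not check whether the remaining words are real English words.

```python
import re

def strip_digit_tokens(text):
    """Remove tokens containing a digit, then keep only letter runs (hypothetical helper)."""
    # \w*\d\w* matches any run of word characters that contains at least one digit.
    # Note: \w also matches "_", so "foo_123" would be removed as a whole.
    no_digits = re.sub(r"\w*\d\w*", " ", text)
    # Assumption: treat everything that is not a letter (e.g. "_" and "=") as a separator.
    letters_only = re.sub(r"[^a-zA-Z]+", " ", no_digits)
    # Collapse repeated spaces and trim the ends.
    return " ".join(letters_only.split())

example = "puppies monitoring_string = c1299fe10ba49eb54f197dd4f735fcdc dogtime"
print(strip_digit_tokens(example))  # puppies monitoring string dogtime
```

A hash-like token made up only of letters would slip through this filter, so it is a starting point rather than a full vocabulary check.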

