Regex to make a string meaningful (Python)
How can I remove non-English words (words that are not in the vocabulary) from a string?
For example:
puppies monitoring_string = c1299fe10ba49eb54f197dd4f735fcdc dogtime
How can I remove the non-English word and keep the vocabulary words? The result should be:
puppies monitoring string dogtime
or
puppies monitoring string .... or other variants.
The purpose is to make the string meaningful.
What I tried was:
re.sub('[^a-zA-Z0-9]+', ' ', string)
Result: puppies monitoring string c1299fe10ba49eb54f197dd4f735fcdc dogtime
I can't think of any logic that tells which tokens are real words and which are not.
To start, you could try removing any words that have numbers in them.
The regex \w*\d\w*
should find letter combinations that contain numbers, as well as plain numbers.
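A minimal sketch of that suggestion, combining the substitution from the question with the \w*\d\w* pattern. The function name strip_digit_tokens is hypothetical and used only for illustration, and the whitespace cleanup at the end is an added assumption, not part of the original answer.

```python
import re

def strip_digit_tokens(text):
    # Step 1 (from the question): replace each run of non-alphanumeric
    # characters, such as "_" or " = ", with a single space.
    text = re.sub(r'[^a-zA-Z0-9]+', ' ', text)
    # Step 2 (from the answer): remove every token that contains a digit.
    text = re.sub(r'\w*\d\w*', '', text)
    # Step 3 (assumption): collapse the whitespace left behind by the removal.
    return ' '.join(text.split())

print(strip_digit_tokens(
    "puppies monitoring_string = c1299fe10ba49eb54f197dd4f735fcdc dogtime"
))
# puppies monitoring string dogtime
```

This only catches tokens that contain at least one digit. A hash-like token made purely of letters would survive, so telling real words from non-words in general would probably need a lookup against a word list.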