Encoding error in sed


I've been trying to remove duplicate character sequences in social media text using the following code:

sed 's/\([a-zA-Z]\)\1\1\1*/\1\1\1/g'

The code works fine on regular ASCII lines but breaks on non-ASCII text with the error sed: RE error: illegal byte sequence. Example:

you 💩 

For what it's worth, I'm running Mac OS X. Do I need to reset an encoding variable?
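
One workaround that is often suggested for this error, sketched here rather than offered as a definitive fix: run sed in the C locale by setting LC_ALL=C for that one command. The error generally means sed hit bytes it could not decode in the current locale, and in the C locale it treats input as raw bytes instead. The function name squeeze_repeats and the sample input are hypothetical, added only for illustration.

```shell
# Hypothetical helper: collapse runs of three or more identical ASCII
# letters down to exactly three.
# LC_ALL=C applies only to this sed invocation and makes sed process the
# input byte by byte, so bytes outside ASCII are passed through untouched.
squeeze_repeats() {
  LC_ALL=C sed 's/\([a-zA-Z]\)\1\1\1*/\1\1\1/g'
}

# Sample input (an assumption, not taken from the original data).
printf 'thank youuuuuu 💩\n' | squeeze_repeats
```

Because the pattern only targets ASCII letters, byte-wise matching should not change which characters are squeezed. It would matter for a pattern that had to match non-ASCII letters as whole characters, in which case converting the input to valid UTF-8 first (for example with iconv) may be the better route.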

