awk - Find duplicate records where the only difference is text case
I have a log file with 8 million entries/records (URLs). I'd like to find duplicate URLs (i.e. the same URL) where the only difference is the text case.
Example (casing here is illustrative):
origin-www.example.com/this/is/hard.html
origin-www.example.com/This/Is/Hard.html
origin-www.example.com/THIS/IS/HARD.HTML
In this case, there are 3 duplicates when case is ignored.
The output should either be a count (in the style of uniq -c) or a new file containing only the duplicates.
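For reference, the uniq -c style output mentioned above can be produced case-insensitively with standard tools; a minimal sketch, assuming the log is in a file named file:

```shell
# Lowercase every line, sort so identical lines are adjacent,
# then print only repeated lines with their counts (-d = duplicates only).
tr '[:upper:]' '[:lower:]' < file | sort | uniq -cd
```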
Use the typical awk '!seen[$0]++' file trick combined with tolower() (or toupper()) to normalize the lines to the same case:
$ awk '!seen[tolower($0)]++' file
origin-www.example.com/this/is/hard.html
For a different kind of output (counters or whatever else), please provide a valid sample of the desired output.
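If the desired output is a count per duplicated URL, the same tolower() idea extends naturally; a minimal sketch, assuming the log is in a file named file:

```shell
# Tally each line under its lowercased form, then in the END block
# print only the groups that occurred more than once, with their counts.
awk '{ count[tolower($0)]++ }
     END { for (u in count) if (count[u] > 1) print count[u], u }' file
```

Note that awk's for (u in count) iterates in no particular order; pipe the result through sort if a stable ordering matters.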