csv - Split columns that are separated with tabs and spaces


I have a weird file format here that uses tabs and spaces in any amount to separate fields (including leading and trailing ones). The special twist is that fields can contain spaces themselves, in which case they are quoted and escaped in a CSV-like manner.

One example:

   0    "some string" 234      23947     123 ""some escaped"string"" 

I am trying to parse such columns with awk, and I need to have every item in an array, e.g.

   foo[0] -> 0
   foo[1] -> "some string"
   foo[2] -> 234
   foo[3] -> 23947
   foo[4] -> 123
   foo[5] -> ""some escaped"string""

Is this possible? I read http://web.archive.org/web/20120531065332/http://backreference.org/2010/04/17/csv-parsing-with-awk/ which says that parsing CSV with awk is hard. (For a start it would be enough to parse normal quoted strings with spaces; the escaped variant is rare.)

Before I mess around with this for a long time: is there a way to do this in awk, or would it be better to use another language?

With GNU awk for FPAT:

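   $ cat tst.awk
   BEGIN { FPAT="\\S+|\"[^\"]+\"|,[^,]+," }
   {
       gsub(/@/,"@a")      # protect any literal "@"
       gsub(/,/,"@b")      # protect any literal ","
       gsub(/""/,",")      # use "," as a stand-in for each escaped ""
       for (i=1; i<=NF; i++) {
           # undo the three substitutions in reverse order
           gsub(/,/,"\"\"",$i)
           gsub(/@b/,",",$i)
           gsub(/@a/,"@",$i)
           print i, $i
       }
   }

   $ awk -f tst.awk file
   1 0
   2 "some string"
   3 234
   4 23947
   5 123
   6 ""some escaped"string""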

To understand what that's doing, see https://stackoverflow.com/a/40512703/1745001
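
Since the question also asks whether another language would be easier, here is a rough Python sketch of the same placeholder trick, for comparison only. The names `FIELD_RE` and `split_fields` are made up for this illustration. One difference worth noting: Python's `re` picks the first alternative that matches rather than the longest one, so the quoted forms are listed before `\S+` here, whereas the order does not matter in the awk pattern above.

```python
import re

# Same three alternatives as the awk FPAT, but with the quoted forms first,
# because Python's re tries alternatives left to right and stops at the
# first one that matches.
FIELD_RE = re.compile(r'"[^"]+"|,[^,]+,|\S+')


def split_fields(line):
    """Split on runs of blanks, keeping "quoted" and ""escaped"" fields whole.

    Hypothetical helper for illustration; it mirrors the awk script's steps.
    """
    # Protect literal "@" and "," first, then use "," as a stand-in for '""'.
    line = line.replace("@", "@a").replace(",", "@b").replace('""', ",")
    fields = []
    for field in FIELD_RE.findall(line):
        # Undo the substitutions in reverse order.
        field = field.replace(",", '""').replace("@b", ",").replace("@a", "@")
        fields.append(field)
    return fields


line = '   0    "some string" 234      23947     123 ""some escaped"string"" '
for i, field in enumerate(split_fields(line)):
    print(i, field)
```

Run on the sample line from the question, this should give the same six fields as the awk script, numbered from 0 like the `foo[...]` example rather than from 1.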

