Regex to match a repeating pattern between two strings
I have a dataset with a repeating pattern:
----
mv: oxford , cambridge university boat race (1895)
sd: 30 march 1895 -
----
mv: awakening of rip (1896)
cp: american mutoscope company; 4 february 1897; 9237 (in copyright registry)
pd: august 1896 - august 1896
----
mv: chegada comboio inaugural à estação central porto (1897)
pd: 7 november 1896 -
----
mv: exit of rip , dwarf (1896)
cp: american mutoscope , biograph co.; 9 december 1902; h24875 (in copyright registry)
pd: august 1896 - august 1896
----

Now, I'd like to take what's between the first ---- and the next ---- string and change each \n to \t, so that each entry sits on a single line, tab separated. With each entry separated by ----, it would be easier to read in. In the end it should look like:
----
mv: oxford , cambridge university boat race (1895)	sd: 30 march 1895 -
----
mv: awakening of rip (1896)	cp: american mutoscope company; 4 february 1897; 9237 (in copyright registry)	pd: august 1896 - august 1896
----
mv: chegada comboio inaugural à estação central porto (1897)	pd: 7 november 1896 -
----
mv: exit of rip , dwarf (1896)	cp: american mutoscope , biograph co.; 9 december 1902; h24875 (in copyright registry)	pd: august 1896 - august 1896
----

I tried positive lookbehind patterns, but had no luck.
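You want both a negative lookbehind and a negative lookahead, so that a newline only matches when it is neither directly after nor directly before a ---- separator. Try this:

(?<!----)\n(?!----)

Then replace the matches with \t and you're done.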
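For anyone who wants to try it, here is a minimal Python sketch of the replacement above. The sample string is a shortened copy of the question's data, and the names `raw`, `JOIN_FIELDS`, and `flatten_entries` are made up for illustration. It assumes the file uses plain \n line endings and that ---- appears only as a separator line.

```python
import re

# Shortened sample taken from the question, one field per line.
raw = (
    "----\n"
    "mv: awakening of rip (1896)\n"
    "cp: american mutoscope company; 4 february 1897; 9237 (in copyright registry)\n"
    "pd: august 1896 - august 1896\n"
    "----\n"
    "mv: chegada comboio inaugural à estação central porto (1897)\n"
    "pd: 7 november 1896 -\n"
    "----\n"
)

# A newline that is neither directly after nor directly before "----".
# The lookbehind is fixed-width (4 characters), which Python's re requires.
JOIN_FIELDS = re.compile(r"(?<!----)\n(?!----)")


def flatten_entries(text: str) -> str:
    """Join the fields of each entry with tabs, leaving separator lines alone."""
    return JOIN_FIELDS.sub("\t", text)


flattened = flatten_entries(raw)
print(flattened)
```

Two caveats with this sketch: with \r\n line endings the lookbehind sees \r instead of the last dash, so the newlines after the separators would be replaced as well (normalise the line endings first), and if the data does not end with a ---- line, the final newline is turned into a trailing tab.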