Python - "\r\n" is ignored at CSV file end
    import csv

    impfilename = []
    impfilename.append("file_1.csv")
    impfilename.append("file_2.csv")
    expfilename = "masterfile.csv"
    l = []
    overwrite = False
    comma = ","

    for f in range(len(impfilename)):
        with open(impfilename[f], "r") as impfile:
            table = csv.reader(impfile, delimiter = comma)
            for row in table:
                data_1 = row[0]
                data_2 = row[1]
                data_3 = row[2]
                data_4 = row[3]
                data_5 = row[4]
                data_6 = row[5]
                dic = {"one":data_1, "two":data_2, "three":data_3, "four":data_4, "five":data_5, "six":data_6}
                # look for an existing entry with the same unique ID
                for i in range(len(l)):
                    if l[i]["one"] == data_1:
                        print("data, 1 = " + data_1 + " has been updated using data from " + impfilename[f])
                        l[i] = dic
                        overwrite = True
                        break
                if overwrite == False:
                    l.append(dic)
                else:
                    overwrite = False
        print(impfilename[f] + " has been added to list 'l'")

    with open(expfilename, "a") as expfile:
        print("master file is being created...")
        for i in range(len(l)):
            expfile.write(l[i]["one"] + comma + l[i]["two"] + comma + l[i]["three"] + comma + l[i]["four"] + comma + l[i]["five"] + comma + l[i]["six"] + "\r\n")
    print("process complete")
This program takes 2 (or more) .csv files and compares the unique ID (data_1) of each row to all the others. If there is a match, it assumes the current row is an updated version and overwrites the existing one. If there is no match, it's a new entry.
I store each row's data in a dictionary, which is then stored in the list "l".
Once all the files have been processed, I output the list "l" to "masterfile.csv" in the specified format.
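The merge logic described above can also be sketched with a dictionary keyed on the unique ID, which replaces the inner search loop with a single lookup. This is only a minimal sketch: the helper name `merge_rows`, the in-memory `io.StringIO` stand-ins for the two files, and the sample values are all assumptions made for illustration, not part of the original program.

```python
import csv
import io


def merge_rows(sources, delimiter=","):
    """Merge rows from several CSV sources (hypothetical helper).

    A later row whose first column matches an earlier one replaces it;
    rows with a new ID are added at the end.
    """
    merged = {}  # unique ID -> row; dicts keep insertion order (Python 3.7+)
    for source in sources:
        for row in csv.reader(source, delimiter=delimiter):
            if not row:  # skip blank lines
                continue
            merged[row[0]] = row  # overwrite on match, add otherwise
    return list(merged.values())


# In-memory stand-ins for file_1.csv and file_2.csv (made-up data).
# The first one deliberately has no trailing newline.
file_1 = io.StringIO("a,1,2,3,4,5\nb,1,2,3,4,5")
file_2 = io.StringIO("b,9,9,9,9,9\nc,1,2,3,4,5\n")
rows = merge_rows([file_1, file_2])
```

Assigning to an existing key keeps that key's original position, so an updated row stays where the old one was, as it does with `l[i] = dic` in the list version.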
---The Problem---
The last row of "file_1.csv" and the first row of "file_2.csv" end up on the same line in the output file. I would like the output to continue on a new line.
--Visual
    ...
    data_1,data_2,data_3,data_4,data_5,data_6
    data_1,data_2,data_3,data_4,data_5,data_6data_1,data_2,data_3,data_4,data_5,data_6
    data_1,data_2,data_3,data_4,data_5,data_6
    ...
Note: there are no header rows in any of the .csv files.
I've also tried using "\n" at the end of "expfile.write" - same result.
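One way to avoid hand-written line endings altogether is to let `csv.writer` terminate each row. The `csv` documentation recommends opening the file with `newline=""`; without it, text mode on Windows turns a written "\r\n" into "\r\r\n". The sketch below is offered under assumptions: the helper name `write_rows`, the temporary output path, and the sample rows are hypothetical, and it opens the file with "w" rather than "a" so that output left over from an earlier run cannot be mixed in. Whether it removes the joined-line symptom depends on the actual cause, which the post does not establish.

```python
import csv
import os
import tempfile


def write_rows(path, rows, delimiter=","):
    """Write rows to path, one CSV record per line (hypothetical helper)."""
    # newline="" stops Python from translating line endings a second time;
    # "w" starts from an empty file instead of appending to an old one.
    with open(path, "w", newline="") as expfile:
        writer = csv.writer(expfile, delimiter=delimiter)  # ends each row with "\r\n"
        writer.writerows(rows)


# Made-up rows standing in for the contents of list "l"
sample = [["a", "1", "2", "3", "4", "5"], ["b", "9", "9", "9", "9", "9"]]
out_path = os.path.join(tempfile.mkdtemp(), "masterfile.csv")
write_rows(out_path, sample)

# Read the file back without newline translation to inspect the raw endings
with open(out_path, newline="") as check:
    written = check.read()
```

If "\n" endings are preferred, `csv.writer` accepts a `lineterminator="\n"` argument.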
Just a little suggestion: comparing 2 files in this way looks expensive. Try using pandas in the following way.
    import pandas

    data1 = pandas.read_csv("file_1.csv")
    data2 = pandas.read_csv("file_2.csv")

    # merging the 2 dataframes
    # (DataFrame.append was removed in pandas 2.0; pandas.concat replaces it)
    combineddata = pandas.concat([data1, data2], ignore_index=True)

    # dropping duplicates
    # give the name of the column on which you are comparing uniqueness
    uniquedata = combineddata.drop_duplicates(["columnname"])