Python - How to add a column value from a .csv to another one when the other columns are the same?


I have two .csv files: file1 has 28 columns and 1000 rows, and file2 has 29 columns and 100 rows. There is no index column, so I do not know which rows are the same in both files.

For each row in file1, I want to add, as a new column, the value of the 29th column in file2 when the other 28 columns are the same.

file1:
a,b,c,...,empty
x,y,z,...,empty

file2:
a,b,c,...,b1
x,y,z,...,b2

output:
a,b,c,...,b1
x,y,z,...,b2

So far I am only at the beginning:

    import csv

    with open('file1.csv', newline='') as csv1:
        reader = csv.reader(csv1, delimiter=';')
        next(reader, None)  # ignore header
        # rows are lists, so convert them to tuples before putting them in a set
        test1 = set(tuple(row[0:28]) for row in reader)

    with open('file2.csv', newline='') as csv2:
        reader = csv.reader(csv2, delimiter=';')
        next(reader, None)  # ignore header
        test2 = set(tuple(row[0:28]) for row in reader)

I suggest using numpy.loadtxt to load both CSV files; it is more efficient and gives structure to the content (note that by default it expects numeric values). You then have an array f1 of 1000 x 28 values from file1 and an array f2 of 100 x 29 values from file2.
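As a minimal sketch of this loading step: the example below assumes semicolon-delimited files with one header row (as in the question's code) and purely numeric values. The in-memory sample is a hypothetical stand-in for file1.csv, shrunk to three columns.

```python
import io
import numpy

# Hypothetical stand-in for file1.csv: one header row, then numeric rows.
sample = io.StringIO("c1;c2;c3\n1;2;3\n4;5;6\n")

# delimiter=';' and skiprows=1 are assumptions taken from the question's code.
f1 = numpy.loadtxt(sample, delimiter=';', skiprows=1)

print(f1.shape)  # (rows, columns) of the loaded array
```

With real files, the StringIO object would be replaced by the file name, for example 'file1.csv'.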

The second step is to add a new column to f1 using f1 = numpy.column_stack([f1, numpy.zeros(f1.shape[0])]).

Then you can iterate over the smaller array with:
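To illustrate what this step does, here is a small sketch on a made-up 2 x 2 array rather than the real 1000 x 28 data.

```python
import numpy

# Made-up data standing in for the array loaded from file1.
f1 = numpy.array([[1.0, 2.0],
                  [3.0, 4.0]])

# Append one extra column of zeros as a placeholder for the values from file2.
f1 = numpy.column_stack([f1, numpy.zeros(f1.shape[0])])

print(f1.shape)
```

If 0 can be a legitimate value in the data, numpy.full(f1.shape[0], numpy.nan) could serve as the placeholder instead, so that unmatched rows remain distinguishable.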

    for row in f2:
        # find the row(s) in f1 with the same first 28 columns
        equal_rows = numpy.argwhere((f1[:, :28] == row[:28]).all(axis=1))
        for row2 in equal_rows:
            # copy the last column of f2 into the new column of f1
            f1[row2[0], -1] = row[-1]

I hope this helps.
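The whole procedure can be tried end to end on small made-up arrays. This is only a sketch: n_key is a hypothetical name for the number of key columns (28 in the question, 3 here), and NaN is used as the placeholder instead of zero so that unmatched rows are easy to spot.

```python
import numpy

n_key = 3  # hypothetical name; the question compares 28 columns

# Made-up stand-ins for the arrays loaded from file1 and file2.
f1 = numpy.array([[1., 2., 3.],
                  [4., 5., 6.],
                  [7., 8., 9.]])
f2 = numpy.array([[4., 5., 6., 60.],
                  [1., 2., 3., 10.]])

# Add the new column, filled with NaN until a match is found.
f1 = numpy.column_stack([f1, numpy.full(f1.shape[0], numpy.nan)])

for row in f2:
    # indices of the rows in f1 whose key columns all equal those of this f2 row
    equal_rows = numpy.argwhere((f1[:, :n_key] == row[:n_key]).all(axis=1))
    for row2 in equal_rows:
        f1[row2[0], -1] = row[-1]

print(f1)
```

The comparison is an exact floating-point equality test. For values that may differ by rounding, numpy.isclose could replace the == comparison.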

