python - Creating a matrix from a CSV file with numpy


I'm relatively new to Python.

I have a CSV file with contents like this:

    dsa dds fsdf dasdsa
    1 1 32.2 9 4
    1 2 53.2 8 2
    1 3 44.2 0 1
    1 4 12.3 3 2
    1 5 15.6 4 3
    2 1 12.3 3 2
    2 2 91.3 4 11
    2 3 32.3 5 33
    2 4 44.2 3 2
    2 5 55.2 4 1
    3 1 60.2 4 2
    3 2 80.2 1 15
    3 3 10.2 4 1
    3 4 99.2 8 3
    3 5 13.1 10 2
    4 1 32.3 19 2
    4 2 10.3 12 3
    4 3 52.3 22 4
    . . . . .
    . . . . .

I want the output to look like this:

          1     2     3     4    .  .  .
    1   32.2  53.2  44.2  12.3   .  .
    2   12.3  91.3  32.3  44.2   .  .
    3   60.2  80.2  10.2  99.2   .  .
    4   32.3  10.3  52.3    .    .  .
    .     .     .     .     .    .  .
    .     .     .     .     .    .  .

As you can see, I'm only using the first 3 columns of the CSV file, and I've skipped the first row (rubbish data).

I'd like to use numpy for this, and I thought this code would do the trick:

    from scipy.sparse import coo_matrix
    import numpy as np

    # first three columns: row index, column index, value
    l, c, v = np.loadtxt('test.csv', skiprows=1, delimiter=',').T[:3, :]
    m = coo_matrix((v, (l-1, c-1)), shape=(l.max(), c.max()))
    print(m.toarray())

This works, but the first 2 columns in my CSV file are excluded from the output. The result turns out to be:

[32.2  53.2  44.2  12.3  12.3  91.3  32.3  44.2  60.2  80.2  10.2  99.2  32.3  10.3  52.3    .] 

Any thoughts on how I can generate the matrix I need (the output shown above)? My CSV file is huge (it's got around 10k rows and columns), and I only need to use the first 3 columns.

Thanks heaps!

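    import pandas as pd

    # The header has one name fewer than the data has columns, so pandas
    # uses the first column as the index and 'dds' holds the values.
    data = pd.read_csv('data.txt', sep=r'\s+')
    # Assumes sorted rows with exactly 5 values per row index.
    data2 = data['dds'].to_numpy().reshape(len(data['dds']) // 5, 5)
    df = pd.DataFrame(data2, columns=range(1, 6), index=range(1, data2.shape[0] + 1))
    print(df)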
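The reshape above relies on the rows being sorted and on every group having exactly 5 entries. If that can't be guaranteed, a pivot on the first three columns may be more robust. Here's a minimal sketch, assuming whitespace-separated data; the sample text and the column names row, col and value are made up for illustration.

```python
import io
import pandas as pd

# Hypothetical sample standing in for the first rows of the file
# (whitespace-separated, one junk header line).
sample = """dsa dds fsdf dasdsa
1 1 32.2 9 4
1 2 53.2 8 2
1 3 44.2 0 1
2 1 12.3 3 2
2 2 91.3 4 11
2 3 32.3 5 33
"""

# Skip the junk header and read only the first three columns.
data = pd.read_csv(
    io.StringIO(sample),
    sep=r"\s+",
    skiprows=1,
    header=None,
    usecols=[0, 1, 2],
)
data.columns = ["row", "col", "value"]  # illustrative names, not from the file

# Each (row, col) pair becomes one cell of the matrix.
matrix = data.pivot(index="row", columns="col", values="value")
print(matrix)
```

Any (row, col) pair missing from the file should simply show up as NaN in the result, rather than shifting the other values around.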

Update:

Without the 'rubbish data' row:

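    import pandas as pd

    names_ = list(range(1, 6))
    # No header row in the file, so name the 5 columns 1..5 ourselves.
    data = pd.read_csv('data.txt', sep=r'\s+', names=names_)
    # Column 3 holds the values; assumes exactly 5 values per row index.
    data2 = data[3].to_numpy().reshape(len(data[3]) // 5, 5)
    df = pd.DataFrame(data2, columns=names_, index=range(1, data2.shape[0] + 1))
    print(df)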
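Since the question asked for numpy, here's a rough sketch that reads only the first three columns and drops each value into a dense array by its row/column index. It assumes a comma-separated file (as in the question's delimiter=',') and 1-based integer indices in the first two columns. The sample data is made up, and cells with no entry are left as NaN.

```python
import io
import numpy as np

# Hypothetical sample: junk header line, then row, col, value, plus two unused columns.
sample = io.StringIO(
    "dsa,dds,fsdf,dasdsa\n"
    "1,1,32.2,9,4\n"
    "1,2,53.2,8,2\n"
    "1,3,44.2,0,1\n"
    "2,1,12.3,3,2\n"
    "2,2,91.3,4,11\n"
)

# usecols keeps the remaining columns from being parsed at all.
rows, cols, vals = np.loadtxt(
    sample, delimiter=",", skiprows=1, usecols=(0, 1, 2), unpack=True
)

# loadtxt returns floats; array indices must be integers.
rows = rows.astype(int)
cols = cols.astype(int)

# Start from an all-NaN matrix and fill in the cells that exist in the file.
matrix = np.full((rows.max(), cols.max()), np.nan)
matrix[rows - 1, cols - 1] = vals
print(matrix)
```

For a real file, the StringIO object would be replaced by the file name. If the same (row, col) pair appears more than once, the last value wins with this kind of assignment.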
