python - Appending two DataFrames and sorting columns with exception of first two


I'd like to concatenate two DataFrames, each created from two lists:

    import pandas as pd
    import numpy as np

    header_1 = ['a', 'b', -1, 3, 5, 7]
    data_1 = ['x', 'y', 1, 2, 3, 4]
    d = pd.DataFrame(np.array([data_1]), columns=header_1)

    header_2 = ['a', 'b', -2, 4, 5, 6]
    data_2 = ['x', 'z', 1, 2, 3, 4]
    e = pd.DataFrame(np.array([data_2]), columns=header_2)

    f = pd.concat([d, e])

    > f
       a  b   -1    3  5    7   -2    4    6
    0  x  y    1    2  3    4  NaN  NaN  NaN
    0  x  z  NaN  NaN  3  NaN    1    2    4

However, I want the numerical columns to appear in sorted order, and I was wondering if there is an easier way than splitting off the first two columns, sorting the remaining DataFrame, and concatenating the two again:

    ab_cols = f[['a', 'b']]               # copy of the first two columns
    g = f.drop(['a', 'b'], axis=1)        # remove those columns from the DataFrame
    h = g.sort_index(axis=1)              # sort the remaining columns by header
    i = pd.concat([ab_cols, h], axis=1)   # put the two pieces back together

    > i
       a  b   -2   -1    3    4  5    6    7
    0  x  y  NaN    1    2  NaN  3  NaN    4
    0  x  z    1  NaN  NaN    2  3    4  NaN
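For comparison, the same split, sort and rejoin idea can be sketched as a single column selection. This is only an illustration: the names `fixed`, `rest` and `result` are made up here, the frames are built from plain lists rather than `np.array`, and it assumes the leading columns are known by name.

```python
import pandas as pd

# Rebuild the concatenated frame from the question (plain lists, no NumPy).
d = pd.DataFrame([['x', 'y', 1, 2, 3, 4]], columns=['a', 'b', -1, 3, 5, 7])
e = pd.DataFrame([['x', 'z', 1, 2, 3, 4]], columns=['a', 'b', -2, 4, 5, 6])
f = pd.concat([d, e])

fixed = ['a', 'b']                     # columns that keep their leading position
rest = sorted(f.columns.drop(fixed))   # remaining labels, sorted numerically
result = f[fixed + rest]               # one selection instead of drop + concat

print(list(result.columns))
```

This only reorders columns, so the row index is left untouched.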

I've thought about multi-indices, but I'm already using the index for something else (the source of each data row, not shown here), and I'm afraid a three-level multi-index might make slicing the DataFrame later more complicated.

Steps:

Make a Series representation of the columns, with both the index and the values equal to the column labels.

Using pd.to_numeric with errors='coerce', parse the numeric values and turn the string values into NaNs.

Sort these values, pushing the NaNs (which were the string values before) to the top.

Take the corresponding index and rearrange the DataFrame based on these newly ordered column labels.

    c = pd.to_numeric(f.columns.to_series(), errors='coerce').sort_values(na_position='first')
    f[c.index]
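To see what each step produces, the one-liner can be unpacked as in the sketch below. The intermediate names (`labels`, `numeric`, `ordered`, `sorted_f`) are made up for illustration, and the frames are rebuilt from plain lists rather than `np.array`, so the cell values are not converted to strings.

```python
import pandas as pd

# Rebuild the concatenated frame from the question (plain lists, no NumPy).
d = pd.DataFrame([['x', 'y', 1, 2, 3, 4]], columns=['a', 'b', -1, 3, 5, 7])
e = pd.DataFrame([['x', 'z', 1, 2, 3, 4]], columns=['a', 'b', -2, 4, 5, 6])
f = pd.concat([d, e])

# Step 1: a Series whose index and values are both the column labels.
labels = f.columns.to_series()

# Step 2: numeric labels are parsed as numbers; 'a' and 'b' become NaN.
numeric = pd.to_numeric(labels, errors='coerce')

# Step 3: sort the values, placing the NaN entries (the former strings) first.
ordered = numeric.sort_values(na_position='first')

# Step 4: the index of the sorted Series still holds the original labels,
# so it can be used directly to reorder the columns.
sorted_f = f[ordered.index]

print(list(sorted_f.columns))
```

Sorting puts the non-numeric labels in front but says nothing about their order relative to each other, so if that order matters it is probably safer to list those columns explicitly.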
