python - Appending two DataFrames and sorting columns with the exception of the first two
I'd like to concatenate two DataFrames, each created from two lists:
    import pandas as pd
    import numpy as np

    header_1 = ['a', 'b', -1, 3, 5, 7]
    data_1 = ['x', 'y', 1, 2, 3, 4]
    d = pd.DataFrame(np.array([data_1]), columns=header_1)

    header_2 = ['a', 'b', -2, 4, 5, 6]
    data_2 = ['x', 'z', 1, 2, 3, 4]
    e = pd.DataFrame(np.array([data_2]), columns=header_2)

    f = pd.concat([d, e])

    > f
       a  b   -1    3  5    7   -2    4    6
    0  x  y    1    2  3    4  NaN  NaN  NaN
    0  x  z  NaN  NaN  3  NaN    1    2    4
However, I want the numerical columns to appear in sorted order, and I was wondering if there is an easier way than splitting off the first two columns, sorting the remaining DataFrame, and concatenating the two again:
    ab_cols = f[['a', 'b']]                # copy of the first two columns
    g = f.drop(['a', 'b'], axis=1)         # remove those columns from the DataFrame
    h = g.sort_index(axis=1)               # sort the remaining columns by header
    i = pd.concat([ab_cols, h], axis=1)    # put the two parts back together again

    > i
       a  b   -2   -1    3    4  5    6    7
    0  x  y  NaN    1    2  NaN  3  NaN    4
    0  x  z    1  NaN  NaN    2  3    4  NaN
I've thought about multi-indices, but I'm already using the index for something else (the source of each data row, not shown here), and I'm afraid a three-level multi-index might make slicing the DataFrame later more complicated.
Steps:
1. Make a Series representation of the columns, with both the index and the values equal to the column labels.
2. Use pd.to_numeric with errors='coerce' to parse the numeric labels, turning the string labels into NaNs.
3. Sort these values, pushing the NaNs (which were the string labels before) to the top.
4. Take the corresponding index and rearrange the DataFrame based on these newly ordered column labels.
    c = pd.to_numeric(f.columns.to_series(), errors='coerce').sort_values(na_position='first')
    f[c.index]
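For anyone who wants to see what each step produces, here is the same one-liner unrolled into a self-contained sketch. The intermediate names (`labels`, `numeric`, `ordered`, `result`) are my own and not part of the answer. I also pass `sort=False` to `pd.concat` and `kind='mergesort'` to `sort_values` to make the behaviour explicit; both are assumptions on my part rather than something the answer requires.

```python
import numpy as np
import pandas as pd

# Rebuild the two single-row frames from the question.
d = pd.DataFrame(np.array([['x', 'y', 1, 2, 3, 4]]), columns=['a', 'b', -1, 3, 5, 7])
e = pd.DataFrame(np.array([['x', 'z', 1, 2, 3, 4]]), columns=['a', 'b', -2, 4, 5, 6])
f = pd.concat([d, e], sort=False)  # keep columns in order of appearance

# Step 1: a Series whose index and values are both the column labels.
labels = f.columns.to_series()

# Step 2: numeric labels are parsed; the string labels 'a' and 'b' become NaN.
numeric = pd.to_numeric(labels, errors='coerce')

# Step 3: sort ascending with the NaN entries placed first.
# mergesort is a stable sort (assumption: chosen here only for explicitness).
ordered = numeric.sort_values(na_position='first', kind='mergesort')

# Step 4: the index of the sorted Series still holds the original labels,
# so it can be used directly to reorder the columns.
result = f[ordered.index]

print(list(result.columns))
```

Note that this puts every non-numeric column first, not specifically the first two, so it matches the question only as long as `a` and `b` are the only string-labelled columns.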