python - How to replace empty cells with 0 and change strings to integers where possible in a pandas dataframe?
I have a dataframe with 3000+ columns. Many cells in the dataframe are empty strings (' '). Also, a lot of the numerical values are stored as strings that should be integers. I wrote two functions to fill the empty cells with 0 and, where possible, change the values to integers, but when I run them nothing changes in my dataframe. These are the functions:
def recode_empty_cells(dataframe, list_of_columns):
    for column in list_of_columns:
        dataframe[column].replace(r'\s+', np.nan, regex=True)
        dataframe[column].fillna(0)
    return dataframe

def change_string_to_int(dataframe, list_of_columns):
    dataframe = recode_empty_cells(dataframe, list_of_columns)
    for column in list_of_columns:
        try:
            dataframe[column] = dataframe[column].astype(int)
        except ValueError:
            pass
    return dataframe
Note: I'm using a try/except statement because some columns contain text in some form. Thanks in advance for your help.
Edit:
Thanks, I got the first part working. All the empty cells have 0s now. This is my code at the moment:
def recode_empty_cells(dataframe, list_of_columns):
    for column in list_of_columns:
        dataframe[column] = dataframe[column].replace(r'\s+', 0, regex=True)
    return dataframe

def change_string_to_int(dataframe, list_of_columns):
    dataframe = recode_empty_cells(dataframe, list_of_columns)
    for column in list_of_columns:
        try:
            dataframe[column] = dataframe[column].astype(int)
        except ValueError:
            pass
    return dataframe
However, this gives me the following error: OverflowError: Python int too large to convert to C long
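This error generally means that at least one value is too large for the fixed-width integer type being targeted. One possible way around it, as a rough sketch only: parse each value with Python's arbitrary-precision int() first and convert the column only if everything fits. The helper name to_int64_if_possible, the choice of int64 as the target type, and the sample data are all assumptions made for illustration, not part of the original code.

```python
import pandas as pd

# Bounds of a signed 64-bit integer (the target type is an assumption).
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def to_int64_if_possible(series):
    """Hypothetical helper: return an int64 copy of `series` when every value
    parses as an integer that fits in int64, otherwise return it unchanged."""
    try:
        # Python's int() has arbitrary precision, so large digit strings parse fine.
        # Note that int() truncates floats such as 1.5.
        values = [int(v) for v in series]
    except (ValueError, TypeError, OverflowError):
        # Non-numeric text, missing values, or infinity: leave the column as it is.
        return series
    if all(INT64_MIN <= v <= INT64_MAX for v in values):
        return pd.Series(values, index=series.index, name=series.name, dtype="int64")
    # At least one value is too large for int64: leave the column as it is.
    return series


# Made-up sample data: one convertible column, one with a huge value, one with text.
df = pd.DataFrame(
    {
        "small": ["1", "2", 0],
        "huge": ["99999999999999999999999", "1", "2"],
        "text": ["abc", "1", "2"],
    }
)
for column in df.columns:
    df[column] = to_int64_if_possible(df[column])
```

With this approach the "small" column becomes int64, while "huge" and "text" are left untouched instead of raising. Whether leaving oversized values as strings is acceptable depends on what the data is used for afterwards.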
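You are not saving the change in your function. By default, replace() and fillna() return a new object instead of modifying the column in place, so the result has to be assigned back: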
def recode_empty_cells(dataframe, list_of_columns):
    for column in list_of_columns:
        # Assign each result back to the column so the change is kept.
        dataframe[column] = dataframe[column].replace(r'\s+', np.nan, regex=True)
        dataframe[column] = dataframe[column].fillna(0)
    return dataframe
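To see the difference on a small case, here is a minimal sketch with a made-up one-column dataframe. It assumes that only whitespace-only cells should count as empty, so it swaps in the anchored pattern r'^\s*$'. The unanchored r'\s+' may also match cells that merely contain a space somewhere in their text.

```python
import numpy as np
import pandas as pd

# Made-up sample data; dtype=object keeps the strings as plain Python objects.
df = pd.DataFrame({"a": ["1", " ", "3"]}, dtype=object)

# replace() returns a new Series; without assignment the column is untouched.
df["a"].replace(r"^\s*$", np.nan, regex=True)
unchanged = df["a"].tolist()

# Assigning the result back stores the change.
df["a"] = df["a"].replace(r"^\s*$", np.nan, regex=True).fillna(0)
changed = df["a"].tolist()

print(unchanged)
print(changed)
```

The first list still holds the blank cell, and the second holds 0 in its place.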