pytrnsys_process.process.process_sim.handle_duplicate_columns

pytrnsys_process.process.process_sim.handle_duplicate_columns(df: DataFrame) → DataFrame

Process duplicate columns in a DataFrame, ensuring they contain consistent data.

This function checks for duplicate column names and verifies that:

1. If one duplicate column has NaN values, the other(s) must also have NaN at the same indices.
2. All non-NaN values must be identical across duplicate columns.

Parameters: df (pandas.DataFrame) – Input DataFrame to process
Returns: df – DataFrame with duplicate columns removed, keeping only the first occurrence
Return type: pandas.DataFrame
Raises: ValueError – If duplicate columns have:

1. NaN values in one column while having actual values in another at the same index, or
2. Different non-NaN values at the same index
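The documented behaviour can be sketched as follows. This is only an illustrative re-implementation built from the description above, not the library's actual source: the name `handle_duplicate_columns_sketch`, the error messages, and the sample data are all assumptions made for the example.

```python
import pandas as pd


def handle_duplicate_columns_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in that mirrors the documented checks."""
    # Column names that occur more than once.
    duplicated_names = df.columns[df.columns.duplicated(keep="first")].unique()

    for name in duplicated_names:
        # One column per copy of the duplicated name.
        copies = df.loc[:, df.columns == name]
        first = copies.iloc[:, 0]

        for position in range(1, copies.shape[1]):
            other = copies.iloc[:, position]

            # Check 1: NaN must appear at the same indices in every copy.
            if not first.isna().equals(other.isna()):
                raise ValueError(
                    f"Column '{name}' has NaN values at different indices."
                )

            # Check 2: all non-NaN values must be identical.
            present = first.notna()
            if not (first[present] == other[present]).all():
                raise ValueError(
                    f"Column '{name}' has conflicting non-NaN values."
                )

    # Keep only the first occurrence of each column name.
    return df.loc[:, ~df.columns.duplicated(keep="first")]


nan = float("nan")

# Both "a" columns agree, including the position of the NaN.
consistent = pd.DataFrame(
    [[1.0, 10.0, 1.0], [nan, 20.0, nan], [3.0, 30.0, 3.0]],
    columns=["a", "b", "a"],
)
result = handle_duplicate_columns_sketch(consistent)
print(list(result.columns))  # ['a', 'b']

# The "a" columns disagree in the first row, so a ValueError is expected.
conflicting = pd.DataFrame([[1.0, 2.0], [3.0, 3.0]], columns=["a", "a"])
try:
    handle_duplicate_columns_sketch(conflicting)
except ValueError as error:
    print(error)
```

The sketch runs the NaN-position check before the value check, so a copy that is NaN where another holds a value is reported as a NaN mismatch rather than as a value conflict.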

Note

See the Stack Overflow discussion on removing duplicate columns in pandas: https://stackoverflow.com/questions/14984119/python-pandas-remove-duplicate-columns
