pytrnsys_process.process package
Submodules
pytrnsys_process.process.data_structures module
- class pytrnsys_process.process.data_structures.Simulation(path: str, monthly: DataFrame, hourly: DataFrame, step: DataFrame, scalar: DataFrame)
Bases:
object
Class representing a TRNSYS simulation with its associated data.
This class holds the simulation data organized in different time resolutions (monthly, hourly, timestep) along with the path to the simulation files.
- monthly
Monthly aggregated simulation data. Each column represents a different variable and each row represents a month.
- Type:
pandas.DataFrame
- hourly
Hourly simulation data. Each column represents a different variable and each row represents an hour.
- Type:
pandas.DataFrame
- step
Simulation data at the smallest timestep resolution. Each column represents a different variable and each row represents a timestep.
- Type:
pandas.DataFrame
- monthly: DataFrame
- hourly: DataFrame
- step: DataFrame
- scalar: DataFrame
- class pytrnsys_process.process.data_structures.ProcessingResults(processed_count: int = 0, error_count: int = 0, failed_simulations: typing.List[str] = <factory>, failed_scenarios: dict[str, typing.List[str]] = <factory>)
Bases:
object
Results from processing one or more simulations.
- failed_scenarios
Dictionary mapping simulation names to lists of failed scenario names
- simulations
Dictionary mapping simulation names to processed Simulation objects
Example
>>> results = ProcessingResults()
>>> results.processed_count = 5
>>> results.error_count = 1
>>> results.failed_simulations = ['sim_001']
>>> results.failed_scenarios = {'sim_002': ['scenario_1']}
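The `<factory>` defaults in the signature above suggest that every instance receives its own fresh list and dictionary. The following is a minimal stand-in sketch, assuming the class behaves like a standard dataclass (the signature hints at this, but the source does not confirm it). `ProcessingResultsSketch` is a hypothetical name used only for illustration and is not the library class.

```python
import dataclasses as dc


@dc.dataclass
class ProcessingResultsSketch:
    """Hypothetical stand-in mirroring the documented fields."""

    processed_count: int = 0
    error_count: int = 0
    # default_factory gives each instance its own list and dict,
    # so failures recorded on one result object never leak into another.
    failed_simulations: list[str] = dc.field(default_factory=list)
    failed_scenarios: dict[str, list[str]] = dc.field(default_factory=dict)


first = ProcessingResultsSketch()
second = ProcessingResultsSketch()

# Record one failed simulation and one failed scenario on the first object only.
first.failed_simulations.append("sim_001")
first.failed_scenarios.setdefault("sim_002", []).append("scenario_1")
first.error_count += 1
```

Because the containers are created per instance, `second` stays empty after `first` is modified.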
- class pytrnsys_process.process.data_structures.SimulationsData(simulations: dict[str, pytrnsys_process.process.data_structures.Simulation] = <factory>, scalar: pandas.core.frame.DataFrame = <factory>, path_to_simulations: str = <factory>)
Bases:
object
Class representing a result set.
Used to produce comparison plots across different simulations.
- simulations
Processed simulations, accessible using the simulation names as keys. Example: simulations['sim_001']
- Type:
dict of {str, Simulation}
- scalar
Contains the deck constant values from all simulations. This is also the place to store your calculations for plotting.
- Type:
pandas.DataFrame
- simulations: dict[str, Simulation]
- scalar: DataFrame
pytrnsys_process.process.file_type_detector module
- pytrnsys_process.process.file_type_detector.get_file_type_using_file_content(file_path: pathlib.Path, logger: logging.Logger = <Logger default_pytrnsys_process (WARNING)>) -> FileType
Determine the file type by analyzing its content.
- Parameters:
file_path (pathlib.Path) – Path to the file to analyze
- Returns:
FileType – The detected file type (MONTHLY, HOURLY, or TIMESTEP)
- Return type:
pytrnsys_process.constants.FileType
- Raises:
ValueError – If the file type cannot be determined from the content
- pytrnsys_process.process.file_type_detector.get_file_type_using_file_name(file: pathlib.Path, logger: logging.Logger = <Logger default_pytrnsys_process (WARNING)>) -> FileType
Determine the file type by checking the filename against known patterns.
- Parameters:
file (pathlib.Path) – The path to the file to check
- Returns:
FileType – The detected file type (MONTHLY, HOURLY, TIMESTEP or DECK)
- Return type:
pytrnsys_process.constants.FileType
- Raises:
ValueError – If no matching pattern is found
- pytrnsys_process.process.file_type_detector.has_pattern(file: Path, file_type: FileType) -> bool
Check if a filename contains any of the patterns associated with a specific FileType.
- Parameters:
file (pathlib.Path) – The path to the file to check
file_type (pytrnsys_process.constants.FileType) – The FileType enum containing patterns to match against
- Returns:
bool – True if the filename contains any of the patterns, False otherwise
- Return type:
bool
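Filename-based detection as described above can be sketched roughly as follows. This is only an illustration: `FileTypeSketch`, `has_pattern_sketch`, `file_type_from_name_sketch` and every pattern string are hypothetical, and the real patterns live in pytrnsys_process.constants.FileType and may differ.

```python
import enum
import pathlib


class FileTypeSketch(enum.Enum):
    """Hypothetical stand-in enum; each value is a tuple of invented patterns."""

    MONTHLY = ("_mo_", "_mo.")
    HOURLY = ("_hr_", "_hr.")
    TIMESTEP = ("_step",)
    DECK = (".dck",)


def has_pattern_sketch(file: pathlib.Path, file_type: FileTypeSketch) -> bool:
    """Return True if the filename contains any pattern of the given type."""
    name = file.name.lower()
    return any(pattern in name for pattern in file_type.value)


def file_type_from_name_sketch(file: pathlib.Path) -> FileTypeSketch:
    """Return the first file type whose patterns match, else raise ValueError."""
    for file_type in FileTypeSketch:
        if has_pattern_sketch(file, file_type):
            return file_type
    raise ValueError(f"No matching pattern found for {file.name}")


monthly = file_type_from_name_sketch(pathlib.Path("temp/energy_mo.prt"))
deck = file_type_from_name_sketch(pathlib.Path("sim-1/sim-1.dck"))
```

Only the filename is inspected, not the parent folders, so a folder name that happens to contain a pattern does not affect the result.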
pytrnsys_process.process.process_batch module
- exception pytrnsys_process.process.process_batch.UnableToProcessSimulationError
Bases:
Exception
Raised when a simulation cannot be processed.
- pytrnsys_process.process.process_batch.process_single_simulation(sim_folder: Path, processing_scenario: Callable[[Simulation], None] | Sequence[Callable[[Simulation], None]]) -> Simulation
Process a single simulation folder using the provided processing step/scenario.
- Parameters:
sim_folder (pathlib.Path) – Path to the simulation folder to process
processing_scenario (collections.abc.Callable or collections.abc.Sequence of collections.abc.Callable) – One callable or a sequence of callables containing the processing logic for a simulation. Each callable should take a Simulation object as its only parameter and modify it in place.
- Returns:
Simulation
- Return type:
pytrnsys_process.process.data_structures.Simulation
Example
>>> import pathlib as _pl
>>> from pytrnsys_process import api
...
>>> def processing_step_1(sim: api.Simulation):
...     # Process simulation data
...     pass
>>> results = api.process_single_simulation(
...     _pl.Path("path/to/simulation"),
...     processing_step_1
... )
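The `processing_scenario` parameter accepts either one callable or a sequence of callables, each of which modifies the simulation in place. One possible way to handle both forms is sketched below. It is not taken from the library: `as_steps`, `apply_scenario` and the two example steps are hypothetical, and a plain dict stands in for the Simulation object.

```python
from collections.abc import Callable, Sequence
from typing import Union

# A plain dict stands in for the Simulation object in this sketch.
Step = Callable[[dict], None]
Scenario = Union[Step, Sequence[Step]]


def as_steps(scenario: Scenario) -> list:
    """Normalize a single callable or a sequence of callables to a list."""
    if callable(scenario):
        return [scenario]
    return list(scenario)


def apply_scenario(sim: dict, scenario: Scenario) -> dict:
    """Run every step on the same object; steps mutate it and return None."""
    for step in as_steps(scenario):
        step(sim)
    return sim


def add_total(sim: dict) -> None:
    """Hypothetical step: store the sum of the hourly values."""
    sim["total"] = sum(sim["hourly"])


def add_peak(sim: dict) -> None:
    """Hypothetical step: store the largest hourly value."""
    sim["peak"] = max(sim["hourly"])


single = apply_scenario({"hourly": [1.0, 3.0, 2.0]}, add_total)
both = apply_scenario({"hourly": [1.0, 3.0, 2.0]}, [add_total, add_peak])
```

Because each step returns None and works by mutation, later steps in a sequence can rely on values that earlier steps stored.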
- pytrnsys_process.process.process_batch.process_whole_result_set(results_folder: Path, processing_scenario: Callable[[Simulation], None] | Sequence[Callable[[Simulation], None]]) -> SimulationsData
Process all simulation folders in a results directory sequentially.
Processes each simulation folder found in the results directory one at a time, applying the provided processing step/scenario to each simulation.
Using the default settings your structure should look like this:
results_folder
    ├─ sim-1
    ├─ sim-2
    ├─ sim-3
        ├─ temp
            ├─ your-printer-files.prt
- Parameters:
results_folder (pathlib.Path) – Path to the directory containing simulation folders. Each subfolder should contain a temp folder containing valid simulation data files.
processing_scenario (collections.abc.Callable or collections.abc.Sequence of collections.abc.Callable) – One callable or a sequence of callables containing the processing logic for a simulation. Each callable should take a Simulation object as its only parameter and modify it in place.
- Returns:
SimulationsData –
monthly: Dict mapping simulation names to monthly DataFrame results
hourly: Dict mapping simulation names to hourly DataFrame results
scalar: DataFrame containing scalar/deck values from all simulations
- Return type:
pytrnsys_process.process.data_structures.SimulationsData
- Raises:
ValueError – If results_folder doesn’t exist or is not a directory
Exception – Individual simulation failures are logged but not re-raised
Example
>>> import pathlib as _pl
>>> from pytrnsys_process import api
...
>>> def processing_step_1(sim):
...     # Process simulation data
...     pass
>>> def processing_step_2(sim):
...     # Process simulation data
...     pass
>>> results = api.process_whole_result_set(
...     _pl.Path("path/to/results"),
...     [processing_step_1, processing_step_2]
... )
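The error behaviour described above (a ValueError for an invalid results folder, while failures of individual simulations are logged and skipped) can be illustrated with a small standard-library sketch. It does not reproduce the library's implementation. `process_folders_sketch`, `require_temp_folder` and the logger name are hypothetical, and the step receives a folder path instead of a Simulation object to keep the example self-contained.

```python
import logging
import pathlib
import tempfile

logger = logging.getLogger("result_set_sketch")


def require_temp_folder(sim_folder: pathlib.Path) -> None:
    """Hypothetical processing step: fail if the temp folder is missing."""
    if not (sim_folder / "temp").is_dir():
        raise FileNotFoundError(f"{sim_folder.name} has no temp folder")


def process_folders_sketch(results_folder: pathlib.Path, step):
    """Apply step to each subfolder in name order; collect failures."""
    if not results_folder.is_dir():
        raise ValueError(f"{results_folder} does not exist or is not a directory")
    processed, failed = [], []
    for sim_folder in sorted(p for p in results_folder.iterdir() if p.is_dir()):
        try:
            step(sim_folder)
            processed.append(sim_folder.name)
        except Exception:
            # Log the failure with its traceback and continue with the next folder.
            logger.exception("Could not process %s", sim_folder.name)
            failed.append(sim_folder.name)
    return processed, failed


with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    for name in ("sim-1", "sim-3"):
        (root / name / "temp").mkdir(parents=True)
    (root / "sim-2").mkdir()  # deliberately incomplete: no temp folder
    processed, failed = process_folders_sketch(root, require_temp_folder)
```

Catching the exception inside the loop is what allows one broken simulation folder to be reported without aborting the rest of the batch.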
- pytrnsys_process.process.process_batch.process_whole_result_set_parallel(results_folder: Path, processing_scenario: Callable[[Simulation], None] | Sequence[Callable[[Simulation], None]], max_workers: int | None = None) -> SimulationsData
Process all simulation folders in a results directory in parallel.
Uses a ProcessPoolExecutor to process multiple simulations concurrently, applying the provided processing step/scenario to each simulation.
Using the default settings your structure should look like this:
results_folder
    ├─ sim-1
    ├─ sim-2
    ├─ sim-3
        ├─ temp
            ├─ your-printer-files.prt
- Parameters:
results_folder (pathlib.Path) – Path to the directory containing simulation folders. Each subfolder should contain a temp folder containing valid simulation data files.
processing_scenario (collections.abc.Callable or collections.abc.Sequence of collections.abc.Callable) – One callable or a sequence of callables containing the processing logic for a simulation. Each callable should take a Simulation object as its only parameter and modify it in place.
max_workers (int or None, default None) – Maximum number of worker processes to use. If None, defaults to the number of processors on the machine.
- Returns:
SimulationsData –
monthly: Dict mapping simulation names to monthly DataFrame results
hourly: Dict mapping simulation names to hourly DataFrame results
scalar: DataFrame containing scalar/deck values from all simulations
- Return type:
pytrnsys_process.process.data_structures.SimulationsData
- Raises:
ValueError – If results_folder doesn’t exist or is not a directory
Exception – Individual simulation failures are logged but not re-raised
Example
>>> import pathlib as _pl
>>> from pytrnsys_process import api
...
>>> def processing_step_1(sim):
...     # Process simulation data
...     pass
>>> def processing_step_2(sim):
...     # Process simulation data
...     pass
>>> results = api.process_whole_result_set_parallel(
...     _pl.Path("path/to/results"),
...     [processing_step_1, processing_step_2]
... )
- pytrnsys_process.process.process_batch.do_comparison(comparison_scenario: Callable[[SimulationsData], None] | Sequence[Callable[[SimulationsData], None]], simulations_data: SimulationsData | None = None, results_folder: Path | None = None) -> SimulationsData
Execute comparison scenarios on processed simulation results.
- Parameters:
comparison_scenario (collections.abc.Callable or collections.abc.Sequence of collections.abc.Callable) – One callable or a sequence of callables containing the comparison logic. Each callable should take a SimulationsData object as its only parameter and modify it in place.
simulations_data (SimulationsData, optional) – SimulationsData object containing the processed simulations data to be compared.
results_folder (pathlib.Path, optional) – Path to the directory containing simulation results. Used if simulations_data is not provided.
- Returns:
SimulationsData
- Return type:
pytrnsys_process.process.data_structures.SimulationsData
Example
>>> from pytrnsys_process import api
>>> from pytrnsys_process.process import data_structures as ds
...
>>> def comparison_step(simulations_data: ds.SimulationsData):
...     # Compare simulation results
...     pass
...
>>> # processed_results: a SimulationsData object, e.g. as returned by api.process_whole_result_set
>>> api.do_comparison(comparison_step, simulations_data=processed_results)
pytrnsys_process.process.process_sim module
- pytrnsys_process.process.process_sim.process_sim(sim_files: Sequence[Path], sim_folder: Path) -> Simulation
- pytrnsys_process.process.process_sim.handle_duplicate_columns(df: DataFrame) -> DataFrame
Process duplicate columns in a DataFrame, ensuring they contain consistent data.
This function checks for duplicate column names and verifies that:
1. If one duplicate column has NaN values, the other(s) must also have NaN at the same indices
2. All non-NaN values must be identical across duplicate columns
- Parameters:
df (pandas.DataFrame) – Input DataFrame to process
- Returns:
df – DataFrame with duplicate columns removed, keeping only the first occurrence
- Return type:
pandas.DataFrame
- Raises:
ValueError – If duplicate columns have:
1. NaN values in one column while having actual values in another at the same index, or
2. Different non-NaN values at the same index
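The two consistency rules can be illustrated without pandas. The sketch below is an assumption-laden stand-in, not the library function: it represents a table as a list of (name, values) pairs so that column names may repeat, and `drop_consistent_duplicates` is a hypothetical name. The documented function operates on a pandas DataFrame instead.

```python
import math


def _is_nan(value) -> bool:
    """True only for float NaN values."""
    return isinstance(value, float) and math.isnan(value)


def drop_consistent_duplicates(columns):
    """Keep the first occurrence of each column name.

    columns is a list of (name, values) pairs; names may repeat.
    Raises ValueError if a repeated column disagrees with the first one.
    """
    kept = {}
    for name, values in columns:
        if name not in kept:
            kept[name] = list(values)
            continue
        for index, (first, other) in enumerate(zip(kept[name], values)):
            # Rule 1: NaN must appear at the same indices in all duplicates.
            if _is_nan(first) != _is_nan(other):
                raise ValueError(f"Column '{name}': NaN mismatch at index {index}")
            # Rule 2: non-NaN values must be identical.
            if not _is_nan(first) and first != other:
                raise ValueError(f"Column '{name}': differing values at index {index}")
    return list(kept.items())


table = [
    ("T_amb", [1.0, math.nan, 3.0]),
    ("Q_heat", [5.0, 6.0, 7.0]),
    ("T_amb", [1.0, math.nan, 3.0]),  # consistent duplicate, will be dropped
]
cleaned = drop_consistent_duplicates(table)
```

NaN needs its own check because NaN never compares equal to itself, so a plain equality test would reject duplicates that are in fact consistent.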
Module contents
- class pytrnsys_process.process.Simulation(path: str, monthly: DataFrame, hourly: DataFrame, step: DataFrame, scalar: DataFrame)
Bases:
object
Class representing a TRNSYS simulation with its associated data.
This class holds the simulation data organized in different time resolutions (monthly, hourly, timestep) along with the path to the simulation files.
- monthly
Monthly aggregated simulation data. Each column represents a different variable and each row represents a month.
- Type:
pandas.DataFrame
- hourly
Hourly simulation data. Each column represents a different variable and each row represents an hour.
- Type:
pandas.DataFrame
- step
Simulation data at the smallest timestep resolution. Each column represents a different variable and each row represents a timestep.
- Type:
pandas.DataFrame
- __init__(path: str, monthly: DataFrame, hourly: DataFrame, step: DataFrame, scalar: DataFrame) -> None
- monthly: DataFrame
- hourly: DataFrame
- step: DataFrame
- scalar: DataFrame
- class pytrnsys_process.process.SimulationsData(simulations: dict[str, pytrnsys_process.process.data_structures.Simulation] = <factory>, scalar: pandas.core.frame.DataFrame = <factory>, path_to_simulations: str = <factory>)
Bases:
object
Class representing a result set.
Used to produce comparison plots across different simulations.
- simulations
Processed simulations, accessible using the simulation names as keys. Example: simulations['sim_001']
- Type:
dict of {str, Simulation}
- scalar
Contains the deck constant values from all simulations. This is also the place to store your calculations for plotting.
- Type:
pandas.DataFrame
- __init__(simulations: dict[str, pytrnsys_process.process.data_structures.Simulation] = <factory>, scalar: pandas.core.frame.DataFrame = <factory>, path_to_simulations: str = <factory>) -> None
- simulations: dict[str, Simulation]
- scalar: DataFrame
- pytrnsys_process.process.get_file_type_using_file_content(file_path: pathlib.Path, logger: logging.Logger = <Logger default_pytrnsys_process (WARNING)>) -> FileType
Determine the file type by analyzing its content.
- Parameters:
file_path (pathlib.Path) – Path to the file to analyze
- Returns:
FileType – The detected file type (MONTHLY, HOURLY, or TIMESTEP)
- Return type:
pytrnsys_process.constants.FileType
- Raises:
ValueError – If the file type cannot be determined from the content
- pytrnsys_process.process.get_file_type_using_file_name(file: pathlib.Path, logger: logging.Logger = <Logger default_pytrnsys_process (WARNING)>) -> FileType
Determine the file type by checking the filename against known patterns.
- Parameters:
file (pathlib.Path) – The path to the file to check
- Returns:
FileType – The detected file type (MONTHLY, HOURLY, TIMESTEP or DECK)
- Return type:
pytrnsys_process.constants.FileType
- Raises:
ValueError – If no matching pattern is found
- pytrnsys_process.process.has_pattern(file: Path, file_type: FileType) -> bool
Check if a filename contains any of the patterns associated with a specific FileType.
- Parameters:
file (pathlib.Path) – The path to the file to check
file_type (pytrnsys_process.constants.FileType) – The FileType enum containing patterns to match against
- Returns:
bool – True if the filename contains any of the patterns, False otherwise
- Return type:
bool
- pytrnsys_process.process.process_single_simulation(sim_folder: Path, processing_scenario: Callable[[Simulation], None] | Sequence[Callable[[Simulation], None]]) -> Simulation
Process a single simulation folder using the provided processing step/scenario.
- Parameters:
sim_folder (pathlib.Path) – Path to the simulation folder to process
processing_scenario (collections.abc.Callable or collections.abc.Sequence of collections.abc.Callable) – One callable or a sequence of callables containing the processing logic for a simulation. Each callable should take a Simulation object as its only parameter and modify it in place.
- Returns:
Simulation
- Return type:
pytrnsys_process.process.data_structures.Simulation
Example
>>> import pathlib as _pl
>>> from pytrnsys_process import api
...
>>> def processing_step_1(sim: api.Simulation):
...     # Process simulation data
...     pass
>>> results = api.process_single_simulation(
...     _pl.Path("path/to/simulation"),
...     processing_step_1
... )
- pytrnsys_process.process.process_whole_result_set(results_folder: Path, processing_scenario: Callable[[Simulation], None] | Sequence[Callable[[Simulation], None]]) -> SimulationsData
Process all simulation folders in a results directory sequentially.
Processes each simulation folder found in the results directory one at a time, applying the provided processing step/scenario to each simulation.
Using the default settings your structure should look like this:
results_folder
    ├─ sim-1
    ├─ sim-2
    ├─ sim-3
        ├─ temp
            ├─ your-printer-files.prt
- Parameters:
results_folder (pathlib.Path) – Path to the directory containing simulation folders. Each subfolder should contain a temp folder containing valid simulation data files.
processing_scenario (collections.abc.Callable or collections.abc.Sequence of collections.abc.Callable) – One callable or a sequence of callables containing the processing logic for a simulation. Each callable should take a Simulation object as its only parameter and modify it in place.
- Returns:
SimulationsData –
monthly: Dict mapping simulation names to monthly DataFrame results
hourly: Dict mapping simulation names to hourly DataFrame results
scalar: DataFrame containing scalar/deck values from all simulations
- Return type:
pytrnsys_process.process.data_structures.SimulationsData
- Raises:
ValueError – If results_folder doesn’t exist or is not a directory
Exception – Individual simulation failures are logged but not re-raised
Example
>>> import pathlib as _pl
>>> from pytrnsys_process import api
...
>>> def processing_step_1(sim):
...     # Process simulation data
...     pass
>>> def processing_step_2(sim):
...     # Process simulation data
...     pass
>>> results = api.process_whole_result_set(
...     _pl.Path("path/to/results"),
...     [processing_step_1, processing_step_2]
... )
- pytrnsys_process.process.process_whole_result_set_parallel(results_folder: Path, processing_scenario: Callable[[Simulation], None] | Sequence[Callable[[Simulation], None]], max_workers: int | None = None) -> SimulationsData
Process all simulation folders in a results directory in parallel.
Uses a ProcessPoolExecutor to process multiple simulations concurrently, applying the provided processing step/scenario to each simulation.
Using the default settings your structure should look like this:
results_folder
    ├─ sim-1
    ├─ sim-2
    ├─ sim-3
        ├─ temp
            ├─ your-printer-files.prt
- Parameters:
results_folder (pathlib.Path) – Path to the directory containing simulation folders. Each subfolder should contain a temp folder containing valid simulation data files.
processing_scenario (collections.abc.Callable or collections.abc.Sequence of collections.abc.Callable) – One callable or a sequence of callables containing the processing logic for a simulation. Each callable should take a Simulation object as its only parameter and modify it in place.
max_workers (int or None, default None) – Maximum number of worker processes to use. If None, defaults to the number of processors on the machine.
- Returns:
SimulationsData –
monthly: Dict mapping simulation names to monthly DataFrame results
hourly: Dict mapping simulation names to hourly DataFrame results
scalar: DataFrame containing scalar/deck values from all simulations
- Return type:
pytrnsys_process.process.data_structures.SimulationsData
- Raises:
ValueError – If results_folder doesn’t exist or is not a directory
Exception – Individual simulation failures are logged but not re-raised
Example
>>> import pathlib as _pl
>>> from pytrnsys_process import api
...
>>> def processing_step_1(sim):
...     # Process simulation data
...     pass
>>> def processing_step_2(sim):
...     # Process simulation data
...     pass
>>> results = api.process_whole_result_set_parallel(
...     _pl.Path("path/to/results"),
...     [processing_step_1, processing_step_2]
... )
- pytrnsys_process.process.do_comparison(comparison_scenario: Callable[[SimulationsData], None] | Sequence[Callable[[SimulationsData], None]], simulations_data: SimulationsData | None = None, results_folder: Path | None = None) -> SimulationsData
Execute comparison scenarios on processed simulation results.
- Parameters:
comparison_scenario (collections.abc.Callable or collections.abc.Sequence of collections.abc.Callable) – One callable or a sequence of callables containing the comparison logic. Each callable should take a SimulationsData object as its only parameter and modify it in place.
simulations_data (SimulationsData, optional) – SimulationsData object containing the processed simulations data to be compared.
results_folder (pathlib.Path, optional) – Path to the directory containing simulation results. Used if simulations_data is not provided.
- Returns:
SimulationsData
- Return type:
pytrnsys_process.process.data_structures.SimulationsData
Example
>>> from pytrnsys_process import api
>>> from pytrnsys_process.process import data_structures as ds
...
>>> def comparison_step(simulations_data: ds.SimulationsData):
...     # Compare simulation results
...     pass
...
>>> # processed_results: a SimulationsData object, e.g. as returned by api.process_whole_result_set
>>> api.do_comparison(comparison_step, simulations_data=processed_results)