pytrnsys_process.process package#

Submodules#

pytrnsys_process.process.data_structures module#

class pytrnsys_process.process.data_structures.Simulation(path: str, monthly: DataFrame, hourly: DataFrame, step: DataFrame, scalar: DataFrame)[source]#

Bases: object

Class representing a TRNSYS simulation with its associated data.

This class holds the simulation data organized in different time resolutions (monthly, hourly, timestep) along with the path to the simulation files.

path#

Path to the simulation folder containing the input files

Type:: str

monthly#

Monthly aggregated simulation data. Each column represents a different variable and each row represents a month.

Type:: pandas.DataFrame

hourly#

Hourly simulation data. Each column represents a different variable and each row represents an hour.

Type:: pandas.DataFrame

step#

Simulation data at the smallest timestep resolution. Each column represents a different variable and each row represents a timestep.

Type:: pandas.DataFrame

path: str#

monthly: DataFrame#

hourly: DataFrame#

step: DataFrame#

scalar: DataFrame#

__init__(path: str, monthly: DataFrame, hourly: DataFrame, step: DataFrame, scalar: DataFrame) → None#

class pytrnsys_process.process.data_structures.ProcessingResults(processed_count: int = 0, error_count: int = 0, failed_simulations: ~typing.List[str] = <factory>, failed_scenarios: dict[str, ~typing.List[str]] = <factory>)[source]#

Bases: object

Results from processing one or more simulations.

processed_count#

Number of successfully processed simulations

Type:: int

error_count#

Number of simulations that failed to process

Type:: int

failed_simulations#

List of simulation names that failed to process

Type:: List[str]

failed_scenarios#

Dictionary mapping simulation names to lists of failed scenario names

Type:: dict[str, List[str]]

simulations#

Dictionary mapping simulation names to processed Simulation objects

Example

>>> results = ProcessingResults()
>>> results.processed_count = 5
>>> results.error_count = 1
>>> results.failed_simulations = ['sim_001']
>>> results.failed_scenarios = {'sim_002': ['scenario_1']}

processed_count: int = 0#

error_count: int = 0#

failed_simulations: List[str]#

failed_scenarios: dict[str, List[str]]#

__init__(processed_count: int = 0, error_count: int = 0, failed_simulations: ~typing.List[str] = <factory>, failed_scenarios: dict[str, ~typing.List[str]] = <factory>) → None#
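The `<factory>` defaults in the signature above suggest that the list and dict fields are created fresh for every instance. The following is a minimal sketch of that behaviour, assuming the class is an ordinary dataclass; `ProcessingResultsSketch` is a hypothetical stand-in that only mirrors the documented fields, not the library class itself.

```python
import dataclasses


@dataclasses.dataclass
class ProcessingResultsSketch:
    """Hypothetical stand-in mirroring the documented fields."""

    processed_count: int = 0
    error_count: int = 0
    # default_factory gives every instance its own list and dict
    failed_simulations: list[str] = dataclasses.field(default_factory=list)
    failed_scenarios: dict[str, list[str]] = dataclasses.field(default_factory=dict)


first = ProcessingResultsSketch()
second = ProcessingResultsSketch()

# Record one failed simulation and one failed scenario on the first instance only
first.error_count += 1
first.failed_simulations.append("sim_001")
first.failed_scenarios.setdefault("sim_002", []).append("scenario_1")
```

Because the containers are not shared, `second` stays empty after `first` is modified.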

class pytrnsys_process.process.data_structures.SimulationsData(simulations: dict[str, ~pytrnsys_process.process.data_structures.Simulation] = <factory>, scalar: ~pandas.core.frame.DataFrame = <factory>, path_to_simulations: str = <factory>)[source]#

Bases: object

Class representing a result set

Used to create comparison plots across different simulations

simulations#

Can be accessed using the simulation names as keys. Example: simulations['sim_001']

Type:: dict of {str, Simulation}

scalar#

Contains the deck constant values from all simulations. This is also the place to store your calculations for plotting.

Type:: pandas.DataFrame

path_to_simulations#

The path to your results folder

Type:: str

simulations: dict[str, Simulation]#

scalar: DataFrame#

path_to_simulations: str#

path_to_simulations_original: str#

__init__(simulations: dict[str, ~pytrnsys_process.process.data_structures.Simulation] = <factory>, scalar: ~pandas.core.frame.DataFrame = <factory>, path_to_simulations: str = <factory>) → None#

pytrnsys_process.process.file_type_detector module#

pytrnsys_process.process.file_type_detector.get_file_type_using_file_content(file_path: ~pathlib.Path, logger: ~logging.Logger = <Logger default_pytrnsys_process (WARNING)>) → FileType[source]#

Determine the file type by analyzing its content.

Parameters:: file_path (pathlib.Path) – Path to the file to analyze
Returns:: FileType – The detected file type (MONTHLY, HOURLY, or TIMESTEP)
Return type:: pytrnsys_process.constants.FileType
Raises:: ValueError – If the file type cannot be determined from the content

pytrnsys_process.process.file_type_detector.get_file_type_using_file_name(file: ~pathlib.Path, logger: ~logging.Logger = <Logger default_pytrnsys_process (WARNING)>) → FileType[source]#

Determine the file type by checking the filename against known patterns.

Parameters:: file (pathlib.Path) – The path to the file to check
Returns:: FileType – The detected file type (MONTHLY, HOURLY, TIMESTEP or DECK)
Return type:: pytrnsys_process.constants.FileType
Raises:: ValueError – If no matching pattern is found

pytrnsys_process.process.file_type_detector.has_pattern(file: Path, file_type: FileType) → bool[source]#

Check if a filename contains any of the patterns associated with a specific FileType.

Parameters:

file (pathlib.Path) – The path to the file to check
file_type (pytrnsys_process.constants.FileType) – The FileType enum containing patterns to match against

Returns:

bool – True if the filename contains any of the patterns, False otherwise

Return type:

bool
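As an illustration of the check described above, here is a minimal sketch of matching a filename against the patterns attached to an enum member. `FileTypeSketch`, its patterns, and the case-insensitive comparison are all assumptions made for this example; the real patterns live in `pytrnsys_process.constants.FileType` and may differ.

```python
import enum
import pathlib


class FileTypeSketch(enum.Enum):
    """Hypothetical stand-in; the patterns are invented for illustration."""

    MONTHLY = ("_mo",)
    HOURLY = ("_hr",)


def has_pattern_sketch(file: pathlib.Path, file_type: FileTypeSketch) -> bool:
    # Compare against the lower-cased file name without its extension
    stem = file.stem.lower()
    return any(pattern in stem for pattern in file_type.value)


monthly_file = pathlib.Path("results/sim-1/temp/energy_mo.prt")
is_monthly = has_pattern_sketch(monthly_file, FileTypeSketch.MONTHLY)
is_hourly = has_pattern_sketch(monthly_file, FileTypeSketch.HOURLY)
```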

pytrnsys_process.process.process_batch module#

exception pytrnsys_process.process.process_batch.UnableToProcessSimulationError[source]#

Bases: Exception

Raised when a simulation cannot be processed.

pytrnsys_process.process.process_batch.process_single_simulation(sim_folder: Path, processing_scenario: Callable[[Simulation], None] | Sequence[Callable[[Simulation], None]]) → Simulation[source]#

Process a single simulation folder using the provided processing step/scenario.

Parameters:

sim_folder (pathlib.Path) – Path to the simulation folder to process
processing_scenario (collections.abc.Callable or collections.abc.Sequence of collections.abc.Callable) – A callable or a sequence of callables containing the processing logic for a simulation. Each callable should take a Simulation object as its only parameter and modify it in place.

Returns:

Simulation

Return type:

pytrnsys_process.api.Simulation

Example

>>> import pathlib as _pl
>>> from pytrnsys_process import api
...
>>> def processing_step_1(sim: api.Simulation):
...     # Process simulation data
...     pass
>>> results = api.process_single_simulation(
...     _pl.Path("path/to/simulation"),
...     processing_step_1
... )
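Since `processing_scenario` accepts either one callable or a sequence of callables, a caller-side helper can normalise both forms into a list before applying the steps in order. The sketch below shows one way this could be done; `as_steps_sketch` is a hypothetical helper, and a plain dict stands in for the Simulation object. How the library handles this internally is not stated in this page.

```python
from collections.abc import Callable, Sequence
from typing import Any, Union

Step = Callable[[Any], None]


def as_steps_sketch(processing_scenario: Union[Step, Sequence[Step]]) -> list[Step]:
    # A single callable becomes a one-element list; a sequence is copied
    if callable(processing_scenario):
        return [processing_scenario]
    return list(processing_scenario)


def count_step(sim: dict) -> None:
    # Modify the stand-in object in place, as the documented steps do
    sim["steps_run"] = sim.get("steps_run", 0) + 1


sim = {}  # plain dict standing in for a Simulation object
for step in as_steps_sketch(count_step):
    step(sim)
for step in as_steps_sketch([count_step, count_step]):
    step(sim)
```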

pytrnsys_process.process.process_batch.process_whole_result_set(results_folder: Path, processing_scenario: Callable[[Simulation], None] | Sequence[Callable[[Simulation], None]]) → SimulationsData[source]#

Process all simulation folders in a results directory sequentially.

Processes each simulation folder found in the results directory one at a time, applying the provided processing step/scenario to each simulation.

Using the default settings your structure should look like this:

results_folder
├─ sim-1
├─ sim-2
├─ sim-3
    ├─ temp
        ├─ your-printer-files.prt

Parameters:

results_folder (pathlib.Path) – Path to the directory containing simulation folders. Each subfolder should contain a temp folder containing valid simulation data files.
processing_scenario (collections.abc.Callable or collections.abc.Sequence of collections.abc.Callable) – A callable or a sequence of callables containing the processing logic for a simulation. Each callable should take a Simulation object as its only parameter and modify it in place.

Returns:

SimulationsData –

simulations: Dict mapping simulation names to Simulation objects, each holding the monthly, hourly, and step DataFrames
scalar: DataFrame containing scalar/deck values from all simulations

Return type:

pytrnsys_process.api.SimulationsData

Raises:

ValueError – If results_folder doesn’t exist or is not a directory
Exception – Individual simulation failures are logged but not re-raised

Example

>>> import pathlib as _pl
>>> from pytrnsys_process import api
...
>>> def processing_step_1(sim):
...     # Process simulation data
...     pass
>>> def processing_step_2(sim):
...     # Process simulation data
...     pass
>>> results = api.process_whole_result_set(
...     _pl.Path("path/to/results"),
...     [processing_step_1, processing_step_2]
... )
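The error behaviour described under Raises (a ValueError for a missing folder, while individual failures are logged and skipped) can be sketched with the standard library alone. This is an illustrative loop under stated assumptions, not the library's implementation: `process_all_sketch` and `fail_on_sim_2` are hypothetical names, and the step receives the folder path rather than a Simulation object.

```python
import logging
import pathlib
import tempfile

logger = logging.getLogger("sketch")


def process_all_sketch(results_folder: pathlib.Path, step) -> tuple[list[str], list[str]]:
    """Hypothetical loop: apply `step` to each subfolder, log failures, keep going."""
    if not results_folder.is_dir():
        raise ValueError(f"{results_folder} does not exist or is not a directory")

    processed, failed = [], []
    for sim_folder in sorted(p for p in results_folder.iterdir() if p.is_dir()):
        try:
            step(sim_folder)
            processed.append(sim_folder.name)
        except Exception:
            # Log the traceback but do not re-raise, so later folders still run
            logger.exception("Could not process %s", sim_folder.name)
            failed.append(sim_folder.name)
    return processed, failed


def fail_on_sim_2(sim_folder: pathlib.Path) -> None:
    if sim_folder.name == "sim-2":
        raise RuntimeError("simulated failure")


with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp)
    for name in ("sim-1", "sim-2", "sim-3"):
        (root / name / "temp").mkdir(parents=True)
    processed, failed = process_all_sketch(root, fail_on_sim_2)
```

Here `sim-2` ends up in `failed` while the other two folders are still processed.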

pytrnsys_process.process.process_batch.process_whole_result_set_parallel(results_folder: Path, processing_scenario: Callable[[Simulation], None] | Sequence[Callable[[Simulation], None]], max_workers: int | None = None) → SimulationsData[source]#

Process all simulation folders in a results directory in parallel.

Uses a ProcessPoolExecutor to process multiple simulations concurrently, applying the provided processing step/scenario to each simulation.

Using the default settings your structure should look like this:

results_folder
├─ sim-1
├─ sim-2
├─ sim-3
    ├─ temp
        ├─ your-printer-files.prt

Parameters:

results_folder (pathlib.Path) – Path to the directory containing simulation folders. Each subfolder should contain a temp folder containing valid simulation data files.
processing_scenario (collections.abc.Callable or collections.abc.Sequence of collections.abc.Callable) – A callable or a sequence of callables containing the processing logic for a simulation. Each callable should take a Simulation object as its only parameter and modify it in place.
max_workers (int, optional) – Maximum number of worker processes to use. If None (the default), the number of processors on the machine is used.

Returns:

SimulationsData –

simulations: Dict mapping simulation names to Simulation objects, each holding the monthly, hourly, and step DataFrames
scalar: DataFrame containing scalar/deck values from all simulations

Return type:

pytrnsys_process.api.SimulationsData

Raises:

ValueError – If results_folder doesn’t exist or is not a directory
Exception – Individual simulation failures are logged but not re-raised

Example

>>> import pathlib as _pl
>>> from pytrnsys_process import api
...
>>> def processing_step_1(sim):
...     # Process simulation data
...     pass
>>> def processing_step_2(sim):
...     # Process simulation data
...     pass
>>> results = api.process_whole_result_set_parallel(
...     _pl.Path("path/to/results"),
...     [processing_step_1, processing_step_2]
... )

pytrnsys_process.process.process_batch.do_comparison(comparison_scenario: Callable[[SimulationsData], None] | Sequence[Callable[[SimulationsData], None]], simulations_data: SimulationsData | None = None, results_folder: Path | None = None) → SimulationsData[source]#

Execute comparison scenarios on processed simulation results.

Parameters:

comparison_scenario (collections.abc.Callable or collections.abc.Sequence of collections.abc.Callable) – A callable or a sequence of callables containing the comparison logic. Each callable should take a SimulationsData object as its only parameter and modify it in place.
simulations_data (SimulationsData, optional) – SimulationsData object containing the processed simulations data to be compared.
results_folder (pathlib.Path, optional) – Path to the directory containing simulation results. Used if simulations_data is not provided.

Returns:

SimulationsData

Return type:

pytrnsys_process.api.SimulationsData

Example

>>> from pytrnsys_process import api
...
>>> def comparison_step(simulations_data: api.SimulationsData):
...     # Compare simulation results
...     pass
...
>>> api.do_comparison(comparison_step, simulations_data=processed_results)

pytrnsys_process.process.process_sim module#

pytrnsys_process.process.process_sim.process_sim(sim_files: Sequence[Path], sim_folder: Path) → Simulation[source]#

pytrnsys_process.process.process_sim.handle_duplicate_columns(df: DataFrame) → DataFrame[source]#

Process duplicate columns in a DataFrame, ensuring they contain consistent data.

This function checks for duplicate column names and verifies that:

1. If one duplicate column has NaN values, the other(s) must also have NaN at the same indices
2. All non-NaN values must be identical across duplicate columns

Parameters:: df (pandas.DataFrame) – Input DataFrame to process
Returns:: df – DataFrame with duplicate columns removed, keeping only the first occurrence
Return type:: pandas.DataFrame
Raises:: ValueError – If duplicate columns have: 1. NaN values in one column while having actual values in another at the same index, or 2. Different non-NaN values at the same index

Note

https://stackoverflow.com/questions/14984119/python-pandas-remove-duplicate-columns
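The two consistency rules above can be illustrated without pandas. The sketch below is a plain-Python stand-in, not the library's code: columns are modelled as `(name, values)` pairs, `drop_consistent_duplicates_sketch` is a hypothetical name, and duplicate columns are assumed to have equal length.

```python
import math


def _is_nan(value) -> bool:
    return isinstance(value, float) and math.isnan(value)


def drop_consistent_duplicates_sketch(columns):
    """Take a list of (name, values) pairs and keep the first copy of each name."""
    kept = {}  # name -> values of the first occurrence, in insertion order
    for name, values in columns:
        if name not in kept:
            kept[name] = values
            continue
        # A repeated name must agree with the first occurrence row by row
        for index, (first, other) in enumerate(zip(kept[name], values)):
            if _is_nan(first) != _is_nan(other):
                raise ValueError(f"Column {name!r}: NaN mismatch at index {index}")
            if not _is_nan(first) and first != other:
                raise ValueError(f"Column {name!r}: different values at index {index}")
    return list(kept.items())


nan = float("nan")
result = drop_consistent_duplicates_sketch(
    [("QHeat", [1.0, nan]), ("TAmb", [20.0, 21.0]), ("QHeat", [1.0, nan])]
)
kept_names = [name for name, _ in result]
```

With a DataFrame, the same idea would be applied to columns that share a label; the linked Stack Overflow thread discusses the pandas side of removing them.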

Module contents#

class pytrnsys_process.process.Simulation(path: str, monthly: DataFrame, hourly: DataFrame, step: DataFrame, scalar: DataFrame)[source]#

Bases: object

Class representing a TRNSYS simulation with its associated data.

This class holds the simulation data organized in different time resolutions (monthly, hourly, timestep) along with the path to the simulation files.

path#

Path to the simulation folder containing the input files

Type:: str

monthly#

Monthly aggregated simulation data. Each column represents a different variable and each row represents a month.

Type:: pandas.DataFrame

hourly#

Hourly simulation data. Each column represents a different variable and each row represents an hour.

Type:: pandas.DataFrame

step#

Simulation data at the smallest timestep resolution. Each column represents a different variable and each row represents a timestep.

Type:: pandas.DataFrame

__init__(path: str, monthly: DataFrame, hourly: DataFrame, step: DataFrame, scalar: DataFrame) → None#

path: str#

monthly: DataFrame#

hourly: DataFrame#

step: DataFrame#

scalar: DataFrame#

class pytrnsys_process.process.SimulationsData(simulations: dict[str, ~pytrnsys_process.process.data_structures.Simulation] = <factory>, scalar: ~pandas.core.frame.DataFrame = <factory>, path_to_simulations: str = <factory>)[source]#

Bases: object

Class representing a result set

Used to create comparison plots across different simulations

simulations#

Can be accessed using the simulation names as keys. Example: simulations['sim_001']

Type:: dict of {str, Simulation}

scalar#

Contains the deck constant values from all simulations. This is also the place to store your calculations for plotting.

Type:: pandas.DataFrame

path_to_simulations#

The path to your results folder

Type:: str

__init__(simulations: dict[str, ~pytrnsys_process.process.data_structures.Simulation] = <factory>, scalar: ~pandas.core.frame.DataFrame = <factory>, path_to_simulations: str = <factory>) → None#

simulations: dict[str, Simulation]#

scalar: DataFrame#

path_to_simulations: str#

path_to_simulations_original: str#

pytrnsys_process.process.get_file_type_using_file_content(file_path: ~pathlib.Path, logger: ~logging.Logger = <Logger default_pytrnsys_process (WARNING)>) → FileType[source]#

Determine the file type by analyzing its content.

Parameters:: file_path (pathlib.Path) – Path to the file to analyze
Returns:: FileType – The detected file type (MONTHLY, HOURLY, or TIMESTEP)
Return type:: pytrnsys_process.constants.FileType
Raises:: ValueError – If the file type cannot be determined from the content

pytrnsys_process.process.get_file_type_using_file_name(file: ~pathlib.Path, logger: ~logging.Logger = <Logger default_pytrnsys_process (WARNING)>) → FileType[source]#

Determine the file type by checking the filename against known patterns.

Parameters:: file (pathlib.Path) – The path to the file to check
Returns:: FileType – The detected file type (MONTHLY, HOURLY, TIMESTEP or DECK)
Return type:: pytrnsys_process.constants.FileType
Raises:: ValueError – If no matching pattern is found

pytrnsys_process.process.has_pattern(file: Path, file_type: FileType) → bool[source]#

Check if a filename contains any of the patterns associated with a specific FileType.

Parameters:

file (pathlib.Path) – The path to the file to check
file_type (pytrnsys_process.constants.FileType) – The FileType enum containing patterns to match against

Returns:

bool – True if the filename contains any of the patterns, False otherwise

Return type:

bool

pytrnsys_process.process.process_single_simulation(sim_folder: Path, processing_scenario: Callable[[Simulation], None] | Sequence[Callable[[Simulation], None]]) → Simulation[source]#

Process a single simulation folder using the provided processing step/scenario.

Parameters:

sim_folder (pathlib.Path) – Path to the simulation folder to process
processing_scenario (collections.abc.Callable or collections.abc.Sequence of collections.abc.Callable) – A callable or a sequence of callables containing the processing logic for a simulation. Each callable should take a Simulation object as its only parameter and modify it in place.

Returns:

Simulation

Return type:

pytrnsys_process.api.Simulation

Example

>>> import pathlib as _pl
>>> from pytrnsys_process import api
...
>>> def processing_step_1(sim: api.Simulation):
...     # Process simulation data
...     pass
>>> results = api.process_single_simulation(
...     _pl.Path("path/to/simulation"),
...     processing_step_1
... )

pytrnsys_process.process.process_whole_result_set(results_folder: Path, processing_scenario: Callable[[Simulation], None] | Sequence[Callable[[Simulation], None]]) → SimulationsData[source]#

Process all simulation folders in a results directory sequentially.

Processes each simulation folder found in the results directory one at a time, applying the provided processing step/scenario to each simulation.

Using the default settings your structure should look like this:

results_folder
├─ sim-1
├─ sim-2
├─ sim-3
    ├─ temp
        ├─ your-printer-files.prt

Parameters:

results_folder (pathlib.Path) – Path to the directory containing simulation folders. Each subfolder should contain a temp folder containing valid simulation data files.
processing_scenario (collections.abc.Callable or collections.abc.Sequence of collections.abc.Callable) – A callable or a sequence of callables containing the processing logic for a simulation. Each callable should take a Simulation object as its only parameter and modify it in place.

Returns:

SimulationsData –

simulations: Dict mapping simulation names to Simulation objects, each holding the monthly, hourly, and step DataFrames
scalar: DataFrame containing scalar/deck values from all simulations

Return type:

pytrnsys_process.api.SimulationsData

Raises:

ValueError – If results_folder doesn’t exist or is not a directory
Exception – Individual simulation failures are logged but not re-raised

Example

>>> import pathlib as _pl
>>> from pytrnsys_process import api
...
>>> def processing_step_1(sim):
...     # Process simulation data
...     pass
>>> def processing_step_2(sim):
...     # Process simulation data
...     pass
>>> results = api.process_whole_result_set(
...     _pl.Path("path/to/results"),
...     [processing_step_1, processing_step_2]
... )

pytrnsys_process.process.process_whole_result_set_parallel(results_folder: Path, processing_scenario: Callable[[Simulation], None] | Sequence[Callable[[Simulation], None]], max_workers: int | None = None) → SimulationsData[source]#

Process all simulation folders in a results directory in parallel.

Uses a ProcessPoolExecutor to process multiple simulations concurrently, applying the provided processing step/scenario to each simulation.

Using the default settings your structure should look like this:

results_folder
├─ sim-1
├─ sim-2
├─ sim-3
    ├─ temp
        ├─ your-printer-files.prt

Parameters:

results_folder (pathlib.Path) – Path to the directory containing simulation folders. Each subfolder should contain a temp folder containing valid simulation data files.
processing_scenario (collections.abc.Callable or collections.abc.Sequence of collections.abc.Callable) – A callable or a sequence of callables containing the processing logic for a simulation. Each callable should take a Simulation object as its only parameter and modify it in place.
max_workers (int, optional) – Maximum number of worker processes to use. If None (the default), the number of processors on the machine is used.

Returns:

SimulationsData –

simulations: Dict mapping simulation names to Simulation objects, each holding the monthly, hourly, and step DataFrames
scalar: DataFrame containing scalar/deck values from all simulations

Return type:

pytrnsys_process.api.SimulationsData

Raises:

ValueError – If results_folder doesn’t exist or is not a directory
Exception – Individual simulation failures are logged but not re-raised

Example

>>> import pathlib as _pl
>>> from pytrnsys_process import api
...
>>> def processing_step_1(sim):
...     # Process simulation data
...     pass
>>> def processing_step_2(sim):
...     # Process simulation data
...     pass
>>> results = api.process_whole_result_set_parallel(
...     _pl.Path("path/to/results"),
...     [processing_step_1, processing_step_2]
... )

pytrnsys_process.process.do_comparison(comparison_scenario: Callable[[SimulationsData], None] | Sequence[Callable[[SimulationsData], None]], simulations_data: SimulationsData | None = None, results_folder: Path | None = None) → SimulationsData[source]#

Execute comparison scenarios on processed simulation results.

Parameters:

comparison_scenario (collections.abc.Callable or collections.abc.Sequence of collections.abc.Callable) – A callable or a sequence of callables containing the comparison logic. Each callable should take a SimulationsData object as its only parameter and modify it in place.
simulations_data (SimulationsData, optional) – SimulationsData object containing the processed simulations data to be compared.
results_folder (pathlib.Path, optional) – Path to the directory containing simulation results. Used if simulations_data is not provided.

Returns:

SimulationsData

Return type:

pytrnsys_process.api.SimulationsData

Example

>>> from pytrnsys_process import api
...
>>> def comparison_step(simulations_data: api.SimulationsData):
...     # Compare simulation results
...     pass
...
>>> api.do_comparison(comparison_step, simulations_data=processed_results)
