water_benchmark_hub.battledim

water_benchmark_hub.battledim.battledim

Module provides access to the BattLeDIM benchmark.

class water_benchmark_hub.battledim.battledim.BattLeDIM

Bases: BenchmarkResource

The Battle of the Leakage Detection and Isolation Methods (BattLeDIM) 2020, organized by S. G. Vrachimis, D. G. Eliades, R. Taormina, Z. Kapelan, A. Ostfeld, S. Liu, M. Kyriakou, P. Pavlou, M. Qiu, and M. M. Polycarpou, as part of the 2nd International CCWI/WDSA Joint Conference in Beijing, China, aims at objectively comparing the performance of methods for the detection and localization of leakage events, relying on SCADA measurements of flow and pressure sensors installed within water distribution networks.

See https://github.com/KIOS-Research/BattLeDIM for details.

This module provides functions for loading the original BattLeDIM data set (load_data()), the scenarios (load_scenario()), and the pre-generated SCADA data (load_scada_data()). The official scoring/evaluation is implemented in compute_evaluation_score(), so its results can be compared directly to the official leaderboard results. Besides this, the user can choose to evaluate predictions using any other metric.

static compute_evaluation_score(y_leak_locations_pred: list[tuple[str, int]], test_scenario: bool, verbose: bool = True) → dict

Evaluates the predictions (i.e. start time and location of leakages) as was done in the BattLeDIM competition, i.e. the output of this function can be directly compared to the official leaderboard results.

Parameters:

y_leak_locations_pred (list[tuple[str, int]]) – Predictions of location (link/pipe ID) and start time (in seconds since simulation start) of leakages.
test_scenario (bool) – True if the given predictions are made for the test scenario, False otherwise.
verbose (bool, optional) –
If True, a progress bar is shown while downloading files.

The default is True.

Returns:

Dictionary containing the true positive rate, true positives, false positives, false negatives, and total monetary (Euro) savings (only available if test_scenario is True).

Return type:

dict
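
As an illustration, the prediction list might be assembled as sketched below. The pipe ID "p123", the leak time, and the simulation start of 2019-01-01 00:00 are assumptions made for this example, and to_prediction and evaluate are hypothetical helper names; only compute_evaluation_score() and its parameters are taken from the documentation above.

```python
from datetime import datetime


def to_prediction(pipe_id: str, leak_start: datetime,
                  sim_start: datetime) -> tuple[str, int]:
    """Convert a detected leak into the (pipe ID, seconds since start) format."""
    return pipe_id, int((leak_start - sim_start).total_seconds())


# Assumption: the simulation starts at midnight on 2019-01-01.
sim_start = datetime(2019, 1, 1)

# Hypothetical prediction: pipe "p123" starts leaking 1.5 days into the run.
y_leak_locations_pred = [
    to_prediction("p123", datetime(2019, 1, 2, 12, 0), sim_start),
]


def evaluate(predictions: list[tuple[str, int]]) -> dict:
    """Score predictions for the test scenario (requires water_benchmark_hub)."""
    # Imported lazily so the helper above works without the package installed.
    from water_benchmark_hub.battledim.battledim import BattLeDIM

    return BattLeDIM.compute_evaluation_score(predictions, test_scenario=True)
```

Calling evaluate(y_leak_locations_pred) should then return the dictionary described under Returns, provided the package is installed.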

static get_meta_info() → dict

Gets the meta information of this resource.

Returns:

Meta info.

Return type:

dict

static load_data(return_test_scenario: bool, download_dir: str | None = None, return_X_y: bool = False, return_features_desc: bool = False, return_leak_locations: bool = False, verbose: bool = True) → pandas.DataFrame | Any

Loads the original BattLeDIM benchmark data set. Note that the data set exists in two different versions: a training version and an evaluation/test version.

Parameters:

return_test_scenario (bool) – If True, the evaluation/test data set is returned, otherwise the historical (i.e. training) data set is returned.
download_dir (str, optional) –
Path to the data files – if None, the temp folder will be used. If the path does not exist, the data files will be downloaded to the given path.

The default is None.
return_X_y (bool, optional) –
If True, the data is returned together with the labels (presence of a leakage) as two Numpy arrays, otherwise, the data is returned as a pandas.DataFrame instance.

The default is False.
return_features_desc (bool, optional) –
If True and if return_X_y is True, the returned dictionary contains the features’ descriptions (i.e. names) under the key “features_desc”.

The default is False.
return_leak_locations (bool, optional) –
If True, the leak locations are returned as well – as an instance of scipy.sparse.bsr_array.

The default is False.
verbose (bool, optional) –
If True, a progress bar is shown while downloading files.

The default is True.

Returns:

Benchmark data set.

Return type:

Either a pandas.DataFrame instance or a tuple of Numpy arrays.
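
A minimal usage sketch, assuming that return_X_y=True yields exactly two arrays (X, y) and that y is a one-dimensional array of 0/1 labels. Both helper names, load_training_xy and leak_fraction, are hypothetical and not part of the package.

```python
def leak_fraction(y) -> float:
    """Fraction of time steps labeled as leaky (assumes 1-D 0/1 labels)."""
    labels = list(y)
    return sum(1 for label in labels if label) / len(labels)


def load_training_xy(download_dir=None):
    """Load the historical (training) data as (X, y) Numpy arrays."""
    # Imported lazily so leak_fraction() works without the package installed.
    from water_benchmark_hub.battledim.battledim import BattLeDIM

    # Assumption: with return_X_y=True the call returns a 2-tuple (X, y).
    X, y = BattLeDIM.load_data(return_test_scenario=False,
                               download_dir=download_dir,
                               return_X_y=True)
    return X, y
```

For example, leak_fraction(y) gives a quick check of how imbalanced the labels are before a detector is trained.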

static load_scada_data(return_test_scenario: bool, download_dir: str | None = None, return_X_y: bool = False, return_leak_locations: bool = False, verbose: bool = True) → list[epyt_flow.simulation.scada.ScadaData | Any]

Loads the SCADA data of the simulated BattLeDIM benchmark scenario – note that due to randomness, these differ from the original data set which can be loaded by calling load_data().

Warning

A large file (approx. 4GB) will be downloaded and loaded into memory – this might take some time.

Parameters:

return_test_scenario (bool) – If True, the evaluation/test scenario is returned, otherwise the historical (i.e. training) scenario is returned.
download_dir (str, optional) –
Path to the data files – if None, the temp folder will be used. If the path does not exist, the data files will be downloaded to the given path.

The default is None.
return_X_y (bool, optional) –
If True, the data is returned together with the labels (presence of a leakage) as two Numpy arrays, otherwise, the data is returned as an epyt_flow.simulation.scada.scada_data.ScadaData instance.

The default is False.
return_leak_locations (bool, optional) –
If True, the leak locations are returned as well – as an instance of scipy.sparse.bsr_array.

The default is False.
verbose (bool, optional) –
If True, a progress bar is shown while downloading files.

The default is True.

Returns:

The simulated benchmark scenario as either an epyt_flow.simulation.scada.scada_data.ScadaData instance or as a tuple of (X, y) Numpy arrays. If return_leak_locations is True, the leak locations are included as an instance of scipy.sparse.bsr_array as well.

Return type:

epyt_flow.simulation.scada.scada_data.ScadaData or list[tuple[numpy.ndarray, numpy.ndarray]]
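
Given the download size mentioned in the warning, it may be worth checking the free disk space first. The sketch below is one way to do that; has_free_space and load_test_scada are hypothetical helpers, the 5 GB threshold (4 GB plus some margin) is an assumption, and the check does not cover available memory.

```python
import shutil


def has_free_space(path: str, required_gb: float) -> bool:
    """Return True if the file system holding `path` has enough free space."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * 1024**3


def load_test_scada(download_dir: str):
    """Load the pre-generated SCADA data of the evaluation/test scenario."""
    # `download_dir` must already exist here, because disk_usage() needs
    # an existing path. Assumed threshold: approx. 4 GB file plus margin.
    if not has_free_space(download_dir, required_gb=5.0):
        raise RuntimeError(f"Not enough free disk space in {download_dir}")

    # Imported lazily so has_free_space() works without the package installed.
    from water_benchmark_hub.battledim.battledim import BattLeDIM

    return BattLeDIM.load_scada_data(return_test_scenario=True,
                                     download_dir=download_dir)
```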

static load_scenario(return_test_scenario: bool, download_dir: str | None = None, verbose: bool = True) → epyt_flow.simulation.ScenarioConfig

Creates and returns the BattLeDIM scenario – it can be either modified or passed directly to the EPyT-Flow simulator epyt_flow.simulation.scenario_simulator.ScenarioSimulator.

Note

Due to randomness, the simulation results differ from the original data set, which can be loaded by calling load_data().

Parameters:

return_test_scenario (bool) – If True, the evaluation/test scenario is returned, otherwise the historical (i.e. training) scenario is returned.
download_dir (str, optional) –
Path to the L-TOWN.inp file – if None, the temp folder will be used. If the path does not exist, the .inp file will be downloaded to the given path.

The default is None.
verbose (bool, optional) –
If True, a progress bar is shown while downloading files.

The default is True.

Returns:

Complete scenario configuration of the BattLeDIM benchmark scenario.

Return type:

epyt_flow.simulation.scenario_config.ScenarioConfig
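
The following sketch shows how the returned configuration might be passed to the simulator. It assumes that ScenarioSimulator accepts a scenario_config keyword argument, can be used as a context manager, and offers a run_simulation() method; these details belong to EPyT-Flow and should be checked against its documentation. simulate_battledim is a hypothetical helper name.

```python
def simulate_battledim(return_test_scenario: bool = False):
    """Load the BattLeDIM scenario and run it in the EPyT-Flow simulator."""
    # Both packages are imported lazily; they must be installed to call this.
    from epyt_flow.simulation.scenario_simulator import ScenarioSimulator
    from water_benchmark_hub.battledim.battledim import BattLeDIM

    config = BattLeDIM.load_scenario(return_test_scenario=return_test_scenario)

    # Assumption: the simulator takes the configuration via `scenario_config`
    # and releases its resources when the `with` block exits.
    with ScenarioSimulator(scenario_config=config) as sim:
        return sim.run_simulation()
```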

water_benchmark_hub.battledim.battledim_data

Module provides the leakage configurations for BattLeDIM.