proteobench.modules.entrapment.entrapment_base_module module

Entrapment Base Module.

class proteobench.modules.entrapment.entrapment_base_module.EntrapmentModule(token: str | None, proteobench_repo_name: str, proteobot_repo_name: str, parse_settings_dir: str, module_id: str, branch: str | None = None)

Bases: object

Base Module for Entrapment.

Parameters:
  • token (Optional[str]) – The GitHub token.

  • proteobench_repo_name (str) – The name of the ProteoBench repository.

  • proteobot_repo_name (str) – The name of the ProteoBot repository.

  • parse_settings_dir (str) – The directory containing parse settings.

  • module_id (str) – The module identifier for configuration.

  • branch (Optional[str]) – The Git branch to use. Defaults to None.

EXTRACT_PARAMS_DICT: Dict[str, Any] = {'AlphaDIA': <function extract_params>, 'AlphaPept': <function extract_params>, 'DIA-NN': <function extract_params>, 'FragPipe': <function extract_params>, 'FragPipe (DIA-NN quant)': <function extract_params>, 'MSAID': <function extract_params>, 'MSAngel': <function extract_params>, 'MaxQuant': <function extract_params>, 'MetaMorpheus': <function extract_params>, 'PEAKS': <function extract_params>, 'ProlineStudio': <function extract_params>, 'Proteome Discoverer': <function read_spectronaut_settings>, 'Sage': <function extract_params>, 'Spectronaut': <function read_spectronaut_settings>, 'WOMBAT': <function extract_params>, 'i2MassChroQ': <function extract_params>, 'quantms': <function extract_params>}

add_current_data_point(current_datapoint: Series, all_datapoints: DataFrame | None = None) → DataFrame

Add the current data point to the previous data points. If no previous data points are passed, they are loaded from the GitHub repo.

Parameters:
  • current_datapoint (pd.Series) – The current data point to add.

  • all_datapoints (Optional[pd.DataFrame]) – Data points from previous runs. Loaded from GitHub repo if None.

Returns:

A DataFrame with the current data point added.

Return type:

pd.DataFrame
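
The append step described above can be sketched with plain pandas. This is a minimal illustration, not the module's actual implementation: the helper name `add_data_point` and the column names `id` and `software_name` are hypothetical, and the case where previous data points are fetched from the GitHub repo is replaced here by simply starting from an empty set.

```python
import pandas as pd


def add_data_point(current, all_datapoints=None):
    """Return a new DataFrame with `current` appended as the last row.

    Hypothetical helper: the real method loads previous data points from a
    GitHub repo when none are passed; this sketch starts from nothing instead.
    """
    # A Series becomes a one-row DataFrame once converted and transposed.
    new_row = current.to_frame().T
    if all_datapoints is None or all_datapoints.empty:
        return new_row.reset_index(drop=True)
    return pd.concat([all_datapoints, new_row], ignore_index=True)


# Illustrative column names only.
previous = pd.DataFrame([{"id": "run_1", "software_name": "Sage"}])
current = pd.Series({"id": "run_2", "software_name": "DIA-NN"})
combined = add_data_point(current, previous)
```

The sketch returns a new DataFrame rather than modifying `previous` in place, which keeps the earlier results reusable if the submission is later discarded.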

benchmarking()

check_new_unique_hash(datapoints: DataFrame) → bool

Check if the new data point has a unique hash.

Parameters:

datapoints (pd.DataFrame) – Data points.

Returns:

Whether the new data point has a unique hash.

Return type:

bool
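
A uniqueness check of this kind can be illustrated with the standard library. The sketch below is an assumption about the general technique, not the module's code: the hash algorithm (SHA-256), the helper names `content_hash` and `is_new_hash`, and the idea of hashing the raw result bytes are all hypothetical.

```python
import hashlib


def content_hash(payload: bytes) -> str:
    """Return a hex digest identifying the submitted content (SHA-256 assumed)."""
    return hashlib.sha256(payload).hexdigest()


def is_new_hash(new_hash, existing_hashes):
    """Return True when `new_hash` does not occur among the existing hashes."""
    return new_hash not in set(existing_hashes)


# Hashes of two earlier (made-up) submissions.
existing = [content_hash(b"result A"), content_hash(b"result B")]

duplicate = is_new_hash(content_hash(b"result A"), existing)
fresh = is_new_hash(content_hash(b"result C"), existing)
```

Here `duplicate` is False because the same content was already submitted, while `fresh` is True.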

clone_pr(temporary_datapoints: DataFrame, datapoint_params: Any, remote_git: str, submission_comments: str = 'no comments', submission_source: str = 'unknown') → str

Clone the repo and open a pull request with the new data points.

Parameters:
  • temporary_datapoints (pd.DataFrame) – Temporary data points.

  • datapoint_params (Any) – Data point parameters.

  • remote_git (str) – Remote Git repository URL.

  • submission_comments (str, optional) – Comments to be included in the pull request. Defaults to “no comments”.

  • submission_source (str, optional) – Origin of the submission: ‘web-server’, ‘local’, or ‘resubmission-script’. Defaults to ‘unknown’.

Returns:

The URL of the created pull request.

Return type:

str

get_plot_generator() → PlotGeneratorBase

Get the plot generator for entrapment plots.

Returns:

The plot generator instance.

Return type:

PlotGeneratorBase

is_implemented() → bool

Return whether the module is fully implemented.

Returns:

Always returns True in this implementation.

Return type:

bool

load_params_file(input_file: List[str], input_format: str, json_file: str) → ProteoBenchParameters

Load parameters from a metadata file depending on its format.

Parameters:
  • input_file (List[str]) – Path(s) to the metadata file(s).

  • input_format (str) – Format of the metadata file.

  • json_file (str) – Path to the JSON file containing additional module specific parameters.

Returns:

The parameters for the module.

Return type:

ProteoBenchParameters
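
Format-dependent loading of this kind is commonly built on a dispatch dictionary such as EXTRACT_PARAMS_DICT, which maps a tool name to a parser function. The following is a minimal sketch of that pattern under stated assumptions: the parsers, the tool names `ToolA` and `ToolB`, and the choice to pass file contents as text instead of paths are all hypothetical and chosen only to keep the example self-contained.

```python
from typing import Any, Callable, Dict


def parse_equals(text: str) -> Dict[str, Any]:
    """Hypothetical parser for 'key = value' settings files."""
    params = {}
    for line in text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            params[key.strip()] = value.strip()
    return params


def parse_colon(text: str) -> Dict[str, Any]:
    """Hypothetical parser for 'key: value' settings files."""
    params = {}
    for line in text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            params[key.strip()] = value.strip()
    return params


# Dispatch table: input format name -> parser function (names are made up).
PARSERS: Dict[str, Callable[[str], Dict[str, Any]]] = {
    "ToolA": parse_equals,
    "ToolB": parse_colon,
}


def load_params(text: str, input_format: str) -> Dict[str, Any]:
    """Pick the parser registered for `input_format` and apply it."""
    try:
        parser = PARSERS[input_format]
    except KeyError:
        raise ValueError(f"Unsupported input format: {input_format!r}") from None
    return parser(text)


params = load_params("fdr = 0.01\nenzyme = Trypsin", "ToolA")
```

With this layout, supporting another tool only requires registering one more entry in the dictionary; the loading function itself stays unchanged.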

obtain_all_data_points(all_datapoints: DataFrame | None = None) → DataFrame

Return all data points. If none are passed, they are loaded from the GitHub repo.

Parameters:

all_datapoints (Optional[pd.DataFrame]) – All data points. Loaded from the GitHub repo if None.

Returns:

A DataFrame containing all data points.

Return type:

pd.DataFrame

write_intermediate_raw(directory: str, ident: str, input_file_obj: Any, result_performance: DataFrame, param_loc: List[Any], comment: str, extension_input_file: str = '.txt', extension_input_parameter_file: str = '.txt', input_file_secondary_obj: Any = None) → None

Write intermediate and raw data to a directory in zipped form.

Parameters:
  • directory (str) – Directory to write to.

  • ident (str) – Identifier to create a subdirectory for this submission.

  • input_file_obj (Any) – File-like object representing the raw input file.

  • result_performance (pd.DataFrame) – The result performance DataFrame (intermediate data).

  • param_loc (List[Any]) – List of paths to parameter files that need to be copied.

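  • comment (str) – User comment for the submission.

  • extension_input_file (str, optional) – File extension of the input file. Defaults to ‘.txt’.

  • extension_input_parameter_file (str, optional) – File extension of the input parameter file. Defaults to ‘.txt’.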

  • input_file_secondary_obj (Any, optional) – File-like object representing a secondary input file (e.g., for AlphaDIA).
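
Writing a submission into a per-identifier subdirectory in zipped form can be sketched with the standard library alone. This is an illustration under assumptions, not the module's actual behavior: the helper name `write_zipped`, the archive name `submission.zip`, and the member names inside the archive are hypothetical, and only the input file and the comment are stored.

```python
import io
import os
import tempfile
import zipfile


def write_zipped(directory, ident, input_file_obj, comment, extension=".txt"):
    """Write the input file and the comment into <directory>/<ident>/submission.zip."""
    target_dir = os.path.join(directory, ident)
    os.makedirs(target_dir, exist_ok=True)
    zip_path = os.path.join(target_dir, "submission.zip")  # hypothetical archive name
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # Store the raw input under a fixed name with the requested extension.
        archive.writestr("input_file" + extension, input_file_obj.read())
        archive.writestr("comment.txt", comment)
    return zip_path


with tempfile.TemporaryDirectory() as tmp:
    path = write_zipped(tmp, "abc123", io.StringIO("peptide\tintensity\n"), "test run")
    with zipfile.ZipFile(path) as archive:
        names = sorted(archive.namelist())
```

After the run, `names` lists the two archive members, `comment.txt` and `input_file.txt`.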

write_json_local_development(temporary_datapoints: DataFrame, datapoint_params: dict) → str

Write the datapoints to a JSON file for local development.

Parameters:
  • temporary_datapoints (pd.DataFrame) – Temporary data points.

  • datapoint_params (dict) – Data point parameters.

Returns:

The path to the written JSON file.

Return type:

str
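
For local development, dumping data points to a JSON file and returning its path might look like the sketch below. The helper name `write_datapoints_json`, the file name `datapoints.json`, and the JSON layout are assumptions made for illustration; the data points are given as a list of plain dictionaries in place of a DataFrame.

```python
import json
import os
import tempfile


def write_datapoints_json(records, datapoint_params, directory):
    """Write records and parameters to a JSON file and return its path."""
    path = os.path.join(directory, "datapoints.json")  # hypothetical file name
    payload = {"datapoints": records, "params": datapoint_params}
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(payload, handle, indent=2)
    return path


with tempfile.TemporaryDirectory() as tmp:
    json_path = write_datapoints_json([{"id": "run_1"}], {"fdr": 0.01}, tmp)
    # Read the file back to confirm the round trip.
    with open(json_path, encoding="utf-8") as handle:
        loaded = json.load(handle)
```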