proteobench.datapoint.denovo_datapoint module#

This module provides functionality for storing de novo benchmark metrics.

class proteobench.datapoint.denovo_datapoint.DenovoDatapoint(id: str = None, software_name: str = None, software_version: int = 0, checkpoint: str = None, n_beams: int = None, n_peaks: int = None, precursor_mass_tolerance: str = None, min_peptide_length: int = 0, max_peptide_length: int = 0, min_mz: int = 0, max_mz: int = 50000, min_intensity: int = 0, max_intensity: int = 1, tokens: str = None, min_precursor_charge: int = 1, max_precursor_charge: int = None, remove_precursor_tol: str = None, isotope_error_range: str = None, decoding_strategy: str = None, is_temporary: bool = True, intermediate_hash: str = '', results: dict = None, precision_peptide: float = 0, precision_aa: float = 0, recall_aa: float = 0, recall_peptide: float = 0, comments: str = '', proteobench_version: str = '')[source]#

Bases: object

A data structure used to store the results of a benchmark run.

id#

Unique identifier for the benchmark run.

Type: str

software_name#

Name of the software used in the benchmark.

Type: str

software_version#

Version of the software.

Type: str

search_engine#

Name of the search engine used.

Type: str

search_engine_version#

Version of the search engine.

Type: str

ident_fdr_psm#

False discovery rate for PSMs.

Type: float

ident_fdr_peptide#

False discovery rate for peptides.

Type: float

ident_fdr_protein#

False discovery rate for proteins.

Type: float

enable_match_between_runs#

Whether matching between runs is enabled.

Type: bool

precursor_mass_tolerance#

Mass tolerance for precursor ions.

Type: str

fragment_mass_tolerance#

Mass tolerance for fragment ions.

Type: str

enzyme#

Enzyme used for digestion.

Type: str

allowed_miscleavages#

Number of allowed miscleavages.

Type: int

min_peptide_length#

Minimum peptide length.

Type: int

max_peptide_length#

Maximum peptide length.

Type: int

is_temporary#

Whether the data is temporary.

Type: bool

intermediate_hash#

Hash of the intermediate result.

Type: str

results#

A dictionary of metrics for the benchmark run.

Type: dict

median_abs_epsilon#

Median absolute epsilon value for the benchmark.

Type: float

mean_abs_epsilon#

Mean absolute epsilon value for the benchmark.

Type: float

nr_prec#

Number of precursors identified.

Type: int

comments#

Any additional comments.

Type: str

proteobench_version#

Version of the Proteobench tool used.

Type: str
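The constructor signature, with a default for every field, reads like a plain record type. As a minimal sketch of that shape, the block below defines a hypothetical `MiniDenovoDatapoint` holding a handful of the fields from the signature and converts it to a dictionary. The class name is invented for illustration, and whether the real `DenovoDatapoint` is implemented as a dataclass is an assumption.

```python
from dataclasses import asdict, dataclass


@dataclass
class MiniDenovoDatapoint:
    """Hypothetical, trimmed-down stand-in for DenovoDatapoint (not the real class)."""

    # Field names and defaults are copied from the documented signature.
    id: str = None
    software_name: str = None
    is_temporary: bool = True
    intermediate_hash: str = ""
    results: dict = None
    precision_peptide: float = 0
    recall_peptide: float = 0
    comments: str = ""


# "ExampleTool" is a placeholder software name, not a real tool.
point = MiniDenovoDatapoint(
    software_name="ExampleTool",
    precision_peptide=0.5,
    recall_peptide=0.25,
)

# A flat dictionary is convenient for serialising one benchmark run.
record = asdict(point)
print(record["software_name"], record["is_temporary"])
```

Fields that are not passed keep their documented defaults, so a partially filled datapoint is still a valid record.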

checkpoint: str = None#

comments: str = ''#

decoding_strategy: str = None#

static evaluate_ptm(mod_label, mod_tag, peptidoform, match_array)[source]#

static generate_datapoint(intermediate: DataFrame, input_format: str, user_input: dict, subset_columns_hash: List[str] = ['spectrum_id', 'peptide_str', 'score'], evaluation_type: str = 'mass') → Series[source]#

Generate a Datapoint object containing metadata and results from the benchmark run.
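The `subset_columns_hash` default and the `intermediate_hash` field together suggest that a hash is computed over a fixed subset of columns of the intermediate table. The sketch below illustrates that idea with the standard library only. The helper `hash_subset`, the use of SHA-1, and the row representation as dictionaries are all assumptions made for illustration, not the library's implementation.

```python
import hashlib


def hash_subset(rows, columns):
    """Hypothetical helper: hash only the named columns of each row, in order."""
    digest = hashlib.sha1()
    for row in rows:
        for col in columns:
            digest.update(repr(row[col]).encode("utf-8"))
            digest.update(b"\x1f")  # field separator, avoids ambiguous concatenation
        digest.update(b"\x1e")  # row separator
    return digest.hexdigest()


# Toy intermediate rows; "note" is an extra column that is not hashed.
rows = [
    {"spectrum_id": "s1", "peptide_str": "PEPTIDE", "score": 0.9, "note": "a"},
    {"spectrum_id": "s2", "peptide_str": "PROTEIN", "score": 0.4, "note": "b"},
]
cols = ["spectrum_id", "peptide_str", "score"]

h1 = hash_subset(rows, cols)

# Changing a column outside the subset leaves the hash unchanged.
rows_other_note = [dict(r, note="changed") for r in rows]
h2 = hash_subset(rows_other_note, cols)

# Changing a hashed column produces a different hash.
rows_other_score = [dict(r, score=r["score"] + 0.1) for r in rows]
h3 = hash_subset(rows_other_score, cols)

print(h1 == h2, h1 == h3)
```

Restricting the hash to a few columns means two uploads with identical predictions map to the same hash even if auxiliary columns differ.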

generate_id() → None[source]#

Generate a unique ID for the benchmark run by combining the software name and a timestamp.

This ID is used to uniquely identify each run of the benchmark.
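As a rough illustration of combining a software name with a timestamp, the sketch below builds such an identifier. The function name `make_run_id`, the underscore separator, and the timestamp format are assumptions; the actual format produced by `generate_id` is not stated here.

```python
from datetime import datetime, timezone


def make_run_id(software_name, now=None):
    """Hypothetical helper: join a software name and a UTC timestamp."""
    now = now or datetime.now(timezone.utc)
    # Microseconds (%f) reduce the chance of two runs sharing an ID.
    return f"{software_name}_{now.strftime('%Y%m%d_%H%M%S_%f')}"


# A fixed timestamp keeps the example reproducible.
fixed = datetime(2024, 1, 2, 3, 4, 5, 6, tzinfo=timezone.utc)
run_id = make_run_id("ExampleTool", fixed)
print(run_id)  # ExampleTool_20240102_030405_000006
```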

get_indepth_metrics(df: DataFrame)[source]#

get_metrics(df: DataFrame, level: str, evaluation: str)[source]#

Compute various statistical metrics from the provided DataFrame for the benchmark.
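The class stores `precision_peptide` and `recall_peptide`, so a peptide-level metric computation might look like the sketch below. It assumes the definitions commonly used in de novo sequencing benchmarks: precision is the number of correct predictions divided by the number of predictions, and recall is the number of correct predictions divided by the total number of spectra. The function name and the input format (a list of booleans) are hypothetical, and the real `get_metrics` operates on a DataFrame.

```python
def peptide_precision_recall(matches, n_spectra):
    """Hypothetical helper.

    matches: one bool per predicted peptide (True means the prediction is correct).
    n_spectra: total number of spectra in the benchmark, predicted or not.
    """
    n_predicted = len(matches)
    n_correct = sum(matches)
    precision = n_correct / n_predicted if n_predicted else 0.0
    recall = n_correct / n_spectra if n_spectra else 0.0
    return precision, recall


# Four predictions, three of them correct, out of eight spectra in total.
precision, recall = peptide_precision_recall([True, False, True, True], n_spectra=8)
print(precision, recall)  # 0.75 0.375
```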

get_ptm_metrics(df: DataFrame)[source]#

get_species_metrics(df: DataFrame)[source]#

get_spectrum_metrics(df: DataFrame)[source]#

id: str = None#

intermediate_hash: str = ''#

is_temporary: bool = True#

isotope_error_range: str = None#

max_intensity: int = 1#

max_mz: int = 50000#

max_peptide_length: int = 0#

max_precursor_charge: int = None#

min_intensity: int = 0#

min_mz: int = 0#

min_peptide_length: int = 0#

min_precursor_charge: int = 1#

n_beams: int = None#

n_peaks: int = None#

precision_aa: float = 0#

precision_peptide: float = 0#

precursor_mass_tolerance: str = None#

proteobench_version: str = ''#

recall_aa: float = 0#

recall_peptide: float = 0#

static record_proportions_to_results_feature(series: Series, counts: dict, min_el: int = 1, max_el: int = 30, all_elements=None) → dict[source]#

remove_precursor_tol: str = None#

results: dict = None#

software_name: str = None#

software_version: int = 0#

tokens: str = None#

proteobench.datapoint.denovo_datapoint.calculate_prc(scores_correct, scores_all, n_spectra, threshold=None)[source]#

proteobench.datapoint.denovo_datapoint.collapse_aa_scores(df: DataFrame, evaluation_type: str)[source]#

proteobench.datapoint.denovo_datapoint.get_prc_curve(t, n_spectra)[source]#
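The names `calculate_prc` and `get_prc_curve` are undocumented here; reading "prc" as "precision-recall curve" is an assumption. Under that reading, the sketch below shows one common way to trace such a curve by sweeping a score threshold. The helper names and the exact definitions are illustrative and may differ from what the library computes.

```python
def precision_recall_at(scores_correct, scores_all, n_spectra, threshold):
    """Hypothetical helper: precision and recall for predictions scoring >= threshold."""
    kept_correct = sum(s >= threshold for s in scores_correct)
    kept_all = sum(s >= threshold for s in scores_all)
    precision = kept_correct / kept_all if kept_all else 0.0
    recall = kept_correct / n_spectra if n_spectra else 0.0
    return precision, recall


def sweep_thresholds(scores_correct, scores_all, n_spectra):
    """Hypothetical helper: one (threshold, precision, recall) point per distinct score."""
    thresholds = sorted(set(scores_all), reverse=True)
    return [
        (t, *precision_recall_at(scores_correct, scores_all, n_spectra, t))
        for t in thresholds
    ]


# Four scored predictions, two of them correct, out of five spectra.
scores_all = [0.9, 0.8, 0.6, 0.3]
scores_correct = [0.9, 0.6]
curve = sweep_thresholds(scores_correct, scores_all, n_spectra=5)
for point in curve:
    print(point)
```

Lowering the threshold keeps more predictions, so recall can only stay flat or rise while precision may move in either direction.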
