Template
Template for module
All input formats available for the module
- class proteobench.modules.template.parse_settings.ParseSettings(input_format: str)
Bases: object
Structure that contains all the parameters used to parse the given database search output.
- class proteobench.modules.template.parse.ParseInputs
Bases: ParseInputsInterface
- convert_to_standard_format(parse_settings: ParseSettings) → DataFrame
Convert a search engine output into a generic format supported by the module.
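As a rough illustration of what converting a search engine output into a generic format can involve, the sketch below renames engine-specific columns to generic ones with pandas. It is not the ProteoBench implementation: the function name `convert_to_standard_format_sketch`, its signature (it takes a DataFrame and a mapping rather than a `ParseSettings` object), the column names, and the sample values are all assumptions made for this example.

```python
import pandas as pd

# Hypothetical mapping from one engine's column names to generic names.
COLUMN_MAPPING = {
    "Modified sequence": "sequence",
    "Raw file": "raw_file",
    "Intensity": "intensity",
}


def convert_to_standard_format_sketch(raw: pd.DataFrame, mapping: dict) -> pd.DataFrame:
    """Keep only the mapped columns and rename them to the generic names."""
    missing = [column for column in mapping if column not in raw.columns]
    if missing:
        raise KeyError(f"Input is missing expected columns: {missing}")
    return raw[list(mapping)].rename(columns=mapping)


# Made-up rows standing in for a search engine output table.
raw = pd.DataFrame(
    {
        "Modified sequence": ["PEPTIDEK", "ANOTHERPEPTIDER"],
        "Raw file": ["run_A", "run_B"],
        "Intensity": [1200.5, 830.0],
        "Engine specific score": [0.91, 0.77],  # dropped by the conversion
    }
)
standard = convert_to_standard_format_sketch(raw, COLUMN_MAPPING)
```

Failing early on missing columns makes a wrong `input_format` choice visible at parse time instead of producing a silently incomplete table.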
- class proteobench.modules.template.datapoint.Datapoint(id: str | None = None, is_temporary: bool = True, search_engine: str | None = None, software_version: int = 0, fdr_psm: int = 0, fdr_peptide: int = 0, fdr_protein: int = 0, MBR: bool = False, precursor_tol: int = 0, precursor_tol_unit: str = 'Da', fragment_tol: int = 0, fragment_tol_unit: str = 'Da', enzyme_name: str | None = None, missed_cleavages: int = 0, min_pep_length: int = 0, max_pep_length: int = 0)
Bases: object
Data used to store the experimental metadata and data analysis settings.
- Example for attributes:
  - id: A unique identifier for the datapoint.
  - is_temporary: A boolean flag indicating whether the datapoint is temporary or not.
  - search_engine: The name of the search engine used for the experiment.
  - software_version: The version number of the software used for the experiment.
  - fdr_psm: The false discovery rate at the peptide-spectrum match level.
  - fdr_peptide: The false discovery rate at the peptide level.
  - fdr_protein: The false discovery rate at the protein level.
  - MBR: A boolean flag indicating whether match-between-runs was enabled or not.
  - precursor_tol: The precursor mass tolerance in units specified by precursor_tol_unit.
  - precursor_tol_unit: The unit of the precursor mass tolerance. Either “Da” or “ppm”.
  - fragment_tol: The fragment mass tolerance in units specified by fragment_tol_unit.
  - fragment_tol_unit: The unit of the fragment mass tolerance. Either “Da” or “ppm”.
  - enzyme_name: The name of the enzyme used for digestion.
  - missed_cleavages: The number of allowed missed cleavages during digestion.
  - min_pep_length: The minimum peptide length for identification.
  - max_pep_length: The maximum peptide length for identification.
  - weighted_sum: The weighted sum score used for protein inference.
  - nr_prec: The number of precursors used for protein inference.
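To make the shape of such a metadata record concrete, here is a minimal sketch using a standard-library dataclass that mirrors a few of the documented fields and their defaults. `DatapointSketch` is a hypothetical stand-in, not the ProteoBench `Datapoint` class, and the identifier and engine name are made-up values.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class DatapointSketch:
    """Hypothetical stand-in mirroring a subset of the documented fields."""

    id: Optional[str] = None
    is_temporary: bool = True
    search_engine: Optional[str] = None
    MBR: bool = False
    precursor_tol: int = 0
    precursor_tol_unit: str = "Da"  # documented values: "Da" or "ppm"
    missed_cleavages: int = 0


# Only the settings that differ from the defaults need to be passed.
point = DatapointSketch(
    id="example-run-001",  # made-up identifier
    search_engine="ExampleEngine",  # made-up engine name
    precursor_tol=10,
    precursor_tol_unit="ppm",
)

# asdict() turns the record into a plain dict, ready for serialization.
record = asdict(point)
```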
- calculate_benchmarking_metric_1(intermediate_data)
Calculates the first benchmarking metric based on the intermediate data.
- calculate_benchmarking_metric_2(intermediate_data)
Calculates the second benchmarking metric based on the intermediate data.
- dump_json_object(file_name)
Dumps the datapoint as a JSON object to a file.
- Parameters:
file_name (str) – The name of the file to write to.
Writes a JSON representation of the datapoint to a file with the given name. Appends the JSON object to the end of the file if it already exists.
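The append behaviour described above can be sketched with the standard library alone. This is an illustration under assumptions, not the module's code: `RecordSketch` and `read_records` are hypothetical names, and writing one JSON object per line is a choice made here so that the appended objects can be read back one at a time.

```python
import json
import os
import tempfile
from dataclasses import asdict, dataclass


@dataclass
class RecordSketch:
    """Hypothetical record with a dump method modelled on the description."""

    id: str
    search_engine: str

    def dump_json_object(self, file_name: str) -> None:
        # Mode "a" creates the file if it is missing and appends otherwise.
        with open(file_name, "a", encoding="utf-8") as handle:
            json.dump(asdict(self), handle)
            handle.write("\n")  # assumption: one JSON object per line


def read_records(file_name: str) -> list:
    """Read back every non-empty line as a separate JSON object."""
    with open(file_name, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "datapoints.json")
    RecordSketch("run-1", "EngineA").dump_json_object(path)
    RecordSketch("run-2", "EngineB").dump_json_object(path)  # appended
    records = read_records(path)
```

Note that a file built by repeated appends holds a sequence of JSON objects rather than a single JSON document, so it has to be read object by object instead of with one `json.load` call.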