proteobench.io.parsing.parse_settings module
All formats available for the module.
- class proteobench.io.parsing.parse_settings.ParseModificationSettings(parse_settings: Dict[str, Any])
Bases: object
Settings for parsing modifications in protein data.
- Parameters:
parse_settings (Dict[str, Any]) – Dictionary containing modification-specific parsing settings.
- class proteobench.io.parsing.parse_settings.ParseSettingsBuilder(parse_settings_dir: str, module_id: str)
Bases: object
Class to build the parser settings for a given input format.
- Parameters:
parse_settings_dir (str)
module_id (str)
- build_parser(input_format: str) → object
Build the parser for a given input format using the corresponding TOML files.
- Parameters:
input_format (str) – The input format to build the parser for (e.g., “MaxQuant”, “Sage”).
- Returns:
The parser for the specified input format.
- Return type:
ParseSettings
- class proteobench.io.parsing.parse_settings.ParseSettingsDeNovo(parse_settings: Dict[str, Any], parse_settings_module: Dict[str, Any])
Bases: object
- add_modification_parser(parser: ParseModificationSettings)
- convert_to_standard_format(df: DataFrame) → tuple[DataFrame, Dict[int, List[str]]]
Convert a software tool output into a generic format supported by the module.
- format_scores(aa_scores: Any, peptidoform: Peptidoform, fix_aa_length=False) → List[float]
Format the amino acid scores into a list of float numbers.
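As a rough illustration of turning per-residue scores into a list of floats, the sketch below accepts either a comma-separated string or an iterable of numbers. It is not the library's method: format_scores_sketch is a hypothetical function, the integer n_residues stands in for the Peptidoform argument, and the padding and truncation behaviour tied to fix_aa_length is an assumption, since the source does not describe what that flag does.

```python
from typing import Any, List


def format_scores_sketch(aa_scores: Any, n_residues: int, fix_aa_length: bool = False) -> List[float]:
    """Normalise amino acid scores to a list of floats (hypothetical sketch)."""
    if isinstance(aa_scores, str):
        # Comma-separated text such as "0.9,0.8,0.7"; empty fields are skipped.
        scores = [float(s) for s in aa_scores.split(",") if s.strip()]
    else:
        # Any iterable of numbers or numeric strings.
        scores = [float(s) for s in aa_scores]

    if fix_aa_length:
        # Assumption: force exactly one score per residue by truncating
        # extra scores or padding missing ones with 0.0.
        scores = scores[:n_residues] + [0.0] * max(0, n_residues - len(scores))
    return scores


from_text = format_scores_sketch("0.9,0.8,0.7", n_residues=3)
padded = format_scores_sketch([1, 0.5], n_residues=3, fix_aa_length=True)
```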
- class proteobench.io.parsing.parse_settings.ParseSettingsQuant(parse_settings: Dict[str, Any], parse_settings_module: Dict[str, Any])
Bases: object
Structure that contains all the parameters used to parse the given benchmark run output, depending on the software tool used.
- Parameters:
parse_settings (Dict[str, Any])
parse_settings_module (Dict[str, Any])
- add_modification_parser(parser: ParseModificationSettings)
Add a modification parser to the settings.
- Parameters:
parser (object) – The modification parser to add.
- convert_to_standard_format(df: DataFrame) → tuple[DataFrame, Dict[int, List[str]]]
Convert a software tool output into a generic format supported by the module.
Steps:
1. Validate and rename columns
2. Create replicate mapping
3. Filter decoys
4. Fix column names
5. Mark contaminants
6. Process species information
7. Handle data format (long vs short)
8. Process modifications if needed
9. Filter zero intensities
10. Format based on analysis level
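A few of the steps listed for convert_to_standard_format (validate and rename columns, create the replicate mapping, filter decoys, filter zero intensities) can be sketched as follows. This is a simplified illustration under stated assumptions, not the ProteoBench code: plain lists of dicts stand in for the DataFrame so the example has no third-party dependency, and convert_sketch, column_map, replicate_of, and the generic column names raw_file, intensity, and is_decoy are all hypothetical. The returned mapping is assumed to go from replicate number to raw file names, matching the Dict[int, List[str]] return type.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Dict[str, object]


def convert_sketch(
    rows: List[Row],
    column_map: Dict[str, str],
    replicate_of: Dict[str, int],
) -> Tuple[List[Row], Dict[int, List[str]]]:
    """Apply a subset of the conversion steps to dict-based rows."""
    # Validate that every expected tool-specific column is present,
    # then rename columns to the generic names.
    missing = [col for col in column_map if rows and col not in rows[0]]
    if missing:
        raise KeyError(f"Missing expected columns: {missing}")
    renamed = [{column_map.get(k, k): v for k, v in row.items()} for row in rows]

    # Create the replicate mapping: replicate number -> raw file names.
    replicate_to_raw: Dict[int, List[str]] = defaultdict(list)
    for raw_file, replicate in replicate_of.items():
        replicate_to_raw[replicate].append(raw_file)

    # Filter decoys.
    kept = [row for row in renamed if not row.get("is_decoy", False)]

    # Filter zero intensities.
    kept = [row for row in kept if row.get("intensity", 0) != 0]

    return kept, dict(replicate_to_raw)


rows = [
    {"Raw file": "run_A", "Intensity": 120.0, "Reverse": False},
    {"Raw file": "run_A", "Intensity": 0.0, "Reverse": False},
    {"Raw file": "run_B", "Intensity": 50.0, "Reverse": True},
]
column_map = {"Raw file": "raw_file", "Intensity": "intensity", "Reverse": "is_decoy"}
standard_rows, replicate_mapping = convert_sketch(rows, column_map, {"run_A": 1, "run_B": 2})
```

Of the three example rows, only the first survives: the second has zero intensity and the third is flagged as a decoy.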