proteobench.io.params.diann module

DIA-NN parameter parsing.

proteobench.io.params.diann.extract_cfg_parameter(lines: List[str], regex: str, cast_type: type = str, default=None, search_all=False) -> Any

Extract and cast a parameter using a regex pattern.
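The extract-then-cast pattern can be sketched as follows. This is a hypothetical stand-in, not the ProteoBench implementation: the function name, the use of the first capture group, the list returned when search_all is true, and the sample log lines are all assumptions made for illustration.

```python
import re
from typing import Any, List


def extract_cfg_parameter_sketch(
    lines: List[str],
    regex: str,
    cast_type: type = str,
    default: Any = None,
    search_all: bool = False,
) -> Any:
    """Hypothetical sketch of regex extraction followed by a type cast."""
    pattern = re.compile(regex)
    matches = []
    for line in lines:
        match = pattern.search(line)
        if match:
            # Assumption: the value of interest is the first capture group.
            value = cast_type(match.group(1))
            if not search_all:
                return value  # first hit wins
            matches.append(value)
    # Assumption: with search_all, every hit is returned as a list.
    return matches if matches else default


# Made-up lines for illustration; real DIA-NN log text may differ.
log_lines = ["Threads set to 8", "Scan window radius set to 6"]
threads = extract_cfg_parameter_sketch(log_lines, r"Threads set to (\d+)", cast_type=int)
missing = extract_cfg_parameter_sketch(log_lines, r"Precursor FDR set to (\S+)", default="n/a")
```

Here the first call yields an integer because of cast_type, and the second falls back to the default because no line matches.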

proteobench.io.params.diann.extract_modifications(lines: List[str], regexes: List[str]) -> str | None

Extract and join modifications from a list of regexes.

proteobench.io.params.diann.extract_params(fname: str, json_file='/home/docs/checkouts/readthedocs.org/user_builds/proteobench/envs/v0.12.1/lib/python3.11/site-packages/proteobench/io/params/json/Quant/quant_lfq_DIA_ion.json') -> ProteoBenchParameters

Parse DIA-NN log file and extract relevant parameters.

Logic:

  1. Read the log file and extract the software version.

  2. Find the command-line string that was used to run DIA-NN.

  3. Parse the command-line string to extract settings. Default values are set for parameters that are not specified on the command line.

  4. If the --cfg flag is used (meaning a configuration file was used), the parameters are parsed from the free text underneath the command line.

Parameters:

fname (str) – Path to the parameter file, i.e. the DIA-NN log file to parse.

Returns:

The parsed ProteoBenchParameters object.

Return type:

ProteoBenchParameters
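The four-step logic of extract_params can be illustrated with a toy pipeline. This is a minimal sketch under stated assumptions, not the library code: the function name, the result keys, the placeholder default of 7, the regexes, and the sample log lines are all hypothetical, and a single setting stands in for the full parameter set.

```python
import re
from typing import Dict, List, Optional


def sketch_extract_params(lines: List[str]) -> Dict[str, object]:
    """Hypothetical walk-through of the four steps for one setting."""
    # Step 1: software version (assumed to appear as "DIA-NN <version>").
    version: Optional[str] = None
    for line in lines:
        match = re.search(r"DIA-NN\s+(\d+(?:\.\d+)*)", line)
        if match:
            version = match.group(1)
            break

    # Step 2: command line, assumed to be the first line that looks like
    # an executable name containing "diann" followed by a "--" option.
    cmd_index = next(
        (i for i, line in enumerate(lines) if re.search(r"diann\S*\s+--", line)),
        None,
    )

    # Step 3: start from defaults (placeholder value), then override.
    params: Dict[str, object] = {
        "software_version": version,
        "min_peptide_length": 7,
        "used_cfg": False,
    }
    if cmd_index is None:
        return params
    cmd_line = lines[cmd_index]
    match = re.search(r"--min-pep-len\s+(\d+)", cmd_line)
    if match:
        params["min_peptide_length"] = int(match.group(1))

    # Step 4: with --cfg, read settings from the free text below the command.
    if re.search(r"--cfg(\s|$)", cmd_line):
        params["used_cfg"] = True
        for line in lines[cmd_index + 1:]:
            match = re.search(r"Min peptide length set to (\d+)", line)
            if match:
                params["min_peptide_length"] = int(match.group(1))
                break
    return params


# Made-up log excerpt for illustration only.
example_log = [
    "DIA-NN 1.8.1 (Data-Independent Acquisition by Neural Networks)",
    "diann.exe --f run.raw --cfg settings.cfg",
    "Min peptide length set to 9",
]
result = sketch_extract_params(example_log)
```

In this toy log the command line does not carry the setting itself, so the value comes from the free text that follows it.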

proteobench.io.params.diann.extract_with_regex(lines: List[str], regex, search_all=False) -> str

If no mass accuracy was specified in the command-line string, extract it from the log file.

Parameters:
  • lines (list[str]) – All input lines from the DIA-NN log file.

  • regex (str) – The regex pattern to be matched.

Returns:

The MS1 and MS2 mass accuracy specified in ppm.

Return type:

str

proteobench.io.params.diann.find_cmdline_string(lines: List[str]) -> str | None

Find the command line statement in the DIA-NN log file.

It is assumed that this statement is stored on a single line.

Parameters:

lines (list[str]) – All input lines from the DIA-NN log file.

Returns:

The command line string.

Return type:

str | None

proteobench.io.params.diann.parse_cmdline_string(cmd_line: str, software_version: str) -> dict

Parse a DIA-NN command line string into a dictionary of settings.

Parameters:
  • cmd_line (str) – The command line string to parse.

  • software_version (str) – The version of the DIA-NN software, e.g., “1.8”.

Returns:

Parsed settings in dictionary format. Keys are setting names, and values are:

  • List of inputs for multi-value settings.

  • Boolean True for flag-like settings (without values).

  • Modified settings for variable and fixed modifications.

Return type:

dict

Raises:

AssertionError – If an unsupported setting format is detected (e.g., unimod with extra arguments).
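The value shapes described above (lists for settings with inputs, True for bare flags) can be sketched with a simple splitter. This is a hypothetical illustration, not the ProteoBench parser: the function name is made up, the software_version argument is left out, the special handling of variable and fixed modifications is omitted, and paths containing spaces are not supported.

```python
from typing import Dict, List, Union


def parse_cmdline_sketch(cmd_line: str) -> Dict[str, Union[bool, List[str]]]:
    """Hypothetical sketch: split a command line into {setting: values}."""
    settings: Dict[str, Union[bool, List[str]]] = {}
    # Each chunk after " --" is "<name> [value ...]"; the first chunk
    # (the executable) is dropped.
    for chunk in cmd_line.split(" --")[1:]:
        parts = chunk.split()
        if not parts:
            continue
        name, values = parts[0], parts[1:]
        if not values:
            # Flag-like setting without values.
            settings[name] = True
            continue
        existing = settings.get(name)
        if isinstance(existing, list):
            # Repeated setting: collect all inputs in one list.
            existing.extend(values)
        else:
            settings[name] = list(values)
    return settings


parsed = parse_cmdline_sketch(
    "diann.exe --f run1.raw --f run2.raw --threads 4 --reanalyse"
)
```

Values stay as strings in this sketch; casting to numbers would be a separate step.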

proteobench.io.params.diann.parse_predictors_library(cmdline_dict: dict)

Parse the spectral library predictors from the parsed execution command string.

For now, only ‘DIANN’ and ‘User defined speclib’ are supported. In the future, the user might specify which algorithm was used for library generation.

Parameters:

cmdline_dict (dict) – Parsed execution command string.

Returns:

Dictionary specifying algorithm name for RT, IM and MS2_int.

Return type:

dict

proteobench.io.params.diann.parse_protein_inference_method(cmdline_dict: dict) -> str

Parse the protein inference method from the parsed execution command string.

This setting is defined by disparate setting tags, namely:

  • no-prot-inf: No protein inference

  • pg-level: Code specifies inference method

Parameters:

cmdline_dict (dict) – Parsed execution command string.

Returns:

The protein inference method. Possibilities:

  • Disabled

  • Isoforms

  • Protein_names

  • Genes

Return type:

str
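How the two tags might map to the four possible return values is sketched below. This is not the library code. The code-to-method table follows the pg-level convention as commonly described for DIA-NN (0 for isoforms, 1 for protein names, 2 for genes) and should be treated as an assumption, as should the "Genes" fallback when neither tag is present and the precedence of no-prot-inf.

```python
from typing import Dict

# Assumption: meaning of the pg-level codes; verify against the DIA-NN docs.
PG_LEVEL_TO_METHOD: Dict[str, str] = {
    "0": "Isoforms",
    "1": "Protein_names",
    "2": "Genes",
}


def protein_inference_sketch(cmdline_dict: dict) -> str:
    """Hypothetical sketch of resolving the protein inference method."""
    # Assumption: no-prot-inf takes precedence over pg-level.
    if "no-prot-inf" in cmdline_dict:
        return "Disabled"
    level = cmdline_dict.get("pg-level")
    if level is None:
        # Assumption: fallback when no tag is given.
        return "Genes"
    # Parsed values may arrive as a list of strings or as a scalar.
    code = str(level[0]) if isinstance(level, list) else str(level)
    return PG_LEVEL_TO_METHOD[code]
```

For example, a parsed command containing only pg-level with the value "1" resolves to Protein_names under these assumptions.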

proteobench.io.params.diann.parse_quantification_strategy(cmdline_dict: dict)

Parse the quantification method from the parsed execution command string.

This setting is defined by disparate setting tags, namely:

  • direct-quant: Use legacy quantification within DIA-NN

  • high-acc: QuantUMS high-accuracy setting

  • no tag: Default is QuantUMS high-precision

Parameters:

cmdline_dict (dict) – Parsed execution command string.

Returns:

The quantification method. Possibilities:

  • Legacy

  • QuantUMS high-accuracy

  • QuantUMS high-precision

Return type:

str
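The tag-to-strategy mapping listed above can be written as a short decision chain. This is a minimal sketch with a hypothetical function name, not the ProteoBench source; the precedence of direct-quant over high-acc when both tags are present is an assumption.

```python
def quantification_strategy_sketch(cmdline_dict: dict) -> str:
    """Hypothetical sketch of resolving the quantification strategy."""
    # Assumption: direct-quant wins if both tags are present.
    if "direct-quant" in cmdline_dict:
        return "Legacy"
    if "high-acc" in cmdline_dict:
        return "QuantUMS high-accuracy"
    # No tag: the documented default.
    return "QuantUMS high-precision"


strategy = quantification_strategy_sketch({"f": ["run.raw"], "high-acc": True})
```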

proteobench.io.params.diann.parse_setting(setting_name: str, setting_list: list) -> Any

Parse individual settings based on their setting type.

Parameters:
  • setting_name (str) – The name of the setting (ProteoBench).

  • setting_list (list) – The input value of a given setting.

Returns:

The parsed setting.

Return type:

Any