graph LR
Core_Data_Management["Core Data Management"]
Data_Preparation_Annotation["Data Preparation & Annotation"]
Model_Core_Metrics["Model Core & Metrics"]
Data_Tensorization["Data Tensorization"]
Output_Conversion_Services["Output & Conversion Services"]
Output_Conversion_Services -- "orchestrates" --> Data_Tensorization
Output_Conversion_Services -- "uses" --> Data_Tensorization
Output_Conversion_Services -- "manages" --> Model_Core_Metrics
Output_Conversion_Services -- "orchestrates" --> Core_Data_Management
Output_Conversion_Services -- "processes data from" --> Core_Data_Management
Output_Conversion_Services -- "leverages" --> Data_Preparation_Annotation
Data_Tensorization -- "prepares data for" --> Model_Core_Metrics
Data_Tensorization -- "utilizes" --> Data_Preparation_Annotation
Data_Tensorization -- "depends on" --> Data_Preparation_Annotation
Data_Tensorization -- "uses" --> Core_Data_Management
Model_Core_Metrics -- "uses" --> Core_Data_Management
Model_Core_Metrics -- "evaluates" --> Data_Preparation_Annotation
Model_Core_Metrics -- "integrates with" --> Data_Preparation_Annotation
click Core_Data_Management href "./Core Data Management.md" "Details"
click Data_Preparation_Annotation href "./Data Preparation & Annotation.md" "Details"
click Model_Core_Metrics href "./Model Core & Metrics.md" "Details"
click Data_Tensorization href "./Data Tensorization.md" "Details"
click Output_Conversion_Services href "./Output & Conversion Services.md" "Details"
CoSpred is a software system that predicts mass spectra of peptides. It reads raw input data, annotates peptides, converts the data to tensors, runs a deep learning model to predict spectra, and converts the results into standardized output formats. It also exposes a service API for external access.
Core Data Management handles data input and output (reading and writing HDF5, Arrow, and CSV) and provides utility functions for data validation, sequence manipulation, and array reshaping.
Related Classes/Methods:
- `CoSpred.prosit_model.io_local:get_array` (13:15)
- `CoSpred.prosit_model.io_local:to_hdf5` (26:35)
- `CoSpred.prosit_model.io_local:from_hdf5_to_transformer` (52:116)
- `CoSpred.prosit_model.io_local:from_hdf5` (119:176)
- `CoSpred.prosit_model.io_local:pdfile_to_arrow` (179:188)
- `CoSpred.prosit_model.io_local:to_arrow` (207:237)
- `CoSpred.prosit_model.io_local:from_arrow` (240:283)
- `CoSpred.prosit_model.utils:check_mandatory_keys` (4:8)
- `CoSpred.prosit_model.utils:reshape_dims` (11:12)
- `CoSpred.prosit_model.utils:get_sequence` (15:17)
- `CoSpred.prosit_model.utils:sequence_integer_to_str` (20:22)
- `CoSpred.prosit_model.utils:peptide_parser` (25:39)
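
The validate-on-read role described for this component can be sketched as follows. This is a minimal illustration using only the standard library, not CoSpred's own code: the column names in `REQUIRED_COLUMNS` and the helpers `check_required_keys` and `read_peptide_csv` are hypothetical stand-ins, and the real input schema may differ.

```python
import csv
import io

# Assumed column names for illustration; the real CoSpred schema may differ.
REQUIRED_COLUMNS = ("modified_sequence", "precursor_charge", "collision_energy")


def check_required_keys(record, required=REQUIRED_COLUMNS):
    """Raise KeyError naming every required key that is missing from record."""
    missing = [key for key in required if key not in record]
    if missing:
        raise KeyError(f"missing mandatory keys: {missing}")


def read_peptide_csv(handle):
    """Read rows from a CSV file handle, validate them, and convert numeric fields."""
    rows = []
    for row in csv.DictReader(handle):
        check_required_keys(row)
        rows.append({
            "modified_sequence": row["modified_sequence"],
            "precursor_charge": int(row["precursor_charge"]),
            "collision_energy": float(row["collision_energy"]),
        })
    return rows


# An in-memory file stands in for a CSV on disk.
csv_text = "modified_sequence,precursor_charge,collision_energy\nPEPTIDEK,2,30\n"
rows = read_peptide_csv(io.StringIO(csv_text))
print(rows)
```

Validating before conversion means a missing column is reported by name rather than surfacing later as an unrelated error.
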
Data Preparation & Annotation sanitizes, transforms, and annotates raw input data. This covers base-peak normalization, masking, annotation of peptide fragments with m/z values, and tolerance-based matching.
Related Classes/Methods:
- `CoSpred.prosit_model.sanitize:reshape_dims` (7:19)
- `CoSpred.prosit_model.sanitize:reshape_flat` (22:25)
- `CoSpred.prosit_model.sanitize:normalize_base_peak` (28:35)
- `CoSpred.prosit_model.sanitize:mask_outofrange` (38:42)
- `CoSpred.prosit_model.sanitize:cap` (45:46)
- `CoSpred.prosit_model.sanitize:mask_outofcharge` (49:54)
- `CoSpred.prosit_model.sanitize:get_spectral_angle` (57:81)
- `CoSpred.prosit_model.sanitize:prediction` (84:106)
- `CoSpred.prosit_model.annotate:adjust_masses` (7:14)
- `CoSpred.prosit_model.annotate:get_mz` (17:18)
- `CoSpred.prosit_model.annotate:get_mzs` (21:22)
- `CoSpred.prosit_model.annotate:get_annotation` (25:43)
- `CoSpred.prosit_model.match:read_attribute` (7:11)
- `CoSpred.prosit_model.match:peptide_parser` (14:27)
- `CoSpred.prosit_model.match:get_forward_backward` (30:35)
- `CoSpred.prosit_model.match:get_tolerance` (38:48)
- `CoSpred.prosit_model.match:is_in_tolerance` (51:55)
- `CoSpred.prosit_model.match:binarysearch` (58:68)
- `CoSpred.prosit_model.match:match` (71:96)
- `CoSpred.prosit_model.match:c_lambda` (99:112)
- `CoSpred.prosit_model.match:augment` (115:135)
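
Two of the operations named for this component, base-peak normalization and matching a theoretical fragment to an observed peak within a tolerance, might look roughly like the sketch below. The function names, the 20 ppm default, and the peak values are assumptions made for illustration, and the binary search uses the standard `bisect` module rather than CoSpred's own `binarysearch`.

```python
from bisect import bisect_left


def normalize_to_base_peak(intensities):
    """Scale intensities so the largest (base) peak equals 1.0."""
    base = max(intensities)
    if base <= 0:
        return [0.0 for _ in intensities]
    return [value / base for value in intensities]


def match_peak(sorted_mzs, target_mz, tolerance_ppm=20.0):
    """Return the index of the observed peak closest to target_mz, or None
    if no peak lies within the ppm tolerance. sorted_mzs must be ascending."""
    if not sorted_mzs:
        return None
    pos = bisect_left(sorted_mzs, target_mz)
    # Only the neighbours on either side of the insertion point can be closest.
    candidates = [i for i in (pos - 1, pos) if 0 <= i < len(sorted_mzs)]
    best = min(candidates, key=lambda i: abs(sorted_mzs[i] - target_mz))
    tolerance = target_mz * tolerance_ppm / 1e6  # ppm converted to absolute m/z
    return best if abs(sorted_mzs[best] - target_mz) <= tolerance else None


# Illustrative values only, not real measurements.
observed_mzs = [175.119, 262.151, 375.235]
observed_intensities = [200.0, 1000.0, 500.0]

normalized = normalize_to_base_peak(observed_intensities)
hit = match_peak(observed_mzs, 262.150)   # within 20 ppm of the second peak
miss = match_peak(observed_mzs, 300.000)  # no observed peak nearby
print(normalized, hit, miss)
```

Binary search keeps each lookup logarithmic in the number of observed peaks, which matters when every theoretical fragment of every peptide is matched against a spectrum.
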
Model Core & Metrics manages the lifecycle of the prediction model, including loading pre-trained weights and custom layers, and computes performance metrics such as spectral distance, Pearson correlation, and cosine similarity to evaluate prediction quality.
Related Classes/Methods:
- `CoSpred.prosit_model.model:is_weight_name` (14:15)
- `CoSpred.prosit_model.model:get_loss` (18:19)
- `CoSpred.prosit_model.model:get_best_weights_path` (22:29)
- `CoSpred.prosit_model.model:load` (32:46)
- `CoSpred.prosit_model.model:save` (49:56)
- `CoSpred.prosit_model.layers.CustomAttention:call` (65:78)
- `CoSpred.prosit_model.layers.dot_product` (full file reference)
- `CoSpred.prosit_model.metrics.CustomMetric:__init__` (6:8)
- `CoSpred.prosit_model.metrics.CustomMetric:update_state` (10:11)
- `CoSpred.prosit_model.metrics.CustomMetric:result` (15:16)
- `CoSpred.prosit_model.metrics.CustomMetric:reset_state` (18:19)
- `CoSpred.prosit_model.metrics.spectral_distance` (22:27)
- `CoSpred.prosit_model.metrics.masked_spectral_distance` (30:38)
- `CoSpred.prosit_model.metrics.pearson_corr` (41:46)
- `CoSpred.prosit_model.metrics.cos_sim` (49:53)
- `CoSpred.prosit_model.metrics.binarize` (56:65)
- `CoSpred.prosit_model.metrics.ComputeMetrics:__init__` (69:72)
- `CoSpred.prosit_model.metrics.ComputeMetrics:update_state` (74:135)
- `CoSpred.prosit_model.metrics.ComputeMetrics:return_metrics` (137:162)
- `CoSpred.prosit_model.metrics.ComputeMetrics:result` (164:168)
- `CoSpred.prosit_model.losses.masked_spectral_distance` (5:17)
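
To make the metric side of this component concrete, here is a plain-Python sketch of a masked spectral distance: the angle between two intensity vectors, scaled to the range 0 to 1, computed only over positions that are not masked. It assumes that masked positions are marked with -1 in the reference spectrum, a convention common in Prosit-style models but not confirmed here for CoSpred. The actual implementation operates on tensors and may differ in detail.

```python
import math


def masked_spectral_distance(y_true, y_pred, mask_value=-1.0):
    """Normalized spectral angle over unmasked positions.

    Returns 0 for spectra pointing in the same direction and 1 for
    orthogonal spectra. Positions where y_true equals mask_value
    (an assumed convention) are ignored.
    """
    pairs = [(t, p) for t, p in zip(y_true, y_pred) if t != mask_value]
    norm_true = math.sqrt(sum(t * t for t, _ in pairs))
    norm_pred = math.sqrt(sum(p * p for _, p in pairs))
    if norm_true == 0.0 or norm_pred == 0.0:
        return 1.0  # no usable signal: treat as maximally distant
    cosine = sum(t * p for t, p in pairs) / (norm_true * norm_pred)
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding error
    return 2.0 * math.acos(cosine) / math.pi


# Illustrative vectors; the third position is masked in the reference.
true_spectrum = [1.0, 0.5, -1.0, 0.0]
same = masked_spectral_distance(true_spectrum, [1.0, 0.5, 0.9, 0.0])
orthogonal = masked_spectral_distance([1.0, 0.0], [0.0, 1.0])
print(same, orthogonal)
```

Because the masked position is dropped before the comparison, the predicted value of 0.9 there does not affect the score.
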
Data Tensorization converts processed data into numerical tensors for the neural network model by encoding peptide sequences as integers, one-hot encoding precursor charges, and applying m/z values.
Related Classes/Methods:
- `CoSpred.prosit_model.tensorize:stack` (19:31)
- `CoSpred.prosit_model.tensorize:get_numbers` (34:36)
- `CoSpred.prosit_model.tensorize:get_precursor_charge_onehot` (39:43)
- `CoSpred.prosit_model.tensorize:get_sequence_integer` (46:55)
- `CoSpred.prosit_model.tensorize:parse_ion` (58:65)
- `CoSpred.prosit_model.tensorize:get_mz_applied` (68:86)
- `CoSpred.prosit_model.tensorize:csv` (89:120)
- `CoSpred.prosit_model.tensorize:hdf5` (123:138)
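
The two encodings mentioned for this component can be illustrated with a short sketch. The alphabet mapping, the padding length of 30, and the maximum charge of 6 are assumptions chosen for the example; CoSpred's real constants and its handling of modified residues may differ.

```python
# Assumed alphabet: 0 is padding, 1..20 are the standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
ALPHABET = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}
MAX_SEQUENCE_LENGTH = 30  # assumed padding length
MAX_CHARGE = 6            # assumed highest supported precursor charge


def encode_sequence(sequence, max_len=MAX_SEQUENCE_LENGTH):
    """Map residues to integer codes and right-pad with zeros to max_len."""
    if len(sequence) > max_len:
        raise ValueError(f"sequence longer than {max_len} residues")
    codes = [ALPHABET[aa] for aa in sequence]
    return codes + [0] * (max_len - len(codes))


def one_hot_charge(charge, max_charge=MAX_CHARGE):
    """Return a one-hot list with a 1 at position charge - 1."""
    if not 1 <= charge <= max_charge:
        raise ValueError(f"charge must be between 1 and {max_charge}")
    return [1 if i == charge - 1 else 0 for i in range(max_charge)]


encoded = encode_sequence("PEPTIDE", max_len=10)
charge_vector = one_hot_charge(2)
print(encoded)
print(charge_vector)
```

Fixed-length padding lets peptides of different lengths be stacked into a single rectangular array, which is what a batched neural network input requires.
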
Output & Conversion Services provides the core prediction service API, orchestrates the data flow from input to output, and converts prediction results into standardized formats (e.g., DIANN, MSP, MGF, CSV, MaxQuant).
Related Classes/Methods:
- `CoSpred.prosit_model.msp_parser:from_msp_prosit` (22:68)
- `CoSpred.prosit_model.msp_parser:from_msp_propel` (71:131)
- `CoSpred.prosit_model.msp_parser:dict2mgf` (135:156)
- `CoSpred.prosit_model.msp_parser:copy_from_msp` (160:189)
- `CoSpred.prosit_model.msp_parser:msp2mgf` (192:283)
- `CoSpred.prosit_model.msp_parser:sampling_peptidelist` (286:305)
- `CoSpred.prosit_model.msp_parser:main` (308:337)
- `CoSpred.prosit_model.converters.diannoutput:rename_column` (12:15)
- `CoSpred.prosit_model.converters.diannoutput:read` (18:59)
- `CoSpred.prosit_model.converters.diannoutput:write` (62:63)
- `CoSpred.prosit_model.converters.diannoutput:convert_prediction` (66:131)
- `CoSpred.prosit_model.converters.diannoutput.createLongFileFormat:__init__` (138:148)
- `CoSpred.prosit_model.converters.msp:generate_mod_strings` (156:186)
- `CoSpred.prosit_model.converters.msp.Converter:convert` (195:255)
- `CoSpred.prosit_model.converters.msp.Spectrum:__init__` (259:288)
- `CoSpred.prosit_model.converters.msp.Spectrum:__str__` (290:309)
- `CoSpred.prosit_model.converters.msp.get_ions` (71:84)
- `CoSpred.prosit_model.converters.msp.generate_mods_string_tuples` (104:125)
- `CoSpred.prosit_model.converters.msp.preprocess_sequence` (9:18)
- `CoSpred.prosit_model.converters.msp.plot_sequence` (21:44)
- `CoSpred.prosit_model.converters.generic:convert_spectrum` (34:69)
- `CoSpred.prosit_model.converters.generic.Converter:fill_queue` (88:98)
- `CoSpred.prosit_model.converters.generic.Converter:to_csv` (108:115)
- `CoSpred.prosit_model.converters.generic.Converter:convert` (117:123)
- `CoSpred.prosit_model.converters.generic.Converter.batch` (80:83)
- `CoSpred.prosit_model.converters.generic.Converter.slice_data` (85:86)
- `CoSpred.prosit_model.converters.generic.Converter.get_converted` (100:106)
- `CoSpred.prosit_model.converters.maxquant:convert_prediction` (67:119)
- `CoSpred.prosit_model.converters.maxquant:write` (63:64)
- `CoSpred.prosit_model.server:hello` (24:25)
- `CoSpred.prosit_model.server:predict` (28:33)
- `CoSpred.prosit_model.server:return_generic` (37:48)
- `CoSpred.prosit_model.server:return_msp` (52:63)
- `CoSpred.prosit_model.server:return_msms` (67:78)
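
As a rough picture of what format conversion involves, the sketch below writes one predicted spectrum as a simplified MSP-style text record. The function `to_msp_record` and the peak values are hypothetical, and the record carries only a name line, a peak count, and the peak list. CoSpred's MSP converter also handles details such as modification strings that are left out here.

```python
def to_msp_record(sequence, charge, peaks, min_intensity=0.0):
    """Format one predicted spectrum as a simplified MSP-style text record.

    peaks is an iterable of (mz, intensity, annotation) tuples. Peaks at or
    below min_intensity are dropped; the rest are written in ascending m/z.
    """
    kept = sorted(p for p in peaks if p[1] > min_intensity)
    lines = [f"Name: {sequence}/{charge}", f"Num peaks: {len(kept)}"]
    for mz, intensity, annotation in kept:
        lines.append(f'{mz:.4f}\t{intensity:.4f}\t"{annotation}"')
    return "\n".join(lines) + "\n"


# Illustrative values only, not real predictions.
example_peaks = [
    (262.1510, 0.35, "y2"),
    (175.1190, 1.0, "y1"),
    (98.0600, 0.0, "b1"),  # zero intensity, so it is filtered out
]
record = to_msp_record("PEPTIDEK", 2, example_peaks)
print(record)
```

Dropping zero-intensity peaks before writing keeps the record small, since a model typically emits a value for every possible fragment whether or not it is expected to appear.
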