graph LR
    Model_Core["Model Core"]
    Training_and_Prediction_Orchestrator["Training and Prediction Orchestrator"]
    Data_Management["Data Management"]
    Evaluation_and_Visualization["Evaluation and Visualization"]
    Model_Interpretation["Model Interpretation"]
    Training_and_Prediction_Orchestrator -- "initializes" --> Model_Core
    Training_and_Prediction_Orchestrator -- "uses" --> Model_Core
    Training_and_Prediction_Orchestrator -- "loads data from" --> Data_Management
    Model_Interpretation -- "extracts data from" --> Data_Management
    Model_Interpretation -- "interacts with" --> Model_Core
    Evaluation_and_Visualization -- "receives predictions from" --> Training_and_Prediction_Orchestrator
    click Model_Core href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/decima/Model Core.md" "Details"
    click Training_and_Prediction_Orchestrator href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/decima/Training and Prediction Orchestrator.md" "Details"
    click Data_Management href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/decima/Data Management.md" "Details"
    click Evaluation_and_Visualization href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/decima/Evaluation and Visualization.md" "Details"
    click Model_Interpretation href "https://github.com/CodeBoarding/GeneratedOnBoardings/blob/main/decima/Model Interpretation.md" "Details"


Component Details

The decima project implements a neural network model for analyzing biological sequence and gene expression data. Its core functionality covers three areas: training a deep learning model, managing HDF5 datasets, and providing tools for evaluating, visualizing, and interpreting the model's predictions.

Model Core

Encapsulates the neural network architecture (DecimaModel), the custom loss function (TaskWisePoissonMultinomialLoss), and the disease-specific evaluation metric (DiseaseLfcMSE). Together these form the computational backbone of the Decima model.
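To make the idea behind a Poisson-multinomial loss concrete, here is a minimal pure-Python sketch for a single gene measured across several tasks. It is not the actual TaskWisePoissonMultinomialLoss: the function name `poisson_multinomial_loss`, the `total_weight` and `eps` parameters, and the exact weighting are assumptions made for illustration. The general idea is to penalize separately the error in total expression (Poisson term) and the error in how that total is distributed across tasks (multinomial term).

```python
import math


def poisson_multinomial_loss(pred, target, total_weight=1.0, eps=1e-7):
    """Hypothetical Poisson-multinomial loss for one gene across tasks."""
    # Add a small constant so the logarithms stay finite.
    pred = [p + eps for p in pred]
    target = [t + eps for t in target]
    pred_total = sum(pred)
    target_total = sum(target)

    # Poisson term: error in the total expression summed over tasks
    # (the constant term that depends only on the target is dropped).
    poisson_term = pred_total - target_total * math.log(pred_total)

    # Multinomial term: error in how expression is split among tasks.
    multinomial_term = -sum(
        t * math.log(p / pred_total) for p, t in zip(pred, target)
    )
    return multinomial_term + total_weight * poisson_term


observed = [1.0, 2.0, 3.0]
loss_match = poisson_multinomial_loss(observed, observed)
# Same total, wrong distribution across tasks: only the multinomial term grows.
loss_swapped = poisson_multinomial_loss([3.0, 2.0, 1.0], observed)
# Same distribution, wrong total: only the Poisson term grows.
loss_scaled = poisson_multinomial_loss([2.0, 4.0, 6.0], observed)
```

Both mismatched predictions score worse than the exact match, which is the behavior one would expect from any loss of this family.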

Related Classes/Methods:

Training and Prediction Orchestrator

Integrates the Decima model with the PyTorch Lightning framework and manages the machine learning lifecycle: model initialization, forward passes, and the training, validation, testing, and prediction steps. It also loads data and orchestrates the training and prediction runs.

Related Classes/Methods:

Data Management

Reads, processes, and augments biological sequence and gene expression data stored in HDF5 files. It includes utilities for gene indexing and for extracting specific data points, and it defines dataset classes for efficient data loading during training and inference, with dedicated handling for variant data.
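As a rough illustration of what sequence processing and augmentation can look like, the sketch below one-hot encodes a DNA string and applies two common augmentations: a random shift of the cropping window and a random strand flip. Whether decima uses these exact augmentations is an assumption, and all names here (`one_hot`, `reverse_complement`, `augment`, `max_shift`) are hypothetical rather than taken from the project. HDF5 access is left out to keep the example dependency-free.

```python
import random

BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def one_hot(seq):
    """Encode DNA as rows of 4 values; unknown bases (N) become all zeros."""
    return [[1 if base == b else 0 for b in BASES] for base in seq.upper()]


def reverse_complement(seq):
    """Return the sequence as read from the opposite strand."""
    return "".join(COMPLEMENT[b] for b in reversed(seq.upper()))


def augment(seq, window, max_shift, rng):
    """Crop `window` bases at a randomly shifted offset, then maybe flip strand."""
    center_start = (len(seq) - window) // 2
    if window > len(seq) or max_shift > center_start:
        raise ValueError("window plus shift must fit inside the sequence")
    start = center_start + rng.randint(-max_shift, max_shift)
    crop = seq[start:start + window]
    if rng.random() < 0.5:
        crop = reverse_complement(crop)
    return crop


rng = random.Random(0)  # seeded so the augmentation is reproducible
example = augment("ACGTACGTAC", window=6, max_shift=2, rng=rng)
encoded = one_hot(example)
```

Passing the random generator in explicitly keeps the augmentation reproducible, which helps when comparing training runs.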

Related Classes/Methods:

Evaluation and Visualization

Quantitatively assesses model performance, with a focus on marker gene analysis and criteria matching, and generates plots and other visual summaries of the evaluation results.
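One simple way to think about marker gene analysis is sketched below. It assumes, purely for illustration, that a marker gene is one whose expression in a given cell type stands out from its expression in the others, measured with a z-score, and that evaluation compares the top markers found in observed versus predicted expression. This definition, the function names, and the toy data are all hypothetical and are not taken from decima.

```python
import statistics


def marker_scores(expression, cell_type):
    """z-score of each gene's expression in `cell_type` relative to all cell types."""
    scores = {}
    for gene, by_cell in expression.items():
        values = list(by_cell.values())
        mean = statistics.mean(values)
        sd = statistics.pstdev(values)
        # A gene with identical expression everywhere cannot be a marker.
        scores[gene] = (by_cell[cell_type] - mean) / sd if sd > 0 else 0.0
    return scores


def top_markers(expression, cell_type, k):
    """Return the k genes with the highest marker score for `cell_type`."""
    scores = marker_scores(expression, cell_type)
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Toy expression values, invented for this example.
observed = {
    "GENE_A": {"T cell": 9.0, "B cell": 1.0, "Monocyte": 1.0},
    "GENE_B": {"T cell": 1.0, "B cell": 8.0, "Monocyte": 1.0},
    "GENE_C": {"T cell": 2.0, "B cell": 2.0, "Monocyte": 2.0},
}
predicted = {
    "GENE_A": {"T cell": 7.0, "B cell": 2.0, "Monocyte": 1.0},
    "GENE_B": {"T cell": 2.0, "B cell": 6.0, "Monocyte": 1.0},
    "GENE_C": {"T cell": 2.0, "B cell": 3.0, "Monocyte": 2.0},
}

# Markers recovered by the predictions that are also markers in the observed data.
shared = set(top_markers(observed, "T cell", 1)) & set(top_markers(predicted, "T cell", 1))
```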

Related Classes/Methods:

Model Interpretation

Provides tools for interpreting the model's predictions and internal workings, such as attribution analysis, which estimates how much each input feature contributes to a prediction, and motif scanning.
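Attribution analysis can be illustrated with in silico mutagenesis, one of the simplest attribution techniques: substitute each base in turn and record how the model output changes. The sketch below is not decima's attribution code, which may well rely on a different (for example gradient-based) method. The `toy_model` function is a stand-in that counts occurrences of a hypothetical `TATA` motif, so the positions that matter are known in advance.

```python
BASES = "ACGT"


def toy_model(seq):
    """Stand-in for a trained model: counts occurrences of a hypothetical motif."""
    motif = "TATA"
    return float(
        sum(seq[i:i + len(motif)] == motif for i in range(len(seq) - len(motif) + 1))
    )


def in_silico_mutagenesis(seq, model):
    """Per position, the mean change in model output over the 3 possible substitutions."""
    reference = model(seq)
    effects = []
    for i, ref_base in enumerate(seq):
        deltas = [
            model(seq[:i] + alt + seq[i + 1:]) - reference
            for alt in BASES
            if alt != ref_base
        ]
        effects.append(sum(deltas) / len(deltas))
    return effects


sequence = "GGCTATAGGC"
attributions = in_silico_mutagenesis(sequence, toy_model)
```

Positions inside the motif receive negative scores because every substitution there destroys it, while flanking positions score zero. With a real model the same loop applies, though it needs one forward pass per substitution and so scales poorly with sequence length.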

Related Classes/Methods: