Skip to content

Switched file reads to data.table::fread#208

Merged
tonywu1999 merged 2 commits into
develfrom
MSstatsShiny/work/20260509_fread-migration
May 13, 2026
Merged

Switched file reads to data.table::fread#208
tonywu1999 merged 2 commits into
develfrom
MSstatsShiny/work/20260509_fread-migration

Conversation

@tonywu1999
Copy link
Copy Markdown
Contributor

@tonywu1999 tonywu1999 commented May 9, 2026

  • Replaced read.csv / read.delim / read.table calls in R/utils.R, R/module-visualize-network-server.R, and R/plotting_funtions.R with data.table::fread, which auto-detects CSV vs TSV
  • Removed the create_separator_buttons UI helper and all sep_* radio inputs and validation, since users only upload CSV or TSV
  • Updated getDataCode and the network code generator so the downloadable R script also uses data.table::fread
  • Adjusted test stubs (read.csv -> data.table::fread) and dropped the obsolete sep_* mock_input fields and the create_separator_buttons UI test
  • Regenerated NAMESPACE and man/MSstatsShiny.Rd via roxygen2

Motivation and Context

The PR standardizes file-reading across MSstatsShiny by switching from base R readers (read.csv/read.delim/read.table) and manual separator UI controls to data.table::fread, which auto-detects delimiters for CSV/TSV uploads. This simplifies the UI and validation logic, reduces duplication, and makes generated reproducible code and tests consistent with the runtime behavior.

Short Summary of the Solution

Replace base file readers with data.table::fread across server loaders, plotting and network code; remove separator-selection UI and related validation/tests; update code generators, roxygen metadata, NAMESPACE, package DESCRIPTION, and tests to reflect the new approach.

Detailed Changes (bullet list)

  • Core file-reading changes

    • Replaced read.csv/read.delim/read.table calls with data.table::fread in:
      • R/utils.R (all server-side file loaders and getData path branches)
      • R/module-visualize-network-server.R (loadCsvData and generated analysis code)
      • R/plotting_funtions.R (msstats.log reader in groupComparisonPlots2)
    • Updated getDataCode() and network code generator to emit data.table::fread(...) (removed explicit sep/header args where unnecessary).
  • UI and server validation

    • Removed create_separator_buttons(...) function from R/module-loadpage-ui.R.
    • Removed sep_* radio input UI elements from multiple upload panels (standard quantification, MSstats format, Skyline, DIANN, Spectronaut-related uploads).
    • Simplified loadpageServer (R/module-loadpage-server.R): render functions omit separator UI, and proceed1 gating no longer depends on sep_* inputs across several BIO/DDA_DIA/filetype branches (DDA_LType, TMT flows, PTM flows, Spectronaut, OpenMS, DIANN, etc.).
  • Tests and test stubs

    • Switched test stubs that previously mocked read.csv to mock data.table::fread where appropriate.
    • Removed sep_* mock_input fields and corresponding assertions in tests.
    • Deleted create_separator_buttons UI test block and removed separator assertions from loadpage UI tests.
  • Package metadata and docs

    • Regenerated NAMESPACE (removed imports of read.csv/read.delim/read.table; added utils imports such as capture.output/head/packageVersion).
    • Updated R/MSstatsShiny.R roxygen import directives to match NAMESPACE changes.
    • Updated DESCRIPTION to use roxygen2 8.0.0 (RoxygenNote change).
    • Updated man/MSstatsShiny.Rd to include Anthony Wu in the author list.
  • Misc

    • Minor fix: variable name in data list for msstats input (single commit message noted).

Unit Tests Added or Modified

  • tests/testthat/ex.R

    • Removed sep_skylinedata from mocked input for the dda openms test; adjusted mock_input ordering.
  • tests/testthat/test-module-loadpage-ui.R

    • Removed assertions that checked for separator/column-separator radio options.
    • Deleted the test block for create_separator_buttons() UI rendering.
  • tests/testthat/test-utils.R

    • Removed sep_skylinedata and sep_data fields from shared mock input setup.
    • Updated multiple getData conversion/integration tests (PD/prog/DIANN, Skyline, OpenMS, Metamorpheus, etc.) to stop providing separator fields.
    • Replaced read.csv stubs with data.table::fread stubs in formatter/integration tests (notably DIANN Spectronaut/OpenSWATH and TMT OpenMS/Spectromine).
    • Adjusted Spectronaut anomaly-score tests to assert calculateAnomalyScores/runOrder/anomalyModelFeatures/n_trees behavior and to use fread stubs.

Coding Guidelines Violations

  • No explicit coding guideline violations observed in the diff summary. Changes are consistent with modernization and maintainability goals: centralized file parsing via data.table::fread, removal of redundant UI controls, and corresponding updates to tests, exports, and documentation.

Review Change Stack

* Replaced read.csv / read.delim / read.table calls in R/utils.R, R/module-visualize-network-server.R, and R/plotting_funtions.R with data.table::fread, which auto-detects CSV vs TSV
* Removed the create_separator_buttons UI helper and all sep_* radio inputs and validation, since users only upload CSV or TSV
* Updated getDataCode and the network code generator so the downloadable R script also uses data.table::fread
* Adjusted test stubs (read.csv -> data.table::fread) and dropped the obsolete sep_* mock_input fields and the create_separator_buttons UI test
* Regenerated NAMESPACE and man/MSstatsShiny.Rd via roxygen2

Co-Authored-By: Claude <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 9, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: f3412bd3-380c-49ce-b885-24c61ed9c573

📥 Commits

Reviewing files that changed from the base of the PR and between 2693375 and b8305dd.

📒 Files selected for processing (1)
  • R/utils.R
🚧 Files skipped from review as they are similar to previous changes (1)
  • R/utils.R

📝 Walkthrough

Walkthrough

This PR updates roxygen to 8.0.0, removes column-separator UI and related validations, and migrates CSV/TSV reading from base R readers to data.table::fread across utilities, modules, generated code, and tests.

Changes

Data Loading Modernization

Layer / File(s) Summary
Roxygen & Package Imports Configuration
DESCRIPTION, NAMESPACE, R/MSstatsShiny.R
Roxygen updated to 8.0.0; NAMESPACE/roxygen imports adjusted to add capture.output, head, packageVersion, write.csv and remove read.csv, read.delim, read.table.
UI Separator Controls Removal
R/module-loadpage-ui.R
Removed column-separator radio controls from standard quantification, MSstats, Skyline, and DIANN upload panels; deleted create_separator_buttons().
Server Validation Logic Update
R/module-loadpage-server.R
Spectronaut file-selection UI no longer appends separator buttons; proceed1 gating for LType/TMT/PTM flows no longer depends on sep_* inputs and relies on primary data presence.
Data Loading Migration to fread
R/utils.R, R/module-visualize-network-server.R, R/plotting_funtions.R
Replaced read.csv/read.delim/read.table with data.table::fread in server helpers, getData()/getDataCode() generation, network visualization, and plotting; removed explicit sep/header arguments in emitted code.
Test Fixture and Mock Data Updates
tests/testthat/*
Mock inputs updated to remove sep_* fields; tests switched stubs from read.csv to data.table::fread; removed separator UI assertions; Spectronaut anomaly-score tests adjusted.
Package Documentation
man/MSstatsShiny.Rd
Added explicit Anthony Wu author entry.

Sequence Diagram(s)

sequenceDiagram
  participant User
  participant UI
  participant Server
  participant getData
  participant fread
  User->>UI: upload file via fileInput
  UI->>Server: input$data present
  Server->>getData: request parse
  getData->>fread: call data.table::fread(input$data$datapath)
  fread-->>getData: parsed data.frame/data.table
  getData-->>Server: parsed dataset
  Server-->>UI: enable proceed1
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested labels

Review effort 3/5, enhancement

Suggested reviewers

  • devonjkohler

Poem

🐰 I hopped through code with tidy feet,
Replaced old reads with something neat.
No more radios for comma or tab,
fread now parses every slab.
Data flows smooth, tests align—hip hooray!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Switched file reads to data.table::fread' directly matches the main change across the entire changeset—systematic replacement of base R file-reading functions with data.table::fread throughout multiple files.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch MSstatsShiny/work/20260509_fread-migration

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/testthat/test-utils.R (1)

462-468: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Remove hardcoded sep from these migrated tests.

Line 468, Line 487, and Line 596 still force mock_input$sep = "," even after separator UI removal. This can mask regressions by keeping legacy input state alive in tests that should validate fread auto-detection without sep.

Suggested test update
 test_that("dda pd", {
   suppressWarnings({
     mock_input$BIO <- "Protein"
     mock_input$DDA_DIA <- "LType"
     mock_input$filetype = "PD"
-    mock_input$sep = ","
+    mock_input$sep = NULL
@@
 test_that("dda prog", {
   suppressWarnings({
@@
-    mock_input$sep = ","
+    mock_input$sep = NULL
@@
 test_that("dda openms", {
   suppressWarnings({
@@
-    mock_input$sep = ","
+    mock_input$sep = NULL

Also applies to: 481-487, 590-596

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/testthat/test-utils.R` around lines 462 - 468, Remove the hardcoded
separator assignments in the migrated tests so they no longer force legacy UI
state: in the test block named "dda pd" (and the other affected test_that blocks
where mock_input$sep = "," appears) delete the lines that set mock_input$sep =
"," so the tests rely on fread auto-detection; ensure you only remove the sep
assignments and leave mock_input$BIO, mock_input$DDA_DIA, mock_input$filetype
etc. intact (look for usages of mock_input and the test_that titles to locate
each occurrence).
🧹 Nitpick comments (3)
R/plotting_funtions.R (1)

138-139: ⚡ Quick win

Make the .log parser explicit here.

This file is reading MSstatsShiny's own log artifact, not one of the CSV/TSV uploads the PR is targeting. fread(finalfile) leaves delimiter/header detection to auto-guessing, so processout can change shape as the log contents evolve. Please pass the expected layout explicitly so this stays stable across runs.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@R/plotting_funtions.R` around lines 138 - 139, The code currently uses
data.table::fread(finalfile) which auto-detects delimiter/header and can produce
unstable shapes; change the call that populates processout (the fread invoked on
finalfile in plotting_funtions.R) to parse the expected .log layout explicitly —
e.g., call data.table::fread(finalfile, sep="\t", header=FALSE, fill=TRUE,
quote="") (or the exact delimiter/header settings for your log format) before
converting to matrix so processout remains stable as the log content evolves.
R/module-loadpage-ui.R (1)

218-218: ⚡ Quick win

Constrain these uploaders to the formats the new readers actually support.

After removing the separator controls, these paths now assume fread-compatible inputs, but accept = NULL still lets users pick arbitrary files and only fail later in the server path. Please set per-input filters such as .csv/.tsv/.txt, and include .parquet/.pq for DIANN.

Possible direction
-    fileInput(ns('skylinedata'), "", multiple = FALSE, accept = NULL)
+    fileInput(ns('skylinedata'), "", multiple = FALSE,
+              accept = c(".csv", ".tsv", ".txt"))

-    fileInput(ns('dianndata'), "", multiple = FALSE, accept = NULL)
+    fileInput(ns('dianndata'), "", multiple = FALSE,
+              accept = c(".csv", ".tsv", ".parquet", ".pq"))

Also applies to: 242-242, 264-264, 274-274

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@R/module-loadpage-ui.R` at line 218, Change the fileInput controls so they
only allow the reader-supported extensions instead of accept = NULL: for
uploaders that feed the fread-compatible readers (e.g. the fileInput(ns('data')
...) and the other fileInput controls referenced at the same spots) set accept
to CSV/TSV/text extensions like ".csv", ".tsv", ".txt"; for the DIANN-specific
uploader(s) also include Parquet extensions ".parquet" and ".pq". Update each
fileInput(...) accept parameter accordingly so users can only pick supported
file types.
tests/testthat/test-utils.R (1)

795-807: ⚡ Quick win

The second Sky getDataCode() assertion is redundant.

Line 804–806 repeats the same input and expectation as Line 800–803, so it doesn’t add coverage anymore.

Suggested cleanup
 test_that("get data code filetype sky", {
   suppressWarnings({
@@
     mock_input$DDA_DIA <- "LType"
     output <- getDataCode(mock_input)
     expect_type(output,"character")
-
-    mock_input$DDA_DIA <- "LType"
-    output <- getDataCode(mock_input)
-    expect_type(output,"character")
   })
 })
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/testthat/test-utils.R` around lines 795 - 807, The test contains a
redundant duplicate assertion for getDataCode: remove the repeated block that
sets mock_input$DDA_DIA <- "LType", calls getDataCode(mock_input), and
expect_type(output, "character") so only the first invocation/assertion remains;
if you intended to exercise a different branch, replace the duplicate with a
different DDA_DIA value or an additional expectation instead. Ensure the unique
symbol getDataCode and the mock_input setup remain unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@R/utils.R`:
- Around line 865-867: The generated PTM MSstats snippet sets ptm_data and
global_data but returns a list using PROTEIN = unmod which is undefined; update
the emitted list in the branch where input$BIO == "PTM" (the code that appends
to variable codes) to reference the correct variable name (global_data or unmod
as appropriate) — e.g., replace PROTEIN = unmod with PROTEIN = global_data (or
rename global_data to unmod in the snippet) so the list uses a defined symbol
(refer to input$BIO, codes, ptm_data, global_data, unmod, and PROTEIN).

---

Outside diff comments:
In `@tests/testthat/test-utils.R`:
- Around line 462-468: Remove the hardcoded separator assignments in the
migrated tests so they no longer force legacy UI state: in the test block named
"dda pd" (and the other affected test_that blocks where mock_input$sep = ","
appears) delete the lines that set mock_input$sep = "," so the tests rely on
fread auto-detection; ensure you only remove the sep assignments and leave
mock_input$BIO, mock_input$DDA_DIA, mock_input$filetype etc. intact (look for
usages of mock_input and the test_that titles to locate each occurrence).

---

Nitpick comments:
In `@R/module-loadpage-ui.R`:
- Line 218: Change the fileInput controls so they only allow the
reader-supported extensions instead of accept = NULL: for uploaders that feed
the fread-compatible readers (e.g. the fileInput(ns('data') ...) and the other
fileInput controls referenced at the same spots) set accept to CSV/TSV/text
extensions like ".csv", ".tsv", ".txt"; for the DIANN-specific uploader(s) also
include Parquet extensions ".parquet" and ".pq". Update each fileInput(...)
accept parameter accordingly so users can only pick supported file types.

In `@R/plotting_funtions.R`:
- Around line 138-139: The code currently uses data.table::fread(finalfile)
which auto-detects delimiter/header and can produce unstable shapes; change the
call that populates processout (the fread invoked on finalfile in
plotting_funtions.R) to parse the expected .log layout explicitly — e.g., call
data.table::fread(finalfile, sep="\t", header=FALSE, fill=TRUE, quote="") (or
the exact delimiter/header settings for your log format) before converting to
matrix so processout remains stable as the log content evolves.

In `@tests/testthat/test-utils.R`:
- Around line 795-807: The test contains a redundant duplicate assertion for
getDataCode: remove the repeated block that sets mock_input$DDA_DIA <- "LType",
calls getDataCode(mock_input), and expect_type(output, "character") so only the
first invocation/assertion remains; if you intended to exercise a different
branch, replace the duplicate with a different DDA_DIA value or an additional
expectation instead. Ensure the unique symbol getDataCode and the mock_input
setup remain unchanged.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7905482c-ed26-44b0-9359-b21b20120a60

📥 Commits

Reviewing files that changed from the base of the PR and between 3aae854 and 2693375.

📒 Files selected for processing (12)
  • DESCRIPTION
  • NAMESPACE
  • R/MSstatsShiny.R
  • R/module-loadpage-server.R
  • R/module-loadpage-ui.R
  • R/module-visualize-network-server.R
  • R/plotting_funtions.R
  • R/utils.R
  • man/MSstatsShiny.Rd
  • tests/testthat/ex.R
  • tests/testthat/test-module-loadpage-ui.R
  • tests/testthat/test-utils.R
💤 Files with no reviewable changes (2)
  • NAMESPACE
  • tests/testthat/test-module-loadpage-ui.R

Comment thread R/utils.R
@tonywu1999 tonywu1999 merged commit 2881aa4 into devel May 13, 2026
2 checks passed
@tonywu1999 tonywu1999 deleted the MSstatsShiny/work/20260509_fread-migration branch May 13, 2026 14:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant