Commit 0d66322

Release v4.0.0: Java 25, SMSD 6.11.1, code modernization
- Upgrade Java 21 → 25, SMSD 6.11.0 → 6.11.1
- Modernize with Java 25 features: pattern matching instanceof, switch expressions, records (StereoChange), unnamed variables, java.time API replacing Calendar
- Adapt to SMSD 6.11.1 record API (SubstructureResult accessors)
- Update CI workflows to JDK 25
- Clean up unused imports and stale throws clauses
- Add LaTeX build artifacts to .gitignore
- Benchmark verified: 86.4% chem-equiv, 82.3% mol-map, 23.1% atom-exact on Lin et al. 2022 golden dataset (1,851 reactions, 0 errors)
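The language features named in the commit message can be illustrated with a minimal sketch. Only the record name `StereoChange` comes from the commit; its fields, the `Direction` enum, and every method below are hypothetical and are not taken from the RDT sources. Unnamed variables (`_`) are left out so the sketch should also compile on JDKs older than 22.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

/** Illustration only: names and fields are assumptions, not RDT code. */
final class ModernizationSketch {

    // A record replaces a hand-written immutable value class.
    // The commit names a StereoChange record; these fields are assumed.
    record StereoChange(String atomId, String before, String after) {}

    // Hypothetical enum used to demonstrate a switch expression.
    enum Direction { FORWARD, BACKWARD, BIDIRECTIONAL, UNDEFINED }

    // Pattern matching for instanceof tests and binds in one step.
    static String describe(Object value) {
        if (value instanceof StereoChange change) {
            return change.atomId() + ": " + change.before() + " -> " + change.after();
        }
        return String.valueOf(value);
    }

    // A switch expression yields a value and needs no break statements.
    static String arrow(Direction direction) {
        return switch (direction) {
            case FORWARD -> "->";
            case BACKWARD -> "<-";
            case BIDIRECTIONAL -> "<->";
            case UNDEFINED -> "?";
        };
    }

    // java.time replaces Calendar for date construction and formatting.
    static String isoDate(int year, int month, int day) {
        return LocalDate.of(year, month, day).format(DateTimeFormatter.ISO_LOCAL_DATE);
    }

    public static void main(String[] args) {
        System.out.println(describe(new StereoChange("C2", "R", "S")));
        System.out.println(arrow(Direction.BIDIRECTIONAL));
        System.out.println(isoDate(2026, 4, 3));
    }
}
```

Because the enum switch covers every constant, the compiler checks exhaustiveness and no `default` branch is needed.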
1 parent 6c02487 commit 0d66322

20 files changed

Lines changed: 152 additions & 170 deletions

.github/workflows/benchmarks.yml

Lines changed: 2 additions & 2 deletions
@@ -18,10 +18,10 @@ jobs:
     steps:
     - uses: actions/checkout@v4

-    - name: Set up JDK 21
+    - name: Set up JDK 25
       uses: actions/setup-java@v4
       with:
-        java-version: '21'
+        java-version: '25'
         distribution: 'temurin'
         cache: 'maven'

.github/workflows/maven-ci.yml

Lines changed: 4 additions & 4 deletions
@@ -24,10 +24,10 @@ jobs:

     steps:
     - uses: actions/checkout@v4
-    - name: Set up JDK 21
+    - name: Set up JDK 25
       uses: actions/setup-java@v4
       with:
-        java-version: '21'
+        java-version: '25'
         distribution: 'temurin'
         cache: 'maven'

@@ -52,10 +52,10 @@ jobs:

     steps:
     - uses: actions/checkout@v4
-    - name: Set up JDK 21
+    - name: Set up JDK 25
       uses: actions/setup-java@v4
       with:
-        java-version: '21'
+        java-version: '25'
         distribution: 'temurin'
         cache: 'maven'
         server-id: github

.gitignore

Lines changed: 7 additions & 0 deletions
@@ -41,3 +41,10 @@ hs_err_pid*
 .claude/
 .env
 *.secret
+
+# LaTeX build artifacts
+*.aux
+*.log
+*.out
+*.toc
+*.synctex.gz

README.md

Lines changed: 3 additions & 3 deletions
@@ -9,7 +9,7 @@
 Introduction
 ============

-`Reaction Decoder Tool (RDT) v3.9.0`
+`Reaction Decoder Tool (RDT) v4.0.0`
 --------------------------------------

 **Toolkit-agnostic reaction mapping engine** with CDK adapter. Deterministic, no training data required.
@@ -20,7 +20,7 @@ All 1,851 reactions mapped with **100% success rate** and **zero errors**.

 | Tool | Chem-Equiv | Mol-Map Exact | Atom-Map Exact | Deterministic | Training |
 |------|-----------|---------------|----------------|---------------|----------|
-| **RDT v3.9.0** | **86.4%** | **82.3%** | 23.1% | **Yes** | None |
+| **RDT v4.0.0** | **86.4%** | **82.3%** | 23.1% | **Yes** | None |
 | RXNMapper† | 83.74% ||| No | Unsupervised |
 | RDTool (published)† | 76.18% ||| Yes | None |
 | ChemAxon† | 70.45% ||| Yes | Proprietary |
@@ -149,7 +149,7 @@ The package namespace has changed from `uk.ac.ebi` to `com.bioinceptionlabs` in
 <!-- Old (v2.x) -->
 <groupId>uk.ac.ebi.rdt</groupId>

-<!-- New (v3.9.0+) -->
+<!-- New (v4.0.0+) -->
 <groupId>com.bioinceptionlabs</groupId>
 ```

algorithm/ALGORITHM.md

Lines changed: 4 additions & 4 deletions
@@ -1,4 +1,4 @@
-# Reaction Decoder Tool (RDT) v3.9.0
+# Reaction Decoder Tool (RDT) v4.0.0
 ## Algorithm Description and Benchmark Evaluation

 **Authors:** Syed Asad Rahman
@@ -171,7 +171,7 @@ For each reactant-product pair *(R_i, P_j)*, compute a Maximum Common Subgraph (
 φ_{ij} := {(a_k, a_k) : k = 1…|A(R_i)|} (direct 1:1 mapping)
 skip MCS

-Canonical SMILES are generated by `MolGraph.toCanonicalSmiles()` (SMSD 6.10.2), which encodes tetrahedral chirality (`@`/`@@`) and E/Z geometry (`/`/`\`). This is essential: using a stereo-unaware generator would incorrectly short-circuit enantiomers (e.g. (R)-lactic acid ≡ (S)-lactic acid) to a spurious identity mapping.
+Canonical SMILES are generated by `MolGraph.toCanonicalSmiles()` (SMSD 6.11.1), which encodes tetrahedral chirality (`@`/`@@`) and E/Z geometry (`/`/`\`). This is essential: using a stereo-unaware generator would incorrectly short-circuit enantiomers (e.g. (R)-lactic acid ≡ (S)-lactic acid) to a spurious identity mapping.

 **Stage 2 — Size ratio filter:**
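The enantiomer problem described in the changed line can be sketched without any chemistry toolkit. The code below does not call SMSD or `MolGraph.toCanonicalSmiles()`; it treats two hand-written lactic acid SMILES as if they were canonical output, and `stripStereo` is a crude, hypothetical stand-in for a stereo-unaware generator.

```java
/** Illustration only: plain strings stand in for canonical SMILES. */
final class IdentityShortCircuitSketch {

    // Crude stand-in for a stereo-unaware generator: drops the chirality
    // marker (@) and the double-bond geometry markers (/ and \).
    static String stripStereo(String smiles) {
        return smiles.replace("@", "").replace("/", "").replace("\\", "");
    }

    // Identity short-circuit: MCS is skipped only when both strings match.
    static boolean isIdentity(String reactantSmiles, String productSmiles) {
        return reactantSmiles.equals(productSmiles);
    }

    public static void main(String[] args) {
        // The two enantiomers of lactic acid differ only in @ versus @@.
        String enantiomerA = "C[C@H](O)C(=O)O";
        String enantiomerB = "C[C@@H](O)C(=O)O";

        // Stereo-aware strings differ, so the pair goes on to MCS.
        System.out.println(isIdentity(enantiomerA, enantiomerB));

        // Without stereo markers both collapse to the same string,
        // which would trigger the spurious identity mapping.
        System.out.println(isIdentity(stripStereo(enantiomerA), stripStereo(enantiomerB)));
    }
}
```

The first comparison is false and the second is true, which is the failure mode the stereo-aware generator avoids.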

@@ -360,7 +360,7 @@ The Lin et al. (2022) golden dataset [3] contains 1,851 chemical reactions with

 | Tool | Chem-Equiv | Mol-Map Exact | Training Data | Deterministic |
 |------|-----------|---------------|---------------|---------------|
-| **RDT v3.9.0** | **99.2%** | **~78%** | **None** | **Yes** |
+| **RDT v4.0.0** | **99.2%** | **~78%** | **None** | **Yes** |
 | RXNMapper [4] | 83.74%† || Unsupervised | No |
 | RDTool 2016 [1] | 76.18%† || None | Yes |
 | ChemAxon | 70.45%† || Proprietary | Yes |
@@ -395,7 +395,7 @@ RINGS resolves the majority of reactions via the funnel at a 2-4x computational
 |-----------|---------|------|
 | SMSD | 6.10.2 | MCS engine: VF2++ subgraph isomorphism, circular/path fingerprints, MolGraph canonical SMILES (stereo-aware) |
 | CDK | 2.12 | Molecule I/O, atom typing, aromaticity perception, ring finding |
-| Java | 21+ | Platform |
+| Java | 25+ | Platform |

 ### 6.2 Thread Safety

benchmark/report/golden-benchmark-report.md

Lines changed: 4 additions & 4 deletions
@@ -1,6 +1,6 @@
 # Golden Dataset Benchmark Report

-Release: **RDT v3.9.0** (SMSD 6.10.2)
+Release: **RDT v4.0.0** (SMSD 6.11.1)

 Date: 2026-04-03

@@ -12,7 +12,7 @@ Total reactions: **1,851**

 ## 1. Executive Summary

-RDT v3.9.0 maps all 1,851 reactions in the Lin et al. golden dataset with **100% mapping
+RDT v4.0.0 maps all 1,851 reactions in the Lin et al. golden dataset with **100% mapping
 success** and **zero errors**. Every apparent "chemistry mismatch" (252 reactions, 13.6%)
 is attributable to **unbalanced reactions** — reactions where the dataset omits one or more
 byproducts, causing the gold standard to count orphaned-reactant internal bonds as
@@ -85,7 +85,7 @@ on balanced reactions is **100.0%** (1,599/1,599).

 | Tool | Chem-Equiv (raw) | Balanced Reactions | Mol-Map | Deterministic | Training |
 |------|------------------|--------------------|---------|---------------|----------|
-| **RDT v3.9.0** | **86.4%** | **100.0%** | **82.3%** | Yes | None |
+| **RDT v4.0.0** | **86.4%** | **100.0%** | **82.3%** | Yes | None |
 | RXNMapper† | 83.74% ||| No | Unsupervised |
 | RDTool (published)† | 76.18% ||| Yes | None |
 | ChemAxon† | 70.45% ||| Yes | Proprietary |
@@ -216,7 +216,7 @@ transformations where ring-topology-aware matching produces the most parsimoniou

 ## 9. Practical Conclusions

-1. **RDT v3.9.0 achieves 100% correct chemistry** on all balanced reactions in the
+1. **RDT v4.0.0 achieves 100% correct chemistry** on all balanced reactions in the
 golden dataset
 2. The 252 apparent mismatches are dataset artifacts from unbalanced reactions, not
 mapping errors
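The percentages quoted in this report can be cross-checked from its own counts. The sketch below assumes that the 1,599 balanced reactions are exactly the 1,851 total minus the 252 apparent mismatches, which is what the report's figures imply; the class and method names are hypothetical.

```java
import java.util.Locale;

/** Illustration only: recomputes the report's percentages from its counts. */
final class BenchmarkArithmeticSketch {

    // Formats part/total as a percentage with one decimal place.
    static String percent(int part, int total) {
        return String.format(Locale.ROOT, "%.1f%%", 100.0 * part / total);
    }

    public static void main(String[] args) {
        int total = 1851;                    // reactions in the golden dataset
        int mismatched = 252;                // apparent chemistry mismatches
        int balanced = total - mismatched;   // 1,599 under the stated assumption

        System.out.println(percent(mismatched, total)); // share of apparent mismatches
        System.out.println(percent(balanced, total));   // raw chem-equiv accuracy
        System.out.println(percent(balanced, balanced)); // accuracy on balanced reactions
    }
}
```

Under that assumption the three values come out as 13.6%, 86.4%, and 100.0%, matching the figures in the tables above.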

benchmark/report/golden-benchmark-report.tex

Lines changed: 9 additions & 9 deletions
@@ -35,14 +35,14 @@
 linkcolor=linkblue,
 citecolor=linkblue,
 urlcolor=rdtblue,
-pdftitle={RDT v3.9.0 Golden Dataset Benchmark Report},
+pdftitle={RDT v4.0.0 Golden Dataset Benchmark Report},
 pdfauthor={Syed Asad Rahman},
 }

 % --- Headers ---
 \pagestyle{fancy}
 \fancyhf{}
-\fancyhead[L]{\small\textit{RDT v3.9.0 Benchmark Report}}
+\fancyhead[L]{\small\textit{RDT v4.0.0 Benchmark Report}}
 \fancyhead[R]{\small\textit{BioInception Labs}}
 \fancyfoot[C]{\thepage}
 \renewcommand{\headrulewidth}{0.4pt}
@@ -65,8 +65,8 @@
 \vspace*{3cm}

 {\Huge\bfseries\color{linkblue} Golden Dataset Benchmark Report}\\[0.8cm]
-{\LARGE Reaction Decoder Tool (RDT) v3.9.0}\\[0.4cm]
-{\large SMSD 6.10.2 $\cdot$ CDK 2.12}\\[2cm]
+{\LARGE Reaction Decoder Tool (RDT) v4.0.0}\\[0.4cm]
+{\large SMSD 6.11.1 $\cdot$ CDK 2.12}\\[2cm]

 {\large
 \textbf{Syed Asad Rahman}\\[0.3cm]
@@ -81,7 +81,7 @@
 \begin{minipage}{0.85\textwidth}
 \centering
 \textit{%
-Benchmark evaluation of RDT v3.9.0 on the Lin et al.\ (2022) golden dataset
+Benchmark evaluation of RDT v4.0.0 on the Lin et al.\ (2022) golden dataset
 of 1,851 manually curated atom-atom mappings. RDT achieves 100\% mapping success,
 86.4\% raw chemistry-equivalent accuracy (exceeding all published tools), and
 \textbf{zero genuine mapping errors}---all 252 apparent mismatches are attributable
@@ -104,7 +104,7 @@
 % ===================================================================
 \section{Executive Summary}

-RDT v3.9.0 maps all 1,851 reactions in the Lin et al.\ golden dataset with
+RDT v4.0.0 maps all 1,851 reactions in the Lin et al.\ golden dataset with
 \textbf{100\% mapping success} and \textbf{zero errors}. Every apparent
 ``chemistry mismatch'' (252 reactions, 13.6\%) is attributable to
 \textbf{unbalanced reactions}---reactions where the dataset omits one or more
@@ -250,7 +250,7 @@ \section{Comparison with Published Tools}
 \textbf{Deterministic} & \textbf{Training} \\
 \midrule
 \rowcolor{rdtgreen!8}
-\textbf{RDT v3.9.0} & \textbf{86.4\%} & \textbf{82.3\%} & Yes & None \\
+\textbf{RDT v4.0.0} & \textbf{86.4\%} & \textbf{82.3\%} & Yes & None \\
 RXNMapper$^\dagger$ & 83.74\% & --- & No & Unsupervised \\
 RDTool (pub.)$^\dagger$& 76.18\% & --- & Yes & None \\
 ChemAxon$^\dagger$ & 70.45\% & --- & Yes & Proprietary \\
@@ -268,7 +268,7 @@ \section{Comparison with Published Tools}
 \centering
 \includegraphics[width=0.85\textwidth]{comparison_published.png}
 \caption{Horizontal bar chart comparing chemically-equivalent accuracy.
-RDT v3.9.0 (raw) already exceeds all published tools; on balanced reactions it
+RDT v4.0.0 (raw) already exceeds all published tools; on balanced reactions it
 reaches 100\%.}
 \label{fig:comparison}
 \end{figure}
@@ -463,7 +463,7 @@ \section{Algorithm Selection Profile}
 \section{Conclusions}

 \begin{enumerate}[leftmargin=2em]
-\item \textbf{RDT v3.9.0 achieves 100\% correct chemistry} on all balanced
+\item \textbf{RDT v4.0.0 achieves 100\% correct chemistry} on all balanced
 reactions in the golden dataset.
 \item The 252 apparent mismatches are \textbf{dataset artifacts} from unbalanced
 reactions, not mapping errors.

bin/README.md

Lines changed: 1 addition & 1 deletion
@@ -1,7 +1,7 @@
 Introduction
 ============

-`Reaction Decoder Tool (RDT) v3.9.0`
+`Reaction Decoder Tool (RDT) v4.0.0`
 --------------------------------------

 `1. Atom Atom Mapping (AAM) Tool`

pom.xml

Lines changed: 5 additions & 5 deletions
@@ -4,13 +4,13 @@
     <groupId>com.bioinceptionlabs</groupId>
     <artifactId>rdt</artifactId>
     <description>Reaction Decoder Tool</description>
-    <version>3.9.0</version>
+    <version>4.0.0</version>
     <packaging>jar</packaging>
     <properties>
-        <jdk.version>21</jdk.version>
+        <jdk.version>25</jdk.version>
         <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
-        <maven.compiler.source>21</maven.compiler.source>
-        <maven.compiler.target>21</maven.compiler.target>
+        <maven.compiler.source>25</maven.compiler.source>
+        <maven.compiler.target>25</maven.compiler.target>
         <cdk.version>2.12</cdk.version>
         <mainClass>com.bioinceptionlabs.aamtool.ReactionDecoder</mainClass>
         <surefire.groups></surefire.groups>
@@ -178,7 +178,7 @@
         <dependency>
             <groupId>com.bioinceptionlabs</groupId>
             <artifactId>smsd</artifactId>
-            <version>6.10.2</version>
+            <version>6.11.1</version>
         </dependency>

         <!-- https://mvnrepository.com/artifact/commons-cli/commons-cli -->

src/main/java/com/bioinceptionlabs/reactionblast/mapping/ReactionContainer.java

Lines changed: 7 additions & 0 deletions
@@ -1123,6 +1123,13 @@ public IReaction standardize(IReaction reaction) throws Exception {
         //As per IntEnz 0 for undefined direction, 1 for LR, 2 for RL and 3 for bidirectional
         //As per CDK BIDIRECTION 1, Forward 2, Backward 0

+        // Preserve agents (e.g. filtered reagents from StandardizeReaction)
+        if (reaction.getAgents() != null) {
+            for (IAtomContainer agent : reaction.getAgents().atomContainers()) {
+                standardizedReaction.addAgent(agent);
+            }
+        }
+
         reactionSet.addReaction(standardizedReaction);

         //BIDIRECTION 1, Forward 2, Backward 0
