Skip to content

Implement homograph numbers and improve entry search/sort behavior#2220

Open
myieye wants to merge 7 commits into
developfrom
claude/add-homograph-numbers-V1a05
Open

Implement homograph numbers and improve entry search/sort behavior#2220
myieye wants to merge 7 commits into
developfrom
claude/add-homograph-numbers-V1a05

Conversation

@myieye
Copy link
Copy Markdown
Collaborator

@myieye myieye commented Mar 24, 2026

Summary

This PR adds homograph numbers to MiniLcm. Homograph numbers are now respected when sorting.

Homograph-number assignment

  • FwData: after assigning HomographNumber on create and update, the bridge calls ILexEntryRepository.CorrectHomographNumbers(entry). This corrects out-of-range requests to current-max + 1 and deduplicates collisions by renumbering on (HomographNumber, DateCreated, Guid).
  • CRDT: rudimentary implementation that handles newly created entries and promotes pre-existing duplicates. A sync test demonstrates that "broken" homograph numbers in CRDT-land are repaired after two syncs (round-tripped through LibLCM's auto-assignment).

Two FwData tests cover the new correction behavior: CreateEntry_HomographNumberOutOfRange_IsClampedToMaxPlusOne and CreateEntry_DuplicateHomographNumber_IsDeduplicated.

Bumps SIL.LCModel to 11.0.0-beta0165 for the new ILexEntryRepository.CorrectHomographNumbers API (formerly considered for ILexEntry; landed on the repository instead).

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 24, 2026

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 50bd005a-5b0d-409a-8d84-c3cc17f4791d

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR implements homograph number support to track and distinguish multiple entries sharing the same headword. It adds the HomographNumber property to Entry models, implements auto-assignment logic in CRDT, updates database schema, modifies sorting/search ordering, and includes comprehensive tests across both CRDT and FwData layers.

Changes

Cohort / File(s) Summary
API Property Mapping
backend/FwLite/FwDataMiniLcmBridge/Api/FwDataMiniLcmApi.cs, backend/FwLite/FwDataMiniLcmBridge/Api/UpdateProxy/UpdateEntryProxy.cs, backend/FwLite/MiniLcm/Models/Entry.cs, frontend/viewer/src/lib/dotnet-types/generated-types/MiniLcm/Models/IEntry.ts
Added HomographNumber property to Entry models and proxies; mapped from LibLCM entries and exposed in frontend types; conditionally assigns when importing from FwData.
Homograph Auto-Assignment Logic
backend/FwLite/LcmCrdt/CrdtMiniLcmApi.cs
Added AssignHomographNumber helper to auto-assign homograph numbers during entry creation; derives secondary ordering from morph type, queries existing entries, patches first zero entry to 1, assigns incrementing numbers to new entries.
Sorting & Full-Text Search
backend/FwLite/FwDataMiniLcmBridge/Api/Sorting.cs, backend/FwLite/LcmCrdt/Data/Sorting.cs, backend/FwLite/LcmCrdt/FullTextSearch/EntrySearchService.cs
Enabled HomographNumber as deterministic tie-breaker in headword ordering and best-match ranking across all three sorting implementations; re-enabled previously commented-out homograph tie-breaks.
Change Type & Sync
backend/FwLite/LcmCrdt/Changes/CreateEntryChange.cs, backend/FwLite/MiniLcm/SyncHelpers/EntrySync.cs
Added HomographNumber property to CreateEntryChange for persistence; added JSON patch operation generation when homograph numbers differ during sync.
Database Schema & Migrations
backend/FwLite/LcmCrdt/Migrations/20260409130907_AddHomographNumbers.*, backend/FwLite/LcmCrdt/Migrations/LcmCrdtDbContextModelSnapshot.cs
Added EF Core migration to create HomographNumber INTEGER column with default 0 on Entry table; updated model snapshot with new property configuration.
Test Data Verification
backend/FwLite/LcmCrdt.Tests/Changes/ChangeDeserializationRegressionData.*, backend/FwLite/LcmCrdt.Tests/Data/MigrationTests_FromScriptedDb.*.verified.*, backend/FwLite/LcmCrdt.Tests/Data/SnapshotDeserializationRegressionData.*, backend/FwLite/LcmCrdt.Tests/Data/VerifyRegeneratedSnapshotsAfterMigrationFromScriptedDb.*, backend/FwLite/LcmCrdt.Tests/DataModelSnapshotTests.VerifyDbModel.verified.txt
Updated all verified snapshot and test data files to include HomographNumber field (typically 0 for existing entries); added new entry fixture with homograph number 4.
Unit & Integration Tests
backend/FwLite/MiniLcm.Tests/CreateEntryTestsBase.cs, backend/FwLite/MiniLcm.Tests/SortingTestsBase.cs, backend/FwLite/FwLiteProjectSync.Tests/SyncTests.cs
Added five new homograph auto-assignment tests validating incrementation, explicit values, morph-type grouping, citation-form grouping, and multi-entry scenarios; added integration test verifying homograph correction through two-sync cycles; updated sorting test fixtures with explicit homograph variants.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

Suggested labels

📦 Lexbox

Suggested reviewers

  • rmunn

Poem

🐰 A homograph's number now takes its place,
Distinguishing entries with matching face,
Through sorting and sync they journey along,
Each one assigned where it belongs!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 21.21% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The PR title accurately describes the main changes: implementing homograph numbers and improving entry search/sort behavior, which align with the core objectives.
Description check ✅ Passed The pull request description accurately describes the changeset, covering homograph number implementation, assignment logic, and FwData/CRDT handling.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch claude/add-homograph-numbers-V1a05

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions github-actions Bot added the 💻 FW Lite issues related to the fw lite application, not miniLcm or crdt related label Mar 24, 2026
@myieye myieye force-pushed the claude/add-lexeme-headwords-TowRX branch from 4246a62 to 2e82fa6 Compare March 27, 2026 16:00
Base automatically changed from claude/add-lexeme-headwords-TowRX to feat/sync-morph-types March 28, 2026 09:14
@myieye myieye force-pushed the feat/sync-morph-types branch 3 times, most recently from e3ec8c7 to 87c5dad Compare April 9, 2026 12:24
@myieye myieye force-pushed the claude/add-homograph-numbers-V1a05 branch from 8d2c6c6 to b95da37 Compare April 9, 2026 12:35
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 9, 2026

UI unit Tests

  1 files  ±0   59 suites  ±0   24s ⏱️ -5s
176 tests ±0  176 ✅ ±0  0 💤 ±0  0 ❌ ±0 
245 runs  ±0  245 ✅ ±0  0 💤 ±0  0 ❌ ±0 

Results for commit c2e0c47. ± Comparison against base commit b1ab821.

♻️ This comment has been updated with latest results.

@argos-ci
Copy link
Copy Markdown

argos-ci Bot commented Apr 9, 2026

The latest updates on your projects. Learn more about Argos notifications ↗︎

Build Status Details Updated (UTC)
default (Inspect) ✅ No changes detected - May 20, 2026, 8:23 AM

@myieye myieye force-pushed the feat/sync-morph-types branch from 87c5dad to 8ed192f Compare April 9, 2026 15:24
@myieye myieye force-pushed the claude/add-homograph-numbers-V1a05 branch from aaf00bb to 0db4516 Compare April 9, 2026 15:42
@myieye myieye marked this pull request as ready for review April 9, 2026 15:46
@myieye
Copy link
Copy Markdown
Collaborator Author

myieye commented Apr 9, 2026

@coderabbitai full review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 9, 2026

✅ Actions performed

Full review triggered.

@myieye myieye force-pushed the claude/add-homograph-numbers-V1a05 branch from 4be17e9 to 77a861f Compare April 10, 2026 10:34
@myieye myieye force-pushed the feat/sync-morph-types branch from 8ed192f to 374b1b2 Compare April 10, 2026 11:32
@myieye myieye force-pushed the claude/add-homograph-numbers-V1a05 branch 6 times, most recently from 441aad1 to 08c9430 Compare May 5, 2026 11:40
?? stemOrder.FirstOrDefault();

var matchingEntries = await (
from e in repo.Entries
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this exclude the deleted entries?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. We mostly just read from Harmony's "projected tables" i.e. the clean relational ef-core database that Harmony projects its latest un-deleted snapshots to.

Comment thread backend/FwLite/FwDataMiniLcmBridge/Api/FwDataMiniLcmApi.cs Outdated
@rmunn rmunn force-pushed the feat/sync-morph-types branch from 4028077 to 9683cf5 Compare May 8, 2026 08:37
@myieye myieye force-pushed the feat/sync-morph-types branch 4 times, most recently from a08917b to 418b97c Compare May 13, 2026 07:40
@myieye myieye force-pushed the feat/sync-morph-types branch from 418b97c to 54eb419 Compare May 13, 2026 09:02
Base automatically changed from feat/sync-morph-types to develop May 13, 2026 09:24
@github-actions github-actions Bot added the 📦 Lexbox issues related to any server side code, fw-headless included label May 19, 2026
claude and others added 3 commits May 19, 2026 20:55
Add HomographNumber (int, 0 = unset) to the Entry model with full
round-trip support through CRDT, FwData bridge, and sync.

Key changes:
- Entry model: add HomographNumber property with Copy() support
- CreateEntryChange: persist HomographNumber in CRDT changes
- CrdtMiniLcmApi: auto-assign homograph numbers on entry creation
  when HomographNumber is 0, respecting SecondaryOrder scoping.
  Updates existing lone entries from 0→1 when a second homograph appears.
- FwDataMiniLcmApi: read HomographNumber from ILexEntry, set on create
- UpdateEntryProxy: bidirectional HomographNumber sync to LibLCM
- EntrySync: include HomographNumber in diff/patch operations
- Sorting: uncomment HomographNumber in CRDT sort and search queries
- Tests: uncomment sorting tests with HomographNumber, add auto-
  assignment tests, add sync test verifying LibLCM corrects numbers
  after entry deletion via two sync cycles

https://claude.ai/code/session_01FJj2v135u6KdgVxoK4tRp2
Bumps SIL.LCModel to 11.0.0-beta0165 (depends on SIL.Core /
SIL.WritingSystems 17.0.0). Replaces the previous deferral to
LibLCM's auto-handling with an explicit call to
ILexEntryRepository.CorrectHomographNumbers(entry) after assigning
HomographNumber on create and update. This corrects out-of-range
requests to current-max+1 and deduplicates collisions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@myieye myieye force-pushed the claude/add-homograph-numbers-V1a05 branch from ddb9afd to db4738d Compare May 19, 2026 18:57
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 19, 2026

C# Unit Tests

165 tests  ±0   165 ✅ ±0   20s ⏱️ ±0s
 23 suites ±0     0 💤 ±0 
  1 files   ±0     0 ❌ ±0 

Results for commit c2e0c47. ± Comparison against base commit b1ab821.

♻️ This comment has been updated with latest results.

myieye and others added 4 commits May 20, 2026 08:25
- Restore develop's sena-3-live.verified.sqlite (the rebase had taken the
  branch's older copy whose __EFMigrationsHistory referenced the
  pre-renumber MorphType migrations, causing LiveSena3Sync to fail with
  "table MorphType already exists").
- Add homographNumber to IEntry literals in 4 frontend files now that the
  field is required on the generated type.
- Exclude HomographNumber from the FwData round-trip equivalence check in
  EntrySyncTests.CanSyncRandomEntries (a randomly faked value can't
  survive ILexEntryRepository.CorrectHomographNumbers renumbering).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The branch's copy was regenerated against an older fwdata/CRDT
baseline (different entry GUIDs and content) and is unrelated to
the homograph work. Restoring develop's version produces a clean
diff for the final regeneration: only the new "HomographNumber":n
field appearing on each entry. Accept the resulting .received.txt
in the IDE to land the final snapshot.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Both the live CRDT sqlite and the fw-headless snapshot need to advance
together (they encode the same post-sync state). Sync added the previously
CRDT-absent entries with fresh CRDT GUIDs and surfaced HomographNumber
across every entry; LiveSena3Sync passes locally with these accepted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pulls the existing CRDT and FwData homograph-number tests out of their
respective CreateEntryTests.cs into HomographNumberTests.cs alongside
new update-side coverage:
- CRDT: respects explicit HomographNumber on both create and update
- FwData: corrects out-of-range and dedups duplicates on both create
  and update (the renumber tie-breaks by DateCreated then Guid, so
  the update tests mutate the newest entry to keep the outcome
  intuitive)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

💻 FW Lite issues related to the fw lite application, not miniLcm or crdt related 📦 Lexbox issues related to any server side code, fw-headless included

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants