Merged
31 commits
dbde820
PM-4545: treat new WM tasks as task challenges for payments
jmgasper Mar 30, 2026
7c339ec
PM-4528: expose draft billing for project write users
jmgasper Mar 31, 2026
31f54ab
PM-4618: preserve challenge attachments on draft save
jmgasper Mar 31, 2026
6a38297
Add deterministic historical MM planning CLI and report output
jmgasper Mar 31, 2026
13106e5
Implement create-path challenge and phase reconciliation for historic…
jmgasper Mar 31, 2026
29d9b42
Harden reused MM matching and phase backfill safety
jmgasper Apr 1, 2026
83c8ac8
Use authoritative existing-v6 matching for MM planning and apply
jmgasper Apr 1, 2026
765e7d7
Align create-path planning with phase derivation and rerun backfill c…
jmgasper Apr 1, 2026
6122ec0
Fail closed MM planning when discovery/template prerequisites are una…
jmgasper Apr 1, 2026
1f6ba2d
Validate planning-challenge scrutiny rerun
jmgasper Apr 1, 2026
5dffc42
Reconcile submitter resources from eligible registrations
jmgasper Apr 1, 2026
56809ab
Add completed-status fallback for submitter resource backfill
jmgasper Apr 1, 2026
ba1cc6d
Align importer mission artifacts with missing-member skip reporting
jmgasper Apr 1, 2026
47f4ade
Fix member-resolution bigint lookup for live missing-member planning
jmgasper Apr 2, 2026
0bb291a
Finalize missing-member planning partitions and skipped-artifact repo…
jmgasper Apr 2, 2026
682e77e
Filter planned missing-member resources during apply reconciliation
jmgasper Apr 2, 2026
a0214d1
Import deterministic non-example submission history for resolvable me…
jmgasper Apr 2, 2026
8df2511
PM-4686: block challenge activation for invalid billing accounts
jmgasper Apr 2, 2026
dde7317
Finalize historical MM final-score reconciliation and fixture selection
jmgasper Apr 2, 2026
1d381fe
PM-3167: tighten registration reopen dependency checks
jmgasper Apr 2, 2026
f52e93a
Import provisional review summations with missing-member reconciliation
jmgasper Apr 2, 2026
b8190e6
Merge pull request #86 from topcoder-platform/PM-4686
jmgasper Apr 2, 2026
3138c43
Merge pull request #87 from topcoder-platform/PM-3167
jmgasper Apr 2, 2026
561d064
Use authoritative linked counts for MM rerun/backfill planning
jmgasper Apr 3, 2026
2dc4e5c
Validate participant-scores scrutiny synthesis
jmgasper Apr 3, 2026
2ccc4f3
Preserve zero-valued ranking fallback scores in final-score import
jmgasper Apr 3, 2026
e1a794b
Persist participant-scores user-testing artifacts
jmgasper Apr 5, 2026
e3e2cbf
Merge branch 'develop' of github.com:topcoder-platform/challenge-api-…
jmgasper Apr 7, 2026
93171c6
Historical MM import updates
jmgasper Apr 8, 2026
5462281
PM-4837: Make challenge listing name search case-insensitive
jmgasper Apr 13, 2026
92b38d1
Merge pull request #90 from topcoder-platform/PM-4837-1
jmgasper Apr 13, 2026
23 changes: 23 additions & 0 deletions .factory/init.sh
@@ -0,0 +1,23 @@
#!/usr/bin/env bash
set -euo pipefail

source "$HOME/.config/nvm/nvm.sh"

nvm use >/dev/null
if [ ! -d node_modules ]; then
  pnpm install
fi

if [ -d data-migration ]; then
  (
    cd data-migration
    nvm use 18.19.0 >/dev/null
    if [ ! -d node_modules ]; then
      pnpm install
    fi
  )
fi

if [ ! -f .env.importer.local ]; then
  echo "warning: .env.importer.local is missing; live validation will remain blocked" >&2
fi
201 changes: 201 additions & 0 deletions .factory/library/architecture.md
@@ -0,0 +1,201 @@
# Architecture

How the historical marathon-match importer works at a high level.

**What belongs here:** major components, branch behavior, data flow, invariants, and cross-service ownership.
**What does NOT belong here:** step-by-step implementation tasks or validator commands.

---

## System Boundary

The mission adds a reusable importer inside `challenge-api-v6/data-migration/` that reads legacy Informix JSON exports and reconciles them into the v6 challenge/resource/review stack.

### Read surfaces

- `/mnt/Informix` JSON exports (read-only)
- existing v6 challenge data through the challenge DB / challenge-api schema
- existing v6 resource data through the Resource API
- existing v6 submission and review-summation data through the review DB / review-api schema

### Write surfaces

- Challenge and ChallengePhase records in the challenge DB
- submitter Resource records through the Resource API
- Submission and ReviewSummation records in the review DB

## Import Pipeline

### 1. Selection and planning

The importer accepts an explicit round filter and builds a per-round plan. Each selected round is classified as one of:

- `create` — no matching v6 marathon challenge exists
- `reuse/backfill-only` — a v6 marathon challenge already exists and only linked records may be added
- `skip` / `unresolved` — the round cannot be safely applied without more input

Planning must surface traceability, counts, and entity-level deltas before any write occurs.

`--existing-state-file` is supplemental only. It may enrich counts for reporting, but it is not authoritative reuse evidence and must never override direct challenge-state discovery.

### Existing-challenge match rule

Safe reuse is authoritative, not fuzzy:

1. first try an exact existing `challenge.legacyId == round.id` match
2. if there is not exactly one such match, treat any name-based or heuristic candidates as planning diagnostics only
3. before reusing a matched challenge, verify it is a safe historical MM target: Marathon Match type, Data Science track, and no conflicting duplicate standard phase rows
4. if the round still is not matched unambiguously, or if the matched challenge fails those shape checks, emit `unresolved` and require an explicit override rather than auto-reusing a challenge

This keeps backfill-only behavior deterministic and avoids silent challenge-level rewrites.

If authoritative challenge-state discovery is unavailable, planning must fail closed as `unresolved` instead of silently falling back to create-path planning.
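The match rule above can be sketched as a pure decision function. This is a minimal illustration under stated assumptions, not the importer's actual code: `CandidateChallenge`, `MatchDecision`, `matchRound`, and every field name except `legacyId` are hypothetical. In particular, it assumes that zero exact matches plus at least one heuristic candidate yields `unresolved` rather than `create`, which is one conservative reading of steps 2 and 4.

```typescript
// Sketch only: every name except `legacyId` is a hypothetical stand-in.
interface CandidateChallenge {
  id: string;
  legacyId: number | null;
  type: string;
  track: string;
  phaseNames: string[]; // one entry per phase row on the challenge
}

type MatchDecision =
  | { decision: "create" }
  | { decision: "reuse"; challengeId: string }
  | { decision: "unresolved"; reason: string };

const STANDARD_PHASE_NAMES = ["Registration", "Submission", "Review"];

function hasDuplicateStandardPhase(phaseNames: string[]): boolean {
  return STANDARD_PHASE_NAMES.some(
    (name) => phaseNames.filter((p) => p === name).length > 1
  );
}

function matchRound(
  roundId: number,
  discovered: CandidateChallenge[] | null, // null means discovery was unavailable
  heuristicCandidateIds: string[] = []
): MatchDecision {
  // Fail closed: never fall back to create-path planning without discovery.
  if (discovered === null) {
    return { decision: "unresolved", reason: "challenge-state discovery unavailable" };
  }
  const exact = discovered.filter((c) => c.legacyId === roundId);
  if (exact.length > 1) {
    return { decision: "unresolved", reason: "multiple legacyId matches" };
  }
  const match = exact[0];
  if (match === undefined) {
    // Heuristic candidates are diagnostics only; here they block auto-create (assumption).
    return heuristicCandidateIds.length > 0
      ? { decision: "unresolved", reason: "only heuristic candidates found" }
      : { decision: "create" };
  }
  if (match.type !== "Marathon Match" || match.track !== "Data Science") {
    return { decision: "unresolved", reason: "matched challenge is not MM/Data Science" };
  }
  if (hasDuplicateStandardPhase(match.phaseNames)) {
    return { decision: "unresolved", reason: "conflicting duplicate standard phase rows" };
  }
  return { decision: "reuse", challengeId: match.id };
}
```

Keeping the decision free of I/O means the same function can back both dry-run planning and apply, so the two cannot disagree about a round.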

### 2. Challenge reconciliation

For each selected round:

- if no v6 challenge exists, create one completed `Marathon Match` challenge on the `Data Science` track
- if a v6 challenge is matched unambiguously and passes the reuse preconditions above, keep the same challenge id and preserve challenge-level fields

Created challenges must use `challenge.legacyId = round.id`. Reused challenges are not challenge-level rewrite candidates; they must already be matched unambiguously by the rule above or remain `unresolved`.

### 3. Phase materialization

Canonical MM history in v6 is represented by exactly three standard phases:

- `Registration`
- `Submission`
- `Review`

For newly created historical challenges, these phases must exist and be closed. For reused challenges, already-present standard phase rows are preserved as-is and only absent standard phase rows may be added.

### Timeline derivation rule

When creating a historical challenge:

- choose the canonical Marathon Match/Data Science timeline mapping used by the target environment by resolving exactly one valid template candidate; if zero or multiple candidates remain, stop with `unresolved`
- derive `Registration` from the earliest and latest eligible `round_registration.timestamp` values
- derive `Submission` from the earliest available legacy submission-open signal for the round, falling back to the earliest non-example submit timestamp when needed, and end it at the latest non-example submit timestamp
- synthesize `Review` as a coherent closed interval starting at or after the imported submission end; if no explicit review timestamps exist, collapse it to a closed interval at the end of submission rather than inventing a separate open window

If required timestamps are missing, or so contradictory that no coherent closed timeline can be produced, the round should remain `unresolved` instead of being half-created.

Planning must perform this same canonical MM/Data Science timeline-mapping resolution before returning `decision=create`; dry-run must not promise creates that apply would later reject.
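As a rough illustration of the derivation rule, the sketch below turns already-extracted timestamps into three closed phase windows or an `unresolved` outcome. All type and function names are hypothetical, timestamps are assumed to be epoch milliseconds, and template resolution is left out. It also assumes that a round with no eligible registrations or no non-example submissions counts as "missing timestamps".

```typescript
// Sketch only: names and the epoch-millisecond representation are assumptions.
interface ClosedPhase {
  name: "Registration" | "Submission" | "Review";
  start: number;
  end: number;
}

type TimelineOutcome =
  | { status: "ok"; phases: ClosedPhase[] }
  | { status: "unresolved"; reason: string };

interface TimelineInput {
  eligibleRegistrationTimes: number[];
  submissionOpenSignal: number | null; // earliest legacy submission-open signal, if any
  nonExampleSubmitTimes: number[];
  reviewTimes: number[]; // explicit review timestamps, possibly empty
}

function deriveHistoricalTimeline(input: TimelineInput): TimelineOutcome {
  const reg = input.eligibleRegistrationTimes;
  const sub = input.nonExampleSubmitTimes;
  const rev = input.reviewTimes;
  if (reg.length === 0) {
    return { status: "unresolved", reason: "no eligible registration timestamps" };
  }
  if (sub.length === 0) {
    return { status: "unresolved", reason: "no non-example submit timestamps" };
  }

  // Prefer the open signal; otherwise fall back to the earliest submit.
  const submissionStart = input.submissionOpenSignal ?? Math.min(...sub);
  const submissionEnd = Math.max(...sub);
  if (submissionStart > submissionEnd) {
    return { status: "unresolved", reason: "submission opens after it ends" };
  }

  // Review never starts before submission ends; with no review data it
  // collapses to a zero-length closed interval at the submission end.
  const reviewStart = rev.length > 0 ? Math.max(submissionEnd, Math.min(...rev)) : submissionEnd;
  const reviewEnd = rev.length > 0 ? Math.max(reviewStart, Math.max(...rev)) : submissionEnd;

  return {
    status: "ok",
    phases: [
      { name: "Registration", start: Math.min(...reg), end: Math.max(...reg) },
      { name: "Submission", start: submissionStart, end: submissionEnd },
      { name: "Review", start: reviewStart, end: reviewEnd },
    ],
  };
}
```

Because the function has no side effects, planning can call it before returning `decision=create` and apply can call it again with the same inputs.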

### 4. Participant materialization

Submitter resources come from legacy registrations, not just from members with submissions. The importer must create or reuse exactly one submitter-role resource per eligible registrant that resolves in the target environment.

**Eligible registrant rule:** every distinct `round_registration.coder_id` for the selected round where `eligible == '1'`.

**Identity normalization rule:** resolve each legacy `coder_id` once through the same normalized member lookup and reuse that normalized identity for Resource API writes, imported submissions, and imported review records so the same member cannot surface with conflicting cross-service identities.

**Stable resource dedup key:** `(challengeId, memberId, roleId=submitter)`.
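The eligible registrant rule and the dedup key might look like the following. This is a sketch under assumptions: the row shape is hypothetical beyond `coder_id` and `eligible` (a `round_id` column is assumed), legacy values are assumed to be strings as the `eligible == '1'` comparison suggests, and the `|` separator in the key is arbitrary.

```typescript
// Sketch only: `round_id` and the string-typed columns are assumptions.
interface RoundRegistrationRow {
  round_id: string;
  coder_id: string;
  eligible: string;
}

// Every distinct coder_id for the round where eligible == '1', in stable order.
function eligibleCoderIds(rows: RoundRegistrationRow[], roundId: string): string[] {
  const ids = new Set<string>();
  for (const row of rows) {
    if (row.round_id === roundId && row.eligible === "1") {
      ids.add(row.coder_id);
    }
  }
  return Array.from(ids).sort();
}

// (challengeId, memberId, roleId=submitter) rendered as one comparable string.
function submitterResourceKey(challengeId: string, memberId: string, roleId: string): string {
  return [challengeId, memberId, roleId].join("|");
}
```

Sorting the ids is not required by the rule itself; it only keeps plan output stable across reruns.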

### Missing-member skip policy

If the target dev environment does not contain a legacy member, classify that member as `missing-member` for the current run and:

- skip resource creation for that member
- skip that member's non-example submissions
- skip that member's final and provisional review materialization
- continue importing other members for the round
- write a deterministic skipped-file artifact for later manual processing

The skipped artifact should be stable enough to support rerun comparison and manual recovery, and should include at least the legacy round id, member id, skip reason, and affected surfaces.
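One way to keep the skipped artifact deterministic is to sort entries and omit anything run-specific (timestamps, process ids) before serializing. The field names, surface labels, and `renderSkippedArtifact` below are hypothetical; the source only requires the four pieces of information listed above.

```typescript
// Sketch only: field names and surface labels are assumptions.
type SkippedSurface = "resource" | "submission" | "provisional-review" | "final-review";

interface SkippedMember {
  legacyRoundId: string;
  legacyMemberId: string;
  reason: "missing-member";
  affectedSurfaces: SkippedSurface[];
}

// Locale-independent comparison so ordering is identical on every machine.
function compareStrings(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

function renderSkippedArtifact(entries: SkippedMember[]): string {
  const normalized = entries
    .map((e) => ({
      legacyRoundId: e.legacyRoundId,
      legacyMemberId: e.legacyMemberId,
      reason: e.reason,
      affectedSurfaces: e.affectedSurfaces.slice().sort(compareStrings),
    }))
    .sort(
      (a, b) =>
        compareStrings(a.legacyRoundId, b.legacyRoundId) ||
        compareStrings(a.legacyMemberId, b.legacyMemberId)
    );
  return JSON.stringify(normalized, null, 2);
}
```

With this shape, two runs over the same legacy data produce byte-identical files regardless of discovery order, so a plain file diff is enough for rerun comparison.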

### Approved completed-challenge resource workflow

If the Resource API refuses submitter creation on a completed historical challenge, the user has approved a temporary status-transition workflow solely for submitter-resource backfill:

- capture the original challenge status first
- transition only as much as needed to satisfy the Resource API write constraint
- create the missing submitter resources through the Resource API
- restore the challenge to its original completed state before the importer finishes

This workflow is a narrow exception for historical resource backfill only; it does not authorize general challenge-level rewrites.
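The capture, transition, create, and restore steps map naturally onto a `try`/`finally` wrapper. The sketch below assumes a hypothetical `ChallengeStatusClient` and a caller-supplied `writableStatus`; the source does not name the status the Resource API requires, so that value is left as a parameter.

```typescript
// Sketch only: the client interface and status values are assumptions.
interface ChallengeStatusClient {
  getStatus(challengeId: string): Promise<string>;
  setStatus(challengeId: string, status: string): Promise<void>;
}

async function withTemporaryStatus<T>(
  client: ChallengeStatusClient,
  challengeId: string,
  writableStatus: string, // whichever status the Resource API accepts for writes
  work: () => Promise<T>
): Promise<T> {
  // Capture the original status before touching anything.
  const original = await client.getStatus(challengeId);
  if (original === writableStatus) {
    return work(); // no transition needed
  }
  await client.setStatus(challengeId, writableStatus);
  try {
    return await work();
  } finally {
    // Restore even when resource creation throws.
    await client.setStatus(challengeId, original);
  }
}
```

A `finally` block does not run if the process is killed mid-flight, so a rerun still has to detect a challenge left in the temporary status and converge it back to its completed state.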

### 5. Submission materialization

Only non-example legacy submissions are imported. The importer must preserve the full non-example history for members that resolve in the target environment, and explicitly skip/report missing-member rows instead of creating partial participant footprints.

**Stable submission identity invariant:** imported `Submission.legacySubmissionId` must be a deterministic composite derived from legacy submission identity so round-wide and rerun validation can compare exact sets. The contract assumes `legacySubmissionId` is the stable external identity for imported submissions.
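To show why a deterministic composite matters, the sketch below builds a `legacySubmissionId` from legacy fields and compares expected and actual identity sets. The components of the composite are not specified in this document, so the `round_id:coder_id:submission_number` layout and the row fields are purely hypothetical placeholders.

```typescript
// Sketch only: the composite layout and row fields are hypothetical.
interface LegacySubmissionRow {
  round_id: string;
  coder_id: string;
  submission_number: string;
}

function legacySubmissionId(row: LegacySubmissionRow): string {
  return [row.round_id, row.coder_id, row.submission_number].join(":");
}

interface IdentityDiff {
  missing: string[]; // expected from legacy data but absent in v6
  unexpected: string[]; // present in v6 but not derivable from legacy data
}

function diffIdentitySets(expected: string[], actual: string[]): IdentityDiff {
  const expectedSet = new Set<string>(expected);
  const actualSet = new Set<string>(actual);
  return {
    missing: expected.filter((id) => !actualSet.has(id)).sort(),
    unexpected: actual.filter((id) => !expectedSet.has(id)).sort(),
  };
}
```

Because the id depends only on legacy data, a rerun computes the same expected set, and an empty diff in both directions indicates neither loss nor duplication.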

### 6. Score materialization

Two score streams are imported:

- **provisional history** — one provisional review summation per imported non-example submission, using `long_submission.submission_points`
- **final result** — one final review summation per imported member, attached to that member's latest imported non-example submission

Final-score derivation uses legacy final-result fields with the agreed precedence:

1. `long_comp_result.system_point_total`
2. `long_comp_result.point_total`
3. the ranking score from legacy state data used for final ordering

If a legacy finalist has no imported non-example submission to attach to, the importer must skip that final score explicitly rather than create an orphan final review summation. Missing-member skips should be reported distinctly from other skip reasons.
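A minimal sketch of the precedence and attachment rules follows. It assumes a field counts as absent only when it is `null`, so a genuine score of `0` is kept instead of falling through to the next source. `rankingScore`, `ImportedSubmissionRef`, and `planFinalScore` are hypothetical names, and real legacy exports may need string-to-number parsing first.

```typescript
// Sketch only: `rankingScore` and the helper types are hypothetical names.
interface LegacyFinalResultRow {
  coder_id: string;
  system_point_total: number | null;
  point_total: number | null;
  rankingScore: number | null; // ranking score from legacy state data
}

interface ImportedSubmissionRef {
  submissionId: string;
  coder_id: string;
  submittedAt: number; // epoch milliseconds
}

function resolveFinalScore(row: LegacyFinalResultRow): number | null {
  // `??` falls through only on null/undefined, so a zero score is preserved.
  return row.system_point_total ?? row.point_total ?? row.rankingScore;
}

type FinalScorePlan =
  | { action: "attach"; submissionId: string; score: number }
  | { action: "skip"; coderId: string; reason: string };

function planFinalScore(
  row: LegacyFinalResultRow,
  imported: ImportedSubmissionRef[]
): FinalScorePlan {
  const score = resolveFinalScore(row);
  if (score === null) {
    return { action: "skip", coderId: row.coder_id, reason: "no final score available" };
  }
  // Find the member's latest imported non-example submission.
  let latest: ImportedSubmissionRef | undefined;
  for (const s of imported) {
    if (s.coder_id === row.coder_id && (latest === undefined || s.submittedAt > latest.submittedAt)) {
      latest = s;
    }
  }
  if (latest === undefined) {
    // Never create an orphan final review summation.
    return { action: "skip", coderId: row.coder_id, reason: "no imported non-example submission" };
  }
  return { action: "attach", submissionId: latest.submissionId, score };
}
```

Using `||` instead of `??` would silently replace a legitimate `0` with the next fallback, which is the kind of error the zero-preservation rule guards against.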

**Stable review-summation dedup keys:**

- provisional: exactly one provisional review summation per imported submission (`submissionId + provisional`)
- final: exactly one final review summation on the member's latest imported non-example submission (`submissionId + final`)

## Reuse / Backfill Rules

These are core safety invariants:

- existing v6 marathon challenges are the source of truth for challenge-level fields
- backfill may add missing linked records only
- already-present standard phase rows on reused challenges are preserved
- reruns must not duplicate challenges, phases, resources, submissions, or review summations
- example submissions and example review summations are never imported

## Apply / Resume Behavior

Cross-service writes are not a single distributed transaction. The importer therefore must be round-scoped and restart-safe:

- plan a round before applying it
- read before write on every owned surface
- treat rerun reconciliation as the recovery path after partial failure
- never assume a round is absent just because a previous apply stopped mid-flight

The observable result of rerunning a partially imported round should be reconciliation to the same steady state, not duplication or destructive rewrite.

If a temporary status-transition workflow is used during participant backfill, reruns must still converge to the same final completed state.
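The read-before-write rule can be reduced to a small generic helper: given the desired records and whatever already exists on the owned surface, only the difference is written. `planCreates` and its `keyOf` parameter are hypothetical; the key would be the stable dedup key for the surface in question.

```typescript
// Sketch only: a generic read-before-write planner with hypothetical names.
function planCreates<T>(desired: T[], existing: T[], keyOf: (item: T) => string): T[] {
  const seen = new Set<string>(existing.map(keyOf));
  const toCreate: T[] = [];
  for (const item of desired) {
    const key = keyOf(item);
    if (!seen.has(key)) {
      seen.add(key); // also collapses duplicates within `desired`
      toCreate.push(item);
    }
  }
  return toCreate;
}
```

For example, if a previous apply wrote `a` and then stopped, a rerun with desired `a, b, c` plans only `b` and `c`, and a further rerun plans nothing. That is the convergence property described above.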

## Data Ownership Invariants

### Challenge DB

Owns:

- challenge identity and completion state
- phase rows and challenge timeline shape

### Resource API

Owns:

- submitter resource creation/reuse
- externally visible `(memberId, roleId)` participant footprint

### Review DB / Review API

Owns:

- imported submissions
- provisional review summations per submission
- final review summations attached to the latest imported non-example submission per member

## Validation-Oriented Invariants

The validation contract relies on these high-level invariants being preserved:

- round `10815` is the primary missing-historical create-path fixture
- a score-rich Marathon Match fixture is selected during score-feature work for final-ranking validation
- round `14272` is the second selected round for multi-round blast-radius checks
- imported submission identity is externally testable via `legacySubmissionId`
- reused-round verification depends on comparing both identity sets and externally visible field snapshots
- for member-owned surfaces, validation reconciles `imported subset + skipped missing-member subset = legacy total`
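The last invariant is simple arithmetic and can be checked per surface. The sketch below uses hypothetical names (`SurfaceCounts`, `findReconciliationGaps`) and assumes the counts have already been gathered from legacy data, the target environment, and the skipped artifact.

```typescript
// Sketch only: type and function names are hypothetical.
interface SurfaceCounts {
  surface: string; // e.g. "resources", "submissions"
  legacyTotal: number;
  imported: number;
  skippedMissingMember: number;
}

// Returns one message per surface where imported + skipped != legacy total.
function findReconciliationGaps(counts: SurfaceCounts[]): string[] {
  return counts
    .filter((c) => c.imported + c.skippedMissingMember !== c.legacyTotal)
    .map(
      (c) =>
        `${c.surface}: imported ${c.imported} + skipped ${c.skippedMissingMember} != legacy ${c.legacyTotal}`
    );
}
```

An empty result means every legacy row on a member-owned surface is accounted for as either imported or explicitly skipped.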
65 changes: 65 additions & 0 deletions .factory/library/environment.md
@@ -0,0 +1,65 @@
# Environment

Environment variables, external dependencies, and setup notes.

**What belongs here:** required env vars, external API URLs, credentials/setup expectations, Node/runtime requirements, read-only source locations.
**What does NOT belong here:** service start/stop commands or ports to manage locally (use `.factory/services.yaml`).

---

## Required Environment

The importer must load `challenge-api-v6/.env.importer.local` for local/dev execution.

Required values:

- `DATABASE_URL` — challenge DB used by `challenge-api-v6`
- `MEMBER_DB_URL` — member lookup DB connection string for target-member resolution during missing-member planning/validation; defaults to `DATABASE_URL` only when that DB can also resolve member data
- `MEMBER_DB_SCHEMA` — schema used for member lookup tables (default behavior is code-defined; validators should set it explicitly when member data is not reachable through the challenge schema)
- `REVIEW_DB_URL` — review DB used for submissions and review summations
- `RESOURCES_API_URL` — base URL for Resource API writes and reads
- `AUTH0_URL`
- `AUTH0_AUDIENCE`
- `AUTH0_CLIENT_ID`
- `AUTH0_CLIENT_SECRET`

Optional / useful values:

- `DATA_DIRECTORY=/mnt/Informix`
- importer-scoped attribution values such as `CREATED_BY` / `UPDATED_BY`
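A fail-fast check for the required values could look like the sketch below. It is illustrative only: `missingImporterEnv` is a hypothetical helper, it takes a plain record rather than reading `.env.importer.local` itself, and it treats every name in the required list as mandatory, which is stricter than the defaults described for `MEMBER_DB_URL` and `MEMBER_DB_SCHEMA`.

```typescript
// Sketch only: helper name and strictness are assumptions.
const REQUIRED_IMPORTER_ENV: string[] = [
  "DATABASE_URL",
  "MEMBER_DB_URL",
  "MEMBER_DB_SCHEMA",
  "REVIEW_DB_URL",
  "RESOURCES_API_URL",
  "AUTH0_URL",
  "AUTH0_AUDIENCE",
  "AUTH0_CLIENT_ID",
  "AUTH0_CLIENT_SECRET",
];

// Returns the names that are unset or blank, in declaration order.
function missingImporterEnv(env: Record<string, string | undefined>): string[] {
  return REQUIRED_IMPORTER_ENV.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

Reporting all missing names at once, instead of failing on the first, keeps setup errors to a single round trip.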

## Canonical API Endpoints For Validation

- Challenge API base URL: `https://api.topcoder-dev.com/v6/challenges`
- Resource API base URL: read from `RESOURCES_API_URL` in `.env.importer.local`

Workers and validators should use these canonical endpoints rather than probing localhost guesses when validating against the populated dev environment.

## Runtime Boundaries

- `/mnt/Informix` is a read-only legacy data source.
- Existing v6 marathon matches are backfill-only at the challenge level.
- Do not commit secrets from `.env.importer.local`.
- The validation target is the existing dev environment referenced by the env file; workers should not assume they are allowed to start replacement local services.

## Node / Tooling Versions

- Repo root (`challenge-api-v6`): Node `22.19.0`
- `challenge-api-v6/data-migration`: Node `18.19.0`
- `pnpm` is installed and available (`10.32.1` during planning)

Workers switching between repo root and `data-migration/` must switch Node versions in the same shell command.

## Existing Local Processes Observed During Planning

These are informational boundaries for worker safety:

- port `3100` already has a running process; do not kill or repurpose it unless the user later explicitly asks
- local postgres is already listening on `54329`; only use it if the env file points there

## Source Data Notes

- Marathon matches come from legacy `round` rows with `round_type_id='13'`.
- Primary join path: `round -> long_component_state -> long_submission -> long_comp_result`.
- `round_registration_*.json` is the source of submitter resources.
- `user_*.json` resolves `coder_id` identities.