Skip to content

PHOENIX-7787 Make CCF HAGroupStore ZK Updates backward compatible with existing ZK based client#2476

Open
ritegarg wants to merge 5 commits into
apache:PHOENIX-7562-feature-newfrom
ritegarg:PHOENIX-7787-legacy-crr-sync
Open

PHOENIX-7787 Make CCF HAGroupStore ZK Updates backward compatible with existing ZK based client#2476
ritegarg wants to merge 5 commits into
apache:PHOENIX-7562-feature-newfrom
ritegarg:PHOENIX-7787-legacy-crr-sync

Conversation

@ritegarg
Copy link
Copy Markdown
Contributor

@ritegarg ritegarg commented May 14, 2026

What changes were proposed in this pull request?

Adds an opt-in sync from /phoenix/consistentHA/<G> to the legacy /phoenix/ha/<G> znode in
each region server's HAGroupStoreClient. Each client derives a combined ClusterRoleRecord
(local role + peer role) and writes it to its local /phoenix/ha via ZK stat-version CAS, so
pre-consistentHA ZK-registry clients keep reading a current view of the HA group.

New config keys (in QueryServices / QueryServicesOptions):

  • phoenix.ha.legacy.crr.sync.enabled — master switch (default false).
  • phoenix.ha.legacy.crr.reconciliation.interval.seconds — periodic reconciliation cadence
    (default 60; 0 disables only the periodic loop, the event-driven path still runs).

Convergence model:

  • Event-driven on consistentHA CHILD_ADDED / CHILD_UPDATED (LOCAL or PEER).
  • Periodic reconciler on a dedicated single-thread executor (0–30s jitter on first run).
  • Curator NodeCache watcher on the legacy znode itself, so per-sync reads are in-memory
    rather than a ZK round-trip. The NodeCache is eventually consistent, so on apparent absence
    the sync path falls back to one authoritative getData() to disambiguate "absent" from
    "not-yet-cached" before deciding CREATE_NEW vs CAS_WITH_VERSION.
  • On CAS BadVersion: log at INFO and bail. The NodeCache watch + next event/periodic cycle
    reconverges. No inline retry.

Code-level changes:

  • ClusterRoleRecord: new explicit-RegistryType ctor; @JsonCreator now round-trips
    registryType (legacy znode is always written with RegistryType.ZK for backward
    compatibility, and must read back as ZK). New isLogicallyEqualIgnoringVersionAndRegistry
    helper used as the rewrite short-circuit.
  • PhoenixHAAdmin:
    • Three new helpers on the legacy namespace:
      • getClusterRoleRecordAndStatInZooKeeper — atomic (data, stat) read via
        storingStatIn.
      • createOrUpdateClusterRoleRecordWithCAS(name, record, LegacyCrrWriteMode mode, int expectedStatVersion) — typed write API. LegacyCrrWriteMode is a public enum with
        values CREATE_NEW, FORCE_OVERWRITE, CAS_WITH_VERSION; the int argument is
        meaningful only for CAS_WITH_VERSION (validated >= 0 at entry). Both BadVersion
        (CAS lost) and NodeExists (create lost) surface as
        StaleClusterRoleRecordVersionException so callers retry uniformly.
    • Atomic-read fix on the existing getHAGroupStoreRecordInZooKeeper (replaces a
      pre-existing 2-RPC pattern that could return a stat version not matching its data
      bytes).
  • HAGroupStoreClient: legacyHaAdmin + legacyCrrNodeCache + private legacyCrrSyncLock
    • dedicated periodic executor. Listener hooks for CHILD_ADDED / CHILD_UPDATED.
      CHILD_REMOVED is intentionally a no-op — the legacy /phoenix/ha znode is never deleted
      by this client.
  • New StaleClusterRoleRecordVersionException (CRR-specific analog of
    StaleHAGroupStoreRecordVersionException).

Why are the changes needed?

The Consistent Cluster Failover work moved HA group state from /phoenix/ha (which held a
ClusterRoleRecord) to /phoenix/consistentHA (which holds an HAGroupStoreRecord and uses
an RPC-registry view of CRRs). Existing Phoenix clients built against the older ZK-registry
contract still read /phoenix/ha/<G> and expect:

  • The znode to exist for any live HA group.
  • The record's registryType to be ZK.
  • The roles to reflect the latest state of both clusters.

Without this patch, after a CCF rollout the /phoenix/ha znode goes stale or absent, and old
ZK-registry clients break. This change keeps /phoenix/ha in sync with the new authoritative
/phoenix/consistentHA records on a per-RS basis, gated by an opt-in flag so it only runs
where the operator needs it.

Does this PR introduce any user-facing change?

Yes, but the new behavior is gated entirely behind the new master switch
(phoenix.ha.legacy.crr.sync.enabled), which defaults to false.

When the flag is enabled:

  • Each HAGroupStoreClient writes to (and maintains) the legacy /phoenix/ha/<G> znode on
    its local ZK.
  • Old pre-consistentHA ZK-registry clients pointed at /phoenix/ha will see fresh records
    again.

One incidental visible change independent of the flag: ClusterRoleRecord JSON
deserialization now respects the persisted registryType field. Previously it was hard-coded
to RPC regardless of JSON content. Existing JSON without a registryType field still
defaults to RPC, so deployments not exposed to legacy-sync writes see no behavior change.

How was this patch tested?

  • 13 integration tests in HAGroupStoreClientIT covering: feature-off no-op, initial create,
    role-changing transition, state-only transition (no rewrite), peer-driven role flip,
    peer-absent → role2=UNKNOWN → recovery, pre-existing stale znode overwrite,
    CAS error mapping + write-mode dispatch (CREATE_NEW, FORCE_OVERWRITE,
    CAS_WITH_VERSION negative-version rejection), LOCAL/PEER CHILD_REMOVED
    (znode preserved), registryType stays ZK across multiple role cycles, reconciliation
    interval =0 (event-driven still works), and periodic-loop recovery after an external
    divergence.
  • 4 new unit tests in ClusterRoleRecordTest covering JSON backward-compat: missing
    registryType defaults to RPC, explicit RPC round-trips, explicit ZK round-trips, and
    full ZK toJson/fromJson round-trip preserves registryType. Backed by three JSON
    fixtures under phoenix-core/src/test/resources/json/.
  • Full HAGroupStoreClientIT: 39/39 pass.
  • ClusterRoleRecordTest (unit): 16/16 pass.
  • mvn spotless:check clean across the touched modules.

Was this patch authored or co-authored using generative AI tooling?

Generated-by: Claude (Anthropic).

Ritesh Garg and others added 5 commits May 14, 2026 14:24
…h existing ZK based client

Co-authored-by: Cursor <cursoragent@cursor.com>
- Replace raw-int sentinels in PhoenixHAAdmin.createOrUpdateClusterRoleRecordWithCAS
  with a typed PhoenixHAAdmin.LegacyCrrWriteMode { CREATE_NEW, FORCE_OVERWRITE,
  CAS_WITH_VERSION } and (mode, expectedStatVersion) signature. Negative versions
  are rejected only for CAS_WITH_VERSION.

- In HAGroupStoreClient.syncLegacyCRRIfRoleChanged, when the NodeCache snapshot
  shows the legacy znode as absent, fall back to an authoritative ZK read before
  the logical-equality check and CAS. Prevents an unnecessary rewrite when the
  cache lags behind the actual znode (avoids CREATE_NEW vs CAS misroute and the
  stat-version bump that follows).

- Rename testLegacyCrrCASWithConcurrentWriters_OneSucceedsOneStales to
  testLegacyCrrCasErrorMappingAndModeDispatch and update its Javadoc to reflect
  what it actually tests (sequential CAS error mapping and mode dispatch).

- Add backward-compat JSON tests in ClusterRoleRecordTest covering: missing
  registryType -> defaults to RPC; explicit RPC round-trips; explicit ZK
  round-trips; full ZK toJson/fromJson round-trip preserves registryType.
  Backed by three new JSON fixtures.

- Bump "CAS lost" log from DEBUG to INFO so sustained CAS contention is visible
  in standard operational logs.

- Trim Javadoc and inline comments across the touched files for brevity.

Generated-by: Cursor (Claude).
Co-authored-by: Cursor <cursoragent@cursor.com>
- PhoenixHAAdmin.getHAGroupStoreRecordInZooKeeper: restore @param/@return/@throws
  from the pre-PR Javadoc, keeping the atomic-read note.
- ClusterRoleRecord canonical and convenience constructors: restore per-parameter
  @param tags; on the @JsonCreator constructor also restore the URL-normalization
  note and document the registryType backward-compat default.

No behavior change.

Generated-by: Cursor (Claude).
Co-authored-by: Cursor <cursoragent@cursor.com>
…ecord ctor

Restores L155-L162 of the pre-PR ClusterRoleRecord.java verbatim (8-line
URL-normalization rationale). No behavior change.

Generated-by: Cursor (Claude).
Co-authored-by: Cursor <cursoragent@cursor.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant