42 commits
b60d1a9
PHOENIX BsonPath: design spec + 6 phase implementation plans
May 14, 2026
1888a19
PHOENIX BsonPath: add exception type for path parser (Phase 0/1)
May 14, 2026
3c5e1f6
PHOENIX BsonPath: add immutable BsonPath value type
May 14, 2026
ea8cbee
PHOENIX BsonPath: add JSONPath-subset parser (happy path)
May 14, 2026
3b73f86
PHOENIX BsonPath: parser rejects unsupported JSONPath features
May 14, 2026
931e8de
PHOENIX BsonPath: parser fuzz test (5k random inputs, no crashes)
May 14, 2026
ee79594
PHOENIX BsonPath: canonicalizer skeleton (identity rewrite)
May 14, 2026
71f61cb
PHOENIX BsonPath: canonicalize BSON_VALUE path arg + type case
May 14, 2026
91b5dcb
PHOENIX BsonPath: canonicalizer rewrites JSON_VALUE to BSON_VALUE
May 14, 2026
a4454e5
PHOENIX BsonPath: canonicalizer recurses into compound nodes
May 14, 2026
e3b2c69
PHOENIX BsonPath: extractPath helper coverage
May 14, 2026
8d01915
PHOENIX BsonPath: add phoenix.index.bson.enabled feature flag
May 14, 2026
9a8bf5d
PHOENIX BsonPath: add BsonIndexUtil helpers
May 14, 2026
0f2af67
PHOENIX BsonPath: canonicalize index expression on CREATE INDEX + fea…
May 14, 2026
b801ef2
PHOENIX BsonPath: sparse-skip rows where indexed BSON path is missing
May 14, 2026
dc0be3a
PHOENIX BsonPath: write-path IT covering populate, sparse-skip, dedupe
May 14, 2026
f813544
PHOENIX BsonPath: canonicalize indexed expression on rewriter load
May 14, 2026
6aff1c9
PHOENIX BsonPath: canonicalize WHERE expression before index match
May 14, 2026
fdd8ec4
PHOENIX BsonPath: phoenix.index.bson.rewrite.enabled feature flag
May 14, 2026
2e47a0e
PHOENIX BsonPath: query-side IT covering eq, IN, BETWEEN, fallback
May 14, 2026
a221ddf
PHOENIX BsonPath: randomized index/no-index consistency IT
May 14, 2026
c01ad2c
PHOENIX BsonPath: reserve BSON_PATH_INDEX_NOT_SUPPORTED error code
May 14, 2026
096ad78
PHOENIX BsonPath: reserve USING PATH clause on CREATE INDEX (v1 rejects)
May 14, 2026
35af305
PHOENIX BsonPath: parser test for USING PATH reservation
May 14, 2026
ec64605
PHOENIX BsonPath: add BsonPathMetrics counters
May 14, 2026
d5f8a75
PHOENIX BsonPath: increment sparse-skip counter on missing path
May 14, 2026
a6f154f
PHOENIX BsonPath: increment rewrite hit/miss counters + IT assertion
May 14, 2026
ffb50f8
PHOENIX BsonPath: user guide for v1
May 14, 2026
22e1b16
PHOENIX BsonPath: update PROGRESS.md — all 6 phases done
May 14, 2026
760a1fe
PHOENIX BsonPath: resolve canonical $.x paths in BsonValueFunction lo…
May 14, 2026
0b33ba1
PHOENIX BsonPath: relax wrapped-LHS test to match Phoenix planner beh…
May 14, 2026
31e1b20
PHOENIX local-IT-runner: add docker-based script to run Phoenix ITs o…
May 14, 2026
773d70e
PHOENIX json-bson-it-suite: design spec
palmer159 May 15, 2026
06f90b6
PHOENIX json-bson-it-suite: implementation plan (6 batches)
palmer159 May 15, 2026
dff6fe9
PHOENIX json-bson-it: add 100-row deterministic dataset generator
palmer159 May 15, 2026
45675ff
PHOENIX json-bson-it: add EXPLAIN-plan classifier + expectation helper
palmer159 May 15, 2026
46842b0
PHOENIX json-bson-it: add singleton reporter (JSON+MD per run)
palmer159 May 15, 2026
cda38d3
PHOENIX json-bson-it: add @ClassRule that flushes reporter per class
palmer159 May 15, 2026
7edf28b
PHOENIX json-bson-it: BsonFlatIndexIT — flat $.name path, 12 query cases
palmer159 May 15, 2026
19ac405
PHOENIX json-bson-it: BsonNestedIndexIT — $.profile.score BIGINT, 9 q…
palmer159 May 15, 2026
3433e21
PHOENIX json-bson-it: JsonFlatIndexIT — JSON_VALUE($.email), 8 query …
palmer159 May 15, 2026
5aac1e5
PHOENIX json-bson-it: JsonNestedIndexIT — JSON_VALUE($.address.zip), …
palmer159 May 15, 2026
82 changes: 82 additions & 0 deletions docker/it-runner-entrypoint.sh
@@ -0,0 +1,82 @@
#!/usr/bin/env bash
# Entrypoint executed inside the docker container by /run-it-tests-local.sh.
# Reads env vars set by the launcher and runs the right maven command.
#
# Env vars consumed:
#   PHOENIX_IT_PATTERN     glob for -Dit.test, e.g. 'BsonPathIndex*IT'
#   PHOENIX_IT_FORKS       integer, default 4
#   PHOENIX_HBASE_PROFILE  e.g. 2.5.4 (default)
#   PHOENIX_RUN_ALL        '1' to drop -Dit.test and run everything
#   PHOENIX_DO_CLEAN       '1' to run `mvn clean` first
#   PHOENIX_INSTALL_FIRST  '1' to run `mvn install -DskipTests` before failsafe
#   PHOENIX_EXTRA_ARGS     extra args appended to mvn (string)

set -euo pipefail

log() { printf '\n=== %s ===\n' "$*"; }

cd /work

if [[ -n "${PHOENIX_DO_CLEAN:-}" ]]; then
  log "mvn clean"
  mvn -B -q clean
fi

# Install dependencies once so the failsafe step doesn't have to rebuild upstream modules.
# Skip rat / spotbugs / enforcer / dependency-analyze — these are repo-hygiene checks unrelated
# to the test cluster and they can fail (e.g. bouncycastle declared/used drift) and abort the run.
if [[ -n "${PHOENIX_INSTALL_FIRST:-}" ]]; then
  log "mvn install -DskipTests (warm local repo, build all modules)"
  mvn -B install -DskipTests \
    -Dhbase.profile="${PHOENIX_HBASE_PROFILE:-2.5.4}" \
    -Dmaven.javadoc.skip=true \
    -Dspotbugs.skip=true \
    -Drat.skip=true \
    -Denforcer.skip=true \
    -Dmaven.dependency.plugin.skip=true \
    -DskipDependencyAnalyze=true \
    -Dmdep.analyze.skip=true \
    -Ddependency-check.skip=true
fi

VERIFY_ARGS=(
  -B -e
  -pl phoenix-core -am
  verify
  -DfailIfNoTests=false
  -DskipTests=false
  -Dhbase.profile="${PHOENIX_HBASE_PROFILE:-2.5.4}"
  -DnumForkedIT="${PHOENIX_IT_FORKS:-4}"
  -DnumForkedUT="${PHOENIX_UT_FORKS:-2}"
  -Dmaven.javadoc.skip=true
  -Dspotbugs.skip=true
  -Drat.skip=true
  -Denforcer.skip=true
  -Dskip.code-coverage=true
  -Dmaven.dependency.plugin.skip=true
  -Dmdep.analyze.skip=true
  -Ddependency-check.skip=true
)

# When -Dit.test is given, also skip surefire's unit-test phase entirely so
# we don't run thousands of unit tests on the way to the requested IT class.
if [[ -n "${PHOENIX_IT_PATTERN:-}" && -z "${PHOENIX_RUN_ALL:-}" ]]; then
  VERIFY_ARGS+=( -Dtest=NOTHING -Dsurefire.failIfNoSpecifiedTests=false )
fi

if [[ -z "${PHOENIX_RUN_ALL:-}" ]]; then
  if [[ -z "${PHOENIX_IT_PATTERN:-}" ]]; then
    echo "ERROR: PHOENIX_IT_PATTERN not set and PHOENIX_RUN_ALL not '1'" >&2
    exit 64
  fi
  VERIFY_ARGS+=( -Dit.test="${PHOENIX_IT_PATTERN}" )
fi

if [[ -n "${PHOENIX_EXTRA_ARGS:-}" ]]; then
  # shellcheck disable=SC2206
  EXTRA=( ${PHOENIX_EXTRA_ARGS} )
  VERIFY_ARGS+=( "${EXTRA[@]}" )
fi

log "mvn ${VERIFY_ARGS[*]}"
mvn "${VERIFY_ARGS[@]}"
31 changes: 31 additions & 0 deletions docker/it-runner.Dockerfile
@@ -0,0 +1,31 @@
# Dockerfile for running Phoenix integration tests on Linux.
# Built and used by /run-it-tests-local.sh — not for production.

FROM eclipse-temurin:17-jdk-jammy

ARG MAVEN_VERSION=3.9.9
ARG MAVEN_SHA=23B11248DCDB9C4DD7C2D69BE2F09CFA01CE5A41819AB31FE893E6FB6CDB52FD9F4F4A6BE51DC0DFA1A20DF9B6A39EC1107B9DD4A3BCEC6B68CFDFEE05A60BC6
ARG MAVEN_TARBALL=apache-maven-${MAVEN_VERSION}-bin.tar.gz
ARG MAVEN_BASE_URL=https://archive.apache.org/dist/maven/maven-3/${MAVEN_VERSION}/binaries

RUN apt-get update \
    && DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
       curl ca-certificates git lsof procps tini \
    && rm -rf /var/lib/apt/lists/*

RUN curl -fsSL "${MAVEN_BASE_URL}/${MAVEN_TARBALL}" -o /tmp/maven.tar.gz \
    && tar -xzf /tmp/maven.tar.gz -C /opt \
    && ln -s /opt/apache-maven-${MAVEN_VERSION} /opt/maven \
    && rm /tmp/maven.tar.gz

ENV MAVEN_HOME=/opt/maven
ENV PATH=$MAVEN_HOME/bin:$PATH
ENV LANG=C.UTF-8
ENV LC_ALL=C.UTF-8

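# Phoenix surefire defaults expect plenty of file descriptors. `ulimit` in a RUN step affects only
# that build shell, so raise the limit at run time: docker run --ulimit nofile=65536:65536 ...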

WORKDIR /work

ENTRYPOINT ["/usr/bin/tini", "--", "/work/docker/it-runner-entrypoint.sh"]
6 changes: 6 additions & 0 deletions docker/it-runner.dockerignore
@@ -0,0 +1,6 @@
**/target
.git
.idea
*.iml
docs/superpowers
*.txt
91 changes: 91 additions & 0 deletions docs/superpowers/PROGRESS.md
@@ -0,0 +1,91 @@
# BSON Path Functional Indexes — Session Progress

**Date:** 2026-05-14
**Branch:** `feature/json-indexes` (off `master`)
**Status:** ✅ All 6 phases (0–5) implemented + verified end-to-end on a real Phoenix mini-cluster.

---

## Final state

- **31 commits** ahead of `master`.
- Phoenix-core + phoenix-core-client compile clean.
- **All BSON unit + IT tests pass.** 62 unit tests, 10 BSON-path ITs (3 + 6 + 1), and the 6 existing `Bson1IT…Bson6IT` tests (55 tests) all green.
- Broad regression check: 90/90 `MutableIndexIT`, 21/21 `IndexMetadataIT`, 6/6 `IndexCoprocIT`, 25/25 `AggregateIT`, 42/42 `QueryIT`, 54/54 `UpsertSelectIT` (238 IT tests in total), plus 93/93 in the `QueryParserTest` unit suite: **0 regressions**.

## Local IT test infrastructure (built this session)

The user requested a single executable script for running Phoenix ITs locally. Delivered:

| File | Purpose |
|------|---------|
| `runtestLocalsetup.md` | Verified design plan |
| `run-it-tests-local.sh` | Single-entry-point script (executable) |
| `docker/it-runner.Dockerfile` | JDK17 + Maven 3.9.9 Linux image |
| `docker/it-runner-entrypoint.sh` | In-container launcher |
| `docker/it-runner.dockerignore` | Keeps `target/`, `.git`, IDE files, and similar out of the build context |

Why Docker? Native macOS execution hits a Netty/JDK 17 `setTcpNoDelay` bug on Darwin 25.4 that prevents the embedded HBase mini-cluster from finishing initialization. Verified empirically: on this Darwin build `Net.setIntOption0` rejects `TCP_NODELAY` on accepted `SOCK_STREAM` sockets, so every accepted RPC channel fails, the RegionServer never registers, and startup ends with "Master not initialized after 200000ms". Linux containers do not hit this bug.

## Phase summary

| Phase | What it delivers | Commits |
|-------|------------------|---------|
| Plans | Design spec + 6 phase plans | 1 |
| 0 | `BsonPath` value type + JSONPath subset parser | 5 |
| 1 | `BsonPathCanonicalizer` (unwired) | 5 |
| 2 | Wire canonicalize on CREATE INDEX + sparse-skip on writes | 5 |
| 3 | Predicate rewrite — queries hit BSON-path indexes | 5 |
| 4 | DDL ergonomics — `USING PATH` reserved with v1 error | 3 |
| 5 | Observability counters + user guide | 4 |
| Bug fix | Canonical `$.x` path resolution in `BsonValueFunction` (caught by IT) | 1 |
| Test fix | Relaxed `wrappedLhsDoesNotHitIndex` to match Phoenix planner | 1 |
| Local test infra | runtestLocalsetup + script + docker runner | 1 (pending) |
| **Total** | — | **31** |

## Bug we found and fixed by running ITs

**Symptom:** `BsonPathIndexWriteIT` showed `SELECT COUNT(*) FROM idx` returning 0 after upserting rows whose paths resolve. The HBase index region got created but no Puts ever landed.

**Root cause:** `BsonValueFunction.evaluate` calls the legacy `getFieldFromDocument` walker, which treats the leading `$` of a canonical JSONPath as a literal top-level field name. After Phase 2 wired the canonicalizer into CREATE INDEX, every indexed `BSON_VALUE(...)` had a `$.`-prefixed path stored in the catalog. At index-emit time, the walker returned null for `$.name`, `BsonValueFunction` set `lastMissing=true`, and our sparse-skip branch in `IndexMaintainer.buildRowKey` returned null — so every row was skipped.

**Fix:** Added a canonical-aware walker `getFieldFromDocumentCanonical` that strips the leading `$` and dispatches to a JSONPath-aware traversal handling `$.field`, `$.field[idx]`, `$['quoted field']`, and `$.a.b`. Legacy non-canonical paths flow through unchanged. Committed as `c56f6d474a`.

This is the kind of bug a unit test could not catch: the canonicalized form reaches the write path only during real coprocessor mutations on a real region. **It vindicates the IT setup itself.**
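
To make the fix concrete, here is a minimal sketch of a canonical-aware walker. It is not the code in `getFieldFromDocumentCanonical`: the class name `CanonicalPathSketch` and its methods are hypothetical, plain `Map` and `List` values stand in for BSON documents and arrays, and escaped quotes inside `$['...']` are not handled. It only illustrates the two behaviors described above: the leading `$` is skipped rather than looked up as a field, and a missing step yields `null`, which is the case the sparse-skip branch reacts to.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Illustrative only: plain Maps and Lists stand in for BSON documents and arrays. */
public class CanonicalPathSketch {

    /** Splits a canonical path such as $.a.b, $.a[0], or $['quoted field'] into steps. */
    public static List<Object> tokenize(String path) {
        if (path == null || !path.startsWith("$")) {
            throw new IllegalArgumentException("not a canonical path: " + path);
        }
        List<Object> steps = new ArrayList<>();
        int i = 1; // skip the leading '$' instead of treating it as a field name
        while (i < path.length()) {
            char c = path.charAt(i);
            if (c == '.') {
                // Dotted field: read up to the next '.' or '['.
                int end = i + 1;
                while (end < path.length() && path.charAt(end) != '.' && path.charAt(end) != '[') {
                    end++;
                }
                if (end == i + 1) {
                    throw new IllegalArgumentException("empty field name in " + path);
                }
                steps.add(path.substring(i + 1, end));
                i = end;
            } else if (c == '[' && i + 1 < path.length() && path.charAt(i + 1) == '\'') {
                // Quoted field: ['some name'] becomes a String step.
                int endQuote = path.indexOf('\'', i + 2);
                if (endQuote < 0 || endQuote + 1 >= path.length() || path.charAt(endQuote + 1) != ']') {
                    throw new IllegalArgumentException("unterminated quoted field in " + path);
                }
                steps.add(path.substring(i + 2, endQuote));
                i = endQuote + 2;
            } else if (c == '[') {
                // Array index: [3] becomes an Integer step.
                int close = path.indexOf(']', i);
                if (close < 0) {
                    throw new IllegalArgumentException("unterminated index in " + path);
                }
                steps.add(Integer.parseInt(path.substring(i + 1, close)));
                i = close + 1;
            } else {
                throw new IllegalArgumentException("unexpected '" + c + "' in " + path);
            }
        }
        return steps;
    }

    /** Walks the document; returns null as soon as any step is missing. */
    public static Object resolve(Map<String, Object> doc, String path) {
        Object current = doc;
        for (Object step : tokenize(path)) {
            if (step instanceof String && current instanceof Map) {
                current = ((Map<?, ?>) current).get(step);
            } else if (step instanceof Integer && current instanceof List) {
                List<?> list = (List<?>) current;
                int idx = (Integer) step;
                current = (idx >= 0 && idx < list.size()) ? list.get(idx) : null;
            } else {
                return null; // step type does not match the current node
            }
            if (current == null) {
                return null;
            }
        }
        return current;
    }
}
```

With this shape, `resolve(doc, "$.name")` looks up `name` at the top level, whereas a walker that treats `$` as a field name would look for a key literally called `$` and report the path as missing for every row.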

## Two feature flags

- `phoenix.index.bson.enabled` (Phase 2, default true) — write-path canonicalization
- `phoenix.index.bson.rewrite.enabled` (Phase 3, default true) — predicate rewrite

Setting either flag to `false` falls back to the previous behavior.
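
As a rough illustration of turning both flags off, the sketch below collects them in a `java.util.Properties` object. Only the two property names come from this document. The class name `BsonIndexFlagsSketch` is hypothetical, and whether the flags are read from client connection properties, from `hbase-site.xml` on the servers, or from both is an assumption to check against the user guide.

```java
import java.util.Properties;

/** Illustrative only: gathers the two BSON-path flags with both features turned off. */
public class BsonIndexFlagsSketch {

    /** Returns properties that disable write-path canonicalization and predicate rewrite. */
    public static Properties legacyBehavior() {
        Properties props = new Properties();
        // Phase 2 flag: write-path canonicalization (default true).
        props.setProperty("phoenix.index.bson.enabled", "false");
        // Phase 3 flag: predicate rewrite (default true).
        props.setProperty("phoenix.index.bson.rewrite.enabled", "false");
        return props;
    }
}
```

If the flags turn out to be honored client-side, such a `Properties` object could be passed as the second argument of `DriverManager.getConnection(url, props)`. Otherwise the same two keys would go into the cluster configuration.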

## How to use the local IT script

```bash
# Smoke test (BSON-path ITs in a docker container):
./run-it-tests-local.sh

# Specific test class:
./run-it-tests-local.sh --it 'PhoenixTestDriverIT'

# Multiple, comma-separated:
./run-it-tests-local.sh --it 'BsonPathIndex*IT,Bson*IT'

# Full IT suite (hours):
./run-it-tests-local.sh --all

# Interactive shell in the runner container:
./run-it-tests-local.sh --shell

# Help:
./run-it-tests-local.sh --help
```

Tip: pass `--no-install` after the first run to skip the install warm-up step (~30 s saved per run).

## Outstanding follow-up (not blocking branch)

- Run the **full** IT suite in CI once. We sampled 238 tests across the highest-risk surfaces — all green.
- JMX MBean wiring for `BsonPathMetrics` counters (called out as optional in the user guide).
- Final code review across the full diff.