Skip to content

fix(flagd): validate FLAGD_SYNC_PORT before parse to avoid kubelet service-link collision#1798

Open
jabenedicic wants to merge 2 commits into
open-feature:mainfrom
jabenedicic:fix/flagd-in-process-port-validation
Open

fix(flagd): validate FLAGD_SYNC_PORT before parse to avoid kubelet service-link collision#1798
jabenedicic wants to merge 2 commits into
open-feature:mainfrom
jabenedicic:fix/flagd-in-process-port-validation

Conversation

@jabenedicic
Copy link
Copy Markdown

Closes #1797

Summary

FlagdOptions.prebuild() parses FLAGD_SYNC_PORT directly with Integer.parseInt when FLAGD_RESOLVER=in-process. When a Service named flagd-sync shares the pod's namespace (the exact topology produced by the upstream flagd Helm chart and the OpenFeature Operator's InProcessConfiguration pattern), kubelet's service-link env injection populates the workload with FLAGD_SYNC_PORT=tcp://<clusterIP>:<port>, the SDK calls Integer.parseInt("tcp://..."), and the consumer crashes at startup with NumberFormatException.

Full repro, affected versions, and root-cause analysis in #1797.

Change

Validate that FLAGD_SYNC_PORT parses as an integer in [1, 65535] before using it. On invalid input, log at WARN with a pointer to the most likely cause (service-link injection) and fall back to FLAGD_PORT. Behaviour for valid, user-set FLAGD_SYNC_PORT is unchanged.

// Before
String portValue = resolverType == Config.Resolver.IN_PROCESS
        ? fallBackToEnvOrDefault(Config.SYNC_PORT_ENV_VAR_NAME, fromPortEnv)
        : fromPortEnv;
port = Integer.parseInt(portValue);

// After (sketch — see diff for full code)
String portValue = fromPortEnv;
if (resolverType == Config.Resolver.IN_PROCESS) {
    String syncPortValue = fallBackToEnvOrDefault(Config.SYNC_PORT_ENV_VAR_NAME, null);
    if (syncPortValue != null) {
        if (isValidPort(syncPortValue)) {
            portValue = syncPortValue;
        } else {
            log.warn("Ignoring {} value '{}' (not a valid port); falling back to {} ('{}'). ...",
                    Config.SYNC_PORT_ENV_VAR_NAME, syncPortValue,
                    Config.PORT_ENV_VAR_NAME, fromPortEnv);
        }
    }
}
port = Integer.parseInt(portValue);

Why this is the minimal change

The narrower fix (validation only) was preferred over alternatives:

Validation is fully backwards-compatible: legitimately set FLAGD_SYNC_PORT=NNNN continues to take precedence over FLAGD_PORT exactly as today.

Tests

Three new cases in FlagdOptionsTest, all using the existing @SetEnvironmentVariable pattern:

  • testInProcessProvider_invalidSyncPortFallsBackToFlagdPortFLAGD_SYNC_PORT=tcp://10.0.0.1:8015 + FLAGD_PORT=8888 → port 8888 (the exact service-link injection scenario).
  • testInProcessProvider_invalidSyncPortWithNoFlagdPortUsesDefault — invalid FLAGD_SYNC_PORT, no FLAGD_PORTDEFAULT_IN_PROCESS_PORT.
  • testInProcessProvider_outOfRangeSyncPortFallsBackToFlagdPortFLAGD_SYNC_PORT=99999 → falls back (port out of valid range).

Existing tests for FLAGD_SYNC_PORT precedence and fallback are unchanged and continue to pass. Full suite: 21 tests, 0 failures, 0 errors.

mvn spotless:apply reports no formatting changes required.

Out of scope

…rvice-link collision

`FlagdOptions.prebuild()` reads `FLAGD_SYNC_PORT` first in `in-process`
resolver mode (introduced in open-feature#1651), falling back to `FLAGD_PORT` only
when unset. Kubernetes service-link environment variable injection
populates `FLAGD_SYNC_PORT` with `tcp://<clusterIP>:<port>` whenever a
Service named `flagd-sync` shares the pod's namespace — which is exactly
the topology produced by the upstream flagd Helm chart and the
OpenFeature Operator's in-process configuration pattern. The SDK then
calls `Integer.parseInt("tcp://...")` and dies at startup with
`NumberFormatException`, breaking JVM consumers using `FLAGD_RESOLVER=in-process`.

Validate the env var parses as a port in `[1, 65535]` before using it.
On invalid input, log at WARN with a pointer to the most likely root
cause (service-link injection) and fall back to `FLAGD_PORT`. Behaviour
for valid, user-set `FLAGD_SYNC_PORT` is unchanged.

The Node.js and Go in-process providers don't read `FLAGD_SYNC_PORT` at
all, so this collision is Java-specific; users on those SDKs aren't
affected today. A future change may want to revisit whether the SDK
should read this env var at all given it's already reserved by the
flagd daemon's own `--sync-port` CLI flag, but validation is the
minimal change that unblocks affected users without altering the
fallback chain.

Signed-off-by: Jason Benedicic <48251655+jabenedicic@users.noreply.github.com>
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request enhances the flagd provider's configuration logic by introducing validation for the FLAGD_SYNC_PORT environment variable. This change prevents the provider from crashing when encountering non-numeric values, such as those injected by Kubernetes service-links, by logging a warning and falling back to the standard port or default value. The feedback suggests adding a final safety check using the new validation method before the final integer parsing to ensure the provider remains stable even if the fallback port configuration is also invalid.

Per review feedback on open-feature#1798: the final `Integer.parseInt(portValue)`
in `FlagdOptions.prebuild()` could still throw if `FLAGD_PORT` itself
is invalid — same Kubernetes service-link injection pattern that hits
the in-process branch can hit RPC-mode consumers when a Service named
`flagd` exists in the pod's namespace, producing `FLAGD_PORT=tcp://...`.

Validate `portValue` before the final parse and fall back to the
resolver's default port on invalid input, with a WARN log to surface
the misconfiguration rather than silently masking it.

Adds two test cases:
- RPC mode with invalid `FLAGD_PORT` → default RPC port.
- In-process mode with both env vars invalid → default in-process port.

Signed-off-by: Jason Benedicic <48251655+jabenedicic@users.noreply.github.com>
String portValue = resolverType == Config.Resolver.IN_PROCESS
? fallBackToEnvOrDefault(Config.SYNC_PORT_ENV_VAR_NAME, fromPortEnv)
: fromPortEnv;
String portValue = fromPortEnv;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this alias?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes — portValue and fromPortEnv are intentionally distinct:

  • portValue is the mutable "chosen port" — it gets reassigned to syncPortValue on the valid-FLAGD_SYNC_PORT path (line 324) and to defaultPort on the final invalid-port fallback (line 355).
  • fromPortEnv is the stable original FLAGD_PORT value, still referenced in the WARN log on line 340 so the operator can see the exact fallback value being chosen.

Collapsing them would either lose that detail from the log message or require recomputing/getenv-ing it a second time. Happy to rename portValuechosenPort if that makes the intent clearer, but I'd prefer to keep the two variables distinct.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

flagd-provider-java reads FLAGD_SYNC_PORT in in-process mode, colliding with Kubernetes service-link env injection

7 participants