achetronic/request-validator

request-validator

A lightweight ext-authz service for Envoy and Istio that allows or denies requests using simple YAML rules whose conditions are written in CEL (Common Expression Language).

This service is built for complex authorization scenarios that standard Istio AuthorizationPolicy cannot handle, such as:

  • Inspecting the request body (JSON or YAML).
  • Mixing IP networks (CIDRs) and JWT claims with request payload values.
  • Validating OAuth redirect_uris.
  • Blocking specific paths during certain hours.
  • Any other custom validation that goes beyond basic method, path, and header matching.

Quick Tour

Here is a simple policy that allows POST requests to a Keycloak Dynamic Client Registration endpoint, but only if the request comes from an internal private IP range:

defaults:
  action: deny

groups:
  - name: dcr-internal
    action: allow
    rules:
      - name: from-internal-cidr
        match: |
          request.method == 'POST' &&
          request.path.startsWith('/realms/mcp/clients-registrations') &&
          inCIDR(request.remoteIp, ['10.0.0.0/8'])

How it works:

  • defaults.action: The default action when no rules match. We set this to deny to ensure unmatched requests are blocked.
  • groups: Logical collections of rules that share a verdict (like action: allow).
  • rules: Individual checks. The match field contains a CEL expression that must return a boolean. If it evaluates to true, the rule matches and the group's action is applied.
  • inCIDR: A helper function provided by request-validator to easily check IP ranges.

The rest of this guide covers more advanced features and examples.

Examples

1. Limit Admin Access to Office IP and Working Hours

groups:
  - name: admin-business-hours
    action: allow
    rules:
      - name: office-during-the-day
        match: |
          request.path.startsWith('/admin') &&
          inCIDR(request.remoteIp, ['203.0.113.0/24']) &&
          now().getHours('UTC') >= 7 &&
          now().getHours('UTC') < 19

The now() function returns the current time. You can use standard CEL accessors like getHours, getDayOfWeek, or getMonth to enforce schedules without external dependencies.
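As a rough illustration of what the time window in this rule means, here is a minimal Go sketch of the same check. The function names are hypothetical and this is not the service's implementation; it only shows that the window is half-open (07:00 inclusive, 19:00 exclusive) and evaluated in UTC.

```go
package main

import (
	"fmt"
	"time"
)

// inWindow reports whether hour lies in the half-open range [start, end).
func inWindow(hour, start, end int) bool {
	return hour >= start && hour < end
}

// withinBusinessHours applies the 07:00-19:00 UTC window from the rule above.
// The time is converted to UTC first, so the caller's local zone does not matter.
func withinBusinessHours(t time.Time) bool {
	return inWindow(t.UTC().Hour(), 7, 19)
}

func main() {
	lastMinute := time.Date(2026, 5, 19, 18, 59, 0, 0, time.UTC)
	closed := time.Date(2026, 5, 19, 19, 0, 0, 0, time.UTC)
	fmt.Println(withinBusinessHours(lastMinute), withinBusinessHours(closed)) // prints: true false
}
```

A window that crosses midnight (for example 22:00 to 06:00) cannot be written as a single `>=` and `<` pair; it needs an `||` between the two halves.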

2. Require a Header on Webhook Paths

groups:
  - name: webhook-needs-signature
    action: allow
    rules:
      - name: signed-with-x-hub-signature
        match: |
          request.path.startsWith('/hooks/github') &&
          has('x-hub-signature-256', request.headers) &&
          request.header['x-hub-signature-256'].startsWith('sha256=')

The has() helper checks that a header exists and is not empty. Once that is established, request.header gives you the header's first value as a plain string.

3. Block Specific Realms on Public Domains

groups:
  - name: keep-master-realm-private
    action: deny
    rules:
      - name: no-master-on-public-hosts
        match: |
          request.host in ['auth.example-1.com', 'auth.example-2.com'] &&
          request.path.startsWith('/realms/master')

Declaring a group with action: deny makes it easy to write and read explicit blocklists.

4. Validate JSON Request Bodies

Istio's built-in policies cannot inspect request bodies. Here, we only allow Keycloak client registrations if all listed redirect_uris belong to trusted domains:

groups:
  - name: dcr-trusted-redirects
    action: allow
    match: |
      request.method == 'POST' &&
      request.path.matches('^/realms/mcp/clients-registrations(/.*)?$') &&
      request.body.jsonOk
    rules:
      - name: antigravity
        match: |
          request.body.json.redirect_uris.all(u,
            u.startsWith('https://antigravity.google/'))

      - name: chatgpt
        match: |
          request.body.json.redirect_uris.all(u,
            u.matches('^https://([a-z0-9-]+\\.)?openai\\.com/.+$'))

The group-level match serves as a pre-filter. The individual rules are evaluated only when the request is a POST to the registration endpoint with a valid JSON body.
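To make the semantics of the all() check concrete, here is a minimal Go sketch of the same test on a registration body. The type and function names are hypothetical, and the sketch only models the redirect_uris field. Like CEL's all() macro, it returns true for an empty list, so a policy that must reject registrations without any redirect URI would need an explicit size check as well.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// registration models only the field the rule inspects.
type registration struct {
	RedirectURIs []string `json:"redirect_uris"`
}

// allRedirectsTrusted mirrors redirect_uris.all(u, u.startsWith(prefix)).
// It also returns the number of URIs so callers can spot the empty-list case,
// for which the check is vacuously true.
func allRedirectsTrusted(body, prefix string) (bool, int, error) {
	var reg registration
	if err := json.Unmarshal([]byte(body), &reg); err != nil {
		return false, 0, err
	}
	for _, u := range reg.RedirectURIs {
		if !strings.HasPrefix(u, prefix) {
			return false, len(reg.RedirectURIs), nil
		}
	}
	return true, len(reg.RedirectURIs), nil
}

func main() {
	const prefix = "https://antigravity.google/"
	for _, body := range []string{
		`{"redirect_uris":["https://antigravity.google/cb"]}`,
		`{"redirect_uris":["https://antigravity.google/cb","https://evil.example/cb"]}`,
		`{"redirect_uris":[]}`,
	} {
		ok, n, err := allRedirectsTrusted(body, prefix)
		fmt.Println(ok, n, err)
	}
}
```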

5. Multi-factor Checks (Defense in Depth)

If you want a group to require multiple rules to pass before allowing access, set the group mode to all:

groups:
  - name: admin-defence-in-depth
    action: allow
    mode: all
    match: |
      request.path.startsWith('/admin')
    rules:
      - name: from-internal-network
        match: inCIDR(request.remoteIp, ['10.0.0.0/8', '192.168.0.0/16'])
      - name: has-admin-claim
        match: request.header['x-user-groups'].contains('platform-admins')
      - name: no-debug-header
        match: '!has("x-debug", request.headers)'

In all mode, every single rule must evaluate to true. If even one fails, the group denies the request.

For a comprehensive real-world policy, take a look at examples/policy.yaml.

How Policy Evaluation Works

By default, the engine evaluates groups sequentially in the order they are defined. The first group that matches and produces a verdict determines the outcome. If no group produces a verdict, the default action (defaults.action) is applied.

You can customize this flow using several properties:

  • priority: Assign an integer (e.g., priority: -100) to run a group earlier. Groups with lower priority values run first. This is useful for placing global blocklists before allowlist groups.
  • match (Group level): A filter that decides if the group should look at the request. If false, the group is skipped.
  • mode: Controls how rules inside the group are evaluated:
    • firstMatch (default): The first rule that evaluates to true wins.
    • all: All rules must evaluate to true.
  • action (Rule level): You can override the group action on a specific rule.
  • fallthrough: By default, if a rule does not match, the engine moves to the next rule. You can set fallthrough: allow or fallthrough: deny to immediately stop group evaluation with that verdict.
  • dryRun: Set to true to test rules in production. The engine evaluates and logs the decision, but will not block the request.
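The core of the flow described above can be modeled in a few lines of Go. This is a simplified sketch under stated assumptions, not the engine's code: all type and function names are hypothetical, priority is assumed to default to 0, groups with equal priority are assumed to keep their definition order, and only firstMatch mode is modeled (mode: all, rule-level action, fallthrough, and dryRun are left out).

```go
package main

import (
	"fmt"
	"sort"
)

type request struct {
	Path     string
	RemoteIP string
}

type rule struct {
	Name  string
	Match func(request) bool
}

type group struct {
	Name     string
	Action   string             // "allow" or "deny"
	Priority int                // lower values run first; assumed to default to 0
	Match    func(request) bool // optional pre-filter; nil means the group always applies
	Rules    []rule
}

// evaluate returns the verdict and the "group/rule" label that produced it.
func evaluate(groups []group, defaultAction string, req request) (verdict, matched string) {
	ordered := append([]group(nil), groups...)
	// Stable sort: equal priorities keep their definition order (an assumption).
	sort.SliceStable(ordered, func(i, j int) bool {
		return ordered[i].Priority < ordered[j].Priority
	})
	for _, g := range ordered {
		if g.Match != nil && !g.Match(req) {
			continue // group-level match is false: skip the group
		}
		for _, r := range g.Rules {
			if r.Match(req) {
				return g.Action, g.Name + "/" + r.Name // first matching rule wins
			}
		}
	}
	return defaultAction, "<defaults>"
}

func main() {
	groups := []group{
		{Name: "admin-allow", Action: "allow", Rules: []rule{
			{Name: "admin-path", Match: func(r request) bool { return r.Path == "/admin" }},
		}},
		{Name: "blocklist", Action: "deny", Priority: -100, Rules: []rule{
			{Name: "bad-ip", Match: func(r request) bool { return r.RemoteIP == "203.0.113.66" }},
		}},
	}
	fmt.Println(evaluate(groups, "deny", request{Path: "/admin", RemoteIP: "203.0.113.66"}))
	fmt.Println(evaluate(groups, "deny", request{Path: "/admin", RemoteIP: "10.0.0.5"}))
	fmt.Println(evaluate(groups, "deny", request{Path: "/other", RemoteIP: "10.0.0.5"}))
}
```

In this model the blocklist group is defined last but runs first because of its lower priority, which is the pattern the priority bullet describes.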

Available Request Fields in CEL

Every CEL expression has access to two top-level objects: request and facts.

| Field | Type | Description |
| --- | --- | --- |
| request.method | string | HTTP method (GET, POST, etc.) |
| request.scheme | string | http or https (extracted from X-Forwarded-Proto) |
| request.host | string | Request authority (host without port) |
| request.path | string | URL path |
| request.remoteIp | string | Client IP (from X-Forwarded-For or remote address) |
| request.headers | map<string, list<string>> | All headers with lowercased keys |
| request.header | map<string, string> | First value of each header |
| request.queries | map<string, list<string>> | All query parameters |
| request.query | map<string, string> | First value of each query parameter |
| request.body.raw | string | Raw request body (up to defaults.maxBodyBytes) |
| request.body.size | int | Body size in bytes |
| request.body.contentType | string | Content-Type header shortcut |
| request.body.json | dyn | Parsed JSON object, or {} if not JSON |
| request.body.jsonOk | bool | True if the body is valid JSON |
| request.body.yaml | dyn | Parsed YAML object, or {} if not YAML |
| request.body.yamlOk | bool | True if the body is valid YAML |

Note: For the body to be available, you must configure Envoy/Istio to forward request bodies (see the Integration with Istio section below).

Facts: Dynamic and External Data

If some of your data changes frequently (such as IP blocks published by third parties or lists of active customers), you can load it as facts instead of hardcoding it into your YAML policy.

You define facts at the top level of your policy and reference them in CEL using facts.<name>.

There are three sources for facts:

| Method | Source | CEL Type |
| --- | --- | --- |
| value | Defined inline in your YAML policy | The declared YAML type |
| file | Read from a local file path on startup/reload | String (file content) |
| url | Fetched periodically over HTTP | String (latest response body) |

Example configuration:

facts:
  - name: internalCidrs
    method: value
    value:
      - 10.0.0.0/8
      - 192.168.0.0/16

  - name: trustedClients
    method: file
    file:
      path: /etc/policy/lists/trusted-clients.yaml

  - name: chatgptFeed
    method: url
    url:
      address: https://openai.com/chatgpt-actions.json
      interval: 10m
      timeout: 15s
      headers:
        Authorization: "Bearer $TOKEN"

To use them in your CEL rules:

inCIDR(request.remoteIp, facts.internalCidrs)

For facts loaded as raw strings, you can parse them on the fly in CEL:

inCIDR(request.remoteIp, parseJSON(facts.chatgptFeed).prefixes.map(p, p.ipv4Prefix))

The parseJSON and parseYAML helpers return {} if the input is empty or invalid, ensuring your rules do not crash if a fetch fails. It is usually a good idea to protect your rules by checking if the fact is available first:

match: |
  request.path.startsWith('/api') &&
  facts.chatgptFeed != null && facts.chatgptFeed != ""

Note on reliability: If the initial fetch of a URL fact fails during startup, the policy is rejected. If a subsequent background refresh fails, request-validator will log a warning but continue to use the last successfully fetched data. This prevents temporary network issues from blocking requests.
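The "keep the last good value" behavior can be pictured with a small Go sketch. It is an illustration of the pattern only: the cachedFact type, its methods, and errFetch are hypothetical names, and the real service's internals are not described in this README.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errFetch stands in for any network or HTTP failure.
var errFetch = errors.New("fetch failed")

// cachedFact holds the latest successfully fetched value of a URL fact.
type cachedFact struct {
	mu    sync.RWMutex
	value string
	ready bool
}

// refresh stores the fetched value on success. On failure it returns the
// error and leaves the previously stored value untouched.
func (f *cachedFact) refresh(fetch func() (string, error)) error {
	v, err := fetch()
	if err != nil {
		return err
	}
	f.mu.Lock()
	defer f.mu.Unlock()
	f.value, f.ready = v, true
	return nil
}

// get returns the current value and whether any fetch has succeeded yet.
func (f *cachedFact) get() (string, bool) {
	f.mu.RLock()
	defer f.mu.RUnlock()
	return f.value, f.ready
}

func main() {
	var feed cachedFact
	// Startup: a failure here would mean rejecting the policy.
	if err := feed.refresh(func() (string, error) { return `{"prefixes":[]}`, nil }); err != nil {
		fmt.Println("startup fetch failed, policy would be rejected:", err)
		return
	}
	// Background refresh: a failure is only logged.
	if err := feed.refresh(func() (string, error) { return "", errFetch }); err != nil {
		fmt.Println("warning: refresh failed, keeping previous value:", err)
	}
	v, _ := feed.get()
	fmt.Println(v)
}
```

The caller decides what a failure means: fatal at startup, a warning afterwards.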

Logging

request-validator writes structured JSON logs for decisions and internal events. You can configure logging directly in your policy:

logging:
  level: info # debug | info | warn | error
  format: json # json | console
  logBody: false # include the request body in logs (opt-in)
  redactReveal: 6 # show only the first N characters of redacted values
  excludeHeaders: # completely exclude these headers from logs
    - cookie
    - set-cookie
  redactHeaders: # mask these header values
    - authorization
    - proxy-authorization
    - x-api-key
    - x-auth-token
  redactQueryParams: # mask these query parameters
    - access_token
    - id_token
    - code

The CLI flags --log-level and --log-format can be used to override these settings without updating the YAML policy.

An example log entry in JSON:

{
  "time": "2026-05-19T12:14:59.845Z",
  "level": "INFO",
  "msg": "request decided",
  "decision": "allow",
  "rule": "dcr-trusted-redirects/antigravity",
  "reason": "matched",
  "dry_run": false,
  "duration_ms": 0.31,
  "request": {
    "method": "POST",
    "host": "auth.example-1.com",
    "path": "/realms/mcp/clients-registrations",
    "query": "code=***&debug=1",
    "remote_ip": "203.0.113.5",
    "headers": {
      "content-type": "application/json",
      "authorization": "Bearer*********************************",
      "x-api-key": "***"
    },
    "body": { "size": 48, "content_type": "application/json" }
  }
}

The console format outputs values as plain key=value lines, which is highly readable when using kubectl logs -f during local development.

CEL Function Reference

On top of the standard CEL functions and the cel-go extension libraries (ext.Strings(), ext.Encoders(), ext.Lists(), ext.Sets(), ext.Math(), ext.Bindings()), the service registers the following custom helper functions.

Network Functions

  • inCIDR(ip, cidrs): Returns true if ip is in any of the listed CIDR ranges. Supports both IPv4 and IPv6. Plain IPs are automatically treated as /32 or /128.
  • ipFamily(ip): Returns "ipv4", "ipv6", or "" if invalid.
  • isPrivateIP(ip): Returns true if the IP is in private ranges (RFC1918, RFC4193, or link-local).
  • isLoopbackIP(ip): Returns true if the IP is a loopback address (127.0.0.0/8 or ::1).
  • parseURL(url): Parses a URL string and returns a map with scheme, host, port, path, query, fragment, username, and password.
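For readers who want to see what the CIDR check amounts to, here is a minimal Go sketch of inCIDR built on the standard net/netip package. It is not the service's implementation. Two choices are assumptions: invalid input yields false, and IPv4-mapped IPv6 addresses are unmapped before comparison.

```go
package main

import (
	"fmt"
	"net/netip"
)

// inCIDR reports whether ip falls inside any entry of cidrs.
// Entries without a prefix length are treated as single hosts (/32 or /128).
func inCIDR(ip string, cidrs []string) bool {
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return false // assumption: invalid client IPs never match
	}
	addr = addr.Unmap()
	for _, c := range cidrs {
		prefix, err := netip.ParsePrefix(c)
		if err != nil {
			// Not a CIDR: try to read it as a plain IP.
			host, err := netip.ParseAddr(c)
			if err != nil {
				continue // assumption: invalid entries are skipped
			}
			host = host.Unmap()
			prefix = netip.PrefixFrom(host, host.BitLen())
		}
		if prefix.Contains(addr) {
			return true
		}
	}
	return false
}

func main() {
	internal := []string{"10.0.0.0/8", "192.168.0.0/16", "2001:db8::/32"}
	fmt.Println(inCIDR("10.1.2.3", internal))
	fmt.Println(inCIDR("203.0.113.5", internal))
	fmt.Println(inCIDR("2001:db8::1", internal))
	fmt.Println(inCIDR("192.0.2.7", []string{"192.0.2.7"}))
}
```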

Glob Matching

  • glob(string, pattern): Evaluates a glob pattern. * matches anything except slashes, ** matches everything recursively, ? matches a single character, and [abc] matches character classes.
  • globAny(string, patterns): Returns true if any of the glob patterns match.

Encoding and Security

  • sha256Hex(string): Returns the SHA-256 hash of a string as a lowercase hex string.
  • parseJWTUnverified(token): Parses a JWT and returns a map containing {header, payload}. It does not verify the signature, so use it only when signature validation is handled by another layer, such as the gateway.
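The following Go sketch shows what these two helpers amount to, using only the standard library. It is an approximation for illustration, not the service's code: the Go functions return separate header and payload maps rather than one combined map, and error handling is an assumption. Note how the third JWT segment (the signature) is never looked at.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"encoding/hex"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// sha256Hex returns the SHA-256 digest of s as lowercase hex.
func sha256Hex(s string) string {
	sum := sha256.Sum256([]byte(s))
	return hex.EncodeToString(sum[:])
}

// decodeSegment decodes one unpadded base64url JWT segment into a JSON object.
func decodeSegment(seg string) (map[string]any, error) {
	raw, err := base64.RawURLEncoding.DecodeString(seg)
	if err != nil {
		return nil, err
	}
	var out map[string]any
	if err := json.Unmarshal(raw, &out); err != nil {
		return nil, err
	}
	return out, nil
}

// parseJWTUnverified splits a compact JWT and decodes header and payload.
// The signature segment is ignored, so nothing here proves authenticity.
func parseJWTUnverified(token string) (header, payload map[string]any, err error) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return nil, nil, errors.New("expected three dot-separated segments")
	}
	if header, err = decodeSegment(parts[0]); err != nil {
		return nil, nil, err
	}
	if payload, err = decodeSegment(parts[1]); err != nil {
		return nil, nil, err
	}
	return header, payload, nil
}

func main() {
	// Unsigned example token: {"alg":"none"} . {"sub":"alice"} . (empty signature)
	token := "eyJhbGciOiJub25lIn0.eyJzdWIiOiJhbGljZSJ9."
	header, payload, err := parseJWTUnverified(token)
	fmt.Println(header["alg"], payload["sub"], err)
	fmt.Println(sha256Hex("abc"))
}
```

Because anyone can craft a token like the one above, claims read this way are only trustworthy if something upstream has already validated the signature.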

Time

  • now(): Returns the current UTC timestamp. You can call CEL accessors like .getHours(), .getDayOfWeek(), or .getMonth() on the result.

Structured Parsing

  • parseJSON(string): Parses JSON. Returns {} on empty, null, or invalid input.
  • parseYAML(string): Parses YAML. Returns {} on empty, null, or invalid input.
  • jsonPath(object, expression): Evaluates a lightweight JSONPath expression (e.g., $.a.b[*], $..name, $[0]).
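The "return {} instead of failing" contract of parseJSON can be sketched in Go as follows. The names mirror the helpers for readability, but this is an illustrative model rather than the service's code; ipv4Prefixes is a hypothetical helper that walks a feed shaped like the chatgptFeed example (prefixes[].ipv4Prefix).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// parseJSON returns the decoded value, or an empty object when the input
// is empty, the literal null, or not valid JSON.
func parseJSON(s string) any {
	var v any
	if err := json.Unmarshal([]byte(s), &v); err != nil || v == nil {
		return map[string]any{}
	}
	return v
}

// ipv4Prefixes collects prefixes[*].ipv4Prefix, skipping entries without it.
// Missing keys and unexpected types simply produce an empty result.
func ipv4Prefixes(feed string) []string {
	doc, _ := parseJSON(feed).(map[string]any)
	items, _ := doc["prefixes"].([]any)
	var out []string
	for _, it := range items {
		entry, _ := it.(map[string]any)
		if p, ok := entry["ipv4Prefix"].(string); ok {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	feed := `{"prefixes":[{"ipv4Prefix":"192.0.2.0/24"},{"ipv6Prefix":"2001:db8::/32"}]}`
	fmt.Println(ipv4Prefixes(feed))
	fmt.Println(len(ipv4Prefixes("")), len(ipv4Prefixes("{broken")))
}
```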

HTTP Conveniences

  • has(name, map): Returns true if the key exists in the map and has a non-empty value.
  • firstOr(map, name, default): Returns the first value of a key, or the default value if it is empty or missing.
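A small Go model of these two helpers over a multi-valued header map may help clarify the edge cases. It is a sketch under one assumption that the descriptions above do not settle: "non-empty" is taken to refer to the first value of the key.

```go
package main

import "fmt"

// firstOr returns the first value stored under name, or def when the key is
// missing or its first value is empty.
func firstOr(m map[string][]string, name, def string) string {
	if vs := m[name]; len(vs) > 0 && vs[0] != "" {
		return vs[0]
	}
	return def
}

// has reports whether name is present with a non-empty first value
// (assumption: only the first value is considered).
func has(name string, m map[string][]string) bool {
	return firstOr(m, name, "") != ""
}

func main() {
	headers := map[string][]string{
		"x-hub-signature-256": {"sha256=abc123"},
		"x-empty":             {""},
	}
	fmt.Println(has("x-hub-signature-256", headers))      // prints: true
	fmt.Println(has("x-empty", headers))                  // prints: false
	fmt.Println(firstOr(headers, "x-request-id", "none")) // prints: none
}
```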

Deployment and Operation

Running Locally

To run request-validator locally with the included example:

go run ./cmd \
  --config examples/policy.yaml \
  --log-level debug --log-format console \
  --no-kubernetes

The --no-kubernetes flag tells the service to run in standalone, in-memory mode without attempting to contact a Kubernetes cluster.

Deploying to Kubernetes

An official OCI image is available at: ghcr.io/achetronic/request-validator:<version>.

A ready-to-go deployment setup is located in examples/kubernetes/:

# Create the namespace and admin secret
kubectl create namespace request-validator
kubectl -n request-validator create secret generic request-validator-admin \
  --from-literal=token=$(openssl rand -hex 32)

# Deploy the manifests
kubectl apply -k examples/kubernetes/

This sets up a Deployment with 2 replicas, RBAC permissions, a ConfigMap for the policy, a Service, and a PodDisruptionBudget.

Ports and Endpoints

The service exposes two ports:

1. ext-authz port (Default: 8080)

This is the port Envoy talks to.

  • /: The ext-authz validation endpoint.
  • /healthz: Liveness probe.
  • /readyz: Readiness probe (becomes healthy once a policy is successfully loaded).
  • /metrics: Prometheus metrics.

2. admin port (Default: 8081)

Exposes a CRUD API to modify policies at runtime. Disabled if --admin-token-file is not provided.

  • GET/PUT/DELETE /api/v1/groups[/{name}]: Manage policy groups.
  • GET/PUT/DELETE /api/v1/facts[/{name}]: Manage facts.
  • GET/PUT/DELETE /api/v1/defaults: Override global configuration defaults.
  • GET/PUT/DELETE /api/v1/logging: Manage logging rules.
  • GET /api/v1/config: View the merged running configuration.
  • GET /api/v1/cluster: Details about the leader and cluster state.
  • GET /api/v1/openapi.json: OpenAPI 3.1 specification.

All admin requests must include an Authorization: Bearer <token> header, using the token defined in the admin token file.
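A client for the admin API only needs to attach the bearer token to each request. Below is a minimal Go sketch that builds such a request for the documented GET /api/v1/config endpoint. The base URL, the token value, and the newAdminRequest helper are hypothetical; the sketch builds the request without sending it, so it runs without a live service.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// newAdminRequest builds a GET request for an admin API path and attaches
// the bearer token in the Authorization header.
func newAdminRequest(baseURL, path, token string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, strings.TrimRight(baseURL, "/")+path, nil)
	if err != nil {
		return nil, err
	}
	// TrimSpace guards against a trailing newline when the token is read from a file.
	req.Header.Set("Authorization", "Bearer "+strings.TrimSpace(token))
	return req, nil
}

func main() {
	// Hypothetical address and token; 8081 is the default admin port.
	req, err := newAdminRequest("http://localhost:8081", "/api/v1/config", "example-token\n")
	if err != nil {
		fmt.Println("building request failed:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
	// To send it: resp, err := http.DefaultClient.Do(req)
}
```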

Hot Reloads

When you change the configuration file on disk, request-validator detects the change and automatically reloads the policy without downtime. This covers in-place writes, atomic saves, and Kubernetes ConfigMap updates, which arrive as symlink swaps.

If filesystem events cannot be detected (for example, on NFS or FUSE mounts), sending a SIGHUP signal to the process triggers a reload.

Multi-Replica Mode

When running multiple replicas, the admin API uses Kubernetes resources to synchronize state across instances:

  • Active policy overrides are saved in a shared ConfigMap (request-validator-state).
  • Leader election is handled via a Kubernetes Lease (request-validator-leader).
  • Followers handle read requests from their local informer cache and redirect all write requests to the current leader using an HTTP 307 Temporary Redirect.
  • If a leader replica crashes, another replica takes over automatically within about 15 seconds.

Diagnostic Headers

Every decision returned by request-validator includes headers to help you debug and inspect evaluation results:

  • x-rv-result: The verdict (allow or deny).
  • x-rv-rule: The rule that matched, formatted as group/rule (or <defaults>).
  • x-rv-reason: A short, human-readable explanation.
  • x-rv-dry-run: true if the rule was executed in dry-run mode.

Integration with Istio

To integrate the service with Istio, apply these two configurations:

1. Register the Extension Provider in MeshConfig

Add request-validator as an extension provider in your Istio installation settings (usually the istio ConfigMap in the istio-system namespace):

meshConfig:
  extensionProviders:
    - name: request-validator
      envoyExtAuthzHttp:
        service: request-validator.<NAMESPACE>.svc.cluster.local
        port: 8080
        failOpen: false
        timeout: 2s
        includeRequestBodyInCheck:
          maxRequestBytes: 1048576
          allowPartialMessage: false
        headersToDownstreamOnDeny:
          [content-type, x-rv-result, x-rv-rule, x-rv-reason, x-rv-dry-run]
        headersToUpstreamOnAllow:
          [x-rv-result, x-rv-rule, x-rv-reason, x-rv-dry-run]
        includeRequestHeadersInCheck:
          - authorization
          - content-type
          - cookie
          - x-api-key
          - x-user-groups
          - x-forwarded-for
          - x-forwarded-proto

2. Configure an AuthorizationPolicy

Create a CUSTOM policy to route matching traffic to the validator:

apiVersion: security.istio.io/v1
kind: AuthorizationPolicy
metadata:
  name: keycloak-dcr-ext-authz
  namespace: keycloak
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: keycloak
  action: CUSTOM
  provider:
    name: request-validator
  rules:
    - to:
        - operation:
            hosts: [auth.example-1.com, auth.example-2.com]
            paths:
              - /realms/*/clients-registrations
              - /realms/*/clients-registrations/*

For advanced use cases, such as passing all request headers without listing them individually, see the EnvoyFilter examples in examples/config-for-istio.yaml.

License

Apache-2.0.
