Skip to content

Add regex-based CredentialLeakScorer for fast secret detection #1703

@francose

Description

@francose

The existing leakage detection in PyRIT relies on SelfAskTrueFalseScorer with the leakage.yaml prompt, which requires an LLM call for every evaluation. This works but is slow and expensive at scale.

For CI pipelines and high-volume red team evaluations, a fast deterministic scorer that uses regex to detect common credential patterns (AWS keys, GitHub tokens, JWTs, private keys, connection strings, etc.) would be useful.

Proposed: a CredentialLeakScorer that extends TrueFalseScorer, uses compiled regex patterns, and returns True if any credential format is detected in the model output. No LLM call needed. Supports custom pattern dictionaries for organization-specific secret formats.

This complements the LLM-based scorer — use the regex scorer for speed in CI, fall back to the LLM scorer for nuanced detection of indirect leaks.

I have a working implementation ready and will open a PR.

Metadata

Metadata

Assignees

Labels

enhancementNew feature or request

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions