Aeris PromptShield + Superagent: Complementary security layers? #1116
Replies: 5 comments · 2 replies
-
Hey, sounds interesting! What method do you use to catch these attacks? Is it static checks or a model?
-
Hi Ismail,
Thank you for your reply. To answer your question, we use multiple layers of protection:
1. Pattern-based detection — Fast static checks for known attack signatures (obfuscation, encoding bypasses, jailbreak patterns, prompt entropy)
2. Self-trained LLM classifier — A purpose-built model that catches novel attacks the patterns miss. It continually improves through a hybrid of self-training and supervised training.
Currently in development:
3. Output scanning — Catches potentially sensitive data in LLM responses before they reach the user. Think PII leakage, credential exposure, or prompt leakage.
4. Sandbox simulation (enterprise roadmap) — For the most security-conscious deployments, we're building a system that runs and simulates commands in a sandboxed clone environment before execution. Ultimate protection layer for agentic workflows.
For Superagent specifically, the first two layers would plug in at your input handling with a simple API call, scanning user messages before they hit agent orchestration.
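A minimal sketch of how such a layered input scan can work (the signature list, the stand-in classifier, and the 0.8 threshold are illustrative assumptions, not Aeris's actual implementation):

```python
import re

# Illustrative layer-1 signatures; a real deployment would maintain a
# much larger, continuously updated set.
SIGNATURES = [
    re.compile(r"ignore (all )?previous instructions", re.I),
    re.compile(r"[A-Za-z0-9+/]{40,}={0,2}"),  # long encoded blob (possible payload smuggling)
    re.compile(r"you are now (DAN|in developer mode)", re.I),
]

def classify_with_model(message: str) -> float:
    """Stand-in for the layer-2 LLM classifier; returns an attack score.
    (The real product uses a trained model; this is a toy heuristic.)"""
    return 0.9 if "system prompt" in message.lower() else 0.1

def scan_message(message: str, threshold: float = 0.8) -> dict:
    """Cheap static checks run first; the model is consulted only on a miss."""
    for pattern in SIGNATURES:
        if pattern.search(message):
            return {"blocked": True, "layer": "pattern", "rule": pattern.pattern}
    score = classify_with_model(message)
    if score >= threshold:
        return {"blocked": True, "layer": "classifier", "score": score}
    return {"blocked": False, "layer": None}
```

Only messages that come back with `blocked: False` would be forwarded on to agent orchestration; the ordering keeps the expensive classifier off the hot path for obvious attacks.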
Would love to jump on a quick call to discuss how to protect your production systems from malicious prompt injection attacks.
What does your availability look like this week?
Best,
Ming
Co-founder, Aeris Systems
On Wednesday, February 4th, 2026 at 4:36 PM, Ismail Pelaseyed wrote:
> Hey sounds interesting, what method do you use to catch these attacks? Is it static checks or a model?
-
Hi Ismail,
That makes sense; most teams we talk to have had to build something in-house due to the lack of options available in the market.
Out of curiosity, how are you currently handling prompt injection detection and response in your system? For example, are you focusing more on pattern matching, model-based classification, sandboxing, or something else?
The reason I ask is that our customers usually come to us not because they don’t have a solution, but because maintaining coverage as attacks evolve and models change starts to pull a lot of engineering time away from core product work. Happy to sanity-check whether what we’ve built would be complementary or redundant for you.
It would take just 15 minutes of your time; please feel free to find a slot in my Calendly: https://calendly.com/d/ctms-bfy-z34/15min-call
Best,
Ming
Co-founder, Aeris Systems
On Thursday, February 5th, 2026 at 8:00 PM, Ismail Pelaseyed wrote:
> Our own classifiers do the same thing, not sure what your model would add?
-
Really interesting thread! Layering security is definitely the right approach for AI agents. Another open-source option worth considering alongside PromptShield is ClawMoat, which is designed as a runtime security moat for AI agents.
The nice thing is that it's framework-agnostic, so it can wrap any agent runtime as middleware. It could work well as another layer in the security stack alongside Superagent's built-in protections.
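ClawMoat's actual API isn't shown in this thread, so purely as an illustration of the framework-agnostic middleware idea (every name below is hypothetical, not ClawMoat's real interface):

```python
from typing import Callable

# An "agent" here is just any callable that maps user input to a response.
AgentFn = Callable[[str], str]

def security_moat(agent: AgentFn, is_suspicious: Callable[[str], bool]) -> AgentFn:
    """Wrap any agent callable so both its input and its output pass a check.

    Because the wrapper only needs a callable, it works the same way
    regardless of which agent framework produced that callable.
    """
    def guarded(user_input: str) -> str:
        if is_suspicious(user_input):
            return "[blocked: suspicious input]"
        output = agent(user_input)
        if is_suspicious(output):
            return "[blocked: suspicious output]"
        return output
    return guarded

# Toy agent and a trivial check, just to show the wiring.
def toy_agent(message: str) -> str:
    return f"echo: {message}"

guarded = security_moat(toy_agent, lambda s: "ignore previous instructions" in s.lower())
```

The same wrapper could sit in front of a LangChain chain, a Superagent workflow, or a bare model call, which is what "framework-agnostic middleware" buys you.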
-
From my point of view, the value of an extra security layer here depends less on the headline detection methods and more on where it plugs into the runtime, what the false-positive budget looks like, and how it interacts with existing guardrails around tool execution. If this is going to be framed as complementary to Superagent rather than competitive, the most convincing thing would be a concrete integration boundary and clear evidence about which classes of attacks it catches earlier or more reliably.
-
Hi Superagent team! 👋
I've been impressed watching what you've built—particularly the open-weight Guard models on HuggingFace (the 0.6B edge deployment model is smart positioning for latency-sensitive applications).
I'm Alex, ex-VP of Engineering at Cloudflare (18 years in security). I built Aeris PromptShield, an open-source prompt injection detection layer.
Why I think we're complementary
Your Guard does excellent runtime classification for blocking malicious intent. Our scanner catches sophisticated injection patterns—the kind of obfuscated payloads that might slip past first-pass classification before they even hit your API.
Think of it as defense in depth: PromptShield catches the encoding tricks, Guard classifies the intent.
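To make the "encoding tricks" half of that concrete, here is a toy version of the idea: decode anything that looks like a base64 run and rescan the result. The regexes and size limits are made up for illustration, not PromptShield's actual rules:

```python
import base64
import binascii
import re

INJECTION = re.compile(r"ignore (all )?previous instructions", re.I)

def decode_candidates(text: str):
    """Yield the raw text plus any base64-looking runs decoded, since a
    common obfuscation is hiding the injection inside an encoded blob."""
    yield text
    for blob in re.findall(r"[A-Za-z0-9+/]{16,}={0,2}", text):
        try:
            yield base64.b64decode(blob, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            continue  # not valid base64, or not text; ignore this run

def contains_injection(text: str) -> bool:
    """True if the text, or anything decoded out of it, matches a known pattern."""
    return any(INJECTION.search(candidate) for candidate in decode_candidates(text))
```

A production scanner would handle more encodings (hex, URL-encoding, Unicode homoglyphs, nested layers), but the shape is the same: normalize first, classify second.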
Would you be interested in a design partnership?
Happy to jump on a call or async via GitHub/Discord. Either way, love what you're building here.
— Alex
Links: