Build autonomous agents using the Open Responses API via the HuggingFace Inference Providers router.
Open Responses is an open-source API standard for autonomous agent development. It provides:
- Sub-agent loops: Multi-step workflows in a single request
- Reasoning visibility: Access to agent thinking (raw, summary, or encrypted)
- Semantic streaming: Structured events instead of raw tokens
- Provider-agnostic design: Single endpoint, provider selection via model suffix
Endpoint:

```
https://router.huggingface.co/v1/responses
```

Select a provider with a model suffix:

```
moonshotai/Kimi-K2-Instruct-0905:groq        # Groq (fast)
meta-llama/Llama-3.1-70B-Instruct:together   # Together AI
meta-llama/Llama-3.1-70B-Instruct:nebius     # Nebius (EU)
meta-llama/Llama-3.1-70B-Instruct:auto       # Auto selection
```
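Provider routing happens entirely server-side, but the suffix convention itself is simple. A minimal sketch of how a model string decomposes (the `split_model` helper is illustrative, not part of any SDK):

```python
def split_model(model: str) -> tuple[str, str]:
    """Split a router model string into (base model, provider suffix)."""
    base, sep, provider = model.rpartition(":")
    if not sep:
        # No suffix: the router falls back to automatic selection.
        return model, "auto"
    return base, provider

print(split_model("moonshotai/Kimi-K2-Instruct-0905:groq"))
# → ('moonshotai/Kimi-K2-Instruct-0905', 'groq')
```

The same base model can be pointed at Groq, Together AI, or Nebius by changing only the text after the colon.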
This skill works with:
- Claude Code
- Cursor (via `.cursor/rules/`)
- OpenCode (via `.opencode/` and `AGENTS.md`)
- Codex (via `.codex/` and `AGENTS.md`)

Languages:
- TypeScript/JavaScript
- Python
Install this skill directly in Claude Code:
```shell
# Clone the repository
git clone https://github.com/OthmanAdi/open-responses-agent-skill.git

# Create the skills directory if it doesn't exist
mkdir -p ~/.claude/skills

# Copy the skill
cp -r open-responses-agent-skill/skills/open-responses-agent-dev ~/.claude/skills/

# Restart Claude Code or reload skills
```

Or install via URL:

```
/skills add https://github.com/OthmanAdi/open-responses-agent-skill
```
Copy the rules file:

```shell
mkdir -p .cursor/rules
cp open-responses-agent-skill/.cursor/rules/open-responses-agent.mdc .cursor/rules/
```

Copy the configuration files:

```shell
# For OpenCode
cp -r open-responses-agent-skill/.opencode .
cp open-responses-agent-skill/AGENTS.md .
```
```shell
# For Codex
cp -r open-responses-agent-skill/.codex .
cp open-responses-agent-skill/AGENTS.md .
```

Install the SDK and set your token:

```shell
npm install openai
export HF_TOKEN=your-token
```

```typescript
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://router.huggingface.co/v1",
  apiKey: process.env.HF_TOKEN,
});

const response = await client.responses.create({
  model: "moonshotai/Kimi-K2-Instruct-0905:groq",
  instructions: "You are a helpful assistant.",
  input: "Explain quantum entanglement in simple terms.",
});

console.log(response.output_text);
```

For Python, install the SDK:

```shell
pip install openai
```
```shell
export HF_TOKEN=your-token
```

```python
from openai import OpenAI
import os

client = OpenAI(
    base_url="https://router.huggingface.co/v1",
    api_key=os.environ.get("HF_TOKEN"),
)

response = client.responses.create(
    model="moonshotai/Kimi-K2-Instruct-0905:groq",
    instructions="You are a helpful assistant.",
    input="Explain quantum entanglement in simple terms.",
)

print(response.output_text)
```

Repository structure:

```
open-responses-agent-dev/
├── .claude-plugin/              # Claude Code plugin configuration
├── .cursor/rules/               # Cursor rules
├── .opencode/                   # OpenCode configuration
├── .codex/                      # Codex configuration
├── skills/                      # Skill definition
│   └── open-responses-agent-dev/
│       └── SKILL.md             # Main skill instructions
├── examples/                    # Complete examples
│   ├── typescript/              # TypeScript examples
│   └── python/                  # Python examples
├── templates/                   # Production-ready templates
│   ├── typescript/              # TypeScript starter
│   └── python/                  # Python starter
├── docs/                        # Documentation
│   ├── migration-guide.md       # Chat Completions → Open Responses
│   └── provider-comparison.md   # Provider comparison
├── AGENTS.md                    # Agent instructions (OpenCode/Codex)
└── README.md                    # This file
```
Simple request with reasoning visibility:
- `examples/typescript/basic-agent.ts`
- `examples/python/basic_agent.py`

Multi-step workflows with tools:
- `examples/typescript/sub-agent-loop.ts`
- `examples/python/sub_agent_loop.py`

Provider switching via model suffix:
- `examples/typescript/multi-provider.ts`
- `examples/python/multi_provider.py`

Accessing agent thinking:
- `examples/typescript/reasoning-visibility.ts`
- `examples/python/reasoning_visibility.py`
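The sub-agent loop pattern behind those examples boils down to: call the API, execute any tool calls it returns, feed the results back as `function_call_output` items, and repeat until a final message arrives. A runnable sketch of that loop, with a stub standing in for the real API call (the stub, the `search` tool, and all names here are invented for illustration):

```python
def fake_responses_create(input_items: list[dict]) -> list[dict]:
    """Stub for the API: first asks for a tool, then returns a final message."""
    if any(i.get("type") == "function_call_output" for i in input_items):
        return [{"type": "message", "content": "done"}]
    return [{"type": "function_call", "name": "search", "arguments": {"q": "x"}}]

# Local tool implementations, keyed by the name the model calls.
TOOLS = {"search": lambda q: f"results for {q}"}

def run_agent(task: str) -> str:
    items: list[dict] = [{"type": "message", "content": task}]
    while True:
        output = fake_responses_create(items)
        items.extend(output)
        calls = [o for o in output if o["type"] == "function_call"]
        if not calls:
            # No tool calls left: the last item is the final answer.
            return output[-1]["content"]
        for call in calls:
            result = TOOLS[call["name"]](**call["arguments"])
            items.append({"type": "function_call_output", "output": result})

print(run_agent("find info"))
# → done
```

With the real API, `fake_responses_create` becomes a `client.responses.create(...)` call whose `output` array drives the same dispatch.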
| Suffix | Provider | Description | Reasoning |
|---|---|---|---|
| `:groq` | Groq | Fast inference | RAW |
| `:together` | Together AI | Open weight specialist | RAW |
| `:nebius` | Nebius AI | European infrastructure | RAW |
| `:auto` | Auto | Automatic selection | Varies |
Request format:

```json
{
  "model": "moonshotai/Kimi-K2-Instruct-0905:groq",
  "instructions": "You are a helpful assistant.",
  "input": "User's task",
  "tools": [...],
  "tool_choice": "auto",
  "reasoning": { "effort": "medium" }
}
```

Response format:

```json
{
  "id": "resp_abc123",
  "model": "moonshotai/Kimi-K2-Instruct-0905",
  "output": [
    { "type": "reasoning", "content": "Let me think..." },
    { "type": "function_call", "name": "search", "arguments": {...} },
    { "type": "function_call_output", "output": "..." },
    { "type": "message", "content": "Final response" }
  ],
  "output_text": "Final response (convenience helper)",
  "usage": { "input_tokens": 100, "output_tokens": 200 }
}
```

Environment variables:

```shell
# Required
export HF_TOKEN=hf_...

# Optional
export MODEL=moonshotai/Kimi-K2-Instruct-0905:groq
export REASONING_EFFORT=medium  # low, medium, high
```

Reasoning visibility modes:
- RAW: `content` field (open weight models via Groq/Together/Nebius)
- Summary: `summary` field (some proprietary models)
- Encrypted: `encrypted_content` field (most proprietary models)
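Putting the response shape and the visibility modes together: a sketch of walking the `output` array and falling back across the three reasoning fields. The dict literals mirror the shapes shown above; the helper names are illustrative, not SDK functions:

```python
def reasoning_text(item: dict) -> str:
    """Extract whatever reasoning the provider exposes."""
    if "content" in item:
        return item["content"]          # RAW (open weight models)
    if "summary" in item:
        return item["summary"]          # Summary (some proprietary)
    if "encrypted_content" in item:
        return "<encrypted>"            # Opaque; pass back to the API unchanged
    return ""

def walk_output(output: list[dict]) -> list[str]:
    """Render each output item as a human-readable log line."""
    lines = []
    for item in output:
        if item["type"] == "reasoning":
            lines.append(f"[thinking] {reasoning_text(item)}")
        elif item["type"] == "function_call":
            lines.append(f"[tool call] {item['name']}({item['arguments']})")
        elif item["type"] == "function_call_output":
            lines.append(f"[tool result] {item['output']}")
        elif item["type"] == "message":
            lines.append(item["content"])
    return lines

output = [
    {"type": "reasoning", "content": "Let me think..."},
    {"type": "function_call", "name": "search", "arguments": {"q": "entanglement"}},
    {"type": "function_call_output", "output": "..."},
    {"type": "message", "content": "Final response"},
]
print("\n".join(walk_output(output)))
```

Dispatching on `type` this way keeps reasoning, tool traffic, and the final message cleanly separated, whichever provider served the request.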
- Migration Guide (`docs/migration-guide.md`): migrate from the Chat Completions API
- Provider Comparison (`docs/provider-comparison.md`): compare providers
License: MIT

Author: Ahmad Othman Adi

Based on the Open Responses specification.