OpenRouter & OpenAI-Compatible Providers

Overview

agentguard's guard_tools function works with any OpenAI-compatible API, not just OpenAI's. One integration, 10+ providers:

  • OpenRouter — 300+ models via a single API
  • Groq — ultra-low latency LLaMA and Mixtral
  • Together AI — open-source models
  • Fireworks AI — fast open-source inference
  • DeepInfra — affordable open-source hosting
  • Mistral — Mistral's own API
  • Perplexity — sonar models with web search built in
  • xAI — Grok models
  • Novita AI — affordable LLaMA models

Quick Start

import os
from openai import OpenAI
from agentguard.integrations import guard_tools, Providers

# Define and guard your tools
from agentguard import guard

@guard(validate_input=True, detect_hallucination=True, max_retries=2)
def search_web(query: str) -> str:
    """Search the web for information."""
    return f"Results for: {query}"

executor = guard_tools([search_web])

# Use with OpenRouter (300+ models)
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.getenv("OPENROUTER_API_KEY"),
)

response = client.chat.completions.create(
    model="anthropic/claude-3.5-sonnet",
    messages=[{"role": "user", "content": "What is quantum entanglement?"}],
    tools=executor.tools,
)

results = executor.execute_all(response.choices[0].message.tool_calls)
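After execution, the tool results normally go back into the conversation as `tool`-role messages so the model can produce a final answer. A minimal sketch of that step, assuming `execute_all` returns one result per tool call in order (check agentguard's API reference for the actual return type):

```python
# Hypothetical follow-up: pair each tool call with its guarded result and
# append them to the conversation as OpenAI "tool" messages.
# Assumes execute_all returns results in the same order as the tool calls.

def build_tool_messages(tool_calls, results):
    """Build one 'tool' message per (call, result) pair."""
    return [
        {"role": "tool", "tool_call_id": call.id, "content": str(result)}
        for call, result in zip(tool_calls, results)
    ]

# messages += build_tool_messages(response.choices[0].message.tool_calls, results)
# final = client.chat.completions.create(model=..., messages=messages)
```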

Real Cost Tracking for Compatible Providers

Use guard_openai_compatible_client for OpenRouter and other OpenAI-compatible providers when you want real usage-based cost tracking in addition to guarded tool execution.

import os
from openai import OpenAI

from agentguard import TokenBudget
from agentguard.integrations import (
    Providers,
    create_client,
    guard_openai_compatible_client,
    guard_tools,
)

budget = TokenBudget(max_cost_per_session=10.00)
executor = guard_tools([search_web])  # search_web as defined in Quick Start

base_client = create_client(Providers.OPENROUTER)
client = guard_openai_compatible_client(
    base_client,
    provider=Providers.OPENROUTER,
    budget=budget,
)

response = client.chat.completions.create(
    model="openai/gpt-4o",
    messages=messages,
    tools=executor.tools,
)

OpenAI-compatible extraction uses a registry-based plugin system internally, so provider-specific usage formats can be added without replacing the default generic extractor.


Built-in Provider Presets

The Providers enum has presets for common providers:

from agentguard.integrations import Providers, create_client

# Each preset reads the API key from environment variables
client = create_client(Providers.OPENROUTER)   # OPENROUTER_API_KEY
client = create_client(Providers.GROQ)          # GROQ_API_KEY
client = create_client(Providers.TOGETHER)      # TOGETHER_API_KEY
client = create_client(Providers.FIREWORKS)     # FIREWORKS_API_KEY
client = create_client(Providers.DEEPINFRA)     # DEEPINFRA_API_KEY
client = create_client(Providers.MISTRAL)       # MISTRAL_API_KEY
client = create_client(Providers.PERPLEXITY)    # PERPLEXITY_API_KEY
client = create_client(Providers.XAI)           # XAI_API_KEY
client = create_client(Providers.NOVITA)        # NOVITA_API_KEY
client = create_client(Providers.OPENAI)        # OPENAI_API_KEY

# Or use kwargs directly
kwargs = Providers.TOGETHER.client_kwargs()
client = OpenAI(**kwargs)

Custom Providers

For any OpenAI-compatible endpoint:

from agentguard.integrations import Provider, create_client

my_provider = Provider(
    name="my-llm",
    base_url="https://my-llm-api.example.com/v1",
    env_key="MY_LLM_API_KEY",
)

client = create_client(my_provider)

guard_tools vs OpenAIToolExecutor

Both work identically. guard_tools is the shortcut:

from agentguard.integrations import guard_tools

executor = guard_tools([search_web, get_weather])
# equivalent to:
from agentguard.integrations import OpenAIToolExecutor
executor = OpenAIToolExecutor()
executor.register(search_web).register(get_weather)

Provider-Specific Notes

OpenRouter

client = create_client(Providers.OPENROUTER)

# OpenRouter supports extra headers for app identification
response = client.chat.completions.create(
    model="openai/gpt-4o",  # or "anthropic/claude-3.5-sonnet", etc.
    messages=messages,
    tools=executor.tools,
    extra_headers={
        "HTTP-Referer": "https://myapp.com",
        "X-Title": "My Agent App",
    },
)

Groq (ultra-low latency)

Groq is ideal for high-frequency tool-calling agents. agentguard's rate limiter helps stay within Groq's limits:

from agentguard import guard
from agentguard.core.types import RateLimitConfig

@guard(rate_limit=RateLimitConfig(calls_per_minute=30))  # Groq free tier limit
def search_web(query: str) -> str: ...

client = create_client(Providers.GROQ)
response = client.chat.completions.create(
    model="llama-3.3-70b-versatile",
    messages=messages,
    tools=executor.tools,
)

Perplexity (web search built in)

Perplexity's sonar models include web search natively. Use agentguard for the API call guardrails:

client = create_client(Providers.PERPLEXITY)
response = client.chat.completions.create(
    model="sonar-pro",  # the older "llama-3.1-sonar-*-online" names are deprecated
    messages=messages,
    tools=executor.tools,
)

Switching Providers

The same executor.tools works across providers — just change the client:

executor = guard_tools([search_web, get_weather])

# Test with Groq (fast, free tier)
groq_client = create_client(Providers.GROQ)

# Deploy with OpenRouter (reliability, 300+ models)
openrouter_client = create_client(Providers.OPENROUTER)

# Same executor, different client
for client, model in [
    (groq_client, "llama-3.3-70b-versatile"),
    (openrouter_client, "openai/gpt-4o"),
]:
    response = client.chat.completions.create(
        model=model, messages=messages, tools=executor.tools,
    )