# Calibration

Tune the VerificationEngine's sensitivity to match your deployment environment.

## Why Calibrate?

Every deployment is different. A tool that is hallucination-prone in one environment may be perfectly reliable in another. Calibration lets you tune the trade-off between false positives (blocking legitimate results) and false negatives (missing actual hallucinations).

- Stricter = lower thresholds → more flagging/blocking, fewer missed hallucinations
- More lenient = higher thresholds → fewer interruptions, but more risk
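To make the threshold trade-off concrete, here is a minimal sketch of how two thresholds partition the posterior hallucination probability P(H) into verdicts. The function and verdict names (`verdict`, `"accept"`, `"flag"`, `"block"`) are illustrative assumptions, not the library's internals:

```python
# Hypothetical sketch: two thresholds split P(H) into three verdict zones.
def verdict(p_h: float, accept_threshold: float = 0.2,
            block_threshold: float = 0.5) -> str:
    if p_h < accept_threshold:
        return "accept"   # low hallucination probability: pass through
    if p_h > block_threshold:
        return "block"    # high hallucination probability: reject the result
    return "flag"         # ambiguous middle zone: surface for review

print(verdict(0.25))                                             # flag (defaults)
print(verdict(0.25, accept_threshold=0.1, block_threshold=0.2))  # block (stricter)
```

Lowering both thresholds shrinks the auto-accept zone and grows the block zone, which is exactly what "stricter" means above.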
## The calibrate() API

Use `engine.calibrate()` to tune detection sensitivity globally or per-tool:

```python
from agentguard.verification import VerificationEngine

engine = VerificationEngine()

# Make a specific tool stricter (catches more hallucinations)
engine.calibrate("search_web",
    accept_threshold=0.1,  # Lower = stricter acceptance
    block_threshold=0.3,   # Lower = more blocking
    prior=0.25,            # Higher base hallucination rate
)

# Make another tool more lenient (fewer false positives)
engine.calibrate("get_time",
    accept_threshold=0.4,
    block_threshold=0.8,
    prior=0.05,
)

# Tune global settings (all tools without per-tool overrides)
engine.calibrate(
    accept_threshold=0.15,
    block_threshold=0.45,
)
```
## calibrate() Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `tool_name` | `str \| None` | `None` | Tool to calibrate. `None` for global settings. |
| `accept_threshold` | `float \| None` | `0.2` | P(H) below this → auto-accept |
| `block_threshold` | `float \| None` | `0.5` | P(H) above this → auto-block |
| `prior` | `float \| None` | `0.15` | Base P(hallucination) for this tool |
| `likelihood_ratios` | `dict \| None` | `None` | Per-signal LR overrides |
## Tuning Signal Weights

You can adjust how much each signal contributes to the posterior by overriding its likelihood ratio:

```python
# Make schema violations much more damning
engine.calibrate("search_web",
    likelihood_ratios={
        "schema_mismatch": 20.0,  # Was 12.0
        "latency_anomaly": 2.0,   # Was 3.5 (reduce weight)
    },
)

# Globally increase weight of session inconsistency
engine.calibrate(
    likelihood_ratios={
        "session_inconsistency": 8.0,  # Was 4.0
    },
)
```
## Continuous Learning with record_feedback()

The engine adapts over time when you provide labelled feedback:

```python
# After human review or automated validation:
engine.record_feedback("search_web",
    confidence_score=0.3,
    was_hallucination=True,
)
engine.record_feedback("search_web",
    confidence_score=0.1,
    was_hallucination=False,
)

# The engine uses EMA (alpha=0.1) to update:
# - Per-tool hallucination rate (affects prior)
# - Per-tool blocking threshold (adapts to actual rate)
```
The default EMA alpha is 0.1, meaning each feedback sample contributes 10% to the updated estimate. This provides smooth adaptation without overreacting to individual samples.
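The EMA arithmetic described above can be sketched in a few lines. How the engine applies it internally is an assumption; this only shows how a per-tool hallucination-rate estimate drifts as labelled samples arrive:

```python
# Exponential moving average: each sample contributes `alpha` (10%) of its weight.
def ema_update(estimate: float, sample: float, alpha: float = 0.1) -> float:
    return (1 - alpha) * estimate + alpha * sample

rate = 0.15                   # starting per-tool hallucination rate (the default prior)
for label in [1, 1, 0, 1]:    # was_hallucination outcomes from reviewed feedback
    rate = ema_update(rate, label)
print(round(rate, 4))         # → 0.3523: the estimate drifts toward the observed rate
```

Because alpha is small, a single mislabelled sample moves the estimate by at most 10% of the gap, which is the "smooth adaptation" property noted above.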
## Inspecting Calibration

```python
# Check current settings for a tool
cal = engine.get_calibration("search_web")
print(cal)
# {
#     'accept_threshold': 0.1,
#     'block_threshold': 0.3,
#     'prior': 0.25,
#     'tool_name': 'search_web',
#     'tool_threshold': 0.3,
#     'hallucination_rate': 0.22,
#     'total_feedback': 47,
#     'baseline_calls': 1205,
#     'baseline_latency': {'count': 1205, 'mean': 342.5, 'std': 89.3, ...}
# }

# Check global settings
global_cal = engine.get_calibration()
print(global_cal)
# {'accept_threshold': 0.15, 'block_threshold': 0.45, 'prior': 0.15}
```
## Recommended Calibration Workflow

1. Start with defaults. Deploy the engine with default settings (`prior=0.15`, `accept_threshold=0.2`, `block_threshold=0.5`).
2. Monitor for 24–48 hours. Log all verdicts and manually review flagged/blocked results.
3. Record feedback. Call `record_feedback()` for every reviewed result.
4. Check calibration. Use `get_calibration()` to see how thresholds have adapted.
5. Fine-tune. If a tool has too many false positives, increase its `accept_threshold` or lower its `prior`. If it misses hallucinations, do the opposite.
6. Iterate. Repeat steps 2–5 as traffic patterns change.