Guardrails 2.0

Guardrails 2.0 is ElevenLabs’ control layer for ElevenAgents, designed to keep AI voice agents on-topic, policy-aligned, and safer to deploy in production. It is built for teams using voice agents in support, sales, marketing, reception, and internal workflows.

Synthèse Vocale IA

Assistants Vocaux IA

Développement Agents IA

Visiter le Site Web

Overview

Guardrails 2.0 is ElevenLabs’ control layer for ElevenAgents, designed to keep voice agents aligned with a team’s instructions, safety rules, and operational goals. It adds layered checks around agent behavior so teams can reduce drift, catch manipulation attempts, and block policy-violating replies before they reach the user.

The product is aimed at production voice-agent deployments in support, sales, marketing, and internal workflows. Its controls can be configured in the agent settings or via API, and the page frames them as part of a broader trust-and-safety stack for enterprise deployments, including conversation analytics, optional zero-retention mode, and post-call redaction for eligible customers.

Core capabilities

System prompt hardening

A hardened system prompt provides the baseline policy, and the Focus Guardrail reinforces those instructions throughout a conversation to reduce drift in long or complex interactions.

User input validation

User inputs are checked for prompt injection and instruction-override attempts, with the option to terminate conversations that pose a security risk.

Agent response validation

Every response is evaluated against configured policies before it reaches the user, allowing unsafe or off-topic output to be blocked in real time.

Custom guardrails

Custom Guardrails let teams write domain-specific rules in natural language and enforce them automatically across calls with a block-or-allow decision.

Configurable enforcement behavior

Execution mode, exit strategy, content sensitivity, and per-guardrail toggles give teams control over how strict enforcement is and what happens after a trigger.

Logging and post-call redaction

Triggers and actions are logged in conversation analytics, and sensitive information can be redacted from transcripts, recordings, and webhook payloads after a call ends.

Where it fits

Keep agents on-topic
Use Guardrails 2.0 when a voice agent needs to stay on script in long or complex calls, such as support or onboarding conversations where drift can lead to bad answers.
Reduce prompt-injection risk
Apply manipulation and response checks in customer-facing workflows where users may try to override instructions or prompt the model into unsafe behavior.
Enforce domain policies
Use custom guardrails to enforce company-specific policies, such as escalation rules, prohibited topics, or regulated language requirements.
Tune behavior for live calls
Configure exit strategies and sensitivity levels for live voice interactions where the team wants different handling for low-risk and high-risk issues.
Support review and redaction workflows
Combine logging and redaction with post-call QA workflows that need transcripts and recordings for review while removing sensitive details from stored artifacts.

Pros and Cons

Pros

Uses layered checks instead of relying only on the system prompt.
Covers three common risk points: drifting behavior, prompt injection, and unsafe responses.
Supports custom, domain-specific rules in natural language.
Lets teams choose enforcement behavior, exit strategies, and sensitivity levels.
Provides visibility through conversation analytics and trigger logs.

Cons

Some controls, including conversation history redaction and Zero Retention Mode, are described as enterprise-only.
The page does not list every supported integration or show detailed limits for each guardrail type.
Because guardrails can run in different enforcement modes, stricter settings may add latency compared with allowing responses to stream immediately.

FAQ

How do you enable Guardrails 2.0?

Guardrails 2.0 is configured in ElevenAgents. The page says you can turn them on in the Security tab of an agent’s settings or configure them via the API.

What does Guardrails 2.0 actually do?

The page describes three layers: system prompt hardening, user input validation, and agent response validation. These work together to reinforce instructions, detect manipulation attempts, and block policy-violating replies before delivery.

Can I define my own guardrail rules?

The page states that custom guardrails let you define domain-specific policies in natural language and enforce them automatically across every call. A lightweight model evaluates each response and returns a block or allow decision.

How does Guardrails handle a policy violation?

Yes. The page says execution modes let you choose between running guardrails alongside the response for near-zero delay or holding responses until they are fully cleared. It also notes you can define exit strategies such as ending the conversation, transferring to another agent, escalating to a human, or retrying with corrective instructions.

Are redaction and zero-retention features available to everyone?

Conversation history redaction and Zero Retention Mode are described as available to enterprise clients. The page directs customers to contact sales for access.

Quick Facts

Category: AI Voice Agents / Safety
Platform: ElevenAgents
Primary users: Teams deploying voice agents for support, sales, marketing, and internal workflows
Source domain: elevenlabs.io
Availability: Available in alpha in ElevenAgents; configurable in agent settings or via API
Pricing: No standalone price listed on the page; ElevenLabs offers paid plans and enterprise contact-sales options

Alternatives à Guardrails 2.0

Wallie

Wallie is an open-source AI streamer that watches your screen, hears chat, and generates live commentary in a configurable persona. It runs locally on your machine with your own keys and is aimed at faceless content, autonomous streams, and real-time reactions.

CreateOS Sandbox

CreateOS Sandbox is an isolated compute environment for running code and agent workloads inside Firecracker micro-VMs. It is designed for workflows that need machine-level isolation, private networking between sandboxes, and programmatic control through SDK, CLI, or MCP.

Codex Plugins

Codex Plugins bundle reusable skills, app integrations, and MCP servers into workflows you can install in the Codex app or use from Codex CLI. They help extend Codex with connected-service tasks, reusable instructions, and shared team workflows.

PXZ AI

Une plateforme IA tout-en-un qui combine des outils pour l'image, la vidéo, la voix, l'écriture et le chat afin d'améliorer la créativité et la collaboration.

Gemma AI

Gemma AI is a phone call reminder app that calls you with scheduled reminders instead of push notifications. It helps people who want a more direct way to stay on schedule, with Google Calendar sync and conversational call interactions.

CAMB.AI Streams

CAMB.AI Streams dubs live audio in multiple languages in real time for broadcasts on platforms like YouTube, Twitch, and X. It plugs into existing live workflows using common streaming protocols and avoids a post-production step.