NVIDIA NeMo Guardrails

NVIDIA Developer product page — developer.nvidia.com/nemo-guardrails

Abstract

NVIDIA NeMo Guardrails is a scalable, open-source platform for orchestrating AI safety guardrails that keep agentic AI applications (agents, copilots, chatbots) safe, reliable, and aligned. It lets developers define, orchestrate, and enforce guardrails across several safety dimensions at once: topic control, PII detection, RAG grounding, jailbreak prevention, and multimodal/multilingual content safety with reasoning capabilities. NeMo Guardrails is GPU-accelerated for low-latency parallel rail execution and integrates out of the box with NVIDIA Nemotron NIM microservices, which are available on NGC and Hugging Face. In the benchmark reported on the product page, orchestrating up to five parallel GPU-accelerated guardrails raises the detection rate by 1.4× while adding approximately 0.5 seconds of latency. The page summarises this as ~50% better protection without substantially slowing responses, although a 1.4× detection rate corresponds to a 40% relative increase. NeMo Guardrails is part of the broader NVIDIA NeMo software suite.

Key Concepts

  • Guardrail orchestration: the coordination layer that applies multiple safety checks in parallel or sequence to LLM inputs and outputs, enforcing policies before responses reach end users
  • Parallel rail execution: running multiple safety models concurrently (e.g., jailbreak detection + PII + content safety simultaneously) rather than sequentially, minimising added latency
  • Topic control: guardrail that restricts the LLM to a defined set of permitted topics, rejecting or redirecting out-of-scope queries
  • PII detection: guardrail that identifies and redacts personally identifiable information in inputs and outputs before they are processed or returned
  • RAG grounding: guardrail that verifies LLM responses are grounded in retrieved context, preventing hallucinated answers in RAG pipelines
  • Jailbreak prevention: guardrail using Nemotron NIM models to detect and block adversarial prompts designed to bypass safety instructions
  • NeMo Guardrails microservice: containerised, NIM-packaged deployment of guardrail models, available on NGC and Hugging Face for zero-config integration
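
The idea of parallel rail execution can be illustrated with a minimal Python sketch. It does not use the NeMo Guardrails API: the three rail functions, the `RailResult` type, and the keyword and regex checks are hypothetical stand-ins for real safety models, and `asyncio.sleep` only imitates inference latency. What it shows is that running the rails concurrently makes wall-clock time track the slowest rail rather than the sum of all rails.

```python
import asyncio
import re
from dataclasses import dataclass


@dataclass
class RailResult:
    rail: str
    blocked: bool
    reason: str = ""


# Hypothetical rails: simple checks standing in for real safety models.
# Each one sleeps briefly to imitate model inference latency.
async def jailbreak_rail(text: str) -> RailResult:
    await asyncio.sleep(0.05)
    hit = "ignore previous instructions" in text.lower()
    return RailResult("jailbreak", hit, "override phrase found" if hit else "")


async def pii_rail(text: str) -> RailResult:
    await asyncio.sleep(0.05)
    hit = re.search(r"[\w.+-]+@[\w-]+\.[\w.]+", text) is not None
    return RailResult("pii", hit, "email address found" if hit else "")


async def topic_rail(text: str) -> RailResult:
    await asyncio.sleep(0.05)
    allowed = ("billing", "invoice", "refund")  # assumed permitted topics
    hit = not any(word in text.lower() for word in allowed)
    return RailResult("topic", hit, "out of scope" if hit else "")


async def run_rails_parallel(text: str) -> list[RailResult]:
    # gather() runs all rails concurrently and returns results in call order.
    results = await asyncio.gather(
        jailbreak_rail(text), pii_rail(text), topic_rail(text)
    )
    return list(results)


def check_input(text: str) -> list[RailResult]:
    """Synchronous entry point: run every rail and return all verdicts."""
    return asyncio.run(run_rails_parallel(text))
```

A caller would typically block the request if any returned `RailResult` has `blocked` set, and pass the input on to the LLM otherwise.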

Performance

Configuration                          Detection rate   Added latency
Single rail (baseline)                 1.0×             ~0.1 s
Five parallel GPU-accelerated rails    1.4×             ~0.5 s

Orchestrating five parallel rails with NeMo Guardrails yields 1.4× improved detection at ~0.5 s additional latency.
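
As a quick arithmetic check on these figures, the short Python sketch below converts the 1.4× multiplier into a relative percentage gain. The variable names are illustrative, and the latency difference assumes both latency values are measured against the same unguarded baseline, which the source does not state.

```python
# Figures taken from the table above; detection rate is normalised to the single-rail baseline.
baseline_rate = 1.0
parallel_rate = 1.4
baseline_latency_s = 0.1
parallel_latency_s = 0.5

# Relative improvement in detection rate, in percent.
relative_gain_pct = round((parallel_rate / baseline_rate - 1.0) * 100, 1)

# Extra latency of five rails over one rail (assumes a shared reference point).
extra_latency_s = round(parallel_latency_s - baseline_latency_s, 2)

print(f"Relative detection gain: {relative_gain_pct}%")
print(f"Extra latency vs single rail: {extra_latency_s} s")
```

Read this way, a 1.4× detection rate is a 40% relative gain, so the ~50% figure quoted in the abstract is best treated as a rounded marketing summary rather than a separate measurement.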

Integration

NeMo Guardrails integrates with:

  • Agent frameworks: LangChain, LangGraph, LlamaIndex, multi-agent deployments
  • NVIDIA NIM microservices: Nemotron Safety Guard models for content safety, topic control, and jailbreak detection
  • RAG pipelines: guardrails for enterprise RAG to enforce context grounding and redact PII from retrieved data
  • Observability: native tooling for monitoring guardrail effectiveness and performance
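
For the RAG pipeline case, the two checks named above (redacting PII from retrieved data and enforcing context grounding) can be sketched in plain Python. This is a toy illustration under stated assumptions, not the NeMo Guardrails implementation: the email regex stands in for a PII model, the token-overlap score stands in for a grounding model, and the function names and the 0.6 threshold are hypothetical.

```python
import re

EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")


def redact_pii(chunk: str) -> str:
    """Replace email addresses with a placeholder (toy stand-in for a PII model)."""
    return EMAIL_RE.sub("[REDACTED_EMAIL]", chunk)


def _tokens(text: str) -> set[str]:
    # Lowercased alphanumeric tokens; punctuation is ignored.
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def grounding_score(answer: str, context_chunks: list[str]) -> float:
    """Fraction of answer tokens that also appear in the retrieved context."""
    answer_tokens = _tokens(answer)
    if not answer_tokens:
        return 0.0
    context_tokens: set[str] = set()
    for chunk in context_chunks:
        context_tokens |= _tokens(chunk)
    return len(answer_tokens & context_tokens) / len(answer_tokens)


def is_grounded(answer: str, context_chunks: list[str], threshold: float = 0.6) -> bool:
    """Accept the answer only if enough of it is supported by the context."""
    return grounding_score(answer, context_chunks) >= threshold
```

In a pipeline, `redact_pii` would run on each retrieved chunk before it reaches the LLM, and `is_grounded` would run on the LLM response before it is returned to the user. A production system would replace both heuristics with dedicated models.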

Guardrails Library

The Guardrails Library provides a growing ecosystem of pre-built safety models and rail configurations. Red-teaming tools probe workflows for prompt injection, jailbreaks, tool poisoning, and custom attacks, with results visualised on a dashboard.

Terminology

  • Colang: NeMo Guardrails’ domain-specific language for defining conversational flows and guardrail policies
  • Rail: a single safety check applied to LLM input or output (e.g., jailbreak detection is one rail, PII detection is another)
  • NemoGuard NIM: NVIDIA-packaged NIM microservices for each guardrail type (content safety, topic control, jailbreak detection)
  • Safety for Agentic AI: developer example on NGC demonstrating NeMo Guardrails deployment for agent safety

Connections to Existing Wiki Pages