
AWS Bedrock Guardrails: Content Filtering Guide

AWS Bedrock Guardrails for content filtering, PII detection, and topic blocking at $0.75/1K text units. Configure AI safety policies for production.

Wring Team
March 14, 2026
6 min read
AWS Bedrock, Bedrock Guardrails, AI safety, content filtering, PII detection, responsible AI
AI safety guardrails and content filtering system

Bedrock Guardrails provide configurable safety controls that filter inputs and outputs of your AI applications. They intercept harmful content, block restricted topics, redact PII, check response grounding, and enforce custom word filters — all without modifying your application code. For any production AI application handling user input, guardrails are essential.

TL;DR: Guardrails apply five types of protection: content filters (hate, insults, sexual, violence, misconduct), denied topics (custom restrictions), PII detection/redaction, word filters, and contextual grounding checks. Pricing is $0.75 per 1,000 text units (1 text unit = 1,000 characters) for text and $1.00 per image. Guardrails run on both input and output, so each request is evaluated twice. For high-volume applications, guardrail costs can exceed model costs, so apply them selectively.


Guardrail Components

1. Content Filters

Block content across five categories with configurable strength:

| Category | What It Catches | Strength Levels |
| --- | --- | --- |
| Hate | Discrimination, slurs, bias | None, Low, Medium, High |
| Insults | Personal attacks, derogatory language | None, Low, Medium, High |
| Sexual | Explicit content, sexual references | None, Low, Medium, High |
| Violence | Graphic violence, threats, harm | None, Low, Medium, High |
| Misconduct | Illegal activities, self-harm | None, Low, Medium, High |

Strength levels:

  • Low: Blocks only clearly harmful content
  • Medium: Blocks moderate and clearly harmful content
  • High: Blocks borderline, moderate, and clearly harmful content

2. Denied Topics

Custom topic restrictions specific to your application:

Examples:

  • A financial advisor bot that should not give specific stock picks
  • A healthcare bot that should not provide diagnoses
  • A customer support bot that should not discuss competitor products

You define each topic with a natural-language definition (and optionally a few example phrases), and the guardrail uses a model-based classifier to detect when conversations drift into denied areas.
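In API terms, a denied topic is a named natural-language definition inside the guardrail's topic policy. A minimal sketch using boto3's `create_guardrail` (the guardrail name, blocked-message text, and topic wording are illustrative, not prescribed):

```python
# Denied-topic policy: a natural-language definition plus example phrases.
TOPIC_POLICY = {
    "topicsConfig": [
        {
            "name": "Specific stock picks",
            "definition": (
                "Recommendations to buy, sell, or hold specific stocks, funds, "
                "or other securities. General financial education is acceptable."
            ),
            "examples": ["Should I buy NVDA right now?"],
            "type": "DENY",  # denied topics always use type DENY
        }
    ]
}


def create_advisor_guardrail():
    """Create the guardrail (requires AWS credentials; names are illustrative)."""
    import boto3  # the control-plane "bedrock" client manages guardrails

    bedrock = boto3.client("bedrock")
    resp = bedrock.create_guardrail(
        name="financial-advisor-guardrail",
        topicPolicyConfig=TOPIC_POLICY,
        blockedInputMessaging="I can't help with that request.",
        blockedOutputsMessaging="I can't provide that information.",
    )
    return resp["guardrailId"], resp["version"]
```

After creation, publish a guardrail version and reference its ID and version number from your invocations.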

3. Sensitive Information Filters

Detect and handle PII and custom sensitive patterns:

| PII Type | Action Options |
| --- | --- |
| Email addresses | Block or Anonymize |
| Phone numbers | Block or Anonymize |
| SSN/Tax IDs | Block or Anonymize |
| Credit card numbers | Block or Anonymize |
| Names | Block or Anonymize |
| Addresses | Block or Anonymize |
| Custom regex patterns | Block or Anonymize |

Anonymize replaces the PII with a placeholder ([EMAIL], [PHONE]) while keeping the rest of the response.
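A sketch of what the corresponding sensitive-information policy could look like. The entity-type identifiers are taken from the Bedrock guardrail PII enum as I understand it, and the `internal-ticket-id` regex is a hypothetical custom pattern; verify the exact enum values against the current API reference:

```python
# PII policy sketch: anonymize common identifiers, block card numbers outright.
PII_POLICY = {
    "piiEntitiesConfig": [
        {"type": "EMAIL", "action": "ANONYMIZE"},
        {"type": "PHONE", "action": "ANONYMIZE"},
        {"type": "CREDIT_DEBIT_CARD_NUMBER", "action": "BLOCK"},
    ],
    "regexesConfig": [
        {
            "name": "internal-ticket-id",  # hypothetical custom pattern
            "pattern": r"TICKET-\d{6}",
            "action": "ANONYMIZE",
        }
    ],
}
```

This dictionary is what you would pass as `sensitiveInformationPolicyConfig` when creating or updating a guardrail.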

4. Word Filters

Block specific words, phrases, or profanity. Useful for:

  • Brand protection (blocking competitor names)
  • Compliance (blocking regulated terms)
  • Custom profanity lists

5. Contextual Grounding Check

Verifies that model responses are grounded in the provided source material (Knowledge Base results). This reduces hallucinations by flagging responses that make claims not supported by the retrieved documents.

| Component | What It Checks |
| --- | --- |
| Grounding | Is the response supported by source documents? |
| Relevance | Is the response relevant to the user's query? |
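Both checks are configured with a score threshold below which a response is flagged. A configuration sketch (the 0.75 thresholds are illustrative starting points, not recommendations from AWS):

```python
# Grounding policy sketch: flag responses scoring below the thresholds.
GROUNDING_POLICY = {
    "filtersConfig": [
        {"type": "GROUNDING", "threshold": 0.75},  # supported by source docs?
        {"type": "RELEVANCE", "threshold": 0.75},  # relevant to the user's query?
    ]
}
```

Pass this as `contextualGroundingPolicyConfig` when creating the guardrail; raising a threshold makes the check stricter.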

Pricing

| Component | Cost |
| --- | --- |
| Text guardrails | $0.75 per 1,000 text units |
| Image guardrails | $1.00 per image |
| Contextual grounding | $0.10 per 1,000 text units |
| Text unit | 1,000 characters |

Important: Guardrails evaluate both input and output. A request with 2,000 characters input and 3,000 characters output = 5 text units = $0.00375 per request.
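The arithmetic generalizes to a small helper. This sketch assumes input and output characters are each rounded up to whole text units per request; verify the actual billing granularity against your invoices:

```python
import math

TEXT_UNIT_CHARS = 1_000        # 1 text unit = 1,000 characters
PRICE_PER_1K_UNITS = 0.75      # USD per 1,000 text units (text policies)


def guardrail_cost(input_chars: int, output_chars: int) -> float:
    """Cost of one guarded request: both input and output are evaluated."""
    units = (
        math.ceil(input_chars / TEXT_UNIT_CHARS)
        + math.ceil(output_chars / TEXT_UNIT_CHARS)
    )
    return units * PRICE_PER_1K_UNITS / 1_000


# 2,000-char input + 3,000-char output = 5 text units = $0.00375
cost = guardrail_cost(2_000, 3_000)
```

Multiply by monthly request volume to reproduce the figures in the table below.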

Cost Examples

| Scenario | Monthly Volume | Guardrail Cost |
| --- | --- | --- |
| Chatbot: 100K conversations | 500M characters | $375/month |
| Content moderation: 1M posts | 100M characters | $75/month |
| Document Q&A: 50K queries | 250M characters | $188/month |

Implementation Patterns

Pattern 1: Apply to All Requests

Attach the guardrail to your Bedrock model invocation. Every request is filtered on input and output.

Best for: Customer-facing applications where any harmful content is unacceptable.
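With the Converse API, attaching a guardrail is one extra request parameter. A sketch in which the model ID and guardrail identifiers are placeholders:

```python
def build_guarded_request(prompt: str, guardrail_id: str, guardrail_version: str) -> dict:
    """Build Converse-API kwargs with a guardrail attached (IDs are placeholders)."""
    return {
        "modelId": "anthropic.claude-3-5-sonnet-20240620-v1:0",  # example model
        "messages": [{"role": "user", "content": [{"text": prompt}]}],
        "guardrailConfig": {
            "guardrailIdentifier": guardrail_id,
            "guardrailVersion": guardrail_version,
            "trace": "enabled",  # include per-policy assessments in the response
        },
    }


def guarded_converse(prompt: str, guardrail_id: str, guardrail_version: str):
    """Invoke the model with the guardrail (requires AWS credentials)."""
    import boto3

    runtime = boto3.client("bedrock-runtime")
    return runtime.converse(**build_guarded_request(prompt, guardrail_id, guardrail_version))
```

Enabling `trace` is useful in development: the response then shows which policy intervened and why.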

Pattern 2: Selective Application

Apply guardrails only to specific endpoints or user types:

  • User-generated input: Full guardrails
  • Internal tools: Minimal or no guardrails
  • Batch processing: Skip guardrails for cost savings
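A minimal routing sketch for this pattern; the endpoint names are hypothetical and the safe default is to filter anything unrecognized:

```python
# Routing sketch: decide per endpoint whether the guardrail cost is justified.
GUARDRAIL_POLICY = {
    "public_chat": True,     # user-generated input: always filter
    "internal_tool": False,  # trusted employees: skip
    "batch_job": False,      # offline processing: skip for cost savings
}


def needs_guardrail(endpoint: str) -> bool:
    """Default to filtering when the endpoint is unknown (fail safe)."""
    return GUARDRAIL_POLICY.get(endpoint, True)
```

The call site then attaches `guardrailConfig` only when `needs_guardrail(endpoint)` is true.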

Pattern 3: Guardrails with Agents

Attach guardrails to Bedrock Agents. The guardrail evaluates:

  • Initial user input
  • Each intermediate agent response
  • Final output to the user

Cost impact: Agent conversations with 5 tool calls may trigger guardrail evaluation 6+ times per conversation.


Configuration Best Practices

Start Permissive, Then Tighten

Begin with Low-strength content filters and monitor what gets through. Increase to Medium or High based on the harmful content you actually observe; starting at High tends to block legitimate content as well.

Define Denied Topics Precisely

Vague denied topics cause false positives. Instead of "Don't discuss politics," use "Do not provide opinions on political candidates, parties, or elections. Factual information about government policies related to our product is acceptable."

Use Anonymize Over Block for PII

Blocking a response entirely when PII is detected creates a poor user experience. Anonymizing replaces PII with placeholders, allowing the response to still be useful.

Monitor Guardrail Metrics

Track in CloudWatch:

  • GuardrailsInvocations — total evaluations
  • GuardrailsTextUnitsProcessed — for cost tracking
  • GuardrailsIntervened — how often guardrails blocked content
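A sketch of turning these metrics into an intervention rate. The CloudWatch namespace and metric names below follow this article and are assumptions; confirm the exact names in your CloudWatch console before relying on them:

```python
def intervention_rate(intervened: float, invocations: float) -> float:
    """Fraction of evaluations where the guardrail blocked or redacted content."""
    return intervened / invocations if invocations else 0.0


def fetch_metric_sum(metric_name: str, days: int = 7) -> float:
    """Sum a guardrail metric over a window (requires AWS credentials).
    Namespace and metric names are assumptions -- verify in your console."""
    import datetime
    import boto3

    cw = boto3.client("cloudwatch")
    end = datetime.datetime.now(datetime.timezone.utc)
    start = end - datetime.timedelta(days=days)
    resp = cw.get_metric_statistics(
        Namespace="AWS/Bedrock/Guardrails",  # assumed namespace
        MetricName=metric_name,              # e.g. "GuardrailsIntervened"
        StartTime=start,
        EndTime=end,
        Period=86_400,                       # one datapoint per day
        Statistics=["Sum"],
    )
    return sum(p["Sum"] for p in resp["Datapoints"])
```

For example, `intervention_rate(fetch_metric_sum("GuardrailsIntervened"), fetch_metric_sum("GuardrailsInvocations"))` gives the weekly rate.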

High intervention rates may indicate either effective protection or overly aggressive filtering.


FAQ

Do guardrails add latency to responses?

Yes, approximately 100-300ms per evaluation. For streaming responses, guardrails evaluate chunks, which can add visible latency. For latency-sensitive applications, consider applying guardrails only on input (not output) or using asynchronous post-evaluation.

Can I use guardrails without Bedrock models?

Yes. The ApplyGuardrail API lets you evaluate any text against your guardrail configuration — even text from non-Bedrock models. This is useful for content moderation of user-generated content regardless of source.
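A sketch of standalone moderation with `apply_guardrail`; the guardrail identifiers are placeholders:

```python
def passed(response: dict) -> bool:
    """True when the guardrail took no action on the evaluated text."""
    return response.get("action") != "GUARDRAIL_INTERVENED"


def moderate_text(text: str, guardrail_id: str, guardrail_version: str) -> bool:
    """Evaluate arbitrary text against a guardrail (requires AWS credentials).

    Works on any text, including output from non-Bedrock models.
    """
    import boto3

    runtime = boto3.client("bedrock-runtime")
    resp = runtime.apply_guardrail(
        guardrailIdentifier=guardrail_id,
        guardrailVersion=guardrail_version,
        source="INPUT",  # or "OUTPUT" to apply the output-side policies
        content=[{"text": {"text": text}}],
    )
    return passed(resp)
```

Note that standalone evaluations are billed at the same text-unit rates as inline guardrails.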

How do guardrails compare to model-level safety training?

Guardrails are a complementary layer. Model-level safety (built into Claude, Llama, etc.) provides baseline protection. See the Bedrock overview for more on available models. Guardrails add application-specific rules (denied topics, custom PII patterns, brand protection) that no model training covers. Use both for defense-in-depth.


Lower Your Bedrock Guardrails Costs with Wring

Wring helps you access AWS credits and volume discounts to lower your Bedrock Guardrails costs. Through group buying power, Wring negotiates better rates so you pay less per guardrail evaluation.

Start saving on Bedrock Guardrails →