AWS Comprehend Pricing: NLP Analysis Costs

AWS Comprehend is a managed natural language processing service that extracts insights from text including sentiment, entities, key phrases, and language detection. With pricing measured in units (100-character increments), Comprehend offers pay-per-use NLP that scales from prototype to production without managing ML infrastructure.

TL;DR: NLP API requests cost $0.0001 per unit (1 unit = 100 characters, minimum 3 units per request). Custom classification training costs $3.00 per hour with inference at $0.0005 per unit. Free tier includes 50,000 units per month for 12 months. Texts shorter than 300 characters are still billed at the 3-unit minimum, so short inputs such as tweets pay for units they do not use.

NLP API Pricing (Built-in Models)

Monthly Volume	Price per Unit
First 10 million units	$0.0001
Next 40 million units	$0.00005
Over 50 million units	$0.000025

One unit equals 100 characters of text. Each API request has a minimum of 3 units (300 characters), regardless of actual text length. A 50-character tweet is billed as 3 units. A 500-character paragraph is billed as 5 units.
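
As a rough illustration of the unit rule, the sketch below computes billed units for a single request. The function name `billable_units` is hypothetical, and rounding partial units up to the next whole unit is an assumption; the text above only gives the 100-character unit and the 3-unit minimum.

```python
import math

UNIT_CHARS = 100  # one unit = 100 characters (stated above)
MIN_UNITS = 3     # per-request minimum (stated above)


def billable_units(text: str) -> int:
    """Units billed for one request.

    Assumption: a partial unit rounds up to the next whole unit.
    """
    units = math.ceil(len(text) / UNIT_CHARS)
    return max(MIN_UNITS, units)


print(billable_units("x" * 50))   # 3 (50-character tweet hits the minimum)
print(billable_units("x" * 500))  # 5 (500-character paragraph)
```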

These per-unit prices apply to the following NLP APIs:

Entity Recognition - Detect people, places, organizations, dates, quantities
Sentiment Analysis - Classify text as positive, negative, neutral, or mixed
Key Phrase Extraction - Identify important phrases and concepts
Language Detection - Identify the dominant language in text
Syntax Analysis - Parse parts of speech and sentence structure
Targeted Sentiment - Entity-level sentiment analysis
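
The volume tiers in the table above can be sketched as a small calculator. This assumes the tiers are graduated (each rate applies only to the units that fall inside its band), which the table implies but does not state outright; whether volume is pooled across APIs or counted per API is also not specified here. The name `tiered_cost` is hypothetical.

```python
# (band size in units, price per unit); None marks the open-ended top band.
TIERS = [
    (10_000_000, 0.0001),    # first 10 million units
    (40_000_000, 0.00005),   # next 40 million units
    (None, 0.000025),        # over 50 million units
]


def tiered_cost(units: int) -> float:
    """Monthly cost in USD, assuming graduated (marginal) tiers."""
    remaining = units
    total = 0.0
    for size, price in TIERS:
        if remaining <= 0:
            break
        in_band = remaining if size is None else min(remaining, size)
        total += in_band * price
        remaining -= in_band
    return round(total, 2)


print(tiered_cost(10_000_000))  # 1000.0
print(tiered_cost(60_000_000))  # 3250.0
```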

Batch vs Real-Time Processing

Both synchronous and batch API calls use the same per-unit pricing. The difference is operational: asynchronous jobs such as StartEntitiesDetectionJob read documents from S3 and write results back to S3, which reduces the overhead of large-scale processing.

PII Detection and Redaction

Operation	Price per Unit
PII detection (ContainsPiiEntities)	$0.0001
PII entity identification (DetectPiiEntities)	$0.0001

PII detection identifies personally identifiable information such as names, addresses, Social Security numbers, and financial data within text. It uses the same unit-based pricing as other NLP APIs with the 3-unit minimum per request.

Custom Classification Pricing

Component	Price
Model training	$3.00 per hour
Synchronous inference (endpoint)	$0.0005 per unit
Asynchronous inference (batch)	$0.0005 per unit
Endpoint management	Minimum 1 inference unit running

Custom classification lets you train models to categorize text into your own custom categories. Training costs $3.00 per hour and typically takes 2-5 hours depending on dataset size. Each running inference endpoint has a minimum charge equivalent to 1 inference unit.
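
A minimal sketch of the arithmetic, using only the rates in the table above. The function name `custom_model_cost` is hypothetical, and the sketch leaves out any minimum endpoint charge because the table does not give a dollar figure for it.

```python
TRAINING_PER_HOUR = 3.00            # USD per training hour (from the table)
CUSTOM_INFERENCE_PER_UNIT = 0.0005  # USD per inference unit (from the table)


def custom_model_cost(training_hours: float, inference_units: int) -> float:
    """Training plus inference cost in USD, excluding endpoint minimums."""
    training = training_hours * TRAINING_PER_HOUR
    inference = inference_units * CUSTOM_INFERENCE_PER_UNIT
    return round(training + inference, 2)


# 5 hours of training plus 2 million inference units
print(custom_model_cost(5, 2_000_000))  # 1015.0
```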

Custom Entity Recognition

Component	Price
Model training	$3.00 per hour
Synchronous inference	$0.0005 per unit
Asynchronous inference	$0.0005 per unit

Custom entity recognition follows the same pricing structure as custom classification. You train models to identify domain-specific entities (product names, medical terms, legal concepts) that the built-in entity recognition does not cover.

Topic Modeling Pricing

Component	Price
Topic modeling job	$1.00 per 100 MB of text
Minimum charge	$1.00 (first 10 MB minimum)

Topic modeling discovers abstract topics across a collection of documents. It requires a minimum of 10 MB of input text and is billed at $1.00 per 100 MB. A 500 MB corpus would cost $5.00 to process.
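
The topic modeling charge can be estimated as below. This is a sketch under two assumptions not confirmed above: billing is pro-rated by megabyte rather than rounded to whole 100 MB blocks, and the $1.00 minimum applies per job. The name `topic_modeling_cost` is hypothetical.

```python
PRICE_PER_100_MB = 1.00  # USD (from the table)
MIN_CHARGE = 1.00        # USD minimum charge (from the table)


def topic_modeling_cost(corpus_mb: float) -> float:
    """Estimated job cost in USD, assuming pro-rated billing with a minimum."""
    prorated = corpus_mb / 100 * PRICE_PER_100_MB
    return round(max(MIN_CHARGE, prorated), 2)


print(topic_modeling_cost(500))  # 5.0
print(topic_modeling_cost(50))   # 1.0 (minimum charge applies)
```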

Free Tier

Feature	Free Allowance	Duration
NLP APIs	50,000 units/month	12 months
Custom Classification (training)	3 hours	One-time, 12 months
Custom Classification (inference)	50,000 units/month	12 months
Topic Modeling	5 jobs (up to 1 MB each)	12 months

The Comprehend free tier provides 50,000 units per month (approximately 5 million characters) for the first 12 months. This is sufficient for prototyping and low-volume production workloads.
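
To see how the allowance affects a bill, the sketch below subtracts the free units from monthly usage. It assumes the allowance is simply deducted before billing at the first-tier rate; the function name and the `in_free_period` flag are hypothetical.

```python
FREE_UNITS_PER_MONTH = 50_000  # NLP API allowance (from the table)
FIRST_TIER_PRICE = 0.0001      # USD per unit, first 10 million units


def billable_after_free_tier(units_used: int, in_free_period: bool = True) -> int:
    """Units left to bill after the monthly free allowance is deducted."""
    if not in_free_period:
        return units_used
    return max(0, units_used - FREE_UNITS_PER_MONTH)


units = billable_after_free_tier(1_500_000)
print(units)                               # 1450000
print(round(units * FIRST_TIER_PRICE, 2))  # 145.0
```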

Real-World Cost Examples

Use Case	Monthly Volume	Monthly Cost
Social media monitoring (sentiment)	500K tweets (1.5M units)	$150
Customer support ticket classification	100K tickets (3M units)	$300
Custom category classifier	5hr training + 2M units inference	$1,015
PII detection pipeline	10M units	$1,000
Enterprise NLP (multi-API)	50M units across 3 APIs	$7,500
Document topic analysis	2 GB corpus	$20
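
Most rows in the table can be reproduced from the rates listed earlier. The sketch below recomputes five of them, assuming first-tier pricing (all volumes are at or below 10 million units), 3 units per tweet, and a 2 GB corpus counted as 2,000 MB. The enterprise row is left out because the table does not say how its 50M units are split across APIs or tiers.

```python
BUILT_IN = 0.0001   # USD per unit, built-in NLP and PII APIs
CUSTOM = 0.0005     # USD per unit, custom model inference
TRAINING = 3.00     # USD per training hour
TOPIC = 1.00        # USD per 100 MB of text


def usd(amount: float) -> float:
    """Round to cents so float noise does not show up in the output."""
    return round(amount, 2)


examples = {
    "Social media monitoring": usd(500_000 * 3 * BUILT_IN),
    "Support ticket classification": usd(3_000_000 * BUILT_IN),
    "Custom category classifier": usd(5 * TRAINING + 2_000_000 * CUSTOM),
    "PII detection pipeline": usd(10_000_000 * BUILT_IN),
    "Document topic analysis": usd(2_000 / 100 * TOPIC),
}

for name, cost in examples.items():
    print(f"{name}: ${cost:,.2f}")
```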

Comprehend vs Alternatives

Solution	Sentiment per 1M Characters	Custom Models
AWS Comprehend	$1.00	$3/hr training + $0.0005/unit
Google Natural Language	$1.00	$3/hr training + $5/1K queries
Azure Text Analytics	$1.00	Custom pricing
AWS Bedrock (LLM-based)	$3-$15 (varies by model)	Prompt engineering only

Cost Optimization Tips

1. Batch Short Texts to Minimize Unit Waste

Every request has a 3-unit (300-character) minimum, so a tweet of 100-200 characters is billed as if it were 300 characters. For short texts, use the BatchDetectSentiment API, which accepts up to 25 documents per call. Batching does not eliminate the per-document minimum, but it cuts the number of API calls you make and the overhead that comes with them.
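
A minimal sketch of grouping short texts into batches of 25 is shown below. The helper `batches` and the sample data are hypothetical; each group could then be passed to boto3's `batch_detect_sentiment(TextList=group, LanguageCode="en")`, which is left out here so the example runs without AWS credentials.

```python
from typing import Iterator, List

MAX_BATCH_DOCS = 25  # documents per BatchDetectSentiment call (stated above)
MIN_UNITS = 3        # minimum units billed per document


def batches(docs: List[str], size: int = MAX_BATCH_DOCS) -> Iterator[List[str]]:
    """Yield consecutive slices of at most `size` documents."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


tweets = [f"tweet number {i}" for i in range(60)]
groups = list(batches(tweets))

print([len(g) for g in groups])  # [25, 25, 10]
# Three API calls instead of 60, but the billed units are unchanged:
print(len(tweets) * MIN_UNITS)   # 180
```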

2. Use Built-in APIs Before Custom Models

Custom model inference costs $0.0005 per unit, which is 5x the built-in API price of $0.0001 per unit. Only invest in custom models when the built-in entity recognition or sentiment analysis does not meet your accuracy requirements.

3. Choose Async Over Sync for Batch Workloads

Asynchronous batch jobs process documents stored in S3 without requiring a running endpoint. For processing large document collections, async jobs are operationally simpler and avoid the ongoing cost of maintaining a real-time inference endpoint.

4. Monitor Unit Consumption with CloudWatch

Track your Comprehend API usage through CloudWatch metrics to identify unexpected volume spikes. Set billing alarms to catch runaway processes before they generate large charges.

5. Consider Bedrock for Complex Analysis

For nuanced text analysis that requires reasoning beyond entity extraction or sentiment classification, AWS Bedrock LLMs may provide better results with a single API call instead of chaining multiple Comprehend APIs together.

FAQ

What is a Comprehend unit?

One unit equals 100 characters of UTF-8 text. Each API request is billed for a minimum of 3 units (300 characters), even if the input text is shorter. Whitespace and punctuation count toward the character total.

Can I run multiple NLP operations in a single API call?

No. Each NLP operation (sentiment, entities, key phrases) requires a separate API call, and each call is billed independently. Processing a document for both sentiment and entities costs 2x the unit charges. Evaluate which operations you truly need before running all of them.

How long does custom model training take?

Training time depends on dataset size and model type. A custom classifier with 10,000 training documents typically takes 2-4 hours ($6-$12 in training costs). Larger datasets with more categories can take up to 10 hours. Training is a one-time cost unless you retrain with new data.

Lower Your Comprehend Costs with Wring

Wring helps you access AWS credits and volume discounts to lower your Comprehend NLP costs. Through group buying power, Wring negotiates better rates so you pay less per unit processed.