
SageMaker Ground Truth: Data Labeling Costs

SageMaker Ground Truth pricing: image labels from $0.012, bounding boxes at $0.036/object. Active learning cuts labeling costs by up to 70%. Full breakdown.

Wring Team
March 15, 2026
Tags: SageMaker Ground Truth, data labeling costs, annotation pricing, ML data
Data annotation workspace with labeled images and classification tags for ML training

Data labeling is often the most expensive part of building an ML model, yet many teams underestimate it. SageMaker Ground Truth provides three workforce options (Amazon Mechanical Turk, private teams, and third-party vendors), each with different quality and cost trade-offs. The numbers add up quickly at scale: labeling 100,000 images with one bounding box each costs $3,600 through Mechanical Turk, and more when images contain several objects, because bounding boxes are priced per object.

The real cost saver is automated labeling with active learning. Ground Truth trains an intermediate model on your human-labeled data and uses it to auto-label high-confidence items, reducing the number of items sent to human annotators by up to 70%.

TL;DR: Ground Truth label pricing varies by type: image classification at $0.012/image, bounding boxes at $0.036/object, semantic segmentation at $0.07/image (Mechanical Turk rates). Active learning reduces labels needing human review by up to 70%, cutting costs proportionally. For 100K image classification labels, expect $1,200 without active learning or $360-$500 with it.


Labeling Pricing by Type

Amazon Mechanical Turk Pricing

Mechanical Turk provides the most affordable per-label rates. Each data object is labeled by 3-5 workers by default for quality consensus.

Image labeling:

| Label Type | Price per Object | 10K Objects | 100K Objects |
| --- | --- | --- | --- |
| Image Classification | $0.012/image | $120 | $1,200 |
| Multi-label Classification | $0.012/image | $120 | $1,200 |
| Bounding Box | $0.036/object | $360 | $3,600 |
| Semantic Segmentation | $0.07/image | $700 | $7,000 |
| Instance Segmentation | $0.084/image | $840 | $8,400 |
| Image-level Labeling (Polyline) | $0.036/object | $360 | $3,600 |

Text labeling:

| Label Type | Price per Unit | 10K Units | 100K Units |
| --- | --- | --- | --- |
| Text Classification | $0.012/unit | $120 | $1,200 |
| Named Entity Recognition | $0.012/unit | $120 | $1,200 |
| Multi-label Text Classification | $0.012/unit | $120 | $1,200 |

Video and 3D Point Cloud labeling:

| Label Type | Price per Unit |
| --- | --- |
| Video Object Detection | $0.036/frame |
| Video Object Tracking | $0.048/frame |
| 3D Point Cloud Object Detection | $0.30/frame |
| 3D Point Cloud Segmentation | $0.468/frame |
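
The per-unit rates above turn into a job estimate with a single multiplication. The following is a minimal Python sketch of that arithmetic. The `MTURK_RATES` dictionary and the `estimate_cost` helper are illustrative names, not part of any AWS SDK, and the rates are simply the ones quoted in the tables above.

```python
# Mechanical Turk rates quoted in the tables above, in USD per unit.
# The keys are illustrative labels, not AWS identifiers.
MTURK_RATES = {
    "image_classification": 0.012,
    "bounding_box": 0.036,
    "semantic_segmentation": 0.07,
    "instance_segmentation": 0.084,
    "text_classification": 0.012,
    "named_entity_recognition": 0.012,
    "video_object_detection": 0.036,
    "video_object_tracking": 0.048,
    "point_cloud_object_detection": 0.30,
    "point_cloud_segmentation": 0.468,
}


def estimate_cost(label_type: str, units: int) -> float:
    """Return the estimated Mechanical Turk cost in USD for `units` units."""
    if units < 0:
        raise ValueError("units must be non-negative")
    # Round to cents to hide floating-point noise in the product.
    return round(MTURK_RATES[label_type] * units, 2)


if __name__ == "__main__":
    for label_type in ("image_classification", "bounding_box", "semantic_segmentation"):
        print(f"{label_type}: ${estimate_cost(label_type, 100_000):,.2f}")
```

For bounding boxes, `units` is the number of objects rather than the number of images, so an image with four boxes counts as four units.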

Private Workforce

The Ground Truth platform is free when using a private workforce. You pay your workers directly through your own payroll or contractor agreements.

| Component | Cost |
| --- | --- |
| Ground Truth platform | Free |
| Labeling UI and tools | Free |
| Worker management | Free |
| Worker compensation | You pay directly |

Private workforces are ideal for sensitive data that cannot be shared externally, domain-specific tasks requiring specialized knowledge, or when you have in-house annotators.

Third-Party Vendors

AWS Marketplace vendors provide managed labeling teams with quality guarantees. Pricing varies by vendor and task complexity:

| Vendor Type | Typical Price Range | Best For |
| --- | --- | --- |
| Standard annotation | 2-5x Mechanical Turk rates | Higher quality, managed QA |
| Specialized (medical, legal) | 5-20x Mechanical Turk rates | Domain expertise required |
| Enterprise managed | Custom pricing | Large-scale, ongoing projects |

Active Learning (Automated Labeling)

Active learning is Ground Truth's most powerful cost optimization feature. It works in two phases:

  1. Human labeling phase: Workers label an initial batch of data (typically 1,000-5,000 items)
  2. Automated labeling phase: Ground Truth trains a model and auto-labels items where it has high confidence

| Dataset Size | Without Active Learning | With Active Learning (70% auto) | Savings |
| --- | --- | --- | --- |
| 10,000 images (classification) | $120 | $36 + $4 compute = $40 | 67% |
| 50,000 images (classification) | $600 | $180 + $15 compute = $195 | 68% |
| 100,000 images (classification) | $1,200 | $360 + $30 compute = $390 | 68% |
| 100,000 images (bounding box) | $3,600 | $1,080 + $30 compute = $1,110 | 69% |

The compute cost for active learning is the ML training and inference needed to build and run the auto-labeling model. This is typically $2-$5 per 10,000 data objects — negligible compared to human labeling savings.
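
The rows in the table above all follow the same formula: pay the human rate only for the items that are not auto-labeled, then add the compute cost. Here is a small Python sketch of that calculation. The function name and its parameters are illustrative, and the 70% auto-labeling share is the article's best-case figure, not a guaranteed outcome.

```python
from typing import Tuple


def active_learning_cost(
    units: int,
    rate: float,
    auto_fraction: float = 0.70,  # best-case share of auto-labeled items
    compute_cost: float = 0.0,    # training and inference cost in USD
) -> Tuple[float, float]:
    """Return (total cost in USD, fractional savings vs. all-human labeling)."""
    if units <= 0 or not 0.0 <= auto_fraction <= 1.0:
        raise ValueError("units must be positive and auto_fraction within [0, 1]")
    baseline = units * rate                       # every item labeled by humans
    human_cost = units * (1 - auto_fraction) * rate
    total = human_cost + compute_cost
    return total, 1 - total / baseline


if __name__ == "__main__":
    # 100,000 classification images at $0.012 with $30 of compute.
    total, savings = active_learning_cost(100_000, 0.012, compute_cost=30)
    print(f"total: ${total:,.2f}, savings: {savings:.1%}")
```

Lowering `auto_fraction` to 0.40, the low end of what the FAQ below reports in practice, shows how sensitive the savings are to how much of the dataset the model can label confidently.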

Requirements for active learning:

  • Minimum 1,000 labeled objects to start
  • Works best with classification and bounding box tasks
  • Confidence threshold is configurable (default 95%)
  • Not available for all label types (e.g., not for 3D point cloud)
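
To make the confidence threshold concrete, the sketch below splits model predictions into an auto-labeled set and a set that still goes to human annotators. It illustrates the idea only and is not Ground Truth's internal implementation. The `Prediction` type and `route_by_confidence` function are hypothetical names, and treating the threshold as inclusive is an assumption.

```python
from typing import List, NamedTuple, Tuple


class Prediction(NamedTuple):
    item_id: str
    label: str
    confidence: float  # model confidence in [0, 1]


def route_by_confidence(
    predictions: List[Prediction], threshold: float = 0.95
) -> Tuple[List[Prediction], List[Prediction]]:
    """Split predictions into (auto_labeled, needs_human_review)."""
    auto_labeled: List[Prediction] = []
    needs_review: List[Prediction] = []
    for prediction in predictions:
        # Assumption: a confidence exactly at the threshold counts as high enough.
        if prediction.confidence >= threshold:
            auto_labeled.append(prediction)
        else:
            needs_review.append(prediction)
    return auto_labeled, needs_review


if __name__ == "__main__":
    sample = [
        Prediction("img-001", "cat", 0.99),
        Prediction("img-002", "dog", 0.71),
        Prediction("img-003", "cat", 0.96),
    ]
    auto, review = route_by_confidence(sample)
    print(f"auto-labeled: {len(auto)}, sent to humans: {len(review)}")
```

Raising the threshold sends more items to humans and costs more, while lowering it saves money at the risk of accepting more incorrect automatic labels.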

Ground Truth Plus

Ground Truth Plus is a fully managed labeling service where AWS provides the expert workforce and project management.

| Feature | Ground Truth | Ground Truth Plus |
| --- | --- | --- |
| Workforce management | You manage | AWS manages |
| Quality control | You configure | AWS handles |
| Project management | Self-service | Dedicated PM |
| Pricing model | Per-label | Custom contract |
| Setup effort | DIY | Turnkey |
| Domain expertise | Depends on workforce | Specialized teams available |

Ground Truth Plus typically costs 3-5x more per label than self-managed Mechanical Turk, but eliminates the operational overhead of managing labeling projects. It is best for enterprises that need high-quality labels at scale without building an internal annotation operations team.


Real-World Cost Scenarios

Computer Vision Startup (Object Detection)

| Component | Details | Cost |
| --- | --- | --- |
| Initial labeling | 20,000 images, bounding boxes, MTurk | $720 |
| Active learning enabled | 80,000 images auto-labeled | $120 compute |
| Quality review | 5,000 samples re-labeled | $180 |
| S3 storage | 100K images, 50GB | $1.15 |
| Total for 100K labeled images | | $1,021 |

NLP Team (Text Classification)

| Component | Details | Cost |
| --- | --- | --- |
| Text classification | 50,000 documents, MTurk | $600 |
| Active learning | 150,000 documents auto-labeled | $50 compute |
| NER labeling | 20,000 documents, private workforce | $0 (platform) + labor |
| Total (excl. private labor) | | $650 |

Autonomous Driving (3D Point Cloud)

| Component | Details | Cost |
| --- | --- | --- |
| 3D object detection | 10,000 frames, vendor | $9,000 |
| Video object tracking | 50,000 frames, vendor | $7,200 |
| Quality review and re-labeling | 10% re-work | $1,620 |
| Total | | $17,820 |
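
The vendor figures in this scenario are consistent with a 3x multiplier on the Mechanical Turk frame rates, which falls inside the 2-5x range quoted earlier for standard annotation vendors. The 3x factor is inferred from the numbers rather than stated, so treat it as an assumption. The Python sketch below reproduces the total under that assumption, using illustrative names throughout.

```python
VENDOR_MULTIPLIER = 3    # assumption: inferred from the scenario's figures
REWORK_FRACTION = 0.10   # 10% re-work, as listed in the scenario


def vendor_cost(frames: int, mturk_rate: float, multiplier: float = VENDOR_MULTIPLIER) -> float:
    """Return the vendor cost in USD, modeled as a multiple of the MTurk rate."""
    return round(frames * mturk_rate * multiplier, 2)


def scenario_total() -> float:
    """Recompute the autonomous driving scenario from its line items."""
    detection = vendor_cost(10_000, 0.30)    # 3D point cloud object detection
    tracking = vendor_cost(50_000, 0.048)    # video object tracking
    rework = round((detection + tracking) * REWORK_FRACTION, 2)
    return round(detection + tracking + rework, 2)


if __name__ == "__main__":
    print(f"scenario total: ${scenario_total():,.2f}")
```

Changing `VENDOR_MULTIPLIER` to 5, the top of the standard vendor range, gives a quick upper bound for budgeting.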

Cost Optimization Tips

  1. Enable active learning for every eligible job. It reduces human labeling volume by up to 70%, cutting costs proportionally. The compute overhead is negligible — $2-5 per 10,000 items.

  2. Start with a small labeled sample to validate your task design. Label 500-1,000 items first, review quality, refine instructions, then scale. Poorly designed tasks waste money on low-quality labels that need re-work.

  3. Use consensus wisely. Each item is labeled by 3 to 5 annotators by default, and you pay for every one of those annotations. For simple tasks (binary classification), 3 is sufficient. For complex tasks (segmentation), 5 may improve quality. For high-confidence active learning labels, consensus is handled automatically.

  4. Pre-filter your data before labeling. Remove duplicates, irrelevant images, and corrupted files before submitting labeling jobs. Every item in your dataset incurs a per-label cost.

  5. Use private workforces for sensitive or specialized data. The Ground Truth platform is free for private workforces. If you have domain experts (radiologists for medical imaging, lawyers for legal documents), leverage them without per-label platform fees.

  6. Batch your labeling jobs. Larger batches are more cost-efficient with active learning because the auto-labeling model improves with more training data. A single job of 100,000 items is cheaper than ten jobs of 10,000 items.
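
As an illustration of the pre-filtering tip above, the following Python sketch drops empty files and byte-identical duplicates before a labeling job is submitted. It is a minimal example under simple assumptions: the `drop_exact_duplicates` helper is a hypothetical name, inputs are already loaded in memory, and only exact duplicates are caught. Near-duplicates such as resized or re-encoded images would need a perceptual hash or similar technique.

```python
import hashlib
from typing import Dict, List


def drop_exact_duplicates(files: Dict[str, bytes]) -> List[str]:
    """Return the file names to keep, in their original order.

    Empty files are skipped, and only the first file for each distinct
    SHA-256 digest is kept.
    """
    seen_digests = set()
    keep: List[str] = []
    for name, data in files.items():
        if not data:
            continue  # empty file: treat as corrupted or unusable
        digest = hashlib.sha256(data).hexdigest()
        if digest in seen_digests:
            continue  # byte-identical to a file we already kept
        seen_digests.add(digest)
        keep.append(name)
    return keep


if __name__ == "__main__":
    dataset = {
        "a.jpg": b"image-bytes-1",
        "b.jpg": b"image-bytes-2",
        "a_copy.jpg": b"image-bytes-1",
        "broken.jpg": b"",
    }
    kept = drop_exact_duplicates(dataset)
    print(f"kept {len(kept)} of {len(dataset)} files: {kept}")
```

Every file removed this way is one less unit billed at the per-label rate.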



FAQ

How much does it cost to label 100,000 images?

The cost depends on label type and workforce. Image classification on Mechanical Turk: $1,200. Bounding boxes: $3,600. Semantic segmentation: $7,000. With active learning enabled (70% auto-labeled), these costs drop to approximately $390, $1,110, and $2,130 respectively. Private workforce and vendor pricing varies.

What is the difference between Ground Truth and Ground Truth Plus?

Ground Truth is a self-service platform — you configure labeling jobs, manage workers, and handle quality control. Ground Truth Plus is a fully managed service where AWS provides expert annotators and project management. Plus costs 3-5x more per label but eliminates operational overhead. Choose Plus for large-scale projects where managing annotators is not your core competency.

How does active learning reduce labeling costs?

Active learning trains an ML model on your initial human-labeled data (typically 1,000-5,000 items). It then uses this model to auto-label items where its confidence exceeds a threshold (default 95%). Only low-confidence items are sent to human annotators. In practice, this auto-labels 40-70% of your dataset, reducing human labeling costs proportionally.


Lower Your SageMaker Ground Truth Costs with Wring

Wring helps you access AWS credits and volume discounts to lower your SageMaker Ground Truth costs. Through group buying power, Wring negotiates better rates so you pay less per labeling task.
