Choose Your Plan

All plans include all 25 safety domains, real-time violation detection, and deterministic safety guarantees. Pay per token -- scale as you grow.

Shared Inference

$2.50 /1M tokens

No base fee

CPU-powered inference on shared infrastructure. Ideal for development, testing, and light production workloads.

All 25 safety domains
Pay-per-use ($2.50/1M tokens)
Up to 5 seats
CPU inference (~30-50ms latency)
Dashboard & usage analytics
Standard rate limits (60 rpm)
Email support
Best-effort availability

Dedicated

$5.00 /1M tokens

+ $1,500/mo base fee

Dedicated GPU with isolated database. Continual learning on your domain data. 5-10x faster inference.

All 25 safety domains
Tokens billed by usage ($5.00/1M tokens)
Up to 50 seats
Dedicated GPU inference (~5-10ms latency)
Isolated database + node pool
Continual learning on your data
Priority rate limits (300 rpm)
Custom domain gates
Slack + email support
99.5% uptime SLA

Enterprise Security

$8.50 /1M tokens

+ $3,500/mo base fee

High-performance GPU with isolated VPC, HA database, and expanded audit and governance controls. BAA may be available for qualifying healthcare workloads.

All 25 safety domains
Tokens billed by usage ($8.50/1M tokens)
Unlimited seats
High-performance GPU inference (~3-5ms latency)
Isolated HA infrastructure + VPC
Mapped to internal security control baseline
BAA may be available for qualifying healthcare use cases
Audit trail & explainability controls
Dedicated support engineer
99.9% uptime SLA
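
The rate limits listed above (60 rpm on Shared, 300 rpm on Dedicated) can be respected on the client side by spacing requests evenly. The following is a minimal sketch, not an official client: the `Pacer` class and its parameters are hypothetical names, and it assumes that even spacing is acceptable, since the page does not say how the limit window or bursts are enforced.

```python
import time

# Rate limits quoted on the plan cards (requests per minute).
RATE_LIMITS_RPM = {"shared": 60, "dedicated": 300}


def min_interval_seconds(rpm: int) -> float:
    """Smallest gap between requests that stays within `rpm`."""
    return 60.0 / rpm


class Pacer:
    """Hypothetical helper that spaces calls evenly under a per-minute limit."""

    def __init__(self, rpm, clock=time.monotonic, sleep=time.sleep):
        self.interval = min_interval_seconds(rpm)
        self._clock = clock      # injectable so the pacer can be tested
        self._sleep = sleep
        self._next_allowed = None

    def wait(self) -> float:
        """Block until the next request is allowed; return seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self.interval
        return delay
```

Calling `Pacer(RATE_LIMITS_RPM["shared"]).wait()` before each request keeps a single client at or below one request per second; on Dedicated the gap drops to 0.2 seconds.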

Infrastructure at Every Tier

Every plan runs SolaceSentry's custom 350M-parameter transformer with 4 judge transformers and a dual-model courtroom. Higher tiers unlock dedicated GPU compute and isolated infrastructure.

Feature	Shared	Dedicated	Enterprise
Compute	CPU-Optimized	Dedicated GPU	High-Performance GPU
Inference Latency	~30-50ms	~5-10ms	~3-5ms
Included Tokens	1M/mo	10M/mo	100M/mo
Infrastructure	Shared pool	Isolated node + DB	HA cluster + VPC
Continual Learning	--	Yes	Yes
Uptime SLA	Best-effort	99.5%	99.9%
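
To put the uptime figures in the table into concrete terms, the sketch below converts an SLA percentage into the downtime it permits. It assumes a 30-day month and a monthly measurement window; the page does not state how the SLA is actually measured, so treat the numbers as illustrative.

```python
def allowed_downtime_hours(uptime_percent: float, days_in_month: int = 30) -> float:
    """Hours of downtime per month that still satisfy the given uptime percentage.

    Assumption: the SLA is evaluated over one calendar month of `days_in_month` days.
    """
    total_hours = days_in_month * 24
    return total_hours * (100.0 - uptime_percent) / 100.0


# Dedicated (99.5%): 720 h * 0.5% = 3.6 hours per month.
dedicated_hours = allowed_downtime_hours(99.5)

# Enterprise (99.9%): 720 h * 0.1% = about 0.72 hours, roughly 43 minutes.
enterprise_hours = allowed_downtime_hours(99.9)
```

Shared is best-effort, so no downtime budget applies to it.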

Add-On

Predictions

AI-powered cost forecasting with Bayesian analysis, regime detection, and Monte Carlo scenario planning across all 25 safety domains.

Token cost prediction with Hierarchical Bayes
Regime change-point detection
Monte Carlo scenario analysis
25-domain partial pooling

$249 /mo

Available for Dedicated & Enterprise
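
For readers unfamiliar with Monte Carlo scenario analysis, here is a generic sketch of the idea applied to monthly token cost. It is not the Predictions add-on's implementation: it uses plain random sampling with an assumed lognormal usage distribution, and it does not attempt Hierarchical Bayes, partial pooling, or regime detection. The function names, the `sigma` spread, and the usage figures are all hypothetical.

```python
import math
import random
import statistics


def simulate_monthly_costs(rate_per_million, base_fee, median_tokens,
                           sigma, n_runs=10_000, seed=0):
    """Draw monthly token usage from a lognormal and price each draw.

    Assumption: usage is lognormal with the given median and log-space
    spread `sigma`; real usage may follow a different distribution.
    """
    rng = random.Random(seed)          # seeded for reproducible scenarios
    mu = math.log(median_tokens)
    costs = []
    for _ in range(n_runs):
        tokens = rng.lognormvariate(mu, sigma)
        costs.append(base_fee + tokens / 1_000_000 * rate_per_million)
    return costs


def summarize(costs):
    """Return the 5th, 50th, and 95th percentile of simulated costs."""
    cuts = statistics.quantiles(costs, n=20)   # 19 cut points: 5%, 10%, ..., 95%
    return {"p5": cuts[0], "p50": cuts[9], "p95": cuts[18]}


# Example scenario on the Dedicated tier ($5.00/1M tokens + $1,500/mo base fee),
# assuming a median of 40M tokens per month.
scenario = summarize(simulate_monthly_costs(5.00, 1500.0, 40_000_000, sigma=0.3))
```

The `p5` to `p95` range gives a rough budget band; widening `sigma` models less predictable traffic.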

Frequently Asked Questions

What counts as a token?

Tokens are the units processed by our inference engine. Both input and output tokens are counted. Our custom BPE tokenizer is optimized for safety-domain vocabulary.

Can I switch plans later?

Yes. You can upgrade or downgrade at any time. Changes take effect at the start of your next billing cycle. No lock-in contracts.

What does Enterprise Security include?

Enterprise Security includes isolated infrastructure, expanded audit and governance controls, encryption, and support for contract review. A BAA may be available for qualifying healthcare use cases.

What's the difference between Shared and Dedicated inference?

Shared inference runs on CPU-optimized pooled infrastructure (~30-50ms latency). The Dedicated and Enterprise tiers each run on their own GPU (~5-10ms and ~3-5ms latency, respectively), giving you 5-10x faster inference, isolated compute, and continual learning on your data.

How does token billing work?

All plans are usage-based -- you pay per 1M tokens consumed at your tier's rate (Shared: $2.50, Dedicated: $5.00, Enterprise: $8.50). Dedicated and Enterprise tiers also have a monthly base fee that covers the dedicated GPU infrastructure. There are no free included tokens on any plan.
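
The billing rule above can be worked through as a small calculation. This is a sketch under stated assumptions: the rates and base fees are the ones quoted on this page, input and output tokens are both counted, and partial millions are assumed to be billed proportionally (the page does not specify rounding, proration, or taxes). The `Tier` class and `monthly_cost` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    rate_per_million: float   # USD per 1M tokens
    base_fee: float           # USD per month


TIERS = {
    "shared": Tier("Shared Inference", 2.50, 0.0),
    "dedicated": Tier("Dedicated", 5.00, 1500.0),
    "enterprise": Tier("Enterprise Security", 8.50, 3500.0),
}


def monthly_cost(tier_key: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated monthly bill in USD: base fee plus usage at the tier's rate.

    Assumption: usage is billed proportionally, with no minimum or rounding step.
    """
    tier = TIERS[tier_key]
    billable = input_tokens + output_tokens   # both directions are counted
    return round(tier.base_fee + billable / 1_000_000 * tier.rate_per_million, 2)


# 30M input + 10M output tokens = 40M billable tokens in a month:
shared_bill = monthly_cost("shared", 30_000_000, 10_000_000)          # 40 * 2.50 = 100.00
dedicated_bill = monthly_cost("dedicated", 30_000_000, 10_000_000)    # 1500 + 40 * 5.00 = 1700.00
enterprise_bill = monthly_cost("enterprise", 30_000_000, 10_000_000)  # 3500 + 40 * 8.50 = 3840.00
```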

Select Safety Domains

Healthcare

Financial

Legal & Regulatory

Cyber & Security

Industrial

Transport

People