Models
Emcie model tiers, pricing, and capabilities
We offer two generative model tiers and two embedding model tiers, each optimized for different use cases. Generative models support Student and Teacher roles for automatic optimization. Learn more about model roles.
Generative Models
Note on initial costs: New deployments use Teacher pricing during an initial grace period while the platform learns your usage patterns. Once optimization completes, requests automatically transition to the lower-cost Student model. Learn more about the grace period.
Jackal
Many developers choose Parlant as a friendly, full-featured Conversational AI framework, even when they're not building compliance-sensitive agents (where Parlant's most distinctive strength lies).
For use cases that aren't mission-critical, accuracy is often a should-have rather than an absolute must. When there are no serious legal or financial repercussions if the agent occasionally falls short on instruction-following fidelity, a degree of accuracy can be traded off for cost. Typical use cases include education, AI copilots, lead generation, and similar applications.
Jackal is ideal for these scenarios. It provides a balance between accuracy and cost. It's more accurate than generic, off-the-shelf models at similar price points, thanks to its Parlant-specific specialization.
If your agent doesn't handle particularly sensitive use cases, Jackal delivers excellent value.
Pricing
| Role | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| Student | $0.30 | $2.50 |
| Teacher | $1.00 | $8.00 |
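To make the table concrete, here's a rough sketch of how a single request's cost works out under Student versus Teacher pricing (which also illustrates the effect of the grace period described above). The token counts are illustrative assumptions; only the per-million-token prices come from the table.

```python
# Illustrative cost estimate for a single Jackal request.
# The token counts below are made-up example values; only the
# per-million-token prices come from the Jackal pricing table.

PRICES_PER_MILLION = {
    "student": {"input": 0.30, "output": 2.50},
    "teacher": {"input": 1.00, "output": 8.00},
}

def request_cost(role: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the given role's pricing."""
    prices = PRICES_PER_MILLION[role]
    return (
        input_tokens / 1_000_000 * prices["input"]
        + output_tokens / 1_000_000 * prices["output"]
    )

# Example: a request with 20,000 input tokens and 1,000 output tokens.
print(f"Student: ${request_cost('student', 20_000, 1_000):.4f}")  # $0.0085
print(f"Teacher: ${request_cost('teacher', 20_000, 1_000):.4f}")  # $0.0280
```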
Rate Limits
See Jackal tier rate limits for RPM, TPM, and TPD limits by usage tier.
Bison
Many companies use Parlant for its unique strength in managing compliance-sensitive use cases, such as financial services, healthcare, large-scale proactive customer service, and similar domains.
In these scenarios, accuracy is crucial. Mishaps on the agent's part can lead to financial, legal, or reputational damage.
Bison was created for these use cases. While still providing a better price point than generic off-the-shelf models, it is based on a larger model that handles more nuance and complexity.
This leads to higher precision in tasks such as guideline matching, tool calling, and response generation.
Choose Bison when your application handles sensitive, high-stakes decisions where accuracy is non-negotiable.
Pricing
| Role | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| Student | $0.90 | $5.00 |
| Teacher | $1.50 | $15.00 |
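If it helps to quantify the accuracy-for-cost trade-off between tiers, the sketch below compares a month of traffic under Jackal and Bison Student pricing. The traffic profile is an illustrative assumption; the prices are taken from the two pricing tables on this page.

```python
# Rough monthly cost comparison between Jackal and Bison at Student pricing.
# The traffic profile (requests per month, tokens per request) is an
# illustrative assumption; only the prices come from the tables on this page.

STUDENT_PRICES = {  # USD per 1M tokens
    "jackal": {"input": 0.30, "output": 2.50},
    "bison": {"input": 0.90, "output": 5.00},
}

def monthly_cost(tier: str, requests: int, in_tokens: int, out_tokens: int) -> float:
    """Return the USD cost of `requests` requests at the tier's Student pricing."""
    prices = STUDENT_PRICES[tier]
    per_request = (
        in_tokens / 1_000_000 * prices["input"]
        + out_tokens / 1_000_000 * prices["output"]
    )
    return requests * per_request

# Example: 100,000 requests/month, 15,000 input and 800 output tokens each.
for tier in ("jackal", "bison"):
    print(f"{tier}: ${monthly_cost(tier, 100_000, 15_000, 800):,.2f}")
# jackal: $650.00
# bison: $1,750.00
```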
Rate Limits
See Bison tier rate limits for RPM, TPM, and TPD limits by usage tier.
Embedding Models
Embedding models generate vector representations for semantic search, similarity matching, and retrieval tasks. Unlike generative models, they do not use the Teacher/Student optimization system.
Generally speaking, it isn't necessary to choose between embedding tiers in Parlant. When using EmcieService in Parlant, the framework automatically selects the appropriate embedding model based on the needs of each task—using high-fidelity embeddings where precision matters and lower-fidelity embeddings where speed is sufficient.
Parlant uses embeddings relatively lightly, and since embedding models are quite inexpensive, we leave this decision to the framework.
Jackal Embedding
The cost-optimized embedding tier, used by Parlant for tasks where retrieval speed takes priority over maximum precision.
| Pricing |
|---|
| $0.01 / 1M tokens |
Bison Embedding
The high-fidelity embedding tier, used by Parlant when retrieval accuracy is important or when working with more complex, nuanced content.
| Pricing |
|---|
| $0.12 / 1M tokens |
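For a rough sense of scale, the sketch below estimates what embedding a corpus would cost at each tier. The corpus size is an illustrative assumption; the per-token prices come from the two embedding tables above. Since Parlant selects the embedding tier automatically, the point is simply that embedding spend stays small relative to generative spend either way.

```python
# Illustrative embedding cost estimate. The corpus size is an assumption;
# the prices per 1M tokens come from the embedding pricing tables above.

EMBEDDING_PRICES = {"jackal": 0.01, "bison": 0.12}  # USD per 1M tokens

def embedding_cost(tier: str, tokens: int) -> float:
    """Return the USD cost of embedding `tokens` tokens at the given tier."""
    return tokens / 1_000_000 * EMBEDDING_PRICES[tier]

# Example: embedding a 5M-token knowledge base.
print(f"Jackal Embedding: ${embedding_cost('jackal', 5_000_000):.2f}")  # $0.05
print(f"Bison Embedding:  ${embedding_cost('bison', 5_000_000):.2f}")   # $0.60
```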
Rate Limits
See Embedding model rate limits for RPM, TPM, and TPD limits by usage tier.
