Rate Limits
Understand API rate limits and restrictions
Rate limits restrict how many times you can access our API within a given period of time.
Why do we have rate limits?
Rate limits are a common practice for APIs, and they're put in place for a few different reasons.
They help protect against abuse or misuse of the API. For example, a malicious actor could flood the API with requests in an attempt to overload it or cause disruptions in service. By setting rate limits, we can prevent this kind of activity.
Rate limits also help ensure that everyone has fair access to the API. If one person or organization makes an excessive number of requests, it could bog down the API for everyone else. By throttling the number of requests that a single user can make, we ensure that as many people as possible can use the API without experiencing slowdowns.
Finally, rate limits can help us manage the aggregate load on our infrastructure. If requests to the API increase dramatically, it could tax the servers and cause performance issues. By setting rate limits, we can help maintain a smooth and consistent experience for all users.
How do rate limits work?
Rate limits are measured in three ways: RPM (requests per minute), TPM (tokens per minute), and TPD (tokens per day). You hit a rate limit as soon as any one of these thresholds is reached, whichever comes first.
For example, if your RPM limit is 20, sending 20 requests of 5,000 tokens each to the completions endpoint within one minute exhausts your request limit, even if those 100,000 tokens are still below your TPM limit.
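The example above can be checked with a little arithmetic. The following is a minimal sketch, not part of the API: the `Limits` class, the `binding_limit` helper, and the TPM and TPD values are hypothetical and chosen only to illustrate how the first limit reached is determined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Limits:
    rpm: int  # requests per minute
    tpm: int  # tokens per minute
    tpd: int  # tokens per day


def binding_limit(limits: Limits, tokens_per_request: int) -> tuple[str, int]:
    """Return the limit reached first and how many equal-sized requests fit under it."""
    # How many requests of this size each limit allows on its own.
    capacity = {
        "RPM": limits.rpm,
        "TPM": limits.tpm // tokens_per_request,
        "TPD": limits.tpd // tokens_per_request,
    }
    # The smallest capacity is the limit that is reached first.
    name = min(capacity, key=capacity.get)
    return name, capacity[name]


# RPM of 20 as in the example above; the TPM and TPD values are made up.
example = Limits(rpm=20, tpm=150_000, tpd=3_000_000)
print(binding_limit(example, tokens_per_request=5_000))  # ('RPM', 20)
```

With a lower hypothetical TPM of 50,000, the same 5,000-token requests would reach the token limit first, after 10 requests.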
Limits vary by the model tier (Jackal or Bison) being used.
In addition, each organization has a usage limit on the total amount it can spend on the API each month.
Usage tiers
You can view the rate and usage limits for your organization under the Usage section of your account settings. As your usage of our API increases, we automatically graduate you to the next usage tier.
| Tier | Qualification | Usage Limit |
|---|---|---|
| Tier 1 | User must be in an allowed geography | $50 / month |
| Tier 2 | $50 paid and 7+ days since first successful payment | $200 / month |
| Tier 3 | $100 paid and 7+ days since first successful payment | $500 / month |
| Tier 4 | $250 paid and 14+ days since first successful payment | $1,000 / month |
| Tier 5 | $1,000 paid and 30+ days since first successful payment | $50,000 / month |
Tier upgrades are automatic once both criteria (total spend and account age) are met. If you need custom terms, please contact [email protected].
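To estimate which tier an organization qualifies for, the table above can be encoded as a lookup. This is an illustrative sketch only: the `usage_tier` function is hypothetical, it assumes the thresholds are inclusive (for example, "7+ days" means 7 or more), and it does not model the geography check for Tier 1.

```python
# (tier, minimum total paid in USD, minimum days since first successful payment),
# ordered from highest tier to lowest so the first match wins.
TIER_REQUIREMENTS = [
    (5, 1_000, 30),
    (4, 250, 14),
    (3, 100, 7),
    (2, 50, 7),
]

# Monthly usage limit in USD for each tier, copied from the table above.
MONTHLY_USAGE_LIMIT_USD = {1: 50, 2: 200, 3: 500, 4: 1_000, 5: 50_000}


def usage_tier(total_paid_usd: float, days_since_first_payment: int) -> int:
    """Return the highest tier whose spend and account-age criteria are both met."""
    for tier, min_paid, min_days in TIER_REQUIREMENTS:
        if total_paid_usd >= min_paid and days_since_first_payment >= min_days:
            return tier
    # Assumption: the organization is in an allowed geography, so it is at least Tier 1.
    return 1


tier = usage_tier(total_paid_usd=300, days_since_first_payment=10)
print(tier, MONTHLY_USAGE_LIMIT_USD[tier])  # 3 500
```

In this example the organization has paid enough for Tier 4 but is only 10 days past its first payment, so it stays in Tier 3 until day 14.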
Rate limits by tier
The following tables show the rate limits for each usage tier. These limits apply per organization.
Rate limits differ between Student and Teacher roles. Teacher models have lower limits due to their higher computational cost. When using the auto role, limits are applied based on which model is currently serving your requests.
Jackal tier
Teacher
| Tier | RPM | TPM | TPD |
|---|---|---|---|
| Tier 1 | 500 | 250,000 | 5,000,000 |
| Tier 2 | 1,000 | 500,000 | 10,000,000 |
| Tier 3 | 2,000 | 1,000,000 | 20,000,000 |
| Tier 4 | 5,000 | 2,500,000 | 50,000,000 |
| Tier 5 | 10,000 | 10,000,000 | 100,000,000 |
Student
| Tier | RPM | TPM | TPD |
|---|---|---|---|
| Tier 1 | 1,000 | 500,000 | 10,000,000 |
| Tier 2 | 2,000 | 1,000,000 | 20,000,000 |
| Tier 3 | 5,000 | 2,500,000 | 50,000,000 |
| Tier 4 | 10,000 | 5,000,000 | 100,000,000 |
| Tier 5 | 30,000 | 20,000,000 | 500,000,000 |
Bison tier
Teacher
| Tier | RPM | TPM | TPD |
|---|---|---|---|
| Tier 1 | 500 | 250,000 | 5,000,000 |
| Tier 2 | 1,000 | 500,000 | 10,000,000 |
| Tier 3 | 2,000 | 1,000,000 | 20,000,000 |
| Tier 4 | 5,000 | 2,500,000 | 50,000,000 |
| Tier 5 | 10,000 | 10,000,000 | 100,000,000 |
Student
| Tier | RPM | TPM | TPD |
|---|---|---|---|
| Tier 1 | 1,000 | 250,000 | 5,000,000 |
| Tier 2 | 2,000 | 500,000 | 10,000,000 |
| Tier 3 | 5,000 | 1,000,000 | 20,000,000 |
| Tier 4 | 10,000 | 2,500,000 | 50,000,000 |
| Tier 5 | 30,000 | 10,000,000 | 200,000,000 |
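One way to stay under the RPM and TPM values in the tables above is to track usage on the client before sending each request. The sketch below is an assumption-laden illustration, not the service's own algorithm: the `SlidingWindowLimiter` class is hypothetical, and it assumes a rolling 60-second window, which this page does not specify. It does not track TPD.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side guard that tracks requests and tokens over the last 60 seconds."""

    def __init__(self, rpm: int, tpm: int, clock=time.monotonic):
        self.rpm = rpm
        self.tpm = tpm
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.events = deque()  # one (timestamp, tokens) entry per admitted request
        self.tokens_in_window = 0

    def _evict_expired(self, now: float) -> None:
        # Drop requests that are 60 seconds old or older.
        while self.events and now - self.events[0][0] >= 60.0:
            _, tokens = self.events.popleft()
            self.tokens_in_window -= tokens

    def try_acquire(self, tokens: int) -> bool:
        """Record the request and return True if it fits under both limits, else False."""
        now = self.clock()
        self._evict_expired(now)
        if len(self.events) + 1 > self.rpm:
            return False  # would exceed requests per minute
        if self.tokens_in_window + tokens > self.tpm:
            return False  # would exceed tokens per minute
        self.events.append((now, tokens))
        self.tokens_in_window += tokens
        return True


# Bison Teacher, Tier 1 values from the table above.
limiter = SlidingWindowLimiter(rpm=500, tpm=250_000)
if limiter.try_acquire(tokens=5_000):
    pass  # send the request here
```

Because Teacher limits are the lower of the two roles, configuring the limiter with Teacher values is a conservative choice when using the auto role, where either model may serve a given request.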
Embedding models
Embedding models do not have Teacher/Student roles. When using EmcieService in Parlant, the framework automatically selects the appropriate embedding model for each task, so you generally don't need to think about these limits directly.
Jackal Embedding
| Tier | RPM | TPM | TPD |
|---|---|---|---|
| Tier 1 | 3,000 | 1,000,000 | 50,000,000 |
| Tier 2 | 5,000 | 2,000,000 | 100,000,000 |
| Tier 3 | 10,000 | 5,000,000 | 200,000,000 |
| Tier 4 | 20,000 | 10,000,000 | 500,000,000 |
| Tier 5 | 50,000 | 50,000,000 | Unlimited |
Bison Embedding
| Tier | RPM | TPM | TPD |
|---|---|---|---|
| Tier 1 | 1,500 | 500,000 | 25,000,000 |
| Tier 2 | 3,000 | 1,000,000 | 50,000,000 |
| Tier 3 | 5,000 | 2,500,000 | 100,000,000 |
| Tier 4 | 10,000 | 5,000,000 | 250,000,000 |
| Tier 5 | 30,000 | 25,000,000 | Unlimited |
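If you do call the embedding models directly for a large batch job, the TPM and TPD columns give a rough lower bound on how long the job takes. The following is a back-of-the-envelope sketch: the `estimate_embedding_time` function and the corpus size are hypothetical, RPM is ignored, and it assumes the daily allowance is replenished once per day, which this page does not state.

```python
from typing import Optional

MINUTES_PER_DAY = 24 * 60


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


def estimate_embedding_time(corpus_tokens: int, tpm: int, tpd: Optional[int]) -> tuple[float, int]:
    """Return (minutes of sending at full TPM, minimum number of days) for a corpus.

    Pass tpd=None for tiers whose TPD is listed as Unlimited.
    """
    minutes_at_full_tpm = corpus_tokens / tpm
    # Days needed if only the per-minute limit applied.
    days_by_tpm = ceil_div(corpus_tokens, tpm * MINUTES_PER_DAY)
    # Days needed if only the per-day limit applied.
    days_by_tpd = 1 if tpd is None else ceil_div(corpus_tokens, tpd)
    return minutes_at_full_tpm, max(1, days_by_tpm, days_by_tpd)


# Bison Embedding, Tier 1 values from the table above; the corpus size is made up.
minutes, days = estimate_embedding_time(120_000_000, tpm=500_000, tpd=25_000_000)
print(minutes, days)  # 240.0 5
```

In this example the job needs only 240 minutes of sending at full TPM, but the daily token limit spreads it over at least 5 days, so TPD is the constraint to plan around.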
These limits may change from time to time based on resource availability, so please review this page periodically.
Requesting higher limits
If your application requires limits beyond Tier 5, contact us using the Contact page to discuss enterprise options. Please include:
- Your use case description
- Current and expected request volumes
- Peak traffic patterns
