FAQ
How do I create an API key?
Go to My Account → My API Keys and generate a new key. The key is required to authenticate all model calls via the Inference API.
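As a rough illustration of what authenticating a call with the key can look like, here is a minimal sketch. The endpoint URL, the Bearer header scheme, the request body shape, and the INFERENCE_API_KEY environment variable are all assumptions made for this example; the platform's API reference defines the real values.

```python
import json
import os
import urllib.request

# Hypothetical endpoint: replace with the URL from the platform's API reference.
API_URL = "https://api.example.com/v1/completions"


def build_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that carries the API key."""
    if not api_key:
        raise ValueError("API key is missing")
    # Assumed body shape; the real API may expect different fields.
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            # Assumed scheme: many inference APIs use a Bearer token header.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Read the key from the environment instead of hard-coding it in source.
    key = os.environ.get("INFERENCE_API_KEY", "")
    if key:
        request = build_request(key, "example-model", "Hello")
        print(request.get_method(), request.full_url)
```

Keeping the key in an environment variable rather than in source code makes it easier to rotate and harder to leak through version control.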
How is pricing calculated?
Costs are based on input and output tokens. Detailed pricing is available under Product Information → Billing Management.
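To make the token-based calculation concrete, the sketch below estimates the cost of a single call. It assumes prices are quoted per one million tokens, with separate input and output rates; the prices used are placeholders, not actual rates, so check Billing Management for the real figures and units.

```python
from decimal import Decimal


def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_million: str,
                  output_price_per_million: str) -> Decimal:
    """Estimate the cost of one call from its token counts.

    Prices are passed as strings so Decimal keeps them exact.
    The per-million-token unit is an assumption for this example.
    """
    million = Decimal(1_000_000)
    input_cost = Decimal(input_tokens) * Decimal(input_price_per_million)
    output_cost = Decimal(output_tokens) * Decimal(output_price_per_million)
    return (input_cost + output_cost) / million


if __name__ == "__main__":
    # Placeholder prices: 0.50 per million input tokens, 1.50 per million output tokens.
    print(estimate_cost(1200, 300, "0.50", "1.50"))
```

With these placeholder prices, a call with 1,200 input tokens and 300 output tokens costs (1200 × 0.50 + 300 × 1.50) / 1,000,000 = 0.00105 currency units.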
Where can I find rate limits for a model?
Each model has its own rate limits, expressed as requests per second or tokens per second. View them under Product Information → Rate Limit.
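One common way to stay under a per-second limit is to throttle on the client side. The sketch below uses a token bucket, a general technique and not something the platform prescribes. The TokenBucket class and its parameters are hypothetical names for this example; set the rate from the values shown on the Rate Limit page.

```python
import time


class TokenBucket:
    """Client-side throttle: allows `rate_per_second` units on average,
    with bursts of up to `capacity` units."""

    def __init__(self, rate_per_second: float, capacity: float,
                 clock=time.monotonic):
        self.rate = rate_per_second
        self.capacity = capacity
        self.clock = clock          # injectable so the logic can be tested
        self.tokens = capacity      # start with a full bucket
        self.updated = clock()

    def try_acquire(self, cost: float = 1.0) -> bool:
        """Return True and spend `cost` units if available, else False."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


if __name__ == "__main__":
    bucket = TokenBucket(rate_per_second=2, capacity=2)
    for i in range(4):
        if bucket.try_acquire():
            print(f"request {i}: send")
        else:
            print(f"request {i}: wait before retrying")
```

For a limit expressed in requests per second, use a cost of 1 per call. For a limit expressed in tokens per second, pass the estimated token count of the call as the cost.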
Does AI Inference support autoscaling?
Yes. The platform supports autoscaling for model endpoints. You can configure scaling behavior based on traffic patterns to optimize cost and maintain stability.