Inference Pricing


LLM, image, audio, video, and signal models are available through the Inference API. For these models, you can pay as you go or pay a flat rate of $1,000 per month for hosting. Prices are per 1,000 tokens and count both input and output tokens.

Model Size     Large Language Model, Chat     Image / Audio / Video / Signal     Hosting
               (price per 1,000 tokens)       (price per 1,000 tokens)           (price per hour)
40.1B - 70B    $0.0010                        $0.01                              $6.17
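
As a rough illustration of how the pay-as-you-go rates add up, here is a minimal sketch. It assumes the listed prices are per 1,000 tokens (as stated in the introduction) and that input and output tokens are billed at the same rate. The function and constant names are hypothetical and not part of any official SDK.

```python
from decimal import Decimal

# Rates taken from the pricing table (40.1B - 70B tier).
# Assumption: prices are per 1,000 tokens, input and output billed alike.
LLM_PRICE_PER_1K = Decimal("0.0010")
MEDIA_PRICE_PER_1K = Decimal("0.01")
FLAT_RATE_PER_MONTH = Decimal("1000")


def pay_as_you_go_cost(input_tokens: int, output_tokens: int,
                       price_per_1k: Decimal) -> Decimal:
    """Estimated cost in USD for a given number of tokens."""
    total_tokens = input_tokens + output_tokens
    return Decimal(total_tokens) / Decimal(1000) * price_per_1k


def break_even_tokens(price_per_1k: Decimal,
                      flat_rate: Decimal = FLAT_RATE_PER_MONTH) -> int:
    """Monthly token volume at which pay-as-you-go equals the flat rate."""
    return int(flat_rate / price_per_1k * 1000)


if __name__ == "__main__":
    # 1.5M input tokens plus 0.5M output tokens on an LLM
    print(pay_as_you_go_cost(1_500_000, 500_000, LLM_PRICE_PER_1K))
    print(break_even_tokens(LLM_PRICE_PER_1K))
    print(break_even_tokens(MEDIA_PRICE_PER_1K))
```

Under these assumptions, 2 million LLM tokens cost about $2, and pay-as-you-go only reaches the $1,000 flat rate at roughly 1 billion LLM tokens per month.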

Fine-tuned models

After you fine-tune a model with the Fine-tuning API, you can host it for inference. When hosting your own model, you pay hourly for the GPU instances. You can start or stop your instance at any time through the web-based Playground or the start/stop instance APIs.
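
Because hosting is billed hourly, stopping an instance when it is idle changes the monthly bill noticeably. The sketch below estimates that effect. It assumes the $6.17 hourly rate from the table applies to your fine-tuned model and that partial hours are billed proportionally; neither is confirmed here, and the function name is hypothetical.

```python
from decimal import Decimal

# Assumption: the 40.1B - 70B hosting rate from the pricing table applies.
HOURLY_RATE = Decimal("6.17")


def hosting_cost(hours_running, hourly_rate: Decimal = HOURLY_RATE) -> Decimal:
    """Estimated hosting cost in USD for the hours an instance is running.

    Assumes billing is proportional to running time.
    """
    return Decimal(str(hours_running)) * hourly_rate


if __name__ == "__main__":
    always_on = hosting_cost(24 * 30)       # running all month (30 days)
    business_hours = hosting_cost(8 * 22)   # 8 hours a day, 22 working days
    print(always_on, business_hours)
```

With these assumptions, an always-on instance costs about $4,442 over 30 days, while running it only 8 hours on 22 working days costs about $1,086.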
