MODEL INFERENCE

Deploy AI Models

One-click deployment for the latest open-source AI models. Run DeepSeek, Llama 4, and more with serverless inference or dedicated GPU infrastructure.

Full Catalog

All Available Models

Browse our complete model catalog with filtering and one-click deployment.

Model	Parameters	Category	Context	Pricing
DeepSeek V3	671B (37B active)	MoE	64K tokens	In: ₹20/1M tokens Out: ₹40/1M tokens
DeepSeek R1s	671B (37B active)	Reasoning	64K tokens	In: ₹25/1M tokens Out: ₹50/1M tokens
Llama 4 Maverick	17B x 128 Experts	MoE	128K tokens	In: ₹30/1M tokens Out: ₹60/1M tokens
Llama 4 Scout	17B x 16 Experts	MoE	128K tokens	In: ₹15/1M tokens Out: ₹30/1M tokens
GPT OSS 120B	120 Billion	General Purpose	8,192 tokens	In: ₹80/1M tokens Out: ₹160/1M tokens
Hermes 3 Llama 3.1 405B	405 Billion	General Purpose	128K tokens	In: ₹60/1M tokens Out: ₹120/1M tokens
Sarvam-2B	2 Billion	Multilingual	4,096 tokens	In: ₹5/1M tokens Out: ₹10/1M tokens
Dolphin 2.9.2 Mistral 8x22B	8 x 22B MoE	MoE	64K tokens	In: ₹40/1M tokens Out: ₹80/1M tokens
DeepSeek V3 0324	671B (37B active)	General Purpose	64K tokens	In: ₹20/1M tokens Out: ₹40/1M tokens

Showing 1–9 of 9 models

Choose how you want to run your AI models.

Pay-per-token pricing with instant scaling

How It Works

Choose how you want to run your AI models.

Deploy any model in seconds with pre-optimized configurations.

Drop-in replacement for OpenAI API with minimal code changes.

Get started with our free tier. No credit card required.