AI + Graphics

NVIDIA L40S

Universal data center GPU built on the NVIDIA Ada Lovelace architecture. It combines AI inference with ray-traced graphics for generative AI, video processing, and visualization workloads.

Model Specifications

Architecture: Ada Lovelace
VRAM: 48 GB GDDR6
Memory Bandwidth: 864 GB/s
CUDA Cores: 18,176
Tensor Cores: 568 (4th Gen)
RT Cores: 142 (3rd Gen)
FP16 Performance: 362 TFLOPS
FP8 Performance: 724 TFLOPS
TDP: 350 W
PCIe: Gen4 x16
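As a quick sanity check on the figures listed above, the unit counts can be compared against each other. The sketch below assumes the Ada Lovelace per-SM layout of 128 CUDA cores, 4 Tensor Cores, and 1 RT core per streaming multiprocessor; that layout is not stated on this page and is brought in as an assumption. The names `SPECS`, `UNITS_PER_SM`, `implied_sm_counts`, and `fp8_speedup_over_fp16` are illustrative only.

```python
# Spec values copied from the list above.
SPECS = {
    "cuda_cores": 18_176,
    "tensor_cores": 568,
    "rt_cores": 142,
    "fp16_tflops": 362,
    "fp8_tflops": 724,
}

# Assumption (not stated on this page): units per streaming
# multiprocessor (SM) on the Ada Lovelace architecture.
UNITS_PER_SM = {"cuda_cores": 128, "tensor_cores": 4, "rt_cores": 1}


def implied_sm_counts(specs: dict) -> dict:
    """Return the SM count implied by each unit type."""
    return {name: specs[name] / per_sm for name, per_sm in UNITS_PER_SM.items()}


def fp8_speedup_over_fp16(specs: dict) -> float:
    """Ratio of the listed peak FP8 throughput to the listed peak FP16 throughput."""
    return specs["fp8_tflops"] / specs["fp16_tflops"]


if __name__ == "__main__":
    print(implied_sm_counts(SPECS))
    print(fp8_speedup_over_fp16(SPECS))
```

Under that assumption, all three unit counts imply the same number of SMs, and the listed FP8 peak is exactly twice the listed FP16 peak.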

Pricing Plans

Flexible pricing options to match your workload requirements.

On-Demand

Pay as you go with no commitment

₹150/hour
  • 1x NVIDIA L40S GPU
  • 24 vCPUs
  • 192 GB RAM
  • 500 GB NVMe SSD
  • No minimum commitment
  • Start/stop anytime

Reserved 1 Month (Most Popular)

Save 15% with a monthly commitment

₹63,750/month
  • 1x NVIDIA L40S GPU
  • 24 vCPUs
  • 192 GB RAM
  • 500 GB NVMe SSD
  • 15% discount
  • Priority support

Reserved 1 Year

Maximum savings with an annual commitment

₹45,000/month
  • 1x NVIDIA L40S GPU
  • 24 vCPUs
  • 192 GB RAM
  • 500 GB NVMe SSD
  • 40% discount
  • Dedicated support
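
To compare the three plans, it can help to work out how many on-demand hours per month cost the same as each reserved plan. The sketch below uses only the prices and discounts listed above; the function names are illustrative. It also back-computes the undiscounted monthly price implied by each discount. That baseline is inferred from the numbers and is not stated on the page.

```python
ON_DEMAND_PER_HOUR = 150  # INR per hour, from the On-Demand plan

# (monthly price in INR, advertised discount) for each reserved plan
RESERVED_PLANS = {
    "Reserved 1 Month": (63_750, 0.15),
    "Reserved 1 Year": (45_000, 0.40),
}


def break_even_hours(monthly_price: float, hourly_rate: float = ON_DEMAND_PER_HOUR) -> float:
    """Hours of on-demand use per month that cost the same as the reserved plan."""
    return monthly_price / hourly_rate


def implied_baseline(monthly_price: float, discount: float) -> float:
    """Undiscounted monthly price implied by a plan's price and discount."""
    return monthly_price / (1 - discount)


if __name__ == "__main__":
    for name, (price, discount) in RESERVED_PLANS.items():
        hours = break_even_hours(price)
        baseline = implied_baseline(price, discount)
        print(f"{name}: break-even at {hours:.0f} h/month, "
              f"implied baseline INR {baseline:,.0f}/month")
```

With the listed prices, the monthly plan breaks even at 425 on-demand hours and the annual plan at 300 hours. Both discounts are consistent with a baseline of ₹75,000 per month, which equals 500 on-demand hours rather than a full month of continuous use.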
Key Features

Why Choose NVIDIA L40S

Ada Lovelace Architecture

NVIDIA Ada Lovelace architecture with 4th-gen Tensor Cores and 3rd-gen RT Cores.

FP8 Support

Native FP8 for up to 2x the inference throughput of FP16 on transformer models.
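
One way to see where a figure of roughly 2x can come from: in single-stream token generation, each new token typically requires reading every model weight from memory once, so throughput is bounded by memory bandwidth divided by the size of the weights. The sketch below is a rough upper-bound estimate under that assumption, not a benchmark. It uses the 864 GB/s figure from the specifications, and the 13-billion-parameter model is a hypothetical example.

```python
BANDWIDTH_GB_S = 864  # memory bandwidth from the specification list

BYTES_PER_PARAM = {"fp16": 2, "fp8": 1}


def weight_gb(params_billion: float, precision: str) -> float:
    """Approximate weight size in GB (1 billion params at 1 byte each = 1 GB)."""
    return params_billion * BYTES_PER_PARAM[precision]


def decode_tokens_per_s_upper_bound(
    params_billion: float,
    precision: str,
    bandwidth_gb_s: float = BANDWIDTH_GB_S,
) -> float:
    """Bandwidth-bound ceiling on single-stream decode speed.

    Assumes all weights are read once per generated token and ignores
    KV-cache traffic, activations, and kernel overheads.
    """
    return bandwidth_gb_s / weight_gb(params_billion, precision)


if __name__ == "__main__":
    model_b = 13  # hypothetical model size, in billions of parameters
    for precision in ("fp16", "fp8"):
        bound = decode_tokens_per_s_upper_bound(model_b, precision)
        print(f"{precision}: at most about {bound:.1f} tokens/s")
```

Because FP8 halves the bytes read per weight, the ceiling doubles. Real throughput depends on batch size, sequence length, and the inference software in use.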

Universal Workloads

Combines AI inference, graphics, and video processing in one GPU.

DLSS 3 Ready

Hardware support for AI-powered frame generation and super sampling.

Use Cases

Generative AI Inference

Deploy LLMs, Stable Diffusion, and other generative models efficiently.
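
When sizing an LLM deployment for a single card, a first-pass question is how large a model fits in 48 GB. The sketch below estimates the largest parameter count whose weights fit at a given precision after setting aside some memory for the KV cache, activations, and runtime. The 8 GB reserve is an arbitrary illustrative assumption and should be tuned to the actual workload; `max_params_billion` is a hypothetical helper name.

```python
VRAM_GB = 48  # from the specification list

BYTES_PER_PARAM = {"fp16": 2, "fp8": 1}


def max_params_billion(
    precision: str,
    vram_gb: float = VRAM_GB,
    reserve_gb: float = 8.0,  # assumption: headroom for KV cache, activations, runtime
) -> float:
    """Largest model size (billions of parameters) whose weights fit in VRAM."""
    usable_gb = vram_gb - reserve_gb
    if usable_gb <= 0:
        raise ValueError("reserve_gb must be smaller than vram_gb")
    # 1 billion parameters at 1 byte each occupy about 1 GB.
    return usable_gb / BYTES_PER_PARAM[precision]


if __name__ == "__main__":
    for precision in ("fp16", "fp8"):
        limit = max_params_billion(precision)
        print(f"{precision}: up to about {limit:.0f}B parameters")
```

With these assumptions, the weights-only limit is about 20 billion parameters at FP16 and about 40 billion at FP8. Larger models need quantization below 8 bits or more than one GPU.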

Real-time Graphics

Ray-traced rendering and real-time visualization with 3rd gen RT cores.

Video AI

Video analytics, encoding, and AI-powered video processing.

Omniverse & Digital Twins

Build and run NVIDIA Omniverse applications at scale.

Ready to Deploy NVIDIA L40S?

AI inference and graphics in one GPU.
