L40 on CoreWeave
CoreWeave's NVIDIA L40 offering combines a high-performance enterprise GPU with a Kubernetes-native cloud platform built for large-scale AI training, inference, and VFX rendering. The NVIDIA L40, built on the Ada Lovelace architecture with 48GB of GDDR6 memory, targets demanding visualization, real-time ray tracing, and AI inference workloads, with roughly 90 TFLOPS of FP32 compute and FP8/INT8 Tensor Core throughput suited to efficient LLM serving. The pairing stands out because CoreWeave's InfiniBand interconnects enable low-latency multi-node scaling across thousands of GPUs, which suits sophisticated engineering teams deploying large-scale LLM inference and VFX studios that need burst rendering. Key value propositions include per-second billing for precise cost control, spot instances for up to 80% savings, and Kubernetes orchestration without vendor lock-in. ML engineers get infrastructure that supports custom containerized workflows, high-speed NVMe storage, and direct access to current hardware, which makes it a strong option for production-grade AI deployments compared with general-purpose clouds.
Why NVIDIA L40 on CoreWeave?
Choose CoreWeave for NVIDIA L40 when you need scalable inference and rendering in a Kubernetes-native environment. CoreWeave's strengths, namely massive InfiniBand clusters (up to 400Gb/s per node) and hyperscale pod deployments, complement the L40's 48GB of VRAM and Ada Lovelace efficiency for multi-GPU inference pipelines. Per-second billing and spot instances keep costs down for bursty workloads, while native Kubernetes support enables GitOps workflows, autoscaling, and integration with tools like Ray or Kubeflow. The combination works well for LLM serving at scale or VFX, offering lower latency than Ethernet-based alternatives and avoiding the overhead of self-managing clusters.
Live Pricing
Real-time NVIDIA L40 offers from CoreWeave
| Provider | GPU Model | VRAM (per GPU) | Host Specs | Region | Price |
|---|---|---|---|---|---|
| CoreWeave | 8× NVIDIA L40 | 48GB | 128 vCPU, 0GB RAM (as listed), 7680GB storage | United States | $1.25/GPU/hr ($10.00/hr total for 8×) |
| CoreWeave | 8× NVIDIA L40S | 48GB | 128 vCPU, 0GB RAM (as listed), 7680GB storage | United States | $2.25/GPU/hr ($18.00/hr total for 8×) |
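The node totals in the listing are the per-GPU rate multiplied by the GPU count. The short sketch below reproduces that arithmetic from the listed prices and derives the relative premium of the L40S listing; the function name `node_hourly_cost` is illustrative and not part of any CoreWeave tooling.

```python
def node_hourly_cost(price_per_gpu_hr: float, gpu_count: int) -> float:
    """Hourly cost of a whole node, given the per-GPU hourly rate."""
    return price_per_gpu_hr * gpu_count


# Rates copied from the listing (USD per GPU per hour), 8 GPUs per node.
l40_node = node_hourly_cost(1.25, 8)
l40s_node = node_hourly_cost(2.25, 8)

# Relative premium of the L40S listing over the L40 listing.
premium = (l40s_node - l40_node) / l40_node

print(f"8x L40:  ${l40_node:.2f}/hr")
print(f"8x L40S: ${l40s_node:.2f}/hr")
print(f"L40S premium: {premium:.0%}")
```

At these listed rates the L40S node costs 80% more per hour, so it only pays off if the workload runs correspondingly faster on it.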


Performance Notes
On CoreWeave, expect strong L40 performance: 18,176 CUDA cores deliver ~90 TFLOPS FP32, with Tensor Cores reaching about 181 TFLOPS FP16 and 362 TOPS INT8 (dense) for inference-heavy tasks. InfiniBand networking (200-400Gb/s) supports multi-node scaling, and pods of up to 8x L40s communicate through NCCL over PCIe, since the L40 has no NVLink; distributed inference can scale close to linearly when inter-GPU traffic is modest. High-IOPS NVMe storage (up to 30GB/s) accelerates data loading. The L40 performs well with TensorRT-optimized models, but exact figures vary by workload, so check CoreWeave's published benchmark results for your model class. Limitations: not ideal for FP64-heavy HPC; the 300W power draw suits dense racks, but monitor thermals in long runs.
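Whether a model can be served from a single L40 depends largely on whether its weights fit in 48GB with room to spare. The following is a rough sketch of that check, not a sizing tool: the model sizes are illustrative, the 20% headroom for KV cache and activations is an assumption, and the capacity is treated as GiB for simplicity.

```python
def weights_gib(params_billions: float, bytes_per_param: float) -> float:
    """Approximate memory needed for model weights alone, in GiB."""
    return params_billions * 1e9 * bytes_per_param / 2**30


VRAM_GIB = 48    # L40 memory capacity, treated as GiB for a rough check
HEADROOM = 0.8   # assumption: leave ~20% free for KV cache and activations

# Illustrative model sizes (billions of parameters) and weight precisions.
for name, params_b in [("7B", 7), ("13B", 13), ("70B", 70)]:
    for precision, nbytes in [("FP16", 2), ("FP8/INT8", 1)]:
        need = weights_gib(params_b, nbytes)
        fits = need <= VRAM_GIB * HEADROOM
        verdict = "fits on one L40" if fits else "needs multiple GPUs"
        print(f"{name} @ {precision}: {need:.1f} GiB -> {verdict}")
```

Under these assumptions a 70B-parameter model does not fit on one card even at 8-bit precision, which is where multi-GPU pods become relevant.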
CoreWeave is a specialized GPU cloud designed for massive-scale AI training and VFX rendering, with a Kubernetes-native architecture.
Best For
- Sophisticated engineering teams training LLMs at scale
- VFX studios requiring burst rendering capacity
Unique Features
- Kubernetes-native architecture
- Access to massive-scale InfiniBand clusters
VRAM: 48GB
Architecture: Ada Lovelace
Tier: enterprise
Platform Features
Getting Started
Getting started with NVIDIA L40 on CoreWeave is straightforward via their web console or kubectl, leveraging Kubernetes for rapid pod deployment. New users can launch GPU instances in minutes, with pre-built images for PyTorch/TensorFlow and inference frameworks.
Steps
1. Create a CoreWeave account at console.coreweave.com and complete billing setup.
2. Navigate to 'Pods' in the console and select the NVIDIA L40 GPU configuration.
3. Choose an instance size (e.g., 1-8 GPUs), storage, and image; deploy with per-second or spot pricing.
4. Access the instance via SSH or Jupyter from the console, or via kubectl; scale with Kubernetes manifests.
5. Monitor via the built-in dashboard and optimize with TensorRT for inference.
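For the kubectl route mentioned in the steps, a pod requests GPUs through the standard `nvidia.com/gpu` resource limit. The sketch below builds such a manifest as a Python dictionary and prints it as JSON. The node selector label `example.com/gpu-model` and the image name are placeholders, not CoreWeave values; look up the real L40 node label and your own image before using it.

```python
import json


def l40_pod_manifest(name: str, image: str, gpus: int = 1) -> dict:
    """Build a minimal Kubernetes Pod manifest that requests NVIDIA GPUs."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            # Hypothetical label: replace with the provider's real L40 node label.
            "nodeSelector": {"example.com/gpu-model": "L40"},
            "containers": [
                {
                    "name": "main",
                    "image": image,
                    # Standard resource name exposed by the NVIDIA device plugin.
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


# Placeholder image name, for illustration only.
manifest = l40_pod_manifest("l40-inference", "registry.example.com/inference:latest", gpus=1)
print(json.dumps(manifest, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed output can be saved to a file and applied with `kubectl apply -f`.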
Pro Tips
- Leverage spot instances for non-critical inference to cut costs by 50-80% during low-demand periods.
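- Use the Kubernetes Horizontal Pod Autoscaler for dynamic scaling based on inference queue depth, exposed as a custom or external metric.
- Keep container images pre-pulled on GPU nodes to shorten startup in burst workloads; the NVIDIA Container Toolkit then exposes the GPU to the container at launch.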
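To see how queue-depth autoscaling behaves, it helps to work through the scaling rule the Horizontal Pod Autoscaler documents: desired replicas = ceil(current replicas × current metric / target metric). The sketch below applies that rule with min/max clamping. It ignores the autoscaler's tolerance band and stabilization windows, and the queue-depth numbers are made up for illustration.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 8,
) -> int:
    """Core HPA scaling rule, clamped to the configured replica bounds."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# Average queue depth per pod is 30 against a target of 10: scale 2 -> 6.
print(desired_replicas(2, 30, 10))
# Queue is nearly drained (3 per pod, target 10): scale 4 -> 2.
print(desired_replicas(4, 3, 10))
# A large spike is capped at max_replicas.
print(desired_replicas(4, 50, 10))
```

Setting `max_replicas` to the number of GPUs you are willing to pay for keeps a traffic spike from scaling past budget.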
Frequently Asked Questions
What is CoreWeave's billing model for NVIDIA L40?
CoreWeave bills per-second for GPU instances including NVIDIA L40. Per-second billing ensures you only pay for exactly the compute time you use, which is particularly cost-effective for short experiments, iterative development, and workloads with variable duration.
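The effect of per-second billing is easiest to see on a short job. The sketch below compares it with a hypothetical scheme that rounds up to whole hours, using the $10.00/hr rate listed for the 8× L40 node; the hourly-rounded scheme is included only as a contrast and is not attributed to any provider.

```python
import math


def per_second_cost(hourly_rate: float, seconds: int) -> float:
    """Cost when every second is billed at 1/3600 of the hourly rate."""
    return hourly_rate * seconds / 3600


def hourly_rounded_cost(hourly_rate: float, seconds: int) -> float:
    """Cost when usage is rounded up to the next full hour (for contrast)."""
    return hourly_rate * math.ceil(seconds / 3600)


rate = 10.00           # USD/hr for the listed 8x L40 node
run_seconds = 10 * 60  # a 10-minute experiment

print(f"per-second billing: ${per_second_cost(rate, run_seconds):.2f}")
print(f"hourly rounding:    ${hourly_rounded_cost(rate, run_seconds):.2f}")
```

The gap shrinks as jobs get longer, so the benefit is largest for short, iterative runs.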
Does CoreWeave offer spot instances for NVIDIA L40?
Yes, CoreWeave offers spot/preemptible instances for NVIDIA L40, which can reduce costs by 50-80% compared to on-demand pricing. Spot instances are ideal for fault-tolerant workloads like batch inference, hyperparameter tuning, and training jobs with checkpointing. Note that spot instances can be interrupted when demand is high, so ensure your workflow can handle preemption gracefully.
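Handling preemption gracefully usually means two things: noticing the termination signal and being able to resume from saved progress. The following is a minimal sketch of that pattern for a batch job, assuming the platform delivers SIGTERM before stopping the container (the Kubernetes default). `PreemptionGuard` and `run_batch` are hypothetical names, and the JSON checkpoint format is illustrative.

```python
import json
import os
import signal


class PreemptionGuard:
    """Records that a termination signal arrived so the loop can exit cleanly."""

    def __init__(self):
        self.stop_requested = False

    def install(self):
        # Kubernetes sends SIGTERM to a pod's containers before killing them.
        signal.signal(signal.SIGTERM, self._on_sigterm)

    def _on_sigterm(self, signum, frame):
        self.stop_requested = True


def run_batch(items, checkpoint_path, guard, process):
    """Process items in order, resuming from the checkpoint if one exists.

    Returns the index of the next unprocessed item.
    """
    next_index = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            next_index = json.load(f)["next_index"]

    while next_index < len(items) and not guard.stop_requested:
        process(items[next_index])
        next_index += 1
        # Persist progress after every item so a restart skips finished work.
        with open(checkpoint_path, "w") as f:
            json.dump({"next_index": next_index}, f)
    return next_index
```

In a real job, call `guard.install()` once at startup and keep the checkpoint on persistent storage so a rescheduled pod can pick up where the interrupted one stopped.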
How can I access NVIDIA L40 instances on CoreWeave?
CoreWeave provides access to NVIDIA L40 instances via SSH, built-in Jupyter notebooks, a web-based terminal, a programmatic API, and Docker containers. The built-in Jupyter notebook support makes it easy to start experimenting immediately without additional setup. SSH access gives you full control over the instance for custom configurations and production deployments. API access enables automation and integration with your existing ML pipelines and CI/CD workflows.
What compliance certifications does CoreWeave have for NVIDIA L40 workloads?
CoreWeave lists SOC 2, HIPAA, GDPR, and ISO 27001 among its compliance coverage, which makes it a candidate for regulated workloads. HIPAA compliance is particularly important for healthcare and medical AI applications. A SOC 2 report demonstrates security controls for handling sensitive data. Contact CoreWeave directly for detailed compliance documentation and a Business Associate Agreement (BAA) if needed.
Can I use NVIDIA L40 with Kubernetes on CoreWeave?
Yes, CoreWeave supports Kubernetes for orchestrating NVIDIA L40 workloads. This enables you to deploy scalable ML pipelines, manage distributed training jobs across multiple GPUs, and integrate with MLOps tools like Kubeflow, Argo Workflows, and KServe. Kubernetes support is essential for teams building production-grade ML infrastructure.
What are the specifications of the NVIDIA L40?
The NVIDIA L40 features 48GB of GDDR6 memory and is built on NVIDIA's Ada Lovelace architecture. As an enterprise-tier GPU, it is designed for AI inference at scale, graphics and rendering, and visualization rather than FP64-heavy HPC. The 48GB of VRAM supports serving large language models, complex neural networks, and multi-model deployments.
What workloads is NVIDIA L40 on CoreWeave best suited for?
The NVIDIA L40 on CoreWeave is well-suited for LLM inference and fine-tuning, batch inference at scale, and rendering workloads. CoreWeave specifically targets sophisticated engineering teams training LLMs at scale and VFX studios requiring burst rendering capacity. Consider your model size, data volume, and latency requirements when evaluating this combination for your specific use case.
Does CoreWeave offer reserved instances for NVIDIA L40?
Yes, CoreWeave offers reserved instance pricing for NVIDIA L40, which can provide significant discounts (typically 20-40% off on-demand rates) for committed usage periods. Reserved instances are ideal for predictable, long-running workloads like production inference services, ongoing training pipelines, or development environments that run continuously. Contact CoreWeave for current reserved pricing and commitment terms.
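Whether a reserved discount beats on-demand depends on utilization. The sketch below works through the break-even point under a simplifying assumption that is not confirmed by CoreWeave: reserved capacity is billed for every hour of the commitment, while on-demand is billed only for hours actually used. The $10.00/hr rate is the listed 8× L40 node price; the 30% discount and the 730-hour month are illustrative.

```python
def break_even_utilization(discount: float) -> float:
    """Fraction of hours a node must be busy for reserved to match on-demand."""
    return 1.0 - discount


def monthly_costs(on_demand_rate: float, discount: float, busy_hours: float,
                  hours_in_month: float = 730) -> tuple[float, float]:
    """Return (on_demand_cost, reserved_cost) for one month."""
    on_demand = on_demand_rate * busy_hours
    # Assumption: reserved capacity is paid for around the clock.
    reserved = on_demand_rate * (1.0 - discount) * hours_in_month
    return on_demand, reserved


for discount in (0.20, 0.40):
    print(f"{discount:.0%} discount -> break-even at "
          f"{break_even_utilization(discount):.0%} utilization")

on_demand, reserved = monthly_costs(10.00, 0.30, busy_hours=400)
print(f"400 busy hours: on-demand ${on_demand:.2f} vs reserved ${reserved:.2f}")
```

Under these assumptions, a 20-40% discount only pays off when the node is busy 60-80% of the time or more.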
What unique features does CoreWeave offer for NVIDIA L40?
CoreWeave differentiates itself with: Kubernetes-native architecture; Access to massive-scale InfiniBand clusters. These features may provide advantages depending on your specific workflow requirements and technical needs. Evaluate how these capabilities align with your ML infrastructure goals when making your decision.
How do I get started with NVIDIA L40 on CoreWeave?
To get started with NVIDIA L40 on CoreWeave, visit https://www.coreweave.com?utm_source=gpuperhour&utm_medium=referral to create an account. Once registered, you can typically launch an NVIDIA L40 instance within minutes through the console or API. We recommend starting with a small experiment to familiarize yourself with the platform before scaling up to larger workloads.