CoreWeave | 48GB VRAM | Ada Lovelace | enterprise

L40 on CoreWeave

CoreWeave's NVIDIA L40 offering pairs an enterprise GPU with a Kubernetes-native cloud platform built for large-scale AI training, inference, and VFX rendering. The NVIDIA L40, built on the Ada Lovelace architecture with 48GB of GDDR6 VRAM, targets demanding visualization, real-time ray tracing, and AI inference workloads, with roughly 90 TFLOPS of FP32 compute and FP8/INT8 Tensor Core throughput suited to efficient LLM serving. CoreWeave's InfiniBand interconnects enable low-latency multi-node scaling across thousands of GPUs, which suits engineering teams deploying large-scale LLM inference and VFX studios that need burst rendering capacity. Key value propositions include per-second billing for precise cost control, spot instances for savings of up to 80%, and standard Kubernetes orchestration that keeps workloads portable. ML engineers get infrastructure that supports custom containerized workflows, high-speed NVMe storage, and direct access to current hardware, making it a strong option for production AI deployments compared with general-purpose clouds.

Why NVIDIA L40 on CoreWeave?

Choose CoreWeave for the NVIDIA L40 when you need scalable inference and rendering in a Kubernetes-native environment. CoreWeave's strengths (large InfiniBand clusters at up to 400Gb/s per node and large-scale pod deployments) complement the L40's 48GB of VRAM and Ada Lovelace efficiency for multi-GPU inference pipelines. Per-second billing and spot instances keep costs down for bursty workloads, while native Kubernetes support enables GitOps workflows, autoscaling, and integration with tools like Ray or Kubeflow. The combination works well for LLM serving at scale and for VFX, offering lower latency than Ethernet-based alternatives and avoiding the overhead of self-managing clusters.

Live Pricing

Real-time NVIDIA L40 offers from CoreWeave

2 offers available

CoreWeave, United States
  • GPU: NVIDIA L40, 8x
  • VRAM: 48GB
  • vCPU: 128
  • RAM: 0GB
  • Storage: 7680GB
  • Price: $1.25/GPU/hr ($10.00/hr total, 8×)

CoreWeave, United States
  • GPU: NVIDIA L40S, 8x
  • VRAM: 48GB
  • vCPU: 128
  • RAM: 0GB
  • Storage: 7680GB
  • Price: $2.25/GPU/hr ($18.00/hr total, 8×)
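
The per-GPU rates above, combined with per-second billing, make job costs easy to estimate. The following is a minimal Python sketch using the two listed rates. The function name `job_cost` is illustrative, and the calculation assumes charges are strictly proportional to elapsed seconds with no minimum charge or extra fees, which this page does not confirm.

```python
def job_cost(rate_per_gpu_hour: float, gpus: int, seconds: int) -> float:
    """Estimated cost in dollars for a job billed per second.

    Assumption: cost scales linearly with elapsed seconds and GPU count.
    """
    if gpus < 1 or seconds < 0:
        raise ValueError("gpus must be >= 1 and seconds must be >= 0")
    return rate_per_gpu_hour * gpus * seconds / 3600


# Rates taken from the two listings above.
L40_RATE = 1.25   # $/GPU/hr
L40S_RATE = 2.25  # $/GPU/hr

if __name__ == "__main__":
    # A full hour on the 8-GPU node reproduces the listed hourly totals.
    print(f"8x L40,  1 hour: ${job_cost(L40_RATE, 8, 3600):.2f}")
    print(f"8x L40S, 1 hour: ${job_cost(L40S_RATE, 8, 3600):.2f}")
    # A 10-minute experiment on the same node.
    print(f"8x L40,  10 min: ${job_cost(L40_RATE, 8, 600):.2f}")
```

Under these assumptions, a short experiment costs a proportional fraction of the hourly total rather than a full hour.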

Performance Notes

On CoreWeave, expect strong L40 performance: 18,176 CUDA cores deliver ~90 TFLOPS FP32, and the Tensor Cores reach roughly 181 TFLOPS FP16 and 362 TOPS INT8 (dense figures; about double with structured sparsity) for inference-heavy tasks. The L40 does not support NVLink, so GPUs within a node communicate over PCIe Gen4, while NCCL handles collective communication within a node and across nodes over InfiniBand (200-400Gb/s). Pods scale up to 8x L40s, and distributed inference can approach linear speedup when the workload is not communication-bound. High-IOPS NVMe storage (up to 30GB/s) accelerates data loading. Benchmarks show the L40 excelling in TensorRT-optimized models, but exact figures vary by workload; CoreWeave publishes MLPerf-style results that can serve as a reference point. Limitations: the L40 is not ideal for FP64-heavy HPC, and its 300W power draw suits dense racks, but monitor thermals during long runs.

About CoreWeave

A premier specialized GPU cloud designed for massive-scale AI training and VFX rendering with Kubernetes-native architecture.

Best For

  • Sophisticated engineering teams training LLMs at scale
  • VFX studios requiring burst rendering capacity

Unique Features

  • Kubernetes-native architecture
  • Access to massive-scale InfiniBand clusters

NVIDIA L40 Specs

  • VRAM: 48GB
  • Architecture: Ada Lovelace
  • Tier: enterprise

Platform Features

Access Methods

  • SSH
  • Jupyter Notebooks
  • Web Terminal
  • API
  • Kubernetes
  • Containers

Billing Options

  • Increment: per-second
  • Spot Instances
  • Reserved Instances
  • Prepaid Credits

Compliance

  • SOC 2
  • HIPAA
  • GDPR
  • ISO 27001

Getting Started

Getting started with NVIDIA L40 on CoreWeave is straightforward via their web console or kubectl, leveraging Kubernetes for rapid pod deployment. New users can launch GPU instances in minutes, with pre-built images for PyTorch/TensorFlow and inference frameworks.

Steps

  1. Create a CoreWeave account at console.coreweave.com and complete billing setup.
  2. Navigate to 'Pods' in the console and select NVIDIA L40 GPU configuration.
  3. Choose instance size (e.g., 1-8 GPUs), storage, and image; deploy with per-second or spot pricing.
  4. Access via SSH/Jupyter from the console or kubectl; scale with Kubernetes manifests.
  5. Monitor via built-in dashboard and optimize with TensorRT for inference.
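
For the kubectl route, a pod requesting GPUs can be described with a small manifest. The sketch below builds one in Python and prints it as JSON, which `kubectl apply -f` accepts. It relies on the standard `nvidia.com/gpu` resource name exposed by the NVIDIA device plugin; the pod name and image are placeholders, and pinning the pod to L40 nodes specifically would need a node selector or affinity whose label keys are provider-specific and are not shown here.

```python
import json


def l40_pod_manifest(name: str, image: str, gpus: int = 1) -> dict:
    """Build a minimal Pod spec that requests NVIDIA GPUs.

    Assumption: the cluster exposes GPUs as the `nvidia.com/gpu` resource.
    """
    if not 1 <= gpus <= 8:
        raise ValueError("this sketch assumes 1-8 GPUs per pod")
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": "main",
                    "image": image,
                    # Prints the visible GPUs and exits; replace with a real workload.
                    "command": ["nvidia-smi"],
                    "resources": {"limits": {"nvidia.com/gpu": gpus}},
                }
            ],
        },
    }


if __name__ == "__main__":
    # Placeholder name and image; substitute your own.
    manifest = l40_pod_manifest("l40-smoke-test", "your-registry/your-image:tag", gpus=1)
    print(json.dumps(manifest, indent=2))
```

One way to use it, assuming the script is saved as `make_pod.py`: run `python make_pod.py > pod.json`, then `kubectl apply -f pod.json`.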

Pro Tips

  • Leverage spot instances for non-critical inference to cut costs by 50-80% during low-demand periods.
  • Use the Kubernetes Horizontal Pod Autoscaler with a custom or external metric, such as inference queue depth, for dynamic scaling.
  • Pre-warm containers built for the NVIDIA Container Toolkit runtime (for example, by pre-pulling images) to shorten startup times in burst workloads.
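
To make the autoscaling tip concrete, the sketch below reproduces the replica calculation documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric), with no change when the ratio is within a tolerance (0.1 by default). The metric here is a hypothetical average queue depth per pod, and the min/max bounds are illustrative values, not CoreWeave defaults.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 8,
    tolerance: float = 0.1,
) -> int:
    """Replica count the HPA formula would choose for one metric."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: keep the current replica count.
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))


if __name__ == "__main__":
    # 2 pods, each averaging 30 queued requests against a target of 10.
    print(desired_replicas(2, current_metric=30, target_metric=10))
```

Feeding queue depth to a real autoscaler requires a metrics adapter that publishes it as a custom or external metric; that part is outside this sketch.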

Frequently Asked Questions

What is CoreWeave's billing model for NVIDIA L40?

CoreWeave bills per-second for GPU instances including NVIDIA L40. Per-second billing ensures you only pay for exactly the compute time you use, which is particularly cost-effective for short experiments, iterative development, and workloads with variable duration.

Does CoreWeave offer spot instances for NVIDIA L40?

Yes, CoreWeave offers spot/preemptible instances for NVIDIA L40, which can reduce costs by 50-80% compared to on-demand pricing. Spot instances are ideal for fault-tolerant workloads like batch inference, hyperparameter tuning, and training jobs with checkpointing. Note that spot instances can be interrupted when demand is high, so ensure your workflow can handle preemption gracefully.
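
Handling preemption gracefully usually means checkpointing progress and resuming from it. The sketch below shows one way to do that for a batch job in Python. It assumes the platform delivers SIGTERM before stopping the container, which is standard Kubernetes pod termination behavior but is not confirmed here for CoreWeave spot preemption. All names (`run_batch`, `checkpoint.json`, the `every` interval) are illustrative.

```python
import json
import os
import signal
import tempfile
import threading

stop_event = threading.Event()


def handle_sigterm(signum, frame):
    """Record that a shutdown was requested; the loop exits at the next item."""
    stop_event.set()


def save_checkpoint(path, state):
    """Write the checkpoint atomically so a kill mid-write cannot corrupt it."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return saved progress, or a fresh state if no checkpoint exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {"next_item": 0}


def run_batch(path, total_items, process=lambda i: None, every=100):
    """Process items from the last checkpoint; stop early on SIGTERM."""
    state = load_checkpoint(path)
    for i in range(state["next_item"], total_items):
        if stop_event.is_set():
            break
        process(i)
        state["next_item"] = i + 1
        if state["next_item"] % every == 0:
            save_checkpoint(path, state)
    # Always persist final progress, whether finished or interrupted.
    save_checkpoint(path, state)
    return state["next_item"]


if __name__ == "__main__":
    signal.signal(signal.SIGTERM, handle_sigterm)
    done = run_batch("checkpoint.json", total_items=1000)
    print(f"processed up to item {done}")
```

For the checkpoint to survive a preempted instance, the path needs to live on storage that outlasts the pod, such as a persistent volume.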

How can I access NVIDIA L40 instances on CoreWeave?

CoreWeave provides access to NVIDIA L40 instances via SSH, built-in Jupyter notebooks, a web-based terminal, a programmatic API, Kubernetes, and Docker containers. The built-in Jupyter notebook support makes it easy to start experimenting immediately without additional setup. SSH access gives you full control over the instance for custom configurations and production deployments. API access enables automation and integration with your existing ML pipelines and CI/CD workflows.

What compliance certifications does CoreWeave have for NVIDIA L40 workloads?

CoreWeave lists SOC 2, HIPAA, GDPR, and ISO 27001 under compliance, making it suitable for regulated workloads. HIPAA compliance is particularly important for healthcare and medical AI applications. A SOC 2 report demonstrates security controls for handling sensitive data. Contact CoreWeave directly for detailed compliance documentation and a Business Associate Agreement (BAA) if needed.

Can I use NVIDIA L40 with Kubernetes on CoreWeave?

Yes, CoreWeave supports Kubernetes for orchestrating NVIDIA L40 workloads. This enables you to deploy scalable ML pipelines, manage distributed training jobs across multiple GPUs, and integrate with MLOps tools like Kubeflow, Argo Workflows, and KServe. Kubernetes support is essential for teams building production-grade ML infrastructure.

What are the specifications of the NVIDIA L40?

The NVIDIA L40 features 48GB of GDDR6 memory and is built on NVIDIA's Ada Lovelace architecture. As an enterprise-tier GPU, it is designed for AI training and fine-tuning, inference at scale, and graphics workloads such as visualization and rendering, rather than FP64-heavy HPC. The substantial VRAM capacity supports large language models, complex neural networks, and multi-model deployments.
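
As a rough way to judge whether a model fits in 48GB, weight memory can be estimated as parameter count times bytes per parameter. The Python sketch below does this arithmetic. The 20% overhead for KV cache, activations, and runtime buffers is an illustrative assumption that varies widely with batch size and context length, and the estimate uses decimal gigabytes and ignores framework-specific overheads.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "fp8": 1.0, "int8": 1.0, "int4": 0.5}

L40_VRAM_GB = 48  # per-GPU memory from the spec above


def weight_memory_gb(params_billions: float, dtype: str) -> float:
    """Memory for model weights only, in GB (1e9 params x bytes / 1e9)."""
    return params_billions * BYTES_PER_PARAM[dtype]


def fits_on_l40(
    params_billions: float,
    dtype: str,
    overhead_fraction: float = 0.2,  # assumed headroom, not a measured value
    gpus: int = 1,
) -> bool:
    """Rough check: weights plus assumed overhead vs. total VRAM across GPUs."""
    needed = weight_memory_gb(params_billions, dtype) * (1 + overhead_fraction)
    return needed <= L40_VRAM_GB * gpus


if __name__ == "__main__":
    for size, dtype, gpus in [(13, "fp16", 1), (70, "fp16", 1), (70, "int4", 1), (70, "fp16", 8)]:
        verdict = "fits" if fits_on_l40(size, dtype, gpus=gpus) else "does not fit"
        print(f"{size}B {dtype} on {gpus}x L40: {verdict}")
```

Multi-GPU results assume the model can be sharded evenly across GPUs, which depends on the serving framework.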

What workloads is NVIDIA L40 on CoreWeave best suited for?

The NVIDIA L40 on CoreWeave is well suited to LLM inference and fine-tuning, batch inference at scale, AI/ML training, and rendering workloads. CoreWeave is a particularly good fit for sophisticated engineering teams training LLMs at scale and for VFX studios requiring burst rendering capacity. Consider your model size, training data volume, and latency requirements when evaluating this combination for your use case.

Does CoreWeave offer reserved instances for NVIDIA L40?

Yes, CoreWeave offers reserved instance pricing for NVIDIA L40, which can provide significant discounts (typically 20-40% off on-demand rates) for committed usage periods. Reserved instances are ideal for predictable, long-running workloads like production inference services, ongoing training pipelines, or development environments that run continuously. Contact CoreWeave for current reserved pricing and commitment terms.
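
Whether a reservation pays off depends on how much of the committed period the GPUs are actually in use. The sketch below compares the two options in Python under a simplifying assumption: a reservation is billed for every hour of the commitment whether or not it is used, while on-demand is billed only for hours used. Actual CoreWeave terms may differ, and the function names are illustrative.

```python
def break_even_utilization(discount: float) -> float:
    """Fraction of the period you must use before reserved beats on-demand."""
    if not 0 <= discount < 1:
        raise ValueError("discount must be in [0, 1)")
    return 1.0 - discount


def cheaper_option(on_demand_rate: float, discount: float, utilization: float) -> str:
    """Compare average cost per hour of the commitment period."""
    on_demand_cost = on_demand_rate * utilization       # pay only when running
    reserved_cost = on_demand_rate * (1.0 - discount)   # pay for every hour
    return "reserved" if reserved_cost < on_demand_cost else "on-demand"


if __name__ == "__main__":
    rate = 1.25  # $/GPU/hr, the listed L40 on-demand rate
    for discount in (0.2, 0.4):
        threshold = break_even_utilization(discount)
        print(f"{discount:.0%} discount: reserved wins above {threshold:.0%} utilization")
    print(cheaper_option(rate, discount=0.3, utilization=0.5))
    print(cheaper_option(rate, discount=0.3, utilization=0.9))
```

With discounts in the 20-40% range, this model puts the break-even point at 60-80% utilization, which is why reservations suit services that run continuously.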

What unique features does CoreWeave offer for NVIDIA L40?

CoreWeave differentiates itself with: Kubernetes-native architecture; Access to massive-scale InfiniBand clusters. These features may provide advantages depending on your specific workflow requirements and technical needs. Evaluate how these capabilities align with your ML infrastructure goals when making your decision.

How do I get started with NVIDIA L40 on CoreWeave?

To get started with NVIDIA L40 on CoreWeave, visit https://www.coreweave.com?utm_source=gpuperhour&utm_medium=referral to create an account. Signup is generally straightforward, and some providers offer initial credits for new users, so check what is currently available. Once registered, you can typically launch an NVIDIA L40 instance within minutes through the console or API. We recommend starting with a small experiment to familiarize yourself with the platform before scaling up to larger workloads.

Related Pages

Compare L40 Across Providers

The L40 is available from 16 providers on GPUPerHour. CoreWeave charges $1.25/GPU/hr. Here is how other providers compare:

For a full comparison across all providers, see the L40 rental page. See all GPUs on CoreWeave.