Specifications Compared
| Spec | A40 | L40 |
|---|---|---|
| TDP | 300W | 300W |
| VRAM | 48 GB | 48 GB |
| CUDA Cores | 10,752 | 18,176 |
| Memory Type | GDDR6 | GDDR6 |
| Architecture | Ampere | Ada Lovelace |
| Form Factors | PCIe | PCIe |
| Interconnect | NVLink | |
| Tensor Cores | 336 | 568 |
| FP16 Performance | 37.4 TFLOPS | 90.5 TFLOPS |
| FP32 Performance | 37.4 TFLOPS | 90.5 TFLOPS |
| FP64 Performance | 0.6 TFLOPS | |
| INT8 Performance | 299 TOPS | 724 TOPS |
| Memory Bandwidth | 696 GB/s | 864 GB/s |
Performance Analysis
The L40 outperforms the A40 significantly in raw compute capability. It delivers 90.5 TFLOPS in both FP16 and FP32, more than double the A40's 37.4 TFLOPS, enabling faster matrix operations central to deep learning. This delta translates to quicker training times for models using half-precision arithmetic, which is standard for efficiency in modern frameworks.
Memory bandwidth plays a critical role in handling large datasets: the L40's 864 GB/s is 24 percent higher than the A40's 696 GB/s, which helps sustain larger batch sizes before memory transfers become the bottleneck. For inference, the L40's higher FP16 performance reduces latency in real-time applications. Both GPUs carry 48 GB of VRAM and so accommodate similar model sizes, but the L40's Ada Lovelace architecture adds fourth-generation Tensor Cores (the A40's Ampere design uses third-generation ones) for better overall utilization.
In training scenarios, the L40's advantages compound across epochs. For compute-bound workloads, the 2.4x FLOPS ratio suggests completion times could fall to less than half of the A40's, while bandwidth-bound workloads should see gains closer to the 24 percent bandwidth advantage.
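The ratios in this section follow directly from the spec table. The sketch below recomputes them and adds a simple roofline-style "ridge point" (peak FLOPs per byte of memory traffic). It assumes the table figures are peak values and that a workload is either purely compute-bound or purely bandwidth-bound; real jobs land somewhere in between. The `Gpu` class and function names are illustrative and do not come from any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gpu:
    name: str
    fp16_tflops: float    # peak FP16 throughput from the spec table
    bandwidth_gbs: float  # peak memory bandwidth from the spec table


A40 = Gpu("A40", fp16_tflops=37.4, bandwidth_gbs=696.0)
L40 = Gpu("L40", fp16_tflops=90.5, bandwidth_gbs=864.0)


def compute_ratio(new: Gpu, old: Gpu) -> float:
    """Best-case speedup for a purely compute-bound workload."""
    return new.fp16_tflops / old.fp16_tflops


def bandwidth_ratio(new: Gpu, old: Gpu) -> float:
    """Best-case speedup for a purely bandwidth-bound workload."""
    return new.bandwidth_gbs / old.bandwidth_gbs


def ridge_point(gpu: Gpu) -> float:
    """FLOPs per byte a kernel needs before compute, not memory, is the limit."""
    return (gpu.fp16_tflops * 1e12) / (gpu.bandwidth_gbs * 1e9)


print(f"compute-bound speedup:   {compute_ratio(L40, A40):.2f}x")
print(f"bandwidth-bound speedup: {bandwidth_ratio(L40, A40):.2f}x")
for gpu in (A40, L40):
    print(f"{gpu.name} ridge point: {ridge_point(gpu):.1f} FLOPs/byte")
```

Under these assumptions the L40's ridge point is roughly twice the A40's, so kernels with low arithmetic intensity are more likely to see a gain near the bandwidth ratio than near the compute ratio.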
Live Cloud Pricing
Real-time prices from 25+ providers. Updated every 60 seconds.
A40
| Provider | GPU Model | VRAM | Host Specs | Region | Price | Status |
|---|---|---|---|---|---|---|
| TensorDock | NVIDIA RTX A4000 | 16GB | 0 vCPU, 0GB RAM | Tallinn, Harjumaa | $0.08/GPU/hr | Available |
| Vast.ai | 8× NVIDIA RTX A4000 | 16GB | 80 vCPU, 201GB RAM, 1698GB Storage | United Kingdom | $0.15/GPU/hr ($1.17/hr total, 8×) | Available |
| Hyperstack | 4× NVIDIA RTX A4000 | 16GB | 16 vCPU, 86GB RAM, 500GB Storage | Norway | $0.15/GPU/hr ($0.60/hr total, 4×) | Available |
| Hyperstack | NVIDIA RTX A4000 | 16GB | 4 vCPU, 21GB RAM, 100GB Storage | Norway | $0.15/GPU/hr | Available |
| Hyperstack | 2× NVIDIA RTX A4000 | 16GB | 8 vCPU, 43GB RAM, 200GB Storage | Norway | $0.15/GPU/hr ($0.30/hr total, 2×) | Available |
L40
| Provider | GPU Model | VRAM | Host Specs | Region | Price | Status |
|---|---|---|---|---|---|---|
| TensorDock | NVIDIA L40S | 48GB | 0 vCPU, 0GB RAM | Wolverhampton | $0.55/GPU/hr | Available |
| RunPod | NVIDIA L40 | 48GB | 8 vCPU, 94GB RAM | Global | $0.82/GPU/hr | |
| RunPod | NVIDIA L40S | 48GB | 16 vCPU, 94GB RAM | Global | $0.86/GPU/hr | |
| Massed Compute | NVIDIA L40 | 48GB | 14 vCPU, 72GB RAM, 625GB Storage | Iowa | $0.86/GPU/hr | Available |
| Massed Compute | 2× NVIDIA L40 | 48GB | 26 vCPU, 144GB RAM, 1250GB Storage | Iowa | $0.86/GPU/hr ($1.72/hr total, 2×) | Available |
When to Choose the A40
The A40 suits budget-conscious deployments where entry-level pricing matters. Its starting price of $0.24 per hour (across 23 offers) undercuts the L40's $0.67 per hour minimum, which makes it a reasonable fit for prototyping or less demanding inference tasks. Its NVLink interconnect also enables multi-GPU setups for workloads that do not require peak single-GPU performance.
Software already built and tuned for Ampere runs on the A40 without recompilation.
When to Choose the L40
The L40 excels in performance-driven environments that need rapid iteration. With 90.5 TFLOPS FP16 versus the A40's 37.4 TFLOPS, and 864 GB/s of memory bandwidth against 696 GB/s, it accelerates training and inference substantially. Its average price of $0.89 per hour across 14 offers makes it good value for high-throughput production.
The newer Ada Lovelace architecture, whose fourth-generation Tensor Cores add FP8 support, also makes the L40 the stronger choice for future-proofing.
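Whether the higher hourly rate pays off depends on how much sooner the job finishes. The sketch below compares the cost of one fixed job using the starting and average prices quoted on this page. It assumes, optimistically, that runtime scales inversely with peak FP16 throughput; the `job_cost` helper and the 10-hour baseline are hypothetical.

```python
def job_cost(price_per_hour: float, baseline_hours: float, speedup: float = 1.0) -> float:
    """Cost of a job that takes `baseline_hours` on the reference GPU."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return price_per_hour * baseline_hours / speedup


A40_TFLOPS, L40_TFLOPS = 37.4, 90.5
speedup = L40_TFLOPS / A40_TFLOPS  # assumed best case: fully compute-bound job

baseline_hours = 10.0  # hypothetical job length on the A40

# Starting prices quoted on this page
a40_min = job_cost(0.24, baseline_hours)
l40_min = job_cost(0.67, baseline_hours, speedup)

# Average prices quoted on this page
a40_avg = job_cost(1.26, baseline_hours)
l40_avg = job_cost(0.89, baseline_hours, speedup)

print(f"starting prices: A40 ${a40_min:.2f} vs L40 ${l40_min:.2f}")
print(f"average prices:  A40 ${a40_avg:.2f} vs L40 ${l40_avg:.2f}")

# Speedup the L40 needs to match the A40's cost per job at starting prices
print(f"break-even speedup at starting prices: {0.67 / 0.24:.2f}x")
```

With these inputs the A40 remains slightly cheaper per job at starting prices, because the break-even speedup (about 2.8x) exceeds the 2.4x FLOPS ratio, while at average prices the L40 comes out well ahead.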
Use Cases
For large model training, the L40's 90.5 TFLOPS FP16 outperforms the A40's 37.4 TFLOPS, and its higher 864 GB/s bandwidth handles bigger batches efficiently.
For inference, the L40 delivers 90.5 TFLOPS FP16 versus the A40's 37.4 TFLOPS, which lowers latency. The same 48 GB of VRAM supports equivalent model sizes at higher throughput.
For fine-tuning, the Ada Lovelace architecture and roughly 2.4x the FP32 performance (90.5 TFLOPS versus 37.4 TFLOPS) shorten each cycle, and the bandwidth edge helps with parameter updates.
For image generation, the L40's 90.5 TFLOPS FP16 accelerates diffusion models compared to the A40's 37.4 TFLOPS. The 48 GB of VRAM fits high-resolution workflows on both.
For simulations, both offer 48 GB of VRAM and a 300 W TDP. The A40 suffices at its lower $0.24 per hour entry price if the L40's 90.5 TFLOPS FP32 would go unused.
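Because both cards carry 48 GB, the model sizes they can hold are the same. A rough estimate of that ceiling can be sketched as below. The bytes-per-parameter figures are common rules of thumb rather than values from this page (2 bytes for FP16 weights, 1 byte for INT8 weights, about 16 bytes per parameter for mixed-precision training with Adam), and the 20 percent reserve for activations and runtime overhead is an assumption. The function name is hypothetical.

```python
VRAM_BYTES = 48e9  # 48 GB from the spec table, treated as decimal gigabytes (assumption)

# Rule-of-thumb memory per parameter (assumptions, not from the spec table)
BYTES_PER_PARAM = {
    "fp16_inference": 2,                  # FP16 weights only
    "int8_inference": 1,                  # INT8 weights only
    "adam_mixed_precision_training": 16,  # weights + gradients + optimizer state
}


def max_params_billions(
    bytes_per_param: float,
    vram_bytes: float = VRAM_BYTES,
    reserve_fraction: float = 0.2,  # assumed headroom for activations and overhead
) -> float:
    """Approximate largest model, in billions of parameters, that fits in VRAM."""
    if not 0 <= reserve_fraction < 1:
        raise ValueError("reserve_fraction must be in [0, 1)")
    usable = vram_bytes * (1.0 - reserve_fraction)
    return usable / bytes_per_param / 1e9


for workload, nbytes in BYTES_PER_PARAM.items():
    print(f"{workload}: ~{max_params_billions(nbytes):.1f}B parameters")
```

Under these assumptions a single card holds roughly a 19B-parameter model for FP16 inference but only about 2.4B parameters for full training with Adam, which is why the choice between the two GPUs comes down to speed and price rather than capacity.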
Frequently Asked Questions
Which GPU has better performance, A40 or L40?
The L40 provides 90.5 TFLOPS in both FP16 and FP32, about 2.4 times the A40's 37.4 TFLOPS. Memory bandwidth reaches 864 GB/s on the L40 versus 696 GB/s on the A40. These specs make the L40 faster for AI tasks.
Do A40 and L40 have the same VRAM?
Both GPUs feature 48 GB GDDR6 VRAM, supporting identical large model capacities. This equality aids direct comparisons in memory-bound workloads. Differences lie in speed, not size.
What is the pricing comparison for A40 vs L40?
The A40 starts at $0.24 per hour and averages $1.26 per hour across 23 offers. The L40 starts at $0.67 per hour and averages $0.89 per hour across 14 offers. The A40 therefore has the cheaper entry point, while the L40 has the lower average price.
Which has higher memory bandwidth?
L40 achieves 864 GB/s bandwidth, 24 percent above A40's 696 GB/s. This benefits data-intensive operations like large batch training. Both use GDDR6 memory.
Do the A40 and L40 have the same power consumption?
Each has a 300 W TDP, which simplifies cluster power planning. Both use the PCIe form factor. Performance differs despite the equal power draw.
What architectures do they use?
The A40 uses Ampere (2020) with NVLink support. The L40 uses Ada Lovelace (2023). The newer design yields 90.5 TFLOPS versus the A40's 37.4 TFLOPS.
Which is cheaper to rent, the A40 or the L40?
Cloud rental prices for both the A40 and L40 vary by provider, configuration, and availability. This page shows live pricing from 25+ providers updated every 60 seconds. Scroll to the Live Cloud Pricing section to compare current rates.
How much VRAM does the A40 have compared to the L40?
Both the A40 and the L40 have 48 GB of GDDR6 memory.
Can I find A40 and L40 GPUs available to rent right now?
Yes. This page shows real-time availability across 25+ cloud GPU providers. The Live Cloud Pricing section displays only in-stock offers with current pricing.
What is the main difference between the A40 and the L40?
The A40 uses the Ampere architecture (2020) while the L40 uses Ada Lovelace (2023). The L40 delivers 2.4x the FP16 throughput and 1.2x the memory bandwidth of the A40.




