Specifications Compared
| Spec | L40S | RTX 5090 |
|---|---|---|
| TDP | 350W | 575W |
| VRAM | 48 GB | 32 GB |
| CUDA Cores | 18,176 | 21,760 |
| Memory Type | GDDR6 | GDDR7 |
| Architecture | Ada Lovelace | Blackwell |
| Form Factors | PCIe | PCIe |
| Interconnect | PCIe 4.0 | PCIe 5.0 |
| Tensor Cores | 568 | 680 |
| FP8 Performance | 724 TFLOPS | 838 TFLOPS |
| FP16 Performance | 362 TFLOPS | 419 TFLOPS |
| FP32 Performance | 91 TFLOPS | 105 TFLOPS |
| FP64 Performance | 1.4 TFLOPS | 1.6 TFLOPS |
| INT8 Performance | 724 TOPS | 838 TOPS |
| Memory Bandwidth | 864 GB/s | 1,792 GB/s |
Performance Analysis
On paper, the RTX 5090 outperforms the L40S in compute: 419 versus 362 TFLOPS FP16 and 105 versus 91 TFLOPS FP32, which translates to shorter training cycles and lower inference latency in deep learning. FP8 follows the same pattern, with 838 TFLOPS on the RTX 5090 versus 724 TFLOPS on the L40S. That matters for quantized large language models, where reduced precision boosts speed without a proportional loss in accuracy.
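The relative gaps behind these figures can be checked with a few lines of Python. This is a minimal sketch that only restates the spec-sheet numbers from the table; the names `SPECS` and `relative_gain` are illustrative, and real workloads will not necessarily scale with peak figures.

```python
# Spec-sheet figures copied from the comparison table: (L40S, RTX 5090).
SPECS = {
    "FP8 (TFLOPS)": (724, 838),
    "FP16 (TFLOPS)": (362, 419),
    "FP32 (TFLOPS)": (91, 105),
    "Memory bandwidth (GB/s)": (864, 1792),
}


def relative_gain(l40s: float, rtx5090: float) -> float:
    """Return the RTX 5090's advantage over the L40S as a percentage."""
    return (rtx5090 / l40s - 1.0) * 100.0


for name, (l40s, rtx5090) in SPECS.items():
    ratio = rtx5090 / l40s
    print(f"{name}: {ratio:.2f}x ({relative_gain(l40s, rtx5090):+.0f}%)")
```

The compute metrics come out at roughly 15 to 16 percent in the RTX 5090's favor, while memory bandwidth is about 107 percent higher, which is why bandwidth dominates the rest of the comparison.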
Memory bandwidth is the largest gap: 1,792 GB/s on the RTX 5090 versus 864 GB/s on the L40S. The extra bandwidth helps sustain throughput at larger batch sizes and reduces time spent waiting on data movement. Capacity favors the L40S, though: its 48 GB of VRAM versus 32 GB accommodates models that exceed the RTX 5090's capacity, avoiding out-of-memory errors in fine-tuning or inference.
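Whether a model fits in 48 GB or 32 GB can be estimated from its parameter count. The sketch below is a rough rule of thumb, not a sizing tool: it counts weights only, and the 10 percent headroom reserved for activations, KV cache, and framework overhead is an assumption chosen for illustration. The function names are hypothetical.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "fp8": 1.0, "int4": 0.5}

# VRAM capacities from the comparison table, in GB.
VRAM_GB = {"L40S": 48, "RTX 5090": 32}


def weights_gb(params_billions: float, dtype: str) -> float:
    """Approximate weight footprint in GB (1 GB = 1e9 bytes), weights only."""
    return params_billions * BYTES_PER_PARAM[dtype]


def fits(params_billions: float, dtype: str, vram_gb: float,
         headroom: float = 0.9) -> bool:
    """True if the weights fit in `headroom` of VRAM (headroom is an assumed margin)."""
    return weights_gb(params_billions, dtype) <= vram_gb * headroom


for gpu, vram in VRAM_GB.items():
    print(gpu, "fits 20B params in fp16:", fits(20, "fp16", vram))
```

Under these assumptions a 20-billion-parameter model in FP16 (about 40 GB of weights) fits on the L40S but not on the RTX 5090, while the same model quantized to FP8 (about 20 GB) fits on both.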
Power draw reflects a trade-off: the L40S's 350W TDP versus the RTX 5090's 575W implies lower cooling needs and higher density in clusters, while the RTX 5090's newer PCIe 5.0 interface speeds up multi-GPU communication for distributed workloads.
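The density argument can be made concrete by dividing spec-sheet throughput by TDP. This is a sketch under stated assumptions: TDP is treated as the actual draw, host power is ignored, and the 10 kW budget is a made-up figure for illustration.

```python
# FP16 throughput and TDP from the comparison table.
GPUS = {
    "L40S": {"fp16_tflops": 362, "tdp_w": 350},
    "RTX 5090": {"fp16_tflops": 419, "tdp_w": 575},
}


def tflops_per_watt(gpu: str) -> float:
    """Peak FP16 TFLOPS per watt of TDP."""
    spec = GPUS[gpu]
    return spec["fp16_tflops"] / spec["tdp_w"]


def gpus_per_budget(gpu: str, budget_w: int) -> int:
    """How many cards fit in a GPU-only power budget (host power ignored)."""
    return budget_w // GPUS[gpu]["tdp_w"]


BUDGET_W = 10_000  # hypothetical rack budget for GPUs only
for name in GPUS:
    print(f"{name}: {tflops_per_watt(name):.2f} TFLOPS/W, "
          f"{gpus_per_budget(name, BUDGET_W)} cards in {BUDGET_W} W")
```

On these numbers the L40S delivers about 1.03 FP16 TFLOPS per watt against about 0.73 for the RTX 5090, and 28 cards fit in the assumed budget versus 17.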
Live Cloud Pricing
Real-time prices from 25+ providers. Updated every 60 seconds.
L40S
| Provider | GPU Model | VRAM | Host Specs | Region | Price | Status |
|---|---|---|---|---|---|---|
| TensorDock | NVIDIA L40S | 48 GB | 0 vCPU, 0 GB RAM | Wolverhampton | $0.55/GPU/hr | Available |
| RunPod | NVIDIA L40S | 48 GB | 16 vCPU, 94 GB RAM | Global | $0.86/GPU/hr | |
| Massed Compute | NVIDIA L40S | 48 GB | 12 vCPU, 72 GB RAM, 625 GB storage | Iowa | $0.88/GPU/hr | Available |
| Massed Compute | 2× NVIDIA L40S | 48 GB | 24 vCPU, 144 GB RAM, 1250 GB storage | Iowa | $0.88/GPU/hr ($1.76/hr total, 2×) | Available |
| Massed Compute | NVIDIA L40S | 48 GB | 12 vCPU, 72 GB RAM, 625 GB storage | Iowa | $0.88/GPU/hr | Available |
RTX 5090
| Provider | GPU Model | VRAM | Host Specs | Region | Price | Status |
|---|---|---|---|---|---|---|
| TensorDock | NVIDIA GeForce RTX 5090 | 32 GB | 0 vCPU, 0 GB RAM | Chubbuck, Idaho | $0.57/GPU/hr | Available |
| Vast.ai | NVIDIA GeForce RTX 5090 | 32 GB | 384 vCPU, 94 GB RAM, 570 GB storage | Czechia | $0.81/GPU/hr | Available |
| Vast.ai | NVIDIA GeForce RTX 5090 | 32 GB | 8 vCPU, 30 GB RAM, 489 GB storage | South Korea | $0.87/GPU/hr | Available |
| Vast.ai | NVIDIA GeForce RTX 5090 | 32 GB | 16 vCPU, 30 GB RAM, 583 GB storage | South Korea | $0.87/GPU/hr | Available |
| Vast.ai | NVIDIA GeForce RTX 5090 | 32 GB | 16 vCPU, 30 GB RAM, 495 GB storage | South Korea | $0.91/GPU/hr | Available |
When to Choose the L40S
Opt for the L40S in scenarios that demand high VRAM, such as training or running inference on models that exceed 32 GB, where its 48 GB of GDDR6 avoids offloading to system memory. Datacenter deployments benefit from its 350W TDP, which allows denser racks without extra power infrastructure.
As an Ada Lovelace card with a PCIe 4.0 interface, it also suits environments built around PCIe 4.0 hosts, where it integrates without platform upgrades.
When to Choose the RTX 5090
Select the RTX 5090 for bandwidth-sensitive tasks like high-batch training: its 1,792 GB/s is 107 percent more memory bandwidth than the L40S's 864 GB/s, though real-world speedups depend on how memory-bound the workload is. Its Blackwell architecture and 838 TFLOPS FP8 also suit quantized inference, with up to 16 percent more peak throughput.
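For a memory-bound workload such as single-stream LLM token generation, a common back-of-the-envelope ceiling is memory bandwidth divided by the bytes read per token, which is roughly the size of the weights. The sketch below applies that rule of thumb; it ignores KV-cache traffic, batching, and kernel efficiency, and the 26 GB model size (13 billion parameters in FP16) is an assumed example, not a benchmark.

```python
# Memory bandwidth from the comparison table, in GB/s.
BANDWIDTH_GB_S = {"L40S": 864, "RTX 5090": 1792}


def decode_tokens_per_s_upper_bound(bandwidth_gb_s: float,
                                    weights_gb: float) -> float:
    """Rough ceiling on tokens/s when every token reads all weights once."""
    return bandwidth_gb_s / weights_gb


WEIGHTS_GB = 26.0  # assumed example: 13B parameters stored in FP16

for gpu, bw in BANDWIDTH_GB_S.items():
    bound = decode_tokens_per_s_upper_bound(bw, WEIGHTS_GB)
    print(f"{gpu}: at most ~{bound:.0f} tokens/s (bandwidth-bound estimate)")
```

Because the model size cancels out, the ratio between the two ceilings is simply the bandwidth ratio, about 2.07x, which is where the 107 percent figure comes from.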
Cost also drives the choice: at an average of $0.55 per hour versus $1.66 for the L40S, the RTX 5090 yields better performance per dollar, and it is widely available, with 32 cloud offers.
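Performance per dollar can be computed directly from an hourly price and a spec-sheet throughput figure. The sketch below runs the calculation twice, once with the average prices quoted in the text and once with the cheapest listings shown in the Live Cloud Pricing tables. Prices change constantly, so the inputs are snapshots, and the helper name is illustrative.

```python
FP16_TFLOPS = {"L40S": 362, "RTX 5090": 419}

# Hourly prices in USD per GPU, taken from this page as snapshots.
PRICE_SCENARIOS = {
    "quoted averages": {"L40S": 1.66, "RTX 5090": 0.55},
    "cheapest listings": {"L40S": 0.55, "RTX 5090": 0.57},
}


def tflops_per_dollar_hour(tflops: float, price_per_hour: float) -> float:
    """Peak FP16 TFLOPS obtained per dollar of hourly rental cost."""
    return tflops / price_per_hour


for scenario, prices in PRICE_SCENARIOS.items():
    for gpu, price in prices.items():
        value = tflops_per_dollar_hour(FP16_TFLOPS[gpu], price)
        print(f"{scenario} | {gpu}: {value:.0f} TFLOPS per $/hr")
```

With the quoted averages the RTX 5090 leads by a wide margin (about 762 versus 218); with the cheapest listings the gap narrows to about 735 versus 658, so the conclusion is sensitive to which offers are compared.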
Use Cases
For training large models, the L40S's 48 GB of VRAM holds larger parameter counts on a single card than the RTX 5090's 32 GB allows. As a datacenter part with ECC memory, it suits prolonged training sessions.
For inference, the RTX 5090's 838 TFLOPS FP8 and 1,792 GB/s bandwidth enable higher throughput for quantized models. At the $0.55 average hourly rate, deployments scale economically.
For fine-tuning, the L40S's 48 GB of VRAM leaves room for larger context windows and gradients. Its PCIe 4.0 interface fits existing datacenter hosts.
For image generation, the RTX 5090's 419 TFLOPS FP16 accelerates diffusion steps, with 105 TFLOPS FP32 available for post-processing. Roughly double the memory bandwidth also helps when rendering in batches.
For simulation workloads, the L40S fits memory-heavy jobs with its 48 GB, while the RTX 5090 suits FP32-bound tasks at 105 TFLOPS. The choice hinges on dataset size versus speed needs.
Frequently Asked Questions
Which GPU has more VRAM?
The L40S provides 48 GB of GDDR6 VRAM, more than the RTX 5090's 32 GB of GDDR7. This makes the L40S preferable for models that need more than 32 GB.
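How do their prices compare in the cloud?
The RTX 5090 starts at $0.13 per hour, with a $0.55 average across 32 offers, versus the L40S, which starts at $1.65, with a $1.66 average over 3 offers. On those figures the RTX 5090 delivers better value for scalable workloads. Individual listings in the Live Cloud Pricing section can differ from these aggregates.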
What is the FP16 performance difference?
The RTX 5090 achieves 419 TFLOPS FP16, 16 percent above the L40S's 362 TFLOPS. This gap shortens training times in mixed-precision setups.
Which has higher memory bandwidth?
The RTX 5090 roughly doubles bandwidth, at 1,792 GB/s versus the L40S's 864 GB/s (about 2.1x). Higher bandwidth benefits memory-bound work such as data-parallel training with large batches.
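Is the RTX 5090 more power efficient?
No. The RTX 5090 has a 575W TDP versus the L40S's 350W, so it needs more power and cooling per card. On spec-sheet FP16 throughput per watt, the L40S also comes out ahead (about 1.03 versus 0.73 TFLOPS per watt), and its lower draw allows higher density in power-constrained clouds.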
Which architecture is newer?
Blackwell, used in the RTX 5090 (2025), is newer than Ada Lovelace, used in the L40S (2023). The newer design brings gains such as 838 TFLOPS FP8 versus 724 TFLOPS.
Which is cheaper to rent, the L40S or the RTX 5090?
Cloud rental prices for both the L40S and RTX 5090 vary by provider, configuration, and availability. This page shows live pricing from 25+ providers updated every 60 seconds. Scroll to the Live Cloud Pricing section to compare current rates.
How much VRAM does the L40S have compared to the RTX 5090?
The L40S has 48 GB of GDDR6 memory. The RTX 5090 has 32 GB of GDDR7 memory.
Can I find L40S and RTX 5090 GPUs available to rent right now?
Yes. This page shows real-time availability across 25+ cloud GPU providers. The Live Cloud Pricing section displays only in-stock offers with current pricing.
What is the main difference between the L40S and the RTX 5090?
The L40S (2023) uses the Ada Lovelace architecture, while the RTX 5090 (2025) uses Blackwell. The RTX 5090 delivers about 1.2x the FP16 throughput and 2.1x the memory bandwidth of the L40S, while the L40S offers 48 GB of VRAM against 32 GB.



