Specifications Compared
| Spec | L40 | RTX 3090 |
|---|---|---|
| TDP | 300W | 350W |
| VRAM | 48 GB | 24 GB |
| CUDA Cores | 18,176 | 10,496 |
| Memory Type | GDDR6 | GDDR6X |
| Architecture | Ada Lovelace | Ampere |
| Form Factors | PCIe | PCIe |
| Interconnect | None (PCIe only) | NVLink |
| Tensor Cores | 568 | 328 |
| FP16 Performance | 90.5 TFLOPS | 35.6 TFLOPS |
| FP32 Performance | 90.5 TFLOPS | 35.6 TFLOPS |
| INT8 Performance | 724 TOPS | |
| Memory Bandwidth | 864 GB/s | 936 GB/s |
Performance Analysis
The L40's 90.5 TFLOPS FP16 figure exceeds the RTX 3090's 35.6 TFLOPS by 154 percent (about 2.5 times), which speeds up compute-bound deep learning training and inference. The table lists the same 90.5 TFLOPS for FP16 and FP32 on the L40, so switching between the two precisions in a mixed-precision workflow does not change peak throughput. The 48 GB of VRAM on the L40 supports larger batch sizes and models that exceed 24 GB, such as large language models, reducing data-transfer overhead. Memory bandwidth is 864 GB/s for the L40 versus 936 GB/s for the RTX 3090: the roughly 8 percent edge helps the RTX 3090 in bandwidth-bound work such as small-batch inference, but the L40's doubled VRAM capacity wins wherever the working set does not fit in 24 GB. The L40's 300W TDP versus 350W allows higher density in multi-GPU setups, improving overall throughput per rack.
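The percentages in this analysis follow directly from the specification table. As a minimal sketch (the helper name `percent_higher` is made up for illustration), the arithmetic can be reproduced like this:

```python
# Figures copied from the specification table on this page.
L40_FP16_TFLOPS = 90.5
RTX3090_FP16_TFLOPS = 35.6
L40_BANDWIDTH_GBPS = 864
RTX3090_BANDWIDTH_GBPS = 936


def percent_higher(a: float, b: float) -> float:
    """Return how much larger a is than b, as a percentage of b."""
    return (a / b - 1.0) * 100.0


# L40 compute advantage and RTX 3090 bandwidth advantage.
fp16_gain = percent_higher(L40_FP16_TFLOPS, RTX3090_FP16_TFLOPS)
bandwidth_gain = percent_higher(RTX3090_BANDWIDTH_GBPS, L40_BANDWIDTH_GBPS)

print(f"L40 FP16 advantage: {fp16_gain:.0f}%")
print(f"RTX 3090 bandwidth advantage: {bandwidth_gain:.1f}%")
```

These are ratios of peak figures only; measured speedups depend on the workload.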
Live Cloud Pricing
Real-time prices from 25+ providers. Updated every 60 seconds.
L40
| Provider | GPU Model | VRAM | Host Specs | Region | Price | Status |
|---|---|---|---|---|---|---|
| TensorDock | NVIDIA L40S | 48GB | 0 vCPU, 0GB RAM | Wolverhampton | $0.55/GPU/hr | Available |
| RunPod | NVIDIA L40 | 48GB | 8 vCPU, 94GB RAM | Global | $0.82/GPU/hr | |
| RunPod | NVIDIA L40S | 48GB | 16 vCPU, 94GB RAM | Global | $0.86/GPU/hr | |
| Massed Compute | NVIDIA L40 | 48GB | 14 vCPU, 72GB RAM, 625GB Storage | Iowa | $0.86/GPU/hr | Available |
| Massed Compute | 2× NVIDIA L40 | 48GB | 26 vCPU, 144GB RAM, 1250GB Storage | Iowa | $0.86/GPU/hr ($1.72/hr total, 2×) | Available |
RTX 3090
| Provider | GPU Model | VRAM | Host Specs | Region | Price | Status |
|---|---|---|---|---|---|---|
| TensorDock | NVIDIA GeForce RTX 3090 | 24GB | 0 vCPU, 0GB RAM | Wilmington, Delaware | $0.20/GPU/hr | Available |
| TensorDock | NVIDIA GeForce RTX 3090 | 24GB | 0 vCPU, 0GB RAM | Dallas, Texas | $0.21/GPU/hr | Available |
| Vast.ai | 4× NVIDIA GeForce RTX 3090 | 24GB | 32 vCPU, 403GB RAM, 104GB Storage | Iceland | $0.25/GPU/hr ($1.01/hr total, 4×) | Available |
| Vast.ai | 4× NVIDIA GeForce RTX 3090 | 24GB | 32 vCPU, 252GB RAM, 1217GB Storage | Finland | $0.27/GPU/hr ($1.07/hr total, 4×) | Available |
| LeaderGPU | 8× NVIDIA GeForce RTX 3090 | 24GB | 64 vCPU, 384GB RAM, 2000GB Storage | Netherlands | $0.29/GPU/hr ($2.29/hr total, 8×) | Available |
When to Choose the L40
The L40 is the better choice for VRAM-intensive workloads such as training or fine-tuning models that need more than 24 GB. Its 48 GB of GDDR6 and 90.5 TFLOPS FP16 performance allow larger batches and faster iterations than the RTX 3090 can manage within 24 GB. Datacenter users also benefit from its PCIe form factor and 300W TDP, which make dense multi-GPU deployments easier.
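Whether a model crosses the 24 GB line can be estimated from its parameter count. The sketch below is a rough check under stated assumptions: it counts weights only, treats 1 GB as 10^9 bytes, and reserves a hypothetical 20 percent of VRAM for activations and other runtime buffers. The function names and the 7B and 13B example sizes are illustrative, not taken from this page.

```python
def weights_gb(num_params: float, bytes_per_param: int) -> float:
    """Approximate memory for model weights alone, in GB (10**9 bytes)."""
    return num_params * bytes_per_param / 1e9


def fits(num_params: float, bytes_per_param: int, vram_gb: float,
         usable_fraction: float = 0.8) -> bool:
    """Check whether the weights fit in the usable share of VRAM.

    usable_fraction is an assumed budget; the rest is left for
    activations, caches, and framework overhead.
    """
    return weights_gb(num_params, bytes_per_param) <= vram_gb * usable_fraction


# Hypothetical models stored in FP16 (2 bytes per parameter).
for name, params in [("7B", 7e9), ("13B", 13e9)]:
    size = weights_gb(params, 2)
    print(f"{name}: {size:.0f} GB of weights | "
          f"RTX 3090 (24 GB): {fits(params, 2, 24)} | "
          f"L40 (48 GB): {fits(params, 2, 48)}")
```

Training needs considerably more memory than this estimate because gradients and optimizer state are stored alongside the weights, so treat the result as a lower bound.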
When to Choose the RTX 3090
Opt for the RTX 3090 in budget-constrained environments, with pricing from $0.10 per hour. It handles inference and lighter training effectively within 24 GB of VRAM, and its 936 GB/s memory bandwidth keeps bandwidth-bound workloads fed. NVLink supports multi-GPU setups for cost-effective scaling.
Use Cases
The L40's 48 GB of VRAM accommodates model parameters and batches that exceed the RTX 3090's 24 GB limit. Its 90.5 TFLOPS FP16 is about 2.5 times the RTX 3090's 35.6 TFLOPS, which shortens compute-bound training.
The 48 GB of VRAM on the L40 supports high-concurrency inference for bigger models. Its 90.5 TFLOPS FP16 reduces latency in compute-bound inference compared to 35.6 TFLOPS on the RTX 3090.
The L40 handles fine-tuning of models that fit within 48 GB at 90.5 TFLOPS. The RTX 3090's 24 GB of VRAM restricts model and batch sizes.
The RTX 3090's 936 GB/s bandwidth and $0.10 per hour pricing suffice for image generation within 24 GB of VRAM. The L40's advantages are unnecessary for this task.
Both GPUs list FP32 throughput equal to their FP16 figure: the L40 at 90.5 TFLOPS for intensive simulations and the RTX 3090 at 35.6 TFLOPS for lighter loads.
Frequently Asked Questions
What is the VRAM difference between the L40 and the RTX 3090?
The L40 provides 48 GB of GDDR6 VRAM, double the 24 GB of GDDR6X on the RTX 3090. This lets the L40 hold larger models, and batch sizes can grow accordingly in capacity-bound tasks.
How does compute performance compare?
The L40 delivers 90.5 TFLOPS in FP16 and FP32, outperforming the RTX 3090's 35.6 TFLOPS by 154 percent. For compute-bound FP16 workloads that is up to about 2.5 times the peak throughput; actual training time and inference latency improve in proportion only to the extent that the workload is compute-bound.
What are the cloud pricing differences?
L40 offers start at $0.67 per hour and average $0.89 across 14 offers. RTX 3090 offers start at $0.10 per hour and average $0.25 across 5 offers. On average, the RTX 3090 costs about 3.6 times less per hour.
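Hourly price alone does not show which card is cheaper per unit of compute. A short sketch, using only the average prices and FP16 figures quoted on this page (the variable names are illustrative), compares the two on both measures:

```python
# Average hourly prices and FP16 figures quoted on this page.
L40_AVG_PRICE = 0.89       # USD per GPU-hour
RTX3090_AVG_PRICE = 0.25   # USD per GPU-hour
L40_FP16_TFLOPS = 90.5
RTX3090_FP16_TFLOPS = 35.6

# How many times more an L40 hour costs on average.
price_ratio = L40_AVG_PRICE / RTX3090_AVG_PRICE

# Dollars per hour for each peak TFLOP (lower is better).
l40_cost_per_tflop = L40_AVG_PRICE / L40_FP16_TFLOPS
rtx3090_cost_per_tflop = RTX3090_AVG_PRICE / RTX3090_FP16_TFLOPS

print(f"L40 costs {price_ratio:.2f}x the RTX 3090 per GPU-hour")
print(f"L40: ${l40_cost_per_tflop:.4f} per TFLOP-hour")
print(f"RTX 3090: ${rtx3090_cost_per_tflop:.4f} per TFLOP-hour")
```

On these averages the RTX 3090 remains cheaper per peak TFLOP, by roughly 1.4 times, so the L40's premium mainly buys VRAM capacity rather than cheaper compute. Live prices change, so rerun the numbers with current rates.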
Which has higher memory bandwidth?
The RTX 3090 leads with 936 GB/s versus the L40's 864 GB/s. This helps small-batch, bandwidth-bound throughput on the RTX 3090. The L40 compensates with doubled VRAM.
What are the TDP ratings?
The L40 is rated at 300W TDP, lower than the RTX 3090's 350W, which supports denser cloud deployments. Power efficiency also favors the L40: about 0.30 TFLOPS per watt versus about 0.10 for the RTX 3090, by the listed FP16 figures.
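The efficiency comparison can be made concrete by dividing peak throughput by rated power. This is a minimal sketch using the table's FP16 and TDP figures; TDP is a rated ceiling, not measured draw, so the result is an approximation.

```python
# Peak FP16 throughput and TDP from the specification table.
SPECS = {
    "L40": {"fp16_tflops": 90.5, "tdp_watts": 300},
    "RTX 3090": {"fp16_tflops": 35.6, "tdp_watts": 350},
}


def tflops_per_watt(spec: dict) -> float:
    """Peak FP16 TFLOPS divided by rated TDP in watts."""
    return spec["fp16_tflops"] / spec["tdp_watts"]


efficiency = {name: tflops_per_watt(spec) for name, spec in SPECS.items()}
advantage = efficiency["L40"] / efficiency["RTX 3090"]

for name, value in efficiency.items():
    print(f"{name}: {value:.3f} TFLOPS per watt")
print(f"L40 advantage: {advantage:.1f}x")
```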
Which architecture is newer?
The L40 uses Ada Lovelace (2023), which succeeds the Ampere architecture (2020) used by the RTX 3090. Ada Lovelace improves the tensor cores for AI workloads. By the figures above, FP16 throughput is about 2.5 times higher on the L40.
Which is cheaper to rent, the L40 or the RTX 3090?
Cloud rental prices for both the L40 and RTX 3090 vary by provider, configuration, and availability. This page shows live pricing from 25+ providers updated every 60 seconds. Scroll to the Live Cloud Pricing section to compare current rates.
How much VRAM does the L40 have compared to the RTX 3090?
The L40 has 48 GB of GDDR6 memory. The RTX 3090 has 24 GB of GDDR6X memory.
Can I find L40 and RTX 3090 GPUs available to rent right now?
Yes. This page shows real-time availability across 25+ cloud GPU providers. The Live Cloud Pricing section displays only in-stock offers with current pricing.
What is the main difference between the L40 and the RTX 3090?
The L40 uses the Ada Lovelace architecture (2023) while the RTX 3090 uses Ampere (2020). The L40 delivers about 2.5x the FP16 throughput and twice the VRAM of the RTX 3090, while the RTX 3090 has about 1.1x the memory bandwidth (936 GB/s versus 864 GB/s).




