Specifications Compared
| Spec | A100 | L40S |
|---|---|---|
| TDP | 400W (SXM4), 300W (PCIe 80GB) | 350W |
| VRAM | 40-80 GB | 48 GB |
| CUDA Cores | 6,912 | 18,176 |
| Memory Type | HBM2e | GDDR6 |
| Architecture | Ampere | Ada Lovelace |
| Form Factors | SXM4, PCIe | PCIe |
| Interconnect | NVLink, PCIe 4.0 | PCIe 4.0 |
| Tensor Cores | 432 | 568 |
| FP16 Performance | 312 TFLOPS | 362 TFLOPS |
| FP32 Performance | 19.5 TFLOPS | 91 TFLOPS |
| FP64 Performance | 9.7 TFLOPS | 1.4 TFLOPS |
| INT8 Performance | 624 TOPS | 724 TOPS |
| Memory Bandwidth | 2,039 GB/s | 864 GB/s |
Performance Analysis
FP32 performance favors the L40S decisively: it achieves 91 TFLOPS compared to the A100's 19.5 TFLOPS, accelerating single-precision tasks in scientific simulations and traditional ML training. In FP16, relevant for deep learning training, the L40S provides 362 TFLOPS versus 312 TFLOPS on the A100, offering a modest edge for mixed-precision workflows.
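The ratios behind these comparisons can be checked directly against the specification table. The following is a minimal sketch; the `SPECS` dictionary and the `ratio` helper are illustrative names, not part of any vendor tool, and the figures are peak values copied from the table rather than measured throughput.

```python
# Peak figures copied from the specification table above.
SPECS = {
    "A100": {"fp32_tflops": 19.5, "fp16_tflops": 312.0,
             "fp64_tflops": 9.7, "bandwidth_gbs": 2039.0},
    "L40S": {"fp32_tflops": 91.0, "fp16_tflops": 362.0,
             "fp64_tflops": 1.4, "bandwidth_gbs": 864.0},
}


def ratio(metric: str, numerator: str = "L40S", denominator: str = "A100") -> float:
    """Return numerator/denominator for one metric, rounded to two decimals."""
    return round(SPECS[numerator][metric] / SPECS[denominator][metric], 2)


if __name__ == "__main__":
    for metric in SPECS["A100"]:
        print(f"{metric}: L40S/A100 = {ratio(metric)}")
```

On these numbers the L40S leads by a wide margin in FP32 and a narrow one in FP16, while the A100 leads in FP64 and memory bandwidth.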
Memory specifications strongly affect real-world usage. The A100's 2,039 GB/s bandwidth and up to 80 GB of HBM2e let larger models and batch sizes stay resident on a single GPU, which matters once the working set (weights, activations, optimizer state) exceeds 48 GB. The L40S, with 864 GB/s and 48 GB of GDDR6, suits small-to-medium models but may require model parallelism sooner.
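One way to make "memory-bound" concrete is the roofline ridge point: peak compute divided by memory bandwidth gives the arithmetic intensity (FLOPs per byte moved) below which a kernel is limited by bandwidth rather than compute. The sketch below applies that model to the table's peak FP16 and bandwidth figures. The function names are illustrative, and real kernels reach only part of either peak, so treat the results as rough upper bounds.

```python
def ridge_point(peak_tflops: float, bandwidth_gbs: float) -> float:
    """FLOPs per byte at which the compute and memory limits meet."""
    return (peak_tflops * 1e12) / (bandwidth_gbs * 1e9)


def attainable_tflops(intensity: float, peak_tflops: float, bandwidth_gbs: float) -> float:
    """Roofline bound: min(peak compute, intensity * bandwidth), in TFLOPS."""
    memory_bound_tflops = intensity * bandwidth_gbs * 1e9 / 1e12
    return min(peak_tflops, memory_bound_tflops)


if __name__ == "__main__":
    for name, tflops, gbs in [("A100", 312.0, 2039.0), ("L40S", 362.0, 864.0)]:
        print(name, "ridge point:", round(ridge_point(tflops, gbs)), "FLOP/byte")
        print(name, "bound at 100 FLOP/byte:",
              round(attainable_tflops(100, tflops, gbs), 1), "TFLOPS")
```

Under this simplified model, a kernel at 100 FLOPs per byte is bandwidth-limited on both cards, and the A100's higher bandwidth gives it the higher bound despite its lower FP16 peak.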
The L40S adds FP8 support at 724 TFLOPS, which speeds up quantized inference for large language models, where reduced precision cuts latency with little accuracy loss in most cases. Its 350W TDP is below the 400W of the SXM4 A100, though above the 300W of the PCIe A100, so the power-efficiency advantage depends on which A100 variant is being compared.
Live Cloud Pricing
Real-time prices from 25+ providers. Updated every 60 seconds.
A100 PCIe 80GB
| Provider | GPU Model | VRAM | Host Specs | Region | Price | Status |
|---|---|---|---|---|---|---|
| Vast.ai | 2× NVIDIA A100 SXM4 80GB | 80GB | 256 vCPU, 126GB RAM, 5672GB Storage | Slovenia | $0.73/GPU/hr ($1.47/hr total, 2×) | Available |
| Vast.ai | 2× NVIDIA A100 SXM4 80GB | 80GB | 256 vCPU, 126GB RAM, 769GB Storage | Slovenia | $0.73/GPU/hr ($1.47/hr total, 2×) | Available |
| LeaderGPU | 8× NVIDIA A100 PCIe 80GB | 80GB | 64 vCPU, 384GB RAM, 2000GB Storage | Netherlands | $0.90/GPU/hr ($7.20/hr total, 8×) | Available |
| Vast.ai | 2× NVIDIA A100 SXM4 80GB | 80GB | 64 vCPU, 126GB RAM, 1114GB Storage | Czechia | $1.00/GPU/hr ($2.00/hr total, 2×) | Available |
| Denvr | 4× NVIDIA A100 PCIe 80GB | 80GB | 64 vCPU, 512GB RAM, 7600GB Storage | Virginia | $1.15/GPU/hr ($4.60/hr total, 4×) | |
L40S
| Provider | GPU Model | VRAM | Host Specs | Region | Price | Status |
|---|---|---|---|---|---|---|
| TensorDock | NVIDIA L40S | 48GB | 0 vCPU, 0GB RAM | Wolverhampton | $0.55/GPU/hr | Available |
| RunPod | NVIDIA L40S | 48GB | 16 vCPU, 94GB RAM | Global | $0.86/GPU/hr | |
| Massed Compute | 4× NVIDIA L40S | 48GB | 46 vCPU, 288GB RAM, 2500GB Storage | Iowa | $0.88/GPU/hr ($3.52/hr total, 4×) | Available |
| Massed Compute | 2× NVIDIA L40S | 48GB | 24 vCPU, 144GB RAM, 1250GB Storage | Iowa | $0.88/GPU/hr ($1.76/hr total, 2×) | Available |
| Massed Compute | NVIDIA L40S | 48GB | 12 vCPU, 72GB RAM, 625GB Storage | Iowa | $0.88/GPU/hr | Available |
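The per-GPU and total prices in these listings are related by simple multiplication, which also makes it easy to project the cost of a longer run. A minimal sketch follows, using a few rows from the snapshot above. The `Offer` class and `cheapest` helper are hypothetical names introduced for illustration, and live prices will differ from these values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    provider: str
    gpu: str
    gpu_count: int
    price_per_gpu_hr: float  # USD per GPU per hour

    def total_per_hour(self) -> float:
        """Hourly cost of the whole instance."""
        return round(self.gpu_count * self.price_per_gpu_hr, 2)

    def cost(self, hours: float) -> float:
        """Cost of running the whole instance for a number of hours."""
        return round(self.gpu_count * self.price_per_gpu_hr * hours, 2)


# Snapshot rows copied from the listings above (not live data).
OFFERS = [
    Offer("LeaderGPU", "A100 PCIe 80GB", 8, 0.90),
    Offer("Denvr", "A100 PCIe 80GB", 4, 1.15),
    Offer("TensorDock", "L40S", 1, 0.55),
    Offer("Massed Compute", "L40S", 4, 0.88),
]


def cheapest(offers: list[Offer], gpu: str) -> Offer:
    """Lowest per-GPU hourly price among offers for one GPU model."""
    return min((o for o in offers if o.gpu == gpu), key=lambda o: o.price_per_gpu_hr)


if __name__ == "__main__":
    for offer in OFFERS:
        print(offer.provider, offer.gpu, offer.total_per_hour(), "USD/hr,",
              offer.cost(24), "USD/day")
```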
When to Choose the A100 PCIe 80GB
Select the NVIDIA A100 PCIe 80GB for memory-bound workloads like training large-scale LLMs exceeding 48 GB VRAM. Its 80 GB HBM2e capacity and 2039 GB/s bandwidth support massive batch sizes and high-throughput data movement, ideal when NVLink interconnects enable multi-GPU scaling.
This GPU excels in environments prioritizing raw memory over cost, such as research clusters handling petabyte-scale datasets.
When to Choose the L40S
Choose the NVIDIA L40S for inference-heavy or cost-optimized deployments. Its 724 TFLOPS FP8 performance accelerates quantized LLM serving, while 91 TFLOPS FP32 outperforms the A100's 19.5 TFLOPS in graphics and simulation tasks.
With a starting price of $0.40 per hour against the A100's $0.89 per hour, it suits cost-sensitive cloud deployments. Its 350W TDP is below the 400W of the SXM4 A100, though above the 300W of the PCIe A100.
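The starting prices quoted above can be combined with the peak specifications to give a rough price-performance figure. This is a sketch under stated assumptions: it uses the $0.40 and $0.89 starting prices from the text, peak FP16 numbers from the specification table, and an illustrative helper name (`per_dollar`). Peak throughput is not delivered throughput, so the results indicate direction rather than real workload cost.

```python
def per_dollar(quantity: float, price_per_hour: float) -> float:
    """Units of a resource (TFLOPS, GB of VRAM) obtained per USD per hour."""
    if price_per_hour <= 0:
        raise ValueError("price_per_hour must be positive")
    return quantity / price_per_hour


if __name__ == "__main__":
    # (name, peak FP16 TFLOPS, VRAM in GB, starting price in USD/hr)
    cards = [("A100 PCIe 80GB", 312.0, 80.0, 0.89), ("L40S", 362.0, 48.0, 0.40)]
    for name, tflops, vram_gb, price in cards:
        print(name,
              round(per_dollar(tflops, price)), "peak FP16 TFLOPS per $/hr,",
              round(per_dollar(vram_gb, price)), "GB VRAM per $/hr")
```

At these starting prices the L40S comes out ahead on both measures, so the A100's case rests on needing more than 48 GB on one card or on its higher memory bandwidth.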
Use Cases
The A100 PCIe 80GB's 80 GB of HBM2e and 2,039 GB/s bandwidth handle large models and large batches without sharding. The L40S's 48 GB caps the size of model that fits on a single card.
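Whether a model fits without sharding can be estimated from its parameter count and numeric precision. The sketch below counts weights only and adds a flat 20% overhead for activations and caches. That overhead is an assumption chosen for illustration, not a measured value, and training needs considerably more memory for gradients and optimizer state.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "fp8": 1}


def weight_gb(params_billion: float, dtype: str) -> float:
    """Memory for weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * BYTES_PER_PARAM[dtype]


def fits(params_billion: float, dtype: str, vram_gb: float, overhead: float = 0.2) -> bool:
    """True if weights plus a fractional overhead fit in the given VRAM.

    The default overhead of 0.2 is an illustrative assumption.
    """
    return weight_gb(params_billion, dtype) * (1 + overhead) <= vram_gb


if __name__ == "__main__":
    for params, dtype in [(30, "fp16"), (30, "fp8"), (70, "fp16")]:
        print(f"{params}B {dtype}: A100 80GB={fits(params, dtype, 80)}, "
              f"L40S 48GB={fits(params, dtype, 48)}")
```

Under these assumptions a 30B-parameter model in FP16 fits on the 80 GB A100 but not on the L40S, while the same model quantized to FP8 fits on the L40S.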
The L40S's 724 TFLOPS of FP8 speeds up quantized serving for low-latency responses. Its lower average cost of $1.14 per hour, against $2.08 for the A100, supports high-volume deployments.
The A100's 80 GB of VRAM accommodates full model loading during fine-tuning of large LLMs. Its high bandwidth sustains efficient gradient updates.
The L40S's Ada Lovelace architecture and 362 TFLOPS of FP16 suit generative tasks like image synthesis. A starting price of $0.40 per hour makes experimentation affordable.
The L40S's 91 TFLOPS of FP32 surpasses the A100's 19.5 TFLOPS for single-precision simulations. For double-precision HPC the A100 is the stronger choice, at 9.7 TFLOPS of FP64 against 1.4 TFLOPS on the L40S.
Frequently Asked Questions
Which GPU has more VRAM: A100 PCIe 80GB or L40S?
The A100 PCIe 80GB provides 80 GB of HBM2e VRAM, exceeding the L40S's 48 GB of GDDR6. This makes the A100 the better fit for models requiring over 48 GB.
How do cloud prices compare for A100 PCIe 80GB and L40S?
A100 PCIe 80GB starts at $0.89 per hour with an average of $2.08 per hour across 28 offers. L40S begins at $0.40 per hour averaging $1.14 per hour across 22 offers.
What is the FP16 performance difference?
The L40S delivers 362 TFLOPS of FP16, about 16% above the A100's 312 TFLOPS. This gives the L40S a modest edge in compute-bound mixed-precision work.
Does L40S support FP8, and how does it compare?
Yes. The L40S offers 724 TFLOPS of FP8 for quantized inference, while the A100 has no FP8 support. This can speed up LLM serving significantly.
Which has higher memory bandwidth?
The A100 PCIe 80GB achieves 2,039 GB/s, about 2.4 times the L40S's 864 GB/s. Higher bandwidth on the A100 benefits memory-bound training and inference.
What are the TDPs of these GPUs?
The A100 is rated at 400W in its SXM4 form and 300W as the PCIe 80GB card, while the L40S is rated at 350W. The L40S therefore draws less than an SXM4 A100 but more than the PCIe A100.
Which is cheaper to rent, the A100 or the L40S?
Cloud rental prices for both the A100 and L40S vary by provider, configuration, and availability. This page shows live pricing from 25+ providers updated every 60 seconds. Scroll to the Live Cloud Pricing section to compare current rates.
How much VRAM does the A100 have compared to the L40S?
The A100 has 40 to 80 GB of HBM2e memory. The L40S has 48 GB of GDDR6 memory.
Can I find A100 and L40S GPUs available to rent right now?
Yes. This page shows real-time availability across 25+ cloud GPU providers. The Live Cloud Pricing section displays only in-stock offers with current pricing.
What is the main difference between the A100 and the L40S?
The A100 uses the Ampere architecture (2020) while the L40S uses Ada Lovelace (2023). The L40S delivers about 1.2x the FP16 throughput of the A100, while the A100 has about 2.4x the memory bandwidth of the L40S.





