Specifications Compared
| Spec | H200 | MI355X |
|---|---|---|
| TDP | 700W | 750W |
| VRAM | 141 GB | 288 GB |
| CUDA Cores | 16,896 | |
| Memory Type | HBM3e | HBM3e |
| Architecture | Hopper | CDNA 4 |
| Form Factors | SXM, NVL | OAM |
| Interconnect | NVLink, PCIe 5.0, InfiniBand | Infinity Fabric |
| Tensor Cores | 528 | |
| FP8 Performance | 3,958 TFLOPS | 4,600 TFLOPS |
| FP16 Performance | 1,979 TFLOPS | 2,300 TFLOPS |
| FP32 Performance | 67 TFLOPS | 2,300 TFLOPS |
| FP64 Performance | 34 TFLOPS | 72 TFLOPS |
| INT8 Performance | 3,958 TOPS | 4,600 TOPS |
| Memory Bandwidth | 4,800 GB/s | 8,000 GB/s |
Performance Analysis
Compute specifications reveal key trade-offs between the GPUs. On listed peak throughput, the MI355X leads at every precision: 2,300 TFLOPS FP16 versus the H200's 1,979 TFLOPS (about 1.16x), and 4,600 versus 3,958 TFLOPS at FP8. The largest listed gap is FP32, where the MI355X's 2,300 TFLOPS compares with 67 TFLOPS on the H200. That favors training pipelines and scientific simulations that keep their math in higher precision, and it can reduce the precision-conversion overhead of mixed-precision training. The H200 is better matched to inference-heavy FP16 or FP8 workloads, where it reaches up to 3,958 TFLOPS.
Memory configuration has the most practical impact. The MI355X's 8,000 GB/s bandwidth is about 1.67x the H200's 4,800 GB/s, which shortens memory-bound steps in LLM training and cuts iteration times. Its 288 GB of VRAM holds roughly twice the weights of the H200's 141 GB: at 2 bytes per parameter (FP16), that is about 144 billion parameters versus about 70 billion, before counting activations, optimizer state, or KV cache. Larger models, including 1T+ parameter LLMs, still need sharding on either GPU, but the MI355X needs fewer devices and so adds less communication latency.
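To make the capacity comparison concrete, the following minimal sketch estimates how many parameters fit in each GPU's memory when only the weights are counted. The VRAM figures come from the table above. The bytes-per-parameter values, the 1 GB = 1e9 bytes convention, and the helper name `max_params_billions` are assumptions made for illustration.

```python
# Weights-only capacity estimate. VRAM figures are taken from the spec table.
VRAM_GB = {"H200": 141, "MI355X": 288}

# Assumed storage size per parameter for each precision.
BYTES_PER_PARAM = {"FP16": 2, "FP8": 1}


def max_params_billions(vram_gb: float, bytes_per_param: int) -> float:
    """Largest parameter count (in billions) whose weights alone fit in VRAM.

    Assumes 1 GB = 1e9 bytes and ignores activations, optimizer state,
    and KV cache, so real-world limits are lower than this estimate.
    """
    return vram_gb / bytes_per_param


for gpu, vram in VRAM_GB.items():
    for precision, size in BYTES_PER_PARAM.items():
        budget = max_params_billions(vram, size)
        print(f"{gpu} {precision}: ~{budget:.1f}B parameters")
```

Treat these numbers as an upper bound: training typically needs several times the weight footprint for gradients and optimizer state, and inference needs headroom for the KV cache.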
Live Cloud Pricing
Real-time prices from 25+ providers. Updated every 60 seconds.
H200 NVL
| Provider | GPU Model | VRAM | Host Specs | Region | Price | Status |
|---|---|---|---|---|---|---|
| Vultr | NVIDIA GH200 Grace Hopper | 96GB | 72 vCPU, 480GB RAM, 960GB Storage | Atlanta | $1.99/GPU/hr | Available |
| Lambda Labs | NVIDIA GH200 Grace Hopper | 96GB | 64 vCPU, 432GB RAM, 4096GB Storage | Virginia | $2.29/GPU/hr | Available |
| Nebius | NVIDIA H200 SXM | 141GB | 16 vCPU, 200GB RAM | Europe | $2.45/GPU/hr | |
| CoreWeave | 8×NVIDIA H200 SXM | 141GB | 128 vCPU, 0GB RAM, 61440GB Storage | United States | $2.58/GPU/hr ($20.64/hr total, 8×) | |
| Ori | 4×NVIDIA H200 SXM | 141GB | 96 vCPU, 960GB RAM, 12000GB Storage | London | $3.50/GPU/hr ($14.00/hr total, 4×) | Available |
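Per-GPU prices and per-instance totals are easy to confuse when offers bundle several GPUs. The sketch below recomputes both from the rows listed above. The prices are a snapshot copied from this table, the GPU count of 1 for the single-GPU rows is an assumption, and `instance_price` and `summarize` are hypothetical helper names.

```python
from statistics import mean

# (provider, gpus_per_instance, price_per_gpu_hour_usd), copied from the table.
# A GPU count of 1 is assumed where the table shows no multiplier.
OFFERS = [
    ("Vultr", 1, 1.99),
    ("Lambda Labs", 1, 2.29),
    ("Nebius", 1, 2.45),
    ("CoreWeave", 8, 2.58),
    ("Ori", 4, 3.50),
]


def instance_price(gpus: int, per_gpu: float) -> float:
    """Hourly price of a whole instance, rounded to cents."""
    return round(gpus * per_gpu, 2)


def summarize(offers):
    """Minimum, maximum, and mean per-GPU hourly price across offers."""
    prices = [price for _, _, price in offers]
    return {"min": min(prices), "max": max(prices), "mean": round(mean(prices), 2)}


for provider, gpus, per_gpu in OFFERS:
    total = instance_price(gpus, per_gpu)
    print(f"{provider}: ${per_gpu:.2f}/GPU/hr, ${total:.2f}/hr per instance")

print(summarize(OFFERS))
```

Because the page updates its listings continuously, any figures hard-coded this way go stale quickly; the structure of the calculation is the useful part.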
When to Choose the H200 NVL
Opt for the H200 NVL in production environments that need capacity immediately. Among the offers listed above, H200 SXM instances start at $2.45 per GPU-hour and GH200 instances at $1.99, so you can deploy now rather than wait for MI355X availability. Its 700W TDP is lower than the MI355X's 750W, and NVLink plus PCIe 5.0 interconnects support robust multi-GPU setups on current Hopper-optimized software stacks.
When to Choose the MI355X
Select the MI355X for forward-looking deployments handling extreme model sizes. Its 288 GB of HBM3e VRAM can hold models on a single GPU that the H200's 141 GB would force you to shard, which removes the distribution overhead for those models. The higher 8,000 GB/s bandwidth and listed 2,300 TFLOPS FP32 throughput make it the stronger choice for bandwidth-bound training and FP32-intensive tasks.
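One common way to reason about "bandwidth-bound" work is the roofline model: a kernel is limited by memory bandwidth when its arithmetic intensity (FLOPs per byte moved) falls below peak compute divided by peak bandwidth. The sketch below computes that ridge point from the FP16 and bandwidth figures in the spec table. It uses datasheet peaks rather than measured values, and `ridge_point` is a hypothetical helper name.

```python
# (peak FP16 TFLOPS, memory bandwidth in GB/s), from the spec table.
SPECS = {"H200": (1979, 4800), "MI355X": (2300, 8000)}


def ridge_point(tflops: float, gb_per_s: float) -> float:
    """FLOPs per byte a kernel must perform to become compute-bound.

    Kernels with lower arithmetic intensity are limited by memory bandwidth.
    """
    return (tflops * 1e12) / (gb_per_s * 1e9)


for gpu, (tflops, bandwidth) in SPECS.items():
    print(f"{gpu}: ridge point ~{ridge_point(tflops, bandwidth):.1f} FLOP/byte")
```

On these listed figures the MI355X has the lower ridge point, meaning more kernels can reach its compute peak before memory bandwidth becomes the limit.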
Use Cases
The MI355X's 288 GB of VRAM and 8,000 GB/s bandwidth support larger batch sizes and keep more of the model on each GPU, shortening epochs relative to the H200's 141 GB and 4,800 GB/s.
Higher FP8 throughput (4,600 TFLOPS) and 288 GB of VRAM let the MI355X serve larger models at low latency with less need for aggressive quantization, ahead of the H200's 3,958 TFLOPS FP8 and 141 GB.
H200 capacity is rentable today, with listed H200 SXM offers starting at $2.45 per GPU-hour, which suits quick iterations. The MI355X's 2,300 TFLOPS FP16 handles larger datasets, so the choice depends on model size and availability.
The H200's 1,979 TFLOPS FP16 and NVLink interconnect suit multi-GPU image generation pipelines that must run now, since the MI355X has no live cloud offers.
The MI355X's listed 2,300 TFLOPS FP32 matches its FP16 throughput, making it the better fit for precision simulations and far exceeding the H200's 67 TFLOPS FP32.
Frequently Asked Questions
Which GPU has more VRAM?
The MI355X provides 288 GB HBM3e VRAM, doubling the H200 NVL's 141 GB. This capacity allows MI355X to load larger AI models without partitioning.
What is the memory bandwidth difference?
The MI355X offers 8,000 GB/s of memory bandwidth compared with the H200's 4,800 GB/s, about 1.67x. The higher bandwidth speeds up memory-bound work such as large-batch training.
How does FP16 performance compare?
The MI355X achieves 2,300 TFLOPS FP16 versus the H200's 1,979 TFLOPS, an edge of about 16% that helps AI training throughput.
Is there cloud pricing for these GPUs?
Yes for the H200: among the offers listed on this page, H200 SXM instances start at $2.45 per GPU-hour and GH200 instances at $1.99. The MI355X has no live cloud pricing available yet.
Which has higher TDP?
The MI355X has a 750W TDP, slightly higher than the H200's 700W. Each H200 therefore dissipates somewhat less heat, which eases cooling in dense clusters.
What interconnects do they support?
H200 NVL uses NVLink, PCIe 5.0, and InfiniBand for multi-GPU scaling. MI355X relies on Infinity Fabric.
Which is cheaper to rent, the H200 or the MI355X?
Cloud rental prices for both the H200 and MI355X vary by provider, configuration, and availability. This page shows live pricing from 25+ providers updated every 60 seconds. Scroll to the Live Cloud Pricing section to compare current rates.
How much VRAM does the H200 have compared to the MI355X?
The H200 has 141 GB of HBM3e memory. The MI355X has 288 GB of HBM3e memory.
Can I find H200 and MI355X GPUs available to rent right now?
This page shows real-time availability across 25+ cloud GPU providers, and the Live Cloud Pricing section displays only in-stock offers with current pricing. H200 offers are listed there now; the MI355X has no live cloud offers yet.
What is the main difference between the H200 and the MI355X?
The H200 (2024) uses the Hopper architecture, while the MI355X (2025) uses CDNA 4. On listed specifications, the MI355X delivers about 1.2x the FP16 throughput and 1.7x the memory bandwidth of the H200, with roughly twice the VRAM.
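The headline ratios can be reproduced directly from the spec table. This short sketch divides the MI355X figure by the H200 figure for each metric; the values are copied from the table, and `ratio` is a hypothetical helper name.

```python
# (MI355X value, H200 value) for each metric, copied from the spec table.
METRICS = {
    "FP16 TFLOPS": (2300, 1979),
    "Memory bandwidth (GB/s)": (8000, 4800),
    "VRAM (GB)": (288, 141),
}


def ratio(mi355x: float, h200: float) -> float:
    """MI355X-to-H200 ratio, rounded to two decimal places."""
    return round(mi355x / h200, 2)


for name, (mi355x, h200) in METRICS.items():
    print(f"{name}: {ratio(mi355x, h200)}x")
```

These are ratios of listed peak specifications, so they indicate an upper bound on the advantage rather than measured application speedups.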


