Specifications Compared
| Spec | B200 | GH200 |
|---|---|---|
| TDP | 1000W | 900W |
| VRAM | 192 GB | 96 GB |
| CUDA Cores | 18,432 | 16,896 |
| Memory Type | HBM3e | HBM3 |
| Architecture | Blackwell | Hopper |
| Form Factors | SXM, NVL | SXM |
| Interconnect | NVLink, PCIe 6.0, InfiniBand | NVLink-C2C, PCIe 5.0 |
| Tensor Cores | 576 | 528 |
| FP8 Performance | 9,000 TFLOPS | 3,958 TFLOPS |
| FP16 Performance | 4,500 TFLOPS | 1,979 TFLOPS |
| FP32 Performance | 90 TFLOPS | 67 TFLOPS |
| FP64 Performance | 45 TFLOPS | 34 TFLOPS |
| INT8 Performance | 9,000 TOPS | 3,958 TOPS |
| Memory Bandwidth | 8,000 GB/s | 4,000 GB/s |
Performance Analysis
Floating-point performance gaps translate directly to workload speeds. The B200's 4500 TFLOPS FP16 capability more than doubles the GH200's 1979 TFLOPS, speeding up model training where half-precision tensor operations prevail and reducing epochs from days to hours on equivalent datasets. FP32 throughput shows a narrower margin at 90 TFLOPS for B200 versus 67 TFLOPS, adequate for precision-sensitive simulations but secondary in deep learning pipelines.
FP8 performance underscores inference advantages: the B200's 9000 TFLOPS enables quantized model serving at twice the GH200's 3958 TFLOPS pace, critical for real-time applications with high query volumes. Memory differences amplify these effects: 192 GB HBM3e on the B200 supports batch sizes up to double those on the GH200's 96 GB HBM3, minimizing out-of-memory errors in transformer inference. The 8000 GB/s bandwidth versus 4000 GB/s further accelerates data movement, cutting latency in bandwidth-bound scenarios like diffusion models.
Power draw reflects efficiency trade-offs, with the B200 at 1000W TDP exceeding the GH200's 900W, yet delivering superior flops per watt in FP16 and FP8 domains.
Live Cloud Pricing
Real-time prices from 25+ providers. Updated every 60 seconds.
B200 SXM
| Provider | GPU Model | VRAM | Host Specs | Region | Price | Status | Action | |
|---|---|---|---|---|---|---|---|---|
Nebius | NVIDIA B200 SXM 192GB VRAM | 192GB | 20 vCPU 224GB RAM | 🌍Europe | $3.95/GPU/hr | |||
Cirrascale | 8×NVIDIA B200 SXM 192GB VRAM | 192GB | 192 vCPU 2048GB RAM 43923GB Storage | United States | $4.79/GPU/hr $38.32/hr total (8×) | |||
Cirrascale | 8×NVIDIA B200 SXM 192GB VRAM | 192GB | 192 vCPU 2048GB RAM 43923GB Storage | United States | $5.39/GPU/hr $43.12/hr total (8×) | |||
Cirrascale | 8×NVIDIA B200 SXM 192GB VRAM | 192GB | 192 vCPU 2048GB RAM 43923GB Storage | United States | $5.69/GPU/hr $45.52/hr total (8×) | |||
![]() RunPod | NVIDIA B200 SXM 192GB VRAM | 192GB | 28 vCPU 283GB RAM | California | $5.89/GPU/hr |
GH200 Grace Hopper
| Provider | GPU Model | VRAM | Host Specs | Region | Price | Status | Action | |
|---|---|---|---|---|---|---|---|---|
Vultr | NVIDIA GH200 Grace Hopper 96GB VRAM | 96GB | 72 vCPU 480GB RAM 960GB Storage | Atlanta | $1.99/GPU/hr | Available | ||
![]() Lambda Labs | NVIDIA GH200 Grace Hopper 96GB VRAM | 96GB | 64 vCPU 432GB RAM 4096GB Storage | Virginia | $2.29/GPU/hr | Available | ||
![]() Denvr | NVIDIA GH200 Grace Hopper 96GB VRAM | 96GB | 72 vCPU 480GB RAM 7600GB Storage | Virginia | $3.87/GPU/hr | |||
![]() CoreWeave | NVIDIA GH200 Grace Hopper 96GB VRAM | 96GB | 72 vCPU 480GB RAM 7680GB Storage | United States | $6.50/GPU/hr |
When to Choose the B200 SXM
Select the NVIDIA B200 for cutting-edge AI training and inference requiring peak capacity. Its 192 GB HBM3e VRAM accommodates full-parameter loading of models exceeding 100 billion parameters, while 4500 TFLOPS FP16 accelerates convergence. High-volume inference leverages 9000 TFLOPS FP8 for low-latency quantized deployments.
Cloud users benefit from 13 live offers starting at $1.71 per hour, offering flexibility for sustained workloads.
When to Choose the GH200 Grace Hopper
Choose the NVIDIA GH200 Grace Hopper for budget-conscious or CPU-integrated tasks. Average pricing of $3.59 per hour across 4 offers undercuts the B200's $4.60 average, suiting proof-of-concept runs. The 900W TDP simplifies cooling versus 1000W, and NVLink-C2C interconnect optimizes Grace CPU-GPU data transfers in hybrid computing.
Use Cases
The B200's 4500 TFLOPS FP16 and 192 GB VRAM enable training of massive LLMs without partitioning, doubling speed over GH200's 1979 TFLOPS and 96 GB.
B200's 9000 TFLOPS FP8 supports high-throughput quantized inference, far exceeding GH200's 3958 TFLOPS for real-time serving.
192 GB HBM3e and 8000 GB/s bandwidth on B200 handle large batch sizes during fine-tuning, outperforming GH200's 96 GB and 4000 GB/s.
B200's superior FP16 at 4500 TFLOPS accelerates image generation pipelines, with doubled memory bandwidth reducing iteration times versus GH200.
GH200's Grace CPU integration via NVLink-C2C excels in CPU-GPU hybrid simulations, complemented by 67 TFLOPS FP32 at lower 900W TDP.
Frequently Asked Questions
What is the VRAM capacity of NVIDIA B200 versus GH200?▾
The B200 features 192 GB HBM3e VRAM, double the GH200's 96 GB HBM3. This allows the B200 to load larger models entirely on a single GPU.
How do FP16 performance levels compare?▾
B200 delivers 4500 TFLOPS in FP16, more than double the GH200's 1979 TFLOPS. This gap accelerates AI training workloads significantly.
What are the current cloud pricing ranges?▾
B200 SXM starts from $1.71 per hour averaging $4.60 across 13 offers. GH200 Grace Hopper begins at $1.99 per hour averaging $3.59 across 4 offers.
Which has higher memory bandwidth?▾
The B200 provides 8000 GB/s, twice the GH200's 4000 GB/s. Higher bandwidth supports larger batches and reduces data bottlenecks.
What are the TDP ratings?▾
B200 has a 1000W TDP, higher than GH200's 900W. The GH200 offers better power efficiency for constrained environments.
Which GPU supports FP8 better for inference?▾
B200 achieves 9000 TFLOPS in FP8, over double GH200's 3958 TFLOPS. This enhances quantized model inference speeds.
Which is cheaper to rent, the B200 or the GH200?▾
Cloud rental prices for both the B200 and GH200 vary by provider, configuration, and availability. This page shows live pricing from 25+ providers updated every 60 seconds. Scroll to the Live Cloud Pricing section to compare current rates.
How much VRAM does the B200 have compared to the GH200?▾
The B200 has 192 GB of HBM3e memory. The GH200 has 96 GB of HBM3 memory.
Can I find B200 and GH200 GPUs available to rent right now?▾
Yes. This page shows real-time availability across 25+ cloud GPU providers. The Live Cloud Pricing section displays only in-stock offers with current pricing.
What is the main difference between the B200 and the GH200?▾
The B200 uses the Blackwell architecture (2024) while the GH200 uses Hopper (2023). The B200 delivers 2.3x the FP16 throughput and 2.0x the memory bandwidth of the GH200.



