Best Cloud GPU for LLM inference 70B
Minimum 80GB VRAM · Recommended 160GB+ · Runtime: tokens-per-second
Cheapest for LLM inference 70B: NVIDIA A100 80GB PCIe on Salad
$0.890/hr · verify on provider site
Cheapest GPU Options — 9 eligible GPUs
Some links are affiliate links; we may earn a commission at no cost to you.
| Provider | Configuration | Region | Billing | Availability | Price/hr |
|---|---|---|---|---|---|
| Salad (cheapest) | 1x A100 PCIe 80GB | distributed | per-minute | on-demand | $0.890/hr |
| vast | 1x A100 PCIe 80GB | us-east | per-second | on-demand | $1.59/hr |
| RunPod | 1x A100 PCIe 80GB | us-east | per-second | on-demand | $1.64/hr |
| FluidStack | 1x A100 SXM 80GB | us-east | per-minute | on-demand | $1.85/hr |
| DataCrunch | 1x A100 SXM4 80GB | eu-north | per-minute | on-demand | $1.89/hr |
| RunPod | 1x A100 SXM | us-east | per-second | on-demand | $1.89/hr |
| genesis | 1x A100 PCIe 80GB | us-east | per-minute | on-demand | $1.89/hr |
| TensorDock | 1x A100 PCIe 80GB | us-east | per-minute | on-demand | $1.99/hr |
| Hyperstack | 1x A100 SXM4 80GB | uk-london | per-minute | on-demand | $2.06/hr |
| CoreWeave | 1x A100 SXM4 80GB | us-east | per-second | on-demand | $2.21/hr |
| lambda | 1x A100 SXM | us-west-2 | per-minute | on-demand | $2.21/hr |
| Paperspace | 1x A100 SXM | us-east | per-minute | on-demand | $2.30/hr |
| Paperspace | 1x A100 PCIe 80GB | us-east | per-minute | on-demand | $2.30/hr |
| together | 1x A100 SXM 80GB | us-east | per-second | on-demand | $2.49/hr |
| fal | 1x A100 SXM 80GB | us-east | per-millisecond | on-demand | $2.99/hr |
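The Billing column can matter for short jobs, because a partly used billing unit is often charged in full. The sketch below compares the charge for a short job at two of the listed rates. It assumes usage is rounded up to the next billing unit, which is an assumption to check against each provider's terms. The function name `billed_cost` and the 90-second job length are hypothetical.

```python
def billed_cost(price_per_hour: float, duration_ms: int, granularity_ms: int) -> float:
    """Cost of a job when usage is rounded UP to the billing unit (assumed policy).

    All durations are integer milliseconds to avoid floating-point rounding.
    """
    if duration_ms < 0 or granularity_ms <= 0:
        raise ValueError("duration must be >= 0 and granularity must be > 0")
    units = -(-duration_ms // granularity_ms)   # ceiling division
    billed_ms = units * granularity_ms
    return price_per_hour * billed_ms / 3_600_000  # 3,600,000 ms per hour


JOB_MS = 90_000  # hypothetical 90-second job

# Per-minute billing at $0.890/hr: 90 s is billed as 2 full minutes.
per_minute = billed_cost(0.890, JOB_MS, 60_000)

# Per-second billing at $1.59/hr: 90 s is billed as exactly 90 seconds.
per_second = billed_cost(1.59, JOB_MS, 1_000)

print(f"per-minute @ $0.890/hr: ${per_minute:.4f}")
print(f"per-second @ $1.59/hr:  ${per_second:.4f}")
```

For long-running inference servers the rounding effect becomes negligible and the hourly rate dominates.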
GPU Requirements
FAQ
What GPU do I need for LLM inference 70B?
Requires at least 80GB VRAM. Recommended: 160GB+. Ideal: NVIDIA H100 80GB SXM, NVIDIA H100 80GB PCIe, NVIDIA H200 141GB SXM.
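To see roughly where the 80GB and 160GB figures could come from, the sketch below estimates the memory needed for the weights of a 70-billion-parameter model at different precisions. It counts weights only. The 10GB allowance for KV cache and activations is a hypothetical placeholder, not a measured value, and the function names are illustrative.

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    bytes_per_param = bits_per_param / 8
    # billions of parameters * bytes per parameter = gigabytes
    return params_billion * bytes_per_param


def fits_in_vram(params_billion: float, bits_per_param: int,
                 vram_gb: float, overhead_gb: float = 10.0) -> bool:
    """True if weights plus an ASSUMED overhead allowance fit in vram_gb.

    overhead_gb is a hypothetical placeholder for KV cache and activations;
    real usage depends on batch size and context length.
    """
    return weight_memory_gb(params_billion, bits_per_param) + overhead_gb <= vram_gb


for bits in (16, 8, 4):
    gb = weight_memory_gb(70, bits)
    print(f"{bits:>2}-bit weights: {gb:.0f} GB | "
          f"fits 80 GB: {fits_in_vram(70, bits, 80)} | "
          f"fits 160 GB: {fits_in_vram(70, bits, 160)}")
```

Under these assumptions, 16-bit weights alone (140GB) exceed a single 80GB card, so a single-GPU setup implies 8-bit or 4-bit quantization, while 160GB or more leaves room for unquantized 16-bit weights.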
What is the cheapest GPU for LLM inference 70B?
NVIDIA A100 80GB PCIe at $0.890/hr on Salad.
How much does LLM inference 70B cost per hour?
From $0.890/hr. Runtime: tokens-per-second.
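Since runtime is expressed in tokens per second, an hourly price can be converted into a cost per million generated tokens. The sketch below shows the arithmetic. The throughput of 20 tokens per second is a hypothetical figure chosen for illustration, not a benchmark of any listed GPU, and the function name is illustrative.

```python
def cost_per_million_tokens(price_per_hour: float, tokens_per_second: float) -> float:
    """Convert an hourly GPU price into dollars per one million generated tokens."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * 3600
    return price_per_hour / tokens_per_hour * 1_000_000


# Hypothetical throughput; measure your own model and serving stack.
ASSUMED_TOKENS_PER_SECOND = 20.0

for provider, price in [("Salad", 0.890), ("RunPod", 1.64), ("fal", 2.99)]:
    cost = cost_per_million_tokens(price, ASSUMED_TOKENS_PER_SECOND)
    print(f"{provider}: ${cost:.2f} per 1M tokens at {ASSUMED_TOKENS_PER_SECOND:.0f} tok/s")
```

Because the cost scales inversely with throughput, a pricier GPU that generates tokens faster can still come out cheaper per token.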