Best Cloud GPU for LLM inference 70B

Minimum 80GB VRAM · Recommended 160GB+ · Runtime: tokens-per-second

Cheapest for LLM inference 70B: NVIDIA A100 80GB PCIe on Salad

$0.890/hr · verify on provider site

Cheapest GPU Options — 9 eligible GPUs

Provider	Configuration	Region	Billing	Availability	Price/hr
Salad	1x A100 PCIe 80GB	distributed	per-minute	on-demand	$0.890/hr
vast	1x A100 PCIe 80GB	us-east	per-second	on-demand	$1.59/hr
RunPod	1x A100 PCIe 80GB	us-east	per-second	on-demand	$1.64/hr
FluidStack	1x A100 SXM 80GB	us-east	per-minute	on-demand	$1.85/hr
DataCrunch	1x A100 SXM4 80GB	eu-north	per-minute	on-demand	$1.89/hr
RunPod	1x A100 SXM	us-east	per-second	on-demand	$1.89/hr
genesis	1x A100 PCIe 80GB	us-east	per-minute	on-demand	$1.89/hr
TensorDock	1x A100 PCIe 80GB	us-east	per-minute	on-demand	$1.99/hr
Hyperstack	1x A100 SXM4 80GB	uk-london	per-minute	on-demand	$2.06/hr
CoreWeave	1x A100 SXM4 80GB	us-east	per-second	on-demand	$2.21/hr
lambda	1x A100 SXM	us-west-2	per-minute	on-demand	$2.21/hr
Paperspace	1x A100 SXM	us-east	per-minute	on-demand	$2.30/hr
Paperspace	1x A100 PCIe 80GB	us-east	per-minute	on-demand	$2.30/hr
together	1x A100 SXM 80GB	us-east	per-second	on-demand	$2.49/hr
fal	1x A100 SXM 80GB	us-east	per-millisecond	on-demand	$2.99/hr
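
The Billing column matters most for short jobs, where a coarser increment can add paid-for idle time. The sketch below is a rough illustration only: it assumes each provider rounds usage up to the next billing increment, which this page does not confirm, and the `billed_cost` helper and the 90-second job are hypothetical. Prices are the listed hourly rates for two rows of the table.

```python
# Billing increments in milliseconds for the granularities in the Billing column.
INCREMENT_MS = {
    "per-millisecond": 1,
    "per-second": 1_000,
    "per-minute": 60_000,
}

def billed_cost(price_per_hour, run_seconds, billing):
    """Cost of one run, ASSUMING usage is rounded up to the next increment."""
    increment = INCREMENT_MS[billing]
    run_ms = round(run_seconds * 1000)
    # Ceiling division on integers avoids floating-point rounding surprises.
    billed_ms = -(-run_ms // increment) * increment
    return price_per_hour * billed_ms / 3_600_000

# Hypothetical 90-second job on two of the listed offers.
for provider, price, billing in [("Salad", 0.89, "per-minute"),
                                 ("vast", 1.59, "per-second")]:
    print(f"{provider}: ${billed_cost(price, 90, billing):.4f} for a 90-second job")
```

Under that rounding assumption, the per-minute offer is billed for 120 seconds and the per-second offer for exactly 90. For sustained inference workloads the difference is negligible and the hourly rate dominates.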

GPU Requirements

Minimum VRAM	80 GB
Recommended VRAM	160 GB
Ideal GPUs	NVIDIA H100 80GB SXM, NVIDIA H100 80GB PCIe, NVIDIA H200 141GB SXM
Typical Runtime	tokens-per-second
Billing Pattern	sustained
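
One plausible way to read the 80 GB minimum and 160 GB recommendation is through weight memory alone: parameter count times bytes per parameter. The page does not state which precision it assumes, so the precisions below are assumptions, and the estimate ignores KV cache, activations, and framework overhead, all of which add to the real footprint.

```python
def weight_memory_gb(params_billion, bytes_per_param):
    """Memory for model weights only, in GB (1 GB = 1e9 bytes).

    Excludes KV cache, activations, and runtime overhead.
    """
    return params_billion * bytes_per_param

# Assumed precisions; the page does not say which one its figures are based on.
for name, bytes_per_param in [("FP16/BF16", 2.0), ("INT8", 1.0), ("4-bit", 0.5)]:
    gb = weight_memory_gb(70, bytes_per_param)
    print(f"{name}: {gb:.0f} GB of weights, "
          f"under 80 GB: {gb < 80}, under 160 GB: {gb < 160}")
```

By this estimate a 70B model at 16-bit precision needs about 140 GB for weights, which only fits in the recommended 160 GB tier, while 8-bit or 4-bit quantized weights fit on a single 80 GB card with limited room left for the KV cache.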

FAQ

What GPU do I need for LLM inference 70B?

LLM inference on a 70B model requires at least 80GB of VRAM; 160GB or more is recommended. Ideal GPUs: NVIDIA H100 80GB SXM, NVIDIA H100 80GB PCIe, NVIDIA H200 141GB SXM.

What is the cheapest GPU for LLM inference 70B?

NVIDIA A100 80GB PCIe at $0.890/hr on Salad.

How much does LLM inference 70B cost per hour?

From $0.890/hr. Runtime is measured in tokens per second.
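
Because runtime is expressed in tokens per second, an hourly price is easier to compare once converted to a cost per million generated tokens. The sketch below shows the arithmetic. The throughput value is a placeholder assumption, not a benchmark from this page, and `cost_per_million_tokens` is a hypothetical helper name. Substitute a figure measured on your own model and serving stack.

```python
def cost_per_million_tokens(price_per_hour, tokens_per_second):
    """Convert an hourly GPU price into dollars per one million tokens."""
    tokens_per_hour = tokens_per_second * 3600
    return price_per_hour / tokens_per_hour * 1_000_000

# ASSUMPTION: placeholder throughput for illustration, not a measured result.
ASSUMED_TOKENS_PER_SECOND = 20.0

# Listed hourly prices for the cheapest and most expensive rows on this page.
for label, price in [("cheapest listed", 0.89), ("most expensive listed", 2.99)]:
    cost = cost_per_million_tokens(price, ASSUMED_TOKENS_PER_SECOND)
    print(f"{label}: ${cost:.2f} per million tokens")
```

A faster but pricier GPU can come out cheaper per token, so the comparison only holds when each price is paired with throughput measured on that hardware.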

GPU-Specific Pages

NVIDIA H100 80GB SXM for LLM inference 70B

NVIDIA H100 80GB PCIe for LLM inference 70B

NVIDIA H200 141GB SXM for LLM inference 70B

NVIDIA A100 80GB SXM for LLM inference 70B