Replicate GPU Cloud Pricing

from $0.440/hr

Serverless platform for running and hosting machine learning models via API, billed per second of GPU compute with a large community model library.

Current Pricing — 6 configurations

Some links are affiliate links; we may earn a commission at no cost to you.

Provider	Configuration	Region	Billing	Availability	Price/hr
Replicate (cheapest)	Nvidia T4 (16GB)	us-east	per-second	on-demand	$0.440/hr
Replicate	Nvidia RTX 4090 (24GB)	us-east	per-second	on-demand	$1.00/hr
Replicate	Nvidia L40S (48GB)	us-east	per-second	on-demand	$1.95/hr
Replicate	Nvidia A100 (40GB)	us-east	per-second	on-demand	$2.30/hr
Replicate	Nvidia A100 (80GB, SXM)	us-east	per-second	on-demand	$3.24/hr
Replicate	Nvidia H100 (80GB)	us-east	per-second	on-demand	$4.85/hr
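
To compare the listed configurations programmatically, the table above can be held as a small list of records. The sketch below copies the prices from that table and picks the cheapest configuration with at least a given amount of GPU memory. The `GpuConfig` type and the `cheapest_with_memory` helper are illustrative names chosen here, not part of any Replicate API, and the prices should be re-checked against Replicate's site before use.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GpuConfig:
    name: str
    memory_gb: int
    price_per_hour: float  # USD per hour, as listed in the table


# Values copied from the pricing table; they may be out of date.
CONFIGS = [
    GpuConfig("Nvidia T4", 16, 0.440),
    GpuConfig("Nvidia RTX 4090", 24, 1.00),
    GpuConfig("Nvidia L40S", 48, 1.95),
    GpuConfig("Nvidia A100", 40, 2.30),
    GpuConfig("Nvidia A100 SXM", 80, 3.24),
    GpuConfig("Nvidia H100", 80, 4.85),
]


def cheapest_with_memory(min_memory_gb: int) -> Optional[GpuConfig]:
    """Return the lowest-priced config with at least min_memory_gb, or None."""
    candidates = [c for c in CONFIGS if c.memory_gb >= min_memory_gb]
    return min(candidates, key=lambda c: c.price_per_hour, default=None)
```

For example, `cheapest_with_memory(40)` selects the L40S (48GB) at $1.95/hr, because it is listed below the A100 (40GB) at $2.30/hr.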

Provider Details

Founded	2019
Billing	per-second
Regions	us-central
Features	serverless, model-hosting, api-access, auto-scaling
Trust Score	4.3/5
Website	https://replicate.com

FAQ

What GPUs does Replicate offer?

Nvidia H100 (80GB), Nvidia A100 (80GB, SXM), Nvidia A100 (40GB), Nvidia RTX 4090 (24GB), Nvidia L40S (48GB), Nvidia T4 (16GB)

Where are Replicate data centers located?

Replicate operates in: us-central.

How does Replicate bill for GPU usage?

Replicate supports per-second billing.
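
With per-second billing, the cost of a job scales with its runtime in seconds instead of being rounded up to a full hour. A minimal sketch of the arithmetic follows, assuming the charge is the hourly rate divided by 3600 and multiplied by the runtime. Actual invoices may apply rounding, minimums, or setup time that this does not model, and `job_cost` is a hypothetical helper name.

```python
def job_cost(price_per_hour: float, seconds: float) -> float:
    """Estimate the cost in USD of running for `seconds` at an hourly rate.

    Assumes simple pro-rata per-second billing with no minimum charge.
    """
    if price_per_hour < 0 or seconds < 0:
        raise ValueError("price_per_hour and seconds must be non-negative")
    return price_per_hour / 3600 * seconds


# A 90-second run at the listed T4 and H100 hourly rates.
t4_cost = job_cost(0.440, 90)
h100_cost = job_cost(4.85, 90)
print(f"T4: ${t4_cost:.4f}, H100: ${h100_cost:.4f}")
```

Under these assumptions, a 90-second run costs about $0.011 on the T4 and about $0.121 on the H100, compared with $0.440 and $4.85 if each were billed as a full hour.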

Is Replicate reliable for production workloads?

Replicate has a trust score of 4.3/5. Features include: serverless, model-hosting, api-access, auto-scaling.

Last data refresh: April 29, 2026. Verify on Replicate's site.

Related Providers

Serverless inference platform optimized for generative media workloads (images, video, audio) with sub-second cold starts and real-time streaming.

Developer-first serverless GPU platform with a Python-native SDK, per-second billing, and automatic cold-start optimization for ML workloads.

Consumer and data-center GPU cloud with spot and on-demand instances, a large template marketplace, and a serverless inference platform.

Budget-friendly GPU marketplace aggregating data-center hardware from multiple hosts, offering broad GPU variety at competitive hourly rates.

European GPU cloud powered by renewable energy, offering NVIDIA instances at competitive rates with a focus on sustainability.

Peer-to-peer GPU marketplace that aggregates idle hardware from independent hosts, offering some of the lowest per-hour rates available.