Local AI Hosting Calculator

Models

1 selected
Llama
Qwen
Mistral
DeepSeek
Gemma
Phi
MiniMax
Yi
GLM
Command-R
Falcon
InternLM
SmolLM
Granite
StableLM
OLMo

Quantization & engine

vLLM (paged attention)

Workload

Constraints

Requirement

for the selected models + workload
Total VRAM: 0.96 GB
System RAM: 1.05 GB
Storage: 20.9 GB
Est. PSU: 700 W (balanced tier)
Weights: 0.70 GB, KV cache: 0.14 GB, Overhead: 0.13 GB
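The Total VRAM figure appears to be the sum of the three components on the breakdown line. The sketch below reproduces that arithmetic in Python, assuming plain addition; the calculator's actual formula is not shown on this page, and the function name `total_vram_gb` is hypothetical.

```python
# Hypothetical reconstruction: the page does not show how the calculator
# derives Total VRAM, so plain addition of the components is an assumption.
def total_vram_gb(weights_gb: float, kv_cache_gb: float, overhead_gb: float) -> float:
    """Sum the three VRAM components reported in the breakdown."""
    return weights_gb + kv_cache_gb + overhead_gb


# Values copied from the breakdown line above.
total = total_vram_gb(weights_gb=0.70, kv_cache_gb=0.14, overhead_gb=0.13)
print(f"{total:.2f} GB")
```

The displayed components add up to 0.97 GB, while the page reports 0.96 GB. A 0.01 GB gap of this kind is consistent with each component being rounded separately for display.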
Budget
NVIDIA GeForce RTX 3060 12GB
Price: $280
VRAM: 12.0 GB (11.0 GB free)
Throughput: 1.2k tok/s
Est. PSU: 450 W
Fits VRAM, Meets concurrency, In budget
Balanced
NVIDIA GeForce RTX 5070 Ti
Price: $979
VRAM: 16.0 GB (15.0 GB free)
Throughput: 3.1k tok/s
Est. PSU: 700 W
Fits VRAM, Meets concurrency, In budget
Max performance
NVIDIA H200 141GB (SXM/NVL)
Price: $31,000
VRAM: 141 GB (140 GB free)
Throughput: 16.4k tok/s
Est. PSU: 1350 W
Fits VRAM, Meets concurrency, In budget
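The "free" figures on the three recommendation cards match card capacity minus the 0.96 GB Total VRAM requirement. The sketch below shows that check in Python. It is an assumption about how the page derives the numbers, not the calculator's confirmed logic, and the names `free_vram_gb` and `fits_vram` are hypothetical.

```python
# Assumption: free VRAM = card capacity - required VRAM. The page does not
# document this; it is inferred from the displayed numbers.
REQUIRED_VRAM_GB = 0.96  # "Total VRAM" from the Requirement section


def free_vram_gb(capacity_gb: float, required_gb: float = REQUIRED_VRAM_GB) -> float:
    """VRAM left over after loading the selected models and workload."""
    return capacity_gb - required_gb


def fits_vram(capacity_gb: float, required_gb: float = REQUIRED_VRAM_GB) -> bool:
    """True when the card can hold the full requirement."""
    return free_vram_gb(capacity_gb, required_gb) >= 0


# Capacities copied from the three recommendation cards.
cards = {
    "NVIDIA GeForce RTX 3060 12GB": 12.0,
    "NVIDIA GeForce RTX 5070 Ti": 16.0,
    "NVIDIA H200 141GB (SXM/NVL)": 141.0,
}
for name, capacity in cards.items():
    print(f"{name}: {free_vram_gb(capacity):.1f} GB free, fits={fits_vram(capacity)}")
```

With a requirement this small, every listed card passes the VRAM check, so the concurrency and budget constraints are what separate the options.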

Matching hardware

41 options fit the requirement
| Hardware | Type | Tier | VRAM | Units | Throughput | PSU | Price |
|---|---|---|---|---|---|---|---|
| NVIDIA GeForce RTX 3060 12GB | consumer gpu | budget | 12.0 GB | 1 | 1.2k tok/s | 450 W | $280 |
| Apple M4 (10-core GPU) 16GB | apple silicon | budget | 16.0 GB | 1 | 411 tok/s | 22 W | $599 |
| NVIDIA GeForce RTX 3090 | consumer gpu | budget | 24.0 GB | 1 | 3.2k tok/s | 750 W | $750 |
| Apple M4 (10-core GPU) 24GB | apple silicon | budget | 24.0 GB | 1 | 411 tok/s | 22 W | $799 |
| NVIDIA GeForce RTX 4070 Ti SUPER | consumer gpu | balanced | 16.0 GB | 1 | 2.3k tok/s | 650 W | $880 |
| NVIDIA GeForce RTX 3090 Ti | consumer gpu | budget | 24.0 GB | 1 | 3.5k tok/s | 950 W | $900 |
| NVIDIA GeForce RTX 5070 Ti | consumer gpu | balanced | 16.0 GB | 1 | 3.1k tok/s | 700 W | $979 |
| Apple M4 (10-core GPU) 32GB | apple silicon | budget | 32.0 GB | 1 | 411 tok/s | 22 W | $999 |
| NVIDIA GeForce RTX 4080 SUPER | consumer gpu | balanced | 16.0 GB | 1 | 2.5k tok/s | 700 W | $1,100 |
| NVIDIA GeForce RTX 5080 | consumer gpu | balanced | 16.0 GB | 1 | 3.3k tok/s | 800 W | $1,299 |
| Apple M4 Pro (20-core GPU) 24GB | apple silicon | budget | 24.0 GB | 1 | 935 tok/s | 38 W | $1,399 |
| Apple M4 Pro (20-core GPU) 48GB | apple silicon | balanced | 48.0 GB | 1 | 935 tok/s | 38 W | $1,799 |
| Apple M3 Pro (18-core GPU) 18GB | apple silicon | budget | 18.0 GB | 1 | 513 tok/s | 35 W | $1,999 |
| Apple M4 Pro (20-core GPU) 64GB | apple silicon | balanced | 64.0 GB | 1 | 935 tok/s | 38 W | $1,999 |
| NVIDIA GeForce RTX 4090 | consumer gpu | max | 24.0 GB | 1 | 3.5k tok/s | 950 W | $2,100 |
| Apple M3 Pro (18-core GPU) 36GB | apple silicon | budget | 36.0 GB | 1 | 513 tok/s | 35 W | $2,399 |
| NVIDIA GeForce RTX 5090 | consumer gpu | max | 32.0 GB | 1 | 6.1k tok/s | 1150 W | $2,999 |
| Apple M3 Max (30-core GPU) 36GB | apple silicon | balanced | 36.0 GB | 1 | 1.0k tok/s | 45 W | $2,999 |
| Apple M4 Max (32-core GPU) 36GB | apple silicon | balanced | 36.0 GB | 1 | 1.4k tok/s | 50 W | $3,199 |
| Apple M3 Max (40-core GPU) 48GB | apple silicon | balanced | 48.0 GB | 1 | 1.4k tok/s | 56 W | $3,499 |
| 2x RTX 4090 (48GB total) | multi gpu | budget | 48.0 GB | 1 | 6.4k tok/s | 1700 W | $3,600 |
| Apple M4 Max (40-core GPU) 48GB | apple silicon | balanced | 48.0 GB | 1 | 1.9k tok/s | 56 W | $3,699 |
| Apple M3 Max (40-core GPU) 64GB | apple silicon | balanced | 64.0 GB | 1 | 1.4k tok/s | 56 W | $3,699 |
| Apple M4 Max (40-core GPU) 64GB | apple silicon | balanced | 64.0 GB | 1 | 1.9k tok/s | 56 W | $3,999 |
| Apple M3 Ultra (80-core GPU) 96GB | apple silicon | max | 96.0 GB | 1 | 2.8k tok/s | 270 W | $3,999 |
| Apple M3 Max (40-core GPU) 96GB | apple silicon | max | 96.0 GB | 1 | 1.4k tok/s | 56 W | $4,099 |
| NVIDIA RTX A6000 (48GB) | workstation gpu | balanced | 48.0 GB | 1 | 2.6k tok/s | 700 W | $4,200 |
| 2x RTX 5090 (64GB total) | multi gpu | balanced | 64.0 GB | 1 | 11.3k tok/s | 2150 W | $4,400 |
| Apple M3 Max (40-core GPU) 128GB | apple silicon | max | 128 GB | 1 | 1.4k tok/s | 56 W | $4,499 |
| Apple M4 Max (40-core GPU) 128GB | apple silicon | max | 128 GB | 1 | 1.9k tok/s | 56 W | $4,699 |
| Apple M3 Ultra (80-core GPU) 256GB | apple silicon | max | 256 GB | 1 | 2.8k tok/s | 270 W | $5,999 |
| NVIDIA RTX 6000 Ada Generation (48GB) | workstation gpu | balanced | 48.0 GB | 1 | 3.3k tok/s | 700 W | $6,800 |
| 4x RTX 4090 (96GB total) | multi gpu | balanced | 96.0 GB | 1 | 12.2k tok/s | 3250 W | $7,200 |
| NVIDIA A100 40GB (PCIe/SXM) | datacenter gpu | max | 40.0 GB | 1 | 5.3k tok/s | 850 W | $7,500 |
| NVIDIA L40S (48GB) | datacenter gpu | balanced | 48.0 GB | 1 | 3.0k tok/s | 750 W | $8,000 |
| NVIDIA RTX PRO 6000 Blackwell Workstation (96GB) | workstation gpu | max | 96.0 GB | 1 | 6.1k tok/s | 1200 W | $8,500 |
| Apple M3 Ultra (80-core GPU) 512GB | apple silicon | max | 512 GB | 1 | 2.8k tok/s | 270 W | $9,499 |
| NVIDIA A100 80GB (PCIe/SXM) | datacenter gpu | max | 80.0 GB | 1 | 7.0k tok/s | 850 W | $12,000 |
| 2x RTX 6000 Ada (96GB total) | multi gpu | max | 96.0 GB | 1 | 6.1k tok/s | 1200 W | $13,600 |
| NVIDIA H100 80GB (SXM/PCIe) | datacenter gpu | max | 80.0 GB | 1 | 11.5k tok/s | 1350 W | $25,000 |
| NVIDIA H200 141GB (SXM/NVL) | datacenter gpu | max | 141 GB | 1 | 16.4k tok/s | 1350 W | $31,000 |