Best VPS Providers for AnythingLLM in 2026
By Arnas Kazlaus · Software engineer and founder, 15 years shipping code
Tests run personally on rented VPSes · Last updated 2026-04-22
The best VPS for AnythingLLM isn't the one with the biggest CPU spec on paper — it's the one that embeds your documents fast and retrieves them fast. We installed AnythingLLM on 7 VPS providers, uploaded the same test document, ran 10 real RAG queries through OpenAI's gpt-4o-mini, and measured document ingestion time, retrieval latency, and full completion time. Every number comes from runs on 2026-04-22. Hetzner wins ingest, OVHcloud wins RAG latency, DigitalOcean and Contabo tie for last.
hostingselector.com rented one VPS per host, installed the official AnythingLLM docker-compose stack (LanceDB for vectors, native embedding engine, OpenAI for LLM), then ran a structured benchmark: (1) full idle footprint before any uploads, (2) 20 back-to-back GETs against /api/ping for TTFB distribution, (3) admin login + API-key mint + workspace creation, (4) 50 direct chat requests via AnythingLLM's workspace-chat endpoint through gpt-4o-mini, (5) upload of a 5 KB test document with native embedding + association into the workspace — measured end-to-end as time_to_ingest, (6) 10 RAG queries forcing vector retrieval + LLM roundtrip. Ubuntu 24.04 on all hosts. Fresh install per provider. All numbers from runs we did on 2026-04-22. All scripts live in scripts/bench/tools/anythingllm/ in our public repo.
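The p50/p95 figures quoted throughout are plain percentiles over the raw latency samples. A minimal sketch of the math — the nearest-rank method shown here is an illustrative assumption; the exact implementation lives in our repo scripts:

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: sort, then take the value at rank ceil(pct/100 * n)."""
    ordered = sorted(samples)
    rank = math.ceil(pct / 100 * len(ordered))
    return ordered[max(rank - 1, 0)]

# Hypothetical 10-sample RAG latency run, in ms
samples = [1050, 1075, 1080, 1090, 1095, 1100, 1110, 1130, 1200, 1500]
p50 = percentile(samples, 50)  # 1095 — the median-ish middle of the pack
p95 = percentile(samples, 95)  # 1500 — one slow outlier dominates the tail
```

With only 10 RAG samples per host, the p95 is effectively the worst observed request, which is why we also report p50.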
Some links on this page are affiliate links. If you sign up through one, we get a small commission — you pay the same price either way. This doesn’t change who wins below.
- Test window: 2026-04-22
- Region: provider flagship region
Why AnythingLLM?
Why self-host AnythingLLM? If you want a ChatGPT-style interface for your documents — internal wikis, product manuals, customer support articles — without handing those documents to OpenAI's retrieval tier or ChatGPT's custom GPTs, AnythingLLM is the go-to. It ingests PDFs, DOCX, TXT, web pages, YouTube transcripts, and GitHub repos, chunks them into a local vector store (LanceDB by default, or Pinecone/Chroma/Weaviate), and pipes the retrieved chunks to your LLM of choice (OpenAI, Anthropic, Gemini, Ollama, or local models). The full stack — AnythingLLM + SQLite + LanceDB + native embeddings — runs in a single Docker container and uses ~1 GB of RAM idle. For a team of 3+ people paying $20/month each for ChatGPT Plus, a self-hosted AnythingLLM on a $15 Hetzner VPS plus per-token API costs is typically 3-5× cheaper within the first month.
What AnythingLLM needs from a VPS
Before you shop, know what AnythingLLM actually needs. The surprising part: document ingestion is CPU-bound, not disk-bound.
2 vCPU minimum, 4 recommended for ingestion
Native embedding runs on CPU by default. A 5 KB test document takes 1.2s on Hetzner vs 3.0s on Contabo — a 2.5× gap that compounds over hundreds of uploads. For batch-ingesting a big corpus, pick 4 vCPU dedicated.
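To see how the per-document gap compounds, a quick back-of-envelope using our measured per-doc ingest times — the 1,000-document corpus size is a hypothetical:

```python
# Measured ingest time for one 5 KB document, in seconds (from our 2026-04-22 runs)
PER_DOC_S = {"hetzner": 1.189, "contabo": 2.962}

def batch_minutes(seconds_per_doc: float, docs: int) -> float:
    """Serial ingestion time for a corpus, in minutes."""
    return seconds_per_doc * docs / 60

hetzner = batch_minutes(PER_DOC_S["hetzner"], 1000)  # ~19.8 min
contabo = batch_minutes(PER_DOC_S["contabo"], 1000)  # ~49.4 min
```

Half an hour of extra wall-clock time per thousand documents — and real corpora often have larger docs, which widens the gap further.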
2 GB minimum, 4 GB recommended
The AnythingLLM + LanceDB stack uses ~1 GB idle, leaving ~1 GB headroom on a 2 GB VPS for in-flight embedding jobs. 4 GB gives you comfortable room; 8 GB is where you stop worrying about concurrent users.
20 GB SSD minimum, NVMe preferred
The AnythingLLM + vector store base footprint is ~4 GB on disk. Each 5 KB doc adds ~50 KB of embedded vectors, so a 1,000-document corpus adds only ~50 MB — but SQLite/LanceDB metadata grows proportionally. Plan on 30 GB for moderate use, 100 GB+ for a heavy corpus.
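A rough disk-budget sketch from the numbers above — the ~50 KB per-doc vector overhead is what we observed for 5 KB docs; larger documents will cost proportionally more:

```python
BASE_GB = 4.0    # AnythingLLM + vector store baseline footprint
KB_PER_DOC = 50  # embedded vectors per 5 KB document (observed)

def disk_gb(docs: int) -> float:
    """Approximate disk usage in GB for a corpus of small documents."""
    return BASE_GB + docs * KB_PER_DOC / 1024 / 1024

small = disk_gb(1_000)    # ~4.05 GB — vectors are a rounding error
large = disk_gb(100_000)  # ~8.8 GB — still well under a 20 GB disk
```

The takeaway: for document chat, disk size is rarely the constraint — CPU is.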
Ubuntu 22.04 or 24.04 LTS
Debian 12 also works. AnythingLLM's Docker image is a standard Node.js runtime; no special kernel needs.
US-East or EU-West
RAG latency includes a network hop to OpenAI (or your chosen LLM). In our tests OVHcloud Canada routed ~100ms faster to OpenAI than Hetzner US-West. Pick a datacenter close to OpenAI's edge, not close to you.
Must be installable
AnythingLLM ships as a single Docker image (mintplexlabs/anythingllm:latest). LXC-based 'container' VPS hosting often doesn't support this.
Bring your own
AnythingLLM doesn't include LLM access. You supply an API key for OpenAI, Anthropic, Google, or Azure — or run Ollama alongside for local inference. Native embeddings are free and run on CPU; OpenAI embeddings cost $0.02 per 1M tokens if you prefer quality over CPU cost.
Our Top Picks
Hostinger
Fastest per-core CPU, 2nd-best disk, friendly UI
$24.49/mo
OVHcloud
Fastest RAG, DDoS protection, painful price
$78.68/mo
Hetzner
Fastest document ingestion, cheapest plan
$15.00/mo
At-a-Glance Comparison
Every provider side-by-side. Lower is better for setup time; higher is better for everything else.
| Host | Price/mo | CPU (events/sec) | Disk IOPS | Setup time | Bandwidth |
|---|---|---|---|---|---|
| Hetzner Fastest document ingestion, cheapest plan | $15.00/mo | 3,627 | 5,718 | 37 s | 20 TB EU / 1 TB US |
| Hostinger Fastest per-core CPU, 2nd-best disk, friendly UI | $24.49/mo | 4,241 | 10,675 | 53 s | 8 TB, throttle only |
| Vultr Fast CPU, slow install, mid-pack RAG | $48.00/mo | 4,143 | 9,098 | 128 s | 6 TB, $0.01/GB over |
| OVHcloud Fastest RAG, DDoS protection, painful price | $78.68/mo | 4,163 | 7,318 | 61 s | Unmetered |
| Contabo Cheapest, slowest, shared-CPU tax visible everywhere | $11.21/mo | 1,491 | 2,527 | 135 s | Unmetered* |
| DigitalOcean Best dashboard, slowest ingest, worst value — pattern holds | $48.00/mo | 817 | 3,169 | 82 s | 5 TB, $0.01/GB over |
| Kamatera Best disk IOPS, fastest chat, mid-pack on ingest | $39.00/mo | 2,699 | 14,704 | 71 s | 5 TB, $0.01/GB over |
* Contabo says “unmetered” but can throttle heavy users at their discretion. See full review.
Cost vs performance at a glance
Upper-left is best (cheap and fast). Lower-right is worst (expensive and slow).
The cost that isn’t on the sticker: bandwidth
AnythingLLM's bandwidth profile is modest — chat JSON + small document uploads. Even at 30 TB/month (a team of dozens with heavy file attachments), Hetzner Europe costs about $27 total while DigitalOcean costs about $298. Bandwidth math is tool-agnostic; these numbers match our other editions.
Scenario: 4-core VPS at each host + 30 TB/month outbound traffic (outbound is what providers meter; inbound is usually free)
| Host | Base price | Traffic included | Over-quota policy | Total / mo |
|---|---|---|---|---|
| Contabo | $11.21 | unlimited | none | $11 |
| Hostinger | $24.49 | 8 TB | no public overage rate | $24 |
| Hetzner | $15.00 | 20 TB (EU) | €1/TB billed over quota | $27 |
| OVHcloud | $78.68 | unmetered | none | $79 |
| Vultr | $48.00 | 6 TB (HP AMD) | $0.01/GB over quota | $288 |
| Kamatera | $39.00 | 5 TB | $0.01/GB over quota | $289 |
| DigitalOcean | $48.00 | 5 TB | $0.01/GB over quota | $298 |
27× difference between cheapest and priciest for the same traffic. Bandwidth policy is the biggest hidden variable in VPS pricing.
Prices current as of April 2026. For typical AnythingLLM usage (chat JSON, small doc uploads), you'll be well under 1 TB/month and bandwidth doesn't matter. If you're serving large files or many embedded docs to many concurrent users, the overage math starts to bite — DigitalOcean's $0.01/GB after 5 TB means a 10 TB month costs an extra $50.
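The overage column above is simple arithmetic: base price plus the per-GB rate on anything beyond the included quota. A sketch that reproduces the table's DigitalOcean and Vultr rows (quotas and rates from the table; Hetzner's €1/TB billing isn't modeled here):

```python
def monthly_cost(base_usd: float, included_tb: float,
                 overage_per_gb: float, traffic_tb: float) -> float:
    """Base price plus per-GB overage beyond the included bandwidth quota."""
    over_gb = max(traffic_tb - included_tb, 0) * 1000
    return base_usd + over_gb * overage_per_gb

do = monthly_cost(48.00, 5, 0.01, 30)     # 48 + 25,000 GB * $0.01 = $298
vultr = monthly_cost(48.00, 6, 0.01, 30)  # 48 + 24,000 GB * $0.01 = $288
```

Run it with your own expected traffic before committing — at 1 TB/month every host in this table costs its sticker price.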
Our picks
3 hosts we'd actually recommend — each wins on a specific axis.
Hetzner
Fastest document ingestion, cheapest plan
Min specs
$5.00/mo
Recommended
$15.00/mo
Benchmark measurements
- CPU (events/sec)
- 3627
- Disk IOPS
- 5718
- Disk throughput
- 22.3 MB/s
- Network
- 1470 Mbps
- Tool setup
- 37 s
Setup & Ease of Use
- ✗1-Click Install Available
- ✗Docker Pre-Configured
- ✓Setup Under 10 Minutes
- ✓AnythingLLM-Specific Docs
- ✓Intuitive Control Panel
Performance
- ✓Strong CPU Benchmark
- ✓8GB RAM at Base Tier
- ✓NVMe SSD Storage
- ✓Low API Latency
- ✓99.9%+ Uptime
Pricing & Value
- ✓Affordable at Min Specs
- ✓Affordable at Rec. Specs
- ✓No Hidden Fees
- ✓Hourly Billing Available
- —Free Trial or Money-Back
Security
- ✓Easy Firewall Config
- ✓DDoS Protection Included
- ✓SSH Key Authentication
- ✓2FA on Hosting Panel
- ✓Automatic Backups
- —Tunnel/Reverse Proxy Guide
Support
- ✓24/7 Availability
- —Fast Response Time
- —AnythingLLM-Specific Knowledge
- ✓Community Forums
Pros
- Fastest document ingestion we measured: 1,189 ms for a 5 KB test doc — embedding is CPU-bound with AnythingLLM's native engine, and Hetzner's dedicated AMD cores win
- Fastest AnythingLLM install too: 37 seconds end-to-end
- Cheapest 4 vCPU / 8 GB plan at $15/month (EU) / $17.99/month (US)
- Chat total p50 of 999 ms — fastest roundtrip through AnythingLLM to OpenAI in the group
- Tight UI TTFB: 1 ms p50, 2 ms p95
- 20 TB of free outbound traffic in EU regions (1 TB in US — see cost math)
Cons
- US (Hillsboro) region has longer network hop to OpenAI than OVHcloud Canada — our RAG p50 was 1,203 ms vs OVH's 1,100 ms. Small but real.
- Hetzner Ashburn was out of capacity during this campaign — US-East fills up. Pick a secondary region if you want consistent availability.
- Backups are opt-in, not on by default
Visit Hetzner
Best value for AnythingLLM. Fastest document ingestion of the 7 providers we tested (1,189 ms vs DigitalOcean's 2,670 — the 2.2× gap comes from dedicated vs shared CPU). Fastest install. Half the price of Vultr or DigitalOcean. If you plan to ingest hundreds of documents, Hetzner's CPU advantage compounds on every single upload.
Hostinger
Fastest per-core CPU, 2nd-best disk, friendly UI
Min specs
$9.99/mo
Recommended
$24.49/mo
Benchmark measurements
- CPU (events/sec)
- 4241
- Disk IOPS
- 10675
- Disk throughput
- 41.7 MB/s
- Network
- 148 Mbps
- Tool setup
- 53 s
Setup & Ease of Use
- ✗1-Click Install Available
- ✗Docker Pre-Configured
- ✓Setup Under 10 Minutes
- ✓AnythingLLM-Specific Docs
- ✓Intuitive Control Panel
Performance
- ✓Strong CPU Benchmark
- ✓8GB RAM at Base Tier
- ✓NVMe SSD Storage
- ✓Low API Latency
- ✓99.9%+ Uptime
Pricing & Value
- ✓Affordable at Min Specs
- ✓Affordable at Rec. Specs
- ✗No Hidden Fees
- ✗Hourly Billing Available
- ✓Free Trial or Money-Back
Security
- ✓Easy Firewall Config
- —DDoS Protection Included
- ✓SSH Key Authentication
- ✓2FA on Hosting Panel
- ✓Automatic Backups
- —Tunnel/Reverse Proxy Guide
Support
- ✓24/7 Availability
- —Fast Response Time
- —AnythingLLM-Specific Knowledge
- ✓Community Forums
Pros
- Highest per-core CPU we measured: 4,241 events/sec on just 2 vCPU — beats Vultr's 4-vCPU plan per-core
- Second-best disk IOPS: 10,675 (behind only Kamatera's 14,704)
- Third-fastest document ingestion: 1,516 ms — only Hetzner and OVHcloud beat it, and both have dedicated CPUs
- Fast AnythingLLM install: 53 seconds
- Beginner-friendly hPanel UI with good documentation
- 8 TB of bundled bandwidth — no overage surprise at scale
Cons
- KVM 2 is locked at 2 vCPU — if you plan to ingest big document libraries in parallel, the 4-vCPU plans (Hetzner, Vultr) will finish faster per batch
- Advertised price is $9.99/mo. Pay monthly: 1.5× that ($14.99). Renew after the promo ends: 2.5× that ($24.49/mo).
- Marketing pushes 2-year plans hard — the advertised price assumes that commitment
- RAG p95 of 2,528 ms is looser than OVH (1,517) or Hetzner (1,664) — the 2-vCPU plan handles single requests well, but tail latency under concurrent load suffers
Visit Hostinger
Best for beginners self-hosting AnythingLLM. Solid across every metric, easy dashboard, no weird gotchas. Per-core CPU is the highest of the 7, so single-user chat + RAG feel snappy. The 2 vCPU ceiling matters only if you're doing batch document ingestion for a team; for solo use it's a non-issue.
OVHcloud
Fastest RAG, DDoS protection, painful price
Min specs
$11.00/mo
Recommended
$78.68/mo
Benchmark measurements
- CPU (events/sec)
- 4163
- Disk IOPS
- 7318
- Disk throughput
- 28.6 MB/s
- Network
- 1 Mbps
- Tool setup
- 61 s
Setup & Ease of Use
- ✗1-Click Install Available
- ✗Docker Pre-Configured
- ✓Setup Under 10 Minutes
- ✓AnythingLLM-Specific Docs
- ✗Intuitive Control Panel
Performance
- ✓Strong CPU Benchmark
- ✓8GB RAM at Base Tier
- ✓NVMe SSD Storage
- ✓Low API Latency
- ✓99.9%+ Uptime
Pricing & Value
- ✓Affordable at Min Specs
- ✓Affordable at Rec. Specs
- ✓No Hidden Fees
- ✓Hourly Billing Available
- —Free Trial or Money-Back
Security
- ✓Easy Firewall Config
- ✓DDoS Protection Included
- ✓SSH Key Authentication
- ✓2FA on Hosting Panel
- ✓Automatic Backups
- —Tunnel/Reverse Proxy Guide
Support
- ✓24/7 Availability
- —Fast Response Time
- —AnythingLLM-Specific Knowledge
- ✓Community Forums
Pros
- Fastest RAG p50 in the group: 1,100 ms — OVH BHS5 has excellent network routing to OpenAI's east-coast edge, and it shows up on every retrieval query
- Tightest RAG p95: 1,517 ms — no outliers
- Second-fastest document ingestion: 1,371 ms (narrowly behind Hetzner's 1,189)
- Highest CPU we measured among 4-vCPU providers: 4,163 events/sec
- Second-best UI TTFB: 1 ms p50, 2 ms p95 (tied with Hostinger)
- DDoS protection and unmetered bandwidth included
- Strong European data sovereignty story
Cons
- Most expensive plan in our group at $78.68/month — 5.25× Hetzner's EU price
- Default login is the `ubuntu` user, not root — you'll need `sudo` for docker commands or enable root SSH first
- Your OVH account is locked to one region entity (EU, CA, or US) — you can't mix
Visit OVHcloud
Best RAG experience of the 7 providers. If you're running AnythingLLM for production document chat where latency matters — customer support bot, internal knowledge base — OVH's 1,100 ms RAG p50 with 1,517 ms p95 is worth the premium. For solo hobbyist use, Hetzner delivers 95% of the experience at 20% of the cost.
Close calls
3 tested but not our top picks — each has a real edge for a specific use case.
Vultr
Fast CPU, slow install, mid-pack RAG
Min specs
$6.00/mo
Recommended
$48.00/mo
Benchmark measurements
- CPU (events/sec)
- 4143
- Disk IOPS
- 9098
- Disk throughput
- 35.5 MB/s
- Network
- 1 Mbps
- Tool setup
- 128 s
Setup & Ease of Use
- ✗1-Click Install Available
- ✗Docker Pre-Configured
- ✓Setup Under 10 Minutes
- ✓AnythingLLM-Specific Docs
- ✓Intuitive Control Panel
Performance
- ✓Strong CPU Benchmark
- ✓8GB RAM at Base Tier
- ✓NVMe SSD Storage
- ✓Low API Latency
- ✓99.9%+ Uptime
Pricing & Value
- ✓Affordable at Min Specs
- ✓Affordable at Rec. Specs
- ✓No Hidden Fees
- ✓Hourly Billing Available
- —Free Trial or Money-Back
Security
- ✓Easy Firewall Config
- ✓DDoS Protection Included
- ✓SSH Key Authentication
- ✓2FA on Hosting Panel
- ✓Automatic Backups
- —Tunnel/Reverse Proxy Guide
Support
- ✓24/7 Availability
- —Fast Response Time
- —AnythingLLM-Specific Knowledge
- ✓Community Forums
Pros
- Third-highest CPU overall: 4,143 events/sec — just behind OVHcloud's 4,163 among the 4-vCPU plans
- Strong disk: 9,098 IOPS (4th overall)
- Chat and RAG latencies mid-pack — 1,228 ms / 1,360 ms p50
- Hourly billing, 30+ global regions
Cons
- Second-slowest install in the group at 128 seconds — 3.5× Hetzner's 37 s; only Contabo is slower
- 3× more expensive than Hetzner ($48/mo vs $15/mo EU) for worse ingest performance
- Document ingestion (2,094 ms) is 76% slower than Hetzner's 1,189 ms despite comparable CPU benchmarks — native embedding is local and CPU-bound, so the network hop to OpenAI isn't the culprit; we didn't isolate the cause, but sysbench clearly isn't capturing whatever bottlenecks AnythingLLM's embedding pipeline on Vultr
- Only 6 TB of bandwidth included; $0.01/GB over that
Visit Vultr
Hardware is there but the real-world numbers aren't. AnythingLLM users who care about ingestion speed get more from Hetzner at a third the price. Vultr's main case — 'fast CPU when you need it' — is already delivered by Hetzner Ashburn at $15/month. Pick Vultr only if you specifically need a region Hetzner doesn't have.
Contabo
Cheapest, slowest, shared-CPU tax visible everywhere
Min specs
$5.99/mo
Recommended
$11.21/mo
Benchmark measurements
- CPU (events/sec)
- 1491
- Disk IOPS
- 2527
- Disk throughput
- 9.9 MB/s
- Network
- 135 Mbps
- Tool setup
- 135 s
Setup & Ease of Use
- ✗1-Click Install Available
- ✗Docker Pre-Configured
- ✓Setup Under 10 Minutes
- ✓AnythingLLM-Specific Docs
- ✗Intuitive Control Panel
Performance
- ✗Strong CPU Benchmark
- ✓8GB RAM at Base Tier
- ✗NVMe SSD Storage
- ✓Low API Latency
- ✓99.9%+ Uptime
Pricing & Value
- ✓Affordable at Min Specs
- ✓Affordable at Rec. Specs
- ✓No Hidden Fees
- ✗Hourly Billing Available
- —Free Trial or Money-Back
Security
- ✓Easy Firewall Config
- —DDoS Protection Included
- ✓SSH Key Authentication
- ✓2FA on Hosting Panel
- ✓Automatic Backups
- —Tunnel/Reverse Proxy Guide
Support
- ✓24/7 Availability
- —Fast Response Time
- —AnythingLLM-Specific Knowledge
- ✓Community Forums
Pros
- Cheapest plan in the group at $11.21/month
- Unlimited bandwidth
Cons
- Slowest document ingestion: 2,962 ms — 2.5× Hetzner's 1,189 ms. Native embedding is CPU-bound and Contabo's shared cores lose every time.
- Slowest AnythingLLM install: 135 seconds — 3.6× Hetzner's 37 s
- Second-lowest CPU in the group: 1,491 events/sec
- Lowest disk IOPS in the group: 2,527 — behind even DigitalOcean's 3,169
- Loosest RAG p95: 3,074 ms — 2× wider than OVHcloud's. Variance from shared-CPU interference shows up on every request.
- No hourly billing — minimum one-month commitment
- Cancellation runs to end of paid term — you keep paying for the month even if you stop using the server on day 2
Visit Contabo
Cheapest AnythingLLM VPS on paper, slowest in practice. If you're ingesting many documents, Contabo's CPU tax — roughly 2.5× slower embedding than Hetzner — makes it functionally more expensive than the $15 Hetzner once you factor in your time. Fine for a low-volume personal workspace; frustrating for anything heavier.
Kamatera
Best disk IOPS, fastest chat, mid-pack on ingest
Min specs
$4.00/mo
Recommended
$39.00/mo
Benchmark measurements
- CPU (events/sec)
- 2699
- Disk IOPS
- 14704
- Disk throughput
- 57.4 MB/s
- Network
- 437 Mbps
- Tool setup
- 71 s
Setup & Ease of Use
- ✗1-Click Install Available
- ✗Docker Pre-Configured
- ✓Setup Under 10 Minutes
- ✓AnythingLLM-Specific Docs
- ✓Intuitive Control Panel
Performance
- ✓Strong CPU Benchmark
- ✓8GB RAM at Base Tier
- ✓NVMe SSD Storage
- ✓Low API Latency
- ✓99.9%+ Uptime
Pricing & Value
- ✓Affordable at Min Specs
- ✓Affordable at Rec. Specs
- ✓No Hidden Fees
- ✓Hourly Billing Available
- —Free Trial or Money-Back
Security
- ✓Easy Firewall Config
- —DDoS Protection Included
- ✓SSH Key Authentication
- ✓2FA on Hosting Panel
- ✓Automatic Backups
- —Tunnel/Reverse Proxy Guide
Support
- ✓24/7 Availability
- —Fast Response Time
- —AnythingLLM-Specific Knowledge
- ✓Community Forums
Pros
- Fastest disk we measured: 14,704 IOPS — 1.4× Hostinger, 2.6× Hetzner, 4.6× DigitalOcean
- Chat total p50 of 1,031 ms — second-fastest AnythingLLM chat roundtrip behind only Hetzner (999 ms)
- Flexible spec picker — dial in exact CPU+RAM+disk without being locked to fixed SKUs
- Hourly billing, global datacenter choice
Cons
- $39/month (4A tier) — about 2.6× Hetzner's EU price
- Document ingestion at 2,642 ms is surprisingly slow given the disk advantage — embedding is CPU-bound and Kamatera's CPU (2,699 events/sec) is mid-pack, not top
- No AnythingLLM one-click — manual install via SSH
- RAG p95 of 1,862 ms is looser than OVH's 1,517 — the disk speed doesn't help when the bottleneck is network + LLM
Visit Kamatera
Best AnythingLLM host if your workload is disk-heavy — big vector stores, many embedded documents, fast retrieval across a large corpus. But if your corpus is small (under 100 docs) and your throughput is modest, the disk advantage is invisible and you're paying a premium over Hetzner for a metric that doesn't move. For a RAG-heavy production deployment with 10k+ chunks, Kamatera is the clear technical pick.
Skip this one
Tested and didn't earn a recommendation at this price tier.
DigitalOcean
Best dashboard, slowest ingest, worst value — pattern holds
Min specs
$12.00/mo
Recommended
$48.00/mo
Benchmark measurements
- CPU (events/sec)
- 817
- Disk IOPS
- 3169
- Disk throughput
- 12.4 MB/s
- Network
- 1 Mbps
- Tool setup
- 82 s
Setup & Ease of Use
- ✗1-Click Install Available
- ✗Docker Pre-Configured
- ✓Setup Under 10 Minutes
- ✓AnythingLLM-Specific Docs
- ✓Intuitive Control Panel
Performance
- ✗Strong CPU Benchmark
- ✓8GB RAM at Base Tier
- ✓NVMe SSD Storage
- ✓Low API Latency
- ✓99.9%+ Uptime
Pricing & Value
- ✓Affordable at Min Specs
- ✓Affordable at Rec. Specs
- ✓No Hidden Fees
- ✓Hourly Billing Available
- —Free Trial or Money-Back
Security
- ✓Easy Firewall Config
- —DDoS Protection Included
- ✓SSH Key Authentication
- ✓2FA on Hosting Panel
- ✓Automatic Backups
- —Tunnel/Reverse Proxy Guide
Support
- ✓24/7 Availability
- —Fast Response Time
- —AnythingLLM-Specific Knowledge
- ✓Community Forums
Pros
- Cleanest, most polished dashboard of any host
- Industry-leading documentation and tutorials
- $200 free credit for new users — covers ~4 months of the $48/mo plan
Cons
- Second-slowest document ingestion: 2,670 ms — 2.2× Hetzner's 1,189 ms at 3.2× the price
- Lowest CPU we measured: 817 events/sec — 4-5× slower than the dedicated-core 4-vCPU providers (Hetzner, Vultr, OVHcloud)
- Second-lowest disk IOPS: 3,168 — 4.6× slower than Kamatera
- Install is 82 seconds (2.2× slower than Hetzner)
- Only 5 TB of bundled bandwidth; $0.01/GB over that
- At $48/month it's 3.2× Hetzner's price for markedly worse AnythingLLM performance
Visit DigitalOcean
Don't, at this tier. Every metric that matters for AnythingLLM — ingestion speed, RAG latency, install time, per-dollar hardware — points at Hetzner or OVHcloud instead. DigitalOcean's premium pays for dashboard polish and docs that you touch once during setup, while the slow CPU affects every document you upload and every question you ask. Reserve DO for cases where you need their managed Postgres or Kubernetes add-ons.
Deployment profile
Regions tested: Hetzner Hillsboro (US-West), DigitalOcean New York, Vultr New Jersey, OVHcloud BHS5 (Canada-East), Contabo Germany (EU), Hostinger Boston, Kamatera US-NJ. Tiers: Hetzner CPX31, DigitalOcean Basic Regular, Vultr High Performance AMD, OVH c3-8, Contabo Cloud VPS (4 vCPU / 8 GB), Kamatera 4A — all 4 vCPU / 8 GB. Hostinger KVM 2 is 2 vCPU / 8 GB (their pricing grid skips 4/8 at this tier). Hetzner Ashburn was out of capacity during the campaign, so we used Hillsboro — our earlier LibreChat dry-run on Ashburn measured ~230 ms tighter latency to OpenAI, so Hetzner's numbers in this edition are somewhat conservative. Contabo and Hostinger charge monthly; the rest charge hourly. All 7 runs completed 50/50 chat samples + 10/10 RAG samples, 0 failures.
First-deployment checklist
- Rent a VPS with at least 2 vCPU and 4 GB of memory (4 vCPU / 8 GB recommended if you'll ingest many documents). Ubuntu 24.04.
- Log in as root, install Docker: curl -fsSL https://get.docker.com | sh
- Create a directory, drop in AnythingLLM's official docker-compose.yml (or use our trimmed version from scripts/bench/tools/anythingllm/).
- Generate auth secrets: export AUTH_TOKEN=$(openssl rand -hex 16) JWT_SECRET=$(openssl rand -hex 32) SIG_KEY=$(openssl rand -hex 32) SIG_SALT=$(openssl rand -hex 16)
- Paste your OpenAI API key (or Anthropic / Gemini / Ollama endpoint): export OPENAI_API_KEY=sk-...
- Bring it up: docker compose up -d — wait ~40-90 seconds for AnythingLLM to finish init.
- Visit http://your-ip:3001, log in with the AUTH_TOKEN password, create your first workspace, upload a document, ask a question.
Common pitfalls
AUTH_TOKEN must be set at FIRST boot. AnythingLLM stores the bcrypt hash in SQLite the first time the container starts. If you boot with no AUTH_TOKEN, set one later, and restart, the new token won't match the cached hash and /api/request-token will 500 silently. Fix: `docker compose down -v` to wipe the volume, set AUTH_TOKEN, `up -d`.
https://docs.anythingllm.com/configuration
Native embedding is CPU-bound. On a shared-CPU VPS (Contabo at this tier, DigitalOcean Basic Regular), ingesting a 5 KB document takes 2.5-3× longer than on a dedicated-CPU host (Hetzner, OVHcloud). For bulk ingestion of thousands of documents, this difference adds up to hours.
https://docs.anythingllm.com/embedding-provider-config/native
The /api/v1/* endpoints need a separately-minted API key — not the user JWT from /api/request-token. The correct mint endpoint is POST /api/system/generate-api-key (returns {apiKey: {secret}}). If you try POSTing to /api/system/api-keys/new (documented in older AnythingLLM releases) you'll get an HTML page, not a key.
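Because the wrong endpoint answers with an HTML page instead of a JSON error, it pays to fail loudly when minting the key. A defensive parse sketch — the response shapes are as described above, and `body` stands in for the raw HTTP response text from whatever client you use:

```python
import json

def extract_api_key(body: str) -> str:
    """Pull apiKey.secret out of the mint response, or raise with a useful hint."""
    if body.lstrip().startswith("<"):
        raise RuntimeError(
            "Got HTML, not JSON — you likely POSTed to the old "
            "/api/system/api-keys/new path instead of /api/system/generate-api-key"
        )
    payload = json.loads(body)
    return payload["apiKey"]["secret"]

secret = extract_api_key('{"apiKey": {"secret": "sk-demo"}}')  # → "sk-demo"
```

Our benchmark scripts do essentially this check before proceeding, which turned a confusing downstream auth failure into an immediate, readable error.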
https://docs.anythingllm.com/api-reference
Region matters as much as hardware for RAG latency. Our OVHcloud Canada numbers (RAG p50 1,100 ms) beat Hetzner Hillsboro (1,203 ms) despite Hetzner having better CPU — OVH's network hop to OpenAI's east-coast edge is tighter. Pick a datacenter close to your LLM provider's edge, not close to you.
https://platform.openai.com/docs/guides/latency-optimization
Frequently Asked Questions
Which VPS gives the fastest AnythingLLM document ingestion?
Hetzner's CPX31 at $15/month — we measured 1,189 ms to ingest a 5 KB test document on Hetzner Hillsboro, vs 2,962 ms on Contabo (2.5× slower) and 2,670 ms on DigitalOcean (2.2× slower). Native embedding runs on CPU and Hetzner's dedicated AMD cores win every time.
Does CPU matter for AnythingLLM?
Yes, and more than for LibreChat or pure chat UIs. AnythingLLM's default native embedding engine runs entirely on CPU — every document you upload gets chunked, hashed, and embedded before it can be queried. Slow CPUs (Contabo, DigitalOcean at this tier) translate directly to slow ingestion. RAG query latency is mostly network (to OpenAI) + LLM (OpenAI-side), so CPU matters less for the read path.
What specs do I need for AnythingLLM?
2 vCPU and 4 GB RAM for solo or small-team use. The full stack uses about 1 GB idle. For batch document ingestion or 5+ concurrent users, 4 vCPU and 8 GB. Storage: 20 GB minimum; LanceDB grows with your corpus at roughly 10 KB per embedded chunk.
Is self-hosting AnythingLLM cheaper than ChatGPT Plus?
For 3+ users, almost always. ChatGPT Plus is $20/person/month; a self-hosted AnythingLLM on Hetzner costs $15 flat plus per-token API charges. At gpt-4o-mini pricing (~$0.15 in / $0.60 out per 1M tokens), 1M input plus 1M output tokens per person per month costs $0.75 — so 3 users = $15 + $2.25 = $17.25 total vs $60 for ChatGPT Plus. Heavier usage with gpt-4 shifts the math — API cost becomes the dominant line.
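The break-even math is easy to redo for your own team size and token volume. A sketch using the gpt-4o-mini rates quoted above — the 1M input + 1M output tokens per user per month is an assumption you should replace with your own numbers:

```python
def self_hosted_usd(users: int, vps_usd: float = 15.00,
                    m_tokens_in: float = 1.0, m_tokens_out: float = 1.0) -> float:
    """Flat VPS cost plus per-token API spend at gpt-4o-mini rates."""
    api = users * (m_tokens_in * 0.15 + m_tokens_out * 0.60)
    return vps_usd + api

team_of_3 = self_hosted_usd(3)   # 15 + 3 * 0.75 = $17.25/mo
chatgpt_plus = 3 * 20.00         # $60.00/mo
```

Swap in gpt-4-class rates and the crossover moves: at roughly 10× the per-token price, heavy users can end up paying more self-hosted than on Plus.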
AnythingLLM vs LibreChat — which should I pick?
LibreChat for general-purpose multi-provider chat (think 'ChatGPT with better UI'). AnythingLLM for document-centric workflows (chat with your wiki, chat with your codebase, RAG over PDFs). LibreChat has cleaner UI; AnythingLLM has a built-in vector store + document ingestion pipeline. If you're not chatting with your own documents, LibreChat is simpler and lighter. If you are, AnythingLLM is the right tool.
Can I use Ollama instead of OpenAI?
Yes. AnythingLLM supports Ollama natively — set LLM_PROVIDER=ollama and OLLAMA_BASE_PATH=http://host.docker.internal:11434 in the compose env. You'll need to run Ollama separately (on the same VPS or another). For quality comparable to gpt-4o-mini you want a 7-8B model (Llama 3.1 8B, Mistral 7B), which needs at least 8 GB of RAM on top of AnythingLLM's ~1 GB. A 16 GB VPS or a dedicated GPU is the more comfortable setup.
Why do your Kamatera disk IOPS numbers not help ingestion?
Kamatera's 14,704 disk IOPS leads the group by nearly 40% over Hostinger, but its ingest time (2,642 ms) is mid-pack. Native embedding is CPU-bound — the disk work is writing 50-100 KB of vectors to LanceDB at the end, which is trivial on any SSD. If we switched to OpenAI's hosted embeddings, or ran a much bigger corpus where LanceDB write throughput matters, Kamatera's disk advantage would materialize. For the 5 KB test doc, it doesn't.
Why is DigitalOcean so much slower?
Same reason as our other editions: the Basic Regular tier uses shared CPU cores. Our sysbench measured 817 events/sec vs Hetzner's 3,627 on identical spec. AnythingLLM's native embedding runs on those CPUs; DO's embedding takes 2.2× longer. Their Premium Intel tier performs better but costs even more.
Can I run AnythingLLM and LibreChat on the same server?
Yes, if the server has enough RAM. AnythingLLM uses port 3001; LibreChat uses 3080. Both are Docker-based, both use a reverse proxy pattern. Put Caddy or Nginx in front and route by subdomain. RAM budget: AnythingLLM ~1 GB, LibreChat ~1.2 GB — 4 GB VPS is tight, 8 GB is comfortable.
How do you run these benchmarks?
For each provider we provision a fresh VPS, install Docker, pull the AnythingLLM docker-compose stack, generate per-install auth secrets, wait for /api/ping. Then we measure idle footprint, 20 TTFB samples, admin login, API key mint, workspace creation, 50 gpt-4o-mini chat calls via the workspace chat endpoint, upload a 5 KB test document, and run 10 RAG queries that force retrieval. All scripts live in scripts/bench/tools/anythingllm/ — you can rerun them yourself with your own OpenAI key.