HostingSelector

Best VPS Providers for AnythingLLM in 2026

By Arnas Kazlaus · Software engineer and founder, 15 years shipping code

Tests run personally on rented VPSes

The best VPS for AnythingLLM isn't the one with the biggest CPU spec on paper — it's the one that embeds your documents fast and retrieves them fast. We installed AnythingLLM on 7 VPS providers, uploaded the same test document, ran 10 real RAG queries through OpenAI's gpt-4o-mini, and measured three things: document ingestion time, retrieval latency, and full completion time. Every number comes from runs on 2026-04-22. Hetzner wins ingest, OVHcloud wins RAG latency, and DigitalOcean and Contabo bring up the rear.

About this ranking

hostingselector.com rented one VPS per host, installed the official AnythingLLM docker-compose stack (LanceDB for vectors, native embedding engine, OpenAI for LLM), then ran a structured benchmark: (1) full idle footprint before any uploads, (2) 20 back-to-back GETs against /api/ping for TTFB distribution, (3) admin login + API-key mint + workspace creation, (4) 50 direct chat requests via AnythingLLM's workspace-chat endpoint through gpt-4o-mini, (5) upload of a 5 KB test document with native embedding + association into the workspace — measured end-to-end as time_to_ingest, (6) 10 RAG queries forcing vector retrieval + LLM roundtrip. Ubuntu 24.04 on all hosts. Fresh install per provider. All numbers from runs we did on 2026-04-22. All scripts live in scripts/bench/tools/anythingllm/ in our public repo.
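Step (2) of that benchmark, the TTFB distribution, reduces to timing repeated GETs and taking percentiles. A minimal bash sketch: the curl line is shown as a comment because it needs a live instance (BASE is a placeholder for yours), and the samples below are canned so the percentile math can run as written.

```shell
# In a real run, each sample comes from a timed GET against the health endpoint:
#   for i in $(seq 20); do
#     curl -s -o /dev/null -w '%{time_starttransfer}\n' "$BASE/api/ping"
#   done > ttfb.txt
# Canned samples (seconds) stand in for the live measurements here:
printf '%s\n' 0.012 0.015 0.011 0.014 0.013 > ttfb.txt
# p50 = middle element of the sorted sample list
P50=$(sort -n ttfb.txt | awk '{a[NR]=$1} END {print a[int((NR+1)/2)]}')
echo "TTFB p50: ${P50}s"
```

The same sort-and-index trick gives p95 by picking `a[int(NR*0.95+0.5)]` instead of the middle element.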

Some links on this page are affiliate links. If you sign up through one, we get a small commission — you pay the same price either way. This doesn’t change who wins below.

Test window
2026-04-22 — 2026-04-22
Region
Provider flagship region

How we test →

Why AnythingLLM?

Why self-host AnythingLLM? If you want a ChatGPT-style interface for your documents — internal wikis, product manuals, customer support articles — without handing those documents to OpenAI's retrieval tier or ChatGPT's custom GPTs, AnythingLLM is the go-to. It ingests PDFs, DOCX, TXT, web pages, YouTube transcripts, and GitHub repos, chunks them into a local vector store (LanceDB by default, or Pinecone/Chroma/Weaviate), and pipes the retrieved chunks to your LLM of choice (OpenAI, Anthropic, Gemini, Ollama, or local models). The full stack — AnythingLLM with LanceDB and its native embedder — runs in a single Docker container and uses ~1 GB of RAM idle. For a team of 3+ people paying $20/month each for ChatGPT Plus, a self-hosted AnythingLLM on a $15 Hetzner VPS plus per-token API costs is typically 3-5× cheaper from the first month.

What AnythingLLM needs from a VPS

Before you shop, know what AnythingLLM actually needs. The surprising part: document ingestion is CPU-bound, not disk-bound.

CPU

2 vCPU minimum, 4 recommended for ingestion

Native embedding runs on CPU by default. A 5 KB test document takes 1.2 s on Hetzner vs 3.0 s on Contabo — a 2.5× gap that compounds over hundreds of uploads. For batch-ingesting a big corpus, pick a 4 vCPU plan with dedicated cores.

RAM

2 GB minimum, 4 GB recommended

The AnythingLLM + LanceDB stack uses ~1 GB idle, leaving ~1 GB headroom on a 2 GB VPS for in-flight embedding jobs. 4 GB gives you comfortable room; 8 GB is where you stop worrying about concurrent users.

Storage

20 GB SSD minimum, NVMe preferred

AnythingLLM + vector store base usage is ~4 GB. Each 5 KB doc adds ~50 KB of embedded vectors, so a 1,000-document corpus adds only ~50 MB — but LanceDB metadata grows proportionally. Plan on 30 GB for moderate use, 100 GB+ for a heavy corpus.

OS

Ubuntu 22.04 or 24.04 LTS

Debian 12 also works. AnythingLLM's Docker image is a standard Node.js runtime with no special kernel requirements.

Region matters

US-East or EU-West

RAG latency includes a network hop to OpenAI (or your chosen LLM). In our tests OVHcloud Canada routed ~100ms faster to OpenAI than Hetzner US-West. Pick a datacenter close to OpenAI's edge, not close to you.

Docker

Must be installable

AnythingLLM ships as a single Docker image (mintplexlabs/anythingllm:latest). LXC-based 'container' VPS hosting often doesn't support this.

OpenAI API key (or alternative)

Bring your own

AnythingLLM doesn't include LLM access. You supply an API key for OpenAI, Anthropic, Google, or Azure — or run Ollama alongside for local inference. Native embeddings are free and run on CPU; OpenAI embeddings cost $0.02 per 1M tokens if you prefer quality over CPU cost.

Our Top Picks

Best Overall

Hetzner

Fastest document ingestion, cheapest plan

$15.00/mo

Best Budget

Hetzner

Fastest document ingestion, cheapest plan

$15.00/mo

Best for Beginners

Hostinger

Fastest per-core CPU, 2nd-best disk, friendly UI

$24.49/mo

Best Performance

OVHcloud

Fastest RAG, DDoS protection, painful price

$78.68/mo

Best Security

OVHcloud

Fastest RAG, DDoS protection, painful price

$78.68/mo

At-a-Glance Comparison

Every provider side-by-side. Lower is better for deploy time; higher is better for everything else.

Host | Price/mo | CPU (events/sec) | Disk IOPS | Deploy | Bandwidth
Hetzner | $15.00 | 3,627 | 5,718 | 37 s | 20 TB EU / 1 TB US
Hostinger | $24.49 | 4,241 | 10,675 | 53 s | 8 TB, throttle only
Vultr | $48.00 | 4,143 | 9,098 | 128 s | 6 TB, $0.01/GB over
OVHcloud | $78.68 | 4,163 | 7,318 | 61 s | Unmetered
Contabo | $11.21 | 1,491 | 2,527 | 135 s | Unmetered*
DigitalOcean | $48.00 | 817 | 3,169 | 82 s | 5 TB, $0.01/GB over
Kamatera | $39.00 | 2,699 | 14,703 | 71 s | 5 TB, $0.01/GB over

Deploy is our measured AnythingLLM install time ("Tool setup" in each review below).

* Contabo says “unmetered” but can throttle heavy users at their discretion. See full review.

Cost vs performance at a glance

Upper-left is best (cheap and fast). Lower-right is worst (expensive and slow).

[Scatter chart: price per month ($) vs CPU (events/sec, higher is better), plotting Hetzner, Hostinger, Vultr, OVHcloud, Contabo, DigitalOcean, and Kamatera. Legend: best overall pick · avoid at this tier.]

The cost that isn’t on the sticker: bandwidth

AnythingLLM's bandwidth profile is modest — chat JSON + small document uploads. Even at 30 TB/month (a team of dozens with heavy file attachments), Hetzner Europe costs about $27 total while DigitalOcean costs about $298. Bandwidth math is tool-agnostic; these numbers match our other editions.

Scenario: 4-core VPS at each host + 30 TB/month outbound traffic (outbound is what providers meter; inbound is usually free)

Host | Base price | Traffic included | Over-quota policy | Total / mo
Contabo | $11.21 | unlimited | none | $11
Hostinger | $24.49 | 8 TB | no public overage rate | $24
Hetzner | $15.00 | 20 TB (EU) | €1/TB billed over quota | $27
OVHcloud | $78.68 | unmetered | none | $79
Vultr | $48.00 | 6 TB (HP AMD) | $0.01/GB over quota | $288
Kamatera | $39.00 | 5 TB | $0.01/GB over quota | $289
DigitalOcean | $48.00 | 5 TB | $0.01/GB over quota | $298

27× difference between cheapest and priciest for the same traffic. Bandwidth policy is the biggest hidden variable in VPS pricing.

Prices current as of April 2026. For typical AnythingLLM usage (chat JSON, small doc uploads), you'll be well under 1 TB/month and bandwidth doesn't matter. If you're serving large files or many embedded docs to many concurrent users, the overage math starts to bite — DigitalOcean's $0.01/GB after 5 TB means a 10 TB month costs an extra $50.
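The table's totals come from one formula: base price plus per-GB overage on traffic beyond the included quota. A minimal bash sketch, assuming 1 TB is billed as 1,000 GB and skipping Hetzner's euro-denominated overage:

```shell
# Total monthly cost = base price + overage on traffic beyond the quota.
# Assumes whole-dollar bases, whole-TB usage, and 1 TB billed as 1,000 GB.
monthly_cost() {  # usage: monthly_cost BASE_USD INCLUDED_TB CENTS_PER_GB USAGE_TB
  local base=$1 included=$2 cents=$3 usage=$4
  local over_tb=$(( usage > included ? usage - included : 0 ))
  echo $(( base + over_tb * 1000 * cents / 100 ))
}

monthly_cost 48 5 1 30   # DigitalOcean: $48 base + 25 TB over at $0.01/GB
monthly_cost 48 6 1 30   # Vultr: $48 base + 24 TB over
```

Plug in your own expected usage to see where each host's quota starts to bite.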

Our picks

3 hosts we'd actually recommend — each wins on a specific axis.

1
Hetzner logo

Hetzner

Fastest document ingestion, cheapest plan

Min specs

$5.00/mo

Recommended

$15.00/mo

Benchmark measurements

CPU (events/sec)
3627
Disk IOPS
5718
Disk throughput
22.3 MB/s
Network
1470 Mbps
Tool setup
37 s

Setup & Ease of Use

  • 1-Click Install Available
  • Docker Pre-Configured
  • Setup Under 10 Minutes
  • AnythingLLM-Specific Docs
  • Intuitive Control Panel

Performance

  • Strong CPU Benchmark
  • 8GB RAM at Base Tier
  • NVMe SSD Storage
  • Low API Latency
  • 99.9%+ Uptime

Pricing & Value

  • Affordable at Min Specs
  • Affordable at Rec. Specs
  • No Hidden Fees
  • Hourly Billing Available
  • Free Trial or Money-Back

Security

  • Easy Firewall Config
  • DDoS Protection Included
  • SSH Key Authentication
  • 2FA on Hosting Panel
  • Automatic Backups
  • Tunnel/Reverse Proxy Guide

Support

  • 24/7 Availability
  • Fast Response Time
  • AnythingLLM-Specific Knowledge
  • Community Forums

Pros

  • Fastest document ingestion we measured: 1,189 ms for a 5 KB test doc — embedding is CPU-bound with AnythingLLM's native engine, and Hetzner's dedicated AMD cores win
  • Fastest AnythingLLM install too: 37 seconds end-to-end
  • Cheapest 4 vCPU / 8 GB plan at $15/month (EU) / $17.99/month (US)
  • Chat TTFT p50 of 999 ms — fastest in the group at forwarding through AnythingLLM to OpenAI
  • Tight UI TTFB: 1 ms p50, 2 ms p95
  • 20 TB of free outbound traffic in EU regions (1 TB in US — see cost math)

Cons

  • US (Hillsboro) region has longer network hop to OpenAI than OVHcloud Canada — our RAG p50 was 1,203 ms vs OVH's 1,100 ms. Small but real.
  • Hetzner Ashburn was out of capacity during this campaign — US-East fills up. Pick a secondary region if you want consistent availability.
  • Backups are opt-in, not on by default

Best value for AnythingLLM. Fastest document ingestion of the 7 providers we tested (1,189 ms vs DigitalOcean's 2,670 — 2.2× gap comes from dedicated vs shared CPU). Fastest install. Half the price of Vultr or DigitalOcean. If you plan to ingest hundreds of documents, Hetzner's CPU advantage compounds on every single upload.

Visit Hetzner
2
Hostinger logo

Hostinger

Fastest per-core CPU, 2nd-best disk, friendly UI

Min specs

$9.99/mo

Recommended

$24.49/mo

Benchmark measurements

CPU (events/sec)
4241
Disk IOPS
10675
Disk throughput
41.7 MB/s
Network
148 Mbps
Tool setup
53 s

Setup & Ease of Use

  • 1-Click Install Available
  • Docker Pre-Configured
  • Setup Under 10 Minutes
  • AnythingLLM-Specific Docs
  • Intuitive Control Panel

Performance

  • Strong CPU Benchmark
  • 8GB RAM at Base Tier
  • NVMe SSD Storage
  • Low API Latency
  • 99.9%+ Uptime

Pricing & Value

  • Affordable at Min Specs
  • Affordable at Rec. Specs
  • No Hidden Fees
  • Hourly Billing Available
  • Free Trial or Money-Back

Security

  • Easy Firewall Config
  • DDoS Protection Included
  • SSH Key Authentication
  • 2FA on Hosting Panel
  • Automatic Backups
  • Tunnel/Reverse Proxy Guide

Support

  • 24/7 Availability
  • Fast Response Time
  • AnythingLLM-Specific Knowledge
  • Community Forums

Pros

  • Highest per-core CPU we measured: 4,240 events/sec on just 2 vCPU — beats Vultr's 4-vCPU plan per-core
  • Second-best disk IOPS: 10,675 (behind only Kamatera's 14,703)
  • Third-fastest document ingestion: 1,516 ms — only Hetzner and OVHcloud beat it, and both have dedicated CPUs
  • Fast AnythingLLM install: 53 seconds
  • Beginner-friendly hPanel UI with good documentation
  • 8 TB of bundled bandwidth — no overage surprise at scale

Cons

  • KVM 2 is locked at 2 vCPU — if you plan to ingest big document libraries in parallel, the 4-vCPU plans (Hetzner, Vultr) will finish faster per batch
  • Advertised price is $9/mo on the long-term commitment. Pay monthly and it's $14.99; once the promo ends it renews at $24.49/mo — 2.7× the advertised rate
  • Marketing pushes 2-year plans hard — the advertised price assumes that commitment
  • RAG p95 of 2,528 ms is looser than OVH's (1,517 ms) or Hetzner's (1,664 ms) — the 2 vCPU handle single requests fine, but tail latency suffers under concurrent load

Best for beginners self-hosting AnythingLLM. Solid across every metric, easy dashboard, no weird gotchas. Per-core CPU is the highest of the 7, so single-user chat + RAG feel snappy. The 2 vCPU ceiling matters only if you're doing batch document ingestion for a team; for solo use it's a non-issue.

Visit Hostinger
3
OVHcloud logo

OVHcloud

Fastest RAG, DDoS protection, painful price

Min specs

$11.00/mo

Recommended

$78.68/mo

Benchmark measurements

CPU (events/sec)
4163
Disk IOPS
7318
Disk throughput
28.6 MB/s
Network
1 Mbps
Tool setup
61 s

Setup & Ease of Use

  • 1-Click Install Available
  • Docker Pre-Configured
  • Setup Under 10 Minutes
  • AnythingLLM-Specific Docs
  • Intuitive Control Panel

Performance

  • Strong CPU Benchmark
  • 8GB RAM at Base Tier
  • NVMe SSD Storage
  • Low API Latency
  • 99.9%+ Uptime

Pricing & Value

  • Affordable at Min Specs
  • Affordable at Rec. Specs
  • No Hidden Fees
  • Hourly Billing Available
  • Free Trial or Money-Back

Security

  • Easy Firewall Config
  • DDoS Protection Included
  • SSH Key Authentication
  • 2FA on Hosting Panel
  • Automatic Backups
  • Tunnel/Reverse Proxy Guide

Support

  • 24/7 Availability
  • Fast Response Time
  • AnythingLLM-Specific Knowledge
  • Community Forums

Pros

  • Fastest RAG p50 in the group: 1,100 ms — OVH BHS5 has excellent network routing to OpenAI's east-coast edge, and it shows up on every retrieval query
  • Tightest RAG p95: 1,517 ms — no outliers
  • Second-fastest document ingestion: 1,371 ms (narrowly behind Hetzner's 1,189)
  • Highest CPU we measured among 4-vCPU providers: 4,163 events/sec
  • Second-best UI TTFB: 1 ms p50, 2 ms p95 (tied with Hostinger)
  • DDoS protection and unmetered bandwidth included
  • Strong European data sovereignty story

Cons

  • Most expensive plan in our group at $78.68/month — 5.25× Hetzner's EU price
  • Default login is the `ubuntu` user, not root — you'll need `sudo` for docker commands or enable root SSH first
  • Your OVH account is locked to one region entity (EU, CA, or US) — you can't mix

Best RAG experience of the 7 providers. If you're running AnythingLLM for production document chat where latency matters — customer support bot, internal knowledge base — OVH's 1,100 ms RAG p50 with 1,517 ms p95 is worth the premium. For solo hobbyist use, Hetzner delivers 95% of the experience at 20% of the cost.

Visit OVHcloud

Close calls

3 tested but not our top picks — each has a real edge for a specific use case.

4
Vultr logo

Vultr

Fast CPU, slow install, mid-pack RAG

Min specs

$6.00/mo

Recommended

$48.00/mo

Benchmark measurements

CPU (events/sec)
4143
Disk IOPS
9098
Disk throughput
35.5 MB/s
Network
1 Mbps
Tool setup
128 s

Setup & Ease of Use

  • 1-Click Install Available
  • Docker Pre-Configured
  • Setup Under 10 Minutes
  • AnythingLLM-Specific Docs
  • Intuitive Control Panel

Performance

  • Strong CPU Benchmark
  • 8GB RAM at Base Tier
  • NVMe SSD Storage
  • Low API Latency
  • 99.9%+ Uptime

Pricing & Value

  • Affordable at Min Specs
  • Affordable at Rec. Specs
  • No Hidden Fees
  • Hourly Billing Available
  • Free Trial or Money-Back

Security

  • Easy Firewall Config
  • DDoS Protection Included
  • SSH Key Authentication
  • 2FA on Hosting Panel
  • Automatic Backups
  • Tunnel/Reverse Proxy Guide

Support

  • 24/7 Availability
  • Fast Response Time
  • AnythingLLM-Specific Knowledge
  • Community Forums

Pros

  • Third-highest CPU: 4,143 events/sec — just behind OVHcloud's 4,163 among 4-vCPU plans
  • Strong disk: 9,098 IOPS (4th overall)
  • Chat and RAG latencies mid-pack — 1,228 ms / 1,360 ms p50
  • Hourly billing, 30+ global regions

Cons

  • Second-slowest install in the group (only Contabo was slower): 128 seconds — 3.5× Hetzner's 37 s
  • 3× more expensive than Hetzner ($48/mo vs $15/mo EU) for worse ingest performance
  • Document ingestion (2,094 ms) is 76% slower than Hetzner's 1,189 ms despite comparable CPU benchmarks — native embedding runs locally, so the gap points at CPU contention or I/O scheduling on the NJ host rather than raw specs
  • Only 6 TB of bandwidth included; $0.01/GB over that

Hardware is there but the real-world numbers aren't. AnythingLLM users who care about ingestion speed get more from Hetzner at roughly a third of the price. Vultr's main case — 'fast CPU when you need it' — is already covered by Hetzner ($15/mo EU, $17.99/mo US). Pick Vultr only if you specifically need a region Hetzner doesn't have.

Visit Vultr
5
Contabo logo

Contabo

Cheapest, slowest, shared-CPU tax visible everywhere

Min specs

$5.99/mo

Recommended

$11.21/mo

Benchmark measurements

CPU (events/sec)
1491
Disk IOPS
2527
Disk throughput
9.9 MB/s
Network
135 Mbps
Tool setup
135 s

Setup & Ease of Use

  • 1-Click Install Available
  • Docker Pre-Configured
  • Setup Under 10 Minutes
  • AnythingLLM-Specific Docs
  • Intuitive Control Panel

Performance

  • Strong CPU Benchmark
  • 8GB RAM at Base Tier
  • NVMe SSD Storage
  • Low API Latency
  • 99.9%+ Uptime

Pricing & Value

  • Affordable at Min Specs
  • Affordable at Rec. Specs
  • No Hidden Fees
  • Hourly Billing Available
  • Free Trial or Money-Back

Security

  • Easy Firewall Config
  • DDoS Protection Included
  • SSH Key Authentication
  • 2FA on Hosting Panel
  • Automatic Backups
  • Tunnel/Reverse Proxy Guide

Support

  • 24/7 Availability
  • Fast Response Time
  • AnythingLLM-Specific Knowledge
  • Community Forums

Pros

  • Cheapest plan in the group at $11.21/month
  • Unlimited bandwidth

Cons

  • Slowest document ingestion: 2,962 ms — 2.5× Hetzner's 1,189 ms. Native embedding is CPU-bound and Contabo's shared cores lose every time.
  • Slowest AnythingLLM install: 135 seconds — 3.6× Hetzner's 37 s
  • Second-lowest CPU in the group: 1,491 events/sec
  • Lowest disk IOPS in the group: 2,527 — behind even DigitalOcean's 3,169
  • Loosest RAG p95: 3,074 ms — 2× wider than OVHcloud's. Variance from shared-CPU interference shows up on every request.
  • No hourly billing — minimum one-month commitment
  • Cancellation runs to end of paid term — you keep paying for the month even if you stop using the server on day 2

Cheapest AnythingLLM VPS on paper, slowest in practice. If you're ingesting many documents, Contabo's CPU tax — roughly 2.5× slower embedding than Hetzner — makes it functionally more expensive than the $15 Hetzner once you factor in your time. Fine for a low-volume personal workspace; frustrating for anything heavier.

Visit Contabo
6
Kamatera logo

Kamatera

Best disk IOPS, fastest chat, mid-pack on ingest

Min specs

$4.00/mo

Recommended

$39.00/mo

Benchmark measurements

CPU (events/sec)
2699
Disk IOPS
14704
Disk throughput
57.4 MB/s
Network
437 Mbps
Tool setup
71 s

Setup & Ease of Use

  • 1-Click Install Available
  • Docker Pre-Configured
  • Setup Under 10 Minutes
  • AnythingLLM-Specific Docs
  • Intuitive Control Panel

Performance

  • Strong CPU Benchmark
  • 8GB RAM at Base Tier
  • NVMe SSD Storage
  • Low API Latency
  • 99.9%+ Uptime

Pricing & Value

  • Affordable at Min Specs
  • Affordable at Rec. Specs
  • No Hidden Fees
  • Hourly Billing Available
  • Free Trial or Money-Back

Security

  • Easy Firewall Config
  • DDoS Protection Included
  • SSH Key Authentication
  • 2FA on Hosting Panel
  • Automatic Backups
  • Tunnel/Reverse Proxy Guide

Support

  • 24/7 Availability
  • Fast Response Time
  • AnythingLLM-Specific Knowledge
  • Community Forums

Pros

  • Fastest disk we measured: 14,703 IOPS — 1.4× Hostinger, 2.6× Hetzner, 4.6× DigitalOcean
  • Chat total p50 of 1,031 ms — second-fastest AnythingLLM chat roundtrip behind only Hetzner (999 ms)
  • Flexible spec picker — dial in exact CPU+RAM+disk without being locked to fixed SKUs
  • Hourly billing, global datacenter choice

Cons

  • $39/month (4A tier) — about 2.6× Hetzner's EU price
  • Document ingestion at 2,642 ms is surprisingly slow given the disk advantage — embedding is CPU-bound and Kamatera's CPU (2,699 events/sec) is mid-pack, not top
  • No AnythingLLM one-click — manual install via SSH
  • RAG p95 of 1,862 ms is looser than OVH's 1,517 — the disk speed doesn't help when the bottleneck is network + LLM

Best AnythingLLM host if your workload is disk-heavy — big vector stores, many embedded documents, fast retrieval across a large corpus. But if your corpus is small (under 100 docs) and your throughput is modest, the disk advantage is invisible and you're paying a premium over Hetzner for a metric that doesn't move. For a RAG-heavy production deployment with 10k+ chunks, Kamatera is the clear technical pick.

Visit Kamatera

Skip this one

Tested and didn't earn a recommendation at this price tier.

7
DigitalOcean logo

DigitalOcean

Best dashboard, slowest ingest, worst value — pattern holds

Min specs

$12.00/mo

Recommended

$48.00/mo

Benchmark measurements

CPU (events/sec)
817
Disk IOPS
3169
Disk throughput
12.4 MB/s
Network
1 Mbps
Tool setup
82 s

Setup & Ease of Use

  • 1-Click Install Available
  • Docker Pre-Configured
  • Setup Under 10 Minutes
  • AnythingLLM-Specific Docs
  • Intuitive Control Panel

Performance

  • Strong CPU Benchmark
  • 8GB RAM at Base Tier
  • NVMe SSD Storage
  • Low API Latency
  • 99.9%+ Uptime

Pricing & Value

  • Affordable at Min Specs
  • Affordable at Rec. Specs
  • No Hidden Fees
  • Hourly Billing Available
  • Free Trial or Money-Back

Security

  • Easy Firewall Config
  • DDoS Protection Included
  • SSH Key Authentication
  • 2FA on Hosting Panel
  • Automatic Backups
  • Tunnel/Reverse Proxy Guide

Support

  • 24/7 Availability
  • Fast Response Time
  • AnythingLLM-Specific Knowledge
  • Community Forums

Pros

  • Cleanest, most polished dashboard of any host
  • Industry-leading documentation and tutorials
  • $200 free credit for new users — covers ~4 months of the $48/mo plan

Cons

  • Second-slowest document ingestion: 2,670 ms — 2.2× Hetzner's 1,189 ms at 3.2× the price
  • Lowest CPU we measured: 817 events/sec — under a quarter of what Hetzner, Vultr, or OVHcloud deliver on the same 4-vCPU spec
  • Second-lowest disk IOPS: 3,169 — 4.6× slower than Kamatera
  • Install is 82 seconds (2.2× slower than Hetzner)
  • Only 5 TB of bundled bandwidth; $0.01/GB over that
  • At $48/month it's 3.2× Hetzner's price for markedly worse AnythingLLM performance

Don't, at this tier. Every metric that matters for AnythingLLM — ingestion speed, RAG latency, install time, per-dollar hardware — points at Hetzner or OVHcloud instead. DigitalOcean's premium pays for dashboard polish and docs that you touch once during setup, while the slow CPU affects every document you upload and every question you ask. Reserve DO for cases where you need their managed Postgres or Kubernetes add-ons.

Visit DigitalOcean

Deployment profile

Regions tested: Hetzner Hillsboro (US-West), DigitalOcean New York, Vultr New Jersey, OVHcloud BHS5 (Canada-East), Contabo Germany (EU), Hostinger Boston, Kamatera US-NJ. Tiers: Hetzner CPX31, DigitalOcean Basic Regular, Vultr High Performance AMD, OVH c3-8, Contabo Cloud VPS (4 vCPU / 8 GB), Kamatera 4A — all 4 vCPU / 8 GB. Hostinger KVM 2 is 2 vCPU / 8 GB (their pricing grid skips 4/8 at this tier). Hetzner Ashburn was out of capacity during the campaign, so we used Hillsboro — our earlier LibreChat dry-run on Ashburn measured ~230 ms tighter latency to OpenAI, so Hetzner's numbers in this edition are somewhat conservative. Contabo and Hostinger charge monthly; the rest charge hourly. All 7 runs completed 50/50 chat samples + 10/10 RAG samples, 0 failures.

First-deployment checklist

  • Rent a VPS with at least 2 vCPU and 4 GB of memory (4 vCPU / 8 GB recommended if you'll ingest many documents). Ubuntu 24.04.
  • Log in as root, install Docker: curl -fsSL https://get.docker.com | sh
  • Create a directory, drop in AnythingLLM's official docker-compose.yml (or use our trimmed version from scripts/bench/tools/anythingllm/).
  • Generate auth secrets: export AUTH_TOKEN=$(openssl rand -hex 16) JWT_SECRET=$(openssl rand -hex 32) SIG_KEY=$(openssl rand -hex 32) SIG_SALT=$(openssl rand -hex 16)
  • Paste your OpenAI API key (or Anthropic / Gemini / Ollama endpoint): export OPENAI_API_KEY=sk-...
  • Bring it up: docker compose up -d — wait ~40-90 seconds for AnythingLLM to finish init.
  • Visit http://your-ip:3001, log in with the AUTH_TOKEN password, create your first workspace, upload a document, ask a question.
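Steps 4-5 of the checklist can be collapsed into one script that writes a .env file for docker compose to read. The .env filename and the replace-me placeholder are our convention here, not an AnythingLLM requirement; check their compose file for the exact variables it expects.

```shell
#!/usr/bin/env bash
# Generate AnythingLLM's auth secrets once and persist them to .env so
# they survive restarts (AUTH_TOKEN must be identical on every boot).
set -euo pipefail

cat > .env <<EOF
AUTH_TOKEN=$(openssl rand -hex 16)
JWT_SECRET=$(openssl rand -hex 32)
SIG_KEY=$(openssl rand -hex 32)
SIG_SALT=$(openssl rand -hex 16)
OPENAI_API_KEY=${OPENAI_API_KEY:-replace-me}
EOF

chmod 600 .env   # these are secrets; keep them owner-readable only
echo "wrote .env — now run: docker compose up -d"
```

Run it in the same directory as docker-compose.yml; compose picks up .env automatically for variable substitution.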

Common pitfalls

  • AUTH_TOKEN must be set at FIRST boot. AnythingLLM stores the bcrypt hash in SQLite the first time the container starts. If you boot with no AUTH_TOKEN, set one later, and restart, the new token won't match the cached hash and /api/request-token will 500 silently. Fix: `docker compose down -v` to wipe the volume, set AUTH_TOKEN, `up -d`.

    https://docs.anythingllm.com/configuration

  • Native embedding is CPU-bound. On a shared-CPU VPS (Contabo at this tier, DigitalOcean Basic Regular), ingesting a 5 KB document takes 2.5-3× longer than on a dedicated-CPU host (Hetzner, OVHcloud). For bulk ingestion of thousands of documents, this difference adds up to hours.

    https://docs.anythingllm.com/embedding-provider-config/native

  • The /api/v1/* endpoints need a separately-minted API key — not the user JWT from /api/request-token. The correct mint endpoint is POST /api/system/generate-api-key (returns {apiKey: {secret}}). If you try POSTing to /api/system/api-keys/new (documented in older AnythingLLM releases) you'll get an HTML page, not a key.

    https://docs.anythingllm.com/api-reference

  • Region matters as much as hardware for RAG latency. Our OVHcloud Canada numbers (RAG p50 1,100 ms) beat Hetzner Hillsboro (1,203 ms) despite Hetzner having better CPU — OVH's network hop to OpenAI's east-coast edge is tighter. Pick a datacenter close to your LLM provider's edge, not close to you.

    https://platform.openai.com/docs/guides/latency-optimization
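The API-key pitfall above can be sketched end-to-end. The live curl call is shown as a comment (BASE and JWT are placeholders for your instance URL and login token); the parsing step runs against the response shape the pitfall describes, so you can see what to extract.

```shell
# Mint an /api/v1/* key with the admin JWT, then pull out the secret.
# Live call (BASE = your instance, JWT = token from /api/request-token):
#   RESP=$(curl -s -X POST "$BASE/api/system/generate-api-key" \
#            -H "Authorization: Bearer $JWT" \
#            -H 'Content-Type: application/json')
# Canned response matching the documented {apiKey: {secret}} shape:
RESP='{"apiKey":{"secret":"ALLM-EXAMPLE-KEY"}}'

# Extract the secret without jq (sed keeps the sketch dependency-free):
API_KEY=$(echo "$RESP" | sed -n 's/.*"secret" *: *"\([^"]*\)".*/\1/p')
echo "minted: $API_KEY"
```

If the response is an HTML page instead of JSON, you've hit the older, wrong endpoint the pitfall warns about.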

Frequently Asked Questions

Which VPS gives the fastest AnythingLLM document ingestion?

Hetzner's CPX31 at $15/month — we measured 1,189 ms to ingest a 5 KB test document on Hetzner Hillsboro, vs 2,962 ms on Contabo (2.5× slower) and 2,670 ms on DigitalOcean (2.2× slower). Native embedding runs on CPU and Hetzner's dedicated AMD cores win every time.

Does CPU matter for AnythingLLM?

Yes, and more than for LibreChat or pure chat UIs. AnythingLLM's default native embedding engine runs entirely on CPU — every document you upload gets chunked, hashed, and embedded before it can be queried. Slow CPUs (Contabo, DigitalOcean at this tier) translate directly to slow ingestion. RAG query latency is mostly network (to OpenAI) + LLM (OpenAI-side), so CPU matters less for the read path.

What specs do I need for AnythingLLM?

2 vCPU and 4 GB RAM for solo or small-team use. The full stack uses about 1 GB idle. For batch document ingestion or 5+ concurrent users, 4 vCPU and 8 GB. Storage: 20 GB minimum; LanceDB grows with your corpus at roughly 10 KB per embedded chunk.

Is self-hosting AnythingLLM cheaper than ChatGPT Plus?

For 3+ users, almost always. ChatGPT Plus is $20/person/month; a self-hosted AnythingLLM on Hetzner costs $15 flat plus per-token API charges. At gpt-4o-mini pricing (~$0.15/$0.60 per M tokens), 1M tokens per person per month costs $0.75 — so 3 users = $15 + $2.25 = $17.25 total vs $60 for ChatGPT Plus. Heavier usage with gpt-4 shifts the math — API cost becomes the dominant line.
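That break-even arithmetic is easy to rerun with your own numbers; the $0.75 per million tokens per user is the article's blended gpt-4o-mini figure, not a universal constant.

```shell
# Self-hosted total = flat VPS fee + per-user token spend;
# ChatGPT Plus total = per-seat price × users.
USERS=3
VPS_USD=15.00
TOKENS_USD_PER_USER=0.75   # blended gpt-4o-mini cost for ~1M tokens/user/mo
PLUS_SEAT_USD=20.00

SELF=$(awk -v u=$USERS -v v=$VPS_USD -v t=$TOKENS_USD_PER_USER \
        'BEGIN { printf "%.2f", v + u * t }')
PLUS=$(awk -v u=$USERS -v s=$PLUS_SEAT_USD 'BEGIN { printf "%.2f", u * s }')
echo "self-hosted: \$$SELF/mo  vs  ChatGPT Plus: \$$PLUS/mo"
```

Bump TOKENS_USD_PER_USER toward gpt-4 pricing and the gap narrows quickly, which is the FAQ's closing caveat.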

AnythingLLM vs LibreChat — which should I pick?

LibreChat for general-purpose multi-provider chat (think 'ChatGPT with better UI'). AnythingLLM for document-centric workflows (chat with your wiki, chat with your codebase, RAG over PDFs). LibreChat has cleaner UI; AnythingLLM has a built-in vector store + document ingestion pipeline. If you're not chatting with your own documents, LibreChat is simpler and lighter. If you are, AnythingLLM is the right tool.

Can I use Ollama instead of OpenAI?

Yes. AnythingLLM supports Ollama natively — set LLM_PROVIDER=ollama and OLLAMA_BASE_PATH=http://host.docker.internal:11434 in the compose env. You'll need to run Ollama separately (same VPS or another). For quality comparable to gpt-4o-mini you want a 7-8B model (Llama 3.1 8B, Mistral 7B), which needs at least 8 GB of RAM on top of AnythingLLM's 1 GB. A 16 GB VPS or a dedicated GPU is the more comfortable setup.
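A minimal sketch of the compose overrides for Ollama, using the env names from the answer above. The model-preference variable and the host-gateway mapping are our assumptions; verify both against the current AnythingLLM docs before relying on them.

```yaml
# docker-compose override sketch for AnythingLLM + a separately-run Ollama.
services:
  anythingllm:
    environment:
      - LLM_PROVIDER=ollama
      - OLLAMA_BASE_PATH=http://host.docker.internal:11434
      - OLLAMA_MODEL_PREF=llama3.1:8b   # assumed variable name; verify
    extra_hosts:
      - "host.docker.internal:host-gateway"   # Linux needs this mapping
```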

Why do your Kamatera disk IOPS numbers not help ingestion?

Kamatera's 14,703 disk IOPS leads the group, nearly 40% ahead of second-place Hostinger, but its ingest time (2,642 ms) is mid-pack. Native embedding is CPU-bound — the disk work is writing 50-100 KB of vectors to LanceDB at the end, which is trivial on any SSD. If we switched to OpenAI's hosted embeddings, or ran on a much bigger corpus where LanceDB write throughput matters, Kamatera's disk advantage would materialize. For the 5 KB test doc, it doesn't.

Why is DigitalOcean so much slower?

Same reason as our other editions: the Basic Regular tier uses shared CPU cores. Our sysbench measured 817 events/sec vs Hetzner's 3,627 on identical spec. AnythingLLM's native embedding runs on those CPUs; DO's embedding takes 2.2× longer. Their Premium Intel tier performs better but costs even more.

Can I run AnythingLLM and LibreChat on the same server?

Yes, if the server has enough RAM. AnythingLLM uses port 3001; LibreChat uses 3080. Both are Docker-based, both use a reverse proxy pattern. Put Caddy or Nginx in front and route by subdomain. RAM budget: AnythingLLM ~1 GB, LibreChat ~1.2 GB — 4 GB VPS is tight, 8 GB is comfortable.
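The subdomain routing described above can be sketched as a Caddyfile. The hostnames are placeholders for your own DNS records; Caddy terminates TLS for both automatically.

```caddyfile
# Route each app by subdomain; both sit behind one Caddy instance.
docs.example.com {
    reverse_proxy localhost:3001   # AnythingLLM
}
chat.example.com {
    reverse_proxy localhost:3080   # LibreChat
}
```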

How do you run these benchmarks?

For each provider we provision a fresh VPS, install Docker, pull the AnythingLLM docker-compose stack, generate per-install auth secrets, wait for /api/ping. Then we measure idle footprint, 20 TTFB samples, admin login, API key mint, workspace creation, 50 gpt-4o-mini chat calls via the workspace chat endpoint, upload a 5 KB test document, and run 10 RAG queries that force retrieval. All scripts live in scripts/bench/tools/anythingllm/ — you can rerun them yourself with your own OpenAI key.