Running Locally

When running locally through Docker, we recommend making at least 4vCPU cores and 10GB of RAM available to Docker (16GB is preferred). This can be controlled in the Resources section of the Docker Desktop settings menu.

Single Cloud Instance

We generally recommend setting everything up on a single instance (e.g. an AWS EC2 instance, a Google Compute Engine instance, an Azure VM, etc.) via Docker Compose as it’s the simplest way to get started. For a step-by-step guide on how to do this, checkout our EC2 deployment guide.

For most use cases a single reasonably sized instance should be more than enough to guarantee excellent performance. A single instance should be able to effectively serve a small-medium sized organization without issue.

If you go with this approach, we recommend:

  • CPU: >= 4 vCPU cores (we recommend 8 vCPU cores if possible)
  • Memory: >= 10 GB of RAM (for best performance, 16 is recommended)
    • Note: If you are switching embedding models, you will need >= 11 GB RAM as both sets of models need to be loaded in simultaneously
  • Disk: >= 50 GB + ~2.5x the size of the indexed documents. Disk is generally very cheap, so we would recommend getting extra disk beyond this recommendation to be safe.
    • Note: Vespa does not allow writes when disk usage is >75%, so make sure to always have some headroom here
    • Note: old, unused docker images often take up a bunch of space when performing frequent upgrades of Danswer. To clean up these unused images, run: docker system prune --all.

For reference, we have chosen to give each Danswer Cloud customer a m7g.xlarge instance by default, which has 4vCPU cores + 16GB of RAM + 200GB of disk space. We’re comfortable using 4vCPU cores in a production setting since we have dedicated GPU instances that run the embedding / cross-encoder models. If you do not plan on setting that up, we would recommend going with 8vCPU cores (if possible) for a production deployment.

Kubernetes / AWS ECS

If you prefer to give each component it’s own dedicated resources for more efficient scaling, we recommend giving each container at least the following resources:

api_server - 2500m CPU, 5Gi Memory

background - 2500m CPU, 5Gi Memory

postgres - 500m CPU, 1Gi Memory

vespa - >=2000m CPU, >= 4Gi Memory (may need to increase these depending on the number of documents you have indexed)

nginx - 250m CPU, 128Mi Memory