Web Connector - Danswer Documentation

How it works

The Web Connector scrapes sites based on a base URL.

It only indexes pages on the same domain that share the same base path.
It indexes pages reachable via hyperlinks from the base URL.
The text content is cleaned up using heuristics, and metadata such as the page title is extracted.
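
The scoping rule above can be sketched in a few lines of Python. This is only an illustration of the "same domain, same base path" idea, not the connector's actual implementation; the function names `in_scope` and `links_to_follow` and the example URLs are hypothetical, and the path check is a simple prefix match.

```python
from urllib.parse import urljoin, urlparse


def in_scope(base_url: str, candidate_url: str) -> bool:
    """Return True if candidate_url is on the base URL's domain and under its path.

    Hypothetical helper: a plain prefix match on the path, assumed for illustration.
    """
    base = urlparse(base_url)
    candidate = urlparse(candidate_url)
    if candidate.netloc != base.netloc:
        return False
    return candidate.path.startswith(base.path)


def links_to_follow(base_url: str, page_url: str, hrefs: list[str]) -> list[str]:
    """Resolve the hrefs found on page_url and keep only the in-scope ones."""
    resolved = (urljoin(page_url, href) for href in hrefs)
    return [url for url in resolved if in_scope(base_url, url)]


if __name__ == "__main__":
    base = "https://docs.example.com/guide/"
    found = ["intro", "/guide/setup", "/blog/post", "https://other.example.com/guide/x"]
    # Only the first two links stay on the same domain and under /guide/.
    print(links_to_follow(base, base, found))
```

Under these assumptions, a base URL of https://docs.example.com/guide/ would pull in pages under /guide/ but skip /blog/ and any other domain.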

Setting up

Authorization

As long as the page is reachable, no additional authorization is necessary.

Indexing

Navigate to the Admin Dashboard and select the Web Connector.
Enter the base URL to index and click Index.

To see the status of the indexing, visit the Connectors Status page (top left).
