---
title: Strategy
description: "Learn how html2rss chooses request strategies by default with auto fallback, and when to override with faraday, botasaurus, or browserless."
---

import { Code } from "@astrojs/starlight/components";

The strategy key defines how html2rss fetches a website's content.

auto (default): Tries concrete strategies in order: faraday -> botasaurus -> browserless.
faraday: Makes a direct HTTP request. It is fast but does not execute JavaScript.
browserless: Renders the website in a headless Chrome browser, which is necessary for JavaScript-heavy sites.
botasaurus: Delegates fetching to a Botasaurus scrape API. This is opt-in and requires BOTASAURUS_SCRAPER_URL.

strategy is a top-level config key. Request-specific controls live under request.

auto falls back to the next strategy when the current attempt raises an error or extracts zero items. Pass an explicit --strategy ... only when you need to force a specific transport for troubleshooting or reproducibility.

`auto` (default)

The default strategy chain is:

faraday -> botasaurus -> browserless

`browserless`

To use the browserless strategy, you need a running instance of Browserless.io.

Docker

You can run a local Browserless.io instance using Docker:
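As a minimal sketch, a Docker Compose file along these lines could start a local instance. The image name (ghcr.io/browserless/chromium), the file name, and the loopback port binding are assumptions, not values taken from this page. The TOKEN value is the default local token mentioned in the troubleshooting section.

```yaml
# docker-compose.yml (hypothetical file name)
services:
  browserless:
    # Assumed image name; check the Browserless documentation for the current one.
    image: ghcr.io/browserless/chromium
    ports:
      # Expose the service on the local default endpoint ws://127.0.0.1:3000
      - "127.0.0.1:3000:3000"
    environment:
      # Must match BROWSERLESS_IO_API_TOKEN on the html2rss side
      TOKEN: "6R0W53R135510"
```

Running docker compose up in the same directory should then make the instance reachable at ws://127.0.0.1:3000.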

Configuration

Set the strategy at the top level of your feed configuration and put request controls under request:
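A minimal sketch of that layout, assuming a feed that should always render through Browserless. The numeric limits are illustrative values, and the rest of the feed config is omitted:

```yaml
# Top-level key: selects the request strategy
strategy: browserless

# Request controls live under request
request:
  max_redirects: 5   # illustrative value
  max_requests: 6    # illustrative value

# ... remaining feed configuration (channel, selectors) omitted
```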

Request Structure

Use this split consistently:

strategy: selects auto, faraday, browserless, or botasaurus
headers: top-level headers shared by all strategies
request.max_redirects: redirect limit for the request session
request.max_requests: total request budget for the whole feed build
request.browserless.*: Browserless-only options
request.botasaurus.*: Botasaurus-only options

Example:
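The following sketch puts each key from the list above in its place. The header value and all numbers are hypothetical. The shape of headers as a name-to-value map is an assumption. The rest of the feed config is omitted:

```yaml
strategy: auto

# Top-level headers, shared by all strategies
headers:
  User-Agent: "Mozilla/5.0 (compatible; example-feed-bot)"  # hypothetical value

request:
  max_redirects: 5          # redirect limit for the request session
  max_requests: 6           # total request budget for the whole feed build
  browserless:              # Browserless-only options
    preload:
      wait_after_ms: 2000
  botasaurus:               # Botasaurus-only options
    navigation_mode: auto
    max_retries: 2
```

Keeping strategy-specific options in their own sub-keys means they can stay in the config even when another strategy ends up handling the request.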

Browserless Preload

Browserless can interact with the page before html2rss captures the final HTML. Configure preload steps under request.browserless.preload.

wait_after_ms: inserts a fixed wait before or after preload steps
click_selectors: clicks matching elements until they disappear or max_clicks is reached
scroll_down: scrolls until the page height stops growing or iterations is reached
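The three preload options above could be combined as in the sketch below. The option names come from this page, but the exact nesting (a list of entries with selector and max_clicks, and iterations as a key of scroll_down) is an assumption inferred from those names. The selector and all numbers are hypothetical:

```yaml
strategy: browserless

request:
  browserless:
    preload:
      wait_after_ms: 1000            # fixed wait, in milliseconds
      click_selectors:
        - selector: ".load-more"     # hypothetical selector
          max_clicks: 3              # stop clicking after three clicks
      scroll_down:
        iterations: 5                # stop scrolling after five rounds
```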

If preload triggers a real navigation or redirect, html2rss keeps the final document metadata. Relative links and follow-up pagination therefore resolve against the page that was actually rendered after preload completed.

Command-Line Usage

You can also specify the strategy on the command line:

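<Code code={`
# Set environment variables for your Browserless.io instance
export BROWSERLESS_IO_WEBSOCKET_URL="ws://127.0.0.1:3000"
export BROWSERLESS_IO_API_TOKEN="6R0W53R135510"

# Force the browserless strategy
html2rss feed my_config.yml --strategy browserless

# Override the redirect limit and the request budget
html2rss feed my_config.yml --max-redirects 5 --max-requests 6

# Use the strategy configured in my_config.yml (auto by default)
html2rss feed my_config.yml
`} lang="sh" />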

Browserless Troubleshooting

If Browserless cannot connect, html2rss surfaces a Browserless connection failed (...) error with endpoint/token hints.

Check these first:

BROWSERLESS_IO_WEBSOCKET_URL is reachable from where html2rss runs
BROWSERLESS_IO_API_TOKEN matches your Browserless TOKEN
your Browserless service is running and accepting connections

For custom Browserless websocket endpoints, BROWSERLESS_IO_API_TOKEN is mandatory. The local default endpoint (ws://127.0.0.1:3000) can use the default local token 6R0W53R135510.

`botasaurus`

botasaurus delegates page fetching to a Botasaurus scrape API endpoint. This strategy is an explicit opt-in and requires both of the following:

strategy: botasaurus
BOTASAURUS_SCRAPER_URL set to your Botasaurus scrape API base URL (for example http://localhost:4010)

Configuration

Supported request.botasaurus options:

navigation_mode (auto, get, google_get, google_get_bypass)
max_retries (0..3)
wait_for_selector
wait_timeout_seconds
block_images
block_images_and_css
wait_for_complete_page_load
headless
proxy
user_agent
window_size (two integers, for example [1920, 1080])
lang
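As a rough sketch, a config using several of these options might look like the following. The option names and the allowed ranges come from the list above. The value types (booleans for the block and headless flags, a string for lang) and every concrete value are assumptions:

```yaml
strategy: botasaurus

request:
  botasaurus:
    navigation_mode: auto               # one of auto, get, google_get, google_get_bypass
    max_retries: 2                      # allowed range: 0..3
    wait_for_selector: "main article"   # hypothetical selector
    wait_timeout_seconds: 15            # hypothetical value
    block_images: true
    wait_for_complete_page_load: true
    headless: true
    window_size: [1920, 1080]           # two integers
    lang: en-US                         # hypothetical value
```

BOTASAURUS_SCRAPER_URL still has to be set in the environment; it is not part of the feed config.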

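Command-Line Usage

Set BOTASAURUS_SCRAPER_URL in the environment and pass --strategy botasaurus, in the same way as the browserless command-line example above.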

For detailed documentation on the Ruby API, see the official YARD documentation.
