title: Custom HTTP Requests
description: Learn how to customize HTTP requests with custom headers, authentication, and API interactions for html2rss.

Some websites require custom HTTP headers, authentication, or other request settings to access their content. html2rss lets you customize requests for those cases.

When You Need Custom Headers

You might need custom HTTP requests when:

  • An API requires authentication (Bearer tokens, API keys)
  • A website blocks the default user agent and expects a browser-like one
  • Content is behind a login (session cookies, authorization headers)
  • A service asks clients to identify themselves for rate limiting (custom identifying headers)
  • You need content negotiation (a specific Accept header to select a format)

Basic Configuration

Add a headers section to your feed configuration. This example is a complete, valid config:

headers:
  User-Agent: "Mozilla/5.0 (compatible; html2rss/1.0)"
  Authorization: "Bearer YOUR_API_TOKEN"
  Accept: "application/json"
channel:
  url: https://api.example.com/posts
selectors:
  items:
    selector: "array > object"
  title:
    selector: "title"
  url:
    selector: "url"

Common Use Cases

API Authentication

Many APIs require authentication tokens:

headers:
  Authorization: "Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
  X-API-Key: "your-api-key-here"
channel:
  url: "https://api.example.com/posts"
selectors:
  items:
    selector: "array > object"
  title:
    selector: "title"
  url:
    selector: "url"

User Agent Spoofing

Some websites block requests that don't look like real browsers:

headers:
  User-Agent: "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36"
  Accept: "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
  Accept-Language: "en-US,en;q=0.5"
  Accept-Encoding: "gzip, deflate"
channel:
  url: "https://example.com/articles"
selectors:
  items:
    selector: "article"
  title:
    selector: "h2"
  url:
    selector: "a"
    extractor: "href"

Content Type Negotiation

Use the Accept header to request a specific content type:

headers:
  Accept: "application/json"
channel:
  url: "https://api.example.com/posts"
selectors:
  items:
    selector: "array > object"
  title:
    selector: "title"
  url:
    selector: "url"

Custom API Headers

Some APIs require specific headers:

headers:
  X-Requested-With: "XMLHttpRequest"
  X-Custom-Header: "your-value"
  Content-Type: "application/json"
channel:
  url: "https://api.example.com/posts"
selectors:
  items:
    selector: "array > object"
  title:
    selector: "title"
  url:
    selector: "url"

Dynamic Headers

Use dynamic parameters to supply header values at runtime. Each %<name>s placeholder is replaced with the parameter of that name:

headers:
  Authorization: "Bearer %<api_token>s"
  X-User-ID: "%<user_id>s"
channel:
  url: "https://api.example.com/users/%<user_id>s/posts"
selectors:
  items:
    selector: "array > object"
  title:
    selector: "title"
  url:
    selector: "url"

See our Dynamic Parameters guide for more details.
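
To make the substitution concrete, here is a rough shell sketch of what happens to a %<name>s placeholder once a value is supplied. The expand_params helper is hypothetical and only imitates the effect with sed. It is not how html2rss implements dynamic parameters, and values containing | or & would need escaping.

```shell
# Hypothetical helper: replace %<key>s placeholders in a template string.
# Usage: expand_params TEMPLATE key=value [key=value ...]
expand_params() {
  template=$1
  shift
  for pair in "$@"; do
    key=${pair%%=*}     # text before the first "="
    value=${pair#*=}    # text after the first "="
    template=$(printf '%s' "$template" | sed "s|%<${key}>s|${value}|g")
  done
  printf '%s\n' "$template"
}
```

For example, `expand_params 'https://api.example.com/users/%<user_id>s/posts' user_id=42` prints the URL with 42 in place of the placeholder.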

Notes

  • Header examples that target third-party APIs are illustrative. Authentication requirements, header names, and response shapes can change independently of html2rss.
  • For JSON APIs, validate the response structure before assuming selectors like array > object or html_url will match.
  • If you document or share a config for reuse, prefer placeholder values and parameterized headers over embedding real tokens.
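
Before sharing a config, a quick scan for literal credentials can help. The sketch below is one possible check, not an html2rss feature: find_literal_credentials is a hypothetical helper that lists Authorization and X-API-Key lines whose value is not a %<name>s placeholder. It also flags harmless placeholders such as YOUR_API_TOKEN, so treat the output as a prompt for review.

```shell
# Hypothetical helper: print numbered config lines where a credential header
# holds a literal value instead of a %<name>s dynamic parameter.
# Usage: find_literal_credentials your-config.yml
find_literal_credentials() {
  grep -nE '^[[:space:]]*(Authorization|X-API-Key):' "$1" |
    grep -v '%<[A-Za-z_][A-Za-z0-9_]*>s'
}
```

The function prints nothing and returns a non-zero status when every credential header is parameterized.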

Testing Your Headers

Test your configuration to ensure headers work correctly:

# Test with curl first
curl -H "Authorization: Bearer YOUR_TOKEN" https://api.example.com/posts

# Then test with html2rss
html2rss feed your-config.yml

Troubleshooting

Common Issues

  • 401 Unauthorized: Check your authentication headers
  • 403 Forbidden: Verify API keys and permissions
  • 429 Too Many Requests: Reduce how often you fetch the feed, and honor the Retry-After header if the server sends one
  • Empty responses: Some APIs require specific Accept headers
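
When it is unclear which of these cases applies, checking the status code directly is a reasonable first step. The following sketch assumes curl is installed. explain_status is a hypothetical helper, and API_TOKEN is an assumed environment variable holding your token.

```shell
# Hypothetical helper: map an HTTP status code to a troubleshooting hint.
explain_status() {
  case "$1" in
    2??) echo "OK: request accepted" ;;
    401) echo "Check your authentication headers" ;;
    403) echo "Verify API keys and permissions" ;;
    429) echo "Too many requests: slow down before retrying" ;;
    *)   echo "Unlisted status: $1" ;;
  esac
}

# Example (needs network access; API_TOKEN is an assumed environment variable):
# status=$(curl -s -o /dev/null -w '%{http_code}' \
#   -H "Authorization: Bearer $API_TOKEN" https://api.example.com/posts)
# explain_status "$status"
```

The curl options used here discard the response body (-o /dev/null) and print only the status code (-w '%{http_code}').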

Debug Tips

  1. Use browser developer tools to see what headers successful requests use
  2. Test with curl before configuring html2rss
  3. Check API documentation for required headers
  4. Enable debug logging to see what headers are being sent
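
For tip 2, curl's verbose mode prints the request headers it sends, each prefixed with "> ". The sketch below filters that output so the sent headers are easy to compare with what a browser sends. request_headers_only is a hypothetical helper name, and the example URL is the placeholder used throughout this page.

```shell
# Hypothetical helper: keep only the request lines from `curl -v` output,
# dropping the "> " prefix, carriage returns, and the empty terminator line.
request_headers_only() {
  tr -d '\r' | sed -n 's/^> \(.\)/\1/p'
}

# Example (needs network access); curl writes verbose output to stderr:
# curl -sv -o /dev/null -H "Accept: application/json" \
#   https://api.example.com/posts 2>&1 | request_headers_only
```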

Advanced Examples

GitHub API

headers:
  Authorization: "token YOUR_GITHUB_TOKEN"
  Accept: "application/vnd.github.v3+json"
  User-Agent: "html2rss/1.0"
channel:
  url: https://api.github.com/repos/owner/repo/issues
selectors:
  items:
    selector: "array > object"
  title:
    selector: "title"
  url:
    selector: "html_url"

Reddit API

headers:
  User-Agent: "html2rss/1.0 by your-username"
  Accept: "application/json"
channel:
  url: https://www.reddit.com/r/programming.json
selectors:
  items:
    selector: "data > children > object > data"
  title:
    selector: "title"
  url:
    selector: "url"
