src/content/docs/ruby-gem/reference/channel.mdx
Lines changed: 18 additions & 2 deletions
@@ -3,7 +3,9 @@ title: Channel
 description: "Learn about the channel configuration block for RSS feed metadata. Configure feed title, description, author, and other RSS channel properties."
 ---
 
-The `channel` configuration block defines the metadata for your RSS feed.
+The `channel` configuration block defines your feed metadata.
+
+This example is a complete feed config so you can see the `channel` block in context:
 
 ```yaml
 channel:
@@ -12,8 +14,16 @@ channel:
   description: "A feed of the latest news from Example.com"
   author: "jane.doe@example.com (Jane Doe)"
   ttl: 60
-  language: "en-us"
+  language: "en"
   time_zone: "Europe/Berlin"
+selectors:
+  items:
+    selector: "article"
+  title:
+    selector: "h2"
+  url:
+    selector: "a"
+    extractor: "href"
 ```
 
 ## Options
@@ -28,6 +38,12 @@ channel:
 | `language` | Optional | The language of the feed. Defaults to the `lang` attribute of the `<html>` tag. |
 | `time_zone` | Optional | The time zone for parsing dates. See the [list of tz database time zones](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones). |
 
+## Notes
+
+- `language` is runtime-validated. Use a valid language code such as `en`, not an arbitrary string.
+- `author` should follow the RSS-style `email (Name)` format when you set it explicitly.
+- `time_zone` must be a known TZ database identifier such as `UTC` or `Europe/Berlin`.
+
 ---
 
 For detailed documentation on the Ruby API, see the [official YARD documentation](https://www.rubydoc.info/gems/html2rss).
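
The new `author` note describes an `email (Name)` shape. As a rough illustration of what that shape means, here is a minimal Ruby sketch of a pre-check you could run on a config value before handing it to the gem. The constant `AUTHOR_FORMAT` and the helper `rss_author?` are hypothetical names made up for this example. They are not part of html2rss, and the pattern is a loose approximation, not the gem's own validation.

```ruby
# Hypothetical pre-check, NOT part of html2rss. The pattern is a loose
# approximation of the RSS-style "email (Name)" author format.
AUTHOR_FORMAT = /\A[^@\s]+@[^@\s]+ \(.+\)\z/

# Returns true when the value looks like "email (Name)".
def rss_author?(value)
  value.is_a?(String) && value.match?(AUTHOR_FORMAT)
end

puts rss_author?("jane.doe@example.com (Jane Doe)") # a value in the documented shape
puts rss_author?("Jane Doe")                        # a bare name, no email part
```

A check like this only catches obvious shape mistakes early. Whatever the gem enforces at runtime remains authoritative.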
src/content/docs/ruby-gem/reference/selectors.mdx
Lines changed: 26 additions & 0 deletions
@@ -48,6 +48,32 @@ Available options:
 - `"reverse"`: Reverses the order of items (useful when the website shows oldest items first)
 - Default: Items appear in the order they are found on the page
 
+## Paginated Feeds
+
+`html2rss` can follow a single `rel="next"` pagination chain when you configure `selectors.items.pagination.max_pages`.
+
+```yml
+channel:
+  url: "https://example.com/news"
+selectors:
+  items:
+    selector: "article"
+    pagination:
+      max_pages: 3
+  title:
+    selector: "h1"
+  url:
+    selector: "a"
+    extractor: "href"
+```
+
+Behavior:
+
+- `max_pages` is the total page budget for the item selector chain, including the initial page.
+- Pagination follows strict `link[rel~="next"]` or `a[rel~="next"]` targets only.
+- Pagination stops when there is no next link, a page repeats, or the shared request budget is exhausted.
+- The same request safeguards apply to pagination and Browserless navigation, including timeout limits, redirect limits, response-size guards, and private-network denial.
+
 ## RSS 2.0 Selectors
 
 While you can define any named selector, only the following are used in the final RSS feed:
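
The stop conditions in the "Behavior" list can be sketched as a small loop. This is an illustrative toy model under stated assumptions, not the gem's implementation. The method `follow_next_chain` and the `pages` hash are hypothetical. The hash stands in for fetched HTML by mapping each URL to the target of its `rel="next"` link. The shared request budget and the network safeguards are not modeled.

```ruby
require "set"

# Illustrative only: `pages` maps each URL to the URL its rel="next" link
# points at (nil when there is no next link). No HTTP requests are made.
def follow_next_chain(pages, start_url, max_pages:)
  visited = []
  seen = Set.new
  url = start_url

  # max_pages counts the initial page, so at most max_pages pages are visited.
  while url && visited.size < max_pages
    break if seen.include?(url)  # a page repeats: stop
    break unless pages.key?(url) # page unknown to this toy model: stop

    seen << url
    visited << url
    url = pages[url]             # nil when there is no next link
  end

  visited
end

pages = {
  "/news" => "/news?page=2",
  "/news?page=2" => "/news?page=3",
  "/news?page=3" => "/news?page=4",
  "/news?page=4" => nil
}

p follow_next_chain(pages, "/news", max_pages: 3)
# prints ["/news", "/news?page=2", "/news?page=3"]
```

With `max_pages: 3` the fourth page is never visited, because the initial page already counts against the budget.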