From 266f5e19ce440a3683a7e5b9d64454749493912e Mon Sep 17 00:00:00 2001 From: Gil Desmarais Date: Sun, 20 Jul 2025 15:02:17 +0200 Subject: [PATCH 01/17] feat: add contents from html2rss gem readme Signed-off-by: Gil Desmarais --- about.md | 44 ++++++++ api-reference.md | 55 ++++++++++ components/html2rss-configs.md | 23 ---- components/html2rss-web.md | 28 ----- components/html2rss.md | 32 ------ components/index.md | 7 -- configs/index.html | 4 +- configuration/auto_source.md | 68 ++++++++++++ configuration/channel.md | 35 ++++++ configuration/headers.md | 51 +++++++++ configuration/index.md | 24 +++++ configuration/selectors/extractors.md | 51 +++++++++ configuration/selectors/index.md | 118 +++++++++++++++++++++ configuration/selectors/post-processors.md | 71 +++++++++++++ configuration/strategy.md | 83 +++++++++++++++ configuration/stylesheets.md | 72 +++++++++++++ contributing.md | 52 +++++++-- examples/advanced-content-extraction.md | 18 ++++ examples/custom-http-requests.md | 14 +++ examples/dynamic-parameters.md | 35 ++++++ examples/handling-dynamic-content.md | 14 +++ examples/index.md | 30 ++++++ examples/simple-blog-list.md | 64 +++++++++++ examples/styling-rss-feed.md | 14 +++ getting-started/index.md | 41 +++++++ getting-started/installation.md | 72 +++++++++++++ getting-started/your-first-feed.md | 65 ++++++++++++ index.html | 25 ----- index.md | 69 ++++++++++++ support/contact.md | 36 +++++++ support/troubleshooting.md | 109 +++++++++++++++++++ 31 files changed, 1297 insertions(+), 127 deletions(-) create mode 100644 about.md create mode 100644 api-reference.md delete mode 100644 components/html2rss-configs.md delete mode 100644 components/html2rss-web.md delete mode 100644 components/html2rss.md delete mode 100644 components/index.md create mode 100644 configuration/auto_source.md create mode 100644 configuration/channel.md create mode 100644 configuration/headers.md create mode 100644 configuration/index.md create mode 100644 
configuration/selectors/extractors.md create mode 100644 configuration/selectors/index.md create mode 100644 configuration/selectors/post-processors.md create mode 100644 configuration/strategy.md create mode 100644 configuration/stylesheets.md create mode 100644 examples/advanced-content-extraction.md create mode 100644 examples/custom-http-requests.md create mode 100644 examples/dynamic-parameters.md create mode 100644 examples/handling-dynamic-content.md create mode 100644 examples/index.md create mode 100644 examples/simple-blog-list.md create mode 100644 examples/styling-rss-feed.md create mode 100644 getting-started/index.md create mode 100644 getting-started/installation.md create mode 100644 getting-started/your-first-feed.md delete mode 100644 index.html create mode 100644 index.md create mode 100644 support/contact.md create mode 100644 support/troubleshooting.md diff --git a/about.md b/about.md new file mode 100644 index 00000000..8f509a46 --- /dev/null +++ b/about.md @@ -0,0 +1,44 @@ +--- +layout: default +title: About html2rss +# nav_order: 2 +--- + +# About html2rss + +`html2rss` is an open-source project dedicated to empowering users to take control of their web content consumption. In an age where many websites no longer offer traditional RSS feeds, `html2rss` bridges this gap by providing a robust and flexible solution for converting any HTML content into a structured RSS format. + +The project was started in 2018 and has since grown into a suite of tools that help users create and consume RSS feeds. + +--- + +### Our Mission + +Our mission is to provide a simple, powerful, and accessible tool that enables individuals and developers to create custom RSS feeds from any web page. We believe in the power of open standards and the freedom to access information on your own terms. + +--- + +### The html2rss Ecosystem + +The `html2rss` project is more than just a single tool. 
It's a collection of tools that work together to provide a complete RSS solution: + +- **[`html2rss`](https://github.com/html2rss/html2rss):** The core Ruby gem that provides the main functionality for converting HTML to RSS. +- **[`html2rss-web`](https://github.com/html2rss/html2rss-web):** A web application that allows you to create and manage your RSS feeds through a user-friendly interface. +- **[`html2rss-configs`](https://github.com/html2rss/html2rss-configs):** A collection of pre-built feed configs for popular websites, so you can get started quickly. + +--- + +### Project Philosophy + +- **User Empowerment:** Give users the tools to customize their web experience. +- **Simplicity & Power:** Offer an easy-to-use interface with powerful underlying capabilities. +- **Open Source:** Foster a collaborative environment where the community can contribute and improve the project. +- **Reliability:** Strive for a stable and dependable tool that consistently delivers. + +--- + +### The Team + +`html2rss` is maintained by a dedicated group of volunteers and contributors from around the world. We are passionate about open source and committed to continuously improving the project. + +Want to join us? Check out our [Contributing Guide]({{ '/contributing/' | relative_url }})! diff --git a/api-reference.md b/api-reference.md new file mode 100644 index 00000000..fc0a4c93 --- /dev/null +++ b/api-reference.md @@ -0,0 +1,55 @@ +--- +layout: default +title: API Reference +nav_order: 8 +--- + +# API Reference + +This section provides a reference for the `html2rss` command-line interface (CLI). + +For detailed documentation on the Ruby API, please refer to the official YARD documentation. + +[**📚 View the Ruby API Docs on rubydoc.info**](https://www.rubydoc.info/gems/html2rss) + +--- + +### Command-Line Interface (CLI) + +The `html2rss` executable provides the primary way to interact with the tool from your terminal. 
+ +#### `html2rss auto ` + +Automatically generates an RSS feed from the provided URL. + +- `` (Required): The URL of the website to generate a feed from. + +**Example:** + +```bash +html2rss auto https://unmatchedstyle.com/ +``` + +#### `html2rss feed ` + +Generates an RSS feed based on the provided YAML configuration file. + +- `` (Required): Path to your YAML configuration file. + +**Examples:** + +```bash +# Generate and print to console +html2rss feed my_feed.yml + +# Generate and save to an XML file +html2rss feed my_feed.yml > my_feed.xml +``` + +#### `html2rss help` + +Displays the help message with available commands and options. + +#### `html2rss --version` + +Displays the currently installed version of `html2rss`. diff --git a/components/html2rss-configs.md b/components/html2rss-configs.md deleted file mode 100644 index 89c3ca17..00000000 --- a/components/html2rss-configs.md +++ /dev/null @@ -1,23 +0,0 @@ ---- -layout: default -parent: Components -nav_order: 3 -title: html2rss-configs -description: html2rss-configs is a a growing repository of html2rss feed configs. -summary: a repository of feed configs ---- - -{{ page.description }} -{: .fs-8 } - ---- - -The [html2rss-config](https://github.com/html2rss/html2rss-configs) repository contains feed configs. Each feed config contains the instructions for the html2rss gem on how to build the RSS feed. Thus, to create a config, you need write CSS selectors and express them in YAML. - -The feed config must reside in a folder named after the fully qualified domain name of the website. - -The repository has its own test suite. It automatically tests each config and requires them to adhere to the conventions. - -A generator scaffolds a feed config and a test for that config. It gets you started in a breeze and let's you focus on writing the selectors. 
- -[See the project on Github](https://github.com/html2rss/html2rss-configs){: .btn .btn-purple } diff --git a/components/html2rss-web.md b/components/html2rss-web.md deleted file mode 100644 index fca49e3a..00000000 --- a/components/html2rss-web.md +++ /dev/null @@ -1,28 +0,0 @@ ---- -layout: default -parent: Components -nav_order: 2 -title: html2rss-web -description: html2rss-web app is a small application which serves RSS feeds via HTTP. -summary: serves RSS feeds via HTTP ---- - -{{ page.description }} -{: .fs-8 } - ---- - -html2rss-web builds and serves RSS feeds via HTTP. It's a small application which serves feeds via HTTP. It uses the _feed configs_ from [html2rss-configs](./html2rss-configs) and expose the _html2rss_ generated feeds via HTTP. - -**Generate your own feeds, or start instantly with the included configs.** - -- It's deployable without much hassle (also via Docker). -- It has a file-based application cache to prevent _hammering_ websites -- It handles with client-side HTTP cache headers. - -

- Everyone can host their own html2rss-web instance. - There are [public instances](https://github.com/html2rss/html2rss-web/wiki/Instances) for those who can't. -

- -[See the project on Github](https://github.com/html2rss/html2rss-web){: .btn .btn-purple } diff --git a/components/html2rss.md b/components/html2rss.md deleted file mode 100644 index 084c9ca2..00000000 --- a/components/html2rss.md +++ /dev/null @@ -1,32 +0,0 @@ ---- -layout: default -parent: Components -nav_order: 1 -title: html2rss gem -description: html2rss build RSS 2.0 feeds from websites (and JSON APIs) with a few CSS selectors. -summary: a Ruby gem to build RSS 2.0 feeds ---- - -{{ page.description }} -{: .fs-8 } - ---- - -The [html2rss gem](https://rubygems.org/gems/html2rss) generates a Ruby RSS object from a _feed config_. It does so by scraping and extracting the website. - -Scraping involves a tad more than just selecting an HTML element's text contents. - -- You want to sanitize HTML. -- You might find useful information in a `data` attribute in the page's source. -- You need to convert relative URLs to absolute ones. -- You want to parse dates & times in the publishers' time zone. -- Maybe the website is a JSON API and you want that response converted to a RSS feed? -- You might need to send requests with Authorization or Cookie HTTP headers. -- You want to scrape several syntactically equal pages on one website without duplicating the configs. -- You want to create a custom item description from other attributes. - -The documentation covers everything the gem is capable of. If you want to dive deeper, read the [gem's README](https://github.com/html2rss/html2rss/blob/master/README.md) or check [the YARD Docs](https://www.rubydoc.info/gems/html2rss). - -The gem's code is automatically tested. There's also code documentation for the API, usually with examples. However, looking inside the test suite to find to find more complex examples is recommended. 
- -[See the project on Github](https://github.com/html2rss/html2rss){: .btn .btn-purple } diff --git a/components/index.md b/components/index.md deleted file mode 100644 index 17a1348a..00000000 --- a/components/index.md +++ /dev/null @@ -1,7 +0,0 @@ ---- -layout: default -title: Components -nav_order: 3 -has_children: true -has_toc: true ---- diff --git a/configs/index.html b/configs/index.html index f4c6a6af..a762539e 100644 --- a/configs/index.html +++ b/configs/index.html @@ -1,8 +1,8 @@ --- layout: default -title: All feeds +title: Ready-to-use configs noindex: true -nav_order: 1 +# nav_order: 1 ---