feat: Auto generate chapters for podcasts that provide timestamps by harryr0se · Pull Request #5119 · advplyr/audiobookshelf

harryr0se · 2026-03-11T18:53:56Z

Brief summary

This PR adds automatic chapter generation for podcast episodes that provide timestamps in the description. It does this by scraping the description line by line and building up a chapter list.

Which issue is fixed?

I started working on this as it's something I really wanted, but I've found the following related issue:
#2363

In-depth Description

If the newly added autoGenerateChapters field is true on the Podcast object, the generation code runs when ABS creates a PodcastEpisode object from a newly downloaded RSSPodcastEpisode.

The generation steps:

Break up the description into lines; currently it splits on any of </p>, <br /> or \n
Iterating over each line, we look for a timestamp via regex
If we match, we try to work out whether the timestamp contains an hour. It's common for descriptions to only start including hours once they tick over the hour mark, for example:

• 00:00 Chapter 1
• 30:00 Chapter 2
• 1:04:14 Chapter 3

We then calculate the chapter start time in seconds from this timestamp
Extracting the title is a matter of a further regex, which attempts to find the text after the timestamp
If other chapters have already been generated, we update the last one's end value to be this new chapter's start. This assumes that timestamps are sequential and contiguous
Once out of the loop, we update the last chapter to end at the duration of the audio file
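
The steps above can be sketched roughly as follows. This is a minimal illustration and not the code from this PR: the function names, both regexes and the { id, start, end, title } chapter shape are assumptions made for the example, and it leaves out the error handling described below.

```javascript
// Hypothetical sketch of the generation steps; names and regexes are illustrative only.

// Matches "MM:SS" or "H:MM:SS"; the hour group is optional.
const TIMESTAMP_REGEX = /(?:(\d{1,2}):)?(\d{1,2}):(\d{2})/

// Step 1: split on </p>, <br /> or \n, then drop remaining tags and empty lines.
function splitDescriptionIntoLines(description) {
  return description
    .split(/<\/p>|<br\s*\/?>|\n/i)
    .map((line) => line.replace(/<[^>]+>/g, '').trim())
    .filter((line) => line.length > 0)
}

function sketchChaptersFromDescription(description, audioDurationSecs) {
  const chapters = []
  for (const line of splitDescriptionIntoLines(description)) {
    // Step 2: look for a timestamp on this line.
    const match = TIMESTAMP_REGEX.exec(line)
    if (!match) continue

    // Steps 3 and 4: hours are optional, so default them to 0 when absent.
    const hours = match[1] ? Number(match[1]) : 0
    const start = hours * 3600 + Number(match[2]) * 60 + Number(match[3])

    // Step 5: treat whatever follows the timestamp as the title.
    const title = line
      .slice(match.index + match[0].length)
      .replace(/^[\s\-:|]+/, '')
      .trim()

    // Step 6: the previous chapter ends where this one starts.
    if (chapters.length) chapters[chapters.length - 1].end = start
    chapters.push({ id: chapters.length, start, end: start, title })
  }

  // Step 7: the final chapter runs to the end of the audio file.
  if (chapters.length) chapters[chapters.length - 1].end = audioDurationSecs
  return chapters
}
```

With the example list above, "1:04:14 Chapter 3" is read as 1 hour, 4 minutes and 14 seconds, while "30:00 Chapter 2" has no hour component and is read as 30 minutes.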

Error checking

I believe this sort of feature should be quite conservative: if there are instances where we would be unsure of the state of a given timestamp, we should bail out of the entire process for the podcast episode. This is particularly important because we're treating them as neighboring chapters, so errors could propagate.

This implementation currently has the following error handling:

Throw on basic argument null checks
Throw if we're unable to scrape the title of a given chapter
Throw if we scrape and are only able to find one chapter (perhaps this isn't required, but one chapter seems unhelpful and I felt it could indicate some parsing failure)
Throw if there are timestamps past the end of the audio file
Throw if there are minutes or seconds over 59
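
As a rough illustration of this "bail out on any doubt" approach, the checks could look something like the sketch below. The helper names, error messages and the exact comparison used for the duration check are assumptions for the example, not the PR's actual code.

```javascript
// Hypothetical validation helpers; names and messages are illustrative only.

// Convert timestamp parts to seconds, rejecting minutes or seconds over 59.
function timestampPartsToSeconds(hours, minutes, seconds) {
  if (minutes > 59 || seconds > 59) {
    throw new Error(`Invalid timestamp ${hours}:${minutes}:${seconds}`)
  }
  return hours * 3600 + minutes * 60 + seconds
}

// Throw on anything suspicious so the caller discards the whole chapter list.
function assertChaptersAreTrustworthy(chapters, audioDurationSecs) {
  if (!Array.isArray(chapters) || typeof audioDurationSecs !== 'number') {
    throw new Error('Invalid arguments')
  }
  if (chapters.length < 2) {
    throw new Error('Fewer than two chapters found')
  }
  for (const chapter of chapters) {
    if (!chapter.title) {
      throw new Error(`No title found for chapter starting at ${chapter.start}s`)
    }
    if (chapter.start > audioDurationSecs) {
      throw new Error(`Chapter starts at ${chapter.start}s, past the audio duration of ${audioDurationSecs}s`)
    }
  }
  return chapters
}

// Any thrown error means "no generated chapters" rather than a partial list.
function chaptersOrNull(buildChapters) {
  try {
    return buildChapters()
  } catch (error) {
    return null
  }
}
```

Returning null for the whole episode, instead of skipping a single bad line, keeps one misread timestamp from shifting the end times of its neighbors.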

How have you tested this?

I have added a new test suite for this scraping code, covering a number of success and failure cases.
All of the checks listed under "Error checking" above should be covered by tests.

I've also been running my fork with this for nearly a week and it's working well on the 3 podcasts I subscribe to which provide timestamps

Screenshots

Web interface

iOS app beta

Next steps

I wanted to open this PR to start a discussion with maintainers and get feedback.

I'm aware that there's a rewrite of the front end ongoing, so I've tried to craft this PR as something that could land server side and then be included in the new UI. In the meantime it could be enabled on a per-podcast basis via the API.
It would be nice to know if that would be something you'd be open to.

…asts table - Bump minor version (I wasn't sure if this was needed for the migration) - Feature is now controlled by the field in the podcast database object - Move parsing code and tests to existing utils/parsers/ dir - Add more test cases


advplyr · 2026-03-17T14:12:51Z

I'm open to this, but I don't think we need a flag for it. If we can determine that this is reliable enough then we can have it on by default. It could possibly be a library setting that you could turn off.

If the podcast episode has chapters in the audio file, or it has chapters in the RSS feed then we should always prefer those. If it has neither and the description timestamps meet our criteria, then we pull chapters from the description.

harryr0se · 2026-03-17T18:20:59Z

@advplyr Thanks for the feedback!

> I'm open to this, but I don't think we need a flag for it. If we can determine that this is reliable enough then we can have it on by default. It could possibly be a library setting that you could turn off.

That sounds great, let me update the PR to remove the flag and make it the default

> If the podcast episode has chapters in the audio file, or it has chapters in the RSS feed then we should always prefer those. If it has neither and the description timestamps meet our criteria, then we pull chapters from the description.

This makes sense to me. I believe that priority is already part of this PR: this is the final fallback after checking the audioFile and the rssPodcastEpisode objects.

    if (audioFile.chapters?.length) {
      podcastEpisode.chapters = audioFile.chapters.map((ch) => ({ ...ch }))
    } else if (rssPodcastEpisode.chapters?.length) {
      podcastEpisode.chapters = rssPodcastEpisode.chapters.map((ch) => ({ ...ch }))
    } else {
      // ... try auto-generating chapters from the description
    }

> If we can determine that this is reliable enough

Regarding this, do you have any particular tests you'd like to see for such a feature?
I've been going through the podcasts that I personally subscribe to which support timestamps and testing with them.

Last night I also went through the top podcasts on PocketCasts searching for more test cases, which is where I came across an example where chapter titles could contain HTML tags. Currently I'm capturing all of these with automated tests. If you have any further scenarios or error checking you'd like to see, let me know.

harryr0se · 2026-03-17T19:59:41Z

@advplyr I've updated the PR to remove the flag and added a few more test cases

I've also added a high-level early out if there are no timestamps in the full description string, to make sure this code only runs when it's most likely to succeed.
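
A cheap pre-check of this kind could look roughly like the sketch below. The regex and function name are assumptions for illustration and are not taken from the PR.

```javascript
// Hypothetical early-out check; the regex and name are illustrative only.

// Matches "MM:SS" or "H:MM:SS" anywhere in the text.
const ANY_TIMESTAMP_REGEX = /\b\d{1,2}:\d{2}(?::\d{2})?\b/

// Returns true only when the full description contains at least one
// timestamp-like token, so the line-by-line parser can be skipped otherwise.
function descriptionMayContainTimestamps(description) {
  return typeof description === 'string' && ANY_TIMESTAMP_REGEX.test(description)
}
```

The regex is deliberately not global, so repeated calls to test() carry no lastIndex state between descriptions.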

+      throw new Error(`Chapter found that starts after over audio duration. Duration: ${audioDurationSecs}s - Chapter start ${startTime}s`)
+    }
+
+    let chapterTitleMatch = chapterTitleRegex.exec(line)



harryr0se and others added 8 commits March 10, 2026 20:13

Commit first implementation of timestamp to chapter generation

e8d65ce

Add chapter title scraping and improve error logging

b4b126e

Revert .devcontainer/devcontainer.json

e096a04

Update updating of end values to use new chaptersToPush temp array

256c341

Only use projects logger

bb7fcc1

Improve chapter generation code and extract it into its own function

9d4a2a8

Add tests

b3ba764

Merge branch 'advplyr:master' into auto-generate-chapters-from-timest…

bccf946

…amps

harryr0se marked this pull request as ready for review March 13, 2026 19:21

harryr0se added 5 commits March 13, 2026 20:11

Update logging to use info for key logs, also use [PodcastEpisode] pr…

32ea3e0

…efix to match other logs

Merge branch 'auto-generate-chapters-from-timestamps' of https://gith…

1e19bf3

…ub.com/harryr0se/audiobookshelf into auto-generate-chapters-from-timestamps

Fix typo

12b04fa

Handle podcasts which use html lists and also have html tags in the c…

6e05484

…hapter titles

harryr0se added 3 commits March 17, 2026 18:52

Handle chapters names that are very long, add examples to tests

0227302

Remove autoGenerateChapters flag, migration and version bump

8710816

Early out if the description doesn't contain and timestamps

7f88d4b

github-advanced-security AI found potential problems Mar 17, 2026


harryr0se added 2 commits March 20, 2026 17:25

Merge branch 'advplyr:master' into auto-generate-chapters-from-timest…

cbbe85c

…amps

Merge branch 'advplyr:master' into auto-generate-chapters-from-timest…

64fd42e

…amps

harryr0se wants to merge 18 commits into advplyr:master from harryr0se:auto-generate-chapters-from-timestamps

3 participants
