fix: extract title from h1 or URL slug when page title starts with #

When readallcomics.com pages have a <title> containing only the issue
number (e.g. '#018 (2026)'), fall back to the h1 element first, then
derive the title from the URL slug by stripping the trailing year and
title-casing the hyphen-separated segments.

Closes #4
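
The fallback order described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the helper names `pickTitle` and `titleFromSlug`, the example URL path, and the treatment of an empty <title> are all assumptions made for the example.

```go
package main

import (
	"net/url"
	"path"
	"regexp"
	"strings"
	"unicode"
)

// trailingYear matches a final "-YYYY" segment such as "-2026".
var trailingYear = regexp.MustCompile(`-\d{4}$`)

// titleFromSlug (hypothetical helper) derives a title from the last path
// segment of rawURL: it strips a trailing year and title-cases the
// hyphen-separated segments.
func titleFromSlug(rawURL string) string {
	u, err := url.Parse(rawURL)
	if err != nil {
		return ""
	}
	slug := path.Base(strings.TrimRight(u.Path, "/"))
	if slug == "." || slug == "/" {
		return "" // no usable path segment
	}
	slug = trailingYear.ReplaceAllString(slug, "")

	// FieldsFunc drops empty segments, so "a--b" does not yield a double space.
	words := strings.FieldsFunc(slug, func(r rune) bool { return r == '-' })
	for i, w := range words {
		r := []rune(w)
		r[0] = unicode.ToUpper(r[0])
		words[i] = string(r)
	}
	return strings.Join(words, " ")
}

// pickTitle (hypothetical helper) prefers the page <title>, then the h1
// text, then the URL slug. Treating an empty <title> like one that starts
// with "#" is an assumption of this sketch.
func pickTitle(pageTitle, h1, rawURL string) string {
	pageTitle = strings.TrimSpace(pageTitle)
	if pageTitle != "" && !strings.HasPrefix(pageTitle, "#") {
		return pageTitle
	}
	if h1 = strings.TrimSpace(h1); h1 != "" {
		return h1
	}
	return titleFromSlug(rawURL)
}
```

With a made-up URL such as `https://readallcomics.com/some-comic-018-2026/`, a <title> of '#018 (2026)', and no h1 text, `pickTitle` returns "Some Comic 018".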
2026-03-11 18:13:14 -04:00
parent a7c3b632a5
commit dcb41deea9
9 changed files with 950 additions and 13 deletions

1
go.mod

@@ -5,6 +5,7 @@ go 1.22.3
require (
	github.com/DaRealFreak/cloudflare-bp-go v1.0.4
	github.com/PuerkitoBio/goquery v1.9.2
	github.com/andybalholm/brotli v1.2.0
	github.com/spf13/cobra v1.8.1
)