Title extraction fails when page title starts with #nn #4

Closed
opened 2026-03-11 22:11:21 +00:00 by bryan · 0 comments
Owner

On sites like readallcomics.com, the HTML <title> tag sometimes contains only the issue number (e.g. #018 (2026)) rather than the full comic name. The current fallback checks <h1> first, then derives a title from the URL slug.

Problem
The slug-derived title is always title-cased from hyphenated segments (e.g. absolute-batman-018-2026 → Absolute Batman 018), which works for most cases but loses any capitalisation nuance (e.g. acronyms, proper nouns like DC).
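
To illustrate the limitation, here is a minimal sketch of slug-based title-casing. It is not the actual yoink code: `titleFromSlug` and `trailingYear` are hypothetical names, and dropping a trailing four-digit year segment is an assumption inferred from the example above.

```go
package main

import (
	"regexp"
	"strings"
)

// trailingYear matches a final "-YYYY" slug segment such as "-2026".
// Assumption: the year is dropped, as the example in this issue suggests.
var trailingYear = regexp.MustCompile(`-\d{4}$`)

// titleFromSlug is a hypothetical helper that turns a URL slug into a
// title-cased string. Slugs are assumed to be ASCII.
func titleFromSlug(slug string) string {
	slug = strings.Trim(slug, "/")
	slug = trailingYear.ReplaceAllString(slug, "")

	// Split on hyphens, skipping empty segments.
	words := strings.FieldsFunc(slug, func(r rune) bool { return r == '-' })
	for i, w := range words {
		// Upper-case the first byte, lower-case the rest.
		words[i] = strings.ToUpper(w[:1]) + strings.ToLower(w[1:])
	}
	return strings.Join(words, " ")
}
```

With this sketch, `titleFromSlug("absolute-batman-018-2026")` yields `Absolute Batman 018`, while a made-up slug such as `dc-comics-001-2025` yields `Dc Comics 001`: the slug carries no casing information, so the acronym cannot be recovered.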

Steps to reproduce

yoink https://readallcomics.com/absolute-batman-018-2026/

Expected
Title derived from the actual comic name on the page, not just the URL slug.

Notes

  • h1 fallback was tried but readallcomics.com has no <h1> on comic pages
  • Slug fallback is currently working and produces acceptable results
  • A more robust fix might scrape the comic name from another page element (e.g. og:title meta tag)
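
The og:title idea from the notes could be sketched roughly as below. This is an illustration under stated assumptions, not the fix that was applied: `pickTitle` and the regular expressions are hypothetical, the page is assumed to carry a double-quoted `og:title` meta tag with `property` before `content`, and the `<h1>` step is omitted for brevity. Regular expressions are fragile for HTML, so a real implementation would more likely use an HTML parser.

```go
package main

import (
	"html"
	"regexp"
	"strings"
)

var (
	// Assumption: attributes are double-quoted and property precedes content.
	ogTitleRe = regexp.MustCompile(`(?is)<meta\s+property="og:title"\s+content="([^"]*)"`)
	titleRe   = regexp.MustCompile(`(?is)<title[^>]*>(.*?)</title>`)
	// issueOnlyRe flags titles that start with an issue number, e.g. "#018 (2026)".
	issueOnlyRe = regexp.MustCompile(`^#\d+`)
)

// firstGroup returns the trimmed, HTML-unescaped first capture group, or "".
func firstGroup(re *regexp.Regexp, page string) string {
	m := re.FindStringSubmatch(page)
	if m == nil {
		return ""
	}
	return strings.TrimSpace(html.UnescapeString(m[1]))
}

// pickTitle is a hypothetical fallback chain:
// <title> (unless it starts with #nn), then og:title, then the slug title.
func pickTitle(page, slugTitle string) string {
	if t := firstGroup(titleRe, page); t != "" && !issueOnlyRe.MatchString(t) {
		return t
	}
	if t := firstGroup(ogTitleRe, page); t != "" {
		return t
	}
	return slugTitle
}
```

Given a page whose `<title>` is `#018 (2026)` and which has no usable `og:title`, `pickTitle` returns the slug-derived title, so the current behaviour is preserved as the last resort.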
bryan closed this issue 2026-03-11 22:16:25 +00:00
Reference: bryan/yoink-go#4