Wrong title extracted for readallcomics.com URLs with numeric slugs (e.g. /030-2026/) #7
Bug Report
URL: https://readallcomics.com/030-2026/
Expected behavior: The comic is downloaded with its actual title (the real series name from the page).
Actual behavior: The downloaded `.cbz` archive is named `030` instead of the real comic title.
Root Cause
`extractTitleFromMarkup` in `comic/comic.go` parses the HTML `<title>` tag using a regex that expects `Title (YYYY...)`. For this URL, the page title renders as something like `030 (2026)`, so the regex extracts `030`.
The fallback `titleFromSlug` also produces `030`: the slug `030-2026` has its trailing `-2026` stripped, leaving just `030`.
Neither path recovers the real series title because the page's `<title>` tag doesn't include it, and the slug itself is purely numeric.
Steps to Reproduce
Observe the output file is named `030.cbz`.
Possible Fix
When the extracted title is purely numeric (or matches the bare slug), fall back to a more descriptive page element, e.g. the `<h1>` or `og:title` meta tag, which may contain the full series name.
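A minimal sketch of that fallback, in Go. The names `isUninformativeTitle`, `fallbackTitle`, and `resolveTitle` are hypothetical and do not come from `comic/comic.go`. The regexes assume double-quoted attributes with `property` before `content`, and the series name in the sample markup is a placeholder, since the real page markup has not been checked here.

```go
package main

import (
	"fmt"
	"html"
	"regexp"
	"strings"
)

// Assumed patterns: double-quoted attributes, property before content.
// The real page markup may differ and would need to be verified.
var (
	ogTitleRe = regexp.MustCompile(`(?is)<meta[^>]+property="og:title"[^>]+content="([^"]*)"`)
	h1Re      = regexp.MustCompile(`(?is)<h1[^>]*>(.*?)</h1>`)
	tagRe     = regexp.MustCompile(`<[^>]+>`)
	numericRe = regexp.MustCompile(`^[0-9]+$`)
)

// isUninformativeTitle reports whether a title is empty, purely numeric,
// or just the bare slug (compared case-insensitively).
func isUninformativeTitle(title, slug string) bool {
	t := strings.TrimSpace(title)
	return t == "" || numericRe.MatchString(t) || strings.EqualFold(t, slug)
}

// cleanText strips nested tags, decodes HTML entities, and trims whitespace.
func cleanText(s string) string {
	return strings.TrimSpace(html.UnescapeString(tagRe.ReplaceAllString(s, "")))
}

// fallbackTitle tries og:title first, then the first <h1>.
// It returns "" when neither yields any text.
func fallbackTitle(markup string) string {
	for _, re := range []*regexp.Regexp{ogTitleRe, h1Re} {
		if m := re.FindStringSubmatch(markup); m != nil {
			if t := cleanText(m[1]); t != "" {
				return t
			}
		}
	}
	return ""
}

// resolveTitle keeps the extracted title unless it is uninformative,
// in which case it prefers a fallback element when one is available.
func resolveTitle(extracted, slug, markup string) string {
	if !isUninformativeTitle(extracted, slug) {
		return extracted
	}
	if alt := fallbackTitle(markup); alt != "" {
		return alt
	}
	return extracted
}

func main() {
	// Placeholder markup; "Example Series" is not the real series name.
	markup := `<html><head><title>030 (2026)</title>` +
		`<meta property="og:title" content="Example Series 030 (2026)"></head>` +
		`<body><h1>Example Series 030 (2026)</h1></body></html>`
	fmt.Println(resolveTitle("030", "030-2026", markup))
}
```

The fallback text may still carry a year or issue suffix, so it would likely need the same cleanup the existing title extraction applies. Regexes are used here only to mirror the approach the issue describes; a proper HTML parser would be more robust against attribute reordering and quoting differences.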