feat: add batcave.biz support, closes #6
## What changed

- `BatcaveBizMarkup` now accepts a `clientChan chan *http.Client` and sends the authenticated cookie jar client back to the caller after completing the Cloudflare challenge flow. All error paths send nil so the caller never blocks.
- `Comic` struct gains a `Client *http.Client` field. `NewComic` wires up the channel, receives the client, and stores it so downstream code can reuse the same authenticated session.
- `downloadFile` branches on `c.Client`: when set it builds the request manually and only attaches a `Referer: https://batcave.biz/` header when the image URL is actually on batcave.biz. Some issues host images on third-party CDNs (e.g. readcomicsonline.ru) that actively block requests with a batcave Referer, returning 403; omitting the header fixes those.
- `ParseBatcaveBizTitle` extracts the chapter title from the `__DATA__.chapters` JSON array by matching the chapter ID in the URL's last path segment. The HTML `<title>` on batcave.biz is prefixed with "Read " and suffixed with "comics online for free", making it unsuitable as a filename. Using the chapter data gives clean titles like "Nightwing (1996) 153". "Issue #" and bare "#" are stripped since the hash character causes problems on some filesystems and tools.
- `ParseBatcaveBizImageLinks` now unescapes `\/` → `/` in extracted URLs. The `__DATA__` JSON often contains forward-slash-escaped URLs that would otherwise be stored verbatim.
- `archive.go`: `filepath.Walk` was called on `filepath.Dir(sourcePath)` (the library root) instead of `sourcePath` (the comic's own folder). This caused any leftover image files from previous downloads in sibling directories to be included in every new CBZ. Fixed by walking `sourcePath` directly.
- `BatcaveBizMarkup` client now has a 30s `Timeout`. Without it, a single stalled CDN connection would hang the worker goroutine indefinitely, causing `Download()` to block forever waiting for a result that never arrives.
- Fixed `for e := range err` in `cli/root.go` — ranging over `[]error` with one variable yields the index, not the error value.
@@ -6,6 +6,7 @@ import (
	"net/http"
	"os"
	"path/filepath"
	"strings"
	"time"

	cloudflarebp "github.com/DaRealFreak/cloudflare-bp-go"
@@ -39,13 +40,33 @@ func downloadFile(url string, page int, c *Comic) error {
		}
	}

	res, err := handleRequest(url)
	var res *http.Response
	var err error
	if c.Client != nil {
		req, reqErr := http.NewRequest("GET", url, nil)
		if reqErr != nil {
			return ComicDownloadError{Message: "invalid request", Code: 1}
		}
		req.Header.Set("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36")
		if strings.Contains(url, "batcave.biz") {
			req.Header.Set("Referer", "https://batcave.biz/")
		}
		res, err = c.Client.Do(req)
	} else {
		res, err = handleRequest(url)
	}
	if err != nil {
		return ComicDownloadError{
			Message: "invalid request",
			Code:    1,
		}
	}
	if res.StatusCode != http.StatusOK {
		return ComicDownloadError{
			Message: "bad response",
			Code:    1,
		}
	}
	defer res.Body.Close()

	imageFile, err := os.Create(imageFilepath)
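
The hunk above decides whether to send the Referer with `strings.Contains(url, "batcave.biz")`, which also matches when the string appears in a path or query of a third-party URL. As a possible refinement, and only as a sketch, the check could compare the parsed hostname instead. The helper names `isBatcaveHost` and `newImageRequest` and the example URLs are assumptions for illustration; they do not appear in the commit.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// isBatcaveHost reports whether rawURL's hostname is batcave.biz or one of
// its subdomains. Hypothetical helper, not part of the commit.
func isBatcaveHost(rawURL string) bool {
	u, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	host := strings.ToLower(u.Hostname())
	return host == "batcave.biz" || strings.HasSuffix(host, ".batcave.biz")
}

// newImageRequest builds a GET request and attaches the batcave Referer only
// when the image is actually hosted on batcave.biz.
// Hypothetical helper, not part of the commit.
func newImageRequest(rawURL string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, rawURL, nil)
	if err != nil {
		return nil, err
	}
	if isBatcaveHost(rawURL) {
		req.Header.Set("Referer", "https://batcave.biz/")
	}
	return req, nil
}

func main() {
	// Example URLs are made up for illustration.
	urls := []string{
		"https://batcave.biz/img/1.jpg",
		"https://readcomicsonline.ru/img/1.jpg?from=batcave.biz",
	}
	for _, u := range urls {
		req, err := newImageRequest(u)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> Referer=%q\n", u, req.Header.Get("Referer"))
	}
}
```

In this sketch the second URL gets no Referer even though `batcave.biz` appears in its query string, which is the case a substring match would get wrong.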