yoink-go/comic/archive.go
Bryan Bailey d2c715e973 feat: add batcave.biz support, closes #6
## What changed

- `BatcaveBizMarkup` now accepts a `clientChan chan *http.Client` and
  sends the authenticated cookie jar client back to the caller after
  completing the Cloudflare challenge flow. All error paths send nil so
  the caller never blocks.

- `Comic` struct gains a `Client *http.Client` field. `NewComic` wires
  up the channel, receives the client, and stores it so downstream code
  can reuse the same authenticated session.

- `downloadFile` branches on `c.Client`: when set it builds the request
  manually and only attaches a `Referer: https://batcave.biz/` header
  when the image URL is actually on batcave.biz. Some issues host images
  on third-party CDNs (e.g. readcomicsonline.ru) that actively block
  requests with a batcave Referer, returning 403 — omitting the header
  fixes those.

- `ParseBatcaveBizTitle` extracts the chapter title from the
  `__DATA__.chapters` JSON array by matching the chapter ID in the URL's
  last path segment. The HTML `<title>` on batcave.biz is prefixed with
  "Read " and suffixed with "comics online for free", making it
  unsuitable as a filename. Using the chapter data gives clean titles
  like "Nightwing (1996) 153". "Issue #" and bare "#" are stripped since
  the hash character causes problems on some filesystems and tools.

- `ParseBatcaveBizImageLinks` now unescapes `\/` → `/` in extracted
  URLs. The `__DATA__` JSON often contains forward-slash-escaped URLs
  that would otherwise be stored verbatim.

- `archive.go`: `filepath.Walk` was called on `filepath.Dir(sourcePath)`
  (the library root) instead of `sourcePath` (the comic's own folder).
  This caused any leftover image files from previous downloads in sibling
  directories to be included in every new CBZ. Fixed by walking
  `sourcePath` directly.

- `BatcaveBizMarkup` client now has a 30s `Timeout`. Without it, a
  single stalled CDN connection would hang the worker goroutine
  indefinitely, causing `Download()` to block forever waiting for a
  result that never arrives.

- Fixed `for e := range err` in `cli/root.go` — ranging over `[]error`
  with one variable yields the index, not the error value.
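
The conditional `Referer` handling described for `downloadFile` above might look roughly like the sketch below. It is an illustration only, not the code from this commit: the helper name `newImageRequest` is hypothetical, treating subdomains of batcave.biz as "on batcave.biz" is an assumption, and the image paths in `main` are made up.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"strings"
)

// newImageRequest is a hypothetical helper (not part of the commit). It builds
// a GET request and attaches the batcave Referer only when the image itself is
// hosted on batcave.biz or one of its subdomains (the subdomain match is an
// assumption made for this sketch).
func newImageRequest(rawURL string) (*http.Request, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodGet, rawURL, nil)
	if err != nil {
		return nil, err
	}
	host := strings.ToLower(u.Hostname())
	if host == "batcave.biz" || strings.HasSuffix(host, ".batcave.biz") {
		req.Header.Set("Referer", "https://batcave.biz/")
	}
	return req, nil
}

func main() {
	// Illustrative URLs; the paths are invented.
	for _, raw := range []string{
		"https://batcave.biz/img/page-001.jpg",
		"https://readcomicsonline.ru/img/page-001.jpg",
	} {
		req, err := newImageRequest(raw)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> Referer=%q\n", req.URL.Host, req.Header.Get("Referer"))
	}
}
```

The request would then be sent with the stored client (`c.Client.Do(req)`), so the cookies from the Cloudflare challenge are reused on every download.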
2026-03-11 20:55:03 -04:00

113 lines
2.0 KiB
Go

package comic

import (
	"archive/zip"
	"io"
	"log"
	"os"
	"path/filepath"
	"strings"
)

// ArchiveError describes a failure that occurred while building a CBZ archive.
type ArchiveError struct {
	Message string
	Code    int
}

// Error implements the error interface.
func (a ArchiveError) Error() string {
	return a.Message
}

// Archive creates a CBZ (zip) archive of the comic's image files.
//
// It walks the comic's own folder (LibraryPath/Title), adds every .jpg,
// .jpeg, and .png file it finds, and writes the result to
// LibraryPath/Title/Title.cbz. It returns an error if any step fails.
func (c *Comic) Archive() error {
	outputPath := filepath.Join(c.LibraryPath, c.Title, c.Title+".cbz")
	err := os.MkdirAll(filepath.Dir(outputPath), os.ModePerm)
	if err != nil {
		return ArchiveError{
			Message: "error creating directory",
			Code:    1,
		}
	}

	zipFile, err := os.Create(outputPath)
	if err != nil {
		return err
	}
	defer zipFile.Close()

	zwriter := zip.NewWriter(zipFile)
	defer zwriter.Close()

	// Walk only the comic's own folder, not the library root, so that images
	// from sibling comics never end up in this archive.
	sourcePath := filepath.Join(c.LibraryPath, c.Title)
	err = filepath.Walk(
		sourcePath,
		func(path string, info os.FileInfo, err error) error {
			if err != nil {
				return ArchiveError{
					Message: "error walking archive",
					Code:    1,
				}
			}

			if info.IsDir() {
				return nil
			}

			// Only image files belong in the archive. This check also skips
			// the .cbz file being written into the same folder.
			ext := strings.ToLower(filepath.Ext(path))
			if ext != ".jpg" && ext != ".jpeg" && ext != ".png" {
				return nil
			}

			relPath, err := filepath.Rel(sourcePath, path)
			if err != nil {
				return ArchiveError{
					Message: "error walking archive",
					Code:    1,
				}
			}

			file, err := os.Open(path)
			if err != nil {
				return ArchiveError{
					Message: "error walking archive",
					Code:    1,
				}
			}
			defer file.Close()

			// Zip entry names must use forward slashes on every OS.
			zipEntry, err := zwriter.Create(filepath.ToSlash(relPath))
			if err != nil {
				return ArchiveError{
					Message: "error walking archive",
					Code:    1,
				}
			}

			_, err = io.Copy(zipEntry, file)
			if err != nil {
				return ArchiveError{
					Message: "error walking archive",
					Code:    1,
				}
			}

			return nil
		},
	)
	if err != nil {
		return ArchiveError{
			Message: "error writing files to archive",
			Code:    1,
		}
	}

	log.Printf("Created archive: %s", outputPath)
	return nil
}