Sources
Supported novel sources for scraping
Shuqi (书旗网)
t.shuqi.com
A popular novel platform owned by Alibaba, offering both free and VIP content. The scraper supports extracting metadata from Open Graph (og:) meta tags, the chapter catalog, and reader content.
Language: Simplified Chinese (简体)
Locale: zh_CN
- Homepage scraping
- Novel detail via /book/{id}.html
- Chapter catalog via /catalog/{id}/
- Chapter reader via /reader/{bookId}/
- Free vs VIP detection
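The path patterns listed above can be turned into full URLs with a small helper. This is only a sketch: `shuqiUrls` and `SHUQI_BASE` are hypothetical names, not part of the scraper, and the check that the book ID is purely numeric is an assumption based on the example ID used in the CLI section.

```javascript
// Hypothetical helper: builds Shuqi page URLs from the documented path patterns.
const SHUQI_BASE = "https://t.shuqi.com";

function shuqiUrls(bookId) {
  const id = String(bookId);
  // Assumption: book IDs are numeric, as in the example ID 8016707.
  if (!/^\d+$/.test(id)) {
    throw new Error(`Expected a numeric book ID, got: ${bookId}`);
  }
  return {
    detail: `${SHUQI_BASE}/book/${id}.html`,
    catalog: `${SHUQI_BASE}/catalog/${id}/`,
    reader: `${SHUQI_BASE}/reader/${id}/`,
  };
}

console.log(shuqiUrls(8016707));
```

Validating the ID before building the URLs keeps a malformed value from silently producing a request to an unrelated path.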
TWXS (繁體小說)
www.twxs.com.tw
A Taiwanese platform with a collection of short and long novels. Chapters are navigated sequentially; there is no separate catalog page.
Language: Traditional Chinese (繁體)
Locale: zh_TW
- Novel detail via /{novelSlug}/
- Sequential chapter navigation (read_N.html)
- Next/prev chapter following
- VIP content detection
- Ad/content cleaning
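Because TWXS has no catalog page, chapter URLs have to be derived one at a time. The sketch below illustrates the idea under several assumptions that the source does not confirm: that read_N.html sits directly under /{novelSlug}/, that N increases by one per chapter, and that numbering starts at 1. The function names are hypothetical, and the actual scraper may instead follow the next/prev links found in each page.

```javascript
// Hypothetical helpers for TWXS sequential chapter URLs.
// Assumed layout: https://www.twxs.com.tw/{novelSlug}/read_N.html

function parseChapterUrl(url) {
  const u = new URL(url);
  const m = u.pathname.match(/^\/([^/]+)\/read_(\d+)\.html$/);
  if (!m) return null; // not a chapter page under the assumed layout
  return { origin: u.origin, novelSlug: m[1], index: Number(m[2]) };
}

function chapterUrl(origin, novelSlug, index) {
  return `${origin}/${novelSlug}/read_${index}.html`;
}

function neighbors(url) {
  const p = parseChapterUrl(url);
  if (!p) throw new Error(`Not a chapter URL: ${url}`);
  return {
    // Assumption: the first chapter is read_1.html, so it has no predecessor.
    prev: p.index > 1 ? chapterUrl(p.origin, p.novelSlug, p.index - 1) : null,
    next: chapterUrl(p.origin, p.novelSlug, p.index + 1),
  };
}

console.log(neighbors("https://www.twxs.com.tw/twxscomYunZhiZi2503049039/read_2.html"));
```

A computed `next` URL is only a candidate: the scraper still has to fetch it and stop when the page is missing or the content is VIP-locked.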
CLI Usage
# Scrape from Shuqi by book ID
node src/scraper/index.js --source shuqi --id 8016707 --save both

# Scrape from TWXS by URL
node src/scraper/index.js --url https://www.twxs.com.tw/twxscomYunZhiZi2503049039/ --save both

# Dry run (metadata only)
node src/scraper/index.js --url https://t.shuqi.com/book/8016707.html --dry-run

# Batch from file
node src/scraper/index.js --list urls.txt --save db
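
When a URL is passed instead of an explicit --source, the source has to be inferred from the host. The sketch below shows one way that could work for a batch file. All names here are hypothetical. The urls.txt format (one URL per line, with blank lines and lines starting with # skipped) is an assumption, and so is the source name "twxs"; only "shuqi" appears in the commands above.

```javascript
// Hypothetical host-to-source mapping. "shuqi" is taken from the CLI usage;
// "twxs" is an assumed name for the TWXS source.
const SOURCE_BY_HOST = new Map([
  ["t.shuqi.com", "shuqi"],
  ["www.twxs.com.tw", "twxs"],
]);

// Returns the source name for a URL, or null if the URL is invalid or unsupported.
function detectSource(rawUrl) {
  let host;
  try {
    host = new URL(rawUrl).hostname;
  } catch {
    return null;
  }
  return SOURCE_BY_HOST.get(host) ?? null;
}

// Assumed file format: one URL per line; blank lines and "#" comments are skipped.
function parseUrlList(text) {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line && !line.startsWith("#"))
    .map((url) => ({ url, source: detectSource(url) }));
}

const sample = [
  "# novels to fetch",
  "https://t.shuqi.com/book/8016707.html",
  "",
  "https://www.twxs.com.tw/twxscomYunZhiZi2503049039/",
].join("\n");

console.log(parseUrlList(sample));
```

Entries whose `source` is null can be reported and skipped up front, so that one bad line does not abort the whole batch.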