community-crawler

6 Commits 2 Branches 0 Tags

Author	SHA1	Message	Date
Claude	d62867e0cb	Add URL endpoint testing script - Test different Ruliweb URLs (search, board, best, main) - Result: All endpoints return 403 "Access denied" - Confirms that Ruliweb blocks all bot requests - Validates that Puppeteer/Selenium is required	2025-11-15 17:34:10 +00:00
Claude	1ccbc17b79	Improve fetcher with browser-like headers and cookie handling - Add cookie jar for session management - Include sec-ch-ua and Sec-Fetch-* headers (Chrome-like) - Add HTTPS agent with keepAlive - Log 403 response body for debugging. Result: - Still blocked by TLS fingerprinting - Both Ruliweb and Arcalive return "Access denied" - Need Puppeteer to bypass advanced bot detection	2025-11-15 17:28:23 +00:00
Claude	c5ef580534	Add crawler implementation (Node.js + TypeScript) - Create crawler project structure - Implement base crawler class with safety features - Add crawlers for Ruliweb, Arcalive, DCInside - Implement utilities: fetcher (with retry logic), logger - Configure crawling settings (3s delay, max 20 posts/board) - Add test script and scheduler (30 min intervals). Safety measures: - 3-second delay between requests - Exponential backoff retry logic - Respect robots.txt (DCInside disabled) - User-Agent and proper headers. Current status: - Structure complete - Both Ruliweb and Arcalive return 403 (bot detection) - Need to decide: Puppeteer, switch targets, or use mock data	2025-11-15 17:18:09 +00:00
Claude	e8ca418817	Fix Tailwind CSS v4 configuration - Update src/index.css to use @import "tailwindcss" (v4 syntax) - Remove tailwind.config.js (not needed in v4) - Tailwind styles now properly applied	2025-11-15 14:51:59 +00:00
Claude	8f7e0ee216	Initial setup: Korean community aggregator web app - Set up Vite + React + TypeScript project - Configure Tailwind CSS v4 with PostCSS - Create project structure (components, types, data) - Implement core features: - Header with search functionality - PostCard component for displaying posts - PostList with community filtering (전체/디씨/루리웹/아카, i.e. All/DC/Ruliweb/Arca) - PostModal for detailed post view - Add mock data for 3 communities (DCInside, Ruliweb, Arcalive) - Update README with project documentation	2025-11-15 13:22:50 +00:00
Gyubin Han	6e5c108269	First commit	2025-11-15 19:55:24 +09:00
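
Commit c5ef580534 lists a 3-second delay between requests and exponential backoff retry logic as safety measures, but the log does not show how they are implemented. The following is a minimal TypeScript sketch of that pattern only. Every name here (`backoffDelayMs`, `withRetry`, `crawlSequentially`) and the 1 s base / 30 s cap are assumptions made for illustration, not code from the repository.

```typescript
// Hypothetical sketch; names and default values are not from the repository.
const REQUEST_DELAY_MS = 3000; // "3 second delay between requests" per the commit message

// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 1000, maxMs = 30000): number {
  return Math.min(baseMs * Math.pow(2, attempt), maxMs);
}

// Promise-based sleep built on setTimeout.
const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs `task`; on failure, waits with exponential backoff and retries,
// up to `maxRetries` retries. `wait` is injectable so tests need not sleep.
async function withRetry<T>(
  task: () => Promise<T>,
  maxRetries = 3,
  baseMs = 1000,
  wait: (ms: number) => Promise<void> = sleep,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await task();
    } catch (err) {
      lastError = err;
      if (attempt < maxRetries) {
        await wait(backoffDelayMs(attempt, baseMs));
      }
    }
  }
  throw lastError;
}

// Handles items strictly one at a time, pausing `delayMs` between consecutive items.
async function crawlSequentially<T, R>(
  items: T[],
  handler: (item: T) => Promise<R>,
  delayMs = REQUEST_DELAY_MS,
  wait: (ms: number) => Promise<void> = sleep,
): Promise<R[]> {
  const results: R[] = [];
  for (let i = 0; i < items.length; i++) {
    if (i > 0) {
      await wait(delayMs);
    }
    results.push(await withRetry(() => handler(items[i])));
  }
  return results;
}
```

Backoff of this kind mainly helps with transient failures such as timeouts or rate limiting. It would not be expected to get past the persistent 403 "Access denied" responses reported in the later commits.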