데브허브 | DEVHUB | Web Scraping Without Getting Blocked! (Cheerio, Puppeteer & Proxies)
Web scraping always seemed super hard to me. There’s static content and dynamic content, content rendered on the server and on the client, content that requires interaction, and content with loading delays. Then there’s the constant problem of getting blocked, and a whole new world opens up once you start using proxies and figuring out how to route your requests through them. But don’t worry! In this video, I’ll walk you through web scraping from the basics up. We’ll start by fetching HTML and parsing it with Cheerio. Since Cheerio only parses the HTML it’s given and doesn’t run JavaScript, that approach can’t scrape dynamic data, so we’ll bring in Puppeteer to handle dynamic content. Along the way, we’ll also cover proxies and how to avoid getting your requests blocked.
🔥 Try out Residential (or any other) proxies for free: https://smartproxy.pxf.io/7aox3d
🔥 Try out the Web Scraping API for free: https://smartproxy.pxf.io/19ROnx
📸 Screen Recording Software: https://dub.sh/eDa47SO
🔒 The best Authentication service: https://dub.sh/xeU8r3v
🚀 Check out Cal for free: https://dub.sh/FAuffAy
👨🏻‍💻 GitHub Repo: https://github.com/ski043/website-to-...
🌍 My Website: https://janmarshal.com/
✅ Follow me on X: https://x.com/janmarshaldev
📧 Business ONLY: jan@alenix.de
Timestamps:
00:00 Intro
01:00 Introduction to the Codebase
06:00 Basic Web Scraping (Cheerio)
16:00 How to Avoid Getting Blocked (Proxies)
29:00 Advanced Web Scraping (Puppeteer)
57:00 Scraper API