This repository provides educational material for learning web scraping with Python. It starts with the basics using requests and BeautifulSoup, then progresses to more advanced techniques with Scrapy and to JavaScript-heavy sites with Selenium.
- Introduction to Web Scraping
- Basic Web Scraping with requests and BeautifulSoup
- Setting Up the Environment
- Introduction to Scrapy
- Advanced Scrapy Techniques
- Handling JavaScript-Heavy Sites with Selenium
- Responsible Web Scraping: Best Practices and Ethics
- Challenges and Solutions in Web Scraping
- Practical Projects for Skill Application
- Handling Data Post-Scraping
- Sample Project: Scraping Patek.com
- Sample Project: Scraping Blancpain.com
- Basic knowledge of Python
- Python environment set up on your local machine
Concepts stick better when you apply them, so we've included sample projects that put the techniques from this tutorial to work on real websites. Together they cover three popular Python libraries for web scraping: BeautifulSoup, Scrapy, and Selenium.
This project uses Scrapy, a full-featured web scraping framework for Python, to extract detailed information about watch models from Patek.com. The project's directory contains everything you need to understand, run, and learn from this real-world example: the script, sample output data, and a detailed walkthrough of the code.
Our second project takes a different approach, using BeautifulSoup and Selenium to scrape data from Blancpain's website. BeautifulSoup on its own works well for static HTML; pairing it with Selenium, which drives a real browser, makes it possible to scrape pages whose content is rendered by JavaScript. The project's folder contains the script, sample data, and an extensive guide that explains each step of the process.
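The division of labor between the two libraries can be sketched as follows: Selenium loads the page and returns the rendered HTML, and BeautifulSoup parses that HTML. This is an illustrative outline, not the project's script. The function names, the div.watch, h2, and span.price selectors, and the use of headless Chrome are assumptions made for the example.

```python
from bs4 import BeautifulSoup


def fetch_rendered_html(url: str) -> str:
    """Load a page in headless Chrome and return its HTML after scripts run."""
    # Imported here so parse_watches() can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.page_source
    finally:
        # Always close the browser, even if loading the page fails.
        driver.quit()


def parse_watches(html: str) -> list[dict]:
    """Extract name and price from each (assumed) product card in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    watches = []
    for card in soup.select("div.watch"):
        name = card.select_one("h2")
        price = card.select_one("span.price")
        watches.append(
            {
                "name": name.get_text(strip=True) if name else None,
                "price": price.get_text(strip=True) if price else None,
            }
        )
    return watches


if __name__ == "__main__":
    # Parse a hand-written snippet; no browser is needed for this part.
    sample = '<div class="watch"><h2> Model A </h2><span class="price">100</span></div>'
    print(parse_watches(sample))
```

Keeping the fetch and parse steps separate means the parsing logic can be tested against saved HTML without launching a browser. On pages that load content asynchronously, page_source may be read too early, in which case an explicit wait such as Selenium's WebDriverWait is likely needed before returning the HTML.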