[WORKSHOP] Web Scraping with Scrapy

**Abstract**
This workshop introduces web scraping and shows how to do it with Scrapy, one of the popular Python frameworks for web scraping.

**About**
This workshop will cover the workflow of scraping a website, step by step.

1) Reconnaissance: After deciding what kind of information we want, we find a page where we can start. We then inspect the elements that matter to us and note their tags (div, p, etc.) and, if necessary, their classes. Finally, we open a Scrapy shell and try to get the information we need by selecting the corresponding elements with XPath.

2) Crawling: We then use this logic in our code to extract data recursively. Typically we jump from page to page by extracting links that match a pattern.

3) Acquisition: During this process, any useful information we need (text, images, etc.) is downloaded and saved to disk.


**Pre-requisites**
Basic knowledge of HTML and Python is sufficient.

Those who want to follow along must have Python (3.x) and Scrapy (1.5.0) installed.

**[Slides](https://aaqaishtyaq.github.io/slides/webscrapy101/)**

**Expected Duration**: ~90 minutes

**Level**: Beginner-Intermediate

**Resources**: https://doc.scrapy.org/en/latest/index.html

**Speaker Bio**: Aaqa Ishtyaq (I am a final-year Computer Science student, currently doing an internship in Delhi as a Backend Developer.)



