
Commit ba6925d

refactor: change links to point to correct places
1 parent 5bdf240 · commit ba6925d

34 files changed: +69 −69 lines changed

sources/academy/glossary/tools/apify_cli.md

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@ The [Apify CLI](/cli) helps you create, develop, build and run Apify Actors, and

 ## Installing {#installing}

-To install the Apfiy CLI, you'll first need npm, which comes preinstalled with Node.js. If you haven't yet installed Node, learn how to do that [here](../../webscraping/scraping_basics_javascript/data_extraction/computer_preparation.md). Additionally, make sure you've got an Apify account, as you will need to log in to the CLI to gain access to its full potential.
+To install the Apfiy CLI, you'll first need npm, which comes preinstalled with Node.js. If you haven't yet installed Node, learn how to do that [here](../../webscraping/scraping_basics_javascript/06_computer_preparation.md). Additionally, make sure you've got an Apify account, as you will need to log in to the CLI to gain access to its full potential.

 Open up a terminal instance and run the following command:
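
The hunk ends just before the command itself. For reference, installing the CLI globally with npm and logging in looks like this — a sketch based on the standard Apify CLI setup, so check the full lesson for the exact command:

```
npm install -g apify-cli
apify login
```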

sources/academy/platform/expert_scraping_with_apify/actors_webhooks.md

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@ Thus far, you've run Actors on the platform and written an Actor of your own, wh

 ## Advanced Actor overview {#advanced-actors}

-In this course, we'll be working out of the Amazon scraper project from the **Web scraping basics for JavaScript devs** course. If you haven't already built that project, you can do it in three short lessons [here](../../webscraping/scraping_basics_javascript/challenge/index.md). We've made a few small modifications to the project with the Apify SDK, but 99% of the code is still the same.
+In this course, we'll be working out of the Amazon scraper project from the **Web scraping basics for JavaScript devs** course. If you haven't already built that project, you can do it in three short lessons [here](../../webscraping/scraping_basics_javascript/21_challenge.md). We've made a few small modifications to the project with the Apify SDK, but 99% of the code is still the same.

 Take another look at the files within your Amazon scraper project. You'll notice that there is a **Dockerfile**. Every single Actor has a Dockerfile (the Actor's **Image**) which tells Docker how to spin up a container on the Apify platform which can successfully run the Actor's code. "Apify Actors" is a serverless platform that runs multiple Docker containers. For a deeper understanding of Actor Dockerfiles, refer to the [Apify Actor Dockerfile docs](/sdk/js/docs/guides/docker-images#example-dockerfile).
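
The context line above describes the Actor's Dockerfile. A minimal one, following the pattern in the linked Apify docs (the base image tag and start command here are assumptions for illustration), looks roughly like this:

```
# Start from Apify's Node.js base image (tag assumed for illustration).
FROM apify/actor-node:20

# Install dependencies first so Docker can cache this layer.
COPY package*.json ./
RUN npm install --omit=dev

# Copy the Actor's source code and define how the container starts.
COPY . ./
CMD npm start
```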

sources/academy/platform/expert_scraping_with_apify/index.md

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@ Before developing a pro-level Apify scraper, there are some important things you

 ### Crawlee, Apify SDK, and the Apify CLI {#crawlee-apify-sdk-and-cli}

-If you're feeling ambitious, you don't need to have any prior experience with Crawlee to get started with this course; however, at least 5–10 minutes of exposure is recommended. If you haven't yet tried out Crawlee, you can refer to [this lesson](../../webscraping/scraping_basics_javascript/crawling/pro_scraping.md) in the **Web scraping basics for JavaScript devs** course (and ideally follow along). To familiarize yourself with the Apify SDK, you can refer to the [Apify Platform](../apify_platform.md) category.
+If you're feeling ambitious, you don't need to have any prior experience with Crawlee to get started with this course; however, at least 5–10 minutes of exposure is recommended. If you haven't yet tried out Crawlee, you can refer to [this lesson](../../webscraping/scraping_basics_javascript/18_pro_scraping.md) in the **Web scraping basics for JavaScript devs** course (and ideally follow along). To familiarize yourself with the Apify SDK, you can refer to the [Apify Platform](../apify_platform.md) category.

 The Apify CLI will play a core role in the running and testing of the Actor you will build, so if you haven't gotten it installed already, please refer to [this short lesson](../../glossary/tools/apify_cli.md).

sources/academy/tutorials/node_js/analyzing_pages_and_fixing_errors.md

Lines changed: 1 addition & 1 deletion
@@ -71,7 +71,7 @@ try {
 }
 ```

-Read more information about logging and error handling in our developer [best practices](../../webscraping/scraping_basics_javascript/best_practices.md) section.
+Read more information about logging and error handling in our developer [best practices](../../webscraping/scraping_basics_javascript/25_best_practices.md) section.

 ### Saving snapshots {#saving-snapshots}
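
The **Saving snapshots** section is cut off by the diff, but Crawlee ships a helper for exactly this. A sketch of how it could be used inside a Puppeteer-based crawler — the key name and URL are placeholders:

```js
import { PuppeteerCrawler, puppeteerUtils } from 'crawlee';

const crawler = new PuppeteerCrawler({
    async requestHandler({ page, request, log }) {
        try {
            // ... page interactions that might fail ...
        } catch (error) {
            log.error(`Failed on ${request.url}: ${error.message}`);
            // Store a screenshot and the page's HTML in the key-value store
            // so the failure can be inspected later.
            await puppeteerUtils.saveSnapshot(page, { key: 'error-snapshot' });
            throw error;
        }
    },
});

await crawler.run(['https://example.com']);
```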

sources/academy/tutorials/node_js/dealing_with_dynamic_pages.md

Lines changed: 1 addition & 1 deletion
@@ -43,7 +43,7 @@ If you're in a brand new project, don't forget to initialize your project, then
 npm init -y && npm i crawlee
 ```

-Now, let's write some data extraction code to extract each product's data. This should look familiar if you went through the [Data Extraction](../../webscraping/scraping_basics_javascript/data_extraction/index.md) lessons:
+Now, let's write some data extraction code to extract each product's data. This should look familiar if you went through the [Data Extraction](../../webscraping/scraping_basics_javascript/02_data_extraction.md) lessons:

 ```js
 import { CheerioCrawler } from 'crawlee';
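
The hunk cuts off right after that import. For context, a `CheerioCrawler`-based extractor generally has this shape — the selectors and fields below are illustrative placeholders, not the lesson's actual code:

```js
import { CheerioCrawler } from 'crawlee';

const crawler = new CheerioCrawler({
    async requestHandler({ $, request }) {
        // Placeholder selectors — the real lesson targets its own demo site.
        const title = $('h1').text().trim();
        const price = $('.price').first().text().trim();
        console.log({ url: request.url, title, price });
    },
});

await crawler.run(['https://demo-webstore.apify.org/']);
```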

sources/academy/webscraping/anti_scraping/mitigation/using_proxies.md

Lines changed: 2 additions & 2 deletions
@@ -11,13 +11,13 @@ slug: /anti-scraping/mitigation/using-proxies

 ---

-In the [**Web scraping basics for JavaScript devs**](../../scraping_basics_javascript/crawling/pro_scraping.md) course, we learned about the power of Crawlee, and how it can streamline the development process of web crawlers. You've already seen how powerful the `crawlee` package is; however, what you've been exposed to thus far is only the tip of the iceberg.
+In the [**Web scraping basics for JavaScript devs**](../../scraping_basics_javascript/18_pro_scraping.md) course, we learned about the power of Crawlee, and how it can streamline the development process of web crawlers. You've already seen how powerful the `crawlee` package is; however, what you've been exposed to thus far is only the tip of the iceberg.

 Because proxies are so widely used in the scraping world, Crawlee has built-in features for implementing them in an effective way. One of the main functionalities that comes baked into Crawlee is proxy rotation, which is when each request is sent through a different proxy from a proxy pool.

 ## Implementing proxies in a scraper {#implementing-proxies}

-Let's borrow some scraper code from the end of the [pro-scraping](../../scraping_basics_javascript/crawling/pro_scraping.md) lesson in our **Web scraping basics for JavaScript devs** course and paste it into a new file called **proxies.js**. This code enqueues all of the product links on [demo-webstore.apify.org](https://demo-webstore.apify.org)'s on-sale page, then makes a request to each product page and scrapes data about each one:
+Let's borrow some scraper code from the end of the [pro-scraping](../../scraping_basics_javascript/18_pro_scraping.md) lesson in our **Web scraping basics for JavaScript devs** course and paste it into a new file called **proxies.js**. This code enqueues all of the product links on [demo-webstore.apify.org](https://demo-webstore.apify.org)'s on-sale page, then makes a request to each product page and scrapes data about each one:

 ```js
 // crawlee.js
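
The proxy rotation described in the context paragraph maps to Crawlee's `ProxyConfiguration` class. A minimal sketch — the proxy URLs are placeholders, not working proxies:

```js
import { CheerioCrawler, ProxyConfiguration } from 'crawlee';

// A pool of proxies to rotate through.
const proxyConfiguration = new ProxyConfiguration({
    proxyUrls: [
        'http://proxy-one.example.com:8000',
        'http://proxy-two.example.com:8000',
    ],
});

const crawler = new CheerioCrawler({
    proxyConfiguration,
    async requestHandler({ request, proxyInfo }) {
        // Crawlee picks a different proxy from the pool for each request.
        console.log(`${request.url} fetched via ${proxyInfo?.url}`);
    },
});

await crawler.run(['https://demo-webstore.apify.org/search/on-sale']);
```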

sources/academy/webscraping/puppeteer_playwright/executing_scripts/extracting_data.md

Lines changed: 1 addition & 1 deletion
@@ -14,7 +14,7 @@ import TabItem from '@theme/TabItem';

 ---

-Now that we know how to execute scripts on a page, we're ready to learn a bit about [data extraction](../../scraping_basics_javascript/data_extraction/index.md). In this lesson, we'll be scraping all the on-sale products from our [Fakestore](https://demo-webstore.apify.org/search/on-sale) website. Playwright & Puppeteer offer two main methods for data extraction:
+Now that we know how to execute scripts on a page, we're ready to learn a bit about [data extraction](../../scraping_basics_javascript/02_data_extraction.md). In this lesson, we'll be scraping all the on-sale products from our [Fakestore](https://demo-webstore.apify.org/search/on-sale) website. Playwright & Puppeteer offer two main methods for data extraction:

 1. Directly in `page.evaluate()` and other evaluate functions such as `page.$$eval()`.
 2. In the Node.js context using a parsing library such as [Cheerio](https://www.npmjs.com/package/cheerio)
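
The two options in that list can be sketched as follows (Playwright shown; the product-link selector is an assumption about the demo site's markup):

```js
import { chromium } from 'playwright';
import * as cheerio from 'cheerio';

const browser = await chromium.launch();
const page = await browser.newPage();
await page.goto('https://demo-webstore.apify.org/search/on-sale');

// 1. Extract inside the browser context with page.$$eval().
const inBrowser = await page.$$eval('a[href*="/product/"]', (links) =>
    links.map((link) => link.textContent.trim()),
);

// 2. Hand the rendered HTML to Cheerio and parse it in the Node.js context.
const $ = cheerio.load(await page.content());
const inNode = $('a[href*="/product/"]').map((i, el) => $(el).text().trim()).get();

console.log(inBrowser, inNode);
await browser.close();
```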

sources/academy/webscraping/puppeteer_playwright/index.md

Lines changed: 1 addition & 1 deletion
@@ -63,7 +63,7 @@ npm install puppeteer
 </TabItem>
 </Tabs>

-> For a more in-depth guide on how to set up the basic environment we'll be using in this tutorial, check out the [**Computer preparation**](../scraping_basics_javascript/data_extraction/computer_preparation.md) lesson in the **Web scraping basics for JavaScript devs** course
+> For a more in-depth guide on how to set up the basic environment we'll be using in this tutorial, check out the [**Computer preparation**](../scraping_basics_javascript/06_computer_preparation.md) lesson in the **Web scraping basics for JavaScript devs** course

 ## Course overview {#course-overview}

sources/academy/webscraping/puppeteer_playwright/page/interacting_with_a_page.md

Lines changed: 1 addition & 1 deletion
@@ -55,7 +55,7 @@ With `page.click()`, Puppeteer and Playwright actually drag the mouse and click,

 Notice that in the Playwright example, we are using a different selector than in the Puppeteer example. This is because Playwright supports [many custom CSS selectors](https://playwright.dev/docs/other-locators#css-elements-matching-one-of-the-conditions), such as the **has-text** pseudo class. As a rule of thumb, using text selectors is much more preferable to using regular selectors, as they are much less likely to break. If Google makes the sibling above the **Accept all** button a `<div>` element instead of a `<button>` element, our `button + button` selector will break. However, the button will always have the text **Accept all**; therefore, `button:has-text("Accept all")` is more reliable.

-> If you're not already familiar with CSS selectors and how to find them, we recommend referring to [this lesson](../../scraping_basics_javascript/data_extraction/using_devtools.md) in the **Web scraping basics for JavaScript devs** course.
+> If you're not already familiar with CSS selectors and how to find them, we recommend referring to [this lesson](../../scraping_basics_javascript/04_using_devtools.md) in the **Web scraping basics for JavaScript devs** course.

 Then, we can type some text into an input field `<textarea>` with `page.type()`; passing a CSS selector as the first, and the string to input as the second parameter:
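
Put together, the click-then-type flow those context lines describe looks roughly like this in Playwright — the consent-button and search-box selectors are assumptions about Google's current markup:

```js
import { chromium } from 'playwright';

const browser = await chromium.launch({ headless: false });
const page = await browser.newPage();
await page.goto('https://www.google.com/');

// Text selector: keeps working as long as the button's label stays the same.
await page.click('button:has-text("Accept all")');

// page.type(selector, text) — CSS selector first, string to input second.
await page.type('textarea[name="q"]', 'web scraping');

await browser.close();
```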

sources/academy/webscraping/scraping_basics_javascript/01_introduction.md

Lines changed: 1 addition & 1 deletion
@@ -29,4 +29,4 @@ We use web scraping as an umbrella term for crawling, web data extraction and al

 ## Next up {#next}

-In the [next lesson](./data_extraction/index.md), you will learn about the basic building blocks of each web page. HTML, CSS and JavaScript.
+In the [next lesson](./02_data_extraction.md), you will learn about the basic building blocks of each web page. HTML, CSS and JavaScript.

sources/academy/webscraping/scraping_basics_javascript/02_data_extraction.md

Lines changed: 1 addition & 1 deletion
@@ -33,4 +33,4 @@ HTML and CSS give websites their structure and style, but they are static. To be

 ## Next up {#next}

-We will show you [how to use the browser DevTools](./browser_devtools.md) to inspect and interact with a web page.
+We will show you [how to use the browser DevTools](./03_browser_devtools.md) to inspect and interact with a web page.

sources/academy/webscraping/scraping_basics_javascript/03_browser_devtools.md

Lines changed: 2 additions & 2 deletions
@@ -14,7 +14,7 @@ Even though DevTools stands for developer tools, everyone can use them to inspec

 ## Elements tab {#elements-tab}

-When you first open Chrome DevTools on Wikipedia, you will start on the Elements tab (In Firefox it's called the **Inspector**). You can use this tab to inspect the page's HTML on the left hand side, and its CSS on the right. The items in the HTML view are called [**elements**](../../../glossary/concepts/html_elements.md).
+When you first open Chrome DevTools on Wikipedia, you will start on the Elements tab (In Firefox it's called the **Inspector**). You can use this tab to inspect the page's HTML on the left hand side, and its CSS on the right. The items in the HTML view are called [**elements**](../../glossary/concepts/html_elements.md).

 ![Elements tab in Chrome DevTools](./images/browser-devtools-elements-tab.png)

@@ -66,6 +66,6 @@ By changing HTML elements from the Console, you can change what's displayed on t

 ## Next up {#next}

-In this lesson, we learned the absolute basics of interaction with a page using the DevTools. In the [next lesson](./using_devtools.md), you will learn how to extract data from it. We will extract data about the on-sale products on the [Warehouse store](https://warehouse-theme-metal.myshopify.com).
+In this lesson, we learned the absolute basics of interaction with a page using the DevTools. In the [next lesson](./04_using_devtools.md), you will learn how to extract data from it. We will extract data about the on-sale products on the [Warehouse store](https://warehouse-theme-metal.myshopify.com).

 It isn't a real store, but a full-featured demo of a Shopify online store. And that is perfect for our purposes. Shopify is one of the largest e-commerce platforms in the world, and it uses all the latest technologies that a real e-commerce web application would use. Learning to scrape a Shopify store is useful, because you can immediately apply the learnings to millions of websites.

sources/academy/webscraping/scraping_basics_javascript/04_using_devtools.md

Lines changed: 4 additions & 4 deletions
@@ -8,7 +8,7 @@ slug: /web-scraping-for-beginners/data-extraction/using-devtools

 ---

-With the knowledge of the basics of DevTools we can finally try doing something more practical - extracting data from a website. Let's try collecting the on-sale products from the [Warehouse store](https://warehouse-theme-metal.myshopify.com/). We will use [CSS selectors](../../../glossary/concepts/css_selectors.md), JavaScript, and DevTools to achieve this task.
+With the knowledge of the basics of DevTools we can finally try doing something more practical - extracting data from a website. Let's try collecting the on-sale products from the [Warehouse store](https://warehouse-theme-metal.myshopify.com/). We will use [CSS selectors](../../glossary/concepts/css_selectors.md), JavaScript, and DevTools to achieve this task.

 > **Why use a Shopify demo and not a real e-commerce store like Amazon?** Because real websites are usually bulkier, littered with promotions, and they change very often. Many have multiple versions of pages, and you never know in advance which one you will get. It will be important to learn how to deal with these challenges in the future, but for this beginner course, we want to have a light and stable environment.
 >

@@ -38,7 +38,7 @@ Now that we know how the parent element looks, we can extract its data, includin

 ## Selecting elements in Console {#selecting-elements}

-We know how to find an element manually using the DevTools, but that's not very useful for automated scraping. We need to tell the computer how to find it as well. We can do that using JavaScript and CSS selectors. The function to do that is called [`document.querySelector()`](../../../glossary/concepts/querying_css_selectors.md) and it will find the first element in the page's HTML matching the provided [CSS selector](../../../glossary/concepts/css_selectors.md).
+We know how to find an element manually using the DevTools, but that's not very useful for automated scraping. We need to tell the computer how to find it as well. We can do that using JavaScript and CSS selectors. The function to do that is called [`document.querySelector()`](../../glossary/concepts/querying_css_selectors.md) and it will find the first element in the page's HTML matching the provided [CSS selector](../../glossary/concepts/css_selectors.md).

 For example `document.querySelector('div')` will find the first `<div>` element. And `document.querySelector('.my-class')` (notice the period `.`) will find the first element with the class `my-class`, such as `<div class="my-class">` or `<p class="my-class">`.

@@ -64,7 +64,7 @@ When we look more closely by hovering over the result in the Console, we find th

 ![Hover over a query result](./images/devtools-collection-query-hover.png)

-We need a different function: [`document.querySelectorAll()`](../../../glossary/concepts/querying_css_selectors.md) (notice the `All` at the end). This function does not find only the first element, but all the elements that match the provided selector.
+We need a different function: [`document.querySelectorAll()`](../../glossary/concepts/querying_css_selectors.md) (notice the `All` at the end). This function does not find only the first element, but all the elements that match the provided selector.

 Run the following function in the Console:

@@ -203,4 +203,4 @@ price.textContent.match(/((\d+,?)+.?(\d+)?)/)[0];

 ## Next up {#next}

-This concludes our lesson on extracting and cleaning data using DevTools. Using CSS selectors, we were able to find the HTML element that contains data about our favorite Sony subwoofer and then extract the data. In the [next lesson](./devtools_continued.md), we will learn how to extract information not only about the subwoofer, but about all the products on the page.
+This concludes our lesson on extracting and cleaning data using DevTools. Using CSS selectors, we were able to find the HTML element that contains data about our favorite Sony subwoofer and then extract the data. In the [next lesson](./05_devtools_continued.md), we will learn how to extract information not only about the subwoofer, but about all the products on the page.
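
The difference between the two functions referenced in these hunks is easy to see directly in the DevTools Console (the `.product-item` class is illustrative, not necessarily the Warehouse store's real markup):

```js
// querySelector() returns only the first matching element (or null).
const first = document.querySelector('.product-item');

// querySelectorAll() returns a NodeList of every matching element.
const all = document.querySelectorAll('.product-item');

console.log(all.length, first === all[0]); // e.g. "24 true"
```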

sources/academy/webscraping/scraping_basics_javascript/05_devtools_continued.md

Lines changed: 1 addition & 1 deletion
@@ -93,4 +93,4 @@ And third, we wrapped this data extraction logic in a **loop** to automatically

 And that's it! With a bit of trial and error, you will be able to extract data from any webpage that's loaded in your browser. This is a useful skill on its own. It will save you time copy-pasting stuff when you need data for a project.

-More importantly though, it taught you the basics to start programming your own scrapers. In the [next lessons](./computer_preparation.md), we will teach you how to create your own web data extraction script using JavaScript and Node.js.
+More importantly though, it taught you the basics to start programming your own scrapers. In the [next lessons](./06_computer_preparation.md), we will teach you how to create your own web data extraction script using JavaScript and Node.js.
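
The loop the hunk header alludes to follows a common Console pattern — iterate over every matched element and extract fields from each (the selectors below are placeholders, not the lesson's actual code):

```js
// Collect one object per product by looping over all matched elements.
const results = [];
for (const product of document.querySelectorAll('.product-item')) {
    results.push({
        title: product.querySelector('.product-item__title')?.textContent.trim(),
        price: product.querySelector('.price')?.textContent.trim(),
    });
}
console.log(results);
```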

sources/academy/webscraping/scraping_basics_javascript/06_computer_preparation.md

Lines changed: 1 addition & 1 deletion
@@ -63,4 +63,4 @@ You should see **Hello World** printed in your terminal. If you do, congratulati

 ## Next up {#next}

-You have your computer set up correctly for development, and you've run your first script. Great! In the [next lesson](./project_setup.md) we'll set up your project to download a website's HTML using Node.js instead of a browser.
+You have your computer set up correctly for development, and you've run your first script. Great! In the [next lesson](./07_project_setup.md) we'll set up your project to download a website's HTML using Node.js instead of a browser.
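
Per the hunk header, the first script prints **Hello World**, so it is presumably a one-liner along these lines (the file name is assumed):

```js
// hello.js — run with `node hello.js`; prints Hello World to the terminal.
console.log('Hello World');
```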

sources/academy/webscraping/scraping_basics_javascript/07_project_setup.md

Lines changed: 1 addition & 1 deletion
@@ -77,4 +77,4 @@ If you see **it works!** printed in your terminal, great job! You set up everyth

 ## Next up {#next}

-With the project set up, the [next lesson](./node_js_scraper.md) will show you how to use **got-scraping** to download the website's HTML and extract data from it with Cheerio.
+With the project set up, the [next lesson](./08_node_js_scraper.md) will show you how to use **got-scraping** to download the website's HTML and extract data from it with Cheerio.
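
The got-scraping + Cheerio flow that the changed line describes reduces to a few lines; a minimal sketch, where the target URL and selector are assumptions:

```js
import { gotScraping } from 'got-scraping';
import * as cheerio from 'cheerio';

// Download the page's HTML with browser-like headers, then parse it.
const { body } = await gotScraping('https://warehouse-theme-metal.myshopify.com/');
const $ = cheerio.load(body);
console.log($('title').text());
```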
