Pijukatel
Collaborator

Description

Split BrowserType literal into two different literals based on context.

Playwright and browser fingerprints are two similar but distinct contexts, each with its own browser names:

In Playwright: 'chromium', 'firefox', 'webkit'
In the browser fingerprint context: 'chrome', 'firefox', 'safari', 'edge'

This avoids some confusion and replaces implicit string manipulation with an explicit name mapping between the two literals.
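
As a rough illustration of the split described above, the two literals and an explicit mapping between them might look like the sketch below. The type alias names, the dictionary, and the function name `to_fingerprint_browser_type` are hypothetical and chosen for this example only; the actual code in the PR may be structured differently.

```python
from typing import Literal

# Hypothetical alias names; only the literal values come from the PR description.
PlaywrightBrowserType = Literal['chromium', 'firefox', 'webkit']
FingerprintBrowserType = Literal['chrome', 'firefox', 'safari', 'edge']

# Explicit name mapping instead of implicit string manipulation.
# 'edge' has no Playwright counterpart in this sketch, so it never appears as a value here.
_PLAYWRIGHT_TO_FINGERPRINT: dict[PlaywrightBrowserType, FingerprintBrowserType] = {
    'chromium': 'chrome',
    'firefox': 'firefox',
    'webkit': 'safari',
}


def to_fingerprint_browser_type(
    playwright_browser_type: PlaywrightBrowserType,
) -> FingerprintBrowserType:
    """Translate a Playwright browser name into the fingerprint context (illustrative only)."""
    try:
        return _PLAYWRIGHT_TO_FINGERPRINT[playwright_browser_type]
    except KeyError:
        # Fail loudly on unknown names rather than guessing a mapping.
        raise ValueError(f'Unsupported Playwright browser type: {playwright_browser_type!r}') from None
```

Keeping the mapping in one dictionary means a new browser name has to be added in exactly one place, and an unknown name raises an error instead of silently passing through.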
@github-actions github-actions bot added this to the 110th sprint - Tooling team milestone Mar 11, 2025
@github-actions github-actions bot added t-tooling Issues with this label are in the ownership of the tooling team. tested Temporary label used only programmatically for some analytics. labels Mar 11, 2025
@Pijukatel Pijukatel requested review from vdusek and B4nan March 11, 2025 09:48
@Pijukatel Pijukatel added the adhoc Ad-hoc unplanned task added during the sprint. label Mar 13, 2025
@Pijukatel
Collaborator Author

This is a slightly breaking change, so it should probably wait until more breaking changes accumulate.

@Pijukatel
Collaborator Author

A new release is coming, so let's add this change now.

@Pijukatel Pijukatel marked this pull request as ready for review June 19, 2025 12:38
Collaborator

@vdusek vdusek left a comment

Looks good, could you please resolve conflicts?

@Pijukatel Pijukatel requested a review from vdusek July 2, 2025 14:42
@B4nan B4nan requested review from Copilot and barjin and removed request for B4nan July 2, 2025 17:31
Copilot

This comment was marked as outdated.

@Pijukatel Pijukatel requested a review from Copilot July 3, 2025 07:52
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR refactors the BrowserType literal in the fingerprinting context, replacing Playwright-centric values (chromium, webkit) with explicit fingerprinting values (chrome, safari). It introduces a mapping function to translate Playwright browser types into the new literal set and propagates these changes through the header generator, adapter, crawler code, tests, and docs.

  • Updated SupportedBrowserType and related constants to use ['chrome', 'firefox', 'safari', 'edge']
  • Added fingerprint_browser_type_from_playwright_browser_type and applied it throughout header and crawler code
  • Adjusted tests and documentation to reference the new browser type literals

Reviewed Changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/crawlee/fingerprint_suite/_types.py | Changed SupportedBrowserType literal values |
| src/crawlee/fingerprint_suite/_header_generator.py | Added mapping function and updated default browser_type |
| src/crawlee/fingerprint_suite/_consts.py | Updated BROWSER_TYPE_HEADER_KEYWORD keys |
| src/crawlee/fingerprint_suite/_browserforge_adapter.py | Updated adapter logic/comments to use chrome/safari |
| src/crawlee/crawlers/_playwright/_playwright_crawler.py | Mapped Playwright browser_type to fingerprint context |
| src/crawlee/browsers/_playwright_browser_controller.py | Mapped Playwright browser_type in header generator call |
| tests/unit/fingerprint_suite/test_header_generator.py | Updated parameterized tests for new browser types |
| tests/unit/fingerprint_suite/test_adapters.py | Added test for PatchedHeaderGenerator with various input types |
| tests/unit/crawlers/_playwright/test_playwright_crawler.py | Adjusted tests to use mapping function and new browser types |
| docs/upgrading/upgrading_to_v1.md | Documented breaking change in browser type literals |
| docs/examples/code_examples/playwright_crawler_with_fingerprint_generator.py | Updated example to use 'chrome' |
Comments suppressed due to low confidence (3)

src/crawlee/fingerprint_suite/_browserforge_adapter.py:86

  • The inline comment still refers to chromium; for consistency with the updated literal set, please change this to chrome.
            # Increase max attempts as from `BrowserForge` header generator perspective even `chromium`

src/crawlee/fingerprint_suite/_header_generator.py:13

  • [nitpick] Consider adding a short docstring for this helper to explain that it maps Playwright browser literals ('chromium', 'firefox', 'webkit') into the fingerprinting context ('chrome', 'firefox', 'safari').
def fingerprint_browser_type_from_playwright_browser_type(

src/crawlee/fingerprint_suite/_header_generator.py:13

  • The new mapping function isn’t covered by a direct unit test. Consider adding tests for all three Playwright inputs ('chromium', 'firefox', 'webkit') and their expected outputs ('chrome', 'firefox', 'safari').
def fingerprint_browser_type_from_playwright_browser_type(
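
A direct test of the kind suggested in the comment above could be sketched as follows. `map_browser_type` is a hypothetical stand-in defined inline so the example runs on its own; a real test would instead import `fingerprint_browser_type_from_playwright_browser_type` from the package, and the alias name `PlaywrightBrowserType` is likewise an assumption.

```python
from typing import Literal, get_args

# Literal values as listed in the review comment; the alias names are assumptions.
PlaywrightBrowserType = Literal['chromium', 'firefox', 'webkit']
SupportedBrowserType = Literal['chrome', 'firefox', 'safari', 'edge']


def map_browser_type(playwright_browser_type: str) -> str:
    """Hypothetical stand-in for the PR's helper; the real implementation may differ."""
    mapping = {'chromium': 'chrome', 'firefox': 'firefox', 'webkit': 'safari'}
    return mapping[playwright_browser_type]


# Expected input/output pairs, taken from the review comment.
EXPECTED = {'chromium': 'chrome', 'firefox': 'firefox', 'webkit': 'safari'}


def test_every_playwright_browser_type_is_mapped() -> None:
    # Exhaustiveness: the table must cover every Playwright literal value.
    assert set(EXPECTED) == set(get_args(PlaywrightBrowserType))
    for playwright_name, fingerprint_name in EXPECTED.items():
        result = map_browser_type(playwright_name)
        assert result == fingerprint_name
        # The output must be a valid member of the fingerprint literal set.
        assert result in get_args(SupportedBrowserType)
```

Checking the table against `get_args` makes the test fail if a new Playwright browser name is added to the literal without a corresponding mapping entry.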

Collaborator

@vdusek vdusek left a comment

LGTM

@Pijukatel Pijukatel merged commit 72b5698 into master Jul 3, 2025
35 of 36 checks passed
@Pijukatel Pijukatel deleted the browser-types branch July 3, 2025 14:49
2 participants