Logs from browserforge #423

Open
vdusek opened this issue Mar 4, 2025 · 4 comments
Labels
bug Something isn't working. t-tooling Issues with this label are in the ownership of the tooling team.

Comments

@vdusek
Contributor

vdusek commented Mar 4, 2025

Before

  • Before browserforge (Crawlee < 0.6)
2025-03-04T09:35:42.108Z ACTOR: Pulling Docker image of build vbJ3dgYm28POtiN1j from repository.
2025-03-04T09:35:42.311Z ACTOR: Creating Docker container.
2025-03-04T09:35:42.358Z ACTOR: Starting Docker container.
2025-03-04T09:35:44.092Z [apify] INFO  Initializing Actor...
2025-03-04T09:35:44.094Z [apify] INFO  System info ({"apify_sdk_version": "2.3.1", "apify_client_version": "1.9.2", "crawlee_version": "0.6.1", "python_version": "3.13.2", "os": "linux"})
2025-03-04T09:35:44.119Z [apify] DEBUG Debug message
2025-03-04T09:35:44.123Z [apify] INFO  Info message
2025-03-04T09:35:44.124Z [apify] WARN  Warning message
2025-03-04T09:35:44.126Z [apify] ERROR Error message
2025-03-04T09:35:44.128Z [apify] ERROR Exception message
2025-03-04T09:35:44.131Z       Traceback (most recent call last):
2025-03-04T09:35:44.133Z         File "/usr/src/app/src/main.py", line 25, in main
2025-03-04T09:35:44.134Z           raise ValueError('Dummy ValueError')
2025-03-04T09:35:44.136Z       ValueError: Dummy ValueError
2025-03-04T09:35:44.139Z [apify] INFO  Multi
2025-03-04T09:35:44.141Z line
2025-03-04T09:35:44.142Z log
2025-03-04T09:35:44.144Z message
2025-03-04T09:35:44.146Z [apify] ERROR Actor failed with an exception
2025-03-04T09:35:44.147Z       Traceback (most recent call last):
2025-03-04T09:35:44.149Z         File "/usr/src/app/src/main.py", line 33, in main
2025-03-04T09:35:44.150Z           raise RuntimeError('Dummy RuntimeError')
2025-03-04T09:35:44.152Z       RuntimeError: Dummy RuntimeError
2025-03-04T09:35:44.154Z [apify] INFO  Exiting Actor ({"exit_code": 91})

After

  • With browserforge (Crawlee ~= 0.6)
2025-03-04T09:35:42.108Z ACTOR: Pulling Docker image of build vbJ3dgYm28POtiN1j from repository.
2025-03-04T09:35:42.311Z ACTOR: Creating Docker container.
2025-03-04T09:35:42.358Z ACTOR: Starting Docker container.
2025-03-04T09:35:43.463Z Downloading model definition files...
2025-03-04T09:35:43.725Z input-network.zip             OK!
2025-03-04T09:35:43.727Z browser-helper-file.json      OK!
2025-03-04T09:35:43.729Z header-network.zip            OK!
2025-03-04T09:35:43.731Z headers-order.json            OK!
2025-03-04T09:35:43.732Z fingerprint-network.zip       OK!
2025-03-04T09:35:44.092Z [apify] INFO  Initializing Actor...
2025-03-04T09:35:44.094Z [apify] INFO  System info ({"apify_sdk_version": "2.3.1", "apify_client_version": "1.9.2", "crawlee_version": "0.6.1", "python_version": "3.13.2", "os": "linux"})
2025-03-04T09:35:44.119Z [apify] DEBUG Debug message
2025-03-04T09:35:44.123Z [apify] INFO  Info message
2025-03-04T09:35:44.124Z [apify] WARN  Warning message
2025-03-04T09:35:44.126Z [apify] ERROR Error message
2025-03-04T09:35:44.128Z [apify] ERROR Exception message
2025-03-04T09:35:44.131Z       Traceback (most recent call last):
2025-03-04T09:35:44.133Z         File "/usr/src/app/src/main.py", line 25, in main
2025-03-04T09:35:44.134Z           raise ValueError('Dummy ValueError')
2025-03-04T09:35:44.136Z       ValueError: Dummy ValueError
2025-03-04T09:35:44.139Z [apify] INFO  Multi
2025-03-04T09:35:44.141Z line
2025-03-04T09:35:44.142Z log
2025-03-04T09:35:44.144Z message
2025-03-04T09:35:44.146Z [apify] ERROR Actor failed with an exception
2025-03-04T09:35:44.147Z       Traceback (most recent call last):
2025-03-04T09:35:44.149Z         File "/usr/src/app/src/main.py", line 33, in main
2025-03-04T09:35:44.150Z           raise RuntimeError('Dummy RuntimeError')
2025-03-04T09:35:44.152Z       RuntimeError: Dummy RuntimeError
2025-03-04T09:35:44.154Z [apify] INFO  Exiting Actor ({"exit_code": 91})

Result

  • As a result, the test_actor_logging test is skipped.

Possible solutions

  • Remove the browserforge output and remove the skip flag from test_actor_logging.
  • Leave it as it is and remove (or update) the test_actor_logging test.
    • If the test is updated, note that the line order of the output is not guaranteed.
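
If the test is updated rather than removed, one option is to drop the browserforge lines from the captured log before comparing it with the expected output. The following is only a minimal sketch under assumptions: the helper name `strip_browserforge_lines` is hypothetical, and the patterns are derived solely from the "After" log quoted above, so they may not cover every line browserforge can print. Because each line is checked on its own, the filter does not depend on where the browserforge lines appear.

```python
import re

# Timestamp prefix as it appears in the platform logs quoted above.
_TIMESTAMP = r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}Z "

# Lines printed while the model files are fetched (assumption: only the two
# shapes seen in the "After" log, i.e. the header line and "<file> OK!").
_BROWSERFORGE_LINE = re.compile(
    rf"^(?:{_TIMESTAMP})?"
    r"(?:Downloading model definition files\.\.\.|\S+\.(?:zip|json)\s+OK!)$"
)


def strip_browserforge_lines(log_lines: list[str]) -> list[str]:
    """Return the log without the browserforge download lines."""
    return [line for line in log_lines if not _BROWSERFORGE_LINE.match(line)]


sample = [
    "2025-03-04T09:35:42.358Z ACTOR: Starting Docker container.",
    "2025-03-04T09:35:43.463Z Downloading model definition files...",
    "2025-03-04T09:35:43.725Z input-network.zip             OK!",
    "2025-03-04T09:35:44.092Z [apify] INFO  Initializing Actor...",
]
filtered = strip_browserforge_lines(sample)
```

With the sample above, only the two browserforge lines are dropped and the remaining lines keep their relative order.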
@github-actions github-actions bot added the t-tooling Issues with this label are in the ownership of the tooling team. label Mar 4, 2025
@Pijukatel Pijukatel self-assigned this Mar 4, 2025
@vdusek
Contributor Author

vdusek commented Mar 4, 2025

It is important to note that during every Actor run, zip files are now downloaded from the apify/fingerprint-suite repository.

@Erol444

Erol444 commented Mar 8, 2025

Not sure if related, but I get this error every time I run an actor:

ACTOR: Pulling Docker image of build {MY_ID} from repository.
2025-03-08T12:46:21.303Z ACTOR: Creating Docker container.
2025-03-08T12:46:21.387Z ACTOR: Starting Docker container.
2025-03-08T12:46:23.361Z Downloading model definition files...
2025-03-08T12:46:23.615Z Error downloading fingerprint-network.zip: [Errno 13] Permission denied: '/usr/local/lib/python3.13/site-packages/browserforge/fingerprints/data/fingerprint-network.zip'
2025-03-08T12:46:23.618Z Error downloading fingerprint-network.zip: [Errno 13] Permission denied: '/usr/local/lib/python3.13/site-packages/browserforge/fingerprints/data/fingerprint-network.zip'
2025-03-08T12:46:23.620Z Error downloading fingerprint-network.zip: [Errno 13] Permission denied: '/usr/local/lib/python3.13/site-packages/browserforge/fingerprints/data/fingerprint-network.zip'
2025-03-08T12:46:23.622Z Error downloading fingerprint-network.zip: [Errno 13] Permission denied: '/usr/local/lib/python3.13/site-packages/browserforge/fingerprints/data/fingerprint-network.zip'
2025-03-08T12:46:23.625Z Error downloading fingerprint-network.zip: [Errno 13] Permission denied: '/usr/local/lib/python3.13/site-packages/browserforge/fingerprints/data/fingerprint-network.zip'
2025-03-08T12:46:23.627Z Downloading model definition files...
2025-03-08T12:46:23.645Z Error downloading input-network.zip: [Errno 13] Permission denied: '/usr/local/lib/python3.13/site-packages/browserforge/headers/data/input-network.zip'
2025-03-08T12:46:23.648Z Error downloading input-network.zip: [Errno 13] Permission denied: '/usr/local/lib/python3.13/site-packages/browserforge/headers/data/input-network.zip'
2025-03-08T12:46:23.650Z Error downloading input-network.zip: [Errno 13] Permission denied: '/usr/local/lib/python3.13/site-packages/browserforge/headers/data/input-network.zip'
2025-03-08T12:46:23.652Z Error downloading input-network.zip: [Errno 13] Permission denied: '/usr/local/lib/python3.13/site-packages/browserforge/headers/data/input-network.zip'
2025-03-08T12:46:23.658Z Traceback (most recent call last):
2025-03-08T12:46:23.667Z   File "<frozen runpy>", line 198, in _run_module_as_main
2025-03-08T12:46:23.669Z   File "<frozen runpy>", line 88, in _run_code
2025-03-08T12:46:23.671Z   File "/usr/src/app/src/__main__.py", line 3, in <module>
2025-03-08T12:46:23.674Z     from .main import main
2025-03-08T12:46:23.676Z   File "/usr/src/app/src/main.py", line 10, in <module>
2025-03-08T12:46:23.678Z     from apify import Actor, Request
2025-03-08T12:46:23.680Z   File "/usr/local/lib/python3.13/site-packages/apify/__init__.py", line 15, in <module>
2025-03-08T12:46:23.682Z     from apify._actor import Actor
2025-03-08T12:46:23.684Z   File "/usr/local/lib/python3.13/site-packages/apify/_actor.py", line 28, in <module>
2025-03-08T12:46:23.686Z     from apify._charging import ChargeResult, ChargingManager, ChargingManagerImplementation
2025-03-08T12:46:23.689Z   File "/usr/local/lib/python3.13/site-packages/apify/_charging.py", line 17, in <module>
2025-03-08T12:46:23.692Z     from apify.storages import Dataset
2025-03-08T12:46:23.694Z   File "/usr/local/lib/python3.13/site-packages/apify/storages/__init__.py", line 3, in <module>
2025-03-08T12:46:23.698Z     from ._request_list import RequestList
2025-03-08T12:46:23.701Z   File "/usr/local/lib/python3.13/site-packages/apify/storages/_request_list.py", line 13, in <module>
2025-03-08T12:46:23.703Z     from crawlee.http_clients import HttpClient, HttpxHttpClient
2025-03-08T12:46:23.705Z   File "/usr/local/lib/python3.13/site-packages/crawlee/http_clients/__init__.py", line 6, in <module>
2025-03-08T12:46:23.707Z     from ._httpx import HttpxHttpClient
2025-03-08T12:46:23.710Z   File "/usr/local/lib/python3.13/site-packages/crawlee/http_clients/_httpx.py", line 13, in <module>
2025-03-08T12:46:23.712Z     from crawlee.fingerprint_suite import HeaderGenerator
2025-03-08T12:46:23.714Z   File "/usr/local/lib/python3.13/site-packages/crawlee/fingerprint_suite/__init__.py", line 1, in <module>
2025-03-08T12:46:23.717Z     from ._browserforge_adapter import BrowserforgeFingerprintGenerator as DefaultFingerprintGenerator
2025-03-08T12:46:23.732Z   File "/usr/local/lib/python3.13/site-packages/crawlee/fingerprint_suite/_browserforge_adapter.py", line 10, in <module>
2025-03-08T12:46:23.734Z     from browserforge.fingerprints import Fingerprint as bf_Fingerprint
2025-03-08T12:46:23.736Z   File "/usr/local/lib/python3.13/site-packages/browserforge/fingerprints/__init__.py", line 5, in <module>
2025-03-08T12:46:23.738Z     from browserforge.headers import Browser
2025-03-08T12:46:23.741Z   File "/usr/local/lib/python3.13/site-packages/browserforge/headers/__init__.py", line 5, in <module>
2025-03-08T12:46:23.743Z     from .generator import Browser, HeaderGenerator
2025-03-08T12:46:23.745Z   File "/usr/local/lib/python3.13/site-packages/browserforge/headers/generator.py", line 80, in <module>
2025-03-08T12:46:23.747Z     class HeaderGenerator:
2025-03-08T12:46:23.750Z     ...<470 lines>...
2025-03-08T12:46:23.752Z             )
2025-03-08T12:46:23.756Z   File "/usr/local/lib/python3.13/site-packages/browserforge/headers/generator.py", line 86, in HeaderGenerator
2025-03-08T12:46:23.759Z     input_generator_network = BayesianNetwork(DATA_DIR / "input-network.zip")
2025-03-08T12:46:23.761Z   File "/usr/local/lib/python3.13/site-packages/browserforge/bayesian_network.py", line 103, in __init__
2025-03-08T12:46:23.763Z     network_definition = extract_json(path)
2025-03-08T12:46:23.766Z   File "/usr/local/lib/python3.13/site-packages/browserforge/bayesian_network.py", line 288, in extract_json
2025-03-08T12:46:23.768Z     with zipfile.ZipFile(path, 'r') as zf:
2025-03-08T12:46:23.770Z          ~~~~~~~~~~~~~~~^^^^^^^^^^^
2025-03-08T12:46:23.772Z   File "/usr/local/lib/python3.13/zipfile/__init__.py", line 1367, in __init__
2025-03-08T12:46:23.774Z     self.fp = io.open(file, filemode)
2025-03-08T12:46:23.776Z               ~~~~~~~^^^^^^^^^^^^^^^^
2025-03-08T12:46:23.779Z FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/lib/python3.13/site-packages/browserforge/headers/data/input-network.zip'
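
One reading of the log above is that the download cannot write into the site-packages data directory (Errno 13), so the later import fails because the file was never created (Errno 2). A small diagnostic like the one below could confirm that inside the container. It is a sketch only: the helper name `describe_data_file` is hypothetical, and the path in the usage note is copied from the traceback.

```python
import os
from pathlib import Path


def describe_data_file(path: Path) -> dict[str, bool]:
    """Report whether a data file exists and whether its directory is writable."""
    parent = path.parent
    return {
        "file_exists": path.is_file(),
        "dir_exists": parent.is_dir(),
        # os.access checks permissions for the user running this process.
        "dir_writable": parent.is_dir() and os.access(parent, os.W_OK),
    }
```

For example, calling `describe_data_file(Path("/usr/local/lib/python3.13/site-packages/browserforge/headers/data/input-network.zip"))` in the Actor image would show whether the directory is writable for the user the container runs as.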

@Erol444

Erol444 commented Mar 8, 2025

@vdusek you should look into this ASAP: new Python Actors use apify 2.4.0 by default, which throws the error above when the Actor starts on the Apify platform. Thanks to this issue, I pinned apify==2.3.0 in the Actor's requirements.txt, which fixes the error (I was messing with the Dockerfile at first).

@Pijukatel
Contributor

Pijukatel commented Mar 10, 2025

Proposed solution:

  • Create a new PyPI package, apify_fingerprint_datapoints: feat: Fingerprint datapoints python package fingerprint-suite#353
  • Change browserforge so that it does not download any files and instead explicitly imports them from the newly created apify_fingerprint_datapoints package. The PR is to be created from this branch after the package is published to PyPI.
  • Update all Apify requirements constraints to depend on the latest browserforge version.
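
To illustrate the second point, data files that ship inside an installed package can be located with the standard library instead of being downloaded at import time. This is a rough sketch, not the actual browserforge change: the helper name `packaged_data_file` is hypothetical, and the layout and API of apify_fingerprint_datapoints are not described in this thread, so the example call in the comment is an assumption.

```python
from importlib.resources import files


def packaged_data_file(package: str, filename: str):
    """Return a handle to a data file shipped inside an installed package."""
    resource = files(package).joinpath(filename)
    if not resource.is_file():
        raise FileNotFoundError(f"{filename!r} is not shipped in package {package!r}")
    return resource


# Hypothetical usage, assuming the new package ships the file at its top level:
# network_zip = packaged_data_file("apify_fingerprint_datapoints", "input-network.zip")
# data = network_zip.read_bytes()
```

Because the file is read from the installed package, nothing has to be written to site-packages at runtime and nothing is printed during import.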
