@ziadhany (Collaborator) commented Oct 7, 2025

No description provided.

@keshav-space (Member) left a comment

Thanks @ziadhany, see some suggestions below.

PACKAGE_BATCH_SIZE = 500
COMMIT_BATCH_SIZE = 10

BATCH_SIZE = 1000

We don't need to declare this globally.

COMMIT_BATCH_SIZE = 10

BATCH_SIZE = 1000
CARGO_CHECKPOINT_PATH = "cargo/checkpoints.json"

This checkpoint is no longer needed.

purl_files = []
purls = []

for file_path in base_path.rglob("*"):

Doing a blanket rglob can be problematic: it will also iterate over the .git directory.
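One way to avoid descending into .git is to walk the tree and prune excluded directories before they are entered. The sketch below is only an illustration of that idea, not the change made in this PR; `iter_files` and `EXCLUDED_DIRS` are hypothetical names.

```python
import os
from pathlib import Path

# Hypothetical set of directory names to skip; adjust as needed.
EXCLUDED_DIRS = {".git"}


def iter_files(base_path, excluded_dirs=EXCLUDED_DIRS):
    """Yield files under base_path without descending into excluded directories."""
    for root, dirs, files in os.walk(base_path):
        # Editing dirs in place stops os.walk from entering those directories.
        dirs[:] = [d for d in dirs if d not in excluded_dirs]
        for name in sorted(files):
            yield Path(root) / name
```

Unlike filtering the results of `rglob("*")` afterwards, pruning means the contents of .git are never visited at all, which matters on repositories with a large object store.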

}:
continue

logger(f"Processing file: {file_path}")

This is not helpful; it will simply flood our log.

result = store_cargo_packages(packages, cloned_data_repo)
if result:
    purl_file, base_purl = result
    logger(f"Writing package URLs for package '{base_purl}' to {purl_file}")

This log is not helpful.

purl_files = []
purls = []

for file_path in base_path.rglob("*"):

Also, we should use LoopProgress when mining purls from resources, so that we get proper progress indication.
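To illustrate the idea of logging progress at intervals rather than once per file, here is a minimal sketch. `SimpleProgress` is a hypothetical stand-in written for this example; it is not the project's LoopProgress class, whose exact signature is not shown in this thread.

```python
class SimpleProgress:
    """Hypothetical stand-in for a LoopProgress-style helper (not the real class)."""

    def __init__(self, total, logger, step=10):
        self.total = total
        self.logger = logger  # any callable that accepts a message string
        self.step = step  # report once every `step` percent

    def iter(self, iterable):
        last_reported = 0
        for index, item in enumerate(iterable, start=1):
            yield item
            percent = index * 100 // self.total if self.total else 100
            if percent >= last_reported + self.step:
                # Round down to the nearest multiple of step before reporting.
                last_reported = percent - percent % self.step
                self.logger(f"Progress: {last_reported}% ({index}/{self.total})")
```

A loop such as `for file_path in SimpleProgress(len(files), logger).iter(files):` then emits at most a handful of log lines for the whole run, instead of one line per processed file.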

@keshav-space (Member) commented

@ziadhany see the changes I pushed in cc34321 and 6162198. Can you run this locally and see how it performs?
