feat: use native impit streaming #2833

Merged
merged 7 commits into from
Feb 12, 2025

Conversation

barjin
Contributor

@barjin barjin commented Feb 8, 2025

Bumps the impit dependency and required Node version for the @crawlee/impit-client package. Makes use of the new native ReadableStream interface.

Related to #2756

@barjin barjin self-assigned this Feb 8, 2025
@github-actions github-actions bot added this to the 107th sprint - Tooling team milestone Feb 8, 2025
@github-actions github-actions bot added the t-tooling Issues with this label are in the ownership of the tooling team. label Feb 8, 2025
@barjin
Contributor Author

barjin commented Feb 8, 2025

The initial implementation shows slight discrepancies in the types exported from the impit package, requiring awkward casts on this side.

Those problems should be fixed in impit while we can still make "breaking" changes to its API.

Edit: regular fetch API seems to have the same type problems (e.g. passing response.body to Readable.fromWeb causes the same type of discrepancies).
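
For illustration, here is a minimal sketch of the fetch-side case described in the edit above. The helper name `bodyToText` is hypothetical and is not part of this PR; the double cast through `unknown` is an assumption chosen so that the snippet compiles regardless of which `ReadableStream` declaration the compiler picks up.

```typescript
import { Readable } from 'node:stream';
import type { ReadableStream as NodeWebReadableStream } from 'node:stream/web';

// Hypothetical helper: reads a fetch-style Response body through a Node.js Readable.
async function bodyToText(response: Response): Promise<string> {
    if (response.body === null) return '';

    // `response.body` is typed against the global ReadableStream declaration, while
    // Readable.fromWeb expects the one from node:stream/web. At runtime both are the
    // same object, so the cast only bridges the two type declarations.
    const nodeStream = Readable.fromWeb(response.body as unknown as NodeWebReadableStream<Uint8Array>);

    const chunks: Buffer[] = [];
    for await (const chunk of nodeStream) {
        chunks.push(Buffer.from(chunk));
    }
    return Buffer.concat(chunks).toString('utf8');
}
```

The same pattern should apply to any response whose `body` is a web `ReadableStream`, assuming a Node version that ships the global `Response` class.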

@barjin barjin added the adhoc Ad-hoc unplanned task added during the sprint. label Feb 10, 2025
@barjin barjin marked this pull request as ready for review February 10, 2025 13:57
@barjin barjin requested a review from janbuchar February 10, 2025 13:57
Contributor

@janbuchar janbuchar left a comment

Love it! Resolve my comments at will and feel free to merge.

private getStreamWithProgress(
    response: ImpitResponse,
): [Readable, () => { percent: number; transferred: number; total: number }] {
    const responseStream = Readable.fromWeb(response.body as ReadableStream<any>);
Contributor

Is the cast necessary? ImpitResponse['body'] looks like it's typed pretty well.

Contributor Author

Yup, unfortunately. Readable.fromWeb accepts the ReadableStream from node:stream/web, but impit.fetch (and, more importantly, Node's native fetch) returns a ReadableStream whose type resolves to the declarations in TS's lib.dom.d.ts. There is a slight discrepancy somewhere deep between those two types:

[image]

There may be a better solution somewhere, but given the (IMO small) size of this issue, a cast solves this just fine.
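
One possible cast-free alternative, sketched here purely as an assumption and not as what this PR does: read the web stream through `getReader()` and hand the resulting async generator to `Readable.from`. The names `readChunks` and `webToNodeReadable` are hypothetical.

```typescript
import { Readable } from 'node:stream';

// Hypothetical generator: pulls chunks from a web ReadableStream one at a time.
async function* readChunks(body: ReadableStream<Uint8Array>): AsyncGenerator<Uint8Array> {
    const reader = body.getReader();
    try {
        while (true) {
            const result = await reader.read();
            if (result.done) return;
            yield result.value;
        }
    } finally {
        // Releases the lock only; this does not cancel the underlying stream.
        reader.releaseLock();
    }
}

// Hypothetical helper: only uses methods present in both ReadableStream declarations,
// so no type cast is needed.
function webToNodeReadable(body: ReadableStream<Uint8Array>): Readable {
    return Readable.from(readChunks(body));
}
```

This trades the one-line cast for extra code and leaves `Readable.fromWeb`'s built-in handling out of the picture, so for an issue this small the cast is arguably still the simpler choice.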

@barjin barjin merged commit af2fe23 into master Feb 12, 2025
9 checks passed
@barjin barjin deleted the feat/impit-streaming branch February 12, 2025 13:35