[Inference Snippet] Add a directRequest option (false by default) #1516
Fix after #1514.
Now that we use a placeholder for the access token so it can be loaded from the environment, there is no direct way to explicitly generate a snippet for either a "direct request" or a "routed request" (determined here using `accessToken.startsWith("hf_")`). This PR adds a `directRequest?: boolean;` option to the parameters, which solves this problem.

Will require a follow-up PR in moon-landing.
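
A minimal sketch of the idea, not the actual implementation in this PR: only `directRequest` and the `accessToken.startsWith("hf_")` check come from the description above. The `SnippetOptions` interface, the `RequestMode` type, both function names, the placeholder string, and the assumption that an `hf_` token maps to a routed request are hypothetical and used for illustration only.

```typescript
// Hypothetical option shape; only `directRequest` is named in this PR.
interface SnippetOptions {
  accessToken?: string;
  directRequest?: boolean; // false by default
}

type RequestMode = "direct" | "routed";

// Earlier heuristic: infer the mode from the token prefix.
// Assumption: an "hf_" token means the request is routed.
function modeFromToken(accessToken: string): RequestMode {
  return accessToken.startsWith("hf_") ? "routed" : "direct";
}

// Explicit option: the mode no longer depends on what the token looks like,
// so a placeholder token cannot flip the result.
function resolveRequestMode(opts: SnippetOptions = {}): RequestMode {
  return opts.directRequest === true ? "direct" : "routed";
}

// Hypothetical placeholder standing in for a token loaded from the environment.
const placeholderToken = "<token loaded from env>";

console.log(modeFromToken(placeholderToken)); // heuristic misfires on the placeholder
console.log(resolveRequestMode({ accessToken: placeholderToken })); // routed by default
console.log(resolveRequestMode({ accessToken: placeholderToken, directRequest: true }));
```

With an explicit flag, the caller states the intent and the token value is used only for authentication, which is why the default can safely stay `false`.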
cc @SBrandeis who found out the root cause
Expected behavior
Display a routed request by default in https://huggingface.co/deepseek-ai/DeepSeek-R1-0528?inference_api=true&inference_provider=fireworks-ai&language=sh