[JS] feat: File download support, vision support, and vision sample #1018

Merged
corinagum merged 16 commits into main from stevenic/gpt-vision on Dec 6, 2023

Conversation

corinagum (Collaborator) commented Dec 6, 2023

Linked issues

closes: #912
fixes: #786, #399

NOTE

This PR is identical to #913. The only difference is the person creating the PR.

Details

This change adds two key new features to the AI library:

  • A new file downloader feature lets developers register file downloader plugins that can download files relative to the user's input. A TeamsAttachmentDownloader implementation is provided, which simplifies downloading files uploaded to a bot.
  • GPT Vision support is added so that downloaded image files can be sent to GPT as part of the user's input.
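
As a rough illustration of the plugin model described above, the sketch below runs a list of registered downloaders over a message's attachments and gathers the results. All type and function names here (`InputFile` fields, `Attachment`, `FileDownloader`, `ImageDownloader`, `collectInputFiles`) are assumptions made for this example and are not taken from the library; the fetch function is injected so the sketch does not depend on any particular HTTP client.

```typescript
// Hypothetical shapes for illustration only; the library's real types may differ.
interface InputFile {
    content: Uint8Array;
    contentType: string;
    contentUrl?: string;
}

interface Attachment {
    contentType: string;
    contentUrl: string;
}

// Assumed plugin contract: given the attachments, return the files it could download.
interface FileDownloader {
    downloadFiles(attachments: Attachment[]): Promise<InputFile[]>;
}

// A stand-in downloader that only handles image attachments.
class ImageDownloader implements FileDownloader {
    constructor(private readonly fetchBytes: (url: string) => Promise<Uint8Array>) {}

    async downloadFiles(attachments: Attachment[]): Promise<InputFile[]> {
        const images = attachments.filter((a) => a.contentType.startsWith('image/'));
        return Promise.all(
            images.map(async (a) => ({
                content: await this.fetchBytes(a.contentUrl),
                contentType: a.contentType,
                contentUrl: a.contentUrl
            }))
        );
    }
}

// Runs every registered downloader in order and merges their results,
// roughly what would end up in a variable like state.temp.input_files.
async function collectInputFiles(
    downloaders: FileDownloader[],
    attachments: Attachment[]
): Promise<InputFile[]> {
    const files: InputFile[] = [];
    for (const downloader of downloaders) {
        files.push(...(await downloader.downloadFiles(attachments)));
    }
    return files;
}
```

Keeping each downloader behind a small interface means a bot could register several of them (for example one per attachment source) without the caller needing to know which one produced a given file.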

This PR also fixes a couple of prompt-related bugs that were identified in the process.

Change details

  • Fixed bugs where prompts did not use the override options defined in a prompt's config.json file.
  • Added a new InputFileDownloader plugin model to the Application class.
  • Added a TeamsAttachmentDownloader class for automatically downloading files uploaded to the bot.
  • Added a new state.temp.input_files variable of type InputFile[].
  • Moved the initialization of state.temp.input from the AI class to the Application class as it made more sense to do this earlier.
  • Updated the Message interface to support the new features needed for the vision APIs.
  • Added a new UserInputMessage prompt section that knows how to format "user" messages containing images.
  • Updated the ConversationHistory prompt section to count tokens for included images. The count is currently hard-coded to 85 tokens per image, which is the cost of a low-detail image. High-detail images consume more tokens, but counting them properly would require more work.
  • Updated the OpenAIModel class to return to the LLMClient the input message it sent to the model, so that the message with the images is properly added to conversation history.
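
To make the UserInputMessage and token-counting items above more concrete, here is a minimal sketch of building a "user" message that carries images and of counting its tokens with a flat per-image cost. The content-part shape follows the OpenAI chat completions format for vision input; everything else (`EncodedImage`, `buildUserMessage`, `countMessageTokens`, and the injected text tokenizer) is hypothetical and is not the PR's actual implementation.

```typescript
// Content parts in the shape the OpenAI chat completions API accepts for vision input.
type ContentPart =
    | { type: 'text'; text: string }
    | { type: 'image_url'; image_url: { url: string; detail?: 'low' | 'high' | 'auto' } };

interface UserMessage {
    role: 'user';
    content: ContentPart[];
}

// Hypothetical: an already-downloaded image whose bytes are base64-encoded.
interface EncodedImage {
    contentType: string;
    base64: string;
}

// Flat cost of a low-detail image, as stated in the change details.
const LOW_DETAIL_IMAGE_TOKENS = 85;

// Builds a user message with the text first, followed by one part per image
// embedded as a data URL.
function buildUserMessage(text: string, images: EncodedImage[]): UserMessage {
    const content: ContentPart[] = [{ type: 'text', text }];
    for (const image of images) {
        content.push({
            type: 'image_url',
            image_url: { url: `data:${image.contentType};base64,${image.base64}`, detail: 'low' }
        });
    }
    return { role: 'user', content };
}

// Counts text tokens with the supplied tokenizer and charges a fixed
// amount for each image part, regardless of its actual detail level.
function countMessageTokens(message: UserMessage, countText: (text: string) => number): number {
    let total = 0;
    for (const part of message.content) {
        total += part.type === 'text' ? countText(part.text) : LOW_DETAIL_IMAGE_TOKENS;
    }
    return total;
}
```

A flat per-image cost keeps the history budget cheap to compute, at the price of undercounting when high-detail images are sent.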

Attestation Checklist

  • My code follows the style guidelines of this project

  • I have checked for/fixed spelling, linting, and other errors

  • I have commented my code for clarity

  • I have made corresponding changes to the documentation (we use TypeDoc to document our code)

  • My changes generate no new warnings

  • I have added tests that validate my changes and provide sufficient test coverage. I have tested with:

    • Local testing
    • E2E testing in Teams
  • New and existing unit tests pass locally with my changes


corinagum changed the title from "Stevenic/gpt vision" to "[JS] feat: File download support, vision support, and vision sample" on Dec 6, 2023
corinagum merged commit a62392b into main on Dec 6, 2023
corinagum deleted the stevenic/gpt-vision branch on December 6, 2023 21:32
aacebo mentioned this pull request on Dec 21, 2023
Successfully merging this pull request may close these issues.

  • [Feature Request]: add support for gpt-vision
  • Azure OpenAI Deployment URL Endpoint is crafted wrongly
4 participants