[JS] feat: File download support, vision support, and vision sample (#1018)

## Linked issues

closes: #912 
fixes: #786, #399

## NOTE 
> This PR is identical to
#913. The only difference is
the person creating the PR.

## Details

This change adds two key features to the AI library:
- A new file downloader feature lets developers register file
downloader plugins that can download files relative to the user's input.
A `TeamsAttachmentDownloader` implementation is provided, which
simplifies downloading files uploaded to a bot.
- GPT Vision support is added so that downloaded image files can be sent
to GPT as part of the user's input.
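
As a rough illustration of the plugin model described above, the sketch below implements a downloader that picks up image attachments. It is not the `TeamsAttachmentDownloader` shipped in this PR: `SketchContext`, `SketchState`, `ImageAttachmentDownloader`, and the injected `fetchBytes` function are hypothetical stand-ins, and `Uint8Array` replaces the `Buffer` used by the real `InputFile` so that the example has no dependencies.

```typescript
// Simplified stand-ins for the SDK types (assumptions for this sketch only).
interface SketchAttachment {
    contentType: string;
    contentUrl?: string;
}
interface SketchContext {
    activity: { text?: string; attachments?: SketchAttachment[] };
}
interface SketchState {
    temp: { input?: string; inputFiles?: InputFile[] };
}

// Mirrors the InputFile shape from this PR, with Uint8Array standing in for Buffer.
interface InputFile {
    content: Uint8Array;
    contentType: string;
    contentUrl?: string;
}

// Same single-method contract as the InputFileDownloader interface in this PR.
interface InputFileDownloader {
    downloadFiles(context: SketchContext, state: SketchState): Promise<InputFile[]>;
}

// Hypothetical plugin: downloads every image attachment that has a URL.
class ImageAttachmentDownloader implements InputFileDownloader {
    private readonly fetchBytes: (url: string) => Promise<Uint8Array>;

    constructor(fetchBytes: (url: string) => Promise<Uint8Array>) {
        // The transport is injected so the plugin can be exercised without a network.
        this.fetchBytes = fetchBytes;
    }

    async downloadFiles(context: SketchContext, _state: SketchState): Promise<InputFile[]> {
        const files: InputFile[] = [];
        for (const attachment of context.activity.attachments ?? []) {
            // Skip attachments without a URL and anything that is not an image.
            if (!attachment.contentUrl || !attachment.contentType.startsWith('image/')) {
                continue;
            }
            const content = await this.fetchBytes(attachment.contentUrl);
            files.push({
                content,
                contentType: attachment.contentType,
                contentUrl: attachment.contentUrl
            });
        }
        return files;
    }
}
```

Injecting the transport keeps the plugin testable; a real downloader would also need to handle authentication and failed requests, which this sketch leaves out.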

This PR also fixes a couple of prompt-related bugs that were identified
in the process.

#### Change details

- Fixed bugs where prompts did not use the override options defined
in the prompt's `config.json` file.
- Added a new `InputFileDownloader` plugin model to the `Application`
class.
- Added a `TeamsAttachmentDownloader` class for automatically
downloading files uploaded to the bot.
- Added a new `state.temp.inputFiles` variable which is of type
`InputFile[]`.
- Moved the initialization of `state.temp.input` from the `AI` class to
the `Application` class as it made more sense to do this earlier.
- Updated the `Message` interface to support the new features needed for
the vision APIs.
- Added a new `UserInputMessage` prompt section that knows how to format
"user" messages containing images.
- Updated the `ConversationHistory` prompt section to count tokens for
included images. The cost is currently hard-coded to 85 tokens per image,
which is the cost of a low-detail image. High-detail images consume more
tokens, but counting them properly would require more work.
- Updated the `OpenAIModel` class to return the input message it sent to
the model back to the `LLMClient`, so that the message with the images
is properly added to the conversation history.
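
To make the message format and the flat image cost concrete, here is a minimal sketch. It assumes the content-part shape the OpenAI chat API uses for vision input; `formatUserMessage`, `estimateTokens`, and the word-counting `countTextTokens` are hypothetical helpers, not the `UserInputMessage` or `ConversationHistory` code from this PR, and a real implementation would use a proper tokenizer.

```typescript
// Content parts in the shape the OpenAI chat API uses for vision input.
type MessagePart =
    | { type: 'text'; text: string }
    | { type: 'image_url'; image_url: { url: string; detail?: 'low' | 'high' | 'auto' } };

interface UserMessage {
    role: 'user';
    content: string | MessagePart[];
}

// Flat per-image cost named in the change details (the price of a low-detail image).
const IMAGE_TOKENS = 85;

// Hypothetical stand-in for a tokenizer: one token per whitespace-separated word.
function countTextTokens(text: string): number {
    return text
        .trim()
        .split(/\s+/)
        .filter((word) => word.length > 0).length;
}

// Build a "user" message: a plain string without images, content parts with them.
function formatUserMessage(input: string, imageUrls: string[]): UserMessage {
    if (imageUrls.length === 0) {
        return { role: 'user', content: input };
    }
    const parts: MessagePart[] = [{ type: 'text', text: input }];
    for (const url of imageUrls) {
        parts.push({ type: 'image_url', image_url: { url } });
    }
    return { role: 'user', content: parts };
}

// Estimate the token cost of a message, charging every image the same flat rate.
function estimateTokens(message: UserMessage): number {
    if (typeof message.content === 'string') {
        return countTextTokens(message.content);
    }
    let total = 0;
    for (const part of message.content) {
        total += part.type === 'text' ? countTextTokens(part.text) : IMAGE_TOKENS;
    }
    return total;
}
```

With this sketch, a five-word prompt with two images is estimated at 5 + 2 × 85 = 175 tokens. The flat rate undercounts high-detail images, which is the limitation the change details call out.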
 
## Attestation Checklist

- [x] My code follows the style guidelines of this project

- I have checked for/fixed spelling, linting, and other errors
- I have commented my code for clarity
- I have made corresponding changes to the documentation (we use
[TypeDoc](https://typedoc.org/) to document our code)
- My changes generate no new warnings
- I have added tests that validate my changes and provide sufficient
test coverage. I have tested with:
  - Local testing
  - E2E testing in Teams
- New and existing unit tests pass locally with my changes

### Additional information

> Feel free to add other relevant information below

---------

Co-authored-by: Steven Ickman <[email protected]>
Co-authored-by: Corina Gum <>
Co-authored-by: Lily Du <[email protected]>
3 people authored Dec 6, 2023
1 parent bb711b3 commit a62392b
Showing 62 changed files with 1,778 additions and 176 deletions.
2 changes: 1 addition & 1 deletion js/.nycrc
@@ -7,7 +7,7 @@
"**/coverage/**",
"**/*.d.ts",
"**/*.spec.ts",
"packages/**/src/index.ts"
"packages/**/src/**/index.ts"
],
"reporter": ["html", "text"],
"all": true,
2 changes: 1 addition & 1 deletion js/packages/teams-ai/package.json
@@ -2,7 +2,7 @@
"name": "@microsoft/teams-ai",
"author": "Microsoft Corp.",
"description": "SDK focused on building AI based applications for Microsoft Teams.",
"version": "1.0.0-preview.1",
"version": "1.0.0-preview.2",
"license": "MIT",
"keywords": [
"botbuilder",
163 changes: 71 additions & 92 deletions js/packages/teams-ai/src/AI.ts
@@ -25,7 +25,12 @@ export interface PredictedDoCommandAndHandler<TState> extends PredictedDoCommand
* @param action Name of the action being executed.
* @returns Whether the AI system should continue executing the plan.
*/
handler: (context: TurnContext, state: TState, parameters?: Record<string, any>, action?: string) => Promise<string>;
handler: (
context: TurnContext,
state: TState,
parameters?: Record<string, any>,
action?: string
) => Promise<string>;
}

/**
@@ -225,64 +230,52 @@ export class AI<TState extends TurnState = TurnState> {
* @param {ConfiguredAIOptions} options The options used to configure the AI system.
*/
public constructor(options: AIOptions<TState>) {
this._options = Object.assign({
max_steps: 25,
max_time: 300000,
allow_looping: true
}, options) as ConfiguredAIOptions<TState>;
this._options = Object.assign(
{
max_steps: 25,
max_time: 300000,
allow_looping: true
},
options
) as ConfiguredAIOptions<TState>;

// Create moderator if needed
if (!this._options.moderator) {
this._options.moderator = new DefaultModerator<TState>();
}

// Register default UnknownAction handler
this.defaultAction(
AI.UnknownActionName,
(context, state, data, action?) => {
console.error(`An AI action named "${action}" was predicted but no handler was registered.`);
return Promise.resolve(AI.StopCommandName);
}
);
this.defaultAction(AI.UnknownActionName, (context, state, data, action?) => {
console.error(`An AI action named "${action}" was predicted but no handler was registered.`);
return Promise.resolve(AI.StopCommandName);
});

// Register default FlaggedInputAction handler
this.defaultAction(
AI.FlaggedInputActionName,
() => {
console.error(
`The users input has been moderated but no handler was registered for 'AI.FlaggedInputActionName'.`
);
return Promise.resolve(AI.StopCommandName);
}
);
this.defaultAction(AI.FlaggedInputActionName, () => {
console.error(
`The users input has been moderated but no handler was registered for 'AI.FlaggedInputActionName'.`
);
return Promise.resolve(AI.StopCommandName);
});

// Register default FlaggedOutputAction handler
this.defaultAction(
AI.FlaggedOutputActionName,
() => {
console.error(
`The bots output has been moderated but no handler was registered for 'AI.FlaggedOutputActionName'.`
);
return Promise.resolve(AI.StopCommandName);
}
);
this.defaultAction(AI.FlaggedOutputActionName, () => {
console.error(
`The bots output has been moderated but no handler was registered for 'AI.FlaggedOutputActionName'.`
);
return Promise.resolve(AI.StopCommandName);
});

// Register default HttpErrorActionName
this.defaultAction(
AI.HttpErrorActionName,
(context, state, data, action) => {
throw new Error(`An AI http request failed`);
}
);
this.defaultAction(AI.HttpErrorActionName, (context, state, data, action) => {
throw new Error(`An AI http request failed`);
});

// Register default PlanReadyActionName
this.defaultAction<Plan>(
AI.PlanReadyActionName,
(context, state, plan) => {
const isValid = Array.isArray(plan.commands) && plan.commands.length > 0;
return Promise.resolve(!isValid ? AI.StopCommandName : '');
}
);
this.defaultAction<Plan>(AI.PlanReadyActionName, (context, state, plan) => {
const isValid = Array.isArray(plan.commands) && plan.commands.length > 0;
return Promise.resolve(!isValid ? AI.StopCommandName : '');
});

// Register default DoCommandActionName
this.defaultAction<PredictedDoCommandAndHandler<TState>>(
@@ -294,32 +287,26 @@ export class AI<TState extends TurnState = TurnState> {
);

// Register default SayCommandActionName
this.defaultAction<PredictedSayCommand>(
AI.SayCommandActionName,
async (context, state, data, action) => {
const response = data.response;
if (context.activity.channelId == Channels.Msteams) {
await context.sendActivity(response.split('\n').join('<br>'));
} else {
await context.sendActivity(response);
}

return '';
this.defaultAction<PredictedSayCommand>(AI.SayCommandActionName, async (context, state, data, action) => {
const response = data.response;
if (context.activity.channelId == Channels.Msteams) {
await context.sendActivity(response.split('\n').join('<br>'));
} else {
await context.sendActivity(response);
}
);

return '';
});

// Register default TooManyStepsActionName
this.defaultAction<TooManyStepsParameters>(
AI.TooManyStepsActionName,
async (context, state, data, action) => {
const { max_steps, step_count } = data;
if (step_count > max_steps) {
throw new Error(`The AI system has exceeded the maximum number of steps allowed.`);
} else {
throw new Error(`The AI system has exceeded the maximum amount of time allowed.`);
}
this.defaultAction<TooManyStepsParameters>(AI.TooManyStepsActionName, async (context, state, data, action) => {
const { max_steps, step_count } = data;
if (step_count > max_steps) {
throw new Error(`The AI system has exceeded the maximum number of steps allowed.`);
} else {
throw new Error(`The AI system has exceeded the maximum amount of time allowed.`);
}
);
});
}

/**
@@ -391,7 +378,7 @@ export class AI<TState extends TurnState = TurnState> {
* @param handler Function to call when the action is triggered.
* @returns The AI system instance for chaining purposes.
*/
public defaultAction<TParameters extends (Record<string, any> | undefined)>(
public defaultAction<TParameters extends Record<string, any> | undefined>(
name: string | string[],
handler: (context: TurnContext, state: TState, parameters: TParameters, action?: string) => Promise<string>
): this {
@@ -434,7 +421,6 @@ export class AI<TState extends TurnState = TurnState> {
return this._actions.has(action);
}


/**
* Calls the configured planner to generate a plan and executes the plan that is returned.
* @remarks
@@ -447,23 +433,7 @@ export class AI<TState extends TurnState = TurnState> {
* @param step_count Optional. Number of steps that have been executed.
* @returns True if the plan was completely executed, otherwise false.
*/
public async run(
context: TurnContext,
state: TState,
start_time?: number,
step_count?: number
): Promise<boolean> {
// Populate {{$temp.input}}
if (typeof state.temp.input != 'string') {
// Use the received activity text
state.temp.input = context.activity.text;
}

// Initialize {{$allOutputs}}
if (state.temp.actionOutputs == undefined) {
state.temp.actionOutputs = {};
}

public async run(context: TurnContext, state: TState, start_time?: number, step_count?: number): Promise<boolean> {
// Initialize start time and action count
const { max_steps, max_time } = this._options;
if (start_time === undefined) {
@@ -474,7 +444,8 @@ export class AI<TState extends TurnState = TurnState> {
}

// Review input on first loop
let plan: Plan|undefined = step_count == 0 ? await this._options.moderator.reviewInput(context, state) : undefined;
let plan: Plan | undefined =
step_count == 0 ? await this._options.moderator.reviewInput(context, state) : undefined;

// Generate plan
if (!plan) {
@@ -490,7 +461,9 @@ export class AI<TState extends TurnState = TurnState> {

// Process generated plan
let completed = false;
let response = await this._actions.get(AI.PlanReadyActionName)!.handler(context, state, plan, AI.PlanReadyActionName);
const response = await this._actions
.get(AI.PlanReadyActionName)!
.handler(context, state, plan, AI.PlanReadyActionName);
if (response == AI.StopCommandName) {
return false;
}
@@ -503,8 +476,15 @@ export class AI<TState extends TurnState = TurnState> {
// Check for timeout
if (Date.now() - start_time! > max_time || ++step_count! > max_steps) {
completed = false;
const parameters: TooManyStepsParameters = { max_steps, max_time, start_time: start_time!, step_count: step_count! };
await this._actions.get(AI.TooManyStepsActionName)!.handler(context, state, parameters, AI.TooManyStepsActionName);
const parameters: TooManyStepsParameters = {
max_steps,
max_time,
start_time: start_time!,
step_count: step_count!
};
await this._actions
.get(AI.TooManyStepsActionName)!
.handler(context, state, parameters, AI.TooManyStepsActionName);
break;
}

@@ -524,9 +504,7 @@ export class AI<TState extends TurnState = TurnState> {
state.temp.actionOutputs[action] = output;
} else {
// Redirect to UnknownAction handler
output = await this._actions
.get(AI.UnknownActionName)!
.handler(context, state, plan, action);
output = await this._actions.get(AI.UnknownActionName)!.handler(context, state, plan, action);
}
break;
}
@@ -549,6 +527,7 @@ export class AI<TState extends TurnState = TurnState> {
// Copy the actions output to the input
state.temp.lastOutput = output;
state.temp.input = output;
state.temp.inputFiles = [];
}

// Check for looping
29 changes: 28 additions & 1 deletion js/packages/teams-ai/src/Application.ts
@@ -22,17 +22,18 @@ import { ReadReceiptInfo } from 'botframework-connector';

import { AdaptiveCards, AdaptiveCardsOptions } from './AdaptiveCards';
import { AI, AIOptions } from './AI';
import { Meetings } from './Meetings';
import { MessageExtensions } from './MessageExtensions';
import { TaskModules, TaskModulesOptions } from './TaskModules';
import { AuthenticationManager, AuthenticationOptions } from './authentication/Authentication';
import { TurnState } from './TurnState';
import { InputFileDownloader } from './InputFileDownloader';
import {
deleteUserInSignInFlow,
setTokenInState,
setUserInSignInFlow,
userInSignInFlow
} from './authentication/BotAuthenticationBase';
import { Meetings } from './Meetings';

/**
* @private
@@ -133,6 +134,11 @@ export interface ApplicationOptions<TState extends TurnState> {
* Optional. Factory used to create a custom turn state instance.
*/
turnStateFactory: () => TState;

/**
* Optional. Array of input file download plugins to use.
*/
fileDownloaders?: InputFileDownloader<TState>[];
}

/**
@@ -668,6 +674,27 @@ export class Application<TState extends TurnState = TurnState> {
return false;
}

// Populate {{$temp.input}}
if (typeof state.temp.input != 'string') {
// Use the received activity text
state.temp.input = context.activity.text;
}

// Download any input files
if (Array.isArray(this._options.fileDownloaders) && this._options.fileDownloaders.length > 0) {
const inputFiles = state.temp.inputFiles ?? [];
for (let i = 0; i < this._options.fileDownloaders.length; i++) {
const files = await this._options.fileDownloaders[i].downloadFiles(context, state);
inputFiles.push(...files);
}
state.temp.inputFiles = inputFiles;
}

// Initialize {{$allOutputs}}
if (state.temp.actionOutputs == undefined) {
state.temp.actionOutputs = {};
}

// Run any RouteSelectors in this._invokeRoutes first if the incoming Teams activity.type is "Invoke".
// Invoke Activities from Teams need to be responded to in less than 5 seconds.
if (context.activity.type === ActivityTypes.Invoke) {
43 changes: 43 additions & 0 deletions js/packages/teams-ai/src/InputFileDownloader.ts
@@ -0,0 +1,43 @@
/**
* @module teams-ai
*/
/**
* Copyright (c) Microsoft Corporation. All rights reserved.
* Licensed under the MIT License.
*/

import { TurnContext } from 'botbuilder';
import { TurnState } from './TurnState';

/**
* A plugin responsible for downloading files relative to the current user's input.
* @template TState Optional. Type of application state.
*/
export interface InputFileDownloader<TState extends TurnState = TurnState> {
/**
* Download any files relative to the current user's input.
* @param context Context for the current turn of conversation.
* @param state Application state for the current turn of conversation.
*/
downloadFiles(context: TurnContext, state: TState): Promise<InputFile[]>;
}

/**
* A file sent by the user to the bot.
*/
export interface InputFile {
/**
* The downloaded content of the file.
*/
content: Buffer;

/**
* The content type of the file.
*/
contentType: string;

/**
* Optional. URL to the content of the file.
*/
contentUrl?: string;
}
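
The download step this commit adds to `Application.ts` can be sketched in isolation as follows. This is a simplified, dependency-free approximation rather than the SDK code: `populateTurnInputs`, `SketchContext`, and `SketchState` are hypothetical names, and `Uint8Array` stands in for the `Buffer` used by `InputFile`.

```typescript
// Simplified stand-ins for the SDK types (assumptions for this sketch only).
interface InputFile {
    content: Uint8Array;
    contentType: string;
    contentUrl?: string;
}
interface SketchContext {
    activity: { text?: string };
}
interface SketchState {
    temp: { input?: string; inputFiles?: InputFile[] };
}
interface InputFileDownloader {
    downloadFiles(context: SketchContext, state: SketchState): Promise<InputFile[]>;
}

// Hypothetical helper mirroring the order of operations in the Application.ts hunk:
// populate the input text first, then run each downloader in registration order.
async function populateTurnInputs(
    context: SketchContext,
    state: SketchState,
    fileDownloaders?: InputFileDownloader[]
): Promise<void> {
    // Only fall back to the activity text when no input has been set yet.
    if (typeof state.temp.input != 'string') {
        state.temp.input = context.activity.text;
    }

    if (Array.isArray(fileDownloaders) && fileDownloaders.length > 0) {
        // Files already present on the state are kept; downloads are appended.
        const inputFiles = state.temp.inputFiles ?? [];
        for (const downloader of fileDownloaders) {
            const files = await downloader.downloadFiles(context, state);
            inputFiles.push(...files);
        }
        state.temp.inputFiles = inputFiles;
    }
}
```

Downloaders run sequentially, so their results land in `state.temp.inputFiles` in registration order. Note that the `AI.ts` change in this commit resets `state.temp.inputFiles` to an empty array whenever an action's output is copied to the input, so downloaded files apply only to the original user input.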