(plugin-node) Bugs related to image description service #2373

Closed
ae9is opened this issue Jan 16, 2025 · 2 comments · Fixed by #2375
Labels
bug (Something isn't working), Need Feedback

Comments

ae9is commented Jan 16, 2025

A few things are broken in plugin-node related to image description:

  1. GIF frame extraction doesn't work for me (the call to gifFrames fails), and also uses an unmaintained dependency with vulnerabilities

    private async extractFirstFrameFromGif(
        gifUrl: string
    ): Promise<{ filePath: string }> {
        const frameData = await gifFrames({
            url: gifUrl,
            frames: 1,
            outputType: "png",
        });

ref: https://github.com/benwiley4000/gif-frames/

  2. Image data loading returns the first GIF frame as a PNG, but the local image provider expects a JPEG

    async describeImage(
        imageData: Buffer
    ): Promise<{ title: string; description: string }> {
        if (!this.model || !this.processor || !this.tokenizer) {
            throw new Error("Model components not initialized");
        }
        const base64Data = imageData.toString("base64");
        const dataUrl = `data:image/jpeg;base64,${base64Data}`;
        const image = await RawImage.fromURL(dataUrl);
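
To illustrate the mismatch: the data URL above is always labeled `image/jpeg`, even when the bytes are the PNG frame produced by the GIF path. A minimal sketch of carrying the MIME type alongside the bytes, assuming a hypothetical helper `toDataUrl` that is not part of plugin-node:

```typescript
import { Buffer } from "node:buffer";

// Hypothetical helper (not in plugin-node): label the data URL with the
// MIME type the loader actually reported instead of assuming JPEG.
function toDataUrl(data: Buffer, mimeType: string): string {
    return `data:${mimeType};base64,${data.toString("base64")}`;
}

// The first four bytes of the PNG signature stand in for a real frame.
const pngBytes = Buffer.from([0x89, 0x50, 0x4e, 0x47]);
const pngDataUrl = toDataUrl(pngBytes, "image/png");
```

This only fixes the label; whether the consumer accepts a data URL at all is a separate question.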

  3. The local image provider calls the Transformers.js API incorrectly: it passes a data URL, but RawImage.fromURL does not support that

RawImage.fromURL:

https://github.com/huggingface/transformers.js/blob/e1753ac0ff363183552bbc35488937e0d42c9cf8/src/utils/image.js#L147-L153

Which calls getFile, which does not support data URLs:

https://github.com/huggingface/transformers.js/blob/e1753ac0ff363183552bbc35488937e0d42c9cf8/src/utils/hub.js#L190-L193
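
Since getFile does not handle the `data:` scheme, one option is a small guard that recognizes base64 data URLs up front and decodes them locally, so that only regular URLs and file paths ever reach RawImage.fromURL. The sketch below uses only Node's `Buffer`; `decodeBase64DataUrl` is a hypothetical name, and what to do with the decoded bytes afterwards (for example wrapping them in a `Blob` for the library) is an assumption to check against the pinned Transformers.js version.

```typescript
import { Buffer } from "node:buffer";

// Hypothetical helper: split a base64 data URL into MIME type and raw bytes.
// Returns null for anything that is not a base64 data URL.
function decodeBase64DataUrl(
    url: string
): { mimeType: string; data: Buffer } | null {
    const match = /^data:([^;,]+);base64,(.+)$/.exec(url);
    if (!match) {
        return null;
    }
    return {
        mimeType: match[1],
        data: Buffer.from(match[2], "base64"),
    };
}

const decoded = decodeBase64DataUrl("data:image/png;base64,iVBORw==");
const notDataUrl = decodeBase64DataUrl("https://example.com/cat.png");
```

In describeImage the bytes are already available as a Buffer, so the simpler route there is to skip the data URL round trip entirely.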

  4. Users running Ollama expect to get the local image vision provider if nothing is set, but currently they just get an error

        this.runtime.imageVisionModelProvider ===
        ModelProviderName.LLAMALOCAL
    ) {
        this.provider = new LocalImageProvider();
        elizaLogger.debug("Using llama local for vision model");
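
A rough sketch of the fallback being asked for: an explicit vision provider setting wins, and otherwise the choice follows the main model provider, with Ollama mapped to the local provider instead of an error. The type names and string values here are hypothetical stand-ins, not the real `ModelProviderName` enum.

```typescript
// Hypothetical stand-ins for the runtime's provider names.
type ProviderName = "llama_local" | "ollama" | "openai" | "google";
type VisionBackend = "local" | "openai" | "google";

// Choose a vision backend. An explicit imageVisionModelProvider takes
// precedence; otherwise fall back to the main model provider.
function pickVisionBackend(
    modelProvider: ProviderName,
    imageVisionModelProvider?: ProviderName
): VisionBackend {
    const requested = imageVisionModelProvider ?? modelProvider;
    switch (requested) {
        case "openai":
            return "openai";
        case "google":
            return "google";
        default:
            // Assumption: both "ollama" and "llama_local" use the local provider.
            return "local";
    }
}

const ollamaDefault = pickVisionBackend("ollama");
const ollamaWithOverride = pickVisionBackend("ollama", "openai");
```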

  5. In the describe image action, depending on the model used, the file location result object can look like either of

{ object: fileLocation } or { fileLocation }

    const fileLocationResultObject = await generateObject({
        runtime,
        context: getFileLocationContext,
        modelClass: ModelClass.SMALL,
        schema: FileLocationResultSchema,
        stop: ["\n"],
    });
    if (!isFileLocationResult(fileLocationResultObject?.object)) {
        elizaLogger.error("Failed to generate file location");
        return false;
    }
    const { fileLocation } = fileLocationResultObject.object;
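
One way to tolerate both shapes is to normalize the result before destructuring. The sketch below is self-contained and redefines a simplified `isFileLocationResult` for illustration; the real guard and schema may differ. It assumes the wrapped form nests `{ fileLocation }` under `object`, as the destructuring above implies. `extractFileLocation` is a hypothetical name.

```typescript
interface FileLocationResult {
    fileLocation: string;
}

// Simplified guard for illustration only; the project's own guard may differ.
function isFileLocationResult(value: unknown): value is FileLocationResult {
    return (
        typeof value === "object" &&
        value !== null &&
        typeof (value as { fileLocation?: unknown }).fileLocation === "string"
    );
}

// Accept either { fileLocation } or { object: { fileLocation } }.
function extractFileLocation(result: unknown): string | null {
    if (isFileLocationResult(result)) {
        return result.fileLocation;
    }
    const wrapped = (result as { object?: unknown } | null | undefined)?.object;
    return isFileLocationResult(wrapped) ? wrapped.fileLocation : null;
}

const flat = extractFileLocation({ fileLocation: "/tmp/a.png" });
const nested = extractFileLocation({ object: { fileLocation: "/tmp/a.png" } });
```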

  6. Image data loading currently only classifies images as "animated GIFs" and "everything else", which fails to handle a number of input image cases since the APIs support mostly JPEG and PNG

    private async loadImageData(
        imageUrl: string
    ): Promise<{ data: Buffer; mimeType: string }> {
        const isGif = imageUrl.toLowerCase().endsWith(".gif");
        let imageData: Buffer;
        let mimeType: string;
        if (isGif) {
            const { filePath } = await this.extractFirstFrameFromGif(imageUrl);
            imageData = fs.readFileSync(filePath);
            mimeType = "image/png";
            fs.unlinkSync(filePath); // Clean up temp file
        } else {

ref:
https://platform.openai.com/docs/guides/vision#what-type-of-files-can-i-upload
https://firebase.google.com/docs/vertex-ai/input-file-requirements#images-mime-types
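
A finer-grained classification could look at the file's leading signature bytes rather than the URL suffix. The sketch below checks the well-known signatures for JPEG, PNG, GIF, and WebP; `sniffImageMime` is a hypothetical name, and which formats then need conversion before being sent to a given API is left as an assumption.

```typescript
import { Buffer } from "node:buffer";

type ImageMime = "image/jpeg" | "image/png" | "image/gif" | "image/webp";

const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// Identify common image formats from their leading bytes.
// Returns null when the signature is not recognized.
function sniffImageMime(data: Buffer): ImageMime | null {
    // JPEG starts with FF D8 FF.
    if (data.length >= 3 && data[0] === 0xff && data[1] === 0xd8 && data[2] === 0xff) {
        return "image/jpeg";
    }
    // PNG starts with a fixed 8-byte signature.
    if (data.length >= 8 && data.subarray(0, 8).equals(PNG_SIGNATURE)) {
        return "image/png";
    }
    // GIF starts with the ASCII text "GIF87a" or "GIF89a".
    const head = data.subarray(0, 6).toString("latin1");
    if (head === "GIF87a" || head === "GIF89a") {
        return "image/gif";
    }
    // WebP is a RIFF container with "WEBP" at byte offset 8.
    if (
        data.length >= 12 &&
        data.subarray(0, 4).toString("latin1") === "RIFF" &&
        data.subarray(8, 12).toString("latin1") === "WEBP"
    ) {
        return "image/webp";
    }
    return null;
}

const gifMime = sniffImageMime(Buffer.from("GIF89a\x01\x00\x01\x00", "latin1"));
```

Checking bytes also covers URLs with no extension or a misleading one, which a `.gif` suffix test cannot.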


PR coming shortly.

ae9is added the bug label Jan 16, 2025

Hello @ae9is! Welcome to the elizaOS community. Thank you for opening your first issue; we appreciate your contribution. You are now an elizaOS contributor!

@AIFlowML

This bug will probably be fixed by #2375.
I'll wait for the merge and keep this issue open pending your feedback.
