Task image-text-to-text fails with AttributeError: 'str' object has no attribute 'pad_token_id' #135

Open
@TrevinAvery

Description

When using the task type image-text-to-text, the tokenizer is set to the string image_url, so pipeline is called with a string as its tokenizer argument instead of a tokenizer object. This raises the AttributeError from the title inside transformers.

I'm unsure if this task should instead set feature_extractor, or just leave tokenizer as None.
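
To illustrate the failure mode described above, here is a minimal sketch in plain Python. It does not call transformers at all; read_pad_token_id is a hypothetical stand-in for whatever library code expects a tokenizer object, since the issue does not say exactly where the attribute is accessed.

```python
def read_pad_token_id(tokenizer):
    # Hypothetical stand-in for library code that expects a tokenizer
    # object exposing a pad_token_id attribute.
    return tokenizer.pad_token_id


try:
    # The task's tokenizer value is the plain string "image_url",
    # not a tokenizer object.
    read_pad_token_id("image_url")
except AttributeError as exc:
    message = str(exc)
    print(message)
```

The message produced here matches the one in the issue title, which is consistent with the string reaching code that treats it as a tokenizer object.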

Suggested fix

  1. Instead of manually determining which tasks require feature_extractor or tokenizer, is it possible to process the full list of supported tasks from transformers and derive the correct value from the class structure? This would make the code much more future-proof as transformers is updated.

  2. Add an environment variable to set the tokenizer. That way, if a similar error occurs in the future, developers can apply a quick fix by overriding the value. (I am using a pre-built Docker container, so I can't modify this myself without a lot of other work. For this situation, an environment variable would be ideal.)
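
As a rough sketch of suggestion 2, the override could look something like the following. Everything here is an assumption for illustration: the variable name PIPELINE_TOKENIZER, the helper names, and the convention that an empty value or "none" means "leave tokenizer unset" are all hypothetical and not part of the project today. Only the task, model, and tokenizer keyword names correspond to real arguments of the transformers pipeline function.

```python
import os

# Hypothetical variable name; the project does not define this today.
TOKENIZER_ENV_VAR = "PIPELINE_TOKENIZER"


def resolve_tokenizer(task_default, environ=None):
    """Return the tokenizer value to pass on, or None to omit it."""
    env = os.environ if environ is None else environ
    override = env.get(TOKENIZER_ENV_VAR)
    if override is None:
        # No override set: keep whatever the task mapping chose.
        return task_default
    override = override.strip()
    if override == "" or override.lower() == "none":
        # Assumed convention: an empty value or "none" clears the tokenizer.
        return None
    return override


def build_pipeline_kwargs(task, model, task_default, environ=None):
    """Build keyword arguments for pipeline(), skipping an unset tokenizer."""
    kwargs = {"task": task, "model": model}
    tokenizer = resolve_tokenizer(task_default, environ)
    if tokenizer is not None:
        kwargs["tokenizer"] = tokenizer
    return kwargs
```

With a helper like this, someone running a pre-built container could set PIPELINE_TOKENIZER=none to stop the string image_url from being passed as the tokenizer, without rebuilding the image.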
