Description
When using the task type image-text-to-text, the tokenizer is set to image_url, resulting in pipeline being called with tokenizer as a string. This causes an error within transformers.
I'm unsure if this task should instead set feature_extractor, or just leave tokenizer as None.
Suggested fix
- Instead of manually determining which tasks require feature_extractor or tokenizer, is it possible to process the full list of supported tasks from transformers, and then add the correct value based on the class structure? This would make the code much more future-proof as transformers updates.
- Add an environment variable to set the tokenizer. This way, if a similar error occurs in the future, developers can apply a quick fix by overriding the value. (It may be worth noting that I am using a pre-built Docker container, so I can't modify this myself for a quick fix without doing a lot of other work. For this situation, an environment variable would be ideal.)
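
To make the two suggestions concrete, here is a rough sketch of how they might combine. Everything in it is illustrative rather than taken from the actual code base: the environment variable name PIPELINE_TOKENIZER, the function preprocessor_kwargs, and the small task table are hypothetical. The table only imitates the per-task "type" field that I believe transformers.pipelines.SUPPORTED_TASKS carries, which should be checked against the installed transformers version. For multimodal tasks the sketch passes nothing, which is the "leave tokenizer as None" option from the description, not a confirmed fix.

```python
import os

# Hypothetical environment variable name; not an existing setting.
TOKENIZER_ENV_VAR = "PIPELINE_TOKENIZER"

# Stand-in task table (assumption). A real implementation might read the
# task list from transformers instead of hard-coding it here.
EXAMPLE_TASKS = {
    "text-classification": {"type": "text"},
    "image-classification": {"type": "image"},
    "image-text-to-text": {"type": "multimodal"},
}

# Assumed mapping from task type to the pipeline keyword argument that
# carries the preprocessor. Types not listed here get no argument at all.
KWARG_BY_TYPE = {
    "text": "tokenizer",
    "image": "feature_extractor",
    "audio": "feature_extractor",
}


def preprocessor_kwargs(task, configured=None, tasks=EXAMPLE_TASKS, env=None):
    """Return the keyword arguments to forward to pipeline() for this task."""
    env = os.environ if env is None else env

    # 1. An explicit override from the environment always wins.
    override = env.get(TOKENIZER_ENV_VAR)
    if override:
        return {"tokenizer": override}

    # 2. Otherwise choose the keyword from the task type.
    task_type = tasks.get(task, {}).get("type")
    kwarg = KWARG_BY_TYPE.get(task_type)
    if kwarg is None or configured is None:
        # Unknown or multimodal task: pass nothing and let pipeline() decide.
        return {}
    return {kwarg: configured}
```

With this shape, a container user could set PIPELINE_TOKENIZER at startup to override the value without rebuilding the image, and a task such as image-text-to-text would no longer forward a placeholder string as the tokenizer.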