First of all, congratulations on such a great library!
Most models support PDF analysis out of the box, but if I understand correctly, smolagents currently supports only images. So the only way to use PDFs is to convert them to images first, which is a bit cumbersome. Maybe someone has an idea of how PDFs could be handled more easily.
Here is how I do this now:
from smolagents import CodeAgent, LiteLLMModel
from PIL import Image
from pdf2image import convert_from_path
import os

def analyze_pdf(pdf_path):
    # Define poppler path (macOS)
    poppler_path = '/opt/homebrew/Cellar/poppler/25.02.0/bin'  # Adjust version as needed
    abs_pdf_path = os.path.abspath(pdf_path)
    try:
        # Convert PDF pages to images (one PIL image per page)
        pages = convert_from_path(
            abs_pdf_path,
            poppler_path=poppler_path  # Required on macOS
        )
        # Send all pages to the agent in a single call.
        # NOTE: `agent` must already exist, e.g. a CodeAgent built with a LiteLLMModel (not shown here).
        response = agent.run(
            f"Please analyze {len(pages)} pages of this PDF and describe what you see in detail. "
            "Include information about text content, layout, any tables or figures, "
            "and the overall structure of the document.",
            images=pages
        )
        print(f"Analysis of {len(pages)} pages:", response)
    except Exception as e:
        print(f"Error: {str(e)}")
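One fragile spot in the snippet above is the hard-coded, version-specific `poppler_path`. A minimal sketch of one way around it is to look for the `pdftoppm` executable that ships with poppler instead of pinning a Cellar version. The helper name `find_poppler_dir` and its `extra_dirs` parameter are hypothetical and not part of smolagents or pdf2image, and the sketch assumes a POSIX system where the binary is named `pdftoppm`.

```python
import os
import shutil
from typing import Iterable, Optional


def find_poppler_dir(extra_dirs: Iterable[str] = ()) -> Optional[str]:
    """Return a directory containing an executable `pdftoppm`, or None.

    Hypothetical helper: checks the caller-supplied directories first,
    then falls back to whatever is on the PATH.
    """
    for directory in extra_dirs:
        candidate = os.path.join(directory, "pdftoppm")
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return directory
    found = shutil.which("pdftoppm")  # searches the PATH
    return os.path.dirname(found) if found else None
```

The result can be passed straight through as `convert_from_path(abs_pdf_path, poppler_path=find_poppler_dir())`. When it is `None`, pdf2image falls back to looking for poppler on the PATH, which is its default behavior.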