Image-to-text Agents

Question

I'm building an agent to extract the name, company, and title from photos of badges that people took after participating in an event. I've set up a basic agent that (1) reads the file and (2) responds with those three fields. However, it can't read the files, so it fails at the first step.

I'd love help from anyone who has already built image-to-text agents.

mpividal · Accepted Answer

Hi @AntenorNeto, thank you for reaching out! There are two ways to process images in Glean:

* Embedded images from indexed documents, which require OCR to be enabled on your Glean instance. OCR analyzes the images during document indexing and saves the caption text, which the agent can then use as context.
* The image upload feature in Glean Assistant, which is also available for agents when your trigger is set to upload a file.
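
Once the agent can read the uploaded badge photo, the second step from the question (responding with the three fields) can be checked with something like the sketch below. It assumes the agent is instructed to reply with a JSON object containing `name`, `company`, and `title`; that reply format and the names `BadgeInfo` and `parse_badge_response` are hypothetical and are not part of any Glean API.

```python
import json
from dataclasses import dataclass

# Hypothetical field names; adjust to whatever the agent is prompted to return.
REQUIRED_FIELDS = ("name", "company", "title")


@dataclass
class BadgeInfo:
    name: str
    company: str
    title: str


def parse_badge_response(raw: str) -> BadgeInfo:
    """Parse the agent's JSON reply and fail loudly if a field is missing."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object with name, company, and title")

    cleaned = {}
    missing = []
    for field in REQUIRED_FIELDS:
        value = data.get(field)
        # Treat absent, non-string, and blank values as unreadable fields.
        if isinstance(value, str) and value.strip():
            cleaned[field] = value.strip()
        else:
            missing.append(field)

    if missing:
        raise ValueError(f"badge fields missing or empty: {missing}")
    return BadgeInfo(**cleaned)
```

Raising an error on a missing field, rather than returning a blank value, makes an unreadable badge photo easy to spot and retry.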

Please feel free to let me know if you find this helpful or if there's anything else I can assist you with.
