While reviewing the API documentation for the "Create an agent run and wait for the response" endpoint, I noticed that it only supports text inputs. As far as I can tell, this makes it impossible to base an agent workflow on a file upload or on image content. We are building a workflow that would be significantly improved by providing screenshots/images alongside the text.
My other idea was to pass HTML content (text plus URLs for the images) to the Glean agent, then use an external action such as Cloudflare's HTML-to-PDF rendering (/pdf - Render PDF · Cloudflare Browser Rendering docs), and use that response to move on to the next step. However, this does not appear to be supported either, because it looks like Glean 'write' custom actions cannot be run without user acknowledgement, even though this would only be an intermediary step in the overall process.
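For context, the HTML content in that second approach could be assembled roughly like this. It is only a minimal sketch: `build_html_payload` and the example URL are hypothetical names I made up for illustration, and it does not call the Glean or Cloudflare APIs at all.

```python
from html import escape


def build_html_payload(text, image_urls):
    """Wrap plain text and image URLs in a minimal HTML document.

    Hypothetical helper for illustration only; not part of any Glean
    or Cloudflare SDK.
    """
    # One <p> per non-empty line, with special characters escaped.
    paragraphs = "".join(
        f"<p>{escape(line)}</p>" for line in text.splitlines() if line.strip()
    )
    # One <img> per URL; escape() also escapes quotes by default.
    images = "".join(
        f'<img src="{escape(url)}" alt="screenshot">' for url in image_urls
    )
    return f"<!DOCTYPE html><html><body>{paragraphs}{images}</body></html>"


payload = build_html_payload(
    "Login fails <500>",
    ["https://example.com/screenshot.png"],  # placeholder URL
)
```

The resulting string is what I would pass to the agent as text input, with the PDF rendering left to the external action.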
Is there another way around this type of limitation?