Hi all,
I’m seeing a difference between how Glean Assistant (GA) behaves in chat and how a specialized workflow agent built on File Analyst behaves. I’d like to know whether this is expected and how to design the workflow better.
Use case
I have large tables (CSV/XLSX/JSON, ~3–4k rows) with:
- English labels
- Product names
- Help/tooltip text
Goal: a workflow agent where users upload a file and receive:
- One main Arabic translation column
- Several Arabic suggestion columns (3–5 variants)
- Technical placeholders (e.g. {equipmentNumber}, ICU plural patterns) preserved exactly
- A downloadable translated CSV/XLSX
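To make "preserved exactly" concrete, here is a minimal sketch of the kind of check I have in mind, in plain Python and independent of Glean. The function names and the two regexes are my own assumptions: they cover simple {name} tokens and the opening of ICU plural/select arguments, and they do not fully parse ICU MessageFormat.

```python
import re
from collections import Counter

# Simple tokens such as {equipmentNumber}. Assumption: names are ASCII identifiers.
SIMPLE = re.compile(r"\{[A-Za-z_][A-Za-z0-9_]*\}")
# Opening of an ICU argument, e.g. "{count, plural,". Only the head is checked,
# not the branches inside it.
ICU_HEAD = re.compile(
    r"\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*,\s*(plural|select|selectordinal)\s*,"
)


def technical_tokens(text: str) -> Counter:
    """Count the technical tokens that must survive translation unchanged."""
    tokens = Counter(SIMPLE.findall(text))
    for name, kind in ICU_HEAD.findall(text):
        tokens[f"{{{name}, {kind}}}"] += 1
    return tokens


def placeholders_preserved(source: str, translated: str) -> bool:
    """True when both strings carry the same tokens the same number of times."""
    return technical_tokens(source) == technical_tokens(translated)
```

A check like this could run on every row of the exported file, so that a translated argument name or a dropped token is caught before the file reaches users.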
What GA chat can do
If I work directly in GA chat on the same kind of content, GA can:
- Produce fluent Arabic for both short labels and longer help text
- Preserve {...} placeholders and ICU plural patterns correctly
- Generate multiple natural Arabic variants when asked
So the underlying model clearly supports full, high‑quality translation for this domain.
What happens in the specialized agent
I built a workflow:
- Document Reader – load the whole CSV/XLSX/JSON.
- File Analyst – detect the English column, translate to Arabic, create suggestion columns, export CSV/XLSX.
- Respond – show summary + download links.
In the File Analyst step, I instruct it to:
- Translate every non‑empty English cell to Arabic
- Preserve placeholders/ICU patterns literally
- Avoid wrapping English in markers like [AR] ..., [Arabic] ... (n), [English] ...
- Avoid Notes = "Needs manual translation" / "Needs review" as the default
However, on real‑size datasets I consistently see:
- “Arabic” columns that are actually English with wrappers, e.g. [AR] The date and time.
- Rows flagged en masse as “Needs manual translation” or “Needs review” instead of real Arabic text
- In some runs, reversed English strings with an Arabic prefix, which looks like fallback behaviour
So GA chat fully translates the content, but the File Analyst‑based agent mostly produces placeholders and review flags at scale.
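To put numbers on these symptoms, a rough classifier over the output cells could look like the sketch below. It is plain Python, not anything File Analyst provides. The category names, the 0.5 threshold, and the use of the basic Arabic Unicode block (U+0600 to U+06FF) are my own assumptions, and the heuristic may misjudge strings that consist mostly of ICU syntax.

```python
import re
from collections import Counter
from typing import Iterable

# Wrapper markers seen in the bad output, e.g. "[AR] The date and time."
WRAPPER = re.compile(r"^\[(?:AR|Arabic|English)\]", re.IGNORECASE)
# Simple {name} placeholders are removed before counting letters.
SIMPLE_PLACEHOLDER = re.compile(r"\{[A-Za-z_][A-Za-z0-9_]*\}")
REVIEW_FLAGS = {"needs manual translation", "needs review"}


def is_arabic_letter(ch: str) -> bool:
    # Assumption: the basic Arabic block is enough for this content.
    return "\u0600" <= ch <= "\u06FF"


def classify(cell: str) -> str:
    """Label one output cell as ok, empty, review_flag, wrapped, or mostly_latin."""
    text = cell.strip()
    if not text:
        return "empty"
    if text.lower() in REVIEW_FLAGS:
        return "review_flag"
    if WRAPPER.match(text):
        return "wrapped"
    letters = [ch for ch in SIMPLE_PLACEHOLDER.sub("", text) if ch.isalpha()]
    if not letters:
        return "ok"  # digits or punctuation only, nothing to translate
    arabic_share = sum(is_arabic_letter(ch) for ch in letters) / len(letters)
    # Threshold is a guess; it also catches reversed English behind an Arabic prefix.
    return "ok" if arabic_share >= 0.5 else "mostly_latin"


def summarize(cells: Iterable[str]) -> Counter:
    """Count how many cells fall into each category."""
    return Counter(classify(cell) for cell in cells)
```

Running summarize over the Arabic column and the Notes column of an export gives a per-category count, which makes it easier to compare runs or to report the failure rate.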
Questions for the community
- Is this behaviour expected for File Analyst?
- Is it designed to favour mappings/placeholders and conservative “Needs review” statuses instead of free‑form machine translation (MT) on large tables?
- Can File Analyst be configured for a “full translation mode”?
- i.e., for a given workflow, allow it to:
- Use the LLM for free‑form translation on every row
- Disable placeholder patterns like [AR] ..., [Arabic] ... (n), [English] ...
- Avoid defaulting most rows to “Needs manual translation/Needs review”
- If not, what is the recommended pattern for this?
- A dedicated bulk translate tool for workflows?
- A hybrid where File Analyst only handles file I/O/structuring and another step does the actual MT (similar to GA chat behaviour)?
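For the hybrid option, this is roughly the shape I imagine, sketched in plain Python. Everything here is hypothetical: translate_batch stands in for whatever step actually calls the model (it is not a Glean API), and the "Arabic" column name and the batch size of 50 are assumptions. The file handling stays purely mechanical and the translation happens in small batches.

```python
import csv
import io
from typing import Callable, Iterator, List


def batched(items: list, size: int) -> Iterator[list]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def translate_table(
    csv_text: str,
    source_column: str,
    translate_batch: Callable[[List[str]], List[str]],  # hypothetical MT step
    batch_size: int = 50,  # assumed size, to be tuned
) -> str:
    """Add an 'Arabic' column by translating non-empty source cells in batches."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return csv_text
    # Only non-empty source cells are sent for translation.
    pending = [
        i for i, row in enumerate(rows) if (row.get(source_column) or "").strip()
    ]
    for row in rows:
        row["Arabic"] = ""
    for index_batch in batched(pending, batch_size):
        sources = [rows[i][source_column] for i in index_batch]
        results = translate_batch(sources)
        if len(results) != len(sources):
            raise ValueError("translate_batch must return one result per input")
        for i, arabic in zip(index_batch, results):
            rows[i]["Arabic"] = arabic
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(rows[0].keys()), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()
```

The idea behind the small batches is that each model call sees a chat-sized amount of text rather than the whole 3–4k-row table. Whether that is what makes GA chat behave better is only my guess.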
I’m basically trying to get GA‑level translation quality and completeness inside a reusable workflow agent, without drowning the output in placeholders. Any guidance or best practices would be greatly appreciated.
Thanks!