January 2026 Developer Meetup | UI Agents

  • February 13, 2026
Lu.Hunnicutt
Pathfinder Community Team

Welcome to the January Developer Meetup Recap!
In our first Developer Meetup of 2026, we dove into Agentic Browser Automation with UI Agents, led by Pratyush Garikapati, Director, Product Management at Automation Anywhere. Pratyush walked through how UI Agents enable goal-driven browser automation, demonstrated a real-world invoice extraction and entry workflow, and explained how teams can combine UI Agents with existing RPA, recorders, and document automation to reduce build time and maintenance at scale.

HOSTS
Pratyush Garikapati, Director, Product Management
Lu Hunnicutt, Pathfinder Community Manager

TOPIC
In this session, we explored how Automation Anywhere’s UI Agents enable goal-driven browser automation. UI Agents let you express the goal (what you want done) while the agent inspects the page, plans the steps, executes actions (DOM and/or vision), and iterates until the goal completes. Pratyush emphasized hybrid workflows, combining recorders, generative recorder, Document Automation, and UI Agents, so teams can pick the right tool for scale, fidelity, and cost.

UI AGENTS: THE ENHANCEMENT
UI Agents are built for multi-step, multi-site browser tasks where hand-coding every selector is brittle and expensive to maintain. They are particularly useful when you need:

  • Goal-based adaptability across many vendor portals or web properties.
  • Runtime switching between DOM and vision (ADA provides vision + DOM; Nova Act is DOM-first).
  • Integration into an existing automation control plane so outputs (JSON, variables) flow to downstream tasks.

Key Benefits:

  • Scale & Variability: Write a goal once and run it across many different sites without building a bespoke recorder for each.
  • Lower maintenance burden: Agents reduce brittle selectors and simplify exception handling by reasoning at runtime.
  • Multimodal access: Vision + DOM support (where available) lets agents read in-browser PDFs and image-based content without separate IDP orchestration.
  • Governance & Audit: Secure variables, masked inputs, and run-time traces (steps taken, time per step, tokens consumed, outputs) provide enterprise visibility.
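
To make the run-time trace idea concrete, here is a minimal sketch of how per-step records (action, time per step, tokens consumed) could be rolled up for review. The `StepTrace` and `summarize` names and the sample numbers are made up for illustration; the actual trace format was not shown in the session.

```python
from dataclasses import dataclass

@dataclass
class StepTrace:
    """Hypothetical record for one agent step; not the product's trace schema."""
    action: str
    seconds: float
    tokens: int

def summarize(steps: list[StepTrace]) -> dict:
    """Aggregate step count, total time, total tokens, and the slowest step."""
    return {
        "steps": len(steps),
        "total_seconds": sum(s.seconds for s in steps),
        "total_tokens": sum(s.tokens for s in steps),
        "slowest_step": max(steps, key=lambda s: s.seconds).action,
    }

# Made-up sample values, only to exercise the helper.
trace = [
    StepTrace("open vendor portal", 1.5, 100),
    StepTrace("apply date filter", 2.0, 250),
    StepTrace("extract invoice from PDF", 4.5, 900),
]
summary = summarize(trace)
print(summary)
```

A roll-up like this is one way to spot which step dominates time or token cost when comparing runs.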

DEMO HIGHLIGHTS
Pratyush demonstrated a two-step invoice flow:

  1. Read — The agent opened a vendor portal, applied a date filter, opened an in-browser PDF invoice, and extracted vendor details, invoice metadata, and line items as structured JSON (no hand-coded selectors).
  2. Post — Using the JSON, the agent opened a supplier portal, handled a conditional popup (accepted if present), clicked "add line item" dynamically as needed, and returned the created invoice confirmation number.
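
The structured JSON handed from the Read step to the Post step might look something like the sketch below. The field names (`vendor_name`, `invoice_number`, `line_items`, and so on) and the sample values are assumptions for illustration; the exact schema used in the demo was not published in this recap.

```python
import json
from dataclasses import dataclass, field

@dataclass
class LineItem:
    description: str
    quantity: int
    unit_price: float

@dataclass
class Invoice:
    """Hypothetical shape for the extracted invoice; not the demo's schema."""
    vendor_name: str
    invoice_number: str
    invoice_date: str
    line_items: list[LineItem] = field(default_factory=list)

    def total(self) -> float:
        return round(sum(i.quantity * i.unit_price for i in self.line_items), 2)

def parse_invoice(payload: str) -> Invoice:
    """Validate the agent's JSON output before passing it downstream."""
    data = json.loads(payload)
    items = [LineItem(**item) for item in data.get("line_items", [])]
    return Invoice(
        vendor_name=data["vendor_name"],
        invoice_number=data["invoice_number"],
        invoice_date=data["invoice_date"],
        line_items=items,
    )

# Placeholder payload standing in for what the Read step returns.
extracted = """
{
  "vendor_name": "Example Vendor",
  "invoice_number": "INV-001",
  "invoice_date": "2026-01-15",
  "line_items": [
    {"description": "Widget", "quantity": 2, "unit_price": 10.5},
    {"description": "Shipping", "quantity": 1, "unit_price": 4.25}
  ]
}
"""
invoice = parse_invoice(extracted)
print(len(invoice.line_items), invoice.total())
```

Parsing into typed objects before the Post step means a missing field fails early, rather than partway through data entry in the supplier portal.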

Notable behaviors shown:

  • Automatic DOM ↔ vision switching when the page contained an embedded PDF.
  • Passing extracted JSON as a variable between run-actions in the same browser session.
  • Human-in-the-loop and unattended run considerations: the session retains context for each run-action and clears it between actions to avoid context poisoning.
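
The split between what carries over (variables) and what gets cleared (agent context) can be modeled with a small sketch. Everything here, including `BrowserSession`, `run_action`, and the two stand-in agent functions, is hypothetical and only mirrors the behavior described above; it is not the product's API.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class BrowserSession:
    """Hypothetical model of one browser session spanning several run-actions."""
    variables: dict = field(default_factory=dict)  # survives across run-actions
    context: list = field(default_factory=list)    # agent history for the current run-action

    def run_action(self, goal: str, agent: Callable[[list, dict], dict]) -> int:
        # Each run-action starts from a fresh context containing only its goal.
        self.context = [f"goal: {goal}"]
        outputs = agent(self.context, dict(self.variables))
        # Outputs are stored as variables so later run-actions can use them.
        self.variables.update(outputs)
        steps_taken = len(self.context)
        # History is cleared once the run-action completes.
        self.context = []
        return steps_taken

def read_agent(context: list, variables: dict) -> dict:
    """Stand-in for the Read step; returns placeholder JSON."""
    context.append("opened in-browser PDF")
    context.append("extracted invoice fields")
    return {"invoice_json": '{"invoice_number": "INV-001"}'}

def post_agent(context: list, variables: dict) -> dict:
    """Stand-in for the Post step; consumes the variable set by the Read step."""
    context.append("entered " + variables["invoice_json"])
    return {"confirmation": "CONF-PLACEHOLDER"}

session = BrowserSession()
read_steps = session.run_action("Extract the latest invoice", read_agent)
post_steps = session.run_action("Enter the invoice in the supplier portal", post_agent)
print(read_steps, post_steps, session.context, sorted(session.variables))
```

The second run-action never sees the first one's history, only the variable it produced, which is the property that guards against context poisoning.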

CHOOSING MODELS & HYBRID PATTERNS
Two action-model options were shown at launch:

  • Automation Anywhere Narada: Out-of-the-box model with DOM + vision capabilities, provisioned as a SaaS model by Automation Anywhere.
  • Amazon Nova Act: DOM-focused model available as a bring-your-own-license (BYOL) option via Amazon Bedrock for customers already invested in AWS.

Factors to choose by: unattended vs human-in-loop runs, fidelity/accuracy needs, cost/token profile, and whether vision is required. Pratyush recommended starting with recorder/generative recorder for deterministic single-site tasks and introducing UI Agents where scale or variability make recorder-based maintenance prohibitive.
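
As a rough illustration of that guidance, the decision could be written down as a small helper. The `Workload` fields, the thresholds, and the returned labels are assumptions made for this sketch, not official selection rules.

```python
from dataclasses import dataclass

@dataclass
class Workload:
    """Hypothetical description of an automation candidate."""
    site_count: int              # how many distinct portals or sites the task spans
    layout_varies: bool          # whether pages change often or differ between sites
    needs_vision: bool           # e.g. in-browser PDFs or image-based content
    high_volume_documents: bool  # multi-page, table-rich, high-throughput extraction

def suggest_tool(w: Workload) -> str:
    """Pick a starting tool, following the recap's guidance in simplified form."""
    if w.high_volume_documents:
        return "Document Automation"
    if w.site_count == 1 and not w.layout_varies:
        return "Recorder / generative recorder"
    return "UI Agent"

def suggest_model_capability(w: Workload) -> str:
    """Narrow the action-model choice by whether vision is required."""
    return "DOM + vision" if w.needs_vision else "DOM-first or DOM + vision"

single_site = Workload(site_count=1, layout_varies=False, needs_vision=False, high_volume_documents=False)
many_portals = Workload(site_count=40, layout_varies=True, needs_vision=True, high_volume_documents=False)
print(suggest_tool(single_site))
print(suggest_tool(many_portals), "with", suggest_model_capability(many_portals))
```

Cost, fidelity, and unattended versus human-in-the-loop requirements would add further branches in practice; they are left out here to keep the sketch short.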

SNEAK PEEK: WHAT’S COMING

  • GA timeframe: Targeted for the A360.40 release, with documentation and phased rollouts.
  • Governance & tooling: Secure variables, improved audit-log plumbing (A360.41), and clearer model-selection guidance based on workload.
  • Model posture: More action-models will be offered over time; each model will be evaluated for enterprise fidelity and governance.

SESSION Q&A
We closed with a Q&A covering release status, PDFs, session history, variables, regional availability, and licensing. The selected questions and answers below reflect what the audience asked.

Question: Is this in preview or GA?
Answer: Currently in private GA / private preview with live deployments; broader GA is targeted for release A360.40 (April) and documentation will be published with the release.

Question: Can UI Agents read local PDF files?
Answer: Yes — if the PDF opens in the browser (Chrome/Edge), UI Agents (especially ADA with vision) can access and extract it. For thick-client PDF readers, options are: open the file in the browser, use Document Automation, or route through an AI skill/IDP flow.

Question: Does a UI Agent session retain chat/history across the session?
Answer: The agent retains context for the duration of an active run-action to enable correct planning and execution. Once a run-action completes, that history is cleared to avoid context poisoning for subsequent unattended runs.

Question: Can variables be added to prompts for dynamic inputs (including secrets)?
Answer: Yes. You can pass variables between run-actions. Secure/masked variables are supported and will be available by GA so sensitive values can be masked from the model while still allowing the agent to interact with UI fields.
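
One way to picture masking is a prompt template where secure variables stay as placeholders in the text the model sees and are only resolved when a value is typed into the page. This is a conceptual sketch; the `{{name}}` placeholder syntax and both helper functions are assumptions, not how the product implements secure variables.

```python
import re

# Hypothetical placeholder syntax: {{variable_name}}
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")

def render_for_model(template: str, variables: dict, secure_names: set) -> str:
    """Fill in ordinary variables but leave secure ones as placeholders."""
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name in secure_names:
            return match.group(0)  # the model only ever sees "{{name}}"
        return str(variables[name])
    return PLACEHOLDER.sub(substitute, template)

def resolve_for_ui(text: str, variables: dict) -> str:
    """Resolve every placeholder at the moment a value is entered into a UI field."""
    return PLACEHOLDER.sub(lambda m: str(variables[m.group(1)]), text)

variables = {"username": "ap.clerk", "password": "s3cret"}  # placeholder values
template = "Log in as {{username}} with password {{password}}"

model_prompt = render_for_model(template, variables, secure_names={"password"})
typed_value = resolve_for_ui("{{password}}", variables)
print(model_prompt)
```

Here the model can still plan "type the password into the password field" without the secret ever appearing in its prompt.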

Question: Can UI Agents perform multi-invoice extraction from a single PDF or extract multiple tables while preserving format?
Answer: UI Agents can extract multiple tables from a single page and can handle multiple invoices if pages are accessed in-browser. For multi-page, table-rich, high-volume extraction or complex multi-page logic, Document Automation remains the recommended solution for better throughput and fidelity.

Question: Is Nova Act available outside the US?
Answer: Nova Act is now accessible via Bedrock beyond the US. Both ADA and Nova Act are generally accessible globally, though data residency and regulatory constraints may affect model provisioning in certain jurisdictions.

Question: How is licensing structured?
Answer: UI Agents is a licensed capability. Pricing and packaging can vary by model choice (ADA vs BYO Nova Act) and customer context, so discuss exact commercial terms with your account rep.

NEXT STEPS
If you clicked yes in the meeting poll, expect a follow-up from Pratyush or the Pathfinder team to onboard private preview participants.

Make sure to register for our upcoming Developer Meetups here so you can get hands-on practice with our latest innovations.

We look forward to seeing you at the next Developer Meetup!

1 reply

amore17
Most Valuable Pathfinder
  • February 14, 2026

Loved this session...! UI Agents truly redefine how we build and scale browser automations🙂