
Product Club | January 2026 Cloud Native Document Automation

  • January 16, 2026
Lu.Hunnicutt
Pathfinder Community Team

Welcome to the January Product Club Recap! This month we kicked off the very first Product Club of 2026 with a deep dive into Document Automation, led by Alex, Director, Product Management at Automation Anywhere. Alex walked through what’s new, what’s next, and how teams can use cloud-scale extraction and GenAI-driven configuration to improve speed and accuracy across real-world document workflows.


HOSTS
Lu Hunnicutt, Pathfinder Community Manager
Alexander Timoshenko, Director, Product Management

TOPIC
In this session, we explored how Automation Anywhere’s Document Automation product helps teams extract structured data from unstructured documents (including tables) and how recent innovations are improving both performance and accuracy. Alex introduced Document Automation as a modular extraction capability that plugs into workflows to extract fields from documents like invoices, purchase orders, IDs, and more. During the session, we reviewed two major pain points in document processing: scaling for high volume and reducing latency for front-office use cases.

Alex demonstrated the headline feature of the latest release, Cloud Extraction Service, showing how it improves throughput and turnaround time compared to bot-runner extraction (it was FAST!). We also explored the new Adaptive Search Queries feature, designed to handle layout-specific edge cases without breaking other document variants.

Alex then shared a sneak peek of upcoming enhancements, including new model options and enterprise-friendly model connectivity.

CLOUD EXTRACTION SERVICE: THE ENHANCEMENT
Cloud Extraction Service is designed for teams that need faster extraction and easier scaling without managing large fleets of bot runners. It targets two common scenarios: high-volume document processing and user-facing workflows where someone is waiting on results.

Key Benefits:

  • Autoscaling for High Volume: Processes large sets of documents concurrently without requiring customers to add and maintain more bot runners.
  • Faster Time-to-Result: Speeds up extraction when latency matters, especially in front-office experiences where users upload and wait.
  • Simplified Operations: Reduces the operational overhead of updating, restarting, and maintaining bot runner infrastructure.


DEMO HIGHLIGHTS

  • Cloud Extraction at Scale: Alex ran an upload flow for multiple documents and showed that by the time uploads completed, extraction was already finished and documents were ready in the validation queue.
  • Cloud vs Bot Extraction Speed Test: Alex reprocessed the same document using both methods and demonstrated that cloud extraction completed significantly faster while bot extraction continued running.
  • Validation Experience Walkthrough: The audience saw how extracted fields appear alongside the original document, including support for extracting table-based information.


ACCURACY IMPROVEMENTS: ADAPTIVE SEARCH QUERIES
A key challenge with GenAI-based extraction is that some fields can be ambiguous across layouts and suppliers. A prompt that works well for one format can cause confusion in another. Adaptive Search Queries solve this by letting teams tailor extraction queries to specific layouts without changing the default behavior for all documents.

Key Features:

  • Layout-Specific Prompting: Apply more detailed extraction instructions only to documents that match a specific layout.
  • Edge-Case Control Without Regression: Avoid changing global prompts that could degrade performance across other formats.
  • Better Table Context with GenAI Vision Plus: For table-heavy documents, GenAI Vision Plus can capture surrounding context that standard table extraction may miss.
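
To make the layout-specific idea concrete, here is a minimal sketch of how per-layout overrides can sit on top of default queries. This is not the product's configuration format or API: the names `DEFAULT_QUERIES`, `LAYOUT_OVERRIDES`, `resolve_queries`, the layout ID, and the query wording are all hypothetical and used only for illustration.

```python
from typing import Dict, Optional

# Hypothetical default queries, applied to every document.
DEFAULT_QUERIES: Dict[str, str] = {
    "part_number": "Return the manufacturer part number.",
    "delivery_date": "Return the delivery date.",
}

# Hypothetical per-layout overrides; only the listed fields change.
LAYOUT_OVERRIDES: Dict[str, Dict[str, str]] = {
    "supplier_a_po": {
        "part_number": "Return the value labeled 'MPN', not the supplier SKU.",
    },
}


def resolve_queries(layout_id: Optional[str]) -> Dict[str, str]:
    """Merge layout-specific overrides over a copy of the defaults."""
    queries = dict(DEFAULT_QUERIES)  # copy so the defaults are never mutated
    queries.update(LAYOUT_OVERRIDES.get(layout_id, {}))
    return queries
```

Because the override is merged into a copy, documents that match no layout (or a layout with no overrides) still get the unchanged defaults, which is the "no regression" property described above.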


DEMO HIGHLIGHTS

  • Split Delivery Scenarios in Purchase Orders: Alex showed an example where one line item needed to be split into multiple deliveries (different dates/quantities). Using Adaptive Search Queries plus GenAI Vision Plus, the system returned multiple delivery dates and corresponding quantities correctly.
  • Ambiguity Handling: Alex explained how Adaptive Search Queries help in cases like “manufacturer part number” where documents may contain multiple competing identifiers.


SNEAK PEEK: WHAT’S COMING
Alex previewed enhancements aimed at improving model flexibility and enterprise readiness:

  • Google Gemini Model Option: A third model option alongside the existing GenAI model choices.
  • Model Connections for Enterprise Environments: Support for connecting through company API gateways/proxies to meet security and compliance requirements.
  • Custom Model Support: Ability to connect to customer-hosted or homegrown LLMs by configuring endpoints and schemas.

[Release Timing Note: The rollout may be staged, with parts expected across January and February.]


SESSION Q&A
Our session concluded with Q&A covering performance, accuracy, OCR, governance, and best practices.


Question: Can you train Document Automation to get closer to 100% accuracy?
Answer: Yes. The product supports configuration and accuracy improvement techniques (including GenAI prompting and feedback mechanisms) to increase extraction accuracy.


Question: Are documents purged after a period of time?
Answer: Yes. Documents can remain in validation for a default period (Alex referenced a 90-day TTL). Once the process completes and the outputs are generated and downloaded, files are removed from the system as part of the standard flow.


Question: If we extract using cloud extraction, can we select the OCR engine?
Answer: Yes. Cloud extraction uses Google Vision OCR today, and ABBYY is being worked on as an additional option.


Question: Can we purge immediately after output is generated (financial institution requirement)?
Answer: Yes. The default behavior is to remove files from the server once the output is generated and downloaded to the destination.


Question: Does it work on scanned receipts and invoices?
Answer: Yes. Image-based PDFs and scanned documents are supported.


Question: OCR mistakes a capital “I” for the number “1” in certain fonts. Any suggestions?
Answer: Try changing the OCR engine, consider using GenAI Vision (which can help with visual understanding), or improve input image quality using preprocessing techniques like contrast adjustments or thresholding.
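
As a rough illustration of the preprocessing idea, the sketch below applies a linear contrast stretch followed by a fixed threshold to a tiny grayscale image. It uses plain Python lists so it runs without any imaging library; the function names, the cutoff of 128, and the sample pixel values are assumptions for illustration, not settings recommended in the session.

```python
from typing import List

Image = List[List[int]]  # grayscale pixels, 0 (black) to 255 (white)


def stretch_contrast(img: Image) -> Image:
    """Linearly rescale so the darkest pixel becomes 0 and the brightest 255."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:  # flat image: nothing to stretch
        return [row[:] for row in img]
    return [[(p - lo) * 255 // (hi - lo) for p in row] for row in img]


def threshold(img: Image, cutoff: int = 128) -> Image:
    """Binarize: pixels below the cutoff become black, the rest white."""
    return [[0 if p < cutoff else 255 for p in row] for row in img]


# A faint scan: "ink" at 150 on "paper" at 200 (made-up values).
faint = [[200, 150, 200], [200, 150, 200]]
washed_out = threshold(faint)               # ink is lost: every pixel turns white
clean = threshold(stretch_contrast(faint))  # ink survives as black
```

In this example, thresholding the faint scan directly erases the stroke, while stretching the contrast first keeps it, which is why the two steps are often combined. On real scans an imaging library and an adaptive threshold would usually be a better fit.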


Question: What is the accuracy for handwritten text?
Answer: Accuracy varies based on image quality, language, and handwriting style. Alex recommended using Google Vision for handwriting and also trying GenAI Vision for improved results.


Question: Does this support Arabic, and what about accuracy?
Answer: OCR via Google Vision can support Arabic. Extraction can be more challenging due to right-to-left layouts and document composition, but the team has projects using Arabic and recommends leveraging GenAI tools and configuration techniques to address edge cases.


Question: In training, how many samples do we need at minimum?
Answer: In many cases, the goal is zero training using GenAI prompts. For tougher cases, feedback can be provided with minimal examples to improve results.


Question: When an invoice file also includes supporting documents, how do we extract only the correct values?
Answer: Best practice is to classify and split the file first, then send only the relevant document (e.g., the invoice) into extraction. Without splitting, fields can be mixed across document types.
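
The split-then-extract practice from this answer can be sketched as a small pipeline. This is only a toy: `classify_page` is a hypothetical keyword matcher standing in for a real classifier, and none of these function names come from the product.

```python
from itertools import groupby
from typing import List, Tuple


def classify_page(text: str) -> str:
    """Toy keyword classifier (hypothetical); a real system would use a trained model."""
    lowered = text.lower()
    if "invoice" in lowered:
        return "invoice"
    if "packing slip" in lowered:
        return "packing_slip"
    return "other"


def split_document(pages: List[str]) -> List[Tuple[str, List[str]]]:
    """Group consecutive pages that share a label into sub-documents."""
    labeled = [(classify_page(page), page) for page in pages]
    return [
        (label, [page for _, page in group])
        for label, group in groupby(labeled, key=lambda pair: pair[0])
    ]


def select_for_extraction(pages: List[str], wanted: str = "invoice") -> List[List[str]]:
    """Keep only the sub-documents of the wanted type for the extraction step."""
    return [doc for label, doc in split_document(pages) if label == wanted]


pages = ["Invoice total: 500", "Packing slip qty: 4", "Terms and conditions"]
to_extract = select_for_extraction(pages)
```

Only the invoice pages reach extraction here, so a quantity on the packing slip cannot be mistaken for an invoice field.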


We look forward to seeing you at the next Product Club!