Question

Getting an error in document automation

  • July 2, 2025
  • 4 replies
  • 55 views


I am trying to extract a few form fields from an unstructured document using Document Automation, and I am getting the error below. Has anyone faced the same issue?

 

EXTRACT_FAILED - [Native] : 500 : DOCUMENT_FAILED : HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /encodings/cl100k_base.tiktoken (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection

4 replies

Padmakumar
Premier Pathfinder | Tier 7
  • July 3, 2025

Shreya.Kumar
Pathfinder Community Team
  • July 7, 2025

@Gireesh B P 3262, have you been able to try the suggestion by @Padmakumar?



Hi Shreya, the issue is with the proxy, and I am working on it with the internal team. Once I have a complete solution, I will post it here. @Shreya.Kumar
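For anyone hitting the same timeout, a quick way to confirm a network or proxy block is to test whether the machine can open a direct TCP connection to the host named in the error. The following is only a minimal sketch using the Python standard library; the helper names `extract_host_port` and `can_connect` are hypothetical, and it assumes Python is available on the machine that runs the extraction. It checks direct reachability only, so a `False` result on a network that requires a proxy is consistent with the proxy explanation rather than proof of it.

```python
import re
import socket


def extract_host_port(error_text):
    """Pull the host and port out of an HTTPSConnectionPool error message.

    Returns a (host, port) tuple, or None if the pattern is not found.
    """
    match = re.search(r"host='([^']+)', port=(\d+)", error_text)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def can_connect(host, port, timeout=5.0):
    """Return True if a direct TCP connection to host:port succeeds.

    This does not go through any HTTP proxy; it only tests direct egress.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, DNS failures, and refused connections.
        return False
```

To use it, paste the error text into `extract_host_port(...)` and pass the resulting host and port to `can_connect(...)`. If the result is `False`, the host from the error message (`openaipublic.blob.core.windows.net`, port 443) is likely blocked or only reachable through a proxy, which is something to raise with the network team.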


  • Navigator | Tier 3
  • July 14, 2025

 


Please try the following:

1. Ensure scanned PDFs are at least 300 DPI.
2. Avoid blurry, marked-up, or dot-matrix-printed documents.
3. Use a supported format: PDF, JPG, PNG, or TIFF.
4. Check that the OCR engine (such as Tesseract or Google Vision) is properly installed and configured.
5. Go to Control Room > Bots > Learning Instances to verify the OCR settings.
6. Make sure the Bot Agent has admin permissions and access to the document folder.
7. Try recreating the learning instance and re-uploading the document.
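The file-format and folder-access points above can be checked before uploading with a small script. This is only a rough sketch, not part of the product: the function name `preflight` is hypothetical, the `.jpeg` and `.tif` extensions are assumed aliases of the listed formats, and the readability check applies to the user running the script, which may differ from the Bot Agent account.

```python
import os
from pathlib import Path

# Formats from the list above; ".jpeg" and ".tif" are assumed aliases.
SUPPORTED_EXTENSIONS = {".pdf", ".jpg", ".jpeg", ".png", ".tiff", ".tif"}


def preflight(folder):
    """Return (file name, problem) pairs for files that may fail extraction."""
    problems = []
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file():
            continue  # skip subfolders
        if path.suffix.lower() not in SUPPORTED_EXTENSIONS:
            problems.append((path.name, "unsupported format"))
        elif not os.access(path, os.R_OK):
            problems.append((path.name, "not readable by current user"))
    return problems
```

An empty list means every file in the folder has a supported extension and is readable. It does not check scan quality or DPI.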