I am trying to extract a few form fields from an unstructured document using Document Automation, and I am running into the issue below. Has anyone else faced the same issue?

 

EXTRACT_FAILED - [Native] : 500 : DOCUMENT_FAILED : HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /encodings/cl100k_base.tiktoken (Caused by ConnectTimeoutError(<urllib3.connection.HTTPSConnection

Hi ​@Gireesh B P 3262 ,

 

Please refer to the article below and see whether it helps.

 

DA | EXTRACT_FAILED - [Native] : 500 : DOCUMENT_FAILED : HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443):


@Gireesh B P 3262 , have you been able to try the suggestion by ​@Padmakumar ?


Hi Shreya, the issue is with the proxy, and I am working with the internal team on it. Once I have a complete solution, I will post it here. @Shreya.Kumar
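For anyone hitting the same timeout, a quick way to confirm a proxy or firewall problem is to test whether the machine can reach the URL named in the error at all. The snippet below is only a rough sketch using the Python standard library; it is not part of Document Automation. The function names `target_from_error` and `check_reachable` and the proxy address in the usage comment are made up for illustration.

```python
import re
import urllib.error
import urllib.request

# Error text as posted in this thread (shortened to the relevant part).
ERROR = (
    "HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): "
    "Max retries exceeded with url: /encodings/cl100k_base.tiktoken"
)


def target_from_error(message):
    """Rebuild the URL that failed from a urllib3 'Max retries exceeded' message."""
    match = re.search(
        r"host='([^']+)', port=(\d+)\): Max retries exceeded with url: (\S+)",
        message,
    )
    if match is None:
        raise ValueError("message does not look like a urllib3 connection-pool error")
    host, port, path = match.group(1), int(match.group(2)), match.group(3)
    return f"https://{host}:{port}{path}"


def check_reachable(url, proxy=None, timeout=10):
    """Try to open the URL, optionally through an explicit proxy.

    Returns (reachable, detail). With proxy=None, urllib falls back to the
    proxy settings found in the environment, if any.
    """
    handlers = []
    if proxy:
        handlers.append(urllib.request.ProxyHandler({"http": proxy, "https": proxy}))
    opener = urllib.request.build_opener(*handlers)
    try:
        with opener.open(url, timeout=timeout) as response:
            return True, f"HTTP {response.status}"
    except urllib.error.HTTPError as exc:
        # The server answered with an error status, so the network path works.
        return True, f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # Timeouts, DNS failures and refused connections end up here.
        return False, str(exc)


# Example usage (the proxy address is a placeholder, replace it with your own):
#   url = target_from_error(ERROR)
#   print(check_reachable(url))
#   print(check_reachable(url, proxy="http://proxy.example.internal:8080"))
```

If the check fails without the proxy but succeeds with it, that would suggest the machine running the extraction needs the proxy configured or the host allow-listed.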


 


Please try the following:

1. Ensure scanned PDFs are at least 300 DPI.

2. Avoid blurry, marked-up, or dot-matrix printed documents.

3. Use a supported format: PDF, JPG, PNG, or TIFF.

4. Check that the OCR engine (such as Tesseract or Google Vision) is properly installed and configured.

5. Go to Control Room > Bots > Learning Instances to verify the OCR settings.

6. Make sure the Bot Agent has admin permissions and access to the document folder.

7. Try recreating the learning instance and re-uploading the document.

