Hello
Can someone please advise on the simplest and quickest way to count the number of tokens in a text file (words, punctuation, formatting, etc.) so I can make sure I don’t exceed the token limit for the Gen AI model?
I tried using Python, but I can’t get the Python script to output the token count to a variable. I am using the following script.
strInput is my input variable with the words.
strOuput is where I need to store the number of words/tokens found in strInput.
# Python code to count tokens using OpenAI's tokenizer
import tiktoken
# Load the appropriate GPT-4 tokenizer
encoding = tiktoken.encoding_for_model("gpt-4")
# Define the text content
text_content = """{{strInput}}""" # This variable will be populated with content from Automation Anywhere
# Encode the content to count tokens
tokens = encoding.encode(text_content)
# Output the number of tokens
print(len(tokens))
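
For reference, one way the script above could be restructured is as a function that returns the count instead of printing it. This is only a sketch. It assumes the bot's Python step can call a named function with a single argument and store the return value in a variable such as strOuput. The function name count_tokens and the optional encoding parameter are hypothetical and not part of the original script, and the count is returned as a string on the assumption that the bot variable holds text.

```python
def count_tokens(text, encoding=None):
    """Return the number of tokens in `text` as a string.

    `count_tokens` and the `encoding` parameter are hypothetical names
    used for illustration; they do not come from the original script.
    """
    if encoding is None:
        # Imported here so the function can also be exercised with a
        # stand-in encoder on a machine where tiktoken is not installed.
        import tiktoken

        # Same tokenizer lookup as in the original script.
        encoding = tiktoken.encoding_for_model("gpt-4")

    # encode() returns a list of token ids; its length is the token count.
    tokens = encoding.encode(text)

    # Assumption: the calling bot expects a string return value.
    return str(len(tokens))
```

Passing strInput as the function argument, rather than pasting {{strInput}} between triple quotes, should also avoid syntax errors when the text itself contains quotes or backslashes.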