Solved

Counting tokens for use in Gen AI Models

  • September 3, 2024
  • 2 replies
  • 232 views


Hello

Can someone please advise the simplest and quickest way to count the number of tokens in a text file (words, punctuation, formatting, etc.) so I can make sure I don’t exceed the token limit for the Gen AI model?

 

I tried using Python, but I can’t get the Python script to output the token count to a variable. I am using the following script.

 

strInput is my input variable with the words.

strOutput is where I need to store the number of words/tokens found in strInput.

 

# Python code to count tokens using OpenAI's tokenizer
import tiktoken

# Load the appropriate GPT-4 tokenizer
encoding = tiktoken.encoding_for_model("gpt-4")

# Define the text content
text_content = """{{strInput}}"""  # This variable will be populated with content from Automation Anywhere

# Encode the content to count tokens
tokens = encoding.encode(text_content)

# Output the number of tokens
print(len(tokens))

Best answer by Stefano 5934

First, make sure that the tiktoken package is installed. You can check by running the following command in PowerShell or cmd:

python -m pip freeze

If you don’t see it in the list of packages displayed after executing that command, you can install it by running the following command:

python -m pip install tiktoken 

If you do not have administrator privileges, you can run the command with the --user option like so:

python -m pip install tiktoken --user
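If more than one Python is installed on the machine, it can help to check from inside Python itself whether the package is importable. The following is only a minimal sketch using the standard library; the helper name is_installed is made up for illustration, and it only reports on the interpreter that runs it, which may or may not be the one your bot uses.

```python
import importlib.util
import sys


def is_installed(package_name: str) -> bool:
    """Return True if this interpreter can locate the given top-level package."""
    # find_spec returns None when the package cannot be found.
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # Print the interpreter path so you can tell which Python was checked.
    print("Interpreter:", sys.executable)
    print("tiktoken installed:", is_installed("tiktoken"))
```

If this prints False for tiktoken, running the pip install command with the interpreter path shown on the first line targets that same Python.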

 

Secondly, after you’ve made sure that the tiktoken package has been installed correctly, place your script inside a function. It should look something like this:

import tiktoken

def count_tokens(text_content):
    # Load the appropriate GPT-4 tokenizer
    encoding = tiktoken.encoding_for_model("gpt-4")

    # Encode the content to count tokens
    tokens = encoding.encode(text_content)

    # Return the number of tokens
    return len(tokens)

Finally, change the "🐍Python script: Execute script" action in your bot to "🐍Python script: Execute function count_tokens" and add your variable $strInput$ in the input field.

 

In the end, your bot should look something like this:


You can find the source code of the bot here: 

A360-Python_tiktoken_GPT-4_tokenizer.json - GitHub Gist

You can use this extension to import the source code to a task bot: 
Bot Assistant - Chrome Web Store (google.com)

And here’s the link to get the action package that I used in my bot to install the tiktoken package: 

Run Synchronous Scripts Package - Bot Store (automationanywhere.com)
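Since the goal in the question is to stay under the model's token limit, the returned count still has to be compared against that limit. Here is a rough sketch of that comparison. The function name fits_token_limit, the reserved_for_reply parameter, and the numbers in the example are assumptions for illustration only; take the actual limit from the documentation of the model you are calling.

```python
def fits_token_limit(token_count: int, max_tokens: int, reserved_for_reply: int = 0) -> bool:
    """Return True if the prompt plus the tokens reserved for the reply fit the limit."""
    if token_count < 0 or max_tokens < 0 or reserved_for_reply < 0:
        raise ValueError("token counts must be non-negative")
    return token_count + reserved_for_reply <= max_tokens


# Example with made-up numbers; the real limit depends on the model.
print(fits_token_limit(7500, 8192, reserved_for_reply=500))  # True
print(fits_token_limit(7900, 8192, reserved_for_reply=500))  # False
```

Leaving some room for the reply matters because many models count the prompt and the generated answer against the same limit.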


2 replies

  • Navigator | Tier 3
  • 36 replies
  • Answer
  • September 6, 2024


  • Author
  • Cadet | Tier 2
  • 3 replies
  • September 9, 2024

Thank you so much for this solution, works a treat!

