Artificial Intelligence (AI) is continuously revolutionizing our interaction with technology. While building your own AI agent might appear challenging, you can create one for free using Ollama and Python. This post will guide you through the process of developing your first AI agent capable of checking the real response time of any website, providing a solid foundation for further customization and enhancement to suit your specific use cases. Also, this tutorial will serve as the basis for future tutorials that I will share here on my blog.

What is Ollama?

Ollama is an open-source project designed to provide a powerful and user-friendly platform for running large language models (LLMs) on your local machine. (Thanks, Meta, for the open Llama models.)

Installing Ollama is as easy as running this command on your terminal:

curl -fsSL https://ollama.com/install.sh | sh

You can find out more about the project on the official Ollama website.

Setting up the project

First, let's pull the LLM we want to use for our project. I recommend either mistral or llama3; both have agent capabilities.

ollama pull llama3
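
If you want to try the model before writing any code, you can chat with it straight from your terminal (type /bye to quit):

ollama run llama3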

Then, we need to create a virtual environment to install the necessary Python packages.

python -m venv venv

Then activate it by going to the folder where you ran the previous command and typing:

source venv/bin/activate

Now we just need to install two libraries for our project: ollama (to interact with the LLM) and requests (to check the response time of a website).

pip install ollama requests
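
Before writing the agent, it's worth a quick sanity check that the ollama Python client can reach your local model. Here is a minimal sketch that uses the same ollama.chat call we'll rely on later:

import ollama

# One-off chat call against the locally pulled llama3 model
response = ollama.chat(model='llama3', messages=[
    {'role': 'user', 'content': 'Reply with the single word: pong'}
])
print(response['message']['content'])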

Implementation

Let's create three Python files to organize our project: prompts.py, actions.py, and main.py.

prompts.py will contain our predefined prompts, like the system prompt that tells the LLM to act like an agent and what process it should follow.

system_prompt = """
You run in a loop of THOUGHT, ACTION, PAUSE and ACTION_RESPONSE.
At the end of the loop, you output an ANSWER.

Use THOUGHT to really understand the question you've been asked.
Use ACTION to run one of the actions available to you - that means responding with just a valid JSON object to be parsed and nothing else.
The format of an action call is defined below - then return PAUSE.
ACTION_RESPONSE will be the result of running those actions.

Your available actions are:

get_response_time:
e.g. get_response_time: nicolasleao.tech
returns the response time of a website. 

Here's an example session:
QUESTION: what is the response time for nicolasleao.tech?
THOUGHT: I should check the response time of that web page first
ACTION: 
{"function_name":"get_response_time","function_params":{"url": "https://nicolasleao.tech"}}

PAUSE

You will be called again with something like this:
ACTION_RESPONSE: 0.5

That means the action you performed returned this value, so you will then output:

ANSWER: The response time for nicolasleao.tech is 0.5 seconds.

Note that your final answer should always be in the following format:
ANSWER: <your_answer_here>
"""

prompts.py

Even though we explicitly tell the LLM to return just the JSON and nothing else, sometimes it still adds extra text around it. Don't worry, we have a workaround for that in main.py.

Then, in actions.py, we create all of the functions that the LLM can execute to achieve its goal. In main.py we will parse the LLM response, check whether it calls an action defined in our actions.py file, and if so, execute that function with the provided parameters.

import requests

# Perform a GET request to the given URL and return the elapsed time in seconds
def get_response_time(url):
    response = requests.get(url, timeout=5)
    return response.elapsed.total_seconds()

actions.py

For this demo, we'll just create a simple function to check the response time of a website, but your actions can be as complex as you want.
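
For example, if you wanted the agent to also report HTTP status codes, a hypothetical second action could live right next to get_response_time in actions.py - just remember to also describe it in the system prompt and register it in the available_actions dictionary in main.py:

# Hypothetical extra action: return the HTTP status code of a website
def get_status_code(url):
    response = requests.get(url, timeout=5)
    return response.status_code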

Then, in main.py, we define a function that tries to extract the JSON from the LLM response, plus the main agent loop of our script.

import json
import ollama
from actions import get_response_time
from prompts import system_prompt

# Call Ollama with all our previous messages and return a response in text
def get_llm_response(messages):
    response = ollama.chat(model='llama3', messages=messages)
    return response['message']['content']

# Try to extract a valid JSON object from the LLM response, if not found return None
def extract_json(text):
    try:
        # Workaround for extra text in the LLM response:
        # keep only what sits between 'ACTION:' and 'PAUSE'
        a = text.find('ACTION:')
        b = text.find('PAUSE')
        if a != -1 and b != -1:
            candidate = text[a + len('ACTION:'):b].strip()
            print('trying to parse function', candidate)
            return json.loads(candidate)
        else:
            return json.loads(text)
    except json.JSONDecodeError:
        return None

# A dictionary that maps the function names in text to the actual function definitions
available_actions = {
    "get_response_time": get_response_time
}

# Get user input in text
user_prompt = input('Ask for the response time of any website on the internet\n')

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": user_prompt},
]

# We will loop through the process of THOUGHT, ACTION, PAUSE and ACTION_RESPONSE, to execute the actions and
# add the ACTION_RESPONSE to the conversation history
current_iteration = 1
max_iterations = 5
while current_iteration <= max_iterations:
    # For debugging we print the current iteration
    print('TURN', current_iteration)
    current_iteration += 1

    response = get_llm_response(messages)
    # Keep the assistant's reply in the conversation history so the model
    # can see which action it chose on the next turn
    messages.append({"role": "assistant", "content": response})
    # For debugging we print the LLM response in every iteration
    print(response)
    print('\n')

    # If the LLM response contains 'ANSWER:' as a substring, it means the LLM already found the answer and we can stop the loop.
    answer_index = response.find("ANSWER: ")
    if answer_index != -1:
        print(response[answer_index + 8:])
        break

    # At every iteration, we try to extract the JSON action from the response,
    # if an action is not found, it means that the LLM is not on the ACTION step yet.
    json_function = extract_json(response)
    print('function', json_function)

    # If we found a JSON action in the LLM response, we must execute it and give the output back to the LLM in the conversation
    if json_function:
        function_name = json_function['function_name']
        function_params = json_function['function_params']
        if function_name not in available_actions:
            raise Exception('Tried to run an unrecognized action')
        # If the action is in our available actions, execute it with the given parameters
        action_function = available_actions[function_name]
        result = action_function(**function_params)
        print('time', result)
        function_result_message = f"ACTION_RESPONSE: {result}"
        # Append the result of our action to the conversation with the LLM so it can be used by it as context to answer our query.
        messages.append({"role": "user", "content": function_result_message})

main.py
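
To see the workaround from extract_json in isolation, here's a small standalone sketch (using a made-up, illustrative response) of how the JSON gets recovered when the model wraps its action call in extra text:

import json

# A made-up example of a "chatty" response that wraps the action call in extra text
sample = """THOUGHT: I should check the response time of that web page first
ACTION:
{"function_name": "get_response_time", "function_params": {"url": "https://nicolasleao.tech"}}

PAUSE"""

# Keep only what sits between 'ACTION:' and 'PAUSE', then parse it as JSON
a = sample.find('ACTION:')
b = sample.find('PAUSE')
action = json.loads(sample[a + len('ACTION:'):b].strip())
print(action['function_name'])    # get_response_time
print(action['function_params'])  # {'url': 'https://nicolasleao.tech'}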

Result

Running python main.py, we can ask for the response time of any website on the internet.

Example for my blog:

Example for google.com

Conclusion

As you continue to explore the capabilities of AI and Ollama, remember that the possibilities are vast. Experiment with different models, fine-tune your agent, and stay curious. The knowledge gained from this project will serve as a valuable asset in your AI journey.

If you found this guide helpful and want to stay updated with more tutorials and insights, consider subscribing to my newsletter for free. For those looking to support my work and gain access to exclusive content, you can become a Plus or Pro paid member. Your support helps me continue to provide high-quality, valuable content for the tech community.

Let's innovate and grow together. Subscribe today and take the next step in your AI adventure!
