Subject Agent Frameworks
Date Feb 2026
Section 02 / 07

What's an agent, anyway?

An agent is an LLM running tools in a loop.

So an agent has three ingredients: an LLM, a set of tools, and a loop that connects them.

Let's unpack each one.

What an LLM does

An LLM is a text-completion machine. You send it a string of text. It predicts the most probable next token, then the next, until it stops.

When you ask a question, the most probable continuation is usually text that reads like an answer to that question.

An LLM can only produce text. It cannot browse the web. It cannot execute code. It cannot read a file or call an API.

What a tool is

A tool gives an LLM capabilities it does not have natively. Tools enable exactly what the model cannot do on its own: searching the web, running code, reading files, calling APIs.

The LLM cannot execute tools on its own, though: it can only emit a request to use one. Your code runs the tool and sends the result back.
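
Concretely, a tool is just a function your program can run, plus a name, a description, and a parameter schema that are sent to the model. Here is a minimal sketch with a hypothetical get_weather tool; the input_schema key follows Anthropic's convention, and other providers use slightly different field names:

def get_weather(city: str) -> str:
    # Stand-in implementation: a real tool would call a weather API here.
    return f"Sunny, 21°C in {city}"

get_weather_tool = {
    "name": "get_weather",
    "description": "Get the current weather for a given city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string", "description": "City name, e.g. Paris"}},
        "required": ["city"],
    },
}

The function is what your code executes; the schema is what the model reads when deciding whether, and how, to call it.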

How LLMs learned to call tools

We said that an LLM can only produce text. So how does it ask for a tool to be called? Does it return text saying "I need to run the calculator", or something like that?

To call a tool, the LLM returns a JSON object that says which tool it wants to run, and with which parameters.

For example, if the LLM wants to check the weather in Paris, instead of responding with text, it returns something like:

{
  "type": "tool_use",
  "name": "get_weather",
  "input": { "city": "Paris" }
}

But how did the LLM learn to produce such a JSON object as the most likely continuation in the middle of a conversation held in plain English?

Ordinary pretraining data contains essentially no examples of this. Nobody writes "output a JSON object to invoke a calculator function" on the internet.

LLMs learn when and how to call tools through fine-tuning on tool-use transcripts: conversations that pair tool definitions with the JSON calls the model should emit and the results that come back.
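
To make that concrete, here is a schematic sketch of the conversational shape such a transcript might take. The role names and format are illustrative, not any provider's actual training data:

# Illustrative only: one tool-use training example. Each example also
# carries the tool definitions (like get_weather_tool above) so the model
# learns to match requests against the tools it was actually given.
transcript = [
    {"role": "user", "content": "What's the weather in Paris?"},
    {"role": "assistant", "content": [
        {"type": "tool_use", "name": "get_weather", "input": {"city": "Paris"}},
    ]},
    {"role": "tool", "content": "12°C, light rain"},
    {"role": "assistant", "content": "It's currently 12°C with light rain in Paris."},
]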

Tool hallucination is a consequence of tool training. The model can generate calls to tools that were never provided, or fabricate parameters. UC Berkeley's Gorilla project (Berkeley Function-Calling Leaderboard) has documented this systematically — it is one reason agent frameworks invest in validation and error handling.
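This is why a careful harness checks every requested call before running it. A minimal sketch of such a check, assuming call objects with .name and .arguments (as in the pseudocode later in this section) and a hypothetical tool_registry dict mapping tool names to Python functions:

def run_tool_call(call, tool_registry):
    # The model asked for a tool that was never provided: report it
    # as the tool result instead of crashing, so the model can recover.
    if call.name not in tool_registry:
        return f"Error: unknown tool '{call.name}'. Available tools: {sorted(tool_registry)}"
    try:
        return tool_registry[call.name](**call.arguments)
    except TypeError as exc:
        # Wrong or missing parameters: send the error back so the model can retry.
        return f"Error calling '{call.name}': {exc}"
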

The two-step pattern

When you call an LLM with tools enabled, two things can happen:

  1. The model responds with text — it has enough information to answer directly.
  2. The model responds with a tool-call request — a structured object specifying which tool to call and what arguments to pass.

If the model requests a tool call, your code executes it. You send the result back as a follow-up message. The model uses that result to formulate its answer — or to request yet another tool call.

Tool use always involves at least two model calls:

messages = [system_prompt, user_message]

# LLM Call 1 — send the conversation + list of available tools
response = llm(messages, tools=available_tools)

# Did the model respond with text, or with a tool-call request?
if response.has_tool_call:
    tool_call = response.tool_call
    result = execute(tool_call.name, tool_call.arguments)
    messages.append(tool_call)
    messages.append(tool_result(tool_call.id, result))

    # LLM Call 2 — send the conversation again, now including the tool result
    response = llm(messages, tools=available_tools)
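
Grounding the pseudocode in a real API can help. Here is a minimal sketch of the same two-step exchange, assuming the Anthropic Python SDK, with a hypothetical get_weather function standing in for a real weather API; other providers differ in field names (OpenAI calls the schema "parameters") but the shape is the same:

import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

def get_weather(city):
    return f"Sunny, 21°C in {city}"  # stand-in for a real weather API

tools = [{
    "name": "get_weather",
    "description": "Get the current weather for a given city.",
    "input_schema": {"type": "object",
                     "properties": {"city": {"type": "string"}},
                     "required": ["city"]},
}]

messages = [{"role": "user", "content": "What's the weather in Paris?"}]

# LLM call 1: send the conversation plus the list of available tools
response = client.messages.create(
    model="claude-sonnet-4-20250514",  # example model ID; any tool-capable model works
    max_tokens=1024,
    messages=messages,
    tools=tools,
)

if response.stop_reason == "tool_use":
    # Keep the assistant turn (with its tool_use block) in the history.
    messages.append({"role": "assistant", "content": response.content})
    results = []
    for block in response.content:
        if block.type == "tool_use":
            results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": get_weather(**block.input),  # your code runs the tool
            })
    messages.append({"role": "user", "content": results})

    # LLM call 2: send the conversation again, now including the tool result
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=messages,
        tools=tools,
    )

print(response.content[0].text)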

The agentic loop

Many tasks require more than one tool call. A coding assistant might read a file, edit it, run the tests, check output, fix a failing test — all in sequence. The model cannot know in advance how many steps it will need.

The solution: wrap the two-step pattern in a loop.

messages = [system_prompt, user_message]

while True:
    response = llm(messages, tools=available_tools)

    if response.has_tool_calls:
        for call in response.tool_calls:
            result = execute(call.name, call.arguments)
            messages.append(call)
            messages.append(tool_result(call.id, result))
        continue

    break  # no tool calls — done

print(response.text)

The model loops: calling tools, receiving results, deciding what to do next, until it produces text instead of another tool call.

In practice, you add guardrails: a maximum number of iterations, a cost budget, validation checks. But the core mechanism is the same.
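
For instance, here is a sketch of the same loop with an iteration cap, using the same placeholder helpers (llm, execute, tool_result) as the pseudocode above:

MAX_STEPS = 20  # guardrail: hard cap on loop iterations

messages = [system_prompt, user_message]

for step in range(MAX_STEPS):
    response = llm(messages, tools=available_tools)

    if not response.has_tool_calls:
        break  # the model answered with text: done

    for call in response.tool_calls:
        result = execute(call.name, call.arguments)
        messages.append(call)
        messages.append(tool_result(call.id, result))
else:
    # The cap was hit without a final answer: fail loudly instead of looping forever.
    raise RuntimeError(f"Agent did not finish within {MAX_STEPS} steps")

print(response.text)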

What this looks like in practice. Here is a simplified trace of an agent booking a restaurant. Each block is one iteration of the loop:

User:      "Find me a good Italian restaurant near the office
            for Friday dinner, 4 people."

Agent:     [tool: search_web("Italian restaurants near 123 Main St")]
           → 3 results: Trattoria Roma, Pasta House, Il Giardino

Agent:     [tool: get_reviews("Trattoria Roma", "Pasta House", "Il Giardino")]
           → Trattoria Roma: 4.7★, Pasta House: 3.9★, Il Giardino: 4.5★

Agent:     [tool: check_availability("Trattoria Roma", friday, party=4)]
           → available at 7:30 PM and 8:00 PM

Agent:     "Trattoria Roma is the best rated (4.7★) and has two
            slots Friday for 4: 7:30 PM or 8:00 PM.
            Want me to book one?"

Four loop iterations. Three tool calls, then a text response that ends the loop. The agent decided which restaurants to look up, which one to check availability for first (the highest rated), and when it had enough information to stop. The program just ran the tools and passed results back.

What to keep in mind