The RunOutput from an agent run includes detailed metrics about token usage, cost, timing, and per-model breakdowns.
```python
from agno.agent import Agent
from agno.models.openai import OpenAIResponses
from agno.tools.hackernews import HackerNewsTools
from agno.db.sqlite import SqliteDb
from rich.pretty import pprint

agent = Agent(
    model=OpenAIResponses(id="gpt-5.2"),
    tools=[HackerNewsTools()],
    db=SqliteDb(db_file="tmp/agents.db"),
    markdown=True,
)

run_response = agent.run("What are the top stories on HackerNews?")

# Message metrics (MessageMetrics)
for message in run_response.messages:
    if message.role == "assistant":
        pprint(message.metrics.to_dict())

# Run metrics (RunMetrics)
pprint(run_response.metrics.to_dict())

# Per-model breakdown
if run_response.metrics.details:
    for model_type, model_metrics_list in run_response.metrics.details.items():
        for m in model_metrics_list:
            print(f"{model_type}: {m.provider}/{m.id} - {m.total_tokens} tokens")

# Session metrics (SessionMetrics)
pprint(agent.get_session_metrics().to_dict())
```
Metrics are available at multiple levels:
- Per message: each assistant message has `MessageMetrics` with per-API-call token counts and timing.
- Per run: each `RunOutput` has `RunMetrics` with aggregated totals and a `details` breakdown by model type.
- Per session: `agent.get_session_metrics()` returns `SessionMetrics` aggregated across all runs.
| Level | Type | Access |
|---|---|---|
| Per message | MessageMetrics | message.metrics |
| Per run | RunMetrics | run_response.metrics |
| Per session | SessionMetrics | agent.get_session_metrics() |
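The roll-up relationship between these levels can be sketched with plain dataclasses. This is a simplified illustration of how per-message counts aggregate into run totals, not the actual agno classes:

```python
from dataclasses import dataclass


@dataclass
class MessageMetrics:
    # Simplified stand-in for agno's MessageMetrics (illustration only)
    input_tokens: int = 0
    output_tokens: int = 0

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens


@dataclass
class RunMetrics:
    # Aggregates message-level metrics across one run
    input_tokens: int = 0
    output_tokens: int = 0

    def add(self, m: MessageMetrics) -> None:
        self.input_tokens += m.input_tokens
        self.output_tokens += m.output_tokens


# Two assistant messages in one run
messages = [MessageMetrics(120, 40), MessageMetrics(300, 85)]
run = RunMetrics()
for m in messages:
    run.add(m)

print(run.input_tokens, run.output_tokens)  # 420 125
```

Session metrics follow the same pattern one level up: run totals summed across every run in the session.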
Run fields (RunMetrics)
| Field | Description |
|---|---|
| input_tokens | Tokens sent to the model. |
| output_tokens | Tokens generated by the model. |
| total_tokens | Sum of input_tokens and output_tokens. |
| audio_input_tokens | Audio tokens in the input. |
| audio_output_tokens | Audio tokens in the output. |
| audio_total_tokens | Sum of audio_input_tokens and audio_output_tokens. |
| cache_read_tokens | Tokens read from cache. |
| cache_write_tokens | Tokens written to cache. |
| reasoning_tokens | Tokens used for reasoning. |
| cost | Cost of the run. |
| duration | Run duration in seconds. |
| time_to_first_token | Time from run start to first token (seconds). |
| details | Per-model breakdown by model type. See Metrics reference. |
| additional_metrics | Extra metrics (e.g., eval_duration). |
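The two "total" fields are derived sums, which makes them easy to sanity-check against a `to_dict()` payload. A quick sketch with made-up values (field names from the table above; the numbers are hypothetical):

```python
# Hypothetical RunMetrics.to_dict() payload — values invented for illustration
run_metrics = {
    "input_tokens": 1200,
    "output_tokens": 350,
    "total_tokens": 1550,
    "audio_input_tokens": 0,
    "audio_output_tokens": 0,
    "audio_total_tokens": 0,
    "cache_read_tokens": 800,
    "cache_write_tokens": 0,
    "reasoning_tokens": 96,
    "duration": 3.8,
    "time_to_first_token": 0.62,
}

# total_tokens and audio_total_tokens are sums of their input/output counterparts
assert run_metrics["total_tokens"] == (
    run_metrics["input_tokens"] + run_metrics["output_tokens"]
)
assert run_metrics["audio_total_tokens"] == (
    run_metrics["audio_input_tokens"] + run_metrics["audio_output_tokens"]
)

# Share of input tokens served from cache in this (hypothetical) run
cache_hit_ratio = run_metrics["cache_read_tokens"] / run_metrics["input_tokens"]
print(f"{cache_hit_ratio:.0%} of input tokens were read from cache")
```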
Message fields (MessageMetrics)
| Field | Description |
|---|---|
| input_tokens | Tokens sent to the model. |
| output_tokens | Tokens generated by the model. |
| total_tokens | Sum of input_tokens and output_tokens. |
| audio_input_tokens | Audio tokens in the input. |
| audio_output_tokens | Audio tokens in the output. |
| audio_total_tokens | Total audio tokens. |
| cache_read_tokens | Tokens served from cache. |
| cache_write_tokens | Tokens written to cache. |
| reasoning_tokens | Tokens used for reasoning. |
| cost | Cost of this API call. |
| duration | Duration of this API call (seconds). |
| time_to_first_token | Time to first token for this API call (seconds). |
| provider_metrics | Provider-specific metrics (e.g., Ollama timing, Groq timing, Cerebras timing). |
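Because provider_metrics varies by provider and may be absent entirely, it pays to read it defensively. A minimal sketch over hypothetical dict payloads (the helper and field values below are illustrative, not part of the agno API):

```python
def summarize_provider_metrics(metrics: dict) -> str:
    # Only report keys that are actually present; providers differ
    provider = metrics.get("provider_metrics") or {}
    if not provider:
        return "no provider-specific metrics"
    return ", ".join(f"{k}={v}" for k, v in sorted(provider.items()))


# Hypothetical payloads: an Ollama-style response with timing fields,
# and a provider that reports nothing extra
ollama_like = {"provider_metrics": {"eval_duration": 1.92, "load_duration": 0.31}}
plain_like = {"provider_metrics": None}

print(summarize_provider_metrics(ollama_like))
print(summarize_provider_metrics(plain_like))  # no provider-specific metrics
```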
Developer Resources