
Agent Inspector - Debug your A.I. Agent or LLM actions

How do you know your agent is doing what it's supposed to? How do you make it work more consistently? How do you assess whether it's responding toxically to your users? How long is it running for?

When building an A.I. agent, you need to consider all of these factors. A single bad or slow result may be the last time a user ever tries your agent.

To solve this, I've built an agent to help you build and iterate on your agents: the Agent Inspector.

 

How it works

At its most basic, you invoke the Agent Inspector from within your own agent and provide:

  • The prompt or goal of your agent
  • Any data your agent uses to function, if applicable
  • Your agent's response
  • Optionally, the start time of your agent. You can use another agent I built to get that automatically.
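As a rough sketch of the inputs listed above, the snippet below gathers them into a single object. The field names are borrowed from the webhook body documented further down this page. The `buildInspectorInput` helper, the boolean `output_is_json` value, and the ISO 8601 timestamp for the start time are assumptions made for illustration; the page does not state which formats the agent expects.

```javascript
// Hypothetical helper: collects the inputs described above into one object.
// Field names mirror the webhook body shown later on this page.
function buildInspectorInput({ prompt, response, data = "", outputIsJson = false, startTime = null }) {
  if (!prompt || !response) {
    throw new Error("prompt and response are both required");
  }
  const input = {
    llm_prompt: prompt,
    llm_response: response,
    data_provided_to_llm: data,
    output_is_json: outputIsJson, // assumption: boolean; the page does not specify the type
  };
  // The start time is optional, so it is only included when supplied.
  if (startTime !== null) {
    // Assumption: an ISO 8601 string; the expected format is not documented here.
    input.execution_start_time = new Date(startTime).toISOString();
  }
  return input;
}

const example = buildInspectorInput({
  prompt: "Summarize the ticket in one sentence.",
  response: "The customer cannot reset their password.",
  startTime: Date.now(), // captured before your agent starts its work
});
console.log(example);
```

Capturing the start time before your agent does any work is what lets the optional timing information reflect the whole run.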

Then you pick whether you want your report in human-readable format, which you can display as you work on your agent, or in JSON format, which is great if you want to send the results somewhere like a database for future reference.

Regardless of the format you choose, numerous tests are run to evaluate both your prompt and your agent's response.

Tests run

  • Automatic detection of the expected data type of the result - especially important if you need to run code or invoke another agent on this data.
  • Verification that the response is in the expected data type.
  • Relevancy test of the response.
  • Assessment of whether there are multiple ways to interpret the prompt's instructions - this helps you ensure consistent LLM results.
  • Does the LLM have all of the information it needs, or could it be making things up?
  • Toxicity - Is the agent's response toxic? Does it contain offensive, harmful, or otherwise dangerous language?
  • An overall evaluation: Pass/Fail
  • Succinct but detailed reasoning for the pass/fail response
  • Execution timing information (optional)
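If you choose the JSON format, you will probably want to act on the result in code. The sketch below shows one way that could look. The report shape here is entirely hypothetical: the real field names are not documented on this page, so `overall`, `reasoning`, and every key under `checks` are placeholders loosely named after the tests listed above.

```javascript
// Hypothetical report shape; the real JSON field names are not documented here.
const sampleReport = {
  overall: "Fail",
  reasoning: "The prompt asks for JSON but the response is plain text.",
  checks: {
    type_matches: false,
    relevant: true,
    ambiguous_prompt: false,
    missing_information: false,
    toxic: false,
  },
};

// Returns the names of the checks whose value signals a problem.
function failedChecks(report) {
  // For these (assumed) keys, `true` is the bad outcome; for the rest, `false` is.
  const badWhenTrue = new Set(["ambiguous_prompt", "missing_information", "toxic"]);
  return Object.entries(report.checks)
    .filter(([name, value]) => (badWhenTrue.has(name) ? value === true : value === false))
    .map(([name]) => name);
}

console.log(sampleReport.overall, failedChecks(sampleReport));
```

Once you know the actual field names in your report, swap them in; the filtering idea stays the same.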

Once you've had this information while working on your agent, you'll find it hard to build without it.

How to use it

This agent was designed to work in a handful of ways. NOTE: The instructions below are being updated, and code blocks for the more advanced uses will be formatted to make things easier.

Within another agent on Agent.ai

The primary way this agent was designed to work is within other Agent.ai agents. Here are the instructions for adding it to your own agent.

  1. While viewing the "Actions" tab of the Agent Builder click "Add Action".
  2. Click the "Advanced" tab.
  3. Choose "Invoke Other Agent".
  4. In the "Agent ID" field enter "debug_my_agent".
  5. Click into the "Parameters, one value per line" field. The variables that the agent accepts should appear. For each, use "Insert Variable" to pass information to the agent. The fields "data given to llm to make decisions" and "Execution start time" are optional. Most users will want to set "Hide response from user but return JSON" to false.

Tip: If you're actively tweaking your prompt, use "Add Action" > "Run Process", choose "set variable", and put your prompt in that variable. Then pass the variable to your LLM actions and to the prompt field of the Agent Inspector. That way you can tweak the prompt in one place and get feedback.
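The same idea carries over if you call the Agent Inspector from your own code. A minimal sketch, assuming two hypothetical helpers (`buildLlmCall` and `buildInspectorCall`) that stand in for your LLM action and the inspector invocation:

```javascript
// Single source of truth for the prompt, analogous to the "set variable" step.
const PROMPT = "List three risks in the contract below as a JSON array of strings.";

// Hypothetical stand-in for your LLM action: it reads the shared prompt.
function buildLlmCall(contractText) {
  return { prompt: PROMPT, input: contractText };
}

// Hypothetical stand-in for the inspector invocation: it reads the same prompt,
// so an edit to PROMPT is evaluated without touching this function.
function buildInspectorCall(llmResponse, contractText) {
  return {
    llm_prompt: PROMPT,
    llm_response: llmResponse,
    data_provided_to_llm: contractText,
  };
}

const contract = "The supplier may terminate at any time without notice.";
console.log(buildLlmCall(contract));
console.log(buildInspectorCall('["Termination without notice"]', contract));
```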

 

Webhook - In development

You can have this agent evaluate agents and LLM responses on any platform that supports making an API request. Note: the Agent.ai platform does not seem to handle this properly right now, so the webhook may or may not work. If you get "Error running agent: cannot unpack non-iterable NoneType object", sorry, that's the platform not handling the request properly.

https://api-lr.agent.ai/v1/agent/nvaszhzjk2u4jitq/webhook/9367d09a

 

The body must contain:

{
  "llm_prompt": "REPLACE_ME",
  "llm_response": "REPLACE_ME",
  "data_provided_to_llm": "REPLACE_ME",
  "output_is_json": "REPLACE_ME",
  "execution_start_time": "REPLACE_ME"
}

An example curl request to the agent:

curl -L -X POST -H 'Content-Type: application/json' \
'https://api-lr.agent.ai/v1/agent/nvaszhzjk2u4jitq/webhook/9367d09a' \
-d '{"llm_prompt":"REPLACE_ME","llm_response":"REPLACE_ME","data_provided_to_llm":"REPLACE_ME","output_is_json":"REPLACE_ME","execution_start_time":"REPLACE_ME"}'
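Before sending a request like the one above, it can help to confirm the body carries every key the webhook expects. Here is a small sketch of that check. The `missingFields` helper is my own illustration, not part of the platform, and it only checks that each key is present, not that its value is valid.

```javascript
// The five keys the webhook body must contain, as listed above.
const REQUIRED_FIELDS = [
  "llm_prompt",
  "llm_response",
  "data_provided_to_llm",
  "output_is_json",
  "execution_start_time",
];

// Hypothetical helper: returns the required keys that are absent from `body`.
function missingFields(body) {
  return REQUIRED_FIELDS.filter((key) => !(key in body));
}

const draft = { llm_prompt: "Say hello.", llm_response: "Hello!" };
console.log(missingFields(draft));
```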

 

An example Node.js request to the agent:

const https = require('https');

const postData = JSON.stringify({
  llm_prompt: "REPLACE_ME",
  llm_response: "REPLACE_ME",
  data_provided_to_llm: "REPLACE_ME",
  output_is_json: "REPLACE_ME",
  execution_start_time: "REPLACE_ME"
});

const options = {
  hostname: 'api-lr.agent.ai',
  port: 443,
  path: '/v1/agent/nvaszhzjk2u4jitq/webhook/9367d09a',
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    // Byte length, not string length, so non-ASCII characters are counted correctly
    'Content-Length': Buffer.byteLength(postData)
  }
};

// Send the request; print the status, headers, and response body as it arrives
const req = https.request(options, (res) => {
  console.log(`statusCode: ${res.statusCode}`);
  console.log('Headers: ', res.headers);

  res.on('data', (d) => {
    process.stdout.write(d);
  });
});

req.on('error', (error) => {
  console.error(error);
});

req.write(postData);
req.end();
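For newer runtimes, the same request can be sketched with `fetch`, which is available globally in Node.js 18 and later. The helper names below are my own, and how the platform reports errors is an assumption: this sketch simply looks for the error text quoted earlier anywhere in the response body, so you can tell a platform hiccup apart from a real report.

```javascript
const WEBHOOK_URL = "https://api-lr.agent.ai/v1/agent/nvaszhzjk2u4jitq/webhook/9367d09a";
const KNOWN_PLATFORM_ERROR = "cannot unpack non-iterable NoneType object";

// True when the response text contains the platform error quoted earlier.
function isKnownPlatformError(text) {
  return typeof text === "string" && text.includes(KNOWN_PLATFORM_ERROR);
}

// Posts the body and returns the raw response text.
// `fetchImpl` is injectable so the function can be exercised without network access.
async function inspectViaWebhook(body, fetchImpl = fetch) {
  const res = await fetchImpl(WEBHOOK_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
  const text = await res.text();
  if (isKnownPlatformError(text)) {
    // Assumption: the error string shows up in the response body.
    throw new Error("Agent.ai platform error: " + text);
  }
  return text;
}
```

You would call it as `inspectViaWebhook({ llm_prompt: "...", llm_response: "...", ... }).then(console.log)`, passing the same five fields as in the examples above.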

 

Email

The agent can also be emailed at debug_my_agent.agent.ai@agent.ai. This might be useful on automation platforms that do not support serverless functions, when you as the creator of an agent do not have the development skills to build a fetch request, or simply as an easy way of getting the information.

 

Note: Agent.ai handles sending and receiving the emails, and I've found this to be hit or miss, so do not rely on email for mission-critical uses.

 

Email the address above and provide the following data:

llm_prompt=``
llm_response=``
output_is_json=false
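If you generate the email from code, the layout above can be produced with a small helper. This is only a sketch: `buildEmailBody` is a hypothetical name, and I'm assuming the values sit between the backticks exactly as shown. It does not escape backticks that appear inside the prompt or response, since the page does not say how those are handled.

```javascript
// Hypothetical helper: formats the three fields in the key=`value` layout shown above.
function buildEmailBody({ prompt, response, outputIsJson = false }) {
  return [
    "llm_prompt=`" + prompt + "`",
    "llm_response=`" + response + "`",
    "output_is_json=" + outputIsJson,
  ].join("\n");
}

console.log(buildEmailBody({
  prompt: "Summarize the ticket in one sentence.",
  response: "The customer cannot reset their password.",
}));
```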