isaac hershenson 65a2cad83d custom schema
2024-11-27 13:16:57 -08:00

Agent Marketplace Evals

This repository is a collection of scripts for evaluating agents.

Repo Structure

Each folder in the repo contains:

  • README.md: A description of the evaluation (dataset, metrics, how to run the eval)
  • run_eval.py: A script to run the evaluation
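
The layout above can be checked mechanically. The following is a minimal sketch, not part of the repo: the helper name `find_eval_folders` and the constant `REQUIRED_FILES` are hypothetical, and it assumes each evaluation lives in a top-level folder that holds both files listed above.

```python
from pathlib import Path

# Files every eval folder is expected to contain, per the list above.
REQUIRED_FILES = ("README.md", "run_eval.py")


def find_eval_folders(repo_root):
    """Return the sorted names of top-level folders that contain
    both README.md and run_eval.py (hypothetical helper)."""
    root = Path(repo_root)
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir()
        and all((entry / name).is_file() for name in REQUIRED_FILES)
    )
```

Calling `find_eval_folders(".")` from the repo root would list the folders that follow this structure; folders missing either file are skipped.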