54 Commits

Author SHA1 Message Date
vbarda ac30ead4b0 relax dict requirement 2024-12-04 11:03:33 -05:00
Vadym Barda f702f66e02 update people eval to llm as a judge (#13) 2024-12-03 21:26:35 -05:00
Vadym Barda 8e674f24a6 add llm-as-a-judge (#12) 2024-12-03 16:50:15 -05:00
vbarda bbdffbbe41 update 2024-12-03 12:54:16 -05:00
vbarda 20a1790cab add create dataset for people 2024-12-03 12:43:32 -05:00
Eugene Yurtsev 90b9781a8a use webarxiv (#11) 2024-12-03 11:53:33 -05:00
vbarda 0ee7b37d3f add cloning examples 2024-12-03 11:37:26 -05:00
Eugene Yurtsev 05d33b3503 math readme (#10)
Update math readme
2024-12-03 11:00:05 -05:00
vbarda eae270696f rephrase 2024-12-02 18:34:13 -05:00
vbarda e6ae465a84 update 2024-12-02 18:05:10 -05:00
vbarda 3515289800 update 2024-12-02 18:00:28 -05:00
vbarda 449333f290 Merge branch 'main' of github.com:langchain-ai/agent-marketplace-evals 2024-12-02 17:44:48 -05:00
vbarda 08f5ff6435 update 2024-12-02 17:44:39 -05:00
vbarda e6b5e69088 update people readme 2024-12-02 17:40:12 -05:00
Eugene Yurtsev 17f1d95583 Update math eval (#8)
Math score now differentiates between unable to calculate (score = 0), incorrect answer (score = -1), and correct answer (score = 1)
2024-12-02 17:15:53 -05:00
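The three-way math score described in the commit above could be sketched roughly as follows. This is a minimal illustration, not the repository's actual evaluator: the function name `score_math_answer`, the use of `None` to signal "unable to calculate", and the `tolerance` parameter are all assumptions made for this example.

```python
from typing import Optional


def score_math_answer(
    predicted: Optional[float],
    expected: float,
    tolerance: float = 1e-6,  # hypothetical comparison tolerance, not from the source
) -> int:
    """Return 0, -1, or 1 following the scheme in the commit message.

    0  -> the agent was unable to calculate an answer (modelled here as None)
    -1 -> the agent produced an answer, but it is incorrect
    1  -> the agent produced the correct answer (within the tolerance)
    """
    if predicted is None:
        # No answer at all is treated as neutral rather than wrong.
        return 0
    if abs(predicted - expected) <= tolerance:
        return 1
    return -1


if __name__ == "__main__":
    print(score_math_answer(None, 4.0))
    print(score_math_answer(4.0, 4.0))
    print(score_math_answer(5.0, 4.0))
```

Separating "no answer" from "wrong answer" lets an aggregate score penalize confident mistakes more heavily than abstentions, which appears to be the intent of the change.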
Eugene Yurtsev 867acceb97 add examples to math dataset (#7)
Add examples
2024-12-02 16:35:52 -05:00
vbarda ff73ed903e nit 2024-12-02 16:20:08 -05:00
vbarda 1ff04c2b92 add create dataset 2024-12-02 15:59:17 -05:00
vbarda 6d08fc87e9 update main readme 2024-12-02 15:24:51 -05:00
Vadym Barda b7600b75ae split out datasets (#4) 2024-12-02 15:21:43 -05:00
Eugene Yurtsev e0ff72ea11 Fix typo in dataset name (#5) 2024-12-02 15:09:32 -05:00
Eugene Yurtsev 50a86b4136 Add .gitignore (#6)
Add gitignore
2024-12-02 15:08:17 -05:00
Eugene Yurtsev fdbcddb85e handle non existent dataset 2024-12-02 14:52:03 -05:00
Eugene Yurtsev 992bd53915 fix precision 2024-12-02 14:45:37 -05:00
Eugene Yurtsev 4772c6f8be create dataset 2024-12-02 14:41:22 -05:00
vbarda 91b6da6063 update URLs 2024-12-02 10:44:35 -05:00
vbarda be93082f77 update 2024-11-27 21:03:53 -05:00
vbarda 577b74a9ee update to use dataset_name 2024-11-27 18:54:41 -05:00
vbarda 48687e9bac update URLs 2024-11-27 18:45:18 -05:00
Eugene Yurtsev 31a4665df5 x 2024-11-27 18:12:58 -05:00
vbarda b55d3f6bc4 nit 2024-11-27 17:28:32 -05:00
vbarda a289cf5f34 update 2024-11-27 17:25:05 -05:00
vbarda 72e00089c1 Merge branch 'main' of github.com:langchain-ai/agent-marketplace-evals 2024-11-27 17:21:42 -05:00
vbarda 367d8f7ca4 update readme 2024-11-27 17:20:50 -05:00
Isaac Francisco 50659ae7c6 Update README.md 2024-11-27 14:17:57 -08:00
vbarda d2ec56ff91 update link 2024-11-27 17:16:47 -05:00
vbarda f12018f7c7 Merge branch 'main' of github.com:langchain-ai/agent-marketplace-evals 2024-11-27 17:08:55 -05:00
vbarda 49b53b7398 add math eval 2024-11-27 17:08:49 -05:00
Vadym Barda 05753c03b4 Merge pull request #1 from langchain-ai/isaac/addpeopleextraction
add people extraction
2024-11-27 17:02:09 -05:00
isaac hershenson 49ec6e1f14 custom schema note 2024-11-27 14:00:15 -08:00
isaac hershenson 72a3470861 comments 2024-11-27 13:48:21 -08:00
isaac hershenson fcb0b67b31 changes 2024-11-27 13:22:10 -08:00
isaac hershenson 65a2cad83d custom schema 2024-11-27 13:16:57 -08:00
isaac hershenson 5e91873dee edits 2024-11-27 13:00:34 -08:00
vbarda 44f4284541 update to specify graph ID instead, add metadata 2024-11-27 15:37:08 -05:00
isaac hershenson d29fba2f9b draft 2024-11-27 12:28:05 -08:00
vbarda 7258f03747 add notes on how to handle custom schema 2024-11-27 14:35:20 -05:00
Vadym Barda ad26e72240 update readme 2024-11-27 14:31:17 -05:00
vbarda f23673fa91 Merge branch 'main' of github.com:langchain-ai/agent-marketplace-evals 2024-11-27 14:20:13 -05:00
vbarda 1c5a822846 level 2024-11-27 14:20:03 -05:00