220 Commits

Author SHA1 Message Date
ccurme 34cd281494 benchmarks[major]: bump core to 0.3 (#211)
- Drop support for python 3.8
- Bump langchain-core to 0.3
- Update pydantic objects to v2
2024-10-21 16:47:14 -04:00
Isaac Francisco 99cf03a50a add faiss-cpu dependency (#209) 2024-08-07 07:53:45 -07:00
Isaac Francisco b36a339a65 Isaac/realfixes (#208) 2024-08-06 15:28:43 -07:00
Isaac Francisco 442cb47fc9 Isaac/realfixes (#207) 2024-08-06 15:24:23 -07:00
Isaac Francisco b7795c7df1 change wd (#206) 2024-08-06 15:15:08 -07:00
Isaac Francisco ac161de968 thanks erick (#205) 2024-08-06 14:50:39 -07:00
Isaac Francisco d91944bb07 test (#204) 2024-08-06 14:45:48 -07:00
Isaac Francisco 8798bd3105 test (#203) 2024-08-06 14:40:01 -07:00
Isaac Francisco 621eea5d93 Isaac/tryingpoetryagain (#202) 2024-08-06 14:36:43 -07:00
Isaac Francisco b6590a8745 Isaac/changepoetry (#201) 2024-08-06 14:30:42 -07:00
Isaac Francisco 458ffa70ea test (#200) 2024-08-06 14:26:56 -07:00
Isaac Francisco ebe5c117c2 test (#198) 2024-08-06 14:14:39 -07:00
Ikko Eltociear Ashimine adff80af11 docs: update README.md (#195)
Mutiverse -> Multiverse
2024-07-24 11:13:42 -07:00
Bagatur 301837e303 Release 0.0.14 (#194) v0.0.14 2024-07-24 08:00:17 -07:00
Bagatur 4f1d922a6e minor: bump to langchain v2 (#191) 2024-07-24 07:59:19 -07:00
Bagatur e4e26a3b8e infra: release permissions (#193) v0.0.13 2024-07-24 07:56:47 -07:00
Bagatur 7f82761813 Release 0.0.13 (#192) 2024-07-24 07:44:20 -07:00
Isaac Francisco 7e16b6daa6 tool benchmarking (#190)
Co-authored-by: Bagatur <baskaryan@gmail.com>
2024-07-24 07:00:33 -07:00
Eugene Yurtsev 22d279a25c Update README.md (#187) 2024-04-19 10:19:19 -04:00
Eugene Yurtsev 357ada3867 Update README.md (#186) 2024-04-18 19:58:54 -04:00
Eugene Yurtsev ab2d93ac6d Update README.md (#185) 2024-04-18 13:48:51 -04:00
Eugene Yurtsev 53f727af64 Update README.md (#184) 2024-04-18 13:47:49 -04:00
Eugene Yurtsev 820af98418 Release 0.0.12 (#183) v0.0.12 2024-04-18 13:38:38 -04:00
Eugene Yurtsev 857f41882f Update README.md (#182) 2024-04-18 11:33:45 -04:00
Eugene Yurtsev 381ada5cbe Update benchmarks all notebook to use {question} instead of {input} (#179)
Update benchmarks all prompt
2024-04-18 11:28:21 -04:00
Eugene Yurtsev 32a532f269 Update README.md (#181) 2024-04-18 11:28:09 -04:00
Eugene Yurtsev d0acf0ee26 Add security policy (#180)
Add security policy
2024-04-18 11:19:13 -04:00
Eugene Yurtsev bec40d90ef Remove old code (#176)
Remove old code
2024-04-18 11:16:42 -04:00
Eugene Yurtsev c80e959b05 Simplify all tool usage notebooks (#178)
Simplify tool usage notebooks
2024-04-18 11:09:34 -04:00
Eugene Yurtsev 2007f68302 Update intro, remove adapter (#177)
Remove confusing adapter for agents. Agent template should just take {question} as the input.

Update intro and simplify it!
2024-04-18 10:47:46 -04:00
Eugene Yurtsev aad9045bcb remove tiny multiverse dataset from registry (#175)
Keep it for backwards compatibility but do not expose in task registry.
This dataset is probably more confusing to folks than helpful especially
since it completely overlaps with the existing multiverse math
dataset. We should add another dataset later.
2024-04-18 09:31:03 -04:00
Eugene Yurtsev 3b86e9f0b5 Update benchmark all for agents (#174) 2024-04-18 09:23:19 -04:00
Eugene Yurtsev c1c5585d3a Fix list of env variables in benchmark all notebook (#173)
Fix list of env variables
v0.0.11
2024-04-10 22:06:44 -04:00
ccurme c45993617b add tool calling benchmark notebook (#171)
Co-authored-by: Eugene Yurtsev <eyurtsev@gmail.com>
2024-04-10 22:03:19 -04:00
Eugene Yurtsev d13b33e956 Update deps (#170)
Update deps
2024-04-10 09:47:14 -04:00
Eugene Yurtsev 20a4aee5c1 Add factory for regular tool using agents (#169)
add factory for regular tool using agents

---------

Co-authored-by: Chester Curme <chester.curme@gmail.com>
2024-04-10 09:27:32 -04:00
Eugene Yurtsev 4139ac8632 update model providers (#168)
* Update packages to be used with different providers
* Register Anthropic models
2024-04-09 17:44:02 -04:00
Eugene Yurtsev 89be01737d update dependencies (#167)
Update dependencies
2024-04-09 17:17:25 -04:00
Bagatur 29e4e878a4 docs: add high cardinality links (#166) 2024-03-13 23:39:42 -07:00
Bagatur ffc2832088 docs: include high cardinality (#165) 2024-03-13 23:09:07 -07:00
Bagatur 8b5feab7b2 Add high cardinality benchmark (#164) 2024-03-08 09:10:03 -08:00
Konjeti Maruthi a805c985a6 Missing Word in comparing_techniques.ipynb (#160)
Fixing a missing word in
https://langchain-ai.github.io/langchain-benchmarks/notebooks/retrieval/comparing_techniques.html

The sentence after the heading was incomplete, so I have added the word
`documents` to complete it.

Before changing:
<img width="527" alt="LangChainFix"
src="https://github.com/langchain-ai/langchain-benchmarks/assets/63769209/4859bbf0-19ae-4b87-830d-85f6242b9b61">
2024-02-16 15:23:11 -05:00
Eugene Yurtsev c0ac497ed4 Update README.md to fix archived links (#162) 2024-02-06 12:35:26 -05:00
Leonid Ganeline a0ea197b28 updated Makefile (#153)
Cleaned up `makefile`
v0.0.10
2023-12-20 09:24:06 -05:00
Eugene Yurtsev 74b11de9ae Update evaluators (#157)
Update to remove user warning
2023-12-19 17:30:24 -05:00
William FH c2b70436e5 Add runnable agent factory (#156)
Not sure if it's "easier" but it involves less thinking about
benchmarking abstractions
2023-12-19 13:39:08 -08:00
Eugene Yurtsev af9a9800e5 Register the new dataset (#155)
Register the new dataset
2023-12-19 15:01:38 -05:00
Eugene Yurtsev e7bac2cbb8 Change multiverse math to multiverse math (tiny) and add another multiverse math set (#154)
* This PR adds a multiverse math dataset consisting of 20 questions.
* Question about rounding has been removed to simplify evaluation.
2023-12-19 14:57:37 -05:00
Eugene Yurtsev d595394243 Update Math Evaluator (#152)
Try another evaluator that ignores the question
2023-12-19 13:52:13 -05:00
William FH 27efb7b53c Add Gemini (#151) 2023-12-18 20:27:59 -08:00