[PR #101] [MERGED] Multi modal RAG benchmark #111

Closed
opened 2026-02-16 00:18:14 -05:00 by yindo · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/langchain-ai/langchain-benchmarks/pull/101
Author: @rlancemartin
Created: 12/3/2023
Status: Merged
Merged: 12/5/2023
Merged by: @hinthornw

Base: main ← Head: rlm/multi-modal-benchmark


📊 Changes

13 files changed (+1870 additions, -9 deletions)


📝 docs/source/.gitignore (+2 -0)
➕ docs/source/notebooks/retrieval/multi_modal_benchmarking/multi_modal_eval.ipynb (+1162 -0)
➕ docs/source/notebooks/retrieval/multi_modal_benchmarking/multi_modal_eval_baseline.ipynb (+610 -0)
📝 docs/source/toc.segment (+2 -0)
➕ langchain_benchmarks/rag/tasks/.gitignore (+1 -0)
📝 langchain_benchmarks/rag/tasks/__init__.py (+8 -1)
➕ langchain_benchmarks/rag/tasks/multi_modal_slide_decks/__init__.py (+5 -0)
➕ langchain_benchmarks/rag/tasks/multi_modal_slide_decks/indexing/__init__.py (+5 -0)
➕ langchain_benchmarks/rag/tasks/multi_modal_slide_decks/indexing/retriever_registry.py (+39 -0)
➕ langchain_benchmarks/rag/tasks/multi_modal_slide_decks/task.py (+23 -0)
📝 langchain_benchmarks/rag/tasks/semi_structured_reports/indexing/retriever_registry.py (+3 -4)
📝 langchain_benchmarks/registration.py (+2 -0)
📝 langchain_benchmarks/schema.py (+8 -4)

📄 Description

  • Example notebooks for evaluating multi-modal RAG with multi-modal embeddings (mm-embd) and a multi-vector retriever (mv-retriever) against baseline top-k RAG
  • To Do: create public benchmark
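
For readers unfamiliar with the baseline the notebooks compare against, here is a minimal sketch of top-k retrieval by cosine similarity. It is not code from this PR: the `cosine` and `top_k` helpers and the toy embedding vectors are hypothetical, and the actual notebooks presumably rely on a vector store and a real embedding model rather than hand-written vectors.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: Sequence[float],
    doc_vecs: Sequence[Sequence[float]],
    k: int = 2,
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs for the k most similar documents."""
    scored = [(i, cosine(query_vec, vec)) for i, vec in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy, made-up embeddings (hypothetical); a real pipeline would embed slide content.
doc_vecs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
hits = top_k([1.0, 0.1], doc_vecs, k=2)
print([index for index, _ in hits])
```

The retrieved documents would then be passed to the model as context. The multi-modal variants described above differ in how the index is built and what is retrieved, not in this ranking step.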

🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

### 📝 Commits (10+)

- [`8f09d70`](https://github.com/langchain-ai/langchain-benchmarks/commit/8f09d70ef32427d1a938390f4c2309a8b09769ae) Multi modal RAG benchmark
- [`1ed2f42`](https://github.com/langchain-ai/langchain-benchmarks/commit/1ed2f42bfbd3aa2ff3c5f037163bd4172aea3022) fmt
- [`4198cdf`](https://github.com/langchain-ai/langchain-benchmarks/commit/4198cdff18f2609526a8f63fc91bbcfeaeb18b63) Add dataset loader
- [`4420a71`](https://github.com/langchain-ai/langchain-benchmarks/commit/4420a7185ff3ee92add5dbcd3b5ba92f00290515) Create and update multi-modal benchmark
- [`7011e0a`](https://github.com/langchain-ai/langchain-benchmarks/commit/7011e0a701bcada91d1cb4a07dcec45776fd6af3) Remove older dataset, source files
- [`55ed669`](https://github.com/langchain-ai/langchain-benchmarks/commit/55ed6699a944f425fab16b18bac4cb89eeea0c03) merge
- [`1f22284`](https://github.com/langchain-ai/langchain-benchmarks/commit/1f22284795f95d79104644965cf10c8697b704d1) tmp
- [`eb703c0`](https://github.com/langchain-ai/langchain-benchmarks/commit/eb703c0aa6853d04ce48a240fd90a145d8e4ad9c) Clean ntbks
- [`61d500f`](https://github.com/langchain-ai/langchain-benchmarks/commit/61d500ff568fb9d1a346115d9f84291acc1e042e) unused imports
- [`a62c63a`](https://github.com/langchain-ai/langchain-benchmarks/commit/a62c63ab3de9b06e2c6f70b0f3e1d37748cc0073) updat docs
yindo added the pull-request label 2026-02-16 00:18:14 -05:00
yindo closed this issue 2026-02-16 00:18:14 -05:00

Reference: langchain-ai/langchain-benchmarks#111