archive repo

This commit is contained in:
Logan Markewich
2025-09-02 15:28:37 -06:00
parent 2e14f65e17
commit ddb1bc009a
44 changed files with 28461 additions and 28151 deletions
+9
@@ -1,3 +1,12 @@
# ⚠️ Important Notice
This repository is deprecated and no longer maintained.
For the latest python examples, please refer to the `llama-cloud-services` repository examples:
https://github.com/run-llama/llama_cloud_services/tree/main/examples
---
# LlamaCloud Demo
This repository contains a collection of cookbooks to show you how to build LLM applications using LlamaCloud to help manage your data pipelines, and LlamaIndex as the core orchestration framework.
@@ -1,375 +1,382 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Inline Citations with LlamaCloud\n",
"In this notebook we show you how to perform inline citations with LlamaCloud. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"Install core packages, download files. You will need to upload these documents to LlamaCloud."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install llama-index\n",
"!pip install llama-index-core\n",
"!pip install llama-index-embeddings-openai\n",
"!pip install llama-index-question-gen-openai\n",
"!pip install llama-index-postprocessor-flag-embedding-reranker\n",
"!pip install git+https://github.com/FlagOpen/FlagEmbedding.git\n",
"!pip install llama-parse"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# download Apple \n",
"!wget \"https://s2.q4cdn.com/470004039/files/doc_earnings/2023/q4/filing/_10-K-Q4-2023-As-Filed.pdf\" -O data/apple_2023.pdf\n",
"!wget \"https://s2.q4cdn.com/470004039/files/doc_financials/2022/q4/_10-K-2022-(As-Filed).pdf\" -O data/apple_2022.pdf\n",
"!wget \"https://s2.q4cdn.com/470004039/files/doc_financials/2021/q4/_10-K-2021-(As-Filed).pdf\" -O data/apple_2021.pdf\n",
"!wget \"https://s2.q4cdn.com/470004039/files/doc_financials/2020/ar/_10-K-2020-(As-Filed).pdf\" -O data/apple_2020.pdf\n",
"!wget \"https://www.dropbox.com/scl/fi/i6vk884ggtq382mu3whfz/apple_2019_10k.pdf?rlkey=eudxh3muxh7kop43ov4bgaj5i&dl=1\" -O data/apple_2019.pdf\n",
"\n",
"# download Tesla\n",
"!wget \"https://ir.tesla.com/_flysystem/s3/sec/000162828024002390/tsla-20231231-gen.pdf\" -O data/tesla_2023.pdf\n",
"!wget \"https://ir.tesla.com/_flysystem/s3/sec/000095017023001409/tsla-20221231-gen.pdf\" -O data/tesla_2022.pdf\n",
"!wget \"https://www.dropbox.com/scl/fi/ptk83fmye7lqr7pz9r6dm/tesla_2021_10k.pdf?rlkey=24kxixeajbw9nru1sd6tg3bye&dl=1\" -O data/tesla_2021.pdf\n",
"!wget \"https://ir.tesla.com/_flysystem/s3/sec/000156459021004599/tsla-10k_20201231-gen.pdf\" -O data/tesla_2020.pdf\n",
"!wget \"https://ir.tesla.com/_flysystem/s3/sec/000156459020004475/tsla-10k_20191231-gen_0.pdf\" -O data/tesla_2019.pdf"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Some OpenAI and LlamaParse details. The OpenAI LLM is used for response synthesis."
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {},
"outputs": [],
"source": [
"# llama-parse is async-first, running the async code in a notebook requires the use of nest_asyncio\n",
"import nest_asyncio\n",
"nest_asyncio.apply()"
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"# API access to llama-cloud\n",
"os.environ[\"LLAMA_CLOUD_API_KEY\"] = \"\""
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {},
"outputs": [],
"source": [
"# Using OpenAI API for embeddings/llms\n",
"os.environ[\"OPENAI_API_KEY\"] = \"\""
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": []
},
"source": [
"## Load Documents into LlamaCloud\n",
"\n",
"The first order of business is to download the 5 Apple and Tesla 10Ks and upload them into LlamaCloud.\n",
"\n",
"You can easily do this by creating a pipeline and uploading docs via the \"Files\" mode.\n",
"\n",
"After this is done, proceed to the next section."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Define NodeCitationPostProcessor\n",
"Add node id to metadata to match the citation links"
]
},
{
"cell_type": "code",
"execution_count": 91,
"metadata": {},
"outputs": [],
"source": [
"from typing import List, Optional\n",
"\n",
"from llama_index.core import QueryBundle\n",
"from llama_index.core.postprocessor.types import BaseNodePostprocessor\n",
"from llama_index.core.schema import NodeWithScore\n",
"\n",
"\n",
"class NodeCitationProcessor(BaseNodePostprocessor):\n",
" \"\"\"\n",
" Append node_id into metadata for citation purpose.\n",
" Config SYSTEM_CITATION_PROMPT in your runtime environment variable to enable this feature.\n",
" \"\"\"\n",
"\n",
" def _postprocess_nodes(\n",
" self,\n",
" nodes: List[NodeWithScore],\n",
" query_bundle: Optional[QueryBundle] = None,\n",
" ) -> List[NodeWithScore]:\n",
" for node_score in nodes:\n",
" node_score.node.metadata[\"node_id\"] = node_score.node.node_id\n",
" return nodes"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Define System Citation Prompt\n",
"Modify the system prompt to add the citation links based on the metadata"
]
},
{
"cell_type": "code",
"execution_count": 92,
"metadata": {},
"outputs": [],
"source": [
"SYSTEM_CITATION_PROMPT = \"\"\"You have provided information from a knowledge base that has been passed to you in nodes of information.\n",
"Each node has useful metadata such as node ID, file name, page, etc.\n",
"Please add the citation to the data node for each sentence or paragraph that you reference in the provided information.\n",
"The citation format is: . [citation:<node_id>]()\n",
"Where the <node_id> is the unique identifier of the data node.\n",
"\n",
"Example:\n",
"We have two nodes:\n",
" node_id: xyz\n",
" file_name: llama.pdf\n",
" \n",
" node_id: abc\n",
" file_name: animal.pdf\n",
"\n",
"User question: Tell me a fun fact about Llama.\n",
"Your answer:\n",
"A baby llama is called \"Cria\" [citation:xyz]().\n",
"It often live in desert [citation:abc]().\n",
"It\\\\'s cute animal.\"\"\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Define LlamaCloud Retriever over Documents\n",
"\n",
"In this section we define LlamaCloud Retriever over these documents."
]
},
{
"cell_type": "code",
"execution_count": 93,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from llama_index.indices.managed.llama_cloud import LlamaCloudIndex\n",
"import os\n",
"\n",
"index = LlamaCloudIndex(\n",
" name=\"apple_tesla_demo_2\",\n",
" project_name=\"llamacloud_demo\",\n",
" api_key=os.environ[\"LLAMA_CLOUD_API_KEY\"]\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Define chunk retriever\n",
"\n",
"The chunk-level retriever does vector search with a final reranked set of `rerank_top_n=5`."
]
},
{
"cell_type": "code",
"execution_count": 94,
"metadata": {},
"outputs": [],
"source": [
"chunk_retriever = index.as_retriever(\n",
" retrieval_mode=\"chunks\",\n",
" rerank_top_n=5\n",
")\n",
"from llama_index.core.query_engine import RetrieverQueryEngine\n",
"from llama_index.llms.openai import OpenAI\n",
"\n",
"llm = OpenAI(model=\"gpt-4o-mini\", system_prompt=SYSTEM_CITATION_PROMPT)\n",
"query_engine_chunk = RetrieverQueryEngine.from_args(\n",
" chunk_retriever, \n",
" llm=llm,\n",
" response_mode=\"tree_summarize\",\n",
" node_postprocessors=[NodeCitationProcessor()]\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Generate final output matching citations with page labela\n",
"Given the found nodes, match the page assigned and build a final url"
]
},
{
"cell_type": "code",
"execution_count": 95,
"metadata": {},
"outputs": [],
"source": [
"import re\n",
"\n",
"def process_citations_with_sources(response) -> str:\n",
" content = str(response)\n",
" source_nodes = response.source_nodes\n",
"\n",
" # Create a lookup: citation_id -> page_label\n",
" id_to_label = {\n",
" str(node.id_): node.metadata.get('page_label', 'unknown')\n",
" for node in source_nodes\n",
" }\n",
"\n",
" # Track citation order and assign human-friendly numbers\n",
" citation_order = {}\n",
" citation_counter = 1\n",
"\n",
" def replace(match):\n",
" nonlocal citation_counter\n",
" citation_id = match.group(1).strip()\n",
" if citation_id not in citation_order:\n",
" citation_order[citation_id] = citation_counter\n",
" citation_counter += 1\n",
" number = citation_order[citation_id]\n",
" page_label = id_to_label.get(citation_id, 'unknown')\n",
" return f\"[{number}](https://fake.url/SampleFile#page={page_label})\"\n",
"\n",
" # Replace complete citations\n",
" citation_regex = re.compile(r'\\[citation:([^\\]]+)\\]')\n",
" content = citation_regex.sub(replace, content)\n",
"\n",
" # Remove incomplete/broken citation tags\n",
" incomplete_regex = re.compile(r'\\[citation:[^\\]]*$')\n",
" content = incomplete_regex.sub('', content)\n",
"\n",
" return content"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Query it"
]
},
{
"cell_type": "code",
"execution_count": 96,
"metadata": {},
"outputs": [],
"source": [
"response = query_engine_chunk.query(\"What are the tiny risks for apple 2022\")"
]
},
{
"cell_type": "code",
"execution_count": 97,
"metadata": {},
"outputs": [
"cells": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Apple Inc. faces several risks that could impact its business and financial performance in 2022. These include:\n",
"\n",
"1. **Foreign Currency Exchange Risks**: The company's financial performance is subject to risks associated with changes in the value of the U.S. dollar relative to local currencies. Fluctuations in foreign currency exchange rates can adversely affect gross margins on products sold internationally, potentially leading to reduced demand if international pricing is raised to offset currency strength [1](https://fake.url/SampleFile#page=19)().\n",
"\n",
"2. **Credit Risk**: Apple is exposed to credit risk related to its trade accounts receivable and vendor non-trade receivables. This risk is heightened during economic downturns, especially since a significant portion of its trade receivables is not covered by collateral or credit insurance [1](https://fake.url/SampleFile#page=19)().\n",
"\n",
"3. **Supply Chain Risks**: The company relies heavily on single-source suppliers for many components. Any disruptions in the supply chain, whether due to natural disasters, political issues, or supplier financial instability, could adversely affect production and sales [2](https://fake.url/SampleFile#page=12)().\n",
"\n",
"4. **Product and Service Quality Risks**: Apples products and services may experience design and manufacturing defects, which could harm its reputation and lead to financial liabilities. The complexity of its hardware and software increases the risk of defects that could affect customer satisfaction [2](https://fake.url/SampleFile#page=12)().\n",
"\n",
"5. **Macroeconomic Risks**: Global and regional economic conditions significantly impact Apples performance. Factors such as inflation, recession, and changes in consumer confidence can adversely affect demand for its products and services [3](https://fake.url/SampleFile#page=8)().\n",
"\n",
"6. **Regulatory and Compliance Risks**: The company is subject to various U.S. and international laws regarding data protection and privacy. Noncompliance with these laws could result in significant penalties and harm to its reputation [4](https://fake.url/SampleFile#page=18)().\n",
"\n",
"These risks, while potentially small in individual impact, can collectively pose significant challenges to Apple's operations and financial health in 2022.\n"
]
"cell_type": "markdown",
"metadata": {},
"source": [
"# ⚠️ Important Notice\n\nThis notebook (and repository) is deprecated.\n\nFor the latest python examples, please refer to the `llama-cloud-services` repository examples: \nhttps://github.com/run-llama/llama_cloud_services/tree/main/examples\n\n---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Inline Citations with LlamaCloud\n",
"In this notebook we show you how to perform inline citations with LlamaCloud. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"Install core packages, download files. You will need to upload these documents to LlamaCloud."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install llama-index\n",
"!pip install llama-index-core\n",
"!pip install llama-index-embeddings-openai\n",
"!pip install llama-index-question-gen-openai\n",
"!pip install llama-index-postprocessor-flag-embedding-reranker\n",
"!pip install git+https://github.com/FlagOpen/FlagEmbedding.git\n",
"!pip install llama-parse"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# download Apple \n",
"!wget \"https://s2.q4cdn.com/470004039/files/doc_earnings/2023/q4/filing/_10-K-Q4-2023-As-Filed.pdf\" -O data/apple_2023.pdf\n",
"!wget \"https://s2.q4cdn.com/470004039/files/doc_financials/2022/q4/_10-K-2022-(As-Filed).pdf\" -O data/apple_2022.pdf\n",
"!wget \"https://s2.q4cdn.com/470004039/files/doc_financials/2021/q4/_10-K-2021-(As-Filed).pdf\" -O data/apple_2021.pdf\n",
"!wget \"https://s2.q4cdn.com/470004039/files/doc_financials/2020/ar/_10-K-2020-(As-Filed).pdf\" -O data/apple_2020.pdf\n",
"!wget \"https://www.dropbox.com/scl/fi/i6vk884ggtq382mu3whfz/apple_2019_10k.pdf?rlkey=eudxh3muxh7kop43ov4bgaj5i&dl=1\" -O data/apple_2019.pdf\n",
"\n",
"# download Tesla\n",
"!wget \"https://ir.tesla.com/_flysystem/s3/sec/000162828024002390/tsla-20231231-gen.pdf\" -O data/tesla_2023.pdf\n",
"!wget \"https://ir.tesla.com/_flysystem/s3/sec/000095017023001409/tsla-20221231-gen.pdf\" -O data/tesla_2022.pdf\n",
"!wget \"https://www.dropbox.com/scl/fi/ptk83fmye7lqr7pz9r6dm/tesla_2021_10k.pdf?rlkey=24kxixeajbw9nru1sd6tg3bye&dl=1\" -O data/tesla_2021.pdf\n",
"!wget \"https://ir.tesla.com/_flysystem/s3/sec/000156459021004599/tsla-10k_20201231-gen.pdf\" -O data/tesla_2020.pdf\n",
"!wget \"https://ir.tesla.com/_flysystem/s3/sec/000156459020004475/tsla-10k_20191231-gen_0.pdf\" -O data/tesla_2019.pdf"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Some OpenAI and LlamaParse details. The OpenAI LLM is used for response synthesis."
]
},
{
"cell_type": "code",
"execution_count": 51,
"metadata": {},
"outputs": [],
"source": [
"# llama-parse is async-first, running the async code in a notebook requires the use of nest_asyncio\n",
"import nest_asyncio\n",
"nest_asyncio.apply()"
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"# API access to llama-cloud\n",
"os.environ[\"LLAMA_CLOUD_API_KEY\"] = \"\""
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {},
"outputs": [],
"source": [
"# Using OpenAI API for embeddings/llms\n",
"os.environ[\"OPENAI_API_KEY\"] = \"\""
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": []
},
"source": [
"## Load Documents into LlamaCloud\n",
"\n",
"The first order of business is to download the 5 Apple and Tesla 10Ks and upload them into LlamaCloud.\n",
"\n",
"You can easily do this by creating a pipeline and uploading docs via the \"Files\" mode.\n",
"\n",
"After this is done, proceed to the next section."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Define NodeCitationPostProcessor\n",
"Add node id to metadata to match the citation links"
]
},
{
"cell_type": "code",
"execution_count": 91,
"metadata": {},
"outputs": [],
"source": [
"from typing import List, Optional\n",
"\n",
"from llama_index.core import QueryBundle\n",
"from llama_index.core.postprocessor.types import BaseNodePostprocessor\n",
"from llama_index.core.schema import NodeWithScore\n",
"\n",
"\n",
"class NodeCitationProcessor(BaseNodePostprocessor):\n",
" \"\"\"\n",
" Append node_id into metadata for citation purpose.\n",
" Config SYSTEM_CITATION_PROMPT in your runtime environment variable to enable this feature.\n",
" \"\"\"\n",
"\n",
" def _postprocess_nodes(\n",
" self,\n",
" nodes: List[NodeWithScore],\n",
" query_bundle: Optional[QueryBundle] = None,\n",
" ) -> List[NodeWithScore]:\n",
" for node_score in nodes:\n",
" node_score.node.metadata[\"node_id\"] = node_score.node.node_id\n",
" return nodes"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Define System Citation Prompt\n",
"Modify the system prompt to add the citation links based on the metadata"
]
},
{
"cell_type": "code",
"execution_count": 92,
"metadata": {},
"outputs": [],
"source": [
"SYSTEM_CITATION_PROMPT = \"\"\"You have provided information from a knowledge base that has been passed to you in nodes of information.\n",
"Each node has useful metadata such as node ID, file name, page, etc.\n",
"Please add the citation to the data node for each sentence or paragraph that you reference in the provided information.\n",
"The citation format is: . [citation:<node_id>]()\n",
"Where the <node_id> is the unique identifier of the data node.\n",
"\n",
"Example:\n",
"We have two nodes:\n",
" node_id: xyz\n",
" file_name: llama.pdf\n",
" \n",
" node_id: abc\n",
" file_name: animal.pdf\n",
"\n",
"User question: Tell me a fun fact about Llama.\n",
"Your answer:\n",
"A baby llama is called \"Cria\" [citation:xyz]().\n",
"It often live in desert [citation:abc]().\n",
"It\\\\'s cute animal.\"\"\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Define LlamaCloud Retriever over Documents\n",
"\n",
"In this section we define LlamaCloud Retriever over these documents."
]
},
{
"cell_type": "code",
"execution_count": 93,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from llama_index.indices.managed.llama_cloud import LlamaCloudIndex\n",
"import os\n",
"\n",
"index = LlamaCloudIndex(\n",
" name=\"apple_tesla_demo_2\",\n",
" project_name=\"llamacloud_demo\",\n",
" api_key=os.environ[\"LLAMA_CLOUD_API_KEY\"]\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Define chunk retriever\n",
"\n",
"The chunk-level retriever does vector search with a final reranked set of `rerank_top_n=5`."
]
},
{
"cell_type": "code",
"execution_count": 94,
"metadata": {},
"outputs": [],
"source": [
"chunk_retriever = index.as_retriever(\n",
" retrieval_mode=\"chunks\",\n",
" rerank_top_n=5\n",
")\n",
"from llama_index.core.query_engine import RetrieverQueryEngine\n",
"from llama_index.llms.openai import OpenAI\n",
"\n",
"llm = OpenAI(model=\"gpt-4o-mini\", system_prompt=SYSTEM_CITATION_PROMPT)\n",
"query_engine_chunk = RetrieverQueryEngine.from_args(\n",
" chunk_retriever, \n",
" llm=llm,\n",
" response_mode=\"tree_summarize\",\n",
" node_postprocessors=[NodeCitationProcessor()]\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Generate final output matching citations with page labela\n",
"Given the found nodes, match the page assigned and build a final url"
]
},
{
"cell_type": "code",
"execution_count": 95,
"metadata": {},
"outputs": [],
"source": [
"import re\n",
"\n",
"def process_citations_with_sources(response) -> str:\n",
" content = str(response)\n",
" source_nodes = response.source_nodes\n",
"\n",
" # Create a lookup: citation_id -> page_label\n",
" id_to_label = {\n",
" str(node.id_): node.metadata.get('page_label', 'unknown')\n",
" for node in source_nodes\n",
" }\n",
"\n",
" # Track citation order and assign human-friendly numbers\n",
" citation_order = {}\n",
" citation_counter = 1\n",
"\n",
" def replace(match):\n",
" nonlocal citation_counter\n",
" citation_id = match.group(1).strip()\n",
" if citation_id not in citation_order:\n",
" citation_order[citation_id] = citation_counter\n",
" citation_counter += 1\n",
" number = citation_order[citation_id]\n",
" page_label = id_to_label.get(citation_id, 'unknown')\n",
" return f\"[{number}](https://fake.url/SampleFile#page={page_label})\"\n",
"\n",
" # Replace complete citations\n",
" citation_regex = re.compile(r'\\[citation:([^\\]]+)\\]')\n",
" content = citation_regex.sub(replace, content)\n",
"\n",
" # Remove incomplete/broken citation tags\n",
" incomplete_regex = re.compile(r'\\[citation:[^\\]]*$')\n",
" content = incomplete_regex.sub('', content)\n",
"\n",
" return content"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Query it"
]
},
{
"cell_type": "code",
"execution_count": 96,
"metadata": {},
"outputs": [],
"source": [
"response = query_engine_chunk.query(\"What are the tiny risks for apple 2022\")"
]
},
{
"cell_type": "code",
"execution_count": 97,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Apple Inc. faces several risks that could impact its business and financial performance in 2022. These include:\n",
"\n",
"1. **Foreign Currency Exchange Risks**: The company's financial performance is subject to risks associated with changes in the value of the U.S. dollar relative to local currencies. Fluctuations in foreign currency exchange rates can adversely affect gross margins on products sold internationally, potentially leading to reduced demand if international pricing is raised to offset currency strength [1](https://fake.url/SampleFile#page=19)().\n",
"\n",
"2. **Credit Risk**: Apple is exposed to credit risk related to its trade accounts receivable and vendor non-trade receivables. This risk is heightened during economic downturns, especially since a significant portion of its trade receivables is not covered by collateral or credit insurance [1](https://fake.url/SampleFile#page=19)().\n",
"\n",
"3. **Supply Chain Risks**: The company relies heavily on single-source suppliers for many components. Any disruptions in the supply chain, whether due to natural disasters, political issues, or supplier financial instability, could adversely affect production and sales [2](https://fake.url/SampleFile#page=12)().\n",
"\n",
"4. **Product and Service Quality Risks**: Apples products and services may experience design and manufacturing defects, which could harm its reputation and lead to financial liabilities. The complexity of its hardware and software increases the risk of defects that could affect customer satisfaction [2](https://fake.url/SampleFile#page=12)().\n",
"\n",
"5. **Macroeconomic Risks**: Global and regional economic conditions significantly impact Apples performance. Factors such as inflation, recession, and changes in consumer confidence can adversely affect demand for its products and services [3](https://fake.url/SampleFile#page=8)().\n",
"\n",
"6. **Regulatory and Compliance Risks**: The company is subject to various U.S. and international laws regarding data protection and privacy. Noncompliance with these laws could result in significant penalties and harm to its reputation [4](https://fake.url/SampleFile#page=18)().\n",
"\n",
"These risks, while potentially small in individual impact, can collectively pose significant challenges to Apple's operations and financial health in 2022.\n"
]
}
],
"source": [
"content = process_citations_with_sources(response)\n",
"print(content)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.8"
}
],
"source": [
"content = process_citations_with_sources(response)\n",
"print(content)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.8"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
"nbformat": 4,
"nbformat_minor": 4
}
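The citation post-processing step in the notebook above can be tried in isolation, without LlamaCloud or OpenAI access. The following is a minimal sketch under stated assumptions, not the notebook's own code: `FakeNode`, `link_citations`, and the `base_url` parameter are hypothetical names introduced for illustration, and `FakeNode` only mimics the two attributes (`id_` and `metadata`) that the notebook reads from its source nodes. One deliberate difference: the pattern here also consumes the empty `()` that the system prompt asks the model to append, which the recorded output above still shows after each link.

```python
import re
from dataclasses import dataclass, field


@dataclass
class FakeNode:
    """Hypothetical stand-in for a retrieved node (not a LlamaIndex class)."""

    id_: str
    metadata: dict = field(default_factory=dict)


def link_citations(text, nodes, base_url="https://fake.url/SampleFile"):
    """Turn [citation:<node_id>]() tags into numbered links to page labels."""
    # Lookup table: node id -> page label (falls back to 'unknown').
    id_to_page = {str(n.id_): n.metadata.get("page_label", "unknown") for n in nodes}
    numbers = {}  # node id -> citation number, in order of first appearance

    def replace(match):
        node_id = match.group(1).strip()
        number = numbers.setdefault(node_id, len(numbers) + 1)
        page = id_to_page.get(node_id, "unknown")
        return f"[{number}]({base_url}#page={page})"

    # Complete tags, including the optional empty "()" suffix from the prompt.
    text = re.sub(r"\[citation:([^\]]+)\](?:\(\))?", replace, text)
    # Drop a tag that was cut off at the very end of the text.
    return re.sub(r"\[citation:[^\]]*$", "", text)


nodes = [
    FakeNode("xyz", {"page_label": "19"}),
    FakeNode("abc", {"page_label": "12"}),
]
answer = (
    "Currency risk [citation:xyz](). Supply risk [citation:abc](). "
    "Credit risk [citation:xyz](). Cut off [citation:ab"
)
linked = link_citations(answer, nodes)
print(linked)
```

Repeated citations of the same node reuse the number assigned on first appearance, and an ID that is missing from the retrieved nodes links to `#page=unknown` rather than raising an error.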
+444 -437
@@ -1,444 +1,451 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "f4b2c37d-3b5a-47aa-95b9-d28e0bc83f77",
"metadata": {},
"source": [
"# Auto-Retrieval with LlamaCloud\n",
"\n",
"<a href=\"https://colab.research.google.com/github/run-llama/llamacloud-demo/blob/main/examples/advanced_rag/auto_retrieval.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
"\n",
"**Auto-retrieval** is an advanced RAG technique that uses an LLM to dynamically infer the metadata filter parameters along with the semantic query before initiating vector database retrieval, in comparison to naive RAG which directly sends the user query to the vector db retrieval interface (e.g. dense vector search). It can both be thought of as a form of query expansion/rewriting if you come from the retrieval world, as well as a specific form of function calling.\n",
"\n",
"![](auto_retrieval_img.png)\n",
"\n",
"LlamaCloud helps you easily define chunk and document-level retrieval interfaces on top of any documents. In this guide we show you how to build an auto-retrieval pipeline on top of LlamaCloud retrievers over a research document corpus."
]
},
{
"cell_type": "markdown",
"id": "2e4f707a-c7b5-473f-b4a6-881e2245e82d",
"metadata": {},
"source": [
"## Setup LlamaCloud \n",
"\n",
"Install core packages and download relevant files. Upload these documents to LlamaCloud, and then define a chunk and document-level retriever interface over these documents.\n",
"\n",
"For more information on chunk-level and document-level retrieval, check out our interface [here](https://github.com/run-llama/llamacloud-demo/blob/main/examples/10k_apple_tesla/demo_file_retrieval.ipynb)."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9aa458bc-bc8d-46fe-9a57-021dd8d9e525",
"metadata": {},
"outputs": [],
"source": [
"!pip install llama-index\n",
"!pip install llama-index-core\n",
"!pip install llama-parse"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "79821400-caaf-42f1-99d8-74c184c19e29",
"metadata": {},
"outputs": [],
"source": [
"# NOTE: uncomment more papers if you want to do research over a larger subset of docs\n",
"\n",
"urls = [\n",
" # \"https://openreview.net/pdf?id=VtmBAGCN7o\",\n",
" # \"https://openreview.net/pdf?id=6PmJoRfdaK\",\n",
" # \"https://openreview.net/pdf?id=LzPWWPAdY4\",\n",
" \"https://openreview.net/pdf?id=VTF8yNQM66\",\n",
" \"https://openreview.net/pdf?id=hSyW5go0v8\",\n",
" # \"https://openreview.net/pdf?id=9WD9KwssyT\",\n",
" # \"https://openreview.net/pdf?id=yV6fD7LYkF\",\n",
" # \"https://openreview.net/pdf?id=hnrB5YHoYu\",\n",
" # \"https://openreview.net/pdf?id=WbWtOYIzIK\",\n",
" \"https://openreview.net/pdf?id=c5pwL0Soay\",\n",
" # \"https://openreview.net/pdf?id=TpD2aG1h0D\",\n",
"]\n",
"\n",
"papers = [\n",
" # \"metagpt.pdf\",\n",
" # \"longlora.pdf\",\n",
" # \"loftq.pdf\",\n",
" \"swebench.pdf\",\n",
" \"selfrag.pdf\",\n",
" # \"zipformer.pdf\",\n",
" # \"values.pdf\",\n",
" # \"finetune_fair_diffusion.pdf\",\n",
" # \"knowledge_card.pdf\",\n",
" \"metra.pdf\",\n",
" # \"vr_mcl.pdf\",\n",
"]\n",
"\n",
"data_dir = \"iclr_docs\""
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "80137d15-f22b-47eb-adce-ac295ced7e71",
"metadata": {},
"outputs": [],
"source": [
"!mkdir \"{data_dir}\"\n",
"for url, paper in zip(urls, papers):\n",
" !wget \"{url}\" -O \"{data_dir}/{paper}\""
]
},
{
"cell_type": "markdown",
"id": "0315d639-52a0-4888-8779-653083ec3768",
"metadata": {},
"source": [
"#### Load Documents into LlamaCloud\n",
"\n",
"Create a new index in LlamaCloud and drag and drop these downloaded PDFs into the data source.\n",
"\n",
"For best results, in the Transformation Configuration click on the \"Manual\" tab, and set page-level segmentation configuration and \"None\" for additional chunking."
]
},
{
"cell_type": "markdown",
"id": "3579938a-54b5-4eb2-b97e-22d96382eded",
"metadata": {},
"source": [
"#### Setup LlamaCloud Index"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "4208977f-896f-4993-a3a2-1be83c92baa1",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.indices.managed.llama_cloud import LlamaCloudIndex\n",
"import os\n",
"\n",
"index = LlamaCloudIndex(\n",
" name=\"research_papers_page\",\n",
" project_name=\"llamacloud_demo\",\n",
" api_key=os.environ[\"LLAMA_CLOUD_API_KEY\"]\n",
")"
]
},
{
"cell_type": "markdown",
"id": "d46f14ff-45b1-41f4-84e2-a6e5d6637809",
"metadata": {},
"source": [
"## Setup Auto-Retrieval\n",
"\n",
"Now we setup an **auto-retrieval** function over our LlamaCloud retrievers. At a high-level our auto-retrieval function uses a function-calling LLM to infer the metadata filters for a user query - this leads to more precise and relevant retrieval results beyond just using a raw semantic query.\n",
"\n",
"This section shows you how to build it from scratch, also includes some advanced few-shot example selection to increase reliability.\n",
"1. Define a custom prompt to generate metadata filters\n",
"2. Given a user query, first do chunk-level retrieval to dynamically retrieve the metadata of the retrieved chunks.\n",
"3. Inject the metadata as few-shot examples in the auto-retrieval prompt. The goal is to show the LLM what existing, relevant examples of metadata values already look like, so that the LLM can infer correct metadata filters.\n",
"\n",
"A lot of the code below is lifted from our **VectorIndexAutoRetriever** module, which provides an out of the box way to do auto-retrieval against a vector index."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "652cb067-da39-42cb-a303-faa346f72e13",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.llms.openai import OpenAI\n",
"\n",
"llm = OpenAI(model=\"gpt-4o\")"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "b0a564cb-bfdb-48a5-9d67-10390c3a6c28",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.core.prompts import ChatPromptTemplate\n",
"from llama_index.core.vector_stores.types import VectorStoreInfo, VectorStoreQuerySpec, MetadataInfo, MetadataFilters\n",
"from llama_index.core.retrievers import BaseRetriever\n",
"from llama_index.core.query_engine import RetrieverQueryEngine\n",
"from llama_index.core import Response\n",
"\n",
"import json\n",
"\n",
"SYS_PROMPT = \"\"\"\\\n",
"Your goal is to structure the user's query to match the request schema provided below.\n",
"You MUST call the tool in order to generate the query spec.\n",
"\n",
"<< Structured Request Schema >>\n",
"When responding use a markdown code snippet with a JSON object formatted in the \\\n",
"following schema:\n",
"\n",
"{schema_str}\n",
"\n",
"The query string should contain only text that is expected to match the contents of \\\n",
"documents. Any conditions in the filter should not be mentioned in the query as well.\n",
"\n",
"Make sure that filters only refer to attributes that exist in the data source.\n",
"Make sure that filters take into account the descriptions of attributes.\n",
"Make sure that filters are only used as needed. If there are no filters that should be \\\n",
"applied return [] for the filter value.\\\n",
"\n",
"If the user's query explicitly mentions number of documents to retrieve, set top_k to \\\n",
"that number, otherwise do not set top_k.\n",
"\n",
"The schema of the metadata filters in the vector db table is listed below, along with some example metadata dictionaries from relevant rows.\n",
"The user will send the input query string.\n",
"\n",
"Data Source:\n",
"```json\n",
"{info_str}\n",
"```\n",
"\n",
"Example metadata from relevant chunks:\n",
"{example_rows}\n",
"\n",
"\"\"\"\n",
"\n",
"example_rows_retriever = index.as_retriever(\n",
"    retrieval_mode=\"chunks\",\n",
"    rerank_top_n=4\n",
")\n",
"\n",
"def get_example_rows_fn(**kwargs):\n",
"    \"\"\"Retrieve relevant few-shot examples.\"\"\"\n",
"    query_str = kwargs[\"query_str\"]\n",
"    nodes = example_rows_retriever.retrieve(query_str)\n",
"    # get the metadata, join them\n",
"    metadata_list = [n.metadata for n in nodes]\n",
"\n",
"    return \"\\n\".join([json.dumps(m) for m in metadata_list])\n",
"\n",
"\n",
"# Map the `example_rows` template variable to a function that fetches few-shot metadata examples at format time.\n",
"chat_prompt_tmpl = ChatPromptTemplate.from_messages(\n",
"    [\n",
"        (\"system\", SYS_PROMPT),\n",
"        (\"user\", \"{query_str}\"),\n",
"    ],\n",
"    function_mappings={\n",
"        \"example_rows\": get_example_rows_fn\n",
"    }\n",
")\n",
"\n",
"\n",
"# NOTE: VectorStoreInfo describes the store's content and the metadata fields the LLM may filter on\n",
"vector_store_info = VectorStoreInfo(\n",
"    content_info=\"contains content from various research papers\",\n",
"    metadata_info=[\n",
"        MetadataInfo(\n",
"            name=\"file_name\",\n",
"            type=\"str\",\n",
"            description=\"Name of the source paper\",\n",
"        ),\n",
"    ],\n",
")\n",
"\n",
"def auto_retriever_rag(query: str, retrieve_doc: bool = False, files_top_k: int = 1, rerank_top_n: int = 5) -> Response:\n",
"    \"\"\"Infers metadata filters for the query, then synthesizes an answer from the retrieved chunks (or entire documents if `retrieve_doc` is True).\"\"\"\n",
"    print(f\"> User query string: {query}\")\n",
"    # Use structured predict to infer the metadata filters and query string.\n",
"    query_spec = llm.structured_predict(\n",
"        VectorStoreQuerySpec,\n",
"        chat_prompt_tmpl,\n",
"        info_str=vector_store_info.model_dump_json(indent=4),\n",
"        schema_str=json.dumps(VectorStoreQuerySpec.model_json_schema()),\n",
"        query_str=query\n",
"    )\n",
"    # build retriever and query engine\n",
"    filters = MetadataFilters(filters=query_spec.filters) if len(query_spec.filters) > 0 else None\n",
"    print(f\"> Inferred query string: {query_spec.query}\")\n",
"    if filters:\n",
"        print(f\"> Inferred filters: {filters.json()}\")\n",
"\n",
"    # define retriever based on whether chunk or document-level is specified\n",
"    if retrieve_doc:\n",
"        retriever = index.as_retriever(\n",
"            retrieval_mode=\"files_via_content\",\n",
"            # retrieval_mode=\"files_via_metadata\",\n",
"            files_top_k=files_top_k,\n",
"            filters=filters\n",
"        )\n",
"    else:\n",
"        retriever = index.as_retriever(\n",
"            retrieval_mode=\"chunks\",\n",
"            rerank_top_n=rerank_top_n,\n",
"            filters=filters\n",
"        )\n",
"\n",
"    query_engine = RetrieverQueryEngine.from_args(\n",
"        retriever,\n",
"        llm=llm,\n",
"        response_mode=\"tree_summarize\"\n",
"    )\n",
"    # run query\n",
"    return query_engine.query(query_spec.query)\n"
]
},
{
"cell_type": "markdown",
"id": "d67303e6-ec65-499b-85bb-8189d220b466",
"metadata": {},
"source": [
"### Try out Auto-Retrieval\n",
"\n",
"Let's try running our auto-retriever on some sample queries. We try out both chunk-level and document-level retrieval."
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "b7533424-ae1f-4f04-9d31-f4a41b2b2709",
"metadata": {},
"outputs": [],
"source": [
"from functools import partial\n",
"\n",
"auto_doc_rag = partial(auto_retriever_rag, retrieve_doc=True)\n",
"auto_chunk_rag = partial(auto_retriever_rag, retrieve_doc=False)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "e664037b-3d89-481e-8e0e-2c1d47ffc226",
"metadata": {},
"outputs": [
"cells": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"> User query string: ELI5 the objective function in Metra\n",
"> Inferred query string: objective function in Metra\n",
"> Inferred filters: {\"filters\":[{\"key\":\"file_name\",\"value\":\"metra.pdf\",\"operator\":\"==\"}],\"condition\":\"and\"}\n",
"The objective function in METRA involves maximizing the expected inner product of the difference in state representations and a skill vector, subject to a Lipschitz constraint under the temporal distance metric. This is expressed as maximizing the expected value of \\((\\phi(s') - \\phi(s))^\\top z\\), where \\(\\phi\\) is a representation function, \\(s\\) and \\(s'\\) are states, and \\(z\\) is a skill vector. The constraint ensures that the difference in representations is bounded by the temporal distance between states, promoting the discovery of diverse behaviors that cover the latent space effectively.\n"
]
}
],
"source": [
"response = auto_chunk_rag(\"ELI5 the objective function in Metra\")\n",
"print(str(response))"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "ad5758fb-63dc-4815-a0b5-54224f784230",
"metadata": {},
"outputs": [
"cell_type": "markdown",
"metadata": {},
"source": [
"# ⚠️ Important Notice\n\nThis notebook (and repository) is deprecated.\n\nFor the latest python examples, please refer to the `llama-cloud-services` repository examples: \nhttps://github.com/run-llama/llama_cloud_services/tree/main/examples\n\n---"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"> User query string: How was SWE-Bench constructed? Tell me all the stages that went into it.\n",
"> Inferred query string: SWE-Bench construction stages\n",
"> Inferred filters: {\"filters\":[{\"key\":\"file_name\",\"value\":\"swebench.pdf\",\"operator\":\"==\"}],\"condition\":\"and\"}\n",
"The construction of SWE-bench involves a three-stage pipeline:\n",
"\n",
"1. **Repo Selection and Data Scraping**: This stage involves collecting pull requests (PRs) from 12 popular open-source Python repositories on GitHub, resulting in approximately 90,000 PRs. The focus is on popular repositories due to their better maintenance, clear contributor guidelines, and comprehensive test coverage.\n",
"\n",
"2. **Attribute-based Filtering**: In this stage, candidate tasks are created by selecting merged PRs that resolve a GitHub issue and make changes to the test files of the repository. This indicates that the user likely contributed tests to verify the resolution of the issue.\n",
"\n",
"3. **Execution-based Filtering**: For each candidate task, the PRs test content is applied, and the associated test results are logged before and after the PRs other content is applied. Task instances without at least one test changing from fail to pass are filtered out, as well as instances resulting in installation or runtime errors.\n",
"\n",
"Through these stages, the original 90,000 PRs are filtered down to 2,294 task instances that comprise SWE-bench.\n"
]
}
],
"source": [
"response = auto_chunk_rag(\"How was SWE-Bench constructed? Tell me all the stages that went into it.\")\n",
"print(str(response))"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "c8bd8cdb-44e8-4cad-9d3c-8207e336f42b",
"metadata": {},
"outputs": [
"cell_type": "markdown",
"id": "f4b2c37d-3b5a-47aa-95b9-d28e0bc83f77",
"metadata": {},
"source": [
"# Auto-Retrieval with LlamaCloud\n",
"\n",
"<a href=\"https://colab.research.google.com/github/run-llama/llamacloud-demo/blob/main/examples/advanced_rag/auto_retrieval.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
"\n",
"**Auto-retrieval** is an advanced RAG technique that uses an LLM to dynamically infer metadata filter parameters along with the semantic query before initiating vector database retrieval. Naive RAG, by comparison, sends the user query directly to the vector db retrieval interface (e.g. dense vector search). If you come from the retrieval world, auto-retrieval can be thought of as a form of query expansion/rewriting; it can also be seen as a specific form of function calling.\n",
"\n",
"![](auto_retrieval_img.png)\n",
"\n",
"LlamaCloud helps you easily define chunk and document-level retrieval interfaces on top of any documents. In this guide we show you how to build an auto-retrieval pipeline on top of LlamaCloud retrievers over a research document corpus."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"> User query string: Give me a summary of the SWE-bench paper\n",
"> Inferred query string: summary of the SWE-bench paper\n",
"> Inferred filters: {\"filters\":[{\"key\":\"file_name\",\"value\":\"swebench.pdf\",\"operator\":\"==\"}],\"condition\":\"and\"}\n",
"The SWE-bench paper introduces a new benchmark designed to evaluate the capabilities of language models (LMs) in real-world software engineering tasks. This benchmark, called SWE-bench, consists of 2,294 software engineering problems derived from GitHub issues and corresponding pull requests across 12 popular Python repositories. The task for the language models is to generate a pull request that resolves a given issue and passes the associated tests. SWE-bench challenges models to handle complex reasoning, long contexts, and cross-file code editing, which are typical in real-world software development but not commonly addressed in existing benchmarks.\n",
"\n",
"The paper highlights that current state-of-the-art models, including proprietary ones like Claude 2, struggle with these tasks, solving only a small percentage of them. The authors also introduce SWE-Llama, a fine-tuned version of the CodeLlama model, which shows competitive performance in some settings. SWE-bench is designed to be continually updatable, allowing for the inclusion of new task instances over time. The benchmark aims to push the development of more practical, intelligent, and autonomous language models for software engineering applications.\n"
]
}
],
"source": [
"response = auto_doc_rag(\"Give me a summary of the SWE-bench paper\") \n",
"print(str(response))"
]
},
{
"cell_type": "code",
"execution_count": 28,
"id": "c624a69b-637f-4c3d-b888-7f1525856d66",
"metadata": {},
"outputs": [
"cell_type": "markdown",
"id": "2e4f707a-c7b5-473f-b4a6-881e2245e82d",
"metadata": {},
"source": [
"## Setup LlamaCloud \n",
"\n",
"Install core packages and download relevant files. Upload these documents to LlamaCloud, and then define a chunk and document-level retriever interface over these documents.\n",
"\n",
"For more information on chunk-level and document-level retrieval, check out our interface [here](https://github.com/run-llama/llamacloud-demo/blob/main/examples/10k_apple_tesla/demo_file_retrieval.ipynb)."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"> User query string: Give me a summary of the Self-RAG paper\n",
"> Inferred query string: summary of the Self-RAG paper\n",
"> Inferred filters: {\"filters\":[{\"key\":\"file_name\",\"value\":\"selfrag.pdf\",\"operator\":\"==\"}],\"condition\":\"and\"}\n",
"The Self-RAG paper introduces a framework called Self-Reflective Retrieval-Augmented Generation (SELF-RAG) designed to enhance the quality and factuality of large language models (LLMs). SELF-RAG improves upon traditional Retrieval-Augmented Generation (RAG) by incorporating a self-reflection mechanism that allows the model to retrieve relevant information on-demand and critique its own outputs. This is achieved through the use of reflection tokens, which guide the model in deciding when to retrieve information and how to evaluate the relevance and support of the retrieved passages. The framework enables the model to adapt its behavior to different tasks, improving factual accuracy and citation precision. Experiments demonstrate that SELF-RAG outperforms state-of-the-art LLMs and retrieval-augmented models across various tasks, including open-domain question answering and long-form generation. The approach allows for customizable inference-time behavior, balancing retrieval frequency and model creativity based on task requirements.\n"
]
"cell_type": "code",
"execution_count": null,
"id": "9aa458bc-bc8d-46fe-9a57-021dd8d9e525",
"metadata": {},
"outputs": [],
"source": [
"!pip install llama-index\n",
"!pip install llama-index-core\n",
"!pip install llama-parse"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "79821400-caaf-42f1-99d8-74c184c19e29",
"metadata": {},
"outputs": [],
"source": [
"# NOTE: uncomment more papers if you want to do research over a larger subset of docs\n",
"\n",
"urls = [\n",
" # \"https://openreview.net/pdf?id=VtmBAGCN7o\",\n",
" # \"https://openreview.net/pdf?id=6PmJoRfdaK\",\n",
" # \"https://openreview.net/pdf?id=LzPWWPAdY4\",\n",
" \"https://openreview.net/pdf?id=VTF8yNQM66\",\n",
" \"https://openreview.net/pdf?id=hSyW5go0v8\",\n",
" # \"https://openreview.net/pdf?id=9WD9KwssyT\",\n",
" # \"https://openreview.net/pdf?id=yV6fD7LYkF\",\n",
" # \"https://openreview.net/pdf?id=hnrB5YHoYu\",\n",
" # \"https://openreview.net/pdf?id=WbWtOYIzIK\",\n",
" \"https://openreview.net/pdf?id=c5pwL0Soay\",\n",
" # \"https://openreview.net/pdf?id=TpD2aG1h0D\",\n",
"]\n",
"\n",
"papers = [\n",
" # \"metagpt.pdf\",\n",
" # \"longlora.pdf\",\n",
" # \"loftq.pdf\",\n",
" \"swebench.pdf\",\n",
" \"selfrag.pdf\",\n",
" # \"zipformer.pdf\",\n",
" # \"values.pdf\",\n",
" # \"finetune_fair_diffusion.pdf\",\n",
" # \"knowledge_card.pdf\",\n",
" \"metra.pdf\",\n",
" # \"vr_mcl.pdf\",\n",
"]\n",
"\n",
"data_dir = \"iclr_docs\""
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "80137d15-f22b-47eb-adce-ac295ced7e71",
"metadata": {},
"outputs": [],
"source": [
"!mkdir -p \"{data_dir}\"\n",
"for url, paper in zip(urls, papers):\n",
"    !wget \"{url}\" -O \"{data_dir}/{paper}\""
]
},
{
"cell_type": "markdown",
"id": "0315d639-52a0-4888-8779-653083ec3768",
"metadata": {},
"source": [
"#### Load Documents into LlamaCloud\n",
"\n",
"Create a new index in LlamaCloud and drag and drop these downloaded PDFs into the data source.\n",
"\n",
"For best results, open the \"Manual\" tab in the Transformation Configuration, select page-level segmentation, and set additional chunking to \"None\"."
]
},
{
"cell_type": "markdown",
"id": "3579938a-54b5-4eb2-b97e-22d96382eded",
"metadata": {},
"source": [
"#### Setup LlamaCloud Index"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "4208977f-896f-4993-a3a2-1be83c92baa1",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.indices.managed.llama_cloud import LlamaCloudIndex\n",
"import os\n",
"\n",
"index = LlamaCloudIndex(\n",
" name=\"research_papers_page\",\n",
" project_name=\"llamacloud_demo\",\n",
" api_key=os.environ[\"LLAMA_CLOUD_API_KEY\"]\n",
")"
]
},
{
"cell_type": "markdown",
"id": "d46f14ff-45b1-41f4-84e2-a6e5d6637809",
"metadata": {},
"source": [
"## Setup Auto-Retrieval\n",
"\n",
"Now we set up an **auto-retrieval** function over our LlamaCloud retrievers. At a high level, our auto-retrieval function uses a function-calling LLM to infer the metadata filters for a user query, which leads to more precise and relevant retrieval results than a raw semantic query alone.\n",
"\n",
"This section shows you how to build it from scratch, and includes some advanced few-shot example selection to increase reliability:\n",
"1. Define a custom prompt to generate metadata filters\n",
"2. Given a user query, first do chunk-level retrieval to dynamically retrieve the metadata of the retrieved chunks.\n",
"3. Inject the metadata as few-shot examples in the auto-retrieval prompt. The goal is to show the LLM what existing, relevant examples of metadata values already look like, so that the LLM can infer correct metadata filters.\n",
"\n",
"A lot of the code below is lifted from our **VectorIndexAutoRetriever** module, which provides an out-of-the-box way to do auto-retrieval against a vector index."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "652cb067-da39-42cb-a303-faa346f72e13",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.llms.openai import OpenAI\n",
"\n",
"llm = OpenAI(model=\"gpt-4o\")"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "b0a564cb-bfdb-48a5-9d67-10390c3a6c28",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.core.prompts import ChatPromptTemplate\n",
"from llama_index.core.vector_stores.types import VectorStoreInfo, VectorStoreQuerySpec, MetadataInfo, MetadataFilters\n",
"from llama_index.core.retrievers import BaseRetriever\n",
"from llama_index.core.query_engine import RetrieverQueryEngine\n",
"from llama_index.core import Response\n",
"\n",
"import json\n",
"\n",
"SYS_PROMPT = \"\"\"\\\n",
"Your goal is to structure the user's query to match the request schema provided below.\n",
"You MUST call the tool in order to generate the query spec.\n",
"\n",
"<< Structured Request Schema >>\n",
"When responding use a markdown code snippet with a JSON object formatted in the \\\n",
"following schema:\n",
"\n",
"{schema_str}\n",
"\n",
"The query string should contain only text that is expected to match the contents of \\\n",
"documents. Any conditions in the filter should not be mentioned in the query as well.\n",
"\n",
"Make sure that filters only refer to attributes that exist in the data source.\n",
"Make sure that filters take into account the descriptions of attributes.\n",
"Make sure that filters are only used as needed. If there are no filters that should be \\\n",
"applied return [] for the filter value.\\\n",
"\n",
"If the user's query explicitly mentions number of documents to retrieve, set top_k to \\\n",
"that number, otherwise do not set top_k.\n",
"\n",
"The schema of the metadata filters in the vector db table is listed below, along with some example metadata dictionaries from relevant rows.\n",
"The user will send the input query string.\n",
"\n",
"Data Source:\n",
"```json\n",
"{info_str}\n",
"```\n",
"\n",
"Example metadata from relevant chunks:\n",
"{example_rows}\n",
"\n",
"\"\"\"\n",
"\n",
"example_rows_retriever = index.as_retriever(\n",
"    retrieval_mode=\"chunks\",\n",
"    rerank_top_n=4\n",
")\n",
"\n",
"def get_example_rows_fn(**kwargs):\n",
"    \"\"\"Retrieve relevant few-shot examples.\"\"\"\n",
"    query_str = kwargs[\"query_str\"]\n",
"    nodes = example_rows_retriever.retrieve(query_str)\n",
"    # get the metadata, join them\n",
"    metadata_list = [n.metadata for n in nodes]\n",
"\n",
"    return \"\\n\".join([json.dumps(m) for m in metadata_list])\n",
"\n",
"\n",
"# Map the `example_rows` template variable to a function that fetches few-shot metadata examples at format time.\n",
"chat_prompt_tmpl = ChatPromptTemplate.from_messages(\n",
"    [\n",
"        (\"system\", SYS_PROMPT),\n",
"        (\"user\", \"{query_str}\"),\n",
"    ],\n",
"    function_mappings={\n",
"        \"example_rows\": get_example_rows_fn\n",
"    }\n",
")\n",
"\n",
"\n",
"# NOTE: VectorStoreInfo describes the store's content and the metadata fields the LLM may filter on\n",
"vector_store_info = VectorStoreInfo(\n",
"    content_info=\"contains content from various research papers\",\n",
"    metadata_info=[\n",
"        MetadataInfo(\n",
"            name=\"file_name\",\n",
"            type=\"str\",\n",
"            description=\"Name of the source paper\",\n",
"        ),\n",
"    ],\n",
")\n",
"\n",
"def auto_retriever_rag(query: str, retrieve_doc: bool = False, files_top_k: int = 1, rerank_top_n: int = 5) -> Response:\n",
"    \"\"\"Infers metadata filters for the query, then synthesizes an answer from the retrieved chunks (or entire documents if `retrieve_doc` is True).\"\"\"\n",
"    print(f\"> User query string: {query}\")\n",
"    # Use structured predict to infer the metadata filters and query string.\n",
"    query_spec = llm.structured_predict(\n",
"        VectorStoreQuerySpec,\n",
"        chat_prompt_tmpl,\n",
"        info_str=vector_store_info.model_dump_json(indent=4),\n",
"        schema_str=json.dumps(VectorStoreQuerySpec.model_json_schema()),\n",
"        query_str=query\n",
"    )\n",
"    # build retriever and query engine\n",
"    filters = MetadataFilters(filters=query_spec.filters) if len(query_spec.filters) > 0 else None\n",
"    print(f\"> Inferred query string: {query_spec.query}\")\n",
"    if filters:\n",
"        print(f\"> Inferred filters: {filters.json()}\")\n",
"\n",
"    # define retriever based on whether chunk or document-level is specified\n",
"    if retrieve_doc:\n",
"        retriever = index.as_retriever(\n",
"            retrieval_mode=\"files_via_content\",\n",
"            # retrieval_mode=\"files_via_metadata\",\n",
"            files_top_k=files_top_k,\n",
"            filters=filters\n",
"        )\n",
"    else:\n",
"        retriever = index.as_retriever(\n",
"            retrieval_mode=\"chunks\",\n",
"            rerank_top_n=rerank_top_n,\n",
"            filters=filters\n",
"        )\n",
"\n",
"    query_engine = RetrieverQueryEngine.from_args(\n",
"        retriever,\n",
"        llm=llm,\n",
"        response_mode=\"tree_summarize\"\n",
"    )\n",
"    # run query\n",
"    return query_engine.query(query_spec.query)\n"
]
},
{
"cell_type": "markdown",
"id": "d67303e6-ec65-499b-85bb-8189d220b466",
"metadata": {},
"source": [
"### Try out Auto-Retrieval\n",
"\n",
"Let's try running our auto-retriever on some sample queries. We try out both chunk-level and document-level retrieval."
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "b7533424-ae1f-4f04-9d31-f4a41b2b2709",
"metadata": {},
"outputs": [],
"source": [
"from functools import partial\n",
"\n",
"auto_doc_rag = partial(auto_retriever_rag, retrieve_doc=True)\n",
"auto_chunk_rag = partial(auto_retriever_rag, retrieve_doc=False)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "e664037b-3d89-481e-8e0e-2c1d47ffc226",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"> User query string: ELI5 the objective function in Metra\n",
"> Inferred query string: objective function in Metra\n",
"> Inferred filters: {\"filters\":[{\"key\":\"file_name\",\"value\":\"metra.pdf\",\"operator\":\"==\"}],\"condition\":\"and\"}\n",
"The objective function in METRA involves maximizing the expected inner product of the difference in state representations and a skill vector, subject to a Lipschitz constraint under the temporal distance metric. This is expressed as maximizing the expected value of \\((\\phi(s') - \\phi(s))^\\top z\\), where \\(\\phi\\) is a representation function, \\(s\\) and \\(s'\\) are states, and \\(z\\) is a skill vector. The constraint ensures that the difference in representations is bounded by the temporal distance between states, promoting the discovery of diverse behaviors that cover the latent space effectively.\n"
]
}
],
"source": [
"response = auto_chunk_rag(\"ELI5 the objective function in Metra\")\n",
"print(str(response))"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "ad5758fb-63dc-4815-a0b5-54224f784230",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"> User query string: How was SWE-Bench constructed? Tell me all the stages that went into it.\n",
"> Inferred query string: SWE-Bench construction stages\n",
"> Inferred filters: {\"filters\":[{\"key\":\"file_name\",\"value\":\"swebench.pdf\",\"operator\":\"==\"}],\"condition\":\"and\"}\n",
"The construction of SWE-bench involves a three-stage pipeline:\n",
"\n",
"1. **Repo Selection and Data Scraping**: This stage involves collecting pull requests (PRs) from 12 popular open-source Python repositories on GitHub, resulting in approximately 90,000 PRs. The focus is on popular repositories due to their better maintenance, clear contributor guidelines, and comprehensive test coverage.\n",
"\n",
"2. **Attribute-based Filtering**: In this stage, candidate tasks are created by selecting merged PRs that resolve a GitHub issue and make changes to the test files of the repository. This indicates that the user likely contributed tests to verify the resolution of the issue.\n",
"\n",
"3. **Execution-based Filtering**: For each candidate task, the PRs test content is applied, and the associated test results are logged before and after the PRs other content is applied. Task instances without at least one test changing from fail to pass are filtered out, as well as instances resulting in installation or runtime errors.\n",
"\n",
"Through these stages, the original 90,000 PRs are filtered down to 2,294 task instances that comprise SWE-bench.\n"
]
}
],
"source": [
"response = auto_chunk_rag(\"How was SWE-Bench constructed? Tell me all the stages that went into it.\")\n",
"print(str(response))"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "c8bd8cdb-44e8-4cad-9d3c-8207e336f42b",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"> User query string: Give me a summary of the SWE-bench paper\n",
"> Inferred query string: summary of the SWE-bench paper\n",
"> Inferred filters: {\"filters\":[{\"key\":\"file_name\",\"value\":\"swebench.pdf\",\"operator\":\"==\"}],\"condition\":\"and\"}\n",
"The SWE-bench paper introduces a new benchmark designed to evaluate the capabilities of language models (LMs) in real-world software engineering tasks. This benchmark, called SWE-bench, consists of 2,294 software engineering problems derived from GitHub issues and corresponding pull requests across 12 popular Python repositories. The task for the language models is to generate a pull request that resolves a given issue and passes the associated tests. SWE-bench challenges models to handle complex reasoning, long contexts, and cross-file code editing, which are typical in real-world software development but not commonly addressed in existing benchmarks.\n",
"\n",
"The paper highlights that current state-of-the-art models, including proprietary ones like Claude 2, struggle with these tasks, solving only a small percentage of them. The authors also introduce SWE-Llama, a fine-tuned version of the CodeLlama model, which shows competitive performance in some settings. SWE-bench is designed to be continually updatable, allowing for the inclusion of new task instances over time. The benchmark aims to push the development of more practical, intelligent, and autonomous language models for software engineering applications.\n"
]
}
],
"source": [
"response = auto_doc_rag(\"Give me a summary of the SWE-bench paper\") \n",
"print(str(response))"
]
},
{
"cell_type": "code",
"execution_count": 28,
"id": "c624a69b-637f-4c3d-b888-7f1525856d66",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"> User query string: Give me a summary of the Self-RAG paper\n",
"> Inferred query string: summary of the Self-RAG paper\n",
"> Inferred filters: {\"filters\":[{\"key\":\"file_name\",\"value\":\"selfrag.pdf\",\"operator\":\"==\"}],\"condition\":\"and\"}\n",
"The Self-RAG paper introduces a framework called Self-Reflective Retrieval-Augmented Generation (SELF-RAG) designed to enhance the quality and factuality of large language models (LLMs). SELF-RAG improves upon traditional Retrieval-Augmented Generation (RAG) by incorporating a self-reflection mechanism that allows the model to retrieve relevant information on-demand and critique its own outputs. This is achieved through the use of reflection tokens, which guide the model in deciding when to retrieve information and how to evaluate the relevance and support of the retrieved passages. The framework enables the model to adapt its behavior to different tasks, improving factual accuracy and citation precision. Experiments demonstrate that SELF-RAG outperforms state-of-the-art LLMs and retrieval-augmented models across various tasks, including open-domain question answering and long-form generation. The approach allows for customizable inference-time behavior, balancing retrieval frequency and model creativity based on task requirements.\n"
]
}
],
"source": [
"response = auto_doc_rag(\"Give me a summary of the Self-RAG paper\") \n",
"print(str(response))"
]
},
{
"cell_type": "markdown",
"id": "9fb7107d-a4f3-4ed3-9671-d27973d0a3fe",
"metadata": {},
"source": [
"## Next Steps\n",
"\n",
"Now that you've learned the basics of auto-retrieval, you can choose to build a standalone RAG pipeline powered by this, or choose to plug this in as part of a broader agentic system. For instance, you can plug in both chunk and doc-level auto-retriever pipelines as tools for an agent to interact with. "
]
}
],
"metadata": {
"kernelspec": {
"display_name": "llama_index_v3",
"language": "python",
"name": "llama_index_v3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.8"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
@@ -1,495 +1,502 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Corrective RAG Demo\n",
"\n",
"<a href=\"https://colab.research.google.com/github/run-llama/llamacloud-demo/blob/main/examples/advanced_rag/corrective_rag_workflow.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
"\n",
"This demo shows how you can use LlamaCloud and [Tavily AI](https://tavily.com/) to build a [Corrective RAG](https://arxiv.org/abs/2401.15884) workflow. The workflow uses the indexed documents on LlamaCloud as a primary tool, but falls back to web search using Tavily AI if the information requested in the query cannot be found on LlamaCloud.\n",
"\n",
"![](corrective_rag_workflow_img.png)\n",
"\n",
"A brief summary of the paper: \n",
"Corrective Retrieval Augmented Generation (CRAG) is a method designed to make language model generation more robust. It evaluates the relevance of retrieved documents with an evaluator and augments them with large-scale web searches, so that more accurate and reliable information is used in generation."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"Follow [these instructions](https://docs.cloud.llamaindex.ai/llamacloud/getting_started/quick_start) on how to set up your index. For this example, we will upload a paper about Llama2 onto LlamaCloud. On the configure data source step, download [this PDF paper](https://arxiv.org/pdf/2307.09288) and upload it into your index.\n",
"\n",
"After deploying your index, follow [these instructions](https://docs.cloud.llamaindex.ai/llamacloud/getting_started/api_key) on getting an API key. Once you are done with this, configure `nest_asyncio` and your environment variables."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install llama-index llama-index-indices-managed-llama-cloud llama-index-tools-tavily-research"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import nest_asyncio\n",
"nest_asyncio.apply()"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"<Your OpenAI API Key>\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## [Optional] Setup Observability\n",
"\n",
"We set up an integration with LlamaTrace (integration with Arize).\n",
"\n",
"If you haven't already done so, make sure to create an account here: https://llamatrace.com/login. Then create an API key and put it in the `PHOENIX_API_KEY` variable below."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install -U llama-index-callbacks-arize-phoenix"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"# setup Arize Phoenix for logging/observability\n",
"import llama_index.core\n",
"import os\n",
"\n",
"PHOENIX_API_KEY = \"<PHOENIX_API_KEY>\"\n",
"os.environ[\"OTEL_EXPORTER_OTLP_HEADERS\"] = f\"api_key={PHOENIX_API_KEY}\"\n",
"llama_index.core.set_global_handler(\n",
" \"arize_phoenix\", endpoint=\"https://llamatrace.com/v1/traces\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Designing the Workflow\n",
"\n",
"Corrective RAG consists of the following steps:\n",
"1. Ingestion of data - Happens outside the workflow: the documents are uploaded and indexed on LlamaCloud (see Setup), and the Tavily AI tool is created when the workflow is constructed.\n",
"2. Retrieval - Retrieves the most relevant nodes based on the query.\n",
"3. Relevance evaluation - Uses an LLM to determine whether each retrieved node is relevant to the query, given the content of the node.\n",
"4. Relevance extraction - Extracts the nodes which the LLM determined to be relevant.\n",
"5. Query transformation and Tavily search - If any node is irrelevant, uses an LLM to rewrite the query for web search, then uses Tavily to search the web for a relevant answer based on the rewritten query.\n",
"6. Response generation - Builds a summary index from the text of the relevant nodes and the Tavily search results, and queries this index with the original query.\n",
"\n",
"The following events are needed:\n",
"1. `RetrieveEvent` - Event containing information about the retrieved nodes.\n",
"2. `WebSearchEvent` - Event that triggers the web search; it carries the concatenated text of the relevant nodes through to the next step.\n",
"3. `QueryEvent` - Event containing both the relevant text and search text."
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [],
"source": [
"from typing import List\n",
"\n",
"from llama_index.core.schema import NodeWithScore\n",
"from llama_index.core.workflow import (\n",
" Event,\n",
")\n",
"\n",
"class RetrieveEvent(Event):\n",
" \"\"\"Retrieve event (gets retrieved nodes).\"\"\"\n",
"\n",
" retrieved_nodes: List[NodeWithScore]\n",
"\n",
"\n",
"\n",
"class WebSearchEvent(Event):\n",
" \"\"\"Web search event.\"\"\"\n",
"\n",
" relevant_text: str # not used, just used for pass through\n",
"\n",
"\n",
"class QueryEvent(Event):\n",
" \"\"\"Query event. Queries given relevant text and search text.\"\"\"\n",
"\n",
" relevant_text: str\n",
" search_text: str"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Below is the code for the workflow."
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [],
"source": [
"from typing import Optional, Any\n",
"\n",
"from llama_index.core.workflow import (\n",
" StartEvent,\n",
" StopEvent,\n",
" step,\n",
" Workflow,\n",
" Context,\n",
")\n",
"from llama_index.core import SummaryIndex\n",
"from llama_index.core.schema import Document\n",
"from llama_index.core.prompts import PromptTemplate\n",
"from llama_index.core.llms import LLM\n",
"from llama_index.llms.openai import OpenAI\n",
"from llama_index.core.base.base_retriever import BaseRetriever\n",
"from llama_index.indices.managed.llama_cloud import LlamaCloudIndex\n",
"from llama_index.tools.tavily_research import TavilyToolSpec\n",
"\n",
"DEFAULT_RELEVANCY_PROMPT_TEMPLATE = PromptTemplate(\n",
" template=\"\"\"As a grader, your task is to evaluate the relevance of a document retrieved in response to a user's question.\n",
"\n",
" Retrieved Document:\n",
" -------------------\n",
" {context_str}\n",
"\n",
" User Question:\n",
" --------------\n",
" {query_str}\n",
"\n",
" Evaluation Criteria:\n",
" - Consider whether the document contains keywords or topics related to the user's question.\n",
" - The evaluation should not be overly stringent; the primary objective is to identify and filter out clearly irrelevant retrievals.\n",
"\n",
" Decision:\n",
" - Assign a binary score to indicate the document's relevance.\n",
" - Use 'yes' if the document is relevant to the question, or 'no' if it is not.\n",
"\n",
" Please provide your binary score ('yes' or 'no') below to indicate the document's relevance to the user question.\"\"\"\n",
")\n",
"\n",
"DEFAULT_TRANSFORM_QUERY_TEMPLATE = PromptTemplate(\n",
" template=\"\"\"Your task is to refine a query to ensure it is highly effective for retrieving relevant search results. \\n\n",
" Analyze the given input to grasp the core semantic intent or meaning. \\n\n",
" Original Query:\n",
" \\n ------- \\n\n",
" {query_str}\n",
" \\n ------- \\n\n",
" Your goal is to rephrase or enhance this query to improve its search performance. Ensure the revised query is concise and directly aligned with the intended search objective. \\n\n",
" Respond with the optimized query only:\"\"\"\n",
")\n",
"\n",
"\n",
"class CorrectiveRAGWorkflow(Workflow):\n",
" \"\"\"Corrective RAG Workflow.\"\"\"\n",
" def __init__(\n",
" self,\n",
" index,\n",
" tavily_ai_apikey: str,\n",
" llm: Optional[LLM] = None,\n",
" **kwargs: Any\n",
" ) -> None:\n",
" \"\"\"Init params.\"\"\"\n",
" super().__init__(**kwargs)\n",
" self.index = index\n",
" self.tavily_tool = TavilyToolSpec(api_key=tavily_ai_apikey)\n",
" self.llm = llm or OpenAI(model=\"gpt-4o\")\n",
"\n",
" @step\n",
" async def retrieve(self, ctx: Context, ev: StartEvent) -> Optional[RetrieveEvent]:\n",
" \"\"\"Retrieve the relevant nodes for the query.\"\"\"\n",
" query_str = ev.get(\"query_str\")\n",
" retriever_kwargs = ev.get(\"retriever_kwargs\", {})\n",
"\n",
" if query_str is None:\n",
" return None\n",
"\n",
" retriever: BaseRetriever = self.index.as_retriever(**retriever_kwargs)\n",
" result = retriever.retrieve(query_str)\n",
" await ctx.set(\"retrieved_nodes\", result)\n",
" await ctx.set(\"query_str\", query_str)\n",
" return RetrieveEvent(retrieved_nodes=result)\n",
"\n",
" @step\n",
" async def eval_relevance(\n",
" self, ctx: Context, ev: RetrieveEvent\n",
" ) -> WebSearchEvent | QueryEvent:\n",
" \"\"\"Evaluate relevancy of retrieved documents with the query.\"\"\"\n",
" retrieved_nodes = ev.retrieved_nodes\n",
" query_str = await ctx.get(\"query_str\")\n",
"\n",
" relevancy_results = []\n",
" for node in retrieved_nodes:\n",
" prompt = DEFAULT_RELEVANCY_PROMPT_TEMPLATE.format(context_str=node.text, query_str=query_str)\n",
" relevancy = self.llm.complete(prompt)\n",
" relevancy_results.append(relevancy.text.lower().strip())\n",
"\n",
" relevant_texts = [\n",
" retrieved_nodes[i].text\n",
" for i, result in enumerate(relevancy_results)\n",
" if result == \"yes\"\n",
" ]\n",
" relevant_text = \"\\n\".join(relevant_texts)\n",
" if \"no\" in relevancy_results:\n",
" return WebSearchEvent(relevant_text=relevant_text)\n",
" else:\n",
" return QueryEvent(relevant_text=relevant_text, search_text=\"\")\n",
"\n",
" @step\n",
" async def web_search(\n",
" self, ctx: Context, ev: WebSearchEvent\n",
" ) -> QueryEvent:\n",
" \"\"\"Search the transformed query with Tavily API.\"\"\"\n",
" # If any document is found irrelevant, transform the query string for better search results.\n",
"\n",
" query_str = await ctx.get(\"query_str\")\n",
"\n",
" prompt = DEFAULT_TRANSFORM_QUERY_TEMPLATE.format(query_str=query_str)\n",
" result = self.llm.complete(prompt)\n",
" transformed_query_str = result.text\n",
" # Conduct a search with the transformed query string and collect the results.\n",
" search_results = self.tavily_tool.search(\n",
" transformed_query_str, max_results=5\n",
" )\n",
" search_text = \"\\n\".join([result.text for result in search_results])\n",
" return QueryEvent(relevant_text=ev.relevant_text, search_text=search_text)\n",
"\n",
" @step\n",
" async def query_result(self, ctx: Context, ev: QueryEvent) -> StopEvent:\n",
" \"\"\"Get result with relevant text.\"\"\"\n",
" relevant_text = ev.relevant_text\n",
" search_text = ev.search_text\n",
" query_str = await ctx.get(\"query_str\")\n",
"\n",
" documents = [Document(text=relevant_text + \"\\n\" + search_text)]\n",
" index = SummaryIndex.from_documents(documents)\n",
" query_engine = index.as_query_engine()\n",
" result = query_engine.query(query_str)\n",
" return StopEvent(result=result)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create LlamaCloudIndex\n",
"\n",
"Create a `LlamaCloudIndex` which retrieves information from the index you have on LlamaCloud."
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [],
"source": [
"from llama_index.indices.managed.llama_cloud import LlamaCloudIndex\n",
"import os\n",
"\n",
"index = LlamaCloudIndex(\n",
" name=\"llama2_paper\",\n",
" project_name=\"tutorials\",\n",
" organization_id=\"<org_id>\",\n",
" api_key=os.environ[\"LLAMA_CLOUD_API_KEY\"]\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"See [here](https://docs.cloud.llamaindex.ai/organizations) for a tutorial on how to use organizations."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create the workflow. This reads the Tavily API key from the `TAVILY_API_KEY` environment variable, so set it beforehand:"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [],
"source": [
"workflow = CorrectiveRAGWorkflow(index=index, tavily_ai_apikey=os.environ[\"TAVILY_API_KEY\"], verbose=True, timeout=60)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Visualize Workflow"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
"cells": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'NoneType'>\n",
"<class '__main__.WebSearchEvent'>\n",
"<class '__main__.QueryEvent'>\n",
"<class 'llama_index.core.workflow.events.StopEvent'>\n",
"<class '__main__.RetrieveEvent'>\n",
"<class '__main__.QueryEvent'>\n",
"crag_workflow.html\n"
]
}
],
"source": [
"from llama_index.utils.workflow import draw_all_possible_flows\n",
"\n",
"draw_all_possible_flows(CorrectiveRAGWorkflow, filename=\"crag_workflow.html\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example queries"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Running step retrieve\n",
"Step retrieve produced event RetrieveEvent\n",
"Running step eval_relevance\n",
"Step eval_relevance produced event QueryEvent\n",
"Running step query_result\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"Failed to detach context\n",
"Traceback (most recent call last):\n",
" File \"/Users/jerryliu/Programming/llamacloud-demo/.venv/lib/python3.10/site-packages/opentelemetry/context/__init__.py\", line 154, in detach\n",
" _RUNTIME_CONTEXT.detach(token)\n",
" File \"/Users/jerryliu/Programming/llamacloud-demo/.venv/lib/python3.10/site-packages/opentelemetry/context/contextvars_context.py\", line 50, in detach\n",
" self._current_context.reset(token) # type: ignore\n",
"ValueError: <Token var=<ContextVar name='current_context' default={} at 0x154d55ad0> at 0x29dedac40> was created in a different Context\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Step query_result produced event StopEvent\n"
]
},
{
"data": {
"text/markdown": [
"Llama 2 was pretrained using an optimized auto-regressive transformer model. The pretraining process involved robust data cleaning, updated data mixes, training on a large number of tokens, doubling the context length, and utilizing grouped-query attention to enhance inference scalability for larger models."
],
"text/plain": [
"<IPython.core.display.Markdown object>"
"cell_type": "markdown",
"metadata": {},
"source": [
"# ⚠️ Important Notice\n\nThis notebook (and repository) is deprecated.\n\nFor the latest Python examples, please refer to the `llama-cloud-services` repository examples: \nhttps://github.com/run-llama/llama_cloud_services/tree/main/examples\n\n---"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from IPython.display import display, Markdown\n",
"\n",
"result = await workflow.run(query_str=\"How was Llama2 pretrained?\") # this was in the given paper\n",
"display(Markdown(str(result)))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Running step retrieve\n"
]
"cell_type": "markdown",
"metadata": {},
"source": [
"# Corrective RAG Demo\n",
"\n",
"<a href=\"https://colab.research.google.com/github/run-llama/llamacloud-demo/blob/main/examples/advanced_rag/corrective_rag_workflow.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
"\n",
"This demo shows how you can use LlamaCloud and [Tavily AI](https://tavily.com/) to build a [Corrective RAG](https://arxiv.org/abs/2401.15884) workflow. The workflow uses the documents indexed on LlamaCloud as its primary source, but falls back to web search using Tavily AI if the information needed to answer the query cannot be found on LlamaCloud.\n",
"\n",
"![](corrective_rag_workflow_img.png)\n",
"\n",
"A brief summary of the paper: \n",
"Corrective Retrieval Augmented Generation (CRAG) is a method for making language model generation more robust: a retrieval evaluator grades the relevance of the retrieved documents, and large-scale web search augments them when they fall short, so that generation draws on more accurate and reliable information."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"Follow [these instructions](https://docs.cloud.llamaindex.ai/llamacloud/getting_started/quick_start) to set up your index. For this example, we will upload the Llama 2 paper to LlamaCloud. At the configure data source step, download [this PDF paper](https://arxiv.org/pdf/2307.09288) and upload it into your index.\n",
"\n",
"After deploying your index, follow [these instructions](https://docs.cloud.llamaindex.ai/llamacloud/getting_started/api_key) to get an API key. Once you are done, configure `nest_asyncio` and your environment variables."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"%pip install llama-index llama-index-indices-managed-llama-cloud llama-index-tools-tavily-research"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import nest_asyncio\n",
"nest_asyncio.apply()"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"<Your OpenAI API Key>\""
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## [Optional] Setup Observability\n",
"\n",
"We set up an integration with LlamaTrace (an integration with Arize Phoenix).\n",
"\n",
"If you haven't already done so, make sure to create an account here: https://llamatrace.com/login. Then create an API key and put it in the `PHOENIX_API_KEY` variable below."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install -U llama-index-callbacks-arize-phoenix"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"# setup Arize Phoenix for logging/observability\n",
"import llama_index.core\n",
"import os\n",
"\n",
"PHOENIX_API_KEY = \"<PHOENIX_API_KEY>\"\n",
"os.environ[\"OTEL_EXPORTER_OTLP_HEADERS\"] = f\"api_key={PHOENIX_API_KEY}\"\n",
"llama_index.core.set_global_handler(\n",
" \"arize_phoenix\", endpoint=\"https://llamatrace.com/v1/traces\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Designing the Workflow\n",
"\n",
"Corrective RAG consists of the following steps:\n",
"1. Ingestion of data - Happens outside the workflow: the documents are uploaded and indexed on LlamaCloud (see Setup), and the Tavily AI tool is created when the workflow is constructed.\n",
"2. Retrieval - Retrieves the most relevant nodes based on the query.\n",
"3. Relevance evaluation - Uses an LLM to determine whether each retrieved node is relevant to the query, given the content of the node.\n",
"4. Relevance extraction - Extracts the nodes which the LLM determined to be relevant.\n",
"5. Query transformation and Tavily search - If any node is irrelevant, uses an LLM to rewrite the query for web search, then uses Tavily to search the web for a relevant answer based on the rewritten query.\n",
"6. Response generation - Builds a summary index from the text of the relevant nodes and the Tavily search results, and queries this index with the original query.\n",
"\n",
"The following events are needed:\n",
"1. `RetrieveEvent` - Event containing information about the retrieved nodes.\n",
"2. `WebSearchEvent` - Event that triggers the web search; it carries the concatenated text of the relevant nodes through to the next step.\n",
"3. `QueryEvent` - Event containing both the relevant text and search text."
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [],
"source": [
"from typing import List\n",
"\n",
"from llama_index.core.schema import NodeWithScore\n",
"from llama_index.core.workflow import (\n",
" Event,\n",
")\n",
"\n",
"class RetrieveEvent(Event):\n",
" \"\"\"Retrieve event (gets retrieved nodes).\"\"\"\n",
"\n",
" retrieved_nodes: List[NodeWithScore]\n",
"\n",
"\n",
"\n",
"class WebSearchEvent(Event):\n",
" \"\"\"Web search event.\"\"\"\n",
"\n",
" relevant_text: str # not used, just used for pass through\n",
"\n",
"\n",
"class QueryEvent(Event):\n",
" \"\"\"Query event. Queries given relevant text and search text.\"\"\"\n",
"\n",
" relevant_text: str\n",
" search_text: str"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Below is the code for the workflow."
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [],
"source": [
"from typing import Optional, Any\n",
"\n",
"from llama_index.core.workflow import (\n",
" StartEvent,\n",
" StopEvent,\n",
" step,\n",
" Workflow,\n",
" Context,\n",
")\n",
"from llama_index.core import SummaryIndex\n",
"from llama_index.core.schema import Document\n",
"from llama_index.core.prompts import PromptTemplate\n",
"from llama_index.core.llms import LLM\n",
"from llama_index.llms.openai import OpenAI\n",
"from llama_index.core.base.base_retriever import BaseRetriever\n",
"from llama_index.indices.managed.llama_cloud import LlamaCloudIndex\n",
"from llama_index.tools.tavily_research import TavilyToolSpec\n",
"\n",
"DEFAULT_RELEVANCY_PROMPT_TEMPLATE = PromptTemplate(\n",
" template=\"\"\"As a grader, your task is to evaluate the relevance of a document retrieved in response to a user's question.\n",
"\n",
" Retrieved Document:\n",
" -------------------\n",
" {context_str}\n",
"\n",
" User Question:\n",
" --------------\n",
" {query_str}\n",
"\n",
" Evaluation Criteria:\n",
" - Consider whether the document contains keywords or topics related to the user's question.\n",
" - The evaluation should not be overly stringent; the primary objective is to identify and filter out clearly irrelevant retrievals.\n",
"\n",
" Decision:\n",
" - Assign a binary score to indicate the document's relevance.\n",
" - Use 'yes' if the document is relevant to the question, or 'no' if it is not.\n",
"\n",
" Please provide your binary score ('yes' or 'no') below to indicate the document's relevance to the user question.\"\"\"\n",
")\n",
"\n",
"DEFAULT_TRANSFORM_QUERY_TEMPLATE = PromptTemplate(\n",
" template=\"\"\"Your task is to refine a query to ensure it is highly effective for retrieving relevant search results. \\n\n",
" Analyze the given input to grasp the core semantic intent or meaning. \\n\n",
" Original Query:\n",
" \\n ------- \\n\n",
" {query_str}\n",
" \\n ------- \\n\n",
" Your goal is to rephrase or enhance this query to improve its search performance. Ensure the revised query is concise and directly aligned with the intended search objective. \\n\n",
" Respond with the optimized query only:\"\"\"\n",
")\n",
"\n",
"\n",
"class CorrectiveRAGWorkflow(Workflow):\n",
" \"\"\"Corrective RAG Workflow.\"\"\"\n",
" def __init__(\n",
" self,\n",
" index,\n",
" tavily_ai_apikey: str,\n",
" llm: Optional[LLM] = None,\n",
" **kwargs: Any\n",
" ) -> None:\n",
" \"\"\"Init params.\"\"\"\n",
" super().__init__(**kwargs)\n",
" self.index = index\n",
" self.tavily_tool = TavilyToolSpec(api_key=tavily_ai_apikey)\n",
" self.llm = llm or OpenAI(model=\"gpt-4o\")\n",
"\n",
" @step\n",
" async def retrieve(self, ctx: Context, ev: StartEvent) -> Optional[RetrieveEvent]:\n",
" \"\"\"Retrieve the relevant nodes for the query.\"\"\"\n",
" query_str = ev.get(\"query_str\")\n",
" retriever_kwargs = ev.get(\"retriever_kwargs\", {})\n",
"\n",
" if query_str is None:\n",
" return None\n",
"\n",
" retriever: BaseRetriever = self.index.as_retriever(**retriever_kwargs)\n",
" result = retriever.retrieve(query_str)\n",
" await ctx.set(\"retrieved_nodes\", result)\n",
" await ctx.set(\"query_str\", query_str)\n",
" return RetrieveEvent(retrieved_nodes=result)\n",
"\n",
" @step\n",
" async def eval_relevance(\n",
" self, ctx: Context, ev: RetrieveEvent\n",
" ) -> WebSearchEvent | QueryEvent:\n",
" \"\"\"Evaluate relevancy of retrieved documents with the query.\"\"\"\n",
" retrieved_nodes = ev.retrieved_nodes\n",
" query_str = await ctx.get(\"query_str\")\n",
"\n",
" relevancy_results = []\n",
" for node in retrieved_nodes:\n",
" prompt = DEFAULT_RELEVANCY_PROMPT_TEMPLATE.format(context_str=node.text, query_str=query_str)\n",
" relevancy = self.llm.complete(prompt)\n",
" relevancy_results.append(relevancy.text.lower().strip())\n",
"\n",
" relevant_texts = [\n",
" retrieved_nodes[i].text\n",
" for i, result in enumerate(relevancy_results)\n",
" if result == \"yes\"\n",
" ]\n",
" relevant_text = \"\\n\".join(relevant_texts)\n",
" if \"no\" in relevancy_results:\n",
" return WebSearchEvent(relevant_text=relevant_text)\n",
" else:\n",
" return QueryEvent(relevant_text=relevant_text, search_text=\"\")\n",
"\n",
" @step\n",
" async def web_search(\n",
" self, ctx: Context, ev: WebSearchEvent\n",
" ) -> QueryEvent:\n",
" \"\"\"Search the transformed query with Tavily API.\"\"\"\n",
" # If any document is found irrelevant, transform the query string for better search results.\n",
"\n",
" query_str = await ctx.get(\"query_str\")\n",
"\n",
" prompt = DEFAULT_TRANSFORM_QUERY_TEMPLATE.format(query_str=query_str)\n",
" result = self.llm.complete(prompt)\n",
" transformed_query_str = result.text\n",
" # Conduct a search with the transformed query string and collect the results.\n",
" search_results = self.tavily_tool.search(\n",
" transformed_query_str, max_results=5\n",
" )\n",
" search_text = \"\\n\".join([result.text for result in search_results])\n",
" return QueryEvent(relevant_text=ev.relevant_text, search_text=search_text)\n",
"\n",
" @step\n",
" async def query_result(self, ctx: Context, ev: QueryEvent) -> StopEvent:\n",
" \"\"\"Get result with relevant text.\"\"\"\n",
" relevant_text = ev.relevant_text\n",
" search_text = ev.search_text\n",
" query_str = await ctx.get(\"query_str\")\n",
"\n",
" documents = [Document(text=relevant_text + \"\\n\" + search_text)]\n",
" index = SummaryIndex.from_documents(documents)\n",
" query_engine = index.as_query_engine()\n",
" result = query_engine.query(query_str)\n",
" return StopEvent(result=result)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create LlamaCloudIndex\n",
"\n",
"Create a `LlamaCloudIndex` which retrieves information from the index you have on LlamaCloud."
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [],
"source": [
"from llama_index.indices.managed.llama_cloud import LlamaCloudIndex\n",
"import os\n",
"\n",
"index = LlamaCloudIndex(\n",
" name=\"llama2_paper\",\n",
" project_name=\"tutorials\",\n",
" organization_id=\"<org_id>\",\n",
" api_key=os.environ[\"LLAMA_CLOUD_API_KEY\"]\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"See [here](https://docs.cloud.llamaindex.ai/organizations) for a tutorial on how to use organizations."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create the workflow. This reads the Tavily API key from the `TAVILY_API_KEY` environment variable, so set it beforehand:"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [],
"source": [
"workflow = CorrectiveRAGWorkflow(index=index, tavily_ai_apikey=os.environ[\"TAVILY_API_KEY\"], verbose=True, timeout=60)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Visualize Workflow"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'NoneType'>\n",
"<class '__main__.WebSearchEvent'>\n",
"<class '__main__.QueryEvent'>\n",
"<class 'llama_index.core.workflow.events.StopEvent'>\n",
"<class '__main__.RetrieveEvent'>\n",
"<class '__main__.QueryEvent'>\n",
"crag_workflow.html\n"
]
}
],
"source": [
"from llama_index.utils.workflow import draw_all_possible_flows\n",
"\n",
"draw_all_possible_flows(CorrectiveRAGWorkflow, filename=\"crag_workflow.html\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Example queries"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Running step retrieve\n",
"Step retrieve produced event RetrieveEvent\n",
"Running step eval_relevance\n",
"Step eval_relevance produced event QueryEvent\n",
"Running step query_result\n"
]
},
{
"name": "stderr",
"output_type": "stream",
"text": [
"Failed to detach context\n",
"Traceback (most recent call last):\n",
" File \"/Users/jerryliu/Programming/llamacloud-demo/.venv/lib/python3.10/site-packages/opentelemetry/context/__init__.py\", line 154, in detach\n",
" _RUNTIME_CONTEXT.detach(token)\n",
" File \"/Users/jerryliu/Programming/llamacloud-demo/.venv/lib/python3.10/site-packages/opentelemetry/context/contextvars_context.py\", line 50, in detach\n",
" self._current_context.reset(token) # type: ignore\n",
"ValueError: <Token var=<ContextVar name='current_context' default={} at 0x154d55ad0> at 0x29dedac40> was created in a different Context\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Step query_result produced event StopEvent\n"
]
},
{
"data": {
"text/markdown": [
"Llama 2 was pretrained using an optimized auto-regressive transformer model. The pretraining process involved robust data cleaning, updated data mixes, training on a large number of tokens, doubling the context length, and utilizing grouped-query attention to enhance inference scalability for larger models."
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from IPython.display import display, Markdown\n",
"\n",
"result = await workflow.run(query_str=\"How was Llama2 pretrained?\") # this was in the given paper\n",
"display(Markdown(str(result)))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Running step retrieve\n"
]
}
],
"source": [
"result = await workflow.run(query_str=\"Where does the airline flight UA 1 fly?\") # this info is not in the paper\n",
"display(Markdown(str(result)))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "llamacloud-demo",
"language": "python",
"name": "llamacloud-demo"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.8"
}
],
"source": [
"result = await workflow.run(query_str=\"Where does the airline flight UA 1 fly?\") # this info is not in the paper\n",
"display(Markdown(str(result)))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "llamacloud-demo",
"language": "python",
"name": "llamacloud-demo"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.8"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
"nbformat": 4,
"nbformat_minor": 4
}
+199 -192
@@ -1,199 +1,206 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "ced5edf1-4d1d-4e81-ba1e-58b26ac6ca8c",
"metadata": {},
"source": [
"# LlamaIndex Platform Demo"
]
},
{
"cell_type": "markdown",
"id": "b450616c-ec1b-48ad-a5be-0e98a1123831",
"metadata": {},
"source": [
"## Step 0: Setup environment config for platform"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "da7a0c2e-54d2-4e8c-9edc-c1c457fc24f9",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"LLAMA_CLOUD_API_KEY\"] = \"your-api-key\""
]
},
{
"cell_type": "markdown",
"id": "a89b27f6-d9ab-43f9-8c0d-56b68b89f5d5",
"metadata": {},
"source": [
"## Step 1: Configure ingestion pipeline (data source, transformations)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "654f252d-d68d-4a91-8a40-5bffce8c61ab",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.core.ingestion import IngestionPipeline\n",
"from llama_index.core import SimpleDirectoryReader\n",
"from llama_index.core.node_parser import SentenceSplitter\n",
"from llama_index.embeddings.openai import OpenAIEmbedding"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "aef2e044-3188-4e46-9a3a-7105a223d07e",
"metadata": {},
"outputs": [],
"source": [
"reader = SimpleDirectoryReader(input_files=['data_sec/source_files/uber_2021.pdf'])\n",
"docs = reader.load_data()"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "8165a372-329b-46d4-9b27-9b146f19ff09",
"metadata": {},
"outputs": [],
"source": [
"sec_pipeline = IngestionPipeline(\n",
" project_name='sec analysis',\n",
" name='uber',\n",
" documents=docs,\n",
" transformations=[\n",
" SentenceSplitter(),\n",
" OpenAIEmbedding(),\n",
" ]\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "4a974962-6874-4edd-b1c1-386bf86d5f5c",
"metadata": {},
"outputs": [
"cells": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Pipeline available at: https://cloud.llamaindex.ai/project/a18cb1c9-393e-44bc-af5b-2cbc554c7a3f/playground/ca240136-f463-46d4-a7f9-70f9566c0b31\n"
]
}
],
"source": [
"sec_pipeline_id = sec_pipeline.register()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d30e465d-bb09-4995-89bf-66c21a184108",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.core.llama_dataset import LabelledRagDataset\n",
"rag_dataset = LabelledRagDataset.from_json(\"./data_sec/rag_dataset.json\")\n",
"questions = [example.query for example in rag_dataset.examples[:5]]"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "05fedd5e-4d2c-4ce4-b90f-d735071aacd3",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1. According to the context information provided, what is the state of incorporation for Uber Technologies, Inc., and what is the company's IRS Employer Identification Number?\n",
"2. Based on the information from the document, which type of annual report did Uber Technologies, Inc. file with the SEC for the fiscal year ended December 31, 2021, and on which stock exchange is Uber's Common Stock registered?\n",
"3. According to the context information provided from the \"uber_2021.pdf\" document, what is the classification of the filer as indicated by the check mark, and what does this classification imply regarding the company's filing requirements?\n",
"4. As of June 30, 2021, what was the aggregate market value of the voting and non-voting common equity held by non-affiliates of the registrant, and on which stock exchange was this value based?\n",
"5. According to the table of contents in the \"UBER TECHNOLOGIES, INC.\" document, what are the main topics covered under Item 7 in Part II, and on which page does this section begin?\n"
]
}
],
"source": [
"for ind, question in enumerate(questions):\n",
" print(f\"{ind + 1}. {question}\")"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "f8009211-afe5-4ed2-95a4-41961cdcc447",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Uploaded 5 questions to dataset AI generated - 5 questions\n"
]
"cell_type": "markdown",
"metadata": {},
"source": [
"# ⚠️ Important Notice\n\nThis notebook (and repository) is deprecated.\n\nFor the latest python examples, please refer to the `llama-cloud-services` repository examples: \nhttps://github.com/run-llama/llama_cloud_services/tree/main/examples\n\n---"
]
},
{
"data": {
"text/plain": [
"'27d98fff-4dd2-4eb0-8904-24128082eeea'"
"cell_type": "markdown",
"id": "ced5edf1-4d1d-4e81-ba1e-58b26ac6ca8c",
"metadata": {},
"source": [
"# LlamaIndex Platform Demo"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
},
{
"cell_type": "markdown",
"id": "b450616c-ec1b-48ad-a5be-0e98a1123831",
"metadata": {},
"source": [
"## Step 0: Setup environment config for platform"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "da7a0c2e-54d2-4e8c-9edc-c1c457fc24f9",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"LLAMA_CLOUD_API_KEY\"] = \"your-api-key\""
]
},
{
"cell_type": "markdown",
"id": "a89b27f6-d9ab-43f9-8c0d-56b68b89f5d5",
"metadata": {},
"source": [
"## Step 1: Configure ingestion pipeline (data source, transformations)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "654f252d-d68d-4a91-8a40-5bffce8c61ab",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.core.ingestion import IngestionPipeline\n",
"from llama_index.core import SimpleDirectoryReader\n",
"from llama_index.core.node_parser import SentenceSplitter\n",
"from llama_index.embeddings.openai import OpenAIEmbedding"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "aef2e044-3188-4e46-9a3a-7105a223d07e",
"metadata": {},
"outputs": [],
"source": [
"reader = SimpleDirectoryReader(input_files=['data_sec/source_files/uber_2021.pdf'])\n",
"docs = reader.load_data()"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "8165a372-329b-46d4-9b27-9b146f19ff09",
"metadata": {},
"outputs": [],
"source": [
"sec_pipeline = IngestionPipeline(\n",
" project_name='sec analysis',\n",
" name='uber',\n",
" documents=docs,\n",
" transformations=[\n",
" SentenceSplitter(),\n",
" OpenAIEmbedding(),\n",
" ]\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "4a974962-6874-4edd-b1c1-386bf86d5f5c",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Pipeline available at: https://cloud.llamaindex.ai/project/a18cb1c9-393e-44bc-af5b-2cbc554c7a3f/playground/ca240136-f463-46d4-a7f9-70f9566c0b31\n"
]
}
],
"source": [
"sec_pipeline_id = sec_pipeline.register()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d30e465d-bb09-4995-89bf-66c21a184108",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.core.llama_dataset import LabelledRagDataset\n",
"rag_dataset = LabelledRagDataset.from_json(\"./data_sec/rag_dataset.json\")\n",
"questions = [example.query for example in rag_dataset.examples[:5]]"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "05fedd5e-4d2c-4ce4-b90f-d735071aacd3",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1. According to the context information provided, what is the state of incorporation for Uber Technologies, Inc., and what is the company's IRS Employer Identification Number?\n",
"2. Based on the information from the document, which type of annual report did Uber Technologies, Inc. file with the SEC for the fiscal year ended December 31, 2021, and on which stock exchange is Uber's Common Stock registered?\n",
"3. According to the context information provided from the \"uber_2021.pdf\" document, what is the classification of the filer as indicated by the check mark, and what does this classification imply regarding the company's filing requirements?\n",
"4. As of June 30, 2021, what was the aggregate market value of the voting and non-voting common equity held by non-affiliates of the registrant, and on which stock exchange was this value based?\n",
"5. According to the table of contents in the \"UBER TECHNOLOGIES, INC.\" document, what are the main topics covered under Item 7 in Part II, and on which page does this section begin?\n"
]
}
],
"source": [
"for ind, question in enumerate(questions):\n",
" print(f\"{ind + 1}. {question}\")"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "f8009211-afe5-4ed2-95a4-41961cdcc447",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Uploaded 5 questions to dataset AI generated - 5 questions\n"
]
},
{
"data": {
"text/plain": [
"'27d98fff-4dd2-4eb0-8904-24128082eeea'"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from llama_index.core.evaluation.eval_utils import upload_eval_dataset\n",
"\n",
"upload_eval_dataset(\n",
" project_name='sec analysis',\n",
" dataset_name='AI generated - 5 questions',\n",
" questions=questions,\n",
" overwrite=True,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "18d113bd-5afc-4def-9869-2a6271439cfb",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
}
],
"source": [
"from llama_index.core.evaluation.eval_utils import upload_eval_dataset\n",
"\n",
"upload_eval_dataset(\n",
" project_name='sec analysis',\n",
" dataset_name='AI generated - 5 questions',\n",
" questions=questions,\n",
" overwrite=True,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "18d113bd-5afc-4def-9869-2a6271439cfb",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.3"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
"nbformat": 4,
"nbformat_minor": 5
}
+290 -283
View File
@@ -1,287 +1,294 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "318d5a8d-7f6f-4fff-8c2d-cd0e290b6f49",
"metadata": {},
"source": [
"# Building and Evaluating a RAG Pipeline with LlamaCloud and our Batch Evaluator\n",
"\n",
"In this notebook we show you how to easily construct a RAG pipeline with a LlamaCloud Index, and then run evaluations against that index using our batch evaluator."
]
},
{
"cell_type": "markdown",
"id": "1188feb6-bedf-40d5-8812-29b90431f8b5",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"Here we define some basic imports."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "84324692-17f5-4584-8932-6687ab811b66",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# attach to the same event-loop\n",
"import nest_asyncio\n",
"\n",
"nest_asyncio.apply()"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "dc3f08bd-81ad-4b05-89cc-33409e3d1f71",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Response\n",
"from llama_index.llms.openai import OpenAI\n",
"from llama_index.core.evaluation import (\n",
" FaithfulnessEvaluator,\n",
" RelevancyEvaluator,\n",
" CorrectnessEvaluator,\n",
")\n",
"from llama_index.core.node_parser import SentenceSplitter\n",
"import pandas as pd\n",
"\n",
"pd.set_option(\"display.max_colwidth\", 0)"
]
},
{
"cell_type": "markdown",
"id": "95eb1f3a-b0bf-4588-8302-3799c520ffc3",
"metadata": {},
"source": [
"## Build RAG Pipeline from LlamaCloud Index\n",
"\n",
"The LlamaCloud index is built over the 2021 Lyft and Uber 10K documents.\n",
"\n",
"To create the index, follow the instructions:\n",
"1. You can download them here ([Uber 10K](https://www.dropbox.com/s/te0a2w227v27iag/uber_2021.pdf?dl=1), [Lyft 10K](https://www.dropbox.com/s/qctkz6nxhm0y5qe/lyft_2021.pdf?dl=1))\n",
"2. Follow instructions on `https://cloud.llamaindex.ai/` to signup for an account. Create a pipeline by uploading these documents."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e44c21cc-ad10-40b5-89ca-f0d95c2a6714",
"metadata": {},
"outputs": [],
"source": [
"! pip install llama-index-indices-managed-llama-cloud"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e83560c6",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"LLAMA_CLOUD_API_KEY\"] = \"llx-\""
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "c7bc2ff3-8c95-4de4-965f-1ca5bcf61f9b",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from llama_index.indices.managed.llama_cloud import LlamaCloudIndex\n",
"\n",
"index = LlamaCloudIndex(\n",
" name=\"<index_name>\", \n",
" project_name=\"<project_name>\",\n",
" api_key=os.environ[\"LLAMA_CLOUD_API_KEY\"]\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "8b5eafcb-5529-4fa5-a681-0d5e014ec517",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"query = \"Tell me about the risk factors for Uber\""
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "520199ca-4271-4708-9efc-76e244f4cc22",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"response = index.as_query_engine().query(query)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "b30d952c-d502-4888-a7e8-3a72ca0d052f",
"metadata": {
"tags": []
},
"outputs": [
"cells": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Airbyte is a platform that allows users to sync their data from various sources into a destination database, such as Snowflake. It provides functionalities for data ingestion and transformation, enabling users to easily move and work with data from different platforms.\n"
]
"cell_type": "markdown",
"metadata": {},
"source": [
"# ⚠️ Important Notice\n\nThis notebook (and repository) is deprecated.\n\nFor the latest python examples, please refer to the `llama-cloud-services` repository examples: \nhttps://github.com/run-llama/llama_cloud_services/tree/main/examples\n\n---"
]
},
{
"cell_type": "markdown",
"id": "318d5a8d-7f6f-4fff-8c2d-cd0e290b6f49",
"metadata": {},
"source": [
"# Building and Evaluating a RAG Pipeline with LlamaCloud and our Batch Evaluator\n",
"\n",
"In this notebook we show you how to easily construct a RAG pipeline with a LlamaCloud Index, and then run evaluations against that index using our batch evaluator."
]
},
{
"cell_type": "markdown",
"id": "1188feb6-bedf-40d5-8812-29b90431f8b5",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"Here we define some basic imports."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "84324692-17f5-4584-8932-6687ab811b66",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# attach to the same event-loop\n",
"import nest_asyncio\n",
"\n",
"nest_asyncio.apply()"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "dc3f08bd-81ad-4b05-89cc-33409e3d1f71",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Response\n",
"from llama_index.llms.openai import OpenAI\n",
"from llama_index.core.evaluation import (\n",
" FaithfulnessEvaluator,\n",
" RelevancyEvaluator,\n",
" CorrectnessEvaluator,\n",
")\n",
"from llama_index.core.node_parser import SentenceSplitter\n",
"import pandas as pd\n",
"\n",
"pd.set_option(\"display.max_colwidth\", 0)"
]
},
{
"cell_type": "markdown",
"id": "95eb1f3a-b0bf-4588-8302-3799c520ffc3",
"metadata": {},
"source": [
"## Build RAG Pipeline from LlamaCloud Index\n",
"\n",
"The LlamaCloud index is built over the 2021 Lyft and Uber 10K documents.\n",
"\n",
"To create the index, follow the instructions:\n",
"1. You can download them here ([Uber 10K](https://www.dropbox.com/s/te0a2w227v27iag/uber_2021.pdf?dl=1), [Lyft 10K](https://www.dropbox.com/s/qctkz6nxhm0y5qe/lyft_2021.pdf?dl=1))\n",
"2. Follow instructions on `https://cloud.llamaindex.ai/` to signup for an account. Create a pipeline by uploading these documents."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e44c21cc-ad10-40b5-89ca-f0d95c2a6714",
"metadata": {},
"outputs": [],
"source": [
"! pip install llama-index-indices-managed-llama-cloud"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e83560c6",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"LLAMA_CLOUD_API_KEY\"] = \"llx-\""
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "c7bc2ff3-8c95-4de4-965f-1ca5bcf61f9b",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from llama_index.indices.managed.llama_cloud import LlamaCloudIndex\n",
"\n",
"index = LlamaCloudIndex(\n",
" name=\"<index_name>\", \n",
" project_name=\"<project_name>\",\n",
" api_key=os.environ[\"LLAMA_CLOUD_API_KEY\"]\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "8b5eafcb-5529-4fa5-a681-0d5e014ec517",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"query = \"Tell me about the risk factors for Uber\""
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "520199ca-4271-4708-9efc-76e244f4cc22",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"response = index.as_query_engine().query(query)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "b30d952c-d502-4888-a7e8-3a72ca0d052f",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Airbyte is a platform that allows users to sync their data from various sources into a destination database, such as Snowflake. It provides functionalities for data ingestion and transformation, enabling users to easily move and work with data from different platforms.\n"
]
}
],
"source": [
"print(str(response))"
]
},
{
"cell_type": "markdown",
"id": "026a1616-874f-4e59-a7d5-b1be6a5525cd",
"metadata": {},
"source": [
"## Setup Batch Evaluator\n",
"\n",
"Here we setup a batch evaluator, which can run evaluations over a batch dataset. We start by defining the set of metrics that we want to measure over this dataset."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "34208172-079b-48d0-8bb2-6e57288ccc57",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# gpt-4\n",
"gpt4 = OpenAI(temperature=0, model=\"gpt-4\")\n",
"gpt35 = OpenAI(model=\"gpt-3.5-turbo\")\n",
"\n",
"faithfulness_gpt4 = FaithfulnessEvaluator(llm=gpt4)\n",
"relevancy_gpt4 = RelevancyEvaluator(llm=gpt4)\n",
"correctness_gpt4 = CorrectnessEvaluator(llm=gpt4)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d59da19e-9c75-48f7-8235-b0a458ca6a04",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"print(response.source_nodes[2].get_content())"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "47ed4750-92cd-49ec-b1a3-03bdbc425d97",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from llama_index.core.evaluation import BatchEvalRunner\n",
"\n",
"runner = BatchEvalRunner(\n",
" {\"faithfulness\": faithfulness_gpt4, \"relevancy\": relevancy_gpt4},\n",
" workers=8,\n",
")\n",
"\n",
"eval_results = await runner.aevaluate_queries(\n",
" index.as_query_engine(), queries=[query]\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c89268b0-299f-4da5-b729-0e344988a1de",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"eval_results"
]
},
{
"cell_type": "markdown",
"id": "48ece9e7-5e70-46c4-88c3-0eb7bd733892",
"metadata": {},
"source": [
"## Upload Results\n",
"\n",
"Once you obtain a set of eval results, you're able to upload them."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5aedc2a7-3eae-49c7-9f1b-242d15917693",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"runner.upload_eval_results(\n",
" project_name=\"default project\",\n",
" app_name=\"default app\",\n",
" results=eval_results\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0ed0a112-93da-4df7-b1c6-3e779ae53211",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "llama_index_v3",
"language": "python",
"name": "llama_index_v3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.8"
}
],
"source": [
"print(str(response))"
]
},
{
"cell_type": "markdown",
"id": "026a1616-874f-4e59-a7d5-b1be6a5525cd",
"metadata": {},
"source": [
"## Setup Batch Evaluator\n",
"\n",
"Here we setup a batch evaluator, which can run evaluations over a batch dataset. We start by defining the set of metrics that we want to measure over this dataset."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "34208172-079b-48d0-8bb2-6e57288ccc57",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# gpt-4\n",
"gpt4 = OpenAI(temperature=0, model=\"gpt-4\")\n",
"gpt35 = OpenAI(model=\"gpt-3.5-turbo\")\n",
"\n",
"faithfulness_gpt4 = FaithfulnessEvaluator(llm=gpt4)\n",
"relevancy_gpt4 = RelevancyEvaluator(llm=gpt4)\n",
"correctness_gpt4 = CorrectnessEvaluator(llm=gpt4)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d59da19e-9c75-48f7-8235-b0a458ca6a04",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"print(response.source_nodes[2].get_content())"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "47ed4750-92cd-49ec-b1a3-03bdbc425d97",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from llama_index.core.evaluation import BatchEvalRunner\n",
"\n",
"runner = BatchEvalRunner(\n",
" {\"faithfulness\": faithfulness_gpt4, \"relevancy\": relevancy_gpt4},\n",
" workers=8,\n",
")\n",
"\n",
"eval_results = await runner.aevaluate_queries(\n",
" index.as_query_engine(), queries=[query]\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c89268b0-299f-4da5-b729-0e344988a1de",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"eval_results"
]
},
{
"cell_type": "markdown",
"id": "48ece9e7-5e70-46c4-88c3-0eb7bd733892",
"metadata": {},
"source": [
"## Upload Results\n",
"\n",
"Once you obtain a set of eval results, you're able to upload them."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5aedc2a7-3eae-49c7-9f1b-242d15917693",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"runner.upload_eval_results(\n",
" project_name=\"default project\",\n",
" app_name=\"default app\",\n",
" results=eval_results\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0ed0a112-93da-4df7-b1c6-3e779ae53211",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "llama_index_v3",
"language": "python",
"name": "llama_index_v3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.8"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
"nbformat": 4,
"nbformat_minor": 5
}
File diff suppressed because it is too large Load Diff
+382 -375
View File
@@ -1,384 +1,391 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "5b1055c3-6dc5-4620-ab42-8e5838bf16e4",
"metadata": {},
"source": [
"# LlamaCloud Client SDK: Document Metadata Management\n",
"\n",
"<a href=\"https://colab.research.google.com/github/run-llama/llamacloud-demo/blob/main/examples/client_sdk/doc_metadata.ipynb\n",
" \" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
"\n",
"This tutorial shows you how to update metadata onto a document.\n",
"\n",
"**NOTE**: To add new documents with metadata, check out our \"Inserting Custom Documents\" tutorial.\n",
"\n",
"You can update metadata in two ways with the low-level client SDK: \n",
"- Using our `update_pipeline_file` method to update the metadata of an uploaded file.\n",
"- Using our `upsert_batch_pipeline_documents` method to update the metadata of uploaded documents."
]
},
{
"cell_type": "markdown",
"id": "f586c2a0-95f5-4943-9163-778008a680b0",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"Here we setup our environment variables, data, and the client SDK."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "5edc1bde-e217-42d3-8bdc-fd1e8c4b325d",
"metadata": {},
"outputs": [],
"source": [
"import nest_asyncio\n",
"nest_asyncio.apply()"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "67a907cb-a727-4c12-86c9-ca2c55d73a59",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"LLAMA_CLOUD_BASE_URL\"] = \"https://api.cloud.llamaindex.ai\""
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e150bc97-7b34-4817-b65f-f909f76045d5",
"metadata": {},
"outputs": [],
"source": [
"os.environ[\"LLAMA_CLOUD_API_KEY\"] = \"<LLAMA_CLOUD_API_KEY>\"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"<OPENAI_API_KEY>\""
]
},
{
"cell_type": "markdown",
"id": "3f0f2ff7-e5e5-4452-b7bc-69e0b77827af",
"metadata": {},
"source": [
"#### Load Data"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "0c42e736-da9e-4ef7-a5a7-3145ef362703",
"metadata": {},
"outputs": [
"cells": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"--2024-07-03 21:18:33-- https://s2.q4cdn.com/470004039/files/doc_financials/2021/q4/_10-K-2021-(As-Filed).pdf\n",
"Resolving s2.q4cdn.com (s2.q4cdn.com)... 2a0b:4d07:2::3, 2a0b:4d07:2::1, 2a0b:4d07:2::4, ...\n",
"Connecting to s2.q4cdn.com (s2.q4cdn.com)|2a0b:4d07:2::3|:443... connected.\n",
"HTTP request sent, awaiting response... 200 OK\n",
"Length: 789896 (771K) [application/pdf]\n",
"Saving to: apple_2021_10k.pdf\n",
"\n",
"apple_2021_10k.pdf 100%[===================>] 771.38K --.-KB/s in 0.06s \n",
"\n",
"2024-07-03 21:18:33 (12.3 MB/s) - apple_2021_10k.pdf saved [789896/789896]\n",
"\n"
]
}
],
"source": [
"!wget \"https://s2.q4cdn.com/470004039/files/doc_financials/2021/q4/_10-K-2021-(As-Filed).pdf\" -O data/apple_2021_10k.pdf"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "89b1b06f-bce5-4f8e-8387-b81a40f0969f",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Error while parsing the file './apple_2021_10k.pdf': [Errno 2] No such file or directory: './apple_2021_10k.pdf'\n"
]
}
],
"source": [
"from llama_parse import LlamaParse\n",
"\n",
"documents = LlamaParse(result_type=\"markdown\").load_data(\"./data/apple_2021_10k.pdf\")"
]
},
{
"cell_type": "markdown",
"id": "08b546bd-b23b-4c3c-bbb2-023dbb2c651a",
"metadata": {},
"source": [
"#### Setup LlamaCloud Client SDK"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "d4c36489",
"metadata": {},
"outputs": [],
"source": [
"from llama_cloud.client import LlamaCloud\n",
"\n",
"client = LlamaCloud(\n",
" token=os.environ[\"LLAMA_CLOUD_API_KEY\"],\n",
" base_url=os.environ[\"LLAMA_CLOUD_BASE_URL\"]\n",
")"
]
},
{
"cell_type": "markdown",
"id": "9932fc54-82c5-4cc1-bd4b-105fd232209b",
"metadata": {},
"source": [
"#### Setup Index\n",
"\n",
"Please setup an empty index. You can either do this through the UI or [programmatically](https://docs.cloud.llamaindex.ai/llamacloud/guides/framework_integration).\n",
"\n",
"After you've done so, make sure to note down the pipeline_id, pipeline_name, project_id, and project_name in the variables below. You'll need these later! "
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "04e80c0d-d58d-4fce-aa78-9247b9840738",
"metadata": {},
"outputs": [],
"source": [
"pipeline_id = \"<pipeline_id>\"\n",
"pipeline_name = \"<pipeline_name>\"\n",
"project_id = \"<project_id>\"\n",
"project_name = \"<project_name>\""
]
},
{
"cell_type": "markdown",
"id": "5c2d17d9-5ad3-4892-8962-dd3182d397b2",
"metadata": {},
"source": [
"## Updating Metadata in Files\n",
"\n",
"\n",
"can be from manually uploaded files or data source files after ingested"
]
},
{
"cell_type": "markdown",
"id": "a0ab3b6d-ca5f-4c01-a4cb-e715cc6cb1e8",
"metadata": {},
"source": [
"#### Updating Metadata through `update_pipeline_file`"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "27e331ab-8b33-4e27-9d79-3cb8e19718ac",
"metadata": {},
"outputs": [],
"source": [
"# upload file and add file to pipeline\n",
"with open('data/apple_2021_10k.pdf', 'rb') as f:\n",
" file = client.files.upload_file(upload_file=f, project_id=project_id)\n",
" pipeline_files = client.pipelines.add_files_to_pipeline(pipeline_id, request=[{'file_id': file.id}]) "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "84e239bf-44fd-4019-b447-e7ffabadea61",
"metadata": {},
"outputs": [],
"source": [
"# adding metadata\n",
"pipeline_files = client.pipelines.update_pipeline_file(\n",
" pipeline_id=pipeline_id, file_id=file.id, custom_metadata={ \"editor\": \"jerry_liu\" }\n",
") "
]
},
{
"cell_type": "markdown",
"id": "837f88ed-122d-4afc-b997-41d2a1b45804",
"metadata": {},
"source": [
"#### Updating Metadata through `upsert_batch_pipeline_documents`"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "fe66ab74-e080-43b3-9684-aa474b949693",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1"
"cell_type": "markdown",
"metadata": {},
"source": [
"# ⚠️ Important Notice\n\nThis notebook (and repository) is deprecated.\n\nFor the latest python examples, please refer to the `llama-cloud-services` repository examples: \nhttps://github.com/run-llama/llama_cloud_services/tree/main/examples\n\n---"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pipeline_docs = client.pipelines.list_pipeline_documents(pipeline_id)\n",
"len(pipeline_docs)"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "034e95a8-e73e-48f4-90b9-a05c2ba96d1c",
"metadata": {
"scrolled": true
},
"outputs": [
},
{
"data": {
"text/plain": [
"{'file_size': '789896',\n",
" 'last_modified_at': '2024-07-04T06:39:23',\n",
" 'file_path': 'apple_2021_10k.pdf',\n",
" 'file_name': 'apple_2021_10k.pdf',\n",
" 'pipeline_id': 'b4b8a624-cd50-4f54-8d20-a756427d961f'}"
"cell_type": "markdown",
"id": "5b1055c3-6dc5-4620-ab42-8e5838bf16e4",
"metadata": {},
"source": [
"# LlamaCloud Client SDK: Document Metadata Management\n",
"\n",
"<a href=\"https://colab.research.google.com/github/run-llama/llamacloud-demo/blob/main/examples/client_sdk/doc_metadata.ipynb\n",
" \" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
"\n",
"This tutorial shows you how to update metadata onto a document.\n",
"\n",
"**NOTE**: To add new documents with metadata, check out our \"Inserting Custom Documents\" tutorial.\n",
"\n",
"You can update metadata in two ways with the low-level client SDK: \n",
"- Using our `update_pipeline_file` method to update the metadata of an uploaded file.\n",
"- Using our `upsert_batch_pipeline_documents` method to update the metadata of uploaded documents."
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# inspect the first document\n",
"pipeline_docs[0].metadata"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "6076e810-154b-4e96-a4db-88b42a182d03",
"metadata": {},
"outputs": [],
"source": [
"# change the metadata of the document\n",
"pipeline_docs[0].metadata[\"editor\"] = \"simon_suo\""
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "eecdc4ff-178a-4157-962d-90d39bc2ff27",
"metadata": {},
"outputs": [
},
{
"data": {
"text/plain": [
"{'file_size': '789896',\n",
" 'last_modified_at': '2024-07-04T06:37:50',\n",
" 'file_path': 'apple_2021_10k.pdf',\n",
" 'file_name': 'apple_2021_10k.pdf',\n",
" 'pipeline_id': 'a2de81e0-6917-4e23-8874-5f5170b1aa79',\n",
" 'editor': 'simon_suo'}"
"cell_type": "markdown",
"id": "f586c2a0-95f5-4943-9163-778008a680b0",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"Here we setup our environment variables, data, and the client SDK."
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
},
{
"cell_type": "code",
"execution_count": 2,
"id": "5edc1bde-e217-42d3-8bdc-fd1e8c4b325d",
"metadata": {},
"outputs": [],
"source": [
"import nest_asyncio\n",
"nest_asyncio.apply()"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "67a907cb-a727-4c12-86c9-ca2c55d73a59",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"LLAMA_CLOUD_BASE_URL\"] = \"https://api.cloud.llamaindex.ai\""
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "e150bc97-7b34-4817-b65f-f909f76045d5",
"metadata": {},
"outputs": [],
"source": [
"os.environ[\"LLAMA_CLOUD_API_KEY\"] = \"<LLAMA_CLOUD_API_KEY>\"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"<OPENAI_API_KEY>\""
]
},
{
"cell_type": "markdown",
"id": "3f0f2ff7-e5e5-4452-b7bc-69e0b77827af",
"metadata": {},
"source": [
"#### Load Data"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "0c42e736-da9e-4ef7-a5a7-3145ef362703",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"--2024-07-03 21:18:33-- https://s2.q4cdn.com/470004039/files/doc_financials/2021/q4/_10-K-2021-(As-Filed).pdf\n",
"Resolving s2.q4cdn.com (s2.q4cdn.com)... 2a0b:4d07:2::3, 2a0b:4d07:2::1, 2a0b:4d07:2::4, ...\n",
"Connecting to s2.q4cdn.com (s2.q4cdn.com)|2a0b:4d07:2::3|:443... connected.\n",
"HTTP request sent, awaiting response... 200 OK\n",
"Length: 789896 (771K) [application/pdf]\n",
"Saving to: apple_2021_10k.pdf\n",
"\n",
"apple_2021_10k.pdf 100%[===================>] 771.38K --.-KB/s in 0.06s \n",
"\n",
"2024-07-03 21:18:33 (12.3 MB/s) - apple_2021_10k.pdf saved [789896/789896]\n",
"\n"
]
}
],
"source": [
"!wget \"https://s2.q4cdn.com/470004039/files/doc_financials/2021/q4/_10-K-2021-(As-Filed).pdf\" -O data/apple_2021_10k.pdf"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "89b1b06f-bce5-4f8e-8387-b81a40f0969f",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Error while parsing the file './apple_2021_10k.pdf': [Errno 2] No such file or directory: './apple_2021_10k.pdf'\n"
]
}
],
"source": [
"from llama_parse import LlamaParse\n",
"\n",
"documents = LlamaParse(result_type=\"markdown\").load_data(\"./data/apple_2021_10k.pdf\")"
]
},
{
"cell_type": "markdown",
"id": "08b546bd-b23b-4c3c-bbb2-023dbb2c651a",
"metadata": {},
"source": [
"#### Setup LlamaCloud Client SDK"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "d4c36489",
"metadata": {},
"outputs": [],
"source": [
"from llama_cloud.client import LlamaCloud\n",
"\n",
"client = LlamaCloud(\n",
" token=os.environ[\"LLAMA_CLOUD_API_KEY\"],\n",
" base_url=os.environ[\"LLAMA_CLOUD_BASE_URL\"]\n",
")"
]
},
{
"cell_type": "markdown",
"id": "9932fc54-82c5-4cc1-bd4b-105fd232209b",
"metadata": {},
"source": [
"#### Setup Index\n",
"\n",
"Please setup an empty index. You can either do this through the UI or [programmatically](https://docs.cloud.llamaindex.ai/llamacloud/guides/framework_integration).\n",
"\n",
"After you've done so, make sure to note down the pipeline_id, pipeline_name, project_id, and project_name in the variables below. You'll need these later! "
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "04e80c0d-d58d-4fce-aa78-9247b9840738",
"metadata": {},
"outputs": [],
"source": [
"pipeline_id = \"<pipeline_id>\"\n",
"pipeline_name = \"<pipeline_name>\"\n",
"project_id = \"<project_id>\"\n",
"project_name = \"<project_name>\""
]
},
{
"cell_type": "markdown",
"id": "5c2d17d9-5ad3-4892-8962-dd3182d397b2",
"metadata": {},
"source": [
"## Updating Metadata in Files\n",
"\n",
"\n",
"can be from manually uploaded files or data source files after ingested"
]
},
{
"cell_type": "markdown",
"id": "a0ab3b6d-ca5f-4c01-a4cb-e715cc6cb1e8",
"metadata": {},
"source": [
"#### Updating Metadata through `update_pipeline_file`"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "27e331ab-8b33-4e27-9d79-3cb8e19718ac",
"metadata": {},
"outputs": [],
"source": [
"# upload file and add file to pipeline\n",
"with open('data/apple_2021_10k.pdf', 'rb') as f:\n",
" file = client.files.upload_file(upload_file=f, project_id=project_id)\n",
" pipeline_files = client.pipelines.add_files_to_pipeline(pipeline_id, request=[{'file_id': file.id}]) "
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "84e239bf-44fd-4019-b447-e7ffabadea61",
"metadata": {},
"outputs": [],
"source": [
"# adding metadata\n",
"pipeline_files = client.pipelines.update_pipeline_file(\n",
" pipeline_id=pipeline_id, file_id=file.id, custom_metadata={ \"editor\": \"jerry_liu\" }\n",
") "
]
},
{
"cell_type": "markdown",
"id": "837f88ed-122d-4afc-b997-41d2a1b45804",
"metadata": {},
"source": [
"#### Updating Metadata through `upsert_batch_pipeline_documents`"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "fe66ab74-e080-43b3-9684-aa474b949693",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"1"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"pipeline_docs = client.pipelines.list_pipeline_documents(pipeline_id)\n",
"len(pipeline_docs)"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "034e95a8-e73e-48f4-90b9-a05c2ba96d1c",
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"{'file_size': '789896',\n",
" 'last_modified_at': '2024-07-04T06:39:23',\n",
" 'file_path': 'apple_2021_10k.pdf',\n",
" 'file_name': 'apple_2021_10k.pdf',\n",
" 'pipeline_id': 'b4b8a624-cd50-4f54-8d20-a756427d961f'}"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# inspect the first document\n",
"pipeline_docs[0].metadata"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "6076e810-154b-4e96-a4db-88b42a182d03",
"metadata": {},
"outputs": [],
"source": [
"# change the metadata of the document\n",
"pipeline_docs[0].metadata[\"editor\"] = \"simon_suo\""
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "eecdc4ff-178a-4157-962d-90d39bc2ff27",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'file_size': '789896',\n",
" 'last_modified_at': '2024-07-04T06:37:50',\n",
" 'file_path': 'apple_2021_10k.pdf',\n",
" 'file_name': 'apple_2021_10k.pdf',\n",
" 'pipeline_id': 'a2de81e0-6917-4e23-8874-5f5170b1aa79',\n",
" 'editor': 'simon_suo'}"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"upserted_docs = client.pipelines.upsert_batch_pipeline_documents(pipeline_id, request=[pipeline_docs[0]])\n",
"upserted_docs[0].metadata"
]
},
{
"cell_type": "markdown",
"id": "b408db36-be44-4ce8-a535-073ebea177f7",
"metadata": {},
"source": [
"## Test Retrieval\n",
"\n",
"We test retrieval through the framework integration."
]
},
{
"cell_type": "markdown",
"id": "00ecdb0c-f8d5-4ed6-95a0-acd738c2b0ef",
"metadata": {},
"source": [
"#### Retrieval Through the Framework Integration\n",
"\n",
"We can also define a retriever through the Python framework, through our `LlamaCloudIndex`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "23ccb658-6620-49f6-a363-5f608733b2f5",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.indices.managed.llama_cloud import LlamaCloudIndex\n",
"import os\n",
"\n",
"index = LlamaCloudIndex(\n",
" name=pipeline_name, \n",
" project_name=project_name,\n",
" api_key=os.getenv(\"LLAMA_CLOUD_API_KEY\")\n",
")\n",
"\n",
"query_engine = index.as_query_engine(rerank_top_n=1)\n",
"response = query_engine.query(\"Who is the editor of this document.\")\n",
"print(str(response) + \"\\n-------\\n\\nSources:\\n\\n\")\n",
"for n in response.source_nodes:\n",
" print(n.get_content(metadata_mode=\"all\"))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b9345c9c-1f7e-447f-b478-9149be73dc79",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "llama_index_v3",
"language": "python",
"name": "llama_index_v3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.8"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
"nbformat": 4,
"nbformat_minor": 5
}
+455 -448
View File
@@ -1,454 +1,461 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "8c1348d3-4c0e-450f-8faf-19503f61b7b2",
"metadata": {},
"source": [
"# LlamaCloud Client SDK: Integrating LlamaHub Loaders\n",
"\n",
"<a href=\"https://colab.research.google.com/github/run-llama/llamacloud-demo/blob/main/examples/client_sdk/llamahub_doc.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
"\n",
"This tutorial shows you how to use LlamaHub loaders to insert documents with LlamaCloud, with the help of the lower-level LlamaCloud Client SDK.\n",
"\n",
"In this example, we use the [Firecrawl web page reader](https://www.firecrawl.dev/) in order to load a web page document, and feed it into our LlamaCloud pipeline."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "e83a35ec-8e6c-475c-827c-20f46c4a3c43",
"metadata": {},
"outputs": [],
"source": [
"import nest_asyncio\n",
"nest_asyncio.apply()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f47019e7-5bf8-49ee-8ab1-1f1ae1ae55e4",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"!pip install llama-index\n",
"!pip install llama-cloud\n",
"!pip install llama-index-readers-web firecrawl-py"
]
},
{
"cell_type": "markdown",
"id": "57082f55-66e0-44e1-8072-2450405c21d1",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"Here we setup our environment variables, client, and load data using the Firecrawl loader available on LlamaHub.\n",
"\n",
"The Firecrawl loader is available as part of our `llama-index-readers-web` package."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "67a907cb-a727-4c12-86c9-ca2c55d73a59",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"LLAMA_CLOUD_BASE_URL\"] = \"https://api.cloud.llamaindex.ai\""
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "ddd884dc-8da5-498d-a186-d08d30e8478e",
"metadata": {},
"outputs": [],
"source": [
"os.environ[\"LLAMA_CLOUD_API_KEY\"] = \"<LLAMA_CLOUD_API_KEY>\"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"<OPENAI_API_KEY>\""
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "2235076d-3032-41ee-b936-f8a0081ec5af",
"metadata": {},
"outputs": [],
"source": [
"FIRECRAWL_API_KEY = \"<FIRECRAWL_API_KEY>\""
]
},
{
"cell_type": "markdown",
"id": "f72f00a3-cace-4975-bc73-13280d0f5d32",
"metadata": {},
"source": [
"#### Load Data"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "6590c2eb-4a0d-4a5c-8dd9-092ed43d2d48",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.readers.web import FireCrawlWebReader\n",
"\n",
"# using firecrawl to crawl a website\n",
"firecrawl_reader = FireCrawlWebReader(\n",
" api_key=FIRECRAWL_API_KEY,\n",
" mode=\"scrape\", # Choose between \"crawl\" and \"scrape\" for single page scraping\n",
" # params={\"additional\": \"parameters\"}, # Optional additional parameters\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 36,
"id": "b59927cc-c43b-4c40-9840-586b3fce39e4",
"metadata": {},
"outputs": [],
"source": [
"# Load documents from a single page URL\n",
"documents = firecrawl_reader.load_data(url=\"https://www.oreilly.com/radar/what-we-learned-from-a-year-of-building-with-llms-part-i/\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "83cdb3e5-902b-4d79-930b-df2063c3a39b",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"print(documents[0].get_content())"
]
},
{
"cell_type": "markdown",
"id": "5d91a633-34c7-4c5b-b6cb-de37e80ace68",
"metadata": {},
"source": [
"#### Setup Index\n",
"\n",
"Please setup an empty index. You can either do this through the UI or [programmatically](https://docs.cloud.llamaindex.ai/llamacloud/guides/framework_integration).\n",
"\n",
"After you've done so, make sure to note down the pipeline_id, pipeline_name, project_id, and project_name in the variables below. You'll need these later! "
]
},
{
"cell_type": "code",
"execution_count": 38,
"id": "04e80c0d-d58d-4fce-aa78-9247b9840738",
"metadata": {},
"outputs": [],
"source": [
"pipeline_id = \"<pipeline_id>\"\n",
"pipeline_name = \"<pipeline_name>\"\n",
"project_id = \"<project_id>\"\n",
"project_name = \"<project_name>\""
]
},
{
"cell_type": "markdown",
"id": "55afeb53",
"metadata": {},
"source": [
"#### Setup LlamaCloud Client SDK and Framework Client\n",
"\n",
"Here we define both the client (giving us access to low-level client operations) as well as the `LlamaCloudIndex` defined through the framework."
]
},
{
"cell_type": "code",
"execution_count": 39,
"id": "d4c36489",
"metadata": {},
"outputs": [],
"source": [
"from llama_cloud.client import LlamaCloud\n",
"\n",
"client = LlamaCloud(\n",
" token=os.environ[\"LLAMA_CLOUD_API_KEY\"],\n",
" base_url=os.environ[\"LLAMA_CLOUD_BASE_URL\"]\n",
")\n",
"\n",
"from llama_index.indices.managed.llama_cloud import LlamaCloudIndex\n",
"import os\n",
"\n",
"index = LlamaCloudIndex(\n",
" name=pipeline_name, \n",
" project_name=project_name,\n",
" api_key=os.getenv(\"LLAMA_CLOUD_API_KEY\")\n",
")"
]
},
{
"cell_type": "markdown",
"id": "ce9e45c5-4255-4dfa-ae3b-f33aac640df5",
"metadata": {},
"source": [
"## Inserting Documents\n",
"\n",
"Now let's create the custom Document objects. We assume that your pipeline has been created in the last section. Copy the pipeline and project ids into the box below.\n",
"\n",
"We insert one document containing the parsed document text, and another document as a toy example."
]
},
{
"cell_type": "markdown",
"id": "d3b7171d-eb65-479b-8bab-a38e7f5ef221",
"metadata": {},
"source": [
"#### Inserting Document Objects through the Client SDK"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "adfde1cf-1941-4bf6-ad0b-903ebd298ae0",
"metadata": {},
"outputs": [],
"source": [
"cloud_documents = [d.to_cloud_document() for d in documents]\n",
"upserted_docs = client.pipelines.upsert_batch_pipeline_documents(pipeline_id, request=cloud_documents)"
]
},
{
"cell_type": "markdown",
"id": "b4a0e171-8834-4a46-aee5-4d7551edac92",
"metadata": {},
"source": [
"#### Inserting Document Objects through the Framework Integration\n",
"\n",
"You can also do `index.insert` to directly upload document objects using the types defined by the framework."
]
},
{
"cell_type": "code",
"execution_count": 40,
"id": "346dde6d-3c6f-4425-8be4-a3d06baca325",
"metadata": {},
"outputs": [],
"source": [
"# NOTE: the llamaparsed document is already in the right representation\n",
"from llama_index.core.schema import Document\n",
"\n",
"for doc in documents:\n",
" index.insert(doc)"
]
},
{
"cell_type": "markdown",
"id": "ca62a2da-f1a8-486d-8c62-775844369142",
"metadata": {},
"source": [
"#### Validating the Documents\n",
"\n",
"After the documents have been inserted, we can validate that they exist in the pipeline."
]
},
{
"cell_type": "code",
"execution_count": 41,
"id": "dd0cfcd8-2e77-4e75-ba5c-7cd629706aba",
"metadata": {},
"outputs": [
"cells": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1\n"
]
}
],
"source": [
"pipeline_docs = client.pipelines.list_pipeline_documents(pipeline_id)\n",
"\n",
"print(len(pipeline_docs))"
]
},
{
"cell_type": "markdown",
"id": "5a951e23-8608-4082-913b-d7567de845c1",
"metadata": {},
"source": [
"#### Deleting the Documents\n",
"\n",
"If you want to reset, you can use the client SDK to delete pipeline documents."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c1082187-6218-4d7b-9244-982ba309f532",
"metadata": {},
"outputs": [],
"source": [
"pipeline_docs = client.pipelines.list_pipeline_documents(pipeline_id)\n",
"for doc in pipeline_docs:\n",
" client.pipelines.delete_pipeline_document(pipeline_id, doc.id)\n",
"client.pipelines.sync_pipeline(pipeline_id)"
]
},
{
"cell_type": "markdown",
"id": "1b5cf563-d119-40da-977d-9557e41328c7",
"metadata": {},
"source": [
"## Test RAG\n",
"\n",
"Let's create a sample RAG pipeline through the Python framework, through our `LlamaCloudIndex` (you can also run our lower-level search API through `client.pipelines.run_search`)."
]
},
{
"cell_type": "code",
"execution_count": 42,
"id": "23ccb658-6620-49f6-a363-5f608733b2f5",
"metadata": {},
"outputs": [
"cell_type": "markdown",
"metadata": {},
"source": [
"# ⚠️ Important Notice\n\nThis notebook (and repository) is deprecated.\n\nFor the latest python examples, please refer to the `llama-cloud-services` repository examples: \nhttps://github.com/run-llama/llama_cloud_services/tree/main/examples\n\n---"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"The intern test for evaluating generations involves assessing whether an average college student in the relevant major could successfully complete the task given the same input and context as the language model. If the student could succeed, the task is considered feasible for the model. If not, the context may need to be enriched or the task simplified. If the task is too complex even after improvements, it may be beyond the capabilities of contemporary language models.\n"
]
}
],
"source": [
"from llama_index.core.query_engine import RetrieverQueryEngine\n",
"from llama_index.llms.openai import OpenAI\n",
"\n",
"llm = OpenAI(model=\"gpt-4o\")\n",
"retriever = index.as_retriever(rerank_top_n=5)\n",
"query_engine = RetrieverQueryEngine.from_args(\n",
" retriever,\n",
" llm=llm\n",
")\n",
"response = query_engine.query(\"What is the intern test for evaluating generations?\")\n",
"print(str(response))"
]
},
{
"cell_type": "code",
"execution_count": 43,
"id": "5c2f771a-f213-46fd-a23e-37695e0df0c9",
"metadata": {},
"outputs": [
"cell_type": "markdown",
"id": "8c1348d3-4c0e-450f-8faf-19503f61b7b2",
"metadata": {},
"source": [
"# LlamaCloud Client SDK: Integrating LlamaHub Loaders\n",
"\n",
"<a href=\"https://colab.research.google.com/github/run-llama/llamacloud-demo/blob/main/examples/client_sdk/llamahub_doc.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
"\n",
"This tutorial shows you how to use LlamaHub loaders to insert documents with LlamaCloud, with the help of the lower-level LlamaCloud Client SDK.\n",
"\n",
"In this example, we use the [Firecrawl web page reader](https://www.firecrawl.dev/) in order to load a web page document, and feed it into our LlamaCloud pipeline."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"--------- SOURCE NODE 0 --------\n",
"Also consider checks to ensure that word, item, or sentence counts lie within a range. For other kinds of generation, assertions can look different. [Execution-evaluation](https://www.semanticscholar.org/paper/Execution-Based-Evaluation-for-Open-Domain-Code-Wang-Zhou/1bed34f2c23b97fd18de359cf62cd92b3ba612c3)\n",
" is a powerful method for **evaluating** code-generation, wherein you run the generated code and determine that the state of runtime is sufficient for the user-request.\n",
"\n",
"As an example, if the user asks for a new function named foo; then after executing the agents generated code, foo should be callable! One challenge in execution-evaluation is that the agent code frequently leaves the runtime in slightly different form than the target code. It can be effective to “relax” assertions to the absolute most weak assumptions that any viable answer would satisfy.\n",
"\n",
"Finally, using your product as intended for customers (i.e., “dogfooding”) can provide insight into failure modes on real-worl....\n",
"--------- SOURCE NODE 1 --------\n",
"![](https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/05/Picture1.png)\n",
"\n",
"LLM-as-Judge is not a silver bullet though. There are subtle aspects of language where even the strongest models fail to evaluate reliably. In addition, weve found that [conventional classifiers](https://eugeneyan.com/writing/finetuning/)\n",
" and reward models can achieve higher accuracy than LLM-as-Judge, and with lower cost and latency. For code generation, LLM-as-Judge can be weaker than more direct evaluation strategies like execution-evaluation.\n",
"\n",
" **The “**intern** **test**” for **evaluating** generations**\n",
"\n",
"We like to use the following “**intern** **test**” when **evaluating** generations: If you took the exact input to the language model, including the context, and gave it to an average college student in the relevant major as a task, could they succeed? How long would it take?\n",
"\n",
"If the answer is no because the LLM lacks the required knowledge, consider ways to enrich the context.\n",
"\n",
"If the answer is ....\n",
"--------- SOURCE NODE 2 --------\n",
"When a request comes in, we can check to see if a summary already exists in the cache. If so, we can return it immediately; if not, we generate, guardrail, and serve it, and then store it in the cache for future requests.\n",
"\n",
"For more open-ended queries, we can borrow techniques from the field of search, which also leverages caching for open-ended inputs. Features like autocomplete and spelling correction also help normalize user input and thus increase the cache hit rate.\n",
"\n",
" **When to fine-tune**\n",
"\n",
"We may have some tasks where even the most cleverly designed prompts fall short. For example, even after significant prompt engineering, our system may still be a ways from returning reliable, high-quality output. If so, then it may be necessary to finetune a model for your specific task.\n",
"\n",
"Successful examples include:\n",
"\n",
"* [Honeycombs Natural Language Query Assistant](https://www.honeycomb.io/blog/introducing-query-assistant)\n",
" : Initially, the “programming manual” was provided in the prompt ....\n",
"--------- SOURCE NODE 3 --------\n",
"[Skip to main content](#maincontent)\n",
"\n",
"[![O'Reilly home](https://cdn.oreillystatic.com/images/sitewide-headers/oreilly_logo_mark_red_@2x.png)](https://www.oreilly.com/ \"home page\")\n",
"\n",
"* * [Sign In](https://www.oreilly.com/member/login/)\n",
" \n",
" * [Try Now](https://oreilly.com/online-learning/try-now.html)\n",
" \n",
"* * [](https://www.oreilly.com/online-learning/teams.html)\n",
" [](https://www.oreilly.com/online-learning/teams.html)\n",
" [Teams](https://www.oreilly.com/online-learning/teams.html)\n",
" * [For business](https://www.oreilly.com/online-learning/teams.html)\n",
" \n",
" * [For government](https://www.oreilly.com/online-learning/government.html)\n",
" \n",
" * [For higher ed](https://www.oreilly.com/online-learning/academic.html)\n",
" \n",
" * [](https://www.oreilly.com/online-learning/individuals.html)\n",
" [](https://www.oreilly.com/online-learning/individuals.html)\n",
" [Individuals](https://www.oreilly.com/online-lear....\n",
"--------- SOURCE NODE 4 --------\n",
"In the first step, given a high-level goal or prompt, the agent generates a plan. Then, the plan is executed deterministically. This allows each step to be more predictable and reliable. Benefits include:\n",
"\n",
"* Generated plans can serve as few-shot samples to prompt or finetune an agent.\n",
"* Deterministic execution makes the system more reliable, and thus easier to **test** and debug. Furthermore, failures can be traced to the specific steps in the plan.\n",
"* Generated plans can be represented as directed acyclic graphs (DAGs) which are easier, relative to a static prompt, to understand and adapt to new situations.\n",
"\n",
"The most successful agent builders may be those with strong experience managing junior engineers because the process of generating plans is similar to how we instruct and manage juniors. We give juniors clear goals and concrete plans, instead of vague open-ended directions, and we should do the same for our agents too.\n",
"\n",
"In the end, the key to reliable, working agents will lik....\n"
]
"cell_type": "code",
"execution_count": 1,
"id": "e83a35ec-8e6c-475c-827c-20f46c4a3c43",
"metadata": {},
"outputs": [],
"source": [
"import nest_asyncio\n",
"nest_asyncio.apply()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f47019e7-5bf8-49ee-8ab1-1f1ae1ae55e4",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"!pip install llama-index\n",
"!pip install llama-cloud\n",
"!pip install llama-index-readers-web firecrawl-py"
]
},
{
"cell_type": "markdown",
"id": "57082f55-66e0-44e1-8072-2450405c21d1",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"Here we setup our environment variables, client, and load data using the Firecrawl loader available on LlamaHub.\n",
"\n",
"The Firecrawl loader is available as part of our `llama-index-readers-web` package."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "67a907cb-a727-4c12-86c9-ca2c55d73a59",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"LLAMA_CLOUD_BASE_URL\"] = \"https://api.cloud.llamaindex.ai\""
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "ddd884dc-8da5-498d-a186-d08d30e8478e",
"metadata": {},
"outputs": [],
"source": [
"os.environ[\"LLAMA_CLOUD_API_KEY\"] = \"<LLAMA_CLOUD_API_KEY>\"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"<OPENAI_API_KEY>\""
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "2235076d-3032-41ee-b936-f8a0081ec5af",
"metadata": {},
"outputs": [],
"source": [
"FIRECRAWL_API_KEY = \"<FIRECRAWL_API_KEY>\""
]
},
{
"cell_type": "markdown",
"id": "f72f00a3-cace-4975-bc73-13280d0f5d32",
"metadata": {},
"source": [
"#### Load Data"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "6590c2eb-4a0d-4a5c-8dd9-092ed43d2d48",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.readers.web import FireCrawlWebReader\n",
"\n",
"# using firecrawl to crawl a website\n",
"firecrawl_reader = FireCrawlWebReader(\n",
" api_key=FIRECRAWL_API_KEY,\n",
" mode=\"scrape\", # Choose between \"crawl\" and \"scrape\" for single page scraping\n",
" # params={\"additional\": \"parameters\"}, # Optional additional parameters\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 36,
"id": "b59927cc-c43b-4c40-9840-586b3fce39e4",
"metadata": {},
"outputs": [],
"source": [
"# Load documents from a single page URL\n",
"documents = firecrawl_reader.load_data(url=\"https://www.oreilly.com/radar/what-we-learned-from-a-year-of-building-with-llms-part-i/\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "83cdb3e5-902b-4d79-930b-df2063c3a39b",
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"print(documents[0].get_content())"
]
},
{
"cell_type": "markdown",
"id": "5d91a633-34c7-4c5b-b6cb-de37e80ace68",
"metadata": {},
"source": [
"#### Setup Index\n",
"\n",
"Please setup an empty index. You can either do this through the UI or [programmatically](https://docs.cloud.llamaindex.ai/llamacloud/guides/framework_integration).\n",
"\n",
"After you've done so, make sure to note down the pipeline_id, pipeline_name, project_id, and project_name in the variables below. You'll need these later! "
]
},
{
"cell_type": "code",
"execution_count": 38,
"id": "04e80c0d-d58d-4fce-aa78-9247b9840738",
"metadata": {},
"outputs": [],
"source": [
"pipeline_id = \"<pipeline_id>\"\n",
"pipeline_name = \"<pipeline_name>\"\n",
"project_id = \"<project_id>\"\n",
"project_name = \"<project_name>\""
]
},
{
"cell_type": "markdown",
"id": "55afeb53",
"metadata": {},
"source": [
"#### Setup LlamaCloud Client SDK and Framework Client\n",
"\n",
"Here we define both the client (giving us access to low-level client operations) as well as the `LlamaCloudIndex` defined through the framework."
]
},
{
"cell_type": "code",
"execution_count": 39,
"id": "d4c36489",
"metadata": {},
"outputs": [],
"source": [
"from llama_cloud.client import LlamaCloud\n",
"\n",
"client = LlamaCloud(\n",
" token=os.environ[\"LLAMA_CLOUD_API_KEY\"],\n",
" base_url=os.environ[\"LLAMA_CLOUD_BASE_URL\"]\n",
")\n",
"\n",
"from llama_index.indices.managed.llama_cloud import LlamaCloudIndex\n",
"import os\n",
"\n",
"index = LlamaCloudIndex(\n",
" name=pipeline_name, \n",
" project_name=project_name,\n",
" api_key=os.getenv(\"LLAMA_CLOUD_API_KEY\")\n",
")"
]
},
{
"cell_type": "markdown",
"id": "ce9e45c5-4255-4dfa-ae3b-f33aac640df5",
"metadata": {},
"source": [
"## Inserting Documents\n",
"\n",
"Now let's create the custom Document objects. We assume that your pipeline has been created in the last section. Copy the pipeline and project ids into the box below.\n",
"\n",
"We insert one document containing the parsed document text, and another document as a toy example."
]
},
{
"cell_type": "markdown",
"id": "d3b7171d-eb65-479b-8bab-a38e7f5ef221",
"metadata": {},
"source": [
"#### Inserting Document Objects through the Client SDK"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "adfde1cf-1941-4bf6-ad0b-903ebd298ae0",
"metadata": {},
"outputs": [],
"source": [
"cloud_documents = [d.to_cloud_document() for d in documents]\n",
"upserted_docs = client.pipelines.upsert_batch_pipeline_documents(pipeline_id, request=cloud_documents)"
]
},
{
"cell_type": "markdown",
"id": "b4a0e171-8834-4a46-aee5-4d7551edac92",
"metadata": {},
"source": [
"#### Inserting Document Objects through the Framework Integration\n",
"\n",
"You can also do `index.insert` to directly upload document objects using the types defined by the framework."
]
},
{
"cell_type": "code",
"execution_count": 40,
"id": "346dde6d-3c6f-4425-8be4-a3d06baca325",
"metadata": {},
"outputs": [],
"source": [
"# NOTE: the llamaparsed document is already in the right representation\n",
"from llama_index.core.schema import Document\n",
"\n",
"for doc in documents:\n",
" index.insert(doc)"
]
},
{
"cell_type": "markdown",
"id": "ca62a2da-f1a8-486d-8c62-775844369142",
"metadata": {},
"source": [
"#### Validating the Documents\n",
"\n",
"After the documents have been inserted, we can validate that they exist in the pipeline."
]
},
{
"cell_type": "code",
"execution_count": 41,
"id": "dd0cfcd8-2e77-4e75-ba5c-7cd629706aba",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"1\n"
]
}
],
"source": [
"pipeline_docs = client.pipelines.list_pipeline_documents(pipeline_id)\n",
"\n",
"print(len(pipeline_docs))"
]
},
{
"cell_type": "markdown",
"id": "5a951e23-8608-4082-913b-d7567de845c1",
"metadata": {},
"source": [
"#### Deleting the Documents\n",
"\n",
"If you want to reset, you can use the client SDK to delete pipeline documents."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c1082187-6218-4d7b-9244-982ba309f532",
"metadata": {},
"outputs": [],
"source": [
"pipeline_docs = client.pipelines.list_pipeline_documents(pipeline_id)\n",
"for doc in pipeline_docs:\n",
" client.pipelines.delete_pipeline_document(pipeline_id, doc.id)\n",
"client.pipelines.sync_pipeline(pipeline_id)"
]
},
{
"cell_type": "markdown",
"id": "1b5cf563-d119-40da-977d-9557e41328c7",
"metadata": {},
"source": [
"## Test RAG\n",
"\n",
"Let's create a sample RAG pipeline through the Python framework, through our `LlamaCloudIndex` (you can also run our lower-level search API through `client.pipelines.run_search`)."
]
},
{
"cell_type": "code",
"execution_count": 42,
"id": "23ccb658-6620-49f6-a363-5f608733b2f5",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The intern test for evaluating generations involves assessing whether an average college student in the relevant major could successfully complete the task given the same input and context as the language model. If the student could succeed, the task is considered feasible for the model. If not, the context may need to be enriched or the task simplified. If the task is too complex even after improvements, it may be beyond the capabilities of contemporary language models.\n"
]
}
],
"source": [
"from llama_index.core.query_engine import RetrieverQueryEngine\n",
"from llama_index.llms.openai import OpenAI\n",
"\n",
"llm = OpenAI(model=\"gpt-4o\")\n",
"retriever = index.as_retriever(rerank_top_n=5)\n",
"query_engine = RetrieverQueryEngine.from_args(\n",
" retriever,\n",
" llm=llm\n",
")\n",
"response = query_engine.query(\"What is the intern test for evaluating generations?\")\n",
"print(str(response))"
]
},
{
"cell_type": "code",
"execution_count": 43,
"id": "5c2f771a-f213-46fd-a23e-37695e0df0c9",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"--------- SOURCE NODE 0 --------\n",
"Also consider checks to ensure that word, item, or sentence counts lie within a range. For other kinds of generation, assertions can look different. [Execution-evaluation](https://www.semanticscholar.org/paper/Execution-Based-Evaluation-for-Open-Domain-Code-Wang-Zhou/1bed34f2c23b97fd18de359cf62cd92b3ba612c3)\n",
" is a powerful method for **evaluating** code-generation, wherein you run the generated code and determine that the state of runtime is sufficient for the user-request.\n",
"\n",
"As an example, if the user asks for a new function named foo; then after executing the agents generated code, foo should be callable! One challenge in execution-evaluation is that the agent code frequently leaves the runtime in slightly different form than the target code. It can be effective to “relax” assertions to the absolute most weak assumptions that any viable answer would satisfy.\n",
"\n",
"Finally, using your product as intended for customers (i.e., “dogfooding”) can provide insight into failure modes on real-worl....\n",
"--------- SOURCE NODE 1 --------\n",
"![](https://www.oreilly.com/radar/wp-content/uploads/sites/3/2024/05/Picture1.png)\n",
"\n",
"LLM-as-Judge is not a silver bullet though. There are subtle aspects of language where even the strongest models fail to evaluate reliably. In addition, weve found that [conventional classifiers](https://eugeneyan.com/writing/finetuning/)\n",
" and reward models can achieve higher accuracy than LLM-as-Judge, and with lower cost and latency. For code generation, LLM-as-Judge can be weaker than more direct evaluation strategies like execution-evaluation.\n",
"\n",
" **The “**intern** **test**” for **evaluating** generations**\n",
"\n",
"We like to use the following “**intern** **test**” when **evaluating** generations: If you took the exact input to the language model, including the context, and gave it to an average college student in the relevant major as a task, could they succeed? How long would it take?\n",
"\n",
"If the answer is no because the LLM lacks the required knowledge, consider ways to enrich the context.\n",
"\n",
"If the answer is ....\n",
"--------- SOURCE NODE 2 --------\n",
"When a request comes in, we can check to see if a summary already exists in the cache. If so, we can return it immediately; if not, we generate, guardrail, and serve it, and then store it in the cache for future requests.\n",
"\n",
"For more open-ended queries, we can borrow techniques from the field of search, which also leverages caching for open-ended inputs. Features like autocomplete and spelling correction also help normalize user input and thus increase the cache hit rate.\n",
"\n",
" **When to fine-tune**\n",
"\n",
"We may have some tasks where even the most cleverly designed prompts fall short. For example, even after significant prompt engineering, our system may still be a ways from returning reliable, high-quality output. If so, then it may be necessary to finetune a model for your specific task.\n",
"\n",
"Successful examples include:\n",
"\n",
"* [Honeycombs Natural Language Query Assistant](https://www.honeycomb.io/blog/introducing-query-assistant)\n",
" : Initially, the “programming manual” was provided in the prompt ....\n",
"--------- SOURCE NODE 3 --------\n",
"[Skip to main content](#maincontent)\n",
"\n",
"[![O'Reilly home](https://cdn.oreillystatic.com/images/sitewide-headers/oreilly_logo_mark_red_@2x.png)](https://www.oreilly.com/ \"home page\")\n",
"\n",
"* * [Sign In](https://www.oreilly.com/member/login/)\n",
" \n",
" * [Try Now](https://oreilly.com/online-learning/try-now.html)\n",
" \n",
"* * [](https://www.oreilly.com/online-learning/teams.html)\n",
" [](https://www.oreilly.com/online-learning/teams.html)\n",
" [Teams](https://www.oreilly.com/online-learning/teams.html)\n",
" * [For business](https://www.oreilly.com/online-learning/teams.html)\n",
" \n",
" * [For government](https://www.oreilly.com/online-learning/government.html)\n",
" \n",
" * [For higher ed](https://www.oreilly.com/online-learning/academic.html)\n",
" \n",
" * [](https://www.oreilly.com/online-learning/individuals.html)\n",
" [](https://www.oreilly.com/online-learning/individuals.html)\n",
" [Individuals](https://www.oreilly.com/online-lear....\n",
"--------- SOURCE NODE 4 --------\n",
"In the first step, given a high-level goal or prompt, the agent generates a plan. Then, the plan is executed deterministically. This allows each step to be more predictable and reliable. Benefits include:\n",
"\n",
"* Generated plans can serve as few-shot samples to prompt or finetune an agent.\n",
"* Deterministic execution makes the system more reliable, and thus easier to **test** and debug. Furthermore, failures can be traced to the specific steps in the plan.\n",
"* Generated plans can be represented as directed acyclic graphs (DAGs) which are easier, relative to a static prompt, to understand and adapt to new situations.\n",
"\n",
"The most successful agent builders may be those with strong experience managing junior engineers because the process of generating plans is similar to how we instruct and manage juniors. We give juniors clear goals and concrete plans, instead of vague open-ended directions, and we should do the same for our agents too.\n",
"\n",
"In the end, the key to reliable, working agents will lik....\n"
]
}
],
"source": [
"# view sources\n",
"for idx, n in enumerate(response.source_nodes):\n",
" print(f\"--------- SOURCE NODE {idx} --------\")\n",
" print(n.get_content()[:1000] + \"....\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5fabe94c-ae69-4c33-a11f-edc7f4015c91",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "llama_index_v3",
"language": "python",
"name": "llama_index_v3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.8"
}
],
"source": [
"# view sources\n",
"for idx, n in enumerate(response.source_nodes):\n",
" print(f\"--------- SOURCE NODE {idx} --------\")\n",
" print(n.get_content()[:1000] + \"....\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5fabe94c-ae69-4c33-a11f-edc7f4015c91",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "llama_index_v3",
"language": "python",
"name": "llama_index_v3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.8"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
"nbformat": 4,
"nbformat_minor": 5
}
File diff suppressed because it is too large
@@ -1,445 +1,452 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Access Control demo with LlamaCloud"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"Install core packages, download files. You will need to upload these documents to LlamaCloud."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install llama-index\n",
"!pip install llama-index-core\n",
"!pip install llama-index-embeddings-openai\n",
"!pip install llama-index-question-gen-openai\n",
"!pip install llama-index-postprocessor-flag-embedding-reranker\n",
"!pip install git+https://github.com/FlagOpen/FlagEmbedding.git\n",
"!pip install llama-parse"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Some OpenAI and LlamaParse details. The OpenAI LLM is used for response synthesis."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# llama-parse is async-first, running the async code in a notebook requires the use of nest_asyncio\n",
"import nest_asyncio\n",
"nest_asyncio.apply()"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"# API access to llama-cloud\n",
"os.environ[\"LLAMA_CLOUD_API_KEY\"] = \"\""
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"# Using OpenAI API for embeddings/llms\n",
"os.environ[\"OPENAI_API_KEY\"] = \"\""
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"from llama_cloud.client import LlamaCloud\n",
"\n",
"client = LlamaCloud(token=os.environ[\"LLAMA_CLOUD_API_KEY\"])"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": []
},
"source": [
"## Setup Sharepoint Programatically\n",
"Please for more details about how to setup the permissions check our docuentation [here](https://docs.cloud.llamaindex.ai/llamacloud/integrations/data_sources/sharepoint)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create Data Source"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"from llama_cloud.types import CloudSharepointDataSource\n",
"\n",
"ds = {\n",
" 'name': '<your-data-source-name>',\n",
" 'source_type': 'MICROSOFT_SHAREPOINT', \n",
" 'component': CloudSharepointDataSource(\n",
" site_name='<site_name>',\n",
" folder_path='<folder_path>', # optional\n",
" client_id='<client_id>',\n",
" client_secret='<client_secret>',\n",
" tenant_id='<tenant_id>',\n",
" )\n",
"}\n",
"\n",
"data_source = client.data_sources.create_data_source(request=ds)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Setup Transformations/Embeddings Configs"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"# Embedding config\n",
"embedding_config = {\n",
" 'type': 'OPENAI_EMBEDDING',\n",
" 'component': {\n",
" 'api_key': os.environ[\"OPENAI_API_KEY\"], # editable\n",
" 'model_name': 'text-embedding-ada-002' # editable\n",
" }\n",
"}\n",
"\n",
"# Transformation auto config\n",
"transform_config = {\n",
" 'mode': 'auto',\n",
" 'config': {\n",
" 'chunk_size': 1024, # editable\n",
" 'chunk_overlap': 20 # editable\n",
" }\n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create Pipeline"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"pipeline = {\n",
" 'name': 'test-pipeline',\n",
" 'embedding_config': embedding_config,\n",
" 'transform_config': transform_config,\n",
"}\n",
"\n",
"pipeline = client.pipelines.upsert_pipeline(request=pipeline)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Add Sharepoint Data Source to Pipeline"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
"data_sources = [\n",
" {\n",
" 'data_source_id': data_source.id,\n",
" 'sync_interval': 43200.0 # Optional, scheduled sync frequency in seconds. In this case, every 12 hours.\n",
" }\n",
"]\n",
"\n",
"pipeline_data_sources = client.pipelines.add_data_sources_to_pipeline(pipeline.id, request=data_sources)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Sync Pipeline\n",
"This triggers the data-source run\n"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
"cells": [
{
"data": {
"text/plain": [
"Pipeline(id='477e2c1b-6a31-4f95-8b8c-f86a736cef30', created_at=datetime.datetime(2025, 2, 6, 1, 10, 43, 365914, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2025, 2, 6, 1, 10, 43, 365914, tzinfo=datetime.timezone.utc), name='test-pipeline', project_id='177261ea-1af1-4a47-9029-15ed38c0cea8', embedding_model_config_id=None, pipeline_type=<PipelineType.MANAGED: 'MANAGED'>, managed_pipeline_id=None, embedding_config=PipelineEmbeddingConfig_OpenaiEmbedding(component=OpenAiEmbedding(model_name='text-embedding-ada-002', embed_batch_size=10, num_workers=None, additional_kwargs={}, api_key='********MUwA', api_base='https://api.openai.com/v1', api_version='', max_retries=10, timeout=60.0, default_headers=None, reuse_client=True, dimensions=None, class_name='OpenAIEmbedding'), type='OPENAI_EMBEDDING'), configured_transformations=[], config_hash=PipelineConfigurationHashes(embedding_config_hash='e23a92ac03041136e80e731de4f1f15b4aeb5a5f24c3c2ea4d', parsing_config_hash='e0d357e5d44a8e1364c6f6c6f38c261c676e7d1289157ee273', transform_config_hash='84032cd089f78343a74e25f97a30c5f7419a67f43d65ada2f1'), transform_config=PipelineTransformConfig_Auto(chunk_size=1024, chunk_overlap=200, mode='auto'), preset_retrieval_parameters=PresetRetrievalParams(dense_similarity_top_k=30, dense_similarity_cutoff=0.0, sparse_similarity_top_k=30, enable_reranking=True, rerank_top_n=6, alpha=0.5, search_filters=None, files_top_k=1, retrieval_mode=<RetrievalMode.CHUNKS: 'chunks'>, retrieve_image_nodes=False, class_name='base_component'), eval_parameters=EvalExecutionParams(llm_model=<SupportedLlmModelNames.GPT_4_O: 'GPT_4O'>, qa_prompt_tmpl='Context information is below.\\n---------------------\\n{context_str}\\n---------------------\\nGiven the context information and not prior knowledge, answer the query.\\nQuery: {query_str}\\nAnswer: '), llama_parse_parameters=LlamaParseParameters(languages=[<ParserLanguages.EN: 'en'>], parsing_instruction='', disable_ocr=False, annotate_links=False, 
disable_reconstruction=False, disable_image_extraction=False, invalidate_cache=False, output_pdf_of_document=False, do_not_cache=False, fast_mode=False, skip_diagonal_text=False, gpt_4_o_mode=False, gpt_4_o_api_key='', do_not_unroll_columns=False, extract_layout=False, html_make_all_elements_visible=False, html_remove_navigation_elements=False, html_remove_fixed_elements=False, guess_xlsx_sheet_name=False, page_separator=None, bounding_box='', bbox_top=None, bbox_right=None, bbox_bottom=None, bbox_left=None, target_pages='', use_vendor_multimodal_model=False, vendor_multimodal_model_name='', vendor_multimodal_api_key='', page_prefix='', page_suffix='', webhook_url='', take_screenshot=False, is_formatting_instruction=True, premium_mode=False, continuous_mode=False, s_3_input_path='', input_s_3_region='', s_3_output_path_prefix='', output_s_3_region='', project_id=None, azure_openai_deployment_name=None, azure_openai_endpoint=None, azure_openai_api_version=None, azure_openai_key=None, input_url=None, http_proxy=None, auto_mode=False, auto_mode_trigger_on_regexp_in_page=None, auto_mode_trigger_on_text_in_page=None, auto_mode_trigger_on_table_in_page=False, auto_mode_trigger_on_image_in_page=False, structured_output=False, structured_output_json_schema=None, structured_output_json_schema_name=None, max_pages=None, max_pages_enforced=None, extract_charts=False, formatting_instruction=None, complemental_formatting_instruction=None, content_guideline_instruction=None, spreadsheet_extract_sub_tables=False, job_timeout_in_seconds=None, job_timeout_extra_time_per_page_in_seconds=None, strict_mode_image_extraction=False, strict_mode_image_ocr=False, strict_mode_reconstruction=False, strict_mode_buggy_font=False, ignore_document_elements_for_layout_detection=False, output_tables_as_html=False, internal_is_screenshot_job=False), data_sink=None)"
"cell_type": "markdown",
"metadata": {},
"source": [
"# ⚠️ Important Notice\n\nThis notebook (and repository) is deprecated.\n\nFor the latest python examples, please refer to the `llama-cloud-services` repository examples: \nhttps://github.com/run-llama/llama_cloud_services/tree/main/examples\n\n---"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"client.pipelines.sync_pipeline(pipeline.id)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Define LlamaCloud Chunk Retriever over Documents\n",
"\n",
"In this section we define a chunk-level LlamaCloud Retriever over these documents.\n",
"\n",
"The chunk-level LlamaCloud retriever is our default retriever that returns chunks via hybrid search + reranking."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from llama_index.indices.managed.llama_cloud import LlamaCloudIndex\n",
"import os\n",
"\n",
"index = LlamaCloudIndex(\n",
" name=pipeline.name,\n",
" project_id=pipeline.project_id,\n",
" api_key=os.environ[\"LLAMA_CLOUD_API_KEY\"],\n",
" organization_id=\"04db4a56-04e3-43c5-aef5-0f39f1653dc8\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Define chunk retriever\n",
"\n",
"The chunk-level retriever does vector search with a final reranked set of `rerank_top_n=5`."
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [],
"source": [
"from llama_index.core.vector_stores import (\n",
" MetadataFilter,\n",
" MetadataFilters,\n",
" FilterOperator,\n",
")\n",
"\n",
"from llama_index.llms.openai import OpenAI\n",
"\n",
"from llama_index.core.query_engine import RetrieverQueryEngine\n",
"\n",
"# resolver user id/group id through sharepoint/db here \n",
"# obs: you can also filter by the groups of the user\n",
"FILTER_BY_USER_ID = \"11\" # editable\n",
"\n",
"filters = MetadataFilters(\n",
" filters=[\n",
" MetadataFilter(\n",
" key=\"allowed_siteUser_ids\", operator=FilterOperator.IN, value=[FILTER_BY_USER_ID]\n",
" ),\n",
" ]\n",
")\n",
"\n",
"chunk_retriever = index.as_retriever(\n",
" retrieval_mode=\"chunks\",\n",
" rerank_top_n=5,\n",
" filters=filters\n",
")\n",
"\n",
"llm = OpenAI(model=\"gpt-4o-mini\")\n",
"query_engine_chunk = RetrieverQueryEngine.from_args(\n",
" chunk_retriever, \n",
" llm=llm,\n",
" response_mode=\"tree_summarize\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Build an Agent\n",
"\n",
"In this section we build an agent that takes in both file-level and chunk-level query engines as tools. It decides which query engine to call depending on the nature of this question."
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [],
"source": [
"from llama_index.core.tools import FunctionTool, ToolMetadata, QueryEngineTool\n",
"\n",
"\n",
"# this variable tells the agent specific properties about your document.\n",
"doc_metadata_extra_str = \"\"\"\\\n",
"Each document represents a complete 10K report for a given year (e.g. Apple in 2019). \n",
"Here's an example of relevant documents:\n",
"1. apple_2019.pdf\n",
"\"\"\"\n",
"\n",
"tool_chunk_description = f\"\"\"\\\n",
"Synthesizes an answer to your question by feeding in a relevant chunk as context. Best used for questions that are more pointed in nature.\n",
"Do NOT use if the question asks seems to require a general summary of any given document. Use the doc_query_engine instead for that purpose.\n",
"\n",
"Below we give details on the format of each document:\n",
"{doc_metadata_extra_str}\n",
"\"\"\"\n",
"\n",
"tool_chunk = QueryEngineTool(\n",
" query_engine=query_engine_chunk,\n",
" metadata=ToolMetadata(\n",
" name=\"chunk_query_engine\",\n",
" description=tool_chunk_description\n",
" ),\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from llama_index.core.agent import FunctionCallingAgentWorker\n",
"from llama_index.core.agent import AgentRunner\n",
"from llama_index.llms.openai import OpenAI\n",
"\n",
"llm_agent = OpenAI(model=\"gpt-4o\")\n",
"agent = FunctionCallingAgentWorker.from_tools(\n",
" [tool_chunk], llm=llm_agent, verbose=True\n",
").as_agent()"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Added user message to memory: Tell me the revenue for Apple in 2019?\n",
"=== Calling Function ===\n",
"Calling function: chunk_query_engine with args: {\"input\": \"Apple revenue in 2019\"}\n",
"=== Function Output ===\n",
"Apple's total net sales in 2019 amounted to $260.174 billion, which represented a 2% decrease compared to 2018. The revenue was primarily driven by various product categories, including iPhone, Mac, iPad, Wearables, Home and Accessories, and Services.\n",
"=== LLM Response ===\n",
"Apple's total net sales in 2019 amounted to $260.174 billion, which represented a 2% decrease compared to 2018.\n"
]
}
],
"source": [
"response = agent.chat(\"Tell me the revenue for Apple in 2019?\")"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
"cell_type": "markdown",
"metadata": {},
"source": [
"# Access Control demo with LlamaCloud"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Added user message to memory: Tell me the revenue for Apple in 2020?\n",
"=== Calling Function ===\n",
"Calling function: chunk_query_engine with args: {\"input\": \"Apple revenue in 2020\"}\n",
"=== Function Output ===\n",
"Apple's total net sales in 2020 amounted to $274.5 billion, reflecting a 6% increase compared to 2019. The revenue was primarily driven by higher sales in Services and Wearables, Home and Accessories.\n",
"=== LLM Response ===\n",
"Apple's total net sales in 2020 amounted to $274.5 billion, reflecting a 6% increase compared to 2019.\n"
]
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"Install core packages, download files. You will need to upload these documents to LlamaCloud."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install llama-index\n",
"!pip install llama-index-core\n",
"!pip install llama-index-embeddings-openai\n",
"!pip install llama-index-question-gen-openai\n",
"!pip install llama-index-postprocessor-flag-embedding-reranker\n",
"!pip install git+https://github.com/FlagOpen/FlagEmbedding.git\n",
"!pip install llama-parse"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Some OpenAI and LlamaParse details. The OpenAI LLM is used for response synthesis."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# llama-parse is async-first, running the async code in a notebook requires the use of nest_asyncio\n",
"import nest_asyncio\n",
"nest_asyncio.apply()"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"# API access to llama-cloud\n",
"os.environ[\"LLAMA_CLOUD_API_KEY\"] = \"\""
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"# Using OpenAI API for embeddings/llms\n",
"os.environ[\"OPENAI_API_KEY\"] = \"\""
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"from llama_cloud.client import LlamaCloud\n",
"\n",
"client = LlamaCloud(token=os.environ[\"LLAMA_CLOUD_API_KEY\"])"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": []
},
"source": [
"## Setup Sharepoint Programatically\n",
"Please for more details about how to setup the permissions check our docuentation [here](https://docs.cloud.llamaindex.ai/llamacloud/integrations/data_sources/sharepoint)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create Data Source"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"from llama_cloud.types import CloudSharepointDataSource\n",
"\n",
"ds = {\n",
" 'name': '<your-data-source-name>',\n",
" 'source_type': 'MICROSOFT_SHAREPOINT', \n",
" 'component': CloudSharepointDataSource(\n",
" site_name='<site_name>',\n",
" folder_path='<folder_path>', # optional\n",
" client_id='<client_id>',\n",
" client_secret='<client_secret>',\n",
" tenant_id='<tenant_id>',\n",
" )\n",
"}\n",
"\n",
"data_source = client.data_sources.create_data_source(request=ds)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Setup Transformations/Embeddings Configs"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"# Embedding config\n",
"embedding_config = {\n",
" 'type': 'OPENAI_EMBEDDING',\n",
" 'component': {\n",
" 'api_key': os.environ[\"OPENAI_API_KEY\"], # editable\n",
" 'model_name': 'text-embedding-ada-002' # editable\n",
" }\n",
"}\n",
"\n",
"# Transformation auto config\n",
"transform_config = {\n",
" 'mode': 'auto',\n",
" 'config': {\n",
" 'chunk_size': 1024, # editable\n",
" 'chunk_overlap': 20 # editable\n",
" }\n",
"}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create Pipeline"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"pipeline = {\n",
" 'name': 'test-pipeline',\n",
" 'embedding_config': embedding_config,\n",
" 'transform_config': transform_config,\n",
"}\n",
"\n",
"pipeline = client.pipelines.upsert_pipeline(request=pipeline)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Add Sharepoint Data Source to Pipeline"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
"data_sources = [\n",
" {\n",
" 'data_source_id': data_source.id,\n",
" 'sync_interval': 43200.0 # Optional, scheduled sync frequency in seconds. In this case, every 12 hours.\n",
" }\n",
"]\n",
"\n",
"pipeline_data_sources = client.pipelines.add_data_sources_to_pipeline(pipeline.id, request=data_sources)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Sync Pipeline\n",
"This triggers the data-source run\n"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Pipeline(id='477e2c1b-6a31-4f95-8b8c-f86a736cef30', created_at=datetime.datetime(2025, 2, 6, 1, 10, 43, 365914, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2025, 2, 6, 1, 10, 43, 365914, tzinfo=datetime.timezone.utc), name='test-pipeline', project_id='177261ea-1af1-4a47-9029-15ed38c0cea8', embedding_model_config_id=None, pipeline_type=<PipelineType.MANAGED: 'MANAGED'>, managed_pipeline_id=None, embedding_config=PipelineEmbeddingConfig_OpenaiEmbedding(component=OpenAiEmbedding(model_name='text-embedding-ada-002', embed_batch_size=10, num_workers=None, additional_kwargs={}, api_key='********MUwA', api_base='https://api.openai.com/v1', api_version='', max_retries=10, timeout=60.0, default_headers=None, reuse_client=True, dimensions=None, class_name='OpenAIEmbedding'), type='OPENAI_EMBEDDING'), configured_transformations=[], config_hash=PipelineConfigurationHashes(embedding_config_hash='e23a92ac03041136e80e731de4f1f15b4aeb5a5f24c3c2ea4d', parsing_config_hash='e0d357e5d44a8e1364c6f6c6f38c261c676e7d1289157ee273', transform_config_hash='84032cd089f78343a74e25f97a30c5f7419a67f43d65ada2f1'), transform_config=PipelineTransformConfig_Auto(chunk_size=1024, chunk_overlap=200, mode='auto'), preset_retrieval_parameters=PresetRetrievalParams(dense_similarity_top_k=30, dense_similarity_cutoff=0.0, sparse_similarity_top_k=30, enable_reranking=True, rerank_top_n=6, alpha=0.5, search_filters=None, files_top_k=1, retrieval_mode=<RetrievalMode.CHUNKS: 'chunks'>, retrieve_image_nodes=False, class_name='base_component'), eval_parameters=EvalExecutionParams(llm_model=<SupportedLlmModelNames.GPT_4_O: 'GPT_4O'>, qa_prompt_tmpl='Context information is below.\\n---------------------\\n{context_str}\\n---------------------\\nGiven the context information and not prior knowledge, answer the query.\\nQuery: {query_str}\\nAnswer: '), llama_parse_parameters=LlamaParseParameters(languages=[<ParserLanguages.EN: 'en'>], parsing_instruction='', disable_ocr=False, annotate_links=False, 
disable_reconstruction=False, disable_image_extraction=False, invalidate_cache=False, output_pdf_of_document=False, do_not_cache=False, fast_mode=False, skip_diagonal_text=False, gpt_4_o_mode=False, gpt_4_o_api_key='', do_not_unroll_columns=False, extract_layout=False, html_make_all_elements_visible=False, html_remove_navigation_elements=False, html_remove_fixed_elements=False, guess_xlsx_sheet_name=False, page_separator=None, bounding_box='', bbox_top=None, bbox_right=None, bbox_bottom=None, bbox_left=None, target_pages='', use_vendor_multimodal_model=False, vendor_multimodal_model_name='', vendor_multimodal_api_key='', page_prefix='', page_suffix='', webhook_url='', take_screenshot=False, is_formatting_instruction=True, premium_mode=False, continuous_mode=False, s_3_input_path='', input_s_3_region='', s_3_output_path_prefix='', output_s_3_region='', project_id=None, azure_openai_deployment_name=None, azure_openai_endpoint=None, azure_openai_api_version=None, azure_openai_key=None, input_url=None, http_proxy=None, auto_mode=False, auto_mode_trigger_on_regexp_in_page=None, auto_mode_trigger_on_text_in_page=None, auto_mode_trigger_on_table_in_page=False, auto_mode_trigger_on_image_in_page=False, structured_output=False, structured_output_json_schema=None, structured_output_json_schema_name=None, max_pages=None, max_pages_enforced=None, extract_charts=False, formatting_instruction=None, complemental_formatting_instruction=None, content_guideline_instruction=None, spreadsheet_extract_sub_tables=False, job_timeout_in_seconds=None, job_timeout_extra_time_per_page_in_seconds=None, strict_mode_image_extraction=False, strict_mode_image_ocr=False, strict_mode_reconstruction=False, strict_mode_buggy_font=False, ignore_document_elements_for_layout_detection=False, output_tables_as_html=False, internal_is_screenshot_job=False), data_sink=None)"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"client.pipelines.sync_pipeline(pipeline.id)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Define LlamaCloud Chunk Retriever over Documents\n",
"\n",
"In this section we define a chunk-level LlamaCloud Retriever over these documents.\n",
"\n",
"The chunk-level LlamaCloud retriever is our default retriever that returns chunks via hybrid search + reranking."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from llama_index.indices.managed.llama_cloud import LlamaCloudIndex\n",
"import os\n",
"\n",
"index = LlamaCloudIndex(\n",
" name=pipeline.name,\n",
" project_id=pipeline.project_id,\n",
" api_key=os.environ[\"LLAMA_CLOUD_API_KEY\"],\n",
" organization_id=\"04db4a56-04e3-43c5-aef5-0f39f1653dc8\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Define chunk retriever\n",
"\n",
"The chunk-level retriever does vector search with a final reranked set of `rerank_top_n=5`."
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [],
"source": [
"from llama_index.core.vector_stores import (\n",
" MetadataFilter,\n",
" MetadataFilters,\n",
" FilterOperator,\n",
")\n",
"\n",
"from llama_index.llms.openai import OpenAI\n",
"\n",
"from llama_index.core.query_engine import RetrieverQueryEngine\n",
"\n",
"# resolver user id/group id through sharepoint/db here \n",
"# obs: you can also filter by the groups of the user\n",
"FILTER_BY_USER_ID = \"11\" # editable\n",
"\n",
"filters = MetadataFilters(\n",
" filters=[\n",
" MetadataFilter(\n",
" key=\"allowed_siteUser_ids\", operator=FilterOperator.IN, value=[FILTER_BY_USER_ID]\n",
" ),\n",
" ]\n",
")\n",
"\n",
"chunk_retriever = index.as_retriever(\n",
" retrieval_mode=\"chunks\",\n",
" rerank_top_n=5,\n",
" filters=filters\n",
")\n",
"\n",
"llm = OpenAI(model=\"gpt-4o-mini\")\n",
"query_engine_chunk = RetrieverQueryEngine.from_args(\n",
" chunk_retriever, \n",
" llm=llm,\n",
" response_mode=\"tree_summarize\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Build an Agent\n",
"\n",
"In this section we build an agent that takes in both file-level and chunk-level query engines as tools. It decides which query engine to call depending on the nature of this question."
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [],
"source": [
"from llama_index.core.tools import FunctionTool, ToolMetadata, QueryEngineTool\n",
"\n",
"\n",
"# this variable tells the agent specific properties about your document.\n",
"doc_metadata_extra_str = \"\"\"\\\n",
"Each document represents a complete 10K report for a given year (e.g. Apple in 2019). \n",
"Here's an example of relevant documents:\n",
"1. apple_2019.pdf\n",
"\"\"\"\n",
"\n",
"tool_chunk_description = f\"\"\"\\\n",
"Synthesizes an answer to your question by feeding in a relevant chunk as context. Best used for questions that are more pointed in nature.\n",
"Do NOT use if the question asks seems to require a general summary of any given document. Use the doc_query_engine instead for that purpose.\n",
"\n",
"Below we give details on the format of each document:\n",
"{doc_metadata_extra_str}\n",
"\"\"\"\n",
"\n",
"tool_chunk = QueryEngineTool(\n",
" query_engine=query_engine_chunk,\n",
" metadata=ToolMetadata(\n",
" name=\"chunk_query_engine\",\n",
" description=tool_chunk_description\n",
" ),\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from llama_index.core.agent import FunctionCallingAgentWorker\n",
"from llama_index.core.agent import AgentRunner\n",
"from llama_index.llms.openai import OpenAI\n",
"\n",
"llm_agent = OpenAI(model=\"gpt-4o\")\n",
"agent = FunctionCallingAgentWorker.from_tools(\n",
" [tool_chunk], llm=llm_agent, verbose=True\n",
").as_agent()"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Added user message to memory: Tell me the revenue for Apple in 2019?\n",
"=== Calling Function ===\n",
"Calling function: chunk_query_engine with args: {\"input\": \"Apple revenue in 2019\"}\n",
"=== Function Output ===\n",
"Apple's total net sales in 2019 amounted to $260.174 billion, which represented a 2% decrease compared to 2018. The revenue was primarily driven by various product categories, including iPhone, Mac, iPad, Wearables, Home and Accessories, and Services.\n",
"=== LLM Response ===\n",
"Apple's total net sales in 2019 amounted to $260.174 billion, which represented a 2% decrease compared to 2018.\n"
]
}
],
"source": [
"response = agent.chat(\"Tell me the revenue for Apple in 2019?\")"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Added user message to memory: Tell me the revenue for Apple in 2020?\n",
"=== Calling Function ===\n",
"Calling function: chunk_query_engine with args: {\"input\": \"Apple revenue in 2020\"}\n",
"=== Function Output ===\n",
"Apple's total net sales in 2020 amounted to $274.5 billion, reflecting a 6% increase compared to 2019. The revenue was primarily driven by higher sales in Services and Wearables, Home and Accessories.\n",
"=== LLM Response ===\n",
"Apple's total net sales in 2020 amounted to $274.5 billion, reflecting a 6% increase compared to 2019.\n"
]
}
],
"source": [
"response = agent.chat(\"Tell me the revenue for Apple in 2020?\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python (systra)",
"language": "python",
"name": "bp_kernel"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.1"
}
],
"source": [
"response = agent.chat(\"Tell me the revenue for Apple in 2020?\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python (systra)",
"language": "python",
"name": "bp_kernel"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.13.1"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
"nbformat": 4,
"nbformat_minor": 4
}
+414 -407
View File
@@ -1,414 +1,421 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# LlamaParse + LlamaCloud + AWS Bedrock Cookbook\n",
"\n",
"<a href=\"https://colab.research.google.com/github/run-llama/llamacloud-demo/blob/main/examples/10k_apple_tesla/demo_file_retrieval.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
"\n",
"In this notebook we demonstrate demonstrate how to build a RAG application using LlamaParse, LlamaCloud, embedding models and LLMs supported on AWS Bedrock.\n",
"\n",
"Here are the steps involved:\n",
"\n",
"1. Install the packages and setup API keys. \n",
"2. Download Apple-10K 2023 SEC filing.\n",
"3. Parse the documents using LlamaParse.\n",
"4. Create a pipeline/ Index on LlamaCloud.\n",
"5. Upload the document to Index with `amazon.titan-embed-text-v1` embedding.\n",
"6. Connect to the Index.\n",
"7. Initiate LLM.\n",
"8. Create `query_engine`.\n",
"9. Query the index using `query_engine`.\n",
"\n",
"[LlamaCloud](https://docs.cloud.llamaindex.ai/), [LlamaParse](https://docs.llamaindex.ai/en/stable/llama_cloud/llama_parse/)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Installation and Setup API Keys"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Installation\n",
"\n",
"Here we install following packages:\n",
"\n",
"1. `llama-index`: Core package for OSS orchestration.\n",
"2. `llama-index-llms-bedrock-converse`: To utilize Bedrock LLMs.\n",
"3. `llama-index-indices-managed-llama-cloud`: For managing indices on LlamaCloud.\n",
"4. `llama-parse`: For parsing documents efficiently."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install llama-index\n",
"!pip install llama-index-llms-bedrock-converse\n",
"!pip install llama-index-indices-managed-llama-cloud\n",
"!pip install llama-parse"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Setup API Keys\n",
"\n",
"Here we setup `LLAMA_CLOUD_API_KEY` for managing the index on LlamaCloud."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# llama-parse is async-first, running the async code in a notebook requires the use of nest_asyncio\n",
"import nest_asyncio\n",
"nest_asyncio.apply()\n",
"\n",
"import os\n",
"# API access to llama-cloud\n",
"os.environ[\"LLAMA_CLOUD_API_KEY\"] = \"<LLAMACLOUD API KEY>\" # Get your API key from https://cloud.llamaindex.ai/"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Download files\n",
"\n",
"Here we download `Apple-10K` 2023 SEC filings and use it for our demonstration."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
"cells": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"--2024-11-28 16:05:55-- https://s2.q4cdn.com/470004039/files/doc_earnings/2023/q4/filing/_10-K-Q4-2023-As-Filed.pdf\n",
"Resolving s2.q4cdn.com (s2.q4cdn.com)... 181.41.142.154\n",
"Connecting to s2.q4cdn.com (s2.q4cdn.com)|181.41.142.154|:443... connected.\n",
"HTTP request sent, awaiting response... 200 OK\n",
"Length: 714094 (697K) [application/pdf]\n",
"Saving to: data/apple_2023.pdf\n",
"\n",
"data/apple_2023.pdf 100%[===================>] 697.36K 4.37MB/s in 0.2s \n",
"\n",
"2024-11-28 16:05:57 (4.37 MB/s) - data/apple_2023.pdf saved [714094/714094]\n",
"\n"
]
}
],
"source": [
"# download Apple \n",
"!mkdir -p data\n",
"!wget \"https://s2.q4cdn.com/470004039/files/doc_earnings/2023/q4/filing/_10-K-Q4-2023-As-Filed.pdf\" -O data/apple_2023.pdf"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Parse the document.\n",
"\n",
"Here we use `LlamaParse` to parse the downloaded `Apple` 10K-SEC filings. "
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
"cell_type": "markdown",
"metadata": {},
"source": [
"# ⚠️ Important Notice\n\nThis notebook (and repository) is deprecated.\n\nFor the latest python examples, please refer to the `llama-cloud-services` repository examples: \nhttps://github.com/run-llama/llama_cloud_services/tree/main/examples\n\n---"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Started parsing the file under job_id d6c339e3-9014-4139-afce-48c1ffbaa098\n"
]
}
],
"source": [
"from llama_parse import LlamaParse\n",
"\n",
"parser = LlamaParse(\n",
" result_type=\"markdown\", # \"markdown\" and \"text\" are available\n",
" num_workers=4, # if multiple files passed, split in `num_workers` API calls\n",
" verbose=True,\n",
" language=\"en\", # Optionally you can define a language, default=en\n",
")\n",
"\n",
"# sync\n",
"documents = parser.load_data(\"data/apple_2023.pdf\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create Pipeline/ Index on LlamaCloud"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### LlamaCloud Client\n",
"\n",
"We will connect to `LlamaCloud` client."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from llama_cloud.client import LlamaCloud\n",
"\n",
"client = LlamaCloud(token=os.environ[\"LLAMA_CLOUD_API_KEY\"])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create LlamaCloud Pipeline/ Index\n",
"\n",
"We need `embedding_config` and `transform_config` to create a pipeline.\n",
"\n",
"`embedding_config` - Sets up the embedding model details required to configure and create the pipeline.\n",
"\n",
"`transform_config` - Configures the `chunk_size` and `chunk_overlap` parameters required for the RAG application.\n",
"\n",
"We will use the `amazon.titan-embed-text-v1` embedding model available on AWS Bedrock. To access it, you will need the following credentials: `region_name`, `aws_access_key_id`, and `aws_secret_access_key`."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"# Transformation auto config\n",
"transform_config = {\n",
" 'mode': 'auto',\n",
" 'config': {\n",
" 'chunk_size': 1024, # editable\n",
" 'chunk_overlap': 20 # editable\n",
" }\n",
"}\n",
"\n",
"embedding_config = {\n",
" 'type': 'BEDROCK_EMBEDDING',\n",
" 'component': {\n",
" 'region_name': '<REGION NAME>',\n",
" 'aws_access_key_id': '<AWS ACCESS KEY ID>',\n",
" 'aws_secret_access_key': '<AWS SECRET ACCESS KEY>',\n",
" 'model': 'amazon.titan-embed-text-v1',\n",
" }\n",
"}"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"pipeline = {\n",
" 'name': 'apple_2023', # pipeline/ index name\n",
" 'transform_config': transform_config,\n",
" 'embedding_config': embedding_config,\n",
" 'data_sink_id': None\n",
"}\n",
"\n",
"pipeline = client.pipelines.upsert_pipeline(request=pipeline)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Upload the Documents\n",
"\n",
"Here we use the parsed document and upload it to the LlamaCloud"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"from llama_cloud.types import CloudDocumentCreate\n",
"\n",
"text = \"\\n\\n\".join([doc.text for doc in documents])\n",
"\n",
"documents = [\n",
"CloudDocumentCreate(\n",
"text=text,\n",
"metadata={\"filename\": \"apple_2023.pdf\", \"file_path\": \"data/apple_2023.pdf\"},\n",
")\n",
"]\n",
"\n",
"documents = client.pipelines.create_batch_pipeline_documents(pipeline.id, request=documents)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Check status if its uploaded"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
"cell_type": "markdown",
"metadata": {},
"source": [
"# LlamaParse + LlamaCloud + AWS Bedrock Cookbook\n",
"\n",
"<a href=\"https://colab.research.google.com/github/run-llama/llamacloud-demo/blob/main/examples/10k_apple_tesla/demo_file_retrieval.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
"\n",
"In this notebook we demonstrate demonstrate how to build a RAG application using LlamaParse, LlamaCloud, embedding models and LLMs supported on AWS Bedrock.\n",
"\n",
"Here are the steps involved:\n",
"\n",
"1. Install the packages and setup API keys. \n",
"2. Download Apple-10K 2023 SEC filing.\n",
"3. Parse the documents using LlamaParse.\n",
"4. Create a pipeline/ Index on LlamaCloud.\n",
"5. Upload the document to Index with `amazon.titan-embed-text-v1` embedding.\n",
"6. Connect to the Index.\n",
"7. Initiate LLM.\n",
"8. Create `query_engine`.\n",
"9. Query the index using `query_engine`.\n",
"\n",
"[LlamaCloud](https://docs.cloud.llamaindex.ai/), [LlamaParse](https://docs.llamaindex.ai/en/stable/llama_cloud/llama_parse/)"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"ManagedIngestionStatus.SUCCESS\n"
]
}
],
"source": [
"status = client.pipelines.get_pipeline_status(pipeline.id)\n",
"print(status.status)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Connect to the Index\n",
"\n",
"We will connect to the LlamaCloud pipeline or index that has been created. You can get the `project_name` and `organization_id` from your LlamaCloud index."
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
"from llama_index.indices.managed.llama_cloud import LlamaCloudIndex\n",
"\n",
"index = LlamaCloudIndex(\n",
" name=\"apple_2023\", \n",
" project_name=\"<PROJECT NAME>\",\n",
" organization_id=\"<ORGANIZATION ID>\",\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Define the LLM\n",
"\n",
"Here, we will initiate the supported LLM on AWS Bedrock LLM. You can refer to the [AWS Bedrock documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html) to explore the available LLMs.\n",
"\n",
"To access it, you will need the following credentials: `region_name`, `aws_access_key_id`, `aws_secret_access_key` and `aws_session_token`."
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [],
"source": [
"from llama_index.llms.bedrock_converse import BedrockConverse\n",
"\n",
"llm = BedrockConverse(model=\"<MODEL ID>\", \n",
" region_name=\"<REGION NAME>\", \n",
" aws_access_key_id=\"<AWS ACCESS KEY ID>\", \n",
" aws_secret_access_key=\"<AWS SECRET ACCESS KEY>\", \n",
" aws_session_token=\"<AWS SESSION TOKEN>\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create QueryEngine"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [],
"source": [
"query_engine = index.as_query_engine(llm=llm)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Querying\n",
"\n",
"We will test out with a query using the created `QueryEngine`"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
"cell_type": "markdown",
"metadata": {},
"source": [
"## Installation and Setup API Keys"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"The revenue of Apple in 2023 was $383.3 billion.\n"
]
"cell_type": "markdown",
"metadata": {},
"source": [
"### Installation\n",
"\n",
"Here we install following packages:\n",
"\n",
"1. `llama-index`: Core package for OSS orchestration.\n",
"2. `llama-index-llms-bedrock-converse`: To utilize Bedrock LLMs.\n",
"3. `llama-index-indices-managed-llama-cloud`: For managing indices on LlamaCloud.\n",
"4. `llama-parse`: For parsing documents efficiently."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install llama-index\n",
"!pip install llama-index-llms-bedrock-converse\n",
"!pip install llama-index-indices-managed-llama-cloud\n",
"!pip install llama-parse"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Setup API Keys\n",
"\n",
"Here we setup `LLAMA_CLOUD_API_KEY` for managing the index on LlamaCloud."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"# llama-parse is async-first, running the async code in a notebook requires the use of nest_asyncio\n",
"import nest_asyncio\n",
"nest_asyncio.apply()\n",
"\n",
"import os\n",
"# API access to llama-cloud\n",
"os.environ[\"LLAMA_CLOUD_API_KEY\"] = \"<LLAMACLOUD API KEY>\" # Get your API key from https://cloud.llamaindex.ai/"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Download files\n",
"\n",
"Here we download `Apple-10K` 2023 SEC filings and use it for our demonstration."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"--2024-11-28 16:05:55-- https://s2.q4cdn.com/470004039/files/doc_earnings/2023/q4/filing/_10-K-Q4-2023-As-Filed.pdf\n",
"Resolving s2.q4cdn.com (s2.q4cdn.com)... 181.41.142.154\n",
"Connecting to s2.q4cdn.com (s2.q4cdn.com)|181.41.142.154|:443... connected.\n",
"HTTP request sent, awaiting response... 200 OK\n",
"Length: 714094 (697K) [application/pdf]\n",
"Saving to: data/apple_2023.pdf\n",
"\n",
"data/apple_2023.pdf 100%[===================>] 697.36K 4.37MB/s in 0.2s \n",
"\n",
"2024-11-28 16:05:57 (4.37 MB/s) - data/apple_2023.pdf saved [714094/714094]\n",
"\n"
]
}
],
"source": [
"# download Apple \n",
"!mkdir -p data\n",
"!wget \"https://s2.q4cdn.com/470004039/files/doc_earnings/2023/q4/filing/_10-K-Q4-2023-As-Filed.pdf\" -O data/apple_2023.pdf"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Parse the document.\n",
"\n",
"Here we use `LlamaParse` to parse the downloaded `Apple` 10K-SEC filings. "
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Started parsing the file under job_id d6c339e3-9014-4139-afce-48c1ffbaa098\n"
]
}
],
"source": [
"from llama_parse import LlamaParse\n",
"\n",
"parser = LlamaParse(\n",
" result_type=\"markdown\", # \"markdown\" and \"text\" are available\n",
" num_workers=4, # if multiple files passed, split in `num_workers` API calls\n",
" verbose=True,\n",
" language=\"en\", # Optionally you can define a language, default=en\n",
")\n",
"\n",
"# sync\n",
"documents = parser.load_data(\"data/apple_2023.pdf\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create Pipeline/ Index on LlamaCloud"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### LlamaCloud Client\n",
"\n",
"We will connect to `LlamaCloud` client."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from llama_cloud.client import LlamaCloud\n",
"\n",
"client = LlamaCloud(token=os.environ[\"LLAMA_CLOUD_API_KEY\"])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create LlamaCloud Pipeline/ Index\n",
"\n",
"We need `embedding_config` and `transform_config` to create a pipeline.\n",
"\n",
"`embedding_config` - Sets up the embedding model details required to configure and create the pipeline.\n",
"\n",
"`transform_config` - Configures the `chunk_size` and `chunk_overlap` parameters required for the RAG application.\n",
"\n",
"We will use the `amazon.titan-embed-text-v1` embedding model available on AWS Bedrock. To access it, you will need the following credentials: `region_name`, `aws_access_key_id`, and `aws_secret_access_key`."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"# Transformation auto config\n",
"transform_config = {\n",
" 'mode': 'auto',\n",
" 'config': {\n",
" 'chunk_size': 1024, # editable\n",
" 'chunk_overlap': 20 # editable\n",
" }\n",
"}\n",
"\n",
"embedding_config = {\n",
" 'type': 'BEDROCK_EMBEDDING',\n",
" 'component': {\n",
" 'region_name': '<REGION NAME>',\n",
" 'aws_access_key_id': '<AWS ACCESS KEY ID>',\n",
" 'aws_secret_access_key': '<AWS SECRET ACCESS KEY>',\n",
" 'model': 'amazon.titan-embed-text-v1',\n",
" }\n",
"}"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"pipeline = {\n",
" 'name': 'apple_2023', # pipeline/ index name\n",
" 'transform_config': transform_config,\n",
" 'embedding_config': embedding_config,\n",
" 'data_sink_id': None\n",
"}\n",
"\n",
"pipeline = client.pipelines.upsert_pipeline(request=pipeline)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Upload the Documents\n",
"\n",
"Here we use the parsed document and upload it to the LlamaCloud"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"from llama_cloud.types import CloudDocumentCreate\n",
"\n",
"text = \"\\n\\n\".join([doc.text for doc in documents])\n",
"\n",
"documents = [\n",
"CloudDocumentCreate(\n",
"text=text,\n",
"metadata={\"filename\": \"apple_2023.pdf\", \"file_path\": \"data/apple_2023.pdf\"},\n",
")\n",
"]\n",
"\n",
"documents = client.pipelines.create_batch_pipeline_documents(pipeline.id, request=documents)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Check status if its uploaded"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"ManagedIngestionStatus.SUCCESS\n"
]
}
],
"source": [
"status = client.pipelines.get_pipeline_status(pipeline.id)\n",
"print(status.status)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Connect to the Index\n",
"\n",
"We will connect to the LlamaCloud pipeline or index that has been created. You can get the `project_name` and `organization_id` from your LlamaCloud index."
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
"from llama_index.indices.managed.llama_cloud import LlamaCloudIndex\n",
"\n",
"index = LlamaCloudIndex(\n",
" name=\"apple_2023\", \n",
" project_name=\"<PROJECT NAME>\",\n",
" organization_id=\"<ORGANIZATION ID>\",\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Define the LLM\n",
"\n",
"Here, we will initiate the supported LLM on AWS Bedrock LLM. You can refer to the [AWS Bedrock documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/models-supported.html) to explore the available LLMs.\n",
"\n",
"To access it, you will need the following credentials: `region_name`, `aws_access_key_id`, `aws_secret_access_key` and `aws_session_token`."
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [],
"source": [
"from llama_index.llms.bedrock_converse import BedrockConverse\n",
"\n",
"llm = BedrockConverse(model=\"<MODEL ID>\", \n",
" region_name=\"<REGION NAME>\", \n",
" aws_access_key_id=\"<AWS ACCESS KEY ID>\", \n",
" aws_secret_access_key=\"<AWS SECRET ACCESS KEY>\", \n",
" aws_session_token=\"<AWS SESSION TOKEN>\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create QueryEngine"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [],
"source": [
"query_engine = index.as_query_engine(llm=llm)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Querying\n",
"\n",
"We will test out with a query using the created `QueryEngine`"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The revenue of Apple in 2023 was $383.3 billion.\n"
]
}
],
"source": [
"query = \"what is the revenue of Apple in 2023?\"\n",
"response = query_engine.query(query)\n",
"\n",
"print(response)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "llamacloud",
"language": "python",
"name": "llamacloud"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.4"
}
],
"source": [
"query = \"what is the revenue of Apple in 2023?\"\n",
"response = query_engine.query(query)\n",
"\n",
"print(response)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "llamacloud",
"language": "python",
"name": "llamacloud"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
"nbformat": 4,
"nbformat_minor": 2
}
+337 -330
View File
@@ -1,338 +1,345 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "8c1348d3-4c0e-450f-8faf-19503f61b7b2",
"metadata": {},
"source": [
"# LlamaCloud Demo"
]
},
{
"cell_type": "markdown",
"id": "57082f55-66e0-44e1-8072-2450405c21d1",
"metadata": {},
"source": [
"## Step 0: Setup environment config for LlamaCloud"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "e83a35ec-8e6c-475c-827c-20f46c4a3c43",
"metadata": {},
"outputs": [],
"source": [
"import nest_asyncio\n",
"nest_asyncio.apply()"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "013c59e6-34b2-4685-b8c3-10b9e9afc3a8",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"LLAMA_CLOUD_API_KEY\"] = \"llx-\"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"sk-\""
]
},
{
"cell_type": "markdown",
"id": "9472b5e3-c203-410d-82a3-b19c7c0ed61b",
"metadata": {},
"source": [
"## Step 1: Parse pdf with LlamaParse"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "4f713c84-2774-45c8-b030-a572273724db",
"metadata": {},
"outputs": [],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_index.core import SimpleDirectoryReader"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "e83d26e5-05ea-4bac-a06c-987e06993f1a",
"metadata": {},
"outputs": [],
"source": [
"parser = LlamaParse(\n",
" result_type=\"markdown\", # \"markdown\" and \"text\" are available\n",
" num_workers=4,\n",
" verbose=True,\n",
" language=\"en\",\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "6414cbff-8b05-4599-bce6-5ce764600a40",
"metadata": {},
"outputs": [
"cells": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Started parsing the file under job_id 837b9595-bc68-4c02-85b2-ed76acb2f59b\n"
]
}
],
"source": [
"file_extractor = {\".pdf\": parser}\n",
"reader = SimpleDirectoryReader(\n",
" input_files=['data_resnet/resnet.pdf'], \n",
" file_extractor=file_extractor\n",
")\n",
"docs = reader.load_data()"
]
},
{
"cell_type": "markdown",
"id": "c25b8955-b6f7-43c2-8ba4-33a07fca0e2c",
"metadata": {},
"source": [
"## Step 2: Build cloud index"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "ba1896de-c850-44d2-8e79-cb75a04d669c",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.indices.managed.llama_cloud import LlamaCloudIndex"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "5d879e25-3ffc-4b90-9858-f9f3f6c83851",
"metadata": {},
"outputs": [
"cell_type": "markdown",
"metadata": {},
"source": [
"# ⚠️ Important Notice\n\nThis notebook (and repository) is deprecated.\n\nFor the latest python examples, please refer to the `llama-cloud-services` repository examples: \nhttps://github.com/run-llama/llama_cloud_services/tree/main/examples\n\n---"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Find your index at https://cloud.llamaindex.ai/project/c4bd96da-ba73-4572-b400-4fb2e53b2a95/deploy/e24629a3-5ca5-401e-b673-c3bb22874034\n"
]
}
],
"source": [
"index = LlamaCloudIndex.from_documents(\n",
" name='resnet_0226',\n",
" documents=docs,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "8375bb94-3386-4a21-9f6e-a31c7ff55e5b",
"metadata": {},
"source": [
"## Step 3: Use your retrieval endpoint "
]
},
{
"cell_type": "markdown",
"id": "a7b5cb21-8f98-490d-9468-cb30e12c3d83",
"metadata": {},
"source": [
"If you have a reference to the index: "
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "df510141-13be-4285-9fab-1906c56dc71c",
"metadata": {},
"outputs": [],
"source": [
"retriever = index.as_retriever(rerank_top_n=3)"
]
},
{
"cell_type": "code",
"execution_count": 29,
"id": "63457127-5a86-4b78-81e9-848a07c37362",
"metadata": {},
"outputs": [
"cell_type": "markdown",
"id": "8c1348d3-4c0e-450f-8faf-19503f61b7b2",
"metadata": {},
"source": [
"# LlamaCloud Demo"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 11 ms, sys: 10.7 ms, total: 21.7 ms\n",
"Wall time: 1.34 s\n"
]
}
],
"source": [
"%%time\n",
"nodes = retriever.retrieve('how is the result in ImageNet detection task?')"
]
},
{
"cell_type": "code",
"execution_count": 30,
"id": "8c3de876-fad7-41e4-b965-fb93b3909fd1",
"metadata": {},
"outputs": [
"cell_type": "markdown",
"id": "57082f55-66e0-44e1-8072-2450405c21d1",
"metadata": {},
"source": [
"## Step 0: Setup environment config for LlamaCloud"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Node ID: 8a3280f3-c9e9-48c3-b9fb-77127c619cd8\n",
"Text: This **result** won the 1st place in the **ImageNet**\n",
"**detection** task in ILSVRC 2015, surpassing the second place by 8.5\n",
"points (absolute). ## **ImageNet** Localization The **ImageNet**\n",
"Localization (LOC) task [36] requires to classify and localize the\n",
"objects. Following [40, 41], we assume that the image-level\n",
"classifiers are first adopted...\n",
"Score: 0.997\n",
"\n",
"Node ID: 8228859b-0e0c-4828-96ee-85ad71c2a3e7\n",
"Text: Under this setting, the results are an mAP@.5 of 55.7% and an\n",
"mAP@[.5, .95] of 34.9% (Table 9). This is our single-model **result**.\n",
"Ensemble. In Faster R-CNN, the system is designed to learn region\n",
"proposals and also object classifiers, so an ensemble can be used to\n",
"boost both tasks. We use an ensemble for proposing regions, and the\n",
"union set ...\n",
"Score: 0.992\n",
"\n",
"Node ID: 189314cd-84ae-426f-8280-cd26ca8b79ac\n",
"Text: **Detection** results on the PASCAL VOC 2007 test set. The\n",
"baseline is the Faster R-CNN system. The system “baseline+++” include\n",
"box refinement, context, and multi-scale testing in Table 9. |system|\n",
"net|data|mAP|areo|bike|bird|boat|bottle|bus|car|cat|chair|cow|table|do\n",
"g|horse|mbike|person|plant|sheep|sofa|train|tv|\n",
"|---|---|---|---|---|---|---|-...\n",
"Score: 0.948\n",
"\n"
]
}
],
"source": [
"for node in nodes:\n",
" print(node)"
]
},
{
"cell_type": "markdown",
"id": "9ba5001d-fcdf-4888-8940-0e7db5372df4",
"metadata": {},
"source": [
"Alternatively, you can directly create a retriever:"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "4e4c651b-a1d0-40b0-91ce-a7a96fd701fa",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.indices.managed.llama_cloud import LlamaCloudRetriever"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "2796361d-90cd-423c-9726-f638a7aefbbc",
"metadata": {},
"outputs": [],
"source": [
"retriever = LlamaCloudRetriever(\n",
" name='resnet_0226',\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "828c11dd-8b40-4262-a4c3-f0345b81a522",
"metadata": {},
"outputs": [],
"source": [
"# %%time\n",
"nodes = retriever.retrieve('what is Deep Residual Learning?')"
]
},
{
"cell_type": "code",
"execution_count": 28,
"id": "58688348-fa6f-4daf-8d95-20604c6f5a54",
"metadata": {},
"outputs": [
"cell_type": "code",
"execution_count": 9,
"id": "e83a35ec-8e6c-475c-827c-20f46c4a3c43",
"metadata": {},
"outputs": [],
"source": [
"import nest_asyncio\n",
"nest_asyncio.apply()"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Node ID: e74dac90-e4f1-4344-87b3-7b35af1c52f6\n",
"Text: **Residual** learning: a building block. are comparably good or\n",
"better than the constructed solution (or unable to do so in feasible\n",
"time). In this paper, we address the degradation problem by\n",
"introducing a **deep** **residual** learning framework. Instead of\n",
"hoping each few stacked layers directly fit a desired underlying\n",
"mapping, we explicit...\n",
"Score: 0.999\n",
"\n",
"Node ID: 5b856dc3-886e-4e2f-8720-c4720c892403\n",
"Text: ## **Deep** **Residual** Learning for Image Recognition Kaiming\n",
"He Xiangyu Zhang Shaoqing Ren Jian Sun Microsoft Research\n",
"arXiv:1512.03385v1 [cs.CV] 10 Dec 2015 {kahe, v-xiangz, v-shren,\n",
"jiansun}@microsoft.com 20 20 ### Abstract Deeper neural networks are\n",
"more difficult to train. We present a **residual** learning framework\n",
"to ease the train...\n",
"Score: 0.998\n",
"\n",
"Node ID: 9018f86f-647f-4f58-8836-799733664df5\n",
"Text: These methods suggest that a good reformulation or\n",
"preconditioning can simplify the optimization. Shortcut Connections.\n",
"Practices and theories that lead to shortcut connections have been\n",
"studied for a long time. An early practice of training multi-layer\n",
"perceptrons (MLPs) is to add a linear layer connected from the network\n",
"input to the output. ...\n",
"Score: 0.961\n",
"\n"
]
"cell_type": "code",
"execution_count": 10,
"id": "013c59e6-34b2-4685-b8c3-10b9e9afc3a8",
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ[\"LLAMA_CLOUD_API_KEY\"] = \"llx-\"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"sk-\""
]
},
{
"cell_type": "markdown",
"id": "9472b5e3-c203-410d-82a3-b19c7c0ed61b",
"metadata": {},
"source": [
"## Step 1: Parse pdf with LlamaParse"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "4f713c84-2774-45c8-b030-a572273724db",
"metadata": {},
"outputs": [],
"source": [
"from llama_parse import LlamaParse\n",
"from llama_index.core import SimpleDirectoryReader"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "e83d26e5-05ea-4bac-a06c-987e06993f1a",
"metadata": {},
"outputs": [],
"source": [
"parser = LlamaParse(\n",
" result_type=\"markdown\", # \"markdown\" and \"text\" are available\n",
" num_workers=4,\n",
" verbose=True,\n",
" language=\"en\",\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "6414cbff-8b05-4599-bce6-5ce764600a40",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Started parsing the file under job_id 837b9595-bc68-4c02-85b2-ed76acb2f59b\n"
]
}
],
"source": [
"file_extractor = {\".pdf\": parser}\n",
"reader = SimpleDirectoryReader(\n",
" input_files=['data_resnet/resnet.pdf'], \n",
" file_extractor=file_extractor\n",
")\n",
"docs = reader.load_data()"
]
},
{
"cell_type": "markdown",
"id": "c25b8955-b6f7-43c2-8ba4-33a07fca0e2c",
"metadata": {},
"source": [
"## Step 2: Build cloud index"
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "ba1896de-c850-44d2-8e79-cb75a04d669c",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.indices.managed.llama_cloud import LlamaCloudIndex"
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "5d879e25-3ffc-4b90-9858-f9f3f6c83851",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Find your index at https://cloud.llamaindex.ai/project/c4bd96da-ba73-4572-b400-4fb2e53b2a95/deploy/e24629a3-5ca5-401e-b673-c3bb22874034\n"
]
}
],
"source": [
"index = LlamaCloudIndex.from_documents(\n",
" name='resnet_0226',\n",
" documents=docs,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "8375bb94-3386-4a21-9f6e-a31c7ff55e5b",
"metadata": {},
"source": [
"## Step 3: Use your retrieval endpoint "
]
},
{
"cell_type": "markdown",
"id": "a7b5cb21-8f98-490d-9468-cb30e12c3d83",
"metadata": {},
"source": [
"If you have a reference to the index: "
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "df510141-13be-4285-9fab-1906c56dc71c",
"metadata": {},
"outputs": [],
"source": [
"retriever = index.as_retriever(rerank_top_n=3)"
]
},
{
"cell_type": "code",
"execution_count": 29,
"id": "63457127-5a86-4b78-81e9-848a07c37362",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"CPU times: user 11 ms, sys: 10.7 ms, total: 21.7 ms\n",
"Wall time: 1.34 s\n"
]
}
],
"source": [
"%%time\n",
"nodes = retriever.retrieve('how is the result in ImageNet detection task?')"
]
},
{
"cell_type": "code",
"execution_count": 30,
"id": "8c3de876-fad7-41e4-b965-fb93b3909fd1",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Node ID: 8a3280f3-c9e9-48c3-b9fb-77127c619cd8\n",
"Text: This **result** won the 1st place in the **ImageNet**\n",
"**detection** task in ILSVRC 2015, surpassing the second place by 8.5\n",
"points (absolute). ## **ImageNet** Localization The **ImageNet**\n",
"Localization (LOC) task [36] requires to classify and localize the\n",
"objects. Following [40, 41], we assume that the image-level\n",
"classifiers are first adopted...\n",
"Score: 0.997\n",
"\n",
"Node ID: 8228859b-0e0c-4828-96ee-85ad71c2a3e7\n",
"Text: Under this setting, the results are an mAP@.5 of 55.7% and an\n",
"mAP@[.5, .95] of 34.9% (Table 9). This is our single-model **result**.\n",
"Ensemble. In Faster R-CNN, the system is designed to learn region\n",
"proposals and also object classifiers, so an ensemble can be used to\n",
"boost both tasks. We use an ensemble for proposing regions, and the\n",
"union set ...\n",
"Score: 0.992\n",
"\n",
"Node ID: 189314cd-84ae-426f-8280-cd26ca8b79ac\n",
"Text: **Detection** results on the PASCAL VOC 2007 test set. The\n",
"baseline is the Faster R-CNN system. The system “baseline+++” include\n",
"box refinement, context, and multi-scale testing in Table 9. |system|\n",
"net|data|mAP|areo|bike|bird|boat|bottle|bus|car|cat|chair|cow|table|do\n",
"g|horse|mbike|person|plant|sheep|sofa|train|tv|\n",
"|---|---|---|---|---|---|---|-...\n",
"Score: 0.948\n",
"\n"
]
}
],
"source": [
"for node in nodes:\n",
" print(node)"
]
},
{
"cell_type": "markdown",
"id": "9ba5001d-fcdf-4888-8940-0e7db5372df4",
"metadata": {},
"source": [
"Alternatively, you can directly create a retriever:"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "4e4c651b-a1d0-40b0-91ce-a7a96fd701fa",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.indices.managed.llama_cloud import LlamaCloudRetriever"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "2796361d-90cd-423c-9726-f638a7aefbbc",
"metadata": {},
"outputs": [],
"source": [
"retriever = LlamaCloudRetriever(\n",
" name='resnet_0226',\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "828c11dd-8b40-4262-a4c3-f0345b81a522",
"metadata": {},
"outputs": [],
"source": [
"# %%time\n",
"nodes = retriever.retrieve('what is Deep Residual Learning?')"
]
},
{
"cell_type": "code",
"execution_count": 28,
"id": "58688348-fa6f-4daf-8d95-20604c6f5a54",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Node ID: e74dac90-e4f1-4344-87b3-7b35af1c52f6\n",
"Text: **Residual** learning: a building block. are comparably good or\n",
"better than the constructed solution (or unable to do so in feasible\n",
"time). In this paper, we address the degradation problem by\n",
"introducing a **deep** **residual** learning framework. Instead of\n",
"hoping each few stacked layers directly fit a desired underlying\n",
"mapping, we explicit...\n",
"Score: 0.999\n",
"\n",
"Node ID: 5b856dc3-886e-4e2f-8720-c4720c892403\n",
"Text: ## **Deep** **Residual** Learning for Image Recognition Kaiming\n",
"He Xiangyu Zhang Shaoqing Ren Jian Sun Microsoft Research\n",
"arXiv:1512.03385v1 [cs.CV] 10 Dec 2015 {kahe, v-xiangz, v-shren,\n",
"jiansun}@microsoft.com 20 20 ### Abstract Deeper neural networks are\n",
"more difficult to train. We present a **residual** learning framework\n",
"to ease the train...\n",
"Score: 0.998\n",
"\n",
"Node ID: 9018f86f-647f-4f58-8836-799733664df5\n",
"Text: These methods suggest that a good reformulation or\n",
"preconditioning can simplify the optimization. Shortcut Connections.\n",
"Practices and theories that lead to shortcut connections have been\n",
"studied for a long time. An early practice of training multi-layer\n",
"perceptrons (MLPs) is to add a linear layer connected from the network\n",
"input to the output. ...\n",
"Score: 0.961\n",
"\n"
]
}
],
"source": [
"for node in nodes:\n",
" print(node)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.6"
}
],
"source": [
"for node in nodes:\n",
" print(node)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
"nbformat": 4,
"nbformat_minor": 5
}
File diff suppressed because it is too large Load Diff
+316 -309
View File
@@ -1,316 +1,323 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "007f366e-5098-42fa-baf1-b709f5845301",
"metadata": {},
"source": [
"# Ad-Hoc Experimentation with Chunking\n",
"\n",
"<a href=\"https://colab.research.google.com/github/run-llama/llamacloud-demo/blob/main/examples/experimentation/chunk_size_adhoc.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
"\n",
"A key parameter for RAG pipelines is chunking: the chunk size affects the accuracy of your overall RAG pipeline.\n",
"\n",
"Unlike retrieval and query-time parameters, though, chunking is harder to experiment with, because changing your chunking configuration requires reindexing your data, which can be tedious.\n",
"\n",
"LlamaCloud provides easy ways for you to perform **ad-hoc** experimentation over chunking.\n",
"1. First, check whether a given query returns the correct answer over an index, using the index playground features.\n",
"2. If it does not, clone the index with the click of a button and set different chunking (or, more broadly, ingestion) parameters.\n",
"3. Try the same query in the playground again and see whether it now leads to the right results.\n",
"\n",
"**NOTE**: More structured experimentation capabilities are coming soon!"
]
},
{
"cell_type": "markdown",
"id": "3483a0e1-3344-4527-a6b5-6f6b829915a2",
"metadata": {},
"source": [
"## Set Up a LlamaCloud Index\n",
"\n",
"Download the three ICLR 2024 papers below. Then create a new LlamaCloud Index in the UI and upload the three files via drag and drop.\n",
"\n",
"In \"Transform Settings\", select the \"Auto\" tab with a chunk size of 512. This is our starting point: we'll analyze how well different queries perform on this index, then iterate on the indexing parameters."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "6e40b962-b6e1-46c7-9a6b-cbe14d21ae28",
"metadata": {},
"outputs": [],
"source": [
"import nest_asyncio\n",
"nest_asyncio.apply()\n",
"\n",
"from IPython.display import Markdown, display"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "1f94da7d-b18a-4ce7-90eb-f6ef99e20a06",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.llms.openai import OpenAI\n",
"from llama_index.core.settings import Settings\n",
"\n",
"Settings.llm = OpenAI(model=\"gpt-4o\")"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "26a99a0f-4463-46a8-a847-f390ee578ebf",
"metadata": {},
"outputs": [],
"source": [
"# NOTE: insert your own `name`, `project_name`, and `api_key`\n",
"from llama_index.indices.managed.llama_cloud import LlamaCloudIndex\n",
"\n",
"index = LlamaCloudIndex(\n",
" name=\"research_papers_512\", \n",
" project_name=\"llamacloud_demo\",\n",
" # api_key=\"llx-\"\n",
")"
]
},
{
"cell_type": "markdown",
"id": "640ea13c-95db-4790-b0c2-54917e20c4f6",
"metadata": {},
"source": [
"## Ad-hoc Test a Question\n",
"\n",
"Sanity-check this index with a question you already know the answer to. In this example, we want to understand the core features of the SWE-bench dataset described on page 3, as shown in the image below.\n",
"\n",
"![](chunk_size_adhoc_images/source_chunk.png)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "974d6f4e-9c74-47e7-8ef4-6fc535510471",
"metadata": {},
"outputs": [],
"source": [
"# set re-ranking top-n to 3 \n",
"query_engine = index.as_query_engine(rerank_top_n=3)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "186eb4cb-d809-434d-80fb-22610c32ad2f",
"metadata": {},
"outputs": [
"cells": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"RETRIEVING: Tell me about the core features of SWE-bench\n"
]
}
],
"source": [
"response = query_engine.query(\"Tell me about the core features of SWE-bench\")"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "bb8ac742-0828-44c4-839f-9454af3899a2",
"metadata": {},
"outputs": [
"cell_type": "markdown",
"metadata": {},
"source": [
"# ⚠️ Important Notice\n\nThis notebook (and repository) is deprecated.\n\nFor the latest python examples, please refer to the `llama-cloud-services` repository examples: \nhttps://github.com/run-llama/llama_cloud_services/tree/main/examples\n\n---"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"SWE-bench is a benchmark designed to evaluate language models (LMs) in a realistic software engineering setting. Its core features include:\n",
"\n",
"1. **Real-world Software Engineering Tasks**: Each task involves a large and complex codebase along with a detailed issue description, requiring sophisticated skills and knowledge akin to those of experienced software engineers.\n",
"\n",
"2. **Continually Updatable**: The benchmark can be easily extended with new task instances from any Python repository on GitHub, ensuring a continual supply of fresh challenges that were not part of the models' training data.\n",
"\n",
"3. **Diverse Long Inputs**: Issue descriptions are typically lengthy and detailed, and the codebases contain many thousands of files. This requires models to identify the specific lines that need modification among a vast amount of context.\n",
"\n",
"4. **Robust Evaluation**: Each task instance includes at least one fail-to-pass test to verify the solution, with many instances having multiple such tests. Additionally, a median of 51 other tests are run to ensure that prior functionality is maintained.\n",
"\n",
"5. **Realistic Setting**: Unlike traditional benchmarks that involve short and contrived problems, SWE-bench uses user-submitted issues and solutions from popular GitHub repositories, providing a more realistic and challenging environment for evaluation.\n",
"\n",
"6. **Execution-based Evaluation**: The benchmark uses the repositorys testing framework to evaluate the revised codebase, ensuring that the proposed solutions are practical and effective.\n",
"\n",
"These features collectively make SWE-bench a comprehensive and challenging benchmark for evaluating the capabilities of language models in real-world software engineering tasks.\n"
]
}
],
"source": [
"print(str(response))"
]
},
{
"cell_type": "markdown",
"id": "eaeef466-a95e-4211-af53-532236a7ea36",
"metadata": {},
"source": [
"### Analyze Results\n",
"\n",
"**NOTE**: Since we know the ground truth, we can tell that this answer isn't quite complete.\n",
"Instead of using the notebook, we can also quickly validate this through the chat UI and retrieval UI in the \"Playground\" section of the index page.\n",
"Open the LlamaCloud playground, enter the question above in the chat UI, and look at the response and the set of retrieved nodes.\n",
"\n",
"![](chunk_size_adhoc_images/chat_ui_test.png)\n",
"\n",
"Now enter the same question into the retrieval UI, which shows not only the chunks but also the source document for each chunk.\n",
"\n",
"![](chunk_size_adhoc_images/retrieval_ui_test.png)\n",
"\n",
"Clicking \"View in File\" on the first chunk shows how the source document is parsed and chunked. Since we know the ground truth, we can check whether the ground-truth context is chunked cohesively. In this case, the relevant section is cut off, and Node 21 is not in the retrieved set at all.\n",
"\n",
"![](chunk_size_adhoc_images/view_chunks.png)\n",
"\n",
"You'll notice that the relevant paragraph is broken up into two chunks."
]
},
{
"cell_type": "markdown",
"id": "4b5a8b94-27b5-46bd-b066-f21d1146d02b",
"metadata": {},
"source": [
"## Experimenting with Chunk Sizes\n",
"\n",
"You can tackle the above issue in a variety of ways. For instance, you can keep the chunk size fixed and tune only retrieval parameters such as top-k, hybrid search, and reranking. Tuning retrieval alone has tradeoffs, though: increasing top-k, for example, can increase latency and cost.\n",
"\n",
"Chunk sizes are a little harder to experiment with than retrieval parameters, since changing them requires retriggering an index run.\n",
"\n",
"With LlamaCloud, we can easily create a new index with a different chunking configuration and see if the retrieved results change.\n",
"\n",
"First, on the Index page, click the \"Copy\" button to duplicate the index, and give it a new name.\n",
"\n",
"Click into the new index, rename it as you wish, then click \"Edit\" and change the chunking configuration to page-level chunking:\n",
"1. Click into \"Manual\".\n",
"2. Click \"Page\" segmentation in \"Segmentation Configuration\" to segment by page at the top level.\n",
"3. In \"Chunking Configuration\", select None for the mode.\n",
"\n",
"![](chunk_size_adhoc_images/transform_config.png)\n",
"\n",
"Click \"Save\" to apply the new index settings and retrigger a run of the pipeline."
]
},
{
"cell_type": "markdown",
"id": "df214f58-b010-452b-8787-3b985b32e42c",
"metadata": {},
"source": [
"Let's try the same query against the new index."
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "815d5f40-ae6d-4944-a243-fd99d0b05ec7",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.indices.managed.llama_cloud import LlamaCloudIndex\n",
"\n",
"index = LlamaCloudIndex(\n",
" name=\"research_papers_page\", \n",
" project_name=\"llamacloud_demo\",\n",
" # api_key=\"llx-\"\n",
")\n",
"query_engine = index.as_query_engine(rerank_top_n=3)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "71676364-ef94-4a15-bb6d-ddc69a7821a7",
"metadata": {},
"outputs": [
"cell_type": "markdown",
"id": "007f366e-5098-42fa-baf1-b709f5845301",
"metadata": {},
"source": [
"# Ad-Hoc Experimentation with Chunking\n",
"\n",
"<a href=\"https://colab.research.google.com/github/run-llama/llamacloud-demo/blob/main/examples/experimentation/chunk_size_adhoc.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
"\n",
"A key parameter for RAG pipelines is chunking: the chunk size affects the accuracy of your overall RAG pipeline.\n",
"\n",
"Unlike retrieval and query-time parameters, though, chunking is harder to experiment with, because changing your chunking configuration requires reindexing your data, which can be tedious.\n",
"\n",
"LlamaCloud provides easy ways for you to perform **ad-hoc** experimentation over chunking.\n",
"1. First, check whether a given query returns the correct answer over an index, using the index playground features.\n",
"2. If it does not, clone the index with the click of a button and set different chunking (or, more broadly, ingestion) parameters.\n",
"3. Try the same query in the playground again and see whether it now leads to the right results.\n",
"\n",
"**NOTE**: More structured experimentation capabilities are coming soon!"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"RETRIEVING: Tell me about the core features of SWE-bench\n"
]
}
],
"source": [
"response = query_engine.query(\"Tell me about the core features of SWE-bench\")"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "8331e3d1-b537-49ca-a57b-bcab179a4e46",
"metadata": {},
"outputs": [
"cell_type": "markdown",
"id": "3483a0e1-3344-4527-a6b5-6f6b829915a2",
"metadata": {},
"source": [
"## Set Up a LlamaCloud Index\n",
"\n",
"Download the three ICLR 2024 papers below. Then create a new LlamaCloud Index in the UI and upload the three files via drag and drop.\n",
"\n",
"In \"Transform Settings\", select the \"Auto\" tab with a chunk size of 512. This is our starting point: we'll analyze how well different queries perform on this index, then iterate on the indexing parameters."
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"SWE-bench is a benchmark designed to evaluate language models (LMs) in realistic software engineering settings. It features several core attributes:\n",
"\n",
"1. **Real-world Software Engineering Tasks**: It involves large and complex codebases with detailed issue descriptions, requiring sophisticated skills and knowledge akin to those of experienced software engineers.\n",
"\n",
"2. **Continually Updatable**: The collection process can be applied to any Python repository on GitHub with minimal human intervention, allowing for a continual supply of new task instances.\n",
"\n",
"3. **Diverse Long Inputs**: Issue descriptions are typically long and detailed, and codebases contain many thousands of files, necessitating the identification of specific lines that need editing.\n",
"\n",
"4. **Robust Evaluation**: Each task instance includes at least one fail-to-pass test to ensure the model addresses the problem, with additional tests to check for proper maintenance of prior functionality.\n",
"\n",
"5. **Cross-context Code Editing**: Unlike benchmarks that constrain edit scope, SWE-bench requires generating revisions in multiple locations of a large codebase, challenging models to handle repository-scale code editing.\n",
"\n",
"6. **Wide Scope for Possible Solutions**: It allows for creative freedom in generating novel solutions, providing a level playing field to compare various approaches, from retrieval and long-context models to decision-making agents.\n"
]
"cell_type": "code",
"execution_count": 1,
"id": "6e40b962-b6e1-46c7-9a6b-cbe14d21ae28",
"metadata": {},
"outputs": [],
"source": [
"import nest_asyncio\n",
"nest_asyncio.apply()\n",
"\n",
"from IPython.display import Markdown, display"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "1f94da7d-b18a-4ce7-90eb-f6ef99e20a06",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.llms.openai import OpenAI\n",
"from llama_index.core.settings import Settings\n",
"\n",
"Settings.llm = OpenAI(model=\"gpt-4o\")"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "26a99a0f-4463-46a8-a847-f390ee578ebf",
"metadata": {},
"outputs": [],
"source": [
"# NOTE: insert your own `name`, `project_name`, and `api_key`\n",
"from llama_index.indices.managed.llama_cloud import LlamaCloudIndex\n",
"\n",
"index = LlamaCloudIndex(\n",
" name=\"research_papers_512\", \n",
" project_name=\"llamacloud_demo\",\n",
" # api_key=\"llx-\"\n",
")"
]
},
{
"cell_type": "markdown",
"id": "640ea13c-95db-4790-b0c2-54917e20c4f6",
"metadata": {},
"source": [
"## Ad-hoc Test a Question\n",
"\n",
"Sanity-check this index with a question you already know the answer to. In this example, we want to understand the core features of the SWE-bench dataset described on page 3, as shown in the image below.\n",
"\n",
"![](chunk_size_adhoc_images/source_chunk.png)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "974d6f4e-9c74-47e7-8ef4-6fc535510471",
"metadata": {},
"outputs": [],
"source": [
"# set re-ranking top-n to 3 \n",
"query_engine = index.as_query_engine(rerank_top_n=3)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "186eb4cb-d809-434d-80fb-22610c32ad2f",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"RETRIEVING: Tell me about the core features of SWE-bench\n"
]
}
],
"source": [
"response = query_engine.query(\"Tell me about the core features of SWE-bench\")"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "bb8ac742-0828-44c4-839f-9454af3899a2",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"SWE-bench is a benchmark designed to evaluate language models (LMs) in a realistic software engineering setting. Its core features include:\n",
"\n",
"1. **Real-world Software Engineering Tasks**: Each task involves a large and complex codebase along with a detailed issue description, requiring sophisticated skills and knowledge akin to those of experienced software engineers.\n",
"\n",
"2. **Continually Updatable**: The benchmark can be easily extended with new task instances from any Python repository on GitHub, ensuring a continual supply of fresh challenges that were not part of the models' training data.\n",
"\n",
"3. **Diverse Long Inputs**: Issue descriptions are typically lengthy and detailed, and the codebases contain many thousands of files. This requires models to identify the specific lines that need modification among a vast amount of context.\n",
"\n",
"4. **Robust Evaluation**: Each task instance includes at least one fail-to-pass test to verify the solution, with many instances having multiple such tests. Additionally, a median of 51 other tests are run to ensure that prior functionality is maintained.\n",
"\n",
"5. **Realistic Setting**: Unlike traditional benchmarks that involve short and contrived problems, SWE-bench uses user-submitted issues and solutions from popular GitHub repositories, providing a more realistic and challenging environment for evaluation.\n",
"\n",
"6. **Execution-based Evaluation**: The benchmark uses the repositorys testing framework to evaluate the revised codebase, ensuring that the proposed solutions are practical and effective.\n",
"\n",
"These features collectively make SWE-bench a comprehensive and challenging benchmark for evaluating the capabilities of language models in real-world software engineering tasks.\n"
]
}
],
"source": [
"print(str(response))"
]
},
{
"cell_type": "markdown",
"id": "eaeef466-a95e-4211-af53-532236a7ea36",
"metadata": {},
"source": [
"### Analyze Results\n",
"\n",
"**NOTE**: Since we know the ground truth, we can tell that this answer isn't quite complete.\n",
"Instead of using the notebook, we can also quickly validate this through the chat UI and retrieval UI in the \"Playground\" section of the index page.\n",
"Open the LlamaCloud playground, enter the question above in the chat UI, and look at the response and the set of retrieved nodes.\n",
"\n",
"![](chunk_size_adhoc_images/chat_ui_test.png)\n",
"\n",
"Now enter the same question into the retrieval UI, which shows not only the chunks but also the source document for each chunk.\n",
"\n",
"![](chunk_size_adhoc_images/retrieval_ui_test.png)\n",
"\n",
"Clicking \"View in File\" on the first chunk shows how the source document is parsed and chunked. Since we know the ground truth, we can check whether the ground-truth context is chunked cohesively. In this case, the relevant section is cut off, and Node 21 is not in the retrieved set at all.\n",
"\n",
"![](chunk_size_adhoc_images/view_chunks.png)\n",
"\n",
"You'll notice that the relevant paragraph is broken up into two chunks."
]
},
{
"cell_type": "markdown",
"id": "4b5a8b94-27b5-46bd-b066-f21d1146d02b",
"metadata": {},
"source": [
"## Experimenting with Chunk Sizes\n",
"\n",
"You can tackle the above issue in a variety of ways. For instance, you can keep the chunk size fixed and tune only retrieval parameters such as top-k, hybrid search, and reranking. Tuning retrieval alone has tradeoffs, though: increasing top-k, for example, can increase latency and cost.\n",
"\n",
"Chunk sizes are a little harder to experiment with than retrieval parameters, since changing them requires retriggering an index run.\n",
"\n",
"With LlamaCloud, we can easily create a new index with a different chunking configuration and see if the retrieved results change.\n",
"\n",
"First, on the Index page, click the \"Copy\" button to duplicate the index, and give it a new name.\n",
"\n",
"Click into the new index, rename it as you wish, then click \"Edit\" and change the chunking configuration to page-level chunking:\n",
"1. Click into \"Manual\".\n",
"2. Click \"Page\" segmentation in \"Segmentation Configuration\" to segment by page at the top level.\n",
"3. In \"Chunking Configuration\", select None for the mode.\n",
"\n",
"![](chunk_size_adhoc_images/transform_config.png)\n",
"\n",
"Click \"Save\" to apply the new index settings and retrigger a run of the pipeline."
]
},
{
"cell_type": "markdown",
"id": "df214f58-b010-452b-8787-3b985b32e42c",
"metadata": {},
"source": [
"Let's try the same query against the new index."
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "815d5f40-ae6d-4944-a243-fd99d0b05ec7",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.indices.managed.llama_cloud import LlamaCloudIndex\n",
"\n",
"index = LlamaCloudIndex(\n",
" name=\"research_papers_page\", \n",
" project_name=\"llamacloud_demo\",\n",
" # api_key=\"llx-\"\n",
")\n",
"query_engine = index.as_query_engine(rerank_top_n=3)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "71676364-ef94-4a15-bb6d-ddc69a7821a7",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"RETRIEVING: Tell me about the core features of SWE-bench\n"
]
}
],
"source": [
"response = query_engine.query(\"Tell me about the core features of SWE-bench\")"
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "8331e3d1-b537-49ca-a57b-bcab179a4e46",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"SWE-bench is a benchmark designed to evaluate language models (LMs) in realistic software engineering settings. It features several core attributes:\n",
"\n",
"1. **Real-world Software Engineering Tasks**: It involves large and complex codebases with detailed issue descriptions, requiring sophisticated skills and knowledge akin to those of experienced software engineers.\n",
"\n",
"2. **Continually Updatable**: The collection process can be applied to any Python repository on GitHub with minimal human intervention, allowing for a continual supply of new task instances.\n",
"\n",
"3. **Diverse Long Inputs**: Issue descriptions are typically long and detailed, and codebases contain many thousands of files, necessitating the identification of specific lines that need editing.\n",
"\n",
"4. **Robust Evaluation**: Each task instance includes at least one fail-to-pass test to ensure the model addresses the problem, with additional tests to check for proper maintenance of prior functionality.\n",
"\n",
"5. **Cross-context Code Editing**: Unlike benchmarks that constrain edit scope, SWE-bench requires generating revisions in multiple locations of a large codebase, challenging models to handle repository-scale code editing.\n",
"\n",
"6. **Wide Scope for Possible Solutions**: It allows for creative freedom in generating novel solutions, providing a level playing field to compare various approaches, from retrieval and long-context models to decision-making agents.\n"
]
}
],
"source": [
"print(str(response))"
]
},
{
"cell_type": "markdown",
"id": "b8f2ccf3-a456-4c5f-a75c-db1b151e798b",
"metadata": {},
"source": [
"**Result**: It turns out that page-level chunking helps return the main result. This is not unexpected, since page-level chunking preserves context across an entire page.\n",
"\n",
"## Next Steps\n",
"\n",
"If you are aiming for development velocity, you can keep page-level chunking as a reasonable baseline and build something that \"just works\". If you want to iteratively improve chunking further, consider running the two LlamaCloud indexes you've defined over a more structured dataset and evaluating the results."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "273ef8e0-1280-4647-b2b8-21819fbc0bf7",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "llama_index_v3",
"language": "python",
"name": "llama_index_v3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.8"
}
],
"source": [
"print(str(response))"
]
},
{
"cell_type": "markdown",
"id": "b8f2ccf3-a456-4c5f-a75c-db1b151e798b",
"metadata": {},
"source": [
"**Result**: It turns out that page-level chunking helps return the main result. This is not unexpected, since page-level chunking preserves context across an entire page.\n",
"\n",
"## Next Steps\n",
"\n",
"If you are aiming for development velocity, you can keep page-level chunking as a reasonable baseline and build something that \"just works\". If you want to iteratively improve chunking further, consider running the two LlamaCloud indexes you've defined over a more structured dataset and evaluating the results."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "273ef8e0-1280-4647-b2b8-21819fbc0bf7",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "llama_index_v3",
"language": "python",
"name": "llama_index_v3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.8"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
"nbformat": 4,
"nbformat_minor": 5
}
File diff suppressed because it is too large Load Diff
File diff suppressed because one or more lines are too long
+489 -482
View File
@@ -1,491 +1,498 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "99cea58c-48bc-4af6-8358-df9695659983",
"metadata": {},
"source": [
"# Getting Started: Building Agents over LlamaCloud RAG Pipelines\n",
"\n",
"<a href=\"https://colab.research.google.com/github/run-llama/llamacloud-demo/blob/main/examples/getting_started_agent.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
"\n",
"In this notebook we show you how to build a function-calling agent (powered by OpenAI) over RAG pipelines built with LlamaCloud.\n",
"\n",
"Adding an agentic layer to RAG gives you query planning and state management, so you can ask complex, multi-part questions over existing RAG pipelines and get answers back in a conversational manner.\n"
]
},
{
"cell_type": "markdown",
"id": "54b7bc2e-606f-411a-9490-fcfab9236dfc",
"metadata": {},
"source": [
"## Initial Setup "
]
},
{
"cell_type": "markdown",
"id": "23e80e5b-aaee-4f23-b338-7ae62b08141f",
"metadata": {},
"source": [
"Let's start by importing some simple building blocks. \n",
"\n",
"The main thing we need is:\n",
"1. the OpenAI API (using our own `llama_index` LLM class)\n",
"2. a place to keep conversation history \n",
"3. a definition for tools that our agent can use."
]
},
{
"cell_type": "markdown",
"id": "41101795",
"metadata": {},
"source": [
"If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4985c578",
"metadata": {},
"outputs": [],
"source": [
"%pip install -U llama-index-indices-managed-llama-cloud\n",
"%pip install -U llama-index\n",
"%pip install -U llama-index-core"
]
},
{
"cell_type": "code",
"execution_count": 39,
"id": "9d47283b-025e-4874-88ed-76245b22f82e",
"metadata": {},
"outputs": [],
"source": [
"import nest_asyncio\n",
"nest_asyncio.apply()\n",
"\n",
"from IPython.display import Markdown, display"
]
},
{
"cell_type": "markdown",
"id": "eeac7d4c-58fd-42a5-9da9-c258375c61a0",
"metadata": {},
"source": [
"Make sure your OPENAI_API_KEY is set. Otherwise explicitly specify the `api_key` parameter."
]
},
{
"cell_type": "code",
"execution_count": 40,
"id": "4becf171-6632-42e5-bdec-918a00934696",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.llms.openai import OpenAI\n",
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"<openai_api_key>\"\n",
"\n",
"llm = OpenAI(model=\"gpt-4.1\")"
]
},
{
"cell_type": "markdown",
"id": "51dff7d4-07cf-472f-bb35-e231c5874f1b",
"metadata": {},
"source": [
"## Build Two LlamaCloud Indexes\n",
"\n",
"Our data sources are the 2021 Lyft and Uber 10K documents.\n",
"\n",
"In contrast to the other getting started examples, in this notebook we will build **two** RAG pipelines: one for Uber and one for Lyft. This is for the sake of example; we can plug in both RAG pipelines as tools for the agent to reason over.\n",
"\n",
"To create each index, follow the instructions:\n",
"1. You can download them here ([Uber 10K](https://www.dropbox.com/s/te0a2w227v27iag/uber_2021.pdf?dl=1), [Lyft 10K](https://www.dropbox.com/s/qctkz6nxhm0y5qe/lyft_2021.pdf?dl=1))\n",
"2. Follow instructions on `https://cloud.llamaindex.ai/` to signup for an account. Create a pipeline by uploading these documents."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3aac2202-5346-4fe5-a0b5-cbac64003fbc",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.indices.managed.llama_cloud import LlamaCloudIndex\n",
"\n",
"index_uber = LlamaCloudIndex(\n",
" name=\"<index_uber>\", \n",
" api_key=\"<api_key>\"\n",
")\n",
"index_lyft = LlamaCloudIndex(\n",
" name=\"<index_lyft>\", \n",
" api_key=\"<api_key>\"\n",
")"
]
},
{
"cell_type": "markdown",
"id": "a747e287-16e0-4ca6-8580-3d2e1a0b6e6c",
"metadata": {},
"source": [
"For each index, get a query engine from the index, which gives us an out-of-the-box RAG pipeline."
]
},
{
"cell_type": "code",
"execution_count": 42,
"id": "7c352324-8112-43f1-ad97-d02e581bf282",
"metadata": {},
"outputs": [],
"source": [
"query_engine_uber = index_uber.as_query_engine(llm=llm)\n",
"query_engine_lyft = index_lyft.as_query_engine(llm=llm)"
]
},
{
"cell_type": "markdown",
"id": "cabfdf01-8d63-43ff-b06e-a3059ede2ddf",
"metadata": {},
"source": [
"## OpenAI Agent over LlamaCloud RAG Pipelines\n",
"\n",
"We convert both query engines to tools and pass it to a function calling OpenAI agent."
]
},
{
"cell_type": "code",
"execution_count": 43,
"id": "48c0cf98-3f10-4599-8437-d88dc89cefad",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.core.tools import QueryEngineTool, ToolMetadata\n",
"\n",
"tool_uber = QueryEngineTool(\n",
" query_engine=query_engine_uber,\n",
" metadata=ToolMetadata(\n",
" name=\"uber_10k\",\n",
" description=(\n",
" \"Provides information about Uber financials for year 2021. \"\n",
" \"Use a detailed plain text question as input to the tool.\"\n",
" ),\n",
" ),\n",
")\n",
"tool_lyft = QueryEngineTool(\n",
" query_engine=query_engine_lyft,\n",
" metadata=ToolMetadata(\n",
" name=\"lyft_10k\",\n",
" description=(\n",
" \"Provides information about Lyft financials for year 2021. \"\n",
" \"Use a detailed plain text question as input to the tool.\"\n",
" ),\n",
" ),\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 44,
"id": "ebfdaf80-e5e1-4c60-b556-20558da3d5e3",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.core.agent.workflow import FunctionAgent\n",
"\n",
"agent = FunctionAgent(\n",
" tools=[tool_uber, tool_lyft],\n",
" llm=llm,\n",
" system_prompt=\"You are a helpful assistant that can answer questions about Uber and Lyft.\",\n",
" verbose=True,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 45,
"id": "58c53f2a-0a3f-4abe-b8b6-97a974ec7546",
"metadata": {},
"outputs": [
"cells": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Running step init_run\n",
"Step init_run produced event AgentInput\n",
"Running step setup_agent\n",
"Step setup_agent produced event AgentSetup\n",
"Running step run_agent_step\n",
"Step run_agent_step produced event AgentOutput\n",
"Running step parse_agent_output\n",
"Step parse_agent_output produced no event\n",
"Running step call_tool\n",
"Running step call_tool\n",
"Step call_tool produced event ToolCallResult\n",
"Running step aggregate_tool_results\n",
"Step aggregate_tool_results produced no event\n",
"Step call_tool produced event ToolCallResult\n",
"Running step aggregate_tool_results\n",
"Step aggregate_tool_results produced event AgentInput\n",
"Running step setup_agent\n",
"Step setup_agent produced event AgentSetup\n",
"Running step run_agent_step\n",
"Step run_agent_step produced event AgentOutput\n",
"Running step parse_agent_output\n",
"Step parse_agent_output produced event StopEvent\n"
]
}
],
"source": [
"response = await agent.run(user_msg=\"Tell me both the tailwinds for Uber and Lyft?\")"
]
},
{
"cell_type": "code",
"execution_count": 46,
"id": "cb682e18-2538-4da7-9bed-5c585d971735",
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"Here are the main tailwinds for both Uber and Lyft in 2021:\n",
"\n",
"Uber:\n",
"- Significant increase in Gross Bookings (up 56% year-over-year), driven by strong growth in both Delivery and Mobility segments.\n",
"- Delivery segment saw substantial growth due to increased food delivery orders, higher basket sizes (influenced by stay-at-home demand from COVID-19), and expansion in U.S. and international markets.\n",
"- Mobility segment rebounded as trip volumes recovered from pandemic lows.\n",
"- Revenue increased by 57% year-over-year, with additional contributions from the Freight segment (especially after acquiring Transplace).\n",
"- Operational improvements, such as reduced fixed costs and increased efficiency, led to a significant reduction in net loss.\n",
"- Gains from the sale of the autonomous vehicle business and positive impacts from equity investments.\n",
"- Expansion of offerings through acquisitions (e.g., Cornershop and Drizly) added grocery and alcohol delivery capabilities.\n",
"- Growth in Monthly Active Platform Consumers and trip volumes.\n",
"\n",
"Lyft:\n",
"- Widespread COVID-19 vaccine distribution and community reopenings led to a strong recovery in revenue and active riders.\n",
"- Revenue increased by 36% compared to the prior year.\n",
"- Active riders grew by 49.2% in Q4 2021 compared to Q4 2020.\n",
"- Achieved first annual Adjusted EBITDA profitability in 2021.\n",
"- Maintained a strong liquidity position ($2.3 billion in unrestricted cash, cash equivalents, and short-term investments at year-end).\n",
"- These factors supported Lyfts ongoing recovery and expansion efforts.\n",
"\n",
"In summary, both companies benefited from pandemic recovery, increased demand, and operational improvements, with Uber also seeing strong growth in its Delivery and Freight segments, and Lyft achieving profitability and maintaining strong liquidity."
],
"text/plain": [
"<IPython.core.display.Markdown object>"
"cell_type": "markdown",
"metadata": {},
"source": [
"# ⚠️ Important Notice\n\nThis notebook (and repository) is deprecated.\n\nFor the latest python examples, please refer to the `llama-cloud-services` repository examples: \nhttps://github.com/run-llama/llama_cloud_services/tree/main/examples\n\n---"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"display(Markdown(f\"{response}\"))"
]
},
{
"cell_type": "code",
"execution_count": 47,
"id": "1b3b5915",
"metadata": {},
"outputs": [
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Running step init_run\n",
"Step init_run produced event AgentInput\n",
"Running step setup_agent\n",
"Step setup_agent produced event AgentSetup\n",
"Running step run_agent_step\n",
"Step run_agent_step produced event AgentOutput\n",
"Running step parse_agent_output\n",
"Step parse_agent_output produced no event\n",
"Running step call_tool\n",
"Step call_tool produced event ToolCallResult\n",
"Running step aggregate_tool_results\n",
"Step aggregate_tool_results produced event AgentInput\n",
"Running step setup_agent\n",
"Step setup_agent produced event AgentSetup\n",
"Running step run_agent_step\n",
"Step run_agent_step produced event AgentOutput\n",
"Running step parse_agent_output\n",
"Step parse_agent_output produced event StopEvent\n"
]
}
],
"source": [
"response = await agent.run(user_msg=\"What are the investments made by Uber?\")"
]
},
{
"cell_type": "code",
"execution_count": 48,
"id": "305abd8d",
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"In 2021, Uber made several notable investments and acquisitions, as detailed in their annual report:\n",
"\n",
"**Acquisitions:**\n",
"- Transplace: Uber acquired 100% of Transplace, a major logistics and transportation management provider, to expand its Uber Freight business.\n",
"- Cornershop: Uber completed the acquisition of the remaining stake in Cornershop, a grocery delivery platform, making it a wholly owned subsidiary.\n",
"- Drizly: Uber acquired Drizly, an on-demand alcohol marketplace, to enhance its Delivery segment.\n",
"\n",
"**Equity Investments and Stakes:**\n",
"- Aurora: Uber sold its autonomous vehicle unit (ATG) to Aurora Innovation, receiving Aurora stock in return. After Auroras public listing, Ubers stake was valued at $3.4 billion at year-end.\n",
"- Moove: Uber invested in Moove, a Spanish vehicle fleet operator, acquiring a 30% minority interest and providing a $213 million loan, with an option to buy more shares.\n",
"- Didi: Ubers investment in Didi became publicly traded shares after Didis IPO.\n",
"- Zomato: Ubers investment in Zomato also became publicly traded after Zomatos IPO, valued at $1.1 billion at year-end.\n",
"\n",
"**Divestitures and Related Investments:**\n",
"- Yandex Joint Ventures: Uber sold its stakes in Yandexs self-driving and delivery businesses, as well as part of its interest in MLU B.V., to Yandex.\n",
"- ATG Business: Uber divested its ATG business to Aurora, receiving equity in Aurora as consideration.\n",
"\n",
"**Other Financial Developments:**\n",
"- Uber issued $1.5 billion in senior notes in 2021.\n",
"- Ubers investments in Grab and Zomato resulted in significant unrealized gains, while its Didi investment saw a substantial unrealized loss.\n",
"\n",
"These investments reflect Ubers strategy to expand its logistics, delivery, and mobility services, while also restructuring and monetizing its stakes in global technology and mobility companies."
],
"text/plain": [
"<IPython.core.display.Markdown object>"
"cell_type": "markdown",
"id": "99cea58c-48bc-4af6-8358-df9695659983",
"metadata": {},
"source": [
"# Getting Started: Building Agents over LlamaCloud RAG Pipelines\n",
"\n",
"<a href=\"https://colab.research.google.com/github/run-llama/llamacloud-demo/blob/main/examples/getting_started_agent.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
"\n",
"In this notebook we show you how to build a function calling agent (powered by OpenAI) over RAG pipelines built with LlamaCloud.\n",
"\n",
"Adding an agentic layer to RAG allows you to build in a layer of query planning and state management that allows you to ask multi-part complex questions over existing RAG pipelines and get back answers in a conversational manner.\n"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"display(Markdown(f\"{response}\"))"
]
},
{
"cell_type": "code",
"execution_count": 49,
"id": "d45fd026",
"metadata": {},
"outputs": [
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Running step init_run\n",
"Step init_run produced event AgentInput\n",
"Running step setup_agent\n",
"Step setup_agent produced event AgentSetup\n",
"Running step run_agent_step\n",
"Step run_agent_step produced event AgentOutput\n",
"Running step parse_agent_output\n",
"Step parse_agent_output produced no event\n",
"Running step call_tool\n",
"Running step call_tool\n",
"Step call_tool produced event ToolCallResult\n",
"Running step aggregate_tool_results\n",
"Step aggregate_tool_results produced no event\n",
"Step call_tool produced event ToolCallResult\n",
"Running step aggregate_tool_results\n",
"Step aggregate_tool_results produced event AgentInput\n",
"Running step setup_agent\n",
"Step setup_agent produced event AgentSetup\n",
"Running step run_agent_step\n",
"Step run_agent_step produced event AgentOutput\n",
"Running step parse_agent_output\n",
"Step parse_agent_output produced event StopEvent\n"
]
}
],
"source": [
"response = await agent.run(user_msg=\"Compare the investments made by Uber and lyft?\")"
]
},
{
"cell_type": "code",
"execution_count": 50,
"id": "69a165e2",
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"Here is a comparison of the investments made by Uber and Lyft in 2021:\n",
"\n",
"Uber\n",
"\n",
"- Acquisitions: Uber acquired Cornershop (grocery delivery), Drizly (alcohol delivery), and Transplace (logistics), and consolidated CS-Mexico.\n",
"- Divestitures & Equity Investments: Sold its autonomous vehicle unit (ATG) to Aurora Innovation in exchange for equity, and restructured/sold interests in Yandex-related ventures.\n",
"- Technology & Partnerships: Invested in Moove (vehicle fleet operator in Spain), with a minority equity stake, a $213 million loan, and a commercial partnership. Continued capital expenditures on technology, including assets from Postmates, Transplace, Drizly, and Cornershop.\n",
"- Capital Expenditures: Issued $1.5 billion in senior notes for additional capital, and entered a reinsurance agreement for legacy auto insurance liabilities, resulting in a $1 billion cash inflow.\n",
"- Strategy: Focused on expanding core businesses, enhancing technology, and strengthening market position through acquisitions and partnerships.\n",
"\n",
"Lyft\n",
"\n",
"- Acquisitions & Technology: Invested heavily in R&D, completed strategic acquisitions, and expanded its network of shared bikes and scooters. Developed third-party self-driving technology through Lyft Autonomous.\n",
"- Divestitures: Sold its Level 5 self-driving vehicle division, shifting to licensing and data access agreements with autonomous vehicle companies.\n",
"- Partnerships & Innovation: Invested in sales/marketing, expanded Lyft Rentals, and supported the Express Drive vehicle leasing program. Received a $64 million non-marketable equity security as part of a licensing agreement.\n",
"- Capital Expenditures: Included real estate leases (81 locations), finance leases for Flexdrive, and other long-term commitments.\n",
"- Sustainability: Invested in environmental initiatives, aiming for 100% electric vehicles by 2030 and compliance with Californias Clean Miles Standard.\n",
"- Strategy: Focused on technological innovation, multimodal transportation, strategic partnerships, operational infrastructure, and sustainability.\n",
"\n",
"Summary\n",
"\n",
"- Ubers investments were more acquisition-heavy, targeting expansion in delivery, logistics, and international markets, with significant capital raised and deployed for these purposes.\n",
"- Lyfts investments focused on technology, multimodal transportation (bikes, scooters, rentals), partnerships, and sustainability, with fewer large acquisitions but significant R&D and operational investments.\n",
"\n",
"Both companies made strategic divestitures in autonomous vehicle technology, but Uber retained equity in Aurora, while Lyft shifted to a partnership/licensing model. Ubers investments were broader in scope and scale, while Lyfts were more focused on platform innovation and sustainability."
],
"text/plain": [
"<IPython.core.display.Markdown object>"
"cell_type": "markdown",
"id": "54b7bc2e-606f-411a-9490-fcfab9236dfc",
"metadata": {},
"source": [
"## Initial Setup "
]
},
"metadata": {},
"output_type": "display_data"
},
{
"cell_type": "markdown",
"id": "23e80e5b-aaee-4f23-b338-7ae62b08141f",
"metadata": {},
"source": [
"Let's start by importing some simple building blocks. \n",
"\n",
"The main thing we need is:\n",
"1. the OpenAI API (using our own `llama_index` LLM class)\n",
"2. a place to keep conversation history \n",
"3. a definition for tools that our agent can use."
]
},
{
"cell_type": "markdown",
"id": "41101795",
"metadata": {},
"source": [
"If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "4985c578",
"metadata": {},
"outputs": [],
"source": [
"%pip install -U llama-index-indices-managed-llama-cloud\n",
"%pip install -U llama-index\n",
"%pip install -U llama-index-core"
]
},
{
"cell_type": "code",
"execution_count": 39,
"id": "9d47283b-025e-4874-88ed-76245b22f82e",
"metadata": {},
"outputs": [],
"source": [
"import nest_asyncio\n",
"nest_asyncio.apply()\n",
"\n",
"from IPython.display import Markdown, display"
]
},
{
"cell_type": "markdown",
"id": "eeac7d4c-58fd-42a5-9da9-c258375c61a0",
"metadata": {},
"source": [
"Make sure your OPENAI_API_KEY is set. Otherwise explicitly specify the `api_key` parameter."
]
},
{
"cell_type": "code",
"execution_count": 40,
"id": "4becf171-6632-42e5-bdec-918a00934696",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.llms.openai import OpenAI\n",
"import os\n",
"\n",
"os.environ[\"OPENAI_API_KEY\"] = \"<openai_api_key>\"\n",
"\n",
"llm = OpenAI(model=\"gpt-4.1\")"
]
},
{
"cell_type": "markdown",
"id": "51dff7d4-07cf-472f-bb35-e231c5874f1b",
"metadata": {},
"source": [
"## Build Two LlamaCloud Indexes\n",
"\n",
"Our data sources are the 2021 Lyft and Uber 10K documents.\n",
"\n",
"In contrast to the other getting started examples, in this notebook we will build **two** RAG pipelines: one for Uber and one for Lyft. This is for the sake of example; we can plug in both RAG pipelines as tools for the agent to reason over.\n",
"\n",
"To create each index, follow the instructions:\n",
"1. You can download them here ([Uber 10K](https://www.dropbox.com/s/te0a2w227v27iag/uber_2021.pdf?dl=1), [Lyft 10K](https://www.dropbox.com/s/qctkz6nxhm0y5qe/lyft_2021.pdf?dl=1))\n",
"2. Follow instructions on `https://cloud.llamaindex.ai/` to signup for an account. Create a pipeline by uploading these documents."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3aac2202-5346-4fe5-a0b5-cbac64003fbc",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.indices.managed.llama_cloud import LlamaCloudIndex\n",
"\n",
"index_uber = LlamaCloudIndex(\n",
" name=\"<index_uber>\", \n",
" api_key=\"<api_key>\"\n",
")\n",
"index_lyft = LlamaCloudIndex(\n",
" name=\"<index_lyft>\", \n",
" api_key=\"<api_key>\"\n",
")"
]
},
{
"cell_type": "markdown",
"id": "a747e287-16e0-4ca6-8580-3d2e1a0b6e6c",
"metadata": {},
"source": [
"For each index, get a query engine from the index, which gives us an out-of-the-box RAG pipeline."
]
},
{
"cell_type": "code",
"execution_count": 42,
"id": "7c352324-8112-43f1-ad97-d02e581bf282",
"metadata": {},
"outputs": [],
"source": [
"query_engine_uber = index_uber.as_query_engine(llm=llm)\n",
"query_engine_lyft = index_lyft.as_query_engine(llm=llm)"
]
},
{
"cell_type": "markdown",
"id": "cabfdf01-8d63-43ff-b06e-a3059ede2ddf",
"metadata": {},
"source": [
"## OpenAI Agent over LlamaCloud RAG Pipelines\n",
"\n",
"We convert both query engines to tools and pass it to a function calling OpenAI agent."
]
},
{
"cell_type": "code",
"execution_count": 43,
"id": "48c0cf98-3f10-4599-8437-d88dc89cefad",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.core.tools import QueryEngineTool, ToolMetadata\n",
"\n",
"tool_uber = QueryEngineTool(\n",
" query_engine=query_engine_uber,\n",
" metadata=ToolMetadata(\n",
" name=\"uber_10k\",\n",
" description=(\n",
" \"Provides information about Uber financials for year 2021. \"\n",
" \"Use a detailed plain text question as input to the tool.\"\n",
" ),\n",
" ),\n",
")\n",
"tool_lyft = QueryEngineTool(\n",
" query_engine=query_engine_lyft,\n",
" metadata=ToolMetadata(\n",
" name=\"lyft_10k\",\n",
" description=(\n",
" \"Provides information about Lyft financials for year 2021. \"\n",
" \"Use a detailed plain text question as input to the tool.\"\n",
" ),\n",
" ),\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 44,
"id": "ebfdaf80-e5e1-4c60-b556-20558da3d5e3",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.core.agent.workflow import FunctionAgent\n",
"\n",
"agent = FunctionAgent(\n",
" tools=[tool_uber, tool_lyft],\n",
" llm=llm,\n",
" system_prompt=\"You are a helpful assistant that can answer questions about Uber and Lyft.\",\n",
" verbose=True,\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 45,
"id": "58c53f2a-0a3f-4abe-b8b6-97a974ec7546",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Running step init_run\n",
"Step init_run produced event AgentInput\n",
"Running step setup_agent\n",
"Step setup_agent produced event AgentSetup\n",
"Running step run_agent_step\n",
"Step run_agent_step produced event AgentOutput\n",
"Running step parse_agent_output\n",
"Step parse_agent_output produced no event\n",
"Running step call_tool\n",
"Running step call_tool\n",
"Step call_tool produced event ToolCallResult\n",
"Running step aggregate_tool_results\n",
"Step aggregate_tool_results produced no event\n",
"Step call_tool produced event ToolCallResult\n",
"Running step aggregate_tool_results\n",
"Step aggregate_tool_results produced event AgentInput\n",
"Running step setup_agent\n",
"Step setup_agent produced event AgentSetup\n",
"Running step run_agent_step\n",
"Step run_agent_step produced event AgentOutput\n",
"Running step parse_agent_output\n",
"Step parse_agent_output produced event StopEvent\n"
]
}
],
"source": [
"response = await agent.run(user_msg=\"Tell me both the tailwinds for Uber and Lyft?\")"
]
},
{
"cell_type": "code",
"execution_count": 46,
"id": "cb682e18-2538-4da7-9bed-5c585d971735",
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"Here are the main tailwinds for both Uber and Lyft in 2021:\n",
"\n",
"Uber:\n",
"- Significant increase in Gross Bookings (up 56% year-over-year), driven by strong growth in both Delivery and Mobility segments.\n",
"- Delivery segment saw substantial growth due to increased food delivery orders, higher basket sizes (influenced by stay-at-home demand from COVID-19), and expansion in U.S. and international markets.\n",
"- Mobility segment rebounded as trip volumes recovered from pandemic lows.\n",
"- Revenue increased by 57% year-over-year, with additional contributions from the Freight segment (especially after acquiring Transplace).\n",
"- Operational improvements, such as reduced fixed costs and increased efficiency, led to a significant reduction in net loss.\n",
"- Gains from the sale of the autonomous vehicle business and positive impacts from equity investments.\n",
"- Expansion of offerings through acquisitions (e.g., Cornershop and Drizly) added grocery and alcohol delivery capabilities.\n",
"- Growth in Monthly Active Platform Consumers and trip volumes.\n",
"\n",
"Lyft:\n",
"- Widespread COVID-19 vaccine distribution and community reopenings led to a strong recovery in revenue and active riders.\n",
"- Revenue increased by 36% compared to the prior year.\n",
"- Active riders grew by 49.2% in Q4 2021 compared to Q4 2020.\n",
"- Achieved first annual Adjusted EBITDA profitability in 2021.\n",
"- Maintained a strong liquidity position ($2.3 billion in unrestricted cash, cash equivalents, and short-term investments at year-end).\n",
"- These factors supported Lyfts ongoing recovery and expansion efforts.\n",
"\n",
"In summary, both companies benefited from pandemic recovery, increased demand, and operational improvements, with Uber also seeing strong growth in its Delivery and Freight segments, and Lyft achieving profitability and maintaining strong liquidity."
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"display(Markdown(f\"{response}\"))"
]
},
{
"cell_type": "code",
"execution_count": 47,
"id": "1b3b5915",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Running step init_run\n",
"Step init_run produced event AgentInput\n",
"Running step setup_agent\n",
"Step setup_agent produced event AgentSetup\n",
"Running step run_agent_step\n",
"Step run_agent_step produced event AgentOutput\n",
"Running step parse_agent_output\n",
"Step parse_agent_output produced no event\n",
"Running step call_tool\n",
"Step call_tool produced event ToolCallResult\n",
"Running step aggregate_tool_results\n",
"Step aggregate_tool_results produced event AgentInput\n",
"Running step setup_agent\n",
"Step setup_agent produced event AgentSetup\n",
"Running step run_agent_step\n",
"Step run_agent_step produced event AgentOutput\n",
"Running step parse_agent_output\n",
"Step parse_agent_output produced event StopEvent\n"
]
}
],
"source": [
"response = await agent.run(user_msg=\"What are the investments made by Uber?\")"
]
},
{
"cell_type": "code",
"execution_count": 48,
"id": "305abd8d",
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"In 2021, Uber made several notable investments and acquisitions, as detailed in their annual report:\n",
"\n",
"**Acquisitions:**\n",
"- Transplace: Uber acquired 100% of Transplace, a major logistics and transportation management provider, to expand its Uber Freight business.\n",
"- Cornershop: Uber completed the acquisition of the remaining stake in Cornershop, a grocery delivery platform, making it a wholly owned subsidiary.\n",
"- Drizly: Uber acquired Drizly, an on-demand alcohol marketplace, to enhance its Delivery segment.\n",
"\n",
"**Equity Investments and Stakes:**\n",
"- Aurora: Uber sold its autonomous vehicle unit (ATG) to Aurora Innovation, receiving Aurora stock in return. After Auroras public listing, Ubers stake was valued at $3.4 billion at year-end.\n",
"- Moove: Uber invested in Moove, a Spanish vehicle fleet operator, acquiring a 30% minority interest and providing a $213 million loan, with an option to buy more shares.\n",
"- Didi: Ubers investment in Didi became publicly traded shares after Didis IPO.\n",
"- Zomato: Ubers investment in Zomato also became publicly traded after Zomatos IPO, valued at $1.1 billion at year-end.\n",
"\n",
"**Divestitures and Related Investments:**\n",
"- Yandex Joint Ventures: Uber sold its stakes in Yandexs self-driving and delivery businesses, as well as part of its interest in MLU B.V., to Yandex.\n",
"- ATG Business: Uber divested its ATG business to Aurora, receiving equity in Aurora as consideration.\n",
"\n",
"**Other Financial Developments:**\n",
"- Uber issued $1.5 billion in senior notes in 2021.\n",
"- Ubers investments in Grab and Zomato resulted in significant unrealized gains, while its Didi investment saw a substantial unrealized loss.\n",
"\n",
"These investments reflect Ubers strategy to expand its logistics, delivery, and mobility services, while also restructuring and monetizing its stakes in global technology and mobility companies."
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"display(Markdown(f\"{response}\"))"
]
},
{
"cell_type": "code",
"execution_count": 49,
"id": "d45fd026",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Running step init_run\n",
"Step init_run produced event AgentInput\n",
"Running step setup_agent\n",
"Step setup_agent produced event AgentSetup\n",
"Running step run_agent_step\n",
"Step run_agent_step produced event AgentOutput\n",
"Running step parse_agent_output\n",
"Step parse_agent_output produced no event\n",
"Running step call_tool\n",
"Running step call_tool\n",
"Step call_tool produced event ToolCallResult\n",
"Running step aggregate_tool_results\n",
"Step aggregate_tool_results produced no event\n",
"Step call_tool produced event ToolCallResult\n",
"Running step aggregate_tool_results\n",
"Step aggregate_tool_results produced event AgentInput\n",
"Running step setup_agent\n",
"Step setup_agent produced event AgentSetup\n",
"Running step run_agent_step\n",
"Step run_agent_step produced event AgentOutput\n",
"Running step parse_agent_output\n",
"Step parse_agent_output produced event StopEvent\n"
]
}
],
"source": [
"response = await agent.run(user_msg=\"Compare the investments made by Uber and lyft?\")"
]
},
{
"cell_type": "code",
"execution_count": 50,
"id": "69a165e2",
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"Here is a comparison of the investments made by Uber and Lyft in 2021:\n",
"\n",
"Uber\n",
"\n",
"- Acquisitions: Uber acquired Cornershop (grocery delivery), Drizly (alcohol delivery), and Transplace (logistics), and consolidated CS-Mexico.\n",
"- Divestitures & Equity Investments: Sold its autonomous vehicle unit (ATG) to Aurora Innovation in exchange for equity, and restructured/sold interests in Yandex-related ventures.\n",
"- Technology & Partnerships: Invested in Moove (vehicle fleet operator in Spain), with a minority equity stake, a $213 million loan, and a commercial partnership. Continued capital expenditures on technology, including assets from Postmates, Transplace, Drizly, and Cornershop.\n",
"- Capital Expenditures: Issued $1.5 billion in senior notes for additional capital, and entered a reinsurance agreement for legacy auto insurance liabilities, resulting in a $1 billion cash inflow.\n",
"- Strategy: Focused on expanding core businesses, enhancing technology, and strengthening market position through acquisitions and partnerships.\n",
"\n",
"Lyft\n",
"\n",
"- Acquisitions & Technology: Invested heavily in R&D, completed strategic acquisitions, and expanded its network of shared bikes and scooters. Developed third-party self-driving technology through Lyft Autonomous.\n",
"- Divestitures: Sold its Level 5 self-driving vehicle division, shifting to licensing and data access agreements with autonomous vehicle companies.\n",
"- Partnerships & Innovation: Invested in sales/marketing, expanded Lyft Rentals, and supported the Express Drive vehicle leasing program. Received a $64 million non-marketable equity security as part of a licensing agreement.\n",
"- Capital Expenditures: Included real estate leases (81 locations), finance leases for Flexdrive, and other long-term commitments.\n",
"- Sustainability: Invested in environmental initiatives, aiming for 100% electric vehicles by 2030 and compliance with Californias Clean Miles Standard.\n",
"- Strategy: Focused on technological innovation, multimodal transportation, strategic partnerships, operational infrastructure, and sustainability.\n",
"\n",
"Summary\n",
"\n",
"- Uber's investments were more acquisition-heavy, targeting expansion in delivery, logistics, and international markets, with significant capital raised and deployed for these purposes.\n",
"- Lyft's investments focused on technology, multimodal transportation (bikes, scooters, rentals), partnerships, and sustainability, with fewer large acquisitions but significant R&D and operational investments.\n",
"\n",
"Both companies made strategic divestitures in autonomous vehicle technology, but Uber retained equity in Aurora, while Lyft shifted to a partnership/licensing model. Uber's investments were broader in scope and scale, while Lyft's were more focused on platform innovation and sustainability."
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"display(Markdown(f\"{response}\"))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0ed1bdd1-de8b-43ed-8bcc-2bbe29d12c9c",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"id": "8cef4431-a73e-4377-8edd-af1df3540204",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.8"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
+359 -352
@@ -1,359 +1,366 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# ⚠️ Important Notice\n\nThis notebook (and repository) is deprecated.\n\nFor the latest Python examples, please refer to the `llama-cloud-services` repository examples: \nhttps://github.com/run-llama/llama_cloud_services/tree/main/examples\n\n---"
]
},
{
"cell_type": "markdown",
"id": "616a781c",
"metadata": {},
"source": [
"<a href=\"https://colab.research.google.com/github/run-llama/llamacloud-demo/blob/main/examples/getting_started_chat.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Hello! I'm here to help you with any questions or information you need. Is there anything specific you would like to know or talk about today?\n"
]
}
],
"source": [
"print(response)"
]
},
{
"cell_type": "markdown",
"id": "a65ad1a2",
"metadata": {},
"source": [
"## Streaming Support"
]
},
{
"cell_type": "code",
"execution_count": 28,
"id": "22605caa",
"metadata": {},
"outputs": [],
"source": [
"chat_engine = CondensePlusContextChatEngine.from_defaults(\n",
" index.as_retriever(),\n",
" chat_mode=\"condense_plus_context\",\n",
" context_prompt=(\n",
" \"You are a chatbot, able to have normal interactions, as well as talk\"\n",
" \" about financial reports for Uber and Lyft.\"\n",
" \"Here are the relevant documents for the context:\\n\"\n",
" \"{context_str}\"\n",
" \"\\nInstruction: Based on the above documents, provide a detailed answer for the user question below.\"\n",
" ),\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 29,
"id": "250abd43",
"metadata": {},
"outputs": [
"cell_type": "markdown",
"id": "18e20fbc-056b-44ac-b1fc-2d34b8e99bcc",
"metadata": {},
"source": [
"# Getting Started: Building a Chat Engine with LlamaCloud"
]
},
{
"cell_type": "markdown",
"id": "b99eea02-429c-40e4-99be-b82a89c8d070",
"metadata": {},
"source": [
"In this notebook, we show you how to build a multi-step chat engine on top of a LlamaCloud index built over your data.\n",
"\n",
"Our chat engines let you turn a RAG pipeline into a conversational chat interface. On each turn, we maintain the conversation history, use it to retrieve context, and synthesize a response over the relevant chat history."
]
},
{
"cell_type": "markdown",
"id": "34d34fcc-e247-4d55-ab16-c3d633e2385a",
"metadata": {},
"source": [
"**How the `CondensePlusContextChatEngine` works**\n",
"* First, condense the conversation and the latest user message into a standalone question.\n",
"* Then, retrieve context for the standalone question using a retriever.\n",
"* Finally, pass the context, along with the prompt and the user message, to the LLM to generate a response."
]
},
{
"cell_type": "markdown",
"id": "ca364545",
"metadata": {},
"source": [
"If you're opening this notebook on Colab, you will probably need to install LlamaIndex 🦙."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a46eb19f",
"metadata": {},
"outputs": [],
"source": [
"%pip install llama-index-llms-openai\n",
"%pip install llama-index\n",
"%pip install llama-index-indices-managed-llama-cloud"
]
},
{
"cell_type": "markdown",
"id": "1047e67c-b81b-4e12-8a02-6822454a5d49",
"metadata": {},
"source": [
"## Build LlamaCloud Index\n",
"\n",
"The LlamaCloud index is built over the 2021 Lyft and Uber 10K documents.\n",
"\n",
"To create the index, follow these instructions:\n",
"1. Download the documents here ([Uber 10K](https://www.dropbox.com/s/te0a2w227v27iag/uber_2021.pdf?dl=1), [Lyft 10K](https://www.dropbox.com/s/qctkz6nxhm0y5qe/lyft_2021.pdf?dl=1)).\n",
"2. Follow the instructions on `https://cloud.llamaindex.ai/` to sign up for an account. Create a pipeline by uploading these documents."
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "3aac2202-5346-4fe5-a0b5-cbac64003fbc",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.indices.managed.llama_cloud import LlamaCloudIndex\n",
"\n",
"index = LlamaCloudIndex(\n",
" name=\"<index_name>\", \n",
" project_name=\"<project_name>\",\n",
" api_key=\"llx-\"\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "7c352324-8112-43f1-ad97-d02e581bf282",
"metadata": {},
"outputs": [],
"source": [
"retriever = index.as_retriever()"
]
},
{
"cell_type": "markdown",
"id": "b314f279-bf7f-4e67-9f66-ebf783f08d38",
"metadata": {},
"source": [
"## Build Chat Engine"
]
},
{
"cell_type": "markdown",
"id": "40d3d9e4",
"metadata": {},
"source": [
"Define a chat engine wrapper around the defined LlamaCloud index.\n",
"\n",
"Since the retrieved context can take up a large share of the LLM's context window, let's configure a smaller token limit for the chat history."
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "164ef191-f86a-4ce1-aa9d-64d61f29dd45",
"metadata": {},
"outputs": [],
"source": [
"from llama_index.core.memory import ChatMemoryBuffer\n",
"from llama_index.core.chat_engine import CondensePlusContextChatEngine\n",
"from llama_index.core import Settings\n",
"from llama_index.llms.openai import OpenAI\n",
"\n",
"\n",
"llm = OpenAI(model=\"gpt-3.5-turbo\")\n",
"\n",
"memory = ChatMemoryBuffer.from_defaults(token_limit=3900)\n",
"\n",
"chat_engine = CondensePlusContextChatEngine.from_defaults(\n",
" index.as_retriever(),\n",
" memory=memory,\n",
" llm=llm,\n",
" context_prompt=(\n",
" \"You are a chatbot, able to have normal interactions, as well as talk\"\n",
" \" about financial reports for Uber and Lyft. \"\n",
" \"Here are the relevant documents for the context:\\n\"\n",
" \"{context_str}\"\n",
" \"\\nInstruction: Use the previous chat history, or the context above, to interact and help the user.\"\n",
" ),\n",
" verbose=False,\n",
")"
]
},
{
"cell_type": "markdown",
"id": "63a4259d-89b5-49f8-b158-9eba5353d6f5",
"metadata": {},
"source": [
"Chat with your data"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "825b5bb3-37ff-4886-be2c-264584ca9eab",
"metadata": {},
"outputs": [],
"source": [
"response = chat_engine.chat(\"What are the Monthly Active Platform Consumers in 2021?\")"
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "d8fa4310-4dc5-4787-a073-755d2e0b4887",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"3\n",
"In 2021, the Monthly Active Platform Consumers (MAPCs) for Uber were 118 million. This represents a 27% increase from the previous year. MAPCs are the number of unique consumers who completed a Mobility or New Mobility ride or received a Delivery order on Uber's platform at least once in a given month.\n"
]
}
],
"source": [
"print(len(response.source_nodes))\n",
"print(str(response))"
]
},
{
"cell_type": "markdown",
"id": "67021e64-8665-4338-9fb4-c0f1d6361092",
"metadata": {},
"source": [
"Ask a follow-up question"
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "f6181319-5d76-48c4-a5d4-23c6e9bc5ccb",
"metadata": {},
"outputs": [],
"source": [
"response_2 = chat_engine.chat(\"What about 2020?\")"
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "95045f5b-7964-4872-bc91-809d9debf1f5",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"In 2020, the Monthly Active Platform Consumers (MAPCs) for Uber were 93 million. This shows a 27% increase to 118 million in 2021. MAPCs are an important metric for Uber as they indicate the adoption of their platform and the frequency of transactions by consumers.\n"
]
}
],
"source": [
"print(response_2)"
]
},
{
"cell_type": "markdown",
"id": "c2c68de8-af58-4f7e-8759-19fc072873fd",
"metadata": {},
"source": [
"Reset conversation state"
]
},
{
"cell_type": "code",
"execution_count": 25,
"id": "d13cf082-1a91-43c5-8bad-76fa45be96f9",
"metadata": {},
"outputs": [],
"source": [
"chat_engine.reset()"
]
},
{
"cell_type": "code",
"execution_count": 26,
"id": "627de435-d195-4dad-b314-a68e731979a9",
"metadata": {},
"outputs": [],
"source": [
"response = chat_engine.chat(\"Hello! What do you know?\")"
]
},
{
"cell_type": "code",
"execution_count": 27,
"id": "75ef9e31-3cdb-4129-92f7-e61be201ea36",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Hello! I'm here to help you with any questions or information you need. Is there anything specific you would like to know or talk about today?\n"
]
}
],
"source": [
"print(response)"
]
},
{
"cell_type": "markdown",
"id": "a65ad1a2",
"metadata": {},
"source": [
"## Streaming Support"
]
},
{
"cell_type": "code",
"execution_count": 28,
"id": "22605caa",
"metadata": {},
"outputs": [],
"source": [
"chat_engine = CondensePlusContextChatEngine.from_defaults(\n",
" index.as_retriever(),\n",
" chat_mode=\"condense_plus_context\",\n",
" context_prompt=(\n",
" \"You are a chatbot, able to have normal interactions, as well as talk\"\n",
" \" about financial reports for Uber and Lyft. \"\n",
" \"Here are the relevant documents for the context:\\n\"\n",
" \"{context_str}\"\n",
" \"\\nInstruction: Based on the above documents, provide a detailed answer for the user question below.\"\n",
" ),\n",
")"
]
},
{
"cell_type": "code",
"execution_count": 29,
"id": "250abd43",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Based on the financial report provided for Lyft, here is a breakdown of the assets and liabilities for the company:\n",
"\n",
"Assets:\n",
"1. Receivables: The receivable balance primarily consists of amounts due from Enterprise Users and was $196.2 million, $104.7 million, and $120.0 million as of December 31, 2021, 2020, and 2019, respectively.\n",
"2. Allowance for Credit Losses: The allowance for credit losses, which reflects the Company's estimate of expected credit losses inherent in the enterprise and trade receivables balance, was $9.3 million, $15.2 million, and $6.2 million as of December 31, 2021, 2020, and 2019, respectively.\n",
"\n",
"Liabilities:\n",
"1. Prepaid Expenses and Other Current Assets: Uncollected fees owed for completed transactions on the Lyft Platform are included in prepaid expenses and other current assets on the consolidated balance sheets.\n",
"2. Accrued and Other Current Liabilities: The portion of the fare receivable to be remitted to drivers is included in accrued and other current liabilities on the consolidated balance sheets.\n",
"3. Trade Payables: The Company records an allowance for credit losses for fees owed for completed transactions that may never settle or be collected, which impacts the trade payables.\n",
"\n",
"These are some of the key assets and liabilities outlined in the financial report for Lyft."
]
}
],
"source": [
"response = chat_engine.stream_chat(\"What are the assets/liabilities for Lyft?\")\n",
"for token in response.response_gen:\n",
" print(token, end=\"\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f1cd349a-8a34-4bb5-aa8b-c57b9c0d0ddf",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.8"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
File diff suppressed because it is too large
+237 -230
@@ -1,236 +1,243 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# ⚠️ Important Notice\n\nThis notebook (and repository) is deprecated.\n\nFor the latest Python examples, please refer to the `llama-cloud-services` repository examples: \nhttps://github.com/run-llama/llama_cloud_services/tree/main/examples\n\n---"
]
},
{
"cell_type": "markdown",
"id": "03d9d807-02bc-447c-8890-787aac4c2b74",
"metadata": {
"tags": []
},
"source": [
"# Getting Started with LlamaCloud (Tesla 10K)\n",
"\n",
"<a href=\"https://colab.research.google.com/github/run-llama/llamacloud-demo/blob/main/examples/getting_started_tesla.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
"\n",
"This notebook shows you how to get started with LlamaCloud by building a simple RAG pipeline over the 2023 Tesla 10K filing, and then querying it."
]
},
{
"cell_type": "markdown",
"id": "c578d22f-53d0-471a-bc1d-bd9b428a3d03",
"metadata": {},
"source": [
"## Build RAG Pipeline from LlamaCloud Index\n",
"\n",
"The LlamaCloud index is built over the 2023 Tesla 10K filing.\n",
"\n",
"To create the index, follow these instructions:\n",
"1. Download the Tesla 10K here: https://ir.tesla.com/_flysystem/s3/sec/000162828024002390/tsla-20231231-gen.pdf\n",
"2. Follow the instructions on `https://cloud.llamaindex.ai/` to sign up for an account. Create a pipeline by uploading this document."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1500b200-f7c6-428c-ab7e-07902b3faa4e",
"metadata": {},
"outputs": [],
"source": [
"!pip install llama-index-indices-managed-llama-cloud"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "281ce14d-1bb9-43f3-a91c-15f83fb89d40",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"from llama_index.indices.managed.llama_cloud import LlamaCloudIndex\n",
"\n",
"index = LlamaCloudIndex(\n",
" name=\"<index_name>\", \n",
" project_name=\"<project_name>\",\n",
" api_key=\"llx-\"\n",
")"
]
},
{
"cell_type": "markdown",
"id": "ddebb2af-d59f-4df8-876b-fda2e8fda333",
"metadata": {},
"source": [
"## Try out an Example Query! \n",
"\n",
"Now we can try out an example query against the index.\n",
"\n",
"If you want an out-of-the-box query engine, call `index.as_query_engine()` (similar to our VectorStoreIndex).\n",
"\n",
"If you want a retriever that you can plug into a custom workflow, call `index.as_retriever()`."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "1f7e66d1-ccc7-4d29-86f6-77b9ccef0711",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# query = \"Tell me about the total cost of revenues in 2022\"\n",
"# query = \"Tell me about the R&D operating expenses in 2023\"\n",
"# query = \"Tell me revenues from energy generation and storage leasing in 2023\"\n",
"query = \"Tell me the solar energy systems in service in 2023 and 2022\" \n",
"nodes = index.as_retriever().retrieve(query)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "ffa60e6b-3362-466c-b895-a718d429e762",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"5\n"
]
}
],
"source": [
"print(len(nodes))"
]
},
{
"cell_type": "markdown",
"id": "28860fcd-b433-45de-aea2-6a0d6b8c295c",
"metadata": {},
"source": [
"The top-retrieved node corresponds to an extracted table. LlamaCloud automatically extracts a table summary and formats the table as clean Markdown."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "805f9860-92fa-4994-8278-a4c845c456c7",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"table_df: {' ': {0: 'Solar energy systems in service', 1: 'Initial direct costs related to customer solar energy system lease acquisition costs', 2: 'Total', 3: 'Less: accumulated depreciation and amortization (1)', 4: 'Solar energy systems under construction', 5: 'Solar energy systems pending interconnection', 6: 'Solar energy systems, net (2)'}, 'December 31, 2023': {0: '$6,755', 1: '$104', 2: '$6,859', 3: '($1,643)', 4: '1', 5: '12', 6: '$5,229'}, 'December 31, 2022': {0: '$6,785', 1: '$104', 2: '$6,889', 3: '($1,418)', 4: '2', 5: '16', 6: '$5,489'}}\n",
"table_summary: Summary: The table provides information on solar energy systems, including those in service, under construction, pending interconnection, and the net value after depreciation and amortization.,\n",
"with the following table title:\n",
"Solar Energy Systems Overview,\n",
"with the following columns:\n",
"- December 31, 2023: None\n",
"- December 31, 2022: None\n",
"\n",
"file_name: tesla_2023 (2).pdf\n",
"file_path: tesla_2023 (2).pdf\n",
"file_type: application/pdf\n",
"file_size: 984581\n",
"llx_platform_pipeline_id: 205bb92d-2932-4034-b354-68ec83635762\n",
"llx_platform_loaded_file_id: 2b7e68a0-3e8d-40da-9185-f9a4bbbaa513\n",
"pipeline_id: 205bb92d-2932-4034-b354-68ec83635762\n",
"\n",
"Summary: The table provides information on **solar** **energy** **systems**, including those in **service**, under construction, pending interconnection, and the net value after depreciation and amortization.,\n",
"with the following table title:\n",
"**Solar** **Energy** **Systems** Overview,\n",
"with the following columns:\n",
"- December 31, **2023**: None\n",
"- December 31, **2022**: None\n",
"\n",
"| |December 31, **2023**|December 31, **2022**|\n",
"|---|---|---|\n",
"|**Solar** **energy** **systems** in **service**|$6,755|$6,785|\n",
"|Initial direct costs related to customer **solar** **energy** system lease acquisition costs|$104|$104|\n",
"|Total|$6,859|$6,889|\n",
"|Less: accumulated depreciation and amortization (1)|($1,643)|($1,418)|\n",
"|**Solar** **energy** **systems** under construction|1|2|\n",
"|**Solar** **energy** **systems** pending interconnection|12|16|\n",
"|**Solar** **energy** **systems**, net (2)|$5,229|$5,489|\n"
]
}
],
"source": [
"print(nodes[0].get_content(metadata_mode=\"all\"))"
]
},
{
"cell_type": "markdown",
"id": "0569c88f-95ee-49b6-b45e-3fbc4d1c0dca",
"metadata": {},
"source": [
"Now let's try plugging this into a query engine."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "390b9e89-fbe6-4ff8-a13e-d453190ad0a0",
"metadata": {},
"outputs": [],
"source": [
"response = index.as_query_engine().query(query)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "e3233279-8730-4f30-8f95-9a7a31ac8675",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The solar energy systems in service in 2023 were $6,755, and in 2022, they were $6,785.\n"
]
}
],
"source": [
"print(str(response))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "30b97e63-aa44-4c8a-9212-09be990dba32",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "llama_index_v3",
"language": "python",
"name": "llama_index_v3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.8"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
@@ -1,5 +1,12 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# ⚠️ Important Notice\n\nThis notebook (and repository) is deprecated.\n\nFor the latest Python examples, please refer to the `llama-cloud-services` repository examples: \nhttps://github.com/run-llama/llama_cloud_services/tree/main/examples\n\n---"
]
},
{
"cell_type": "markdown",
"metadata": {
@@ -957,4 +964,4 @@
},
"nbformat": 4,
"nbformat_minor": 0
}
@@ -1,5 +1,12 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# ⚠️ Important Notice\n\nThis notebook (and repository) is deprecated.\n\nFor the latest Python examples, please refer to the `llama-cloud-services` repository examples: \nhttps://github.com/run-llama/llama_cloud_services/tree/main/examples\n\n---"
]
},
{
"cell_type": "markdown",
"metadata": {
@@ -1379,4 +1386,4 @@
},
"nbformat": 4,
"nbformat_minor": 0
}