mirror of
https://github.com/run-llama/llamacloud-demo.git
synced 2026-07-01 01:37:54 -04:00
294 lines
8.2 KiB
Plaintext
294 lines
8.2 KiB
Plaintext
{
|
|
"cells": [
|
|
{
|
|
"cell_type": "markdown",
|
|
"metadata": {},
|
|
"source": [
|
|
"# ⚠️ Important Notice\n\nThis notebook (and repository) is deprecated.\n\nFor the latest python examples, please refer to the `llama-cloud-services` repository examples: \nhttps://github.com/run-llama/llama_cloud_services/tree/main/examples\n\n---"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "318d5a8d-7f6f-4fff-8c2d-cd0e290b6f49",
|
|
"metadata": {},
|
|
"source": [
|
|
"# Building and Evaluating a RAG Pipeline with LlamaCloud and our Batch Evaluator\n",
|
|
"\n",
|
|
"In this notebook we show you how to easily construct a RAG pipeline with a LlamaCloud Index, and then run evaluations against that index using our batch evaluator."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "1188feb6-bedf-40d5-8812-29b90431f8b5",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Setup\n",
|
|
"\n",
|
|
"Here we define some basic imports."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 1,
|
|
"id": "84324692-17f5-4584-8932-6687ab811b66",
|
|
"metadata": {
|
|
"tags": []
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"# attach to the same event-loop\n",
|
|
"import nest_asyncio\n",
|
|
"\n",
|
|
"nest_asyncio.apply()"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 2,
|
|
"id": "dc3f08bd-81ad-4b05-89cc-33409e3d1f71",
|
|
"metadata": {
|
|
"tags": []
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Response\n",
|
|
"from llama_index.llms.openai import OpenAI\n",
|
|
"from llama_index.core.evaluation import (\n",
|
|
" FaithfulnessEvaluator,\n",
|
|
" RelevancyEvaluator,\n",
|
|
" CorrectnessEvaluator,\n",
|
|
")\n",
|
|
"from llama_index.core.node_parser import SentenceSplitter\n",
|
|
"import pandas as pd\n",
|
|
"\n",
|
|
"pd.set_option(\"display.max_colwidth\", 0)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "95eb1f3a-b0bf-4588-8302-3799c520ffc3",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Build RAG Pipeline from LlamaCloud Index\n",
|
|
"\n",
|
|
"The LlamaCloud index is built over the 2021 Lyft and Uber 10K documents.\n",
|
|
"\n",
|
|
"To create the index, follow the instructions:\n",
|
|
"1. You can download them here ([Uber 10K](https://www.dropbox.com/s/te0a2w227v27iag/uber_2021.pdf?dl=1), [Lyft 10K](https://www.dropbox.com/s/qctkz6nxhm0y5qe/lyft_2021.pdf?dl=1))\n",
|
|
"2. Follow instructions on `https://cloud.llamaindex.ai/` to signup for an account. Create a pipeline by uploading these documents."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "e44c21cc-ad10-40b5-89ca-f0d95c2a6714",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"! pip install llama-index-indices-managed-llama-cloud"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "e83560c6",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"import os\n",
|
|
"\n",
|
|
"os.environ[\"LLAMA_CLOUD_API_KEY\"] = \"llx-\""
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 3,
|
|
"id": "c7bc2ff3-8c95-4de4-965f-1ca5bcf61f9b",
|
|
"metadata": {
|
|
"tags": []
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"from llama_index.indices.managed.llama_cloud import LlamaCloudIndex\n",
|
|
"\n",
|
|
"index = LlamaCloudIndex(\n",
|
|
" name=\"<index_name>\", \n",
|
|
" project_name=\"<project_name>\",\n",
|
|
" api_key=os.environ[\"LLAMA_CLOUD_API_KEY\"]\n",
|
|
")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 5,
|
|
"id": "8b5eafcb-5529-4fa5-a681-0d5e014ec517",
|
|
"metadata": {
|
|
"tags": []
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"query = \"Tell me about the risk factors for Uber\""
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 6,
|
|
"id": "520199ca-4271-4708-9efc-76e244f4cc22",
|
|
"metadata": {
|
|
"tags": []
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"response = index.as_query_engine().query(query)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 7,
|
|
"id": "b30d952c-d502-4888-a7e8-3a72ca0d052f",
|
|
"metadata": {
|
|
"tags": []
|
|
},
|
|
"outputs": [
|
|
{
|
|
"name": "stdout",
|
|
"output_type": "stream",
|
|
"text": [
|
|
"Airbyte is a platform that allows users to sync their data from various sources into a destination database, such as Snowflake. It provides functionalities for data ingestion and transformation, enabling users to easily move and work with data from different platforms.\n"
|
|
]
|
|
}
|
|
],
|
|
"source": [
|
|
"print(str(response))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "026a1616-874f-4e59-a7d5-b1be6a5525cd",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Setup Batch Evaluator\n",
|
|
"\n",
|
|
"Here we setup a batch evaluator, which can run evaluations over a batch dataset. We start by defining the set of metrics that we want to measure over this dataset."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 4,
|
|
"id": "34208172-079b-48d0-8bb2-6e57288ccc57",
|
|
"metadata": {
|
|
"tags": []
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"# gpt-4\n",
|
|
"gpt4 = OpenAI(temperature=0, model=\"gpt-4\")\n",
|
|
"gpt35 = OpenAI(model=\"gpt-3.5-turbo\")\n",
|
|
"\n",
|
|
"faithfulness_gpt4 = FaithfulnessEvaluator(llm=gpt4)\n",
|
|
"relevancy_gpt4 = RelevancyEvaluator(llm=gpt4)\n",
|
|
"correctness_gpt4 = CorrectnessEvaluator(llm=gpt4)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "d59da19e-9c75-48f7-8235-b0a458ca6a04",
|
|
"metadata": {
|
|
"tags": []
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"print(response.source_nodes[2].get_content())"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 9,
|
|
"id": "47ed4750-92cd-49ec-b1a3-03bdbc425d97",
|
|
"metadata": {
|
|
"tags": []
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"from llama_index.core.evaluation import BatchEvalRunner\n",
|
|
"\n",
|
|
"runner = BatchEvalRunner(\n",
|
|
" {\"faithfulness\": faithfulness_gpt4, \"relevancy\": relevancy_gpt4},\n",
|
|
" workers=8,\n",
|
|
")\n",
|
|
"\n",
|
|
"eval_results = await runner.aevaluate_queries(\n",
|
|
" index.as_query_engine(), queries=[query]\n",
|
|
")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "c89268b0-299f-4da5-b729-0e344988a1de",
|
|
"metadata": {
|
|
"tags": []
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"eval_results"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "48ece9e7-5e70-46c4-88c3-0eb7bd733892",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Upload Results\n",
|
|
"\n",
|
|
"Once you obtain a set of eval results, you're able to upload them."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "5aedc2a7-3eae-49c7-9f1b-242d15917693",
|
|
"metadata": {
|
|
"tags": []
|
|
},
|
|
"outputs": [],
|
|
"source": [
|
|
"runner.upload_eval_results(\n",
|
|
" project_name=\"default project\",\n",
|
|
" app_name=\"default app\",\n",
|
|
" results=eval_results\n",
|
|
")"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": null,
|
|
"id": "0ed0a112-93da-4df7-b1c6-3e779ae53211",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": []
|
|
}
|
|
],
|
|
"metadata": {
|
|
"kernelspec": {
|
|
"display_name": "llama_index_v3",
|
|
"language": "python",
|
|
"name": "llama_index_v3"
|
|
},
|
|
"language_info": {
|
|
"codemirror_mode": {
|
|
"name": "ipython",
|
|
"version": 3
|
|
},
|
|
"file_extension": ".py",
|
|
"mimetype": "text/x-python",
|
|
"name": "python",
|
|
"nbconvert_exporter": "python",
|
|
"pygments_lexer": "ipython3",
|
|
"version": "3.10.8"
|
|
}
|
|
},
|
|
"nbformat": 4,
|
|
"nbformat_minor": 5
|
|
} |