updated scripts and POC html

2026-06-30 21:37:55 -04:00 · 2025-09-03 14:34:45 +02:00
parent f5d3ba46c5
commit ccac13615f
63 changed files with 20156 additions and 24 deletions
@@ -0,0 +1,333 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/Chatbot_SEC.ipynb
+toc: True
+title: "How to Build a Chatbot"
+featured: False
+experimental: False
+tags: ['Agent']
+language: py
+---
+LlamaIndex serves as a bridge between your data and Language Learning Models (LLMs), providing a toolkit that enables you to establish a query interface around your data for a variety of tasks, such as question-answering and summarization.
+
+In this tutorial, we'll walk you through building a context-augmented chatbot using a [Data Agent](https://gpt-index.readthedocs.io/en/stable/core_modules/agent_modules/agents/root.html). This agent, powered by LLMs, is capable of intelligently executing tasks over your data. The end result is a chatbot agent equipped with a robust set of data interface tools provided by LlamaIndex to answer queries about your data.
+
+**Note**: This tutorial builds upon initial work on creating a query interface over SEC 10-K filings - [check it out here](https://medium.com/@jerryjliu98/how-unstructured-and-llamaindex-can-help-bring-the-power-of-llms-to-your-own-data-3657d063e30d).
+
+### Context
+
+In this guide, we’ll build a "10-K Chatbot" that uses raw UBER 10-K HTML filings from Dropbox. Users can interact with the chatbot to ask questions related to the 10-K filings.
+
+### Preparation
+
+
+```python
+%pip install llama-index-readers-file
+%pip install llama-index-embeddings-openai
+%pip install llama-index-agent-openai
+%pip install llama-index-llms-openai
+%pip install llama-index-question-gen-openai
+%pip install unstructured
+```
+
+
+```python
+import os
+
+os.environ["OPENAI_API_KEY"] = "sk-..."
+```
+
+
+```python
+from llama_index.core import Settings
+from llama_index.llms.openai import OpenAI
+from llama_index.embeddings.openai import OpenAIEmbedding
+
+# global defaults
+Settings.llm = OpenAI(model="gpt-4o-mini")
+Settings.embed_model = OpenAIEmbedding(model_name="text-embedding-3-large")
+Settings.chunk_size = 512
+Settings.chunk_overlap = 64
+```
+
+### Ingest Data
+
+Let's first download the raw 10-k files, from 2019-2022.
+
+
+```python
+# NOTE: the code examples assume you're operating within a Jupyter notebook.
+# download files
+!mkdir data
+!wget "https://www.dropbox.com/s/948jr9cfs7fgj99/UBER.zip?dl=1" -O data/UBER.zip
+!unzip data/UBER.zip -d data
+```
+
+To parse the HTML files into formatted text, we use the [Unstructured](https://github.com/Unstructured-IO/unstructured) library. Thanks to [LlamaHub](https://llamahub.ai/), we can directly integrate with Unstructured, allowing conversion of any text into a Document format that LlamaIndex can ingest.
+
+First we install the necessary packages:
+
+Then we can use the `UnstructuredReader` to parse the HTML files into a list of `Document` objects.
+
+
+```python
+from llama_index.readers.file import UnstructuredReader
+from pathlib import Path
+
+years = [2022, 2021, 2020, 2019]
+```
+
+
+```python
+loader = UnstructuredReader()
+doc_set = {}
+all_docs = []
+for year in years:
+    year_docs = loader.load_data(
+        file=Path(f"./data/UBER/UBER_{year}.html"), split_documents=False
+    )
+    # insert year metadata into each year
+    for d in year_docs:
+        d.metadata = {"year": year}
+    doc_set[year] = year_docs
+    all_docs.extend(year_docs)
+```
+
+### Setting up Vector Indices for each year
+
+We first setup a vector index for each year. Each vector index allows us
+to ask questions about the 10-K filing of a given year.
+
+We build each index and save it to disk.
+
+
+```python
+# initialize simple vector indices
+# NOTE: don't run this cell if the indices are already loaded!
+from llama_index.core import VectorStoreIndex, StorageContext
+
+
+index_set = {}
+for year in years:
+    storage_context = StorageContext.from_defaults()
+    cur_index = VectorStoreIndex.from_documents(
+        doc_set[year],
+        storage_context=storage_context,
+    )
+    index_set[year] = cur_index
+    storage_context.persist(persist_dir=f"./storage/{year}")
+```
+
+To load an index from disk, do the following
+
+
+```python
+# Load indices from disk
+from llama_index.core import StorageContext, load_index_from_storage
+
+index_set = {}
+for year in years:
+    storage_context = StorageContext.from_defaults(
+        persist_dir=f"./storage/{year}"
+    )
+    cur_index = load_index_from_storage(
+        storage_context,
+    )
+    index_set[year] = cur_index
+```
+
+### Setting up a Sub Question Query Engine to Synthesize Answers Across 10-K Filings
+
+Since we have access to documents of 4 years, we may not only want to ask questions regarding the 10-K document of a given year, but ask questions that require analysis over all 10-K filings.
+
+To address this, we can use a [Sub Question Query Engine](https://gpt-index.readthedocs.io/en/stable/examples/query_engine/sub_question_query_engine.html). It decomposes a query into subqueries, each answered by an individual vector index, and synthesizes the results to answer the overall query.
+
+LlamaIndex provides some wrappers around indices (and query engines) so that they can be used by query engines and agents. First we define a `QueryEngineTool` for each vector index.
+Each tool has a name and a description; these are what the LLM agent sees to decide which tool to choose.
+
+
+```python
+from llama_index.core.tools import QueryEngineTool
+
+individual_query_engine_tools = [
+    QueryEngineTool.from_defaults(
+        query_engine=index_set[year].as_query_engine(),
+        name=f"vector_index_{year}",
+        description=(
+            "useful for when you want to answer queries about the"
+            f" {year} SEC 10-K for Uber"
+        ),
+    )
+    for year in years
+]
+```
+
+Now we can create the Sub Question Query Engine, which will allow us to synthesize answers across the 10-K filings. We pass in the `individual_query_engine_tools` we defined above.
+
+
+```python
+from llama_index.core.query_engine import SubQuestionQueryEngine
+
+query_engine = SubQuestionQueryEngine.from_defaults(
+    query_engine_tools=individual_query_engine_tools,
+)
+```
+
+### Setting up the Chatbot Agent
+
+We use a LlamaIndex Data Agent to setup the outer chatbot agent, which has access to a set of Tools. Specifically, we will use an OpenAIAgent, that takes advantage of OpenAI API function calling. We want to use the separate Tools we defined previously for each index (corresponding to a given year), as well as a tool for the sub question query engine we defined above.
+
+First we define a `QueryEngineTool` for the sub question query engine:
+
+
+```python
+query_engine_tool = QueryEngineTool.from_defaults(
+    query_engine=query_engine,
+    name="sub_question_query_engine",
+    description=(
+        "useful for when you want to answer queries that require analyzing"
+        " multiple SEC 10-K documents for Uber"
+    ),
+)
+```
+
+Then, we combine the Tools we defined above into a single list of tools for the agent:
+
+
+```python
+tools = individual_query_engine_tools + [query_engine_tool]
+```
+
+Finally, we call `FunctionAgent` to create the agent, passing in the list of tools we defined above.
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent
+from llama_index.llms.openai import OpenAI
+
+agent = FunctionAgent(tools=tools, llm=OpenAI(model="gpt-4o"))
+```
+
+### Testing the Agent
+
+We can now test the agent with various queries.
+
+If we test it with a simple "hello" query, the agent does not use any Tools.
+
+
+```python
+from llama_index.core.workflow import Context
+
+# Setup the context for this specific interaction
+ctx = Context(agent)
+
+response = await agent.run("hi, i am bob", ctx=ctx)
+print(str(response))
+```
+
+    Hello Bob! How can I assist you today?
+
+
+If we test it with a query regarding the 10-k of a given year, the agent will use
+the relevant vector index Tool.
+
+
+```python
+response = await agent.run(
+    "What were some of the biggest risk factors in 2020 for Uber?", ctx=ctx
+)
+print(str(response))
+```
+
+    In 2020, some of the biggest risk factors for Uber included:
+    
+    1. **Legal and Regulatory Risks**: Extensive government regulation and oversight could adversely impact operations and future prospects.
+    2. **Data Privacy and Security Risks**: Risks related to data collection, use, and processing could lead to investigations, litigation, and negative publicity.
+    3. **Economic Impact of COVID-19**: The pandemic adversely affected business operations, demand for services, and financial condition due to governmental restrictions and changes in consumer behavior.
+    4. **Market Volatility**: Volatility in the market price of common stock could affect investors' ability to resell shares at favorable prices.
+    5. **Safety Incidents**: Criminal or dangerous activities on the platform could harm the ability to attract and retain drivers and consumers.
+    6. **Investment Risks**: Substantial investments in new technologies and offerings carry inherent risks, with no guarantee of realizing expected benefits.
+    7. **Dependence on Metropolitan Areas**: A significant portion of gross bookings comes from large metropolitan areas, which may be negatively impacted by various external factors.
+    8. **Talent Retention**: Attracting and retaining high-quality personnel is crucial, and issues with attrition or succession planning could adversely affect the business.
+    9. **Cybersecurity Threats**: Cyberattacks and data breaches could harm reputation and operational results.
+    10. **Capital Requirements**: The need for additional capital to support growth may not be met on reasonable terms, impacting business expansion.
+    11. **Acquisition Challenges**: Difficulty in identifying and integrating suitable businesses could harm operating results and future prospects.
+    12. **Operational Limitations**: Potential restrictions in certain jurisdictions may require modifications to the business model, affecting service delivery.
+
+
+Finally, if we test it with a query to compare/contrast risk factors across years, the agent will use the Sub Question Query Engine Tool.
+
+
+```python
+cross_query_str = (
+    "Compare/contrast the risk factors described in the Uber 10-K across"
+    " years. Give answer in bullet points."
+)
+
+response = await agent.run(cross_query_str, ctx=ctx)
+print(str(response))
+```
+
+    Here's a comparison of the risk factors for Uber across the years 2020, 2021, and 2022:
+    
+    - **COVID-19 Impact**:
+      - **2020**: The pandemic significantly affected business operations, demand, and financial condition.
+      - **2021**: Continued impact of the pandemic was a concern, affecting various parts of the business.
+      - **2022**: The pandemic's impact was less emphasized, with more focus on operational and competitive risks.
+    
+    - **Driver Classification**:
+      - **2020**: Not specifically highlighted.
+      - **2021**: Potential reclassification of Drivers as employees could alter the business model.
+      - **2022**: Continued risk of reclassification impacting operational costs.
+    
+    - **Competition**:
+      - **2020**: Not specifically highlighted.
+      - **2021**: Intense competition with low barriers to entry and well-capitalized competitors.
+      - **2022**: Competitive landscape challenges due to established alternatives and low barriers to entry.
+    
+    - **Financial Concerns**:
+      - **2020**: Market volatility and capital requirements were major concerns.
+      - **2021**: Historical losses and increased operating expenses raised profitability concerns.
+      - **2022**: Significant losses and rising expenses continued to raise profitability concerns.
+    
+    - **User and Personnel Retention**:
+      - **2020**: Talent retention was crucial, with risks from attrition.
+      - **2021**: Attracting and retaining a critical mass of users and personnel was essential.
+      - **2022**: Continued emphasis on retaining Drivers, consumers, and high-quality personnel.
+    
+    - **Brand and Reputation**:
+      - **2020**: Safety incidents and cybersecurity threats could harm reputation.
+      - **2021**: Maintaining and enhancing brand reputation was critical, with past negative publicity being a concern.
+      - **2022**: Brand and reputation were under scrutiny, with negative media coverage potentially harming prospects.
+    
+    - **Operational Challenges**:
+      - **2020**: Operational limitations and acquisition challenges were highlighted.
+      - **2021**: Challenges in managing growth and optimizing organizational structure.
+      - **2022**: Historical workplace culture and the need for organizational optimization were critical.
+    
+    - **Safety and Liability**:
+      - **2020**: Safety incidents and liability claims were significant risks.
+      - **2021**: Safety incidents and liability claims, especially with vulnerable road users, were concerns.
+      - **2022**: Safety incidents and public reporting could impact reputation and financial results.
+    
+    Overall, while some risk factors remained consistent across the years, such as competition, financial concerns, and safety, the emphasis shifted slightly with the evolving business environment and external factors like the pandemic.
+
+
+### Setting up the Chatbot Loop
+
+Now that we have the chatbot setup, it only takes a few more steps to setup a basic interactive loop to chat with our SEC-augmented chatbot!
+
+
+```python
+agent = FunctionAgent(tools=tools, llm=OpenAI(model="gpt-4o"))
+ctx = Context(agent)
+
+while True:
+    text_input = input("User: ")
+    if text_input == "exit":
+        break
+    response = await agent.run(text_input, ctx=ctx)
+    print(f"Agent: {response}")
+
+# User: What were some of the legal proceedings against Uber in 2022?
+```
@@ -0,0 +1,271 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/agent_builder.ipynb
+toc: True
+title: "GPT Builder Demo"
+featured: False
+experimental: False
+tags: ['Agent']
+language: py
+---
+<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/docs/examples/agent/agent_builder.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
+
+Inspired by GPTs interface, presented at OpenAI Dev Day 2023. Construct an agent with natural language.
+
+Here you can build your own agent...with another agent!
+
+
+```python
+%pip install llama-index-embeddings-openai
+%pip install llama-index-llms-openai
+%pip install llama-index-readers-file
+```
+
+
+```python
+import os
+
+os.environ["OPENAI_API_KEY"] = "sk-..."
+```
+
+
+```python
+from llama_index.embeddings.openai import OpenAIEmbedding
+from llama_index.llms.openai import OpenAI
+from llama_index.core import Settings
+
+llm = OpenAI(model="gpt-4o")
+Settings.llm = llm
+Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
+```
+
+## Define Candidate Tools
+
+We also define a tool retriever to retrieve candidate tools.
+
+In this setting we define tools as different Wikipedia pages.
+
+
+```python
+from llama_index.core import SimpleDirectoryReader
+```
+
+
+```python
+wiki_titles = ["Toronto", "Seattle", "Chicago", "Boston", "Houston"]
+```
+
+
+```python
+from pathlib import Path
+
+import requests
+
+for title in wiki_titles:
+    response = requests.get(
+        "https://en.wikipedia.org/w/api.php",
+        params={
+            "action": "query",
+            "format": "json",
+            "titles": title,
+            "prop": "extracts",
+            # 'exintro': True,
+            "explaintext": True,
+        },
+    ).json()
+    page = next(iter(response["query"]["pages"].values()))
+    wiki_text = page["extract"]
+
+    data_path = Path("data")
+    if not data_path.exists():
+        Path.mkdir(data_path)
+
+    with open(data_path / f"{title}.txt", "w") as fp:
+        fp.write(wiki_text)
+```
+
+
+```python
+# Load all wiki documents
+city_docs = {}
+for wiki_title in wiki_titles:
+    city_docs[wiki_title] = SimpleDirectoryReader(
+        input_files=[f"data/{wiki_title}.txt"]
+    ).load_data()
+```
+
+### Build Query Tool for Each Document
+
+
+```python
+from llama_index.core import VectorStoreIndex
+from llama_index.core.tools import QueryEngineTool
+from llama_index.core import VectorStoreIndex
+
+# Build tool dictionary
+tool_dict = {}
+
+for wiki_title in wiki_titles:
+    # build vector index
+    vector_index = VectorStoreIndex.from_documents(
+        city_docs[wiki_title],
+    )
+    # define query engines
+    vector_query_engine = vector_index.as_query_engine(llm=llm)
+
+    # define tools
+    vector_tool = QueryEngineTool.from_defaults(
+        query_engine=vector_query_engine,
+        name=wiki_title,
+        description=("Useful for questions related to" f" {wiki_title}"),
+    )
+    tool_dict[wiki_title] = vector_tool
+```
+
+### Define Tool Retriever
+
+
+```python
+# define an "object" index and retriever over these tools
+from llama_index.core import VectorStoreIndex
+from llama_index.core.objects import ObjectIndex
+
+tool_index = ObjectIndex.from_objects(
+    list(tool_dict.values()),
+    index_cls=VectorStoreIndex,
+)
+tool_retriever = tool_index.as_retriever(similarity_top_k=1)
+```
+
+### Load Data
+
+Here we load wikipedia pages from different cities.
+
+## Define Meta-Tools for GPT Builder
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent
+from llama_index.core.llms import ChatMessage
+from llama_index.core import ChatPromptTemplate
+from typing import List
+
+GEN_SYS_PROMPT_STR = """\
+Task information is given below. 
+
+Given the task, please generate a system prompt for an OpenAI-powered bot to solve this task: 
+{task} \
+"""
+
+gen_sys_prompt_messages = [
+    ChatMessage(
+        role="system",
+        content="You are helping to build a system prompt for another bot.",
+    ),
+    ChatMessage(role="user", content=GEN_SYS_PROMPT_STR),
+]
+
+GEN_SYS_PROMPT_TMPL = ChatPromptTemplate(gen_sys_prompt_messages)
+
+
+agent_cache = {}
+
+
+async def create_system_prompt(task: str):
+    """Create system prompt for another agent given an input task."""
+    llm = OpenAI(llm="gpt-4")
+    fmt_messages = GEN_SYS_PROMPT_TMPL.format_messages(task=task)
+    response = await llm.achat(fmt_messages)
+    return response.message.content
+
+
+async def get_tools(task: str):
+    """Get the set of relevant tools to use given an input task."""
+    subset_tools = await tool_retriever.aretrieve(task)
+    return [t.metadata.name for t in subset_tools]
+
+
+def create_agent(system_prompt: str, tool_names: List[str]):
+    """Create an agent given a system prompt and an input set of tools."""
+    llm = OpenAI(model="gpt-4o")
+    try:
+        # get the list of tools
+        input_tools = [tool_dict[tn] for tn in tool_names]
+
+        agent = FunctionAgent(
+            tools=input_tools, llm=llm, system_prompt=system_prompt
+        )
+        agent_cache["agent"] = agent
+        return_msg = "Agent created successfully."
+    except Exception as e:
+        return_msg = f"An error occurred when building an agent. Here is the error: {repr(e)}"
+    return return_msg
+```
+
+
+```python
+from llama_index.core.tools import FunctionTool
+
+system_prompt_tool = FunctionTool.from_defaults(fn=create_system_prompt)
+get_tools_tool = FunctionTool.from_defaults(fn=get_tools)
+create_agent_tool = FunctionTool.from_defaults(fn=create_agent)
+```
+
+
+```python
+GPT_BUILDER_SYS_STR = """\
+You are helping to construct an agent given a user-specified task. You should generally use the tools in this order to build the agent.
+
+1) Create system prompt tool: to create the system prompt for the agent.
+2) Get tools tool: to fetch the candidate set of tools to use.
+3) Create agent tool: to create the final agent.
+"""
+
+prefix_msgs = [ChatMessage(role="system", content=GPT_BUILDER_SYS_STR)]
+
+
+builder_agent = FunctionAgent(
+    tools=[system_prompt_tool, get_tools_tool, create_agent_tool],
+    prefix_messages=prefix_msgs,
+    llm=OpenAI(model="gpt-4o"),
+    verbose=True,
+)
+```
+
+
+```python
+from llama_index.core.agent.workflow import ToolCallResult
+
+handler = builder_agent.run("Build an agent that can tell me about Toronto.")
+async for event in handler.stream_events():
+    if isinstance(event, ToolCallResult):
+        print(
+            f"Called tool {event.tool_name} with input {event.tool_kwargs}\nGot output: {event.tool_output}"
+        )
+
+result = await handler
+print(f"Result: {result}")
+```
+
+    Called tool create_system_prompt with input {'task': 'Tell me about Toronto'}
+    Got output: "Generate a brief summary about Toronto, including its history, culture, landmarks, and notable features."
+    Called tool get_tools with input {'task': 'Tell me about Toronto'}
+    Got output: ['Toronto']
+    Called tool create_agent with input {'system_prompt': 'Generate a brief summary about Toronto, including its history, culture, landmarks, and notable features.', 'tool_names': ['Toronto']}
+    Got output: Agent created successfully.
+    Result: I have created an agent that can provide information about Toronto, including its history, culture, landmarks, and notable features. You can now ask the agent any questions you have about Toronto!
+
+
+
+```python
+city_agent = agent_cache["agent"]
+```
+
+
+```python
+response = await city_agent.run("Tell me about the parks in Toronto")
+print(str(response))
+```
+
+    Toronto is home to a diverse array of parks and public spaces, offering both urban and natural environments. Key downtown parks include Allan Gardens, Christie Pits, and Trinity Bellwoods Park. For waterfront views, Tommy Thompson Park and the Toronto Islands are popular destinations. In the city's outer areas, large parks like High Park, Humber Bay Park, and Morningside Park provide expansive green spaces. Additionally, parts of Rouge National Urban Park, the largest urban park in North America, are located within Toronto. The city also features notable squares such as Nathan Phillips Square, Yonge–Dundas Square, and Harbourfront Square. Approximately 12.5% of Toronto's land is dedicated to parkland, offering facilities for various activities, including winter sports like ice skating and skiing.
+
@@ -0,0 +1,351 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/agent_workflow_basic.ipynb
+toc: True
+title: "FunctionAgent / AgentWorkflow Basic Introduction"
+featured: False
+experimental: False
+tags: ['Agent']
+language: py
+---
+The `AgentWorkflow` is an orchestrator for running a system of one or more agents. In this example, we'll create a simple workflow with a single `FunctionAgent`, and use that to cover the basic functionality.
+
+
+```python
+%pip install llama-index
+```
+
+## Setup
+
+In this example, we will use `OpenAI` as our LLM. For all LLMs, check out the [examples documentation](https://docs.llamaindex.ai/en/stable/examples/llm/openai/) or [LlamaHub](https://llamahub.ai/?tab=llms) for a list of all supported LLMs and how to install/use them.
+
+
+```python
+from llama_index.llms.openai import OpenAI
+
+llm = OpenAI(model="gpt-4o-mini", api_key="sk-...")
+```
+
+To make our agent more useful, we can give it tools/actions to use. In this case, we'll use Tavily to implement a tool that can search the web for information. You can get a free API key from [Tavily](https://tavily.com/).
+
+
+```python
+%pip install tavily-python
+```
+
+When creating a tool, its very important to:
+- give the tool a proper name and docstring/description. The LLM uses this to understand what the tool does.
+- annotate the types. This helps the LLM understand the expected input and output types.
+- use async when possible, since this will make the workflow more efficient.
+
+
+```python
+from tavily import AsyncTavilyClient
+
+
+async def search_web(query: str) -> str:
+    """Useful for using the web to answer questions."""
+    client = AsyncTavilyClient(api_key="tvly-...")
+    return str(await client.search(query))
+```
+
+With the tool and and LLM defined, we can create an `AgentWorkflow` that uses the tool.
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent
+
+agent = FunctionAgent(
+    tools=[search_web],
+    llm=llm,
+    system_prompt="You are a helpful assistant that can search the web for information.",
+)
+```
+
+## Running the Agent
+
+Now that our agent is created, we can run it!
+
+
+```python
+response = await agent.run(user_msg="What is the weather in San Francisco?")
+print(str(response))
+```
+
+    The current weather in San Francisco is as follows:
+    
+    - **Temperature**: 16.1°C (61°F)
+    - **Condition**: Partly cloudy
+    - **Wind**: 13.6 mph (22.0 kph) from the west
+    - **Humidity**: 64%
+    - **Visibility**: 16 km (9 miles)
+    - **Pressure**: 1017 mb (30.04 in)
+    
+    For more details, you can check the full report [here](https://www.weatherapi.com/).
+
+
+The above is the equivalent of the following of using `AgentWorkflow` with a single `FunctionAgent`:
+
+
+```python
+from llama_index.core.agent.workflow import AgentWorkflow
+
+workflow = AgentWorkflow(agents=[agent])
+
+response = await workflow.run(user_msg="What is the weather in San Francisco?")
+```
+
+If you were creating a workflow with multiple agents, you can pass in a list of agents to the `AgentWorkflow` constructor. Learn more in our [multi-agent workflow example](https://docs.llamaindex.ai/en/stable/understanding/agent/multi_agent/).
+
+## Maintaining State
+
+By default, the `FunctionAgent` will maintain stateless between runs. This means that the agent will not have any memory of previous runs.
+
+To maintain state, we need to keep track of the previous state. Since the `FunctionAgent` is running in a  `Workflow`, the state is stored in the `Context`. This can be passed between runs to maintain state and history.
+
+
+```python
+from llama_index.core.workflow import Context
+
+ctx = Context(agent)
+```
+
+
+```python
+response = await agent.run(
+    user_msg="My name is Logan, nice to meet you!", ctx=ctx
+)
+print(str(response))
+```
+
+    Nice to meet you, Logan! How can I assist you today?
+
+
+
+```python
+response = await agent.run(user_msg="What is my name?", ctx=ctx)
+print(str(response))
+```
+
+    Your name is Logan.
+
+
+The context is serializable, so it can be saved to a database, file, etc. and loaded back in later. 
+
+The `JsonSerializer` is a simple serializer that uses `json.dumps` and `json.loads` to serialize and deserialize the context.
+
+The `JsonPickleSerializer` is a serializer that uses `pickle` to serialize and deserialize the context. If you have objects in your context that are not serializable, you can use this serializer.
+
+
+```python
+from llama_index.core.workflow import JsonPickleSerializer, JsonSerializer
+
+ctx_dict = ctx.to_dict(serializer=JsonSerializer())
+
+restored_ctx = Context.from_dict(agent, ctx_dict, serializer=JsonSerializer())
+```
+
+
+```python
+response = await agent.run(
+    user_msg="Do you still remember my name?", ctx=restored_ctx
+)
+print(str(response))
+```
+
+    Yes, I remember your name is Logan.
+
+
+## Streaming
+
+The `AgentWorkflow`/`FunctionAgent` also supports streaming. Since the `AgentWorkflow` is a `Workflow`, it can be streamed like any other `Workflow`. This works by using the handler that is returned from the workflow. There are a few key events that are streamed, feel free to explore below.
+
+If you only want to stream the LLM output, you can use the `AgentStream` events.
+
+
+```python
+from llama_index.core.agent.workflow import (
+    AgentInput,
+    AgentOutput,
+    ToolCall,
+    ToolCallResult,
+    AgentStream,
+)
+
+handler = agent.run(user_msg="What is the weather in Saskatoon?")
+
+async for event in handler.stream_events():
+    if isinstance(event, AgentStream):
+        print(event.delta, end="", flush=True)
+        # print(event.response)  # the current full response
+        # print(event.raw)  # the raw llm api response
+        # print(event.current_agent_name)  # the current agent name
+    # elif isinstance(event, AgentInput):
+    #    print(event.input)  # the current input messages
+    #    print(event.current_agent_name)  # the current agent name
+    # elif isinstance(event, AgentOutput):
+    #    print(event.response)  # the current full response
+    #    print(event.tool_calls)  # the selected tool calls, if any
+    #    print(event.raw)  # the raw llm api response
+    # elif isinstance(event, ToolCallResult):
+    #    print(event.tool_name)  # the tool name
+    #    print(event.tool_kwargs)  # the tool kwargs
+    #    print(event.tool_output)  # the tool output
+    # elif isinstance(event, ToolCall):
+    #     print(event.tool_name)  # the tool name
+    #     print(event.tool_kwargs)  # the tool kwargs
+```
+
+    The current weather in Saskatoon is as follows:
+    
+    - **Temperature**: 22.2°C (72°F)
+    - **Condition**: Overcast
+    - **Humidity**: 25%
+    - **Wind Speed**: 6.0 mph (9.7 kph) from the northwest
+    - **Visibility**: 4.8 km
+    - **Pressure**: 1018 mb
+    
+    For more details, you can check the full report [here](https://www.weatherapi.com/).
+
+## Tools and State
+
+Tools can also be defined that have access to the workflow context. This means you can set and retrieve variables from the context and use them in the tool or between tools.
+
+**Note:** The `Context` parameter should be the first parameter of the tool.
+
+
+```python
+from llama_index.core.workflow import Context
+
+
+async def set_name(ctx: Context, name: str) -> str:
+    state = await ctx.store.get("state")
+    state["name"] = name
+    await ctx.store.set("state", state)
+    return f"Name set to {name}"
+
+
+agent = FunctionAgent(
+    tools=[set_name],
+    llm=llm,
+    system_prompt="You are a helpful assistant that can set a name.",
+    initial_state={"name": "unset"},
+)
+
+ctx = Context(agent)
+
+response = await agent.run(user_msg="My name is Logan", ctx=ctx)
+print(str(response))
+
+state = await ctx.store.get("state")
+print(state["name"])
+```
+
+    Your name has been set to Logan.
+    Logan
+
+
+## Human in the Loop
+
+Tools can also be defined that involve a human in the loop. This is useful for tasks that require human input, such as confirming a tool call or providing feedback.
+
+Using workflow events, we can emit events that require a response from the user. Here, we use the built-in `InputRequiredEvent` and `HumanResponseEvent` to handle the human in the loop, but you can also define your own events.
+
+`wait_for_event` will emit the `waiter_event` and wait until it sees the `HumanResponseEvent` with the specified `requirements`. The `waiter_id` is used to ensure that we only send one `waiter_event` for each `waiter_id`.
+
+
+```python
+from llama_index.core.workflow import (
+    Context,
+    InputRequiredEvent,
+    HumanResponseEvent,
+)
+
+
+async def dangerous_task(ctx: Context) -> str:
+    """A dangerous task that requires human confirmation."""
+
+    question = "Are you sure you want to proceed?"
+    response = await ctx.wait_for_event(
+        HumanResponseEvent,
+        waiter_id=question,
+        waiter_event=InputRequiredEvent(
+            prefix=question,
+            user_name="Logan",
+        ),
+        requirements={"user_name": "Logan"},
+    )
+    if response.response == "yes":
+        return "Dangerous task completed successfully."
+    else:
+        return "Dangerous task aborted."
+
+
+agent = FunctionAgent(
+    tools=[dangerous_task],
+    llm=llm,
+    system_prompt="You are a helpful assistant that can perform dangerous tasks.",
+)
+```
+
+
+```python
+handler = agent.run(user_msg="I want to proceed with the dangerous task.")
+
+async for event in handler.stream_events():
+    if isinstance(event, InputRequiredEvent):
+        response = input(event.prefix).strip().lower()
+        handler.ctx.send_event(
+            HumanResponseEvent(
+                response=response,
+                user_name=event.user_name,
+            )
+        )
+
+response = await handler
+print(str(response))
+```
+
+    The dangerous task has been completed successfully. If you need anything else, feel free to ask!
+
+
+In production scenarios, you might handle human-in-the-loop over a websocket or multiple API requests.
+
+As mentioned before, the `Context` object is serializable, and this means we can also save the workflow mid-run and restore it later. 
+
+**NOTE:** Any functions/steps that were in-progress will start from the beginning when the workflow is restored.
+
+
+```python
+from llama_index.core.workflow import JsonSerializer
+
+handler = agent.run(user_msg="I want to proceed with the dangerous task.")
+
+input_ev = None
+async for event in handler.stream_events():
+    if isinstance(event, InputRequiredEvent):
+        input_ev = event
+        break
+
+# save the context somewhere for later
+ctx_dict = handler.ctx.to_dict(serializer=JsonSerializer())
+
+# get the response from the user
+response_str = input(input_ev.prefix).strip().lower()
+
+# restore the workflow
+restored_ctx = Context.from_dict(agent, ctx_dict, serializer=JsonSerializer())
+
+handler = agent.run(ctx=restored_ctx)
+handler.ctx.send_event(
+    HumanResponseEvent(
+        response=response_str,
+        user_name=input_ev.user_name,
+    )
+)
+response = await handler
+print(str(response))
+```
+
+    The dangerous task has been completed successfully. If you need anything else, feel free to ask!
+
@@ -0,0 +1,300 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/agent_workflow_multi.ipynb
+toc: True
+title: "Multi-Agent Report Generation with AgentWorkflow"
+featured: False
+experimental: False
+tags: ['Agent']
+language: py
+---
+<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/docs/examples/agent/agent_workflow_multi.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
+
+In this notebook, we will explore how to use the `AgentWorkflow` class to create multi-agent systems. Specifically, we will create a system that can generate a report on a given topic.
+
+This notebook will assume that you have already either read the [basic agent workflow notebook](https://docs.llamaindex.ai/en/stable/examples/agent/agent_workflow_basic) or the [agent workflow documentation](https://docs.llamaindex.ai/en/stable/understanding/agent/).
+
+## Setup
+
+In this example, we will use `OpenAI` as our LLM. For all LLMs, check out the [examples documentation](https://docs.llamaindex.ai/en/stable/examples/llm/openai/) or [LlamaHub](https://llamahub.ai/?tab=llms) for a list of all supported LLMs and how to install/use them.
+
+If we wanted, each agent could have a different LLM, but for this example, we will use the same LLM for all agents.
+
+
+```python
+%pip install llama-index
+```
+
+
+```python
+from llama_index.llms.openai import OpenAI
+
+llm = OpenAI(model="gpt-4o", api_key="sk-...")
+```
+
+## System Design
+
+Our system will have three agents:
+
+1. A `ResearchAgent` that will search the web for information on the given topic.
+2. A `WriteAgent` that will write the report using the information found by the `ResearchAgent`.
+3. A `ReviewAgent` that will review the report and provide feedback.
+
+We will use the `AgentWorkflow` class to create a multi-agent system that will execute these agents in order.
+
+While there are many ways to implement this system, in this case, we will use a few tools to help with the research and writing processes.
+
+1. A `web_search` tool to search the web for information on the given topic.
+2. A `record_notes` tool to record notes on the given topic.
+3. A `write_report` tool to write the report using the information found by the `ResearchAgent`.
+4. A `review_report` tool to review the report and provide feedback.
+
+Utilizing the `Context` class, we can pass state between agents, and each agent will have access to the current state of the system.
+
+
+
+```python
+%pip install tavily-python
+```
+
+
+```python
+from tavily import AsyncTavilyClient
+from llama_index.core.workflow import Context
+
+
+async def search_web(query: str) -> str:
+    """Useful for using the web to answer questions."""
+    client = AsyncTavilyClient(api_key="tvly-...")
+    return str(await client.search(query))
+
+
+async def record_notes(ctx: Context, notes: str, notes_title: str) -> str:
+    """Useful for recording notes on a given topic. Your input should be notes with a title to save the notes under."""
+    current_state = await ctx.store.get("state")
+    if "research_notes" not in current_state:
+        current_state["research_notes"] = {}
+    current_state["research_notes"][notes_title] = notes
+    await ctx.store.set("state", current_state)
+    return "Notes recorded."
+
+
+async def write_report(ctx: Context, report_content: str) -> str:
+    """Useful for writing a report on a given topic. Your input should be a markdown formatted report."""
+    current_state = await ctx.store.get("state")
+    current_state["report_content"] = report_content
+    await ctx.store.set("state", current_state)
+    return "Report written."
+
+
+async def review_report(ctx: Context, review: str) -> str:
+    """Useful for reviewing a report and providing feedback. Your input should be a review of the report."""
+    current_state = await ctx.store.get("state")
+    current_state["review"] = review
+    await ctx.store.set("state", current_state)
+    return "Report reviewed."
+```
+
+With our tools defined, we can now create our agents.
+
+If the LLM you are using supports tool calling, you can use the `FunctionAgent` class. Otherwise, you can use the `ReActAgent` class.
+
+Here, the name and description of each agent is used so that the system knows what each agent is responsible for and when to hand off control to the next agent.
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent, ReActAgent
+
+research_agent = FunctionAgent(
+    name="ResearchAgent",
+    description="Useful for searching the web for information on a given topic and recording notes on the topic.",
+    system_prompt=(
+        "You are the ResearchAgent that can search the web for information on a given topic and record notes on the topic. "
+        "Once notes are recorded and you are satisfied, you should hand off control to the WriteAgent to write a report on the topic. "
+        "You should have at least some notes on a topic before handing off control to the WriteAgent."
+    ),
+    llm=llm,
+    tools=[search_web, record_notes],
+    can_handoff_to=["WriteAgent"],
+)
+
+write_agent = FunctionAgent(
+    name="WriteAgent",
+    description="Useful for writing a report on a given topic.",
+    system_prompt=(
+        "You are the WriteAgent that can write a report on a given topic. "
+        "Your report should be in a markdown format. The content should be grounded in the research notes. "
+        "Once the report is written, you should get feedback at least once from the ReviewAgent."
+    ),
+    llm=llm,
+    tools=[write_report],
+    can_handoff_to=["ReviewAgent", "ResearchAgent"],
+)
+
+review_agent = FunctionAgent(
+    name="ReviewAgent",
+    description="Useful for reviewing a report and providing feedback.",
+    system_prompt=(
+        "You are the ReviewAgent that can review the write report and provide feedback. "
+        "Your review should either approve the current report or request changes for the WriteAgent to implement. "
+        "If you have feedback that requires changes, you should hand off control to the WriteAgent to implement the changes after submitting the review."
+    ),
+    llm=llm,
+    tools=[review_report],
+    can_handoff_to=["WriteAgent"],
+)
+```
+
+## Running the Workflow
+
+With our agents defined, we can create our `AgentWorkflow` and run it.
+
+
+```python
+from llama_index.core.agent.workflow import AgentWorkflow
+
+agent_workflow = AgentWorkflow(
+    agents=[research_agent, write_agent, review_agent],
+    root_agent=research_agent.name,
+    initial_state={
+        "research_notes": {},
+        "report_content": "Not written yet.",
+        "review": "Review required.",
+    },
+)
+```
+
+As the workflow is running, we will stream the events to get an idea of what is happening under the hood.
+
+
+```python
+from llama_index.core.agent.workflow import (
+    AgentInput,
+    AgentOutput,
+    ToolCall,
+    ToolCallResult,
+    AgentStream,
+)
+
+handler = agent_workflow.run(
+    user_msg=(
+        "Write me a report on the history of the internet. "
+        "Briefly describe the history of the internet, including the development of the internet, the development of the web, "
+        "and the development of the internet in the 21st century."
+    )
+)
+
+current_agent = None
+current_tool_calls = ""
+async for event in handler.stream_events():
+    if (
+        hasattr(event, "current_agent_name")
+        and event.current_agent_name != current_agent
+    ):
+        current_agent = event.current_agent_name
+        print(f"\n{'='*50}")
+        print(f"🤖 Agent: {current_agent}")
+        print(f"{'='*50}\n")
+
+    # if isinstance(event, AgentStream):
+    #     if event.delta:
+    #         print(event.delta, end="", flush=True)
+    # elif isinstance(event, AgentInput):
+    #     print("📥 Input:", event.input)
+    elif isinstance(event, AgentOutput):
+        if event.response.content:
+            print("📤 Output:", event.response.content)
+        if event.tool_calls:
+            print(
+                "🛠️  Planning to use tools:",
+                [call.tool_name for call in event.tool_calls],
+            )
+    elif isinstance(event, ToolCallResult):
+        print(f"🔧 Tool Result ({event.tool_name}):")
+        print(f"  Arguments: {event.tool_kwargs}")
+        print(f"  Output: {event.tool_output}")
+    elif isinstance(event, ToolCall):
+        print(f"🔨 Calling Tool: {event.tool_name}")
+        print(f"  With arguments: {event.tool_kwargs}")
+```
+
+    
+    ==================================================
+    🤖 Agent: ResearchAgent
+    ==================================================
+    
+    🛠️  Planning to use tools: ['search_web']
+    🔨 Calling Tool: search_web
+      With arguments: {'query': 'history of the internet'}
+    🔧 Tool Result (search_web):
+      Arguments: {'query': 'history of the internet'}
+      Output: {'query': 'history of the internet', 'follow_up_questions': None, 'answer': None, 'images': [], 'results': [{'title': 'Internet history timeline: ARPANET to the World Wide Web', 'url': 'https://www.livescience.com/20727-internet-history.html', 'content': 'Internet history timeline: ARPANET to the World Wide Web\nThe internet history timeline shows how today\'s vast network evolved from the initial concept\nIn internet history, credit for the initial concept that developed into the World Wide Web is typically given to Leonard Kleinrock. "\nAccording to the journal Management and Business Review (MBR), Kleinrock, along with other innovators such as J.C.R. Licklider, the first director of the Information Processing Technology Office (IPTO), provided the backbone for the ubiquitous stream of emails, media, Facebook postings and tweets that are now shared online every day.\n The precursor to the internet was jumpstarted in the early days of the history of computers , in 1969 with the U.S. Defense Department\'s Advanced Research Projects Agency Network (ARPANET), according to the journal American Scientist. The successful push to stop the bill, involving technology companies such as Google and nonprofit organizations including Wikipedia and the Electronic Frontier Foundation, is considered a victory for sites such as YouTube that depend on user-generated content, as well as "fair use" on the internet.\n Vinton Cerf and Bob Kahn (the duo said by many to be the Fathers of the Internet) publish "A Protocol for Packet Network Interconnection," which details the design of TCP.\n1976:', 'score': 0.81097376, 'raw_content': None}, {'title': 'A Brief History of the Internet - University System of Georgia', 'url': 'https://usg.edu/galileo/skills/unit07/internet07_02.phtml', 'content': 'The Internet started in the 1960s as a way for government researchers to share information. This eventually led to the formation of the ARPANET (Advanced Research Projects Agency Network), the network that ultimately evolved into what we now know as the Internet. In response to this, other networks were created to provide information sharing. ARPANET and the Defense Data Network officially changed to the TCP/IP standard on January 1, 1983, hence the birth of the Internet. (Business computers like the UNIVAC processed data more slowly than the IAS-type machines, but were designed for fast input and output.) The first few sales were to government agencies, the A.C. Nielsen Company, and the Prudential Insurance Company.', 'score': 0.8091708, 'raw_content': None}, {'title': 'Timeline - History of the Internet', 'url': 'https://historyoftheinternet.net/timeline/', 'content': "Learn how the internet evolved from SAGE and IBM's internal networks to ARPANET and the World Wide Web. Explore the commercial and government paths that led to the current internet format and protocols.", 'score': 0.7171114, 'raw_content': None}, {'title': 'Learn About Internet History | History of the Internet', 'url': 'https://internethistory.org/', 'content': 'Learn about the origins, evolution and impact of the internet through stories, materials and videos. Explore the first internet message, optical amplifier, wavelength division multiplexing and more.', 'score': 0.7040996, 'raw_content': None}, {'title': 'Brief History of the Internet', 'url': 'https://www.internetsociety.org/resources/doc/2017/brief-history-internet/', 'content': "Learn how the Internet evolved from the initial internetting concepts to a global network of networks that transformed the computer and communications world. Explore the key milestones, challenges, and opportunities of the Internet's development and future.", 'score': 0.6944897, 'raw_content': None}], 'response_time': 1.65}
+    🛠️  Planning to use tools: ['record_notes']
+    🔨 Calling Tool: record_notes
+      With arguments: {'notes': "The internet's history began in the 1960s as a project for government researchers to share information, leading to the creation of ARPANET (Advanced Research Projects Agency Network). ARPANET was the first network to implement the TCP/IP protocol suite, which became the foundation for the modern internet. On January 1, 1983, ARPANET and the Defense Data Network officially adopted TCP/IP, marking the birth of the internet.\n\nThe World Wide Web was developed later, in 1989, by Tim Berners-Lee, a British scientist at CERN. The web was initially conceived as a way to facilitate information sharing among scientists and institutes around the world. Berners-Lee developed the first web browser and web server, and introduced the concept of hyperlinks, which allowed users to navigate between different documents on the web.\n\nIn the 21st century, the internet has evolved into a global network that connects billions of devices and users. It has transformed communication, commerce, entertainment, and education, among other fields. The development of broadband technology, mobile internet, and social media platforms has further accelerated the internet's growth and impact.", 'notes_title': 'History of the Internet'}
+    🔧 Tool Result (record_notes):
+      Arguments: {'notes': "The internet's history began in the 1960s as a project for government researchers to share information, leading to the creation of ARPANET (Advanced Research Projects Agency Network). ARPANET was the first network to implement the TCP/IP protocol suite, which became the foundation for the modern internet. On January 1, 1983, ARPANET and the Defense Data Network officially adopted TCP/IP, marking the birth of the internet.\n\nThe World Wide Web was developed later, in 1989, by Tim Berners-Lee, a British scientist at CERN. The web was initially conceived as a way to facilitate information sharing among scientists and institutes around the world. Berners-Lee developed the first web browser and web server, and introduced the concept of hyperlinks, which allowed users to navigate between different documents on the web.\n\nIn the 21st century, the internet has evolved into a global network that connects billions of devices and users. It has transformed communication, commerce, entertainment, and education, among other fields. The development of broadband technology, mobile internet, and social media platforms has further accelerated the internet's growth and impact.", 'notes_title': 'History of the Internet'}
+      Output: Notes recorded.
+    🛠️  Planning to use tools: ['handoff']
+    🔨 Calling Tool: handoff
+      With arguments: {'to_agent': 'WriteAgent', 'reason': 'I have gathered and recorded notes on the history of the internet, including its development, the creation of the web, and its evolution in the 21st century. The WriteAgent can now use these notes to write a comprehensive report.'}
+    🔧 Tool Result (handoff):
+      Arguments: {'to_agent': 'WriteAgent', 'reason': 'I have gathered and recorded notes on the history of the internet, including its development, the creation of the web, and its evolution in the 21st century. The WriteAgent can now use these notes to write a comprehensive report.'}
+      Output: Handed off to WriteAgent because: I have gathered and recorded notes on the history of the internet, including its development, the creation of the web, and its evolution in the 21st century. The WriteAgent can now use these notes to write a comprehensive report.
+    
+    ==================================================
+    🤖 Agent: WriteAgent
+    ==================================================
+    
+    🛠️  Planning to use tools: ['write_report']
+    🔨 Calling Tool: write_report
+      With arguments: {'report_content': "# History of the Internet\n\nThe history of the internet is a fascinating journey from a government research project to a global network that has transformed the way we communicate, work, and live. This report provides a brief overview of the key milestones in the development of the internet, the World Wide Web, and the internet's evolution in the 21st century.\n\n## Development of the Internet\n\nThe origins of the internet date back to the 1960s, when it was conceived as a means for government researchers to share information. This led to the creation of ARPANET (Advanced Research Projects Agency Network), which was the first network to implement the TCP/IP protocol suite. TCP/IP became the foundational technology for the modern internet. On January 1, 1983, ARPANET and the Defense Data Network officially adopted TCP/IP, marking the official birth of the internet.\n\n## Development of the World Wide Web\n\nThe World Wide Web was developed in 1989 by Tim Berners-Lee, a British scientist working at CERN. The web was initially designed to facilitate information sharing among scientists and research institutes globally. Berners-Lee created the first web browser and web server, introducing the concept of hyperlinks, which allowed users to navigate between different documents on the web seamlessly.\n\n## The Internet in the 21st Century\n\nIn the 21st century, the internet has evolved into a vast global network connecting billions of devices and users. It has revolutionized communication, commerce, entertainment, and education, among other fields. The development of broadband technology, mobile internet, and social media platforms has further accelerated the internet's growth and impact, making it an integral part of daily life for people around the world.\n\nThe internet continues to evolve, with ongoing advancements in technology and infrastructure shaping its future. As it grows, the internet remains a powerful tool for innovation, connectivity, and information sharing, influencing nearly every aspect of modern society."}
+    🔧 Tool Result (write_report):
+      Arguments: {'report_content': "# History of the Internet\n\nThe history of the internet is a fascinating journey from a government research project to a global network that has transformed the way we communicate, work, and live. This report provides a brief overview of the key milestones in the development of the internet, the World Wide Web, and the internet's evolution in the 21st century.\n\n## Development of the Internet\n\nThe origins of the internet date back to the 1960s, when it was conceived as a means for government researchers to share information. This led to the creation of ARPANET (Advanced Research Projects Agency Network), which was the first network to implement the TCP/IP protocol suite. TCP/IP became the foundational technology for the modern internet. On January 1, 1983, ARPANET and the Defense Data Network officially adopted TCP/IP, marking the official birth of the internet.\n\n## Development of the World Wide Web\n\nThe World Wide Web was developed in 1989 by Tim Berners-Lee, a British scientist working at CERN. The web was initially designed to facilitate information sharing among scientists and research institutes globally. Berners-Lee created the first web browser and web server, introducing the concept of hyperlinks, which allowed users to navigate between different documents on the web seamlessly.\n\n## The Internet in the 21st Century\n\nIn the 21st century, the internet has evolved into a vast global network connecting billions of devices and users. It has revolutionized communication, commerce, entertainment, and education, among other fields. The development of broadband technology, mobile internet, and social media platforms has further accelerated the internet's growth and impact, making it an integral part of daily life for people around the world.\n\nThe internet continues to evolve, with ongoing advancements in technology and infrastructure shaping its future. As it grows, the internet remains a powerful tool for innovation, connectivity, and information sharing, influencing nearly every aspect of modern society."}
+      Output: Report written.
+    🛠️  Planning to use tools: ['handoff']
+    🔨 Calling Tool: handoff
+      With arguments: {'to_agent': 'ReviewAgent', 'reason': 'The report on the history of the internet has been written and needs to be reviewed for accuracy and completeness.'}
+    🔧 Tool Result (handoff):
+      Arguments: {'to_agent': 'ReviewAgent', 'reason': 'The report on the history of the internet has been written and needs to be reviewed for accuracy and completeness.'}
+      Output: Handed off to ReviewAgent because: The report on the history of the internet has been written and needs to be reviewed for accuracy and completeness.
+    
+    ==================================================
+    🤖 Agent: ReviewAgent
+    ==================================================
+    
+    🛠️  Planning to use tools: ['review_report']
+    🔨 Calling Tool: review_report
+      With arguments: {'review': "The report on the history of the internet provides a concise and informative overview of the key developments in the internet's history. It effectively covers the origins of the internet with ARPANET, the creation of the World Wide Web by Tim Berners-Lee, and the evolution of the internet in the 21st century. The report is well-structured, with clear sections that make it easy to follow.\n\nThe content is accurate and aligns with the historical timeline of the internet's development. It highlights significant milestones such as the adoption of TCP/IP and the introduction of hyperlinks, which are crucial to understanding the internet's growth.\n\nOverall, the report meets the requirements and provides a comprehensive summary of the internet's history. It is approved for final submission."}
+    🔧 Tool Result (review_report):
+      Arguments: {'review': "The report on the history of the internet provides a concise and informative overview of the key developments in the internet's history. It effectively covers the origins of the internet with ARPANET, the creation of the World Wide Web by Tim Berners-Lee, and the evolution of the internet in the 21st century. The report is well-structured, with clear sections that make it easy to follow.\n\nThe content is accurate and aligns with the historical timeline of the internet's development. It highlights significant milestones such as the adoption of TCP/IP and the introduction of hyperlinks, which are crucial to understanding the internet's growth.\n\nOverall, the report meets the requirements and provides a comprehensive summary of the internet's history. It is approved for final submission."}
+      Output: Report reviewed.
+    📤 Output: The report on the history of the internet has been reviewed and approved. It provides a comprehensive and accurate overview of the internet's development, the creation of the World Wide Web, and its evolution in the 21st century. The report is well-structured and meets the requirements for final submission.
+
+
+Now, we can retrieve the final report in the system for ourselves.
+
+
+```python
+state = await handler.ctx.store.get("state")
+print(state["report_content"])
+```
+
+    # History of the Internet
+    
+    The history of the internet is a fascinating journey from a government research project to a global network that has transformed the way we communicate, work, and live. This report provides a brief overview of the key milestones in the development of the internet, the World Wide Web, and the internet's evolution in the 21st century.
+    
+    ## Development of the Internet
+    
+    The origins of the internet date back to the 1960s, when it was conceived as a means for government researchers to share information. This led to the creation of ARPANET (Advanced Research Projects Agency Network), which was the first network to implement the TCP/IP protocol suite. TCP/IP became the foundational technology for the modern internet. On January 1, 1983, ARPANET and the Defense Data Network officially adopted TCP/IP, marking the official birth of the internet.
+    
+    ## Development of the World Wide Web
+    
+    The World Wide Web was developed in 1989 by Tim Berners-Lee, a British scientist working at CERN. The web was initially designed to facilitate information sharing among scientists and research institutes globally. Berners-Lee created the first web browser and web server, introducing the concept of hyperlinks, which allowed users to navigate between different documents on the web seamlessly.
+    
+    ## The Internet in the 21st Century
+    
+    In the 21st century, the internet has evolved into a vast global network connecting billions of devices and users. It has revolutionized communication, commerce, entertainment, and education, among other fields. The development of broadband technology, mobile internet, and social media platforms has further accelerated the internet's growth and impact, making it an integral part of daily life for people around the world.
+    
+    The internet continues to evolve, with ongoing advancements in technology and infrastructure shaping its future. As it grows, the internet remains a powerful tool for innovation, connectivity, and information sharing, influencing nearly every aspect of modern society.
+
@@ -1,11 +1,12 @@
 ---
 layout: recipe
-colab: https://colab.research.google.com/github/TuanaCelik/cookbooks-demo/blob/main/notebooks/agent/agent_workflow_research_assistant.ipynb
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/agent_workflow_research_assistant.ipynb
 toc: True
 title: "Agent Workflow + Research Assistant using AgentQL"
-featured: False
+featured: True
 experimental: True
 tags: ['Agent', 'Websearch', 'Integrations']
+language: py
 ---
 <a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/docs/examples/agent/agent_workflow_research_assistant.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

@@ -0,0 +1,467 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/agents_as_tools.ipynb
+toc: True
+title: "Multi-Agent Report Generation using Agents as Tools"
+featured: False
+experimental: False
+tags: ['Agent']
+language: py
+---
+<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/docs/examples/agent/agents_as_tools.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
+
+In this notebook, we will explore how to create a multi-agent system that uses a top-level agent to orchestrate a group of agents as tools. Specifically, we will create a system that can research, write, and review a report on a given topic.
+
+This notebook will assume that you have already either read the [basic agent workflow notebook](https://docs.llamaindex.ai/en/stable/examples/agent/agent_workflow_basic) or the [general agent documentation](https://docs.llamaindex.ai/en/stable/understanding/agent/).
+
+## Setup
+
+In this example, we will use `OpenAI` as our LLM. For all LLMs, check out the [examples documentation](https://docs.llamaindex.ai/en/stable/examples/llm/openai/) or [LlamaHub](https://llamahub.ai/?tab=llms) for a list of all supported LLMs and how to install/use them.
+
+If we wanted, each agent could have a different LLM, but for this example, we will use the same LLM for all agents.
+
+
+```python
+%pip install llama-index
+```
+
+
+```python
+from llama_index.llms.openai import OpenAI
+
+sub_agent_llm = OpenAI(model="gpt-4.1-mini", api_key="sk-...")
+orchestrator_llm = OpenAI(model="o3-mini", api_key="sk-...")
+```
+
+## System Design
+
+Our system will have three agents:
+
+1. A `ResearchAgent` that will search the web for information on the given topic.
+2. A `WriteAgent` that will write the report using the information found by the `ResearchAgent`.
+3. A `ReviewAgent` that will review the report and provide feedback.
+
+We will then use a top-level agent to orchestrate the other agents to write our report.
+
+While there are many ways to implement this system, in this case, we will use a single `web_search` tool to search the web for information on the given topic.
+
+
+
+```python
+%pip install tavily-python
+```
+
+
+```python
+from tavily import AsyncTavilyClient
+
+
+async def search_web(query: str) -> str:
+    """Useful for using the web to answer questions."""
+    client = AsyncTavilyClient(api_key="tvly-...")
+    return str(await client.search(query))
+```
+
+With our tool defined, we can now create our sub-agents.
+
+If the LLM you are using supports tool calling, you can use the `FunctionAgent` class. Otherwise, you can use the `ReActAgent` class.
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent, ReActAgent
+
+research_agent = FunctionAgent(
+    system_prompt=(
+        "You are the ResearchAgent that can search the web for information on a given topic and record notes on the topic. "
+        "You should output notes on the topic in a structured format."
+    ),
+    llm=sub_agent_llm,
+    tools=[search_web],
+)
+
+write_agent = FunctionAgent(
+    system_prompt=(
+        "You are the WriteAgent that can write a report on a given topic. "
+        "Your report should be in a markdown format. The content should be grounded in the research notes. "
+        "Return your markdown report surrounded by <report>...</report> tags."
+    ),
+    llm=sub_agent_llm,
+    tools=[],
+)
+
+review_agent = FunctionAgent(
+    system_prompt=(
+        "You are the ReviewAgent that can review the write report and provide feedback. "
+        "Your review should either approve the current report or request changes to be implemented."
+    ),
+    llm=sub_agent_llm,
+    tools=[],
+)
+```
+
+With our sub-agents defined, we can then convert them into tools that can be used by the top-level agent.
+
+
+```python
+import re
+from llama_index.core.workflow import Context
+
+
+async def call_research_agent(ctx: Context, prompt: str) -> str:
+    """Useful for recording research notes based on a specific prompt."""
+    result = await research_agent.run(
+        user_msg=f"Write some notes about the following: {prompt}"
+    )
+
+    state = await ctx.store.get("state")
+    state["research_notes"].append(str(result))
+    await ctx.store.set("state", state)
+
+    return str(result)
+
+
+async def call_write_agent(ctx: Context) -> str:
+    """Useful for writing a report based on the research notes or revising the report based on feedback."""
+    state = await ctx.store.get("state")
+    notes = state.get("research_notes", None)
+    if not notes:
+        return "No research notes to write from."
+
+    user_msg = f"Write a markdown report from the following notes. Be sure to output the report in the following format: <report>...</report>:\n\n"
+
+    # Add the feedback to the user message if it exists
+    feedback = state.get("review", None)
+    if feedback:
+        user_msg += f"<feedback>{feedback}</feedback>\n\n"
+
+    # Add the research notes to the user message
+    notes = "\n\n".join(notes)
+    user_msg += f"<research_notes>{notes}</research_notes>\n\n"
+
+    # Run the write agent
+    result = await write_agent.run(user_msg=user_msg)
+    report = re.search(r"<report>(.*)</report>", str(result), re.DOTALL).group(
+        1
+    )
+    state["report_content"] = str(report)
+    await ctx.store.set("state", state)
+
+    return str(report)
+
+
+async def call_review_agent(ctx: Context) -> str:
+    """Useful for reviewing the report and providing feedback."""
+    state = await ctx.store.get("state")
+    report = state.get("report_content", None)
+    if not report:
+        return "No report content to review."
+
+    result = await review_agent.run(
+        user_msg=f"Review the following report: {report}"
+    )
+    state["review"] = result
+    await ctx.store.set("state", state)
+
+    return result
+```
+
+## Creating the Top-Level Orchestrator Agent
+
+With our sub-agents defined as tools, we can now create our top-level orchestrator agent.
+
+
+```python
+orchestrator = FunctionAgent(
+    system_prompt=(
+        "You are an expert in the field of report writing. "
+        "You are given a user request and a list of tools that can help with the request. "
+        "You are to orchestrate the tools to research, write, and review a report on the given topic. "
+        "Once the review is positive, you should notify the user that the report is ready to be accessed."
+    ),
+    llm=orchestrator_llm,
+    tools=[
+        call_research_agent,
+        call_write_agent,
+        call_review_agent,
+    ],
+    initial_state={
+        "research_notes": [],
+        "report_content": None,
+        "review": None,
+    },
+)
+```
+
+## Running the Agent
+
+Let's run our agents! We can iterate over events as the workflow runs.
+
+
+```python
+from llama_index.core.agent.workflow import (
+    AgentInput,
+    AgentOutput,
+    ToolCall,
+    ToolCallResult,
+    AgentStream,
+)
+from llama_index.core.workflow import Context
+
+# Create a context for the orchestrator to hold history/state
+ctx = Context(orchestrator)
+
+
+async def run_orchestrator(ctx: Context, user_msg: str):
+    handler = orchestrator.run(
+        user_msg=user_msg,
+        ctx=ctx,
+    )
+
+    async for event in handler.stream_events():
+        if isinstance(event, AgentStream):
+            if event.delta:
+                print(event.delta, end="", flush=True)
+        # elif isinstance(event, AgentInput):
+        #     print("📥 Input:", event.input)
+        elif isinstance(event, AgentOutput):
+            # Skip printing the output since we are streaming above
+            # if event.response.content:
+            #     print("📤 Output:", event.response.content)
+            if event.tool_calls:
+                print(
+                    "🛠️  Planning to use tools:",
+                    [call.tool_name for call in event.tool_calls],
+                )
+        elif isinstance(event, ToolCallResult):
+            print(f"🔧 Tool Result ({event.tool_name}):")
+            print(f"  Arguments: {event.tool_kwargs}")
+            print(f"  Output: {event.tool_output}")
+        elif isinstance(event, ToolCall):
+            print(f"🔨 Calling Tool: {event.tool_name}")
+            print(f"  With arguments: {event.tool_kwargs}")
+```
+
+
+```python
+await run_orchestrator(
+    ctx=ctx,
+    user_msg=(
+        "Write me a report on the history of the internet. "
+        "Briefly describe the history of the internet, including the development of the internet, the development of the web, "
+        "and the development of the internet in the 21st century."
+    ),
+)
+```
+
+    🛠️  Planning to use tools: ['call_research_agent']
+    🔨 Calling Tool: call_research_agent
+      With arguments: {'prompt': 'Write a detailed research note on the history of the internet, covering the development of the internet, the development of the web, and the development of the internet in the 21st century.'}
+    🔧 Tool Result (call_research_agent):
+      Arguments: {'prompt': 'Write a detailed research note on the history of the internet, covering the development of the internet, the development of the web, and the development of the internet in the 21st century.'}
+      Output: Research Notes on the History of the Internet
+    
+    1. Development of the Internet:
+    - The internet's origins trace back to the late 1960s with the U.S. Defense Department's Advanced Research Projects Agency Network (ARPANET), designed as a military defense system during the Cold War.
+    - ARPANET was the first network to implement the protocol suite TCP/IP, which became the technical foundation of the modern Internet.
+    - The Network Working Group evolved into the Internet Working Group to coordinate the growing research community.
+    - In the 1970s, commercial packet networks emerged, primarily to provide remote computer access.
+    - The National Science Foundation (NSF) expanded access to the scientific and academic community and helped make TCP/IP the standard for federally supported research networks.
+    - The internet grew through interconnected commercial backbones linked by network access points (NAPs).
+    
+    2. Development of the World Wide Web:
+    - Invented by Tim Berners-Lee in 1989 while working at CERN, the World Wide Web introduced a "web" of linked information accessible to anyone on the Internet.
+    - By December 1990, Berners-Lee developed the essential tools: HTTP (HyperText Transfer Protocol), HTML (HyperText Markup Language), the first web browser/editor, the first web server, and the first website.
+    - The Web allowed easy access to existing information and linked resources, initially serving CERN scientists.
+    - In 1994, Berners-Lee founded the World Wide Web Consortium (W3C) at MIT to create open standards for the Web.
+    - The Web evolved from Web 1.0 (basic, static pages) to Web 2.0 (interactive, user-generated content) starting around 2003, and further towards Web 3.0 (semantic web, intelligent data) from 2014 onwards.
+    
+    3. Development of the Internet in the 21st Century:
+    - The 21st century saw transformative developments such as broadband, fiber-optic technology, and mobile internet.
+    - The rise of smartphones revolutionized mobile browsing and internet access.
+    - Cloud computing emerged, allowing data storage and processing on remote servers, changing how businesses and individuals manage information.
+    - The Internet of Things (IoT) connected everyday devices to the internet, expanding the internet's reach into daily life.
+    - Social media platforms became dominant, reshaping communication and information sharing.
+    - The internet's infrastructure and services have continuously evolved to support increasing data demands and new technologies.
+    
+    These notes summarize the key milestones and technological advancements that shaped the internet from its inception to its current state in the 21st century.
+    🛠️  Planning to use tools: ['call_write_agent']
+    🔨 Calling Tool: call_write_agent
+      With arguments: {}
+    🔧 Tool Result (call_write_agent):
+      Arguments: {}
+      Output: 
+    # History of the Internet
+    
+    ## 1. Development of the Internet
+    
+    The origins of the internet date back to the late 1960s with the creation of the Advanced Research Projects Agency Network (ARPANET) by the U.S. Defense Department. Initially designed as a military defense system during the Cold War, ARPANET was the first network to implement the TCP/IP protocol suite, which later became the technical foundation of the modern Internet.
+    
+    The Network Working Group, which coordinated early research efforts, evolved into the Internet Working Group as the research community expanded. During the 1970s, commercial packet networks began to emerge, primarily to provide remote computer access.
+    
+    The National Science Foundation (NSF) played a crucial role by expanding internet access to the scientific and academic communities and promoting TCP/IP as the standard for federally supported research networks. The internet grew further through interconnected commercial backbones linked by network access points (NAPs), facilitating broader connectivity.
+    
+    ## 2. Development of the World Wide Web
+    
+    The World Wide Web was invented in 1989 by Tim Berners-Lee while working at CERN. It introduced a "web" of linked information accessible to anyone on the Internet. By December 1990, Berners-Lee had developed the essential tools that formed the Web's foundation: HTTP (HyperText Transfer Protocol), HTML (HyperText Markup Language), the first web browser/editor, the first web server, and the first website.
+    
+    Initially serving CERN scientists, the Web allowed easy access to existing information and linked resources. In 1994, Berners-Lee founded the World Wide Web Consortium (W3C) at MIT to create open standards for the Web, ensuring its continued growth and interoperability.
+    
+    The Web evolved through several stages:
+    - **Web 1.0:** Basic, static pages.
+    - **Web 2.0:** Starting around 2003, characterized by interactive, user-generated content.
+    - **Web 3.0:** From 2014 onwards, focusing on the semantic web and intelligent data.
+    
+    ## 3. Development of the Internet in the 21st Century
+    
+    The 21st century brought transformative advancements to the internet, including broadband and fiber-optic technologies that significantly increased data transmission speeds. The rise of smartphones revolutionized mobile browsing and internet access, making the internet ubiquitous.
+    
+    Cloud computing emerged as a major innovation, enabling data storage and processing on remote servers, which transformed how businesses and individuals manage information. The Internet of Things (IoT) connected everyday devices to the internet, expanding its reach into daily life.
+    
+    Social media platforms became dominant forces, reshaping communication and information sharing globally. Throughout these developments, the internet's infrastructure and services have continuously evolved to support increasing data demands and new technologies.
+    
+    ---
+    
+    This report summarizes the key milestones and technological advancements that have shaped the internet from its inception in the late 1960s to its current state in the 21st century.
+    
+    🛠️  Planning to use tools: ['call_review_agent']
+    🔨 Calling Tool: call_review_agent
+      With arguments: {}
+    🔧 Tool Result (call_review_agent):
+      Arguments: {}
+      Output: The report titled "History of the Internet" is well-structured, clear, and provides a concise overview of the major developments in the evolution of the internet. It effectively covers the origins, the invention and growth of the World Wide Web, and significant 21st-century advancements.
+    
+    Strengths:
+    - The chronological organization helps readers follow the progression of internet technology.
+    - Key figures and organizations (e.g., ARPANET, Tim Berners-Lee, NSF, W3C) are appropriately highlighted.
+    - The explanation of Web 1.0, 2.0, and 3.0 stages adds valuable context.
+    - The inclusion of recent technologies such as cloud computing, IoT, and social media reflects current trends.
+    
+    Suggestions for improvement:
+    1. **Add citations or references:** The report would benefit from citing sources or references to support the historical facts and technological descriptions.
+    2. **Clarify technical terms:** While the report is generally accessible, briefly defining terms like TCP/IP, NAPs, and semantic web could help readers unfamiliar with networking jargon.
+    3. **Expand on social impact:** Consider including a brief discussion on how the internet has impacted society, economy, and culture to provide a more holistic view.
+    4. **Minor formatting:** The section numbering is inconsistent (e.g., "1.", "2.", "3." but no numbering for the introduction or conclusion). Adding a brief introduction and conclusion section with numbering or consistent formatting would improve flow.
+    
+    Overall, the report is informative and well-written. With the suggested enhancements, it would be even more comprehensive and reader-friendly.
+    
+    Recommendation: **Approve with minor revisions** to incorporate citations, clarify terms, and consider adding social impact context.
+    🛠️  Planning to use tools: ['call_write_agent']
+    🔨 Calling Tool: call_write_agent
+      With arguments: {}
+    🔧 Tool Result (call_write_agent):
+      Arguments: {}
+      Output: 
+    # History of the Internet
+    
+    ## 1. Introduction
+    
+    The internet is a transformative technology that has reshaped communication, information sharing, and society at large. This report provides a concise overview of the major developments in the evolution of the internet, from its origins in the late 1960s to the advanced technologies and societal impacts of the 21st century.
+    
+    ## 2. Development of the Internet
+    
+    The origins of the internet date back to the late 1960s with the creation of the Advanced Research Projects Agency Network (ARPANET) by the U.S. Department of Defense. ARPANET was initially designed as a military defense communication system during the Cold War. It was the first network to implement the Transmission Control Protocol/Internet Protocol (TCP/IP), a suite of communication protocols that became the technical foundation of the modern internet. TCP/IP enables different networks to interconnect and communicate seamlessly.
+    
+    During the 1970s, commercial packet-switched networks emerged, primarily to provide remote computer access. The National Science Foundation (NSF) played a crucial role in expanding internet access to the scientific and academic communities and helped establish TCP/IP as the standard protocol for federally supported research networks. The internet's growth was further supported by interconnected commercial backbones linked through Network Access Points (NAPs), which facilitated data exchange between different service providers.
+    
+    ## 3. Development of the World Wide Web
+    
+    In 1989, Tim Berners-Lee, working at CERN, invented the World Wide Web (WWW), which introduced a system of linked information accessible to anyone connected to the internet. By December 1990, Berners-Lee had developed the essential components of the Web: HyperText Transfer Protocol (HTTP), HyperText Markup Language (HTML), the first web browser/editor, the first web server, and the first website. These innovations allowed users to easily access and navigate information through hyperlinks.
+    
+    Initially serving CERN scientists, the Web rapidly expanded to the public. In 1994, Berners-Lee founded the World Wide Web Consortium (W3C) at MIT to develop open standards ensuring the Web's interoperability and growth.
+    
+    The Web has evolved through several stages:
+    
+    - **Web 1.0**: Characterized by static, read-only web pages.
+    - **Web 2.0**: Beginning around 2003, marked by interactive, user-generated content and social media platforms.
+    - **Web 3.0**: Emerging from 2014 onwards, focusing on the semantic web and intelligent data processing to create more personalized and meaningful online experiences.
+    
+    ## 4. Development of the Internet in the 21st Century
+    
+    The 21st century has witnessed transformative advancements in internet technology and infrastructure. Broadband and fiber-optic technologies have significantly increased data transmission speeds. The proliferation of smartphones revolutionized mobile internet access, enabling users to connect anytime and anywhere.
+    
+    Cloud computing emerged as a paradigm shift, allowing data storage and processing on remote servers rather than local devices. This innovation has changed how businesses and individuals manage information and applications.
+    
+    The Internet of Things (IoT) has expanded the internet's reach by connecting everyday devices—such as home appliances, vehicles, and wearable technology—to the network, enabling new functionalities and data-driven services.
+    
+    Social media platforms have become dominant forces in communication and information sharing, reshaping social interactions, marketing, and news dissemination.
+    
+    The internet's infrastructure and services continue to evolve to meet increasing data demands and support emerging technologies.
+    
+    ## 5. Social Impact of the Internet
+    
+    Beyond technological advancements, the internet has profoundly impacted society, the economy, and culture. It has democratized access to information, facilitated global communication, and enabled new forms of social interaction. Economically, it has created new industries, transformed traditional business models, and fostered innovation. Culturally, the internet has influenced media consumption, education, and the way communities form and interact.
+    
+    However, these changes also bring challenges such as privacy concerns, digital divides, misinformation, and cybersecurity threats, which require ongoing attention and management.
+    
+    ## 6. Conclusion
+    
+    The history of the internet is marked by continuous innovation and expansion, from its military origins to a global network integral to modern life. Key figures like Tim Berners-Lee and organizations such as ARPANET, NSF, and W3C have played pivotal roles in its development. Understanding the technical foundations, evolutionary stages of the Web, and recent technological trends provides valuable context for appreciating the internet's role today. Incorporating social impact considerations offers a more holistic view of this transformative technology.
+    
+    ---
+    
+    *Note: This report would benefit from citations to authoritative sources for historical facts and technical explanations to enhance credibility and provide readers with avenues for further research.*
+    
+    
+    The revised report on the history of the internet is now complete and ready for your review. Would you like to access the final report?
+
+With our report written and revised/reviewed, we can inspect the final report in the state.
+
+
+```python
+state = await ctx.store.get("state")
+print(state["report_content"])
+```
+
+    
+    # History of the Internet
+    
+    ## 1. Introduction
+    
+    The internet is a transformative technology that has reshaped communication, information sharing, and society at large. This report provides a concise overview of the major developments in the evolution of the internet, from its origins in the late 1960s to the advanced technologies and societal impacts of the 21st century.
+    
+    ## 2. Development of the Internet
+    
+    The origins of the internet date back to the late 1960s with the creation of the Advanced Research Projects Agency Network (ARPANET) by the U.S. Department of Defense. ARPANET was initially designed as a military defense communication system during the Cold War. It was the first network to implement the Transmission Control Protocol/Internet Protocol (TCP/IP), a suite of communication protocols that became the technical foundation of the modern internet. TCP/IP enables different networks to interconnect and communicate seamlessly.
+    
+    During the 1970s, commercial packet-switched networks emerged, primarily to provide remote computer access. The National Science Foundation (NSF) played a crucial role in expanding internet access to the scientific and academic communities and helped establish TCP/IP as the standard protocol for federally supported research networks. The internet's growth was further supported by interconnected commercial backbones linked through Network Access Points (NAPs), which facilitated data exchange between different service providers.
+    
+    ## 3. Development of the World Wide Web
+    
+    In 1989, Tim Berners-Lee, working at CERN, invented the World Wide Web (WWW), which introduced a system of linked information accessible to anyone connected to the internet. By December 1990, Berners-Lee had developed the essential components of the Web: HyperText Transfer Protocol (HTTP), HyperText Markup Language (HTML), the first web browser/editor, the first web server, and the first website. These innovations allowed users to easily access and navigate information through hyperlinks.
+    
+    Initially serving CERN scientists, the Web rapidly expanded to the public. In 1994, Berners-Lee founded the World Wide Web Consortium (W3C) at MIT to develop open standards ensuring the Web's interoperability and growth.
+    
+    The Web has evolved through several stages:
+    
+    - **Web 1.0**: Characterized by static, read-only web pages.
+    - **Web 2.0**: Beginning around 2003, marked by interactive, user-generated content and social media platforms.
+    - **Web 3.0**: Emerging from 2014 onwards, focusing on the semantic web and intelligent data processing to create more personalized and meaningful online experiences.
+    
+    ## 4. Development of the Internet in the 21st Century
+    
+    The 21st century has witnessed transformative advancements in internet technology and infrastructure. Broadband and fiber-optic technologies have significantly increased data transmission speeds. The proliferation of smartphones revolutionized mobile internet access, enabling users to connect anytime and anywhere.
+    
+    Cloud computing emerged as a paradigm shift, allowing data storage and processing on remote servers rather than local devices. This innovation has changed how businesses and individuals manage information and applications.
+    
+    The Internet of Things (IoT) has expanded the internet's reach by connecting everyday devices—such as home appliances, vehicles, and wearable technology—to the network, enabling new functionalities and data-driven services.
+    
+    Social media platforms have become dominant forces in communication and information sharing, reshaping social interactions, marketing, and news dissemination.
+    
+    The internet's infrastructure and services continue to evolve to meet increasing data demands and support emerging technologies.
+    
+    ## 5. Social Impact of the Internet
+    
+    Beyond technological advancements, the internet has profoundly impacted society, the economy, and culture. It has democratized access to information, facilitated global communication, and enabled new forms of social interaction. Economically, it has created new industries, transformed traditional business models, and fostered innovation. Culturally, the internet has influenced media consumption, education, and the way communities form and interact.
+    
+    However, these changes also bring challenges such as privacy concerns, digital divides, misinformation, and cybersecurity threats, which require ongoing attention and management.
+    
+    ## 6. Conclusion
+    
+    The history of the internet is marked by continuous innovation and expansion, from its military origins to a global network integral to modern life. Key figures like Tim Berners-Lee and organizations such as ARPANET, NSF, and W3C have played pivotal roles in its development. Understanding the technical foundations, evolutionary stages of the Web, and recent technological trends provides valuable context for appreciating the internet's role today. Incorporating social impact considerations offers a more holistic view of this transformative technology.
+    
+    ---
+    
+    *Note: This report would benefit from citations to authoritative sources for historical facts and technical explanations to enhance credibility and provide readers with avenues for further research.*
+    
+    
+
@@ -0,0 +1,213 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/anthropic_agent.ipynb
+toc: True
+title: "Function Calling Anthropic Agent"
+featured: False
+experimental: False
+tags: ['Agent', 'Integrations']
+language: py
+---
+This notebook shows you how to use our Anthropic agent, powered by function calling capabilities.
+
+**NOTE:** Only claude-3* models support function calling using Anthropic's API.
+
+## Initial Setup 
+
+Let's start by importing some simple building blocks.  
+
+The main thing we need is:
+1. the Anthropic API (using our own `llama_index` LLM class)
+2. a place to keep conversation history 
+3. a definition for tools that our agent can use.
+
+If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.
+
+
+
+```python
+%pip install llama-index
+%pip install llama-index-llms-anthropic
+%pip install llama-index-embeddings-openai
+```
+
+Let's define some very simple calculator tools for our agent.
+
+
+```python
+def multiply(a: int, b: int) -> int:
+    """Multiple two integers and returns the result integer"""
+    return a * b
+
+
+def add(a: int, b: int) -> int:
+    """Add two integers and returns the result integer"""
+    return a + b
+```
+
+Make sure your ANTHROPIC_API_KEY is set. Otherwise explicitly specify the `api_key` parameter.
+
+
+```python
+from llama_index.llms.anthropic import Anthropic
+
+llm = Anthropic(model="claude-3-opus-20240229", api_key="sk-...")
+```
+
+## Initialize Anthropic Agent
+
+Here we initialize a simple Mistral agent with calculator functions.
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent
+
+agent = FunctionAgent(
+    tools=[multiply, add],
+    llm=llm,
+)
+```
+
+
+```python
+from llama_index.core.agent.workflow import ToolCallResult
+
+
+async def run_agent_verbose(query: str):
+    handler = agent.run(query)
+    async for event in handler.stream_events():
+        if isinstance(event, ToolCallResult):
+            print(
+                f"Called tool {event.tool_name} with args {event.tool_kwargs}\nGot result: {event.tool_output}"
+            )
+
+    return await handler
+```
+
+### Chat
+
+
+```python
+response = await run_agent_verbose("What is (121 + 2) * 5?")
+print(str(response))
+```
+
+    Called tool add with args {'a': 121, 'b': 2}
+    Got result: 123
+    Called tool multiply with args {'a': 123, 'b': 5}
+    Got result: 615
+    Therefore, (121 + 2) * 5 = 615
+
+
+
+```python
+# inspect sources
+print(response.tool_calls)
+```
+
+    [ToolCallResult(tool_name='add', tool_kwargs={'a': 121, 'b': 2}, tool_id='toolu_01MH6ME7ppxGPSJcCMEUAN5Q', tool_output=ToolOutput(content='123', tool_name='add', raw_input={'args': (), 'kwargs': {'a': 121, 'b': 2}}, raw_output=123, is_error=False), return_direct=False), ToolCallResult(tool_name='multiply', tool_kwargs={'a': 123, 'b': 5}, tool_id='toolu_01JE5TVERND5YC97E68gYoPw', tool_output=ToolOutput(content='615', tool_name='multiply', raw_input={'args': (), 'kwargs': {'a': 123, 'b': 5}}, raw_output=615, is_error=False), return_direct=False)]
+
+
+### Managing Context/Memory
+
+By default, `.run()` is stateless. If you want to maintain state, you can pass in a `context` object.
+
+
+```python
+from llama_index.core.workflow import Context
+
+ctx = Context(agent)
+
+response = await agent.run("My name is John Doe", ctx=ctx)
+response = await agent.run("What is my name?", ctx=ctx)
+
+print(str(response))
+```
+
+## Anthropic Agent over RAG Pipeline
+
+Build a Anthropic agent over a simple 10K document. We use OpenAI embeddings and claude-3-haiku-20240307 to construct the RAG pipeline, and pass it to the Anthropic Opus agent as a tool.
+
+
+```python
+!mkdir -p 'data/10k/'
+!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/uber_2021.pdf' -O 'data/10k/uber_2021.pdf'
+```
+
+    --2025-03-24 12:52:55--  https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/uber_2021.pdf
+    Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.108.133, 185.199.109.133, ...
+    Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.
+    HTTP request sent, awaiting response... 200 OK
+    Length: 1880483 (1.8M) [application/octet-stream]
+    Saving to: ‘data/10k/uber_2021.pdf’
+    
+    data/10k/uber_2021. 100%[===================>]   1.79M  8.98MB/s    in 0.2s    
+    
+    2025-03-24 12:52:56 (8.98 MB/s) - ‘data/10k/uber_2021.pdf’ saved [1880483/1880483]
+    
+
+
+
+```python
+from llama_index.core.tools import QueryEngineTool
+from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
+from llama_index.embeddings.openai import OpenAIEmbedding
+from llama_index.llms.anthropic import Anthropic
+
+embed_model = OpenAIEmbedding(
+    model_name="text-embedding-3-large", api_key="sk-proj-..."
+)
+query_llm = Anthropic(model="claude-3-haiku-20240307", api_key="sk-...")
+
+# load data
+uber_docs = SimpleDirectoryReader(
+    input_files=["./data/10k/uber_2021.pdf"]
+).load_data()
+
+# build index
+uber_index = VectorStoreIndex.from_documents(
+    uber_docs, embed_model=embed_model
+)
+uber_engine = uber_index.as_query_engine(similarity_top_k=3, llm=query_llm)
+query_engine_tool = QueryEngineTool.from_defaults(
+    query_engine=uber_engine,
+    name="uber_10k",
+    description=(
+        "Provides information about Uber financials for year 2021. "
+        "Use a detailed plain text question as input to the tool."
+    ),
+)
+```
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent
+
+agent = FunctionAgent(tools=[query_engine_tool], llm=llm, verbose=True)
+```
+
+
+```python
+response = await agent.run(
+    "Tell me both the risk factors and tailwinds for Uber?"
+)
+print(str(response))
+```
+
+    In summary, based on Uber's 2021 10-K filing, some of the company's key risk factors included:
+    
+    - Significant expected increases in operating expenses 
+    - Challenges attracting and retaining drivers, consumers, merchants, shippers, and carriers
+    - Risks to Uber's brand and reputation
+    - Challenges from Uber's historical workplace culture 
+    - Difficulties optimizing organizational structure and managing growth
+    - Risks related to criminal activity by platform users
+    - Risks from new offerings and technologies like autonomous vehicles
+    - Data security and privacy risks
+    - Climate change exposure 
+    - Reliance on third-party platforms
+    - Regulatory and legal risks
+    - Intellectual property risks
+    
+    In terms of growth opportunities and tailwinds, Uber's strategy in 2021 focused on restructuring by divesting certain markets and business lines, and instead partnering with and taking minority ownership positions in local ridesharing and delivery companies in those markets. This suggests Uber saw opportunities to still participate in the growth of those markets through its investments, rather than operating independently.
+
@@ -0,0 +1,157 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/bedrock_converse_agent.ipynb
+toc: True
+title: "Function Calling AWS Bedrock Converse Agent"
+featured: False
+experimental: False
+tags: ['Agent', 'Integrations']
+language: py
+---
+This notebook shows you how to use our AWS Bedrock Converse agent, powered by function calling capabilities.
+
+## Initial Setup 
+
+Let's start by importing some simple building blocks.  
+
+The main thing we need is:
+1. AWS credentials with access to Bedrock and the Claude Haiku LLM
+2. a place to keep conversation history 
+3. a definition for tools that our agent can use.
+
+If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.
+
+
+
+```python
+%pip install llama-index
+%pip install llama-index-llms-bedrock-converse
+%pip install llama-index-embeddings-huggingface
+```
+
+Let's define some very simple calculator tools for our agent.
+
+
+```python
+def multiply(a: int, b: int) -> int:
+    """Multiple two integers and returns the result integer"""
+    return a * b
+
+
+def add(a: int, b: int) -> int:
+    """Add two integers and returns the result integer"""
+    return a + b
+```
+
+Make sure to set your AWS credentials, either the `profile_name` or the keys below.
+
+
+```python
+from llama_index.llms.bedrock_converse import BedrockConverse
+
+llm = BedrockConverse(
+    model="anthropic.claude-3-haiku-20240307-v1:0",
+    # NOTE replace with your own AWS credentials
+    aws_access_key_id="AWS Access Key ID to use",
+    aws_secret_access_key="AWS Secret Access Key to use",
+    aws_session_token="AWS Session Token to use",
+    region_name="AWS Region to use, eg. us-east-1",
+)
+```
+
+## Initialize AWS Bedrock Converse Agent
+
+Here we initialize a simple AWS Bedrock Converse agent with calculator functions.
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent
+
+agent = FunctionAgent(
+    tools=[multiply, add],
+    llm=llm,
+)
+```
+
+### Chat
+
+
+```python
+response = await agent.run("What is (121 + 2) * 5?")
+print(str(response))
+```
+
+
+```python
+# inspect sources
+print(response.tool_calls)
+```
+
+## AWS Bedrock Converse Agent over RAG Pipeline
+
+Build an AWS Bedrock Converse agent over a simple 10K document. We use both HuggingFace embeddings and `BAAI/bge-small-en-v1.5` to construct the RAG pipeline, and pass it to the AWS Bedrock Converse agent as a tool.
+
+
+```python
+!mkdir -p 'data/10k/'
+!curl -o 'data/10k/uber_2021.pdf' 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/uber_2021.pdf'
+```
+
+
+```python
+from llama_index.core.tools import QueryEngineTool
+from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
+from llama_index.embeddings.huggingface import HuggingFaceEmbedding
+from llama_index.llms.bedrock_converse import BedrockConverse
+
+embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
+query_llm = BedrockConverse(
+    model="anthropic.claude-3-haiku-20240307-v1:0",
+    # NOTE replace with your own AWS credentials
+    aws_access_key_id="AWS Access Key ID to use",
+    aws_secret_access_key="AWS Secret Access Key to use",
+    aws_session_token="AWS Session Token to use",
+    region_name="AWS Region to use, eg. us-east-1",
+)
+
+# load data
+uber_docs = SimpleDirectoryReader(
+    input_files=["./data/10k/uber_2021.pdf"]
+).load_data()
+
+# build index
+uber_index = VectorStoreIndex.from_documents(
+    uber_docs, embed_model=embed_model
+)
+uber_engine = uber_index.as_query_engine(similarity_top_k=3, llm=query_llm)
+query_engine_tool = QueryEngineTool.from_defaults(
+    query_engine=uber_engine,
+    name="uber_10k",
+    description=(
+        "Provides information about Uber financials for year 2021. "
+        "Use a detailed plain text question as input to the tool."
+    ),
+)
+```
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent
+
+agent = FunctionAgent(
+    tools=[query_engine_tool],
+    llm=llm,
+)
+```
+
+
+```python
+response = await agent.run(
+    "Tell me both the risk factors and tailwinds for Uber? Do two parallel tool calls."
+)
+```
+
+
+```python
+print(str(response))
+```
@@ -0,0 +1,369 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/code_act_agent.ipynb
+toc: True
+title: "Prebuilt CodeAct Agent w/ LlamaIndex"
+featured: False
+experimental: False
+tags: ['Agent']
+language: py
+---
+LlamaIndex offers a prebuilt CodeAct Agent that can be used to write and execute code, inspired by the original [CodeAct paper](https://arxiv.org/abs/2402.01030).
+
+With this agent, you provide an agent with a set of functions, and the agent will write code that uses those functions to help complete the task you give it.
+
+Some advantages of using the CodeAct Agent:
+
+- No need to exhaustively list out all the possible functions that the agent might need
+- The agent can develop complex workflows around your existing functions
+- Can integrate directly with existing API's
+
+Let's walk through a simple example of how to use the CodeAct Agent.
+
+**NOTE:** This example includes code that will execute arbitrary code. This is dangerous, and proper sandboxing should be used in production environments.
+
+## Setup
+
+First, let's configure the LLM we want to use, and provide some functions that we can use in our code.
+
+
+```python
+%pip install -U llama-index-core llama-index-llms-ollama
+```
+
+
+```python
+from llama_index.llms.openai import OpenAI
+
+# Configure the LLM
+llm = OpenAI(model="gpt-4o-mini", api_key="sk-...")
+
+
+# Define a few helper functions
+def add(a: int, b: int) -> int:
+    """Add two numbers together"""
+    return a + b
+
+
+def subtract(a: int, b: int) -> int:
+    """Subtract two numbers"""
+    return a - b
+
+
+def multiply(a: int, b: int) -> int:
+    """Multiply two numbers"""
+    return a * b
+
+
+def divide(a: int, b: int) -> float:
+    """Divide two numbers"""
+    return a / b
+```
+
+## Create a Code Executor 
+
+The `CodeActAgent` will require a specific `code_execute_fn` to execute the code generated by the agent.
+
+Below, we define a simple `code_execute_fn` that will execute the code in-process and maintain execution state.
+
+**NOTE:** In a production environment, you should use a more robust method of executing code. This is just for demonstration purposes, and executing code in-process is dangerous. Consider using docker or external services to execute code.
+
+With this executor, we can pass in a dictionary of local and global variables to use in the execution context.
+
+- `locals`: Local variables to use in the execution context, this includes our functions that we want the LLM to code around
+- `globals`: Global variables to use in the execution context, this includes the builtins and other imported modules we want to use in the execution context
+
+
+```python
+from typing import Any, Dict, Tuple
+import io
+import contextlib
+import ast
+import traceback
+
+
+class SimpleCodeExecutor:
+    """
+    A simple code executor that runs Python code with state persistence.
+
+    This executor maintains a global and local state between executions,
+    allowing for variables to persist across multiple code runs.
+
+    NOTE: not safe for production use! Use with caution.
+    """
+
+    def __init__(self, locals: Dict[str, Any], globals: Dict[str, Any]):
+        """
+        Initialize the code executor.
+
+        Args:
+            locals: Local variables to use in the execution context
+            globals: Global variables to use in the execution context
+        """
+        # State that persists between executions
+        self.globals = globals
+        self.locals = locals
+
+    def execute(self, code: str) -> Tuple[bool, str, Any]:
+        """
+        Execute Python code and capture output and return values.
+
+        Args:
+            code: Python code to execute
+
+        Returns:
+            Dict with keys `success`, `output`, and `return_value`
+        """
+        # Capture stdout and stderr
+        stdout = io.StringIO()
+        stderr = io.StringIO()
+
+        output = ""
+        return_value = None
+        try:
+            # Execute with captured output
+            with contextlib.redirect_stdout(
+                stdout
+            ), contextlib.redirect_stderr(stderr):
+                # Try to detect if there's a return value (last expression)
+                try:
+                    tree = ast.parse(code)
+                    last_node = tree.body[-1] if tree.body else None
+
+                    # If the last statement is an expression, capture its value
+                    if isinstance(last_node, ast.Expr):
+                        # Split code to add a return value assignment
+                        last_line = code.rstrip().split("\n")[-1]
+                        exec_code = (
+                            code[: -len(last_line)]
+                            + "\n__result__ = "
+                            + last_line
+                        )
+
+                        # Execute modified code
+                        exec(exec_code, self.globals, self.locals)
+                        return_value = self.locals.get("__result__")
+                    else:
+                        # Normal execution
+                        exec(code, self.globals, self.locals)
+                except:
+                    # If parsing fails, just execute the code as is
+                    exec(code, self.globals, self.locals)
+
+            # Get output
+            output = stdout.getvalue()
+            if stderr.getvalue():
+                output += "\n" + stderr.getvalue()
+
+        except Exception as e:
+            # Capture exception information
+            output = f"Error: {type(e).__name__}: {str(e)}\n"
+            output += traceback.format_exc()
+
+        if return_value is not None:
+            output += "\n\n" + str(return_value)
+
+        return output
+```
+
+
+```python
+code_executor = SimpleCodeExecutor(
+    # give access to our functions defined above
+    locals={
+        "add": add,
+        "subtract": subtract,
+        "multiply": multiply,
+        "divide": divide,
+    },
+    globals={
+        # give access to all builtins
+        "__builtins__": __builtins__,
+        # give access to numpy
+        "np": __import__("numpy"),
+    },
+)
+```
+
+## Setup the CodeAct Agent
+
+Now that we have our code executor, we can setup the CodeAct Agent.
+
+
+
+```python
+from llama_index.core.agent.workflow import CodeActAgent
+from llama_index.core.workflow import Context
+
+agent = CodeActAgent(
+    code_execute_fn=code_executor.execute,
+    llm=llm,
+    tools=[add, subtract, multiply, divide],
+)
+
+# context to hold the agent's session/state/chat history
+ctx = Context(agent)
+```
+
+## Use the Agent
+
+Now that we have our agent, we can use it to complete tasks! Since we gave it some math functions, we will prompt it for tasks that require calculations.
+
+
+```python
+from llama_index.core.agent.workflow import (
+    ToolCall,
+    ToolCallResult,
+    AgentStream,
+)
+
+
+async def run_agent_verbose(agent, ctx, query):
+    handler = agent.run(query, ctx=ctx)
+    print(f"User:  {query}")
+    async for event in handler.stream_events():
+        if isinstance(event, ToolCallResult):
+            print(
+                f"\n-----------\nCode execution result:\n{event.tool_output}"
+            )
+        elif isinstance(event, ToolCall):
+            print(f"\n-----------\nParsed code:\n{event.tool_kwargs['code']}")
+        elif isinstance(event, AgentStream):
+            print(f"{event.delta}", end="", flush=True)
+
+    return await handler
+```
+
+Here, the agent uses some built-in functions to calculate the sum of all numbers from 1 to 10.
+
+
+```python
+response = await run_agent_verbose(
+    agent, ctx, "Calculate the sum of all numbers from 1 to 10"
+)
+```
+
+    User:  Calculate the sum of all numbers from 1 to 10
+    The sum of all numbers from 1 to 10 can be calculated using the formula for the sum of an arithmetic series. However, I will compute it directly for you.
+    
+    <execute>
+    # Calculate the sum of numbers from 1 to 10
+    total_sum = sum(range(1, 11))
+    print(total_sum)
+    </execute>
+    -----------
+    Parsed code:
+    # Calculate the sum of numbers from 1 to 10
+    total_sum = sum(range(1, 11))
+    print(total_sum)
+    
+    -----------
+    Code execution result:
+    55
+    
+    The sum of all numbers from 1 to 10 is 55.
+
+Next, we get the agent to use the tools that we passed in.
+
+
+```python
+response = await run_agent_verbose(
+    agent, ctx, "Add 5 and 3, then multiply the result by 2"
+)
+```
+
+    User:  Add 5 and 3, then multiply the result by 2
+    I will perform the addition of 5 and 3, and then multiply the result by 2.
+    
+    <execute>
+    # Perform the calculation
+    addition_result = add(5, 3)
+    final_result = multiply(addition_result, 2)
+    print(final_result)
+    </execute>
+    -----------
+    Parsed code:
+    # Perform the calculation
+    addition_result = add(5, 3)
+    final_result = multiply(addition_result, 2)
+    print(final_result)
+    
+    -----------
+    Code execution result:
+    16
+    
+    The result of adding 5 and 3, then multiplying by 2, is 16.
+
+We can even get the agent to define new functions for us!
+
+
+```python
+response = await run_agent_verbose(
+    agent, ctx, "Calculate the sum of the first 10 fibonacci numbers"
+)
+```
+
+    User:  Calculate the sum of the first 10 fibonacci numbers
+    I will calculate the sum of the first 10 Fibonacci numbers.
+    
+    <execute>
+    def fibonacci(n):
+        fib_sequence = [0, 1]
+        for i in range(2, n):
+            fib_sequence.append(fib_sequence[-1] + fib_sequence[-2])
+        return fib_sequence
+    
+    # Calculate the sum of the first 10 Fibonacci numbers
+    first_10_fib = fibonacci(10)
+    fibonacci_sum = sum(first_10_fib)
+    print(fibonacci_sum)
+    </execute>
+    -----------
+    Parsed code:
+    def fibonacci(n):
+        fib_sequence = [0, 1]
+        for i in range(2, n):
+            fib_sequence.append(fib_sequence[-1] + fib_sequence[-2])
+        return fib_sequence
+    
+    # Calculate the sum of the first 10 Fibonacci numbers
+    first_10_fib = fibonacci(10)
+    fibonacci_sum = sum(first_10_fib)
+    print(fibonacci_sum)
+    
+    -----------
+    Code execution result:
+    88
+    
+    The sum of the first 10 Fibonacci numbers is 88.
+
+And then reuse those new functions in a new task!
+
+
+```python
+response = await run_agent_verbose(
+    agent, ctx, "Calculate the sum of the first 20 fibonacci numbers"
+)
+```
+
+    User:  Calculate the sum of the first 20 fibonacci numbers
+    I will calculate the sum of the first 20 Fibonacci numbers.
+    
+    <execute>
+    # Calculate the sum of the first 20 Fibonacci numbers
+    first_20_fib = fibonacci(20)
+    fibonacci_sum_20 = sum(first_20_fib)
+    print(fibonacci_sum_20)
+    </execute>
+    -----------
+    Parsed code:
+    # Calculate the sum of the first 20 Fibonacci numbers
+    first_20_fib = fibonacci(20)
+    fibonacci_sum_20 = sum(first_20_fib)
+    print(fibonacci_sum_20)
+    
+    -----------
+    Code execution result:
+    10945
+    
+    The sum of the first 20 Fibonacci numbers is 10,945.
@@ -1,11 +1,12 @@
 ---
 layout: recipe
-colab: https://colab.research.google.com/github/TuanaCelik/cookbooks-demo/blob/main/notebooks/agent/custom_multi_agent.ipynb
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/custom_multi_agent.ipynb
 toc: True
 title: "Custom Planning Multi-Agent System"
 featured: True
 experimental: False
 tags: ['Agent']
+language: py
 ---
 <a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/docs/examples/agent/custom_multi_agent.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

@@ -0,0 +1,505 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/from_scratch_code_act_agent.ipynb
+toc: True
+title: "Creating a CodeAct Agent From Scratch"
+featured: False
+experimental: False
+tags: ['Agent']
+language: py
+---
+While LlamaIndex provides a pre-built [CodeActAgent](https://docs.llamaindex.ai/en/stable/examples/agent/code_act_agent/), we can also create our own from scratch.
+
+This way, we can fully understand and customize the agent's behaviour beyond what is provided by the pre-built agent.
+
+In this notebook, we will
+1. Create a workflow for generating and parsing code
+2. Implement basic code execution
+3. Add memory and state to the agent
+
+## Setting up Functions for our Agent
+
+If we want our agent to execute our code, we need to deine the code for it to execute!
+
+For now, let's use a few basic math functions.
+
+
+```python
+# Define a few helper functions
+def add(a: int, b: int) -> int:
+    """Add two numbers together"""
+    return a + b
+
+
+def subtract(a: int, b: int) -> int:
+    """Subtract two numbers"""
+    return a - b
+
+
+def multiply(a: int, b: int) -> int:
+    """Multiply two numbers"""
+    return a * b
+
+
+def divide(a: int, b: int) -> float:
+    """Divide two numbers"""
+    return a / b
+```
+
+## Creating a Code Executor 
+
+In order to execute code, we need to create a code executor.
+
+Here, we will use a simple in-process code executor that maintains it's own state.
+
+**NOTE:** This is a simple example, and does not include proper sandboxing. In a production environment, you should use tools like docker or proper code sandboxing environments.
+
+
+```python
+from typing import Any, Dict, Tuple
+import io
+import contextlib
+import ast
+import traceback
+
+
+class SimpleCodeExecutor:
+    """
+    A simple code executor that runs Python code with state persistence.
+
+    This executor maintains a global and local state between executions,
+    allowing for variables to persist across multiple code runs.
+
+    NOTE: not safe for production use! Use with caution.
+    """
+
+    def __init__(self, locals: Dict[str, Any], globals: Dict[str, Any]):
+        """
+        Initialize the code executor.
+
+        Args:
+            locals: Local variables to use in the execution context
+            globals: Global variables to use in the execution context
+        """
+        # State that persists between executions
+        self.globals = globals
+        self.locals = locals
+
+    def execute(self, code: str) -> Tuple[bool, str, Any]:
+        """
+        Execute Python code and capture output and return values.
+
+        Args:
+            code: Python code to execute
+
+        Returns:
+            Dict with keys `success`, `output`, and `return_value`
+        """
+        # Capture stdout and stderr
+        stdout = io.StringIO()
+        stderr = io.StringIO()
+
+        output = ""
+        return_value = None
+        try:
+            # Execute with captured output
+            with contextlib.redirect_stdout(
+                stdout
+            ), contextlib.redirect_stderr(stderr):
+                # Try to detect if there's a return value (last expression)
+                try:
+                    tree = ast.parse(code)
+                    last_node = tree.body[-1] if tree.body else None
+
+                    # If the last statement is an expression, capture its value
+                    if isinstance(last_node, ast.Expr):
+                        # Split code to add a return value assignment
+                        last_line = code.rstrip().split("\n")[-1]
+                        exec_code = (
+                            code[: -len(last_line)]
+                            + "\n__result__ = "
+                            + last_line
+                        )
+
+                        # Execute modified code
+                        exec(exec_code, self.globals, self.locals)
+                        return_value = self.locals.get("__result__")
+                    else:
+                        # Normal execution
+                        exec(code, self.globals, self.locals)
+                except:
+                    # If parsing fails, just execute the code as is
+                    exec(code, self.globals, self.locals)
+
+            # Get output
+            output = stdout.getvalue()
+            if stderr.getvalue():
+                output += "\n" + stderr.getvalue()
+
+        except Exception as e:
+            # Capture exception information
+            output = f"Error: {type(e).__name__}: {str(e)}\n"
+            output += traceback.format_exc()
+
+        if return_value is not None:
+            output += "\n\n" + str(return_value)
+
+        return output
+```
+
+
+```python
+code_executor = SimpleCodeExecutor(
+    # give access to our functions defined above
+    locals={
+        "add": add,
+        "subtract": subtract,
+        "multiply": multiply,
+        "divide": divide,
+    },
+    globals={
+        # give access to all builtins
+        "__builtins__": __builtins__,
+        # give access to numpy
+        "np": __import__("numpy"),
+    },
+)
+```
+
+## Defining the CodeAct Agent
+
+Now, we can using LlamaIndex Workflows to define the workflow for our agent.
+
+The basic flow is:
+- take in our prompt + chat history
+- parse out the code to execute (if any)
+- execute the code
+- provide the output of the code execution back to the agent
+- repeat until the agent is satisfied with the answer
+
+First, we can create the events in the workflow.
+
+
+```python
+from llama_index.core.llms import ChatMessage
+from llama_index.core.workflow import Event
+
+
+class InputEvent(Event):
+    input: list[ChatMessage]
+
+
+class StreamEvent(Event):
+    delta: str
+
+
+class CodeExecutionEvent(Event):
+    code: str
+```
+
+Next, we can define the workflow that orchestrates using these events.
+
+
+```python
+import inspect
+import re
+from typing import Any, Callable, List
+
+from llama_index.core.llms import ChatMessage, LLM
+from llama_index.core.memory import ChatMemoryBuffer
+from llama_index.core.tools.types import BaseTool
+from llama_index.core.workflow import (
+    Context,
+    Workflow,
+    StartEvent,
+    StopEvent,
+    step,
+)
+from llama_index.llms.openai import OpenAI
+
+
+CODEACT_SYSTEM_PROMPT = """
+You are a helpful assistant that can execute code.
+
+Given the chat history, you can write code within <execute>...</execute> tags to help the user with their question.
+
+In your code, you can reference any previously used variables or functions.
+
+The user has also provided you with some predefined functions:
+{fn_str}
+
+To execute code, write the code between <execute>...</execute> tags.
+"""
+
+
+class CodeActAgent(Workflow):
+    def __init__(
+        self,
+        fns: List[Callable],
+        code_execute_fn: Callable,
+        llm: LLM | None = None,
+        **workflow_kwargs: Any,
+    ) -> None:
+        super().__init__(**workflow_kwargs)
+        self.fns = fns or []
+        self.code_execute_fn = code_execute_fn
+        self.llm = llm or OpenAI(model="gpt-4o-mini")
+
+        # parse the functions into truncated function strings
+        self.fn_str = "\n\n".join(
+            f'def {fn.__name__}{str(inspect.signature(fn))}:\n    """ {fn.__doc__} """\n    ...'
+            for fn in self.fns
+        )
+        self.system_message = ChatMessage(
+            role="system",
+            content=CODEACT_SYSTEM_PROMPT.format(fn_str=self.fn_str),
+        )
+
+    def _parse_code(self, response: str) -> str | None:
+        # find the code between <execute>...</execute> tags
+        matches = re.findall(r"<execute>(.*?)</execute>", response, re.DOTALL)
+        if matches:
+            return "\n\n".join(matches)
+
+        return None
+
+    @step
+    async def prepare_chat_history(
+        self, ctx: Context, ev: StartEvent
+    ) -> InputEvent:
+        # check if memory is setup
+        memory = await ctx.store.get("memory", default=None)
+        if not memory:
+            memory = ChatMemoryBuffer.from_defaults(llm=self.llm)
+
+        # get user input
+        user_input = ev.get("user_input")
+        if user_input is None:
+            raise ValueError("user_input kwarg is required")
+        user_msg = ChatMessage(role="user", content=user_input)
+        memory.put(user_msg)
+
+        # get chat history
+        chat_history = memory.get()
+
+        # update context
+        await ctx.store.set("memory", memory)
+
+        # add the system message to the chat history and return
+        return InputEvent(input=[self.system_message, *chat_history])
+
+    @step
+    async def handle_llm_input(
+        self, ctx: Context, ev: InputEvent
+    ) -> CodeExecutionEvent | StopEvent:
+        chat_history = ev.input
+
+        # stream the response
+        response_stream = await self.llm.astream_chat(chat_history)
+        async for response in response_stream:
+            ctx.write_event_to_stream(StreamEvent(delta=response.delta or ""))
+
+        # save the final response, which should have all content
+        memory = await ctx.store.get("memory")
+        memory.put(response.message)
+        await ctx.store.set("memory", memory)
+
+        # get the code to execute
+        code = self._parse_code(response.message.content)
+
+        if not code:
+            return StopEvent(result=response)
+        else:
+            return CodeExecutionEvent(code=code)
+
+    @step
+    async def handle_code_execution(
+        self, ctx: Context, ev: CodeExecutionEvent
+    ) -> InputEvent:
+        # execute the code
+        ctx.write_event_to_stream(ev)
+        output = self.code_execute_fn(ev.code)
+
+        # update the memory
+        memory = await ctx.store.get("memory")
+        memory.put(ChatMessage(role="assistant", content=output))
+        await ctx.store.set("memory", memory)
+
+        # get the latest chat history and loop back to the start
+        chat_history = memory.get()
+        return InputEvent(input=[self.system_message, *chat_history])
+```
+
+## Testing the CodeAct Agent
+
+Now, we can test out the CodeAct Agent!
+
+We'll create a simple agent and slowly build up the complexity with requests.
+
+
+```python
+from llama_index.core.workflow import Context
+
+agent = CodeActAgent(
+    fns=[add, subtract, multiply, divide],
+    code_execute_fn=code_executor.execute,
+    llm=OpenAI(model="gpt-4o-mini", api_key="sk-..."),
+)
+
+# context to hold the agent's state / memory
+ctx = Context(agent)
+```
+
+
+```python
+async def run_agent_verbose(agent: CodeActAgent, ctx: Context, query: str):
+    handler = agent.run(user_input=query, ctx=ctx)
+    print(f"User:  {query}")
+    async for event in handler.stream_events():
+        if isinstance(event, StreamEvent):
+            print(f"{event.delta}", end="", flush=True)
+        elif isinstance(event, CodeExecutionEvent):
+            print(f"\n-----------\nParsed code:\n{event.code}\n")
+
+    return await handler
+```
+
+
+```python
+response = await run_agent_verbose(
+    agent, ctx, "Calculate the sum of all numbers from 1 to 10"
+)
+```
+
+    User:  Calculate the sum of all numbers from 1 to 10
+    To calculate the sum of all numbers from 1 to 10, we can use the `add` function in a loop. Here's how we can do it:
+    
+    <execute>
+    total_sum = 0
+    for number in range(1, 11):
+        total_sum = add(total_sum, number)
+    total_sum
+    </execute>
+    -----------
+    Parsed code:
+    
+    total_sum = 0
+    for number in range(1, 11):
+        total_sum = add(total_sum, number)
+    total_sum
+    
+    
+    The sum of all numbers from 1 to 10 is 55.
+
+
+```python
+response = await run_agent_verbose(
+    agent, ctx, "Add 5 and 3, then multiply the result by 2"
+)
+```
+
+    User:  Add 5 and 3, then multiply the result by 2
+    To perform the calculation, we will first add 5 and 3 using the `add` function, and then multiply the result by 2 using the `multiply` function. Here's how we can do it:
+    
+    <execute>
+    result_addition = add(5, 3)
+    final_result = multiply(result_addition, 2)
+    final_result
+    </execute>
+    -----------
+    Parsed code:
+    
+    result_addition = add(5, 3)
+    final_result = multiply(result_addition, 2)
+    final_result
+    
+    
+    The final result of adding 5 and 3, then multiplying by 2, is 16.
+
+
+```python
+response = await run_agent_verbose(
+    agent, ctx, "Calculate the sum of the first 10 fibonacci numbers0"
+)
+```
+
+    User:  Calculate the sum of the first 10 fibonacci numbers0
+    To calculate the sum of the first 10 Fibonacci numbers, we first need to generate the Fibonacci sequence up to the 10th number and then sum those numbers. The Fibonacci sequence starts with 0 and 1, and each subsequent number is the sum of the two preceding ones.
+    
+    Here's how we can do it:
+    
+    <execute>
+    def fibonacci(n: int) -> int:
+        """ Return the nth Fibonacci number """
+        if n == 0:
+            return 0
+        elif n == 1:
+            return 1
+        else:
+            a, b = 0, 1
+            for _ in range(2, n + 1):
+                a, b = b, a + b
+            return b
+    
+    # Calculate the sum of the first 10 Fibonacci numbers
+    fibonacci_sum = 0
+    for i in range(10):
+        fibonacci_sum = add(fibonacci_sum, fibonacci(i))
+    
+    fibonacci_sum
+    </execute>
+    -----------
+    Parsed code:
+    
+    def fibonacci(n: int) -> int:
+        """ Return the nth Fibonacci number """
+        if n == 0:
+            return 0
+        elif n == 1:
+            return 1
+        else:
+            a, b = 0, 1
+            for _ in range(2, n + 1):
+                a, b = b, a + b
+            return b
+    
+    # Calculate the sum of the first 10 Fibonacci numbers
+    fibonacci_sum = 0
+    for i in range(10):
+        fibonacci_sum = add(fibonacci_sum, fibonacci(i))
+    
+    fibonacci_sum
+    
+    
+    The sum of the first 10 Fibonacci numbers is 55.
+
+
+```python
+response = await run_agent_verbose(
+    agent, ctx, "Calculate the sum of the first 20 fibonacci numbers"
+)
+```
+
+    User:  Calculate the sum of the first 20 fibonacci numbers
+    To calculate the sum of the first 20 Fibonacci numbers, we can use the same approach as before, but this time we will iterate up to 20. Here's how we can do it:
+    
+    <execute>
+    # Calculate the sum of the first 20 Fibonacci numbers
+    fibonacci_sum_20 = 0
+    for i in range(20):
+        fibonacci_sum_20 = add(fibonacci_sum_20, fibonacci(i))
+    
+    fibonacci_sum_20
+    </execute>
+    -----------
+    Parsed code:
+    
+    # Calculate the sum of the first 20 Fibonacci numbers
+    fibonacci_sum_20 = 0
+    for i in range(20):
+        fibonacci_sum_20 = add(fibonacci_sum_20, fibonacci(i))
+    
+    fibonacci_sum_20
+    
+    
+    The sum of the first 20 Fibonacci numbers is 6765.
@@ -0,0 +1,99 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/memory/chat_memory_buffer.ipynb
+toc: True
+title: "Chat Memory Buffer"
+featured: False
+experimental: False
+tags: ['Agent', 'Memory']
+language: py
+---
+**NOTE:** This example of memory is deprecated in favor of the newer and more flexible `Memory` class. See the [latest docs](https://docs.llamaindex.ai/en/stable/module_guides/deploying/agents/memory/).
+
+The `ChatMemoryBuffer` is a memory buffer that simply stores the last X messages that fit into a token limit.
+
+%pip install llama-index-core
+
+## Setup
+
+
+```python
+from llama_index.core.memory import ChatMemoryBuffer
+
+memory = ChatMemoryBuffer.from_defaults(token_limit=40000)
+```
+
+## Using Standalone
+
+
+```python
+from llama_index.core.llms import ChatMessage
+
+chat_history = [
+    ChatMessage(role="user", content="Hello, how are you?"),
+    ChatMessage(role="assistant", content="I'm doing well, thank you!"),
+]
+
+# put a list of messages
+memory.put_messages(chat_history)
+
+# put one message at a time
+# memory.put_message(chat_history[0])
+```
+
+
+```python
+# Get the last X messages that fit into a token limit
+history = memory.get()
+```
+
+
+```python
+# Get all messages
+all_history = memory.get_all()
+```
+
+
+```python
+# clear the memory
+memory.reset()
+```
+
+## Using with Agents
+
+You can set the memory in any agent in the `.run()` method.
+
+
+```python
+import os
+
+os.environ["OPENAI_API_KEY"] = "sk-proj-..."
+```
+
+
+```python
+from llama_index.core.agent.workflow import ReActAgent, FunctionAgent
+from llama_index.core.workflow import Context
+from llama_index.llms.openai import OpenAI
+
+
+memory = ChatMemoryBuffer.from_defaults(token_limit=40000)
+
+agent = FunctionAgent(tools=[], llm=OpenAI(model="gpt-4o-mini"))
+
+# context to hold the chat history/state
+ctx = Context(agent)
+```
+
+
+```python
+resp = await agent.run("Hello, how are you?", ctx=ctx, memory=memory)
+```
+
+
+```python
+print(memory.get_all())
+```
+
+    [ChatMessage(role=<MessageRole.USER: 'user'>, additional_kwargs={}, blocks=[TextBlock(block_type='text', text='Hello, how are you?')]), ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, additional_kwargs={}, blocks=[TextBlock(block_type='text', text="Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?")])]
+
@@ -0,0 +1,512 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/memory/composable_memory.ipynb
+toc: True
+title: "Simple Composable Memory"
+featured: False
+experimental: False
+tags: ['Agent', 'Memory']
+language: py
+---
+**NOTE:** This example of memory is deprecated in favor of the newer and more flexible `Memory` class. See the [latest docs](https://docs.llamaindex.ai/en/stable/module_guides/deploying/agents/memory/).
+
+In this notebook, we demonstrate how to inject multiple memory sources into an agent. Specifically, we use the `SimpleComposableMemory` which is comprised of a `primary_memory` as well as potentially several secondary memory sources (stored in `secondary_memory_sources`). The main difference is that `primary_memory` will be used as the main chat buffer for the agent, where as any retrieved messages from `secondary_memory_sources` will be injected to the system prompt message only.
+
+Multiple memory sources may be of use for example in situations where you have a longer-term memory such as `VectorMemory` that you want to use in addition to the default `ChatMemoryBuffer`. What you'll see in this notebook is that with a `SimpleComposableMemory` you'll be able to effectively "load" the desired messages from long-term memory into the main memory (i.e. the `ChatMemoryBuffer`).
+
+## How `SimpleComposableMemory` Works?
+
+We begin with the basic usage of the `SimpleComposableMemory`. Here we construct a `VectorMemory` as well as a default `ChatMemoryBuffer`. The `VectorMemory` will be our secondary memory source, whereas the `ChatMemoryBuffer` will be the main or primary one. To instantiate a `SimpleComposableMemory` object, we need to supply a `primary_memory` and (optionally) a list of `secondary_memory_sources`.
+
+![SimpleComposableMemoryIllustration](https://d3ddy8balm3goa.cloudfront.net/llamaindex/simple-composable-memory.excalidraw.svg)
+
+
+```python
+from llama_index.core.memory import (
+    VectorMemory,
+    SimpleComposableMemory,
+    ChatMemoryBuffer,
+)
+from llama_index.core.llms import ChatMessage
+from llama_index.embeddings.openai import OpenAIEmbedding
+
+vector_memory = VectorMemory.from_defaults(
+    vector_store=None,  # leave as None to use default in-memory vector store
+    embed_model=OpenAIEmbedding(),
+    retriever_kwargs={"similarity_top_k": 1},
+)
+
+# let's set some initial messages in our secondary vector memory
+msgs = [
+    ChatMessage.from_str("You are a SOMEWHAT helpful assistant.", "system"),
+    ChatMessage.from_str("Bob likes burgers.", "user"),
+    ChatMessage.from_str("Indeed, Bob likes apples.", "assistant"),
+    ChatMessage.from_str("Alice likes apples.", "user"),
+]
+vector_memory.set(msgs)
+
+chat_memory_buffer = ChatMemoryBuffer.from_defaults()
+
+composable_memory = SimpleComposableMemory.from_defaults(
+    primary_memory=chat_memory_buffer,
+    secondary_memory_sources=[vector_memory],
+)
+```
+
+
+```python
+composable_memory.primary_memory
+```
+
+
+
+
+    ChatMemoryBuffer(chat_store=SimpleChatStore(store={}), chat_store_key='chat_history', token_limit=3000, tokenizer_fn=functools.partial(<bound method Encoding.encode of <Encoding 'cl100k_base'>>, allowed_special='all'))
+
+
+
+
+```python
+composable_memory.secondary_memory_sources
+```
+
+
+
+
+    [VectorMemory(vector_index=<llama_index.core.indices.vector_store.base.VectorStoreIndex object at 0x137b912a0>, retriever_kwargs={'similarity_top_k': 1}, batch_by_user_message=True, cur_batch_textnode=TextNode(id_='288b0ef3-570e-4698-a1ae-b3531df66361', embedding=None, metadata={'sub_dicts': [{'role': <MessageRole.USER: 'user'>, 'content': 'Alice likes apples.', 'additional_kwargs': {}}]}, excluded_embed_metadata_keys=['sub_dicts'], excluded_llm_metadata_keys=['sub_dicts'], relationships={}, text='Alice likes apples.', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n'))]
+
+
+
+### `put()` messages into memory
+
+Since `SimpleComposableMemory` is itself a subclass of `BaseMemory`, we add messages to it in the same way as we do for other memory modules. Note that for `SimpleComposableMemory`, invoking `.put()` effectively calls `.put()` on all memory sources. In other words, the message gets added to `primary` and `secondary` sources.
+
+
+```python
+msgs = [
+    ChatMessage.from_str("You are a REALLY helpful assistant.", "system"),
+    ChatMessage.from_str("Jerry likes juice.", "user"),
+]
+```
+
+
+```python
+# load into all memory sources modules"
+for m in msgs:
+    composable_memory.put(m)
+```
+
+### `get()` messages from memory
+
+When `.get()` is invoked, we similarly execute all of the `.get()` methods of `primary` memory as well as all of the `secondary` sources. This leaves us with sequence of lists of messages that we have to must "compose" into a sensible single set of messages (to pass downstream to our agents). Special care must be applied here in general to ensure that the final sequence of messages are both sensible and conform to the chat APIs of the LLM provider.
+
+For `SimpleComposableMemory`, we **inject the messages from the `secondary` sources in the system message of the `primary` memory**. The rest of the message history of the `primary` source is left intact, and this composition is what is ultimately returned.
+
+
+```python
+msgs = composable_memory.get("What does Bob like?")
+msgs
+```
+
+
+
+
+    [ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content='You are a REALLY helpful assistant.\n\nBelow are a set of relevant dialogues retrieved from potentially several memory sources:\n\n=====Relevant messages from memory source 1=====\n\n\tUSER: Bob likes burgers.\n\tASSISTANT: Indeed, Bob likes apples.\n\n=====End of relevant messages from memory source 1======\n\nThis is the end of the retrieved message dialogues.', additional_kwargs={}),
+     ChatMessage(role=<MessageRole.USER: 'user'>, content='Jerry likes juice.', additional_kwargs={})]
+
+
+
+
+```python
+# see the memory injected into the system message of the primary memory
+print(msgs[0])
+```
+
+    system: You are a REALLY helpful assistant.
+    
+    Below are a set of relevant dialogues retrieved from potentially several memory sources:
+    
+    =====Relevant messages from memory source 1=====
+    
+    	USER: Bob likes burgers.
+    	ASSISTANT: Indeed, Bob likes apples.
+    
+    =====End of relevant messages from memory source 1======
+    
+    This is the end of the retrieved message dialogues.
+
+
+### Successive calls to `get()`
+
+Successive calls of `get()` will simply replace the loaded `secondary` memory messages in the system prompt.
+
+
+```python
+msgs = composable_memory.get("What does Alice like?")
+msgs
+```
+
+
+
+
+    [ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content='You are a REALLY helpful assistant.\n\nBelow are a set of relevant dialogues retrieved from potentially several memory sources:\n\n=====Relevant messages from memory source 1=====\n\n\tUSER: Alice likes apples.\n\n=====End of relevant messages from memory source 1======\n\nThis is the end of the retrieved message dialogues.', additional_kwargs={}),
+     ChatMessage(role=<MessageRole.USER: 'user'>, content='Jerry likes juice.', additional_kwargs={})]
+
+
+
+
+```python
+# see the memory injected into the system message of the primary memory
+print(msgs[0])
+```
+
+    system: You are a REALLY helpful assistant.
+    
+    Below are a set of relevant dialogues retrieved from potentially several memory sources:
+    
+    =====Relevant messages from memory source 1=====
+    
+    	USER: Alice likes apples.
+    
+    =====End of relevant messages from memory source 1======
+    
+    This is the end of the retrieved message dialogues.
+
+
+### What if `get()` retrieves `secondary` messages that already exist in `primary` memory?
+
+In the event that messages retrieved from `secondary` memory already exist in `primary` memory, then these rather redundant secondary messages will not get added to the system message. In the below example, the message "Jerry likes juice." was `put` into all memory sources, so the system message is not altered.
+
+
+```python
+msgs = composable_memory.get("What does Jerry like?")
+msgs
+```
+
+
+
+
+    [ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content='You are a REALLY helpful assistant.', additional_kwargs={}),
+     ChatMessage(role=<MessageRole.USER: 'user'>, content='Jerry likes juice.', additional_kwargs={})]
+
+
+
+### How to `reset` memory 
+
+Similar to the other methods `put()` and `get()`, calling `reset()` will execute `reset()` on both the `primary` and `secondary` memory sources. If you want to reset only the `primary` then you should call the `reset()` method only from it.
+
+#### `reset()` only primary memory
+
+
+```python
+composable_memory.primary_memory.reset()
+```
+
+
+```python
+composable_memory.primary_memory.get()
+```
+
+
+
+
+    []
+
+
+
+
+```python
+composable_memory.secondary_memory_sources[0].get("What does Alice like?")
+```
+
+
+
+
+    [ChatMessage(role=<MessageRole.USER: 'user'>, content='Alice likes apples.', additional_kwargs={})]
+
+
+
+#### `reset()` all memory sources
+
+
+```python
+composable_memory.reset()
+```
+
+
+```python
+composable_memory.primary_memory.get()
+```
+
+
+
+
+    []
+
+
+
+
+```python
+composable_memory.secondary_memory_sources[0].get("What does Alice like?")
+```
+
+
+
+
+    []
+
+
+
+## Use `SimpleComposableMemory` With An Agent
+
+Here we will use a `SimpleComposableMemory` with an agent and demonstrate how a secondary, long-term memory source can be used to use messages from on agent conversation as part of another conversation with another agent session.
+
+
+```python
+from llama_index.llms.openai import OpenAI
+from llama_index.core.tools import FunctionTool
+from llama_index.core.agent import FunctionCallingAgent
+
+import nest_asyncio
+
+nest_asyncio.apply()
+```
+
+### Define our memory modules
+
+
+```python
+vector_memory = VectorMemory.from_defaults(
+    vector_store=None,  # leave as None to use default in-memory vector store
+    embed_model=OpenAIEmbedding(),
+    retriever_kwargs={"similarity_top_k": 2},
+)
+
+chat_memory_buffer = ChatMemoryBuffer.from_defaults()
+
+composable_memory = SimpleComposableMemory.from_defaults(
+    primary_memory=chat_memory_buffer,
+    secondary_memory_sources=[vector_memory],
+)
+```
+
+### Define our Agent
+
+
+```python
+def multiply(a: int, b: int) -> int:
+    """Multiply two integers and returns the result integer"""
+    return a * b
+
+
+def mystery(a: int, b: int) -> int:
+    """Mystery function on two numbers"""
+    return a**2 - b**2
+
+
+multiply_tool = FunctionTool.from_defaults(fn=multiply)
+mystery_tool = FunctionTool.from_defaults(fn=mystery)
+```
+
+
+```python
+llm = OpenAI(model="gpt-3.5-turbo-0613")
+agent = FunctionCallingAgent.from_tools(
+    [multiply_tool, mystery_tool],
+    llm=llm,
+    memory=composable_memory,
+    verbose=True,
+)
+```
+
+### Execute some function calls
+
+When `.chat()` is invoked, the messages are put into the composable memory, which we understand from the previous section implies that all the messages are put in both `primary` and `secondary` sources.
+
+
+```python
+response = agent.chat("What is the mystery function on 5 and 6?")
+```
+
+    Added user message to memory: What is the mystery function on 5 and 6?
+    === Calling Function ===
+    Calling function: mystery with args: {"a": 5, "b": 6}
+    === Function Output ===
+    -11
+    === LLM Response ===
+    The mystery function on 5 and 6 returns -11.
+
+
+
+```python
+response = agent.chat("What happens if you multiply 2 and 3?")
+```
+
+    Added user message to memory: What happens if you multiply 2 and 3?
+    === Calling Function ===
+    Calling function: multiply with args: {"a": 2, "b": 3}
+    === Function Output ===
+    6
+    === LLM Response ===
+    If you multiply 2 and 3, the result is 6.
+
+
+### New Agent Sessions
+
+Now that we've added the messages to our `vector_memory`, we can see the effect of having this memory be used with a new agent session versus when it is used. Specifically, we ask the new agents to "recall" the outputs of the function calls, rather than re-computing.
+
+#### An Agent without our past memory
+
+
+```python
+llm = OpenAI(model="gpt-3.5-turbo-0613")
+agent_without_memory = FunctionCallingAgent.from_tools(
+    [multiply_tool, mystery_tool], llm=llm, verbose=True
+)
+```
+
+
+```python
+response = agent_without_memory.chat(
+    "What was the output of the mystery function on 5 and 6 again? Don't recompute."
+)
+```
+
+    Added user message to memory: What was the output of the mystery function on 5 and 6 again? Don't recompute.
+    === LLM Response ===
+    I'm sorry, but I don't have access to the previous output of the mystery function on 5 and 6.
+
+
+#### An Agent with our past memory
+
+We see that the agent without access to the our past memory cannot complete the task. With this next agent we will indeed pass in our previous long-term memory (i.e., `vector_memory`). Note that we even use a fresh `ChatMemoryBuffer` which means there is no `chat_history` with this agent. Nonetheless, it will be able to retrieve from our long-term memory to get the past dialogue it needs.
+
+
+```python
+llm = OpenAI(model="gpt-3.5-turbo-0613")
+
+composable_memory = SimpleComposableMemory.from_defaults(
+    primary_memory=ChatMemoryBuffer.from_defaults(),
+    secondary_memory_sources=[
+        vector_memory.copy(
+            deep=True
+        )  # using a copy here for illustration purposes
+        # later will use original vector_memory again
+    ],
+)
+
+agent_with_memory = FunctionCallingAgent.from_tools(
+    [multiply_tool, mystery_tool],
+    llm=llm,
+    memory=composable_memory,
+    verbose=True,
+)
+```
+
+
+```python
+agent_with_memory.chat_history  # an empty chat history
+```
+
+
+
+
+    []
+
+
+
+
+```python
+response = agent_with_memory.chat(
+    "What was the output of the mystery function on 5 and 6 again? Don't recompute."
+)
+```
+
+    Added user message to memory: What was the output of the mystery function on 5 and 6 again? Don't recompute.
+    === LLM Response ===
+    The output of the mystery function on 5 and 6 is -11.
+
+
+
+```python
+response = agent_with_memory.chat(
+    "What was the output of the multiply function on 2 and 3 again? Don't recompute."
+)
+```
+
+    Added user message to memory: What was the output of the multiply function on 2 and 3 again? Don't recompute.
+    === LLM Response ===
+    The output of the multiply function on 2 and 3 is 6.
+
+
+
+```python
+agent_with_memory.chat_history
+```
+
+
+
+
+    [ChatMessage(role=<MessageRole.USER: 'user'>, content="What was the output of the mystery function on 5 and 6 again? Don't recompute.", additional_kwargs={}),
+     ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, content='The output of the mystery function on 5 and 6 is -11.', additional_kwargs={}),
+     ChatMessage(role=<MessageRole.USER: 'user'>, content="What was the output of the multiply function on 2 and 3 again? Don't recompute.", additional_kwargs={}),
+     ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, content='The output of the multiply function on 2 and 3 is 6.', additional_kwargs={})]
+
+
+
+### What happens under the hood with `.chat(user_input)`
+
+Under the hood, `.chat(user_input)` call effectively will call the memory's `.get()` method with `user_input` as the argument. As we learned in the previous section, this will ultimately return a composition of the `primary` and all of the `secondary` memory sources. These composed messages are what is being passed to the LLM's chat API as the chat history.
+
+
+```python
+composable_memory = SimpleComposableMemory.from_defaults(
+    primary_memory=ChatMemoryBuffer.from_defaults(),
+    secondary_memory_sources=[
+        vector_memory.copy(
+            deep=True
+        )  # copy for illustrative purposes to explain what
+        # happened under the hood from previous subsection
+    ],
+)
+agent_with_memory = agent_worker.as_agent(memory=composable_memory)
+```
+
+
+```python
+agent_with_memory.memory.get(
+    "What was the output of the mystery function on 5 and 6 again? Don't recompute."
+)
+```
+
+
+
+
+    [ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content='You are a helpful assistant.\n\nBelow are a set of relevant dialogues retrieved from potentially several memory sources:\n\n=====Relevant messages from memory source 1=====\n\n\tUSER: What is the mystery function on 5 and 6?\n\tASSISTANT: None\n\tTOOL: -11\n\tASSISTANT: The mystery function on 5 and 6 returns -11.\n\n=====End of relevant messages from memory source 1======\n\nThis is the end of the retrieved message dialogues.', additional_kwargs={})]
+
+
+
+
+```python
+print(
+    agent_with_memory.memory.get(
+        "What was the output of the mystery function on 5 and 6 again? Don't recompute."
+    )[0]
+)
+```
+
+    system: You are a helpful assistant.
+    
+    Below are a set of relevant dialogues retrieved from potentially several memory sources:
+    
+    =====Relevant messages from memory source 1=====
+    
+    	USER: What is the mystery function on 5 and 6?
+    	ASSISTANT: None
+    	TOOL: -11
+    	ASSISTANT: The mystery function on 5 and 6 returns -11.
+    
+    =====End of relevant messages from memory source 1======
+    
+    This is the end of the retrieved message dialogues.
+
@@ -0,0 +1,110 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/memory/summary_memory_buffer.ipynb
+toc: True
+title: "Chat Summary Memory Buffer"
+featured: False
+experimental: False
+tags: ['Agent', 'Memory']
+language: py
+---
+**NOTE:** This example of memory is deprecated in favor of the newer and more flexible `Memory` class. See the [latest docs](https://docs.llamaindex.ai/en/stable/module_guides/deploying/agents/memory/).
+
+The `ChatSummaryMemoryBuffer` is a memory buffer that stores the last X messages that fit into a token limit. It also summarizes the chat history into a single message.
+
+
+
+```python
+%pip install llama-index-core
+```
+
+## Setup
+
+
+```python
+from llama_index.core.memory import ChatSummaryMemoryBuffer
+
+memory = ChatSummaryMemoryBuffer.from_defaults(
+    token_limit=40000,
+    # optional set the summary prompt, here's the default:
+    # summarize_prompt=(
+    #     "The following is a conversation between the user and assistant. "
+    #     "Write a concise summary about the contents of this conversation."
+    # )
+)
+```
+
+## Using Standalone
+
+
+```python
+from llama_index.core.llms import ChatMessage
+
+chat_history = [
+    ChatMessage(role="user", content="Hello, how are you?"),
+    ChatMessage(role="assistant", content="I'm doing well, thank you!"),
+]
+
+# put a list of messages
+memory.put_messages(chat_history)
+
+# put one message at a time
+# memory.put_message(chat_history[0])
+```
+
+
+```python
+# Get the last X messages that fit into a token limit
+history = memory.get()
+```
+
+
+```python
+# Get all messages
+all_history = memory.get_all()
+```
+
+
+```python
+# clear the memory
+memory.reset()
+```
+
+## Using with Agents
+
+You can set the memory in any agent in the `.run()` method.
+
+
+```python
+import os
+
+os.environ["OPENAI_API_KEY"] = "sk-proj-..."
+```
+
+
+```python
+from llama_index.core.agent.workflow import ReActAgent, FunctionAgent
+from llama_index.core.workflow import Context
+from llama_index.llms.openai import OpenAI
+
+
+memory = ChatMemoryBuffer.from_defaults(token_limit=40000)
+
+agent = FunctionAgent(tools=[], llm=OpenAI(model="gpt-4o-mini"))
+
+# context to hold the chat history/state
+ctx = Context(agent)
+```
+
+
+```python
+resp = await agent.run("Hello, how are you?", ctx=ctx, memory=memory)
+```
+
+
+```python
+print(memory.get_all())
+```
+
+    [ChatMessage(role=<MessageRole.USER: 'user'>, additional_kwargs={}, blocks=[TextBlock(block_type='text', text='Hello, how are you?')]), ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, additional_kwargs={}, blocks=[TextBlock(block_type='text', text="Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?")])]
+
@@ -0,0 +1,100 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/memory/vector_memory.ipynb
+toc: True
+title: "Vector Memory"
+featured: False
+experimental: False
+tags: ['Agent', 'Memory']
+language: py
+---
+**NOTE:** This example of memory is deprecated in favor of the newer and more flexible `Memory` class. See the [latest docs](https://docs.llamaindex.ai/en/stable/module_guides/deploying/agents/memory/).
+
+The vector memory module uses vector search (backed by a vector db) to retrieve relevant conversation items given a user input.
+
+This notebook shows you how to use the `VectorMemory` class. We show you how to use its individual functions. A typical usecase for vector memory is as a long-term memory storage of chat messages. You can
+
+![VectorMemoryIllustration](https://d3ddy8balm3goa.cloudfront.net/llamaindex/vector-memory.excalidraw.svg)
+
+### Initialize and Experiment with Memory Module
+
+Here we initialize a raw memory module and demonstrate its functions - to put and retrieve from ChatMessage objects.
+
+- Note that `retriever_kwargs` is the same args you'd specify on the `VectorIndexRetriever` or from `index.as_retriever(..)`.
+
+
+```python
+from llama_index.core.memory import VectorMemory
+from llama_index.embeddings.openai import OpenAIEmbedding
+
+
+vector_memory = VectorMemory.from_defaults(
+    vector_store=None,  # leave as None to use default in-memory vector store
+    embed_model=OpenAIEmbedding(),
+    retriever_kwargs={"similarity_top_k": 1},
+)
+```
+
+
+```python
+from llama_index.core.llms import ChatMessage
+
+msgs = [
+    ChatMessage.from_str("Jerry likes juice.", "user"),
+    ChatMessage.from_str("Bob likes burgers.", "user"),
+    ChatMessage.from_str("Alice likes apples.", "user"),
+]
+```
+
+
+```python
+# load into memory
+for m in msgs:
+    vector_memory.put(m)
+```
+
+
+```python
+# retrieve from memory
+msgs = vector_memory.get("What does Jerry like?")
+msgs
+```
+
+
+
+
+    [ChatMessage(role=<MessageRole.USER: 'user'>, content='Jerry likes juice.', additional_kwargs={})]
+
+
+
+
+```python
+vector_memory.reset()
+```
+
+Now let's try resetting and trying again. This time, we'll add an assistant message. Note that user/assistant messages are bundled by default.
+
+
+```python
+msgs = [
+    ChatMessage.from_str("Jerry likes burgers.", "user"),
+    ChatMessage.from_str("Bob likes apples.", "user"),
+    ChatMessage.from_str("Indeed, Bob likes apples.", "assistant"),
+    ChatMessage.from_str("Alice likes juice.", "user"),
+]
+vector_memory.set(msgs)
+```
+
+
+```python
+msgs = vector_memory.get("What does Bob like?")
+msgs
+```
+
+
+
+
+    [ChatMessage(role=<MessageRole.USER: 'user'>, content='Bob likes apples.', additional_kwargs={}),
+     ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, content='Indeed, Bob likes apples.', additional_kwargs={})]
+
+
@@ -0,0 +1,186 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/mistral_agent.ipynb
+toc: True
+title: "Function Calling Mistral Agent"
+featured: False
+experimental: False
+tags: ['Agent', 'Integrations']
+language: py
+---
+This notebook shows you how to use our Mistral agent, powered by function calling capabilities.
+
+## Initial Setup 
+
+Let's start by importing some simple building blocks.  
+
+The main thing we need is:
+1. the OpenAI API (using our own `llama_index` LLM class)
+2. a place to keep conversation history 
+3. a definition for tools that our agent can use.
+
+If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.
+
+
+
+```python
+%pip install llama-index
+%pip install llama-index-llms-mistralai
+%pip install llama-index-embeddings-mistralai
+```
+
+Let's define some very simple calculator tools for our agent.
+
+
+```python
+def multiply(a: int, b: int) -> int:
+    """Multiple two integers and returns the result integer"""
+    return a * b
+
+
+def add(a: int, b: int) -> int:
+    """Add two integers and returns the result integer"""
+    return a + b
+```
+
+Make sure your MISTRAL_API_KEY is set. Otherwise explicitly specify the `api_key` parameter.
+
+
+```python
+from llama_index.llms.mistralai import MistralAI
+
+llm = MistralAI(model="mistral-large-latest", api_key="...")
+```
+
+## Initialize Mistral Agent
+
+Here we initialize a simple Mistral agent with calculator functions.
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent
+
+agent = FunctionAgent(
+    tools=[multiply, add],
+    llm=llm,
+)
+```
+
+### Chat
+
+
+```python
+response = await agent.run("What is (121 + 2) * 5?")
+print(str(response))
+```
+
+    Added user message to memory: What is (121 + 2) * 5?
+    === Calling Function ===
+    Calling function: add with args: {"a": 121, "b": 2}
+    === Calling Function ===
+    Calling function: multiply with args: {"a": 123, "b": 5}
+    assistant: The result of (121 + 2) * 5 is 615.
+
+
+
+```python
+# inspect sources
+print(response.tool_calls)
+```
+
+### Managing Context/Memory
+
+By default, `.run()` is stateless. If you want to maintain state, you can pass in a `context` object.
+
+
+```python
+from llama_index.core.workflow import Context
+
+ctx = Context(agent)
+
+response = await agent.run("My name is John Doe", ctx=ctx)
+response = await agent.run("What is my name?", ctx=ctx)
+
+print(str(response))
+```
+
+## Mistral Agent over RAG Pipeline
+
+Build a Mistral agent over a simple 10K document. We use both Mistral embeddings and mistral-medium to construct the RAG pipeline, and pass it to the Mistral agent as a tool.
+
+
+```python
+!mkdir -p 'data/10k/'
+!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/uber_2021.pdf' -O 'data/10k/uber_2021.pdf'
+```
+
+
+```python
+from llama_index.core.tools import QueryEngineTool
+from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
+from llama_index.embeddings.mistralai import MistralAIEmbedding
+from llama_index.llms.mistralai import MistralAI
+
+embed_model = MistralAIEmbedding(api_key="...")
+query_llm = MistralAI(model="mistral-medium", api_key="...")
+
+# load data
+uber_docs = SimpleDirectoryReader(
+    input_files=["./data/10k/uber_2021.pdf"]
+).load_data()
+# build index
+uber_index = VectorStoreIndex.from_documents(
+    uber_docs, embed_model=embed_model
+)
+uber_engine = uber_index.as_query_engine(similarity_top_k=3, llm=query_llm)
+query_engine_tool = QueryEngineTool.from_defaults(
+    query_engine=uber_engine,
+    name="uber_10k",
+    description=(
+        "Provides information about Uber financials for year 2021. "
+        "Use a detailed plain text question as input to the tool."
+    ),
+)
+```
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent
+
+agent = FunctionAgent(tools=[query_engine_tool], llm=llm)
+```
+
+
+```python
+response = await agent.run(
+    "Tell me both the risk factors and tailwinds for Uber? Do two parallel tool calls."
+)
+print(str(response))
+```
+
+    Added user message to memory: Tell me both the risk factors and tailwinds for Uber? Do two parallel tool calls.
+    === Calling Function ===
+    Calling function: uber_10k with args: {"input": "What are the risk factors for Uber in 2021?"}
+    === Calling Function ===
+    Calling function: uber_10k with args: {"input": "What are the tailwinds for Uber in 2021?"}
+    assistant: Based on the information provided, here are the risk factors for Uber in 2021:
+    
+    1. Failure to offer or develop autonomous vehicle technologies, which could result in inferior performance or safety concerns compared to competitors.
+    2. Dependence on high-quality personnel and the potential impact of attrition or unsuccessful succession planning on the business.
+    3. Security or data privacy breaches, unauthorized access, or destruction of proprietary, employee, or user data.
+    4. Cyberattacks, such as malware, ransomware, viruses, spamming, and phishing attacks, which could harm the company's reputation and operations.
+    5. Climate change risks, including physical and transitional risks, that may adversely impact the business if not managed effectively.
+    6. Reliance on third parties to maintain open marketplaces for distributing products and providing software, which could negatively affect the business if interfered with.
+    7. The need for additional capital to support business growth, which may not be available on reasonable terms or at all.
+    8. Difficulties in identifying, acquiring, and integrating suitable businesses, which could harm operating results and prospects.
+    9. Legal and regulatory risks, including extensive government regulation and oversight related to payment and financial services.
+    10. Intellectual property risks, such as the inability to protect intellectual property or claims of misappropriation by third parties.
+    11. Volatility in the market price of common stock, which could result in steep declines and loss of investment for shareholders.
+    12. Economic risks related to the COVID-19 pandemic, which has adversely impacted and could continue to adversely impact the business, financial condition, and results of operations.
+    13. The potential reclassification of Drivers as employees, workers, or quasi-employees, which could result in material costs associated with defending, settling, or resolving lawsuits and demands for arbitration.
+    
+    On the other hand, here are some tailwinds for Uber in 2021:
+    
+    1. Launch of Uber One, a single cross-platform membership program in the United States, which offers discounts, special pricing, priority service, and exclusive perks across rides, delivery, and grocery offerings.
+    2. Introduction of a "Super App" view on iOS
+
@@ -0,0 +1,706 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/multi_agent_workflow_with_weaviate_queryagent.ipynb
+toc: True
+title: "Multi-Agent Workflow with Weaviate QueryAgent"
+featured: False
+experimental: False
+tags: ['Agent', 'Integrations']
+language: py
+---
+<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/docs/examples/agent/multi_agent_workflow_with_weaviate_queryagent.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
+
+In this example, we will be building a LlamaIndex Agent Workflow that ends up being a multi-agent system that aims to be a Docs Assistant capable of:
+- Writing new content to a "LlamaIndexDocs" collection in Weaviate
+- Writing new content to a "WeaviateDocs" collection in Weaviate
+- Using the Weaviate [`QueryAgent`](https://weaviate.io/developers/agents/query) to answer questions based on the contents of these collections.
+
+The `QueryAgent` is a full agent prodcut by Weaviate, that is capable of doing regular search, as well as aggregations over the collections you give it access to. Our 'orchestrator' agent will decide when to invoke the Weaviate QueryAgent, leaving the job of creating Weaviate specific search queries to it.
+
+**Things you will need:**
+
+- An OpenAI API key (or switch to another provider and adjust the code below)
+- A Weaviate sandbox (this is free)
+- Your Weaviate sandbox URL and API key
+
+![Workflow Overview](../_static/agents/workflow-weaviate-multiagent.png)
+
+## Install & Import Dependencies
+
+
+```python
+!pip install llama-index-core llama-index-utils-workflow weaviate-client[agents] llama-index-llms-openai llama-index-readers-web
+```
+
+
+```python
+from llama_index.core.workflow import (
+    StartEvent,
+    StopEvent,
+    Workflow,
+    step,
+    Event,
+    Context,
+)
+from llama_index.utils.workflow import draw_all_possible_flows
+from llama_index.readers.web import SimpleWebPageReader
+from llama_index.core.llms import ChatMessage
+from llama_index.core.tools import FunctionTool
+from llama_index.llms.openai import OpenAI
+from llama_index.core.agent.workflow import FunctionAgent
+
+from enum import Enum
+from pydantic import BaseModel, Field
+from llama_index.llms.openai import OpenAI
+from typing import List, Union
+import json
+
+import weaviate
+from weaviate.auth import Auth
+from weaviate.agents.query import QueryAgent
+from weaviate.classes.config import Configure, Property, DataType
+
+import os
+from getpass import getpass
+```
+
+## Set up Weaviate
+
+To use the Weaviate Query Agent, first, create a [Weaviate Cloud](https://weaviate.io/deployment/serverless) account👇
+1. [Create Serverless Weaviate Cloud account](https://weaviate.io/deployment/serverless) and set up a free [Sandbox](https://weaviate.io/developers/wcs/manage-clusters/create#sandbox-clusters)
+2. Go to 'Embedding' and enable it, by default, this will make it so that we use `Snowflake/snowflake-arctic-embed-l-v2.0` as the embedding model
+3. Take note of the `WEAVIATE_URL` and `WEAVIATE_API_KEY` to connect to your cluster below
+
+> Info: We recommend using [Weaviate Embeddings](https://weaviate.io/developers/weaviate/model-providers/weaviate) so you do not have to provide any extra keys for external embedding providers.
+
+
+```python
+if "WEAVIATE_API_KEY" not in os.environ:
+    os.environ["WEAVIATE_API_KEY"] = getpass("Add Weaviate API Key")
+if "WEAVIATE_URL" not in os.environ:
+    os.environ["WEAVIATE_URL"] = getpass("Add Weaviate URL")
+```
+
+
+```python
+client = weaviate.connect_to_weaviate_cloud(
+    cluster_url=os.environ.get("WEAVIATE_URL"),
+    auth_credentials=Auth.api_key(os.environ.get("WEAVIATE_API_KEY")),
+)
+```
+
+### Create WeaviateDocs and LlamaIndexDocs Collections
+
+The helper function below will create a "WeaviateDocs" and "LlamaIndexDocs" collection in Weaviate (if they don't exist already). It will also set up a `QueryAgent` that has access to both of these collections.
+
+The Weaviate [`QueryAgent`](https://weaviate.io/blog/query-agent) is designed to be able to query Weviate Collections for both regular search and aggregations, and also handles the burden of creating the Weaviate specific queries internally.
+
+The Agent will use the collection descriptions, as well as the property descriptions while formilating the queries.
+
+
+```python
+def fresh_setup_weaviate(client):
+    if client.collections.exists("WeaviateDocs"):
+        client.collections.delete("WeaviateDocs")
+    client.collections.create(
+        "WeaviateDocs",
+        description="A dataset with the contents of Weaviate technical Docs and website",
+        vectorizer_config=Configure.Vectorizer.text2vec_weaviate(),
+        properties=[
+            Property(
+                name="url",
+                data_type=DataType.TEXT,
+                description="the source URL of the webpage",
+            ),
+            Property(
+                name="text",
+                data_type=DataType.TEXT,
+                description="the content of the webpage",
+            ),
+        ],
+    )
+
+    if client.collections.exists("LlamaIndexDocs"):
+        client.collections.delete("LlamaIndexDocs")
+    client.collections.create(
+        "LlamaIndexDocs",
+        description="A dataset with the contents of LlamaIndex technical Docs and website",
+        vectorizer_config=Configure.Vectorizer.text2vec_weaviate(),
+        properties=[
+            Property(
+                name="url",
+                data_type=DataType.TEXT,
+                description="the source URL of the webpage",
+            ),
+            Property(
+                name="text",
+                data_type=DataType.TEXT,
+                description="the content of the webpage",
+            ),
+        ],
+    )
+
+    agent = QueryAgent(
+        client=client, collections=["LlamaIndexDocs", "WeaviateDocs"]
+    )
+    return agent
+```
+
+### Write Contents of Webpage to the Collections
+
+The helper function below uses the `SimpleWebPageReader` to write the contents of a webpage to the relevant Weaviate collection
+
+
+```python
+def write_webpages_to_weaviate(client, urls: list[str], collection_name: str):
+    documents = SimpleWebPageReader(html_to_text=True).load_data(urls)
+    collection = client.collections.get(collection_name)
+    with collection.batch.dynamic() as batch:
+        for doc in documents:
+            batch.add_object(properties={"url": doc.id_, "text": doc.text})
+```
+
+## Create a Function Calling Agent
+
+Now that we have the relevant functions to write to a collection and also the `QueryAgent` at hand, we can start by using the `FunctionAgent`, which is a simple tool calling agent.
+
+
+```python
+if "OPENAI_API_KEY" not in os.environ:
+    os.environ["OPENAI_API_KEY"] = getpass("openai-key")
+```
+
+
+```python
+weaviate_agent = fresh_setup_weaviate(client)
+```
+
+
+```python
+llm = OpenAI(model="gpt-4o-mini")
+
+
+def write_to_weaviate_collection(urls=list[str]):
+    """Useful for writing new content to the WeaviateDocs collection"""
+    write_webpages_to_weaviate(client, urls, "WeaviateDocs")
+
+
+def write_to_li_collection(urls=list[str]):
+    """Useful for writing new content to the LlamaIndexDocs collection"""
+    write_webpages_to_weaviate(client, urls, "LlamaIndexDocs")
+
+
+def query_agent(query: str) -> str:
+    """Useful for asking questions about Weaviate and LlamaIndex"""
+    response = weaviate_agent.run(query)
+    return response.final_answer
+
+
+agent = FunctionAgent(
+    tools=[write_to_weaviate_collection, write_to_li_collection, query_agent],
+    llm=llm,
+    system_prompt="""You are a helpful assistant that can write the
+      contents of urls to WeaviateDocs and LlamaIndexDocs collections,
+      as well as forwarding questions to a QueryAgent""",
+)
+```
+
+
+```python
+response = await agent.run(
+    user_msg="Can you save https://docs.llamaindex.ai/en/stable/examples/agent/agent_workflow_basic/"
+)
+print(str(response))
+```
+
+
+```python
+response = await agent.run(
+    user_msg="""What are llama index workflows? And can you save
+    these to weaviate docs: https://weaviate.io/blog/what-are-agentic-workflows
+    and https://weaviate.io/blog/ai-agents"""
+)
+print(str(response))
+```
+
+    Llama Index workflows refer to orchestrations involving one or more AI agents within the LlamaIndex framework. These workflows manage complex tasks dynamically by leveraging components such as large language models (LLMs), tools, and memory states. Key features of Llama Index workflows include:
+    
+    - Support for single or multiple agents managed within an AgentWorkflow orchestrator.
+    - Ability to maintain state across runs via serializable context objects.
+    - Integration of external tools with type annotations, including asynchronous functions.
+    - Streaming of intermediate outputs and event-based interactions.
+    - Human-in-the-loop capabilities to confirm or guide agent actions during workflow execution.
+    
+    These workflows enable agents to execute sequences of operations, call external tools asynchronously, maintain conversation or task states, stream partial results, and incorporate human inputs when necessary. They embody dynamic, agent-driven sequences of task decomposition, tool use, and reflection, allowing AI systems to plan, act, and improve iteratively toward specific goals.
+    
+    I have also saved the contents from the provided URLs to the WeaviateDocs collection.
+
+
+
+```python
+response = await agent.run(
+    user_msg="How many docs do I have in the weaviate and llamaindex collections in total?"
+)
+print(str(response))
+```
+
+    You have a total of 2 documents in the WeaviateDocs collection and 1 document in the LlamaIndexDocs collection. In total, that makes 3 documents across both collections.
+
+
+
+```python
+weaviate_agent = fresh_setup_weaviate(client)
+```
+
+## Create a Workflow with Branches
+
+### Simple Example: Create Events
+
+A LlamaIndex Workflow has 2 fundamentals:
+- An Event
+- A Step
+
+An step may return an event, and an event may trigger a step!
+
+For our use-case, we can imagine thet there are 4 events:
+
+
+```python
+class EvaluateQuery(Event):
+    query: str
+
+
+class WriteLlamaIndexDocsEvent(Event):
+    urls: list[str]
+
+
+class WriteWeaviateDocsEvent(Event):
+    urls: list[str]
+
+
+class QueryAgentEvent(Event):
+    query: str
+```
+
+### Simple Example: A Branching Workflow (that does nothing yet)
+
+
+```python
+class DocsAssistantWorkflow(Workflow):
+    @step
+    async def start(self, ctx: Context, ev: StartEvent) -> EvaluateQuery:
+        return EvaluateQuery(query=ev.query)
+
+    @step
+    async def evaluate_query(
+        self, ctx: Context, ev: EvaluateQuery
+    ) -> QueryAgentEvent | WriteLlamaIndexDocsEvent | WriteWeaviateDocsEvent | StopEvent:
+        if ev.query == "llama":
+            return WriteLlamaIndexDocsEvent(urls=[ev.query])
+        if ev.query == "weaviate":
+            return WriteWeaviateDocsEvent(urls=[ev.query])
+        if ev.query == "question":
+            return QueryAgentEvent(query=ev.query)
+        return StopEvent()
+
+    @step
+    async def write_li_docs(
+        self, ctx: Context, ev: WriteLlamaIndexDocsEvent
+    ) -> StopEvent:
+        print(f"Got a request to write something to LlamaIndexDocs")
+        return StopEvent()
+
+    @step
+    async def write_weaviate_docs(
+        self, ctx: Context, ev: WriteWeaviateDocsEvent
+    ) -> StopEvent:
+        print(f"Got a request to write something to WeaviateDocs")
+        return StopEvent()
+
+    @step
+    async def query_agent(
+        self, ctx: Context, ev: QueryAgentEvent
+    ) -> StopEvent:
+        print(f"Got a request to forward a query to the QueryAgent")
+        return StopEvent()
+```
+
+
+```python
+workflow_that_does_nothing = DocsAssistantWorkflow()
+
+# draw_all_possible_flows(workflow_that_does_nothing)
+```
+
+
+```python
+print(
+    await workflow_that_does_nothing.run(start_event=StartEvent(query="llama"))
+)
+```
+
+    Got a request to write something to LlamaIndexDocs
+    None
+
+
+### Classify the Query with Structured Outputs
+
+
+```python
+class SaveToLlamaIndexDocs(BaseModel):
+    """The URLs to parse and save into a llama-index specific docs collection."""
+
+    llama_index_urls: List[str] = Field(default_factory=list)
+
+
+class SaveToWeaviateDocs(BaseModel):
+    """The URLs to parse and save into a weaviate specific docs collection."""
+
+    weaviate_urls: List[str] = Field(default_factory=list)
+
+
+class Ask(BaseModel):
+    """The natural language questions that can be asked to a Q&A agent."""
+
+    queries: List[str] = Field(default_factory=list)
+
+
+class Actions(BaseModel):
+    """Actions to take based on the latest user message."""
+
+    actions: List[
+        Union[SaveToLlamaIndexDocs, SaveToWeaviateDocs, Ask]
+    ] = Field(default_factory=list)
+```
+
+#### Create a Workflow
+
+Let's create a workflow that, still, does nothing, but the incoming user query will be converted to our structure. Based on the contents of that structure, the workflow will decide which step to run.
+
+Notice how whichever step runs first, will return a `StopEvent`... This is good, but maybe we can improve that later!
+
+
+```python
+from llama_index.llms.openai import OpenAIResponses
+
+
+class DocsAssistantWorkflow(Workflow):
+    def __init__(self, *args, **kwargs):
+        self.llm = OpenAIResponses(model="gpt-4.1-mini")
+        self.system_prompt = """You are a docs assistant. You evaluate incoming queries and break them down to subqueries when needed.
+                          You decide on the next best course of action. Overall, here are the options:
+                          - You can write the contents of a URL to llamaindex docs (if it's a llamaindex url)
+                          - You can write the contents of a URL to weaviate docs (if it's a weaviate url)
+                          - You can answer a question about llamaindex and weaviate using the QueryAgent"""
+        super().__init__(*args, **kwargs)
+
+    @step
+    async def start(self, ev: StartEvent) -> EvaluateQuery:
+        return EvaluateQuery(query=ev.query)
+
+    @step
+    async def evaluate_query(
+        self, ev: EvaluateQuery
+    ) -> QueryAgentEvent | WriteLlamaIndexDocsEvent | WriteWeaviateDocsEvent:
+        sllm = self.llm.as_structured_llm(Actions)
+        response = await sllm.achat(
+            [
+                ChatMessage(role="system", content=self.system_prompt),
+                ChatMessage(role="user", content=ev.query),
+            ]
+        )
+        actions = response.raw.actions
+        print(actions)
+        for action in actions:
+            if isinstance(action, SaveToLlamaIndexDocs):
+                return WriteLlamaIndexDocsEvent(urls=action.llama_index_urls)
+            elif isinstance(action, SaveToWeaviateDocs):
+                return WriteWeaviateDocsEvent(urls=action.weaviate_urls)
+            elif isinstance(action, Ask):
+                for query in action.queries:
+                    return QueryAgentEvent(query=query)
+
+    @step
+    async def write_li_docs(self, ev: WriteLlamaIndexDocsEvent) -> StopEvent:
+        print(f"Writing {ev.urls} to LlamaIndex Docs")
+        return StopEvent()
+
+    @step
+    async def write_weaviate_docs(
+        self, ev: WriteWeaviateDocsEvent
+    ) -> StopEvent:
+        print(f"Writing {ev.urls} to Weaviate Docs")
+        return StopEvent()
+
+    @step
+    async def query_agent(self, ev: QueryAgentEvent) -> StopEvent:
+        print(f"Sending `'{ev.query}`' to agent")
+        return StopEvent()
+
+
+everything_docs_agent_beta = DocsAssistantWorkflow()
+```
+
+
+```python
+async def run_docs_agent_beta(query: str):
+    print(
+        await everything_docs_agent_beta.run(
+            start_event=StartEvent(query=query)
+        )
+    )
+```
+
+
+```python
+await run_docs_agent_beta(
+    """Can you save https://www.llamaindex.ai/blog/get-citations-and-reasoning-for-extracted-data-in-llamaextract
+    and https://www.llamaindex.ai/blog/llamaparse-update-may-2025-new-models-skew-detection-and-more??"""
+)
+```
+
+    [SaveToLlamaIndexDocs(llama_index_urls=['https://www.llamaindex.ai/blog/get-citations-and-reasoning-for-extracted-data-in-llamaextract', 'https://www.llamaindex.ai/blog/llamaparse-update-may-2025-new-models-skew-detection-and-more'])]
+    Writing ['https://www.llamaindex.ai/blog/get-citations-and-reasoning-for-extracted-data-in-llamaextract', 'https://www.llamaindex.ai/blog/llamaparse-update-may-2025-new-models-skew-detection-and-more'] to LlamaIndex Docs
+    None
+
+
+
+```python
+await run_docs_agent_beta(
+    "How many documents do we have in the LlamaIndexDocs collection now?"
+)
+```
+
+    [Ask(queries=['How many documents are in the LlamaIndexDocs collection?'])]
+    Sending `'How many documents are in the LlamaIndexDocs collection?`' to agent
+    None
+
+
+
+```python
+await run_docs_agent_beta("What are LlamaIndex workflows?")
+```
+
+    [Ask(queries=['What are LlamaIndex workflows?'])]
+    Sending `'What are LlamaIndex workflows?`' to agent
+    None
+
+
+
+```python
+await run_docs_agent_beta(
+    "Can you save https://weaviate.io/blog/graph-rag and https://weaviate.io/blog/genai-apps-with-weaviate-and-databricks??"
+)
+```
+
+    [SaveToWeaviateDocs(weaviate_urls=['https://weaviate.io/blog/graph-rag', 'https://weaviate.io/blog/genai-apps-with-weaviate-and-databricks'])]
+    Writing ['https://weaviate.io/blog/graph-rag', 'https://weaviate.io/blog/genai-apps-with-weaviate-and-databricks'] to Weaviate Docs
+    None
+
+
+## Run Multiple Branches & Put it all togehter
+
+In these cases, it makes sense to run multiple branches. So, a single step can trigger multiple events at once! We can `send_event` via the context 👇
+
+
+```python
+class ActionCompleted(Event):
+    result: str
+
+
+class DocsAssistantWorkflow(Workflow):
+    def __init__(self, *args, **kwargs):
+        self.llm = OpenAIResponses(model="gpt-4.1-mini")
+        self.system_prompt = """You are a docs assistant. You evaluate incoming queries and break them down to subqueries when needed.
+                      You decide on the next best course of action. Overall, here are the options:
+                      - You can write the contents of a URL to llamaindex docs (if it's a llamaindex url)
+                      - You can write the contents of a URL to weaviate docs (if it's a weaviate url)
+                      - You can answer a question about llamaindex and weaviate using the QueryAgent"""
+        super().__init__(*args, **kwargs)
+
+    @step
+    async def start(self, ctx: Context, ev: StartEvent) -> EvaluateQuery:
+        return EvaluateQuery(query=ev.query)
+
+    @step
+    async def evaluate_query(
+        self, ctx: Context, ev: EvaluateQuery
+    ) -> QueryAgentEvent | WriteLlamaIndexDocsEvent | WriteWeaviateDocsEvent | None:
+        await ctx.store.set("results", [])
+        sllm = self.llm.as_structured_llm(Actions)
+        response = await sllm.achat(
+            [
+                ChatMessage(role="system", content=self.system_prompt),
+                ChatMessage(role="user", content=ev.query),
+            ]
+        )
+        actions = response.raw.actions
+        await ctx.store.set("num_events", len(actions))
+        await ctx.store.set("results", [])
+        print(actions)
+        for action in actions:
+            if isinstance(action, SaveToLlamaIndexDocs):
+                ctx.send_event(
+                    WriteLlamaIndexDocsEvent(urls=action.llama_index_urls)
+                )
+            elif isinstance(action, SaveToWeaviateDocs):
+                ctx.send_event(
+                    WriteWeaviateDocsEvent(urls=action.weaviate_urls)
+                )
+            elif isinstance(action, Ask):
+                for query in action.queries:
+                    ctx.send_event(QueryAgentEvent(query=query))
+
+    @step
+    async def write_li_docs(
+        self, ctx: Context, ev: WriteLlamaIndexDocsEvent
+    ) -> ActionCompleted:
+        print(f"Writing {ev.urls} to LlamaIndex Docs")
+        write_webpages_to_weaviate(
+            client, urls=ev.urls, collection_name="LlamaIndexDocs"
+        )
+        results = await ctx.store.get("results")
+        results.append(f"Wrote {ev.urls} it LlamaIndex Docs")
+        return ActionCompleted(result=f"Writing {ev.urls} to LlamaIndex Docs")
+
+    @step
+    async def write_weaviate_docs(
+        self, ctx: Context, ev: WriteWeaviateDocsEvent
+    ) -> ActionCompleted:
+        print(f"Writing {ev.urls} to Weaviate Docs")
+        write_webpages_to_weaviate(
+            client, urls=ev.urls, collection_name="WeaviateDocs"
+        )
+        results = await ctx.store.get("results")
+        results.append(f"Wrote {ev.urls} it Weavite Docs")
+        return ActionCompleted(result=f"Writing {ev.urls} to Weaviate Docs")
+
+    @step
+    async def query_agent(
+        self, ctx: Context, ev: QueryAgentEvent
+    ) -> ActionCompleted:
+        print(f"Sending {ev.query} to agent")
+        response = weaviate_agent.run(ev.query)
+        results = await ctx.store.get("results")
+        results.append(f"QueryAgent responded with:\n {response.final_answer}")
+        return ActionCompleted(result=f"Sending `'{ev.query}`' to agent")
+
+    @step
+    async def collect(
+        self, ctx: Context, ev: ActionCompleted
+    ) -> StopEvent | None:
+        num_events = await ctx.store.get("num_events")
+        evs = ctx.collect_events(ev, [ActionCompleted] * num_events)
+        if evs is None:
+            return None
+        return StopEvent(result=[ev.result for ev in evs])
+
+
+everything_docs_agent = DocsAssistantWorkflow(timeout=None)
+```
+
+
+```python
+async def run_docs_agent(query: str):
+    handler = everything_docs_agent.run(start_event=StartEvent(query=query))
+    result = await handler
+    for response in await handler.ctx.get("results"):
+        print(response)
+```
+
+
+```python
+await run_docs_agent(
+    "Can you save https://docs.llamaindex.ai/en/stable/understanding/workflows/ and https://docs.llamaindex.ai/en/stable/understanding/workflows/branches_and_loops/"
+)
+```
+
+    [SaveToLlamaIndexDocs(llama_index_urls=['https://docs.llamaindex.ai/en/stable/understanding/workflows/']), SaveToLlamaIndexDocs(llama_index_urls=['https://docs.llamaindex.ai/en/stable/understanding/workflows/branches_and_loops/'])]
+    Writing ['https://docs.llamaindex.ai/en/stable/understanding/workflows/'] to LlamaIndex Docs
+    Writing ['https://docs.llamaindex.ai/en/stable/understanding/workflows/branches_and_loops/'] to LlamaIndex Docs
+    Wrote ['https://docs.llamaindex.ai/en/stable/understanding/workflows/'] it LlamaIndex Docs
+    Wrote ['https://docs.llamaindex.ai/en/stable/understanding/workflows/branches_and_loops/'] it LlamaIndex Docs
+
+
+
+```python
+await run_docs_agent(
+    "How many documents do we have in the LlamaIndexDocs collection now?"
+)
+```
+
+    [Ask(queries=['How many documents are in the LlamaIndexDocs collection?'])]
+    Sending How many documents are in the LlamaIndexDocs collection? to agent
+    QueryAgent responded with:
+     The LlamaIndexDocs collection contains 2 documents, specifically related to workflows and branches and loops within the documentation.
+
+
+
+```python
+await run_docs_agent(
+    "What are LlamaIndex workflows? And can you save https://weaviate.io/blog/graph-rag"
+)
+```
+
+    [Ask(queries=['What are LlamaIndex workflows?'])]
+    Sending What are LlamaIndex workflows? to agent
+    QueryAgent responded with:
+     LlamaIndex workflows are an event-driven, step-based framework designed to control and manage the execution flow of complex applications, particularly those involving generative AI. They break an application into discrete Steps, each triggered by Events and capable of emitting further Events, allowing for complex logic involving loops, branches, and parallel execution.
+    
+    In a LlamaIndex workflow, steps perform functions ranging from simple tasks to complex agents, with inputs and outputs communicated via Events. This event-driven model facilitates maintainability and clarity, overcoming limitations of previous approaches like directed acyclic graphs (DAGs) which struggled with complex flows involving loops and branching.
+    
+    Key features include:
+    - **Loops:** Steps can return events that loop back to previous steps to enable iterative processes.
+    - **Branches:** Workflows can branch into different paths based on conditions, allowing for multiple distinct sequences of steps.
+    - **Parallelism:** Multiple branches or steps can run concurrently and synchronize their results.
+    - **State Maintenance:** Workflows support maintaining state and context throughout execution.
+    - **Observability and Debugging:** Supported by various components and callbacks for monitoring.
+    
+    An example workflow might involve judging whether a query is of sufficient quality, looping to improve it if not, then concurrently executing different retrieval-augmented generation (RAG) strategies, and finally judging their responses to produce a single output.
+    
+    Workflows are especially useful as applications grow in complexity, enabling developers to organize and control intricate AI logic more naturally and efficiently than traditional graph-based methods. For simpler pipelines, LlamaIndex suggests using workflows optionally, but for advanced agentic applications, workflows provide a flexible and powerful control abstraction.
+
+
+
+```python
+await run_docs_agent("How do I use loops in llamaindex workflows?")
+```
+
+    [Ask(queries=['How to use loops in llamaindex workflows'])]
+    Sending How to use loops in llamaindex workflows to agent
+    QueryAgent responded with:
+     In LlamaIndex workflows, loops are implemented using an event-driven approach where you define custom event types and steps that emit events to control the workflow's execution flow. To create a loop, you define a custom event (e.g., `LoopEvent`) and a workflow step that can return either the event continuing the loop or another event to proceed. For example, a workflow step might randomly decide to either loop back (emit `LoopEvent` again) or continue to a next step emitting a different event.
+    
+    This allows creating flexible looping behaviors where any step can loop back to any other step by returning the corresponding event instances. The approach leverages Python's async functions decorated with `@step`, which process events and return the next event(s), enabling both loops and conditional branching in workflows.
+    
+    Thus, loops in LlamaIndex workflows are event-based, using custom event types and the return of events from steps to signal iterations until a condition is met.
+    
+    Example:
+    
+    ```python
+    from llamaindex.workflow import Workflow, Event, StartEvent, StopEvent, step
+    import random
+    
+    class LoopEvent(Event):
+        loop_output: str
+    
+    class FirstEvent(Event):
+        first_output: str
+    
+    class MyWorkflow(Workflow):
+        @step
+        async def step_one(self, ev: StartEvent | LoopEvent) -> FirstEvent | LoopEvent:
+            if random.randint(0, 1) == 0:
+                print("Bad thing happened")
+                return LoopEvent(loop_output="Back to step one.")
+            else:
+                print("Good thing happened")
+                return FirstEvent(first_output="First step complete.")
+    
+        # ... other steps ...
+    
+    # Running this workflow will cause step_one to loop randomly until it proceeds.
+    ```
+    
+    You can combine loops with branching and parallel execution in workflows to build complex control flows. For detailed guidance and examples, consult the LlamaIndex documentation under "Branches and Loops" and the "Workflows" guides.
+
@@ -0,0 +1,591 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/multi_document_agents-v1.ipynb
+toc: True
+title: "Multi-Document Agents (V1)"
+featured: False
+experimental: False
+tags: ['Agent']
+language: py
+---
+In this guide, you learn towards setting up a multi-document agent over the LlamaIndex documentation.
+
+This is an extension of V0 multi-document agents with the additional features:
+- Reranking during document (tool) retrieval
+- Query planning tool that the agent can use to plan 
+
+
+We do this with the following architecture:
+
+- setup a "document agent" over each Document: each doc agent can do QA/summarization within its doc
+- setup a top-level agent over this set of document agents. Do tool retrieval and then do CoT over the set of tools to answer a question.
+
+If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.
+
+
+```python
+%pip install llama-index-core
+%pip install llama-index-agent-openai
+%pip install llama-index-readers-file
+%pip install llama-index-postprocessor-cohere-rerank
+%pip install llama-index-llms-openai
+%pip install llama-index-embeddings-openai
+%pip install unstructured[html]
+```
+
+
+```python
+%load_ext autoreload
+%autoreload 2
+```
+
+## Setup and Download Data
+
+In this section, we'll load in the LlamaIndex documentation.
+
+**NOTE:** This command will take a while to run, it will download the entire LlamaIndex documentation. In my testing, this took about 15 minutes.
+
+
+```python
+domain = "docs.llamaindex.ai"
+docs_url = "https://docs.llamaindex.ai/en/latest/"
+!wget -e robots=off --recursive --no-clobber --page-requisites --html-extension --convert-links --restrict-file-names=windows --domains {domain} --no-parent {docs_url}
+```
+
+
+```python
+from llama_index.readers.file import UnstructuredReader
+
+reader = UnstructuredReader()
+```
+
+
+```python
+from pathlib import Path
+
+all_files_gen = Path("./docs.llamaindex.ai/").rglob("*")
+all_files = [f.resolve() for f in all_files_gen]
+```
+
+
+```python
+all_html_files = [f for f in all_files if f.suffix.lower() == ".html"]
+```
+
+
+```python
+len(all_html_files)
+```
+
+
+
+
+    1656
+
+
+
+
+```python
+useful_files = [
+    x
+    for x in all_html_files
+    if "understanding" in str(x).split(".")[-2]
+    or "examples" in str(x).split(".")[-2]
+]
+print(len(useful_files))
+```
+
+    680
+
+
+
+```python
+from llama_index.core import Document
+
+# TODO: set to higher value if you want more docs to be indexed
+doc_limit = 100
+
+docs = []
+for idx, f in enumerate(useful_files):
+    if idx > doc_limit:
+        break
+    print(f"Idx {idx}/{len(useful_files)}")
+    loaded_docs = reader.load_data(file=f, split_documents=True)
+
+    loaded_doc = Document(
+        text="\n\n".join([d.get_content() for d in loaded_docs]),
+        metadata={"path": str(f)},
+    )
+    print(loaded_doc.metadata["path"])
+    docs.append(loaded_doc)
+```
+
+
+```python
+print(len(docs))
+```
+
+    101
+
+
+Define Global LLM + Embeddings
+
+
+```python
+import os
+
+os.environ["OPENAI_API_KEY"] = "sk-..."
+```
+
+
+```python
+from llama_index.llms.openai import OpenAI
+from llama_index.embeddings.openai import OpenAIEmbedding
+from llama_index.core import Settings
+
+llm = OpenAI(model="gpt-4o")
+Settings.llm = llm
+Settings.embed_model = OpenAIEmbedding(
+    model="text-embedding-3-small", embed_batch_size=256
+)
+```
+
+## Building Multi-Document Agents
+
+In this section we show you how to construct the multi-document agent. We first build a document agent for each document, and then define the top-level parent agent with an object index.
+
+### Build Document Agent for each Document
+
+In this section we define "document agents" for each document.
+
+We define both a vector index (for semantic search) and summary index (for summarization) for each document. The two query engines are then converted into tools that are passed to an OpenAI function calling agent.
+
+This document agent can dynamically choose to perform semantic search or summarization within a given document.
+
+We create a separate document agent for each city.
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent, ReActAgent
+from llama_index.core import (
+    load_index_from_storage,
+    StorageContext,
+    VectorStoreIndex,
+)
+from llama_index.core import SummaryIndex
+from llama_index.core.tools import QueryEngineTool
+from llama_index.core.node_parser import SentenceSplitter
+import os
+from tqdm.notebook import tqdm
+import pickle
+
+
+async def build_agent_per_doc(nodes, file_base):
+    vi_out_path = f"./data/llamaindex_docs/{file_base}"
+    summary_out_path = f"./data/llamaindex_docs/{file_base}_summary.pkl"
+    if not os.path.exists(vi_out_path):
+        Path("./data/llamaindex_docs/").mkdir(parents=True, exist_ok=True)
+        # build vector index
+        vector_index = VectorStoreIndex(nodes)
+        vector_index.storage_context.persist(persist_dir=vi_out_path)
+    else:
+        vector_index = load_index_from_storage(
+            StorageContext.from_defaults(persist_dir=vi_out_path),
+        )
+
+    # build summary index
+    summary_index = SummaryIndex(nodes)
+
+    # define query engines
+    vector_query_engine = vector_index.as_query_engine(llm=llm)
+    summary_query_engine = summary_index.as_query_engine(
+        response_mode="tree_summarize", llm=llm
+    )
+
+    # extract a summary
+    if not os.path.exists(summary_out_path):
+        Path(summary_out_path).parent.mkdir(parents=True, exist_ok=True)
+        summary = str(
+            await summary_query_engine.aquery(
+                "Extract a concise 1-2 line summary of this document"
+            )
+        )
+        pickle.dump(summary, open(summary_out_path, "wb"))
+    else:
+        summary = pickle.load(open(summary_out_path, "rb"))
+
+    # define tools
+    query_engine_tools = [
+        QueryEngineTool.from_defaults(
+            query_engine=vector_query_engine,
+            name=f"vector_tool_{file_base}",
+            description=f"Useful for questions related to specific facts",
+        ),
+        QueryEngineTool.from_defaults(
+            query_engine=summary_query_engine,
+            name=f"summary_tool_{file_base}",
+            description=f"Useful for summarization questions",
+        ),
+    ]
+
+    # build agent
+    function_llm = OpenAI(model="gpt-4")
+    agent = FunctionAgent(
+        tools=query_engine_tools,
+        llm=function_llm,
+        system_prompt=f"""\
+You are a specialized agent designed to answer queries about the `{file_base}.html` part of the LlamaIndex docs.
+You must ALWAYS use at least one of the tools provided when answering a question; do NOT rely on prior knowledge.\
+""",
+    )
+
+    return agent, summary
+
+
+async def build_agents(docs):
+    node_parser = SentenceSplitter()
+
+    # Build agents dictionary
+    agents_dict = {}
+    extra_info_dict = {}
+
+    # # this is for the baseline
+    # all_nodes = []
+
+    for idx, doc in enumerate(tqdm(docs)):
+        nodes = node_parser.get_nodes_from_documents([doc])
+        # all_nodes.extend(nodes)
+
+        # ID will be base + parent
+        file_path = Path(doc.metadata["path"])
+        file_base = str(file_path.parent.stem) + "_" + str(file_path.stem)
+        agent, summary = await build_agent_per_doc(nodes, file_base)
+
+        agents_dict[file_base] = agent
+        extra_info_dict[file_base] = {"summary": summary, "nodes": nodes}
+
+    return agents_dict, extra_info_dict
+```
+
+
+```python
+agents_dict, extra_info_dict = await build_agents(docs)
+```
+
+### Build Retriever-Enabled OpenAI Agent
+
+We build a top-level agent that can orchestrate across the different document agents to answer any user query.
+
+This agent will use a tool retriever to retrieve the most relevant tools for the query.
+
+**Improvements from V0**: We make the following improvements compared to the "base" version in V0.
+
+- Adding in reranking: we use Cohere reranker to better filter the candidate set of documents.
+- Adding in a query planning tool: we add an explicit query planning tool that's dynamically created based on the set of retrieved tools.
+
+
+
+```python
+from typing import Callable
+from llama_index.core.tools import FunctionTool
+
+
+def get_agent_tool_callable(agent: FunctionAgent) -> Callable:
+    async def query_agent(query: str) -> str:
+        response = await agent.run(query)
+        return str(response)
+
+    return query_agent
+
+
+# define tool for each document agent
+all_tools = []
+for file_base, agent in agents_dict.items():
+    summary = extra_info_dict[file_base]["summary"]
+    async_fn = get_agent_tool_callable(agent)
+    doc_tool = FunctionTool.from_defaults(
+        async_fn,
+        name=f"tool_{file_base}",
+        description=summary,
+    )
+    all_tools.append(doc_tool)
+```
+
+
+```python
+print(all_tools[0].metadata)
+```
+
+    ToolMetadata(description='The document provides a series of tutorials on building agentic LLM applications using LlamaIndex, covering key steps such as building RAG pipelines, agents, and workflows, along with techniques for data ingestion, indexing, querying, and application evaluation.', name='tool_understanding_index', fn_schema=<class 'llama_index.core.tools.utils.tool_understanding_index'>, return_direct=False)
+
+
+
+```python
+# define an "object" index and retriever over these tools
+from llama_index.core import VectorStoreIndex
+from llama_index.core.objects import (
+    ObjectIndex,
+    ObjectRetriever,
+)
+from llama_index.postprocessor.cohere_rerank import CohereRerank
+from llama_index.core.query_engine import SubQuestionQueryEngine
+from llama_index.core.schema import QueryBundle
+from llama_index.llms.openai import OpenAI
+
+
+llm = OpenAI(model_name="gpt-4o")
+
+obj_index = ObjectIndex.from_objects(
+    all_tools,
+    index_cls=VectorStoreIndex,
+)
+vector_node_retriever = obj_index.as_node_retriever(
+    similarity_top_k=10,
+)
+
+
+# define a custom object retriever that adds in a query planning tool
+class CustomObjectRetriever(ObjectRetriever):
+    def __init__(
+        self,
+        retriever,
+        object_node_mapping,
+        node_postprocessors=None,
+        llm=None,
+    ):
+        self._retriever = retriever
+        self._object_node_mapping = object_node_mapping
+        self._llm = llm or OpenAI("gpt-4o")
+        self._node_postprocessors = node_postprocessors or []
+
+    def retrieve(self, query_bundle):
+        if isinstance(query_bundle, str):
+            query_bundle = QueryBundle(query_str=query_bundle)
+
+        nodes = self._retriever.retrieve(query_bundle)
+        for processor in self._node_postprocessors:
+            nodes = processor.postprocess_nodes(
+                nodes, query_bundle=query_bundle
+            )
+        tools = [self._object_node_mapping.from_node(n.node) for n in nodes]
+
+        sub_agent = FunctionAgent(
+            name="compare_tool",
+            description=f"""\
+Useful for any queries that involve comparing multiple documents. ALWAYS use this tool for comparison queries - make sure to call this \
+tool with the original query. Do NOT use the other tools for any queries involving multiple documents.
+""",
+            tools=tools,
+            llm=self._llm,
+            system_prompt="""You are an expert at comparing documents. Given a query, use the tools provided to compare the documents and return a summary of the results.""",
+        )
+
+        async def query_sub_agent(query: str) -> str:
+            response = await sub_agent.run(query)
+            return str(response)
+
+        sub_question_tool = FunctionTool.from_defaults(
+            query_sub_agent,
+            name=sub_agent.name,
+            description=sub_agent.description,
+        )
+        return tools + [sub_question_tool]
+```
+
+
+```python
+# wrap it with ObjectRetriever to return objects
+custom_obj_retriever = CustomObjectRetriever(
+    vector_node_retriever,
+    obj_index.object_node_mapping,
+    node_postprocessors=[CohereRerank(top_n=5, model="rerank-v3.5")],
+    llm=llm,
+)
+```
+
+
+```python
+tmps = custom_obj_retriever.retrieve("hello")
+
+# should be 5 + 1 -- 5 from reranker, 1 from subquestion
+print(len(tmps))
+```
+
+    6
+
+
+
+```python
+from llama_index.core.agent.workflow import ReActAgent, FunctionAgent
+
+top_agent = FunctionAgent(
+    tool_retriever=custom_obj_retriever,
+    system_prompt=""" \
+You are an agent designed to answer queries about the documentation.
+Please always use the tools provided to answer a question. Do not rely on prior knowledge.\
+
+""",
+    llm=llm,
+)
+
+# top_agent = ReActAgent(
+#     tool_retriever=custom_obj_retriever,
+#     system_prompt=""" \
+# You are an agent designed to answer queries about the documentation.
+# Please always use the tools provided to answer a question. Do not rely on prior knowledge.\
+
+# """,
+#     llm=llm,
+# )
+```
+
+### Define Baseline Vector Store Index
+
+As a point of comparison, we define a "naive" RAG pipeline which dumps all docs into a single vector index collection.
+
+We set the top_k = 4
+
+
+```python
+all_nodes = [
+    n for extra_info in extra_info_dict.values() for n in extra_info["nodes"]
+]
+```
+
+
+```python
+base_index = VectorStoreIndex(all_nodes)
+base_query_engine = base_index.as_query_engine(similarity_top_k=4)
+```
+
+## Running Example Queries
+
+Let's run some example queries, ranging from QA / summaries over a single document to QA / summarization over multiple documents.
+
+
+```python
+from llama_index.core.agent.workflow import (
+    AgentStream,
+    ToolCall,
+    ToolCallResult,
+)
+
+handler = top_agent.run(
+    "What can you build with LlamaIndex?",
+)
+async for ev in handler.stream_events():
+    if isinstance(ev, ToolCallResult):
+        print(
+            f"\nCalling tool {ev.tool_name} with args {ev.tool_kwargs}\n Got response: {str(ev.tool_output)[:200]}"
+        )
+    elif isinstance(ev, ToolCall):
+        print(f"\nTool call: {ev.tool_name} with args {ev.tool_kwargs}")
+    # Print the stream of the agent
+    # elif isinstance(ev, AgentStream):
+    #     print(ev.delta, end="", flush=True)
+
+response = await handler
+```
+
+    
+    Tool call: tool_SimpleIndexDemoLlama2_index with args {'query': 'What can you build with LlamaIndex?'}
+    
+    Tool call: tool_apps_index with args {'query': 'What can you build with LlamaIndex?'}
+    
+    Tool call: tool_putting_it_all_together_index with args {'query': 'What can you build with LlamaIndex?'}
+    
+    Tool call: tool_llamacloud_index with args {'query': 'What can you build with LlamaIndex?'}
+    
+    Calling tool tool_SimpleIndexDemoLlama2_index with args {'query': 'What can you build with LlamaIndex?'}
+     Got response: With LlamaIndex, you can build a VectorStoreIndex. This involves setting up the necessary environment, loading documents into the index, and then querying the index for information. You need to instal
+    
+    Tool call: tool_using_llms_index with args {'query': 'What can you build with LlamaIndex?'}
+    
+    Calling tool tool_llamacloud_index with args {'query': 'What can you build with LlamaIndex?'}
+     Got response: With LlamaIndex, you can build a system that connects to your data stores, automatically indexes them, and then queries the data. This is done by integrating LlamaCloud into your project. The system a
+    
+    Calling tool tool_apps_index with args {'query': 'What can you build with LlamaIndex?'}
+     Got response: With LlamaIndex, you can build a full-stack web application. You can integrate it into a backend server like Flask, package it into a Docker container, or use it directly in a framework such as Stream
+    
+    Calling tool tool_putting_it_all_together_index with args {'query': 'What can you build with LlamaIndex?'}
+     Got response: With LlamaIndex, you can build a variety of applications and tools. This includes:
+    
+    1. Chatbots: You can use LlamaIndex to create interactive chatbots.
+    2. Agents: LlamaIndex can be used to build intel
+    
+    Calling tool tool_using_llms_index with args {'query': 'What can you build with LlamaIndex?'}
+     Got response: With LlamaIndex, you can build a variety of applications by leveraging the various Language Model (LLM) integrations it supports. These include OpenAI, Anthropic, Mistral, DeepSeek, Hugging Face, and 
+
+
+
+```python
+# print the final response string
+print(str(response))
+```
+
+    With LlamaIndex, you can build various applications and tools, including:
+    
+    1. **VectorStoreIndex**: Set up and query a VectorStoreIndex by loading documents and configuring the environment as per the documentation.
+       
+    2. **Full-Stack Web Applications**: Integrate LlamaIndex into backend servers like Flask, Docker containers, or frameworks like Streamlit. Resources include guides for TypeScript+React, Delphic starter template, and Flask, Streamlit, and Docker integration examples.
+    
+    3. **Chatbots, Agents, and Unified Query Framework**: Create interactive chatbots, intelligent agents, and a unified query framework for handling different query types. LlamaIndex also supports property graphs and full-stack web applications.
+    
+    4. **Data Management with LlamaCloud**: Build systems that connect to data stores, automatically index data, and efficiently query it by integrating LlamaCloud into your project.
+    
+    5. **LLM Integrations**: Utilize various Language Model (LLM) integrations such as OpenAI, Anthropic, Mistral, DeepSeek, and Hugging Face. LlamaIndex provides a unified interface to access different LLMs, enabling you to select models based on their strengths and price points. You can use multi-modal LLMs for chat messages with text, images, and audio inputs, and even call tools and functions directly through API calls.
+    
+    These capabilities make LlamaIndex a versatile tool for building a wide range of applications and systems.
+
+
+
+```python
+# access the tool calls
+# print(response.tool_calls)
+```
+
+
+```python
+# baseline
+response = base_query_engine.query(
+    "What can you build with LlamaIndex?",
+)
+print(str(response))
+```
+
+    With LlamaIndex, you can build a variety of applications and systems, including a full-stack web application, a chatbot, and a unified query framework over multiple indexes. You can also perform semantic searches, summarization queries, and queries over structured data like SQL or Pandas DataFrames. Additionally, LlamaIndex supports routing over heterogeneous data sources and compare/contrast queries. It provides tools and templates to help you integrate these capabilities into production-ready applications.
+
+
+
+```python
+response = await top_agent.run("Compare workflows to query engines")
+print(str(response))
+```
+
+    Workflows and query engines serve different purposes in an application context:
+    
+    1. Workflows:
+       - Workflows are designed to manage the execution flow of an application by dividing it into sections triggered by events.
+       - They are event-driven and step-based, allowing for the management of application complexity by breaking it into smaller, more manageable pieces.
+       - Workflows focus on controlling the flow of application execution through steps and events.
+    
+    2. Query Engines:
+       - Query engines are tools used to process queries against a database or data source to retrieve specific information.
+       - They are primarily used for querying and retrieving data from databases.
+       - Query engines are focused on the retrieval, postprocessing, and response synthesis stages of querying.
+    
+    In summary, workflows are more about controlling the flow of application execution, while query engines are specifically designed for querying and retrieving data from databases.
+
+
+
+```python
+response = await top_agent.run(
+    "Can you compare the compact and tree_summarize response synthesizer response modes at a very high-level?"
+)
+print(str(response))
+```
+
+    The compact response synthesizer mode aims to produce concise and condensed responses, focusing on delivering the most relevant information in a brief format. On the other hand, the tree_summarize response synthesizer mode is designed to create structured and summarized responses, organizing information in a comprehensive manner. 
+    
+    In summary, the compact mode provides brief and straightforward responses, while the tree_summarize mode offers more detailed and organized output for a comprehensive summary.
+
@@ -0,0 +1,224 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/nvidia_agent.ipynb
+toc: True
+title: "Function Calling NVIDIA Agent"
+featured: False
+experimental: False
+tags: ['Agent', 'Integrations']
+language: py
+---
+This notebook shows you how to use our NVIDIA agent, powered by function calling capabilities.
+
+## Initial Setup 
+
+Let's start by importing some simple building blocks.  
+
+The main thing we need is:
+1. the NVIDIA NIM Endpoint (using our own `llama_index` LLM class)
+2. a place to keep conversation history 
+3. a definition for tools that our agent can use.
+
+
+```python
+%pip install --upgrade --quiet llama-index-llms-nvidia
+```
+
+
+```python
+import getpass
+import os
+
+# del os.environ['NVIDIA_API_KEY']  ## delete key and reset
+if os.environ.get("NVIDIA_API_KEY", "").startswith("nvapi-"):
+    print("Valid NVIDIA_API_KEY already in environment. Delete to reset")
+else:
+    nvapi_key = getpass.getpass("NVAPI Key (starts with nvapi-): ")
+    assert nvapi_key.startswith(
+        "nvapi-"
+    ), f"{nvapi_key[:5]}... is not a valid key"
+    os.environ["NVIDIA_API_KEY"] = nvapi_key
+```
+
+    Valid NVIDIA_API_KEY already in environment. Delete to reset
+
+
+
+```python
+from llama_index.llms.nvidia import NVIDIA
+from llama_index.core.tools import FunctionTool
+from llama_index.embeddings.nvidia import NVIDIAEmbedding
+```
+
+Let's define some very simple calculator tools for our agent.
+
+
+```python
+def multiply(a: int, b: int) -> int:
+    """Multiple two integers and returns the result integer"""
+    return a * b
+
+
+def add(a: int, b: int) -> int:
+    """Add two integers and returns the result integer"""
+    return a + b
+```
+
+Here we initialize a simple NVIDIA agent with calculator functions.
+
+
+```python
+llm = NVIDIA("meta/llama-3.1-70b-instruct")
+```
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent
+
+agent = FunctionAgent(
+    tools=[multiply, add],
+    llm=llm,
+)
+```
+
+### Chat
+
+
+```python
+response = await agent.run("What is (121 * 3) + 42?")
+print(str(response))
+```
+
+
+```python
+# inspect sources
+print(response.tool_calls)
+```
+
+### Managing Context/Memory
+
+By default, `.run()` is stateless. If you want to maintain state, you can pass in a `context` object.
+
+
+```python
+from llama_index.core.agent.workflow import Context
+
+ctx = Context(agent)
+
+response = await agent.run("Hello, my name is John Doe.", ctx=ctx)
+print(str(response))
+
+response = await agent.run("What is my name?", ctx=ctx)
+print(str(response))
+```
+
+### Agent with Personality
+
+You can specify a system prompt to give the agent additional instruction or personality.
+
+
+```python
+agent = FunctionAgent(
+    tools=[multiply, add],
+    llm=llm,
+    system_prompt="Talk like a pirate in every response.",
+)
+```
+
+
+```python
+response = await agent.run("Hi")
+print(response)
+```
+
+
+```python
+response = await agent.run("Tell me a story")
+print(response)
+```
+
+# NVIDIA Agent with RAG/Query Engine Tools
+
+
+```python
+!mkdir -p 'data/10k/'
+!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/uber_2021.pdf' -O 'data/10k/uber_2021.pdf'
+```
+
+
+```python
+from llama_index.core.tools import QueryEngineTool
+from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
+
+embed_model = NVIDIAEmbedding(model="NV-Embed-QA", truncate="END")
+
+# load data
+uber_docs = SimpleDirectoryReader(
+    input_files=["./data/10k/uber_2021.pdf"]
+).load_data()
+
+# build index
+uber_index = VectorStoreIndex.from_documents(
+    uber_docs, embed_model=embed_model
+)
+uber_engine = uber_index.as_query_engine(similarity_top_k=3, llm=llm)
+query_engine_tool = QueryEngineTool.from_defaults(
+    query_engine=uber_engine,
+    name="uber_10k",
+    description=(
+        "Provides information about Uber financials for year 2021. "
+        "Use a detailed plain text question as input to the tool."
+    ),
+)
+```
+
+
+```python
+agent = FunctionAgent(tools=[query_engine_tool], llm=llm)
+```
+
+
+```python
+response = await agent.run(
+    "Tell me both the risk factors and tailwinds for Uber? Do two parallel tool calls."
+)
+print(str(response))
+```
+
+# ReAct Agent 
+
+
+```python
+from llama_index.core.agent.workflow import ReActAgent
+```
+
+
+```python
+agent = ReActAgent([multiply_tool, add_tool], llm=llm, verbose=True)
+```
+
+Using the `stream_events()` method, we can stream the response as it is generated to see the agent's thought process.
+
+The final response will have only the final answer.
+
+
+```python
+from llama_index.core.agent.workflow import AgentStream
+
+handler = agent.run("What is 20+(2*4)? Calculate step by step ")
+async for ev in handler.stream_events():
+    if isinstance(ev, AgentStream):
+        print(ev.delta, end="", flush=True)
+
+response = await handler
+```
+
+
+```python
+print(str(response))
+```
+
+
+```python
+print(response.tool_calls)
+```
@@ -0,0 +1,295 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/openai_agent_context_retrieval.ipynb
+toc: True
+title: "Context-Augmented Function Calling Agent"
+featured: False
+experimental: False
+tags: ['Agent', 'Integrations']
+language: py
+---
+In this tutorial, we show you how to to make your agent context-aware.
+
+Our indexing/retrieval modules help to remove the complexity of having too many functions to fit in the prompt.
+
+## Initial Setup 
+
+Here we setup a normal FunctionAgent, and then augment it with context. This agent will perform retrieval first before calling any tools. This can help ground the agent's tool picking and answering capabilities in context.
+
+If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.
+
+
+```python
+%pip install llama-index
+```
+
+
+```python
+import os
+
+os.environ["OPENAI_API_KEY"] = "sk-..."
+```
+
+
+```python
+from llama_index.llms.openai import OpenAI
+from llama_index.embeddings.openai import OpenAIEmbedding
+from llama_index.core.settings import Settings
+
+Settings.llm = OpenAI(model="gpt-4o-mini")
+Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
+```
+
+
+```python
+import json
+from typing import Sequence
+
+from llama_index.core import (
+    SimpleDirectoryReader,
+    VectorStoreIndex,
+    StorageContext,
+    load_index_from_storage,
+)
+from llama_index.core.tools import QueryEngineTool
+```
+
+
+```python
+try:
+    storage_context = StorageContext.from_defaults(
+        persist_dir="./storage/march"
+    )
+    march_index = load_index_from_storage(storage_context)
+
+    storage_context = StorageContext.from_defaults(
+        persist_dir="./storage/june"
+    )
+    june_index = load_index_from_storage(storage_context)
+
+    storage_context = StorageContext.from_defaults(
+        persist_dir="./storage/sept"
+    )
+    sept_index = load_index_from_storage(storage_context)
+
+    index_loaded = True
+except:
+    index_loaded = False
+```
+
+Download Data
+
+
+```python
+!mkdir -p 'data/10q/'
+!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10q/uber_10q_march_2022.pdf' -O 'data/10q/uber_10q_march_2022.pdf'
+!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10q/uber_10q_june_2022.pdf' -O 'data/10q/uber_10q_june_2022.pdf'
+!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10q/uber_10q_sept_2022.pdf' -O 'data/10q/uber_10q_sept_2022.pdf'
+```
+
+
+```python
+# build indexes across the three data sources
+if not index_loaded:
+    # load data
+    march_docs = SimpleDirectoryReader(
+        input_files=["./data/10q/uber_10q_march_2022.pdf"]
+    ).load_data()
+    june_docs = SimpleDirectoryReader(
+        input_files=["./data/10q/uber_10q_june_2022.pdf"]
+    ).load_data()
+    sept_docs = SimpleDirectoryReader(
+        input_files=["./data/10q/uber_10q_sept_2022.pdf"]
+    ).load_data()
+
+    # build index
+    march_index = VectorStoreIndex.from_documents(march_docs)
+    june_index = VectorStoreIndex.from_documents(june_docs)
+    sept_index = VectorStoreIndex.from_documents(sept_docs)
+
+    # persist index
+    march_index.storage_context.persist(persist_dir="./storage/march")
+    june_index.storage_context.persist(persist_dir="./storage/june")
+    sept_index.storage_context.persist(persist_dir="./storage/sept")
+```
+
+
+```python
+march_engine = march_index.as_query_engine(similarity_top_k=3)
+june_engine = june_index.as_query_engine(similarity_top_k=3)
+sept_engine = sept_index.as_query_engine(similarity_top_k=3)
+```
+
+
+```python
+query_engine_tools = [
+    QueryEngineTool.from_defaults(
+        query_engine=march_engine,
+        name="uber_march_10q",
+        description=(
+            "Provides information about Uber 10Q filings for March 2022. "
+            "Use a detailed plain text question as input to the tool."
+        ),
+    ),
+    QueryEngineTool.from_defaults(
+        query_engine=june_engine,
+        name="uber_june_10q",
+        description=(
+            "Provides information about Uber financials for June 2021. "
+            "Use a detailed plain text question as input to the tool."
+        ),
+    ),
+    QueryEngineTool.from_defaults(
+        query_engine=sept_engine,
+        name="uber_sept_10q",
+        description=(
+            "Provides information about Uber financials for Sept 2021. "
+            "Use a detailed plain text question as input to the tool."
+        ),
+    ),
+]
+```
+
+### Try Context-Augmented Agent
+
+Here we augment our agent with context in different settings:
+- toy context: we define some abbreviations that map to financial terms (e.g. R=Revenue). We supply this as context to the agent
+
+
+```python
+from llama_index.core import Document
+from llama_index.core.agent.workflow import FunctionAgent
+```
+
+
+```python
+# toy index - stores a list of abbreviations
+texts = [
+    "Abbreviation: 'Y' = Revenue",
+    "Abbreviation: 'X' = Risk Factors",
+    "Abbreviation: 'Z' = Costs",
+]
+docs = [Document(text=t) for t in texts]
+context_index = VectorStoreIndex.from_documents(docs)
+context_retriever = context_index.as_retriever(similarity_top_k=2)
+```
+
+
+```python
+from llama_index.core.tools import BaseTool
+
+system_prompt_template = """You are a helpful assistant. 
+Here is some context that you can use to answer the user's question and for help with picking the right tool:
+
+{context}
+"""
+
+
+async def get_agent_with_context_awareness(
+    query: str, context_retriever, tools: list[BaseTool]
+) -> FunctionAgent:
+    context_nodes = await context_retriever.aretrieve(query)
+    context_text = "\n----\n".join([n.node.text for n in context_nodes])
+
+    return FunctionAgent(
+        tools=tools,
+        llm=OpenAI(model="gpt-4o"),
+        system_prompt=system_prompt_template.format(context=context_text),
+    )
+```
+
+
+```python
+query = "What is the 'X' of March 2022?"
+agent = await get_agent_with_context_awareness(
+    query, context_retriever, query_engine_tools
+)
+
+response = await agent.run(query)
+```
+
+
+```python
+print(str(response))
+```
+
+    The risk factors mentioned in Uber's 10-Q filing for March 2022 include uncertainties related to the COVID-19 pandemic, such as the severity and duration of the outbreak, potential future waves or variants of the virus, the administration and efficacy of vaccines, and the impact of governmental actions. There are also concerns regarding the effects on drivers, merchants, consumers, and business partners, as well as other factors that may affect the company's business, results of operations, financial position, and cash flows.
+
+
+
+```python
+query = "What is the 'Y' and 'Z' in September 2022?"
+agent = await get_agent_with_context_awareness(
+    query, context_retriever, query_engine_tools
+)
+
+response = await agent.run(query)
+```
+
+
+```python
+print(str(response))
+```
+
+    In September 2022, Uber's revenue (Y) was $8,343 million, and the total costs (Z) were $8,839 million.
+
+
+### Managing Context/Memory
+
+By default, each `.run()` call is stateless. We can manage context by using a serializable `Context` object.
+
+
+```python
+from llama_index.core.workflow import Context
+
+ctx = Context(agent)
+
+query = "What is the 'Y' and 'Z' in September 2022?"
+agent = await get_agent_with_context_awareness(
+    query, context_retriever, query_engine_tools
+)
+response = await agent.run(query, ctx=ctx)
+
+query = "What did I just ask?"
+agent = await get_agent_with_context_awareness(
+    query, context_retriever, query_engine_tools
+)
+response = await agent.run(query, ctx=ctx)
+print(str(response))
+```
+
+    You asked for the revenue ('Y') and costs ('Z') for Uber in September 2022.
+
+
+### Use Uber 10-Q as context, use Calculator as Tool
+
+
+```python
+from llama_index.core.tools import FunctionTool
+
+
+def magic_formula(revenue: int, cost: int) -> int:
+    """Runs MAGIC_FORMULA on revenue and cost."""
+    return revenue - cost
+
+
+magic_tool = FunctionTool.from_defaults(magic_formula)
+```
+
+
+```python
+context_retriever = sept_index.as_retriever(similarity_top_k=3)
+```
+
+
+```python
+query = "Can you run MAGIC_FORMULA on Uber's revenue and cost?"
+agent = await get_agent_with_context_awareness(
+    query, context_retriever, [magic_tool]
+)
+response = await agent.run(query)
+print(str(response))
+```
+
+    The result of running MAGIC_FORMULA on Uber's revenue of $8,343 million and cost of $5,173 million is 3,170.
+
@@ -0,0 +1,783 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/openai_agent_query_cookbook.ipynb
+toc: True
+title: "OpenAI Agent + Query Engine Experimental Cookbook"
+featured: False
+experimental: False
+tags: ['Agent', 'Integrations']
+language: py
+---
+In this notebook, we try out the OpenAIAgent across a variety of query engine tools and datasets. We explore how OpenAIAgent can compare/replace existing workflows solved by our retrievers/query engines.
+
+- Auto retrieval 
+- Joint SQL and vector search
+
+**NOTE:** Any Text-to-SQL application should be aware that executing 
+arbitrary SQL queries can be a security risk. It is recommended to
+take precautions as needed, such as using restricted roles, read-only
+databases, sandboxing, etc.
+
+## AutoRetrieval from a Vector Database
+
+Our existing "auto-retrieval" capabilities (in `VectorIndexAutoRetriever`) allow an LLM to infer the right query parameters for a vector database - including both the query string and metadata filter.
+
+Since the OpenAI Function API can infer function parameters, we explore its capabilities in performing auto-retrieval here.
+
+If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.
+
+
+```python
+%pip install llama-index
+%pip install llama-index-llms-openai
+%pip install llama-index-readers-wikipedia
+%pip install llama-index-vector-stores-pinecone
+```
+
+
+```python
+import os
+
+os.environ["PINECONE_API_KEY"] = "..."
+os.environ["OPENAI_API_KEY"] = "..."
+```
+
+
+```python
+from llama_index.llms.openai import OpenAI
+from llama_index.embeddings.openai import OpenAIEmbedding
+from llama_index.core import Settings
+
+Settings.llm = OpenAI(model="gpt-4o-mini")
+Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
+```
+
+
+```python
+from pinecone import Pinecone, ServerlessSpec
+
+pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
+```
+
+
+```python
+# dimensions are for text-embedding-3-small
+pc.create_index(
+    name="quickstart-index",
+    dimension=1536,
+    metric="euclidean",
+    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
+)
+
+# may need to wait for index to be created
+import time
+
+time.sleep(10)
+```
+
+
+
+
+    {
+        "name": "quickstart-index",
+        "metric": "euclidean",
+        "host": "quickstart-index-c2e1535.svc.aped-4627-b74a.pinecone.io",
+        "spec": {
+            "serverless": {
+                "cloud": "aws",
+                "region": "us-east-1"
+            }
+        },
+        "status": {
+            "ready": true,
+            "state": "Ready"
+        },
+        "vector_type": "dense",
+        "dimension": 1536,
+        "deletion_protection": "disabled",
+        "tags": null
+    }
+
+
+
+
+```python
+index = pc.Index("quickstart-index")
+```
+
+
+```python
+# Optional: delete data in your pinecone index
+# index.delete(deleteAll=True, namespace="test")
+```
+
+
+```python
+from llama_index.core import VectorStoreIndex, StorageContext
+from llama_index.vector_stores.pinecone import PineconeVectorStore
+```
+
+
+```python
+from llama_index.core.schema import TextNode
+
+nodes = [
+    TextNode(
+        text=(
+            "Michael Jordan is a retired professional basketball player,"
+            " widely regarded as one of the greatest basketball players of all"
+            " time."
+        ),
+        metadata={
+            "category": "Sports",
+            "country": "United States",
+            "gender": "male",
+            "born": 1963,
+        },
+    ),
+    TextNode(
+        text=(
+            "Angelina Jolie is an American actress, filmmaker, and"
+            " humanitarian. She has received numerous awards for her acting"
+            " and is known for her philanthropic work."
+        ),
+        metadata={
+            "category": "Entertainment",
+            "country": "United States",
+            "gender": "female",
+            "born": 1975,
+        },
+    ),
+    TextNode(
+        text=(
+            "Elon Musk is a business magnate, industrial designer, and"
+            " engineer. He is the founder, CEO, and lead designer of SpaceX,"
+            " Tesla, Inc., Neuralink, and The Boring Company."
+        ),
+        metadata={
+            "category": "Business",
+            "country": "United States",
+            "gender": "male",
+            "born": 1971,
+        },
+    ),
+    TextNode(
+        text=(
+            "Rihanna is a Barbadian singer, actress, and businesswoman. She"
+            " has achieved significant success in the music industry and is"
+            " known for her versatile musical style."
+        ),
+        metadata={
+            "category": "Music",
+            "country": "Barbados",
+            "gender": "female",
+            "born": 1988,
+        },
+    ),
+    TextNode(
+        text=(
+            "Cristiano Ronaldo is a Portuguese professional footballer who is"
+            " considered one of the greatest football players of all time. He"
+            " has won numerous awards and set multiple records during his"
+            " career."
+        ),
+        metadata={
+            "category": "Sports",
+            "country": "Portugal",
+            "gender": "male",
+            "born": 1985,
+        },
+    ),
+]
+```
+
+
+```python
+from llama_index.vector_stores.pinecone import PineconeVectorStore
+from llama_index.core import StorageContext
+
+vector_store = PineconeVectorStore(pinecone_index=index, namespace="test")
+storage_context = StorageContext.from_defaults(vector_store=vector_store)
+```
+
+
+```python
+from llama_index.core import VectorStoreIndex
+
+index = VectorStoreIndex(nodes, storage_context=storage_context)
+```
+
+
+    Upserted vectors:   0%|          | 0/5 [00:00<?, ?it/s]
+
+
+#### Define Function Tool
+
+Here we define the function interface, which is passed to OpenAI to perform auto-retrieval.
+
+We were not able to get OpenAI to work with nested pydantic objects or tuples as arguments,
+so we converted the metadata filter keys and values into lists for the function API to work with.
+
+
+```python
+# define function tool
+from llama_index.core.tools import FunctionTool
+from llama_index.core.vector_stores import (
+    VectorStoreInfo,
+    MetadataInfo,
+    MetadataFilter,
+    MetadataFilters,
+    FilterCondition,
+    FilterOperator,
+)
+from llama_index.core.retrievers import VectorIndexRetriever
+from llama_index.core.query_engine import RetrieverQueryEngine
+
+from typing import List, Tuple, Any
+from pydantic import BaseModel, Field
+
+
+# define vector store info describing schema of vector store
+vector_store_info = VectorStoreInfo(
+    content_info="brief biography of celebrities",
+    metadata_info=[
+        MetadataInfo(
+            name="category",
+            type="str",
+            description=(
+                "Category of the celebrity, one of [Sports, Entertainment,"
+                " Business, Music]"
+            ),
+        ),
+        MetadataInfo(
+            name="country",
+            type="str",
+            description=(
+                "Country of the celebrity, one of [United States, Barbados,"
+                " Portugal]"
+            ),
+        ),
+        MetadataInfo(
+            name="gender",
+            type="str",
+            description=("Gender of the celebrity, one of [male, female]"),
+        ),
+        MetadataInfo(
+            name="born",
+            type="int",
+            description=("Born year of the celebrity, could be any integer"),
+        ),
+    ],
+)
+```
+
+Define AutoRetrieve Functions
+
+
+```python
+from typing import Any, Annotated
+
+
+async def auto_retrieve_fn(
+    query: Annotated[str, "The natural language query/question to answer."],
+    filter_key_list: Annotated[
+        List[str], "List of metadata filter field names"
+    ],
+    filter_value_list: Annotated[
+        List[Any],
+        "List of metadata filter field values (corresponding to names in filter_key_list)",
+    ],
+    filter_operator_list: Annotated[
+        List[str],
+        "Metadata filters conditions (could be one of <, <=, >, >=, ==, !=)",
+    ],
+    filter_condition: Annotated[
+        str, "Metadata filters condition values (could be AND or OR)"
+    ],
+    top_k: Annotated[
+        int, "The number of results to return from the vector database."
+    ],
+):
+    """Auto retrieval function.
+
+    Performs auto-retrieval from a vector database, and then applies a set of filters.
+
+    """
+    query = query or "Query"
+
+    metadata_filters = [
+        MetadataFilter(key=k, value=v, operator=op)
+        for k, v, op in zip(
+            filter_key_list, filter_value_list, filter_operator_list
+        )
+    ]
+    retriever = VectorIndexRetriever(
+        index,
+        filters=MetadataFilters(
+            filters=metadata_filters, condition=filter_condition.lower()
+        ),
+        top_k=top_k,
+    )
+    query_engine = RetrieverQueryEngine.from_args(retriever)
+
+    response = await query_engine.aquery(query)
+    return str(response)
+
+
+description = f"""\
+Use this tool to look up biographical information about celebrities.
+The vector database schema is given below:
+
+<schema>
+{vector_store_info.model_dump_json()}
+</schema>
+"""
+
+auto_retrieve_tool = FunctionTool.from_defaults(
+    auto_retrieve_fn,
+    name="celebrity_bios",
+    description=description,
+)
+```
+
+#### Initialize Agent
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent
+from llama_index.core.workflow import Context
+from llama_index.llms.openai import OpenAI
+
+agent = FunctionAgent(
+    tools=[auto_retrieve_tool],
+    llm=OpenAI(model="gpt-4o"),
+    system_prompt=(
+        "You are a helpful assistant that can answer questions about celebrities by writing a filtered query to a vector database. "
+        "Unless the user is asking to compare things, you generally only need to make one call to the retriever."
+    ),
+)
+
+# hold the context/session state for the agent
+ctx = Context(agent)
+```
+
+
+```python
+from llama_index.core.agent.workflow import (
+    ToolCallResult,
+    ToolCall,
+    AgentStream,
+    AgentInput,
+    AgentOutput,
+)
+
+handler = agent.run(
+    "Tell me about two celebrities from the United States. ", ctx=ctx
+)
+
+async for ev in handler.stream_events():
+    if isinstance(ev, ToolCallResult):
+        print(
+            f"\nCalled tool {ev.tool_name} with args {ev.tool_kwargs}, got response: {ev.tool_output}"
+        )
+    elif isinstance(ev, AgentStream):
+        print(ev.delta, end="", flush=True)
+
+response = await handler
+```
+
+    
+    Called tool celebrity_bios with args {'query': 'celebrities from the United States', 'filter_key_list': ['country'], 'filter_value_list': ['United States'], 'filter_operator_list': ['=='], 'filter_condition': 'AND', 'top_k': 2}, got response: Angelina Jolie and Elon Musk are notable celebrities from the United States.
+    Here are two celebrities from the United States:
+    
+    1. **Angelina Jolie**: She is a renowned actress, filmmaker, and humanitarian. Jolie has received numerous accolades, including an Academy Award and three Golden Globe Awards. She is also known for her humanitarian efforts, particularly her work with refugees as a Special Envoy for the United Nations High Commissioner for Refugees (UNHCR).
+    
+    2. **Elon Musk**: He is a prominent entrepreneur and business magnate. Musk is the CEO and lead designer of SpaceX, CEO and product architect of Tesla, Inc., and has been involved in numerous other ventures, including Neuralink and The Boring Company. He is known for his ambitious vision of the future, including space exploration and sustainable energy.
+
+
+```python
+handler = agent.run("Tell me about two celebrities born after 1980. ", ctx=ctx)
+
+async for ev in handler.stream_events():
+    if isinstance(ev, ToolCallResult):
+        print(
+            f"\nCalled tool {ev.tool_name} with args {ev.tool_kwargs}, got response: {ev.tool_output}"
+        )
+    elif isinstance(ev, AgentStream):
+        print(ev.delta, end="", flush=True)
+
+response = await handler
+```
+
+    
+    Called tool celebrity_bios with args {'query': 'celebrities born after 1980', 'filter_key_list': ['born'], 'filter_value_list': [1980], 'filter_operator_list': ['>'], 'filter_condition': 'AND', 'top_k': 2}, got response: Rihanna, born in 1988, is a celebrity who fits the criteria of being born after 1980.
+    Here is a celebrity born after 1980:
+    
+    - **Rihanna**: Born in 1988, Rihanna is a Barbadian singer, actress, and businesswoman. She gained worldwide fame with her music career, producing hits like "Umbrella," "Diamonds," and "Work." Beyond music, Rihanna has made a significant impact in the fashion and beauty industries with her Fenty brand, known for its inclusivity and innovation.
+
+
+```python
+response = await agent.run(
+    "Tell me about few celebrities under category business and born after 1950. ",
+    ctx=ctx,
+)
+print(str(response))
+```
+
+    Here is a celebrity in the business category who was born after 1950:
+    
+    - **Elon Musk**: He is a prominent entrepreneur and business magnate, born in 1971. Musk is the CEO and lead designer of SpaceX, CEO and product architect of Tesla, Inc., and has been involved in numerous other ventures, including Neuralink and The Boring Company. He is known for his ambitious vision of the future, including space exploration and sustainable energy.
+
+
+## Joint Text-to-SQL and Semantic Search
+
+This is currently handled by our `SQLAutoVectorQueryEngine`.
+
+Let's try implementing this by giving our `OpenAIAgent` access to two query tools: SQL and Vector 
+
+**NOTE:** Any Text-to-SQL application should be aware that executing 
+arbitrary SQL queries can be a security risk. It is recommended to
+take precautions as needed, such as using restricted roles, read-only
+databases, sandboxing, etc.
+
+#### Load and Index Structured Data
+
+We load sample structured datapoints into a SQL db and index it.
+
+
+```python
+from sqlalchemy import (
+    create_engine,
+    MetaData,
+    Table,
+    Column,
+    String,
+    Integer,
+    select,
+    column,
+)
+from llama_index.core import SQLDatabase
+from llama_index.core.indices import SQLStructStoreIndex
+
+engine = create_engine("sqlite:///:memory:", future=True)
+metadata_obj = MetaData()
+```
+
+
+```python
+# create city SQL table
+table_name = "city_stats"
+city_stats_table = Table(
+    table_name,
+    metadata_obj,
+    Column("city_name", String(16), primary_key=True),
+    Column("population", Integer),
+    Column("country", String(16), nullable=False),
+)
+
+metadata_obj.create_all(engine)
+```
+
+
+```python
+# print tables
+metadata_obj.tables.keys()
+```
+
+
+
+
+    dict_keys(['city_stats'])
+
+
+
+
+```python
+from sqlalchemy import insert
+
+rows = [
+    {"city_name": "Toronto", "population": 2930000, "country": "Canada"},
+    {"city_name": "Tokyo", "population": 13960000, "country": "Japan"},
+    {"city_name": "Berlin", "population": 3645000, "country": "Germany"},
+]
+for row in rows:
+    stmt = insert(city_stats_table).values(**row)
+    with engine.begin() as connection:
+        cursor = connection.execute(stmt)
+```
+
+
+```python
+with engine.connect() as connection:
+    cursor = connection.exec_driver_sql("SELECT * FROM city_stats")
+    print(cursor.fetchall())
+```
+
+    [('Toronto', 2930000, 'Canada'), ('Tokyo', 13960000, 'Japan'), ('Berlin', 3645000, 'Germany')]
+
+
+
+```python
+sql_database = SQLDatabase(engine, include_tables=["city_stats"])
+```
+
+
+```python
+from llama_index.core.query_engine import NLSQLTableQueryEngine
+
+query_engine = NLSQLTableQueryEngine(
+    sql_database=sql_database,
+    tables=["city_stats"],
+)
+```
+
+#### Load and Index Unstructured Data
+
+We load unstructured data into a vector index backed by Pinecone
+
+
+```python
+# install wikipedia python package
+%pip install wikipedia llama-index-readers-wikipedia
+```
+
+
+```python
+from llama_index.readers.wikipedia import WikipediaReader
+
+cities = ["Toronto", "Berlin", "Tokyo"]
+wiki_docs = WikipediaReader().load_data(pages=cities)
+```
+
+
+```python
+from pinecone import Pinecone, ServerlessSpec
+
+pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
+```
+
+
+```python
+# dimensions are for text-embedding-3-small
+pc.create_index(
+    name="quickstart-sql",
+    dimension=1536,
+    metric="euclidean",
+    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
+)
+
+# may need to wait for index to be created
+import time
+
+time.sleep(10)
+```
+
+
+```python
+# define pinecone index
+index = pc.Index("quickstart-sql")
+```
+
+
+```python
+# OPTIONAL: delete all
+index.delete(deleteAll=True)
+```
+
+
+```python
+from llama_index.core import VectorStoreIndex, StorageContext
+from llama_index.vector_stores.pinecone import PineconeVectorStore
+from llama_index.core.node_parser import TokenTextSplitter
+from llama_index.llms.openai import OpenAI
+from llama_index.embeddings.openai import OpenAIEmbedding
+
+# define node parser and LLM
+Settings.llm = OpenAI(temperature=0, model="gpt-4o-mini")
+Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
+Settings.node_parser = TokenTextSplitter(chunk_size=1024)
+
+# define pinecone vector index
+vector_store = PineconeVectorStore(
+    pinecone_index=index, namespace="wiki_cities"
+)
+storage_context = StorageContext.from_defaults(vector_store=vector_store)
+vector_index = VectorStoreIndex([], storage_context=storage_context)
+```
+
+
+```python
+# Insert documents into vector index
+# Each document has metadata of the city attached
+for city, wiki_doc in zip(cities, wiki_docs):
+    nodes = Settings.node_parser.get_nodes_from_documents([wiki_doc])
+    # add metadata to each node
+    for node in nodes:
+        node.metadata = {"title": city}
+    vector_index.insert_nodes(nodes)
+```
+
+#### Define Query Engines / Tools
+
+
+```python
+from llama_index.core.retrievers import VectorIndexAutoRetriever
+from llama_index.core.vector_stores import MetadataInfo, VectorStoreInfo
+from llama_index.core.query_engine import RetrieverQueryEngine
+from llama_index.core.tools import QueryEngineTool
+
+
+vector_store_info = VectorStoreInfo(
+    content_info="articles about different cities",
+    metadata_info=[
+        MetadataInfo(
+            name="title", type="str", description="The name of the city"
+        ),
+    ],
+)
+
+# pre-built auto-retriever, this works similarly to our custom auto-retriever above
+vector_auto_retriever = VectorIndexAutoRetriever(
+    vector_index, vector_store_info=vector_store_info
+)
+
+retriever_query_engine = RetrieverQueryEngine.from_args(
+    vector_auto_retriever,
+)
+```
+
+
+```python
+sql_tool = QueryEngineTool.from_defaults(
+    query_engine=query_engine,
+    name="sql_tool",
+    description=(
+        "Useful for translating a natural language query into a SQL query over"
+        " a table containing: city_stats, containing the population/country of"
+        " each city"
+    ),
+)
+vector_tool = QueryEngineTool.from_defaults(
+    query_engine=retriever_query_engine,
+    name="vector_tool",
+    description=(
+        "Useful for answering semantic questions about different cities"
+    ),
+)
+```
+
+#### Initialize Agent
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent
+from llama_index.llms.openai import OpenAI
+from llama_index.core.workflow import Context
+
+agent = FunctionAgent(
+    tools=[sql_tool, vector_tool],
+    llm=OpenAI(model="gpt-4o"),
+)
+
+# hold the context/session state for the agent
+ctx = Context(agent)
+```
+
+
+```python
+from llama_index.core.agent.workflow import (
+    ToolCallResult,
+    ToolCall,
+    AgentStream,
+    AgentInput,
+    AgentOutput,
+)
+
+handler = agent.run(
+    "Tell me about the arts and culture of the city with the highest population. ",
+    ctx=ctx,
+)
+
+async for ev in handler.stream_events():
+    if isinstance(ev, ToolCallResult):
+        print(
+            f"\nCalled tool {ev.tool_name} with args {ev.tool_kwargs}, got response: {ev.tool_output}"
+        )
+    elif isinstance(ev, AgentStream):
+        print(ev.delta, end="", flush=True)
+
+response = await handler
+```
+
+    
+    Called tool sql_tool with args {'input': 'SELECT city FROM city_stats ORDER BY population DESC LIMIT 1;'}, got response: The city with the highest population is Tokyo.
+    
+    Called tool vector_tool with args {'input': 'Tell me about the arts and culture of Tokyo.'}, got response: Tokyo boasts a vibrant arts and culture scene, characterized by a diverse range of museums, galleries, and performance venues. Ueno Park is a cultural hub, housing the Tokyo National Museum, which specializes in traditional Japanese art, alongside the National Museum of Western Art, a UNESCO World Heritage site, and the National Museum of Nature and Science. The park also features Ueno Zoo, known for its giant pandas.
+    
+    The city is home to numerous notable museums, including the Artizon Museum, the National Museum of Emerging Science and Innovation, and the Edo-Tokyo Museum, which explores the city's history. Contemporary art is showcased at the Mori Art Museum and the Sumida Hokusai Museum, while the Sompo Museum of Art is recognized for its collection, including Van Gogh's "Sunflowers."
+    
+    The performing arts thrive in Tokyo, with venues like the National Noh Theatre and Kabuki-za dedicated to traditional Japanese theatre. The New National Theatre Tokyo hosts a variety of performances, including opera and ballet. Major concert venues such as the Nippon Budokan and Tokyo Dome frequently feature popular music acts.
+    
+    Tokyo's nightlife is vibrant, particularly in districts like Shibuya and Roppongi, which are filled with bars, clubs, and live music venues. The city is also known for its festivals, such as the Sannō Matsuri and the Sanja Festival, which celebrate traditional culture.
+    
+    Shopping districts like Ginza and Nihombashi offer a blend of high-end retail and cultural experiences, while areas like Jinbōchō are famous for their literary connections, featuring bookstores and cafes linked to renowned authors. Overall, Tokyo's arts and culture reflect a rich tapestry of traditional and contemporary influences, making it a dynamic city for cultural exploration.
+    Tokyo, the city with the highest population, boasts a vibrant arts and culture scene. It features a diverse range of museums, galleries, and performance venues. Ueno Park serves as a cultural hub, housing the Tokyo National Museum, the National Museum of Western Art, and the National Museum of Nature and Science. The park also includes Ueno Zoo, known for its giant pandas.
+    
+    Notable museums in Tokyo include the Artizon Museum, the National Museum of Emerging Science and Innovation, and the Edo-Tokyo Museum, which explores the city's history. Contemporary art is showcased at the Mori Art Museum and the Sumida Hokusai Museum, while the Sompo Museum of Art is recognized for its collection, including Van Gogh's "Sunflowers."
+    
+    The performing arts thrive with venues like the National Noh Theatre and Kabuki-za dedicated to traditional Japanese theatre. The New National Theatre Tokyo hosts a variety of performances, including opera and ballet. Major concert venues such as the Nippon Budokan and Tokyo Dome frequently feature popular music acts.
+    
+    Tokyo's nightlife is vibrant, especially in districts like Shibuya and Roppongi, filled with bars, clubs, and live music venues. The city is also known for its festivals, such as the Sannō Matsuri and the Sanja Festival, celebrating traditional culture.
+    
+    Shopping districts like Ginza and Nihombashi offer a blend of high-end retail and cultural experiences, while areas like Jinbōchō are famous for their literary connections, featuring bookstores and cafes linked to renowned authors. Overall, Tokyo's arts and culture reflect a rich tapestry of traditional and contemporary influences, making it a dynamic city for cultural exploration.
+
+
+```python
+handler = agent.run("Tell me about the history of Berlin", ctx=ctx)
+
+async for ev in handler.stream_events():
+    if isinstance(ev, ToolCallResult):
+        print(
+            f"\nCalled tool {ev.tool_name} with args {ev.tool_kwargs}, got response: {ev.tool_output}"
+        )
+    elif isinstance(ev, AgentStream):
+        print(ev.delta, end="", flush=True)
+
+response = await handler
+```
+
+    
+    Called tool vector_tool with args {'input': 'Tell me about the history of Berlin.'}, got response: Berlin's history dates back to prehistoric times, with evidence of human settlements as early as 60,000 BC. The area saw the emergence of various cultures, including the Maglemosian culture around 9,000 BC and the Lusatian culture around 2,000 BC, as dense human settlements developed along the Spree and Havel rivers. By 500 BC, Germanic tribes began to settle in the region, followed by Slavic tribes in the 7th century.
+    
+    In the 12th century, the region came under German rule with the establishment of the Margraviate of Brandenburg. The first written records of towns in the area appear in the late 12th century, with Berlin's founding date considered to be 1237. The towns of Berlin and Cölln formed close economic ties and eventually merged, with the Hohenzollern family ruling the area from the 14th century until 1918.
+    
+    The Thirty Years' War in the 17th century devastated Berlin, leading to significant population loss. However, under Frederick William, known as the "Great Elector," the city experienced a revival through policies promoting immigration and religious tolerance. The establishment of the Kingdom of Prussia in 1701 marked a significant turning point, with Berlin becoming its capital.
+    
+    The 19th century brought the Industrial Revolution, transforming Berlin into a major economic center and leading to rapid population growth. By the late 19th century, Berlin was the capital of the newly founded German Empire. The city continued to grow and evolve through the 20th century, experiencing significant events such as World War II, the division into East and West Berlin during the Cold War, and reunification in 1990, when it once again became the capital of a unified Germany.
+    
+    Today, Berlin is recognized as a global city of culture, politics, media, and science, with a diverse economy and rich historical heritage.
+    Berlin's history is rich and varied, dating back to prehistoric times with evidence of human settlements as early as 60,000 BC. The area saw the emergence of various cultures, including the Maglemosian culture around 9,000 BC and the Lusatian culture around 2,000 BC, with dense settlements along the Spree and Havel rivers. By 500 BC, Germanic tribes settled in the region, followed by Slavic tribes in the 7th century.
+    
+    In the 12th century, the region came under German rule with the establishment of the Margraviate of Brandenburg. Berlin's founding date is considered to be 1237, with the towns of Berlin and Cölln forming close economic ties and eventually merging. The Hohenzollern family ruled the area from the 14th century until 1918.
+    
+    The Thirty Years' War in the 17th century devastated Berlin, but it experienced a revival under Frederick William, the "Great Elector," through policies promoting immigration and religious tolerance. The establishment of the Kingdom of Prussia in 1701 marked a significant turning point, with Berlin becoming its capital.
+    
+    The 19th century brought the Industrial Revolution, transforming Berlin into a major economic center and leading to rapid population growth. By the late 19th century, Berlin was the capital of the newly founded German Empire. The city continued to evolve through the 20th century, experiencing significant events such as World War II, the division into East and West Berlin during the Cold War, and reunification in 1990, when it once again became the capital of a unified Germany.
+    
+    Today, Berlin is recognized as a global city of culture, politics, media, and science, with a diverse economy and rich historical heritage.
+
+
+```python
+response = await agent.run(
+    "Can you give me the country corresponding to each city?", ctx=ctx
+)
+
+print(str(response))
+```
+
+    Here are the cities along with their corresponding countries:
+    
+    - Toronto is in Canada.
+    - Tokyo is in Japan.
+    - Berlin is in Germany.
+
@@ -0,0 +1,170 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/openai_agent_retrieval.ipynb
+toc: True
+title: "Retrieval-Augmented Agents"
+featured: False
+experimental: False
+tags: ['Agent', 'Integrations']
+language: py
+---
+In this tutorial, we show you how to use our `FunctionAgent` or `ReActAgent` implementation with a tool retriever, 
+to augment any existing agent and store/index an arbitrary number of tools. 
+
+Our indexing/retrieval modules help to remove the complexity of having too many functions to fit in the prompt.
+
+## Initial Setup 
+
+Let's start by importing some simple building blocks.  
+
+The main thing we need is:
+1. the OpenAI API
+2. a place to keep conversation history 
+3. a definition for tools that our agent can use.
+
+If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.
+
+
+```python
+%pip install llama-index
+```
+
+
+```python
+import os
+
+os.environ["OPENAI_API_KEY"] = "sk-..."
+```
+
+Let's define some very simple calculator tools for our agent.
+
+
+```python
+from llama_index.core.tools import FunctionTool
+
+
+def multiply(a: int, b: int) -> int:
+    """Multiply two integers and returns the result integer"""
+    return a * b
+
+
+def add(a: int, b: int) -> int:
+    """Add two integers and returns the result integer"""
+    return a + b
+
+
+def useless(a: int, b: int) -> int:
+    """Toy useless function."""
+    pass
+
+
+multiply_tool = FunctionTool.from_defaults(multiply, name="multiply")
+add_tool = FunctionTool.from_defaults(add, name="add")
+
+# toy-example of many tools
+useless_tools = [
+    FunctionTool.from_defaults(useless, name=f"useless_{str(idx)}")
+    for idx in range(28)
+]
+
+all_tools = [multiply_tool] + [add_tool] + useless_tools
+
+all_tools_map = {t.metadata.name: t for t in all_tools}
+```
+
+## Building an Object Index
+
+We have an `ObjectIndex` construct in LlamaIndex that allows the user to use our index data structures over arbitrary objects.
+The ObjectIndex will handle serialiation to/from the object, and use an underying index (e.g. VectorStoreIndex, SummaryIndex, KeywordTableIndex) as the storage mechanism. 
+
+In this case, we have a large collection of Tool objects, and we'd want to define an ObjectIndex over these Tools.
+
+The index comes bundled with a retrieval mechanism, an `ObjectRetriever`. 
+
+This can be passed in to our agent so that it can 
+perform Tool retrieval during query-time.
+
+
+```python
+# define an "object" index over these tools
+from llama_index.core import VectorStoreIndex
+from llama_index.core.objects import ObjectIndex
+
+obj_index = ObjectIndex.from_objects(
+    all_tools,
+    index_cls=VectorStoreIndex,
+    # if we were using an external vector store, we could pass the stroage context and any other kwargs
+    # storage_context=storage_context,
+    # embed_model=embed_model,
+    # ...
+)
+```
+
+To reload the index later, we can use the `from_objects_and_index` method.
+
+
+```python
+# from llama_index.core import StorageContext, load_index_from_storage
+
+# saving and loading from disk
+# obj_index.index.storage_context.persist(persist_dir="obj_index_storage")
+
+# reloading from disk
+# vector_index = load_index_from_storage(StorageContext.from_defaults(persist_dir="obj_index_storage"))
+
+# or if using an external vector store, no need to persist, just reload the index
+# vector_index = VectorStoreIndex.from_vector_store(vector_store=vector_store, ...)
+
+# Then, we can reload the ObjectIndex
+# obj_index = ObjectIndex.from_objects_and_index(
+#     all_tools,
+#     index=vector_index,
+# )
+```
+
+## Agent w/ Tool Retrieval 
+
+Agents in LlamaIndex can be used with a `ToolRetriever` to retrieve tools during query-time.
+
+During query-time, we would first use the `ObjectRetriever` to retrieve a set of relevant Tools. These tools would then be passed into the agent; more specifically, their function signatures would be passed into the OpenAI Function calling API. 
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent, ReActAgent
+from llama_index.core.workflow import Context
+from llama_index.llms.openai import OpenAI
+
+agent = FunctionAgent(
+    tool_retriever=obj_index.as_retriever(similarity_top_k=2),
+    llm=OpenAI(model="gpt-4o"),
+)
+
+# context to hold the session/state
+ctx = Context(agent)
+```
+
+
+```python
+resp = await agent.run(
+    "What's 212 multiplied by 122? Make sure to use Tools", ctx=ctx
+)
+print(str(resp))
+print(resp.tool_calls)
+```
+
+    The result of multiplying 212 by 122 is 25,864.
+    [ToolCallResult(tool_name='multiply', tool_kwargs={'a': 212, 'b': 122}, tool_id='call_4Ygos3MpRH7Gj3R79HISRGyH', tool_output=ToolOutput(content='25864', tool_name='multiply', raw_input={'args': (), 'kwargs': {'a': 212, 'b': 122}}, raw_output=25864, is_error=False), return_direct=False)]
+
+
+
+```python
+resp = await agent.run(
+    "What's 212 added to 122 ? Make sure to use Tools", ctx=ctx
+)
+print(str(resp))
+print(resp.tool_calls)
+```
+
+    The result of adding 212 to 122 is 334.
+    [ToolCallResult(tool_name='add', tool_kwargs={'a': 212, 'b': 122}, tool_id='call_rXUfwQ477bcd6bxafQHgETaa', tool_output=ToolOutput(content='334', tool_name='add', raw_input={'args': (), 'kwargs': {'a': 212, 'b': 122}}, raw_output=334, is_error=False), return_direct=False)]
+
@@ -0,0 +1,158 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/openai_agent_with_query_engine.ipynb
+toc: True
+title: "Agent with Query Engine Tools"
+featured: False
+experimental: False
+tags: ['Agent', 'Integrations']
+language: py
+---
+## Build Query Engine Tools
+
+If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.
+
+
+```python
+%pip install llama-index
+```
+
+
+```python
+import os
+
+os.environ["OPENAI_API_KEY"] = "sk-..."
+```
+
+
+```python
+from llama_index.llms.openai import OpenAI
+from llama_index.embeddings.openai import OpenAIEmbedding
+from llama_index.core import Settings
+
+Settings.llm = OpenAI(model="gpt-4o-mini")
+Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
+```
+
+
+```python
+from llama_index.core import StorageContext, load_index_from_storage
+
+try:
+    storage_context = StorageContext.from_defaults(
+        persist_dir="./storage/lyft"
+    )
+    lyft_index = load_index_from_storage(storage_context)
+
+    storage_context = StorageContext.from_defaults(
+        persist_dir="./storage/uber"
+    )
+    uber_index = load_index_from_storage(storage_context)
+
+    index_loaded = True
+except:
+    index_loaded = False
+```
+
+Download Data
+
+
+```python
+!mkdir -p 'data/10k/'
+!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/uber_2021.pdf' -O 'data/10k/uber_2021.pdf'
+!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/lyft_2021.pdf' -O 'data/10k/lyft_2021.pdf'
+```
+
+
+```python
+from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
+
+if not index_loaded:
+    # load data
+    lyft_docs = SimpleDirectoryReader(
+        input_files=["./data/10k/lyft_2021.pdf"]
+    ).load_data()
+    uber_docs = SimpleDirectoryReader(
+        input_files=["./data/10k/uber_2021.pdf"]
+    ).load_data()
+
+    # build index
+    lyft_index = VectorStoreIndex.from_documents(lyft_docs)
+    uber_index = VectorStoreIndex.from_documents(uber_docs)
+
+    # persist index
+    lyft_index.storage_context.persist(persist_dir="./storage/lyft")
+    uber_index.storage_context.persist(persist_dir="./storage/uber")
+```
+
+
+```python
+lyft_engine = lyft_index.as_query_engine(similarity_top_k=3)
+uber_engine = uber_index.as_query_engine(similarity_top_k=3)
+```
+
+
+```python
+from llama_index.core.tools import QueryEngineTool
+
+query_engine_tools = [
+    QueryEngineTool.from_defaults(
+        query_engine=lyft_engine,
+        name="lyft_10k",
+        description=(
+            "Provides information about Lyft financials for year 2021. "
+            "Use a detailed plain text question as input to the tool."
+        ),
+    ),
+    QueryEngineTool.from_defaults(
+        query_engine=uber_engine,
+        name="uber_10k",
+        description=(
+            "Provides information about Uber financials for year 2021. "
+            "Use a detailed plain text question as input to the tool."
+        ),
+    ),
+]
+```
+
+## Setup Agent
+
+For LLMs like OpenAI that have a function calling API, we should use the `FunctionAgent`.
+
+For other LLMs, we can use the `ReActAgent`.
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent, ReActAgent
+from llama_index.core.workflow import Context
+
+agent = FunctionAgent(tools=query_engine_tools, llm=OpenAI(model="gpt-4o"))
+
+# context to hold the session/state
+ctx = Context(agent)
+```
+
+## Let's Try It Out!
+
+
+```python
+from llama_index.core.agent.workflow import ToolCallResult, AgentStream
+
+handler = agent.run("What's the revenue for Lyft in 2021 vs Uber?", ctx=ctx)
+
+async for ev in handler.stream_events():
+    if isinstance(ev, ToolCallResult):
+        print(
+            f"Call {ev.tool_name} with args {ev.tool_kwargs}\nReturned: {ev.tool_output}"
+        )
+    elif isinstance(ev, AgentStream):
+        print(ev.delta, end="", flush=True)
+
+response = await handler
+```
+
+    Call lyft_10k with args {'input': "What was Lyft's revenue for the year 2021?"}
+    Returned: Lyft's revenue for the year 2021 was $3,208,323,000.
+    Call uber_10k with args {'input': "What was Uber's revenue for the year 2021?"}
+    Returned: Uber's revenue for the year 2021 was $17.455 billion.
+    In 2021, Lyft's revenue was approximately $3.21 billion, while Uber's revenue was significantly higher at $17.455 billion.
@@ -0,0 +1,282 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/react_agent.ipynb
+toc: True
+title: "ReActAgent - A Simple Intro with Calculator Tools"
+featured: False
+experimental: False
+tags: ['Agent']
+language: py
+---
+This is a notebook that showcases the ReAct agent over very simple calculator tools (no fancy RAG pipelines or API calls).
+
+We show how it can reason step-by-step over different tools to achieve the end goal.
+
+The main advantage of the ReAct agent over a Function Calling agent is that it can work with any LLM regardless of whether it supports function calling.
+
+If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.
+
+
+```python
+%pip install llama-index
+```
+
+
+```python
+import os
+
+os.environ["OPENAI_API_KEY"] = "sk-..."
+```
+
+## Define Function Tools
+
+We setup some trivial `multiply` and `add` tools. Note that you can define arbitrary functions and pass it to the `FunctionTool` (which will process the docstring and parameter signature).
+
+
+```python
+def multiply(a: int, b: int) -> int:
+    """Multiply two integers and returns the result integer"""
+    return a * b
+
+
+def add(a: int, b: int) -> int:
+    """Add two integers and returns the result integer"""
+    return a + b
+```
+
+## Run Some Queries
+
+
+```python
+from llama_index.llms.openai import OpenAI
+from llama_index.core.agent.workflow import ReActAgent
+from llama_index.core.workflow import Context
+
+llm = OpenAI(model="gpt-4o-mini")
+agent = ReActAgent(tools=[multiply, add], llm=llm)
+
+# Create a context to store the conversation history/session state
+ctx = Context(agent)
+```
+
+## Run Some Example Queries
+
+By streaming the result, we can see the full response, including the thought process and tool calls.
+
+If we wanted to stream only the result, we can buffer the stream and start streaming once `Answer:` is in the response.
+
+
+
+```python
+from llama_index.core.agent.workflow import AgentStream, ToolCallResult
+
+handler = agent.run("What is 20+(2*4)?", ctx=ctx)
+
+async for ev in handler.stream_events():
+    # if isinstance(ev, ToolCallResult):
+    #     print(f"\nCall {ev.tool_name} with {ev.tool_kwargs}\nReturned: {ev.tool_output}")
+    if isinstance(ev, AgentStream):
+        print(f"{ev.delta}", end="", flush=True)
+
+response = await handler
+```
+
+    Thought: The current language of the user is: English. I need to use a tool to help me answer the question.
+    Action: multiply
+    Action Input: {"a": 2, "b": 4}Thought: Now I have the result of the multiplication, which is 8. I will add this to 20 to complete the calculation.
+    Action: add
+    Action Input: {'a': 20, 'b': 8}Thought: I can answer without using any more tools. I'll use the user's language to answer.
+    Answer: The result of 20 + (2 * 4) is 28.
+
+
+```python
+print(str(response))
+```
+
+    The result of 20 + (2 * 4) is 28.
+
+
+
+```python
+print(response.tool_calls)
+```
+
+    [ToolCallResult(tool_name='multiply', tool_kwargs={'a': 2, 'b': 4}, tool_id='a394d807-a9b7-42e0-8bff-f47a432d1530', tool_output=ToolOutput(content='8', tool_name='multiply', raw_input={'args': (), 'kwargs': {'a': 2, 'b': 4}}, raw_output=8, is_error=False), return_direct=False), ToolCallResult(tool_name='add', tool_kwargs={'a': 20, 'b': 8}, tool_id='784ccd85-ae9a-4184-9613-3696742064c7', tool_output=ToolOutput(content='28', tool_name='add', raw_input={'args': (), 'kwargs': {'a': 20, 'b': 8}}, raw_output=28, is_error=False), return_direct=False)]
+
+
+## View Prompts
+
+Let's take a look at the core system prompt powering the ReAct agent! 
+
+Within the agent, the current conversation history is dumped below this line.
+
+
+```python
+prompt_dict = agent.get_prompts()
+for k, v in prompt_dict.items():
+    print(f"Prompt: {k}\n\nValue: {v.template}")
+```
+
+    Prompt: react_header
+    
+    Value: You are designed to help with a variety of tasks, from answering questions to providing summaries to other types of analyses.
+    
+    ## Tools
+    
+    You have access to a wide variety of tools. You are responsible for using the tools in any sequence you deem appropriate to complete the task at hand.
+    This may require breaking the task into subtasks and using different tools to complete each subtask.
+    
+    You have access to the following tools:
+    {tool_desc}
+    
+    
+    ## Output Format
+    
+    Please answer in the same language as the question and use the following format:
+    
+    ```
+    Thought: The current language of the user is: (user's language). I need to use a tool to help me answer the question.
+    Action: tool name (one of {tool_names}) if using a tool.
+    Action Input: the input to the tool, in a JSON format representing the kwargs (e.g. {{"input": "hello world", "num_beams": 5}})
+    ```
+    
+    Please ALWAYS start with a Thought.
+    
+    NEVER surround your response with markdown code markers. You may use code markers within your response if you need to.
+    
+    Please use a valid JSON format for the Action Input. Do NOT do this {{'input': 'hello world', 'num_beams': 5}}.
+    
+    If this format is used, the tool will respond in the following format:
+    
+    ```
+    Observation: tool response
+    ```
+    
+    You should keep repeating the above format till you have enough information to answer the question without using any more tools. At that point, you MUST respond in one of the following two formats:
+    
+    ```
+    Thought: I can answer without using any more tools. I'll use the user's language to answer
+    Answer: [your answer here (In the same language as the user's question)]
+    ```
+    
+    ```
+    Thought: I cannot answer the question with the provided tools.
+    Answer: [your answer here (In the same language as the user's question)]
+    ```
+    
+    ## Current Conversation
+    
+    Below is the current conversation consisting of interleaving human and assistant messages.
+    
+
+
+### Customizing the Prompt
+
+For fun, let's try instructing the agent to output the answer along with reasoning in bullet points. See "## Additional Rules" section.
+
+
+```python
+from llama_index.core import PromptTemplate
+
+react_system_header_str = """\
+
+You are designed to help with a variety of tasks, from answering questions \
+    to providing summaries to other types of analyses.
+
+## Tools
+You have access to a wide variety of tools. You are responsible for using
+the tools in any sequence you deem appropriate to complete the task at hand.
+This may require breaking the task into subtasks and using different tools
+to complete each subtask.
+
+You have access to the following tools:
+{tool_desc}
+
+## Output Format
+To answer the question, please use the following format.
+
+```
+Thought: I need to use a tool to help me answer the question.
+Action: tool name (one of {tool_names}) if using a tool.
+Action Input: the input to the tool, in a JSON format representing the kwargs (e.g. {{"input": "hello world", "num_beams": 5}})
+```
+
+Please ALWAYS start with a Thought.
+
+Please use a valid JSON format for the Action Input. Do NOT do this {{'input': 'hello world', 'num_beams': 5}}.
+
+If this format is used, the user will respond in the following format:
+
+```
+Observation: tool response
+```
+
+You should keep repeating the above format until you have enough information
+to answer the question without using any more tools. At that point, you MUST respond
+in the one of the following two formats:
+
+```
+Thought: I can answer without using any more tools.
+Answer: [your answer here]
+```
+
+```
+Thought: I cannot answer the question with the provided tools.
+Answer: Sorry, I cannot answer your query.
+```
+
+## Additional Rules
+- The answer MUST contain a sequence of bullet points that explain how you arrived at the answer. This can include aspects of the previous conversation history.
+- You MUST obey the function signature of each tool. Do NOT pass in no arguments if the function expects arguments.
+
+## Current Conversation
+Below is the current conversation consisting of interleaving human and assistant messages.
+
+"""
+react_system_prompt = PromptTemplate(react_system_header_str)
+```
+
+
+```python
+agent.get_prompts()
+```
+
+
+
+
+    {'react_header': PromptTemplate(metadata={'prompt_type': <PromptType.CUSTOM: 'custom'>}, template_vars=['tool_desc', 'tool_names'], kwargs={}, output_parser=None, template_var_mappings=None, function_mappings=None, template='You are designed to help with a variety of tasks, from answering questions to providing summaries to other types of analyses.\n\n## Tools\n\nYou have access to a wide variety of tools. You are responsible for using the tools in any sequence you deem appropriate to complete the task at hand.\nThis may require breaking the task into subtasks and using different tools to complete each subtask.\n\nYou have access to the following tools:\n{tool_desc}\n\n\n## Output Format\n\nPlease answer in the same language as the question and use the following format:\n\n```\nThought: The current language of the user is: (user\'s language). I need to use a tool to help me answer the question.\nAction: tool name (one of {tool_names}) if using a tool.\nAction Input: the input to the tool, in a JSON format representing the kwargs (e.g. {{"input": "hello world", "num_beams": 5}})\n```\n\nPlease ALWAYS start with a Thought.\n\nNEVER surround your response with markdown code markers. You may use code markers within your response if you need to.\n\nPlease use a valid JSON format for the Action Input. Do NOT do this {{\'input\': \'hello world\', \'num_beams\': 5}}.\n\nIf this format is used, the tool will respond in the following format:\n\n```\nObservation: tool response\n```\n\nYou should keep repeating the above format till you have enough information to answer the question without using any more tools. At that point, you MUST respond in one of the following two formats:\n\n```\nThought: I can answer without using any more tools. I\'ll use the user\'s language to answer\nAnswer: [your answer here (In the same language as the user\'s question)]\n```\n\n```\nThought: I cannot answer the question with the provided tools.\nAnswer: [your answer here (In the same language as the user\'s question)]\n```\n\n## Current Conversation\n\nBelow is the current conversation consisting of interleaving human and assistant messages.\n')}
+
+
+
+
+```python
+agent.update_prompts({"react_header": react_system_prompt})
+```
+
+
+```python
+handler = agent.run("What is 5+3+2")
+
+async for ev in handler.stream_events():
+    # if isinstance(ev, ToolCallResult):
+    #     print(f"\nCall {ev.tool_name} with {ev.tool_kwargs}\nReturned: {ev.tool_output}")
+    if isinstance(ev, AgentStream):
+        print(f"{ev.delta}", end="", flush=True)
+
+response = await handler
+```
+
+    Thought: The current language of the user is: English. I need to use a tool to help me answer the question.
+    Action: add
+    Action Input: {"a": 5, "b": 3}Thought: I need to add the result (8) to the remaining number (2).
+    Action: add
+    Action Input: {'a': 8, 'b': 2}Thought: I can answer without using any more tools. I'll use the user's language to answer.
+    Answer: The result of 5 + 3 + 2 is 10.
+
+
+```python
+print(response)
+```
+
+    The result of 5 + 3 + 2 is 10.
+
@@ -0,0 +1,215 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/react_agent_with_query_engine.ipynb
+toc: True
+title: "ReAct Agent with Query Engine (RAG) Tools"
+featured: False
+experimental: False
+tags: ['Agent']
+language: py
+---
+In this section, we show how to setup an agent powered by the ReAct loop for financial analysis.
+
+The agent has access to two "tools": one to query the 2021 Lyft 10-K and the other to query the 2021 Uber 10-K.
+
+Note that you can plug in any LLM to use as a ReAct agent.
+
+## Build Query Engine Tools
+
+
+```python
+%pip install llama-index
+```
+
+
+```python
+import os
+
+os.environ["OPENAI_API_KEY"] = "sk-..."
+```
+
+
+```python
+from llama_index.llms.openai import OpenAI
+from llama_index.embeddings.openai import OpenAIEmbedding
+from llama_index.core import Settings
+
+Settings.llm = OpenAI(model="gpt-4o-mini")
+Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
+```
+
+
+```python
+from llama_index.core import StorageContext, load_index_from_storage
+
+try:
+    storage_context = StorageContext.from_defaults(
+        persist_dir="./storage/lyft"
+    )
+    lyft_index = load_index_from_storage(storage_context)
+
+    storage_context = StorageContext.from_defaults(
+        persist_dir="./storage/uber"
+    )
+    uber_index = load_index_from_storage(storage_context)
+
+    index_loaded = True
+except:
+    index_loaded = False
+```
+
+Download Data
+
+
+```python
+!mkdir -p 'data/10k/'
+!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/uber_2021.pdf' -O 'data/10k/uber_2021.pdf'
+!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/lyft_2021.pdf' -O 'data/10k/lyft_2021.pdf'
+```
+
+
+```python
+from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
+
+if not index_loaded:
+    # load data
+    lyft_docs = SimpleDirectoryReader(
+        input_files=["./data/10k/lyft_2021.pdf"]
+    ).load_data()
+    uber_docs = SimpleDirectoryReader(
+        input_files=["./data/10k/uber_2021.pdf"]
+    ).load_data()
+
+    # build index
+    lyft_index = VectorStoreIndex.from_documents(lyft_docs)
+    uber_index = VectorStoreIndex.from_documents(uber_docs)
+
+    # persist index
+    lyft_index.storage_context.persist(persist_dir="./storage/lyft")
+    uber_index.storage_context.persist(persist_dir="./storage/uber")
+```
+
+
+```python
+lyft_engine = lyft_index.as_query_engine(similarity_top_k=3)
+uber_engine = uber_index.as_query_engine(similarity_top_k=3)
+```
+
+
+```python
+from llama_index.core.tools import QueryEngineTool
+
+query_engine_tools = [
+    QueryEngineTool.from_defaults(
+        query_engine=lyft_engine,
+        name="lyft_10k",
+        description=(
+            "Provides information about Lyft financials for year 2021. "
+            "Use a detailed plain text question as input to the tool."
+        ),
+    ),
+    QueryEngineTool.from_defaults(
+        query_engine=uber_engine,
+        name="uber_10k",
+        description=(
+            "Provides information about Uber financials for year 2021. "
+            "Use a detailed plain text question as input to the tool."
+        ),
+    ),
+]
+```
+
+## Setup ReAct Agent
+
+Here we setup our ReAct agent with the tools we created above.
+
+You can **optionally** specify a system prompt which will be added to the core ReAct system prompt.
+
+
+```python
+from llama_index.core.agent.workflow import ReActAgent
+from llama_index.core.workflow import Context
+
+agent = ReActAgent(
+    tools=query_engine_tools,
+    llm=OpenAI(model="gpt-4o-mini"),
+    # system_prompt="..."
+)
+
+# context to hold this session/state
+
+ctx = Context(agent)
+```
+
+## Run Some Example Queries
+
+By streaming the result, we can see the full response, including the thought process and tool calls.
+
+If we wanted to stream only the result, we can buffer the stream and start streaming once `Answer:` is in the response.
+
+
+
+```python
+from llama_index.core.agent.workflow import ToolCallResult, AgentStream
+
+handler = agent.run("What was Lyft's revenue growth in 2021?", ctx=ctx)
+
+async for ev in handler.stream_events():
+    # if isinstance(ev, ToolCallResult):
+    #     print(f"\nCall {ev.tool_name} with {ev.tool_kwargs}\nReturned: {ev.tool_output}")
+    if isinstance(ev, AgentStream):
+        print(f"{ev.delta}", end="", flush=True)
+
+response = await handler
+```
+
+    Thought: The current language of the user is: English. I need to use a tool to help me answer the question.
+    Action: lyft_10k
+    Action Input: {"input": "What was Lyft's revenue growth in 2021?"}Thought: I can answer without using any more tools. I'll use the user's language to answer.
+    Answer: Lyft's revenue growth in 2021 was 36% compared to the prior year.
+
+
+```python
+print(str(response))
+```
+
+    Lyft's revenue growth in 2021 was 36% compared to the prior year.
+
+
+
+```python
+handler = agent.run(
+    "Compare and contrast the revenue growth of Uber and Lyft in 2021, then give an analysis",
+    ctx=ctx,
+)
+
+async for ev in handler.stream_events():
+    # if isinstance(ev, ToolCallResult):
+    #     print(f"\nCall {ev.tool_name} with {ev.tool_kwargs}\nReturned: {ev.tool_output}")
+    if isinstance(ev, AgentStream):
+        print(f"{ev.delta}", end="", flush=True)
+
+response = await handler
+```
+
+    Thought: The current language of the user is: English. I need to use a tool to gather information about Uber's revenue growth in 2021 to compare it with Lyft's.
+    Action: uber_10k
+    Action Input: {'input': "What was Uber's revenue growth in 2021?"}Thought: I now have the revenue growth information for both Uber and Lyft in 2021. Lyft's revenue growth was 36%, while Uber's was 57%. I will now provide a comparison and analysis.
+    Thought: I can answer without using any more tools. I'll use the user's language to answer.
+    Answer: In 2021, Uber experienced a revenue growth of 57%, increasing from $11.139 billion in 2020 to $17.455 billion. In contrast, Lyft's revenue growth was 36%. 
+    
+    When comparing the two, Uber outperformed Lyft in terms of revenue growth, indicating a stronger recovery or expansion in its business operations during that year. This could be attributed to Uber's diversified services, including food delivery through Uber Eats, which may have contributed significantly to its revenue. Lyft, primarily focused on ride-sharing, may have faced more challenges in scaling its growth compared to Uber.
+    
+    Overall, while both companies showed positive growth, Uber's higher percentage suggests it was able to capitalize on market opportunities more effectively than Lyft in 2021.
+
+
+```python
+print(str(response))
+```
+
+    In 2021, Uber experienced a revenue growth of 57%, increasing from $11.139 billion in 2020 to $17.455 billion. In contrast, Lyft's revenue growth was 36%. 
+    
+    When comparing the two, Uber outperformed Lyft in terms of revenue growth, indicating a stronger recovery or expansion in its business operations during that year. This could be attributed to Uber's diversified services, including food delivery through Uber Eats, which may have contributed significantly to its revenue. Lyft, primarily focused on ride-sharing, may have faced more challenges in scaling its growth compared to Uber.
+    
+    Overall, while both companies showed positive growth, Uber's higher percentage suggests it was able to capitalize on market opportunities more effectively than Lyft in 2021.
+
@@ -0,0 +1,250 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/return_direct_agent.ipynb
+toc: True
+title: "Controlling Agent Reasoning Loop with Return Direct Tools"
+featured: False
+experimental: False
+tags: ['Agent']
+language: py
+---
+All tools have an option for `return_direct` -- if this is set to `True`, and the associated tool is called (without any other tools being called), the agent reasoning loop is ended and the tool output is returned directly.
+
+This can be useful for speeding up response times when you know the tool output is good enough, to avoid the agent re-writing the response, and for ending the reasoning loop.
+
+This notebook walks through a notebook where an agent needs to gather information from a user in order to make a restaurant booking.
+
+
+```python
+%pip install llama-index-core llama-index-llms-anthropic
+```
+
+
+```python
+import os
+
+os.environ["ANTHROPIC_API_KEY"] = "sk-..."
+```
+
+## Tools setup
+
+
+```python
+from typing import Optional
+
+from llama_index.core.tools import FunctionTool
+from pydantic import BaseModel
+
+# we will store booking under random IDs
+bookings = {}
+
+
+# we will represent and track the state of a booking as a Pydantic model
+class Booking(BaseModel):
+    name: Optional[str] = None
+    email: Optional[str] = None
+    phone: Optional[str] = None
+    date: Optional[str] = None
+    time: Optional[str] = None
+
+
+def get_booking_state(user_id: str) -> str:
+    """Get the current state of a booking for a given booking ID."""
+    try:
+        return str(bookings[user_id].dict())
+    except:
+        return f"Booking ID {user_id} not found"
+
+
+def update_booking(user_id: str, property: str, value: str) -> str:
+    """Update a property of a booking for a given booking ID. Only enter details that are explicitly provided."""
+    booking = bookings[user_id]
+    setattr(booking, property, value)
+    return f"Booking ID {user_id} updated with {property} = {value}"
+
+
+def create_booking(user_id: str) -> str:
+    """Create a new booking and return the booking ID."""
+    bookings[user_id] = Booking()
+    return "Booking created, but not yet confirmed. Please provide your name, email, phone, date, and time."
+
+
+def confirm_booking(user_id: str) -> str:
+    """Confirm a booking for a given booking ID."""
+    booking = bookings[user_id]
+
+    if booking.name is None:
+        raise ValueError("Please provide your name.")
+
+    if booking.email is None:
+        raise ValueError("Please provide your email.")
+
+    if booking.phone is None:
+        raise ValueError("Please provide your phone number.")
+
+    if booking.date is None:
+        raise ValueError("Please provide the date of your booking.")
+
+    if booking.time is None:
+        raise ValueError("Please provide the time of your booking.")
+
+    return f"Booking ID {user_id} confirmed!"
+
+
+# create tools for each function
+get_booking_state_tool = FunctionTool.from_defaults(fn=get_booking_state)
+update_booking_tool = FunctionTool.from_defaults(fn=update_booking)
+create_booking_tool = FunctionTool.from_defaults(
+    fn=create_booking, return_direct=True
+)
+confirm_booking_tool = FunctionTool.from_defaults(
+    fn=confirm_booking, return_direct=True
+)
+```
+
+## A user has walked in! Let's help them make a booking
+
+
+```python
+from llama_index.llms.anthropic import Anthropic
+from llama_index.core.llms import ChatMessage
+from llama_index.core.agent.workflow import FunctionAgent
+from llama_index.core.workflow import Context
+
+llm = Anthropic(model="claude-3-sonnet-20240229", temperature=0.1)
+
+user = "user123"
+system_prompt = f"""You are now connected to the booking system and helping {user} with making a booking.
+Only enter details that the user has explicitly provided.
+Do not make up any details.
+"""
+
+agent = FunctionAgent(
+    tools=[
+        get_booking_state_tool,
+        update_booking_tool,
+        create_booking_tool,
+        confirm_booking_tool,
+    ],
+    llm=llm,
+    system_prompt=system_prompt,
+)
+
+# create a context for the agent to hold the state/history of a session
+ctx = Context(agent)
+```
+
+
+```python
+from llama_index.core.agent.workflow import AgentStream, ToolCallResult
+
+handler = agent.run(
+    "Hello! I would like to make a booking, around 5pm?", ctx=ctx
+)
+
+async for ev in handler.stream_events():
+    if isinstance(ev, AgentStream):
+        print(f"{ev.delta}", end="", flush=True)
+    elif isinstance(ev, ToolCallResult):
+        print(
+            f"\nCall {ev.tool_name} with {ev.tool_kwargs}\nReturned: {ev.tool_output}"
+        )
+
+response = await handler
+```
+
+    Okay, let's create a new booking for you.{"user_id": "user123"}
+    Call create_booking with {'user_id': 'user123'}
+    Returned: Booking created, but not yet confirmed. Please provide your name, email, phone, date, and time.
+
+
+
+```python
+print(str(response))
+```
+
+    Booking created, but not yet confirmed. Please provide your name, email, phone, date, and time.
+
+
+Perfect, we can see the function output was retruned directly, with no modification or final LLM call!
+
+
+```python
+handler = agent.run(
+    "Sure! My name is Logan, and my email is test@gmail.com?", ctx=ctx
+)
+
+async for ev in handler.stream_events():
+    if isinstance(ev, AgentStream):
+        print(f"{ev.delta}", end="", flush=True)
+    elif isinstance(ev, ToolCallResult):
+        print(
+            f"\nCall {ev.tool_name} with {ev.tool_kwargs}\nReturned: {ev.tool_output}"
+        )
+
+response = await handler
+```
+
+    Got it, thanks for providing your name and email. I've updated the booking with that information.{"user_id": "user123", "property": "name", "value": "Logan"}{"user_id": "user123", "property": "email", "value": "test@gmail.com"}
+    Call update_booking with {'user_id': 'user123', 'property': 'name', 'value': 'Logan'}
+    Returned: Booking ID user123 updated with name = Logan
+    
+    Call update_booking with {'user_id': 'user123', 'property': 'email', 'value': 'test@gmail.com'}
+    Returned: Booking ID user123 updated with email = test@gmail.com
+    Please also provide your phone number, preferred date, and time for the booking.
+
+
+```python
+print(str(response))
+```
+
+    Please also provide your phone number, preferred date, and time for the booking.
+
+
+
+```python
+handler = agent.run(
+    "Right! My phone number is 1234567890, the date of the booking is April 5, at 5pm.",
+    ctx=ctx,
+)
+
+async for ev in handler.stream_events():
+    if isinstance(ev, AgentStream):
+        print(f"{ev.delta}", end="", flush=True)
+    elif isinstance(ev, ToolCallResult):
+        print(
+            f"\nCall {ev.tool_name} with {ev.tool_kwargs}\nReturned: {ev.tool_output}"
+        )
+
+response = await handler
+```
+
+    Great, thank you for providing the additional details. I've updated the booking with your phone number, date, and time.{"user_id": "user123", "property": "phone", "value": "1234567890"}{"user_id": "user123", "property": "date", "value": "2023-04-05"}{"user_id": "user123", "property": "time", "value": "17:00"}
+    Call update_booking with {'user_id': 'user123', 'property': 'phone', 'value': '1234567890'}
+    Returned: Booking ID user123 updated with phone = 1234567890
+    
+    Call update_booking with {'user_id': 'user123', 'property': 'date', 'value': '2023-04-05'}
+    Returned: Booking ID user123 updated with date = 2023-04-05
+    
+    Call update_booking with {'user_id': 'user123', 'property': 'time', 'value': '17:00'}
+    Returned: Booking ID user123 updated with time = 17:00
+    Looks like I have all the necessary details. Let me confirm this booking for you.{"user_id": "user123"}
+    Call confirm_booking with {'user_id': 'user123'}
+    Returned: Booking ID user123 confirmed!
+
+
+
+```python
+print(str(response))
+```
+
+    Booking ID user123 confirmed!
+
+
+
+```python
+print(bookings["user123"])
+```
+
+    name='Logan' email='test@gmail.com' phone='1234567890' date='2023-04-05' time='17:00'
+
@@ -2,14 +2,53 @@

 ### The Idea for Pyhton Examples

-1. You add a notebook wherever you want in the `/notebooks` directory
+
+1. You add a notebook wherever you want in the `/notebooks` directory OR, it already exists in another `source`
 2. You want it to go onto the website? You add an entry to "index.toml" for your example. Here's the info we need:
    - A description
    - The location of your example
    - Some tags 
    - If you want, you can mark it 'experimental'
    - If _we_ want, we can make it 'featured'
+    - Mark the `language` as `language = "py"`
 3. A github action on this repo or on the `developers` repo runs the `scripts/notebooks_to_markdown.py` script. Which converts your notebook to markdown with some frontmatter which includes:
    - all of the info above
    - an auto generate 'open in Colab' url so you don't have to worry about adding it yourself
-4. (This is my idea but we can change it) In Astro, we use this frontmatter to generate a tiled 'Cookbook' page with tags and filters. You can easilyl navigate different topics and all LlamaCloud/LITS/LI examples. When you click-> takes you to usual Examples page.
+4. (This is my idea but we can change it) In Astro, we use this frontmatter to generate a tiled 'Cookbook' page with tags and filters. You can easilyl navigate different topics and all LlamaCloud/LITS/LI examples. When you click-> takes you to usual Examples page.
+
+> Extra: This POC has most examples as a local `notebook` and one example where the recipe comes from another `source`. For those that come from a local `notebook`, we also generate a frontmatter element called `colab` which auto-generates the 'open in colab' url. This can be used for an 'open in colab' button 🚀
+
+### (Optional) Example Generated frontmatter for Astro to generate individual recipe pages:
+This is useful if you want to use the generated frontmatter which can be used to add elements to individual recipes like tags, an 'open in colab' button, etc.
+
+```
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/agent_builder.ipynb
+toc: True
+title: "GPT Builder Demo"
+featured: False
+experimental: False
+tags: ['Agent']
+language: py
+---
+```
+### Example build run for the astro website:
+
+```bash
+pip install -r requirements.txt
+python scripts/notebooks_to_markdown.py
+```
+
+This adds all of the _local_ notebooks as markdown, into the `makrdowns` directory, along with their frontmatter.
+
+
+### POC Landing Page Generation for Astro
+
+> Disclaimer: This index.toml to landing page generation code was created with Claude :) 
+
+```bash
+python scripts/cookbook_page_generator.py index.toml -o my_cookbooks.html
+```
+This outputs a POC HTML page that you can have a look at. It's just to demonstrate how the index.toml can be used. 
+Note that it has both local notebooks + external sources
@@ -1,7 +1,7 @@
 [config]
 layout = "recipe"
 toc = true
-colab = "https://colab.research.google.com/github/TuanaCelik/cookbooks-demo/blob/main/"
+colab = "https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/"

 [[recipe]]
 title = "Agent Workflow + Research Assistant using AgentQL"
@@ -9,10 +9,210 @@ notebook = "notebooks/agent/agent_workflow_research_assistant.ipynb"
 tags = ["Agent", "Websearch", "Integrations"]
 description = "Build a research assistant using AgentWorlflow and websearch tools"
 experimental = true
+featured = true
+language = "py"

 [[recipe]]
 title = "Custom Planning Multi-Agent System"
 notebook = "notebooks/agent/custom_multi_agent.ipynb"
 tags = ["Agent"]
 description = "Build a research agent that writes and refines reports with a multi-agent structure."
-featured = true
+featured = true
+language = "py"
+
+[[recipe]]
+title = "Parsing Documents with Instructions"
+source = "https://github.com/run-llama/llama_cloud_services/blob/main/examples/parse/parsing_instructions/parsing_instructions.ipynb"
+tags = ["LlamaParse"]
+description = "Parse documents with additional instructions"
+featured = true
+language = "py"
+
+[[recipe]]
+title = "Agent Builder"
+notebook = "notebooks/agent/agent_builder.ipynb"
+tags = ["Agent"]
+description = "Create and configure agents with custom tools and capabilities"
+language = "py"
+
+[[recipe]]
+title = "Basic Agent Workflow"
+notebook = "notebooks/agent/agent_workflow_basic.ipynb"
+tags = ["Agent"]
+description = "Get started with basic agent workflows and tool usage"
+language = "py"
+
+[[recipe]]
+title = "Multi-Agent Workflow"
+notebook = "notebooks/agent/agent_workflow_multi.ipynb"
+tags = ["Agent"]
+description = "Build complex workflows with multiple collaborating agents"
+language = "py"
+
+[[recipe]]
+title = "Anthropic Claude Agent"
+notebook = "notebooks/agent/anthropic_agent.ipynb"
+tags = ["Agent", "Integrations"]
+description = "Use Anthropic's Claude model as an agent with tools"
+language = "py"
+
+[[recipe]]
+title = "AWS Bedrock Converse Agent"
+notebook = "notebooks/agent/bedrock_converse_agent.ipynb"
+tags = ["Agent", "Integrations"]
+description = "Integrate AWS Bedrock Converse with agent workflows"
+language = "py"
+
+[[recipe]]
+title = "Code Act Agent"
+notebook = "notebooks/agent/code_act_agent.ipynb"
+tags = ["Agent"]
+description = "Build agents that can write and execute code"
+language = "py"
+
+[[recipe]]
+title = "From Scratch Code Act Agent"
+notebook = "notebooks/agent/from_scratch_code_act_agent.ipynb"
+tags = ["Agent"]
+description = "Create a code-writing agent from the ground up"
+language = "py"
+
+[[recipe]]
+title = "Mistral Agent"
+notebook = "notebooks/agent/mistral_agent.ipynb"
+tags = ["Agent", "Integrations"]
+description = "Use Mistral AI models as agents with tool integration"
+language = "py"
+
+[[recipe]]
+title = "NVIDIA Agent"
+notebook = "notebooks/agent/nvidia_agent.ipynb"
+tags = ["Agent", "Integrations"]
+description = "Integrate NVIDIA AI models with agent workflows"
+language = "py"
+
+[[recipe]]
+title = "NVIDIA Document Research Assistant"
+notebook = "notebooks/agent/nvidia_document_research_assistant_for_blog_creation.ipynb"
+tags = ["Agent", "Integrations"]
+description = "Create a document research assistant for blog content generation"
+language = "py"
+
+[[recipe]]
+title = "NVIDIA Sub-Question Query Engine"
+notebook = "notebooks/agent/nvidia_sub_question_query_engine.ipynb"
+tags = ["Agent", "Integrations"]
+description = "Build a query engine that breaks complex questions into sub-questions"
+language = "py"
+
+[[recipe]]
+title = "OpenAI Agent with Context Retrieval"
+notebook = "notebooks/agent/openai_agent_context_retrieval.ipynb"
+tags = ["Agent", "Integrations"]
+description = "Use OpenAI agents with advanced context retrieval capabilities"
+language = "py"
+
+[[recipe]]
+title = "OpenAI Agent with Lengthy Tools"
+notebook = "notebooks/agent/openai_agent_lengthy_tools.ipynb"
+tags = ["Agent", "Integrations"]
+description = "Handle complex tools and long-running operations with OpenAI agents"
+language = "py"
+
+[[recipe]]
+title = "OpenAI Agent Query Cookbook"
+notebook = "notebooks/agent/openai_agent_query_cookbook.ipynb"
+tags = ["Agent", "Integrations"]
+description = "Comprehensive guide to OpenAI agent query patterns and best practices"
+language = "py"
+
+[[recipe]]
+title = "OpenAI Agent Retrieval"
+notebook = "notebooks/agent/openai_agent_retrieval.ipynb"
+tags = ["Agent", "Integrations"]
+description = "Implement retrieval-augmented generation with OpenAI agents"
+language = "py"
+
+[[recipe]]
+title = "OpenAI Agent with Query Engine"
+notebook = "notebooks/agent/openai_agent_with_query_engine.ipynb"
+tags = ["Agent", "Integrations"]
+description = "Combine OpenAI agents with query engines for enhanced information retrieval"
+language = "py"
+
+[[recipe]]
+title = "ReAct Agent"
+notebook = "notebooks/agent/react_agent.ipynb"
+tags = ["Agent"]
+description = "Implement reasoning and acting agents with step-by-step problem solving"
+language = "py"
+
+[[recipe]]
+title = "ReAct Agent with Query Engine"
+notebook = "notebooks/agent/react_agent_with_query_engine.ipynb"
+tags = ["Agent"]
+description = "Combine ReAct agents with query engines for structured reasoning"
+language = "py"
+
+[[recipe]]
+title = "Return Direct Agent"
+notebook = "notebooks/agent/return_direct_agent.ipynb"
+tags = ["Agent"]
+description = "Build agents that return direct responses without intermediate steps"
+language = "py"
+
+[[recipe]]
+title = "Agents as Tools"
+notebook = "notebooks/agent/agents_as_tools.ipynb"
+tags = ["Agent"]
+description = "Use agents as tools within other agent workflows"
+language = "py"
+
+[[recipe]]
+title = "Multi-Agent Workflow with Weaviate"
+notebook = "notebooks/agent/multi_agent_workflow_with_weaviate_queryagent.ipynb"
+tags = ["Agent", "Integrations"]
+description = "Build multi-agent systems with Weaviate vector database integration"
+language = "py"
+
+[[recipe]]
+title = "Multi-Document Agents"
+notebook = "notebooks/agent/multi_document_agents-v1.ipynb"
+tags = ["Agent"]
+description = "Create agents that can process and reason across multiple documents"
+language = "py"
+
+[[recipe]]
+title = "SEC Chatbot"
+notebook = "notebooks/agent/Chatbot_SEC.ipynb"
+tags = ["Agent"]
+description = "Build a specialized chatbot for SEC document analysis and queries"
+language = "py"
+
+[[recipe]]
+title = "Chat Memory Buffer"
+notebook = "notebooks/agent/memory/chat_memory_buffer.ipynb"
+tags = ["Agent", "Memory"]
+description = "Implement conversation memory using buffer storage"
+language = "py"
+
+[[recipe]]
+title = "Composable Memory"
+notebook = "notebooks/agent/memory/composable_memory.ipynb"
+tags = ["Agent", "Memory"]
+description = "Build flexible memory systems that can be composed and combined"
+language = "py"
+
+[[recipe]]
+title = "Summary Memory Buffer"
+notebook = "notebooks/agent/memory/summary_memory_buffer.ipynb"
+tags = ["Agent", "Memory"]
+description = "Use summarization techniques for efficient conversation memory"
+language = "py"
+
+[[recipe]]
+title = "Vector Memory"
+notebook = "notebooks/agent/memory/vector_memory.ipynb"
+tags = ["Agent", "Memory"]
+description = "Implement semantic memory using vector embeddings"
+language = "py"
@@ -0,0 +1,333 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/Chatbot_SEC.ipynb
+toc: True
+title: "How to Build a Chatbot"
+featured: False
+experimental: False
+tags: ['Agent']
+language: py
+---
+LlamaIndex serves as a bridge between your data and Language Learning Models (LLMs), providing a toolkit that enables you to establish a query interface around your data for a variety of tasks, such as question-answering and summarization.
+
+In this tutorial, we'll walk you through building a context-augmented chatbot using a [Data Agent](https://gpt-index.readthedocs.io/en/stable/core_modules/agent_modules/agents/root.html). This agent, powered by LLMs, is capable of intelligently executing tasks over your data. The end result is a chatbot agent equipped with a robust set of data interface tools provided by LlamaIndex to answer queries about your data.
+
+**Note**: This tutorial builds upon initial work on creating a query interface over SEC 10-K filings - [check it out here](https://medium.com/@jerryjliu98/how-unstructured-and-llamaindex-can-help-bring-the-power-of-llms-to-your-own-data-3657d063e30d).
+
+### Context
+
+In this guide, we’ll build a "10-K Chatbot" that uses raw UBER 10-K HTML filings from Dropbox. Users can interact with the chatbot to ask questions related to the 10-K filings.
+
+### Preparation
+
+
+```python
+%pip install llama-index-readers-file
+%pip install llama-index-embeddings-openai
+%pip install llama-index-agent-openai
+%pip install llama-index-llms-openai
+%pip install llama-index-question-gen-openai
+%pip install unstructured
+```
+
+
+```python
+import os
+
+os.environ["OPENAI_API_KEY"] = "sk-..."
+```
+
+
+```python
+from llama_index.core import Settings
+from llama_index.llms.openai import OpenAI
+from llama_index.embeddings.openai import OpenAIEmbedding
+
+# global defaults
+Settings.llm = OpenAI(model="gpt-4o-mini")
+Settings.embed_model = OpenAIEmbedding(model_name="text-embedding-3-large")
+Settings.chunk_size = 512
+Settings.chunk_overlap = 64
+```
+
+### Ingest Data
+
+Let's first download the raw 10-k files, from 2019-2022.
+
+
+```python
+# NOTE: the code examples assume you're operating within a Jupyter notebook.
+# download files
+!mkdir data
+!wget "https://www.dropbox.com/s/948jr9cfs7fgj99/UBER.zip?dl=1" -O data/UBER.zip
+!unzip data/UBER.zip -d data
+```
+
+To parse the HTML files into formatted text, we use the [Unstructured](https://github.com/Unstructured-IO/unstructured) library. Thanks to [LlamaHub](https://llamahub.ai/), we can directly integrate with Unstructured, allowing conversion of any text into a Document format that LlamaIndex can ingest.
+
+First we install the necessary packages:
+
+Then we can use the `UnstructuredReader` to parse the HTML files into a list of `Document` objects.
+
+
+```python
+from llama_index.readers.file import UnstructuredReader
+from pathlib import Path
+
+years = [2022, 2021, 2020, 2019]
+```
+
+
+```python
+loader = UnstructuredReader()
+doc_set = {}
+all_docs = []
+for year in years:
+    year_docs = loader.load_data(
+        file=Path(f"./data/UBER/UBER_{year}.html"), split_documents=False
+    )
+    # insert year metadata into each year
+    for d in year_docs:
+        d.metadata = {"year": year}
+    doc_set[year] = year_docs
+    all_docs.extend(year_docs)
+```
+
+### Setting up Vector Indices for each year
+
+We first setup a vector index for each year. Each vector index allows us
+to ask questions about the 10-K filing of a given year.
+
+We build each index and save it to disk.
+
+
+```python
+# initialize simple vector indices
+# NOTE: don't run this cell if the indices are already loaded!
+from llama_index.core import VectorStoreIndex, StorageContext
+
+
+index_set = {}
+for year in years:
+    storage_context = StorageContext.from_defaults()
+    cur_index = VectorStoreIndex.from_documents(
+        doc_set[year],
+        storage_context=storage_context,
+    )
+    index_set[year] = cur_index
+    storage_context.persist(persist_dir=f"./storage/{year}")
+```
+
+To load an index from disk, do the following
+
+
+```python
+# Load indices from disk
+from llama_index.core import StorageContext, load_index_from_storage
+
+index_set = {}
+for year in years:
+    storage_context = StorageContext.from_defaults(
+        persist_dir=f"./storage/{year}"
+    )
+    cur_index = load_index_from_storage(
+        storage_context,
+    )
+    index_set[year] = cur_index
+```
+
+### Setting up a Sub Question Query Engine to Synthesize Answers Across 10-K Filings
+
+Since we have access to documents of 4 years, we may not only want to ask questions regarding the 10-K document of a given year, but ask questions that require analysis over all 10-K filings.
+
+To address this, we can use a [Sub Question Query Engine](https://gpt-index.readthedocs.io/en/stable/examples/query_engine/sub_question_query_engine.html). It decomposes a query into subqueries, each answered by an individual vector index, and synthesizes the results to answer the overall query.
+
+LlamaIndex provides some wrappers around indices (and query engines) so that they can be used by query engines and agents. First we define a `QueryEngineTool` for each vector index.
+Each tool has a name and a description; these are what the LLM agent sees to decide which tool to choose.
+
+
+```python
+from llama_index.core.tools import QueryEngineTool
+
+individual_query_engine_tools = [
+    QueryEngineTool.from_defaults(
+        query_engine=index_set[year].as_query_engine(),
+        name=f"vector_index_{year}",
+        description=(
+            "useful for when you want to answer queries about the"
+            f" {year} SEC 10-K for Uber"
+        ),
+    )
+    for year in years
+]
+```
+
+Now we can create the Sub Question Query Engine, which will allow us to synthesize answers across the 10-K filings. We pass in the `individual_query_engine_tools` we defined above.
+
+
+```python
+from llama_index.core.query_engine import SubQuestionQueryEngine
+
+query_engine = SubQuestionQueryEngine.from_defaults(
+    query_engine_tools=individual_query_engine_tools,
+)
+```
+
+### Setting up the Chatbot Agent
+
+We use a LlamaIndex Data Agent to setup the outer chatbot agent, which has access to a set of Tools. Specifically, we will use an OpenAIAgent, that takes advantage of OpenAI API function calling. We want to use the separate Tools we defined previously for each index (corresponding to a given year), as well as a tool for the sub question query engine we defined above.
+
+First we define a `QueryEngineTool` for the sub question query engine:
+
+
+```python
+query_engine_tool = QueryEngineTool.from_defaults(
+    query_engine=query_engine,
+    name="sub_question_query_engine",
+    description=(
+        "useful for when you want to answer queries that require analyzing"
+        " multiple SEC 10-K documents for Uber"
+    ),
+)
+```
+
+Then, we combine the Tools we defined above into a single list of tools for the agent:
+
+
+```python
+tools = individual_query_engine_tools + [query_engine_tool]
+```
+
+Finally, we call `FunctionAgent` to create the agent, passing in the list of tools we defined above.
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent
+from llama_index.llms.openai import OpenAI
+
+agent = FunctionAgent(tools=tools, llm=OpenAI(model="gpt-4o"))
+```
+
+### Testing the Agent
+
+We can now test the agent with various queries.
+
+If we test it with a simple "hello" query, the agent does not use any Tools.
+
+
+```python
+from llama_index.core.workflow import Context
+
+# Setup the context for this specific interaction
+ctx = Context(agent)
+
+response = await agent.run("hi, i am bob", ctx=ctx)
+print(str(response))
+```
+
+    Hello Bob! How can I assist you today?
+
+
+If we test it with a query regarding the 10-k of a given year, the agent will use
+the relevant vector index Tool.
+
+
+```python
+response = await agent.run(
+    "What were some of the biggest risk factors in 2020 for Uber?", ctx=ctx
+)
+print(str(response))
+```
+
+    In 2020, some of the biggest risk factors for Uber included:
+    
+    1. **Legal and Regulatory Risks**: Extensive government regulation and oversight could adversely impact operations and future prospects.
+    2. **Data Privacy and Security Risks**: Risks related to data collection, use, and processing could lead to investigations, litigation, and negative publicity.
+    3. **Economic Impact of COVID-19**: The pandemic adversely affected business operations, demand for services, and financial condition due to governmental restrictions and changes in consumer behavior.
+    4. **Market Volatility**: Volatility in the market price of common stock could affect investors' ability to resell shares at favorable prices.
+    5. **Safety Incidents**: Criminal or dangerous activities on the platform could harm the ability to attract and retain drivers and consumers.
+    6. **Investment Risks**: Substantial investments in new technologies and offerings carry inherent risks, with no guarantee of realizing expected benefits.
+    7. **Dependence on Metropolitan Areas**: A significant portion of gross bookings comes from large metropolitan areas, which may be negatively impacted by various external factors.
+    8. **Talent Retention**: Attracting and retaining high-quality personnel is crucial, and issues with attrition or succession planning could adversely affect the business.
+    9. **Cybersecurity Threats**: Cyberattacks and data breaches could harm reputation and operational results.
+    10. **Capital Requirements**: The need for additional capital to support growth may not be met on reasonable terms, impacting business expansion.
+    11. **Acquisition Challenges**: Difficulty in identifying and integrating suitable businesses could harm operating results and future prospects.
+    12. **Operational Limitations**: Potential restrictions in certain jurisdictions may require modifications to the business model, affecting service delivery.
+
+
+Finally, if we test it with a query to compare/contrast risk factors across years, the agent will use the Sub Question Query Engine Tool.
+
+
+```python
+cross_query_str = (
+    "Compare/contrast the risk factors described in the Uber 10-K across"
+    " years. Give answer in bullet points."
+)
+
+response = await agent.run(cross_query_str, ctx=ctx)
+print(str(response))
+```
+
+    Here's a comparison of the risk factors for Uber across the years 2020, 2021, and 2022:
+    
+    - **COVID-19 Impact**:
+      - **2020**: The pandemic significantly affected business operations, demand, and financial condition.
+      - **2021**: Continued impact of the pandemic was a concern, affecting various parts of the business.
+      - **2022**: The pandemic's impact was less emphasized, with more focus on operational and competitive risks.
+    
+    - **Driver Classification**:
+      - **2020**: Not specifically highlighted.
+      - **2021**: Potential reclassification of Drivers as employees could alter the business model.
+      - **2022**: Continued risk of reclassification impacting operational costs.
+    
+    - **Competition**:
+      - **2020**: Not specifically highlighted.
+      - **2021**: Intense competition with low barriers to entry and well-capitalized competitors.
+      - **2022**: Competitive landscape challenges due to established alternatives and low barriers to entry.
+    
+    - **Financial Concerns**:
+      - **2020**: Market volatility and capital requirements were major concerns.
+      - **2021**: Historical losses and increased operating expenses raised profitability concerns.
+      - **2022**: Significant losses and rising expenses continued to raise profitability concerns.
+    
+    - **User and Personnel Retention**:
+      - **2020**: Talent retention was crucial, with risks from attrition.
+      - **2021**: Attracting and retaining a critical mass of users and personnel was essential.
+      - **2022**: Continued emphasis on retaining Drivers, consumers, and high-quality personnel.
+    
+    - **Brand and Reputation**:
+      - **2020**: Safety incidents and cybersecurity threats could harm reputation.
+      - **2021**: Maintaining and enhancing brand reputation was critical, with past negative publicity being a concern.
+      - **2022**: Brand and reputation were under scrutiny, with negative media coverage potentially harming prospects.
+    
+    - **Operational Challenges**:
+      - **2020**: Operational limitations and acquisition challenges were highlighted.
+      - **2021**: Challenges in managing growth and optimizing organizational structure.
+      - **2022**: Historical workplace culture and the need for organizational optimization were critical.
+    
+    - **Safety and Liability**:
+      - **2020**: Safety incidents and liability claims were significant risks.
+      - **2021**: Safety incidents and liability claims, especially with vulnerable road users, were concerns.
+      - **2022**: Safety incidents and public reporting could impact reputation and financial results.
+    
+    Overall, while some risk factors remained consistent across the years, such as competition, financial concerns, and safety, the emphasis shifted slightly with the evolving business environment and external factors like the pandemic.
+
+
+### Setting up the Chatbot Loop
+
+Now that we have the chatbot setup, it only takes a few more steps to setup a basic interactive loop to chat with our SEC-augmented chatbot!
+
+
+```python
+agent = FunctionAgent(tools=tools, llm=OpenAI(model="gpt-4o"))
+ctx = Context(agent)
+
+while True:
+    text_input = input("User: ")
+    if text_input == "exit":
+        break
+    response = await agent.run(text_input, ctx=ctx)
+    print(f"Agent: {response}")
+
+# User: What were some of the legal proceedings against Uber in 2022?
+```
@@ -0,0 +1,99 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/memory/chat_memory_buffer.ipynb
+toc: True
+title: "Chat Memory Buffer"
+featured: False
+experimental: False
+tags: ['Agent', 'Memory']
+language: py
+---
+**NOTE:** This example of memory is deprecated in favor of the newer and more flexible `Memory` class. See the [latest docs](https://docs.llamaindex.ai/en/stable/module_guides/deploying/agents/memory/).
+
+The `ChatMemoryBuffer` is a memory buffer that simply stores the last X messages that fit into a token limit.
+
+%pip install llama-index-core
+
+## Setup
+
+
+```python
+from llama_index.core.memory import ChatMemoryBuffer
+
+memory = ChatMemoryBuffer.from_defaults(token_limit=40000)
+```
+
+## Using Standalone
+
+
+```python
+from llama_index.core.llms import ChatMessage
+
+chat_history = [
+    ChatMessage(role="user", content="Hello, how are you?"),
+    ChatMessage(role="assistant", content="I'm doing well, thank you!"),
+]
+
+# put a list of messages
+memory.put_messages(chat_history)
+
+# put one message at a time
+# memory.put_message(chat_history[0])
+```
+
+
+```python
+# Get the last X messages that fit into a token limit
+history = memory.get()
+```
+
+
+```python
+# Get all messages
+all_history = memory.get_all()
+```
+
+
+```python
+# clear the memory
+memory.reset()
+```
+
+## Using with Agents
+
+You can set the memory in any agent in the `.run()` method.
+
+
+```python
+import os
+
+os.environ["OPENAI_API_KEY"] = "sk-proj-..."
+```
+
+
+```python
+from llama_index.core.agent.workflow import ReActAgent, FunctionAgent
+from llama_index.core.workflow import Context
+from llama_index.llms.openai import OpenAI
+
+
+memory = ChatMemoryBuffer.from_defaults(token_limit=40000)
+
+agent = FunctionAgent(tools=[], llm=OpenAI(model="gpt-4o-mini"))
+
+# context to hold the chat history/state
+ctx = Context(agent)
+```
+
+
+```python
+resp = await agent.run("Hello, how are you?", ctx=ctx, memory=memory)
+```
+
+
+```python
+print(memory.get_all())
+```
+
+    [ChatMessage(role=<MessageRole.USER: 'user'>, additional_kwargs={}, blocks=[TextBlock(block_type='text', text='Hello, how are you?')]), ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, additional_kwargs={}, blocks=[TextBlock(block_type='text', text="Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?")])]
+
@@ -0,0 +1,512 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/memory/composable_memory.ipynb
+toc: True
+title: "Simple Composable Memory"
+featured: False
+experimental: False
+tags: ['Agent', 'Memory']
+language: py
+---
+**NOTE:** This example of memory is deprecated in favor of the newer and more flexible `Memory` class. See the [latest docs](https://docs.llamaindex.ai/en/stable/module_guides/deploying/agents/memory/).
+
+In this notebook, we demonstrate how to inject multiple memory sources into an agent. Specifically, we use the `SimpleComposableMemory` which is comprised of a `primary_memory` as well as potentially several secondary memory sources (stored in `secondary_memory_sources`). The main difference is that `primary_memory` will be used as the main chat buffer for the agent, where as any retrieved messages from `secondary_memory_sources` will be injected to the system prompt message only.
+
+Multiple memory sources may be of use for example in situations where you have a longer-term memory such as `VectorMemory` that you want to use in addition to the default `ChatMemoryBuffer`. What you'll see in this notebook is that with a `SimpleComposableMemory` you'll be able to effectively "load" the desired messages from long-term memory into the main memory (i.e. the `ChatMemoryBuffer`).
+
+## How `SimpleComposableMemory` Works?
+
+We begin with the basic usage of the `SimpleComposableMemory`. Here we construct a `VectorMemory` as well as a default `ChatMemoryBuffer`. The `VectorMemory` will be our secondary memory source, whereas the `ChatMemoryBuffer` will be the main or primary one. To instantiate a `SimpleComposableMemory` object, we need to supply a `primary_memory` and (optionally) a list of `secondary_memory_sources`.
+
+![SimpleComposableMemoryIllustration](https://d3ddy8balm3goa.cloudfront.net/llamaindex/simple-composable-memory.excalidraw.svg)
+
+
+```python
+from llama_index.core.memory import (
+    VectorMemory,
+    SimpleComposableMemory,
+    ChatMemoryBuffer,
+)
+from llama_index.core.llms import ChatMessage
+from llama_index.embeddings.openai import OpenAIEmbedding
+
+vector_memory = VectorMemory.from_defaults(
+    vector_store=None,  # leave as None to use default in-memory vector store
+    embed_model=OpenAIEmbedding(),
+    retriever_kwargs={"similarity_top_k": 1},
+)
+
+# let's set some initial messages in our secondary vector memory
+msgs = [
+    ChatMessage.from_str("You are a SOMEWHAT helpful assistant.", "system"),
+    ChatMessage.from_str("Bob likes burgers.", "user"),
+    ChatMessage.from_str("Indeed, Bob likes apples.", "assistant"),
+    ChatMessage.from_str("Alice likes apples.", "user"),
+]
+vector_memory.set(msgs)
+
+chat_memory_buffer = ChatMemoryBuffer.from_defaults()
+
+composable_memory = SimpleComposableMemory.from_defaults(
+    primary_memory=chat_memory_buffer,
+    secondary_memory_sources=[vector_memory],
+)
+```
+
+
+```python
+composable_memory.primary_memory
+```
+
+
+
+
+    ChatMemoryBuffer(chat_store=SimpleChatStore(store={}), chat_store_key='chat_history', token_limit=3000, tokenizer_fn=functools.partial(<bound method Encoding.encode of <Encoding 'cl100k_base'>>, allowed_special='all'))
+
+
+
+
+```python
+composable_memory.secondary_memory_sources
+```
+
+
+
+
+    [VectorMemory(vector_index=<llama_index.core.indices.vector_store.base.VectorStoreIndex object at 0x137b912a0>, retriever_kwargs={'similarity_top_k': 1}, batch_by_user_message=True, cur_batch_textnode=TextNode(id_='288b0ef3-570e-4698-a1ae-b3531df66361', embedding=None, metadata={'sub_dicts': [{'role': <MessageRole.USER: 'user'>, 'content': 'Alice likes apples.', 'additional_kwargs': {}}]}, excluded_embed_metadata_keys=['sub_dicts'], excluded_llm_metadata_keys=['sub_dicts'], relationships={}, text='Alice likes apples.', start_char_idx=None, end_char_idx=None, text_template='{metadata_str}\n\n{content}', metadata_template='{key}: {value}', metadata_seperator='\n'))]
+
+
+
+### `put()` messages into memory
+
+Since `SimpleComposableMemory` is itself a subclass of `BaseMemory`, we add messages to it in the same way as we do for other memory modules. Note that for `SimpleComposableMemory`, invoking `.put()` effectively calls `.put()` on all memory sources. In other words, the message gets added to `primary` and `secondary` sources.
+
+
+```python
+msgs = [
+    ChatMessage.from_str("You are a REALLY helpful assistant.", "system"),
+    ChatMessage.from_str("Jerry likes juice.", "user"),
+]
+```
+
+
+```python
+# load into all memory sources modules"
+for m in msgs:
+    composable_memory.put(m)
+```
+
+### `get()` messages from memory
+
+When `.get()` is invoked, we similarly execute all of the `.get()` methods of `primary` memory as well as all of the `secondary` sources. This leaves us with sequence of lists of messages that we have to must "compose" into a sensible single set of messages (to pass downstream to our agents). Special care must be applied here in general to ensure that the final sequence of messages are both sensible and conform to the chat APIs of the LLM provider.
+
+For `SimpleComposableMemory`, we **inject the messages from the `secondary` sources in the system message of the `primary` memory**. The rest of the message history of the `primary` source is left intact, and this composition is what is ultimately returned.
+
+
+```python
+msgs = composable_memory.get("What does Bob like?")
+msgs
+```
+
+
+
+
+    [ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content='You are a REALLY helpful assistant.\n\nBelow are a set of relevant dialogues retrieved from potentially several memory sources:\n\n=====Relevant messages from memory source 1=====\n\n\tUSER: Bob likes burgers.\n\tASSISTANT: Indeed, Bob likes apples.\n\n=====End of relevant messages from memory source 1======\n\nThis is the end of the retrieved message dialogues.', additional_kwargs={}),
+     ChatMessage(role=<MessageRole.USER: 'user'>, content='Jerry likes juice.', additional_kwargs={})]
+
+
+
+
+```python
+# see the memory injected into the system message of the primary memory
+print(msgs[0])
+```
+
+    system: You are a REALLY helpful assistant.
+    
+    Below are a set of relevant dialogues retrieved from potentially several memory sources:
+    
+    =====Relevant messages from memory source 1=====
+    
+    	USER: Bob likes burgers.
+    	ASSISTANT: Indeed, Bob likes apples.
+    
+    =====End of relevant messages from memory source 1======
+    
+    This is the end of the retrieved message dialogues.
+
+
+### Successive calls to `get()`
+
+Successive calls of `get()` will simply replace the loaded `secondary` memory messages in the system prompt.
+
+
+```python
+msgs = composable_memory.get("What does Alice like?")
+msgs
+```
+
+
+
+
+    [ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content='You are a REALLY helpful assistant.\n\nBelow are a set of relevant dialogues retrieved from potentially several memory sources:\n\n=====Relevant messages from memory source 1=====\n\n\tUSER: Alice likes apples.\n\n=====End of relevant messages from memory source 1======\n\nThis is the end of the retrieved message dialogues.', additional_kwargs={}),
+     ChatMessage(role=<MessageRole.USER: 'user'>, content='Jerry likes juice.', additional_kwargs={})]
+
+
+
+
+```python
+# see the memory injected into the system message of the primary memory
+print(msgs[0])
+```
+
+    system: You are a REALLY helpful assistant.
+    
+    Below are a set of relevant dialogues retrieved from potentially several memory sources:
+    
+    =====Relevant messages from memory source 1=====
+    
+    	USER: Alice likes apples.
+    
+    =====End of relevant messages from memory source 1======
+    
+    This is the end of the retrieved message dialogues.
+
+
+### What if `get()` retrieves `secondary` messages that already exist in `primary` memory?
+
+In the event that messages retrieved from `secondary` memory already exist in `primary` memory, then these rather redundant secondary messages will not get added to the system message. In the below example, the message "Jerry likes juice." was `put` into all memory sources, so the system message is not altered.
+
+
+```python
+msgs = composable_memory.get("What does Jerry like?")
+msgs
+```
+
+
+
+
+    [ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content='You are a REALLY helpful assistant.', additional_kwargs={}),
+     ChatMessage(role=<MessageRole.USER: 'user'>, content='Jerry likes juice.', additional_kwargs={})]
+
+
+
+### How to `reset` memory 
+
+Similar to the other methods `put()` and `get()`, calling `reset()` will execute `reset()` on both the `primary` and `secondary` memory sources. If you want to reset only the `primary` then you should call the `reset()` method only from it.
+
+#### `reset()` only primary memory
+
+
+```python
+composable_memory.primary_memory.reset()
+```
+
+
+```python
+composable_memory.primary_memory.get()
+```
+
+
+
+
+    []
+
+
+
+
+```python
+composable_memory.secondary_memory_sources[0].get("What does Alice like?")
+```
+
+
+
+
+    [ChatMessage(role=<MessageRole.USER: 'user'>, content='Alice likes apples.', additional_kwargs={})]
+
+
+
+#### `reset()` all memory sources
+
+
+```python
+composable_memory.reset()
+```
+
+
+```python
+composable_memory.primary_memory.get()
+```
+
+
+
+
+    []
+
+
+
+
+```python
+composable_memory.secondary_memory_sources[0].get("What does Alice like?")
+```
+
+
+
+
+    []
+
+
+
+## Use `SimpleComposableMemory` With An Agent
+
+Here we will use a `SimpleComposableMemory` with an agent and demonstrate how a secondary, long-term memory source can be used to use messages from on agent conversation as part of another conversation with another agent session.
+
+
+```python
+from llama_index.llms.openai import OpenAI
+from llama_index.core.tools import FunctionTool
+from llama_index.core.agent import FunctionCallingAgent
+
+import nest_asyncio
+
+nest_asyncio.apply()
+```
+
+### Define our memory modules
+
+
+```python
+vector_memory = VectorMemory.from_defaults(
+    vector_store=None,  # leave as None to use default in-memory vector store
+    embed_model=OpenAIEmbedding(),
+    retriever_kwargs={"similarity_top_k": 2},
+)
+
+chat_memory_buffer = ChatMemoryBuffer.from_defaults()
+
+composable_memory = SimpleComposableMemory.from_defaults(
+    primary_memory=chat_memory_buffer,
+    secondary_memory_sources=[vector_memory],
+)
+```
+
+### Define our Agent
+
+
+```python
+def multiply(a: int, b: int) -> int:
+    """Multiply two integers and returns the result integer"""
+    return a * b
+
+
+def mystery(a: int, b: int) -> int:
+    """Mystery function on two numbers"""
+    return a**2 - b**2
+
+
+multiply_tool = FunctionTool.from_defaults(fn=multiply)
+mystery_tool = FunctionTool.from_defaults(fn=mystery)
+```
+
+
+```python
+llm = OpenAI(model="gpt-3.5-turbo-0613")
+agent = FunctionCallingAgent.from_tools(
+    [multiply_tool, mystery_tool],
+    llm=llm,
+    memory=composable_memory,
+    verbose=True,
+)
+```
+
+### Execute some function calls
+
+When `.chat()` is invoked, the messages are put into the composable memory, which we understand from the previous section implies that all the messages are put in both `primary` and `secondary` sources.
+
+
+```python
+response = agent.chat("What is the mystery function on 5 and 6?")
+```
+
+    Added user message to memory: What is the mystery function on 5 and 6?
+    === Calling Function ===
+    Calling function: mystery with args: {"a": 5, "b": 6}
+    === Function Output ===
+    -11
+    === LLM Response ===
+    The mystery function on 5 and 6 returns -11.
+
+
+
+```python
+response = agent.chat("What happens if you multiply 2 and 3?")
+```
+
+    Added user message to memory: What happens if you multiply 2 and 3?
+    === Calling Function ===
+    Calling function: multiply with args: {"a": 2, "b": 3}
+    === Function Output ===
+    6
+    === LLM Response ===
+    If you multiply 2 and 3, the result is 6.
+
+
+### New Agent Sessions
+
+Now that we've added the messages to our `vector_memory`, we can see the effect of having this memory be used with a new agent session versus when it is used. Specifically, we ask the new agents to "recall" the outputs of the function calls, rather than re-computing.
+
+#### An Agent without our past memory
+
+
+```python
+llm = OpenAI(model="gpt-3.5-turbo-0613")
+agent_without_memory = FunctionCallingAgent.from_tools(
+    [multiply_tool, mystery_tool], llm=llm, verbose=True
+)
+```
+
+
+```python
+response = agent_without_memory.chat(
+    "What was the output of the mystery function on 5 and 6 again? Don't recompute."
+)
+```
+
+    Added user message to memory: What was the output of the mystery function on 5 and 6 again? Don't recompute.
+    === LLM Response ===
+    I'm sorry, but I don't have access to the previous output of the mystery function on 5 and 6.
+
+
+#### An Agent with our past memory
+
+We see that the agent without access to the our past memory cannot complete the task. With this next agent we will indeed pass in our previous long-term memory (i.e., `vector_memory`). Note that we even use a fresh `ChatMemoryBuffer` which means there is no `chat_history` with this agent. Nonetheless, it will be able to retrieve from our long-term memory to get the past dialogue it needs.
+
+
+```python
+llm = OpenAI(model="gpt-3.5-turbo-0613")
+
+composable_memory = SimpleComposableMemory.from_defaults(
+    primary_memory=ChatMemoryBuffer.from_defaults(),
+    secondary_memory_sources=[
+        vector_memory.copy(
+            deep=True
+        )  # using a copy here for illustration purposes
+        # later will use original vector_memory again
+    ],
+)
+
+agent_with_memory = FunctionCallingAgent.from_tools(
+    [multiply_tool, mystery_tool],
+    llm=llm,
+    memory=composable_memory,
+    verbose=True,
+)
+```
+
+
+```python
+agent_with_memory.chat_history  # an empty chat history
+```
+
+
+
+
+    []
+
+
+
+
+```python
+response = agent_with_memory.chat(
+    "What was the output of the mystery function on 5 and 6 again? Don't recompute."
+)
+```
+
+    Added user message to memory: What was the output of the mystery function on 5 and 6 again? Don't recompute.
+    === LLM Response ===
+    The output of the mystery function on 5 and 6 is -11.
+
+
+
+```python
+response = agent_with_memory.chat(
+    "What was the output of the multiply function on 2 and 3 again? Don't recompute."
+)
+```
+
+    Added user message to memory: What was the output of the multiply function on 2 and 3 again? Don't recompute.
+    === LLM Response ===
+    The output of the multiply function on 2 and 3 is 6.
+
+
+
+```python
+agent_with_memory.chat_history
+```
+
+
+
+
+    [ChatMessage(role=<MessageRole.USER: 'user'>, content="What was the output of the mystery function on 5 and 6 again? Don't recompute.", additional_kwargs={}),
+     ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, content='The output of the mystery function on 5 and 6 is -11.', additional_kwargs={}),
+     ChatMessage(role=<MessageRole.USER: 'user'>, content="What was the output of the multiply function on 2 and 3 again? Don't recompute.", additional_kwargs={}),
+     ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, content='The output of the multiply function on 2 and 3 is 6.', additional_kwargs={})]
+
+
+
+### What happens under the hood with `.chat(user_input)`
+
+Under the hood, `.chat(user_input)` call effectively will call the memory's `.get()` method with `user_input` as the argument. As we learned in the previous section, this will ultimately return a composition of the `primary` and all of the `secondary` memory sources. These composed messages are what is being passed to the LLM's chat API as the chat history.
+
+
+```python
+composable_memory = SimpleComposableMemory.from_defaults(
+    primary_memory=ChatMemoryBuffer.from_defaults(),
+    secondary_memory_sources=[
+        vector_memory.copy(
+            deep=True
+        )  # copy for illustrative purposes to explain what
+        # happened under the hood from previous subsection
+    ],
+)
+agent_with_memory = agent_worker.as_agent(memory=composable_memory)
+```
+
+
+```python
+agent_with_memory.memory.get(
+    "What was the output of the mystery function on 5 and 6 again? Don't recompute."
+)
+```
+
+
+
+
+    [ChatMessage(role=<MessageRole.SYSTEM: 'system'>, content='You are a helpful assistant.\n\nBelow are a set of relevant dialogues retrieved from potentially several memory sources:\n\n=====Relevant messages from memory source 1=====\n\n\tUSER: What is the mystery function on 5 and 6?\n\tASSISTANT: None\n\tTOOL: -11\n\tASSISTANT: The mystery function on 5 and 6 returns -11.\n\n=====End of relevant messages from memory source 1======\n\nThis is the end of the retrieved message dialogues.', additional_kwargs={})]
+
+
+
+
+```python
+print(
+    agent_with_memory.memory.get(
+        "What was the output of the mystery function on 5 and 6 again? Don't recompute."
+    )[0]
+)
+```
+
+    system: You are a helpful assistant.
+    
+    Below are a set of relevant dialogues retrieved from potentially several memory sources:
+    
+    =====Relevant messages from memory source 1=====
+    
+    	USER: What is the mystery function on 5 and 6?
+    	ASSISTANT: None
+    	TOOL: -11
+    	ASSISTANT: The mystery function on 5 and 6 returns -11.
+    
+    =====End of relevant messages from memory source 1======
+    
+    This is the end of the retrieved message dialogues.
+
@@ -0,0 +1,110 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/memory/summary_memory_buffer.ipynb
+toc: True
+title: "Chat Summary Memory Buffer"
+featured: False
+experimental: False
+tags: ['Agent', 'Memory']
+language: py
+---
+**NOTE:** This example of memory is deprecated in favor of the newer and more flexible `Memory` class. See the [latest docs](https://docs.llamaindex.ai/en/stable/module_guides/deploying/agents/memory/).
+
+The `ChatSummaryMemoryBuffer` is a memory buffer that stores the last X messages that fit into a token limit. It also summarizes the chat history into a single message.
+
+
+
+```python
+%pip install llama-index-core
+```
+
+## Setup
+
+
+```python
+from llama_index.core.memory import ChatSummaryMemoryBuffer
+
+memory = ChatSummaryMemoryBuffer.from_defaults(
+    token_limit=40000,
+    # optional set the summary prompt, here's the default:
+    # summarize_prompt=(
+    #     "The following is a conversation between the user and assistant. "
+    #     "Write a concise summary about the contents of this conversation."
+    # )
+)
+```
+
+## Using Standalone
+
+
+```python
+from llama_index.core.llms import ChatMessage
+
+chat_history = [
+    ChatMessage(role="user", content="Hello, how are you?"),
+    ChatMessage(role="assistant", content="I'm doing well, thank you!"),
+]
+
+# put a list of messages
+memory.put_messages(chat_history)
+
+# put one message at a time
+# memory.put_message(chat_history[0])
+```
+
+
+```python
+# Get the last X messages that fit into a token limit
+history = memory.get()
+```
+
+
+```python
+# Get all messages
+all_history = memory.get_all()
+```
+
+
+```python
+# clear the memory
+memory.reset()
+```
+
+## Using with Agents
+
+You can set the memory in any agent in the `.run()` method.
+
+
+```python
+import os
+
+os.environ["OPENAI_API_KEY"] = "sk-proj-..."
+```
+
+
+```python
+from llama_index.core.agent.workflow import ReActAgent, FunctionAgent
+from llama_index.core.workflow import Context
+from llama_index.llms.openai import OpenAI
+
+
+memory = ChatMemoryBuffer.from_defaults(token_limit=40000)
+
+agent = FunctionAgent(tools=[], llm=OpenAI(model="gpt-4o-mini"))
+
+# context to hold the chat history/state
+ctx = Context(agent)
+```
+
+
+```python
+resp = await agent.run("Hello, how are you?", ctx=ctx, memory=memory)
+```
+
+
+```python
+print(memory.get_all())
+```
+
+    [ChatMessage(role=<MessageRole.USER: 'user'>, additional_kwargs={}, blocks=[TextBlock(block_type='text', text='Hello, how are you?')]), ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, additional_kwargs={}, blocks=[TextBlock(block_type='text', text="Hello! I'm just a program, so I don't have feelings, but I'm here and ready to help you. How can I assist you today?")])]
+
@@ -0,0 +1,100 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/memory/vector_memory.ipynb
+toc: True
+title: "Vector Memory"
+featured: False
+experimental: False
+tags: ['Agent', 'Memory']
+language: py
+---
+**NOTE:** This example of memory is deprecated in favor of the newer and more flexible `Memory` class. See the [latest docs](https://docs.llamaindex.ai/en/stable/module_guides/deploying/agents/memory/).
+
+The vector memory module uses vector search (backed by a vector db) to retrieve relevant conversation items given a user input.
+
+This notebook shows you how to use the `VectorMemory` class. We show you how to use its individual functions. A typical usecase for vector memory is as a long-term memory storage of chat messages. You can
+
+![VectorMemoryIllustration](https://d3ddy8balm3goa.cloudfront.net/llamaindex/vector-memory.excalidraw.svg)
+
+### Initialize and Experiment with Memory Module
+
+Here we initialize a raw memory module and demonstrate its functions - to put and retrieve from ChatMessage objects.
+
+- Note that `retriever_kwargs` is the same args you'd specify on the `VectorIndexRetriever` or from `index.as_retriever(..)`.
+
+
+```python
+from llama_index.core.memory import VectorMemory
+from llama_index.embeddings.openai import OpenAIEmbedding
+
+
+vector_memory = VectorMemory.from_defaults(
+    vector_store=None,  # leave as None to use default in-memory vector store
+    embed_model=OpenAIEmbedding(),
+    retriever_kwargs={"similarity_top_k": 1},
+)
+```
+
+
+```python
+from llama_index.core.llms import ChatMessage
+
+msgs = [
+    ChatMessage.from_str("Jerry likes juice.", "user"),
+    ChatMessage.from_str("Bob likes burgers.", "user"),
+    ChatMessage.from_str("Alice likes apples.", "user"),
+]
+```
+
+
+```python
+# load into memory
+for m in msgs:
+    vector_memory.put(m)
+```
+
+
+```python
+# retrieve from memory
+msgs = vector_memory.get("What does Jerry like?")
+msgs
+```
+
+
+
+
+    [ChatMessage(role=<MessageRole.USER: 'user'>, content='Jerry likes juice.', additional_kwargs={})]
+
+
+
+
+```python
+vector_memory.reset()
+```
+
+Now let's try resetting and trying again. This time, we'll add an assistant message. Note that user/assistant messages are bundled by default.
+
+
+```python
+msgs = [
+    ChatMessage.from_str("Jerry likes burgers.", "user"),
+    ChatMessage.from_str("Bob likes apples.", "user"),
+    ChatMessage.from_str("Indeed, Bob likes apples.", "assistant"),
+    ChatMessage.from_str("Alice likes juice.", "user"),
+]
+vector_memory.set(msgs)
+```
+
+
+```python
+msgs = vector_memory.get("What does Bob like?")
+msgs
+```
+
+
+
+
+    [ChatMessage(role=<MessageRole.USER: 'user'>, content='Bob likes apples.', additional_kwargs={}),
+     ChatMessage(role=<MessageRole.ASSISTANT: 'assistant'>, content='Indeed, Bob likes apples.', additional_kwargs={})]
+
+
@@ -0,0 +1,271 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/agent_builder.ipynb
+toc: True
+title: "GPT Builder Demo"
+featured: False
+experimental: False
+tags: ['Agent']
+language: py
+---
+<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/docs/examples/agent/agent_builder.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
+
+Inspired by GPTs interface, presented at OpenAI Dev Day 2023. Construct an agent with natural language.
+
+Here you can build your own agent...with another agent!
+
+
+```python
+%pip install llama-index-embeddings-openai
+%pip install llama-index-llms-openai
+%pip install llama-index-readers-file
+```
+
+
+```python
+import os
+
+os.environ["OPENAI_API_KEY"] = "sk-..."
+```
+
+
+```python
+from llama_index.embeddings.openai import OpenAIEmbedding
+from llama_index.llms.openai import OpenAI
+from llama_index.core import Settings
+
+llm = OpenAI(model="gpt-4o")
+Settings.llm = llm
+Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
+```
+
+## Define Candidate Tools
+
+We also define a tool retriever to retrieve candidate tools.
+
+In this setting we define tools as different Wikipedia pages.
+
+
+```python
+from llama_index.core import SimpleDirectoryReader
+```
+
+
+```python
+wiki_titles = ["Toronto", "Seattle", "Chicago", "Boston", "Houston"]
+```
+
+
+```python
+from pathlib import Path
+
+import requests
+
+for title in wiki_titles:
+    response = requests.get(
+        "https://en.wikipedia.org/w/api.php",
+        params={
+            "action": "query",
+            "format": "json",
+            "titles": title,
+            "prop": "extracts",
+            # 'exintro': True,
+            "explaintext": True,
+        },
+    ).json()
+    page = next(iter(response["query"]["pages"].values()))
+    wiki_text = page["extract"]
+
+    data_path = Path("data")
+    if not data_path.exists():
+        Path.mkdir(data_path)
+
+    with open(data_path / f"{title}.txt", "w") as fp:
+        fp.write(wiki_text)
+```
+
+
+```python
+# Load all wiki documents
+city_docs = {}
+for wiki_title in wiki_titles:
+    city_docs[wiki_title] = SimpleDirectoryReader(
+        input_files=[f"data/{wiki_title}.txt"]
+    ).load_data()
+```
+
+### Build Query Tool for Each Document
+
+
+```python
+from llama_index.core import VectorStoreIndex
+from llama_index.core.tools import QueryEngineTool
+from llama_index.core import VectorStoreIndex
+
+# Build tool dictionary
+tool_dict = {}
+
+for wiki_title in wiki_titles:
+    # build vector index
+    vector_index = VectorStoreIndex.from_documents(
+        city_docs[wiki_title],
+    )
+    # define query engines
+    vector_query_engine = vector_index.as_query_engine(llm=llm)
+
+    # define tools
+    vector_tool = QueryEngineTool.from_defaults(
+        query_engine=vector_query_engine,
+        name=wiki_title,
+        description=("Useful for questions related to" f" {wiki_title}"),
+    )
+    tool_dict[wiki_title] = vector_tool
+```
+
+### Define Tool Retriever
+
+
+```python
+# define an "object" index and retriever over these tools
+from llama_index.core import VectorStoreIndex
+from llama_index.core.objects import ObjectIndex
+
+tool_index = ObjectIndex.from_objects(
+    list(tool_dict.values()),
+    index_cls=VectorStoreIndex,
+)
+tool_retriever = tool_index.as_retriever(similarity_top_k=1)
+```
+
+### Load Data
+
+Here we load wikipedia pages from different cities.
+
+## Define Meta-Tools for GPT Builder
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent
+from llama_index.core.llms import ChatMessage
+from llama_index.core import ChatPromptTemplate
+from typing import List
+
+GEN_SYS_PROMPT_STR = """\
+Task information is given below. 
+
+Given the task, please generate a system prompt for an OpenAI-powered bot to solve this task: 
+{task} \
+"""
+
+gen_sys_prompt_messages = [
+    ChatMessage(
+        role="system",
+        content="You are helping to build a system prompt for another bot.",
+    ),
+    ChatMessage(role="user", content=GEN_SYS_PROMPT_STR),
+]
+
+GEN_SYS_PROMPT_TMPL = ChatPromptTemplate(gen_sys_prompt_messages)
+
+
+agent_cache = {}
+
+
+async def create_system_prompt(task: str):
+    """Create system prompt for another agent given an input task."""
+    llm = OpenAI(llm="gpt-4")
+    fmt_messages = GEN_SYS_PROMPT_TMPL.format_messages(task=task)
+    response = await llm.achat(fmt_messages)
+    return response.message.content
+
+
+async def get_tools(task: str):
+    """Get the set of relevant tools to use given an input task."""
+    subset_tools = await tool_retriever.aretrieve(task)
+    return [t.metadata.name for t in subset_tools]
+
+
+def create_agent(system_prompt: str, tool_names: List[str]):
+    """Create an agent given a system prompt and an input set of tools."""
+    llm = OpenAI(model="gpt-4o")
+    try:
+        # get the list of tools
+        input_tools = [tool_dict[tn] for tn in tool_names]
+
+        agent = FunctionAgent(
+            tools=input_tools, llm=llm, system_prompt=system_prompt
+        )
+        agent_cache["agent"] = agent
+        return_msg = "Agent created successfully."
+    except Exception as e:
+        return_msg = f"An error occurred when building an agent. Here is the error: {repr(e)}"
+    return return_msg
+```
+
+
+```python
+from llama_index.core.tools import FunctionTool
+
+system_prompt_tool = FunctionTool.from_defaults(fn=create_system_prompt)
+get_tools_tool = FunctionTool.from_defaults(fn=get_tools)
+create_agent_tool = FunctionTool.from_defaults(fn=create_agent)
+```
+
+
+```python
+GPT_BUILDER_SYS_STR = """\
+You are helping to construct an agent given a user-specified task. You should generally use the tools in this order to build the agent.
+
+1) Create system prompt tool: to create the system prompt for the agent.
+2) Get tools tool: to fetch the candidate set of tools to use.
+3) Create agent tool: to create the final agent.
+"""
+
+prefix_msgs = [ChatMessage(role="system", content=GPT_BUILDER_SYS_STR)]
+
+
+builder_agent = FunctionAgent(
+    tools=[system_prompt_tool, get_tools_tool, create_agent_tool],
+    prefix_messages=prefix_msgs,
+    llm=OpenAI(model="gpt-4o"),
+    verbose=True,
+)
+```
+
+
+```python
+from llama_index.core.agent.workflow import ToolCallResult
+
+handler = builder_agent.run("Build an agent that can tell me about Toronto.")
+async for event in handler.stream_events():
+    if isinstance(event, ToolCallResult):
+        print(
+            f"Called tool {event.tool_name} with input {event.tool_kwargs}\nGot output: {event.tool_output}"
+        )
+
+result = await handler
+print(f"Result: {result}")
+```
+
+    Called tool create_system_prompt with input {'task': 'Tell me about Toronto'}
+    Got output: "Generate a brief summary about Toronto, including its history, culture, landmarks, and notable features."
+    Called tool get_tools with input {'task': 'Tell me about Toronto'}
+    Got output: ['Toronto']
+    Called tool create_agent with input {'system_prompt': 'Generate a brief summary about Toronto, including its history, culture, landmarks, and notable features.', 'tool_names': ['Toronto']}
+    Got output: Agent created successfully.
+    Result: I have created an agent that can provide information about Toronto, including its history, culture, landmarks, and notable features. You can now ask the agent any questions you have about Toronto!
+
+
+
+```python
+city_agent = agent_cache["agent"]
+```
+
+
+```python
+response = await city_agent.run("Tell me about the parks in Toronto")
+print(str(response))
+```
+
+    Toronto is home to a diverse array of parks and public spaces, offering both urban and natural environments. Key downtown parks include Allan Gardens, Christie Pits, and Trinity Bellwoods Park. For waterfront views, Tommy Thompson Park and the Toronto Islands are popular destinations. In the city's outer areas, large parks like High Park, Humber Bay Park, and Morningside Park provide expansive green spaces. Additionally, parts of Rouge National Urban Park, the largest urban park in North America, are located within Toronto. The city also features notable squares such as Nathan Phillips Square, Yonge–Dundas Square, and Harbourfront Square. Approximately 12.5% of Toronto's land is dedicated to parkland, offering facilities for various activities, including winter sports like ice skating and skiing.
+
@@ -0,0 +1,351 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/agent_workflow_basic.ipynb
+toc: True
+title: "FunctionAgent / AgentWorkflow Basic Introduction"
+featured: False
+experimental: False
+tags: ['Agent']
+language: py
+---
+The `AgentWorkflow` is an orchestrator for running a system of one or more agents. In this example, we'll create a simple workflow with a single `FunctionAgent`, and use that to cover the basic functionality.
+
+
+```python
+%pip install llama-index
+```
+
+## Setup
+
+In this example, we will use `OpenAI` as our LLM. For all LLMs, check out the [examples documentation](https://docs.llamaindex.ai/en/stable/examples/llm/openai/) or [LlamaHub](https://llamahub.ai/?tab=llms) for a list of all supported LLMs and how to install/use them.
+
+
+```python
+from llama_index.llms.openai import OpenAI
+
+llm = OpenAI(model="gpt-4o-mini", api_key="sk-...")
+```
+
+To make our agent more useful, we can give it tools/actions to use. In this case, we'll use Tavily to implement a tool that can search the web for information. You can get a free API key from [Tavily](https://tavily.com/).
+
+
+```python
+%pip install tavily-python
+```
+
+When creating a tool, its very important to:
+- give the tool a proper name and docstring/description. The LLM uses this to understand what the tool does.
+- annotate the types. This helps the LLM understand the expected input and output types.
+- use async when possible, since this will make the workflow more efficient.
+
+
+```python
+from tavily import AsyncTavilyClient
+
+
+async def search_web(query: str) -> str:
+    """Useful for using the web to answer questions."""
+    client = AsyncTavilyClient(api_key="tvly-...")
+    return str(await client.search(query))
+```
+
+With the tool and and LLM defined, we can create an `AgentWorkflow` that uses the tool.
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent
+
+agent = FunctionAgent(
+    tools=[search_web],
+    llm=llm,
+    system_prompt="You are a helpful assistant that can search the web for information.",
+)
+```
+
+## Running the Agent
+
+Now that our agent is created, we can run it!
+
+
+```python
+response = await agent.run(user_msg="What is the weather in San Francisco?")
+print(str(response))
+```
+
+    The current weather in San Francisco is as follows:
+    
+    - **Temperature**: 16.1°C (61°F)
+    - **Condition**: Partly cloudy
+    - **Wind**: 13.6 mph (22.0 kph) from the west
+    - **Humidity**: 64%
+    - **Visibility**: 16 km (9 miles)
+    - **Pressure**: 1017 mb (30.04 in)
+    
+    For more details, you can check the full report [here](https://www.weatherapi.com/).
+
+
+The above is the equivalent of the following of using `AgentWorkflow` with a single `FunctionAgent`:
+
+
+```python
+from llama_index.core.agent.workflow import AgentWorkflow
+
+workflow = AgentWorkflow(agents=[agent])
+
+response = await workflow.run(user_msg="What is the weather in San Francisco?")
+```
+
+If you were creating a workflow with multiple agents, you can pass in a list of agents to the `AgentWorkflow` constructor. Learn more in our [multi-agent workflow example](https://docs.llamaindex.ai/en/stable/understanding/agent/multi_agent/).
+
+## Maintaining State
+
+By default, the `FunctionAgent` will maintain stateless between runs. This means that the agent will not have any memory of previous runs.
+
+To maintain state, we need to keep track of the previous state. Since the `FunctionAgent` is running in a  `Workflow`, the state is stored in the `Context`. This can be passed between runs to maintain state and history.
+
+
+```python
+from llama_index.core.workflow import Context
+
+ctx = Context(agent)
+```
+
+
+```python
+response = await agent.run(
+    user_msg="My name is Logan, nice to meet you!", ctx=ctx
+)
+print(str(response))
+```
+
+    Nice to meet you, Logan! How can I assist you today?
+
+
+
+```python
+response = await agent.run(user_msg="What is my name?", ctx=ctx)
+print(str(response))
+```
+
+    Your name is Logan.
+
+
+The context is serializable, so it can be saved to a database, file, etc. and loaded back in later. 
+
+The `JsonSerializer` is a simple serializer that uses `json.dumps` and `json.loads` to serialize and deserialize the context.
+
+The `JsonPickleSerializer` is a serializer that uses `pickle` to serialize and deserialize the context. If you have objects in your context that are not serializable, you can use this serializer.
+
+
+```python
+from llama_index.core.workflow import JsonPickleSerializer, JsonSerializer
+
+ctx_dict = ctx.to_dict(serializer=JsonSerializer())
+
+restored_ctx = Context.from_dict(agent, ctx_dict, serializer=JsonSerializer())
+```
+
+
+```python
+response = await agent.run(
+    user_msg="Do you still remember my name?", ctx=restored_ctx
+)
+print(str(response))
+```
+
+    Yes, I remember your name is Logan.
+
+
+## Streaming
+
+The `AgentWorkflow`/`FunctionAgent` also supports streaming. Since the `AgentWorkflow` is a `Workflow`, it can be streamed like any other `Workflow`. This works by using the handler that is returned from the workflow. There are a few key events that are streamed, feel free to explore below.
+
+If you only want to stream the LLM output, you can use the `AgentStream` events.
+
+
+```python
+from llama_index.core.agent.workflow import (
+    AgentInput,
+    AgentOutput,
+    ToolCall,
+    ToolCallResult,
+    AgentStream,
+)
+
+handler = agent.run(user_msg="What is the weather in Saskatoon?")
+
+async for event in handler.stream_events():
+    if isinstance(event, AgentStream):
+        print(event.delta, end="", flush=True)
+        # print(event.response)  # the current full response
+        # print(event.raw)  # the raw llm api response
+        # print(event.current_agent_name)  # the current agent name
+    # elif isinstance(event, AgentInput):
+    #    print(event.input)  # the current input messages
+    #    print(event.current_agent_name)  # the current agent name
+    # elif isinstance(event, AgentOutput):
+    #    print(event.response)  # the current full response
+    #    print(event.tool_calls)  # the selected tool calls, if any
+    #    print(event.raw)  # the raw llm api response
+    # elif isinstance(event, ToolCallResult):
+    #    print(event.tool_name)  # the tool name
+    #    print(event.tool_kwargs)  # the tool kwargs
+    #    print(event.tool_output)  # the tool output
+    # elif isinstance(event, ToolCall):
+    #     print(event.tool_name)  # the tool name
+    #     print(event.tool_kwargs)  # the tool kwargs
+```
+
+    The current weather in Saskatoon is as follows:
+    
+    - **Temperature**: 22.2°C (72°F)
+    - **Condition**: Overcast
+    - **Humidity**: 25%
+    - **Wind Speed**: 6.0 mph (9.7 kph) from the northwest
+    - **Visibility**: 4.8 km
+    - **Pressure**: 1018 mb
+    
+    For more details, you can check the full report [here](https://www.weatherapi.com/).
+
+## Tools and State
+
+Tools can also be defined that have access to the workflow context. This means you can set and retrieve variables from the context and use them in the tool or between tools.
+
+**Note:** The `Context` parameter should be the first parameter of the tool.
+
+
+```python
+from llama_index.core.workflow import Context
+
+
+async def set_name(ctx: Context, name: str) -> str:
+    state = await ctx.store.get("state")
+    state["name"] = name
+    await ctx.store.set("state", state)
+    return f"Name set to {name}"
+
+
+agent = FunctionAgent(
+    tools=[set_name],
+    llm=llm,
+    system_prompt="You are a helpful assistant that can set a name.",
+    initial_state={"name": "unset"},
+)
+
+ctx = Context(agent)
+
+response = await agent.run(user_msg="My name is Logan", ctx=ctx)
+print(str(response))
+
+state = await ctx.store.get("state")
+print(state["name"])
+```
+
+    Your name has been set to Logan.
+    Logan
+
+
+## Human in the Loop
+
+Tools can also be defined that involve a human in the loop. This is useful for tasks that require human input, such as confirming a tool call or providing feedback.
+
+Using workflow events, we can emit events that require a response from the user. Here, we use the built-in `InputRequiredEvent` and `HumanResponseEvent` to handle the human in the loop, but you can also define your own events.
+
+`wait_for_event` will emit the `waiter_event` and wait until it sees the `HumanResponseEvent` with the specified `requirements`. The `waiter_id` is used to ensure that we only send one `waiter_event` for each `waiter_id`.
+
+
+```python
+from llama_index.core.workflow import (
+    Context,
+    InputRequiredEvent,
+    HumanResponseEvent,
+)
+
+
+async def dangerous_task(ctx: Context) -> str:
+    """A dangerous task that requires human confirmation."""
+
+    question = "Are you sure you want to proceed?"
+    response = await ctx.wait_for_event(
+        HumanResponseEvent,
+        waiter_id=question,
+        waiter_event=InputRequiredEvent(
+            prefix=question,
+            user_name="Logan",
+        ),
+        requirements={"user_name": "Logan"},
+    )
+    if response.response == "yes":
+        return "Dangerous task completed successfully."
+    else:
+        return "Dangerous task aborted."
+
+
+agent = FunctionAgent(
+    tools=[dangerous_task],
+    llm=llm,
+    system_prompt="You are a helpful assistant that can perform dangerous tasks.",
+)
+```
+
+
+```python
+handler = agent.run(user_msg="I want to proceed with the dangerous task.")
+
+async for event in handler.stream_events():
+    if isinstance(event, InputRequiredEvent):
+        response = input(event.prefix).strip().lower()
+        handler.ctx.send_event(
+            HumanResponseEvent(
+                response=response,
+                user_name=event.user_name,
+            )
+        )
+
+response = await handler
+print(str(response))
+```
+
+    The dangerous task has been completed successfully. If you need anything else, feel free to ask!
+
+
+In production scenarios, you might handle human-in-the-loop over a websocket or multiple API requests.
+
+As mentioned before, the `Context` object is serializable, and this means we can also save the workflow mid-run and restore it later. 
+
+**NOTE:** Any functions/steps that were in-progress will start from the beginning when the workflow is restored.
+
+
+```python
+from llama_index.core.workflow import JsonSerializer
+
+handler = agent.run(user_msg="I want to proceed with the dangerous task.")
+
+input_ev = None
+async for event in handler.stream_events():
+    if isinstance(event, InputRequiredEvent):
+        input_ev = event
+        break
+
+# save the context somewhere for later
+ctx_dict = handler.ctx.to_dict(serializer=JsonSerializer())
+
+# get the response from the user
+response_str = input(input_ev.prefix).strip().lower()
+
+# restore the workflow
+restored_ctx = Context.from_dict(agent, ctx_dict, serializer=JsonSerializer())
+
+handler = agent.run(ctx=restored_ctx)
+handler.ctx.send_event(
+    HumanResponseEvent(
+        response=response_str,
+        user_name=input_ev.user_name,
+    )
+)
+response = await handler
+print(str(response))
+```
+
+    The dangerous task has been completed successfully. If you need anything else, feel free to ask!
+
@@ -0,0 +1,300 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/agent_workflow_multi.ipynb
+toc: True
+title: "Multi-Agent Report Generation with AgentWorkflow"
+featured: False
+experimental: False
+tags: ['Agent']
+language: py
+---
+<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/docs/examples/agent/agent_workflow_multi.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
+
+In this notebook, we will explore how to use the `AgentWorkflow` class to create multi-agent systems. Specifically, we will create a system that can generate a report on a given topic.
+
+This notebook will assume that you have already either read the [basic agent workflow notebook](https://docs.llamaindex.ai/en/stable/examples/agent/agent_workflow_basic) or the [agent workflow documentation](https://docs.llamaindex.ai/en/stable/understanding/agent/).
+
+## Setup
+
+In this example, we will use `OpenAI` as our LLM. For all LLMs, check out the [examples documentation](https://docs.llamaindex.ai/en/stable/examples/llm/openai/) or [LlamaHub](https://llamahub.ai/?tab=llms) for a list of all supported LLMs and how to install/use them.
+
+If we wanted, each agent could have a different LLM, but for this example, we will use the same LLM for all agents.
+
+
+```python
+%pip install llama-index
+```
+
+
+```python
+from llama_index.llms.openai import OpenAI
+
+llm = OpenAI(model="gpt-4o", api_key="sk-...")
+```
+
+## System Design
+
+Our system will have three agents:
+
+1. A `ResearchAgent` that will search the web for information on the given topic.
+2. A `WriteAgent` that will write the report using the information found by the `ResearchAgent`.
+3. A `ReviewAgent` that will review the report and provide feedback.
+
+We will use the `AgentWorkflow` class to create a multi-agent system that will execute these agents in order.
+
+While there are many ways to implement this system, in this case, we will use a few tools to help with the research and writing processes.
+
+1. A `web_search` tool to search the web for information on the given topic.
+2. A `record_notes` tool to record notes on the given topic.
+3. A `write_report` tool to write the report using the information found by the `ResearchAgent`.
+4. A `review_report` tool to review the report and provide feedback.
+
+Utilizing the `Context` class, we can pass state between agents, and each agent will have access to the current state of the system.
+
+
+
+```python
+%pip install tavily-python
+```
+
+
+```python
+from tavily import AsyncTavilyClient
+from llama_index.core.workflow import Context
+
+
+async def search_web(query: str) -> str:
+    """Useful for using the web to answer questions."""
+    client = AsyncTavilyClient(api_key="tvly-...")
+    return str(await client.search(query))
+
+
+async def record_notes(ctx: Context, notes: str, notes_title: str) -> str:
+    """Useful for recording notes on a given topic. Your input should be notes with a title to save the notes under."""
+    current_state = await ctx.store.get("state")
+    if "research_notes" not in current_state:
+        current_state["research_notes"] = {}
+    current_state["research_notes"][notes_title] = notes
+    await ctx.store.set("state", current_state)
+    return "Notes recorded."
+
+
+async def write_report(ctx: Context, report_content: str) -> str:
+    """Useful for writing a report on a given topic. Your input should be a markdown formatted report."""
+    current_state = await ctx.store.get("state")
+    current_state["report_content"] = report_content
+    await ctx.store.set("state", current_state)
+    return "Report written."
+
+
+async def review_report(ctx: Context, review: str) -> str:
+    """Useful for reviewing a report and providing feedback. Your input should be a review of the report."""
+    current_state = await ctx.store.get("state")
+    current_state["review"] = review
+    await ctx.store.set("state", current_state)
+    return "Report reviewed."
+```
+
+With our tools defined, we can now create our agents.
+
+If the LLM you are using supports tool calling, you can use the `FunctionAgent` class. Otherwise, you can use the `ReActAgent` class.
+
+Here, the name and description of each agent is used so that the system knows what each agent is responsible for and when to hand off control to the next agent.
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent, ReActAgent
+
+research_agent = FunctionAgent(
+    name="ResearchAgent",
+    description="Useful for searching the web for information on a given topic and recording notes on the topic.",
+    system_prompt=(
+        "You are the ResearchAgent that can search the web for information on a given topic and record notes on the topic. "
+        "Once notes are recorded and you are satisfied, you should hand off control to the WriteAgent to write a report on the topic. "
+        "You should have at least some notes on a topic before handing off control to the WriteAgent."
+    ),
+    llm=llm,
+    tools=[search_web, record_notes],
+    can_handoff_to=["WriteAgent"],
+)
+
+write_agent = FunctionAgent(
+    name="WriteAgent",
+    description="Useful for writing a report on a given topic.",
+    system_prompt=(
+        "You are the WriteAgent that can write a report on a given topic. "
+        "Your report should be in a markdown format. The content should be grounded in the research notes. "
+        "Once the report is written, you should get feedback at least once from the ReviewAgent."
+    ),
+    llm=llm,
+    tools=[write_report],
+    can_handoff_to=["ReviewAgent", "ResearchAgent"],
+)
+
+review_agent = FunctionAgent(
+    name="ReviewAgent",
+    description="Useful for reviewing a report and providing feedback.",
+    system_prompt=(
+        "You are the ReviewAgent that can review the write report and provide feedback. "
+        "Your review should either approve the current report or request changes for the WriteAgent to implement. "
+        "If you have feedback that requires changes, you should hand off control to the WriteAgent to implement the changes after submitting the review."
+    ),
+    llm=llm,
+    tools=[review_report],
+    can_handoff_to=["WriteAgent"],
+)
+```
+
+## Running the Workflow
+
+With our agents defined, we can create our `AgentWorkflow` and run it.
+
+
+```python
+from llama_index.core.agent.workflow import AgentWorkflow
+
+agent_workflow = AgentWorkflow(
+    agents=[research_agent, write_agent, review_agent],
+    root_agent=research_agent.name,
+    initial_state={
+        "research_notes": {},
+        "report_content": "Not written yet.",
+        "review": "Review required.",
+    },
+)
+```
+
+As the workflow is running, we will stream the events to get an idea of what is happening under the hood.
+
+
+```python
+from llama_index.core.agent.workflow import (
+    AgentInput,
+    AgentOutput,
+    ToolCall,
+    ToolCallResult,
+    AgentStream,
+)
+
+handler = agent_workflow.run(
+    user_msg=(
+        "Write me a report on the history of the internet. "
+        "Briefly describe the history of the internet, including the development of the internet, the development of the web, "
+        "and the development of the internet in the 21st century."
+    )
+)
+
+current_agent = None
+current_tool_calls = ""
+async for event in handler.stream_events():
+    if (
+        hasattr(event, "current_agent_name")
+        and event.current_agent_name != current_agent
+    ):
+        current_agent = event.current_agent_name
+        print(f"\n{'='*50}")
+        print(f"🤖 Agent: {current_agent}")
+        print(f"{'='*50}\n")
+
+    # if isinstance(event, AgentStream):
+    #     if event.delta:
+    #         print(event.delta, end="", flush=True)
+    # elif isinstance(event, AgentInput):
+    #     print("📥 Input:", event.input)
+    elif isinstance(event, AgentOutput):
+        if event.response.content:
+            print("📤 Output:", event.response.content)
+        if event.tool_calls:
+            print(
+                "🛠️  Planning to use tools:",
+                [call.tool_name for call in event.tool_calls],
+            )
+    elif isinstance(event, ToolCallResult):
+        print(f"🔧 Tool Result ({event.tool_name}):")
+        print(f"  Arguments: {event.tool_kwargs}")
+        print(f"  Output: {event.tool_output}")
+    elif isinstance(event, ToolCall):
+        print(f"🔨 Calling Tool: {event.tool_name}")
+        print(f"  With arguments: {event.tool_kwargs}")
+```
+
+    
+    ==================================================
+    🤖 Agent: ResearchAgent
+    ==================================================
+    
+    🛠️  Planning to use tools: ['search_web']
+    🔨 Calling Tool: search_web
+      With arguments: {'query': 'history of the internet'}
+    🔧 Tool Result (search_web):
+      Arguments: {'query': 'history of the internet'}
+      Output: {'query': 'history of the internet', 'follow_up_questions': None, 'answer': None, 'images': [], 'results': [{'title': 'Internet history timeline: ARPANET to the World Wide Web', 'url': 'https://www.livescience.com/20727-internet-history.html', 'content': 'Internet history timeline: ARPANET to the World Wide Web\nThe internet history timeline shows how today\'s vast network evolved from the initial concept\nIn internet history, credit for the initial concept that developed into the World Wide Web is typically given to Leonard Kleinrock. "\nAccording to the journal Management and Business Review (MBR), Kleinrock, along with other innovators such as J.C.R. Licklider, the first director of the Information Processing Technology Office (IPTO), provided the backbone for the ubiquitous stream of emails, media, Facebook postings and tweets that are now shared online every day.\n The precursor to the internet was jumpstarted in the early days of the history of computers , in 1969 with the U.S. Defense Department\'s Advanced Research Projects Agency Network (ARPANET), according to the journal American Scientist. The successful push to stop the bill, involving technology companies such as Google and nonprofit organizations including Wikipedia and the Electronic Frontier Foundation, is considered a victory for sites such as YouTube that depend on user-generated content, as well as "fair use" on the internet.\n Vinton Cerf and Bob Kahn (the duo said by many to be the Fathers of the Internet) publish "A Protocol for Packet Network Interconnection," which details the design of TCP.\n1976:', 'score': 0.81097376, 'raw_content': None}, {'title': 'A Brief History of the Internet - University System of Georgia', 'url': 'https://usg.edu/galileo/skills/unit07/internet07_02.phtml', 'content': 'The Internet started in the 1960s as a way for government researchers to share information. This eventually led to the formation of the ARPANET (Advanced Research Projects Agency Network), the network that ultimately evolved into what we now know as the Internet. In response to this, other networks were created to provide information sharing. ARPANET and the Defense Data Network officially changed to the TCP/IP standard on January 1, 1983, hence the birth of the Internet. (Business computers like the UNIVAC processed data more slowly than the IAS-type machines, but were designed for fast input and output.) The first few sales were to government agencies, the A.C. Nielsen Company, and the Prudential Insurance Company.', 'score': 0.8091708, 'raw_content': None}, {'title': 'Timeline - History of the Internet', 'url': 'https://historyoftheinternet.net/timeline/', 'content': "Learn how the internet evolved from SAGE and IBM's internal networks to ARPANET and the World Wide Web. Explore the commercial and government paths that led to the current internet format and protocols.", 'score': 0.7171114, 'raw_content': None}, {'title': 'Learn About Internet History | History of the Internet', 'url': 'https://internethistory.org/', 'content': 'Learn about the origins, evolution and impact of the internet through stories, materials and videos. Explore the first internet message, optical amplifier, wavelength division multiplexing and more.', 'score': 0.7040996, 'raw_content': None}, {'title': 'Brief History of the Internet', 'url': 'https://www.internetsociety.org/resources/doc/2017/brief-history-internet/', 'content': "Learn how the Internet evolved from the initial internetting concepts to a global network of networks that transformed the computer and communications world. Explore the key milestones, challenges, and opportunities of the Internet's development and future.", 'score': 0.6944897, 'raw_content': None}], 'response_time': 1.65}
+    🛠️  Planning to use tools: ['record_notes']
+    🔨 Calling Tool: record_notes
+      With arguments: {'notes': "The internet's history began in the 1960s as a project for government researchers to share information, leading to the creation of ARPANET (Advanced Research Projects Agency Network). ARPANET was the first network to implement the TCP/IP protocol suite, which became the foundation for the modern internet. On January 1, 1983, ARPANET and the Defense Data Network officially adopted TCP/IP, marking the birth of the internet.\n\nThe World Wide Web was developed later, in 1989, by Tim Berners-Lee, a British scientist at CERN. The web was initially conceived as a way to facilitate information sharing among scientists and institutes around the world. Berners-Lee developed the first web browser and web server, and introduced the concept of hyperlinks, which allowed users to navigate between different documents on the web.\n\nIn the 21st century, the internet has evolved into a global network that connects billions of devices and users. It has transformed communication, commerce, entertainment, and education, among other fields. The development of broadband technology, mobile internet, and social media platforms has further accelerated the internet's growth and impact.", 'notes_title': 'History of the Internet'}
+    🔧 Tool Result (record_notes):
+      Arguments: {'notes': "The internet's history began in the 1960s as a project for government researchers to share information, leading to the creation of ARPANET (Advanced Research Projects Agency Network). ARPANET was the first network to implement the TCP/IP protocol suite, which became the foundation for the modern internet. On January 1, 1983, ARPANET and the Defense Data Network officially adopted TCP/IP, marking the birth of the internet.\n\nThe World Wide Web was developed later, in 1989, by Tim Berners-Lee, a British scientist at CERN. The web was initially conceived as a way to facilitate information sharing among scientists and institutes around the world. Berners-Lee developed the first web browser and web server, and introduced the concept of hyperlinks, which allowed users to navigate between different documents on the web.\n\nIn the 21st century, the internet has evolved into a global network that connects billions of devices and users. It has transformed communication, commerce, entertainment, and education, among other fields. The development of broadband technology, mobile internet, and social media platforms has further accelerated the internet's growth and impact.", 'notes_title': 'History of the Internet'}
+      Output: Notes recorded.
+    🛠️  Planning to use tools: ['handoff']
+    🔨 Calling Tool: handoff
+      With arguments: {'to_agent': 'WriteAgent', 'reason': 'I have gathered and recorded notes on the history of the internet, including its development, the creation of the web, and its evolution in the 21st century. The WriteAgent can now use these notes to write a comprehensive report.'}
+    🔧 Tool Result (handoff):
+      Arguments: {'to_agent': 'WriteAgent', 'reason': 'I have gathered and recorded notes on the history of the internet, including its development, the creation of the web, and its evolution in the 21st century. The WriteAgent can now use these notes to write a comprehensive report.'}
+      Output: Handed off to WriteAgent because: I have gathered and recorded notes on the history of the internet, including its development, the creation of the web, and its evolution in the 21st century. The WriteAgent can now use these notes to write a comprehensive report.
+    
+    ==================================================
+    🤖 Agent: WriteAgent
+    ==================================================
+    
+    🛠️  Planning to use tools: ['write_report']
+    🔨 Calling Tool: write_report
+      With arguments: {'report_content': "# History of the Internet\n\nThe history of the internet is a fascinating journey from a government research project to a global network that has transformed the way we communicate, work, and live. This report provides a brief overview of the key milestones in the development of the internet, the World Wide Web, and the internet's evolution in the 21st century.\n\n## Development of the Internet\n\nThe origins of the internet date back to the 1960s, when it was conceived as a means for government researchers to share information. This led to the creation of ARPANET (Advanced Research Projects Agency Network), which was the first network to implement the TCP/IP protocol suite. TCP/IP became the foundational technology for the modern internet. On January 1, 1983, ARPANET and the Defense Data Network officially adopted TCP/IP, marking the official birth of the internet.\n\n## Development of the World Wide Web\n\nThe World Wide Web was developed in 1989 by Tim Berners-Lee, a British scientist working at CERN. The web was initially designed to facilitate information sharing among scientists and research institutes globally. Berners-Lee created the first web browser and web server, introducing the concept of hyperlinks, which allowed users to navigate between different documents on the web seamlessly.\n\n## The Internet in the 21st Century\n\nIn the 21st century, the internet has evolved into a vast global network connecting billions of devices and users. It has revolutionized communication, commerce, entertainment, and education, among other fields. The development of broadband technology, mobile internet, and social media platforms has further accelerated the internet's growth and impact, making it an integral part of daily life for people around the world.\n\nThe internet continues to evolve, with ongoing advancements in technology and infrastructure shaping its future. As it grows, the internet remains a powerful tool for innovation, connectivity, and information sharing, influencing nearly every aspect of modern society."}
+    🔧 Tool Result (write_report):
+      Arguments: {'report_content': "# History of the Internet\n\nThe history of the internet is a fascinating journey from a government research project to a global network that has transformed the way we communicate, work, and live. This report provides a brief overview of the key milestones in the development of the internet, the World Wide Web, and the internet's evolution in the 21st century.\n\n## Development of the Internet\n\nThe origins of the internet date back to the 1960s, when it was conceived as a means for government researchers to share information. This led to the creation of ARPANET (Advanced Research Projects Agency Network), which was the first network to implement the TCP/IP protocol suite. TCP/IP became the foundational technology for the modern internet. On January 1, 1983, ARPANET and the Defense Data Network officially adopted TCP/IP, marking the official birth of the internet.\n\n## Development of the World Wide Web\n\nThe World Wide Web was developed in 1989 by Tim Berners-Lee, a British scientist working at CERN. The web was initially designed to facilitate information sharing among scientists and research institutes globally. Berners-Lee created the first web browser and web server, introducing the concept of hyperlinks, which allowed users to navigate between different documents on the web seamlessly.\n\n## The Internet in the 21st Century\n\nIn the 21st century, the internet has evolved into a vast global network connecting billions of devices and users. It has revolutionized communication, commerce, entertainment, and education, among other fields. The development of broadband technology, mobile internet, and social media platforms has further accelerated the internet's growth and impact, making it an integral part of daily life for people around the world.\n\nThe internet continues to evolve, with ongoing advancements in technology and infrastructure shaping its future. As it grows, the internet remains a powerful tool for innovation, connectivity, and information sharing, influencing nearly every aspect of modern society."}
+      Output: Report written.
+    🛠️  Planning to use tools: ['handoff']
+    🔨 Calling Tool: handoff
+      With arguments: {'to_agent': 'ReviewAgent', 'reason': 'The report on the history of the internet has been written and needs to be reviewed for accuracy and completeness.'}
+    🔧 Tool Result (handoff):
+      Arguments: {'to_agent': 'ReviewAgent', 'reason': 'The report on the history of the internet has been written and needs to be reviewed for accuracy and completeness.'}
+      Output: Handed off to ReviewAgent because: The report on the history of the internet has been written and needs to be reviewed for accuracy and completeness.
+    
+    ==================================================
+    🤖 Agent: ReviewAgent
+    ==================================================
+    
+    🛠️  Planning to use tools: ['review_report']
+    🔨 Calling Tool: review_report
+      With arguments: {'review': "The report on the history of the internet provides a concise and informative overview of the key developments in the internet's history. It effectively covers the origins of the internet with ARPANET, the creation of the World Wide Web by Tim Berners-Lee, and the evolution of the internet in the 21st century. The report is well-structured, with clear sections that make it easy to follow.\n\nThe content is accurate and aligns with the historical timeline of the internet's development. It highlights significant milestones such as the adoption of TCP/IP and the introduction of hyperlinks, which are crucial to understanding the internet's growth.\n\nOverall, the report meets the requirements and provides a comprehensive summary of the internet's history. It is approved for final submission."}
+    🔧 Tool Result (review_report):
+      Arguments: {'review': "The report on the history of the internet provides a concise and informative overview of the key developments in the internet's history. It effectively covers the origins of the internet with ARPANET, the creation of the World Wide Web by Tim Berners-Lee, and the evolution of the internet in the 21st century. The report is well-structured, with clear sections that make it easy to follow.\n\nThe content is accurate and aligns with the historical timeline of the internet's development. It highlights significant milestones such as the adoption of TCP/IP and the introduction of hyperlinks, which are crucial to understanding the internet's growth.\n\nOverall, the report meets the requirements and provides a comprehensive summary of the internet's history. It is approved for final submission."}
+      Output: Report reviewed.
+    📤 Output: The report on the history of the internet has been reviewed and approved. It provides a comprehensive and accurate overview of the internet's development, the creation of the World Wide Web, and its evolution in the 21st century. The report is well-structured and meets the requirements for final submission.
+
+
+Now, we can retrieve the final report in the system for ourselves.
+
+
+```python
+state = await handler.ctx.store.get("state")
+print(state["report_content"])
+```
+
+    # History of the Internet
+    
+    The history of the internet is a fascinating journey from a government research project to a global network that has transformed the way we communicate, work, and live. This report provides a brief overview of the key milestones in the development of the internet, the World Wide Web, and the internet's evolution in the 21st century.
+    
+    ## Development of the Internet
+    
+    The origins of the internet date back to the 1960s, when it was conceived as a means for government researchers to share information. This led to the creation of ARPANET (Advanced Research Projects Agency Network), which was the first network to implement the TCP/IP protocol suite. TCP/IP became the foundational technology for the modern internet. On January 1, 1983, ARPANET and the Defense Data Network officially adopted TCP/IP, marking the official birth of the internet.
+    
+    ## Development of the World Wide Web
+    
+    The World Wide Web was developed in 1989 by Tim Berners-Lee, a British scientist working at CERN. The web was initially designed to facilitate information sharing among scientists and research institutes globally. Berners-Lee created the first web browser and web server, introducing the concept of hyperlinks, which allowed users to navigate between different documents on the web seamlessly.
+    
+    ## The Internet in the 21st Century
+    
+    In the 21st century, the internet has evolved into a vast global network connecting billions of devices and users. It has revolutionized communication, commerce, entertainment, and education, among other fields. The development of broadband technology, mobile internet, and social media platforms has further accelerated the internet's growth and impact, making it an integral part of daily life for people around the world.
+    
+    The internet continues to evolve, with ongoing advancements in technology and infrastructure shaping its future. As it grows, the internet remains a powerful tool for innovation, connectivity, and information sharing, influencing nearly every aspect of modern society.
+
@@ -1,11 +1,12 @@
 ---
 layout: recipe
-colab: https://colab.research.google.com/github/TuanaCelik/cookbooks-demo/blob/main/notebooks/agent/agent_workflow_research_assistant.ipynb
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/agent_workflow_research_assistant.ipynb
 toc: True
 title: "Agent Workflow + Research Assistant using AgentQL"
-featured: False
+featured: True
 experimental: True
 tags: ['Agent', 'Websearch', 'Integrations']
+language: py
 ---
 <a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/docs/examples/agent/agent_workflow_research_assistant.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

@@ -0,0 +1,467 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/agents_as_tools.ipynb
+toc: True
+title: "Multi-Agent Report Generation using Agents as Tools"
+featured: False
+experimental: False
+tags: ['Agent']
+language: py
+---
+<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/docs/examples/agent/agents_as_tools.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
+
+In this notebook, we will explore how to create a multi-agent system that uses a top-level agent to orchestrate a group of agents as tools. Specifically, we will create a system that can research, write, and review a report on a given topic.
+
+This notebook will assume that you have already either read the [basic agent workflow notebook](https://docs.llamaindex.ai/en/stable/examples/agent/agent_workflow_basic) or the [general agent documentation](https://docs.llamaindex.ai/en/stable/understanding/agent/).
+
+## Setup
+
+In this example, we will use `OpenAI` as our LLM. For all LLMs, check out the [examples documentation](https://docs.llamaindex.ai/en/stable/examples/llm/openai/) or [LlamaHub](https://llamahub.ai/?tab=llms) for a list of all supported LLMs and how to install/use them.
+
+If we wanted, each agent could have a different LLM, but for this example, we will use the same LLM for all agents.
+
+
+```python
+%pip install llama-index
+```
+
+
+```python
+from llama_index.llms.openai import OpenAI
+
+sub_agent_llm = OpenAI(model="gpt-4.1-mini", api_key="sk-...")
+orchestrator_llm = OpenAI(model="o3-mini", api_key="sk-...")
+```
+
+## System Design
+
+Our system will have three agents:
+
+1. A `ResearchAgent` that will search the web for information on the given topic.
+2. A `WriteAgent` that will write the report using the information found by the `ResearchAgent`.
+3. A `ReviewAgent` that will review the report and provide feedback.
+
+We will then use a top-level agent to orchestrate the other agents to write our report.
+
+While there are many ways to implement this system, in this case, we will use a single `web_search` tool to search the web for information on the given topic.
+
+
+
+```python
+%pip install tavily-python
+```
+
+
+```python
+from tavily import AsyncTavilyClient
+
+
+async def search_web(query: str) -> str:
+    """Useful for using the web to answer questions."""
+    client = AsyncTavilyClient(api_key="tvly-...")
+    return str(await client.search(query))
+```
+
+With our tool defined, we can now create our sub-agents.
+
+If the LLM you are using supports tool calling, you can use the `FunctionAgent` class. Otherwise, you can use the `ReActAgent` class.
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent, ReActAgent
+
+research_agent = FunctionAgent(
+    system_prompt=(
+        "You are the ResearchAgent that can search the web for information on a given topic and record notes on the topic. "
+        "You should output notes on the topic in a structured format."
+    ),
+    llm=sub_agent_llm,
+    tools=[search_web],
+)
+
+write_agent = FunctionAgent(
+    system_prompt=(
+        "You are the WriteAgent that can write a report on a given topic. "
+        "Your report should be in a markdown format. The content should be grounded in the research notes. "
+        "Return your markdown report surrounded by <report>...</report> tags."
+    ),
+    llm=sub_agent_llm,
+    tools=[],
+)
+
+review_agent = FunctionAgent(
+    system_prompt=(
+        "You are the ReviewAgent that can review the write report and provide feedback. "
+        "Your review should either approve the current report or request changes to be implemented."
+    ),
+    llm=sub_agent_llm,
+    tools=[],
+)
+```
+
+With our sub-agents defined, we can then convert them into tools that can be used by the top-level agent.
+
+
+```python
+import re
+from llama_index.core.workflow import Context
+
+
+async def call_research_agent(ctx: Context, prompt: str) -> str:
+    """Useful for recording research notes based on a specific prompt."""
+    result = await research_agent.run(
+        user_msg=f"Write some notes about the following: {prompt}"
+    )
+
+    state = await ctx.store.get("state")
+    state["research_notes"].append(str(result))
+    await ctx.store.set("state", state)
+
+    return str(result)
+
+
+async def call_write_agent(ctx: Context) -> str:
+    """Useful for writing a report based on the research notes or revising the report based on feedback."""
+    state = await ctx.store.get("state")
+    notes = state.get("research_notes", None)
+    if not notes:
+        return "No research notes to write from."
+
+    user_msg = f"Write a markdown report from the following notes. Be sure to output the report in the following format: <report>...</report>:\n\n"
+
+    # Add the feedback to the user message if it exists
+    feedback = state.get("review", None)
+    if feedback:
+        user_msg += f"<feedback>{feedback}</feedback>\n\n"
+
+    # Add the research notes to the user message
+    notes = "\n\n".join(notes)
+    user_msg += f"<research_notes>{notes}</research_notes>\n\n"
+
+    # Run the write agent
+    result = await write_agent.run(user_msg=user_msg)
+    report = re.search(r"<report>(.*)</report>", str(result), re.DOTALL).group(
+        1
+    )
+    state["report_content"] = str(report)
+    await ctx.store.set("state", state)
+
+    return str(report)
+
+
+async def call_review_agent(ctx: Context) -> str:
+    """Useful for reviewing the report and providing feedback."""
+    state = await ctx.store.get("state")
+    report = state.get("report_content", None)
+    if not report:
+        return "No report content to review."
+
+    result = await review_agent.run(
+        user_msg=f"Review the following report: {report}"
+    )
+    state["review"] = result
+    await ctx.store.set("state", state)
+
+    return result
+```
+
+## Creating the Top-Level Orchestrator Agent
+
+With our sub-agents defined as tools, we can now create our top-level orchestrator agent.
+
+
+```python
+orchestrator = FunctionAgent(
+    system_prompt=(
+        "You are an expert in the field of report writing. "
+        "You are given a user request and a list of tools that can help with the request. "
+        "You are to orchestrate the tools to research, write, and review a report on the given topic. "
+        "Once the review is positive, you should notify the user that the report is ready to be accessed."
+    ),
+    llm=orchestrator_llm,
+    tools=[
+        call_research_agent,
+        call_write_agent,
+        call_review_agent,
+    ],
+    initial_state={
+        "research_notes": [],
+        "report_content": None,
+        "review": None,
+    },
+)
+```
+
+## Running the Agent
+
+Let's run our agents! We can iterate over events as the workflow runs.
+
+
+```python
+from llama_index.core.agent.workflow import (
+    AgentInput,
+    AgentOutput,
+    ToolCall,
+    ToolCallResult,
+    AgentStream,
+)
+from llama_index.core.workflow import Context
+
+# Create a context for the orchestrator to hold history/state
+ctx = Context(orchestrator)
+
+
+async def run_orchestrator(ctx: Context, user_msg: str):
+    handler = orchestrator.run(
+        user_msg=user_msg,
+        ctx=ctx,
+    )
+
+    async for event in handler.stream_events():
+        if isinstance(event, AgentStream):
+            if event.delta:
+                print(event.delta, end="", flush=True)
+        # elif isinstance(event, AgentInput):
+        #     print("📥 Input:", event.input)
+        elif isinstance(event, AgentOutput):
+            # Skip printing the output since we are streaming above
+            # if event.response.content:
+            #     print("📤 Output:", event.response.content)
+            if event.tool_calls:
+                print(
+                    "🛠️  Planning to use tools:",
+                    [call.tool_name for call in event.tool_calls],
+                )
+        elif isinstance(event, ToolCallResult):
+            print(f"🔧 Tool Result ({event.tool_name}):")
+            print(f"  Arguments: {event.tool_kwargs}")
+            print(f"  Output: {event.tool_output}")
+        elif isinstance(event, ToolCall):
+            print(f"🔨 Calling Tool: {event.tool_name}")
+            print(f"  With arguments: {event.tool_kwargs}")
+```
+
+
+```python
+await run_orchestrator(
+    ctx=ctx,
+    user_msg=(
+        "Write me a report on the history of the internet. "
+        "Briefly describe the history of the internet, including the development of the internet, the development of the web, "
+        "and the development of the internet in the 21st century."
+    ),
+)
+```
+
+    🛠️  Planning to use tools: ['call_research_agent']
+    🔨 Calling Tool: call_research_agent
+      With arguments: {'prompt': 'Write a detailed research note on the history of the internet, covering the development of the internet, the development of the web, and the development of the internet in the 21st century.'}
+    🔧 Tool Result (call_research_agent):
+      Arguments: {'prompt': 'Write a detailed research note on the history of the internet, covering the development of the internet, the development of the web, and the development of the internet in the 21st century.'}
+      Output: Research Notes on the History of the Internet
+    
+    1. Development of the Internet:
+    - The internet's origins trace back to the late 1960s with the U.S. Defense Department's Advanced Research Projects Agency Network (ARPANET), designed as a military defense system during the Cold War.
+    - ARPANET was the first network to implement the protocol suite TCP/IP, which became the technical foundation of the modern Internet.
+    - The Network Working Group evolved into the Internet Working Group to coordinate the growing research community.
+    - In the 1970s, commercial packet networks emerged, primarily to provide remote computer access.
+    - The National Science Foundation (NSF) expanded access to the scientific and academic community and helped make TCP/IP the standard for federally supported research networks.
+    - The internet grew through interconnected commercial backbones linked by network access points (NAPs).
+    
+    2. Development of the World Wide Web:
+    - Invented by Tim Berners-Lee in 1989 while working at CERN, the World Wide Web introduced a "web" of linked information accessible to anyone on the Internet.
+    - By December 1990, Berners-Lee developed the essential tools: HTTP (HyperText Transfer Protocol), HTML (HyperText Markup Language), the first web browser/editor, the first web server, and the first website.
+    - The Web allowed easy access to existing information and linked resources, initially serving CERN scientists.
+    - In 1994, Berners-Lee founded the World Wide Web Consortium (W3C) at MIT to create open standards for the Web.
+    - The Web evolved from Web 1.0 (basic, static pages) to Web 2.0 (interactive, user-generated content) starting around 2003, and further towards Web 3.0 (semantic web, intelligent data) from 2014 onwards.
+    
+    3. Development of the Internet in the 21st Century:
+    - The 21st century saw transformative developments such as broadband, fiber-optic technology, and mobile internet.
+    - The rise of smartphones revolutionized mobile browsing and internet access.
+    - Cloud computing emerged, allowing data storage and processing on remote servers, changing how businesses and individuals manage information.
+    - The Internet of Things (IoT) connected everyday devices to the internet, expanding the internet's reach into daily life.
+    - Social media platforms became dominant, reshaping communication and information sharing.
+    - The internet's infrastructure and services have continuously evolved to support increasing data demands and new technologies.
+    
+    These notes summarize the key milestones and technological advancements that shaped the internet from its inception to its current state in the 21st century.
+    🛠️  Planning to use tools: ['call_write_agent']
+    🔨 Calling Tool: call_write_agent
+      With arguments: {}
+    🔧 Tool Result (call_write_agent):
+      Arguments: {}
+      Output: 
+    # History of the Internet
+    
+    ## 1. Development of the Internet
+    
+    The origins of the internet date back to the late 1960s with the creation of the Advanced Research Projects Agency Network (ARPANET) by the U.S. Defense Department. Initially designed as a military defense system during the Cold War, ARPANET was the first network to implement the TCP/IP protocol suite, which later became the technical foundation of the modern Internet.
+    
+    The Network Working Group, which coordinated early research efforts, evolved into the Internet Working Group as the research community expanded. During the 1970s, commercial packet networks began to emerge, primarily to provide remote computer access.
+    
+    The National Science Foundation (NSF) played a crucial role by expanding internet access to the scientific and academic communities and promoting TCP/IP as the standard for federally supported research networks. The internet grew further through interconnected commercial backbones linked by network access points (NAPs), facilitating broader connectivity.
+    
+    ## 2. Development of the World Wide Web
+    
+    The World Wide Web was invented in 1989 by Tim Berners-Lee while working at CERN. It introduced a "web" of linked information accessible to anyone on the Internet. By December 1990, Berners-Lee had developed the essential tools that formed the Web's foundation: HTTP (HyperText Transfer Protocol), HTML (HyperText Markup Language), the first web browser/editor, the first web server, and the first website.
+    
+    Initially serving CERN scientists, the Web allowed easy access to existing information and linked resources. In 1994, Berners-Lee founded the World Wide Web Consortium (W3C) at MIT to create open standards for the Web, ensuring its continued growth and interoperability.
+    
+    The Web evolved through several stages:
+    - **Web 1.0:** Basic, static pages.
+    - **Web 2.0:** Starting around 2003, characterized by interactive, user-generated content.
+    - **Web 3.0:** From 2014 onwards, focusing on the semantic web and intelligent data.
+    
+    ## 3. Development of the Internet in the 21st Century
+    
+    The 21st century brought transformative advancements to the internet, including broadband and fiber-optic technologies that significantly increased data transmission speeds. The rise of smartphones revolutionized mobile browsing and internet access, making the internet ubiquitous.
+    
+    Cloud computing emerged as a major innovation, enabling data storage and processing on remote servers, which transformed how businesses and individuals manage information. The Internet of Things (IoT) connected everyday devices to the internet, expanding its reach into daily life.
+    
+    Social media platforms became dominant forces, reshaping communication and information sharing globally. Throughout these developments, the internet's infrastructure and services have continuously evolved to support increasing data demands and new technologies.
+    
+    ---
+    
+    This report summarizes the key milestones and technological advancements that have shaped the internet from its inception in the late 1960s to its current state in the 21st century.
+    
+    🛠️  Planning to use tools: ['call_review_agent']
+    🔨 Calling Tool: call_review_agent
+      With arguments: {}
+    🔧 Tool Result (call_review_agent):
+      Arguments: {}
+      Output: The report titled "History of the Internet" is well-structured, clear, and provides a concise overview of the major developments in the evolution of the internet. It effectively covers the origins, the invention and growth of the World Wide Web, and significant 21st-century advancements.
+    
+    Strengths:
+    - The chronological organization helps readers follow the progression of internet technology.
+    - Key figures and organizations (e.g., ARPANET, Tim Berners-Lee, NSF, W3C) are appropriately highlighted.
+    - The explanation of Web 1.0, 2.0, and 3.0 stages adds valuable context.
+    - The inclusion of recent technologies such as cloud computing, IoT, and social media reflects current trends.
+    
+    Suggestions for improvement:
+    1. **Add citations or references:** The report would benefit from citing sources or references to support the historical facts and technological descriptions.
+    2. **Clarify technical terms:** While the report is generally accessible, briefly defining terms like TCP/IP, NAPs, and semantic web could help readers unfamiliar with networking jargon.
+    3. **Expand on social impact:** Consider including a brief discussion on how the internet has impacted society, economy, and culture to provide a more holistic view.
+    4. **Minor formatting:** The section numbering is inconsistent (e.g., "1.", "2.", "3." but no numbering for the introduction or conclusion). Adding a brief introduction and conclusion section with numbering or consistent formatting would improve flow.
+    
+    Overall, the report is informative and well-written. With the suggested enhancements, it would be even more comprehensive and reader-friendly.
+    
+    Recommendation: **Approve with minor revisions** to incorporate citations, clarify terms, and consider adding social impact context.
+    🛠️  Planning to use tools: ['call_write_agent']
+    🔨 Calling Tool: call_write_agent
+      With arguments: {}
+    🔧 Tool Result (call_write_agent):
+      Arguments: {}
+      Output: 
+    # History of the Internet
+    
+    ## 1. Introduction
+    
+    The internet is a transformative technology that has reshaped communication, information sharing, and society at large. This report provides a concise overview of the major developments in the evolution of the internet, from its origins in the late 1960s to the advanced technologies and societal impacts of the 21st century.
+    
+    ## 2. Development of the Internet
+    
+    The origins of the internet date back to the late 1960s with the creation of the Advanced Research Projects Agency Network (ARPANET) by the U.S. Department of Defense. ARPANET was initially designed as a military defense communication system during the Cold War. It was the first network to implement the Transmission Control Protocol/Internet Protocol (TCP/IP), a suite of communication protocols that became the technical foundation of the modern internet. TCP/IP enables different networks to interconnect and communicate seamlessly.
+    
+    During the 1970s, commercial packet-switched networks emerged, primarily to provide remote computer access. The National Science Foundation (NSF) played a crucial role in expanding internet access to the scientific and academic communities and helped establish TCP/IP as the standard protocol for federally supported research networks. The internet's growth was further supported by interconnected commercial backbones linked through Network Access Points (NAPs), which facilitated data exchange between different service providers.
+    
+    ## 3. Development of the World Wide Web
+    
+    In 1989, Tim Berners-Lee, working at CERN, invented the World Wide Web (WWW), which introduced a system of linked information accessible to anyone connected to the internet. By December 1990, Berners-Lee had developed the essential components of the Web: HyperText Transfer Protocol (HTTP), HyperText Markup Language (HTML), the first web browser/editor, the first web server, and the first website. These innovations allowed users to easily access and navigate information through hyperlinks.
+    
+    Initially serving CERN scientists, the Web rapidly expanded to the public. In 1994, Berners-Lee founded the World Wide Web Consortium (W3C) at MIT to develop open standards ensuring the Web's interoperability and growth.
+    
+    The Web has evolved through several stages:
+    
+    - **Web 1.0**: Characterized by static, read-only web pages.
+    - **Web 2.0**: Beginning around 2003, marked by interactive, user-generated content and social media platforms.
+    - **Web 3.0**: Emerging from 2014 onwards, focusing on the semantic web and intelligent data processing to create more personalized and meaningful online experiences.
+    
+    ## 4. Development of the Internet in the 21st Century
+    
+    The 21st century has witnessed transformative advancements in internet technology and infrastructure. Broadband and fiber-optic technologies have significantly increased data transmission speeds. The proliferation of smartphones revolutionized mobile internet access, enabling users to connect anytime and anywhere.
+    
+    Cloud computing emerged as a paradigm shift, allowing data storage and processing on remote servers rather than local devices. This innovation has changed how businesses and individuals manage information and applications.
+    
+    The Internet of Things (IoT) has expanded the internet's reach by connecting everyday devices—such as home appliances, vehicles, and wearable technology—to the network, enabling new functionalities and data-driven services.
+    
+    Social media platforms have become dominant forces in communication and information sharing, reshaping social interactions, marketing, and news dissemination.
+    
+    The internet's infrastructure and services continue to evolve to meet increasing data demands and support emerging technologies.
+    
+    ## 5. Social Impact of the Internet
+    
+    Beyond technological advancements, the internet has profoundly impacted society, the economy, and culture. It has democratized access to information, facilitated global communication, and enabled new forms of social interaction. Economically, it has created new industries, transformed traditional business models, and fostered innovation. Culturally, the internet has influenced media consumption, education, and the way communities form and interact.
+    
+    However, these changes also bring challenges such as privacy concerns, digital divides, misinformation, and cybersecurity threats, which require ongoing attention and management.
+    
+    ## 6. Conclusion
+    
+    The history of the internet is marked by continuous innovation and expansion, from its military origins to a global network integral to modern life. Key figures like Tim Berners-Lee and organizations such as ARPANET, NSF, and W3C have played pivotal roles in its development. Understanding the technical foundations, evolutionary stages of the Web, and recent technological trends provides valuable context for appreciating the internet's role today. Incorporating social impact considerations offers a more holistic view of this transformative technology.
+    
+    ---
+    
+    *Note: This report would benefit from citations to authoritative sources for historical facts and technical explanations to enhance credibility and provide readers with avenues for further research.*
+    
+    
+    The revised report on the history of the internet is now complete and ready for your review. Would you like to access the final report?
+
+With our report written and revised/reviewed, we can inspect the final report in the state.
+
+
+```python
+state = await ctx.store.get("state")
+print(state["report_content"])
+```
+
+    
+    # History of the Internet
+    
+    ## 1. Introduction
+    
+    The internet is a transformative technology that has reshaped communication, information sharing, and society at large. This report provides a concise overview of the major developments in the evolution of the internet, from its origins in the late 1960s to the advanced technologies and societal impacts of the 21st century.
+    
+    ## 2. Development of the Internet
+    
+    The origins of the internet date back to the late 1960s with the creation of the Advanced Research Projects Agency Network (ARPANET) by the U.S. Department of Defense. ARPANET was initially designed as a military defense communication system during the Cold War. It was the first network to implement the Transmission Control Protocol/Internet Protocol (TCP/IP), a suite of communication protocols that became the technical foundation of the modern internet. TCP/IP enables different networks to interconnect and communicate seamlessly.
+    
+    During the 1970s, commercial packet-switched networks emerged, primarily to provide remote computer access. The National Science Foundation (NSF) played a crucial role in expanding internet access to the scientific and academic communities and helped establish TCP/IP as the standard protocol for federally supported research networks. The internet's growth was further supported by interconnected commercial backbones linked through Network Access Points (NAPs), which facilitated data exchange between different service providers.
+    
+    ## 3. Development of the World Wide Web
+    
+    In 1989, Tim Berners-Lee, working at CERN, invented the World Wide Web (WWW), which introduced a system of linked information accessible to anyone connected to the internet. By December 1990, Berners-Lee had developed the essential components of the Web: HyperText Transfer Protocol (HTTP), HyperText Markup Language (HTML), the first web browser/editor, the first web server, and the first website. These innovations allowed users to easily access and navigate information through hyperlinks.
+    
+    Initially serving CERN scientists, the Web rapidly expanded to the public. In 1994, Berners-Lee founded the World Wide Web Consortium (W3C) at MIT to develop open standards ensuring the Web's interoperability and growth.
+    
+    The Web has evolved through several stages:
+    
+    - **Web 1.0**: Characterized by static, read-only web pages.
+    - **Web 2.0**: Beginning around 2003, marked by interactive, user-generated content and social media platforms.
+    - **Web 3.0**: Emerging from 2014 onwards, focusing on the semantic web and intelligent data processing to create more personalized and meaningful online experiences.
+    
+    ## 4. Development of the Internet in the 21st Century
+    
+    The 21st century has witnessed transformative advancements in internet technology and infrastructure. Broadband and fiber-optic technologies have significantly increased data transmission speeds. The proliferation of smartphones revolutionized mobile internet access, enabling users to connect anytime and anywhere.
+    
+    Cloud computing emerged as a paradigm shift, allowing data storage and processing on remote servers rather than local devices. This innovation has changed how businesses and individuals manage information and applications.
+    
+    The Internet of Things (IoT) has expanded the internet's reach by connecting everyday devices—such as home appliances, vehicles, and wearable technology—to the network, enabling new functionalities and data-driven services.
+    
+    Social media platforms have become dominant forces in communication and information sharing, reshaping social interactions, marketing, and news dissemination.
+    
+    The internet's infrastructure and services continue to evolve to meet increasing data demands and support emerging technologies.
+    
+    ## 5. Social Impact of the Internet
+    
+    Beyond technological advancements, the internet has profoundly impacted society, the economy, and culture. It has democratized access to information, facilitated global communication, and enabled new forms of social interaction. Economically, it has created new industries, transformed traditional business models, and fostered innovation. Culturally, the internet has influenced media consumption, education, and the way communities form and interact.
+    
+    However, these changes also bring challenges such as privacy concerns, digital divides, misinformation, and cybersecurity threats, which require ongoing attention and management.
+    
+    ## 6. Conclusion
+    
+    The history of the internet is marked by continuous innovation and expansion, from its military origins to a global network integral to modern life. Key figures like Tim Berners-Lee and organizations such as ARPANET, NSF, and W3C have played pivotal roles in its development. Understanding the technical foundations, evolutionary stages of the Web, and recent technological trends provides valuable context for appreciating the internet's role today. Incorporating social impact considerations offers a more holistic view of this transformative technology.
+    
+    ---
+    
+    *Note: This report would benefit from citations to authoritative sources for historical facts and technical explanations to enhance credibility and provide readers with avenues for further research.*
+    
+    
+
@@ -0,0 +1,213 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/anthropic_agent.ipynb
+toc: True
+title: "Function Calling Anthropic Agent"
+featured: False
+experimental: False
+tags: ['Agent', 'Integrations']
+language: py
+---
+This notebook shows you how to use our Anthropic agent, powered by function calling capabilities.
+
+**NOTE:** Only claude-3* models support function calling using Anthropic's API.
+
+## Initial Setup 
+
+Let's start by importing some simple building blocks.  
+
+The main thing we need is:
+1. the Anthropic API (using our own `llama_index` LLM class)
+2. a place to keep conversation history 
+3. a definition for tools that our agent can use.
+
+If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.
+
+
+
+```python
+%pip install llama-index
+%pip install llama-index-llms-anthropic
+%pip install llama-index-embeddings-openai
+```
+
+Let's define some very simple calculator tools for our agent.
+
+
+```python
+def multiply(a: int, b: int) -> int:
+    """Multiple two integers and returns the result integer"""
+    return a * b
+
+
+def add(a: int, b: int) -> int:
+    """Add two integers and returns the result integer"""
+    return a + b
+```
+
+Make sure your ANTHROPIC_API_KEY is set. Otherwise explicitly specify the `api_key` parameter.
+
+
+```python
+from llama_index.llms.anthropic import Anthropic
+
+llm = Anthropic(model="claude-3-opus-20240229", api_key="sk-...")
+```
+
+## Initialize Anthropic Agent
+
+Here we initialize a simple Mistral agent with calculator functions.
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent
+
+agent = FunctionAgent(
+    tools=[multiply, add],
+    llm=llm,
+)
+```
+
+
+```python
+from llama_index.core.agent.workflow import ToolCallResult
+
+
+async def run_agent_verbose(query: str):
+    handler = agent.run(query)
+    async for event in handler.stream_events():
+        if isinstance(event, ToolCallResult):
+            print(
+                f"Called tool {event.tool_name} with args {event.tool_kwargs}\nGot result: {event.tool_output}"
+            )
+
+    return await handler
+```
+
+### Chat
+
+
+```python
+response = await run_agent_verbose("What is (121 + 2) * 5?")
+print(str(response))
+```
+
+    Called tool add with args {'a': 121, 'b': 2}
+    Got result: 123
+    Called tool multiply with args {'a': 123, 'b': 5}
+    Got result: 615
+    Therefore, (121 + 2) * 5 = 615
+
+
+
+```python
+# inspect sources
+print(response.tool_calls)
+```
+
+    [ToolCallResult(tool_name='add', tool_kwargs={'a': 121, 'b': 2}, tool_id='toolu_01MH6ME7ppxGPSJcCMEUAN5Q', tool_output=ToolOutput(content='123', tool_name='add', raw_input={'args': (), 'kwargs': {'a': 121, 'b': 2}}, raw_output=123, is_error=False), return_direct=False), ToolCallResult(tool_name='multiply', tool_kwargs={'a': 123, 'b': 5}, tool_id='toolu_01JE5TVERND5YC97E68gYoPw', tool_output=ToolOutput(content='615', tool_name='multiply', raw_input={'args': (), 'kwargs': {'a': 123, 'b': 5}}, raw_output=615, is_error=False), return_direct=False)]
+
+
+### Managing Context/Memory
+
+By default, `.run()` is stateless. If you want to maintain state, you can pass in a `context` object.
+
+
+```python
+from llama_index.core.workflow import Context
+
+ctx = Context(agent)
+
+response = await agent.run("My name is John Doe", ctx=ctx)
+response = await agent.run("What is my name?", ctx=ctx)
+
+print(str(response))
+```
+
+## Anthropic Agent over RAG Pipeline
+
+Build a Anthropic agent over a simple 10K document. We use OpenAI embeddings and claude-3-haiku-20240307 to construct the RAG pipeline, and pass it to the Anthropic Opus agent as a tool.
+
+
+```python
+!mkdir -p 'data/10k/'
+!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/uber_2021.pdf' -O 'data/10k/uber_2021.pdf'
+```
+
+    --2025-03-24 12:52:55--  https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/uber_2021.pdf
+    Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.108.133, 185.199.109.133, ...
+    Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.
+    HTTP request sent, awaiting response... 200 OK
+    Length: 1880483 (1.8M) [application/octet-stream]
+    Saving to: ‘data/10k/uber_2021.pdf’
+    
+    data/10k/uber_2021. 100%[===================>]   1.79M  8.98MB/s    in 0.2s    
+    
+    2025-03-24 12:52:56 (8.98 MB/s) - ‘data/10k/uber_2021.pdf’ saved [1880483/1880483]
+    
+
+
+
+```python
+from llama_index.core.tools import QueryEngineTool
+from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
+from llama_index.embeddings.openai import OpenAIEmbedding
+from llama_index.llms.anthropic import Anthropic
+
+embed_model = OpenAIEmbedding(
+    model_name="text-embedding-3-large", api_key="sk-proj-..."
+)
+query_llm = Anthropic(model="claude-3-haiku-20240307", api_key="sk-...")
+
+# load data
+uber_docs = SimpleDirectoryReader(
+    input_files=["./data/10k/uber_2021.pdf"]
+).load_data()
+
+# build index
+uber_index = VectorStoreIndex.from_documents(
+    uber_docs, embed_model=embed_model
+)
+uber_engine = uber_index.as_query_engine(similarity_top_k=3, llm=query_llm)
+query_engine_tool = QueryEngineTool.from_defaults(
+    query_engine=uber_engine,
+    name="uber_10k",
+    description=(
+        "Provides information about Uber financials for year 2021. "
+        "Use a detailed plain text question as input to the tool."
+    ),
+)
+```
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent
+
+agent = FunctionAgent(tools=[query_engine_tool], llm=llm, verbose=True)
+```
+
+
+```python
+response = await agent.run(
+    "Tell me both the risk factors and tailwinds for Uber?"
+)
+print(str(response))
+```
+
+    In summary, based on Uber's 2021 10-K filing, some of the company's key risk factors included:
+    
+    - Significant expected increases in operating expenses 
+    - Challenges attracting and retaining drivers, consumers, merchants, shippers, and carriers
+    - Risks to Uber's brand and reputation
+    - Challenges from Uber's historical workplace culture 
+    - Difficulties optimizing organizational structure and managing growth
+    - Risks related to criminal activity by platform users
+    - Risks from new offerings and technologies like autonomous vehicles
+    - Data security and privacy risks
+    - Climate change exposure 
+    - Reliance on third-party platforms
+    - Regulatory and legal risks
+    - Intellectual property risks
+    
+    In terms of growth opportunities and tailwinds, Uber's strategy in 2021 focused on restructuring by divesting certain markets and business lines, and instead partnering with and taking minority ownership positions in local ridesharing and delivery companies in those markets. This suggests Uber saw opportunities to still participate in the growth of those markets through its investments, rather than operating independently.
+
@@ -0,0 +1,157 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/bedrock_converse_agent.ipynb
+toc: True
+title: "Function Calling AWS Bedrock Converse Agent"
+featured: False
+experimental: False
+tags: ['Agent', 'Integrations']
+language: py
+---
+This notebook shows you how to use our AWS Bedrock Converse agent, powered by function calling capabilities.
+
+## Initial Setup 
+
+Let's start by importing some simple building blocks.  
+
+The main thing we need is:
+1. AWS credentials with access to Bedrock and the Claude Haiku LLM
+2. a place to keep conversation history 
+3. a definition for tools that our agent can use.
+
+If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.
+
+
+
+```python
+%pip install llama-index
+%pip install llama-index-llms-bedrock-converse
+%pip install llama-index-embeddings-huggingface
+```
+
+Let's define some very simple calculator tools for our agent.
+
+
+```python
+def multiply(a: int, b: int) -> int:
+    """Multiple two integers and returns the result integer"""
+    return a * b
+
+
+def add(a: int, b: int) -> int:
+    """Add two integers and returns the result integer"""
+    return a + b
+```
+
+Make sure to set your AWS credentials, either the `profile_name` or the keys below.
+
+
+```python
+from llama_index.llms.bedrock_converse import BedrockConverse
+
+llm = BedrockConverse(
+    model="anthropic.claude-3-haiku-20240307-v1:0",
+    # NOTE replace with your own AWS credentials
+    aws_access_key_id="AWS Access Key ID to use",
+    aws_secret_access_key="AWS Secret Access Key to use",
+    aws_session_token="AWS Session Token to use",
+    region_name="AWS Region to use, eg. us-east-1",
+)
+```
+
+## Initialize AWS Bedrock Converse Agent
+
+Here we initialize a simple AWS Bedrock Converse agent with calculator functions.
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent
+
+agent = FunctionAgent(
+    tools=[multiply, add],
+    llm=llm,
+)
+```
+
+### Chat
+
+
+```python
+response = await agent.run("What is (121 + 2) * 5?")
+print(str(response))
+```
+
+
+```python
+# inspect sources
+print(response.tool_calls)
+```
+
+## AWS Bedrock Converse Agent over RAG Pipeline
+
+Build an AWS Bedrock Converse agent over a simple 10K document. We use both HuggingFace embeddings and `BAAI/bge-small-en-v1.5` to construct the RAG pipeline, and pass it to the AWS Bedrock Converse agent as a tool.
+
+
+```python
+!mkdir -p 'data/10k/'
+!curl -o 'data/10k/uber_2021.pdf' 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/uber_2021.pdf'
+```
+
+
+```python
+from llama_index.core.tools import QueryEngineTool
+from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
+from llama_index.embeddings.huggingface import HuggingFaceEmbedding
+from llama_index.llms.bedrock_converse import BedrockConverse
+
+embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")
+query_llm = BedrockConverse(
+    model="anthropic.claude-3-haiku-20240307-v1:0",
+    # NOTE replace with your own AWS credentials
+    aws_access_key_id="AWS Access Key ID to use",
+    aws_secret_access_key="AWS Secret Access Key to use",
+    aws_session_token="AWS Session Token to use",
+    region_name="AWS Region to use, eg. us-east-1",
+)
+
+# load data
+uber_docs = SimpleDirectoryReader(
+    input_files=["./data/10k/uber_2021.pdf"]
+).load_data()
+
+# build index
+uber_index = VectorStoreIndex.from_documents(
+    uber_docs, embed_model=embed_model
+)
+uber_engine = uber_index.as_query_engine(similarity_top_k=3, llm=query_llm)
+query_engine_tool = QueryEngineTool.from_defaults(
+    query_engine=uber_engine,
+    name="uber_10k",
+    description=(
+        "Provides information about Uber financials for year 2021. "
+        "Use a detailed plain text question as input to the tool."
+    ),
+)
+```
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent
+
+agent = FunctionAgent(
+    tools=[query_engine_tool],
+    llm=llm,
+)
+```
+
+
+```python
+response = await agent.run(
+    "Tell me both the risk factors and tailwinds for Uber? Do two parallel tool calls."
+)
+```
+
+
+```python
+print(str(response))
+```
@@ -0,0 +1,369 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/code_act_agent.ipynb
+toc: True
+title: "Prebuilt CodeAct Agent w/ LlamaIndex"
+featured: False
+experimental: False
+tags: ['Agent']
+language: py
+---
+LlamaIndex offers a prebuilt CodeAct Agent that can be used to write and execute code, inspired by the original [CodeAct paper](https://arxiv.org/abs/2402.01030).
+
+With this agent, you provide an agent with a set of functions, and the agent will write code that uses those functions to help complete the task you give it.
+
+Some advantages of using the CodeAct Agent:
+
+- No need to exhaustively list out all the possible functions that the agent might need
+- The agent can develop complex workflows around your existing functions
+- Can integrate directly with existing API's
+
+Let's walk through a simple example of how to use the CodeAct Agent.
+
+**NOTE:** This example includes code that will execute arbitrary code. This is dangerous, and proper sandboxing should be used in production environments.
+
+## Setup
+
+First, let's configure the LLM we want to use, and provide some functions that we can use in our code.
+
+
+```python
+%pip install -U llama-index-core llama-index-llms-ollama
+```
+
+
+```python
+from llama_index.llms.openai import OpenAI
+
+# Configure the LLM
+llm = OpenAI(model="gpt-4o-mini", api_key="sk-...")
+
+
+# Define a few helper functions
+def add(a: int, b: int) -> int:
+    """Add two numbers together"""
+    return a + b
+
+
+def subtract(a: int, b: int) -> int:
+    """Subtract two numbers"""
+    return a - b
+
+
+def multiply(a: int, b: int) -> int:
+    """Multiply two numbers"""
+    return a * b
+
+
+def divide(a: int, b: int) -> float:
+    """Divide two numbers"""
+    return a / b
+```
+
+## Create a Code Executor 
+
+The `CodeActAgent` will require a specific `code_execute_fn` to execute the code generated by the agent.
+
+Below, we define a simple `code_execute_fn` that will execute the code in-process and maintain execution state.
+
+**NOTE:** In a production environment, you should use a more robust method of executing code. This is just for demonstration purposes, and executing code in-process is dangerous. Consider using docker or external services to execute code.
+
+With this executor, we can pass in a dictionary of local and global variables to use in the execution context.
+
+- `locals`: Local variables to use in the execution context, this includes our functions that we want the LLM to code around
+- `globals`: Global variables to use in the execution context, this includes the builtins and other imported modules we want to use in the execution context
+
+
+```python
+from typing import Any, Dict, Tuple
+import io
+import contextlib
+import ast
+import traceback
+
+
+class SimpleCodeExecutor:
+    """
+    A simple code executor that runs Python code with state persistence.
+
+    This executor maintains a global and local state between executions,
+    allowing for variables to persist across multiple code runs.
+
+    NOTE: not safe for production use! Use with caution.
+    """
+
+    def __init__(self, locals: Dict[str, Any], globals: Dict[str, Any]):
+        """
+        Initialize the code executor.
+
+        Args:
+            locals: Local variables to use in the execution context
+            globals: Global variables to use in the execution context
+        """
+        # State that persists between executions
+        self.globals = globals
+        self.locals = locals
+
+    def execute(self, code: str) -> Tuple[bool, str, Any]:
+        """
+        Execute Python code and capture output and return values.
+
+        Args:
+            code: Python code to execute
+
+        Returns:
+            Dict with keys `success`, `output`, and `return_value`
+        """
+        # Capture stdout and stderr
+        stdout = io.StringIO()
+        stderr = io.StringIO()
+
+        output = ""
+        return_value = None
+        try:
+            # Execute with captured output
+            with contextlib.redirect_stdout(
+                stdout
+            ), contextlib.redirect_stderr(stderr):
+                # Try to detect if there's a return value (last expression)
+                try:
+                    tree = ast.parse(code)
+                    last_node = tree.body[-1] if tree.body else None
+
+                    # If the last statement is an expression, capture its value
+                    if isinstance(last_node, ast.Expr):
+                        # Split code to add a return value assignment
+                        last_line = code.rstrip().split("\n")[-1]
+                        exec_code = (
+                            code[: -len(last_line)]
+                            + "\n__result__ = "
+                            + last_line
+                        )
+
+                        # Execute modified code
+                        exec(exec_code, self.globals, self.locals)
+                        return_value = self.locals.get("__result__")
+                    else:
+                        # Normal execution
+                        exec(code, self.globals, self.locals)
+                except:
+                    # If parsing fails, just execute the code as is
+                    exec(code, self.globals, self.locals)
+
+            # Get output
+            output = stdout.getvalue()
+            if stderr.getvalue():
+                output += "\n" + stderr.getvalue()
+
+        except Exception as e:
+            # Capture exception information
+            output = f"Error: {type(e).__name__}: {str(e)}\n"
+            output += traceback.format_exc()
+
+        if return_value is not None:
+            output += "\n\n" + str(return_value)
+
+        return output
+```
+
+
+```python
+code_executor = SimpleCodeExecutor(
+    # give access to our functions defined above
+    locals={
+        "add": add,
+        "subtract": subtract,
+        "multiply": multiply,
+        "divide": divide,
+    },
+    globals={
+        # give access to all builtins
+        "__builtins__": __builtins__,
+        # give access to numpy
+        "np": __import__("numpy"),
+    },
+)
+```
+
+## Setup the CodeAct Agent
+
+Now that we have our code executor, we can setup the CodeAct Agent.
+
+
+
+```python
+from llama_index.core.agent.workflow import CodeActAgent
+from llama_index.core.workflow import Context
+
+agent = CodeActAgent(
+    code_execute_fn=code_executor.execute,
+    llm=llm,
+    tools=[add, subtract, multiply, divide],
+)
+
+# context to hold the agent's session/state/chat history
+ctx = Context(agent)
+```
+
+## Use the Agent
+
+Now that we have our agent, we can use it to complete tasks! Since we gave it some math functions, we will prompt it for tasks that require calculations.
+
+
+```python
+from llama_index.core.agent.workflow import (
+    ToolCall,
+    ToolCallResult,
+    AgentStream,
+)
+
+
+async def run_agent_verbose(agent, ctx, query):
+    handler = agent.run(query, ctx=ctx)
+    print(f"User:  {query}")
+    async for event in handler.stream_events():
+        if isinstance(event, ToolCallResult):
+            print(
+                f"\n-----------\nCode execution result:\n{event.tool_output}"
+            )
+        elif isinstance(event, ToolCall):
+            print(f"\n-----------\nParsed code:\n{event.tool_kwargs['code']}")
+        elif isinstance(event, AgentStream):
+            print(f"{event.delta}", end="", flush=True)
+
+    return await handler
+```
+
+Here, the agent uses some built-in functions to calculate the sum of all numbers from 1 to 10.
+
+
+```python
+response = await run_agent_verbose(
+    agent, ctx, "Calculate the sum of all numbers from 1 to 10"
+)
+```
+
+    User:  Calculate the sum of all numbers from 1 to 10
+    The sum of all numbers from 1 to 10 can be calculated using the formula for the sum of an arithmetic series. However, I will compute it directly for you.
+    
+    <execute>
+    # Calculate the sum of numbers from 1 to 10
+    total_sum = sum(range(1, 11))
+    print(total_sum)
+    </execute>
+    -----------
+    Parsed code:
+    # Calculate the sum of numbers from 1 to 10
+    total_sum = sum(range(1, 11))
+    print(total_sum)
+    
+    -----------
+    Code execution result:
+    55
+    
+    The sum of all numbers from 1 to 10 is 55.
+
+Next, we get the agent to use the tools that we passed in.
+
+
+```python
+response = await run_agent_verbose(
+    agent, ctx, "Add 5 and 3, then multiply the result by 2"
+)
+```
+
+    User:  Add 5 and 3, then multiply the result by 2
+    I will perform the addition of 5 and 3, and then multiply the result by 2.
+    
+    <execute>
+    # Perform the calculation
+    addition_result = add(5, 3)
+    final_result = multiply(addition_result, 2)
+    print(final_result)
+    </execute>
+    -----------
+    Parsed code:
+    # Perform the calculation
+    addition_result = add(5, 3)
+    final_result = multiply(addition_result, 2)
+    print(final_result)
+    
+    -----------
+    Code execution result:
+    16
+    
+    The result of adding 5 and 3, then multiplying by 2, is 16.
+
+We can even get the agent to define new functions for us!
+
+
+```python
+response = await run_agent_verbose(
+    agent, ctx, "Calculate the sum of the first 10 fibonacci numbers"
+)
+```
+
+    User:  Calculate the sum of the first 10 fibonacci numbers
+    I will calculate the sum of the first 10 Fibonacci numbers.
+    
+    <execute>
+    def fibonacci(n):
+        fib_sequence = [0, 1]
+        for i in range(2, n):
+            fib_sequence.append(fib_sequence[-1] + fib_sequence[-2])
+        return fib_sequence
+    
+    # Calculate the sum of the first 10 Fibonacci numbers
+    first_10_fib = fibonacci(10)
+    fibonacci_sum = sum(first_10_fib)
+    print(fibonacci_sum)
+    </execute>
+    -----------
+    Parsed code:
+    def fibonacci(n):
+        fib_sequence = [0, 1]
+        for i in range(2, n):
+            fib_sequence.append(fib_sequence[-1] + fib_sequence[-2])
+        return fib_sequence
+    
+    # Calculate the sum of the first 10 Fibonacci numbers
+    first_10_fib = fibonacci(10)
+    fibonacci_sum = sum(first_10_fib)
+    print(fibonacci_sum)
+    
+    -----------
+    Code execution result:
+    88
+    
+    The sum of the first 10 Fibonacci numbers is 88.
+
+And then reuse those new functions in a new task!
+
+
+```python
+response = await run_agent_verbose(
+    agent, ctx, "Calculate the sum of the first 20 fibonacci numbers"
+)
+```
+
+    User:  Calculate the sum of the first 20 fibonacci numbers
+    I will calculate the sum of the first 20 Fibonacci numbers.
+    
+    <execute>
+    # Calculate the sum of the first 20 Fibonacci numbers
+    first_20_fib = fibonacci(20)
+    fibonacci_sum_20 = sum(first_20_fib)
+    print(fibonacci_sum_20)
+    </execute>
+    -----------
+    Parsed code:
+    # Calculate the sum of the first 20 Fibonacci numbers
+    first_20_fib = fibonacci(20)
+    fibonacci_sum_20 = sum(first_20_fib)
+    print(fibonacci_sum_20)
+    
+    -----------
+    Code execution result:
+    10945
+    
+    The sum of the first 20 Fibonacci numbers is 10,945.
@@ -1,11 +1,12 @@
 ---
 layout: recipe
-colab: https://colab.research.google.com/github/TuanaCelik/cookbooks-demo/blob/main/notebooks/agent/custom_multi_agent.ipynb
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/custom_multi_agent.ipynb
 toc: True
 title: "Custom Planning Multi-Agent System"
 featured: True
 experimental: False
 tags: ['Agent']
+language: py
 ---
 <a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/docs/examples/agent/custom_multi_agent.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

@@ -0,0 +1,505 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/from_scratch_code_act_agent.ipynb
+toc: True
+title: "Creating a CodeAct Agent From Scratch"
+featured: False
+experimental: False
+tags: ['Agent']
+language: py
+---
+While LlamaIndex provides a pre-built [CodeActAgent](https://docs.llamaindex.ai/en/stable/examples/agent/code_act_agent/), we can also create our own from scratch.
+
+This way, we can fully understand and customize the agent's behaviour beyond what is provided by the pre-built agent.
+
+In this notebook, we will
+1. Create a workflow for generating and parsing code
+2. Implement basic code execution
+3. Add memory and state to the agent
+
+## Setting up Functions for our Agent
+
+If we want our agent to execute our code, we need to deine the code for it to execute!
+
+For now, let's use a few basic math functions.
+
+
+```python
+# Define a few helper functions
+def add(a: int, b: int) -> int:
+    """Add two numbers together"""
+    return a + b
+
+
+def subtract(a: int, b: int) -> int:
+    """Subtract two numbers"""
+    return a - b
+
+
+def multiply(a: int, b: int) -> int:
+    """Multiply two numbers"""
+    return a * b
+
+
+def divide(a: int, b: int) -> float:
+    """Divide two numbers"""
+    return a / b
+```
+
+## Creating a Code Executor 
+
+In order to execute code, we need to create a code executor.
+
+Here, we will use a simple in-process code executor that maintains it's own state.
+
+**NOTE:** This is a simple example, and does not include proper sandboxing. In a production environment, you should use tools like docker or proper code sandboxing environments.
+
+
+```python
+from typing import Any, Dict, Tuple
+import io
+import contextlib
+import ast
+import traceback
+
+
+class SimpleCodeExecutor:
+    """
+    A simple code executor that runs Python code with state persistence.
+
+    This executor maintains a global and local state between executions,
+    allowing for variables to persist across multiple code runs.
+
+    NOTE: not safe for production use! Use with caution.
+    """
+
+    def __init__(self, locals: Dict[str, Any], globals: Dict[str, Any]):
+        """
+        Initialize the code executor.
+
+        Args:
+            locals: Local variables to use in the execution context
+            globals: Global variables to use in the execution context
+        """
+        # State that persists between executions
+        self.globals = globals
+        self.locals = locals
+
+    def execute(self, code: str) -> Tuple[bool, str, Any]:
+        """
+        Execute Python code and capture output and return values.
+
+        Args:
+            code: Python code to execute
+
+        Returns:
+            Dict with keys `success`, `output`, and `return_value`
+        """
+        # Capture stdout and stderr
+        stdout = io.StringIO()
+        stderr = io.StringIO()
+
+        output = ""
+        return_value = None
+        try:
+            # Execute with captured output
+            with contextlib.redirect_stdout(
+                stdout
+            ), contextlib.redirect_stderr(stderr):
+                # Try to detect if there's a return value (last expression)
+                try:
+                    tree = ast.parse(code)
+                    last_node = tree.body[-1] if tree.body else None
+
+                    # If the last statement is an expression, capture its value
+                    if isinstance(last_node, ast.Expr):
+                        # Split code to add a return value assignment
+                        last_line = code.rstrip().split("\n")[-1]
+                        exec_code = (
+                            code[: -len(last_line)]
+                            + "\n__result__ = "
+                            + last_line
+                        )
+
+                        # Execute modified code
+                        exec(exec_code, self.globals, self.locals)
+                        return_value = self.locals.get("__result__")
+                    else:
+                        # Normal execution
+                        exec(code, self.globals, self.locals)
+                except:
+                    # If parsing fails, just execute the code as is
+                    exec(code, self.globals, self.locals)
+
+            # Get output
+            output = stdout.getvalue()
+            if stderr.getvalue():
+                output += "\n" + stderr.getvalue()
+
+        except Exception as e:
+            # Capture exception information
+            output = f"Error: {type(e).__name__}: {str(e)}\n"
+            output += traceback.format_exc()
+
+        if return_value is not None:
+            output += "\n\n" + str(return_value)
+
+        return output
+```
+
+
+```python
+code_executor = SimpleCodeExecutor(
+    # give access to our functions defined above
+    locals={
+        "add": add,
+        "subtract": subtract,
+        "multiply": multiply,
+        "divide": divide,
+    },
+    globals={
+        # give access to all builtins
+        "__builtins__": __builtins__,
+        # give access to numpy
+        "np": __import__("numpy"),
+    },
+)
+```
+
+## Defining the CodeAct Agent
+
+Now, we can using LlamaIndex Workflows to define the workflow for our agent.
+
+The basic flow is:
+- take in our prompt + chat history
+- parse out the code to execute (if any)
+- execute the code
+- provide the output of the code execution back to the agent
+- repeat until the agent is satisfied with the answer
+
+First, we can create the events in the workflow.
+
+
+```python
+from llama_index.core.llms import ChatMessage
+from llama_index.core.workflow import Event
+
+
+class InputEvent(Event):
+    input: list[ChatMessage]
+
+
+class StreamEvent(Event):
+    delta: str
+
+
+class CodeExecutionEvent(Event):
+    code: str
+```
+
+Next, we can define the workflow that orchestrates using these events.
+
+
+```python
+import inspect
+import re
+from typing import Any, Callable, List
+
+from llama_index.core.llms import ChatMessage, LLM
+from llama_index.core.memory import ChatMemoryBuffer
+from llama_index.core.tools.types import BaseTool
+from llama_index.core.workflow import (
+    Context,
+    Workflow,
+    StartEvent,
+    StopEvent,
+    step,
+)
+from llama_index.llms.openai import OpenAI
+
+
+CODEACT_SYSTEM_PROMPT = """
+You are a helpful assistant that can execute code.
+
+Given the chat history, you can write code within <execute>...</execute> tags to help the user with their question.
+
+In your code, you can reference any previously used variables or functions.
+
+The user has also provided you with some predefined functions:
+{fn_str}
+
+To execute code, write the code between <execute>...</execute> tags.
+"""
+
+
+class CodeActAgent(Workflow):
+    def __init__(
+        self,
+        fns: List[Callable],
+        code_execute_fn: Callable,
+        llm: LLM | None = None,
+        **workflow_kwargs: Any,
+    ) -> None:
+        super().__init__(**workflow_kwargs)
+        self.fns = fns or []
+        self.code_execute_fn = code_execute_fn
+        self.llm = llm or OpenAI(model="gpt-4o-mini")
+
+        # parse the functions into truncated function strings
+        self.fn_str = "\n\n".join(
+            f'def {fn.__name__}{str(inspect.signature(fn))}:\n    """ {fn.__doc__} """\n    ...'
+            for fn in self.fns
+        )
+        self.system_message = ChatMessage(
+            role="system",
+            content=CODEACT_SYSTEM_PROMPT.format(fn_str=self.fn_str),
+        )
+
+    def _parse_code(self, response: str) -> str | None:
+        # find the code between <execute>...</execute> tags
+        matches = re.findall(r"<execute>(.*?)</execute>", response, re.DOTALL)
+        if matches:
+            return "\n\n".join(matches)
+
+        return None
+
+    @step
+    async def prepare_chat_history(
+        self, ctx: Context, ev: StartEvent
+    ) -> InputEvent:
+        # check if memory is setup
+        memory = await ctx.store.get("memory", default=None)
+        if not memory:
+            memory = ChatMemoryBuffer.from_defaults(llm=self.llm)
+
+        # get user input
+        user_input = ev.get("user_input")
+        if user_input is None:
+            raise ValueError("user_input kwarg is required")
+        user_msg = ChatMessage(role="user", content=user_input)
+        memory.put(user_msg)
+
+        # get chat history
+        chat_history = memory.get()
+
+        # update context
+        await ctx.store.set("memory", memory)
+
+        # add the system message to the chat history and return
+        return InputEvent(input=[self.system_message, *chat_history])
+
+    @step
+    async def handle_llm_input(
+        self, ctx: Context, ev: InputEvent
+    ) -> CodeExecutionEvent | StopEvent:
+        chat_history = ev.input
+
+        # stream the response
+        response_stream = await self.llm.astream_chat(chat_history)
+        async for response in response_stream:
+            ctx.write_event_to_stream(StreamEvent(delta=response.delta or ""))
+
+        # save the final response, which should have all content
+        memory = await ctx.store.get("memory")
+        memory.put(response.message)
+        await ctx.store.set("memory", memory)
+
+        # get the code to execute
+        code = self._parse_code(response.message.content)
+
+        if not code:
+            return StopEvent(result=response)
+        else:
+            return CodeExecutionEvent(code=code)
+
+    @step
+    async def handle_code_execution(
+        self, ctx: Context, ev: CodeExecutionEvent
+    ) -> InputEvent:
+        # execute the code
+        ctx.write_event_to_stream(ev)
+        output = self.code_execute_fn(ev.code)
+
+        # update the memory
+        memory = await ctx.store.get("memory")
+        memory.put(ChatMessage(role="assistant", content=output))
+        await ctx.store.set("memory", memory)
+
+        # get the latest chat history and loop back to the start
+        chat_history = memory.get()
+        return InputEvent(input=[self.system_message, *chat_history])
+```
+
+## Testing the CodeAct Agent
+
+Now, we can test out the CodeAct Agent!
+
+We'll create a simple agent and slowly build up the complexity with requests.
+
+
+```python
+from llama_index.core.workflow import Context
+
+agent = CodeActAgent(
+    fns=[add, subtract, multiply, divide],
+    code_execute_fn=code_executor.execute,
+    llm=OpenAI(model="gpt-4o-mini", api_key="sk-..."),
+)
+
+# context to hold the agent's state / memory
+ctx = Context(agent)
+```
+
+
+```python
+async def run_agent_verbose(agent: CodeActAgent, ctx: Context, query: str):
+    handler = agent.run(user_input=query, ctx=ctx)
+    print(f"User:  {query}")
+    async for event in handler.stream_events():
+        if isinstance(event, StreamEvent):
+            print(f"{event.delta}", end="", flush=True)
+        elif isinstance(event, CodeExecutionEvent):
+            print(f"\n-----------\nParsed code:\n{event.code}\n")
+
+    return await handler
+```
+
+
+```python
+response = await run_agent_verbose(
+    agent, ctx, "Calculate the sum of all numbers from 1 to 10"
+)
+```
+
+    User:  Calculate the sum of all numbers from 1 to 10
+    To calculate the sum of all numbers from 1 to 10, we can use the `add` function in a loop. Here's how we can do it:
+    
+    <execute>
+    total_sum = 0
+    for number in range(1, 11):
+        total_sum = add(total_sum, number)
+    total_sum
+    </execute>
+    -----------
+    Parsed code:
+    
+    total_sum = 0
+    for number in range(1, 11):
+        total_sum = add(total_sum, number)
+    total_sum
+    
+    
+    The sum of all numbers from 1 to 10 is 55.
+
+
+```python
+response = await run_agent_verbose(
+    agent, ctx, "Add 5 and 3, then multiply the result by 2"
+)
+```
+
+    User:  Add 5 and 3, then multiply the result by 2
+    To perform the calculation, we will first add 5 and 3 using the `add` function, and then multiply the result by 2 using the `multiply` function. Here's how we can do it:
+    
+    <execute>
+    result_addition = add(5, 3)
+    final_result = multiply(result_addition, 2)
+    final_result
+    </execute>
+    -----------
+    Parsed code:
+    
+    result_addition = add(5, 3)
+    final_result = multiply(result_addition, 2)
+    final_result
+    
+    
+    The final result of adding 5 and 3, then multiplying by 2, is 16.
+
+
+```python
+response = await run_agent_verbose(
+    agent, ctx, "Calculate the sum of the first 10 fibonacci numbers0"
+)
+```
+
+    User:  Calculate the sum of the first 10 fibonacci numbers0
+    To calculate the sum of the first 10 Fibonacci numbers, we first need to generate the Fibonacci sequence up to the 10th number and then sum those numbers. The Fibonacci sequence starts with 0 and 1, and each subsequent number is the sum of the two preceding ones.
+    
+    Here's how we can do it:
+    
+    <execute>
+    def fibonacci(n: int) -> int:
+        """ Return the nth Fibonacci number """
+        if n == 0:
+            return 0
+        elif n == 1:
+            return 1
+        else:
+            a, b = 0, 1
+            for _ in range(2, n + 1):
+                a, b = b, a + b
+            return b
+    
+    # Calculate the sum of the first 10 Fibonacci numbers
+    fibonacci_sum = 0
+    for i in range(10):
+        fibonacci_sum = add(fibonacci_sum, fibonacci(i))
+    
+    fibonacci_sum
+    </execute>
+    -----------
+    Parsed code:
+    
+    def fibonacci(n: int) -> int:
+        """ Return the nth Fibonacci number """
+        if n == 0:
+            return 0
+        elif n == 1:
+            return 1
+        else:
+            a, b = 0, 1
+            for _ in range(2, n + 1):
+                a, b = b, a + b
+            return b
+    
+    # Calculate the sum of the first 10 Fibonacci numbers
+    fibonacci_sum = 0
+    for i in range(10):
+        fibonacci_sum = add(fibonacci_sum, fibonacci(i))
+    
+    fibonacci_sum
+    
+    
+    The sum of the first 10 Fibonacci numbers is 55.
+
+
+```python
+response = await run_agent_verbose(
+    agent, ctx, "Calculate the sum of the first 20 fibonacci numbers"
+)
+```
+
+    User:  Calculate the sum of the first 20 fibonacci numbers
+    To calculate the sum of the first 20 Fibonacci numbers, we can use the same approach as before, but this time we will iterate up to 20. Here's how we can do it:
+    
+    <execute>
+    # Calculate the sum of the first 20 Fibonacci numbers
+    fibonacci_sum_20 = 0
+    for i in range(20):
+        fibonacci_sum_20 = add(fibonacci_sum_20, fibonacci(i))
+    
+    fibonacci_sum_20
+    </execute>
+    -----------
+    Parsed code:
+    
+    # Calculate the sum of the first 20 Fibonacci numbers
+    fibonacci_sum_20 = 0
+    for i in range(20):
+        fibonacci_sum_20 = add(fibonacci_sum_20, fibonacci(i))
+    
+    fibonacci_sum_20
+    
+    
+    The sum of the first 20 Fibonacci numbers is 6765.
@@ -0,0 +1,186 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/mistral_agent.ipynb
+toc: True
+title: "Function Calling Mistral Agent"
+featured: False
+experimental: False
+tags: ['Agent', 'Integrations']
+language: py
+---
+This notebook shows you how to use our Mistral agent, powered by function calling capabilities.
+
+## Initial Setup 
+
+Let's start by importing some simple building blocks.  
+
+The main thing we need is:
+1. the OpenAI API (using our own `llama_index` LLM class)
+2. a place to keep conversation history 
+3. a definition for tools that our agent can use.
+
+If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.
+
+
+
+```python
+%pip install llama-index
+%pip install llama-index-llms-mistralai
+%pip install llama-index-embeddings-mistralai
+```
+
+Let's define some very simple calculator tools for our agent.
+
+
+```python
+def multiply(a: int, b: int) -> int:
+    """Multiple two integers and returns the result integer"""
+    return a * b
+
+
+def add(a: int, b: int) -> int:
+    """Add two integers and returns the result integer"""
+    return a + b
+```
+
+Make sure your MISTRAL_API_KEY is set. Otherwise explicitly specify the `api_key` parameter.
+
+
+```python
+from llama_index.llms.mistralai import MistralAI
+
+llm = MistralAI(model="mistral-large-latest", api_key="...")
+```
+
+## Initialize Mistral Agent
+
+Here we initialize a simple Mistral agent with calculator functions.
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent
+
+agent = FunctionAgent(
+    tools=[multiply, add],
+    llm=llm,
+)
+```
+
+### Chat
+
+
+```python
+response = await agent.run("What is (121 + 2) * 5?")
+print(str(response))
+```
+
+    Added user message to memory: What is (121 + 2) * 5?
+    === Calling Function ===
+    Calling function: add with args: {"a": 121, "b": 2}
+    === Calling Function ===
+    Calling function: multiply with args: {"a": 123, "b": 5}
+    assistant: The result of (121 + 2) * 5 is 615.
+
+
+
+```python
+# inspect sources
+print(response.tool_calls)
+```
+
+### Managing Context/Memory
+
+By default, `.run()` is stateless. If you want to maintain state, you can pass in a `context` object.
+
+
+```python
+from llama_index.core.workflow import Context
+
+ctx = Context(agent)
+
+response = await agent.run("My name is John Doe", ctx=ctx)
+response = await agent.run("What is my name?", ctx=ctx)
+
+print(str(response))
+```
+
+## Mistral Agent over RAG Pipeline
+
+Build a Mistral agent over a simple 10K document. We use both Mistral embeddings and mistral-medium to construct the RAG pipeline, and pass it to the Mistral agent as a tool.
+
+
+```python
+!mkdir -p 'data/10k/'
+!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/uber_2021.pdf' -O 'data/10k/uber_2021.pdf'
+```
+
+
+```python
+from llama_index.core.tools import QueryEngineTool
+from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
+from llama_index.embeddings.mistralai import MistralAIEmbedding
+from llama_index.llms.mistralai import MistralAI
+
+embed_model = MistralAIEmbedding(api_key="...")
+query_llm = MistralAI(model="mistral-medium", api_key="...")
+
+# load data
+uber_docs = SimpleDirectoryReader(
+    input_files=["./data/10k/uber_2021.pdf"]
+).load_data()
+# build index
+uber_index = VectorStoreIndex.from_documents(
+    uber_docs, embed_model=embed_model
+)
+uber_engine = uber_index.as_query_engine(similarity_top_k=3, llm=query_llm)
+query_engine_tool = QueryEngineTool.from_defaults(
+    query_engine=uber_engine,
+    name="uber_10k",
+    description=(
+        "Provides information about Uber financials for year 2021. "
+        "Use a detailed plain text question as input to the tool."
+    ),
+)
+```
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent
+
+agent = FunctionAgent(tools=[query_engine_tool], llm=llm)
+```
+
+
+```python
+response = await agent.run(
+    "Tell me both the risk factors and tailwinds for Uber? Do two parallel tool calls."
+)
+print(str(response))
+```
+
+    Added user message to memory: Tell me both the risk factors and tailwinds for Uber? Do two parallel tool calls.
+    === Calling Function ===
+    Calling function: uber_10k with args: {"input": "What are the risk factors for Uber in 2021?"}
+    === Calling Function ===
+    Calling function: uber_10k with args: {"input": "What are the tailwinds for Uber in 2021?"}
+    assistant: Based on the information provided, here are the risk factors for Uber in 2021:
+    
+    1. Failure to offer or develop autonomous vehicle technologies, which could result in inferior performance or safety concerns compared to competitors.
+    2. Dependence on high-quality personnel and the potential impact of attrition or unsuccessful succession planning on the business.
+    3. Security or data privacy breaches, unauthorized access, or destruction of proprietary, employee, or user data.
+    4. Cyberattacks, such as malware, ransomware, viruses, spamming, and phishing attacks, which could harm the company's reputation and operations.
+    5. Climate change risks, including physical and transitional risks, that may adversely impact the business if not managed effectively.
+    6. Reliance on third parties to maintain open marketplaces for distributing products and providing software, which could negatively affect the business if interfered with.
+    7. The need for additional capital to support business growth, which may not be available on reasonable terms or at all.
+    8. Difficulties in identifying, acquiring, and integrating suitable businesses, which could harm operating results and prospects.
+    9. Legal and regulatory risks, including extensive government regulation and oversight related to payment and financial services.
+    10. Intellectual property risks, such as the inability to protect intellectual property or claims of misappropriation by third parties.
+    11. Volatility in the market price of common stock, which could result in steep declines and loss of investment for shareholders.
+    12. Economic risks related to the COVID-19 pandemic, which has adversely impacted and could continue to adversely impact the business, financial condition, and results of operations.
+    13. The potential reclassification of Drivers as employees, workers, or quasi-employees, which could result in material costs associated with defending, settling, or resolving lawsuits and demands for arbitration.
+    
+    On the other hand, here are some tailwinds for Uber in 2021:
+    
+    1. Launch of Uber One, a single cross-platform membership program in the United States, which offers discounts, special pricing, priority service, and exclusive perks across rides, delivery, and grocery offerings.
+    2. Introduction of a "Super App" view on iOS
+
@@ -0,0 +1,706 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/multi_agent_workflow_with_weaviate_queryagent.ipynb
+toc: True
+title: "Multi-Agent Workflow with Weaviate QueryAgent"
+featured: False
+experimental: False
+tags: ['Agent', 'Integrations']
+language: py
+---
+<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/docs/examples/agent/multi_agent_workflow_with_weaviate_queryagent.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
+
+In this example, we will be building a LlamaIndex Agent Workflow that ends up being a multi-agent system that aims to be a Docs Assistant capable of:
+- Writing new content to a "LlamaIndexDocs" collection in Weaviate
+- Writing new content to a "WeaviateDocs" collection in Weaviate
+- Using the Weaviate [`QueryAgent`](https://weaviate.io/developers/agents/query) to answer questions based on the contents of these collections.
+
+The `QueryAgent` is a full agent prodcut by Weaviate, that is capable of doing regular search, as well as aggregations over the collections you give it access to. Our 'orchestrator' agent will decide when to invoke the Weaviate QueryAgent, leaving the job of creating Weaviate specific search queries to it.
+
+**Things you will need:**
+
+- An OpenAI API key (or switch to another provider and adjust the code below)
+- A Weaviate sandbox (this is free)
+- Your Weaviate sandbox URL and API key
+
+![Workflow Overview](../_static/agents/workflow-weaviate-multiagent.png)
+
+## Install & Import Dependencies
+
+
+```python
+!pip install llama-index-core llama-index-utils-workflow weaviate-client[agents] llama-index-llms-openai llama-index-readers-web
+```
+
+
+```python
+from llama_index.core.workflow import (
+    StartEvent,
+    StopEvent,
+    Workflow,
+    step,
+    Event,
+    Context,
+)
+from llama_index.utils.workflow import draw_all_possible_flows
+from llama_index.readers.web import SimpleWebPageReader
+from llama_index.core.llms import ChatMessage
+from llama_index.core.tools import FunctionTool
+from llama_index.llms.openai import OpenAI
+from llama_index.core.agent.workflow import FunctionAgent
+
+from enum import Enum
+from pydantic import BaseModel, Field
+from llama_index.llms.openai import OpenAI
+from typing import List, Union
+import json
+
+import weaviate
+from weaviate.auth import Auth
+from weaviate.agents.query import QueryAgent
+from weaviate.classes.config import Configure, Property, DataType
+
+import os
+from getpass import getpass
+```
+
+## Set up Weaviate
+
+To use the Weaviate Query Agent, first, create a [Weaviate Cloud](https://weaviate.io/deployment/serverless) account👇
+1. [Create Serverless Weaviate Cloud account](https://weaviate.io/deployment/serverless) and set up a free [Sandbox](https://weaviate.io/developers/wcs/manage-clusters/create#sandbox-clusters)
+2. Go to 'Embedding' and enable it, by default, this will make it so that we use `Snowflake/snowflake-arctic-embed-l-v2.0` as the embedding model
+3. Take note of the `WEAVIATE_URL` and `WEAVIATE_API_KEY` to connect to your cluster below
+
+> Info: We recommend using [Weaviate Embeddings](https://weaviate.io/developers/weaviate/model-providers/weaviate) so you do not have to provide any extra keys for external embedding providers.
+
+
+```python
+if "WEAVIATE_API_KEY" not in os.environ:
+    os.environ["WEAVIATE_API_KEY"] = getpass("Add Weaviate API Key")
+if "WEAVIATE_URL" not in os.environ:
+    os.environ["WEAVIATE_URL"] = getpass("Add Weaviate URL")
+```
+
+
+```python
+client = weaviate.connect_to_weaviate_cloud(
+    cluster_url=os.environ.get("WEAVIATE_URL"),
+    auth_credentials=Auth.api_key(os.environ.get("WEAVIATE_API_KEY")),
+)
+```
+
+### Create WeaviateDocs and LlamaIndexDocs Collections
+
+The helper function below will create a "WeaviateDocs" and "LlamaIndexDocs" collection in Weaviate (if they don't exist already). It will also set up a `QueryAgent` that has access to both of these collections.
+
+The Weaviate [`QueryAgent`](https://weaviate.io/blog/query-agent) is designed to be able to query Weviate Collections for both regular search and aggregations, and also handles the burden of creating the Weaviate specific queries internally.
+
+The Agent will use the collection descriptions, as well as the property descriptions while formilating the queries.
+
+
+```python
+def fresh_setup_weaviate(client):
+    if client.collections.exists("WeaviateDocs"):
+        client.collections.delete("WeaviateDocs")
+    client.collections.create(
+        "WeaviateDocs",
+        description="A dataset with the contents of Weaviate technical Docs and website",
+        vectorizer_config=Configure.Vectorizer.text2vec_weaviate(),
+        properties=[
+            Property(
+                name="url",
+                data_type=DataType.TEXT,
+                description="the source URL of the webpage",
+            ),
+            Property(
+                name="text",
+                data_type=DataType.TEXT,
+                description="the content of the webpage",
+            ),
+        ],
+    )
+
+    if client.collections.exists("LlamaIndexDocs"):
+        client.collections.delete("LlamaIndexDocs")
+    client.collections.create(
+        "LlamaIndexDocs",
+        description="A dataset with the contents of LlamaIndex technical Docs and website",
+        vectorizer_config=Configure.Vectorizer.text2vec_weaviate(),
+        properties=[
+            Property(
+                name="url",
+                data_type=DataType.TEXT,
+                description="the source URL of the webpage",
+            ),
+            Property(
+                name="text",
+                data_type=DataType.TEXT,
+                description="the content of the webpage",
+            ),
+        ],
+    )
+
+    agent = QueryAgent(
+        client=client, collections=["LlamaIndexDocs", "WeaviateDocs"]
+    )
+    return agent
+```
+
+### Write Contents of Webpage to the Collections
+
+The helper function below uses the `SimpleWebPageReader` to write the contents of a webpage to the relevant Weaviate collection
+
+
+```python
+def write_webpages_to_weaviate(client, urls: list[str], collection_name: str):
+    documents = SimpleWebPageReader(html_to_text=True).load_data(urls)
+    collection = client.collections.get(collection_name)
+    with collection.batch.dynamic() as batch:
+        for doc in documents:
+            batch.add_object(properties={"url": doc.id_, "text": doc.text})
+```
+
+## Create a Function Calling Agent
+
+Now that we have the relevant functions to write to a collection and also the `QueryAgent` at hand, we can start by using the `FunctionAgent`, which is a simple tool calling agent.
+
+
+```python
+if "OPENAI_API_KEY" not in os.environ:
+    os.environ["OPENAI_API_KEY"] = getpass("openai-key")
+```
+
+
+```python
+weaviate_agent = fresh_setup_weaviate(client)
+```
+
+
+```python
+llm = OpenAI(model="gpt-4o-mini")
+
+
+def write_to_weaviate_collection(urls=list[str]):
+    """Useful for writing new content to the WeaviateDocs collection"""
+    write_webpages_to_weaviate(client, urls, "WeaviateDocs")
+
+
+def write_to_li_collection(urls=list[str]):
+    """Useful for writing new content to the LlamaIndexDocs collection"""
+    write_webpages_to_weaviate(client, urls, "LlamaIndexDocs")
+
+
+def query_agent(query: str) -> str:
+    """Useful for asking questions about Weaviate and LlamaIndex"""
+    response = weaviate_agent.run(query)
+    return response.final_answer
+
+
+agent = FunctionAgent(
+    tools=[write_to_weaviate_collection, write_to_li_collection, query_agent],
+    llm=llm,
+    system_prompt="""You are a helpful assistant that can write the
+      contents of urls to WeaviateDocs and LlamaIndexDocs collections,
+      as well as forwarding questions to a QueryAgent""",
+)
+```
+
+
+```python
+response = await agent.run(
+    user_msg="Can you save https://docs.llamaindex.ai/en/stable/examples/agent/agent_workflow_basic/"
+)
+print(str(response))
+```
+
+
+```python
+response = await agent.run(
+    user_msg="""What are llama index workflows? And can you save
+    these to weaviate docs: https://weaviate.io/blog/what-are-agentic-workflows
+    and https://weaviate.io/blog/ai-agents"""
+)
+print(str(response))
+```
+
+    Llama Index workflows refer to orchestrations involving one or more AI agents within the LlamaIndex framework. These workflows manage complex tasks dynamically by leveraging components such as large language models (LLMs), tools, and memory states. Key features of Llama Index workflows include:
+    
+    - Support for single or multiple agents managed within an AgentWorkflow orchestrator.
+    - Ability to maintain state across runs via serializable context objects.
+    - Integration of external tools with type annotations, including asynchronous functions.
+    - Streaming of intermediate outputs and event-based interactions.
+    - Human-in-the-loop capabilities to confirm or guide agent actions during workflow execution.
+    
+    These workflows enable agents to execute sequences of operations, call external tools asynchronously, maintain conversation or task states, stream partial results, and incorporate human inputs when necessary. They embody dynamic, agent-driven sequences of task decomposition, tool use, and reflection, allowing AI systems to plan, act, and improve iteratively toward specific goals.
+    
+    I have also saved the contents from the provided URLs to the WeaviateDocs collection.
+
+
+
+```python
+response = await agent.run(
+    user_msg="How many docs do I have in the weaviate and llamaindex collections in total?"
+)
+print(str(response))
+```
+
+    You have a total of 2 documents in the WeaviateDocs collection and 1 document in the LlamaIndexDocs collection. In total, that makes 3 documents across both collections.
+
+
+
+```python
+weaviate_agent = fresh_setup_weaviate(client)
+```
+
+## Create a Workflow with Branches
+
+### Simple Example: Create Events
+
+A LlamaIndex Workflow has 2 fundamentals:
+- An Event
+- A Step
+
+An step may return an event, and an event may trigger a step!
+
+For our use-case, we can imagine thet there are 4 events:
+
+
+```python
+class EvaluateQuery(Event):
+    query: str
+
+
+class WriteLlamaIndexDocsEvent(Event):
+    urls: list[str]
+
+
+class WriteWeaviateDocsEvent(Event):
+    urls: list[str]
+
+
+class QueryAgentEvent(Event):
+    query: str
+```
+
+### Simple Example: A Branching Workflow (that does nothing yet)
+
+
+```python
+class DocsAssistantWorkflow(Workflow):
+    @step
+    async def start(self, ctx: Context, ev: StartEvent) -> EvaluateQuery:
+        return EvaluateQuery(query=ev.query)
+
+    @step
+    async def evaluate_query(
+        self, ctx: Context, ev: EvaluateQuery
+    ) -> QueryAgentEvent | WriteLlamaIndexDocsEvent | WriteWeaviateDocsEvent | StopEvent:
+        if ev.query == "llama":
+            return WriteLlamaIndexDocsEvent(urls=[ev.query])
+        if ev.query == "weaviate":
+            return WriteWeaviateDocsEvent(urls=[ev.query])
+        if ev.query == "question":
+            return QueryAgentEvent(query=ev.query)
+        return StopEvent()
+
+    @step
+    async def write_li_docs(
+        self, ctx: Context, ev: WriteLlamaIndexDocsEvent
+    ) -> StopEvent:
+        print(f"Got a request to write something to LlamaIndexDocs")
+        return StopEvent()
+
+    @step
+    async def write_weaviate_docs(
+        self, ctx: Context, ev: WriteWeaviateDocsEvent
+    ) -> StopEvent:
+        print(f"Got a request to write something to WeaviateDocs")
+        return StopEvent()
+
+    @step
+    async def query_agent(
+        self, ctx: Context, ev: QueryAgentEvent
+    ) -> StopEvent:
+        print(f"Got a request to forward a query to the QueryAgent")
+        return StopEvent()
+```
+
+
+```python
+workflow_that_does_nothing = DocsAssistantWorkflow()
+
+# draw_all_possible_flows(workflow_that_does_nothing)
+```
+
+
+```python
+print(
+    await workflow_that_does_nothing.run(start_event=StartEvent(query="llama"))
+)
+```
+
+    Got a request to write something to LlamaIndexDocs
+    None
+
+
+### Classify the Query with Structured Outputs
+
+
+```python
+class SaveToLlamaIndexDocs(BaseModel):
+    """The URLs to parse and save into a llama-index specific docs collection."""
+
+    llama_index_urls: List[str] = Field(default_factory=list)
+
+
+class SaveToWeaviateDocs(BaseModel):
+    """The URLs to parse and save into a weaviate specific docs collection."""
+
+    weaviate_urls: List[str] = Field(default_factory=list)
+
+
+class Ask(BaseModel):
+    """The natural language questions that can be asked to a Q&A agent."""
+
+    queries: List[str] = Field(default_factory=list)
+
+
+class Actions(BaseModel):
+    """Actions to take based on the latest user message."""
+
+    actions: List[
+        Union[SaveToLlamaIndexDocs, SaveToWeaviateDocs, Ask]
+    ] = Field(default_factory=list)
+```
+
+#### Create a Workflow
+
+Let's create a workflow that, still, does nothing, but the incoming user query will be converted to our structure. Based on the contents of that structure, the workflow will decide which step to run.
+
+Notice how whichever step runs first, will return a `StopEvent`... This is good, but maybe we can improve that later!
+
+
+```python
+from llama_index.llms.openai import OpenAIResponses
+
+
+class DocsAssistantWorkflow(Workflow):
+    def __init__(self, *args, **kwargs):
+        self.llm = OpenAIResponses(model="gpt-4.1-mini")
+        self.system_prompt = """You are a docs assistant. You evaluate incoming queries and break them down to subqueries when needed.
+                          You decide on the next best course of action. Overall, here are the options:
+                          - You can write the contents of a URL to llamaindex docs (if it's a llamaindex url)
+                          - You can write the contents of a URL to weaviate docs (if it's a weaviate url)
+                          - You can answer a question about llamaindex and weaviate using the QueryAgent"""
+        super().__init__(*args, **kwargs)
+
+    @step
+    async def start(self, ev: StartEvent) -> EvaluateQuery:
+        return EvaluateQuery(query=ev.query)
+
+    @step
+    async def evaluate_query(
+        self, ev: EvaluateQuery
+    ) -> QueryAgentEvent | WriteLlamaIndexDocsEvent | WriteWeaviateDocsEvent:
+        sllm = self.llm.as_structured_llm(Actions)
+        response = await sllm.achat(
+            [
+                ChatMessage(role="system", content=self.system_prompt),
+                ChatMessage(role="user", content=ev.query),
+            ]
+        )
+        actions = response.raw.actions
+        print(actions)
+        for action in actions:
+            if isinstance(action, SaveToLlamaIndexDocs):
+                return WriteLlamaIndexDocsEvent(urls=action.llama_index_urls)
+            elif isinstance(action, SaveToWeaviateDocs):
+                return WriteWeaviateDocsEvent(urls=action.weaviate_urls)
+            elif isinstance(action, Ask):
+                for query in action.queries:
+                    return QueryAgentEvent(query=query)
+
+    @step
+    async def write_li_docs(self, ev: WriteLlamaIndexDocsEvent) -> StopEvent:
+        print(f"Writing {ev.urls} to LlamaIndex Docs")
+        return StopEvent()
+
+    @step
+    async def write_weaviate_docs(
+        self, ev: WriteWeaviateDocsEvent
+    ) -> StopEvent:
+        print(f"Writing {ev.urls} to Weaviate Docs")
+        return StopEvent()
+
+    @step
+    async def query_agent(self, ev: QueryAgentEvent) -> StopEvent:
+        print(f"Sending `'{ev.query}`' to agent")
+        return StopEvent()
+
+
+everything_docs_agent_beta = DocsAssistantWorkflow()
+```
+
+
+```python
+async def run_docs_agent_beta(query: str):
+    print(
+        await everything_docs_agent_beta.run(
+            start_event=StartEvent(query=query)
+        )
+    )
+```
+
+
+```python
+await run_docs_agent_beta(
+    """Can you save https://www.llamaindex.ai/blog/get-citations-and-reasoning-for-extracted-data-in-llamaextract
+    and https://www.llamaindex.ai/blog/llamaparse-update-may-2025-new-models-skew-detection-and-more??"""
+)
+```
+
+    [SaveToLlamaIndexDocs(llama_index_urls=['https://www.llamaindex.ai/blog/get-citations-and-reasoning-for-extracted-data-in-llamaextract', 'https://www.llamaindex.ai/blog/llamaparse-update-may-2025-new-models-skew-detection-and-more'])]
+    Writing ['https://www.llamaindex.ai/blog/get-citations-and-reasoning-for-extracted-data-in-llamaextract', 'https://www.llamaindex.ai/blog/llamaparse-update-may-2025-new-models-skew-detection-and-more'] to LlamaIndex Docs
+    None
+
+
+
+```python
+await run_docs_agent_beta(
+    "How many documents do we have in the LlamaIndexDocs collection now?"
+)
+```
+
+    [Ask(queries=['How many documents are in the LlamaIndexDocs collection?'])]
+    Sending `'How many documents are in the LlamaIndexDocs collection?`' to agent
+    None
+
+
+
+```python
+await run_docs_agent_beta("What are LlamaIndex workflows?")
+```
+
+    [Ask(queries=['What are LlamaIndex workflows?'])]
+    Sending `'What are LlamaIndex workflows?`' to agent
+    None
+
+
+
+```python
+await run_docs_agent_beta(
+    "Can you save https://weaviate.io/blog/graph-rag and https://weaviate.io/blog/genai-apps-with-weaviate-and-databricks??"
+)
+```
+
+    [SaveToWeaviateDocs(weaviate_urls=['https://weaviate.io/blog/graph-rag', 'https://weaviate.io/blog/genai-apps-with-weaviate-and-databricks'])]
+    Writing ['https://weaviate.io/blog/graph-rag', 'https://weaviate.io/blog/genai-apps-with-weaviate-and-databricks'] to Weaviate Docs
+    None
+
+
+## Run Multiple Branches & Put it all togehter
+
+In these cases, it makes sense to run multiple branches. So, a single step can trigger multiple events at once! We can `send_event` via the context 👇
+
+
+```python
+class ActionCompleted(Event):
+    result: str
+
+
+class DocsAssistantWorkflow(Workflow):
+    def __init__(self, *args, **kwargs):
+        self.llm = OpenAIResponses(model="gpt-4.1-mini")
+        self.system_prompt = """You are a docs assistant. You evaluate incoming queries and break them down to subqueries when needed.
+                      You decide on the next best course of action. Overall, here are the options:
+                      - You can write the contents of a URL to llamaindex docs (if it's a llamaindex url)
+                      - You can write the contents of a URL to weaviate docs (if it's a weaviate url)
+                      - You can answer a question about llamaindex and weaviate using the QueryAgent"""
+        super().__init__(*args, **kwargs)
+
+    @step
+    async def start(self, ctx: Context, ev: StartEvent) -> EvaluateQuery:
+        return EvaluateQuery(query=ev.query)
+
+    @step
+    async def evaluate_query(
+        self, ctx: Context, ev: EvaluateQuery
+    ) -> QueryAgentEvent | WriteLlamaIndexDocsEvent | WriteWeaviateDocsEvent | None:
+        await ctx.store.set("results", [])
+        sllm = self.llm.as_structured_llm(Actions)
+        response = await sllm.achat(
+            [
+                ChatMessage(role="system", content=self.system_prompt),
+                ChatMessage(role="user", content=ev.query),
+            ]
+        )
+        actions = response.raw.actions
+        await ctx.store.set("num_events", len(actions))
+        await ctx.store.set("results", [])
+        print(actions)
+        for action in actions:
+            if isinstance(action, SaveToLlamaIndexDocs):
+                ctx.send_event(
+                    WriteLlamaIndexDocsEvent(urls=action.llama_index_urls)
+                )
+            elif isinstance(action, SaveToWeaviateDocs):
+                ctx.send_event(
+                    WriteWeaviateDocsEvent(urls=action.weaviate_urls)
+                )
+            elif isinstance(action, Ask):
+                for query in action.queries:
+                    ctx.send_event(QueryAgentEvent(query=query))
+
+    @step
+    async def write_li_docs(
+        self, ctx: Context, ev: WriteLlamaIndexDocsEvent
+    ) -> ActionCompleted:
+        print(f"Writing {ev.urls} to LlamaIndex Docs")
+        write_webpages_to_weaviate(
+            client, urls=ev.urls, collection_name="LlamaIndexDocs"
+        )
+        results = await ctx.store.get("results")
+        results.append(f"Wrote {ev.urls} it LlamaIndex Docs")
+        return ActionCompleted(result=f"Writing {ev.urls} to LlamaIndex Docs")
+
+    @step
+    async def write_weaviate_docs(
+        self, ctx: Context, ev: WriteWeaviateDocsEvent
+    ) -> ActionCompleted:
+        print(f"Writing {ev.urls} to Weaviate Docs")
+        write_webpages_to_weaviate(
+            client, urls=ev.urls, collection_name="WeaviateDocs"
+        )
+        results = await ctx.store.get("results")
+        results.append(f"Wrote {ev.urls} it Weavite Docs")
+        return ActionCompleted(result=f"Writing {ev.urls} to Weaviate Docs")
+
+    @step
+    async def query_agent(
+        self, ctx: Context, ev: QueryAgentEvent
+    ) -> ActionCompleted:
+        print(f"Sending {ev.query} to agent")
+        response = weaviate_agent.run(ev.query)
+        results = await ctx.store.get("results")
+        results.append(f"QueryAgent responded with:\n {response.final_answer}")
+        return ActionCompleted(result=f"Sending `'{ev.query}`' to agent")
+
+    @step
+    async def collect(
+        self, ctx: Context, ev: ActionCompleted
+    ) -> StopEvent | None:
+        num_events = await ctx.store.get("num_events")
+        evs = ctx.collect_events(ev, [ActionCompleted] * num_events)
+        if evs is None:
+            return None
+        return StopEvent(result=[ev.result for ev in evs])
+
+
+everything_docs_agent = DocsAssistantWorkflow(timeout=None)
+```
+
+
+```python
+async def run_docs_agent(query: str):
+    handler = everything_docs_agent.run(start_event=StartEvent(query=query))
+    result = await handler
+    for response in await handler.ctx.get("results"):
+        print(response)
+```
+
+
+```python
+await run_docs_agent(
+    "Can you save https://docs.llamaindex.ai/en/stable/understanding/workflows/ and https://docs.llamaindex.ai/en/stable/understanding/workflows/branches_and_loops/"
+)
+```
+
+    [SaveToLlamaIndexDocs(llama_index_urls=['https://docs.llamaindex.ai/en/stable/understanding/workflows/']), SaveToLlamaIndexDocs(llama_index_urls=['https://docs.llamaindex.ai/en/stable/understanding/workflows/branches_and_loops/'])]
+    Writing ['https://docs.llamaindex.ai/en/stable/understanding/workflows/'] to LlamaIndex Docs
+    Writing ['https://docs.llamaindex.ai/en/stable/understanding/workflows/branches_and_loops/'] to LlamaIndex Docs
+    Wrote ['https://docs.llamaindex.ai/en/stable/understanding/workflows/'] it LlamaIndex Docs
+    Wrote ['https://docs.llamaindex.ai/en/stable/understanding/workflows/branches_and_loops/'] it LlamaIndex Docs
+
+
+
+```python
+await run_docs_agent(
+    "How many documents do we have in the LlamaIndexDocs collection now?"
+)
+```
+
+    [Ask(queries=['How many documents are in the LlamaIndexDocs collection?'])]
+    Sending How many documents are in the LlamaIndexDocs collection? to agent
+    QueryAgent responded with:
+     The LlamaIndexDocs collection contains 2 documents, specifically related to workflows and branches and loops within the documentation.
+
+
+
+```python
+await run_docs_agent(
+    "What are LlamaIndex workflows? And can you save https://weaviate.io/blog/graph-rag"
+)
+```
+
+    [Ask(queries=['What are LlamaIndex workflows?'])]
+    Sending What are LlamaIndex workflows? to agent
+    QueryAgent responded with:
+     LlamaIndex workflows are an event-driven, step-based framework designed to control and manage the execution flow of complex applications, particularly those involving generative AI. They break an application into discrete Steps, each triggered by Events and capable of emitting further Events, allowing for complex logic involving loops, branches, and parallel execution.
+    
+    In a LlamaIndex workflow, steps perform functions ranging from simple tasks to complex agents, with inputs and outputs communicated via Events. This event-driven model facilitates maintainability and clarity, overcoming limitations of previous approaches like directed acyclic graphs (DAGs) which struggled with complex flows involving loops and branching.
+    
+    Key features include:
+    - **Loops:** Steps can return events that loop back to previous steps to enable iterative processes.
+    - **Branches:** Workflows can branch into different paths based on conditions, allowing for multiple distinct sequences of steps.
+    - **Parallelism:** Multiple branches or steps can run concurrently and synchronize their results.
+    - **State Maintenance:** Workflows support maintaining state and context throughout execution.
+    - **Observability and Debugging:** Supported by various components and callbacks for monitoring.
+    
+    An example workflow might involve judging whether a query is of sufficient quality, looping to improve it if not, then concurrently executing different retrieval-augmented generation (RAG) strategies, and finally judging their responses to produce a single output.
+    
+    Workflows are especially useful as applications grow in complexity, enabling developers to organize and control intricate AI logic more naturally and efficiently than traditional graph-based methods. For simpler pipelines, LlamaIndex suggests using workflows optionally, but for advanced agentic applications, workflows provide a flexible and powerful control abstraction.
+
+
+
+```python
+await run_docs_agent("How do I use loops in llamaindex workflows?")
+```
+
+    [Ask(queries=['How to use loops in llamaindex workflows'])]
+    Sending How to use loops in llamaindex workflows to agent
+    QueryAgent responded with:
+     In LlamaIndex workflows, loops are implemented using an event-driven approach where you define custom event types and steps that emit events to control the workflow's execution flow. To create a loop, you define a custom event (e.g., `LoopEvent`) and a workflow step that can return either the event continuing the loop or another event to proceed. For example, a workflow step might randomly decide to either loop back (emit `LoopEvent` again) or continue to a next step emitting a different event.
+    
+    This allows creating flexible looping behaviors where any step can loop back to any other step by returning the corresponding event instances. The approach leverages Python's async functions decorated with `@step`, which process events and return the next event(s), enabling both loops and conditional branching in workflows.
+    
+    Thus, loops in LlamaIndex workflows are event-based, using custom event types and the return of events from steps to signal iterations until a condition is met.
+    
+    Example:
+    
+    ```python
+    from llamaindex.workflow import Workflow, Event, StartEvent, StopEvent, step
+    import random
+    
+    class LoopEvent(Event):
+        loop_output: str
+    
+    class FirstEvent(Event):
+        first_output: str
+    
+    class MyWorkflow(Workflow):
+        @step
+        async def step_one(self, ev: StartEvent | LoopEvent) -> FirstEvent | LoopEvent:
+            if random.randint(0, 1) == 0:
+                print("Bad thing happened")
+                return LoopEvent(loop_output="Back to step one.")
+            else:
+                print("Good thing happened")
+                return FirstEvent(first_output="First step complete.")
+    
+        # ... other steps ...
+    
+    # Running this workflow will cause step_one to loop randomly until it proceeds.
+    ```
+    
+    You can combine loops with branching and parallel execution in workflows to build complex control flows. For detailed guidance and examples, consult the LlamaIndex documentation under "Branches and Loops" and the "Workflows" guides.
+
@@ -0,0 +1,591 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/multi_document_agents-v1.ipynb
+toc: True
+title: "Multi-Document Agents (V1)"
+featured: False
+experimental: False
+tags: ['Agent']
+language: py
+---
+In this guide, you learn towards setting up a multi-document agent over the LlamaIndex documentation.
+
+This is an extension of V0 multi-document agents with the additional features:
+- Reranking during document (tool) retrieval
+- Query planning tool that the agent can use to plan 
+
+
+We do this with the following architecture:
+
+- setup a "document agent" over each Document: each doc agent can do QA/summarization within its doc
+- setup a top-level agent over this set of document agents. Do tool retrieval and then do CoT over the set of tools to answer a question.
+
+If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.
+
+
+```python
+%pip install llama-index-core
+%pip install llama-index-agent-openai
+%pip install llama-index-readers-file
+%pip install llama-index-postprocessor-cohere-rerank
+%pip install llama-index-llms-openai
+%pip install llama-index-embeddings-openai
+%pip install unstructured[html]
+```
+
+
+```python
+%load_ext autoreload
+%autoreload 2
+```
+
+## Setup and Download Data
+
+In this section, we'll load in the LlamaIndex documentation.
+
+**NOTE:** This command will take a while to run, it will download the entire LlamaIndex documentation. In my testing, this took about 15 minutes.
+
+
+```python
+domain = "docs.llamaindex.ai"
+docs_url = "https://docs.llamaindex.ai/en/latest/"
+!wget -e robots=off --recursive --no-clobber --page-requisites --html-extension --convert-links --restrict-file-names=windows --domains {domain} --no-parent {docs_url}
+```
+
+
+```python
+from llama_index.readers.file import UnstructuredReader
+
+reader = UnstructuredReader()
+```
+
+
+```python
+from pathlib import Path
+
+all_files_gen = Path("./docs.llamaindex.ai/").rglob("*")
+all_files = [f.resolve() for f in all_files_gen]
+```
+
+
+```python
+all_html_files = [f for f in all_files if f.suffix.lower() == ".html"]
+```
+
+
+```python
+len(all_html_files)
+```
+
+
+
+
+    1656
+
+
+
+
+```python
+useful_files = [
+    x
+    for x in all_html_files
+    if "understanding" in str(x).split(".")[-2]
+    or "examples" in str(x).split(".")[-2]
+]
+print(len(useful_files))
+```
+
+    680
+
+
+
+```python
+from llama_index.core import Document
+
+# TODO: set to higher value if you want more docs to be indexed
+doc_limit = 100
+
+docs = []
+for idx, f in enumerate(useful_files):
+    if idx > doc_limit:
+        break
+    print(f"Idx {idx}/{len(useful_files)}")
+    loaded_docs = reader.load_data(file=f, split_documents=True)
+
+    loaded_doc = Document(
+        text="\n\n".join([d.get_content() for d in loaded_docs]),
+        metadata={"path": str(f)},
+    )
+    print(loaded_doc.metadata["path"])
+    docs.append(loaded_doc)
+```
+
+
+```python
+print(len(docs))
+```
+
+    101
+
+
+Define Global LLM + Embeddings
+
+
+```python
+import os
+
+os.environ["OPENAI_API_KEY"] = "sk-..."
+```
+
+
+```python
+from llama_index.llms.openai import OpenAI
+from llama_index.embeddings.openai import OpenAIEmbedding
+from llama_index.core import Settings
+
+llm = OpenAI(model="gpt-4o")
+Settings.llm = llm
+Settings.embed_model = OpenAIEmbedding(
+    model="text-embedding-3-small", embed_batch_size=256
+)
+```
+
+## Building Multi-Document Agents
+
+In this section we show you how to construct the multi-document agent. We first build a document agent for each document, and then define the top-level parent agent with an object index.
+
+### Build Document Agent for each Document
+
+In this section we define "document agents" for each document.
+
+We define both a vector index (for semantic search) and summary index (for summarization) for each document. The two query engines are then converted into tools that are passed to an OpenAI function calling agent.
+
+This document agent can dynamically choose to perform semantic search or summarization within a given document.
+
+We create a separate document agent for each city.
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent, ReActAgent
+from llama_index.core import (
+    load_index_from_storage,
+    StorageContext,
+    VectorStoreIndex,
+)
+from llama_index.core import SummaryIndex
+from llama_index.core.tools import QueryEngineTool
+from llama_index.core.node_parser import SentenceSplitter
+import os
+from tqdm.notebook import tqdm
+import pickle
+
+
+async def build_agent_per_doc(nodes, file_base):
+    vi_out_path = f"./data/llamaindex_docs/{file_base}"
+    summary_out_path = f"./data/llamaindex_docs/{file_base}_summary.pkl"
+    if not os.path.exists(vi_out_path):
+        Path("./data/llamaindex_docs/").mkdir(parents=True, exist_ok=True)
+        # build vector index
+        vector_index = VectorStoreIndex(nodes)
+        vector_index.storage_context.persist(persist_dir=vi_out_path)
+    else:
+        vector_index = load_index_from_storage(
+            StorageContext.from_defaults(persist_dir=vi_out_path),
+        )
+
+    # build summary index
+    summary_index = SummaryIndex(nodes)
+
+    # define query engines
+    vector_query_engine = vector_index.as_query_engine(llm=llm)
+    summary_query_engine = summary_index.as_query_engine(
+        response_mode="tree_summarize", llm=llm
+    )
+
+    # extract a summary
+    if not os.path.exists(summary_out_path):
+        Path(summary_out_path).parent.mkdir(parents=True, exist_ok=True)
+        summary = str(
+            await summary_query_engine.aquery(
+                "Extract a concise 1-2 line summary of this document"
+            )
+        )
+        pickle.dump(summary, open(summary_out_path, "wb"))
+    else:
+        summary = pickle.load(open(summary_out_path, "rb"))
+
+    # define tools
+    query_engine_tools = [
+        QueryEngineTool.from_defaults(
+            query_engine=vector_query_engine,
+            name=f"vector_tool_{file_base}",
+            description=f"Useful for questions related to specific facts",
+        ),
+        QueryEngineTool.from_defaults(
+            query_engine=summary_query_engine,
+            name=f"summary_tool_{file_base}",
+            description=f"Useful for summarization questions",
+        ),
+    ]
+
+    # build agent
+    function_llm = OpenAI(model="gpt-4")
+    agent = FunctionAgent(
+        tools=query_engine_tools,
+        llm=function_llm,
+        system_prompt=f"""\
+You are a specialized agent designed to answer queries about the `{file_base}.html` part of the LlamaIndex docs.
+You must ALWAYS use at least one of the tools provided when answering a question; do NOT rely on prior knowledge.\
+""",
+    )
+
+    return agent, summary
+
+
+async def build_agents(docs):
+    node_parser = SentenceSplitter()
+
+    # Build agents dictionary
+    agents_dict = {}
+    extra_info_dict = {}
+
+    # # this is for the baseline
+    # all_nodes = []
+
+    for idx, doc in enumerate(tqdm(docs)):
+        nodes = node_parser.get_nodes_from_documents([doc])
+        # all_nodes.extend(nodes)
+
+        # ID will be base + parent
+        file_path = Path(doc.metadata["path"])
+        file_base = str(file_path.parent.stem) + "_" + str(file_path.stem)
+        agent, summary = await build_agent_per_doc(nodes, file_base)
+
+        agents_dict[file_base] = agent
+        extra_info_dict[file_base] = {"summary": summary, "nodes": nodes}
+
+    return agents_dict, extra_info_dict
+```
+
+
+```python
+agents_dict, extra_info_dict = await build_agents(docs)
+```
+
+### Build Retriever-Enabled OpenAI Agent
+
+We build a top-level agent that can orchestrate across the different document agents to answer any user query.
+
+This agent will use a tool retriever to retrieve the most relevant tools for the query.
+
+**Improvements from V0**: We make the following improvements compared to the "base" version in V0.
+
+- Adding in reranking: we use Cohere reranker to better filter the candidate set of documents.
+- Adding in a query planning tool: we add an explicit query planning tool that's dynamically created based on the set of retrieved tools.
+
+
+
+```python
+from typing import Callable
+from llama_index.core.tools import FunctionTool
+
+
+def get_agent_tool_callable(agent: FunctionAgent) -> Callable:
+    async def query_agent(query: str) -> str:
+        response = await agent.run(query)
+        return str(response)
+
+    return query_agent
+
+
+# define tool for each document agent
+all_tools = []
+for file_base, agent in agents_dict.items():
+    summary = extra_info_dict[file_base]["summary"]
+    async_fn = get_agent_tool_callable(agent)
+    doc_tool = FunctionTool.from_defaults(
+        async_fn,
+        name=f"tool_{file_base}",
+        description=summary,
+    )
+    all_tools.append(doc_tool)
+```
+
+
+```python
+print(all_tools[0].metadata)
+```
+
+    ToolMetadata(description='The document provides a series of tutorials on building agentic LLM applications using LlamaIndex, covering key steps such as building RAG pipelines, agents, and workflows, along with techniques for data ingestion, indexing, querying, and application evaluation.', name='tool_understanding_index', fn_schema=<class 'llama_index.core.tools.utils.tool_understanding_index'>, return_direct=False)
+
+
+
+```python
+# define an "object" index and retriever over these tools
+from llama_index.core import VectorStoreIndex
+from llama_index.core.objects import (
+    ObjectIndex,
+    ObjectRetriever,
+)
+from llama_index.postprocessor.cohere_rerank import CohereRerank
+from llama_index.core.query_engine import SubQuestionQueryEngine
+from llama_index.core.schema import QueryBundle
+from llama_index.llms.openai import OpenAI
+
+
+llm = OpenAI(model_name="gpt-4o")
+
+obj_index = ObjectIndex.from_objects(
+    all_tools,
+    index_cls=VectorStoreIndex,
+)
+vector_node_retriever = obj_index.as_node_retriever(
+    similarity_top_k=10,
+)
+
+
+# define a custom object retriever that adds in a query planning tool
+class CustomObjectRetriever(ObjectRetriever):
+    def __init__(
+        self,
+        retriever,
+        object_node_mapping,
+        node_postprocessors=None,
+        llm=None,
+    ):
+        self._retriever = retriever
+        self._object_node_mapping = object_node_mapping
+        self._llm = llm or OpenAI("gpt-4o")
+        self._node_postprocessors = node_postprocessors or []
+
+    def retrieve(self, query_bundle):
+        if isinstance(query_bundle, str):
+            query_bundle = QueryBundle(query_str=query_bundle)
+
+        nodes = self._retriever.retrieve(query_bundle)
+        for processor in self._node_postprocessors:
+            nodes = processor.postprocess_nodes(
+                nodes, query_bundle=query_bundle
+            )
+        tools = [self._object_node_mapping.from_node(n.node) for n in nodes]
+
+        sub_agent = FunctionAgent(
+            name="compare_tool",
+            description=f"""\
+Useful for any queries that involve comparing multiple documents. ALWAYS use this tool for comparison queries - make sure to call this \
+tool with the original query. Do NOT use the other tools for any queries involving multiple documents.
+""",
+            tools=tools,
+            llm=self._llm,
+            system_prompt="""You are an expert at comparing documents. Given a query, use the tools provided to compare the documents and return a summary of the results.""",
+        )
+
+        async def query_sub_agent(query: str) -> str:
+            response = await sub_agent.run(query)
+            return str(response)
+
+        sub_question_tool = FunctionTool.from_defaults(
+            query_sub_agent,
+            name=sub_agent.name,
+            description=sub_agent.description,
+        )
+        return tools + [sub_question_tool]
+```
+
+
+```python
+# wrap it with ObjectRetriever to return objects
+custom_obj_retriever = CustomObjectRetriever(
+    vector_node_retriever,
+    obj_index.object_node_mapping,
+    node_postprocessors=[CohereRerank(top_n=5, model="rerank-v3.5")],
+    llm=llm,
+)
+```
+
+
+```python
+tmps = custom_obj_retriever.retrieve("hello")
+
+# should be 5 + 1 -- 5 from reranker, 1 from subquestion
+print(len(tmps))
+```
+
+    6
+
+
+
+```python
+from llama_index.core.agent.workflow import ReActAgent, FunctionAgent
+
+top_agent = FunctionAgent(
+    tool_retriever=custom_obj_retriever,
+    system_prompt=""" \
+You are an agent designed to answer queries about the documentation.
+Please always use the tools provided to answer a question. Do not rely on prior knowledge.\
+
+""",
+    llm=llm,
+)
+
+# top_agent = ReActAgent(
+#     tool_retriever=custom_obj_retriever,
+#     system_prompt=""" \
+# You are an agent designed to answer queries about the documentation.
+# Please always use the tools provided to answer a question. Do not rely on prior knowledge.\
+
+# """,
+#     llm=llm,
+# )
+```
+
+### Define Baseline Vector Store Index
+
+As a point of comparison, we define a "naive" RAG pipeline which dumps all docs into a single vector index collection.
+
+We set the top_k = 4
+
+
+```python
+all_nodes = [
+    n for extra_info in extra_info_dict.values() for n in extra_info["nodes"]
+]
+```
+
+
+```python
+base_index = VectorStoreIndex(all_nodes)
+base_query_engine = base_index.as_query_engine(similarity_top_k=4)
+```
+
+## Running Example Queries
+
+Let's run some example queries, ranging from QA / summaries over a single document to QA / summarization over multiple documents.
+
+
+```python
+from llama_index.core.agent.workflow import (
+    AgentStream,
+    ToolCall,
+    ToolCallResult,
+)
+
+handler = top_agent.run(
+    "What can you build with LlamaIndex?",
+)
+async for ev in handler.stream_events():
+    if isinstance(ev, ToolCallResult):
+        print(
+            f"\nCalling tool {ev.tool_name} with args {ev.tool_kwargs}\n Got response: {str(ev.tool_output)[:200]}"
+        )
+    elif isinstance(ev, ToolCall):
+        print(f"\nTool call: {ev.tool_name} with args {ev.tool_kwargs}")
+    # Print the stream of the agent
+    # elif isinstance(ev, AgentStream):
+    #     print(ev.delta, end="", flush=True)
+
+response = await handler
+```
+
+    
+    Tool call: tool_SimpleIndexDemoLlama2_index with args {'query': 'What can you build with LlamaIndex?'}
+    
+    Tool call: tool_apps_index with args {'query': 'What can you build with LlamaIndex?'}
+    
+    Tool call: tool_putting_it_all_together_index with args {'query': 'What can you build with LlamaIndex?'}
+    
+    Tool call: tool_llamacloud_index with args {'query': 'What can you build with LlamaIndex?'}
+    
+    Calling tool tool_SimpleIndexDemoLlama2_index with args {'query': 'What can you build with LlamaIndex?'}
+     Got response: With LlamaIndex, you can build a VectorStoreIndex. This involves setting up the necessary environment, loading documents into the index, and then querying the index for information. You need to instal
+    
+    Tool call: tool_using_llms_index with args {'query': 'What can you build with LlamaIndex?'}
+    
+    Calling tool tool_llamacloud_index with args {'query': 'What can you build with LlamaIndex?'}
+     Got response: With LlamaIndex, you can build a system that connects to your data stores, automatically indexes them, and then queries the data. This is done by integrating LlamaCloud into your project. The system a
+    
+    Calling tool tool_apps_index with args {'query': 'What can you build with LlamaIndex?'}
+     Got response: With LlamaIndex, you can build a full-stack web application. You can integrate it into a backend server like Flask, package it into a Docker container, or use it directly in a framework such as Stream
+    
+    Calling tool tool_putting_it_all_together_index with args {'query': 'What can you build with LlamaIndex?'}
+     Got response: With LlamaIndex, you can build a variety of applications and tools. This includes:
+    
+    1. Chatbots: You can use LlamaIndex to create interactive chatbots.
+    2. Agents: LlamaIndex can be used to build intel
+    
+    Calling tool tool_using_llms_index with args {'query': 'What can you build with LlamaIndex?'}
+     Got response: With LlamaIndex, you can build a variety of applications by leveraging the various Language Model (LLM) integrations it supports. These include OpenAI, Anthropic, Mistral, DeepSeek, Hugging Face, and 
+
+
+
+```python
+# print the final response string
+print(str(response))
+```
+
+    With LlamaIndex, you can build various applications and tools, including:
+    
+    1. **VectorStoreIndex**: Set up and query a VectorStoreIndex by loading documents and configuring the environment as per the documentation.
+       
+    2. **Full-Stack Web Applications**: Integrate LlamaIndex into backend servers like Flask, Docker containers, or frameworks like Streamlit. Resources include guides for TypeScript+React, Delphic starter template, and Flask, Streamlit, and Docker integration examples.
+    
+    3. **Chatbots, Agents, and Unified Query Framework**: Create interactive chatbots, intelligent agents, and a unified query framework for handling different query types. LlamaIndex also supports property graphs and full-stack web applications.
+    
+    4. **Data Management with LlamaCloud**: Build systems that connect to data stores, automatically index data, and efficiently query it by integrating LlamaCloud into your project.
+    
+    5. **LLM Integrations**: Utilize various Language Model (LLM) integrations such as OpenAI, Anthropic, Mistral, DeepSeek, and Hugging Face. LlamaIndex provides a unified interface to access different LLMs, enabling you to select models based on their strengths and price points. You can use multi-modal LLMs for chat messages with text, images, and audio inputs, and even call tools and functions directly through API calls.
+    
+    These capabilities make LlamaIndex a versatile tool for building a wide range of applications and systems.
+
+
+
+```python
+# access the tool calls
+# print(response.tool_calls)
+```
+
+
+```python
+# baseline
+response = base_query_engine.query(
+    "What can you build with LlamaIndex?",
+)
+print(str(response))
+```
+
+    With LlamaIndex, you can build a variety of applications and systems, including a full-stack web application, a chatbot, and a unified query framework over multiple indexes. You can also perform semantic searches, summarization queries, and queries over structured data like SQL or Pandas DataFrames. Additionally, LlamaIndex supports routing over heterogeneous data sources and compare/contrast queries. It provides tools and templates to help you integrate these capabilities into production-ready applications.
+
+
+
+```python
+response = await top_agent.run("Compare workflows to query engines")
+print(str(response))
+```
+
+    Workflows and query engines serve different purposes in an application context:
+    
+    1. Workflows:
+       - Workflows are designed to manage the execution flow of an application by dividing it into sections triggered by events.
+       - They are event-driven and step-based, allowing for the management of application complexity by breaking it into smaller, more manageable pieces.
+       - Workflows focus on controlling the flow of application execution through steps and events.
+    
+    2. Query Engines:
+       - Query engines are tools used to process queries against a database or data source to retrieve specific information.
+       - They are primarily used for querying and retrieving data from databases.
+       - Query engines are focused on the retrieval, postprocessing, and response synthesis stages of querying.
+    
+    In summary, workflows are more about controlling the flow of application execution, while query engines are specifically designed for querying and retrieving data from databases.
+
+
+
+```python
+response = await top_agent.run(
+    "Can you compare the compact and tree_summarize response synthesizer response modes at a very high-level?"
+)
+print(str(response))
+```
+
+    The compact response synthesizer mode aims to produce concise and condensed responses, focusing on delivering the most relevant information in a brief format. On the other hand, the tree_summarize response synthesizer mode is designed to create structured and summarized responses, organizing information in a comprehensive manner. 
+    
+    In summary, the compact mode provides brief and straightforward responses, while the tree_summarize mode offers more detailed and organized output for a comprehensive summary.
+
@@ -0,0 +1,224 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/nvidia_agent.ipynb
+toc: True
+title: "Function Calling NVIDIA Agent"
+featured: False
+experimental: False
+tags: ['Agent', 'Integrations']
+language: py
+---
+This notebook shows you how to use our NVIDIA agent, powered by function calling capabilities.
+
+## Initial Setup 
+
+Let's start by importing some simple building blocks.  
+
+The main thing we need is:
+1. the NVIDIA NIM Endpoint (using our own `llama_index` LLM class)
+2. a place to keep conversation history 
+3. a definition for tools that our agent can use.
+
+
+```python
+%pip install --upgrade --quiet llama-index-llms-nvidia
+```
+
+
+```python
+import getpass
+import os
+
+# del os.environ['NVIDIA_API_KEY']  ## delete key and reset
+if os.environ.get("NVIDIA_API_KEY", "").startswith("nvapi-"):
+    print("Valid NVIDIA_API_KEY already in environment. Delete to reset")
+else:
+    nvapi_key = getpass.getpass("NVAPI Key (starts with nvapi-): ")
+    assert nvapi_key.startswith(
+        "nvapi-"
+    ), f"{nvapi_key[:5]}... is not a valid key"
+    os.environ["NVIDIA_API_KEY"] = nvapi_key
+```
+
+    Valid NVIDIA_API_KEY already in environment. Delete to reset
+
+
+
+```python
+from llama_index.llms.nvidia import NVIDIA
+from llama_index.core.tools import FunctionTool
+from llama_index.embeddings.nvidia import NVIDIAEmbedding
+```
+
+Let's define some very simple calculator tools for our agent.
+
+
+```python
+def multiply(a: int, b: int) -> int:
+    """Multiple two integers and returns the result integer"""
+    return a * b
+
+
+def add(a: int, b: int) -> int:
+    """Add two integers and returns the result integer"""
+    return a + b
+```
+
+Here we initialize a simple NVIDIA agent with calculator functions.
+
+
+```python
+llm = NVIDIA("meta/llama-3.1-70b-instruct")
+```
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent
+
+agent = FunctionAgent(
+    tools=[multiply, add],
+    llm=llm,
+)
+```
+
+### Chat
+
+
+```python
+response = await agent.run("What is (121 * 3) + 42?")
+print(str(response))
+```
+
+
+```python
+# inspect sources
+print(response.tool_calls)
+```
+
+### Managing Context/Memory
+
+By default, `.run()` is stateless. If you want to maintain state, you can pass in a `context` object.
+
+
+```python
+from llama_index.core.agent.workflow import Context
+
+ctx = Context(agent)
+
+response = await agent.run("Hello, my name is John Doe.", ctx=ctx)
+print(str(response))
+
+response = await agent.run("What is my name?", ctx=ctx)
+print(str(response))
+```
+
+### Agent with Personality
+
+You can specify a system prompt to give the agent additional instruction or personality.
+
+
+```python
+agent = FunctionAgent(
+    tools=[multiply, add],
+    llm=llm,
+    system_prompt="Talk like a pirate in every response.",
+)
+```
+
+
+```python
+response = await agent.run("Hi")
+print(response)
+```
+
+
+```python
+response = await agent.run("Tell me a story")
+print(response)
+```
+
+# NVIDIA Agent with RAG/Query Engine Tools
+
+
+```python
+!mkdir -p 'data/10k/'
+!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/uber_2021.pdf' -O 'data/10k/uber_2021.pdf'
+```
+
+
+```python
+from llama_index.core.tools import QueryEngineTool
+from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
+
+embed_model = NVIDIAEmbedding(model="NV-Embed-QA", truncate="END")
+
+# load data
+uber_docs = SimpleDirectoryReader(
+    input_files=["./data/10k/uber_2021.pdf"]
+).load_data()
+
+# build index
+uber_index = VectorStoreIndex.from_documents(
+    uber_docs, embed_model=embed_model
+)
+uber_engine = uber_index.as_query_engine(similarity_top_k=3, llm=llm)
+query_engine_tool = QueryEngineTool.from_defaults(
+    query_engine=uber_engine,
+    name="uber_10k",
+    description=(
+        "Provides information about Uber financials for year 2021. "
+        "Use a detailed plain text question as input to the tool."
+    ),
+)
+```
+
+
+```python
+agent = FunctionAgent(tools=[query_engine_tool], llm=llm)
+```
+
+
+```python
+response = await agent.run(
+    "Tell me both the risk factors and tailwinds for Uber? Do two parallel tool calls."
+)
+print(str(response))
+```
+
+# ReAct Agent 
+
+
+```python
+from llama_index.core.agent.workflow import ReActAgent
+```
+
+
+```python
+agent = ReActAgent([multiply_tool, add_tool], llm=llm, verbose=True)
+```
+
+Using the `stream_events()` method, we can stream the response as it is generated to see the agent's thought process.
+
+The final response will have only the final answer.
+
+
+```python
+from llama_index.core.agent.workflow import AgentStream
+
+handler = agent.run("What is 20+(2*4)? Calculate step by step ")
+async for ev in handler.stream_events():
+    if isinstance(ev, AgentStream):
+        print(ev.delta, end="", flush=True)
+
+response = await handler
+```
+
+
+```python
+print(str(response))
+```
+
+
+```python
+print(response.tool_calls)
+```
@@ -0,0 +1,295 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/openai_agent_context_retrieval.ipynb
+toc: True
+title: "Context-Augmented Function Calling Agent"
+featured: False
+experimental: False
+tags: ['Agent', 'Integrations']
+language: py
+---
+In this tutorial, we show you how to to make your agent context-aware.
+
+Our indexing/retrieval modules help to remove the complexity of having too many functions to fit in the prompt.
+
+## Initial Setup 
+
+Here we setup a normal FunctionAgent, and then augment it with context. This agent will perform retrieval first before calling any tools. This can help ground the agent's tool picking and answering capabilities in context.
+
+If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.
+
+
+```python
+%pip install llama-index
+```
+
+
+```python
+import os
+
+os.environ["OPENAI_API_KEY"] = "sk-..."
+```
+
+
+```python
+from llama_index.llms.openai import OpenAI
+from llama_index.embeddings.openai import OpenAIEmbedding
+from llama_index.core.settings import Settings
+
+Settings.llm = OpenAI(model="gpt-4o-mini")
+Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
+```
+
+
+```python
+import json
+from typing import Sequence
+
+from llama_index.core import (
+    SimpleDirectoryReader,
+    VectorStoreIndex,
+    StorageContext,
+    load_index_from_storage,
+)
+from llama_index.core.tools import QueryEngineTool
+```
+
+
+```python
+try:
+    storage_context = StorageContext.from_defaults(
+        persist_dir="./storage/march"
+    )
+    march_index = load_index_from_storage(storage_context)
+
+    storage_context = StorageContext.from_defaults(
+        persist_dir="./storage/june"
+    )
+    june_index = load_index_from_storage(storage_context)
+
+    storage_context = StorageContext.from_defaults(
+        persist_dir="./storage/sept"
+    )
+    sept_index = load_index_from_storage(storage_context)
+
+    index_loaded = True
+except:
+    index_loaded = False
+```
+
+Download Data
+
+
+```python
+!mkdir -p 'data/10q/'
+!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10q/uber_10q_march_2022.pdf' -O 'data/10q/uber_10q_march_2022.pdf'
+!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10q/uber_10q_june_2022.pdf' -O 'data/10q/uber_10q_june_2022.pdf'
+!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10q/uber_10q_sept_2022.pdf' -O 'data/10q/uber_10q_sept_2022.pdf'
+```
+
+
+```python
+# build indexes across the three data sources
+if not index_loaded:
+    # load data
+    march_docs = SimpleDirectoryReader(
+        input_files=["./data/10q/uber_10q_march_2022.pdf"]
+    ).load_data()
+    june_docs = SimpleDirectoryReader(
+        input_files=["./data/10q/uber_10q_june_2022.pdf"]
+    ).load_data()
+    sept_docs = SimpleDirectoryReader(
+        input_files=["./data/10q/uber_10q_sept_2022.pdf"]
+    ).load_data()
+
+    # build index
+    march_index = VectorStoreIndex.from_documents(march_docs)
+    june_index = VectorStoreIndex.from_documents(june_docs)
+    sept_index = VectorStoreIndex.from_documents(sept_docs)
+
+    # persist index
+    march_index.storage_context.persist(persist_dir="./storage/march")
+    june_index.storage_context.persist(persist_dir="./storage/june")
+    sept_index.storage_context.persist(persist_dir="./storage/sept")
+```
+
+
+```python
+march_engine = march_index.as_query_engine(similarity_top_k=3)
+june_engine = june_index.as_query_engine(similarity_top_k=3)
+sept_engine = sept_index.as_query_engine(similarity_top_k=3)
+```
+
+
+```python
+query_engine_tools = [
+    QueryEngineTool.from_defaults(
+        query_engine=march_engine,
+        name="uber_march_10q",
+        description=(
+            "Provides information about Uber 10Q filings for March 2022. "
+            "Use a detailed plain text question as input to the tool."
+        ),
+    ),
+    QueryEngineTool.from_defaults(
+        query_engine=june_engine,
+        name="uber_june_10q",
+        description=(
+            "Provides information about Uber financials for June 2021. "
+            "Use a detailed plain text question as input to the tool."
+        ),
+    ),
+    QueryEngineTool.from_defaults(
+        query_engine=sept_engine,
+        name="uber_sept_10q",
+        description=(
+            "Provides information about Uber financials for Sept 2021. "
+            "Use a detailed plain text question as input to the tool."
+        ),
+    ),
+]
+```
+
+### Try Context-Augmented Agent
+
+Here we augment our agent with context in different settings:
+- toy context: we define some abbreviations that map to financial terms (e.g. R=Revenue). We supply this as context to the agent
+
+
+```python
+from llama_index.core import Document
+from llama_index.core.agent.workflow import FunctionAgent
+```
+
+
+```python
+# toy index - stores a list of abbreviations
+texts = [
+    "Abbreviation: 'Y' = Revenue",
+    "Abbreviation: 'X' = Risk Factors",
+    "Abbreviation: 'Z' = Costs",
+]
+docs = [Document(text=t) for t in texts]
+context_index = VectorStoreIndex.from_documents(docs)
+context_retriever = context_index.as_retriever(similarity_top_k=2)
+```
+
+
+```python
+from llama_index.core.tools import BaseTool
+
+system_prompt_template = """You are a helpful assistant. 
+Here is some context that you can use to answer the user's question and for help with picking the right tool:
+
+{context}
+"""
+
+
+async def get_agent_with_context_awareness(
+    query: str, context_retriever, tools: list[BaseTool]
+) -> FunctionAgent:
+    context_nodes = await context_retriever.aretrieve(query)
+    context_text = "\n----\n".join([n.node.text for n in context_nodes])
+
+    return FunctionAgent(
+        tools=tools,
+        llm=OpenAI(model="gpt-4o"),
+        system_prompt=system_prompt_template.format(context=context_text),
+    )
+```
+
+
+```python
+query = "What is the 'X' of March 2022?"
+agent = await get_agent_with_context_awareness(
+    query, context_retriever, query_engine_tools
+)
+
+response = await agent.run(query)
+```
+
+
+```python
+print(str(response))
+```
+
+    The risk factors mentioned in Uber's 10-Q filing for March 2022 include uncertainties related to the COVID-19 pandemic, such as the severity and duration of the outbreak, potential future waves or variants of the virus, the administration and efficacy of vaccines, and the impact of governmental actions. There are also concerns regarding the effects on drivers, merchants, consumers, and business partners, as well as other factors that may affect the company's business, results of operations, financial position, and cash flows.
+
+
+
+```python
+query = "What is the 'Y' and 'Z' in September 2022?"
+agent = await get_agent_with_context_awareness(
+    query, context_retriever, query_engine_tools
+)
+
+response = await agent.run(query)
+```
+
+
+```python
+print(str(response))
+```
+
+    In September 2022, Uber's revenue (Y) was $8,343 million, and the total costs (Z) were $8,839 million.
+
+
+### Managing Context/Memory
+
+By default, each `.run()` call is stateless. We can manage context by using a serializable `Context` object.
+
+
+```python
+from llama_index.core.workflow import Context
+
+ctx = Context(agent)
+
+query = "What is the 'Y' and 'Z' in September 2022?"
+agent = await get_agent_with_context_awareness(
+    query, context_retriever, query_engine_tools
+)
+response = await agent.run(query, ctx=ctx)
+
+query = "What did I just ask?"
+agent = await get_agent_with_context_awareness(
+    query, context_retriever, query_engine_tools
+)
+response = await agent.run(query, ctx=ctx)
+print(str(response))
+```
+
+    You asked for the revenue ('Y') and costs ('Z') for Uber in September 2022.
+
+
+### Use Uber 10-Q as context, use Calculator as Tool
+
+
+```python
+from llama_index.core.tools import FunctionTool
+
+
+def magic_formula(revenue: int, cost: int) -> int:
+    """Runs MAGIC_FORMULA on revenue and cost."""
+    return revenue - cost
+
+
+magic_tool = FunctionTool.from_defaults(magic_formula)
+```
+
+
+```python
+context_retriever = sept_index.as_retriever(similarity_top_k=3)
+```
+
+
+```python
+query = "Can you run MAGIC_FORMULA on Uber's revenue and cost?"
+agent = await get_agent_with_context_awareness(
+    query, context_retriever, [magic_tool]
+)
+response = await agent.run(query)
+print(str(response))
+```
+
+    The result of running MAGIC_FORMULA on Uber's revenue of $8,343 million and cost of $5,173 million is 3,170.
+
@@ -0,0 +1,783 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/openai_agent_query_cookbook.ipynb
+toc: True
+title: "OpenAI Agent + Query Engine Experimental Cookbook"
+featured: False
+experimental: False
+tags: ['Agent', 'Integrations']
+language: py
+---
+In this notebook, we try out the OpenAIAgent across a variety of query engine tools and datasets. We explore how OpenAIAgent can compare/replace existing workflows solved by our retrievers/query engines.
+
+- Auto retrieval 
+- Joint SQL and vector search
+
+**NOTE:** Any Text-to-SQL application should be aware that executing 
+arbitrary SQL queries can be a security risk. It is recommended to
+take precautions as needed, such as using restricted roles, read-only
+databases, sandboxing, etc.
+
+## AutoRetrieval from a Vector Database
+
+Our existing "auto-retrieval" capabilities (in `VectorIndexAutoRetriever`) allow an LLM to infer the right query parameters for a vector database - including both the query string and metadata filter.
+
+Since the OpenAI Function API can infer function parameters, we explore its capabilities in performing auto-retrieval here.
+
+If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.
+
+
+```python
+%pip install llama-index
+%pip install llama-index-llms-openai
+%pip install llama-index-readers-wikipedia
+%pip install llama-index-vector-stores-pinecone
+```
+
+
+```python
+import os
+
+os.environ["PINECONE_API_KEY"] = "..."
+os.environ["OPENAI_API_KEY"] = "..."
+```
+
+
+```python
+from llama_index.llms.openai import OpenAI
+from llama_index.embeddings.openai import OpenAIEmbedding
+from llama_index.core import Settings
+
+Settings.llm = OpenAI(model="gpt-4o-mini")
+Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
+```
+
+
+```python
+from pinecone import Pinecone, ServerlessSpec
+
+pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
+```
+
+
+```python
+# dimensions are for text-embedding-3-small
+pc.create_index(
+    name="quickstart-index",
+    dimension=1536,
+    metric="euclidean",
+    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
+)
+
+# may need to wait for index to be created
+import time
+
+time.sleep(10)
+```
+
+
+
+
+    {
+        "name": "quickstart-index",
+        "metric": "euclidean",
+        "host": "quickstart-index-c2e1535.svc.aped-4627-b74a.pinecone.io",
+        "spec": {
+            "serverless": {
+                "cloud": "aws",
+                "region": "us-east-1"
+            }
+        },
+        "status": {
+            "ready": true,
+            "state": "Ready"
+        },
+        "vector_type": "dense",
+        "dimension": 1536,
+        "deletion_protection": "disabled",
+        "tags": null
+    }
+
+
+
+
+```python
+index = pc.Index("quickstart-index")
+```
+
+
+```python
+# Optional: delete data in your pinecone index
+# index.delete(deleteAll=True, namespace="test")
+```
+
+
+```python
+from llama_index.core import VectorStoreIndex, StorageContext
+from llama_index.vector_stores.pinecone import PineconeVectorStore
+```
+
+
+```python
+from llama_index.core.schema import TextNode
+
+nodes = [
+    TextNode(
+        text=(
+            "Michael Jordan is a retired professional basketball player,"
+            " widely regarded as one of the greatest basketball players of all"
+            " time."
+        ),
+        metadata={
+            "category": "Sports",
+            "country": "United States",
+            "gender": "male",
+            "born": 1963,
+        },
+    ),
+    TextNode(
+        text=(
+            "Angelina Jolie is an American actress, filmmaker, and"
+            " humanitarian. She has received numerous awards for her acting"
+            " and is known for her philanthropic work."
+        ),
+        metadata={
+            "category": "Entertainment",
+            "country": "United States",
+            "gender": "female",
+            "born": 1975,
+        },
+    ),
+    TextNode(
+        text=(
+            "Elon Musk is a business magnate, industrial designer, and"
+            " engineer. He is the founder, CEO, and lead designer of SpaceX,"
+            " Tesla, Inc., Neuralink, and The Boring Company."
+        ),
+        metadata={
+            "category": "Business",
+            "country": "United States",
+            "gender": "male",
+            "born": 1971,
+        },
+    ),
+    TextNode(
+        text=(
+            "Rihanna is a Barbadian singer, actress, and businesswoman. She"
+            " has achieved significant success in the music industry and is"
+            " known for her versatile musical style."
+        ),
+        metadata={
+            "category": "Music",
+            "country": "Barbados",
+            "gender": "female",
+            "born": 1988,
+        },
+    ),
+    TextNode(
+        text=(
+            "Cristiano Ronaldo is a Portuguese professional footballer who is"
+            " considered one of the greatest football players of all time. He"
+            " has won numerous awards and set multiple records during his"
+            " career."
+        ),
+        metadata={
+            "category": "Sports",
+            "country": "Portugal",
+            "gender": "male",
+            "born": 1985,
+        },
+    ),
+]
+```
+
+
+```python
+from llama_index.vector_stores.pinecone import PineconeVectorStore
+from llama_index.core import StorageContext
+
+vector_store = PineconeVectorStore(pinecone_index=index, namespace="test")
+storage_context = StorageContext.from_defaults(vector_store=vector_store)
+```
+
+
+```python
+from llama_index.core import VectorStoreIndex
+
+index = VectorStoreIndex(nodes, storage_context=storage_context)
+```
+
+
+    Upserted vectors:   0%|          | 0/5 [00:00<?, ?it/s]
+
+
+#### Define Function Tool
+
+Here we define the function interface, which is passed to OpenAI to perform auto-retrieval.
+
+We were not able to get OpenAI to work with nested pydantic objects or tuples as arguments,
+so we converted the metadata filter keys and values into lists for the function API to work with.
+
+
+```python
+# define function tool
+from llama_index.core.tools import FunctionTool
+from llama_index.core.vector_stores import (
+    VectorStoreInfo,
+    MetadataInfo,
+    MetadataFilter,
+    MetadataFilters,
+    FilterCondition,
+    FilterOperator,
+)
+from llama_index.core.retrievers import VectorIndexRetriever
+from llama_index.core.query_engine import RetrieverQueryEngine
+
+from typing import List, Tuple, Any
+from pydantic import BaseModel, Field
+
+
+# define vector store info describing schema of vector store
+vector_store_info = VectorStoreInfo(
+    content_info="brief biography of celebrities",
+    metadata_info=[
+        MetadataInfo(
+            name="category",
+            type="str",
+            description=(
+                "Category of the celebrity, one of [Sports, Entertainment,"
+                " Business, Music]"
+            ),
+        ),
+        MetadataInfo(
+            name="country",
+            type="str",
+            description=(
+                "Country of the celebrity, one of [United States, Barbados,"
+                " Portugal]"
+            ),
+        ),
+        MetadataInfo(
+            name="gender",
+            type="str",
+            description=("Gender of the celebrity, one of [male, female]"),
+        ),
+        MetadataInfo(
+            name="born",
+            type="int",
+            description=("Born year of the celebrity, could be any integer"),
+        ),
+    ],
+)
+```
+
+Define AutoRetrieve Functions
+
+
+```python
+from typing import Any, Annotated
+
+
+async def auto_retrieve_fn(
+    query: Annotated[str, "The natural language query/question to answer."],
+    filter_key_list: Annotated[
+        List[str], "List of metadata filter field names"
+    ],
+    filter_value_list: Annotated[
+        List[Any],
+        "List of metadata filter field values (corresponding to names in filter_key_list)",
+    ],
+    filter_operator_list: Annotated[
+        List[str],
+        "Metadata filters conditions (could be one of <, <=, >, >=, ==, !=)",
+    ],
+    filter_condition: Annotated[
+        str, "Metadata filters condition values (could be AND or OR)"
+    ],
+    top_k: Annotated[
+        int, "The number of results to return from the vector database."
+    ],
+):
+    """Auto retrieval function.
+
+    Performs auto-retrieval from a vector database, and then applies a set of filters.
+
+    """
+    query = query or "Query"
+
+    metadata_filters = [
+        MetadataFilter(key=k, value=v, operator=op)
+        for k, v, op in zip(
+            filter_key_list, filter_value_list, filter_operator_list
+        )
+    ]
+    retriever = VectorIndexRetriever(
+        index,
+        filters=MetadataFilters(
+            filters=metadata_filters, condition=filter_condition.lower()
+        ),
+        top_k=top_k,
+    )
+    query_engine = RetrieverQueryEngine.from_args(retriever)
+
+    response = await query_engine.aquery(query)
+    return str(response)
+
+
+description = f"""\
+Use this tool to look up biographical information about celebrities.
+The vector database schema is given below:
+
+<schema>
+{vector_store_info.model_dump_json()}
+</schema>
+"""
+
+auto_retrieve_tool = FunctionTool.from_defaults(
+    auto_retrieve_fn,
+    name="celebrity_bios",
+    description=description,
+)
+```
+
+#### Initialize Agent
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent
+from llama_index.core.workflow import Context
+from llama_index.llms.openai import OpenAI
+
+agent = FunctionAgent(
+    tools=[auto_retrieve_tool],
+    llm=OpenAI(model="gpt-4o"),
+    system_prompt=(
+        "You are a helpful assistant that can answer questions about celebrities by writing a filtered query to a vector database. "
+        "Unless the user is asking to compare things, you generally only need to make one call to the retriever."
+    ),
+)
+
+# hold the context/session state for the agent
+ctx = Context(agent)
+```
+
+
+```python
+from llama_index.core.agent.workflow import (
+    ToolCallResult,
+    ToolCall,
+    AgentStream,
+    AgentInput,
+    AgentOutput,
+)
+
+handler = agent.run(
+    "Tell me about two celebrities from the United States. ", ctx=ctx
+)
+
+async for ev in handler.stream_events():
+    if isinstance(ev, ToolCallResult):
+        print(
+            f"\nCalled tool {ev.tool_name} with args {ev.tool_kwargs}, got response: {ev.tool_output}"
+        )
+    elif isinstance(ev, AgentStream):
+        print(ev.delta, end="", flush=True)
+
+response = await handler
+```
+
+    
+    Called tool celebrity_bios with args {'query': 'celebrities from the United States', 'filter_key_list': ['country'], 'filter_value_list': ['United States'], 'filter_operator_list': ['=='], 'filter_condition': 'AND', 'top_k': 2}, got response: Angelina Jolie and Elon Musk are notable celebrities from the United States.
+    Here are two celebrities from the United States:
+    
+    1. **Angelina Jolie**: She is a renowned actress, filmmaker, and humanitarian. Jolie has received numerous accolades, including an Academy Award and three Golden Globe Awards. She is also known for her humanitarian efforts, particularly her work with refugees as a Special Envoy for the United Nations High Commissioner for Refugees (UNHCR).
+    
+    2. **Elon Musk**: He is a prominent entrepreneur and business magnate. Musk is the CEO and lead designer of SpaceX, CEO and product architect of Tesla, Inc., and has been involved in numerous other ventures, including Neuralink and The Boring Company. He is known for his ambitious vision of the future, including space exploration and sustainable energy.
+
+
+```python
+handler = agent.run("Tell me about two celebrities born after 1980. ", ctx=ctx)
+
+async for ev in handler.stream_events():
+    if isinstance(ev, ToolCallResult):
+        print(
+            f"\nCalled tool {ev.tool_name} with args {ev.tool_kwargs}, got response: {ev.tool_output}"
+        )
+    elif isinstance(ev, AgentStream):
+        print(ev.delta, end="", flush=True)
+
+response = await handler
+```
+
+    
+    Called tool celebrity_bios with args {'query': 'celebrities born after 1980', 'filter_key_list': ['born'], 'filter_value_list': [1980], 'filter_operator_list': ['>'], 'filter_condition': 'AND', 'top_k': 2}, got response: Rihanna, born in 1988, is a celebrity who fits the criteria of being born after 1980.
+    Here is a celebrity born after 1980:
+    
+    - **Rihanna**: Born in 1988, Rihanna is a Barbadian singer, actress, and businesswoman. She gained worldwide fame with her music career, producing hits like "Umbrella," "Diamonds," and "Work." Beyond music, Rihanna has made a significant impact in the fashion and beauty industries with her Fenty brand, known for its inclusivity and innovation.
+
+
+```python
+response = await agent.run(
+    "Tell me about few celebrities under category business and born after 1950. ",
+    ctx=ctx,
+)
+print(str(response))
+```
+
+    Here is a celebrity in the business category who was born after 1950:
+    
+    - **Elon Musk**: He is a prominent entrepreneur and business magnate, born in 1971. Musk is the CEO and lead designer of SpaceX, CEO and product architect of Tesla, Inc., and has been involved in numerous other ventures, including Neuralink and The Boring Company. He is known for his ambitious vision of the future, including space exploration and sustainable energy.
+
+
+## Joint Text-to-SQL and Semantic Search
+
+This is currently handled by our `SQLAutoVectorQueryEngine`.
+
+Let's try implementing this by giving our `OpenAIAgent` access to two query tools: SQL and Vector 
+
+**NOTE:** Any Text-to-SQL application should be aware that executing 
+arbitrary SQL queries can be a security risk. It is recommended to
+take precautions as needed, such as using restricted roles, read-only
+databases, sandboxing, etc.
+
+#### Load and Index Structured Data
+
+We load sample structured datapoints into a SQL db and index it.
+
+
+```python
+from sqlalchemy import (
+    create_engine,
+    MetaData,
+    Table,
+    Column,
+    String,
+    Integer,
+    select,
+    column,
+)
+from llama_index.core import SQLDatabase
+from llama_index.core.indices import SQLStructStoreIndex
+
+engine = create_engine("sqlite:///:memory:", future=True)
+metadata_obj = MetaData()
+```
+
+
+```python
+# create city SQL table
+table_name = "city_stats"
+city_stats_table = Table(
+    table_name,
+    metadata_obj,
+    Column("city_name", String(16), primary_key=True),
+    Column("population", Integer),
+    Column("country", String(16), nullable=False),
+)
+
+metadata_obj.create_all(engine)
+```
+
+
+```python
+# print tables
+metadata_obj.tables.keys()
+```
+
+
+
+
+    dict_keys(['city_stats'])
+
+
+
+
+```python
+from sqlalchemy import insert
+
+rows = [
+    {"city_name": "Toronto", "population": 2930000, "country": "Canada"},
+    {"city_name": "Tokyo", "population": 13960000, "country": "Japan"},
+    {"city_name": "Berlin", "population": 3645000, "country": "Germany"},
+]
+for row in rows:
+    stmt = insert(city_stats_table).values(**row)
+    with engine.begin() as connection:
+        cursor = connection.execute(stmt)
+```
+
+
+```python
+with engine.connect() as connection:
+    cursor = connection.exec_driver_sql("SELECT * FROM city_stats")
+    print(cursor.fetchall())
+```
+
+    [('Toronto', 2930000, 'Canada'), ('Tokyo', 13960000, 'Japan'), ('Berlin', 3645000, 'Germany')]
+
+
+
+```python
+sql_database = SQLDatabase(engine, include_tables=["city_stats"])
+```
+
+
+```python
+from llama_index.core.query_engine import NLSQLTableQueryEngine
+
+query_engine = NLSQLTableQueryEngine(
+    sql_database=sql_database,
+    tables=["city_stats"],
+)
+```
+
+#### Load and Index Unstructured Data
+
+We load unstructured data into a vector index backed by Pinecone
+
+
+```python
+# install wikipedia python package
+%pip install wikipedia llama-index-readers-wikipedia
+```
+
+
+```python
+from llama_index.readers.wikipedia import WikipediaReader
+
+cities = ["Toronto", "Berlin", "Tokyo"]
+wiki_docs = WikipediaReader().load_data(pages=cities)
+```
+
+
+```python
+from pinecone import Pinecone, ServerlessSpec
+
+pc = Pinecone(api_key=os.environ["PINECONE_API_KEY"])
+```
+
+
+```python
+# dimensions are for text-embedding-3-small
+pc.create_index(
+    name="quickstart-sql",
+    dimension=1536,
+    metric="euclidean",
+    spec=ServerlessSpec(cloud="aws", region="us-east-1"),
+)
+
+# may need to wait for index to be created
+import time
+
+time.sleep(10)
+```
+
+
+```python
+# define pinecone index
+index = pc.Index("quickstart-sql")
+```
+
+
+```python
+# OPTIONAL: delete all
+index.delete(deleteAll=True)
+```
+
+
+```python
+from llama_index.core import VectorStoreIndex, StorageContext
+from llama_index.vector_stores.pinecone import PineconeVectorStore
+from llama_index.core.node_parser import TokenTextSplitter
+from llama_index.llms.openai import OpenAI
+from llama_index.embeddings.openai import OpenAIEmbedding
+
+# define node parser and LLM
+Settings.llm = OpenAI(temperature=0, model="gpt-4o-mini")
+Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
+Settings.node_parser = TokenTextSplitter(chunk_size=1024)
+
+# define pinecone vector index
+vector_store = PineconeVectorStore(
+    pinecone_index=index, namespace="wiki_cities"
+)
+storage_context = StorageContext.from_defaults(vector_store=vector_store)
+vector_index = VectorStoreIndex([], storage_context=storage_context)
+```
+
+
+```python
+# Insert documents into vector index
+# Each document has metadata of the city attached
+for city, wiki_doc in zip(cities, wiki_docs):
+    nodes = Settings.node_parser.get_nodes_from_documents([wiki_doc])
+    # add metadata to each node
+    for node in nodes:
+        node.metadata = {"title": city}
+    vector_index.insert_nodes(nodes)
+```
+
+#### Define Query Engines / Tools
+
+
+```python
+from llama_index.core.retrievers import VectorIndexAutoRetriever
+from llama_index.core.vector_stores import MetadataInfo, VectorStoreInfo
+from llama_index.core.query_engine import RetrieverQueryEngine
+from llama_index.core.tools import QueryEngineTool
+
+
+vector_store_info = VectorStoreInfo(
+    content_info="articles about different cities",
+    metadata_info=[
+        MetadataInfo(
+            name="title", type="str", description="The name of the city"
+        ),
+    ],
+)
+
+# pre-built auto-retriever, this works similarly to our custom auto-retriever above
+vector_auto_retriever = VectorIndexAutoRetriever(
+    vector_index, vector_store_info=vector_store_info
+)
+
+retriever_query_engine = RetrieverQueryEngine.from_args(
+    vector_auto_retriever,
+)
+```
+
+
+```python
+sql_tool = QueryEngineTool.from_defaults(
+    query_engine=query_engine,
+    name="sql_tool",
+    description=(
+        "Useful for translating a natural language query into a SQL query over"
+        " a table containing: city_stats, containing the population/country of"
+        " each city"
+    ),
+)
+vector_tool = QueryEngineTool.from_defaults(
+    query_engine=retriever_query_engine,
+    name="vector_tool",
+    description=(
+        "Useful for answering semantic questions about different cities"
+    ),
+)
+```
+
+#### Initialize Agent
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent
+from llama_index.llms.openai import OpenAI
+from llama_index.core.workflow import Context
+
+agent = FunctionAgent(
+    tools=[sql_tool, vector_tool],
+    llm=OpenAI(model="gpt-4o"),
+)
+
+# hold the context/session state for the agent
+ctx = Context(agent)
+```
+
+
+```python
+from llama_index.core.agent.workflow import (
+    ToolCallResult,
+    ToolCall,
+    AgentStream,
+    AgentInput,
+    AgentOutput,
+)
+
+handler = agent.run(
+    "Tell me about the arts and culture of the city with the highest population. ",
+    ctx=ctx,
+)
+
+async for ev in handler.stream_events():
+    if isinstance(ev, ToolCallResult):
+        print(
+            f"\nCalled tool {ev.tool_name} with args {ev.tool_kwargs}, got response: {ev.tool_output}"
+        )
+    elif isinstance(ev, AgentStream):
+        print(ev.delta, end="", flush=True)
+
+response = await handler
+```
+
+    
+    Called tool sql_tool with args {'input': 'SELECT city FROM city_stats ORDER BY population DESC LIMIT 1;'}, got response: The city with the highest population is Tokyo.
+    
+    Called tool vector_tool with args {'input': 'Tell me about the arts and culture of Tokyo.'}, got response: Tokyo boasts a vibrant arts and culture scene, characterized by a diverse range of museums, galleries, and performance venues. Ueno Park is a cultural hub, housing the Tokyo National Museum, which specializes in traditional Japanese art, alongside the National Museum of Western Art, a UNESCO World Heritage site, and the National Museum of Nature and Science. The park also features Ueno Zoo, known for its giant pandas.
+    
+    The city is home to numerous notable museums, including the Artizon Museum, the National Museum of Emerging Science and Innovation, and the Edo-Tokyo Museum, which explores the city's history. Contemporary art is showcased at the Mori Art Museum and the Sumida Hokusai Museum, while the Sompo Museum of Art is recognized for its collection, including Van Gogh's "Sunflowers."
+    
+    The performing arts thrive in Tokyo, with venues like the National Noh Theatre and Kabuki-za dedicated to traditional Japanese theatre. The New National Theatre Tokyo hosts a variety of performances, including opera and ballet. Major concert venues such as the Nippon Budokan and Tokyo Dome frequently feature popular music acts.
+    
+    Tokyo's nightlife is vibrant, particularly in districts like Shibuya and Roppongi, which are filled with bars, clubs, and live music venues. The city is also known for its festivals, such as the Sannō Matsuri and the Sanja Festival, which celebrate traditional culture.
+    
+    Shopping districts like Ginza and Nihombashi offer a blend of high-end retail and cultural experiences, while areas like Jinbōchō are famous for their literary connections, featuring bookstores and cafes linked to renowned authors. Overall, Tokyo's arts and culture reflect a rich tapestry of traditional and contemporary influences, making it a dynamic city for cultural exploration.
+    Tokyo, the city with the highest population, boasts a vibrant arts and culture scene. It features a diverse range of museums, galleries, and performance venues. Ueno Park serves as a cultural hub, housing the Tokyo National Museum, the National Museum of Western Art, and the National Museum of Nature and Science. The park also includes Ueno Zoo, known for its giant pandas.
+    
+    Notable museums in Tokyo include the Artizon Museum, the National Museum of Emerging Science and Innovation, and the Edo-Tokyo Museum, which explores the city's history. Contemporary art is showcased at the Mori Art Museum and the Sumida Hokusai Museum, while the Sompo Museum of Art is recognized for its collection, including Van Gogh's "Sunflowers."
+    
+    The performing arts thrive with venues like the National Noh Theatre and Kabuki-za dedicated to traditional Japanese theatre. The New National Theatre Tokyo hosts a variety of performances, including opera and ballet. Major concert venues such as the Nippon Budokan and Tokyo Dome frequently feature popular music acts.
+    
+    Tokyo's nightlife is vibrant, especially in districts like Shibuya and Roppongi, filled with bars, clubs, and live music venues. The city is also known for its festivals, such as the Sannō Matsuri and the Sanja Festival, celebrating traditional culture.
+    
+    Shopping districts like Ginza and Nihombashi offer a blend of high-end retail and cultural experiences, while areas like Jinbōchō are famous for their literary connections, featuring bookstores and cafes linked to renowned authors. Overall, Tokyo's arts and culture reflect a rich tapestry of traditional and contemporary influences, making it a dynamic city for cultural exploration.
+
+
+```python
+handler = agent.run("Tell me about the history of Berlin", ctx=ctx)
+
+async for ev in handler.stream_events():
+    if isinstance(ev, ToolCallResult):
+        print(
+            f"\nCalled tool {ev.tool_name} with args {ev.tool_kwargs}, got response: {ev.tool_output}"
+        )
+    elif isinstance(ev, AgentStream):
+        print(ev.delta, end="", flush=True)
+
+response = await handler
+```
+
+    
+    Called tool vector_tool with args {'input': 'Tell me about the history of Berlin.'}, got response: Berlin's history dates back to prehistoric times, with evidence of human settlements as early as 60,000 BC. The area saw the emergence of various cultures, including the Maglemosian culture around 9,000 BC and the Lusatian culture around 2,000 BC, as dense human settlements developed along the Spree and Havel rivers. By 500 BC, Germanic tribes began to settle in the region, followed by Slavic tribes in the 7th century.
+    
+    In the 12th century, the region came under German rule with the establishment of the Margraviate of Brandenburg. The first written records of towns in the area appear in the late 12th century, with Berlin's founding date considered to be 1237. The towns of Berlin and Cölln formed close economic ties and eventually merged, with the Hohenzollern family ruling the area from the 14th century until 1918.
+    
+    The Thirty Years' War in the 17th century devastated Berlin, leading to significant population loss. However, under Frederick William, known as the "Great Elector," the city experienced a revival through policies promoting immigration and religious tolerance. The establishment of the Kingdom of Prussia in 1701 marked a significant turning point, with Berlin becoming its capital.
+    
+    The 19th century brought the Industrial Revolution, transforming Berlin into a major economic center and leading to rapid population growth. By the late 19th century, Berlin was the capital of the newly founded German Empire. The city continued to grow and evolve through the 20th century, experiencing significant events such as World War II, the division into East and West Berlin during the Cold War, and reunification in 1990, when it once again became the capital of a unified Germany.
+    
+    Today, Berlin is recognized as a global city of culture, politics, media, and science, with a diverse economy and rich historical heritage.
+    Berlin's history is rich and varied, dating back to prehistoric times with evidence of human settlements as early as 60,000 BC. The area saw the emergence of various cultures, including the Maglemosian culture around 9,000 BC and the Lusatian culture around 2,000 BC, with dense settlements along the Spree and Havel rivers. By 500 BC, Germanic tribes settled in the region, followed by Slavic tribes in the 7th century.
+    
+    In the 12th century, the region came under German rule with the establishment of the Margraviate of Brandenburg. Berlin's founding date is considered to be 1237, with the towns of Berlin and Cölln forming close economic ties and eventually merging. The Hohenzollern family ruled the area from the 14th century until 1918.
+    
+    The Thirty Years' War in the 17th century devastated Berlin, but it experienced a revival under Frederick William, the "Great Elector," through policies promoting immigration and religious tolerance. The establishment of the Kingdom of Prussia in 1701 marked a significant turning point, with Berlin becoming its capital.
+    
+    The 19th century brought the Industrial Revolution, transforming Berlin into a major economic center and leading to rapid population growth. By the late 19th century, Berlin was the capital of the newly founded German Empire. The city continued to evolve through the 20th century, experiencing significant events such as World War II, the division into East and West Berlin during the Cold War, and reunification in 1990, when it once again became the capital of a unified Germany.
+    
+    Today, Berlin is recognized as a global city of culture, politics, media, and science, with a diverse economy and rich historical heritage.
+
+
+```python
+response = await agent.run(
+    "Can you give me the country corresponding to each city?", ctx=ctx
+)
+
+print(str(response))
+```
+
+    Here are the cities along with their corresponding countries:
+    
+    - Toronto is in Canada.
+    - Tokyo is in Japan.
+    - Berlin is in Germany.
+
@@ -0,0 +1,170 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/openai_agent_retrieval.ipynb
+toc: True
+title: "Retrieval-Augmented Agents"
+featured: False
+experimental: False
+tags: ['Agent', 'Integrations']
+language: py
+---
+In this tutorial, we show you how to use our `FunctionAgent` or `ReActAgent` implementation with a tool retriever, 
+to augment any existing agent and store/index an arbitrary number of tools. 
+
+Our indexing/retrieval modules help to remove the complexity of having too many functions to fit in the prompt.
+
+## Initial Setup 
+
+Let's start by importing some simple building blocks.  
+
+The main thing we need is:
+1. the OpenAI API
+2. a place to keep conversation history 
+3. a definition for tools that our agent can use.
+
+If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.
+
+
+```python
+%pip install llama-index
+```
+
+
+```python
+import os
+
+os.environ["OPENAI_API_KEY"] = "sk-..."
+```
+
+Let's define some very simple calculator tools for our agent.
+
+
+```python
+from llama_index.core.tools import FunctionTool
+
+
+def multiply(a: int, b: int) -> int:
+    """Multiply two integers and returns the result integer"""
+    return a * b
+
+
+def add(a: int, b: int) -> int:
+    """Add two integers and returns the result integer"""
+    return a + b
+
+
+def useless(a: int, b: int) -> int:
+    """Toy useless function."""
+    pass
+
+
+multiply_tool = FunctionTool.from_defaults(multiply, name="multiply")
+add_tool = FunctionTool.from_defaults(add, name="add")
+
+# toy-example of many tools
+useless_tools = [
+    FunctionTool.from_defaults(useless, name=f"useless_{str(idx)}")
+    for idx in range(28)
+]
+
+all_tools = [multiply_tool] + [add_tool] + useless_tools
+
+all_tools_map = {t.metadata.name: t for t in all_tools}
+```
+
+## Building an Object Index
+
+We have an `ObjectIndex` construct in LlamaIndex that allows the user to use our index data structures over arbitrary objects.
+The ObjectIndex will handle serialiation to/from the object, and use an underying index (e.g. VectorStoreIndex, SummaryIndex, KeywordTableIndex) as the storage mechanism. 
+
+In this case, we have a large collection of Tool objects, and we'd want to define an ObjectIndex over these Tools.
+
+The index comes bundled with a retrieval mechanism, an `ObjectRetriever`. 
+
+This can be passed in to our agent so that it can 
+perform Tool retrieval during query-time.
+
+
+```python
+# define an "object" index over these tools
+from llama_index.core import VectorStoreIndex
+from llama_index.core.objects import ObjectIndex
+
+obj_index = ObjectIndex.from_objects(
+    all_tools,
+    index_cls=VectorStoreIndex,
+    # if we were using an external vector store, we could pass the stroage context and any other kwargs
+    # storage_context=storage_context,
+    # embed_model=embed_model,
+    # ...
+)
+```
+
+To reload the index later, we can use the `from_objects_and_index` method.
+
+
+```python
+# from llama_index.core import StorageContext, load_index_from_storage
+
+# saving and loading from disk
+# obj_index.index.storage_context.persist(persist_dir="obj_index_storage")
+
+# reloading from disk
+# vector_index = load_index_from_storage(StorageContext.from_defaults(persist_dir="obj_index_storage"))
+
+# or if using an external vector store, no need to persist, just reload the index
+# vector_index = VectorStoreIndex.from_vector_store(vector_store=vector_store, ...)
+
+# Then, we can reload the ObjectIndex
+# obj_index = ObjectIndex.from_objects_and_index(
+#     all_tools,
+#     index=vector_index,
+# )
+```
+
+## Agent w/ Tool Retrieval 
+
+Agents in LlamaIndex can be used with a `ToolRetriever` to retrieve tools during query-time.
+
+During query-time, we would first use the `ObjectRetriever` to retrieve a set of relevant Tools. These tools would then be passed into the agent; more specifically, their function signatures would be passed into the OpenAI Function calling API. 
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent, ReActAgent
+from llama_index.core.workflow import Context
+from llama_index.llms.openai import OpenAI
+
+agent = FunctionAgent(
+    tool_retriever=obj_index.as_retriever(similarity_top_k=2),
+    llm=OpenAI(model="gpt-4o"),
+)
+
+# context to hold the session/state
+ctx = Context(agent)
+```
+
+
+```python
+resp = await agent.run(
+    "What's 212 multiplied by 122? Make sure to use Tools", ctx=ctx
+)
+print(str(resp))
+print(resp.tool_calls)
+```
+
+    The result of multiplying 212 by 122 is 25,864.
+    [ToolCallResult(tool_name='multiply', tool_kwargs={'a': 212, 'b': 122}, tool_id='call_4Ygos3MpRH7Gj3R79HISRGyH', tool_output=ToolOutput(content='25864', tool_name='multiply', raw_input={'args': (), 'kwargs': {'a': 212, 'b': 122}}, raw_output=25864, is_error=False), return_direct=False)]
+
+
+
+```python
+resp = await agent.run(
+    "What's 212 added to 122 ? Make sure to use Tools", ctx=ctx
+)
+print(str(resp))
+print(resp.tool_calls)
+```
+
+    The result of adding 212 to 122 is 334.
+    [ToolCallResult(tool_name='add', tool_kwargs={'a': 212, 'b': 122}, tool_id='call_rXUfwQ477bcd6bxafQHgETaa', tool_output=ToolOutput(content='334', tool_name='add', raw_input={'args': (), 'kwargs': {'a': 212, 'b': 122}}, raw_output=334, is_error=False), return_direct=False)]
+
@@ -0,0 +1,158 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/openai_agent_with_query_engine.ipynb
+toc: True
+title: "Agent with Query Engine Tools"
+featured: False
+experimental: False
+tags: ['Agent', 'Integrations']
+language: py
+---
+## Build Query Engine Tools
+
+If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.
+
+
+```python
+%pip install llama-index
+```
+
+
+```python
+import os
+
+os.environ["OPENAI_API_KEY"] = "sk-..."
+```
+
+
+```python
+from llama_index.llms.openai import OpenAI
+from llama_index.embeddings.openai import OpenAIEmbedding
+from llama_index.core import Settings
+
+Settings.llm = OpenAI(model="gpt-4o-mini")
+Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
+```
+
+
+```python
+from llama_index.core import StorageContext, load_index_from_storage
+
+try:
+    storage_context = StorageContext.from_defaults(
+        persist_dir="./storage/lyft"
+    )
+    lyft_index = load_index_from_storage(storage_context)
+
+    storage_context = StorageContext.from_defaults(
+        persist_dir="./storage/uber"
+    )
+    uber_index = load_index_from_storage(storage_context)
+
+    index_loaded = True
+except:
+    index_loaded = False
+```
+
+Download Data
+
+
+```python
+!mkdir -p 'data/10k/'
+!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/uber_2021.pdf' -O 'data/10k/uber_2021.pdf'
+!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/lyft_2021.pdf' -O 'data/10k/lyft_2021.pdf'
+```
+
+
+```python
+from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
+
+if not index_loaded:
+    # load data
+    lyft_docs = SimpleDirectoryReader(
+        input_files=["./data/10k/lyft_2021.pdf"]
+    ).load_data()
+    uber_docs = SimpleDirectoryReader(
+        input_files=["./data/10k/uber_2021.pdf"]
+    ).load_data()
+
+    # build index
+    lyft_index = VectorStoreIndex.from_documents(lyft_docs)
+    uber_index = VectorStoreIndex.from_documents(uber_docs)
+
+    # persist index
+    lyft_index.storage_context.persist(persist_dir="./storage/lyft")
+    uber_index.storage_context.persist(persist_dir="./storage/uber")
+```
+
+
+```python
+lyft_engine = lyft_index.as_query_engine(similarity_top_k=3)
+uber_engine = uber_index.as_query_engine(similarity_top_k=3)
+```
+
+
+```python
+from llama_index.core.tools import QueryEngineTool
+
+query_engine_tools = [
+    QueryEngineTool.from_defaults(
+        query_engine=lyft_engine,
+        name="lyft_10k",
+        description=(
+            "Provides information about Lyft financials for year 2021. "
+            "Use a detailed plain text question as input to the tool."
+        ),
+    ),
+    QueryEngineTool.from_defaults(
+        query_engine=uber_engine,
+        name="uber_10k",
+        description=(
+            "Provides information about Uber financials for year 2021. "
+            "Use a detailed plain text question as input to the tool."
+        ),
+    ),
+]
+```
+
+## Setup Agent
+
+For LLMs like OpenAI that have a function calling API, we should use the `FunctionAgent`.
+
+For other LLMs, we can use the `ReActAgent`.
+
+
+```python
+from llama_index.core.agent.workflow import FunctionAgent, ReActAgent
+from llama_index.core.workflow import Context
+
+agent = FunctionAgent(tools=query_engine_tools, llm=OpenAI(model="gpt-4o"))
+
+# context to hold the session/state
+ctx = Context(agent)
+```
+
+## Let's Try It Out!
+
+
+```python
+from llama_index.core.agent.workflow import ToolCallResult, AgentStream
+
+handler = agent.run("What's the revenue for Lyft in 2021 vs Uber?", ctx=ctx)
+
+async for ev in handler.stream_events():
+    if isinstance(ev, ToolCallResult):
+        print(
+            f"Call {ev.tool_name} with args {ev.tool_kwargs}\nReturned: {ev.tool_output}"
+        )
+    elif isinstance(ev, AgentStream):
+        print(ev.delta, end="", flush=True)
+
+response = await handler
+```
+
+    Call lyft_10k with args {'input': "What was Lyft's revenue for the year 2021?"}
+    Returned: Lyft's revenue for the year 2021 was $3,208,323,000.
+    Call uber_10k with args {'input': "What was Uber's revenue for the year 2021?"}
+    Returned: Uber's revenue for the year 2021 was $17.455 billion.
+    In 2021, Lyft's revenue was approximately $3.21 billion, while Uber's revenue was significantly higher at $17.455 billion.
@@ -0,0 +1,282 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/react_agent.ipynb
+toc: True
+title: "ReActAgent - A Simple Intro with Calculator Tools"
+featured: False
+experimental: False
+tags: ['Agent']
+language: py
+---
+This is a notebook that showcases the ReAct agent over very simple calculator tools (no fancy RAG pipelines or API calls).
+
+We show how it can reason step-by-step over different tools to achieve the end goal.
+
+The main advantage of the ReAct agent over a Function Calling agent is that it can work with any LLM regardless of whether it supports function calling.
+
+If you're opening this Notebook on colab, you will probably need to install LlamaIndex 🦙.
+
+
+```python
+%pip install llama-index
+```
+
+
+```python
+import os
+
+os.environ["OPENAI_API_KEY"] = "sk-..."
+```
+
+## Define Function Tools
+
+We setup some trivial `multiply` and `add` tools. Note that you can define arbitrary functions and pass it to the `FunctionTool` (which will process the docstring and parameter signature).
+
+
+```python
+def multiply(a: int, b: int) -> int:
+    """Multiply two integers and returns the result integer"""
+    return a * b
+
+
+def add(a: int, b: int) -> int:
+    """Add two integers and returns the result integer"""
+    return a + b
+```
+
+## Run Some Queries
+
+
+```python
+from llama_index.llms.openai import OpenAI
+from llama_index.core.agent.workflow import ReActAgent
+from llama_index.core.workflow import Context
+
+llm = OpenAI(model="gpt-4o-mini")
+agent = ReActAgent(tools=[multiply, add], llm=llm)
+
+# Create a context to store the conversation history/session state
+ctx = Context(agent)
+```
+
+## Run Some Example Queries
+
+By streaming the result, we can see the full response, including the thought process and tool calls.
+
+If we wanted to stream only the result, we can buffer the stream and start streaming once `Answer:` is in the response.
+
+
+
+```python
+from llama_index.core.agent.workflow import AgentStream, ToolCallResult
+
+handler = agent.run("What is 20+(2*4)?", ctx=ctx)
+
+async for ev in handler.stream_events():
+    # if isinstance(ev, ToolCallResult):
+    #     print(f"\nCall {ev.tool_name} with {ev.tool_kwargs}\nReturned: {ev.tool_output}")
+    if isinstance(ev, AgentStream):
+        print(f"{ev.delta}", end="", flush=True)
+
+response = await handler
+```
+
+    Thought: The current language of the user is: English. I need to use a tool to help me answer the question.
+    Action: multiply
+    Action Input: {"a": 2, "b": 4}Thought: Now I have the result of the multiplication, which is 8. I will add this to 20 to complete the calculation.
+    Action: add
+    Action Input: {'a': 20, 'b': 8}Thought: I can answer without using any more tools. I'll use the user's language to answer.
+    Answer: The result of 20 + (2 * 4) is 28.
+
+
+```python
+print(str(response))
+```
+
+    The result of 20 + (2 * 4) is 28.
+
+
+
+```python
+print(response.tool_calls)
+```
+
+    [ToolCallResult(tool_name='multiply', tool_kwargs={'a': 2, 'b': 4}, tool_id='a394d807-a9b7-42e0-8bff-f47a432d1530', tool_output=ToolOutput(content='8', tool_name='multiply', raw_input={'args': (), 'kwargs': {'a': 2, 'b': 4}}, raw_output=8, is_error=False), return_direct=False), ToolCallResult(tool_name='add', tool_kwargs={'a': 20, 'b': 8}, tool_id='784ccd85-ae9a-4184-9613-3696742064c7', tool_output=ToolOutput(content='28', tool_name='add', raw_input={'args': (), 'kwargs': {'a': 20, 'b': 8}}, raw_output=28, is_error=False), return_direct=False)]
+
+
+## View Prompts
+
+Let's take a look at the core system prompt powering the ReAct agent! 
+
+Within the agent, the current conversation history is dumped below this line.
+
+
+```python
+prompt_dict = agent.get_prompts()
+for k, v in prompt_dict.items():
+    print(f"Prompt: {k}\n\nValue: {v.template}")
+```
+
+    Prompt: react_header
+    
+    Value: You are designed to help with a variety of tasks, from answering questions to providing summaries to other types of analyses.
+    
+    ## Tools
+    
+    You have access to a wide variety of tools. You are responsible for using the tools in any sequence you deem appropriate to complete the task at hand.
+    This may require breaking the task into subtasks and using different tools to complete each subtask.
+    
+    You have access to the following tools:
+    {tool_desc}
+    
+    
+    ## Output Format
+    
+    Please answer in the same language as the question and use the following format:
+    
+    ```
+    Thought: The current language of the user is: (user's language). I need to use a tool to help me answer the question.
+    Action: tool name (one of {tool_names}) if using a tool.
+    Action Input: the input to the tool, in a JSON format representing the kwargs (e.g. {{"input": "hello world", "num_beams": 5}})
+    ```
+    
+    Please ALWAYS start with a Thought.
+    
+    NEVER surround your response with markdown code markers. You may use code markers within your response if you need to.
+    
+    Please use a valid JSON format for the Action Input. Do NOT do this {{'input': 'hello world', 'num_beams': 5}}.
+    
+    If this format is used, the tool will respond in the following format:
+    
+    ```
+    Observation: tool response
+    ```
+    
+    You should keep repeating the above format till you have enough information to answer the question without using any more tools. At that point, you MUST respond in one of the following two formats:
+    
+    ```
+    Thought: I can answer without using any more tools. I'll use the user's language to answer
+    Answer: [your answer here (In the same language as the user's question)]
+    ```
+    
+    ```
+    Thought: I cannot answer the question with the provided tools.
+    Answer: [your answer here (In the same language as the user's question)]
+    ```
+    
+    ## Current Conversation
+    
+    Below is the current conversation consisting of interleaving human and assistant messages.
+    
+
+
+### Customizing the Prompt
+
+For fun, let's try instructing the agent to output the answer along with reasoning in bullet points. See "## Additional Rules" section.
+
+
+```python
+from llama_index.core import PromptTemplate
+
+react_system_header_str = """\
+
+You are designed to help with a variety of tasks, from answering questions \
+    to providing summaries to other types of analyses.
+
+## Tools
+You have access to a wide variety of tools. You are responsible for using
+the tools in any sequence you deem appropriate to complete the task at hand.
+This may require breaking the task into subtasks and using different tools
+to complete each subtask.
+
+You have access to the following tools:
+{tool_desc}
+
+## Output Format
+To answer the question, please use the following format.
+
+```
+Thought: I need to use a tool to help me answer the question.
+Action: tool name (one of {tool_names}) if using a tool.
+Action Input: the input to the tool, in a JSON format representing the kwargs (e.g. {{"input": "hello world", "num_beams": 5}})
+```
+
+Please ALWAYS start with a Thought.
+
+Please use a valid JSON format for the Action Input. Do NOT do this {{'input': 'hello world', 'num_beams': 5}}.
+
+If this format is used, the user will respond in the following format:
+
+```
+Observation: tool response
+```
+
+You should keep repeating the above format until you have enough information
+to answer the question without using any more tools. At that point, you MUST respond
+in the one of the following two formats:
+
+```
+Thought: I can answer without using any more tools.
+Answer: [your answer here]
+```
+
+```
+Thought: I cannot answer the question with the provided tools.
+Answer: Sorry, I cannot answer your query.
+```
+
+## Additional Rules
+- The answer MUST contain a sequence of bullet points that explain how you arrived at the answer. This can include aspects of the previous conversation history.
+- You MUST obey the function signature of each tool. Do NOT pass in no arguments if the function expects arguments.
+
+## Current Conversation
+Below is the current conversation consisting of interleaving human and assistant messages.
+
+"""
+react_system_prompt = PromptTemplate(react_system_header_str)
+```
+
+
+```python
+agent.get_prompts()
+```
+
+
+
+
+    {'react_header': PromptTemplate(metadata={'prompt_type': <PromptType.CUSTOM: 'custom'>}, template_vars=['tool_desc', 'tool_names'], kwargs={}, output_parser=None, template_var_mappings=None, function_mappings=None, template='You are designed to help with a variety of tasks, from answering questions to providing summaries to other types of analyses.\n\n## Tools\n\nYou have access to a wide variety of tools. You are responsible for using the tools in any sequence you deem appropriate to complete the task at hand.\nThis may require breaking the task into subtasks and using different tools to complete each subtask.\n\nYou have access to the following tools:\n{tool_desc}\n\n\n## Output Format\n\nPlease answer in the same language as the question and use the following format:\n\n```\nThought: The current language of the user is: (user\'s language). I need to use a tool to help me answer the question.\nAction: tool name (one of {tool_names}) if using a tool.\nAction Input: the input to the tool, in a JSON format representing the kwargs (e.g. {{"input": "hello world", "num_beams": 5}})\n```\n\nPlease ALWAYS start with a Thought.\n\nNEVER surround your response with markdown code markers. You may use code markers within your response if you need to.\n\nPlease use a valid JSON format for the Action Input. Do NOT do this {{\'input\': \'hello world\', \'num_beams\': 5}}.\n\nIf this format is used, the tool will respond in the following format:\n\n```\nObservation: tool response\n```\n\nYou should keep repeating the above format till you have enough information to answer the question without using any more tools. At that point, you MUST respond in one of the following two formats:\n\n```\nThought: I can answer without using any more tools. I\'ll use the user\'s language to answer\nAnswer: [your answer here (In the same language as the user\'s question)]\n```\n\n```\nThought: I cannot answer the question with the provided tools.\nAnswer: [your answer here (In the same language as the user\'s question)]\n```\n\n## Current Conversation\n\nBelow is the current conversation consisting of interleaving human and assistant messages.\n')}
+
+
+
+
+```python
+agent.update_prompts({"react_header": react_system_prompt})
+```
+
+
+```python
+handler = agent.run("What is 5+3+2")
+
+async for ev in handler.stream_events():
+    # if isinstance(ev, ToolCallResult):
+    #     print(f"\nCall {ev.tool_name} with {ev.tool_kwargs}\nReturned: {ev.tool_output}")
+    if isinstance(ev, AgentStream):
+        print(f"{ev.delta}", end="", flush=True)
+
+response = await handler
+```
+
+    Thought: The current language of the user is: English. I need to use a tool to help me answer the question.
+    Action: add
+    Action Input: {"a": 5, "b": 3}Thought: I need to add the result (8) to the remaining number (2).
+    Action: add
+    Action Input: {'a': 8, 'b': 2}Thought: I can answer without using any more tools. I'll use the user's language to answer.
+    Answer: The result of 5 + 3 + 2 is 10.
+
+
+```python
+print(response)
+```
+
+    The result of 5 + 3 + 2 is 10.
+
@@ -0,0 +1,215 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/react_agent_with_query_engine.ipynb
+toc: True
+title: "ReAct Agent with Query Engine (RAG) Tools"
+featured: False
+experimental: False
+tags: ['Agent']
+language: py
+---
+In this section, we show how to setup an agent powered by the ReAct loop for financial analysis.
+
+The agent has access to two "tools": one to query the 2021 Lyft 10-K and the other to query the 2021 Uber 10-K.
+
+Note that you can plug in any LLM to use as a ReAct agent.
+
+## Build Query Engine Tools
+
+
+```python
+%pip install llama-index
+```
+
+
+```python
+import os
+
+os.environ["OPENAI_API_KEY"] = "sk-..."
+```
+
+
+```python
+from llama_index.llms.openai import OpenAI
+from llama_index.embeddings.openai import OpenAIEmbedding
+from llama_index.core import Settings
+
+Settings.llm = OpenAI(model="gpt-4o-mini")
+Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
+```
+
+
+```python
+from llama_index.core import StorageContext, load_index_from_storage
+
+try:
+    storage_context = StorageContext.from_defaults(
+        persist_dir="./storage/lyft"
+    )
+    lyft_index = load_index_from_storage(storage_context)
+
+    storage_context = StorageContext.from_defaults(
+        persist_dir="./storage/uber"
+    )
+    uber_index = load_index_from_storage(storage_context)
+
+    index_loaded = True
+except:
+    index_loaded = False
+```
+
+Download Data
+
+
+```python
+!mkdir -p 'data/10k/'
+!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/uber_2021.pdf' -O 'data/10k/uber_2021.pdf'
+!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/lyft_2021.pdf' -O 'data/10k/lyft_2021.pdf'
+```
+
+
+```python
+from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
+
+if not index_loaded:
+    # load data
+    lyft_docs = SimpleDirectoryReader(
+        input_files=["./data/10k/lyft_2021.pdf"]
+    ).load_data()
+    uber_docs = SimpleDirectoryReader(
+        input_files=["./data/10k/uber_2021.pdf"]
+    ).load_data()
+
+    # build index
+    lyft_index = VectorStoreIndex.from_documents(lyft_docs)
+    uber_index = VectorStoreIndex.from_documents(uber_docs)
+
+    # persist index
+    lyft_index.storage_context.persist(persist_dir="./storage/lyft")
+    uber_index.storage_context.persist(persist_dir="./storage/uber")
+```
+
+
+```python
+lyft_engine = lyft_index.as_query_engine(similarity_top_k=3)
+uber_engine = uber_index.as_query_engine(similarity_top_k=3)
+```
+
+
+```python
+from llama_index.core.tools import QueryEngineTool
+
+query_engine_tools = [
+    QueryEngineTool.from_defaults(
+        query_engine=lyft_engine,
+        name="lyft_10k",
+        description=(
+            "Provides information about Lyft financials for year 2021. "
+            "Use a detailed plain text question as input to the tool."
+        ),
+    ),
+    QueryEngineTool.from_defaults(
+        query_engine=uber_engine,
+        name="uber_10k",
+        description=(
+            "Provides information about Uber financials for year 2021. "
+            "Use a detailed plain text question as input to the tool."
+        ),
+    ),
+]
+```
+
+## Setup ReAct Agent
+
+Here we setup our ReAct agent with the tools we created above.
+
+You can **optionally** specify a system prompt which will be added to the core ReAct system prompt.
+
+
+```python
+from llama_index.core.agent.workflow import ReActAgent
+from llama_index.core.workflow import Context
+
+agent = ReActAgent(
+    tools=query_engine_tools,
+    llm=OpenAI(model="gpt-4o-mini"),
+    # system_prompt="..."
+)
+
+# context to hold this session/state
+
+ctx = Context(agent)
+```
+
+## Run Some Example Queries
+
+By streaming the result, we can see the full response, including the thought process and tool calls.
+
+If we wanted to stream only the result, we can buffer the stream and start streaming once `Answer:` is in the response.
+
+
+
+```python
+from llama_index.core.agent.workflow import ToolCallResult, AgentStream
+
+handler = agent.run("What was Lyft's revenue growth in 2021?", ctx=ctx)
+
+async for ev in handler.stream_events():
+    # if isinstance(ev, ToolCallResult):
+    #     print(f"\nCall {ev.tool_name} with {ev.tool_kwargs}\nReturned: {ev.tool_output}")
+    if isinstance(ev, AgentStream):
+        print(f"{ev.delta}", end="", flush=True)
+
+response = await handler
+```
+
+    Thought: The current language of the user is: English. I need to use a tool to help me answer the question.
+    Action: lyft_10k
+    Action Input: {"input": "What was Lyft's revenue growth in 2021?"}Thought: I can answer without using any more tools. I'll use the user's language to answer.
+    Answer: Lyft's revenue growth in 2021 was 36% compared to the prior year.
+
+
+```python
+print(str(response))
+```
+
+    Lyft's revenue growth in 2021 was 36% compared to the prior year.
+
+
+
+```python
+handler = agent.run(
+    "Compare and contrast the revenue growth of Uber and Lyft in 2021, then give an analysis",
+    ctx=ctx,
+)
+
+async for ev in handler.stream_events():
+    # if isinstance(ev, ToolCallResult):
+    #     print(f"\nCall {ev.tool_name} with {ev.tool_kwargs}\nReturned: {ev.tool_output}")
+    if isinstance(ev, AgentStream):
+        print(f"{ev.delta}", end="", flush=True)
+
+response = await handler
+```
+
+    Thought: The current language of the user is: English. I need to use a tool to gather information about Uber's revenue growth in 2021 to compare it with Lyft's.
+    Action: uber_10k
+    Action Input: {'input': "What was Uber's revenue growth in 2021?"}Thought: I now have the revenue growth information for both Uber and Lyft in 2021. Lyft's revenue growth was 36%, while Uber's was 57%. I will now provide a comparison and analysis.
+    Thought: I can answer without using any more tools. I'll use the user's language to answer.
+    Answer: In 2021, Uber experienced a revenue growth of 57%, increasing from $11.139 billion in 2020 to $17.455 billion. In contrast, Lyft's revenue growth was 36%. 
+    
+    When comparing the two, Uber outperformed Lyft in terms of revenue growth, indicating a stronger recovery or expansion in its business operations during that year. This could be attributed to Uber's diversified services, including food delivery through Uber Eats, which may have contributed significantly to its revenue. Lyft, primarily focused on ride-sharing, may have faced more challenges in scaling its growth compared to Uber.
+    
+    Overall, while both companies showed positive growth, Uber's higher percentage suggests it was able to capitalize on market opportunities more effectively than Lyft in 2021.
+
+
+```python
+print(str(response))
+```
+
+    In 2021, Uber experienced a revenue growth of 57%, increasing from $11.139 billion in 2020 to $17.455 billion. In contrast, Lyft's revenue growth was 36%. 
+    
+    When comparing the two, Uber outperformed Lyft in terms of revenue growth, indicating a stronger recovery or expansion in its business operations during that year. This could be attributed to Uber's diversified services, including food delivery through Uber Eats, which may have contributed significantly to its revenue. Lyft, primarily focused on ride-sharing, may have faced more challenges in scaling its growth compared to Uber.
+    
+    Overall, while both companies showed positive growth, Uber's higher percentage suggests it was able to capitalize on market opportunities more effectively than Lyft in 2021.
+
@@ -0,0 +1,250 @@
+---
+layout: recipe
+colab: https://colab.research.google.com/github/run-llama/cookbooks-demo/blob/main/notebooks/agent/return_direct_agent.ipynb
+toc: True
+title: "Controlling Agent Reasoning Loop with Return Direct Tools"
+featured: False
+experimental: False
+tags: ['Agent']
+language: py
+---
+All tools have an option for `return_direct` -- if this is set to `True`, and the associated tool is called (without any other tools being called), the agent reasoning loop is ended and the tool output is returned directly.
+
+This can be useful for speeding up response times when you know the tool output is good enough, to avoid the agent re-writing the response, and for ending the reasoning loop.
+
+This notebook walks through a notebook where an agent needs to gather information from a user in order to make a restaurant booking.
+
+
+```python
+%pip install llama-index-core llama-index-llms-anthropic
+```
+
+
+```python
+import os
+
+os.environ["ANTHROPIC_API_KEY"] = "sk-..."
+```
+
+## Tools setup
+
+
+```python
+from typing import Optional
+
+from llama_index.core.tools import FunctionTool
+from pydantic import BaseModel
+
+# we will store booking under random IDs
+bookings = {}
+
+
+# we will represent and track the state of a booking as a Pydantic model
+class Booking(BaseModel):
+    name: Optional[str] = None
+    email: Optional[str] = None
+    phone: Optional[str] = None
+    date: Optional[str] = None
+    time: Optional[str] = None
+
+
+def get_booking_state(user_id: str) -> str:
+    """Get the current state of a booking for a given booking ID."""
+    try:
+        return str(bookings[user_id].dict())
+    except:
+        return f"Booking ID {user_id} not found"
+
+
+def update_booking(user_id: str, property: str, value: str) -> str:
+    """Update a property of a booking for a given booking ID. Only enter details that are explicitly provided."""
+    booking = bookings[user_id]
+    setattr(booking, property, value)
+    return f"Booking ID {user_id} updated with {property} = {value}"
+
+
+def create_booking(user_id: str) -> str:
+    """Create a new booking and return the booking ID."""
+    bookings[user_id] = Booking()
+    return "Booking created, but not yet confirmed. Please provide your name, email, phone, date, and time."
+
+
+def confirm_booking(user_id: str) -> str:
+    """Confirm a booking for a given booking ID."""
+    booking = bookings[user_id]
+
+    if booking.name is None:
+        raise ValueError("Please provide your name.")
+
+    if booking.email is None:
+        raise ValueError("Please provide your email.")
+
+    if booking.phone is None:
+        raise ValueError("Please provide your phone number.")
+
+    if booking.date is None:
+        raise ValueError("Please provide the date of your booking.")
+
+    if booking.time is None:
+        raise ValueError("Please provide the time of your booking.")
+
+    return f"Booking ID {user_id} confirmed!"
+
+
+# create tools for each function
+get_booking_state_tool = FunctionTool.from_defaults(fn=get_booking_state)
+update_booking_tool = FunctionTool.from_defaults(fn=update_booking)
+create_booking_tool = FunctionTool.from_defaults(
+    fn=create_booking, return_direct=True
+)
+confirm_booking_tool = FunctionTool.from_defaults(
+    fn=confirm_booking, return_direct=True
+)
+```
+
+## A user has walked in! Let's help them make a booking
+
+
+```python
+from llama_index.llms.anthropic import Anthropic
+from llama_index.core.llms import ChatMessage
+from llama_index.core.agent.workflow import FunctionAgent
+from llama_index.core.workflow import Context
+
+llm = Anthropic(model="claude-3-sonnet-20240229", temperature=0.1)
+
+user = "user123"
+system_prompt = f"""You are now connected to the booking system and helping {user} with making a booking.
+Only enter details that the user has explicitly provided.
+Do not make up any details.
+"""
+
+agent = FunctionAgent(
+    tools=[
+        get_booking_state_tool,
+        update_booking_tool,
+        create_booking_tool,
+        confirm_booking_tool,
+    ],
+    llm=llm,
+    system_prompt=system_prompt,
+)
+
+# create a context for the agent to hold the state/history of a session
+ctx = Context(agent)
+```
+
+
+```python
+from llama_index.core.agent.workflow import AgentStream, ToolCallResult
+
+handler = agent.run(
+    "Hello! I would like to make a booking, around 5pm?", ctx=ctx
+)
+
+async for ev in handler.stream_events():
+    if isinstance(ev, AgentStream):
+        print(f"{ev.delta}", end="", flush=True)
+    elif isinstance(ev, ToolCallResult):
+        print(
+            f"\nCall {ev.tool_name} with {ev.tool_kwargs}\nReturned: {ev.tool_output}"
+        )
+
+response = await handler
+```
+
+    Okay, let's create a new booking for you.{"user_id": "user123"}
+    Call create_booking with {'user_id': 'user123'}
+    Returned: Booking created, but not yet confirmed. Please provide your name, email, phone, date, and time.
+
+
+
+```python
+print(str(response))
+```
+
+    Booking created, but not yet confirmed. Please provide your name, email, phone, date, and time.
+
+
+Perfect, we can see the function output was retruned directly, with no modification or final LLM call!
+
+
+```python
+handler = agent.run(
+    "Sure! My name is Logan, and my email is test@gmail.com?", ctx=ctx
+)
+
+async for ev in handler.stream_events():
+    if isinstance(ev, AgentStream):
+        print(f"{ev.delta}", end="", flush=True)
+    elif isinstance(ev, ToolCallResult):
+        print(
+            f"\nCall {ev.tool_name} with {ev.tool_kwargs}\nReturned: {ev.tool_output}"
+        )
+
+response = await handler
+```
+
+    Got it, thanks for providing your name and email. I've updated the booking with that information.{"user_id": "user123", "property": "name", "value": "Logan"}{"user_id": "user123", "property": "email", "value": "test@gmail.com"}
+    Call update_booking with {'user_id': 'user123', 'property': 'name', 'value': 'Logan'}
+    Returned: Booking ID user123 updated with name = Logan
+    
+    Call update_booking with {'user_id': 'user123', 'property': 'email', 'value': 'test@gmail.com'}
+    Returned: Booking ID user123 updated with email = test@gmail.com
+    Please also provide your phone number, preferred date, and time for the booking.
+
+
+```python
+print(str(response))
+```
+
+    Please also provide your phone number, preferred date, and time for the booking.
+
+
+
+```python
+handler = agent.run(
+    "Right! My phone number is 1234567890, the date of the booking is April 5, at 5pm.",
+    ctx=ctx,
+)
+
+async for ev in handler.stream_events():
+    if isinstance(ev, AgentStream):
+        print(f"{ev.delta}", end="", flush=True)
+    elif isinstance(ev, ToolCallResult):
+        print(
+            f"\nCall {ev.tool_name} with {ev.tool_kwargs}\nReturned: {ev.tool_output}"
+        )
+
+response = await handler
+```
+
+    Great, thank you for providing the additional details. I've updated the booking with your phone number, date, and time.{"user_id": "user123", "property": "phone", "value": "1234567890"}{"user_id": "user123", "property": "date", "value": "2023-04-05"}{"user_id": "user123", "property": "time", "value": "17:00"}
+    Call update_booking with {'user_id': 'user123', 'property': 'phone', 'value': '1234567890'}
+    Returned: Booking ID user123 updated with phone = 1234567890
+    
+    Call update_booking with {'user_id': 'user123', 'property': 'date', 'value': '2023-04-05'}
+    Returned: Booking ID user123 updated with date = 2023-04-05
+    
+    Call update_booking with {'user_id': 'user123', 'property': 'time', 'value': '17:00'}
+    Returned: Booking ID user123 updated with time = 17:00
+    Looks like I have all the necessary details. Let me confirm this booking for you.{"user_id": "user123"}
+    Call confirm_booking with {'user_id': 'user123'}
+    Returned: Booking ID user123 confirmed!
+
+
+
+```python
+print(str(response))
+```
+
+    Booking ID user123 confirmed!
+
+
+
+```python
+print(bookings["user123"])
+```
+
+    name='Logan' email='test@gmail.com' phone='1234567890' date='2023-04-05' time='17:00'
+
@@ -0,0 +1,644 @@
+#!/usr/bin/env python3
+"""
+Cookbook Documentation Generator
+
+Reads index.toml and generates an HTML documentation page with filtering and search.
+"""
+
+import tomllib
+from pathlib import Path
+from typing import List, Dict, Set
+import json
+from urllib.parse import urljoin
+
+class CookbookGenerator:
+    def __init__(self, toml_path: str, output_path: str = "cookbooks.html"):
+        self.toml_path = toml_path
+        self.output_path = output_path
+        self.data = None
+        
+    def load_toml(self):
+        """Load and parse the index.toml file"""
+        with open(self.toml_path, 'rb') as f:
+            self.data = tomllib.load(f)
+    
+    def get_all_tags(self) -> Set[str]:
+        """Extract all unique tags from recipes"""
+        tags = set()
+        for recipe in self.data.get('recipe', []):
+            tags.update(recipe.get('tags', []))
+        return sorted(tags)
+    
+    def get_all_languages(self) -> Set[str]:
+        """Extract all unique languages from recipes"""
+        languages = set()
+        for recipe in self.data.get('recipe', []):
+            lang = recipe.get('language', '')
+            if lang:
+                languages.add(lang)
+        return sorted(languages)
+    
+    def generate_notebook_url(self, recipe: Dict) -> str:
+        """Generate the notebook URL based on config and recipe data"""
+        config = self.data.get('config', {})
+        colab_base = config.get('colab', '')
+        
+        if 'notebook' in recipe:
+            return urljoin(colab_base, recipe['notebook'])
+        elif 'source' in recipe:
+            return recipe['source']
+        return "#"
+    
+    def generate_html(self) -> str:
+        """Generate the complete HTML page"""
+        if not self.data:
+            raise ValueError("No data loaded. Call load_toml() first.")
+        
+        recipes = self.data.get('recipe', [])
+        all_tags = self.get_all_tags()
+        all_languages = self.get_all_languages()
+        
+        # Convert recipes to JSON for JavaScript
+        recipes_json = json.dumps([
+            {
+                'title': recipe.get('title', ''),
+                'description': recipe.get('description', ''),
+                'tags': recipe.get('tags', []),
+                'language': recipe.get('language', ''),
+                'url': self.generate_notebook_url(recipe),
+                'featured': recipe.get('featured', False),
+                'experimental': recipe.get('experimental', False)
+            }
+            for recipe in recipes
+        ])
+        
+        html_template = f"""<!DOCTYPE html>
+<html lang="en">
+<head>
+    <meta charset="UTF-8">
+    <meta name="viewport" content="width=device-width, initial-scale=1.0">
+    <title>Cookbooks Documentation</title>
+    <style>
+        * {{
+            margin: 0;
+            padding: 0;
+            box-sizing: border-box;
+        }}
+        
+        body {{
+            font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
+            background-color: #f8f9fa;
+            color: #333;
+            line-height: 1.6;
+        }}
+        
+        .container {{
+            max-width: 1200px;
+            margin: 0 auto;
+            padding: 20px;
+        }}
+        
+        .header {{
+            background: white;
+            padding: 40px 20px;
+            margin-bottom: 30px;
+            border-radius: 8px;
+            box-shadow: 0 2px 4px rgba(0,0,0,0.1);
+            text-align: center;
+        }}
+        
+        .page-title {{
+            font-size: 2.5rem;
+            font-weight: 700;
+            margin-bottom: 10px;
+            color: #1a1a1a;
+        }}
+        
+        .page-subtitle {{
+            font-size: 1.1rem;
+            color: #666;
+            max-width: 600px;
+            margin: 0 auto;
+        }}
+        
+        .controls-section {{
+            background: white;
+            padding: 30px;
+            margin-bottom: 20px;
+            border-radius: 8px;
+            box-shadow: 0 2px 4px rgba(0,0,0,0.1);
+        }}
+        
+        .search-filters-row {{
+            display: flex;
+            gap: 20px;
+            align-items: flex-end;
+            flex-wrap: wrap;
+            margin-bottom: 20px;
+        }}
+        
+        .search-container {{
+            flex: 1;
+            min-width: 300px;
+        }}
+        
+        .filter-group {{
+            min-width: 150px;
+        }}
+        
+        .label {{
+            display: block;
+            font-weight: 600;
+            margin-bottom: 8px;
+            font-size: 0.9rem;
+            color: #374151;
+        }}
+        
+        .search-input, .filter-select {{
+            width: 100%;
+            padding: 12px;
+            border: 2px solid #e5e7eb;
+            border-radius: 6px;
+            font-size: 16px;
+            transition: border-color 0.2s;
+        }}
+        
+        .search-input:focus, .filter-select:focus {{
+            outline: none;
+            border-color: #3b82f6;
+            box-shadow: 0 0 0 3px rgba(59, 130, 246, 0.1);
+        }}
+        
+        .active-filters {{
+            display: flex;
+            gap: 10px;
+            flex-wrap: wrap;
+            align-items: center;
+        }}
+        
+        .filter-tag {{
+            background: #3b82f6;
+            color: white;
+            padding: 6px 12px;
+            border-radius: 20px;
+            font-size: 0.8rem;
+            display: flex;
+            align-items: center;
+            gap: 6px;
+        }}
+        
+        .filter-tag .remove {{
+            cursor: pointer;
+            font-weight: bold;
+            padding: 2px;
+        }}
+        
+        .clear-all {{
+            background: #ef4444;
+            color: white;
+            border: none;
+            padding: 6px 12px;
+            border-radius: 4px;
+            cursor: pointer;
+            font-size: 0.8rem;
+        }}
+        
+        .results-info {{
+            background: #f0f9ff;
+            border: 1px solid #0ea5e9;
+            border-left: 4px solid #0ea5e9;
+            padding: 12px 20px;
+            margin-bottom: 20px;
+            border-radius: 4px;
+            font-size: 0.9rem;
+            color: #0c4a6e;
+        }}
+        
+        .cookbooks-grid {{
+            display: grid;
+            grid-template-columns: repeat(auto-fill, minmax(380px, 1fr));
+            gap: 24px;
+        }}
+        
+        .cookbook-card {{
+            background: white;
+            border-radius: 8px;
+            padding: 24px;
+            box-shadow: 0 2px 8px rgba(0,0,0,0.1);
+            transition: all 0.2s;
+            position: relative;
+            border: 2px solid transparent;
+        }}
+        
+        .cookbook-card:hover {{
+            transform: translateY(-2px);
+            box-shadow: 0 4px 16px rgba(0,0,0,0.15);
+            border-color: #3b82f6;
+        }}
+        
+        .cookbook-card.featured {{
+            border-color: #f59e0b;
+            background: linear-gradient(135deg, #fefbf3 0%, #ffffff 100%);
+        }}
+        
+        .cookbook-card.experimental {{
+            border-color: #8b5cf6;
+            background: linear-gradient(135deg, #faf5ff 0%, #ffffff 100%);
+        }}
+        
+        .featured-badge, .experimental-badge {{
+            position: absolute;
+            top: -8px;
+            right: 16px;
+            padding: 4px 12px;
+            border-radius: 12px;
+            font-size: 0.7rem;
+            font-weight: 600;
+            text-transform: uppercase;
+        }}
+        
+        .featured-badge {{
+            background: #f59e0b;
+            color: white;
+        }}
+        
+        .experimental-badge {{
+            background: #8b5cf6;
+            color: white;
+        }}
+        
+        .cookbook-title {{
+            font-size: 1.3rem;
+            font-weight: 600;
+            margin-bottom: 12px;
+            color: #1a1a1a;
+        }}
+        
+        .cookbook-description {{
+            color: #6b7280;
+            margin-bottom: 16px;
+            line-height: 1.5;
+        }}
+        
+        .cookbook-tags {{
+            display: flex;
+            gap: 6px;
+            flex-wrap: wrap;
+            margin-bottom: 16px;
+        }}
+        
+        .tag {{
+            background: #f3f4f6;
+            color: #374151;
+            padding: 4px 10px;
+            border-radius: 16px;
+            font-size: 0.8rem;
+            border: 1px solid #e5e7eb;
+        }}
+        
+        .cookbook-footer {{
+            display: flex;
+            justify-content: space-between;
+            align-items: center;
+        }}
+        
+        .cookbook-link {{
+            color: #3b82f6;
+            text-decoration: none;
+            font-weight: 500;
+            font-size: 0.9rem;
+        }}
+        
+        .cookbook-link:hover {{
+            text-decoration: underline;
+        }}
+        
+        .language-badge {{
+            background: #1f2937;
+            color: white;
+            padding: 2px 8px;
+            border-radius: 4px;
+            font-size: 0.7rem;
+            text-transform: uppercase;
+        }}
+        
+        .empty-state {{
+            text-align: center;
+            padding: 60px 20px;
+            color: #6b7280;
+        }}
+        
+        .empty-state-icon {{
+            font-size: 3rem;
+            margin-bottom: 16px;
+        }}
+        
+        @media (max-width: 768px) {{
+            .search-filters-row {{
+                flex-direction: column;
+                align-items: stretch;
+            }}
+            
+            .cookbooks-grid {{
+                grid-template-columns: 1fr;
+            }}
+            
+            .cookbook-card {{
+                padding: 20px;
+            }}
+            
+            .page-title {{
+                font-size: 2rem;
+            }}
+        }}
+    </style>
+</head>
+<body>
+    <div class="container">
+        <!-- Header -->
+        <div class="header">
+            <h1 class="page-title">Cookbooks Documentation</h1>
+            <p class="page-subtitle">Explore our collection of practical guides and tutorials for building AI applications</p>
+        </div>
+
+        <!-- Search and Filter Controls -->
+        <div class="controls-section">
+            <div class="search-filters-row">
+                <div class="search-container">
+                    <label class="label" for="search">Search Cookbooks</label>
+                    <input type="text" id="search" class="search-input" placeholder="Search by title, description, or tags...">
+                </div>
+                
+                <div class="filter-group">
+                    <label class="label" for="tag-filter">Filter by Tag</label>
+                    <select id="tag-filter" class="filter-select">
+                        <option value="">All Tags</option>
+                        {''.join(f'<option value="{tag}">{tag}</option>' for tag in all_tags)}
+                    </select>
+                </div>
+                
+                <div class="filter-group">
+                    <label class="label" for="language-filter">Language</label>
+                    <select id="language-filter" class="filter-select">
+                        <option value="">All Languages</option>
+                        {''.join(f'<option value="{lang}">{lang.upper()}</option>' for lang in all_languages)}
+                    </select>
+                </div>
+                
+                <div class="filter-group">
+                    <label class="label" for="sort-select">Sort By</label>
+                    <select id="sort-select" class="filter-select">
+                        <option value="title">Title A-Z</option>
+                        <option value="featured">Featured First</option>
+                        <option value="tags">By Tag</option>
+                    </select>
+                </div>
+            </div>
+            
+            <div class="active-filters" id="active-filters">
+                <!-- Active filters will be populated by JavaScript -->
+            </div>
+        </div>
+
+        <!-- Results Info -->
+        <div class="results-info" id="results-info">
+            Loading cookbooks...
+        </div>
+
+        <!-- Cookbooks Grid -->
+        <div class="cookbooks-grid" id="cookbooks-grid">
+            <!-- Cookbook cards will be populated by JavaScript -->
+        </div>
+        
+        <!-- Empty State -->
+        <div class="empty-state" id="empty-state" style="display: none;">
+            <div class="empty-state-icon">📚</div>
+            <h3>No cookbooks found</h3>
+            <p>Try adjusting your search or filter criteria</p>
+        </div>
+    </div>
+
+    <script>
+        // Cookbook data from TOML
+        const cookbooks = {recipes_json};
+        let filteredCookbooks = [...cookbooks];
+        
+        // DOM elements
+        const searchInput = document.getElementById('search');
+        const tagFilter = document.getElementById('tag-filter');
+        const languageFilter = document.getElementById('language-filter');
+        const sortSelect = document.getElementById('sort-select');
+        const resultsInfo = document.getElementById('results-info');
+        const cookbooksGrid = document.getElementById('cookbooks-grid');
+        const emptyState = document.getElementById('empty-state');
+        const activeFiltersContainer = document.getElementById('active-filters');
+        
+        function createCookbookCard(cookbook) {{
+            const badgeHtml = cookbook.featured 
+                ? '<div class="featured-badge">Featured</div>' 
+                : cookbook.experimental 
+                    ? '<div class="experimental-badge">Experimental</div>'
+                    : '';
+            
+            const cardClass = cookbook.featured 
+                ? 'cookbook-card featured' 
+                : cookbook.experimental 
+                    ? 'cookbook-card experimental'
+                    : 'cookbook-card';
+            
+            const tagsHtml = cookbook.tags.map(tag => 
+                `<span class="tag">${{tag}}</span>`
+            ).join('');
+            
+            const languageBadge = cookbook.language 
+                ? `<span class="language-badge">${{cookbook.language}}</span>`
+                : '';
+            
+            return `
+                <div class="${{cardClass}}">
+                    ${{badgeHtml}}
+                    <div class="cookbook-title">${{cookbook.title}}</div>
+                    <div class="cookbook-description">${{cookbook.description}}</div>
+                    <div class="cookbook-tags">${{tagsHtml}}</div>
+                    <div class="cookbook-footer">
+                        <a href="${{cookbook.url}}" class="cookbook-link" target="_blank">
+                            Open Cookbook →
+                        </a>
+                        ${{languageBadge}}
+                    </div>
+                </div>
+            `;
+        }}
+        
+        function updateActiveFilters() {{
+            const activeFilters = [];
+            
+            if (tagFilter.value) {{
+                activeFilters.push({{
+                    type: 'tag',
+                    value: tagFilter.value,
+                    label: `Tag: ${{tagFilter.value}}`
+                }});
+            }}
+            
+            if (languageFilter.value) {{
+                activeFilters.push({{
+                    type: 'language',
+                    value: languageFilter.value,
+                    label: `Language: ${{languageFilter.value.toUpperCase()}}`
+                }});
+            }}
+            
+            if (searchInput.value.trim()) {{
+                activeFilters.push({{
+                    type: 'search',
+                    value: searchInput.value,
+                    label: `Search: "${{searchInput.value}}"`
+                }});
+            }}
+            
+            const filtersHtml = activeFilters.map(filter => `
+                <div class="filter-tag">
+                    ${{filter.label}}
+                    <span class="remove" onclick="removeFilter('${{filter.type}}', '${{filter.value}}')">×</span>
+                </div>
+            `).join('');
+            
+            if (activeFilters.length > 0) {{
+                activeFiltersContainer.innerHTML = filtersHtml + 
+                    '<button class="clear-all" onclick="clearAllFilters()">Clear All</button>';
+            }} else {{
+                activeFiltersContainer.innerHTML = '';
+            }}
+        }}
+        
+        function removeFilter(type, value) {{
+            if (type === 'tag') {{
+                tagFilter.value = '';
+            }} else if (type === 'language') {{
+                languageFilter.value = '';
+            }} else if (type === 'search') {{
+                searchInput.value = '';
+            }}
+            filterAndRender();
+        }}
+        
+        function clearAllFilters() {{
+            searchInput.value = '';
+            tagFilter.value = '';
+            languageFilter.value = '';
+            sortSelect.value = 'title';
+            filterAndRender();
+        }}
+        
+        function filterAndRender() {{
+            const searchTerm = searchInput.value.toLowerCase().trim();
+            const selectedTag = tagFilter.value;
+            const selectedLanguage = languageFilter.value;
+            const sortBy = sortSelect.value;
+            
+            // Filter cookbooks
+            filteredCookbooks = cookbooks.filter(cookbook => {{
+                // Search filter
+                const matchesSearch = !searchTerm || 
+                    cookbook.title.toLowerCase().includes(searchTerm) ||
+                    cookbook.description.toLowerCase().includes(searchTerm) ||
+                    cookbook.tags.some(tag => tag.toLowerCase().includes(searchTerm));
+                
+                // Tag filter
+                const matchesTag = !selectedTag || cookbook.tags.includes(selectedTag);
+                
+                // Language filter
+                const matchesLanguage = !selectedLanguage || cookbook.language === selectedLanguage;
+                
+                return matchesSearch && matchesTag && matchesLanguage;
+            }});
+            
+            // Sort cookbooks
+            filteredCookbooks.sort((a, b) => {{
+                if (sortBy === 'featured') {{
+                    if (a.featured && !b.featured) return -1;
+                    if (!a.featured && b.featured) return 1;
+                    if (a.experimental && !b.experimental) return -1;
+                    if (!a.experimental && b.experimental) return 1;
+                }}
+                if (sortBy === 'tags') {{
+                    return a.tags[0]?.localeCompare(b.tags[0] || '') || 0;
+                }}
+                return a.title.localeCompare(b.title);
+            }});
+            
+            // Update results info
+            resultsInfo.textContent = `Showing ${{filteredCookbooks.length}} of ${{cookbooks.length}} cookbooks`;
+            
+            // Update active filters display
+            updateActiveFilters();
+            
+            // Render results
+            if (filteredCookbooks.length === 0) {{
+                cookbooksGrid.style.display = 'none';
+                emptyState.style.display = 'block';
+            }} else {{
+                cookbooksGrid.style.display = 'grid';
+                emptyState.style.display = 'none';
+                cookbooksGrid.innerHTML = filteredCookbooks.map(createCookbookCard).join('');
+            }}
+        }}
+        
+        // Event listeners
+        searchInput.addEventListener('input', filterAndRender);
+        tagFilter.addEventListener('change', filterAndRender);
+        languageFilter.addEventListener('change', filterAndRender);
+        sortSelect.addEventListener('change', filterAndRender);
+        
+        // Initialize page
+        filterAndRender();
+        
+        // Add some interactivity hints
+        console.log('Cookbook Documentation Page Loaded');
+        console.log(`Total cookbooks: ${{cookbooks.length}}`);
+        console.log(`Available tags: {', '.join(all_tags)}`);
+    </script>
+</body>
+</html>"""
+        
+        return html_template
+    
+    def generate_file(self):
+        """Generate the HTML file"""
+        self.load_toml()
+        html_content = self.generate_html()
+        
+        with open(self.output_path, 'w', encoding='utf-8') as f:
+            f.write(html_content)
+        
+        print(f"Generated cookbook documentation: {self.output_path}")
+        print(f"Total recipes: {len(self.data.get('recipe', []))}")
+        print(f"Unique tags: {', '.join(self.get_all_tags())}")
+
+def main():
+    """Main function to run the generator"""
+    import argparse
+    
+    parser = argparse.ArgumentParser(description='Generate cookbook documentation from index.toml')
+    parser.add_argument('toml_file', help='Path to index.toml file')
+    parser.add_argument('-o', '--output', default='cookbooks.html', 
+                       help='Output HTML file (default: cookbooks.html)')
+    
+    args = parser.parse_args()
+    
+    # Check if input file exists
+    if not Path(args.toml_file).exists():
+        print(f"Error: File {args.toml_file} not found")
+        return 1
+    
+    try:
+        generator = CookbookGenerator(args.toml_file, args.output)
+        generator.generate_file()
+        return 0
+    except Exception as e:
+        print(f"Error generating documentation: {e}")
+        return 1
+
+if __name__ == "__main__":
+    exit(main())
@@ -102,6 +102,7 @@ title: "{title}"
 featured: {notebook_info.get("featured", False)}
 experimental: {notebook_info.get("experimental", False)}
 tags: {notebook_info.get("tags", [])}
+language: {notebook_info.get("language", None)}
 ---
 """
    # Remove the original header and add frontmatter
@@ -264,20 +265,23 @@ def convert_notebooks(index_data):
    colab_base_url = index_data["config"]["colab"]
    notebooks = []
    for i, recipe_data in enumerate(recipes):
-        notebook_info = {
-            "file": Path(recipe_data["notebook"]),
-            "title": recipe_data["title"],
-            "colab": f"{colab_base_url.rstrip('/')}/{recipe_data["notebook"]}",
-            "featured": recipe_data.get("featured", False),
-            "experimental": recipe_data.get("experimental", False),
-            "tags": recipe_data.get("tags", []),
-            "relative_repo_path": recipe_data["notebook"],  # Pass relative path for image fixing
-        }
-        notebook_path = Path(recipe_data["notebook"])
-        notebooks.append({
-            "notebook_path": notebook_path,
-            "notebook_info": notebook_info
-        })
+        if recipe_data.get("notebook") is not None:
+            notebook_info = {
+                "file": Path(recipe_data["notebook"]),
+                "title": recipe_data["title"],
+                "colab": f"{colab_base_url.rstrip('/')}/{recipe_data["notebook"]}",
+                "featured": recipe_data.get("featured", False),
+                "experimental": recipe_data.get("experimental", False),
+                "tags": recipe_data.get("tags", []),
+                "relative_repo_path": recipe_data["notebook"],  # Pass relative path for image fixing
+            }
+            if recipe_data.get("language"):
+                notebook_info["language"] = recipe_data["language"]
+            notebook_path = Path(recipe_data["notebook"])
+            notebooks.append({
+                "notebook_path": notebook_path,
+                "notebook_info": notebook_info
+            })
    notebook_paths = [notebook['notebook_path'] for notebook in notebooks ]
    total_notebooks = len(notebook_paths)