Release 0.9.16 (#1811)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>
feat: add MCP tools integration and example usage (#1819)
2026-07-01 22:14:03 -04:00 · 2025-04-04 11:12:27 +02:00 · 2025-04-04 11:03:10 +02:00 · 2025-04-03 14:23:31 +02:00 · 2025-04-03 13:57:39 +07:00 · 2025-04-02 21:24:57 +07:00
404 changed files with 17958 additions and 2297 deletions
@@ -14,13 +14,14 @@ There are some important folders in the repository:
    all JS runtime environments.
  - `env`: The environment package of LlamaIndex.TS, which contains the environment-specific classes and interfaces. It
    includes compatibility layers for Node.js, Deno, Vercel Edge Runtime, Cloudflare Workers...
+  - `providers/*`: The providers package of LlamaIndex.TS, which contains the providers for LLM and other services.
 - `apps/*`: The applications based on LlamaIndex.TS.
  - `next`: Our documentation website based on Next.js.
 - `examples`: The code examples of LlamaIndex.TS using Node.js.

 ## Getting Started

-Make sure you have Node.js LIS (Long-term Support) installed. You can check your Node.js version by running:
+Make sure you have Node.js LTS (Long-term Support) installed. You can check your Node.js version by running:

 ```shell
 node -v
@@ -30,7 +31,7 @@ node -v
 ### Use pnpm

 ```shell
-corepack enable
+npm install -g pnpm
 ```

 ### Install dependencies
@@ -41,33 +42,65 @@ pnpm install

 ### Build the packages

-You'll need Turbo to build the packages. If you don't have it, you can run it with `pnpx`.
-
 To build all packages, run:

 ```shell
-# Build all packages
-pnpx turbo build --filter "./packages/*"
-
-# Or if you have turbo installed, you can run:
-turbo build --filter "./packages/*"
+pnpm build
 ```

+### Run tests
+
+#### Unit tests
+
+After build, to run all unit tests, call:
+
+```shell
+pnpm test
+```
+
+Unit tests are located in the `tests` folder of each package. Each `tests` folder is its own package (e.g. `@llamaindex/core-tests` for `@llamaindex/core`) that imports the package under test. Test packages are not published.
+
+#### E2E tests
+
+To run all E2E tests, call:
+
+```shell
+pnpm e2e
+```
+
+All E2E tests are in the `e2e` folder.
+
 ### Docs

 See the [docs](./apps/next/README.md) for more information.

-## Changeset
+## Adding a new package
+
+Please follow these steps to add a new package:
+
+1. Only add new packages to the `packages/providers` folder.
+2. Use the `package.json` and `tsconfig.json` of an existing package as a template.
+3. Reference your new package in the root `tsconfig.json` file.
+4. Add your package to the `examples/package.json` file if you add a new example.
+
+## Before sending a PR
+
+Before sending a PR, make sure of the following:
+
+1. All tests pass and you have added meaningful tests for your change.
+2. If you have a new feature, document it in the `apps/next` docs folder.
+3. If you have a new feature, add a new example in the `examples` folder.
+4. You have a descriptive changeset for each PR:
+
+### Changesets

 We use [changesets](https://github.com/changesets/changesets) for managing versions and changelogs. To create a new
 changeset, run in the root folder:

-```
+```shell
 pnpm changeset
 ```

-Please send a descriptive changeset for each PR.
-
 ## Publishing (maintainers only)

 The [Release Github Action](.github/workflows/release.yml) is automatically generating and updating a
@@ -1,5 +1,95 @@
 # @llamaindex/doc

+## 0.2.5
+
+### Patch Changes
+
+- 4999df1: bump nextjs
+- Updated dependencies [f5e4d09]
+  - llamaindex@0.9.16
+
+## 0.2.4
+
+### Patch Changes
+
+- 9c63f3f: Add support for openai responses api
+- Updated dependencies [9c63f3f]
+- Updated dependencies [c515a32]
+  - @llamaindex/openai@0.3.0
+  - @llamaindex/core@0.6.2
+  - @llamaindex/workflow@1.0.2
+  - llamaindex@0.9.15
+  - @llamaindex/cloud@4.0.2
+  - @llamaindex/node-parser@2.0.2
+  - @llamaindex/readers@3.0.2
+
+## 0.2.3
+
+### Patch Changes
+
+- 648cfb5: Add support for supabase vector store
+  Added doc for the supabase vector store
+- Updated dependencies [1b6f368]
+- Updated dependencies [eaf326e]
+- Updated dependencies [9d951b2]
+  - @llamaindex/core@0.6.1
+  - llamaindex@0.9.14
+  - @llamaindex/cloud@4.0.1
+  - @llamaindex/node-parser@2.0.1
+  - @llamaindex/openai@0.2.1
+  - @llamaindex/readers@3.0.1
+  - @llamaindex/workflow@1.0.1
+
+## 0.2.2
+
+### Patch Changes
+
+- e98033e: docs: correct the number of indexes
+
+## 0.2.1
+
+### Patch Changes
+
+- Updated dependencies [75d6e29]
+  - llamaindex@0.9.13
+
+## 0.2.0
+
+### Minor Changes
+
+- f1db9b3: Adding an options parameter to vercel tool to tailor responses
+
+### Patch Changes
+
+- 21bebfc: Expose more content to fix the issue with unavailable documentation links, and adjust the documentation based on the latest code.
+- 2b39cef: Added documentation for structured output in openai and ollama
+- Updated dependencies [21bebfc]
+- Updated dependencies [93bc0ff]
+- Updated dependencies [91a18e7]
+- Updated dependencies [bf56fc0]
+- Updated dependencies [f8a86e4]
+- Updated dependencies [5189b44]
+- Updated dependencies [58a9446]
+  - @llamaindex/readers@3.0.0
+  - @llamaindex/core@0.6.0
+  - @llamaindex/openai@0.2.0
+  - @llamaindex/cloud@4.0.0
+  - @llamaindex/workflow@1.0.0
+  - llamaindex@0.9.12
+  - @llamaindex/node-parser@2.0.0
+
+## 0.1.11
+
+### Patch Changes
+
+- a8c0637: feat: simplify to provide base URL to OpenAI
+- a654f58: Added docs for using perplexity
+- 98eebf7: Add RequestOptions parameter passing to support Gemini proxy calls.
+  Add a usage example for the RequestOptions parameter.
+- Updated dependencies [a8c0637]
+  - @llamaindex/openai@0.1.61
+  - llamaindex@0.9.11
+
 ## 0.1.10

 ### Patch Changes
@@ -6,8 +6,7 @@ This is a Next.js application generated with
 Run development server:

 ```bash
-turbo run dev
-# turbo will build all required packages before running the dev server
+pnpm run dev
 ```

 ## Learn More
@@ -4,6 +4,8 @@ const withMDX = createMDX();

 /** @type {import('next').NextConfig} */
 const config = {
+  // default timeout for static generation is 60s, but we need to increase it to 10 minutes due to the large number of document pages
+  staticPageGenerationTimeout: 600,
  reactStrictMode: true,
  eslint: {
    ignoreDuringBuilds: true,
@@ -1,6 +1,6 @@
 {
  "name": "@llamaindex/doc",
-  "version": "0.1.10",
+  "version": "0.2.5",
  "private": true,
  "scripts": {
    "postinstall": "fumadocs-mdx",
@@ -8,8 +8,9 @@
    "build": "next build",
    "dev": "next dev",
    "start": "next start",
-    "postbuild": "tsx scripts/post-build.mts",
-    "build:docs": "cross-env NODE_OPTIONS=\"--max-old-space-size=8192\" typedoc && tsx scripts/generate-docs.mts"
+    "postbuild": "tsx scripts/post-build.mts && tsx scripts/validate-links.mts",
+    "build:docs": "cross-env NODE_OPTIONS=\"--max-old-space-size=8192\" typedoc && tsx scripts/generate-docs.mts",
+    "validate-links": "tsx scripts/validate-links.mts"
  },
  "dependencies": {
    "@icons-pack/react-simple-icons": "^10.1.0",
@@ -45,7 +46,7 @@
    "hast-util-to-jsx-runtime": "^2.3.2",
    "llamaindex": "workspace:*",
    "lucide-react": "^0.460.0",
-    "next": "^15.2.1",
+    "next": "^15.2.4",
    "next-themes": "^0.4.3",
    "react": "^19.0.0",
    "react-dom": "^19.0.0",
@@ -0,0 +1,249 @@
+import glob from "fast-glob";
+import fs from "fs";
+import matter from "gray-matter";
+import path from "path";
+
+const CONTENT_DIR = path.join(process.cwd(), "src/content/docs");
+const BUILD_DIR = path.join(process.cwd(), ".next");
+
+// Regular expression to find internal links
+// This captures Markdown links [text](/docs/path) and href attributes href="/docs/path"
+const INTERNAL_LINK_REGEX = /(?:(?:\]\(|\bhref=["'])\/docs\/([^")]+))/g;
+
+// Regular expression to find relative links
+// This captures relative links like [text](./path) or ![alt](../images/image.png)
+const RELATIVE_LINK_REGEX = /(?:\]\()(?:\s*)(?:\.\.?)\//g;
+
+interface LinkValidationResult {
+  file: string;
+  invalidLinks: Array<{ link: string; line: number }>;
+}
+
+interface RelativeLinkResult {
+  file: string;
+  relativeLinks: Array<{ line: number; lineContent: string }>;
+}
+
+/**
+ * Get all valid documentation routes from the content directory
+ */
+async function getValidRoutes(): Promise<Set<string>> {
+  const mdxFiles = await glob("**/*.mdx", { cwd: CONTENT_DIR });
+
+  const routes = new Set<string>();
+
+  // Add each MDX file as a valid route
+  for (const file of mdxFiles) {
+    // Remove .mdx extension and normalize to route format
+    let route = file.replace(/\.mdx$/, "");
+
+    // Handle index files
+    if (route.endsWith("/index")) {
+      route = route.replace(/\/index$/, "");
+    } else if (route === "index") {
+      route = "";
+    }
+
+    routes.add(route);
+  }
+
+  return routes;
+}
+
+/**
+ * Extract internal links from a MDX file
+ */
+function extractLinksFromFile(
+  filePath: string,
+): Array<{ link: string; line: number }> {
+  const content = fs.readFileSync(filePath, "utf-8");
+  const { content: mdxContent } = matter(content);
+
+  const lines = mdxContent.split("\n");
+  const links: Array<{ link: string; line: number }> = [];
+
+  lines.forEach((line, lineNumber) => {
+    let match;
+    while ((match = INTERNAL_LINK_REGEX.exec(line)) !== null) {
+      if (match[1]) {
+        links.push({
+          link: match[1],
+          line: lineNumber + 1, // 1-based line numbers
+        });
+      }
+    }
+  });
+
+  return links;
+}
+
+/**
+ * Check if a link is an image link
+ */
+function isImageLink(link: string): boolean {
+  // Check for image extensions
+  const imageExtensions = [".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp"];
+  const hasImageExtension = imageExtensions.some((ext) =>
+    link.toLowerCase().endsWith(ext),
+  );
+
+  // Check for markdown image syntax: ![alt](./path)
+  const isMarkdownImage = link.trim().startsWith("!");
+
+  return hasImageExtension || isMarkdownImage;
+}
+
+/**
+ * Extract relative links from a MDX file
+ */
+function findRelativeLinksInFile(
+  filePath: string,
+): Array<{ line: number; lineContent: string }> {
+  const content = fs.readFileSync(filePath, "utf-8");
+  const { content: mdxContent } = matter(content);
+
+  const lines = mdxContent.split("\n");
+  const relativeLinks: Array<{ line: number; lineContent: string }> = [];
+
+  lines.forEach((line, lineNumber) => {
+    // Check for relative links
+    if (RELATIVE_LINK_REGEX.test(line)) {
+      // Reset the regex lastIndex to start from the beginning of the line
+      RELATIVE_LINK_REGEX.lastIndex = 0;
+
+      // Skip image links
+      if (!isImageLink(line)) {
+        relativeLinks.push({
+          line: lineNumber + 1, // 1-based line numbers
+          lineContent: line.trim(),
+        });
+      }
+    }
+  });
+
+  return relativeLinks;
+}
+
+/**
+ * Find relative links in all MDX files
+ */
+async function findRelativeLinks(): Promise<RelativeLinkResult[]> {
+  const mdxFiles = await glob("**/*.mdx", { cwd: CONTENT_DIR });
+  const results: RelativeLinkResult[] = [];
+
+  for (const file of mdxFiles) {
+    const filePath = path.join(CONTENT_DIR, file);
+    const relativeLinks = findRelativeLinksInFile(filePath);
+
+    if (relativeLinks.length > 0) {
+      results.push({
+        file,
+        relativeLinks,
+      });
+    }
+  }
+
+  return results;
+}
+
+/**
+ * Validate internal links in all MDX files
+ */
+async function validateLinks(): Promise<LinkValidationResult[]> {
+  const mdxFiles = await glob("**/*.mdx", { cwd: CONTENT_DIR });
+  const validRoutes = await getValidRoutes();
+
+  const results: LinkValidationResult[] = [];
+
+  for (const file of mdxFiles) {
+    const filePath = path.join(CONTENT_DIR, file);
+    const links = extractLinksFromFile(filePath);
+
+    const invalidLinks = links.filter(({ link }) => {
+      // Check if the link exists in valid routes
+      // First normalize the link (remove any query string or hash)
+      const baseLink = link.split("?")[0].split("#")[0];
+      // Remove the trailing slash if present.
+      // This works with links like "api/interfaces/MetadataFilter#operator" and "api/interfaces/MetadataFilter/#operator".
+      const normalizedLink = baseLink.endsWith("/")
+        ? baseLink.slice(0, -1)
+        : baseLink;
+
+      // Remove llamaindex/ prefix if it exists as it's the root of the docs
+      let routePath = normalizedLink;
+      if (routePath.startsWith("llamaindex/")) {
+        routePath = routePath.substring("llamaindex/".length);
+      }
+
+      return !validRoutes.has(normalizedLink) && !validRoutes.has(routePath);
+    });
+
+    if (invalidLinks.length > 0) {
+      results.push({
+        file,
+        invalidLinks,
+      });
+    }
+  }
+
+  return results;
+}
+
+/**
+ * Main function to validate links and report errors
+ */
+async function main() {
+  console.log("🔍 Validating links in documentation...");
+
+  try {
+    // Check for invalid internal links
+    const validationResults: LinkValidationResult[] = await validateLinks();
+    // Check for relative links
+    const relativeLinksResults = await findRelativeLinks();
+
+    let hasErrors = false;
+
+    // Report invalid internal links
+    if (validationResults.length > 0) {
+      console.error("❌ Found invalid internal links:");
+      hasErrors = true;
+
+      for (const result of validationResults) {
+        console.error(`\nFile: ${result.file}`);
+
+        for (const { link, line } of result.invalidLinks) {
+          console.error(`  - Line ${line}: /docs/${link}`);
+        }
+      }
+    }
+
+    // Report relative links
+    if (relativeLinksResults.length > 0) {
+      console.error("\n❌ Found relative links (use absolute paths instead):");
+      hasErrors = true;
+
+      for (const result of relativeLinksResults) {
+        console.error(`\nFile: ${result.file}`);
+
+        for (const { line, lineContent } of result.relativeLinks) {
+          console.error(`  - Line ${line}: ${lineContent}`);
+        }
+      }
+    }
+
+    if (hasErrors) {
+      // Exit with error code to fail the build
+      process.exit(1);
+    } else {
+      console.log("✅ All links are valid!");
+    }
+  } catch (error) {
+    console.error("Error validating links:", error);
+    process.exit(1);
+  }
+}
+
+main().catch((error) => {
+  console.error("Unhandled error:", error);
+  process.exit(1);
+});
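
Note on the link validation above: the route and link normalization can be checked in isolation. The following is a minimal sketch, not part of this commit, that restates those steps as pure functions so they can be run without a file system. The helper names `fileToRoute`, `normalizeLink`, and `isValidLink`, as well as the sample file list, are assumptions made for illustration only.

```typescript
// Illustrative sketch only: restates the normalization steps of
// scripts/validate-links.mts as pure functions. All helper names and the
// sample data below are hypothetical and do not exist in the repository.

/** Convert a content path such as "guide/index.mdx" into the route "guide". */
function fileToRoute(file: string): string {
  const route = file.replace(/\.mdx$/, "");
  if (route === "index") return ""; // the root index file maps to the empty route
  return route.replace(/\/index$/, "");
}

/** Strip the query string, the hash, and one trailing slash from a captured link. */
function normalizeLink(link: string): string {
  const withoutQuery = link.split("?")[0] ?? "";
  const base = withoutQuery.split("#")[0] ?? "";
  return base.endsWith("/") ? base.slice(0, -1) : base;
}

/** A link is valid if it matches a route with or without the "llamaindex/" prefix. */
function isValidLink(link: string, routes: Set<string>): boolean {
  const normalized = normalizeLink(link);
  const prefix = "llamaindex/";
  const withoutPrefix = normalized.startsWith(prefix)
    ? normalized.substring(prefix.length)
    : normalized;
  return routes.has(normalized) || routes.has(withoutPrefix);
}

// Hypothetical content files, used only to exercise the helpers.
const sampleRoutes = new Set(
  [
    "index.mdx",
    "api/interfaces/MetadataFilter.mdx",
    "llamaindex/modules/llms/index.mdx",
  ].map(fileToRoute),
);

console.log(isValidLink("api/interfaces/MetadataFilter/#operator", sampleRoutes)); // true
console.log(isValidLink("llamaindex/modules/llms?tab=1", sampleRoutes)); // true
console.log(isValidLink("llamaindex/examples/other_llms", sampleRoutes)); // false
```

Keeping these steps free of I/O would make it straightforward to cover cases like trailing slashes before a hash in a unit test, should the script ever be split up that way.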
@@ -60,7 +60,7 @@ export default function HomePage() {
          icon={Footprints}
          subheading="Progressive"
          heading="From the simplest to the most complex"
-          description="LlamaIndex.TS is designed to be simple to get started, but powerful enough to build complex, agentic AI applications."
+          description="LlamaIndex.TS is designed to be simple to get started, but powerful enough to build complex, agentic AI applications using multi-agents."
        >
          <Suspense
            fallback={
@@ -76,44 +76,48 @@ export default function HomePage() {
          >
            <MagicMove
              code={[
-                `import { OpenAI } from "@llamaindex/openai";
+                `import { openai } from "@llamaindex/openai";

-const llm = new OpenAI();
+const llm = openai();
 const response = await llm.complete({ prompt: "How are you?" });`,
-                `import { OpenAI } from "@llamaindex/openai";
+                `import { openai } from "@llamaindex/openai";

-const llm = new OpenAI();
+const llm = openai();
 const response = await llm.chat({
  messages: [{ content: "Tell me a joke.", role: "user" }],
 });`,
-                `import { ChatMemoryBuffer } from "llamaindex";
-import { OpenAI } from "@llamaindex/openai";
+                `import { agent } from "llamaindex";
+import { openai } from "@llamaindex/openai";

-const llm = new OpenAI({ model: 'gpt4o-turbo' });
-const buffer = new ChatMemoryBuffer({
-  tokenLimit: 128_000,
-})
-buffer.put({ content: "Tell me a joke.", role: "user" })
-const response = await llm.chat({
-  messages: buffer.getMessages(),
-  stream: true
-});`,
-                `import { ChatMemoryBuffer } from "llamaindex";
-import { OpenAIAgent } from "@llamaindex/openai";
-
-const agent = new OpenAIAgent({
-  llm,
-  tools: [...myTools]
+const analyseAgent = agent({
+  llm: openai({ model: "gpt-4o" }),
+  tools: [analyseTools],
  systemPrompt,
 });
-const buffer = new ChatMemoryBuffer({
-  tokenLimit: 128_000,
-})
-buffer.put({ content: "Analysis the data based on the given data.", role: "user" })
-buffer.put({ content: \`\${data}\`, role: "user" })
-const response = await agent.chat({
-  message: buffer.getMessages(),
-});`,
+const response = await analyseAgent.run(\`Analyse the given data:
+\${data}\`);`,
+                `import { agent, multiAgent } from "llamaindex";
+import { openai } from "@llamaindex/openai";
+
+const analyseAgent = agent({
+  name: "AnalyseAgent",
+  llm: openai({ model: "gpt-4o" }),
+  tools: [analyseTools],
+});
+const reporterAgent = agent({
+  name: "ReporterAgent",
+  llm: openai({ model: "gpt-4o" }),
+  tools: [reporterTools],
+  canHandoffTo: [analyseAgent],
+});
+
+const agents = multiAgent({
+  agents: [analyseAgent, reporterAgent],
+  rootAgent: reporterAgent,
+});
+
+const response = await agents.run(\`Analyse the given data:
+\${data}\`);`,
              ]}
            />
          </Suspense>
@@ -125,20 +129,20 @@ const response = await agent.chat({
          description="Truly powerful retrieval-augmented generation applications use agentic techniques, and LlamaIndex.TS makes it easy to build them."
        >
          <CodeBlock
-            code={`import { agent } from "llamaindex";
-import { OpenAI } from "@llamaindex/openai";
+            code={`import { agent, SimpleDirectoryReader, VectorStoreIndex } from "llamaindex";
+import { openai } from "@llamaindex/openai";

-// using a previously created LlamaIndex index to query information from
-const queryTool = index.queryTool();
+// load documents from the current directory into an index
+const reader = new SimpleDirectoryReader();
+const documents = await reader.loadData(currentDir);
+const index = await VectorStoreIndex.fromDocuments(documents);

-const agent = agent({
-  llm: new OpenAI({
-    model: "gpt-4o",
-  }),
-  tools: [queryTool],
+const myAgent = agent({
+  llm: openai({ model: "gpt-4o" }),
+  tools: [index.queryTool()],
 });

-await agent.run('...');`}
+await myAgent.run('...');`}
            lang="ts"
          />
        </Feature>
@@ -11,8 +11,6 @@ import {
 } from "fumadocs-ui/page";
 import { notFound } from "next/navigation";

-const { AutoTypeTable } = createTypeTable();
-
 export const revalidate = false;

 export default async function Page(props: {
@@ -22,6 +20,7 @@ export default async function Page(props: {
  const page = source.getPage(params.slug);
  if (!page) notFound();

+  const { AutoTypeTable } = createTypeTable();
  const MDX = page.data.body;

  return (
@@ -1,12 +0,0 @@
---
-title: Agents
---
-
-A built-in agent that can take decisions and reasoning based on the tools provided to it.
-
-## OpenAI Agent
-
-import { DynamicCodeBlock } from 'fumadocs-ui/components/dynamic-codeblock';
-import CodeSource from "!raw-loader!../../../../../../../examples/agent/openai";
-
-<DynamicCodeBlock lang="ts" code={CodeSource} />
@@ -1,28 +0,0 @@
---
-title: Gemini Agent
---
-
-import { DynamicCodeBlock } from 'fumadocs-ui/components/dynamic-codeblock';
-import CodeSourceGemini from "!raw-loader!../../../../../../../examples/gemini/agent.ts";
-
-## Installation
-
-import { Tab, Tabs } from "fumadocs-ui/components/tabs";
-
-<Tabs groupId="install" items={["npm", "yarn", "pnpm"]} persist>
-	```shell tab="npm"
-	npm install llamaindex @llamaindex/google
-	```
-
-	```shell tab="yarn"
-	yarn add llamaindex @llamaindex/google
-	```
-
-	```shell tab="pnpm"
-	pnpm add llamaindex @llamaindex/google
-	```
-</Tabs>
-
-## Source 
-
-<DynamicCodeBlock lang="ts" code={CodeSourceGemini} />
@@ -1,10 +0,0 @@
---
-title: Chat Engine
---
-
-import { DynamicCodeBlock } from 'fumadocs-ui/components/dynamic-codeblock';
-import CodeSource from "!raw-loader!../../../../../../../examples/chatEngine";
-
-Chat Engine is a class that allows you to create a chatbot from a retriever. It is a wrapper around a retriever that allows you to chat with it in a conversational manner.
-
-<DynamicCodeBlock lang="ts" code={CodeSource} />
@@ -1,59 +0,0 @@
---
-title: Context-Aware Agent
---
-
-The Context-Aware Agent enhances the capabilities of standard LLM agents by incorporating relevant context from a retriever for each query. This allows the agent to provide more informed and specific responses based on the available information.
-
-## Usage
-
-Here's a simple example of how to use the Context-Aware Agent:
-
-```typescript
-import {
-  Document,
-  VectorStoreIndex,
-} from "llamaindex";
-import { OpenAI, OpenAIContextAwareAgent } from "@llamaindex/openai";
-
-async function createContextAwareAgent() {
-  // Create and index some documents
-  const documents = [
-    new Document({
-      text: "LlamaIndex is a data framework for LLM applications.",
-      id_: "doc1",
-    }),
-    new Document({
-      text: "The Eiffel Tower is located in Paris, France.",
-      id_: "doc2",
-    }),
-  ];
-
-  const index = await VectorStoreIndex.fromDocuments(documents);
-  const retriever = index.asRetriever({ similarityTopK: 1 });
-
-  // Create the Context-Aware Agent
-  const agent = new OpenAIContextAwareAgent({
-    llm: new OpenAI({ model: "gpt-3.5-turbo" }),
-    contextRetriever: retriever,
-  });
-
-  // Use the agent to answer queries
-  const response = await agent.chat({
-    message: "What is LlamaIndex used for?",
-  });
-
-  console.log("Agent Response:", response.response);
-}
-
-createContextAwareAgent().catch(console.error);
-```
-
-In this example, the Context-Aware Agent uses the retriever to fetch relevant context for each query, allowing it to provide more accurate and informed responses based on the indexed documents.
-
-## Key Components
-
- `contextRetriever`: A retriever (e.g., from a VectorStoreIndex) that fetches relevant documents or passages for each query.
-
-## Available Context-Aware Agents
-
- `OpenAIContextAwareAgent`: A context-aware agent using OpenAI's models.
@@ -1,15 +0,0 @@
-{
-  "title": "Examples",
-  "pages": [
-    "more_examples",
-    "chat_engine",
-    "vector_index",
-    "summary_index",
-    "save_load_index",
-    "context_aware_agent",
-    "agent",
-    "agent_gemini",
-    "local_llm",
-    "other_llms"
-  ]
-}
@@ -1,66 +0,0 @@
---
-title: Using other LLM APIs
---
-
-import { DynamicCodeBlock } from 'fumadocs-ui/components/dynamic-codeblock';
-import CodeSource from "!raw-loader!../../../../../../../examples/mistral";
-
-By default LlamaIndex.TS uses OpenAI's LLMs and embedding models, but we support [lots of other LLMs](../modules/llms) including models from Mistral (Mistral, Mixtral), Anthropic (Claude) and Google (Gemini).
-
-If you don't want to use an API at all you can [run a local model](../../examples/local_llm).
-
-This example runs you through the process of setting up a Mistral model:
-
-
-## Installation
-
-import { Tab, Tabs } from "fumadocs-ui/components/tabs";
-
-<Tabs groupId="install" items={["npm", "yarn", "pnpm"]} persist>
-	```shell tab="npm"
-	npm install llamaindex @llamaindex/mistral
-	```
-
-	```shell tab="yarn"
-	yarn add llamaindex @llamaindex/mistral
-	```
-
-	```shell tab="pnpm"
-	pnpm add llamaindex @llamaindex/mistral
-	```
-</Tabs>
-
-## Using another LLM
-
-You can specify what LLM LlamaIndex.TS will use on the `Settings` object, like this:
-
-```typescript
-import { MistralAI } from "@llamaindex/mistral";
-import { Settings } from "llamaindex";
-
-Settings.llm = new MistralAI({
-  model: "mistral-tiny",
-  apiKey: "<YOUR_API_KEY>",
-});
-```
-
-You can see examples of other APIs we support by checking out "Available LLMs" in the sidebar of our [LLMs section](../modules/llms).
-
-## Using another embedding model
-
-A frequent gotcha when trying to use a different API as your LLM is that LlamaIndex will also by default index and embed your data using OpenAI's embeddings. To completely switch away from OpenAI you will need to set your embedding model as well, for example:
-
-```typescript
-import { MistralAIEmbedding } from "@llamaindex/mistral";
-import { Settings } from "llamaindex";
-
-Settings.embedModel = new MistralAIEmbedding();
-```
-
-We support [many different embeddings](../modules/embeddings).
-
-## Full example
-
-This example uses Mistral's `mistral-tiny` model as the LLM and Mistral for embeddings as well.
-
-<DynamicCodeBlock lang="ts" code={CodeSource} />
@@ -1,8 +0,0 @@
---
-title: Save/Load an Index
---
-
-import { DynamicCodeBlock } from 'fumadocs-ui/components/dynamic-codeblock';
-import CodeSource from "!raw-loader!../../../../../../../examples/storageContext";
-
-<DynamicCodeBlock lang="ts" code={CodeSource} />
@@ -1,8 +0,0 @@
---
-title: Summary Index
---
-
-import { DynamicCodeBlock } from 'fumadocs-ui/components/dynamic-codeblock';
-import CodeSource from "!raw-loader!../../../../../../../examples/summaryIndex";
-
-<DynamicCodeBlock lang="ts" code={CodeSource} />
@@ -1,8 +0,0 @@
---
-title: Vector Index
---
-
-import { DynamicCodeBlock } from 'fumadocs-ui/components/dynamic-codeblock';
-import CodeSource from "!raw-loader!../../../../../../../examples/vectorIndex";
-
-<DynamicCodeBlock lang="ts" code={CodeSource} />
@@ -1,11 +1,7 @@
 ---
-title: Chatbot tutorial
+title: Create-Llama
 ---

-Once you've mastered basic [retrieval-augment generation](retrieval_augmented_generation) you may want to create an interface to chat with your data. You can do this step-by-step, but we recommend getting started quickly using `create-llama`.
-
-## Using create-llama
-
 `create-llama` is a powerful but easy to use command-line tool that generates a working, full-stack web application that allows you to chat with your data. You can learn more about it on [the `create-llama` README page](https://www.npmjs.com/package/create-llama).

 Run it once and it will ask you a series of questions about the kind of application you want to generate. Then you can customize your application to suit your use-case. To get started, run:
@@ -1,10 +1,10 @@
 ---
-title: See all examples
+title: Code examples
 ---

 Our GitHub repository has a wealth of examples to explore and try out. You can check out our [examples folder](https://github.com/run-llama/LlamaIndexTS/tree/main/examples) to see them all at once, or browse the pages in this section for some selected highlights.

-## Check out all examples
+## Use examples locally

 It may be useful to check out all the examples at once so you can try them out locally. To do this into a folder called `my-new-project`, run these commands:

@@ -19,3 +19,14 @@ Then you can run any example in the folder with `tsx`, e.g.:
 ```bash npm2yarn
 npx tsx ./vectorIndex.ts
 ```
+
+## Try examples online
+
+You can also try the examples online using StackBlitz:
+
+<iframe
+  className="w-full h-[440px]"
+  aria-label="LlamaIndex.TS Examples"
+  aria-description="This is a list of examples for LlamaIndex.TS."
+  src="https://stackblitz.com/github/run-llama/LlamaIndexTS/tree/main/examples?file=README.md"
+/>
@@ -14,7 +14,7 @@ Before you start, make sure you have try LlamaIndex.TS in Node.js to make sure y

 <Card
  title="Getting Started with LlamaIndex.TS in Node.js"
-  href="/docs/llamaindex/getting_started/setup/node"
+  href="/docs/llamaindex/getting_started/frameworks/node"
 />

 Also, you need have the basic understanding of <a href='https://developers.cloudflare.com/workers/'><SiCloudflareworkers className="inline mr-2" color="#F38020" />Cloudflare Worker</a>.
@@ -1,5 +1,5 @@
 ---
-title: Choose Framework
+title: Frameworks
 description: We support multiple JS runtime and frameworks, bundlers.
 ---
 import {
@@ -15,28 +15,28 @@ import {
 		<>
 			<SiNodedotjs className="inline" color="#5FA04E" /> Node.js
 		</>
-	} href="/docs/llamaindex/getting_started/setup/node" />
+	} href="/docs/llamaindex/getting_started/frameworks/node" />
 	<Card title={
 		<>
 			<SiTypescript className="inline" color="#3178C6" /> TypeScript
 		</>
-	} href="/docs/llamaindex/getting_started/setup/typescript" />
+	} href="/docs/llamaindex/getting_started/frameworks/typescript" />
 	<Card title={
 		<>
 			<SiVite className='inline' color='#646CFF' /> Vite
 		</>
-	} href="/docs/llamaindex/getting_started/setup/vite" />
+	} href="/docs/llamaindex/getting_started/frameworks/vite" />
 	<Card
 		title={
 			<>
 				<SiNextdotjs className='inline' /> Next.js (React Server Component)
 			</>
 		}
-		href="/docs/llamaindex/getting_started/setup/next"
+		href="/docs/llamaindex/getting_started/frameworks/next"
 	/>
 	<Card title={
 		<>
 			<SiCloudflareworkers className='inline' color='#F38020' /> Cloudflare Workers
 		</>
-	} href="/docs/llamaindex/getting_started/setup/cloudflare" />
+	} href="/docs/llamaindex/getting_started/frameworks/cloudflare" />
 </Cards>
@@ -0,0 +1,6 @@
+{
+  "title": "Framework",
+  "description": "The setup guide",
+  "defaultOpen": true,
+  "pages": ["node", "typescript", "next", "vite", "cloudflare"]
+}
@@ -7,7 +7,7 @@ Before you start, make sure you have try LlamaIndex.TS in Node.js to make sure y

 <Card
  title="Getting Started with LlamaIndex.TS in Node.js"
-  href="/docs/llamaindex/getting_started/setup/node"
+  href="/docs/llamaindex/getting_started/frameworks/node"
 />

 ## Differences between Node.js and Next.js
@@ -35,7 +35,7 @@ If you see any dependency issues, you are welcome to open an issue on the GitHub

 ## Edge Runtime

-[Vercel Edge Runtime](https://edge-runtime.vercel.app/) is a subset of Node.js APIs. Similar to [Cloudflare Workers](./cloudflare#difference-between-nodejs-and-cloudflare-worker),
+[Vercel Edge Runtime](https://edge-runtime.vercel.app/) is a subset of Node.js APIs. Similar to [Cloudflare Workers](/docs/llamaindex/getting_started/frameworks/cloudflare#difference-between-nodejs-and-cloudflare-worker),
 it is a serverless platform that runs your code on the edge.

 Not all features of Node.js are supported in Vercel Edge Runtime, so does LlamaIndex.TS, we are working on more compatibility with all JavaScript runtimes.
@@ -42,11 +42,11 @@ By the default, we are using `js-tiktoken` for tokenization. You can install `gp
 	```
 </Tabs>

-> Note: This only works for Node.js
+**Note**: This only works for Node.js

 ## TypeScript support

 <Card
 	title="Getting Started with LlamaIndex.TS in TypeScript"
-	href="/docs/llamaindex/getting_started/setup/typescript"
+	href="/docs/llamaindex/getting_started/frameworks/typescript"
 />
@@ -106,21 +106,38 @@ Some modules uses `Web Stream` API like `ReadableStream` and `WritableStream`, y
 }
 ```

-```ts twoslash
-import { OpenAIAgent } from '@llamaindex/openai'
+```typescript
+import { agent, AgentStream, Settings, tool } from "llamaindex";
+import { openai } from "@llamaindex/openai";
+import { z } from "zod";

-const agent = new OpenAIAgent({
-  tools: []
-})
+Settings.llm = openai({
+  model: "gpt-4o-mini",
+});

-const response = await agent.chat({
-  message: 'Hello, how are you?',
-  stream: true
-})
-for await (const _ of response) {
-                      //^?
-  // ...
+const addTool = tool({
+  name: "add", 
+  description: "Adds two numbers",
+  parameters: z.object({x: z.number(), y: z.number()}),
+  execute: ({ x, y }) => x + y,
+});
+
+const myAgent = agent({
+  tools: [addTool],
+});
+
+// Chat with the agent
+const context = myAgent.run("Hello, how are you?");
+
+for await (const event of context) {
+  if (event instanceof AgentStream) {
+    for (const chunk of event.data.delta) {
+      process.stdout.write(chunk); // stream response
+    }
+  } else {
+    console.log(event); // other events
+  }
 }
+
 ```

 ## Run TypeScript Script in Node.js
@@ -7,7 +7,7 @@ Before you start, make sure you have try LlamaIndex.TS in Node.js to make sure y

 <Card
  title="Getting Started with LlamaIndex.TS in Node.js"
-  href="/docs/llamaindex/getting_started/setup/node"
+  href="/docs/llamaindex/getting_started/frameworks/node"
 />

 Also, make sure you have a basic understanding of [Vite](https://vitejs.dev/).
@@ -37,20 +37,20 @@ In most cases, you'll also need an LLM package to use LlamaIndex. For example, t
 	```
 </Tabs>

-Go to [Using other LLM APIs](/docs/llamaindex/examples/other_llms) to find out how to use other LLMs.
+Go to [LLM APIs](/docs/llamaindex/modules/llms) to find out how to use other LLMs.


 ## What's next?

 <Cards>
 	<Card
-		title="I want to try LlamaIndex.TS"
-		description="Learn how to use LlamaIndex.TS with different JS runtime and frameworks."
-		href="/docs/llamaindex/getting_started/setup"
+		title="Learn LlamaIndex.TS"
+		description="Learn how to use LlamaIndex.TS by starting with one of our tutorials."
+		href="/docs/llamaindex/tutorials/rag"
 	/>
 	<Card
 		title="Show me code examples"
 		description="Explore code examples using LlamaIndex.TS."
-		href="https://stackblitz.com/github/run-llama/LlamaIndexTS/tree/main/examples?file=README.md"
+		href="/docs/llamaindex/getting_started/examples"
 	/>
 </Cards>
@@ -1,4 +1,4 @@
 {
  "title": "Getting Started",
-  "pages": ["index", "setup", "starter_tutorial", "environments", "concepts"]
+  "pages": ["index", "create_llama", "examples", "frameworks"]
 }
@@ -1,6 +0,0 @@
-{
-  "title": "Setup",
-  "description": "The setup guide",
-  "defaultOpen": true,
-  "pages": ["index", "next", "node", "typescript", "vite", "cloudflare"]
-}
@@ -1,9 +0,0 @@
-{
-  "title": "Starter Tutorials",
-  "pages": [
-    "retrieval_augmented_generation",
-    "chatbot",
-    "structured_data_extraction",
-    "agent"
-  ]
-}
@@ -1,92 +0,0 @@
---
-title: Using a local model via Ollama
---
-
-If you're happy using OpenAI, you can skip this section, but many people are interested in using models they run themselves. The easiest way to do this is via the great work of our friends at [Ollama](https://ollama.com/), who provide a simple to use client that will download, install and run a [growing range of models](https://ollama.com/library) for you.
-
-### Install Ollama
-
-They provide a one-click installer for Mac, Linux and Windows on their [home page](https://ollama.com/).
-
-### Pick and run a model
-
-Since we're going to be doing agentic work, we'll need a very capable model, but the largest models are hard to run on a laptop. We think `mixtral 8x7b` is a good balance between power and resources, but `llama3` is another great option. You can run it simply by running
-
-```bash
-ollama run mixtral:8x7b
-```
-
-The first time you run it will also automatically download and install the model for you.
-
-### Switch the LLM in your code
-
-There are two changes you need to make to the code we already wrote in `1_agent` to get Mixtral 8x7b to work. First, you need to switch to that model. Replace the call to `Settings.llm` with this:
-
-```javascript
-Settings.llm = new Ollama({
-  model: "mixtral:8x7b",
-});
-```
-
-### Swap to a ReActAgent
-
-In our original code we used a specific OpenAIAgent, so we'll need to switch to a more generic agent pattern, the ReAct pattern. This is simple: change the `const agent` line in your code to read
-
-```javascript
-const agent = new ReActAgent({ tools });
-```
-
-(You will also need to bring in `Ollama` and `ReActAgent` in your imports)
-
-### Run your totally local agent
-
-Because your embeddings were already local, your agent can now run entirely locally without making any API calls.
-
-```bash
-node agent.mjs
-```
-
-Note that your model will probably run a lot slower than OpenAI, so be prepared to wait a while!
-
-**_Output_**
-
-```javascript
-{
-  response: {
-    message: {
-      role: 'assistant',
-      content: ' Thought: I need to use a tool to add the numbers 101 and 303.\n' +
-        'Action: sumNumbers\n' +
-        'Action Input: {"a": 101, "b": 303}\n' +
-        '\n' +
-        'Observation: 404\n' +
-        '\n' +
-        'Thought: I can answer without using any more tools.\n' +
-        'Answer: The sum of 101 and 303 is 404.'
-    },
-    raw: {
-      model: 'mixtral:8x7b',
-      created_at: '2024-05-09T00:24:30.339473Z',
-      message: [Object],
-      done: true,
-      total_duration: 64678371209,
-      load_duration: 57394551334,
-      prompt_eval_count: 475,
-      prompt_eval_duration: 4163981000,
-      eval_count: 94,
-      eval_duration: 3116692000
-    }
-  },
-  sources: [Getter]
-}
-```
-
-Tada! You can see all of this in the folder `1a_mixtral`.
-
-### Extending to other examples
-
-You can use a ReActAgent instead of an OpenAIAgent in any of the further examples below, but keep in mind that GPT-4 is a lot more capable than Mixtral 8x7b, so you may see more errors or failures in reasoning if you are using an entirely local setup.
-
-### Next steps
-
-Now you've got a local agent, you can [add Retrieval-Augmented Generation to your agent](4_agentic_rag).
@@ -1,16 +0,0 @@
---
-title: Cost Analysis
---
-
-This page shows how to track LLM cost using APIs.
-
-## Callback Manager
-
-The callback manager is a class that manages the callback functions.
-
-You can register `llm-start`, `llm-end`, and `llm-stream` callbacks to the callback manager for tracking the cost.
-
-import { DynamicCodeBlock } from 'fumadocs-ui/components/dynamic-codeblock';
-import CodeSource from "!raw-loader!../../../../../../../examples/recipes/cost-analysis";
-
-<DynamicCodeBlock lang="ts" code={CodeSource} />
@@ -1,5 +0,0 @@
-{
-  "title": "Guide",
-  "description": "See our guide",
-  "pages": ["loading", "workflow", "chat", "agents", "cost-analysis"]
-}
@@ -16,9 +16,13 @@ The TypeScript implementation is designed for JavaScript server side application

 LlamaIndex.TS provides tools for beginners, advanced users, and everyone in between.

+Try it out with a starter example using StackBlitz:
+
 <iframe
  className="w-full h-[440px]"
  aria-label="LlamaIndex.TS Starter"
  aria-description="This is a starter example for LlamaIndex.TS, it shows the basic usage of the library."
  src="https://stackblitz.com/github/run-llama/LlamaIndexTS/tree/main/examples?embed=1&file=starter.ts"
 />
+
+You'll need an OpenAI API key to run this example. You can retrieve it from [OpenAI](https://platform.openai.com/api-keys).
@@ -84,6 +84,7 @@ const queryTool = llamaindex({
  model: openai("gpt-4"),
  index,
  description: "Search through the documents",
+  options: { fields: ["sourceNodes", "messages"]}
 });

 // Use the tool with Vercel's AI SDK
@@ -4,13 +4,11 @@
  "root": true,
  "pages": [
    "---Guide---",
-    "what-is-llamaindex",
    "index",
    "getting_started",
-    "migration",
-    "guide",
-    "examples",
+    "tutorials",
    "modules",
-    "integration"
+    "integration",
+    "migration"
  ]
 }
@@ -75,7 +75,7 @@ Now:
 import { SimpleDirectoryReader } from "@llamaindex/readers/directory";
 ```

-For more details about available data loaders and their usage, check the [Loading Data](/docs/llamaindex/guide/loading).
+For more details about available data loaders and their usage, check the [Loading Data](/docs/llamaindex/modules/loading).

 ### 4. Prefer using `llamaindex` instead of `@llamaindex/core`

@@ -2,6 +2,8 @@
 title: Agents
 ---

+**Note**: Agents are deprecated; use [Agent Workflows](/docs/llamaindex/modules/agent_workflow) instead.
+
 An “agent” is an automated reasoning and decision engine. It takes in a user input/query and can make internal decisions for executing that query in order to return the correct result. The key agent components can include, but are not limited to:

 - Breaking down a complex question into smaller ones
@@ -19,11 +21,6 @@ LlamaIndex.TS comes with a few built-in agents, but you can also create your own
 - ReACT Agent
 - Meta3.1 504B via Bedrock (in `@llamaIndex/community`)

-## Examples
-
- [OpenAI Agent](/docs/llamaindex/examples/agent)
- [Gemini Agent](/docs/llamaindex/examples/agent_gemini)
-
 ## Api References

 - [OpenAIAgent](/docs/api/classes/OpenAIAgent)
@@ -1,5 +1,5 @@
 {
  "title": "Migration",
  "description": "Migration between different versions",
-  "pages": ["0.8-to-0.9"]
+  "pages": ["0.8-to-0.9", "deprecated"]
 }
@@ -1,12 +1,9 @@
 ---
-title: Agent Workflow
+title: Agent Workflows
 ---

-import { DynamicCodeBlock } from 'fumadocs-ui/components/dynamic-codeblock';
-import CodeSource from "!raw-loader!../../../../../../../examples/agentworkflow/blog-writer.ts";
-import { Tab, Tabs } from "fumadocs-ui/components/tabs";

-Agent Workflows are a powerful system that enables you to create and orchestrate one or multiple agents with tools to perform specific tasks. It's built on top of the base `Workflow` system and provides a streamlined interface for agent interactions.
+Agent Workflows are a powerful system that enables you to create and orchestrate one or multiple agents with tools to perform specific tasks. It's built on top of the base [`Workflow`](/docs/llamaindex/modules/workflows) system and provides a streamlined interface for agent interactions.

 ## Usage

@@ -15,11 +12,11 @@ Agent Workflows are a powerful system that enables you to create and orchestrate
 The simplest use case is creating a single agent with specific tools. Here's an example of creating an assistant that tells jokes:

 ```typescript
-import { agent, FunctionTool } from "llamaindex";
-import { OpenAI } from "@llamaindex/openai";
+import { agent, tool } from "llamaindex";
+import { openai } from "@llamaindex/openai";

 // Define a joke-telling tool
-const jokeTool = FunctionTool.from(
+const jokeTool = tool(
  () => "Baby Llama is called cria",
  {
    name: "joke",
@@ -28,15 +25,13 @@ const jokeTool = FunctionTool.from(
 );

 // Create a single agent workflow with the tool
-const workflow = agent({
+const jokeAgent = agent({
  tools: [jokeTool],
-  llm: new OpenAI({
-    model: "gpt-4o-mini",
-  }),
+  llm: openai({ model: "gpt-4o-mini" }),
 });

 // Run the workflow
-const result = await workflow.run("Tell me something funny");
+const result = await jokeAgent.run("Tell me something funny");
 console.log(result); // Baby Llama is called cria
 ```

@@ -73,8 +68,8 @@ An Agent Workflow can orchestrate multiple agents, enabling complex interactions
 Here's an example of a multi-agent system that combines joke-telling and weather information:

 ```typescript
-import { multiAgent, agent, FunctionTool } from "llamaindex";
-import { OpenAI } from "@llamaindex/openai";
+import { multiAgent, agent, tool } from "llamaindex";
+import { openai } from "@llamaindex/openai";
 import { z } from "zod";

 // Create a weather agent
@@ -82,18 +77,18 @@ const weatherAgent = agent({
  name: "WeatherAgent",
  description: "Provides weather information for any city",
  tools: [
-    FunctionTool.from(
-      ({ city }: { city: string }) => `The weather in ${city} is sunny`,
+    tool(
      {
        name: "fetchWeather",
        description: "Get weather information for a city",
        parameters: z.object({
          city: z.string(),
        }),
+        execute: ({ city }) => `The weather in ${city} is sunny`,
      }
    ),
  ],
-  llm: new OpenAI({ model: "gpt-4o-mini" }),
+  llm: openai({ model: "gpt-4o-mini" }),
 });

 // Create a joke-telling agent
@@ -101,18 +96,18 @@ const jokeAgent = agent({
  name: "JokeAgent",
  description: "Tells jokes and funny stories",
  tools: [jokeTool], // Using the joke tool defined earlier
-  llm: new OpenAI({ model: "gpt-4o-mini" }),
+  llm: openai({ model: "gpt-4o-mini" }),
  canHandoffTo: [weatherAgent], // Can hand off to the weather agent
 });

 // Create the multi-agent workflow
-const workflow = multiAgent({
+const agents = multiAgent({
  agents: [jokeAgent, weatherAgent],
  rootAgent: jokeAgent, // Start with the joke agent
 });

 // Run the workflow
-const result = await workflow.run(
+const result = await agents.run(
  "Give me a morning greeting with a joke and the weather in San Francisco"
 );
 ```
@@ -6,7 +6,7 @@ import { ChatDemoRSC } from '../../../../../components/demo/chat/rsc/demo';

 Using [chat-ui](https://github.com/run-llama/chat-ui), it's easy to add a chat interface to your LlamaIndexTS application using [Next.js RSC](https://nextjs.org/docs/app/building-your-application/rendering/server-components) and [Vercel AI RSC](https://sdk.vercel.ai/docs/ai-sdk-rsc/overview).

-With RSC, the chat messages are not returned as JSON from the server (like when using an [API route](./chat)), instead the chat message components are rendered on the server side.
+With RSC, the chat messages are not returned as JSON from the server (like when using an [API route](/docs/llamaindex/modules/chat/chat)), instead the chat message components are rendered on the server side.
 This is useful, for example, for rendering a whole chat history on the server before sending it to the client. [Check here](https://sdk.vercel.ai/docs/getting-started/navigating-the-library#when-to-use-ai-sdk-rsc) for a discussion of when to use RSC.

 For implementing a chat interface with RSC, you need to create an AI action and then connect the chat interface to use it.
@@ -2,7 +2,7 @@
 title: Index
 ---

-An index is the basic container and organization for your data. LlamaIndex.TS supports two indexes:
+An index is the basic container and organization for your data. LlamaIndex.TS supports three indexes:

 - `VectorStoreIndex` - will send the top-k `Node`s to the LLM when generating a response. The default top-k is 2.
 - `SummaryIndex` - will send every `Node` in the index to the LLM in order to generate a response
@@ -35,7 +35,7 @@ Currently, the following readers are mapped to specific file types:

 - [TextFileReader](/docs/api/classes/TextFileReader): `.txt`
 - [PDFReader](/docs/api/classes/PDFReader): `.pdf`
- [PapaCSVReader](/docs/api/classes/PapaCSVReader): `.csv`
+- [CSVReader](/docs/api/classes/CSVReader): `.csv`
 - [MarkdownReader](/docs/api/classes/MarkdownReader): `.md`
 - [DocxReader](/docs/api/classes/DocxReader): `.docx`
 - [HTMLReader](/docs/api/classes/HTMLReader): `.htm`, `.html`
@@ -6,11 +6,11 @@ Chat stores manage chat history by storing sequences of messages in a structured

 ## Available Chat Stores

- [SimpleChatStore](/docs/api/classes/SimpleChatStore): A simple in-memory chat store with support for [persisting](/docs/llamaindex/modules/data_stores/#local-storage) data to disk.
+- [SimpleChatStore](/docs/api/classes/SimpleChatStore): A simple in-memory chat store with support for [persisting](/docs/llamaindex/modules/data_stores#local-storage) data to disk.

 Check the [LlamaIndexTS Github](https://github.com/run-llama/LlamaIndexTS) for the most up to date overview of integrations.

 ## API Reference

- [BaseChatStore](/docs/api/interfaces/BaseChatStore)
+- [BaseChatStore](/docs/api/classes/BaseChatStore)

@@ -2,12 +2,12 @@
 title: Document Stores
 ---

-Document stores contain ingested document chunks, i.e. [Node](/docs/llamaindex/modules/documents_and_nodes/index)s.
+Document stores contain ingested document chunks, i.e. [Node](/docs/llamaindex/modules/documents_and_nodes)s.

 ## Available Document Stores

- [SimpleDocumentStore](/docs/api/classes/SimpleDocumentStore): A simple in-memory document store with support for [persisting](/docs/llamaindex/modules/data_stores/#local-storage) data to disk.
- [PostgresDocumentStore](/docs/api/classes/PostgresDocumentStore): A PostgreSQL document store, see [PostgreSQL Storage](/docs/llamaindex/modules/data_stores/#postgresql-storage).
+- [SimpleDocumentStore](/docs/api/classes/SimpleDocumentStore): A simple in-memory document store with support for [persisting](/docs/llamaindex/modules/data_stores#local-storage) data to disk.
+- [PostgresDocumentStore](/docs/api/classes/PostgresDocumentStore): A PostgreSQL document store, see [PostgreSQL Storage](/docs/llamaindex/modules/data_stores#postgresql-storage).

 Check the [LlamaIndexTS Github](https://github.com/run-llama/LlamaIndexTS) for the most up to date overview of integrations.

@@ -6,8 +6,8 @@ Index stores are underlying storage components that contain metadata(i.e. inform

 ## Available Index Stores

- [SimpleIndexStore](/docs/api/classes/SimpleIndexStore): A simple in-memory index store with support for [persisting](/docs/llamaindex/modules/data_stores/#local-storage) data to disk.
- [PostgresIndexStore](/docs/api/classes/PostgresIndexStore): A PostgreSQL index store, , see [PostgreSQL Storage](/docs/llamaindex/modules/data_stores/#postgresql-storage).
+- [SimpleIndexStore](/docs/api/classes/SimpleIndexStore): A simple in-memory index store with support for [persisting](/docs/llamaindex/modules/data_stores#local-storage) data to disk.
+- [PostgresIndexStore](/docs/api/classes/PostgresIndexStore): A PostgreSQL index store, see [PostgreSQL Storage](/docs/llamaindex/modules/data_stores#postgresql-storage).

 Check the [LlamaIndexTS Github](https://github.com/run-llama/LlamaIndexTS) for the most up to date overview of integrations.

@@ -2,12 +2,12 @@
 title: Key-Value Stores
 ---

-Key-Value Stores represent underlying storage components used in [Document Stores](/docs/llamaindex/modules/data_stores/doc_stores/index) and [Index Stores](/docs/llamaindex/modules/data_stores/index_stores/index)
+Key-Value Stores represent underlying storage components used in [Document Stores](/docs/llamaindex/modules/data_stores/doc_stores) and [Index Stores](/docs/llamaindex/modules/data_stores/index_stores)

 ## Available Key-Value Stores

- [SimpleKVStore](/docs/api/classes/SimpleKVStore): A simple Key-Value store with support of [persisting](/docs/llamaindex/modules/data_stores/#local-storage) data to disk.
- [PostgresKVStore](/docs/api/classes/PostgresKVStore): A PostgreSQL Key-Value store, see [PostgreSQL Storage](/docs/llamaindex/modules/data_stores/#postgresql-storage).
+- [SimpleKVStore](/docs/api/classes/SimpleKVStore): A simple Key-Value store with support for [persisting](/docs/llamaindex/modules/data_stores#local-storage) data to disk.
+- [PostgresKVStore](/docs/api/classes/PostgresKVStore): A PostgreSQL Key-Value store, see [PostgreSQL Storage](/docs/llamaindex/modules/data_stores#postgresql-storage).

 Check the [LlamaIndexTS Github](https://github.com/run-llama/LlamaIndexTS) for the most up to date overview of integrations.

@@ -8,7 +8,7 @@ Vector stores save embedding vectors of your ingested document chunks.

 Available Vector Stores are shown on the sidebar to the left. Additionally the following integrations exist without separate documentation:

- [SimpleVectorStore](/docs/api/classes/SimpleVectorStore): A simple in-memory vector store with optional [persistance](/docs/llamaindex/modules/data_stores/#local-storage) to disk.
+- [SimpleVectorStore](/docs/api/classes/SimpleVectorStore): A simple in-memory vector store with optional [persistence](/docs/llamaindex/modules/data_stores#local-storage) to disk.
 - [AstraDBVectorStore](/docs/api/classes/AstraDBVectorStore): A cloud-native, scalable Database-as-a-Service built on Apache Cassandra, see [datastax.com](https://www.datastax.com/products/datastax-astra)
 - [ChromaVectorStore](/docs/api/classes/ChromaVectorStore): An open-source vector database, focused on ease of use and performance, see [trychroma.com](https://www.trychroma.com/)
 - [MilvusVectorStore](/docs/api/classes/MilvusVectorStore): An open-source, high-performance, highly scalable vector database, see [milvus.io](https://milvus.io/)
@@ -19,6 +19,3 @@ Available Vector Stores are shown on the sidebar to the left. Additionally the f

 Check the [LlamaIndexTS Github](https://github.com/run-llama/LlamaIndexTS) for the most up to date overview of integrations.

-## API Reference
-
- [BaseVectorStore](/docs/api/classes/BaseVectorStore)
@@ -56,10 +56,10 @@ const vectorStore = new QdrantVectorStore({

 ```ts
 const document = new Document({ text: essay, id_: path });
-
-const index = await VectorStoreIndex.fromDocuments([document], {
-  vectorStore,
-});
+const storageContext = await storageContextFromDefaults({ vectorStore });
+const index = await VectorStoreIndex.fromDocuments([document], {
+  storageContext,
+});
 ```

 ## Query the index
@@ -91,11 +91,11 @@ async function main() {
  });

  const document = new Document({ text: essay, id_: path });
-
+  const storageContext = await storageContextFromDefaults({ vectorStore });
  const index = await VectorStoreIndex.fromDocuments([document], {
-    vectorStore,
+    storageContext,
  });
-
+  
  const queryEngine = index.asQueryEngine();

  const response = await queryEngine.query({
@@ -0,0 +1,166 @@
+---
+title: Supabase Vector Store
+---
+
+[supabase.com](https://supabase.com/)
+
+To use this vector store, you need a Supabase project. You can create one at [supabase.com](https://supabase.com/).
+
+## Installation
+
+import { Tab, Tabs } from "fumadocs-ui/components/tabs";
+
+<Tabs groupId="install" items={["npm", "yarn", "pnpm"]} persist>
+  ```shell tab="npm"
+  npm install llamaindex @llamaindex/supabase
+  ```
+
+  ```shell tab="yarn"
+  yarn add llamaindex @llamaindex/supabase
+  ```
+
+  ```shell tab="pnpm"
+  pnpm add llamaindex @llamaindex/supabase
+  ```
+</Tabs>
+
+## Database Setup
+
+Before using the vector store, you need to:
+1. Enable the `pgvector` extension
+2. Create a table for storing vectors
+3. Create a vector similarity search function
+
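+```sql
+-- Enable the pgvector extension
+create extension if not exists vector;
+
+-- Create a table for storing vectors
+create table documents (
+  id uuid primary key,
+  content text,
+  metadata jsonb,
+  embedding vector(1536)
+);
+```
+
+```sql
+-- Create a function for similarity search
+create function match_documents (
+  query_embedding vector(1536),
+  match_count int
+) returns table (
+  id uuid,
+  content text,
+  metadata jsonb,
+  embedding vector(1536),
+  similarity float
+)
+language plpgsql
+as $$
+begin
+  return query
+  select
+    -- Columns are qualified with the table name so they do not clash
+    -- with the output columns declared in "returns table"
+    documents.id,
+    documents.content,
+    documents.metadata,
+    documents.embedding,
+    1 - (documents.embedding <=> query_embedding) as similarity
+  from documents
+  order by documents.embedding <=> query_embedding
+  limit match_count;
+end;
+$$;
+```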
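
The `similarity` column returned by `match_documents` is `1 - (embedding <=> query_embedding)`, where pgvector's `<=>` operator is cosine distance. As a rough illustration of what the database computes, here is a minimal TypeScript sketch. The `cosineSimilarity` and `cosineDistance` helpers are hypothetical names used only for this example and are not part of `@llamaindex/supabase`.

```typescript
// Hypothetical helper, for illustration only: the value that
// `1 - (a <=> b)` yields in pgvector for two vectors a and b.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Cosine distance, the quantity used for ordering by `<=>`.
function cosineDistance(a: number[], b: number[]): number {
  return 1 - cosineSimilarity(a, b);
}
```

Ordering by ascending distance, as the SQL function does, is equivalent to ordering by descending similarity.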
+
+## Importing the modules
+
+```ts
+import { Document, VectorStoreIndex, storageContextFromDefaults } from "llamaindex";
+import { SupabaseVectorStore } from "@llamaindex/supabase";
+```
+
+## Setup Supabase
+
+```ts
+const vectorStore = new SupabaseVectorStore({
+  supabaseUrl: process.env.SUPABASE_URL,
+  supabaseKey: process.env.SUPABASE_KEY,
+  table: "documents",
+});
+```
+
+## Setup the index
+
+```ts
+const documents = [
+  new Document({ 
+    text: "Sample document text",
+    metadata: { source: "example" }
+  })
+];
+
+const storageContext = await storageContextFromDefaults({ vectorStore });
+const index = await VectorStoreIndex.fromDocuments(documents, {
+  storageContext,
+});
+```
+
+## Query the index
+
+```ts
+const queryEngine = index.asQueryEngine();
+
+const response = await queryEngine.query({
+  query: "What is in the document?",
+});
+
+// Output response
+console.log(response.toString());
+```
+
+## Full code
+
+```ts
+import { Document, VectorStoreIndex, storageContextFromDefaults } from "llamaindex";
+import { SupabaseVectorStore } from "@llamaindex/supabase";
+
+async function main() {
+  // Initialize the vector store
+  const vectorStore = new SupabaseVectorStore({
+    supabaseUrl: process.env.SUPABASE_URL,
+    supabaseKey: process.env.SUPABASE_KEY,
+    table: "documents",
+  });
+
+  // Create sample documents
+  const documents = [
+    new Document({
+      text: "Vector search enables semantic similarity search",
+      metadata: {
+        source: "research_paper",
+        author: "Jane Smith",
+      },
+    }),
+  ];
+
+  // Create storage context
+  const storageContext = await storageContextFromDefaults({ vectorStore });
+
+  // Create and store embeddings
+  const index = await VectorStoreIndex.fromDocuments(documents, {
+    storageContext,
+  });
+
+  // Query the index
+  const queryEngine = index.asQueryEngine();
+  const response = await queryEngine.query({
+    query: "What is vector search?",
+  });
+
+  // Output response
+  console.log(response.toString());
+}
+
+main().catch(console.error);
+```
+
+## API Reference
+
+- [SupabaseVectorStore](/docs/api/classes/SupabaseVectorStore)
@@ -4,7 +4,7 @@ title: Embedding

 The embedding model in LlamaIndex is responsible for creating numerical representations of text. By default, LlamaIndex will use the `text-embedding-ada-002` model from OpenAI.

-This can be explicitly updated through `Settings`
+This can be explicitly updated through `Settings.embedModel`.

 ## Installation

@@ -35,7 +35,7 @@ Settings.embedModel = new OpenAIEmbedding({

 ## Local Embedding

-For local embeddings, you can use the [HuggingFace](/docs/llamaindex/modules/embeddings/available_embeddings/huggingface) embedding model.
+For local embeddings, you can use the [HuggingFace](/docs/llamaindex/modules/embeddings/huggingface) embedding model.

 ## Local Ollama Embeddings With Remote Host

@@ -74,4 +74,4 @@ the response is not correct with a score of 2.5

 ## API Reference

- [CorrectnessEvaluator](/docs/api/classes/CorrectnessEvaluator)
+- [CorrectnessEvaluator](/docs/api/classes/CorrectnessEvaluator)
@@ -8,7 +8,7 @@ Currently, the following components are Transformation objects:

 - [SentenceSplitter](/docs/api/classes/SentenceSplitter)
 - [MetadataExtractor](/docs/llamaindex/modules/documents_and_nodes/metadata_extraction)
- [Embeddings](/docs/llamaindex/modules/embeddings/index)
+- [Embeddings](/docs/llamaindex/modules/embeddings)

 ## Usage Pattern

@@ -1,98 +0,0 @@
---
-title: OpenAI
---
-
-## Installation
-
-import { Tab, Tabs } from "fumadocs-ui/components/tabs";
-
-<Tabs groupId="install" items={["npm", "yarn", "pnpm"]} persist>
-	```shell tab="npm"
-	npm install llamaindex @llamaindex/openai
-	```
-
-	```shell tab="yarn"
-	yarn add llamaindex @llamaindex/openai
-	```
-
-	```shell tab="pnpm"
-	pnpm add llamaindex @llamaindex/openai
-	```
-</Tabs>
-
-
-```ts
-import { OpenAI } from "@llamaindex/openai";
-import { Settings } from "llamaindex";
-
-Settings.llm = new OpenAI({ model: "gpt-3.5-turbo", temperature: 0, apiKey: <YOUR_API_KEY> });
-```
-
-You can setup the apiKey on the environment variables, like:
-
-```bash
-export OPENAI_API_KEY="<YOUR_API_KEY>"
-```
-
-## Load and index documents
-
-For this example, we will use a single document. In a real-world scenario, you would have multiple documents to index.
-
-```ts
-import { Document, VectorStoreIndex } from "llamaindex";
-
-const document = new Document({ text: essay, id_: "essay" });
-
-const index = await VectorStoreIndex.fromDocuments([document]);
-```
-
-## Query
-
-```ts
-const queryEngine = index.asQueryEngine();
-
-const query = "What is the meaning of life?";
-
-const results = await queryEngine.query({
-  query,
-});
-```
-
-## Full Example
-
-```ts
-import { OpenAI } from "@llamaindex/openai";
-import { Document, Settings, VectorStoreIndex } from "llamaindex";
-
-// Use the OpenAI LLM
-Settings.llm = new OpenAI({ model: "gpt-3.5-turbo", temperature: 0 });
-
-async function main() {
-  const document = new Document({ text: essay, id_: "essay" });
-
-  // Load and index documents
-  const index = await VectorStoreIndex.fromDocuments([document]);
-
-  // get retriever
-  const retriever = index.asRetriever();
-
-  // Create a query engine
-  const queryEngine = index.asQueryEngine({
-    retriever,
-  });
-
-  const query = "What is the meaning of life?";
-
-  // Query
-  const response = await queryEngine.query({
-    query,
-  });
-
-  // Log the response
-  console.log(response.response);
-}
-```
-
-## API Reference
-
- [OpenAI](/docs/api/classes/OpenAI)
@@ -31,6 +31,20 @@ Settings.llm = new Gemini({
 });
 ```

+## Usage with Proxy
+
+```ts
+import { Gemini, GEMINI_MODEL } from "@llamaindex/google";
+import { Settings } from "llamaindex";
+
+Settings.llm = new Gemini({
+  model: GEMINI_MODEL.GEMINI_PRO,
+  requestOptions: {
+    baseUrl: "<YOUR_PROXY_URL>", // optional, but useful for custom endpoints
+  }
+});
+```
+
 ### Usage with Vertex AI

 To use Gemini via Vertex AI you can use `GeminiVertexSession`.
@@ -3,7 +3,7 @@ title: Groq
 ---

 import { DynamicCodeBlock } from 'fumadocs-ui/components/dynamic-codeblock';
-import CodeSource from "!raw-loader!../../../../../../../../../examples/groq.ts";
+import CodeSource from "!raw-loader!../../../../../../../../examples/groq.ts";

 ## Installation

@@ -45,7 +45,7 @@ export AZURE_OPENAI_DEPLOYMENT="gpt-4" # or some other deployment name

 ## Local LLM

-For local LLMs, currently we recommend the use of [Ollama](/docs/llamaindex/modules/llms/available_llms/ollama) LLM.
+For local LLMs, currently we recommend the use of [Ollama](/docs/llamaindex/modules/llms/ollama) LLM.

 ## Available LLMs

@@ -55,6 +55,35 @@ const results = await queryEngine.query({
 });
 ```

+## Using JSON Response Format
+
+You can configure Ollama to return responses in JSON format:
+
+```ts
+import { Ollama } from "@llamaindex/ollama";
+import { z } from "zod";
+
+// Simple JSON format
+const llm = new Ollama({ 
+  model: "llama2", 
+  temperature: 0,
+  responseFormat: { type: "json_object" }
+});
+
+// Using Zod schema for validation
+const responseSchema = z.object({
+  summary: z.string(),
+  topics: z.array(z.string()),
+  sentiment: z.enum(["positive", "negative", "neutral"])
+});
+
+const structuredLlm = new Ollama({
+  model: "llama2", 
+  temperature: 0,
+  responseFormat: responseSchema  
+});
+```
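
With the plain `json_object` format the model's reply is still a string, so it usually needs to be parsed and checked before use. Below is a minimal sketch of that step, written without Zod so that it is self-contained. `parseSummary` and the `Summary` type are hypothetical and simply mirror the fields of the schema above.

```typescript
type Sentiment = "positive" | "negative" | "neutral";

interface Summary {
  summary: string;
  topics: string[];
  sentiment: Sentiment;
}

const SENTIMENTS: readonly string[] = ["positive", "negative", "neutral"];

// Hypothetical helper: parses raw model output and checks its shape.
function parseSummary(raw: string): Summary {
  const data: unknown = JSON.parse(raw); // throws on invalid JSON
  if (typeof data !== "object" || data === null) {
    throw new Error("Expected a JSON object");
  }
  const { summary, topics, sentiment } = data as Record<string, unknown>;
  if (typeof summary !== "string") {
    throw new Error("`summary` must be a string");
  }
  if (!Array.isArray(topics) || !topics.every((t) => typeof t === "string")) {
    throw new Error("`topics` must be an array of strings");
  }
  if (typeof sentiment !== "string" || !SENTIMENTS.includes(sentiment)) {
    throw new Error("`sentiment` must be positive, negative, or neutral");
  }
  return {
    summary,
    topics: topics as string[],
    sentiment: sentiment as Sentiment,
  };
}
```

In a project that already depends on Zod, calling `parse` on the schema gives the same guarantee with less code.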
+
 ## Full Example

 ```ts
@@ -0,0 +1,393 @@
+---
+title: OpenAI
+---
+
+## Installation
+
+import { Tab, Tabs } from "fumadocs-ui/components/tabs";
+
+<Tabs groupId="install" items={["npm", "yarn", "pnpm"]} persist>
+	```shell tab="npm"
+	npm install llamaindex @llamaindex/openai
+	```
+
+	```shell tab="yarn"
+	yarn add llamaindex @llamaindex/openai
+	```
+
+	```shell tab="pnpm"
+	pnpm add llamaindex @llamaindex/openai
+	```
+</Tabs>
+
+
+```ts
+import { OpenAI } from "@llamaindex/openai";
+import { Settings } from "llamaindex";
+
+Settings.llm = new OpenAI({ model: "gpt-3.5-turbo", temperature: 0, apiKey: <YOUR_API_KEY> });
+```
+
+You can also set the API key through an environment variable:
+
+```bash
+export OPENAI_API_KEY="<YOUR_API_KEY>"
+```
+
+You can optionally set a custom base URL, like:
+
+```bash
+export OPENAI_BASE_URL="https://api.scaleway.ai/v1"
+```
+
+or
+
+```ts
+Settings.llm = new OpenAI({ model: "gpt-3.5-turbo", temperature: 0, apiKey: <YOUR_API_KEY>, baseURL: "https://api.scaleway.ai/v1" });
+```
+
+## Using OpenAI Responses API
+
+The OpenAI Responses API provides enhanced functionality for handling complex interactions, including built-in tools, annotations, and streaming responses. Here's how to use it:
+
+### Basic Setup
+
+```ts
+import { openaiResponses } from "@llamaindex/openai";
+
+const llm = openaiResponses({
+  model: "gpt-4o",
+  temperature: 0.1,
+  maxOutputTokens: 1000
+});
+```
+
+### Message Content Types
+
+The API supports different types of message content, including text and images:
+
+```ts
+const response = await llm.chat({
+  messages: [
+    {
+      role: "user",
+      content: [
+        {
+          type: "input_text",
+          text: "What's in this image?"
+        },
+        {
+          type: "input_image",
+          image_url: "https://example.com/image.jpg",
+          detail: "auto" // Optional: can be "auto", "low", or "high"
+        }
+      ]
+    }
+  ]
+});
+```
+
+### Advanced Features
+
+#### Built-in Tools
+
+```ts
+const llm = openaiResponses({
+  model: "gpt-4o",
+  builtInTools: [
+    {
+      type: "function",
+      name: "search_files",
+      description: "Search through available files"
+    }
+  ],
+  strict: true // Enable strict mode for tool calls
+});
+```
+
+#### Response Tracking and Storage
+
+```ts
+const llm = openaiResponses({
+  trackPreviousResponses: true, // Enable response tracking
+  store: true, // Store responses for future reference
+  user: "user-123", // Associate responses with a user
+  callMetadata: { // Add custom metadata
+    sessionId: "session-123",
+    context: "customer-support"
+  }
+});
+```
+
+#### Streaming Responses
+
+```ts
+const response = await llm.chat({
+  messages: [
+    {
+      role: "user",
+      content: "Generate a long response"
+    }
+  ],
+  stream: true // Enable streaming
+});
+
+for await (const chunk of response) {
+  console.log(chunk.delta); // Process each chunk of the response
+}
+```
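When the full text is needed after streaming, the deltas can be accumulated as they arrive. The following is a minimal sketch, assuming each chunk exposes a `delta` string as in the loop above; `StreamChunk`, `collectStream`, and `fakeStream` are hypothetical names used only for illustration, and `fakeStream` stands in for a real streamed response so the example runs offline.

```typescript
// Shape assumed for illustration: each streamed chunk carries a `delta` string.
interface StreamChunk {
  delta: string;
}

// Collect all deltas from a stream into the full response text.
async function collectStream(
  stream: AsyncIterable<StreamChunk>,
): Promise<string> {
  let text = "";
  for await (const chunk of stream) {
    text += chunk.delta;
  }
  return text;
}

// Stand-in for a streamed chat response, so the sketch runs without an API key.
async function* fakeStream(): AsyncGenerator<StreamChunk> {
  yield { delta: "Hello" };
  yield { delta: ", " };
  yield { delta: "world" };
}

collectStream(fakeStream()).then((text) => console.log(text)); // prints "Hello, world"
```

With a real client, the same helper could presumably be given the object returned by `llm.chat({ ..., stream: true })`.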
+
+### Configuration Options
+
+The OpenAI Responses API supports various configuration options:
+
+```ts
+const llm = openaiResponses({
+  // Model and basic settings
+  model: "gpt-4o",
+  temperature: 0.1,
+  topP: 1,
+  maxOutputTokens: 1000,
+  
+  // API configuration
+  apiKey: "your-api-key",
+  baseURL: "custom-endpoint",
+  maxRetries: 10,
+  timeout: 60000,
+  
+  // Response handling
+  trackPreviousResponses: false,
+  store: false,
+  strict: false,
+  
+  // Additional options
+  instructions: "Custom instructions for the model",
+  truncation: "auto", // Can be "auto", "disabled", or null
+  include: ["citations", "reasoning"] // Specify what to include in responses
+});
+```
+
+### Response Structure
+
+The API returns responses with rich metadata and optional annotations:
+
+```ts
+interface ResponseStructure {
+  message: {
+    content: string;
+    role: "assistant";
+    options: {
+      built_in_tool_calls: Array<ToolCall>;
+      annotations?: Array<Citation | URLCitation | FilePath>;
+      refusal?: string;
+      reasoning?: ReasoningItem;
+      usage?: ResponseUsage;
+      toolCall?: Array<PartialToolCall>;
+    }
+  }
+}
+```
+
+### Best Practices
+
+1. Use `trackPreviousResponses` when you need conversation continuity
+2. Enable `strict` mode when using tools to ensure accurate function calls
+3. Set appropriate `maxOutputTokens` to control response length
+4. Use `annotations` to track citations and references in responses
+5. Implement error handling for potential API failures and retries
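For the last point, one possible pattern is a small retry wrapper around the call. This is a sketch under stated assumptions, not part of LlamaIndex.TS: `withRetry`, its `attempts` and `baseDelayMs` parameters, and the simulated `flaky` call are hypothetical. The `maxRetries` option shown earlier already configures retries in the client, so a wrapper like this would only add application-level handling on top.

```typescript
// Hypothetical helper: retry an async call with exponential backoff.
// `attempts` and `baseDelayMs` are illustrative defaults.
async function withRetry<T>(
  fn: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 200,
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (i < attempts - 1) {
        // Double the delay after every failed attempt
        await new Promise((resolve) =>
          setTimeout(resolve, baseDelayMs * 2 ** i),
        );
      }
    }
  }
  // All attempts failed: surface the last error to the caller
  throw lastError;
}

// Simulated call that fails twice before succeeding.
let calls = 0;
const flaky = async (): Promise<string> => {
  calls += 1;
  if (calls < 3) throw new Error(`attempt ${calls} failed`);
  return "ok";
};

withRetry(flaky, 3, 10).then((result) => console.log(result, calls)); // prints "ok 3"
```

In application code the wrapped function would be the chat call itself, for example `withRetry(() => llm.chat({ messages }))`.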
+
+## Using JSON Response Format
+
+You can configure OpenAI to return responses in JSON format:
+
+```ts
+Settings.llm = new OpenAI({ 
+  model: "gpt-4o", 
+  temperature: 0,
+  responseFormat: { type: "json_object" }  
+});
+
+// You can also use a Zod schema to validate the response structure
+import { z } from "zod";
+
+const responseSchema = z.object({
+  summary: z.string(),  
+  topics: z.array(z.string()),
+  sentiment: z.enum(["positive", "negative", "neutral"])
+});
+
+Settings.llm = new OpenAI({ 
+  model: "gpt-4o", 
+  temperature: 0,
+  responseFormat: responseSchema  
+});
+```
+
+## Response Formats
+
+The OpenAI LLM supports different response formats to structure the output in specific ways. There are two main approaches to formatting responses:
+
+### 1. JSON Object Format
+
+The simplest way to get structured JSON responses is using the `json_object` response format:
+
+```ts
+Settings.llm = new OpenAI({ 
+  model: "gpt-4o", 
+  temperature: 0,
+  responseFormat: { type: "json_object" }  
+});
+
+const response = await Settings.llm.chat({
+  messages: [
+    {
+      role: "system",
+      content: "You are a helpful assistant that outputs JSON."
+    },
+    {
+      role: "user", 
+      content: "Summarize this meeting transcript"
+    }
+  ]
+});
+
+// Response will be valid JSON
+console.log(response.message.content);
+```
+
+### 2. Schema Validation with Zod
+
+For more robust type safety and validation, you can use Zod schemas to define the expected response structure:
+
+```ts
+import { z } from "zod";
+
+// Define the response schema
+const meetingSchema = z.object({
+  summary: z.string(),
+  participants: z.array(z.string()),
+  actionItems: z.array(z.string()),
+  nextSteps: z.string()
+});
+
+// Configure the LLM with the schema
+Settings.llm = new OpenAI({ 
+  model: "gpt-4o", 
+  temperature: 0,
+  responseFormat: meetingSchema
+});
+
+const response = await Settings.llm.chat({
+  messages: [
+    {
+      role: "user",
+      content: "Summarize this meeting transcript" 
+    }
+  ]
+});
+
+// The content is returned as a JSON string; parse it and validate it against the schema
+const result = meetingSchema.parse(JSON.parse(response.message.content as string));
+console.log(result.summary);
+console.log(result.actionItems);
+```
+
+### Response Format Options
+
+The response format can be configured in two ways:
+
+1. At LLM initialization:
+```ts
+const llm = new OpenAI({
+  model: "gpt-4o",
+  responseFormat: { type: "json_object" } // or a Zod schema
+});
+```
+
+2. Per request:
+```ts
+const response = await llm.chat({
+  messages: [...],
+  responseFormat: { type: "json_object" } // or a Zod schema
+});
+```
+
+The response format options are:
+
+- `{ type: "json_object" }` - Returns responses as JSON objects
+- `zodSchema` - A Zod schema that defines and validates the response structure
+
+### Best Practices
+
+1. Use JSON object format for simple structured responses
+2. Use Zod schemas when you need:
+   - Type safety
+   - Response validation
+   - Complex nested structures
+   - Specific field constraints
+3. Set a low temperature (e.g. 0) when using structured outputs for more reliable formatting
+4. Include clear instructions in system or user messages about the expected response format
+5. Handle potential parsing errors when working with JSON responses
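The last practice can be sketched with plain TypeScript. The helper below is an assumption made for illustration (`safeParseJson` and `ParseResult` are hypothetical names, not LlamaIndex.TS APIs): it wraps `JSON.parse` so that malformed model output yields an error value instead of an exception.

```typescript
// Result type: either the parsed value or a description of what went wrong.
type ParseResult<T> =
  | { ok: true; value: T }
  | { ok: false; error: string };

// Hypothetical helper: parse a JSON string returned by the model
// without throwing on malformed input.
function safeParseJson<T = unknown>(raw: string): ParseResult<T> {
  try {
    return { ok: true, value: JSON.parse(raw) as T };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, error: message };
  }
}

// Well-formed output parses; free-form text is reported as an error.
const good = safeParseJson<{ summary: string }>('{"summary": "All done"}');
const bad = safeParseJson("Sorry, I cannot answer that.");

if (good.ok) console.log(good.value.summary);
if (!bad.ok) console.log("Could not parse model output:", bad.error);
```

Note that the cast to `T` does not check the shape of the parsed value; when the structure matters, validating the result against a Zod schema, as described above, is the more robust option.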
+
+## Load and index documents
+
+For this example, we will use a single document. In a real-world scenario, you would have multiple documents to index.
+
+```ts
+import { Document, VectorStoreIndex } from "llamaindex";
+
+const document = new Document({ text: essay, id_: "essay" });
+
+const index = await VectorStoreIndex.fromDocuments([document]);
+```
+
+## Query
+
+```ts
+const queryEngine = index.asQueryEngine();
+
+const query = "What is the meaning of life?";
+
+const results = await queryEngine.query({
+  query,
+});
+```
+
+## Full Example
+
+```ts
+import { OpenAI } from "@llamaindex/openai";
+import { Document, Settings, VectorStoreIndex } from "llamaindex";
+
+// Use the OpenAI LLM
+Settings.llm = new OpenAI({ model: "gpt-3.5-turbo", temperature: 0 });
+
+async function main() {
+  const document = new Document({ text: essay, id_: "essay" });
+
+  // Load and index documents
+  const index = await VectorStoreIndex.fromDocuments([document]);
+
+  // get retriever
+  const retriever = index.asRetriever();
+
+  // Create a query engine
+  const queryEngine = index.asQueryEngine({
+    retriever,
+  });
+
+  const query = "What is the meaning of life?";
+
+  // Query
+  const response = await queryEngine.query({
+    query,
+  });
+
+  // Log the response
+  console.log(response.response);
+}
+```
+
+## API Reference
+
+- [OpenAI](/docs/api/classes/OpenAI)
@@ -0,0 +1,134 @@
+---
+title: Perplexity LLM
+---
+
+## Installation
+
+import { Tab, Tabs } from "fumadocs-ui/components/tabs";
+
+<Tabs groupId="install" items={["npm", "yarn", "pnpm"]} persist>
+	```shell tab="npm"
+	npm install @llamaindex/perplexity
+	```
+
+	```shell tab="yarn"
+	yarn add @llamaindex/perplexity
+	```
+
+	```shell tab="pnpm"
+	pnpm add @llamaindex/perplexity
+	```
+</Tabs>
+
+## Usage
+
+```ts
+import { Settings } from "llamaindex";
+import { perplexity } from "@llamaindex/perplexity";
+
+Settings.llm = perplexity({
+  apiKey: "<YOUR_API_KEY>",
+  model: "sonar", // or another available model
+});
+```
+
+## Example
+
+```ts
+import { perplexity } from "@llamaindex/perplexity";
+
+const perplexityLlm = perplexity({
+  apiKey: "<YOUR_API_KEY>",
+  model: "sonar", // or another available model
+});
+
+async function main() {
+  const response = await perplexityLlm.chat({
+    messages: [
+      {
+        role: "system",
+        content: "You are an AI assistant",
+      },
+      {
+        role: "user",
+        content: "Tell me about San Francisco",
+      },
+    ],
+    stream: false,
+  });
+  console.log(response);
+
+  const stream = await perplexityLlm.chat({
+    messages: [
+      {
+        role: "system",
+        content: "You are a creative AI assistant that tells engaging stories",
+      },
+      {
+        role: "user",
+        content: "Tell me a short story",
+      },
+    ],
+    stream: true,
+  });
+
+  console.log("\nStreaming response:");
+  for await (const chunk of stream) {
+    process.stdout.write(chunk.delta);
+  }
+}
+
+main().catch(console.error);
+```
+
+## Full Example
+
+```ts
+import { perplexity } from "@llamaindex/perplexity";
+import { Document, Settings, VectorStoreIndex } from "llamaindex";
+
+// Use the perplexity LLM
+Settings.llm = perplexity({ model: "sonar", apiKey: "<YOUR_API_KEY>" });
+
+async function main() {
+  const document = new Document({ text: essay, id_: "essay" });
+
+  // Load and index documents
+  const index = await VectorStoreIndex.fromDocuments([document]);
+
+  // get retriever
+  const retriever = index.asRetriever();
+
+  // Create a query engine
+  const queryEngine = index.asQueryEngine({
+    retriever,
+  });
+
+  const query = "What is the meaning of life?";
+
+  // Query
+  const response = await queryEngine.query({
+    query,
+  });
+
+  // Log the response
+  console.log(response.response);
+}
+```
+
+## Available Models
+
+The following models are available:
+
+- `sonar`: 128k context window
+- `sonar-pro`: 200k context window
+- `sonar-deep-research`: 128k context window
+- `sonar-reasoning`: 128k context window
+- `sonar-reasoning-pro`: 128k context window
+- `r1-1776`: 128k context window
+
+
+## Limitations
+
+The Perplexity LLM does not currently support function calling.
+
+## API Reference
+
+- [Perplexity](/docs/api/classes/Perplexity)
@@ -8,15 +8,15 @@ import { Tab, Tabs } from "fumadocs-ui/components/tabs";

 <Tabs groupId="install" items={["npm", "yarn", "pnpm"]} persist>
 	```shell tab="npm"
-	npm install llamaindex
+	npm install @llamaindex/together
 	```

 	```shell tab="yarn"
-	yarn add llamaindex
+	yarn add @llamaindex/together
 	```

 	```shell tab="pnpm"
-	pnpm add llamaindex
+	pnpm add @llamaindex/together
 	```
 </Tabs>

@@ -5,7 +5,7 @@ description: Learn how to use Node Parsers and Text Splitters to extract data fr
 import { CodeNodeParserDemo } from '../../../../../components/demo/code-node-parser.tsx';
 import { Tab, Tabs } from "fumadocs-ui/components/tabs";

-Node parsers are a simple abstraction that take a list of documents, and chunk them into `Node` objects, such that each node is a specific chunk of the parent document. When a document is broken into nodes, all of it's attributes are inherited to the children nodes (i.e. `metadata`, text and metadata templates, etc.). You can read more about `Node` and `Document` properties [here](./).
+Node parsers are a simple abstraction that takes a list of documents and chunks them into `Node` objects, such that each node is a specific chunk of the parent document. When a document is broken into nodes, all of its attributes are inherited by the child nodes (e.g. `metadata`, text and metadata templates, etc.). You can read more about `Node` and `Document` properties [here](/docs/llamaindex/modules/loading).

 ## NodeParser

@@ -28,14 +28,21 @@ Answer:`;

 ### 1. Customizing the default prompt on initialization

-The first method is to create a new instance of `ResponseSynthesizer` (or the module you would like to update the prompt) and pass the custom prompt to the `responseBuilder` parameter. Then, pass the instance to the `asQueryEngine` method of the index.
+The first method is to create a new Response Synthesizer (or whichever module's prompt you would like to update) with the `getResponseSynthesizer` function. Instead of passing the custom prompt to the deprecated `responseBuilder` parameter, call `getResponseSynthesizer` with the mode as the first argument and supply the new prompt via the options parameter.

 ```ts
-// Create an instance of response synthesizer
+// Create an instance of Response Synthesizer
+
+// Deprecated usage:
 const responseSynthesizer = new ResponseSynthesizer({
  responseBuilder: new CompactAndRefine(undefined, newTextQaPrompt),
 });

+// Current usage:
+const responseSynthesizer = getResponseSynthesizer('compact', {
+  textQATemplate: newTextQaPrompt
+})
+
 // Create index
 const index = await VectorStoreIndex.fromDocuments([document]);

@@ -75,5 +82,5 @@ const response = await queryEngine.query({

 ## API Reference

- [ResponseSynthesizer](/docs/api/classes/ResponseSynthesizer)
+- [Response Synthesizer](/docs/llamaindex/modules/response_synthesizer)
 - [CompactAndRefine](/docs/api/classes/CompactAndRefine)
@@ -1,5 +1,5 @@
 ---
-title: ResponseSynthesizer
+title: Response Synthesizer
 ---

 The ResponseSynthesizer is responsible for sending the query, nodes, and prompt templates to the LLM to generate a response. There are a few key modes for generating a response:
@@ -12,15 +12,17 @@ The ResponseSynthesizer is responsible for sending the query, nodes, and prompt
  multiple compact prompts. The same as `refine`, but should result in less LLM calls.
 - `TreeSummarize`: Given a set of text chunks and the query, recursively construct a tree
  and return the root node as the response. Good for summarization purposes.
- `SimpleResponseBuilder`: Given a set of text chunks and the query, apply the query to each text
-  chunk while accumulating the responses into an array. Returns a concatenated string of all
-  responses. Good for when you need to run the same query separately against each text
-  chunk.
+- `MultiModal`: Combines textual inputs with additional modality-specific metadata to generate an integrated response. 
+  It leverages a text QA template to build a prompt that incorporates various input types and produces either streaming or complete responses.
+  This approach is ideal for use cases where enriching the answer with multi-modal context (such as images, audio, or other data) 
+  can enhance the output quality.

 ```typescript
-import { NodeWithScore, TextNode, ResponseSynthesizer } from "llamaindex";
+import { NodeWithScore, TextNode, getResponseSynthesizer, responseModeSchema } from "llamaindex";

-const responseSynthesizer = new ResponseSynthesizer();
+// you can also use responseModeSchema.Enum.refine, responseModeSchema.Enum.tree_summarize, responseModeSchema.Enum.multi_modal
+// or you can use the CompactAndRefine, Refine, TreeSummarize, or MultiModal classes directly
+const responseSynthesizer = getResponseSynthesizer(responseModeSchema.Enum.compact);

 const nodesWithScore: NodeWithScore[] = [
  {
@@ -55,8 +57,9 @@ for await (const chunk of stream) {

 ## API Reference

- [ResponseSynthesizer](/docs/api/classes/ResponseSynthesizer)
+- [getResponseSynthesizer](/docs/api/functions/getResponseSynthesizer)
+- [responseModeSchema](/docs/api/variables/responseModeSchema)
 - [Refine](/docs/api/classes/Refine)
 - [CompactAndRefine](/docs/api/classes/CompactAndRefine)
 - [TreeSummarize](/docs/api/classes/TreeSummarize)
- [SimpleResponseBuilder](/docs/api/classes/SimpleResponseBuilder)
+- [MultiModal](/docs/api/classes/MultiModal)
@@ -7,9 +7,59 @@ A tool can be called to perform custom actions, or retrieve extra information ba
 A result from a tool call can be used by subsequent steps in a workflow, or to compute a final answer.
 For example, a "weather tool" could fetch some live weather information from a geographical location.

+## Tool Function
+
+The `tool` function is a utility provided to define a tool that can be used by an agent. It takes a function and a configuration object as arguments. The configuration object includes the tool's name, description, and parameters.
+
+### Parameters with Zod
+
+The `parameters` field in the tool configuration is defined using `zod`, a TypeScript-first schema declaration and validation library. `zod` allows you to specify the expected structure and types of the input parameters, ensuring that the data passed to the tool is valid.
+
+Example:
+
+```ts
+import { agent, tool } from "llamaindex";
+import { z } from "zod";
+
+// first arg is LLM input, second is bound arg
+const queryKnowledgeBase = async ({ question }, { userToken }) => {
+  const response = await fetch(`https://knowledge-base.com?token=${userToken}&query=${question}`);
+  // ...
+};
+
+// define tool with zod validation
+const kbTool = tool(queryKnowledgeBase, {
+  name: 'queryKnowledgeBase',
+  description: 'Query knowledge base',
+  parameters: z.object({
+    question: z.string({
+      description: 'The user question',
+    }),
+  }),
+});
+```
+
+In this example, `z.object` is used to define a schema for the `parameters` where `question` is expected to be a string. This ensures that any input to the tool adheres to the specified structure, providing a layer of type safety and validation.
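The example above interpolates `question` directly into the URL, so characters such as `&` or `?` in the validated string could still change the query. One way to avoid that, sketched here with the standard `URL` API, is to let `searchParams` do the encoding. `buildKnowledgeBaseUrl` is a hypothetical helper name, and the endpoint is the placeholder from the example, not a real service.

```typescript
// Hypothetical helper: build the knowledge-base URL with encoded parameters.
function buildKnowledgeBaseUrl(question: string, userToken: string): string {
  const url = new URL("https://knowledge-base.com");
  // searchParams.set percent-encodes reserved characters in the values
  url.searchParams.set("token", userToken);
  url.searchParams.set("query", question);
  return url.toString();
}

// The question contains "&" and "?", which must not leak into the query syntax.
const kbUrl = buildKnowledgeBaseUrl("What is R&D?", "abcd1234");
console.log(kbUrl);

// Reading the parameter back returns the original question unchanged.
console.log(new URL(kbUrl).searchParams.get("query"));
```

Inside `queryKnowledgeBase`, the result could be passed straight to `fetch`.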
+
+
+## Built-in tools
+
+You can import built-in tools from the `@llamaindex/tools` package.
+
+```ts
+import { agent } from "llamaindex";
+import { wiki } from "@llamaindex/tools";
+
+const researchAgent = agent({
+  name: "WikiAgent",
+  description: "Gathering information from the internet",
+  systemPrompt: `You are a research agent. Your role is to gather information from the internet using the provided tools.`,
+  tools: [wiki()],
+});
+```
+
 ## Function tool

-Function tools are implemented with the `FunctionTool` class.
+You can still use the `FunctionTool` class to define a tool.
 A `FunctionTool` is constructed from a function with signature
 ```ts
 (input: T, additionalArg?: AdditionalToolArgument) => R
@@ -29,6 +79,8 @@ Note: calling the `bind` method will return a new `FunctionTool` instance, witho

 Example to pass a `userToken` as additional argument:
 ```ts
+import { agent, tool } from "llamaindex";
+import { z } from "zod";
+
 // first arg is LLM input, second is bound arg
 const queryKnowledgeBase = async ({ question }, { userToken }) => {
  const response = await fetch(`https://knowledge-base.com?token=${userToken}&query=${question}`);
@@ -36,7 +88,7 @@ const queryKnowledgeBase = async ({ question }, { userToken }) => {
 };

 // define tool as usual
-const kbTool = FunctionTool.from(queryKnowledgeBase, {
+const kbTool = tool(queryKnowledgeBase, {
  name: 'queryKnowledgeBase',
  description: 'Query knowledge base',
  parameters: z.object({
@@ -48,7 +100,7 @@ const kbTool = FunctionTool.from(queryKnowledgeBase, {

 // create an agent
 const additionalArg = { userToken: 'abcd1234' };
-const kbAgent = new LLMAgent({
+const workflow = agent({
  tools: [kbTool.bind(additionalArg)],
  // llm, systemPrompt etc
 })
@@ -1,5 +1,5 @@
 ---
-title: Agent tutorial
+title: 1. Setup
 ---

 In this guide we'll walk you through the process of building an Agent in JavaScript using the LlamaIndex.TS library, starting from nothing and adding complexity in stages.
@@ -1,5 +1,5 @@
 ---
-title: Create a basic agent
+title: 2. Create a basic agent
 ---

 We want to use `await` so we're going to wrap all of our code in a `main` function, like this:
@@ -25,15 +25,21 @@ npx tsx example.ts
 First we'll need to pull in our dependencies. These are:

 - The OpenAI class to use the OpenAI LLM
- FunctionTool to provide tools to our agent
- OpenAIAgent to create the agent itself
+- `tool` to define tools for our agent
+- `agent` to create a single agent
 - Settings to define some global settings for the library
 - Dotenv to load our API key from the .env file
+- Zod to define the schema for our tool

 ```javascript
-import { FunctionTool, Settings } from "llamaindex";
-import { OpenAI, OpenAIAgent } from "@llamaindex/openai";
 import "dotenv/config";
+import {
+  agent,
+  AgentStream,
+  tool,
+  openai,
+  Settings,
+} from "llamaindex";
 import { z } from "zod";
 ```

@@ -42,25 +48,12 @@ import { z } from "zod";
 We need to tell our OpenAI class where its API key is, and which of OpenAI's models to use. We'll be using `gpt-4o`, which is capable while still being pretty cheap. This is a global setting, so anywhere an LLM is needed will use the same model.

 ```javascript
-Settings.llm = new OpenAI({
+Settings.llm = openai({
  apiKey: process.env.OPENAI_API_KEY,
  model: "gpt-4o",
 });
 ```

-### Turn on logging
-
-We want to see what our agent is up to, so we're going to hook into some events that the library generates and print them out. There are several events possible, but we'll specifically tune in to `llm-tool-call` (when a tool is called) and `llm-tool-result` (when it responds).
-
-```javascript
-Settings.callbackManager.on("llm-tool-call", (event) => {
-  console.log(event.detail);
-});
-Settings.callbackManager.on("llm-tool-result", (event) => {
-  console.log(event.detail);
-});
-```
-
 ### Create a function

 We're going to create a very simple function that adds two numbers together. This will be the tool we ask our agent to use.
@@ -75,7 +68,7 @@ Note that we're passing in an object with two named parameters, `a` and `b`. Thi

 ### Turn the function into a tool for the agent

-This is the most complicated part of creating an agent. We need to define a `FunctionTool`. We have to pass in:
+This is the most complicated part of creating an agent. We need to define a `tool`. We have to pass in:

 - The function itself (`sumNumbers`)
 - A name for the function, which the LLM will use to call it
@@ -84,7 +77,7 @@ This is the most complicated part of creating an agent. We need to define a `Fun
 - You can see [more examples of function schemas](https://cookbook.openai.com/examples/how_to_call_functions_with_chat_models).

 ```javascript
-const tool = FunctionTool.from(sumNumbers, {
+const addTool = tool({
  name: "sumNumbers",
  description: "Use this function to sum two numbers",
  parameters: z.object({
@@ -95,13 +88,14 @@ const tool = FunctionTool.from(sumNumbers, {
      description: "Second number to sum",
    }),
  }),
+  execute: sumNumbers,
 });
 ```

 We then wrap up the tools into an array. We could provide lots of tools this way, but for this example we're just using the one.

 ```javascript
-const tools = [tool];
+const tools = [addTool];
 ```

 ### Create the agent
@@ -109,7 +103,7 @@ const tools = [tool];
 With your LLM already set up and your tools defined, creating an agent is simple:

 ```javascript
-const agent = new OpenAIAgent({ tools });
+const myAgent = agent({ tools });
 ```

 ### Ask the agent a question
@@ -117,61 +111,109 @@ const agent = new OpenAIAgent({ tools });
 We can use the `chat` interface to ask our agent a question, and it will use the tools we've defined to find an answer.

 ```javascript
-let response = await agent.chat({
-  message: "Add 101 and 303",
-});
+const context = myAgent.run("Sum 101 and 303");
+const result = await context;
+console.log(result.data);
+```
+
+You will see the following output:

-console.log(response);
+**_Output_**
+
+```
+{ result: 'The sum of 101 and 303 is 404.' }
+```
+
+To stream the response, you can use the `AgentStream` event which provides chunks of the response as they become available. This allows you to display the response incrementally rather than waiting for the full response:
+
+```javascript
+const context = myAgent.run("Add 101 and 303");
+for await (const event of context) {
+  if (event instanceof AgentStream) {
+    process.stdout.write(event.data.delta);
+  }
+}
+```
+
+**_Streaming Output_**
+
+```
+The sum of 101 and 303 is 404.
+```
+
+### Logging workflow events
+
+To log the workflow events, you can check the event type and log the event data.
+
+```javascript
+const context = myAgent.run("Sum 202 and 404");
+for await (const event of context) {
+  if (event instanceof AgentStream) {
+    // Stream the response
+    process.stdout.write(event.data.delta);
+  } else {
+    // Log other events
+    console.log("\nWorkflow event:", JSON.stringify(event, null, 2));
+  }
+}
 ```

 Let's see what running this looks like using `npx tsx agent.ts`

 **_Output_**

-```javascript
-{
-  toolCall: {
-    id: 'call_ze6A8C3mOUBG4zmXO8Z4CPB5',
-    name: 'sumNumbers',
-    input: { a: 101, b: 303 }
+```
+Workflow event: {
+  "data": {
+    "userInput": "Sum 202 and 404"
  },
-  toolResult: {
-    tool: FunctionTool { _fn: [Function: sumNumbers], _metadata: [Object] },
-    input: { a: 101, b: 303 },
-    output: '404',
-    isError: false
-  }
+  "displayName": "StartEvent"
 }
+
+Workflow event: {
+  "data": {
+    "input": [
+      {
+        "role": "user",
+        "content": "Sum 202 and 404"
+      }
+    ],
+    "currentAgentName": "Agent"
+  },
+  "displayName": "AgentInput"
+}
+
+Workflow event: {
+  "data": {
+    "input": [
+      {
+        "role": "system",
+        "content": "You are a helpful assistant. Use the provided tools to answer questions."
+      },
+      {
+        "role": "user",
+        "content": "Sum 202 and 404"
+      }
+    ],
+    "currentAgentName": "Agent"
+  },
+  "displayName": "AgentSetup"
+}
+
+....
+
 ```

-```javascript
-{
-  response: {
-    raw: {
-      id: 'chatcmpl-9KwauZku3QOvH78MNvxJs81mDvQYK',
-      object: 'chat.completion',
-      created: 1714778824,
-      model: 'gpt-4-turbo-2024-04-09',
-      choices: [Array],
-      usage: [Object],
-      system_fingerprint: 'fp_ea6eb70039'
-    },
-    message: {
-      content: 'The sum of 101 and 303 is 404.',
-      role: 'assistant',
-      options: {}
-    }
-  },
-  sources: [Getter]
-}
-```
+We're seeing several workflow events being logged (the output above is truncated). Among them are:

-We're seeing two pieces of output here. The first is our callback firing when the tool is called. You can see in `toolResult` that the LLM has correctly passed `101` and `303` to our `sumNumbers` function, which adds them up and returns `404`.
+1. `AgentToolCall` - Shows the agent preparing to call our tool with the numbers 202 and 404
+2. `AgentToolCallResult` - Shows the result of calling the tool, which returned "606"
+3. `AgentInput` - Shows the original user input
+4. `AgentOutput` - Shows the agent's response

-The second piece of output is the response from the LLM itself, where the `message.content` key is giving us the answer.
+Great! We've built an agent that can understand requests and use tools to fulfill them. Next you can:

-Great! We've built an agent with tool use! Next you can:
-
- [See the full code](https://github.com/run-llama/ts-agents/blob/main/1_agent/agent.ts)
+- [See the full code](https://github.com/run-llama/LlamaIndexTS/blob/main/examples/agentworkflow/blog-writer.ts)
 - [Switch to a local LLM](3_local_model)
 - Move on to [add Retrieval-Augmented Generation to your agent](4_agentic_rag)
@@ -0,0 +1,49 @@
+---
+title: 3. Using a local model via Ollama
+---
+
+If you're happy using OpenAI, you can skip this section, but many people are interested in using models they run themselves. The easiest way to do this is via the great work of our friends at [Ollama](https://ollama.com/), who provide a simple-to-use client that will download, install and run a [growing range of models](https://ollama.com/library) for you.
+
+### Install Ollama
+
+They provide a one-click installer for Mac, Linux and Windows on their [home page](https://ollama.com/).
+
+### Pick and run a model
+
+Since we're going to be doing agentic work, we'll need a very capable model, but the largest models are hard to run on a laptop. We think `mixtral 8x7b` is a good balance between power and resources, but `llama3` is another great option. You can start Mixtral by running:
+
+```bash
+ollama run mixtral:8x7b
+```
+
+The first time you run it, Ollama will also automatically download and install the model for you.
+
+### Switch the LLM in your code
+
+To get Mixtral 8x7b to work with the code we already wrote in `1_agent`, you need to switch to that model. Replace the assignment to `Settings.llm` with this:
+
+```javascript
+Settings.llm = ollama({
+  model: "mixtral:8x7b",
+});
+```
+
+### Run local agent
+
+You can also create a local agent by importing `agent` from `llamaindex`.
+
+```javascript
+import { agent } from "llamaindex";
+
+const workflow = agent({
+  // `getWeatherTool` stands for a tool you have already defined with `tool`
+  tools: [getWeatherTool],
+});
+
+const workflowContext = workflow.run(
+  "What's the weather like in San Francisco?",
+);
+```
+
+### Next steps
+
+Now you've got a local agent, you can [add Retrieval-Augmented Generation to your agent](4_agentic_rag).
@@ -1,5 +1,5 @@
 ---
-title: Adding Retrieval-Augmented Generation (RAG)
+title: 4. Adding Retrieval-Augmented Generation (RAG)
 ---

 While an agent that can perform math is nifty (LLMs are usually not very good at math), LLM-based applications are always more interesting when they work with large amounts of data. In this case, we're going to use a 200-page PDF of the proposed budget of the city of San Francisco for fiscal years 2023-2024 and 2024-2025. It's a great example because it's extremely wordy and full of tables of figures, which present a challenge for humans and LLMs alike.
@@ -37,7 +37,7 @@ import { Tab, Tabs } from "fumadocs-ui/components/tabs";
 We'll be bringing in `SimpleDirectoryReader`, `HuggingFaceEmbedding`, `VectorStoreIndex`, and `QueryEngineTool`, `OpenAIContextAwareAgent` from LlamaIndex.TS, as well as the dependencies we previously used.

 ```javascript
-import { FunctionTool, QueryEngineTool, Settings, VectorStoreIndex } from "llamaindex";
+import { QueryEngineTool, Settings, VectorStoreIndex } from "llamaindex";
 import { OpenAI, OpenAIAgent } from "@llamaindex/openai";
 import { HuggingFaceEmbedding } from "@llamaindex/huggingface";
 import { SimpleDirectoryReader } from "@llamaindex/readers/directory";
@@ -87,38 +87,13 @@ By default LlamaIndex will retrieve just the 2 most relevant chunks of text. Thi
 retriever.similarityTopK = 10;
 ```

-### Approach 1: Create a Context-Aware Agent
+### Use index.queryTool

-With the retriever ready, you can create a **context-aware agent**.
+`index.queryTool` creates a `QueryEngineTool` that can be used by an agent to query data from the index.

 ```javascript
-const agent = new OpenAIContextAwareAgent({
-  contextRetriever: retriever,
-});
-
-// Example query to the context-aware agent
-let response = await agent.chat({
-  message: `What's the budget of San Francisco in 2023-2024?`,
-});
-
-console.log(response);
-```
-
-**Expected Output:**
-
-```md
-The total budget for the City and County of San Francisco for the fiscal year 2023-2024 is $14.6 billion. This represents a $611.8 million, or 4.4 percent, increase over the previous fiscal year's budget. The budget covers various expenditures across different departments and services, including significant allocations to public works, transportation, commerce, public protection, and health services.
-```
-
-### Approach 2: Using QueryEngineTool (Alternative Approach)
-
-If you prefer more flexibility and don't mind additional complexity, you can create a `QueryEngineTool`. This approach allows you to define the query logic, providing a more tailored way to interact with the data, but note that it introduces a delay due to the extra tool call.
-
-```javascript
-const queryEngine = await index.asQueryEngine({ retriever });
 const tools = [
-  new QueryEngineTool({
-    queryEngine: queryEngine,
+  index.queryTool({
    metadata: {
      name: "san_francisco_budget_tool",
      description: `This tool can answer detailed questions about the individual components of the budget of San Francisco in 2023-2024.`,
@@ -127,11 +102,9 @@ const tools = [
 ];

 // Create an agent using the tools array
-const agent = new OpenAIAgent({ tools });
+const ragAgent = agent({ tools });

-let toolResponse = await agent.chat({
-  message: "What's the budget of San Francisco in 2023-2024?",
-});
+let toolResponse = await ragAgent.run("What's the budget of San Francisco in 2023-2024?");

 console.log(toolResponse);
 ```
@@ -159,10 +132,4 @@ console.log(toolResponse);

 Once again we see a `toolResult`. You can see the query the LLM decided to send to the query engine ("total budget"), and the output the engine returned. In `response.message` you see that the LLM has returned the output from the tool almost verbatim, although it trimmed out the bit about 2024-2025 since we didn't ask about that year.

-### Comparison of Approaches
-
-The `OpenAIContextAwareAgent` approach simplifies the setup by allowing you to directly link the retriever to the agent, making it straightforward to access relevant context for your queries. This is ideal for situations where you want easy integration with existing data sources, like a context chat engine.
-
-On the other hand, using the `QueryEngineTool` offers more flexibility and power. This method allows for customization in how queries are constructed and executed, enabling you to query data from various storages and process them in different ways. However, this added flexibility comes with increased complexity and response time due to the separate tool call and queryEngine generating tool output by LLM that is then passed to the agent.
-
 So now we have an agent that can index complicated documents and answer questions about them. Let's [combine our math agent and our RAG agent](5_rag_and_tools)!
@@ -1,5 +1,5 @@
 ---
-title: A RAG agent that does math
+title: 5. A RAG agent that does math
 ---

 In [our third iteration of the agent](https://github.com/run-llama/ts-agents/blob/main/3_rag_and_tools/agent.ts) we've combined the two previous agents, so we've defined both `sumNumbers` and a `QueryEngineTool` and created an array of two tools. The tools support both Zod and JSON Schema for parameter definition:
@@ -7,14 +7,13 @@ In [our third iteration of the agent](https://github.com/run-llama/ts-agents/blo
 ```javascript
 // define the query engine as a tool
 const tools = [
-  new QueryEngineTool({
-    queryEngine: queryEngine,
+  index.queryTool({
    metadata: {
      name: "san_francisco_budget_tool",
      description: `This tool can answer detailed questions about the individual components of the budget of San Francisco in 2023-2024.`,
    },
  }),
-  FunctionTool.from(sumNumbers, {
+  tool({
    name: "sumNumbers",
    description: "Use this function to sum two numbers",
    parameters: z.object({
@@ -25,14 +24,15 @@ const tools = [
        description: "Second number to sum",
      }),
    }),
+    execute: ({ a, b }) => `${a + b}`,
  }),
 ];
 ```

-You can also use JSON Schema to define the tool parameters as an alternative to Zod.
+You can also use JSON Schema to define the tool parameters as an alternative to Zod. 

 ```javascript
-FunctionTool.from(sumNumbers, {
+tool(sumNumbers, {
  name: "sumNumbers",
  description: "Use this function to sum two numbers",
  parameters: {
@@ -56,22 +56,13 @@ FunctionTool.from(sumNumbers, {
 These tool descriptions are identical to the ones we previously defined. Now let's ask it 3 questions in a row:

 ```javascript
-let response = await agent.chat({
-  message:
-    "What's the budget of San Francisco for community health in 2023-24?",
-});
+let response = await agent.run("What's the budget of San Francisco for community health in 2023-24?");
 console.log(response);

-let response2 = await agent.chat({
-  message:
-    "What's the budget of San Francisco for public protection in 2023-24?",
-});
+let response2 = await agent.run("What's the budget of San Francisco for public protection in 2023-24?");
 console.log(response2);

-let response3 = await agent.chat({
-  message:
-    "What's the combined budget of San Francisco for community health and public protection in 2023-24?",
-});
+let response3 = await agent.run("What's the combined budget of San Francisco for community health and public protection in 2023-24?");
 console.log(response3);
 ```

@@ -1,5 +1,5 @@
 ---
-title: Adding LlamaParse
+title: 6. Adding LlamaParse
 ---

 Complicated PDFs can be very tricky for LLMs to understand. To help with this, LlamaIndex provides LlamaParse, a hosted service that parses complex documents including PDFs. To use it, get a `LLAMA_CLOUD_API_KEY` by [signing up for LlamaCloud](https://cloud.llamaindex.ai/) (it's free for up to 1000 pages/day) and adding it to your `.env` file just as you did for your OpenAI key:
@@ -1,5 +1,5 @@
 ---
-title: Adding persistent vector storage
+title: 7. Adding persistent vector storage
 ---

 In the previous examples, we've been loading our data into memory each time we run the agent. This is fine for small datasets, but for larger datasets you'll want to store your embeddings in a database. LlamaIndex.TS provides a `VectorStore` class that can store your embeddings in a variety of databases. We're going to use [Qdrant](https://qdrant.tech/), a popular vector store, for this example.
Author	SHA1	Message	Date
github-actions[bot]	2cbdf71669	Release 0.9.16 (#1811 ) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>	2025-04-04 11:12:27 +02:00
Huu Le	ead657aedd	feat: add MCP tools integration and example usage (#1819 )	2025-04-04 11:03:10 +02:00
Marcus Schiesser	f5e4d098b0	chore: remove gpt-tokenizer (#1815 )	2025-04-03 14:23:31 +02:00
dependabot[bot]	4d97226e50	chore(deps): bump next from 15.2.3 to 15.2.4 (#1812 ) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-04-03 13:57:39 +07:00
Marcus Schiesser	4999df18cc	chore: bump nextjs to "^15.2.3" (#1810 )	2025-04-02 21:24:57 +07:00
github-actions[bot]	9a27b6d94a	Release 0.9.15 (#1807 ) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>	2025-04-02 17:08:47 +07:00
Thuc Pham	8c02684f0f	fix: handle error when streaming workflow (#1808 )	2025-04-02 16:26:01 +07:00
ANKIT VARSHNEY	9c63f3f94e	feat: openai responses api (#1801 )	2025-04-02 16:21:43 +07:00
Thuc Pham	c515a324f6	feat: return raw output for agent toolcall result (#1806 )	2025-04-01 22:20:06 +07:00
github-actions[bot]	c70d7b9930	Release 0.9.14 (#1799 ) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>	2025-04-01 12:59:10 +02:00
Marcus Schiesser	1b6f368a3f	feat: Support loading from URLs for all readers extending FileReader (#1805 )	2025-04-01 17:39:59 +07:00
Thuc Pham	9d951b288f	feat: support llamacloud in @llamaindex/server (#1796 )	2025-04-01 17:39:39 +07:00
dependabot[bot]	5fe16697a2	chore(deps-dev): bump vite from 5.4.15 to 5.4.16 (#1804 ) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-04-01 16:49:36 +07:00
Marcus Schiesser	189d8a83ac	chore: use node 20 for examples (#1803 )	2025-03-31 17:21:19 +07:00
ANKIT VARSHNEY	648cfb5cb5	feat: supbase vector store (#1790 )	2025-03-29 15:14:28 +07:00
Marcus Schiesser	eaf326ee90	fix: passing right llm setting from SimpleChatEngine to ChatMemoryBuffer (#1798 )	2025-03-28 18:20:52 +07:00
github-actions[bot]	fc1bedf438	Release (#1794 ) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2025-03-28 15:22:44 +07:00
Thuc Pham	164cf7a6df	fix: custom next server start fail (#1795 )	2025-03-28 15:09:57 +07:00
Zhanghao	e98033e2cc	docs: correct the number of indexes (#1793 )	2025-03-27 16:33:52 +02:00
dependabot[bot]	c0ffc7b434	chore(deps-dev): bump vite from 5.4.14 to 5.4.15 (#1787 ) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2025-03-26 20:43:49 +07:00
github-actions[bot]	9cf88e9f3f	Release 0.9.13 (#1783 ) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>	2025-03-26 10:31:29 +02:00
Thuc Pham	75d6e29187	feat: response source nodes in query tool output (#1784 )	2025-03-26 15:24:53 +07:00
Parham Saidi	132517877e	fix: stringify all tool results for anthropic on bedrock (#1786 )	2025-03-25 21:16:44 +07:00
Thuc Pham	299008b34f	feat: copy create-llama to @llamaindex/servers (#1780 )	2025-03-25 11:55:44 +02:00
Thuc Pham	482ed67690	fix: document deployment fail due to static generation timed out (#1779 )	2025-03-24 11:12:09 +02:00
github-actions[bot]	9aeec9089b	Release 0.9.12 (#1744 ) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>	2025-03-24 14:34:45 +07:00
Daniel Bank	f1db9b3d48	feat: vercel tool response fields options (#1765 )	2025-03-24 14:25:26 +07:00
ANKIT VARSHNEY	25093531cf	feat: elastic search vector store (#1777 )	2025-03-24 14:18:42 +07:00
Thuc Pham	f8a86e4eff	feat: @llamaindex/server (#1759 )	2025-03-21 19:14:19 +07:00
ANKIT VARSHNEY	04f8c96caa	feat: add support for mongodb document store (#1771 )	2025-03-21 16:16:49 +07:00
Jorge Luis Middleton	43053f9e16	Update qdrant.mdx documentation (#1770 )	2025-03-21 10:06:25 +02:00
Parham Saidi	93bc0ffd21	fix: context engine additional options not being passed (#1772 )	2025-03-21 01:49:43 +07:00
Huu Le	58a9446220	Fix wrong multi-agent setup (#1767 )	2025-03-20 14:55:25 +03:00
Parham Saidi	da06e4550b	fix: include inline data check for GoogleStudio (#1769 )	2025-03-20 14:50:47 +03:00
Parham Saidi	3fd4cc383e	feat: google multimodal output using their new gen ai library (#1762 )	2025-03-19 21:43:39 +02:00
ANKIT VARSHNEY	2b39ceffa6	docs: doc for structured output (#1761 )	2025-03-19 14:15:11 +07:00
Thuc Pham	77e24cec65	fix: crypto is not defined when running on node18 (#1763 )	2025-03-19 08:38:16 +02:00
ANKIT VARSHNEY	2a0a899d66	chore: added saftey setting as parameter for gemini (#1760 )	2025-03-18 23:05:47 +07:00
ANKIT VARSHNEY	050cd53450	fix: delete by id in pinecone vector store (#1758 )	2025-03-18 23:03:42 +07:00
George He	5189b446f4	fix: add retry handling logic to parser reader and fix lint issues (#1757 ) Co-authored-by: Alex Yang <himself65@outlook.com>	2025-03-17 14:50:38 -07:00
Thuc Pham	c7ff3233fe	feat: @llamaindex/tools (#1755 )	2025-03-17 21:20:24 +07:00
Jack Qian	21bebfcaa6	docs: add missing links (#1754 )	2025-03-16 13:32:19 +07:00
ANKIT VARSHNEY	91a18e7057	feat: add support for structured output with zod schema. (#1749 )	2025-03-16 11:58:28 +08:00
ANKIT VARSHNEY	d1c1f99e06	feat: function calling support in mistral provider (#1756 )	2025-03-15 12:18:36 +07:00
Marcus Schiesser	8be84aeb5e	chore: add gemini streaming example (#782 )	2025-03-14 09:25:09 +07:00
Thuc Pham	5b7b314b25	fix: unhandled server error when accessing not found pages (#1753 )	2025-03-13 18:31:55 +07:00
Marcus Schiesser	4a51c9b48e	docs: add link checker and fix links (#1750 )	2025-03-13 15:46:05 +07:00
Alex Yang	bf56fc08ad	chore(cloud): update openapi.json (#1746 )	2025-03-12 15:19:40 +00:00
Marcus Schiesser	fa40b36516	docs: cleanup (#1745 )	2025-03-12 17:54:20 +07:00
yangqiao	da8068e9e0	fix: Add fromConnectionString method for Azure vCore storages (#1743 ) Co-authored-by: yangqiao <yangqiao@microsoft.com>	2025-03-12 15:58:30 +07:00
github-actions[bot]	37dcf37625	Release 0.9.11 (#1734 ) Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>	2025-03-12 12:44:55 +07:00
Stefan Edberg	a8c0637d11	feat: make it possible to provide base URL to OpenAI (#1740 ) Co-authored-by: Marcus Schiesser <mail@marcusschiesser.de>	2025-03-12 12:17:38 +07:00
Marcus Schiesser	387a19284d	fix: Update mistral package for mistral API 1.5.1 (#1741 )	2025-03-12 11:23:42 +07:00
ANKIT VARSHNEY	a654f580cf	docs: add doc for perplexity (#1738 )	2025-03-12 11:17:23 +07:00
Thuc Pham	68ea7ec6a5	chore: use agent workflow for examples (#1726 )	2025-03-11 19:05:00 +07:00
Marcus Schiesser	2d11ffbaea	docs: update contrib (#1736 )	2025-03-11 18:06:06 +07:00
ANKIT VARSHNEY	1587e48a14	Feat/perplexity (#1719 )	2025-03-11 17:21:57 +07:00
Marcus Schiesser	bd239aaf2d	docs: update main agent docs (#1735 )	2025-03-11 17:18:38 +07:00
Jack Qian	98eebf7277	feat: add request options for gemini (#1733 )	2025-03-11 15:56:10 +07:00