changeset

remove unused turbo gen for ip sec vuln
fix fastapi security vuln
2026-07-04 03:40:26 -04:00 · 2024-02-11 04:16:02 +08:00 · 2024-02-11 04:12:58 +08:00 · 2024-02-11 03:59:59 +08:00 · 2024-02-10 13:54:13 -03:00 · 2024-02-10 12:07:14 -03:00
178 changed files with 5196 additions and 1243 deletions
@@ -0,0 +1,5 @@
+---
+"llamaindex": patch
+---
+
+feat: add filtering of metadata to PGVectorStore
@@ -1,5 +0,0 @@
---
-"llamaindex": patch
---
-
-easier prompt customization for SimpleResponseBuilder
@@ -0,0 +1,5 @@
+---
+"llamaindex": patch
+---
+
+feat(reranker): cohere reranker
@@ -0,0 +1,5 @@
+---
+"llamaindex": patch
+---
+
+feat: use batching in vector store index
@@ -0,0 +1,5 @@
+---
+"create-llama": patch
+---
+
+update fastapi for CVE-2024-24762
@@ -1,5 +0,0 @@
---
-"llamaindex": patch
---
-
-fix(cyclic): remove cyclic structures from transform hash
@@ -1,5 +0,0 @@
---
-"llamaindex": patch
---
-
-chore: improve extractors prompt
@@ -0,0 +1,5 @@
+---
+"llamaindex": patch
+---
+
+Add reader for LlamaParse
@@ -4,6 +4,6 @@
    "ghcr.io/devcontainers/features/node:1": {},
    "ghcr.io/devcontainers-contrib/features/turborepo-npm:1": {},
    "ghcr.io/devcontainers-contrib/features/typescript:2": {},
-    "ghcr.io/devcontainers-contrib/features/pnpm:2": {},
-  },
+    "ghcr.io/devcontainers-contrib/features/pnpm:2": {}
+  }
 }
@@ -0,0 +1,2 @@
+examples/readers/data/** binary
+examples/data/** binary
@@ -38,6 +38,12 @@ jobs:
      - name: Run Circular Dependency Check
        run: pnpm run circular-check
        working-directory: ./packages/core
+      - uses: actions/upload-artifact@v3
+        if: failure()
+        with:
+          name: typecheck-build-dist
+          path: ./packages/core/dist
+          if-no-files-found: error
  typecheck-examples:
    runs-on: ubuntu-latest

@@ -78,3 +78,15 @@ pnpm start
 That should start a webserver which will serve the docs on https://localhost:3000

 Any changes you make should be reflected in the browser. If you need to regenerate the API docs and find that your TSDoc isn't getting the updates, feel free to remove apps/docs/api. It will automatically regenerate itself when you run pnpm start again.
+
+## Publishing
+
+To publish a new version of the library, run
+
+```shell
+pnpm new-llamaindex
+pnpm new-create-llama
+pnpm release
+git push # push to the main branch
+git push --tags
+```
@@ -70,7 +70,7 @@ main();
 Then you can run it using

 ```bash
-pnpx ts-node example.ts
+pnpm dlx ts-node example.ts
 ```

 ## Playground
@@ -0,0 +1,85 @@
+# Agents
+
+A built-in agent that can make decisions and reason based on the tools provided to it.
+
+## OpenAI Agent
+
+```ts
+import { FunctionTool, OpenAIAgent } from "llamaindex";
+
+// Define a function to sum two numbers
+function sumNumbers({ a, b }: { a: number; b: number }): number {
+  return a + b;
+}
+
+// Define a function to divide two numbers
+function divideNumbers({ a, b }: { a: number; b: number }): number {
+  return a / b;
+}
+
+// Define the parameters of the sum function as a JSON schema
+const sumJSON = {
+  type: "object",
+  properties: {
+    a: {
+      type: "number",
+      description: "The first number",
+    },
+    b: {
+      type: "number",
+      description: "The second number",
+    },
+  },
+  required: ["a", "b"],
+};
+
+// Define the parameters of the divide function as a JSON schema
+const divideJSON = {
+  type: "object",
+  properties: {
+    a: {
+      type: "number",
+      description: "The dividend to divide",
+    },
+    b: {
+      type: "number",
+      description: "The divisor to divide by",
+    },
+  },
+  required: ["a", "b"],
+};
+
+async function main() {
+  // Create a function tool from the sum function
+  const sumFunctionTool = new FunctionTool(sumNumbers, {
+    name: "sumNumbers",
+    description: "Use this function to sum two numbers",
+    parameters: sumJSON,
+  });
+
+  // Create a function tool from the divide function
+  const divideFunctionTool = new FunctionTool(divideNumbers, {
+    name: "divideNumbers",
+    description: "Use this function to divide two numbers",
+    parameters: divideJSON,
+  });
+
+  // Create an OpenAIAgent with the function tools
+  const agent = new OpenAIAgent({
+    tools: [sumFunctionTool, divideFunctionTool],
+    verbose: true,
+  });
+
+  // Chat with the agent
+  const response = await agent.chat({
+    message: "How much is 5 + 5? then divide by 2",
+  });
+
+  // Print the response
+  console.log(String(response));
+}
+
+main().then(() => {
+  console.log("Done");
+});
+```
@@ -35,7 +35,7 @@ LlamaIndex.TS help you prepare the knowledge base with a suite of data connector
 [**Data Loaders**](../modules/data_loader.md):
 A data connector (i.e. `Reader`) ingest data from different data sources and data formats into a simple `Document` representation (text and simple metadata).

-[**Documents / Nodes**](../modules/documents_and_nodes.md): A `Document` is a generic container around any data source - for instance, a PDF, an API output, or retrieved data from a database. A `Node` is the atomic unit of data in LlamaIndex and represents a "chunk" of a source `Document`. It's a rich representation that includes metadata and relationships (to other nodes) to enable accurate and expressive retrieval operations.
+[**Documents / Nodes**](../modules/documents_and_nodes/index.md): A `Document` is a generic container around any data source - for instance, a PDF, an API output, or retrieved data from a database. A `Node` is the atomic unit of data in LlamaIndex and represents a "chunk" of a source `Document`. It's a rich representation that includes metadata and relationships (to other nodes) to enable accurate and expressive retrieval operations.

 [**Data Indexes**](../modules/data_index.md):
 Once you've ingested your data, LlamaIndex helps you index data into a format that's easy to retrieve.
@@ -69,7 +69,7 @@ A response synthesizer generates a response from an LLM, using a user query and

 #### Pipelines

-[**Query Engines**](../modules/query_engine.md):
+[**Query Engines**](../modules/query_engines):
 A query engine is an end-to-end pipeline that allow you to ask question over your data.
 It takes in a natural language query, and returns a response, along with reference context retrieved and passed to the LLM.

@@ -58,6 +58,6 @@ Our examples use OpenAI by default. You'll need to set up your Open AI key like
 export OPENAI_API_KEY="sk-......" # Replace with your key from https://platform.openai.com/account/api-keys
 ```

-If you want to have it automatically loaded every time, add it to your .zshrc/.bashrc.
+If you want to have it automatically loaded every time, add it to your `.zshrc/.bashrc`.

 WARNING: do not check in your OpenAI key into version control.
@@ -36,9 +36,9 @@ async function main() {

  // Query the index
  const queryEngine = index.asQueryEngine();
-  const response = await queryEngine.query(
-    "What did the author do in college?",
-  );
+  const response = await queryEngine.query({
+    query: "What did the author do in college?",
+  });

  // Output response
  console.log(response.toString());
@@ -37,9 +37,9 @@ For more complex applications, our lower-level APIs allow advanced users to cust

 `npm install llamaindex`

-Our documentation includes [Installation Instructions](./installation.mdx) and a [Starter Tutorial](./starter.md) to build your first application.
+Our documentation includes [Installation Instructions](./getting_started/installation.mdx) and a [Starter Tutorial](./getting_started/starter.md) to build your first application.

-Once you're up and running, [High-Level Concepts](./getting_started/concepts.md) has an overview of LlamaIndex's modular architecture. For more hands-on practical examples, look through our [End-to-End Tutorials](./end_to_end.md).
+Once you're up and running, [High-Level Concepts](./getting_started/concepts.md) has an overview of LlamaIndex's modular architecture. For more hands-on practical examples, look through our Examples section on the sidebar.

 ## 🗺️ Ecosystem

@@ -0,0 +1 @@
+label: "Agents"
@@ -0,0 +1,14 @@
+# Agents
+
+An “agent” is an automated reasoning and decision engine. It takes in a user input/query and can make internal decisions for executing that query in order to return the correct result. The key agent components can include, but are not limited to:
+
+- Breaking down a complex question into smaller ones
+- Choosing an external Tool to use + coming up with parameters for calling the Tool
+- Planning out a set of tasks
+- Storing previously completed tasks in a memory module
+
+## Getting Started
+
+LlamaIndex.TS comes with a few built-in agents, but you can also create your own. The built-in agents include:
+
+- [OpenAI Agent](./openai.mdx)
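The agent components listed on the Agents page above (choosing a tool, supplying its parameters, storing completed tasks) can be sketched without any library. This is a minimal illustration only: `Tool`, `ToolCall`, `dispatch`, and `runPlan` are hypothetical names, not LlamaIndex.TS APIs, and the fixed plan stands in for decisions an LLM would normally make.

```ts
// Hypothetical, library-free sketch of an agent's tool-dispatch step.
type ToolArgs = { a: number; b: number };

interface Tool {
  name: string;
  description: string;
  call: (args: ToolArgs) => number;
}

interface ToolCall {
  name: string;
  args: ToolArgs;
}

const tools: Tool[] = [
  { name: "sumNumbers", description: "Sum two numbers", call: ({ a, b }) => a + b },
  { name: "divideNumbers", description: "Divide two numbers", call: ({ a, b }) => a / b },
];

// Look up the requested tool by name and run it with the given arguments.
function dispatch(toolCall: ToolCall, available: Tool[]): number {
  const tool = available.find((t) => t.name === toolCall.name);
  if (!tool) {
    throw new Error(`Unknown tool: ${toolCall.name}`);
  }
  return tool.call(toolCall.args);
}

// A fixed two-step plan; a real agent would ask the LLM for each step.
function runPlan(): number {
  const memory: number[] = []; // results of previously completed tasks
  const sum = dispatch({ name: "sumNumbers", args: { a: 5, b: 5 } }, tools);
  memory.push(sum);
  const result = dispatch({ name: "divideNumbers", args: { a: sum, b: 2 } }, tools);
  memory.push(result);
  return result;
}
```

Keeping tools in a name-indexed collection means a new tool can be added without touching the dispatch logic.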
@@ -0,0 +1,183 @@
+# OpenAI Agent
+
+With an OpenAI API that supports function calling, it’s never been easier to build your own agent!
+
+In this tutorial, we showcase how to write your own OpenAI agent.
+
+## Setup
+
+First, you need to install the `llamaindex` package. You can do this by running the following command in your terminal:
+
+```bash
+pnpm i llamaindex
+```
+
+Then we can define a function to sum two numbers and another function to divide two numbers.
+
+```ts
+function sumNumbers({ a, b }: { a: number; b: number }): number {
+  return a + b;
+}
+
+// Define a function to divide two numbers
+function divideNumbers({ a, b }: { a: number; b: number }): number {
+  return a / b;
+}
+```
+
+## Create a function tool
+
+Now we can create a function tool from the sum function and another function tool from the divide function.
+
+For the parameters of the sum function, we can define a JSON schema.
+
+### JSON Schema
+
+```ts
+const sumJSON = {
+  type: "object",
+  properties: {
+    a: {
+      type: "number",
+      description: "The first number",
+    },
+    b: {
+      type: "number",
+      description: "The second number",
+    },
+  },
+  required: ["a", "b"],
+};
+
+const divideJSON = {
+  type: "object",
+  properties: {
+    a: {
+      type: "number",
+      description: "The dividend a to divide",
+    },
+    b: {
+      type: "number",
+      description: "The divisor b to divide by",
+    },
+  },
+  required: ["a", "b"],
+};
+
+const sumFunctionTool = new FunctionTool(sumNumbers, {
+  name: "sumNumbers",
+  description: "Use this function to sum two numbers",
+  parameters: sumJSON,
+});
+
+const divideFunctionTool = new FunctionTool(divideNumbers, {
+  name: "divideNumbers",
+  description: "Use this function to divide two numbers",
+  parameters: divideJSON,
+});
+```
+
+## Create an OpenAIAgent
+
+Now we can create an OpenAIAgent with the function tools.
+
+```ts
+const worker = new OpenAIAgent({
+  tools: [sumFunctionTool, divideFunctionTool],
+  verbose: true,
+});
+```
+
+## Chat with the agent
+
+Now we can chat with the agent.
+
+```ts
+const response = await worker.chat({
+  message: "How much is 5 + 5? then divide by 2",
+});
+
+console.log(String(response));
+```
+
+## Full code
+
+```ts
+import { FunctionTool, OpenAIAgent } from "llamaindex";
+
+// Define a function to sum two numbers
+function sumNumbers({ a, b }: { a: number; b: number }): number {
+  return a + b;
+}
+
+// Define a function to divide two numbers
+function divideNumbers({ a, b }: { a: number; b: number }): number {
+  return a / b;
+}
+
+// Define the parameters of the sum function as a JSON schema
+const sumJSON = {
+  type: "object",
+  properties: {
+    a: {
+      type: "number",
+      description: "The first number",
+    },
+    b: {
+      type: "number",
+      description: "The second number",
+    },
+  },
+  required: ["a", "b"],
+};
+
+// Define the parameters of the divide function as a JSON schema
+const divideJSON = {
+  type: "object",
+  properties: {
+    a: {
+      type: "number",
+      description: "The dividend a to divide",
+    },
+    b: {
+      type: "number",
+      description: "The divisor b to divide by",
+    },
+  },
+  required: ["a", "b"],
+};
+
+async function main() {
+  // Create a function tool from the sum function
+  const sumFunctionTool = new FunctionTool(sumNumbers, {
+    name: "sumNumbers",
+    description: "Use this function to sum two numbers",
+    parameters: sumJSON,
+  });
+
+  // Create a function tool from the divide function
+  const divideFunctionTool = new FunctionTool(divideNumbers, {
+    name: "divideNumbers",
+    description: "Use this function to divide two numbers",
+    parameters: divideJSON,
+  });
+
+  // Create an OpenAIAgent with the function tools
+  const agent = new OpenAIAgent({
+    tools: [sumFunctionTool, divideFunctionTool],
+    verbose: true,
+  });
+
+  // Chat with the agent
+  const response = await agent.chat({
+    message: "How much is 5 + 5? then divide by 2",
+  });
+
+  // Print the response
+  console.log(String(response));
+}
+
+main().then(() => {
+  console.log("Done");
+});
+```
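The JSON schemas in the OpenAI Agent page above describe which arguments a tool expects. As a rough sketch of why that matters, arguments proposed by a model could be checked against such a schema before the function is called. `NumberParamsSchema` and `validateArgs` are hypothetical helpers written for this illustration; nothing here is claimed to be how LlamaIndex.TS or OpenAI validate arguments.

```ts
// Hypothetical sketch: check tool arguments against a simple JSON schema
// of the shape used above (an object with number-typed properties).
interface NumberParamsSchema {
  type: "object";
  properties: Record<string, { type: "number"; description: string }>;
  required: string[];
}

const divideJSON: NumberParamsSchema = {
  type: "object",
  properties: {
    a: { type: "number", description: "The dividend a to divide" },
    b: { type: "number", description: "The divisor b to divide by" },
  },
  required: ["a", "b"],
};

const hasOwn = (obj: object, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

// Return a list of problems; an empty list means the arguments are acceptable.
function validateArgs(
  schema: NumberParamsSchema,
  args: Record<string, unknown>,
): string[] {
  const errors: string[] = [];
  for (const key of schema.required) {
    if (!hasOwn(args, key)) {
      errors.push(`missing required parameter "${key}"`);
    }
  }
  for (const [key, value] of Object.entries(args)) {
    if (!hasOwn(schema.properties, key)) {
      errors.push(`unexpected parameter "${key}"`);
    } else if (typeof value !== "number" || Number.isNaN(value)) {
      errors.push(`parameter "${key}" must be a number`);
    }
  }
  return errors;
}
```

Returning a list of messages rather than throwing on the first problem makes it possible to report every issue back to the model in one turn.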
@@ -0,0 +1,128 @@
+# OpenAI Agent + QueryEngineTool
+
+QueryEngineTool is a tool that allows you to query a vector index. In this example, we will create a vector index from a set of documents and then create a QueryEngineTool from the vector index. We will then create an OpenAIAgent with the QueryEngineTool and chat with the agent.
+
+## Setup
+
+First, you need to install the `llamaindex` package. You can do this by running the following command in your terminal:
+
+```bash
+pnpm i llamaindex
+```
+
+Then you can import the necessary classes and functions.
+
+```ts
+import {
+  OpenAIAgent,
+  SimpleDirectoryReader,
+  VectorStoreIndex,
+  QueryEngineTool,
+} from "llamaindex";
+```
+
+## Create a vector index
+
+Now we can create a vector index from a set of documents.
+
+```ts
+// Load the documents
+const documents = await new SimpleDirectoryReader().loadData({
+  directoryPath: "node_modules/llamaindex/examples/",
+});
+
+// Create a vector index from the documents
+const vectorIndex = await VectorStoreIndex.fromDocuments(documents);
+```
+
+## Create a QueryEngineTool
+
+Now we can create a QueryEngineTool from the vector index.
+
+```ts
+// Create a query engine from the vector index
+const abramovQueryEngine = vectorIndex.asQueryEngine();
+
+// Create a QueryEngineTool with the query engine
+const queryEngineTool = new QueryEngineTool({
+  queryEngine: abramovQueryEngine,
+  metadata: {
+    name: "abramov_query_engine",
+    description: "A query engine for the Abramov documents",
+  },
+});
+```
+
+## Create an OpenAIAgent
+
+```ts
+// Create an OpenAIAgent with the query engine tool
+
+const agent = new OpenAIAgent({
+  tools: [queryEngineTool],
+  verbose: true,
+});
+```
+
+## Chat with the agent
+
+Now we can chat with the agent.
+
+```ts
+const response = await agent.chat({
+  message: "What was his salary?",
+});
+
+console.log(String(response));
+```
+
+## Full code
+
+```ts
+import {
+  OpenAIAgent,
+  SimpleDirectoryReader,
+  VectorStoreIndex,
+  QueryEngineTool,
+} from "llamaindex";
+
+async function main() {
+  // Load the documents
+  const documents = await new SimpleDirectoryReader().loadData({
+    directoryPath: "node_modules/llamaindex/examples/",
+  });
+
+  // Create a vector index from the documents
+  const vectorIndex = await VectorStoreIndex.fromDocuments(documents);
+
+  // Create a query engine from the vector index
+  const abramovQueryEngine = vectorIndex.asQueryEngine();
+
+  // Create a QueryEngineTool with the query engine
+  const queryEngineTool = new QueryEngineTool({
+    queryEngine: abramovQueryEngine,
+    metadata: {
+      name: "abramov_query_engine",
+      description: "A query engine for the Abramov documents",
+    },
+  });
+
+  // Create an OpenAIAgent with the query engine tool
+  const agent = new OpenAIAgent({
+    tools: [queryEngineTool],
+    verbose: true,
+  });
+
+  // Chat with the agent
+  const response = await agent.chat({
+    message: "What was his salary?",
+  });
+
+  // Print the response
+  console.log(String(response));
+}
+
+main().then(() => {
+  console.log("Done");
+});
+```
@@ -1,17 +0,0 @@
---
-sidebar_position: 3
---
-
-# Reader / Loader
-
-LlamaIndex.TS supports easy loading of files from folders using the `SimpleDirectoryReader` class. Currently, `.txt`, `.pdf`, `.csv`, `.md` and `.docx` files are supported, with more planned in the future!
-
-```typescript
-import { SimpleDirectoryReader } from "llamaindex";
-
-documents = new SimpleDirectoryReader().loadData("./data");
-```
-
-## API Reference
-
- [SimpleDirectoryReader](../api/classes/SimpleDirectoryReader.md)
@@ -0,0 +1,35 @@
+---
+sidebar_position: 4
+---
+
+import CodeBlock from "@theme/CodeBlock";
+import CodeSource from "!raw-loader!../../../../examples/readers/src/simple-directory-reader";
+import CodeSource2 from "!raw-loader!../../../../examples/readers/src/custom-simple-directory-reader";
+
+# Loader
+
+Before you can start indexing your documents, you need to load them into memory.
+
+### SimpleDirectoryReader
+
+[![Open in StackBlitz](https://developer.stackblitz.com/img/open_in_stackblitz.svg)](https://stackblitz.com/github/run-llama/LlamaIndexTS/tree/main/examples/readers?file=src/simple-directory-reader.ts&title=Simple%20Directory%20Reader)
+
+LlamaIndex.TS supports easy loading of files from folders using the `SimpleDirectoryReader` class.
+
+It is a simple reader that reads all files from a directory and its subdirectories.
+
+<CodeBlock language="ts">{CodeSource}</CodeBlock>
+
+Currently, it supports reading `.csv`, `.docx`, `.html`, `.md` and `.pdf` files,
+but support for other file types is planned.
+
+You can also provide a `defaultReader` as a fallback for files with unsupported extensions,
+or pass additional readers via `fileExtToReader` to support more file types.
+
+<CodeBlock language="ts" showLineNumbers metastring="{8-12,17-21}">
+  {CodeSource2}
+</CodeBlock>
+
+## API Reference
+
+- [SimpleDirectoryReader](../api/classes/SimpleDirectoryReader.md)
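The Loader page above describes two extension points: a mapping from file extension to reader and a fallback reader for unsupported extensions. The lookup logic behind that idea can be sketched as follows. This is an assumption-laden illustration: `Reader`, `readersByExt`, and `pickReader` are hypothetical names, and the real `SimpleDirectoryReader` may resolve readers differently.

```ts
// Hypothetical sketch: choose a reader by file extension, with a fallback.
type Reader = (content: string) => string;

const textReader: Reader = (content) => content;

// Toy CSV reader: renders each row with " | " between cells.
const csvReader: Reader = (content) =>
  content
    .split("\n")
    .map((row) => row.split(",").join(" | "))
    .join("\n");

// Stand-in for a fileExtToReader-style mapping.
const readersByExt = new Map<string, Reader>([
  ["csv", csvReader],
  ["md", textReader],
]);

// Return the reader registered for the file's extension, else the fallback.
function pickReader(fileName: string, defaultReader?: Reader): Reader | undefined {
  const dot = fileName.lastIndexOf(".");
  const ext = dot === -1 ? "" : fileName.slice(dot + 1).toLowerCase();
  return readersByExt.get(ext) ?? defaultReader;
}
```

With no fallback supplied, `pickReader` returns `undefined` for an unknown extension, so the caller decides whether to skip the file or fail.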
@@ -1,5 +1,5 @@
 ---
-sidebar_position: 3
+sidebar_position: 4
 ---

 # Embedding
@@ -0,0 +1,2 @@
+label: "LLMs"
+position: 3
@@ -0,0 +1 @@
+label: "Available LLMs"
@@ -0,0 +1,80 @@
+# Anthropic
+
+## Usage
+
+```ts
+import { Anthropic, serviceContextFromDefaults } from "llamaindex";
+
+const anthropicLLM = new Anthropic({
+  apiKey: "<YOUR_API_KEY>",
+});
+
+const serviceContext = serviceContextFromDefaults({ llm: anthropicLLM });
+```
+
+## Load and index documents
+
+For this example, we will use a single document. In a real-world scenario, you would have multiple documents to index.
+
+```ts
+const document = new Document({ text: essay, id_: "essay" });
+
+const index = await VectorStoreIndex.fromDocuments([document], {
+  serviceContext,
+});
+```
+
+## Query
+
+```ts
+const queryEngine = index.asQueryEngine();
+
+const query = "What is the meaning of life?";
+
+const results = await queryEngine.query({
+  query,
+});
+```
+
+## Full Example
+
+```ts
+import {
+  Anthropic,
+  Document,
+  VectorStoreIndex,
+  serviceContextFromDefaults,
+} from "llamaindex";
+
+async function main() {
+  // Create an instance of the Anthropic LLM
+  const anthropicLLM = new Anthropic({
+    apiKey: "<YOUR_API_KEY>",
+  });
+
+  // Create a service context
+  const serviceContext = serviceContextFromDefaults({ llm: anthropicLLM });
+
+  const document = new Document({ text: essay, id_: "essay" });
+
+  // Load and index documents
+  const index = await VectorStoreIndex.fromDocuments([document], {
+    serviceContext,
+  });
+
+  // Create a retriever, then a query engine that uses it
+  const retriever = index.asRetriever();
+
+  const queryEngine = index.asQueryEngine({ retriever });
+
+  const query = "What is the meaning of life?";
+
+  // Query
+  const response = await queryEngine.query({
+    query,
+  });
+
+  // Log the response
+  console.log(response.response);
+}
+```
@@ -0,0 +1,88 @@
+# Azure OpenAI
+
+To use Azure OpenAI, you only need to set a few environment variables together with the `OpenAI` class.
+
+For example:
+
+## Environment Variables
+
+```bash
+export AZURE_OPENAI_KEY="<YOUR KEY HERE>"
+export AZURE_OPENAI_ENDPOINT="<YOUR ENDPOINT, see https://learn.microsoft.com/en-us/azure/ai-services/openai/quickstart?tabs=command-line%2Cpython&pivots=rest-api>"
+export AZURE_OPENAI_DEPLOYMENT="gpt-4" # or some other deployment name
+```
+
+## Usage
+
+```ts
+import { OpenAI, serviceContextFromDefaults } from "llamaindex";
+
+const azureOpenaiLLM = new OpenAI({ model: "gpt-4", temperature: 0 });
+
+const serviceContext = serviceContextFromDefaults({ llm: azureOpenaiLLM });
+```
+
+## Load and index documents
+
+For this example, we will use a single document. In a real-world scenario, you would have multiple documents to index.
+
+```ts
+const document = new Document({ text: essay, id_: "essay" });
+
+const index = await VectorStoreIndex.fromDocuments([document], {
+  serviceContext,
+});
+```
+
+## Query
+
+```ts
+const queryEngine = index.asQueryEngine();
+
+const query = "What is the meaning of life?";
+
+const results = await queryEngine.query({
+  query,
+});
+```
+
+## Full Example
+
+```ts
+import {
+  OpenAI,
+  Document,
+  VectorStoreIndex,
+  serviceContextFromDefaults,
+} from "llamaindex";
+
+async function main() {
+  // Create an instance of the LLM
+  const azureOpenaiLLM = new OpenAI({ model: "gpt-4", temperature: 0 });
+
+  // Create a service context
+  const serviceContext = serviceContextFromDefaults({ llm: azureOpenaiLLM });
+
+  const document = new Document({ text: essay, id_: "essay" });
+
+  // Load and index documents
+  const index = await VectorStoreIndex.fromDocuments([document], {
+    serviceContext,
+  });
+
+  // Create a retriever, then a query engine that uses it
+  const retriever = index.asRetriever();
+
+  const queryEngine = index.asQueryEngine({ retriever });
+
+  const query = "What is the meaning of life?";
+
+  // Query
+  const response = await queryEngine.query({
+    query,
+  });
+
+  // Log the response
+  console.log(response.response);
+}
+```
@@ -0,0 +1,97 @@
+# Llama2
+
+## Usage
+
+```ts
+import { DeuceChatStrategy, LlamaDeuce, serviceContextFromDefaults } from "llamaindex";
+
+const llama2LLM = new LlamaDeuce({ chatStrategy: DeuceChatStrategy.META });
+
+const serviceContext = serviceContextFromDefaults({ llm: llama2LLM });
+```
+
+## Usage with Replicate
+
+```ts
+import {
+  DeuceChatStrategy,
+  LlamaDeuce,
+  ReplicateSession,
+  serviceContextFromDefaults,
+} from "llamaindex";
+
+const replicateSession = new ReplicateSession({
+  replicateKey,
+});
+
+const llama2LLM = new LlamaDeuce({
+  chatStrategy: DeuceChatStrategy.META,
+  replicateSession,
+});
+
+const serviceContext = serviceContextFromDefaults({ llm: llama2LLM });
+```
+
+## Load and index documents
+
+For this example, we will use a single document. In a real-world scenario, you would have multiple documents to index.
+
+```ts
+const document = new Document({ text: essay, id_: "essay" });
+
+const index = await VectorStoreIndex.fromDocuments([document], {
+  serviceContext,
+});
+```
+
+## Query
+
+```ts
+const queryEngine = index.asQueryEngine();
+
+const query = "What is the meaning of life?";
+
+const results = await queryEngine.query({
+  query,
+});
+```
+
+## Full Example
+
+```ts
+import {
+  DeuceChatStrategy,
+  LlamaDeuce,
+  Document,
+  VectorStoreIndex,
+  serviceContextFromDefaults,
+} from "llamaindex";
+
+async function main() {
+  // Create an instance of the LLM
+  const llama2LLM = new LlamaDeuce({ chatStrategy: DeuceChatStrategy.META });
+
+  // Create a service context
+  const serviceContext = serviceContextFromDefaults({ llm: llama2LLM });
+
+  const document = new Document({ text: essay, id_: "essay" });
+
+  // Load and index documents
+  const index = await VectorStoreIndex.fromDocuments([document], {
+    serviceContext,
+  });
+
+  // Create a retriever, then a query engine that uses it
+  const retriever = index.asRetriever();
+
+  const queryEngine = index.asQueryEngine({ retriever });
+
+  const query = "What is the meaning of life?";
+
+  // Query
+  const response = await queryEngine.query({
+    query,
+  });
+
+  // Log the response
+  console.log(response.response);
+}
+```
@@ -0,0 +1,79 @@
+# Mistral
+
+## Usage
+
+```ts
+import { MistralAI, serviceContextFromDefaults } from "llamaindex";
+
+const mistralLLM = new MistralAI({
+  model: "mistral-tiny",
+  apiKey: "<YOUR_API_KEY>",
+});
+
+const serviceContext = serviceContextFromDefaults({ llm: mistralLLM });
+```
+
+## Load and index documents
+
+For this example, we will use a single document. In a real-world scenario, you would have multiple documents to index.
+
+```ts
+const document = new Document({ text: essay, id_: "essay" });
+
+const index = await VectorStoreIndex.fromDocuments([document], {
+  serviceContext,
+});
+```
+
+## Query
+
+```ts
+const queryEngine = index.asQueryEngine();
+
+const query = "What is the meaning of life?";
+
+const results = await queryEngine.query({
+  query,
+});
+```
+
+## Full Example
+
+```ts
+import {
+  MistralAI,
+  Document,
+  VectorStoreIndex,
+  serviceContextFromDefaults,
+} from "llamaindex";
+
+async function main() {
+  // Create an instance of the LLM
+  const mistralLLM = new MistralAI({ model: "mistral-tiny" });
+
+  // Create a service context
+  const serviceContext = serviceContextFromDefaults({ llm: mistralLLM });
+
+  const document = new Document({ text: essay, id_: "essay" });
+
+  // Load and index documents
+  const index = await VectorStoreIndex.fromDocuments([document], {
+    serviceContext,
+  });
+
+  // Create a retriever, then a query engine that uses it
+  const retriever = index.asRetriever();
+
+  const queryEngine = index.asQueryEngine({ retriever });
+
+  const query = "What is the meaning of life?";
+
+  // Query
+  const response = await queryEngine.query({
+    query,
+  });
+
+  // Log the response
+  console.log(response.response);
+}
+```
@@ -0,0 +1,76 @@
+# Ollama
+
+## Usage
+
+```ts
+import { Ollama, serviceContextFromDefaults } from "llamaindex";
+
+const ollamaLLM = new Ollama({ model: "llama2", temperature: 0.75 });
+
+const serviceContext = serviceContextFromDefaults({ llm: ollamaLLM });
+```
+
+## Load and index documents
+
+For this example, we will use a single document. In a real-world scenario, you would have multiple documents to index.
+
+```ts
+const document = new Document({ text: essay, id_: "essay" });
+
+const index = await VectorStoreIndex.fromDocuments([document], {
+  serviceContext,
+});
+```
+
+## Query
+
+```ts
+const queryEngine = index.asQueryEngine();
+
+const query = "What is the meaning of life?";
+
+const results = await queryEngine.query({
+  query,
+});
+```
+
+## Full Example
+
+```ts
+import {
+  Ollama,
+  Document,
+  VectorStoreIndex,
+  serviceContextFromDefaults,
+} from "llamaindex";
+
+async function main() {
+  // Create an instance of the LLM
+  const ollamaLLM = new Ollama({ model: "llama2", temperature: 0.75 });
+
+  // Create a service context
+  const serviceContext = serviceContextFromDefaults({ llm: ollamaLLM });
+
+  const document = new Document({ text: essay, id_: "essay" });
+
+  // Load and index documents
+  const index = await VectorStoreIndex.fromDocuments([document], {
+    serviceContext,
+  });
+
+  // Create a retriever, then a query engine that uses it
+  const retriever = index.asRetriever();
+
+  const queryEngine = index.asQueryEngine({ retriever });
+
+  const query = "What is the meaning of life?";
+
+  // Query
+  const response = await queryEngine.query({
+    query,
+  });
+
+  // Log the response
+  console.log(response.response);
+}
+```
@@ -0,0 +1,80 @@
+# OpenAI
+
+```ts
+import { OpenAI, serviceContextFromDefaults } from "llamaindex";
+
+const openaiLLM = new OpenAI({ model: "gpt-3.5-turbo", temperature: 0, apiKey: "<YOUR_API_KEY>" });
+
+const serviceContext = serviceContextFromDefaults({ llm: openaiLLM });
+```
+
+You can also set the API key through an environment variable:
+
+```bash
+export OPENAI_API_KEY="<YOUR_API_KEY>"
+```
+
+## Load and index documents
+
+For this example, we will use a single document. In a real-world scenario, you would have multiple documents to index.
+
+```ts
+const document = new Document({ text: essay, id_: "essay" });
+
+const index = await VectorStoreIndex.fromDocuments([document], {
+  serviceContext,
+});
+```
+
+## Query
+
+```ts
+const queryEngine = index.asQueryEngine();
+
+const query = "What is the meaning of life?";
+
+const results = await queryEngine.query({
+  query,
+});
+```
+
+## Full Example
+
+```ts
+import {
+  OpenAI,
+  Document,
+  VectorStoreIndex,
+  serviceContextFromDefaults,
+} from "llamaindex";
+
+async function main() {
+  // Create an instance of the LLM
+  const openaiLLM = new OpenAI({ model: "gpt-3.5-turbo", temperature: 0 });
+
+  // Create a service context
+  const serviceContext = serviceContextFromDefaults({ llm: openaiLLM });
+
+  const document = new Document({ text: essay, id_: "essay" });
+
+  // Load and index documents
+  const index = await VectorStoreIndex.fromDocuments([document], {
+    serviceContext,
+  });
+
+  // Create a retriever, then a query engine that uses it
+  const retriever = index.asRetriever();
+
+  const queryEngine = index.asQueryEngine({ retriever });
+
+  const query = "What is the meaning of life?";
+
+  // Query
+  const response = await queryEngine.query({
+    query,
+  });
+
+  // Log the response
+  console.log(response.response);
+}
+```
@@ -0,0 +1,80 @@
+# Portkey LLM
+
+## Usage
+
+```ts
+import { Portkey, serviceContextFromDefaults } from "llamaindex";
+
+const portkeyLLM = new Portkey({
+  apiKey: "<YOUR_API_KEY>",
+});
+
+const serviceContext = serviceContextFromDefaults({ llm: portkeyLLM });
+```
+
+## Load and index documents
+
+For this example, we will use a single document. In a real-world scenario, you would have multiple documents to index.
+
+```ts
+const document = new Document({ text: essay, id_: "essay" });
+
+const index = await VectorStoreIndex.fromDocuments([document], {
+  serviceContext,
+});
+```
+
+## Query
+
+```ts
+const queryEngine = index.asQueryEngine();
+
+const query = "What is the meaning of life?";
+
+const results = await queryEngine.query({
+  query,
+});
+```
+
+## Full Example
+
+```ts
+import {
+  Portkey,
+  Document,
+  VectorStoreIndex,
+  serviceContextFromDefaults,
+} from "llamaindex";
+
+async function main() {
+  // Create an instance of the LLM
+  const portkeyLLM = new Portkey({
+    apiKey: "<YOUR_API_KEY>",
+  });
+
+  // Create a service context
+  const serviceContext = serviceContextFromDefaults({ llm: portkeyLLM });
+
+  const document = new Document({ text: essay, id_: "essay" });
+
+  // Load and index documents
+  const index = await VectorStoreIndex.fromDocuments([document], {
+    serviceContext,
+  });
+
+  // Create a retriever, then a query engine that uses it
+  const retriever = index.asRetriever();
+
+  const queryEngine = index.asQueryEngine({ retriever });
+
+  const query = "What is the meaning of life?";
+
+  // Query
+  const response = await queryEngine.query({
+    query,
+  });
+
+  // Log the response
+  console.log(response.response);
+}
+```
@@ -0,0 +1,80 @@
+# Together LLM
+
+## Usage
+
+```ts
+import { TogetherLLM, serviceContextFromDefaults } from "llamaindex";
+
+const togetherLLM = new TogetherLLM({
+  apiKey: "<YOUR_API_KEY>",
+});
+
+const serviceContext = serviceContextFromDefaults({ llm: togetherLLM });
+```
+
+## Load and index documents
+
+For this example, we will use a single document. In a real-world scenario, you would have multiple documents to index.
+
+```ts
+const document = new Document({ text: essay, id_: "essay" });
+
+const index = await VectorStoreIndex.fromDocuments([document], {
+  serviceContext,
+});
+```
+
+## Query
+
+```ts
+const queryEngine = index.asQueryEngine();
+
+const query = "What is the meaning of life?";
+
+const results = await queryEngine.query({
+  query,
+});
+```
+
+## Full Example
+
+```ts
+import {
+  TogetherLLM,
+  Document,
+  VectorStoreIndex,
+  serviceContextFromDefaults,
+} from "llamaindex";
+
+async function main() {
+  // Create an instance of the LLM
+  const togetherLLM = new TogetherLLM({
+    apiKey: "<YOUR_API_KEY>",
+  });
+
+  // Create a service context
+  const serviceContext = serviceContextFromDefaults({ llm: togetherLLM });
+
+  const document = new Document({ text: essay, id_: "essay" });
+
+  // Load and index documents
+  const index = await VectorStoreIndex.fromDocuments([document], {
+    serviceContext,
+  });
+
+  // Create a query engine
+  const queryEngine = index.asQueryEngine();
+
+  const query = "What is the meaning of life?";
+
+  // Query
+  const response = await queryEngine.query({
+    query,
+  });
+
+  // Log the response
+  console.log(response.response);
+}
+```
@@ -2,7 +2,7 @@
 sidebar_position: 3
 ---

-# LLM
+# Large Language Models (LLMs)

 The LLM is responsible for reading text and generating natural language responses to queries. By default, LlamaIndex.TS uses `gpt-3.5-turbo`.

@@ -1,5 +1,5 @@
 ---
-sidebar_position: 3
+sidebar_position: 4
 ---

 # NodeParser
@@ -0,0 +1,2 @@
+label: "Node Postprocessors"
+position: 3
@@ -0,0 +1,71 @@
+# Cohere Reranker
+
+The Cohere Reranker is a postprocessor that uses the Cohere API to rerank the results of a search query.
+
+## Setup
+
+First, install the `llamaindex` package.
+
+```bash
+pnpm install llamaindex
+```
+
+Now, you will need to sign up for an API key at [Cohere](https://cohere.ai/). Once you have your API key you can import the necessary modules and create a new instance of the `CohereRerank` class.
+
+```ts
+import {
+  CohereRerank,
+  Document,
+  OpenAI,
+  VectorStoreIndex,
+  serviceContextFromDefaults,
+} from "llamaindex";
+```
+
+## Load and index documents
+
+For this example, we will use a single document. In a real-world scenario, you would have multiple documents to index.
+
+```ts
+const document = new Document({ text: essay, id_: "essay" });
+
+const serviceContext = serviceContextFromDefaults({
+  llm: new OpenAI({ model: "gpt-3.5-turbo", temperature: 0.1 }),
+});
+
+const index = await VectorStoreIndex.fromDocuments([document], {
+  serviceContext,
+});
+```
+
+## Increase similarity topK to retrieve more results
+
+The default value for `similarityTopK` is 2. This means that only the two most similar nodes will be returned. To retrieve more results, you can increase the value of `similarityTopK`.
+
+```ts
+const retriever = index.asRetriever();
+retriever.similarityTopK = 5;
+```
+
+## Create a new instance of the CohereRerank class
+
+Then you can create a new instance of the `CohereRerank` class and pass in your API key and the number of results you want to return.
+
+```ts
+const nodePostprocessor = new CohereRerank({
+  apiKey: "<COHERE_API_KEY>",
+  topN: 4,
+});
+```
+
+## Create a query engine with the retriever and node postprocessor
+
+```ts
+const queryEngine = index.asQueryEngine({
+  retriever,
+  nodePostprocessors: [nodePostprocessor],
+});
+
+const response = await queryEngine.query({
+  query: "Where did the author grow up?",
+});
+
+// log the response
+console.log(response.response);
+```
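The interplay between `similarityTopK` (how many candidates the retriever returns) and `topN` (how many the reranker keeps) can be sketched without any API calls. This is a minimal illustration only: `ScoredNode`, `takeTopK`, `rerank`, and the word-overlap scorer are hypothetical stand-ins, not part of `llamaindex` or the Cohere API, and a real reranker uses a trained model rather than word overlap.

```ts
// Hypothetical types and helpers for illustration; not part of llamaindex.
type ScoredNode = { text: string; score: number };

// Keep the k highest-scoring nodes (the role played by a retriever's top-K setting).
function takeTopK(nodes: ScoredNode[], k: number): ScoredNode[] {
  return [...nodes].sort((a, b) => b.score - a.score).slice(0, k);
}

// Re-score the candidates with a second scoring function, then keep the best n.
function rerank(
  nodes: ScoredNode[],
  rescore: (text: string) => number,
  n: number,
): ScoredNode[] {
  const rescored = nodes.map((node) => ({
    text: node.text,
    score: rescore(node.text),
  }));
  return takeTopK(rescored, n);
}

const query = "where do cats sleep";

// Toy relevance score: number of words in the text that also occur in the query.
const overlap = (text: string): number => {
  const queryWords = new Set(query.split(" "));
  return text
    .toLowerCase()
    .split(" ")
    .filter((word) => queryWords.has(word)).length;
};

const candidates: ScoredNode[] = [
  { text: "cats sleep on sofas", score: 0.62 },
  { text: "dogs bark at night", score: 0.8 },
  { text: "birds sleep in trees", score: 0.7 },
  { text: "fish swim", score: 0.1 },
];

// Stage 1: retrieve a wider candidate set. Stage 2: rerank and keep fewer.
const retrieved = takeTopK(candidates, 3);
const reranked = rerank(retrieved, overlap, 2);

console.log(reranked.map((node) => node.text));
```

Retrieving more candidates than you finally keep gives the second-stage scorer room to promote a node that the first-stage similarity score ranked low.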
@@ -0,0 +1,110 @@
+# Node Postprocessors
+
+## Concept
+
+Node postprocessors are a set of modules that take a set of nodes, and apply some kind of transformation or filtering before returning them.
+
+In LlamaIndex, node postprocessors are most commonly applied within a query engine, after the node retrieval step and before the response synthesis step.
+
+LlamaIndex offers several node postprocessors for immediate use, while also providing a simple API for adding your own custom postprocessors.
+
+## Usage Pattern
+
+An example of using node postprocessors is below:
+
+```ts
+import {
+  CohereRerank,
+  NodeWithScore,
+  SimilarityPostprocessor,
+  TextNode,
+} from "llamaindex";
+
+const nodes: NodeWithScore[] = [
+  {
+    node: new TextNode({ text: "hello world" }),
+    score: 0.8,
+  },
+  {
+    node: new TextNode({ text: "LlamaIndex is the best" }),
+    score: 0.6,
+  },
+];
+
+// similarity postprocessor: filter nodes below 0.7 similarity score
+const processor = new SimilarityPostprocessor({
+  similarityCutoff: 0.7,
+});
+
+const filteredNodes = processor.postprocessNodes(nodes);
+
+// cohere rerank: rerank nodes given query using trained model
+const reranker = new CohereRerank({
+  apiKey: "<COHERE_API_KEY>",
+  topN: 2,
+});
+
+const rerankedNodes = await reranker.postprocessNodes(nodes, "<user_query>");
+
+console.log(filteredNodes, rerankedNodes);
+```
+
+Now you can use the `filteredNodes` and `rerankedNodes` in your application.
+
+## Using Node Postprocessors in LlamaIndex
+
+Most commonly, node postprocessors are used in a query engine, where they are applied to the nodes returned by a retriever, before the response synthesis step.
+
+### Using Node Postprocessors in a Query Engine
+
+```ts
+import {
+  CohereRerank,
+  Document,
+  NodeWithScore,
+  OpenAI,
+  SimilarityPostprocessor,
+  TextNode,
+  VectorStoreIndex,
+  serviceContextFromDefaults,
+} from "llamaindex";
+
+const nodes: NodeWithScore[] = [
+  {
+    node: new TextNode({ text: "hello world" }),
+    score: 0.8,
+  },
+  {
+    node: new TextNode({ text: "LlamaIndex is the best" }),
+    score: 0.6,
+  }
+];
+
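+// similarity postprocessor: filter nodes below 0.7 similarity score
+const processor = new SimilarityPostprocessor({
+  similarityCutoff: 0.7,
+});
+
+// cohere rerank: rerank nodes given query using trained model
+const reranker = new CohereRerank({
+  apiKey: "<COHERE_API_KEY>",
+  topN: 2,
+});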
+
+const document = new Document({ text: "essay", id_: "essay" });
+
+const serviceContext = serviceContextFromDefaults({
+  llm: new OpenAI({ model: "gpt-3.5-turbo", temperature: 0.1 }),
+});
+
+const index = await VectorStoreIndex.fromDocuments([document], {
+  serviceContext,
+});
+
+const queryEngine = index.asQueryEngine({
+  nodePostprocessors: [processor, reranker],
+});
+
+// all node post-processors will be applied during each query
+const response = await queryEngine.query({ query: "<user_query>" });
+```
+
+### Using with retrieved nodes
+
+```ts
+import { SimilarityPostprocessor } from "llamaindex";
+
+const nodes = await index.asRetriever().retrieve("test query str");
+
+const processor = new SimilarityPostprocessor({
+  similarityCutoff: 0.7,
+});
+
+const filteredNodes = processor.postprocessNodes(nodes);
+```
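Conceptually, a chain of node postprocessors is a sequence of functions, each taking a list of scored nodes and returning a transformed or filtered list. The sketch below shows that idea only. The names `Scored`, `Postprocessor`, `similarityCutoff`, `keepTop`, and `applyPostprocessors` are hypothetical, and the assumption that a node is kept when its score is greater than or equal to the cutoff is ours, not a statement about how `SimilarityPostprocessor` is implemented.

```ts
// Hypothetical types for illustration; not the llamaindex interfaces.
type Scored = { text: string; score: number };
type Postprocessor = (nodes: Scored[]) => Scored[];

// Filtering step: drop nodes whose score is below the cutoff (assumed >= semantics).
const similarityCutoff =
  (cutoff: number): Postprocessor =>
  (nodes) =>
    nodes.filter((node) => node.score >= cutoff);

// Transformation step: sort by score, descending, and keep the first n nodes.
const keepTop =
  (n: number): Postprocessor =>
  (nodes) =>
    [...nodes].sort((a, b) => b.score - a.score).slice(0, n);

// Apply the postprocessors in order, feeding each one the previous result.
function applyPostprocessors(
  nodes: Scored[],
  processors: Postprocessor[],
): Scored[] {
  return processors.reduce((current, processor) => processor(current), nodes);
}

const sampleNodes: Scored[] = [
  { text: "hello world", score: 0.8 },
  { text: "LlamaIndex is the best", score: 0.6 },
  { text: "another chunk", score: 0.9 },
];

const processed = applyPostprocessors(sampleNodes, [
  similarityCutoff(0.7),
  keepTop(1),
]);

console.log(processed);
```

Order matters in such a chain: filtering before truncating means the truncation only ever sees nodes that passed the cutoff.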
@@ -36,6 +36,6 @@ You can learn more about Tools by taking a look at the LlamaIndex Python documen

 ## API Reference

- [RetrieverQueryEngine](../api/classes/RetrieverQueryEngine.md)
- [SubQuestionQueryEngine](../api/classes/SubQuestionQueryEngine.md)
- [QueryEngineTool](../api/interfaces//QueryEngineTool.md)
+- [RetrieverQueryEngine](../../api/classes/RetrieverQueryEngine.md)
+- [SubQuestionQueryEngine](../../api/classes/SubQuestionQueryEngine.md)
+- [QueryEngineTool](../../api/interfaces/QueryEngineTool.md)
@@ -0,0 +1,152 @@
+# Metadata Filtering
+
+Metadata filtering is a way to filter the documents that are returned by a query based on the metadata associated with the documents. This is useful when you want to filter the documents based on some metadata that is not part of the document text.
+
+You can also check our [multi-tenancy blog post](https://blog.llamaindex.ai/building-multi-tenancy-rag-system-with-llamaindex-0d6ab4e0c44b) to see how metadata filtering can be used in a multi-tenant environment (the article uses the Python version of LlamaIndex, but the concepts are the same).
+
+## Setup
+
+First, if you haven't already, install the `llamaindex` package:
+
+```bash
+pnpm i llamaindex
+```
+
+Then you can import the necessary modules from `llamaindex`:
+
+```ts
+import {
+  ChromaVectorStore,
+  Document,
+  VectorStoreIndex,
+  storageContextFromDefaults,
+} from "llamaindex";
+
+const collectionName = "dog_colors";
+```
+
+## Creating documents with metadata
+
+You can create documents with metadata using the `Document` class:
+
+```ts
+const docs = [
+  new Document({
+    text: "The dog is brown",
+    metadata: {
+      color: "brown",
+      dogId: "1",
+    },
+  }),
+  new Document({
+    text: "The dog is red",
+    metadata: {
+      color: "red",
+      dogId: "2",
+    },
+  }),
+];
+```
+
+## Creating a ChromaDB vector store
+
+You can create a `ChromaVectorStore` to store the documents:
+
+```ts
+const chromaVS = new ChromaVectorStore({ collectionName });
+const storageContext = await storageContextFromDefaults({
+  vectorStore: chromaVS,
+});
+
+const index = await VectorStoreIndex.fromDocuments(docs, {
+  storageContext,
+});
+```
+
+## Querying the index with metadata filtering
+
+Now you can query the index with metadata filtering using the `preFilters` option:
+
+```ts
+const queryEngine = index.asQueryEngine({
+  preFilters: {
+    filters: [
+      {
+        key: "dogId",
+        value: "2",
+        filterType: "ExactMatch",
+      },
+    ],
+  },
+});
+
+const response = await queryEngine.query({
+  query: "What is the color of the dog?",
+});
+
+console.log(response.toString());
+```
+
+## Full Code
+
+```ts
+import {
+  ChromaVectorStore,
+  Document,
+  VectorStoreIndex,
+  storageContextFromDefaults,
+} from "llamaindex";
+
+const collectionName = "dog_colors";
+
+async function main() {
+  try {
+    const docs = [
+      new Document({
+        text: "The dog is brown",
+        metadata: {
+          color: "brown",
+          dogId: "1",
+        },
+      }),
+      new Document({
+        text: "The dog is red",
+        metadata: {
+          color: "red",
+          dogId: "2",
+        },
+      }),
+    ];
+
+    console.log("Creating ChromaDB vector store");
+    const chromaVS = new ChromaVectorStore({ collectionName });
+    const ctx = await storageContextFromDefaults({ vectorStore: chromaVS });
+
+    console.log("Embedding documents and adding to index");
+    const index = await VectorStoreIndex.fromDocuments(docs, {
+      storageContext: ctx,
+    });
+
+    console.log("Querying index");
+    const queryEngine = index.asQueryEngine({
+      preFilters: {
+        filters: [
+          {
+            key: "dogId",
+            value: "2",
+            filterType: "ExactMatch",
+          },
+        ],
+      },
+    });
+    const response = await queryEngine.query({
+      query: "What is the color of the dog?",
+    });
+    console.log(response.toString());
+  } catch (e) {
+    console.error(e);
+  }
+}
+
+main();
+```
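To make the effect of an exact-match filter concrete, here is a small in-memory sketch of what filtering on metadata means before any similarity search runs. It does not reproduce how `ChromaVectorStore` applies `preFilters`; the names `SimpleDoc`, `ExactMatch`, and `preFilter` are hypothetical, and combining multiple filters with AND is an assumption made for this illustration.

```ts
// Hypothetical types for illustration; not the llamaindex or Chroma types.
type DocMetadata = Record<string, string>;
type SimpleDoc = { text: string; metadata: DocMetadata };
type ExactMatch = { key: string; value: string };

// Keep only documents whose metadata matches every filter exactly
// (AND semantics are assumed here).
function preFilter(docs: SimpleDoc[], filters: ExactMatch[]): SimpleDoc[] {
  return docs.filter((doc) =>
    filters.every((filter) => doc.metadata[filter.key] === filter.value),
  );
}

const dogDocs: SimpleDoc[] = [
  { text: "The dog is brown", metadata: { color: "brown", dogId: "1" } },
  { text: "The dog is red", metadata: { color: "red", dogId: "2" } },
];

// Only documents with dogId "2" remain candidates for the query.
const matches = preFilter(dogDocs, [{ key: "dogId", value: "2" }]);

console.log(matches.map((doc) => doc.text));
```

Because the filter narrows the candidate set before retrieval, a question like "What is the color of the dog?" can only be answered from the documents that match the filter.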
@@ -6,6 +6,6 @@
    "composite": true,
    "incremental": true,
    "outDir": "./lib",
-    "tsBuildInfoFile": "./lib/.tsbuildinfo",
-  },
+    "tsBuildInfoFile": "./lib/.tsbuildinfo"
+  }
 }
@@ -0,0 +1,76 @@
+import { FunctionTool, OpenAIAgent } from "llamaindex";
+
+// Define a function to sum two numbers
+function sumNumbers({ a, b }: { a: number; b: number }): number {
+  return a + b;
+}
+
+// Define a function to divide two numbers
+function divideNumbers({ a, b }: { a: number; b: number }): number {
+  return a / b;
+}
+
+// Define the parameters of the sum function as a JSON schema
+const sumJSON = {
+  type: "object",
+  properties: {
+    a: {
+      type: "number",
+      description: "The first number",
+    },
+    b: {
+      type: "number",
+      description: "The second number",
+    },
+  },
+  required: ["a", "b"],
+};
+
+const divideJSON = {
+  type: "object",
+  properties: {
+    a: {
+      type: "number",
+      description: "The dividend a to divide",
+    },
+    b: {
+      type: "number",
+      description: "The divisor b to divide by",
+    },
+  },
+  required: ["a", "b"],
+};
+
+async function main() {
+  // Create a function tool from the sum function
+  const functionTool = new FunctionTool(sumNumbers, {
+    name: "sumNumbers",
+    description: "Use this function to sum two numbers",
+    parameters: sumJSON,
+  });
+
+  // Create a function tool from the divide function
+  const functionTool2 = new FunctionTool(divideNumbers, {
+    name: "divideNumbers",
+    description: "Use this function to divide two numbers",
+    parameters: divideJSON,
+  });
+
+  // Create an OpenAIAgent with the function tools
+  const agent = new OpenAIAgent({
+    tools: [functionTool, functionTool2],
+    verbose: true,
+  });
+
+  // Chat with the agent
+  const response = await agent.chat({
+    message: "How much is 5 + 5? then divide by 2",
+  });
+
+  // Print the response
+  console.log(String(response));
+}
+
+main().then(() => {
+  console.log("Done");
+});
@@ -0,0 +1,46 @@
+import {
+  OpenAIAgent,
+  QueryEngineTool,
+  SimpleDirectoryReader,
+  VectorStoreIndex,
+} from "llamaindex";
+
+async function main() {
+  // Load the documents
+  const documents = await new SimpleDirectoryReader().loadData({
+    directoryPath: "node_modules/llamaindex/examples/",
+  });
+
+  // Create a vector index from the documents
+  const vectorIndex = await VectorStoreIndex.fromDocuments(documents);
+
+  // Create a query engine from the vector index
+  const abramovQueryEngine = vectorIndex.asQueryEngine();
+
+  // Create a QueryEngineTool with the query engine
+  const queryEngineTool = new QueryEngineTool({
+    queryEngine: abramovQueryEngine,
+    metadata: {
+      name: "abramov_query_engine",
+      description: "A query engine for the Abramov documents",
+    },
+  });
+
+  // Create an OpenAIAgent with the query engine tool
+  const agent = new OpenAIAgent({
+    tools: [queryEngineTool],
+    verbose: true,
+  });
+
+  // Chat with the agent
+  const response = await agent.chat({
+    message: "What was his salary?",
+  });
+
+  // Print the response
+  console.log(String(response));
+}
+
+main().then(() => {
+  console.log("Done");
+});
@@ -1,7 +1,9 @@
 import { Anthropic } from "llamaindex";

 (async () => {
-  const anthropic = new Anthropic();
+  const anthropic = new Anthropic({
+    apiKey: process.env.ANTHROPIC_API_KEY,
+  });
  const result = await anthropic.chat({
    messages: [
      { content: "You want to talk in rhymes.", role: "system" },
@@ -14,18 +14,27 @@ Here are two sample scripts which work well with the sample data in the Astra Po

 - `ASTRA_DB_APPLICATION_TOKEN`: The generated app token for your Astra database
 - `ASTRA_DB_ENDPOINT`: The API endpoint for your Astra database
+- `ASTRA_DB_NAMESPACE`: (Optional) The namespace where your collection is stored (defaults to `default_keyspace`)
 - `OPENAI_API_KEY`: Your OpenAI key

 2. `cd` Into the `examples` directory
 3. run `npm i`

-## Load the data
+## Example load and query
+
+Loads and queries a simple vector store with some documents about Astra DB.
+
+run `ts-node astradb/example`
+
+## Movie Reviews Example
+
+### Load the data

 This sample loads the same dataset of movie reviews as the Astra Portal sample dataset. (Feel free to load the data in the Astra Data Explorer to compare.)

 run `ts-node astradb/load`

-## Use RAG to Query the data
+### Use RAG to Query the data

 Check out your data in the Astra Data Explorer and change the sample query as you see fit.

@@ -0,0 +1,58 @@
+import {
+  AstraDBVectorStore,
+  Document,
+  storageContextFromDefaults,
+  VectorStoreIndex,
+} from "llamaindex";
+
+const collectionName = "test_collection";
+
+async function main() {
+  try {
+    const docs = [
+      new Document({
+        text: "AstraDB is built on Apache Cassandra",
+        metadata: {
+          id: 123,
+          foo: "bar",
+        },
+      }),
+      new Document({
+        text: "AstraDB is a NoSQL DB",
+        metadata: {
+          id: 456,
+          foo: "baz",
+        },
+      }),
+      new Document({
+        text: "AstraDB supports vector search",
+        metadata: {
+          id: 789,
+          foo: "qux",
+        },
+      }),
+    ];
+
+    const astraVS = new AstraDBVectorStore();
+    await astraVS.create(collectionName, {
+      vector: { dimension: 1536, metric: "cosine" },
+    });
+    await astraVS.connect(collectionName);
+
+    const ctx = await storageContextFromDefaults({ vectorStore: astraVS });
+    const index = await VectorStoreIndex.fromDocuments(docs, {
+      storageContext: ctx,
+    });
+
+    const queryEngine = index.asQueryEngine();
+    const response = await queryEngine.query({
+      query: "Describe AstraDB.",
+    });
+
+    console.log(response.toString());
+  } catch (e) {
+    console.error(e);
+  }
+}
+
+main();
@@ -10,9 +10,9 @@ const collectionName = "movie_reviews";
 async function main() {
  try {
    const reader = new PapaCSVReader(false);
-    const docs = await reader.loadData("../data/movie_reviews.csv");
+    const docs = await reader.loadData("./data/movie_reviews.csv");

-    const astraVS = new AstraDBVectorStore();
+    const astraVS = new AstraDBVectorStore({ contentKey: "reviewtext" });
    await astraVS.create(collectionName, {
      vector: { dimension: 1536, metric: "cosine" },
    });
@@ -8,7 +8,7 @@ const collectionName = "movie_reviews";

 async function main() {
  try {
-    const astraVS = new AstraDBVectorStore();
+    const astraVS = new AstraDBVectorStore({ contentKey: "reviewtext" });
    await astraVS.connect(collectionName);

    const ctx = serviceContextFromDefaults();
@@ -19,7 +19,7 @@ async function main() {
    const queryEngine = await index.asQueryEngine({ retriever });

    const results = await queryEngine.query({
-      query: "What is the best reviewed movie?",
+      query: 'How was "La Sapienza" reviewed?',
    });

    console.log(results.response);
@@ -6,7 +6,7 @@ Export your OpenAI API Key using `export OPEN_API_KEY=insert your api key here`

 If you haven't installed chromadb, run `pip install chromadb`. Start the server using `chroma run`.

-Now, open a new terminal window and inside `examples`, run `pnpx ts-node chromadb/test.ts`.
+Now, open a new terminal window and inside `examples`, run `pnpm dlx ts-node chromadb/test.ts`.

 Here's the output for the input query `Tell me about Godfrey Cheshire's rating of La Sapienza.`:

@@ -1,61 +1,5 @@
-## Reader Examples
+## LlamaIndex Reader Examples

-These examples show how to use a specific reader class by loading a document and running a test query.
-
-1. Make sure you are in `examples` directory
-
-```bash
-cd ./examples
-```
-
-2. Prepare `OPENAI_API_KEY` environment variable:
-
-```bash
-export OPENAI_API_KEY=your_openai_api_key
-```
-
-3. Run the following command to load documents and test query:
-
- MarkdownReader Example
-
-```bash
-npx ts-node readers/load-md.ts
-```
-
- DocxReader Example
-
-```bash
-npx ts-node readers/load-docx.ts
-```
-
- PdfReader Example
-
-```bash
-npx ts-node readers/load-pdf.ts
-```
-
- HtmlReader Example
-
-```bash
-npx ts-node readers/load-html.ts
-```
-
- CsvReader Example
-
-```bash
-npx ts-node readers/load-csv.ts
-```
-
- NotionReader Example
-
-```bash
-export NOTION_TOKEN=your_notion_token
-npx ts-node readers/load-notion.ts
-```
-
- AssemblyAI Example
-
-```bash
-export ASSEMBLYAI_API_KEY=your_assemblyai_api_key
-npx ts-node readers/load-assemblyai.ts
+```shell
+npm run start
 ```
@@ -0,0 +1,22 @@
+{
+  "name": "llamaindex-loader-example",
+  "private": true,
+  "type": "module",
+  "scripts": {
+    "start": "node --loader ts-node/esm ./src/simple-directory-reader.ts",
+    "start:csv": "node --loader ts-node/esm ./src/csv.ts",
+    "start:docx": "node --loader ts-node/esm ./src/docx.ts",
+    "start:html": "node --loader ts-node/esm ./src/html.ts",
+    "start:markdown": "node --loader ts-node/esm ./src/markdown.ts",
+    "start:pdf": "node --loader ts-node/esm ./src/pdf.ts",
+    "start:llamaparse": "node --loader ts-node/esm ./src/llamaparse.ts"
+  },
+  "dependencies": {
+    "llamaindex": "latest"
+  },
+  "devDependencies": {
+    "@types/node": "^20.11.14",
+    "ts-node": "^10.9.2",
+    "typescript": "^5.3.3"
+  }
+}
@@ -2,7 +2,7 @@ import { program } from "commander";
 import { TranscribeParams, VectorStoreIndex } from "llamaindex";
 import { AudioTranscriptReader } from "llamaindex/readers/AssemblyAIReader";
 import { stdin as input, stdout as output } from "node:process";
-import readline from "node:readline/promises";
+import { createInterface } from "node:readline/promises";

 program
  .option("-a, --audio [string]", "URL or path of the audio file to transcribe")
@@ -35,7 +35,7 @@ program
    // Create query engine
    const queryEngine = index.asQueryEngine();

-    const rl = readline.createInterface({ input, output });
+    const rl = createInterface({ input, output });
    while (true) {
      const query = await rl.question("Ask a question: ");

@@ -10,7 +10,7 @@ import { PapaCSVReader } from "llamaindex/readers/CSVReader";
 async function main() {
  // Load CSV
  const reader = new PapaCSVReader();
-  const path = "data/titanic_train.csv";
+  const path = "../data/titanic_train.csv";
  const documents = await reader.loadData(path);

  const serviceContext = serviceContextFromDefaults({
@@ -0,0 +1,26 @@
+import type { BaseReader, Document, Metadata } from "llamaindex";
+import {
+  FILE_EXT_TO_READER,
+  SimpleDirectoryReader,
+  TextFileReader,
+} from "llamaindex/readers/SimpleDirectoryReader";
+
+class ZipReader implements BaseReader {
+  loadData(...args: any[]): Promise<Document<Metadata>[]> {
+    throw new Error("Implement me");
+  }
+}
+
+const reader = new SimpleDirectoryReader();
+const documents = await reader.loadData({
+  directoryPath: "../data",
+  defaultReader: new TextFileReader(),
+  fileExtToReader: {
+    ...FILE_EXT_TO_READER,
+    zip: new ZipReader(),
+  },
+});
+
+documents.forEach((doc) => {
+  console.log(`document (${doc.id_}):`, doc.getText());
+});
@@ -1,7 +1,7 @@
 import { VectorStoreIndex } from "llamaindex";
 import { DocxReader } from "llamaindex/readers/DocxReader";

-const FILE_PATH = "./data/stars.docx";
+const FILE_PATH = "../data/stars.docx";
 const SAMPLE_QUERY = "Information about Zodiac";

 async function main() {
@@ -4,7 +4,7 @@ import { HTMLReader } from "llamaindex/readers/HTMLReader";
 async function main() {
  // Load page
  const reader = new HTMLReader();
-  const documents = await reader.loadData("data/18-1_Changelog.html");
+  const documents = await reader.loadData("../data/llamaindex.html");

  // Split text and create embeddings. Store them in a VectorStoreIndex
  const index = await VectorStoreIndex.fromDocuments(documents);
@@ -12,7 +12,7 @@ async function main() {
  // Query the index
  const queryEngine = index.asQueryEngine();
  const response = await queryEngine.query({
-    query: "What were the notable changes in 18.1?",
+    query: "What can I do with LlamaIndex?",
  });

  // Output response
@@ -0,0 +1,21 @@
+import { LlamaParseReader, VectorStoreIndex } from "llamaindex";
+
+async function main() {
+  // Load PDF using LlamaParse
+  const reader = new LlamaParseReader({ resultType: "markdown" });
+  const documents = await reader.loadData("../data/TOS.pdf");
+
+  // Split text and create embeddings. Store them in a VectorStoreIndex
+  const index = await VectorStoreIndex.fromDocuments(documents);
+
+  // Query the index
+  const queryEngine = index.asQueryEngine();
+  const response = await queryEngine.query({
+    query: "What is the license grant in the TOS?",
+  });
+
+  // Output response
+  console.log(response.toString());
+}
+
+main().catch(console.error);
@@ -1,7 +1,7 @@
 import { VectorStoreIndex } from "llamaindex";
 import { MarkdownReader } from "llamaindex/readers/MarkdownReader";

-const FILE_PATH = "./data/planets.md";
+const FILE_PATH = "../data/planets.md";
 const SAMPLE_QUERY = "List all planets";

 async function main() {
@@ -3,7 +3,7 @@ import { program } from "commander";
 import { VectorStoreIndex } from "llamaindex";
 import { NotionReader } from "llamaindex/readers/NotionReader";
 import { stdin as input, stdout as output } from "node:process";
-import readline from "node:readline/promises";
+import { createInterface } from "node:readline/promises";

 program
  .argument("[page]", "Notion page id (must be provided)")
@@ -70,7 +70,7 @@ program
    // Create query engine
    const queryEngine = index.asQueryEngine();

-    const rl = readline.createInterface({ input, output });
+    const rl = createInterface({ input, output });
    while (true) {
      const query = await rl.question("Query: ");

@@ -1,13 +1,10 @@
 import { VectorStoreIndex } from "llamaindex";
 import { PDFReader } from "llamaindex/readers/PDFReader";
-import { resolve } from "node:path";

 async function main() {
  // Load PDF
  const reader = new PDFReader();
-  const documents = await reader.loadData(
-    resolve(__dirname, "../data/brk-2022.pdf"),
-  );
+  const documents = await reader.loadData("../data/brk-2022.pdf");

  // Split text and create embeddings. Store them in a VectorStoreIndex
  const index = await VectorStoreIndex.fromDocuments(documents);
@@ -0,0 +1,10 @@
+import { SimpleDirectoryReader } from "llamaindex/readers/SimpleDirectoryReader";
+// or
+// import { SimpleDirectoryReader } from 'llamaindex'
+
+const reader = new SimpleDirectoryReader();
+const documents = await reader.loadData("../data");
+
+documents.forEach((doc) => {
+  console.log(`document (${doc.id_}):`, doc.getText());
+});
@@ -0,0 +1,11 @@
+{
+  "compilerOptions": {
+    "target": "es2017",
+    "module": "node16",
+    "moduleResolution": "node16",
+    "outDir": "./dist",
+    "types": ["node"],
+    "skipLibCheck": true
+  },
+  "include": ["./src/**/*.ts"]
+}
@@ -0,0 +1,55 @@
+import {
+  CohereRerank,
+  Document,
+  OpenAI,
+  VectorStoreIndex,
+  serviceContextFromDefaults,
+} from "llamaindex";
+
+import essay from "../essay";
+
+async function main() {
+  const document = new Document({ text: essay, id_: "essay" });
+
+  const serviceContext = serviceContextFromDefaults({
+    llm: new OpenAI({ model: "gpt-3.5-turbo", temperature: 0.1 }),
+  });
+
+  const index = await VectorStoreIndex.fromDocuments([document], {
+    serviceContext,
+  });
+
+  const retriever = index.asRetriever();
+
+  retriever.similarityTopK = 5;
+
+  const nodePostprocessor = new CohereRerank({
+    apiKey: "<COHERE_API_KEY>",
+    topN: 5,
+  });
+
+  const queryEngine = index.asQueryEngine({
+    retriever,
+    nodePostprocessors: [nodePostprocessor],
+  });
+
+  const baseQueryEngine = index.asQueryEngine({
+    retriever,
+  });
+
+  const response = await queryEngine.query({
+    query: "What did the author do growing up?",
+  });
+
+  // cohere response
+  console.log(response.response);
+
+  const baseResponse = await baseQueryEngine.query({
+    query: "What did the author do growing up?",
+  });
+
+  // response without cohere
+  console.log(baseResponse.response);
+}
+
+main().catch(console.error);
@@ -1,4 +1,9 @@
-import { Document, SubQuestionQueryEngine, VectorStoreIndex } from "llamaindex";
+import {
+  Document,
+  QueryEngineTool,
+  SubQuestionQueryEngine,
+  VectorStoreIndex,
+} from "llamaindex";

 import essay from "./essay";

@@ -6,16 +11,18 @@ import essay from "./essay";
  const document = new Document({ text: essay, id_: essay });
  const index = await VectorStoreIndex.fromDocuments([document]);

-  const queryEngine = SubQuestionQueryEngine.fromDefaults({
-    queryEngineTools: [
-      {
-        queryEngine: index.asQueryEngine(),
-        metadata: {
-          name: "pg_essay",
-          description: "Paul Graham essay on What I Worked On",
-        },
+  const queryEngineTools = [
+    new QueryEngineTool({
+      queryEngine: index.asQueryEngine(),
+      metadata: {
+        name: "pg_essay",
+        description: "Paul Graham essay on What I Worked On",
      },
-    ],
+    }),
+  ];
+
+  const queryEngine = SubQuestionQueryEngine.fromDefaults({
+    queryEngineTools,
  });

  const response = await queryEngine.query({
@@ -1,6 +1,6 @@
 {
  "compilerOptions": {
-    "target": "es2016",
+    "target": "es2017",
    "module": "esnext",
    "moduleResolution": "bundler",
    "esModuleInterop": true,
@@ -10,13 +10,13 @@
    "outDir": "./lib",
    "tsBuildInfoFile": "./lib/.tsbuildinfo",
    "incremental": true,
-    "composite": true,
+    "composite": true
  },
  "ts-node": {
    "files": true,
    "compilerOptions": {
-      "module": "commonjs",
-    },
+      "module": "commonjs"
+    }
  },
-  "include": ["./**/*.ts"],
+  "include": ["./**/*.ts"]
 }
@@ -18,20 +18,19 @@
  },
  "devDependencies": {
    "@changesets/cli": "^2.27.1",
-    "@turbo/gen": "^1.11.3",
-    "@types/jest": "^29.5.11",
+    "@types/jest": "^29.5.12",
    "eslint": "^8.56.0",
    "eslint-config-custom": "workspace:*",
-    "husky": "^9.0.6",
+    "husky": "^9.0.10",
    "jest": "^29.7.0",
-    "lint-staged": "^15.2.0",
-    "prettier": "^3.2.4",
+    "lint-staged": "^15.2.2",
+    "prettier": "^3.2.5",
    "prettier-plugin-organize-imports": "^3.2.4",
    "ts-jest": "^29.1.2",
-    "turbo": "^1.11.3",
+    "turbo": "^1.12.3",
    "typescript": "^5.3.3"
  },
-  "packageManager": "pnpm@8.14.3+sha256.2d0363bb6c314daa67087ef07743eea1ba2e2d360c835e8fec6b5575e4ed9484",
+  "packageManager": "pnpm@8.15.1",
  "pnpm": {
    "overrides": {
      "trim": "1.0.1",
@@ -1,5 +1,27 @@
 # llamaindex

+## 0.1.10
+
+### Patch Changes
+
+- b6c1500: feat(embedBatchSize): add batching for embeddings
+- 6cc3a36: fix: update `VectorIndexRetriever` constructor parameters' type.
+- cd82947: feat(queryEngineTool): add query engine tool to agents
+
+## 0.1.9
+
+### Patch Changes
+
+- 09464e6: add OpenAIAgent (thanks @EmanuelCampos)
+
+## 0.1.8
+
+### Patch Changes
+
+- d903da6: easier prompt customization for SimpleResponseBuilder
+- ab9d941: fix(cyclic): remove cyclic structures from transform hash
+- 177b446: chore: improve extractors prompt
+
 ## 0.1.7

 ### Patch Changes
@@ -1,26 +1,27 @@
 {
  "name": "llamaindex",
  "private": true,
-  "version": "0.1.7",
+  "version": "0.1.10",
  "license": "MIT",
  "dependencies": {
-    "@anthropic-ai/sdk": "^0.12.4",
+    "@anthropic-ai/sdk": "^0.13.0",
    "@datastax/astra-db-ts": "^0.1.4",
    "@mistralai/mistralai": "^0.0.10",
    "@notionhq/client": "^2.2.14",
    "@pinecone-database/pinecone": "^1.1.3",
    "@qdrant/js-client-rest": "^1.7.0",
-    "@xenova/transformers": "^2.14.1",
-    "assemblyai": "^4.2.1",
+    "@xenova/transformers": "^2.15.0",
+    "assemblyai": "^4.2.2",
    "chromadb": "~1.7.3",
+    "cohere-ai": "^7.7.5",
    "file-type": "^18.7.0",
-    "js-tiktoken": "^1.0.8",
+    "js-tiktoken": "^1.0.10",
    "lodash": "^4.17.21",
    "mammoth": "^1.6.0",
    "md-utils-ts": "^2.0.0",
    "mongodb": "^6.3.0",
    "notion-md-crawler": "^0.0.2",
-    "openai": "^4.26.0",
+    "openai": "^4.26.1",
    "papaparse": "^5.4.1",
    "pathe": "^1.1.2",
    "pdf2json": "^3.0.5",
@@ -29,18 +30,18 @@
    "portkey-ai": "^0.1.16",
    "rake-modified": "^1.0.8",
    "replicate": "^0.25.2",
-    "string-strip-html": "^13.4.5",
+    "string-strip-html": "^13.4.6",
    "wink-nlp": "^1.14.3"
  },
  "devDependencies": {
    "@aws-crypto/sha256-js": "^5.2.0",
    "@types/edit-json-file": "^1.7.3",
-    "@types/jest": "^29.5.11",
+    "@types/jest": "^29.5.12",
    "@types/lodash": "^4.14.202",
-    "@types/node": "^18.19.10",
+    "@types/node": "^18.19.14",
    "@types/papaparse": "^5.3.14",
    "@types/pg": "^8.11.0",
-    "bunchee": "^4.4.3",
+    "bunchee": "^4.4.6",
    "edit-json-file": "^1.8.0",
    "madge": "^6.1.0",
    "typescript": "^5.3.3"
@@ -118,11 +119,6 @@
      "import": "./dist/Response.mjs",
      "require": "./dist/Response.js"
    },
-    "./Retriever": {
-      "types": "./dist/Retriever.d.mts",
-      "import": "./dist/Retriever.mjs",
-      "require": "./dist/Retriever.js"
-    },
    "./ServiceContext": {
      "types": "./dist/ServiceContext.d.mts",
      "import": "./dist/ServiceContext.mjs",
@@ -133,10 +129,15 @@
      "import": "./dist/TextSplitter.mjs",
      "require": "./dist/TextSplitter.js"
    },
-    "./Tool": {
-      "types": "./dist/Tool.d.mts",
-      "import": "./dist/Tool.mjs",
-      "require": "./dist/Tool.js"
+    "./tools": {
+      "types": "./dist/tools.d.mts",
+      "import": "./dist/tools.mjs",
+      "require": "./dist/tools.js"
+    },
+    "./readers": {
+      "types": "./dist/readers.d.mts",
+      "import": "./dist/readers.mjs",
+      "require": "./dist/readers.js"
    },
    "./readers/AssemblyAIReader": {
      "types": "./dist/readers/AssemblyAIReader.d.mts",
@@ -200,7 +201,7 @@
  "scripts": {
    "lint": "eslint .",
    "test": "jest",
-    "build": "NODE_OPTIONS=\"--max-old-space-size=8192\" bunchee",
+    "build": "rm -rf ./dist && NODE_OPTIONS=\"--max-old-space-size=8192\" bunchee",
    "postbuild": "pnpm run copy && pnpm run modify-package-json",
    "copy": "cp -r package.json CHANGELOG.md ../../README.md ../../LICENSE examples src dist/",
    "modify-package-json": "node ./scripts/modify-package-json.mjs",
@@ -74,9 +74,6 @@ export class SubQuestionOutputParser
 {
  parse(output: string): StructuredOutput<SubQuestion[]> {
    const parsed = parseJsonMarkdown(output);
-
-    // TODO add zod validation
-
    return { rawOutput: output, parsedOutput: parsed };
  }

@@ -13,7 +13,7 @@ export class Response {
    this.sourceNodes = sourceNodes || [];
  }

-  getFormattedSources() {
+  protected _getFormattedSources() {
    throw new Error("Not implemented yet");
  }

@@ -0,0 +1,2 @@
+export * from "./openai/base";
+export * from "./openai/worker";
@@ -0,0 +1,55 @@
+import { CallbackManager } from "../../callbacks/CallbackManager";
+import { ChatMessage, OpenAI } from "../../llm";
+import { ObjectRetriever } from "../../objects/base";
+import { BaseTool } from "../../types";
+import { AgentRunner } from "../runner/base";
+import { OpenAIAgentWorker } from "./worker";
+
+type OpenAIAgentParams = {
+  tools: BaseTool[];
+  llm?: OpenAI;
+  memory?: any;
+  prefixMessages?: ChatMessage[];
+  verbose?: boolean;
+  maxFunctionCalls?: number;
+  defaultToolChoice?: string;
+  callbackManager?: CallbackManager;
+  toolRetriever?: ObjectRetriever<BaseTool>;
+};
+
+/**
+ * An agent that uses OpenAI's API to generate text.
+ *
+ * @category OpenAI
+ */
+export class OpenAIAgent extends AgentRunner {
+  constructor({
+    tools,
+    llm,
+    memory,
+    prefixMessages,
+    verbose,
+    maxFunctionCalls = 5,
+    defaultToolChoice = "auto",
+    callbackManager,
+    toolRetriever,
+  }: OpenAIAgentParams) {
+    const stepEngine = new OpenAIAgentWorker({
+      tools,
+      callbackManager,
+      llm,
+      prefixMessages,
+      maxFunctionCalls,
+      toolRetriever,
+      verbose,
+    });
+
+    super({
+      agentWorker: stepEngine,
+      memory,
+      callbackManager,
+      defaultToolChoice,
+      chatHistory: prefixMessages,
+    });
+  }
+}
@@ -0,0 +1,13 @@
+export type OpenAIToolCall = ChatCompletionMessageToolCall;
+
+export interface Function {
+  arguments: string;
+  name: string;
+  type: "function";
+}
+
+export interface ChatCompletionMessageToolCall {
+  id: string;
+  function: Function;
+  type: "function";
+}
@@ -0,0 +1,27 @@
+import { ToolMetadata } from "../../types";
+
+export type OpenAIFunction = {
+  type: "function";
+  function: ToolMetadata;
+};
+
+type OpenAiTool = {
+  name: string;
+  description: string;
+  parameters: ToolMetadata["parameters"];
+};
+
+export const toOpenAiTool = ({
+  name,
+  description,
+  parameters,
+}: OpenAiTool): OpenAIFunction => {
+  return {
+    type: "function",
+    function: {
+      name: name,
+      description: description,
+      parameters,
+    },
+  };
+};
@@ -0,0 +1,405 @@
+// Agent worker that drives OpenAI tool calling for OpenAIAgent.
+
+import { CallbackManager } from "../../callbacks/CallbackManager";
+import { AgentChatResponse, ChatResponseMode } from "../../engines/chat";
+import { randomUUID } from "../../env";
+import {
+  ChatMessage,
+  ChatResponse,
+  ChatResponseChunk,
+  OpenAI,
+} from "../../llm";
+import { ChatMemoryBuffer } from "../../memory/ChatMemoryBuffer";
+import { ObjectRetriever } from "../../objects/base";
+import { ToolOutput } from "../../tools/types";
+import { callToolWithErrorHandling } from "../../tools/utils";
+import { BaseTool } from "../../types";
+import { AgentWorker, Task, TaskStep, TaskStepOutput } from "../types";
+import { addUserStepToMemory, getFunctionByName } from "../utils";
+import { OpenAIToolCall } from "./types/chat";
+import { toOpenAiTool } from "./utils";
+
+const DEFAULT_MAX_FUNCTION_CALLS = 5;
+
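+/**
+ * Call the tool named in an OpenAI tool call.
+ * @param tools: tools available to the agent
+ * @param toolCall: tool call returned by the LLM
+ * @param verbose: whether to log the call and its output
+ * @returns: the tool message to add to memory, and the raw tool output
+ */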
+async function callFunction(
+  tools: BaseTool[],
+  toolCall: OpenAIToolCall,
+  verbose: boolean = false,
+): Promise<[ChatMessage, ToolOutput]> {
+  const id_ = toolCall.id;
+  const functionCall = toolCall.function;
+  const name = toolCall.function.name;
+  const argumentsStr = toolCall.function.arguments;
+
+  if (verbose) {
+    console.log("=== Calling Function ===");
+    console.log(`Calling function: ${name} with args: ${argumentsStr}`);
+  }
+
+  const tool = getFunctionByName(tools, name);
+  const argumentDict = JSON.parse(argumentsStr);
+
+  // Call tool
+  // Use default error message
+  const output = await callToolWithErrorHandling(tool, argumentDict, null);
+
+  if (verbose) {
+    console.log(`Got output ${output}`);
+    console.log("==========================");
+  }
+
+  return [
+    {
+      content: String(output),
+      role: "tool",
+      additionalKwargs: {
+        name,
+        tool_call_id: id_,
+      },
+    },
+    output,
+  ];
+}
+
+type OpenAIAgentWorkerParams = {
+  tools: BaseTool[];
+  llm?: OpenAI;
+  prefixMessages?: ChatMessage[];
+  verbose?: boolean;
+  maxFunctionCalls?: number;
+  callbackManager?: CallbackManager | undefined;
+  toolRetriever?: ObjectRetriever<BaseTool>;
+};
+
+type CallFunctionOutput = {
+  message: ChatMessage;
+  toolOutput: ToolOutput;
+};
+
+/**
+ * OpenAI agent worker.
+ * This class is responsible for running the agent.
+ */
+export class OpenAIAgentWorker implements AgentWorker {
+  private _llm: OpenAI;
+  private _verbose: boolean;
+  private _maxFunctionCalls: number;
+
+  public prefixMessages: ChatMessage[];
+  public callbackManager: CallbackManager | undefined;
+
+  private _getTools: (input: string) => BaseTool[];
+
+  /**
+   * Initialize.
+   */
+  constructor({
+    tools,
+    llm,
+    prefixMessages,
+    verbose,
+    maxFunctionCalls = DEFAULT_MAX_FUNCTION_CALLS,
+    callbackManager,
+    toolRetriever,
+  }: OpenAIAgentWorkerParams) {
+    this._llm = llm ?? new OpenAI({ model: "gpt-3.5-turbo-0613" });
+    this._verbose = verbose || false;
+    this._maxFunctionCalls = maxFunctionCalls;
+    this.prefixMessages = prefixMessages || [];
+    this.callbackManager = callbackManager || this._llm.callbackManager;
+
+    if (tools.length > 0 && toolRetriever) {
+      throw new Error("Cannot specify both tools and toolRetriever");
+    } else if (tools.length > 0) {
+      this._getTools = () => tools;
+    } else if (toolRetriever) {
+      // @ts-ignore
+      this._getTools = (message: string) => toolRetriever.retrieve(message);
+    } else {
+      this._getTools = () => [];
+    }
+  }
+
+  /**
+   * Get all messages.
+   * @param task: task
+   * @returns: messages
+   */
+  public getAllMessages(task: Task): ChatMessage[] {
+    return [
+      ...this.prefixMessages,
+      ...task.memory.get(),
+      ...task.extraState.newMemory.get(),
+    ];
+  }
+
+  /**
+   * Get latest tool calls.
+   * @param task: task
+   * @returns: tool calls
+   */
+  public getLatestToolCalls(task: Task): OpenAIToolCall[] | null {
+    const chatHistory: ChatMessage[] = task.extraState.newMemory.getAll();
+
+    if (chatHistory.length === 0) {
+      return null;
+    }
+
+    return chatHistory[chatHistory.length - 1].additionalKwargs?.toolCalls;
+  }
+
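+  /**
+   * Get LLM chat kwargs.
+   * @param task: task
+   * @param openaiTools: tools in OpenAI function format
+   * @param toolChoice: tool choice, only sent when tools are present
+   * @returns: kwargs for the LLM chat call
+   */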
+  private _getLlmChatKwargs(
+    task: Task,
+    openaiTools: { [key: string]: any }[],
+    toolChoice: string | { [key: string]: any } = "auto",
+  ): { [key: string]: any } {
+    const llmChatKwargs: { [key: string]: any } = {
+      messages: this.getAllMessages(task),
+    };
+
+    if (openaiTools.length > 0) {
+      llmChatKwargs.tools = openaiTools;
+      llmChatKwargs.toolChoice = toolChoice;
+    }
+
+    return llmChatKwargs;
+  }
+
+  /**
+   * Process message.
+   * @param task: task
+   * @param chatResponse: chat response
+   * @returns: agent chat response
+   */
+  private _processMessage(
+    task: Task,
+    chatResponse: ChatResponse,
+  ): AgentChatResponse | AsyncIterable<ChatResponseChunk> {
+    const aiMessage = chatResponse.message;
+    task.extraState.newMemory.put(aiMessage);
+    return new AgentChatResponse(aiMessage.content, task.extraState.sources);
+  }
+
+  /**
+   * Get agent response.
+   * @param task: task
+   * @param mode: mode
+   * @param llmChatKwargs: llm chat kwargs
+   * @returns: agent chat response
+   */
+  private async _getAgentResponse(
+    task: Task,
+    mode: ChatResponseMode,
+    llmChatKwargs: any,
+  ): Promise<AgentChatResponse> {
+    if (mode === ChatResponseMode.WAIT) {
+      const chatResponse = (await this._llm.chat({
+        stream: false,
+        ...llmChatKwargs,
+      })) as unknown as ChatResponse;
+
+      return this._processMessage(task, chatResponse) as AgentChatResponse;
+    } else {
+      throw new Error("Not implemented");
+    }
+  }
+
+  /**
+   * Call function.
+   * Validates the tool call, then runs the matching tool.
+   * @param tools: tools
+   * @param toolCall: tool call
+   * @throws: if the tool call has no function payload
+   * @returns: the tool message and the tool output
+   */
+  async callFunction(
+    tools: BaseTool[],
+    toolCall: OpenAIToolCall,
+  ): Promise<CallFunctionOutput> {
+    const functionCall = toolCall.function;
+
+    if (!functionCall) {
+      throw new Error("Invalid tool_call object");
+    }
+
+    const functionMessage = await callFunction(tools, toolCall, this._verbose);
+
+    const message = functionMessage[0];
+    const toolOutput = functionMessage[1];
+
+    return {
+      message,
+      toolOutput,
+    };
+  }
+
+  /**
+   * Initialize step.
+   * @param task: task
+   * @param kwargs: kwargs
+   * @returns: task step
+   */
+  initializeStep(task: Task, kwargs?: any): TaskStep {
+    const sources: ToolOutput[] = [];
+
+    const newMemory = new ChatMemoryBuffer();
+
+    const taskState = {
+      sources,
+      nFunctionCalls: 0,
+      newMemory,
+    };
+
+    task.extraState = {
+      ...task.extraState,
+      ...taskState,
+    };
+
+    return new TaskStep(task.taskId, randomUUID(), task.input);
+  }
+
+  /**
+   * Should continue.
+   * @param toolCalls: tool calls
+   * @param nFunctionCalls: number of function calls
+   * @returns: boolean
+   */
+  private _shouldContinue(
+    toolCalls: OpenAIToolCall[] | null,
+    nFunctionCalls: number,
+  ): boolean {
+    if (nFunctionCalls > this._maxFunctionCalls) {
+      return false;
+    }
+
+    if (!toolCalls || toolCalls.length === 0) {
+      return false;
+    }
+
+    return true;
+  }
+
+  /**
+   * Get tools.
+   * @param input: input
+   * @returns: tools
+   */
+  getTools(input: string): BaseTool[] {
+    return this._getTools(input);
+  }
+
+  private async _runStep(
+    step: TaskStep,
+    task: Task,
+    mode: ChatResponseMode = ChatResponseMode.WAIT,
+    toolChoice: string | { [key: string]: any } = "auto",
+  ): Promise<TaskStepOutput> {
+    const tools = this.getTools(task.input);
+
+    if (step.input) {
+      addUserStepToMemory(step, task.extraState.newMemory, this._verbose);
+    }
+
+    const openaiTools = tools.map((tool) =>
+      toOpenAiTool({
+        name: tool.metadata.name,
+        description: tool.metadata.description,
+        parameters: tool.metadata.parameters,
+      }),
+    );
+
+    const llmChatKwargs = this._getLlmChatKwargs(task, openaiTools, toolChoice);
+
+    const agentChatResponse = await this._getAgentResponse(
+      task,
+      mode,
+      llmChatKwargs,
+    );
+
+    const latestToolCalls = this.getLatestToolCalls(task) || [];
+
+    let isDone: boolean;
+    let newSteps: TaskStep[] = [];
+
+    if (
+      !this._shouldContinue(latestToolCalls, task.extraState.nFunctionCalls)
+    ) {
+      isDone = true;
+      newSteps = [];
+    } else {
+      isDone = false;
+      for (const toolCall of latestToolCalls) {
+        const { message, toolOutput } = await this.callFunction(
+          tools,
+          toolCall,
+        );
+
+        task.extraState.sources.push(toolOutput);
+        task.extraState.newMemory.put(message);
+
+        task.extraState.nFunctionCalls += 1;
+      }
+
+      newSteps = [step.getNextStep(randomUUID(), undefined)];
+    }
+
+    return new TaskStepOutput(agentChatResponse, step, newSteps, isDone);
+  }
+
+  /**
+   * Run step.
+   * @param step: step
+   * @param task: task
+   * @param kwargs: kwargs
+   * @returns: task step output
+   */
+  async runStep(
+    step: TaskStep,
+    task: Task,
+    kwargs?: any,
+  ): Promise<TaskStepOutput> {
+    const toolChoice = kwargs?.toolChoice || "auto";
+    return this._runStep(step, task, ChatResponseMode.WAIT, toolChoice);
+  }
+
+  /**
+   * Stream step.
+   * @param step: step
+   * @param task: task
+   * @param kwargs: kwargs
+   * @returns: task step output
+   */
+  async streamStep(
+    step: TaskStep,
+    task: Task,
+    kwargs?: any,
+  ): Promise<TaskStepOutput> {
+    const toolChoice = kwargs?.toolChoice || "auto";
+    return this._runStep(step, task, ChatResponseMode.STREAM, toolChoice);
+  }
+
+  /**
+   * Finalize task.
+   * @param task: task
+   * @param kwargs: kwargs
+   * @returns: void
+   */
+  finalizeTask(task: Task, kwargs?: any): void {
+    task.memory.set(task.memory.get().concat(task.extraState.newMemory.get()));
+    task.extraState.newMemory.reset();
+  }
+}
@@ -0,0 +1,350 @@
+import { randomUUID } from "../../env";
+import { CallbackManager } from "../../callbacks/CallbackManager";
+import {
+  AgentChatResponse,
+  ChatEngineAgentParams,
+  ChatResponseMode,
+} from "../../engines/chat";
+import { ChatMessage, LLM } from "../../llm";
+import { ChatMemoryBuffer } from "../../memory/ChatMemoryBuffer";
+import { BaseMemory } from "../../memory/types";
+import { AgentWorker, Task, TaskStep, TaskStepOutput } from "../types";
+import { AgentState, BaseAgentRunner, TaskState } from "./types";
+
+const validateStepFromArgs = (
+  taskId: string,
+  input: string,
+  step?: any,
+  kwargs?: any,
+): TaskStep | undefined => {
+  if (step) {
+    if (input) {
+      throw new Error("Cannot specify both `step` and `input`");
+    }
+    return step;
+  } else {
+    return new TaskStep(taskId, randomUUID(), input, kwargs);
+  }
+};
+
+type AgentRunnerParams = {
+  agentWorker: AgentWorker;
+  chatHistory?: ChatMessage[];
+  state?: AgentState;
+  memory?: BaseMemory;
+  llm?: LLM;
+  callbackManager?: CallbackManager;
+  initTaskStateKwargs?: Record<string, any>;
+  deleteTaskOnFinish?: boolean;
+  defaultToolChoice?: string;
+};
+
+export class AgentRunner extends BaseAgentRunner {
+  agentWorker: AgentWorker;
+  state: AgentState;
+  memory: BaseMemory;
+  callbackManager: CallbackManager;
+  initTaskStateKwargs: Record<string, any>;
+  deleteTaskOnFinish: boolean;
+  defaultToolChoice: string;
+
+  /**
+   * Creates an AgentRunner.
+   */
+  constructor(params: AgentRunnerParams) {
+    super();
+
+    this.agentWorker = params.agentWorker;
+    this.state = params.state ?? new AgentState();
+    this.memory =
+      params.memory ??
+      new ChatMemoryBuffer({
+        chatHistory: params.chatHistory,
+      });
+    this.callbackManager = params.callbackManager ?? new CallbackManager();
+    this.initTaskStateKwargs = params.initTaskStateKwargs ?? {};
+    this.deleteTaskOnFinish = params.deleteTaskOnFinish ?? false;
+    this.defaultToolChoice = params.defaultToolChoice ?? "auto";
+  }
+
+  /**
+   * Creates a task.
+   * @param input
+   * @param kwargs
+   */
+  createTask(input: string, kwargs?: any): Task {
+    let extraState;
+
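+    if (Object.keys(this.initTaskStateKwargs).length === 0) {
+      // No initial task state configured, so take `extraState` from kwargs
+      // when the caller supplied one.
+      if (kwargs && "extraState" in kwargs) {
+        extraState = kwargs.extraState;
+      }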
+    } else {
+      if (kwargs && "extraState" in kwargs) {
+        throw new Error(
+          "Cannot specify both `extraState` and `initTaskStateKwargs`",
+        );
+      } else {
+        extraState = this.initTaskStateKwargs;
+      }
+    }
+
+    const task = new Task({
+      taskId: randomUUID(),
+      input,
+      memory: this.memory,
+      extraState,
+      ...kwargs,
+    });
+
+    const initialStep = this.agentWorker.initializeStep(task);
+
+    const taskState = new TaskState({
+      task,
+      stepQueue: [initialStep],
+    });
+
+    this.state.taskDict[task.taskId] = taskState;
+
+    return task;
+  }
+
+  /**
+   * Deletes the task.
+   * @param taskId
+   */
+  deleteTask(taskId: string): void {
+    delete this.state.taskDict[taskId];
+  }
+
+  /**
+   * Returns the list of tasks.
+   */
+  listTasks(): Task[] {
+    return Object.values(this.state.taskDict).map(
+      (taskState) => taskState.task,
+    );
+  }
+
+  /**
+   * Returns the task.
+   */
+  getTask(taskId: string): Task {
+    return this.state.taskDict[taskId].task;
+  }
+
+  /**
+   * Returns the completed steps in the task.
+   * @param taskId
+   * @param kwargs
+   */
+  getCompletedSteps(taskId: string): TaskStepOutput[] {
+    return this.state.taskDict[taskId].completedSteps;
+  }
+
+  /**
+   * Returns the next steps in the task.
+   * @param taskId
+   * @param kwargs
+   */
+  getUpcomingSteps(taskId: string, kwargs: any): TaskStep[] {
+    return this.state.taskDict[taskId].stepQueue;
+  }
+
+  private async _runStep(
+    taskId: string,
+    step?: TaskStep,
+    mode: ChatResponseMode = ChatResponseMode.WAIT,
+    kwargs?: any,
+  ): Promise<TaskStepOutput> {
+    const task = this.state.getTask(taskId);
+    const curStep = step || this.state.getStepQueue(taskId).shift();
+
+    let curStepOutput;
+
+    if (!curStep) {
+      throw new Error(`No step found for task ${taskId}`);
+    }
+
+    if (mode === ChatResponseMode.WAIT) {
+      curStepOutput = await this.agentWorker.runStep(curStep, task, kwargs);
+    } else if (mode === ChatResponseMode.STREAM) {
+      curStepOutput = await this.agentWorker.streamStep(curStep, task, kwargs);
+    } else {
+      throw new Error(`Invalid mode: ${mode}`);
+    }
+
+    const nextSteps = curStepOutput.nextSteps;
+
+    this.state.addSteps(taskId, nextSteps);
+    this.state.addCompletedStep(taskId, [curStepOutput]);
+
+    return curStepOutput;
+  }
+
+  /**
+   * Runs the next step in the task.
+   * @param taskId
+   * @param input
+   * @param step
+   * @param kwargs
+   */
+  async runStep(
+    taskId: string,
+    input: string,
+    step?: TaskStep,
+    kwargs: any = {},
+  ): Promise<TaskStepOutput> {
+    const curStep = validateStepFromArgs(taskId, input, step, kwargs);
+    return this._runStep(taskId, curStep, ChatResponseMode.WAIT, kwargs);
+  }
+
+  /**
+   * Runs the step and returns the response.
+   * @param taskId
+   * @param input
+   * @param step
+   * @param kwargs
+   */
+  async streamStep(
+    taskId: string,
+    input: string,
+    step?: TaskStep,
+    kwargs?: any,
+  ): Promise<TaskStepOutput> {
+    const curStep = validateStepFromArgs(taskId, input, step, kwargs);
+    return this._runStep(taskId, curStep, ChatResponseMode.STREAM, kwargs);
+  }
+
+  /**
+   * Finalizes the response and returns it.
+   * @param taskId
+   * @param stepOutput
+   * @param kwargs
+   * @returns
+   */
+  async finalizeResponse(
+    taskId: string,
+    stepOutput: TaskStepOutput,
+    kwargs?: any,
+  ): Promise<AgentChatResponse> {
+    if (!stepOutput) {
+      stepOutput =
+        this.getCompletedSteps(taskId)[
+          this.getCompletedSteps(taskId).length - 1
+        ];
+    }
+    if (!stepOutput.isLast) {
+      throw new Error(
+        "finalizeResponse can only be called on the last step output",
+      );
+    }
+
+    if (!(stepOutput.output instanceof AgentChatResponse)) {
+      throw new Error(
+        `When \`isLast\` is true, stepOutput.output must be an AgentChatResponse: ${stepOutput.output}`,
+      );
+    }
+
+    this.agentWorker.finalizeTask(this.getTask(taskId), kwargs);
+
+    if (this.deleteTaskOnFinish) {
+      this.deleteTask(taskId);
+    }
+
+    return stepOutput.output;
+  }
+
+  protected async _chat({
+    message,
+    toolChoice,
+  }: ChatEngineAgentParams & { mode: ChatResponseMode }) {
+    const task = this.createTask(message as string);
+
+    let resultOutput;
+
+    while (true) {
+      const curStepOutput = await this._runStep(
+        task.taskId,
+        undefined,
+        ChatResponseMode.WAIT,
+        {
+          toolChoice,
+        },
+      );
+
+      if (curStepOutput.isLast) {
+        resultOutput = curStepOutput;
+        break;
+      }
+
+      toolChoice = "auto";
+    }
+
+    return this.finalizeResponse(task.taskId, resultOutput);
+  }
+
+  /**
+   * Sends a message to the LLM and returns the response.
+   * @param message
+   * @param chatHistory
+   * @param toolChoice
+   * @returns
+   */
+  public async chat({
+    message,
+    chatHistory,
+    toolChoice,
+  }: ChatEngineAgentParams): Promise<AgentChatResponse> {
+    if (!toolChoice) {
+      toolChoice = this.defaultToolChoice;
+    }
+
+    const chatResponse = await this._chat({
+      message,
+      chatHistory,
+      toolChoice,
+      mode: ChatResponseMode.WAIT,
+    });
+
+    return chatResponse;
+  }
+
+  protected _getPromptModules(): string[] {
+    return [];
+  }
+
+  protected _getPrompts(): string[] {
+    return [];
+  }
+
+  /**
+   * Resets the agent.
+   */
+  reset(): void {
+    this.state = new AgentState();
+  }
+
+  getCompletedStep(
+    taskId: string,
+    stepId: string,
+    kwargs: any,
+  ): TaskStepOutput {
+    const completedSteps = this.getCompletedSteps(taskId);
+    for (const stepOutput of completedSteps) {
+      if (stepOutput.taskStep.stepId === stepId) {
+        return stepOutput;
+      }
+    }
+
+    throw new Error(`Step ${stepId} not found in task ${taskId}`);
+  }
+
+  /**
+   * Undoes the step.
+   * @param taskId
+   */
+  undoStep(taskId: string): void {}
+}
@@ -0,0 +1,102 @@
+import { AgentChatResponse } from "../../engines/chat";
+import { BaseAgent, Task, TaskStep, TaskStepOutput } from "../types";
+
+export class TaskState {
+  task!: Task;
+  stepQueue!: TaskStep[];
+  completedSteps!: TaskStepOutput[];
+
+  constructor(init?: Partial<TaskState>) {
+    Object.assign(this, init);
+  }
+}
+
+export abstract class BaseAgentRunner extends BaseAgent {
+  constructor(init?: Partial<BaseAgentRunner>) {
+    super();
+  }
+
+  abstract createTask(input: string, kwargs: any): Task;
+  abstract deleteTask(taskId: string): void;
+  abstract getTask(taskId: string, kwargs: any): Task;
+  abstract listTasks(kwargs: any): Task[];
+  abstract getUpcomingSteps(taskId: string, kwargs: any): TaskStep[];
+  abstract getCompletedSteps(taskId: string, kwargs: any): TaskStepOutput[];
+
+  getCompletedStep(
+    taskId: string,
+    stepId: string,
+    kwargs: any,
+  ): TaskStepOutput {
+    const completedSteps = this.getCompletedSteps(taskId, kwargs);
+    for (const stepOutput of completedSteps) {
+      if (stepOutput.taskStep.stepId === stepId) {
+        return stepOutput;
+      }
+    }
+
+    throw new Error(`Step ${stepId} not found in task ${taskId}`);
+  }
+
+  abstract runStep(
+    taskId: string,
+    input: string,
+    step: TaskStep,
+    kwargs: any,
+  ): Promise<TaskStepOutput>;
+
+  abstract streamStep(
+    taskId: string,
+    input: string,
+    step: TaskStep,
+    kwargs?: any,
+  ): Promise<TaskStepOutput>;
+
+  abstract finalizeResponse(
+    taskId: string,
+    stepOutput: TaskStepOutput,
+    kwargs?: any,
+  ): Promise<AgentChatResponse>;
+
+  abstract undoStep(taskId: string): void;
+}
+
+export class AgentState {
+  taskDict!: Record<string, TaskState>;
+
+  constructor(init?: Partial<AgentState>) {
+    Object.assign(this, init);
+
+    if (!this.taskDict) {
+      this.taskDict = {};
+    }
+  }
+
+  getTask(taskId: string): Task {
+    return this.taskDict[taskId].task;
+  }
+
+  getCompletedSteps(taskId: string): TaskStepOutput[] {
+    return this.taskDict[taskId].completedSteps || [];
+  }
+
+  getStepQueue(taskId: string): TaskStep[] {
+    return this.taskDict[taskId].stepQueue || [];
+  }
+
+  addSteps(taskId: string, steps: TaskStep[]): void {
+    if (!this.taskDict[taskId].stepQueue) {
+      this.taskDict[taskId].stepQueue = [];
+    }
+
+    this.taskDict[taskId].stepQueue.push(...steps);
+  }
+
+  addCompletedStep(taskId: string, stepOutputs: TaskStepOutput[]): void {
+    if (!this.taskDict[taskId].completedSteps) {
+      this.taskDict[taskId].completedSteps = [];
+    }
+
+    this.taskDict[taskId].completedSteps.push(...stepOutputs);
+  }
+}
@@ -0,0 +1,181 @@
+import { AgentChatResponse, ChatEngineAgentParams } from "../engines/chat";
+import { QueryEngineParamsNonStreaming } from "../types";
+
+export interface AgentWorker {
+  initializeStep(task: Task, kwargs?: any): TaskStep;
+  runStep(step: TaskStep, task: Task, kwargs?: any): Promise<TaskStepOutput>;
+  streamStep(step: TaskStep, task: Task, kwargs?: any): Promise<TaskStepOutput>;
+  finalizeTask(task: Task, kwargs?: any): void;
+}
+
+interface BaseChatEngine {
+  chat(params: ChatEngineAgentParams): Promise<AgentChatResponse>;
+}
+
+interface BaseQueryEngine {
+  query(params: QueryEngineParamsNonStreaming): Promise<AgentChatResponse>;
+}
+
+/**
+ * BaseAgent is the base class for all agents.
+ */
+export abstract class BaseAgent implements BaseChatEngine, BaseQueryEngine {
+  protected _getPrompts(): string[] {
+    return [];
+  }
+
+  protected _getPromptModules(): string[] {
+    return [];
+  }
+
+  abstract chat(params: ChatEngineAgentParams): Promise<AgentChatResponse>;
+  abstract reset(): void;
+
+  /**
+   * query is the main entrypoint for the agent. It takes a query and returns a response.
+   * @param params
+   * @returns
+   */
+  async query(
+    params: QueryEngineParamsNonStreaming,
+  ): Promise<AgentChatResponse> {
+    // Handle non-streaming query
+    const agentResponse = await this.chat({
+      message: params.query,
+      chatHistory: [],
+    });
+
+    return agentResponse;
+  }
+}
+
+type TaskParams = {
+  taskId: string;
+  input: string;
+  memory: any;
+  extraState: Record<string, any>;
+};
+
+/**
+ * Task is a unit of work for the agent.
+ * @param taskId: taskId
+ */
+export class Task {
+  taskId: string;
+  input: string;
+
+  memory: any;
+  extraState: Record<string, any>;
+
+  constructor({ taskId, input, memory, extraState }: TaskParams) {
+    this.taskId = taskId;
+    this.input = input;
+    this.memory = memory;
+    this.extraState = extraState ?? {};
+  }
+}
+
+interface ITaskStep {
+  taskId: string;
+  stepId: string;
+  input?: string | null;
+  stepState: Record<string, any>;
+  nextSteps: Record<string, TaskStep>;
+  prevSteps: Record<string, TaskStep>;
+  isReady: boolean;
+  getNextStep(
+    stepId: string,
+    input?: string,
+    stepState?: Record<string, any>,
+  ): TaskStep;
+  linkStep(nextStep: TaskStep): void;
+}
+
+/**
+ * TaskStep is a single step within a Task.
+ * @param taskId: taskId
+ * @param stepId: stepId
+ * @param input: input
+ * @param stepState: stepState
+ */
+export class TaskStep implements ITaskStep {
+  taskId: string;
+  stepId: string;
+  input?: string | null;
+  stepState: Record<string, any> = {};
+  nextSteps: Record<string, TaskStep> = {};
+  prevSteps: Record<string, TaskStep> = {};
+  isReady: boolean = true;
+
+  constructor(
+    taskId: string,
+    stepId: string,
+    input?: string | null,
+    stepState?: Record<string, any> | null,
+  ) {
+    this.taskId = taskId;
+    this.stepId = stepId;
+    this.input = input;
+    this.stepState = stepState ?? this.stepState;
+  }
+
+  /**
+   * getNextStep is a function that returns the next step.
+   * @param stepId: stepId
+   * @param input: input
+   * @param stepState: stepState
+   * @returns: TaskStep
+   */
+  getNextStep(
+    stepId: string,
+    input?: string,
+    stepState?: Record<string, unknown>,
+  ): TaskStep {
+    return new TaskStep(
+      this.taskId,
+      stepId,
+      input,
+      stepState ?? this.stepState,
+    );
+  }
+
+  /**
+   * linkStep is a function that links the next step.
+   * @param nextStep: nextStep
+   * @returns: void
+   */
+  linkStep(nextStep: TaskStep): void {
+    this.nextSteps[nextStep.stepId] = nextStep;
+    nextStep.prevSteps[this.stepId] = this;
+  }
+}
+
+/**
+ * TaskStepOutput is the output of running a single TaskStep.
+ * @param output: output
+ * @param taskStep: taskStep
+ * @param nextSteps: nextSteps
+ * @param isLast: isLast
+ */
+export class TaskStepOutput {
+  output: unknown;
+  taskStep: TaskStep;
+  nextSteps: TaskStep[];
+  isLast: boolean;
+
+  constructor(
+    output: unknown,
+    taskStep: TaskStep,
+    nextSteps: TaskStep[],
+    isLast: boolean = false,
+  ) {
+    this.output = output;
+    this.taskStep = taskStep;
+    this.nextSteps = nextSteps;
+    this.isLast = isLast;
+  }
+
+  toString(): string {
+    return String(this.output);
+  }
+}
@@ -0,0 +1,51 @@
+import { ChatMessage } from "../llm";
+import { ChatMemoryBuffer } from "../memory/ChatMemoryBuffer";
+import { BaseTool } from "../types";
+import { TaskStep } from "./types";
+
+/**
+ * Adds the user's input to the memory.
+ *
+ * @param step - The step to add to the memory.
+ * @param memory - The memory to add the step to.
+ * @param verbose - Whether to print debug messages.
+ */
+export function addUserStepToMemory(
+  step: TaskStep,
+  memory: ChatMemoryBuffer,
+  verbose: boolean = false,
+): void {
+  if (!step.input) {
+    return;
+  }
+
+  const userMessage: ChatMessage = {
+    content: step.input,
+    role: "user",
+  };
+
+  memory.put(userMessage);
+
+  if (verbose) {
+    console.log(`Added user message to memory!: ${userMessage.content}`);
+  }
+}
+
+/**
+ * Get function by name.
+ * @param tools: tools
+ * @param name: name
+ * @returns: tool
+ */
+export function getFunctionByName(tools: BaseTool[], name: string): BaseTool {
+  const nameToTool: { [key: string]: BaseTool } = {};
+  tools.forEach((tool) => {
+    nameToTool[tool.metadata.name] = tool;
+  });
+
+  if (!(name in nameToTool)) {
+    throw new Error(`Tool with name ${name} not found`);
+  }
+
+  return nameToTool[name];
+}
@@ -1,3 +1,4 @@
+import type { Anthropic } from "@anthropic-ai/sdk";
 import { NodeWithScore } from "../Node";

 /*
@@ -39,14 +40,7 @@ export interface DefaultStreamToken {
 //OpenAI stream token schema is the default.
 //Note: Anthropic and Replicate also use similar token schemas.
 export type OpenAIStreamToken = DefaultStreamToken;
-export type AnthropicStreamToken = {
-  completion: string;
-  model: string;
-  stop_reason: string | undefined;
-  stop?: boolean | undefined;
-  log_id?: string;
-};
-
+export type AnthropicStreamToken = Anthropic.Completion;
 //
 //Callback Responses
 //
@@ -36,7 +36,7 @@ export class HuggingFaceEmbedding extends BaseEmbedding {
    return this.extractor;
  }

-  async getTextEmbedding(text: string): Promise<number[]> {
+  override async getTextEmbedding(text: string): Promise<number[]> {
    const extractor = await this.getExtractor();
    const output = await extractor(text, { pooling: "mean", normalize: true });
    return Array.from(output.data);
@@ -0,0 +1,7 @@
+import { Ollama } from "../llm/ollama";
+import { BaseEmbedding } from "./types";
+
+/**
+ * OllamaEmbedding is an alias for Ollama that implements the BaseEmbedding interface.
+ */
+export class OllamaEmbedding extends Ollama implements BaseEmbedding {}
@@ -59,7 +59,9 @@ export class OpenAIEmbedding extends BaseEmbedding {
    this.model = init?.model ?? "text-embedding-ada-002";
    this.dimensions = init?.dimensions; // if no dimensions provided, will be undefined/not sent to OpenAI

+    this.embedBatchSize = init?.embedBatchSize ?? 10;
    this.maxRetries = init?.maxRetries ?? 10;
+
    this.timeout = init?.timeout ?? 60 * 1000; // Default is 60 seconds
    this.additionalSessionOptions = init?.additionalSessionOptions;

@@ -100,21 +102,43 @@ export class OpenAIEmbedding extends BaseEmbedding {
    }
  }

-  private async getOpenAIEmbedding(input: string) {
+  /**
+   * Get embeddings for a batch of texts
+   * @param input the texts to embed in a single request
+   * @returns one embedding per input text
+   */
+  private async getOpenAIEmbedding(input: string[]): Promise<number[][]> {
    const { data } = await this.session.openai.embeddings.create({
      model: this.model,
      dimensions: this.dimensions, // only sent to OpenAI if set by user
      input,
    });

-    return data[0].embedding;
+    return data.map((d) => d.embedding);
  }

+  /**
+   * Get embeddings for a batch of texts
+   * @param texts
+   */
+  async getTextEmbeddings(texts: string[]): Promise<number[][]> {
+    return await this.getOpenAIEmbedding(texts);
+  }
+
+  /**
+   * Get embeddings for a single text
+   * @param text
+   */
  async getTextEmbedding(text: string): Promise<number[]> {
-    return this.getOpenAIEmbedding(text);
+    return (await this.getOpenAIEmbedding([text]))[0];
  }

+  /**
+   * Get embeddings for a query
+   * @param query
+   * @returns embedding vector for the query
+   */
  async getQueryEmbedding(query: string): Promise<number[]> {
-    return this.getOpenAIEmbedding(query);
+    return (await this.getOpenAIEmbedding([query]))[0];
  }
 }
@@ -2,6 +2,7 @@ export * from "./ClipEmbedding";
 export * from "./HuggingFaceEmbedding";
 export * from "./MistralAIEmbedding";
 export * from "./MultiModalEmbedding";
+export { OllamaEmbedding } from "./OllamaEmbedding";
 export * from "./OpenAIEmbedding";
 export { TogetherEmbedding } from "./together";
 export * from "./types";
@@ -2,7 +2,11 @@ import { BaseNode, MetadataMode } from "../Node";
 import { TransformComponent } from "../ingestion";
 import { SimilarityType, similarity } from "./utils";

+const DEFAULT_EMBED_BATCH_SIZE = 10;
+
 export abstract class BaseEmbedding implements TransformComponent {
+  embedBatchSize = DEFAULT_EMBED_BATCH_SIZE;
+
  similarity(
    embedding1: number[],
    embedding2: number[],
@@ -14,12 +18,66 @@ export abstract class BaseEmbedding implements TransformComponent {
  abstract getTextEmbedding(text: string): Promise<number[]>;
  abstract getQueryEmbedding(query: string): Promise<number[]>;

-  async transform(nodes: BaseNode[], _options?: any): Promise<BaseNode[]> {
-    for (const node of nodes) {
-      node.embedding = await this.getTextEmbedding(
-        node.getContent(MetadataMode.EMBED),
-      );
+  /**
+   * Optionally override this method to retrieve multiple embeddings in a single request
+   * @param texts
+   */
+  async getTextEmbeddings(texts: string[]): Promise<Array<number[]>> {
+    const embeddings: number[][] = [];
+
+    for (const text of texts) {
+      const embedding = await this.getTextEmbedding(text);
+      embeddings.push(embedding);
    }
+
+    return embeddings;
+  }
+
+  /**
+   * Get embeddings for a batch of texts
+   * @param texts
+   * @param options
+   */
+  async getTextEmbeddingsBatch(
+    texts: string[],
+    options?: {
+      logProgress?: boolean;
+    },
+  ): Promise<Array<number[]>> {
+    const resultEmbeddings: Array<number[]> = [];
+    const chunkSize = this.embedBatchSize;
+
+    const queue: string[] = texts;
+
+    const curBatch: string[] = [];
+
+    for (let i = 0; i < queue.length; i++) {
+      curBatch.push(queue[i]);
+      if (i === queue.length - 1 || curBatch.length === chunkSize) {
+        const embeddings = await this.getTextEmbeddings(curBatch);
+
+        resultEmbeddings.push(...embeddings);
+
+        if (options?.logProgress) {
+          console.log(`getting embedding progress: ${i + 1} / ${queue.length}`);
+        }
+
+        curBatch.length = 0;
+      }
+    }
+
+    return resultEmbeddings;
+  }
+
+  async transform(nodes: BaseNode[], _options?: any): Promise<BaseNode[]> {
+    const texts = nodes.map((node) => node.getContent(MetadataMode.EMBED));
+
+    const embeddings = await this.getTextEmbeddingsBatch(texts);
+
+    for (let i = 0; i < nodes.length; i++) {
+      nodes[i].embedding = embeddings[i];
+    }
+
    return nodes;
  }
 }
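Review note: the loop in `getTextEmbeddingsBatch` amounts to "split the texts into chunks of at most `embedBatchSize`, then embed each chunk in order". Below is a minimal standalone sketch of that idea, not the library's implementation. `chunk`, `embedInBatches`, and the `EmbedMany` callback are hypothetical names introduced here for illustration; `EmbedMany` stands in for `getTextEmbeddings`.

```typescript
// Hypothetical stand-in for BaseEmbedding.getTextEmbeddings.
type EmbedMany = (texts: string[]) => Promise<number[][]>;

// Split a list into consecutive chunks of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  if (size <= 0) {
    throw new Error("size must be positive");
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Embed all texts with one call per chunk, preserving input order.
async function embedInBatches(
  texts: string[],
  embedMany: EmbedMany,
  batchSize: number = 10,
): Promise<number[][]> {
  const result: number[][] = [];
  for (const batch of chunk(texts, batchSize)) {
    const embeddings = await embedMany(batch);
    result.push(...embeddings);
  }
  return result;
}
```

Slicing into fresh arrays avoids reusing one mutable `curBatch` buffer, so a callback that holds on to its input never sees it emptied later.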
@@ -22,11 +22,17 @@ export class DefaultContextGenerator implements ContextGenerator {
    this.nodePostprocessors = init.nodePostprocessors || [];
  }

-  private applyNodePostprocessors(nodes: NodeWithScore[]) {
-    return this.nodePostprocessors.reduce(
-      (nodes, nodePostprocessor) => nodePostprocessor.postprocessNodes(nodes),
-      nodes,
-    );
+  private async applyNodePostprocessors(nodes: NodeWithScore[], query: string) {
+    let nodesWithScore = nodes;
+
+    for (const postprocessor of this.nodePostprocessors) {
+      nodesWithScore = await postprocessor.postprocessNodes(
+        nodesWithScore,
+        query,
+      );
+    }
+
+    return nodesWithScore;
  }

  async generate(message: string, parentEvent?: Event): Promise<Context> {
@@ -42,7 +48,10 @@ export class DefaultContextGenerator implements ContextGenerator {
      parentEvent,
    );

-    const nodes = this.applyNodePostprocessors(sourceNodesWithScore);
+    const nodes = await this.applyNodePostprocessors(
+      sourceNodesWithScore,
+      message,
+    );

    return {
      message: {
@@ -1,9 +1,10 @@
 import { ChatHistory } from "../../ChatHistory";
-import { NodeWithScore } from "../../Node";
+import { BaseNode, NodeWithScore } from "../../Node";
 import { Response } from "../../Response";
 import { Event } from "../../callbacks/CallbackManager";
 import { ChatMessage } from "../../llm";
 import { MessageContent } from "../../llm/types";
+import { ToolOutput } from "../../tools/types";

 /**
 * Represents the base parameters for ChatEngine.
@@ -24,6 +25,10 @@ export interface ChatEngineParamsNonStreaming extends ChatEngineParamsBase {
  stream?: false | null;
 }

+export interface ChatEngineAgentParams extends ChatEngineParamsBase {
+  toolChoice?: string | Record<string, any>;
+}
+
 /**
 * A ChatEngine is used to handle back and forth chats between the application and the LLM.
 */
@@ -52,3 +57,32 @@ export interface Context {
 export interface ContextGenerator {
  generate(message: string, parentEvent?: Event): Promise<Context>;
 }
+
+export enum ChatResponseMode {
+  WAIT = "wait",
+  STREAM = "stream",
+}
+
+export class AgentChatResponse {
+  response: string;
+  sources: ToolOutput[];
+  sourceNodes?: BaseNode[];
+
+  constructor(
+    response: string,
+    sources?: ToolOutput[],
+    sourceNodes?: BaseNode[],
+  ) {
+    this.response = response;
+    this.sources = sources || [];
+    this.sourceNodes = sourceNodes || [];
+  }
+
+  protected _getFormattedSources() {
+    throw new Error("Not implemented yet");
+  }
+
+  toString() {
+    return this.response ?? "";
+  }
+}
@@ -36,11 +36,17 @@ export class RetrieverQueryEngine implements BaseQueryEngine {
    this.nodePostprocessors = nodePostprocessors || [];
  }

-  private applyNodePostprocessors(nodes: NodeWithScore[]) {
-    return this.nodePostprocessors.reduce(
-      (nodes, nodePostprocessor) => nodePostprocessor.postprocessNodes(nodes),
-      nodes,
-    );
+  private async applyNodePostprocessors(nodes: NodeWithScore[], query: string) {
+    let nodesWithScore = nodes;
+
+    for (const postprocessor of this.nodePostprocessors) {
+      nodesWithScore = await postprocessor.postprocessNodes(
+        nodesWithScore,
+        query,
+      );
+    }
+
+    return nodesWithScore;
  }

  private async retrieve(query: string, parentEvent: Event) {
@@ -50,7 +56,7 @@ export class RetrieverQueryEngine implements BaseQueryEngine {
      this.preFilters,
    );

-    return this.applyNodePostprocessors(nodes);
+    return await this.applyNodePostprocessors(nodes, query);
  }

  query(params: QueryEngineParamsStreaming): Promise<AsyncIterable<Response>>;
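Review note: the old `reduce` could not await each step, which matters once a postprocessor (for example a reranker that calls a remote API) returns a promise. The sketch below shows the sequential pattern in isolation. `ScoredNode`, `Postprocessor`, `applyAll`, and the two sample postprocessors are hypothetical simplifications, not the library's types.

```typescript
// Simplified, hypothetical stand-ins for NodeWithScore and the
// postprocessor interface used in the diff above.
interface ScoredNode {
  text: string;
  score: number;
}

interface Postprocessor {
  postprocessNodes(nodes: ScoredNode[], query?: string): Promise<ScoredNode[]>;
}

// Run each postprocessor in order, feeding the output of one into the next.
async function applyAll(
  processors: Postprocessor[],
  nodes: ScoredNode[],
  query: string,
): Promise<ScoredNode[]> {
  let current = nodes;
  for (const processor of processors) {
    current = await processor.postprocessNodes(current, query);
  }
  return current;
}

// Sample postprocessor: drop nodes below a fixed score.
const dropLowScores: Postprocessor = {
  async postprocessNodes(nodes) {
    return nodes.filter((n) => n.score >= 0.5);
  },
};

// Sample postprocessor: keep nodes whose text mentions the query.
const keepQueryMatches: Postprocessor = {
  async postprocessNodes(nodes, query) {
    return query ? nodes.filter((n) => n.text.includes(query)) : nodes;
  },
};
```

Passing `query` through every step is what lets query-aware postprocessors sit in the same chain as purely score-based ones.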
@@ -14,9 +14,9 @@ import {
 } from "../../synthesizers";
 import {
  BaseQueryEngine,
+  BaseTool,
  QueryEngineParamsNonStreaming,
  QueryEngineParamsStreaming,
-  QueryEngineTool,
  ToolMetadata,
 } from "../../types";
 import { BaseQuestionGenerator, SubQuestion } from "./types";
@@ -27,28 +27,23 @@ import { BaseQuestionGenerator, SubQuestion } from "./types";
 export class SubQuestionQueryEngine implements BaseQueryEngine {
  responseSynthesizer: BaseSynthesizer;
  questionGen: BaseQuestionGenerator;
-  queryEngines: Record<string, BaseQueryEngine>;
+  queryEngines: BaseTool[];
  metadatas: ToolMetadata[];

  constructor(init: {
    questionGen: BaseQuestionGenerator;
    responseSynthesizer: BaseSynthesizer;
-    queryEngineTools: QueryEngineTool[];
+    queryEngineTools: BaseTool[];
  }) {
    this.questionGen = init.questionGen;
    this.responseSynthesizer =
      init.responseSynthesizer ?? new ResponseSynthesizer();
-    this.queryEngines = init.queryEngineTools.reduce<
-      Record<string, BaseQueryEngine>
-    >((acc, tool) => {
-      acc[tool.metadata.name] = tool.queryEngine;
-      return acc;
-    }, {});
+    this.queryEngines = init.queryEngineTools;
    this.metadatas = init.queryEngineTools.map((tool) => tool.metadata);
  }

  static fromDefaults(init: {
-    queryEngineTools: QueryEngineTool[];
+    queryEngineTools: BaseTool[];
    questionGen?: BaseQuestionGenerator;
    responseSynthesizer?: BaseSynthesizer;
    serviceContext?: ServiceContext;
@@ -122,13 +117,24 @@ export class SubQuestionQueryEngine implements BaseQueryEngine {
  ): Promise<NodeWithScore | null> {
    try {
      const question = subQ.subQuestion;
-      const queryEngine = this.queryEngines[subQ.toolName];

-      const response = await queryEngine.query({
+      const queryEngine = this.queryEngines.find(
+        (tool) => tool.metadata.name === subQ.toolName,
+      );
+
+      if (!queryEngine) {
+        return null;
+      }
+
+      const responseText = await queryEngine?.call?.({
        query: question,
        parentEvent,
      });
-      const responseText = response.response;
+
+      if (!responseText) {
+        return null;
+      }
+
      const nodeText = `Sub question: ${question}\nResponse: ${responseText}`;
      const node = new TextNode({ text: nodeText });
      return { node, score: 0 };
@@ -9,6 +9,7 @@ export * from "./Response";
 export * from "./Retriever";
 export * from "./ServiceContext";
 export * from "./TextSplitter";
+export * from "./agent";
 export * from "./callbacks/CallbackManager";
 export * from "./constants";
 export * from "./embeddings";
@@ -20,17 +21,8 @@ export * from "./ingestion";
 export * from "./llm";
 export * from "./nodeParsers";
 export * from "./postprocessors";
-export * from "./readers/AssemblyAIReader";
-export * from "./readers/CSVReader";
-export * from "./readers/DocxReader";
-export * from "./readers/HTMLReader";
-export * from "./readers/MarkdownReader";
-export * from "./readers/NotionReader";
-export * from "./readers/PDFReader";
-export * from "./readers/SimpleDirectoryReader";
-export * from "./readers/SimpleMongoReader";
-export * from "./readers/base";
+export * from "./readers";
 export * from "./selectors";
 export * from "./storage";
 export * from "./synthesizers";
-export type * from "./types";
+export * from "./tools";
@@ -17,6 +17,12 @@ import { VectorStoreIndex } from "./VectorStoreIndex";
 * VectorIndexRetriever retrieves nodes from a VectorIndex.
 */

+export type VectorIndexRetrieverOptions = {
+  index: VectorStoreIndex;
+  similarityTopK?: number;
+  imageSimilarityTopK?: number;
+};
+
 export class VectorIndexRetriever implements BaseRetriever {
  index: VectorStoreIndex;
  similarityTopK: number;
@@ -27,11 +33,7 @@ export class VectorIndexRetriever implements BaseRetriever {
    index,
    similarityTopK,
    imageSimilarityTopK,
-  }: {
-    index: VectorStoreIndex;
-    similarityTopK?: number;
-    imageSimilarityTopK?: number;
-  }) {
+  }: VectorIndexRetrieverOptions) {
    this.index = index;
    this.serviceContext = this.index.serviceContext;
    this.similarityTopK = similarityTopK ?? DEFAULT_SIMILARITY_TOP_K;
@@ -34,7 +34,10 @@ import {
  IndexDict,
  IndexStructType,
 } from "../BaseIndex";
-import { VectorIndexRetriever } from "./VectorIndexRetriever";
+import {
+  VectorIndexRetriever,
+  VectorIndexRetrieverOptions,
+} from "./VectorIndexRetriever";

 interface IndexStructOptions {
  indexStruct?: IndexDict;
@@ -163,20 +166,14 @@ export class VectorStoreIndex extends BaseIndex<IndexDict> {
    nodes: BaseNode[],
    options?: { logProgress?: boolean },
  ): Promise<BaseNode[]> {
-    const nodesWithEmbeddings: BaseNode[] = [];
-
-    for (let i = 0; i < nodes.length; ++i) {
-      const node = nodes[i];
-      if (options?.logProgress) {
-        console.log(`Getting embedding for node ${i + 1}/${nodes.length}`);
-      }
-      node.embedding = await this.embedModel.getTextEmbedding(
-        node.getContent(MetadataMode.EMBED),
-      );
-      nodesWithEmbeddings.push(node);
-    }
-
-    return nodesWithEmbeddings;
+    const texts = nodes.map((node) => node.getContent(MetadataMode.EMBED));
+    const embeddings = await this.embedModel.getTextEmbeddingsBatch(texts, {
+      logProgress: options?.logProgress,
+    });
+    return nodes.map((node, i) => {
+      node.embedding = embeddings[i];
+      return node;
+    });
  }

  /**
@@ -260,7 +257,9 @@ export class VectorStoreIndex extends BaseIndex<IndexDict> {
    return index;
  }

-  asRetriever(options?: any): VectorIndexRetriever {
+  asRetriever(
+    options?: Omit<VectorIndexRetrieverOptions, "index">,
+  ): VectorIndexRetriever {
    return new VectorIndexRetriever({ index: this, ...options });
  }
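Review note: typing `options` as `Omit<VectorIndexRetrieverOptions, "index">` means callers can tune the top-k values but cannot pass their own `index`. A reduced sketch of the same pattern follows. `RetrieverOptions`, `Retriever`, `Index`, and the default of 2 are hypothetical stand-ins, not the library's classes or constants.

```typescript
// Hypothetical, reduced versions of the option type and classes.
type RetrieverOptions = {
  index: string;
  similarityTopK?: number;
};

class Retriever {
  index: string;
  similarityTopK: number;

  constructor({ index, similarityTopK }: RetrieverOptions) {
    this.index = index;
    // Assumed default for illustration only.
    this.similarityTopK = similarityTopK ?? 2;
  }
}

class Index {
  name: string;

  constructor(name: string) {
    this.name = name;
  }

  // Callers may set everything except `index`, which is always this index.
  asRetriever(options?: Omit<RetrieverOptions, "index">): Retriever {
    return new Retriever({ index: this.name, ...options });
  }
}
```

With the previous `options?: any` signature, a caller-supplied `index` key would have silently replaced `this` in the spread; the `Omit` type turns that into a compile-time error.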

@@ -1,5 +1,5 @@
 import { BaseNode, Document } from "../Node";
-import { BaseReader } from "../readers/base";
+import { BaseReader } from "../readers/type";
 import { BaseDocumentStore, VectorStore } from "../storage";
 import { IngestionCache, getTransformationHash } from "./IngestionCache";
 import { DocStoreStrategy, createDocStoreStrategy } from "./strategies";
@@ -77,7 +77,14 @@ export class OpenAI extends BaseLLM {
  maxTokens?: number;
  additionalChatOptions?: Omit<
    Partial<OpenAILLM.Chat.ChatCompletionCreateParams>,
-    "max_tokens" | "messages" | "model" | "temperature" | "top_p" | "stream"
+    | "max_tokens"
+    | "messages"
+    | "model"
+    | "temperature"
+    | "top_p"
+    | "stream"
+    | "tools"
+    | "tool_choice"
  >;

  // OpenAI session params
@@ -179,7 +186,7 @@ export class OpenAI extends BaseLLM {

  mapMessageType(
    messageType: MessageType,
-  ): "user" | "assistant" | "system" | "function" {
+  ): "user" | "assistant" | "system" | "function" | "tool" {
    switch (messageType) {
      case "user":
        return "user";
@@ -189,11 +196,30 @@ export class OpenAI extends BaseLLM {
        return "system";
      case "function":
        return "function";
+      case "tool":
+        return "tool";
      default:
        return "user";
    }
  }

+  toOpenAIMessage(messages: ChatMessage[]) {
+    return messages.map((message) => {
+      const additionalKwargs = { ...(message.additionalKwargs ?? {}) };
+
+      if (message.additionalKwargs?.toolCalls) {
+        additionalKwargs.tool_calls = message.additionalKwargs.toolCalls;
+        delete additionalKwargs.toolCalls;
+      }
+
+      return {
+        role: this.mapMessageType(message.role),
+        content: message.content,
+        ...additionalKwargs,
+      };
+    });
+  }
+
  chat(
    params: LLMChatParamsStreaming,
  ): Promise<AsyncIterable<ChatResponseChunk>>;
@@ -201,18 +227,15 @@ export class OpenAI extends BaseLLM {
  async chat(
    params: LLMChatParamsNonStreaming | LLMChatParamsStreaming,
  ): Promise<ChatResponse | AsyncIterable<ChatResponseChunk>> {
-    const { messages, parentEvent, stream } = params;
-    const baseRequestParams: OpenAILLM.Chat.ChatCompletionCreateParams = {
+    const { messages, parentEvent, stream, tools, toolChoice } = params;
+
+    let baseRequestParams: OpenAILLM.Chat.ChatCompletionCreateParams = {
      model: this.model,
      temperature: this.temperature,
      max_tokens: this.maxTokens,
-      messages: messages.map(
-        (message) =>
-          ({
-            role: this.mapMessageType(message.role),
-            content: message.content,
-          }) as ChatCompletionMessageParam,
-      ),
+      tools: tools,
+      tool_choice: toolChoice,
+      messages: this.toOpenAIMessage(messages) as ChatCompletionMessageParam[],
      top_p: this.topP,
      ...this.additionalChatOptions,
    };
@@ -221,6 +244,7 @@ export class OpenAI extends BaseLLM {
    if (stream) {
      return this.streamChat(params);
    }
+
    // Non-streaming
    const response = await this.session.openai.chat.completions.create({
      ...baseRequestParams,
@@ -228,8 +252,19 @@ export class OpenAI extends BaseLLM {
    });

    const content = response.choices[0].message?.content ?? "";
+
+    const kwargsOutput: Record<string, any> = {};
+
+    if (response.choices[0].message?.tool_calls) {
+      kwargsOutput.toolCalls = response.choices[0].message.tool_calls;
+    }
+
    return {
-      message: { content, role: response.choices[0].message.role },
+      message: {
+        content,
+        role: response.choices[0].message.role,
+        additionalKwargs: kwargsOutput,
+      },
    };
  }
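Review note: `toOpenAIMessage` renames the camelCase `toolCalls` key to the snake_case `tool_calls` key that the OpenAI API expects, and `chat` does the reverse on the response. The sketch below shows the outbound rename done without touching the caller's message object. `Msg` and `toWireMessage` are hypothetical names used only for this example.

```typescript
// Hypothetical, reduced message shape for illustration.
interface Msg {
  role: string;
  content: string;
  additionalKwargs?: Record<string, unknown>;
}

// Build the wire-format message: copy extra kwargs and rename
// `toolCalls` to `tool_calls`, leaving the input object untouched.
function toWireMessage(message: Msg): Record<string, unknown> {
  const kwargs: Record<string, unknown> = message.additionalKwargs ?? {};
  const { toolCalls, ...rest } = kwargs;
  return {
    role: message.role,
    content: message.content,
    ...rest,
    ...(toolCalls !== undefined ? { tool_calls: toolCalls } : {}),
  };
}
```

Destructuring into `rest` produces a new object, so chat history that is reused across calls keeps its original `toolCalls` key.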

@@ -1,5 +1,5 @@
 import { CallbackManager, Event } from "../callbacks/CallbackManager";
-import { BaseEmbedding } from "../embeddings";
+import { BaseEmbedding } from "../embeddings/types";
 import { ok } from "../env";
 import {
  ChatMessage,
@@ -39,17 +39,20 @@ export type MessageType =
  | "system"
  | "generic"
  | "function"
-  | "memory";
+  | "memory"
+  | "tool";

 export interface ChatMessage {
  // TODO: use MessageContent
  content: any;
  role: MessageType;
+  additionalKwargs?: Record<string, any>;
 }

 export interface ChatResponse {
  message: ChatMessage;
  raw?: Record<string, any>;
+  additionalKwargs?: Record<string, any>;
 }

 export interface ChatResponseChunk {
@@ -74,6 +77,9 @@ export interface LLMChatParamsBase {
  messages: ChatMessage[];
  parentEvent?: Event;
  extraParams?: Record<string, any>;
+  tools?: any;
+  toolChoice?: any;
+  additionalKwargs?: Record<string, any>;
 }

 export interface LLMChatParamsStreaming extends LLMChatParamsBase {
@@ -0,0 +1,119 @@
+import { ChatMessage } from "../llm";
+import { SimpleChatStore } from "../storage/chatStore/SimpleChatStore";
+import { BaseChatStore } from "../storage/chatStore/types";
+import { BaseMemory } from "./types";
+
+type ChatMemoryBufferParams = {
+  tokenLimit?: number;
+  chatStore?: BaseChatStore;
+  chatStoreKey?: string;
+  chatHistory?: ChatMessage[];
+};
+
+/**
+ * Chat memory buffer.
+ */
+export class ChatMemoryBuffer implements BaseMemory {
+  tokenLimit: number;
+
+  chatStore: BaseChatStore;
+  chatStoreKey: string;
+
+  /**
+   * Initialize.
+   */
+  constructor(init?: Partial<ChatMemoryBufferParams>) {
+    this.tokenLimit = init?.tokenLimit ?? 3000;
+    this.chatStore = init?.chatStore ?? new SimpleChatStore();
+    this.chatStoreKey = init?.chatStoreKey ?? "chat_history";
+
+    if (init?.chatHistory) {
+      this.chatStore.setMessages(this.chatStoreKey, init.chatHistory);
+    }
+  }
+
+  /**
+   * Get chat history.
+   * @param initialTokenCount number of tokens to start with
+   */
+  get(initialTokenCount: number = 0): ChatMessage[] {
+    const chatHistory = this.getAll();
+
+    if (initialTokenCount > this.tokenLimit) {
+      throw new Error("Initial token count exceeds token limit");
+    }
+
+    let messageCount = chatHistory.length;
+    let tokenCount =
+      this._tokenCountForMessageCount(messageCount) + initialTokenCount;
+
+    while (tokenCount > this.tokenLimit && messageCount > 1) {
+      messageCount -= 1;
+      if (chatHistory[chatHistory.length - messageCount].role === "assistant") {
+        // we cannot have an assistant message at the start of the chat history
+        // if after removal of the first, we have an assistant message,
+        // we need to remove the assistant message too
+        messageCount -= 1;
+      }
+
+      tokenCount =
+        this._tokenCountForMessageCount(messageCount) + initialTokenCount;
+    }
+
+    // catch one message longer than token limit
+    if (tokenCount > this.tokenLimit || messageCount <= 0) {
+      return [];
+    }
+
+    return chatHistory.slice(-messageCount);
+  }
+
+  /**
+   * Get all chat history.
+   * @returns {ChatMessage[]} chat history
+   */
+  getAll(): ChatMessage[] {
+    return this.chatStore.getMessages(this.chatStoreKey);
+  }
+
+  /**
+   * Append a message to the chat history.
+   * @param message
+   */
+  put(message: ChatMessage): void {
+    this.chatStore.addMessage(this.chatStoreKey, message);
+  }
+
+  /**
+   * Set chat history.
+   * @param messages
+   */
+  set(messages: ChatMessage[]): void {
+    this.chatStore.setMessages(this.chatStoreKey, messages);
+  }
+
+  /**
+   * Reset chat history.
+   */
+  reset(): void {
+    this.chatStore.deleteMessages(this.chatStoreKey);
+  }
+
+  /**
+   * Estimate the token count of the last `messageCount` messages by counting space-separated words.
+   * @param messageCount
+   * @returns {number} token count
+   */
+  private _tokenCountForMessageCount(messageCount: number): number {
+    if (messageCount <= 0) {
+      return 0;
+    }
+
+    const chatHistory = this.getAll();
+    const msgStr = chatHistory
+      .slice(-messageCount)
+      .map((m) => m.content)
+      .join(" ");
+    return msgStr.split(" ").length;
+  }
+}
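Review note: `get` keeps the most recent messages that fit under `tokenLimit` and never lets the kept window start with an assistant message. JavaScript arrays do not support negative indices, so the first kept message has to be addressed as `length - count`. The sketch below restates the trimming logic as a pure function. `Message`, `countWords`, and `trimHistory` are hypothetical names, and the word count is the same rough token estimate used in the diff.

```typescript
// Hypothetical, reduced message shape for illustration.
interface Message {
  role: "user" | "assistant" | "system";
  content: string;
}

// Rough token estimate: number of space-separated words.
function countWords(messages: Message[]): number {
  if (messages.length === 0) {
    return 0;
  }
  return messages
    .map((m) => m.content)
    .join(" ")
    .split(" ").length;
}

// Keep the newest messages that fit within `tokenLimit`.
function trimHistory(history: Message[], tokenLimit: number): Message[] {
  let count = history.length;
  while (count > 1 && countWords(history.slice(-count)) > tokenLimit) {
    count -= 1;
    // The kept window must not open with an assistant message.
    if (history[history.length - count]?.role === "assistant") {
      count -= 1;
    }
  }
  // Nothing fits, or a single message is already over the limit.
  if (count <= 0 || countWords(history.slice(-count)) > tokenLimit) {
    return [];
  }
  return history.slice(-count);
}
```

The `count <= 0` check runs before any `slice(-count)` in the final test, because `slice(-0)` would return the whole array rather than an empty one.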
Author	SHA1	Message	Date
yisding	34ff2a9a0e	changeset	2024-02-11 04:16:02 +08:00
yisding	cc2c5f3c2e	remove unused turbo gen for ip sec vuln	2024-02-11 04:12:58 +08:00
yisding	269f4f6703	fix fastapi security vuln	2024-02-11 03:59:59 +08:00
Emanuel Ferreira	0b57187909	docs: add available LLMs (#536 )	2024-02-10 13:54:13 -03:00
Emanuel Ferreira	e78e9f4832	feat(reranker): cohere reranker (#535 )	2024-02-10 12:07:14 -03:00
Marcus Schiesser	383933adb5	feat: Add reader for LlamaParse (#530 )	2024-02-09 11:27:50 +07:00
Marcus Schiesser	dd054137bf	feat: use batching in vector store index (#524 ) Co-authored-by: Alex Yang <himself65@outlook.com> Co-authored-by: Emanuel Ferreira <contatoferreirads@gmail.com>	2024-02-08 08:59:56 -03:00
byteninja	cf3b7571eb	feat: add filtering of metadata to PGVectorStore (#525 )	2024-02-08 10:54:52 +07:00
Alex Yang	ae7a2c202a	fix: add alias class `OllamaEmbedding` (#527 )	2024-02-07 14:26:39 -06:00
Alex Yang	9b00d578bc	feat: improve reader interfaces (#498 )	2024-02-07 11:44:01 -06:00
Marcus Schiesser	b8173e4c4e	RELEASING: Releasing 1 package(s) Releases: create-llama@0.0.25 [skip ci]	2024-02-07 16:53:46 +07:00
Marcus Schiesser	67b5445fb9	fix(cl): improved error messages for python installation	2024-02-07 16:16:06 +07:00
Marcus Schiesser	87419ef5d1	Revert "fix: add handle error from template installation (#522 )" This reverts commit `ad218160d8`.	2024-02-07 16:01:08 +07:00
Huu Le (Lee)	ad218160d8	fix: add handle error from template installation (#522 ) Co-authored-by: Marcus Schiesser <mail@marcusschiesser.de>	2024-02-07 15:30:26 +07:00
Marcus Schiesser	eeb90d7991	fix(cl): add link to configure search tool	2024-02-07 14:07:57 +07:00
Marcus Schiesser	7b7329bd18	feat(cl): Added latest turbo models for GPT-3.5 and GPT 4	2024-02-07 12:46:19 +07:00
Alex Yang	b3acbb06f4	docs: update CONTRIBUTING.md (#516 )	2024-02-07 12:05:29 +07:00
Marcus Schiesser	7db7562841	fix(cl): just retrieve top-k 3 for context to prevent token exceed	2024-02-07 10:59:31 +07:00
yisding	0e75b124c3	minor update	2024-02-06 12:24:06 -08:00
yisding	d79a0b76f3	update packages	2024-02-06 11:55:38 -08:00
yisding	c3eb4933fb	RELEASING: Releasing 1 package(s) Releases: llamaindex@0.1.10 [skip ci]	2024-02-06 11:50:39 -08:00
yisding	e3a956aedd	pnpm install	2024-02-06 11:48:12 -08:00
yisding	e562e479dc	Merge branch 'main' of github.com:run-llama/LlamaIndexTS	2024-02-06 11:39:07 -08:00
Alex Yang	1900e019e3	build: fix build errors (#521 )	2024-02-06 12:54:08 -06:00
Emanuel Ferreira	317f140822	fix: revert embed batch temporarily (#520 )	2024-02-06 12:01:48 -03:00
Emanuel Ferreira	cd829474d6	feat(queryEngineTool): add query engine tool to agents (#509 )	2024-02-06 11:11:26 -03:00
Emanuel Ferreira	b6c1500570	feat(embedding): add batch embed size (#407 )	2024-02-06 10:19:14 -03:00
Huu Le (Lee)	d06a85bd34	feat: Add support for llamahub tools (#517 ) Co-authored-by: Marcus Schiesser <mail@marcusschiesser.de>	2024-02-06 17:34:03 +07:00
Ian Sinnott	6b9a2feac5	Consistent Document IDs in NotionReader.ts (#519 )	2024-02-06 15:52:29 +07:00
Mike Fortman	bd08004afe	Update Astra DB Vectorstore to support namespaces (#485 ) Co-authored-by: Marcus Schiesser <mail@marcusschiesser.de>	2024-02-06 11:31:08 +07:00
Emanuel Ferreira	0ecc4b2051	docs: minor fixes (#514 )	2024-02-05 14:12:49 +07:00
metonym	f9f351229a	Fix typo in starter example (#512 )	2024-02-05 14:11:26 +07:00
Mario Martinez	72659a237b	Convert keys from snakecase to camelcase (#510 )	2024-02-05 14:10:17 +07:00
Gavin Morgan	6cc3a36d44	fix: update `VectorIndexRetriever` constructor parameters' type. (#515 ) Co-authored-by: Alex Yang <himself65@outlook.com>	2024-02-04 18:22:04 -06:00
TechPandaPro	6fe55d6e88	docs: fix broken relative links in docs (#513 )	2024-02-04 06:20:17 -03:00
yisding	36f2903eb3	RELEASING: Releasing 1 package(s) Releases: llamaindex@0.1.9 [skip ci]	2024-02-02 11:26:00 -08:00
yisding	09464e6da7	docs(changeset): add OpenAIAgent (thanks @EmanuelCampos)	2024-02-02 11:23:03 -08:00
Emanuel Ferreira	955e084cf3	feat: OpenAI agent (#416 ) Co-authored-by: yisding <yi.s.ding@gmail.com>	2024-02-02 11:22:14 -08:00
yisding	46ee0c8765	Merge branch 'main' of github.com:run-llama/LlamaIndexTS	2024-02-02 08:31:02 -08:00
Emanuel Ferreira	da5391c018	docs(filtering): add metadata filtering (#508 )	2024-02-02 11:41:44 -03:00
Mario Martinez	ce732beece	Fix typo in PineconeVectorStore.ts (#507 )	2024-02-02 17:13:41 +07:00
TechPandaPro	889b70093c	fix: update deprecated pnpx to pnpm dlx (#501 )	2024-02-02 17:02:42 +07:00
Marcus Schiesser	7211a27f01	RELEASING: Releasing 1 package(s) Releases: create-llama@0.0.24 [skip ci]	2024-02-02 16:09:25 +07:00
Marcus Schiesser	ba95ca3fb6	feat(cl): Use condense plus context chat engine for FastAPI as default	2024-02-02 15:46:11 +07:00
Marcus Schiesser	ffdc507625	fix: upgrade ncc to fix template lookup	2024-02-02 15:34:46 +07:00
yisding	4016c55604	RELEASING: Releasing 2 package(s) Releases: llamaindex@0.1.8 docs@0.0.2 [skip ci]	2024-02-01 16:13:37 -08:00