Compare commits

...

30 Commits

Author SHA1 Message Date
github-actions[bot] f3bc2b61e7 Release (#2164) 2025-08-07 15:18:42 -06:00
Logan 4c703767b7 Adding GPT-5 support (#2163) 2025-08-07 13:39:47 -06:00
github-actions[bot] a27648200d Release (#2161) 2025-08-07 13:39:20 -06:00
abdeliibrahim c93bb02002 #2159 Remove unneeded console logs from gemini stream (#2160)
Co-authored-by: Marcus Schiesser <mail@marcusschiesser.de>
2025-08-07 11:38:35 +08:00
github-actions[bot] e9ded4e65f Release (#2154)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>
2025-08-06 12:18:06 +08:00
Marcus Schiesser 47a6f5fe5a chore: bump ollama (#2156) 2025-08-06 12:11:17 +08:00
Marcus Schiesser b80f33e264 chore: add opus 4.1 and fix prompt caching (#2155) 2025-08-06 11:54:27 +08:00
Alex Yang b6409b6823 chore: bump openai (#2152)
Co-authored-by: Marcus Schiesser <mail@marcusschiesser.de>
2025-08-06 10:58:45 +08:00
github-actions[bot] db3f556cb4 Release 0.11.26 (#2149)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>
2025-08-05 12:00:17 +08:00
Marcus Schiesser 4b5179169b chore: add deprecation to readme (#2150) 2025-08-05 11:53:35 +08:00
abdeliibrahim 971d37ceba fix(deepseek): add 'as const' assertion to DEEPSEEK_MODELS for correct TypeScript inference (#2148)
Co-authored-by: Marcus Schiesser <marcus.schiesser@googlemail.com>
2025-08-05 10:30:13 +08:00
github-actions[bot] 3e0ffdc688 Release 0.11.25 (#2144)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-07-31 12:18:18 +08:00
Marcus Schiesser 049471bade chore: deprecate cloud packages (#2143) 2025-07-31 12:12:56 +08:00
github-actions[bot] 1e296ebe72 Release 0.11.24 (#2141)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>
2025-07-30 12:56:45 -04:00
Marcus Schiesser f9f1de9516 chore: use Logger for core (#2139) 2025-07-30 11:43:45 +08:00
Twisha Bansal f576812e7a docs: Using MCP Toolbox for Databases with LlamaIndex (#2138) 2025-07-30 11:19:34 +08:00
Adrian Lyjak c3bf3c7178 Adding support for page citations, and refactor the confidence into the field metadata (#2140) 2025-07-30 10:25:19 +08:00
github-actions[bot] 38487da65d Release 0.11.23 (#2136)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>
2025-07-28 14:07:23 +08:00
Marcus Schiesser f29799e385 feat: Add toolcall callbacks to agent workflows (#2137) 2025-07-24 15:37:14 +08:00
Marcus Schiesser 9bca30620b fix: docs build 2025-07-23 12:55:35 +08:00
Marcus Schiesser 7224c06409 feat: Add logger and callbacks to llm.exec (#2135) 2025-07-23 12:37:02 +08:00
github-actions[bot] 29c7cf0989 Release 0.11.22 (#2131)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-07-23 11:30:04 +08:00
Marcus Schiesser c65a2dc4a7 chore: Deprecate community package and link to AWS package (#2134) 2025-07-23 11:05:50 +08:00
Terence Sim f1c5079290 docs: updated bedrock import and supported models (#2129)
Co-authored-by: Terence Sim <40583743+InTheAxis@users.noreply.github.com>
2025-07-23 10:40:49 +08:00
Terence Sim 9ed31958a7 chore: add logger as param to AgentWorkflow constructor (#2130)
Co-authored-by: Terence Sim <40583743+InTheAxis@users.noreply.github.com>
Co-authored-by: Marcus Schiesser <marcus.schiesser@googlemail.com>
2025-07-22 16:35:28 +08:00
github-actions[bot] e4c7113614 Release 0.11.21 (#2128)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>
2025-07-22 12:23:58 +08:00
Thuc Pham 38da40bc98 feat: VectoryMemoryBlock (#2110)
Co-authored-by: Marcus Schiesser <mail@marcusschiesser.de>
2025-07-22 12:18:09 +08:00
Marcus Schiesser 4d50ca4d84 chore: add streamchat test (#2122) 2025-07-22 11:30:01 +08:00
github-actions[bot] 8b5253a297 Release (#2127) 2025-07-21 15:40:31 -06:00
Logan ea15e75c89 deployment docs nits (#2126) 2025-07-21 15:30:37 -06:00
202 changed files with 6498 additions and 356 deletions
@@ -1,5 +1,92 @@
# @llamaindex/doc
## 0.2.51
### Patch Changes
- Updated dependencies [4c70376]
- @llamaindex/openai@0.4.16
## 0.2.50
### Patch Changes
- Updated dependencies [b6409b6]
- @llamaindex/openai@0.4.15
## 0.2.49
### Patch Changes
- Updated dependencies [4b51791]
- @llamaindex/cloud@4.1.1
- llamaindex@0.11.26
## 0.2.48
### Patch Changes
- Updated dependencies [049471b]
- Updated dependencies [049471b]
- @llamaindex/cloud@4.1.0
- llamaindex@0.11.25
## 0.2.47
### Patch Changes
- Updated dependencies [c3bf3c7]
- Updated dependencies [f9f1de9]
- @llamaindex/cloud@4.0.28
- @llamaindex/core@0.6.19
- llamaindex@0.11.24
- @llamaindex/node-parser@2.0.19
- @llamaindex/openai@0.4.14
- @llamaindex/readers@3.1.18
- @llamaindex/workflow@1.1.20
## 0.2.46
### Patch Changes
- Updated dependencies [f29799e]
- Updated dependencies [7224c06]
- @llamaindex/workflow@1.1.19
- @llamaindex/core@0.6.18
- llamaindex@0.11.23
- @llamaindex/cloud@4.0.27
- @llamaindex/node-parser@2.0.18
- @llamaindex/openai@0.4.13
- @llamaindex/readers@3.1.17
## 0.2.45
### Patch Changes
- Updated dependencies [9ed3195]
- @llamaindex/workflow@1.1.18
- llamaindex@0.11.22
## 0.2.44
### Patch Changes
- 38da40b: feat: VectoryMemoryBlock
- Updated dependencies [38da40b]
- @llamaindex/core@0.6.17
- @llamaindex/cloud@4.0.26
- llamaindex@0.11.21
- @llamaindex/node-parser@2.0.17
- @llamaindex/openai@0.4.12
- @llamaindex/readers@3.1.16
- @llamaindex/workflow@1.1.17
## 0.2.43
### Patch Changes
- ea15e75: Minor updates in deployment docs
## 0.2.42
### Patch Changes
@@ -1,6 +1,6 @@
{
"name": "@llamaindex/doc",
"version": "0.2.42",
"version": "0.2.51",
"private": true,
"scripts": {
"postinstall": "fumadocs-mdx",
Binary file not shown (154 KiB).

@@ -77,7 +77,7 @@ export async function POST(request: NextRequest) {
const agent = await initializeAgent();
const result = await agent.run(message);
return NextResponse.json({ response: result.result });
return NextResponse.json({ response: result.data });
} catch (error) {
console.error("Chat error:", error);
return NextResponse.json(
@@ -132,7 +132,7 @@ export default async function handler(
const agent = await initializeAgent();
const result = await agent.run(message);
res.json({ response: result.result });
res.json({ response: result.data });
} catch (error) {
console.error("Chat error:", error);
res.status(500).json({ error: "Internal server error" });
@@ -220,7 +220,7 @@ export async function POST(request: NextRequest) {
});
const result = await myAgent.run(message);
return NextResponse.json({ response: result.result });
return NextResponse.json({ response: result.data });
} catch (error) {
return NextResponse.json({ error: error.message }, { status: 500 });
}
@@ -233,11 +233,40 @@ Implement streaming for better user experience:
```typescript
// app/api/chat-stream/route.ts
import { agent } from "@llamaindex/workflow";
import { tool } from "llamaindex";
import { openai } from "@llamaindex/openai";
import { agentStreamEvent } from "@llamaindex/workflow";
import { NextRequest } from "next/server";
import { z } from "zod";
// Assume myAgent is initialized elsewhere
declare const myAgent: any;
// Initialize agent once (consider using a singleton pattern)
let myAgent: any = null;
async function initializeAgent() {
if (myAgent) return myAgent;
try {
const greetTool = tool({
name: "greet",
description: "Greets a user with their name",
parameters: z.object({
name: z.string(),
}),
execute: ({ name }) => `Hello, ${name}! How can I help you today?`,
});
myAgent = agent({
tools: [greetTool],
llm: openai({ model: "gpt-4o-mini" }),
});
return myAgent;
} catch (error) {
console.error("Failed to initialize agent:", error);
throw error;
}
}
export async function POST(request: NextRequest) {
const { message } = await request.json();
@@ -245,9 +274,10 @@ export async function POST(request: NextRequest) {
const stream = new ReadableStream({
async start(controller) {
try {
const context = myAgent.runStream(message);
const agent = await initializeAgent();
const events = agent.runStream(message);
for await (const event of context) {
for await (const event of events) {
if (agentStreamEvent.include(event)) {
controller.enqueue(new TextEncoder().encode(event.data.delta));
}
@@ -63,7 +63,7 @@ app.post('/api/chat', async (req, res) => {
try {
const { message } = req.body;
const result = await myAgent.run(message);
res.json({ response: result.result });
res.json({ response: result.data });
} catch (error) {
res.status(500).json({ error: 'Chat failed' });
}
@@ -110,7 +110,7 @@ fastify.post('/api/chat', async (request, reply) => {
try {
const { message } = request.body as { message: string };
const result = await myAgent.run(message);
return { response: result.result };
return { response: result.data };
} catch (error) {
reply.status(500).send({ error: 'Chat failed' });
}
@@ -162,7 +162,7 @@ app.post("/api/chat", async (c) => {
try {
const result = await myAgent.run(message);
return c.json({ response: result.result });
return c.json({ response: result.data });
} catch (error) {
return c.json({ error: error.message }, 500);
}
@@ -187,9 +187,9 @@ app.post('/api/chat-stream', async (req, res) => {
});
try {
const context = myAgent.runStream(message);
const events = myAgent.runStream(message);
for await (const event of context) {
for await (const event of events) {
if (agentStreamEvent.include(event)) {
res.write(event.data.delta);
}
@@ -34,7 +34,7 @@ export default {
const { message } = await request.json();
const result = await myAgent.run(message);
return new Response(JSON.stringify({ response: result.result }), {
return new Response(JSON.stringify({ response: result.data }), {
headers: { "Content-Type": "application/json" },
});
} catch (error) {
@@ -83,7 +83,7 @@ export default async function handler(req, res) {
try {
const result = await myAgent.run(message);
res.json({ response: result.result });
res.json({ response: result.data });
} catch (error) {
res.status(500).json({ error: error.message });
}
@@ -124,7 +124,7 @@ export async function POST(request: NextRequest) {
});
const result = await myAgent.run(message);
return NextResponse.json({ response: result.result });
return NextResponse.json({ response: result.data });
} catch (error) {
return NextResponse.json({ error: error.message }, { status: 500 });
}
@@ -173,7 +173,7 @@ export const handler: APIGatewayProxyHandler = async (event, context) => {
"Content-Type": "application/json",
"Access-Control-Allow-Origin": "*",
},
body: JSON.stringify({ response: result.result }),
body: JSON.stringify({ response: result.data }),
};
} catch (error) {
return {
@@ -222,7 +222,7 @@ export const handler: Handler = async (event, context) => {
return {
statusCode: 200,
body: JSON.stringify({ response: result.result }),
body: JSON.stringify({ response: result.data }),
};
} catch (error) {
return {
@@ -0,0 +1,85 @@
---
title: MCP Toolbox For Databases
description: MCP Toolbox for Databases is an open source MCP server for databases.
---
# MCP Toolbox for Databases
[MCP Toolbox for Databases](https://github.com/googleapis/genai-toolbox) is an open source MCP server for databases. It was designed with enterprise-grade, production-quality use in mind. It helps you develop tools more easily, quickly, and securely by handling complexities such as connection pooling and authentication.
Toolbox Tools can be seamlessly integrated with LlamaIndex applications. For more
information on [getting
started](https://googleapis.github.io/genai-toolbox/getting-started/local_quickstart_js/) or
[configuring](https://googleapis.github.io/genai-toolbox/getting-started/configure/)
Toolbox, see the
[documentation](https://googleapis.github.io/genai-toolbox/getting-started/introduction/).
![architecture](/images/mcp_db_toolbox.png)
### Configure and deploy
Toolbox is an open source server that you deploy and manage yourself. For
instructions on deploying and configuring it, see the official Toolbox
documentation:
* [Installing the Server](https://googleapis.github.io/genai-toolbox/getting-started/introduction/#installing-the-server)
* [Configuring Toolbox](https://googleapis.github.io/genai-toolbox/getting-started/configure/)
### Install client SDK
LlamaIndex relies on the `@toolbox-sdk/core` node package to use Toolbox. Install the
package before getting started:
```shell
npm install @toolbox-sdk/core
```
### Loading Toolbox Tools
Once your Toolbox server is configured and running, you can load tools
from your server using the SDK:
```javascript
import { gemini, GEMINI_MODEL } from "@llamaindex/google";
import { agent } from "@llamaindex/workflow";
import { tool } from "llamaindex";
import { ToolboxClient } from "@toolbox-sdk/core";
// Initialize LLM
const llm = gemini({
model: GEMINI_MODEL.GEMINI_2_0_FLASH,
apiKey: process.env.GOOGLE_API_KEY,
});
// Replace with your Toolbox Server URL
const TOOLBOX_URL = "http://127.0.0.1:5000";
const client = new ToolboxClient(TOOLBOX_URL);
const toolboxTools = await client.loadToolset("my-toolset");
const getTool = (toolboxTool) => tool({
name: toolboxTool.getName(),
description: toolboxTool.getDescription(),
parameters: toolboxTool.getParamSchema(),
execute: toolboxTool
});
const tools = toolboxTools.map(getTool);
// `memory`, `prompt`, and `query` are assumed to be defined elsewhere in your application
const myAgent = agent({
tools: tools,
llm,
memory,
systemPrompt: prompt,
});
const result = await myAgent.run(query);
console.log(result);
```
### Advanced Toolbox Features
Toolbox has a variety of features that make developing Gen AI tools for databases seamless.
For more information, see the following:
- [Authenticated Parameters](https://googleapis.github.io/genai-toolbox/resources/tools/#authenticated-parameters): bind tool inputs to values from OIDC tokens automatically, making it easy to run sensitive queries without potentially leaking data
- [Authorized Invocations](https://googleapis.github.io/genai-toolbox/resources/tools/#authorized-invocations): restrict access to a tool based on the user's auth token
- [OpenTelemetry](https://googleapis.github.io/genai-toolbox/how-to/export_telemetry/): get metrics and tracing from Toolbox with [OpenTelemetry](https://opentelemetry.io/docs/)
@@ -1,5 +1,5 @@
{
"title": "Integration",
"description": "See our integrations",
"pages": ["open-llm-metry", "lang-trace", "vercel"]
"pages": ["open-llm-metry", "lang-trace", "mcp-toolbox", "vercel"]
}
@@ -0,0 +1,164 @@
---
title: Low-Level LLM Execution
---
Sometimes you need more control over LLM interactions than high-level agents provide. The `llm.exec` method lets you make a single LLM call with tools while hiding the complexity of executing the tools and generating the tool messages.
## When to Use `llm.exec`
Use `llm.exec` when you need to:
- Build custom agent logic in [workflow](/docs/llamaindex/modules/agents/workflows) steps
- Have precise control over message handling and tool execution
## Basic Usage
The `llm.exec` method takes messages and tools as parameters and executes one LLM call.
The LLM either requests to call one or more of the tools or generates an assistant message as its result.
For each requested tool call, `llm.exec` executes the tool and generates two tool call messages (the call and its result). If no tool call is requested, only the assistant message is returned.
```ts
import { openai } from "@llamaindex/openai";
import { ChatMessage, tool } from "llamaindex";
import z from "zod";
const llm = openai({ model: "gpt-4.1-mini" });
const messages = [
{
content: "What's the weather like in San Francisco?",
role: "user",
} as ChatMessage,
];
const { newMessages, toolCalls } = await llm.exec({
messages,
tools: [
tool({
name: "get_weather",
description: "Get the current weather for a location",
parameters: z.object({
address: z.string().describe("The address"),
}),
execute: ({ address }) => {
return `It's sunny in ${address}!`;
},
}),
],
});
// Add the new messages (including tool calls and responses) to your conversation
messages.push(...newMessages);
```
> `newMessages` is an array because each tool call generates two messages: a tool call message and the tool call result message.
## Agent Loop Pattern
A common pattern is to use `llm.exec` in a loop until the LLM stops making tool calls:
```ts
import { openai } from "@llamaindex/openai";
import { ChatMessage, tool } from "llamaindex";
import z from "zod";
async function runAgentLoop() {
const llm = openai({ model: "gpt-4.1-mini" });
const messages = [
{
content: "What's the weather like in San Francisco?",
role: "user",
} as ChatMessage,
];
let exit = false;
do {
const { newMessages, toolCalls } = await llm.exec({
messages,
tools: [
tool({
name: "get_weather",
description: "Get the current weather for a location",
parameters: z.object({
address: z.string().describe("The address"),
}),
execute: ({ address }) => {
return `It's sunny in ${address}!`;
},
}),
],
});
console.log(newMessages);
messages.push(...newMessages);
// Exit when no more tool calls are made
exit = toolCalls.length === 0;
} while (!exit);
}
```
## Streaming Support
For real-time responses, use the `stream` option to get the assistant's response as streamed tokens:
```ts
import { openai } from "@llamaindex/openai";
import { ChatMessage, tool } from "llamaindex";
import z from "zod";
async function streamingAgentLoop() {
const llm = openai({ model: "gpt-4o-mini" });
const messages = [
{
content: "What's the weather like in San Francisco?",
role: "user",
} as ChatMessage,
];
let exit = false;
do {
const { stream, newMessages, toolCalls } = await llm.exec({
messages,
tools: [
tool({
name: "get_weather",
description: "Get the current weather for a location",
parameters: z.object({
address: z.string().describe("The address"),
}),
execute: ({ address }) => {
return `It's sunny in ${address}!`;
},
}),
],
stream: true,
});
// Stream the response token by token
for await (const chunk of stream) {
process.stdout.write(chunk.delta);
}
messages.push(...newMessages());
exit = toolCalls.length === 0;
} while (!exit);
}
```
> `newMessages` is a function when streaming, because the result is only available after the stream has been consumed. Calling it earlier throws an error.
## Return Values
`llm.exec` returns an object with:
- **`newMessages`**: Array of new chat messages including the LLM response and any tool call messages (call or result). When streaming, this is a function that returns the array.
- **`toolCalls`**: Array of tool calls made by the LLM
- **`stream`**: Async iterable for streaming responses (only when `stream: true`)
## Best Practices
When using `llm.exec` in an agent loop, make sure to:
1. **Maintain message history**: Always add `newMessages` to your conversation history
2. **Set exit conditions**: Implement proper logic to avoid infinite loops
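Both practices can be combined in a small helper. The following is only a sketch under stated assumptions: `Message`, `ExecResult`, `ExecStep`, and `runBoundedLoop` are hypothetical names introduced for illustration, and the step function stands in for a non-streaming `llm.exec` call so that the loop logic can be read on its own.

```ts
// Hypothetical shapes for illustration; the real types come from "llamaindex".
type Message = { role: string; content: string };
type ExecResult = { newMessages: Message[]; toolCalls: unknown[] };
type ExecStep = (messages: Message[]) => Promise<ExecResult>;

// Runs `step` until it reports no tool calls or `maxSteps` is reached.
async function runBoundedLoop(
  step: ExecStep,
  messages: Message[],
  maxSteps = 10,
): Promise<Message[]> {
  for (let i = 0; i < maxSteps; i++) {
    const { newMessages, toolCalls } = await step(messages);
    // Keep the history complete so the next call sees earlier tool results.
    messages.push(...newMessages);
    // No tool calls means the LLM produced its final assistant message.
    if (toolCalls.length === 0) return messages;
  }
  // Hard stop so a model that keeps requesting tools cannot loop forever.
  throw new Error(`Agent loop did not finish within ${maxSteps} steps`);
}
```

With a real LLM, the step would presumably be something like `(messages) => llm.exec({ messages, tools })`. Throwing on the step limit is one possible design choice; returning the partial history instead may fit better in some applications.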
@@ -1,4 +1,10 @@
{
"title": "Agents",
"pages": ["tool", "agent_workflow", "workflows", "natural_language_workflow"]
"pages": [
"tool",
"agent_workflow",
"workflows",
"low-level",
"natural_language_workflow"
]
}
@@ -101,6 +101,9 @@ const agent = agent({
});
```
You can also use [MCP Toolbox for
Databases](/docs/llamaindex/integration/mcp-toolbox) to interact with MCP tools.
## Function tool
@@ -106,34 +106,40 @@ const memory = createMemory({
Long-term memory is represented as `Memory Block` objects. These objects contain information from previous user sessions or from the beginning of the current conversation. When memory is retrieved (by calling `getLLM`), the short-term and long-term memories are merged together within the given `tokenLimit`.
Currently, there are two predefined memory blocks:
Currently, there are three predefined memory blocks:
- `staticBlock`: A memory block that stores a static piece of information.
- `factExtractionBlock`: A memory block that extracts facts from the chat history.
- `vectorBlock`: A memory block that stores and retrieves chat messages from a vector database using semantic similarity search. Messages are stored individually and retrieved based on their relevance to recent conversation context. Here we've passed in the `vectorStore` to use to store and retrieve the chat messages.
This sounds a bit complicated, but it's actually quite simple. Let's look at an example:
```ts
import { createMemory, factExtractionBlock, staticBlock } from "llamaindex";
import { createMemory, factExtractionBlock, staticBlock, vectorBlock } from "llamaindex";
import { QdrantVectorStore } from "@llamaindex/qdrant";
import { OpenAIEmbedding } from "@llamaindex/openai";
const memoryBlocks= [
staticBlock({
id: "core_info",
content: "My name is Logan, and I live in Saskatoon. I work at LlamaIndex.",
}),
factExtractionBlock({
id: "user-extracted_info",
priority: 1,
llm: llm,
maxFacts: 50,
}),
vectorBlock({
vectorStore: new QdrantVectorStore({ url: "http://localhost:6333" }),
priority: 2,
}),
];
```
Here, we've setup two memory blocks:
Here, we've set up three memory blocks:
- `core_info`: A static memory block that stores some core information about the user. This information will always be inserted into the memory. The type used is `MessageContent` to support multi-modal content.
- `extracted_info`: An extracted memory block that will extract information from the chat history. Here we've passed in the `llm` to use to extract facts from the chat history, and set the `maxFacts` to 50. If the number of extracted facts exceeds this limit, the `maxFacts` will be automatically summarized and reduced to leave room for new information.
- `staticBlock`: A static memory block that stores some core information about the user. This information will always be inserted into the memory. The type used is `MessageContent` to support multi-modal content.
- `factExtractionBlock`: An extracted memory block that will extract information from the chat history. Here we've passed in the `llm` to use to extract facts from the chat history, and set the `maxFacts` to 50. If the number of extracted facts exceeds this limit, the facts will be automatically summarized and reduced to leave room for new information.
- `vectorBlock`: A vector memory block that stores chat messages in a vector database and retrieves them from there. Messages are stored individually and retrieved based on their relevance to recent conversation context. Here we've passed in the `vectorStore` to use to store and retrieve the chat messages.
You'll also notice that we've set the `priority` for the `factExtractionBlock` block. This is used to determine the handling when the memory blocks' content (i.e. long-term memory) + short-term memory exceeds the token limit on the `Memory` object.
@@ -158,6 +164,46 @@ When memory is retrieved (using `getLLM`), the short-term and long-term memories
The amount of short-term memory included is specified by the `shortTermTokenLimitRatio`. If it's set to `0.7`, 70% of the `tokenLimit` is used for short-term memory (not including the static memory block).
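As a rough worked example of the ratio, the split can be computed as below. This is a sketch, not the library's implementation: `splitTokenBudget` is a hypothetical helper, the rounding behavior is an assumption, and the separate handling of the static memory block is not modelled.

```ts
// Hypothetical helper: splits a total token budget between short-term and
// long-term memory according to `shortTermTokenLimitRatio`.
function splitTokenBudget(tokenLimit: number, shortTermTokenLimitRatio: number) {
  if (shortTermTokenLimitRatio < 0 || shortTermTokenLimitRatio > 1) {
    throw new RangeError("shortTermTokenLimitRatio must be between 0 and 1");
  }
  // Rounding to whole tokens is an assumption made for this sketch.
  const shortTerm = Math.round(tokenLimit * shortTermTokenLimitRatio);
  return { shortTerm, longTerm: tokenLimit - shortTerm };
}

console.log(splitTokenBudget(30000, 0.7)); // { shortTerm: 21000, longTerm: 9000 }
```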
#### VectorBlock Configuration Options
The `vectorBlock` offers several configuration options to customize its behavior:
```ts
vectorBlock({
vectorStore: new QdrantVectorStore({ url: "http://localhost:6333" }),
priority: 2,
retrievalContextWindow: 5, // Number of recent messages to use for context when retrieving
formatTemplate: new PromptTemplate({ template: "Context: {{ context }}" }), // Custom formatting template
nodePostprocessors: [/* custom postprocessors */], // Apply processing to retrieved nodes
queryOptions: {
similarityTopK: 3, // Number of top similar results to return (default: 2)
mode: VectorStoreQueryMode.DEFAULT, // Query mode for the vector store
sessionFilterKey: "session_id", // Metadata key for session filtering (default: "session_id")
// Custom filters can be added here - session filter is automatically included
filters: {
filters: [
{ key: "custom_field", value: "custom_value", operator: "==" }
],
condition: "and"
}
}
})
```
**Key Configuration Options:**
- **`retrievalContextWindow`**: Number of recent messages to consider when creating the retrieval query (default: 5). A larger window provides more context but may be less precise.
- **`formatTemplate`**: Template for formatting retrieved information before adding to memory. Defaults to a simple context template.
- **`nodePostprocessors`**: Array of postprocessors to apply to retrieved nodes, useful for filtering or transforming results.
- **`queryOptions.similarityTopK`**: Number of most similar messages to retrieve from the vector store (default: 2).
- **`queryOptions.sessionFilterKey`**: Metadata key used to isolate memory between different sessions (default: "session_id").
- **`queryOptions.filters`**: Additional metadata filters for retrieval. The session filter is automatically added to ensure memory isolation.
**Session Isolation:**
The vectorBlock automatically adds a session filter using the block's ID to ensure that memories from different sessions don't interfere with each other. This filter uses the `sessionFilterKey` (default: "session_id") and can be customized if needed.
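Conceptually, the session filter is combined with any custom filters before the vector store is queried. The sketch below only illustrates that idea and is not the library's code: `MetadataFilter`, `MetadataFilters`, and `withSessionFilter` are hypothetical names modelled on the `queryOptions.filters` example above, and only the `"and"` condition is handled.

```ts
// Hypothetical filter shapes mirroring the `queryOptions.filters` example above.
type MetadataFilter = { key: string; value: string; operator: "==" };
type MetadataFilters = { filters: MetadataFilter[]; condition: "and" };

// Prepends a session filter so only memories from the same session match.
function withSessionFilter(
  blockId: string,
  sessionFilterKey = "session_id",
  custom?: MetadataFilters,
): MetadataFilters {
  const sessionFilter: MetadataFilter = {
    key: sessionFilterKey,
    value: blockId,
    operator: "==",
  };
  return {
    // Custom filters are kept; the session filter is always included.
    filters: [sessionFilter, ...(custom?.filters ?? [])],
    condition: "and",
  };
}
```

Because every filter is joined with `"and"`, a custom filter can narrow the results further but cannot widen them beyond the current session.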
## Persistence with Snapshots
Save and restore memory state:
@@ -5,13 +5,13 @@ title: Bedrock
## Installation
```package-install
npm i llamaindex @llamaindex/community
npm i llamaindex @llamaindex/aws
```
## Usage
```ts
import { BEDROCK_MODELS, Bedrock } from "@llamaindex/community";
import { BEDROCK_MODELS, Bedrock } from "@llamaindex/aws";
Settings.llm = new Bedrock({
model: BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_HAIKU,
@@ -23,9 +23,19 @@ Settings.llm = new Bedrock({
});
```
Currently only supports Anthropic and Meta models:
Supported models are listed below (accessible by BEDROCK_MODELS).
```ts
AMAZON_TITAN_TG1_LARGE = "amazon.titan-tg1-large";
AMAZON_TITAN_TEXT_EXPRESS_V1 = "amazon.titan-text-express-v1";
AI21_J2_GRANDE_INSTRUCT = "ai21.j2-grande-instruct";
AI21_J2_JUMBO_INSTRUCT = "ai21.j2-jumbo-instruct";
AI21_J2_MID = "ai21.j2-mid";
AI21_J2_MID_V1 = "ai21.j2-mid-v1";
AI21_J2_ULTRA = "ai21.j2-ultra";
AI21_J2_ULTRA_V1 = "ai21.j2-ultra-v1";
COHERE_COMMAND_TEXT_V14 = "cohere.command-text-v14";
ANTHROPIC_CLAUDE_INSTANT_1 = "anthropic.claude-instant-v1";
ANTHROPIC_CLAUDE_2 = "anthropic.claude-v2";
ANTHROPIC_CLAUDE_2_1 = "anthropic.claude-v2:1";
@@ -33,7 +43,12 @@ ANTHROPIC_CLAUDE_3_SONNET = "anthropic.claude-3-sonnet-20240229-v1:0";
ANTHROPIC_CLAUDE_3_HAIKU = "anthropic.claude-3-haiku-20240307-v1:0";
ANTHROPIC_CLAUDE_3_OPUS = "anthropic.claude-3-opus-20240229-v1:0"; // available on us-west-2
ANTHROPIC_CLAUDE_3_5_SONNET = "anthropic.claude-3-5-sonnet-20240620-v1:0";
ANTHROPIC_CLAUDE_3_5_SONNET_V2 = "anthropic.claude-3-5-sonnet-20241022-v2:0";
ANTHROPIC_CLAUDE_3_5_HAIKU = "anthropic.claude-3-5-haiku-20241022-v1:0";
ANTHROPIC_CLAUDE_3_7_SONNET = "anthropic.claude-3-7-sonnet-20250219-v1:0";
ANTHROPIC_CLAUDE_4_SONNET = "anthropic.claude-sonnet-4-20250514-v1:0";
ANTHROPIC_CLAUDE_4_OPUS = "anthropic.claude-opus-4-20250514-v1:0";
META_LLAMA2_13B_CHAT = "meta.llama2-13b-chat-v1";
META_LLAMA2_70B_CHAT = "meta.llama2-70b-chat-v1";
META_LLAMA3_8B_INSTRUCT = "meta.llama3-8b-instruct-v1:0";
@@ -45,41 +60,66 @@ META_LLAMA3_2_1B_INSTRUCT = "meta.llama3-2-1b-instruct-v1:0"; // only available
META_LLAMA3_2_3B_INSTRUCT = "meta.llama3-2-3b-instruct-v1:0"; // only available via inference endpoints (see below)
META_LLAMA3_2_11B_INSTRUCT = "meta.llama3-2-11b-instruct-v1:0"; // only available via inference endpoints (see below), multimodal and function call supported
META_LLAMA3_2_90B_INSTRUCT = "meta.llama3-2-90b-instruct-v1:0"; // only available via inference endpoints (see below), multimodal and function call supported
META_LLAMA3_3_70B_INSTRUCT = "meta.llama3-3-70b-instruct-v1:0";
MISTRAL_7B_INSTRUCT = "mistral.mistral-7b-instruct-v0:2";
MISTRAL_MIXTRAL_7B_INSTRUCT = "mistral.mixtral-8x7b-instruct-v0:1";
MISTRAL_MIXTRAL_LARGE_2402 = "mistral.mistral-large-2402-v1:0";
AMAZON_NOVA_PREMIER_1 = "amazon.nova-premier-v1:0";
AMAZON_NOVA_PRO_1 = "amazon.nova-pro-v1:0";
AMAZON_NOVA_LITE_1 = "amazon.nova-lite-v1:0";
AMAZON_NOVA_MICRO_1 = "amazon.nova-micro-v1:0";
```
You can also use Bedrock's Inference endpoints by using the model names:
You can also use Bedrock's Inference endpoints by using the model names (accessible by INFERENCE_BEDROCK_MODELS).
Note that the region must be set correctly.
```ts
// US
//US
US_ANTHROPIC_CLAUDE_3_HAIKU = "us.anthropic.claude-3-haiku-20240307-v1:0";
US_ANTHROPIC_CLAUDE_3_5_HAIKU = "us.anthropic.claude-3-5-haiku-20241022-v1:0";
US_ANTHROPIC_CLAUDE_3_OPUS = "us.anthropic.claude-3-opus-20240229-v1:0";
US_ANTHROPIC_CLAUDE_3_SONNET = "us.anthropic.claude-3-sonnet-20240229-v1:0";
US_ANTHROPIC_CLAUDE_3_5_SONNET = "us.anthropic.claude-3-5-sonnet-20240620-v1:0";
US_ANTHROPIC_CLAUDE_3_5_SONNET_V2 =
"us.anthropic.claude-3-5-sonnet-20241022-v2:0";
US_ANTHROPIC_CLAUDE_3_5_SONNET_V2 = "us.anthropic.claude-3-5-sonnet-20241022-v2:0";
US_ANTHROPIC_CLAUDE_3_7_SONNET = "us.anthropic.claude-3-7-sonnet-20250219-v1:0";
US_ANTHROPIC_CLAUDE_4_SONNET = "us.anthropic.claude-sonnet-4-20250514-v1:0";
US_ANTHROPIC_CLAUDE_4_OPUS = "us.anthropic.claude-opus-4-20250514-v1:0";
US_META_LLAMA_3_2_1B_INSTRUCT = "us.meta.llama3-2-1b-instruct-v1:0";
US_META_LLAMA_3_2_3B_INSTRUCT = "us.meta.llama3-2-3b-instruct-v1:0";
US_META_LLAMA_3_2_11B_INSTRUCT = "us.meta.llama3-2-11b-instruct-v1:0";
US_META_LLAMA_3_2_90B_INSTRUCT = "us.meta.llama3-2-90b-instruct-v1:0";
US_AMAZON_NOVA_PRO_1 = "us.amazon.nova-premier-v1:0";
US_META_LLAMA_3_3_70B_INSTRUCT = "us.meta.llama3-3-70b-instruct-v1:0";
US_AMAZON_NOVA_PREMIER_1 = "us.amazon.nova-premier-v1:0";
US_AMAZON_NOVA_PRO_1 = "us.amazon.nova-pro-v1:0";
US_AMAZON_NOVA_LITE_1 = "us.amazon.nova-lite-v1:0";
US_AMAZON_NOVA_MICRO_1 = "us.amazon.nova-micro-v1:0";
// EU
//EU
EU_ANTHROPIC_CLAUDE_3_HAIKU = "eu.anthropic.claude-3-haiku-20240307-v1:0";
EU_ANTHROPIC_CLAUDE_3_5_HAIKU = "eu.anthropic.claude-3-5-haiku-20240307-v1:0";
EU_ANTHROPIC_CLAUDE_3_SONNET = "eu.anthropic.claude-3-sonnet-20240229-v1:0";
EU_ANTHROPIC_CLAUDE_3_5_SONNET = "eu.anthropic.claude-3-5-sonnet-20240620-v1:0";
EU_ANTHROPIC_CLAUDE_3_7_SONNET = "eu.anthropic.claude-3-7-sonnet-20250219-v1:0";
EU_ANTHROPIC_CLAUDE_4_SONNET = "eu.anthropic.claude-sonnet-4-20250514-v1:0";
EU_ANTHROPIC_CLAUDE_4_OPUS = "eu.anthropic.claude-opus-4-20250514-v1:0";
EU_META_LLAMA_3_2_1B_INSTRUCT = "eu.meta.llama3-2-1b-instruct-v1:0";
EU_META_LLAMA_3_2_3B_INSTRUCT = "eu.meta.llama3-2-3b-instruct-v1:0";
EU_AMAZON_NOVA_PRO_1 = "eu.amazon.nova-premier-v1:0";
EU_AMAZON_NOVA_PREMIER_1 = "eu.amazon.nova-premier-v1:0";
EU_AMAZON_NOVA_PRO_1 = "eu.amazon.nova-pro-v1:0";
EU_AMAZON_NOVA_LITE_1 = "eu.amazon.nova-lite-v1:0";
EU_AMAZON_NOVA_MICRO_1 = "eu.amazon.nova-micro-v1:0";
//APAC
APAC_ANTHROPIC_CLAUDE_3_5_SONNET = "apac.anthropic.claude-3-5-sonnet-20240620-v1:0";
APAC_ANTHROPIC_CLAUDE_3_5_SONNET_V2 = "apac.anthropic.claude-3-5-sonnet-20241022-v2:0";
APAC_ANTHROPIC_CLAUDE_3_7_SONNET = "apac.anthropic.claude-3-7-sonnet-20250219-v1:0";
APAC_ANTHROPIC_CLAUDE_3_HAIKU = "apac.anthropic.claude-3-haiku-20240307-v1:0";
APAC_ANTHROPIC_CLAUDE_3_SONNET = "apac.anthropic.claude-3-sonnet-20240229-v1:0";
APAC_AMAZON_NOVA_PRO_1 = "apac.amazon.nova-pro-v1:0";
APAC_AMAZON_NOVA_LITE_1 = "apac.amazon.nova-lite-v1:0";
APAC_AMAZON_NOVA_MICRO_1 = "apac.amazon.nova-micro-v1:0";
```
Sonnet, Haiku and Opus are multimodal. `image_url` only supports the base64 data URL format, e.g. `data:image/jpeg;base64,SGVsbG8sIFdvcmxkIQ==`
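If your image is available as raw bytes, a data URL in this format can be built with a few lines. This is a minimal sketch: `toDataUrl` is a hypothetical helper, not part of `@llamaindex/aws`, and it assumes a runtime that provides the global `btoa` (recent Node.js versions and browsers do).

```ts
// Hypothetical helper: encodes raw bytes as a base64 data URL.
function toDataUrl(bytes: Uint8Array, mimeType: string): string {
  // btoa expects a "binary string" with one character per byte.
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return `data:${mimeType};base64,${btoa(binary)}`;
}

// The bytes of "Hello, World!" reproduce the example data URL above.
const exampleUrl = toDataUrl(new TextEncoder().encode("Hello, World!"), "image/jpeg");
console.log(exampleUrl); // data:image/jpeg;base64,SGVsbG8sIFdvcmxkIQ==
```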
@@ -87,10 +127,11 @@ Sonnet, Haiku and Opus are multimodal, image_url only supports base64 data url f
## Full Example
```ts
import { BEDROCK_MODELS, Bedrock } from "llamaindex";
import { INFERENCE_BEDROCK_MODELS, Bedrock } from "@llamaindex/aws";
Settings.llm = new Bedrock({
model: BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_HAIKU,
model: INFERENCE_BEDROCK_MODELS.US_ANTHROPIC_CLAUDE_3_SONNET,
region: "us-east-1",
});
async function main() {
@@ -119,7 +160,7 @@ async function main() {
## Agent Example
```ts
import { BEDROCK_MODELS, Bedrock } from "@llamaindex/community";
import { BEDROCK_MODELS, Bedrock } from "@llamaindex/aws";
import { tool } from "llamaindex";
import { agent } from "@llamaindex/workflow";
import { z } from "zod";
@@ -1,5 +1,42 @@
# @llamaindex/cloudflare-worker-agent-test
## 0.0.187
### Patch Changes
- llamaindex@0.11.26
## 0.0.186
### Patch Changes
- Updated dependencies [049471b]
- llamaindex@0.11.25
## 0.0.185
### Patch Changes
- llamaindex@0.11.24
## 0.0.184
### Patch Changes
- llamaindex@0.11.23
## 0.0.183
### Patch Changes
- llamaindex@0.11.22
## 0.0.182
### Patch Changes
- llamaindex@0.11.21
## 0.0.181
### Patch Changes
@@ -1,6 +1,6 @@
{
"name": "@llamaindex/cloudflare-worker-agent-test",
"version": "0.0.181",
"version": "0.0.187",
"type": "module",
"private": true,
"scripts": {
@@ -1,5 +1,38 @@
# @llamaindex/llama-parse-browser-test
## 0.0.85
### Patch Changes
- Updated dependencies [4b51791]
- @llamaindex/cloud@4.1.1
## 0.0.84
### Patch Changes
- Updated dependencies [049471b]
- @llamaindex/cloud@4.1.0
## 0.0.83
### Patch Changes
- Updated dependencies [c3bf3c7]
- @llamaindex/cloud@4.0.28
## 0.0.82
### Patch Changes
- @llamaindex/cloud@4.0.27
## 0.0.81
### Patch Changes
- @llamaindex/cloud@4.0.26
## 0.0.80
### Patch Changes
@@ -1,7 +1,7 @@
{
"name": "@llamaindex/llama-parse-browser-test",
"private": true,
"version": "0.0.80",
"version": "0.0.85",
"type": "module",
"scripts": {
"dev": "vite",
@@ -1,5 +1,42 @@
# @llamaindex/next-agent-test
## 0.1.187
### Patch Changes
- llamaindex@0.11.26
## 0.1.186
### Patch Changes
- Updated dependencies [049471b]
- llamaindex@0.11.25
## 0.1.185
### Patch Changes
- llamaindex@0.11.24
## 0.1.184
### Patch Changes
- llamaindex@0.11.23
## 0.1.183
### Patch Changes
- llamaindex@0.11.22
## 0.1.182
### Patch Changes
- llamaindex@0.11.21
## 0.1.181
### Patch Changes
@@ -1,6 +1,6 @@
{
"name": "@llamaindex/next-agent-test",
"version": "0.1.181",
"version": "0.1.187",
"private": true,
"scripts": {
"dev": "next dev",
@@ -1,5 +1,42 @@
# test-edge-runtime
## 0.1.186
### Patch Changes
- llamaindex@0.11.26
## 0.1.185
### Patch Changes
- Updated dependencies [049471b]
- llamaindex@0.11.25
## 0.1.184
### Patch Changes
- llamaindex@0.11.24
## 0.1.183
### Patch Changes
- llamaindex@0.11.23
## 0.1.182
### Patch Changes
- llamaindex@0.11.22
## 0.1.181
### Patch Changes
- llamaindex@0.11.21
## 0.1.180
### Patch Changes
@@ -1,6 +1,6 @@
{
"name": "@llamaindex/nextjs-edge-runtime-test",
"version": "0.1.180",
"version": "0.1.186",
"private": true,
"scripts": {
"dev": "next dev",
@@ -1,5 +1,60 @@
# @llamaindex/next-node-runtime
## 0.1.58
### Patch Changes
- @llamaindex/huggingface@0.1.26
## 0.1.57
### Patch Changes
- @llamaindex/huggingface@0.1.25
## 0.1.56
### Patch Changes
- llamaindex@0.11.26
## 0.1.55
### Patch Changes
- Updated dependencies [049471b]
- llamaindex@0.11.25
## 0.1.54
### Patch Changes
- llamaindex@0.11.24
- @llamaindex/huggingface@0.1.24
- @llamaindex/readers@3.1.18
## 0.1.53
### Patch Changes
- llamaindex@0.11.23
- @llamaindex/huggingface@0.1.23
- @llamaindex/readers@3.1.17
## 0.1.52
### Patch Changes
- llamaindex@0.11.22
## 0.1.51
### Patch Changes
- llamaindex@0.11.21
- @llamaindex/huggingface@0.1.22
- @llamaindex/readers@3.1.16
## 0.1.50
### Patch Changes
@@ -1,6 +1,6 @@
{
"name": "@llamaindex/next-node-runtime-test",
"version": "0.1.50",
"version": "0.1.58",
"private": true,
"scripts": {
"dev": "next dev",
@@ -1,5 +1,42 @@
# vite-import-llamaindex
## 0.0.53
### Patch Changes
- llamaindex@0.11.26
## 0.0.52
### Patch Changes
- Updated dependencies [049471b]
- llamaindex@0.11.25
## 0.0.51
### Patch Changes
- llamaindex@0.11.24
## 0.0.50
### Patch Changes
- llamaindex@0.11.23
## 0.0.49
### Patch Changes
- llamaindex@0.11.22
## 0.0.48
### Patch Changes
- llamaindex@0.11.21
## 0.0.47
### Patch Changes
@@ -1,7 +1,7 @@
{
"name": "vite-import-llamaindex",
"private": true,
"version": "0.0.47",
"version": "0.0.53",
"type": "module",
"scripts": {
"build": "vite build",
@@ -1,5 +1,42 @@
# @llamaindex/waku-query-engine-test
## 0.0.187
### Patch Changes
- llamaindex@0.11.26
## 0.0.186
### Patch Changes
- Updated dependencies [049471b]
- llamaindex@0.11.25
## 0.0.185
### Patch Changes
- llamaindex@0.11.24
## 0.0.184
### Patch Changes
- llamaindex@0.11.23
## 0.0.183
### Patch Changes
- llamaindex@0.11.22
## 0.0.182
### Patch Changes
- llamaindex@0.11.21
## 0.0.181
### Patch Changes
@@ -1,6 +1,6 @@
{
"name": "@llamaindex/waku-query-engine-test",
"version": "0.0.181",
"version": "0.0.187",
"type": "module",
"private": true,
"scripts": {
@@ -23,7 +23,7 @@ await test("pinecone", async (t) => {
});
const vectorStore = new PineconeVectorStore({
embeddingModel: openaiEmbedding,
embedModel: openaiEmbedding,
});
t.after(async () => {
@@ -1,5 +1,216 @@
# examples
## 0.3.38
### Patch Changes
- Updated dependencies [4c70376]
- @llamaindex/openai@0.4.16
- @llamaindex/clip@0.0.72
- @llamaindex/deepinfra@0.0.72
- @llamaindex/deepseek@0.0.34
- @llamaindex/fireworks@0.0.32
- @llamaindex/groq@0.0.88
- @llamaindex/huggingface@0.1.26
- @llamaindex/jinaai@0.0.32
- @llamaindex/perplexity@0.0.29
- @llamaindex/azure@0.1.33
- @llamaindex/together@0.0.32
- @llamaindex/vllm@0.0.58
- @llamaindex/xai@0.0.19
## 0.3.37
### Patch Changes
- Updated dependencies [47a6f5f]
- Updated dependencies [b80f33e]
- Updated dependencies [b6409b6]
- Updated dependencies [b80f33e]
- @llamaindex/ollama@0.1.20
- @llamaindex/anthropic@0.3.22
- @llamaindex/openai@0.4.15
- @llamaindex/clip@0.0.71
- @llamaindex/deepinfra@0.0.71
- @llamaindex/deepseek@0.0.33
- @llamaindex/fireworks@0.0.31
- @llamaindex/groq@0.0.87
- @llamaindex/huggingface@0.1.25
- @llamaindex/jinaai@0.0.31
- @llamaindex/perplexity@0.0.28
- @llamaindex/azure@0.1.32
- @llamaindex/together@0.0.31
- @llamaindex/vllm@0.0.57
- @llamaindex/xai@0.0.18
## 0.3.36
### Patch Changes
- Updated dependencies [4b51791]
- Updated dependencies [971d37c]
- @llamaindex/cloud@4.1.1
- @llamaindex/deepseek@0.0.32
- llamaindex@0.11.26
## 0.3.35
### Patch Changes
- Updated dependencies [c3bf3c7]
- Updated dependencies [f9f1de9]
- @llamaindex/cloud@4.0.28
- @llamaindex/core@0.6.19
- llamaindex@0.11.24
- @llamaindex/node-parser@2.0.19
- @llamaindex/anthropic@0.3.21
- @llamaindex/assemblyai@0.1.18
- @llamaindex/clip@0.0.70
- @llamaindex/cohere@0.0.33
- @llamaindex/deepinfra@0.0.70
- @llamaindex/discord@0.1.18
- @llamaindex/google@0.3.18
- @llamaindex/huggingface@0.1.24
- @llamaindex/jinaai@0.0.30
- @llamaindex/mistral@0.1.19
- @llamaindex/mixedbread@0.0.33
- @llamaindex/notion@0.1.18
- @llamaindex/ollama@0.1.19
- @llamaindex/openai@0.4.14
- @llamaindex/perplexity@0.0.27
- @llamaindex/portkey-ai@0.0.61
- @llamaindex/replicate@0.0.61
- @llamaindex/bm25-retriever@0.0.8
- @llamaindex/astra@0.0.33
- @llamaindex/azure@0.1.31
- @llamaindex/chroma@0.0.33
- @llamaindex/elastic-search@0.1.19
- @llamaindex/firestore@1.0.26
- @llamaindex/milvus@0.1.28
- @llamaindex/mongodb@0.0.34
- @llamaindex/pinecone@0.1.19
- @llamaindex/postgres@0.0.62
- @llamaindex/qdrant@0.1.29
- @llamaindex/supabase@0.1.20
- @llamaindex/upstash@0.0.33
- @llamaindex/weaviate@0.0.34
- @llamaindex/vercel@0.1.19
- @llamaindex/voyage-ai@1.0.25
- @llamaindex/readers@3.1.18
- @llamaindex/tools@0.1.9
- @llamaindex/workflow@1.1.20
- @llamaindex/deepseek@0.0.31
- @llamaindex/fireworks@0.0.30
- @llamaindex/groq@0.0.86
- @llamaindex/together@0.0.30
- @llamaindex/vllm@0.0.56
- @llamaindex/xai@0.0.17
## 0.3.34
### Patch Changes
- Updated dependencies [f29799e]
- Updated dependencies [7224c06]
- @llamaindex/workflow@1.1.19
- @llamaindex/core@0.6.18
- llamaindex@0.11.23
- @llamaindex/cloud@4.0.27
- @llamaindex/node-parser@2.0.18
- @llamaindex/anthropic@0.3.20
- @llamaindex/assemblyai@0.1.17
- @llamaindex/clip@0.0.69
- @llamaindex/cohere@0.0.32
- @llamaindex/deepinfra@0.0.69
- @llamaindex/discord@0.1.17
- @llamaindex/google@0.3.17
- @llamaindex/huggingface@0.1.23
- @llamaindex/jinaai@0.0.29
- @llamaindex/mistral@0.1.18
- @llamaindex/mixedbread@0.0.32
- @llamaindex/notion@0.1.17
- @llamaindex/ollama@0.1.18
- @llamaindex/openai@0.4.13
- @llamaindex/perplexity@0.0.26
- @llamaindex/portkey-ai@0.0.60
- @llamaindex/replicate@0.0.60
- @llamaindex/bm25-retriever@0.0.7
- @llamaindex/astra@0.0.32
- @llamaindex/azure@0.1.30
- @llamaindex/chroma@0.0.32
- @llamaindex/elastic-search@0.1.18
- @llamaindex/firestore@1.0.25
- @llamaindex/milvus@0.1.27
- @llamaindex/mongodb@0.0.33
- @llamaindex/pinecone@0.1.18
- @llamaindex/postgres@0.0.61
- @llamaindex/qdrant@0.1.28
- @llamaindex/supabase@0.1.19
- @llamaindex/upstash@0.0.32
- @llamaindex/weaviate@0.0.33
- @llamaindex/vercel@0.1.18
- @llamaindex/voyage-ai@1.0.24
- @llamaindex/readers@3.1.17
- @llamaindex/tools@0.1.8
- @llamaindex/deepseek@0.0.30
- @llamaindex/fireworks@0.0.29
- @llamaindex/groq@0.0.85
- @llamaindex/together@0.0.29
- @llamaindex/vllm@0.0.55
- @llamaindex/xai@0.0.16
## 0.3.33
### Patch Changes
- Updated dependencies [38da40b]
- @llamaindex/core@0.6.17
- @llamaindex/cloud@4.0.26
- llamaindex@0.11.21
- @llamaindex/node-parser@2.0.17
- @llamaindex/anthropic@0.3.19
- @llamaindex/assemblyai@0.1.16
- @llamaindex/clip@0.0.68
- @llamaindex/cohere@0.0.31
- @llamaindex/deepinfra@0.0.68
- @llamaindex/discord@0.1.16
- @llamaindex/google@0.3.16
- @llamaindex/huggingface@0.1.22
- @llamaindex/jinaai@0.0.28
- @llamaindex/mistral@0.1.17
- @llamaindex/mixedbread@0.0.31
- @llamaindex/notion@0.1.16
- @llamaindex/ollama@0.1.17
- @llamaindex/openai@0.4.12
- @llamaindex/perplexity@0.0.25
- @llamaindex/portkey-ai@0.0.59
- @llamaindex/replicate@0.0.59
- @llamaindex/bm25-retriever@0.0.6
- @llamaindex/astra@0.0.31
- @llamaindex/azure@0.1.29
- @llamaindex/chroma@0.0.31
- @llamaindex/elastic-search@0.1.17
- @llamaindex/firestore@1.0.24
- @llamaindex/milvus@0.1.26
- @llamaindex/mongodb@0.0.32
- @llamaindex/pinecone@0.1.17
- @llamaindex/postgres@0.0.60
- @llamaindex/qdrant@0.1.27
- @llamaindex/supabase@0.1.18
- @llamaindex/upstash@0.0.31
- @llamaindex/weaviate@0.0.32
- @llamaindex/vercel@0.1.17
- @llamaindex/voyage-ai@1.0.23
- @llamaindex/readers@3.1.16
- @llamaindex/tools@0.1.7
- @llamaindex/workflow@1.1.17
- @llamaindex/deepseek@0.0.29
- @llamaindex/fireworks@0.0.28
- @llamaindex/groq@0.0.84
- @llamaindex/together@0.0.28
- @llamaindex/vllm@0.0.54
- @llamaindex/xai@0.0.15
## 0.3.32
### Patch Changes
@@ -0,0 +1,150 @@
/**
* Example: Vector Memory Block
*
* This example demonstrates how to use the VectorMemoryBlock to store and retrieve
* conversation history using vector similarity search. The vector memory block
* stores messages in a vector store and can retrieve relevant context based on
* semantic similarity to recent messages.
*/
import { OpenAI, OpenAIEmbedding } from "@llamaindex/openai";
import { QdrantVectorStore } from "@llamaindex/qdrant";
import { createMemory, vectorBlock } from "llamaindex";
// Set up the LLM and embedding model
const llm = new OpenAI({ model: "gpt-4.1-mini" });
const embedModel = new OpenAIEmbedding({ model: "text-embedding-3-small" });
// Simulate a conversation with some context.
// This conversation has 9 messages, which together exceed the token limit of 100 tokens (set below).
// The last 4 messages are kept in the short-term memory block (their tokens fit within the limit),
// whereas the first 5 messages are moved to the long-term memory block (here, a vector memory block backed by Qdrant).
const CONVERSATION_TURNS = [
// These are the first 5 messages, which are added to the long-term memory block (vector memory block)
{
role: "user",
content: "Hi, I'm Sarah and I work as a data scientist at Google.",
},
{
role: "assistant",
content:
"Hello Sarah! It's great to meet you. Data science at Google must be exciting!",
},
{
role: "user",
content:
"Yes, I specialize in machine learning and natural language processing.",
},
{
role: "assistant",
content: "That's impressive! ML and NLP are fascinating fields.",
},
{
role: "user",
content:
"I have a PhD in Computer Science from Stanford, and I love hiking on weekends.",
},
// These are the last 4 messages, which are kept in the short-term memory block
{
role: "assistant",
content:
"Wow, Stanford PhD! And hiking is a great way to unwind from tech work.",
},
{
role: "user",
content: "I also have two cats named Whiskers and Mittens.",
},
{
role: "assistant",
content:
"Cats make wonderful companions! Whiskers and Mittens are cute names.",
},
{
role: "user",
content: "Summary information about Sarah and her cats",
},
];
async function main() {
console.log("=== Vector Memory Block Example ===\n");
/**
* Create a vector store. You can quickly get a local instance of Qdrant running with Docker:
* ```bash
* docker pull qdrant/qdrant
* docker run -p 6333:6333 qdrant/qdrant
* ```
*
* Go to http://localhost:6333/dashboard#/collections to see your data
*/
const vectorStore = new QdrantVectorStore({
url: "http://localhost:6333",
embedModel,
});
// Create a vector memory block using the factory function
const vectorMemoryBlock = vectorBlock({
vectorStore,
priority: 5,
});
// Create a memory store with the vector memory block
const memory = createMemory([], {
llm,
memoryBlocks: [vectorMemoryBlock],
tokenLimit: 100,
shortTermTokenLimitRatio: 0.7,
});
// Store the conversation history in the vector memory
console.log(`Adding ${CONVERSATION_TURNS.length} messages to the memory...`);
for (const message of CONVERSATION_TURNS) {
await memory.add(message);
}
// Retrieve relevant context for the current user request
console.log("Retrieving relevant context...");
const chatHistory = await memory.getLLM();
// You should see 1 generated context message from the vector memory block and 4 messages from the short-term memory block
console.log("Chat memory:", chatHistory);
// Now simulate the assistant responding with context
console.log("\nAssistant response with context:");
const response = await llm.chat({
messages: chatHistory,
});
console.log(response.message.content);
// Try adding more messages to the memory
const newMessages = [
{
role: "user",
content: "Write a long paragraph about weather in Tokyo",
},
{
role: "assistant",
content:
"The weather in Tokyo is sunny and warm. The temperature is around 20 degrees Celsius. The weather is very nice and the people are friendly.",
},
{
role: "user",
content: "What is the weather in Tokyo?",
},
];
// Add the new messages to the memory
for (const message of newMessages) {
await memory.add(message);
}
// Try retrieving the new messages
const newChatHistory = await memory.getLLM();
// You can see now that new chat history will contain the nodes (separated by `\n`) in the
// context message that is generated by the vector memory block
// The number of retrieved nodes is set by `similarityTopK` in `queryOptions` of `vectorBlock`
// (default `similarityTopK` is 2)
console.log("New chat history:", newChatHistory);
}
main().catch(console.error);
@@ -0,0 +1,14 @@
import { anthropic } from "@llamaindex/anthropic";
import { agent } from "@llamaindex/workflow";
(async function () {
const workflow = agent({
llm: anthropic({
model: "claude-4-1-opus",
}),
});
const result = await workflow.run(
"What are three compounds we should consider investigating to advance research into new antibiotics? Why should we consider them?",
);
console.log(result.data.result);
})();
@@ -0,0 +1,9 @@
import { ollama } from "@llamaindex/ollama";
(async () => {
const llm = ollama({
model: "gpt-oss:20b",
});
const response = await llm.complete({ prompt: "How are you?" });
console.log("Response:", response.text);
})();
@@ -1,6 +1,6 @@
{
"name": "@llamaindex/examples",
"version": "0.3.32",
"version": "0.3.38",
"private": true,
"scripts": {
"lint": "eslint .",
@@ -11,52 +11,52 @@
"@azure/cosmos": "^4.1.1",
"@azure/identity": "^4.4.1",
"@azure/search-documents": "^12.1.0",
"@llamaindex/anthropic": "^0.3.18",
"@llamaindex/assemblyai": "^0.1.15",
"@llamaindex/astra": "^0.0.30",
"@llamaindex/azure": "^0.1.28",
"@llamaindex/bm25-retriever": "^0.0.5",
"@llamaindex/chroma": "^0.0.30",
"@llamaindex/clip": "^0.0.67",
"@llamaindex/cloud": "^4.0.25",
"@llamaindex/cohere": "^0.0.30",
"@llamaindex/core": "^0.6.16",
"@llamaindex/deepinfra": "^0.0.67",
"@llamaindex/deepseek": "^0.0.28",
"@llamaindex/discord": "^0.1.15",
"@llamaindex/elastic-search": "^0.1.16",
"@llamaindex/anthropic": "^0.3.22",
"@llamaindex/assemblyai": "^0.1.18",
"@llamaindex/astra": "^0.0.33",
"@llamaindex/azure": "^0.1.33",
"@llamaindex/bm25-retriever": "^0.0.8",
"@llamaindex/chroma": "^0.0.33",
"@llamaindex/clip": "^0.0.72",
"@llamaindex/cloud": "^4.1.1",
"@llamaindex/cohere": "^0.0.33",
"@llamaindex/core": "^0.6.19",
"@llamaindex/deepinfra": "^0.0.72",
"@llamaindex/deepseek": "^0.0.34",
"@llamaindex/discord": "^0.1.18",
"@llamaindex/elastic-search": "^0.1.19",
"@llamaindex/env": "^0.1.30",
"@llamaindex/firestore": "^1.0.23",
"@llamaindex/fireworks": "^0.0.27",
"@llamaindex/google": "^0.3.15",
"@llamaindex/groq": "^0.0.83",
"@llamaindex/huggingface": "^0.1.21",
"@llamaindex/jinaai": "^0.0.27",
"@llamaindex/milvus": "^0.1.25",
"@llamaindex/mistral": "^0.1.16",
"@llamaindex/mixedbread": "^0.0.30",
"@llamaindex/mongodb": "^0.0.31",
"@llamaindex/node-parser": "^2.0.16",
"@llamaindex/notion": "^0.1.15",
"@llamaindex/ollama": "^0.1.16",
"@llamaindex/openai": "^0.4.11",
"@llamaindex/perplexity": "^0.0.24",
"@llamaindex/pinecone": "^0.1.16",
"@llamaindex/portkey-ai": "^0.0.58",
"@llamaindex/postgres": "^0.0.59",
"@llamaindex/qdrant": "^0.1.26",
"@llamaindex/readers": "^3.1.15",
"@llamaindex/replicate": "^0.0.58",
"@llamaindex/supabase": "^0.1.17",
"@llamaindex/together": "^0.0.27",
"@llamaindex/tools": "^0.1.6",
"@llamaindex/upstash": "^0.0.30",
"@llamaindex/vercel": "^0.1.16",
"@llamaindex/vllm": "^0.0.53",
"@llamaindex/voyage-ai": "^1.0.22",
"@llamaindex/weaviate": "^0.0.31",
"@llamaindex/workflow": "^1.1.16",
"@llamaindex/xai": "^0.0.14",
"@llamaindex/firestore": "^1.0.26",
"@llamaindex/fireworks": "^0.0.32",
"@llamaindex/google": "^0.3.18",
"@llamaindex/groq": "^0.0.88",
"@llamaindex/huggingface": "^0.1.26",
"@llamaindex/jinaai": "^0.0.32",
"@llamaindex/milvus": "^0.1.28",
"@llamaindex/mistral": "^0.1.19",
"@llamaindex/mixedbread": "^0.0.33",
"@llamaindex/mongodb": "^0.0.34",
"@llamaindex/node-parser": "^2.0.19",
"@llamaindex/notion": "^0.1.18",
"@llamaindex/ollama": "^0.1.20",
"@llamaindex/openai": "^0.4.16",
"@llamaindex/perplexity": "^0.0.29",
"@llamaindex/pinecone": "^0.1.19",
"@llamaindex/portkey-ai": "^0.0.61",
"@llamaindex/postgres": "^0.0.62",
"@llamaindex/qdrant": "^0.1.29",
"@llamaindex/readers": "^3.1.18",
"@llamaindex/replicate": "^0.0.61",
"@llamaindex/supabase": "^0.1.20",
"@llamaindex/together": "^0.0.32",
"@llamaindex/tools": "^0.1.9",
"@llamaindex/upstash": "^0.0.33",
"@llamaindex/vercel": "^0.1.19",
"@llamaindex/vllm": "^0.0.58",
"@llamaindex/voyage-ai": "^1.0.25",
"@llamaindex/weaviate": "^0.0.34",
"@llamaindex/workflow": "^1.1.20",
"@llamaindex/xai": "^0.0.19",
"@notionhq/client": "^4.0.0",
"@pinecone-database/pinecone": "^4.0.0",
"@vercel/postgres": "^0.10.0",
@@ -65,7 +65,7 @@
"commander": "^12.1.0",
"dotenv": "^17.2.0",
"js-tiktoken": "^1.0.14",
"llamaindex": "^0.11.20",
"llamaindex": "^0.11.26",
"mongodb": "6.7.0",
"postgres": "^3.4.4",
"wikipedia": "^2.1.2",
@@ -15,7 +15,7 @@ async function main() {
const vectorStore = new QdrantVectorStore({
url: process.env.QDRANT_URL,
apiKey: process.env.QDRANT_API_KEY,
embeddingModel: embedding,
embedModel: embedding,
collectionName: "gemini_test",
});
const storageContext = await storageContextFromDefaults({ vectorStore });
@@ -16,7 +16,7 @@ async function main() {
const vectorStore = new QdrantVectorStore({
url: process.env.QDRANT_URL,
apiKey: process.env.QDRANT_API_KEY,
embeddingModel: embedding,
embedModel: embedding,
collectionName: "jina_test",
});
const storageContext = await storageContextFromDefaults({ vectorStore });
@@ -1,5 +1,42 @@
# @llamaindex/autotool
## 8.0.26
### Patch Changes
- llamaindex@0.11.26
## 8.0.25
### Patch Changes
- Updated dependencies [049471b]
- llamaindex@0.11.25
## 8.0.24
### Patch Changes
- llamaindex@0.11.24
## 8.0.23
### Patch Changes
- llamaindex@0.11.23
## 8.0.22
### Patch Changes
- llamaindex@0.11.22
## 8.0.21
### Patch Changes
- llamaindex@0.11.21
## 8.0.20
### Patch Changes
@@ -1,5 +1,48 @@
# @llamaindex/autotool-01-node-example
## 0.0.134
### Patch Changes
- llamaindex@0.11.26
- @llamaindex/autotool@8.0.26
## 0.0.133
### Patch Changes
- Updated dependencies [049471b]
- llamaindex@0.11.25
- @llamaindex/autotool@8.0.25
## 0.0.132
### Patch Changes
- llamaindex@0.11.24
- @llamaindex/autotool@8.0.24
## 0.0.131
### Patch Changes
- llamaindex@0.11.23
- @llamaindex/autotool@8.0.23
## 0.0.130
### Patch Changes
- llamaindex@0.11.22
- @llamaindex/autotool@8.0.22
## 0.0.129
### Patch Changes
- llamaindex@0.11.21
- @llamaindex/autotool@8.0.21
## 0.0.128
### Patch Changes
@@ -13,5 +13,5 @@
"scripts": {
"start": "node --import tsx --import @llamaindex/autotool/node ./src/index.ts"
},
"version": "0.0.128"
"version": "0.0.134"
}
@@ -6,7 +6,7 @@
"url": "git+https://github.com/run-llama/LlamaIndexTS.git",
"directory": "packages/autotool"
},
"version": "8.0.20",
"version": "8.0.26",
"description": "auto transpile your JS function to LLM Agent compatible",
"files": [
"dist",
@@ -1,5 +1,40 @@
# @llamaindex/cloud
## 4.1.1
### Patch Changes
- 4b51791: Add deprecation to README
## 4.1.0
### Minor Changes
- 049471b: Add deprecation warning
## 4.0.28
### Patch Changes
- c3bf3c7: Adding support for citations to beta agent data schema
- Updated dependencies [f9f1de9]
- @llamaindex/core@0.6.19
## 4.0.27
### Patch Changes
- Updated dependencies [f29799e]
- Updated dependencies [7224c06]
- @llamaindex/core@0.6.18
## 4.0.26
### Patch Changes
- Updated dependencies [38da40b]
- @llamaindex/core@0.6.17
## 4.0.25
### Patch Changes
@@ -1,8 +1,9 @@
# @llamaindex/cloud
> LlamaCloud is a new generation of managed parsing, ingestion, and retrieval services, designed to bring production-grade context-augmentation to your LLM and RAG applications.
For more information, see the [API documentation](https://docs.cloud.llamaindex.ai/).
> [!WARNING]
> This package has been deprecated since version 4.1.0.
> Please migrate to [llama-cloud-services](https://www.npmjs.com/package/llama-cloud-services).
> See the documentation: https://docs.cloud.llamaindex.ai
## License
@@ -1,6 +1,6 @@
{
"name": "@llamaindex/cloud",
"version": "4.0.25",
"version": "4.1.1",
"type": "module",
"license": "MIT",
"scripts": {
@@ -1,3 +1,10 @@
// Deprecation warning
console.warn(`
The package @llamaindex/cloud has been deprecated since version 4.1.0
* Please migrate to llama-cloud-services.
* See the documentation: https://docs.cloud.llamaindex.ai
`);
import { client } from "./client/client.gen";
client.setConfig({
@@ -1,3 +1,10 @@
// Deprecation warning
console.warn(`
The package @llamaindex/cloud has been deprecated since version 4.1.0
* Please migrate to llama-cloud-services.
* See the documentation: https://docs.cloud.llamaindex.ai
`);
export { AgentClient, createAgentDataClient } from "./client";
export type {
@@ -28,6 +28,29 @@ export type ComparisonOperator =
*/
export type FilterOperation = RawFilterOperation;
/**
* Metadata for an extracted field, including confidence and citation information
*/
export interface ExtractedFieldMetadata {
/** The confidence score for the field, combined with parsing confidence if applicable */
confidence?: number;
/** The confidence score for the field based on the extracted text only */
extracted_confidence?: number;
/** The page number that the field occurred on */
page_number?: number;
/** The original text this field's value was derived from */
matching_text?: string;
}
/**
* Dictionary mapping field names to their metadata
* Values can be ExtractedFieldMetadata objects, nested dictionaries, or arrays
*/
export type ExtractedFieldMetadataDict = Record<
string,
ExtractedFieldMetadata | Record<string, unknown> | unknown[]
>;
/**
* Base extracted data interface
*/
@@ -35,11 +58,13 @@ export interface ExtractedData<T = unknown> {
/** The original data that was extracted from the document. For tracking changes. Should not be updated. */
original_data: T;
/** The latest state of the data. Will differ if data has been updated. */
data?: T;
data: T;
/** The status of the extracted data. Prefer to use the StatusType values, but any string is allowed. */
status: StatusType | string;
/** Confidence scores, if any, for each primitive field in the original_data data. */
confidence?: Record<string, unknown>;
/** The overall confidence score for the extracted data. */
overall_confidence?: number;
/** Page links, and perhaps eventually bounding boxes, for individual fields in the extracted data. */
field_metadata?: ExtractedFieldMetadataDict;
/** The ID of the file that was used to extract the data. */
file_id?: string;
/** The name of the file that was used to extract the data. */
@@ -1,3 +1,10 @@
// Deprecation warning
console.warn(`
The package @llamaindex/cloud has been deprecated since version 4.1.0
* Please migrate to llama-cloud-services.
* See the documentation: https://docs.cloud.llamaindex.ai
`);
import { createClient, createConfig } from "@hey-api/client-fetch";
import { createWorkflow, type InferWorkflowEventData } from "@llama-flow/core";
import { createStatefulMiddleware } from "@llama-flow/core/middleware/state";
@@ -0,0 +1,777 @@
# @llamaindex/community
## 0.0.101
### Patch Changes
- Updated dependencies [f9f1de9]
- @llamaindex/core@0.6.19
## 0.0.100
### Patch Changes
- Updated dependencies [f29799e]
- Updated dependencies [7224c06]
- @llamaindex/core@0.6.18
## 0.0.99
### Patch Changes
- c65a2dc: Deprecate community package and link to AWS package
## 0.0.98
### Patch Changes
- Updated dependencies [9b2e25a]
- @llamaindex/core@0.6.4
- @llamaindex/env@0.1.30
## 0.0.97
### Patch Changes
- Updated dependencies [3ee8c83]
- @llamaindex/core@0.6.3
## 0.0.96
### Patch Changes
- e9bf442: fix: update the tool call schema for nova
## 0.0.95
### Patch Changes
- 411dcea: Add Nova Premier to AWS Nova models. Add EU endpoints
## 0.0.94
### Patch Changes
- Updated dependencies [9c63f3f]
- @llamaindex/core@0.6.2
## 0.0.93
### Patch Changes
- Updated dependencies [1b6f368]
- Updated dependencies [eaf326e]
- @llamaindex/core@0.6.1
## 0.0.92
### Patch Changes
- 1325178: fix: stringify all tool results for anthropic on bedrock
## 0.0.91
### Patch Changes
- 5189b44: fix: add retry handling logic to parser reader and fix lint issues
- 3fd4cc3: feat: use google's new gen ai library to support multimodal output
- Updated dependencies [21bebfc]
- Updated dependencies [93bc0ff]
- Updated dependencies [91a18e7]
- Updated dependencies [5189b44]
- @llamaindex/core@0.6.0
## 0.0.90
### Patch Changes
- Updated dependencies [40ee761]
- @llamaindex/core@0.5.8
## 0.0.89
### Patch Changes
- Updated dependencies [4bac71d]
- @llamaindex/core@0.5.7
## 0.0.88
### Patch Changes
- e28c29d: Added Llama 3.3 70B Instruct support
- Updated dependencies [beb922b]
- @llamaindex/env@0.1.29
- @llamaindex/core@0.5.6
## 0.0.87
### Patch Changes
- Updated dependencies [5668970]
- @llamaindex/core@0.5.5
## 0.0.86
### Patch Changes
- Updated dependencies [ad3c7f1]
- @llamaindex/core@0.5.4
## 0.0.85
### Patch Changes
- 1914b52: Added Claude 3.7 Sonnet support
- Updated dependencies [cb021e7]
- @llamaindex/core@0.5.3
## 0.0.84
### Patch Changes
- Updated dependencies [d952e68]
- @llamaindex/core@0.5.2
## 0.0.83
### Patch Changes
- Updated dependencies [cc50c9c]
- @llamaindex/env@0.1.28
- @llamaindex/core@0.5.1
## 0.0.82
### Patch Changes
- Updated dependencies [6a4a737]
- Updated dependencies [d924c63]
- @llamaindex/core@0.5.0
## 0.0.81
### Patch Changes
- 1c908fd: Revert previous release (not working with CJS)
- Updated dependencies [1c908fd]
- @llamaindex/core@0.4.23
- @llamaindex/env@0.1.27
## 0.0.80
### Patch Changes
- cb608b5: fix: bundle output incorrect
- Updated dependencies [cb608b5]
- @llamaindex/core@0.4.22
- @llamaindex/env@0.1.26
## 0.0.79
### Patch Changes
- Updated dependencies [9456616]
- Updated dependencies [1931bbc]
- @llamaindex/core@0.4.21
## 0.0.78
### Patch Changes
- Updated dependencies [d211b7a]
- @llamaindex/core@0.4.20
## 0.0.77
### Patch Changes
- 24caf93: fix: added inference profile mapping for nova models
- Updated dependencies [a9b5b99]
- @llamaindex/core@0.4.19
## 0.0.76
### Patch Changes
- c1850ee: feat: Amazon Nova support via Bedrock
- Updated dependencies [b504303]
- Updated dependencies [e0f6cc3]
- @llamaindex/env@0.1.25
- @llamaindex/core@0.4.18
## 0.0.75
### Patch Changes
- Updated dependencies [3d1808b]
- @llamaindex/core@0.4.17
## 0.0.74
### Patch Changes
- 8be4589: chore: bump version
- Updated dependencies [8be4589]
- @llamaindex/core@0.4.16
- @llamaindex/env@0.1.24
## 0.0.73
### Patch Changes
- Updated dependencies [d2b2722]
- @llamaindex/env@0.1.23
- @llamaindex/core@0.4.15
## 0.0.72
### Patch Changes
- Updated dependencies [969365c]
- @llamaindex/env@0.1.22
- @llamaindex/core@0.4.14
## 0.0.71
### Patch Changes
- 90d265c: chore: bump version
- Updated dependencies [90d265c]
- @llamaindex/core@0.4.13
- @llamaindex/env@0.1.21
## 0.0.70
### Patch Changes
- Updated dependencies [ef4f63d]
- @llamaindex/core@0.4.12
## 0.0.69
### Patch Changes
- Updated dependencies [6d22fa2]
- @llamaindex/core@0.4.11
## 0.0.68
### Patch Changes
- Updated dependencies [a7b0ac3]
- Updated dependencies [c69605f]
- @llamaindex/core@0.4.10
## 0.0.67
### Patch Changes
- Updated dependencies [7ae6eaa]
- @llamaindex/core@0.4.9
## 0.0.66
### Patch Changes
- Updated dependencies [f865c98]
- @llamaindex/core@0.4.8
## 0.0.65
### Patch Changes
- Updated dependencies [d89ebe0]
- Updated dependencies [fd8c882]
- @llamaindex/core@0.4.7
## 0.0.64
### Patch Changes
- Updated dependencies [4fc001c]
- @llamaindex/env@0.1.20
- @llamaindex/core@0.4.6
## 0.0.63
### Patch Changes
- Updated dependencies [ad85bd0]
- @llamaindex/core@0.4.5
- @llamaindex/env@0.1.19
## 0.0.62
### Patch Changes
- Updated dependencies [a8d3fa6]
- @llamaindex/env@0.1.18
- @llamaindex/core@0.4.4
## 0.0.61
### Patch Changes
- 487782c: Add missing inference endpoints for Haiku 3.5
- Updated dependencies [95a5cc6]
- @llamaindex/core@0.4.3
## 0.0.60
### Patch Changes
- Updated dependencies [14cc9eb]
- @llamaindex/env@0.1.17
- @llamaindex/core@0.4.2
## 0.0.59
### Patch Changes
- 47a7c3e: feat: added support for Haiku 3.5 via Bedrock
## 0.0.58
### Patch Changes
- Updated dependencies [9c73f0a]
- @llamaindex/core@0.4.1
## 0.0.57
### Patch Changes
- Updated dependencies [359fd33]
- Updated dependencies [efb7e1b]
- Updated dependencies [98ba1e7]
- Updated dependencies [620c63c]
- @llamaindex/core@0.4.0
## 0.0.56
### Patch Changes
- Updated dependencies [60b185f]
- @llamaindex/core@0.3.7
## 0.0.55
### Patch Changes
- Updated dependencies [691c5bc]
- @llamaindex/core@0.3.6
## 0.0.54
### Patch Changes
- Updated dependencies [fa60fc6]
- @llamaindex/env@0.1.16
- @llamaindex/core@0.3.5
## 0.0.53
### Patch Changes
- Updated dependencies [e2a0876]
- @llamaindex/core@0.3.4
## 0.0.52
### Patch Changes
- a5a75f6: feat: added sonnet 3.5 v2
## 0.0.51
### Patch Changes
- Updated dependencies [0493f67]
- @llamaindex/core@0.3.3
## 0.0.50
### Patch Changes
- Updated dependencies [4ba2cfe]
- @llamaindex/env@0.1.15
- @llamaindex/core@0.3.2
## 0.0.49
### Patch Changes
- a75af83: refactor: move some llm and embedding to single package
- Updated dependencies [ae49ff4]
- Updated dependencies [a75af83]
- @llamaindex/env@0.1.14
- @llamaindex/core@0.3.1
## 0.0.48
### Patch Changes
- Updated dependencies [1364e8e]
- Updated dependencies [96fc69c]
- @llamaindex/core@0.3.0
## 0.0.47
### Patch Changes
- Updated dependencies [5f67820]
- @llamaindex/core@0.2.12
## 0.0.46
### Patch Changes
- Updated dependencies [ee697fb]
- @llamaindex/core@0.2.11
## 0.0.45
### Patch Changes
- Updated dependencies [3489e7d]
- Updated dependencies [468bda5]
- @llamaindex/core@0.2.10
## 0.0.44
### Patch Changes
- Updated dependencies [b17d439]
- @llamaindex/core@0.2.9
## 0.0.43
### Patch Changes
- 2774e80: feat: added meta3.2 support via Bedrock including vision, tool call and inference region support
## 0.0.42
### Patch Changes
- df441e2: fix: consoleLogger is missing from `@llamaindex/env`
- Updated dependencies [df441e2]
- @llamaindex/core@0.2.8
- @llamaindex/env@0.1.13
## 0.0.41
### Patch Changes
- Updated dependencies [6cce3b1]
- @llamaindex/core@0.2.7
## 0.0.40
### Patch Changes
- 50e6b57: feat: add Amazon Bedrock Retriever
- Updated dependencies [8b7fdba]
- @llamaindex/core@0.2.6
## 0.0.39
### Patch Changes
- Updated dependencies [d902cc3]
- @llamaindex/core@0.2.5
## 0.0.38
### Patch Changes
- Updated dependencies [b48bcc3]
- @llamaindex/core@0.2.4
- @llamaindex/env@0.1.12
## 0.0.37
### Patch Changes
- Updated dependencies [2cd1383]
- @llamaindex/core@0.2.3
## 0.0.36
### Patch Changes
- Updated dependencies [749b43a]
- @llamaindex/core@0.2.2
## 0.0.35
### Patch Changes
- Updated dependencies [ac07e3c]
- Updated dependencies [70ccb4a]
- Updated dependencies [1a6137b]
- Updated dependencies [ac07e3c]
- @llamaindex/core@0.2.1
- @llamaindex/env@0.1.11
## 0.0.34
### Patch Changes
- Updated dependencies [11feef8]
- @llamaindex/core@0.2.0
## 0.0.33
### Patch Changes
- Updated dependencies [711c814]
- @llamaindex/core@0.1.12
## 0.0.32
### Patch Changes
- Updated dependencies [4648da6]
- @llamaindex/env@0.1.10
- @llamaindex/core@0.1.11
## 0.0.31
### Patch Changes
- Updated dependencies [0148354]
- @llamaindex/core@0.1.10
## 0.0.30
### Patch Changes
- Updated dependencies [e27e7dd]
- @llamaindex/core@0.1.9
## 0.0.29
### Patch Changes
- 58abc57: fix: align version
- Updated dependencies [58abc57]
- @llamaindex/core@0.1.8
- @llamaindex/env@0.1.9
## 0.0.28
### Patch Changes
- Updated dependencies [04b2f8e]
- @llamaindex/core@0.1.7
## 0.0.27
### Patch Changes
- Updated dependencies [0452af9]
- @llamaindex/core@0.1.6
## 0.0.26
### Patch Changes
- 224d507: fix: prevent tool calling getting mixed with conversation
- 376d29a: feat: added tool calling and agent support for llama3.1 405B
## 0.0.25
### Patch Changes
- Updated dependencies [91d02a4]
- @llamaindex/core@0.1.5
## 0.0.24
### Patch Changes
- 3d9a802: feat: added llama 3.1
- Updated dependencies [15962b3]
- @llamaindex/core@0.1.4
## 0.0.23
### Patch Changes
- Updated dependencies [6cf6ae6]
- @llamaindex/core@0.1.3
## 0.0.22
### Patch Changes
- Updated dependencies [b974eea]
- @llamaindex/core@0.1.2
## 0.0.21
### Patch Changes
- Updated dependencies [b3681bf]
- @llamaindex/core@0.1.1
## 0.0.20
### Patch Changes
- 56746c2: fix: llama3 patched to handle empty content (can happen with system) and added max tokens export
## 0.0.19
### Patch Changes
- 16ef5dd: refactor: depends on core package instead of llamaindex
- Updated dependencies [16ef5dd]
- Updated dependencies [16ef5dd]
- @llamaindex/core@0.1.0
## 0.0.18
### Patch Changes
- llamaindex@0.4.14
## 0.0.17
### Patch Changes
- Updated dependencies [e8f8bea]
- Updated dependencies [304484b]
- llamaindex@0.4.13
## 0.0.16
### Patch Changes
- f326ab8: chore: bump version
- Updated dependencies [f326ab8]
- llamaindex@0.4.12
## 0.0.15
### Patch Changes
- Updated dependencies [8bf5b4a]
- llamaindex@0.4.11
## 0.0.14
### Patch Changes
- Updated dependencies [7dce3d2]
- llamaindex@0.4.10
## 0.0.13
### Patch Changes
- Updated dependencies [3a96a48]
- llamaindex@0.4.9
## 0.0.12
### Patch Changes
- Updated dependencies [83ebdfb]
- llamaindex@0.4.8
## 0.0.11
### Patch Changes
- Updated dependencies [41fe871]
- Updated dependencies [321c39d]
- Updated dependencies [f7f1af0]
- llamaindex@0.4.7
## 0.0.10
### Patch Changes
- Updated dependencies [1feb23b]
- Updated dependencies [08c55ec]
- llamaindex@0.4.6
## 0.0.9
### Patch Changes
- Updated dependencies [6c3e5d0]
- llamaindex@0.4.5
## 0.0.8
### Patch Changes
- Updated dependencies [42eb73a]
- llamaindex@0.4.4
## 0.0.7
### Patch Changes
- Updated dependencies [2ef62a9]
- llamaindex@0.4.3
## 0.0.6
### Patch Changes
- a87a4d1: feat: added tool calling support for Bedrock's Claude and general llm support for agents
- Updated dependencies [a87a4d1]
- Updated dependencies [0730140]
- llamaindex@0.4.2
## 0.0.5
### Patch Changes
- ed467a9: Add model ids for Anthropic Claude 3.5 Sonnet model on Anthropic and Bedrock
- Updated dependencies [3c47910]
- Updated dependencies [ed467a9]
- Updated dependencies [cba5406]
- llamaindex@0.4.1
## 0.0.4
### Patch Changes
- b1a4a74: docs: updated Bedrock Opus region and added a basic README
- Updated dependencies [436bc41]
- Updated dependencies [a44e54f]
- Updated dependencies [a51ed8d]
- Updated dependencies [d3b635b]
- llamaindex@0.4.0
## 0.0.3
### Patch Changes
- Updated dependencies [6bc5bdd]
- Updated dependencies [bf25ff6]
- Updated dependencies [e6d6576]
- llamaindex@0.3.17
## 0.0.2
### Patch Changes
- 8832669: Community bedrock support added
- Updated dependencies [11ae926]
- Updated dependencies [631f000]
- Updated dependencies [1378ec4]
- Updated dependencies [6b1ded4]
- Updated dependencies [4d4bd85]
- Updated dependencies [24a9d1e]
- Updated dependencies [45952de]
- Updated dependencies [54230f0]
- Updated dependencies [a29d835]
- Updated dependencies [73819bf]
- llamaindex@0.3.16
@@ -0,0 +1,17 @@
# @llamaindex/community
AWS package for LlamaIndexTS. This package is deprecated; use [@llamaindex/aws](https://www.npmjs.com/package/@llamaindex/aws) instead.
## Current Features:
- Bedrock support for Amazon Nova models Pro, Lite and Micro
- Bedrock support for the Anthropic Claude Models [usage](https://ts.llamaindex.ai/docs/llamaindex/modules/llms/bedrock) including the latest Sonnet 3.5 v2 and Haiku 3.5
- Bedrock support for the Meta Llama 2, 3, 3.1 and 3.2 models [usage](https://ts.llamaindex.ai/docs/llamaindex/modules/llms/bedrock)
- Meta Llama 3.1 405B and Llama 3.2 tool call support
- Meta Llama 3.2 11B and 90B vision support
- Bedrock support for querying Knowledge Base
- Bedrock: [Supported Regions and models for cross-region inference](https://docs.aws.amazon.com/bedrock/latest/userguide/cross-region-inference-support.html)
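
Cross-region inference uses region-prefixed profile IDs (for example `us.` or `eu.`) that correspond to a base Bedrock model ID. The sketch below illustrates that lookup in isolation. It assumes a tiny hand-copied subset of IDs, and `resolveBaseModel` is a hypothetical helper named here for illustration only; the package itself exports `INFERENCE_BEDROCK_MODELS` and `INFERENCE_TO_BEDROCK_MAP` for the full table.

```typescript
// Illustrative subset only: the real package ships a much larger table.
const BASE_MODELS = {
  ANTHROPIC_CLAUDE_3_HAIKU: "anthropic.claude-3-haiku-20240307-v1:0",
  AMAZON_NOVA_PRO_1: "amazon.nova-pro-v1:0",
} as const;

const INFERENCE_PROFILES = {
  US_ANTHROPIC_CLAUDE_3_HAIKU: "us.anthropic.claude-3-haiku-20240307-v1:0",
  EU_AMAZON_NOVA_PRO_1: "eu.amazon.nova-pro-v1:0",
} as const;

// Inference profile ID -> base model ID.
const PROFILE_TO_BASE = new Map<string, string>([
  [
    INFERENCE_PROFILES.US_ANTHROPIC_CLAUDE_3_HAIKU,
    BASE_MODELS.ANTHROPIC_CLAUDE_3_HAIKU,
  ],
  [INFERENCE_PROFILES.EU_AMAZON_NOVA_PRO_1, BASE_MODELS.AMAZON_NOVA_PRO_1],
]);

// Hypothetical helper (not a package export): known profile IDs resolve to
// their base model; any other ID is returned unchanged.
function resolveBaseModel(modelId: string): string {
  return PROFILE_TO_BASE.get(modelId) ?? modelId;
}
```

Keeping the lookup as an explicit table, rather than stripping the region prefix from the string, means only profiles that are known to exist are treated as cross-region IDs.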
## LICENSE
MIT
@@ -0,0 +1,53 @@
{
"name": "@llamaindex/community",
"description": "Community package for LlamaIndexTS",
"version": "0.0.101",
"type": "module",
"types": "dist/type/index.d.ts",
"main": "dist/cjs/index.js",
"exports": {
".": {
"import": {
"types": "./dist/type/index.d.ts",
"default": "./dist/index.js"
},
"require": {
"types": "./dist/type/index.d.ts",
"default": "./dist/index.cjs"
}
},
"./llm/bedrock": {
"import": {
"types": "./dist/type/llm/bedrock.d.ts",
"default": "./dist/llm/bedrock/index.js"
},
"require": {
"types": "./dist/type/llm/bedrock.d.ts",
"default": "./dist/llm/bedrock/index.cjs"
}
}
},
"files": [
"dist",
"CHANGELOG.md",
"!**/*.tsbuildinfo"
],
"repository": {
"type": "git",
"url": "git+https://github.com/run-llama/LlamaIndexTS.git",
"directory": "packages/community"
},
"scripts": {
"build": "bunchee",
"dev": "bunchee --watch"
},
"devDependencies": {
"@types/node": "^22.9.0"
},
"dependencies": {
"@aws-sdk/client-bedrock-agent-runtime": "^3.706.0",
"@aws-sdk/client-bedrock-runtime": "^3.706.0",
"@llamaindex/core": "workspace:*",
"@llamaindex/env": "workspace:*"
}
}
@@ -0,0 +1,8 @@
export {
BEDROCK_MODELS,
BEDROCK_MODEL_MAX_TOKENS,
Bedrock,
INFERENCE_BEDROCK_MODELS,
INFERENCE_TO_BEDROCK_MAP,
} from "./llm/bedrock/index.js";
export { AmazonKnowledgeBaseRetriever } from "./retrievers/bedrock.js";
@@ -0,0 +1,134 @@
import type {
ContentBlockDelta,
ConverseOutput,
ConverseRequest,
ConverseResponse,
ConverseStreamOutput,
InvokeModelCommandInput,
InvokeModelWithResponseStreamCommandInput,
ResponseStream,
} from "@aws-sdk/client-bedrock-runtime";
import type {
BaseTool,
ChatMessage,
LLMMetadata,
ToolCall,
ToolCallLLMMessageOptions,
} from "@llamaindex/core/llms";
import { toUtf8 } from "../utils";
import { Provider, type BedrockChatStreamResponse } from "../provider";
import {
mapBaseToolsToAmazonTools,
mapChatMessagesToAmazonMessages,
} from "./utils";
export class AmazonProvider extends Provider<ConverseStreamOutput> {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
getResultFromResponse(response: Record<string, any>): ConverseResponse {
return JSON.parse(toUtf8(response.body));
}
getToolsFromResponse<ToolContent>(response: ConverseOutput): ToolContent[] {
return (
response.message?.content
?.filter((item) => item.toolUse)
.map(
(item) =>
({
id: item.toolUse!.toolUseId,
name: item.toolUse!.name,
input: item.toolUse!.input
? JSON.parse(item.toolUse!.input as string)
: "",
}) as ToolContent,
) ?? []
);
}
getTextFromResponse(response: ConverseResponse): string {
const result = this.getResultFromResponse(response);
const content = result.output?.message?.content ?? [];
return content.map((item) => item.text).join(" ");
}
getTextFromStreamResponse(response: ResponseStream): string {
const event: ConverseStreamOutput | undefined =
this.getStreamingEventResponse(response);
if (!event || !event.contentBlockDelta) return "";
const delta: ContentBlockDelta | undefined = event.contentBlockDelta.delta;
return delta?.text || "";
}
async *reduceStream(
stream: AsyncIterable<ResponseStream>,
): BedrockChatStreamResponse {
let toolId: string | undefined = undefined;
let toolName: string | undefined = undefined;
for await (const response of stream) {
const event = this.getStreamingEventResponse(response);
const delta = this.getTextFromStreamResponse(response);
let options: undefined | ToolCallLLMMessageOptions = undefined;
if (event?.contentBlockStart && event.contentBlockStart.start?.toolUse) {
toolId = event.contentBlockStart.start?.toolUse.toolUseId;
toolName = event.contentBlockStart.start?.toolUse.name;
continue;
}
if (
toolId &&
toolName &&
event?.contentBlockDelta?.delta?.toolUse?.input
) {
options = {
toolCall: [
{
id: toolId,
name: toolName,
input: JSON.parse(event?.contentBlockDelta?.delta?.toolUse.input),
} as ToolCall,
],
};
toolId = undefined;
toolName = undefined;
}
if (!delta && !options) continue;
yield {
delta: options ? "" : delta,
options,
raw: response,
};
}
}
getRequestBody<T extends ChatMessage>(
metadata: LLMMetadata,
messages: T[],
tools: BaseTool[] = [],
options: Omit<ConverseRequest, "modelId" | "messages" | "inferenceConfig">,
): InvokeModelCommandInput | InvokeModelWithResponseStreamCommandInput {
const request: Omit<ConverseRequest, "modelId"> = {
...options,
messages: mapChatMessagesToAmazonMessages(messages),
inferenceConfig: {
maxTokens: metadata.maxTokens,
temperature: metadata.temperature,
topP: metadata.topP,
},
};
if (tools.length) {
request.toolConfig = {
tools: mapBaseToolsToAmazonTools(tools),
};
}
return {
modelId: metadata.model,
contentType: "application/json",
accept: "application/json",
body: JSON.stringify(request),
};
}
}
@@ -0,0 +1,5 @@
import type { ConverseRequest, Message } from "@aws-sdk/client-bedrock-runtime";
export type AmazonMessages = ConverseRequest["messages"];
export type AmazonMessage = Message;
@@ -0,0 +1,141 @@
import type {
ImageBlock,
ImageFormat,
Message,
Tool,
} from "@aws-sdk/client-bedrock-runtime";
import type {
BaseTool,
ChatMessage,
MessageContentDetail,
ToolCallLLMMessageOptions,
} from "@llamaindex/core/llms";
import { extractDataUrlComponents } from "../utils";
import type { JSONObject } from "@llamaindex/core/global";
import { mapMessageContentToMessageContentDetails } from "../../utils";
import type { AmazonMessage, AmazonMessages } from "./types";
const ACCEPTED_IMAGE_MIME_TYPES = [
"image/jpeg",
"image/png",
"image/webp",
"image/gif",
] as const;
const ACCEPTED_IMAGE_MIME_TYPE_FORMAT_MAP: Record<
(typeof ACCEPTED_IMAGE_MIME_TYPES)[number],
ImageFormat
> = {
"image/jpeg": "jpeg",
"image/png": "png",
"image/webp": "webp",
"image/gif": "gif",
};
export const mapImageContent = (imageUrl: string): ImageBlock => {
if (!imageUrl.startsWith("data:"))
throw new Error(
"For Amazon please only use base64 data url, e.g.: data:image/jpeg;base64,SGVsbG8sIFdvcmxkIQ==",
);
const { mimeType, base64: data } = extractDataUrlComponents(imageUrl);
if (
!ACCEPTED_IMAGE_MIME_TYPES.includes(
mimeType as keyof typeof ACCEPTED_IMAGE_MIME_TYPE_FORMAT_MAP,
)
)
throw new Error(
`Amazon only accepts the following mimeTypes: ${ACCEPTED_IMAGE_MIME_TYPES.join("\n")}`,
);
return {
format:
ACCEPTED_IMAGE_MIME_TYPE_FORMAT_MAP[
mimeType as keyof typeof ACCEPTED_IMAGE_MIME_TYPE_FORMAT_MAP
],
// @ts-expect-error: there's a mistake in the "@aws-sdk/client-bedrock-runtime" compared to the actual api
source: { bytes: data },
};
};
export const mapMessageContentDetailToAmazonContent = <
T extends MessageContentDetail,
>(
detail: T,
): Message["content"] => {
let content: Message["content"] = [];
if (detail.type === "text") {
content = [{ text: detail.text }];
} else if (detail.type === "image_url") {
content = [{ image: mapImageContent(detail.image_url.url) }];
} else {
throw new Error("Unsupported content detail type");
}
return content;
};
export const mapChatMessagesToAmazonMessages = <
T extends ChatMessage<ToolCallLLMMessageOptions>,
>(
messages: T[],
): AmazonMessages => {
return messages.flatMap((msg: T): AmazonMessage[] => {
return mapMessageContentToMessageContentDetails(msg.content).map(
(detail: MessageContentDetail): AmazonMessage => {
if (msg.options && "toolCall" in msg.options) {
return {
role: "assistant",
content: msg.options.toolCall.map((call) => ({
toolUse: {
toolUseId: call.id,
name: call.name,
input: call.input as JSONObject,
},
})),
};
}
if (msg.options && "toolResult" in msg.options) {
return {
role: "user",
content: [
{
toolResult: {
toolUseId: msg.options.toolResult.id,
content: [
{
text: msg.options.toolResult.result,
},
],
},
},
],
};
}
return {
role: msg.role === "assistant" ? "assistant" : "user",
content: mapMessageContentDetailToAmazonContent(detail),
};
},
);
});
};
export const mapBaseToolsToAmazonTools = (tools?: BaseTool[]): Tool[] => {
if (!tools) return [];
return tools.map((tool: BaseTool) => {
const {
metadata: { parameters, ...options },
} = tool;
return {
toolSpec: {
...options,
inputSchema: {
json: parameters,
},
},
} as Tool;
});
};
@@ -0,0 +1,156 @@
import {
type InvokeModelCommandInput,
type InvokeModelWithResponseStreamCommandInput,
ResponseStream,
} from "@aws-sdk/client-bedrock-runtime";
import type {
BaseTool,
ChatMessage,
LLMMetadata,
PartialToolCall,
ToolCall,
ToolCallLLMMessageOptions,
} from "@llamaindex/core/llms";
import { type BedrockChatStreamResponse, Provider } from "../provider";
import { toUtf8 } from "../utils";
import type {
AnthropicAdditionalChatOptions,
AnthropicNoneStreamingResponse,
AnthropicStreamEvent,
AnthropicTextContent,
ToolBlock,
} from "./types";
import {
mapBaseToolsToAnthropicTools,
mapChatMessagesToAnthropicMessages,
} from "./utils";
export class AnthropicProvider extends Provider<AnthropicStreamEvent> {
getResultFromResponse(
// eslint-disable-next-line @typescript-eslint/no-explicit-any
response: Record<string, any>,
): AnthropicNoneStreamingResponse {
return JSON.parse(toUtf8(response.body));
}
getToolsFromResponse<AnthropicToolContent>(
// eslint-disable-next-line @typescript-eslint/no-explicit-any
response: Record<string, any>,
): AnthropicToolContent[] {
const result = this.getResultFromResponse(response);
return result.content
.filter((item) => item.type === "tool_use")
.map((item) => item as AnthropicToolContent);
}
// eslint-disable-next-line @typescript-eslint/no-explicit-any
getTextFromResponse(response: Record<string, any>): string {
const result = this.getResultFromResponse(response);
return result.content
.filter((item) => item.type === "text")
.map((item) => (item as AnthropicTextContent).text)
.join(" ");
}
// eslint-disable-next-line @typescript-eslint/no-explicit-any
getTextFromStreamResponse(response: Record<string, any>): string {
const event = this.getStreamingEventResponse(response);
if (event?.type === "content_block_delta") {
if (event.delta.type === "text_delta") return event.delta.text;
if (event.delta.type === "input_json_delta")
return event.delta.partial_json;
}
return "";
}
async *reduceStream(
stream: AsyncIterable<ResponseStream>,
): BedrockChatStreamResponse {
// Accumulates the partial JSON fragments of the tool input as they stream in
let collecting: string[] = [];
let tool: ToolBlock | undefined = undefined;
// TODO: this should be broken down into a separate consumer
for await (const response of stream) {
const delta = this.getTextFromStreamResponse(response);
const event = this.getStreamingEventResponse(response);
if (
event?.type === "content_block_start" &&
event.content_block.type === "tool_use"
) {
tool = event.content_block;
continue;
}
if (
event?.type === "content_block_delta" &&
event.delta.type === "input_json_delta"
) {
collecting.push(event.delta.partial_json);
}
let options: undefined | ToolCallLLMMessageOptions = undefined;
if (tool && collecting.length) {
const input = collecting.filter((item) => item).join("");
// We have all we need to parse the tool_use json
if (event?.type === "content_block_stop") {
options = {
toolCall: [
{
id: tool.id,
name: tool.name,
input: JSON.parse(input),
} as ToolCall,
],
};
// reset the collection/tool
collecting = [];
tool = undefined;
} else {
options = {
toolCall: [
{
id: tool.id,
name: tool.name,
input,
} as PartialToolCall,
],
};
}
}
if (!delta && !options) continue;
yield {
delta: options ? "" : delta,
options,
raw: response,
};
}
}
getRequestBody<T extends ChatMessage<ToolCallLLMMessageOptions>>(
metadata: LLMMetadata,
messages: T[],
tools?: BaseTool[],
options?: AnthropicAdditionalChatOptions,
): InvokeModelCommandInput | InvokeModelWithResponseStreamCommandInput {
const extra: Record<string, unknown> = {};
if (options?.toolChoice) {
extra["tool_choice"] = options?.toolChoice;
}
const mapped = mapChatMessagesToAnthropicMessages(messages);
return {
modelId: metadata.model,
contentType: "application/json",
accept: "application/json",
body: JSON.stringify({
anthropic_version: "bedrock-2023-05-31",
messages: mapped,
tools: mapBaseToolsToAnthropicTools(tools),
max_tokens: metadata.maxTokens,
temperature: metadata.temperature,
top_p: metadata.topP,
...extra,
}),
};
}
}
@@ -0,0 +1,161 @@
import type { ToolMetadata } from "@llamaindex/core/llms";
import type { InvocationMetrics } from "../types";
export type ToolChoice =
| { type: "any" }
| { type: "auto" }
| { type: "tool"; name: string };
export interface ThinkingConfigDisabled {
type: "disabled";
}
export interface ThinkingConfigEnabled {
budget_tokens: number;
type: "enabled";
}
export type AnthropicAdditionalChatOptions = {
toolChoice: ToolChoice;
thinking?: ThinkingConfigDisabled | ThinkingConfigEnabled;
};
type Usage = {
input_tokens: number;
output_tokens: number;
};
type Message = {
id: string;
type: string;
role: string;
content: string[];
model: string;
stop_reason: string | null;
stop_sequence: string | null;
usage: Usage;
};
export type ToolBlock = {
id: string;
input: unknown;
name: string;
type: "tool_use";
};
export type TextBlock = {
type: "text";
text: string;
};
type ContentBlockStart = {
type: "content_block_start";
index: number;
content_block: ToolBlock | TextBlock;
};
type Delta =
| {
type: "text_delta";
text: string;
}
| {
type: "input_json_delta";
partial_json: string;
};
type ContentBlockDelta = {
type: "content_block_delta";
index: number;
delta: Delta;
};
type ContentBlockStop = {
type: "content_block_stop";
index: number;
};
type MessageDelta = {
type: "message_delta";
delta: {
stop_reason: string;
stop_sequence: string | null;
};
usage: Usage;
};
export type MessageStop = {
type: "message_stop";
"amazon-bedrock-invocationMetrics": InvocationMetrics;
};
export type AnthropicStreamEvent =
| { type: "message_start"; message: Message }
| ContentBlockStart
| ContentBlockDelta
| ContentBlockStop
| MessageDelta
| MessageStop;
export type AnthropicContent =
| AnthropicTextContent
| AnthropicImageContent
| AnthropicToolContent
| AnthropicToolResultContent;
export type AnthropicTextContent = {
type: "text";
text: string;
};
export type AnthropicToolContent = {
type: "tool_use";
id: string;
name: string;
input: Record<string, unknown>;
};
export type AnthropicToolResultContent = {
type: "tool_result";
tool_use_id: string;
content: string;
};
export type AnthropicMediaTypes =
| "image/jpeg"
| "image/png"
| "image/webp"
| "image/gif";
export type AnthropicImageSource = {
type: "base64";
media_type: AnthropicMediaTypes;
data: string; // base64 encoded image bytes
};
export type AnthropicImageContent = {
type: "image";
source: AnthropicImageSource;
};
export type AnthropicMessage = {
role: "user" | "assistant";
content: AnthropicContent[];
};
export type AnthropicNoneStreamingResponse = {
id: string;
type: "message";
role: "assistant";
content: AnthropicContent[];
model: string;
stop_reason: "end_turn" | "max_tokens" | "stop_sequence";
stop_sequence?: string;
usage: { input_tokens: number; output_tokens: number };
};
export type AnthropicTool = {
name: string;
description: string;
input_schema: ToolMetadata["parameters"];
};
@@ -0,0 +1,166 @@
import type { JSONObject } from "@llamaindex/core/global";
import type {
BaseTool,
ChatMessage,
MessageContent,
MessageContentDetail,
ToolCallLLMMessageOptions,
} from "@llamaindex/core/llms";
import { mapMessageContentToMessageContentDetails } from "../../utils";
import { extractDataUrlComponents } from "../utils";
import type {
AnthropicContent,
AnthropicImageContent,
AnthropicMediaTypes,
AnthropicMessage,
AnthropicTextContent,
AnthropicTool,
} from "./types.js";
const ACCEPTED_IMAGE_MIME_TYPES = [
"image/jpeg",
"image/png",
"image/webp",
"image/gif",
];
export const mergeNeighboringSameRoleMessages = (
messages: AnthropicMessage[],
): AnthropicMessage[] => {
return messages.reduce(
(result: AnthropicMessage[], current: AnthropicMessage, index: number) => {
if (index > 0 && messages[index - 1]!.role === current.role) {
result[result.length - 1]!.content = [
...result[result.length - 1]!.content,
...current.content,
];
} else {
result.push(current);
}
return result;
},
[],
);
};
export const mapMessageContentDetailToAnthropicContent = <
T extends MessageContentDetail,
>(
detail: T,
): AnthropicContent => {
let content: AnthropicContent;
if (detail.type === "text") {
content = mapTextContent(detail.text);
} else if (detail.type === "image_url") {
content = mapImageContent(detail.image_url.url);
} else {
throw new Error("Unsupported content detail type");
}
return content;
};
export const mapMessageContentToAnthropicContent = <T extends MessageContent>(
content: T,
): AnthropicContent[] => {
return mapMessageContentToMessageContentDetails(content).map(
mapMessageContentDetailToAnthropicContent,
);
};
export const mapBaseToolsToAnthropicTools = (
tools?: BaseTool[],
): AnthropicTool[] => {
if (!tools) return [];
return tools.map((tool: BaseTool) => {
const {
metadata: { parameters, ...options },
} = tool;
return {
...options,
input_schema: parameters,
};
});
};
export const mapChatMessagesToAnthropicMessages = <
T extends ChatMessage<ToolCallLLMMessageOptions>,
>(
messages: T[],
): AnthropicMessage[] => {
const mapped = messages
.flatMap((msg: T): AnthropicMessage[] => {
if (msg.options && "toolCall" in msg.options) {
return [
{
role: "assistant",
content: msg.options.toolCall.map((call) => ({
type: "tool_use",
id: call.id,
name: call.name,
input: call.input as JSONObject,
})),
},
];
}
if (msg.options && "toolResult" in msg.options) {
return [
{
role: "user",
content: [
{
type: "tool_result",
tool_use_id: msg.options.toolResult.id,
content: JSON.stringify(msg.options.toolResult.result),
},
],
},
];
}
return mapMessageContentToMessageContentDetails(msg.content).map(
(detail: MessageContentDetail): AnthropicMessage => {
const content = mapMessageContentDetailToAnthropicContent(detail);
return {
role: msg.role === "assistant" ? "assistant" : "user",
content: [content],
};
},
);
})
.filter((message: AnthropicMessage) => {
const content = message.content[0]!;
if (content.type === "text" && !content.text) return false;
if (content.type === "image" && !content.source.data) return false;
if (content.type === "image" && message.role === "assistant")
return false;
return true;
});
return mergeNeighboringSameRoleMessages(mapped);
};
export const mapTextContent = (text: string): AnthropicTextContent => {
return { type: "text", text };
};
export const mapImageContent = (imageUrl: string): AnthropicImageContent => {
if (!imageUrl.startsWith("data:"))
throw new Error(
"For Anthropic please only use base64 data url, e.g.: data:image/jpeg;base64,SGVsbG8sIFdvcmxkIQ==",
);
const { mimeType, base64: data } = extractDataUrlComponents(imageUrl);
if (!ACCEPTED_IMAGE_MIME_TYPES.includes(mimeType))
throw new Error(
`Anthropic only accepts the following mimeTypes: ${ACCEPTED_IMAGE_MIME_TYPES.join("\n")}`,
);
return {
type: "image",
source: {
type: "base64",
media_type: mimeType as AnthropicMediaTypes,
data,
},
};
};
@@ -0,0 +1,513 @@
import {
BedrockRuntimeClient,
type BedrockRuntimeClientConfig,
InvokeModelCommand,
InvokeModelWithResponseStreamCommand,
} from "@aws-sdk/client-bedrock-runtime";
import {
type ChatMessage,
type ChatResponse,
type CompletionResponse,
type LLMChatParamsNonStreaming,
type LLMChatParamsStreaming,
type LLMCompletionParamsNonStreaming,
type LLMCompletionParamsStreaming,
type LLMMetadata,
ToolCallLLM,
type ToolCallLLMMessageOptions,
} from "@llamaindex/core/llms";
import { streamConverter } from "@llamaindex/core/utils";
import {
type BedrockAdditionalChatOptions,
type BedrockChatStreamResponse,
Provider,
} from "./provider";
import { wrapLLMEvent } from "@llamaindex/core/decorator";
import { mapMessageContentToMessageContentDetails } from "../utils";
import { AmazonProvider } from "./amazon/provider";
import { AnthropicProvider } from "./anthropic/provider";
import { MetaProvider } from "./meta/provider";
// Other providers should go here
export const PROVIDERS: { [key: string]: Provider } = {
anthropic: new AnthropicProvider(),
meta: new MetaProvider(),
amazon: new AmazonProvider(),
};
export type BedrockChatParamsStreaming = LLMChatParamsStreaming<
BedrockAdditionalChatOptions,
ToolCallLLMMessageOptions
>;
export type BedrockChatParamsNonStreaming = LLMChatParamsNonStreaming<
BedrockAdditionalChatOptions,
ToolCallLLMMessageOptions
>;
export type BedrockChatNonStreamResponse =
ChatResponse<ToolCallLLMMessageOptions>;
export const BEDROCK_MODELS = {
AMAZON_TITAN_TG1_LARGE: "amazon.titan-tg1-large",
AMAZON_TITAN_TEXT_EXPRESS_V1: "amazon.titan-text-express-v1",
AI21_J2_GRANDE_INSTRUCT: "ai21.j2-grande-instruct",
AI21_J2_JUMBO_INSTRUCT: "ai21.j2-jumbo-instruct",
AI21_J2_MID: "ai21.j2-mid",
AI21_J2_MID_V1: "ai21.j2-mid-v1",
AI21_J2_ULTRA: "ai21.j2-ultra",
AI21_J2_ULTRA_V1: "ai21.j2-ultra-v1",
COHERE_COMMAND_TEXT_V14: "cohere.command-text-v14",
ANTHROPIC_CLAUDE_INSTANT_1: "anthropic.claude-instant-v1",
ANTHROPIC_CLAUDE_1: "anthropic.claude-v1", // EOL: No longer supported
ANTHROPIC_CLAUDE_2: "anthropic.claude-v2",
ANTHROPIC_CLAUDE_2_1: "anthropic.claude-v2:1",
ANTHROPIC_CLAUDE_3_SONNET: "anthropic.claude-3-sonnet-20240229-v1:0",
ANTHROPIC_CLAUDE_3_HAIKU: "anthropic.claude-3-haiku-20240307-v1:0",
ANTHROPIC_CLAUDE_3_OPUS: "anthropic.claude-3-opus-20240229-v1:0",
ANTHROPIC_CLAUDE_3_5_SONNET: "anthropic.claude-3-5-sonnet-20240620-v1:0",
ANTHROPIC_CLAUDE_3_5_SONNET_V2: "anthropic.claude-3-5-sonnet-20241022-v2:0",
ANTHROPIC_CLAUDE_3_5_HAIKU: "anthropic.claude-3-5-haiku-20241022-v1:0",
ANTHROPIC_CLAUDE_3_7_SONNET: "anthropic.claude-3-7-sonnet-20250219-v1:0",
META_LLAMA2_13B_CHAT: "meta.llama2-13b-chat-v1",
META_LLAMA2_70B_CHAT: "meta.llama2-70b-chat-v1",
META_LLAMA3_8B_INSTRUCT: "meta.llama3-8b-instruct-v1:0",
META_LLAMA3_70B_INSTRUCT: "meta.llama3-70b-instruct-v1:0",
META_LLAMA3_1_8B_INSTRUCT: "meta.llama3-1-8b-instruct-v1:0",
META_LLAMA3_1_70B_INSTRUCT: "meta.llama3-1-70b-instruct-v1:0",
META_LLAMA3_1_405B_INSTRUCT: "meta.llama3-1-405b-instruct-v1:0",
META_LLAMA3_2_1B_INSTRUCT: "meta.llama3-2-1b-instruct-v1:0",
META_LLAMA3_2_3B_INSTRUCT: "meta.llama3-2-3b-instruct-v1:0",
META_LLAMA3_2_11B_INSTRUCT: "meta.llama3-2-11b-instruct-v1:0",
META_LLAMA3_2_90B_INSTRUCT: "meta.llama3-2-90b-instruct-v1:0",
META_LLAMA3_3_70B_INSTRUCT: "meta.llama3-3-70b-instruct-v1:0",
MISTRAL_7B_INSTRUCT: "mistral.mistral-7b-instruct-v0:2",
MISTRAL_MIXTRAL_7B_INSTRUCT: "mistral.mixtral-8x7b-instruct-v0:1",
MISTRAL_MIXTRAL_LARGE_2402: "mistral.mistral-large-2402-v1:0",
AMAZON_NOVA_PREMIER_1: "amazon.nova-premier-v1:0",
AMAZON_NOVA_PRO_1: "amazon.nova-pro-v1:0",
AMAZON_NOVA_LITE_1: "amazon.nova-lite-v1:0",
AMAZON_NOVA_MICRO_1: "amazon.nova-micro-v1:0",
};
export type BEDROCK_MODELS =
(typeof BEDROCK_MODELS)[keyof typeof BEDROCK_MODELS];
export const INFERENCE_BEDROCK_MODELS = {
US_ANTHROPIC_CLAUDE_3_HAIKU: "us.anthropic.claude-3-haiku-20240307-v1:0",
US_ANTHROPIC_CLAUDE_3_5_HAIKU: "us.anthropic.claude-3-5-haiku-20241022-v1:0",
US_ANTHROPIC_CLAUDE_3_OPUS: "us.anthropic.claude-3-opus-20240229-v1:0",
US_ANTHROPIC_CLAUDE_3_SONNET: "us.anthropic.claude-3-sonnet-20240229-v1:0",
US_ANTHROPIC_CLAUDE_3_5_SONNET:
"us.anthropic.claude-3-5-sonnet-20240620-v1:0",
US_ANTHROPIC_CLAUDE_3_5_SONNET_V2:
"us.anthropic.claude-3-5-sonnet-20241022-v2:0",
US_ANTHROPIC_CLAUDE_3_7_SONNET:
"us.anthropic.claude-3-7-sonnet-20250219-v1:0",
US_META_LLAMA_3_2_1B_INSTRUCT: "us.meta.llama3-2-1b-instruct-v1:0",
US_META_LLAMA_3_2_3B_INSTRUCT: "us.meta.llama3-2-3b-instruct-v1:0",
US_META_LLAMA_3_2_11B_INSTRUCT: "us.meta.llama3-2-11b-instruct-v1:0",
US_META_LLAMA_3_2_90B_INSTRUCT: "us.meta.llama3-2-90b-instruct-v1:0",
US_META_LLAMA_3_3_70B_INSTRUCT: "us.meta.llama3-3-70b-instruct-v1:0",
US_AMAZON_NOVA_PREMIER_1: "us.amazon.nova-premier-v1:0",
US_AMAZON_NOVA_PRO_1: "us.amazon.nova-pro-v1:0",
US_AMAZON_NOVA_LITE_1: "us.amazon.nova-lite-v1:0",
US_AMAZON_NOVA_MICRO_1: "us.amazon.nova-micro-v1:0",
EU_ANTHROPIC_CLAUDE_3_HAIKU: "eu.anthropic.claude-3-haiku-20240307-v1:0",
EU_ANTHROPIC_CLAUDE_3_5_HAIKU: "eu.anthropic.claude-3-5-haiku-20241022-v1:0",
EU_ANTHROPIC_CLAUDE_3_SONNET: "eu.anthropic.claude-3-sonnet-20240229-v1:0",
EU_ANTHROPIC_CLAUDE_3_5_SONNET:
"eu.anthropic.claude-3-5-sonnet-20240620-v1:0",
EU_ANTHROPIC_CLAUDE_3_7_SONNET:
"eu.anthropic.claude-3-7-sonnet-20250219-v1:0",
EU_META_LLAMA_3_2_1B_INSTRUCT: "eu.meta.llama3-2-1b-instruct-v1:0",
EU_META_LLAMA_3_2_3B_INSTRUCT: "eu.meta.llama3-2-3b-instruct-v1:0",
EU_AMAZON_NOVA_PREMIER_1: "eu.amazon.nova-premier-v1:0",
EU_AMAZON_NOVA_PRO_1: "eu.amazon.nova-pro-v1:0",
EU_AMAZON_NOVA_LITE_1: "eu.amazon.nova-lite-v1:0",
EU_AMAZON_NOVA_MICRO_1: "eu.amazon.nova-micro-v1:0",
};
export type INFERENCE_BEDROCK_MODELS =
(typeof INFERENCE_BEDROCK_MODELS)[keyof typeof INFERENCE_BEDROCK_MODELS];
export const INFERENCE_TO_BEDROCK_MAP: Record<
INFERENCE_BEDROCK_MODELS,
BEDROCK_MODELS
> = {
[INFERENCE_BEDROCK_MODELS.US_ANTHROPIC_CLAUDE_3_HAIKU]:
BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_HAIKU,
[INFERENCE_BEDROCK_MODELS.US_ANTHROPIC_CLAUDE_3_OPUS]:
BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_OPUS,
[INFERENCE_BEDROCK_MODELS.US_ANTHROPIC_CLAUDE_3_SONNET]:
BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_SONNET,
[INFERENCE_BEDROCK_MODELS.US_ANTHROPIC_CLAUDE_3_5_SONNET]:
BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_5_SONNET,
[INFERENCE_BEDROCK_MODELS.US_ANTHROPIC_CLAUDE_3_7_SONNET]:
BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_7_SONNET,
[INFERENCE_BEDROCK_MODELS.US_ANTHROPIC_CLAUDE_3_5_SONNET_V2]:
BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_5_SONNET_V2,
[INFERENCE_BEDROCK_MODELS.US_ANTHROPIC_CLAUDE_3_5_HAIKU]:
BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_5_HAIKU,
[INFERENCE_BEDROCK_MODELS.US_META_LLAMA_3_2_1B_INSTRUCT]:
BEDROCK_MODELS.META_LLAMA3_2_1B_INSTRUCT,
[INFERENCE_BEDROCK_MODELS.US_META_LLAMA_3_2_3B_INSTRUCT]:
BEDROCK_MODELS.META_LLAMA3_2_3B_INSTRUCT,
[INFERENCE_BEDROCK_MODELS.US_META_LLAMA_3_2_11B_INSTRUCT]:
BEDROCK_MODELS.META_LLAMA3_2_11B_INSTRUCT,
[INFERENCE_BEDROCK_MODELS.US_META_LLAMA_3_2_90B_INSTRUCT]:
BEDROCK_MODELS.META_LLAMA3_2_90B_INSTRUCT,
[INFERENCE_BEDROCK_MODELS.US_META_LLAMA_3_3_70B_INSTRUCT]:
BEDROCK_MODELS.META_LLAMA3_3_70B_INSTRUCT,
[INFERENCE_BEDROCK_MODELS.US_AMAZON_NOVA_PREMIER_1]:
BEDROCK_MODELS.AMAZON_NOVA_PREMIER_1,
[INFERENCE_BEDROCK_MODELS.US_AMAZON_NOVA_PRO_1]:
BEDROCK_MODELS.AMAZON_NOVA_PRO_1,
[INFERENCE_BEDROCK_MODELS.US_AMAZON_NOVA_LITE_1]:
BEDROCK_MODELS.AMAZON_NOVA_LITE_1,
[INFERENCE_BEDROCK_MODELS.US_AMAZON_NOVA_MICRO_1]:
BEDROCK_MODELS.AMAZON_NOVA_MICRO_1,
[INFERENCE_BEDROCK_MODELS.EU_ANTHROPIC_CLAUDE_3_HAIKU]:
BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_HAIKU,
[INFERENCE_BEDROCK_MODELS.EU_ANTHROPIC_CLAUDE_3_SONNET]:
BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_SONNET,
[INFERENCE_BEDROCK_MODELS.EU_ANTHROPIC_CLAUDE_3_5_SONNET]:
BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_5_SONNET,
[INFERENCE_BEDROCK_MODELS.EU_ANTHROPIC_CLAUDE_3_7_SONNET]:
BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_7_SONNET,
[INFERENCE_BEDROCK_MODELS.EU_META_LLAMA_3_2_1B_INSTRUCT]:
BEDROCK_MODELS.META_LLAMA3_2_1B_INSTRUCT,
[INFERENCE_BEDROCK_MODELS.EU_META_LLAMA_3_2_3B_INSTRUCT]:
BEDROCK_MODELS.META_LLAMA3_2_3B_INSTRUCT,
[INFERENCE_BEDROCK_MODELS.EU_AMAZON_NOVA_PREMIER_1]:
BEDROCK_MODELS.AMAZON_NOVA_PREMIER_1,
[INFERENCE_BEDROCK_MODELS.EU_AMAZON_NOVA_PRO_1]:
BEDROCK_MODELS.AMAZON_NOVA_PRO_1,
[INFERENCE_BEDROCK_MODELS.EU_AMAZON_NOVA_LITE_1]:
BEDROCK_MODELS.AMAZON_NOVA_LITE_1,
[INFERENCE_BEDROCK_MODELS.EU_AMAZON_NOVA_MICRO_1]:
BEDROCK_MODELS.AMAZON_NOVA_MICRO_1,
};
/*
* Values taken from https://docs.aws.amazon.com/bedrock/latest/userguide/model-parameters.html#model-parameters-claude
*/
const COMPLETION_MODELS = {
[BEDROCK_MODELS.AMAZON_TITAN_TG1_LARGE]: 8000,
[BEDROCK_MODELS.AMAZON_TITAN_TEXT_EXPRESS_V1]: 8000,
[BEDROCK_MODELS.AI21_J2_GRANDE_INSTRUCT]: 8000,
[BEDROCK_MODELS.AI21_J2_JUMBO_INSTRUCT]: 8000,
[BEDROCK_MODELS.AI21_J2_MID]: 8000,
[BEDROCK_MODELS.AI21_J2_MID_V1]: 8000,
[BEDROCK_MODELS.AI21_J2_ULTRA]: 8000,
[BEDROCK_MODELS.AI21_J2_ULTRA_V1]: 8000,
[BEDROCK_MODELS.COHERE_COMMAND_TEXT_V14]: 4096,
};
const CHAT_ONLY_MODELS = {
[BEDROCK_MODELS.ANTHROPIC_CLAUDE_INSTANT_1]: 100000,
[BEDROCK_MODELS.ANTHROPIC_CLAUDE_1]: 100000,
[BEDROCK_MODELS.ANTHROPIC_CLAUDE_2]: 100000,
[BEDROCK_MODELS.ANTHROPIC_CLAUDE_2_1]: 200000,
[BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_SONNET]: 200000,
[BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_HAIKU]: 200000,
[BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_OPUS]: 200000,
[BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_5_SONNET]: 200000,
[BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_5_SONNET_V2]: 200000,
[BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_5_HAIKU]: 200000,
[BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_7_SONNET]: 200000,
[BEDROCK_MODELS.META_LLAMA2_13B_CHAT]: 2048,
[BEDROCK_MODELS.META_LLAMA2_70B_CHAT]: 4096,
[BEDROCK_MODELS.META_LLAMA3_8B_INSTRUCT]: 8192,
[BEDROCK_MODELS.META_LLAMA3_70B_INSTRUCT]: 8192,
[BEDROCK_MODELS.META_LLAMA3_1_8B_INSTRUCT]: 128000,
[BEDROCK_MODELS.META_LLAMA3_1_70B_INSTRUCT]: 128000,
[BEDROCK_MODELS.META_LLAMA3_1_405B_INSTRUCT]: 128000,
[BEDROCK_MODELS.META_LLAMA3_2_1B_INSTRUCT]: 131000,
[BEDROCK_MODELS.META_LLAMA3_2_3B_INSTRUCT]: 131000,
[BEDROCK_MODELS.META_LLAMA3_2_11B_INSTRUCT]: 128000,
[BEDROCK_MODELS.META_LLAMA3_2_90B_INSTRUCT]: 128000,
[BEDROCK_MODELS.META_LLAMA3_3_70B_INSTRUCT]: 128000,
[BEDROCK_MODELS.MISTRAL_7B_INSTRUCT]: 32000,
[BEDROCK_MODELS.MISTRAL_MIXTRAL_7B_INSTRUCT]: 32000,
[BEDROCK_MODELS.MISTRAL_MIXTRAL_LARGE_2402]: 32000,
[BEDROCK_MODELS.AMAZON_NOVA_PREMIER_1]: 300000,
[BEDROCK_MODELS.AMAZON_NOVA_PRO_1]: 300000,
[BEDROCK_MODELS.AMAZON_NOVA_LITE_1]: 300000,
[BEDROCK_MODELS.AMAZON_NOVA_MICRO_1]: 130000,
};
const BEDROCK_FOUNDATION_LLMS = { ...COMPLETION_MODELS, ...CHAT_ONLY_MODELS };
/*
* Only the following models support streaming, according to
* the output of Bedrock.Client.list_foundation_models:
* https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock/client/list_foundation_models.html
*/
export const STREAMING_MODELS = new Set([
BEDROCK_MODELS.AMAZON_TITAN_TG1_LARGE,
BEDROCK_MODELS.AMAZON_TITAN_TEXT_EXPRESS_V1,
BEDROCK_MODELS.ANTHROPIC_CLAUDE_INSTANT_1,
BEDROCK_MODELS.ANTHROPIC_CLAUDE_1,
BEDROCK_MODELS.ANTHROPIC_CLAUDE_2,
BEDROCK_MODELS.ANTHROPIC_CLAUDE_2_1,
BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_SONNET,
BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_HAIKU,
BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_OPUS,
BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_5_SONNET,
BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_5_SONNET_V2,
BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_5_HAIKU,
BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_7_SONNET,
BEDROCK_MODELS.META_LLAMA2_13B_CHAT,
BEDROCK_MODELS.META_LLAMA2_70B_CHAT,
BEDROCK_MODELS.META_LLAMA3_8B_INSTRUCT,
BEDROCK_MODELS.META_LLAMA3_70B_INSTRUCT,
BEDROCK_MODELS.META_LLAMA3_1_8B_INSTRUCT,
BEDROCK_MODELS.META_LLAMA3_1_70B_INSTRUCT,
BEDROCK_MODELS.META_LLAMA3_1_405B_INSTRUCT,
BEDROCK_MODELS.META_LLAMA3_2_1B_INSTRUCT,
BEDROCK_MODELS.META_LLAMA3_2_3B_INSTRUCT,
BEDROCK_MODELS.META_LLAMA3_2_11B_INSTRUCT,
BEDROCK_MODELS.META_LLAMA3_2_90B_INSTRUCT,
BEDROCK_MODELS.META_LLAMA3_3_70B_INSTRUCT,
BEDROCK_MODELS.MISTRAL_7B_INSTRUCT,
BEDROCK_MODELS.MISTRAL_MIXTRAL_7B_INSTRUCT,
BEDROCK_MODELS.MISTRAL_MIXTRAL_LARGE_2402,
BEDROCK_MODELS.AMAZON_NOVA_PREMIER_1,
BEDROCK_MODELS.AMAZON_NOVA_PRO_1,
BEDROCK_MODELS.AMAZON_NOVA_LITE_1,
BEDROCK_MODELS.AMAZON_NOVA_MICRO_1,
]);
export const TOOL_CALL_MODELS: BEDROCK_MODELS[] = [
BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_SONNET,
BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_HAIKU,
BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_OPUS,
BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_5_SONNET,
BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_5_SONNET_V2,
BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_5_HAIKU,
BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_7_SONNET,
BEDROCK_MODELS.META_LLAMA3_1_405B_INSTRUCT,
BEDROCK_MODELS.META_LLAMA3_2_1B_INSTRUCT,
BEDROCK_MODELS.META_LLAMA3_2_3B_INSTRUCT,
BEDROCK_MODELS.META_LLAMA3_2_11B_INSTRUCT,
BEDROCK_MODELS.META_LLAMA3_2_90B_INSTRUCT,
BEDROCK_MODELS.META_LLAMA3_3_70B_INSTRUCT,
BEDROCK_MODELS.AMAZON_NOVA_PREMIER_1,
BEDROCK_MODELS.AMAZON_NOVA_PRO_1,
BEDROCK_MODELS.AMAZON_NOVA_LITE_1,
BEDROCK_MODELS.AMAZON_NOVA_MICRO_1,
];
const getProvider = (model: string): Provider => {
const providerName = model.split(".")[0];
if (!providerName) {
throw new Error(`Model ${model} is not supported`);
}
if (!(providerName in PROVIDERS)) {
throw new Error(
`Provider ${providerName} for model ${model} is not supported`,
);
}
return PROVIDERS[providerName]!;
};
export type BedrockModelParams = {
model: BEDROCK_MODELS | INFERENCE_BEDROCK_MODELS;
temperature?: number;
topP?: number;
maxTokens?: number;
};
export const BEDROCK_MODEL_MAX_TOKENS: Partial<Record<BEDROCK_MODELS, number>> =
{
[BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_SONNET]: 4096,
[BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_HAIKU]: 4096,
[BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_OPUS]: 4096,
[BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_5_SONNET]: 4096,
[BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_5_SONNET_V2]: 8192,
[BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_5_HAIKU]: 8192,
[BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_7_SONNET]: 8192,
[BEDROCK_MODELS.META_LLAMA2_13B_CHAT]: 2048,
[BEDROCK_MODELS.META_LLAMA2_70B_CHAT]: 2048,
[BEDROCK_MODELS.META_LLAMA3_8B_INSTRUCT]: 2048,
[BEDROCK_MODELS.META_LLAMA3_70B_INSTRUCT]: 2048,
[BEDROCK_MODELS.META_LLAMA3_1_8B_INSTRUCT]: 2048,
[BEDROCK_MODELS.META_LLAMA3_1_70B_INSTRUCT]: 2048,
[BEDROCK_MODELS.META_LLAMA3_1_405B_INSTRUCT]: 2048,
[BEDROCK_MODELS.META_LLAMA3_2_1B_INSTRUCT]: 2048,
[BEDROCK_MODELS.META_LLAMA3_2_3B_INSTRUCT]: 2048,
[BEDROCK_MODELS.META_LLAMA3_2_11B_INSTRUCT]: 2048,
[BEDROCK_MODELS.META_LLAMA3_2_90B_INSTRUCT]: 2048,
[BEDROCK_MODELS.META_LLAMA3_3_70B_INSTRUCT]: 2048,
};
const DEFAULT_BEDROCK_PARAMS = {
temperature: 0.1,
topP: 1,
maxTokens: 1024, // required by anthropic
};
export type BedrockParams = BedrockRuntimeClientConfig & BedrockModelParams;
/**
* ToolCallLLM for Bedrock
*/
export class Bedrock extends ToolCallLLM<BedrockAdditionalChatOptions> {
private client: BedrockRuntimeClient;
protected actualModel: BEDROCK_MODELS | INFERENCE_BEDROCK_MODELS;
model: BEDROCK_MODELS;
temperature: number;
topP: number;
maxTokens?: number;
provider: Provider;
topK?: number;
// Do not check for env variables here: Bedrock can be authenticated in various ways.
// AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_REGION are the env variables read directly by the SDK.
constructor({
temperature,
topP,
maxTokens,
model,
...params
}: BedrockParams) {
super();
this.actualModel = model;
this.model = INFERENCE_TO_BEDROCK_MAP[model] ?? model;
this.provider = getProvider(this.model);
this.maxTokens = maxTokens ?? DEFAULT_BEDROCK_PARAMS.maxTokens;
this.temperature = temperature ?? DEFAULT_BEDROCK_PARAMS.temperature;
this.topP = topP ?? DEFAULT_BEDROCK_PARAMS.topP;
this.client = new BedrockRuntimeClient(params);
}
get supportToolCall(): boolean {
return TOOL_CALL_MODELS.includes(this.model);
}
get metadata(): LLMMetadata {
// NOTE, Anthropic supports top_k but LLMMetadata does not
return {
model: this.model,
temperature: this.temperature,
topP: this.topP,
maxTokens: this.maxTokens,
contextWindow: BEDROCK_FOUNDATION_LLMS[this.model] ?? 128000,
tokenizer: undefined,
structuredOutput: false,
};
}
protected async nonStreamChat(
params: BedrockChatParamsNonStreaming,
): Promise<BedrockChatNonStreamResponse> {
if (!this.supportToolCall && params.tools?.length) {
console.warn(`The model "${this.model}" doesn't support ToolCall`);
}
const input = this.provider.getRequestBody(
this.metadata,
params.messages,
params.tools,
params.additionalChatOptions,
);
const command = new InvokeModelCommand(input);
command.input.modelId = this.actualModel;
const response = await this.client.send(command);
let options: ToolCallLLMMessageOptions = {};
if (this.supportToolCall) {
const tools = this.provider.getToolsFromResponse(response);
if (tools.length) {
options = { toolCall: tools };
}
}
return {
raw: response,
message: {
role: "assistant",
content: this.provider.getTextFromResponse(response),
options,
},
};
}
protected async *streamChat(
params: BedrockChatParamsStreaming,
): BedrockChatStreamResponse {
if (!STREAMING_MODELS.has(this.model))
throw new Error(`The model: ${this.model} does not support streaming`);
if (!this.supportToolCall && params.tools?.length) {
console.warn(`The model "${this.model}" doesn't support ToolCall`);
}
const input = this.provider.getRequestBody(
this.metadata,
params.messages,
params.tools,
params.additionalChatOptions,
);
const command = new InvokeModelWithResponseStreamCommand(input);
command.input.modelId = this.actualModel;
const response = await this.client.send(command);
if (response.body) yield* this.provider.reduceStream(response.body);
}
chat(params: BedrockChatParamsStreaming): Promise<BedrockChatStreamResponse>;
chat(
params: BedrockChatParamsNonStreaming,
): Promise<BedrockChatNonStreamResponse>;
@wrapLLMEvent
async chat(
params: BedrockChatParamsStreaming | BedrockChatParamsNonStreaming,
): Promise<BedrockChatStreamResponse | BedrockChatNonStreamResponse> {
if (params.stream) {
return this.streamChat(params);
}
return this.nonStreamChat(params);
}
complete(
params: LLMCompletionParamsStreaming,
): Promise<AsyncIterable<CompletionResponse>>;
complete(
params: LLMCompletionParamsNonStreaming,
): Promise<CompletionResponse>;
async complete(
params: LLMCompletionParamsStreaming | LLMCompletionParamsNonStreaming,
): Promise<CompletionResponse | AsyncIterable<CompletionResponse>> {
const message: ChatMessage = {
role: "user",
content: mapMessageContentToMessageContentDetails(params.prompt),
};
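const input = this.provider.getRequestBody(this.metadata, [message]);
if (params.stream) {
const command = new InvokeModelWithResponseStreamCommand(input);
// Send the originally requested ID (e.g. an inference profile), as the chat methods do
command.input.modelId = this.actualModel;
const response = await this.client.send(command);
if (response.body)
return streamConverter(response.body, (response) => {
return {
text: this.provider.getTextFromStreamResponse(response),
raw: response,
};
});
}
const command = new InvokeModelCommand(input);
// Send the originally requested ID (e.g. an inference profile), as the chat methods do
command.input.modelId = this.actualModel;
const response = await this.client.send(command);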
return {
text: this.provider.getTextFromResponse(response),
raw: response,
};
}
}
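The constructor above keeps two IDs: `actualModel` (what is sent to Bedrock) and `model` (the base model used for provider lookup and metadata). A minimal sketch of that resolution follows, assuming trimmed-down stand-ins for the real tables; `INFERENCE_TO_BASE`, `resolveBaseModel`, and `providerName` are hypothetical names, and the base model ID is assumed to be the EU ID without its region prefix.

```typescript
// Hypothetical, trimmed-down stand-in for INFERENCE_TO_BEDROCK_MAP (illustration only).
const INFERENCE_TO_BASE: Record<string, string> = {
  "eu.anthropic.claude-3-haiku-20240307-v1:0":
    "anthropic.claude-3-haiku-20240307-v1:0", // assumed base model ID
};

// Mirrors `INFERENCE_TO_BEDROCK_MAP[model] ?? model`: inference profile IDs
// map to their base model, anything else passes through unchanged.
function resolveBaseModel(model: string): string {
  return INFERENCE_TO_BASE[model] ?? model;
}

// Mirrors getProvider: the provider key is the text before the first dot.
function providerName(baseModel: string): string {
  const prefix = baseModel.split(".")[0];
  if (!prefix) throw new Error(`Model ${baseModel} is not supported`);
  return prefix;
}

const requested = "eu.anthropic.claude-3-haiku-20240307-v1:0";
const resolved = resolveBaseModel(requested);
console.log(requested, "->", resolved, "->", providerName(resolved));
```

Resolving first matters because the provider prefix of an inference profile ID is the region (`eu`, `us`), not the vendor.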
@@ -0,0 +1,3 @@
export const TOKENS = {
TOOL_CALL: "<|python_tag|>",
};
@@ -0,0 +1,153 @@
import type {
InvokeModelCommandInput,
InvokeModelWithResponseStreamCommandInput,
ResponseStream,
} from "@aws-sdk/client-bedrock-runtime";
import type {
BaseTool,
ChatMessage,
LLMMetadata,
ToolCall,
ToolCallLLMMessageOptions,
} from "@llamaindex/core/llms";
import { toUtf8 } from "../utils";
import type { MetaNoneStreamingResponse, MetaStreamEvent } from "./types";
import { randomUUID } from "@llamaindex/env";
import { Provider, type BedrockChatStreamResponse } from "../provider";
import { TOKENS } from "./constants";
import {
mapChatMessagesToMetaLlama2Messages,
mapChatMessagesToMetaLlama3Messages,
} from "./utils";
export class MetaProvider extends Provider<MetaStreamEvent> {
getResultFromResponse(
// eslint-disable-next-line @typescript-eslint/no-explicit-any
response: Record<string, any>,
): MetaNoneStreamingResponse {
return JSON.parse(toUtf8(response.body));
}
getToolsFromResponse<ToolContent>(
// eslint-disable-next-line @typescript-eslint/no-explicit-any
response: Record<string, any>,
): ToolContent[] {
const result = this.getResultFromResponse(response);
if (!result.generation.trim().startsWith(TOKENS.TOOL_CALL)) return [];
const tool = JSON.parse(
result.generation.trim().split(TOKENS.TOOL_CALL)[1]!,
);
return [
{
id: randomUUID(),
name: tool.name,
input: tool.parameters,
} as ToolContent,
];
}
// eslint-disable-next-line @typescript-eslint/no-explicit-any
getTextFromResponse(response: Record<string, any>): string {
const result = this.getResultFromResponse(response);
if (result.generation.trim().startsWith(TOKENS.TOOL_CALL)) return "";
return result.generation;
}
// eslint-disable-next-line @typescript-eslint/no-explicit-any
getTextFromStreamResponse(response: Record<string, any>): string {
const event = this.getStreamingEventResponse(response);
if (event?.generation) {
return event.generation;
}
return "";
}
async *reduceStream(
stream: AsyncIterable<ResponseStream>,
): BedrockChatStreamResponse {
const collecting: string[] = [];
let toolId: string | undefined = undefined;
for await (const response of stream) {
const event = this.getStreamingEventResponse(response);
const delta = this.getTextFromStreamResponse(response);
// odd quirk of llama3.1, start token is \n\n
if (
!toolId &&
!event?.generation.trim() &&
event?.generation_token_count === 1 &&
event?.prompt_token_count !== null
)
continue;
if (delta.startsWith(TOKENS.TOOL_CALL)) {
toolId = randomUUID();
const parts = delta.split(TOKENS.TOOL_CALL).filter((part) => part);
collecting.push(...parts);
continue;
}
let options: undefined | ToolCallLLMMessageOptions = undefined;
if (toolId && event?.stop_reason === "stop") {
if (delta) collecting.push(delta);
const tool = JSON.parse(collecting.join(""));
options = {
toolCall: [
{
id: toolId,
name: tool.name,
input: tool.parameters,
} as ToolCall,
],
};
} else if (toolId && !event?.stop_reason) {
collecting.push(delta);
continue;
}
if (!delta && !options) continue;
yield {
delta: options ? "" : delta,
options,
raw: response,
};
}
}
getRequestBody<T extends ChatMessage>(
metadata: LLMMetadata,
messages: T[],
tools: BaseTool[] = [],
): InvokeModelCommandInput | InvokeModelWithResponseStreamCommandInput {
let prompt: string = "";
let images: string[] = [];
if (metadata.model.startsWith("meta.llama3")) {
const mapped = mapChatMessagesToMetaLlama3Messages({
messages,
tools,
model: metadata.model,
});
prompt = mapped.prompt;
images = mapped.images;
} else if (metadata.model.startsWith("meta.llama2")) {
prompt = mapChatMessagesToMetaLlama2Messages(messages);
} else {
throw new Error(`Meta model ${metadata.model} is not supported`);
}
return {
modelId: metadata.model,
contentType: "application/json",
accept: "application/json",
body: JSON.stringify({
prompt,
images: images.length ? images : undefined,
max_gen_len: metadata.maxTokens,
temperature: metadata.temperature,
top_p: metadata.topP,
}),
};
}
}
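`getToolsFromResponse` above treats a generation that starts with `<|python_tag|>` as a JSON tool call and everything else as plain text. The sketch below isolates that rule, assuming the payload has the `name` and `parameters` keys the prompt asks for; `parseToolCall` and `ParsedToolCall` are hypothetical names, and `slice` is used here in place of the `split(...)[1]` in the provider.

```typescript
const TOOL_CALL_TOKEN = "<|python_tag|>";

type ParsedToolCall = { name: string; input: unknown };

// Returns the parsed tool call, or undefined when the generation is plain text.
function parseToolCall(generation: string): ParsedToolCall | undefined {
  const trimmed = generation.trim();
  if (!trimmed.startsWith(TOOL_CALL_TOKEN)) return undefined;
  // Everything after the token is expected to be a single JSON object.
  const payload = JSON.parse(trimmed.slice(TOOL_CALL_TOKEN.length));
  return { name: payload.name, input: payload.parameters };
}

const toolReply =
  '\n\n<|python_tag|>{"name":"getWeather","parameters":{"city":"Paris"}}';
console.log(parseToolCall(toolReply));
console.log(parseToolCall("It is sunny in Paris."));
```

`JSON.parse` throws on malformed output, so a caller that cannot trust the model may want to wrap the call in `try`/`catch`.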
@@ -0,0 +1,21 @@
import type { InvocationMetrics } from "../types";
export type MetaTextContent = string;
export type MetaMessage = {
role: "user" | "assistant" | "system" | "ipython";
content: MetaTextContent;
};
type MetaResponse = {
generation: string;
prompt_token_count: number;
generation_token_count: number;
stop_reason: "stop" | "length";
};
export type MetaStreamEvent = MetaResponse & {
"amazon-bedrock-invocationMetrics": InvocationMetrics;
};
export type MetaNoneStreamingResponse = MetaResponse;
@@ -0,0 +1,273 @@
import type {
BaseTool,
ChatMessage,
LLMMetadata,
MessageContentTextDetail,
ToolCallLLMMessageOptions,
} from "@llamaindex/core/llms";
import { extractDataUrlComponents } from "../utils";
import { TOKENS } from "./constants";
import type { MetaMessage } from "./types";
const getToolCallInstructionString = (tool: BaseTool): string => {
return `Use the function '${tool.metadata.name}' to '${tool.metadata.description}'`;
};
const getToolCallParametersString = (tool: BaseTool): string => {
return JSON.stringify({
name: tool.metadata.name,
description: tool.metadata.description,
parameters: tool.metadata.parameters
? Object.entries(tool.metadata.parameters.properties).map(
([name, definition]) => ({ [name]: definition }),
)
: {},
});
};
// ported from https://github.com/meta-llama/llama-agentic-system/blob/main/llama_agentic_system/system_prompt.py
// NOTE: using JSON instead of the above XML-style tool calling works more reliably
export const getToolsPrompt_3_1 = (tools?: BaseTool[]) => {
if (!tools?.length) return "";
const customToolParams = tools.map((tool) => {
return [
getToolCallInstructionString(tool),
getToolCallParametersString(tool),
].join("\n\n");
});
return `
Environment: node
# Tool Instructions
- Never use ipython, always use javascript in node
Cutting Knowledge Date: December 2023
Today Date: ${new Date().toLocaleString("en-US", { year: "numeric", month: "long" })}
You have access to the following functions:
${customToolParams.join("\n\n")}
Think very carefully before calling functions.
If you choose to call a function, ONLY reply in the following JSON format:
{
"name": function_name,
"parameters": parameters,
}
where
{
"name": function_name,
"parameters": parameters, => a JSON dict with the function argument name as key and function argument value as value.
}
Here is an example,
{
"name": "example_function_name",
"parameters": {"example_name": "example_value"}
}
Reminder:
- Function calls MUST follow the specified format
- Required parameters MUST be specified
- Only call one function at a time
- Put the entire function call reply on one line
- Always add your sources when using search results to answer the user query
`;
};
export const getToolsPrompt_3_2 = (tools?: BaseTool[]) => {
if (!tools?.length) return "";
return `
You are an expert in composing functions. You are given a question and a set of possible functions.
Based on the question, you will need to make one or more function/tool calls to achieve the purpose.
If none of the functions can be used, point it out. If the given question lacks the parameters required by the function,
also point it out. You should only return the function call in tools call sections.
If you decide to invoke any of the function(s), you MUST put it in the format of and start with the token: ${TOKENS.TOOL_CALL}:
{
"name": function_name,
"parameters": parameters,
}
where
{
"name": function_name,
"parameters": parameters, => a JSON dict with the function argument name as key and function argument value as value.
}
Here is an example,
{
"name": "example_function_name",
"parameters": {"example_name": "example_value"}
}
Reminder:
- Function calls MUST follow the specified format
- Required parameters MUST be specified
- Only call one function at a time
- You SHOULD NOT include any other text in the response
- Put the entire function call reply on one line
Here is a list of functions in JSON format that you can invoke.
${JSON.stringify(tools)}
`;
};
export const mapChatRoleToMetaRole = (
role: ChatMessage["role"],
): MetaMessage["role"] => {
if (role === "assistant") return "assistant";
if (role === "user") return "user";
return "system";
};
export const mapChatMessagesToMetaMessages = <
T extends ChatMessage<ToolCallLLMMessageOptions>,
>(
messages: T[],
): MetaMessage[] => {
return messages.flatMap((msg) => {
if (msg.options && "toolCall" in msg.options) {
return msg.options.toolCall.map((call) => ({
role: "assistant",
content: JSON.stringify({
id: call.id,
name: call.name,
parameters: call.input,
}),
}));
}
if (msg.options && "toolResult" in msg.options) {
return {
role: "ipython",
content: JSON.stringify(msg.options.toolResult),
};
}
let content: string = "";
if (typeof msg.content === "string") {
content = msg.content;
} else if (msg.content.length) {
content = (msg.content[0] as MessageContentTextDetail).text;
}
return {
role: mapChatRoleToMetaRole(msg.role),
content,
};
});
};
/**
* Documentation at https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3
*/
export const mapChatMessagesToMetaLlama3Messages = <T extends ChatMessage>({
messages,
model,
tools,
}: {
messages: T[];
model: LLMMetadata["model"];
tools?: BaseTool[];
}): { prompt: string; images: string[] } => {
const images: string[] = [];
const textMessages: T[] = [];
messages.forEach((message) => {
if (Array.isArray(message.content)) {
message.content.forEach((content) => {
if (content.type === "image_url") {
const { base64 } = extractDataUrlComponents(content.image_url.url);
images.push(base64);
} else {
textMessages.push(message);
}
});
} else {
textMessages.push(message);
}
});
const parts: string[] = [];
let toolsPrompt = "";
if (model.startsWith("meta.llama3-2")) {
toolsPrompt = getToolsPrompt_3_2(tools);
} else if (model.startsWith("meta.llama3-1")) {
toolsPrompt = getToolsPrompt_3_1(tools);
}
if (toolsPrompt) {
parts.push(
"<|begin_of_text|>",
"<|start_header_id|>system<|end_header_id|>",
toolsPrompt,
"<|eot_id|>",
);
}
const mapped = mapChatMessagesToMetaMessages(messages).map((message) => {
return [
"<|start_header_id|>",
message.role,
"<|end_header_id|>",
message.content,
"<|eot_id|>",
].join("\n");
});
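parts.push(
// <|begin_of_text|> is already emitted together with the tools prompt; add it only once
...(toolsPrompt ? [] : ["<|begin_of_text|>"]),
...mapped,
"<|start_header_id|>assistant<|end_header_id|>",
);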
const prompt = parts.join("\n");
return { prompt, images };
};
/**
* Documentation at https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-2
*/
export const mapChatMessagesToMetaLlama2Messages = <T extends ChatMessage>(
messages: T[],
): string => {
const mapped = mapChatMessagesToMetaMessages(messages);
let output = "<s>";
let insideInst = false;
let needsStartAgain = false;
for (const message of mapped) {
if (needsStartAgain) {
output += "<s>";
needsStartAgain = false;
}
const text = message.content;
if (message.role === "system") {
if (!insideInst) {
output += "[INST] ";
insideInst = true;
}
output += `<<SYS>>\n${text}\n<</SYS>>\n`;
} else if (message.role === "user") {
output += text;
if (insideInst) {
output += " [/INST]";
insideInst = false;
}
} else if (message.role === "assistant") {
if (insideInst) {
output += " [/INST]";
insideInst = false;
}
output += ` ${text} </s>\n`;
needsStartAgain = true;
}
}
return output;
};
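The Llama 3 mapping above wraps each message in header tokens and ends with an open assistant header so the model continues from there. A simplified sketch of that layout without the tools branch, joining with newlines as the code above does; `Msg` and `renderLlama3Prompt` are hypothetical names.

```typescript
type Msg = {
  role: "system" | "user" | "assistant" | "ipython";
  content: string;
};

// Renders messages in the header/eot layout, joined by newlines,
// and leaves an open assistant header for the model to complete.
function renderLlama3Prompt(messages: Msg[]): string {
  const blocks = messages.map((m) =>
    [
      "<|start_header_id|>",
      m.role,
      "<|end_header_id|>",
      m.content,
      "<|eot_id|>",
    ].join("\n"),
  );
  return [
    "<|begin_of_text|>",
    ...blocks,
    "<|start_header_id|>assistant<|end_header_id|>",
  ].join("\n");
}

const llama3Prompt = renderLlama3Prompt([
  { role: "system", content: "Be brief." },
  { role: "user", content: "How are clouds formed?" },
]);
console.log(llama3Prompt);
```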
@@ -0,0 +1,63 @@
import {
type InvokeModelCommandInput,
type InvokeModelWithResponseStreamCommandInput,
ResponseStream,
} from "@aws-sdk/client-bedrock-runtime";
import {
type BaseTool,
type ChatMessage,
type ChatResponseChunk,
type LLMMetadata,
type ToolCallLLMMessageOptions,
} from "@llamaindex/core/llms";
import { streamConverter } from "@llamaindex/core/utils";
import { toUtf8 } from "./utils";
export type BedrockAdditionalChatOptions = Record<string, unknown>;
export type BedrockChatStreamResponse = AsyncIterable<
ChatResponseChunk<ToolCallLLMMessageOptions>
>;
export abstract class Provider<ProviderStreamEvent extends object = object> {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
abstract getTextFromResponse(response: Record<string, any>): string;
// Return tool calls from non-streaming calls
abstract getToolsFromResponse<T extends object = object>(
// eslint-disable-next-line @typescript-eslint/no-explicit-any
response: Record<string, any>,
): T[];
getStreamingEventResponse(
// eslint-disable-next-line @typescript-eslint/no-explicit-any
response: Record<string, any>,
): ProviderStreamEvent | undefined {
return response.chunk?.bytes
? (JSON.parse(toUtf8(response.chunk?.bytes)) as ProviderStreamEvent)
: undefined;
}
async *reduceStream(
stream: AsyncIterable<ResponseStream>,
): BedrockChatStreamResponse {
yield* streamConverter(stream, (response) => {
return {
delta: this.getTextFromStreamResponse(response),
raw: response,
};
});
}
// eslint-disable-next-line @typescript-eslint/no-explicit-any
getTextFromStreamResponse(response: Record<string, any>): string {
return this.getTextFromResponse(response);
}
abstract getRequestBody<T extends ChatMessage>(
metadata: LLMMetadata,
messages: T[],
tools?: BaseTool[],
options?: BedrockAdditionalChatOptions,
): InvokeModelCommandInput | InvokeModelWithResponseStreamCommandInput;
}
@@ -0,0 +1,6 @@
export type InvocationMetrics = {
inputTokenCount: number;
outputTokenCount: number;
invocationLatency: number;
firstByteLatency: number;
};
@@ -0,0 +1,23 @@
export const toUtf8 = (input: Uint8Array): string =>
new TextDecoder("utf-8").decode(input);
export const extractDataUrlComponents = (
dataUrl: string,
): {
mimeType: string;
base64: string;
} => {
const parts = dataUrl.split(";base64,");
if (parts.length !== 2 || !parts[0]!.startsWith("data:")) {
throw new Error("Invalid data URL");
}
const mimeType = parts[0]!.slice(5);
const base64 = parts[1]!;
return {
mimeType,
base64,
};
};
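`extractDataUrlComponents` is what turns an `image_url` content part into the raw base64 string sent in the `images` field. A short usage sketch, with a local copy of the same splitting logic under the hypothetical name `splitDataUrl`; the sample payload is an arbitrary placeholder, not a real image.

```typescript
// Local copy of the splitting logic, for illustration only.
function splitDataUrl(dataUrl: string): { mimeType: string; base64: string } {
  const parts = dataUrl.split(";base64,");
  const head = parts[0];
  const body = parts[1];
  if (parts.length !== 2 || !head || body === undefined || !head.startsWith("data:")) {
    throw new Error("Invalid data URL");
  }
  // Drop the leading "data:" (5 characters) to get the MIME type.
  return { mimeType: head.slice(5), base64: body };
}

const { mimeType, base64 } = splitDataUrl("data:image/png;base64,iVBORw0KGgo=");
console.log(mimeType, base64);

// Plain http(s) URLs are rejected, so remote images must be fetched and
// encoded by the caller before they are passed in.
let rejected = false;
try {
  splitDataUrl("https://example.com/cat.png");
} catch {
  rejected = true;
}
console.log(rejected);
```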
@@ -0,0 +1,10 @@
import type {
MessageContent,
MessageContentDetail,
} from "@llamaindex/core/llms";
export const mapMessageContentToMessageContentDetails = (
content: MessageContent,
): MessageContentDetail[] => {
return Array.isArray(content) ? content : [{ type: "text", text: content }];
};
@@ -0,0 +1,165 @@
import type { KnowledgeBaseVectorSearchConfiguration } from "@aws-sdk/client-bedrock-agent-runtime";
import {
BedrockAgentRuntimeClient,
type BedrockAgentRuntimeClientConfig,
type RetrievalFilter,
RetrieveCommand,
type SearchType,
} from "@aws-sdk/client-bedrock-agent-runtime";
import type { QueryBundle } from "@llamaindex/core/query-engine";
import { BaseRetriever } from "@llamaindex/core/retriever";
import { Document, type NodeWithScore } from "@llamaindex/core/schema";
import { extractText } from "@llamaindex/core/utils";
/**
* Interface for the arguments required to initialize an
* AmazonKnowledgeBaseRetriever instance.
*/
export interface AmazonKnowledgeBaseRetrieverArgs {
knowledgeBaseId: string;
topK: number;
region: string;
clientOptions?: BedrockAgentRuntimeClientConfig;
filter?: RetrievalFilter;
overrideSearchType?: SearchType;
}
/**
* Retriever for Amazon Bedrock Knowledge Bases, a service oriented toward RAG workflows.
* Extends the BaseRetriever class.
* @example
* ```typescript
* const retriever = new AmazonKnowledgeBaseRetriever({
* topK: 10,
* knowledgeBaseId: "YOUR_KNOWLEDGE_BASE_ID",
* region: "us-east-2",
* clientOptions: {
* credentials: {
* accessKeyId: "YOUR_ACCESS_KEY_ID",
* secretAccessKey: "YOUR_SECRET_ACCESS_KEY",
* },
* },
* });
*
* const docs = await retriever.retrieve({query: "How are clouds formed?"});
* ```
*/
export class AmazonKnowledgeBaseRetriever extends BaseRetriever {
static lc_name() {
return "AmazonKnowledgeBaseRetriever";
}
lc_namespace = ["llamaindex", "retrievers", "amazon_bedrock_knowledge_base"];
knowledgeBaseId: string;
topK: number;
bedrockAgentRuntimeClient: BedrockAgentRuntimeClient;
filter: RetrievalFilter | undefined;
overrideSearchType: SearchType | undefined;
constructor({
knowledgeBaseId,
topK = 10,
clientOptions,
region,
filter,
overrideSearchType,
}: AmazonKnowledgeBaseRetrieverArgs) {
super();
this.topK = topK;
this.filter = filter;
this.overrideSearchType = overrideSearchType;
this.bedrockAgentRuntimeClient = new BedrockAgentRuntimeClient({
region,
...clientOptions,
});
this.knowledgeBaseId = knowledgeBaseId;
}
/**
* Cleans the result text by replacing sequences of whitespace with a
* single space and removing ellipses.
* @param resText The result text to clean.
* @returns The cleaned result text.
*/
cleanResult(resText: string) {
const res = resText.replace(/\s+/g, " ").replace(/\.\.\./g, "");
return res;
}
async queryKnowledgeBase(
query: QueryBundle,
topK: number,
filter?: RetrievalFilter,
overrideSearchType?: SearchType,
): Promise<NodeWithScore[]> {
const retrieveCommand = new RetrieveCommand({
knowledgeBaseId: this.knowledgeBaseId,
retrievalQuery: {
text: extractText(query),
},
retrievalConfiguration: {
vectorSearchConfiguration: {
numberOfResults: topK,
overrideSearchType,
filter,
} as KnowledgeBaseVectorSearchConfiguration,
},
});
const retrieveResponse =
await this.bedrockAgentRuntimeClient.send(retrieveCommand);
return (
retrieveResponse.retrievalResults?.map((result) => {
let source;
switch (result.location?.type) {
case "CONFLUENCE":
source = result.location?.confluenceLocation?.url;
break;
case "S3":
source = result.location?.s3Location?.uri;
break;
case "SALESFORCE":
source = result.location?.salesforceLocation?.url;
break;
case "SHAREPOINT":
source = result.location?.sharePointLocation?.url;
break;
case "WEB":
source = result.location?.webLocation?.url;
break;
default:
source = result.location?.s3Location?.uri;
break;
}
return {
node: new Document({
text: this.cleanResult(result.content?.text || ""),
metadata: {
source,
score: result.score,
...result.metadata,
},
}),
score: result.score ?? 1.0,
};
}) ?? []
);
}
async _retrieve(query: QueryBundle): Promise<NodeWithScore[]> {
return await this.queryKnowledgeBase(
query,
this.topK,
this.filter,
this.overrideSearchType,
);
}
}
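`cleanResult` above normalizes retrieved chunks in two passes: whitespace runs collapse to one space, then literal `...` sequences are dropped. A standalone sketch of the same two regular expressions, with made-up sample text:

```typescript
// Same two replacements as cleanResult, as a free function.
function cleanResult(text: string): string {
  return text.replace(/\s+/g, " ").replace(/\.\.\./g, "");
}

const rawChunk = "Clouds   form\nwhen... water vapor\t condenses.";
const cleanedChunk = cleanResult(rawChunk);
console.log(cleanedChunk);
```

Because the ellipsis is removed after whitespace is collapsed, an ellipsis with spaces on both sides (`"a ... b"`) leaves two adjacent spaces behind.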
@@ -0,0 +1,19 @@
{
"extends": "../../tsconfig.json",
"compilerOptions": {
"rootDir": "./src",
"outDir": "./dist/type",
"tsBuildInfoFile": "./dist/.tsbuildinfo",
"emitDeclarationOnly": true,
"module": "ESNext",
"moduleResolution": "bundler",
"types": ["node"]
},
"include": ["./src"],
"exclude": ["node_modules"],
"references": [
{
"path": "../llamaindex/tsconfig.json"
}
]
}
@@ -1,5 +1,24 @@
# @llamaindex/core
## 0.6.19
### Patch Changes
- f9f1de9: Use logger interface instead of directly hardcoding console.log
## 0.6.18
### Patch Changes
- f29799e: Add toolcall callbacks to agent workflows
- 7224c06: Add logger and callbacks to llm.exec
## 0.6.17
### Patch Changes
- 38da40b: feat: VectoryMemoryBlock
## 0.6.16
### Patch Changes
@@ -1,7 +1,7 @@
{
"name": "@llamaindex/core",
"type": "module",
-"version": "0.6.16",
+"version": "0.6.19",
"description": "LlamaIndex Core Module",
"exports": {
"./agent": {
@@ -15,6 +15,7 @@ import type {
} from "../llms";
import { baseToolWithCallSchema } from "../schema";
import {
assertIsJSONValue,
isAsyncIterable,
prettifyError,
stringifyJSONToMessageContent,
@@ -227,6 +228,7 @@ export async function callTool(
`Tool ${tool.metadata.name} (remote:${toolCall.name}) succeeded.`,
);
logger.log(`Output: ${JSON.stringify(output)}`);
assertIsJSONValue(output);
const toolOutput: ToolOutput = {
tool,
input,
@@ -1,3 +1,4 @@
import { consoleLogger, emptyLogger, type Logger } from "@llamaindex/env";
import type { Tokenizers } from "@llamaindex/env/tokenizers";
import type { MessageContentDetail } from "../llms";
import { BaseNode, MetadataMode, TransformComponent } from "../schema";
@@ -18,6 +19,7 @@ export type EmbeddingInfo = {
export type BaseEmbeddingOptions = {
logProgress?: boolean;
progressCallback?: (current: number, total: number) => void;
logger?: Logger;
};
export abstract class BaseEmbedding extends TransformComponent<
@@ -133,6 +135,9 @@ export async function batchEmbeddings<T>(
const curBatch: T[] = [];
const logger =
options?.logger ?? (options?.logProgress ? consoleLogger : emptyLogger);
for (let i = 0; i < queue.length; i++) {
curBatch.push(queue[i]!);
if (i == queue.length - 1 || curBatch.length == chunkSize) {
@@ -143,7 +148,7 @@ export async function batchEmbeddings<T>(
options?.progressCallback?.(i + 1, queue.length);
}
if (options?.logProgress) {
console.log(`getting embedding progress: ${i + 1} / ${queue.length}`);
logger.log(`getting embedding progress: ${i + 1} / ${queue.length}`);
}
curBatch.length = 0;
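The logger selection added to `batchEmbeddings` above can be sketched in isolation. This is a minimal illustration, assuming a `Logger` shape with `log`, `warn` and `error` methods (the same three methods the test spy later in this diff provides). The `Logger`, `consoleLogger` and `emptyLogger` below are local stand-ins rather than the actual `@llamaindex/env` exports, and `resolveLogger` is a hypothetical helper name.

```typescript
// Local stand-ins for the logger exports of @llamaindex/env (assumed shape).
interface Logger {
  log: (...args: unknown[]) => void;
  warn: (...args: unknown[]) => void;
  error: (...args: unknown[]) => void;
}

const consoleLogger: Logger = {
  log: (...args) => console.log(...args),
  warn: (...args) => console.warn(...args),
  error: (...args) => console.error(...args),
};

const emptyLogger: Logger = {
  log: () => {},
  warn: () => {},
  error: () => {},
};

type ProgressOptions = {
  logProgress?: boolean;
  logger?: Logger;
};

// Hypothetical helper mirroring the fallback chain in the diff:
// explicit logger first, console when logProgress is set, silent otherwise.
function resolveLogger(options?: ProgressOptions): Logger {
  return (
    options?.logger ?? (options?.logProgress ? consoleLogger : emptyLogger)
  );
}
```

In the diff, the `logger.log` call stays inside the `if (options?.logProgress)` guard, so a custom logger only receives progress lines when `logProgress` is enabled.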
+16 -9
@@ -1,6 +1,7 @@
import { emptyLogger } from "@llamaindex/env";
import { extractText } from "../utils/llms";
import { streamConverter } from "../utils/stream";
import { callTool, getToolCallsFromResponse } from "./tool-call";
import { callToolToMessage, getToolCallsFromResponse } from "./tool-call";
import type {
ChatMessage,
ChatResponse,
@@ -99,16 +100,19 @@ export abstract class BaseLLM<
if (params.stream) {
return this.streamExec(params);
}
const logger = params.logger ?? emptyLogger;
const newMessages: ChatMessage<AdditionalMessageOptions>[] = [];
const response = await this.chat(params);
newMessages.push(response.message);
const toolCalls = getToolCallsFromResponse(response);
if (params.tools && toolCalls.length > 0) {
for (const toolCall of toolCalls) {
const toolResultMessage = await callTool<AdditionalMessageOptions>(
params.tools,
toolCall,
);
const toolResultMessage =
await callToolToMessage<AdditionalMessageOptions>(
params.tools,
toolCall,
logger,
);
if (toolResultMessage) {
newMessages.push(toolResultMessage);
}
@@ -126,6 +130,7 @@ export abstract class BaseLLM<
AdditionalMessageOptions
>,
): Promise<ExecStreamResponse<AdditionalMessageOptions>> {
const logger = params.logger ?? emptyLogger;
const responseStream = await this.chat(params);
const iterator = responseStream[Symbol.asyncIterator]();
const first = await iterator.next();
@@ -220,10 +225,12 @@ export abstract class BaseLLM<
} as AdditionalMessageOptions,
});
for (const toolCall of toolCalls) {
const toolResultMessage = await callTool<AdditionalMessageOptions>(
params.tools,
toolCall,
);
const toolResultMessage =
await callToolToMessage<AdditionalMessageOptions>(
params.tools,
toolCall,
logger,
);
if (toolResultMessage) {
messages.push(toolResultMessage);
}
+20 -17
@@ -1,3 +1,5 @@
import { type Logger } from "@llamaindex/env";
import { callTool } from "../agent/utils.js";
import { stringifyJSONToMessageContent } from "../utils";
import type {
BaseTool,
@@ -35,27 +37,28 @@ export const getToolCallsFromResponse = (
return [];
};
export const callTool = async <
export const callToolToMessage = async <
AdditionalMessageOptions extends object = object,
>(
tools: BaseTool[],
toolCall: ToolCall,
logger: Logger,
): Promise<ChatMessage<AdditionalMessageOptions> | null> => {
const tool = tools?.find((t) => t.metadata.name === toolCall.name);
// TODO: consider using BaseToolWithCall instead of BaseTool to avoid checking for tool.call
if (tool && tool.call) {
const result = await tool.call(toolCall.input);
const toolResultMessage: ChatMessage<AdditionalMessageOptions> = {
role: "user",
content: stringifyJSONToMessageContent(result),
options: {
toolResult: {
id: toolCall.id,
result,
},
} as AdditionalMessageOptions,
};
return toolResultMessage;
}
return null;
const toolOutput = await callTool(tool, toolCall, logger);
const toolResultMessage: ChatMessage<AdditionalMessageOptions> = {
role: "user",
content: stringifyJSONToMessageContent(toolOutput.output),
options: {
toolResult: {
id: toolCall.id,
result: toolOutput.output,
isError: toolOutput.isError,
},
} as AdditionalMessageOptions,
};
return toolResultMessage;
};
+2
@@ -1,3 +1,4 @@
import type { Logger } from "@llamaindex/env";
import type { Tokenizers } from "@llamaindex/env/tokenizers";
import type { JSONSchemaType } from "ajv";
import { z } from "zod";
@@ -139,6 +140,7 @@ export interface LLMChatParamsBase<
additionalChatOptions?: AdditionalChatOptions | undefined;
tools?: BaseTool[] | undefined;
responseFormat?: z.ZodType | object | undefined;
logger?: Logger | undefined;
}
export interface LLMChatParamsStreaming<
+3 -1
@@ -39,7 +39,9 @@ export abstract class BaseMemoryBlock<
*
* @returns The memory block content as an array of MemoryMessage.
*/
abstract get(): Promise<MemoryMessage<TAdditionalMessageOptions>[]>;
abstract get(
messages?: MemoryMessage<TAdditionalMessageOptions>[],
): Promise<MemoryMessage<TAdditionalMessageOptions>[]>;
/**
* Store the messages in the memory block.
+1
@@ -1,3 +1,4 @@
export { BaseMemoryBlock } from "./base";
export { FactExtractionMemoryBlock } from "./fact";
export { StaticMemoryBlock } from "./static";
export { VectorMemoryBlock } from "./vector";
+250
@@ -0,0 +1,250 @@
import type { BaseEmbedding } from "../../embeddings";
import type { BaseNodePostprocessor } from "../../postprocessor";
import { BasePromptTemplate, defaultContextSystemPrompt } from "../../prompts";
import type { NodeWithScore } from "../../schema";
import { MetadataMode, TextNode } from "../../schema";
import { extractText } from "../../utils/llms";
import type {
BaseVectorStore,
MetadataFilter,
VectorStoreQuery,
} from "../../vector-store";
import { VectorStoreQueryMode } from "../../vector-store";
import type { MemoryMessage } from "../types";
import { BaseMemoryBlock, type MemoryBlockOptions } from "./base";
/**
* The options for the vector memory block.
*/
export type VectorMemoryBlockOptions = {
/**
* The vector store to use for retrieval.
*/
vectorStore: BaseVectorStore;
/**
* Maximum number of messages to include for context when retrieving.
* @default 5
*/
retrievalContextWindow?: number;
/**
* Template for formatting the retrieved information.
* @default defaultContextSystemPrompt
*/
formatTemplate?: BasePromptTemplate;
/**
* List of node postprocessors to apply to the retrieved nodes containing messages.
*
* @default []
*/
nodePostprocessors?: BaseNodePostprocessor[];
/**
* Configuration options for vector store queries when retrieving memory.
*
* @default
* ```typescript
* {
* similarityTopK: 2, // Number of top similar results to return
* mode: VectorStoreQueryMode.DEFAULT, // Query mode for the vector store
* sessionFilterKey: "session_id", // Metadata key for session filtering
* filters: {
* filters: [
* { key: "session_id", value: "<current block id>", operator: "==" }
* ],
* condition: "and"
* }
* }
* ```
*
* Note: A session filter is automatically added to ensure memory isolation between blocks.
* If custom filters are provided, the session filter will be merged with them.
*/
queryOptions?: Partial<VectorMemoryBlockQueryOptions>;
} & MemoryBlockOptions;
export type VectorMemoryBlockQueryOptions = Omit<
VectorStoreQuery,
"queryEmbedding" | "queryStr"
> & {
sessionFilterKey: string;
};
/**
* A memory block that retrieves relevant information from a vector store.
*
* This block stores conversation history in a vector store and retrieves
* relevant information based on the most recent messages.
*/
export class VectorMemoryBlock<
TAdditionalMessageOptions extends object = object,
> extends BaseMemoryBlock<TAdditionalMessageOptions> {
private readonly vectorStore: BaseVectorStore;
private readonly retrievalContextWindow: number;
private readonly formatTemplate: BasePromptTemplate;
private readonly nodePostprocessors: BaseNodePostprocessor[];
private readonly queryOptions: VectorMemoryBlockQueryOptions;
constructor(options: VectorMemoryBlockOptions) {
super(options);
// Validate vector store
if (!options.vectorStore.storesText) {
throw new Error(
"vectorStore must store text to be used as a retrieval memory block",
);
}
this.vectorStore = options.vectorStore;
this.retrievalContextWindow = options.retrievalContextWindow ?? 5;
this.queryOptions = this.buildDefaultQueryOptions(options.queryOptions);
this.formatTemplate = options.formatTemplate ?? defaultContextSystemPrompt;
this.nodePostprocessors = options.nodePostprocessors ?? [];
}
get embedModel(): BaseEmbedding {
return this.vectorStore.embedModel;
}
async get(
messages: MemoryMessage<TAdditionalMessageOptions>[] = [],
): Promise<MemoryMessage<TAdditionalMessageOptions>[]> {
if (messages?.length === 0) return [];
// Use the last message or a context window of messages for the query
let context: MemoryMessage<TAdditionalMessageOptions>[];
if (
this.retrievalContextWindow > 1 &&
messages.length >= this.retrievalContextWindow
) {
context = messages.slice(-this.retrievalContextWindow);
} else {
context = messages;
}
const queryText = context
.map((message) => extractText(message.content))
.join("\n\n");
if (!queryText) return [];
// Create and execute the query
const queryEmbedding = await this.embedModel.getTextEmbedding(queryText);
const query: VectorStoreQuery = {
queryStr: queryText,
queryEmbedding,
...this.queryOptions,
};
const results = await this.vectorStore.query(query);
if (!results.nodes?.length) return [];
// Create nodes with scores
const nodesWithScores: NodeWithScore[] = results.nodes.map(
(node, index) => ({
node,
score: results.similarities?.[index] ?? undefined,
}),
);
// Apply postprocessors
let processedNodes = nodesWithScores;
for (const postprocessor of this.nodePostprocessors) {
processedNodes = await postprocessor.postprocessNodes(
processedNodes,
queryText,
);
}
// Format the results
const retrievedText = processedNodes
.map(({ node }) => node.getContent(MetadataMode.NONE))
.join("\n\n");
const formattedText = this.formatTemplate.format({
context: retrievedText,
});
// Return as memory message
return [
{
id: this.id,
role: "memory",
content: formattedText,
} as MemoryMessage<TAdditionalMessageOptions>,
];
}
async put(
messages: MemoryMessage<TAdditionalMessageOptions>[],
): Promise<void> {
if (messages.length === 0) return;
// Format messages with role, text content, and additional info
const texts: string[] = [];
for (const message of messages) {
const text = extractText(message.content);
if (!text) continue;
let messageText = text;
// Add additional info if present
const additionalInfo = (message.options ?? {}) as Record<string, unknown>;
if (Object.keys(additionalInfo).length > 0) {
messageText += `\nAdditional Info: (${JSON.stringify(additionalInfo)})`;
}
texts.push(`<message role='${message.role}'>${messageText}</message>`);
}
if (texts.length === 0) return;
// Create text node with session metadata
const textNode = new TextNode({
text: texts.join("\n"),
metadata: { [this.queryOptions.sessionFilterKey]: this.id },
});
// Get embedding for the text
textNode.embedding = await this.embedModel.getTextEmbedding(textNode.text);
// Add to vector store
await this.vectorStore.add([textNode]);
}
private buildDefaultQueryOptions(
options: Partial<VectorMemoryBlockQueryOptions> | undefined,
): VectorMemoryBlockQueryOptions {
const {
similarityTopK = 2,
mode = VectorStoreQueryMode.DEFAULT,
sessionFilterKey = "session_id",
} = options ?? {};
let filters = options?.filters;
const sessionFilter: MetadataFilter = {
key: sessionFilterKey,
value: this.id,
operator: "==",
};
if (filters) {
// Only add session_id filter if it doesn't exist in the filters list
const sessionIdFilterExists = filters.filters.some(
(filter) => filter.key === sessionFilterKey,
);
if (!sessionIdFilterExists) {
filters.filters.push(sessionFilter);
}
} else {
// If no filters are provided, add the session_id filter
filters = {
filters: [sessionFilter],
condition: "and",
};
}
return { ...options, similarityTopK, mode, sessionFilterKey, filters };
}
}
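The session-filter merge performed by `buildDefaultQueryOptions` above can be illustrated on its own. This is a sketch under stated assumptions: the `MetadataFilter` and `MetadataFilters` types are simplified local versions, not the full `@llamaindex/core` definitions, and `withSessionFilter` is a hypothetical name. Unlike the method in the diff, which pushes into the caller's `filters.filters` array, this variant returns a copy.

```typescript
// Simplified local types; the real vector-store filter types have more
// operators and value kinds than shown here.
type MetadataFilter = { key: string; value: string; operator: "==" };
type MetadataFilters = {
  filters: MetadataFilter[];
  condition?: "and" | "or";
};

// Hypothetical standalone version of the merge step.
function withSessionFilter(
  sessionId: string,
  sessionFilterKey = "session_id",
  filters?: MetadataFilters,
): MetadataFilters {
  const sessionFilter: MetadataFilter = {
    key: sessionFilterKey,
    value: sessionId,
    operator: "==",
  };

  // No custom filters: isolate by session only.
  if (!filters) {
    return { filters: [sessionFilter], condition: "and" };
  }

  // Custom filters that already constrain the session key are kept as-is.
  const exists = filters.filters.some((f) => f.key === sessionFilterKey);
  if (exists) {
    return filters;
  }

  // Otherwise append the session filter without mutating the input.
  return { ...filters, filters: [...filters.filters, sessionFilter] };
}
```

Returning a copy is only one possible design choice; it avoids surprising callers who reuse the same `queryOptions` object for several blocks.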
+15
@@ -8,6 +8,10 @@ import {
StaticMemoryBlock,
type StaticMemoryBlockOptions,
} from "./block/static";
import {
VectorMemoryBlock,
type VectorMemoryBlockOptions,
} from "./block/vector";
import { DEFAULT_TOKEN_LIMIT, Memory, type MemoryOptions } from "./memory";
import type { MemoryMessage } from "./types";
@@ -115,6 +119,17 @@ export function factExtractionBlock<TMessageOptions extends object = object>(
return new FactExtractionMemoryBlock<TMessageOptions>(options);
}
/**
* Creates a new VectorMemoryBlock
* @param options - Configuration options for the vector memory block
* @returns A new VectorMemoryBlock instance
*/
export function vectorBlock<TMessageOptions extends object = object>(
options: VectorMemoryBlockOptions,
): VectorMemoryBlock<TMessageOptions> {
return new VectorMemoryBlock<TMessageOptions>(options);
}
/**
* Creates a new Memory instance from a snapshot
* @param snapshot The snapshot to load from
+49 -5
@@ -1,3 +1,4 @@
import { consoleLogger, type Logger } from "@llamaindex/env";
import { Settings } from "../global";
import type { ChatMessage, LLM } from "../llms";
import { extractText } from "../utils";
@@ -31,6 +32,18 @@ export type MemoryOptions<TMessageOptions extends object = object> = {
* Used internally for memory restoration from snapshots.
*/
memoryCursor?: number;
/**
* The default LLM to use for memory retrieval.
* If not provided, the default `Settings.llm` will be used.
* This default LLM can be overridden by the LLM passed in the `getLLM` method.
*/
llm?: LLM | undefined;
/**
* Logger for memory operations
*/
logger?: Logger;
};
export class Memory<
@@ -65,6 +78,14 @@ export class Memory<
* The cursor for the messages that have been processed into long-term memory.
*/
private memoryCursor: number = 0;
/**
* The default LLM to use for memory retrieval.
*/
private llm: LLM | undefined;
/**
* Logger for memory operations
*/
private logger: Logger;
constructor(
messages: MemoryMessage<TMessageOptions>[] = [],
@@ -76,6 +97,8 @@ export class Memory<
options.shortTermTokenLimitRatio ?? DEFAULT_SHORT_TERM_TOKEN_LIMIT_RATIO;
this.memoryBlocks = options.memoryBlocks ?? [];
this.memoryCursor = options.memoryCursor ?? 0;
this.logger = options.logger ?? consoleLogger;
this.initLLM(options.llm);
this.adapters = {
...options.customAdapters,
@@ -84,6 +107,15 @@ export class Memory<
} as TAdapters & BuiltinAdapters<TMessageOptions>;
}
private initLLM(llm: LLM | undefined) {
// Safely initialize the LLM without throwing if Settings.llm hasn't been set yet
try {
this.llm = llm ?? Settings.llm;
} catch (error) {
this.llm = undefined;
}
}
/**
* Add a message to the memory
* @param message - The message to add to the memory
@@ -160,12 +192,13 @@ export class Memory<
/**
* Get the messages from the memory, optionally including transient messages.
* only return messages that are within context window of the LLM
* @param llm - To fit the result messages to the context window of the LLM. If not provided, the default token limit will be used.
* @param llm - To fit the result messages to the context window of the LLM (fallback to default llm if not provided).
* If llm is not specified in both the constructor and the method, the default token limit will be used.
* @param transientMessages - Optional transient messages to include.
* @returns The messages from the memory, optionally including transient messages.
*/
async getLLM(
llm?: LLM,
llm: LLM | undefined = this.llm,
transientMessages?: ChatMessage<TMessageOptions>[],
): Promise<ChatMessage[]> {
// Priority of result messages:
@@ -176,11 +209,20 @@ export class Memory<
? Math.ceil(contextWindow * DEFAULT_TOKEN_LIMIT_RATIO)
: this.tokenLimit;
let blockInputMessages = this.messages;
if (transientMessages && transientMessages.length > 0) {
blockInputMessages = [
...this.messages,
...transientMessages.map((m) => this.adapters.llamaindex.toMemory(m)),
];
}
// Start with fixed block messages (priority=0)
// as it must always be included in the retrieval result
const messages = await this.getMemoryBlockMessages(
this.memoryBlocks.filter((block) => block.priority === 0),
tokenLimit,
blockInputMessages,
);
// remaining token limit for short-term and memory blocks content
const remainingTokenLimit =
@@ -207,6 +249,7 @@ export class Memory<
const longTermBlockMessages = await this.getMemoryBlockMessages(
longTermBlocks,
memoryBlocksTokenLimit,
blockInputMessages,
);
messages.push(...longTermBlockMessages);
@@ -252,6 +295,7 @@ export class Memory<
private async getMemoryBlockMessages(
blocks: BaseMemoryBlock<TMessageOptions>[],
tokenLimit?: number,
messages?: MemoryMessage<TMessageOptions>[],
): Promise<ChatMessage<TMessageOptions>[]> {
if (blocks.length === 0) {
return [];
@@ -265,7 +309,7 @@ export class Memory<
let addedTokenCount = 0;
for (const block of sortedBlocks) {
try {
const content = await block.get();
const content = await block.get(messages);
for (const message of content) {
const chatMessage = this.adapters.llamaindex.fromMemory(message);
const messageTokenCount = this.countMessagesToken([chatMessage]);
@@ -276,7 +320,7 @@ export class Memory<
addedTokenCount += messageTokenCount;
}
} catch (error) {
console.warn(
this.logger.warn(
`Failed to get content from memory block ${block.id}:`,
error,
);
@@ -338,7 +382,7 @@ export class Memory<
try {
await block.put(newMessages);
} catch (error) {
console.warn(
this.logger.warn(
`Failed to process messages into memory block ${block.id}:`,
error,
);
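The `initLLM` helper in the `Memory` changes above guards against a global default that throws when it has not been configured. A minimal sketch of that pattern follows; `LazySetting` and `resolveOrUndefined` are hypothetical names used for illustration only, and the throwing getter is an assumption based on the comment in `initLLM`, not a copy of the real `Settings` implementation.

```typescript
// Hypothetical settings holder whose getter throws until a value is set.
class LazySetting<T> {
  private current: T | undefined;

  get value(): T {
    if (this.current === undefined) {
      throw new Error("Setting has not been configured");
    }
    return this.current;
  }

  set value(next: T) {
    this.current = next;
  }
}

// Prefer the explicit value; otherwise try the global default and fall back
// to undefined instead of propagating the "not configured" error.
function resolveOrUndefined<T>(
  explicit: T | undefined,
  fallback: LazySetting<T>,
): T | undefined {
  try {
    return explicit ?? fallback.value;
  } catch {
    return undefined;
  }
}
```

With this shape, a caller such as `getLLM` can treat `undefined` as "no LLM available" and use its default token limit, which matches the behaviour described in the updated doc comment.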
@@ -1,3 +1,4 @@
import { consoleLogger, type Logger } from "@llamaindex/env";
import type { Tokenizer } from "@llamaindex/env/tokenizers";
import { z } from "zod";
import { Settings } from "../global";
@@ -48,9 +49,11 @@ export class SentenceSplitter extends MetadataAwareTextSplitter {
#splitFns: Set<TextSplitterFn> = new Set();
#subSentenceSplitFns: Set<TextSplitterFn> = new Set();
#tokenizer: Tokenizer;
#logger: Logger;
constructor(
params?: z.input<typeof sentenceSplitterSchema> & SplitterParams,
params?: z.input<typeof sentenceSplitterSchema> &
SplitterParams & { logger?: Logger },
) {
super();
if (params) {
@@ -66,6 +69,7 @@ export class SentenceSplitter extends MetadataAwareTextSplitter {
this.extraAbbreviations,
);
this.#tokenizer = params?.tokenizer ?? Settings.tokenizer;
this.#logger = params?.logger ?? consoleLogger;
this.#splitFns.add(splitBySep(this.paragraphSeparator));
this.#splitFns.add(this.#chunkingTokenizerFn);
@@ -82,7 +86,7 @@ export class SentenceSplitter extends MetadataAwareTextSplitter {
`Metadata length (${metadataLength}) is longer than chunk size (${this.chunkSize}). Consider increasing the chunk size or decreasing the size of your metadata to avoid this.`,
);
} else if (effectiveChunkSize < 50) {
console.log(
this.#logger.log(
`Metadata length (${metadataLength}) is close to chunk size (${this.chunkSize}). Resulting chunks are less than 50 tokens. Consider increasing the chunk size or decreasing the size of your metadata to avoid this.`,
);
}
@@ -1,3 +1,4 @@
import { consoleLogger, type Logger } from "@llamaindex/env";
import type { Tokenizer } from "@llamaindex/env/tokenizers";
import { z } from "zod";
import { DEFAULT_CHUNK_OVERLAP, DEFAULT_CHUNK_SIZE, Settings } from "../global";
@@ -21,9 +22,11 @@ export class TokenTextSplitter extends MetadataAwareTextSplitter {
backupSeparators: string[] = ["\n"];
#tokenizer: Tokenizer;
#splitFns: Array<(text: string) => string[]> = [];
#logger: Logger;
constructor(
params?: SplitterParams & Partial<z.infer<typeof tokenTextSplitterSchema>>,
params?: SplitterParams &
Partial<z.infer<typeof tokenTextSplitterSchema>> & { logger?: Logger },
) {
super();
@@ -42,6 +45,7 @@ export class TokenTextSplitter extends MetadataAwareTextSplitter {
}
this.#tokenizer = params?.tokenizer ?? Settings.tokenizer;
this.#logger = params?.logger ?? consoleLogger;
const allSeparators = [this.separator, ...this.backupSeparators];
this.#splitFns = allSeparators.map((sep) => splitBySep(sep));
@@ -65,7 +69,7 @@ export class TokenTextSplitter extends MetadataAwareTextSplitter {
`Consider increasing the chunk size or decreasing the size of your metadata to avoid this.`,
);
} else if (effectiveChunkSize < 50) {
console.warn(
this.#logger.warn(
`Metadata length (${metadataLength}) is close to chunk size (${this.chunkSize}). ` +
`Resulting chunks are less than 50 tokens. Consider increasing the chunk size or decreasing the size of your metadata to avoid this.`,
);
@@ -148,7 +152,7 @@ export class TokenTextSplitter extends MetadataAwareTextSplitter {
const splitLength = this.tokenSize(split);
if (splitLength > chunkSize) {
console.warn(
this.#logger.warn(
`Got a split of size ${splitLength}, larger than chunk size ${chunkSize}.`,
);
}
@@ -1,3 +1,4 @@
import { consoleLogger, type Logger } from "@llamaindex/env";
import { DEFAULT_NAMESPACE } from "../../global";
import { BaseNode, ObjectType, type StoredValue } from "../../schema";
import type { BaseKVStore } from "../kv-store";
@@ -16,13 +17,19 @@ export class KVDocumentStore extends BaseDocumentStore {
private nodeCollection: string;
private refDocCollection: string;
private metadataCollection: string;
private logger: Logger;
constructor(kvstore: BaseKVStore, namespace: string = DEFAULT_NAMESPACE) {
constructor(
kvstore: BaseKVStore,
namespace: string = DEFAULT_NAMESPACE,
options?: { logger?: Logger },
) {
super();
this.kvstore = kvstore;
this.nodeCollection = `${namespace}/data`;
this.refDocCollection = `${namespace}/ref_doc_info`;
this.metadataCollection = `${namespace}/metadata`;
this.logger = options?.logger ?? consoleLogger;
}
async docs(): Promise<Record<string, BaseNode>> {
@@ -33,7 +40,7 @@ export class KVDocumentStore extends BaseDocumentStore {
if (isValidDocJson(value)) {
docs[key] = jsonToDoc(value, this.serializer);
} else {
console.warn(`Invalid JSON for docId ${key}`);
this.logger.warn(`Invalid JSON for docId ${key}`);
}
}
return docs;
+12 -5
@@ -1,4 +1,4 @@
import { path } from "@llamaindex/env";
import { path, type Logger } from "@llamaindex/env";
import { IndexStruct, jsonToIndexStruct } from "../../data-structs";
import {
DEFAULT_INDEX_STORE_PERSIST_FILENAME,
@@ -8,8 +8,8 @@ import {
import {
BaseInMemoryKVStore,
BaseKVStore,
type DataType,
SimpleKVStore,
type DataType,
} from "../kv-store";
export const DEFAULT_PERSIST_PATH = path.join(
@@ -84,16 +84,23 @@ export class SimpleIndexStore extends KVIndexStore {
static async fromPersistDir(
persistDir: string = DEFAULT_PERSIST_DIR,
options?: { logger?: Logger },
): Promise<SimpleIndexStore> {
const persistPath = path.join(
persistDir,
DEFAULT_INDEX_STORE_PERSIST_FILENAME,
);
return this.fromPersistPath(persistPath);
return this.fromPersistPath(persistPath, options);
}
static async fromPersistPath(persistPath: string): Promise<SimpleIndexStore> {
const simpleKVStore = await SimpleKVStore.fromPersistPath(persistPath);
static async fromPersistPath(
persistPath: string,
options?: { logger?: Logger },
): Promise<SimpleIndexStore> {
const simpleKVStore = await SimpleKVStore.fromPersistPath(
persistPath,
options,
);
return new SimpleIndexStore(simpleKVStore);
}
+7 -3
@@ -1,4 +1,4 @@
import { fs, path } from "@llamaindex/env";
import { consoleLogger, fs, path, type Logger } from "@llamaindex/env";
import { DEFAULT_COLLECTION } from "../../global";
import type { StoredValue } from "../../schema";
@@ -98,7 +98,11 @@ export class SimpleKVStore extends BaseKVStore {
await fs.writeFile(persistPath, JSON.stringify(this.data));
}
static async fromPersistPath(persistPath: string): Promise<SimpleKVStore> {
static async fromPersistPath(
persistPath: string,
options?: { logger?: Logger },
): Promise<SimpleKVStore> {
const logger = options?.logger ?? consoleLogger;
const dirPath = path.dirname(persistPath);
if (!(await exists(dirPath))) {
await fs.mkdir(dirPath, { recursive: true });
@@ -106,7 +110,7 @@ export class SimpleKVStore extends BaseKVStore {
let data: DataType = {};
if (!(await exists(persistPath))) {
console.info(`Starting new store from path: ${persistPath}`);
logger.log(`Starting new store from path: ${persistPath}`);
} else {
try {
const fileData = await fs.readFile(persistPath);
+5 -1
@@ -1,3 +1,4 @@
import { consoleLogger, type Logger } from "@llamaindex/env";
import type { JSONSchemaType } from "ajv";
import { z } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema";
@@ -14,11 +15,13 @@ export class FunctionTool<
#additionalArg: AdditionalToolArgument | undefined;
readonly #metadata: ToolMetadata<JSONSchemaType<T>>;
readonly #zodType: z.ZodType<T> | null = null;
readonly #logger: Logger;
constructor(
fn: (input: T, additionalArg?: AdditionalToolArgument) => R,
metadata: ToolMetadata<JSONSchemaType<T>>,
zodType?: z.ZodType<T>,
additionalArg?: AdditionalToolArgument,
logger?: Logger,
) {
this.#fn = fn;
this.#metadata = metadata;
@@ -26,6 +29,7 @@ export class FunctionTool<
this.#zodType = zodType;
}
this.#additionalArg = additionalArg;
this.#logger = logger ?? consoleLogger;
}
static from<T, AdditionalToolArgument extends object = object>(
@@ -140,7 +144,7 @@ export class FunctionTool<
if (result.success) {
params = result.data;
} else {
console.warn(result.error.errors);
this.#logger.warn(result.error.errors);
}
}
return this.#fn.call(null, params, this.#additionalArg);
+4 -1
@@ -101,7 +101,9 @@ export type VectorStoreByType = {
};
export type VectorStoreBaseParams = {
// @deprecated: use embedModel instead
embeddingModel?: BaseEmbedding | undefined;
embedModel?: BaseEmbedding | undefined;
};
export abstract class BaseVectorStore<Client = unknown, T = unknown> {
@@ -117,7 +119,8 @@ export abstract class BaseVectorStore<Client = unknown, T = unknown> {
): Promise<VectorStoreQueryResult>;
protected constructor(params?: VectorStoreBaseParams) {
this.embedModel = params?.embeddingModel ?? Settings.embedModel;
this.embedModel =
params?.embedModel ?? params?.embeddingModel ?? Settings.embedModel;
}
}
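The constructor change above keeps the deprecated `embeddingModel` parameter working while preferring the new `embedModel` name. A small sketch of that precedence, assuming a placeholder `Embedding` type in place of `BaseEmbedding` and a hypothetical `resolveEmbedModel` helper that takes the global default as an argument instead of reading `Settings.embedModel`:

```typescript
// Placeholder for BaseEmbedding; only object identity matters in this sketch.
type Embedding = { name: string };

type VectorStoreBaseParams = {
  /** @deprecated use embedModel instead */
  embeddingModel?: Embedding | undefined;
  embedModel?: Embedding | undefined;
};

// Hypothetical helper: new name first, deprecated alias second,
// global default last.
function resolveEmbedModel(
  params: VectorStoreBaseParams | undefined,
  globalDefault: Embedding,
): Embedding {
  return params?.embedModel ?? params?.embeddingModel ?? globalDefault;
}
```

Because `??` only skips `null` and `undefined`, callers that still pass `embeddingModel` keep their model, and callers that pass both get the one supplied under the new name.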
+37
@@ -1,5 +1,42 @@
# @llamaindex/experimental
## 0.0.203
### Patch Changes
- llamaindex@0.11.26
## 0.0.202
### Patch Changes
- Updated dependencies [049471b]
- llamaindex@0.11.25
## 0.0.201
### Patch Changes
- llamaindex@0.11.24
## 0.0.200
### Patch Changes
- llamaindex@0.11.23
## 0.0.199
### Patch Changes
- llamaindex@0.11.22
## 0.0.198
### Patch Changes
- llamaindex@0.11.21
## 0.0.197
### Patch Changes
+1 -1
@@ -1,7 +1,7 @@
{
"name": "@llamaindex/experimental",
"description": "Experimental package for LlamaIndexTS",
"version": "0.0.197",
"version": "0.0.203",
"type": "module",
"types": "dist/type/index.d.ts",
"main": "dist/cjs/index.js",
+54
View File
@@ -1,5 +1,59 @@
# llamaindex
## 0.11.26
### Patch Changes
- Updated dependencies [4b51791]
- @llamaindex/cloud@4.1.1
## 0.11.25
### Patch Changes
- 049471b: Moved LlamaCloudFileService, LlamaCloudIndex and LlamaCloudRetriever to llama-cloud-services
- Updated dependencies [049471b]
- @llamaindex/cloud@4.1.0
## 0.11.24
### Patch Changes
- Updated dependencies [c3bf3c7]
- Updated dependencies [f9f1de9]
- @llamaindex/cloud@4.0.28
- @llamaindex/core@0.6.19
- @llamaindex/node-parser@2.0.19
- @llamaindex/workflow@1.1.20
## 0.11.23
### Patch Changes
- Updated dependencies [f29799e]
- Updated dependencies [7224c06]
- @llamaindex/workflow@1.1.19
- @llamaindex/core@0.6.18
- @llamaindex/cloud@4.0.27
- @llamaindex/node-parser@2.0.18
## 0.11.22
### Patch Changes
- Updated dependencies [9ed3195]
- @llamaindex/workflow@1.1.18
## 0.11.21
### Patch Changes
- Updated dependencies [38da40b]
- @llamaindex/core@0.6.17
- @llamaindex/cloud@4.0.26
- @llamaindex/node-parser@2.0.17
- @llamaindex/workflow@1.1.17
## 0.11.20
### Patch Changes
+1 -1
@@ -1,6 +1,6 @@
{
"name": "llamaindex",
"version": "0.11.20",
"version": "0.11.26",
"license": "MIT",
"type": "module",
"keywords": [
+6
@@ -1,3 +1,9 @@
console.warn(`
The classes LlamaCloudFileService, LlamaCloudIndex and LlamaCloudRetriever have been moved to the package llama-cloud-services.
* Please migrate your imports to llama-cloud-services, e.g. import { LlamaCloudIndex } from "llama-cloud-services";
* See the documentation: https://docs.cloud.llamaindex.ai
`);
export { LLamaCloudFileService } from "./LLamaCloudFileService.js";
export { LlamaCloudIndex } from "./LlamaCloudIndex.js";
export {
@@ -8,7 +8,7 @@ import {
BaseInMemoryKVStore,
SimpleKVStore,
} from "@llamaindex/core/storage/kv-store";
import { path } from "@llamaindex/env";
import { path, type Logger } from "@llamaindex/env";
import _ from "lodash";
// eslint-disable-next-line @typescript-eslint/no-explicit-any
@@ -27,19 +27,28 @@ export class SimpleDocumentStore extends KVDocumentStore {
static async fromPersistDir(
persistDir: string = DEFAULT_PERSIST_DIR,
namespace?: string,
options?: { logger?: Logger },
): Promise<SimpleDocumentStore> {
const persistPath = path.join(
persistDir,
DEFAULT_DOC_STORE_PERSIST_FILENAME,
);
return await SimpleDocumentStore.fromPersistPath(persistPath, namespace);
return await SimpleDocumentStore.fromPersistPath(
persistPath,
namespace,
options,
);
}
static async fromPersistPath(
persistPath: string,
namespace?: string,
options?: { logger?: Logger },
): Promise<SimpleDocumentStore> {
const simpleKVStore = await SimpleKVStore.fromPersistPath(persistPath);
const simpleKVStore = await SimpleKVStore.fromPersistPath(
persistPath,
options,
);
return new SimpleDocumentStore(simpleKVStore, namespace);
}
@@ -18,7 +18,7 @@ import {
type VectorStoreQuery,
type VectorStoreQueryResult,
} from "@llamaindex/core/vector-store";
import { fs, path } from "@llamaindex/env";
import { consoleLogger, fs, path, type Logger } from "@llamaindex/env";
import { exists } from "../storage/FileSystem.js";
const LEARNER_MODES = new Set<VectorStoreQueryMode>([
@@ -139,9 +139,14 @@ export class SimpleVectorStore extends BaseVectorStore {
static async fromPersistDir(
persistDir: string = DEFAULT_PERSIST_DIR,
embedModel?: BaseEmbedding,
options?: { logger?: Logger },
): Promise<SimpleVectorStore> {
const persistPath = path.join(persistDir, "vector_store.json");
return await SimpleVectorStore.fromPersistPath(persistPath, embedModel);
return await SimpleVectorStore.fromPersistPath(
persistPath,
embedModel,
options,
);
}
client() {
@@ -272,8 +277,10 @@ export class SimpleVectorStore extends BaseVectorStore {
static async fromPersistPath(
persistPath: string,
embeddingModel?: BaseEmbedding,
embedModel?: BaseEmbedding,
options?: { logger?: Logger },
): Promise<SimpleVectorStore> {
const logger = options?.logger ?? consoleLogger;
const dirPath = path.dirname(persistPath);
if (!(await exists(dirPath))) {
await fs.mkdir(dirPath, { recursive: true });
@@ -281,7 +288,7 @@ export class SimpleVectorStore extends BaseVectorStore {
let dataDict: Record<string, unknown> = {};
if (!(await exists(persistPath))) {
console.info(`Starting new store from path: ${persistPath}`);
logger.log(`Starting new store from path: ${persistPath}`);
} else {
try {
const fileData = await fs.readFile(persistPath);
@@ -300,20 +307,20 @@ export class SimpleVectorStore extends BaseVectorStore {
data.textIdToRefDocId = dataDict.textIdToRefDocId ?? {};
// @ts-expect-error TS2322
data.metadataDict = dataDict.metadataDict ?? {};
const store = new SimpleVectorStore({ data, embeddingModel });
const store = new SimpleVectorStore({ data, embedModel });
store.persistPath = persistPath;
return store;
}
static fromDict(
saveDict: SimpleVectorStoreData,
embeddingModel?: BaseEmbedding,
embedModel?: BaseEmbedding,
): SimpleVectorStore {
const data = new SimpleVectorStoreData();
data.embeddingDict = saveDict.embeddingDict;
data.textIdToRefDocId = saveDict.textIdToRefDocId;
data.metadataDict = saveDict.metadataDict;
return new SimpleVectorStore({ data, embeddingModel });
return new SimpleVectorStore({ data, embedModel });
}
toDict(): SimpleVectorStoreData {
+32
@@ -1,5 +1,37 @@
# @llamaindex/core-test
## 0.1.17
### Patch Changes
- Updated dependencies [4c70376]
- @llamaindex/openai@0.4.16
## 0.1.16
### Patch Changes
- Updated dependencies [b6409b6]
- @llamaindex/openai@0.4.15
## 0.1.15
### Patch Changes
- @llamaindex/openai@0.4.14
## 0.1.14
### Patch Changes
- @llamaindex/openai@0.4.13
## 0.1.13
### Patch Changes
- @llamaindex/openai@0.4.12
## 0.1.12
### Patch Changes
+1 -1
@@ -1,7 +1,7 @@
{
"name": "@llamaindex/llamaindex-test",
"private": true,
"version": "0.1.12",
"version": "0.1.17",
"type": "module",
"scripts": {
"test": "vitest run"
@@ -47,22 +47,31 @@ describe("StorageContext", () => {
test("persists and loads", async () => {
const doc = new Document({ text: "test document" });
const consoleInfoSpy = vi
.spyOn(console, "info")
.mockImplementation(() => {});
// Create a Logger that spies on log (info) calls
const spyLogger = {
log: vi.fn(),
error: vi.fn(),
warn: vi.fn(),
};
// storage context from individual stores
const storageContext = await storageContextFromDefaults({
docStore: await SimpleDocumentStore.fromPersistDir(testDir),
vectorStore: await SimpleVectorStore.fromPersistDir(testDir),
indexStore: await SimpleIndexStore.fromPersistDir(testDir),
docStore: await SimpleDocumentStore.fromPersistDir(testDir, undefined, {
logger: spyLogger,
}),
vectorStore: await SimpleVectorStore.fromPersistDir(testDir, undefined, {
logger: spyLogger,
}),
indexStore: await SimpleIndexStore.fromPersistDir(testDir, {
logger: spyLogger,
}),
});
const index = await VectorStoreIndex.fromDocuments([doc], {
storageContext,
});
expect(consoleInfoSpy).toHaveBeenCalledTimes(3);
expect(consoleInfoSpy).toHaveBeenCalledWith(
expect(spyLogger.log).toHaveBeenCalledTimes(3);
expect(spyLogger.log).toHaveBeenCalledWith(
expect.stringContaining("Starting new store"),
);
expect(index).toBeDefined();
@@ -75,13 +84,19 @@ describe("StorageContext", () => {
// Check that the test data files exist
await expectTestDataFilesExist(testDir);
consoleInfoSpy.mockClear();
spyLogger.log.mockClear();
// Now, load it again. Since data was persisted, we should not see the "Starting new store" message.
const newStorageContext = await storageContextFromDefaults({
docStore: await SimpleDocumentStore.fromPersistDir(testDir),
vectorStore: await SimpleVectorStore.fromPersistDir(testDir),
indexStore: await SimpleIndexStore.fromPersistDir(testDir),
docStore: await SimpleDocumentStore.fromPersistDir(testDir, undefined, {
logger: spyLogger,
}),
vectorStore: await SimpleVectorStore.fromPersistDir(testDir, undefined, {
logger: spyLogger,
}),
indexStore: await SimpleIndexStore.fromPersistDir(testDir, {
logger: spyLogger,
}),
});
const loadedIndex = await VectorStoreIndex.init({
@@ -94,9 +109,7 @@ describe("StorageContext", () => {
await expectTestDataFilesExist(testDir);
expect(consoleInfoSpy).not.toHaveBeenCalled();
consoleInfoSpy.mockRestore();
expect(spyLogger.log).not.toHaveBeenCalled();
});
test("throws error on corrupted data", async () => {
@@ -59,7 +59,7 @@ describe("SimpleVectorStore", () => {
}),
];
store = new SimpleVectorStore({
embeddingModel: {} as BaseEmbedding, // Mocking the embedModel
embedModel: {} as BaseEmbedding, // Mocking the embedModel
data: {
embeddingDict: {},
textIdToRefDocId: {},
+22
@@ -1,5 +1,27 @@
# @llamaindex/node-parser
## 2.0.19
### Patch Changes
- Updated dependencies [f9f1de9]
- @llamaindex/core@0.6.19
## 2.0.18
### Patch Changes
- Updated dependencies [f29799e]
- Updated dependencies [7224c06]
- @llamaindex/core@0.6.18
## 2.0.17
### Patch Changes
- Updated dependencies [38da40b]
- @llamaindex/core@0.6.17
## 2.0.16
### Patch Changes

Some files were not shown because too many files have changed in this diff