Compare commits

12 Commits

Author SHA1 Message Date
github-actions[bot] 09beb72f5b Release 0.5.3 (#1038)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2024-07-16 10:25:27 -07:00
Fabian Wimmer 9bbbc67c8e feat: add a reader for Discord messages (#1040) 2024-07-16 10:19:48 -07:00
Brian Peiris b3681bf681 fix: DataCloneError when using FunctionTool (#1037) 2024-07-14 15:24:49 -07:00
Alex Yang b548b1443b chore: bump version (#1032) 2024-07-12 15:14:27 -07:00
github-actions[bot] 0e980d962d Release 0.5.2 (#1035)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2024-07-12 11:44:32 -07:00
Alex Yang 3ed6acc6a6 chore: bump cloud api (#1036) 2024-07-12 11:21:37 -07:00
Parham Saidi 56746c240f fix: bedrock handle empty content and added max tokens export (#1034) 2024-07-12 09:47:49 -07:00
Alex Yang 5c1c2c7f5b ci: only commit lock file (#1031) 2024-07-10 10:17:35 -07:00
github-actions[bot] a699086f46 Release 0.5.1 (#1028)
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2024-07-09 15:36:20 -07:00
Alex Yang 454c204112 chore: bump version (#1029) 2024-07-09 13:42:09 -07:00
Julius Lipp 277468160d feat: add mixedbread ai integration (#953) 2024-07-09 09:36:43 -07:00
Ranjan Mangla a0f424e592 fix: corrected the regex in the ReactAgent (#1022)
Signed-off-by: ranjanmangla1 <ranjanmangla1@gmail.com>
2024-07-09 08:55:38 -07:00
54 changed files with 7202 additions and 5125 deletions
+1
@@ -67,3 +67,4 @@ jobs:
with:
commit_message: "chore: update lock file"
branch: changeset-release/main
file_pattern: "pnpm-lock.yaml"
+23
@@ -1,5 +1,28 @@
# docs
## 0.0.44
### Patch Changes
- Updated dependencies [9bbbc67]
- Updated dependencies [b3681bf]
- llamaindex@0.5.3
## 0.0.43
### Patch Changes
- llamaindex@0.5.2
## 0.0.42
### Patch Changes
- 2774681: Add mixedbread's embeddings and reranking API
- Updated dependencies [2774681]
- Updated dependencies [a0f424e]
- llamaindex@0.5.1
## 0.0.41
### Patch Changes
@@ -0,0 +1,34 @@
import CodeBlock from "@theme/CodeBlock";
import CodeSource from "!raw-loader!../../../../../examples/readers/src/discord";
# DiscordReader
DiscordReader is a simple data loader that reads all messages in a given Discord channel and returns them as Document objects.
It uses the [@discordjs/rest](https://github.com/discordjs/discord.js/tree/main/packages/rest) library to fetch the messages.
## Usage
The first step is to create a Discord Application and generate a bot token [here](https://discord.com/developers/applications).
In your Discord Application, go to the `OAuth2` tab and generate an invite URL by selecting `bot` and then checking `Read Messages/View Channels` as well as `Read Message History`.
The resulting URL invites the bot with the permissions it needs to read messages.
Open the URL in your browser and select the server you want your bot to join.
<CodeBlock language="ts">{CodeSource}</CodeBlock>
### Params
#### DiscordReader()
- `discordToken?`: The Discord bot token.
- `makeRequest?`: Optionally provide a custom request function for edge environments, e.g. `fetch`. See discord.js for more info.
#### DiscordReader.loadData
- `channelIDs`: The ID(s) of Discord channels as an array of strings.
- `limit?`: Optionally limit the number of messages to read.
- `additionalInfo?`: An optional flag to include embedded messages and attachment urls in the document.
- `oldestFirst?`: An optional flag to return the oldest messages first.
## API Reference
- [DiscordReader](../../api/classes/DiscordReader.md)
@@ -0,0 +1,100 @@
# MixedbreadAI
Welcome to the mixedbread embeddings guide! This guide will help you use mixedbread ai's API to generate embeddings for your text documents, ensuring you get the most relevant information, just like picking the freshest bread from the bakery.
To find out more about the latest features, updates, and available models, visit [mixedbread.ai](https://mixedbread-ai.com/).
## Table of Contents
1. [Setup](#setup)
2. [Usage with LlamaIndex](#usage-with-llamaindex)
3. [Embeddings with Custom Parameters](#embeddings-with-custom-parameters)
## Setup
First, you will need to install the `llamaindex` package.
```bash
pnpm install llamaindex
```
Next, sign up for an API key at [mixedbread.ai](https://mixedbread.ai/). Once you have your API key, you can import the necessary modules and create a new instance of the `MixedbreadAIEmbeddings` class.
```ts
import {
MixedbreadAIEmbeddings,
Document,
Settings,
VectorStoreIndex,
} from "llamaindex";
```
## Usage with LlamaIndex
This section will guide you through integrating mixedbread embeddings with LlamaIndex for more advanced usage.
### Step 1: Load and Index Documents
For this example, we will use a single document. In a real-world scenario, you would have multiple documents to index, like a variety of breads in a bakery.
```ts
Settings.embedModel = new MixedbreadAIEmbeddings({
apiKey: "<MIXEDBREAD_API_KEY>",
model: "mixedbread-ai/mxbai-embed-large-v1",
});
const document = new Document({
text: "The true source of happiness.",
id_: "bread",
});
const index = await VectorStoreIndex.fromDocuments([document]);
```
### Step 2: Create a Query Engine
Combine the retriever and the embed model to create a query engine. This setup ensures that your queries are processed to provide the best results, like arranging the bread in the order of freshness and quality.
Some models require a prompt to generate embeddings for queries. For the `mixedbread-ai/mxbai-embed-large-v1` model, the prompt is `Represent this sentence for searching relevant passages:`.
```ts
const queryEngine = index.asQueryEngine();
const query =
"Represent this sentence for searching relevant passages: What is bread?";
// Log the response
const results = await queryEngine.query(query);
console.log(results); // Serving up the freshest, most relevant results.
```
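Because the prompt has to be prepended to every query, it can help to wrap it in a small helper. The following is a minimal sketch only; `QUERY_PROMPT` and `withQueryPrompt` are hypothetical names introduced for illustration and are not part of `llamaindex`.

```ts
// Prompt expected by "mixedbread-ai/mxbai-embed-large-v1" for queries (taken from the text above).
const QUERY_PROMPT = "Represent this sentence for searching relevant passages: ";

// Hypothetical helper: prepend the prompt unless the query already starts with it.
function withQueryPrompt(question: string): string {
  const trimmed = question.trim();
  return trimmed.startsWith(QUERY_PROMPT.trim())
    ? trimmed
    : QUERY_PROMPT + trimmed;
}

console.log(withQueryPrompt("What is bread?"));
```

The returned string can then be passed to `queryEngine.query(...)` in place of the hand-written query.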
## Embeddings with Custom Parameters
This section will guide you through generating embeddings with custom parameters, for example to produce matryoshka or binary embeddings.
### Step 1: Create an Instance of MixedbreadAIEmbeddings
Create a new instance of the `MixedbreadAIEmbeddings` class with custom parameters. For example, to use the `mixedbread-ai/mxbai-embed-large-v1` model with a batch size of 64, normalized embeddings, 512 dimensions, and the binary encoding format:
```ts
const embeddings = new MixedbreadAIEmbeddings({
apiKey: "<MIXEDBREAD_API_KEY>",
model: "mixedbread-ai/mxbai-embed-large-v1",
embedBatchSize: 64,
normalized: true,
dimensions: 512,
// `MixedbreadAI` is imported from "@mixedbread-ai/sdk"
encodingFormat: MixedbreadAI.EncodingFormat.Binary,
});
```
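To give an intuition for what `dimensions` and a binary encoding mean, here is a purely conceptual sketch of the two ideas: keeping only the leading components of a vector (the matryoshka idea) and reducing each component to a single bit. It is an assumption-laden illustration, not the API's implementation or its output format; `truncateAndNormalize` and `binarize` are hypothetical helpers.

```ts
// Keep the first `dims` components and rescale the result to unit length.
function truncateAndNormalize(vector: number[], dims: number): number[] {
  const head = vector.slice(0, dims);
  const norm = Math.sqrt(head.reduce((sum, x) => sum + x * x, 0));
  return norm === 0 ? head : head.map((x) => x / norm);
}

// Map every component to one bit: 1 for positive values, 0 otherwise.
function binarize(vector: number[]): number[] {
  return vector.map((x) => (x > 0 ? 1 : 0));
}

const full = [3, -4, 12, 0.5];
console.log(truncateAndNormalize(full, 2)); // shorter vector, still unit length
console.log(binarize(full)); // one bit per component
```

Both transformations trade some accuracy for smaller vectors, which is why they are exposed as optional parameters.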
### Step 2: Define Texts
Define the texts you want to generate embeddings for.
```ts
const texts = ["Bread is life", "Bread is love"];
```
### Step 3: Generate Embeddings
Use the `getTextEmbeddings` method to generate embeddings for the texts.
```ts
const result = await embeddings.getTextEmbeddings(texts);
console.log(result); // Perfectly customized embeddings, ready to serve.
```
@@ -0,0 +1,164 @@
# MixedbreadAI
Welcome to the mixedbread ai reranker guide! This guide will help you use mixedbread ai's API to rerank search query results, ensuring you get the most relevant information, just like picking the freshest bread from the bakery.
To find out more about the latest features and updates, visit [mixedbread.ai](https://mixedbread.ai/).
## Table of Contents
1. [Setup](#setup)
2. [Usage with LlamaIndex](#usage-with-llamaindex)
3. [Simple Reranking Guide](#simple-reranking-guide)
4. [Reranking with Objects](#reranking-with-objects)
## Setup
First, you will need to install the `llamaindex` package.
```bash
pnpm install llamaindex
```
Next, sign up for an API key at [mixedbread.ai](https://mixedbread.ai/). Once you have your API key, you can import the necessary modules and create a new instance of the `MixedbreadAIReranker` class.
```ts
import {
MixedbreadAIReranker,
Document,
OpenAI,
VectorStoreIndex,
Settings,
} from "llamaindex";
```
## Usage with LlamaIndex
This section will guide you through integrating mixedbread's reranker with LlamaIndex.
### Step 1: Load and Index Documents
For this example, we will use a single document. In a real-world scenario, you would have multiple documents to index, like a variety of breads in a bakery.
```ts
const document = new Document({
text: "This is a sample document.",
id_: "sampleDoc",
});
Settings.llm = new OpenAI({ model: "gpt-3.5-turbo", temperature: 0.1 });
const index = await VectorStoreIndex.fromDocuments([document]);
```
### Step 2: Increase Similarity TopK
The default value for `similarityTopK` is 2, which means only the two most similar documents will be returned. To get more results, like picking a variety of fresh breads, you can increase the value of `similarityTopK`.
```ts
const retriever = index.asRetriever();
retriever.similarityTopK = 5;
```
### Step 3: Create a MixedbreadAIReranker Instance
Create a new instance of the `MixedbreadAIReranker` class.
```ts
const nodePostprocessor = new MixedbreadAIReranker({
apiKey: "<MIXEDBREAD_API_KEY>",
topN: 4,
});
```
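The two settings above work as a pipeline: the retriever hands over its `similarityTopK` best candidates, and the reranker keeps the best `topN` of those after re-scoring them. The sketch below illustrates that flow with plain arrays. `retrieveThenRerank` and the toy `rescore` function are hypothetical and stand in for the real retriever and the mixedbread API.

```ts
type Scored = { text: string; score: number };

// Return the `n` highest-scoring items without mutating the input.
function topBy(items: Scored[], n: number): Scored[] {
  return [...items].sort((a, b) => b.score - a.score).slice(0, n);
}

// Stage 1 keeps `topK` candidates by similarity; stage 2 re-scores them and keeps `topN`.
function retrieveThenRerank(
  candidates: Scored[],
  rescore: (text: string) => number,
  topK: number,
  topN: number,
): Scored[] {
  const retrieved = topBy(candidates, topK);
  const rescored = retrieved.map((c) => ({ text: c.text, score: rescore(c.text) }));
  return topBy(rescored, topN);
}

const candidates: Scored[] = [
  { text: "rye", score: 0.9 },
  { text: "sourdough", score: 0.8 },
  { text: "baguette", score: 0.7 },
  { text: "brioche", score: 0.6 },
  { text: "focaccia", score: 0.5 },
  { text: "ciabatta", score: 0.4 },
];

// Toy relevance score (text length), used only to make the example deterministic.
const reranked = retrieveThenRerank(candidates, (text) => text.length, 5, 4);
console.log(reranked.map((c) => c.text));
```

Note that `topN` can never yield more results than `similarityTopK` provides, so the retriever setting acts as an upper bound.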
### Step 4: Create a Query Engine
Combine the retriever and node postprocessor to create a query engine. This setup ensures that your queries are processed and reranked to provide the best results, like arranging the bread in the order of freshness and quality.
```ts
const queryEngine = index.asQueryEngine({
retriever,
nodePostprocessors: [nodePostprocessor],
});
// Log the response
const response = await queryEngine.query("Where did the author grow up?");
console.log(response);
```
With mixedbread's Reranker, you're all set to serve up the most relevant and well-ordered results, just like a skilled baker arranging their best breads for eager customers. Enjoy the perfect blend of technology and culinary delight!
## Simple Reranking Guide
This section will guide you through a simple reranking process using mixedbread ai.
### Step 1: Create an Instance of MixedbreadAIReranker
Create a new instance of the `MixedbreadAIReranker` class, passing in your API key and the number of results you want to return. It's like setting up your bakery to offer a specific number of freshly baked items.
```ts
const reranker = new MixedbreadAIReranker({
apiKey: "<MIXEDBREAD_API_KEY>",
topN: 4,
});
```
### Step 2: Define Nodes and Query
Define the nodes (documents) you want to rerank and the query.
```ts
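// BaseNode is abstract, so use the concrete TextNode (also exported by "llamaindex").
const nodes = [
{ node: new TextNode({ text: "To bake bread you need flour" }) },
{ node: new TextNode({ text: "To bake bread you need yeast" }) },
];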
const query = "What do you need to bake bread?";
```
### Step 3: Perform Reranking
Use the `postprocessNodes` method to rerank the nodes based on the query.
```ts
const result = await reranker.postprocessNodes(nodes, query);
console.log(result); // Like pulling freshly baked nodes out of the oven.
```
## Reranking with Objects
This section will guide you through reranking when working with objects.
### Step 1: Create an Instance of MixedbreadAIReranker
Create a new instance of the `MixedbreadAIReranker` class, just like before.
```ts
const reranker = new MixedbreadAIReranker({
apiKey: "<MIXEDBREAD_API_KEY>",
model: "mixedbread-ai/mxbai-rerank-large-v1",
topK: 5,
rankFields: ["title", "content"],
returnInput: true,
maxRetries: 5,
});
```
### Step 2: Define Documents and Query
Define the documents (objects) you want to rerank and the query.
```ts
const documents = [
{ title: "Bread Recipe", content: "To bake bread you need flour" },
{ title: "Bread Recipe", content: "To bake bread you need yeast" },
];
const query = "What do you need to bake bread?";
```
### Step 3: Perform Reranking
Use the `rerank` method to reorder the documents based on the query.
```ts
const result = await reranker.rerank(documents, query);
console.log(result); // Perfectly customized results, ready to serve.
```
+19 -19
@@ -1,6 +1,6 @@
{
"name": "docs",
"version": "0.0.41",
"version": "0.0.44",
"private": true,
"scripts": {
"docusaurus": "docusaurus",
@@ -15,29 +15,29 @@
"typecheck": "tsc"
},
"dependencies": {
"@docusaurus/core": "^3.3.2",
"@docusaurus/remark-plugin-npm2yarn": "^3.3.2",
"@docusaurus/core": "3.4.0",
"@docusaurus/remark-plugin-npm2yarn": "3.4.0",
"@llamaindex/examples": "workspace:*",
"@mdx-js/react": "^3.0.1",
"clsx": "^2.1.1",
"@mdx-js/react": "3.0.1",
"clsx": "2.1.1",
"llamaindex": "workspace:*",
"postcss": "^8.4.38",
"prism-react-renderer": "^2.3.1",
"raw-loader": "^4.0.2",
"react": "^18.3.1",
"react-dom": "^18.3.1"
"postcss": "8.4.39",
"prism-react-renderer": "2.3.1",
"raw-loader": "4.0.2",
"react": "18.3.1",
"react-dom": "18.3.1"
},
"devDependencies": {
"@docusaurus/module-type-aliases": "3.3.2",
"@docusaurus/preset-classic": "^3.3.2",
"@docusaurus/theme-classic": "^3.3.2",
"@docusaurus/types": "^3.3.2",
"@tsconfig/docusaurus": "^2.0.3",
"@docusaurus/module-type-aliases": "3.4.0",
"@docusaurus/preset-classic": "3.4.0",
"@docusaurus/theme-classic": "3.4.0",
"@docusaurus/types": "3.4.0",
"@tsconfig/docusaurus": "2.0.3",
"@types/node": "^20.12.11",
"docusaurus-plugin-typedoc": "^1.0.1",
"typedoc": "^0.25.13",
"typedoc-plugin-markdown": "^4.0.1",
"typescript": "^5.5.2"
"docusaurus-plugin-typedoc": "1.0.3",
"typedoc": "0.26.4",
"typedoc-plugin-markdown": "4.1.2",
"typescript": "^5.5.3"
},
"browserslist": {
"production": [
+2 -2
@@ -9,7 +9,7 @@
"@llamaindex/core": "^0.1.0",
"@notionhq/client": "^2.2.15",
"@pinecone-database/pinecone": "^2.2.2",
"@zilliz/milvus2-sdk-node": "^2.4.2",
"@zilliz/milvus2-sdk-node": "^2.4.4",
"chromadb": "^1.8.1",
"commander": "^12.1.0",
"dotenv": "^16.4.5",
@@ -21,7 +21,7 @@
"devDependencies": {
"@types/node": "^20.14.1",
"tsx": "^4.15.6",
"typescript": "^5.5.2"
"typescript": "^5.5.3"
},
"scripts": {
"lint": "eslint ."
+3 -2
@@ -12,7 +12,8 @@
"start:llamaparse": "node --import tsx ./src/llamaparse.ts",
"start:notion": "node --import tsx ./src/notion.ts",
"start:llamaparse-dir": "node --import tsx ./src/simple-directory-reader-with-llamaparse.ts",
"start:llamaparse-json": "node --import tsx ./src/llamaparse-json.ts"
"start:llamaparse-json": "node --import tsx ./src/llamaparse-json.ts",
"start:discord": "node --import tsx ./src/discord.ts"
},
"dependencies": {
"llamaindex": "*"
@@ -20,6 +21,6 @@
"devDependencies": {
"@types/node": "^20.12.11",
"tsx": "^4.15.6",
"typescript": "^5.5.2"
"typescript": "^5.5.3"
}
}
+20
@@ -0,0 +1,20 @@
import { DiscordReader } from "llamaindex";
async function main() {
// Create an instance of the DiscordReader. Pass the bot token here or set the DISCORD_TOKEN environment variable
const discordReader = new DiscordReader();
// Specify the channel IDs you want to read messages from as an array of strings
const channelIds = ["721374320794009630", "719596376261918720"];
// Specify the number of messages to fetch per channel
const limit = 10;
// Load messages from the specified channels
const messages = await discordReader.loadData(channelIds, limit, true);
// Print out the messages
console.log(messages);
}
main().catch(console.error);
+5 -5
@@ -21,19 +21,19 @@
"@changesets/cli": "^2.27.5",
"@typescript-eslint/eslint-plugin": "^7.13.1",
"eslint": "^8.57.0",
"eslint-config-next": "^14.2.4",
"eslint-config-next": "^14.2.5",
"eslint-config-prettier": "^9.1.0",
"eslint-config-turbo": "^2.0.5",
"eslint-plugin-react": "7.34.1",
"eslint-plugin-react": "7.34.3",
"husky": "^9.0.11",
"lint-staged": "^15.2.7",
"madge": "^7.0.0",
"prettier": "^3.3.2",
"prettier-plugin-organize-imports": "^3.2.4",
"prettier-plugin-organize-imports": "^4.0.0",
"turbo": "^2.0.5",
"typescript": "^5.5.2"
"typescript": "^5.5.3"
},
"packageManager": "pnpm@9.4.0",
"packageManager": "pnpm@9.5.0",
"pnpm": {
"overrides": {
"trim": "1.0.1",
@@ -5,7 +5,7 @@
"dependencies": {
"@llamaindex/autotool": "workspace:*",
"llamaindex": "workspace:*",
"openai": "^4.52.0"
"openai": "^4.52.5"
},
"devDependencies": {
"tsx": "^4.15.6"
@@ -1,5 +1,30 @@
# @llamaindex/autotool-02-next-example
## 0.1.28
### Patch Changes
- Updated dependencies [9bbbc67]
- Updated dependencies [b3681bf]
- llamaindex@0.5.3
- @llamaindex/autotool@2.0.0
## 0.1.27
### Patch Changes
- llamaindex@0.5.2
- @llamaindex/autotool@2.0.0
## 0.1.26
### Patch Changes
- Updated dependencies [2774681]
- Updated dependencies [a0f424e]
- llamaindex@0.5.1
- @llamaindex/autotool@2.0.0
## 0.1.25
### Patch Changes
@@ -1,7 +1,7 @@
{
"name": "@llamaindex/autotool-02-next-example",
"private": true,
"version": "0.1.25",
"version": "0.1.28",
"scripts": {
"dev": "next dev",
"build": "next build",
@@ -14,7 +14,7 @@
"class-variance-authority": "^0.7.0",
"dotenv": "^16.3.1",
"llamaindex": "workspace:*",
"lucide-react": "^0.378.0",
"lucide-react": "^0.407.0",
"next": "14.3.0-canary.51",
"react": "^18.3.1",
"react-dom": "^18.3.1",
@@ -32,6 +32,6 @@
"cross-env": "^7.0.3",
"postcss": "^8.4.32",
"tailwindcss": "^3.4.4",
"typescript": "^5.5.2"
"typescript": "^5.5.3"
}
}
+5 -5
@@ -47,11 +47,11 @@
"dependencies": {
"@swc/core": "^1.6.3",
"jotai": "^2.8.3",
"typedoc": "^0.25.13",
"typedoc": "^0.26.4",
"unplugin": "^1.10.1"
},
"peerDependencies": {
"llamaindex": "^0.5.0",
"llamaindex": "^0.5.3",
"openai": "^4",
"typescript": "^4"
},
@@ -72,11 +72,11 @@
"@types/node": "^20.12.11",
"bunchee": "5.3.0-beta.0",
"llamaindex": "workspace:*",
"next": "14.2.3",
"next": "14.2.5",
"rollup": "^4.18.0",
"tsx": "^4.15.6",
"typescript": "^5.5.2",
"vitest": "^1.6.0",
"typescript": "^5.5.3",
"vitest": "^2.0.2",
"webpack": "^5.92.1"
}
}
+6
@@ -1,5 +1,11 @@
# @llamaindex/cloud
## 0.2.0
### Minor Changes
- 3ed6acc: feat: cloud api change
## 0.1.4
### Patch Changes
File diff suppressed because it is too large
+1 -1
@@ -1,6 +1,6 @@
{
"name": "@llamaindex/cloud",
"version": "0.1.4",
"version": "0.2.0",
"type": "module",
"license": "MIT",
"scripts": {
+13
@@ -1,5 +1,18 @@
# @llamaindex/community
## 0.0.21
### Patch Changes
- Updated dependencies [b3681bf]
- @llamaindex/core@0.1.1
## 0.0.20
### Patch Changes
- 56746c2: fix: llama3 patched to handle empty content (can happen with system) and added max tokens export
## 0.0.19
### Patch Changes
+2 -2
@@ -1,7 +1,7 @@
{
"name": "@llamaindex/community",
"description": "Community package for LlamaIndexTS",
"version": "0.0.19",
"version": "0.0.21",
"type": "module",
"types": "dist/type/index.d.ts",
"main": "dist/cjs/index.js",
@@ -46,7 +46,7 @@
"bunchee": "5.3.0-beta.0"
},
"dependencies": {
"@aws-sdk/client-bedrock-runtime": "^3.600.0",
"@aws-sdk/client-bedrock-runtime": "^3.613.0",
"@llamaindex/core": "workspace:*"
}
}
+5 -1
@@ -1 +1,5 @@
export { BEDROCK_MODELS, Bedrock } from "./llm/bedrock/base.js";
export {
BEDROCK_MODELS,
BEDROCK_MODEL_MAX_TOKENS,
Bedrock,
} from "./llm/bedrock/base.js";
@@ -150,6 +150,18 @@ export type BedrockModelParams = {
maxTokens?: number;
};
export const BEDROCK_MODEL_MAX_TOKENS: Partial<Record<BEDROCK_MODELS, number>> =
{
[BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_SONNET]: 4096,
[BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_HAIKU]: 4096,
[BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_OPUS]: 4096,
[BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_5_SONNET]: 4096,
[BEDROCK_MODELS.META_LLAMA2_13B_CHAT]: 2048,
[BEDROCK_MODELS.META_LLAMA2_70B_CHAT]: 2048,
[BEDROCK_MODELS.META_LLAMA3_8B_INSTRUCT]: 2048,
[BEDROCK_MODELS.META_LLAMA3_70B_INSTRUCT]: 2048,
};
const DEFAULT_BEDROCK_PARAMS = {
temperature: 0.1,
topP: 1,
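The hunk above exports the per-model limits but does not show how they are consumed. One plausible use, sketched here under that assumption, is clamping a requested `maxTokens` to the model's limit when one is known. `DemoModel`, `DEMO_MAX_TOKENS`, and `clampMaxTokens` are hypothetical stand-ins for `BEDROCK_MODELS`, `BEDROCK_MODEL_MAX_TOKENS`, and whatever the library actually does with them.

```ts
// Hypothetical stand-in for the BEDROCK_MODELS enum.
enum DemoModel {
  Claude = "demo-claude",
  Llama = "demo-llama",
  Other = "demo-other",
}

// Partial<Record<...>> means some models may have no entry at all.
const DEMO_MAX_TOKENS: Partial<Record<DemoModel, number>> = {
  [DemoModel.Claude]: 4096,
  [DemoModel.Llama]: 2048,
};

// Clamp the requested value to the model's limit; pass it through if no limit is known.
function clampMaxTokens(model: DemoModel, requested: number): number {
  const limit = DEMO_MAX_TOKENS[model];
  return limit === undefined ? requested : Math.min(requested, limit);
}

console.log(clampMaxTokens(DemoModel.Llama, 8000));
console.log(clampMaxTokens(DemoModel.Other, 8000));
```

Because the map is a `Partial`, callers have to handle the `undefined` case explicitly, which the type checker enforces under `strict` settings.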
+2 -2
@@ -154,10 +154,10 @@ export const mapChatMessagesToMetaMessages = <T extends ChatMessage>(
messages: T[],
): MetaMessage[] => {
return messages.map((msg) => {
let content: string;
let content: string = "";
if (typeof msg.content === "string") {
content = msg.content;
} else {
} else if (msg.content.length) {
content = (msg.content[0] as MessageContentTextDetail).text;
}
return {
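The effect of this change can be reproduced in isolation. In the sketch below, `TextDetail`, `Message`, and `firstText` are simplified, hypothetical stand-ins for the library's `MessageContentTextDetail`, `ChatMessage`, and the body of the `map` callback. With the old code, an empty content array would reach `msg.content[0].text` and fail on `undefined`; with the new guard and default, it yields an empty string.

```ts
// Simplified stand-ins for the library's message types (assumptions for this sketch).
type TextDetail = { type: "text"; text: string };
type Message = { role: string; content: string | TextDetail[] };

function firstText(msg: Message): string {
  // Default value, so an empty content array no longer leaves `content` unassigned.
  let content = "";
  if (typeof msg.content === "string") {
    content = msg.content;
  } else if (msg.content.length) {
    // Only read the first element when the array is non-empty.
    content = msg.content[0].text;
  }
  return content;
}

console.log(JSON.stringify(firstText({ role: "system", content: [] })));
console.log(firstText({ role: "user", content: [{ type: "text", text: "hi" }] }));
```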
+6
@@ -1,5 +1,11 @@
# @llamaindex/core
## 0.1.1
### Patch Changes
- b3681bf: fix: DataCloneError when using FunctionTool
## 0.1.0
### Minor Changes
+1 -1
@@ -1,7 +1,7 @@
{
"name": "@llamaindex/core",
"type": "module",
"version": "0.1.0",
"version": "0.1.1",
"description": "LlamaIndex Core Module",
"exports": {
"./llms": {
@@ -105,9 +105,7 @@ export class CallbackManager {
}
queueMicrotask(() => {
cbs.forEach((handler) =>
handler(
LlamaIndexCustomEvent.fromEvent(event, structuredClone(detail)),
),
handler(LlamaIndexCustomEvent.fromEvent(event, { ...detail })),
);
});
}
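The reason for replacing `structuredClone(detail)` with a spread can be shown without the callback manager: the structured clone algorithm rejects functions, so any event detail that carries one (as a `FunctionTool` does) throws a `DataCloneError`. This minimal sketch assumes a runtime that provides a global `structuredClone` (for example Node.js 17 or later); the `detail` object is made up for illustration.

```ts
// An event detail that carries a function, similar to the new test case.
const detail = { fn: ({ x }: { x: number }) => `${x * 2}` };

let cloneFailed = false;
try {
  structuredClone(detail);
} catch (error) {
  // Functions are not cloneable, so this branch runs.
  cloneFailed = true;
}

// A shallow copy creates a new top-level object but keeps the same function reference.
const shallow = { ...detail };
console.log(cloneFailed, shallow !== detail, shallow.fn === detail.fn);
```

The trade-off is that a shallow copy shares nested objects with the dispatcher, so a handler that mutates nested data also changes what the caller sees.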
+22
@@ -6,6 +6,9 @@ declare module "@llamaindex/core/global" {
test: {
value: number;
};
functionTest: {
fn: ({ x }: { x: number }) => string;
};
}
}
@@ -42,6 +45,25 @@ describe("event system", () => {
expect(callback).toHaveBeenCalledTimes(1);
});
test("dispatch function tool event", async () => {
const testFunction = ({ x }: { x: number }) => `${x * 2}`;
let callback;
Settings.callbackManager.on(
"functionTest",
(callback = vi.fn((event) => {
const data = event.detail;
expect(data.fn).toBe(testFunction);
})),
);
Settings.callbackManager.dispatchEvent("functionTest", {
fn: testFunction,
});
expect(callback).toHaveBeenCalledTimes(0);
await new Promise((resolve) => process.nextTick(resolve));
expect(callback).toHaveBeenCalledTimes(1);
});
// rollup doesn't support decorators for now
// test('wrap event caller', async () => {
// class A {
+1 -1
@@ -7,6 +7,6 @@
},
"devDependencies": {
"@llamaindex/core": "workspace:*",
"vitest": "^1.6.0"
"vitest": "^2.0.2"
}
}
+2 -2
@@ -68,11 +68,11 @@
},
"devDependencies": {
"@aws-crypto/sha256-js": "^5.2.0",
"@swc/cli": "^0.3.12",
"@swc/cli": "^0.4.0",
"@swc/core": "^1.6.3",
"concurrently": "^8.2.2",
"pathe": "^1.1.2",
"vitest": "^1.6.0"
"vitest": "^2.0.2"
},
"dependencies": {
"@types/lodash": "^4.17.5",
+22
@@ -1,5 +1,27 @@
# @llamaindex/experimental
## 0.0.53
### Patch Changes
- Updated dependencies [9bbbc67]
- Updated dependencies [b3681bf]
- llamaindex@0.5.3
## 0.0.52
### Patch Changes
- llamaindex@0.5.2
## 0.0.51
### Patch Changes
- Updated dependencies [2774681]
- Updated dependencies [a0f424e]
- llamaindex@0.5.1
## 0.0.50
### Patch Changes
+2 -2
@@ -1,7 +1,7 @@
{
"name": "@llamaindex/experimental",
"description": "Experimental package for LlamaIndexTS",
"version": "0.0.50",
"version": "0.0.53",
"type": "module",
"types": "dist/type/index.d.ts",
"main": "dist/cjs/index.js",
@@ -56,7 +56,7 @@
},
"devDependencies": {
"@aws-crypto/sha256-js": "^5.2.0",
"@swc/cli": "^0.3.12",
"@swc/cli": "^0.4.0",
"@swc/core": "^1.6.3",
"@types/jsonpath": "^0.2.4",
"concurrently": "^8.2.2",
+23
@@ -1,5 +1,28 @@
# llamaindex
## 0.5.3
### Patch Changes
- 9bbbc67: feat: add a reader for Discord messages
- b3681bf: fix: DataCloneError when using FunctionTool
- Updated dependencies [b3681bf]
- @llamaindex/core@0.1.1
## 0.5.2
### Patch Changes
- Updated dependencies [3ed6acc]
- @llamaindex/cloud@0.2.0
## 0.5.1
### Patch Changes
- 2774681: Add mixedbread's embeddings and reranking API
- a0f424e: corrected the regex in the react.ts file in extractToolUse & extractJsonStr functions, as mentioned in https://github.com/run-llama/LlamaIndexTS/issues/1019
## 0.5.0
### Minor Changes
@@ -1,5 +1,27 @@
# @llamaindex/cloudflare-worker-agent-test
## 0.0.37
### Patch Changes
- Updated dependencies [9bbbc67]
- Updated dependencies [b3681bf]
- llamaindex@0.5.3
## 0.0.36
### Patch Changes
- llamaindex@0.5.2
## 0.0.35
### Patch Changes
- Updated dependencies [2774681]
- Updated dependencies [a0f424e]
- llamaindex@0.5.1
## 0.0.34
### Patch Changes
@@ -1,6 +1,6 @@
{
"name": "@llamaindex/cloudflare-worker-agent-test",
"version": "0.0.34",
"version": "0.0.37",
"type": "module",
"private": true,
"scripts": {
@@ -12,13 +12,13 @@
"cf-typegen": "wrangler types"
},
"devDependencies": {
"@cloudflare/vitest-pool-workers": "^0.4.3",
"@cloudflare/workers-types": "^4.20240605.0",
"@vitest/runner": "1.3.0",
"@vitest/snapshot": "1.3.0",
"typescript": "^5.5.2",
"vitest": "1.3.0",
"wrangler": "^3.60.1"
"@cloudflare/vitest-pool-workers": "^0.4.10",
"@cloudflare/workers-types": "^4.20240620.0",
"@vitest/runner": "1.5.3",
"@vitest/snapshot": "1.5.3",
"typescript": "^5.5.3",
"vitest": "1.5.3",
"wrangler": "^3.63.2"
},
"dependencies": {
"llamaindex": "workspace:*"
@@ -1,5 +1,27 @@
# @llamaindex/next-agent-test
## 0.1.37
### Patch Changes
- Updated dependencies [9bbbc67]
- Updated dependencies [b3681bf]
- llamaindex@0.5.3
## 0.1.36
### Patch Changes
- llamaindex@0.5.2
## 0.1.35
### Patch Changes
- Updated dependencies [2774681]
- Updated dependencies [a0f424e]
- llamaindex@0.5.1
## 0.1.34
### Patch Changes
@@ -1,6 +1,6 @@
{
"name": "@llamaindex/next-agent-test",
"version": "0.1.34",
"version": "0.1.37",
"private": true,
"scripts": {
"dev": "next dev",
@@ -11,7 +11,7 @@
"dependencies": {
"ai": "^3.2.1",
"llamaindex": "workspace:*",
"next": "14.2.4",
"next": "14.2.5",
"react": "18.3.1",
"react-dom": "18.3.1"
},
@@ -20,9 +20,9 @@
"@types/react": "^18.3.3",
"@types/react-dom": "^18.3.0",
"eslint": "^8.57.0",
"eslint-config-next": "14.2.3",
"eslint-config-next": "14.2.5",
"postcss": "^8",
"tailwindcss": "^3.4.4",
"typescript": "^5.5.2"
"typescript": "^5.5.3"
}
}
@@ -1,5 +1,27 @@
# test-edge-runtime
## 0.1.36
### Patch Changes
- Updated dependencies [9bbbc67]
- Updated dependencies [b3681bf]
- llamaindex@0.5.3
## 0.1.35
### Patch Changes
- llamaindex@0.5.2
## 0.1.34
### Patch Changes
- Updated dependencies [2774681]
- Updated dependencies [a0f424e]
- llamaindex@0.5.1
## 0.1.33
### Patch Changes
@@ -1,6 +1,6 @@
{
"name": "@llamaindex/nextjs-edge-runtime-test",
"version": "0.1.33",
"version": "0.1.36",
"private": true,
"scripts": {
"dev": "next dev",
@@ -9,7 +9,7 @@
},
"dependencies": {
"llamaindex": "workspace:*",
"next": "14.2.4",
"next": "14.2.5",
"react": "^18.3.1",
"react-dom": "^18.3.1"
},
@@ -17,6 +17,6 @@
"@types/node": "^20.12.11",
"@types/react": "^18.3.3",
"@types/react-dom": "^18.3.0",
"typescript": "^5.5.2"
"typescript": "^5.5.3"
}
}
@@ -1,5 +1,27 @@
# @llamaindex/next-node-runtime
## 0.0.18
### Patch Changes
- Updated dependencies [9bbbc67]
- Updated dependencies [b3681bf]
- llamaindex@0.5.3
## 0.0.17
### Patch Changes
- llamaindex@0.5.2
## 0.0.16
### Patch Changes
- Updated dependencies [2774681]
- Updated dependencies [a0f424e]
- llamaindex@0.5.1
## 0.0.15
### Patch Changes
@@ -1,6 +1,6 @@
{
"name": "@llamaindex/next-node-runtime-test",
"version": "0.0.15",
"version": "0.0.18",
"private": true,
"scripts": {
"dev": "next dev",
@@ -10,7 +10,7 @@
},
"dependencies": {
"llamaindex": "workspace:*",
"next": "14.2.4",
"next": "14.2.5",
"react": "18.3.1",
"react-dom": "18.3.1"
},
@@ -19,9 +19,9 @@
"@types/react": "^18.3.3",
"@types/react-dom": "^18.3.0",
"eslint": "^8.57.0",
"eslint-config-next": "14.2.3",
"eslint-config-next": "14.2.5",
"postcss": "^8",
"tailwindcss": "^3.4.4",
"typescript": "^5.5.2"
"typescript": "^5.5.3"
}
}
@@ -1,5 +1,27 @@
# @llamaindex/waku-query-engine-test
## 0.0.37
### Patch Changes
- Updated dependencies [9bbbc67]
- Updated dependencies [b3681bf]
- llamaindex@0.5.3
## 0.0.36
### Patch Changes
- llamaindex@0.5.2
## 0.0.35
### Patch Changes
- Updated dependencies [2774681]
- Updated dependencies [a0f424e]
- llamaindex@0.5.1
## 0.0.34
### Patch Changes
@@ -1,6 +1,6 @@
{
"name": "@llamaindex/waku-query-engine-test",
"version": "0.0.34",
"version": "0.0.37",
"type": "module",
"private": true,
"scripts": {
@@ -10,16 +10,16 @@
},
"dependencies": {
"llamaindex": "workspace:*",
"react": "19.0.0-canary-e3ebcd54b-20240405",
"react-dom": "19.0.0-canary-e3ebcd54b-20240405",
"react-server-dom-webpack": "19.0.0-canary-e3ebcd54b-20240405",
"waku": "0.20.1"
"react": "19.0.0-beta-e7d213dfb0-20240507",
"react-dom": "19.0.0-beta-e7d213dfb0-20240507",
"react-server-dom-webpack": "19.0.0-beta-e7d213dfb0-20240507",
"waku": "0.20.2"
},
"devDependencies": {
"@types/react": "18.3.1",
"@types/react": "18.3.3",
"@types/react-dom": "18.3.0",
"autoprefixer": "10.4.19",
"tailwindcss": "3.4.3",
"typescript": "5.4.5"
"tailwindcss": "3.4.4",
"typescript": "5.5.3"
}
}
+18 -15
@@ -1,6 +1,6 @@
{
"name": "llamaindex",
"version": "0.5.0",
"version": "0.5.3",
"license": "MIT",
"type": "module",
"keywords": [
@@ -20,18 +20,20 @@
"llamaindex"
],
"dependencies": {
"@anthropic-ai/sdk": "^0.21.1",
"@anthropic-ai/sdk": "0.21.1",
"@aws-crypto/sha256-js": "^5.2.0",
"@azure/identity": "^4.2.1",
"@datastax/astra-db-ts": "^1.2.1",
"@discordjs/rest": "^2.3.0",
"@google-cloud/vertexai": "^1.2.0",
"@google/generative-ai": "^0.12.0",
"@grpc/grpc-js": "^1.10.8",
"@google/generative-ai": "0.12.0",
"@grpc/grpc-js": "^1.10.11",
"@huggingface/inference": "^2.7.0",
"@llamaindex/cloud": "workspace:*",
"@llamaindex/core": "workspace:*",
"@llamaindex/env": "workspace:*",
"@mistralai/mistralai": "^0.4.0",
"@mistralai/mistralai": "^0.5.0",
"@mixedbread-ai/sdk": "^2.2.11",
"@pinecone-database/pinecone": "^2.2.2",
"@qdrant/js-client-rest": "^1.9.0",
"@types/lodash": "^4.17.4",
@@ -39,11 +41,12 @@
"@types/papaparse": "^5.3.14",
"@types/pg": "^8.11.6",
"@xenova/transformers": "^2.17.2",
"@zilliz/milvus2-sdk-node": "^2.4.2",
"@zilliz/milvus2-sdk-node": "^2.4.4",
"ajv": "^8.16.0",
"assemblyai": "^4.4.5",
"assemblyai": "^4.6.0",
"chromadb": "1.8.1",
"cohere-ai": "7.9.5",
"cohere-ai": "7.10.6",
"discord-api-types": "^0.37.92",
"groq-sdk": "^0.5.0",
"js-tiktoken": "^1.0.12",
"lodash": "^4.17.21",
@@ -52,16 +55,16 @@
"md-utils-ts": "^2.0.0",
"mongodb": "^6.7.0",
"notion-md-crawler": "^1.0.0",
"openai": "^4.52.0",
"openai": "^4.52.5",
"papaparse": "^5.4.1",
"pathe": "^1.1.2",
"pg": "^8.12.0",
"pgvector": "^0.1.8",
"portkey-ai": "^0.1.16",
"pgvector": "^0.2.0",
"portkey-ai": "0.1.16",
"rake-modified": "^1.0.8",
"string-strip-html": "^13.4.8",
"tiktoken": "^1.0.15",
"unpdf": "^0.10.1",
"unpdf": "^0.11.0",
"wikipedia": "^2.1.2",
"wink-nlp": "^2.3.0",
"zod": "^3.23.8"
@@ -76,11 +79,11 @@
},
"devDependencies": {
"@notionhq/client": "^2.2.15",
"@swc/cli": "^0.3.12",
"@swc/cli": "^0.4.0",
"@swc/core": "^1.6.3",
"concurrently": "^8.2.2",
"glob": "^10.4.2",
"typescript": "^5.5.2"
"glob": "^11.0.0",
"typescript": "^5.5.3"
},
"engines": {
"node": ">=18.0.0"
+2 -2
@@ -66,7 +66,7 @@ function reasonFormatter(reason: Reason): string | Promise<string> {
}
function extractJsonStr(text: string): string {
const pattern = /\{.*}/s;
const pattern = /\{.*\}/s;
const match = text.match(pattern);
if (!match) {
@@ -98,7 +98,7 @@ function extractToolUse(
inputText: string,
): [thought: string, action: string, input: string] {
const pattern =
/\s*Thought: (.*?)\nAction: ([a-zA-Z0-9_]+).*?\.*Input: .*?(\{.*?})/s;
/\s*Thought: (.*?)\nAction: ([a-zA-Z0-9_]+).*?\.*Input: .*?(\{.*?\})/s;
const match = inputText.match(pattern);
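To see what the corrected pattern in `extractJsonStr` matches, here is a minimal, self-contained sketch. The hunk only shows the pattern and the `match` call, so the error message and the return value below are assumptions rather than the library's actual code.

```ts
// Sketch of extractJsonStr using the corrected pattern from the hunk above.
function extractJsonStr(text: string): string {
  // Greedy match with the `s` (dotAll) flag: spans from the first "{" to the
  // last "}", including any newlines in between.
  const pattern = /\{.*\}/s;
  const match = text.match(pattern);
  if (!match) {
    // Assumed error handling; the real message is not visible in the diff.
    throw new Error(`Could not extract a JSON string from: ${text}`);
  }
  return match[0];
}

const output = 'Action Input: {"city": "Paris",\n "unit": {"temp": "C"}} done';
const json = extractJsonStr(output);
console.log(JSON.parse(json));
```

Because the match is greedy, nested objects are captured whole. The flip side is that trailing text containing another `}` would also be included, so the extracted string is best passed through `JSON.parse` inside a `try` block.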
@@ -0,0 +1,170 @@
import { getEnv } from "@llamaindex/env";
import { MixedbreadAI, MixedbreadAIClient } from "@mixedbread-ai/sdk";
import { BaseEmbedding, type EmbeddingInfo } from "./types.js";
type EmbeddingsRequestWithoutInput = Omit<
MixedbreadAI.EmbeddingsRequest,
"input"
>;
/**
* Interface extending EmbeddingsParams with additional
* parameters specific to the MixedbreadAIEmbeddings class.
*/
export interface MixedbreadAIEmbeddingsParams
extends Omit<EmbeddingsRequestWithoutInput, "model"> {
/**
* The model to use for generating embeddings.
* @default {"mixedbread-ai/mxbai-embed-large-v1"}
*/
model?: string;
/**
* The API key to use.
* @default {process.env.MXBAI_API_KEY}
*/
apiKey?: string;
/**
* The base URL for the API.
*/
baseUrl?: string;
/**
* The maximum number of documents to embed in a single request.
* @default {128}
*/
embedBatchSize?: number;
/**
* The embed info for the model.
*/
embedInfo?: EmbeddingInfo;
/**
* The maximum number of retries to attempt.
* @default {3}
*/
maxRetries?: number;
/**
* Timeouts for the request.
*/
timeoutInSeconds?: number;
}
/**
* Class for generating embeddings using the mixedbread ai API.
*
* This class leverages the model "mixedbread-ai/mxbai-embed-large-v1" to generate
* embeddings for text documents. The embeddings can be used for various NLP tasks
* such as similarity comparison, clustering, or as features in machine learning models.
*
* @example
* const mxbai = new MixedbreadAIEmbeddings({ apiKey: 'your-api-key' });
* const texts = ["Baking bread is fun", "I love baking"];
* const result = await mxbai.getTextEmbeddings(texts);
* console.log(result);
*
* @example
* const mxbai = new MixedbreadAIEmbeddings({
* apiKey: 'your-api-key',
* model: 'mixedbread-ai/mxbai-embed-large-v1',
* encodingFormat: MixedbreadAI.EncodingFormat.Binary,
* dimensions: 512,
* normalized: true,
* });
* const query = "Represent this sentence for searching relevant passages: Is baking bread fun?";
* const result = await mxbai.getTextEmbedding(query);
* console.log(result);
*/
export class MixedbreadAIEmbeddings extends BaseEmbedding {
requestParams: EmbeddingsRequestWithoutInput;
requestOptions: MixedbreadAIClient.RequestOptions;
private client: MixedbreadAIClient;
/**
* Constructor for MixedbreadAIEmbeddings.
* @param {Partial<MixedbreadAIEmbeddingsParams>} params - An optional object with properties to configure the instance.
* @throws {Error} If the API key is not provided or found in the environment variables.
* @throws {Error} If the batch size exceeds 256.
*/
constructor(params?: Partial<MixedbreadAIEmbeddingsParams>) {
super();
const apiKey = params?.apiKey ?? getEnv("MXBAI_API_KEY");
if (!apiKey) {
throw new Error(
"mixedbread ai API key not found. Either provide it in the constructor or set the 'MXBAI_API_KEY' environment variable.",
);
}
if (params?.embedBatchSize && params?.embedBatchSize > 256) {
throw new Error(
"The maximum batch size for mixedbread ai embeddings API is 256.",
);
}
this.embedBatchSize = params?.embedBatchSize ?? 128;
this.embedInfo = params?.embedInfo;
this.requestParams = {
model: params?.model ?? "mixedbread-ai/mxbai-embed-large-v1",
normalized: params?.normalized,
dimensions: params?.dimensions,
encodingFormat: params?.encodingFormat,
truncationStrategy: params?.truncationStrategy,
prompt: params?.prompt,
};
this.requestOptions = {
timeoutInSeconds: params?.timeoutInSeconds,
maxRetries: params?.maxRetries ?? 3,
// Support for this already exists in the python sdk and will be added to the js sdk soon
// @ts-ignore
additionalHeaders: {
"user-agent": "@mixedbread-ai/llamaindex-ts-sdk",
},
};
this.client = new MixedbreadAIClient({
apiKey,
environment: params?.baseUrl,
});
}
/**
* Generates an embedding for a single text.
* @param {string} text - A string to generate an embedding for.
* @returns {Promise<number[]>} A Promise that resolves to an array of numbers representing the embedding.
*
* @example
* const query = "Represent this sentence for searching relevant passages: Is baking bread fun?";
* const result = await mxbai.getTextEmbedding(query);
* console.log(result);
*/
async getTextEmbedding(text: string): Promise<number[]> {
return (await this.getTextEmbeddings([text]))[0];
}
/**
* Generates embeddings for an array of texts.
* @param {string[]} texts - An array of strings to generate embeddings for.
* @returns {Promise<Array<number[]>>} A Promise that resolves to an array of embeddings.
*
* @example
* const texts = ["Baking bread is fun", "I love baking"];
* const result = await mxbai.getTextEmbeddings(texts);
* console.log(result);
*/
async getTextEmbeddings(texts: string[]): Promise<Array<number[]>> {
if (texts.length === 0) {
return [];
}
const response = await this.client.embeddings(
{
...this.requestParams,
input: texts,
},
this.requestOptions,
);
return response.data.map((d) => d.embedding as number[]);
}
}
@@ -4,6 +4,7 @@ export * from "./GeminiEmbedding.js";
export { HuggingFaceInferenceAPIEmbedding } from "./HuggingFaceEmbedding.js";
export * from "./JinaAIEmbedding.js";
export * from "./MistralAIEmbedding.js";
export * from "./MixedbreadAIEmbeddings.js";
export * from "./MultiModalEmbedding.js";
export { OllamaEmbedding } from "./OllamaEmbedding.js";
export * from "./OpenAIEmbedding.js";
@@ -0,0 +1,203 @@
import { getEnv } from "@llamaindex/env";
import { MixedbreadAI, MixedbreadAIClient } from "@mixedbread-ai/sdk";
import { MetadataMode } from "@llamaindex/core/schema";
import { extractText } from "@llamaindex/core/utils";
import type { MessageContent } from "@llamaindex/core/llms";
import type { BaseNode, NodeWithScore } from "@llamaindex/core/schema";
import type { BaseNodePostprocessor } from "../types.js";
type RerankingRequestWithoutInput = Omit<
MixedbreadAI.RerankingRequest,
"query" | "input"
>;
/**
* Interface extending RerankingRequestWithoutInput (minus `model`) with additional
* parameters specific to the MixedbreadAIReranker class.
*/
export interface MixedbreadAIRerankerParams
extends Omit<RerankingRequestWithoutInput, "model"> {
/**
* The model to use for reranking. For example "default" or "mixedbread-ai/mxbai-rerank-large-v1".
* @default {"default"}
*/
model?: string;
/**
* The API key to use.
* @default {process.env.MXBAI_API_KEY}
*/
apiKey?: string;
/**
* The base URL of the MixedbreadAI API.
*/
baseUrl?: string;
/**
* The maximum number of retries to attempt.
* @default {3}
*/
maxRetries?: number;
/**
* Timeout for the request, in seconds.
*/
timeoutInSeconds?: number;
}
/**
* Node postprocessor that uses MixedbreadAI's rerank API.
*
* This class utilizes MixedbreadAI's rerank model to reorder a set of nodes based on their relevance
* to a given query. The reranked nodes are then used for various applications like search results refinement.
*
* @example
* const reranker = new MixedbreadAIReranker({ apiKey: 'your-api-key' });
* const nodes = [{ node: new TextNode({ text: 'To bake bread you need flour' }) }, { node: new TextNode({ text: 'To bake bread you need yeast' }) }];
* const query = "What do you need to bake bread?";
* const result = await reranker.postprocessNodes(nodes, query);
* console.log(result);
*
* @example
* const reranker = new MixedbreadAIReranker({
* apiKey: 'your-api-key',
* model: 'mixedbread-ai/mxbai-rerank-large-v1',
* topK: 5,
* rankFields: ["title", "content"],
* returnInput: true,
* maxRetries: 5
* });
* const documents = [{ title: "Bread Recipe", content: "To bake bread you need flour" }, { title: "Bread Recipe", content: "To bake bread you need yeast" }];
* const query = "What do you need to bake bread?";
* const result = await reranker.rerank(documents, query);
* console.log(result);
*/
export class MixedbreadAIReranker implements BaseNodePostprocessor {
requestParams: RerankingRequestWithoutInput;
requestOptions: MixedbreadAIClient.RequestOptions;
private readonly client: MixedbreadAIClient;
/**
* Constructor for MixedbreadAIReranker.
* @param {Partial<MixedbreadAIRerankerParams>} params - An object with optional properties to configure the instance.
* @throws {Error} If the API key is not provided or found in the environment variables.
*/
constructor(params: Partial<MixedbreadAIRerankerParams>) {
const apiKey = params?.apiKey ?? getEnv("MXBAI_API_KEY");
if (!apiKey) {
throw new Error(
"MixedbreadAI API key not found. Either provide it in the constructor or set the 'MXBAI_API_KEY' environment variable.",
);
}
this.requestOptions = {
maxRetries: params?.maxRetries ?? 3,
timeoutInSeconds: params?.timeoutInSeconds,
// Support for this already exists in the python sdk and will be added to the js sdk soon
// @ts-ignore
additionalHeaders: {
"user-agent": "@mixedbread-ai/llamaindex-ts-sdk",
},
};
this.client = new MixedbreadAIClient({
apiKey: apiKey,
environment: params?.baseUrl,
});
this.requestParams = {
model: params?.model ?? "default",
returnInput: params?.returnInput ?? false,
topK: params?.topK,
rankFields: params?.rankFields,
};
}
/**
* Reranks the nodes using the mixedbread.ai API.
* @param {NodeWithScore[]} nodes - Array of nodes with scores.
* @param {MessageContent} [query] - Query string.
* @throws {Error} If query is undefined.
*
* @returns {Promise<NodeWithScore[]>} A Promise that resolves to an ordered list of nodes with relevance scores.
*
* @example
* const nodes = [{ node: new TextNode({ text: 'To bake bread you need flour' }) }, { node: new TextNode({ text: 'To bake bread you need yeast' }) }];
* const query = "What do you need to bake bread?";
* const result = await reranker.postprocessNodes(nodes, query);
* console.log(result);
*/
async postprocessNodes(
nodes: NodeWithScore[],
query?: MessageContent,
): Promise<NodeWithScore[]> {
if (query === undefined) {
throw new Error("MixedbreadAIReranker requires a query");
}
if (nodes.length === 0) {
return [];
}
const input = nodes.map((n) => n.node.getContent(MetadataMode.ALL));
const result = await this.client.reranking(
{
query: extractText(query),
input,
...this.requestParams,
},
this.requestOptions,
);
const newNodes: NodeWithScore[] = [];
for (const document of result.data) {
const node = nodes[document.index];
node.score = document.score;
newNodes.push(node);
}
return newNodes;
}
/**
* Returns an ordered list of documents sorted by their relevance to the provided query.
* @param {(Array<string> | Array<BaseNode> | Array<Record<string, unknown>>)} nodes - A list of documents as strings, BaseNode instances, or plain objects.
* @param {string} query - The query to use for reranking the documents.
* @param {RerankingRequestWithoutInput} [options] - Optional parameters for reranking.
*
* @returns {Promise<Array<MixedbreadAI.RankedDocument>>} A Promise that resolves to an ordered list of documents with relevance scores.
*
* @example
* const nodes = ["To bake bread you need flour", "To bake bread you need yeast"];
* const query = "What do you need to bake bread?";
* const result = await reranker.rerank(nodes, query);
* console.log(result);
*/
async rerank(
nodes: Array<string> | Array<BaseNode> | Array<Record<string, unknown>>,
query: string,
options?: RerankingRequestWithoutInput,
): Promise<Array<MixedbreadAI.RankedDocument>> {
if (nodes.length === 0) {
return [];
}
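// BaseNode has no `node` property (that belongs to NodeWithScore), so detect nodes via getContent.
// Strings and plain objects are passed through unchanged.
const input =
typeof nodes[0] === "object" && "getContent" in nodes[0]
? (nodes as BaseNode[]).map((n) => n.getContent(MetadataMode.ALL))
: (nodes as string[]);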
const result = await this.client.reranking(
{
query,
input,
...this.requestParams,
...options,
},
this.requestOptions,
);
return result.data;
}
}
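The loop in `postprocessNodes` reorders the input by looking up each returned `index` in the original array and attaching the new score. A self-contained sketch of that mapping follows. `ScoredNode`, `RankedResult`, and `applyRanking` are hypothetical stand-ins and not the llamaindex or SDK types. Unlike the code above, this sketch copies each node instead of mutating it and skips indices that fall outside the input.

```typescript
// Local stand-ins for illustration only; not the llamaindex or SDK types.
interface ScoredNode {
  text: string;
  score?: number;
}

interface RankedResult {
  index: number; // position of the document in the original input array
  score: number; // relevance score assigned by the reranker
}

// Reorder nodes to follow the ranking and attach the new scores.
function applyRanking(
  nodes: ScoredNode[],
  ranked: RankedResult[],
): ScoredNode[] {
  const reordered: ScoredNode[] = [];
  for (const result of ranked) {
    const node = nodes[result.index];
    if (!node) continue; // skip out-of-range indices
    reordered.push({ ...node, score: result.score });
  }
  return reordered;
}

// Invented ranking: only two of the three inputs come back, as with topK = 2.
const rerankedNodes = applyRanking(
  [{ text: "flour" }, { text: "yeast" }, { text: "water" }],
  [
    { index: 2, score: 0.9 },
    { index: 0, score: 0.4 },
  ],
);
console.log(rerankedNodes);
```

Inputs that the ranking does not mention are dropped from the result, which matches how a `topK` limit would shorten the returned list.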
@@ -1,2 +1,3 @@
export * from "./CohereRerank.js";
export * from "./JinaAIReranker.js";
export * from "./MixedbreadAIReranker.js";
@@ -0,0 +1,137 @@
import { REST, type RESTOptions } from "@discordjs/rest";
import { Document } from "@llamaindex/core/schema";
import { getEnv } from "@llamaindex/env";
import { Routes, type APIEmbed, type APIMessage } from "discord-api-types/v10";
/**
* Represents a reader for Discord messages using @discordjs/rest
* See https://github.com/discordjs/discord.js/tree/main/packages/rest
*/
export class DiscordReader {
private client: REST;
constructor(
discordToken?: string,
requestHandler?: RESTOptions["makeRequest"],
) {
const token = discordToken ?? getEnv("DISCORD_TOKEN");
if (!token) {
throw new Error(
"Must specify `discordToken` or set environment variable `DISCORD_TOKEN`.",
);
}
const restOptions: Partial<RESTOptions> = { version: "10" };
// Use the provided request handler if specified
if (requestHandler) {
restOptions.makeRequest = requestHandler;
}
this.client = new REST(restOptions).setToken(token);
}
// Read all messages in a channel given a channel ID
private async readChannel(
channelId: string,
limit?: number,
additionalInfo?: boolean,
oldestFirst?: boolean,
): Promise<Document[]> {
const params = new URLSearchParams();
if (limit) params.append("limit", limit.toString());
if (oldestFirst) params.append("after", "0");
try {
const endpoint =
`${Routes.channelMessages(channelId)}?${params}` as `/channels/${string}/messages`;
const messages = (await this.client.get(endpoint)) as APIMessage[];
return messages.map((msg) =>
this.createDocumentFromMessage(msg, additionalInfo),
);
} catch (err) {
console.error(err);
return [];
}
}
private createDocumentFromMessage(
msg: APIMessage,
additionalInfo?: boolean,
): Document {
let content = msg.content || "";
// Include information from embedded messages
if (additionalInfo && msg.embeds.length > 0) {
content +=
"\n" + msg.embeds.map((embed) => this.embedToString(embed)).join("\n");
}
// Include URL from attachments
if (additionalInfo && msg.attachments.length > 0) {
content +=
"\n" +
msg.attachments
.map((attachment) => `Attachment: ${attachment.url}`)
.join("\n");
}
return new Document({
text: content,
id_: msg.id,
metadata: {
messageId: msg.id,
username: msg.author.username,
createdAt: new Date(msg.timestamp).toISOString(),
editedAt: msg.edited_timestamp
? new Date(msg.edited_timestamp).toISOString()
: undefined,
},
});
}
// Create a string representation of an embedded message
private embedToString(embed: APIEmbed): string {
let result = "***Embedded Message***\n";
if (embed.title) result += `**${embed.title}**\n`;
if (embed.description) result += `${embed.description}\n`;
if (embed.url) result += `${embed.url}\n`;
if (embed.fields) {
result += embed.fields
.map((field) => `**${field.name}**: ${field.value}`)
.join("\n");
}
return result.trim();
}
/**
* Loads messages from multiple Discord channels and returns an array of Document objects.
*
* @param {string[]} channelIds - An array of channel IDs from which to load data.
* @param {number} [limit] - An optional limit on the number of messages to load per channel.
* @param {boolean} [additionalInfo] - An optional flag to include content from embedded messages and attachment URLs as text.
* @param {boolean} [oldestFirst] - An optional flag to load oldest messages first.
* @return {Promise<Document[]>} A promise that resolves to an array of loaded documents.
*/
async loadData(
channelIds: string[],
limit?: number,
additionalInfo?: boolean,
oldestFirst?: boolean,
): Promise<Document[]> {
let results: Document[] = [];
for (const channelId of channelIds) {
if (typeof channelId !== "string") {
throw new Error(`Channel id ${channelId} must be a string.`);
}
const channelDocuments = await this.readChannel(
channelId,
limit,
additionalInfo,
oldestFirst,
);
results = results.concat(channelDocuments);
}
return results;
}
}
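`readChannel` builds its query string from the optional `limit` and `oldestFirst` arguments, using `after=0` to request messages starting from the oldest. The sketch below isolates that step. `buildMessagesQuery` is a hypothetical helper written for illustration and is not part of the reader.

```typescript
// Hypothetical helper mirroring the query construction in readChannel.
function buildMessagesQuery(limit?: number, oldestFirst?: boolean): string {
  const params = new URLSearchParams();
  // A limit of 0 or undefined is omitted, as in the reader above.
  if (limit) params.append("limit", limit.toString());
  // after=0 asks for messages after snowflake ID 0, i.e. from the start.
  if (oldestFirst) params.append("after", "0");
  return params.toString();
}

console.log(buildMessagesQuery(50, true));
console.log(buildMessagesQuery());
```

When both arguments are omitted the query is empty, so the endpoint built in `readChannel` ends with a bare `?`.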
+1
@@ -1,5 +1,6 @@
export * from "./AssemblyAIReader.js";
export * from "./CSVReader.js";
export * from "./DiscordReader.js";
export * from "./DocxReader.js";
export * from "./HTMLReader.js";
export * from "./ImageReader.js";
+1 -1
@@ -8,6 +8,6 @@
},
"devDependencies": {
"llamaindex": "workspace:*",
"vitest": "^1.6.0"
"vitest": "^2.0.2"
}
}
@@ -1,95 +1,36 @@
Sample PDF
This is a simple PDF file. Fun fun fun.
Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Phasellus facilisis odio sed mi.
Curabitur suscipit. Nullam vel nisi. Etiam semper ipsum ut lectus. Proin aliquam, erat eget
pharetra commodo, eros mi condimentum quam, sed commodo justo quam ut velit.
Integer
a
erat.
Cras
laoreet
ligula
cursus
enim.
Aenean
scelerisque
velit
et
tellus.
Integer a erat. Cras laoreet ligula cursus enim. Aenean scelerisque velit et tellus.
Vestibulum dictum aliquet sem. Nulla facilisi. Vestibulum accumsan ante vitae elit. Nulla
erat dolor, blandit in, rutrum quis, semper pulvinar, enim. Nullam varius congue risus.
Vivamus sollicitudin, metus ut interdum eleifend, nisi tellus pellentesque elit, tristique
accumsan eros quam et risus. Suspendisse libero odio, mattis sit amet, aliquet eget,
hendrerit vel, nulla. Sed vitae augue. Aliquam erat volutpat. Aliquam feugiat vulputate nisl.
Suspendisse quis nulla pretium ante pretium mollis. Proin velit ligula, sagittis at, egestas a,
pulvinar quis, nisl.
Pellentesque sit amet lectus. Praesent pulvinar, nunc quis iaculis sagittis, justo quam
lobortis tortor, sed vestibulum dui metus venenatis est. Nunc cursus ligula. Nulla facilisi.
Phasellus ullamcorper consectetuer ante. Duis tincidunt, urna id condimentum luctus, nibh
ante vulputate sapien, id sagittis massa orci ut enim. Pellentesque vestibulum convallis
sem. Nulla consequat quam ut nisl. Nullam est. Curabitur tincidunt dapibus lorem. Proin
velit turpis, scelerisque sit amet, iaculis nec, rhoncus ac, ipsum. Phasellus lorem arcu,
feugiat eu, gravida eu, consequat molestie, ipsum. Nullam vel est ut ipsum volutpat
feugiat. Aenean pellentesque.
In mauris. Pellentesque dui nisi, iaculis eu, rhoncus in, venenatis ac, ante. Ut odio justo,
scelerisque vel, facilisis non, commodo a, pede. Cras nec massa sit amet tortor volutpat
varius. Donec lacinia, neque a luctus aliquet, pede massa imperdiet ante, at varius lorem
pede sed sapien. Fusce erat nibh, aliquet in, eleifend eget, commodo eget, erat. Fusce
consectetuer. Cras risus tortor, porttitor nec, tristique sed, convallis semper, eros. Fusce
vulputate ipsum a mauris. Phasellus mollis. Curabitur sed urna. Aliquam nec sapien non
nibh pulvinar convallis. Vivamus facilisis augue quis quam. Proin cursus aliquet metus.
Suspendisse lacinia. Nulla at tellus ac turpis eleifend scelerisque. Maecenas a pede vitae
enim commodo interdum. Donec odio. Sed sollicitudin dui vitae justo.
Morbi elit nunc, facilisis a, mollis a, molestie at, lectus. Suspendisse eget mauris eu tellus
molestie cursus. Duis ut magna at justo dignissim condimentum. Cum sociis natoque
penatibus et magnis dis parturient montes, nascetur ridiculus mus. Vivamus varius. Ut sit
amet diam suscipit mauris ornare aliquam. Sed varius. Duis arcu. Etiam tristique massa
eget dui. Phasellus congue. Aenean est erat, tincidunt eget, venenatis quis, commodo at,
quam.
@@ -20,7 +20,7 @@ describe("pdf reader", () => {
const documents = await reader.loadData(
"../../../examples/data/brk-2022.pdf",
);
expect(documents.length).toBe(140);
expect(documents.length).toBe(144);
});
test("manga.pdf", async () => {
const documents = await reader.loadData("../../../examples/data/manga.pdf");
+2 -2
@@ -8,10 +8,10 @@
"@types/node": "^20.12.11"
},
"devDependencies": {
"@swc/cli": "^0.3.12",
"@swc/cli": "^0.4.0",
"@swc/core": "^1.6.3",
"assemblyscript": "^0.27.27",
"typescript": "^5.5.2"
"typescript": "^5.5.3"
},
"engines": {
"node": ">=18.0.0"
+4776 -4957
File diff suppressed because it is too large