Compare commits

...

213 Commits

Author SHA1 Message Date
github-actions[bot] 5b4a53177e Release 0.11.29 (#2188)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>
2025-09-11 10:44:29 +08:00
Thuc Pham 5da1cda939 feat: support zod v4 & v3 (#2186)
Co-authored-by: Marcus Schiesser <mail@marcusschiesser.de>
2025-09-11 10:34:45 +08:00
Thuc Pham 1285e381bd feat: add ci-build script for size limit testing (#2194) 2025-09-10 18:09:47 +08:00
Neha Prasad 5d5cd44276 fix: anthropic temperature parameter not respecting value 0 (#2190)
Co-authored-by: Marcus Schiesser <marcus.schiesser@googlemail.com>
2025-09-10 11:45:12 +08:00
hunter ed37c645af chore: addition of apac claude 4 sonnet to aws records (#2189) 2025-09-10 11:44:57 +08:00
hunter c40adafecc chore: add latest google models (#2191) 2025-09-10 11:44:30 +08:00
dependabot[bot] 995b465205 chore(deps-dev): bump vite from 6.3.3 to 6.3.6 (#2193)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-10 10:46:55 +08:00
Jeremy B. Merrill 8929dcf1dd vectorStoreIndex has new option progressCallback (#2187)
Co-authored-by: Marcus Schiesser <marcus.schiesser@googlemail.com>
2025-09-05 10:37:22 +08:00
github-actions[bot] af0b79f1cd Release 0.11.28 (#2174)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>
2025-08-28 17:28:15 +08:00
Thuc Pham 1995b38660 chore: bump @llamaindex/workflow-core in @llamaindex/workflow package (#2181) 2025-08-27 17:30:09 +08:00
Raj Shrestha 001a5159cf chore: add minimal reasoning effort for gpt5 (#2177)
Co-authored-by: Raj Shrestha <raj.shrestha@carelon.com>
2025-08-27 11:52:58 +08:00
Zhanghao 9d7d2052e7 fix: fix the problem that the usage field in the streaming response was not handled correctly (#2173) 2025-08-24 12:33:14 +08:00
Orry fd90e25f0e Docs settings per request (#2166)
Co-authored-by: Marcus Schiesser <mail@marcusschiesser.de>
Co-authored-by: Marcus Schiesser <marcus.schiesser@googlemail.com>
2025-08-20 16:31:26 +08:00
github-actions[bot] 97c00d67c3 Release 0.11.27 (#2169)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>
2025-08-19 12:11:06 +08:00
Daniel 6ebd7c2f13 fix: bedrock complete using actual modelId (#2172) 2025-08-19 11:04:32 +08:00
Clelia (Astra) Bertelli 0267bb0e8e feat: add responseFormat to llm.exec (#2167) 2025-08-13 12:39:37 +08:00
Marcus Schiesser 7875ee91e6 chore: update chat-ui docs (#2168) 2025-08-13 12:26:22 +08:00
Orry e3405fca44 chore: point the local llm full example to the correct URL (#2162)
Co-authored-by: Marcus Schiesser <mail@marcusschiesser.de>
2025-08-08 14:56:35 +08:00
github-actions[bot] f3bc2b61e7 Release (#2164) 2025-08-07 15:18:42 -06:00
Logan 4c703767b7 Adding GPT-5 support (#2163) 2025-08-07 13:39:47 -06:00
github-actions[bot] a27648200d Release (#2161) 2025-08-07 13:39:20 -06:00
abdeliibrahim c93bb02002 #2159 Remove unneeded console logs from gemini stream (#2160)
Co-authored-by: Marcus Schiesser <mail@marcusschiesser.de>
2025-08-07 11:38:35 +08:00
github-actions[bot] e9ded4e65f Release (#2154)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>
2025-08-06 12:18:06 +08:00
Marcus Schiesser 47a6f5fe5a chore: bump ollama (#2156) 2025-08-06 12:11:17 +08:00
Marcus Schiesser b80f33e264 chore: add opus 4.1 and fix prompt caching (#2155) 2025-08-06 11:54:27 +08:00
Alex Yang b6409b6823 chore: bump openai (#2152)
Co-authored-by: Marcus Schiesser <mail@marcusschiesser.de>
2025-08-06 10:58:45 +08:00
github-actions[bot] db3f556cb4 Release 0.11.26 (#2149)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>
2025-08-05 12:00:17 +08:00
Marcus Schiesser 4b5179169b chore: add deprecation to readme (#2150) 2025-08-05 11:53:35 +08:00
abdeliibrahim 971d37ceba fix(deepseek): add 'as const' assertion to DEEPSEEK_MODELS for correct TypeScript inference (#2148)
Co-authored-by: Marcus Schiesser <marcus.schiesser@googlemail.com>
2025-08-05 10:30:13 +08:00
github-actions[bot] 3e0ffdc688 Release 0.11.25 (#2144)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-07-31 12:18:18 +08:00
Marcus Schiesser 049471bade chore: deprecate cloud packages (#2143) 2025-07-31 12:12:56 +08:00
github-actions[bot] 1e296ebe72 Release 0.11.24 (#2141)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>
2025-07-30 12:56:45 -04:00
Marcus Schiesser f9f1de9516 chore: use Logger for core (#2139) 2025-07-30 11:43:45 +08:00
Twisha Bansal f576812e7a docs: Using MCP Toolbox for Databases with LlamaIndex (#2138) 2025-07-30 11:19:34 +08:00
Adrian Lyjak c3bf3c7178 Adding support for page citations, and refactor the confidence into the field metadata (#2140) 2025-07-30 10:25:19 +08:00
github-actions[bot] 38487da65d Release 0.11.23 (#2136)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>
2025-07-28 14:07:23 +08:00
Marcus Schiesser f29799e385 feat: Add toolcall callbacks to agent workflows (#2137) 2025-07-24 15:37:14 +08:00
Marcus Schiesser 9bca30620b fix: docs build 2025-07-23 12:55:35 +08:00
Marcus Schiesser 7224c06409 feat: Add logger and callbacks to llm.exec (#2135) 2025-07-23 12:37:02 +08:00
github-actions[bot] 29c7cf0989 Release 0.11.22 (#2131)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-07-23 11:30:04 +08:00
Marcus Schiesser c65a2dc4a7 chore: Deprecate community package and link to AWS package (#2134) 2025-07-23 11:05:50 +08:00
Terence Sim f1c5079290 docs: updated bedrock import and supported models (#2129)
Co-authored-by: Terence Sim <40583743+InTheAxis@users.noreply.github.com>
2025-07-23 10:40:49 +08:00
Terence Sim 9ed31958a7 chore: add logger as param to AgentWorkflow constructor (#2130)
Co-authored-by: Terence Sim <40583743+InTheAxis@users.noreply.github.com>
Co-authored-by: Marcus Schiesser <marcus.schiesser@googlemail.com>
2025-07-22 16:35:28 +08:00
github-actions[bot] e4c7113614 Release 0.11.21 (#2128)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>
2025-07-22 12:23:58 +08:00
Thuc Pham 38da40bc98 feat: VectoryMemoryBlock (#2110)
Co-authored-by: Marcus Schiesser <mail@marcusschiesser.de>
2025-07-22 12:18:09 +08:00
Marcus Schiesser 4d50ca4d84 chore: add streamchat test (#2122) 2025-07-22 11:30:01 +08:00
github-actions[bot] 8b5253a297 Release (#2127) 2025-07-21 15:40:31 -06:00
Logan ea15e75c89 deployment docs nits (#2126) 2025-07-21 15:30:37 -06:00
github-actions[bot] 3be87d4670 Release 0.11.20 (#2121)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: himself65 <14026360+himself65@users.noreply.github.com>
2025-07-21 09:37:44 -07:00
Terence Sim 94da13db0d fix: azure openai streamchat empty delta throw TypeError (#2118)
Co-authored-by: Terence Sim <40583743+InTheAxis@users.noreply.github.com>
2025-07-21 09:16:09 -07:00
Terence Sim acd50ea99f chore: replaced console.log with logger type from @llamaindex/env (#2123)
Co-authored-by: Terence Sim <40583743+InTheAxis@users.noreply.github.com>
2025-07-21 09:14:06 -07:00
Adrian Lyjak 2967d57ac0 feat: default to _public agent data (#2117) 2025-07-21 09:07:15 -07:00
Thuc Pham a8ec08c682 fix: ensure correct message content in agent workflow (#2114)
Co-authored-by: Marcus Schiesser <mail@marcusschiesser.de>
2025-07-21 15:13:27 +08:00
Terence Sim 678b327051 feat: added apac bedrock models (#2119)
Co-authored-by: Marcus Schiesser <mail@marcusschiesser.de>
2025-07-21 12:13:37 +08:00
Jeremy B. Merrill 650eeb1df3 fix: GeminiEmbedding should send batches of max 100 (#2099)
Co-authored-by: Marcus Schiesser <marcus.schiesser@googlemail.com>
2025-07-21 12:12:42 +08:00
Laurie Voss 50f6747758 Instrumenting with Google Tag Manager (in addition to Google Analytics) (#2116) 2025-07-20 13:18:09 -07:00
github-actions[bot] 12414a6836 Release (#2113)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>
2025-07-18 13:54:38 +08:00
Marcus Schiesser 856dd8cca8 fix: assume new models are function call models (#2112) 2025-07-18 12:52:43 +08:00
Jerry Cheng d8f4f6a859 Update SupabaseVectorStore.ts to fix score calculating error (#2109)
Co-authored-by: Marcus Schiesser <marcus.schiesser@googlemail.com>
2025-07-18 12:48:47 +08:00
Logan f594d7034f revamp getting started flow and main index page (#2079)
Co-authored-by: Thuc Pham <51660321+thucpn@users.noreply.github.com>
Co-authored-by: Marcus Schiesser <mail@marcusschiesser.de>
Co-authored-by: thucpn <thucsh2@gmail.com>
2025-07-17 16:27:28 +08:00
github-actions[bot] c1c58feed2 Release 0.11.19 (#2105)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>
2025-07-17 15:44:22 +08:00
Marcus Schiesser 7ad3411766 feat: add llm.exec (#2078) 2025-07-17 15:36:56 +08:00
Neha Prasad a1fdb07b96 feat: multi-turn image generation support (#2106)
Co-authored-by: Marcus Schiesser <marcus.schiesser@googlemail.com>
2025-07-17 10:30:39 +08:00
Jeremy B. Merrill 5da5b3c89c feat: add progress callback to embeddings (#2098)
Co-authored-by: Marcus Schiesser <mail@marcusschiesser.de>
2025-07-16 13:49:49 +08:00
r3rer3 ddc0eafbaa feat(anthropic): stream partial tool calls (#2100) 2025-07-15 10:06:17 -07:00
github-actions[bot] 1782554488 Release 0.11.18 (#2103)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-07-14 15:53:20 -07:00
Adrian Lyjak a1b1598bc6 fix(cloud): add generic types into agent data responses (#2102)
Co-authored-by: Alex Yang <himself65@outlook.com>
2025-07-14 12:01:56 -07:00
Terry Zhao b02847ae91 fix(notion): resolve @notionhq/client dependency conflict (#2097) 2025-07-12 11:04:06 -07:00
Alex Yang 50acb4821e feat(cloud): use camelCase (#2096) 2025-07-12 10:59:46 -07:00
github-actions[bot] 47a5b94b0c Release 0.11.17 (#2095)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-07-11 21:57:02 -07:00
Alex Yang d2be868b93 feat(cloud): missing agent api (#2094) 2025-07-11 20:45:22 -07:00
github-actions[bot] 50d42c4129 Release 0.11.16 (#2093)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-07-11 20:13:37 -07:00
github-actions[bot] 848b97d4d0 Release 0.11.16 (#2092)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-07-11 18:19:17 -07:00
Alex Yang c5796b8d2d fix: only allow pnpm (#2091) 2025-07-11 18:17:47 -07:00
Alex Yang 579ca0cf60 chore: bump sdk version (#2090) 2025-07-11 18:10:15 -07:00
Alex Yang f7e670c8d9 fix: sdk type improvement (#2089) 2025-07-11 17:56:41 -07:00
Alex Yang 9ff971435c fix(cloud): agent sdk (#2088) 2025-07-11 17:41:25 -07:00
github-actions[bot] 7c9d0e24c4 Release (#2086)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>
2025-07-11 12:30:04 -07:00
NIEDASEN af3f86694b feat: add supportToolCall getter to DeepSeekLLM class (#2085)
Co-authored-by: Marcus Schiesser <marcus.schiesser@googlemail.com>
2025-07-11 16:11:22 +08:00
github-actions[bot] 5cce681f62 Release 0.11.15 (#2084)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-07-10 19:08:05 -07:00
Alex Yang 48b0d88941 chore: bump dev deps (#2082) 2025-07-10 19:00:37 -07:00
Alex Yang f18577263a fix(cloud): missing file (#2083) 2025-07-10 18:33:41 -07:00
github-actions[bot] 214e133e92 Release 0.11.14 (#2068)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: himself65 <14026360+himself65@users.noreply.github.com>
2025-07-10 17:10:02 -07:00
Alex Yang ae58862669 fix: missing agent entry (#2081) 2025-07-10 11:39:07 -07:00
Alex Yang 5a0ed1f990 feat: init agent api on cloud sdk (#2069) 2025-07-10 10:00:53 -07:00
Logan 36773a82b6 fix examples scripts (#2077) 2025-07-09 11:24:07 +08:00
Logan 891562d598 remove workspace from examples package.json (#2075) 2025-07-08 16:36:33 -07:00
Alex Yang 93852e15fd chore: bump zod (#2074) 2025-07-08 13:58:52 -07:00
Clelia (Astra) Bertelli e1320b08a8 fix: adding more details in the contribution guidelines about changesets (#2073) 2025-07-08 13:58:36 -07:00
Logan 8eeac3310f fix memory factory (#2066) 2025-07-08 10:01:19 +07:00
Logan 984a573068 docs: update contributing instructions (#2067) 2025-07-07 16:38:26 -07:00
github-actions[bot] f0160d9646 Release (#2065)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-07-07 12:15:33 -06:00
Logan 39758ab018 add title to root layout (#2064) 2025-07-07 12:06:13 -06:00
dependabot[bot] f631d4f7d6 chore(deps): bump next from 15.3.0 to 15.3.3 (#2063) 2025-07-07 12:40:42 +07:00
github-actions[bot] d68c2a4be8 Release 0.11.13 (#2060) 2025-07-07 11:24:21 +07:00
Alex Yang 47a7555c07 chore: bump sdk version (#2062) 2025-07-03 12:05:16 -07:00
Marcus Schiesser 363bfa778e chore: re-add lib folder from docs and rename it to libs (so pnpm clean doesn't delete it) 2025-07-03 11:03:05 +07:00
Jan Z 229cdeb0ff feat: add agent update to groq models (#2054)
Co-authored-by: Marcus Schiesser <marcus.schiesser@googlemail.com>
Co-authored-by: Marcus Schiesser <mail@marcusschiesser.de>
2025-07-01 22:53:47 -07:00
github-actions[bot] 7a2485cca2 Release 0.11.12 (#2050)
Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>
2025-07-02 11:41:55 +07:00
Marcus Schiesser 1329186a23 docs: clarify how to run docs 2025-07-02 11:33:48 +07:00
dependabot[bot] 5d6e7384f5 chore(deps-dev): bump @modelcontextprotocol/server-filesystem from 2025.3.28 to 2025.7.1 (#2055) 2025-07-02 11:26:18 +07:00
allen f2dfd305fb implement bm25 retriever (#2045)
Co-authored-by: Marcus Schiesser <mail@marcusschiesser.de>
2025-07-02 11:22:47 +07:00
Huu Le 3cd8a573df feat: update interpreter to always upload all files in the configured directory (#2057) 2025-07-02 10:57:04 +07:00
Laurie Voss 09c6077f6e Import path for llamaparsereader (#2056) 2025-07-01 16:51:25 -07:00
Logan 14cc65b4e3 add google analytics (#2053)
Co-authored-by: Alex Yang <himself65@outlook.com>
2025-07-01 11:18:14 -07:00
Marcus Schiesser c544d8f67c docs: review and update memory doc 2025-07-01 15:10:43 +07:00
Huu Le d578889e21 feat: new memory api (#2028)
Co-authored-by: Marcus Schiesser <mail@marcusschiesser.de>
2025-07-01 09:30:49 +07:00
Marcus Schiesser 9f745d1941 chore: revert to wrong opus change 2025-07-01 09:07:46 +07:00
Alex Yang f292e94dcd fix: change default claude model (#2052) 2025-06-30 15:19:40 -07:00
Marcus Schiesser 0fcc92f632 fix: sentence splitter must not trim whitespaces (#2046) 2025-06-30 17:32:04 +07:00
Marcus Schiesser 515a8b9111 fix: error logging for fromPersistPath (#2049) 2025-06-30 13:41:13 +07:00
github-actions[bot] 7e8efc6284 Release @llamaindex/tools@0.1.2 (#2048) 2025-06-30 11:40:54 +07:00
Wassim Chegham 0fcf65126d chore: export type MCPClientOptions (#2047)
Co-authored-by: Marcus Schiesser <marcus.schiesser@googlemail.com>
2025-06-28 10:55:07 +07:00
github-actions[bot] a50acf634c Release 0.11.11 (#2044)
Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>
2025-06-27 14:51:09 +07:00
Thuc Pham 7039e1a214 chore: migrate to @google/genai SDK (#2038)
Co-authored-by: Marcus Schiesser <mail@marcusschiesser.de>
2025-06-27 12:09:26 +07:00
github-actions[bot] 785d010cd3 Release 0.11.10 (#2037)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-06-26 14:29:33 +07:00
Marcus Schiesser b878032131 fix release step 2025-06-26 14:18:56 +07:00
Marcus Schiesser f7ec293a0f chore: Update workflow-core (#2042) 2025-06-26 14:03:03 +07:00
jerinthomascarmel 49a5e0a8cf feat(readers): add ExcelReader for parsing Excel files (run-llama#1959) (#2033)
Co-authored-by: Marcus Schiesser <mail@marcusschiesser.de>
Co-authored-by: leehuwuj <leehuwuj@gmail.com>
2025-06-26 11:15:19 +07:00
Logan 118924799a Rename llama-flow -> workflows in docs (#2040) 2025-06-25 15:52:04 -07:00
allen ec8f673dae support filter to supabase vector search (#2036) 2025-06-25 16:17:54 +07:00
github-actions[bot] 85039a5360 Release @llamaindex/tools@0.1.0 (#2034) 2025-06-24 12:32:24 +07:00
Marcus Schiesser d7305edb53 fix changesets 2025-06-24 12:26:09 +07:00
Huu Le 096bf2bda1 feat: Add support for StreamableHTTP MCP Client (#2032) 2025-06-24 11:40:34 +07:00
jerinthomascarmel c5846bd7dc feat(readers): add XMLReader for parsing XML files (#1846) (#2031)
Co-authored-by: Marcus Schiesser <marcus.schiesser@googlemail.com>
2025-06-24 10:46:32 +07:00
github-actions[bot] 97bbce6e13 Release 0.11.9 (#2023)
Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>
2025-06-20 12:28:01 +07:00
Marcus Schiesser 62699b7497 chore: improve performance of sentence splitter (#2030) 2025-06-20 12:16:24 +07:00
Broda Noel a89e187796 Add extraAbbreviations on sentence-splitter (#2029)
Co-authored-by: Marcus Schiesser <mail@marcusschiesser.de>
2025-06-20 11:27:06 +07:00
ANKIT VARSHNEY d8ac8d385d feat: add openai realtime api (#2006)
Co-authored-by: Marcus Schiesser <mail@marcusschiesser.de>
2025-06-20 10:22:04 +07:00
Marcus Schiesser a6cef9c6be chore: no core in examples (#2024) 2025-06-18 09:39:32 +07:00
Broda Noel c5b2691302 Add more Acronyms on SentenceSplitter (#2022)
Co-authored-by: Marcus Schiesser <marcus.schiesser@googlemail.com>
2025-06-17 10:43:36 +07:00
github-actions[bot] 8122c7245e Release 0.11.8 (#2018)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>
2025-06-12 16:20:58 +07:00
Huu Le 8a51c167f8 feat: use agent to handle a workflow step (#2014)
Co-authored-by: Marcus Schiesser <mail@marcusschiesser.de>
2025-06-12 16:06:13 +07:00
Marcus Schiesser 1b5af1402d fix: jsonToNode for image nodes (#2017) 2025-06-12 11:59:05 +07:00
github-actions[bot] fffe93fac8 Release 0.11.7 (#2013)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>
2025-06-12 10:34:24 +07:00
Marcus Schiesser dbd857f6b5 chore: add changeset 2025-06-11 16:20:32 +07:00
정물결 a4d394f727 fix: correct SimpleDirectoryReader import path (#2011) 2025-06-10 12:43:01 +07:00
Marcus Schiesser 3c857f4132 chore: move ajv to dev deps (#2012) 2025-06-10 12:20:54 +07:00
Thuc Pham 36cfb93eb2 feat: export snapshot apis from llama-flow (#2009) 2025-06-10 11:56:33 +07:00
github-actions[bot] ab4762f026 Release (#2005)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-06-06 14:45:39 +07:00
Peter Goldstein 56763dc57d Update to the latest Gemini 2.5 Pro Preview key (#2004) 2025-06-06 11:25:41 +07:00
github-actions[bot] 5375fdd704 Release (#2003)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>
2025-06-05 09:57:35 +07:00
Marcus Schiesser e7484efca5 feat: weaviate: Add metadata sanitization before adding node. Add err… (#2001) 2025-06-04 11:48:18 +07:00
Marcus Schiesser c958a1645a docs: update chat-ui (#2002) 2025-06-03 17:01:07 +07:00
github-actions[bot] 0140a257c4 Release 0.11.6 (#1999)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>
2025-06-02 18:03:31 +07:00
GhosT 40161fe8d2 chore: Bump @llama-flow/core package version (#1998)
Co-authored-by: Marcus Schiesser <marcus.schiesser@googlemail.com>
2025-06-02 17:28:47 +07:00
github-actions[bot] d883fe7351 Release @llamaindex/google@0.3.7 (#1994)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-05-31 14:04:14 +07:00
Parham Saidi 2bc6914784 fix: ignore empty parts for gemini which confuses agent (#1993) 2025-05-30 22:47:21 +07:00
github-actions[bot] 78fbec17a6 Release 0.11.5 (#1986)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>
2025-05-30 22:37:26 +07:00
Marcus Schiesser 8b10a2e880 docs: add chat-ui docs (#1992) 2025-05-30 16:56:47 +07:00
ANKIT VARSHNEY 534662368f fix(google): use api key provided by the user in the session store (#1989) 2025-05-30 11:53:54 +07:00
Marcus Schiesser b370bd59f1 docs: fix agent docs (#1988) 2025-05-29 11:38:11 +07:00
Huu Le 766054ba67 chore: remove log input to avoid confusing (#1987) 2025-05-28 17:40:03 +07:00
ANKIT VARSHNEY 71598f86d7 feat: add support for interrupted and other server content event in live api (#1980)
Co-authored-by: Marcus Schiesser <mail@marcusschiesser.de>
2025-05-28 15:18:56 +07:00
github-actions[bot] 677abe46d2 Release 0.11.4 (#1983)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: logan-markewich <22285038+logan-markewich@users.noreply.github.com>
2025-05-28 09:46:52 +07:00
Logan 1cc271ccae improve funcion call check in anthropic llm (#1985) 2025-05-27 13:36:42 -06:00
Marcus Schiesser c927457e2e chore: Use base64 for encoding files (#1965) 2025-05-27 17:20:07 +07:00
github-actions[bot] 17ae23560e Release @llamaindex/azure@0.1.18 (#1982)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-05-27 13:56:38 +07:00
yangqiao 0d9169e42d feat: Add vector index compression for AzureCosmosDBMongoDBVectorStore (#1981)
Co-authored-by: yangqiao <yangqiao@microsoft.com>
2025-05-27 13:49:46 +07:00
ANKIT VARSHNEY 3864c77ac3 Update supabase.mdx (#1979) 2025-05-27 13:46:18 +07:00
Marcus Schiesser a86f66cd2d feat: add claude.md files (#1977) 2025-05-26 16:49:45 +07:00
github-actions[bot] e5b25acc3d Release @llamaindex/qdrant@0.1.17 (#1976)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-05-26 11:27:15 +07:00
Marcus Schiesser ba35240b4c fix: missing payload (#1975) 2025-05-26 11:11:47 +07:00
github-actions[bot] 7384e4d273 Release @llamaindex/anthropic@0.3.9 (#1972)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-05-23 13:04:47 +07:00
Peter Goldstein ae75966721 Update Gemini model keys to reflect Google changes (#1968) 2025-05-23 11:22:55 +07:00
Peter Goldstein 5cdab12791 Add Claude Sonnet 4 and Claude Opus 4 models (#1969) 2025-05-23 11:10:50 +07:00
github-actions[bot] eaf2cb11a5 Release 0.11.3 (#1966)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-05-22 16:58:59 +07:00
Marcus Schiesser 3ae01a227e chore: remove repin (#1967) 2025-05-22 16:53:44 +07:00
Marcus Schiesser 76ff23dc48 fix: pRetry not working with CommonJS 2025-05-22 15:14:00 +07:00
github-actions[bot] ed497727b1 Release 0.11.2 (#1964)
Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>
2025-05-22 14:34:37 +07:00
Marcus Schiesser 59601dd3ab feat: Add support for builtin image generation tool 2025-05-22 13:12:23 +07:00
github-actions[bot] 8474ca970e Release 0.11.1 (#1961)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-05-20 22:18:57 -07:00
Alex Yang 3703f907d9 fix(parse): upload API (#1960) 2025-05-20 17:39:39 -07:00
github-actions[bot] f63b702bec Release 0.11.0 (#1950)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>
2025-05-19 12:23:04 +07:00
Marcus Schiesser ccde88fe0b docs: update azure docs (#1958) 2025-05-19 11:49:18 +07:00
ANKIT VARSHNEY b0cd5301bb remove openai from llamaindex package and remove default setting for llm and embedModel (#1809)
Co-authored-by: Marcus Schiesser <marcus.schiesser@googlemail.com>
Co-authored-by: Marcus Schiesser <mail@marcusschiesser.de>
2025-05-19 11:12:57 +07:00
Marcus Schiesser 3e66ddc10d chore: Move Azure models to azure package (#1888) 2025-05-16 15:50:12 +07:00
Marcus Schiesser c719b968f3 Fix: broken links in docs (#1956)
Co-authored-by: Andrew Kostka <apkostka@gmail.com>
2025-05-15 16:49:05 +07:00
Anubhav Rana c73c659c6d chore: qdrant version updates (#1913)
Co-authored-by: Marcus Schiesser <mail@marcusschiesser.de>
2025-05-15 12:30:24 +07:00
Marcus Schiesser 361a685012 chore: remove old workflows (#1951) 2025-05-15 10:29:47 +07:00
Marcus Schiesser 680b529e94 chore: remove requireContext from tools (#1949) 2025-05-14 16:38:44 +07:00
github-actions[bot] 389acbd307 Release 0.10.6 (#1942)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>
2025-05-13 17:16:55 +07:00
Marcus Schiesser 2e181be160 feat: add xai tools (#1948) 2025-05-13 17:10:57 +07:00
Marcus Schiesser 7a7ca604c5 feat: add xai support (#1947) 2025-05-13 16:48:53 +07:00
Marcus Schiesser c2fd4f9fc1 docs: add docs for concept (#1946) 2025-05-13 16:02:21 +07:00
GiftMungmeeprued 40f5f410c0 fix: enhance loadJson in LlamaParseReader to handle URL inputs correctly (#1936) 2025-05-13 10:10:04 +07:00
Anubhav Rana d671ed6d25 feat: qdrant search params (#1911) 2025-05-13 09:50:23 +07:00
Marcus Schiesser 76c9a80057 chore: make core peer dep (#1941) 2025-05-12 18:08:55 +07:00
operagxsasha 46a416517c docs: added a badge to the social network Twitter (#1943) 2025-05-12 18:05:08 +07:00
Tomer Igal 168d11fe51 feat: update agent input interface to support files (#1938)
Co-authored-by: Marcus Schiesser <marcus.schiesser@googlemail.com>
2025-05-12 17:21:46 +07:00
operagxsasha 3dfa5eb9ff docs: edited the link to the license badge (#1939) 2025-05-12 17:10:17 +07:00
Marcus Schiesser 9b20859dc5 docs: reorder examples (#1937) 2025-05-12 14:16:47 +07:00
Thuc Pham 93691793c5 feat: add E2E test for installing packages with npm (#1930) 2025-05-12 11:02:44 +07:00
Marcus Schiesser 3b231cf11c readd old sentence splitter for testing (#1926) 2025-05-10 09:01:22 +07:00
Marcus Schiesser 7073fca171 docs: LlamaParseReader how to use EU (#1931) 2025-05-09 16:45:20 +07:00
Marcus Schiesser 9145577bf5 docs: move live examples (#1928) 2025-05-09 15:02:33 +07:00
github-actions[bot] 4a18a2eb3d Release 0.10.5 (#1922)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: marcusschiesser <17126+marcusschiesser@users.noreply.github.com>
2025-05-09 14:30:39 +07:00
ANKIT VARSHNEY 206b491724 feat: Support for google live api (#1905) 2025-05-09 14:20:40 +07:00
Marcus Schiesser 9b2e25a184 fix: Use Uint8Array instead of Buffer for file type messages (works w… (#1921) 2025-05-08 13:19:59 +07:00
github-actions[bot] b29521bf6c Release @llamaindex/google@0.2.6 (#1918)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-05-07 16:31:58 +07:00
Marcus Schiesser 73e25787e7 feat: add gemini-2.5-pro-preview-05-06 (#1917) 2025-05-07 16:18:21 +07:00
Marcus Schiesser 3ce80540fe docs: add workflows documentation and update installation instruction… (#1916) 2025-05-07 15:22:08 +07:00
Marcus Schiesser dbc1ee3089 docs: update installation instructions for LlamaIndex to include Work… (#1915) 2025-05-07 12:31:48 +07:00
github-actions[bot] 3b45191228 Release 0.10.4 (#1901)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-05-07 11:11:11 +07:00
Marcus Schiesser aaf2f8b2db docs: fix docs for agents (#1914) 2025-05-07 11:03:08 +07:00
Marcus Schiesser 6ddf1c1b1f chore: fixes for workflows before release (#1908) 2025-05-07 09:29:09 +07:00
Marcus Schiesser a8717d5ece chore: ensure pinning workflow version (#1907) 2025-05-06 12:51:13 +07:00
Huu Le 7e8e4549f2 chore: update @llama-flow/core to version 0.4.1 and export stream api (#1906)
Co-authored-by: Marcus Schiesser <mail@marcusschiesser.de>
2025-05-05 16:06:44 +07:00
Alex Yang cc3fe92a22 docs: update llama-flow 2025-05-04 02:28:04 -07:00
Alex Yang 63ab0dba4e chore: drop node.js 18 support (#1904) 2025-05-02 11:51:18 -07:00
Alex Yang 2225ffd1d4 feat: bump llama cloud sdk (#1903) 2025-05-01 13:30:52 -07:00
Marcus Schiesser bc5334249b chore: migrate agentworkflows to llama-flow (#1895)
Co-authored-by: leehuwuj <leehuwuj@gmail.com>
2025-04-30 18:14:17 +07:00
Thuc Pham 41953a3ef9 fix: node10 module resolution fail in sub llamaindex packages (#1900) 2025-04-29 17:47:50 +07:00
781 changed files with 50690 additions and 15407 deletions
+29 -4
View File
@@ -23,7 +23,7 @@ jobs:
strategy:
fail-fast: false
matrix:
node-version: [18.x, 20.x, 22.x, 23.x]
node-version: [20.x, 22.x, 23.x]
name: E2E on Node.js ${{ matrix.node-version }}
runs-on: ubuntu-latest
steps:
@@ -53,7 +53,7 @@ jobs:
strategy:
fail-fast: false
matrix:
node-version: [18.x, 20.x, 22.x, 23.x]
node-version: [20.x, 22.x, 23.x]
name: Test on Node.js ${{ matrix.node-version }}
runs-on: ubuntu-latest
steps:
@@ -87,6 +87,31 @@ jobs:
run: pnpm run type-check
- name: Run Circular Dependency Check
run: pnpm run circular-check
e2e-npm:
runs-on: ubuntu-latest
name: Test using packages with npm
steps:
- uses: actions/checkout@v4
- uses: pnpm/action-setup@v4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version-file: ".nvmrc"
- name: Install dependencies
run: pnpm install
- name: Build packages
run: pnpm run build
- name: Pack packages
run: |
pnpm pack --pack-destination ${{ runner.temp }} -C packages/llamaindex
pnpm pack --pack-destination ${{ runner.temp }} -C packages/workflow
pnpm pack --pack-destination ${{ runner.temp }} -C packages/core
- name: Install packed packages
run: npm add ${{ runner.temp }}/*.tgz
working-directory: e2e/npm
- name: Run tests
run: npm test
working-directory: e2e/npm
e2e-llamaindex-examples:
strategy:
fail-fast: false
@@ -138,7 +163,7 @@ jobs:
github_token: ${{ secrets.GITHUB_TOKEN }}
directory: e2e/examples/vite-import-llamaindex
skip_step: "install"
build_script: build
build_script: ci-build
package_manager: pnpm
typecheck-examples:
@@ -179,7 +204,7 @@ jobs:
fi
done
- name: Install
run: npm add ${{ runner.temp }}/*.tgz
run: npm add ${{ runner.temp }}/*.tgz --legacy-peer-deps
working-directory: ${{ runner.temp }}/examples
- name: Run Type Check
run: npx tsc --project ./tsconfig.json
+1 -1
View File
@@ -1 +1 @@
20
22
+92
View File
@@ -0,0 +1,92 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Development Commands
This project uses pnpm as the package manager and Turbo for build orchestration:
- `pnpm install` - Install all dependencies
- `pnpm build` - Build all packages using Turbo
- `pnpm dev` - Start development mode for all packages
- `pnpm test` - Run all unit tests
- `pnpm e2e` - Run end-to-end tests
- `pnpm lint` - Run ESLint across all packages
- `pnpm type-check` - Run TypeScript type checking across workspace
- `pnpm format` - Check code formatting with Prettier
- `pnpm format:write` - Auto-fix formatting issues
- `pnpm circular-check` - Check for circular dependencies using madge
For individual package development:
- `turbo run build --filter="@llamaindex/core"` - Build specific package
- `turbo run test --filter="@llamaindex/core"` - Test specific package
- Navigate to specific package directory and run `pnpm test` for focused testing
- `pnpm clean` - Remove all build artifacts and node_modules across workspace
## Architecture Overview
LlamaIndex.TS is a TypeScript data framework for LLM applications organized as a pnpm monorepo with multiple runtime environment support (Node.js, Deno, Bun, Vercel Edge, Cloudflare Workers).
### Package Structure
**Core Packages:**
- `packages/core/` - Abstract base classes and interfaces for all runtime environments
- `packages/llamaindex/` - Main package that aggregates core functionality
- `packages/env/` - Environment-specific compatibility layers for different JS runtimes
**Provider Packages (`packages/providers/`):**
- LLM providers: `openai/`, `anthropic/`, `ollama/`, `google/`, `groq/`, etc.
- Vector stores: `storage/pinecone/`, `storage/chroma/`, `storage/qdrant/`, etc.
- Embeddings: Various embedding providers integrated within LLM packages
- Readers: `assemblyai/`, `discord/`, `notion/` for data ingestion
**Specialized Packages:**
- `packages/cloud/` - LlamaCloud integration for managed services
- `packages/tools/` - Function calling tools and utilities
- `packages/workflow/` - Agent workflow orchestration
- `packages/readers/` - File format readers (PDF, DOCX, etc.)
### Key Architectural Patterns
**Runtime Abstraction:** Core functionality is runtime-agnostic, with environment-specific implementations in separate entry points (`index.ts`, `index.edge.ts`, `index.workerd.ts`).
**Provider Pattern:** LLMs, embeddings, and vector stores implement common interfaces from `@llamaindex/core`, allowing easy swapping between providers.
**Modular Design:** Each provider is a separate package to minimize bundle size - users install only what they need.
**Data Flow:** Document → NodeParser → Embedding → VectorStore → Retriever → QueryEngine → Response
### Core Components
- **Agents and Workflows:** Abstractions for building agentic workflows and agents in `packages/workflow`
- **Chat Engines:** Conversational interfaces in `core/chat-engine/`
- **Query Engines:** Document querying with retrieval in `core/query-engine/`
- **Indices:** VectorStoreIndex, SummaryIndex, KeywordTable in `llamaindex/indices/`
- **Node Parsers:** Text splitting and chunking in `core/node-parser/`
- **Ingestion Pipeline:** Document processing workflows in `llamaindex/ingestion/`
- **Storage:** Chat stores, document stores, index stores, and KV stores in `core/storage/`
### Deprecated Components
- **Agents:** ReAct and function calling agents in `core/agent/` and `llamaindex/agent/`
### Testing Structure
- Unit tests in each package's `tests/` directory
- E2E tests in `e2e/` directory with runtime-specific examples
- Tests depend on build artifacts, so always run `pnpm build` before testing
### Multi-Runtime Support
The codebase supports multiple JavaScript runtimes through conditional exports and separate entry points. When making changes, consider compatibility across Node.js, Deno, Bun, and edge runtimes.
### Development Notes
- The project uses Husky for git hooks with lint-staged for pre-commit formatting and linting
- All packages use bunchee for building with dual CJS/ESM support
- Core package exports are organized as sub-modules (e.g., `@llamaindex/core/llms`, `@llamaindex/core/embeddings`)
- Always run `pnpm build` before running tests, as tests depend on build artifacts
+55 -2
View File
@@ -25,7 +25,7 @@ Make sure you have Node.js LTS (Long-term Support) installed. You can check your
```shell
node -v
# v20.x.x
# v22.x.x
```
### Use pnpm
@@ -38,6 +38,7 @@ npm install -g pnpm
```shell
pnpm install
pnpm install -g tsx
```
### Build the packages
@@ -48,6 +49,56 @@ To build all packages, run:
pnpm build
```
### Start Developing
You can launch the package in dev-mode by running:
```shell
pnpm dev
```
This will use turbo to run all packages in watch-mode. This means you can make changes and have them automatically built.
If you want to customize what packages are built/watched, you can run turbo directly and adjust the filter:
```shell
pnpm turbo run dev --filter="./packages/core" --concurrency=100
```
In another terminal, you can write and run any script needed to quickly test your changes. For example:
```typescript
import { createMemory, staticBlock } from "@llamaindex/core/memory";
// Create memory with predefined context
const memory = createMemory({
memoryBlocks: [
staticBlock({
content:
"The user is a software engineer who loves TypeScript and LlamaIndex.",
messageRole: "system",
}),
],
});
async function main() {
const result = await memory.getLLM();
console.log(result);
}
void main().catch(console.error);
```
And run it with:
```shell
pnpm exec tsx my_script.ts
```
This flow allows you to easily test your changes without having to build the entire project.
Once you are happy with your changes, be sure to add tests (and confirm existing tests are passing!).
### Run tests
#### Unit tests
@@ -92,7 +143,7 @@ Before sending a PR, make sure of the following:
3. If you have a new feature, add a new example in the `examples` folder.
4. You have a descriptive changeset for each PR:
### Changesets
### Bumping the versions of packages you've modified
We use [changesets](https://github.com/changesets/changesets) for managing versions and changelogs. To create a new
changeset, run in the root folder:
@@ -101,6 +152,8 @@ changeset, run in the root folder:
pnpm changeset
```
You will be prompted to choose what packages need their versions bumped, and what kind of bump (major, minor or patch) is needed. Once you carry out this operation, the bumping will be automatic after the PR is merged.
## Publishing (maintainers only)
The [Release Github Action](.github/workflows/release.yml) is automatically generating and updating a
+4 -15
View File
@@ -7,9 +7,10 @@
</h3>
[![NPM Version](https://img.shields.io/npm/v/llamaindex)](https://www.npmjs.com/package/llamaindex)
[![NPM License](https://img.shields.io/npm/l/llamaindex)](https://www.npmjs.com/package/llamaindex)
[![NPM License](https://img.shields.io/npm/l/llamaindex)](https://github.com/run-llama/LlamaIndexTS/blob/main/LICENSE)
[![NPM Downloads](https://img.shields.io/npm/dm/llamaindex)](https://www.npmjs.com/package/llamaindex)
[![Discord](https://img.shields.io/discord/1059199217496772688)](https://discord.com/invite/eN6D2HQ4aX)
[![Twitter](https://img.shields.io/twitter/follow/llama_index)](https://x.com/llama_index)
Use your own data with large language models (LLMs, OpenAI ChatGPT and others) in JS runtime environments with TypeScript support.
@@ -63,7 +64,7 @@ yarn add llamaindex
### Setup in Node.js, Deno, Bun, TypeScript...?
See our official document: <https://ts.llamaindex.ai/docs/llamaindex/getting_started/>
See our official document: https://ts.llamaindex.ai/docs/llamaindex/getting_started
### Adding provider packages
@@ -83,19 +84,7 @@ Check out our NextJS playground at https://llama-playground.vercel.app/. The sou
## Core concepts for getting started:
- [Document](/packages/llamaindex/src/Node.ts): A document represents a text file, PDF file or other contiguous piece of data.
- [Node](/packages/llamaindex/src/Node.ts): The basic data building block. Most commonly, these are parts of the document split into manageable pieces that are small enough to be fed into an embedding model and LLM.
- [Embedding](/packages/llamaindex/src/embeddings/OpenAIEmbedding.ts): Embeddings are sets of floating point numbers which represent the data in a Node. By comparing the similarity of embeddings, we can derive an understanding of the similarity of two pieces of data. One use case is to compare the embedding of a question with the embeddings of our Nodes to see which Nodes may contain the data needed to answer that question. Because the default service context is OpenAI, the default embedding is `OpenAIEmbedding`. If using different models, say through Ollama, use this [Embedding](/packages/llamaindex/src/embeddings/OllamaEmbedding.ts) (see all [here](/packages/llamaindex/src/embeddings)).
- [Indices](/packages/llamaindex/src/indices/): Indices store the Nodes and the embeddings of those nodes. QueryEngines retrieve Nodes from these Indices using embedding similarity.
- [QueryEngine](/packages/llamaindex/src/engines/query/RetrieverQueryEngine.ts): Query engines are what generate the query you put in and give you back the result. Query engines generally combine a pre-built prompt with selected Nodes from your Index to give the LLM the context it needs to answer your query. To build a query engine from your Index (recommended), use the [`asQueryEngine`](/packages/llamaindex/src/indices/BaseIndex.ts) method on your Index. See all query engines [here](/packages/llamaindex/src/engines/query).
- [ChatEngine](/packages/llamaindex/src/engines/chat/SimpleChatEngine.ts): A ChatEngine helps you build a chatbot that will interact with your Indices. See all chat engines [here](/packages/llamaindex/src/engines/chat).
- [SimplePrompt](/packages/llamaindex/src/Prompt.ts): A simple standardized function call definition that takes in inputs and formats them in a template literal. SimplePrompts can be specialized using currying and combined using other SimplePrompt functions.
See our documentation: https://ts.llamaindex.ai/docs/llamaindex/getting_started/concepts
## Contributing:
+433
View File
@@ -1,5 +1,438 @@
# @llamaindex/doc
## 0.2.54
### Patch Changes
- ed37c64: Addition of APAC_ANTHROPIC_CLAUDE_4_SONNET type/record in @llamaindex/aws for APAC support for claude 4 sonnet per issue 2184.
- Updated dependencies [8929dcf]
- Updated dependencies [5da1cda]
- llamaindex@0.11.29
- @llamaindex/core@0.6.21
- @llamaindex/workflow@1.1.23
- @llamaindex/openai@0.4.19
- @llamaindex/cloud@4.1.3
- @llamaindex/node-parser@2.0.21
- @llamaindex/readers@3.1.20
## 0.2.53
### Patch Changes
- Updated dependencies [1995b38]
- Updated dependencies [001a515]
- Updated dependencies [9d7d205]
- @llamaindex/workflow@1.1.22
- @llamaindex/openai@0.4.18
- llamaindex@0.11.28
## 0.2.52
### Patch Changes
- Updated dependencies [0267bb0]
- @llamaindex/core@0.6.20
- @llamaindex/cloud@4.1.2
- llamaindex@0.11.27
- @llamaindex/node-parser@2.0.20
- @llamaindex/openai@0.4.17
- @llamaindex/readers@3.1.19
- @llamaindex/workflow@1.1.21
## 0.2.51
### Patch Changes
- Updated dependencies [4c70376]
- @llamaindex/openai@0.4.16
## 0.2.50
### Patch Changes
- Updated dependencies [b6409b6]
- @llamaindex/openai@0.4.15
## 0.2.49
### Patch Changes
- Updated dependencies [4b51791]
- @llamaindex/cloud@4.1.1
- llamaindex@0.11.26
## 0.2.48
### Patch Changes
- Updated dependencies [049471b]
- Updated dependencies [049471b]
- @llamaindex/cloud@4.1.0
- llamaindex@0.11.25
## 0.2.47
### Patch Changes
- Updated dependencies [c3bf3c7]
- Updated dependencies [f9f1de9]
- @llamaindex/cloud@4.0.28
- @llamaindex/core@0.6.19
- llamaindex@0.11.24
- @llamaindex/node-parser@2.0.19
- @llamaindex/openai@0.4.14
- @llamaindex/readers@3.1.18
- @llamaindex/workflow@1.1.20
## 0.2.46
### Patch Changes
- Updated dependencies [f29799e]
- Updated dependencies [7224c06]
- @llamaindex/workflow@1.1.19
- @llamaindex/core@0.6.18
- llamaindex@0.11.23
- @llamaindex/cloud@4.0.27
- @llamaindex/node-parser@2.0.18
- @llamaindex/openai@0.4.13
- @llamaindex/readers@3.1.17
## 0.2.45
### Patch Changes
- Updated dependencies [9ed3195]
- @llamaindex/workflow@1.1.18
- llamaindex@0.11.22
## 0.2.44
### Patch Changes
- 38da40b: feat: VectoryMemoryBlock
- Updated dependencies [38da40b]
- @llamaindex/core@0.6.17
- @llamaindex/cloud@4.0.26
- llamaindex@0.11.21
- @llamaindex/node-parser@2.0.17
- @llamaindex/openai@0.4.12
- @llamaindex/readers@3.1.16
- @llamaindex/workflow@1.1.17
## 0.2.43
### Patch Changes
- ea15e75: Minor updates in deployment docs
## 0.2.42
### Patch Changes
- a8ec08c: fix: ensure correct message content in agent workflow
- Updated dependencies [a8ec08c]
- Updated dependencies [2967d57]
- @llamaindex/core@0.6.16
- @llamaindex/workflow@1.1.16
- @llamaindex/cloud@4.0.25
- llamaindex@0.11.20
- @llamaindex/node-parser@2.0.16
- @llamaindex/openai@0.4.11
- @llamaindex/readers@3.1.15
## 0.2.41
### Patch Changes
- Updated dependencies [856dd8c]
- @llamaindex/openai@0.4.10
## 0.2.40
### Patch Changes
- Updated dependencies [7ad3411]
- Updated dependencies [5da5b3c]
- Updated dependencies [a1fdb07]
- @llamaindex/core@0.6.15
- @llamaindex/workflow@1.1.15
- @llamaindex/openai@0.4.9
- @llamaindex/cloud@4.0.24
- llamaindex@0.11.19
- @llamaindex/node-parser@2.0.15
- @llamaindex/readers@3.1.14
## 0.2.39
### Patch Changes
- Updated dependencies [a1b1598]
- @llamaindex/cloud@4.0.23
- llamaindex@0.11.18
## 0.2.38
### Patch Changes
- Updated dependencies [d2be868]
- @llamaindex/cloud@4.0.22
- llamaindex@0.11.17
## 0.2.37
### Patch Changes
- Updated dependencies [579ca0c]
- @llamaindex/cloud@4.0.21
- llamaindex@0.11.16
## 0.2.36
### Patch Changes
- Updated dependencies [48b0d88]
- Updated dependencies [f185772]
- @llamaindex/cloud@4.0.20
- llamaindex@0.11.15
## 0.2.35
### Patch Changes
- Updated dependencies [5a0ed1f]
- Updated dependencies [5a0ed1f]
- Updated dependencies [8eeac33]
- @llamaindex/cloud@4.0.19
- @llamaindex/core@0.6.14
- llamaindex@0.11.14
- @llamaindex/node-parser@2.0.14
- @llamaindex/openai@0.4.8
- @llamaindex/readers@3.1.13
- @llamaindex/workflow@1.1.14
## 0.2.34
### Patch Changes
- 39758ab: Add title to homepage header
## 0.2.33
### Patch Changes
- Updated dependencies [47a7555]
- @llamaindex/cloud@4.0.18
- llamaindex@0.11.13
## 0.2.32
### Patch Changes
- Updated dependencies [d578889]
- Updated dependencies [0fcc92f]
- Updated dependencies [515a8b9]
- @llamaindex/core@0.6.13
- llamaindex@0.11.12
- @llamaindex/cloud@4.0.17
- @llamaindex/node-parser@2.0.13
- @llamaindex/openai@0.4.7
- @llamaindex/readers@3.1.12
- @llamaindex/workflow@1.1.13
## 0.2.31
### Patch Changes
- Updated dependencies [7039e1a]
- Updated dependencies [7039e1a]
- llamaindex@0.11.11
- @llamaindex/core@0.6.12
- @llamaindex/cloud@4.0.16
- @llamaindex/node-parser@2.0.12
- @llamaindex/openai@0.4.6
- @llamaindex/readers@3.1.11
- @llamaindex/workflow@1.1.12
## 0.2.30
### Patch Changes
- Updated dependencies [f7ec293]
- @llamaindex/workflow@1.1.11
- llamaindex@0.11.10
## 0.2.29
### Patch Changes
- Updated dependencies [c5846bd]
- @llamaindex/readers@3.1.10
## 0.2.28
### Patch Changes
- Updated dependencies [a89e187]
- Updated dependencies [62699b7]
- Updated dependencies [c5b2691]
- Updated dependencies [d8ac8d3]
- @llamaindex/core@0.6.11
- @llamaindex/openai@0.4.5
- @llamaindex/cloud@4.0.15
- llamaindex@0.11.9
- @llamaindex/node-parser@2.0.11
- @llamaindex/readers@3.1.9
- @llamaindex/workflow@1.1.10
## 0.2.27
### Patch Changes
- 8a51c16: Add natural language agent page
- Updated dependencies [8a51c16]
- Updated dependencies [1b5af14]
- @llamaindex/workflow@1.1.9
- @llamaindex/core@0.6.10
- llamaindex@0.11.8
- @llamaindex/cloud@4.0.14
- @llamaindex/node-parser@2.0.10
- @llamaindex/openai@0.4.4
- @llamaindex/readers@3.1.8
## 0.2.26
### Patch Changes
- a4d394f: fix: correct SimpleDirectoryReader import path in documentation example
- Updated dependencies [dbd857f]
- Updated dependencies [3c857f4]
- @llamaindex/workflow@1.1.8
- llamaindex@0.11.7
## 0.2.25
### Patch Changes
- Updated dependencies [40161fe]
- @llamaindex/workflow@1.1.7
- llamaindex@0.11.6
## 0.2.24
### Patch Changes
- Updated dependencies [766054b]
- Updated dependencies [71598f8]
- @llamaindex/workflow@1.1.6
- @llamaindex/core@0.6.9
- llamaindex@0.11.5
- @llamaindex/cloud@4.0.13
- @llamaindex/node-parser@2.0.9
- @llamaindex/openai@0.4.3
- @llamaindex/readers@3.1.7
## 0.2.23
### Patch Changes
- Updated dependencies [c927457]
- @llamaindex/openai@0.4.2
- @llamaindex/core@0.6.8
- @llamaindex/cloud@4.0.12
- llamaindex@0.11.4
- @llamaindex/node-parser@2.0.8
- @llamaindex/readers@3.1.6
- @llamaindex/workflow@1.1.5
## 0.2.22
### Patch Changes
- Updated dependencies [76ff23d]
- @llamaindex/cloud@4.0.11
- llamaindex@0.11.3
## 0.2.21
### Patch Changes
- Updated dependencies [59601dd]
- @llamaindex/openai@0.4.1
- @llamaindex/core@0.6.7
- @llamaindex/cloud@4.0.10
- llamaindex@0.11.2
- @llamaindex/node-parser@2.0.7
- @llamaindex/readers@3.1.5
- @llamaindex/workflow@1.1.4
## 0.2.20
### Patch Changes
- Updated dependencies [3703f90]
- @llamaindex/cloud@4.0.9
- llamaindex@0.11.1
## 0.2.19
### Patch Changes
- Updated dependencies [680b529]
- Updated dependencies [b0cd530]
- Updated dependencies [361a685]
- Updated dependencies [3e66ddc]
- @llamaindex/workflow@1.1.3
- @llamaindex/core@0.6.6
- llamaindex@0.11.0
- @llamaindex/openai@0.4.0
- @llamaindex/cloud@4.0.8
- @llamaindex/node-parser@2.0.6
- @llamaindex/readers@3.1.4
## 0.2.18
### Patch Changes
- d671ed6: Add functionality for search params when querying Qdrant vector store.
- Updated dependencies [76c9a80]
- Updated dependencies [168d11f]
- Updated dependencies [d671ed6]
- Updated dependencies [40f5f41]
- @llamaindex/openai@0.3.7
- @llamaindex/workflow@1.1.2
- @llamaindex/core@0.6.5
- @llamaindex/cloud@4.0.7
- llamaindex@0.10.6
- @llamaindex/node-parser@2.0.5
- @llamaindex/readers@3.1.3
## 0.2.17
### Patch Changes
- Updated dependencies [9b2e25a]
- @llamaindex/openai@0.3.6
- @llamaindex/core@0.6.4
- llamaindex@0.10.5
- @llamaindex/cloud@4.0.6
- @llamaindex/node-parser@2.0.4
- @llamaindex/readers@3.1.2
- @llamaindex/workflow@1.1.1
## 0.2.16
### Patch Changes
- Updated dependencies [7e8e454]
- Updated dependencies [2225ffd]
- Updated dependencies [6ddf1c1]
- Updated dependencies [bc53342]
- Updated dependencies [41953a3]
- @llamaindex/workflow@1.1.0
- @llamaindex/cloud@4.0.5
- llamaindex@0.10.4
## 0.2.15
### Patch Changes
+143
View File
@@ -0,0 +1,143 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with the LlamaIndex.TS documentation site.
## Application Overview
This is a Next.js documentation site (`@llamaindex/doc`) that serves as the official documentation for LlamaIndex.TS. It's built using Fumadocs, a modern documentation framework, and includes interactive features, API documentation generation, and AI-powered chat functionality.
## Development Commands
From this directory (`apps/next/`):
- `pnpm dev` - Start development server with Turbo
- `pnpm build` - Build the documentation site (includes `prebuild` step)
- `pnpm start` - Start production server
- `pnpm build:docs` - Generate API documentation from TypeScript source
- `pnpm validate-links` - Validate all internal and external links
Key build process:
1. `prebuild` runs `build:docs` to generate API documentation using TypeDoc
2. `build` runs Next.js build process
3. `postbuild` runs post-processing scripts and link validation
## Architecture
### Framework Stack
- **Next.js 15.3** - React framework with App Router
- **Fumadocs** - Documentation framework with MDX support
- **React Server Components** - AI chat functionality with server actions
- **Tailwind CSS** - Styling with custom design system
- **TypeScript** - Full type safety
### Key Dependencies
- **Fumadocs ecosystem**: `fumadocs-ui`, `fumadocs-mdx`, `fumadocs-core`, `fumadocs-openapi`
- **AI features**: `ai` package for React Server Components chat
- **Code features**: Monaco Editor, Shiki syntax highlighting, Twoslash TypeScript integration
- **UI components**: Radix UI primitives, Framer Motion animations
- **Content processing**: MDX, remark/rehype plugins, TypeDoc for API generation
### Directory Structure
**Content Management:**
- `src/content/docs/` - MDX documentation files organized by topic
- `src/content/docs/api/` - Auto-generated API documentation from TypeScript
- `scripts/` - Build-time documentation generation and validation
**Application Code:**
- `src/app/` - Next.js App Router pages and API routes
- `src/components/` - Reusable React components including UI library
- `src/lib/` - Utilities, constants, and configuration
**Configuration:**
- `source.config.ts` - Fumadocs MDX configuration with plugins
- `next.config.mjs` - Next.js configuration with MDX integration
- `tailwind.config.mjs` - Tailwind CSS customization
### Key Features
**Documentation Features:**
- MDX-based content with TypeScript code highlighting
- Auto-generated API documentation from TypeScript source
- Interactive code examples with Monaco Editor
- Math equation support with KaTeX
- Link validation and build-time checks
**Interactive Features:**
- AI-powered chat interface using React Server Components
- Code demos with live TypeScript execution
- Interactive UI components and animations
- Search functionality across all documentation
**Build Process:**
- TypeDoc generates API documentation from workspace packages
- Custom scripts transform and validate generated content
- Link checking ensures all internal/external links work
- Static site generation with 10-minute timeout for large documentation set
### Configuration Files
**source.config.ts**: Defines MDX processing pipeline with:
- Code highlighting themes (Catppuccin)
- Twoslash TypeScript integration
- Remark/rehype plugins for enhanced Markdown
- Content directories including external docs
**next.config.mjs**: Next.js configuration with:
- Extended static generation timeout (10 minutes)
- Monaco Editor transpilation
- Server external packages for build optimization
- Webpack/Turbopack aliases for browser compatibility
### Content Organization
**Documentation Structure:**
- `/docs/llamaindex/` - Core LlamaIndex.TS documentation
- `/docs/cloud/` - LlamaCloud integration guides
- `/docs/api/` - Auto-generated TypeScript API reference
**Content Sources:**
- Local MDX files in `src/content/docs/`
- External docs from `@llamaindex/workflow-docs` package
- Generated API docs from TypeScript source
### Development Notes
- Documentation content is sourced from multiple locations including external packages
- API documentation is regenerated on each build from TypeScript source
- The site uses advanced MDX features including custom transformers and plugins
- Build process includes comprehensive link validation
- Large memory allocation needed for TypeDoc generation (`--max-old-space-size=8192`)
- Chat functionality uses React Server Components with streaming responses
### AI Chat Integration
The documentation includes an AI chat feature that:
- Uses React Server Components for server-side AI processing
- Integrates with LlamaIndex.TS packages for demonstrations
- Provides interactive examples and code generation
- Streams responses for better user experience
### Content Authoring
When adding new documentation:
- Create MDX files in appropriate `src/content/docs/` subdirectories
- Follow existing content structure and frontmatter conventions
- Use Fumadocs MDX features like code blocks, callouts, and tabs
- API documentation is auto-generated - edit TypeScript source comments instead
- Run `pnpm validate-links` to check all links before publishing
+2
View File
@@ -3,6 +3,8 @@
This is a Next.js application generated with
[Create Fumadocs](https://github.com/fuma-nama/fumadocs).
> Note: Before running the development server, make sure to build the whole project first, see [CONTRIBUTING.md](../../CONTRIBUTING.md) for more details.
Run development server:
```bash
+2 -2
View File
@@ -12,9 +12,9 @@
},
"aliases": {
"components": "@/components",
"utils": "@/lib/utils",
"utils": "@/libs/utils",
"ui": "@/components/ui",
"lib": "@/lib",
"lib": "@/libs",
"hooks": "@/hooks"
}
}
+41
View File
@@ -15,6 +15,47 @@ const config = {
"twoslash",
"typescript",
],
async redirects() {
return [
{
source: "/docs/chat-ui/:path*.mdx",
destination: "/docs/chat-ui/:path*",
permanent: true,
},
{
source: "/docs/workflows/:path*.mdx",
destination: "/docs/workflows/:path*",
permanent: true,
},
{
source: "/docs/llamaindex/getting_started/installation/node.mdx",
destination:
"/docs/llamaindex/getting_started/installation/server-apis.mdx",
permanent: true,
},
{
source: "/docs/llamaindex/getting_started/installation/typescript.mdx",
destination: "/docs/llamaindex/getting_started/installation/index.mdx",
permanent: true,
},
{
source: "/docs/llamaindex/getting_started/installation/next.mdx",
destination: "/docs/llamaindex/getting_started/installation/nextjs.mdx",
permanent: true,
},
{
source: "/docs/llamaindex/getting_started/installation/vite.mdx",
destination: "/docs/llamaindex/getting_started/installation/index.mdx",
permanent: true,
},
{
source: "/docs/llamaindex/getting_started/installation/cloudflare.mdx",
destination:
"/docs/llamaindex/getting_started/installation/serverless.mdx",
permanent: true,
},
];
},
turbopack: {
resolveAlias: {
fs: { browser: "./fallback.js" },
+20 -19
View File
@@ -1,6 +1,6 @@
{
"name": "@llamaindex/doc",
"version": "0.2.15",
"version": "0.2.54",
"private": true,
"scripts": {
"postinstall": "fumadocs-mdx",
@@ -15,16 +15,17 @@
"dependencies": {
"@huggingface/transformers": "^3.5.0",
"@icons-pack/react-simple-icons": "^10.1.0",
"@llama-flow/docs": "0.0.5",
"@llamaindex/chat-ui": "0.2.0",
"@llamaindex/chat-ui-docs": "^0.1.0",
"@llamaindex/cloud": "workspace:*",
"@llamaindex/core": "workspace:*",
"@llamaindex/node-parser": "workspace:*",
"@llamaindex/openai": "workspace:*",
"@llamaindex/readers": "workspace:*",
"@llamaindex/workflow": "workspace:*",
"@llamaindex/workflow-docs": "0.1.1",
"@mdx-js/mdx": "^3.1.0",
"@monaco-editor/react": "^4.7.0",
"@next/third-parties": "^15.3.4",
"@number-flow/react": "^0.3.4",
"@radix-ui/react-dialog": "^1.1.2",
"@radix-ui/react-icons": "^1.3.2",
@@ -34,22 +35,22 @@
"@radix-ui/react-tooltip": "^1.1.4",
"@scalar/api-client-react": "^1.1.25",
"@vercel/functions": "^1.5.0",
"ai": "^3.4.33",
"ai": "^4.3.17",
"class-variance-authority": "^0.7.0",
"clsx": "2.1.1",
"foxact": "^0.2.41",
"framer-motion": "^11.11.17",
"fumadocs-core": "^15.2.7",
"fumadocs-core": "^15.5.0",
"fumadocs-docgen": "^2.0.0",
"fumadocs-mdx": "^11.6.0",
"fumadocs-openapi": "^8.0.1",
"fumadocs-twoslash": "^3.1.1",
"fumadocs-typescript": "^4.0.2",
"fumadocs-ui": "^15.2.7",
"fumadocs-mdx": "^11.6.6",
"fumadocs-openapi": "^9.0.5",
"fumadocs-twoslash": "^3.1.3",
"fumadocs-typescript": "^4.0.5",
"fumadocs-ui": "^15.5.0",
"hast-util-to-jsx-runtime": "^2.3.2",
"llamaindex": "workspace:*",
"lucide-react": "^0.460.0",
"next": "^15.3.0",
"next": "^15.3.3",
"next-themes": "^0.4.3",
"react": "^19.1.0",
"react-dom": "^19.1.0",
@@ -69,30 +70,30 @@
"twoslash": "^0.3.1",
"use-stick-to-bottom": "^1.0.42",
"web-tree-sitter": "^0.24.4",
"zod": "^3.23.8"
"zod": "^3.25.76"
},
"devDependencies": {
"@next/env": "^15.3.0",
"@tailwindcss/postcss": "^4.0.9",
"@types/mdx": "^2.0.13",
"@types/node": "22.9.0",
"@types/react": "^19.0.10",
"@types/react-dom": "^19.0.4",
"@types/node": "24.0.13",
"@types/react": "^19.1.8",
"@types/react-dom": "^19.1.6",
"autoprefixer": "^10.4.20",
"cross-env": "^7.0.3",
"fast-glob": "^3.3.2",
"gray-matter": "^4.0.3",
"postcss": "^8.5.3",
"postcss": "^8.5.6",
"raw-loader": "^4.0.2",
"remark": "^15.0.1",
"remark-gfm": "^4.0.0",
"remark-mdx": "^3.1.0",
"remark-stringify": "^11.0.0",
"tailwindcss": "^4.0.9",
"tsx": "^4.19.3",
"tailwindcss": "^4.1.11",
"tsx": "^4.20.3",
"typedoc": "0.28.3",
"typedoc-plugin-markdown": "^4.6.2",
"typedoc-plugin-merge-modules": " ^7.0.0",
"typescript": "^5.7.3"
"typescript": "^5.8.3"
}
}
Binary file not shown.

Before

Width:  |  Height:  |  Size: 540 KiB

After

Width:  |  Height:  |  Size: 206 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 154 KiB

+10 -7
View File
@@ -4,7 +4,6 @@ import matter from "gray-matter";
import path from "path";
const CONTENT_DIR = path.join(process.cwd(), "src/content/docs");
const BUILD_DIR = path.join(process.cwd(), ".next");
// Regular expression to find internal links
// This captures Markdown links [text](/docs/path) and href attributes href="/docs/path"
@@ -14,6 +13,8 @@ const INTERNAL_LINK_REGEX = /(?:(?:\]\(|\bhref=["'])\/docs\/([^")]+))/g;
// This captures relative links like [text](./path) or ![alt](../images/image.png)
const RELATIVE_LINK_REGEX = /(?:\]\()(?:\s*)(?:\.\.?)\//g;
const ALLOWED_LINKS = ["/docs/workflows", "/docs/chat-ui"];
interface LinkValidationResult {
file: string;
invalidLinks: Array<{ link: string; line: number }>;
@@ -28,7 +29,7 @@ interface RelativeLinkResult {
* Get all valid documentation routes from the content directory
*/
async function getValidRoutes(): Promise<Set<string>> {
const mdxFiles = await glob("**/*.mdx?", { cwd: CONTENT_DIR });
const mdxFiles = await glob("**/*.{md,mdx}", { cwd: CONTENT_DIR });
const routes = new Set<string>();
@@ -124,14 +125,11 @@ function findRelativeLinksInFile(
return relativeLinks;
}
/**
* Validate internal links in all MDX files
*/
/**
* Find relative links in all MDX files
*/
async function findRelativeLinks(): Promise<RelativeLinkResult[]> {
const mdxFiles = await glob("**/*.mdx?", { cwd: CONTENT_DIR });
const mdxFiles = await glob("**/*.mdx", { cwd: CONTENT_DIR });
const results: RelativeLinkResult[] = [];
for (const file of mdxFiles) {
@@ -150,7 +148,7 @@ async function findRelativeLinks(): Promise<RelativeLinkResult[]> {
}
async function validateLinks(): Promise<LinkValidationResult[]> {
const mdxFiles = await glob("**/*.mdx?", { cwd: CONTENT_DIR });
const mdxFiles = await glob("**/*.mdx", { cwd: CONTENT_DIR });
const validRoutes = await getValidRoutes();
const results: LinkValidationResult[] = [];
@@ -160,6 +158,11 @@ async function validateLinks(): Promise<LinkValidationResult[]> {
const links = extractLinksFromFile(filePath);
const invalidLinks = links.filter(({ link }) => {
// Check if the link is in the allowed list
if (ALLOWED_LINKS.includes(`/docs/${link}`)) {
return false;
}
// Check if the link exists in valid routes
// First normalize the link (remove any query string or hash)
const baseLink = link.split("?")[0].split("#")[0];
+10 -1
View File
@@ -9,7 +9,16 @@ import rehypeKatex from "rehype-katex";
import remarkMath from "remark-math";
export const docs = defineDocs({
dir: ["./src/content/docs", "./node_modules/@llama-flow/docs"],
dir: [
"./src/content/docs",
"./node_modules/@llamaindex/workflow-docs",
"./node_modules/@llamaindex/chat-ui-docs",
// NOTE: When adding external docs (like chat-ui or workflow-docs above),
// make sure to also update:
// 1. scripts/validate-links.mts - add to ALLOWED_LINKS array
// 2. next.config.mjs - add redirect for .mdx files
// 3. src/content/docs/meta.json - add to pages array
],
docs: {
async: true,
},
+6 -4
View File
@@ -10,7 +10,7 @@ import { MagicMove } from "@/components/magic-move";
import { NpmInstall } from "@/components/npm-install";
import { Supports } from "@/components/supports";
import { Button } from "@/components/ui/button";
import { DOCUMENT_URL } from "@/lib/const";
import { DOCUMENT_URL } from "@/libs/const";
import { SiStackblitz } from "@icons-pack/react-simple-icons";
import { Blocks, Bot, Footprints, Terminal } from "lucide-react";
import Link from "next/link";
@@ -26,7 +26,7 @@ const llm = openai();
const response = await llm.chat({
messages: [{ content: "Tell me a joke.", role: "user" }],
});`,
`import { agent } from "llamaindex";
`import { agent } from "@llamaindex/workflow";
import { openai } from "@llamaindex/openai";
const analyseAgent = agent({
@@ -36,7 +36,7 @@ const analyseAgent = agent({
});
const response = await analyseAgent.run(\`Analyse the given data:
\${data}\`);`,
`import { agent, multiAgent } from "llamaindex";
`import { agent, multiAgent } from "@llamaindex/workflow";
import { openai } from "@llamaindex/openai";
const analyseAgent = agent({
@@ -113,8 +113,10 @@ export default function HomePage() {
description="Truly powerful retrieval-augmented generation applications use agentic techniques, and LlamaIndex.TS makes it easy to build them."
>
<CodeBlock
code={`import { agent, SimpleDirectoryReader, VectorStoreIndex } from "llamaindex";
code={`import { VectorStoreIndex } from "llamaindex";
import { SimpleDirectoryReader } from "@llamaindex/readers/directory";
import { openai } from "@llamaindex/openai";
import { agent } from "@llamaindex/workflow";
// load documents from current directoy into an index
const reader = new SimpleDirectoryReader();
+1 -1
View File
@@ -1,4 +1,4 @@
import { MockLLM } from "@llamaindex/core/utils";
import { MockLLM } from "@llamaindex/core/llms/mock";
import { LlamaIndexAdapter, type Message } from "ai";
import { Settings, SimpleChatEngine, type ChatMessage } from "llamaindex";
import { NextResponse, type NextRequest } from "next/server";
+1 -1
View File
@@ -1,4 +1,4 @@
import { source } from "@/lib/source";
import { source } from "@/libs/source";
import { structure } from "fumadocs-core/mdx-plugins";
import { createFromSource } from "fumadocs-core/search/server";
+2 -4
View File
@@ -1,7 +1,6 @@
import { ChatDemoRSC } from "@/components/demo/chat/rsc/demo";
import * as demos from "@/components/demo/lazy";
import { createMetadata, metadataImage } from "@/lib/metadata";
import { openapi, source } from "@/lib/source";
import { createMetadata, metadataImage } from "@/libs/metadata";
import { openapi, source } from "@/libs/source";
import * as Icons from "@icons-pack/react-simple-icons";
import { APIPage } from "fumadocs-openapi/ui";
import { Popup, PopupContent, PopupTrigger } from "fumadocs-twoslash/ui";
@@ -51,7 +50,6 @@ export default async function Page(props: {
...Icons,
...defaultMdxComponents,
...demos,
ChatDemoRSC,
Accordion,
Accordions,
APIPage: (props) => <APIPage {...openapi.getAPIPageProps(props)} />,
+1 -1
View File
@@ -1,5 +1,5 @@
import { baseOptions } from "@/app/layout.config";
import { source } from "@/lib/source";
import { source } from "@/libs/source";
import "fumadocs-twoslash/twoslash.css";
import { DocsLayout } from "fumadocs-ui/layouts/docs";
import type { ReactNode } from "react";
+1 -1
View File
@@ -1,4 +1,4 @@
import { DOCUMENT_URL } from "@/lib/const";
import { DOCUMENT_URL } from "@/libs/const";
import type { BaseLayoutProps } from "fumadocs-ui/layouts/shared";
import Image from "next/image";
+6
View File
@@ -1,5 +1,6 @@
import { AIProvider } from "@/actions";
import { TooltipProvider } from "@/components/ui/tooltip";
import { GoogleAnalytics, GoogleTagManager } from "@next/third-parties/google";
import { RootProvider } from "fumadocs-ui/provider";
import { Inter } from "next/font/google";
import type { ReactNode } from "react";
@@ -31,7 +32,11 @@ export default function Layout({ children }: { children: ReactNode }) {
sizes="16x16"
href="/favicon-16x16.png"
/>
<title>
LlamaIndex.TS - Build LLM-powered document agents and workflows
</title>
</head>
<GoogleTagManager gtmId="GTM-WWRFB36R" />
<body className="flex min-h-screen flex-col">
<TooltipProvider>
<AIProvider>
@@ -39,6 +44,7 @@ export default function Layout({ children }: { children: ReactNode }) {
</AIProvider>
</TooltipProvider>
</body>
<GoogleAnalytics gaId="G-NB9B8LW9W5" />
</html>
);
}
+1 -1
View File
@@ -1,5 +1,5 @@
import { generateOGImage } from "@/app/og/[...slug]/og";
import { metadataImage } from "@/lib/metadata";
import { metadataImage } from "@/libs/metadata";
import { type ImageResponse } from "next/og";
import { readFileSync } from "node:fs";
+1 -1
View File
@@ -1,6 +1,6 @@
import ContributorCounter from "@/components/contributor-count";
import { buttonVariants } from "@/components/ui/button";
import { cn } from "@/lib/utils";
import { cn } from "@/libs/utils";
import { Heart } from "lucide-react";
import { ReactElement } from "react";
@@ -1,5 +1,5 @@
import { fetchContributors } from "@/lib/get-contributors";
import { cn } from "@/lib/utils";
import { fetchContributors } from "@/libs/get-contributors";
import { cn } from "@/libs/utils";
import Image from "next/image";
import type { HTMLAttributes, ReactElement } from "react";
@@ -1,5 +1,5 @@
"use client";
import { cn } from "@/lib/utils";
import { cn } from "@/libs/utils";
import { TerminalIcon } from "lucide-react";
import {
Fragment,
@@ -1,21 +0,0 @@
"use client";
import {
ChatHandler,
ChatInput,
ChatMessages,
ChatSection,
} from "@llamaindex/chat-ui";
import { useChat } from "ai/react";
export const ChatDemo = () => {
const handler = useChat();
return (
<ChatSection handler={handler as ChatHandler}>
<ChatMessages>
<ChatMessages.List className="h-auto max-h-[400px]" />
<ChatMessages.Actions />
</ChatMessages>
<ChatInput />
</ChatSection>
);
};
@@ -1,57 +0,0 @@
import { Markdown } from "@llamaindex/chat-ui/widgets";
import { MockLLM } from "@llamaindex/core/utils";
import { generateId, Message } from "ai";
import { createAI, createStreamableUI, getMutableAIState } from "ai/rsc";
import { type ChatMessage, Settings, SimpleChatEngine } from "llamaindex";
import { ReactNode } from "react";
type ServerState = Message[];
type FrontendState = Array<Message & { display: ReactNode }>;
type Actions = {
chat: (message: Message) => Promise<Message & { display: ReactNode }>;
};
Settings.llm = new MockLLM(); // config your LLM here
export const AI = createAI<ServerState, FrontendState, Actions>({
initialAIState: [],
initialUIState: [],
actions: {
chat: async (message: Message) => {
"use server";
const aiState = getMutableAIState<typeof AI>();
aiState.update((prev) => [...prev, message]);
const uiStream = createStreamableUI();
const chatEngine = new SimpleChatEngine();
const assistantMessage: Message = {
id: generateId(),
role: "assistant",
content: "",
};
// run the async function without blocking
(async () => {
const chatResponse = await chatEngine.chat({
stream: true,
message: message.content,
chatHistory: aiState.get() as ChatMessage[],
});
for await (const chunk of chatResponse) {
assistantMessage.content += chunk.delta;
uiStream.update(<Markdown content={assistantMessage.content} />);
}
aiState.done([...aiState.get(), assistantMessage]);
uiStream.done();
})();
return {
...assistantMessage,
display: uiStream.value,
};
},
},
});
@@ -1,35 +0,0 @@
"use client";
import {
ChatHandler,
ChatInput,
ChatMessage,
ChatMessages,
ChatSection as ChatSectionUI,
Message,
} from "@llamaindex/chat-ui";
import { useChatRSC } from "./use-chat-rsc";
export const ChatSectionRSC = () => {
const handler = useChatRSC();
return (
<ChatSectionUI handler={handler as ChatHandler}>
<ChatMessages>
<ChatMessages.List className="h-auto max-h-[400px]">
{handler.messages.map((message, index) => (
<ChatMessage
key={index}
message={message as Message}
isLast={index === handler.messages.length - 1}
>
<ChatMessage.Avatar />
<ChatMessage.Content>{message.display}</ChatMessage.Content>
</ChatMessage>
))}
<ChatMessages.Loading />
</ChatMessages.List>
</ChatMessages>
<ChatInput />
</ChatSectionUI>
);
};
@@ -1,8 +0,0 @@
import { AI } from "./ai-action";
import { ChatSectionRSC } from "./chat-section";
export const ChatDemoRSC = () => (
<AI>
<ChatSectionRSC />
</AI>
);
@@ -1,41 +0,0 @@
"use client";
import { useActions } from "ai/rsc";
import { generateId, Message } from "ai";
import { useUIState } from "ai/rsc";
import { useState } from "react";
import { AI } from "./ai-action";
export function useChatRSC() {
const [input, setInput] = useState<string>("");
const [isLoading, setIsLoading] = useState<boolean>(false);
const [messages, setMessages] = useUIState<typeof AI>();
const { chat } = useActions<typeof AI>();
const append = async (message: Omit<Message, "id">) => {
const newMsg: Message = { ...message, id: generateId() };
setIsLoading(true);
try {
setMessages((prev) => [...prev, { ...newMsg, display: message.content }]);
const assistantMsg = await chat(newMsg);
setMessages((prev) => [...prev, assistantMsg]);
} catch (error) {
console.error(error);
}
setIsLoading(false);
setInput("");
return message.content;
};
return {
input,
setInput,
isLoading,
messages,
setMessages,
append,
};
}
-5
View File
@@ -1,11 +1,6 @@
"use client";
import dynamic from "next/dynamic";
// lazy load client components
export const ChatDemo = dynamic(() =>
import("@/components/demo/chat/api/demo").then((mod) => mod.ChatDemo),
);
export const CodeNodeParserDemo = dynamic(() =>
import("@/components/demo/code-node-parser").then(
(mod) => mod.CodeNodeParserDemo,
+1 -1
View File
@@ -1,4 +1,4 @@
import { cn } from "@/lib/utils";
import { cn } from "@/libs/utils";
import { LucideIcon } from "lucide-react";
import { HTMLAttributes, ReactElement, ReactNode } from "react";
+1 -1
View File
@@ -1,6 +1,6 @@
"use client";
import { Button } from "@/components/ui/button";
import { cn } from "@/lib/utils";
import { cn } from "@/libs/utils";
import { CodeBlock } from "fumadocs-ui/components/codeblock";
import { RotateCcw } from "lucide-react";
import { useTheme } from "next-themes";
+1 -1
View File
@@ -1,6 +1,6 @@
"use client";
import { cn } from "@/lib/utils";
import { cn } from "@/libs/utils";
import Image from "next/image";
import { ReactNode } from "react";
import { IconAI, IconUser } from "./ui/icons";
@@ -1,4 +1,4 @@
import { cn } from "@/lib/utils";
import { cn } from "@/libs/utils";
import {
AnimatePresence,
motion,
+1 -1
View File
@@ -1,7 +1,7 @@
import { cva, type VariantProps } from "class-variance-authority";
import * as React from "react";
import { cn } from "@/lib/utils";
import { cn } from "@/libs/utils";
const alertVariants = cva(
"relative w-full rounded-lg border px-4 py-3 text-sm [&>svg+div]:translate-y-[-3px] [&>svg]:absolute [&>svg]:left-4 [&>svg]:top-4 [&>svg]:text-foreground [&>svg~*]:pl-7",
+1 -1
View File
@@ -1,7 +1,7 @@
import { cva, type VariantProps } from "class-variance-authority";
import * as React from "react";
import { cn } from "@/lib/utils";
import { cn } from "@/libs/utils";
const badgeVariants = cva(
"inline-flex items-center rounded-md border px-2.5 py-0.5 text-xs font-semibold transition-colors focus:outline-none focus:ring-2 focus:ring-ring focus:ring-offset-2",
+1 -1
View File
@@ -2,7 +2,7 @@ import { Slot } from "@radix-ui/react-slot";
import { cva, type VariantProps } from "class-variance-authority";
import * as React from "react";
import { cn } from "@/lib/utils";
import { cn } from "@/libs/utils";
const buttonVariants = cva(
"inline-flex items-center justify-center gap-2 whitespace-nowrap rounded-md text-sm font-medium transition-colors focus-visible:outline-none focus-visible:ring-1 focus-visible:ring-ring disabled:pointer-events-none disabled:opacity-50 [&_svg]:pointer-events-none [&_svg]:size-4 [&_svg]:shrink-0",
+1 -1
View File
@@ -4,7 +4,7 @@ import * as DialogPrimitive from "@radix-ui/react-dialog";
import { Cross2Icon } from "@radix-ui/react-icons";
import * as React from "react";
import { cn } from "@/lib/utils";
import { cn } from "@/libs/utils";
const Dialog = DialogPrimitive.Root;
+1 -1
View File
@@ -1,4 +1,4 @@
import { cn } from "@/lib/utils";
import { cn } from "@/libs/utils";
export function IconAI({ className, ...props }: React.ComponentProps<"svg">) {
return (
@@ -1,5 +1,5 @@
"use client";
import { cn } from "@/lib/utils";
import { cn } from "@/libs/utils";
import { animate, motion, useMotionValue } from "framer-motion";
import { useEffect, useState } from "react";
import useMeasure from "react-use-measure";
+1 -1
View File
@@ -1,6 +1,6 @@
import * as React from "react";
import { cn } from "@/lib/utils";
import { cn } from "@/libs/utils";
export type InputProps = React.InputHTMLAttributes<HTMLInputElement>;
+1 -1
View File
@@ -4,7 +4,7 @@ import * as LabelPrimitive from "@radix-ui/react-label";
import { cva, type VariantProps } from "class-variance-authority";
import * as React from "react";
import { cn } from "@/lib/utils";
import { cn } from "@/libs/utils";
const labelVariants = cva(
"text-sm font-medium leading-none peer-disabled:cursor-not-allowed peer-disabled:opacity-70",
+1 -1
View File
@@ -1,4 +1,4 @@
import { cn } from "@/lib/utils";
import { cn } from "@/libs/utils";
function Skeleton({
className,
+1 -1
View File
@@ -3,7 +3,7 @@
import * as SliderPrimitive from "@radix-ui/react-slider";
import * as React from "react";
import { cn } from "@/lib/utils";
import { cn } from "@/libs/utils";
const Slider = React.forwardRef<
React.ElementRef<typeof SliderPrimitive.Root>,
+1 -1
View File
@@ -1,6 +1,6 @@
import * as React from "react";
import { cn } from "@/lib/utils";
import { cn } from "@/libs/utils";
export type TextareaProps = React.TextareaHTMLAttributes<HTMLTextAreaElement>;
+1 -1
View File
@@ -3,7 +3,7 @@
import * as TooltipPrimitive from "@radix-ui/react-tooltip";
import * as React from "react";
import { cn } from "@/lib/utils";
import { cn } from "@/libs/utils";
const TooltipProvider = TooltipPrimitive.Provider;
@@ -0,0 +1,60 @@
---
title: High-Level Concepts
---
This is a quick guide to the high-level concepts you'll encounter frequently when building LLM applications.
## Large Language Models (LLMs)
LLMs are the fundamental innovation that launched LlamaIndex. They are an artificial intelligence (AI) computer system that can understand, generate, and manipulate natural language, including answering questions based on their training data or data provided to them at query time.
## Agentic Applications
When an LLM is used within an application, it is often used to make decisions, take actions, and/or interact with the world. This is the core definition of an **agentic application**.
While the definition of an agentic application is broad, there are several key characteristics that define an agentic application:
- **LLM Augmentation**: The LLM is augmented with tools (i.e. arbitrary callable functions in code), memory, and/or dynamic prompts.
- **Prompt Chaining**: Several LLM calls are used that build on each other, with the output of one LLM call being used as the input to the next.
- **Routing**: The LLM is used to route the application to the next appropriate step or state in the application.
- **Parallelism**: The application can perform multiple steps or actions in parallel.
- **Orchestration**: A hierarchical structure of LLMs is used to orchestrate lower-level actions and LLMs.
- **Reflection**: The LLM is used to reflect and validate outputs of previous steps or LLM calls, which can be used to guide the application to the next appropriate step or state.
In LlamaIndex, you can build agentic applications by using the workflows to orchestrate a sequence of steps and LLMs. You can [learn more about workflows](/docs/llamaindex/tutorials/workflows).
## Agents
We define an agent as a specific instance of an "agentic application". An agent is a piece of software that semi-autonomously performs tasks by combining LLMs with other tools and memory, orchestrated in a reasoning loop that decides which tool to use next (if any).
What this means in practice, is something like:
- An agent receives a user message
- The agent uses an LLM to determine the next appropriate action to take using the previous chat history, tools, and the latest user message
- The agent may invoke one or more tools to assist in the users request
- If tools are used, the agent will then interpret the tool outputs and use them to inform the next action
- Once the agent stops taking actions, it returns the final output to the user
You can [learn more about agents](/docs/llamaindex/tutorials/basic_agent).
## Retrieval Augmented Generation (RAG)
Retrieval-Augmented Generation (RAG) is a core technique for building data-backed LLM applications with LlamaIndex. It allows LLMs to answer questions about your private data by providing it to the LLM at query time, rather than training the LLM on your data. To avoid sending **all** of your data to the LLM every time, RAG indexes your data and selectively sends only the relevant parts along with your query. You can [learn more about RAG](/docs/llamaindex/tutorials/rag).
## Use cases
There are endless use cases for data-backed LLM applications but they can be roughly grouped into four categories:
[**Agents**](/docs/llamaindex/tutorials/basic_agent):
An agent is an automated decision-maker powered by an LLM that interacts with the world via a set of [tools](/docs/llamaindex/modules/agents/tool). Agents can take an arbitrary number of steps to complete a given task, dynamically deciding on the best course of action rather than following pre-determined steps. This gives it additional flexibility to tackle more complex tasks.
[**Workflows**](/docs/llamaindex/tutorials/workflows):
A Workflow in LlamaIndex is a specific event-driven abstraction that allows you to orchestrate a sequence of steps and LLMs calls. Workflows can be used to implement any agentic application, and are a core component of LlamaIndex.
[**Structured Data Extraction**](/docs/llamaindex/tutorials/structured_data_extraction):
Pydantic extractors allow you to specify a precise data structure to extract from your data and use LLMs to fill in the missing pieces in a type-safe way. This is useful for extracting structured data from unstructured sources like PDFs, websites, and more, and is key to automating workflows.
[**Query Engines**](/docs/llamaindex/modules/rag/query_engines):
A query engine is an end-to-end flow that allows you to ask questions over your data. It takes in a natural language query, and returns a response, along with reference context retrieved and passed to the LLM.
[**Chat Engines**](/docs/llamaindex/modules/rag/chat_engine):
A chat engine is an end-to-end flow for having a conversation with your data (multiple back-and-forth instead of a single question-and-answer).
@@ -19,3 +19,8 @@ npm run dev
to start the development server. You can then visit [http://localhost:3000](http://localhost:3000) to see your app, which should look something like this:
![create-llama interface](/images/create_llama.png)
## Learn more
- [Learn more about `create-llama`](https://github.com/run-llama/create-llama)
- [Want to use the same UI components? You can use our React components](https://ui.llamaindex.ai/)
@@ -17,7 +17,8 @@ npm i
Then you can run any example in the folder with `tsx`, e.g.:
```bash npm2yarn
npx tsx ./vectorIndex.ts
export OPENAI_API_KEY=your-api-key
npx tsx ./agents/agent/openai.ts
```
## Try examples online
@@ -1,70 +0,0 @@
---
title: With Cloudflare Worker
description: In this guide, you'll learn how to use LlamaIndex with CloudFlare Worker
---
Before you start, make sure you have try LlamaIndex.TS in Node.js to make sure you understand the basics.
<Card
title="Getting Started with LlamaIndex.TS in Node.js"
href="/docs/llamaindex/getting_started/installation/node"
/>
Also, you need have the basic understanding of <a href='https://developers.cloudflare.com/workers/'><SiCloudflareworkers className="inline mr-2" color="#F38020" />Cloudflare Worker</a>.
## Adding environment variables
```ts
export default {
async fetch(request: Request, env: Env): Promise<Response> {
const { setEnvs } = await import("@llamaindex/env");
setEnvs(env);
const { OpenAIAgent } = await import("@llamaindex/openai");
// Start your code here
return new Response("Hello, world!");
},
};
```
Then, you need create `.dev.vars` and add LLM api keys for the local development, such as `OPENAI_API_KEY` for OpenAI API key.
<Callout type="warn">Do not commit the api key to git repository.</Callout>
## Integrating with Hono
```ts
import { Hono } from "hono";
type Bindings = {
OPENAI_API_KEY: string;
};
const app = new Hono<{
Bindings: Bindings;
}>();
app.post("/llm", async (c) => {
const { setEnvs } = await import("@llamaindex/env");
setEnvs(c.env);
// ...
return new Response('Hello, world!');
})
export default {
fetch: app.fetch,
};
```
## Difference between Node.js and Cloudflare Worker
In Cloudflare Worker and similar serverless JS environment, you need to be aware of the following differences:
- Some Node.js modules are not available in Cloudflare Worker, such as `node:fs`, `node:child_process`, `node:cluster`...
- You are recommend to design your code using network request, such as use `fetch` API to communicate with database, instead of a long-running process in Node.js.
- Some of LlamaIndex.TS packages are not available in Cloudflare Worker, for example `@llamaindex/readers` and `@llamaindex/huggingface`.
- The main `llamaindex` is designed to work in all JavaScript environment, including Cloudflare Worker. If you find any issue, please report to us.
- `@llamaindex/env` is a JS environment binding module, which polyfill some Node.js/Modern Web API (for example, we have a memory based `fs` module, and Crypto API polyfill). It is designed to work in all JavaScript environment, including Cloudflare Worker.
@@ -1,69 +1,177 @@
---
title: Installation
description: How to install llamaindex packages.
description: How to install and set up LlamaIndex.TS for your project.
---
To install llamaindex, run the following command:
## Quick Start
Install the core package:
```package-install
npm i llamaindex
```
In most cases, you'll also need an LLM package to use LlamaIndex. For example, to use the OpenAI LLM, you would install the following:
In most cases, you'll also need an LLM provider and the Workflow package:
```package-install
npm i @llamaindex/openai
npm i @llamaindex/openai @llamaindex/workflow
```
Go to [LLM APIs](/docs/llamaindex/modules/models/llms) to find out how to use other LLMs.
## Environment Setup
### API Keys
## Frameworks
Most LLM providers require API keys. Set your OpenAI key (or other provider):
LlamaIndex supports a wide range of frameworks and runtimes. Click on the card below to learn more.
```bash
export OPENAI_API_KEY=your-api-key
```
Or use a `.env` file:
```bash
echo "OPENAI_API_KEY=your-api-key" > .env
```
<Callout type="warn">Never commit API keys to your repository.</Callout>
### Loading Environment Variables
For Node.js applications:
```bash
node --env-file .env your-script.js
```
For other environments, see the deployment-specific guides below.
## TypeScript Configuration
LlamaIndex.TS is built with TypeScript and provides excellent type safety. Add these settings to your `tsconfig.json`:
```json5
{
"compilerOptions": {
// Essential for module resolution
"moduleResolution": "bundler", // or "nodenext" | "node16" | "node"
// Required for Web Stream API support
"lib": ["DOM.AsyncIterable"],
// Recommended for better compatibility
"target": "es2020",
"module": "esnext"
}
}
```
## Running your first agent
### Set up
If you don't already have a project, you can create a new one in a new folder:
```package-install
npm init
npm i -D typescript @types/node
npm i @llamaindex/openai @llamaindex/workflow llamaindex zod
```
### Run the agent
Create the file `example.ts`. This code will:
- Create two tools for use by the agent:
- A `sumNumbers` tool that adds two numbers
- A `divideNumbers` tool that divides numbers
- Give an example of the data structure we wish to generate
- Prompt the LLM with instructions and the example, plus a sample transcript
<include cwd>../../examples/agents/agent/openai.ts</include>
To run the code:
```package-install
npx tsx example.ts
```
You should expect output something like:
```
{
result: '5 + 5 is 10. Then, 10 divided by 2 is 5.',
state: {
memory: Memory {
messages: [Array],
tokenLimit: 30000,
shortTermTokenLimitRatio: 0.7,
memoryBlocks: [],
memoryCursor: 0,
adapters: [Object]
},
scratchpad: [],
currentAgentName: 'Agent',
agents: [ 'Agent' ],
nextAgentName: null
}
}
Done
```
## Performance Optimization
### Tokenization Speed
Install `gpt-tokenizer` for 60x faster tokenization (Node.js environments only):
```package-install
npm i gpt-tokenizer
```
LlamaIndex will automatically use this when available.
## Deployment Guides
Choose your deployment target:
<Cards>
<Card title={
<>
<SiNodedotjs className="inline" color="#5FA04E" /> Node.js
</>
} href="/docs/llamaindex/getting_started/installation/node" />
<Card title={
<>
<SiTypescript className="inline" color="#3178C6" /> TypeScript
</>
} href="/docs/llamaindex/getting_started/installation/typescript" />
<Card title={
<>
<SiVite className='inline' color='#646CFF' /> Vite
</>
} href="/docs/llamaindex/getting_started/installation/vite" />
<Card
title={
<>
<SiNextdotjs className='inline' /> Next.js (React Server Component)
</>
}
href="/docs/llamaindex/getting_started/installation/next"
/>
<Card title={
<>
<SiCloudflareworkers className='inline' color='#F38020' /> Cloudflare Workers
</>
} href="/docs/llamaindex/getting_started/installation/cloudflare" />
<Card
title="Server APIs & Backends"
description="Express, Fastify, Koa, standalone Node.js servers"
href="/docs/llamaindex/getting_started/installation/server-apis"
/>
<Card
title="Serverless Functions"
description="Vercel, Netlify, AWS Lambda, Cloudflare Workers"
href="/docs/llamaindex/getting_started/installation/serverless"
/>
<Card
title="Next.js Applications"
description="API routes, server components, edge runtime"
href="/docs/llamaindex/getting_started/installation/nextjs"
/>
<Card
title="Troubleshooting"
description="Common issues, bundle optimization, compatibility"
href="/docs/llamaindex/getting_started/installation/troubleshooting"
/>
</Cards>
## What's next?
## LLM/Embedding Providers
Go to [LLM APIs](/docs/llamaindex/modules/models/llms) and [Embedding APIs](/docs/llamaindex/modules/models/embeddings) to find out how to use different LLM and embedding providers beyond OpenAI.
## What's Next?
<Cards>
<Card
title="Learn LlamaIndex.TS"
description="Learn how to use LlamaIndex.TS by starting with one of our tutorials."
href="/docs/llamaindex/tutorials/rag"
/>
<Card
title="Show me code examples"
description="Explore code examples using LlamaIndex.TS."
href="/docs/llamaindex/getting_started/examples"
/>
<Card
title="Learn LlamaIndex.TS"
description="Learn how to use LlamaIndex.TS by starting with one of our tutorials."
href="/docs/llamaindex/tutorials/basic_agent"
/>
<Card
title="Show me code examples"
description="Explore code examples using LlamaIndex.TS."
href="/docs/llamaindex/getting_started/examples"
/>
</Cards>
@@ -1,4 +1,4 @@
{
"title": "Installation",
"pages": ["node", "typescript", "next", "vite", "cloudflare"]
"pages": ["server-apis", "serverless", "nextjs", "troubleshooting"]
}
@@ -1,41 +0,0 @@
---
title: With Next.js
description: In this guide, you'll learn how to use LlamaIndex with Next.js.
---
Before you start, make sure you have try LlamaIndex.TS in Node.js to make sure you understand the basics.
<Card
title="Getting Started with LlamaIndex.TS in Node.js"
href="/docs/llamaindex/getting_started/installation/node"
/>
## Differences between Node.js and Next.js
Next.js is a React framework that has both server side compatibility and client side compatibility.
This means that you need to be careful when using LlamaIndex.TS in Next.js.
Don't leak the import data like API keys to the client side.
Also, in Next.js, there is build time and runtime. Some computations can be done at build time like Document embedding could be done at build time for better performance.
Where as the `llamaindex` package is working with Next.js, some provider packages like `@llamaindex/huggingface` are not working well with Next.js. This is due to the upstream dependencies used by the provider package.
Make sure to use `withLlamaIndex` to make sure that LlamaIndex.TS works well with Next.js.
```js
// next.config.mjs / next.config.ts
import withLlamaIndex from "llamaindex/next";
/** @type {import('next').NextConfig} */
const nextConfig = {};
export default withLlamaIndex(nextConfig);
```
If you see any dependency issues, you are welcome to open an issue on the GitHub.
## Edge Runtime
[Vercel Edge Runtime](https://edge-runtime.vercel.app/) is a subset of Node.js APIs. Similar to [Cloudflare Workers](/docs/llamaindex/getting_started/installation/cloudflare#difference-between-nodejs-and-cloudflare-worker),
it is a serverless platform that runs your code on the edge.
Not all features of Node.js are supported in Vercel Edge Runtime, so does LlamaIndex.TS, we are working on more compatibility with all JavaScript runtimes.
@@ -0,0 +1,405 @@
---
title: Next.js Applications
description: Deploy LlamaIndex.TS in Next.js applications with API routes, server components, and edge runtime.
---
This guide covers integrating LlamaIndex.TS agents with Next.js applications.
## Essential Configuration
### Next.js Config
Use `withLlamaIndex` to ensure compatibility:
```javascript
// next.config.mjs
import withLlamaIndex from "llamaindex/next";
/** @type {import('next').NextConfig} */
const nextConfig = {
// Your existing config
};
export default withLlamaIndex(nextConfig);
```
## API Routes
### App Router (Recommended)
```typescript
// app/api/chat/route.ts
import { agent } from "@llamaindex/workflow";
import { tool } from "llamaindex";
import { openai } from "@llamaindex/openai";
import { z } from "zod";
import { NextRequest, NextResponse } from "next/server";
// Initialize agent once (consider using a singleton pattern)
let myAgent: any = null;
async function initializeAgent() {
if (myAgent) return myAgent;
try {
const greetTool = tool({
name: "greet",
description: "Greets a user with their name",
parameters: z.object({
name: z.string(),
}),
execute: ({ name }) => `Hello, ${name}! How can I help you today?`,
});
myAgent = agent({
tools: [greetTool],
llm: openai({ model: "gpt-4o-mini" }),
});
return myAgent;
} catch (error) {
console.error("Failed to initialize agent:", error);
throw error;
}
}
export async function POST(request: NextRequest) {
try {
const { message } = await request.json();
if (!message || typeof message !== 'string') {
return NextResponse.json(
{ error: "Message is required and must be a string" },
{ status: 400 }
);
}
const agent = await initializeAgent();
const result = await agent.run(message);
return NextResponse.json({ response: result.data });
} catch (error) {
console.error("Chat error:", error);
return NextResponse.json(
{ error: "Internal server error" },
{ status: 500 }
);
}
}
```
### Pages Router (Legacy)
```typescript
// pages/api/chat.ts
import { agent } from "@llamaindex/workflow";
import { tool } from "llamaindex";
import { openai } from "@llamaindex/openai";
import { z } from "zod";
import type { NextApiRequest, NextApiResponse } from "next";
let myAgent: any = null;
async function initializeAgent() {
if (myAgent) return myAgent;
const timeTool = tool({
name: "getCurrentTime",
description: "Gets the current time",
parameters: z.object({}),
execute: () => new Date().toISOString(),
});
myAgent = agent({
tools: [timeTool],
llm: openai({ model: "gpt-4o-mini" }),
});
return myAgent;
}
export default async function handler(
req: NextApiRequest,
res: NextApiResponse
) {
if (req.method !== "POST") {
return res.status(405).json({ error: "Method not allowed" });
}
try {
const { message } = req.body;
const agent = await initializeAgent();
const result = await agent.run(message);
res.json({ response: result.data });
} catch (error) {
console.error("Chat error:", error);
res.status(500).json({ error: "Internal server error" });
}
}
```
## Server Components
Initialize agents in server components:
```typescript
// app/chat/page.tsx
import { agent } from "@llamaindex/workflow";
import { tool } from "llamaindex";
import { openai } from "@llamaindex/openai";
import { z } from "zod";
async function initializeAgent() {
const helpTool = tool({
name: "getHelp",
description: "Provides help information",
parameters: z.object({
topic: z.string().optional(),
}),
execute: ({ topic }) => {
if (topic) {
return `Here's help for ${topic}: This is a helpful resource about ${topic}.`;
}
return "Available topics: general, troubleshooting, api, deployment";
},
});
return agent({
tools: [helpTool],
llm: openai({ model: "gpt-4o-mini" }),
});
}
export default async function ChatPage() {
const chatAgent = await initializeAgent();
return (
<div>
<h1>Chat Interface</h1>
<p>Agent initialized and ready to help!</p>
{/* Your chat UI components */}
</div>
);
}
```
## Edge Runtime
The Edge Runtime has limited Node.js API access:
```typescript
// app/api/chat-edge/route.ts
import { NextRequest, NextResponse } from "next/server";
export const runtime = "edge";
export async function POST(request: NextRequest) {
const { setEnvs } = await import("@llamaindex/env");
setEnvs(process.env);
try {
const { message } = await request.json();
const { agent } = await import("@llamaindex/workflow");
const { tool } = await import("llamaindex");
const { openai } = await import("@llamaindex/openai");
const { z } = await import("zod");
const timeTool = tool({
name: "time",
description: "Gets current time",
parameters: z.object({}),
execute: () => new Date().toISOString(),
});
const myAgent = agent({
tools: [timeTool],
llm: openai({ model: "gpt-4o-mini" }),
});
const result = await myAgent.run(message);
return NextResponse.json({ response: result.data });
} catch (error) {
return NextResponse.json({ error: error.message }, { status: 500 });
}
}
```
## Streaming Responses
Implement streaming for better user experience:
```typescript
// app/api/chat-stream/route.ts
import { agent } from "@llamaindex/workflow";
import { tool } from "llamaindex";
import { openai } from "@llamaindex/openai";
import { agentStreamEvent } from "@llamaindex/workflow";
import { NextRequest } from "next/server";
import { z } from "zod";
// Initialize agent once (consider using a singleton pattern)
let myAgent: any = null;
async function initializeAgent() {
if (myAgent) return myAgent;
try {
const greetTool = tool({
name: "greet",
description: "Greets a user with their name",
parameters: z.object({
name: z.string(),
}),
execute: ({ name }) => `Hello, ${name}! How can I help you today?`,
});
myAgent = agent({
tools: [greetTool],
llm: openai({ model: "gpt-4o-mini" }),
});
return myAgent;
} catch (error) {
console.error("Failed to initialize agent:", error);
throw error;
}
}
export async function POST(request: NextRequest) {
const { message } = await request.json();
const stream = new ReadableStream({
async start(controller) {
try {
const agent = await initializeAgent();
const events = agent.runStream(message);
for await (const event of events) {
if (agentStreamEvent.include(event)) {
controller.enqueue(new TextEncoder().encode(event.data.delta));
}
}
controller.close();
} catch (error) {
controller.error(error);
}
},
});
return new Response(stream, {
headers: {
"Content-Type": "text/plain",
"Transfer-Encoding": "chunked",
},
});
}
```
## Client-side Integration
### React Hook for API Calls
```typescript
// hooks/useAgentChat.ts
import { useState } from "react";
export function useAgentChat() {
const [loading, setLoading] = useState(false);
const [error, setError] = useState<string | null>(null);
const [response, setResponse] = useState<string | null>(null);
const chat = async (message: string) => {
setLoading(true);
setError(null);
try {
const res = await fetch("/api/chat", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ message }),
});
if (!res.ok) {
throw new Error(`HTTP error! status: ${res.status}`);
}
const data = await res.json();
setResponse(data.response);
} catch (err) {
setError(err instanceof Error ? err.message : "An error occurred");
} finally {
setLoading(false);
}
};
return { chat, loading, error, response };
}
```
### Chat Component
```typescript
// components/ChatInterface.tsx
"use client";
import { useState } from "react";
import { useAgentChat } from "@/hooks/useAgentChat";
export default function ChatInterface() {
const [message, setMessage] = useState("");
const { chat, loading, error, response } = useAgentChat();
const handleSubmit = async (e: React.FormEvent) => {
e.preventDefault();
if (!message.trim()) return;
await chat(message);
setMessage("");
};
return (
<div className="max-w-2xl mx-auto p-4">
<form onSubmit={handleSubmit} className="mb-4">
<input
type="text"
value={message}
onChange={(e) => setMessage(e.target.value)}
placeholder="Send a message..."
className="w-full p-2 border rounded"
disabled={loading}
/>
<button
type="submit"
disabled={loading || !message.trim()}
className="mt-2 px-4 py-2 bg-blue-500 text-white rounded disabled:opacity-50"
>
{loading ? "Thinking..." : "Send"}
</button>
</form>
{error && (
<div className="p-3 mb-4 bg-red-100 border border-red-400 text-red-700 rounded">
Error: {error}
</div>
)}
{response && (
<div className="p-3 bg-gray-100 border rounded">
<strong>Agent:</strong>
<p>{response}</p>
</div>
)}
</div>
);
}
```
## Next Steps
- Learn about [serverless deployment](/docs/llamaindex/getting_started/installation/serverless)
- Explore [server APIs](/docs/llamaindex/getting_started/installation/server-apis)
- Check [troubleshooting guide](/docs/llamaindex/getting_started/installation/troubleshooting) for common issues
@@ -1,40 +0,0 @@
---
title: With Node.js/Bun/Deno
description: In this guide, you'll learn how to use LlamaIndex with Node.js, Bun, and Deno.
---
## Adding environment variables
By default, LlamaIndex uses OpenAI provider, which requires an API key. You can set the `OPENAI_API_KEY` environment variable to authenticate with OpenAI.
```shell
export OPENAI_API_KEY=your-api-key
```
Or you can use a `.env` file:
```shell
echo "OPENAI_API_KEY=your-api-key" > .env
node --env-file .env your-script.js
```
<Callout type="warn">Do not commit the api key to git repository.</Callout>
For more information, see the [How to read environment variables from Node.js](https://nodejs.org/en/learn/command-line/how-to-read-environment-variables-from-nodejs).
## Performance Optimization
By the default, we are using `js-tiktoken` for tokenization. You can install `gpt-tokenizer` which is then automatically used by LlamaIndex to get a 60x speedup for tokenization:
```package-install
npm i gpt-tokenizer
```
**Note**: This only works for Node.js
## TypeScript support
<Card
title="Getting Started with LlamaIndex.TS in TypeScript"
href="/docs/llamaindex/getting_started/installation/typescript"
/>
@@ -0,0 +1,211 @@
---
title: Server APIs & Backends
description: Deploy LlamaIndex.TS in server environments like Express, Fastify, and standalone Node.js applications.
---
This guide covers adding LlamaIndex.TS agents to traditional server environments where you have full Node.js runtime access.
## Supported Runtimes
LlamaIndex.TS works seamlessly with:
- **Node.js** (v18+)
- **Bun** (v1.0+)
- **Deno** (v1.30+)
## Common Server Frameworks
### Express.js
```typescript
import express from 'express';
import { agent } from '@llamaindex/workflow';
import { tool } from 'llamaindex';
import { openai } from '@llamaindex/openai';
import { z } from 'zod';
const app = express();
app.use(express.json());
// Initialize agent once at startup
let myAgent: any;
async function initializeAgent() {
// Create tools for the agent
const sumTool = tool({
name: "sum",
description: "Adds two numbers",
parameters: z.object({
a: z.number(),
b: z.number(),
}),
execute: ({ a, b }) => a + b,
});
const multiplyTool = tool({
name: "multiply",
description: "Multiplies two numbers",
parameters: z.object({
a: z.number(),
b: z.number(),
}),
execute: ({ a, b }) => a * b,
});
// Create the agent
myAgent = agent({
tools: [sumTool, multiplyTool],
llm: openai({ model: "gpt-4o-mini" }),
});
}
app.post('/api/chat', async (req, res) => {
try {
const { message } = req.body;
const result = await myAgent.run(message);
res.json({ response: result.data });
} catch (error) {
res.status(500).json({ error: 'Chat failed' });
}
});
// Initialize and start server
initializeAgent().then(() => {
app.listen(3000, () => {
console.log('Server running on port 3000');
});
});
```
### Fastify
```typescript
import Fastify from 'fastify';
import { agent } from '@llamaindex/workflow';
import { tool } from 'llamaindex';
import { openai } from '@llamaindex/openai';
import { z } from 'zod';
const fastify = Fastify();
let myAgent: any;
async function initializeAgent() {
const sumTool = tool({
name: "sum",
description: "Adds two numbers",
parameters: z.object({
a: z.number(),
b: z.number(),
}),
execute: ({ a, b }) => a + b,
});
myAgent = agent({
tools: [sumTool],
llm: openai({ model: "gpt-4o-mini" }),
});
}
fastify.post('/api/chat', async (request, reply) => {
try {
const { message } = request.body as { message: string };
const result = await myAgent.run(message);
return { response: result.data };
} catch (error) {
reply.status(500).send({ error: 'Chat failed' });
}
});
const start = async () => {
await initializeAgent();
await fastify.listen({ port: 3000 });
console.log('Server running on port 3000');
};
start();
```
### Hono
```typescript
import { Hono } from "hono";
import { agent } from "@llamaindex/workflow";
import { tool } from "llamaindex";
import { openai } from "@llamaindex/openai";
import { z } from "zod";
type Bindings = {
OPENAI_API_KEY: string;
};
const app = new Hono<{ Bindings: Bindings }>();
app.post("/api/chat", async (c) => {
const { setEnvs } = await import("@llamaindex/env");
setEnvs(c.env);
const { message } = await c.req.json();
const greetTool = tool({
name: "greet",
description: "Greets a user",
parameters: z.object({
name: z.string(),
}),
execute: ({ name }) => `Hello, ${name}!`,
});
const myAgent = agent({
tools: [greetTool],
llm: openai({ model: "gpt-4o-mini" }),
});
try {
const result = await myAgent.run(message);
return c.json({ response: result.data });
} catch (error) {
return c.json({ error: error.message }, 500);
}
});
export default app;
```
## Streaming Responses
For real-time agent responses:
```typescript
import { agentStreamEvent } from "@llamaindex/workflow";
app.post('/api/chat-stream', async (req, res) => {
const { message } = req.body;
res.writeHead(200, {
'Content-Type': 'text/plain',
'Transfer-Encoding': 'chunked',
});
try {
const events = myAgent.runStream(message);
for await (const event of events) {
if (agentStreamEvent.include(event)) {
res.write(event.data.delta);
}
}
res.end();
} catch (error) {
res.write('Error: ' + error.message);
res.end();
}
});
```
## Next Steps
- Learn about [serverless deployment](/docs/llamaindex/getting_started/installation/serverless)
- Explore [Next.js integration](/docs/llamaindex/getting_started/installation/nextjs)
- Check [troubleshooting guide](/docs/llamaindex/getting_started/installation/troubleshooting) for common issues
@@ -0,0 +1,240 @@
---
title: Serverless Functions
description: Deploy LlamaIndex.TS in serverless environments like Vercel, Netlify, AWS Lambda, and Cloudflare Workers.
---
This guide covers adding LlamaIndex.TS agents to serverless environments where you have execution time and memory constraints.
## Cloudflare Workers
```typescript
export default {
async fetch(request: Request, env: Env): Promise<Response> {
const { setEnvs } = await import("@llamaindex/env");
setEnvs(env);
const { agent } = await import("@llamaindex/workflow");
const { openai } = await import("@llamaindex/openai");
const { tool } = await import("llamaindex");
const { z } = await import("zod");
const timeTool = tool({
name: "getCurrentTime",
description: "Gets the current time",
parameters: z.object({}),
execute: () => new Date().toISOString(),
});
const myAgent = agent({
tools: [timeTool],
llm: openai({ model: "gpt-4o-mini" }),
});
try {
const { message } = await request.json();
const result = await myAgent.run(message);
return new Response(JSON.stringify({ response: result.data }), {
headers: { "Content-Type": "application/json" },
});
} catch (error) {
return new Response(JSON.stringify({ error: error.message }), {
status: 500,
headers: { "Content-Type": "application/json" },
});
}
},
};
```
## Vercel Functions
### Node.js Runtime
```typescript
// pages/api/chat.ts or app/api/chat/route.ts
import { agent } from "@llamaindex/workflow";
import { tool } from "llamaindex";
import { openai } from "@llamaindex/openai";
import { z } from "zod";
export default async function handler(req, res) {
if (req.method !== 'POST') {
return res.status(405).json({ error: 'Method not allowed' });
}
const { message } = req.body;
const weatherTool = tool({
name: "getWeather",
description: "Get weather information",
parameters: z.object({
city: z.string(),
}),
execute: ({ city }) => `Weather in ${city}: 72°F, sunny`,
});
const myAgent = agent({
tools: [weatherTool],
llm: openai({ model: "gpt-4o-mini" }),
});
try {
const result = await myAgent.run(message);
res.json({ response: result.data });
} catch (error) {
res.status(500).json({ error: error.message });
}
}
```
### Edge Runtime
```typescript
// app/api/chat/route.ts
import { NextRequest, NextResponse } from "next/server";
export const runtime = "edge";
export async function POST(request: NextRequest) {
const { setEnvs } = await import("@llamaindex/env");
setEnvs(process.env);
const { message } = await request.json();
try {
// Use simpler tools for edge runtime
const { agent } = await import("@llamaindex/workflow");
const { tool } = await import("llamaindex");
const { openai } = await import("@llamaindex/openai");
const { z } = await import("zod");
const timeTool = tool({
name: "time",
description: "Gets current time",
parameters: z.object({}),
execute: () => new Date().toISOString(),
});
const myAgent = agent({
tools: [timeTool],
llm: openai({ model: "gpt-4o-mini" }),
});
const result = await myAgent.run(message);
return NextResponse.json({ response: result.data });
} catch (error) {
return NextResponse.json({ error: error.message }, { status: 500 });
}
}
```
## AWS Lambda
```typescript
import { APIGatewayProxyHandler } from "aws-lambda";
import { agent } from "@llamaindex/workflow";
import { tool } from "llamaindex";
import { openai } from "@llamaindex/openai";
import { z } from "zod";
export const handler: APIGatewayProxyHandler = async (event, context) => {
const { message } = JSON.parse(event.body || "{}");
const calculatorTool = tool({
name: "calculate",
description: "Performs basic math",
parameters: z.object({
expression: z.string(),
}),
execute: ({ expression }) => {
// Simple calculator implementation
try {
return `Result: ${eval(expression)}`;
} catch {
return "Invalid expression";
}
},
});
const myAgent = agent({
tools: [calculatorTool],
llm: openai({ model: "gpt-4o-mini" }),
});
try {
const result = await myAgent.run(message);
return {
statusCode: 200,
headers: {
"Content-Type": "application/json",
"Access-Control-Allow-Origin": "*",
},
body: JSON.stringify({ response: result.data }),
};
} catch (error) {
return {
statusCode: 500,
body: JSON.stringify({ error: error.message }),
};
}
};
```
## Netlify Functions
```typescript
// netlify/functions/chat.ts
import { Handler } from "@netlify/functions";
import { agent } from "@llamaindex/workflow";
import { tool } from "llamaindex";
import { openai } from "@llamaindex/openai";
import { z } from "zod";
export const handler: Handler = async (event, context) => {
if (event.httpMethod !== "POST") {
return { statusCode: 405, body: "Method Not Allowed" };
}
const { message } = JSON.parse(event.body || "{}");
const helpTool = tool({
name: "help",
description: "Provides help information",
parameters: z.object({
topic: z.string().optional(),
}),
execute: ({ topic }) => {
return topic ? `Help for ${topic}` : "Available help topics";
},
});
const myAgent = agent({
tools: [helpTool],
llm: openai({ model: "gpt-4o-mini" }),
});
try {
const result = await myAgent.run(message);
return {
statusCode: 200,
body: JSON.stringify({ response: result.data }),
};
} catch (error) {
return {
statusCode: 500,
body: JSON.stringify({ error: error.message }),
};
}
};
```
## Next Steps
- Learn about [Next.js integration](/docs/llamaindex/getting_started/installation/nextjs)
- Explore [server deployment](/docs/llamaindex/getting_started/installation/server-apis)
- Check [troubleshooting guide](/docs/llamaindex/getting_started/installation/troubleshooting) for common issues
@@ -0,0 +1,501 @@
---
title: Troubleshooting
description: Common issues and solutions when installing and deploying LlamaIndex.TS applications.
---
This guide addresses common issues you might encounter when installing and deploying LlamaIndex.TS applications across different environments.
## Installation Issues
### Module Resolution Errors
**Problem:** Import errors or module not found errors
**Solution:** Ensure your `tsconfig.json` is properly configured:
```json5
{
"compilerOptions": {
"moduleResolution": "bundler", // or "nodenext" | "node16" | "node"
"lib": ["DOM.AsyncIterable"],
"target": "es2020",
"module": "esnext"
}
}
```
**Alternative solution:** Try different module resolution strategies:
```bash
# Clear node_modules and reinstall
rm -rf node_modules package-lock.json
npm install
# Or try with different package manager
pnpm install
# or
yarn install
```
### TypeScript Errors
**Problem:** TypeScript compilation errors with LlamaIndex imports
**Solution:** Ensure you have the correct TypeScript configuration:
```json5
{
"compilerOptions": {
"strict": true,
"skipLibCheck": true, // Skip type checking of node_modules
"allowSyntheticDefaultImports": true,
"esModuleInterop": true
}
}
```
### Package Compatibility Issues
**Problem:** Some packages don't work in certain environments
**Common incompatibilities:**
- `@llamaindex/readers` - May not work in serverless environments
- `@llamaindex/huggingface` - Limited browser/edge compatibility
- File system readers - Don't work in browser/edge environments
**Solution:** Use environment-specific alternatives:
```typescript
// Instead of file system readers in serverless
// Use remote data sources
async function loadDocumentsFromAPI() {
const response = await fetch('https://api.example.com/documents');
const data = await response.json();
return data.map(doc => new Document(doc.content));
}
```
## Runtime Issues
### Memory Errors
**Problem:** Out of memory errors during index creation or querying
**Solution:** Optimize memory usage:
```typescript
// Batch process large document sets
async function batchProcessDocuments(documents: Document[], batchSize = 10) {
const results = [];
for (let i = 0; i < documents.length; i += batchSize) {
const batch = documents.slice(i, i + batchSize);
const batchIndex = await VectorStoreIndex.fromDocuments(batch);
results.push(batchIndex);
// Optional: Add delay between batches
await new Promise(resolve => setTimeout(resolve, 100));
}
return results;
}
```
**For serverless environments:**
```typescript
// Use external vector stores instead of in-memory
// TODO: Example with Pinecone, Weaviate, etc.
// const vectorStore = new PineconeVectorStore(/* config */);
// const index = await VectorStoreIndex.fromVectorStore(vectorStore);
```
### API Rate Limiting
**Problem:** Rate limiting errors from LLM providers
**Solution:** Implement retry logic with exponential backoff:
```typescript
async function queryWithRetry(queryEngine: any, question: string, maxRetries = 3) {
for (let i = 0; i < maxRetries; i++) {
try {
return await queryEngine.query(question);
} catch (error) {
if (error.message.includes('rate limit') && i < maxRetries - 1) {
const delay = Math.pow(2, i) * 1000; // Exponential backoff
await new Promise(resolve => setTimeout(resolve, delay));
continue;
}
throw error;
}
}
}
```
### Tokenization Performance
**Problem:** Slow tokenization affecting performance
**Solution:** Install faster tokenizer (Node.js only):
```bash
npm install gpt-tokenizer
```
LlamaIndex will automatically use this for 60x faster tokenization.
## Bundling Issues
### Bundle Size Too Large
**Problem:** Large bundle sizes affecting performance
**Solution:** Use dynamic imports and code splitting:
```typescript
// Lazy load LlamaIndex components
const initializeLlamaIndex = async () => {
const { VectorStoreIndex, SimpleDirectoryReader } = await import("llamaindex");
return { VectorStoreIndex, SimpleDirectoryReader };
};
// In your API route
export async function POST(request: NextRequest) {
const { VectorStoreIndex, SimpleDirectoryReader } = await initializeLlamaIndex();
// Use the imported modules
}
```
### Webpack/Vite Bundling Issues
**Problem:** Bundler compatibility issues
**Solution for Next.js:**
```javascript
// next.config.mjs
import withLlamaIndex from "llamaindex/next";
const nextConfig = {
webpack: (config, { isServer }) => {
// Custom webpack configuration if needed
if (!isServer) {
config.resolve.fallback = {
...config.resolve.fallback,
fs: false,
net: false,
tls: false,
};
}
return config;
},
};
export default withLlamaIndex(nextConfig);
```
**Solution for Vite:**
```typescript
// vite.config.ts
import { defineConfig } from 'vite';
export default defineConfig({
define: {
global: 'globalThis',
},
resolve: {
alias: {
// Add aliases for problematic modules
},
},
optimizeDeps: {
include: ['llamaindex'],
},
});
```
## Environment-Specific Issues
### Node.js Version Compatibility
**Problem:** Node.js version compatibility issues
**Solution:** Use supported Node.js versions:
```json
{
"engines": {
"node": ">=18.0.0"
}
}
```
**Check your Node.js version:**
```bash
node --version
```
### Cloudflare Workers Issues
**Problem:** Module not available in Cloudflare Workers
**Solution:** Use `@llamaindex/env` for environment compatibility:
```typescript
export default {
async fetch(request: Request, env: Env): Promise<Response> {
const { setEnvs } = await import("@llamaindex/env");
setEnvs(env);
// Your LlamaIndex code here
},
};
```
### Vercel Edge Runtime Issues
**Problem:** Limited Node.js API access in Edge Runtime
**Solution:** Use standard runtime or adapt code:
```typescript
// Force standard runtime
export const runtime = "nodejs";
// Or adapt for edge
export const runtime = "edge";
export async function POST(request: NextRequest) {
// Use edge-compatible code only
const { setEnvs } = await import("@llamaindex/env");
setEnvs(process.env);
// Avoid file system operations
// Use remote data sources
}
```
## Performance Issues
### Slow Query Responses
**Problem:** Slow query performance
**Solution:** Implement caching and optimization:
```typescript
import { LRUCache } from 'lru-cache';
const queryCache = new LRUCache<string, string>({
max: 100,
ttl: 1000 * 60 * 10, // 10 minutes
});
export async function optimizedQuery(question: string, queryEngine: any) {
// Check cache first
const cached = queryCache.get(question);
if (cached) return cached;
// Query and cache result
const result = await queryEngine.query(question);
queryCache.set(question, result);
return result;
}
```
### Cold Start Issues
**Problem:** Slow cold starts in serverless environments
**Solution:** Pre-warm your functions:
```typescript
// Pre-initialize outside handler
let cachedQueryEngine: any = null;
export async function handler(event: any) {
if (!cachedQueryEngine) {
cachedQueryEngine = await initializeQueryEngine();
}
// Use cached engine
return await cachedQueryEngine.query(question);
}
```
## Environment Variable Issues
### Missing API Keys
**Problem:** API key not found or invalid
**Solution:** Verify environment variable setup:
```typescript
// Check if API key is available
if (!process.env.OPENAI_API_KEY) {
throw new Error('OPENAI_API_KEY environment variable is required');
}
// For debugging (remove in production)
console.log('API Key present:', !!process.env.OPENAI_API_KEY);
```
### Environment Variable Loading
**Problem:** Environment variables not loading correctly
**Solution:** Use proper loading mechanisms:
```typescript
// For Node.js
import 'dotenv/config';
// For Next.js - use .env.local
// Variables are automatically loaded
// For Cloudflare Workers
export default {
async fetch(request: Request, env: Env): Promise<Response> {
// Use env parameter, not process.env
const apiKey = env.OPENAI_API_KEY;
// ...
},
};
```
## Common Error Messages
### "Cannot find module 'llamaindex'"
**Cause:** Package not installed or module resolution issue
**Solution:**
```bash
npm install llamaindex
```
### "Module not found: Can't resolve 'fs'"
**Cause:** File system modules used in browser/edge environment
**Solution:**
```typescript
// Use dynamic imports with fallbacks
const loadDocuments = async () => {
if (typeof window !== 'undefined') {
// Browser environment - use alternative
return await loadDocumentsFromAPI();
} else {
// Node.js environment - use file system
const { SimpleDirectoryReader } = await import('llamaindex');
return await new SimpleDirectoryReader('data').loadData();
}
};
```
### "ReferenceError: global is not defined"
**Cause:** Global polyfill missing in browser environments
**Solution:**
```typescript
// Add to your app entry point
if (typeof global === 'undefined') {
global = globalThis;
}
```
### "Cannot read properties of undefined (reading 'query')"
**Cause:** Query engine not properly initialized
**Solution:**
```typescript
// Always check initialization
if (!queryEngine) {
throw new Error('Query engine not initialized');
}
// Or use optional chaining
const response = await queryEngine?.query(question);
```
## Debugging Tips
### Enable Debug Logging
```typescript
// Enable debug logging
process.env.DEBUG = "llamaindex:*";
// Or specific modules
process.env.DEBUG = "llamaindex:vector-store";
```
### Check Package Versions
```bash
npm list llamaindex
npm list @llamaindex/openai
```
### Test in Isolation
```typescript
// Create minimal test case
import { VectorStoreIndex } from 'llamaindex';
async function testBasic() {
try {
console.log('Testing basic import...');
const index = new VectorStoreIndex();
console.log('Success!');
} catch (error) {
console.error('Error:', error);
}
}
testBasic();
```
## Getting Help
### Before Asking for Help
1. **Check this troubleshooting guide**
2. **Search existing GitHub issues**
3. **Try minimal reproduction**
4. **Check your environment configuration**
### When Reporting Issues
Include:
- Node.js version (`node --version`)
- Package versions (`npm list llamaindex`)
- Environment (Node.js, Cloudflare Workers, Vercel, etc.)
- Minimal code reproduction
- Full error message and stack trace
### Useful Resources
- [GitHub Issues](https://github.com/run-llama/LlamaIndexTS/issues)
- [Discord Community](https://discord.gg/dGcwcsnxhU)
- [Documentation](https://docs.llamaindex.ai/)
## Next Steps
If you're still experiencing issues:
1. **Check specific deployment guides:**
- [Server APIs](/docs/llamaindex/getting_started/installation/server-apis)
- [Serverless Functions](/docs/llamaindex/getting_started/installation/serverless)
- [Next.js Applications](/docs/llamaindex/getting_started/installation/nextjs)
2. **Open an issue** on GitHub with a minimal reproduction
3. **Join our Discord** for community support
@@ -1,110 +0,0 @@
---
title: With TypeScript
description: In this guide, you'll learn how to use LlamaIndex with TypeScript
---
LlamaIndex.TS is written in TypeScript and designed to be used in TypeScript projects.
We put a lot of work on strong typing to make sure you have a great typing experience with code completion such as:
```ts twoslash
import { PromptTemplate } from 'llamaindex'
const promptTemplate = new PromptTemplate({
template: `Context information from multiple sources is below.
---------------------
{context}
---------------------
Given the information from multiple sources and not prior knowledge.
Answer the query in the style of a Shakespeare play"
Query: {query}
Answer:`,
templateVars: ["context", "query"],
});
// @noErrors
promptTemplate.format({
c
//^|
})
```
## Enable TypeScript
Make sure to set [moduleResolution](https://www.typescriptlang.org/docs/handbook/modules/theory.html#module-resolution) in your `tsconfig.json` file:
```json5
{
compilerOptions: {
// ⬇️ add this line to your tsconfig.json
moduleResolution: "bundler", // or "nodenext" | "node16" | "node"
},
}
```
We recommend using `bundler` or `nodenext`, but due to popularity of `node`, we still added support for it, but with import path limitations.
So you may encounter type errors when importing sub paths from the `llamaindex` package like:
```ts
import { Settings } from "llamaindex";
```
The simplest way to fix this without changing `moduleResolution` is to import directly from `llamaindex`:
```ts
import { Settings } from "llamaindex";
```
## Enable AsyncIterable for `Web Stream` API
Some modules uses `Web Stream` API like `ReadableStream` and `WritableStream`, you need to enable `DOM.AsyncIterable` in your `tsconfig.json`.
```json5
{
compilerOptions: {
// ⬇️ add this lib to your tsconfig.json
lib: ["DOM.AsyncIterable"],
},
}
```
```typescript
import { agent, tool } from 'llamaindex'
import { openai } from "@llamaindex/openai";
Settings.llm = openai({
model: "gpt-4o-mini",
});
const addTool = tool({
name: "add",
description: "Adds two numbers",
parameters: z.object({x: z.number(), y: z.number()}),
execute: ({ x, y }) => x + y,
});
const myAgent = agent({
tools: [addTool],
});
// Chat with the agent
const context = myAgent.run("Hello, how are you?");
for await (const event of context) {
if (event instanceof AgentStream) {
for (const chunk of event.data.delta) {
process.stdout.write(chunk); // stream response
}
} else {
console.log(event); // other events
}
}
```
## Run TypeScript Script in Node.js
We recommend to use [tsx](https://www.npmjs.com/package/tsx) to run TypeScript script in Node.js.
```shell
node --import tsx ./my-script.ts
```
@@ -1,23 +0,0 @@
---
title: With Vite
description: In this guide, you'll learn how to use LlamaIndex with Vite
---
Before you start, make sure you have try LlamaIndex.TS in Node.js to make sure you understand the basics.
<Card
title="Getting Started with LlamaIndex.TS in Node.js"
href="/docs/llamaindex/getting_started/installation/node"
/>
Also, make sure you have a basic understanding of [Vite](https://vitejs.dev/).
## Why mention Vite?
Vite.js is widely used in building many web applications, like React.js, even for some native app like [Electron](https://www.electronjs.org/).
However, it's not a ready-to-use solution for a Node.js-like application using Vite, as Vite is designed for web applications(run in browser).
There's some plugin/framework based on Vite, like [Waku.gg](https://github.com/dai-shi/waku), or [Electron Vite](https://electron-vite.org/)
For now, we have no clear solution for bundling LlamaIndex.TS with Vite, if you have any idea/solution, please let us know.
@@ -1,4 +1,4 @@
{
"title": "Getting Started",
"pages": ["installation", "create_llama", "examples"]
"pages": ["concepts", "installation", "create_llama", "examples"]
}
+105 -8
View File
@@ -1,21 +1,118 @@
---
title: What is LlamaIndex.TS
description: LlamaIndex is the leading data framework for building LLM applications
title: Welcome to LlamaIndex.TS
description: LlamaIndex.TS is the leading framework for utilizing context engineering to build LLM applications in JavaScript and TypeScript.
---
LlamaIndex is a framework for building context-augmented generative AI applications with LLMs including agents and workflows.
LlamaIndex.TS is a **framework for utilizing context engineering to build generative AI applications** with large language models. From rapid-prototyping RAG chatbots to deploying multi-agent workflows in production, LlamaIndex gives you everything you need — all in idiomatic TypeScript.
The TypeScript implementation is designed for JavaScript server side applications using <SiNodedotjs className="inline" color="#5FA04E" /> Node.js, <SiDeno className="inline" color="#70FFAF" /> Deno, <SiBun className="inline" /> Bun, <SiCloudflareworkers className="inline" color="#F38020" /> Cloudflare Workers, and more.
Built for modern JavaScript runtimes like <SiNodedotjs className="inline" color="#5FA04E" /> **Node.js**, <SiDeno className="inline" color="#70FFAF" /> **Deno**, <SiBun className="inline" /> **Bun**, <SiCloudflareworkers className="inline" color="#F38020" /> **Cloudflare Workers**, and more.
LlamaIndex.TS provides tools for beginners, advanced users, and everyone in between.
<div className="grid grid-cols-1 gap-4 sm:grid-cols-2 lg:grid-cols-3 my-6">
<a href="#introduction" className="block rounded-lg border border-gray-600/40 p-4 hover:border-gray-400 hover:bg-gray-700/20 no-underline">
<h3 className="mb-1 text-lg font-semibold underline">Introduction</h3>
<p className="text-sm text-gray-400 no-underline">Context engineering, agents &amp; workflows — what do they mean?</p>
</a>
Try it out with a starter example using StackBlitz:
<a href="#use-cases" className="block rounded-lg border border-gray-600/40 p-4 hover:border-gray-400 hover:bg-gray-700/20 no-underline">
<h3 className="mb-1 text-lg font-semibold underline">Use cases</h3>
<p className="text-sm text-gray-400 no-underline">See what you can build with LlamaIndex.TS.</p>
</a>
<a href="#getting-started" className="block rounded-lg border border-gray-600/40 p-4 hover:border-gray-400 hover:bg-gray-700/20 no-underline">
<h3 className="mb-1 text-lg font-semibold underline">Getting started</h3>
<p className="text-sm text-gray-400 no-underline">Your first app in 5 lines of code.</p>
</a>
<a href="https://docs.cloud.llamaindex.ai/" className="block rounded-lg border border-gray-600/40 p-4 hover:border-gray-400 hover:bg-gray-700/20 no-underline" target="_blank" rel="noopener noreferrer">
<h3 className="mb-1 text-lg font-semibold underline">LlamaCloud</h3>
<p className="text-sm text-gray-400 no-underline">Managed parsing, extraction &amp; retrieval pipelines.</p>
</a>
<a href="#community" className="block rounded-lg border border-gray-600/40 p-4 hover:border-gray-400 hover:bg-gray-700/20 no-underline">
<h3 className="mb-1 text-lg font-semibold underline">Community</h3>
<p className="text-sm text-gray-400 no-underline">Join thousands of builders on Discord, Twitter, and more.</p>
</a>
<a href="#related-projects" className="block rounded-lg border border-gray-600/40 p-4 hover:border-gray-400 hover:bg-gray-700/20 no-underline">
<h3 className="mb-1 text-lg font-semibold underline">Related projects</h3>
<p className="text-sm text-gray-400 no-underline">Connectors, demos &amp; starter kits.</p>
</a>
</div>
## Introduction
### What are agents?
[Agents](/docs/llamaindex/tutorials/agents/1_setup) are LLM-powered assistants that can reason, use external tools, and take actions to accomplish tasks such as research, data extraction, and automation.
LlamaIndex.TS provides foundational building blocks for creating and orchestrating these agents.
### What are workflows?
[Workflows](/docs/llamaindex/tutorials/workflows) are multi-step, event-driven processes that combine agents, data connectors, and other tools to solve complex problems.
With LlamaIndex.TS you can chain together retrieval, generation, and tool-calling steps and then deploy the entire pipeline as a microservice.
### What is context engineering?
LLMs come pre-trained on vast public corpora, but not on **your** private or domain-specific data.
Context engineering bridges that gap by injecting the right pieces of your data into the LLM prompt at the right time.
The most popular example is [Retrieval-Augmented Generation (RAG)](/docs/llamaindex/getting_started/concepts), but the same idea powers agent memory, evaluation, extraction, summarisation, and more.
LlamaIndex.TS gives you:
- **Data connectors** to ingest from APIs, files, SQL, and dozens more sources.
- **Indexes & retrievers** to store and retrieve your data for LLM consumption.
- **Agents and Engines** to query and use chat+reasoning interfaces over your data.
- **Workflows** for fine-grained orchestration of your data and LLM-powered agents.
- **Observability** integrations so you can iterate with confidence.
You can learn more about these concepts in our [concepts guide](/docs/llamaindex/getting_started/concepts).
## Use cases
Popular scenarios include:
- [LLM-Powered Agents](/docs/llamaindex/tutorials/agents/1_setup)
- [Indexing and Retrieval](/docs/llamaindex/tutorials/rag)
- [Extracting Structured Data](/docs/llamaindex/tutorials/structured_data_extraction)
- [Custom Orchestration with Workflows](/docs/llamaindex/tutorials/workflows)
## Getting started
The fastest way to get started is in StackBlitz below — no local setup required:
<iframe
className="w-full h-[440px]"
aria-label="LlamaIndex.TS Starter"
aria-description="This is a starter example for LlamaIndex.TS, it shows the basic usage of the library."
aria-description="Interactive starter for LlamaIndex.TS"
src="https://stackblitz.com/github/run-llama/LlamaIndexTS/tree/main/examples?embed=1&file=starter.ts"
/>
You'll need an OpenAI API key to run this example. You can retrieve it from [OpenAI](https://platform.openai.com/api-keys).
Want to learn more? We have several tutorials to get you started:
- [Installation + Runtime Guide](/docs/llamaindex/getting_started/installation)
- [Create your first agent](/docs/llamaindex/tutorials/agents/1_setup)
- [Learn how to index data and chat with it](/docs/llamaindex/tutorials/rag)
- [Learn how to write your own workflows and agents](/docs/llamaindex/tutorials/workflows)
---
## LlamaCloud
Need an end-to-end managed pipeline? Check out **[LlamaCloud](https://cloud.llamaindex.ai/)**: best-in-class document parsing (LlamaParse), extraction (LlamaExtract), and indexing services with generous free tiers.
---
## Community
- [Twitter](https://twitter.com/llama_index)
- [Discord](https://discord.gg/dGcwcsnxhU)
- [LinkedIn](https://www.linkedin.com/company/llamaindex/)
We 💜 contributors! View our [contributing guide](https://github.com/run-llama/LlamaIndexTS/blob/main/CONTRIBUTING.md) to get started.
## Related projects
- [Python framework GitHub](https://github.com/run-llama/llama_index)
- [Python docs](https://docs.llamaindex.ai/)
- [create-llama](https://www.npmjs.com/package/create-llama) — scaffold a new project in seconds!
- [UI Components](https://ui.llamaindex.ai/) — build chat applications with our Next.js components.
@@ -0,0 +1,85 @@
---
title: MCP Toolbox For Databases
description: MCP Toolbox for Databases is an open source MCP server for databases.
---
# MCP Toolbox for Databases
[MCP Toolbox for Databases](https://github.com/googleapis/genai-toolbox) is an open source MCP server for databases. It was designed with enterprise-grade and production-quality in mind. It enables you to develop tools easier, faster, and more securely by handling the complexities such as connection pooling, authentication, and more.
Toolbox Tools can be seemlessly integrated with LlamaIndex applications. For more
information on [getting
started](https://googleapis.github.io/genai-toolbox/getting-started/local_quickstart_js/) or
[configuring](https://googleapis.github.io/genai-toolbox/getting-started/configure/)
Toolbox, see the
[documentation](https://googleapis.github.io/genai-toolbox/getting-started/introduction/).
![architecture](/images/mcp_db_toolbox.png)
### Configure and deploy
Toolbox is an open source server that you deploy and manage yourself. For more
instructions on deploying and configuring, see the official Toolbox
documentation:
* [Installing the Server](https://googleapis.github.io/genai-toolbox/getting-started/introduction/#installing-the-server)
* [Configuring Toolbox](https://googleapis.github.io/genai-toolbox/getting-started/configure/)
### Install client SDK
LlamaIndex relies on the `@toolbox-sdk/core` node package to use Toolbox. Install the
package before getting started:
```shell
npm install @toolbox-sdk/core
```
### Loading Toolbox Tools
Once your Toolbox server is configured and up and running, you can load tools
from your server using the SDK:
```javascript
import { gemini, GEMINI_MODEL } from "@llamaindex/google";
import { agent } from "@llamaindex/workflow";
import { tool } from "llamaindex";
import { ToolboxClient } from "@toolbox-sdk/core";
// Initialize LLM
const llm = gemini({
model: GEMINI_MODEL.GEMINI_2_0_FLASH,
apiKey: process.env.GOOGLE_API_KEY,
});
// Replace with your Toolbox Server URL
const URL = 'https://127.0.0.1:5000';
const client = new ToolboxClient("http://127.0.0.1:5000");
const toolboxTools = await client.loadToolset("my-toolset");
const getTool = (toolboxTool) => tool({
name: toolboxTool.getName(),
description: toolboxTool.getDescription(),
parameters: toolboxTool.getParamSchema(),
execute: toolboxTool
});
const tools = toolboxTools.map(getTool);
const myAgent = agent({
tools: tools,
llm,
memory,
systemPrompt: prompt,
});
const result = await myAgent.run(query);
console.log(result);
```
### Advanced Toolbox Features
Toolbox has a variety of features to make developing Gen AI tools for databases seamless.
For more information, read more about the following:
- [Authenticated Parameters](https://googleapis.github.io/genai-toolbox/resources/tools/#authenticated-parameters): bind tool inputs to values from OIDC tokens automatically, making it easy to run sensitive queries without potentially leaking data
- [Authorized Invocations](https://googleapis.github.io/genai-toolbox/resources/tools/#authorized-invocations): restrict access to use a tool based on the users Auth token
- [OpenTelemetry](https://googleapis.github.io/genai-toolbox/how-to/export_telemetry/): get metrics and tracing from Toolbox with [OpenTelemetry](https://opentelemetry.io/docs/)
@@ -1,5 +1,5 @@
{
"title": "Integration",
"description": "See our integrations",
"pages": ["open-llm-metry", "lang-trace", "vercel"]
"pages": ["open-llm-metry", "lang-trace", "mcp-toolbox", "vercel"]
}
@@ -12,7 +12,8 @@ Agent Workflows are a powerful system that enables you to create and orchestrate
The simplest use case is creating a single agent with specific tools. Here's an example of creating an assistant that tells jokes:
```typescript
import { agent, tool } from "llamaindex";
import { tool } from "llamaindex";
import { agent } from "@llamaindex/workflow";
import { openai } from "@llamaindex/openai";
// Define a joke-telling tool
@@ -32,7 +33,8 @@ const jokeAgent = agent({
// Run the workflow
const result = await jokeAgent.run("Tell me something funny");
console.log(result); // Baby Llama is called cria
console.log(result.data.result); // Baby Llama is called cria
console.log(result.data.message); // { role: 'assistant', content: 'Baby Llama is called cria' }
```
### Event Streaming
@@ -40,17 +42,17 @@ console.log(result); // Baby Llama is called cria
Agent Workflows provide a unified interface for event streaming, making it easy to track and respond to different events during execution:
```typescript
import { AgentToolCall, AgentStream } from "llamaindex";
import { agentToolCallEvent, agentStreamEvent } from "@llamaindex/workflow";
// Get the workflow execution context
const context = workflow.run("Tell me something funny");
const events = jokeAgent.runStream("Tell me something funny");
// Stream and handle events
for await (const event of context) {
if (event instanceof AgentToolCall) {
for await (const event of events) {
if (agentToolCallEvent.include(event)) {
console.log(`Tool being called: ${event.data.toolName}`);
}
if (event instanceof AgentStream) {
if (agentStreamEvent.include(event)) {
process.stdout.write(event.data.delta);
}
}
@@ -68,7 +70,8 @@ An Agent Workflow can orchestrate multiple agents, enabling complex interactions
Here's an example of a multi-agent system that combines joke-telling and weather information:
```typescript
import { multiAgent, agent, tool } from "llamaindex";
import { tool } from "llamaindex";
import { multiAgent, agent } from "@llamaindex/workflow";
import { openai } from "@llamaindex/openai";
import { z } from "zod";
@@ -110,6 +113,7 @@ const agents = multiAgent({
const result = await agents.run(
"Give me a morning greeting with a joke and the weather in San Francisco"
);
console.log(result.data.result);
```
The workflow will coordinate between agents, allowing them to handle different aspects of the request and hand off tasks when appropriate.
@@ -0,0 +1,164 @@
---
title: Low-Level LLM Execution
---
Sometimes your need more control over LLM interactions than what high-level agents provide. The `llm.exec` method makes it simple for you to make a single LLM call with tools but hides the complexity of executing the tools and generating the tool messages.
## When to Use `llm.exec`
Use `llm.exec` when you need to:
- Build custom agent logic in [workflow](/docs/llamaindex/modules/agents/workflows) steps
- Have precise control over message handling and tool execution
## Basic Usage
The `llm.exec` method takes messages and tools as parameter and executes one LLM call.
The LLM might either request to call one or more of the tools or generate an assistant message as result.
For each tool call that is requested, `llm.exec` executes it and generates the two tool call messages (call and result). If no tool call is requested, just the assistant message is returned.
```ts
import { openai } from "@llamaindex/openai";
import { ChatMessage, tool } from "llamaindex";
import z from "zod";
const llm = openai({ model: "gpt-4.1-mini" });
const messages = [
{
content: "What's the weather like in San Francisco?",
role: "user",
} as ChatMessage,
];
const { newMessages, toolCalls } = await llm.exec({
messages,
tools: [
tool({
name: "get_weather",
description: "Get the current weather for a location",
parameters: z.object({
address: z.string().describe("The address"),
}),
execute: ({ address }) => {
return `It's sunny in ${address}!`;
},
}),
],
});
// Add the new messages (including tool calls and responses) to your conversation
messages.push(...newMessages);
```
> `newMessages` is an array as each tool call generates two messages: a tool call message and the tool call result message.
## Agent Loop Pattern
A common pattern is to use `llm.exec` in a loop until the LLM stops making tool calls:
```ts
import { openai } from "@llamaindex/openai";
import { ChatMessage, tool } from "llamaindex";
import z from "zod";
async function runAgentLoop() {
const llm = openai({ model: "gpt-4.1-mini" });
const messages = [
{
content: "What's the weather like in San Francisco?",
role: "user",
} as ChatMessage,
];
let exit = false;
do {
const { newMessages, toolCalls } = await llm.exec({
messages,
tools: [
tool({
name: "get_weather",
description: "Get the current weather for a location",
parameters: z.object({
address: z.string().describe("The address"),
}),
execute: ({ address }) => {
return `It's sunny in ${address}!`;
},
}),
],
});
console.log(newMessages);
messages.push(...newMessages);
// Exit when no more tool calls are made
exit = toolCalls.length === 0;
} while (!exit);
}
```
## Streaming Support
For real-time responses, use the `stream` option to get the assistant's response as streamed tokens:
```ts
import { openai } from "@llamaindex/openai";
import { tool } from "llamaindex";
import z from "zod";
async function streamingAgentLoop() {
const llm = openai({ model: "gpt-4o-mini" });
const messages = [
{
content: "What's the weather like in San Francisco?",
role: "user",
} as ChatMessage,
];
let exit = false;
do {
const { stream, newMessages, toolCalls } = await llm.exec({
messages,
tools: [
tool({
name: "get_weather",
description: "Get the current weather for a location",
parameters: z.object({
address: z.string().describe("The address"),
}),
execute: ({ address }) => {
return `It's sunny in ${address}!`;
},
}),
],
stream: true,
});
// Stream the response token by token
for await (const chunk of stream) {
process.stdout.write(chunk.delta);
}
messages.push(...newMessages());
exit = toolCalls.length === 0;
} while (!exit);
}
```
> `newMessages` is a function when streaming. The reason is that the result only is available after streaming. Calling it before, will throw an error.
## Return Values
`llm.exec` returns an object with:
- **`newMessages`**: Array of new chat messages including the LLM response and any tool call messages (call or result). This is a function return the array when streaming.
- **`toolCalls`**: Array of tool calls made by the LLM
- **`stream`**: Async iterable for streaming responses (only when `stream: true`)
## Best Practices
For using `llm.exec` in an agent loop, take care to:
1. **Maintain message history**: Always add `newMessages` to your conversation history
2. **Set exit conditions**: Implement proper logic to avoid infinite loops
@@ -1,4 +1,10 @@
{
"title": "Agents",
"pages": ["tool", "agent_workflow", "workflows"]
"pages": [
"tool",
"agent_workflow",
"workflows",
"low-level",
"natural_language_workflow"
]
}
@@ -0,0 +1,103 @@
---
title: Define workflows using natural language
---
When working with Workflows, you have to write code to handle an event in the workflow.
Often, the logic of the handler is not too complex so that it can be expressed using natural language and executed by an LLM.
Besides the instructions, we just need the expected result event of the step, possible tool calls and optionally other events that can be emitted.
## Usage
Let's take an example of a workflow that generates a joke, gets a critique for it, and then improves it.
### Define the events
First, we define the events for our workflow. We need one for writing the joke, one for critiquing it, and one for the final result:
```typescript
import { z } from "zod";
import { zodEvent } from "@llamaindex/workflow";
const writeJokeSchema = z.object({
description: z
.string()
.describe("The topic to write a joke or describe the joke to improve."),
writtenJoke: z.optional(z.string()).describe("The written joke."),
retriedTimes: z
.number()
.default(0)
.describe(
"The retried times for writing the joke. Always increase this from the input retriedTimes.",
),
});
const critiqueSchema = z.object({
joke: z.string().describe("The joke to critique"),
retriedTimes: z.number().describe("The retried times for writing the joke."),
});
const finalResultSchema = z.object({
joke: z.string().describe("The joke to critique"),
critique: z.string().describe("The critique of the joke"),
});
const writeJokeEvent = zodEvent(writeJokeSchema, {
debugLabel: "writeJokeEvent",
});
const critiqueEvent = zodEvent(critiqueSchema, {
debugLabel: "critiqueEvent",
});
const finalResultEvent = zodEvent(finalResultSchema, {
debugLabel: "finalResultEvent",
});
```
Note that your natural language workflows the events need to be created by the `zodEvent` function passing the zod schema as an argument. The agent needs the schema of the event data to correctly generate events.
Also, we need a `debugLabel` so the LLM can identify the event to emit in the workflow.
### Define the workflow
As usual you first create the workflow:
```typescript
import { agentHandler, createWorkflow } from "@llamaindex/workflow";
const jokeFlow = createWorkflow();
```
Then you need to handle the events. For the handlers, instead of code, you're now going to use natural language by calling the `agentHandler` function.
It only requires two parameters:
- `instructions`: A prompt to guide the agent how to handle the steps.
- `results`: The output events that the agent should return after handling the step.
Then you will have a simple code to handle the step:
```typescript
jokeFlow.handle(
[writeJokeEvent],
agentHandler({
instructions: `You are a joke writer. You are given a topic and you need to write a joke about it.`,
results: [critiqueEvent],
}),
);
jokeFlow.handle(
[critiqueEvent],
agentHandler({
instructions: `
You are given a joke and you need to critique it. Follow the following guidelines:
1. You have maximum 3 times to improve the joke.
2. If the joke is not good, increase the retriedTimes, describe how to improve the joke and send a writeJokeEvent.
3. If the joke is good, trigger the finalResultEvent event.
`,
results: [writeJokeEvent, finalResultEvent],
}),
);
```
For advanced usage, you can add more functionality to `agentHandler` by using these parameters:
- `events`: A list of additional events that the agent can emit to the workflow. E.g., your agent can emit a `uiEvent` to update the UI during the execution.
- `tools`: A list of tools that the agent can use to handle the step. E.g., your agent can use a `search` tool to search the web.
You can find more code examples in the [examples](https://github.com/run-llama/LlamaIndexTS/tree/main/examples/agents/natural) folder.
@@ -17,7 +17,8 @@ The `parameters` field in the tool configuration is defined using `zod`, a TypeS
Example:
```ts
import { agent, tool } from "llamaindex";
import { tool } from "llamaindex";
import { agent } from "@llamaindex/workflow";
import { z } from "zod";
// first arg is LLM input, second is bound arg
@@ -46,7 +47,7 @@ In this example, `z.object` is used to define a schema for the `parameters` wher
You can import built-in tools from the `@llamaindex/tools` package.
```ts
import { agent } from "llamaindex";
import { agent } from "@llamaindex/workflow";
import { wiki } from "@llamaindex/tools";
const researchAgent = agent({
@@ -64,7 +65,7 @@ If you have a MCP server running, you can fetch tools from the server and use th
```ts
// 1. Import MCP tools adapter
import { mcp } from "@llamaindex/tools";
import { agent } from "llamaindex";
import { agent } from "@llamaindex/workflow";
// 2. Initialize a MCP client
// by npx
@@ -73,12 +74,21 @@ const server = mcp({
args: ["-y", "@modelcontextprotocol/server-filesystem", "."],
verbose: true,
});
// or by SSE
// or by StreamableHTTP transport
const server = mcp({
url: "http://localhost:8000/mcp",
verbose: true,
});
// if your MCP server is not using StreamableHTTP transport, you can also use SSE transport
// by setting useSSETransport to true.
// See: https://modelcontextprotocol.io/docs/concepts/transports#server-sent-events-sse-deprecated
const server = mcp({
url: "http://localhost:8000/mcp",
useSSETransport: true,
verbose: true,
});
// 3. Get tools from MCP server
const tools = await server.tools();
@@ -91,6 +101,9 @@ const agent = agent({
});
```
You can also use [MCP Toolbox for
Databases](/docs/llamaindex/integration/mcp-toolbox) to interact with MCP tools.
## Function tool
@@ -114,7 +127,8 @@ Note: calling the `bind` method will return a new `FunctionTool` instance, witho
Example to pass a `userToken` as additional argument:
```ts
import { agent, tool } from "llamaindex";
import { tool } from "llamaindex";
import { agent } from "@llamaindex/workflow";
// first arg is LLM input, second is bound arg
const queryKnowledgeBase = async ({ question }, { userToken }) => {
@@ -6,256 +6,16 @@ A `Workflow` in LlamaIndex is a lightweight, event-driven abstraction used to ch
Workflows are designed to be flexible and can be used to build agents, RAG flows, extraction flows, or anything else you want to implement.
To use workflows install this package:
```package-install
npm i @llama-flow/core @llamaindex/openai
npm i @llamaindex/workflow-core
```
## Getting Started
This contains the core functionality for the workflow system. You can read more about the core concepts in the [workflow-core](/docs/workflows) section.
Let's explore a simple workflow example where a joke is generated and then critiqued and iterated on:
In contrast, the `@llamaindex/workflow` package contains more utiltities, such as prebuilt agents.
```typescript
import { OpenAI } from "@llamaindex/openai";
import { createWorkflow, workflowEvent } from "@llama-flow/core";
import { withStore } from "@llama-flow/core/middleware/store";
// Create LLM instance
const llm = new OpenAI({ model: "gpt-4.1-mini", apiKey: "..."});
// Define our workflow events
const startEvent = workflowEvent<string>(); // Input topic for joke
const jokeEvent = workflowEvent<{ joke: string }>(); // Intermediate joke
const critiqueEvent = workflowEvent<{ joke: string, critique: string }>(); // Intermediate critique
const resultEvent = workflowEvent<{ joke: string, critique: string }>(); // Final joke + critique
// Create our workflow
const jokeFlow = withStore(
() => ({
numIterations: 0,
maxIterations: 3,
}),
createWorkflow()
);
// Define handlers for each step
jokeFlow.handle([startEvent], async (event) => {
// Prompt the LLM to write a joke
const prompt = `Write your best joke about ${event.data}. Write the joke between <joke> and </joke> tags.`;
const response = await llm.complete({ prompt });
// Parse the joke from the response
const joke = response.text.match(/<joke>([\s\S]*?)<\/joke>/)?.[1]?.trim() ?? response.text;
return jokeEvent.with({ joke: joke });
});
jokeFlow.handle([jokeEvent], async (event) => {
// Prompt the LLM to critique the joke
const prompt = `Give a thorough critique of the following joke. If the joke needs improvement, put "IMPROVE" somewhere in the critique: ${event.data.joke}`;
const response = await llm.complete({ prompt });
// If the critique includes "IMPROVE", keep iterating, else, return the result
if (response.text.includes("IMPROVE")) {
return critiqueEvent.with({ joke: event.data.joke, critique: response.text });
}
return resultEvent.with({ joke: event.data.joke, critique: response.text });
});
jokeFlow.handle([critiqueEvent], async (event) => {
// Keep track of the number of iterations
const store = jokeFlow.getStore();
store.numIterations++;
// Write a new joke based on the previous joke and critique
const prompt = `Write a new joke based on the following critique and the original joke. Write the joke between <joke> and </joke> tags.\n\nJoke: ${event.data.joke}\n\nCritique: ${event.data.critique}`;
const response = await llm.complete({ prompt });
// Parse the joke from the response
const joke = response.text.match(/<joke>([\s\S]*?)<\/joke>/)?.[1]?.trim() ?? response.text;
// If we've done less than the max number of iterations, keep iterating
// else, return the result
if (store.numIterations < store.maxIterations) {
return jokeEvent.with({ joke: joke });
}
return resultEvent.with({ joke: joke, critique: event.data.critique });
});
// Usage
async function main() {
const { stream, sendEvent } = jokeFlow.createContext();
sendEvent(startEvent.with("pirates"));
let result: { joke: string, critique: string } | undefined;
for await (const event of stream) {
// console.log(event.data); optionally log the event data
if (resultEvent.include(event)) {
result = event.data;
break; // Stop when we get the final result
}
}
console.log(result);
}
main().catch(console.error);
```package-install
npm i @llamaindex/workflow
```
There are a few moving pieces here, so let's go through this step by step.
### Defining Workflow Events
```typescript
const startEvent = workflowEvent<string>(); // Input topic for joke
const jokeEvent = workflowEvent<{ joke: string }>(); // Intermediate joke
const critiqueEvent = workflowEvent<{ joke: string, critique: string }>(); // Intermediate critique
const resultEvent = workflowEvent<{ joke: string, critique: string }>(); // Final joke + critique
```
Events are defined using the `workflowEvent` function and contain arbitrary data provided as a generic type. In this example, we have four events:
- `startEvent`: Takes a string input (the joke topic)
- `jokeEvent`: Contains an object with a joke property
- `critiqueEvent`: Contains both the joke and its critique, used for the feedback loop
- `resultEvent`: Contains the final joke and critique after any iterations
### Setting up the Workflow with Store Middleware
```typescript
const jokeFlow = withStore(
() => ({
numIterations: 0,
maxIterations: 3,
}),
createWorkflow()
);
```
Our workflow is implemented using the `createWorkflow()` function, enhanced with the `withStore` middleware. The store provides shared state across all handlers, which in this case tracks:
- `numIterations`: Counts how many iterations of joke improvement we've done
- `maxIterations`: Sets a limit to prevent infinite loops
This store will be accesible within workflows by using the `jokeFlow.getStore()` function.
### Adding Handlers with Loops
We have three key handlers in our workflow:
1. The first handler processes the `startEvent`, generates an initial joke, and emits a `jokeEvent`:
```typescript
jokeFlow.handle([startEvent], async (event) => {
// Prompt the LLM to write a joke
const prompt = `Write your best joke about ${event.data}. Write the joke between <joke> and </joke> tags.`;
const response = await llm.complete({ prompt });
// Parse the joke from the response
const joke = response.text.match(/<joke>([\s\S]*?)<\/joke>/)?.[1]?.trim() ?? response.text;
return jokeEvent.with({ joke: joke });
});
```
2. The second handler handles the `jokeEvent`, critiques the joke, and either:
- Emits a `critiqueEvent` if the joke needs improvement
- Emits a `resultEvent` if the joke is good enough
```typescript
jokeFlow.handle([jokeEvent], async (event) => {
// Prompt the LLM to critique the joke
const prompt = `Give a thorough critique of the following joke. If the joke needs improvement, put "IMPROVE" somewhere in the critique: ${event.data.joke}`;
const response = await llm.complete({ prompt });
// If the critique includes "IMPROVE", keep iterating, else, return the result
if (response.text.includes("IMPROVE")) {
return critiqueEvent.with({ joke: event.data.joke, critique: response.text });
}
return resultEvent.with({ joke: event.data.joke, critique: response.text });
});
```
3. The third handler processes the `critiqueEvent`, generates an improved joke based on the critique, and either:
- Loops back to the joke evaluation (if under the iteration limit)
- Emits the final `resultEvent` (if iteration limit reached)
```typescript
jokeFlow.handle([critiqueEvent], async (event) => {
// Keep track of the number of iterations
const store = jokeFlow.getStore();
store.numIterations++;
// Write a new joke based on the previous joke and critique
const prompt = `Write a new joke based on the following critique and the original joke. Write the joke between <joke> and </joke> tags.\n\nJoke: ${event.data.joke}\n\nCritique: ${event.data.critique}`;
const response = await llm.complete({ prompt });
// Parse the joke from the response
const joke = response.text.match(/<joke>([\s\S]*?)<\/joke>/)?.[1]?.trim() ?? response.text;
// If we've done less than the max number of iterations, keep iterating
// else, return the result
if (store.numIterations < store.maxIterations) {
return jokeEvent.with({ joke: joke });
}
return resultEvent.with({ joke: joke, critique: event.data.critique });
});
```
### Running the Workflow
```typescript
async function main() {
const { stream, sendEvent } = jokeFlow.createContext();
sendEvent(startEvent.with("pirates"));
let result: { joke: string, critique: string } | undefined;
for await (const event of stream) {
// console.log(event.data); optionally log the event data
if (resultEvent.include(event)) {
result = event.data;
break; // Stop when we get the final result
}
}
console.log(result);
}
```
To run the workflow, we:
1. Create a workflow context with `createContext()`
2. Trigger the initial event with `sendEvent()`
3. Listen to the event stream and process events as they arrive
4. Use `include()` to check if an event is of a specific type
5. Break the loop when we receive our final result
### Using Stream Utilities
Workflows provide utility functions to make working with event streams easier:
```typescript
import { collect } from "@llama-flow/core/stream/consumer";
import { until } from "@llama-flow/core/stream/until";
// Create a workflow context and send the initial event
const { stream, sendEvent } = jokeFlow.createContext();
sendEvent(startEvent.with("pirates"));
// Collect all events until we get a resultEvent
const allEvents = await collect(until(stream, resultEvent));
// The last event will be the resultEvent
const finalEvent = allEvents[allEvents.length - 1];
console.log(finalEvent.data); // Output the joke and critique
```
The stream utilities make it easier to work with the asynchronous event flow. In this example, we use:
- `collect`: Aggregates all events into an array
- `until`: Creates a stream that emits events until a condition is met (in this case, until a resultEvent is received)
You can combine these utilities with other stream operators like `filter` and `map` to create powerful processing pipelines.
## Next Steps
To learn more about workflows, check out [the documentation in the tutorial section](../../../llamaflow).
@@ -0,0 +1,228 @@
---
title: Memory
description: Manage conversation history and context with agents
---
## Concept
Memory is a core component of agentic systems. It allows you to store and retrieve information from the past.
In LlamaIndexTS, you can create memory by using the `createMemory` function. This function will return a `Memory` object, which you can then use to store and retrieve information.
As the agent runs, it will make calls to `add()` to store information, and `get()` to retrieve information.
## Usage
A `Memory` object has both short-term memory (i.e. a FIFO queue of messages) and optionally long-term memory (i.e. extracting information over time).
`get()` always returns all messages stored in the memory. The longer the agent runs, this will exceed the context window of the agent. To avoid this, the agent is using the `getLLM` method to get the last X messages that fit into the context window.
### Configuring Memory for an Agent
Here we're creating a memory with a static block (read more about [memory blocks](#long-term-memory)) that contains some information about the user.
```ts twoslash
import { openai } from "@llamaindex/openai";
import { agent } from "@llamaindex/workflow";
import { createMemory, staticBlock } from "llamaindex";
const llm = openai({ model: "gpt-4.1-mini" });
// Create memory with predefined context
const memory = createMemory({
memoryBlocks: [
staticBlock({
content:
"The user is a software engineer who loves TypeScript and LlamaIndex.",
}),
],
});
// Create an agent with the memory
const workflow = agent({
name: "assistant",
llm,
memory,
});
const result = await workflow.run("What is my name?");
console.log("Response:", result.data.result);
```
### Using Vercel format
You can also put messages in Vercel format directly to the memory:
```ts
await memory.add({
id: "1",
createdAt: new Date(),
role: "user",
content: "Hello!",
options: {
parts: [
{
type: "file",
data: "base64...",
mimeType: "image/png",
},
],
},
});
```
If you call `get`, messages are usually retrieved in the LlamaIndexTS format (type `ChatMessage`). If you specify the `type` parameter using `get`, you can return the messages in different formats. E.g.: using `type: "vercel"`, you can return the messages in Vercel format:
```ts
const messages = await memory.get({ type: "vercel" });
console.log(messages);
```
## Customizing Memory
### Short-Term Memory
The `Memory` object will store all the messages that are added to the `Memory` object. Unless you call `clear()`, no messages are removed from the memory. This is the short-term memory (usually you will store the memory of one user session there) which is augmented by the long-term memory.
Calling `getLLM` will retrieve messages from long-term memory and ensure that the given `tokenLimit` is not reached. These are the messages that you will sent to the LLM.
For initialization, you call `createMemory` with the following options:
- `tokenLimit`: Maximum tokens for memory retrieval using `getLLM` (default: 30000).
- `shortTermTokenLimitRatio`: Ratio of tokens for short-term vs long-term memory (default: 0.7)
- `customAdapters`: Custom message adapters for different message formats. LlamaIndex (`ChatMessageAdapter`) and Vercel (`VercelMessageAdapter`) are built-in adapters.
- `memoryBlocks`: Memory blocks for long-term storage, see [Long-Term Memory](#long-term-memory)
Example:
```ts
const memory = createMemory({
tokenLimit=40000,
shortTermTokenLimitRatio=0.5,
});
```
### Long-Term Memory
Long-term memory is represented as `Memory Block` objects. These objects contain information that are from previous user sessions or from the beginning of the current conversation. When memory is retrieved (by calling `getLLM`), the short-term and long-term memories are merged together within the given `tokenLimit`.
Currently, there are three predefined memory blocks:
- `staticBlock`: A memory block that stores a static piece of information.
- `factExtractionBlock`: A memory block that extracts facts from the chat history.
- `vectorBlock`: A memory block that stores and retrieves chat messages from a vector database using semantic similarity search. Messages are stored individually and retrieved based on their relevance to recent conversation context. Here we've passed in the `vectorStore` to use to store and retrieve the chat messages.
This sounds a bit complicated, but it's actually quite simple. Let's look at an example:
```ts
import { createMemory, factExtractionBlock, staticBlock, vectorBlock } from "llamaindex";
import { QdrantVectorStore } from "@llamaindex/qdrant";
import { OpenAIEmbedding } from "@llamaindex/openai";
const memoryBlocks= [
staticBlock({
content: "My name is Logan, and I live in Saskatoon. I work at LlamaIndex.",
}),
factExtractionBlock({
priority: 1,
llm: llm,
maxFacts: 50,
}),
vectorBlock({
vectorStore: new QdrantVectorStore({ url: "http://localhost:6333" }),
priority: 2,
}),
];
```
Here, we've setup three memory blocks:
- `staticBlock`: A static memory block that stores some core information about the user. This information will always be inserted into the memory. The type used is `MessageContent` to support multi-modal content.
- `factExtractionBlock`: An extracted memory block that will extract information from the chat history. Here we've passed in the `llm` to use to extract facts from the chat history, and set the `maxFacts` to 50. If the number of extracted facts exceeds this limit, the `maxFacts` will be automatically summarized and reduced to leave room for new information.
- `vectorBlock`: A vector memory block that will store in a vector database and retrieve them from there. Messages are stored individually and retrieved based on their relevance to recent conversation context. Here we've passed in the `vectorStore` to use to store and retrieve the chat messages.
You'll also notice that we've set the `priority` for the `factExtractionBlock` block. This is used to determine the handling when the memory blocks content (i.e. long-term memory) + short-term memory exceeds the token limit on the `Memory` object.
- `priority=0`: This block will always be kept in memory (`staticBlocks` always have priority 0.)
- `priority=1, 2, 3, etc`: This determines the order in which memory blocks are truncated when the memory exceeds the token limit, to help the overall short-term memory + long-term memory content be less than or equal to the `tokenLimit`.
Now, let's pass these blocks into the `createMemory` function:
```ts
const memory = createMemory({
tokenLimit: 40000,
memoryBlocks: memoryBlocks,
)
```
When memory is retrieved (using `getLLM`), the short-term and long-term memories are merged together. The `Memory` object will ensure that the short-term memory + long-term memory content is less than or equal to the `tokenLimit`. If it is longer, messages are retrieved in the following order:
1. StaticMemoryBlock (information always included)
2. LongTermMemoryBlock (depending on priority)
3. ShortTermMemoryBlock
4. Transient messages
The amount of short-term memory included is specified by the `shortTermTokenLimitRatio`. If it's set to `0.7`, 70% of the `tokenLimit` is used for short-term memory (not including the static memory block).
#### VectorBlock Configuration Options
The `vectorBlock` offers several configuration options to customize its behavior:
```ts
vectorBlock({
vectorStore: new QdrantVectorStore({ url: "http://localhost:6333" }),
priority: 2,
retrievalContextWindow: 5, // Number of recent messages to use for context when retrieving
formatTemplate: new PromptTemplate({ template: "Context: {{ context }}" }), // Custom formatting template
nodePostprocessors: [/* custom postprocessors */], // Apply processing to retrieved nodes
queryOptions: {
similarityTopK: 3, // Number of top similar results to return (default: 2)
mode: VectorStoreQueryMode.DEFAULT, // Query mode for the vector store
sessionFilterKey: "session_id", // Metadata key for session filtering (default: "session_id")
// Custom filters can be added here - session filter is automatically included
filters: {
filters: [
{ key: "custom_field", value: "custom_value", operator: "==" }
],
condition: "and"
}
}
})
```
**Key Configuration Options:**
- **`retrievalContextWindow`**: Number of recent messages to consider when creating the retrieval query (default: 5). A larger window provides more context but may be less precise.
- **`formatTemplate`**: Template for formatting retrieved information before adding to memory. Defaults to a simple context template.
- **`nodePostprocessors`**: Array of postprocessors to apply to retrieved nodes, useful for filtering or transforming results.
- **`queryOptions.similarityTopK`**: Number of most similar messages to retrieve from the vector store (default: 2).
- **`queryOptions.sessionFilterKey`**: Metadata key used to isolate memory between different sessions (default: "session_id").
- **`queryOptions.filters`**: Additional metadata filters for retrieval. The session filter is automatically added to ensure memory isolation.
**Session Isolation:**
The vectorBlock automatically adds a session filter using the block's ID to ensure that memories from different sessions don't interfere with each other. This filter uses the `sessionFilterKey` (default: "session_id") and can be customized if needed.
## Persistence with Snapshots
Save and restore memory state:
```ts twoslash
import { createMemory, loadMemory } from "llamaindex";
const memory = createMemory();
// Add some messages
await memory.add({ role: "user", content: "Hello!" });
// Create snapshot
const snapshot = memory.snapshot();
// Later, restore from the snapshot
const restoredMemory = loadMemory(snapshot);
```
## Examples
Want to learn more about the Memory class? Check out our example codes in [Github](https://github.com/run-llama/LlamaIndexTS/tree/main/examples/agents/memory).
@@ -1,4 +1,11 @@
{
"title": "Data",
"pages": ["index", "readers", "data_index", "ingestion_pipeline", "stores"]
"pages": [
"index",
"memory",
"readers",
"data_index",
"ingestion_pipeline",
"stores"
]
}
@@ -18,7 +18,7 @@ In your Discord Application, go to the `OAuth2` tab and generate an invite URL b
This will invite the bot with the necessary permissions to read messages.
Copy the URL in your browser and select the server you want your bot to join.
<include cwd>../../examples/discord/reader.ts</include>
<include cwd>../../examples/readers/discord/reader.ts</include>
### Params
@@ -88,7 +88,7 @@ async function main() {
const response = await queryEngine.query({
query: "What did the author do in college?",
});
}); // Additional filters and params can be passed as options
// Output response
console.log(response.toString());
@@ -28,11 +28,12 @@ embedding vector(1536)
);
```
-- Create a function for similarity search
-- Create a function for similarity search with filtering support
```sql
create function match_documents (
query_embedding vector(1536),
match_count int
match_count int,
filter jsonb DEFAULT '{}'
) returns table (
id uuid,
content text,
@@ -42,6 +43,7 @@ similarity float
)
language plpgsql
as $$
#variable_conflict use_column
begin
return query
select
@@ -51,6 +53,7 @@ metadata,
embedding,
1 - (embedding <=> query_embedding) as similarity
from documents
where metadata @> filter
order by embedding <=> query_embedding
limit match_count;
end;
@@ -95,6 +98,7 @@ const index = await VectorStoreIndex.fromDocuments(documents, {
```ts
const queryEngine = index.asQueryEngine();
// Basic query without filters
const response = await queryEngine.query({
query: "What is in the document?",
});
@@ -103,6 +107,32 @@ const response = await queryEngine.query({
console.log(response.toString());
```
## Query with filters
You can filter documents based on metadata when querying:
```ts
import { FilterOperator, MetadataFilters } from "llamaindex";
// Create a filter for documents with author = "Jane Smith"
const filters: MetadataFilters = {
filters: [
{
key: "author",
value: "Jane Smith",
operator: FilterOperator.EQ,
},
],
};
// Query with filters
const filteredResponse = await vectorStore.query({
queryEmbedding: embedModel.getQueryEmbedding("What is vector search?"),
similarityTopK: 5,
filters,
});
```
## Full code
```ts
@@ -2,89 +2,43 @@
title: Azure OpenAI
---
To use Azure OpenAI, you only need to set a few environment variables together with the `OpenAI` class.
For example:
## Environment Variables
```
export AZURE_OPENAI_KEY="<YOUR KEY HERE>"
export AZURE_OPENAI_ENDPOINT="<YOUR ENDPOINT, see https://learn.microsoft.com/en-us/azure/ai-services/openai/quickstart?tabs=command-line%2Cpython&pivots=rest-api>"
export AZURE_OPENAI_DEPLOYMENT="gpt-4" # or some other deployment name
```
To use Azure OpenAI, you only need to install the `@llamaindex/azure` package:
## Installation
```package-install
npm i llamaindex @llamaindex/openai
npm i llamaindex @llamaindex/azure
```
## Usage
The class `AzureOpenAI` is used for setting the LLM and `AzureOpenAIEmbedding` is used for setting the embedding model, e.g.:
```ts
import { Settings } from "llamaindex";
import { OpenAI } from "@llamaindex/openai";
import { AzureOpenAI, AzureOpenAIEmbedding } from "@llamaindex/azure";
Settings.llm = new OpenAI({ model: "gpt-4", temperature: 0 });
```
## Load and index documents
For this example, we will use a single document. In a real-world scenario, you would have multiple documents to index.
```ts
const document = new Document({ text: essay, id_: "essay" });
const index = await VectorStoreIndex.fromDocuments([document]);
```
## Query
```ts
const queryEngine = index.asQueryEngine();
const query = "What is the meaning of life?";
const results = await queryEngine.query({
query,
Settings.llm = new AzureOpenAI({
apiKey: '[key]',
deployment: '[model]',
apiVersion: '[version]',
endpoint: `https://[deployment].openai.azure.com/`,
});
Settings.embedModel = new AzureOpenAIEmbedding({
apiKey: '[key]',
deployment: '[embedding-model]',
apiVersion: '[version]',
endpoint: `https://[deployment].openai.azure.com/`,
});
```
## Full Example
Instead of explicitly setting the API key, deployment, version, and endpoint in the constructor, you can use the following environment variables: `AZURE_OPENAI_DEPLOYMENT` for the model deployment name, `AZURE_OPENAI_KEY` for your API key, `AZURE_OPENAI_ENDPOINT` for your Azure endpoint URL, and `AZURE_OPENAI_API_VERSION` for the API version.
```ts
import { Document, VectorStoreIndex, Settings } from "llamaindex";
import { OpenAI } from "@llamaindex/openai";
## Examples
Settings.llm = new OpenAI({ model: "gpt-4", temperature: 0 });
async function main() {
const document = new Document({ text: essay, id_: "essay" });
// Load and index documents
const index = await VectorStoreIndex.fromDocuments([document]);
// get retriever
const retriever = index.asRetriever();
// Create a query engine
const queryEngine = index.asQueryEngine({
retriever,
});
const query = "What is the meaning of life?";
// Query
const response = await queryEngine.query({
query,
});
// Log the response
console.log(response.response);
}
```
See the [Azure examples](https://github.com/run-llama/LlamaIndexTS/tree/main/examples/storage/azure) for more examples of how to use Azure OpenAI.
## API Reference
- [OpenAI](/docs/api/classes/OpenAI)
- [AzureOpenAI](/docs/api/classes/AzureOpenAI)
- [AzureOpenAIEmbedding](/docs/api/classes/AzureOpenAIEmbedding)
@@ -5,13 +5,13 @@ title: Bedrock
## Installation
```package-install
npm i llamaindex @llamaindex/community
npm i llamaindex @llamaindex/aws
```
## Usage
```ts
import { BEDROCK_MODELS, Bedrock } from "@llamaindex/community";
import { BEDROCK_MODELS, Bedrock } from "@llamaindex/aws";
Settings.llm = new Bedrock({
model: BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_HAIKU,
@@ -23,9 +23,19 @@ Settings.llm = new Bedrock({
});
```
Currently only supports Anthropic and Meta models:
Supported models are listed below (accessible by BEDROCK_MODELS).
```ts
AMAZON_TITAN_TG1_LARGE = "amazon.titan-tg1-large";
AMAZON_TITAN_TEXT_EXPRESS_V1 = "amazon.titan-text-express-v1";
AI21_J2_GRANDE_INSTRUCT = "ai21.j2-grande-instruct";
AI21_J2_JUMBO_INSTRUCT = "ai21.j2-jumbo-instruct";
AI21_J2_MID = "ai21.j2-mid";
AI21_J2_MID_V1 = "ai21.j2-mid-v1";
AI21_J2_ULTRA = "ai21.j2-ultra";
AI21_J2_ULTRA_V1 = "ai21.j2-ultra-v1";
COHERE_COMMAND_TEXT_V14 = "cohere.command-text-v14";
ANTHROPIC_CLAUDE_INSTANT_1 = "anthropic.claude-instant-v1";
ANTHROPIC_CLAUDE_2 = "anthropic.claude-v2";
ANTHROPIC_CLAUDE_2_1 = "anthropic.claude-v2:1";
@@ -33,7 +43,12 @@ ANTHROPIC_CLAUDE_3_SONNET = "anthropic.claude-3-sonnet-20240229-v1:0";
ANTHROPIC_CLAUDE_3_HAIKU = "anthropic.claude-3-haiku-20240307-v1:0";
ANTHROPIC_CLAUDE_3_OPUS = "anthropic.claude-3-opus-20240229-v1:0"; // available on us-west-2
ANTHROPIC_CLAUDE_3_5_SONNET = "anthropic.claude-3-5-sonnet-20240620-v1:0";
ANTHROPIC_CLAUDE_3_5_SONNET_V2 = "anthropic.claude-3-5-sonnet-20241022-v2:0";
ANTHROPIC_CLAUDE_3_5_HAIKU = "anthropic.claude-3-5-haiku-20241022-v1:0";
ANTHROPIC_CLAUDE_3_7_SONNET = "anthropic.claude-3-7-sonnet-20250219-v1:0";
ANTHROPIC_CLAUDE_4_SONNET = "anthropic.claude-sonnet-4-20250514-v1:0";
ANTHROPIC_CLAUDE_4_OPUS = "anthropic.claude-opus-4-20250514-v1:0";
META_LLAMA2_13B_CHAT = "meta.llama2-13b-chat-v1";
META_LLAMA2_70B_CHAT = "meta.llama2-70b-chat-v1";
META_LLAMA3_8B_INSTRUCT = "meta.llama3-8b-instruct-v1:0";
@@ -45,41 +60,67 @@ META_LLAMA3_2_1B_INSTRUCT = "meta.llama3-2-1b-instruct-v1:0"; // only available
META_LLAMA3_2_3B_INSTRUCT = "meta.llama3-2-3b-instruct-v1:0"; // only available via inference endpoints (see below)
META_LLAMA3_2_11B_INSTRUCT = "meta.llama3-2-11b-instruct-v1:0"; // only available via inference endpoints (see below), multimodal and function call supported
META_LLAMA3_2_90B_INSTRUCT = "meta.llama3-2-90b-instruct-v1:0"; // only available via inference endpoints (see below), multimodal and function call supported
META_LLAMA3_3_70B_INSTRUCT = "meta.llama3-3-70b-instruct-v1:0";
MISTRAL_7B_INSTRUCT = "mistral.mistral-7b-instruct-v0:2";
MISTRAL_MIXTRAL_7B_INSTRUCT = "mistral.mixtral-8x7b-instruct-v0:1";
MISTRAL_MIXTRAL_LARGE_2402 = "mistral.mistral-large-2402-v1:0";
AMAZON_NOVA_PREMIER_1 = "amazon.nova-premier-v1:0";
AMAZON_NOVA_PRO_1 = "amazon.nova-pro-v1:0";
AMAZON_NOVA_LITE_1 = "amazon.nova-lite-v1:0";
AMAZON_NOVA_MICRO_1 = "amazon.nova-micro-v1:0";
```
You can also use Bedrock's Inference endpoints by using the model names:
You can also use Bedrock's Inference endpoints by using the model names (accessible by INFERENCE_BEDROCK_MODELS).
Note that the region must be set correctly.
```ts
// US
//US
US_ANTHROPIC_CLAUDE_3_HAIKU = "us.anthropic.claude-3-haiku-20240307-v1:0";
US_ANTHROPIC_CLAUDE_3_5_HAIKU = "us.anthropic.claude-3-5-haiku-20241022-v1:0";
US_ANTHROPIC_CLAUDE_3_OPUS = "us.anthropic.claude-3-opus-20240229-v1:0";
US_ANTHROPIC_CLAUDE_3_SONNET = "us.anthropic.claude-3-sonnet-20240229-v1:0";
US_ANTHROPIC_CLAUDE_3_5_SONNET = "us.anthropic.claude-3-5-sonnet-20240620-v1:0";
US_ANTHROPIC_CLAUDE_3_5_SONNET_V2 =
"us.anthropic.claude-3-5-sonnet-20241022-v2:0";
US_ANTHROPIC_CLAUDE_3_5_SONNET_V2 = "us.anthropic.claude-3-5-sonnet-20241022-v2:0";
US_ANTHROPIC_CLAUDE_3_7_SONNET = "us.anthropic.claude-3-7-sonnet-20250219-v1:0";
US_ANTHROPIC_CLAUDE_4_SONNET = "us.anthropic.claude-sonnet-4-20250514-v1:0";
US_ANTHROPIC_CLAUDE_4_OPUS = "us.anthropic.claude-opus-4-20250514-v1:0";
US_META_LLAMA_3_2_1B_INSTRUCT = "us.meta.llama3-2-1b-instruct-v1:0";
US_META_LLAMA_3_2_3B_INSTRUCT = "us.meta.llama3-2-3b-instruct-v1:0";
US_META_LLAMA_3_2_11B_INSTRUCT = "us.meta.llama3-2-11b-instruct-v1:0";
US_META_LLAMA_3_2_90B_INSTRUCT = "us.meta.llama3-2-90b-instruct-v1:0";
US_AMAZON_NOVA_PRO_1 = "us.amazon.nova-premier-v1:0";
US_META_LLAMA_3_3_70B_INSTRUCT = "us.meta.llama3-3-70b-instruct-v1:0";
US_AMAZON_NOVA_PREMIER_1 = "us.amazon.nova-premier-v1:0";
US_AMAZON_NOVA_PRO_1 = "us.amazon.nova-pro-v1:0";
US_AMAZON_NOVA_LITE_1 = "us.amazon.nova-lite-v1:0";
US_AMAZON_NOVA_MICRO_1 = "us.amazon.nova-micro-v1:0";
// EU
//EU
EU_ANTHROPIC_CLAUDE_3_HAIKU = "eu.anthropic.claude-3-haiku-20240307-v1:0";
EU_ANTHROPIC_CLAUDE_3_5_HAIKU = "eu.anthropic.claude-3-5-haiku-20240307-v1:0";
EU_ANTHROPIC_CLAUDE_3_SONNET = "eu.anthropic.claude-3-sonnet-20240229-v1:0";
EU_ANTHROPIC_CLAUDE_3_5_SONNET = "eu.anthropic.claude-3-5-sonnet-20240620-v1:0";
EU_ANTHROPIC_CLAUDE_3_7_SONNET = "eu.anthropic.claude-3-7-sonnet-20250219-v1:0";
EU_ANTHROPIC_CLAUDE_4_SONNET = "eu.anthropic.claude-sonnet-4-20250514-v1:0";
EU_ANTHROPIC_CLAUDE_4_OPUS = "eu.anthropic.claude-opus-4-20250514-v1:0";
EU_META_LLAMA_3_2_1B_INSTRUCT = "eu.meta.llama3-2-1b-instruct-v1:0";
EU_META_LLAMA_3_2_3B_INSTRUCT = "eu.meta.llama3-2-3b-instruct-v1:0";
EU_AMAZON_NOVA_PRO_1 = "eu.amazon.nova-premier-v1:0";
EU_AMAZON_NOVA_PREMIER_1 = "eu.amazon.nova-premier-v1:0";
EU_AMAZON_NOVA_PRO_1 = "eu.amazon.nova-pro-v1:0";
EU_AMAZON_NOVA_LITE_1 = "eu.amazon.nova-lite-v1:0";
EU_AMAZON_NOVA_MICRO_1 = "eu.amazon.nova-micro-v1:0";
//APAC
APAC_ANTHROPIC_CLAUDE_3_5_SONNET = "apac.anthropic.claude-3-5-sonnet-20240620-v1:0";
APAC_ANTHROPIC_CLAUDE_3_5_SONNET_V2 = "apac.anthropic.claude-3-5-sonnet-20241022-v2:0";
APAC_ANTHROPIC_CLAUDE_3_7_SONNET = "apac.anthropic.claude-3-7-sonnet-20250219-v1:0";
APAC_ANTHROPIC_CLAUDE_4_SONNET = "apac.anthropic.claude-sonnet-4-20250514-v1:0";
APAC_ANTHROPIC_CLAUDE_3_HAIKU = "apac.anthropic.claude-3-haiku-20240307-v1:0";
APAC_ANTHROPIC_CLAUDE_3_SONNET = "apac.anthropic.claude-3-sonnet-20240229-v1:0";
APAC_AMAZON_NOVA_PRO_1 = "apac.amazon.nova-pro-v1:0";
APAC_AMAZON_NOVA_LITE_1 = "apac.amazon.nova-lite-v1:0";
APAC_AMAZON_NOVA_MICRO_1 = "apac.amazon.nova-micro-v1:0";
```
Sonnet, Haiku and Opus are multimodal, image_url only supports base64 data url format, e.g. `data:image/jpeg;base64,SGVsbG8sIFdvcmxkIQ==`
@@ -87,10 +128,11 @@ Sonnet, Haiku and Opus are multimodal, image_url only supports base64 data url f
## Full Example
```ts
import { BEDROCK_MODELS, Bedrock } from "llamaindex";
import { INFERENCE_BEDROCK_MODELS, Bedrock } from "@llamaindex/aws";
Settings.llm = new Bedrock({
model: BEDROCK_MODELS.ANTHROPIC_CLAUDE_3_HAIKU,
model: INFERENCE_BEDROCK_MODELS.US_ANTHROPIC_CLAUDE_3_SONNET,
region: "us-east-1",
});
async function main() {
@@ -119,12 +161,12 @@ async function main() {
## Agent Example
```ts
import { BEDROCK_MODELS, Bedrock } from "@llamaindex/community";
import { FunctionTool, LLMAgent } from "llamaindex";
import { BEDROCK_MODELS, Bedrock } from "@llamaindex/aws";
import { tool } from "llamaindex";
import { agent } from "@llamaindex/workflow";
import { z } from "zod";
const sumNumbers = FunctionTool.from(
({ a, b }: { a: number; b: number }) => `${a + b}`,
const sumNumbers = tool(
{
name: "sumNumbers",
description: "Use this function to sum two numbers",
@@ -136,11 +178,11 @@ const sumNumbers = FunctionTool.from(
description: "The second number",
}),
}),
execute: ({ a, b }: { a: number; b: number }) => `${a + b}`,
},
);
const divideNumbers = FunctionTool.from(
({ a, b }: { a: number; b: number }) => `${a / b}`,
const divideNumbers = tool(
{
name: "divideNumbers",
description: "Use this function to divide two numbers",
@@ -152,6 +194,7 @@ const divideNumbers = FunctionTool.from(
description: "The divisor b to divide by",
}),
}),
execute: ({ a, b }: { a: number; b: number }) => `${a / b}`,
},
);
@@ -161,15 +204,15 @@ const bedrock = new Bedrock({
});
async function main() {
const agent = new LLMAgent({
const myAgent = agent({
llm: bedrock,
tools: [sumNumbers, divideNumbers],
});
const response = await agent.chat({
message: "How much is 5 + 5? then divide by 2",
});
const response = await myAgent.run(
"How much is 5 + 5? then divide by 2",
);
console.log(response.message);
console.log(response);
}
```
@@ -11,58 +11,130 @@ npm i llamaindex @llamaindex/google
## Usage
```ts
import { Gemini, GEMINI_MODEL } from "@llamaindex/google";
import { gemini, GEMINI_MODEL } from "@llamaindex/google";
import { Settings } from "llamaindex";
Settings.llm = new Gemini({
model: GEMINI_MODEL.GEMINI_PRO,
});
```
## Usage with Proxy
```ts
import { Gemini, GEMINI_MODEL } from "@llamaindex/google";
import { Settings } from "llamaindex";
Settings.llm = new Gemini({
model: GEMINI_MODEL.GEMINI_PRO,
requestOptions: {
baseUrl: <YOUR_PROXY_URL> // optional, but useful for custom endpoints
}
Settings.llm = gemini({
model: GEMINI_MODEL.GEMINI_2_0_FLASH,
});
```
### Usage with Vertex AI
To use Gemini via Vertex AI you can use `GeminiVertexSession`.
GeminiVertexSession accepts the env variables: `GOOGLE_VERTEX_LOCATION` and `GOOGLE_VERTEX_PROJECT`
To use Gemini via Vertex AI, you can specify the vertex configuration:
```ts
import { Gemini, GEMINI_MODEL, GeminiVertexSession } from "@llamaindex/google";
import { gemini, GEMINI_MODEL } from "@llamaindex/google";
const gemini = new Gemini({
model: GEMINI_MODEL.GEMINI_PRO,
session: new GeminiVertexSession({
location: "us-central1", // optional if provided by GOOGLE_VERTEX_LOCATION env variable
project: "project1", // optional if provided by GOOGLE_VERTEX_PROJECT env variable
googleAuthOptions: {...}, // optional, but useful for production. It accepts all values from `GoogleAuthOptions`
}),
const llm = gemini({
model: GEMINI_MODEL.GEMINI_2_0_FLASH,
vertex: {
project: "your-cloud-project", // required for Vertex AI
location: "us-central1", // required for Vertex AI
},
});
```
[GoogleAuthOptions](https://github.com/googleapis/google-auth-library-nodejs/blob/main/src/auth/googleauth.ts)
To authenticate for local development:
```bash
npm i @google-cloud/vertexai
gcloud auth application-default login
```
To authenticate for production you'll have to use a [service account](https://cloud.google.com/docs/authentication/). `googleAuthOptions` has `credentials` which might be useful for you.
## Multimodal Usage
Gemini supports multimodal inputs including text, images, audio, and video:
```ts
import { gemini, GEMINI_MODEL } from "@llamaindex/google";
import fs from "fs";
const llm = gemini({ model: GEMINI_MODEL.GEMINI_2_0_FLASH });
const result = await llm.chat({
messages: [
{
role: "user",
content: [
{
type: "text",
text: "What's in this image?",
},
{
type: "image",
data: fs.readFileSync("./image.jpg").toString("base64"),
mimeType: "image/jpeg",
},
],
},
],
});
```
## Tool Calling
Gemini supports function calling with tools:
```ts
import { gemini, GEMINI_MODEL } from "@llamaindex/google";
import { tool } from "llamaindex";
import { z } from "zod";
const llm = gemini({ model: GEMINI_MODEL.GEMINI_2_0_FLASH });
const result = await llm.chat({
messages: [
{
content: "What's the weather in Tokyo?",
role: "user",
},
],
tools: [
tool({
name: "weather",
description: "Get the weather",
parameters: z.object({
location: z.string().describe("The location to get the weather for"),
}),
execute: ({ location }) => {
return `The weather in ${location} is sunny and hot`;
},
}),
],
});
```
## Live API (Real-time Conversations)
For real-time audio/video conversations using [Gemini Live API](https://ai.google.dev/gemini-api/docs/live).
The Live API is running directly in the frontend. That's why you have to generate an ephemeral key first on the server side and pass it to the frontend.
To use the Live API, make sure to pass `apiVersion: "v1alpha"` to the `httpOptions`.
```ts
import { gemini, GEMINI_MODEL } from "@llamaindex/google";
// Server-side: Generate ephemeral key
const serverLlm = gemini({
model: GEMINI_MODEL.GEMINI_2_0_FLASH_LIVE,
httpOptions: { apiVersion: "v1alpha" },
});
const ephemeralKey = await serverLlm.live.getEphemeralKey();
// Client-side: Use ephemeral key for Live API
const llm = gemini({
apiKey: ephemeralKey,
model: GEMINI_MODEL.GEMINI_2_0_FLASH_LIVE,
voiceName: "Zephyr",
httpOptions: { apiVersion: "v1alpha" },
});
const session = await llm.live.connect();
```
## Load and index documents
For this example, we will use a single document. In a real-world scenario, you would have multiple documents to index.
@@ -90,11 +162,11 @@ const results = await queryEngine.query({
## Full Example
```ts
import { Gemini, GEMINI_MODEL } from "@llamaindex/google";
import { gemini, GEMINI_MODEL } from "@llamaindex/google";
import { Document, VectorStoreIndex, Settings } from "llamaindex";
Settings.llm = new Gemini({
model: GEMINI_MODEL.GEMINI_PRO,
Settings.llm = gemini({
model: GEMINI_MODEL.GEMINI_2_0_FLASH,
});
async function main() {
@@ -104,9 +176,7 @@ async function main() {
const index = await VectorStoreIndex.fromDocuments([document]);
// Create a query engine
const queryEngine = index.asQueryEngine({
retriever,
});
const queryEngine = index.asQueryEngine();
const query = "What is the meaning of life?";
@@ -55,7 +55,7 @@ const results = await queryEngine.query({
## Full Example
<include cwd>../../examples/groq.ts</include>
<include cwd>../../examples/models/groq.ts</include>
## API Reference
@@ -378,3 +378,186 @@ async function main() {
## API Reference
- [OpenAI](/docs/api/classes/OpenAI)
# OpenAI Live LLM
The OpenAI Live LLM integration in LlamaIndex provides real-time chat capabilities with support for audio streaming and tool calling.
## Basic Usage
```typescript
import { openai } from "@llamaindex/openai";
import { tool, ModalityType } from "llamaindex";
// Get the ephimeral key on the server
const serverllm = openai({
apiKey: "your-api-key",
model: "gpt-4o-realtime-preview-2025-06-03",
});
// Get an ephemeral key
// Usually this code is run on the server and the ephemeral key is passed to the
// client - the ephemeral key can be securely used on the client side
const ephemeralKey = await serverllm.live.getEphemeralKey();
// Create a client-side LLM instance with the ephemeral key
const llm = openai({
apiKey: ephemeralKey,
model: "gpt-4o-realtime-preview-2025-06-03"
});
// Create a live sessionimport { tool } from "llamaindex";
const session = await llm.live.connect({
systemInstruction: "You are a helpful assistant.",
});
// Send a message
session.sendMessage({
content: "Hello!",
role: "user",
});
```
## Tool Integration
Tools are handled server-side, making it simple to pass them to the live session:
```typescript
// Define your tools
const weatherTool = tool({
name: "weather",
description: "Get the weather for a location",
parameters: z.object({
location: z.string().describe("The location to get weather for"),
}),
execute: async ({ location }) => {
return `The weather in ${location} is sunny`;
},
});
// Create session with tools
const session = await llm.live.connect({
systemInstruction: "You are a helpful assistant.",
tools: [weatherTool],
});
```
## Audio Support
For audio capabilities:
```typescript
// Get microphone access
const userStream = await navigator.mediaDevices.getUserMedia({
audio: true,
});
// Create session with audio
const session = await llm.live.connect({
audioConfig: {
stream: userStream,
onTrack: (remoteStream) => {
// Handle incoming audio
audioElement.srcObject = remoteStream;
},
},
});
```
## Event Handling
Listen to events from the session:
```typescript
for await (const event of session.streamEvents()) {
if (liveEvents.open.include(event)) {
// Connection established
console.log("Connected!");
} else if (liveEvents.text.include(event)) {
// Received text response
console.log("Assistant:", event.text);
}
}
```
## Capabilities
The OpenAI Live LLM supports:
- Real-time text chat
- Audio streaming (if configured)
- Tool calling (server-side execution)
- Ephemeral key generation for secure sessions
## API Reference
### LiveLLM Methods
// Get an ephemeral key
// Usually this code is run on the server and the ephemeral key is passed to the
// client - the ephemeral key can be securely used on the client side
#### `connect(config?: LiveConnectConfig)`
Creates a new live session.
```typescript
interface LiveConnectConfig {
systemInstruction?: string;
tools?: BaseTool[];
audioConfig?: AudioConfig;
responseModality?: ModalityType[];
}
```
#### `getEphemeralKey()`
Gets a temporary key for the session.
### LiveLLMSession Methods
#### `sendMessage(message: ChatMessage)`
Sends a message to the assistant.
```typescript
interface ChatMessage {
content: string | MessageContentDetail[];
role: "user" | "assistant";
}
```
#### `disconnect()`
Closes the session and cleans up resources.
## Error Handling
```typescript
try {
const session = await llm.live.connect();
} catch (error) {
if (error instanceof Error) {
console.error("Connection failed:", error.message);
}
}
```
## Best Practices
1. **Tool Definition**
- Keep tool implementations server-side
- Use clear descriptions for tools
- Handle tool errors gracefully
2. **Session Management**
- Always disconnect sessions when done
- Clean up audio resources
- Handle reconnection scenarios
3. **Security**
- Use ephemeral keys for sessions
- Validate tool inputs
- Secure API key handling
@@ -11,6 +11,7 @@ A retriever in LlamaIndex is what is used to fetch `Node`s from an index using a
- [KeywordTableLLMRetriever](/docs/api/classes/KeywordTableLLMRetriever) uses an LLM to extract keywords from the query and retrieve relevant nodes based on keyword matches.
- [KeywordTableSimpleRetriever](/docs/api/classes/KeywordTableSimpleRetriever) uses a basic frequency-based approach to extract keywords and retrieve nodes.
- [KeywordTableRAKERetriever](/docs/api/classes/KeywordTableRAKERetriever) uses the RAKE (Rapid Automatic Keyword Extraction) algorithm to extract keywords from the query, focusing on co-occurrence and context for keyword-based retrieval.
- [Bm25Retriever](/docs/api/classes/Bm25Retriever) uses the BM25 algorithm to extract keywords from the query and retrieve relevant nodes based on keyword matches.
```typescript
const retriever = vectorIndex.asRetriever({
@@ -1,44 +0,0 @@
---
title: Using API Route
description: Chat interface for your LlamaIndexTS application using API Route
---
Using [chat-ui](https://github.com/run-llama/chat-ui), it's easy to add a chat interface to your LlamaIndexTS application.
You just need to create an API route that provides an `api/chat` endpoint and a chat component to consume the API.
## API route
As an example, this is an API route for the Next.js App Router. Copy the following code into your `app/api/chat/route.ts` file to get started:
```json doc-gen:file
{
"file": "./src/app/api/chat/route.ts",
"codeblock": true
}
```
## Chat UI
This is the simplest way to add a chat interface to your application. Copy the following code into your application to consume the API:
```json doc-gen:file
{
"file": "./src/components/demo/chat/api/demo.tsx",
"codeblock": true
}
```
## Try it out ⬇️
Combining both, you're getting a fully functional chat interface:
<ChatDemo />
## Next Steps
The steps above are the bare minimum to get a chat interface working. From here, you can go two ways:
1. Use [create-llama](https://github.com/run-llama/create-llama) to scaffold a new LlamaIndexTS project including complex API routes and chat interfaces or
2. Learn more about [chat-ui](https://github.com/run-llama/chat-ui) and [LlamaIndexTS](https://github.com/run-llama/llamaindex-ts) to customize the chat interface and API routes to your needs.
@@ -0,0 +1,8 @@
---
title: Using @llamaindex/chat-ui
description: Chat UI components for your LlamaIndexTS application
---
@llamaindex/chat-ui is a library that provides a set of components for building chat user interfaces. It is built on top of [Shadcn UI](https://ui.shadcn.com).
Check out our [chat-ui](/docs/chat-ui) documentation or try running examples on the [ui.llamaindex.ai](https://ui.llamaindex.ai) website.
@@ -1,22 +0,0 @@
---
title: Install @llamaindex/chat
description: Chat interface for your LlamaIndexTS application
---
## Quick Start
You can quickly add a chatbot to your project by using Shadcn CLI command:
```sh
npx shadcn@latest add https://ui.llamaindex.ai/r/chat.json
```
## Manual Installation
To install the package, run the following command in your project directory:
```sh
npm i @llamaindex/chat-ui
```
For more information, check out the [github.comrun-llama/chat-ui](https://github.com/run-llama/chat-ui)
@@ -9,161 +9,11 @@ LlamaIndexServer is a Next.js-based application that allows you to quickly launc
## Features
- Serving a workflow as a chatbot
- Add a sophisticated chatbot UI to your LlamaIndex workflow
- Edit code and document artifacts in an OpenAI Canvas-style UI
- Extendable UI components for events and headers
- Built on Next.js for high performance and easy API development
- Optional built-in chat UI with extendable UI components
- Prebuilt development code
## Installation
```package-install
npm i @llamaindex/server
```
## Quick Start
Create an `index.ts` file and add the following code:
```ts
import { LlamaIndexServer } from "@llamaindex/server";
import { wiki } from "@llamaindex/tools"; // or any other tool
const createWorkflow = () => agent({ tools: [wiki()] })
new LlamaIndexServer({
workflow: createWorkflow,
uiConfig: {
appTitle: "LlamaIndex App",
starterQuestions: ["Who is the first president of the United States?"],
},
}).start();
```
## Running the Server
In the same directory as `index.ts`, run the following command to start the server:
```bash
tsx index.ts
```
The server will start at `http://localhost:3000`
You can also make a request to the server:
```bash
curl -X POST "http://localhost:3000/api/chat" -H "Content-Type: application/json" -d '{"message": "Who is the first president of the United States?"}'
```
## Configuration Options
The `LlamaIndexServer` accepts the following configuration options:
- `workflow`: A callable function that creates a workflow instance for each request
- `uiConfig`: An object to configure the chat UI containing the following properties:
- `appTitle`: The title of the application (default: `"LlamaIndex App"`)
- `starterQuestions`: List of starter questions for the chat UI (default: `[]`)
- `componentsDir`: The directory for custom UI components rendering events emitted by the workflow. The default is undefined, which does not render custom UI components.
- `llamaCloudIndexSelector`: Whether to show the LlamaCloud index selector in the chat UI (requires `LLAMA_CLOUD_API_KEY` to be set in the environment variables) (default: `false`)
LlamaIndexServer accepts all the configuration options from Nextjs Custom Server such as `port`, `hostname`, `dev`, etc.
See all Nextjs Custom Server options [here](https://nextjs.org/docs/app/building-your-application/configuring/custom-server).
## AI-generated UI Components
The LlamaIndex server provides support for rendering workflow events using custom UI components, allowing you to extend and customize the chat interface.
These components can be auto-generated using an LLM by providing a JSON schema of the workflow event.
### UI Event Schema
To display custom UI components, your workflow needs to emit UI events that have an event type for identification and a data object:
```typescript
class UIEvent extends WorkflowEvent<{
type: "ui_event";
data: UIEventData;
}> {}
```
The `data` object can be any JSON object. To enable AI generation of the UI component, you need to provide a schema for that data (here we're using Zod):
```typescript
const MyEventDataSchema = z.object({
stage: z.enum(["retrieve", "analyze", "answer"]).describe("The current stage the workflow process is in."),
progress: z.number().min(0).max(1).describe("The progress in percent of the current stage"),
}).describe("WorkflowStageProgress");
type UIEventData = z.infer<typeof MyEventDataSchema>;
```
### Generate UI Components
The `generateEventComponent` function uses an LLM to generate a custom UI component based on the JSON schema of a workflow event. The schema should contain accurate descriptions of each field so that the LLM can generate matching components for your use case. We've done this for you in the example above using the `describe` function from Zod:
```typescript
import { OpenAI } from "llamaindex";
import { generateEventComponent } from "@llamaindex/server";
import { MyEventDataSchema } from "./your-workflow";
// Also works well with Claude 3.5 Sonnet and Google Gemini 2.5 Pro
const llm = new OpenAI({ model: "gpt-4.1" });
const code = generateEventComponent(MyEventDataSchema, llm);
```
After generating the code, we need to save it to a file. The file name must match the event type from your workflow (e.g., `ui_event.jsx` for handling events with `ui_event` type):
```ts
fs.writeFileSync("components/ui_event.jsx", code);
```
Feel free to modify the generated code to match your needs. If you're not satisfied with the generated code, we suggest improving the provided JSON schema first or trying another LLM.
> Note that `generateEventComponent` is generating JSX code, but you can also provide a TSX file.
### Server Setup
To use the generated UI components, you need to initialize the LlamaIndex server with the `componentsDir` that contains your custom UI components:
```ts
new LlamaIndexServer({
workflow: createWorkflow,
uiConfig: {
appTitle: "LlamaIndex App",
componentsDir: "components",
},
}).start();
```
## Default Endpoints and Features
### Chat Endpoint
The server includes a default chat endpoint at `/api/chat` for handling chat interactions.
### Chat UI
The server always provides a chat interface at the root path (`/`) with:
- Configurable starter questions
- Real-time chat interface
- API endpoint integration
### Static File Serving
- The server automatically mounts the `data` and `output` folders at `{server_url}{api_prefix}/files/data` (default: `/api/files/data`) and `{server_url}{api_prefix}/files/output` (default: `/api/files/output`) respectively.
- Your workflows can use both folders to store and access files. By convention, the `data` folder is used for documents that are ingested, and the `output` folder is used for documents generated by the workflow.
## Best Practices
1. Always provide a workflow factory that creates a fresh workflow instance for each request.
2. Use environment variables for sensitive configuration (e.g., API keys).
3. Use starter questions to guide users in the chat UI.
## Getting Started with a New Project
Want to start a new project with LlamaIndexServer? Check out our [create-llama](https://github.com/run-llama/create-llama) tool to quickly generate a new project with LlamaIndexServer.
## API Reference
- [LlamaIndexServer](/docs/api/classes/LlamaIndexServer)
Check the latest information on the NPM package page: https://www.npmjs.com/package/@llamaindex/server
@@ -2,5 +2,5 @@
"title": "Chat UI",
"description": "Use chat-ui to add a chat interface to your LlamaIndexTS application.",
"defaultOpen": false,
"pages": ["install", "chat", "rsc", "llamaindex-server"]
"pages": ["index", "llamaindex-server"]
}
@@ -1,65 +0,0 @@
---
title: Using Next.js RSC
description: Chat interface for your LlamaIndexTS application using Next.js RSC
---
Using [chat-ui](https://github.com/run-llama/chat-ui), it's easy to add a chat interface to your LlamaIndexTS application using [Next.js RSC](https://nextjs.org/docs/app/building-your-application/rendering/server-components) and [Vercel AI RSC](https://sdk.vercel.ai/docs/ai-sdk-rsc/overview).
With RSC, the chat messages are not returned as JSON from the server (like when using an [API route](/docs/llamaindex/modules/ui/chat)), instead the chat message components are rendered on the server side.
This is for example useful for rendering a whole chat history on the server before sending it to the client. [Check here](https://sdk.vercel.ai/docs/getting-started/navigating-the-library#when-to-use-ai-sdk-rsc), for a discussion of when to use use RSC.
For implementing a chat interface with RSC, you need to create an AI action and then connect the chat interface to use it.
## Create an AI action
First, define an [AI context provider](https://sdk.vercel.ai/examples/rsc/state-management/ai-ui-states) with a chat server action:
```json doc-gen:file
{
"file": "./src/components/demo/chat/rsc/ai-action.tsx",
"codeblock": true
}
```
The chat server action is using LlamaIndexTS to generate a response based on the chat history and the user input.
## Create the chat UI
The entrypoint of our application initializes the AI provider for the application and adds a `ChatSection` component:
```json doc-gen:file
{
"file": "./src/components/demo/chat/rsc/demo.tsx",
"codeblock": true
}
```
The `ChatSection` component is created by using chat components from @llamaindex/chat-ui:
```json doc-gen:file
{
"file": "./src/components/demo/chat/rsc/chat-section.tsx",
"codeblock": true
}
```
It is using a `useChatRSC` hook to conntect the chat interface to the `chat` AI action that we defined earlier:
```json doc-gen:file
{
"file": "./src/components/demo/chat/rsc/use-chat-rsc.tsx",
"codeblock": true
}
```
## Try RSC Chat ⬇️
<ChatDemoRSC />
## Next Steps
The steps above are the bare minimum to get a chat interface working with RSC. From here, you can go two ways:
1. Use our [full-stack RSC example](https://github.com/run-llama/nextjs-rsc) based on [create-llama](https://github.com/run-llama/create-llama) to get started quickly with a fully working chat interface or
2. Learn more about [AI RSC](https://sdk.vercel.ai/examples/rsc), [chat-ui](https://github.com/run-llama/chat-ui) and [LlamaIndexTS](https://github.com/run-llama/llamaindex-ts) to customize the chat interface and AI actions to your needs.
@@ -15,7 +15,7 @@ In LlamaIndex, an agent is a semi-autonomous piece of software powered by an LLM
You'll need to have a recent version of [Node.js](https://nodejs.org/en) installed. Then you can install LlamaIndex.TS by running
```package-install
npm i llamaindex @llamaindex/openai @llamaindex/readers @llamaindex/huggingface
npm i llamaindex @llamaindex/openai @llamaindex/readers @llamaindex/huggingface @llamaindex/workflow
```
## Choose your model
@@ -35,11 +35,16 @@ First we'll need to pull in our dependencies. These are:
import "dotenv/config";
import {
agent,
AgentStream,
tool,
agentStreamEvent,
openai,
} from "@llamaindex/workflow";
import {
tool,
Settings,
} from "llamaindex";
import {
openai,
} from "@llamaindex/openai";
import { z } from "zod";
```
@@ -108,11 +113,10 @@ const myAgent = agent({ tools });
### Ask the agent a question
We can use the `chat` interface to ask our agent a question, and it will use the tools we've defined to find an answer.
We can use the `run` method to ask our agent a question, and it will use the tools we've defined to find an answer.
```javascript
const context = myAgent.run("Sum 101 and 303");
const result = await context;
const result = await myAgent.run("Sum 101 and 303");
console.log(result.data);
```
You will see the following output:
@@ -123,12 +127,13 @@ You will see the following output:
{ result: 'The sum of 101 and 303 is 404.' }
```
To stream the response, you can use the `AgentStream` event which provides chunks of the response as they become available. This allows you to display the response incrementally rather than waiting for the full response:
To stream the response, you need to call `runStream`, which returns a stream of events.
The `agentStreamEvent` provides chunks of the response as they become available. This allows you to display the response incrementally rather than waiting for the full response:
```javascript
const context = myAgent.run("Add 101 and 303");
for await (const event of context) {
if (event instanceof AgentStream) {
const events = myAgent.runStream("Add 101 and 303");
for await (const event of events) {
if (agentStreamEvent.include(event)) {
process.stdout.write(event.data.delta);
}
}
@@ -140,18 +145,18 @@ for await (const event of context) {
The sum of 101 and 303 is 404.
```
Note that we're filtering for `agentStreamEvent` as an agent might return other events - more about that in the following section.
### Logging workflow events
To log the workflow events, you can check the event type and log the event data.
```javascript
const context = myAgent.run("Sum 202 and 404");
for await (const event of context) {
if (event instanceof AgentStream) {
const events = myAgent.runStream("Sum 202 and 404");
for await (const event of events) {
if (agentStreamEvent.include(event)) {
// Stream the response
for (const chunk of event.data.delta) {
process.stdout.write(chunk);
}
process.stdout.write(event.data.delta);
} else {
// Log other events
console.log("\nWorkflow event:", JSON.stringify(event, null, 2));
@@ -30,16 +30,16 @@ Settings.llm = ollama({
### Run local agent
You can also create local agent by importing `agent` from `llamaindex`.
You can also create local agent by importing `agent` from `@llamaindex/workflow`.
```javascript
import { agent } from "llamaindex";
import { agent } from "@llamaindex/workflow";
const workflow = agent({
tools: [getWeatherTool],
});
const workflowContext = workflow.run(
const resutl = workflow.run(
"What's the weather like in San Francisco?",
);
```
@@ -25,7 +25,8 @@ We'll be bringing in `SimpleDirectoryReader`, `HuggingFaceEmbedding`, `VectorSto
```javascript
import { QueryEngineTool, Settings, VectorStoreIndex } from "llamaindex";
import { OpenAI, OpenAIAgent } from "@llamaindex/openai";
import { agent } from "@llamaindex/workflow";
import { openai } from "@llamaindex/openai";
import { HuggingFaceEmbedding } from "@llamaindex/huggingface";
import { SimpleDirectoryReader } from "@llamaindex/readers/directory";
```
@@ -8,9 +8,10 @@ We have a comprehensive, step-by-step [guide to building agents in LlamaIndex.TS
In a new folder:
```bash npm2yarn
```package-install
npm init
npm i -D typescript @types/node
npm i @llamaindex/openai @llamaindex/workflow llamaindex zod
```
## Run agent
@@ -20,15 +21,14 @@ Create the file `example.ts`. This code will:
- Create two tools for use by the agent:
- A `sumNumbers` tool that adds two numbers
- A `divideNumbers` tool that divides numbers
-
- Give an example of the data structure we wish to generate
- Prompt the LLM with instructions and the example, plus a sample transcript
<include cwd>../../examples/agent/openai.ts</include>
<include cwd>../../examples/agents/agent/openai.ts</include>
To run the code:
```bash
```package-install
npx tsx example.ts
```
@@ -36,9 +36,21 @@ You should expect output something like:
```
{
content: 'The sum of 5 + 5 is 10. When you divide 10 by 2, you get 5.',
role: 'assistant',
options: {}
result: '5 + 5 is 10. Then, 10 divided by 2 is 5.',
state: {
memory: Memory {
messages: [Array],
tokenLimit: 30000,
shortTermTokenLimitRatio: 0.7,
memoryBlocks: [],
memoryCursor: 0,
adapters: [Object]
},
scratchpad: [],
currentAgentName: 'Agent',
agents: [ 'Agent' ],
nextAgentName: null
}
}
Done
```
@@ -0,0 +1,47 @@
---
title: Custom Model Per Request
---
There are scenarios, such as the case of a multi-tenant backend API, where it may be required to handle each request with a custom model.
In such a scenario, modifying the `Settings` object directly as follows is not recommended:
```typescript
import { Settings } from 'llamaindex';
import { OpenAIEmbedding } from '@llamaindex/embeddings-openai';
Settings.embedModel = new OpenAIEmbedding({ apiKey: 'CLIENT_API_KEY' });
Settings.llm = openai({ apiKey: key, model: 'gpt-4o' })
```
Setting `llm` and `embedModel` directly will lead to unpredictable responses, since `Settings` is global and mutable.
This can lead to race conditions, as each request modifies `Settings.embedModel` or `Settings.llm`.
The recommended approach is to use `Settings.withEmbedModel` or `Settings.withLLM` as follows:
```typescript
const embedModel = new OpenAIEmbedding({
apiKey: process.env.OPENAI_API_KEY,
});
const llm = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });
const llmResponse = await Settings.withEmbedModel(embedModel, async () => {
return Settings.withLLM(llm, async () => {
const path = "node_modules/llamaindex/examples/abramov.txt";
const essay = await fs.readFile(path, "utf-8");
// Create Document object with essay
const document = new Document({ text: essay, id_: path });
// Split text and create embeddings. Store them in a VectorStoreIndex
const index = await VectorStoreIndex.fromDocuments([document]);
// Query the index
const queryEngine = index.asQueryEngine();
const { message, sourceNodes } = await queryEngine.query({
query: "What did the author do in college?",
});
// Return response with sources
return message.content;
});
});
```
The full example can be found [here](https://github.com/run-llama/LlamaIndexTS/tree/main/examples/local-settings).
@@ -93,4 +93,4 @@ async function main() {
main().catch(console.error);
```
You can see the [full example file](https://github.com/run-llama/LlamaIndexTS/blob/main/examples/vectorIndexLocal.ts).
You can see the [full example file](https://github.com/run-llama/LlamaIndexTS/blob/main/examples/index/vectorIndexLocal.ts).
@@ -4,9 +4,10 @@
"basic_agent",
"rag",
"agents",
"../../llamaflow",
"workflows",
"local_llm",
"chatbot",
"structured_data_extraction"
"structured_data_extraction",
"custom_model_per_request"
]
}
@@ -16,7 +16,7 @@ LlamaIndex uses a two stage method when using an LLM with your data:
1. **indexing stage**: preparing a knowledge base, and
2. **querying stage**: retrieving relevant context from the knowledge to assist the LLM in responding to a question
![](./_static/concepts/rag.jpg)
![](/_static/concepts/rag.jpg)
This process is also known as Retrieval Augmented Generation (RAG).
@@ -28,7 +28,7 @@ Let's explore each stage in detail.
LlamaIndex.TS help you prepare the knowledge base with a suite of data connectors and indexes.
![](./_static/concepts/indexing.jpg)
![](/_static/concepts/indexing.jpg)
[**Data Loaders**](/docs/llamaindex/modules/data/readers):
A data connector (i.e. `Reader`) ingest data from different data sources and data formats into a simple `Document` representation (text and simple metadata).
@@ -54,7 +54,7 @@ LlamaIndex provides composable modules that help you build and integrate RAG pip
These building blocks can be customized to reflect ranking preferences, as well as composed to reason over multiple knowledge bases in a structured way.
![](./_static/concepts/querying.jpg)
![](/_static/concepts/querying.jpg)
#### Building Blocks
@@ -8,9 +8,10 @@ One of the most common use-cases for LlamaIndex is Retrieval-Augmented Generatio
In a new folder, run:
```bash npm2yarn
```package-install
npm init
npm i -D typescript @types/node
npm i llamaindex
```
Then, check out the [installation](/docs/llamaindex/getting_started/installation) steps to install LlamaIndex.TS and prepare an OpenAI key.
@@ -26,7 +27,7 @@ Create the file `example.ts`. This code will
- index it (which creates embeddings using OpenAI)
- create a query engine to answer questions about the data
<include cwd>../../examples/vectorIndex.ts</include>
<include cwd>../../examples/index/vectorIndex.ts</include>
Create a `tsconfig.json` file in the same folder:
@@ -34,7 +35,7 @@ Create a `tsconfig.json` file in the same folder:
Now you can run the code with
```bash
```package-install
npx tsx example.ts
```
@@ -10,9 +10,10 @@ You can use [other LLMs](/docs/llamaindex/modules/models/llms) via their APIs; i
In a new folder:
```bash npm2yarn
```package-install
npm init
npm i -D typescript @types/node
npm i @llamaindex/openai zod
```
## Extract data
@@ -23,11 +24,11 @@ Create the file `example.ts`. This code will:
- Give an example of the data structure we wish to generate
- Prompt the LLM with instructions and the example, plus a sample transcript
<include cwd>../../examples/jsonExtract.ts</include>
<include cwd>../../examples/misc/jsonExtract.ts</include>
To run the code:
```bash
```package-install
npx tsx example.ts
```
@@ -45,3 +46,31 @@ You should expect output something like:
]
}
```
## Using the `exec` method
Many LLMs do not natively support structured output, and often rely exclusively on prompt or context engineering.
In this sense, we proved you with an alternative for structured data extraction, using the `exec` method with `responseFormat`.
For example, you can, in a new folder, install our Anthropic integration and `zod` v3:
```package-install
npm init
npm i -D typescript @types/node
npm i @llamaindex/anthropic zod@3.25.76
```
And then try extracting data with this code:
<include cwd>../../examples/agents/tools/response-format-exec.ts</include>
The output should look like this:
```json
{
"title": "La Divina Commedia",
"author": "Dante Alighieri",
"year": 1321
}
```

Some files were not shown because too many files have changed in this diff Show More