Merge branch 'main' of github.com:run-llama/LlamaIndexTS into seldo/deploy-fixes

Merge pull request #215 from run-llama/seldo/deploy-fixes
dotenv must load before chat_router or .env isn't picked up in time
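
Note on the commit above: a minimal sketch of why the ordering matters, assuming the chat router (or something it imports) reads environment variables at import time. It uses only the standard library. `fake_load_dotenv` and `import_chat_router` are hypothetical stand-ins for `load_dotenv()` and for importing `app.api.routers.chat`, and `DEMO_OPENAI_API_KEY` is a made-up variable name, not one taken from the templates.

```python
import os

# Pretend contents of a .env file (hypothetical values for illustration).
ENV_FILE = {"DEMO_OPENAI_API_KEY": "sk-demo"}


def fake_load_dotenv() -> None:
    """Hypothetical stand-in for load_dotenv(): copy .env values into
    os.environ without overriding variables that are already set."""
    for key, value in ENV_FILE.items():
        os.environ.setdefault(key, value)


def import_chat_router() -> dict:
    """Hypothetical stand-in for importing a module that reads its
    configuration at import time and keeps whatever value it saw."""
    return {"api_key": os.environ.get("DEMO_OPENAI_API_KEY")}


def start_app(load_env_first: bool) -> dict:
    # Start from a clean environment so both orderings are comparable.
    os.environ.pop("DEMO_OPENAI_API_KEY", None)
    if load_env_first:
        fake_load_dotenv()
        router = import_chat_router()
    else:
        router = import_chat_router()
        fake_load_dotenv()  # too late: the router already captured None
    return router


late = start_app(load_env_first=False)
early = start_app(load_env_first=True)
print(late["api_key"], early["api_key"])
```

If the real router behaves like the stand-in, moving the `load_dotenv()` call above the `chat_router` import is what makes the `.env` values visible to it.
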
22 changed files with 191 additions and 37 deletions
@@ -1,5 +0,0 @@
---
-"create-llama": patch
---
-
-Label bug fix
@@ -4,7 +4,7 @@

 ### Patch Changes

- 63f2108: Add multimodal support (thanks Marcus)
+- 63f2108: Add multimodal support (thanks @marcusschiesser)

 ## 0.0.34

@@ -1,5 +1,23 @@
 # create-llama

+## 0.0.8
+
+### Patch Changes
+
+- 8cdb07f: Fix Next deployment (thanks @seldo and @marcusschiesser)
+
+## 0.0.7
+
+### Patch Changes
+
+- 9f9f293: Added more to README and made it easier to switch models (thanks @seldo)
+
+## 0.0.6
+
+### Patch Changes
+
+- 4431ec7: Label bug fix (thanks @marcusschiesser)
+
 ## 0.0.5

 ### Patch Changes
@@ -10,16 +28,16 @@

 ### Patch Changes

- 031e926: Update create-llama readme (thanks Logan)
+- 031e926: Update create-llama readme (thanks @logan-markewich)

 ## 0.0.3

 ### Patch Changes

- 91b42a3: change version (thanks Marcus)
+- 91b42a3: change version (thanks @marcusschiesser)

 ## 0.0.2

 ### Patch Changes

- e2a6805: Hello Create Llama (thanks Marcus!)
+- e2a6805: Hello Create Llama (thanks @marcusschiesser)
@@ -2,19 +2,69 @@

 The easiest way to get started with [LlamaIndex](https://www.llamaindex.ai/) is by using `create-llama`. This CLI tool enables you to quickly start building a new LlamaIndex application, with everything set up for you.

-## Features
+Just run

- NextJS, ExpressJS, or FastAPI (python) stateless backend generation 💻
- Streaming or non-streaming backend ⚡
- Optional `shadcn` frontend generation 🎨
+```bash
+npx create-llama@latest
+```

-## Get Started
+to get started, or see below for more options. Once your app is generated, run

-You can run `create-llama` in interactive or non-interactive mode.
+```bash
+npm run dev
+```

-### Interactive
+to start the development server. You can then visit [http://localhost:3000](http://localhost:3000) to see your app.

-You can create a new project interactively by running:
+## What you'll get
+
+- A Next.js-powered front-end. The app is set up as a chat interface that can answer questions about your data (see below)
+  - You can style it with HTML and CSS, or you can optionally use components from [shadcn/ui](https://ui.shadcn.com/)
+- Your choice of 3 back-ends:
+  - **Next.js**: if you select this option, you’ll have a full stack Next.js application that you can deploy to a host like [Vercel](https://vercel.com/) in just a few clicks. This uses [LlamaIndex.TS](https://www.npmjs.com/package/llamaindex), our TypeScript library.
+  - **Express**: if you want a more traditional Node.js application you can generate an Express backend. This also uses LlamaIndex.TS.
+  - **Python FastAPI**: if you select this option you’ll get a backend powered by the [llama-index python package](https://pypi.org/project/llama-index/), which you can deploy to a service like Render or fly.io.
+- The back-end has a single endpoint that allows you to send the state of your chat and receive additional responses
+- You can choose whether you want a streaming or non-streaming back-end (if you're not sure, we recommend streaming)
+- You can choose whether you want to use `ContextChatEngine` or `SimpleChatEngine`
+  - `SimpleChatEngine` will just talk to the LLM directly without using your data
+  - `ContextChatEngine` will use your data to answer questions (see below).
+- The app uses OpenAI by default, so you'll need an OpenAI API key, or you can customize it to use any of the dozens of LLMs we support.
+
+## Using your data
+
+If you've enabled `ContextChatEngine`, you can supply your own data and the app will index it and answer questions. Your generated app will have a folder called `data`:
+
+- With the Next.js backend this is `./data`
+- With the Express or Python backend this is in `./backend/data`
+
+The app will ingest any supported files you put in this directory. Your Next.js and Express apps use LlamaIndex.TS so they will be able to ingest any PDF, text, CSV, Markdown, Word and HTML files. The Python backend can read even more types, including video and audio files.
+
+Before you can use your data, you need to index it. If you're using the Next.js or Express apps, run:
+
+```bash
+npm run generate
+```
+
+Then re-start your app. Remember you'll need to re-run `generate` if you add new files to your `data` folder. If you're using the Python backend, you can trigger indexing of your data by deleting the `./storage` folder and re-starting the app.
+
+## Don't want a front-end?
+
+It's optional! If you've selected the Python or Express back-ends, just delete the `frontend` folder and you'll get an API without any front-end code.
+
+## Customizing the LLM
+
+By default the app will use OpenAI's gpt-3.5-turbo model. If you want to use GPT-4, you can modify this by editing a file:
+
+- In the Next.js backend, edit `./app/api/chat/route.ts` and replace `gpt-3.5-turbo` with `gpt-4`
+- In the Express backend, edit `./backend/src/controllers/chat.controller.ts` and likewise replace `gpt-3.5-turbo` with `gpt-4`
+- In the Python backend, edit `./backend/app/utils/index.py` and once again replace `gpt-3.5-turbo` with `gpt-4`
+
+You can also replace OpenAI with one of our [dozens of other supported LLMs](https://docs.llamaindex.ai/en/stable/module_guides/models/llms/modules.html).
+
+## Example
+
+The simplest thing to do is run `create-llama` in interactive mode:

 ```bash
 npx create-llama@latest
@@ -26,9 +76,7 @@ yarn create llama
 pnpm create llama@latest
 ```

-You will be asked for the name of your project, along with other configuration options.
-
-Here is an example:
+You will be asked for the name of your project, along with other configuration options, something like this:

 ```bash
 >> npm create llama@latest
@@ -45,7 +93,7 @@ Ok to proceed? (y) y
 Creating a new LlamaIndex app in /home/my-app.
 ```

-### Non-interactive
+### Running non-interactively

 You can also pass command line arguments to set up a new project
 non-interactively. See `create-llama --help`:
@@ -88,7 +88,7 @@ export async function createApp({
      path.join(root, "README.md"),
    );
  } else {
-    await installTemplate({ ...args, backend: true });
+    await installTemplate({ ...args, backend: true, forBackend: framework });
  }

  process.chdir(root);
@@ -1,6 +1,6 @@
 {
  "name": "create-llama",
-  "version": "0.0.5",
+  "version": "0.0.8",
  "keywords": [
    "rag",
    "llamaindex",
@@ -103,6 +103,7 @@ const installTSTemplate = async ({
  ui,
  eslint,
  customApiPath,
+  forBackend,
 }: InstallTemplateArgs) => {
  console.log(bold(`Using ${packageManager}.`));

@@ -120,6 +121,26 @@ const installTSTemplate = async ({
    rename,
  });

+  /**
+   * If the backend is next.js, rename next.config.app.js to next.config.js
+   * If not, rename next.config.static.js to next.config.js
+   */
+  if (framework == "nextjs" && forBackend === "nextjs") {
+    const nextConfigAppPath = path.join(root, "next.config.app.js");
+    const nextConfigPath = path.join(root, "next.config.js");
+    await fs.rename(nextConfigAppPath, nextConfigPath);
+    // delete next.config.static.js
+    const nextConfigStaticPath = path.join(root, "next.config.static.js");
+    await fs.rm(nextConfigStaticPath);
+  } else if (framework == "nextjs" && typeof forBackend === "undefined") {
+    const nextConfigStaticPath = path.join(root, "next.config.static.js");
+    const nextConfigPath = path.join(root, "next.config.js");
+    await fs.rename(nextConfigStaticPath, nextConfigPath);
+    // delete next.config.app.js
+    const nextConfigAppPath = path.join(root, "next.config.app.js");
+    await fs.rm(nextConfigAppPath);
+  }
+
  /**
   * Copy the selected chat engine files to the target directory and reference it.
   */
@@ -17,4 +17,5 @@ export interface InstallTemplateArgs {
  eslint: boolean;
  customApiPath?: string;
  openAIKey?: string;
+  forBackend?: string;
 }
@@ -0,0 +1,2 @@
+# local env files
+.env
@@ -8,12 +8,24 @@ const port = 8000;

 const env = process.env["NODE_ENV"];
 const isDevelopment = !env || env === "development";
+const prodCorsOrigin = process.env["PROD_CORS_ORIGIN"];
+
 if (isDevelopment) {
  console.warn("Running in development mode - allowing CORS for all origins");
  app.use(cors());
+} else if (prodCorsOrigin) {
+  console.log(
+    `Running in production mode - allowing CORS for domain: ${prodCorsOrigin}`,
+  );
+  const corsOptions = {
+    origin: prodCorsOrigin, // Restrict to production domain
+  };
+  app.use(cors(corsOptions));
+} else {
+  console.warn("Production CORS origin not set, defaulting to no CORS.");
 }

-app.use(express.json());
+app.use(express.text());

 app.get("/", (req: Request, res: Response) => {
  res.send("LlamaIndex Express Server");
@@ -4,7 +4,7 @@ import { createChatEngine } from "./engine";

 export const chat = async (req: Request, res: Response, next: NextFunction) => {
  try {
-    const { messages }: { messages: ChatMessage[] } = req.body;
+    const { messages }: { messages: ChatMessage[] } = JSON.parse(req.body);
    const lastMessage = messages.pop();
    if (!messages || !lastMessage || lastMessage.role !== "user") {
      return res.status(400).json({
@@ -6,12 +6,18 @@ from llama_index import (
    StorageContext,
    VectorStoreIndex,
    load_index_from_storage,
+    ServiceContext,
 )
+from llama_index.llms import OpenAI


 STORAGE_DIR = "./storage"  # directory to cache the generated index
 DATA_DIR = "./data"  # directory containing the documents to index

+service_context = ServiceContext.from_defaults(
+    llm=OpenAI(model="gpt-3.5-turbo")
+)
+

 def get_index():
    logger = logging.getLogger("uvicorn")
@@ -20,7 +26,7 @@ def get_index():
        logger.info("Creating new index")
        # load the documents and create the index
        documents = SimpleDirectoryReader(DATA_DIR).load_data()
-        index = VectorStoreIndex.from_documents(documents)
+        index = VectorStoreIndex.from_documents(documents,service_context=service_context)
        # store it for later
        index.storage_context.persist(STORAGE_DIR)
        logger.info(f"Finished creating new index. Stored in {STORAGE_DIR}")
@@ -28,6 +34,6 @@ def get_index():
        # load the existing index
        logger.info(f"Loading index from {STORAGE_DIR}...")
        storage_context = StorageContext.from_defaults(persist_dir=STORAGE_DIR)
-        index = load_index_from_storage(storage_context)
+        index = load_index_from_storage(storage_context,service_context=service_context)
        logger.info(f"Finished loading index from {STORAGE_DIR}")
    return index
@@ -1,12 +1,12 @@
+from dotenv import load_dotenv
+load_dotenv()
+
 import logging
 import os
 import uvicorn
 from app.api.routers.chat import chat_router
 from fastapi import FastAPI
 from fastapi.middleware.cors import CORSMiddleware
-from dotenv import load_dotenv
-
-load_dotenv()

 app = FastAPI()

@@ -2,6 +2,9 @@
 const nextConfig = {
  experimental: {
    serverComponentsExternalPackages: ["llamaindex"],
+    outputFileTracingIncludes: {
+      "/*": ["./cache/**/*"],
+    },
  },
 };

@@ -0,0 +1,13 @@
+/** @type {import('next').NextConfig} */
+const nextConfig = {
+  output: "export",
+  images: { unoptimized: true },
+  experimental: {
+    serverComponentsExternalPackages: ["llamaindex"],
+    outputFileTracingIncludes: {
+      "/*": ["./cache/**/*"],
+    },
+  },
+};
+
+module.exports = nextConfig;
@@ -0,0 +1,2 @@
+# local env files
+.env
@@ -8,12 +8,24 @@ const port = 8000;

 const env = process.env["NODE_ENV"];
 const isDevelopment = !env || env === "development";
+const prodCorsOrigin = process.env["PROD_CORS_ORIGIN"];
+
 if (isDevelopment) {
  console.warn("Running in development mode - allowing CORS for all origins");
  app.use(cors());
+} else if (prodCorsOrigin) {
+  console.log(
+    `Running in production mode - allowing CORS for domain: ${prodCorsOrigin}`,
+  );
+  const corsOptions = {
+    origin: prodCorsOrigin, // Restrict to production domain
+  };
+  app.use(cors(corsOptions));
+} else {
+  console.warn("Production CORS origin not set, defaulting to no CORS.");
 }

-app.use(express.json());
+app.use(express.text());

 app.get("/", (req: Request, res: Response) => {
  res.send("LlamaIndex Express Server");
@@ -6,7 +6,7 @@ import { LlamaIndexStream } from "./llamaindex-stream";

 export const chat = async (req: Request, res: Response, next: NextFunction) => {
  try {
-    const { messages }: { messages: ChatMessage[] } = req.body;
+    const { messages }: { messages: ChatMessage[] } = JSON.parse(req.body);
    const lastMessage = messages.pop();
    if (!messages || !lastMessage || lastMessage.role !== "user") {
      return res.status(400).json({
@@ -6,12 +6,17 @@ from llama_index import (
    StorageContext,
    VectorStoreIndex,
    load_index_from_storage,
+    ServiceContext,
 )
+from llama_index.llms import OpenAI


 STORAGE_DIR = "./storage"  # directory to cache the generated index
 DATA_DIR = "./data"  # directory containing the documents to index

+service_context = ServiceContext.from_defaults(
+    llm=OpenAI(model="gpt-3.5-turbo")
+)

 def get_index():
    logger = logging.getLogger("uvicorn")
@@ -20,7 +25,7 @@ def get_index():
        logger.info("Creating new index")
        # load the documents and create the index
        documents = SimpleDirectoryReader(DATA_DIR).load_data()
-        index = VectorStoreIndex.from_documents(documents)
+        index = VectorStoreIndex.from_documents(documents,service_context=service_context)
        # store it for later
        index.storage_context.persist(STORAGE_DIR)
        logger.info(f"Finished creating new index. Stored in {STORAGE_DIR}")
@@ -28,6 +33,6 @@ def get_index():
        # load the existing index
        logger.info(f"Loading index from {STORAGE_DIR}...")
        storage_context = StorageContext.from_defaults(persist_dir=STORAGE_DIR)
-        index = load_index_from_storage(storage_context)
+        index = load_index_from_storage(storage_context,service_context=service_context)
        logger.info(f"Finished loading index from {STORAGE_DIR}")
    return index
@@ -1,12 +1,12 @@
+from dotenv import load_dotenv
+load_dotenv()
+
 import logging
 import os
 import uvicorn
 from app.api.routers.chat import chat_router
 from fastapi import FastAPI
 from fastapi.middleware.cors import CORSMiddleware
-from dotenv import load_dotenv
-
-load_dotenv()

 app = FastAPI()

@@ -2,6 +2,9 @@
 const nextConfig = {
  experimental: {
    serverComponentsExternalPackages: ["llamaindex"],
+    outputFileTracingIncludes: {
+      "/*": ["./cache/**/*"],
+    },
  },
 };

@@ -0,0 +1,13 @@
+/** @type {import('next').NextConfig} */
+const nextConfig = {
+  output: "export",
+  images: { unoptimized: true },
+  experimental: {
+    serverComponentsExternalPackages: ["llamaindex"],
+    outputFileTracingIncludes: {
+      "/*": ["./cache/**/*"],
+    },
+  },
+};
+
+module.exports = nextConfig;
Author	SHA1	Message	Date
Laurie Voss	d18748aba4	Merge branch 'main' of github.com:run-llama/LlamaIndexTS into seldo/deploy-fixes	2023-11-19 17:11:45 -08:00
yisding	27c4ef3410	Merge pull request #215 from run-llama/seldo/deploy-fixes	2023-11-19 16:21:19 -08:00
Laurie Voss	a7ee392d3e	dotenv must load before chat_router or .env isn't picked up in time	2023-11-19 16:15:41 -08:00
Laurie Voss	4415a6fdef	next.config.js has to be different for express/python backends	2023-11-19 15:55:27 -08:00
Laurie Voss	1e1e6e96a1	Handle CORS in prod	2023-11-19 15:54:53 -08:00
Laurie Voss	461d1dfbcc	Don't commit .env in the backend	2023-11-19 15:52:57 -08:00
yisding	5975fafefb	Merge pull request #208 from run-llama/seldo/express-parsing-bug (fix: generated frontend is sending text/plain)	2023-11-17 16:57:42 -08:00
Laurie Voss	71169fd545	fix: generated frontend is sending text/plain so handle that instead of JSON	2023-11-17 15:29:56 -08:00
Logan	be895d564d	Merge pull request #202 from run-llama/logan/fix_llm_def	2023-11-17 15:02:04 -06:00
yisding	f36a27c218	create-llama 0.0.8	2023-11-17 09:06:00 -08:00
yisding	8cdb07f151	changeset	2023-11-17 09:05:24 -08:00
yisding	ea403a0ffe	Merge branch 'main' of github.com:run-llama/LlamaIndexTS	2023-11-17 09:04:33 -08:00
yisding	7f0b4e66ae	create-llama 0.0.7	2023-11-17 09:04:01 -08:00
yisding	3b226965ba	Merge pull request #205 from run-llama/ms/copy-cache-folder (fix: copy cache folder for vercel deployments)	2023-11-17 09:03:26 -08:00
Logan Markewich	63daf77412	remove accidental files	2023-11-17 09:57:43 -06:00
Marcus Schiesser	079a1d5cc3	fix: copy cache folder for vercel deployments	2023-11-17 08:52:42 +07:00
Logan Markewich	2377d1a466	Fix LLM definitions	2023-11-16 15:55:38 -06:00
yisding	9f9f29391e	changeset	2023-11-15 16:25:07 -08:00
yisding	b64716d3f7	Merge pull request #197 from run-llama/seldo/create-llama-readme (Expanding README docs)	2023-11-15 15:56:42 -08:00
Laurie Voss	d7a47abe38	Lots of new docs	2023-11-15 15:52:56 -08:00
yisding	58b314a61e	create-llama 0.0.6	2023-11-14 20:54:59 -08:00