add HW 1

2026-07-01 20:04:01 -04:00 · 2025-05-17 18:29:39 -07:00
parent 0574cec587
commit b98a8903e3
11 changed files with 785 additions and 0 deletions
@@ -172,3 +172,7 @@ cython_debug/

 # PyPI configuration file
 .pypirc
+
+
+# Our files
+results/*
@@ -0,0 +1,202 @@
+# Homework Assignment: Recipe Chatbot
+
+This project provides a starting point for building and evaluating an AI-powered Recipe Chatbot. You will be working with a web application that uses FastAPI for the backend and a simple HTML/CSS/JavaScript frontend. The core of the chatbot involves interacting with a Large Language Model (LLM) via LiteLLM to get recipe recommendations.
+
+Your main tasks will be to refine the chatbot's persona and intelligence by crafting a detailed system prompt, expanding its test query dataset, and evaluating its performance.
+
+![Recipe Chatbot UI](./screenshots/hw1.png)
+
+## Core Components Provided
+
+This initial setup includes:
+
+*   **Backend (FastAPI)**: Serves the frontend and provides an API endpoint (`/chat`) for the chatbot logic.
+*   **Frontend (HTML/CSS/JS)**: A basic, modern chat interface where users can send messages and receive responses.
+    *   Renders assistant responses as Markdown.
+    *   Includes a typing indicator for better user experience.
+*   **LLM Integration (LiteLLM)**: The backend connects to an LLM (configurable via `.env`) to generate recipe advice.
+*   **Bulk Testing Script**: A Python script (`scripts/bulk_test.py`) to send multiple predefined queries (from `data/sample_queries.csv`) to the chatbot's core logic and save the responses for evaluation. This script uses `rich` for pretty console output.
+
+## Project Structure
+
+```
+recipe-chatbot/
+├── backend/
+│   ├── __init__.py
+│   ├── main.py         # FastAPI application, routes
+│   └── utils.py        # LiteLLM wrapper, system prompt, env loading
+├── data/
+│   └── sample_queries.csv # Sample queries for bulk testing (columns: id, query)
+├── frontend/
+│   └── index.html      # Chat UI (HTML, CSS, JavaScript)
+├── results/            # Output folder for bulk_test.py
+├── scripts/
+│   └── bulk_test.py    # Bulk testing script
+├── .env.example        # Example environment file
+├── env.example         # Backup env example (can be removed if .env.example is preferred)
+├── requirements.txt    # Python dependencies
+└── README.md           # This file (Your guide!)
+```
+
+## Setup Instructions
+
+1.  **Clone the Repository (if you haven't already)**
+    ```bash
+    git clone <your-repository-url>
+    cd recipe-chatbot
+    ```
+
+2.  **Create and Activate a Python Virtual Environment**
+    ```bash
+    python -m venv .venv
+    ```
+    *   On macOS/Linux:
+        ```bash
+        source .venv/bin/activate
+        ```
+    *   On Windows:
+        ```bash
+        .venv\Scripts\activate
+        ```
+
+3.  **Install Dependencies**
+    ```bash
+    pip install -r requirements.txt
+    ```
+
+4.  **Configure Environment Variables (`.env` file)**
+    *   Copy the example environment file:
+        ```bash
+        cp env.example .env
+        ```
+        (or `cp .env.example .env` if you have that one)
+    *   Edit the `.env` file. You will need to:
+        1.  Set the `MODEL_NAME` to the specific model you want to use (e.g., `openai/gpt-3.5-turbo`, `anthropic/claude-3-opus-20240229`, `ollama/llama2`).
+        2.  Set the **appropriate API key environment variable** for the chosen model provider. 
+            Refer to your `env.example` for common API key names like `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `GEMINI_API_KEY`, etc. 
+            LiteLLM will automatically use these provider-specific keys.
+
+        Example of a configured `.env` file if using an OpenAI model:
+        ```env
+        MODEL_NAME=openai/gpt-3.5-turbo
+        OPENAI_API_KEY=sk-yourActualOpenAIKey...
+        ```
+        Example for an Anthropic model:
+        ```env
+        MODEL_NAME=anthropic/claude-3-haiku-20240307
+        ANTHROPIC_API_KEY=sk-ant-yourActualAnthropicKey...
+        ```
+
+    *   **Important - Model Naming and API Keys with LiteLLM**:
+        LiteLLM supports a wide array of model providers. To use a model from a specific provider, you generally need to:
+        *   **Prefix the `MODEL_NAME`** correctly (e.g., `openai/`, `anthropic/`, `mistral/`, `ollama/`).
+        *   **Set the corresponding API key variable** in your `.env` file (e.g., `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `MISTRAL_API_KEY`). Some local models like Ollama might not require an API key.
+
+        Please refer to the official LiteLLM documentation for the correct model prefixes and required environment variables for your chosen provider: [LiteLLM Supported Providers](https://docs.litellm.ai/docs/providers).
+
+## Running the Provided Application
+
+### 1. Run the Web Application (Frontend and Backend)
+
+*   Ensure your virtual environment is activated and your `.env` file is configured.
+*   From the project root directory, start the FastAPI server using Uvicorn:
+    ```bash
+    uvicorn backend.main:app --reload
+    ```
+*   Open your web browser and navigate to: `http://127.0.0.1:8000`
+
+    You should see the chat interface.
+
+
+### 2. Run the Bulk Test Script
+
+The bulk test script allows you to evaluate your chatbot's responses to a predefined set of queries. It sends queries from `data/sample_queries.csv` directly to the backend agent logic and saves the responses to the `results/` directory.
+
+*   Ensure your virtual environment is activated and your `.env` file is configured.
+*   From the project root directory, run:
+    ```bash
+    python scripts/bulk_test.py
+    ```
+*   To use a different CSV file for queries:
+    ```bash
+    python scripts/bulk_test.py --csv path/to/your/queries.csv
+    ```
+    The CSV file must have `id` and `query` columns.
+*   Check the `results/` folder for a new CSV file containing the IDs, queries, and their corresponding responses. This will be crucial for evaluating your system prompt changes.
+
+---
+
+## Homework Assignment 1: Write a Starting Prompt
+
+Your main task is to get the repo to a starting point for Lesson 2.
+
+1.  **Write an Effective System Prompt**:
+    *   Open `backend/utils.py` and locate the `SYSTEM_PROMPT` constant. Currently, it's a naive placeholder.
+    *   Replace it with a well-crafted system prompt. Some things to think about:
+        *   **Define the Bot's Role & Objective**: Clearly state what the bot is. (e.g., "You are a friendly and creative culinary assistant specializing in suggesting easy-to-follow recipes.")
+        *   **Instructions & Response Rules**: Be specific.
+            *   What should it *always* do? (e.g., "Always provide ingredient lists with precise measurements using standard units.", "Always include clear, step-by-step instructions.")
+            *   What should it *never* do? (e.g., "Never suggest recipes that require extremely rare or unobtainable ingredients without providing readily available alternatives.", "Never use offensive or derogatory language.")
+            *   Safety Clause: (e.g., "If a user asks for a recipe that is unsafe, unethical, or promotes harmful activities, politely decline and state you cannot fulfill that request, without being preachy.")
+        *   **LLM Agency – How Much Freedom?**:
+            *   Define its creativity level. (e.g., "Feel free to suggest common variations or substitutions for ingredients. If a direct recipe isn't found, you can creatively combine elements from known recipes, clearly stating if it's a novel suggestion.")
+            *   Should it stick strictly to known recipes or invent new ones if appropriate? (Be explicit).
+        *   **Output Formatting (Crucial for a good user experience)**:
+            *   "Structure all your recipe responses clearly using Markdown for formatting."
+            *   "Begin every recipe response with the recipe name as a Level 2 Heading (e.g., `## Amazing Blueberry Muffins`)."
+            *   "Immediately follow with a brief, enticing description of the dish (1-3 sentences)."
+            *   "Next, include a section titled `### Ingredients`. List all ingredients using a Markdown unordered list (bullet points)."
+            *   "Following ingredients, include a section titled `### Instructions`. Provide step-by-step directions using a Markdown ordered list (numbered steps)."
+            *   "Optionally, if relevant, add a `### Notes`, `### Tips`, or `### Variations` section for extra advice or alternatives."
+            *   **Example of desired Markdown structure for a recipe response**:
+                ```markdown
+                ## Golden Pan-Fried Salmon
+
+                A quick and delicious way to prepare salmon with a crispy skin and moist interior, perfect for a weeknight dinner.
+
+                ### Ingredients
+                * 2 salmon fillets (approx. 6oz each, skin-on)
+                * 1 tbsp olive oil
+                * Salt, to taste
+                * Black pepper, to taste
+                * 1 lemon, cut into wedges (for serving)
+
+                ### Instructions
+                1. Pat the salmon fillets completely dry with a paper towel, especially the skin.
+                2. Season both sides of the salmon with salt and pepper.
+                3. Heat olive oil in a non-stick skillet over medium-high heat until shimmering.
+                4. Place salmon fillets skin-side down in the hot pan.
+                5. Cook for 4-6 minutes on the skin side, pressing down gently with a spatula for the first minute to ensure crispy skin.
+                6. Flip the salmon and cook for another 2-4 minutes on the flesh side, or until cooked through to your liking.
+                7. Serve immediately with lemon wedges.
+
+                ### Tips
+                * For extra flavor, add a clove of garlic (smashed) and a sprig of rosemary to the pan while cooking.
+                * Ensure the pan is hot before adding the salmon for the best sear.
+                ```
+
+2.  **Expand and Diversify the Query Dataset**:
+    *   Open `data/sample_queries.csv`.
+    *   Add at least **10 new, diverse queries** to this file. Ensure each new query has a unique `id` and a corresponding query text.
+    *   Your queries should test various aspects of a recipe chatbot. Consider including requests related to:
+        *   Specific cuisines (e.g., "Italian pasta dish", "Spicy Thai curry")
+        *   Dietary restrictions (e.g., "Vegan dessert recipe", "Gluten-free breakfast ideas")
+        *   Available ingredients (e.g., "What can I make with chicken, rice, and broccoli?")
+        *   Meal types (e.g., "Quick lunch for work", "Easy dinner for two", "Healthy snack for kids")
+        *   Cooking time constraints (e.g., "Recipe under 30 minutes")
+        *   Skill levels (e.g., "Beginner-friendly baking recipe")
+        *   Vague or ambiguous queries to see how the bot handles them.
+    *   This exercise is a first step toward thinking systematically about failure-mode evaluation.
+
+3.  **Run the Bulk Test & Evaluate**:
+    *   After you have updated the system prompt in `backend/utils.py` and expanded the queries in `data/sample_queries.csv`, run the bulk test script:
+        ```bash
+        python scripts/bulk_test.py
+        ```
+    *   Confirm that a new CSV file has been written to the `results/` folder.
+    
+Good luck with your assignment!
+
@@ -0,0 +1 @@
+# Package marker for the backend modules. 
@@ -0,0 +1,87 @@
+from __future__ import annotations
+
+"""FastAPI application entry-point for the recipe chatbot."""
+
+from pathlib import Path
+from typing import Final, List, Dict
+
+from fastapi import FastAPI, HTTPException, status
+from fastapi.responses import HTMLResponse
+from fastapi.staticfiles import StaticFiles
+from pydantic import BaseModel, Field
+
+from backend.utils import get_agent_response  # noqa: WPS433 import from parent
+
+# -----------------------------------------------------------------------------
+# Application setup
+# -----------------------------------------------------------------------------
+
+APP_TITLE: Final[str] = "Recipe Chatbot"
+app = FastAPI(title=APP_TITLE)
+
+# Serve static assets (currently just the HTML) under `/static/*`.
+STATIC_DIR = Path(__file__).parent.parent / "frontend"
+app.mount("/static", StaticFiles(directory=STATIC_DIR), name="static")
+
+
+# -----------------------------------------------------------------------------
+# Request / response models
+# -----------------------------------------------------------------------------
+
+class ChatMessage(BaseModel):
+    """Schema for a single message in the chat history."""
+    role: str = Field(..., description="Role of the message sender (system, user, or assistant).")
+    content: str = Field(..., description="Content of the message.")
+
+class ChatRequest(BaseModel):
+    """Schema for incoming chat messages."""
+
+    messages: List[ChatMessage] = Field(..., description="The entire conversation history.")
+
+
+class ChatResponse(BaseModel):
+    """Schema for the assistant's reply returned to the front-end."""
+
+    messages: List[ChatMessage] = Field(..., description="The updated conversation history.")
+
+
+# -----------------------------------------------------------------------------
+# Routes
+# -----------------------------------------------------------------------------
+
+
+@app.post("/chat", response_model=ChatResponse)
+async def chat_endpoint(payload: ChatRequest) -> ChatResponse:  # noqa: WPS430
+    """Main conversational endpoint.
+
+    It proxies the user's message list to the underlying agent and returns the updated list.
+    """
+    # Convert Pydantic models to simple dicts for the agent
+    request_messages: List[Dict[str, str]] = [msg.model_dump() for msg in payload.messages]
+
+    try:
+        updated_messages_dicts = get_agent_response(request_messages)
+    except Exception as exc:  # noqa: BLE001 broad; surface as HTTP 500
+        # In production you would log the traceback here.
+        raise HTTPException(
+            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
+            detail=str(exc),
+        ) from exc
+
+    # Convert dicts back to Pydantic models for the response
+    response_messages: List[ChatMessage] = [ChatMessage(**msg) for msg in updated_messages_dicts]
+    return ChatResponse(messages=response_messages)
+
+
+@app.get("/", response_class=HTMLResponse)
+async def index() -> HTMLResponse:  # noqa: WPS430
+    """Serve the chat UI."""
+
+    html_path = STATIC_DIR / "index.html"
+    if not html_path.exists():
+        raise HTTPException(
+            status_code=status.HTTP_404_NOT_FOUND,
+            detail="Frontend not found. Did you forget to build it?",
+        )
+
+    return HTMLResponse(html_path.read_text(encoding="utf-8")) 
@@ -0,0 +1,74 @@
+from __future__ import annotations
+
+"""Utility helpers for the recipe chatbot backend.
+
+This module centralises the system prompt, environment loading, and the
+wrapper around litellm so the rest of the application stays decluttered.
+"""
+
+import os
+from typing import Final, List, Dict
+
+import litellm  # type: ignore
+from dotenv import load_dotenv
+
+# Ensure the .env file is loaded as early as possible.
+load_dotenv(override=False)
+
+# --- Constants -------------------------------------------------------------------
+
+SYSTEM_PROMPT: Final[str] = (
+    "You are an expert chef recommending delicious and useful recipes. "
+    "Present only one recipe at a time. If the user doesn't specify what ingredients "
+    "they have available, ask them about their available ingredients rather than "
+    "assuming what's in their fridge."
+)
+
+# Fetch configuration *after* we loaded the .env file.
+# Falls back to "gpt-3.5-turbo" when MODEL_NAME is not set.
+MODEL_NAME: Final[str] = os.environ.get("MODEL_NAME", "gpt-3.5-turbo")
+
+
+# --- Agent wrapper ---------------------------------------------------------------
+
+def get_agent_response(messages: List[Dict[str, str]]) -> List[Dict[str, str]]:  # noqa: WPS231
+    """Call the underlying large-language model via *litellm*.
+
+    Parameters
+    ----------
+    messages:
+        The full conversation history. Each item is a dict with "role" and "content".
+
+    Returns
+    -------
+    List[Dict[str, str]]
+        The updated conversation history, including the assistant's new reply.
+    """
+
+    # litellm is model-agnostic; we only pass the model name here, and litellm
+    # reads the provider API key from the environment.
+    # If the history is empty or does not start with a system message, prepend
+    # SYSTEM_PROMPT so that the system prompt is always first.
+    current_messages: List[Dict[str, str]]
+    if not messages or messages[0]["role"] != "system":
+        current_messages = [{"role": "system", "content": SYSTEM_PROMPT}] + messages
+    else:
+        current_messages = messages
+
+    completion = litellm.completion(
+        model=MODEL_NAME,
+        messages=current_messages, # Pass the full history
+    )
+
+    assistant_reply_content: str = (
+        completion["choices"][0]["message"]["content"]  # type: ignore[index]
+        .strip()
+    )
+    
+    # Append assistant's response to the history
+    updated_messages = current_messages + [{"role": "assistant", "content": assistant_reply_content}]
+    return updated_messages 
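Note on `get_agent_response`: the system-prompt handling can be illustrated in isolation. The following is a minimal sketch, not the module's code. `ensure_system_prompt` and `PLACEHOLDER_PROMPT` are hypothetical names introduced here for illustration, and no LLM is called.

```python
from typing import Dict, List

# Hypothetical stand-in for SYSTEM_PROMPT in backend/utils.py.
PLACEHOLDER_PROMPT = "You are a helpful recipe assistant."


def ensure_system_prompt(
    messages: List[Dict[str, str]],
    system_prompt: str = PLACEHOLDER_PROMPT,
) -> List[Dict[str, str]]:
    """Return a history whose first message is a system message.

    The input list is never mutated; a new list is returned in both branches.
    """
    if not messages or messages[0]["role"] != "system":
        return [{"role": "system", "content": system_prompt}] + messages
    return list(messages)


history = ensure_system_prompt([{"role": "user", "content": "Quick vegan breakfast?"}])
print([m["role"] for m in history])
```

Because the frontend replaces its local history with the list returned by `/chat`, the system message travels back to the server on later turns, so the second branch is the one taken after the first exchange.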
@@ -0,0 +1,4 @@
+id,query
+1,Suggest a quick vegan breakfast recipe
+2,"I have chicken and rice, what can I cook?"
+3,Give me a dessert recipe with chocolate 
@@ -0,0 +1,9 @@
+# Copy this file to `.env` and fill in your credentials.
+
+MODEL_NAME=openai/gpt-4.1-nano
+
+# Example API keys
+# OPENAI_API_KEY=
+# TOGETHER_API_KEY=
+# GEMINI_API_KEY=
+# ANTHROPIC_API_KEY=
@@ -0,0 +1,264 @@
+<!DOCTYPE html>
+<html lang="en">
+  <head>
+    <meta charset="UTF-8" />
+    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
+    <title>Recipe Chatbot</title>
+    <script src="https://cdn.jsdelivr.net/npm/marked/marked.min.js"></script>
+    <link rel="preconnect" href="https://fonts.googleapis.com">
+    <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
+    <link href="https://fonts.googleapis.com/css2?family=Inter:wght@400;500;600&display=swap" rel="stylesheet">
+    <style>
+      /* Basic, responsive chat styling */
+      body {
+        font-family: 'Inter', -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, Helvetica, Arial, sans-serif;
+        background: #f8f9fa; /* Slightly lighter background */
+        display: flex;
+        flex-direction: column;
+        align-items: center;
+        margin: 0;
+        height: 100vh;
+      }
+
+      #app-container { /* New wrapper */
+        display: flex;
+        flex-direction: column;
+        width: 100%;
+        max-width: 600px;
+        height: 100%; /* Take full height of body */
+        box-sizing: border-box;
+        /* padding: 0 16px; /* Remove horizontal padding here, apply to children if needed */
+        background-color: #ffffff; /* Give app-container a white background */
+        box-shadow: 0 4px 12px rgba(0,0,0,0.05); /* Subtle shadow for depth */
+        border-radius: 12px; /* Rounded corners for the whole app box */
+        overflow: hidden; /* To make sure children respect border-radius */
+        margin: 16px; /* Add some margin around the app container */
+      }
+
+      #chat-container {
+        /* width: 100%; /* Handled by app-container */
+        /* max-width: 600px; /* Handled by app-container */
+        flex: 1; /* Takes available space */
+        display: flex;
+        flex-direction: column;
+        padding: 20px 20px 10px 20px; /* More padding */
+        overflow-y: auto;
+        box-sizing: border-box;
+      }
+
+      .message {
+        padding: 10px 15px;
+        border-radius: 18px; /* Softer, more modern radius */
+        margin-bottom: 10px;
+        max-width: 85%;
+        line-height: 1.5;
+      }
+
+      .user {
+        align-self: flex-end;
+        background-color: #e0f0ff; /* Lighter, softer blue for user */
+        color: #00529b; /* Darker blue text for contrast */
+      }
+
+      .assistant {
+        align-self: flex-start;
+        background-color: #f1f3f5; /* Lighter gray for assistant */
+        color: #343a40; /* Darker gray text */
+        border: none; /* Remove default border */
+      }
+      
+      /* Styling for markdown elements inside assistant messages */
+      .assistant p { margin: 0.5em 0; }
+      .assistant ul, .assistant ol { margin: 0.5em 0 0.5em 20px; padding: 0; }
+      .assistant li { margin-bottom: 0.25em; }
+      .assistant pre { 
+        background-color: #e9ecef; 
+        padding: 10px; 
+        border-radius: 6px; 
+        overflow-x: auto; 
+        font-size: 0.9em;
+      }
+      .assistant code { 
+        font-family: 'SFMono-Regular', Consolas, 'Liberation Mono', Menlo, Courier, monospace;
+        font-size: 0.9em; 
+        background-color: rgba(0,0,0,0.04);
+        padding: 2px 4px;
+        border-radius: 4px;
+      }
+      .assistant pre code {
+        background-color: transparent;
+        padding: 0;
+        border-radius: 0;
+      }
+
+      #typing-indicator {
+        box-sizing: border-box; 
+        display: none; 
+        color: #6c757d; /* Subtler gray for typing */
+        font-style: italic;
+        padding: 10px 15px; /* Match message padding */
+        margin-bottom: 8px; 
+        border-radius: 18px; /* Match message radius */
+        background-color: #f1f3f5; /* Match assistant background */
+        align-self: flex-start; /* Ensure it aligns like an assistant message */
+        max-width: fit-content; /* Only as wide as its content */
+        margin-left: 20px; /* Align with chat content padding */
+      }
+
+      #input-form {
+        display: flex;
+        align-items: center; /* Vertically align input and button */
+        padding: 15px 20px; /* More padding */
+        box-sizing: border-box;
+        background-color: #f8f9fa; /* Slightly off-white for input area */
+        border-top: 1px solid #dee2e6; /* Subtle separator line */
+      }
+
+      #user-input {
+        flex: 1;
+        padding: 12px 15px;
+        font-size: 1rem; /* Consistent font size */
+        border: 1px solid #ced4da; /* Softer border */
+        border-radius: 20px; /* Rounded pill shape */
+        margin-right: 10px;
+        outline: none; /* Remove default focus outline */
+        transition: border-color 0.2s ease-in-out, box-shadow 0.2s ease-in-out;
+      }
+      #user-input:focus {
+        border-color: #4dabf7; /* Blue border on focus */
+        box-shadow: 0 0 0 3px rgba(77, 171, 247, 0.25); /* Subtle glow on focus */
+      }
+
+      #send-btn {
+        padding: 12px 20px;
+        font-size: 1rem;
+        font-weight: 500; /* Slightly bolder text */
+        background-color: #28a745; /* Green send button */
+        color: #fff;
+        border: none;
+        border-radius: 20px; /* Match input field */
+        cursor: pointer;
+        transition: background-color 0.2s ease-in-out;
+      }
+      #send-btn:hover {
+        background-color: #218838; /* Darker green on hover */
+      }
+      #send-btn:disabled {
+        background-color: #adb5bd; /* Gray when disabled */
+        cursor: not-allowed;
+      }
+    </style>
+  </head>
+  <body>
+    <h2 style="font-family: 'Inter', sans-serif; font-weight: 600; color: #343a40; margin-top: 24px; margin-bottom: 16px; text-align: center;">Recipe Chatbot</h2>
+    <div id="app-container"> <!-- New wrapper -->
+        <div id="chat-container">
+            <!-- Messages are rendered here by renderChat -->
+        </div>
+        <div id="typing-indicator">Assistant is typing...</div>
+        <form id="input-form">
+            <input
+                id="user-input"
+                type="text"
+                placeholder="Ask for a recipe..."
+                autocomplete="off"
+                required
+            />
+            <button id="send-btn" type="submit">Send</button>
+        </form>
+    </div> <!-- End of app-container -->
+
+    <script>
+      const form = document.getElementById("input-form");
+      const input = document.getElementById("user-input");
+      const chatContainer = document.getElementById("chat-container");
+      const sendBtn = document.getElementById("send-btn");
+      const typingIndicator = document.getElementById("typing-indicator");
+
+      let chatHistory = []; // Holds all messages: { role: string, content: string }[]
+      let typingInterval = null; // Variable to hold the interval ID
+
+      /**
+       * Clears and re-renders all messages in the chat container based on chatHistory.
+       */
+      function renderChat() {
+        chatContainer.innerHTML = ""; // Clear existing messages
+        chatHistory.forEach(msg => {
+          if (msg.role === "system") return;
+
+          const bubble = document.createElement("div");
+          bubble.classList.add("message", msg.role);
+          
+          if (msg.role === "assistant") {
+            bubble.innerHTML = marked.parse(msg.content || ""); // Use marked.parse for assistant
+          } else {
+            bubble.textContent = msg.content;
+          }
+          chatContainer.appendChild(bubble);
+        });
+        chatContainer.scrollTop = chatContainer.scrollHeight;
+      }
+
+      async function sendMessage(evt) {
+        evt.preventDefault();
+        const userText = input.value.trim();
+        if (!userText) return;
+
+        // Add user message to history and re-render
+        chatHistory.push({ role: "user", content: userText });
+        renderChat();
+
+        input.value = "";
+        input.focus();
+        sendBtn.disabled = true;
+        
+        typingIndicator.style.display = "block";
+        let dotCount = 0;
+        typingIndicator.textContent = "Assistant is typing"; // Initial text without dots
+        if (typingInterval) clearInterval(typingInterval);
+        typingInterval = setInterval(() => {
+          dotCount = (dotCount + 1) % 4;
+          typingIndicator.textContent = `Assistant is typing${'.'.repeat(dotCount)}`;
+        }, 300);
+        typingIndicator.scrollIntoView({ behavior: "smooth", block: "end" }); // Scroll indicator into view
+
+        try {
+          // Send the whole history
+          const res = await fetch("/chat", {
+            method: "POST",
+            headers: { "Content-Type": "application/json" },
+            body: JSON.stringify({ messages: chatHistory }),
+          });
+
+          if (!res.ok) {
+            const errorData = await res.json();
+            throw new Error(errorData.detail || `Server responded with ${res.status}`);
+          }
+
+          const data = await res.json();
+          chatHistory = data.messages; // Update history with the server's version
+          renderChat(); // Re-render with the full history from server
+
+        } catch (error) {
+          // Add error message to history and re-render
+          chatHistory.push({
+            role: "assistant", 
+            content: `Oops! Something went wrong: ${error.message}. Please try again.`
+            // Error message will also be parsed by marked in renderChat
+          });
+          renderChat();
+          console.error(error);
+        } finally {
+          sendBtn.disabled = false;
+          typingIndicator.style.display = "none";
+          if (typingInterval) clearInterval(typingInterval); // Stop animation
+          typingInterval = null;
+        }
+      }
+
+      form.addEventListener("submit", sendMessage);
+      // Initial render (in case there's a system message or pre-filled history later)
+      renderChat();
+    </script>
+  </body>
+</html> 
@@ -0,0 +1,6 @@
+fastapi
+uvicorn
+litellm
+python-dotenv
+httpx
+rich
@@ -0,0 +1,134 @@
+from __future__ import annotations
+
+import sys
+from pathlib import Path
+
+# Add project root to sys.path to allow absolute imports
+PROJECT_ROOT = Path(__file__).resolve().parent.parent
+sys.path.insert(0, str(PROJECT_ROOT))
+
+"""Bulk testing utility for the recipe chatbot agent.
+
+Reads a CSV file containing user queries, sends them concurrently to the agent
+logic (`get_agent_response`, called directly rather than via the `/chat`
+endpoint), and stores the results for later manual evaluation.
+"""
+
+import argparse
+import csv
+import datetime as dt
+from typing import List, Tuple, Dict
+from concurrent.futures import ThreadPoolExecutor, as_completed
+
+from rich.console import Console, Group
+from rich.panel import Panel
+from rich.text import Text
+from rich.markdown import Markdown
+
+from backend.utils import get_agent_response, SYSTEM_PROMPT
+
+# -----------------------------------------------------------------------------
+# Configuration helpers
+# -----------------------------------------------------------------------------
+
+DEFAULT_CSV: Path = Path("data/sample_queries.csv")
+RESULTS_DIR: Path = Path("results")
+RESULTS_DIR.mkdir(exist_ok=True)
+
+MAX_WORKERS = 10 # For ThreadPoolExecutor
+
+# -----------------------------------------------------------------------------
+# Core logic
+# -----------------------------------------------------------------------------
+
+# --- Sync function for ThreadPoolExecutor ---
+def process_query_sync(query_id: str, query: str) -> Tuple[str, str, str]:
+    """Processes a single query by calling the agent directly."""
+    initial_messages: List[Dict[str, str]] = [
+        {"role": "user", "content": query}
+    ]
+    try:
+        # get_agent_response now returns the full history
+        updated_history = get_agent_response(initial_messages)
+        # Extract the last assistant message for the result
+        assistant_reply = ""
+        if updated_history and updated_history[-1]["role"] == "assistant":
+            assistant_reply = updated_history[-1]["content"]
+        else: # Should not happen with current logic but good to handle
+            assistant_reply = "Error: No assistant reply found in history."
+        return query_id, query, assistant_reply
+    except Exception as e:
+        return query_id, query, f"Error processing query: {str(e)}"
+
+
+# Entry point: read the queries, run them in a thread pool, and write the results
+def run_bulk_test(csv_path: Path) -> None:
+    """Main entry point for bulk testing (synchronous version)."""
+
+    with csv_path.open("r", newline="", encoding="utf-8") as csv_file:
+        reader = csv.DictReader(csv_file)
+        # Expects columns 'id' and 'query'
+        input_data: List[Dict[str, str]] = [
+            row for row in reader if row.get("id") and row.get("query")
+        ]
+
+    if not input_data:
+        raise ValueError("No valid data (with 'id' and 'query') found in the provided CSV file.")
+
+    console = Console()
+    results_data: List[Tuple[str, str, str]] = [] # Will store (id, query, response)
+    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as executor:
+        future_to_data = {
+            executor.submit(process_query_sync, item["id"], item["query"]):
+            item for item in input_data
+        }
+        console.print(f"[bold blue]Submitting {len(input_data)} queries to the executor...[/bold blue]")
+        for i, future in enumerate(as_completed(future_to_data)):
+            item_data = future_to_data[future]
+            item_id = item_data["id"]
+            item_query = item_data["query"]
+            try:
+                processed_id, original_query, response_text = future.result()
+                results_data.append((processed_id, original_query, response_text))
+
+                panel_content = Text()
+                panel_content.append(f"ID: {processed_id}\n", style="bold magenta")
+                panel_content.append("Query:\n", style="bold yellow")
+                panel_content.append(f"{original_query}\n\n")
+
+                # Create a separate Markdown object for the response
+                response_markdown = Markdown(response_text)
+
+                # Group the different parts for the Panel
+                panel_group = Group(
+                    panel_content, # Contains ID and Query
+                    Markdown("--- Response ---"), # A small separator for clarity
+                    response_markdown  # The Markdown rendered response
+                )
+
+                console.print(Panel(
+                    panel_group, # Pass the group as the single renderable
+                    title=f"Result {i+1}/{len(input_data)} - ID: {processed_id}", 
+                    border_style="cyan"
+                ))
+
+            except Exception as exc:
+                console.print(Panel(f"[bold red]Exception for ID {item_id}, Query:[/bold red]\n{item_query}\n\n[bold red]Error:[/bold red]\n{exc}", title=f"Error in Result {i+1}/{len(input_data)} - ID: {item_id}", border_style="red"))
+                results_data.append((item_id, item_query, f"Exception during processing: {str(exc)}"))
+        console.print("[bold blue]All queries processed.[/bold blue]")
+
+    timestamp = dt.datetime.now().strftime("%Y%m%d_%H%M%S")
+    out_path = RESULTS_DIR / f"results_{timestamp}.csv"
+
+    with out_path.open("w", newline="", encoding="utf-8") as csv_file:
+        writer = csv.writer(csv_file)
+        writer.writerow(["id", "query", "response"])
+        writer.writerows(results_data)
+
+    console.print(f"[bold green]Saved {len(results_data)} results to {str(out_path)}[/bold green]")
+
+
+if __name__ == "__main__":
+    parser = argparse.ArgumentParser(description="Bulk test the recipe chatbot")
+    parser.add_argument("--csv", type=Path, default=DEFAULT_CSV, help="Path to CSV file containing queries (columns: 'id' and 'query').")
+    args = parser.parse_args()
+    run_bulk_test(args.csv)
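Note on the query file: `run_bulk_test` keeps only rows with a non-empty `id` and `query`, but it does not check that ids are unique, which the README asks for. A pre-flight check along the following lines might help. This is a minimal sketch under stated assumptions: `load_queries` is a hypothetical helper, not part of the script, and it parses a string rather than a file so that it runs on its own.

```python
import csv
import io
from typing import Dict, List


def load_queries(csv_text: str) -> List[Dict[str, str]]:
    """Parse query rows and reject duplicate ids (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Same filter as run_bulk_test: keep rows that have both fields.
    rows = [row for row in reader if row.get("id") and row.get("query")]
    seen = set()
    for row in rows:
        if row["id"] in seen:
            raise ValueError(f"Duplicate id in query file: {row['id']}")
        seen.add(row["id"])
    return rows


# A query that contains a comma must be quoted, or csv splits it into two fields.
sample = (
    "id,query\n"
    "1,Suggest a quick vegan breakfast recipe\n"
    '2,"I have chicken and rice, what can I cook?"\n'
)
for row in load_queries(sample):
    print(row["id"], row["query"])
```

Raising on a duplicate id, rather than silently keeping the first row, is a design choice in this sketch; it makes the results CSV easier to join back to the queries by id.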