Shreya Shankar
2025-05-17 18:29:39 -07:00
parent 0574cec587
commit b98a8903e3
11 changed files with 785 additions and 0 deletions
+4
@@ -172,3 +172,7 @@ cython_debug/
# PyPI configuration file
.pypirc
# Our files
results/*
+202
@@ -0,0 +1,202 @@
# Homework Assignment: Recipe Chatbot
This project provides a starting point for building and evaluating an AI-powered Recipe Chatbot. You will be working with a web application that uses FastAPI for the backend and a simple HTML/CSS/JavaScript frontend. The core of the chatbot involves interacting with a Large Language Model (LLM) via LiteLLM to get recipe recommendations.
Your main tasks will be to refine the chatbot's persona and intelligence by crafting a detailed system prompt, expanding its test query dataset, and evaluating its performance.
![Recipe Chatbot UI](./screenshots/hw1.png)
## Core Components Provided
This initial setup includes:
* **Backend (FastAPI)**: Serves the frontend and provides an API endpoint (`/chat`) for the chatbot logic.
* **Frontend (HTML/CSS/JS)**: A basic, modern chat interface where users can send messages and receive responses.
    * Renders assistant responses as Markdown.
    * Includes a typing indicator for better user experience.
* **LLM Integration (LiteLLM)**: The backend connects to an LLM (configurable via `.env`) to generate recipe advice.
* **Bulk Testing Script**: A Python script (`scripts/bulk_test.py`) to send multiple predefined queries (from `data/sample_queries.csv`) to the chatbot's core logic and save the responses for evaluation. This script uses `rich` for pretty console output.
## Project Structure
```
recipe-chatbot/
├── backend/
│ ├── __init__.py
│ ├── main.py # FastAPI application, routes
│ └── utils.py # LiteLLM wrapper, system prompt, env loading
├── data/
│ └── sample_queries.csv # Sample queries for bulk testing (columns: id, query)
├── frontend/
│ └── index.html # Chat UI (HTML, CSS, JavaScript)
├── results/ # Output folder for bulk_test.py
├── scripts/
│ └── bulk_test.py # Bulk testing script
├── .env.example # Example environment file
├── env.example # Backup env example (can be removed if .env.example is preferred)
├── requirements.txt # Python dependencies
└── README.md # This file (Your guide!)
```
## Setup Instructions
1. **Clone the Repository (if you haven't already)**
```bash
git clone <your-repository-url>
cd recipe-chatbot
```
2. **Create and Activate a Python Virtual Environment**
```bash
python -m venv .venv
```
* On macOS/Linux:
```bash
source .venv/bin/activate
```
* On Windows:
```bash
.venv\Scripts\activate
```
3. **Install Dependencies**
```bash
pip install -r requirements.txt
```
4. **Configure Environment Variables (`.env` file)**
* Copy the example environment file:
```bash
cp env.example .env
```
(or `cp .env.example .env` if you have that one)
* Edit the `.env` file. You will need to:
1. Set the `MODEL_NAME` to the specific model you want to use (e.g., `openai/gpt-3.5-turbo`, `anthropic/claude-3-opus-20240229`, `ollama/llama2`).
2. Set the **appropriate API key environment variable** for the chosen model provider.
Refer to your `env.example` for common API key names like `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `GEMINI_API_KEY`, etc.
LiteLLM will automatically use these provider-specific keys.
Example of a configured `.env` file if using an OpenAI model:
```env
MODEL_NAME=openai/gpt-3.5-turbo
OPENAI_API_KEY=sk-yourActualOpenAIKey...
```
Example for an Anthropic model:
```env
MODEL_NAME=anthropic/claude-3-haiku-20240307
ANTHROPIC_API_KEY=sk-ant-yourActualAnthropicKey...
```
* **Important - Model Naming and API Keys with LiteLLM**:
LiteLLM supports a wide array of model providers. To use a model from a specific provider, you generally need to:
* **Prefix the `MODEL_NAME`** correctly (e.g., `openai/`, `anthropic/`, `mistral/`, `ollama/`).
* **Set the corresponding API key variable** in your `.env` file (e.g., `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `MISTRAL_API_KEY`). Some local models like Ollama might not require an API key.
Please refer to the official LiteLLM documentation for the correct model prefixes and required environment variables for your chosen provider: [LiteLLM Supported Providers](https://docs.litellm.ai/docs/providers).
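
As a rough illustration of the naming convention above, the sketch below splits a `MODEL_NAME` into its provider prefix and checks whether a matching key variable is set. The `PROVIDER_KEY_VARS` table and the `missing_key_for` helper are hypothetical names introduced for this example; they cover only the providers named in this README and are not part of LiteLLM, so treat the LiteLLM documentation as the authoritative source.

```python
import os
from typing import Mapping, Optional

# Hypothetical lookup table (an assumption for this sketch, not a LiteLLM API):
# provider prefix -> API key variable mentioned in this README.
# Ollama maps to None because local models might not require a key.
PROVIDER_KEY_VARS: Mapping[str, Optional[str]] = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "mistral": "MISTRAL_API_KEY",
    "ollama": None,
}


def missing_key_for(model_name: str, env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the name of the key variable that is unset, or None if nothing is missing."""
    provider, sep, _model = model_name.partition("/")
    if not sep:
        return None  # no provider prefix, so this sketch cannot tell which key is needed
    key_var = PROVIDER_KEY_VARS.get(provider)
    if key_var is None:
        return None  # provider unknown to this table, or no key required
    return None if env.get(key_var) else key_var
```

Calling `missing_key_for(os.environ.get("MODEL_NAME", ""))` after loading `.env` is one way to catch a forgotten key before the first request fails.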
## Running the Provided Application
### 1. Run the Web Application (Frontend and Backend)
* Ensure your virtual environment is activated and your `.env` file is configured.
* From the project root directory, start the FastAPI server using Uvicorn:
```bash
uvicorn backend.main:app --reload
```
* Open your web browser and navigate to: `http://127.0.0.1:8000`
You should see the chat interface.
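
The `/chat` endpoint can also be exercised without the browser. The following is a minimal sketch using only the standard library; it assumes the server is running at the default address shown above and that the endpoint accepts and returns a JSON object with a `messages` list, as the provided backend does. The helper names `build_chat_request` and `send_chat` are made up for this example.

```python
import json
import urllib.request
from typing import Dict, List

CHAT_URL = "http://127.0.0.1:8000/chat"  # default Uvicorn address from the step above


def build_chat_request(history: List[Dict[str, str]], user_text: str) -> urllib.request.Request:
    """Build a POST request carrying the whole conversation plus the new user message."""
    messages = history + [{"role": "user", "content": user_text}]
    body = json.dumps({"messages": messages}).encode("utf-8")
    return urllib.request.Request(
        CHAT_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat(history: List[Dict[str, str]], user_text: str) -> List[Dict[str, str]]:
    """Send the request and return the updated history reported by the server."""
    request = build_chat_request(history, user_text)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.loads(response.read().decode("utf-8"))["messages"]
```

With the server running, `send_chat([], "Suggest a quick vegan breakfast recipe")[-1]["content"]` should hold the assistant's reply. Passing the returned list back in as `history` continues the same conversation.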
### 2. Run the Bulk Test Script
The bulk test script allows you to evaluate your chatbot's responses to a predefined set of queries. It sends queries from `data/sample_queries.csv` directly to the backend agent logic and saves the responses to the `results/` directory.
* Ensure your virtual environment is activated and your `.env` file is configured.
* From the project root directory, run:
```bash
python scripts/bulk_test.py
```
* To use a different CSV file for queries:
```bash
python scripts/bulk_test.py --csv path/to/your/queries.csv
```
The CSV file must have `id` and `query` columns.
* Check the `results/` folder for a new CSV file containing the IDs, queries, and their corresponding responses. This will be crucial for evaluating your system prompt changes.
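
One detail worth watching when preparing a query file: a query that contains a comma must be quoted, otherwise `csv.DictReader` splits it into extra columns. The sketch below writes queries with `csv.writer`, which adds the quotes automatically, and reads them back with the same filter the bulk test script applies. The helper names `write_queries` and `read_queries` are illustrative only, and an in-memory buffer stands in for a real file.

```python
import csv
import io
from typing import Dict, Iterable, List, Tuple


def write_queries(rows: Iterable[Tuple[int, str]], out_file) -> None:
    """Write an id,query CSV; csv.writer quotes any query that contains a comma."""
    writer = csv.writer(out_file)
    writer.writerow(["id", "query"])
    writer.writerows(rows)


def read_queries(in_file) -> List[Dict[str, str]]:
    """Read rows back with csv.DictReader, keeping only rows that have both fields."""
    return [row for row in csv.DictReader(in_file) if row.get("id") and row.get("query")]


# An in-memory buffer keeps the example self-contained; a real run would open a file
# with newline="" and encoding="utf-8" instead.
buffer = io.StringIO(newline="")
write_queries(
    [
        (1, "Suggest a quick vegan breakfast recipe"),
        (2, "I have chicken and rice, what can I cook?"),
    ],
    buffer,
)
buffer.seek(0)
rows = read_queries(buffer)
print(rows[1]["query"])  # the comma survives because the field was quoted on write
```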
---
## Homework Assignment 1: Write a Starting Prompt
Your main task is to get the repo to a starting point for Lesson 2.
1. **Write an Effective System Prompt**:
* Open `backend/utils.py` and locate the `SYSTEM_PROMPT` constant. Currently, it's a naive placeholder.
* Replace it with a well-crafted system prompt. Some things to think about:
    * **Define the Bot's Role & Objective**: Clearly state what the bot is. (e.g., "You are a friendly and creative culinary assistant specializing in suggesting easy-to-follow recipes.")
    * **Instructions & Response Rules**: Be specific.
        * What should it *always* do? (e.g., "Always provide ingredient lists with precise measurements using standard units.", "Always include clear, step-by-step instructions.")
        * What should it *never* do? (e.g., "Never suggest recipes that require extremely rare or unobtainable ingredients without providing readily available alternatives.", "Never use offensive or derogatory language.")
        * Safety Clause: (e.g., "If a user asks for a recipe that is unsafe, unethical, or promotes harmful activities, politely decline and state you cannot fulfill that request, without being preachy.")
    * **LLM Agency: How Much Freedom?**
        * Define its creativity level. (e.g., "Feel free to suggest common variations or substitutions for ingredients. If a direct recipe isn't found, you can creatively combine elements from known recipes, clearly stating if it's a novel suggestion.")
        * Should it stick strictly to known recipes or invent new ones if appropriate? (Be explicit).
    * **Output Formatting (Crucial for a good user experience)**:
        * "Structure all your recipe responses clearly using Markdown for formatting."
        * "Begin every recipe response with the recipe name as a Level 2 Heading (e.g., `## Amazing Blueberry Muffins`)."
        * "Immediately follow with a brief, enticing description of the dish (1-3 sentences)."
        * "Next, include a section titled `### Ingredients`. List all ingredients using a Markdown unordered list (bullet points)."
        * "Following ingredients, include a section titled `### Instructions`. Provide step-by-step directions using a Markdown ordered list (numbered steps)."
        * "Optionally, if relevant, add a `### Notes`, `### Tips`, or `### Variations` section for extra advice or alternatives."
* **Example of desired Markdown structure for a recipe response**:
```markdown
## Golden Pan-Fried Salmon
A quick and delicious way to prepare salmon with a crispy skin and moist interior, perfect for a weeknight dinner.
### Ingredients
* 2 salmon fillets (approx. 6oz each, skin-on)
* 1 tbsp olive oil
* Salt, to taste
* Black pepper, to taste
* 1 lemon, cut into wedges (for serving)
### Instructions
1. Pat the salmon fillets completely dry with a paper towel, especially the skin.
2. Season both sides of the salmon with salt and pepper.
3. Heat olive oil in a non-stick skillet over medium-high heat until shimmering.
4. Place salmon fillets skin-side down in the hot pan.
5. Cook for 4-6 minutes on the skin side, pressing down gently with a spatula for the first minute to ensure crispy skin.
6. Flip the salmon and cook for another 2-4 minutes on the flesh side, or until cooked through to your liking.
7. Serve immediately with lemon wedges.
### Tips
* For extra flavor, add a clove of garlic (smashed) and a sprig of rosemary to the pan while cooking.
* Ensure the pan is hot before adding the salmon for the best sear.
```
2. **Expand and Diversify the Query Dataset**:
* Open `data/sample_queries.csv`.
* Add at least **10 new, diverse queries** to this file. Ensure each new query has a unique `id` and a corresponding query text.
* Your queries should test various aspects of a recipe chatbot. Consider including requests related to:
* Specific cuisines (e.g., "Italian pasta dish", "Spicy Thai curry")
* Dietary restrictions (e.g., "Vegan dessert recipe", "Gluten-free breakfast ideas")
* Available ingredients (e.g., "What can I make with chicken, rice, and broccoli?")
* Meal types (e.g., "Quick lunch for work", "Easy dinner for two", "Healthy snack for kids")
* Cooking time constraints (e.g., "Recipe under 30 minutes")
* Skill levels (e.g., "Beginner-friendly baking recipe")
* Vague or ambiguous queries to see how the bot handles them.
* This exercise is a first step toward more systematic evaluation of failure modes.
3. **Run the Bulk Test & Evaluate**:
* After you have updated the system prompt in `backend/utils.py` and expanded the queries in `data/sample_queries.csv`, run the bulk test script:
```bash
python scripts/bulk_test.py
```
* Confirm that a new CSV file has been written to the `results/` folder.
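
If your system prompt adopts the output-format rules suggested in step 1, a small checker can flag responses that ignore them when you review the results. This is only a sketch under that assumption: `format_problems` and `REQUIRED_SECTIONS` are hypothetical names, and the checks cover just the level 2 heading and the `### Ingredients` / `### Instructions` sections from the example structure.

```python
import re
from typing import List

# Section titles taken from the example Markdown structure in step 1.
REQUIRED_SECTIONS = ("### Ingredients", "### Instructions")


def format_problems(response: str) -> List[str]:
    """Return human-readable problems; an empty list means the response passed."""
    problems: List[str] = []
    # Ignore blank lines and surrounding whitespace so layout differences do not matter.
    lines = [line.strip() for line in response.strip().splitlines() if line.strip()]
    if not lines or not re.match(r"^## \S", lines[0]):
        problems.append("does not start with a level 2 heading")
    for section in REQUIRED_SECTIONS:
        if section not in lines:
            problems.append(f"missing section: {section}")
    if all(section in lines for section in REQUIRED_SECTIONS):
        if lines.index("### Ingredients") > lines.index("### Instructions"):
            problems.append("Ingredients should come before Instructions")
    return problems
```

Applying `format_problems` to the `response` column of a results CSV gives a quick count of formatting failures, which complements rather than replaces reading the responses yourself.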
Good luck with your assignment!
---
*This README was last updated: 2025-05-17*
+1
@@ -0,0 +1 @@
# Package marker for the backend modules.
+87
@@ -0,0 +1,87 @@
"""FastAPI application entry-point for the recipe chatbot."""
from __future__ import annotations
from pathlib import Path
from typing import Final, List, Dict
from fastapi import FastAPI, HTTPException, status
from fastapi.responses import HTMLResponse
from fastapi.staticfiles import StaticFiles
from pydantic import BaseModel, Field
from backend.utils import get_agent_response # noqa: WPS433 import from parent
# -----------------------------------------------------------------------------
# Application setup
# -----------------------------------------------------------------------------
APP_TITLE: Final[str] = "Recipe Chatbot"
app = FastAPI(title=APP_TITLE)
# Serve static assets (currently just the HTML) under `/static/*`.
STATIC_DIR = Path(__file__).parent.parent / "frontend"
app.mount("/static", StaticFiles(directory=STATIC_DIR), name="static")
# -----------------------------------------------------------------------------
# Request / response models
# -----------------------------------------------------------------------------
class ChatMessage(BaseModel):
    """Schema for a single message in the chat history."""

    role: str = Field(..., description="Role of the message sender (system, user, or assistant).")
    content: str = Field(..., description="Content of the message.")


class ChatRequest(BaseModel):
    """Schema for incoming chat messages."""

    messages: List[ChatMessage] = Field(..., description="The entire conversation history.")


class ChatResponse(BaseModel):
    """Schema for the assistant's reply returned to the front-end."""

    messages: List[ChatMessage] = Field(..., description="The updated conversation history.")
# -----------------------------------------------------------------------------
# Routes
# -----------------------------------------------------------------------------
@app.post("/chat", response_model=ChatResponse)
async def chat_endpoint(payload: ChatRequest) -> ChatResponse:  # noqa: WPS430
    """Main conversational endpoint.

    It proxies the user's message list to the underlying agent and returns the updated list.
    """
    # Convert Pydantic models to simple dicts for the agent
    request_messages: List[Dict[str, str]] = [msg.model_dump() for msg in payload.messages]

    try:
        updated_messages_dicts = get_agent_response(request_messages)
    except Exception as exc:  # noqa: BLE001 broad; surface as HTTP 500
        # In production you would log the traceback here.
        raise HTTPException(
            status_code=status.HTTP_500_INTERNAL_SERVER_ERROR,
            detail=str(exc),
        ) from exc

    # Convert dicts back to Pydantic models for the response
    response_messages: List[ChatMessage] = [ChatMessage(**msg) for msg in updated_messages_dicts]
    return ChatResponse(messages=response_messages)


@app.get("/", response_class=HTMLResponse)
async def index() -> HTMLResponse:  # noqa: WPS430
    """Serve the chat UI."""
    html_path = STATIC_DIR / "index.html"
    if not html_path.exists():
        raise HTTPException(
            status_code=status.HTTP_404_NOT_FOUND,
            detail="Frontend not found. Did you forget to build it?",
        )
    return HTMLResponse(html_path.read_text(encoding="utf-8"))
+74
@@ -0,0 +1,74 @@
"""Utility helpers for the recipe chatbot backend.

This module centralises the system prompt, environment loading, and the
wrapper around litellm so the rest of the application stays uncluttered.
"""
from __future__ import annotations
import os
from typing import Final, List, Dict

import litellm  # type: ignore
from dotenv import load_dotenv

# Ensure the .env file is loaded as early as possible.
load_dotenv(override=False)

# --- Constants -------------------------------------------------------------------
SYSTEM_PROMPT: Final[str] = (
    "You are an expert chef recommending delicious and useful recipes. "
    "Present only one recipe at a time. If the user doesn't specify what ingredients "
    "they have available, ask them about their available ingredients rather than "
    "assuming what's in their fridge."
)

# Fetch configuration *after* we loaded the .env file.
MODEL_NAME: Final[str] = os.environ.get("MODEL_NAME", "gpt-3.5-turbo")
# --- Agent wrapper ---------------------------------------------------------------
def get_agent_response(messages: List[Dict[str, str]]) -> List[Dict[str, str]]:  # noqa: WPS231
    """Call the underlying large-language model via *litellm*.

    Parameters
    ----------
    messages:
        The full conversation history. Each item is a dict with "role" and "content".

    Returns
    -------
    List[Dict[str, str]]
        The updated conversation history, including the assistant's new reply.
    """
    # litellm is model-agnostic; it needs only the model name here and reads the
    # provider API key from the environment.
    # Ensure the system prompt is always the first message: prepend it when the
    # history is empty or does not already start with a system message.
    current_messages: List[Dict[str, str]]
    if not messages or messages[0]["role"] != "system":
        current_messages = [{"role": "system", "content": SYSTEM_PROMPT}] + messages
    else:
        current_messages = messages

    completion = litellm.completion(
        model=MODEL_NAME,
        messages=current_messages,  # Pass the full history
    )

    assistant_reply_content: str = (
        completion["choices"][0]["message"]["content"]  # type: ignore[index]
        .strip()
    )

    # Append assistant's response to the history
    updated_messages = current_messages + [{"role": "assistant", "content": assistant_reply_content}]
    return updated_messages
+4
@@ -0,0 +1,4 @@
id,query
1,Suggest a quick vegan breakfast recipe
2,"I have chicken and rice, what can I cook?"
3,Give me a dessert recipe with chocolate
+9
@@ -0,0 +1,9 @@
# Copy this file to `.env` and fill in your credentials.
MODEL_NAME=openai/gpt-4.1-nano
# Example API keys
# OPENAI_API_KEY=
# TOGETHER_API_KEY=
# GEMINI_API_KEY=
# ANTHROPIC_API_KEY=
+264
@@ -0,0 +1,264 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<title>Recipe Chatbot</title>
<script src="https://cdn.jsdelivr.net/npm/marked/marked.min.js"></script>
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=Inter:wght@400;500;600&display=swap" rel="stylesheet">
<style>
/* Basic, responsive chat styling */
body {
font-family: 'Inter', -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, Helvetica, Arial, sans-serif;
background: #f8f9fa; /* Slightly lighter background */
display: flex;
flex-direction: column;
align-items: center;
margin: 0;
height: 100vh;
}
#app-container { /* New wrapper */
display: flex;
flex-direction: column;
width: 100%;
max-width: 600px;
height: 100%; /* Take full height of body */
box-sizing: border-box;
/* padding: 0 16px; /* Remove horizontal padding here, apply to children if needed */
background-color: #ffffff; /* Give app-container a white background */
box-shadow: 0 4px 12px rgba(0,0,0,0.05); /* Subtle shadow for depth */
border-radius: 12px; /* Rounded corners for the whole app box */
overflow: hidden; /* To make sure children respect border-radius */
margin: 16px; /* Add some margin around the app container */
}
#chat-container {
/* width: 100%; /* Handled by app-container */
/* max-width: 600px; /* Handled by app-container */
flex: 1; /* Takes available space */
display: flex;
flex-direction: column;
padding: 20px 20px 10px 20px; /* More padding */
overflow-y: auto;
box-sizing: border-box;
}
.message {
padding: 10px 15px;
border-radius: 18px; /* Softer, more modern radius */
margin-bottom: 10px;
max-width: 85%;
line-height: 1.5;
}
.user {
align-self: flex-end;
background-color: #e0f0ff; /* Lighter, softer blue for user */
color: #00529b; /* Darker blue text for contrast */
}
.assistant {
align-self: flex-start;
background-color: #f1f3f5; /* Lighter gray for assistant */
color: #343a40; /* Darker gray text */
border: none; /* Remove default border */
}
/* Styling for markdown elements inside assistant messages */
.assistant p { margin: 0.5em 0; }
.assistant ul, .assistant ol { margin: 0.5em 0 0.5em 20px; padding: 0; }
.assistant li { margin-bottom: 0.25em; }
.assistant pre {
background-color: #e9ecef;
padding: 10px;
border-radius: 6px;
overflow-x: auto;
font-size: 0.9em;
}
.assistant code {
font-family: 'SFMono-Regular', Consolas, 'Liberation Mono', Menlo, Courier, monospace;
font-size: 0.9em;
background-color: rgba(0,0,0,0.04);
padding: 2px 4px;
border-radius: 4px;
}
.assistant pre code {
background-color: transparent;
padding: 0;
border-radius: 0;
}
#typing-indicator {
box-sizing: border-box;
display: none;
color: #6c757d; /* Subtler gray for typing */
font-style: italic;
padding: 10px 15px; /* Match message padding */
margin-bottom: 8px;
border-radius: 18px; /* Match message radius */
background-color: #f1f3f5; /* Match assistant background */
align-self: flex-start; /* Ensure it aligns like an assistant message */
max-width: fit-content; /* Only as wide as its content */
margin-left: 20px; /* Align with chat content padding */
}
#input-form {
display: flex;
align-items: center; /* Vertically align input and button */
padding: 15px 20px; /* More padding */
box-sizing: border-box;
background-color: #f8f9fa; /* Slightly off-white for input area */
border-top: 1px solid #dee2e6; /* Subtle separator line */
}
#user-input {
flex: 1;
padding: 12px 15px;
font-size: 1rem; /* Consistent font size */
border: 1px solid #ced4da; /* Softer border */
border-radius: 20px; /* Rounded pill shape */
margin-right: 10px;
outline: none; /* Remove default focus outline */
transition: border-color 0.2s ease-in-out, box-shadow 0.2s ease-in-out;
}
#user-input:focus {
border-color: #4dabf7; /* Blue border on focus */
box-shadow: 0 0 0 3px rgba(77, 171, 247, 0.25); /* Subtle glow on focus */
}
#send-btn {
padding: 12px 20px;
font-size: 1rem;
font-weight: 500; /* Slightly bolder text */
background-color: #28a745; /* Green send button */
color: #fff;
border: none;
border-radius: 20px; /* Match input field */
cursor: pointer;
transition: background-color 0.2s ease-in-out;
}
#send-btn:hover {
background-color: #218838; /* Darker green on hover */
}
#send-btn:disabled {
background-color: #adb5bd; /* Gray when disabled */
cursor: not-allowed;
}
</style>
</head>
<body>
<h2 style="font-family: 'Inter', sans-serif; font-weight: 600; color: #343a40; margin-top: 24px; margin-bottom: 16px; text-align: center;">Recipe Chatbot</h2>
<div id="app-container"> <!-- New wrapper -->
<div id="chat-container">
<!-- Messages are rendered here by renderChat -->
</div>
<div id="typing-indicator">Assistant is typing...</div>
<form id="input-form">
<input
id="user-input"
type="text"
placeholder="Ask for a recipe..."
autocomplete="off"
required
/>
<button id="send-btn" type="submit">Send</button>
</form>
</div> <!-- End of app-container -->
<script>
const form = document.getElementById("input-form");
const input = document.getElementById("user-input");
const chatContainer = document.getElementById("chat-container");
const sendBtn = document.getElementById("send-btn");
const typingIndicator = document.getElementById("typing-indicator");
let chatHistory = []; // Holds all messages: { role: string, content: string }[]
let typingInterval = null; // Variable to hold the interval ID
/**
* Clears and re-renders all messages in the chat container based on chatHistory.
*/
function renderChat() {
chatContainer.innerHTML = ""; // Clear existing messages
chatHistory.forEach(msg => {
if (msg.role === "system") return;
const bubble = document.createElement("div");
bubble.classList.add("message", msg.role);
if (msg.role === "assistant") {
bubble.innerHTML = marked.parse(msg.content || ""); // Use marked.parse for assistant
} else {
bubble.textContent = msg.content;
}
chatContainer.appendChild(bubble);
});
chatContainer.scrollTop = chatContainer.scrollHeight;
}
async function sendMessage(evt) {
evt.preventDefault();
const userText = input.value.trim();
if (!userText) return;
// Add user message to history and re-render
chatHistory.push({ role: "user", content: userText });
renderChat();
input.value = "";
input.focus();
sendBtn.disabled = true;
typingIndicator.style.display = "block";
let dotCount = 0;
typingIndicator.textContent = "Assistant is typing"; // Initial text without dots
if (typingInterval) clearInterval(typingInterval);
typingInterval = setInterval(() => {
dotCount = (dotCount + 1) % 4;
typingIndicator.textContent = `Assistant is typing${'.'.repeat(dotCount)}`;
}, 300);
typingIndicator.scrollIntoView({ behavior: "smooth", block: "end" }); // Scroll indicator into view
try {
// Send the whole history
const res = await fetch("/chat", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ messages: chatHistory }),
});
if (!res.ok) {
const errorData = await res.json();
throw new Error(errorData.detail || `Server responded with ${res.status}`);
}
const data = await res.json();
chatHistory = data.messages; // Update history with the server's version
renderChat(); // Re-render with the full history from server
} catch (error) {
// Add error message to history and re-render
chatHistory.push({
role: "assistant",
content: `Oops! Something went wrong: ${error.message}. Please try again.`
// Error message will also be parsed by marked in renderChat
});
renderChat();
console.error(error);
} finally {
sendBtn.disabled = false;
typingIndicator.style.display = "none";
if (typingInterval) clearInterval(typingInterval); // Stop animation
typingInterval = null;
}
}
form.addEventListener("submit", sendMessage);
// Initial render (in case there's a system message or pre-filled history later)
renderChat();
</script>
</body>
</html>
+6
@@ -0,0 +1,6 @@
fastapi
uvicorn
litellm
python-dotenv
httpx
rich
Binary file not shown.


+134
@@ -0,0 +1,134 @@
"""Bulk testing utility for the recipe chatbot agent.

Reads a CSV file containing user queries, sends them concurrently to the agent
logic in `backend.utils` (bypassing the HTTP `/chat` endpoint), and stores the
results for later manual evaluation.
"""
from __future__ import annotations

import sys
from pathlib import Path

# Add project root to sys.path to allow absolute imports
PROJECT_ROOT = Path(__file__).resolve().parent.parent
sys.path.insert(0, str(PROJECT_ROOT))
import argparse
import csv
import datetime as dt
from typing import List, Tuple, Dict
from concurrent.futures import ThreadPoolExecutor, as_completed
from rich.console import Console, Group
from rich.panel import Panel
from rich.text import Text
from rich.markdown import Markdown
from backend.utils import get_agent_response
# -----------------------------------------------------------------------------
# Configuration helpers
# -----------------------------------------------------------------------------
DEFAULT_CSV: Path = Path("data/sample_queries.csv")
RESULTS_DIR: Path = Path("results")
RESULTS_DIR.mkdir(exist_ok=True)
MAX_WORKERS = 10 # For ThreadPoolExecutor
# -----------------------------------------------------------------------------
# Core logic
# -----------------------------------------------------------------------------
# --- Sync function for ThreadPoolExecutor ---
def process_query_sync(query_id: str, query: str) -> Tuple[str, str, str]:
    """Processes a single query by calling the agent directly."""
    initial_messages: List[Dict[str, str]] = [
        {"role": "user", "content": query}
    ]
    try:
        # get_agent_response returns the full history
        updated_history = get_agent_response(initial_messages)
        # Extract the last assistant message for the result
        assistant_reply = ""
        if updated_history and updated_history[-1]["role"] == "assistant":
            assistant_reply = updated_history[-1]["content"]
        else:  # Should not happen with current logic but good to handle
            assistant_reply = "Error: No assistant reply found in history."
        return query_id, query, assistant_reply
    except Exception as e:
        return query_id, query, f"Error processing query: {str(e)}"


def run_bulk_test(csv_path: Path) -> None:
    """Main entry point for bulk testing."""
    with csv_path.open("r", newline="", encoding="utf-8") as csv_file:
        reader = csv.DictReader(csv_file)
        # Expects columns 'id' and 'query'; rows missing either value are skipped
        input_data: List[Dict[str, str]] = [
            row for row in reader if row.get("id") and row.get("query")
        ]

    if not input_data:
        raise ValueError("No valid data (with 'id' and 'query') found in the provided CSV file.")

    console = Console()
    results_data: List[Tuple[str, str, str]] = []  # Will store (id, query, response)

    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as executor:
        # Map each future back to its input row so failures can be reported by ID
        future_to_data = {
            executor.submit(process_query_sync, item["id"], item["query"]): item
            for item in input_data
        }
        console.print(f"[bold blue]Submitting {len(input_data)} queries to the executor...[/bold blue]")

        # Results arrive in completion order, not in CSV order
        for i, future in enumerate(as_completed(future_to_data)):
            item_data = future_to_data[future]
            item_id = item_data["id"]
            item_query = item_data["query"]
            try:
                processed_id, original_query, response_text = future.result()
                results_data.append((processed_id, original_query, response_text))

                panel_content = Text()
                panel_content.append(f"ID: {processed_id}\n", style="bold magenta")
                panel_content.append("Query:\n", style="bold yellow")
                panel_content.append(f"{original_query}\n\n")

                # Create a separate Markdown object for the response
                response_markdown = Markdown(response_text)

                # Group the different parts for the Panel
                panel_group = Group(
                    panel_content,  # Contains ID and Query
                    Markdown("--- Response ---"),  # A small separator for clarity
                    response_markdown,  # The Markdown rendered response
                )
                console.print(Panel(
                    panel_group,  # Pass the group as the single renderable
                    title=f"Result {i+1}/{len(input_data)} - ID: {processed_id}",
                    border_style="cyan",
                ))
            except Exception as exc:
                console.print(Panel(
                    f"[bold red]Exception for ID {item_id}, Query:[/bold red]\n{item_query}\n\n"
                    f"[bold red]Error:[/bold red]\n{exc}",
                    title=f"Error in Result {i+1}/{len(input_data)} - ID: {item_id}",
                    border_style="red",
                ))
                results_data.append((item_id, item_query, f"Exception during processing: {str(exc)}"))

    console.print("[bold blue]All queries processed.[/bold blue]")

    timestamp = dt.datetime.now().strftime("%Y%m%d_%H%M%S")
    out_path = RESULTS_DIR / f"results_{timestamp}.csv"
    with out_path.open("w", newline="", encoding="utf-8") as csv_file:
        writer = csv.writer(csv_file)
        writer.writerow(["id", "query", "response"])
        writer.writerows(results_data)
    console.print(f"[bold green]Saved {len(results_data)} results to {str(out_path)}[/bold green]")


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Bulk test the recipe chatbot")
    parser.add_argument(
        "--csv",
        type=Path,
        default=DEFAULT_CSV,
        help="Path to CSV file containing queries (required columns: 'id' and 'query').",
    )
    args = parser.parse_args()
    run_bulk_test(args.csv)