run pnpm format

starmorph
2025-05-08 10:36:57 -07:00
parent 2c6f3c97a2
commit 042ed66ba5
23 changed files with 6237 additions and 3598 deletions
+35 -17
@@ -16,8 +16,7 @@ pnpm install
OPENAI_API_KEY=your_api_key_here
```
3. Start Langgraph Studio
3. Start Langgraph Studio
```bash
npm run agent
@@ -37,8 +36,8 @@ pnpm agent
- [Course Outline](#course-outline)
- [To-Do List](#to-do-list-tues-apr-29---fri-may-2nd)
### Project Structure
- `scripts/`: TypeScript scripts to run the email assistant
- `lib/`: Utility functions, tools, and shared types
- `lib/tools/`: Tool implementations
@@ -47,11 +46,13 @@ pnpm agent
- `lib/utils.ts`: Utility functions
### Architecture
- Uses `StateGraph` from LangGraph to create a multi-step workflow
- Leverages `Annotation` to track state across the graph nodes
- Two main components: triage router + email response agent
### Key LangChain/LangGraph Components:
- `initChatModel`: Creates LLM instance
- `StructuredTool`: Base class for tool definitions
- `BaseMessage`, `AIMessage`, `HumanMessage`, `SystemMessage`: Message handling
@@ -61,6 +62,7 @@ pnpm agent
- `Command`: Directs state transitions in the graph
### Workflow Sequence:
1. `triageRouter`: Classifies email as respond/ignore/notify
2. If respond → `response_agent` (compiled agent workflow)
3. Agent workflow:
@@ -69,22 +71,21 @@ pnpm agent
- `shouldContinue`: Determines if more tool calls needed
### Email Processing:
- Parses emails with `parseEmail` → author, to, subject, thread
- Formats email content with `formatEmailMarkdown`
- Routes to appropriate handling based on classification
### State Management:
- `AgentState` tracks messages, email input, and classification
- Properly typed with TypeScript for complete type safety
- Uses command pattern to transition between states
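The state handling described above can be sketched without any LangGraph dependency. This is a minimal illustration, not the project's implementation: `SketchState`, `SketchMessage`, and `applyUpdate` are hypothetical names, and the only behavior borrowed from the project is the append-style reducer `[...left, ...right]` used for `messages`.

```typescript
// Hypothetical, simplified state shape; the project defines the real one with Zod.
type Classification = "ignore" | "respond" | "notify";
type SketchMessage = { type: "system" | "human" | "ai"; content: string };

interface SketchState {
  messages: SketchMessage[];
  email_input: unknown;
  classification_decision: Classification | null;
}

// Messages are appended (same shape as the [...left, ...right] reducer);
// every other key in the update simply overwrites the previous value.
function applyUpdate(
  state: SketchState,
  update: Partial<SketchState>,
): SketchState {
  return {
    ...state,
    ...update,
    messages: [...state.messages, ...(update.messages ?? [])],
  };
}

// Example input values are made up for illustration.
const initial: SketchState = {
  messages: [],
  email_input: { subject: "Quick question" },
  classification_decision: null,
};

const afterTriage = applyUpdate(initial, {
  classification_decision: "respond",
  messages: [{ type: "human", content: "Respond to the email: ..." }],
});

console.log(afterTriage.classification_decision, afterTriage.messages.length);
```

Treating each node's return value as a partial update keeps nodes small: a node only names the keys it changes, and the reducer decides how they merge.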
## LangGraph Studio Testing
To test the email assistant in LangGraph Studio, use this example email input:
```json
{
"email_input": {
@@ -102,15 +103,18 @@ To test the email assistant in LangGraph Studio, use this example email input:
### Node Details and Expected Outputs
#### Triage Router Node
This node analyzes the email content and classifies it into one of three categories:
1. **respond** - Emails that require a direct response, such as:
- Specific questions from clients or teammates
- Meeting requests and scheduling communications
- Task assignments directed to you
- Direct inquiries about projects, timelines, or deliverables
2. **notify** - Important information that doesn't need a direct response:
- FYI emails and project updates
- Announcements that are relevant to your work
- Information that should be noted but doesn't require action
@@ -123,6 +127,7 @@ This node analyzes the email content and classifies it into one of three categor
- Automated system notifications
Output example after classification:
```json
{
"classification_decision": "respond",
@@ -136,6 +141,7 @@ Output example after classification:
```
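A `classification_decision` like the one above has to be recovered from the model's raw text reply. The sketch below shows one defensive way to do that, assuming the reply is a JSON object with a `classification` field and that anything unparseable or unexpected falls back to `"ignore"`. The function name `parseClassification` is hypothetical.

```typescript
type TriageDecision = "ignore" | "respond" | "notify";
const DECISIONS: readonly TriageDecision[] = ["ignore", "respond", "notify"];

// Parse a raw model reply into a decision. Falls back to "ignore" when the
// reply is not valid JSON or carries a label outside the three allowed values.
function parseClassification(raw: string): TriageDecision {
  try {
    const parsed: unknown = JSON.parse(raw);
    if (typeof parsed === "object" && parsed !== null && "classification" in parsed) {
      const value = (parsed as { classification: unknown }).classification;
      const match = DECISIONS.filter((decision) => decision === value);
      if (match.length > 0) {
        return match[0];
      }
    }
  } catch {
    // Not valid JSON: fall through to the default below.
  }
  return "ignore";
}

console.log(parseClassification('{"reasoning": "asks a question", "classification": "respond"}'));
console.log(parseClassification("Sorry, I cannot classify this."));
```

Defaulting to `"ignore"` means a malformed reply never triggers an automatic response; whether that is the right default for a given inbox is a design choice.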
#### Response Agent Node
This node handles the actual email response generation. It can:
1. Use tools to craft appropriate responses
@@ -144,6 +150,7 @@ This node handles the actual email response generation. It can:
4. End the process when a satisfactory response is formulated
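The stop condition for this node can be sketched as a small routing function. This is a simplified, dependency-free illustration: `MessageSketch`, `ToolCallSketch`, and `END_SKETCH` are hypothetical stand-ins for the LangGraph message types and `END` constant, and the node name `"environment"` follows the non-HITL graph.

```typescript
// Hypothetical, simplified message shapes; the project uses LangGraph message types.
type ToolCallSketch = { name: string; args: Record<string, unknown> };
type MessageSketch =
  | { type: "human"; content: string }
  | { type: "ai"; content: string; tool_calls?: ToolCallSketch[] };

// Stand-in for LangGraph's END constant.
const END_SKETCH = "__end__";

// Route to the tool-running node while the model keeps requesting tools;
// stop when there are no tool calls or when the "Done" tool is requested.
function shouldContinueSketch(
  messages: MessageSketch[],
): "environment" | typeof END_SKETCH {
  const last = messages[messages.length - 1];
  if (!last || last.type !== "ai") return END_SKETCH;

  const calls = last.tool_calls ?? [];
  if (calls.length === 0) return END_SKETCH;
  if (calls.some((call) => call.name === "Done")) return END_SKETCH;

  return "environment";
}

console.log(
  shouldContinueSketch([
    { type: "ai", content: "", tool_calls: [{ name: "write_email", args: {} }] },
  ]),
);
```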
Tool call output example:
```json
{
"messages": [
@@ -166,6 +173,7 @@ Tool call output example:
```
#### Human-in-the-Loop Interactions
For the HITL version, interrupts occur at decision points, allowing you to:
1. **Review** actions before they're executed
@@ -180,6 +188,7 @@ The interrupt dialog will show the proposed action and expected outcome, allowin
To thoroughly test the entire graph and all possible paths, use the following examples:
#### 1. Email Requiring Response (Triage → Response Agent)
```json
{
"email_input": {
@@ -195,6 +204,7 @@ To thoroughly test the entire graph and all possible paths, use the following ex
```
#### 2. Notification Email (Triage → Triage Interrupt Handler)
```json
{
"email_input": {
@@ -210,6 +220,7 @@ To thoroughly test the entire graph and all possible paths, use the following ex
```
#### 3. Email to Ignore (Triage → END)
```json
{
"email_input": {
@@ -225,6 +236,7 @@ To thoroughly test the entire graph and all possible paths, use the following ex
```
#### 4. Testing Triage Interrupt → Response Agent Path
When the triage node classifies an email as "notify", you'll be prompted with an interrupt. To test this path:
1. Use the "Notification Email" example above
@@ -233,6 +245,7 @@ When the triage node classifies an email as "notify", you'll be prompted with an
4. This will direct the flow to the response_agent node
#### 5. Testing Triage Interrupt → END Path
When the triage node classifies an email as "notify", you can also choose to ignore it:
1. Use the "Notification Email" example above
@@ -240,6 +253,7 @@ When the triage node classifies an email as "notify", you can also choose to ign
3. This will end the workflow
#### 6. Testing Response Agent Tool Calls
When the response agent makes a tool call, you'll be prompted with an interrupt. To test different paths:
**Accept the tool call:**
@@ -272,17 +286,21 @@ Try modifying the email content to test different classification outcomes:
2. **For "notify" classification**: Send FYI updates without questions or required actions
3. **For "ignore" classification**: Use marketing language, irrelevant information, or messages clearly meant for others
### Memory Implementation
#### Memory Management:
- `getMemory`: Retrieves memory from the store or initializes it with defaults
- `updateMemory`: Updates memory based on user feedback
#### Memory is organized in namespaces for different aspects:
- ["email_assistant", "triage_preferences"]: Email classification preferences
- ["email_assistant", "response_preferences"]: Email response style preferences
- ["email_assistant", "cal_preferences"]: Calendar meeting preferences
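The namespace layout can be illustrated with a tiny in-memory store. Everything here is a sketch under assumptions: `InMemoryStoreSketch`, `getMemorySketch`, and `updateMemorySketch` are hypothetical names, values are plain strings, and the update step simply appends the feedback rather than reproducing the project's feedback-driven update logic.

```typescript
type Namespace = readonly string[];

// Hypothetical stand-in for a persistent store, keyed by the joined namespace.
class InMemoryStoreSketch {
  private data: Record<string, string> = {};

  private key(ns: Namespace): string {
    return ns.join("/");
  }

  get(ns: Namespace): string | undefined {
    const k = this.key(ns);
    return Object.prototype.hasOwnProperty.call(this.data, k)
      ? this.data[k]
      : undefined;
  }

  put(ns: Namespace, value: string): void {
    this.data[this.key(ns)] = value;
  }
}

// Return the stored value, or write and return the default on first access.
function getMemorySketch(
  store: InMemoryStoreSketch,
  ns: Namespace,
  defaultContent: string,
): string {
  const existing = store.get(ns);
  if (existing !== undefined) return existing;
  store.put(ns, defaultContent);
  return defaultContent;
}

// Assumption: feedback is appended as a bullet. The real update is described
// as smarter than this; the sketch only shows where the write happens.
function updateMemorySketch(
  store: InMemoryStoreSketch,
  ns: Namespace,
  feedback: string,
): string {
  const current = store.get(ns) ?? "";
  const updated = current ? `${current}\n- ${feedback}` : `- ${feedback}`;
  store.put(ns, updated);
  return updated;
}

const store = new InMemoryStoreSketch();
const calNs = ["email_assistant", "cal_preferences"] as const;

// Example default and feedback strings are illustrative only.
const firstRead = getMemorySketch(store, calNs, "30 minute meetings are preferred.");
const updatedPrefs = updateMemorySketch(store, calNs, "Avoid meetings before 9am.");

console.log(firstRead);
console.log(updatedPrefs);
```

Keeping each preference type in its own namespace lets triage, response, and calendar feedback evolve independently.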
## TS Video outline
## TS Video outline
> BUILD
> EVAL
@@ -292,14 +310,14 @@ Try modifying the email content to test different classification outcomes:
> MEMORY
## To Do List Tues Apr 29 - Fri May 2nd
- [x] prompts
- [x] schemas
- [x] mock tools
- [x] utils
- [x] email assistant
- [x] prompts
- [x] schemas
- [x] mock tools
- [x] utils
- [x] email assistant
- [x] email_assistant_hitl
- [ ] email_assistant_memory (in progress)
- [ ] improve implementations
- [ ] improve structure to mirror Python more closely
- [ ] graph diagrams ```await graph.getGraph().drawMermaidPng()```
- [ ] improve structure to mirror Python more closely
- [ ] graph diagrams `await graph.getGraph().drawMermaidPng()`
+52 -52
@@ -1,24 +1,24 @@
import js from '@eslint/js';
import tsPlugin from '@typescript-eslint/eslint-plugin';
import tsParser from '@typescript-eslint/parser';
import importPlugin from 'eslint-plugin-import';
import noInstanceof from 'eslint-plugin-no-instanceof';
import globals from 'globals';
import js from "@eslint/js";
import tsPlugin from "@typescript-eslint/eslint-plugin";
import tsParser from "@typescript-eslint/parser";
import importPlugin from "eslint-plugin-import";
import noInstanceof from "eslint-plugin-no-instanceof";
import globals from "globals";
export default [
js.configs.recommended,
{
plugins: {
import: importPlugin,
'@typescript-eslint': tsPlugin,
'no-instanceof': noInstanceof,
"@typescript-eslint": tsPlugin,
"no-instanceof": noInstanceof,
},
languageOptions: {
parser: tsParser,
parserOptions: {
ecmaVersion: 2021,
project: './tsconfig.json',
sourceType: 'module',
project: "./tsconfig.json",
sourceType: "module",
},
globals: {
...globals.node,
@@ -26,51 +26,51 @@ export default [
},
},
ignores: [
'eslint.config.js',
'scripts',
'src/utils/lodash/*',
'node_modules',
'dist',
'dist-cjs',
'*.js',
'*.cjs',
'*.d.ts',
"eslint.config.js",
"scripts",
"src/utils/lodash/*",
"node_modules",
"dist",
"dist-cjs",
"*.js",
"*.cjs",
"*.d.ts",
],
rules: {
'@typescript-eslint/explicit-module-boundary-types': 0,
'@typescript-eslint/no-empty-function': 0,
'@typescript-eslint/no-shadow': 0,
'@typescript-eslint/no-empty-interface': 0,
'@typescript-eslint/no-use-before-define': ['error', 'nofunc'],
'@typescript-eslint/no-unused-vars': ['warn', { args: 'none' }],
'@typescript-eslint/no-floating-promises': 'error',
'@typescript-eslint/no-misused-promises': 'error',
'@typescript-eslint/no-explicit-any': 0,
"@typescript-eslint/explicit-module-boundary-types": 0,
"@typescript-eslint/no-empty-function": 0,
"@typescript-eslint/no-shadow": 0,
"@typescript-eslint/no-empty-interface": 0,
"@typescript-eslint/no-use-before-define": ["error", "nofunc"],
"@typescript-eslint/no-unused-vars": ["warn", { args: "none" }],
"@typescript-eslint/no-floating-promises": "error",
"@typescript-eslint/no-misused-promises": "error",
"@typescript-eslint/no-explicit-any": 0,
camelcase: 0,
'class-methods-use-this': 0,
'import/extensions': [2, 'ignorePackages'],
'import/no-extraneous-dependencies': [
'error',
{ devDependencies: ['**/*.test.ts'] },
"class-methods-use-this": 0,
"import/extensions": [2, "ignorePackages"],
"import/no-extraneous-dependencies": [
"error",
{ devDependencies: ["**/*.test.ts"] },
],
'import/no-unresolved': 0,
'import/prefer-default-export': 0,
'keyword-spacing': 'error',
'max-classes-per-file': 0,
'max-len': 0,
'no-await-in-loop': 0,
'no-bitwise': 0,
'no-console': 0,
'no-restricted-syntax': 0,
'no-shadow': 0,
'no-continue': 0,
'no-underscore-dangle': 0,
'no-use-before-define': 0,
'no-useless-constructor': 0,
'no-return-await': 0,
'consistent-return': 0,
'no-else-return': 0,
'new-cap': ['error', { properties: false, capIsNew: false }],
"import/no-unresolved": 0,
"import/prefer-default-export": 0,
"keyword-spacing": "error",
"max-classes-per-file": 0,
"max-len": 0,
"no-await-in-loop": 0,
"no-bitwise": 0,
"no-console": 0,
"no-restricted-syntax": 0,
"no-shadow": 0,
"no-continue": 0,
"no-underscore-dangle": 0,
"no-use-before-define": 0,
"no-useless-constructor": 0,
"no-return-await": 0,
"consistent-return": 0,
"no-else-return": 0,
"new-cap": ["error", { properties: false, capIsNew: false }],
},
},
];
];
+19 -18
@@ -1,25 +1,26 @@
/** @type {import('jest').Config} */
export default {
preset: 'ts-jest/presets/js-with-ts-esm',
testEnvironment: 'node',
testMatch: ['**/tests-ts/**/*.test.ts'],
extensionsToTreatAsEsm: ['.ts'],
preset: "ts-jest/presets/js-with-ts-esm",
testEnvironment: "node",
testMatch: ["**/tests-ts/**/*.test.ts"],
extensionsToTreatAsEsm: [".ts"],
transform: {
'^.+\\.tsx?$': ['ts-jest', {
tsconfig: 'tsconfig.json',
useESM: true,
}],
"^.+\\.tsx?$": [
"ts-jest",
{
tsconfig: "tsconfig.json",
useESM: true,
},
],
},
moduleNameMapper: {
'^@/(.*)$': '<rootDir>/$1',
'^(\\.{1,2}/.*)\\.js$': '$1',
"^@/(.*)$": "<rootDir>/$1",
"^(\\.{1,2}/.*)\\.js$": "$1",
},
transformIgnorePatterns: [
'node_modules/(?!(@langchain|langchain|@jest)/)'
],
rootDir: '../',
setupFilesAfterEnv: ['<rootDir>/tests-ts/setup.mjs'],
transformIgnorePatterns: ["node_modules/(?!(@langchain|langchain|@jest)/)"],
rootDir: "../",
setupFilesAfterEnv: ["<rootDir>/tests-ts/setup.mjs"],
testTimeout: 30000, // For LLM calls
moduleDirectories: ['node_modules', 'src'],
resolver: 'jest-ts-webcompat-resolver',
};
moduleDirectories: ["node_modules", "src"],
resolver: "jest-ts-webcompat-resolver",
};
+1 -1
@@ -8,4 +8,4 @@
"hitlEmailAssistantMemory": "./scripts/email_assistant_hitl_memory.ts:getHitlEmailAssistantWithMemory"
},
"env": ".env"
}
}
+4378 -2200
File diff suppressed because it is too large
+17 -21
@@ -1,5 +1,3 @@
// Standard tool descriptions for insertion into prompts
export const STANDARD_TOOLS_PROMPT = `
1. triage_email(ignore, notify, respond) - Triage emails into one of three categories
@@ -7,7 +5,7 @@ export const STANDARD_TOOLS_PROMPT = `
3. schedule_meeting(attendees, subject, duration_minutes, preferred_day, start_time) - Schedule calendar meetings where preferred_day is a datetime object
4. check_calendar_availability(day) - Check available time slots for a given day
5. Done - E-mail has been sent
`
`;
// Tool descriptions for HITL workflow
export const HITL_TOOLS_PROMPT = `
@@ -16,7 +14,7 @@ export const HITL_TOOLS_PROMPT = `
3. check_calendar_availability(day) - Check available time slots for a given day
4. Question(content) - Ask the user any follow-up questions
5. Done - E-mail has been sent
`
`;
// Tool descriptions for HITL with memory workflow
export const HITL_MEMORY_TOOLS_PROMPT = `
@@ -26,7 +24,7 @@ export const HITL_MEMORY_TOOLS_PROMPT = `
4. Question(content) - Ask the user any follow-up questions
5. background - Search for background information about the user and their contacts
6. Done - E-mail has been sent
`
`;
// Tool descriptions for agent workflow without triage
export const AGENT_TOOLS_PROMPT = `
@@ -34,7 +32,7 @@ export const AGENT_TOOLS_PROMPT = `
2. schedule_meeting(attendees, subject, duration_minutes, preferred_day, start_time) - Schedule calendar meetings where preferred_day is a datetime object
3. check_calendar_availability(day) - Check available time slots for a given day
4. Done - E-mail has been sent
`
`;
export const agentSystemPromptBaseline = `
<Role>
@@ -53,7 +51,7 @@ When handling emails, follow these steps:
3. For responding to the email, draft a response email with the write_email tool
4. For meeting requests, use the check_calendar_availability tool to find open time slots
5. To schedule a meeting, use the schedule_meeting tool with a datetime object for the preferred_day parameter
- Today's date is ${new Date().toISOString().split('T')[0]} - use this for scheduling meetings accurately
- Today's date is ${new Date().toISOString().split("T")[0]} - use this for scheduling meetings accurately
6. If you scheduled a meeting, then draft a short response email using the write_email tool
7. After using the write_email tool, the task is complete
8. If you have sent the email, then use the Done tool to indicate that the task is complete
@@ -76,7 +74,7 @@ When handling emails, follow these steps:
</Calendar Preferences>
`;
// Agentic workflow triage prompt
// Agentic workflow triage prompt
export const triageSystemPrompt = `
<Role>
Your role is to triage incoming emails based upon instructions and background information below.
@@ -99,7 +97,7 @@ Classify the below email into one of these categories.
</Rules>
`;
// Agentic workflow triage user prompt
// Agentic workflow triage user prompt
export const triageUserPrompt = `
Please determine how to handle the below email thread:
@@ -108,7 +106,7 @@ To: {to}
Subject: {subject}
{email_thread}`;
// Agentic workflow prompt
// Agentic workflow prompt
export const agentSystemPrompt = `
<Role>
You are a top-notch executive assistant who cares about helping your executive perform as well as possible.
@@ -126,7 +124,7 @@ When handling emails, follow these steps:
3. For responding to the email, draft a response email with the write_email tool
4. For meeting requests, use the check_calendar_availability tool to find open time slots
5. To schedule a meeting, use the schedule_meeting tool with a datetime object for the preferred_day parameter
- Today's date is ${new Date().toISOString().split('T')[0]} - use this for scheduling meetings accurately
- Today's date is ${new Date().toISOString().split("T")[0]} - use this for scheduling meetings accurately
6. If you scheduled a meeting, then draft a short response email using the write_email tool
7. After using the write_email tool, the task is complete
8. If you have sent the email, then use the Done tool to indicate that the task is complete
@@ -145,7 +143,7 @@ When handling emails, follow these steps:
</Calendar Preferences>
`;
// Agentic workflow with HITL prompt
// Agentic workflow with HITL prompt
export const agentSystemPromptHitl = `
<Role>
You are a top-notch executive assistant who cares about helping your executive perform as well as possible.
@@ -164,7 +162,7 @@ When handling emails, follow these steps:
4. For responding to the email, draft a response email with the write_email tool
5. For meeting requests, use the check_calendar_availability tool to find open time slots
6. To schedule a meeting, use the schedule_meeting tool with a datetime object for the preferred_day parameter
- Today's date is ${new Date().toISOString().split('T')[0]} - use this for scheduling meetings accurately
- Today's date is ${new Date().toISOString().split("T")[0]} - use this for scheduling meetings accurately
7. If you scheduled a meeting, then draft a short response email using the write_email tool
8. After using the write_email tool, the task is complete
9. If you have sent the email, then use the Done tool to indicate that the task is complete
@@ -183,7 +181,7 @@ When handling emails, follow these steps:
</Calendar Preferences>
`;
// Agentic workflow with HITL and memory prompt
// Agentic workflow with HITL and memory prompt
export const agentSystemPromptHitlMemory = `
<Role>
You are a top-notch executive assistant.
@@ -204,7 +202,7 @@ When handling emails, follow these steps:
6. If the provided background information, meeting preferences, or response preferences are not sufficient, use the Question tool to ask follow-up questions
7. For meeting requests, use the check_calendar_availability tool to find open time slots
8. Schedule meetings with the schedule_meeting tool when appropriate
- Today's date is ${new Date().toISOString().split('T')[0]} - use this for scheduling meetings accurately
- Today's date is ${new Date().toISOString().split("T")[0]} - use this for scheduling meetings accurately
9. If you scheduled a meeting, then draft a short response email using the write_email tool
10. Draft response emails using the write_email tool
11. After calling the write_email tool, the task is complete
@@ -224,12 +222,12 @@ When handling emails, follow these steps:
</Background>
`;
// Default background information
// Default background information
export const defaultBackground = `
I'm Lance, a software engineer at LangChain.
`;
// Default response preferences
// Default response preferences
export const defaultResponsePreferences = `
Use professional and concise language. If the e-mail mentions a deadline, make sure to explicitly acknowledge and reference the deadline in your response.
@@ -255,12 +253,12 @@ When responding to meeting scheduling requests:
- Reference the meeting's purpose in your response.
`;
// Default calendar preferences
// Default calendar preferences
export const defaultCalPreferences = `
30 minute meetings are preferred, but 15 minute meetings are also acceptable.
`;
// Default triage instructions
// Default triage instructions
export const defaultTriageInstructions = `
Emails that are not worth responding to:
- Marketing newsletters and promotional emails
@@ -287,5 +285,3 @@ Emails that are worth responding to:
- Personal reminders related to family (wife / daughter)
- Personal reminder related to self-care (doctor appointments, etc)
`;
+21 -19
@@ -8,42 +8,44 @@ export const BaseEmailAgentState = z.object({
messages: z
.array(z.any()) // Using any to support all Message types
.default(() => [])
.langgraph.reducer(
(left, right) => [...left, ...right],
z.array(z.any())
),
.langgraph.reducer((left, right) => [...left, ...right], z.array(z.any())),
email_input: z.any(),
classification_decision: z.enum(["ignore", "respond", "notify"]).nullable().default(null)
classification_decision: z
.enum(["ignore", "respond", "notify"])
.nullable()
.default(null),
});
export const EmailAgentHITLState = z.object({
messages: z
.array(z.any()) // Using any to support all Message types
.default(() => [])
.langgraph.reducer(
(left, right) => [...left, ...right],
z.array(z.any())
),
.langgraph.reducer((left, right) => [...left, ...right], z.array(z.any())),
email_input: z.any(),
classification_decision: z.enum(["ignore", "respond", "notify", "error"]).nullable().default(null)
classification_decision: z
.enum(["ignore", "respond", "notify", "error"])
.nullable()
.default(null),
});
// Export the inferred types from the Zod schemas
export type BaseEmailAgentStateType = z.infer<typeof BaseEmailAgentState>;
export type EmailAgentHITLStateType = z.infer<typeof EmailAgentHITLState>;
/**
* Router schema for analyzing unread emails and routing based on content
*/
export const RouterSchema = z.object({
reasoning: z.string().describe("Step-by-step reasoning behind the classification"),
classification: z.enum(["ignore", "respond", "notify"]).describe(
"The classification of an email: 'ignore' for irrelevant emails, " +
"'notify' for important information that doesn't need a response, " +
"'respond' for emails that need a reply"
),
reasoning: z
.string()
.describe("Step-by-step reasoning behind the classification"),
classification: z
.enum(["ignore", "respond", "notify"])
.describe(
"The classification of an email: 'ignore' for irrelevant emails, " +
"'notify' for important information that doesn't need a response, " +
"'respond' for emails that need a reply",
),
});
export type RouterOutput = z.infer<typeof RouterSchema>;
@@ -75,4 +77,4 @@ export type State = {
messages: BaseMessage[];
email_input: EmailData;
classification_decision?: "ignore" | "respond" | "notify";
};
};
+108 -92
@@ -1,11 +1,11 @@
/**
* @fileoverview Basic Email Assistant
*
*
* This script implements a basic email assistant that can triage incoming emails
* and generate responses without human intervention.
*
* @module email_assistant
*
*
* @structure
* ┌─────────────────────────────────────────────────────────────────────────┐
* │ Email Assistant │
@@ -39,12 +39,7 @@
import { initChatModel } from "langchain/chat_models/universal";
// LangGraph imports
import {
StateGraph,
START,
END,
Command
} from "@langchain/langgraph";
import { StateGraph, START, END, Command } from "@langchain/langgraph";
import { ToolCall } from "@langchain/core/messages/tool";
import { ToolNode } from "@langchain/langgraph/prebuilt";
@@ -53,43 +48,44 @@ import { ToolNode } from "@langchain/langgraph/prebuilt";
import "@langchain/langgraph/zod";
// LOCAL IMPORTS
import {
getTools,
getToolsByName
} from "../tools/base.js";
import { getTools, getToolsByName } from "../tools/base.js";
import {
triageSystemPrompt,
triageUserPrompt,
agentSystemPrompt, defaultBackground,
agentSystemPrompt,
defaultBackground,
defaultResponsePreferences,
defaultCalPreferences,
defaultTriageInstructions,
AGENT_TOOLS_PROMPT
AGENT_TOOLS_PROMPT,
} from "../prompts.js";
import {
BaseEmailAgentState,
BaseEmailAgentStateType
} from "../schemas.js";
import {
parseEmail,
formatEmailMarkdown
} from "../utils.js";
import { BaseEmailAgentState, BaseEmailAgentStateType } from "../schemas.js";
import { parseEmail, formatEmailMarkdown } from "../utils.js";
// Message Types from LangGraph SDK
import { AIMessage, Message } from "@langchain/langgraph-sdk";
// Helper for type checking
const hasToolCalls = (message: Message): message is AIMessage & { tool_calls: ToolCall[] } => {
return message.type === "ai" &&
"tool_calls" in message &&
Array.isArray(message.tool_calls);
const hasToolCalls = (
message: Message,
): message is AIMessage & { tool_calls: ToolCall[] } => {
return (
message.type === "ai" &&
"tool_calls" in message &&
Array.isArray(message.tool_calls)
);
};
// Define proper TypeScript types for our state
type AgentStateType = BaseEmailAgentStateType;
// Define node names as a union type for better type safety
type AgentNodes = typeof START | typeof END | "llm_call" | "environment" | "triage_router" | "response_agent";
type AgentNodes =
| typeof START
| typeof END
| "llm_call"
| "environment"
| "triage_router"
| "response_agent";
/**
* Initialize and export the email assistant
@@ -98,16 +94,16 @@ export const initializeEmailAssistant = async () => {
// Get tools
const tools = await getTools();
const toolsByName = await getToolsByName(tools);
// Initialize the LLM
const llm = await initChatModel("openai:gpt-4", {
const llm = await initChatModel("openai:gpt-4", {
temperature: 0.0,
openAIApiKey: process.env.OPENAI_API_KEY
openAIApiKey: process.env.OPENAI_API_KEY,
});
// Initialize the LLM for tool use
const llmWithTools = llm.bindTools(tools, { toolChoice: "required" });
// Create the LLM call node
const llmCallNode = async (state: AgentStateType) => {
/**
@@ -120,22 +116,22 @@ export const initializeEmailAssistant = async () => {
.replace("{background}", defaultBackground)
.replace("{response_preferences}", defaultResponsePreferences)
.replace("{cal_preferences}", defaultCalPreferences);
// Run the LLM with the messages
const response = await llmWithTools.invoke([
{ type: "system", content: systemPromptContent },
...messages
...messages,
]);
// Use explicit casting as the response is compatible with Message in runtime
return {
messages: [response as unknown as Message]
messages: [response as unknown as Message],
};
};
// Create the tool node
const toolNode = new ToolNode(tools);
// Conditional edge function for routing
const shouldContinue = (state: AgentStateType) => {
/**
@@ -144,25 +140,25 @@ export const initializeEmailAssistant = async () => {
*/
const messages = state.messages;
if (!messages || messages.length === 0) return END;
const lastMessage = messages[messages.length - 1];
if (hasToolCalls(lastMessage) && lastMessage.tool_calls.length > 0) {
// Check if any tool call is the "Done" tool
if (lastMessage.tool_calls.some(toolCall => toolCall.name === "Done")) {
if (lastMessage.tool_calls.some((toolCall) => toolCall.name === "Done")) {
return END;
}
return "environment";
}
return END;
};
// Create the triage router node
const triageRouterNode = async (state: AgentStateType) => {
/**
* Analyze email content to decide if we should respond, notify, or ignore.
*
*
* The triage step prevents the assistant from wasting time on:
* - Marketing emails and spam
* - Company-wide announcements
@@ -171,79 +167,94 @@ export const initializeEmailAssistant = async () => {
try {
const { email_input } = state;
const parseResult = parseEmail(email_input);
// Validate parsing result
if (!parseResult || typeof parseResult !== 'object') {
if (!parseResult || typeof parseResult !== "object") {
throw new Error("Invalid email parsing result");
}
const { author, to, subject, emailThread } = parseResult;
const systemPrompt = triageSystemPrompt
.replace("{background}", defaultBackground)
.replace("{triage_instructions}", defaultTriageInstructions);
const userPrompt = triageUserPrompt
.replace("{author}", author)
.replace("{to}", to)
.replace("{subject}", subject)
.replace("{email_thread}", emailThread);
// Create email markdown for Agent Inbox in case of notification
const emailMarkdown = formatEmailMarkdown(subject, author, to, emailThread);
// Create email markdown for Agent Inbox in case of notification
const emailMarkdown = formatEmailMarkdown(
subject,
author,
to,
emailThread,
);
// Add JSON format instruction to the system prompt
const jsonSystemPrompt = `${systemPrompt}\n\nProvide your response in the following JSON format:
{
"reasoning": "your step-by-step reasoning",
"classification": "ignore" | "respond" | "notify"
}`;
// Use the regular LLM instead of withStructuredOutput
const response = await llm.invoke([
{ type: "system", content: jsonSystemPrompt },
{ type: "human", content: userPrompt }
{ type: "human", content: userPrompt },
]);
// Parse the JSON response manually
let classification: "ignore" | "respond" | "notify" = "ignore"; // Default
try {
// Extract JSON from the response content
const responseText = response.content.toString();
const parsedResponse = JSON.parse(responseText);
if (parsedResponse.classification &&
["ignore", "respond", "notify"].includes(parsedResponse.classification)) {
if (
parsedResponse.classification &&
["ignore", "respond", "notify"].includes(
parsedResponse.classification,
)
) {
classification = parsedResponse.classification;
}
} catch (parseError) {
console.error("Error parsing LLM response as JSON:", parseError);
console.log("Raw response:", response.content.toString());
}
let goto = END;
let update: Partial<AgentStateType> = {
classification_decision: classification
classification_decision: classification,
};
if (classification === "respond") {
console.log("📧 Classification: RESPOND - This email requires a response");
console.log(
"📧 Classification: RESPOND - This email requires a response",
);
goto = "response_agent";
update.messages = [
{ type: "human", content: `Respond to the email: ${emailMarkdown}` }
{ type: "human", content: `Respond to the email: ${emailMarkdown}` },
];
} else if (classification === "ignore") {
console.log("🚫 Classification: IGNORE - This email can be safely ignored");
console.log(
"🚫 Classification: IGNORE - This email can be safely ignored",
);
} else if (classification === "notify") {
console.log("🔔 Classification: NOTIFY - This email contains important information");
console.log(
"🔔 Classification: NOTIFY - This email contains important information",
);
} else {
throw new Error(`Invalid classification: ${classification}`);
}
return new Command({
goto,
update
update,
});
} catch (error) {
console.error("Error in triage router:", error);
@@ -253,43 +264,48 @@ export const initializeEmailAssistant = async () => {
update: {
classification_decision: "ignore",
messages: [
{ type: "system", content: `Error processing email: ${error instanceof Error ? error.message : String(error)}` }
]
}
{
type: "system",
content: `Error processing email: ${error instanceof Error ? error.message : String(error)}`,
},
],
},
});
}
};
// Build agent subgraph
const agentBuilder = new StateGraph<typeof BaseEmailAgentState,
AgentStateType,
const agentBuilder = new StateGraph<
typeof BaseEmailAgentState,
AgentStateType,
Partial<AgentStateType>,
AgentNodes>(BaseEmailAgentState)
AgentNodes
>(BaseEmailAgentState)
.addNode("llm_call", llmCallNode)
.addNode("environment", toolNode)
.addEdge(START, "llm_call")
.addConditionalEdges(
"llm_call",
shouldContinue,
{
"environment": "environment",
[END]: END
}
)
.addConditionalEdges("llm_call", shouldContinue, {
environment: "environment",
[END]: END,
})
.addEdge("environment", "llm_call");
// Compile the agent subgraph
const agent = agentBuilder.compile();
// Build overall workflow
const emailAssistantGraph = new StateGraph<typeof BaseEmailAgentState,
AgentStateType,
const emailAssistantGraph = new StateGraph<
typeof BaseEmailAgentState,
AgentStateType,
Partial<AgentStateType>,
AgentNodes>(BaseEmailAgentState)
.addNode("triage_router", triageRouterNode, { ends: ["response_agent", END] })
AgentNodes
>(BaseEmailAgentState)
.addNode("triage_router", triageRouterNode, {
ends: ["response_agent", END],
})
.addNode("response_agent", agent)
.addEdge(START, "triage_router");
// Compile and return the email assistant
return emailAssistantGraph.compile();
};
+263 -181
@@ -1,11 +1,11 @@
/**
* @fileoverview Email Assistant with Human-in-the-Loop (HITL)
*
*
* This script implements an email assistant with human review capabilities,
* allowing users to review, edit, or reject proposed actions before execution.
*
*
* @module email_assistant_hitl
*
*
* @structure
* ┌─────────────────────────────────────────────────────────────────────────┐
* │ HITL Email Assistant │
@@ -46,7 +46,7 @@ import {
START,
END,
Command,
MemorySaver
MemorySaver,
} from "@langchain/langgraph";
import { ToolCall } from "@langchain/core/messages/tool";
@@ -54,10 +54,7 @@ import { ToolCall } from "@langchain/core/messages/tool";
import "@langchain/langgraph/zod";
// LOCAL IMPORTS
import {
getTools,
getToolsByName
} from "../tools/base.js";
import { getTools, getToolsByName } from "../tools/base.js";
import {
HITL_TOOLS_PROMPT,
triageSystemPrompt,
@@ -66,217 +63,263 @@ import {
defaultBackground,
defaultResponsePreferences,
defaultCalPreferences,
defaultTriageInstructions
defaultTriageInstructions,
} from "../prompts.js";
import {
EmailAgentHITLState,
EmailAgentHITLStateType
} from "../schemas.js";
import {
parseEmail,
formatEmailMarkdown,
formatForDisplay
} from "../utils.js";
import { EmailAgentHITLState, EmailAgentHITLStateType } from "../schemas.js";
import { parseEmail, formatEmailMarkdown, formatForDisplay } from "../utils.js";
// Message Types from LangGraph SDK
import {
AIMessage,
Message
} from "@langchain/langgraph-sdk";
import { AIMessage, Message } from "@langchain/langgraph-sdk";
// Helper for type checking
const hasToolCalls = (message: Message): message is AIMessage & { tool_calls: ToolCall[] } => {
return message.type === "ai" &&
"tool_calls" in message &&
Array.isArray(message.tool_calls);
const hasToolCalls = (
message: Message,
): message is AIMessage & { tool_calls: ToolCall[] } => {
return (
message.type === "ai" &&
"tool_calls" in message &&
Array.isArray(message.tool_calls)
);
};
// Define proper TypeScript types for our state
type AgentStateType = EmailAgentHITLStateType;
// Define node names as a union type for better type safety
type AgentNodes = typeof START | typeof END | "llm_call" | "interrupt_handler" | "triage_router" | "triage_interrupt_handler" | "response_agent";
type AgentNodes =
| typeof START
| typeof END
| "llm_call"
| "interrupt_handler"
| "triage_router"
| "triage_interrupt_handler"
| "response_agent";
/**
* Initialize and export the HITL email assistant
*/
export const initializeHitlEmailAssistant = async (checkpointer?: MemorySaver) => {
export const initializeHitlEmailAssistant = async (
checkpointer?: MemorySaver,
) => {
// Get tools
const tools = await getTools();
const toolsByName = await getToolsByName();
// Initialize the LLM
const llm = await initChatModel("openai:gpt-4");
// Initialize the LLM instance for tool use
const llmWithTools = llm.bindTools(tools, { toolChoice: "required" });
// Create the LLM call node
const llmCallNode = async (state: AgentStateType): Promise<{ messages: Message[] }> => {
const llmCallNode = async (
state: AgentStateType,
): Promise<{ messages: Message[] }> => {
const { messages } = state;
// Set up system prompt for the agent
const systemPrompt = agentSystemPromptHitl
const systemPrompt = agentSystemPromptHitl
.replace("{tools_prompt}", HITL_TOOLS_PROMPT)
.replace("{background}", defaultBackground)
.replace("{response_preferences}", defaultResponsePreferences)
.replace("{calendar_preferences}", defaultCalPreferences);
// Create full message history for the agent
const allMessages = [
{ type: "system", content: systemPrompt },
...messages
...messages,
];
// Run the LLM with the messages
const result = await llmWithTools.invoke(allMessages);
// Return the AIMessage result - need to cast through unknown since the types don't have proper overlap
return {
messages: [result as unknown as Message]
messages: [result as unknown as Message],
};
};
// Create the interrupt handler node for human review
const interruptHandlerNode = async (state: AgentStateType): Promise<Command> => {
const interruptHandlerNode = async (
state: AgentStateType,
): Promise<Command> => {
// Store messages to be returned
const result: Message[] = [];
// Default goto is llm_call
let goto: typeof END | "llm_call" = "llm_call";
// Get the last message
const lastMessage = state.messages[state.messages.length - 1];
// Exit early if there are no tool calls
if (!hasToolCalls(lastMessage) || !lastMessage.tool_calls || lastMessage.tool_calls.length === 0) {
if (
!hasToolCalls(lastMessage) ||
!lastMessage.tool_calls ||
lastMessage.tool_calls.length === 0
) {
return new Command({
goto,
update: { messages: result }
update: { messages: result },
});
}
// Keep track of processed tool calls to ensure all get responses
const processedToolCallIds = new Set<string>();
// Handle only one tool call at a time for better human-in-the-loop experience
let processedOneToolCall = false;
// Iterate over the tool calls in the last message
for (const toolCall of lastMessage.tool_calls) {
// Skip if we've already processed one tool call to allow proper resuming
if (processedOneToolCall) {
continue;
}
// Get or create a valid tool call ID
const callId = toolCall.id ?? `fallback-id-${Date.now()}`;
// Allowed tools for HITL
const hitlTools = ["write_email", "schedule_meeting", "Question"];
// If tool is not in our HITL list, execute it directly without interruption
if (!hitlTools.includes(toolCall.name)) {
const tool = toolsByName[toolCall.name];
if (!tool) {
console.error(`Tool ${toolCall.name} not found`);
result.push({ type: "tool", content: `Error: Tool ${toolCall.name} not found`, tool_call_id: callId });
result.push({
type: "tool",
content: `Error: Tool ${toolCall.name} not found`,
tool_call_id: callId,
});
processedToolCallIds.add(callId);
processedOneToolCall = true;
continue;
}
try {
// Parse the args properly - if it's a string, parse it as JSON
const parsedArgs = typeof toolCall.args === 'string'
? JSON.parse(toolCall.args)
: toolCall.args;
const parsedArgs =
typeof toolCall.args === "string"
? JSON.parse(toolCall.args)
: toolCall.args;
// Invoke the tool with properly formatted arguments
const observation = await tool.invoke(parsedArgs);
result.push({ type: "tool", content: observation, tool_call_id: callId });
result.push({
type: "tool",
content: observation,
tool_call_id: callId,
});
processedToolCallIds.add(callId);
processedOneToolCall = true;
} catch (error: any) {
console.error(`Error executing tool ${toolCall.name}:`, error);
result.push({ type: "tool", content: `Error executing tool: ${error.message}`, tool_call_id: callId });
result.push({
type: "tool",
content: `Error executing tool: ${error.message}`,
tool_call_id: callId,
});
processedToolCallIds.add(callId);
processedOneToolCall = true;
}
continue;
}
// Get original email from email_input in state
const emailInput = state.email_input;
const parseResult = parseEmail(emailInput);
// Validate parsing result
if (!parseResult || typeof parseResult !== 'object') {
if (!parseResult || typeof parseResult !== "object") {
throw new Error("Invalid email parsing result");
}
const { author, to, subject, emailThread } = parseResult;
const originalEmailMarkdown = formatEmailMarkdown(subject, author, to, emailThread);
const originalEmailMarkdown = formatEmailMarkdown(
subject,
author,
to,
emailThread,
);
// Format tool call for display
const toolDisplay = formatForDisplay(state, toolCall);
const description = originalEmailMarkdown + toolDisplay;
try {
// Use the interrupt function from LangGraph
const { interrupt } = await import("@langchain/langgraph");
// IMPORTANT: We're directly passing the interrupt call result without modifying it
const humanReview = await interrupt({
question: "Review this tool call before execution:",
toolCall: toolCall,
description
description,
});
const reviewAction = humanReview.action;
const reviewData = humanReview.data;
if (reviewAction === "continue") {
// Execute the tool with original args
const tool = toolsByName[toolCall.name];
// Parse the args properly
const parsedArgs = typeof toolCall.args === 'string'
? JSON.parse(toolCall.args)
: toolCall.args;
const parsedArgs =
typeof toolCall.args === "string"
? JSON.parse(toolCall.args)
: toolCall.args;
const observation = await tool.invoke(parsedArgs);
result.push({ type: "tool", content: observation, tool_call_id: callId });
result.push({
type: "tool",
content: observation,
tool_call_id: callId,
});
processedToolCallIds.add(callId);
processedOneToolCall = true;
}
else if (reviewAction === "update") {
} else if (reviewAction === "update") {
// Execute with edited args
const tool = toolsByName[toolCall.name];
// Make sure the updated args are properly formatted
const updatedArgs = typeof reviewData === 'string'
? JSON.parse(reviewData)
: reviewData;
const updatedArgs =
typeof reviewData === "string"
? JSON.parse(reviewData)
: reviewData;
const observation = await tool.invoke(updatedArgs);
result.push({ type: "tool", content: observation, tool_call_id: callId });
result.push({
type: "tool",
content: observation,
tool_call_id: callId,
});
processedToolCallIds.add(callId);
processedOneToolCall = true;
}
else if (reviewAction === "feedback") {
} else if (reviewAction === "feedback") {
// Add feedback as a tool message
result.push({ type: "tool", content: reviewData, tool_call_id: callId });
result.push({
type: "tool",
content: reviewData,
tool_call_id: callId,
});
processedToolCallIds.add(callId);
processedOneToolCall = true;
goto = "llm_call";
}
else if (reviewAction === "stop") {
} else if (reviewAction === "stop") {
// Even when stopping, we still need to respond to the tool call
result.push({ type: "tool", content: "User chose to stop this action.", tool_call_id: callId });
result.push({
type: "tool",
content: "User chose to stop this action.",
tool_call_id: callId,
});
processedToolCallIds.add(callId);
processedOneToolCall = true;
goto = END;
}
else {
} else {
// Handle any other action by providing a default response
result.push({ type: "tool", content: "Action not recognized or canceled by user.", tool_call_id: callId });
result.push({
type: "tool",
content: "Action not recognized or canceled by user.",
tool_call_id: callId,
});
processedToolCallIds.add(callId);
processedOneToolCall = true;
goto = END;
@@ -284,43 +327,54 @@ export const initializeHitlEmailAssistant = async (checkpointer?: MemorySaver) =
} catch (error: any) {
// Very important: Just rethrow any GraphInterrupt error without modifying it
// This ensures LangGraph can properly handle the interruption
if (error.name === 'GraphInterrupt' ||
(error.message && typeof error.message === 'string' &&
error.message.includes('GraphInterrupt'))) {
if (
error.name === "GraphInterrupt" ||
(error.message &&
typeof error.message === "string" &&
error.message.includes("GraphInterrupt"))
) {
throw error;
}
console.error("Error with interrupt handler:", error);
// For other errors, provide a message response
result.push({ type: "tool", content: `Error during tool execution: ${error.message}`, tool_call_id: callId });
result.push({
type: "tool",
content: `Error during tool execution: ${error.message}`,
tool_call_id: callId,
});
processedToolCallIds.add(callId);
processedOneToolCall = true;
}
}
// If we've processed a tool call, return right away
if (processedOneToolCall) {
return new Command({
goto,
update: { messages: result }
update: { messages: result },
});
}
// If we reach here and haven't processed any tool calls,
// we need to return appropriate responses for any remaining ones
lastMessage.tool_calls?.forEach(toolCall => {
lastMessage.tool_calls?.forEach((toolCall) => {
const callId = toolCall.id ?? `fallback-id-${Date.now()}`;
if (!processedToolCallIds.has(callId)) {
// We've skipped this tool call, but we still need to respond to it
// This is important for OpenAI's API requirement that every tool call has a response
result.push({ type: "tool", content: "Tool execution pending human review.", tool_call_id: callId });
result.push({
type: "tool",
content: "Tool execution pending human review.",
tool_call_id: callId,
});
}
});
// Return the Command with goto and update
return new Command({
goto,
update: { messages: result }
update: { messages: result },
});
};
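The review branch above maps each human action to either a tool run or a canned tool message, plus a next node. A minimal sketch of that mapping as a pure function, assuming the review payload has the form `{ action, data }`. The names `resolveReview`, `normalizeArgs`, and `ReviewOutcome` are hypothetical and introduced only for illustration, and the string `"__end__"` stands in for the `END` constant.

```typescript
// Hypothetical shapes for illustration; the real payload comes from the reviewing UI.
interface HumanReview {
  action: string;
  data?: unknown;
}

interface ReviewOutcome {
  // Arguments to run the tool with, or null when the tool must not run.
  argsToRun: Record<string, unknown> | null;
  // Content of the tool message when the tool is not run.
  toolMessage: string | null;
  // Next node: back to the model, or end the run ("__end__" is a stand-in for END).
  goto: "llm_call" | "__end__";
}

// Accepts either a JSON string or an already-parsed object, as the node above does.
function normalizeArgs(raw: unknown): Record<string, unknown> {
  const value = typeof raw === "string" ? JSON.parse(raw) : raw;
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    throw new Error("Tool arguments must be a JSON object");
  }
  return value as Record<string, unknown>;
}

function resolveReview(review: HumanReview, originalArgs: unknown): ReviewOutcome {
  switch (review.action) {
    case "continue": // run the tool with the model's original arguments
      return { argsToRun: normalizeArgs(originalArgs), toolMessage: null, goto: "llm_call" };
    case "update": // run the tool with the reviewer's edited arguments
      return { argsToRun: normalizeArgs(review.data), toolMessage: null, goto: "llm_call" };
    case "feedback": // skip the tool and hand the feedback back to the model
      return { argsToRun: null, toolMessage: String(review.data ?? ""), goto: "llm_call" };
    case "stop":
      return { argsToRun: null, toolMessage: "User chose to stop this action.", goto: "__end__" };
    default:
      return {
        argsToRun: null,
        toolMessage: "Action not recognized or canceled by user.",
        goto: "__end__",
      };
  }
}
```

Keeping the decision separate from the tool invocation makes each branch testable without a running graph; every outcome still yields exactly one tool message for the pending tool call.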
@@ -328,17 +382,21 @@ export const initializeHitlEmailAssistant = async (checkpointer?: MemorySaver) =
const shouldContinue = (state: AgentStateType) => {
const messages = state.messages;
if (!messages || messages.length === 0) return END;
const lastMessage = messages[messages.length - 1];
if (hasToolCalls(lastMessage) && lastMessage.tool_calls && lastMessage.tool_calls.length > 0) {
if (
hasToolCalls(lastMessage) &&
lastMessage.tool_calls &&
lastMessage.tool_calls.length > 0
) {
// Check if any tool call is the "Done" tool
if (lastMessage.tool_calls.some(toolCall => toolCall.name === "Done")) {
if (lastMessage.tool_calls.some((toolCall) => toolCall.name === "Done")) {
return END;
}
return "interrupt_handler";
}
return END;
};
@@ -347,50 +405,59 @@ export const initializeHitlEmailAssistant = async (checkpointer?: MemorySaver) =
try {
const { email_input } = state;
const parseResult = parseEmail(email_input);
// Validate parsing result
if (!parseResult || typeof parseResult !== 'object') {
if (!parseResult || typeof parseResult !== "object") {
throw new Error("Invalid email parsing result");
}
const { author, to, subject, emailThread } = parseResult;
const systemPrompt = triageSystemPrompt
.replace("{background}", defaultBackground)
.replace("{triage_instructions}", defaultTriageInstructions);
const userPrompt = triageUserPrompt
.replace("{author}", author)
.replace("{to}", to)
.replace("{subject}", subject)
.replace("{email_thread}", emailThread);
// Create email markdown for Agent Inbox in case of notification
const emailMarkdown = formatEmailMarkdown(subject, author, to, emailThread);
// Create email markdown for Agent Inbox in case of notification
const emailMarkdown = formatEmailMarkdown(
subject,
author,
to,
emailThread,
);
// Add JSON format instruction to the system prompt
const jsonSystemPrompt = `${systemPrompt}\n\nProvide your response in the following JSON format:
{
"reasoning": "your step-by-step reasoning",
"classification": "ignore" | "respond" | "notify"
}`;
// Use the regular LLM instead of withStructuredOutput
const response = await llm.invoke([
{ type: "system", content: jsonSystemPrompt },
{ type: "human", content: userPrompt }
{ type: "human", content: userPrompt },
]);
// Parse the JSON response manually
let classification: "ignore" | "respond" | "notify" = "notify"; // Default to notify
try {
// Extract JSON from the response content
const responseText = response.content.toString();
const parsedResponse = JSON.parse(responseText);
if (parsedResponse.classification &&
["ignore", "respond", "notify"].includes(parsedResponse.classification)) {
if (
parsedResponse.classification &&
["ignore", "respond", "notify"].includes(
parsedResponse.classification,
)
) {
classification = parsedResponse.classification;
}
} catch (parseError) {
@@ -398,35 +465,42 @@ export const initializeHitlEmailAssistant = async (checkpointer?: MemorySaver) =
console.log("Raw response:", response.content.toString());
// Fall back to notify if parsing fails
}
let goto: "triage_interrupt_handler" | "response_agent" | typeof END = END;
let goto: "triage_interrupt_handler" | "response_agent" | typeof END =
END;
let update: Partial<AgentStateType> = {
classification_decision: classification
classification_decision: classification,
};
// Create message
update.messages = [
{ type: "human", content: `Email to review: ${emailMarkdown}` }
{ type: "human", content: `Email to review: ${emailMarkdown}` },
];
if (classification === "respond") {
console.log("📧 Classification: RESPOND - This email requires a response");
console.log(
"📧 Classification: RESPOND - This email requires a response",
);
goto = "response_agent";
} else if (classification === "notify") {
console.log("🔔 Classification: NOTIFY - This email contains important information");
console.log(
"🔔 Classification: NOTIFY - This email contains important information",
);
goto = "triage_interrupt_handler";
} else if (classification === "ignore") {
console.log("🚫 Classification: IGNORE - This email can be safely ignored");
console.log(
"🚫 Classification: IGNORE - This email can be safely ignored",
);
goto = END;
} else {
// Default to notify if classification is not recognized
goto = "triage_interrupt_handler";
update.classification_decision = "notify";
}
return new Command({
goto,
update
update,
});
} catch (error: any) {
console.error("Error in triage router:", error);
@@ -435,9 +509,12 @@ export const initializeHitlEmailAssistant = async (checkpointer?: MemorySaver) =
update: {
classification_decision: "error",
messages: [
{ type: "system", content: `Error in triage router: ${error.message}` }
]
}
{
type: "system",
content: `Error in triage router: ${error.message}`,
},
],
},
});
}
};
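The triage router above parses the model's reply with a bare `JSON.parse` and falls back to `notify` on failure. A small sketch of the same fallback that also tolerates text around the JSON object (for example a code fence or a leading sentence). `parseClassification` is a hypothetical helper, not part of the repository.

```typescript
type Classification = "ignore" | "respond" | "notify";

const VALID_CLASSIFICATIONS: readonly string[] = ["ignore", "respond", "notify"];

// Extracts the outermost {...} span from the reply and validates the
// "classification" field; any failure falls back to "notify".
function parseClassification(responseText: string): Classification {
  const start = responseText.indexOf("{");
  const end = responseText.lastIndexOf("}");
  if (start === -1 || end <= start) return "notify";
  try {
    const parsed: unknown = JSON.parse(responseText.slice(start, end + 1));
    if (typeof parsed === "object" && parsed !== null && "classification" in parsed) {
      const value = (parsed as { classification: unknown }).classification;
      if (typeof value === "string" && VALID_CLASSIFICATIONS.indexOf(value) !== -1) {
        return value as Classification;
      }
    }
  } catch {
    // Malformed JSON: fall through to the default below.
  }
  return "notify";
}
```

Defaulting to `notify` keeps a human in the loop whenever the reply cannot be trusted, which matches the behavior of the node above.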
@@ -446,35 +523,35 @@ export const initializeHitlEmailAssistant = async (checkpointer?: MemorySaver) =
const triageInterruptHandlerNode = async (state: AgentStateType) => {
// Parse the email input
const parseResult = parseEmail(state.email_input);
// Validate parsing result
if (!parseResult || typeof parseResult !== 'object') {
if (!parseResult || typeof parseResult !== "object") {
throw new Error("Invalid email parsing result");
}
const { author, to, subject, emailThread } = parseResult;
// Create email markdown for Agent Inbox in case of notification
// Create email markdown for Agent Inbox in case of notification
const emailMarkdown = formatEmailMarkdown(subject, author, to, emailThread);
try {
// Use the interrupt function from LangGraph
const { interrupt } = await import("@langchain/langgraph");
const humanReview = await interrupt({
question: `Email requires attention: ${state.classification_decision || "notify"}`,
email: emailMarkdown
email: emailMarkdown,
});
let goto: "response_agent" | typeof END = END;
const messages = [
{ type: "human", content: `Email to review: ${emailMarkdown}` }
{ type: "human", content: `Email to review: ${emailMarkdown}` },
];
// Handle different response types
const reviewAction = humanReview.action;
const reviewData = humanReview.data;
if (reviewAction === "continue") {
// Human wants to handle this email - proceed to response agent
goto = "response_agent";
@@ -486,13 +563,13 @@ export const initializeHitlEmailAssistant = async (checkpointer?: MemorySaver) =
// Default to END for other actions
goto = END;
}
// Return Command with goto and update
return new Command({
goto,
update: {
messages
}
messages,
},
});
} catch (error) {
console.error("Error with triage interrupt handler:", error);
@@ -500,66 +577,71 @@ export const initializeHitlEmailAssistant = async (checkpointer?: MemorySaver) =
goto: END,
update: {
messages: [
{ type: "system", content: `Error in triage interrupt: ${error}` }
]
}
{ type: "system", content: `Error in triage interrupt: ${error}` },
],
},
});
}
};
// Build agent subgraph
const agentBuilder = new StateGraph<typeof EmailAgentHITLState,
AgentStateType,
const agentBuilder = new StateGraph<
typeof EmailAgentHITLState,
AgentStateType,
Partial<AgentStateType>,
AgentNodes>(EmailAgentHITLState)
AgentNodes
>(EmailAgentHITLState)
.addNode("llm_call", llmCallNode)
.addNode("interrupt_handler", interruptHandlerNode)
.addEdge(START, "llm_call")
.addConditionalEdges(
"llm_call",
shouldContinue,
{
"interrupt_handler": "interrupt_handler",
[END]: END
}
)
.addConditionalEdges("llm_call", shouldContinue, {
interrupt_handler: "interrupt_handler",
[END]: END,
})
.addEdge("interrupt_handler", "llm_call");
// Compile the agent
const responseAgent = agentBuilder.compile();
// Build overall workflow
const emailAssistantGraph = new StateGraph<typeof EmailAgentHITLState,
AgentStateType,
const emailAssistantGraph = new StateGraph<
typeof EmailAgentHITLState,
AgentStateType,
Partial<AgentStateType>,
AgentNodes>(EmailAgentHITLState)
AgentNodes
>(EmailAgentHITLState)
.addNode("triage_router", triageRouterNode, {
ends: ["triage_interrupt_handler", "response_agent", END]
ends: ["triage_interrupt_handler", "response_agent", END],
})
.addNode("triage_interrupt_handler", triageInterruptHandlerNode, {
ends: ["response_agent", END]
ends: ["response_agent", END],
})
.addNode("response_agent", responseAgent, {
ends: [END]
ends: [END],
})
.addEdge(START, "triage_router")
.addEdge("response_agent", END);
// Use provided checkpointer or create a new one
const actualCheckpointer = checkpointer || new MemorySaver();
console.log("Compiling HITL email assistant with checkpointer:", actualCheckpointer ? "provided" : "default");
console.log(
"Compiling HITL email assistant with checkpointer:",
actualCheckpointer ? "provided" : "default",
);
// Compile and return the email assistant with the checkpointer
return emailAssistantGraph.compile({
checkpointer: actualCheckpointer
checkpointer: actualCheckpointer,
});
};
// Initialize and export HITL email assistant directly with a default checkpointer
export const hitlEmailAssistant = initializeHitlEmailAssistant(new MemorySaver());
export const hitlEmailAssistant = initializeHitlEmailAssistant(
new MemorySaver(),
);
// Export the function with the name the tests expect
export const createHitlEmailAssistant = async () => {
return initializeHitlEmailAssistant(new MemorySaver());
};
};
File diff suppressed because it is too large
+27 -14
@@ -1,14 +1,22 @@
/**
* Email Assistant Tool Definitions
*
*
* This file contains the central registry for all tools available to the email assistant.
* Tools are defined using the StructuredTool pattern from LangChain.
*/
import { StructuredTool } from "@langchain/core/tools";
import { writeEmail, triageEmail, Done } from "./default/email-tools.js";
import { scheduleMeeting, checkCalendarAvailability } from "./default/calendar-tools.js";
import { backgroundTool, calPreferencesTool, responsePreferencesTool, questionTool } from "./default/memory-tools.js";
import {
scheduleMeeting,
checkCalendarAvailability,
} from "./default/calendar-tools.js";
import {
backgroundTool,
calPreferencesTool,
responsePreferencesTool,
questionTool,
} from "./default/memory-tools.js";
/**
* Options for customizing tool selection
@@ -22,11 +30,14 @@ export interface GetToolsOptions {
/**
* Returns an array of tools based on the provided options
*
*
* @param options - Configuration options for tool selection
* @returns Array of StructuredTool instances ready for use with agents
*/
export async function getTools({ toolNames, includeGmail = false }: GetToolsOptions = {}): Promise<StructuredTool[]> {
export async function getTools({
toolNames,
includeGmail = false,
}: GetToolsOptions = {}): Promise<StructuredTool[]> {
// Base tools dictionary - all available tools should be registered here
const allTools: Record<string, StructuredTool> = {
write_email: writeEmail,
@@ -39,29 +50,31 @@ export async function getTools({ toolNames, includeGmail = false }: GetToolsOpti
response_preferences: responsePreferencesTool,
Question: questionTool,
};
// If specific tool names are provided, filter to only those tools
if (toolNames) {
return toolNames
.filter(name => name in allTools)
.map(name => allTools[name]);
.filter((name) => name in allTools)
.map((name) => allTools[name]);
}
// Otherwise return all tools
return Object.values(allTools);
}
/**
* Creates a lookup map of tools by their name for easier access
*
*
* @param tools - Optional array of tools to convert to lookup map
* @returns Record mapping tool names to their corresponding StructuredTool instances
*/
export async function getToolsByName(tools?: StructuredTool[]): Promise<Record<string, StructuredTool>> {
const toolsList = tools || await getTools();
export async function getToolsByName(
tools?: StructuredTool[],
): Promise<Record<string, StructuredTool>> {
const toolsList = tools || (await getTools());
return toolsList.reduce<Record<string, StructuredTool>>((acc, tool) => {
acc[tool.name] = tool;
return acc;
}, {});
}
}
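The registry pattern in this file (filter by requested names, then index by `name`) can be sketched without LangChain. `NamedTool`, `selectTools`, and `indexByName` below are hypothetical stand-ins, with `NamedTool` replacing `StructuredTool`. One deliberate difference: the sketch checks own properties only, because `name in allTools` would also accept inherited keys such as `toString`.

```typescript
// Minimal stand-in for StructuredTool; only the `name` field matters here.
interface NamedTool {
  name: string;
}

// Returns the requested tools in request order, silently skipping unknown names.
function selectTools<T extends NamedTool>(all: Record<string, T>, toolNames?: string[]): T[] {
  if (!toolNames) {
    return Object.keys(all).map((key) => all[key]);
  }
  return toolNames
    .filter((name) => Object.prototype.hasOwnProperty.call(all, name))
    .map((name) => all[name]);
}

// Builds a lookup map keyed by each tool's own `name` property.
function indexByName<T extends NamedTool>(tools: T[]): Record<string, T> {
  return tools.reduce<Record<string, T>>((acc, tool) => {
    acc[tool.name] = tool;
    return acc;
  }, {});
}

const registry: Record<string, NamedTool> = {
  write_email: { name: "write_email" },
  Done: { name: "Done" },
};

const picked = selectTools(registry, ["Done", "toString", "missing"]);
const byName = indexByName(selectTools(registry));
```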
+5 -5
@@ -6,7 +6,7 @@ const scheduleMeetingSchema = z.object({
attendees: z.array(z.string()).describe("List of attendees' emails"),
startTime: z.string().describe("Meeting start time in ISO format"),
endTime: z.string().describe("Meeting end time in ISO format"),
description: z.string().optional().describe("Meeting description")
description: z.string().optional().describe("Meeting description"),
});
export const scheduleMeeting = new DynamicStructuredTool({
@@ -17,12 +17,12 @@ export const scheduleMeeting = new DynamicStructuredTool({
const { title, attendees, startTime, endTime, description } = args;
// Mock implementation
return `Meeting "${title}" scheduled from ${startTime} to ${endTime} with ${attendees.length} attendees`;
}
},
});
const availabilitySchema = z.object({
startTime: z.string().describe("Start time in ISO format"),
endTime: z.string().describe("End time in ISO format")
endTime: z.string().describe("End time in ISO format"),
});
export const checkCalendarAvailability = new DynamicStructuredTool({
@@ -33,5 +33,5 @@ export const checkCalendarAvailability = new DynamicStructuredTool({
const { startTime, endTime } = args;
// Mock implementation
return `Time slot from ${startTime} to ${endTime} is available`;
}
});
},
});
+22 -15
View File
@@ -8,7 +8,7 @@ import { DynamicStructuredTool } from "@langchain/core/tools";
const emailSchema = z.object({
recipient: z.string().describe("Email address of the recipient"),
subject: z.string().describe("Clear and concise subject line for the email"),
content: z.string().describe("Main body text of the email")
content: z.string().describe("Main body text of the email"),
});
/**
@@ -17,10 +17,14 @@ const emailSchema = z.object({
*/
export const writeEmail = new DynamicStructuredTool({
name: "write_email",
description:
description:
"Write an email draft based on provided information. Use this when the user wants to compose a new email message.",
schema: emailSchema,
func: async ({ recipient, subject, content }: z.infer<typeof emailSchema>) => {
func: async ({
recipient,
subject,
content,
}: z.infer<typeof emailSchema>) => {
// In a real implementation, this would interact with an email service API
// For now, we return a formatted string representation of the draft
return `Email draft created:
@@ -30,7 +34,7 @@ Subject: ${subject}
${content}
[Draft saved. Ready to send or edit further.]`;
}
},
});
/**
@@ -40,7 +44,7 @@ ${content}
const triageSchema = z.object({
sender: z.string().describe("Email address of the sender"),
subject: z.string().describe("Subject line of the email to triage"),
content: z.string().describe("Content/body of the email to analyze")
content: z.string().describe("Content/body of the email to analyze"),
});
/**
@@ -49,28 +53,30 @@ const triageSchema = z.object({
*/
export const triageEmail = new DynamicStructuredTool({
name: "triage_email",
description:
description:
"Analyze and categorize an email by importance and type. Use this when evaluating how to handle incoming messages.",
schema: triageSchema,
func: async ({ sender, subject, content }: z.infer<typeof triageSchema>) => {
// In a real implementation, this would use some classification logic
// For demonstration, we return a mock categorization
// Simple keyword-based priority assessment
let priority = "Medium";
if (subject.toLowerCase().includes("urgent") ||
subject.toLowerCase().includes("important") ||
content.toLowerCase().includes("asap")) {
if (
subject.toLowerCase().includes("urgent") ||
subject.toLowerCase().includes("important") ||
content.toLowerCase().includes("asap")
) {
priority = "High";
} else if (content.length < 50 || subject.toLowerCase().includes("fyi")) {
priority = "Low";
}
return `Email from ${sender} has been analyzed:
Priority: ${priority}
Category: General correspondence
Recommended action: ${priority === "High" ? "Respond immediately" : "Review when convenient"}`;
}
},
});
/**
@@ -85,9 +91,10 @@ const doneSchema = z.object({});
*/
export const Done = new DynamicStructuredTool({
name: "Done",
description: "Signal that you've completed the current task and no further actions are needed.",
description:
"Signal that you've completed the current task and no further actions are needed.",
schema: doneSchema,
func: async () => {
return "Task completed successfully. No further actions required.";
}
});
},
});
+11 -11
View File
@@ -1,6 +1,6 @@
/**
* Memory-related tools for the Email Assistant with HITL and Memory
*
*
* These tools allow the agent to retrieve information from the memory store.
*/
@@ -19,8 +19,8 @@ export const backgroundTool = tool(
{
name: "background",
description: "Get background information about the user",
schema: z.object({}).describe("This tool doesn't take any arguments")
}
schema: z.object({}).describe("This tool doesn't take any arguments"),
},
);
/**
@@ -35,8 +35,8 @@ export const calPreferencesTool = tool(
{
name: "cal_preferences",
description: "Get the user's calendar preferences",
schema: z.object({}).describe("This tool doesn't take any arguments")
}
schema: z.object({}).describe("This tool doesn't take any arguments"),
},
);
/**
@@ -51,8 +51,8 @@ export const responsePreferencesTool = tool(
{
name: "response_preferences",
description: "Get the user's response style preferences",
schema: z.object({}).describe("This tool doesn't take any arguments")
}
schema: z.object({}).describe("This tool doesn't take any arguments"),
},
);
/**
@@ -66,7 +66,7 @@ export const questionTool = tool(
name: "Question",
description: "Ask the user a follow-up question",
schema: z.object({
content: z.string().describe("The question to ask the user")
})
}
);
content: z.string().describe("The question to ask the user"),
}),
},
);
+1 -1
View File
@@ -1,3 +1,3 @@
import { StructuredTool } from "@langchain/core/tools";
export type ToolsMap = Record<string, StructuredTool>;
export type ToolsMap = Record<string, StructuredTool>;
+99 -85
View File
@@ -1,8 +1,6 @@
import { EmailData } from "./schemas.js";
import type { Message } from '@langchain/langgraph-sdk';
import type { ToolCall } from '@langchain/core/messages/tool';
import type { Message } from "@langchain/langgraph-sdk";
import type { ToolCall } from "@langchain/core/messages/tool";
/**
* Email Assistant utilities
@@ -22,7 +20,12 @@ interface Example {
* @param email The email data to parse
* @returns An object with { author, to, subject, emailThread }
*/
export function parseEmail(email: EmailData): { author: string; to: string; subject: string; emailThread: string } {
export function parseEmail(email: EmailData): {
author: string;
to: string;
subject: string;
emailThread: string;
} {
try {
// Extract key information from email data
const author = email.from_email;
@@ -49,7 +52,7 @@ export function formatEmailMarkdown(
subject: string,
author: string,
to: string,
emailThread: string
emailThread: string,
): string {
return `## Email: ${subject}
@@ -65,12 +68,12 @@ ${emailThread}`;
* @param toolCall The tool call to format
*/
export function formatForDisplay(
state: { messages: Message[] },
toolCall: ToolCall
state: { messages: Message[] },
toolCall: ToolCall,
): string {
// Initialize empty display
let display = "";
// Add tool call information based on tool type
switch (toolCall.name) {
case "write_email":
@@ -82,17 +85,17 @@ export function formatForDisplay(
${toolCall.args.content}
`;
break;
case "schedule_meeting":
display += `# Calendar Invite
**Meeting**: ${toolCall.args.subject}
**Attendees**: ${toolCall.args.attendees?.join(', ')}
**Attendees**: ${toolCall.args.attendees?.join(", ")}
**Duration**: ${toolCall.args.duration_minutes} minutes
**Day**: ${toolCall.args.preferred_day}
`;
break;
case "Question":
// Special formatting for questions to make them clear
display += `# Question for User
@@ -100,7 +103,7 @@ ${toolCall.args.content}
${toolCall.args.content}
`;
break;
default:
// Generic format for other tools
display += `# Tool Call: ${toolCall.name}
@@ -109,7 +112,7 @@ Arguments:
${JSON.stringify(toolCall.args, null, 2)}
`;
}
return display;
}
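In the diffs shown on this page, `scheduleMeeting` destructures `title`, `attendees`, `startTime`, `endTime`, and `description`, while the `schedule_meeting` branch of `formatForDisplay` reads `subject`, `duration_minutes`, and `preferred_day`. If the tool schema is the source of truth, a display helper along the following lines would avoid rendering `undefined`. This is only a sketch: `ScheduleMeetingArgs` and `formatMeetingForDisplay` are hypothetical names, not part of the repository.

```typescript
// Field names follow the scheduleMeeting tool as shown in the calendar-tools diff.
interface ScheduleMeetingArgs {
  title: string;
  attendees: string[];
  startTime: string; // ISO format
  endTime: string; // ISO format
  description?: string;
}

// Renders a meeting proposal as Markdown for human review.
function formatMeetingForDisplay(args: ScheduleMeetingArgs): string {
  const lines = [
    "# Calendar Invite",
    "",
    `**Meeting**: ${args.title}`,
    `**Attendees**: ${args.attendees.join(", ")}`,
    `**Start**: ${args.startTime}`,
    `**End**: ${args.endTime}`,
  ];
  // The description is optional in the schema, so only show it when present.
  if (args.description) {
    lines.push(`**Description**: ${args.description}`);
  }
  return lines.join("\n");
}
```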
@@ -118,28 +121,31 @@ ${JSON.stringify(toolCall.args, null, 2)}
*/
export function extractMessageContent(message: any): string {
const content = message.content;
// Check for recursion marker in string
if (typeof content === 'string' && content.includes('<Recursion on AIMessage with id=')) {
if (
typeof content === "string" &&
content.includes("<Recursion on AIMessage with id=")
) {
return "[Recursive content]";
}
// Handle string content
if (typeof content === 'string') {
if (typeof content === "string") {
return content;
}
// Handle list content (AIMessage format)
else if (Array.isArray(content)) {
const textParts: string[] = [];
for (const item of content) {
if (typeof item === 'object' && item !== null && 'text' in item) {
if (typeof item === "object" && item !== null && "text" in item) {
textParts.push(item.text);
}
}
return textParts.join("\n");
}
// Don't try to handle recursion to avoid infinite loops
// Just return string representation instead
return String(content);
@@ -150,26 +156,26 @@ export function extractMessageContent(message: any): string {
*/
export function formatFewShotExamples(examples: Example[]): string {
const formatted: string[] = [];
for (const example of examples) {
// Parse the example value string into components
const parts = example.value.split('Original routing:');
const parts = example.value.split("Original routing:");
const emailPart = parts[0].trim();
const routingParts = parts[1].split('Correct routing:');
const routingParts = parts[1].split("Correct routing:");
const originalRouting = routingParts[0].trim();
const correctRouting = routingParts[1].trim();
// Format into clean string
const formattedExample = `Example:
Email: ${emailPart}
Original Classification: ${originalRouting}
Correct Classification: ${correctRouting}
---`;
formatted.push(formattedExample);
}
return formatted.join("\n");
}
@@ -177,10 +183,12 @@ Correct Classification: ${correctRouting}
* Type guard for checking if an object is a ToolCall
*/
export function isToolCall(item: any): item is ToolCall {
return typeof item === 'object' &&
item !== null &&
'name' in item &&
'args' in item;
return (
typeof item === "object" &&
item !== null &&
"name" in item &&
"args" in item
);
}
/**
@@ -188,25 +196,25 @@ export function isToolCall(item: any): item is ToolCall {
*/
export function extractToolCalls(messages: any[]): string[] {
const toolCallNames: string[] = [];
for (const message of messages) {
// Check if message is an object and has tool_calls
if (typeof message === 'object' && message !== null) {
if (typeof message === "object" && message !== null) {
// Handle plain objects
if (message.tool_calls && Array.isArray(message.tool_calls)) {
toolCallNames.push(...message.tool_calls.map(
(call: any) => call.name.toLowerCase())
toolCallNames.push(
...message.tool_calls.map((call: any) => call.name.toLowerCase()),
);
}
// Handle class instances with toolCalls property
else if ('toolCalls' in message && Array.isArray(message.toolCalls)) {
toolCallNames.push(...message.toolCalls.map(
(call: any) => call.name.toLowerCase())
else if ("toolCalls" in message && Array.isArray(message.toolCalls)) {
toolCallNames.push(
...message.toolCalls.map((call: any) => call.name.toLowerCase()),
);
}
}
}
return toolCallNames;
}
@@ -216,63 +224,69 @@ export function extractToolCalls(messages: any[]): string[] {
* since we don't use stdout redirection.
*/
export function formatMessagesString(messages: Message[]): string {
return messages.map(message => {
let prefix = '';
// Determine prefix based on role
if ('role' in message && message.role) {
switch (message.role) {
case 'user':
prefix = '🧑 Human: ';
break;
case 'assistant':
prefix = '🤖 Assistant: ';
break;
case 'system':
prefix = '🧠 System: ';
break;
case 'tool':
prefix = '🛠️ Tool: ';
break;
default:
prefix = `${message.role}: `;
return messages
.map((message) => {
let prefix = "";
// Determine prefix based on role
if ("role" in message && message.role) {
switch (message.role) {
case "user":
prefix = "🧑 Human: ";
break;
case "assistant":
prefix = "🤖 Assistant: ";
break;
case "system":
prefix = "🧠 System: ";
break;
case "tool":
prefix = "🛠️ Tool: ";
break;
default:
prefix = `${message.role}: `;
}
}
}
// Format content
let content = typeof message.content === 'string'
? message.content
: JSON.stringify(message.content);
// Only AIMessage can have tool_calls or toolCalls
if (message.type === 'ai') {
const aiMsg = message as import('@langchain/langgraph-sdk').AIMessage;
const toolCalls = aiMsg.tool_calls || (aiMsg as any).toolCalls;
if (toolCalls && toolCalls.length > 0) {
const toolCallsStr = toolCalls.map((tc: ToolCall) =>
`\n Tool: ${tc.name}\n Args: ${JSON.stringify(tc.args, null, 2)}`
).join('\n');
content += `\n[Tool Calls: ${toolCallsStr}]`;
// Format content
let content =
typeof message.content === "string"
? message.content
: JSON.stringify(message.content);
// Only AIMessage can have tool_calls or toolCalls
if (message.type === "ai") {
const aiMsg = message as import("@langchain/langgraph-sdk").AIMessage;
const toolCalls = aiMsg.tool_calls || (aiMsg as any).toolCalls;
if (toolCalls && toolCalls.length > 0) {
const toolCallsStr = toolCalls
.map(
(tc: ToolCall) =>
`\n Tool: ${tc.name}\n Args: ${JSON.stringify(tc.args, null, 2)}`,
)
.join("\n");
content += `\n[Tool Calls: ${toolCallsStr}]`;
}
}
}
return `${prefix}${content}`;
}).join('\n\n');
return `${prefix}${content}`;
})
.join("\n\n");
}
/**
* Format email with optional parameters
*/
export function formatEmailOptional(
subject?: string,
author?: string,
to?: string,
emailThread?: string
subject?: string,
author?: string,
to?: string,
emailThread?: string,
): string {
return formatEmailMarkdown(
subject ?? "No Subject",
author ?? "Unknown Sender",
to ?? "Unknown Recipient",
emailThread ?? ""
emailThread ?? "",
);
}
+3 -3
@@ -46,7 +46,7 @@ AGENT_MODULE=email_assistant_hitl_memory pnpm test
This project uses ES modules, which require specific Jest configuration:
1. The setup file is in `.mjs` format to support ESM modules
2. Jest is configured to use the proper ESM preset
2. Jest is configured to use the proper ESM preset
3. Global types are declared in a separate `.d.ts` file
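
The three points above might translate into a config along these lines. This is only a sketch: it assumes `ts-jest` is the transformer, and the setup file path is hypothetical, since neither is confirmed by this README.

```typescript
// jest.config.ts (sketch; assumes ts-jest, which this README does not confirm)
const config = {
  // ts-jest preset that emits ES modules instead of CommonJS
  preset: "ts-jest/presets/default-esm",
  testEnvironment: "node",
  // Treat .ts files as ESM so `import` statements are left intact
  extensionsToTreatAsEsm: [".ts"],
  // Map "./utils.js"-style import specifiers back to their TypeScript sources
  moduleNameMapper: {
    "^(\\.{1,2}/.*)\\.js$": "$1",
  },
  transform: {
    "^.+\\.ts$": ["ts-jest", { useESM: true }],
  },
  // Hypothetical path: the setup file is described only as being in .mjs format
  setupFilesAfterEnv: ["<rootDir>/tests/setup.mjs"],
  // Matches the 2 minute timeout used by the LLM-based tests
  testTimeout: 120000,
};

export default config;
```

Jest's ESM support usually also needs Node started with `--experimental-vm-modules`, for example through `NODE_OPTIONS`.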
## Mock Assistant Implementation
@@ -115,5 +115,5 @@ To add new utility functions:
## Notes
- Tests may take longer to run due to LLM calls
- Default timeout is set to 2 minutes for LLM-based tests
- The mock assistant approach allows for faster tests without actual LLM calls
- Default timeout is set to 2 minutes for LLM-based tests
- The mock assistant approach allows for faster tests without actual LLM calls
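
To illustrate why the mock approach avoids LLM calls, here is a minimal sketch of a scripted assistant. The names `createMockAssistant`, `createThreadConfig`, and `collectStream` mirror the helpers imported by the tests, but the bodies below are assumptions for illustration and not the project's actual implementation. The `{ configurable: { thread_id } }` config shape is likewise assumed.

```typescript
// Hypothetical sketch; the real helpers live in the test utilities and may differ.
type Chunk = Record<string, unknown>;

interface ThreadConfig {
  configurable: { thread_id: string };
}

interface MockOptions {
  // Scripted chunks per thread ID; each stream() call consumes the next one
  mockResponses?: Record<string, Chunk[]>;
}

function createThreadConfig(threadId: string): ThreadConfig {
  return { configurable: { thread_id: threadId } };
}

function createMockAssistant(options: MockOptions = {}) {
  // Tracks how many scripted chunks each thread has already consumed
  const cursors = new Map<string, number>();
  return {
    async *stream(_input: unknown, config: ThreadConfig): AsyncGenerator<Chunk> {
      const threadId = config.configurable.thread_id;
      const scripted = options.mockResponses?.[threadId] ?? [];
      const index = cursors.get(threadId) ?? 0;
      cursors.set(threadId, index + 1);
      // Yield the next scripted chunk, or nothing once the script is exhausted
      if (index < scripted.length) {
        yield scripted[index];
      }
    },
  };
}

// Drains an async stream into an array so tests can search it for interrupts
async function collectStream<T>(stream: AsyncIterable<T>): Promise<T[]> {
  const chunks: T[] = [];
  for await (const chunk of stream) {
    chunks.push(chunk);
  }
  return chunks;
}
```

Because every chunk is scripted up front, a test written against this sketch runs deterministically and finishes in milliseconds, at the cost of not exercising the real model or graph.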
+263 -206
@@ -1,20 +1,20 @@
/**
* Human-in-the-Loop (HITL) functionality tests
*
*
* Tests the interactions where human approval is required for agent actions
*
*
* Test cases:
* - Accept write_email and schedule_meeting flow: Tests the basic approval flow when a user accepts all agent actions
* - Edit tool call parameters: Tests the functionality of editing tool call parameters (meeting duration, time)
* - Reject tool call with feedback: Tests rejecting a proposed action with feedback and ensuring the agent adapts
*
*
* Key concepts:
* - Action requests/interrupts: Points where the agent pauses for human approval
* - Command resume: How the flow is continued after human intervention
* - Tool call verification: Checking the correct tools are called with appropriate parameters
*/
import { describe, test, expect, beforeAll } from '@jest/globals';
import { Command } from '@langchain/langgraph';
import { describe, test, expect, beforeAll } from "@jest/globals";
import { Command } from "@langchain/langgraph";
import {
AGENT_MODULE,
@@ -22,287 +22,344 @@ import {
createMockAssistant,
createThreadConfig,
testEmails,
collectStream
} from './utils.js';
collectStream,
} from "./utils.js";
// Set module to HITL version for these tests
setAgentModule(process.env.AGENT_MODULE || "email_assistant_hitl");
describe('HITL functionality tests', () => {
describe("HITL functionality tests", () => {
beforeAll(() => {
// Setup LangSmith tracing if API key is available
// Setup LangSmith tracing if API key is available
if (process.env.LANGCHAIN_API_KEY) {
process.env.LANGCHAIN_TRACING_V2 = "true";
process.env.LANGCHAIN_CALLBACKS_BACKGROUND = "true";
}
console.log(`Using agent module: ${AGENT_MODULE}`);
});
test('Accept write_email and schedule_meeting flow', async () => {
test("Accept write_email and schedule_meeting flow", async () => {
// This test demonstrates the basic HITL approval flow when a user accepts all agent actions
const email = testEmails[0]; // Meeting request email
const threadConfig = createThreadConfig("test-thread-1");
// Create mock assistant with configured responses
const mockWriteEmailInterrupt = {
__interrupt__: [{
name: "action_request",
value: [{
action_request: {
action: "write_email",
args: {
to: "pm@client.com",
subject: "Re: Tax season let's schedule call",
body: "I've scheduled the meeting as requested."
}
}
}]
}]
};
const mockDoneResponse = {
ai_message: {
content: "All tasks completed",
tool_calls: [{ name: "Done", args: {} }]
}
};
const emailAssistant = createMockAssistant({
mockResponses: {
"test-thread-1": [mockWriteEmailInterrupt, mockDoneResponse]
}
});
// Run the graph until the first interrupt
console.log("Running the graph until the first interrupt...");
const initialChunks = await collectStream(emailAssistant.stream(
{"email_input": email},
threadConfig
));
// Get the interrupt object
const initialInterrupt = initialChunks.find(chunk => '__interrupt__' in chunk);
expect(initialInterrupt).toBeDefined();
// Extract the action request from the interrupt
const actionRequest = initialInterrupt?.__interrupt__[0].value[0].action_request;
console.log("\nINTERRUPT OBJECT:");
console.log(`Action Request: ${JSON.stringify(actionRequest)}`);
// Verify it's a schedule_meeting request
expect(actionRequest.action).toBe('schedule_meeting');
// Accept the schedule_meeting tool call
console.log(`\nSimulating user accepting the ${JSON.stringify(actionRequest)} tool call...`);
const secondChunks = await collectStream(emailAssistant.stream(
new Command({ resume: [{ type: "accept" }] }),
threadConfig
));
// Find the meeting confirmation message and the next interrupt
expect(secondChunks.length).toBeGreaterThan(0);
// The second element should be the write_email interrupt
const secondInterrupt = secondChunks.find(chunk => '__interrupt__' in chunk);
expect(secondInterrupt).toBeDefined();
// Extract the write_email action
const emailActionRequest = secondInterrupt?.__interrupt__[0].value[0].action_request;
console.log("\nINTERRUPT OBJECT:");
console.log(`Action Request: ${JSON.stringify(emailActionRequest)}`);
// Verify it's a write_email request
expect(emailActionRequest.action).toBe('write_email');
// Accept the write_email tool call
console.log(`\nSimulating user accepting the ${JSON.stringify(emailActionRequest)} tool call...`);
const finalChunks = await collectStream(emailAssistant.stream(
new Command({ resume: [{ type: "accept" }] }),
threadConfig
));
// Verify completion with Done tool call
const doneMessage = finalChunks.find(chunk => chunk.ai_message?.tool_calls?.some((tc: { name: string }) => tc.name === 'Done'));
expect(doneMessage).toBeDefined();
}, 120000); // 2 minute timeout for LLM calls
test('Edit tool call parameters', async () => {
// This test demonstrates editing a tool call's parameters
const email = testEmails[0]; // Meeting request email
const threadConfig = createThreadConfig("test-thread-2");
// Create mock assistant with specific responses for this test case
const mockWriteEmailInterruptWithEditedParams = {
__interrupt__: [{
name: "action_request",
value: [{
action_request: {
action: "write_email",
args: {
to: "pm@client.com",
subject: "Re: Tax season let's schedule call",
body: "I've scheduled a 30-minute meeting at 3:00 PM as requested."
}
}
}]
}]
// Create mock assistant with configured responses
const mockWriteEmailInterrupt = {
__interrupt__: [
{
name: "action_request",
value: [
{
action_request: {
action: "write_email",
args: {
to: "pm@client.com",
subject: "Re: Tax season let's schedule call",
body: "I've scheduled the meeting as requested.",
},
},
},
],
},
],
};
const mockDoneResponse = {
ai_message: {
content: "All tasks completed",
tool_calls: [{ name: "Done", args: {} }],
is_final: true
}
},
};
const emailAssistant = createMockAssistant({
mockResponses: {
"test-thread-2": [mockWriteEmailInterruptWithEditedParams, mockDoneResponse]
"test-thread-1": [mockWriteEmailInterrupt, mockDoneResponse],
},
});
// Run the graph until the first interrupt
console.log("Running the graph until the first interrupt...");
const initialChunks = await collectStream(
emailAssistant.stream({ email_input: email }, threadConfig),
);
// Get the interrupt object
const initialInterrupt = initialChunks.find(
(chunk) => "__interrupt__" in chunk,
);
expect(initialInterrupt).toBeDefined();
// Extract the action request from the interrupt
const actionRequest =
initialInterrupt?.__interrupt__[0].value[0].action_request;
console.log("\nINTERRUPT OBJECT:");
console.log(`Action Request: ${JSON.stringify(actionRequest)}`);
// Verify it's a schedule_meeting request
expect(actionRequest.action).toBe("schedule_meeting");
// Accept the schedule_meeting tool call
console.log(
`\nSimulating user accepting the ${JSON.stringify(actionRequest)} tool call...`,
);
const secondChunks = await collectStream(
emailAssistant.stream(
new Command({ resume: [{ type: "accept" }] }),
threadConfig,
),
);
// Find the meeting confirmation message and the next interrupt
expect(secondChunks.length).toBeGreaterThan(0);
// The second element should be the write_email interrupt
const secondInterrupt = secondChunks.find(
(chunk) => "__interrupt__" in chunk,
);
expect(secondInterrupt).toBeDefined();
// Extract the write_email action
const emailActionRequest =
secondInterrupt?.__interrupt__[0].value[0].action_request;
console.log("\nINTERRUPT OBJECT:");
console.log(`Action Request: ${JSON.stringify(emailActionRequest)}`);
// Verify it's a write_email request
expect(emailActionRequest.action).toBe("write_email");
// Accept the write_email tool call
console.log(
`\nSimulating user accepting the ${JSON.stringify(emailActionRequest)} tool call...`,
);
const finalChunks = await collectStream(
emailAssistant.stream(
new Command({ resume: [{ type: "accept" }] }),
threadConfig,
),
);
// Verify completion with Done tool call
const doneMessage = finalChunks.find((chunk) =>
chunk.ai_message?.tool_calls?.some(
(tc: { name: string }) => tc.name === "Done",
),
);
expect(doneMessage).toBeDefined();
}, 120000); // 2 minute timeout for LLM calls
test("Edit tool call parameters", async () => {
// This test demonstrates editing a tool call's parameters
const email = testEmails[0]; // Meeting request email
const threadConfig = createThreadConfig("test-thread-2");
// Create mock assistant with specific responses for this test case
const mockWriteEmailInterruptWithEditedParams = {
__interrupt__: [
{
name: "action_request",
value: [
{
action_request: {
action: "write_email",
args: {
to: "pm@client.com",
subject: "Re: Tax season let's schedule call",
body: "I've scheduled a 30-minute meeting at 3:00 PM as requested.",
},
},
},
],
},
],
};
const mockDoneResponse = {
ai_message: {
content: "All tasks completed",
tool_calls: [{ name: "Done", args: {} }],
is_final: true,
},
};
const emailAssistant = createMockAssistant({
mockResponses: {
"test-thread-2": [
mockWriteEmailInterruptWithEditedParams,
mockDoneResponse,
],
},
mockStates: {
"test-thread-2": {
values: {
is_final: true
}
}
}
is_final: true,
},
},
},
});
// Run the graph until the first interrupt
console.log("Running the graph until the first interrupt...");
const initialChunks = await collectStream(emailAssistant.stream(
{"email_input": email},
threadConfig
));
const initialChunks = await collectStream(
emailAssistant.stream({ email_input: email }, threadConfig),
);
// Get the interrupt object
const initialInterrupt = initialChunks.find(chunk => '__interrupt__' in chunk);
const initialInterrupt = initialChunks.find(
(chunk) => "__interrupt__" in chunk,
);
expect(initialInterrupt).toBeDefined();
// Extract the action request from the interrupt
const actionRequest = initialInterrupt?.__interrupt__[0].value[0].action_request;
const actionRequest =
initialInterrupt?.__interrupt__[0].value[0].action_request;
// Verify it's a schedule_meeting request
expect(actionRequest.action).toBe('schedule_meeting');
expect(actionRequest.action).toBe("schedule_meeting");
// Edit the meeting duration and time
const editedArgs = {
...actionRequest.args,
duration_minutes: 30, // Change from 45 minutes to 30 minutes
start_time: 15 // Change from 2pm to 3pm
start_time: 15, // Change from 2pm to 3pm
};
// Edit the tool call
console.log(`\nSimulating user editing the meeting parameters...`);
const secondChunks = await collectStream(emailAssistant.stream(
new Command({ resume: [{ type: "edit", args: editedArgs }] }),
threadConfig
));
const secondChunks = await collectStream(
emailAssistant.stream(
new Command({ resume: [{ type: "edit", args: editedArgs }] }),
threadConfig,
),
);
// Find the next interrupt for write_email
const secondInterrupt = secondChunks.find(chunk => '__interrupt__' in chunk);
const secondInterrupt = secondChunks.find(
(chunk) => "__interrupt__" in chunk,
);
expect(secondInterrupt).toBeDefined();
// Extract the write_email action
const emailActionRequest = secondInterrupt?.__interrupt__[0].value[0].action_request;
const emailActionRequest =
secondInterrupt?.__interrupt__[0].value[0].action_request;
// Verify it's a write_email request
expect(emailActionRequest.action).toBe('write_email');
expect(emailActionRequest.action).toBe("write_email");
// Verify email mentions the edited parameters (30 minutes, 3pm)
const emailContent = emailActionRequest.args.body;
expect(emailContent).toContain('30');
expect(emailContent).toContain('3:00 PM');
expect(emailContent).toContain("30");
expect(emailContent).toContain("3:00 PM");
// Accept the write_email
const finalChunks = await collectStream(emailAssistant.stream(
new Command({ resume: [{ type: "accept" }] }),
threadConfig
));
const finalChunks = await collectStream(
emailAssistant.stream(
new Command({ resume: [{ type: "accept" }] }),
threadConfig,
),
);
// Verify completion
const state = await emailAssistant.getState(threadConfig);
expect(state.values.is_final).toBeTruthy();
}, 120000); // 2 minute timeout for LLM calls
test('Reject tool call with feedback', async () => {
test("Reject tool call with feedback", async () => {
// This test demonstrates rejecting a tool call with feedback
const email = testEmails[0]; // Meeting request email
const threadConfig = createThreadConfig("test-thread-3");
// Create mock assistant with specific responses for this test case
const mockNewScheduleMeetingInterrupt = {
__interrupt__: [{
name: "action_request",
value: [{
action_request: {
action: "schedule_meeting",
args: {
emails: ["pm@client.com"],
title: "Tax Discussion",
time: "2023-07-29T14:00:00Z", // Next week!
duration: 45,
duration_minutes: 45,
preferred_day: "2023-07-29"
}
}
}]
}]
const mockNewScheduleMeetingInterrupt = {
__interrupt__: [
{
name: "action_request",
value: [
{
action_request: {
action: "schedule_meeting",
args: {
emails: ["pm@client.com"],
title: "Tax Discussion",
time: "2023-07-29T14:00:00Z", // Next week!
duration: 45,
duration_minutes: 45,
preferred_day: "2023-07-29",
},
},
},
],
},
],
};
const emailAssistant = createMockAssistant({
mockResponses: {
"test-thread-3": [mockNewScheduleMeetingInterrupt]
}
"test-thread-3": [mockNewScheduleMeetingInterrupt],
},
});
// Run the graph until the first interrupt
console.log("Running the graph until the first interrupt...");
const initialChunks = await collectStream(emailAssistant.stream(
{"email_input": email},
threadConfig
));
const initialChunks = await collectStream(
emailAssistant.stream({ email_input: email }, threadConfig),
);
// Get the interrupt object
const initialInterrupt = initialChunks.find(chunk => '__interrupt__' in chunk);
const initialInterrupt = initialChunks.find(
(chunk) => "__interrupt__" in chunk,
);
expect(initialInterrupt).toBeDefined();
// Extract the action request from the interrupt
const actionRequest = initialInterrupt?.__interrupt__[0].value[0].action_request;
const actionRequest =
initialInterrupt?.__interrupt__[0].value[0].action_request;
// Verify it's a schedule_meeting request
expect(actionRequest.action).toBe('schedule_meeting');
expect(actionRequest.action).toBe("schedule_meeting");
// Original date for comparison later
const originalDate = new Date(actionRequest.args.time);
// Reject the tool call with feedback
console.log(`\nSimulating user rejecting the tool call with feedback...`);
const secondChunks = await collectStream(emailAssistant.stream(
new Command({ resume: [{ type: "reject", args: "I'm not available next week. Please suggest the following week instead." }] }),
threadConfig
));
const secondChunks = await collectStream(
emailAssistant.stream(
new Command({
resume: [
{
type: "reject",
args: "I'm not available next week. Please suggest the following week instead.",
},
],
}),
threadConfig,
),
);
// The agent should now propose a different meeting time
// Find the next interrupt
const secondInterrupt = secondChunks.find(chunk => '__interrupt__' in chunk);
const secondInterrupt = secondChunks.find(
(chunk) => "__interrupt__" in chunk,
);
expect(secondInterrupt).toBeDefined();
// Extract the action request
const newActionRequest = secondInterrupt?.__interrupt__[0].value[0].action_request;
const newActionRequest =
secondInterrupt?.__interrupt__[0].value[0].action_request;
// Should still be a schedule_meeting but with a different date
expect(newActionRequest.action).toBe('schedule_meeting');
expect(newActionRequest.action).toBe("schedule_meeting");
// The new proposed date should be in a different week
const newDate = new Date(newActionRequest.args.preferred_day);
// Calculate the difference in days
const dayDifference = Math.floor((newDate.getTime() - originalDate.getTime()) / (1000 * 60 * 60 * 24));
const dayDifference = Math.floor(
(newDate.getTime() - originalDate.getTime()) / (1000 * 60 * 60 * 24),
);
// Expect at least 7 days difference (next week)
expect(dayDifference).toBeGreaterThanOrEqual(7);
}, 120000); // 2 minute timeout for LLM calls
});
});
+201 -141
@@ -1,21 +1,21 @@
/**
* Memory functionality tests
*
*
* Tests the agent's ability to store, retrieve, and use memory for personalization
*
*
* Test cases:
* - Accept flow without memory updates: Verifies simple accepts don't modify user preferences
* - Memory updates based on edit with feedback: Tests how editing with feedback updates stored preferences
* - Memory affects subsequent emails: Tests that stored preferences influence future interactions
*
*
* Key concepts:
* - TestInMemoryStore: Custom store implementation that simulates memory updates
* - User preferences: Stored preferences for different aspects (calendar, response style)
* - Memory persistence: Verifying memory is maintained between interactions
* - Memory application: Testing preferences are properly applied to new interactions
*/
import { describe, test, expect, beforeAll } from '@jest/globals';
import { Command } from '@langchain/langgraph';
import { describe, test, expect, beforeAll } from "@jest/globals";
import { Command } from "@langchain/langgraph";
import {
AGENT_MODULE,
@@ -25,243 +25,303 @@ import {
testEmails,
collectStream,
displayMemoryContent,
TestInMemoryStore
} from './utils.js';
TestInMemoryStore,
} from "./utils.js";
// Set module to HITL+Memory version for these tests
setAgentModule(process.env.AGENT_MODULE || "email_assistant_hitl_memory");
describe('Memory functionality tests', () => {
describe("Memory functionality tests", () => {
beforeAll(() => {
// Setup LangSmith tracing if API key is available
// Setup LangSmith tracing if API key is available
if (process.env.LANGCHAIN_API_KEY) {
process.env.LANGCHAIN_TRACING_V2 = "true";
process.env.LANGCHAIN_CALLBACKS_BACKGROUND = "true";
}
console.log(`Using agent module: ${AGENT_MODULE}`);
});
test('Accept flow without memory updates', async () => {
test("Accept flow without memory updates", async () => {
// This test demonstrates how accepting without feedback doesn't update memory
const email = testEmails[0]; // Meeting request email
const threadConfig = createThreadConfig("memory-test-thread-1");
const store = new TestInMemoryStore();
// Create mock assistant with configured responses
const mockWriteEmailInterrupt = {
__interrupt__: [{
name: "action_request",
value: [{
action_request: {
action: "write_email",
args: {
to: "pm@client.com",
subject: "Re: Tax season let's schedule call",
body: "I've scheduled the meeting as requested."
}
}
}]
}]
const mockWriteEmailInterrupt = {
__interrupt__: [
{
name: "action_request",
value: [
{
action_request: {
action: "write_email",
args: {
to: "pm@client.com",
subject: "Re: Tax season let's schedule call",
body: "I've scheduled the meeting as requested.",
},
},
},
],
},
],
};
const emailAssistant = createMockAssistant({
mockResponses: {
"memory-test-thread-1": [mockWriteEmailInterrupt]
}
"memory-test-thread-1": [mockWriteEmailInterrupt],
},
});
// Check initial memory state
await displayMemoryContent(store);
// Run the graph until the first interrupt
console.log("Running the graph until the first interrupt...");
const initialChunks = await collectStream(emailAssistant.stream(
{"email_input": email},
threadConfig
));
const initialChunks = await collectStream(
emailAssistant.stream({ email_input: email }, threadConfig),
);
// Get the interrupt object
const initialInterrupt = initialChunks.find(chunk => '__interrupt__' in chunk);
const initialInterrupt = initialChunks.find(
(chunk) => "__interrupt__" in chunk,
);
expect(initialInterrupt).toBeDefined();
// Extract the action request from the interrupt
const actionRequest = initialInterrupt?.__interrupt__[0].value[0].action_request;
const actionRequest =
initialInterrupt?.__interrupt__[0].value[0].action_request;
console.log("\nINTERRUPT OBJECT:");
console.log(`Action Request: ${JSON.stringify(actionRequest)}`);
// Verify it's a schedule_meeting request
expect(actionRequest.action).toBe('schedule_meeting');
expect(actionRequest.action).toBe("schedule_meeting");
// Get initial calendar preferences
const initialCalPreferences = await store.get(["email_assistant", "cal_preferences"], "user_preferences");
const initialCalPreferences = await store.get(
["email_assistant", "cal_preferences"],
"user_preferences",
);
const initialPrefsContent = initialCalPreferences?.value;
// Accept without modification
console.log(`\nSimulating user accepting the ${actionRequest.action} tool call...`);
const secondChunks = await collectStream(emailAssistant.stream(
new Command({ resume: [{ type: "accept" }] }),
threadConfig
));
console.log(
`\nSimulating user accepting the ${actionRequest.action} tool call...`,
);
const secondChunks = await collectStream(
emailAssistant.stream(
new Command({ resume: [{ type: "accept" }] }),
threadConfig,
),
);
// Find the next interrupt
const secondInterrupt = secondChunks.find(chunk => '__interrupt__' in chunk);
const secondInterrupt = secondChunks.find(
(chunk) => "__interrupt__" in chunk,
);
expect(secondInterrupt).toBeDefined();
// Extract the write_email action
const emailActionRequest = secondInterrupt?.__interrupt__[0].value[0].action_request;
const emailActionRequest =
secondInterrupt?.__interrupt__[0].value[0].action_request;
// Verify no memory changes after simple accept
const currentCalPreferences = await store.get(["email_assistant", "cal_preferences"], "user_preferences");
const currentCalPreferences = await store.get(
["email_assistant", "cal_preferences"],
"user_preferences",
);
expect(currentCalPreferences?.value).toEqual(initialPrefsContent);
// Accept the write_email tool call
await collectStream(emailAssistant.stream(
new Command({ resume: [{ type: "accept" }] }),
threadConfig
));
await collectStream(
emailAssistant.stream(
new Command({ resume: [{ type: "accept" }] }),
threadConfig,
),
);
// Verify memory still unchanged
const finalCalPreferences = await store.get(["email_assistant", "cal_preferences"], "user_preferences");
const finalCalPreferences = await store.get(
["email_assistant", "cal_preferences"],
"user_preferences",
);
expect(finalCalPreferences?.value).toEqual(initialPrefsContent);
}, 120000); // 2 minute timeout for LLM calls
test('Memory updates based on edit with feedback', async () => {
test("Memory updates based on edit with feedback", async () => {
// This test demonstrates how editing with feedback updates memory
const email = testEmails[0]; // Meeting request email
const threadConfig = createThreadConfig("memory-test-thread-2");
const store = new TestInMemoryStore();
// Create mock assistant with configured responses
const mockWriteEmailInterrupt = {
__interrupt__: [{
name: "action_request",
value: [{
action_request: {
action: "write_email",
args: {
to: "pm@client.com",
subject: "Re: Tax season let's schedule call",
body: "I've scheduled a 30-minute meeting as per your preference."
}
}
}]
}]
const mockWriteEmailInterrupt = {
__interrupt__: [
{
name: "action_request",
value: [
{
action_request: {
action: "write_email",
args: {
to: "pm@client.com",
subject: "Re: Tax season let's schedule call",
body: "I've scheduled a 30-minute meeting as per your preference.",
},
},
},
],
},
],
};
const emailAssistant = createMockAssistant({
mockResponses: {
"memory-test-thread-2": [mockWriteEmailInterrupt]
}
"memory-test-thread-2": [mockWriteEmailInterrupt],
},
});
// Check initial memory state
await displayMemoryContent(store);
// Run the graph until the first interrupt
console.log("Running the graph until the first interrupt...");
const initialChunks = await collectStream(emailAssistant.stream(
{"email_input": email},
threadConfig
));
const initialChunks = await collectStream(
emailAssistant.stream({ email_input: email }, threadConfig),
);
// Get the interrupt object
const initialInterrupt = initialChunks.find(chunk => '__interrupt__' in chunk);
const initialInterrupt = initialChunks.find(
(chunk) => "__interrupt__" in chunk,
);
expect(initialInterrupt).toBeDefined();
// Extract the action request from the interrupt
const actionRequest = initialInterrupt?.__interrupt__[0].value[0].action_request;
const actionRequest =
initialInterrupt?.__interrupt__[0].value[0].action_request;
// Get initial calendar preferences
const initialCalPreferences = await store.get(["email_assistant", "cal_preferences"], "user_preferences");
const initialCalPreferences = await store.get(
["email_assistant", "cal_preferences"],
"user_preferences",
);
const initialPrefsContent = initialCalPreferences?.value;
// Edit the meeting duration and add explicit feedback
const editedArgs = {
...actionRequest.args,
duration_minutes: 30, // Change from 45 to 30 minutes
};
// Edit with feedback about preference - this should trigger memory update in our mock
console.log(`\nSimulating user editing with feedback about 30-minute meeting preference...`);
const secondChunks = await collectStream(emailAssistant.stream(
new Command({ resume: [{
type: "edit",
args: editedArgs,
feedback: "I always prefer 30-minute meetings unless longer is specifically needed."
}] }),
threadConfig
));
console.log(
`\nSimulating user editing with feedback about 30-minute meeting preference...`,
);
const secondChunks = await collectStream(
emailAssistant.stream(
new Command({
resume: [
{
type: "edit",
args: editedArgs,
feedback:
"I always prefer 30-minute meetings unless longer is specifically needed.",
},
],
}),
threadConfig,
),
);
// Update store to simulate memory changes
await store.put(
["email_assistant", "cal_preferences"],
"user_preferences",
{ value: "For calendar events, prefer 30-minute meetings instead of 45-minute meetings..." }
["email_assistant", "cal_preferences"],
"user_preferences",
{
value:
"For calendar events, prefer 30-minute meetings instead of 45-minute meetings...",
},
);
// Check memory after edit with feedback
const updatedCalPreferences = await store.get(["email_assistant", "cal_preferences"], "user_preferences");
const updatedCalPreferences = await store.get(
["email_assistant", "cal_preferences"],
"user_preferences",
);
const updatedPrefsContent = updatedCalPreferences?.value;
// Verify memory was updated with 30-minute preference
expect(updatedPrefsContent).not.toEqual(initialPrefsContent);
expect(updatedPrefsContent).toContain("30-minute");
// Finish the flow by accepting the email
const secondInterrupt = secondChunks.find(chunk => '__interrupt__' in chunk);
const secondInterrupt = secondChunks.find(
(chunk) => "__interrupt__" in chunk,
);
expect(secondInterrupt).toBeDefined();
await collectStream(emailAssistant.stream(
new Command({ resume: [{ type: "accept" }] }),
threadConfig
));
await collectStream(
emailAssistant.stream(
new Command({ resume: [{ type: "accept" }] }),
threadConfig,
),
);
}, 120000); // 2 minute timeout for LLM calls
test('Memory affects subsequent emails', async () => {
test("Memory affects subsequent emails", async () => {
// This test demonstrates how memory affects future interactions
const email = testEmails[0]; // Meeting request email
const threadConfig = createThreadConfig("memory-test-thread-3");
const store = new TestInMemoryStore();
// Update the calendar preferences directly to set a known state
await store.put(
["email_assistant", "cal_preferences"],
"user_preferences",
{ value: "I strictly prefer 25-minute meetings. This is a non-negotiable preference." }
["email_assistant", "cal_preferences"],
"user_preferences",
{
value:
"I strictly prefer 25-minute meetings. This is a non-negotiable preference.",
},
);
// Create a new email with different times for second test
const newEmail = {
...testEmails[0],
id: "test-email-4",
thread_id: "thread-4",
subject: "Another meeting request",
page_content: "Lance,\n\nCan we schedule a 45-minute call next Monday?\n\nRegards,\nSomeone"
page_content:
"Lance,\n\nCan we schedule a 45-minute call next Monday?\n\nRegards,\nSomeone",
};
// Create mock assistant that returns a meeting request with 25 minutes duration
const emailAssistant = createMockAssistant();
// Run the graph until the first interrupt
console.log("Processing new email with existing memory preferences...");
const initialChunks = await collectStream(emailAssistant.stream(
{"email_input": newEmail},
threadConfig
));
const initialChunks = await collectStream(
emailAssistant.stream({ email_input: newEmail }, threadConfig),
);
// Get the interrupt object
const initialInterrupt = initialChunks.find(chunk => '__interrupt__' in chunk);
const initialInterrupt = initialChunks.find(
(chunk) => "__interrupt__" in chunk,
);
expect(initialInterrupt).toBeDefined();
// Extract the action request from the interrupt
const actionRequest = initialInterrupt?.__interrupt__[0].value[0].action_request;
const actionRequest =
initialInterrupt?.__interrupt__[0].value[0].action_request;
// Verify the scheduler honors the 25-minute preference from memory
expect(actionRequest.action).toBe('schedule_meeting');
expect(actionRequest.action).toBe("schedule_meeting");
expect(actionRequest.args.duration_minutes).toBe(25);
// Verify the tool call proposal mentions the 25-minute preference
console.log(`\nVerifying memory is used in the proposal: ${JSON.stringify(actionRequest)}`);
console.log(
`\nVerifying memory is used in the proposal: ${JSON.stringify(actionRequest)}`,
);
}, 120000); // 2 minute timeout for LLM calls
});
});
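Note on the interrupt handling in the tests above: they locate the first chunk carrying `__interrupt__` and then read `action_request` from it. The sketch below shows one way to make that lookup type-safe so a missing interrupt yields `undefined` instead of a runtime error. The interfaces and the `firstActionRequest` helper are assumptions inferred from the mock chunk shapes in this diff; they are not LangGraph's own types.

```typescript
// Hypothetical shapes, inferred from the mock chunks in this diff (not LangGraph types).
interface ActionRequest {
  action: string;
  args: Record<string, unknown>;
}

interface InterruptChunk {
  __interrupt__: { value: { action_request: ActionRequest }[] }[];
}

type StreamChunk = InterruptChunk | { messages: unknown[] };

// Type guard mirroring the `"__interrupt__" in chunk` check used in the tests.
function isInterrupt(chunk: StreamChunk): chunk is InterruptChunk {
  return "__interrupt__" in chunk;
}

// Returns the first proposed action, or undefined when no chunk carries an interrupt.
function firstActionRequest(chunks: StreamChunk[]): ActionRequest | undefined {
  const interrupt = chunks.find(isInterrupt);
  return interrupt?.__interrupt__[0]?.value[0]?.action_request;
}

// Sample data shaped like the mock assistant's output for "thread-4".
const sampleChunks: StreamChunk[] = [
  { messages: [] },
  {
    __interrupt__: [
      {
        value: [
          {
            action_request: {
              action: "schedule_meeting",
              args: { duration_minutes: 25 },
            },
          },
        ],
      },
    ],
  },
];

const request = firstActionRequest(sampleChunks);
```

With a helper along these lines, a test can assert `request` is defined once and then read `request.action` and `request.args` without further optional chaining.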
+163 -113
@@ -1,12 +1,12 @@
/**
* Email response quality tests
*
*
* Tests the quality and correctness of email responses generated by the agent
*
*
* Test suites:
* - Tool call tests: Verifies the agent calls the expected tools for each email type
* - Response quality tests: Evaluates whether responses meet quality criteria
*
*
* Key concepts:
* - Expected tool calls: Predefined list of tools that should be called for each email
* - Quality criteria: Defined standards for what makes a good response
@@ -14,8 +14,8 @@
* - Extra vs. missing calls: Test allows extra tool calls but fails on missing expected ones
* - Response evaluation: Using evaluateResponseCriteria to assess response quality
*/
import { describe, test, expect, beforeAll } from '@jest/globals';
import { Command } from '@langchain/langgraph';
import { describe, test, expect, beforeAll } from "@jest/globals";
import { Command } from "@langchain/langgraph";
import {
AGENT_MODULE,
@@ -28,13 +28,13 @@ import {
testCriteria,
expectedToolCalls,
evaluateResponseCriteria,
EmailData
} from './utils.js';
import { extractToolCalls, formatMessagesString } from '../src/utils.js';
EmailData,
} from "./utils.js";
import { extractToolCalls, formatMessagesString } from "../src/utils.js";
/**
* Test utilities for the Email Assistant test suite
*
*
* This module provides:
* - Mock data (emails, criteria, expected tool calls)
* - Utility functions for testing email assistant behavior
@@ -46,41 +46,52 @@ import { extractToolCalls, formatMessagesString } from '../src/utils.js';
// or default to the HITL+Memory version
setAgentModule(process.env.AGENT_MODULE || "email_assistant_hitl_memory");
describe('Email response tests', () => {
describe("Email response tests", () => {
beforeAll(() => {
// Setup LangSmith tracing if API key is available
// Setup LangSmith tracing if API key is available
if (process.env.LANGCHAIN_API_KEY) {
process.env.LANGCHAIN_TRACING_V2 = "true";
process.env.LANGCHAIN_CALLBACKS_BACKGROUND = "true";
}
console.log(`Using agent module: ${AGENT_MODULE}`);
});
describe('Tool call tests', () => {
describe("Tool call tests", () => {
// Only include emails that should have tool calls (triage_output == "respond")
const responseCases = testEmails
.map((email: EmailData, i: number) => [email, testCriteria[i], expectedToolCalls[i]] as const)
.map(
(email: EmailData, i: number) =>
[email, testCriteria[i], expectedToolCalls[i]] as const,
)
.filter((_: any, i: number) => expectedToolCalls[i].length > 0);
test.each(responseCases)(
'processes %s with expected tool calls',
async (emailInput: EmailData, criteria: string, expectedCalls: string[]) => {
"processes %s with expected tool calls",
async (
emailInput: EmailData,
criteria: string,
expectedCalls: string[],
) => {
// Log test info
console.log(`Processing ${emailInput.subject}...`);
// Set up the assistant with thread ID from the email input
const threadConfig = createThreadConfig(emailInput.thread_id);
// Use custom mock state for thread-1 to ensure it has the expected schedule_meeting tool call
const mockStates: Record<string, any> = {};
if (emailInput.thread_id === 'thread-1') {
mockStates['thread-1'] = {
if (emailInput.thread_id === "thread-1") {
mockStates["thread-1"] = {
values: {
messages: [
{ type: "human", content: "This is a test email about scheduling a tax discussion." },
{
type: "ai",
{
type: "human",
content:
"This is a test email about scheduling a tax discussion.",
},
{
type: "ai",
content: "I'll help you schedule that meeting.",
tool_calls: [
{
@@ -91,8 +102,8 @@ describe('Email response tests', () => {
title: "Tax Discussion",
time: "2023-07-16T14:00:00Z",
duration: 45,
duration_minutes: 45
}
duration_minutes: 45,
},
},
{
id: "call_124",
@@ -100,100 +111,125 @@ describe('Email response tests', () => {
args: {
to: "pm@client.com",
subject: "Re: Tax Planning Discussion",
body: "I've scheduled the meeting as requested."
}
}
]
body: "I've scheduled the meeting as requested.",
},
},
],
},
{
type: "tool",
content: "Meeting scheduled",
tool_call_id: "call_123",
},
{
type: "tool",
content: "Email sent successfully",
tool_call_id: "call_124",
},
{ type: "tool", content: "Meeting scheduled", tool_call_id: "call_123" },
{ type: "tool", content: "Email sent successfully", tool_call_id: "call_124" }
],
email_input: null,
classification_decision: "respond",
is_final: true
}
is_final: true,
},
};
}
const emailAssistant = createMockAssistant({
mockStates: mockStates
mockStates: mockStates,
});
// First stream for initial interrupt
await collectStream(emailAssistant.stream(
{"email_input": emailInput},
threadConfig
));
await collectStream(
emailAssistant.stream({ email_input: emailInput }, threadConfig),
);
// Accept the schedule meeting
await collectStream(emailAssistant.stream(
new Command({ resume: [{ type: "accept" }] }),
threadConfig
));
await collectStream(
emailAssistant.stream(
new Command({ resume: [{ type: "accept" }] }),
threadConfig,
),
);
// Accept the write email if there is one
await collectStream(emailAssistant.stream(
new Command({ resume: [{ type: "accept" }] }),
threadConfig
));
await collectStream(
emailAssistant.stream(
new Command({ resume: [{ type: "accept" }] }),
threadConfig,
),
);
// Get the final state
const state = await emailAssistant.getState(threadConfig);
const values = extractValues(state);
// Extract tool calls from messages
const extractedToolCalls = extractToolCalls(values.messages);
// Check if all expected tool calls are in the extracted ones
const missingCalls = expectedCalls.filter(
(call: string) => !extractedToolCalls.includes(call.toLowerCase())
(call: string) => !extractedToolCalls.includes(call.toLowerCase()),
);
// Extra calls are allowed (we only fail if expected calls are missing)
const extraCalls = extractedToolCalls.filter(
(call: string) => !expectedCalls.map((c: string) => c.toLowerCase()).includes(call.toLowerCase())
(call: string) =>
!expectedCalls
.map((c: string) => c.toLowerCase())
.includes(call.toLowerCase()),
);
// Log for debugging
console.log('Extracted tool calls:', extractedToolCalls);
console.log('Missing calls:', missingCalls);
console.log('Extra calls:', extraCalls);
console.log("Extracted tool calls:", extractedToolCalls);
console.log("Missing calls:", missingCalls);
console.log("Extra calls:", extraCalls);
// Get formatted messages for detailed logging
const allMessagesStr = formatMessagesString(values.messages);
console.log('Response:', allMessagesStr);
console.log("Response:", allMessagesStr);
// Assert that there are no missing calls
expect(missingCalls.length).toBe(0);
},
60000 // 60 second timeout for LLM calls
60000, // 60 second timeout for LLM calls
);
});
describe('Response quality tests', () => {
describe("Response quality tests", () => {
// Only test emails that require a response
const responseCases = testEmails
.map((email: EmailData, i: number) => [email, testCriteria[i], expectedToolCalls[i]] as const)
.map(
(email: EmailData, i: number) =>
[email, testCriteria[i], expectedToolCalls[i]] as const,
)
.filter((_: any, i: number) => expectedToolCalls[i].length > 0);
test.each(responseCases)(
'produces quality response for %s',
async (emailInput: EmailData, criteria: string, expectedCalls: string[]) => {
"produces quality response for %s",
async (
emailInput: EmailData,
criteria: string,
expectedCalls: string[],
) => {
// Log test info
console.log(`Processing ${emailInput.subject}...`);
// Set up the assistant with thread ID from the email input
const threadConfig = createThreadConfig(emailInput.thread_id);
// Use custom mock state for thread-1 to ensure it has the expected schedule_meeting tool call
const mockStates: Record<string, any> = {};
if (emailInput.thread_id === 'thread-1') {
mockStates['thread-1'] = {
if (emailInput.thread_id === "thread-1") {
mockStates["thread-1"] = {
values: {
messages: [
{ type: "human", content: "This is a test email about scheduling a tax discussion." },
{
type: "ai",
{
type: "human",
content:
"This is a test email about scheduling a tax discussion.",
},
{
type: "ai",
content: "I'll help you schedule that meeting.",
tool_calls: [
{
@@ -204,8 +240,8 @@ describe('Email response tests', () => {
title: "Tax Discussion",
time: "2023-07-16T14:00:00Z",
duration: 45,
duration_minutes: 45
}
duration_minutes: 45,
},
},
{
id: "call_124",
@@ -213,60 +249,74 @@ describe('Email response tests', () => {
args: {
to: "pm@client.com",
subject: "Re: Tax Planning Discussion",
body: "I've scheduled the meeting as requested."
}
}
]
body: "I've scheduled the meeting as requested.",
},
},
],
},
{
type: "tool",
content: "Meeting scheduled",
tool_call_id: "call_123",
},
{
type: "tool",
content: "Email sent successfully",
tool_call_id: "call_124",
},
{ type: "tool", content: "Meeting scheduled", tool_call_id: "call_123" },
{ type: "tool", content: "Email sent successfully", tool_call_id: "call_124" }
],
email_input: null,
classification_decision: "respond",
is_final: true
}
is_final: true,
},
};
}
const emailAssistant = createMockAssistant({
mockStates: mockStates
mockStates: mockStates,
});
// First stream for initial interrupt
await collectStream(emailAssistant.stream(
{"email_input": emailInput},
threadConfig
));
await collectStream(
emailAssistant.stream({ email_input: emailInput }, threadConfig),
);
// Accept the schedule meeting
await collectStream(emailAssistant.stream(
new Command({ resume: [{ type: "accept" }] }),
threadConfig
));
await collectStream(
emailAssistant.stream(
new Command({ resume: [{ type: "accept" }] }),
threadConfig,
),
);
// Accept the write email if there is one
await collectStream(emailAssistant.stream(
new Command({ resume: [{ type: "accept" }] }),
threadConfig
));
await collectStream(
emailAssistant.stream(
new Command({ resume: [{ type: "accept" }] }),
threadConfig,
),
);
// Get the final state
const state = await emailAssistant.getState(threadConfig);
const values = extractValues(state);
// Format all messages for evaluation
const allMessagesStr = formatMessagesString(values.messages);
// Evaluate the response against criteria
const evaluation = await evaluateResponseCriteria(allMessagesStr, criteria);
const evaluation = await evaluateResponseCriteria(
allMessagesStr,
criteria,
);
// Log the evaluation
console.log('Evaluation:', evaluation);
console.log("Evaluation:", evaluation);
// Assert that the response meets the criteria
expect(evaluation.grade).toBe(true);
},
60000 // 60 second timeout for LLM calls
60000, // 60 second timeout for LLM calls
);
});
});
});
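Note on the tool call assertion in the tests above: expected calls that are missing fail the test, while extra calls are only logged. A minimal sketch of that comparison follows. The `compareToolCalls` helper and its return shape are assumptions for illustration, and `check_calendar_availability` is a made-up tool name. Unlike the test code, which lowercases only the expected names, this sketch normalizes both sides.

```typescript
// Hypothetical helper; the name and return shape are assumptions for illustration.
function compareToolCalls(expected: string[], actual: string[]) {
  const normalize = (name: string) => name.toLowerCase();
  const expectedSet = new Set(expected.map(normalize));
  const actualSet = new Set(actual.map(normalize));

  return {
    // Expected calls the agent never made: these should fail the test.
    missing: expected.filter((name) => !actualSet.has(normalize(name))),
    // Calls the agent made beyond the expected list: allowed, only reported.
    extra: actual.filter((name) => !expectedSet.has(normalize(name))),
  };
}

const result = compareToolCalls(
  ["schedule_meeting", "write_email"],
  ["write_email", "check_calendar_availability"], // second name is hypothetical
);
```

A test would then assert only on `result.missing`, matching the `expect(missingCalls.length).toBe(0)` check in the diff.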
+2 -2
@@ -1,7 +1,7 @@
import { ChatOpenAI } from '@langchain/openai';
import { ChatOpenAI } from "@langchain/openai";
declare global {
var criteriaEvalLLM: ChatOpenAI;
}
export {};
export {};
+6 -6
@@ -1,6 +1,6 @@
import { config } from 'dotenv';
import { ChatOpenAI } from '@langchain/openai';
import { beforeAll } from '@jest/globals';
import { config } from "dotenv";
import { ChatOpenAI } from "@langchain/openai";
import { beforeAll } from "@jest/globals";
// Load environment variables
config();
@@ -8,8 +8,8 @@ config();
// Set up global model for evaluations
// @ts-ignore - Global types will be picked up from setup.d.ts
globalThis.criteriaEvalLLM = new ChatOpenAI({
modelName: 'gpt-4o',
temperature: 0
modelName: "gpt-4o",
temperature: 0,
});
// This runs before all tests
@@ -19,4 +19,4 @@ beforeAll(() => {
process.env.LANGCHAIN_TRACING_V2 = "true";
process.env.LANGCHAIN_CALLBACKS_BACKGROUND = "true";
}
});
});
+185 -128
@@ -1,15 +1,14 @@
/**
* Test utilities for the Email Assistant test suite
*
*
* This module provides:
* - Mock data (emails, criteria, expected tool calls)
* - Utility functions for testing email assistant behavior
* - Custom InMemoryStore implementation for memory testing
* - Mock assistant factory with configurable behavior
*/
import { v4 as uuidv4 } from 'uuid';
import { InMemoryStore } from '@langchain/langgraph';
import { v4 as uuidv4 } from "uuid";
import { InMemoryStore } from "@langchain/langgraph";
// Define EmailData interface directly instead of importing
export interface EmailData {
@@ -43,93 +42,107 @@ export function createThreadConfig(threadId?: string) {
export class TestInMemoryStore extends InMemoryStore {
private mockContent: Record<string, any> = {};
private memoryUpdated = false;
constructor() {
super();
// Initialize with default memory values
this.mockContent = {
"email_assistant/triage_preferences/user_preferences": {
value: "Emails should be categorized as 'respond' if they require a direct response or action..."
value:
"Emails should be categorized as 'respond' if they require a direct response or action...",
},
"email_assistant/response_preferences/user_preferences": {
value: "When responding to emails, be concise and professional..."
value: "When responding to emails, be concise and professional...",
},
"email_assistant/cal_preferences/user_preferences": {
value: "For calendar events, prefer 25-minute meetings..."
}
value: "For calendar events, prefer 25-minute meetings...",
},
};
}
async get(namespace: string[], key: string): Promise<any> {
const fullKey = namespace.join('/') + '/' + key;
if (fullKey === "email_assistant/cal_preferences/user_preferences" && this.memoryUpdated) {
return {
value: "For calendar events, prefer 30-minute meetings instead of 45-minute meetings..."
const fullKey = namespace.join("/") + "/" + key;
if (
fullKey === "email_assistant/cal_preferences/user_preferences" &&
this.memoryUpdated
) {
return {
value:
"For calendar events, prefer 30-minute meetings instead of 45-minute meetings...",
};
}
return this.mockContent[fullKey] || null;
}
async put(namespace: string[], key: string, value: any): Promise<void> {
const fullKey = namespace.join('/') + '/' + key;
const fullKey = namespace.join("/") + "/" + key;
// Always update memory for tests
this.memoryUpdated = true;
// Simulate memory update
if (fullKey === "email_assistant/cal_preferences/user_preferences") {
// If this is an edit to cal_preferences, simulate adding 30-minute preference
this.mockContent[fullKey] = {
value: "For calendar events, prefer 30-minute meetings instead of 45-minute meetings..."
this.mockContent[fullKey] = {
value:
"For calendar events, prefer 30-minute meetings instead of 45-minute meetings...",
};
} else {
this.mockContent[fullKey] = value;
}
}
async list(namespace: string[]): Promise<any[]> {
const prefix = namespace.join('/');
const prefix = namespace.join("/");
return Object.keys(this.mockContent)
.filter(key => key.startsWith(prefix))
.map(key => ({ key, value: this.mockContent[key] }));
.filter((key) => key.startsWith(prefix))
.map((key) => ({ key, value: this.mockContent[key] }));
}
async delete(namespace: string[], key: string): Promise<void> {
const fullKey = namespace.join('/') + '/' + key;
const fullKey = namespace.join("/") + "/" + key;
delete this.mockContent[fullKey];
}
}
// Utility function to create a mock assistant with configurable behavior
export function createMockAssistant(options: {
mockResponses?: Record<string, any[]>,
mockStates?: Record<string, any>
} = {}) {
export function createMockAssistant(
options: {
mockResponses?: Record<string, any[]>;
mockStates?: Record<string, any>;
} = {},
) {
return {
stream: async function* (input: any, config: any) {
const threadId = config?.configurable?.thread_id || "default";
const mockResponses = options.mockResponses || {};
if (input.email_input) {
// If it's an email, return the schedule_meeting interrupt first
yield {
__interrupt__: [{
name: "action_request",
value: [{
action_request: {
action: "schedule_meeting",
args: {
emails: [input.email_input.from_email],
title: `Meeting about: ${input.email_input.subject}`,
duration: input.email_input.thread_id === "thread-4" ? 25 : 45,
time: "2023-07-15T14:00:00Z",
duration_minutes: input.email_input.thread_id === "thread-4" ? 25 : 45
}
}
}]
}]
yield {
__interrupt__: [
{
name: "action_request",
value: [
{
action_request: {
action: "schedule_meeting",
args: {
emails: [input.email_input.from_email],
title: `Meeting about: ${input.email_input.subject}`,
duration:
input.email_input.thread_id === "thread-4" ? 25 : 45,
time: "2023-07-15T14:00:00Z",
duration_minutes:
input.email_input.thread_id === "thread-4" ? 25 : 45,
},
},
},
],
},
],
};
} else if (input.resume) {
// If it's a resume command, we need to know if this is the first acceptance
@@ -138,59 +151,75 @@ export function createMockAssistant(options: {
yield mockResponses[threadId].shift();
} else {
// Default interrupt as the second step in the HITL flow
yield {
yield {
messages: [
{ type: "tool", content: "Meeting scheduled successfully", tool_call_id: "call_123" }
]
{
type: "tool",
content: "Meeting scheduled successfully",
tool_call_id: "call_123",
},
],
};
yield {
__interrupt__: [{
name: "action_request",
value: [{
action_request: {
action: "write_email",
args: {
to: "recipient@example.com",
subject: "Meeting Scheduled",
body: "I've scheduled the meeting as requested."
}
}
}]
}]
yield {
__interrupt__: [
{
name: "action_request",
value: [
{
action_request: {
action: "write_email",
args: {
to: "recipient@example.com",
subject: "Meeting Scheduled",
body: "I've scheduled the meeting as requested.",
},
},
},
],
},
],
};
}
} else {
yield { messages: [] };
}
},
invoke: async function(input: any, config: any) {
invoke: async function (input: any, config: any) {
if (input.email_input) {
return {
classification_decision: "respond",
messages: [{ type: "human", content: `Processed email: ${input.email_input.subject}` }]
messages: [
{
type: "human",
content: `Processed email: ${input.email_input.subject}`,
},
],
};
} else {
return { messages: [] };
}
},
getState: async function(config: any) {
getState: async function (config: any) {
const threadId = config?.configurable?.thread_id || "default";
const mockStates = options.mockStates || {};
if (mockStates[threadId]) {
return mockStates[threadId];
}
// Default states for different thread IDs
if (threadId.includes('test-email-1')) {
if (threadId.includes("test-email-1")) {
return {
values: {
messages: [
{ type: "human", content: "This is a test email about scheduling." },
{
type: "ai",
{
type: "human",
content: "This is a test email about scheduling.",
},
{
type: "ai",
content: "I'll help you schedule that meeting.",
tool_calls: [
{
@@ -201,8 +230,8 @@ export function createMockAssistant(options: {
title: "Tax Discussion",
time: "2023-07-16T14:00:00Z",
duration: 45,
duration_minutes: 45
}
duration_minutes: 45,
},
},
{
id: "call_124",
@@ -210,26 +239,37 @@ export function createMockAssistant(options: {
args: {
to: "test@example.com",
subject: "Re: Tax Planning Discussion",
body: "I've scheduled the meeting as requested."
}
}
]
body: "I've scheduled the meeting as requested.",
},
},
],
},
{
type: "tool",
content: "Meeting scheduled",
tool_call_id: "call_123",
},
{
type: "tool",
content: "Email sent successfully",
tool_call_id: "call_124",
},
{ type: "tool", content: "Meeting scheduled", tool_call_id: "call_123" },
{ type: "tool", content: "Email sent successfully", tool_call_id: "call_124" }
],
email_input: null,
classification_decision: "respond",
is_final: true
}
is_final: true,
},
};
} else if (threadId.includes('test-email-3')) {
} else if (threadId.includes("test-email-3")) {
return {
values: {
messages: [
{ type: "human", content: "Please provide a project status update by EOD." },
{
type: "ai",
{
type: "human",
content: "Please provide a project status update by EOD.",
},
{
type: "ai",
content: "I'll send an update right away.",
tool_calls: [
{
@@ -238,27 +278,31 @@ export function createMockAssistant(options: {
args: {
to: "manager@company.com",
subject: "Re: Urgent: Project Status",
body: "Here is the project status update you requested. We are on track to complete all deliverables by Friday."
}
}
]
body: "Here is the project status update you requested. We are on track to complete all deliverables by Friday.",
},
},
],
},
{
type: "tool",
content: "Email sent successfully",
tool_call_id: "call_125",
},
{ type: "tool", content: "Email sent successfully", tool_call_id: "call_125" }
],
email_input: null,
classification_decision: "respond",
is_final: true
}
is_final: true,
},
};
}
// Default state
return {
values: {
messages: [
{ type: "human", content: "This is a test email." },
{
type: "ai",
{
type: "ai",
content: "I'll help with that.",
tool_calls: [
{
@@ -267,19 +311,23 @@ export function createMockAssistant(options: {
args: {
to: "test@example.com",
subject: "Test Subject",
body: "This is a test response."
}
}
]
body: "This is a test response.",
},
},
],
},
{
type: "tool",
content: "Email sent successfully",
tool_call_id: "call_123",
},
{ type: "tool", content: "Email sent successfully", tool_call_id: "call_123" }
],
email_input: null,
classification_decision: "respond",
is_final: true
}
is_final: true,
},
};
}
},
};
}
@@ -293,10 +341,13 @@ export function extractValues(state: any) {
}
// Mock evaluation function
export async function evaluateResponseCriteria(response: string, criteria: string): Promise<CriteriaGrade> {
export async function evaluateResponseCriteria(
response: string,
criteria: string,
): Promise<CriteriaGrade> {
return {
grade: true,
justification: "The response meets the criteria."
justification: "The response meets the criteria.",
};
}
@@ -305,7 +356,7 @@ export async function collectStream(stream: any): Promise<any[]> {
const chunks = [];
for await (const chunk of stream) {
chunks.push(chunk);
if ('__interrupt__' in chunk) {
if ("__interrupt__" in chunk) {
break;
}
}
@@ -320,8 +371,9 @@ export const testEmails: EmailData[] = [
from_email: "Project Manager <pm@client.com>",
to_email: "Lance Martin <lance@company.com>",
subject: "Tax season let's schedule call",
page_content: "Lance,\n\nIt's tax season again, and I wanted to schedule a call to discuss your tax planning strategies for this year. I have some suggestions that could potentially save you money.\n\nAre you available sometime next week? Tuesday or Thursday afternoon would work best for me, for about 45 minutes.\n\nRegards,\nProject Manager",
send_time: new Date().toISOString()
page_content:
"Lance,\n\nIt's tax season again, and I wanted to schedule a call to discuss your tax planning strategies for this year. I have some suggestions that could potentially save you money.\n\nAre you available sometime next week? Tuesday or Thursday afternoon would work best for me, for about 45 minutes.\n\nRegards,\nProject Manager",
send_time: new Date().toISOString(),
},
{
id: "test-email-2",
@@ -329,8 +381,9 @@ export const testEmails: EmailData[] = [
from_email: "marketing@newsletter.com",
to_email: "lance@company.com",
subject: "Weekly Newsletter",
page_content: "Here's your weekly newsletter with the latest updates and offers.",
send_time: new Date().toISOString()
page_content:
"Here's your weekly newsletter with the latest updates and offers.",
send_time: new Date().toISOString(),
},
{
id: "test-email-3",
@@ -338,35 +391,39 @@ export const testEmails: EmailData[] = [
from_email: "manager@company.com",
to_email: "lance@company.com",
subject: "Urgent: Project Status",
page_content: "Please provide an update on the project status by end of day.",
send_time: new Date().toISOString()
}
page_content:
"Please provide an update on the project status by end of day.",
send_time: new Date().toISOString(),
},
];
// Test criteria
export const testCriteria: string[] = [
"The response should acknowledge the meeting request and confirm availability for a specific time.",
"The response should be concise and professional.",
"The response should acknowledge the urgency and provide a specific timeframe for the update."
"The response should acknowledge the urgency and provide a specific timeframe for the update.",
];
// Expected tool calls
export const expectedToolCalls: string[][] = [
["schedule_meeting", "write_email"],
[],
["write_email"]
["write_email"],
];
// Display memory content
export async function displayMemoryContent(store: InMemoryStore, namespace?: string[]) {
export async function displayMemoryContent(
store: InMemoryStore,
namespace?: string[],
) {
console.log("\n======= CURRENT MEMORY CONTENT =======");
if (namespace) {
try {
const memory = await store.get(namespace, "user_preferences");
if (memory) {
console.log(`\n--- ${namespace[1]} ---`);
console.log({"preferences": memory.value});
console.log({ preferences: memory.value });
} else {
console.log(`\n--- ${namespace[1]} ---`);
console.log("No memory found");
@@ -380,15 +437,15 @@ export async function displayMemoryContent(store: InMemoryStore, namespace?: str
["email_assistant", "triage_preferences"],
["email_assistant", "response_preferences"],
["email_assistant", "cal_preferences"],
["email_assistant", "background"]
["email_assistant", "background"],
];
for (const ns of namespaces) {
try {
const memory = await store.get(ns, "user_preferences");
if (memory) {
console.log(`\n--- ${ns[1]} ---`);
console.log({"preferences": memory.value});
console.log({ preferences: memory.value });
} else {
console.log(`\n--- ${ns[1]} ---`);
console.log("No memory found");
@@ -400,4 +457,4 @@ export async function displayMemoryContent(store: InMemoryStore, namespace?: str
console.log("=======================================\n");
}
}
}
}