# A2A Protocol Debug Report

**Date:** 2026-04-01
**Status:** Fixed
**Version:** 1.0.0

## Executive Summary

The A2A (Agent-to-Agent) Protocol in OpenClaw was non-functional due to missing implementation components. This report documents the bugs found, root causes identified, and fixes applied.

---

## Bugs Found

### Bug 1: Missing Redis-based A2A Skill Module

**Severity:** Critical
**File:** `skills/a2a-message-send/a2a-redis.js`
**Status:** ✅ Fixed

**Description:**
The test files (`tests/skills/a2a-message-send.test.js` and `tests/integration/a2a-communication.test.ts`) reference a module at `skills/a2a-message-send/a2a-redis.js` that did not exist. This module is responsible for Redis-based agent-to-agent messaging.

**Expected Functions (from tests):**
- `sendMessage(from, to, content, options)`
- `broadcast(from, content)`
- `broadcastToAgents(from, agents, content)`
- `broadcastToTriad(from, content)`
- `getMessages(agentId, limit)`
- `getUnreadMessages(agentId, limit)`
- `markAsRead(agentId, messageId)`
- `countMessages(agentId)`
- `clearMessages(agentId)`
- `pingAgent(from, to)`
- `pingTriad(from)`
- `validateMessage(message)`
- `validateAgentId(agentId)`
- `registerAgent(agentId, metadata)`
- `unregisterAgent(agentId)`
- `getRegisteredAgents()`

**Impact:**
- Agents could not communicate via Redis
- Message persistence was unavailable
- Tests failed with `MODULE_NOT_FOUND` errors
- The Collective could not coordinate actions

---

### Bug 2: Missing Redis-WebSocket Bridge Module

**Severity:** Critical
**File:** `modules/communication/redis-websocket-bridge.js`
**Status:** ✅ Fixed

**Description:**
The test files `tests/unit/redis-bridge.test.ts` and `tests/integration/websocket-bridge.test.ts` reference a module at `modules/communication/redis-websocket-bridge.js` that did not exist. The entire `modules/` directory was missing.

**Expected Class (from tests):**
- `RedisToWebSocketBridge`
  - `start()` - Start the bridge
  - `stop()` - Stop the bridge
  - `broadcast(message)` - Broadcast to WebSocket clients
  - `getStatus()` - Get bridge status
  - `clients` - Set of connected WebSocket clients
  - `redisClient` - Redis pub/sub client
  - `isRunning` - Running status flag

**Expected Channels:**
- `CHANNELS.A2A` - 'openclaw:a2a:broadcast'
- `CHANNELS.HEARTBEAT` - 'openclaw:a2a:heartbeat'

**Impact:**
- No real-time WebSocket updates from Redis pub/sub
- Dashboard could not receive live A2A message updates
- Tests failed with import errors

---

### Bug 3: Missing Gateway Server Implementation

**Severity:** Critical
**File:** `gateway/openclaw-gateway.js`
**Status:** ✅ Fixed

**Description:**
The `agent-client.js` library contains a `GatewayClient` class that connects to a WebSocket server at `ws://127.0.0.1:18789`, but no gateway server implementation existed to listen on this port.

**Expected Gateway Features:**
- WebSocket server on port 18789 at path `/a2a`
- HTTP endpoints for health checks on port 18788
- Agent registration and discovery
- Message routing between agents
- Redis integration for message persistence
- Broadcast support
- Health check/ping endpoints

**Impact:**
- Agent WebSocket connections failed immediately
- No A2A message routing
- Agent discovery impossible
- Gateway connection errors in agent logs

---

### Bug 4: Architecture Mismatch

**Severity:** High
**Files:** Multiple
**Status:** ✅ Documented and Resolved

**Description:**
The codebase had conflicting architectural approaches:
1. `agent-client.js` uses WebSocket Gateway RPC (port 18789)
2. Tests expect Redis pub/sub messaging
3. LiteLLM A2A protocol (`litellm/litellm/a2a_protocol/`) is designed for external A2A SDK agents, not internal OpenClaw communication

**Root Cause:**
Architecture shifted to Gateway-based WebSocket RPC but:
- Gateway server was never implemented
- Tests weren't updated to match new architecture
- Redis infrastructure existed in docker-compose.yml but had no consumers

**Resolution:**
Implemented BOTH approaches:
- Redis-based A2A messaging for persistence and async communication
- Gateway WebSocket RPC for real-time agent communication
- Bridge module to connect Redis pub/sub to WebSocket clients

---

### Bug 5: Docker Compose Not Modular

**Severity:** Medium
**File:** `docker-compose.yml`
**Status:** ✅ Fixed

**Description:**
The monolithic `docker-compose.yml` made it difficult to deploy Redis and Gateway services independently. Additionally, the Redis-to-WebSocket bridge service was commented out with a note that `Dockerfile.websocket-bridge` was missing.

**Resolution:**
Created modular compose files:
- `docker-compose.redis.yml` - Redis service
- `docker-compose.gateway.yml` - Gateway service
- `Dockerfile.gateway` - Gateway container image

---

## Fixes Applied

### Fix 1: Created Redis A2A Skill Module

**File:** `skills/a2a-message-send/a2a-redis.js`

**Implementation Details:**
- Full Redis-based messaging with ioredis
- Message persistence using Redis lists
- Agent registration using Redis sets
- Broadcast via Redis pub/sub
- Inbox management (get, count, clear, mark as read)
- Ping/pong health checks with latency measurement
- Message validation
- Priority messaging support
- Known agents list (22 agents in the collective)

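
The report lists message validation among the implemented features but does not state the rules. The following is a minimal sketch of what `validateAgentId` and `validateMessage` might check. The ID pattern, the `{ valid, errors }` return shape, and the required `from`/`to`/`content` fields are all assumptions made for illustration, not the module's confirmed behavior.

```javascript
// Hypothetical validation rules; the real module may accept a different ID format.
const AGENT_ID_PATTERN = /^[a-z][a-z0-9_-]{0,63}$/;

// Returns true when agentId is a lowercase identifier such as 'steward' or 'alpha'.
function validateAgentId(agentId) {
  return typeof agentId === 'string' && AGENT_ID_PATTERN.test(agentId);
}

// Collects every problem instead of stopping at the first one,
// so a caller can report all of them at once.
function validateMessage(message) {
  if (message === null || typeof message !== 'object') {
    return { valid: false, errors: ['message must be an object'] };
  }
  const errors = [];
  if (!validateAgentId(message.from)) errors.push('invalid "from" agent id');
  if (!validateAgentId(message.to)) errors.push('invalid "to" agent id');
  if (typeof message.content !== 'string' || message.content.length === 0) {
    errors.push('content must be a non-empty string');
  }
  return { valid: errors.length === 0, errors };
}
```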

**Redis Data Structures:**
```
openclaw:a2a:inbox:{agentId}   - List of messages
openclaw:a2a:agents            - Set of registered agents
openclaw:a2a:agent:{agentId}   - Hash with agent metadata
openclaw:a2a:broadcast         - Pub/sub channel
openclaw:a2a:read:{agentId}    - Set of read message IDs
```

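
To make the key layout concrete, here is a small sketch of key-building helpers plus an in-memory model of how unread messages could be derived from the inbox list and the read set. The helper names (`keys`, `selectUnread`) are hypothetical, and the assumption that inbox entries are JSON strings carrying an `id` field is not confirmed by this report.

```javascript
const PREFIX = 'openclaw:a2a';

// Builders for the Redis keys listed above.
const keys = {
  inbox: (agentId) => `${PREFIX}:inbox:${agentId}`,
  agents: () => `${PREFIX}:agents`,
  agent: (agentId) => `${PREFIX}:agent:${agentId}`,
  broadcast: () => `${PREFIX}:broadcast`,
  read: (agentId) => `${PREFIX}:read:${agentId}`,
};

// In-memory stand-in for an unread query.
// inboxEntries: array of JSON strings, as they might be stored in the inbox list (assumption).
// readIds: Set of message IDs, mirroring the read set for the same agent.
function selectUnread(inboxEntries, readIds, limit = 10) {
  const unread = [];
  for (const entry of inboxEntries) {
    if (unread.length >= limit) break;
    const message = JSON.parse(entry);
    if (!readIds.has(message.id)) unread.push(message);
  }
  return unread;
}
```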
---

### Fix 2: Created Redis-WebSocket Bridge Module

**File:** `modules/communication/redis-websocket-bridge.js`

**Implementation Details:**
- `RedisToWebSocketBridge` class extending EventEmitter
- WebSocket server for client connections
- Redis pub/sub subscription
- Message forwarding from Redis to WebSocket clients
- Client management (add, remove, count)
- Heartbeat/ping-pong support
- Automatic Redis reconnection with exponential backoff
- Singleton pattern with `getBridge()`, `startBridge()`, `stopBridge()` functions

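
The reconnection policy is described only as "exponential backoff". A minimal sketch of such a delay function follows. The function name, the 500 ms base, and the 30 s cap are illustrative assumptions. ioredis accepts a `retryStrategy(times)` option that returns a delay in milliseconds, so a function of this shape could plausibly be wired in there, though this report does not say how the bridge configures it.

```javascript
// Hypothetical backoff helper: the delay doubles with each attempt and is capped at maxMs.
// attempt is zero-based: 0 -> baseMs, 1 -> 2 * baseMs, 2 -> 4 * baseMs, ...
function backoffDelay(attempt, baseMs = 500, maxMs = 30000) {
  const delay = baseMs * 2 ** attempt;
  return Math.min(delay, maxMs);
}
```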

**Architecture:**
```
Redis Pub/Sub --> Bridge --> WebSocket Clients
     │
     └── Subscribe to: openclaw:a2a:broadcast
                       openclaw:a2a:heartbeat
```

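
The forwarding step in the flow above can be sketched as a single function that fans a Redis message out to every open client. This is an illustration under assumptions: the function name `forwardToClients` and the `{ channel, data, forwardedAt }` frame shape are hypothetical, and clients are modeled as any object with `readyState` and `send`, which matches the standard WebSocket interface.

```javascript
// Numeric value of the OPEN ready state in the standard WebSocket interface.
const WS_OPEN = 1;

// Sends one Redis pub/sub message to every client whose socket is open.
// clients: an iterable (for example a Set) of WebSocket-like objects.
// Returns the number of clients the frame was sent to.
function forwardToClients(clients, channel, rawMessage) {
  const frame = JSON.stringify({ channel, data: rawMessage, forwardedAt: Date.now() });
  let delivered = 0;
  for (const client of clients) {
    if (client.readyState !== WS_OPEN) continue; // skip connecting or closed sockets
    client.send(frame);
    delivered += 1;
  }
  return delivered;
}
```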
---

### Fix 3: Created OpenClaw Gateway Server

**File:** `gateway/openclaw-gateway.js`

**Implementation Details:**
- `OpenClawGateway` class extending EventEmitter
- WebSocket server on port 18789 at `/a2a`
- HTTP server on port 18788 for health endpoints
- Agent registration and tracking
- Message routing between connected agents
- Message queuing in Redis for offline agents
- Broadcast to all connected agents
- Agent discovery endpoint
- Health check endpoints
- Heartbeat mechanism for connection management

**HTTP Endpoints:**
- `GET /health` - Gateway status
- `GET /agents` - Connected agents list

**WebSocket Message Types:**
- `register` - Agent registration
- `message` - A2A message routing
- `response` - Response to pending message
- `broadcast` - Broadcast to all agents
- `ping/pong` - Health check
- `discover` - Get agent list
- `health` - Get gateway status

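
As a rough illustration of how a gateway might dispatch on these message types, the sketch below routes a few of them against in-memory state. It is not the gateway's actual code: the function names, the reply types (`registered`, `delivered`, `queued`, `agents`), the `to` field on `message` frames, and the in-memory offline queue (standing in for Redis) are all assumptions. Only the `register` frame fields (`type`, `agentId`, `metadata`) come from this report.

```javascript
// In-memory stand-ins: connected agents and a queue for agents that are offline.
function createGatewayState() {
  return { agents: new Map(), offlineQueue: new Map() };
}

// Parses one incoming frame and returns the reply the gateway would send back.
function handleFrame(state, socket, raw) {
  let frame;
  try {
    frame = JSON.parse(raw);
  } catch {
    return { type: 'error', error: 'invalid JSON' };
  }
  if (frame === null || typeof frame !== 'object') {
    return { type: 'error', error: 'frame must be an object' };
  }
  switch (frame.type) {
    case 'register':
      state.agents.set(frame.agentId, { socket, metadata: frame.metadata || {} });
      return { type: 'registered', agentId: frame.agentId };
    case 'message': {
      const target = state.agents.get(frame.to);
      if (target) {
        target.socket.send(JSON.stringify(frame)); // recipient is connected: deliver now
        return { type: 'delivered', to: frame.to };
      }
      const queue = state.offlineQueue.get(frame.to) || [];
      queue.push(frame); // recipient is offline: hold the message for later
      state.offlineQueue.set(frame.to, queue);
      return { type: 'queued', to: frame.to };
    }
    case 'ping':
      return { type: 'pong' };
    case 'discover':
      return { type: 'agents', agents: [...state.agents.keys()] };
    default:
      return { type: 'error', error: `unknown type: ${frame.type}` };
  }
}
```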
---

### Fix 4: Created Docker Configuration

**Files:**
- `docker-compose.redis.yml` - Redis service
- `docker-compose.gateway.yml` - Gateway service
- `Dockerfile.gateway` - Gateway container

**Usage:**
```bash
# Start Redis only
docker compose -f docker-compose.redis.yml up -d

# Start Gateway with Redis
docker compose -f docker-compose.redis.yml -f docker-compose.gateway.yml up -d

# Start full stack
docker compose -f docker-compose.yml -f docker-compose.redis.yml -f docker-compose.gateway.yml up -d
```

---

### Fix 5: Created Skill Documentation

**File:** `skills/a2a-message-send/SKILL.md`

Comprehensive documentation including:
- Architecture overview
- Installation instructions
- Usage examples
- API reference tables
- Message format specification
- Error handling guide
- Testing instructions

---

## Testing

### Run Unit Tests

```bash
# A2A message send skill tests
node --test tests/skills/a2a-message-send.test.js

# Redis bridge tests
npm run test:unit -- redis-bridge

# Gateway tests (when available)
npm run test:unit -- gateway
```

### Run Integration Tests

```bash
# A2A communication integration tests
npm run test:integration -- a2a-communication

# WebSocket bridge integration tests
npm run test:integration -- websocket-bridge
```

### Manual Testing

```bash
# Start Redis
docker compose -f docker-compose.redis.yml up -d

# Test Redis connection
redis-cli ping
# Expected: PONG

# Start Gateway
docker compose -f docker-compose.gateway.yml up -d

# Test Gateway health
curl http://localhost:18788/health

# Test WebSocket connection
wscat -c ws://localhost:18789/a2a
```

---

## Verification Checklist

- [x] Redis A2A skill module created (`skills/a2a-message-send/a2a-redis.js`)
- [x] Redis-WebSocket bridge module created (`modules/communication/redis-websocket-bridge.js`)
- [x] Gateway server created (`gateway/openclaw-gateway.js`)
- [x] Docker compose files created (`docker-compose.redis.yml`, `docker-compose.gateway.yml`)
- [x] Gateway Dockerfile created (`Dockerfile.gateway`)
- [x] Skill documentation created (`skills/a2a-message-send/SKILL.md`)
- [ ] Tests passing (requires manual verification)
- [ ] Redis service running (requires deployment)
- [ ] Gateway service running (requires deployment)

---

## Deployment Steps

### 1. Start Redis Service

```bash
cd heretek-openclaw-core
docker compose -f docker-compose.redis.yml up -d

# Verify
docker compose -f docker-compose.redis.yml ps
# Should show: redis - running
```

### 2. Start Gateway Service

```bash
docker compose -f docker-compose.gateway.yml up -d

# Verify
docker compose -f docker-compose.gateway.yml ps
# Should show: gateway - running
```

### 3. Verify Services

```bash
# Check Redis
docker exec heretek-redis redis-cli ping
# Expected: PONG

# Check Gateway health
curl http://localhost:18788/health
# Expected: {"running":true,"port":18789,...}

# Check Gateway logs
docker logs heretek-gateway
# Should show: "[Gateway] OpenClaw Gateway running on..."
```

### 4. Test A2A Communication

```javascript
// Test from Node.js
// registerAgent is imported as well, since the test below calls it.
const { registerAgent, sendMessage, getMessages, pingAgent } = require('./skills/a2a-message-send/a2a-redis.js');

async function test() {
  // Register agents
  await registerAgent('steward', { role: 'orchestrator' });
  await registerAgent('alpha', { role: 'triad' });

  // Send message
  const result = await sendMessage('steward', 'alpha', 'Hello Alpha!');
  console.log('Send result:', result);

  // Get messages
  const messages = await getMessages('alpha', 10);
  console.log('Alpha inbox:', messages);

  // Ping test
  const ping = await pingAgent('steward', 'alpha');
  console.log('Ping result:', ping);
}

test().catch(console.error);
```

---

## Known Issues

1. **Redis Authentication** - If Redis requires authentication, set `REDIS_URL` with password:
   ```
   REDIS_URL=redis://:password@host:6379
   ```

2. **Gateway Port Conflicts** - If port 18789 is in use, change via environment:
   ```
   GATEWAY_PORT=18790
   ```

3. **Agent Registration** - Agents must register with the Gateway on connection:
   ```javascript
   ws.send(JSON.stringify({
     type: 'register',
     agentId: 'steward',
     metadata: { role: 'orchestrator' }
   }));
   ```

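
For the two environment variables above, a configuration loader might look like the following sketch. The function name `loadConfig` and the fallback `redis://localhost:6379` are assumptions for illustration. The default port 18789 is taken from this report, but how the gateway actually parses `GATEWAY_PORT` is not documented here.

```javascript
// Hypothetical loader: reads REDIS_URL and GATEWAY_PORT, applying defaults when unset.
function loadConfig(env = process.env) {
  const port = Number.parseInt(env.GATEWAY_PORT ?? '18789', 10);
  // Reject values that are not usable TCP ports instead of failing later at listen time.
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid GATEWAY_PORT: ${env.GATEWAY_PORT}`);
  }
  return {
    redisUrl: env.REDIS_URL ?? 'redis://localhost:6379', // assumed default, not from the report
    gatewayPort: port,
  };
}
```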
---

## Architecture Diagram

```
┌─────────────────────────────────────────────────────────────────────────┐
│                       Heretek OpenClaw A2A Stack                        │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  ┌─────────────┐     ┌─────────────┐     ┌─────────────────────────┐    │
│  │   Agent A   │     │   Agent B   │     │         Agent C         │    │
│  │  (steward)  │     │   (alpha)   │     │         (beta)          │    │
│  │  port 8001  │     │  port 8002  │     │        port 8003        │    │
│  └──────┬──────┘     └──────┬──────┘     └───────────┬─────────────┘    │
│         │                   │                        │                  │
│         │ WebSocket RPC     │                        │                  │
│         │ ws://18789/a2a    │                        │                  │
│         ▼                   ▼                        ▼                  │
│  ┌─────────────────────────────────────────────────────────────────┐    │
│  │                        OpenClaw Gateway                         │    │
│  │  - Message Routing     - Agent Discovery    - Health Checks     │    │
│  │  - Broadcast           - Session Mgmt       - Redis Persistence │    │
│  └─────────────────────────────────────────────────────────────────┘    │
│                                   │                                     │
│                                   │ Redis                               │
│                                   ▼                                     │
│  ┌─────────────────────────────────────────────────────────────────┐    │
│  │                              Redis                              │    │
│  │  - Message Queues      - Agent Registry     - Pub/Sub Channels  │    │
│  │  - Inbox Lists         - Read Status        - Broadcast         │    │
│  └─────────────────────────────────────────────────────────────────┘    │
│                                   │                                     │
│                                   │ Pub/Sub                             │
│                                   ▼                                     │
│  ┌─────────────────────────────────────────────────────────────────┐    │
│  │                     Redis-WebSocket Bridge                      │    │
│  │  - Subscribe to Redis  - Forward to WS Clients  - Clients       │    │
│  └─────────────────────────────────────────────────────────────────┘    │
│                                   │                                     │
│                                   ▼                                     │
│  ┌─────────────────────────────────────────────────────────────────┐    │
│  │                          Web Dashboard                          │    │
│  │  - Real-time A2A updates   - Agent status    - Message logs     │    │
│  └─────────────────────────────────────────────────────────────────┘    │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘
```

---

## Conclusion

The A2A Protocol issues have been resolved by implementing the missing components:

1. **Redis-based messaging** for persistent, async agent communication
2. **Gateway WebSocket server** for real-time RPC communication
3. **Redis-WebSocket bridge** for live dashboard updates

The system now supports both synchronous (WebSocket RPC) and asynchronous (Redis queues) communication patterns, providing flexibility for different use cases within the OpenClaw collective.

---

## References

- [`skills/a2a-message-send/a2a-redis.js`](skills/a2a-message-send/a2a-redis.js) - Redis A2A module
- [`modules/communication/redis-websocket-bridge.js`](modules/communication/redis-websocket-bridge.js) - Redis-WS bridge
- [`gateway/openclaw-gateway.js`](gateway/openclaw-gateway.js) - Gateway server
- [`skills/a2a-message-send/SKILL.md`](skills/a2a-message-send/SKILL.md) - Skill documentation
- [`docker-compose.redis.yml`](docker-compose.redis.yml) - Redis service config
- [`docker-compose.gateway.yml`](docker-compose.gateway.yml) - Gateway service config
- [`Dockerfile.gateway`](Dockerfile.gateway) - Gateway Dockerfile