Files
John Doe 98a9fc7f81 feat: Implement A2A Protocol infrastructure
- Added Redis-based A2A messaging skill (skills/a2a-message-send/a2a-redis.js)
  - sendMessage, broadcast, getMessages, pingAgent functions
  - Message persistence via Redis lists
  - Agent registration and discovery
  - Inbox management (count, clear, mark as read)

- Added Redis-WebSocket bridge module (modules/communication/redis-websocket-bridge.js)
  - Bridges Redis pub/sub to WebSocket clients
  - Real-time message forwarding for dashboard
  - Client management and heartbeat support

- Added OpenClaw Gateway server (gateway/openclaw-gateway.js)
  - WebSocket RPC server on port 18789
  - HTTP health endpoints on port 18788
  - Agent registration and message routing
  - Redis integration for offline message queuing

- Added Docker configuration
  - docker-compose.redis.yml (Redis service)
  - docker-compose.gateway.yml (Gateway service)
  - Dockerfile.gateway (Gateway container)

- Added documentation
  - DEBUG_A2A.md (debug report with findings and fixes)
  - skills/a2a-message-send/SKILL.md (skill documentation)

Fixes: A2A protocol was non-functional due to missing implementation
components. Tests referenced modules that didn't exist.
The Collective could not communicate between agents.
2026-04-01 12:29:33 -04:00

17 KiB

A2A Protocol Debug Report

Date: 2026-04-01
Status: Fixed
Version: 1.0.0

Executive Summary

The A2A (Agent-to-Agent) Protocol in OpenClaw was non-functional due to missing implementation components. This report documents the bugs found, root causes identified, and fixes applied.


Bugs Found

Bug 1: Missing Redis-based A2A Skill Module

Severity: Critical
File: skills/a2a-message-send/a2a-redis.js
Status: Fixed

Description: The test files (tests/skills/a2a-message-send.test.js and tests/integration/a2a-communication.test.ts) reference a module at skills/a2a-message-send/a2a-redis.js that did not exist. This module is responsible for Redis-based agent-to-agent messaging.

Expected Functions (from tests):

  • sendMessage(from, to, content, options)
  • broadcast(from, content)
  • broadcastToAgents(from, agents, content)
  • broadcastToTriad(from, content)
  • getMessages(agentId, limit)
  • getUnreadMessages(agentId, limit)
  • markAsRead(agentId, messageId)
  • countMessages(agentId)
  • clearMessages(agentId)
  • pingAgent(from, to)
  • pingTriad(from)
  • validateMessage(message)
  • validateAgentId(agentId)
  • registerAgent(agentId, metadata)
  • unregisterAgent(agentId)
  • getRegisteredAgents()

Impact:

  • Agents could not communicate via Redis
  • Message persistence was unavailable
  • Tests failed with MODULE_NOT_FOUND errors
  • The Collective could not coordinate actions

Bug 2: Missing Redis-WebSocket Bridge Module

Severity: Critical
File: modules/communication/redis-websocket-bridge.js
Status: Fixed

Description: The test file tests/unit/redis-bridge.test.ts and tests/integration/websocket-bridge.test.ts reference a module at modules/communication/redis-websocket-bridge.js that did not exist. The entire modules/ directory was missing.

Expected Class (from tests):

  • RedisToWebSocketBridge
    • start() - Start the bridge
    • stop() - Stop the bridge
    • broadcast(message) - Broadcast to WebSocket clients
    • getStatus() - Get bridge status
    • clients - Set of connected WebSocket clients
    • redisClient - Redis pub/sub client
    • isRunning - Running status flag

Expected Channels:

  • CHANNELS.A2A - 'openclaw:a2a:broadcast'
  • CHANNELS.HEARTBEAT - 'openclaw:a2a:heartbeat'

Impact:

  • No real-time WebSocket updates from Redis pub/sub
  • Dashboard could not receive live A2A message updates
  • Tests failed with import errors

Bug 3: Missing Gateway Server Implementation

Severity: Critical
File: gateway/openclaw-gateway.js
Status: Fixed

Description: The agent-client.js library contains a GatewayClient class that connects to a WebSocket server at ws://127.0.0.1:18789, but no gateway server implementation existed to listen on this port.

Expected Gateway Features:

  • WebSocket server on port 18789 at path /a2a
  • HTTP endpoints for health checks on port 18788
  • Agent registration and discovery
  • Message routing between agents
  • Redis integration for message persistence
  • Broadcast support
  • Health check/ping endpoints

Impact:

  • Agent WebSocket connections failed immediately
  • No A2A message routing
  • Agent discovery impossible
  • Gateway connection errors in agent logs

Bug 4: Architecture Mismatch

Severity: High
Files: Multiple
Status: Documented and Resolved

Description: The codebase had conflicting architectural approaches:

  1. agent-client.js uses WebSocket Gateway RPC (port 18789)
  2. Tests expect Redis pub/sub messaging
  3. LiteLLM A2A protocol (litellm/litellm/a2a_protocol/) is designed for external A2A SDK agents, not internal OpenClaw communication

Root Cause: Architecture shifted to Gateway-based WebSocket RPC but:

  • Gateway server was never implemented
  • Tests weren't updated to match new architecture
  • Redis infrastructure existed in docker-compose.yml but had no consumers

Resolution: Implemented BOTH approaches:

  • Redis-based A2A messaging for persistence and async communication
  • Gateway WebSocket RPC for real-time agent communication
  • Bridge module to connect Redis pub/sub to WebSocket clients

Bug 5: Docker Compose Not Modular

Severity: Medium
File: docker-compose.yml
Status: Fixed

Description: The monolithic docker-compose.yml made it difficult to deploy Redis and Gateway services independently. Additionally, the Redis-to-WebSocket bridge service was commented out with a note that Dockerfile.websocket-bridge was missing.

Resolution: Created modular compose files:

  • docker-compose.redis.yml - Redis service
  • docker-compose.gateway.yml - Gateway service
  • Dockerfile.gateway - Gateway container image

Fixes Applied

Fix 1: Created Redis A2A Skill Module

File: skills/a2a-message-send/a2a-redis.js

Implementation Details:

  • Full Redis-based messaging with ioredis
  • Message persistence using Redis lists
  • Agent registration using Redis sets
  • Broadcast via Redis pub/sub
  • Inbox management (get, count, clear, mark as read)
  • Ping/pong health checks with latency measurement
  • Message validation
  • Priority messaging support
  • Known agents list (22 agents in the collective)

Redis Data Structures:

openclaw:a2a:inbox:{agentId}    - List of messages
openclaw:a2a:agents             - Set of registered agents
openclaw:a2a:agent:{agentId}    - Hash with agent metadata
openclaw:a2a:broadcast          - Pub/sub channel
openclaw:a2a:read:{agentId}     - Set of read message IDs

Fix 2: Created Redis-WebSocket Bridge Module

File: modules/communication/redis-websocket-bridge.js

Implementation Details:

  • RedisToWebSocketBridge class extending EventEmitter
  • WebSocket server for client connections
  • Redis pub/sub subscription
  • Message forwarding from Redis to WebSocket clients
  • Client management (add, remove, count)
  • Heartbeat/ping-pong support
  • Automatic Redis reconnection with exponential backoff
  • Singleton pattern with getBridge(), startBridge(), stopBridge() functions

Architecture:

Redis Pub/Sub --> Bridge --> WebSocket Clients
     │
     └── Subscribe to: openclaw:a2a:broadcast
                         openclaw:a2a:heartbeat

Fix 3: Created OpenClaw Gateway Server

File: gateway/openclaw-gateway.js

Implementation Details:

  • OpenClawGateway class extending EventEmitter
  • WebSocket server on port 18789 at /a2a
  • HTTP server on port 18788 for health endpoints
  • Agent registration and tracking
  • Message routing between connected agents
  • Message queuing in Redis for offline agents
  • Broadcast to all connected agents
  • Agent discovery endpoint
  • Health check endpoints
  • Heartbeat mechanism for connection management

HTTP Endpoints:

  • GET /health - Gateway status
  • GET /agents - Connected agents list

WebSocket Message Types:

  • register - Agent registration
  • message - A2A message routing
  • response - Response to pending message
  • broadcast - Broadcast to all agents
  • ping/pong - Health check
  • discover - Get agent list
  • health - Get gateway status

Fix 4: Created Docker Configuration

Files:

  • docker-compose.redis.yml - Redis service
  • docker-compose.gateway.yml - Gateway service
  • Dockerfile.gateway - Gateway container

Usage:

# Start Redis only
docker compose -f docker-compose.redis.yml up -d

# Start Gateway with Redis
docker compose -f docker-compose.redis.yml -f docker-compose.gateway.yml up -d

# Start full stack
docker compose -f docker-compose.yml -f docker-compose.redis.yml -f docker-compose.gateway.yml up -d

Fix 5: Created Skill Documentation

File: skills/a2a-message-send/SKILL.md

Comprehensive documentation including:

  • Architecture overview
  • Installation instructions
  • Usage examples
  • API reference tables
  • Message format specification
  • Error handling guide
  • Testing instructions

Testing

Run Unit Tests

# A2A message send skill tests
node --test tests/skills/a2a-message-send.test.js

# Redis bridge tests
npm run test:unit -- redis-bridge

# Gateway tests (when available)
npm run test:unit -- gateway

Run Integration Tests

# A2A communication integration tests
npm run test:integration -- a2a-communication

# WebSocket bridge integration tests
npm run test:integration -- websocket-bridge

Manual Testing

# Start Redis
docker compose -f docker-compose.redis.yml up -d

# Test Redis connection
redis-cli ping
# Expected: PONG

# Start Gateway
docker compose -f docker-compose.gateway.yml up -d

# Test Gateway health
curl http://localhost:18788/health

# Test WebSocket connection
wscat -c ws://localhost:18789/a2a

Verification Checklist

  • Redis A2A skill module created (skills/a2a-message-send/a2a-redis.js)
  • Redis-WebSocket bridge module created (modules/communication/redis-websocket-bridge.js)
  • Gateway server created (gateway/openclaw-gateway.js)
  • Docker compose files created (docker-compose.redis.yml, docker-compose.gateway.yml)
  • Gateway Dockerfile created (Dockerfile.gateway)
  • Skill documentation created (skills/a2a-message-send/SKILL.md)
  • Tests passing (requires manual verification)
  • Redis service running (requires deployment)
  • Gateway service running (requires deployment)

Deployment Steps

1. Start Redis Service

cd heretek-openclaw-core
docker compose -f docker-compose.redis.yml up -d

# Verify
docker compose -f docker-compose.redis.yml ps
# Should show: redis - running

2. Start Gateway Service

docker compose -f docker-compose.gateway.yml up -d

# Verify
docker compose -f docker-compose.gateway.yml ps
# Should show: gateway - running

3. Verify Services

# Check Redis
docker exec heretek-redis redis-cli ping
# Expected: PONG

# Check Gateway health
curl http://localhost:18788/health
# Expected: {"running":true,"port":18789,...}

# Check Gateway logs
docker logs heretek-gateway
# Should show: "[Gateway] OpenClaw Gateway running on..."

4. Test A2A Communication

// Test from Node.js
const { sendMessage, getMessages, pingAgent } = require('./skills/a2a-message-send/a2a-redis.js');

async function test() {
    // Register agents
    await registerAgent('steward', { role: 'orchestrator' });
    await registerAgent('alpha', { role: 'triad' });
    
    // Send message
    const result = await sendMessage('steward', 'alpha', 'Hello Alpha!');
    console.log('Send result:', result);
    
    // Get messages
    const messages = await getMessages('alpha', 10);
    console.log('Alpha inbox:', messages);
    
    // Ping test
    const ping = await pingAgent('steward', 'alpha');
    console.log('Ping result:', ping);
}

test().catch(console.error);

Known Issues

  1. Redis Authentication - If Redis requires authentication, set REDIS_URL with password:

    REDIS_URL=redis://:password@host:6379
    
  2. Gateway Port Conflicts - If port 18789 is in use, change via environment:

    GATEWAY_PORT=18790
    
  3. Agent Registration - Agents must register with the Gateway on connection:

    ws.send(JSON.stringify({
        type: 'register',
        agentId: 'steward',
        metadata: { role: 'orchestrator' }
    }));
    

Architecture Diagram

┌─────────────────────────────────────────────────────────────────────────┐
│                        Heretek OpenClaw A2A Stack                       │
├─────────────────────────────────────────────────────────────────────────┤
│                                                                         │
│  ┌─────────────┐     ┌─────────────┐     ┌─────────────────────────┐   │
│  │   Agent A   │     │   Agent B   │     │      Agent C            │   │
│  │  (steward)  │     │   (alpha)   │     │    (beta)               │   │
│  │  port 8001  │     │  port 8002  │     │     port 8003           │   │
│  └──────┬──────┘     └──────┬──────┘     └───────────┬─────────────┘   │
│         │                   │                         │                 │
│         │   WebSocket RPC   │                         │                 │
│         │   ws://18789/a2a  │                         │                 │
│         ▼                   ▼                         ▼                 │
│  ┌─────────────────────────────────────────────────────────────────┐   │
│  │                    OpenClaw Gateway                              │   │
│  │  - Message Routing    - Agent Discovery    - Health Checks      │   │
│  │  - Broadcast          - Session Mgmt       - Redis Persistence  │   │
│  └─────────────────────────────────────────────────────────────────┘   │
│                                   │                                     │
│                                   │ Redis                               │
│                                   ▼                                     │
│  ┌─────────────────────────────────────────────────────────────────┐   │
│  │                         Redis                                    │   │
│  │  - Message Queues     - Agent Registry   - Pub/Sub Channels    │   │
│  │  - Inbox Lists        - Read Status      - Broadcast           │   │
│  └─────────────────────────────────────────────────────────────────┘   │
│                                   │                                     │
│                                   │ Pub/Sub                             │
│                                   ▼                                     │
│  ┌─────────────────────────────────────────────────────────────────┐   │
│  │               Redis-WebSocket Bridge                             │   │
│  │  - Subscribe to Redis    - Forward to WS Clients    - Clients   │   │
│  └─────────────────────────────────────────────────────────────────┘   │
│                                   │                                     │
│                                   ▼                                     │
│  ┌─────────────────────────────────────────────────────────────────┐   │
│  │                    Web Dashboard                                 │   │
│  │  - Real-time A2A updates    - Agent status    - Message logs   │   │
│  └─────────────────────────────────────────────────────────────────┘   │
│                                                                         │
└─────────────────────────────────────────────────────────────────────────┘

Conclusion

The A2A Protocol issues have been resolved by implementing the missing components:

  1. Redis-based messaging for persistent, async agent communication
  2. Gateway WebSocket server for real-time RPC communication
  3. Redis-WebSocket bridge for live dashboard updates

The system now supports both synchronous (WebSocket RPC) and asynchronous (Redis queues) communication patterns, providing flexibility for different use cases within the OpenClaw collective.


References