
Unlocking Collaborative AI: Integrating the A2A Protocol with LightSpeed Stack

The rapid proliferation of specialized AI agents has marked a significant turning point in automation. While these agents excel at specific tasks, their true potential is unlocked only when they can collaborate, combining unique capabilities to solve complex, multi-faceted problems.

The challenge lies in enabling this collaboration without creating a tangled web of custom, point-to-point integrations. The Agent2Agent (A2A) protocol emerges as the standardized solution, providing a common language for diverse agents to communicate effectively.

This article explores how we integrated this powerful protocol into the LightSpeed Stack, transforming it into a fully interoperable agent.

1. The blueprint for collaboration

In an ecosystem where agents are developed by different organizations using various frameworks, a shared protocol is the foundation for true interoperability. Standardization prevents vendor lock-in, fosters a more open and collaborative development environment, and significantly reduces the complexity of building sophisticated, multi-agent systems that can evolve and scale over time.

1.1. Core Concepts of A2A

The A2A protocol is an Open Standard enabling seamless communication and collaboration between diverse AI Agents. It achieves this by defining a clear set of actors and communication elements that govern how interactions occur.

Core Actors

  • User: The entity that initiates a request.
  • A2A Client: The agent acting for the user.
  • A2A Server: A remote agent exposing the A2A endpoint.

Communication Elements

  • Agent Card: JSON metadata for discovery, providing information on an agent’s capabilities, skills, and security.
  • Task: A stateful, unique unit of work designed for long-running operations.
  • Message: A single turn of communication containing content and a role.
  • Artifact: A tangible output generated by the agent, such as a document or an image.
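To make the Agent Card concrete, here is an illustrative sketch of its shape as a plain Python dict. The field names follow the A2A Agent Card schema, but every value (agent name, URL, skill id) is a hypothetical placeholder, not LightSpeed Stack's actual card:

```python
# Illustrative shape of an A2A Agent Card, modeled as a plain Python dict.
# Field names mirror the A2A specification; all values are placeholders.
agent_card = {
    "name": "lightspeed-stack",
    "description": "Conversational agent backed by Llama Stack",
    "url": "https://example.com/a2a",      # hypothetical A2A endpoint
    "version": "1.0.0",
    "capabilities": {"streaming": True},   # advertises SSE support
    "skills": [
        {
            "id": "answer-question",       # hypothetical skill id
            "name": "Answer questions",
            "description": "Answers user questions with RAG support",
            "tags": ["qa", "rag"],
        }
    ],
}

def advertised_skills(card: dict) -> list[str]:
    """Return the skill names a client would discover from the card."""
    return [skill["name"] for skill in card.get("skills", [])]
```

A supervisor agent fetches this document during discovery and uses the skills list to decide whether this agent is a suitable delegate for a given task.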

1.2. Analyzing the Key Benefits

The adoption of the A2A protocol delivers several immediate and tangible benefits for developers and organizations building multi-agent systems.

  • Interoperability: Allows agents from different organizations and frameworks to work together seamlessly, breaking down technical silos.
  • Reduced Complexity: Standardizes agent communication, minimizing the need for custom, point-to-point integrations and simplifying system architecture.
  • Secure Collaboration: Utilizes HTTPS for secure communication and supports standard authentication and authorization through HTTP headers.
  • Asynchronous Support: Natively handles Long-Running Operations (LROs) through mechanisms like polling, streaming (Server-Sent Events, SSE), and push notifications, which is critical for complex tasks.
  • Agent Autonomy: Enables agents to retain their individual capabilities while collaborating effectively within a larger network.

These foundational benefits provided the primary motivation for integrating the A2A protocol with the LightSpeed Stack, aiming to unlock new levels of orchestration and functionality.

2. Exposing LightSpeed Stack via A2A

To expose the LightSpeed Stack as an A2A agent, we implemented a layered architecture that bridges the external JSON-RPC protocol with the internal Llama Stack client. This work was delivered as part of a dedicated Pull Request.

The following diagram illustrates the request flow from an external client down to the core Llama Stack components:

┌─────────────────────────────────────────────────────────────────┐
│                        A2A Client                               │
│                  (A2A Inspector, Other Agents)                  │
└─────────────────────────┬───────────────────────────────────────┘
                          │ JSON-RPC over HTTP
                          ▼
┌─────────────────────────────────────────────────────────────────┐
│                    FastAPI Application                          │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │                  A2A Endpoints                           │   │
│  │  /.well-known/agent-card.json - Agent Card Discovery     │   │
│  │  /a2a                     - JSON-RPC Handler             │   │
│  │  /a2a/health              - Health Check                 │   │
│  └──────────────────────────────────────────────────────────┘   │
│                          │                                      │
│                          ▼                                      │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │                 A2AAgentExecutor                         │   │
│  │  - Handles task execution                                │   │
│  │  - Converts Responses API events to A2A events           │   │
│  │  - Manages multi-turn conversations                      │   │
│  └──────────────────────────────────────────────────────────┘   │
│                          │                                      │
│                          ▼                                      │
│  ┌──────────────────────────────────────────────────────────┐   │
│  │                  Llama Stack Client                      │   │
│  │  - Responses API (streaming responses)                   │   │
│  │  - Tools, Shields, RAG integration                       │   │
│  └──────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────┘

The integration required the following additions to the LightSpeed Stack:

  • New Endpoints:
    • /.well-known/agent-card.json: For agent discovery via its Agent Card.
    • /a2a: The primary A2A JSON-RPC endpoint for all agent communication.
    • /a2a/health: A standard health check endpoint.
  • LightSpeed Stack Modifications:
    • New Dependency: a2a-sdk was added to provide core A2A functionality.
    • New Configuration: An agent card yaml file was created to define the agent’s public capabilities.
    • Application Mapping: A queue was implemented to map requests between the LightSpeed Core (LSC) FastAPI application and the A2A StarletteApp.
  • Streaming Implementation:
    • The existing streaming_query_v2 (Responses API) function was leveraged for real-time responses.

The mapping queue serves as a critical architectural bridge, decoupling the existing LightSpeed Core’s FastAPI application from the new A2A-compliant StarletteApp. This ensures that internal application logic remains separate from the standardized communication protocol, a key principle of clean system design:

  • the FastAPI layer handles protocol compliance,
  • while the A2AAgentExecutor manages the business logic and translation layer.
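The queue-based bridge described above can be sketched with an asyncio.Queue. This is a minimal stand-in, not the actual LightSpeed Core implementation: the request envelope, field names, and echo handler are all hypothetical, and the real worker would route into the Llama Stack client rather than echoing:

```python
import asyncio
from dataclasses import dataclass

@dataclass
class BridgedRequest:
    """Hypothetical envelope pairing an A2A payload with a reply future."""
    payload: dict
    reply: asyncio.Future

async def a2a_side(queue: asyncio.Queue, payload: dict) -> dict:
    """Called from the A2A StarletteApp: enqueue the request, await the reply."""
    loop = asyncio.get_running_loop()
    req = BridgedRequest(payload=payload, reply=loop.create_future())
    await queue.put(req)
    return await req.reply

async def core_side(queue: asyncio.Queue) -> None:
    """Core FastAPI worker: drain the queue and fulfil each reply future."""
    while True:
        req = await queue.get()
        # Stand-in for real query handling inside LightSpeed Core.
        req.reply.set_result({"echo": req.payload})
        queue.task_done()

async def demo() -> dict:
    """Usage example: one round trip through the bridge."""
    queue: asyncio.Queue = asyncio.Queue()
    worker = asyncio.create_task(core_side(queue))
    result = await a2a_side(queue, {"question": "hi"})
    worker.cancel()
    return result
```

The future-per-request pattern keeps the two applications decoupled: the A2A side never calls into core handlers directly, it only awaits a reply on the shared queue.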

2.1. How It Works: A Python Implementation Overview

The integration was achieved through a clear and logical Python implementation process, leveraging the A2A SDK to streamline development.

  1. Defining Agent Skills: The agent’s capabilities, or “skills,” are first defined in a YAML configuration file. This makes the information the agent exposes explicit and easy to manage.
  2. Creating the Agent Card: To enable the service discovery central to the A2A protocol, an agent card JSON object is created, based on the provided yaml file. This card acts as a public manifest, advertising the agent’s identity, skills, and security protocols, making it discoverable by supervisors in the network.
  3. Executing Agent Logic: The agent executor is responsible for the core logic, mapping the defined skills to the A2A protocol and handling the routing of queries to the appropriate backend.
  4. Serving the Agent: Finally, the agent server utilizes StarletteApp to expose the necessary endpoints and handle all incoming A2A communications according to the protocol specification.
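The article does not reproduce the agent card YAML itself, but a hypothetical sketch of such a configuration, with invented field values mirroring the A2A Agent Card schema, might look like:

```yaml
# Hypothetical agent card configuration; all values are illustrative
# placeholders, not LightSpeed Stack's actual card.
name: lightspeed-stack
description: Conversational agent backed by Llama Stack
version: 1.0.0
capabilities:
  streaming: true
skills:
  - id: answer-question
    name: Answer questions
    description: Answers user questions with RAG and tool support
    tags:
      - qa
      - rag
```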

Note: The current implementation has a key limitation: there is no support for the LightSpeed Stack to call other A2A agents. It can only act as an endpoint to be called by others.

2.2. Deep Dive: How the Executor Works

At the heart of this integration is the A2AAgentExecutor. This class implements the standard A2A AgentExecutor interface and acts as the orchestrator for all incoming tasks.

The Execution Lifecycle

When a request hits the endpoint, the executor performs five key steps:

  1. Receives A2A Request: It extracts the user input and intent from the standardized A2A message.

  2. Creates Query Request: It builds an internal QueryRequest object, injecting the necessary conversation context.

  3. Calls Llama Stack: It utilizes the Responses API to trigger the underlying model, enabling streaming responses.

  4. Converts Events: This is the critical translation step. It transforms Llama Stack’s streaming chunks into standardized A2A events in real-time.

  5. Manages State: Throughout the process, it tracks the task state and publishes status updates to the client.

Event Flow Visualization

The flow from a raw request to a finalized task involves several state transitions and event emissions:

A2A Request
    │
    ▼
┌─────────────────────┐
│ Extract User Input  │
└─────────────────────┘
    │
    ▼
┌─────────────────────┐
│ Create/Resume Task  │──► TaskSubmittedEvent
└─────────────────────┘
    │
    ▼
┌─────────────────────┐
│ Call Llama Stack    │──► TaskStatusUpdateEvent (working)
│ Responses API       │
└─────────────────────┘
    │
    ▼
┌─────────────────────┐
│ Stream Response     │──► TaskStatusUpdateEvent (working, with deltas)
│ Chunks              │──► TaskStatusUpdateEvent (tool calls)
└─────────────────────┘
    │
    ▼
┌─────────────────────┐
│ Response Complete   │──► TaskArtifactUpdateEvent (final content)
└─────────────────────┘
    │
    ▼
┌─────────────────────┐
│ Finalize Task       │──► TaskStatusUpdateEvent (completed/failed)
└─────────────────────┘

Task States

We map every operation to a specific state, allowing the calling agent to understand exactly where the process stands:

State            Description
submitted        Task has been received and queued.
working          Task is being processed (includes streaming generation).
completed        Task finished successfully.
failed           Task failed with an error.
input_required   Agent needs additional input from the user (e.g., clarification).
auth_required    Authentication is required to continue.
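Assuming a client polls a non-streaming task, the state table maps naturally onto a small helper like the following. The state names come from the A2A protocol; the helper itself is illustrative:

```python
# A2A task states from the table above, grouped by what a polling
# client should do next. The groupings follow the A2A state semantics;
# the helper function is an illustrative sketch, not part of a2a-sdk.
TERMINAL_STATES = {"completed", "failed"}
WAITING_STATES = {"input_required", "auth_required"}  # client must act first

def should_keep_polling(state: str) -> bool:
    """True while the remote task is still submitted or working."""
    if state in TERMINAL_STATES or state in WAITING_STATES:
        return False
    if state in {"submitted", "working"}:
        return True
    raise ValueError(f"unknown A2A task state: {state}")
```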

Status Update Handling

Real-time feedback is crucial for a responsive user experience. The agent sends TaskStatusUpdateEvents at critical junctures:

  1. Initial Status: When a task starts, a working status is sent containing metadata (model used, conversation ID).

  2. Text Deltas: As the LLM generates text, every token is streamed as a working status with the delta text included.

  3. Tool Calls: If the agent triggers RAG or MCP servers, specific status updates indicate which tool is being called.

  4. Final Status: A completed or failed event marks the end of the transaction.
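The four kinds of status update above can be sketched as one mapping function. The chunk shape and field names here are hypothetical stand-ins, not the real Responses API event schema:

```python
# Sketch of the status-update mapping: each kind of streamed chunk
# becomes one A2A status payload. Chunk fields are invented for
# illustration; the real stream events are richer.
def status_for_chunk(chunk: dict) -> dict:
    kind = chunk.get("kind")
    if kind == "start":
        # Initial working status carries task metadata.
        return {
            "state": "working",
            "metadata": {
                "model": chunk["model"],
                "conversation_id": chunk["conversation_id"],
            },
        }
    if kind == "text_delta":
        # Each generated token is streamed as a working-state delta.
        return {"state": "working", "delta": chunk["text"]}
    if kind == "tool_call":
        # RAG / MCP tool invocations surface the tool being called.
        return {"state": "working", "tool": chunk["tool_name"]}
    if kind == "done":
        # Terminal status: completed, or failed if an error was reported.
        return {"state": "failed" if chunk.get("error") else "completed"}
    raise ValueError(f"unexpected chunk kind: {kind}")
```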

3. Strategic Advantages and Future Horizons

This integration is far more than a technical exercise; it is a strategic move that positions the LightSpeed Stack as a key component in a larger, collaborative intelligence ecosystem. It immediately unlocks new capabilities and paves the way for more sophisticated use cases in the future.

Unlocking Value

By adopting the A2A protocol, this work enables a range of powerful, forward-looking possibilities that extend well beyond the initial use case.

  • Exposure to Partners: The standardized A2A endpoint provides a secure and controlled method to expose the agent’s capabilities to external partners, fostering new integration opportunities.
  • Enhanced Integration: The agent can now be integrated into diverse external environments and orchestration frameworks that support the A2A protocol, increasing its utility and reach.

The Road Ahead

  • Advanced Orchestration: This integration is a critical step toward enabling more complex, multi-step agentic orchestration and direct agent-to-agent communication, where agents can leverage each other as specialized “tools” to complete complex tasks.

As this landscape of collaborative AI evolves, it also surfaces important questions about the best practices for implementation.

4. Open Questions: Navigating the Future of Agent Frameworks

While the A2A protocol provides the rails for interoperability, it does not prescribe every implementation detail. As architects, we must still navigate critical design crossroads when integrating this standard into frameworks like LightSpeed Stack or Google’s ADK. The choices we make here have significant implications for system performance, state management, and user experience.

4.1. Key Considerations for A2A Implementation

A notable characteristic of the A2A protocol is its inherent flexibility, which provides architects with multiple implementation options to suit specific use cases. Understanding the strategic implications of these choices is critical for sound system design. As multi-agent systems mature, the community will need to establish best practices around several open questions inherent in the A2A protocol’s design.

Status vs. Artifact Updates: The protocol allows progress to be communicated via status changes or through the incremental delivery of artifacts. Architects can choose lightweight status updates for simple progress tracking or use artifact updates when the intermediate outputs of a task are themselves valuable and need to be delivered as they are produced.

Streaming vs. Non-streaming: This choice dictates how data is returned. Non-streaming is suitable for quick operations with a single, complete response. Streaming, via Server-Sent Events (SSE), is essential for long-running tasks where providing real-time, incremental feedback to the user or calling agent is critical for a responsive experience.

Tasks vs. Messages: This choice gives architects critical flexibility: simpler, stateless interactions can be modeled with lightweight Messages for rapid request-response scenarios, while complex, multi-turn processes that require state tracking and progress monitoring are robustly managed via the Task construct.

This flexibility ensures that the A2A protocol can accommodate a wide range of agentic behaviors and interaction patterns, from simple request-response to complex, long-running collaborative tasks.

Conclusion: Building the Future of Collaborative Intelligence

The journey toward truly collaborative AI hinges on standardization and interoperability. The A2A protocol provides the essential blueprint for this future, defining a common language that allows specialized agents to work in concert. The successful integration of an A2A endpoint with the LightSpeed Stack serves as a powerful proof-of-concept, demonstrating not only the technical feasibility but also the immense strategic value of embracing open standards. Adopting open standards like A2A is no longer a technical choice but a strategic imperative for any enterprise aiming to build a scalable, future-proof, and truly collaborative AI ecosystem.