Solution Architecture

Estimated time to read: 3 minutes

In this workshop, you will create the Zava Sales Agent: a conversational agent designed to answer questions about sales data, generate charts, provide product recommendations, and support image-based product searches for Zava's retail DIY business.

Components of the Agent App

Microsoft Azure services

This agent is built on Microsoft Azure services.
- Generative AI model: The app is powered by the Azure OpenAI gpt-4o model.
- Control Plane: The app and its architectural components are managed and monitored using the Azure AI Foundry portal, accessible via the browser.

Azure AI Foundry (SDK)

The workshop is offered in Python using the Azure AI Foundry SDK. The SDK supports key features of the Azure AI Agents service, including Code Interpreter and Model Context Protocol (MCP) integration.

Database

The app is powered by the Zava Sales Database, an Azure Database for PostgreSQL flexible server with the pgvector extension. It contains comprehensive sales data for Zava's retail DIY operations.
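
As a rough illustration of what a vector extension such as pgvector is used for, the sketch below ranks products by cosine distance, which is the measure behind pgvector's `<=>` operator. The product names, the three-dimensional embeddings, and the function names are made up for this example; real embeddings come from a model and have far more dimensions, and in the workshop the ranking would happen inside PostgreSQL rather than in Python.

```python
import math

# Hypothetical toy embeddings; real ones are produced by an embedding model.
PRODUCT_EMBEDDINGS = {
    "cordless drill": [0.9, 0.1, 0.0],
    "paint roller": [0.0, 0.2, 0.9],
    "hammer drill": [0.8, 0.3, 0.1],
}


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def nearest_products(query_embedding, k=2):
    """Return the k product names whose embeddings are closest to the query."""
    ranked = sorted(
        PRODUCT_EMBEDDINGS,
        key=lambda name: cosine_distance(query_embedding, PRODUCT_EMBEDDINGS[name]),
    )
    return ranked[:k]


if __name__ == "__main__":
    print(nearest_products([1.0, 0.0, 0.0]))
```

A query vector pointing in the "drill" direction ranks both drills ahead of the paint roller, which is the behavior a similarity search relies on.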

The database supports complex queries for sales, inventory, and customer data. Row-Level Security (RLS) ensures agents access only their assigned stores.
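
To make the row-level security idea concrete, the sketch below emulates in memory what a per-store policy does. The policy text, the `orders` table, the `store_id` column, and the `app.current_store_id` setting are illustrative assumptions, not the workshop's actual schema. In the real database, PostgreSQL enforces the policy itself; application code does not filter rows.

```python
from dataclasses import dataclass

# Hypothetical policy text; table, column, and setting names are assumptions.
EXAMPLE_POLICY_SQL = """
ALTER TABLE orders ENABLE ROW LEVEL SECURITY;
CREATE POLICY store_isolation ON orders
    USING (store_id = current_setting('app.current_store_id')::int);
"""


@dataclass(frozen=True)
class Order:
    order_id: int
    store_id: int
    total: float


# Made-up rows spanning two stores.
ORDERS = [
    Order(order_id=1, store_id=10, total=49.99),
    Order(order_id=2, store_id=20, total=15.00),
    Order(order_id=3, store_id=10, total=120.50),
]


def visible_orders(orders, current_store_id):
    """Emulate the USING clause: keep only rows that belong to the caller's store."""
    return [order for order in orders if order.store_id == current_store_id]


if __name__ == "__main__":
    for order in visible_orders(ORDERS, current_store_id=10):
        print(order)
```

The point of enforcing this in the database is that every query the agent generates is filtered the same way, whatever SQL it contains.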

MCP Server

The Model Context Protocol (MCP) server is a custom Python service that acts as a bridge between the agent and the PostgreSQL database. It handles:
- Database Schema Discovery: Automatically retrieves database schemas to help the agent understand available data.
- Query Generation: Transforms natural language requests into SQL queries.
- Tool Execution: Executes SQL queries and returns results in a format the agent can use.
- Time Services: Provides time-related data for generating time-sensitive reports.
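
The responsibilities listed above can be pictured as a small set of named tools that the agent calls. The following is a minimal sketch under stated assumptions, not the workshop's MCP server: it does not use an MCP library, the tool names and the `sales` table are hypothetical, and `sqlite3` stands in for PostgreSQL so the example runs without a database server. Query generation is left out because it is the model's job, not something this sketch implements.

```python
import asyncio
import sqlite3
from datetime import datetime, timezone

# In-memory SQLite database standing in for PostgreSQL (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE sales (store TEXT, product TEXT, revenue REAL);
    INSERT INTO sales VALUES
        ('Seattle', 'Hammer', 120.0),
        ('Seattle', 'Drill', 300.0),
        ('Online', 'Hammer', 80.0);
    """
)


async def get_table_schema(table):
    """Schema discovery: return column names and types for one table."""
    rows = conn.execute(
        "SELECT name, type FROM pragma_table_info(?)", (table,)
    ).fetchall()
    return [{"column": name, "type": col_type} for name, col_type in rows]


async def execute_query(sql):
    """Tool execution: run a SELECT and return rows as a list of dicts."""
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed in this sketch")
    cursor = conn.execute(sql)
    columns = [description[0] for description in cursor.description]
    return [dict(zip(columns, row)) for row in cursor.fetchall()]


async def get_current_utc_date():
    """Time service: return the current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat()


# Hypothetical tool registry: name -> coroutine function.
TOOLS = {
    "get_table_schema": get_table_schema,
    "execute_query": execute_query,
    "get_current_utc_date": get_current_utc_date,
}


async def call_tool(name, **arguments):
    """Dispatch a tool call by name and await its result."""
    return await TOOLS[name](**arguments)


async def main():
    print(await call_tool("get_table_schema", table="sales"))
    print(await call_tool(
        "execute_query",
        sql="SELECT product, SUM(revenue) AS revenue FROM sales "
            "GROUP BY product ORDER BY product",
    ))
    print(await call_tool("get_current_utc_date"))


if __name__ == "__main__":
    asyncio.run(main())
```

The `sqlite3` calls here block the event loop; a real service would use an asynchronous PostgreSQL driver so that the tools are non-blocking.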

Extending the Workshop Solution

You can adapt the workshop to other use cases, such as customer support, by updating the database and customizing the Foundry Agent Service instructions.

Best Practices Demonstrated in the App

The app also demonstrates some best practices for efficiency and user experience.

- Asynchronous APIs: In the workshop sample, both the Foundry Agent Service and PostgreSQL are accessed through asynchronous APIs, which improves resource efficiency and scalability. This design is especially useful when you deploy the application with asynchronous web frameworks like FastAPI, ASP.NET, or Streamlit.
- Token Streaming: The app streams tokens as they are generated, which reduces perceived response time for the LLM-powered agent app.
- Observability: The app includes built-in tracing and metrics to monitor agent performance, usage patterns, and latency, so you can identify issues and optimize the agent over time.
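
The three practices above fit together naturally, and the sketch below shows one way to picture them with the standard library only. It is an illustration under assumptions, not the workshop's code: `fake_token_stream` is a hypothetical stand-in for a model's streamed reply, the reply texts are made up, and the recorded numbers are simple timers, not the app's tracing.

```python
import asyncio
import time


async def fake_token_stream(text, delay=0.01):
    """Hypothetical stand-in for a streamed model reply: yields one word at a time."""
    for word in text.split():
        await asyncio.sleep(delay)  # simulates per-token model and network latency
        yield word + " "


async def stream_reply(text, on_token=None):
    """Consume the stream, forward tokens as they arrive, and record latency."""
    start = time.perf_counter()
    time_to_first_token = None
    parts = []
    async for token in fake_token_stream(text):
        if time_to_first_token is None:
            time_to_first_token = time.perf_counter() - start
        parts.append(token)
        if on_token is not None:
            on_token(token)  # for example, push the token to the UI right away
    return {
        "reply": "".join(parts).strip(),
        "token_count": len(parts),
        "time_to_first_token": time_to_first_token,
        "total_time": time.perf_counter() - start,
    }


async def main():
    # Two conversations share one event loop; neither blocks the other.
    results = await asyncio.gather(
        stream_reply("Sales by store are ready"),
        stream_reply("The chart has been generated"),
    )
    for metrics in results:
        print(
            f"{metrics['reply']!r}: first token after "
            f"{metrics['time_to_first_token']:.3f}s, "
            f"done after {metrics['total_time']:.3f}s"
        )


if __name__ == "__main__":
    asyncio.run(main())
```

The gap between time to first token and total time is what streaming hides from the user: the reply starts to appear long before it is complete.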