Demo notes
Slides
The PowerPoint slides with speaker notes are available for download.
Actions
- Deploy the Azure resources.
- Open the Azure portal pages for Cosmos DB, Azure AI Search, and Azure AI Studio.
- Deploy the Prompt Flow endpoint in ai.azure.com.
High level
- Discuss the Azure AI platform
- Discuss the RAG pattern
- Discuss Prompt Flow
AI Studio
AI Studio is a generative AI development hub that provides access to thousands of language models from OpenAI, Meta, Hugging Face, and more.
Prompt Flow
There are several approaches to building RAG applications: code-centric approaches such as LangChain, Semantic Kernel, and the OpenAI Assistants API, and low-code/no-code approaches.
Prompt Flow is a low-code/no-code approach to building RAG applications.
Azure Prompt Flow simplifies the process of prototyping, experimenting, and deploying AI applications powered by Large Language Models (LLMs).
Step 1: Provisioning Azure resources
- Review the Azure resources created by the Bicep template.
Step 2A: Create your first Prompt Flow
- Load grounding data and review in the Azure portal.
- Create your first Prompt Flow.
- Set question to `what tents can you recommend for beginners?`.
- Set connection to `aoai-connection`.
- Run the flow.
Step 2B: Add a new tool
- Select Ctrl or Cmd + N to create a new tool.
- Select the Embedding tool (a rough standalone equivalent is sketched after this list).
- Set the connection to `aoai-connection`.
- Set the deployment_name to `text-embedding-3-small`.
- Connect the embedding tool to the question input.
- Run the flow.
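Under the hood, the embedding tool calls the Azure OpenAI embeddings API. A rough standalone sketch, assuming the `openai` Python package and placeholder environment variables for the endpoint and key (in the flow these come from `aoai-connection`):

```python
import os

from openai import AzureOpenAI

# Endpoint and key are placeholders; in the flow they come from aoai-connection.
client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)

# Embed the demo question with the same deployment the tool uses.
response = client.embeddings.create(
    model="text-embedding-3-small",  # Azure deployment name
    input="what tents can you recommend for beginners?",
)
vector = response.data[0].embedding  # list[float]
print(len(vector))
```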
Step 3: Load grounding data
- Review loaded data in Cosmos DB and Azure AI Search (a sketch of reading one customer record follows).
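For reference, a minimal sketch of reading a customer record with the `azure-cosmos` SDK. The database and container names (`contoso-outdoor`, `customers`) are assumptions, not taken from these notes:

```python
import os

from azure.cosmos import CosmosClient

# Endpoint and key are placeholders; database/container names are assumptions.
client = CosmosClient(
    os.environ["COSMOS_ENDPOINT"], credential=os.environ["COSMOS_KEY"]
)
container = client.get_database_client("contoso-outdoor").get_container_client("customers")

# Read one customer document by id and partition key (customer_id "7" from the flow).
customer = container.read_item(item="7", partition_key="7")
print(customer)
```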
Step 4: Retrieve, the R in RAG
Retrieve flow.dag.yaml
```yaml
$schema: https://azuremlschemas.azureedge.net/promptflow/latest/Flow.schema.json
environment:
  python_requirements_txt: requirements.txt
inputs:
  chat_history:
    type: list
    is_chat_history: true
    default: []
  question:
    type: string
    is_chat_input: true
    default: recommended tents for beginners
  customer_id:
    type: string
    default: "7"
outputs:
  order_history:
    type: string
    reference: ${customer_lookup.output}
    is_chat_output: true
  Product_info:
    type: string
    reference: ${retrieve_documentation.output}
nodes:
- name: question_embedding
  type: python
  source:
    type: package
    tool: promptflow.tools.embedding.embedding
  inputs:
    connection: aoai-connection
    deployment_name: text-embedding-3-small
    input: ${inputs.question}
- name: retrieve_documentation
  type: python
  source:
    type: code
    path: ../contoso-chat/retrieve_documentation.py
  inputs:
    question: ${inputs.question}
    index_name: contoso-products
    embedding: ${question_embedding.output}
    search: contoso-search
- name: customer_lookup
  type: python
  source:
    type: code
    path: ../contoso-chat/customer_lookup.py
  inputs:
    customerId: ${inputs.customer_id}
    conn: contoso-cosmos
```
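These notes don't show the body of retrieve_documentation.py. A minimal sketch of what such a Prompt Flow tool could look like, using azure-search-documents vector search; the index field names (`contentVector`, `content`, `id`) and the return shape are assumptions about the schema, not the repo's actual code:

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from azure.search.documents.models import VectorizedQuery
from promptflow import tool
from promptflow.connections import CognitiveSearchConnection


@tool
def retrieve_documentation(
    question: str, index_name: str, embedding: list, search: CognitiveSearchConnection
) -> list:
    # Client for the index named in the flow (contoso-products).
    client = SearchClient(
        endpoint=search.api_base,
        index_name=index_name,
        credential=AzureKeyCredential(search.api_key),
    )
    # Hybrid query: keyword search on the question text plus a vector query
    # on the question embedding. Field names are assumed, not from the notes.
    results = client.search(
        search_text=question,
        vector_queries=[
            VectorizedQuery(vector=embedding, k_nearest_neighbors=3, fields="contentVector")
        ],
        top=3,
    )
    return [{"id": doc["id"], "content": doc["content"]} for doc in results]
```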
Step 5: Augmentation, the A in RAG
Prompt templating
Prompt Flow uses Jinja2, a templating language for Python, to format prompts.
- Update the YAML to match the version below.
- To see templating in action, select the link on the customer_prompt tool.
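To make the templating concrete outside the portal, here is a tiny standalone Jinja2 example. The template text is illustrative only; the repo's customer_prompt.jinja2 is richer:

```python
from jinja2 import Template

# Illustrative template only; the real customer_prompt.jinja2 differs.
template = Template(
    "You are a helpful outdoor-gear assistant.\n"
    "Customer: {{ customer.firstName }}\n"
    "Relevant docs:\n"
    "{% for doc in documentation %}- {{ doc.content }}\n{% endfor %}"
)

print(
    template.render(
        customer={"firstName": "Jane"},
        documentation=[{"content": "The Alpine Explorer tent sleeps two."}],
    )
)
```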
Augmentation flow.dag.yaml
```yaml
$schema: https://azuremlschemas.azureedge.net/promptflow/latest/Flow.schema.json
environment:
  python_requirements_txt: requirements.txt
inputs:
  chat_history:
    type: list
    is_chat_history: true
    default: []
  question:
    type: string
    is_chat_input: true
    default: recommended tents for beginners
  customer_id:
    type: string
    default: "7"
outputs:
  answer:
    type: string
    reference: ${inputs.question}
    is_chat_output: true
  context:
    type: string
    reference: ${customer_prompt.output}
nodes:
- name: question_embedding
  type: python
  source:
    type: package
    tool: promptflow.tools.embedding.embedding
  inputs:
    connection: aoai-connection
    deployment_name: text-embedding-3-small
    input: ${inputs.question}
- name: retrieve_documentation
  type: python
  source:
    type: code
    path: ../contoso-chat/retrieve_documentation.py
  inputs:
    question: ${inputs.question}
    index_name: contoso-products
    embedding: ${question_embedding.output}
    search: contoso-search
- name: customer_lookup
  type: python
  source:
    type: code
    path: ../contoso-chat/customer_lookup.py
  inputs:
    customerId: ${inputs.customer_id}
    conn: contoso-cosmos
- name: customer_prompt
  type: prompt
  source:
    type: code
    path: ../contoso-chat/customer_prompt.jinja2
  inputs:
    documentation: ${retrieve_documentation.output}
    customer: ${customer_lookup.output}
    history: ${inputs.chat_history}
```
Step 6: Generation, the G in RAG
- Run the full flow.
- Review the outputs in a new tab.
- Review the prompt tokens and duration tab.
Generation flow.dag.yaml
```yaml
$schema: https://azuremlschemas.azureedge.net/promptflow/latest/Flow.schema.json
environment:
  python_requirements_txt: requirements.txt
inputs:
  chat_history:
    type: list
    is_chat_history: true
    default: []
  question:
    type: string
    is_chat_input: true
    default: recommended tents for beginners
  customer_id:
    type: string
    default: "7"
outputs:
  answer:
    type: string
    reference: ${llm_response.output}
    is_chat_output: true
  context:
    type: string
    reference: ${retrieve_documentation.output}
nodes:
- name: question_embedding
  type: python
  source:
    type: package
    tool: promptflow.tools.embedding.embedding
  inputs:
    connection: aoai-connection
    deployment_name: text-embedding-3-small
    input: ${inputs.question}
- name: retrieve_documentation
  type: python
  source:
    type: code
    path: ../contoso-chat/retrieve_documentation.py
  inputs:
    question: ${inputs.question}
    index_name: contoso-products
    embedding: ${question_embedding.output}
    search: contoso-search
- name: customer_lookup
  type: python
  source:
    type: code
    path: ../contoso-chat/customer_lookup.py
  inputs:
    customerId: ${inputs.customer_id}
    conn: contoso-cosmos
- name: customer_prompt
  type: prompt
  source:
    type: code
    path: ../contoso-chat/customer_prompt.jinja2
  inputs:
    documentation: ${retrieve_documentation.output}
    customer: ${customer_lookup.output}
    history: ${inputs.chat_history}
- name: llm_response
  type: llm
  source:
    type: code
    path: ../contoso-chat/llm_response.jinja2
  inputs:
    deployment_name: gpt-35-turbo
    prompt_text: ${customer_prompt.output}
    question: ${inputs.question}
  connection: aoai-connection
  api: chat
```
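The llm_response node ultimately issues a chat completion against the gpt-35-turbo deployment. A rough standalone equivalent, with the endpoint, key, and prompt text as placeholders (in the flow the prompt is `${customer_prompt.output}`):

```python
import os

from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)

# prompt_text stands in for ${customer_prompt.output}; the real rendered
# prompt contains the retrieved docs and the customer's order history.
prompt_text = "You are a helpful assistant for Contoso Outdoor products."
response = client.chat.completions.create(
    model="gpt-35-turbo",  # Azure deployment name from the flow
    messages=[
        {"role": "system", "content": prompt_text},
        {"role": "user", "content": "recommended tents for beginners"},
    ],
)
print(response.choices[0].message.content)
```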
Step 7: Prompt evaluations
Run and discuss the prompt evaluations. The evaluations use GPT-4 to evaluate the performance of the gpt-35-turbo model. At the moment, GPT-4 is slower and more expensive than gpt-35-turbo, but it makes a good evaluation model. (A toy LLM-as-judge call is sketched after the list below.)
- Open the evaluate-chat-local.ipynb from the eval folder.
- Run the notebook.
- Discuss calling two flows:
  - The flow.dag.yaml flow in the contoso-chat folder.
  - The flow.dag.yaml flow in the groundedness folder.
- Discuss the results.
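To show the idea behind GPT-4-as-judge, here is a toy groundedness check; the rubric wording and the sample context/answer are illustrative, not the repo's evaluation prompts:

```python
import os

from openai import AzureOpenAI

client = AzureOpenAI(
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"],
    api_key=os.environ["AZURE_OPENAI_API_KEY"],
    api_version="2024-02-01",
)

context = "The Alpine Explorer tent sleeps two and weighs 3 kg."
answer = "I recommend the Alpine Explorer tent; it sleeps two."

# Ask the gpt-4 deployment to grade groundedness on a 1-5 scale.
# The rubric below is illustrative, not the repo's actual evaluation prompt.
judge_prompt = (
    "Rate from 1 (not grounded) to 5 (fully grounded) how well the ANSWER "
    "is supported by the CONTEXT. Reply with the number only.\n\n"
    f"CONTEXT: {context}\n\nANSWER: {answer}"
)
score = client.chat.completions.create(
    model="gpt-4",  # Azure deployment name for the judge model
    messages=[{"role": "user", "content": judge_prompt}],
)
print(score.choices[0].message.content)
```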
Step 8: Testing and deployment
- Run the flow locally.
- Discuss deployment options.
- Test the deployed endpoint (see the sketch after this list).
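A deployed Prompt Flow endpoint can be scored with a plain HTTP POST. A sketch under the assumption that the endpoint uses key-based bearer auth; the URL and key are placeholders for the values shown on the endpoint's Consume tab in ai.azure.com:

```python
import os

import requests

# URL and key are placeholders; copy the real values from the endpoint's
# Consume tab. The payload keys match the flow's declared inputs.
url = os.environ["PF_ENDPOINT_URL"]
key = os.environ["PF_ENDPOINT_KEY"]

payload = {
    "question": "what tents can you recommend for beginners?",
    "customer_id": "7",
    "chat_history": [],
}
resp = requests.post(url, json=payload, headers={"Authorization": f"Bearer {key}"})
resp.raise_for_status()
print(resp.json())
```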
Few-shot example in the repo
For a great example of using few-shot learning to understand customer intent, see the contoso-intent folder in the repo.
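For flavor, a hypothetical few-shot intent prompt; the actual prompt lives in the contoso-intent folder and may differ:

```python
# Hypothetical few-shot prompt for intent classification; the repo's actual
# prompt in contoso-intent may differ. The labeled examples steer the model.
few_shot_prompt = """Classify the customer's intent as one of:
product_question, order_status, chitchat.

Customer: where is my order?
Intent: order_status

Customer: hi there!
Intent: chitchat

Customer: what tents can you recommend for beginners?
Intent:"""

print(few_shot_prompt)
```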