1 | © Copyright 8/16/23 Zilliz
Stephen Batifol | Zilliz
A Beginner's Guide to Building
a RAG App Using Milvus
Speaker
Stephen Batifol
Developer Advocate, Zilliz
stephen.batifol@zilliz.com
https://www.linkedin.com/in/stephen-batifol/
https://twitter.com/stephenbtl
RAG
(Retrieval Augmented Generation)
Basic Idea
Use RAG to force the LLM to work with your data
by injecting it via a vector database like Milvus
Vector DB for RAG
Vector Databases provide the ability to inject your data via
semantic similarity
Considerations include: scale, performance, and flexibility
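As a rough illustration of what retrieval by semantic similarity means, the sketch below ranks stored vectors by cosine similarity to a query vector. The 3-dimensional vectors are hand-written toy values and the names `cosine_similarity` and `top_k` are made up for this example; a vector database such as Milvus would do this at scale with indexes rather than a full scan.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, store, k=2):
    """Return the k (text, score) pairs most similar to the query (brute force)."""
    scored = [(text, cosine_similarity(query, vec)) for text, vec in store]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Toy "embeddings": hand-written, not produced by a real model.
store = [
    ("Milvus is a vector database", [0.9, 0.1, 0.0]),
    ("Ollama runs LLMs locally", [0.1, 0.9, 0.0]),
    ("Bread recipe", [0.0, 0.1, 0.9]),
]
results = top_k([1.0, 0.2, 0.0], store, k=2)
for text, score in results:
    print(f"{score:.3f}  {text}")
```

The texts attached to the best-scoring vectors are what would be injected into the LLM prompt.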
LLMs are Stochastic
LLMs predict future tokens (à la RNNs)
• “Milvus is the world’s most popular vector ___”
• {“database”: 0.86, “search”: 0.11, “embedding”: 0.01, …}
Downside: outdated input data can cause hallucinations
• Plausible-sounding but factually incorrect responses
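To make the "stochastic" point concrete, here is a minimal sketch that samples the next token from the probabilities shown on the slide. The `<other>` bucket (holding the remaining 0.02) and the function name `sample_next_token` are assumptions added so the weights sum to 1; a real LLM samples over its full vocabulary.

```python
import random

# Next-token distribution from the slide; the leftover 0.02 is lumped
# into a hypothetical "<other>" bucket so the weights sum to 1.
next_token_probs = {"database": 0.86, "search": 0.11, "embedding": 0.01, "<other>": 0.02}

def sample_next_token(probs, rng):
    """Draw one token at random, weighted by its probability."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]

rng = random.Random(0)  # seeded only so repeated runs give the same draws
samples = [sample_next_token(next_token_probs, rng) for _ in range(1000)]
counts = {t: samples.count(t) for t in next_token_probs}
print(counts)
```

Most draws produce "database", but not all of them, which is why the same prompt can yield different completions.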
Basic RAG Architecture
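The diagram for this slide did not survive extraction. As a hedged sketch of the usual flow (retrieve relevant chunks, place them in the prompt, then call the LLM), the code below uses a keyword-overlap retriever and a hypothetical `fake_llm` stand-in. In the stack from this talk those roles would be played by Milvus and a model served through Ollama; none of the function names here come from those libraries.

```python
def retrieve(question, chunks, k=2):
    """Stand-in retriever: rank chunks by the number of words shared with the question."""
    q_words = set(question.lower().split())
    ranked = sorted(
        chunks,
        key=lambda c: len(q_words & set(c.lower().split())),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(question, context_chunks):
    """Inject the retrieved chunks into the prompt ahead of the question."""
    context = "\n".join(f"- {c}" for c in context_chunks)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\nAnswer:"
    )

def fake_llm(prompt):
    """Hypothetical placeholder for a real LLM call."""
    return f"(model output for a {len(prompt)}-character prompt)"

chunks = [
    "Milvus is an open source vector database",
    "Ollama runs quantized LLMs locally",
    "Chunk overlap repeats text between neighbouring chunks",
]
question = "What is Milvus"
prompt = build_prompt(question, retrieve(question, chunks, k=1))
print(prompt)
print(fake_llm(prompt))
```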
01 Tech Stack
Tech Stack
LangChain
• Framework for building LLM applications
• Focus on retrieving data and integrating with LLMs
• Loading the data
• Chunk & chunk overlap
• Integrations with most popular tools
Ollama
• Run quantized LLMs Locally
• Embeddings Models
Milvus
1. Cloud Native, Distributed System Architecture
2. True Separation of Concerns
3. Scalable Index Creation Strategy with 512 MB Segments
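For a feel of what a 512 MB segment holds, here is a back-of-the-envelope calculation. It assumes 768-dimensional float32 vectors, treats MB as 2**20 bytes, and ignores IDs, metadata, and index overhead; these figures are illustrative assumptions, not Milvus specifications.

```python
SEGMENT_BYTES = 512 * 1024 * 1024   # 512 MB, taking MB as 2**20 bytes (assumption)
DIM = 768                           # assumed embedding dimension
BYTES_PER_FLOAT32 = 4

bytes_per_vector = DIM * BYTES_PER_FLOAT32
vectors_per_segment = SEGMENT_BYTES // bytes_per_vector

def segments_needed(num_vectors):
    """Ceiling division: segments required to hold num_vectors raw vectors."""
    return -(-num_vectors // vectors_per_segment)

print(bytes_per_vector, vectors_per_segment, segments_needed(1_000_000))
```

Under these assumptions, one million raw vectors span a handful of segments, each of which can be indexed independently.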
Embeddings Models
02 Embeddings
Examining Embeddings
Picking a model
What to embed
Metadata
Embeddings Strategies
Level 1: Embedding Chunks Directly
Level 2: Embedding Sub and Super Chunks
Level 3: Incorporating Chunking and Non-Chunking Metadata
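The slide does not spell out Level 2, so the following is one possible reading, offered as an assumption: embed small sub-chunks for precise matching, and keep a pointer from each sub-chunk to its larger super-chunk so the wider context can be returned. The names `split_fixed` and `build_sub_super_index` and the sizes are hypothetical.

```python
def split_fixed(text, size):
    """Cut text into consecutive pieces of at most `size` characters."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def build_sub_super_index(text, super_size=40, sub_size=10):
    """Return (super_chunks, sub_records); each sub record points at its parent."""
    super_chunks = split_fixed(text, super_size)
    sub_records = []
    for parent_id, parent in enumerate(super_chunks):
        for sub in split_fixed(parent, sub_size):
            sub_records.append({"text": sub, "parent_id": parent_id})
    return super_chunks, sub_records

text = "abcdefghij" * 8  # 80 characters of filler text
super_chunks, sub_records = build_sub_super_index(text)

# A hit on a small sub-chunk can be expanded to its larger parent for context.
hit = sub_records[5]
context = super_chunks[hit["parent_id"]]
print(hit, "->", context)
```

In a real pipeline only the sub-chunk texts would be embedded and searched; the parent text would be fetched by `parent_id` after the search.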
Metadata Examples
Chunking
- Paragraph position
- Section header
- Larger paragraph
- Sentence Number
- …
Non-Chunking
- Author
- Publisher
- Organization
- Role Based Access Control
- …
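As a small sketch of how the two kinds of metadata might sit next to each chunk, the records below carry chunking metadata (section header, paragraph position) and non-chunking metadata (author, allowed roles). The field names and the `visible_to` helper are illustrative assumptions; a vector database would normally apply such a filter alongside the similarity search.

```python
records = [
    {
        "text": "Q3 revenue grew.",
        "section_header": "Results",      # chunking metadata
        "paragraph_position": 0,          # chunking metadata
        "author": "Finance Team",         # non-chunking metadata
        "allowed_roles": {"finance", "exec"},
    },
    {
        "text": "How to reset your password.",
        "section_header": "Help",
        "paragraph_position": 3,
        "author": "IT",
        "allowed_roles": {"finance", "exec", "staff"},
    },
]

def visible_to(role, records):
    """Keep only records the given role may read (a simple RBAC-style filter)."""
    return [r for r in records if role in r["allowed_roles"]]

staff_view = visible_to("staff", records)
print([r["text"] for r in staff_view])
```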
What your data looks like
Text:
“preferences of customers and prospective customers with respect to remote or hybrid
working, as a result of the COVID-19 pandemic, leading to a parallel delay, or potentially
permanent change, in receiving the corresponding revenue; • our projected financial
information, anticipated growth rate, and market opportunity; • our ability to maintain the
listing of our Class A Common Stock and Warrants on the NYSE; • our public securities’
potential liquidity and trading;”
Vector:
[-0.09975282847881317, -0.02853492833673954, -0.047886092215776443,
0.012315821833908558, -0.004004416521638632, 0.08756010979413986,
0.013248161412775517, 0.010704956017434597, -0.06194952502846718,
0.021150749176740646, 0.02453230880200863, 0.03979797288775444,
-0.032914288341999054, -0.011855324730277061, ...]
Takeaway:
Your embeddings strategy depends on your accuracy,
cost, and use case needs
03 Chunking
Chunking Considerations
Chunk Size
Chunk Overlap
Character Splitters
Examples
Chunk Size=50, Overlap=0
Examples
Chunk Size=128, Overlap=20
Examples
Chunk Size=256, Overlap=50
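The example slides showed their output as images, which are not in this transcript. As a stand-in, here is a minimal sketch of fixed-size chunking with overlap, run with the three settings from the slides. It is plain character slicing under the assumption that sizes are counted in characters; it is not LangChain's splitter, which also takes separators into account.

```python
def split_with_overlap(text, chunk_size, overlap):
    """Split text into chunks of chunk_size characters, each sharing
    `overlap` characters with the previous chunk."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks

text = "".join(str(i % 10) for i in range(300))  # 300 characters of filler
for size, overlap in [(50, 0), (128, 20), (256, 50)]:
    chunks = split_with_overlap(text, size, overlap)
    print(f"size={size} overlap={overlap} -> {len(chunks)} chunks")
```

Smaller chunks give more, narrower pieces; overlap repeats the tail of one chunk at the head of the next so that sentences cut at a boundary still appear whole somewhere.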
Examples
SemanticChunker
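The general idea behind semantic chunking is to break where the meaning shifts rather than at a fixed length. The sketch below illustrates that idea only and is not LangChain's SemanticChunker: it uses bag-of-words cosine similarity in place of a real embedding model, and the 0.2 threshold and function names are arbitrary assumptions.

```python
import math
from collections import Counter

def bow_cosine(a, b):
    """Cosine similarity between the bag-of-words counts of two sentences."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0

def semantic_chunks(sentences, threshold=0.2):
    """Start a new chunk whenever two adjacent sentences look dissimilar."""
    if not sentences:
        return []
    chunks = [[sentences[0]]]
    for prev, cur in zip(sentences, sentences[1:]):
        if bow_cosine(prev, cur) < threshold:
            chunks.append([cur])      # topic shift: open a new chunk
        else:
            chunks[-1].append(cur)    # same topic: extend the current chunk
    return [" ".join(c) for c in chunks]

sentences = [
    "Milvus stores vectors",
    "Milvus indexes vectors in segments",
    "Ollama runs models locally",
    "Ollama also serves embedding models",
]
chunks = semantic_chunks(sentences)
for c in chunks:
    print(c)
```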
How Does Your Data Look?
Conversation Data
Documentation Data
Lecture or Q/A Data
Takeaway:
Your chunking strategy depends on what your data looks
like and what you need from it.
Demo!
Questions?
Give Milvus a Star! Chat with me on Discord!
Milvus Architecture (diagram; labels recovered from the slide)
Entry: SDK, Load Balancer
Access Layer: Proxy
Coordinator Service: Root, Query, Data, Index
Worker Node: Query Node, Data Node, Index Node
Meta Storage: etcd
Message Storage: Log Broker
Object Storage: Minio / S3 / AzureBlob (Log Snapshot, Delta File, Index File)
Flows: DDL/DCL, DML, NOTIFICATION, CONTROL SIGNAL


A Beginners Guide to Building a RAG App Using Open Source Milvus

  • 1. A Beginners Guide to Building a RAG App Using Milvus. Stephen Batifol | Zilliz
  • 2. Speaker: Stephen Batifol, Developer Advocate, Zilliz. stephen.batifol@zilliz.com, https://www.linkedin.com/in/stephen-batifol/, https://twitter.com/stephenbtl
  • 3. RAG (Retrieval Augmented Generation)
  • 4. Basic Idea: Use RAG to force the LLM to work with your data by injecting it via a vector database like Milvus.
  • 5. Vector DB for RAG: Vector databases provide the ability to inject your data via semantic similarity. Considerations include scale, performance, and flexibility.
  • 6. LLMs Are Stochastic: LLMs predict future tokens (à la RNNs). Example: “Milvus is the world’s most popular vector ___” gives {“database”: 0.86, “search”: 0.11, “embedding”: 0.01, …}. Downside: outdated input data can cause hallucination, that is, plausible-sounding but factually incorrect responses.
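To make the "stochastic" point concrete, here is a minimal sketch of sampling a next token from the distribution on the slide. Only the three listed probabilities are used (the elided remainder is left out, and `random.choices` normalizes the weights); the function name `sample_next_token` and the fixed seed are illustrative assumptions, not part of the talk.

```python
import random
from collections import Counter

# Probabilities copied from the slide; the elided "…" entries are omitted.
next_token_probs = {"database": 0.86, "search": 0.11, "embedding": 0.01}

def sample_next_token(probs, rng):
    """Draw one token in proportion to its weight (hypothetical helper)."""
    tokens = list(probs)
    weights = list(probs.values())
    return rng.choices(tokens, weights=weights, k=1)[0]

rng = random.Random(0)  # seeded only so repeated runs give the same draws
samples = [sample_next_token(next_token_probs, rng) for _ in range(1000)]
counts = Counter(samples)
print(counts.most_common())
```

Most draws complete the sentence with "database", but not all of them do, which is why the same prompt can produce different outputs.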
  • 7. Basic RAG Architecture
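The basic RAG flow (embed the question, retrieve the most similar data, inject it into the prompt) can be sketched in a few lines. This is an illustration under loud assumptions: `embed` is a toy bag-of-words stand-in for a real embeddings model, the in-memory `docs` list stands in for a vector database such as Milvus, and the prompt wording is invented for the example.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy stand-in for an embeddings model: a bag-of-words count vector.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, docs, k=1):
    # Rank documents by similarity to the question and keep the top k.
    q = embed(question)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(question, context):
    # Inject the retrieved text so the LLM answers from your data.
    joined = "\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"

docs = [
    "Milvus is an open source vector database.",
    "Ollama runs quantized LLMs locally.",
    "LangChain is a framework for building LLM applications.",
]
question = "What is Milvus?"
top = retrieve(question, docs, k=1)
prompt = build_prompt(question, top)
print(prompt)
```

In a real app the prompt would then be sent to the LLM; that call is omitted here because it depends on the model runtime.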
  • 8. 01 Tech Stack
  • 9. Tech Stack
  • 10. LangChain: Framework for building LLM applications; focus on retrieving data and integrating with LLMs; loading the data; chunk & chunk overlap; integrations with most popular tools.
  • 11. Ollama: Run quantized LLMs locally; embeddings models.
  • 12. Milvus: (1) Cloud Native, Distributed System Architecture; (2) True Separation of Concerns; (3) Scalable Index Creation Strategy with 512 MB Segments.
  • 13. Embeddings Models
  • 14. 02 Embeddings
  • 15. Examining Embeddings: Picking a model; what to embed; metadata.
  • 16. Embeddings Strategies: Level 1: Embedding Chunks Directly. Level 2: Embedding Sub and Super Chunks. Level 3: Incorporating Chunking and Non-Chunking Metadata.
  • 17. Metadata Examples: Chunking: paragraph position, section header, larger paragraph, sentence number, … Non-chunking: author, publisher, organization, role-based access control, …
  • 18. What your data looks like: Text: “preferences of customers and prospective customers with respect to remote or hybrid working, as a result of the COVID-19 pandemic, leading to a parallel delay, or potentially permanent change, in receiving the corresponding revenue; • our projected financial information, anticipated growth rate, and market opportunity; • our ability to maintain the listing of our Class A Common Stock and Warrants on the NYSE; • our public securities’ potential liquidity and trading;” Vector: [-0.09975282847881317, -0.02853492833673954, -0.047886092215776443, 0.012315821833908558, -0.004004416521638632, 0.08756010979413986, 0.013248161412775517, 0.010704956017434597, -0.06194952502846718, 0.021150749176740646, 0.02453230880200863, 0.03979797288775444, -0.032914288341999054, -0.011855324730277061, ...]
  • 19. Takeaway: Your embeddings strategy depends on your accuracy, cost, and use case needs.
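As a rough illustration of the Level 3 strategy, the sketch below stores each chunk with both chunking metadata (section, paragraph position) and non-chunking metadata (author), then filters on metadata before ranking by vector similarity. The record layout, field names, three-dimensional vectors, and the `search` helper are all hypothetical; a vector database would do the filtering and ranking for you.

```python
import math

def cosine(a, b):
    # Cosine similarity between two dense vectors of equal length.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Hypothetical records: text, a tiny made-up vector, and metadata fields.
records = [
    {"text": "growth outlook", "vector": [0.9, 0.1, 0.0],
     "section": "Forecast", "paragraph": 1, "author": "finance-team"},
    {"text": "listing rules", "vector": [0.2, 0.9, 0.1],
     "section": "Compliance", "paragraph": 4, "author": "legal-team"},
    {"text": "revenue risks", "vector": [0.7, 0.3, 0.0],
     "section": "Risk Factors", "paragraph": 2, "author": "legal-team"},
]

def search(query_vector, records, k=1, **filters):
    # Keep only records whose metadata matches every filter, then rank.
    candidates = [r for r in records
                  if all(r.get(field) == value for field, value in filters.items())]
    ranked = sorted(candidates,
                    key=lambda r: cosine(query_vector, r["vector"]),
                    reverse=True)
    return ranked[:k]

query = [1.0, 0.0, 0.0]
unfiltered = search(query, records, k=1)
legal_only = search(query, records, k=1, author="legal-team")
print(unfiltered[0]["text"], "|", legal_only[0]["text"])
```

The same query returns a different best chunk once the `author` filter is applied, which is the practical payoff of carrying metadata alongside the embedding.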
  • 20. 03 Chunking
  • 21. Chunking Considerations: Chunk size; chunk overlap; character splitters.
  • 22. Examples: Chunk Size=50, Overlap=0
  • 23. Examples: Chunk Size=128, Overlap=20
  • 24. Examples: Chunk Size=256, Overlap=50
  • 25. Examples: SemanticChunker
  • 26. How Does Your Data Look? Conversation data; documentation data; lecture or Q/A data.
  • 27. Takeaway: Your chunking strategy depends on what your data looks like and what you need from it.
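To see what chunk size and chunk overlap do, here is a minimal fixed-size character splitter. It assumes sizes are measured in characters and is a plain-Python sketch, not the LangChain splitter used in the talk; `chunk_text` is a hypothetical name.

```python
def chunk_text(text, chunk_size, overlap):
    """Split text into windows of chunk_size characters that share
    `overlap` characters with the previous window."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks

sample = "abcdefghij"
no_overlap = chunk_text(sample, chunk_size=5, overlap=0)
with_overlap = chunk_text(sample, chunk_size=5, overlap=2)
print(no_overlap)
print(with_overlap)
```

With no overlap the chunks tile the text exactly; with overlap each chunk repeats the tail of the previous one, which helps keep a sentence that straddles a boundary retrievable at the cost of storing more vectors.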
  • 28. Demo!
  • 29. Questions? Give Milvus a Star! Chat with me on Discord!
  • 30. Milvus Architecture (diagram): SDK and load balancer; access layer (proxies); coordinator service (root, query, data, index); worker nodes (query node, data node, index node); meta storage (etcd); message storage (log broker); object storage (MinIO / S3 / Azure Blob) holding log snapshots, delta files, and index files. Flows shown: DDL/DCL, DML, notification, control signal.