SlideShare a Scribd company logo
Yujian Tang | Zilliz
Beyond RAG: Vector
Databases
Yujian Tang
Senior Developer Advocate, Zilliz
yujian@zilliz.com
https://www.linkedin.com/in/yujiantang
https://www.twitter.com/yujian_tang
Speaker
01 Why Vector Databases?
CONTENTS
03
04 Vector Database Architecture
02 How Do Vector Databases Work?
Use Cases
01 Why Vector Databases?
Compare data that you couldn’t compare before
Unstructured Data is Everywhere
Unstructured data is any data that does not conform to a predefined data model.
By 2025, IDC estimates there will be 175 zettabytes of data globally
(that's 175 with 21 zeros), with 80% of that data being unstructured.
Currently, 90% of unstructured data is never analyzed.
Text Images Video and more!
Vector
Databases
Unstructured Data + ML = Vector Magic
Find Semantically Similar Data
Apple made profits of $97 Billion in 2023
I like to eat apple pie for profit in 2023
Apple’s bottom line increased by record numbers in 2023
But wait! There’s more!
Use math to quantify relationships between
entities
02 How Do Vector Databases Work?
Vector similarity is a mathematical measure of
how close two vectors are
Semantic Similarity
Image from Sutor et al
Woman = [0.3, 0.4]
Queen = [0.3, 0.9]
King = [0.5, 0.7]
Woman = [0.3, 0.4]
Queen = [0.3, 0.9]
King = [0.5, 0.7]
Man = [0.5, 0.2]
Queen - Woman + Man = King
Queen = [0.3, 0.9]
- Woman = [0.3, 0.4]
[0.0, 0.5]
+ Man = [0.5, 0.2]
King = [0.5, 0.7]
Man = [0.5, 0.2]
Similarity metrics are ways to measure distance in
vector space
Vector Similarity Metric: L2 (Euclidean)
Queen = [0.3, 0.9]
King = [0.5, 0.7]
d(Queen, King) = √(0.3-0.5)2
+ (0.9-0.7)2
= √(0.2)2
+ (0.2)2
= √0.04 + 0.04
= √0.08 ≅ 0.28
Vector Similarity Metric: Inner Product (IP)
Queen = [0.3, 0.9]
King = [0.5, 0.7]
Queen · King = (0.3*0.5) + (0.9*0.7)
= 0.15 + 0.63 = 0.78
Queen = [0.3, 0.9]
King = [0.5, 0.7]
Vector Similarity Metric: Cosine
𝚹
cos(Queen, King) = (0.3*0.5)+(0.9*0.7)
√0.32
+0.92
* √0.52
+0.72
= 0.15+0.63 _
√0.9 * √0.74
= 0.78 _
√0.666
≅ 0.03
Vector Similarity Metrics
Euclidean - Spatial distance
Cosine - Orientational distance
Inner Product - Both
With normalized vectors, IP = Cosine
Indexes organize the way we access our data
Inverted File Index
Source:
https://towardsdatascience.com/similarity-search-with-ivfpq-9c6348fd4db3
Hierarchical Navigable Small Worlds (HNSW)
Source:
https://arxiv.org/ftp/arxiv/papers/1603/1603.09320.pdf
Scalar Quantization (SQ)
Product Quantization
Source:
https://towardsdatascience.com/product-quantization-for-similarity-search-2f1f67c5fddd
Indexes Overview
- IVF = Intuitive, medium memory, performant
- HNSW = Graph based, high memory, highly performant
- Flat = brute force
- SQ = bucketize across one dimension, accuracy x
memory tradeoff
- PQ = bucketize across two dimensions, more accuracy x
memory tradeoff
Vector databases efficiently store, index, and
relate entities by a quantitative value
03 Use Cases
What Does Vector Data Look Like?
RAG
RAG
Inject your data via a vector
database like Milvus/Zilliz
Query LLM
Milvus
Your Data
Primary Use Case
● Factual Recall
● Forced Data Injection
● Cost Optimization
Common AI Use Cases
LLM Augmented Retrieval
Expand LLMs' knowledge by
incorporating external data sources
into LLMs and your AI applications.
Match user behavior or content
features with other similar
behaviors or features to make
effective recommendations.
Recommender System
Search for semantically similar
texts across vast amounts of
natural language documents.
Text/ Semantic Search
Image Similarity Search
Identify and search for visually
similar images or objects from a
vast collection of image libraries.
Video Similarity Search
Search for similar videos, scenes,
or objects from extensive
collections of video libraries.
Audio Similarity Search
Find similar audios from massive
amounts of audio data to perform
tasks such as genre classification,
or recognize speech.
Molecular Similarity Search
Search for similar substructures,
superstructures, and other
structures for a specific molecule.
Question Answering System
Interactive QA chatbot that
automatically answers user
questions
Multimodal Similarity Search
Search over multiple types of data
simultaneously, e.g. text and
images
Example Use Case
Example Use Case
Example Use Case
04 Vector Database Architecture
Why Not Use a SQL/NoSQL Database?
● Inefficiency in High-dimensional spaces
● Suboptimal Indexing
● Inadequate query support
● Lack of scalability
● Limited analytics capabilities
● Data conversion issues
TL;DR: Vector operations are too computationally intensive for traditional
database infrastructures
Why Not Use a Vector Search Library?
● Have to manually implement filtering
● Not optimized to take advantage of the latest hardware
● Unable to handle large scale data
● Lack of lifecycle management
● Inefficient indexing capabilities
● No built in safety mechanisms
TL;DR: Vector search libraries lack the infrastructure to help you scale,
deploy, and manage your apps in production.
What is Milvus/Zilliz ideal for?
○ Advanced filtering
○ Hybrid search
○ Durability and backups
○ Replications/High Availability
○ Sharding
○ Aggregations
○ Lifecycle management
○ Multi-tenancy
○ High query load
○ High insertion/deletion
○ Full precision/recall
○ Accelerator support (GPU,
FPGA)
○ Billion-scale storage
Purpose-built to store, index and query vector embeddings from unstructured data at scale.
Meta Storage
Root Query Data Index
Coordinator Service
Proxy
Proxy
etcd
Log Broker
SDK
Load Balancer
DDL/DCL
DML
NOTIFICATION
CONTROL SIGNAL
Object Storage
Minio / S3 / AzureBlob
Log Snapshot Delta File Index File
Worker Node QUERY DATA DATA
Message Storage
Access Layer
Query Node Data Node Index Node
High-level overview of Milvus’ Architecture
Start building
with Zilliz Cloud today!
zilliz.com/cloud
| © Copyright 9/25/23 Zilliz
39
Appendix
Important Notes
- Cosine, IP, and L2 are all the SAME rank order.
- They differ in use case
- L2 for when you need magnitude
- Cosine for orientation
- IP for magnitude and orientation
- OR
- Cosine = IP for normalized vectors
Embeddings Models
Basic Idea
You want to use your data with a large language
model
RAG vs Fine Tuning
LLM
Fine Tuning
Augment an LLM by training it on
your data
Your Data
“New” LLM
Query
Primary Use Case
● Style transfer
Takeaway
Use RAG to force the LLM to work with your data by injecting it via a
vector database like Milvus or Zilliz
Chunking Considerations
Chunk Size
Chunk Overlap
Character Splitters
How Does Your Data Look?
Conversation
Data
Documentation Data Lecture or Q/A
Data
Examples
Examples
Examples
Examples
Examples
Examples
Your chunking strategy depends on what your data looks
like and what you need from it.
Takeaway:
Examining Embeddings
Picking a model
What to embed
Metadata
Embeddings Strategies
Level 1: Embedding Chunks Directly
Level 2: Embedding Sub and Super Chunks
Level 3: Incorporating Chunking and Non-Chunking Metadata
Metadata Examples
Chunking
● Paragraph position
● Section header
● Larger paragraph
● Sentence Number
● …
Non-Chunking
● Author
● Publisher
● Organization
● Role Based Access Control
● …
Your embeddings strategy depends on your accuracy,
cost, and use case needs
Takeaway:
Basic Idea
Vector Databases provide the ability to inject your data via
semantic similarity
Considerations include: scale, performance, and flexibility
Milvus Architecture: Differentiation
1. Cloud Native, Distributed System Architecture
2. True Separation of Concerns
3. Scalable Index Creation Strategy with 512 MB Segments
Vector Databases are purpose-built to handle
indexing, storing, and querying vector data.
Milvus & Zilliz are specifically designed for high
performance and billion+ scale use cases.
Takeaway:
Vector Database Resources
Give Milvus a Star! Chat with me on Discord!
Get Started Free
Got questions? Stop by our booth!
Milvus
Open Source
Self-Managed
github.com/milvus-io/milvus
Zilliz Cloud
SaaS
Fully-Managed
zilliz.com/cloud

More Related Content

What's hot

Introduction to Knowledge Graphs and Semantic AI
Introduction to Knowledge Graphs and Semantic AIIntroduction to Knowledge Graphs and Semantic AI
Introduction to Knowledge Graphs and Semantic AI
Semantic Web Company
 
RDF and OWL
RDF and OWLRDF and OWL
RDF and OWL
Rachel Lovinger
 
Introduction of Knowledge Graphs
Introduction of Knowledge GraphsIntroduction of Knowledge Graphs
Introduction of Knowledge Graphs
Jeff Z. Pan
 
Data Mesh at CMC Markets: Past, Present and Future
Data Mesh at CMC Markets: Past, Present and FutureData Mesh at CMC Markets: Past, Present and Future
Data Mesh at CMC Markets: Past, Present and Future
Lorenzo Nicora
 
Beyond the Hype: What Generative AI Means for the Future of Work - Damien Cum...
Beyond the Hype: What Generative AI Means for the Future of Work - Damien Cum...Beyond the Hype: What Generative AI Means for the Future of Work - Damien Cum...
Beyond the Hype: What Generative AI Means for the Future of Work - Damien Cum...
NUS-ISS
 
Time to Talk about Data Mesh
Time to Talk about Data MeshTime to Talk about Data Mesh
Time to Talk about Data Mesh
LibbySchulze
 
Knowledge Graphs and Generative AI
Knowledge Graphs and Generative AIKnowledge Graphs and Generative AI
Knowledge Graphs and Generative AI
Neo4j
 
Introduction to Azure Data Factory
Introduction to Azure Data FactoryIntroduction to Azure Data Factory
Introduction to Azure Data Factory
Slava Kokaev
 
You Need a Data Catalog. Do You Know Why?
You Need a Data Catalog. Do You Know Why?You Need a Data Catalog. Do You Know Why?
You Need a Data Catalog. Do You Know Why?
Precisely
 
Loading your Life into a Vector Database
Loading your Life into a Vector DatabaseLoading your Life into a Vector Database
Loading your Life into a Vector Database
Ben Church
 
Empower Splunk and other SIEMs with the Databricks Lakehouse for Cybersecurity
Empower Splunk and other SIEMs with the Databricks Lakehouse for CybersecurityEmpower Splunk and other SIEMs with the Databricks Lakehouse for Cybersecurity
Empower Splunk and other SIEMs with the Databricks Lakehouse for Cybersecurity
Databricks
 
Understanding GenAI/LLM and What is Google Offering - Felix Goh
Understanding GenAI/LLM and What is Google Offering - Felix GohUnderstanding GenAI/LLM and What is Google Offering - Felix Goh
Understanding GenAI/LLM and What is Google Offering - Felix Goh
NUS-ISS
 
Databricks on AWS.pptx
Databricks on AWS.pptxDatabricks on AWS.pptx
Databricks on AWS.pptx
Wasm1953
 
Conceptual vs. Logical vs. Physical Data Modeling
Conceptual vs. Logical vs. Physical Data ModelingConceptual vs. Logical vs. Physical Data Modeling
Conceptual vs. Logical vs. Physical Data Modeling
DATAVERSITY
 
Introduction to Azure Synapse Webinar
Introduction to Azure Synapse WebinarIntroduction to Azure Synapse Webinar
Introduction to Azure Synapse Webinar
Peter Ward
 
Learn to Use Databricks for the Full ML Lifecycle
Learn to Use Databricks for the Full ML LifecycleLearn to Use Databricks for the Full ML Lifecycle
Learn to Use Databricks for the Full ML Lifecycle
Databricks
 
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
DataScienceConferenc1
 
Snowflake: The Good, the Bad, and the Ugly
Snowflake: The Good, the Bad, and the UglySnowflake: The Good, the Bad, and the Ugly
Snowflake: The Good, the Bad, and the Ugly
Tyler Wishnoff
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
Databricks
 
Enterprise guide to building a Data Mesh
Enterprise guide to building a Data MeshEnterprise guide to building a Data Mesh
Enterprise guide to building a Data Mesh
Sion Smith
 

What's hot (20)

Introduction to Knowledge Graphs and Semantic AI
Introduction to Knowledge Graphs and Semantic AIIntroduction to Knowledge Graphs and Semantic AI
Introduction to Knowledge Graphs and Semantic AI
 
RDF and OWL
RDF and OWLRDF and OWL
RDF and OWL
 
Introduction of Knowledge Graphs
Introduction of Knowledge GraphsIntroduction of Knowledge Graphs
Introduction of Knowledge Graphs
 
Data Mesh at CMC Markets: Past, Present and Future
Data Mesh at CMC Markets: Past, Present and FutureData Mesh at CMC Markets: Past, Present and Future
Data Mesh at CMC Markets: Past, Present and Future
 
Beyond the Hype: What Generative AI Means for the Future of Work - Damien Cum...
Beyond the Hype: What Generative AI Means for the Future of Work - Damien Cum...Beyond the Hype: What Generative AI Means for the Future of Work - Damien Cum...
Beyond the Hype: What Generative AI Means for the Future of Work - Damien Cum...
 
Time to Talk about Data Mesh
Time to Talk about Data MeshTime to Talk about Data Mesh
Time to Talk about Data Mesh
 
Knowledge Graphs and Generative AI
Knowledge Graphs and Generative AIKnowledge Graphs and Generative AI
Knowledge Graphs and Generative AI
 
Introduction to Azure Data Factory
Introduction to Azure Data FactoryIntroduction to Azure Data Factory
Introduction to Azure Data Factory
 
You Need a Data Catalog. Do You Know Why?
You Need a Data Catalog. Do You Know Why?You Need a Data Catalog. Do You Know Why?
You Need a Data Catalog. Do You Know Why?
 
Loading your Life into a Vector Database
Loading your Life into a Vector DatabaseLoading your Life into a Vector Database
Loading your Life into a Vector Database
 
Empower Splunk and other SIEMs with the Databricks Lakehouse for Cybersecurity
Empower Splunk and other SIEMs with the Databricks Lakehouse for CybersecurityEmpower Splunk and other SIEMs with the Databricks Lakehouse for Cybersecurity
Empower Splunk and other SIEMs with the Databricks Lakehouse for Cybersecurity
 
Understanding GenAI/LLM and What is Google Offering - Felix Goh
Understanding GenAI/LLM and What is Google Offering - Felix GohUnderstanding GenAI/LLM and What is Google Offering - Felix Goh
Understanding GenAI/LLM and What is Google Offering - Felix Goh
 
Databricks on AWS.pptx
Databricks on AWS.pptxDatabricks on AWS.pptx
Databricks on AWS.pptx
 
Conceptual vs. Logical vs. Physical Data Modeling
Conceptual vs. Logical vs. Physical Data ModelingConceptual vs. Logical vs. Physical Data Modeling
Conceptual vs. Logical vs. Physical Data Modeling
 
Introduction to Azure Synapse Webinar
Introduction to Azure Synapse WebinarIntroduction to Azure Synapse Webinar
Introduction to Azure Synapse Webinar
 
Learn to Use Databricks for the Full ML Lifecycle
Learn to Use Databricks for the Full ML LifecycleLearn to Use Databricks for the Full ML Lifecycle
Learn to Use Databricks for the Full ML Lifecycle
 
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Draga...
 
Snowflake: The Good, the Bad, and the Ugly
Snowflake: The Good, the Bad, and the UglySnowflake: The Good, the Bad, and the Ugly
Snowflake: The Good, the Bad, and the Ugly
 
Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4Data Lakehouse Symposium | Day 4
Data Lakehouse Symposium | Day 4
 
Enterprise guide to building a Data Mesh
Enterprise guide to building a Data MeshEnterprise guide to building a Data Mesh
Enterprise guide to building a Data Mesh
 

Similar to Beyond Retrieval Augmented Generation (RAG): Vector Databases

Data Warehouse Design and Best Practices
Data Warehouse Design and Best PracticesData Warehouse Design and Best Practices
Data Warehouse Design and Best Practices
Ivo Andreev
 
Spsbepoelmanssharepointbigdataclean 150421080105-conversion-gate02
Spsbepoelmanssharepointbigdataclean 150421080105-conversion-gate02Spsbepoelmanssharepointbigdataclean 150421080105-conversion-gate02
Spsbepoelmanssharepointbigdataclean 150421080105-conversion-gate02
BIWUG
 
How to build your own Delve: combining machine learning, big data and SharePoint
How to build your own Delve: combining machine learning, big data and SharePointHow to build your own Delve: combining machine learning, big data and SharePoint
How to build your own Delve: combining machine learning, big data and SharePoint
Joris Poelmans
 
Using OBIEE and Data Vault to Virtualize Your BI Environment: An Agile Approach
Using OBIEE and Data Vault to Virtualize Your BI Environment: An Agile ApproachUsing OBIEE and Data Vault to Virtualize Your BI Environment: An Agile Approach
Using OBIEE and Data Vault to Virtualize Your BI Environment: An Agile Approach
Kent Graziano
 
Global AI Bootcamp Madrid - Azure Databricks
Global AI Bootcamp Madrid - Azure DatabricksGlobal AI Bootcamp Madrid - Azure Databricks
Global AI Bootcamp Madrid - Azure Databricks
Alberto Diaz Martin
 
Big Data Expo 2015 - Barnsten Why Data Modelling is Essential
Big Data Expo 2015 - Barnsten Why Data Modelling is EssentialBig Data Expo 2015 - Barnsten Why Data Modelling is Essential
Big Data Expo 2015 - Barnsten Why Data Modelling is Essential
BigDataExpo
 
Steps towards business intelligence
Steps towards business intelligenceSteps towards business intelligence
Steps towards business intelligence
Ahsan Kabir
 
Ai & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientistAi & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientist
Alberto Diaz Martin
 
Data Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricData Mesh using Microsoft Fabric
Data Mesh using Microsoft Fabric
Nathan Bijnens
 
06-20-2024-AI Camp Meetup-Unstructured Data and Vector Databases
06-20-2024-AI Camp Meetup-Unstructured Data and Vector Databases06-20-2024-AI Camp Meetup-Unstructured Data and Vector Databases
06-20-2024-AI Camp Meetup-Unstructured Data and Vector Databases
Timothy Spann
 
What is Data as a Service by T-Mobile Principle Technical PM
What is Data as a Service by T-Mobile Principle Technical PMWhat is Data as a Service by T-Mobile Principle Technical PM
What is Data as a Service by T-Mobile Principle Technical PM
Product School
 
Mastering Customer Data on Apache Spark
Mastering Customer Data on Apache SparkMastering Customer Data on Apache Spark
Mastering Customer Data on Apache Spark
Caserta
 
Managing Large Amounts of Data with Salesforce
Managing Large Amounts of Data with SalesforceManaging Large Amounts of Data with Salesforce
Managing Large Amounts of Data with Salesforce
Sense Corp
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)
James Serra
 
NOSQL
NOSQLNOSQL
2014.11.14 Data Opportunities with Azure
2014.11.14 Data Opportunities with Azure2014.11.14 Data Opportunities with Azure
2014.11.14 Data Opportunities with Azure
Marco Parenzan
 
Data Warehousing Trends, Best Practices, and Future Outlook
Data Warehousing Trends, Best Practices, and Future OutlookData Warehousing Trends, Best Practices, and Future Outlook
Data Warehousing Trends, Best Practices, and Future Outlook
James Serra
 
Alex mang patterns for scalability in microsoft azure application
Alex mang   patterns for scalability in microsoft azure applicationAlex mang   patterns for scalability in microsoft azure application
Alex mang patterns for scalability in microsoft azure application
Codecamp Romania
 
Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)
Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)
Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)
Trivadis
 
What Your Database Query is Really Doing
What Your Database Query is Really DoingWhat Your Database Query is Really Doing
What Your Database Query is Really Doing
Dave Stokes
 

Similar to Beyond Retrieval Augmented Generation (RAG): Vector Databases (20)

Data Warehouse Design and Best Practices
Data Warehouse Design and Best PracticesData Warehouse Design and Best Practices
Data Warehouse Design and Best Practices
 
Spsbepoelmanssharepointbigdataclean 150421080105-conversion-gate02
Spsbepoelmanssharepointbigdataclean 150421080105-conversion-gate02Spsbepoelmanssharepointbigdataclean 150421080105-conversion-gate02
Spsbepoelmanssharepointbigdataclean 150421080105-conversion-gate02
 
How to build your own Delve: combining machine learning, big data and SharePoint
How to build your own Delve: combining machine learning, big data and SharePointHow to build your own Delve: combining machine learning, big data and SharePoint
How to build your own Delve: combining machine learning, big data and SharePoint
 
Using OBIEE and Data Vault to Virtualize Your BI Environment: An Agile Approach
Using OBIEE and Data Vault to Virtualize Your BI Environment: An Agile ApproachUsing OBIEE and Data Vault to Virtualize Your BI Environment: An Agile Approach
Using OBIEE and Data Vault to Virtualize Your BI Environment: An Agile Approach
 
Global AI Bootcamp Madrid - Azure Databricks
Global AI Bootcamp Madrid - Azure DatabricksGlobal AI Bootcamp Madrid - Azure Databricks
Global AI Bootcamp Madrid - Azure Databricks
 
Big Data Expo 2015 - Barnsten Why Data Modelling is Essential
Big Data Expo 2015 - Barnsten Why Data Modelling is EssentialBig Data Expo 2015 - Barnsten Why Data Modelling is Essential
Big Data Expo 2015 - Barnsten Why Data Modelling is Essential
 
Steps towards business intelligence
Steps towards business intelligenceSteps towards business intelligence
Steps towards business intelligence
 
Ai & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientistAi & Data Analytics 2018 - Azure Databricks for data scientist
Ai & Data Analytics 2018 - Azure Databricks for data scientist
 
Data Mesh using Microsoft Fabric
Data Mesh using Microsoft FabricData Mesh using Microsoft Fabric
Data Mesh using Microsoft Fabric
 
06-20-2024-AI Camp Meetup-Unstructured Data and Vector Databases
06-20-2024-AI Camp Meetup-Unstructured Data and Vector Databases06-20-2024-AI Camp Meetup-Unstructured Data and Vector Databases
06-20-2024-AI Camp Meetup-Unstructured Data and Vector Databases
 
What is Data as a Service by T-Mobile Principle Technical PM
What is Data as a Service by T-Mobile Principle Technical PMWhat is Data as a Service by T-Mobile Principle Technical PM
What is Data as a Service by T-Mobile Principle Technical PM
 
Mastering Customer Data on Apache Spark
Mastering Customer Data on Apache SparkMastering Customer Data on Apache Spark
Mastering Customer Data on Apache Spark
 
Managing Large Amounts of Data with Salesforce
Managing Large Amounts of Data with SalesforceManaging Large Amounts of Data with Salesforce
Managing Large Amounts of Data with Salesforce
 
Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)Data Lakehouse, Data Mesh, and Data Fabric (r1)
Data Lakehouse, Data Mesh, and Data Fabric (r1)
 
NOSQL
NOSQLNOSQL
NOSQL
 
2014.11.14 Data Opportunities with Azure
2014.11.14 Data Opportunities with Azure2014.11.14 Data Opportunities with Azure
2014.11.14 Data Opportunities with Azure
 
Data Warehousing Trends, Best Practices, and Future Outlook
Data Warehousing Trends, Best Practices, and Future OutlookData Warehousing Trends, Best Practices, and Future Outlook
Data Warehousing Trends, Best Practices, and Future Outlook
 
Alex mang patterns for scalability in microsoft azure application
Alex mang   patterns for scalability in microsoft azure applicationAlex mang   patterns for scalability in microsoft azure application
Alex mang patterns for scalability in microsoft azure application
 
Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)
Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)
Azure Days 2019: Business Intelligence auf Azure (Marco Amhof & Yves Mauron)
 
What Your Database Query is Really Doing
What Your Database Query is Really DoingWhat Your Database Query is Really Doing
What Your Database Query is Really Doing
 

More from Zilliz

How CXAI Toolkit uses RAG for Intelligent Q&A
How CXAI Toolkit uses RAG for Intelligent Q&AHow CXAI Toolkit uses RAG for Intelligent Q&A
How CXAI Toolkit uses RAG for Intelligent Q&A
Zilliz
 
Multimodal Embeddings (continued) - South Bay Meetup Slides
Multimodal Embeddings (continued) - South Bay Meetup SlidesMultimodal Embeddings (continued) - South Bay Meetup Slides
Multimodal Embeddings (continued) - South Bay Meetup Slides
Zilliz
 
Ensuring Secure and Permission-Aware RAG Deployments
Ensuring Secure and Permission-Aware RAG DeploymentsEnsuring Secure and Permission-Aware RAG Deployments
Ensuring Secure and Permission-Aware RAG Deployments
Zilliz
 
Retrieval Augmented Generation Evaluation with Ragas
Retrieval Augmented Generation Evaluation with RagasRetrieval Augmented Generation Evaluation with Ragas
Retrieval Augmented Generation Evaluation with Ragas
Zilliz
 
Scaling Vector Search: How Milvus Handles Billions+
Scaling Vector Search: How Milvus Handles Billions+Scaling Vector Search: How Milvus Handles Billions+
Scaling Vector Search: How Milvus Handles Billions+
Zilliz
 
Garbage In, Garbage Out: Why poor data curation is killing your AI models (an...
Garbage In, Garbage Out: Why poor data curation is killing your AI models (an...Garbage In, Garbage Out: Why poor data curation is killing your AI models (an...
Garbage In, Garbage Out: Why poor data curation is killing your AI models (an...
Zilliz
 
It's your unstructured data: How to get your GenAI app to production (and spe...
It's your unstructured data: How to get your GenAI app to production (and spe...It's your unstructured data: How to get your GenAI app to production (and spe...
It's your unstructured data: How to get your GenAI app to production (and spe...
Zilliz
 
The History of Embeddings & Multimodal Embeddings
The History of Embeddings & Multimodal EmbeddingsThe History of Embeddings & Multimodal Embeddings
The History of Embeddings & Multimodal Embeddings
Zilliz
 
Using LLM Agents with Llama 3, LangGraph and Milvus
Using LLM Agents with Llama 3, LangGraph and MilvusUsing LLM Agents with Llama 3, LangGraph and Milvus
Using LLM Agents with Llama 3, LangGraph and Milvus
Zilliz
 
How Vector Databases are Revolutionizing Unstructured Data Search in AI Appli...
How Vector Databases are Revolutionizing Unstructured Data Search in AI Appli...How Vector Databases are Revolutionizing Unstructured Data Search in AI Appli...
How Vector Databases are Revolutionizing Unstructured Data Search in AI Appli...
Zilliz
 
Tirana Tech Meetup - Agentic RAG with Milvus, Llama3 and Ollama
Tirana Tech Meetup - Agentic RAG with Milvus, Llama3 and OllamaTirana Tech Meetup - Agentic RAG with Milvus, Llama3 and Ollama
Tirana Tech Meetup - Agentic RAG with Milvus, Llama3 and Ollama
Zilliz
 
ASIMOV: Enterprise RAG at Dialog Axiata PLC
ASIMOV: Enterprise RAG at Dialog Axiata PLCASIMOV: Enterprise RAG at Dialog Axiata PLC
ASIMOV: Enterprise RAG at Dialog Axiata PLC
Zilliz
 
Metadata Lakes for Next-Gen AI/ML - Datastrato
Metadata Lakes for Next-Gen AI/ML - DatastratoMetadata Lakes for Next-Gen AI/ML - Datastrato
Metadata Lakes for Next-Gen AI/ML - Datastrato
Zilliz
 
Multimodal Retrieval Augmented Generation (RAG) with Milvus
Multimodal Retrieval Augmented Generation (RAG) with MilvusMultimodal Retrieval Augmented Generation (RAG) with Milvus
Multimodal Retrieval Augmented Generation (RAG) with Milvus
Zilliz
 
Building an Agentic RAG locally with Ollama and Milvus
Building an Agentic RAG locally with Ollama and MilvusBuilding an Agentic RAG locally with Ollama and Milvus
Building an Agentic RAG locally with Ollama and Milvus
Zilliz
 
Specializing Small Language Models With Less Data
Specializing Small Language Models With Less DataSpecializing Small Language Models With Less Data
Specializing Small Language Models With Less Data
Zilliz
 
Occiglot - Open Language Models by and for Europe
Occiglot - Open Language Models by and for EuropeOcciglot - Open Language Models by and for Europe
Occiglot - Open Language Models by and for Europe
Zilliz
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
Zilliz
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
Zilliz
 

More from Zilliz (20)

How CXAI Toolkit uses RAG for Intelligent Q&A
How CXAI Toolkit uses RAG for Intelligent Q&AHow CXAI Toolkit uses RAG for Intelligent Q&A
How CXAI Toolkit uses RAG for Intelligent Q&A
 
Multimodal Embeddings (continued) - South Bay Meetup Slides
Multimodal Embeddings (continued) - South Bay Meetup SlidesMultimodal Embeddings (continued) - South Bay Meetup Slides
Multimodal Embeddings (continued) - South Bay Meetup Slides
 
Ensuring Secure and Permission-Aware RAG Deployments
Ensuring Secure and Permission-Aware RAG DeploymentsEnsuring Secure and Permission-Aware RAG Deployments
Ensuring Secure and Permission-Aware RAG Deployments
 
Retrieval Augmented Generation Evaluation with Ragas
Retrieval Augmented Generation Evaluation with RagasRetrieval Augmented Generation Evaluation with Ragas
Retrieval Augmented Generation Evaluation with Ragas
 
Scaling Vector Search: How Milvus Handles Billions+
Scaling Vector Search: How Milvus Handles Billions+Scaling Vector Search: How Milvus Handles Billions+
Scaling Vector Search: How Milvus Handles Billions+
 
Garbage In, Garbage Out: Why poor data curation is killing your AI models (an...
Garbage In, Garbage Out: Why poor data curation is killing your AI models (an...Garbage In, Garbage Out: Why poor data curation is killing your AI models (an...
Garbage In, Garbage Out: Why poor data curation is killing your AI models (an...
 
It's your unstructured data: How to get your GenAI app to production (and spe...
It's your unstructured data: How to get your GenAI app to production (and spe...It's your unstructured data: How to get your GenAI app to production (and spe...
It's your unstructured data: How to get your GenAI app to production (and spe...
 
The History of Embeddings & Multimodal Embeddings
The History of Embeddings & Multimodal EmbeddingsThe History of Embeddings & Multimodal Embeddings
The History of Embeddings & Multimodal Embeddings
 
Using LLM Agents with Llama 3, LangGraph and Milvus
Using LLM Agents with Llama 3, LangGraph and MilvusUsing LLM Agents with Llama 3, LangGraph and Milvus
Using LLM Agents with Llama 3, LangGraph and Milvus
 
How Vector Databases are Revolutionizing Unstructured Data Search in AI Appli...
How Vector Databases are Revolutionizing Unstructured Data Search in AI Appli...How Vector Databases are Revolutionizing Unstructured Data Search in AI Appli...
How Vector Databases are Revolutionizing Unstructured Data Search in AI Appli...
 
Tirana Tech Meetup - Agentic RAG with Milvus, Llama3 and Ollama
Tirana Tech Meetup - Agentic RAG with Milvus, Llama3 and OllamaTirana Tech Meetup - Agentic RAG with Milvus, Llama3 and Ollama
Tirana Tech Meetup - Agentic RAG with Milvus, Llama3 and Ollama
 
ASIMOV: Enterprise RAG at Dialog Axiata PLC
ASIMOV: Enterprise RAG at Dialog Axiata PLCASIMOV: Enterprise RAG at Dialog Axiata PLC
ASIMOV: Enterprise RAG at Dialog Axiata PLC
 
Metadata Lakes for Next-Gen AI/ML - Datastrato
Metadata Lakes for Next-Gen AI/ML - DatastratoMetadata Lakes for Next-Gen AI/ML - Datastrato
Metadata Lakes for Next-Gen AI/ML - Datastrato
 
Multimodal Retrieval Augmented Generation (RAG) with Milvus
Multimodal Retrieval Augmented Generation (RAG) with MilvusMultimodal Retrieval Augmented Generation (RAG) with Milvus
Multimodal Retrieval Augmented Generation (RAG) with Milvus
 
Building an Agentic RAG locally with Ollama and Milvus
Building an Agentic RAG locally with Ollama and MilvusBuilding an Agentic RAG locally with Ollama and Milvus
Building an Agentic RAG locally with Ollama and Milvus
 
Specializing Small Language Models With Less Data
Specializing Small Language Models With Less DataSpecializing Small Language Models With Less Data
Specializing Small Language Models With Less Data
 
Occiglot - Open Language Models by and for Europe
Occiglot - Open Language Models by and for EuropeOcciglot - Open Language Models by and for Europe
Occiglot - Open Language Models by and for Europe
 
Fueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte WebinarFueling AI with Great Data with Airbyte Webinar
Fueling AI with Great Data with Airbyte Webinar
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
 
Generating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and MilvusGenerating privacy-protected synthetic data using Secludy and Milvus
Generating privacy-protected synthetic data using Secludy and Milvus
 

Recently uploaded

Keynote : AI & Future Of Offensive Security
Keynote : AI & Future Of Offensive SecurityKeynote : AI & Future Of Offensive Security
Keynote : AI & Future Of Offensive Security
Priyanka Aash
 
Keynote : Presentation on SASE Technology
Keynote : Presentation on SASE TechnologyKeynote : Presentation on SASE Technology
Keynote : Presentation on SASE Technology
Priyanka Aash
 
AMD Zen 5 Architecture Deep Dive from Tech Day
AMD Zen 5 Architecture Deep Dive from Tech DayAMD Zen 5 Architecture Deep Dive from Tech Day
AMD Zen 5 Architecture Deep Dive from Tech Day
Low Hong Chuan
 
Mule Experience Hub and Release Channel with Java 17
Mule Experience Hub and Release Channel with Java 17Mule Experience Hub and Release Channel with Java 17
Mule Experience Hub and Release Channel with Java 17
Bhajan Mehta
 
"Hands-on development experience using wasm Blazor", Furdak Vladyslav.pptx
"Hands-on development experience using wasm Blazor", Furdak Vladyslav.pptx"Hands-on development experience using wasm Blazor", Furdak Vladyslav.pptx
"Hands-on development experience using wasm Blazor", Furdak Vladyslav.pptx
Fwdays
 
DefCamp_2016_Chemerkin_Yury_--_publish.pdf
DefCamp_2016_Chemerkin_Yury_--_publish.pdfDefCamp_2016_Chemerkin_Yury_--_publish.pdf
DefCamp_2016_Chemerkin_Yury_--_publish.pdf
Yury Chemerkin
 
UiPath Community Day Amsterdam: Code, Collaborate, Connect
UiPath Community Day Amsterdam: Code, Collaborate, ConnectUiPath Community Day Amsterdam: Code, Collaborate, Connect
UiPath Community Day Amsterdam: Code, Collaborate, Connect
UiPathCommunity
 
FIDO Munich Seminar Introduction to FIDO.pptx
FIDO Munich Seminar Introduction to FIDO.pptxFIDO Munich Seminar Introduction to FIDO.pptx
FIDO Munich Seminar Introduction to FIDO.pptx
FIDO Alliance
 
Self-Healing Test Automation Framework - Healenium
Self-Healing Test Automation Framework - HealeniumSelf-Healing Test Automation Framework - Healenium
Self-Healing Test Automation Framework - Healenium
Knoldus Inc.
 
Enterprise_Mobile_Security_Forum_2013.pdf
Enterprise_Mobile_Security_Forum_2013.pdfEnterprise_Mobile_Security_Forum_2013.pdf
Enterprise_Mobile_Security_Forum_2013.pdf
Yury Chemerkin
 
Increase Quality with User Access Policies - July 2024
Increase Quality with User Access Policies - July 2024Increase Quality with User Access Policies - July 2024
Increase Quality with User Access Policies - July 2024
Peter Caitens
 
Cracking AI Black Box - Strategies for Customer-centric Enterprise Excellence
Cracking AI Black Box - Strategies for Customer-centric Enterprise ExcellenceCracking AI Black Box - Strategies for Customer-centric Enterprise Excellence
Cracking AI Black Box - Strategies for Customer-centric Enterprise Excellence
Quentin Reul
 
FIDO Munich Seminar: Strong Workforce Authn Push & Pull Factors.pptx
FIDO Munich Seminar: Strong Workforce Authn Push & Pull Factors.pptxFIDO Munich Seminar: Strong Workforce Authn Push & Pull Factors.pptx
FIDO Munich Seminar: Strong Workforce Authn Push & Pull Factors.pptx
FIDO Alliance
 
Generative AI Reasoning Tech Talk - July 2024
Generative AI Reasoning Tech Talk - July 2024Generative AI Reasoning Tech Talk - July 2024
Generative AI Reasoning Tech Talk - July 2024
siddu769252
 
What's New in Copilot for Microsoft 365 June 2024.pptx
What's New in Copilot for Microsoft 365 June 2024.pptxWhat's New in Copilot for Microsoft 365 June 2024.pptx
What's New in Copilot for Microsoft 365 June 2024.pptx
Stephanie Beckett
 
NVIDIA at Breakthrough Discuss for Space Exploration
NVIDIA at Breakthrough Discuss for Space ExplorationNVIDIA at Breakthrough Discuss for Space Exploration
NVIDIA at Breakthrough Discuss for Space Exploration
Alison B. Lowndes
 
Camunda Chapter NY Meetup July 2024.pptx
Camunda Chapter NY Meetup July 2024.pptxCamunda Chapter NY Meetup July 2024.pptx
Camunda Chapter NY Meetup July 2024.pptx
ZachWylie3
 
What's New in Teams Calling, Meetings, Devices June 2024
What's New in Teams Calling, Meetings, Devices June 2024What's New in Teams Calling, Meetings, Devices June 2024
What's New in Teams Calling, Meetings, Devices June 2024
Stephanie Beckett
 
Top 12 AI Technology Trends For 2024.pdf
Top 12 AI Technology Trends For 2024.pdfTop 12 AI Technology Trends For 2024.pdf
Top 12 AI Technology Trends For 2024.pdf
Marrie Morris
 
FIDO Munich Seminar In-Vehicle Payment Trends.pptx
FIDO Munich Seminar In-Vehicle Payment Trends.pptxFIDO Munich Seminar In-Vehicle Payment Trends.pptx
FIDO Munich Seminar In-Vehicle Payment Trends.pptx
FIDO Alliance
 

Recently uploaded (20)

Keynote : AI & Future Of Offensive Security
Keynote : AI & Future Of Offensive SecurityKeynote : AI & Future Of Offensive Security
Keynote : AI & Future Of Offensive Security
 
Keynote : Presentation on SASE Technology
Keynote : Presentation on SASE TechnologyKeynote : Presentation on SASE Technology
Keynote : Presentation on SASE Technology
 
AMD Zen 5 Architecture Deep Dive from Tech Day
AMD Zen 5 Architecture Deep Dive from Tech DayAMD Zen 5 Architecture Deep Dive from Tech Day
AMD Zen 5 Architecture Deep Dive from Tech Day
 
Mule Experience Hub and Release Channel with Java 17
Mule Experience Hub and Release Channel with Java 17Mule Experience Hub and Release Channel with Java 17
Mule Experience Hub and Release Channel with Java 17
 
"Hands-on development experience using wasm Blazor", Furdak Vladyslav.pptx
"Hands-on development experience using wasm Blazor", Furdak Vladyslav.pptx"Hands-on development experience using wasm Blazor", Furdak Vladyslav.pptx
"Hands-on development experience using wasm Blazor", Furdak Vladyslav.pptx
 
DefCamp_2016_Chemerkin_Yury_--_publish.pdf
DefCamp_2016_Chemerkin_Yury_--_publish.pdfDefCamp_2016_Chemerkin_Yury_--_publish.pdf
DefCamp_2016_Chemerkin_Yury_--_publish.pdf
 
UiPath Community Day Amsterdam: Code, Collaborate, Connect
UiPath Community Day Amsterdam: Code, Collaborate, ConnectUiPath Community Day Amsterdam: Code, Collaborate, Connect
UiPath Community Day Amsterdam: Code, Collaborate, Connect
 
FIDO Munich Seminar Introduction to FIDO.pptx
FIDO Munich Seminar Introduction to FIDO.pptxFIDO Munich Seminar Introduction to FIDO.pptx
FIDO Munich Seminar Introduction to FIDO.pptx
 
Self-Healing Test Automation Framework - Healenium
Self-Healing Test Automation Framework - HealeniumSelf-Healing Test Automation Framework - Healenium
Self-Healing Test Automation Framework - Healenium
 
Enterprise_Mobile_Security_Forum_2013.pdf
Enterprise_Mobile_Security_Forum_2013.pdfEnterprise_Mobile_Security_Forum_2013.pdf
Enterprise_Mobile_Security_Forum_2013.pdf
 
Increase Quality with User Access Policies - July 2024
Increase Quality with User Access Policies - July 2024Increase Quality with User Access Policies - July 2024
Increase Quality with User Access Policies - July 2024
 
Cracking AI Black Box - Strategies for Customer-centric Enterprise Excellence
Cracking AI Black Box - Strategies for Customer-centric Enterprise ExcellenceCracking AI Black Box - Strategies for Customer-centric Enterprise Excellence
Cracking AI Black Box - Strategies for Customer-centric Enterprise Excellence
 
FIDO Munich Seminar: Strong Workforce Authn Push & Pull Factors.pptx
FIDO Munich Seminar: Strong Workforce Authn Push & Pull Factors.pptxFIDO Munich Seminar: Strong Workforce Authn Push & Pull Factors.pptx
FIDO Munich Seminar: Strong Workforce Authn Push & Pull Factors.pptx
 
Generative AI Reasoning Tech Talk - July 2024
Generative AI Reasoning Tech Talk - July 2024Generative AI Reasoning Tech Talk - July 2024
Generative AI Reasoning Tech Talk - July 2024
 
What's New in Copilot for Microsoft 365 June 2024.pptx
What's New in Copilot for Microsoft 365 June 2024.pptxWhat's New in Copilot for Microsoft 365 June 2024.pptx
What's New in Copilot for Microsoft 365 June 2024.pptx
 
NVIDIA at Breakthrough Discuss for Space Exploration
NVIDIA at Breakthrough Discuss for Space ExplorationNVIDIA at Breakthrough Discuss for Space Exploration
NVIDIA at Breakthrough Discuss for Space Exploration
 
Camunda Chapter NY Meetup July 2024.pptx
Camunda Chapter NY Meetup July 2024.pptxCamunda Chapter NY Meetup July 2024.pptx
Camunda Chapter NY Meetup July 2024.pptx
 
What's New in Teams Calling, Meetings, Devices June 2024
What's New in Teams Calling, Meetings, Devices June 2024What's New in Teams Calling, Meetings, Devices June 2024
What's New in Teams Calling, Meetings, Devices June 2024
 
Top 12 AI Technology Trends For 2024.pdf
Top 12 AI Technology Trends For 2024.pdfTop 12 AI Technology Trends For 2024.pdf
Top 12 AI Technology Trends For 2024.pdf
 
FIDO Munich Seminar In-Vehicle Payment Trends.pptx
FIDO Munich Seminar In-Vehicle Payment Trends.pptxFIDO Munich Seminar In-Vehicle Payment Trends.pptx
FIDO Munich Seminar In-Vehicle Payment Trends.pptx
 

Beyond Retrieval Augmented Generation (RAG): Vector Databases

  • 1. Yujian Tang | Zilliz Beyond RAG: Vector Databases
  • 2. Yujian Tang Senior Developer Advocate, Zilliz yujian@zilliz.com https://www.linkedin.com/in/yujiantang https://www.twitter.com/yujian_tang Speaker
  • 3. 01 Why Vector Databases? CONTENTS 03 04 Vector Database Architecture 02 How Do Vector Databases Work? Use Cases
  • 4. 01 Why Vector Databases?
  • 5. Compare data that you couldn’t compare before
  • 6. Unstructured Data is Everywhere Unstructured data is any data that does not conform to a predefined data model. By 2025, IDC estimates there will be 175 zettabytes of data globally (that's 175 with 21 zeros), with 80% of that data being unstructured. Currently, 90% of unstructured data is never analyzed. Text Images Video and more!
  • 8. Find Semantically Similar Data Apple made profits of $97 Billion in 2023 I like to eat apple pie for profit in 2023 Apple’s bottom line increased by record numbers in 2023
  • 10. Use math to quantify relationships between entities
  • 11. 02 How Do Vector Databases Work?
  • 12. Vector similarity is a mathematical measure of how close two vectors are
  • 13. Semantic Similarity Image from Sutor et al Woman = [0.3, 0.4] Queen = [0.3, 0.9] King = [0.5, 0.7] Woman = [0.3, 0.4] Queen = [0.3, 0.9] King = [0.5, 0.7] Man = [0.5, 0.2] Queen - Woman + Man = King Queen = [0.3, 0.9] - Woman = [0.3, 0.4] [0.0, 0.5] + Man = [0.5, 0.2] King = [0.5, 0.7] Man = [0.5, 0.2]
  • 14. Similarity metrics are ways to measure distance in vector space
  • 15. Vector Similarity Metric: L2 (Euclidean) Queen = [0.3, 0.9] King = [0.5, 0.7] d(Queen, King) = √(0.3-0.5)2 + (0.9-0.7)2 = √(0.2)2 + (0.2)2 = √0.04 + 0.04 = √0.08 ≅ 0.28
  • 16. Vector Similarity Metric: Inner Product (IP) Queen = [0.3, 0.9] King = [0.5, 0.7] Queen · King = (0.3*0.5) + (0.9*0.7) = 0.15 + 0.63 = 0.78
  • 17. Queen = [0.3, 0.9] King = [0.5, 0.7] Vector Similarity Metric: Cosine 𝚹 cos(Queen, King) = (0.3*0.5)+(0.9*0.7) √0.32 +0.92 * √0.52 +0.72 = 0.15+0.63 _ √0.9 * √0.74 = 0.78 _ √0.666 ≅ 0.03
  • 18. Vector Similarity Metrics Euclidean - Spatial distance Cosine - Orientational distance Inner Product - Both With normalized vectors, IP = Cosine
  • 19. Indexes organize the way we access our data
  • 21. Hierarchical Navigable Small Worlds (HNSW) Source: https://arxiv.org/ftp/arxiv/papers/1603/1603.09320.pdf
  • 24. Indexes Overview - IVF = Intuitive, medium memory, performant - HNSW = Graph based, high memory, highly performant - Flat = brute force - SQ = bucketize across one dimension, accuracy x memory tradeoff - PQ = bucketize across two dimensions, more accuracy x memory tradeoff
  • 25. Vector databases efficiently store, index, and relate entities by a quantitative value
  • 27. What Does Vector Data Look Like?
  • 28. RAG RAG Inject your data via a vector database like Milvus/Zilliz Query LLM Milvus Your Data Primary Use Case ● Factual Recall ● Forced Data Injection ● Cost Optimization
  • 29. Common AI Use Cases LLM Augmented Retrieval Expand LLMs' knowledge by incorporating external data sources into LLMs and your AI applications. Match user behavior or content features with other similar behaviors or features to make effective recommendations. Recommender System Search for semantically similar texts across vast amounts of natural language documents. Text/ Semantic Search Image Similarity Search Identify and search for visually similar images or objects from a vast collection of image libraries. Video Similarity Search Search for similar videos, scenes, or objects from extensive collections of video libraries. Audio Similarity Search Find similar audios from massive amounts of audio data to perform tasks such as genre classification, or recognize speech. Molecular Similarity Search Search for similar substructures, superstructures, and other structures for a specific molecule. Question Answering System Interactive QA chatbot that automatically answers user questions Multimodal Similarity Search Search over multiple types of data simultaneously, e.g. text and images
  • 33. 04 Vector Database Architecture
  • 34. Why Not Use a SQL/NoSQL Database? ● Inefficiency in High-dimensional spaces ● Suboptimal Indexing ● Inadequate query support ● Lack of scalability ● Limited analytics capabilities ● Data conversion issues TL;DR: Vector operations are too computationally intensive for traditional database infrastructures
  • 35. Why Not Use a Vector Search Library? ● Have to manually implement filtering ● Not optimized to take advantage of the latest hardware ● Unable to handle large scale data ● Lack of lifecycle management ● Inefficient indexing capabilities ● No built in safety mechanisms TL;DR: Vector search libraries lack the infrastructure to help you scale, deploy, and manage your apps in production.
  • 36. What is Milvus/Zilliz ideal for? ○ Advanced filtering ○ Hybrid search ○ Durability and backups ○ Replications/High Availability ○ Sharding ○ Aggregations ○ Lifecycle management ○ Multi-tenancy ○ High query load ○ High insertion/deletion ○ Full precision/recall ○ Accelerator support (GPU, FPGA) ○ Billion-scale storage Purpose-built to store, index and query vector embeddings from unstructured data at scale.
  • 37. Meta Storage Root Query Data Index Coordinator Service Proxy Proxy etcd Log Broker SDK Load Balancer DDL/DCL DML NOTIFICATION CONTROL SIGNAL Object Storage Minio / S3 / AzureBlob Log Snapshot Delta File Index File Worker Node QUERY DATA DATA Message Storage Access Layer Query Node Data Node Index Node High-level overview of Milvus’ Architecture
  • 38. Start building with Zilliz Cloud today! zilliz.com/cloud
  • 39. | © Copyright 9/25/23 Zilliz 39 Appendix
  • 40. Important Notes - Cosine, IP, and L2 are all the SAME rank order. - They differ in use case - L2 for when you need magnitude - Cosine for orientation - IP for magnitude and orientation - OR - Cosine = IP for normalized vectors
  • 42. Basic Idea You want to use your data with a large language model
  • 43. RAG vs Fine Tuning LLM Fine Tuning Augment an LLM by training it on your data Your Data “New” LLM Query Primary Use Case ● Style transfer
  • 44. Takeaway Use RAG to force the LLM to work with your data by injecting it via a vector database like Milvus or Zilliz
  • 45. Chunking Considerations Chunk Size Chunk Overlap Character Splitters
  • 46. How Does Your Data Look? Conversation Data Documentation Data Lecture or Q/A Data
  • 53. Your chunking strategy depends on what your data looks like and what you need from it. Takeaway:
  • 54. Examining Embeddings Picking a model What to embed Metadata
  • 55. Embeddings Strategies Level 1: Embedding Chunks Directly Level 2: Embedding Sub and Super Chunks Level 3: Incorporating Chunking and Non-Chunking Metadata
  • 56. Metadata Examples Chunking ● Paragraph position ● Section header ● Larger paragraph ● Sentence Number ● … Non-Chunking ● Author ● Publisher ● Organization ● Role Based Access Control ● …
  • 57. Your embeddings strategy depends on your accuracy, cost, and use case needs Takeaway:
  • 58. Basic Idea Vector Databases provide the ability to inject your data via semantic similarity Considerations include: scale, performance, and flexibility
  • 59. Milvus Architecture: Differentiation 1. Cloud Native, Distributed System Architecture 2. True Separation of Concerns 3. Scalable Index Creation Strategy with 512 MB Segments
  • 60. Vector Databases are purpose-built to handle indexing, storing, and querying vector data. Milvus & Zilliz are specifically designed for high performance and billion+ scale use cases. Takeaway:
  • 61. Vector Database Resources Give Milvus a Star! Chat with me on Discord!
  • 62. Get Started Free Got questions? Stop by our booth! Milvus Open Source Self-Managed github.com/milvus-io/milvus Zilliz Cloud SaaS Fully-Managed zilliz.com/cloud