Zing Database – Distributed Key-Value Database
Nguyễn Quang Nam, Zing Web-Technical Team
Content
- Why
- Introduction
- Overview architecture
- Single Server/Storage
- Distribution
Some statistics:
- Feeds: 1.6 B, 700 GB of disk across 4 DB instances, 8 caching servers, 136 GB of memory cache in use
- User profiles: 44.5 M registered accounts, 2 database instances, 30 GB of memory cache
- Comments: 350 M, 50 GB of disk across 2 DB instances, 20 GB of memory cache
Access time (by Jeff Dean)
Operation                                   Latency
L1 cache reference                          0.5 ns
Branch mispredict                           5 ns
L2 cache reference                          7 ns
Mutex lock/unlock                           100 ns
Main memory reference                       100 ns
Compress 1K bytes with Zippy                10,000 ns
Send 2K bytes over 1 Gbps network           20,000 ns
Read 1 MB sequentially from memory          250,000 ns
Round trip within same datacenter           500,000 ns
Disk seek                                   10,000,000 ns
Read 1 MB sequentially from network         10,000,000 ns
Read 1 MB sequentially from disk            30,000,000 ns
Send packet CA->Netherlands->CA             150,000,000 ns
Standard & Real Requirement
- Time to load a page: < 200 ms
- Read data rate: ~12K ops/sec
- Write data rate: ~8K ops/sec
- Caching service/database recovery time: < 5 mins
Existing solutions
- RDBMS (MySQL, MSSQL): writes are too slow; reads are so-so with a small DB and very slow with a huge DB
- Cassandra (by Facebook): difficult to operate and maintain, and performance is not so good
- HBase/Hadoop: we use this for our log system
- MongoDB, Membase, Tokyo Tyrant, ...: OK; we use these in several cases, but they are not suitable for all
Overview architecture
ZNonblockingServer
- Based on TNonblockingServer (Apache Thrift)
- 185K reqs/sec (the original TNonblockingServer reaches just 45K reqs/sec)
- Serializes/deserializes data
- Prevents server overload
- Data is not secured in transit
- Protects the service from invalid requests
ICache
- Least Recently Used / time-based expiration strategy
- zlru_table<key_type, value_type>: hash table data structure
- Custom malloc/free functions replace the standard glibc malloc/free to reduce memory fragmentation
- Supports dirty-item marking, for lazy DB flush
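The combination of LRU eviction, time-based expiration, and dirty-item marking can be sketched as follows. This is only an illustration in Python, not the actual zlru_table implementation; the class name `LruCache`, the `flush` callback, and the injectable `clock` are assumptions made for this example.

```python
import time
from collections import OrderedDict


class LruCache:
    """Illustrative LRU cache with TTL expiry and dirty-item marking."""

    def __init__(self, capacity, ttl_seconds=None, flush=None, clock=time.monotonic):
        self.capacity = capacity
        self.ttl = ttl_seconds
        # Hypothetical callback that persists one item to the database.
        self.flush = flush or (lambda key, value: None)
        self.clock = clock
        self.items = OrderedDict()  # key -> [value, stored_at, dirty]

    def get(self, key):
        entry = self.items.get(key)
        if entry is None:
            return None
        if self.ttl is not None and self.clock() - entry[1] > self.ttl:
            self._evict(key)  # expired: drop it (flushing first if dirty)
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return entry[0]

    def put(self, key, value):
        # New or updated items are dirty until they reach the database.
        self.items[key] = [value, self.clock(), True]
        self.items.move_to_end(key)
        while len(self.items) > self.capacity:
            self._evict(next(iter(self.items)))  # least recently used first

    def _evict(self, key):
        value, _, dirty = self.items.pop(key)
        if dirty:
            self.flush(key, value)  # never lose an unflushed write

    def flush_dirty(self):
        """Lazy flush: write every dirty item and return how many were written."""
        count = 0
        for key, entry in self.items.items():
            if entry[2]:
                self.flush(key, entry[0])
                entry[2] = False
                count += 1
        return count
```

A background thread could call `flush_dirty()` periodically, so that writes hit the database in batches rather than on every `put`.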
ZiDB
- Separated into DataFile & IndexFile
- 1 seek for a read, 1-2 seeks for a write
- IndexFile (hash structure) is loaded into memory as a memory-mapped file (shared memory) to reduce system calls
- Write-ahead log to avoid data loss
- Data magic-padding
- Checksum & checkpoint for data repair
- DB partitioning for easier maintenance
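A minimal sketch of the data-file-plus-index idea: records are appended to a data file, an in-memory hash index maps each key to its record offset, and a read costs a single seek followed by a checksum check. The record layout, the `MAGIC` value, and the `TinyStore` name are assumptions for this example, not the real ZiDB format; a Python dict stands in for the memory-mapped IndexFile, and the write-ahead log is omitted.

```python
import io
import struct
import zlib

MAGIC = 0x5A694442  # arbitrary record marker chosen for this sketch
HEADER = struct.Struct("<IIII")  # magic, key length, value length, CRC32 of value


class TinyStore:
    """Append-only data file with an in-memory hash index."""

    def __init__(self, data_file):
        self.data = data_file  # any seekable binary file object
        self.index = {}        # key (bytes) -> offset of its latest record

    def put(self, key, value):
        self.data.seek(0, io.SEEK_END)
        offset = self.data.tell()
        self.data.write(HEADER.pack(MAGIC, len(key), len(value), zlib.crc32(value)))
        self.data.write(key)
        self.data.write(value)
        self.index[key] = offset  # older records for this key become garbage

    def get(self, key):
        offset = self.index.get(key)
        if offset is None:
            return None
        self.data.seek(offset)  # the single seek for a read
        magic, key_len, value_len, crc = HEADER.unpack(self.data.read(HEADER.size))
        if magic != MAGIC:
            raise ValueError("record marker missing: index or data file is damaged")
        stored_key = self.data.read(key_len)
        value = self.data.read(value_len)
        if stored_key != key or zlib.crc32(value) != crc:
            raise ValueError("checksum mismatch: record is corrupt")
        return value
```

For example, `TinyStore(open("data.bin", "w+b"))` would back the store with a real file, while `TinyStore(io.BytesIO())` is convenient for experiments.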
Key requirements:
- Scalability
- Load balance
- Availability
- Consistency
2 Models:
- Centralized: 1 addressing server & multiple storage servers => bottleneck & single point of failure
- Peer-to-peer: each server includes an addressing module & storage
2 Types of routing:
- Client routing: each client does the addressing itself and queries the data
- Server routing: the addressing is done at the server
Operation Flows
[Diagram: Business Logic Server, Addressing Server (DHT), and a Storage Layer of Storage Nodes 1..N, each with ICache, ZiDB, and a Storage Module]
(1) Request key locations
(2) Key locations
(3) Get & Set operations
(4) Operation returns
* The addressing module is moved into each storage node in the peer-to-peer model
Addressing:
- Provides key locations of resources
- Basically a Distributed Hash Table, using consistent hashing
- Hashing: Jenkins, Murmur, or any algorithm that satisfies two conditions:
  - Uniform distribution of generated keys in the key space
  - Consistency
  (MD5 and SHA are poor choices because of their performance cost)
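As an illustration of the kind of hash meant here, the sketch below implements Bob Jenkins' one-at-a-time hash in Python and maps a key to an ID in the key space. The slides do not say which Jenkins variant is used; the `key_id` helper and the ring size are assumptions for this example.

```python
def jenkins_one_at_a_time(key: bytes) -> int:
    """Bob Jenkins' one-at-a-time hash, returning a 32-bit unsigned integer."""
    h = 0
    for byte in key:
        h = (h + byte) & 0xFFFFFFFF
        h = (h + (h << 10)) & 0xFFFFFFFF
        h ^= h >> 6
    h = (h + (h << 3)) & 0xFFFFFFFF
    h ^= h >> 11
    h = (h + (h << 15)) & 0xFFFFFFFF
    return h


def key_id(key: str, ring_size: int = 2 ** 32) -> int:
    """Map a key to an ID in [0, ring_size); ring_size is a hypothetical parameter."""
    return jenkins_one_at_a_time(key.encode("utf-8")) % ring_size


print(hex(jenkins_one_at_a_time(b"a")))  # 0xca2e9442
```

The same key always produces the same ID (consistency), and the mixing steps spread nearby keys across the key space.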
Addressing - Node location: Each node is assigned a contiguous range of IDs (hashed keys)
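With contiguous ranges, locating the node for a hashed key reduces to a binary search over the range boundaries. The sketch below assumes each node owns all IDs up to an inclusive upper bound; the `RangeTable` name and the sample bounds are made up for illustration.

```python
import bisect


class RangeTable:
    """Maps a hashed key ID to the node that owns its contiguous ID range."""

    def __init__(self, upper_bounds, nodes):
        # upper_bounds[i] is the last ID (inclusive) owned by nodes[i].
        if len(upper_bounds) != len(nodes) or list(upper_bounds) != sorted(upper_bounds):
            raise ValueError("bounds must be sorted and match the node list")
        self.upper_bounds = list(upper_bounds)
        self.nodes = list(nodes)

    def locate(self, key_id):
        # First range whose upper bound is >= key_id.
        i = bisect.bisect_left(self.upper_bounds, key_id)
        if i == len(self.nodes):
            raise KeyError(f"ID {key_id} is outside the key space")
        return self.nodes[i]


table = RangeTable([99, 199, 299], ["node-A", "node-B", "node-C"])
print(table.locate(150))  # node-B
```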
Addressing - Node location: Golden ratio principle (a/b = (a+b)/a)
- Init ratio = 1.618
- Max ratio ~ 2.6
- Easy to implement
- Easy for routing from client
Addressing - Node location: Virtual nodes
- Each real server has multiple virtual nodes on the ring
- More virtual nodes give better load balance
- Hard to maintain the table of nodes
- Example assignment of virtual nodes 1-9: Server 1: 1,2,3; Server 2: 4,5,6,7; Server 3: 8,9
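A minimal sketch of a consistent-hashing ring with virtual nodes. Each server is hashed onto the ring many times, and a key belongs to the first virtual node at or after its hash, wrapping around at the end. `zlib.crc32` is used only as a convenient stand-in hash, and the class name and the `vnodes_per_server` parameter are assumptions for this example.

```python
import bisect
import zlib


def ring_hash(text: str) -> int:
    # Stand-in hash for the sketch; a Jenkins or Murmur hash would also work.
    return zlib.crc32(text.encode("utf-8"))


class VirtualNodeRing:
    """Consistent-hashing ring where each server owns many virtual nodes."""

    def __init__(self, servers, vnodes_per_server=100):
        points = []
        for server in servers:
            for i in range(vnodes_per_server):
                points.append((ring_hash(f"{server}#{i}"), server))
        points.sort()
        self.points = [point for point, _ in points]
        self.owners = [server for _, server in points]

    def locate(self, key: str) -> str:
        # First virtual node clockwise from the key's hash, wrapping around.
        i = bisect.bisect_left(self.points, ring_hash(key))
        return self.owners[i % len(self.owners)]
```

When a server is added, a key either keeps its previous owner or moves to the new server, so only a fraction of the data has to be relocated. The cost is the larger node table noted on the slide.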
Addressing – Multi-layer rings
- Store the change history of the system
- Provide availability/reconfigurability
- Able to put a node on a ring manually
* Write: data is located on the highest ring
* Read: data is located on the highest ring, then on lower rings if not found
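The read and write rules for multi-layer rings can be sketched as below. Each ring is modelled as a plain function from key to node name and each node's storage as a dict; both are simplifications assumed for this example, as is the `LayeredRings` name.

```python
class LayeredRings:
    """Rings ordered oldest first; the last ring is the highest (newest) layout."""

    def __init__(self, rings, stores):
        self.rings = rings    # list of functions: key -> node name
        self.stores = stores  # node name -> dict holding that node's data

    def write(self, key, value):
        # Writes always go to the node chosen by the highest ring.
        node = self.rings[-1](key)
        self.stores[node][key] = value
        return node

    def read(self, key):
        # Try the highest ring first, then fall back to older layouts.
        for ring in reversed(self.rings):
            node = ring(key)
            if key in self.stores[node]:
                return self.stores[node][key]
        return None


# Old layout: everything on A. New layout: keys starting with "x" move to B.
old_ring = lambda key: "A"
new_ring = lambda key: "B" if key.startswith("x") else "A"
stores = {"A": {"x1": "written before the change"}, "B": {}}
rings = LayeredRings([old_ring, new_ring], stores)
print(rings.read("x1"))  # written before the change
```

Because older rings stay available for reads, data written under a previous layout remains reachable without an immediate migration.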
Replication & Backup
- Each node has one primary range of IDs and some secondary ranges of IDs
- Each real node needs a backup instance to replace it in case it goes down
* Data is queried from the primary node, then from secondary nodes
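The read rule (primary first, then secondaries) can be sketched as follows. The slides do not describe how writes reach the secondaries, so the synchronous write-to-all in `replicated_put` is purely an assumption, as are the `Node` class and both function names.

```python
class Node:
    """Toy storage node; `up` simulates whether the node is reachable."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}

    def get(self, key):
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        return self.data.get(key)


def replicated_put(key, value, primary, secondaries):
    # Assumption for this sketch: write to every reachable replica.
    for node in [primary, *secondaries]:
        if node.up:
            node.data[key] = value


def replicated_get(key, primary, secondaries):
    # Query the primary first, then each secondary in turn.
    for node in [primary, *secondaries]:
        try:
            value = node.get(key)
        except ConnectionError:
            continue  # node is down: try the next replica
        if value is not None:
            return value
    return None


primary, backup = Node("n1"), Node("n2")
replicated_put("user:1", "alice", primary, [backup])
primary.up = False
print(replicated_get("user:1", primary, [backup]))  # alice
```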
Configuration: finding the best parameters to configure the DB, or choosing a suitable DB type.
- How many reads/writes per second?
- Variation in data length: are records about the same length, or do they differ widely?
- Is data updated or deleted?
- How important is the data: is loss acceptable or not?
- Can old data be recycled?
Q & A Contact: Nguyễn Quang Nam [email_address]

Choosing the Best Outlook OST to PST Converter: Key Features and Considerations
NVIDIA at Breakthrough Discuss for Space Exploration
NVIDIA at Breakthrough Discuss for Space ExplorationNVIDIA at Breakthrough Discuss for Space Exploration
NVIDIA at Breakthrough Discuss for Space Exploration
How UiPath Discovery Suite supports identification of Agentic Process Automat...
How UiPath Discovery Suite supports identification of Agentic Process Automat...How UiPath Discovery Suite supports identification of Agentic Process Automat...
How UiPath Discovery Suite supports identification of Agentic Process Automat...
Retrieval Augmented Generation Evaluation with Ragas
Retrieval Augmented Generation Evaluation with RagasRetrieval Augmented Generation Evaluation with Ragas
Retrieval Augmented Generation Evaluation with Ragas
Discovery Series - Zero to Hero - Task Mining Session 1
Discovery Series - Zero to Hero - Task Mining Session 1Discovery Series - Zero to Hero - Task Mining Session 1
Discovery Series - Zero to Hero - Task Mining Session 1
FIDO Munich Seminar Introduction to FIDO.pptx
FIDO Munich Seminar Introduction to FIDO.pptxFIDO Munich Seminar Introduction to FIDO.pptx
FIDO Munich Seminar Introduction to FIDO.pptx
FIDO Munich Seminar: FIDO Tech Principles.pptx
FIDO Munich Seminar: FIDO Tech Principles.pptxFIDO Munich Seminar: FIDO Tech Principles.pptx
FIDO Munich Seminar: FIDO Tech Principles.pptx
TrustArc Webinar - Innovating with TRUSTe Responsible AI Certification
TrustArc Webinar - Innovating with TRUSTe Responsible AI CertificationTrustArc Webinar - Innovating with TRUSTe Responsible AI Certification
TrustArc Webinar - Innovating with TRUSTe Responsible AI Certification
Garbage In, Garbage Out: Why poor data curation is killing your AI models (an...
Garbage In, Garbage Out: Why poor data curation is killing your AI models (an...Garbage In, Garbage Out: Why poor data curation is killing your AI models (an...
Garbage In, Garbage Out: Why poor data curation is killing your AI models (an...
FIDO Munich Seminar: Securing Smart Car.pptx
FIDO Munich Seminar: Securing Smart Car.pptxFIDO Munich Seminar: Securing Smart Car.pptx
FIDO Munich Seminar: Securing Smart Car.pptx
Zaitechno Handheld Raman Spectrometer.pdf
Zaitechno Handheld Raman Spectrometer.pdfZaitechno Handheld Raman Spectrometer.pdf
Zaitechno Handheld Raman Spectrometer.pdf
Scaling Vector Search: How Milvus Handles Billions+
Scaling Vector Search: How Milvus Handles Billions+Scaling Vector Search: How Milvus Handles Billions+
Scaling Vector Search: How Milvus Handles Billions+
It's your unstructured data: How to get your GenAI app to production (and spe...
It's your unstructured data: How to get your GenAI app to production (and spe...It's your unstructured data: How to get your GenAI app to production (and spe...
It's your unstructured data: How to get your GenAI app to production (and spe...
FIDO Munich Seminar: Strong Workforce Authn Push & Pull Factors.pptx
FIDO Munich Seminar: Strong Workforce Authn Push & Pull Factors.pptxFIDO Munich Seminar: Strong Workforce Authn Push & Pull Factors.pptx
FIDO Munich Seminar: Strong Workforce Authn Push & Pull Factors.pptx
Finetuning GenAI For Hacking and Defending
Finetuning GenAI For Hacking and DefendingFinetuning GenAI For Hacking and Defending
Finetuning GenAI For Hacking and Defending
Exchange, Entra ID, Conectores, RAML: Todo, a la vez, en todas partes
Exchange, Entra ID, Conectores, RAML: Todo, a la vez, en todas partesExchange, Entra ID, Conectores, RAML: Todo, a la vez, en todas partes
Exchange, Entra ID, Conectores, RAML: Todo, a la vez, en todas partes
Redefining Cybersecurity with AI Capabilities
Redefining Cybersecurity with AI CapabilitiesRedefining Cybersecurity with AI Capabilities
Redefining Cybersecurity with AI Capabilities
Increase Quality with User Access Policies - July 2024
Increase Quality with User Access Policies - July 2024Increase Quality with User Access Policies - July 2024
Increase Quality with User Access Policies - July 2024

Zing Database – Distributed Key-Value Database

  • 1. Zing Database – Distributed Key-Value Database Nguyễn Quang Nam Zing Web-Technical Team
  • 2. Content: 1. Introduction 2. Why 3. Overview architecture 4. Single Server/Storage 5. Distribution
  • 4. Some statistics: - Feeds: 1.6 B, 700 GB hard drive in 4 DB instances, 8 caching servers, 136 GB memory cache in use. - User Profiles: 44.5 M registered accounts, 2 database instances, 30 GB memory cache. - Comments: 350 M, 50 GB hard drive in 2 DB instances, 20 GB memory cache
  • 5. Why
  • 6. Access time
      L1 cache reference                            0.5 ns
      Branch mispredict                               5 ns
      L2 cache reference                              7 ns
      Mutex lock/unlock                             100 ns
      Main memory reference                         100 ns
      Compress 1K bytes with Zippy               10,000 ns
      Send 2K bytes over 1 Gbps network          20,000 ns
      Read 1 MB sequentially from memory        250,000 ns
      Round trip within same datacenter         500,000 ns
      Disk seek                              10,000,000 ns
      Read 1 MB sequentially from network    10,000,000 ns
      Read 1 MB sequentially from disk       30,000,000 ns
      Send packet CA->Netherlands->CA       150,000,000 ns
      by Jeff Dean (
  • 7. Standard & Real Requirement - Time to load a page < 200 ms - Read data rate ~12K ops/sec - Write data rate ~8K ops/sec - Caching service/Database recovery time < 5 mins
  • 8. Existing solutions - RDBMS (MySQL, MSSQL): writes are too slow; reads are mediocre with a small DB and very poor with a huge DB - Cassandra (by Facebook): difficult to operate and maintain, and performance is not very good - HBase/Hadoop: we use this for the log system - MongoDB, Membase, Tokyo Tyrant, ...: OK; we use these in several cases, but they are not suitable for all
  • 12. ZNonblockingServer - Based on TNonblockingServer (Apache Thrift) - 185K reqs/sec (the original TNonblockingServer reaches only 45K reqs/sec) - Serializes/deserializes data - Prevents server overload - Data is not secured in transit - Protects the service from invalid requests
  • 13. ICache - Least Recently Used/time-based expiration strategy - zlru_table<key_type, value_type>: hash table data structure - Custom malloc/free functions replace the standard glibc malloc/free to reduce memory fragmentation - Supports dirty-item marking => for lazy DB flush
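The eviction and lazy-flush behaviour listed on the ICache slide can be sketched roughly as follows. This is a minimal illustration, not the actual zlru_table: the class name `LruCache`, the `flush` callback, and the injectable `clock` are all assumptions made for the example, and the custom allocator is not modelled.

```python
import time
from collections import OrderedDict


class LruCache:
    """Hypothetical LRU + time-based cache with dirty-item marking."""

    def __init__(self, capacity, ttl_seconds, flush, clock=time.monotonic):
        self.capacity = capacity
        self.ttl = ttl_seconds
        self.flush = flush          # assumed callback(key, value) that writes to the DB
        self.clock = clock          # injectable so expiry can be tested without sleeping
        self.items = OrderedDict()  # key -> [value, expires_at, dirty]

    def get(self, key):
        entry = self.items.get(key)
        if entry is None:
            return None
        if entry[1] <= self.clock():    # time-based expiration
            self._evict(key)
            return None
        self.items.move_to_end(key)     # mark as most recently used
        return entry[0]

    def put(self, key, value):
        # New or updated items are marked dirty; the DB write is deferred.
        self.items[key] = [value, self.clock() + self.ttl, True]
        self.items.move_to_end(key)
        while len(self.items) > self.capacity:
            self._evict(next(iter(self.items)))   # least recently used first

    def flush_dirty(self):
        """Lazy DB flush: write every dirty item, then clear its dirty flag."""
        count = 0
        for key, entry in self.items.items():
            if entry[2]:
                self.flush(key, entry[0])
                entry[2] = False
                count += 1
        return count

    def _evict(self, key):
        value, _, dirty = self.items.pop(key)
        if dirty:                       # never drop an unflushed write
            self.flush(key, value)
```

A caller would invoke `flush_dirty()` periodically (for example from a background thread), so that writes reach the database in batches rather than one by one.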
  • 14. ZiDB - Separated into DataFile & IndexFile - 1 seek for a read, 1-2 seeks for a write - IndexFile (hash structure) is memory-mapped (shared memory) to reduce system calls - Write-ahead log to avoid data loss - Data magic-padding - Checksum & checkpoint for data repair - Partitioned DB for easier maintenance
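To make the DataFile/IndexFile split and the "1 seek for a read" point concrete, here is a small sketch under stated assumptions. It is not ZiDB's real on-disk format: `TinyStore`, the 8-byte `HEADER` layout, and the use of a plain dict as the in-memory index are hypothetical, and the write-ahead log, magic padding, checkpoints, and partitioning are left out.

```python
import io
import struct
import zlib

# Assumed record header: payload length and CRC32 of the payload.
HEADER = struct.Struct("<II")


class TinyStore:
    """Hypothetical data file plus in-memory hash index with checksums."""

    def __init__(self, data_file):
        self.data = data_file   # binary file object opened for reading and writing
        self.index = {}         # key -> offset of the latest record in the data file

    def put(self, key, value):
        self.data.seek(0, io.SEEK_END)      # records are appended to the data file
        offset = self.data.tell()
        self.data.write(HEADER.pack(len(value), zlib.crc32(value)) + value)
        self.index[key] = offset            # index update happens in memory

    def get(self, key):
        offset = self.index.get(key)
        if offset is None:
            return None
        self.data.seek(offset)              # the single seek needed for a read
        length, checksum = HEADER.unpack(self.data.read(HEADER.size))
        value = self.data.read(length)
        if zlib.crc32(value) != checksum:   # checksum detects a damaged record
            raise IOError(f"corrupt record for key {key!r}")
        return value
```

Because the index lives in memory, a lookup costs one hash probe plus one seek into the data file; the checksum lets a repair tool detect which records are damaged.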
  • 16. Key requirements: - Scalability - Load balance - Availability - Consistency
  • 17. 2 Models: - Centralized: 1 addressing server & multiple storage servers => bottleneck & single point of failure - Peer-to-peer: each server includes an addressing module & storage. 2 Types of routing: - Client routing: each client does the addressing itself and queries the data - Server routing: the addressing is done at the server
  • 18. Operation Flows
      Components: Business Logic Server; Addressing Server (DHT); Storage Layer with Storage Node 1 … Storage Node N (each node: Storage Module, ICache, ZiDB)
      (1) Request key locations
      (2) Key locations
      (3) Get & Set operations
      (4) Operation returns
      * The addressing module is moved into each storage node in the peer-to-peer model
  • 19. Addressing: - Provides key locations of resources - Basically a Distributed Hash Table, using consistent hashing - Hashing: Jenkins, Murmur, or any algorithm that satisfies two conditions: - Uniform distribution of generated keys in the key space - Consistency (MD5 and SHA are poor choices because of their performance)
  • 20. Addressing - Node location: Each node is assigned a continuous range of IDs (hashed key)
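The addressing idea (hash a key, then find the node whose ID range contains the hash) might look like the sketch below. It assumes Jenkins one-at-a-time as the hash, one of the families the slides name; the `Ring` class, the node names, and the choice that a node owns the IDs up to and including its own point are illustrative assumptions, not the Zing implementation.

```python
import bisect


def jenkins_one_at_a_time(data: bytes) -> int:
    """Jenkins one-at-a-time hash, truncated to 32 bits."""
    h = 0
    for byte in data:
        h = (h + byte) & 0xFFFFFFFF
        h = (h + (h << 10)) & 0xFFFFFFFF
        h ^= h >> 6
    h = (h + (h << 3)) & 0xFFFFFFFF
    h ^= h >> 11
    h = (h + (h << 15)) & 0xFFFFFFFF
    return h


class Ring:
    """Hypothetical consistent-hashing ring: each node owns one range of IDs."""

    def __init__(self, nodes, hash_fn=jenkins_one_at_a_time):
        self.hash_fn = hash_fn
        # A node placed at point p owns the IDs in (previous point, p].
        self.table = sorted((hash_fn(name.encode()), name) for name in nodes)
        self.ids = [point for point, _ in self.table]

    def locate(self, key: str) -> str:
        """Return the node responsible for the key."""
        i = bisect.bisect_left(self.ids, self.hash_fn(key.encode()))
        return self.table[i % len(self.table)][1]   # wrap around the ring
```

The property that makes this "consistent" is that adding a node only takes keys away from one neighbour; keys owned by other nodes keep their location.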
  • 21. Addressing - Node location: Golden ratio principle (a/b = (a + b)/a) - Initial ratio = 1.618 - Max ratio ~ 2.6 - Easy to implement - Easy to route from the client
  • 22. Addressing - Node location: Virtual nodes - Each real server has multiple virtual nodes on the ring - The more virtual nodes, the better the load balance - The table of nodes is hard to maintain - Example: Server 1: 1,2,3; Server 2: 4,5,6,7; Server 3: 8,9 (ring diagram labels: 1 4 7 3 6 2 5 8 9)
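Virtual nodes can be sketched by placing several points per real server on the ring. The example below is an illustration under assumptions: FNV-1a is used as a stand-in non-cryptographic hash (the slides name Jenkins and Murmur), and `VirtualNodeRing`, the `server#i` naming of virtual nodes, and the default of 64 virtual nodes per server are hypothetical.

```python
import bisect


def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a, used here only as a simple stand-in hash."""
    h = 0x811C9DC5
    for byte in data:
        h = ((h ^ byte) * 0x01000193) & 0xFFFFFFFF
    return h


class VirtualNodeRing:
    """Hypothetical ring where each real server appears at many points."""

    def __init__(self, servers, vnodes_per_server=64):
        # The node table grows with servers * vnodes, which is the
        # maintenance cost the slide mentions.
        self.table = sorted(
            (fnv1a_32(f"{server}#{i}".encode()), server)
            for server in servers
            for i in range(vnodes_per_server)
        )
        self.ids = [point for point, _ in self.table]

    def locate(self, key: str) -> str:
        """Return the real server behind the first virtual node at or after the key."""
        i = bisect.bisect_left(self.ids, fnv1a_32(key.encode()))
        return self.table[i % len(self.table)][1]
```

When a server is removed, only the keys that lived on its virtual nodes move, and they spread over the remaining servers instead of all landing on a single neighbour.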
  • 23. Addressing – Multi-layer rings - Store the change history of the system - Provide availability/reconfigurability - Allow a node to be put on a ring manually * Write: data is located on the highest ring * Read: data is located on the highest ring, then on lower rings if not found
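The write and read rules for multi-layer rings can be illustrated with the sketch below. It is only a model of the lookup order: `LayeredRings` is a hypothetical name, each layer is represented by an assumed `locate(key)` function rather than a real ring, and plain dicts stand in for storage nodes.

```python
class LayeredRings:
    """Hypothetical stack of rings; the newest (highest) ring is last."""

    def __init__(self):
        self.layers = []   # oldest first; each entry maps a key to a node name
        self.stores = {}   # node name -> dict standing in for a storage node

    def add_layer(self, locate):
        """Record a new ring configuration on top of the older ones."""
        self.layers.append(locate)

    def write(self, key, value):
        # Writes always go to the node chosen by the highest ring.
        node = self.layers[-1](key)
        self.stores.setdefault(node, {})[key] = value

    def read(self, key):
        # Reads try the highest ring first, then fall back to lower rings.
        for locate in reversed(self.layers):
            store = self.stores.get(locate(key), {})
            if key in store:
                return store[key]
        return None
```

For example, if the first layer maps every key to node "A" and a later layer maps some keys to a new node "B", data written before the change is still found through the lower ring until it is rewritten on the highest one.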
  • 24. Replication & Backup - Each node has one primary range of IDs and several secondary ranges of IDs - Each real node needs a backup instance to replace it in case it goes down * Data is queried from the primary node first, then from the secondary nodes
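One way to picture primary and secondary ranges is the sketch below. The slides do not say how secondary ranges are assigned, so this example assumes the secondaries are simply the next nodes clockwise on the ring; `ReplicatedRing`, the explicit integer ring positions, and the `down` set used to simulate failures are all hypothetical, and backup instances are not modelled.

```python
import bisect


class ReplicatedRing:
    """Hypothetical ring where each ID has a primary and secondary owners."""

    def __init__(self, node_points, replicas=2):
        self.table = sorted(node_points)          # (ring position, node name)
        self.ids = [point for point, _ in self.table]
        self.replicas = replicas
        self.stores = {name: {} for _, name in self.table}
        self.down = set()                         # names of unreachable nodes

    def owners(self, key_id):
        """Primary owner first, then the assumed clockwise secondaries."""
        start = bisect.bisect_left(self.ids, key_id)
        count = min(self.replicas, len(self.table))
        return [self.table[(start + i) % len(self.table)][1] for i in range(count)]

    def write(self, key_id, value):
        for name in self.owners(key_id):
            if name not in self.down:
                self.stores[name][key_id] = value

    def read(self, key_id):
        # Query the primary node first, then the secondary nodes.
        for name in self.owners(key_id):
            if name not in self.down and key_id in self.stores[name]:
                return self.stores[name][key_id]
        return None
```

With this layout a read still succeeds when the primary is down, as long as one secondary holding the key is reachable.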
  • 25. Configuration: to find the best parameters for configuring the DB, or to choose a suitable DB type. - How many reads/writes per second? - Variation in data length: are records roughly the same size or very different from each other? - Is data updated or deleted? - How important is the data: is loss acceptable or not? - Can old data be recycled?
  • 26. Q & A Contact: Nguyễn Quang Nam [email_address]