New Journey of HBase in Alibaba and Cloud
Chunhui Shen and Long Cao
August 17, 2018
Content
01 AliHB-Introduction of Alibaba HBase
History, Tech Overview, Open Source, Core Scenarios
02 Recent Key Challenge & Improvements
GC Trouble, Separation of Computing & Storage, Cold-Hot Data, Diagnostic System, Migration & Backup
03 HBase Ecosystem & Multi-model DB & Cloud
KV, Tabular, SQL, Graph, Time Series, Geospatial, Search, Mixed Workloads, Cloud
01 AliHB-Introduction of Alibaba HBase
HBase History in Alibaba
• Why HBase
– In use since 2010
– Active community
– Hadoop ecosystem
– Facebook's successful case
– Google's famous paper: Bigtable
Our Choice in 2010: Open Source vs. Develop New Big Data Store System
• Versions Used
– 0.20->0.90->0.92->0.94->0.98->1.1->2.0
• The earliest cases, in 2010-2011
– Search Store
– Taobao History Order
– Alipay Risk Management
• Internal branch AliHB
Overview of AliHB
• Performance
– High-Performance Data Group IO
• Features
– SQL, Secondary Index
– Multi-Tenants, Cold-Hot Separation, Async API
• Stability
– High Availability Architecture
– Faster MTTR
– Verification in Double 11 Shopping Day
• Efficient Maintenance
– Effective Monitoring
– Full Path Trace
– No-pause migration
• 12000+ Nodes, 100+ Clusters, 200+ Million OPS, 100+ PB Data
• 20+ BU, 6000+ Users, 100+ Production Changes per Day
Open Source and Community
• Contributing to open source since 2011
• 3 PMC members and 6 committers at Alibaba
• Sponsoring the Chinese HBase Technology Community
• Already organized 2 HBase meetups
• At least one HBase-related technical article per day
• Tens of thousands of readers now, and more are coming
• Hosting HBaseCon Asia 2018
• Promoting the use of HBase through several conference talks
• We hope more people will join the HBase community
Core Scenarios in Alibaba
• Alipay Bills
• Cainiao Logistics
• Monitor, Log, Tracking, IoT Data…
• Message, Orders, Feeds…
• AI Storage
• Ant Intelligent Security
• Intelligent Customer Service
• Search, BI Report…
02 Recent Key Challenge & Improvements
GC Trouble
Slow Request
Very Slow
GC Problems Under 100GB Memory
GC Trouble
Only for offline application
Rewriting with C++
Exploring a Thorough
GC Trouble
Type | Pause Time | Frequency
YGC | 100ms+ | Once per 5 Secs
CMS | 100~500ms | Once per 5 Mins
FGC | 20s-180s | Once per 7~60 Days

Type | Pause Time | Frequency
YGC | 5ms | Once per 5 Secs
CMS | 100ms | Once per 5 Hours
CCSMap, BucketCacheV2
Most memory is allocated and reclaimed by HBase itself, rather than by the JVM
New GC algorithm in AJDK
Try best to reuse objects (in the core path) when
GC Trouble
New BucketCache in HBase-2.0
CCSMap in HBase-3.0
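The slides above say that, with CCSMap and the new BucketCache, HBase allocates and reclaims most memory itself rather than leaving it to the JVM. A minimal Python sketch of that general idea follows. It is not AliHB's implementation: the names `ChunkPool`, `acquire`, and `release`, and the chunk sizes, are hypothetical. The pool allocates fixed-size buffers once and recycles them, so steady-state work creates no new garbage.

```python
class ChunkPool:
    """Pre-allocates fixed-size buffers once and recycles them.

    Illustrative only: names and sizes are assumptions, not AliHB code.
    """

    def __init__(self, chunk_size: int, chunk_count: int) -> None:
        self.chunk_size = chunk_size
        # All memory is allocated up front; nothing is allocated per request.
        self._free = [bytearray(chunk_size) for _ in range(chunk_count)]

    def acquire(self) -> bytearray:
        """Hand out a free chunk, or fail if the pool is exhausted."""
        if not self._free:
            raise MemoryError("chunk pool exhausted")
        return self._free.pop()

    def release(self, chunk: bytearray) -> None:
        """Return a chunk to the pool so it can be reused."""
        if len(chunk) != self.chunk_size:
            raise ValueError("chunk does not belong to this pool")
        self._free.append(chunk)

    @property
    def free_chunks(self) -> int:
        return len(self._free)


pool = ChunkPool(chunk_size=4096, chunk_count=2)
first = pool.acquire()
first[0:5] = b"hello"      # write some cell bytes into the chunk
pool.release(first)        # recycle the buffer instead of discarding it
second = pool.acquire()    # the same buffer object is handed out again
```

Because the buffers live for the lifetime of the pool, a collector has nothing short-lived to trace on this path, which is the effect the slide attributes to self-managed memory.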
Separation of Computing & Storage
Localized Deployment
– Low IO latency with Short-Circuit Read
– Unbalanced storage space, especially between clusters
– Hard to raise both CPU and disk utilization, especially with many different scenarios
– Cluster scaling is slow because of DataNode decommissioning
Separation of Computing & Storage
Shared-Storage Deployment
– Big shared storage, more
– Compute node can scale
– Storage node can scale
– Auto-scaling become
– Based on load statistics, smart schedule between
– Share compute resources with other applications
Heterogeneous Cold-Hot Storage
• HBase can hold data across its whole life cycle
• But in most cases, such as monitoring, tracing, orders, and logistics:
– Recently generated data is accessed often but occupies very little storage space
– History data is rarely visited but occupies a lot of storage space
• Common solution
– Cold storage system for history data
– Hot storage system for recent data
– Move the data from the hot storage system to the cold storage system periodically
Heterogeneous Cold-Hot Storage
• Easy To Use
• Auto Tiered
• Heterogeneous
• Read Optimization
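The periodic hot-to-cold move described above can be sketched as an age-based policy. This is a simplified illustration under assumed names (`choose_tier`, `plan_moves`, `HOT_WINDOW`) and an assumed 30-day threshold. The slides do not say how AliHB decides which data is cold.

```python
from datetime import datetime, timedelta

# Assumed threshold: data newer than this stays on hot storage.
HOT_WINDOW = timedelta(days=30)


def choose_tier(written_at: datetime, now: datetime) -> str:
    """Return "hot" for recent data and "cold" for history data."""
    return "hot" if now - written_at < HOT_WINDOW else "cold"


def plan_moves(files: dict[str, datetime], now: datetime) -> list[str]:
    """List the files a periodic job would move from hot to cold storage."""
    return sorted(name for name, ts in files.items()
                  if choose_tier(ts, now) == "cold")


# Hypothetical file names and write times.
now = datetime(2018, 8, 17)
files = {
    "orders-2018-08": datetime(2018, 8, 10),
    "orders-2018-01": datetime(2018, 1, 5),
    "orders-2017-11": datetime(2017, 11, 20),
}
to_move = plan_moves(files, now)
```

An automatically tiered system would apply a rule like this inside the store, so that applications see a single table instead of two separate storage systems.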
Diagnostic System
“Request Rush?” => Monitor
“Big Region?” => Web UI
“Full Disk?” => df
“Bad Disk?” => tsar, dmesg
12000+ Nodes, 100+ Clusters, 6000+ Users
HBase Diagnostic Center
1. A unified entry point for troubleshooting
2. Experience/Solution => Function of Diagnostic System
Diagnostic System
One extra server for all
No Agent
Rules can be added dynamically
Runtime information
Check all components
Only 10 seconds for a diagnosis
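Two properties on this slide, rules that can be added dynamically and checks that need no agent, suggest a registry of rules evaluated against collected runtime information. The sketch below is a hypothetical illustration of that pattern. `register_rule`, `diagnose`, the metric names, and the thresholds are all assumptions, not the Diagnostic Center's actual design.

```python
from typing import Callable, Optional

# A rule inspects a snapshot of runtime metrics and returns a finding or None.
Rule = Callable[[dict], Optional[str]]

_rules: dict[str, Rule] = {}


def register_rule(name: str, rule: Rule) -> None:
    """Add or replace a rule at runtime, without restarting anything."""
    _rules[name] = rule


def diagnose(metrics: dict) -> list[str]:
    """Run every registered rule and collect the findings."""
    findings = []
    for name, rule in _rules.items():
        result = rule(metrics)
        if result is not None:
            findings.append(f"{name}: {result}")
    return findings


# Hypothetical rules and thresholds, for illustration only.
register_rule(
    "too_many_files",
    lambda m: "compaction may be falling behind"
    if m.get("store_files", 0) > 100 else None,
)
register_rule(
    "full_disk",
    lambda m: "insufficient disk space"
    if m.get("disk_used_pct", 0) >= 95 else None,
)

report = diagnose({"store_files": 250, "disk_used_pct": 40})
```

Keeping each rule as a small function over a metrics snapshot is one way to turn an operator's experience into a reusable check, in the spirit of "Experience/Solution => Function of Diagnostic System".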
Diagnostic System
• Compaction
• Stuck
• Balance Abnormal
• Table Abnormal
• Region Offline
• Replication Delay
• Too many files
• High Meta Load
• Multi Assign
• ……
• ZK Unavailable
• Block Miss
• NameNode Abnormal
• Full capacity of datanode
• Inconsistent state between two namenodes
• Too many Xceivers
• Disk not mounted
• ……
• Insufficient disk space
• Slow Disk
• Bad Disk
• Too many TCP errors
• Slow ping
• CPU hang
• Load too high
• Port is unreachable
• ……
Shared on
Apsara HBase
Migration & Backup
Independent of HBase
• Almost no impact on service
• Easy to upgrade
• Supports multiple versions
• support the non-hbase
Second-level RPO
Minute-level RTO
03 HBase Ecosystem & Multi-model DB & Cloud
Popularity changes per DB category
Ranking scores per category in percent
Data size per day
All in one
Relational, Key Value, Document, Graph, Time Series, Geospatial
Tabular NoSQL
HBase, Phoenix/AntsDB, HBase, JanusGraph, OpenTSDB, GeoMesa
Multi-model - Native Or Layer
HBase Ecosystem
KVIndex KVIndex
HBase Meets Cloud – Benefits
Cloud Native
New Hardware, Flexibility, Cost Savings
Fast Add/Remove
Fix bugs in time
End up paying for
Reduce human
ApsaraDB HBase Platform – Cloud Native
(KV, Tabular, Document)
(Full Text Index)
Hot data on SSD
Time Series
Cold data on HDD, using EC like OSS
Warm data on SSD & HDD
Remote read/write uses RDMA and a 25G network
ApsaraDB HBase Platform Advantage
Item | ApsaraDB HBase (Aliyun Product) | Apache HBase (Software)
High availability | 99.9% ~ 99.99% | N/A
Data reliability | 99.999999999% | N/A
Online Ability
Multi-master clustering, Multi-AZ/Region | NO
GC | FGC NO, YGC 5ms | GC 20s~100s, YGC 100ms+
Reduce Cost
Storage Cost | Cut by 50%+ on shared cloud disk, 3 copies in total | Maybe on cloud disk, 9 copies in total
Support Cold | Support OSS, cut by 70% with infrequent reads | NO
Multi-model DB
Multi-model DB | KV, Tabular, SQL, Graph, Time Series, Geospatial, Full Text Index, Search
Disaster recovery | Backup and Restore | NO, maybe in 3.0
Security | user/password, ACL | Kerberos, ACL
Analytics | Spark on HBase, more optimization | Spark on HBase
Version upgrade | Automatic upgrade | N/A
Control system
15 min to create a DB/Monitor
Online add storage and node/Elastic Power in future
Big request, Big Table merge, Hot Region …… | NO
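To make the availability figures in the comparison concrete, the short calculation below converts an availability percentage into the maximum downtime per year that it allows. It assumes a 365-day year, and the function name is illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year


def max_downtime_hours(availability_pct: float) -> float:
    """Maximum yearly downtime permitted by an availability percentage."""
    return HOURS_PER_YEAR * (1 - availability_pct / 100)


three_nines = max_downtime_hours(99.9)    # about 8.76 hours per year
four_nines = max_downtime_hours(99.99)    # about 0.876 hours, roughly 53 minutes
```

So the quoted range of 99.9% to 99.99% corresponds to somewhere between roughly nine hours and under one hour of permitted downtime per year.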
HBaseConAsia2018 Keynote 2: Recent Development of HBase in Alibaba and Cloud

More Related Content

What's hot

AWS EMR Cost optimization
AWS EMR Cost optimizationAWS EMR Cost optimization
AWS EMR Cost optimization
The Impala Cookbook
The Impala CookbookThe Impala Cookbook
The Impala Cookbook
Cloudera, Inc.
MyRocks Deep Dive
MyRocks Deep DiveMyRocks Deep Dive
MyRocks Deep Dive
Yoshinori Matsunobu
MySQL Advanced Administrator 2021 - 네오클로바
MySQL Advanced Administrator 2021 - 네오클로바MySQL Advanced Administrator 2021 - 네오클로바
MySQL Advanced Administrator 2021 - 네오클로바
MelOn 빅데이터 플랫폼과 Tajo 이야기
MelOn 빅데이터 플랫폼과 Tajo 이야기MelOn 빅데이터 플랫폼과 Tajo 이야기
MelOn 빅데이터 플랫폼과 Tajo 이야기
SSD Deployment Strategies for MySQL
SSD Deployment Strategies for MySQLSSD Deployment Strategies for MySQL
SSD Deployment Strategies for MySQL
Yoshinori Matsunobu
The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...
The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...
The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...
DataWorks Summit/Hadoop Summit
Apache HBase Performance Tuning
Apache HBase Performance TuningApache HBase Performance Tuning
Apache HBase Performance Tuning
Lars Hofhansl
What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?
DataWorks Summit
Node Labels in YARN
Node Labels in YARNNode Labels in YARN
Node Labels in YARN
DataWorks Summit
How to Manage Scale-Out Environments with MariaDB MaxScale
How to Manage Scale-Out Environments with MariaDB MaxScaleHow to Manage Scale-Out Environments with MariaDB MaxScale
How to Manage Scale-Out Environments with MariaDB MaxScale
MariaDB plc
데브시스터즈 데이터 레이크 구축 이야기 : Data Lake architecture case study (박주홍 데이터 분석 및 인프라 팀...
데브시스터즈 데이터 레이크 구축 이야기 : Data Lake architecture case study (박주홍 데이터 분석 및 인프라 팀...데브시스터즈 데이터 레이크 구축 이야기 : Data Lake architecture case study (박주홍 데이터 분석 및 인프라 팀...
데브시스터즈 데이터 레이크 구축 이야기 : Data Lake architecture case study (박주홍 데이터 분석 및 인프라 팀...
Amazon Web Services Korea
HBase Application Performance Improvement
HBase Application Performance ImprovementHBase Application Performance Improvement
HBase Application Performance Improvement
Biju Nair
PostgreSQL Deep Internal
PostgreSQL Deep InternalPostgreSQL Deep Internal
PostgreSQL Deep Internal
BigTable And Hbase
BigTable And HbaseBigTable And Hbase
BigTable And Hbase
Edward Yoon
[211] HBase 기반 검색 데이터 저장소 (공개용)
[211] HBase 기반 검색 데이터 저장소 (공개용)[211] HBase 기반 검색 데이터 저장소 (공개용)
[211] HBase 기반 검색 데이터 저장소 (공개용)
HBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUpon
HBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUponHBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUpon
HBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUpon
Cloudera, Inc.
LLAP: Building Cloud First BI
LLAP: Building Cloud First BILLAP: Building Cloud First BI
LLAP: Building Cloud First BI
DataWorks Summit
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBaseApache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
DataWorks Summit/Hadoop Summit

What's hot (20)

AWS EMR Cost optimization
AWS EMR Cost optimizationAWS EMR Cost optimization
AWS EMR Cost optimization
The Impala Cookbook
The Impala CookbookThe Impala Cookbook
The Impala Cookbook
MyRocks Deep Dive
MyRocks Deep DiveMyRocks Deep Dive
MyRocks Deep Dive
MySQL Advanced Administrator 2021 - 네오클로바
MySQL Advanced Administrator 2021 - 네오클로바MySQL Advanced Administrator 2021 - 네오클로바
MySQL Advanced Administrator 2021 - 네오클로바
MelOn 빅데이터 플랫폼과 Tajo 이야기
MelOn 빅데이터 플랫폼과 Tajo 이야기MelOn 빅데이터 플랫폼과 Tajo 이야기
MelOn 빅데이터 플랫폼과 Tajo 이야기
SSD Deployment Strategies for MySQL
SSD Deployment Strategies for MySQLSSD Deployment Strategies for MySQL
SSD Deployment Strategies for MySQL
The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...
The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...
The Columnar Era: Leveraging Parquet, Arrow and Kudu for High-Performance Ana...
Apache HBase Performance Tuning
Apache HBase Performance TuningApache HBase Performance Tuning
Apache HBase Performance Tuning
What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?What is new in Apache Hive 3.0?
What is new in Apache Hive 3.0?
Node Labels in YARN
Node Labels in YARNNode Labels in YARN
Node Labels in YARN
How to Manage Scale-Out Environments with MariaDB MaxScale
How to Manage Scale-Out Environments with MariaDB MaxScaleHow to Manage Scale-Out Environments with MariaDB MaxScale
How to Manage Scale-Out Environments with MariaDB MaxScale
데브시스터즈 데이터 레이크 구축 이야기 : Data Lake architecture case study (박주홍 데이터 분석 및 인프라 팀...
데브시스터즈 데이터 레이크 구축 이야기 : Data Lake architecture case study (박주홍 데이터 분석 및 인프라 팀...데브시스터즈 데이터 레이크 구축 이야기 : Data Lake architecture case study (박주홍 데이터 분석 및 인프라 팀...
데브시스터즈 데이터 레이크 구축 이야기 : Data Lake architecture case study (박주홍 데이터 분석 및 인프라 팀...
HBase Application Performance Improvement
HBase Application Performance ImprovementHBase Application Performance Improvement
HBase Application Performance Improvement
PostgreSQL Deep Internal
PostgreSQL Deep InternalPostgreSQL Deep Internal
PostgreSQL Deep Internal
BigTable And Hbase
BigTable And HbaseBigTable And Hbase
BigTable And Hbase
[211] HBase 기반 검색 데이터 저장소 (공개용)
[211] HBase 기반 검색 데이터 저장소 (공개용)[211] HBase 기반 검색 데이터 저장소 (공개용)
[211] HBase 기반 검색 데이터 저장소 (공개용)
HBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUpon
HBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUponHBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUpon
HBaseCon 2012 | Lessons learned from OpenTSDB - Benoit Sigoure, StumbleUpon
LLAP: Building Cloud First BI
LLAP: Building Cloud First BILLAP: Building Cloud First BI
LLAP: Building Cloud First BI
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBaseApache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase

Similar to HBaseConAsia2018 Keynote 2: Recent Development of HBase in Alibaba and Cloud

AWS Webcast - Managing Big Data in the AWS Cloud_20140924
AWS Webcast - Managing Big Data in the AWS Cloud_20140924AWS Webcast - Managing Big Data in the AWS Cloud_20140924
AWS Webcast - Managing Big Data in the AWS Cloud_20140924
Amazon Web Services
Dataflow in 104corp - AWS UserGroup TW 2018
Dataflow in 104corp - AWS UserGroup TW 2018Dataflow in 104corp - AWS UserGroup TW 2018
Dataflow in 104corp - AWS UserGroup TW 2018
Gavin Lin
Horizon for Big Data
Horizon for Big DataHorizon for Big Data
Horizon for Big Data
Schubert Zhang
Large-scale Web Apps @ Pinterest
Large-scale Web Apps @ PinterestLarge-scale Web Apps @ Pinterest
Large-scale Web Apps @ Pinterest
Dataflow in 104corp - DataConTW2018
Dataflow in 104corp - DataConTW2018Dataflow in 104corp - DataConTW2018
Dataflow in 104corp - DataConTW2018
Gavin Lin
How Glidewell Moves Data to Amazon Redshift
How Glidewell Moves Data to Amazon RedshiftHow Glidewell Moves Data to Amazon Redshift
How Glidewell Moves Data to Amazon Redshift
Getting Started with Big Data and HPC in the Cloud - August 2015
Getting Started with Big Data and HPC in the Cloud - August 2015Getting Started with Big Data and HPC in the Cloud - August 2015
Getting Started with Big Data and HPC in the Cloud - August 2015
Amazon Web Services
AWS APAC Webinar Week - Big Data on AWS. RedShift, EMR, & IOT
AWS APAC Webinar Week - Big Data on AWS. RedShift, EMR, & IOTAWS APAC Webinar Week - Big Data on AWS. RedShift, EMR, & IOT
AWS APAC Webinar Week - Big Data on AWS. RedShift, EMR, & IOT
Amazon Web Services
hbaseconasia2019 BigData NoSQL System: ApsaraDB, HBase and Spark
hbaseconasia2019 BigData NoSQL System: ApsaraDB, HBase and Sparkhbaseconasia2019 BigData NoSQL System: ApsaraDB, HBase and Spark
hbaseconasia2019 BigData NoSQL System: ApsaraDB, HBase and Spark
Michael Stack
Big Data Architectural Patterns
Big Data Architectural PatternsBig Data Architectural Patterns
Big Data Architectural Patterns
Amazon Web Services
NoSQL: Cassadra vs. HBase
NoSQL: Cassadra vs. HBaseNoSQL: Cassadra vs. HBase
NoSQL: Cassadra vs. HBase
Antonio Severien
Introduction to Kudu - StampedeCon 2016
Introduction to Kudu - StampedeCon 2016Introduction to Kudu - StampedeCon 2016
Introduction to Kudu - StampedeCon 2016
Apache Tajo - An open source big data warehouse
Apache Tajo - An open source big data warehouseApache Tajo - An open source big data warehouse
Apache Tajo - An open source big data warehouse
Data & Analytics - Session 2 - Introducing Amazon Redshift
Data & Analytics - Session 2 - Introducing Amazon RedshiftData & Analytics - Session 2 - Introducing Amazon Redshift
Data & Analytics - Session 2 - Introducing Amazon Redshift
Amazon Web Services
Aesop change data propagation
Aesop change data propagationAesop change data propagation
Aesop change data propagation
Regunath B
(BDT310) Big Data Architectural Patterns and Best Practices on AWS
(BDT310) Big Data Architectural Patterns and Best Practices on AWS(BDT310) Big Data Architectural Patterns and Best Practices on AWS
(BDT310) Big Data Architectural Patterns and Best Practices on AWS
Amazon Web Services
Architecting applications with Hadoop - Fraud Detection
Architecting applications with Hadoop - Fraud DetectionArchitecting applications with Hadoop - Fraud Detection
Architecting applications with Hadoop - Fraud Detection
AWS Enterprise Summit Netherlands - Big Data Architectural Patterns & Best Pr...
AWS Enterprise Summit Netherlands - Big Data Architectural Patterns & Best Pr...AWS Enterprise Summit Netherlands - Big Data Architectural Patterns & Best Pr...
AWS Enterprise Summit Netherlands - Big Data Architectural Patterns & Best Pr...
Amazon Web Services
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
Austin Scales- Clickstream Analytics at Bazaarvoice
Austin Scales- Clickstream Analytics at BazaarvoiceAustin Scales- Clickstream Analytics at Bazaarvoice
Austin Scales- Clickstream Analytics at Bazaarvoice

Similar to HBaseConAsia2018 Keynote 2: Recent Development of HBase in Alibaba and Cloud (20)

AWS Webcast - Managing Big Data in the AWS Cloud_20140924
AWS Webcast - Managing Big Data in the AWS Cloud_20140924AWS Webcast - Managing Big Data in the AWS Cloud_20140924
AWS Webcast - Managing Big Data in the AWS Cloud_20140924
Dataflow in 104corp - AWS UserGroup TW 2018
Dataflow in 104corp - AWS UserGroup TW 2018Dataflow in 104corp - AWS UserGroup TW 2018
Dataflow in 104corp - AWS UserGroup TW 2018
Horizon for Big Data
Horizon for Big DataHorizon for Big Data
Horizon for Big Data
Large-scale Web Apps @ Pinterest
Large-scale Web Apps @ PinterestLarge-scale Web Apps @ Pinterest
Large-scale Web Apps @ Pinterest
Dataflow in 104corp - DataConTW2018
Dataflow in 104corp - DataConTW2018Dataflow in 104corp - DataConTW2018
Dataflow in 104corp - DataConTW2018
How Glidewell Moves Data to Amazon Redshift
How Glidewell Moves Data to Amazon RedshiftHow Glidewell Moves Data to Amazon Redshift
How Glidewell Moves Data to Amazon Redshift
Getting Started with Big Data and HPC in the Cloud - August 2015
Getting Started with Big Data and HPC in the Cloud - August 2015Getting Started with Big Data and HPC in the Cloud - August 2015
Getting Started with Big Data and HPC in the Cloud - August 2015
AWS APAC Webinar Week - Big Data on AWS. RedShift, EMR, & IOT
AWS APAC Webinar Week - Big Data on AWS. RedShift, EMR, & IOTAWS APAC Webinar Week - Big Data on AWS. RedShift, EMR, & IOT
AWS APAC Webinar Week - Big Data on AWS. RedShift, EMR, & IOT
hbaseconasia2019 BigData NoSQL System: ApsaraDB, HBase and Spark
hbaseconasia2019 BigData NoSQL System: ApsaraDB, HBase and Sparkhbaseconasia2019 BigData NoSQL System: ApsaraDB, HBase and Spark
hbaseconasia2019 BigData NoSQL System: ApsaraDB, HBase and Spark
Big Data Architectural Patterns
Big Data Architectural PatternsBig Data Architectural Patterns
Big Data Architectural Patterns
NoSQL: Cassadra vs. HBase
NoSQL: Cassadra vs. HBaseNoSQL: Cassadra vs. HBase
NoSQL: Cassadra vs. HBase
Introduction to Kudu - StampedeCon 2016
Introduction to Kudu - StampedeCon 2016Introduction to Kudu - StampedeCon 2016
Introduction to Kudu - StampedeCon 2016
Apache Tajo - An open source big data warehouse
Apache Tajo - An open source big data warehouseApache Tajo - An open source big data warehouse
Apache Tajo - An open source big data warehouse
Data & Analytics - Session 2 - Introducing Amazon Redshift
Data & Analytics - Session 2 - Introducing Amazon RedshiftData & Analytics - Session 2 - Introducing Amazon Redshift
Data & Analytics - Session 2 - Introducing Amazon Redshift
Aesop change data propagation
Aesop change data propagationAesop change data propagation
Aesop change data propagation
(BDT310) Big Data Architectural Patterns and Best Practices on AWS
(BDT310) Big Data Architectural Patterns and Best Practices on AWS(BDT310) Big Data Architectural Patterns and Best Practices on AWS
(BDT310) Big Data Architectural Patterns and Best Practices on AWS
Architecting applications with Hadoop - Fraud Detection
Architecting applications with Hadoop - Fraud DetectionArchitecting applications with Hadoop - Fraud Detection
Architecting applications with Hadoop - Fraud Detection
AWS Enterprise Summit Netherlands - Big Data Architectural Patterns & Best Pr...
AWS Enterprise Summit Netherlands - Big Data Architectural Patterns & Best Pr...AWS Enterprise Summit Netherlands - Big Data Architectural Patterns & Best Pr...
AWS Enterprise Summit Netherlands - Big Data Architectural Patterns & Best Pr...
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
Austin Scales- Clickstream Analytics at Bazaarvoice
Austin Scales- Clickstream Analytics at BazaarvoiceAustin Scales- Clickstream Analytics at Bazaarvoice
Austin Scales- Clickstream Analytics at Bazaarvoice

More from Michael Stack

hbaseconasia2019 HBase Table Monitoring and Troubleshooting System on Cloud
hbaseconasia2019 HBase Table Monitoring and Troubleshooting System on Cloudhbaseconasia2019 HBase Table Monitoring and Troubleshooting System on Cloud
hbaseconasia2019 HBase Table Monitoring and Troubleshooting System on Cloud
Michael Stack
hbaseconasia2019 Recent work on HBase at Pinterest
hbaseconasia2019 Recent work on HBase at Pinteresthbaseconasia2019 Recent work on HBase at Pinterest
hbaseconasia2019 Recent work on HBase at Pinterest
Michael Stack
hbaseconasia2019 Phoenix Practice in China Life Insurance Co., Ltd
hbaseconasia2019 Phoenix Practice in China Life Insurance Co., Ltdhbaseconasia2019 Phoenix Practice in China Life Insurance Co., Ltd
hbaseconasia2019 Phoenix Practice in China Life Insurance Co., Ltd
Michael Stack
hbaseconasia2019 HBase at Didi
hbaseconasia2019 HBase at Didihbaseconasia2019 HBase at Didi
hbaseconasia2019 HBase at Didi
Michael Stack
hbaseconasia2019 The Practice in trillion-level Video Storage and billion-lev...
hbaseconasia2019 The Practice in trillion-level Video Storage and billion-lev...hbaseconasia2019 The Practice in trillion-level Video Storage and billion-lev...
hbaseconasia2019 The Practice in trillion-level Video Storage and billion-lev...
Michael Stack
hbaseconasia2019 HBase at Tencent
hbaseconasia2019 HBase at Tencenthbaseconasia2019 HBase at Tencent
hbaseconasia2019 HBase at Tencent
Michael Stack
hbaseconasia2019 Spatio temporal Data Management based on Ali-HBase Ganos and...
hbaseconasia2019 Spatio temporal Data Management based on Ali-HBase Ganos and...hbaseconasia2019 Spatio temporal Data Management based on Ali-HBase Ganos and...
hbaseconasia2019 Spatio temporal Data Management based on Ali-HBase Ganos and...
Michael Stack
hbaseconasia2019 Bridging the Gap between Big Data System Software Stack and ...
hbaseconasia2019 Bridging the Gap between Big Data System Software Stack and ...hbaseconasia2019 Bridging the Gap between Big Data System Software Stack and ...
hbaseconasia2019 Bridging the Gap between Big Data System Software Stack and ...
Michael Stack
hbaseconasia2019 Pharos as a Pluggable Secondary Index Component
hbaseconasia2019 Pharos as a Pluggable Secondary Index Componenthbaseconasia2019 Pharos as a Pluggable Secondary Index Component
hbaseconasia2019 Pharos as a Pluggable Secondary Index Component
Michael Stack
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibabahbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
Michael Stack
hbaseconasia2019 OpenTSDB at Xiaomi
hbaseconasia2019 OpenTSDB at Xiaomihbaseconasia2019 OpenTSDB at Xiaomi
hbaseconasia2019 OpenTSDB at Xiaomi
Michael Stack
hbaseconasia2019 Test-suite for Automating Data-consistency checks on HBase
hbaseconasia2019 Test-suite for Automating Data-consistency checks on HBasehbaseconasia2019 Test-suite for Automating Data-consistency checks on HBase
hbaseconasia2019 Test-suite for Automating Data-consistency checks on HBase
Michael Stack
hbaseconasia2019 Distributed Bitmap Index Solution
hbaseconasia2019 Distributed Bitmap Index Solutionhbaseconasia2019 Distributed Bitmap Index Solution
hbaseconasia2019 Distributed Bitmap Index Solution
Michael Stack
hbaseconasia2019 HBase Bucket Cache on Persistent Memory
hbaseconasia2019 HBase Bucket Cache on Persistent Memoryhbaseconasia2019 HBase Bucket Cache on Persistent Memory
hbaseconasia2019 HBase Bucket Cache on Persistent Memory
Michael Stack
hbaseconasia2019 The Procedure v2 Implementation of WAL Splitting and ACL
hbaseconasia2019 The Procedure v2 Implementation of WAL Splitting and ACLhbaseconasia2019 The Procedure v2 Implementation of WAL Splitting and ACL
hbaseconasia2019 The Procedure v2 Implementation of WAL Splitting and ACL
Michael Stack
hbaseconasia2019 BDS: A data synchronization platform for HBase
hbaseconasia2019 BDS: A data synchronization platform for HBasehbaseconasia2019 BDS: A data synchronization platform for HBase
hbaseconasia2019 BDS: A data synchronization platform for HBase
Michael Stack
hbaseconasia2019 Further GC optimization for HBase 2.x: Reading HFileBlock in...
hbaseconasia2019 Further GC optimization for HBase 2.x: Reading HFileBlock in...hbaseconasia2019 Further GC optimization for HBase 2.x: Reading HFileBlock in...
hbaseconasia2019 Further GC optimization for HBase 2.x: Reading HFileBlock in...
Michael Stack
hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...
hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...
hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...
Michael Stack
HBaseConAsia2019 Keynote
HBaseConAsia2019 KeynoteHBaseConAsia2019 Keynote
HBaseConAsia2019 Keynote
Michael Stack
HBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latencies
HBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latenciesHBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latencies
HBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latencies
Michael Stack

More from Michael Stack (20)

hbaseconasia2019 HBase Table Monitoring and Troubleshooting System on Cloud
hbaseconasia2019 HBase Table Monitoring and Troubleshooting System on Cloudhbaseconasia2019 HBase Table Monitoring and Troubleshooting System on Cloud
hbaseconasia2019 HBase Table Monitoring and Troubleshooting System on Cloud
hbaseconasia2019 Recent work on HBase at Pinterest
hbaseconasia2019 Recent work on HBase at Pinteresthbaseconasia2019 Recent work on HBase at Pinterest
hbaseconasia2019 Recent work on HBase at Pinterest
hbaseconasia2019 Phoenix Practice in China Life Insurance Co., Ltd
hbaseconasia2019 Phoenix Practice in China Life Insurance Co., Ltdhbaseconasia2019 Phoenix Practice in China Life Insurance Co., Ltd
hbaseconasia2019 Phoenix Practice in China Life Insurance Co., Ltd
hbaseconasia2019 HBase at Didi
hbaseconasia2019 HBase at Didihbaseconasia2019 HBase at Didi
hbaseconasia2019 HBase at Didi
hbaseconasia2019 The Practice in trillion-level Video Storage and billion-lev...
hbaseconasia2019 The Practice in trillion-level Video Storage and billion-lev...hbaseconasia2019 The Practice in trillion-level Video Storage and billion-lev...
hbaseconasia2019 The Practice in trillion-level Video Storage and billion-lev...
hbaseconasia2019 HBase at Tencent
hbaseconasia2019 HBase at Tencenthbaseconasia2019 HBase at Tencent
hbaseconasia2019 HBase at Tencent
hbaseconasia2019 Spatio temporal Data Management based on Ali-HBase Ganos and...
hbaseconasia2019 Spatio temporal Data Management based on Ali-HBase Ganos and...hbaseconasia2019 Spatio temporal Data Management based on Ali-HBase Ganos and...
hbaseconasia2019 Spatio temporal Data Management based on Ali-HBase Ganos and...
hbaseconasia2019 Bridging the Gap between Big Data System Software Stack and ...
hbaseconasia2019 Bridging the Gap between Big Data System Software Stack and ...hbaseconasia2019 Bridging the Gap between Big Data System Software Stack and ...
hbaseconasia2019 Bridging the Gap between Big Data System Software Stack and ...
hbaseconasia2019 Pharos as a Pluggable Secondary Index Component
hbaseconasia2019 Pharos as a Pluggable Secondary Index Componenthbaseconasia2019 Pharos as a Pluggable Secondary Index Component
hbaseconasia2019 Pharos as a Pluggable Secondary Index Component
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibabahbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
hbaseconasia2019 Phoenix Improvements and Practices on Cloud HBase at Alibaba
hbaseconasia2019 OpenTSDB at Xiaomi
hbaseconasia2019 OpenTSDB at Xiaomihbaseconasia2019 OpenTSDB at Xiaomi
hbaseconasia2019 OpenTSDB at Xiaomi
hbaseconasia2019 Test-suite for Automating Data-consistency checks on HBase
hbaseconasia2019 Test-suite for Automating Data-consistency checks on HBasehbaseconasia2019 Test-suite for Automating Data-consistency checks on HBase
hbaseconasia2019 Test-suite for Automating Data-consistency checks on HBase
hbaseconasia2019 Distributed Bitmap Index Solution
hbaseconasia2019 Distributed Bitmap Index Solutionhbaseconasia2019 Distributed Bitmap Index Solution
hbaseconasia2019 Distributed Bitmap Index Solution
hbaseconasia2019 HBase Bucket Cache on Persistent Memory
hbaseconasia2019 HBase Bucket Cache on Persistent Memoryhbaseconasia2019 HBase Bucket Cache on Persistent Memory
hbaseconasia2019 HBase Bucket Cache on Persistent Memory
hbaseconasia2019 The Procedure v2 Implementation of WAL Splitting and ACL
hbaseconasia2019 The Procedure v2 Implementation of WAL Splitting and ACLhbaseconasia2019 The Procedure v2 Implementation of WAL Splitting and ACL
hbaseconasia2019 The Procedure v2 Implementation of WAL Splitting and ACL
hbaseconasia2019 BDS: A data synchronization platform for HBase
hbaseconasia2019 BDS: A data synchronization platform for HBasehbaseconasia2019 BDS: A data synchronization platform for HBase
hbaseconasia2019 BDS: A data synchronization platform for HBase
hbaseconasia2019 Further GC optimization for HBase 2.x: Reading HFileBlock in...
hbaseconasia2019 Further GC optimization for HBase 2.x: Reading HFileBlock in...hbaseconasia2019 Further GC optimization for HBase 2.x: Reading HFileBlock in...
hbaseconasia2019 Further GC optimization for HBase 2.x: Reading HFileBlock in...
hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...
hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...
hbaseconasia2019 HBCK2: Concepts, trends, and recipes for fixing issues in HB...
HBaseConAsia2019 Keynote
HBaseConAsia2019 KeynoteHBaseConAsia2019 Keynote
HBaseConAsia2019 Keynote
HBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latencies
HBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latenciesHBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latencies
HBaseConAsia2018 Track3-1: Serving billions of queries in millisecond latencies


HBaseConAsia2018 Keynote 2: Recent Development of HBase in Alibaba and Cloud

  • 1. New Journey of HBase in Alibaba and Cloud. Chunhui Shen and Long Cao, August 17, 2018. (Eight years honing one sword: the new journey of HBase at Alibaba and in the cloud.)
  • 2. Content
      • 01 AliHB: Introduction of Alibaba HBase
          – History, Tech Overview, Open Source, Core Scenarios
      • 02 Recent Key Challenge & Improvements
          – GC Trouble, Separation of Computing & Storage, Cold-Hot Data, Diagnostic System, Migration & Backup
      • 03 HBase Ecosystem & Multi-model DB & Cloud
          – KV, Tabular, SQL, Graph, Time Series, Geospatial, Search, Mixed Workloads, Cloud
  • 4. HBase History in Alibaba
      • Why HBase
          – In use since 2010
          – Active community
          – Hadoop ecosystem
          – Facebook's successful case
          – Google's famous paper: Bigtable
      • Our choice in 2010 (diagram labels): Open Source, Commercial, Develop New; Big Data Store System; Cassandra; MySQL, Oracle; Data Burst
      • Versions used
          – 0.20 -> 0.90 -> 0.92 -> 0.94 -> 0.98 -> 1.1 -> 2.0
      • The earliest cases, 2010-2011
          – Search Store
          – Taobao History Order
          – Alipay Risk Management
      • Internal branch: AliHB
  • 5. Overview of AliHB
      • Performance
          – High-Performance Data Structure, Lock-Free, Group IO
      • Feature
          – SQL, Secondary Index
          – Multi-Tenants, Cold-Hot Separation, Async API
      • Stability
          – High Availability Architecture
          – Faster MTTR
          – Verification in Double 11 Shopping Day
      • Efficient Maintenance
          – Effective Monitoring
          – Full Path Trace
          – No-pause migration
      • 12000+ Nodes, 100+ Clusters, 200+ Million OPS, 100+ PB Data
      • 20+ BU, 6000+ Users, 100+ Production Changes per Day
  • 6. Open Source and Community
      • Contributing to open source since 2011
      • 3 PMC members, 6 committers in Alibaba
      • Sponsor of the Chinese HBase Technology Community
          – Already organized 2 HBase meetups
          – At least one HBase-related tech article per day
          – Tens of thousands of readers now, and more are coming
      • Hosting HBaseCon Asia 2018
      • Promoting the use of HBase through several conference talks
      • We hope more people will join the HBase community
  • 7. Core Scenarios in Alibaba (diagram labels around Ali-HBase): Wangwang (IM); Alipay Bills; Cainiao Logistics; Log; Monitor, Log, Tracking, IoT Data…; Message, Orders, Feeds…; AI Storage; Ant Intelligent Security; Intelligent Customer Service; Recommendation; Search, BI Report…
  • 8. 02 Recent Key Challenge & Improvements
  • 9. GC Trouble: GC problems under 100 GB memory
      • Frequent Slow Request
      • Very Slow Request
      • Service Unavailable
  • 10. GC Trouble (diagram labels): Only for offline application; Rewriting with C++; Exploring a Thorough Solution
  • 11. GC Trouble
      • GC pauses, first table:
          Type   Pause Time   Frequency
          YGC    100ms+       Once per 5 secs
          CMS    100~500ms    Once per 5 mins
          FGC    20s-180s     Once per 7~60 days
      • GC pauses, second table (shown with the techniques listed below):
          Type   Pause Time   Frequency
          YGC    5ms          Once per 5 secs
          CMS    100ms        Once per 5 hours
          FGC    N/A          N/A
      • CCSMap, BucketCacheV2: HBase itself allocates and reclaims most of the memory, rather than the JVM
      • ZenGC: new GC algorithm in AJDK
      • Reuse objects wherever possible on the core path when programming
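To put the two tables on slide 11 on a common scale, the pause figures can be converted into total pause time per day. The sketch below is only arithmetic on the slide's numbers; treating "100ms+" as exactly 100 ms and taking the CMS range 100~500 ms at its lower bound are assumptions made for illustration, and the helper name `pause_seconds_per_day` is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def pause_seconds_per_day(pause_ms, interval_s):
    """Total pause per day for a collection that pauses `pause_ms`
    milliseconds once every `interval_s` seconds."""
    return (SECONDS_PER_DAY / interval_s) * (pause_ms / 1000.0)


# Figures copied from slide 11. Assumptions: "100ms+" is read as 100 ms,
# and the CMS range 100~500 ms is taken at its lower bound.
first = {
    "YGC": pause_seconds_per_day(100, 5),          # once per 5 secs
    "CMS": pause_seconds_per_day(100, 5 * 60),     # once per 5 mins
}
second = {
    "YGC": pause_seconds_per_day(5, 5),            # once per 5 secs
    "CMS": pause_seconds_per_day(100, 5 * 3600),   # once per 5 hours
}

for kind in ("YGC", "CMS"):
    print(f"{kind}: {first[kind]:.2f} s/day -> {second[kind]:.2f} s/day")
```

Under these assumptions young-generation pauses drop from about 1728 seconds per day to about 86 seconds per day, which shows that the frequent small pauses, not only the rare full GCs, dominate the daily total.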
  • 12. GC Trouble: New BucketCache in HBase-2.0; CCSMap in HBase-3.0
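The idea behind self-managed memory, as the GC slides describe it, is that the application allocates one large region up front and hands out pieces of it itself, so the collector does not see one object per cached item. The following is a minimal sketch of that general idea only; it is not the BucketCache or CCSMap implementation, the class `SlotAllocator` and its methods are hypothetical, and a Python `bytearray` merely stands in for the off-heap memory a JVM process would use.

```python
class SlotAllocator:
    """Fixed-size slot allocator over one pre-allocated buffer (illustrative)."""

    def __init__(self, slot_size, slot_count):
        self.slot_size = slot_size
        # One allocation for the whole cache; never resized afterwards.
        self.buffer = bytearray(slot_size * slot_count)
        # Stack of free slot ids; slot 0 is handed out first.
        self.free_slots = list(range(slot_count - 1, -1, -1))

    def put(self, payload):
        """Copy payload into a free slot and return (slot id, length)."""
        if len(payload) > self.slot_size:
            raise ValueError("payload larger than slot")
        if not self.free_slots:
            raise MemoryError("no free slot")
        slot = self.free_slots.pop()
        start = slot * self.slot_size
        self.buffer[start:start + len(payload)] = payload
        return slot, len(payload)

    def get(self, slot, length):
        """Read `length` bytes back from a slot."""
        start = slot * self.slot_size
        return bytes(self.buffer[start:start + length])

    def free(self, slot):
        """Return a slot to the allocator; the bytes are simply overwritten later."""
        self.free_slots.append(slot)


alloc = SlotAllocator(slot_size=16, slot_count=2)
slot, length = alloc.put(b"row-1")
value = alloc.get(slot, length)
alloc.free(slot)
reused_slot, _ = alloc.put(b"row-2")
```

Freeing a slot only pushes its id back on the free stack, so reclaiming memory never creates garbage for the collector to trace; the trade-off is internal fragmentation when payloads are much smaller than the slot size.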
  • 13. Separation of Computing & Storage: Localized Deployment
      – Low IO latency with short-circuit read
      – Unbalanced storage space, especially between clusters
      – Hard to raise both CPU and disk utilization, especially with many different scenarios
      – Cluster scaling is slow because of datanode decommission
  • 14. Separation of Computing & Storage: Shared-Storage Deployment
      – Big shared storage, more balanced
      – Compute nodes can scale independently
      – Storage nodes can scale independently
      – Auto-scaling becomes feasible
      – Smart scheduling between clusters, based on load statistics
      – Compute resources can be shared with other applications
  • 15. Heterogeneous Cold-Hot Storage
      • HBase is capable of holding data for its whole life cycle
      • But in most cases, such as monitoring, tracing, orders and logistics:
          – Recently generated data is accessed often, but occupies very little storage space
          – Historical data is rarely visited, but occupies a lot of storage space
      • Common solution
          – A cold storage system for historical data
          – A hot storage system for recent data
          – Data is moved from the hot storage system to the cold storage system periodically
  • 16. Heterogeneous Cold-Hot Storage: Easy to Use; Auto Tiered; Heterogeneous; Read Optimization
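The "common solution" on slide 15, two stores with a periodic job that moves aged data from hot to cold, can be sketched as follows. This is a minimal illustration with in-memory dictionaries standing in for the two storage systems; the names `Record`, `move_cold_records` and the age threshold are hypothetical and not taken from the talk.

```python
from dataclasses import dataclass


@dataclass
class Record:
    key: str
    timestamp: float  # creation time, seconds since epoch


def move_cold_records(hot, cold, now, max_hot_age_s):
    """Move every record older than `max_hot_age_s` from `hot` to `cold`.

    `hot` and `cold` are plain dicts standing in for the two storage
    systems. Returns the keys that were moved.
    """
    # Collect keys first so the dict is not mutated while iterating.
    moved = [k for k, r in hot.items() if now - r.timestamp > max_hot_age_s]
    for k in moved:
        cold[k] = hot.pop(k)
    return moved


hot = {"order-1": Record("order-1", 0.0), "order-2": Record("order-2", 90.0)}
cold = {}
moved = move_cold_records(hot, cold, now=100.0, max_hot_age_s=30.0)
```

A job like this has to be scheduled and monitored by the application, and readers must know which store to query, which is the operational cost of the two-system approach.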
  • 17. Diagnostic System
      • Before: a different tool for each question
          – “Request Rush?”: Monitor
          – “Big Region?”: Web UI
          – “Full Disk?”: df
          – “Bad Disk?”: tsar, dmesg
          – ……
      • Scale: 12000+ Nodes, 100+ Clusters, 6000+ Users
      • HBase Diagnostic Center
          1. The unified entry point for troubleshooting
          2. Experience/Solution => Function of Diagnostic System
  • 18. Diagnostic System
      • One extra server for all
      • No agent
      • Rules can be added dynamically
      • Runtime information
      • Checks all components
      • Only 10 seconds for a diagnosis
  • 19. Diagnostic System
      • HBase
          – Compaction Stuck
          – Balance Abnormal
          – Table Abnormal
          – Region Offline
          – Replication Delay
          – Too many files
          – High Meta Load
          – Multi Assign
          – ……
      • ZK/HDFS
          – ZK Unavailable
          – Block Miss
          – NameNode Abnormal
          – Full capacity of datanode
          – Inconsistent state between two namenodes
          – Too many Xceivers
          – Disk not mounted
          – ……
      • Hardware
          – Insufficient disk space
          – Slow Disk
          – Bad Disk
          – Too many TCP errors
          – Slow ping
          – CPU hang
          – Load too high
          – Port is unreachable
          – ……
      • 50+ Rules, 80%+ Accuracy, shared on Apsara HBase
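A rule-based diagnostic service of the kind slides 17 to 19 describe (one central server, rules added at runtime, each rule checked against collected runtime information) could be structured roughly as below. This is a sketch under stated assumptions, not the Diagnostic Center's design: the class `DiagnosticCenter`, the metric names and the thresholds are all hypothetical, and only the rule names are borrowed from slide 19.

```python
from typing import Callable, Dict, List

Snapshot = Dict[str, float]          # runtime metrics for one node or cluster
Rule = Callable[[Snapshot], bool]    # returns True when the problem is detected


class DiagnosticCenter:
    """Central registry of named rules evaluated against a metrics snapshot."""

    def __init__(self) -> None:
        self._rules: Dict[str, Rule] = {}

    def add_rule(self, name: str, rule: Rule) -> None:
        """Register or replace a rule at runtime, without restarting."""
        self._rules[name] = rule

    def diagnose(self, snapshot: Snapshot) -> List[str]:
        """Return the names of all rules that fire for this snapshot."""
        return [name for name, rule in self._rules.items() if rule(snapshot)]


center = DiagnosticCenter()
# Hypothetical metric names and thresholds, chosen only for illustration.
center.add_rule("Insufficient disk space", lambda s: s.get("disk_used_ratio", 0) > 0.9)
center.add_rule("Too many files", lambda s: s.get("store_files_per_region", 0) > 100)

snapshot = {"disk_used_ratio": 0.95, "store_files_per_region": 12, "replication_lag_s": 120}
first_report = center.diagnose(snapshot)

# A rule added later takes effect on the next diagnosis.
center.add_rule("Replication Delay", lambda s: s.get("replication_lag_s", 0) > 60)
second_report = center.diagnose(snapshot)
```

Keeping rules as named predicates over a snapshot means operational experience can be captured as a new rule without touching the collection side.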
  • 21. Migration & Backup
      • Independent of HBase
          – Almost no impact on the service
          – Easy to upgrade
          – Supports multiple versions
          – Supports non-HBase targets
      • Second-level RPO
      • Minute-level RTO
  • 22. 03 HBase Ecosystem & Multi-model DB & Cloud
  • 23. Popularity changes per DB category
  • 24. Ranking scores per category in percent
  • 26. All in one: Relational, Key Value, Document, Graph, Time Series, Geospatial, Tabular, NoSQL
  • 27. All in one: Relational, Key Value, Document, Graph, Time Series, Geospatial, Tabular, NoSQL. Components shown: HBase, HBase, Phoenix/AntsDB, HBase, JanusGraph, OpenTSDB, GeoMesa
  • 28. Multi-model: Native or Layer (diagram labels): Neo4j, InfluxDB, CockroachDB, PG, HBase Ecosystem, DataStax, CosmosDB; Multi-model, KVIndex, KVIndex, Storage, Multi-model
  • 29. HBase Meet Cloud – Benefits (diagram labels): Cloud Native; New Hardware; Flexibility; Cost Savings (TCO); RDMA, Flash, GPU, Non-volatile memory; Fast Add/Remove Resource; Insight; Fix bugs in time; Self-driven; End up paying for features; Flexibility; self-driven; Reduce human ……
  • 30. Multi-model: Native or Layer (diagram labels): Neo4j, InfluxDB, CockroachDB, PG, HBase Ecosystem, DataStax, CosmosDB; Multi-model, KVIndex, KVIndex, Storage, Multi-model
  • 31. ApsaraDB HBase Platform – Cloud Native
      • Engines
          – HBase (KV, Tabular, Document)
          – Solr/ES (Full Text Index)
          – SQL: Phoenix
          – Graph: JanusGraph
          – Time Series: OpenTSDB
          – Geospatial: GeoMesa
          – Spark
      • Storage
          – Hot data on SSD
          – Warm data on SSD & HDD
          – Cold data on HDD, using EC like OSS
      • Remote read/write uses RDMA and a 25G network
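  • 32. ApsaraDB HBase Platform Advantage: ApsaraDB HBase (Aliyun product) vs. Apache HBase (software)
      • Basic
          – High availability: ApsaraDB HBase 99.9% ~ 99.99%; Apache HBase N/A
          – Data reliability: ApsaraDB HBase 99.999999999%; Apache HBase N/A
      • Online Ability
          – Multi-master clustering: ApsaraDB HBase multi-master clustering, Multi-AZ/Region; Apache HBase NO
          – GC: ApsaraDB HBase no FGC, YGC 5ms; Apache HBase GC 20s~100s, YGC 100ms+
      • Reduce Cost
          – Storage cost: ApsaraDB HBase cut by 50%+ on shared cloud disk, 3 copies in total; Apache HBase possibly on cloud disk, 9 copies in total
          – Cold storage support: ApsaraDB HBase supports OSS, cost cut by 70% for rarely read data; Apache HBase NO
      • Multi-model DB
          – Multi-model DB: ApsaraDB HBase KV, Tabular, SQL, Graph, Time Series, Geospatial, Full Text Index, Search; Apache HBase KV, Tabular
      • Enterprise Characteristics
          – Disaster recovery: ApsaraDB HBase Backup and Restore; Apache HBase NO, maybe in 3.0
          – Security: ApsaraDB HBase user/password, ACL; Apache HBase Kerberos, ACL
          – Analytics: ApsaraDB HBase Spark on HBase, with more optimization; Apache HBase Spark on HBase
          – Version upgrade: ApsaraDB HBase automatic upgrade; Apache HBase N/A
      • Self-driven
          – Database control system: ApsaraDB HBase creates a DB with monitoring in 15 min, adds storage and nodes online, Elastic Power in future; Apache HBase N/A
          – Diagnostic: ApsaraDB HBase big request, big table merge, hot region ……; Apache HBase NO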