hbaseconasia2019 Recent work on HBase at Pinterest

Recent work on HBase at Pinterest
Lianghong Xu
Pinterest Software Engineer, Tech Lead

HBase at Pinterest
• Backend for many critical services
• Graph database (Zen)
• Generic KV store (UMS)
• Around 50 HBase clusters
• HBase 0.94 since 2013, HBase 1.2 since 2016
• Internal repo with ZSTD, CCSMAP, Bucket cache, etc.

Agenda
• Omid: transaction layer for NoSQL database
• Sparrow: Omid made scalable
• Argus: database observer framework
• Ixia: near-realtime HBase indexing

NoSQL Embracing Transactions
• SQL: relational, transactional, expressive
• NoSQL: simple, fast, scalable

Apache Omid at Pinterest
• Omid (Optimistically transaction Management In Datastores)
• Transaction framework on top of KV stores with HBase support
• Open-sourced by Yahoo! in 2016
• Powers next generation of Ads indexing at Pinterest
• Pros: simple, reasonable performance, HA, pluggable backend with native HBase support
• Cons: No SQL interface, limited isolation levels, requires MVCC support

Omid Architecture
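[Diagram: the Client sends begin/commit requests to the Transaction Manager (TM) and receives timestamps and commit status; the Client reads and writes the Data tables and checks commit status in the Commit table; the TM persists commits to the Commit table.]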

Omid internals
• Leverages Multi-version Concurrency Control (MVCC) support in HBase
• Transaction ID (begin timestamp) in version, commit timestamp in shadow cell
• OCC: lock-free implementation with central conflict detection mechanism
Omid data and commit table
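The bullets above can be illustrated with a toy model. This is a minimal sketch, not Omid's code: `ToyTransactionManager`, `snapshot_read`, and the in-memory dictionaries are hypothetical names, and the commit table stands in for both the commit table and the shadow cells (which in Omid hold the commit timestamp next to the data).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class ToyTransactionManager:
    """Hypothetical stand-in for the TM: timestamps plus central conflict detection."""
    clock: int = 0
    last_commit: Dict[str, int] = field(default_factory=dict)   # row -> commit ts of last writer
    commit_table: Dict[int, int] = field(default_factory=dict)  # begin ts -> commit ts

    def begin(self) -> int:
        # The begin timestamp doubles as the transaction ID and the cell version.
        self.clock += 1
        return self.clock

    def commit(self, begin_ts: int, write_set: Set[str]) -> Optional[int]:
        # Optimistic check, no locks: abort if any written row was
        # committed by another transaction after this one began.
        if any(self.last_commit.get(row, 0) > begin_ts for row in write_set):
            return None
        self.clock += 1
        commit_ts = self.clock
        for row in write_set:
            self.last_commit[row] = commit_ts
        self.commit_table[begin_ts] = commit_ts
        return commit_ts


def snapshot_read(versions: List[Tuple[int, str]],
                  commit_table: Dict[int, int],
                  reader_ts: int) -> Optional[str]:
    """Return the newest value committed before the reader began.

    `versions` holds (writer begin ts, value) cells of one row/column.
    """
    for begin_ts, value in sorted(versions, reverse=True):
        commit_ts = commit_table.get(begin_ts)
        if commit_ts is not None and commit_ts < reader_ts:
            return value
    return None
```

In this sketch, two transactions that write the same row conflict at commit time, and a reader skips any version whose writer has no commit timestamp or committed after the reader started.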

Omid Scalability Problem
[Diagram: the Omid architecture (Client, Transaction Manager (TM), Data tables, Commit table) with two annotations on the TM: "Centralized batch commit to HBase" and "Single-threaded request/reply processor for serializability".]

Sparrow Architecture
[Diagram, shown in three steps: Client, Transaction Manager (TM), Data tables and Commit table with begin/commit, read/write, check commit and persist commit flows; the later steps add the annotations "Distributed client-side commit" and "Parallel request processing".]

Sparrow: Omid made scalable
[Diagram: Client, Transaction Manager (TM), Data tables and Commit table with begin/commit, read/write, check commit and persist commit flows, annotated "Parallel conflict detection" and "Performance bottleneck".]

Sparrow techniques
• Client-side commit
  • Client writes to the commit table when there are no conflicts
  • Aborted transactions are explicitly marked in the commit table (-1)
  • A reader may back off and abort a concurrent writer in case of client failure or network partition
  • Avoids the performance bottleneck on the TM
• Parallel request processing
  • Multi-threaded request processor with in-memory conflict map
  • beginTx no longer needs to wait until the whole commit batch is written to HBase
  • Timestamp allocation still needs to be synchronized (with negligible overhead)
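The client-side commit idea can be sketched as a race on a single commit-table entry. This is an illustration under assumptions, not Sparrow's implementation: `ToyCommitTable`, `client_commit`, and `reader_resolve` are hypothetical names, the table is an in-memory dictionary, and the atomic `put_if_absent` stands in for whatever atomic primitive the real store offers (in HBase, a check-and-put style operation could play this role). The back-off step a reader would normally take before aborting a writer is omitted.

```python
import threading
from typing import Dict, Optional

ABORTED = -1  # marker for an aborted transaction, as on the slide


class ToyCommitTable:
    """In-memory stand-in for the commit table (hypothetical)."""

    def __init__(self) -> None:
        self._rows: Dict[int, int] = {}
        self._lock = threading.Lock()

    def put_if_absent(self, begin_ts: int, value: int) -> int:
        """Atomically store `value` unless an entry exists; return the stored entry."""
        with self._lock:
            return self._rows.setdefault(begin_ts, value)


def client_commit(table: ToyCommitTable, begin_ts: int, commit_ts: int) -> bool:
    """Writer persists its own commit; fails if a reader aborted it first."""
    return table.put_if_absent(begin_ts, commit_ts) == commit_ts


def reader_resolve(table: ToyCommitTable, begin_ts: int) -> Optional[int]:
    """Reader facing an undecided writer tries to mark it aborted.

    Returns the writer's commit timestamp if it committed first, else None.
    """
    outcome = table.put_if_absent(begin_ts, ABORTED)
    return None if outcome == ABORTED else outcome
```

Whichever side writes the entry first decides the transaction's fate, so a writer that stalls after conflict detection cannot block readers indefinitely.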

Sparrow vs. Omid
beginTx P99: ~100X reduction
commitTx P99: ~3X reduction

Argus: Motivation and Problem Statement
• Clients request a real-time notification feature similar to a database trigger
• Incremental processing based on database changes
• Notifications cannot be missed: "at least once" delivery
• Notification events could have different priorities and object types

Kafka-based Notification Pipeline

Percolator (Google)
• Special notification column
• Observer threads periodically scan for changes
• Heavy-weight distributed scan and locking
Argus
• Async notification by tailing HBase WAL
• Kafka for replayable DB change stream
• Support different priorities and types
• Lightweight, minimal impact on DB

Argus Architecture
[Diagram: the Client sends annotated requests to HBase; the HBase WAL is shipped to the Replication proxy, which publishes notification events to Kafka for the Argus Observer; a read/write path to HBase is also shown.]
• HBase annotation: extra metadata in HBase requests to be passed down into WAL
• Replication Proxy: "fake" regionservers with only replication RPC implemented
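A rough sketch of what the replication proxy's job could look like: turn annotated WAL edits into notification events grouped by priority. Everything here is hypothetical (the `WalEdit` and `NotificationEvent` shapes, the annotation keys `type` and `priority`, the `argus-<priority>` topic names), and a plain dictionary of lists stands in for Kafka. Skipping edits that carry no annotation is also an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass
class WalEdit:
    """One replicated WAL edit plus the metadata the client attached to the request."""
    row: str
    column: str
    value: bytes
    annotation: Dict[str, str] = field(default_factory=dict)


@dataclass
class NotificationEvent:
    row: str
    column: str
    object_type: str
    priority: str


def to_event(edit: WalEdit) -> Optional[NotificationEvent]:
    # Assumption: only annotated edits produce notifications.
    if not edit.annotation:
        return None
    return NotificationEvent(
        row=edit.row,
        column=edit.column,
        object_type=edit.annotation.get("type", "unknown"),
        priority=edit.annotation.get("priority", "normal"),
    )


def replicate(edits: Iterable[WalEdit]) -> Dict[str, List[NotificationEvent]]:
    """Group events by a hypothetical per-priority topic name."""
    topics: Dict[str, List[NotificationEvent]] = {}
    for edit in edits:
        event = to_event(edit)
        if event is not None:
            topics.setdefault(f"argus-{event.priority}", []).append(event)
    return topics
```

Because the events are derived from the WAL after the write has been accepted, the write path itself does no extra work beyond carrying the annotation.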

Argus Observers
• Process notification events in parallel with user-defined handlers
• Event dispatching, filtering, collapse, etc.
• Notification Handlers can be chained
Use case on Ads indexing:
Batch processing (15 mins) -> incremental indexing (several seconds)
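Dispatching, filtering, collapsing, and handler chaining can be sketched in a few lines. This is an illustrative model only: the event dictionaries, `collapse`, `chain`, and `dispatch` are hypothetical names rather than the Argus API, and the sketch runs sequentially where the real observers process events in parallel.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Event = Dict[str, str]
Handler = Callable[[Event], Optional[Event]]  # returning None drops the event


def collapse(events: Iterable[Event]) -> List[Event]:
    """Keep only the latest event per (type, row)."""
    latest: Dict[Tuple[str, str], Event] = {}
    for event in events:
        key = (event["type"], event["row"])
        latest.pop(key, None)  # re-insert so ordering follows the latest arrival
        latest[key] = event
    return list(latest.values())


def chain(*handlers: Handler) -> Handler:
    """Compose handlers; the chain stops as soon as one of them drops the event."""
    def run(event: Event) -> Optional[Event]:
        current: Optional[Event] = event
        for handler in handlers:
            current = handler(current)
            if current is None:
                return None
        return current
    return run


def dispatch(events: Iterable[Event],
             handlers_by_type: Dict[str, Handler]) -> List[Event]:
    """Collapse the batch, then route each event to the handler for its type."""
    results: List[Event] = []
    for event in collapse(events):
        handler = handlers_by_type.get(event["type"])
        if handler is None:
            continue
        out = handler(event)
        if out is not None:
            results.append(out)
    return results
```

Collapsing before dispatch means a row that changed several times within a batch triggers a single handler run, which matters when the handler re-indexes the whole object.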

Ixia: Motivation
• Clients ask for secondary indexing support in HBase
• Analytics queries on HBase columns (filtering, range, aggregation)
• Why not SQL?
  • Index build could take a long time
  • Lack of horizontal scalability and tuning expertise

Ixia: Near-realtime Indexing with HBase + Muse (In-house Search Engine)
• Inspired by Lily indexer (HBase + Solr)
• Secondary indexes in Muse (written in C++, fast in-memory inverted/forward index)
• Source-of-truth data in HBase
• Index built asynchronously with HBase WAL through Kafka
• Ixia query engine: Thrift-based query service with a SQL-like interface
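The split between source-of-truth data and an asynchronously built index can be sketched as follows. This is a toy model under assumptions: `ToyIxia` and its methods are hypothetical, dictionaries stand in for HBase and Muse, and a list of row keys stands in for the Kafka change stream. A query first searches the index for row keys and then retrieves the rows from the data store.

```python
from typing import Dict, List, Set, Tuple

Row = Dict[str, str]


class ToyIxia:
    """Hypothetical model: data store, change stream, and inverted index."""

    def __init__(self) -> None:
        self.hbase: Dict[str, Row] = {}      # source-of-truth rows
        self.pending: List[str] = []         # change stream of modified row keys
        self.inverted: Dict[Tuple[str, str], Set[str]] = {}  # (column, value) -> row keys
        self.indexed: Dict[str, Row] = {}    # last indexed version of each row

    def write(self, key: str, row: Row) -> None:
        # The write path only touches the data store and the change stream.
        self.hbase[key] = dict(row)
        self.pending.append(key)

    def run_indexer(self) -> None:
        # Asynchronous step: replay changes and refresh the index entries.
        for key in self.pending:
            for col, val in self.indexed.get(key, {}).items():
                self.inverted.get((col, val), set()).discard(key)
            new_row = self.hbase[key]
            for col, val in new_row.items():
                self.inverted.setdefault((col, val), set()).add(key)
            self.indexed[key] = dict(new_row)
        self.pending.clear()

    def query(self, column: str, value: str) -> List[Row]:
        # Step 1: search the index for matching row keys.
        keys = sorted(self.inverted.get((column, value), set()))
        # Step 2: retrieve the current rows from the data store.
        return [self.hbase[k] for k in keys if k in self.hbase]
```

Between a write and the next indexer run, queries in this model see the old index, which is the price of keeping indexing off the write path.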

Ixia: Architecture
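[Diagram: the Client writes to HBase; the HBase WAL goes through the Replication proxy, which emits index events to Kafka; the Indexer consumes them and writes index docs to Muse; the Client sends a query to the Query engine, which issues a search query to Muse and performs DB retrieval from HBase; an Index manager and Index schema are also shown.]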

Ixia: Pros and Cons
Pros
• Minimal impact on write path
• Index and data stores scaled separately
• Efficient indexing & retrieval
Cons
• No strong index consistency

Ixia: status and future work
• Batch indexing in prod, reducing indexing time by ~15X
• Query engine serving full dark traffic, reducing query latency by up to 100X
• Future work:
  • Realtime indexing into production
  • SQL support
  • Dynamic index backfilling
