Phoenix Secondary Indexing - LA HUG Sept 9th, 2013

Secondary Indexing in Phoenix
Jesse Yates
HBase Committer
Software Engineer
LA HBase User Group – September 4, 2013

Agenda
• About
• Other Indexing Frameworks
• Immutable Indexes
• Mutable Indexes in Phoenix
• Mutable Indexing Internals
• Roadmap
https://www.madison.k12.wi.us/calendars

About me
• Developer at Salesforce
– System of Record, Phoenix
• Open Source
– Phoenix
– HBase
– Accumulo

Phoenix
• Open Source
– https://github.com/forcedotcom/phoenix
• “SQL-skin” on HBase
– Everyone knows SQL!
• JDBC Driver
– Plug-and-play
• Faster than HBase
– in some cases

Why Index?
• HBase is only sorted on 1 “axis”
• Great for search via a single pattern
Example!

Example
name:
type:
subtype:
date:
major:
minor:
quantity:

Secondary Indexes
• Sort on ‘orthogonal’ axis
• Save full-table scan
• Expected database feature
• Hard in HBase b/c of ACID considerations
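A minimal sketch of the idea above, in plain Java rather than HBase: one TreeMap stands in for a table sorted on a single axis (the row key), and a second TreeMap keyed by type|name stands in for a secondary index. The class and method names are hypothetical, the name and type fields are borrowed from the example slide, and the sketch ignores updates to existing rows, which is where index cleanup becomes necessary.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model only: sorted maps stand in for HBase tables.
class SecondaryIndexSketch {
    // Primary table, sorted on the row key (name) -> type.
    final TreeMap<String, String> table = new TreeMap<>();
    // Index table, sorted on "type|name"; the value is unused.
    final TreeMap<String, String> index = new TreeMap<>();

    void put(String name, String type) {
        table.put(name, type);
        index.put(type + "|" + name, "");
    }

    // Without an index: every row must be read and filtered.
    List<String> namesByTypeFullScan(String type) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, String> e : table.entrySet()) {
            if (e.getValue().equals(type)) {
                out.add(e.getKey());
            }
        }
        return out;
    }

    // With the index: a range scan that starts at the prefix and stops
    // at the first key that no longer matches it.
    List<String> namesByTypeIndexed(String type) {
        List<String> out = new ArrayList<>();
        String prefix = type + "|";
        for (String key : index.tailMap(prefix, true).keySet()) {
            if (!key.startsWith(prefix)) {
                break;
            }
            out.add(key.substring(prefix.length()));
        }
        return out;
    }

    public static void main(String[] args) {
        SecondaryIndexSketch s = new SecondaryIndexSketch();
        s.put("bolt", "hardware");
        s.put("apple", "fruit");
        s.put("nut", "hardware");
        System.out.println(s.namesByTypeIndexed("hardware"));
    }
}
```

Both lookups return the same rows; the indexed one only touches the keys that share the prefix.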

Agenda
• About
• Roadmap

http://www.wired.com/wiredenterprise/2011/10/microsoft-and-hadoop/

Other (Major) Indexing Frameworks
• HBase SEP
– Side-Effects Processor
– Replication-based
– https://github.com/NGDATA/hbase-sep
• Huawei
– Server-local indexes
– Buddy regions
– https://github.com/Huawei-Hadoop/hindex

Agenda
• About
• Roadmap

Immutable Indexes
• Immutable Rows
• Much easier to implement
• Client-managed
• Bulk-loadable

Bulk Loading
phoenix-hbase.blogspot.com

Index Bulk Loading
Identity Mapper
Custom Phoenix Reducer
HFile Output Format

Index Bulk Loading
PreparedStatement statement = conn.prepareStatement(dmlStatement);
statement.execute();
String upsertStmt = "upsert into core.entity_history(organization_id, key_prefix, entity_history_id, created_by, created_date)\n"
    + "values(?,?,?,?,?)";
statement = conn.prepareStatement(upsertStmt);
… // set values
// Read back the uncommitted KeyValues instead of committing them to HBase
Iterator<Pair<byte[], List<KeyValue>>> dataIterator =
    PhoenixRuntime.getUncommittedDataIterator(conn);

Agenda
• About
• Roadmap

The “fun” stuff…

1.5 years

Mutable Indexes
• Global Index
• Change row state
– Common use-case
– “expected” implementation
• Covered Columns

Usage
• Just SQL!
• Baby name popularity
• Mock demo

Usage
• Selects the most popular name for a given year
SELECT name,occurrences FROM baby_names WHERE year=2012 LIMIT 1;
• Selects the total occurrences of a given name across all years
SELECT /*+ NO_INDEX */ name,sum(occurrences) FROM baby_names
WHERE name='Jesse' GROUP BY name;
• Selects the total occurrences of a given name across all years allowing an
index to be used
SELECT name,sum(occurrences) FROM baby_names WHERE name='Jesse'
GROUP BY NAME;

Usage
• Update rows due to census inaccuracy
– Will only work if the mutable indexing is working
UPSERT INTO baby_names SELECT year,occurrences+3000,sex,name
FROM baby_names WHERE name='Jesse';
• Selects the now updated data (from the index table)
SELECT name,sum(occurrences) FROM baby_names WHERE
name='Jesse' GROUP BY NAME;
• Index table still used in scans
EXPLAIN SELECT name,sum(occurrences) FROM baby_names WHERE
name='Jesse' GROUP BY NAME;
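To illustrate why the indexed query can be answered from the index table alone, here is a rough Java sketch of a covered index. It assumes an index on name that also stores occurrences as a covered column; the slides do not show the index definition, so that layout, the name|year key format, and all class and method names are assumptions.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy covered index: the index row carries the covered column, so the
// SUM query never has to go back to the data table.
class CoveredIndexSketch {
    // Index rows sorted by "name|year"; the value is the covered column.
    final TreeMap<String, Long> indexByName = new TreeMap<>();

    // The indexed column (name) is part of the key, so rewriting the
    // occurrences for the same name and year overwrites the index row.
    void upsert(int year, String name, long occurrences) {
        indexByName.put(name + "|" + year, occurrences);
    }

    // Equivalent in spirit to:
    //   SELECT name, sum(occurrences) FROM baby_names WHERE name = ? GROUP BY name
    long totalOccurrences(String name) {
        String prefix = name + "|";
        long total = 0;
        for (Map.Entry<String, Long> e : indexByName.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break; // left the range of rows for this name
            }
            total += e.getValue();
        }
        return total;
    }

    public static void main(String[] args) {
        CoveredIndexSketch idx = new CoveredIndexSketch();
        idx.upsert(2011, "Jesse", 100);
        idx.upsert(2012, "Jesse", 200);
        System.out.println(idx.totalOccurrences("Jesse"));
    }
}
```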

Agenda
• About
• Roadmap

Internals
• Index Management
– Build index updates
– Ensures index is ‘cleaned up’
• Recovery Mechanism
– Ensures index updates are “ACID”

“There is no magic”
- Every programming hipster (chipster)

Mutable Indexing: Standard Write Path
[Diagram: Client, HRegion, RegionCoprocessorHost, WAL, MemStore]

Mutable Indexing: Standard Write Path
[Diagram: Client, HRegion, WAL, MemStore]

Mutable Indexing
[Diagram: RegionCoprocessorHost, WAL, Indexer, Indexer Builder, Codec, WAL Updater ("Durable!"), Index Tables]

Index Management
• Lives within a RegionCoprocessorObserver
• Access to the local HRegion
• Specifies the mutations to apply to the index tables
public interface IndexBuilder {
  public void setup(RegionCoprocessorEnvironment env);
  public Map<Mutation, String> getIndexUpdate(Put put);
  public Map<Mutation, String> getIndexUpdate(Delete delete);
}

Why not write my own?
• Managing Cleanup
– Efficient point-in-time correctness
– Performance tricks
• Abstract access to HRegion
– Minimal network hops
• Sorting correctness
– Phoenix typing ensures correct index sorting

Example: Managing Cleanup
• Updates can arrive out of order
– Client-managed timestamps
ROW FAMILY QUALIFIER TS VALUE
Row1 Fam Qual 10 val1
Row1 Fam2 Qual2 12 val2

Index Table
ROW FAMILY QUALIFIER TS
Val1|Row1 Index Fam:Qual 10
Val1|Val2|Row1 Index Fam:Qual
Fam2:Qual2
12
Fam2:Qual2
13

Val1|Row1 Index Fam:Qual 10
Fam2:Qual2
12
Val1|Val2|Row1 Index Fam:Qual
Fam2:Qual2
12
Fam2:Qual2
13

Managing Cleanup
• History “roll up”
• Out-of-order Updates
• Point-in-time correctness
• Multiple Timestamps per Mutation
• Delete vs. DeleteColumn vs. DeleteFamily
Surprisingly hard!
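A simplified Java sketch of the out-of-order case, assuming a single indexed column and at most one write per row and timestamp. All names are hypothetical, index row keys follow the Val1|Row1 pattern from the example, and delete markers are modeled as false entries rather than real HBase Delete mutations. When a late write lands between two existing versions, the entry for the older value is retired at the new timestamp and the new entry is retired again at the timestamp of the newer version.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Toy model of point-in-time index cleanup for one indexed column.
class IndexCleanupSketch {
    // Data table: row -> (timestamp -> value of the indexed column).
    final Map<String, TreeMap<Long, String>> primary = new HashMap<>();
    // Index table: "value|row" -> (timestamp -> true = put, false = delete marker).
    final Map<String, TreeMap<Long, Boolean>> index = new HashMap<>();

    void put(String row, long ts, String value) {
        TreeMap<Long, String> versions = primary.computeIfAbsent(row, r -> new TreeMap<>());
        Map.Entry<Long, String> older = versions.lowerEntry(ts);  // version just before ts
        Map.Entry<Long, String> newer = versions.higherEntry(ts); // version just after ts
        versions.put(ts, value);

        // The new value becomes visible in the index at ts.
        mark(value + "|" + row, ts, true);
        // The value it replaces stops being current at ts.
        if (older != null && !older.getValue().equals(value)) {
            mark(older.getValue() + "|" + row, ts, false);
        }
        // Out-of-order arrival: a newer version already exists, so the
        // value written now is only current until that version's timestamp.
        if (newer != null && !newer.getValue().equals(value)) {
            mark(value + "|" + row, newer.getKey(), false);
        }
    }

    private void mark(String indexRow, long ts, boolean live) {
        index.computeIfAbsent(indexRow, r -> new TreeMap<>()).put(ts, live);
    }

    // Is this index row visible to a read at the given timestamp?
    boolean isLive(String indexRow, long ts) {
        TreeMap<Long, Boolean> markers = index.get(indexRow);
        if (markers == null) {
            return false;
        }
        Map.Entry<Long, Boolean> latest = markers.floorEntry(ts);
        return latest != null && latest.getValue();
    }

    public static void main(String[] args) {
        IndexCleanupSketch s = new IndexCleanupSketch();
        s.put("Row1", 10, "val1");
        s.put("Row1", 20, "val3");
        s.put("Row1", 15, "val2"); // arrives late, between the two versions
        System.out.println(s.isLive("val2|Row1", 17));
    }
}
```

Even this reduced model needs three separate index mutations per write; multiple columns, multiple timestamps per mutation, and the different delete types are left out.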

Phoenix Index Builder
• Much simpler than full index management
• Hides cleanup considerations
• Abstracted access to local state
public interface IndexCodec {
  public void initialize(RegionCoprocessorEnvironment env);
  public Iterable<IndexUpdate> getIndexDeletes(TableState state);
  public Iterable<IndexUpdate> getIndexUpserts(TableState state);
}

Phoenix Index Codec

Dude, where’s my data?
Ensuring Correctness

HBase ACID
• Does NOT give you:
– Cross-row consistency
– Cross-table consistency
• Does give you:
– Durable data on success
– Visibility on success without partial rows

Key Observation
“Secondary indexing is inherently an easier
problem than full transactions… secondary
index updates are idempotent.”
- Lars Hofhansl

Idempotent Index Updates
• Doesn’t need full transactions
• Replay as many times as needed
• Can tolerate a little lag
– As long as we get the order right
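A small Java sketch of why replay is safe, assuming every index update carries its own explicit timestamp. The IndexUpdate type here is a hypothetical stand-in, not the Phoenix class of the same name. Because each update writes a fixed cell (index row plus timestamp), applying the same batch once or many times leaves the table in the same state.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

class IdempotentReplaySketch {
    // Hypothetical stand-in for one index mutation with an explicit timestamp.
    static final class IndexUpdate {
        final String indexRow;
        final long ts;
        final boolean live; // true = put, false = delete marker

        IndexUpdate(String indexRow, long ts, boolean live) {
            this.indexRow = indexRow;
            this.ts = ts;
            this.live = live;
        }
    }

    // Applies a batch to a toy index table keyed by "indexRow@ts".
    // Writing the same cell again just overwrites it with the same value.
    static void apply(TreeMap<String, Boolean> indexTable, List<IndexUpdate> batch) {
        for (IndexUpdate u : batch) {
            indexTable.put(u.indexRow + "@" + u.ts, u.live);
        }
    }

    public static void main(String[] args) {
        List<IndexUpdate> batch = Arrays.asList(
            new IndexUpdate("val1|Row1", 10L, true),
            new IndexUpdate("val1|Row1", 12L, false),
            new IndexUpdate("val2|Row1", 12L, true));

        TreeMap<String, Boolean> once = new TreeMap<>();
        apply(once, batch);

        TreeMap<String, Boolean> replayed = new TreeMap<>();
        apply(replayed, batch);
        apply(replayed, batch); // simulated replay after a failure

        System.out.println(once.equals(replayed));
    }
}
```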

Failure Recovery
• Custom WALEditCodec
– Encodes index updates
– Supports compressed WAL
• Custom WAL Reader
– Replay index updates from WAL
<property>
  <name>hbase.regionserver.wal.codec</name>
  <value>org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec</value>
</property>
<property>
  <name>hbase.regionserver.hlog.reader.impl</name>
  <value>org.apache.hadoop.hbase.regionserver.wal.IndexedHLogReader</value>
</property>

Failure Situations
• Any time before WAL, client replay
• Any time after WAL, HBase replay
• All-or-nothing
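The two failure cases can be modeled with a toy write path in Java. This is only a sketch under assumptions: the class, the failAt parameter, and the order in which the index table and MemStore are updated after the WAL append are all hypothetical, and strings stand in for real edits. The point it shows is that a WAL entry carries both the data edit and its index update, so replay restores both or neither.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class WalReplaySketch {
    static final int NO_FAILURE = 0;
    static final int FAIL_BEFORE_WAL = 1;
    static final int FAIL_AFTER_WAL = 2;

    // Each WAL entry holds {dataEdit, indexEdit} together.
    final List<String[]> wal = new ArrayList<>();
    final Set<String> memstore = new TreeSet<>();
    final Set<String> indexTable = new TreeSet<>();

    // Returns true if the write was acknowledged to the client.
    boolean write(String dataEdit, String indexEdit, int failAt) {
        if (failAt == FAIL_BEFORE_WAL) {
            return false; // nothing is durable; the client retries the whole update
        }
        wal.add(new String[] {dataEdit, indexEdit});
        if (failAt == FAIL_AFTER_WAL) {
            return false; // durable in the WAL but not yet applied anywhere
        }
        indexTable.add(indexEdit);
        memstore.add(dataEdit);
        return true;
    }

    // Recovery: re-apply every WAL entry. Sets make the replay idempotent.
    void replayWal() {
        for (String[] entry : wal) {
            memstore.add(entry[0]);
            indexTable.add(entry[1]);
        }
    }

    public static void main(String[] args) {
        WalReplaySketch region = new WalReplaySketch();
        region.write("Row1=val1", "val1|Row1", FAIL_AFTER_WAL);
        region.replayWal();
        System.out.println(region.memstore + " " + region.indexTable);
    }
}
```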

Failure #1: Before WAL
[Diagram: Client, HRegion, WAL, MemStore]

Failure #1: Before WAL
[Diagram: Client, HRegion, WAL, MemStore]
No problem! No data is stored in the WAL, so the client just retries the entire update.

Failure #2: After WAL
[Diagram: Client, HRegion, WAL, MemStore]

Failure #2: After WAL
[Diagram: Client, HRegion, WAL, MemStore]
WAL replayed via the usual replay mechanisms.

Agenda
• About
• Mutable Indexes
• Roadmap

Roadmap
• Next release of Phoenix
• Performance testing
• Increased adoption
• Adding to HBase (?)

Open Source!
• Main:
https://github.com/forcedotcom/phoenix
• Indexing:
https://github.com/forcedotcom/phoenix/tree/mutable-si

(obligatory hiring slide)
We’re Hiring!

Questions? Comments?
jyates@salesforce.com
@jesse_yates
