SlideShare a Scribd company logo
Secondary Indexing in Phoenix
Jesse Yates
HBase Committer
Software Engineer
LA HBase User Group – September 4, 2013
Agenda
• About
• Other Indexing Frameworks
• Immutable Indexes
• Mutable Indexes in Phoenix
• Mutable Indexing Internals
• Roadmap
2 LA HUG – Sept 2013
https://www.madison.k12.wi.us/calendars
About me
• Developer at Salesforce
– System of Record, Phoenix
• Open Source
– Phoenix
– HBase
– Accumulo
3 LA HUG – Sept 2013
Phoenix
• Open Source
– https://github.com/forcedotcom/phoenix
• “SQL-skin” on HBase
– Everyone knows SQL!
• JDBC Driver
– Plug-and-play
• Faster than HBase
– in some cases
4 LA HUG – Sept 2013
Why Index?
• HBase is only sorted on 1 “axis”
• Great for search via a single pattern
Example!
LA HUG – Sept 20135
Example
name:
type:
subtype:
date:
major:
minor:
quantity:
LA HUG – Sept 20136
Secondary Indexes
• Sort on ‘orthogonal’ axis
• Save full-table scan
• Expected database feature
• Hard in HBase b/c of ACID considerations
LA HUG – Sept 20137
Agenda
• About
• Other Indexing Frameworks
• Immutable Indexes
• Mutable Indexes in Phoenix
• Mutable Indexing Internals
• Roadmap
8 LA HUG – Sept 2013
9 LA HUG – Sept 2013
http://www.wired.com/wiredenterprise/2011/10/microsoft-and-hadoop/
Other (Major) Indexing Frameworks
• HBase SEP
– Side-Effects Processor
– Replication-based
– https://github.com/NGDATA/hbase-sep
• Huawei
– Server-local indexes
– Buddy regions
– https://github.com/Huawei-Hadoop/hindex
10 LA HUG – Sept 2013
Agenda
• About
• Other Indexing Frameworks
• Immutable Indexes
• Mutable Indexes in Phoenix
• Mutable Indexing Internals
• Roadmap
11 LA HUG – Sept 2013
Immutable Indexes
• Immutable Rows
• Much easier to implement
• Client-managed
• Bulk-loadable
12 LA HUG – Sept 2013
Bulk Loading
phoenix-hbase.blogspot.com
13 LA HUG – Sept 2013
Index Bulk Loading
Identity Mapper
Custom Phoenix Reducer
14 LA HUG – Sept 2013
HFile Output Format
Index Bulk Loading
PreparedStatement statement = conn.prepareStatement(dmlStatement);
statement.execute();
String upsertStmt = "upsert into
core.entity_history(organization_id,key_prefix,entity_history_id,
created_by, created_date)n" + "values(?,?,?,?,?)";
statement = conn.prepareStatement(upsertStmt);
… //set values
Iterator<Pair<byte[],List<KeyValue>>>dataIterator =
PhoenixRuntime.getUncommittedDataIterator(conn);
15 LA HUG – Sept 2013
Agenda
• About
• Other Indexing Frameworks
• Immutable Indexes
• Mutable Indexes in Phoenix
• Mutable Indexing Internals
• Roadmap
16 LA HUG – Sept 2013
The “fun” stuff…
17 LA HUG – Sept 2013
1.5 years
18 LA HUG – Sept 2013
Mutable Indexes
• Global Index
• Change row state
– Common use-case
– “expected” implementation
• Covered Columns
19 LA HUG – Sept 2013
Usage
• Just SQL!
• Baby name popularity
• Mock demo
20 LA HUG – Sept 2013
Usage
• Selects the most popular name for a given year
SELECT name,occurrences FROM baby_names WHERE year=2012 LIMIT 1;
• Selects the total occurrences of a given name across all years
SELECT /*+ NO_INDEX */ name,sum(occurrences) FROM baby_names
WHERE name='Jesse' GROUP BY name;
• Selects the total occurrences of a given name across all years allowing an
index to be used
SELECT name,sum(occurrences) FROM baby_names WHERE name='Jesse'
GROUP BY NAME;
LA HUG – Sept 201321
Usage
• Update rows due to census inaccuracy
– Will only work if the mutable indexing is working
UPSERT INTO baby_names SELECT year,occurrences+3000,sex,name
FROM baby_names WHERE name='Jesse';
• Selects the now updated data (from the index table)
SELECT name,sum(occurrences) FROM baby_names WHERE
name='Jesse' GROUP BY NAME;
• Index table still used in scans
EXPLAIN SELECT name,sum(occurrences) FROM baby_names WHERE
name='Jesse' GROUP BY NAME;
LA HUG – Sept 201322
Agenda
• About
• Other Indexing Frameworks
• Immutable Indexes
• Mutable Indexes in Phoenix
• Mutable Indexing Internals
• Roadmap
23 LA HUG – Sept 2013
Internals
• Index Management
– Build index updates
– Ensures index is ‘cleaned up’
• Recovery Mechanism
– Ensures index updates are “ACID”
24 LA HUG – Sept 2013
“There is no magic”
- Every programming hipster (chipster)
LA HUG – Sept 201325
Mutable Indexing: Standard Write Path
26
Client HRegion
RegionCoprocessorHost
WAL
RegionCoprocessorHost
MemStore
LA HUG – Sept 2013
Mutable Indexing: Standard Write Path
27
Client HRegion
RegionCoprocessorHost
WAL
RegionCoprocessorHost
MemStore
LA HUG – Sept 2013
Mutable Indexing
28
Region
Coprocessor
Host
WAL
Region
Coprocessor
Host
Indexer Builder
WAL Updater
Durable!
Indexer
Index Table
Index Table
Index Table
Codec
LA HUG – Sept 2013
Index Management
29
• Lives within a RegionCoprocesorObserver
• Access to the local HRegion
• Specifies the mutations to apply to the index
tables
public interface IndexBuilder{
public void setup(RegionCoprocessorEnvironmentenv);
public Map<Mutation, String>getIndexUpdate(Put put);
public Map<Mutation, String>getIndexUpdate(Deletedelete);
}
LA HUG – Sept 2013
Why not write my own?
• Managing Cleanup
– Efficient point-in-time correctness
– Performance tricks
• Abstract access to HRegion
– Minimal network hops
• Sorting correctness
– Phoenix typing ensures correct index sorting
LA HUG – Sept 201330
Example: Managing Cleanup
• Updates can arrive out of order
– Client-managed timestamps
LA HUG – Sept 201331
ROW FAMILY QUALIFIER TS VALUE
Row1 Fam Qual 10 val1
Row1 Fam2 Qual2 12 val2
Row1 Fam Qual 13 val3
Example: Managing Cleanup
Index Table
LA HUG – Sept 201332
ROW FAMILY QUALIFIER TS
Val1|Row1 Index Fam:Qual 10
Val1|Val2|Row1 Index Fam:Qual
Fam2:Qual2
12
Val3|Val2|Row1 Index Fam:Qual
Fam2:Qual2
13
Example: Managing Cleanup
LA HUG – Sept 201333
ROW FAMILY QUALIFIER TS VALUE
Row1 Fam Qual 10 val1
Row1 Fam2 Qual2 12 val2
Row1 Fam Qual 13 val3
Row1 Fam Qual 11 val4
Example: Managing Cleanup
LA HUG – Sept 201334
ROW FAMILY QUALIFIER TS VALUE
Row1 Fam Qual 10 val1
Row1 Fam Qual 11 val4
Row1 Fam2 Qual2 12 val2
Row1 Fam Qual 13 val3
Example: Managing Cleanup
LA HUG – Sept 201335
ROW FAMILY QUALIFIER TS
Va1|Row1 Index Fam:Qual 10
Val4|Row1 Index Fam:Qual 11
Val4|Val2|Row1 Index Fam:Qual
Fam2:Qual2
12
Va1l|Val2|Row1 Index Fam:Qual
Fam2:Qual2
12
Val3|Val2|Row1 Index Fam:Qual
Fam2:Qual2
13
Example: Managing Cleanup
LA HUG – Sept 201336
ROW FAMILY QUALIFIER TS
Va1|Row1 Index Fam:Qual 10
Val4|Row1 Index Fam:Qual 11
Val4|Val2|Row1 Index Fam:Qual
Fam2:Qual2
12
Va1l|Val2|Row1 Index Fam:Qual
Fam2:Qual2
12
Val3|Val2|Row1 Index Fam:Qual
Fam2:Qual2
13
Managing Cleanup
• History “roll up”
• Out-of-order Updates
• Point-in-time correctness
• Multiple Timestamps per Mutation
• Delete vs. DeleteColumn vs. DeleteFamily
Surprisingly hard!
LA HUG – Sept 201337
Phoenix Index Builder
• Much simpler than full index management
• Hides cleanup considerations
• Abstracted access to local state
LA HUG – Sept 201338
public interfaceIndexCodec{
public void initialize(RegionCoprocessorEnvironmentenv);
public Iterable<IndexUpdate>getIndexDeletes(TableState state;
public Iterable<IndexUpdate>getIndexUpserts(TableState state);
}
Phoenix Index Codec
LA HUG – Sept 201339
Dude, where’s my data?
40 LA HUG – Sept 2013
Ensuring Correctness
HBase ACID
• Does NOT give you:
– Cross-row consistency
– Cross-table consistency
• Does give you:
– Durable data on success
– Visibility on success without partial rows
41 LA HUG – Sept 2013
Key Observation
“Secondary indexing is inherently an easier
problem than full transactions… secondary
index updates are idempotent.”
- Lars Hofhansl
42 LA HUG – Sept 2013
Idempotent Index Updates
• Doesn’t need full transactions
• Replay as many times as needed
• Can tolerate a little lag
– As long as we get the order right
43 LA HUG – Sept 2013
Failure Recovery
• Custom WALEditCodec
– Encodes index updates
– Supports compressed WAL
• Custom WAL Reader
– Replay index updates from WAL
LA HUG – Sept 201344
<property>
<name>hbase.regionserver.wal.codec</name><value>o.a.h.hbase.regionserver.w
al.IndexedWALEditCodec</value>
</property>
<property>
<name>hbase.regionserver.hlog.reader.impl</name>
<value>o.a.h.hbase.regionserver.wal.IndexedHLogReader</value>
</property>
Failure Situations
• Any time before WAL, client replay
• Any time after WAL, HBase replay
• All-or-nothing
LA HUG – Sept 201345
Failure #1: Before WAL
46
Client HRegion
RegionCoprocessorHost
WAL
RegionCoprocessorHost
MemStore
LA HUG – Sept 2013
Failure #1: Before WAL
47
Client HRegion
RegionCoprocessorHost
WAL
RegionCoprocessorHost
MemStore
No problem! No data
is stored in the
WAL, client just retries
entire update.
LA HUG – Sept 2013
Failure #2: After WAL
48
Client HRegion
RegionCoprocessorHost
WAL
RegionCoprocessorHost
MemStore
LA HUG – Sept 2013
Failure #2: After WAL
49
Client HRegion
RegionCoprocessorHost
WAL
RegionCoprocessorHost
MemStore
WAL replayed via
usual replay
mechanisms
LA HUG – Sept 2013
Agenda
• About
• Other Indexing Frameworks
• Immutable Indexes
• Mutable Indexes
• Roadmap
50 LA HUG – Sept 2013
Roadmap
• Next release of Phoenix
• Performance testing
• Increased adoption
• Adding to HBase (?)
51 LA HUG – Sept 2013
Open Source!
• Main:
https://github.com/forcedotcom/phoenix
• Indexing:
https://github.com/forcedotcom/phoenix/tree/mutable-si
52 LA HUG – Sept 2013
(obligatory hiring slide)
We’re Hiring!
Questions? Comments?
jyates@salesforce.com
@jesse_yates

More Related Content

What's hot

HBaseConEast2016: Splice machine open source rdbms
HBaseConEast2016: Splice machine open source rdbmsHBaseConEast2016: Splice machine open source rdbms
HBaseConEast2016: Splice machine open source rdbms
Michael Stack
 
HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform i...
HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform i...HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform i...
HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform i...
Michael Stack
 
Journey of Implementing Solr at Target: Presented by Raja Ramachandran, Target
Journey of Implementing Solr at Target: Presented by Raja Ramachandran, TargetJourney of Implementing Solr at Target: Presented by Raja Ramachandran, Target
Journey of Implementing Solr at Target: Presented by Raja Ramachandran, Target
Lucidworks
 
Apache Con 2021 Structured Data Streaming
Apache Con 2021 Structured Data StreamingApache Con 2021 Structured Data Streaming
Apache Con 2021 Structured Data Streaming
Shivji Kumar Jha
 
Apachecon Europe 2012: Operating HBase - Things you need to know
Apachecon Europe 2012: Operating HBase - Things you need to knowApachecon Europe 2012: Operating HBase - Things you need to know
Apachecon Europe 2012: Operating HBase - Things you need to know
Christian Gügi
 
HBaseConEast2016: HBase and Spark, State of the Art
HBaseConEast2016: HBase and Spark, State of the ArtHBaseConEast2016: HBase and Spark, State of the Art
HBaseConEast2016: HBase and Spark, State of the Art
Michael Stack
 
#GeodeSummit Keynote: Creating the Future of Big Data Through 'The Apache Way"
#GeodeSummit Keynote: Creating the Future of Big Data Through 'The Apache Way"#GeodeSummit Keynote: Creating the Future of Big Data Through 'The Apache Way"
#GeodeSummit Keynote: Creating the Future of Big Data Through 'The Apache Way"
PivotalOpenSourceHub
 
Getting Started with Hadoop
Getting Started with HadoopGetting Started with Hadoop
Getting Started with Hadoop
Cloudera, Inc.
 
Sql over hadoop ver 3
Sql over hadoop ver 3Sql over hadoop ver 3
Sql over hadoop ver 3
Sudheesh Narayanan
 
Big Data Warehousing Meetup: Developing a super-charged NoSQL data mart using...
Big Data Warehousing Meetup: Developing a super-charged NoSQL data mart using...Big Data Warehousing Meetup: Developing a super-charged NoSQL data mart using...
Big Data Warehousing Meetup: Developing a super-charged NoSQL data mart using...
Caserta
 
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at LyftSF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
Chester Chen
 
Apache drill
Apache drillApache drill
Apache drill
MapR Technologies
 
Riak at shareaholic
Riak at shareaholicRiak at shareaholic
Riak at shareaholic
freerobby
 
Cloudera Hadoop Distribution
Cloudera Hadoop DistributionCloudera Hadoop Distribution
Cloudera Hadoop Distribution
Thisara Pramuditha
 
Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...
Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...
Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...
Chris Fregly
 
HBase and Impala Notes - Munich HUG - 20131017
HBase and Impala Notes - Munich HUG - 20131017HBase and Impala Notes - Munich HUG - 20131017
HBase and Impala Notes - Munich HUG - 20131017
larsgeorge
 
Apache Phoenix with Actor Model (Akka.io) for real-time Big Data Programming...
Apache Phoenix with Actor Model (Akka.io)  for real-time Big Data Programming...Apache Phoenix with Actor Model (Akka.io)  for real-time Big Data Programming...
Apache Phoenix with Actor Model (Akka.io) for real-time Big Data Programming...
Trieu Nguyen
 
Apache Spark Core
Apache Spark CoreApache Spark Core
Apache Spark Core
Girish Khanzode
 
Semi-Supervised Learning In An Adversarial Environment
Semi-Supervised Learning In An Adversarial EnvironmentSemi-Supervised Learning In An Adversarial Environment
Semi-Supervised Learning In An Adversarial Environment
DataWorks Summit
 
Apache Arrow Flight Overview
Apache Arrow Flight OverviewApache Arrow Flight Overview
Apache Arrow Flight Overview
Jacques Nadeau
 

What's hot (20)

HBaseConEast2016: Splice machine open source rdbms
HBaseConEast2016: Splice machine open source rdbmsHBaseConEast2016: Splice machine open source rdbms
HBaseConEast2016: Splice machine open source rdbms
 
HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform i...
HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform i...HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform i...
HBaseConEast2016: How yarn timeline service v.2 unlocks 360 degree platform i...
 
Journey of Implementing Solr at Target: Presented by Raja Ramachandran, Target
Journey of Implementing Solr at Target: Presented by Raja Ramachandran, TargetJourney of Implementing Solr at Target: Presented by Raja Ramachandran, Target
Journey of Implementing Solr at Target: Presented by Raja Ramachandran, Target
 
Apache Con 2021 Structured Data Streaming
Apache Con 2021 Structured Data StreamingApache Con 2021 Structured Data Streaming
Apache Con 2021 Structured Data Streaming
 
Apachecon Europe 2012: Operating HBase - Things you need to know
Apachecon Europe 2012: Operating HBase - Things you need to knowApachecon Europe 2012: Operating HBase - Things you need to know
Apachecon Europe 2012: Operating HBase - Things you need to know
 
HBaseConEast2016: HBase and Spark, State of the Art
HBaseConEast2016: HBase and Spark, State of the ArtHBaseConEast2016: HBase and Spark, State of the Art
HBaseConEast2016: HBase and Spark, State of the Art
 
#GeodeSummit Keynote: Creating the Future of Big Data Through 'The Apache Way"
#GeodeSummit Keynote: Creating the Future of Big Data Through 'The Apache Way"#GeodeSummit Keynote: Creating the Future of Big Data Through 'The Apache Way"
#GeodeSummit Keynote: Creating the Future of Big Data Through 'The Apache Way"
 
Getting Started with Hadoop
Getting Started with HadoopGetting Started with Hadoop
Getting Started with Hadoop
 
Sql over hadoop ver 3
Sql over hadoop ver 3Sql over hadoop ver 3
Sql over hadoop ver 3
 
Big Data Warehousing Meetup: Developing a super-charged NoSQL data mart using...
Big Data Warehousing Meetup: Developing a super-charged NoSQL data mart using...Big Data Warehousing Meetup: Developing a super-charged NoSQL data mart using...
Big Data Warehousing Meetup: Developing a super-charged NoSQL data mart using...
 
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at LyftSF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
SF Big Analytics_20190612: Scaling Apache Spark on Kubernetes at Lyft
 
Apache drill
Apache drillApache drill
Apache drill
 
Riak at shareaholic
Riak at shareaholicRiak at shareaholic
Riak at shareaholic
 
Cloudera Hadoop Distribution
Cloudera Hadoop DistributionCloudera Hadoop Distribution
Cloudera Hadoop Distribution
 
Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...
Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...
Global Big Data Conference Sept 2014 AWS Kinesis Spark Streaming Approximatio...
 
HBase and Impala Notes - Munich HUG - 20131017
HBase and Impala Notes - Munich HUG - 20131017HBase and Impala Notes - Munich HUG - 20131017
HBase and Impala Notes - Munich HUG - 20131017
 
Apache Phoenix with Actor Model (Akka.io) for real-time Big Data Programming...
Apache Phoenix with Actor Model (Akka.io)  for real-time Big Data Programming...Apache Phoenix with Actor Model (Akka.io)  for real-time Big Data Programming...
Apache Phoenix with Actor Model (Akka.io) for real-time Big Data Programming...
 
Apache Spark Core
Apache Spark CoreApache Spark Core
Apache Spark Core
 
Semi-Supervised Learning In An Adversarial Environment
Semi-Supervised Learning In An Adversarial EnvironmentSemi-Supervised Learning In An Adversarial Environment
Semi-Supervised Learning In An Adversarial Environment
 
Apache Arrow Flight Overview
Apache Arrow Flight OverviewApache Arrow Flight Overview
Apache Arrow Flight Overview
 

Similar to Phoenix Secondary Indexing - LA HUG Sept 9th, 2013

High-level languages for Big Data Analytics (Presentation)
High-level languages for Big Data Analytics (Presentation)High-level languages for Big Data Analytics (Presentation)
High-level languages for Big Data Analytics (Presentation)
Jose Luis Lopez Pino
 
Search onhadoopsfhug081413
Search onhadoopsfhug081413Search onhadoopsfhug081413
Search onhadoopsfhug081413
gregchanan
 
Apache Hadoop 1.1
Apache Hadoop 1.1Apache Hadoop 1.1
Apache Hadoop 1.1
Sperasoft
 
Map reducecloudtech
Map reducecloudtechMap reducecloudtech
Map reducecloudtech
Jakir Hossain
 
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Lucas Jellema
 
Pig Experience
Pig ExperiencePig Experience
Elasticsearch + Cascading for Scalable Log Processing
Elasticsearch + Cascading for Scalable Log ProcessingElasticsearch + Cascading for Scalable Log Processing
Elasticsearch + Cascading for Scalable Log Processing
Cascading
 
Untangling - fall2017 - week 8
Untangling - fall2017 - week 8Untangling - fall2017 - week 8
Untangling - fall2017 - week 8
Derek Jacoby
 
Hadoop_EcoSystem_Pradeep_MG
Hadoop_EcoSystem_Pradeep_MGHadoop_EcoSystem_Pradeep_MG
Hadoop_EcoSystem_Pradeep_MG
Pradeep MG
 
Cheetah:Data Warehouse on Top of MapReduce
Cheetah:Data Warehouse on Top of MapReduceCheetah:Data Warehouse on Top of MapReduce
Cheetah:Data Warehouse on Top of MapReduce
Tilani Gunawardena PhD(UNIBAS), BSc(Pera), FHEA(UK), CEng, MIESL
 
Challenges of Implementing an Advanced SQL Engine on Hadoop
Challenges of Implementing an Advanced SQL Engine on HadoopChallenges of Implementing an Advanced SQL Engine on Hadoop
Challenges of Implementing an Advanced SQL Engine on Hadoop
DataWorks Summit
 
Indexing Complex PostgreSQL Data Types
Indexing Complex PostgreSQL Data TypesIndexing Complex PostgreSQL Data Types
Indexing Complex PostgreSQL Data Types
Jonathan Katz
 
Hive Evolution: ApacheCon NA 2010
Hive Evolution:  ApacheCon NA 2010Hive Evolution:  ApacheCon NA 2010
Hive Evolution: ApacheCon NA 2010
John Sichi
 
mongodb_DS.pptx
mongodb_DS.pptxmongodb_DS.pptx
mongodb_DS.pptx
DavoudSalehi1
 
Big Data training
Big Data trainingBig Data training
Big Data training
vishal192091
 
Speed up Interactive Analytic Queries over Existing Big Data on Hadoop with P...
Speed up Interactive Analytic Queries over Existing Big Data on Hadoop with P...Speed up Interactive Analytic Queries over Existing Big Data on Hadoop with P...
Speed up Interactive Analytic Queries over Existing Big Data on Hadoop with P...
viirya
 
Enkitec E4 Barcelona : SQL and Data Integration Futures on Hadoop :
Enkitec E4 Barcelona : SQL and Data Integration Futures on Hadoop : Enkitec E4 Barcelona : SQL and Data Integration Futures on Hadoop :
Enkitec E4 Barcelona : SQL and Data Integration Futures on Hadoop :
Mark Rittman
 
try
trytry
FOSS4G 2017 Spatial Sql for Rookies
FOSS4G 2017 Spatial Sql for RookiesFOSS4G 2017 Spatial Sql for Rookies
FOSS4G 2017 Spatial Sql for Rookies
Todd Barr
 
Azure Cosmos DB: Features, Practical Use and Optimization "
Azure Cosmos DB: Features, Practical Use and Optimization "Azure Cosmos DB: Features, Practical Use and Optimization "
Azure Cosmos DB: Features, Practical Use and Optimization "
GlobalLogic Ukraine
 

Similar to Phoenix Secondary Indexing - LA HUG Sept 9th, 2013 (20)

High-level languages for Big Data Analytics (Presentation)
High-level languages for Big Data Analytics (Presentation)High-level languages for Big Data Analytics (Presentation)
High-level languages for Big Data Analytics (Presentation)
 
Search onhadoopsfhug081413
Search onhadoopsfhug081413Search onhadoopsfhug081413
Search onhadoopsfhug081413
 
Apache Hadoop 1.1
Apache Hadoop 1.1Apache Hadoop 1.1
Apache Hadoop 1.1
 
Map reducecloudtech
Map reducecloudtechMap reducecloudtech
Map reducecloudtech
 
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
Introducing NoSQL and MongoDB to complement Relational Databases (AMIS SIG 14...
 
Pig Experience
Pig ExperiencePig Experience
Pig Experience
 
Elasticsearch + Cascading for Scalable Log Processing
Elasticsearch + Cascading for Scalable Log ProcessingElasticsearch + Cascading for Scalable Log Processing
Elasticsearch + Cascading for Scalable Log Processing
 
Untangling - fall2017 - week 8
Untangling - fall2017 - week 8Untangling - fall2017 - week 8
Untangling - fall2017 - week 8
 
Hadoop_EcoSystem_Pradeep_MG
Hadoop_EcoSystem_Pradeep_MGHadoop_EcoSystem_Pradeep_MG
Hadoop_EcoSystem_Pradeep_MG
 
Cheetah:Data Warehouse on Top of MapReduce
Cheetah:Data Warehouse on Top of MapReduceCheetah:Data Warehouse on Top of MapReduce
Cheetah:Data Warehouse on Top of MapReduce
 
Challenges of Implementing an Advanced SQL Engine on Hadoop
Challenges of Implementing an Advanced SQL Engine on HadoopChallenges of Implementing an Advanced SQL Engine on Hadoop
Challenges of Implementing an Advanced SQL Engine on Hadoop
 
Indexing Complex PostgreSQL Data Types
Indexing Complex PostgreSQL Data TypesIndexing Complex PostgreSQL Data Types
Indexing Complex PostgreSQL Data Types
 
Hive Evolution: ApacheCon NA 2010
Hive Evolution:  ApacheCon NA 2010Hive Evolution:  ApacheCon NA 2010
Hive Evolution: ApacheCon NA 2010
 
mongodb_DS.pptx
mongodb_DS.pptxmongodb_DS.pptx
mongodb_DS.pptx
 
Big Data training
Big Data trainingBig Data training
Big Data training
 
Speed up Interactive Analytic Queries over Existing Big Data on Hadoop with P...
Speed up Interactive Analytic Queries over Existing Big Data on Hadoop with P...Speed up Interactive Analytic Queries over Existing Big Data on Hadoop with P...
Speed up Interactive Analytic Queries over Existing Big Data on Hadoop with P...
 
Enkitec E4 Barcelona : SQL and Data Integration Futures on Hadoop :
Enkitec E4 Barcelona : SQL and Data Integration Futures on Hadoop : Enkitec E4 Barcelona : SQL and Data Integration Futures on Hadoop :
Enkitec E4 Barcelona : SQL and Data Integration Futures on Hadoop :
 
try
trytry
try
 
FOSS4G 2017 Spatial Sql for Rookies
FOSS4G 2017 Spatial Sql for RookiesFOSS4G 2017 Spatial Sql for Rookies
FOSS4G 2017 Spatial Sql for Rookies
 
Azure Cosmos DB: Features, Practical Use and Optimization "
Azure Cosmos DB: Features, Practical Use and Optimization "Azure Cosmos DB: Features, Practical Use and Optimization "
Azure Cosmos DB: Features, Practical Use and Optimization "
 

Recently uploaded

Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
The Digital Insurer
 
From 1M to 1B Features Per Second: Scaling ShareChat's ML Feature Store
From 1M to 1B Features Per Second: Scaling ShareChat's ML Feature StoreFrom 1M to 1B Features Per Second: Scaling ShareChat's ML Feature Store
From 1M to 1B Features Per Second: Scaling ShareChat's ML Feature Store
ScyllaDB
 
Chapter 4 - Test Analysis & Design Techniques V4.0
Chapter 4 - Test Analysis & Design Techniques V4.0Chapter 4 - Test Analysis & Design Techniques V4.0
Chapter 4 - Test Analysis & Design Techniques V4.0
Neeraj Kumar Singh
 
An Introduction to All Data Enterprise Integration
An Introduction to All Data Enterprise IntegrationAn Introduction to All Data Enterprise Integration
An Introduction to All Data Enterprise Integration
Safe Software
 
Multimodal Retrieval Augmented Generation (RAG) with Milvus
Multimodal Retrieval Augmented Generation (RAG) with MilvusMultimodal Retrieval Augmented Generation (RAG) with Milvus
Multimodal Retrieval Augmented Generation (RAG) with Milvus
Zilliz
 
Product Listing Optimization Presentation - Gay De La Cruz.pdf
Product Listing Optimization Presentation - Gay De La Cruz.pdfProduct Listing Optimization Presentation - Gay De La Cruz.pdf
Product Listing Optimization Presentation - Gay De La Cruz.pdf
gaydlc2513
 
Chapter 3 - Static Testing (Review) V4.0
Chapter 3 - Static Testing (Review) V4.0Chapter 3 - Static Testing (Review) V4.0
Chapter 3 - Static Testing (Review) V4.0
Neeraj Kumar Singh
 
20240702 QFM021 Machine Intelligence Reading List June 2024
20240702 QFM021 Machine Intelligence Reading List June 202420240702 QFM021 Machine Intelligence Reading List June 2024
20240702 QFM021 Machine Intelligence Reading List June 2024
Matthew Sinclair
 
STKI Israeli Market Study 2024 final v1
STKI Israeli Market Study 2024 final  v1STKI Israeli Market Study 2024 final  v1
STKI Israeli Market Study 2024 final v1
Dr. Jimmy Schwarzkopf
 
From NCSA to the National Research Platform
From NCSA to the National Research PlatformFrom NCSA to the National Research Platform
From NCSA to the National Research Platform
Larry Smarr
 
CNSCon 2024 Lightning Talk: Don’t Make Me Impersonate My Identity
CNSCon 2024 Lightning Talk: Don’t Make Me Impersonate My IdentityCNSCon 2024 Lightning Talk: Don’t Make Me Impersonate My Identity
CNSCon 2024 Lightning Talk: Don’t Make Me Impersonate My Identity
Cynthia Thomas
 
Chapter 6 - Test Tools Considerations V4.0
Chapter 6 - Test Tools Considerations V4.0Chapter 6 - Test Tools Considerations V4.0
Chapter 6 - Test Tools Considerations V4.0
Neeraj Kumar Singh
 
Pigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdfPigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdf
Pigging Solutions
 
GDG Cloud Southlake #34: Neatsun Ziv: Automating Appsec
GDG Cloud Southlake #34: Neatsun Ziv: Automating AppsecGDG Cloud Southlake #34: Neatsun Ziv: Automating Appsec
GDG Cloud Southlake #34: Neatsun Ziv: Automating Appsec
James Anderson
 
Product Listing Optimization Presentation - Gay De La Cruz.pdf
Product Listing Optimization Presentation - Gay De La Cruz.pdfProduct Listing Optimization Presentation - Gay De La Cruz.pdf
Product Listing Optimization Presentation - Gay De La Cruz.pdf
gaydlc2513
 
Leading a Tableau User Group - Onboarding deck for new leaders
Leading a Tableau User Group - Onboarding deck for new leadersLeading a Tableau User Group - Onboarding deck for new leaders
Leading a Tableau User Group - Onboarding deck for new leaders
lward7
 
Getting Started Using the National Research Platform
Getting Started Using the National Research PlatformGetting Started Using the National Research Platform
Getting Started Using the National Research Platform
Larry Smarr
 
Research Directions for Cross Reality Interfaces
Research Directions for Cross Reality InterfacesResearch Directions for Cross Reality Interfaces
Research Directions for Cross Reality Interfaces
Mark Billinghurst
 
Coordinate Systems in FME 101 - Webinar Slides
Coordinate Systems in FME 101 - Webinar SlidesCoordinate Systems in FME 101 - Webinar Slides
Coordinate Systems in FME 101 - Webinar Slides
Safe Software
 
Leveraging AI for Software Developer Productivity.pptx
Leveraging AI for Software Developer Productivity.pptxLeveraging AI for Software Developer Productivity.pptx
Leveraging AI for Software Developer Productivity.pptx
petabridge
 

Recently uploaded (20)

Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
From 1M to 1B Features Per Second: Scaling ShareChat's ML Feature Store
From 1M to 1B Features Per Second: Scaling ShareChat's ML Feature StoreFrom 1M to 1B Features Per Second: Scaling ShareChat's ML Feature Store
From 1M to 1B Features Per Second: Scaling ShareChat's ML Feature Store
 
Chapter 4 - Test Analysis & Design Techniques V4.0
Chapter 4 - Test Analysis & Design Techniques V4.0Chapter 4 - Test Analysis & Design Techniques V4.0
Chapter 4 - Test Analysis & Design Techniques V4.0
 
An Introduction to All Data Enterprise Integration
An Introduction to All Data Enterprise IntegrationAn Introduction to All Data Enterprise Integration
An Introduction to All Data Enterprise Integration
 
Multimodal Retrieval Augmented Generation (RAG) with Milvus
Multimodal Retrieval Augmented Generation (RAG) with MilvusMultimodal Retrieval Augmented Generation (RAG) with Milvus
Multimodal Retrieval Augmented Generation (RAG) with Milvus
 
Product Listing Optimization Presentation - Gay De La Cruz.pdf
Product Listing Optimization Presentation - Gay De La Cruz.pdfProduct Listing Optimization Presentation - Gay De La Cruz.pdf
Product Listing Optimization Presentation - Gay De La Cruz.pdf
 
Chapter 3 - Static Testing (Review) V4.0
Chapter 3 - Static Testing (Review) V4.0Chapter 3 - Static Testing (Review) V4.0
Chapter 3 - Static Testing (Review) V4.0
 
20240702 QFM021 Machine Intelligence Reading List June 2024
20240702 QFM021 Machine Intelligence Reading List June 202420240702 QFM021 Machine Intelligence Reading List June 2024
20240702 QFM021 Machine Intelligence Reading List June 2024
 
STKI Israeli Market Study 2024 final v1
STKI Israeli Market Study 2024 final  v1STKI Israeli Market Study 2024 final  v1
STKI Israeli Market Study 2024 final v1
 
From NCSA to the National Research Platform
From NCSA to the National Research PlatformFrom NCSA to the National Research Platform
From NCSA to the National Research Platform
 
CNSCon 2024 Lightning Talk: Don’t Make Me Impersonate My Identity
CNSCon 2024 Lightning Talk: Don’t Make Me Impersonate My IdentityCNSCon 2024 Lightning Talk: Don’t Make Me Impersonate My Identity
CNSCon 2024 Lightning Talk: Don’t Make Me Impersonate My Identity
 
Chapter 6 - Test Tools Considerations V4.0
Chapter 6 - Test Tools Considerations V4.0Chapter 6 - Test Tools Considerations V4.0
Chapter 6 - Test Tools Considerations V4.0
 
Pigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdfPigging Solutions Sustainability brochure.pdf
Pigging Solutions Sustainability brochure.pdf
 
GDG Cloud Southlake #34: Neatsun Ziv: Automating Appsec
GDG Cloud Southlake #34: Neatsun Ziv: Automating AppsecGDG Cloud Southlake #34: Neatsun Ziv: Automating Appsec
GDG Cloud Southlake #34: Neatsun Ziv: Automating Appsec
 
Product Listing Optimization Presentation - Gay De La Cruz.pdf
Product Listing Optimization Presentation - Gay De La Cruz.pdfProduct Listing Optimization Presentation - Gay De La Cruz.pdf
Product Listing Optimization Presentation - Gay De La Cruz.pdf
 
Leading a Tableau User Group - Onboarding deck for new leaders
Leading a Tableau User Group - Onboarding deck for new leadersLeading a Tableau User Group - Onboarding deck for new leaders
Leading a Tableau User Group - Onboarding deck for new leaders
 
Getting Started Using the National Research Platform
Getting Started Using the National Research PlatformGetting Started Using the National Research Platform
Getting Started Using the National Research Platform
 
Research Directions for Cross Reality Interfaces
Research Directions for Cross Reality InterfacesResearch Directions for Cross Reality Interfaces
Research Directions for Cross Reality Interfaces
 
Coordinate Systems in FME 101 - Webinar Slides
Coordinate Systems in FME 101 - Webinar SlidesCoordinate Systems in FME 101 - Webinar Slides
Coordinate Systems in FME 101 - Webinar Slides
 
Leveraging AI for Software Developer Productivity.pptx
Leveraging AI for Software Developer Productivity.pptxLeveraging AI for Software Developer Productivity.pptx
Leveraging AI for Software Developer Productivity.pptx
 

Phoenix Secondary Indexing - LA HUG Sept 9th, 2013

  • 1. Secondary Indexing in Phoenix Jesse Yates HBase Committer Software Engineer LA HBase User Group – September 4, 2013
  • 2. Agenda • About • Other Indexing Frameworks • Immutable Indexes • Mutable Indexes in Phoenix • Mutable Indexing Internals • Roadmap 2 LA HUG – Sept 2013 https://www.madison.k12.wi.us/calendars
  • 3. About me • Developer at Salesforce – System of Record, Phoenix • Open Source – Phoenix – HBase – Accumulo 3 LA HUG – Sept 2013
  • 4. Phoenix • Open Source – https://github.com/forcedotcom/phoenix • “SQL-skin” on HBase – Everyone knows SQL! • JDBC Driver – Plug-and-play • Faster than HBase – in some cases 4 LA HUG – Sept 2013
  • 5. Why Index? • HBase is only sorted on 1 “axis” • Great for search via a single pattern Example! LA HUG – Sept 20135
  • 7. Secondary Indexes • Sort on ‘orthogonal’ axis • Save full-table scan • Expected database feature • Hard in HBase b/c of ACID considerations LA HUG – Sept 20137
  • 8. Agenda • About • Other Indexing Frameworks • Immutable Indexes • Mutable Indexes in Phoenix • Mutable Indexing Internals • Roadmap 8 LA HUG – Sept 2013
  • 9. 9 LA HUG – Sept 2013 http://www.wired.com/wiredenterprise/2011/10/microsoft-and-hadoop/
  • 10. Other (Major) Indexing Frameworks • HBase SEP – Side-Effects Processor – Replication-based – https://github.com/NGDATA/hbase-sep • Huawei – Server-local indexes – Buddy regions – https://github.com/Huawei-Hadoop/hindex 10 LA HUG – Sept 2013
  • 11. Agenda • About • Other Indexing Frameworks • Immutable Indexes • Mutable Indexes in Phoenix • Mutable Indexing Internals • Roadmap 11 LA HUG – Sept 2013
  • 12. Immutable Indexes • Immutable Rows • Much easier to implement • Client-managed • Bulk-loadable 12 LA HUG – Sept 2013
  • 14. Index Bulk Loading Identity Mapper Custom Phoenix Reducer 14 LA HUG – Sept 2013 HFile Output Format
  • 15. Index Bulk Loading PreparedStatement statement = conn.prepareStatement(dmlStatement); statement.execute(); String upsertStmt = "upsert into core.entity_history(organization_id,key_prefix,entity_history_id, created_by, created_date)n" + "values(?,?,?,?,?)"; statement = conn.prepareStatement(upsertStmt); … //set values Iterator<Pair<byte[],List<KeyValue>>>dataIterator = PhoenixRuntime.getUncommittedDataIterator(conn); 15 LA HUG – Sept 2013
  • 16. Agenda • About • Other Indexing Frameworks • Immutable Indexes • Mutable Indexes in Phoenix • Mutable Indexing Internals • Roadmap 16 LA HUG – Sept 2013
  • 17. The “fun” stuff… 17 LA HUG – Sept 2013
  • 18. 1.5 years 18 LA HUG – Sept 2013
  • 19. Mutable Indexes • Global Index • Change row state – Common use-case – “expected” implementation • Covered Columns 19 LA HUG – Sept 2013
  • 20. Usage • Just SQL! • Baby name popularity • Mock demo 20 LA HUG – Sept 2013
  • 21. Usage • Selects the most popular name for a given year SELECT name,occurrences FROM baby_names WHERE year=2012 LIMIT 1; • Selects the total occurrences of a given name across all years SELECT /*+ NO_INDEX */ name,sum(occurrences) FROM baby_names WHERE name='Jesse' GROUP BY name; • Selects the total occurrences of a given name across all years allowing an index to be used SELECT name,sum(occurrences) FROM baby_names WHERE name='Jesse' GROUP BY NAME; LA HUG – Sept 201321
  • 22. Usage • Update rows due to census inaccuracy – Will only work if the mutable indexing is working UPSERT INTO baby_names SELECT year,occurrences+3000,sex,name FROM baby_names WHERE name='Jesse'; • Selects the now updated data (from the index table) SELECT name,sum(occurrences) FROM baby_names WHERE name='Jesse' GROUP BY NAME; • Index table still used in scans EXPLAIN SELECT name,sum(occurrences) FROM baby_names WHERE name='Jesse' GROUP BY NAME; LA HUG – Sept 201322
  • 23. Agenda • About • Other Indexing Frameworks • Immutable Indexes • Mutable Indexes in Phoenix • Mutable Indexing Internals • Roadmap 23 LA HUG – Sept 2013
  • 24. Internals • Index Management – Build index updates – Ensures index is ‘cleaned up’ • Recovery Mechanism – Ensures index updates are “ACID” 24 LA HUG – Sept 2013
  • 25. “There is no magic” - Every programming hipster (chipster) LA HUG – Sept 201325
  • 26. Mutable Indexing: Standard Write Path 26 Client HRegion RegionCoprocessorHost WAL RegionCoprocessorHost MemStore LA HUG – Sept 2013
  • 27. Mutable Indexing: Standard Write Path 27 Client HRegion RegionCoprocessorHost WAL RegionCoprocessorHost MemStore LA HUG – Sept 2013
  • 28. Mutable Indexing 28 Region Coprocessor Host WAL Region Coprocessor Host Indexer Builder WAL Updater Durable! Indexer Index Table Index Table Index Table Codec LA HUG – Sept 2013
  • 29. Index Management 29 • Lives within a RegionCoprocesorObserver • Access to the local HRegion • Specifies the mutations to apply to the index tables public interface IndexBuilder{ public void setup(RegionCoprocessorEnvironmentenv); public Map<Mutation, String>getIndexUpdate(Put put); public Map<Mutation, String>getIndexUpdate(Deletedelete); } LA HUG – Sept 2013
  • 30. Why not write my own? • Managing Cleanup – Efficient point-in-time correctness – Performance tricks • Abstract access to HRegion – Minimal network hops • Sorting correctness – Phoenix typing ensures correct index sorting LA HUG – Sept 201330
  • 31. Example: Managing Cleanup • Updates can arrive out of order – Client-managed timestamps LA HUG – Sept 201331 ROW FAMILY QUALIFIER TS VALUE Row1 Fam Qual 10 val1 Row1 Fam2 Qual2 12 val2 Row1 Fam Qual 13 val3
  • 32. Example: Managing Cleanup Index Table LA HUG – Sept 201332 ROW FAMILY QUALIFIER TS Val1|Row1 Index Fam:Qual 10 Val1|Val2|Row1 Index Fam:Qual Fam2:Qual2 12 Val3|Val2|Row1 Index Fam:Qual Fam2:Qual2 13
  • 33. Example: Managing Cleanup LA HUG – Sept 201333 ROW FAMILY QUALIFIER TS VALUE Row1 Fam Qual 10 val1 Row1 Fam2 Qual2 12 val2 Row1 Fam Qual 13 val3 Row1 Fam Qual 11 val4
  • 34. Example: Managing Cleanup LA HUG – Sept 201334 ROW FAMILY QUALIFIER TS VALUE Row1 Fam Qual 10 val1 Row1 Fam Qual 11 val4 Row1 Fam2 Qual2 12 val2 Row1 Fam Qual 13 val3
  • 35. Example: Managing Cleanup LA HUG – Sept 201335 ROW FAMILY QUALIFIER TS Va1|Row1 Index Fam:Qual 10 Val4|Row1 Index Fam:Qual 11 Val4|Val2|Row1 Index Fam:Qual Fam2:Qual2 12 Va1l|Val2|Row1 Index Fam:Qual Fam2:Qual2 12 Val3|Val2|Row1 Index Fam:Qual Fam2:Qual2 13
  • 36. Example: Managing Cleanup LA HUG – Sept 201336 ROW FAMILY QUALIFIER TS Va1|Row1 Index Fam:Qual 10 Val4|Row1 Index Fam:Qual 11 Val4|Val2|Row1 Index Fam:Qual Fam2:Qual2 12 Va1l|Val2|Row1 Index Fam:Qual Fam2:Qual2 12 Val3|Val2|Row1 Index Fam:Qual Fam2:Qual2 13
  • 37. Managing Cleanup • History “roll up” • Out-of-order Updates • Point-in-time correctness • Multiple Timestamps per Mutation • Delete vs. DeleteColumn vs. DeleteFamily Surprisingly hard! LA HUG – Sept 201337
  • 38. Phoenix Index Builder • Much simpler than full index management • Hides cleanup considerations • Abstracted access to local state LA HUG – Sept 201338 public interfaceIndexCodec{ public void initialize(RegionCoprocessorEnvironmentenv); public Iterable<IndexUpdate>getIndexDeletes(TableState state; public Iterable<IndexUpdate>getIndexUpserts(TableState state); }
  • 39. Phoenix Index Codec LA HUG – Sept 201339
  • 40. Dude, where’s my data? 40 LA HUG – Sept 2013 Ensuring Correctness
  • 41. HBase ACID • Does NOT give you: – Cross-row consistency – Cross-table consistency • Does give you: – Durable data on success – Visibility on success without partial rows 41 LA HUG – Sept 2013
  • 42. Key Observation “Secondary indexing is inherently an easier problem than full transactions… secondary index updates are idempotent.” - Lars Hofhansl 42 LA HUG – Sept 2013
  • 43. Idempotent Index Updates • Doesn’t need full transactions • Replay as many times as needed • Can tolerate a little lag – As long as we get the order right 43 LA HUG – Sept 2013
  • 44. Failure Recovery • Custom WALEditCodec – Encodes index updates – Supports compressed WAL • Custom WAL Reader – Replay index updates from WAL LA HUG – Sept 201344 <property> <name>hbase.regionserver.wal.codec</name><value>o.a.h.hbase.regionserver.w al.IndexedWALEditCodec</value> </property> <property> <name>hbase.regionserver.hlog.reader.impl</name> <value>o.a.h.hbase.regionserver.wal.IndexedHLogReader</value> </property>
  • 45. Failure Situations • Any time before WAL, client replay • Any time after WAL, HBase replay • All-or-nothing LA HUG – Sept 201345
  • 46. Failure #1: Before WAL 46 Client HRegion RegionCoprocessorHost WAL RegionCoprocessorHost MemStore LA HUG – Sept 2013
  • 47. Failure #1: Before WAL 47 Client HRegion RegionCoprocessorHost WAL RegionCoprocessorHost MemStore No problem! No data is stored in the WAL, client just retries entire update. LA HUG – Sept 2013
  • 48. Failure #2: After WAL 48 Client HRegion RegionCoprocessorHost WAL RegionCoprocessorHost MemStore LA HUG – Sept 2013
  • 49. Failure #2: After WAL 49 Client HRegion RegionCoprocessorHost WAL RegionCoprocessorHost MemStore WAL replayed via usual replay mechanisms LA HUG – Sept 2013
  • 50. Agenda • About • Other Indexing Frameworks • Immutable Indexes • Mutable Indexes • Roadmap 50 LA HUG – Sept 2013
  • 51. Roadmap • Next release of Phoenix • Performance testing • Increased adoption • Adding to HBase (?) 51 LA HUG – Sept 2013
  • 52. Open Source! • Main: https://github.com/forcedotcom/phoenix • Indexing: https://github.com/forcedotcom/phoenix/tree/mutable-si 52 LA HUG – Sept 2013

Editor's Notes

  1. Ok, not this elephant…
  2. e.g. stats, historical data
  3. Actual implementation coming in example blog post
  4. Actual implementation coming in example blog post
  5. And don’t forget to cleanup the old row state!
  6. 8pt font, &lt;200 lines, including comments